VarQuest+

Description

VarQuest+ is a database search tool capable of identifying novel variants of a wide range of known SMs, including polyketides, alkaloids, flavonoids, saponins, and many others. Algorithmic and software innovations make VarQuest+ considerably faster and more memory-efficient than existing tools. This efficiency made it possible to implement a modification-tolerant search mode in VarQuest+, which is computationally more demanding than a regular database search. VarQuest+ is based on molDiscovery and will be released as part of the NPDtools package.
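To illustrate why a modification-tolerant search is harder than a regular one, here is a minimal Python sketch of the candidate-selection step only. It is not the VarQuest+ algorithm: the `Compound` class, the `candidate_matches` function, the tolerance values, and the toy compounds with their masses are all hypothetical and chosen for illustration. A regular search keeps only database entries whose mass agrees with the precursor mass within a small tolerance, whereas a modification-tolerant search also keeps entries that differ by an unknown mass offset up to some limit, so many more candidates have to be scored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """Hypothetical database entry (not an NPDtools data structure)."""
    name: str
    mass: float  # monoisotopic mass in Da


def candidate_matches(precursor_mass, database, tolerance=0.02, max_modification=0.0):
    """Return (compound, mass_offset) pairs that pass the mass filter.

    With max_modification == 0.0 this behaves like a regular search:
    only compounds within `tolerance` Da of the precursor mass are kept.
    With max_modification > 0.0, compounds whose mass differs by up to
    that many Da are also kept, and the offset is reported as the mass
    of a putative modification.
    """
    matches = []
    for compound in database:
        offset = precursor_mass - compound.mass
        if abs(offset) <= tolerance:
            matches.append((compound, 0.0))      # exact (unmodified) match
        elif abs(offset) <= max_modification:
            matches.append((compound, offset))   # putative variant
    return matches


# Toy database with made-up masses (assumption, for illustration only).
database = [
    Compound("compound_A", 300.000),
    Compound("compound_B", 450.000),
    Compound("compound_C", 500.000),
]

# A spectrum whose precursor is about 14.016 Da heavier than compound_A.
precursor = 314.016

regular = candidate_matches(precursor, database)
variants = candidate_matches(precursor, database, max_modification=50.0)

print("regular search:", [c.name for c, _ in regular])
print("modification-tolerant search:",
      [(c.name, round(off, 3)) for c, off in variants])
```

In this toy case the regular search returns no candidates, while the modification-tolerant search proposes compound_A with a mass offset of about 14.016 Da. A real tool would then have to score each candidate against the fragment peaks of the spectrum, which is where most of the computational cost lies.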

Preliminary results

We benchmarked VarQuest+ on a Korean medicinal plants dataset (2.5 million mass spectra collected from 337 samples). The standard search of the KNApSAcK database (51,179 plant SMs) resulted in the identification of 349 compounds. The VarQuest+ modification-tolerant search identified 4,253 SMs, an order of magnitude more than Dereplicator+. With the same search parameters, VarQuest+ runs 20 times faster than Dereplicator+ and uses four times less memory.

Additional information

The tool is being developed in collaboration with Carnegie Mellon University (PA, USA).

Preliminary results of the VarQuest+ project were presented at the BiATA-2020 conference. You may watch the talk and read the abstract published in BMC Bioinformatics, volume 21, article number 567 (2020).

Note: this is an ongoing project, so stay tuned! If you want to try the pre-release version of VarQuest+ or wish to be notified about the first public release, please write to .

This work is funded by RFBR, project number 20-04-01096.