VarQuest+ is a database search tool capable of identifying novel variants of a wide range of known SMs, including polyketides, alkaloids, flavonoids, saponins, and many others. Algorithmic and software innovations make VarQuest+ much more efficient in running time and memory consumption than existing analogs. This efficiency made it possible to implement a modification-tolerant search mode in VarQuest+, which is more challenging than a regular database search. VarQuest+ is based on molDiscovery and will be released as part of the NPDtools package.
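To illustrate why a modification-tolerant search is more demanding than a regular one, here is a minimal sketch of the candidate selection step only. It is not VarQuest+ code: the `Compound` class, the `candidate_compounds` function, the toy database, and the tolerance values are all hypothetical and chosen for illustration. The sketch assumes that a regular search keeps only database compounds whose mass matches the precursor mass within a small tolerance, while a modification-tolerant search also keeps compounds whose mass differs by up to a chosen maximum shift.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """Hypothetical database entry: a name and a monoisotopic mass in Da."""
    name: str
    mass: float


def candidate_compounds(precursor_mass, database, tolerance=0.02,
                        max_modification=None):
    """Return (compound, mass_shift) pairs compatible with a precursor mass.

    max_modification=None  -> regular search: masses must agree within
                              `tolerance`.
    max_modification=X     -> modification-tolerant search: a mass shift of
                              up to X Da (plus `tolerance`) is allowed.
    """
    limit = tolerance if max_modification is None else max_modification + tolerance
    hits = []
    for compound in database:
        shift = precursor_mass - compound.mass
        if abs(shift) <= limit:
            hits.append((compound, shift))
    return hits


# Toy database with made-up masses (illustration only).
TOY_DB = [
    Compound("A", 300.100),
    Compound("B", 314.116),
    Compound("C", 500.200),
]

regular = candidate_compounds(314.116, TOY_DB)
tolerant = candidate_compounds(314.116, TOY_DB, max_modification=50.0)

print([c.name for c, _ in regular])   # exact-mass candidates only
print([c.name for c, _ in tolerant])  # also compounds within 50 Da
```

In this toy case the modification-tolerant mode returns a larger candidate set than the regular mode, and on a real database the set can grow much more. Each candidate still has to be scored against the spectrum, a step this sketch leaves out, which is where running time and memory efficiency become important.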
We benchmarked VarQuest+ on a Korean medicinal plants dataset (2.5 million mass spectra collected from 337 samples). The standard search of the KNApSAcK database (51,179 plant SMs) identified 349 compounds. The modification-tolerant search of VarQuest+ identified 4,253 SMs, an order of magnitude more than Dereplicator+. With the same search parameters, VarQuest+ runs twenty times faster than Dereplicator+ and uses four times less memory.
The tool is being developed in collaboration with Carnegie Mellon University (PA, USA).
Preliminary results of the VarQuest+ project were presented at the BiATA-2020 conference. You may watch the talk and read the abstract published in BMC Bioinformatics, volume 21, Article number 567 (2020).
This work is funded by RFBR, project number 20-04-01096.