SPAligner is a standalone tool for aligning long, diverged molecular sequences (both nucleotide and amino acid) against assembly graphs produced by popular short-read assemblers. The project stemmed from our previous work on long-read alignment within the SPAdes (hybridSPAdes) assembler.
SPAligner is implemented on top of SPAdes and will soon be available as part of the SPAdes package.


PathRacer is a novel standalone tool that aligns a profile HMM directly to the assembly graph (performing codon translation on the fly for amino acid pHMMs). The tool provides the set of most probable paths traversed by an HMM through the whole assembly graph, regardless of whether the sequence of interest is encoded on a single contig or scattered across a set of edges,


NPDtools – Natural Product Discovery tools – is a toolkit containing various pipelines for in silico analysis of natural product mass spectrometry data.

The current version of NPDtools includes

  • Dereplicator — a tool for identification of peptidic natural products (PNPs) through database search of mass spectra
  • VarQuest — a tool for modification-tolerant identification of novel variants of PNPs
  • Dereplicator+ — a tool for identification of metabolites (both peptidic and non-peptidic) through database search of mass spectra
  • MetaMiner (former RiPPquest,


NPS is an approach to scoring (NPScore) and evaluating the statistical significance (NPSignificance) of peptidic natural product-spectrum matches. It is embedded into the Dereplicator and VarQuest pipelines.

The method takes into account the intensities of MS/MS peaks and the occurrence of various additional ions during fragmentation in mass spectrometers.


Dereplicator+ is an extension of our Dereplicator tool. Like its predecessor, Dereplicator+ is a computational tool developed for identification of known natural products from high-resolution mass spectrometry data. Given a database of chemical structures, Dereplicator+ generates in silico mass spectra of the compounds by predicting how they fragment during mass spectrometry, then compares these predicted spectra to experimental LC-MS/MS spectra to detect similarities.
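The comparison step can be illustrated with a toy example. The sketch below is not the scoring used by Dereplicator+; it only assumes that a predicted and an experimental spectrum are each reduced to a list of m/z values and counts the predicted peaks that have an experimental peak within a mass tolerance. The function name `shared_peak_count`, the tolerance of 0.02, and all peak values are made up for illustration.

```python
def shared_peak_count(theoretical, experimental, tol=0.02):
    """Count theoretical peaks that have an experimental peak within `tol` m/z units.

    Both inputs are lists of m/z values; each experimental peak is used at most once.
    """
    experimental = sorted(experimental)
    matched = 0
    j = 0
    for mz in sorted(theoretical):
        # Skip experimental peaks that are too light to match this or any later peak.
        while j < len(experimental) and experimental[j] < mz - tol:
            j += 1
        if j < len(experimental) and abs(experimental[j] - mz) <= tol:
            matched += 1
            j += 1  # consume the matched experimental peak
    return matched


# Hypothetical peak lists (illustrative values only).
predicted = [114.09, 227.18, 340.26, 453.34]
observed = [114.10, 227.17, 300.00, 453.50]
score = shared_peak_count(predicted, observed)
```

A real pipeline would also weigh peak intensities and assess statistical significance; a shared-peak count is only the simplest possible similarity measure.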


MetaMiner (formerly MetaRiPPquest) is a computational tool developed for peptidogenomic identification of ribosomally synthesized and post-translationally modified peptides (RiPPs) by combining genome/metagenome mining with analysis of tandem mass spectra.

You can try the MetaMiner workflow online at the GNPS website (registration is required, but it is quick and simple).

We also provide a command-line version as part of the NPDtools package,


QUAST-LG is an extension of QUAST intended for evaluating large-scale genome assemblies (up to mammalian-size).

QUAST-LG is included in the QUAST package starting from version 5.0.0 (download the latest release). Run QUAST as usual and do not forget to add the --large option to your command!

A short list of the new features (see CHANGES for all):

  • Significant speedup achieved both by using a new fast aligner (minimap2) and by refactoring the alignment analysis modules
  • New k-mer-based completeness and correctness metrics
  • BUSCO added for enhanced reference-free analysis
  • The concept of upper bound assembly (theoretical limits on the assembly completeness and contiguity for a given genome and set of reads)

The new metrics and options are explained in our manual; for more details, please read the QUAST-LG paper (ISMB 2018 proceedings).

Dense Subgraph Finder

The search for dense subgraphs (or corrupted cliques) is a very common problem arising in bioinformatics (e.g., co-expression of genes) and in studies of social interactions (e.g., recommendation services).

We decided to make our dense subgraph search algorithm available as an independent tool.
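To make the notion of a dense subgraph concrete, here is a minimal sketch that computes the edge density of a vertex set, i.e. the fraction of vertex pairs that are connected. A clique has density 1, and a "corrupted clique" is a vertex set whose density is close to 1. This is only an illustration of the definition, not the algorithm implemented in Dense Subgraph Finder; the function name `edge_density` and the toy graph are assumptions.

```python
from itertools import combinations


def edge_density(vertices, edges):
    """Return the fraction of vertex pairs in `vertices` joined by an edge in `edges`."""
    vertices = set(vertices)
    n = len(vertices)
    if n < 2:
        return 0.0
    # Store edges as unordered pairs so that (a, b) and (b, a) are the same edge.
    edge_set = {frozenset(e) for e in edges}
    present = sum(
        1 for pair in combinations(vertices, 2) if frozenset(pair) in edge_set
    )
    return present / (n * (n - 1) / 2)


# Toy graph: vertices a-d form a clique with one missing edge (c-d); e hangs off d.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("d", "e")]
clique_like = edge_density("abcd", edges)   # 5 of 6 possible pairs are present
whole_graph = edge_density("abcde", edges)  # 6 of 10 possible pairs are present
```

Here the set {a, b, c, d} is much denser than the graph as a whole, which is the kind of structure a dense subgraph search is meant to recover.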

IgReC



Repertoire construction is a complex clustering problem whose largest obstacle is distinguishing the natural diversity of antibodies from PCR errors. Molecular barcoding (UMIs) helps to separate these two sources of variation in Rep-seq data and simplifies error correction.

Molecular barcodes

Molecular barcodes are short genetic sequences attached to each RNA molecule so that all amplified copies of this molecule contain the same barcode.
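The following minimal sketch shows why barcodes simplify error correction: reads sharing a barcode come from the same molecule, so a per-position majority vote within each barcode group removes isolated PCR errors, while differences between barcode groups are kept as genuine diversity. This is a toy illustration under simplifying assumptions (error-free barcodes, no indels), not the method used by our tools; the function name `consensus_by_barcode` and the example reads are hypothetical.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads):
    """Group (barcode, sequence) pairs by barcode and build a majority-vote consensus."""
    groups = defaultdict(list)
    for barcode, seq in reads:
        groups[barcode].append(seq)

    consensus = {}
    for barcode, seqs in groups.items():
        # Toy simplification: keep only reads of the most common length (no indel handling).
        length = Counter(len(s) for s in seqs).most_common(1)[0][0]
        same_len = [s for s in seqs if len(s) == length]
        # Majority vote at each position across all copies of the molecule.
        consensus[barcode] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*same_len)
        )
    return consensus


# Hypothetical reads: (barcode, sequence).
reads = [
    ("AACGT", "CAGGTGCAGC"),
    ("AACGT", "CAGGTGCAGC"),
    ("AACGT", "CAGGTACAGC"),  # one amplified copy carries a PCR error
    ("TTGCA", "CAGGTTCAGC"),  # a different molecule: a genuine variant
    ("TTGCA", "CAGGTTCAGC"),
]
result = consensus_by_barcode(reads)
```

In this example the erroneous copy under barcode AACGT is outvoted by the two correct copies, while the variant under barcode TTGCA survives as a separate sequence.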