cycloScore is a new approach to scoring nonlinear peptide-spectrum matches and evaluating their statistical significance. It is embedded into the Dereplicator and VarQuest pipelines.

The method takes into account the intensities of MS/MS peaks and the occurrence of various additional ions during fragmentation in mass spectrometers. The weights for scoring annotated and missed peaks are learned statistically.
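
As a rough illustration of weighted peak scoring, the sketch below rewards annotated peaks in proportion to their relative intensity and penalizes each missed peak. This is not the cycloScore implementation: the function name, the default weights, and the reading of "missed" as a theoretical fragment with no matching experimental peak are all assumptions made for the example.

```python
def score_match(annotated_intensities, n_missed,
                w_annotated=1.0, w_missed=-0.5):
    """Toy peptide-spectrum match score.

    annotated_intensities: relative intensities (0..1) of experimental peaks
        explained by a theoretical fragment.
    n_missed: number of theoretical fragments without a matching peak
        (an assumed definition of "missed").
    w_annotated, w_missed: hypothetical weights; a real scorer would learn
        these values from training data.
    """
    reward = w_annotated * sum(annotated_intensities)
    penalty = w_missed * n_missed
    return reward + penalty


if __name__ == "__main__":
    # Two annotated peaks and two missed fragments.
    print(score_match([0.5, 0.25], n_missed=2))
```

With learned weights, the same structure lets intense annotated peaks contribute more than weak ones, while each missed fragment lowers the score.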


Dereplicator+ is an extension of our Dereplicator tool. Like its predecessor, Dereplicator+ is a computational tool for identifying known natural products from high-resolution mass spectrometry data. Given a database of chemical structures, Dereplicator+ generates in-silico mass spectra of the compounds by predicting how they fragment during mass spectrometry, then compares these spectra to experimental LC-MS/MS spectra to detect similarities.
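
The comparison step can be pictured as counting how many predicted fragment masses have an experimental peak nearby. The sketch below is a simplified stand-in, not the Dereplicator+ algorithm: the function name, the absolute tolerance of 0.02 Da, and the example masses are assumptions.

```python
import bisect


def count_shared_peaks(theoretical_mz, experimental_mz, tol=0.02):
    """Count theoretical fragments with an experimental peak within tol (Da)."""
    experimental = sorted(experimental_mz)
    shared = 0
    for mz in theoretical_mz:
        # Index of the first experimental peak at or above the lower bound.
        i = bisect.bisect_left(experimental, mz - tol)
        if i < len(experimental) and experimental[i] <= mz + tol:
            shared += 1
    return shared


if __name__ == "__main__":
    predicted = [100.0, 200.0, 300.0]   # hypothetical in-silico fragments
    observed = [100.01, 250.0, 299.5]   # hypothetical experimental peaks
    print(count_shared_peaks(predicted, observed))
```

Sorting the experimental peaks once and using binary search keeps each lookup logarithmic, which matters when many database compounds are compared against the same spectrum.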


MetaRiPPquest is a computational tool for peptidogenomics-based identification of ribosomally synthesized and post-translationally modified peptides (RiPPs). It combines genome/metagenome mining with analysis of tandem mass spectra.

You can try the MetaRiPPquest workflow online at the GNPS website (registration is needed, but it is quick and simple).

We also provide a command-line version as part of the NPDtools package.


QUAST-LG is an extension of QUAST intended for evaluating large-scale genome assemblies (up to mammalian-size).

QUAST-LG is included in the QUAST package starting from version 5.0.0 (download the latest release). Run QUAST as usual and do not forget to add the --large option to your command!
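
For scripted runs, the command can be assembled as shown in the sketch below. The helper name, the file names, and the output directory are placeholders chosen for the example; only quast.py and its --large, -o and -r options come from QUAST itself. The command is built but not executed here.

```python
def build_quast_command(contigs, reference=None, output_dir="quast_results"):
    """Build a QUAST command line with QUAST-LG mode enabled via --large."""
    cmd = ["quast.py", "--large", "-o", output_dir]
    if reference is not None:
        cmd += ["-r", reference]   # reference genome is optional
    cmd += list(contigs)           # one or more assemblies to evaluate
    return cmd


if __name__ == "__main__":
    # Placeholder file names.
    cmd = build_quast_command(["assembly.fasta"], reference="reference.fasta")
    print(" ".join(cmd))
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```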

A short list of the new features (see CHANGES for all):

  • Significant speedup achieved by using a new fast aligner (minimap2) and refactoring the alignment analysis modules
  • New k-mer-based completeness and correctness metrics
  • BUSCO added for enhanced reference-free analysis
  • The concept of upper bound assembly (theoretical limits on the assembly completeness and contiguity for a given genome and set of reads)
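
To give a feel for a k-mer-based completeness metric, the sketch below computes the fraction of distinct reference k-mers that also occur in an assembly. It is a simplified illustration and not QUAST-LG's implementation: it ignores reverse complements and k-mer uniqueness filtering, and the tiny default k is chosen only so the example is easy to follow.

```python
def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_completeness(reference, assembly, k=5):
    """Fraction of distinct reference k-mers that are present in the assembly."""
    ref_kmers = kmers(reference, k)
    if not ref_kmers:
        return 0.0
    return len(ref_kmers & kmers(assembly, k)) / len(ref_kmers)


if __name__ == "__main__":
    # Toy sequences; real genomes need much longer k-mers.
    print(kmer_completeness("ACGTACGT", "ACGTA", k=4))
```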

The new metrics and options are explained in our manual; see the QUAST-LG paper (ISMB 2018 proceedings) for more details.

Dense Subgraph Finder

Search for dense subgraphs

Searching for dense subgraphs (or corrupted cliques) is a common problem in bioinformatics (e.g., co-expression of genes) and in studies of social interactions (e.g., recommendation services).

We decided to make an algorithm for dense subgraph search available as an independent tool.
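
One common way to quantify how close a vertex set is to a clique is its edge density, the fraction of possible vertex pairs that are actually connected. The sketch below computes this value for an undirected graph. It illustrates the notion of density only and is not the search algorithm used by Dense Subgraph Finder; the function name and the example graph are assumptions.

```python
def edge_density(edges, vertices):
    """Edge density of the subgraph induced by `vertices` (1.0 means a clique)."""
    vs = set(vertices)
    n = len(vs)
    if n < 2:
        return 0.0
    # Keep each undirected edge once; skip self-loops and outside edges.
    present = {
        frozenset((u, v))
        for u, v in edges
        if u != v and u in vs and v in vs
    }
    return len(present) / (n * (n - 1) / 2)


if __name__ == "__main__":
    graph = [(1, 2), (2, 3), (1, 3), (3, 4)]   # hypothetical graph
    print(edge_density(graph, [1, 2, 3]))      # triangle
    print(edge_density(graph, [1, 2, 3, 4]))   # clique with missing edges
```

A corrupted clique then corresponds to a vertex set whose density is high but below 1.0.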




Repertoire construction is a complex clustering problem whose main obstacle is distinguishing the natural diversity of antibodies from PCR errors. Molecular barcoding (UMIs) helps to separate these two sources of variation in Rep-seq data and simplifies error correction.

Molecular barcodes

Molecular barcodes are short genetic sequences attached to each RNA molecule so that all amplified copies of this molecule contain the same barcode.
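
Because all copies of a molecule share a barcode, reads can be grouped by barcode and collapsed into a consensus, so that errors present in only a minority of copies are voted out. The sketch below shows this idea under simplifying assumptions: barcodes are error-free, reads within a group have equal length, and the consensus is a per-position majority vote. The function name and the example reads are hypothetical, and this is not the error-correction procedure of any specific tool.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads):
    """Collapse (barcode, sequence) pairs into one consensus per barcode."""
    groups = defaultdict(list)
    for barcode, seq in reads:
        groups[barcode].append(seq)

    consensus = {}
    for barcode, seqs in groups.items():
        # Majority vote at each position across all copies of the molecule.
        consensus[barcode] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


if __name__ == "__main__":
    reads = [
        ("AAT", "ACGT"),
        ("AAT", "ACGT"),
        ("AAT", "ACTT"),   # one copy carries an amplification error
        ("GGC", "TTTT"),
    ]
    print(consensus_by_barcode(reads))
```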



Diversity Analyzer is a tool for annotation and diversity analysis of full-length adaptive immune repertoires (antibodies or TCRs). Diversity Analyzer is launched as the final step of IgReC. It is also available as a standalone tool.


Diversity Analyzer takes full-length immune sequences in FASTA/FASTQ format as input.



Y-tools version 3.1.1 is released!
Bugfix release.

Download Y-tools

Our tools for analysis of adaptive immune repertoires

Modern sequencing technologies (e.g., Illumina MiSeq) allow biologists to perform full-length scanning of adaptive immune repertoires. For example, a pair of overlapping paired-end Illumina MiSeq reads (250×2 or 300×2) is able to cover the variable region of an antibody or TCR.


Sequencing data of adaptive immune repertoires (Rep-seq) serve as input for various immunological studies.


VarQuest is a novel algorithm for identification of peptidic natural product (PNP) variants via database search of mass spectra. It is the first high-throughput mutation-tolerant PNP identification method capable of analysing the entire GNPS infrastructure. VarQuest is based on the source code of Dereplicator, a method aimed at standard PNP identification.
How to run
You can try VarQuest online at the GNPS website (registration is needed, but it is quick and simple).


Twister: a top-down driven approach to de novo protein sequencing

Twister is a software tool for de novo sequencing of proteins and peptides from tandem mass spectra. Given a set of deconvoluted top-down tandem mass spectra, it first generates a set of de novo sequences, and subsequently combines them into a set of aggregated paths.