QUAST-LG is an extension of QUAST intended for evaluating large-scale genome assemblies (up to mammalian size).
QUAST-LG is included in the QUAST package starting from version 5.0.0 (download the latest release). Run QUAST as usual and do not forget to add the --large option to your command!
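As a rough illustration of the advice above, the following Python sketch assembles a QUAST command line that includes the --large option. The helper name build_quast_command, the file names, the output directory, and the thread count are placeholders made up for this example; only quast.py and its -r, -o, -t, and --large options come from the QUAST command-line interface.

```python
import shlex


def build_quast_command(assemblies, reference=None,
                        output_dir="quast_lg_results", threads=4):
    """Return the argument list for a QUAST run in large-genome mode.

    All paths are placeholders supplied by the caller; nothing is executed here.
    """
    if not assemblies:
        raise ValueError("at least one assembly FASTA file is required")

    # --large switches QUAST to its large-genome (QUAST-LG) settings.
    cmd = ["quast.py", "--large"]
    if reference is not None:
        cmd += ["-r", reference]          # reference genome (optional)
    cmd += ["-o", output_dir, "-t", str(threads)]
    cmd += list(assemblies)               # assemblies are positional arguments
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, used only to show the resulting command.
    command = build_quast_command(["assembly.fasta"], reference="reference.fasta")
    print(shlex.join(command))
```

Once QUAST is installed, the returned list can be passed to subprocess.run(command, check=True) to start the evaluation.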
A short list of the new features (see CHANGES for all):
- Significant speedup, achieved both by using a new fast aligner (minimap2) and by refactoring the alignment analysis modules
- New k-mer-based completeness and correctness metrics
- BUSCO added for enhanced reference-free analysis
- The concept of upper bound assembly (theoretical limits on the assembly completeness and contiguity for a given genome and set of reads)
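To give an intuition for what a k-mer-based completeness metric measures, here is a toy Python sketch. It is not the QUAST-LG implementation: the function names, the tiny default k, and the decision to ignore reverse complements and repeated k-mers are all simplifying assumptions made for illustration. It simply reports the fraction of distinct reference k-mers that also occur in the assembly.

```python
def kmers(sequence, k):
    """Return the set of distinct k-mers in a sequence (case-insensitive)."""
    seq = sequence.upper()
    # range() is empty when the sequence is shorter than k, giving an empty set.
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_completeness(reference, assembly_contigs, k=5):
    """Fraction of distinct reference k-mers found in any assembly contig.

    Toy illustration only: reverse complements are not considered.
    """
    ref_kmers = kmers(reference, k)
    if not ref_kmers:
        raise ValueError("reference is shorter than k")

    asm_kmers = set()
    for contig in assembly_contigs:
        asm_kmers |= kmers(contig, k)

    return len(ref_kmers & asm_kmers) / len(ref_kmers)


if __name__ == "__main__":
    # Made-up sequences, chosen only to exercise the function.
    reference = "ACGTACGTTG"
    contigs = ["ACGTAC"]
    print(kmer_completeness(reference, contigs, k=3))
```

A real tool would use much longer k-mers and a dedicated k-mer counter, since holding every k-mer of a mammalian-size genome in a Python set is not practical.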
The new metrics and options are explained in our manual; please read the QUAST-LG paper (ISMB 2018 proceedings) for more details. More visual explanations can be found in the ISMB 2018 talk slides and in our poster (presented at both ISMB 2018 and BiATA 2018).
We benchmarked QUAST-LG on six eukaryotic datasets assembled by leading genome assembly software. Below you will find information about the datasets, links to the raw reads and ready-made assemblies, and QUAST-LG reports (created using a pre-release GitHub branch).
- Yeast (S. cerevisiae, genome size: 12.1 Mbp, reference: 4 Mb): reads (Illumina paired-end and PacBio); assemblies (5 +UpperBound, 22 Mb); report
- Yeast (S. cerevisiae, genome size: 12.1 Mbp, reference: 4 Mb): reads (Illumina paired-end and Oxford Nanopore from Istace et al., 2017); assemblies (4 +UpperBound, 18.5 Mb); report
- Worm (C. elegans, genome size: 100.3 Mbp, reference: 30 Mb): reads (Illumina paired-end and PacBio); assemblies (5 +UpperBound, 182.5 Mb); report
- Fruit fly (D. melanogaster, genome size: 137.6 Mbp, reference: 42 Mb): reads (Illumina paired-end and mate-pair); assemblies (6 +UpperBound, 283.5 Mb); report
- Human HG004 (H. sapiens, genome size: 3 Gbp, reference: 938 Mb): reads (Illumina paired-end and mate-pair, see links and description on the ABySS 2.0 study GitHub page); assemblies from Jackman et al., 2017 (FTP link to the assemblies); our UpperBound assembly, 874 Mb; report
- Human HG001 (H. sapiens, genome size: 3 Gbp, reference: 938 Mb): reads (Illumina paired-end and Oxford Nanopore, see links and description in the MaSuRCA blog); assemblies made by the Canu developers (direct link to the assembly), the MaSuRCA developers (direct link to the assembly), and the Flye developers (direct link to the assembly); our UpperBound assembly, 875 Mb; report
Please help us to make QUAST-LG better by sending your comments, bug reports, and suggestions to quast.support