
SPAdes

SPAdes 3.15.4

It’s all about the viruses: new coronaSPAdes, rnaviralSPAdes and metaviralSPAdes pipelines.

See changes in changelog

Download SPAdes binaries for Linux (64-bit only)

Download SPAdes binaries for MacOS

Download SPAdes source code

Manuals and support

If you have a problem running SPAdes, you can look for a similar issue on our GitHub repository, create a new one, or write to us by e-mail: .

Note that SPAdes binaries may not work on new Linux kernels.


For the benchmarks we used:

More datasets as well as reference genomes are available here.

The E. coli K-12 MG1655 reference is 4639675 bp long with 4324 annotated genes. The S. aureus USA300 FPR3757 reference (chromosome and three plasmids) is 2917469 bp long with 2622 annotated genes.

Only contigs of 500 bp and longer were taken into consideration. Tables were obtained using QUAST 4.6.3.

| Assembly | NG50 | # contigs | Largest | Total length | MA | MM | IND | GF (%) | # genes |
Single-cell E. coli
| A5 | 14399 | 745 | 101584 | 4441145 | 3 | 11.92 | 0.19 | 89.867 | 3443 |
| ABySS | 68534 | 179 | 178720 | 4345617 | 6 | 3.49 | 0.83 | 88.265 | 3704 |
| CLC | 32506 | 503 | 113285 | 4656964 | 1 | 5.54 | 1.00 | 92.286 | 3767 |
| EULER-SR | 26662 | 429 | 140518 | 4248713 | 12 | 9.98 | 20.17 | 84.846 | 3410 |
| Ray | 45448 | 361 | 210820 | 4379139 | 16 | 5.29 | 1.24 | 88.345 | 3634 |
| SOAPdenovo | 1540 | 1166 | 51517 | 2958144 | 1 | 1.49 | 0.11 | 57.668 | 1766 |
| Velvet | 22648 | 261 | 132865 | 3501984 | 2 | 2.19 | 1.17 | 73.761 | 3079 |
| E+V-SC | 32051 | 344 | 132865 | 4540286 | 2 | 2.26 | 0.70 | 91.727 | 3767 |
| IDBA-UD contigs | 98306 | 244 | 284464 | 4814043 | 3 | 4.37 | 0.23 | 95.158 | 4041 |
| IDBA-UD scaffolds | 109057 | 229 | 284464 | 4813609 | 3 | 4.42 | 0.75 | 95.145 | 4046 |
| SPAdes 3.12 contigs | 105885 | 231 | 268283 | 4795250 | 3 | 2.02 | 0.30 | 94.853 | 4028 |
| SPAdes 3.12 scaffolds | 117600 | 214 | 285212 | 4800301 | 3 | 2.41 | 0.61 | 94.886 | 4030 |
Isolate E. coli
| A5 | 43651 | 176 | 181690 | 4551797 | 0 | 0.40 | 0.09 | 98.017 | 4163 |
| ABySS | 106155 | 96 | 221861 | 4619631 | 2 | 3.72 | 0.37 | 98.969 | 4241 |
| CLC | 86964 | 112 | 221549 | 4550314 | 1 | 0.99 | 0.18 | 98.057 | 4202 |
| EULER-SR | 110153 | 100 | 221409 | 4574240 | 4 | 3.14 | 5.43 | 98.092 | 4182 |
| Ray | 86246 | 98 | 221942 | 4634429 | 1 | 1.42 | 0.09 | 96.865 | 4136 |
| SOAPdenovo | 49626 | 181 | 165487 | 4535469 | 0 | 0.15 | 0.09 | 97.696 | 4132 |
| Velvet | 82776 | 120 | 242032 | 4554702 | 3 | 2.44 | 0.35 | 98.131 | 4190 |
| E+V-SC | 54856 | 171 | 166115 | 4539639 | 0 | 1.30 | 0.11 | 97.792 | 4134 |
| IDBA-UD contigs | 106844 | 110 | 221687 | 4565529 | 3 | 3.40 | 0.24 | 98.269 | 4200 |
| IDBA-UD scaffolds | 133098 | 93 | 284363 | 4565454 | 4 | 4.08 | 0.55 | 98.282 | 4208 |
| SPAdes 3.12 contigs | 125485 | 88 | 224545 | 4555008 | 1 | 1.93 | 0.18 | 98.092 | 4197 |
| SPAdes 3.12 scaffolds | 133189 | 83 | 264976 | 4555437 | 1 | 1.98 | 0.22 | 98.090 | 4197 |
Single-cell S. aureus
| A5 | 4829 | 937 | 41828 | 2770402 | 2 | 24.59 | 0.30 | 91.579 | 1815 |
| ABySS | 43173 | 185 | 175286 | 2899223 | 3 | 6.46 | 0.50 | 96.572 | 2458 |
| EULER-SR | 7247 | 750 | 66549 | 2988161 | 29 | 21.79 | 10.78 | 94.395 | 2009 |
| Ray | 62026 | 84 | 125177 | 2947717 | 12 | 1.40 | 0.44 | 92.960 | 2410 |
| SOAPdenovo | 510 | 1047 | 27317 | 1473402 | 0 | 1.32 | 0.29 | 46.717 | 595 |
| Velvet | 15656 | 347 | 67677 | 2746768 | 3 | 4.41 | 4.27 | 93.181 | 2274 |
| E+V-SC | 32296 | 215 | 107657 | 2932416 | 5 | 6.89 | 5.03 | 97.497 | 2476 |
| IDBA-UD contigs | 87549 | 114 | 175236 | 2996997 | 4 | 2.47 | 0.76 | 98.658 | 2568 |
| IDBA-UD scaffolds | 111392 | 99 | 210360 | 2996115 | 4 | 2.54 | 1.46 | 98.681 | 2574 |
| SPAdes 3.12 contigs | 174343 | 79 | 329332 | 2992317 | 4 | 2.68 | 0.49 | 98.483 | 2582 |
| SPAdes 3.12 scaffolds | 195326 | 75 | 429070 | 2993815 | 4 | 2.96 | 0.49 | 98.484 | 2582 |

A5 and CLC 3.22.55708 were run with default parameters. ABySS 1.3.5, EULER-SR 2.0.1, Ray 2.2.0, SOAPdenovo 2.04, Velvet 1.2.07, and E+V-SC were run with vertex size 55. IDBA-UD 1.1.0 was run in its default iterative mode.

The total assembly size may increase (and in some cases exceed the genome size) due to contaminants (see Chitsaz et al. (2011)), misassembled contigs, repeats, and hubs that contribute to multiple contigs. The percentage of the E. coli and S. aureus genomes covered, reported in the GF (%) (genome fraction) column, filters out these issues.
The NG50 statistic is the same as the N50 except that the genome size is used rather than the assembly size.
Misassemblies (MA) are locations on an assembled contig where the left flanking sequence aligns over 1 kb away from the right flanking sequence on the reference.
Mismatch (substitution) error rate (MM) and number of indels (IND) per 100 kbp are measured in aligned regions of the contigs.
In each column, the best assembler by that criterion is indicated in bold.
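The relation between N50 and NG50 described in the notes above can be illustrated with a minimal Python sketch. It is not QUAST's implementation: the function names (`nx_length`, `n50`, `ng50`) and the toy contig lengths are made up for illustration, and the 500 bp cutoff simply mirrors the contig filter stated for these tables.

```python
def nx_length(contig_lengths, target_total, fraction=0.5, min_contig=500):
    """Return the length L such that contigs of length >= L together cover
    at least `fraction` of `target_total`; None if that is never reached.
    (Hypothetical helper, not part of QUAST or SPAdes.)"""
    # Keep only contigs at or above the cutoff, longest first.
    kept = sorted((n for n in contig_lengths if n >= min_contig), reverse=True)
    threshold = fraction * target_total
    running = 0
    for length in kept:
        running += length
        if running >= threshold:
            return length
    return None


def n50(contig_lengths, min_contig=500):
    """N50: the target is the total length of the (filtered) assembly."""
    kept = [n for n in contig_lengths if n >= min_contig]
    return nx_length(kept, sum(kept), min_contig=min_contig)


def ng50(contig_lengths, genome_size, min_contig=500):
    """NG50: same procedure, but the target is the reference genome size."""
    return nx_length(contig_lengths, genome_size, min_contig=min_contig)


# Toy contig lengths (illustrative only, not taken from the tables).
lengths = [5000, 3000, 2000, 1000, 400]
print(n50(lengths))          # 3000: assembly total is 11000 after filtering
print(ng50(lengths, 20000))  # 2000: half of the genome size is 10000
print(ng50(lengths, 30000))  # None: the contigs never cover half the genome
```

Because NG50 is tied to the genome size rather than the assembly size, an assembly cannot raise it by leaving sequence out, which is why it is the more comparable statistic across assemblers.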
SPAdes 3.12 hybrid assembly benchmark on Illumina + PacBio E. coli data sets.
| Assembly | NG50 | # contigs | Largest | Total length | MA | MM | IND | GF (%) | # genes |
E. coli K-12 Illumina only
| SPAdes 3.12 contigs | 125485 | 99 | 224454 | 4557432 | 0 | 2.53 | 0.33 | 98.136 | 4196 |
E. coli K-12 Illumina + PacBio P4
| SPAdes 3.12 contigs | 4640965 | 5 | 4640965 | 4643912 | 0 (6*) | 11.41 | 1.12 | 99.968 | 4320 |
| SPAdes 3.12 scaffolds | 4640965 | 5 | 4640965 | 4643912 | 0 (6*) | 11.41 | 1.12 | 99.968 | 4320 |
* These misassemblies are not real; they correspond to differences with respect to the reference.
For the benchmarks we used:
  • E. coli K-12 MG1655 Illumina standard isolate dataset outlined above
  • E. coli K-12 MG1655 PacBio RS II C2/P4 dataset available from PacBio DevNet
SPAdes 3.12 IonTorrent benchmark on an E. coli data set.
| Assembly | NG50 | # contigs | Largest | Total length | MA | MM | IND | GF (%) | # genes |
| SPAdes 3.12 contigs | 133154 | 85 | 285138 | 4572473 | 3 | 3.44 | 3.39 | 98.459 | 4204 |
| SPAdes 3.12 scaffolds | 133154 | 83 | 285138 | 4572673 | 3 | 3.44 | 3.39 | 98.459 | 4204 |

 


“I’d like to thank you for the great job you are doing with SPAdes. It’s a very useful software!”
Lionel Guy
Uppsala University, Sweden
“Thanks for your great SPAdes assembler, we have successfully assembled several cultured organisms and your assembler always performed best compared to other assemblers when run on the PE- and/or MP MiSeq data we generally use.”
Dr. Harald R. Gruber-Vodicka
Symbiosis Group
Max Planck Institute of Marine Microbiology, Bremen, Germany
“I have used SPAdes to correct errors in my metatranscriptome data and it has significantly improved the data quality. Thanks!”
Burak Avci
Department of Molecular Ecology
Max Planck Institute of Marine Microbiology, Bremen, Germany
“We are also getting good results with SPAdes for metagenomic samples, thanks to its effort to recover as much genomic sequence as it can.”
Amr Abouelleil
Bioinformatics Assembly Analyst at Broad Institute
“I have recently used SPAdes to assembly reads generated on an Illumina platform (2 x 250 bp). The assemblies look very good!”
Mark de Been
Department of Medical Microbiology
University Medical Center Utrecht (UMCU) The Netherlands

Acknowledgements

This work was supported by the Russian Science Foundation (grant 14-50-00069). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the organizations or agencies that provided support for the project.

Publications

  1. Mitchell AL, Almeida A, Beracochea M, Boland M, Burgin J, Cochrane G, Crusoe MR, Kale V, Potter SC, Richardson LJ, Sakharova E, Scheremetjew M, Korobeynikov A, Shlemov A, Kunyavskaya O, Lapidus A and Finn RD. MGnify: the microbiome analysis resource in 2020. Nucleic Acids Research, 2020
  2. Antipov D., Bushmanova E., Dvorkina T., Gurevich A., Kunyavskaya O., Shlemov A., Lapidus A., Meleshko D., Nurk S., Prjibelski A., Korobeynikov A. SPAdes Family of Tools for Genome Assembly and Analysis (poster), 2019
  3. Korobeynikov A. Tools for assembly graph analysis via SPAdes toolbox and more (talk), 2019
  4. Antipov D., Rayko M., Lapidus A. L., Korobeynikov A. metaplasmidSPAdes: Plasmid Detection and Assembly in Genomic and Metagenomic Datasets (talk), 2019
  5. Shlemov A., Dvorkina T., Antipov D., Nurk S., Korobeynikov A. Tools for assembly graph analysis via SPAdes toolbox and more (talk), 2019
  6. Tolstoganov I., Pevzner P. A. cloudSPAdes: Assembly of Synthetic Long Reads Using de Bruijn graphs (talk), 2019
  7. Korobeynikov A., Antipov D., Bushmanova E., Gurevich A., Lapidus A., Meleshko D., Mikheenko A., Nurk S., Prjibelski A., Pevzner P. A. SPAdes Family of Tools for Genome Assembly and Analysis: What’s New? (poster), 2018
  8. Korobeynikov A., Antipov D., Bushmanova E., Gurevich A., Lapidus A. L., Meleshko D., Mikheenko A., Nurk S., Prjibelski A., Pevzner P. A. SPAdes Family of Tools for Genome Assembly and Analysis: Current Status (poster), 2018
  9. Korobeynikov A., Antipov D., Bankevich A., Bushmanova E., Lapidus A., Meleshko D., Nurk S., Prjibelski A., Pevzner P. A. SPAdes: is there anything new we could develop? (talk), 2017
  10. Lapidus A., Antipov D., Bankevich A., Bushmanova E., Korobeynikov A., Meleshko D., Nurk S., Prjibelski A., Pevzner P. A. SPAdes Family of Tools for Genome Assembly and Analysis: What’s New? (poster), 2017
  11. Korobeynikov A., Antipov D., Bankevich A., Bushmanova E., Gurevich A., Lapidus A., Meleshko D., Nurk S., Prjibelski A., Pevzner P. A. SPAdes Family of Tools for Genome Assembly and Analysis: What’s New? (poster), 2017
  12. Korobeynikov A. SPAdes Toolbox (talk), 2017
  13. Lapidus A., Antipov D., Bankevich A., Gurevich A., Korobeynikov A., Nurk S., Prjibelski A., Safonova Y., Vasilinetc I., Pevzner P. A. New Frontiers of Genome Assembly with SPAdes 3.0 (poster), 2014
  14. Alekseyev M. A. SPAdes: a new genome assembler with support for single-cell sequencing (talk), 2011