
SPAdes 2.4

 

SPAdes Assembler

SPAdes manual with installation guide (ver 2.4.0)

Download SPAdes.

Support e-mail: spades.support@bioinf.spbau.ru


SPAdes 2.4 is out! 

See all changes in the changelog.

 

For the benchmark we used the E. coli K-12 MG1655 reference, which is 4,639,675 bp long with 4,324 annotated genes. Only contigs of 500 bp or longer were taken into consideration.

 

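Single-cell E. coli

| Assembly | NG50 (bp) | # contigs | Largest contig (bp) | Total length (bp) | # misassemblies | # mismatches per 100 kbp | # indels per 100 kbp | Genome fraction (%) | # genes |
|---|---|---|---|---|---|---|---|---|---|
| A5 | 14399 | 745 | 101584 | 4441145 | 8 | 11.68 | 0.17 | 89.681 | 3439 |
| ABySS | 68534 | 179 | 178720 | 4345617 | 6 | 3.32 | 1.69 | 88.254 | 3703 |
| CLC | 32506 | 503 | 113285 | 4656964 | 2 | 5.54 | 1.43 | 92.211 | 3766 |
| EULER-SR | 26662 | 429 | 140518 | 4248713 | 17 | 10.85 | 35.69 | 84.856 | 3416 |
| Ray | 55395 | 296 | 210612 | 4649552 | 14 | 6.08 | 0.61 | 91.771 | 3826 |
| SOAPdenovo | 18468 | 569 | 87533 | 4098032 | 7 | 116.37 | 7.48 | 79.807 | 3037 |
| Velvet | 22648 | 261 | 132865 | 3501984 | 2 | 1.93 | 1.23 | 73.574 | 3072 |
| E+V-SC | 32051 | 344 | 132865 | 4540286 | 2 | 2.14 | 0.73 | 91.488 | 3759 |
| IDBA1.1_contig | 98306 | 244 | 284464 | 4814043 | 8 | 5.06 | 0.27 | 94.896 | 4035 |
| IDBA1.1_scaffold | 109057 | 229 | 284464 | 4813610 | 8 | 4.97 | 0.89 | 94.923 | 4040 |
| SPAdes2.4_contigs | 110539 | 277 | 269177 | 4877521 | 2 | 5.27 | 0.79 | 95.622 | 4047 |
| SPAdes2.4_scaffolds | 112120 | 250 | 269177 | 4910892 | 4 | 6.58 | 1.33 | 95.698 | 4055 |

Isolate E. coli

| Assembly | NG50 (bp) | # contigs | Largest contig (bp) | Total length (bp) | # misassemblies | # mismatches per 100 kbp | # indels per 100 kbp | Genome fraction (%) | # genes |
|---|---|---|---|---|---|---|---|---|---|
| A5 | 43651 | 176 | 181690 | 4551797 | 0 | 0.26 | 0.11 | 97.787 | 4154 |
| ABySS | 106155 | 96 | 221861 | 4619631 | 2 | 3.66 | 0.41 | 98.871 | 4239 |
| CLC | 86964 | 112 | 221549 | 4550314 | 1 | 1.79 | 0.31 | 97.799 | 4186 |
| EULER-SR | 110153 | 100 | 221409 | 4574240 | 8 | 2.49 | 10.15 | 97.846 | 4180 |
| Ray | 83128 | 113 | 221942 | 4563341 | 2 | 2.18 | 0.18 | 97.937 | 4185 |
| SOAPdenovo | 62512 | 141 | 172567 | 4519621 | 0 | 27.26 | 4.69 | 97.345 | 4134 |
| Velvet | 82776 | 120 | 242032 | 4554702 | 3 | 2.36 | 0.37 | 97.864 | 4185 |
| E+V-SC | 54856 | 171 | 166115 | 4539639 | 0 | 1.26 | 0.13 | 97.465 | 4124 |
| IDBA1.1_contig | 106844 | 110 | 221687 | 4565529 | 3 | 2.99 | 0.31 | 97.992 | 4195 |
| IDBA1.1_scaffold | 133098 | 93 | 284363 | 4565454 | 4 | 3.61 | 0.59 | 98.021 | 4204 |
| SPAdes2.4_contigs | 134076 | 97 | 285228 | 4634583 | 2 | 2.99 | 0.57 | 98.916 | 4245 |
| SPAdes2.4_scaffolds | 134076 | 97 | 285228 | 4635776 | 2 | 3.92 | 0.59 | 98.937 | 4245 |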
 
ABySS 1.3.4, EULER-SR 2.0.1, Ray 2.0.0, Velvet, and E+V-SC were run with vertex size 55. A5 and CLC 3.22.55708 were run with default parameters. SOAPdenovo 1.0.4 was run with vertex size 27–31. IDBA 1.1.0 was run in its default iterative mode.

The total assembly size may be inflated, in some cases beyond the genome size, by contaminants (see Chitsaz et al. (2011)), misassembled contigs, repeats, and hubs that contribute to multiple contigs. The Genome fraction (%) column, which gives the percentage of the E. coli genome covered by the assembly, filters out these effects. The NG50 statistic is the same as N50 except that it is computed against the genome size rather than the assembly size. Misassemblies are locations on an assembled contig where the left flanking sequence aligns over 1 kb away from the right flanking sequence on the reference. The mismatch (substitution) error rate and the number of indels are measured in aligned regions of the contigs. In each column, the best assembler by that criterion is indicated in bold.
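To make the difference between N50 and NG50 concrete, here is a minimal Python sketch. It is not the code used to produce the benchmark. The function names `nx50` and `per_100kbp`, the toy contig lengths, and the choice to return `None` when the contigs cover less than half of the target size are all assumptions made for illustration. The 500 bp cutoff mirrors the contig filter stated for the benchmark.

```python
def nx50(contig_lengths, total=None, min_len=500):
    """Return N50 if `total` is None, or NG50 if `total` is the genome size.

    The result is the length L of the contig at which the running sum of
    contig lengths, taken from longest to shortest, first reaches half of
    `total`. Contigs shorter than `min_len` are ignored.
    """
    lengths = sorted((n for n in contig_lengths if n >= min_len), reverse=True)
    if total is None:
        # N50: measure against the assembly size itself.
        total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if 2 * running >= total:
            return length
    # Assumption: report None when the contigs cover < 50% of `total`.
    return None


def per_100kbp(count, aligned_bases):
    """Normalize an event count (mismatches or indels) to a rate per 100 kbp."""
    return count * 100_000 / aligned_bases


# Hypothetical toy assembly; the 400 bp contig is dropped by the filter.
contigs = [400, 600, 1000, 2000, 3000]
n50 = nx50(contigs)                  # measured against assembly size (6600)
ng50 = nx50(contigs, total=12000)    # measured against a 12000 bp "genome"
```

With these toy values the N50 is 2000 while the NG50 is 1000: the same contigs look worse once they are measured against a genome larger than the assembly, which is why NG50 is the fairer statistic when assemblers produce different total lengths.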
 
 

Related publications

  • Anton Bankevich, Sergey Nurk, Dmitry Antipov, Alexey A. Gurevich, Mikhail Dvorkin, Alexander S. Kulikov, Valery M. Lesin, Sergey I. Nikolenko, Son Pham, Andrey D. Prjibelski, Alexey V. Pyshkin, Alexander V. Sirotkin, Nikolay Vyahhi, Glenn Tesler, Max A. Alekseyev, and Pavel A. Pevzner. SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing. Journal of Computational Biology 19(5) (2012), 455-477. doi:10.1089/cmb.2012.0021

  • Son K. Pham, Dmitry Antipov, Alexander Sirotkin, Glenn Tesler, Pavel A. Pevzner, and Max A. Alekseyev. Pathset Graphs: A Novel Approach for Comprehensive Utilization of Paired Reads in Genome Assembly. Journal of Computational Biology (2012). doi:10.1089/cmb.2012.0098

  • Nikolay Vyahhi, Son K. Pham, and Pavel A. Pevzner. From de Bruijn Graphs to Rectangle Graphs for Genome Assembly. Lecture Notes in Bioinformatics 7534 (2012), 249-261. doi:10.1007/978-3-642-33122-0_20

  • Sergey I. Nikolenko, Anton I. Korobeynikov, and Max A. Alekseyev. BayesHammer: Bayesian Clustering for Error Correction in Single-Cell Sequencing. BMC Genomics 14(S1) (2013), S7. doi:10.1186/1471-2164-14-S1-S7

 

Acknowledgements

This work was supported by the Government of the Russian Federation (grant 11.G34.31.0018) and by the National Institutes of Health, USA (NIH grant 3P41RR024851-02S1). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the organizations or agencies that provided support for the project.