Assembly Boot Camp

I. Survey articles:

  1. Genome Reconstruction: A Puzzle with a Billion Pieces. P. Compeau, P. Pevzner, 2010. (read this first!)
  2. (#5 - slides) Assembly of large genomes using second-generation sequencing. MC Schatz, AL Delcher, S. Salzberg - Genome research, 2010.
  3. (#6 - slides) De novo assembly of short sequence reads. K Paszkiewicz, DJ Studholme - Briefings in Bioinformatics, 2010

II. Algorithmic approaches:

  1. (#8 - slides) Opportunistic Data Structures with Applications. Paolo Ferragina and Giovanni Manzini. (2000) p.390 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
  2. (slides) A space-efficient construction of the Burrows-Wheeler transform for genomic data. Lippert RA, Mobarry CM, Walenz BP. J Comput Biol. 2005 Sep;12(7):943-51. (source code)
  3. (#7 - slides) Suffix arrays: A new method for on-line string searches. U. Manber, G. Myers, in: Proc. of Symposium on Discrete Algorithms, SODA, 1990, pp. 319-327.
  4. (#1 - slides) An Eulerian path approach to DNA fragment assembly. Pevzner PA, Tang H, Waterman MS. Proc Natl Acad Sci U S A. 2001 Aug 14;98(17):9748-53
  5. (#2 - slides) Fragment assembly with double-barreled data. Pevzner PA, Tang H. Bioinformatics. 2001;17 Suppl 1:S225-33.
  6. (#3 - slides) Fragment assembly with short reads. M Chaisson, P Pevzner. - Bioinformatics, 2004
  7. (#4 - slides) De novo repeat classification and fragment assembly. PA Pevzner, H Tang, G. Tesler - Genome Research, 2004
  8. (slides) Efficient construction of an assembly string graph using the FM-index. Simpson JT, Durbin R. Bioinformatics. 2010 Jun 15;26(12):i367-73
  9. (slides) Succinct Data Structures for Assembling Large Genomes. Thomas C Conway, Andrew J Bromage (submitted).
  10. Paired de Bruijn graphs: a novel approach for incorporating mate pair information into genome assemblers. P Medvedev, S Pham, M Chaisson, G Tesler, P Pevzner. - RECOMB 2011 (in press)
  11. (slides) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Langmead B, Trapnell C, Pop M, Salzberg SL. Genome Biol 2009 10:R25.
  12. (slides) DRIMM-Synteny: decomposing genomes into evolutionary conserved segments. Pham SK, Pevzner PA. Bioinformatics. 2010 Oct 15;26(20):2509-16

III. Modern assemblers:

  1. (slides) Short read fragment assembly of bacterial genomes. MJ Chaisson, P.A. Pevzner. - Genome Research, 2008
  2. (slides) Velvet: algorithms for de novo short read assembly using de Bruijn graphs. DR Zerbino, E Birney. - Genome research, 2008
  3. (slides) ALLPATHS: de novo assembly of whole-genome shotgun microreads. Butler J, MacCallum I, Kleber M, Shlyakhter IA, Belmonte MK, Lander ES, Nusbaum C, Jaffe DB. Genome Res. 2008 May;18(5):810-20.
  4. (slides) ALLPATHS 2: small genomes assembled accurately and with high continuity from short paired reads. Maccallum I, Przybylski D, Gnerre S, Burton J, Shlyakhter I, Gnirke A, Malek J, McKernan K, Ranade S, Shea TP, Williams L, Young S, Nusbaum C, Jaffe DB. Genome Biol. 2009;10(10):R103.
  5. (slides) De novo fragment assembly with short mate-paired reads: Does the read length matter? MJ Chaisson, D Brinza, P.A. Pevzner - Genome research, 2009
  6. (slides) ABySS: A parallel assembler for short read sequence data. JT Simpson, K Wong, SD Jackman. - Genome research, 2009 (nice pic)
  7. (slides) De novo assembly of human genomes with massively parallel short read sequencing. R Li, H Zhu, J Ruan, W Qian, X Fang. - Genome research, 2010
  8. (slides) Quake: quality-aware detection and correction of sequencing errors. Kelley DR, Schatz MC, Salzberg SL. Genome Biol. 2010 Nov 29;11(11):R116.
  9. Single cell genome sequencing. Chitsaz et al., (submitted)

IV. Additional data structures:

  1. Succinct Indexable Dictionaries with Applications to Encoding k-ary Trees, Prefix Sums and Multisets. Raman, Raman, Rao, ACM, 2007.

Attachment: 2009-Zerbino-Genome assembly and comparison using de Bruijn graphs.pdf (4.92 MB)