
SiBELia: A comparative genomics tool

Submitted by Nikolay Vyahhi on 16 Oct 2012, Tue, 11:32

Today is the first release of SiBELia, the Synteny Block ExpLoration tool! It finds synteny blocks in genomes represented as nucleotide sequences, using an approach based on de Bruijn graphs.
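To make the de Bruijn graph idea concrete, here is a minimal sketch of the generic textbook construction: every k-mer of a sequence becomes an edge from its (k-1)-prefix to its (k-1)-suffix. This is not SiBELia's implementation; the function names `build_de_bruijn_graph` and `shared_vertices`, the toy sequences, and the value of k are all assumptions chosen for illustration.

```python
from collections import defaultdict


def build_de_bruijn_graph(sequences, k):
    """Return {prefix: [(suffix, seq_index, position), ...]} over all k-mers.

    Generic textbook construction (an assumption for illustration only),
    not the algorithm implemented in SiBELia.
    """
    graph = defaultdict(list)
    for seq_index, seq in enumerate(sequences):
        for pos in range(len(seq) - k + 1):
            kmer = seq[pos:pos + k]
            # Each k-mer is an edge from its (k-1)-prefix to its (k-1)-suffix.
            graph[kmer[:-1]].append((kmer[1:], seq_index, pos))
    return dict(graph)


def shared_vertices(graph, n_sequences):
    """Vertices that have outgoing edges from every input sequence."""
    everyone = set(range(n_sequences))
    return sorted(
        vertex for vertex, edges in graph.items()
        if {seq_index for _, seq_index, _ in edges} == everyone
    )


if __name__ == "__main__":
    toy_genomes = ["ACGTACG", "TTACGTA"]  # hypothetical toy sequences
    graph = build_de_bruijn_graph(toy_genomes, k=3)
    print(shared_vertices(graph, len(toy_genomes)))
```

Vertices visited by every sequence are only a rough hint of conserved regions; turning them into synteny blocks requires further graph processing that this sketch does not attempt.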

Congratulations to Ilya Minkin, the author of SiBELia and a grad student at the St. Petersburg Academic University, and to Son Pham, who is supervising this project. For more information, please take a look at the poster. We'll be happy to hear your feedback.


Rectangle Graph for Repeat Resolution

Rectangle Graph for Repeat Resolution in Genome Assembly

A tool for resolving repeats in genome assemblies.

Although one implementation of the rectangle graph approach is already included in the current SPAdes distribution, we are also releasing the Rectangle Graph Module (RGM) as separate code that can be run independently of SPAdes. RGM differs from the implementation currently in SPAdes, and we plan to integrate RGM into SPAdes in the future. RGM can also be run with other genome assemblers if they use the same graph format as the SPAdes files.

For more details see: Nikolay Vyahhi, Son K. Pham, Pavel Pevzner. From de Bruijn Graphs to Rectangle Graphs for Genome Assembly, Lecture Notes in Bioinformatics 7534 (2012), pp. 249-261.

The code is provided "as is"; for questions, please contact Nikolay Vyahhi.

Requirements: Python 2.7.x (2.5 and 2.6 may also work)

Usage:

Run SPAdes in debug mode (spades.py --debug) so that it produces a saves directory.
Run the rectangle module from the command line: python rrr.py -s project_name/saves [options] [-o out_dir=out]
Check out_dir for contigs, log files, and other debug information.

All rrr.py command line options:

-h, --help show this help message and exit

-s SAVES_DIR Name of directory with saves

-g GENOME File with genome (optional)

-o OUT_DIR Output directory, default = out (optional)

-d DEBUG_LOGGER File for debug logger (optional)

--sc Turn on if data is single-cell (optional)
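As a small convenience, the options listed above can be assembled programmatically. The sketch below only builds the argument list from the documented flags (-s, -g, -o, -d, --sc). The helper names `build_rrr_command` and `run_rrr` are hypothetical, and the sketch assumes that rrr.py sits in the current directory and that `python` resolves to a suitable interpreter.

```python
import subprocess


def build_rrr_command(saves_dir, out_dir=None, genome=None,
                      debug_log=None, single_cell=False,
                      script="rrr.py", python="python"):
    """Assemble the argv list for rrr.py from the documented options.

    The helper itself is hypothetical; only the flags come from the
    option list above.
    """
    cmd = [python, script, "-s", saves_dir]
    if genome is not None:
        cmd += ["-g", genome]        # optional file with the genome
    if out_dir is not None:
        cmd += ["-o", out_dir]       # rrr.py defaults to "out" if omitted
    if debug_log is not None:
        cmd += ["-d", debug_log]     # optional file for the debug logger
    if single_cell:
        cmd.append("--sc")           # turn on for single-cell data
    return cmd


def run_rrr(*args, **kwargs):
    """Run rrr.py; requires rrr.py and a SPAdes saves directory to exist."""
    return subprocess.check_call(build_rrr_command(*args, **kwargs))


if __name__ == "__main__":
    print(" ".join(build_rrr_command("project_name/saves",
                                     out_dir="out", single_cell=True)))
```

Passing the arguments as a list avoids shell quoting problems when directory names contain spaces.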

Sibelia
(aka Synteny Block ExpLoration tool)

Sibelia is a comparative genomics tool. It assists biologists in analysing the genomic variations that correlate with pathogens, or the genomic changes that help microorganisms adapt to different environments. Sibelia is also helpful for evolutionary and genome rearrangement studies of multiple strains of microorganisms.

Sibelia is useful for finding: (1) shared regions, (2) regions that are present in one group of genomes but not in others, (3) rearrangements that transform one genome into other genomes.

In version 2, Sibelia works with multiple strains of bacteria and partitions their genomes into synteny blocks, that is, blocks of highly conserved regions among all compared genomes. It represents genomes as Circos pictures (for publication) or in interactive form (for expert analysis).

Sibelia is under active development. If you see that Sibelia has the potential to support your research, please do not hesitate to contact us at vyahhi@bioinf.spbau.ru with a list of features you would like Sibelia to have.

Related publications:

Ilya Minkin, Nikolay Vyahhi, Son Pham. "SyntenyFinder: A Synteny Blocks Generation and Genome Comparison Tool" (WABI 2012 poster)

Ilya Minkin, Anand Patel, Mikhail Kolmogorov, Nikolay Vyahhi, Son Pham. Sibelia: A fast synteny blocks generation tool for many closely related microbial genomes (submitted)

Acknowledgements

This work was supported by the Government of the Russian Federation (grant 11.G34.31.0018) and by the National Institutes of Health, USA (NIH grant 3P41RR024851-02S1).

Circos visualization:

The figure illustrates the hierarchical structure of synteny blocks between two strains of Helicobacter pylori: F32 and Gambia94/24.
Staphylococcus aureus subsp. aureus, strains JH1, N315, TW20 and MSSA476. Minimum size of a block = 10 000 bp.
Pseudomonas aeruginosa, strains UCBPP-PA14, PAO1 and NCGM2.S1. Minimum size of a block = 10 000 bp.
Helicobacter pylori, strains F32 and Gambia94/24. Minimum size of a block = 5000 bp.


Science and Education in St. Petersburg

Submitted by Nikolay Vyahhi on 29 Sep 2012, Sat, 15:54

On September 20, the TVC (ТВЦ) TV channel aired a report on the Algorithmic Biology Laboratory and on other scientific and educational projects in St. Petersburg.


RECOMB-AB and RECOMB-BE 2012 were organized in St. Petersburg

Submitted by Nikolay Vyahhi on 10 Sep 2012, Mon, 17:01

We organized and hosted the RECOMB-AB (Open Problems in Algorithmic Biology) and RECOMB-BE (Bioinformatics Education) satellite conferences this year.

It was awesome to see you all in St. Petersburg. We'll post all materials and photos a little later.


Interns

Summer 2013

Petar Ivanov
Moscow State University

Vitaliy Demyanuk
ITMO

Artem Tarasov
St. Petersburg State University

Summer 2012

Pavel Avdeev
ITMO

Irina Gorbunova
SPbSU

Aleksey Kladov
SPbSU

Tatyana Krivosheeva
MSU

Ilya Minkin
SPbAU

Ivan Mihajlin
SPbSTU

Alexander Opeykin
SPbAU

Taisia Peunova
SPbSU

Vladislav Saveliev
SPbAU

Yana Safonova
NNSU

Ilya Chernyavsky
SPbAU

Summer 2011

Maxim Gladkih
SPbSU

Maria Fomkina
SPbAU

Alex Davydow
SPbAU

Yuri Zemlyansky
SPbSU

Andrey Lushnikov
SPbSU

Ilya Makeev
ITMO

Alexey Gurevich
SPbAU

Andrey Prjibelski
SPbAU

Alex Pyshkin
SPbSU


Antibody sequencing

Antibodies are proteins produced by the body’s immune system in response to antigens (potentially harmful substances). They are formed of four polypeptide chains: two identical heavy chains and two identical light chains. A heavy chain is composed of four gene segments: V (variable), D (diversity), J (joining), and C (constant); similarly, a light chain consists of three gene segments: V, J, and C. Antibodies are not encoded directly in the genome, but are assembled from those gene segments, each chosen from hundreds of candidates. Moreover, some nucleotides may be inserted or deleted at the junctions, increasing antibody diversity, and somatic hypermutation further diversifies the antibody repertoire.

The effectiveness of an antibody in blocking a particular antigen strongly depends on its amino acid sequence, as well as on the presence (or absence) of certain modifications. This makes the task of antibody sequencing highly important. However, due to their diversity, no complete antibody database exists. As a consequence, MS/MS database search approaches to protein sequencing are inapplicable to this case, leaving de novo sequencing the most attractive alternative.

Just a few years ago, sequencing a single antibody was a heroic effort. “Digitizing” the $25 billion antibody industry is an important goal because antibodies act as key diagnostic and therapeutic agents. Our experimental collaborators anticipate that as soon as the cost of antibody sequencing drops below $1,000, most diagnostic and therapeutic antibodies will be routinely sequenced. This flurry of sequencing thousands of antibodies will necessarily lead to digitization throughout the industry, a task requiring advanced software tools. Future applications will also focus on previously unsequenced polyclonal antibodies. This research, if successful, would lead to disruptive computational technology in the antibody industry.

At the first stage of this project, we have developed a de Bruijn graph approach for the de novo assembly of thousands of top-down spectra into a protein sequence.

QUAST: Quality Assessment Tool for Genome Assemblies

Dear users! Our website has moved to quast.sf.net, where you can find the new QUAST page.

QUAST evaluates genome assemblies. For metagenome assembly evaluation, see the MetaQUAST project. For contig alignment visualization, see the Icarus project.

QUAST works both with and without a reference genome.

The tool accepts multiple assemblies, so it is suitable for comparing them.

Source code

Manual

QUAST web interface

Paper at Bioinformatics journal

Paper at PubMed

Poster at RECOMB-2013 (PDF)

Citation:

Alexey Gurevich, Vladislav Saveliev, Nikolay Vyahhi and Glenn Tesler,
QUAST: quality assessment tool for genome assemblies,
Bioinformatics (2013) 29 (8): 1072-1075.
doi: 10.1093/bioinformatics/btt086
First published online: February 19, 2013

Citation of other formats

Latest news:

April 19, 2016: version 4.0 is released, now with the Icarus visualizer! Check out the full list of changes.
April 1, 2016: the MetaQUAST paper was published in Bioinformatics, volume 32, issue 7, pp. 1088-1090.
November 27, 2015: version 3.2 is released, now with raw reads support. Check out the full list of changes.
July 13, 2015: the QUAST repositories are open for public access on GitHub, both the command-line tool and the web interface.
June 2015: the number of QUAST downloads exceeded ten thousand (including more than 6500 downloads of QUAST v.2.3)!
April 15, 2013: the paper was published in Bioinformatics, volume 29, issue 8, pp. 1072-1075.

Please help us to make QUAST better by sending your comments, bug reports, and suggestions to quast.support@bioinf.spbau.ru.

Samples of QUAST plots:

Interactive QUAST evaluation demos of single-cell E. coli, H. sapiens chr. 14, and B. impatiens (bumblebee) assemblies.


QUAST 1.0 released.

Submitted by Nikolay Vyahhi on 3 Aug 2012, Fri, 00:01

Today we've also released the first version of QUAST to the public! QUAST evaluates genome assemblies by computing various metrics (e.g., N50 and the number of ORFs), drawing plots, and producing handy reports.

We use it every day to improve the SPAdes single-cell assembler. Enjoy, and we would be happy to hear your feedback.
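For readers unfamiliar with N50, one of the metrics mentioned above, here is a minimal sketch of its standard definition: the largest length L such that contigs of length at least L together cover at least half of the total assembly length. The function name `n50` is an assumption for illustration; this is not QUAST's code, and QUAST's own implementation may differ in details such as filtering short contigs.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (standard definition).

    Illustrative sketch only, not taken from QUAST.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Total length is 290; the two longest contigs (80 + 70 = 150) cover half.
    print(n50([80, 70, 50, 40, 30, 20]))
```

Comparing `2 * running` with `total` keeps the arithmetic in integers and avoids rounding issues when the total length is odd.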


Talks

Lectures:

July 21–25, 2013: International Conference "High-Throughput Sequencing in Genomics" (HSG-2013), Novosibirsk, Russia:
- Alla Lapidus, "Genome assembly and finishing – why high quality references are needed"
- Andrey Prjibelski, "Genome draft assembly algorithms: from the very beginning till present-day problems"
July 19–21, 2013: Youth Scientific and Practical School "Genomic Sequencing and Data Analysis", Novosibirsk, Russia:
- Alla Lapidus, "Bacterial genome assembly"
- Andrey Prjibelski, "An easy way to assemble a genome from NGS data in 30 minutes"
January 30, 2013: Kira Vyatkina, "Protein Identification by Methods of Computational Mass Spectrometry" (at Ioffe Physical Technical Institute, the Russian Academy of Sciences, Saint Petersburg)

December 4, 2012: Nikolay Vyahhi, "Биоинформатика, молекулярная биология и сборка геномов" ("Bioinformatics, Molecular Biology, and Genome Assembly", at St. Petersburg State University, seminar of the P. L. Chebyshev Laboratory, "Enumerative Combinatorics and Random Matrices")

November 20, 2012: Nikolay Vyahhi: "Биоинформатика, строковые и графовые алгоритмы" ("Bioinformatics, String and Graph Algorithms", keynote at HPC-2012 forum in Nizhni Novgorod)
October 9, 2012: Kira Vyatkina, "Protein Identification from Top-Down Mass Spectra" (at St Petersburg Scientific Forum "Science and Society. Science and Mankind Progress" — the Seventh Nobel Prize Laureates Meeting)
October 5-7, 2012: Nikolay Vyahhi: "Алгоритмы в биоинформатике" ("Algorithms in Bioinformatics", at Computer Science Club in Ekaterinburg).
September 12, 2012: Kira Vyatkina. "MORPH-PRO: A Novel Algorithm and Web Server for Protein Morphing" (at WABI 2012)
September 11, 2012: Nikolay Vyahhi. "Rectangle Graphs in Genome Assembly" (at WABI 2012)
June 16, 2012: Nikolay Vyahhi. "Геном из одной клетки" ("Single-Cell Genomics", at TEDxNevaRiver «Действующие лица»).

April 27-28, 2012: A symposium to launch the Theodosius Dobzhansky Center for Genome Informatics:
- April 27, 2012: Pavel Pevzner "BioInformatics and genome assemblies — Challenges for Russia and for the world community"
- April 27, 2012: Alla Lapidus "Genomics and cancer: a moving landscape"
- April 28, 2012: Sergey Nikolenko "Mathematical Biology and Informatics"

March 17, 2012: Kira Vyatkina. Теги пептидных последовательностей для масс-спектров, полученных на основе технологии top-down ("Peptide Sequence Tags for Mass Spectra Obtained with Top-Down Technology", at Saint Petersburg State University)
November 18, 2011: Max Alekseyev. Вычислительные задачи и решения в сборке геномов из коротких парных ридов ("Computational Problems and Solutions in Genome Assembly from Short Paired Reads", at Nizhni Novgorod Mathematical Society)
November 15, 2011: Max Alekseyev. Комбинаторные задачи и алгоритмы сравнительного анализа геномов ("Combinatorial Problems and Algorithms for Comparative Genome Analysis", at Novosibirsk State University)
September 8, 2011: Kira Vyatkina. Эффективные методы идентификации масс-спектров с использованием тегов ("Efficient Tag-Based Methods for Mass Spectrum Identification", at Saint Petersburg State University)
May 30, 2011: Pavel Pevzner. Genome rearrangements: from biological problems to combinatorial algorithms (and back) (at PDMI General Mathematics Seminar)

May 25, 2011: Max Alekseyev. Computational Challenges and Advances in Genome Assembly from Short Reads. (at University of South Carolina)
May 7, 2011: Max Alekseyev. Алгоритмические задачи в биоинформатике ("Algorithmic Problems in Bioinformatics", at Nizhni Novgorod State University)
May 7, 2011: Pavel Pevzner. Вычислительная протеомика ("Computational Proteomics", at Academic University).
May 1, 2011: Max Alekseyev. Комбинаторные задачи и алгоритмы сравнительного анализа геномов ("Combinatorial Problems and Algorithms for Comparative Genome Analysis", at PDMI Computer Science Club).
December 9, 2010: Pavel Pevzner. Genome Rearrangements: from Biological Problems to Combinatorial Algorithms (at PDMI Computer Science Club).
