QUAST 3.1 is out - even better, faster, much lighter installation package, and brand new visualizations in MetaQUAST

QUAST 3.1 released! Even better, faster, and much lighter than QUAST 3.0.

  • Significantly reduced size of the installation package
  • MetaQUAST includes Krona charts, heatmaps, and many other new features
http://quast.bioinf.spbau.ru/metaquast/metahit/summary/report.html

SPAdes 3.6: better performance + truSPAdes for TruSeq long reads

SPAdes 3.6 is out!

NEW: Added truSPAdes – an assembler for short reads produced by Illumina TruSeq Long Read technology.

Changes:

  • Better running time, less RAM consumption and improved results for BayesHammer error correction module.
  • Improvements and bugfixes in repeat resolution and scaffolding modules.
  • Improvements and bugfixes in dipSPAdes.
  • MismatchCorrector now uses bwa-mem.
  • Bugfixes in MismatchCorrector.

MetaQUAST: Quality Assessment Tool for Metagenome Assemblies

Dear users! Our website moved to quast.sf.net! New MetaQUAST page is here.

 
MetaQUAST evaluates and compares metagenome assemblies based on alignments to close references. It builds on the QUAST genome quality assessment tool but addresses features specific to metagenomic datasets:
  • Huge species diversity – the tool accepts multiple references and produces multi-genome tables and plots, including Krona charts.
  • Commonly unknown species content – the tool automatically detects and downloads reference sequences from NCBI.
  • Presence of closely related genomes – the tool detects chimeric contigs and reports "interspecies misassemblies" in addition to the regular assembly error types.

MetaQUAST accepts multiple assemblies at once, which makes it well suited for comparing them.

MetaQUAST has been distributed as part of the QUAST package since version 2.2.

 

GitHub     Full manual (MetaQUAST specific sections: running, output description)

Paper at Bioinformatics journal

Citation:

Alla Mikheenko, Vladislav Saveliev, Alexey Gurevich,
MetaQUAST: evaluation of metagenome assemblies,
Bioinformatics (2016) 32 (7): 1088-1090.
First published online: November 26, 2015

Citation in other formats

Examples of comparison of 4 assemblers on 3 datasets generated with MetaQUAST v3.2:

 
    
 
Please help us to make MetaQUAST better by sending your comments, bug reports, and suggestions to quast.support@bioinf.spbau.ru

QUAST 3 is out!

Brand new version of QUAST is released!

  • Faster: 5x–100x faster when running without a reference; parallel processing with large references; new speed-up options: --no-check, --no-gc, --no-snps, --fast
  • Better: more accurate misassembly detection algorithm, new metrics and reports
  • New MetaQUAST: better contig binning, multi-reference reports, interspecies translocations

Download here. Check the full list of changes here.

rnaSPAdes: De Novo RNA-Seq Assembler

rnaSPAdes page was moved to cab.spbu.ru/software/rnaspades/

 

 

 

rnaQUAST: Quality Assessment Tool for Transcriptome Assemblies

This page has moved to cab.spbu.ru/software/rnaquast/

 

SPAdes 3.5 is out: Nanopores, Lucigen NxMate mate-pairs, new mismatch correction

New SPAdes 3.5 is out!

This release includes a new mismatch correction module, support for Oxford Nanopore and Lucigen NxMate mate-pair libraries, the option to specify a coverage cutoff, improved performance, and several fixes.

Eugene Kurpilyansky

 
Education:
2008 - 2012. Ural Federal University, Institute of Mathematics and Computer Science, Faculty of Mathematics and Mechanics, BSc in Computer science.
2012 - 2014. Ural Federal University, Institute of Mathematics and Computer Science, Faculty of Mathematics and Mechanics, MSc in Computer science.
2014 - present. St. Petersburg University of the Russian Academy of Sciences, PhD student.
 
Teaching experience:
2008 - present. Summer Informatics School.
 
Projects:

Reconstruction of antibodies repertoire from NGS data.

 
Awards:

2010 and 2011. Bronze medals at the ACM ICPC World Finals.

Immunoproteogenomics: analysis of antibody repertoire

What is antibody repertoire?

An antibody repertoire is the set of circulating antibodies. Reconstructing the antibody repertoire is an important step in antibody drug development. We present a collection of tools for investigating antibody repertoires based on immunosequencing data:

IgRepertoireConstructor: an algorithm for construction of antibody repertoire and immunoproteogenomics analysis

IgSimulator: tool for simulation of antibody repertoire

IgQUAST: quality assessment tool for antibody repertoires (coming soon)




Antibody repertoire representation

We represent an antibody repertoire as a set of clusters that correspond to antibody clones (groups of identical antibodies, each described by the antibody nucleotide sequence, its frequency, and the set of Ig-Seq reads that make up the group). We use two files to describe an antibody repertoire: CLUSTERS.FA (a FASTA file containing antibody sequences) and RCM (Read-Cluster Map). Examples of CLUSTERS.FA and RCM files for a toy repertoire are listed below.

CLUSTERS.FA is a FASTA file in which each sequence corresponds to an antibody clone.

The header of each sequence contains the id and size of the corresponding cluster.

The example shows a repertoire containing 3 clusters of sizes 3, 2, and 1.

 

Every line of the RCM file contains a read name and the id of the corresponding cluster.

For example, cluster 1 consists of the reads MISEQ@:53:000000000-A2BMW:1:2114:14345:28882, MISEQ@:53:000000000-A2BMW:1:2114:14345:28882 and MISEQ@:53:000000000-A2BMW:1:2114:14393:28886.
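
As a rough illustration of how an RCM file can be consumed, the following Python sketch groups reads by cluster id. The exact field separator is not stated on this page, so the sketch assumes that the cluster id is the last whitespace-separated field on each line; the function name parse_rcm and the toy read names are hypothetical.

```python
from collections import defaultdict


def parse_rcm(lines):
    """Map each cluster id to the list of read names assigned to it."""
    clusters = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumption: the cluster id is the last whitespace-separated field.
        read_name, cluster_id = line.rsplit(None, 1)
        clusters[cluster_id].append(read_name)
    return dict(clusters)


# Toy read-cluster map with 3 clusters of sizes 3, 2, and 1 (made-up read names).
toy_rcm = """\
read_a\t1
read_b\t1
read_c\t1
read_d\t2
read_e\t2
read_f\t3
"""

clusters = parse_rcm(toy_rcm.splitlines())
sizes = {cluster_id: len(reads) for cluster_id, reads in clusters.items()}
print(sizes)  # {'1': 3, '2': 2, '3': 1}
```

Cluster sizes derived this way should agree with the sizes recorded in the CLUSTERS.FA headers for the same repertoire.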


IgRepertoireConstructor

IgRepertoireConstructor is a tool for constructing an antibody repertoire from an Illumina Ig-Seq library. It takes as input immunosequencing reads that cover the variable regions of antibodies and returns the antibody repertoire constructed from those reads.

Visit IgRepertoireConstructor official page at GitHub for more details and download the latest version!

 

IgSimulator

IgSimulator is a tool for simulating an antibody repertoire and an Ig-Seq library. It is designed for testing and benchmarking tools that reconstruct Ig repertoires.

Visit IgSimulator official page at GitHub for more details and download the latest version!

 


IgQUAST

IgQUAST (Immunoglobulin QUality ASsessment Tool) is a tool for quality assessment of antibody repertoires. IgQUAST takes one or more antibody repertoires as input and evaluates them in several ways:
  • Single repertoire evaluation
  • Multiple repertoires comparison
  • Quality assessment against an ideal repertoire

Single repertoire evaluation

IgQUAST computes basic metrics: # clusters, # singletons (clusters consisting of a single read), size of the maximal cluster, average cluster size, and the number of clusters whose size reaches given thresholds (# clusters >= 10, # clusters >= 50, # clusters >= 100, etc.). It also draws plots, such as histograms of the cluster size and cluster length distributions:
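
The basic metrics listed above are simple functions of the cluster sizes. The Python sketch below is only an illustration of those definitions, not IgQUAST's own code; the function name basic_metrics and the example sizes are made up.

```python
def basic_metrics(cluster_sizes, thresholds=(10, 50, 100)):
    """Compute basic size metrics for a non-empty list of cluster sizes."""
    sizes = list(cluster_sizes)
    metrics = {
        "# clusters": len(sizes),
        # A singleton is a cluster consisting of a single read.
        "# singletons": sum(1 for s in sizes if s == 1),
        "max cluster size": max(sizes),
        "avg cluster size": sum(sizes) / len(sizes),
    }
    for t in thresholds:
        metrics[f"# clusters >= {t}"] = sum(1 for s in sizes if s >= t)
    return metrics


metrics = basic_metrics([120, 55, 12, 3, 1, 1])
for name, value in metrics.items():
    print(f"{name}: {value}")
```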

[Figures: histogram of cluster size distribution; histogram of cluster length distribution]

IgQUAST additionally performs advanced analysis of mutated groups (groups of antibodies that possibly developed from the same antibody). An example of this advanced analysis is shown below:

 

 

 

(a) Example visualization of an alignment of two clusters. Peaks correspond to positions of polymorphisms in the alignment. Red bars correspond to positions of CDRs computed by IgBlast.

(b) Example visualization of the summarized alignment of a cluster against similar clusters.

(c) Example histogram of relative positions of polymorphisms. Red bars correspond to theoretical positions of CDRs.

 

Multiple repertoire comparison

IgQUAST compares two or more repertoires constructed from the same Ig-Seq library and computes a set of metrics showing the similarity of the input repertoires.

General metrics for all compared repertoires

Metric name | Description
# ideal groups | Number of clusters that are identical in all input repertoires, i.e. have similar sequences and were built from the same set of reads.
# trusted groups | Number of groups where clusters from different repertoires have similar sequences and share >90% of reads. Such groups occur when a cluster from one repertoire is represented by one big and several small clusters in other repertoires. These groups can result from inaccurate error correction in one of the input repertoires.
# untrusted groups | Number of groups where clusters from different repertoires have non-similar sequences and share >90% of reads. The existence of such groups indicates that at least one cluster sequence in the untrusted group is erroneous and should be reconstructed.
# non-trivial ideal/trusted/untrusted groups | Ideal/trusted/untrusted groups where at least one cluster is not a singleton.
# big untrusted groups | Number of groups of big clusters (only clusters of size at least the value given with the option --isol-min-size) from different repertoires that have similar sequences and share >90% of reads.
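
The group metrics above rely on how many reads two clusters from different repertoires have in common. The page does not define exactly how the ">90% of reads" share is measured, so the Python sketch below assumes it is the number of shared reads divided by the size of the larger cluster. It ignores sequence similarity entirely, and the function names are hypothetical.

```python
def shared_fraction(reads_a, reads_b):
    """Fraction of shared reads, relative to the larger cluster (assumption)."""
    a, b = set(reads_a), set(reads_b)
    return len(a & b) / max(len(a), len(b))


def classify_pair(reads_a, reads_b, threshold=0.9):
    """Classify two non-empty clusters from different repertoires by read overlap only."""
    if set(reads_a) == set(reads_b):
        return "same read set"          # candidate for an ideal group
    if shared_fraction(reads_a, reads_b) > threshold:
        return "shares >90% of reads"   # candidate for a trusted/untrusted group
    return "below threshold"


# Made-up clusters: one of 20 reads and two that hold subsets of it.
big = [f"r{i}" for i in range(20)]
almost = big[:19]   # 19 of 20 reads shared -> 0.95
half = big[:10]     # 10 of 20 reads shared -> 0.5

print(classify_pair(big, big))     # same read set
print(classify_pair(big, almost))  # shares >90% of reads
print(classify_pair(big, half))    # below threshold
```

Whether a group that passes the read-overlap test counts as trusted or untrusted then depends on comparing the cluster sequences, which this sketch leaves out.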

Individual metrics for each repertoire

Metric name | Description
# isolated clusters | Number of clusters that are present in only one input repertoire and have no similar clusters in other repertoires.
# short clusters | Number of clusters with sequence length <300 nt.
# short isolated clusters | Number of isolated clusters with sequence length <300 nt.
min/avg/max cluster size | Minimal/average/maximal size of an isolated cluster.
# trivial isolated clusters | Number of isolated singletons.

 

IgQUAST reports various plots showing comparative histograms of cluster size / antibody length distribution for input repertoires:

Quality assessment against an ideal repertoire

IgQUAST evaluates a repertoire with respect to an ideal repertoire (e.g., in the case of a simulated repertoire) in terms of sensitivity (how well the ideal clusters are represented by the constructed clusters) and specificity (the error rate of incorrectly merged clusters of the ideal repertoire):

Metric name | Description
# original clusters | Number of clusters in the ideal repertoire.
# not merged | Number of non-trivial clusters in the original repertoire that contain multiple clusters in the constructed repertoire. For a correctly constructed repertoire, this metric is 0.
# not merged (not trivial + singletons) | Number of not merged clusters that are formed by a single non-trivial cluster and a number of singletons in the constructed repertoire.
# original singletons | Number of singletons in the ideal repertoire.
max original cluster | Size of the maximal cluster in the ideal repertoire.
# constructed clusters | Number of constructed clusters.
# errors | Number of constructed clusters that contain reads from more than one original cluster. For a correctly constructed repertoire, this metric is 0.
# constructed singletons | Number of constructed singleton clusters.
max constructed cluster | Size of the maximal constructed cluster.
avg fill-in | The fill-in of an original cluster C is the ratio of the size of its largest non-erroneous subcluster in the constructed repertoire to the size of C.
fill-in of max cluster | The maximal cluster of the original repertoire corresponds to the most frequent monoclonal antibodies. This metric is the fill-in of the maximal original cluster.
correct singletons (%) | Some singletons in the constructed repertoire can be false due to insufficient error correction. This metric shows the percentage of true singletons in the constructed repertoire.
used reads (%) | Percentage of reads used in the repertoire reconstruction. This metric shows how well the reads have been utilized for reconstructing the repertoire.
# lost clusters | Number of original clusters that were completely lost in the constructed repertoire.
lost clusters size (%) | Total size of the lost clusters as a percentage of the full size of the original repertoire.
min/avg/max percentage of identity (%) | Minimal/average/maximal percentage of identity between sequences of clusters from the original repertoire and the corresponding clusters in the constructed repertoire (the corresponding constructed cluster is the one that shares the most reads with the original cluster).
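
Several of the metrics above can be computed from two read-to-cluster maps alone. The following Python sketch is an illustration of the definitions in the table, not IgQUAST's implementation. It assumes that both repertoires are given as dictionaries from read name to cluster id, that a read absent from the constructed map was not used, that a cluster is "lost" when none of its reads appear in the constructed repertoire, and that avg fill-in is averaged over all original clusters. All names are hypothetical.

```python
from collections import defaultdict


def compare_to_ideal(original, constructed):
    """Compare a constructed repertoire to an ideal one via read -> cluster id maps."""
    orig_size = defaultdict(int)
    # original cluster id -> {constructed cluster id: number of shared reads}
    orig_parts = defaultdict(lambda: defaultdict(int))
    # constructed cluster id -> set of original cluster ids it draws reads from
    constr_sources = defaultdict(set)

    for read, orig_id in original.items():
        orig_size[orig_id] += 1
        constr_id = constructed.get(read)
        if constr_id is not None:  # unused reads are absent from `constructed`
            orig_parts[orig_id][constr_id] += 1
            constr_sources[constr_id].add(orig_id)

    # A constructed cluster is erroneous if it mixes reads of several original clusters.
    erroneous = {c for c, sources in constr_sources.items() if len(sources) > 1}

    fill_in = {}
    for orig_id, size in orig_size.items():
        clean = [n for c, n in orig_parts[orig_id].items() if c not in erroneous]
        fill_in[orig_id] = max(clean, default=0) / size

    used = sum(1 for read in original if read in constructed)
    return {
        "# errors": len(erroneous),
        "# not merged": sum(1 for o, size in orig_size.items()
                            if size > 1 and len(orig_parts[o]) > 1),
        "avg fill-in": sum(fill_in.values()) / len(fill_in),
        "# lost clusters": sum(1 for o in orig_size if not orig_parts[o]),
        "used reads (%)": 100 * used / len(original),
    }


# Toy data: original clusters A (4 reads), B (2 reads), C and D (singletons).
original = {"r1": "A", "r2": "A", "r3": "A", "r4": "A",
            "r5": "B", "r6": "B", "r7": "C", "r8": "D"}
# Cluster A is split in two, B and C are wrongly merged, and r8 is unused.
constructed = {"r1": 1, "r2": 1, "r3": 1, "r4": 2,
               "r5": 3, "r6": 3, "r7": 3}

result = compare_to_ideal(original, constructed)
print(result)
```

In this toy case the split of A yields one "not merged" cluster with fill-in 0.75, the merge of B and C yields one error, and D counts as lost.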

 

SPAdes 3.1.1: a bug fix release

Introduced several improvements in IonHammer; fixed minor bugs in repeat resolution and scaffolding.

http://bioinf.spbau.ru/en/spades

 
