Rectangle Graph for Repeat Resolution in Genome Assembly

A tool for resolving repeats in genome assemblies.

An implementation of the rectangle graph approach is already included in the current SPAdes distribution, but we are also releasing the Rectangle Graph Module (RGM) as separate code that can be run independently of SPAdes. RGM differs from the implementation currently in SPAdes; we plan to integrate RGM into SPAdes in the future. RGM can also be run with other genome assemblers, provided they store their graphs in the same format as the SPAdes files.

For more details see: Nikolay Vyahhi, Son K. Pham, Pavel Pevzner. From de Bruijn Graphs to Rectangle Graphs for Genome Assembly, Lecture Notes in Bioinformatics 7534 (2012), pp. 249-261.

The code is provided "as is". For questions, please contact Nikolay Vyahhi.

Requirements: Python 2.7.x (2.5 and 2.6 may also work)

Usage:

  1. Run SPAdes in debug mode (spades.py --debug) so that it produces a saves directory.
  2. Run the rectangle module from the command line: python rrr.py -s project_name/saves [options] [-o out_dir] (out_dir defaults to out)
  3. Check out_dir for contigs, log files and other debug information.
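
Steps 2 and 3 can also be scripted. The following is a minimal sketch and is not part of RGM: the helper name run_rectangles and all of its parameters are hypothetical, and it assumes that rrr.py sits in the current directory and that "python" resolves to a Python 2.7 interpreter. It passes only the -s and -o options listed on this page.

```python
import os
import subprocess


def run_rectangles(saves_dir, out_dir="out", script="rrr.py",
                   interpreter="python", runner=subprocess.check_call):
    """Run the rectangle module on a SPAdes saves directory (hypothetical helper).

    Returns the sorted names of the entries found in out_dir afterwards.
    """
    if not os.path.isdir(saves_dir):
        raise ValueError("saves directory not found: %s" % saves_dir)
    # Same command as in step 2: python rrr.py -s <saves_dir> -o <out_dir>
    command = [interpreter, script, "-s", saves_dir, "-o", out_dir]
    runner(command)  # check_call raises CalledProcessError on a non-zero exit
    # Step 3: report what the run left in the output directory, if anything.
    return sorted(os.listdir(out_dir)) if os.path.isdir(out_dir) else []
```

A call such as run_rectangles("project_name/saves") would then correspond to step 2 with the default output directory. The runner argument exists only so that the command can be inspected or replaced in a dry run.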

All rrr.py command line options:

  -h, --help       show this help message and exit
  -s SAVES_DIR     Name of directory with saves
  -g GENOME        File with genome (optional)
  -o OUT_DIR       Output directory, default = out (optional)
  -d DEBUG_LOGGER  File for debug logger (optional)
  --sc             Turn on if data is single-cell (optional)
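
To make clear which options take a value and which is a plain switch, the listing above can be mirrored with Python's argparse. This is only an illustration and not the actual rrr.py source: the attribute names (saves_dir, genome, out_dir, debug_logger, sc) are assumptions derived from the help text, and the sample arguments are hypothetical.

```python
import argparse


def make_parser():
    """Build a parser with the options listed above (illustrative only)."""
    parser = argparse.ArgumentParser(prog="rrr.py")  # -h/--help is added automatically
    parser.add_argument("-s", dest="saves_dir", help="Name of directory with saves")
    parser.add_argument("-g", dest="genome", help="File with genome (optional)")
    parser.add_argument("-o", dest="out_dir", default="out",
                        help="Output directory, default = out (optional)")
    parser.add_argument("-d", dest="debug_logger",
                        help="File for debug logger (optional)")
    # --sc takes no value: it is False unless the flag is present.
    parser.add_argument("--sc", action="store_true",
                        help="Turn on if data is single-cell (optional)")
    return parser


# Hypothetical invocation: single-cell data, default output directory.
args = make_parser().parse_args(["-s", "project_name/saves", "--sc"])
print("%s %s %s" % (args.saves_dir, args.out_dir, args.sc))
```

Under these assumptions, omitting -o leaves the output directory at its default, out, and --sc is off unless given explicitly.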

 

Attachment                            Size
rectangles-2.0.zip (January 9, 2013)  31.68 KB