Wednesday, 12 January 2022

SLiMSuite release v1.11.0 (2022-01-12)

SLiMSuite v1.11 sees the introduction of six genome assembly tools:

  • DepthCharge = Genome assembly quality control and misassembly repair. DepthCharge uses mapped long-read depth of coverage to charge through a genome assembly and identify coverage “cliffs” that may indicate a misassembly. If appropriate, it will then blast the assembly into fragments at those misassemblies.
  • DepthKopy = DepthKopy: Read-depth based copy number estimation. DepthKopy applies the same single-copy read depth estimate as DepthSizer to estimate the copy number of different gene regions in a slightly modified version of the approach used in the basenji genome paper.
  • DepthSizer = DepthSizer: Read-depth based genome size prediction. DepthSizer uses long-read depth profiles and BUSCO single-copy orthologues to predict genome size. DepthSizer works on the principle that Complete BUSCO genes should represent predominantly single-copy (diploid read depth) regions, along with some poor-quality and/or repeat regions. Assembly artefacts, collapsed repeats etc. are predicted to deviate from diploid read depth in an inconsistent manner. Therefore, even if less than half of the region actually has diploid coverage, the modal read depth is expected to represent the actual single-copy read depth.
  • GapSpanner = GapSpanner: Genome assembly gap long read support and reassembly tool. GapSpanner uses (or generates) a BAM file of long reads mapped to a genome assembly to assess assembly “gaps” for spanning read support. Optionally, reads spanning each gap can be extracted and re-assembled with Flye. If the new assembly spans the gap, crude gap-filling can be performed. This will be reversed if edits are not subsequently supported by spanning reads mapped onto the updated assembly.
  • NUMTFinder = NUMTFinder: Nuclear mitochondrial fragment (NUMT) search tool. NUMTFinder uses a mitochondrial genome to search against a genome assembly and identify putative NUMTs. NUMT fragments are then combined into NUMT blocks based on proximity.
  • Taxolotl = Taxolotl: Genome assembly taxonomy summary and assessment tool. Taxolotl combines the MMseqs2 easy-taxonomy workflow with GFF parsing to perform taxonomic analysis of a genome assembly (and any subsets given by taxsubsets=LIST) using an annotated proteome. Taxonomic assignments are mapped onto genes as well as assembly scaffolds and (if assembly=FILE is given) contigs.
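
The read-depth principle shared by DepthSizer and DepthKopy can be illustrated with a toy calculation. The sketch below is not code from either tool: the function names, the simple integer mode, the use of mean region depth and the example numbers are all assumptions made for illustration, and the real programs may calculate the mode and adjust the read volume quite differently. It only shows the idea that the modal depth across BUSCO regions stands in for single-copy depth, which then scales total read bases into a genome size and region depth into a copy number.

```python
from collections import Counter


def modal_depth(depths):
    """Return the most common per-base depth (smallest value on ties).

    Hypothetical helper: a plain integer mode, used here as a stand-in
    for the single-copy read depth estimate.
    """
    counts = Counter(depths)
    top = max(counts.values())
    return min(d for d, n in counts.items() if n == top)


def estimate_genome_size(total_read_bases, single_copy_depth):
    """Genome size ~ total sequenced bases / single-copy depth (assumed formula)."""
    return total_read_bases / single_copy_depth


def estimate_copy_number(region_depths, single_copy_depth):
    """Copy number ~ mean region depth / single-copy depth (assumed formula)."""
    mean_depth = sum(region_depths) / len(region_depths)
    return mean_depth / single_copy_depth


# Toy per-base depths across BUSCO regions: mostly single copy (about 30X),
# with one collapsed-repeat position (60X) and one low-quality position (15X).
busco_depths = [30, 31, 30, 29, 30, 60, 30, 15, 30, 31]

sc_depth = modal_depth(busco_depths)
genome_size = estimate_genome_size(3_000_000, sc_depth)
copy_number = estimate_copy_number([60, 62, 58, 60], sc_depth)

print(sc_depth, genome_size, copy_number)
```

In this toy input the outlier positions do not shift the mode, which is the reason the description above prefers modal depth over a mean when some regions are collapsed or of poor quality.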

Documentation for these tools can be found in their individual repos. Please note that individual repos may be ahead of the main SLiMSuite repo.

More information can also be found in the corresponding publications.

See also the included release_notes.txt on GitHub for a full list of the Python module updates since v1.9.0.