Sunday, 27 December 2020

SLiMSuite release v1.9.1 (2020-12-27)

SLiMSuite release v1.9.1 (2020-12-27) is now on GitHub and Zenodo:

SLiMSuite v1.9 sees the introduction of four genome assembly tools:

  • Diploidocus = Diploid genome assembly analysis toolkit. Includes assembly cleanup (haplotig/artefact removal), genome size prediction and read depth copy number analysis.
  • PAFScaff = Pairwise mApping Format reference-based scaffold anchoring and super-scaffolding. Uses minimap2 to map a genome assembly onto reference chromosomes.
  • SAAGA = Summarise, Annotate & Assess Genome Annotations. Uses a reference proteome to summarise and assess genome annotations.
  • SynBad = Synteny-based scaffolding adjustment tool for comparing two related genome assemblies and identifying putative translocations and inversions between the two that correspond to gap positions. (Development only.)
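
PAFScaff works from minimap2 output in PAF. As a rough illustration of that input, not of PAFScaff's own code, the sketch below parses the 12 mandatory PAF columns from one line. The `PafHit` type, the `parse_paf_line` helper and the example record are all hypothetical.

```python
from typing import NamedTuple


class PafHit(NamedTuple):
    """The 12 mandatory columns of a PAF record (hypothetical helper type)."""
    qname: str
    qlen: int
    qstart: int   # 0-based start on the query
    qend: int     # end on the query (exclusive)
    strand: str   # "+" or "-"
    tname: str
    tlen: int
    tstart: int
    tend: int
    matches: int  # number of matching bases
    alnlen: int   # alignment block length
    mapq: int     # mapping quality


def parse_paf_line(line: str) -> PafHit:
    """Parse one tab-separated PAF line, ignoring any optional tags."""
    f = line.rstrip("\n").split("\t")
    if len(f) < 12:
        raise ValueError("PAF lines need at least 12 columns")
    return PafHit(f[0], int(f[1]), int(f[2]), int(f[3]), f[4],
                  f[5], int(f[6]), int(f[7]), int(f[8]),
                  int(f[9]), int(f[10]), int(f[11]))


# Made-up example: a 5 kb scaffold mapped onto a reference chromosome.
example = "scaf1\t5000\t0\t4800\t+\tchr1\t100000\t20000\t24800\t4700\t4800\t60\ttp:A:P"
hit = parse_paf_line(example)
identity = hit.matches / hit.alnlen  # fraction of the alignment block that matches
```

The target name and coordinates of each hit are the kind of information needed to anchor a scaffold to a reference chromosome.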

There have also been significant updates to:

  • BUSCOMP = BUSCO Compiler and Comparison tool. Used for genome assembly completeness estimates that are robust to sequence quality, and for compiling BUSCO results.

Other changes include some initial reformatting for Python3 compatibility. This is ongoing work; please report any odd behaviour.

See the included release_notes.txt for a full list of the Python module updates since v1.8.1.

NOTE: At time of posting, the REST servers have not yet been updated with the latest version. This will happen soon.

Monday, 27 May 2019

SLiMSuite release v1.8.1 (2019-05-27)

SLiMSuite release v1.8.1 (2019-05-27) is now on GitHub and Zenodo:

This update has fast-forwarded the SLiMSuite release to v1.8.1 to be consistent with the tools/slimsuite.py wrapper script. A top level SLiMSuite.py file can now be run to access the main tools and functions of the package. The REST servers have also been updated to run this version of the code.

This release of SLiMSuite contains a number of updates related to the REST servers and some new tools, notably the SAMPhaser long-read diploid phasing algorithm and the BUSCOMP BUSCO compiler and comparison tool. See release notes (below) for more details.

SLiMSuite updates

Updates in extras/:

• rje_pydocs: Updated from Version 2.16.7.
→ Version 2.16.8: Updated to parse https.
→ Version 2.16.9: Tweaked docstring parsing.

Updates in libraries/:

• rje: Updated from Version 4.19.0.
→ Version 4.19.1: Added code for catching non-ASCII log filenames.
→ Version 4.20.0: Added quiet mode to log object and output of errors to stderr. Fixed rankList(unique=True)
→ Version 4.21.0: Added hashlib MD5 functions.
→ Version 4.21.1: Fixed bug where silent=T wasn't running silent.
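
The names of the hashing functions added to rje are not listed here, so the following is only a sketch of the underlying standard-library call. `md5_hex` is a hypothetical helper, not part of the rje API.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 hex digest of a string (hypothetical helper, not the rje API)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Identical sequences give identical digests, so a digest can act as a compact key.
seq_key = md5_hex("MKTAYIAKQR")
```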

• rje_blast_V2: Updated from Version 2.22.2.
→ Version 2.23.3: Fixed LocalIDCut error for GABLAM and QAssemble stat filtering.

• rje_db: Updated from Version 1.9.0.
→ Version 1.9.1: Updated logging of adding/removing fields: default is now when debugging only.

• rje_disorder: Updated from Version 1.2.0.
→ Version 1.3.0: Switched default behaviour to be md5acc=T.
→ Version 1.4.0: Fixed up disorder=parse and disorder=foldindex.
→ Version 1.5.0: Added iupred2 and anchor2 parsing from URL using accnum. Made default disorder=iushort2.

• rje_genbank: Updated from Version 1.5.3.
→ Version 1.5.4: Added recognition of *.gbff for genbank files.

• rje_obj: Updated from Version 2.2.2.
→ Version 2.3.0: Added quiet mode to object and stderr output.
→ Version 2.4.0: Added vLog() and bugLog() methods.
→ Version 2.4.1: Fixed bug where silent=T wasn't running silent.

• rje_paf: Created/Renamed/moved.
→ Version 0.0.0: Initial Compilation.
→ Version 0.1.0: Initial working version. Compatible with GABLAM v2.30.0 and Snapper v1.7.0.
→ Version 0.2.0: Added endextend=X : Extend minimap2 hits to end of sequence if within X bp [10]
→ Version 0.3.0: Added mapsplice mode for dealing with transcript mapping.
→ Version 0.3.1: Correct PAF splicing bug.
→ Version 0.4.0: Added TmpDir and forking for GABLAM conversion.
→ Version 0.5.0: Added uniquehit=T/F : Option to use *.hitunique.tdt table of unique coverage for GABLAM coverage stats [False]

• rje_ppi: Updated from Version 2.8.1.
→ Version 2.9.0: Added ppiout=FILE : Save pairwise PPI file following processing (if rest=None) [None]

• rje_qsub: Updated from Version 1.9.2.
→ Version 1.9.3: Updates the order of the qsub -S /bin/bash flag.

• rje_rmd: Created/Renamed/moved.
→ Version 0.0.0: Initial Compilation.

• rje_samtools: Updated from Version 1.20.0.
→ Version 1.20.1: Fixed mlen bug. Added catching of unmapped reads in SAM file. Fixed RLen bug. Changed softclip defaults.
→ Version 1.20.2: Fixed readlen coverage bug and acut bug.

• rje_seq: Updated from Version 3.25.0.
→ Version 3.25.1: Fixed -long_seqids retrieval bug.
→ Version 3.25.2: Fixed 9spec filtering bug.

• rje_seqlist: Updated from Version 1.29.0.
→ Version 1.30.0: Updated and improved DNA2Protein.
→ Version 1.31.0: Added genecounter to rename option for use with other programs, e.g. PAGSAT.
→ Version 1.31.1: Fixed edit bug when not in DNA mode.
→ Version 1.32.0: Added genomesize and NG50/LG50 to DNA summarise.
→ Version 1.32.1: Fixed LG50/L50 bug.
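
For readers unfamiliar with these statistics, here is a minimal sketch using the usual definitions, not the rje_seqlist code itself. N50 is the length of the sequence at which the running total of lengths, sorted longest first, reaches half the assembly size, and L50 is the number of sequences needed to get there. NG50 and LG50 are the same but use half the genome size as the target. The `nx50` function and the example lengths are hypothetical.

```python
from typing import List, Optional, Tuple


def nx50(lengths: List[int], genome_size: Optional[int] = None) -> Tuple[int, int]:
    """Return (N50, L50), or (NG50, LG50) when genome_size is given.

    Returns (0, 0) if the sequences never reach half of the target size.
    """
    target = (genome_size if genome_size is not None else sum(lengths)) / 2.0
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= target:
            return length, count
    return 0, 0


contigs = [100, 80, 60, 40, 20]               # made-up lengths, total 300
n50, l50 = nx50(contigs)                      # target is 150
ng50, lg50 = nx50(contigs, genome_size=400)   # target is 200
```

With these lengths the assembly-based values are N50 = 80 and L50 = 2, while a genome size of 400 gives NG50 = 60 and LG50 = 3.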

• rje_sequence: Updated from Version 2.6.0.
→ Version 2.7.0: Added shift=X to maskRegion() for 1-L input. Fixed cterminal maskRegion.

• rje_slimcore: Updated from Version 2.9.0.
→ Version 2.10.0: Added seqfilter=T/F : Whether to apply sequence filtering options (goodX, badX etc.) to input [False]
→ Version 2.10.1: Fixed default results file bug.
→ Version 2.10.2: Improved handling and REST output of disorder scores.
→ Version 2.11.0: Modified qregion=X,Y to be 1-L numbering.
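
As a small illustration of 1-L numbering, the sketch below extracts a region from a sequence whose positions are numbered from 1 to L (the sequence length). It assumes both ends are inclusive and does not cover any special values that qregion may accept. The `region_1_to_l` helper is hypothetical.

```python
def region_1_to_l(sequence: str, start: int, end: int) -> str:
    """Extract residues start..end, numbered 1 to L, both ends inclusive (assumed)."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("region must satisfy 1 <= start <= end <= L")
    # Python slices are 0-based and half-open, so only the start needs shifting.
    return sequence[start - 1:end]


peptide = "MKTAYIAKQR"
sub = region_1_to_l(peptide, 2, 4)  # residues 2, 3 and 4
```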

• rje_slimlist: Updated from Version 1.7.3.
→ Version 1.7.4: Modified concatenation of SLiMSuite results to use "|" in place of "#" for better compatibility.

• rje_uniprot: Updated from Version 3.25.0.
→ Version 3.25.1: Fixed proteome download bug following Uniprot changes.
→ Version 3.25.2: Fixed Uniprot protein extraction issues by using curl. (May not be a robust fix!)

Updates in tools/:

• buscomp: Created/Renamed/moved.
→ Version 0.0.0: Initial Compilation.
→ Version 0.1.0: Basic working version.
→ Version 0.2.0: Functional version with basic RMarkdown HTML output.
→ Version 0.3.0: Added ratefas=FILELIST: Additional fasta files of assemblies to rate with BUSCOMPSeq (No BUSCO run) [].
→ Version 0.4.0: Implemented forking and tidied up output a little.
→ Version 0.5.0: Updated genome stats and RMarkdown HTML output. Reorganised assembly loading and processing. Added menus.
→ Version 0.5.1: Reorganised code for clearer flow and documentation. Unique and missing BUSCO output added.
→ Version 0.5.2: Dropped paircomp method and added Rmarkdown control methods. Updated Rmarkdown descriptions. Updated log output.
→ Version 0.5.3: Tweaked log output and fixed a few minor bugs.
→ Version 0.5.4: Deleted some excess code and tweaked BUSCO percentage plot outputs.
→ Version 0.5.5: Fixed minlocid bug and cleared up minimap temp directories. Added LnnIDxx to BUSCOMP outputs.
→ Version 0.5.6: Added uniquehit=T/F : Option to use *.hitunique.tdt table of unique coverage for GABLAM coverage stats [False]
→ Version 0.6.0: Added more minimap options, changed defaults and dev generation of a table of changes in ratings from BUSCO to BUSCOMP.
→ Version 0.6.1: Fixed bug that was including Duplicated sequences in the buscomp.fasta file. Added option to exclude from BUSCOMPSeq compilation.
→ Version 0.6.2: Fixed bug introduced that had broken manual group review/editing.
→ Version 0.7.0: Updated the defaults in the light of test analyses. Tweaked Rmd report.
→ Version 0.7.1: Fixed unique group count bug when some genomes are not in a group. Fixed running with non-standard options.
→ Version 0.7.2: Added loadsummary=T/F option to regenerate summaries and fixed bugs running without BUSCO results.

• comparimotif_V3: Updated from Version 3.13.0.
→ Version 3.14.0: Modified memsaver mode to take different input formats.

• gablam: Updated from Version 2.29.0.
→ Version 2.30.0: Added mapper=X : Program to use for mapping files against each other (blast/minimap) [blast]
→ Version 2.30.1: Fixed BLAST LocalIDCut error for GABLAM and QAssemble stat filtering.

• gopher: Updated from Version 3.4.3.
→ Version 3.5.0: Added separate outputs for trees with different alignment programs.
→ Version 3.5.1: Added capacity to run DNA GOPHER with tblastx. (Not tested!)
→ Version 3.5.2: Added acc=LIST as alias for uniprotid=LIST and updated docstring for REST to make it clear that rest=X needed.

• haqesac: Updated from Version 1.12.0.
→ Version 1.13.0: Modified qregion=X,Y to be 1-L numbering.

• pagsat: Updated from Version 2.4.0.
→ Version 2.5.0: Reduced the executed code when mapfas=T assessment=F. (Recommended first run.) Added renaming.
→ Version 2.5.1: Added recognition of *.gbff for genbank files.
→ Version 2.6.0: Added mapper=X : Program to use for mapping files against each other (blast/minimap) [blast]
→ Version 2.6.1: Switch failure to find key report files to a long warning, not program exit.
→ Version 2.6.2: Fixed bugs with mapper=minimap mode and started adding more internal documentation.
→ Version 2.6.3: Fixed default behaviour to run report=T mode.
→ Version 2.6.4: Fixed summary table merge bug.
→ Version 2.6.5: Fixed compile path bug.
→ Version 2.6.6: Fixed BLAST LocalIDCut error for GABLAM and QAssemble stat filtering.
→ Version 2.6.7: Generalised compile path bug fix.
→ Version 2.6.8: Added ChromXcov fields to PAGSAT Compare.

• pingu_V4: Updated from Version 4.9.0.
→ Version 4.9.1: Fixed Pairwise parsing and filtering for more flexibility of input. Fixed fasid=X bug and ppiseqfile names.
→ Version 4.10.0: Added hubfield and spokefield options for parsing hublist.

• qslimfinder: Updated from Version 2.2.0.
→ Version 2.3.0: Modified qregion=X,Y to be 1-L numbering.

• samphaser: Created/Renamed/moved.
→ Version 0.0.0: Initial Compilation.
→ Version 0.1.0: Updated SAMPhaser to be more memory efficient.
→ Version 0.2.0: Added reading of sequence and generation of SNP-altered haplotype blocks.
→ Version 0.2.1: Fixed bug in which zero-phasing sequences were being excluded from blocks output.
→ Version 0.3.0: Made a new unzip process.
→ Version 0.4.0: Added RGraphics for unzip.
→ Version 0.4.1: Fixed MeanX bug in devUnzip.
→ Version 0.4.2: Made phaseindels=F by default: mononucleotide indel errors will probably add phasing noise. Fixed basefile R bug.
→ Version 0.4.3: Fixed bug introduced by adding depthplot code. Fixed phaseindels bug. (Wasn't working!)
→ Version 0.4.4: Modified mincut=X to adjust for samtools V1.12.0.
→ Version 0.4.5: Updated for modified RJE_SAMTools output.
→ Version 0.4.6: splitzero=X : Whether to split haplotigs at zero-coverage regions of X+ bp (-1 = no split) [100]
→ Version 0.5.0: snptable=T/F : Output filtered alleles to SNP Table [False]
→ Version 0.6.0: Converted haplotig naming to be consistent for PAGSAT generation. Updated for rje_samtools v1.21.1.
→ Version 0.7.0: Added skiploci=LIST and phaseloci=LIST : Optional list of loci to skip phasing []
→ Version 0.8.0: poordepth=T/F : Whether to include reads with poor track probability in haplotig depth plots (random track) [False]
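
The idea behind splitzero=X can be sketched as follows. This is an illustration only, not the SAMPhaser implementation: it takes a per-base depth list and returns the covered blocks that remain after splitting at zero-depth runs of at least `min_gap` positions. The function name, the depth values and the treatment of any value below 1 as "no split" (mirroring -1 = no split) are assumptions.

```python
from typing import List, Tuple


def split_at_zero_coverage(depth: List[int], min_gap: int) -> List[Tuple[int, int]]:
    """Return 0-based half-open (start, end) blocks separated by zero-depth runs
    of at least min_gap positions. Shorter zero runs stay inside a block."""
    blocks = []
    block_start = None   # start of the current covered block
    last_covered = None  # index of the most recent position with depth > 0
    for i, d in enumerate(depth):
        if d > 0:
            if block_start is None:
                block_start = i
            elif min_gap >= 1 and i - last_covered - 1 >= min_gap:
                # The zero run just passed is long enough: close the block and start anew.
                blocks.append((block_start, last_covered + 1))
                block_start = i
            last_covered = i
    if block_start is not None:
        blocks.append((block_start, last_covered + 1))
    return blocks


depth = [3, 4, 0, 5, 0, 0, 0, 2, 2]   # made-up per-base read depth
blocks = split_at_zero_coverage(depth, min_gap=3)
```

Here the single zero at position 2 is tolerated, while the run of three zeros splits the sequence into two blocks.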

• seqmapper: Updated from Version 2.2.0.
→ Version 2.3.0: Added GABLAM-free method.

• seqsuite: Updated from Version 1.19.1.
→ Version 1.20.0: Added rje_paf.PAF.
→ Version 1.21.0: Added NG50 and LG50 to batch summarise.
→ Version 1.22.0: Added BUSCOMP to programs.
→ Version 1.23.0: Added rje_ppi.PPI to programs.

• slimbench: Updated from Version 2.18.2.
→ Version 2.18.3: Added better handling of motifs without TP occurrences for OccBench. Added minocctp=INT.
→ Version 2.18.4: Fixed ELMBench rating bug.
→ Version 2.18.5: Fixed Balanced=F bug.
→ Version 2.19.0: Implemented dataset=LIST: List of headers to split dataset into. If blank, will use datatype defaults. []

• slimfarmer: Updated from Version 1.9.0.
→ Version 1.10.0: Added appending contents of jobini file to slimsuite=F farm commands.

• slimfinder: Updated from Version 5.3.4.
→ Version 5.3.5: Fixed slimcheck and advanced stats models bug.
→ Version 5.4.0: Modified qregion=X,Y to be 1-L numbering.

• slimparser: Updated from Version 0.5.0.
→ Version 0.5.1: Minor docs and bug fixes.
→ Version 0.6.0: Improved functionality as replacement for pureapi with rest=jobid and rest=check functions.

• slimsuite: Updated from Version 1.7.1.
→ Version 1.8.0: Added BUSCOMP and basic test function.
→ Version 1.8.1: Updated documentation and added IUPred2. General tidy up and new example data for protocols paper.

• smrtscape: Updated from Version 2.2.2.
→ Version 2.2.3: Fixed bug where SMRT subreads are not returned by seqlist in correct order. Fixed RQ=0 bug.

• snapper: Updated from Version 1.6.1.
→ Version 1.7.0: Added mapper=minimap setting, compatible with GABLAM v2.30.0 and rje_paf v0.1.0.


© RJ Edwards 2019. Last modified 27 May 2019.

Monday, 2 July 2018

SLiMSuite Downloads

UPDATE: Please see the Downloads page for the most recent release.



The current SLiMSuite release is v1.4.0 (2018-07-02) and can be downloaded by clicking the button (left).

In addition to the tarball available via the links above, SLiMSuite is available as a GitHub repository (right).

See also: Installation and Setup.

Previous Releases

SLiMSuite release v1.4.0 (2018-07-02) now online

SLiMSuite release v1.4.0 (2018-07-02) is now on GitHub. The REST servers have also been updated to run this version of the code.

This release of SLiMSuite contains a number of updates related to the REST servers and some new pre-release dev tools in the main repo (but not the *.tgz file).

SeqList has updated sequence summary statistics and grep-based redundancy removal for large genomes.

One major bug fix is a change to parsing Uniprot entries from the website following a change in behaviour of the API.

SLiMSuite updates

Updates in extras/:

• rje_pydocs: Updated from Version 2.16.3.
→ Version 2.16.4: Tweaked formatDocString.
→ Version 2.16.5: Added general commands to docstring HTML for REST servers.
→ Version 2.16.6: Modified parsing to keep DocString for SPyDarm runs.
→ Version 2.16.7: Fixed T/F/FILE option type parsing bug.

Updates in libraries/:

• rje_blast_V2: Updated from Version 2.18.0.
→ Version 2.19.0: Added blastgz=T/F : Whether to zip and unzip BLAST results files [False]
→ Version 2.19.1: Fixed erroneous i=-1 blastprog over-ride but not sure why it was happening.
→ Version 2.20.0: Added localGFF output
→ Version 2.21.0: Added blasttask=X setting for BLAST -task ['megablast']
→ Version 2.22.0: Added dust filter for blastn and setting blastprog based on blasttask
→ Version 2.22.1: Added trimLocal error catching for exonerate issues.
→ Version 2.22.2: Fixed GFF attribute case issue.

• rje_db: Updated from Version 1.8.6.
→ Version 1.9.0: Added comment output to saveToFile().

• rje_disorder: Updated from Version 0.8.
→ Version 1.0.0: Added random disorder function and elevated to v1.x as fully functional for SLiMSuite
→ Version 1.1.0: Added strict option for disorder method selection. Added minorder=X.
→ Version 1.2.0: Added saving and loading scores to IUScoreDir/.

• rje_gff: Created/Renamed/moved.
→ Version 0.0.0: Initial Compilation.
→ Version 0.1.0: Basic functional version.

• rje_hpc: Updated from Version 1.1.
→ Version 1.1.1: Added output of subjob command to log as run.

• rje_html: Updated from Version 0.2.1.
→ Version 0.3.0: Added optional loading of javascript files and stupidtable.js?dev default.

• rje_qsub: Updated from Version 1.9.1.
→ Version 1.9.2: Modified qsub() to return job ID.

• rje_samtools: Updated from Version 1.19.2.
→ Version 1.20.0: Added parsing of BAM file - needs samtools on system. Added minsoftclip=X, maxsoftclip=X and minreadlen=X.

• rje_seq: Updated from Version 3.24.0.
→ Version 3.25.0: 9spec=T/F : Whether to treat 9XXXX species codes as actual species (generally higher taxa) [False]

• rje_seqlist: Updated from Version 1.25.0.
→ Version 1.26.0: Updated sequence statistics and fixed N50 underestimation bug.
→ Version 1.26.1: Fixed median length overestimation bug.
→ Version 1.26.2: Fixed sizesort bug. (Now big to small as advertised.)
→ Version 1.27.0: Added grepNR() method (dev only). Switched default to RevCompNR=T.
→ Version 1.28.0: Fixed second pass NR naming bug and added option to switch off altogether.
→ Version 1.29.0: Added maker=T/F : Whether to extract MAKER2 statistics (AED, eAED, QI) from sequence names [False]

• rje_slimcalc: Updated from Version 0.9.3.
→ Version 0.10.0: Added extra disorder methods to slimcalc.

• rje_taxonomy: Updated from Version 1.2.0.
→ Version 1.3.0: taxtable=T/F : Whether to output results in a table rather than text lists [False]

• rje_tree: Updated from Version 2.15.0.
→ Version 2.16.0: 9spec=T/F : Whether to treat 9XXXX species codes as actual species (generally higher taxa) [False]
→ Version 2.16.1: Modified NSF reading to cope with extra information beyond the ";".

• rje_uniprot: Updated from Version 3.24.1.
→ Version 3.24.2: Updated HTTP to HTTPS. Having some download issues with server failures.
→ Version 3.25.0: Fixed new Uniprot batch query URL. Added onebyone=T/F : Whether to download one entry at a time. Slower but should maintain order [False].

• rje_zen: Updated from Version 1.3.2.
→ Version 1.4.0: Added some more words and "They fight crime!" structure.

Updates in tools/:

• gablam: Updated from Version 2.28.3.
→ Version 2.29.0: Added localGFF=T/F output

• gasp: Updated from Version 1.4.
→ Version 2.0.0: Upgraded to rje_obj framework for REST server.

• gasp_V1: Created/Renamed/moved.
→ Version 0.0: Initial Compilation.
→ Version 1.0: Improved version with second pass.
→ Version 1.1: Improved OO. Restriction to descendant AAs. (Good for BAD etc.)
→ Version 1.2: No Out Object in Objects
→ Version 1.3: Added more interactive load options
→ Version 1.4: Minor tweaks to imports.

• gopher: Updated from Version 3.4.2.
→ Version 3.4.3: Added checking and warning if no bootstraps for orthtree.

• haqesac: Updated from Version 1.11.0.
→ Version 1.12.0: 9spec=T/F : Whether to treat 9XXXX species codes as actual species (generally higher taxa) [False]

• multihaq: Updated from Version 1.3.0.
→ Version 1.4.0: Added SLiMFarmer batch forking if autoskip=F and i=-1.
→ Version 1.4.1: Added haqblastdir=PATH: Directory in which MultiHAQ BLAST2FAS BLAST runs will be performed [./HAQBLAST/]

• pagsat: Updated from Version 2.3.3.
→ Version 2.3.4: Fixed full.fas request bug.
→ Version 2.4.0: Added PAGSAT compile mode to generate comparisons of reference chromosomes across assemblies.

• seqsuite: Updated from Version 1.14.0.
→ Version 1.14.1: Added zentest for testing the REST servers.
→ Version 1.15.0: Added GASP to REST servers.
→ Version 1.16.0: Add rje_gff.GFF to REST servers.
→ Version 1.17.0: Added batch summarise mode.
→ Version 1.18.0: Added rje_apollo.Apollo to REST servers.
→ Version 1.19.0: Tweaked the output of batch summarise, adding Gap% and reducing dp for some fields.
→ Version 1.19.1: Fixed GapPC summarise output to be a percentage, not a fraction.

• slimbench: Updated from Version 2.14.0.
→ Version 2.14.1: Fixed up PPIBench results loading.
→ Version 2.14.2: Fixed ByCloud bug.
→ Version 2.15.0: Updated assessSearchMemSaver() to handle different data types properly. dombench not yet supported.
→ Version 2.16.0: Added ppi hub/slim summary and motif filter for assessment datasets post-rating (still count as OT)
→ Version 2.16.1: Bug-fixing PPI generation from pairwise PPI files.
→ Version 2.16.2: Fixed benchmarking setup bug.
→ Version 2.16.3: Fixed bug when Hub-PPI links fail during PPI Benchmarking.
→ Version 2.17.0: Added output of missing datasets when balanced=T.
→ Version 2.18.0: Added dev OccBench with improved ratings and more efficient results handling. (dev only)
→ Version 2.18.1: Added additional OccBench options (bymotif, occsource, occspec)
→ Version 2.18.2: Fixed problem with source file selection ignoring i=-1.

• slimfarmer: Updated from Version 1.7.0.
→ Version 1.8.0: jobforks=X : Number of forks to pass to farmed out run if >0 [0]
→ Version 1.9.0: daisychain=X : Chain together a set of qsub runs of the same call that depend on the previous job.

• slimfinder: Updated from Version 5.3.3.
→ Version 5.3.4: Fixed terminal (^/$) musthave bug.

• slimsuite: Updated from Version 1.7.0.
→ Version 1.7.1: Added error raising for protected REST alias data.

• smrtscape: Updated from Version 2.2.1.
→ Version 2.2.2: Added dna=T to all SeqList object generation.

• snapper: Updated from Version 1.6.0.
→ Version 1.6.1: Fixed bug for reducing to unique-unique pairings that was over-filtering.


© RJ Edwards 2018. Last modified 2 Jul 2018.

Tuesday, 16 January 2018

SLiMSuite REST server is back up

The REST server is back up. The development server is currently having an upgrade and should not be used.

SLiMSuite REST server is currently down

The SLiMSuite REST server is experiencing some technical difficulties at the moment. It will hopefully be back up soon.