SLiMSuite & SeqSuite sequence analysis tools: January 2015

One of the changes in the last release was the introduction of three-component X.Y.Z version numbers in place of the old X.Y numbers. These are slowly being rolled out across all the modules in an effort to approach proper semantic versioning for SLiMSuite. (Due to the somewhat organic nature of its development, it may never reach full semantic versioning.)

From release 2015-01-07 onwards, therefore, version number changes should indicate the nature of the change following MAJOR.MINOR.PATCH version numbering:

MAJOR version increments when a backwards-incompatible change is made. Typically a major change to input/output or core module/class structure.
MINOR version increments when functionality is added in a backwards-compatible manner.
PATCH version increments when bugs are fixed or minor functionality added in a backwards-compatible manner.

Under the old MAJOR.MINOR version numbers, PATCH changes were treated as MINOR changes.
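
The scheme above can be illustrated with a short Python sketch. This is not part of SLiMSuite: the function names parse_version and change_type are hypothetical, and treating an old X.Y number as X.Y.0 is an assumption made purely so that old and new numbers can be compared.

```python
def parse_version(text):
    """Return (major, minor, patch) from 'X.Y' or 'X.Y.Z'.

    Assumption: an old-style X.Y number is read as X.Y.0.
    """
    parts = [int(p) for p in text.strip().split(".")]
    if len(parts) not in (2, 3):
        raise ValueError("Expected X.Y or X.Y.Z, got %r" % text)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def change_type(old, new):
    """Name the highest-level component that differs between two versions."""
    old_v, new_v = parse_version(old), parse_version(new)
    for name, a, b in zip(("MAJOR", "MINOR", "PATCH"), old_v, new_v):
        if a != b:
            return name
    return "NONE"


print(change_type("2.14", "2.15.0"))    # MINOR
print(change_type("2.16.0", "2.16.1"))  # PATCH
```

Comparing the parsed tuples rather than the raw strings also keeps the ordering correct when a component reaches two digits, e.g. 1.10.0 sorts after 1.9.0.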

Regrettably, due to the modular structure of SLiMSuite, the main program modules will not always have MINOR and PATCH version increments when the underlying modules are changed. The plan is to make sure that the main SLiMSuite and SeqSuite modules do increment with a new release to reflect changes. In the meantime, please contact the author if you have any questions or encounter unexpected behaviour.

A new download of SLiMSuite (release 2015-01-07) is now available at both UK (U. Southampton) and Australia (UNSW) sites (svn r613).

Many of the changes are under the hood, in preparation for a new set of REST services, which will be coming soon. The new download also features two new programs in the tools/ folder, which will hopefully simplify running many of the programs. The core programs and several of the key accessory programs (e.g. rje_seq and rje_uniprot) can now be run using the main SLiMSuite program:

python tools/slimsuite.py -prog X

where X is one of the SLiMSuite or SeqSuite programs. To see which are currently supported, run with -help. Simply add additional commandline options for the chosen program (and/or use ini files) as normal. For program-specific help, run with help=T: this will give the help documentation for program X rather than SLiMSuite. (NB. SLiMSuite can be used to access both SLiMSuite and SeqSuite programs. There is also a seqsuite.py that can be used to access just the SeqSuite programs and accessories.)

The other major update is that SLiMSuite programs (SLiMProb, SLiMFinder, QSLiMFinder and SLiMCore) can now take lists of Uniprot accession numbers as alternative input, using uniprotid=LIST in place of seqin=FILE. Provided there is an open internet connection, the relevant proteins will be downloaded from the Uniprot server for analysis.
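
For scripted runs, the invocation described above can be assembled in Python. The following is a minimal sketch rather than anything shipped with SLiMSuite: build_command is a hypothetical helper, the lowercase program name slimprob and the comma-separated form of LIST are assumptions (run with -help to check which program names are supported), and the two accession numbers are only illustrative.

```python
def build_command(prog, options=(), script="tools/slimsuite.py", python="python"):
    """Return the argument list for running program `prog` through slimsuite.py.

    `options` holds the usual option=value arguments for the chosen program.
    """
    return [python, script, "-prog", prog] + list(options)


# Illustrative only: program name and accession numbers are assumptions.
cmd = build_command("slimprob", ["uniprotid=P04637,P38398"])
print(" ".join(cmd))
# python tools/slimsuite.py -prog slimprob uniprotid=P04637,P38398
```

The resulting list can be passed directly to subprocess.call() from the SLiMSuite download directory. Building a list instead of a single string avoids shell quoting problems when options contain spaces.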

GABLAM has also benefited from the addition of a new fullblast=T mode, which will perform the full all-versus-all BLAST+ search prior to GABLAM processing. Depending on your machine setup, this can be faster than the current method, which forks out a single sequence at a time and is more IO-intensive as a result. The GABLAM functions that use existing BLAST+ results have also been fixed and tidied a little. (If re-running might be required, keepblast=T can retain the full BLAST results file to accelerate subsequent runs.)

Updates since last release:

• fiesta: Updated from Version 1.8.
→ Version 1.8.1: Replaced type with stype throughout to try and avoid TypeError crashes.
→ Version 1.9.0: Altered HAQDB to be a list of files rather than just one.

• gablam: Updated from Version 2.14.
→ Version 2.15.0: Added seqnr function. Add run() method.
→ Version 2.16.0: Added fullblast=T/F : Whether to perform full BLAST followed by blastres analysis [False]
→ Version 2.16.1: Fixed a bug where the fullblast option was failing to return scores and evalues.

• multihaq: Updated from Version 1.1.
→ Version 1.2: Changed defaults to autoskip=F.

• pingu_V4: Updated from Version 4.2.
→ Version 4.3: Modified to use Pfam as hub field for DomPPI generation. Modified naming of PPI output after ppisource.

• seqsuite: Created/Renamed.
→ Version 0.0: Initial Compilation.
→ Version 0.1: Added rje_seq and FIESTA. Added Uniprot.
→ Version 1.0: Moved to tools/ for general release. Added HAQESAC and MultiHAQ. Moved mod to enable easy external access.
→ Version 1.1: Added XRef = rje_xref.XRef. Identifier cross-referencing module.
→ Version 1.2: Added taxonomy.
→ Version 1.3.0: Added rje_zen.Zen. Modified code to work with REST services.
→ Version 1.4.0: Added rje_tree.Tree, GABLAM and GOPHER.

• slimbench: Updated from Version 2.5.
→ Version 2.6: Added ELM domain interactions table: http://www.elm.eu.org/infos/browse_elm_interactiondomains.tsv.
→ Version 2.6: Fixed issues introduced with new SLiMCore V2.0 SLiMSuite code.
→ Version 2.7: Reinstate filtering. (Not sure why disabled.) Add genspec=LIST to filter by species. Added domlink=T/F.
→ Version 2.8.0: Implemented PPIBench benchmarking for datasets without Motifs in name.

• slimfarmer: Updated from Version 1.3.
→ Version 1.4: Added modules=LIST : List of modules to add in job file [clustalo,mafft]
→ Version 1.4.1: Fixed farm=batch mode for qsub=T.

• slimmaker: Updated from Version 1.1.
→ Version 1.2.0: Modified to work with REST servers

• slimmutant: Updated from Version 1.0.
→ Version 1.1: Minor tweaks to generate method to increase speed. (Make index in method.) Added splitfield=X.
→ Version 1.2: Added a batch mode for mutfiles - all other options will be kept fixed. Added maxmutant and minmutant.
→ Version 1.3: Added SLiMPPI analysis (will set analyse=T). Started basing on SLiMCore

• slimprob: Updated from Version 2.1.
→ Version 2.2.0: Added basic REST functionality.

• slimsuite: Created/Renamed.
→ Version 0.0: Initial Compilation based on SeqSuite.
→ Version 1.0: Moved to tools/ for general release. Added reading and using of SeqSuite programs.
→ Version 1.1: Added slimlist.
→ Version 1.2: Added SLiMBench.
→ Version 1.3.0: Added SLiMMaker and modified code to work with REST services.

• rje: Updated from Version 4.12.
→ Version 4.13.0: Added new built-in attributes/options for REST services.
→ Version 4.13.1: Fixed MemSaver typo in WarnLog output. Modified mkDir() to avoid clashes raising errors.

• rje_db: Updated from Version 1.5.
→ Version 1.6: Added option to save a subset of entries using saveToFile(savekeys=LIST).
→ Version 1.7.0: Added splitchar to table splitting.
→ Version 1.7.1: Reinstated raise error if expected table missing.

• rje_dismatrix_V3: Created/Renamed.
→ Version 3.0: Updated to new rje_obj.RJE_Object class.

• rje_ensembl: Updated from Version 2.13.
→ Version 2.14: Add enspep=T/F : Create full gnspacc EnsEMBL peptide datasets [False]

• rje_genbank: Added to download.
→ Version 0.0: Initial Compilation.
→ Version 0.1: Modified and Tidied output a little.
→ Version 0.2: Added details to skip and option to use different detail for protein accession number.
→ Version 0.3: Added reloading of features.
→ Version 1.0: Basic functioning version. Added fetchuid=LIST Genbank retrieval to generate seqin=FILE.
→ Version 1.1: Added use of rje_taxonomy for getting Species Code from TaxID.
→ Version 1.2: Modified to deal with genbank protein entries.
→ Version 1.2.1: Fixed feature bug that was breaking parser and removing trailing '*' from protein sequences.
→ Version 1.2.2: Fixed more features that were breaking parser.

• rje_obj: Updated from Version 2.0.
→ Version 2.1.0: Added new built-in attributes/options for REST services.

• rje_ppi: Updated from Version 2.8.
→ Version 2.8.1: Fixed bug with Spring Layout interruption message.

• rje_qsub: Updated from Version 1.5.
→ Version 1.6: Added modules=LIST : List of modules to add in job file [clustalo,mafft]
→ Version 1.6.1: Added R/3.1.1 to modules.

• rje_seq: Updated from Version 3.20.
→ Version 3.21.0: Added extraction of uniprot IDs for seqin.

• rje_seqlist: Updated from Version 1.7.
→ Version 1.8: Added sortseq=X : Whether to sort sequences prior to output (size/invsize/accnum/name/seq/species/desc) [None]
→ Version 1.9.0: Added extra functions for returning sequence AccNum, ID or Species code.
→ Version 1.10.0: Added extraction of uniprot IDs for seqin. Added more dna2prot reformatting options.

• rje_sequence: Updated from Version 2.3.
→ Version 2.4: Added recognition of modified IPI format. Added standalone low complexity masking.
→ Version 2.4.1: Moved the gnspacc fragment recognition to reduce issues. Should perhaps remove completely?

• rje_slim: Updated from Version 1.8.
→ Version 1.9: Reinstated ambcut for slimToPattern()

• rje_slimcalc: Updated from Version 0.8.
→ Version 0.9: Improvements to use of GOPHER.

• rje_slimcore: Updated from Version 2.2.
→ Version 2.3: Docstring edits. Minor tweak to walltime() to close open files.
→ Version 2.4: Added megaslimfix=T/F : Whether to run megaslim in "fix" mode to tidy/repair existing files [False]
→ Version 2.5: Added (hidden) slimmutant=T/F : Whether to ignore '.p.\D\d+\D' at end of accnum. Made default append=True.
→ Version 2.6.0: Added uniprotid=LIST : Extract IDs/AccNums in list from Uniprot into BASEFILE.dat and use as seqin=FILE. []
→ Version 2.6.1: Removed the maxseq default setting.

• rje_slimlist: Updated from Version 1.4.
→ Version 1.5: Added run() method for slimsuite.py compatibility. Improved split motif handling.
→ Version 1.6: Modified to read in new ELM class download file with extra header information. Added varlength=T/F filter.
→ Version 1.6: Modified so that filtering one element of a split motif removes all.

• rje_tree: Updated from Version 2.10.
→ Version 2.11.0: Modified for standalone running as part of SeqSuite.

• rje_uniprot: Updated from Version 3.19.
→ Version 3.20: Updated dbsplit=T output and checked function with Pfam. Probably needs work for other databases.
→ Version 3.20.1: Added uniprotid=LIST as an alias to acclist=LIST and extract=LIST.
→ Version 3.20.2: Added extra sequence return methods to UniprotEntry. Added fasta REST output.
→ Version 3.20.3: Fixed bug if new uniprot extraction method fails.

• rje_xml: Created/Renamed.
→ Version 0.0: Initial Compilation.
→ Version 0.1: Added xml.sax functions.
→ Version 0.2: Added parsing from URL.

• rje_xref: Updated from Version 1.1.
→ Version 1.2: Added join=LIST Run in join mode for list of FILE:key1|...|keyN:JoinField [] and naturaljoin=T/F.
→ Version 1.3.0: Added compress=LIST to handle 1:many input data. []

• rje_zen: Updated from Version 1.2.
→ Version 1.3.0: Modified output to work with new REST service calls.

Thursday, 8 January 2015

New MAJOR.MINOR.PATCH version numbers

Wednesday, 7 January 2015

SLiMSuite release 2015-01-07 now available