Monday 25 August 2014

SLiMSuite release 2014-08-25 now available

A new download of SLiMSuite (release 2014-08-25) is now available (svn r466).

The latest release sees a major revamp of the basic chassis for SLiMSuite, which now sits on a newer underlying SLiMCore 2.x class to enable the megaslim=FILE upgrade (below). This should have no effect on end use, but please report any odd behaviour: it is possible that some compatibility bugs have crept in that will crash programs in rare scenarios.

The following tools are now in ./legacy/:

slimsuite/legacy/gopher_V2.py
slimsuite/legacy/qslimfinder_V1.9.py
slimsuite/legacy/slimbench_V1.py
slimsuite/legacy/slimdisc_V1.4.py
slimsuite/legacy/slimfinder_V4.9.py
slimsuite/legacy/slimprob_V1.4.py

SLiMSuite now features a new mode for multiple runs/datasets featuring the same input proteins. Running with megaslim=FILE, where FILE is a fasta file of all possible sequences in the dataset(s), will generate files of masking scores that can be reused instead of being recalculated each time. In each file, each sequence occupies one line, with the sequence name followed by a score per residue. If dismask=T then *.iupred.txt or *.anchor.txt will be created/read, which will list the disorder score for each position. If consmask=T then *.rlc.txt will list RLC scores. By default, scores will be read where present, or calculated and appended where missing. Running rje_slimcore will calculate scores for all sequences in megaslim=FILE.
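
As a rough illustration of the "read where present, calculate and append where missing" behaviour, the following Python sketch caches per-residue scores in a text file with one sequence per line. It is not SLiMSuite code: the whitespace delimiter, the function names (load_scores, get_scores) and the score_func callback are all assumptions made for the example, and the real *.iupred.txt, *.anchor.txt and *.rlc.txt layouts may differ.

```python
import os


def load_scores(path):
    """Read a score file: one sequence per line, name then one score per residue.

    Assumption: fields are whitespace-separated and names contain no spaces.
    """
    scores = {}
    if not os.path.exists(path):
        return scores
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            scores[fields[0]] = [float(x) for x in fields[1:]]
    return scores


def get_scores(path, sequences, score_func):
    """Return scores for every sequence, reusing cached values where present
    and calculating and appending them where missing.

    sequences: dict of {name: sequence string}
    score_func: hypothetical callable returning one float per residue
    """
    scores = load_scores(path)
    with open(path, "a") as handle:
        for name, seq in sequences.items():
            if name in scores:
                continue  # already cached, no recalculation needed
            values = score_func(seq)
            handle.write(name + " " + " ".join("%.4f" % v for v in values) + "\n")
            scores[name] = values
    return {name: scores[name] for name in sequences}
```

Calling get_scores() a second time with an overlapping set of sequences only invokes score_func for the names not already in the file, which is the saving the megaslim=FILE mode is aiming for across multiple runs.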

In addition to disorder and conservation scores, running SLiMCore with megablam=T will also create an all-by-all GABLAM run, which can be used for subsequent UPC generation. Even without megaslim=FILE, this can still be used with the gablamdis=FILE option.

If megaslim=None (the default), creating alignments for conservation masking can also be sped up by giving usegopher=T gopherdir=PATH forks=X, where forks > 1. This will use forking to check/create GOPHER alignments for all input sequences before a regular masking run.

The final upgrade of note is that SLiMBench now features an occurrence benchmarking mode (occbench=T), which has not yet made its way into the manual. More on SLiMBench soon.

Bug fixes in this release:

  • Combined case masking and disorder masking bug fixed.
  • Minor bugs (introduced with DNA=T mode) that affect extended alphabets in SLiMFinder have been fixed.
  • A minor bug in SLiMProb that returned positions 0 to L-1 rather than 1 to L for N-terminal motif matches (^xxx) has been fixed.
  • A bug affecting GOPHER when run with rje_blast_V2 has been fixed. There might still be some issues with rje_blast_V2 when the number of alignments output is smaller than the number of one-line hits.
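
The N-terminal position fix above is a classic off-by-one between 0-based and 1-based coordinates. The sketch below shows the conversion using Python's standard re module. It is only an illustration of the coordinate issue, not how SLiMProb searches for motifs, and the function name motif_positions is made up for the example.

```python
import re


def motif_positions(pattern, sequence):
    """Return (start, end) of each non-overlapping match as 1-based inclusive positions."""
    hits = []
    for match in re.finditer(pattern, sequence):
        # match.start() is 0-based and match.end() is exclusive, so adding 1
        # to the start alone gives 1-based inclusive coordinates.
        hits.append((match.start() + 1, match.end()))
    return hits
```

For example, motif_positions(r"^M.K", "MAKLT") gives [(1, 3)]: a motif of length L anchored at the N-terminus runs from position 1 to L, not 0 to L-1.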

Other miscellaneous updates are listed below.

Updates since last release:

• fiesta: Updated from Version 1.7.
→ Version 1.8: Minor crash fixes. Updated more functions to work with BLAST+.

• multihaq: Updated from Version 0.1.
→ Version 1.0: Fully working version. Fixed minor basefile bug. Added blastcut filter.
→ Version 1.1: Improved pickup of aborted run.

• qslimfinder: Updated from Version 1.8.
→ Version 1.9: Preparation for QSLiMFinder V2.0 & SLiMCore V2.0 using newer RJE_Object.
→ Version 2.0: Converted to use rje_obj.RJE_Object as base. Version 1.9 moved to legacy/.

• slimbench: Updated from Version 2.4.
→ Version 2.5: Basic OccBench assessment benchmarking. Added ELM Uniprot acclist output. (Download issues?)

• slimfinder: Updated from Version 4.8.
→ Version 4.9: Preparation for SLiMFinder V5.0 & SLiMCore V2.0 using newer RJE_Object.
→ Version 5.0: Converted to use rje_obj.RJE_Object as base. Version 4.9 moved to legacy/.
→ Version 5.1: Modified SLiMChance slightly to catch missing aafreq.

• slimprob: Updated from Version 1.3.
→ Version 1.4: Preparation for SLiMProb V2.0 & SLiMCore V2.0 using newer RJE_Object.
→ Version 2.0: Converted to use rje_obj.RJE_Object as base. Version 1.4 moved to legacy/.
→ Version 2.1: Modified output of N-terminal motifs to correctly start at position 1.

• rje: Updated from Version 4.11.
→ Version 4.11: Added self.name() to basic object class.
→ Version 4.12: Added 'bool' and 'str' to _cmdRead() to ease switchover to new RJE_Objects.

• rje_blast_V2: Updated from Version 2.6.
→ Version 2.7: Fixed occasional oneline versus description mismatch error. Fixed some localhits bugs.

• rje_db: Updated from Version 1.4.
→ Version 1.5: Fixed occasional key error following addField. Added indexReport() method.

• rje_disorder: Updated from Version 0.7.
→ Version 0.8: Added makeRegions() method.

• rje_obj: Updated from Version 1.7.
→ Version 1.8: Cleaned up some erroneous opt, stat and info references.
→ Version 2.0: Added self.file dictionary and methods for handling file handles with matching self.str filenames.

• rje_seq: Updated from Version 3.19.
→ Version 3.20: Added run() method for SeqSuite.

• rje_seqlist: Updated from Version 1.6.
→ Version 1.6: Added sequence fragment extraction.
→ Version 1.7: Added code to create rje_sequence.Sequence objects.

• rje_slim: Updated from Version 1.7.
→ Version 1.8: Modified use of aa/dna defaults to (hopefully) not break when using extended alphabets.

• rje_slimcore: Updated from Version 1.15.
→ Version 1.16: Preparation for SLiMCore V2.0 using newer RJE_Object.
→ Version 2.0: Converted to use rje_obj.RJE_Object as base. Version 1.16 moved to legacy/.
→ Version 2.1: Added megaslim=FILE option to make/use precomputed results for a proteome. Upgraded MotifSeq method.
→ Version 2.2: Modified aa frequency calculations to use alphabet to generate 0.0 frequencies (rather than missing aa).

• rje_slimlist: Updated from Version 1.3.
→ Version 1.4: Modified code to be compatible with SLiMCore V2.x objects.

• rje_zen: Updated from Version 1.1.
→ Version 1.2: Added a webserver mode to return text directly.

Wednesday 6 August 2014

SLiMSuite bug with combined sequence case and disorder masking

A small flaw has been discovered in the current implementation of disorder masking when it is combined with masking upper or lower case residues (casemask=X dismask=T). Rather than predicting disorder on the unmasked sequence and then combining with any case masking, disorder predictions are currently made on the masked sequences.

Hopefully, this will have minimal impact for the majority of cases. (Although I am not certain, I suspect that it will produce a tendency to over-predict disorder and thus under-mask.) This bug has been fixed for the next release of SLiMSuite. Note that other masking combinations are not affected.
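
To make the intended order of operations concrete, here is a minimal Python sketch in which disorder is predicted on the unmasked sequence and the result is then combined with case masking. Everything in it is hypothetical: the predict_disorder callback stands in for a real predictor, and the mask_case, cutoff and mask_char parameters are illustrative names rather than SLiMSuite options.

```python
def combined_mask(sequence, predict_disorder, mask_case="lower", cutoff=0.2, mask_char="X"):
    """Mask residues of the given case plus residues predicted to be ordered.

    predict_disorder: hypothetical callable returning one score per residue.
    cutoff: illustrative threshold; scores below it are treated as ordered.
    """
    # Predict on the original residues, before any case masking is applied.
    scores = predict_disorder(sequence.upper())
    masked = []
    for residue, score in zip(sequence, scores):
        case_hit = residue.islower() if mask_case == "lower" else residue.isupper()
        ordered = score < cutoff
        # Combine the two masks: a residue is masked if either applies.
        masked.append(mask_char if (case_hit or ordered) else residue.upper())
    return "".join(masked)
```

The bug described above corresponds to applying the case mask first and passing the already-masked string to the predictor, so that its scores are influenced by the mask characters rather than the real residues.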