SLiMSuite & SeqSuite sequence analysis tools: November 2015

The November 2015 release of SLiMSuite v1.1.0 (2015-11-30) is now on GitHub. This is an intermediate release in preparation for the BioInfoSummer 2015 SLiMSuite workshop and contains a few minor modifications to SLiMSuite programs. The main updates are preliminary versions of some tools for PacBio genomics, notably PAGSAT and SMRTSCAPE. These are still in development and need further documentation and testing before use is advised.

The SeqSuite GenBank parser has some bug fixes for reverse-complemented protein sequences with introns, and initial capacity for different codon tables. (This has been implemented for yeast, so only NCBI transl_tables 1-3 are currently implemented: please get in touch if you want to use this program with other codon tables.)
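As a rough illustration of what these two changes involve (this is not the rje_genbank code, and all function names here are hypothetical), the sketch below extracts a coding sequence from a `complement(join(...))` style location by joining the exon segments in genomic order and then reverse complementing the result, and translates it with NCBI transl_table 1, 2 or 3. The example sequence and coordinates are made up.

```python
# Minimal sketch, assuming 1-based inclusive exon coordinates given in genomic order.
# Function names are illustrative only and are not part of SLiMSuite/SeqSuite.

BASES = "TCAG"
# Standard genetic code (NCBI transl_table 1) in TCAG codon order.
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def codon_table(transl_table=1):
    """Return a codon -> amino acid dict for NCBI transl_table 1, 2 or 3."""
    if transl_table not in (1, 2, 3):
        raise ValueError("Only transl_table 1-3 are handled in this sketch")
    table = dict(zip(CODONS, STANDARD_AA))
    if transl_table in (2, 3):  # Both mitochondrial codes: ATA=Met, TGA=Trp
        table.update({"ATA": "M", "TGA": "W"})
    if transl_table == 2:  # Vertebrate mitochondrial: AGA/AGG are stops
        table.update({"AGA": "*", "AGG": "*"})
    if transl_table == 3:  # Yeast mitochondrial: CTN=Thr
        table.update({codon: "T" for codon in ("CTT", "CTC", "CTA", "CTG")})
    return table


def reverse_complement(seq):
    """Reverse complement a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def cds_from_complement_join(genome, segments):
    """Join exon segments in genomic order, then reverse complement the whole CDS."""
    joined = "".join(genome[start - 1:end] for start, end in segments)
    return reverse_complement(joined)


def translate(cds, transl_table=1):
    """Translate complete codons only; unrecognised codons become X."""
    table = codon_table(transl_table)
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(table.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


# Made-up reverse-strand gene: two exons split by an 8 bp intron, with 2 bp flanks.
sense = "ATGCTTT" + "GTAAGTAG" + "GAATATAA"
genome = "CC" + reverse_complement(sense) + "GG"
cds = cds_from_complement_join(genome, [(3, 10), (19, 25)])
print(cds)                # ATGCTTTGAATATAA
print(translate(cds, 1))  # ML*I*
print(translate(cds, 3))  # MTWM*
```

Reverse complementing each exon separately and joining them in genomic order would put the exons in the wrong order, which is the kind of error that only shows up for reverse-strand genes with introns.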

SLiMSuite updates in this release

Updates in libraries/:

• rje: Updated from Version 4.14.0.
→ Version 4.14.1: Fixed matchExp method to be able to handle multilines. (Shame re.DOTALL doesn't work!)
→ Version 4.14.2: Modified integer commands to read/convert floats.
→ Version 4.15.0: Added intList() and numList() functions.

• rje_db: Updated from Version 1.7.5.
→ Version 1.7.6: Added table.opt['Formatted'] = Whether table data has been successfully formatted using self.dataFormat().
→ Version 1.7.7: Added option to constrain table splitting to certain field values.
→ Version 1.8.0: Added option to store keys as tuples for correct sorting. (Make default at some point.)

• rje_genbank: Updated from Version 1.3.1.
→ Version 1.3.2: Fixed bug in reverse complement sequences with introns.

• rje_iridis: Updated from Version 1.10.
→ Version 1.10.1: Attempted to fix SLiMFarmer batch run problem. (Should not be setting irun=batch!)
→ Version 1.10.2: Trying to clean up unknown 30s pause. Might be freemem issue?

• rje_obj: Updated from Version 2.1.2.
→ Version 2.1.3: Modified integer commands to read/convert floats.

• rje_qsub: Updated from Version 1.6.2.
→ Version 1.6.3: Tweaked the showstart command for katana.

• rje_samtools: Created/Renamed/moved.
→ Version 0.0: Initial Compilation.
→ Version 0.1.0: Modified version to handle multiple loci per file. (Original was for single bacterial chromosomes.)

• rje_seqlist: Updated from Version 1.11.0.
→ Version 1.12.0: Added peptides/qregion reformatting and region=X,Y.
→ Version 1.13.0: Added summarise=T option for generating some summary statistics for sequence data. Added minlen & maxlen.
→ Version 1.14.0: Added splitseq=X : Split output sequence file according to X (gene/species) [None]
→ Version 1.15.0: Added names() method.
→ Version 1.15.1: Fixed bug with storage and return of summary stats.
→ Version 1.15.2: Fixed dna2prot reformatting.
→ Version 1.15.3: Fixed summarise bug (n=1).

• rje_sequence: Updated from Version 2.4.1.
→ Version 2.5.0: Added yeast genome renaming.
→ Version 2.5.1: Modified reverse complement code.
→ Version 2.5.2: Tried to speed up dna2prot code.

• rje_slimcalc: Updated from Version 0.9.
→ Version 0.9.1: Modified combining of motif stats to handle expectString format for individual values.
→ Version 0.9.2: Changed default conscore in docstring to RLC.

• rje_slimcore: Updated from Version 2.7.3.
→ Version 2.7.4: Fixed walltime server bug.
→ Version 2.7.5: Fixed feature masking.

• rje_slimlist: Updated from Version 1.7.2.
→ Version 1.7.3: Fixed bug that could not accept variable length motifs from commandline. Improved error message.

• rje_taxonomy: Updated from Version 1.0.
→ Version 1.1.0: Added parsing of yeast strains.

• rje_tree: Updated from Version 2.11.2.
→ Version 2.12.0: Added treeLen() method.
→ Version 2.13.0: Updated PNG saving with R to use newer code.

• rje_uniprot: Updated from Version 3.21.3.
→ Version 3.21.4: Fixed Feature masking. Should this be switched off by default?

• rje_xref: Updated from Version 1.6.0.
→ Version 1.7.0: Added comments=LIST : List of comment line prefixes marking lines to ignore (throughout file) ['//','%']
→ Version 1.7.1: Added xreformat=T/F : Whether to apply field reformatting to input xrefdata (True) or just xrefs to map (False) [False]
→ Version 1.8.0: Added recognition and parsing of yeast.txt XRef file from Uniprot (http://www.uniprot.org/docs/yeast.txt)

• snp_mapper: Created/Renamed/moved.
→ Version 0.0: Initial Compilation. Batch mode for mapping SNPs needs updating.
→ Version 0.1: SNP mapping against a GenBank file.
→ Version 0.2: Fixed complement strand bug.
→ Version 0.3.0: Updated to work with RATT(/Mummer?) snp output file. Improved docs.
→ Version 0.4.0: Major reworking for easier updates and added functionality. (Convert to 1.0.0 when complete.)
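The rje_db 1.8.0 note about storing keys as tuples "for correct sorting" can be illustrated with a small sketch. This is not the rje_db API: the row layout, field names and helper functions below are assumptions made for the example. The point is simply that keys built by joining field values into a string sort character by character, whereas tuple keys keep numeric fields as numbers.

```python
# Minimal sketch, assuming table entries are dicts and keys are built from named fields.
# string_key() and tuple_key() are hypothetical helpers, not rje_db methods.

rows = [
    {"Locus": "chrI", "Pos": 100},
    {"Locus": "chrI", "Pos": 20},
    {"Locus": "chrI", "Pos": 3},
]


def string_key(row, fields, delimit="|"):
    """Join field values into a single string key."""
    return delimit.join(str(row[field]) for field in fields)


def tuple_key(row, fields):
    """Keep field values (and their types) in a tuple key."""
    return tuple(row[field] for field in fields)


fields = ["Locus", "Pos"]
by_string = sorted(rows, key=lambda row: string_key(row, fields))
by_tuple = sorted(rows, key=lambda row: tuple_key(row, fields))

print([row["Pos"] for row in by_string])  # [100, 20, 3]
print([row["Pos"] for row in by_tuple])   # [3, 20, 100]
```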

Updates in tools/:

• gablam: Updated from Version 2.19.2.
→ Version 2.20.0: Added SNP Table output.

• gopher: Updated from Version 3.4.1.
→ Version 3.4.2: Removed GOPHER System Exit on IOError to prevent breaking of REST server.

• pagsat: Created/Renamed/moved.
→ Version 1.0.0: Initial working version based on rje_pacbio assessment=T.
→ Version 1.1.0: Fixed bug with gene and protein summary data. Removed gene/protein reciprocal searches. Added compare mode.
→ Version 1.1.1: Added PAGSAT output directory for tidiness!
→ Version 1.1.2: Renamed the PacBio class PAGSAT.
→ Version 1.2.0: Tidied up output directories. Added QV filter and Top Gene/Protein hits output.
→ Version 1.2.1: Added casefilter=T/F : Whether to filter leading/trailing lower case (low QV) sequences [True]
→ Version 1.3.0: Added tophitbuffer=X and initial synteny analysis for keeping best reference hits.
→ Version 1.4.0: Added chrom-v-contig alignment files along with *.ordered.fas.
→ Version 1.4.1: Made default chromalign=T.
→ Version 1.4.2: Fixed casefilter=F.
→ Version 1.5.0: Added diploid=T/F : Whether to treat assembly as a diploid [False]
→ Version 1.6.0: Added mincontiglen=X : Minimum contig length to retain in assembly [1000]
→ Version 1.6.1: Added diploid=T/F to R PNG call.

• peptcluster: Updated from Version 1.5.1.
→ Version 1.5.2: Improved clarity of warning message.

• pingu_V4: Updated from Version 4.5.0.
→ Version 4.5.1: Debugging missing identifiers and indexing speed. Added good and bad DB.
→ Version 4.5.2: Fixed SIF output and changed names to sif-* for opening in browser.
→ Version 4.5.3: Updated REST output.

• seqsuite: Updated from Version 1.8.0.
→ Version 1.9.0: Added PAGSAT and SMRTSCAPE.
→ Version 1.9.1: Fixed HAQESAC setobjects=True error.
→ Version 1.10.0: Added batchrun=FILELIST batcharg=X batch running mode.
→ Version 1.11.0: Added SAMTools and Snapper/SNP_Mapper.

• slimbench: Updated from Version 2.10.0.
→ Version 2.10.1: Updated ELM Source URLs.

• slimfarmer: Updated from Version 1.4.2.
→ Version 1.4.3: Added recognition of missing slimsuite programs and switching to slimsuite=F.

• slimfinder: Updated from Version 5.2.0.
→ Version 5.2.1: Fixed ambocc<1 and minocc<1 issue. (Using integers rather than floats.) Fixed OccRes Sig output format.

• slimparser: Updated from Version 0.3.1.
→ Version 0.3.2: Fixed issue reading files for full output.
→ Version 0.3.3: Tidied output names when restbase=jobid.

• slimprob: Updated from Version 2.2.3.
→ Version 2.2.4: Improved slimcalc output (s.f.).

• slimsuite: Updated from Version 1.5.0.
→ Version 1.5.1: Changed disorder to iuscore to avoid module conflict.

• smrtscape: Created/Renamed/moved.
→ Version 0.0.0: Initial Compilation.
→ Version 1.0.0: Initial working version for server.
→ Version 1.1.0: Added xnlist=LIST : Additional columns giving % sites with coverage >= Xn [10,25,50,100].
→ Version 1.2.0: Added assessment -> now PAGSAT.
→ Version 1.3.0: Added seed and anchor read coverage generator (calculate=T).
→ Version 1.3.1: Deleted assessment function. (Now handled by PAGSAT.)
→ Version 1.4.0: Added new coverage=T function that incorporates seed and anchor subreads.
→ Version 1.5.0: Added parseparam=FILES with paramlist=LIST to parse restricted sets of parameters.
→ Version 1.6.0: New SMRTSCAPE program building on PacBio v1.5.0. Added predict=T/F option.
→ Version 1.6.1: Updated parameters=T to incorporate that the seed read counts as X=1.
→ Version 1.7.0: Added *.summary.tdt output from subread summary analysis. Added minreadlen.
→ Version 1.8.0: Added preassembly=FILE : Preassembly fasta file to assess/correct over-fragmentation (use seqin=FILE for subreads)
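For readers unfamiliar with the xnlist=LIST columns added in SMRTSCAPE 1.1.0 ("% sites with coverage >= Xn"), the calculation can be sketched as follows. This is only an illustration of the statistic, not SMRTSCAPE code: the function name and the per-site depth list are assumptions, and the depth values are made up.

```python
# Minimal sketch, assuming per-site read depths are already available as a list of ints.
# coverage_summary() is a hypothetical helper, not a SMRTSCAPE function.


def coverage_summary(depths, xnlist=(10, 25, 50, 100)):
    """Return {X: percentage of sites with depth >= X} for each X in xnlist."""
    total = len(depths)
    if total == 0:
        return {x: 0.0 for x in xnlist}
    return {
        x: 100.0 * sum(1 for depth in depths if depth >= x) / total
        for x in xnlist
    }


depths = [5, 12, 30, 60, 120, 0, 25, 50, 100, 9]  # made-up depths for ten sites
summary = coverage_summary(depths)
for x in sorted(summary):
    print("%dX: %.1f%%" % (x, summary[x]))
```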

Monday 30 November 2015
