SLiMSuite & SeqSuite sequence analysis tools: July 2013

Tuesday, 30 July 2013

Getting Help

Much of the information here is also contained in the documentation of the Python modules themselves. A full list of command-line parameters can be printed to screen using the help option, with short descriptions for each one:

python program.py help
python program.py -help
python program.py -h

Details of command-line options specific to each program can also be found in the distributed readme.txt and readme.html files.

You can also retrieve documentation for a given program by replacing sitemap in the box below and clicking View Documentation. Leaving sitemap in the box will list all modules, which can then be clicked on.

If you get stuck, or something is unclear, please e-mail me (seqsuite@gmail.com) with your question. If the problem involves an error message, please send me that message and the log file too.

Thursday, 18 July 2013

SLiMScape: a protein short linear motif analysis plugin for Cytoscape.

New paper published!

O’Brien KT, Haslam NJ & Shields DC (2013). SLiMScape: a protein short linear motif analysis plugin for Cytoscape. BMC Bioinformatics 14(1):224. [Epub ahead of print]

BACKGROUND: Computational protein short linear motif discovery can use protein interaction information to search for motifs among proteins which share a common interactor. Cytoscape provides a visual interface for protein networks but there is no streamlined way to rapidly visualize motifs in a network of proteins, or to integrate computational discovery with such visualizations.

RESULTS: We present SLiMScape, a Cytoscape plugin, which enables both de novo motif discovery and searches for instances of known motifs. Data is presented using Cytoscape’s visualization features thus providing an intuitive interface for interpreting results. The distribution of discovered or user defined motifs may be selectively displayed and the distribution of protein domains may be viewed simultaneously. To facilitate this SLiMScape automatically retrieves domains for each protein.

CONCLUSION: SLiMScape provides a platform for performing short linear motif analyses of protein interaction networks by integrating motif discovery and search tools in a network visualization environment. This significantly aids in the discovery of novel short linear motifs and in visualizing the distribution of known motifs.

PMID: 23855714

Sunday, 14 July 2013

SLiMSuite at the 3rd International Conference on Proteomics & Bioinformatics

If anyone is attending the 3rd International Conference on Proteomics & Bioinformatics this week then be sure to say hello. I am speaking on the last day in the “Computational Biology” track. (Never the best time to talk at a conference, as there is limited time for follow-up, but at least it is before lunch!)

SLiM Pickings: mining structural and sequence data for the prediction of short linear protein interaction motifs

Short Linear Motifs (SLiMs) are short functional protein sequences that act as ligands to mediate transient protein-protein interactions (PPI) in critical biological pathways and signaling networks. SLiMs are short (3-15aa), generally tolerate considerable sequence variation and typically have fewer than five residues critical for function. These features result in a degree of evolutionary plasticity not seen in domains and SLiMs often add new functions to proteins by convergent evolution. They also present a challenge for computational identification, making it difficult to differentiate biological signal from stochastic patterns. Despite this, discovering new SLiMs is of great interest due to their potential as therapeutic targets.

In recent years, we have made great progress in SLiM discovery, particularly through development of the SLiMSuite package of bioinformatics tools. SLiMs generally occur in structurally disordered regions of proteins and exhibit evolutionary conservation relative to other disordered residues. SLiMFinder uses this knowledge and exploits patterns of convergent evolution to predict novel, over-represented motifs within a statistical framework with high specificity. Applying this approach to a comprehensive set of human PPI data has highlighted interactome complexity and quality as the next challenges for SLiM prediction. Our latest development, QSLiMFinder (“Query” SLiMFinder) tackles some of these issues by incorporating specific interaction data to restrict the motif search space, which improves both the sensitivity and biological relevance of predictions. We are now using QSLiMFinder to combine structurally defined domain-motif interactions with large-scale PPI data to perform large-scale de novo SLiM prediction.
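To make the abstract's point about stochastic patterns concrete, here is a minimal Python sketch. It is not SLiMFinder, QSLiMFinder or any SLiMSuite code: the function names, the toy sequences and the uniform 5% residue frequency are all assumptions made for illustration. It searches for a motif written as a regular expression (here in the style of the LxCxE pattern, with three fixed positions) and estimates how often such a pattern would turn up by chance.

```python
import re


def find_motif_instances(pattern, sequences):
    """Return (name, 1-based start, matched text) for each non-overlapping match.

    Hypothetical helper for illustration only; not part of SLiMSuite.
    """
    regex = re.compile(pattern)
    hits = []
    for name, seq in sequences.items():
        for match in regex.finditer(seq):
            hits.append((name, match.start() + 1, match.group()))
    return hits


def expected_chance_hits(n_fixed, motif_len, seq_len, aa_freq=0.05):
    """Expected number of random matches in one sequence.

    Assumes every fixed position matches independently with probability
    aa_freq (uniform frequencies over 20 amino acids), which real
    proteins do not obey.
    """
    positions = max(seq_len - motif_len + 1, 0)
    return positions * aa_freq ** n_fixed


# Invented toy sequences, not real proteins.
toy_sequences = {
    "protA": "MSTLLCHEDKLYCTEAA",
    "protB": "MKKPPAAGG",
}

hits = find_motif_instances("L.C.E", toy_sequences)
for hit in hits:
    print(hit)

# Three fixed residues in a five-residue motif, in a 500 aa protein.
chance = expected_chance_hits(n_fixed=3, motif_len=5, seq_len=500)
print(round(chance, 3))
```

Under these assumptions a single 500-residue protein is expected to contain roughly 0.06 chance matches, so a proteome-scale search will return many instances with no biological relevance. That is why extra evidence such as disorder, conservation and over-representation among interaction partners matters.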

Thursday, 11 July 2013

Documentation

SLiMSuite and SeqSuite have grown into rather unwieldy beasts since their origins as individual programs and the documentation has struggled to keep up. In particular, the original plan of a single PDF manual per program is getting creaky. Because of the shared reliance on common modules, multiple programs make use of the same sets of options for alignments and conservation scoring etc. and propagating tweaks and modifications through all the manuals can be a bit head-wrecking.

As a result of all of this, the documentation is currently undergoing a bit of a review and rethink. I am still keen to keep the PDF manuals (as I think they are useful) but will be working through an intermediate phase of online Markdown/HTML documentation of some kind. The current plan is to trickle out draft copies via the blog and then probably release a Git repository once it is sufficiently populated.

In the meantime, I would be interested to hear any thoughts regarding favoured documentation styles etc. (e.g. HTML vs PDF, large files vs small chunks) as well as bits that are particularly unclear or in need of attention.

Monday, 8 July 2013

New Software Release

New releases of SeqSuite, SLiMSuite and RJESuite are now available.

The biggest change since the last release is the renaming of SLiMSearch to SLiMProb. This is to avoid confusion between the old SLiMSearch 1.x (now SLiMProb) and the newer SLiMSearch 2.x webserver, which has a different range of functions.

Updates since last release:

• cpppred: Created.

• gopher: Updated from Version 3.1.
→ Version 3.2: Minor tweak to prevent unwanted directory generation for programs using existing GOPHER alignments.
→ Version 3.3: Added rje_blast_V2 to use BLAST+. Run with legacy=T to stick with old NCBI BLAST. Started utilising rje_seqlist.

• pepbindpred: Created.

• slimprob: Created.
→ Version 1.0: SLiMProb 1.0 based on SLiMSearch 1.7. Altered output files to be *.csv and *.occ.csv.

• file_monster: Updated from Version 2.0.
→ Version 2.1: Added dirsum function.

• rje: Updated from Version 4.5.
→ Version 4.6: Added dev and warn options.

• rje_blast_V2: Created.
→ Version 2.0: Initial Compilation from rje_blast_V1 V1.14.
→ Version 2.1: Tweaking code to work with GOPHER 3.x - removing self.info etc. Added blastObj() method.

• rje_db: Updated from Version 0.4.
→ Version 0.5: Initial coding of index mode. (Not yet fully functional.)
→ Version 1.0: Working, so upgraded to version 1.0!

• rje_obj: Updated from Version 0.0.
→ Version 1.0: Fully working version, so upgraded to 1.0. Added dev and warn options.

• rje_seq: Updated from Version 3.15.
→ Version 3.16: Added BLAST+ path and seqFromBlastDBCmd()

• rje_slimcalc: Updated from Version 0.5.
→ Version 0.6: Minor tweak to avoid unwanted GOPHER directory generation.
→ Version 0.7: Added RLC to "All" conscore running.

• rje_slimcore: Updated from Version 1.9.
→ Version 1.10: Bypass UPC generation for single sequences.

Documentation is still in the process of development. BLAST+ implementation is ongoing - please get in touch if this is something you need.