Subject |
Biopython
The ability to parse bioinformatics files into data structures usable from Python, including support for the
following formats:
- Blast output, from both standalone and WWW Blast
- Clustalw
- FASTA
- GenBank
- PubMed and Medline
- ExPASy files, like Enzyme and Prosite
- SCOP, including 'dom' and 'lin' files
- UniGene
- SwissProt
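
To make the parsing idea concrete, here is a minimal standalone sketch of reading one of the formats listed above
(FASTA) record by record, then indexing the records by ID with a plain dict. It uses only the standard library. The
helper name `iter_fasta` and the example sequences are made up for illustration; this is not Biopython's parser,
which handles many more formats and edge cases.

```python
from io import StringIO


def iter_fasta(handle):
    """Yield (record_id, sequence) tuples from a FASTA-formatted text handle.

    Hypothetical helper for illustration only, not part of Biopython.
    """
    record_id, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if record_id is not None:
                yield record_id, "".join(chunks)
            # The ID is the first whitespace-delimited token after '>'.
            header = line[1:].split()
            record_id, chunks = (header[0] if header else ""), []
        elif record_id is not None:
            chunks.append(line)  # sequence data may span several lines
    if record_id is not None:
        yield record_id, "".join(chunks)


# Made-up example data standing in for a file on disk.
example = StringIO(
    ">seq1 first example\nATGGCC\nATTGTA\n>seq2 second example\nGGGTTTAAA\n"
)

# Record-by-record iteration.
for record_id, sequence in iter_fasta(example):
    print(record_id, len(sequence))

# Dictionary-style access by record ID.
example.seek(0)
records = dict(iter_fasta(example))
print(records["seq2"])
```

Loading everything into a dict is fine for small files; for large files an on-disk index is usually the better
choice, which is the kind of indexing a full library provides.
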
Files in the supported formats can be iterated over record by record, or indexed and accessed via a dictionary
interface.

Code to deal with popular on-line bioinformatics destinations such as:
- NCBI Blast, Entrez and PubMed services
- ExPASy Swiss-Prot and Prosite entries, as well as Prosite searches

Interfaces to common bioinformatics programs such as:
- Standalone Blast from NCBI
- Clustalw alignment program
- EMBOSS command line tools

A standard sequence class that deals with sequences, ids on sequences, and sequence features.

Tools for performing common operations on sequences, such as translation, transcription and weight calculations.

Code to perform classification of data using k Nearest Neighbors, Naive Bayes or Support Vector Machines.

Code for dealing with alignments, including a standard way to create and deal with substitution matrices.

Code making it easy to split up parallelizable tasks into separate processes.

GUI-based programs to do basic sequence manipulations, translations, BLASTing, etc.

Extensive documentation and help with using the modules, including this file, on-line wiki documentation, the web
site, and the mailing list.

Integration with BioSQL, a sequence database schema also supported by the BioPerl and BioJava projects.
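
As an illustration of the common sequence operations mentioned above, the following standalone sketch shows what
transcription and translation amount to, using only the standard library. The function names `transcribe` and
`translate` here are hypothetical helpers written for this example, and the codon table is the standard genetic
code; this is a sketch of the idea, not Biopython's implementation, which also supports alternative codon tables
and ambiguous bases.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position
# (third base varying fastest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Return the RNA form of a coding-strand DNA sequence (T becomes U)."""
    return dna.upper().replace("T", "U")


def translate(dna):
    """Translate a DNA sequence codon by codon, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable_length = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable_length, 3))


coding_dna = "ATGGCCATTGTAATGGGCCGCTGA"  # made-up example sequence
print(transcribe(coding_dna))
print(translate(coding_dna))
```

This sketch raises a `KeyError` on any base other than A, C, G or T, which keeps it short but makes it unsuitable
for real data containing ambiguity codes.
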
|