What is the UniProt database?
What is it?
The Universal Protein Resource ("UniProt") is the largest provider of data on the proteomes of organisms. It is primarily composed of Swiss-Prot, a database of manually annotated and reviewed data, and TrEMBL, which is automatically annotated and not reviewed. Protein sequences are mostly derived from submissions to the DNA sequence databases, with varying levels of curation.
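The reviewed/unreviewed distinction is visible in UniProtKB FASTA downloads, where each header line starts with "sp" (Swiss-Prot) or "tr" (TrEMBL). The following is a minimal Python sketch, not an official UniProt tool, that reads one such header. The function name parse_uniprot_header, the HeaderInfo type, and the sample header string are illustrative assumptions; the header layout assumed is db|Accession|EntryName followed by the protein name and OS=, OX=, GN=, PE=, SV= fields.

```python
import re
from typing import NamedTuple, Optional

# Assumed header layout (hypothetical helper, not part of any UniProt library):
# >db|Accession|EntryName ProteinName OS=Organism OX=TaxID [GN=Gene] PE=n SV=n
HEADER_RE = re.compile(r"^>(sp|tr)\|([^|]+)\|(\S+)\s+(.*)$")


class HeaderInfo(NamedTuple):
    accession: str
    entry_name: str
    reviewed: bool            # True for Swiss-Prot ("sp"), False for TrEMBL ("tr")
    description: str
    organism: Optional[str]


def parse_uniprot_header(line: str) -> HeaderInfo:
    """Split a UniProtKB FASTA header line into its main parts."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a UniProtKB FASTA header: {line!r}")
    db, accession, entry_name, rest = match.groups()
    # Organism name runs from "OS=" up to the next two-letter "XX=" field.
    os_match = re.search(r"OS=(.+?)(?=\s[A-Z]{2}=|$)", rest)
    organism = os_match.group(1) if os_match else None
    # Protein name is everything before the first "OS=" field.
    description = rest.split(" OS=", 1)[0].strip()
    return HeaderInfo(accession, entry_name, db == "sp", description, organism)


# Illustrative header for a Swiss-Prot entry.
example = (">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha "
           "OS=Homo sapiens OX=9606 GN=HBA1 PE=1 SV=2")
info = parse_uniprot_header(example)
print(info.accession, "reviewed" if info.reviewed else "unreviewed")
```

Checking the "sp"/"tr" prefix this way is a quick filter when only manually curated entries are wanted from a mixed download.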
What is it for?
UniProt is a good way to quickly assess a vast set of protein properties, particularly for species that are not model organisms.
In addition to providing an understanding of basic protein properties (example), the database is also useful for:
- Understanding protein function
- Creating antibodies
- Site-specific mutagenesis
What's in it?
Constituent databases:
- UniProtKB (Protein Knowledgebase): Swiss-Prot, which is manually annotated and reviewed, and TrEMBL, which is automatically annotated and not reviewed.
- Protein data from every species where available.
- Protein properties such as:
  - Sites of expression
  - Interacting proteins
  - Related proteins; homology; orthology
  - Protein complex data
  - Protein modification data
  - Association with disease
  - Protein domains and subunit structure
  - Data about functional regions of the protein and sequence variants
  - Links to protein structure
Key references
- Apweiler R, Bairoch A, Wu CH, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R, Magrane M, Martin MJ, Natale DA, O'Donovan C, Redaschi N, Yeh LS (2004) "UniProt: the Universal Protein Knowledgebase", Nucleic Acids Res. 32:D115-D119.
- All PubMed references pertaining to UniProt and published by the UniProt group.
- Overview video: UniProt video presented at Stanford's Bioinformatics Week 2006; it must be viewed in the QuickTime 7 player by choosing File/Open URL and pasting this URL: rtsp://171.65.20.203:554/qtmedia/BioInfoWeek/EBI_UniProt_SA_H264.mov
Source
Lane Librarian
Record created 5/2/2006; updated 9/6/2006
ypouliot, September 16, 2009