Datasets of Proteins of Known Localization

This website contains links to specific datasets used to train different versions of PSORTb, as well as datasets created for software evaluation purposes. For more advanced querying and for downloading subsets of data, please use PSORTdb in place of the datasets on this page. Datasets created by other groups are now linked from the list of resources on the PSORT.org Index Page.

PSORTdb Dataset (Brinkman Lab):

PSORTdb is a database of bacterial protein subcellular localization that contains both information determined through laboratory experimentation (the ePSORTdb dataset) and computational predictions (the cPSORTdb dataset). The ePSORTdb dataset of experimentally verified information (~11,600 proteins) was manually curated by us and represents the largest dataset of its kind. ePSORTdb lets users search, browse, and BLAST against proteins of experimentally verified localization, and customize and download the results of their searches. We recommend using ePSORTdb to generate your custom datasets; however, the archived dataset versions listed below are useful for training algorithms and for comparing performance between different localization prediction software.

Archived PSORTb Datasets:
For a detailed description of how the PSORTb dataset was initially created, please see the PSORT-B v.1.0 paper (Gardy et al., 2003). If you make use of an archived PSORTb dataset in your research, please cite the version number and one of the papers below. Citation information for PSORTdb can be found on the PSORTdb website.