By Monique Zahn, neXtProt team, SIB Swiss Institute of Bioinformatics, Switzerland
Our understanding of human biology at the molecular level has progressed from the sequencing of the human genome in 2001 to the experimental validation of over 90% of the human proteome in 2020. Now that the protein “parts list” is almost complete, it is time to turn our efforts to annotating the function of these human proteins, i.e. to completing the functional human proteome. The C-HPP neXt-CP50 pilot project, launched in 2018, aims to characterize 50 identified but functionally uncharacterized PE1 proteins (uPE1 proteins) (1). In the current neXtProt release (data release 2021-02-18), there are 20,379 entries, of which 1,273 are uPE1 proteins and 396 are uMPs (proteins with evidence suggestive of their existence (PE2–4) but no known or predicted function). A manual workflow to generate hypotheses for the function of these uncharacterized proteins has been developed, based on predicted and experimental information on protein properties, interactions, tissue expression, subcellular localization, conservation in other organisms, and phenotypic data from mutant model organisms. This workflow has been applied as part of a course-based undergraduate research experience (CURE) organized at the University of Geneva (2). Function predictions for 21 entries, of which 20 are uPE1, are online. Many more entries are waiting for functional predictions!
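To put the release figures above in proportion, the short Python sketch below computes what share of neXtProt entries each uncharacterized category represents. The counts are those quoted for data release 2021-02-18; the constant names and the `share` helper are illustrative assumptions, not part of any neXtProt tool or API, and the combined uPE1 + uMP total is derived here rather than reported by neXtProt.

```python
# Entry counts quoted in the text for neXtProt data release 2021-02-18.
# All names below are hypothetical, chosen only for this illustration.
TOTAL_ENTRIES = 20_379
UPE1 = 1_273   # PE1 proteins with no known or predicted function
UMP = 396      # PE2-4 proteins with no known or predicted function


def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Derived figure (not stated in the text): all uncharacterized entries.
uncharacterized = UPE1 + UMP

for label, count in [
    ("uPE1", UPE1),
    ("uMP", UMP),
    ("uPE1 + uMP", uncharacterized),
]:
    pct = share(count, TOTAL_ENTRIES)
    print(f"{label}: {count} entries ({pct:.1f}% of all entries)")
```

Taken together, the two categories amount to 1,669 entries, or about 8% of the release, which gives a sense of how much functional annotation remains to be done.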
A functional human proteome project page that describes the goal, tracks progress and provides instructions on how to submit predictions to neXtProt is available at https://www.nextprot.org/about/functional-proteome-project.
Figure: Manual data mining workflow.
1. Paik YK et al. Launching the C-HPP neXt-CP50 Pilot Project for Functional Characterization of Identified Proteins with No Known Function. J Proteome Res. 2018 Dec 7;17(12):4042-4050. doi: 10.1021/acs.jproteome.8b00383
2. Duek P et al. Functionathon: a manual data mining workflow to generate functional hypotheses for uncharacterized human proteins and its application by undergraduate students. Database 2021, baab046 (2021). doi: 10.1093/database/baab046