Prediction of function shift in protein families
With a large number of complete genome sequences now available, it has become essential to annotate the protein sequences derived from them as precisely as possible. Although current computational methods can predict broad functionality for most protein sequences, there is still room for more precise functional annotation. Analysis of functional conservation and divergence in protein families can improve the quality of annotation for available genome sequences. Such an analysis makes protein families more useful, because it predicts which subgroups in a family share identical functions and which are likely to have diverged in function. Many genes of pharmacological interest occur in large families, for which understanding the specific function is important. This thesis describes a large-scale analysis of functional shifts in protein families.
Initially, we created a large dataset of protein families and subfamilies with known functional differences and assessed how well function shifts can be predicted using existing methods for identifying subfamily-specific functional residues. We showed that these methods can discriminate between same-function and different-function subfamilies, achieving a prediction accuracy of 71%. This approach predicted many previously unknown cases of function divergence (Paper I). We then introduced a new measure for predicting function shift that is representative of all positions in the alignment; combining it with previously proposed measures further improved function shift prediction (Paper II). A freely available web resource was developed to disseminate subfamily classification and function shift analysis of protein families (Paper III). Finally, we analyzed multi-species ortholog groups for functional shifts using the proposed methods and predicted many new cases of functional shifts between ortholog and paralog subfamilies (Paper IV).
This work demonstrates the power of classifying protein families into subfamilies, combined with function shift analysis, for better annotation of protein sequences emerging from genome sequencing efforts. The methods and resources developed as part of this thesis are valuable to scientists elucidating detailed functional aspects of proteins, and thus support evolutionary studies, comparative genomics and better drug design for human diseases.
List of scientific papers
I. Abhiman S, Sonnhammer EL (2005). Large-scale prediction of function shift in protein families with a focus on enzymatic function. Proteins. Sep 1;60(4): 758-68.
https://pubmed.ncbi.nlm.nih.gov/16001403
II. Abhiman S, Daub CO, Sonnhammer EL (2006). Prediction of function divergence in protein families using the substitution rate variation parameter alpha. Mol Biol Evol. Jul;23(7): 1406-13.
https://pubmed.ncbi.nlm.nih.gov/16672285
III. Abhiman S, Sonnhammer EL (2005). FunShift: a database of function shift analysis on protein subfamilies. Nucleic Acids Res. Jan 1;33: D197-200.
https://pubmed.ncbi.nlm.nih.gov/15608176
IV. Abhiman S, Lassmann T, Sonnhammer EL (2006). Function shift analysis in ortholog groups. [Manuscript]
History
Defence date
2006-09-04
Department
- Department of Cell and Molecular Biology
Publication year
2006
Thesis type
- Doctoral thesis
ISBN-10
91-7140-869-X
Number of supporting papers
4
Language
- eng