Karolinska Institutet

Multivariate analysis of cancer proteomics data : towards a biological systems view and understanding

thesis
posted on 2024-09-02, 23:32 authored by Lina Hultin-Rosenberg

Important aims of cancer proteomics include gaining a better understanding of cancer biology and identifying cancer biomarkers. Mass spectrometry (MS) based shotgun proteomics allows for identification and quantification of thousands of proteins in complex human samples. However, proteomics discovery research in clinical material faces many challenges. The biological differences between groups are often expected to be rather small, while the human proteome is highly complex and there is large biological variation between clinical samples. To extract meaningful results from proteomics data derived from biological and clinical material, care has to be taken at every critical step in the data analysis workflow. First of all, we need robust methods to extract good quality data. A proper statistical analysis is then of utmost importance, taking into account the risks of overfitting and false positives. In addition, we need system based approaches to relate the data to clinical and biological questions.

The main goal of this thesis was to generate robust methods for selection of key proteins, networks and pathways relevant for answering biological and clinical questions. The work includes development and evaluation of workflows for quantitative analysis of proteomics data.

In paper I, a multivariate meta-analysis workflow was developed to link existing proteomics data from human colon and prostate tumours. The aim was to identify proteins distinguishing between normal and tumour samples independent of tissue origin, as well as to find unique markers. The bioinformatics workflow for meta-analysis developed in this study enabled the finding of a common protein profile for the two malignant tumour types, which was not possible when analysing the data sets separately. The purpose of paper II was to establish a basis for deciding which protein quantities are reliable and to find a way to quantify proteins accurately and precisely. We developed a methodology for improved protein quantification in shotgun proteomics and introduced a way to assess quantification for proteins with few peptides. The experimental design and developed algorithms decreased the relative protein quantification error in the analysis of complex biological samples. In paper III, we presented SpliceVista, a tool for splice variant identification and visualization based on MS proteomics data. SpliceVista identifies splice variant specific peptides and makes it possible to perform splice variant specific quantitative analysis. SpliceVista was applied to two experimental datasets to exemplify its capability of detecting differentially expressed splice variants at the protein level. The aim of paper IV was to develop a network based analysis workflow for proteomics data to identify protein subnetworks with different activity between groups of samples. The methodology, which is based on a multivariate model directed by the network, was applied to several of our clinical mass spectrometry datasets. The output from the subnetwork analysis consisted of functional subunits of proteins rather than a collection of sparse proteins. These subunits were shown to more readily provide a model of the biological mechanisms studied, and thus aid in the biological interpretation.

List of scientific papers

I. Rosenberg LH, Franzén B, Auer G, Lehtiö J, Forshed J. Multivariate meta-analysis of proteomics data from human prostate and colon tumours. BMC Bioinformatics. 2010 Sep 17;11:468.
https://doi.org/10.1186/1471-2105-11-468

II. Hultin-Rosenberg L, Forshed J, Branca RMM, Lehtiö J, Johansson H. Defining, comparing and improving iTRAQ quantification in mass spectrometry proteomics data. Molecular and Cellular Proteomics. 2013;12(7).
https://doi.org/10.1074/mcp.M112.021592

III. Zhu Y, Hultin-Rosenberg L, Forshed J, Lehtiö J. SpliceVista - an identification and visualization tool to detect splice variants in shotgun proteomics data. [Submitted]

IV. Hultin-Rosenberg L, Zhu Y, Branca RMM, Eriksson H, Forshed J, Lehtiö J. A multivariate network based analysis of in-depth proteomics data from cancer studies. [Manuscript]

History

Defence date

2013-09-19

Department

  • Department of Oncology-Pathology

Publisher/Institution

Karolinska Institutet

Main supervisor

Lehtiö, Janne

Publication year

2013

Thesis type

  • Doctoral thesis

ISBN

978-91-7549-243-8

Number of supporting papers

4

Language

  • eng

Original publication date

2013-08-22

Author name in thesis

Hultin Rosenberg, Lina

Original department name

Department of Oncology-Pathology

Place of publication

Stockholm
