Karolinska Institutet

Cancer proteogenomics : connecting genotype to molecular phenotype

thesis
posted on 2024-09-02, 23:40 authored by Ioannis Siavelis

The central dogma of molecular biology describes the one-way road from DNA to RNA and finally to protein. Yet how this flow of information, encoded in DNA as genes (genotype), is regulated to produce the observable traits of an individual (phenotype) remains an open question. Recent advances in high-throughput data, i.e., ‘omics’, have allowed the quantification of DNA, RNA and protein levels, leading to integrative analyses that essentially probe the central dogma along all of its constituent molecules. Evidence from these analyses suggests that mRNA abundances are at best a moderate proxy for proteins, which are the main functional units of cells and thus closer to the phenotype.

Cancer proteogenomic studies consider the ensemble of proteins, the so-called proteome, as the readout of the functional molecular phenotype and investigate how it is influenced by upstream events, for example DNA copy number alterations. In typical proteogenomic studies, however, the identified proteome is a simplification of its actual composition, because such studies methodologically disregard events such as splicing, proteolytic cleavage and post-translational modifications that generate unique protein species, known as proteoforms.

The scope of this thesis is to study the proteome diversity in terms of: a) the complex genetic background of three tumor types, i.e., breast cancer, childhood acute lymphoblastic leukemia and lung cancer, and b) the proteoform composition, describing a computational method for detecting protein species based on their distinct quantitative profiles. In Paper I, we present a proteogenomic landscape of 45 breast cancer samples representative of the five PAM50 intrinsic subtypes. We studied the effect of copy number alterations (CNA) on mRNA and protein levels, overlaying a public dataset of drug-perturbed protein degradation.

In Paper II, we describe a proteogenomic analysis of 27 B-cell precursor acute lymphoblastic leukemia clinical samples that compares high hyperdiploid versus ETV6/RUNX1-positive cases. We examined the impact of the amplified chromosomes on mRNA and protein abundance, specifically the linear trend between the amplification level and the dosage effect. Moreover, we investigated mRNA-protein quantitative discrepancies with regard to post-transcriptional and post-translational effects such as mRNA/protein stability and miRNA targeting.

In Paper III, we describe a proteogenomic cohort of 141 non-small cell lung cancer clinical samples. We used clustering methods to identify six distinct proteome-based subtypes. We integrated the protein abundances in pathways using protein-protein correlation networks, bioinformatically deconvoluted the immune composition and characterized the neoantigen burden.

In Paper IV, we developed a pipeline for proteoform detection from bottom-up mass spectrometry-based proteomics. Using an in-depth proteomics dataset of 18 cancer cell lines, we identified proteoforms related to splice variant peptides supported by RNA-seq data.

This thesis adds to the existing literature of proteogenomic studies by analyzing the tumor proteome and its regulation along the flow of the central dogma of molecular biology. It is anticipated that some of these findings will lead to novel insights into tumor biology and set the stage for clinical applications that improve current cancer patient care.

List of scientific papers

I. Johansson, H.J., Socciarelli, F., Vacanti, N.M., Haugen, M.H., Zhu, Y., Siavelis, I., Fernandez-Woodbridge, A., Aure, M.R., Sennblad, B., Vesterlund, M., Branca, R.M., Orre, L.M., Huss, M., Fredlund, E., Beraki, E., Garred, Ø., Boekel, J., Sauer, T., Zhao, W., Nord, S., Höglander, E.K., Jans, D.C., Brismar, H., Haukaas, T.H., Bathen, T.F., Schlichting, E., Naume, B., Luders, T., Borgen, E., Kristensen, V.N., Russnes, H.G., Lingjærde, O.C., Mills, G.B., Sahlberg, K.K., Børresen-Dale, A.-L., Lehtiö, J., 2019. Breast cancer quantitative proteome and proteogenomic landscape. Nature Communications. 10, 1600.
https://doi.org/10.1038/s41467-019-09018-y

II. Yang, M., Vesterlund, M., Siavelis, I., Moura-Castro, L.H., Castor, A., Fioretos, T., Jafari, R., Lilljebjörn, H., Odom, D.T., Olsson, L., Ravi, N., Woodward, E.L., Harewood, L., Lehtiö, J., Paulsson, K., 2019. Proteogenomics and Hi-C reveal transcriptional dysregulation in high hyperdiploid childhood acute lymphoblastic leukemia. Nature Communications. 10, 1519.
https://doi.org/10.1038/s41467-019-09469-3

III. Lehtiö, J., Arslan, T., Siavelis, I., Pan, Y., Socciarelli, F., Berkovska, O., Umer, H.M., Mermelekas, G., Pirmoradian, M., Jönsson, M., Brunnström, H., Brustugun, O.T., Purohit, K.P., Cunningham, R., Asl, H.F., Isaksson, S., Arbajian, E., Aine, M., Karlsson, A., Kotevska, M., Hansen, C.G., Haakensen, V.D., Helland, Å., Tamborero, D., Johansson, H.J., Branca, R.M., Planck, M., Staaf, J., Orre, L.M., 2021. Proteogenomics of non-small cell lung cancer reveals molecular subtypes associated with specific therapeutic targets and immune-evasion mechanisms. Nature Cancer. 2, 1224–1242.
https://doi.org/10.1038/s43018-021-00259-9

IV. Siavelis, I., Johansson, H.J., Stahl, M., Socciarelli, F., Mermelekas, G., Jafari, R., Lehtiö, J. DEpMS: Differential Expression analysis of proteoforms from Mass Spectrometry-based bottom-up proteomics. [Manuscript]

History

Defence date

2022-12-16

Department

  • Department of Oncology-Pathology

Publisher/Institution

Karolinska Institutet

Main supervisor

Lehtiö, Janne

Co-supervisors

Johansson, Henrik

Publication year

2022

Thesis type

  • Doctoral thesis

ISBN

978-91-8016-873-1

Number of supporting papers

4

Language

  • eng

Original publication date

2022-11-24

Author name in thesis

Siavelis, Ioannis

Original department name

Department of Oncology-Pathology

Place of publication

Stockholm
