Karolinska Institutet
File(s) not publicly available

Analyses of genomic and gene expression signatures

thesis
posted on 2024-09-02, 20:48 authored by Rickard Sandberg

Biology has entered a challenging, information-intense period in which computational experiments complement traditional experiments. A plethora of new techniques has allowed biological processes to be investigated on a global scale. Data analysis has become non-trivial, but it is crucial for drawing appropriate conclusions from these experiments. This thesis combines molecular biology and computational techniques, with a focus on the latter, to investigate gene expression profiles and genome signatures.

The technological breakthrough of high-density oligonucleotide arrays and cDNA microarrays has enabled the parallel monitoring of the expression levels of thousands of mRNA transcripts. Using high-density oligonucleotide arrays to profile mRNA from different regions of the adult mouse brain, we identified both region-specific and strain-specific (129SvEv and C57bl/6) gene expression differences. The genes with strain-specific differential expression are candidates for involvement in the behavioural differences between 129SvEv and C57bl/6. In two reference gene expression studies, we first compared the gene expression profiles of cell lines with those of their corresponding normal and tumor tissues.

We estimated the degree of differential expression between cell lines and tissues, and the expression of tissue-specific genes in cell lines. Second, we developed a method to measure tumor and tissue characteristic gene expression in individual cell lines. In pharmaceutical screening programs and in experimental research, where cell lines are used as model systems, the proposed Tissue Similarity Index can be an important tool for selecting the most appropriate cell lines.

Each prokaryote genome has a species-specific bias in the occurrence of short nucleotide motifs, known as its genomic signature. We demonstrated that genomic signatures were detectable in short DNA sequences and designed a naive Bayesian classifier that identified the correct species origin of DNA sequences based on the genomic signature representation.

The classification of DNA sequences was applied to the identification of horizontal gene transfer events. Further, the species-specificity of other sequence biases, such as codon bias, G+C content, and amino acid bias, was quantified in relation to the genomic signatures.
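As an illustration of the general idea only, the sketch below shows how a naive Bayesian classifier over short nucleotide motif counts might be set up. It is not the implementation used in the thesis: the motif length k, the pseudocount, the uniform prior over species, and all function names (kmer_counts, train, classify) are assumptions chosen for this example, and the two "genomes" are toy strings rather than real data.

```python
import math
from collections import Counter
from itertools import product

K = 2  # motif length; an assumed value for illustration only


def kmer_counts(seq, k=K):
    """Count overlapping k-mers, skipping any that contain non-ACGT symbols."""
    seq = seq.upper()
    valid = set("ACGT")
    return Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid
    )


def train(genomes, k=K, pseudocount=1.0):
    """Estimate per-species log-probabilities of each k-mer.

    genomes maps a species label to its sequence. The pseudocount
    (an assumption here) keeps unseen motifs from getting zero probability.
    """
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    model = {}
    for name, seq in genomes.items():
        counts = kmer_counts(seq, k)
        total = sum(counts.values()) + pseudocount * len(all_kmers)
        model[name] = {
            m: math.log((counts[m] + pseudocount) / total) for m in all_kmers
        }
    return model


def classify(model, fragment, k=K):
    """Return the species with the highest log-likelihood for the fragment.

    A uniform prior over species is assumed, so the prior term is omitted.
    The naive independence assumption treats each k-mer as independent.
    """
    counts = kmer_counts(fragment, k)
    scores = {
        name: sum(n * logp[m] for m, n in counts.items())
        for name, logp in model.items()
    }
    return max(scores, key=scores.get)


# Toy usage with artificial sequences (not real genomes).
toy_model = train({"at_rich": "AT" * 200, "gc_rich": "GC" * 200})
print(classify(toy_model, "ATATATATAT"))  # prints at_rich
```

In this kind of setup, a fragment whose best-scoring species differs from the genome it sits in would be one possible flag for a horizontal gene transfer candidate, although the decision rule used in the thesis is not stated in this summary.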

List of scientific papers

I. Sandberg R, Yasuda R, Pankratz DG, Carter TA, Del Rio JA, Wodicka L, Mayford M, Lockhart DJ, Barlow C (2000). Regional and strain-specific gene expression mapping in the adult mouse brain. Proc Natl Acad Sci U S A. 97(20): 11038-43.
https://pubmed.ncbi.nlm.nih.gov/11005875

II. Sandberg R, Ernberg I (2004). The transcriptome of cell lines compared to tissues reveals origin-independent differences in every third gene. [Submitted]

III. Sandberg R, Ernberg I (2004). Measuring tumor characteristic gene expression in cell lines using a Tissue Similarity Index (TSI). [Submitted]

IV. Sandberg R, Winberg G, Branden CI, Kaske A, Ernberg I, Coster J (2001). Capturing whole-genome characteristics in short sequences using a naive Bayesian classifier. Genome Res. 11(8): 1404-9.
https://pubmed.ncbi.nlm.nih.gov/11483581

V. Sandberg R, Branden CI, Ernberg I, Coster J (2003). Quantifying the species-specificity in genomic signatures, synonymous codon choice, amino acid usage and G+C content. Gene. 311: 35-42.
https://pubmed.ncbi.nlm.nih.gov/12853136

History

Defence date

2004-09-10

Department

  • Department of Microbiology, Tumor and Cell Biology

Publication year

2004

Thesis type

  • Doctoral thesis

ISBN-10

91-7140-015-x

Number of supporting papers

5

Language

  • eng

Original publication date

2004-08-20

Author name in thesis

Sandberg, Rickard

Original department name

Microbiology and Tumor Biology Center (MTC)

Place of publication

Stockholm
