Analyses of genomic and gene expression signatures
Biology has entered a challenging, information-intense period in which computational experiments complement traditional experiments. A plethora of new techniques have allowed biological processes to be investigated on a global scale. Data analysis has become non-trivial, but it is crucial for drawing appropriate conclusions from these experiments. This thesis combines molecular biology and computational techniques, with a focus on the latter, to investigate gene expression profiles and genome signatures.
The technological breakthrough of high-density oligonucleotide arrays and cDNA microarrays has enabled the parallel monitoring of the expression levels of thousands of mRNA transcripts. Using high-density oligonucleotide arrays to profile mRNA from different regions of the adult mouse brain, we identified both region-specific and strain-specific (129SvEv and C57bl/6) gene expression differences. The genes with strain-specific differential expression are candidates for involvement in the behavioural differences between 129SvEv and C57bl/6 mice. In two reference gene expression studies, we first compared the gene expression profiles of cell lines with those of their corresponding normal and tumor tissues.
We estimated the degree of differential expression between cell lines and tissues, and the expression of tissue-specific genes in cell lines. We also developed a method to measure tumor- and tissue-characteristic gene expression in individual cell lines. When cell lines are used as model systems in pharmaceutical screening programs and in experimental research, the proposed Tissue Similarity Index can be an important tool for selecting the most appropriate cell lines. Each prokaryote genome has a species-specific bias in the occurrence of short nucleotide motifs, known as its genomic signature. We demonstrated that genomic signatures were detectable in short DNA sequences and designed a naive Bayesian classifier that identified the correct species origin of DNA sequences based on the genomic signature representation.
The classification of DNA sequences was applied to the identification of horizontal gene transfer events. Further, the species-specificity of other sequence biases, such as codon bias, G+C content, and amino acid bias, was quantified in relation to the genomic signatures.
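To make the idea of classifying short sequences by genomic signature concrete, the following is a minimal sketch of a naive Bayesian classifier over short nucleotide motifs (k-mers). It is an illustration only and is not taken from the thesis: the function names, the motif length `k`, the add-one pseudocount, the uniform prior over species, and the toy "genomes" are all assumptions made for this example.

```python
import math
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count overlapping k-mers in a DNA sequence, skipping any with non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def train_signature(genome, k, pseudocount=1.0):
    """Estimate log-probabilities of every possible k-mer from one genome.

    The pseudocount (an assumption of this sketch) keeps unseen k-mers
    from receiving zero probability.
    """
    counts = kmer_counts(genome, k)
    total = sum(counts.values()) + pseudocount * 4 ** k
    return {
        "".join(p): math.log((counts["".join(p)] + pseudocount) / total)
        for p in product("ACGT", repeat=k)
    }


def classify(fragment, signatures, k):
    """Return the best-scoring species and all log-likelihood scores.

    Assumes a uniform prior over species and treats the k-mers of the
    fragment as independent (the "naive" assumption).
    """
    frag = kmer_counts(fragment, k)
    scores = {
        species: sum(n * logp[kmer] for kmer, n in frag.items())
        for species, logp in signatures.items()
    }
    return max(scores, key=scores.get), scores


# Toy, made-up "genomes" used only to exercise the code.
K = 2
toy_genomes = {
    "at_rich": "ATTATAATATTTAAAT" * 20,
    "gc_rich": "GCCGCGGGCCGCGCGC" * 20,
}
toy_signatures = {name: train_signature(g, K) for name, g in toy_genomes.items()}
best_species, all_scores = classify("TATAATTA", toy_signatures, K)
```

Under these assumptions, a fragment whose best-scoring species differs from that of its host genome would be a candidate for closer inspection as a possible horizontal transfer. Real genomes would call for longer motifs and considerably more training sequence than this toy data.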
List of scientific papers
I. Sandberg R, Yasuda R, Pankratz DG, Carter TA, Del Rio JA, Wodicka L, Mayford M, Lockhart DJ, Barlow C (2000). Regional and strain-specific gene expression mapping in the adult mouse brain. Proc Natl Acad Sci U S A. 97(20): 11038-43
https://pubmed.ncbi.nlm.nih.gov/11005875
II. Sandberg R, Ernberg I (2004). The transcriptome of cell lines compared to tissues reveals origin-independent differences in every third gene. [Submitted]
III. Sandberg R, Ernberg I (2004). Measuring tumor characteristic gene expression in cell lines using a Tissue Similarity Index (TSI). [Submitted]
IV. Sandberg R, Winberg G, Branden CI, Kaske A, Ernberg I, Coster J (2001). Capturing whole-genome characteristics in short sequences using a naive Bayesian classifier. Genome Res. 11(8): 1404-9.
https://pubmed.ncbi.nlm.nih.gov/11483581
V. Sandberg R, Branden CI, Ernberg I, Coster J (2003). Quantifying the species-specificity in genomic signatures, synonymous codon choice, amino acid usage and G+C content. Gene. 311: 35-42.
https://pubmed.ncbi.nlm.nih.gov/12853136
History
Defence date
- 2004-09-10
Department
- Department of Microbiology, Tumor and Cell Biology
Publication year
- 2004
Thesis type
- Doctoral thesis
ISBN-10
- 91-7140-015-x
Number of supporting papers
- 5
Language
- eng