Determination of transcription factor binding specificities
The term "genetic code" refers to the way in which the information encoded in nucleic acids is converted into the amino-acid sequence of proteins. There is, however, also a second genetic code, one that cells use to read the blueprint of the entire organism. This second genetic code is composed of gene regulatory information that specifies how much of a gene product should be made, and when and where. This information is read primarily by sequence-specific DNA-binding proteins called transcription factors (TFs). TFs recognize and bind short DNA sequences located in regions of DNA that are either immediately adjacent to or relatively close to their target genes. When bound to these sites, TFs directly regulate transcription rates by recruiting the general transcription machinery or by inhibiting its recruitment. Alternatively, TFs can influence transcription rates indirectly by recruiting proteins that change the local chromatin environment in a way that promotes or inhibits transcription.
Each TF has its own target specificity: it binds to a range of similar sequences that can be ranked by their relative binding strengths. A major gap in our understanding of life is the lack of knowledge of TF DNA-binding specificities. While we have good estimates of the total number of TFs and their general types, we do not yet understand how the gene regulatory instructions are encoded in the genome. To approach this important question, we first need to know which DNA sequences TFs bind and how strongly.
The aim of this thesis project was to develop efficient methods for characterizing TF binding specificities and then to use these methods to catalogue the DNA-binding specificities of as many human TFs as possible.
In Study I, we converted the classical Systematic Evolution of Ligands by Exponential Enrichment (SELEX) assay into a high-throughput-compatible method (HT-SELEX) and showcased the method by analyzing the DNA-binding specificities of 18 TFs representing 14 structural classes. Some of the results were validated against in vivo data from chromatin immunoprecipitation assays.
In Study II, we used HT-SELEX to analyze the binding specificities of clones representing almost all human TFs, generating a dataset of high-resolution DNA-binding specificity models for more mammalian TFs than in the entire previously published literature combined. Another major feature of our dataset is its high consistency, which was achieved by performing all of the experiments in parallel with the same method.
In Study III, we studied the evolution of gene regulation by analyzing the DNA-binding specificities of TFs from the fruit fly Drosophila melanogaster. The analysis showed that even though the common ancestor of humans and insects lived over 600 million years ago, TF binding specificities were highly conserved between these species, and almost every TF in one species had a similar counterpart in the other.
List of scientific papers
I. Jolma A, Kivioja T, Toivonen J, Cheng L, Wei G, Enge M, Taipale M, Vaquerizas JM, Yan J, Sillanpää MJ, Bonke M, Palin K, Talukder S, Hughes TR, Luscombe NM, Ukkonen E, and Taipale J. Multiplexed massively parallel SELEX for characterization of human transcription factor binding specificities. Genome Res. 2010 Jun; 20(6): 861-73.
https://doi.org/10.1101/gr.100552.109
II. Jolma A, Yan J, Whitington T, Toivonen J, Nitta KR, Rastas P, Morgunova E, Enge M, Taipale M, Palin K, Vaquerizas JM, Vincentelli R, Luscombe NM, Hughes TR, Lemaire P, Ukkonen E, Kivioja T and Taipale J. DNA-binding specificities of human transcription factors. Cell. 2013 Jan 17; 152(1-2): 327-39.
https://doi.org/10.1016/j.cell.2012.12.009
III. Nitta KR, Jolma A, Yin Y, Morgunova E, Kivioja T, Akhtar J, Hens K, Toivonen J, Deplancke B, Furlong EEM, Taipale J. Conservation of transcription factor binding specificities across 600 million years of bilateria evolution. eLife. 2015 Mar 17;4.
https://doi.org/10.7554/eLife.04837
History
Defence date
2015-12-11
Department
- Department of Medicine, Huddinge
Publisher/Institution
Karolinska Institutet
Main supervisor
Taipale, Minna
Publication year
2015
Thesis type
- Doctoral thesis
ISBN
978-91-7676-122-9
Number of supporting papers
3
Language
- eng