Oct 27, 2017
Lopata Hall, Room 101
"Predicting Gene Responsiveness From Regulator Binding Signals by Learning Models"
Adviser: Michael Brent
In a cell, regulatory proteins called transcription factors (TFs) change the production rates of their target genes. Researchers have conducted two main types of experiments to determine such TF-target relationships: one determines which genes a TF binds, while the other determines which genes respond when the TF is perturbed. However, the community has disregarded the fact that only a very small fraction of bound genes (~3-5%) are responsive when the TF is deleted. Recently, we bridged two emerging technologies, one measuring binding and the other measuring responsiveness, and showed much better agreement than existing methods had to offer. Other resources, such as histone marks, encode the genome state in the region of each gene. Here, we hypothesize that TF binding signals in combination with histone marks are predictive of the target's responsiveness. We propose two machine learning models, a random forest (RF) and a convolutional neural network (CNN), to tackle the prediction task. We show that both models are suitable for predicting the responsive targets given the binding signals only. Specifically, the CNN model performs better than the RF model by deciphering the locality of binding signals on the DNA sequence. Furthermore, a model that incorporates the genome state near the TF improves the prediction accuracy.
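As a rough illustration of what "locality of binding signals" can mean for a model, the sketch below bins a gene's binding signal by position and scans the bins with a one-dimensional convolution filter, the basic operation of a CNN layer. It is not the method presented in the talk: the function names, window boundaries, bin count, filter weights, and toy data are all assumptions made for this example.

```python
# Hypothetical sketch: positional binning of binding signal plus a 1D
# convolution scan. All names and numbers here are illustrative assumptions.

def bin_signal(positions, values, start, end, n_bins):
    """Sum binding signal into n_bins equal-width bins over [start, end)."""
    width = (end - start) / n_bins
    bins = [0.0] * n_bins
    for pos, val in zip(positions, values):
        if start <= pos < end:
            # Clamp the index to guard against floating-point edge cases.
            idx = min(int((pos - start) / width), n_bins - 1)
            bins[idx] += val
    return bins


def conv1d_valid(signal, kernel):
    """Unpadded 1D cross-correlation, as computed by a CNN layer."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


# Toy data (made up): positions are relative to the transcription start site.
positions = [-900, -450, -420, -100, 50]
values = [1.0, 3.0, 2.0, 0.5, 0.2]

# Fixed-length feature vector: the kind of input a random forest could use.
features = bin_signal(positions, values, start=-1000, end=200, n_bins=6)

# A two-bin filter responds most strongly where signal is concentrated.
response = conv1d_valid(features, kernel=[1.0, 1.0])
peak_bin = max(range(len(response)), key=response.__getitem__)
```

A random forest would treat each bin as an independent feature, whereas the sliding filter reuses the same weights at every position and so reacts to where along the sequence the signal clusters. In a trained CNN the filter weights would be learned rather than fixed as they are here.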