CSE Doctoral Student Seminar: Yiming Kang and Junjie Liu

Sep 16, 2016
12:30 p.m.
2:00 p.m.
Lopata Hall, Room 101

Yiming Kang
Advisor: Michael Brent

"NetProphet 2.0: a Robust, Scalable Approach to Map Transcription Factor Network"

Cells process information, in part, through transcription factor (TF) networks that tune the rates at which individual genes produce their products. Determining which TFs regulate each gene in a genome (TF network mapping) is an important problem in computational biology. Current TF network mapping procedures rely heavily on data about the locations at which each TF binds to the genome. However, the experiments required to obtain this location data are labor intensive, expensive, and unpredictable. In contrast, experiments that measure the output of the TF network (the amount of product made by each gene) are more reliable and cost-effective. The Brent Lab previously published NetProphet 1.0, a state-of-the-art TF network mapping algorithm that requires only output data. Here, we propose NetProphet 2.0, which incorporates additional machine learning modules into the existing pipeline by making use of genome sequencing information. The ensemble of computational modules consistently improved mapping quality on two model species (yeast and fruit fly). This work represents a significant step towards making TF network mapping a reliable, semi-automated task that can be applied to any species at a reasonable cost.

Junjie Liu
Advisor: Roch Guérin

"Scale Down: Improving Job Response Time with Virtualization"

The scalability problem for Web services is to serve an increasing job load while maintaining a reasonable response time. Existing solutions either increase the resources of a single server (Scale Up) or increase the number of servers (Scale Out). In this talk, we explore another potential direction: without increasing the resources, we simply split the current resources into multiple virtual servers (Scale Down). Our results show that when the job service time distribution has a high coefficient of variation (CoV) and the system runs at a reasonable level of utilization, Scale Down greatly improves the average job response time.