Doctoral Student Seminar: Chen Xu and Rajagopal Venkatesaramani

Feb 28
12:30 PM
Lopata 101

Rajagopal Venkatesaramani

Title: Reidentification in Genomic Databases with Publicly Available Facial Images

Abstract: In this study, we evaluate the risk of reidentifying an individual in a genomic database containing single-nucleotide polymorphisms (SNPs) based on their face images. Previous studies have demonstrated that it is possible to reconstruct faces and predict other phenotypic information from an individual's DNA, although the effectiveness of such approaches may have been inflated by the availability of high-resolution 3D images and a small, select population. In contrast, we study how effective such approaches are "in the wild" using publicly shared data on OpenSNP as well as other public resources, such as Google. We study how the effectiveness of such identification approaches is influenced by population heterogeneity, as well as how we can significantly increase privacy by minimally limiting disclosed genomic information. Our methods first link publicly available face images and OpenSNP data. We then use machine learning techniques to derive specific phenotypes from face images and probabilistically link them to genomes to assess reidentification risk. Finally, we investigate adversarial machine learning techniques as potential defenses against reidentification.

Chen Xu

Title: Implementing responsive and low-overhead preemption in the Cilk runtime system

Abstract: Modern parallel platforms, such as clouds or servers, are often shared among multiple parallel jobs. There has been a lot of theoretical work on designing scheduling policies for multiple parallel jobs on such platforms. Most of those scheduling policies need the runtime system to dynamically allocate cores among parallel jobs while they are executing. However, most parallel runtime systems, such as Cilk, are designed to run a single parallel job, and re-allocating cores among parallel jobs in those runtime systems is extremely inefficient. In this work, we present a multi-programmed Cilk runtime system that supports running multiple parallel jobs. Features of this runtime system include a customizable scheduler built on a scheduling interface that lets system designers implement their own scheduling policies, and a client-server architecture that allows users to submit jobs to the runtime system dynamically. One important feature of the multi-programmed Cilk runtime system is its responsive, low-overhead preemption mechanism, which allows fast re-allocation of cores among parallel jobs. This talk will focus on how we implemented this preemption mechanism. We tested the efficiency of our implementation and observed low overhead. We also applied the runtime system to two multi-programmed scenarios and observed improvements in the scenario-specific criteria for both.