
CSE Doctoral Student Seminar: Arghya Datta and Muhan Zhang

Apr 13
12:30 p.m. to 2 p.m.
Lopata Hall, Room 101

"Significance of mining Electronic Health Care Records"

Arghya Datta
Adviser: Joshua Swamidass

The continuously rising cost of the US health-care system has received significant attention. Central to the ideas aimed at curbing this trend is the use of technology, particularly machine learning, to mine electronic health-care records. An electronic health-care record (EHR) is a digital version of a patient’s paper chart. EHRs are real-time, patient-centered records that make information available instantly and securely to authorized users. While an EHR does contain patients’ medical and treatment histories, an EHR system is built to go beyond the standard clinical data collected in a provider’s office and can include a broader view of a patient’s care. EHRs contain a patient’s medical history, diagnoses, medications, surgical procedures, treatment plans, immunization dates, allergies, radiology images, and laboratory and test results. In this talk, I will discuss why mining EHRs is significant, why deep learning is essential for EHR mining, which clinical questions can be answered, and the significant impact it can have on the health-care industry. I will further discuss our model for analyzing a life-threatening disease called venous thromboembolism (VTE), various factors that might lead to a higher risk of VTE, and whether any drugs prescribed to patients reduce the risk of VTE.

"Beyond Link Prediction: Predicting Hyperlinks in Adjacency Space"

Muhan Zhang
Adviser: Yixin Chen

This paper addresses the hyperlink prediction problem in hypernetworks. Unlike the traditional link prediction problem, where only pairwise relations are considered as links, our task here is to predict the linkage of multiple nodes, i.e., a hyperlink. Each hyperlink is a set of an arbitrary number of nodes which together form a multiway relationship. Hyperlink prediction is challenging: since the cardinality of a hyperlink is variable, existing classifiers based on a fixed number of input features become infeasible. Heuristic methods, such as common neighbors and the Katz index, do not work for hyperlink prediction, since they are restricted to pairwise similarities. In this paper, we formally define the hyperlink prediction problem and propose a new algorithm called Coordinated Matrix Minimization (CMM), which alternately performs nonnegative matrix factorization and least squares matching in the vertex adjacency space of the hypernetwork, in order to infer a subset of candidate hyperlinks that are most suitable to fill the training hypernetwork. We evaluate CMM on two novel tasks: predicting recipes of Chinese food, and finding missing reactions of metabolic networks. Experimental results demonstrate the superior performance of our method over many seemingly promising baselines.
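
To make "vertex adjacency space" concrete, here is a minimal sketch that is not taken from the talk. It assumes a hypernetwork is stored as a node-by-hyperlink incidence matrix S and that the adjacency-space view is the product S Sᵀ; the function names and the toy hyperlinks are hypothetical. It illustrates only the representation, in which hyperlinks of any cardinality map to the same fixed-size node-by-node matrix, and does not implement CMM itself.

```python
import numpy as np


def incidence_matrix(hyperlinks, num_nodes):
    """Build a node-by-hyperlink 0/1 matrix (hypothetical helper).

    Column j has a 1 in every row whose node belongs to hyperlink j,
    so columns may contain any number of ones.
    """
    S = np.zeros((num_nodes, len(hyperlinks)), dtype=int)
    for col, link in enumerate(hyperlinks):
        S[list(link), col] = 1
    return S


def adjacency_projection(S):
    """Project hyperlinks into a fixed-size node-by-node matrix.

    Entry (i, j) counts the hyperlinks that contain both node i and node j;
    the diagonal holds each node's hyperlink count. Treating this product as
    the adjacency space is an assumption of this sketch.
    """
    return S @ S.T


# Toy hypernetwork: three hyperlinks with 3, 2, and 4 nodes respectively.
hyperlinks = [{0, 1, 2}, {1, 3}, {0, 2, 3, 4}]
S = incidence_matrix(hyperlinks, num_nodes=5)
A = adjacency_projection(S)
```

Whatever the size of each hyperlink, A stays 5 × 5, which is what lets matrix methods such as nonnegative matrix factorization operate on it.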