Lopata Hall, Room 101
Pattern-Based Mining of Entity/Relation Structures from Massive Text
The majority of information nowadays is carried by massive, unstructured text in the form of news, articles, reports, or social media messages. This poses a major research challenge in mining entity/relation structures from unstructured text. Manual curation or labeling cannot scale to match the rapid growth of text. Most existing information extraction approaches rely on heavy human annotation, which can be too expensive to tune and is not adaptable to new domains.
In this talk, I will present a pattern-based methodology that conducts information extraction from massive corpora using existing resources with little human effort. The first component, WW-PIE, discovers meaningful textual patterns that contain the entities of interest. The second component, TruePIE, discovers high-quality textual patterns for target relation types. I will demonstrate how semi-supervised methods can empower information extraction for broad applications and provide explainable results.
Qi Li is currently a postdoctoral researcher and adjunct professor in the Department of Computer Science, University of Illinois at Urbana-Champaign, working with Prof. Jiawei Han. Her research interests lie in the area of data mining, with a focus on the extraction and aggregation of information from multiple data sources. Qi obtained her PhD in Computer Science and Engineering from the State University of New York at Buffalo in 2017, advised by Prof. Jing Gao, and her MS in Statistics from the University of Illinois at Urbana-Champaign in 2012. She has received several awards, including the 2018 Data Mining Research Excellent Award (Bronze) at UIUC, the Presidential Fellowship of University at Buffalo, and the Best CSE Graduate Research Award and the CSE Best Dissertation Award from the Department of Computer Science and Engineering, University at Buffalo. More information can be found at https://publish.illinois.edu/qili5/.