Feb 10, 2017
Lopata Hall, Room 101
From Geometric Matrix Factorization to Anchor-Free Topic Mining
Xiao Fu
Department of Electrical and Computer Engineering
University of Minnesota, Minneapolis
Nonnegative matrix factorization (NMF) is one of the workhorses of data analytics. While NMF is widely used, it does not generally offer identifiability guarantees for the low-rank factor matrices, except under rather restrictive conditions. Factor identifiability is a crucial consideration in many applications; e.g., in topic mining, it ensures that the prominent topics in a document corpus can be recovered uniquely. A widely adopted assumption that helps establish NMF identifiability is the so-called separability condition, which, in topic mining, translates to the existence of characteristic words for each topic -- i.e., words that do not appear in any other topic; these are called anchor words. Relying on separability/anchor words makes factorization/topic mining algorithms sensitive to violations of the model assumptions. In this talk, I will introduce a matrix factorization technique that guarantees factor identifiability under mild and realistic conditions. We show that, by minimizing the volume of a latent factor matrix when decomposing the data matrix, identifiability can be ensured even when the separability assumption is grossly violated. Volume minimization has been employed as a heuristic for improving NMF performance since the 1990s, but a precise identifiability-enabling condition proved elusive for more than 20 years, until our work filled this gap. Based on the insights gained from volume minimization, I will also introduce an anchor-free topic mining algorithmic framework, which exhibits favorable performance relative to existing anchor-based algorithms on a number of real datasets.
Xiao Fu received his Ph.D. degree in Electronic Engineering from The Chinese University of Hong Kong (CUHK), Shatin, N.T., Hong Kong. He is currently a Postdoctoral Associate in the Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, United States. His research interests include the broad area of data and signal analytics, with a recent emphasis on tensor and matrix factorization and applications.
Dr. Fu was an awardee of the Overseas Research Attachment Programme (ORAP) 2013 of the Engineering Faculty, CUHK, which sponsored his visit to the Department of Electrical and Computer Engineering, University of Minnesota, from September 2013 to February 2014. He received a Best Student Paper Award at ICASSP 2014, and coauthored a paper that received a Best Student Paper Award at IEEE CAMSAP 2015. In 2016, he was recognized as one of two Outstanding Postdoctoral Scholars by the University of Minnesota. Dr. Fu is currently co-PI on a National Science Foundation (NSF) grant, a Digital Technology Initiatives Grant from the University of Minnesota, and a collaborative research project with Huawei Inc.