AI Seminar: Understanding Multiview and Self-Supervised Representation Learning: A Nonlinear Mixture Identification Perspective

Event Speaker
Xiao Fu
Assistant Professor, School of Electrical Engineering and Computer Science, Oregon State University
Event Type
Artificial Intelligence
Date
Event Location
KEC 1001
Event Description

Central to representation learning is the goal of succinctly representing high-dimensional data using the “essential information” while discarding the “redundant information”. Properly formulating and approaching this objective is critical to guarding against overfitting, and can also benefit many important tasks such as domain adaptation and transfer learning. This talk aims to deepen the understanding of representation learning and to use the gained insights to develop a new learning method. In particular, attention will be paid to two representation learning paradigms that use multiple views of data, as both naturally acquired (e.g., image and audio) and artificially produced (e.g., via adding different noise to data samples) multiview data have empirically proven useful in producing vector representations that reflect the essential information. Natural views are often handled by multiview analysis tools, e.g., (deep) canonical correlation analysis [(D)CCA], while artificial ones are frequently used in self-supervised learning (SSL) paradigms, e.g., BYOL and Barlow Twins. However, the effectiveness of these methods is mostly validated empirically, and more insights and theoretical underpinnings remain to be discovered.

In this talk, an intuitive generative model of multiview data is adopted, where the views are different nonlinear mixtures of shared and private components. Since the shared components are view/distortion-invariant, they may serve to represent the essential information of the data in a non-redundant way. Under this model, a key module used in a suite of DCCA and SSL paradigms, namely latent correlation maximization, is shown to guarantee the extraction of the shared components across views (up to certain ambiguities). It is further shown that the private information in each view can be provably disentangled from the shared information using proper regularization design, which can facilitate tasks such as cross-view translation and data generation. A finite-sample analysis, which has been rare in the study of nonlinear mixture identifiability, is also presented. The theoretical results and the newly designed regularization are tested on a series of tasks.
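
As a rough illustration of the setting described above (not the speaker's implementation), the following Python sketch generates two views as different nonlinear mixtures of shared and private components, and scores how correlated two candidate representations are using the sum of their canonical correlations. The function names, dimensions, Gaussian latents, and the particular mixing function are all assumptions made for this example.

```python
import numpy as np

def make_views(n=2000, d_shared=2, d_private=1, seed=0):
    """Hypothetical toy generator: each view is a nonlinear mixture of
    shared components z and view-specific private components c1 / c2."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, d_shared))    # shared across both views
    c1 = rng.standard_normal((n, d_private))  # private to view 1
    c2 = rng.standard_normal((n, d_private))  # private to view 2
    d = d_shared + d_private

    def mix(latent):
        # Assumed mixing: random linear map followed by an elementwise,
        # strictly increasing nonlinearity (a different map per call).
        a = rng.standard_normal((d, d))
        h = latent @ a
        return h + 0.5 * np.tanh(h)

    x1 = mix(np.hstack([z, c1]))
    x2 = mix(np.hstack([z, c2]))
    return z, c1, c2, x1, x2

def latent_correlation(f1, f2):
    """Sum of canonical correlations between two representation matrices
    (rows are samples), computed via QR factorization and an SVD."""
    f1 = f1 - f1.mean(axis=0)
    f2 = f2 - f2.mean(axis=0)
    q1, _ = np.linalg.qr(f1)
    q2, _ = np.linalg.qr(f2)
    return np.linalg.svd(q1.T @ q2, compute_uv=False).sum()

z, c1, c2, x1, x2 = make_views()
# An invertible affine copy of the shared components is maximally correlated
# with them, whereas the two private parts are nearly uncorrelated.
print(latent_correlation(z, 2.0 * z + 1.0))
print(latent_correlation(c1, c2))
```

In this toy setup the score only measures linear correlation between given representations; learning encoders that maximize such a score from `x1` and `x2` is the part the talk analyzes.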

Speaker Biography

Xiao Fu received the Ph.D. degree in Electronic Engineering from The Chinese University of Hong Kong (CUHK), Shatin, N.T., Hong Kong, in 2014. He was a Postdoctoral Associate with the Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN, USA, from 2014 to 2017. Since 2017, he has been an Assistant Professor with the School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA. His research interests include the broad area of signal processing and machine learning.

Dr. Fu received a Best Student Paper Award at ICASSP 2014, and was a recipient of the Outstanding Postdoctoral Scholar Award at the University of Minnesota in 2016. His coauthored papers received Best Student Paper Awards at IEEE CAMSAP 2015 and IEEE MLSP 2019. He received the National Science Foundation CAREER Award in 2022. He serves as a member of the Sensor Array and Multichannel Technical Committee (SAM-TC) of the IEEE Signal Processing Society (SPS). He is also a member of the Signal Processing for Multisensor Systems Technical Area Committee (SPMuS-TAC) of EURASIP. He is the Treasurer of the IEEE SPS Oregon Chapter. He serves as an Editor of Signal Processing and an Associate Editor of IEEE Transactions on Signal Processing. He was a tutorial speaker at ICASSP 2017 and the SIAM Conference on Applied Linear Algebra 2021.