CIKM 2025

Keynote Speakers

Haesun Park

Georgia Institute of Technology

Numerical Linear Algebraic Foundations for Large-Scale Unsupervised Learning

Abstract: Numerical linear algebra provides essential foundations for many large-scale data analytic tasks. This talk illustrates that some of the powerful methods, especially for unsupervised tasks such as clustering, topic modeling, community detection, embedding, and representation learning, can be derived from a framework of low rank approximation (LRA). These include the ubiquitous singular value decomposition (SVD), latent semantic indexing (LSI), principal component analysis (PCA), and constrained LRA (CLRA)-based methods such as nonnegative matrix factorization (NMF) and its variants, including Symmetric NMF (SymNMF) and JointNMF. All of these methods can be explained using one framework, which can then be further generalized into more advanced methods such as co-clustering and co-embedding for more complex situations including multi-view and multi-granularity data sets, and into semi-supervised methods that incorporate prior knowledge. The presented algorithms, which utilize advances in numerical linear algebra, are shown to achieve scalability, efficiency, and effectiveness. Substantial experimental results on synthetic and real-life problems illustrate the significant benefits of exploiting numerical linear algebra-based methods in many data analytic tasks.
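
As a rough illustration of how the methods named in the abstract fit one low rank approximation template, the standard textbook formulations can be sketched as below. The notation is assumed here for illustration and is not taken from the talk: A is an m-by-n data matrix, S is a symmetric n-by-n similarity matrix, and k is the target rank. JointNMF and the other generalizations mentioned in the abstract are not shown.

```latex
\begin{align*}
% Unconstrained LRA: the best rank-k approximation in the Frobenius norm,
% attained by the truncated SVD (Eckart-Young theorem)
&\min_{\operatorname{rank}(B) \le k} \; \lVert A - B \rVert_F,
  && A \in \mathbb{R}^{m \times n} \\
% NMF: the same objective with nonnegativity constraints on both factors
&\min_{W \ge 0,\; H \ge 0} \; \lVert A - W H \rVert_F^2,
  && W \in \mathbb{R}^{m \times k},\; H \in \mathbb{R}^{k \times n} \\
% SymNMF: a symmetric factorization of a similarity matrix
&\min_{H \ge 0} \; \lVert S - H H^{\top} \rVert_F^2,
  && S = S^{\top} \in \mathbb{R}^{n \times n},\; H \in \mathbb{R}^{n \times k}
\end{align*}
```

In each case the inequality constraints are elementwise, and the only differences between the problems are the constraints placed on the factors and whether the two factors are tied together.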

Biography: Dr. Haesun Park is a Stephen Fleming Professor, Regents’ Professor, and Chair of the School of Computational Science and Engineering (CSE), Georgia Institute of Technology, Atlanta, Georgia, U.S.A. She was elected as a SIAM Fellow, IEEE Fellow, and ACM Fellow for her outstanding contributions in numerical computing, data analysis, visual analytics, and leadership in Computational Science and Engineering. She has published extensively in the areas of numerical algorithms, data analytics, visual analytics, text and network analysis, bioinformatics, and parallel computing.

She served on numerous conference committees and advisory boards, as a conference chair, and on the editorial boards of leading journals such as IEEE Transactions on Pattern Analysis and Machine Intelligence, SIAM Journal on Matrix Analysis and Applications, and SIAM Journal on Scientific Computing. She gave plenary keynote lectures at major meetings including the SIAM Conference on Applied Linear Algebra, the SIAM International Conference on Data Mining, the SIAM Conference on Parallel Processing for Scientific Computing, and the US National Academies National Research Council meeting on New Research Directions for the National Geospatial-Intelligence Agency.

She served as the director of the NSF/DHS FODAVA (Foundations of Data and Visual Analytics) Center at Georgia Tech from 2008 to 2014, overseeing development of data and visual analytics of 17 partner groups across U.S. universities. Before joining Georgia Tech, she was a professor in the Department of Computer Science and Engineering, University of Minnesota, Twin Cities, from 1987 to 2005 and a program director at the National Science Foundation, Arlington, VA, U.S.A., from 2003 to 2005. She received a Ph.D. degree and an M.S. degree in Computer Science from Cornell University, Ithaca, NY, and a B.S. degree in Mathematics from Seoul National University, Seoul, Korea, with the Presidential Medal.

Sihem Amer-Yahia

CNRS / Univ. Grenoble Alpes

AI Planning for Data Exploration

Abstract: Data exploration is an incremental process that helps users express what they want through a conversation with the data. Reinforcement Learning (RL) is one of the most notable approaches to automating data exploration, and several solutions have been proposed. With the advent of Large Language Models (LLMs) and their ability to reason sequentially, it has become legitimate to ask the question: would LLMs and, more generally, AI planning outperform a customized RL policy in data exploration? More specifically, would LLMs help circumvent retraining for new tasks and strike a balance between specificity and generality? This talk will attempt to answer this question by reviewing RL training and policy reusability for data exploration.

Biography: Sihem Amer-Yahia is a Silver Medal CNRS Research Director and Deputy Director of the Lab of Informatics of Grenoble. She works on exploratory data analysis and algorithmic upskilling. Prior to that, she was Principal Scientist at QCRI, Senior Scientist at Yahoo! Research, and Member of Technical Staff at AT&T Labs. Sihem served as PC chair for SIGMOD 2023 and as the coordinator of the Diversity, Equity and Inclusion initiative for the database community. In 2024, she received the IEEE TCDE Impact Award, the SIGMOD Contributions Award, and the VLDB Women in Database Award.

Yong-Yeol Ahn

University of Virginia

The Geometry of Knowledge and Computational Discovery

Abstract: Modern neural networks transform vast datasets into continuous embedding spaces, translating semantic relationships into geometric structures. The key to unlocking their full potential lies in making these representations interpretable. This talk presents a simple theory underlying the interpretability of the embedding space and shows how this principle allows us to analyze the data, design new metrics, and model the dynamics of the system, moving beyond black-box models to data-driven, interpretable insights.

Biography: Yong-Yeol (YY) Ahn is a Quantitative Foundation Distinguished Professor at the University of Virginia School of Data Science. He was previously a Professor at the Indiana University School of Informatics, Computing, and Engineering (2011–2025) and a Visiting Professor at MIT (2020–2021). After earning his PhD in Statistical Physics from KAIST in 2008, he worked as a postdoctoral research associate at the Center for Complex Network Research at Northeastern University and as a visiting researcher at the Center for Cancer Systems Biology at Dana-Farber Cancer Institute. His research focuses on data science, spanning methodological work in network science, machine learning, and AI, as well as their applications to computational social science, computational neuroscience and biology, and the science of science. He is a recipient of several awards, including the Microsoft Research Faculty Fellowship and the LinkedIn Economic Graph Challenge.