Accepted Tutorials
Morning (tentative)
Fairness in Large Language Models: A Tutorial (Zichong Wang, Avash Palikhe, Zhipeng Yin and Wenbin Zhang)
Neural Shifts in Collaborative Team Recommendation (Mahdis Saeedi and Hossein Fani)
Continual Recommender Systems (Hyunsik Yoo, Seongku Kang and Hanghang Tong)
Generative Models for Synthetic Data: Transforming Data Mining in the GenAI Era (Dawei Li, Yue Huang, Ming Li, Tianyi Zhou, Xiangliang Zhang and Huan Liu)
Neural Differential Equations for Continuous-Time Analysis (Yongkyung Oh, Dongyoung Lim and Sungil Kim)
Afternoon (tentative)
Socially Responsible and Trustworthy Generative Foundation Models: Principles, Challenges, and Practices (Yue Huang, Canyu Chen, Lu Cheng, Bhavya Kailkhura, Nitesh Chawla and Xiangliang Zhang)
Retrieval of Graph Structured Objects: Theory and Applications (Indradyumna Roy, Soumen Chakrabarti and Abir De)
Towards Large Generative Recommendation: A Tokenization Perspective (Yupeng Hou, An Zhang, Leheng Sheng, Jiancan Wu, Xiang Wang, Tat-Seng Chua and Julian McAuley)
Uncertain Boundaries: A Tutorial on Copyright Challenges and Cross-Disciplinary Solutions for Generative AI (Zhipeng Yin, Zichong Wang, Avash Palikhe and Wenbin Zhang)
A Tutorial on Hypergraph Neural Networks: An In-Depth and Step-By-Step Guide (Sunwoo Kim, Soo Yong Lee, Yue Gao, Alessia Antelmi, Mirko Polato and Kijung Shin)
Tutorial Abstracts and Webpages
Fairness in Large Language Models: A Tutorial.
Zichong Wang (Florida International University), Avash Palikhe (Florida International University), Zhipeng Yin (Florida International University) and Wenbin Zhang (Florida International University).
Abstract. Large Language Models (LLMs) have achieved remarkable performance across a wide range of applications, but their outputs often exhibit systematic biases that pose challenges for trustworthy deployment.
While fairness has been extensively studied in traditional machine learning models, most existing tutorials and frameworks focus on settings where model internals or training data are accessible, assumptions that often do not hold for LLMs.
As LLMs become increasingly influential, there is a growing need to understand fairness in this new context, including how bias manifests, how it can be measured, and what mitigation strategies are most effective.
To address this gap, this tutorial surveys recent research advancements in fairness for LLMs. We begin with an introduction to LLMs and real-world examples of biased behavior, followed by an analysis of the underlying sources of bias.
The tutorial then defines key fairness concepts tailored to LLMs and reviews various bias evaluation methods and fairness-enhancing algorithms. We also present a multi-dimensional taxonomy of benchmark datasets for fairness evaluation and conclude with a discussion of open research challenges.
https://fairness-llms-tutorial.github.io
Neural Shifts in Collaborative Team Recommendation.
Mahdis Saeedi (University of Windsor) and Hossein Fani (University of Windsor).
Abstract. Team recommendation involves selecting skilled experts to form an almost surely successful collaborative team, or refining the team composition to maintain or excel at performance.
To eschew this tedious and error-prone manual process, various computational and social science theoretical approaches have been proposed, wherein the problem definition remains essentially the same while going by other names such as team allocation, selection, composition, and formation.
In this tutorial, we study the advancement of computational approaches from greedy search in pioneering works to the recent learning-based approaches, with a particular in-depth exploration of graph neural network-based methods as the cutting-edge class, via unifying definitions, formulations, and evaluation schema.
More importantly, we then discuss team refinement, a subproblem in team recommendation that involves structural adjustments or expert replacements to enhance team performance in dynamic environments.
Finally, we introduce training strategies, benchmarking datasets, and open-source tools, along with future research directions and real-world applications.
https://fani-lab.github.io/OpeNTF/tutorial/cikm25
Continual Recommender Systems.
Hyunsik Yoo (University of Illinois Urbana-Champaign), Seongku Kang (Korea University) and Hanghang Tong (University of Illinois Urbana-Champaign).
Abstract. Modern recommender systems operate in uniquely dynamic settings: user interests, item pools, and popularity trends shift continuously, and models must adapt in real time without forgetting past preferences.
While existing tutorials on continual or lifelong learning cover broad machine learning domains (e.g., vision and graphs), they do not address recommendation-specific demands—such as balancing stability and plasticity per user, handling cold-start items, and optimizing recommendation metrics under streaming feedback.
This tutorial aims to make a timely contribution by filling that gap. We begin by reviewing the background and problem settings, followed by a comprehensive overview of existing approaches, including replay-based and regularization-based methods.
We then highlight recent efforts to apply continual learning to practical deployment environments, such as resource-constrained systems and sequential interaction settings.
Finally, we discuss open challenges and future research directions.
We believe this tutorial will be valuable to researchers and practitioners in recommender systems, data mining, and artificial intelligence, and will benefit a wide range of real-world application domains related to information retrieval.
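To give a concrete flavor of the replay-based methods surveyed above, here is a minimal sketch (our illustration, not a method from the tutorial) of a reservoir-sampling replay buffer: it keeps a uniform sample of the interaction stream in fixed memory so that past preferences can be rehearsed alongside new data. The class and parameter names are hypothetical.

```python
import random

class ReplayBuffer:
    """Reservoir-sampling replay buffer: maintains a uniform random
    sample of all interactions seen so far in O(capacity) memory,
    so old user preferences can be replayed during streaming updates."""
    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0
        self.rng = random.Random(seed)  # seeded for reproducibility

    def add(self, interaction):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(interaction)
        else:
            # Replace a random slot with probability capacity / seen
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = interaction

buf = ReplayBuffer(capacity=3)
for t in range(100):                  # a stream of 100 interactions
    buf.add(("user", "item", t))
print(len(buf.buffer))                # always 3, regardless of stream length
```

Regularization-based methods would instead penalize deviation of new model parameters from old ones; the replay approach above trades a small memory footprint for direct rehearsal of past data.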
The tutorial website is available at:
https://www.idea.korea.ac.kr/research/tutorial-continual-recommender-systems
Generative Models for Synthetic Data: Transforming Data Mining in the GenAI Era.
Dawei Li (Arizona State University), Yue Huang (University of Notre Dame), Ming Li (University of Maryland), Tianyi Zhou (University of Maryland), Xiangliang Zhang (University of Notre Dame) and Huan Liu (Arizona State University).
Abstract. In the era of data-driven artificial intelligence (AI), access to large-scale, high-quality datasets has become a fundamental requirement for breakthroughs in data mining and machine learning.
However, real-world data is often scarce, expensive to annotate, or restricted due to privacy and proprietary concerns.
Synthetic data, i.e., algorithmically generated data that mimics the statistical properties and underlying patterns of real-world data, has emerged as a powerful solution to these challenges.
https://syndata4dm.github.io
Neural Differential Equations for Continuous-Time Analysis.
Yongkyung Oh (University of California, Los Angeles), Dongyoung Lim (Ulsan National Institute of Science & Technology) and Sungil Kim (Ulsan National Institute of Science & Technology).
Abstract. Modeling complex, irregular time series is a critical challenge in knowledge discovery and data mining.
This tutorial introduces Neural Differential Equations (NDEs)—a powerful paradigm for continuous-time deep learning that intrinsically handles the non-uniform sampling and missing values where traditional models falter.
We provide a comprehensive review of the theory and practical application of the entire NDE family: Neural Ordinary Differential Equations (NODEs), Neural Controlled Differential Equations (NCDEs), and Neural Stochastic Differential Equations (NSDEs).
The tutorial emphasizes robustness and stability and culminates in a hands-on session where participants will use key open-source libraries to solve real-world tasks like interpolation and classification.
Designed for AI researchers and practitioners, this tutorial equips attendees with essential tools for time series analysis.
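To illustrate the continuous-time idea behind NDEs, the sketch below (our illustration, not material from the tutorial) uses a plain explicit-Euler solver with a hand-fixed dynamics function standing in for a learned network. Because the state is defined by a differential equation, it can be evaluated directly at irregularly spaced observation times, with no resampling or imputation.

```python
import math

def euler_odeint(f, y0, ts):
    """Integrate dy/dt = f(t, y) with explicit Euler, returning the
    state at each (possibly irregularly spaced) time in ts."""
    y, out = y0, [y0]
    for t0, t1 in zip(ts, ts[1:]):
        y = y + (t1 - t0) * f(t0, y)   # one Euler step of size t1 - t0
        out.append(y)
    return out

# Toy "dynamics": fixed linear decay dy/dt = -y, standing in for a
# learned neural network f_theta(t, y).
f = lambda t, y: -y

# Irregular observation times, handled without any preprocessing.
ts = [0.0, 0.1, 0.35, 0.4, 0.7, 1.0]
states = euler_odeint(f, 1.0, ts)
print(states[-1])   # crude Euler estimate of exp(-1) ~ 0.368
```

In practice one would use an adaptive solver (and, for NCDEs/NSDEs, a control path or diffusion term), but the evaluate-anywhere-in-time property shown here is the core of the paradigm.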
https://nde-for-ts.github.io
Socially Responsible and Trustworthy Generative Foundation Models: Principles, Challenges, and Practices.
Yue Huang (University of Notre Dame), Canyu Chen (Northwestern University), Lu Cheng (University of Illinois Chicago), Bhavya Kailkhura (Lawrence Livermore National Laboratory), Nitesh Chawla (University of Notre Dame) and Xiangliang Zhang (University of Notre Dame).
Abstract. Generative foundation models (GenFMs), including large language and multimodal models, are transforming information retrieval and knowledge management.
However, their rapid adoption raises urgent concerns about social responsibility, trustworthiness, and governance.
This tutorial offers a comprehensive, hands-on overview of recent advances in responsible GenFMs, covering foundational concepts, multi-dimensional risk taxonomies (including safety, privacy, robustness, truthfulness, fairness, and machine ethics), state-of-the-art evaluation benchmarks, and effective mitigation strategies.
We integrate real-world case studies and practical exercises using open-source tools, and present key perspectives from both policy and industry, including recent regulatory developments and enterprise practices.
The session concludes with a discussion of open challenges, providing actionable guidance for the CIKM community.
https://tutorial-trustgenfm.github.io
Retrieval of Graph Structured Objects: Theory and Applications.
Indradyumna Roy (IIT Bombay), Soumen Chakrabarti (IIT Bombay) and Abir De (IIT Bombay).
Abstract. Graph-structured data is ubiquitous across diverse domains like social networks, search, question answering, and drug discovery.
Effective retrieval of (sub-)graphs with relevant substructures has become critical to the success of these applications.
This tutorial will introduce attendees to state-of-the-art neural methods for graph retrieval, highlighting architectures that effectively model relevance through innovative combinations of early and late interaction mechanisms.
Participants will explore relevance models that represent graphs as sets of embeddings, enabling alignment-driven similarity scoring between query and corpus graphs and supporting diverse cost functions, both symmetric and asymmetric. We will also discuss compatibility with Approximate Nearest Neighbor (ANN) methods, covering recent advances in locality-sensitive hashing (LSH) and other indexing techniques that significantly enhance scalability in graph retrieval.
The tutorial includes hands-on experience with an accessible, PyTorch-integrated toolkit that provides downloadable graph retrieval datasets and baseline implementations of recent methods. Participants will learn to adapt these methods for multi-modal applications, such as molecule, text, and image retrieval, where graph-based retrieval proves particularly effective.
Designed for researchers and practitioners, this session delivers both foundational concepts and practical tools for implementing and scaling neural graph retrieval solutions across interdisciplinary applications.
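As a minimal sketch of the alignment-driven scoring idea (our illustration, not the tutorial's toolkit), the snippet below scores a query graph against a corpus graph by matching each query-node embedding to its best corpus-node embedding and summing the matches. The embeddings are hand-made toy vectors.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def alignment_score(query_vecs, corpus_vecs):
    """Late-interaction relevance: align each query-node embedding with
    its best-matching corpus-node embedding and sum the matches.
    Asymmetric by design: score(Q, C) != score(C, Q) in general,
    which suits subgraph-containment style relevance."""
    return sum(max(dot(q, c) for c in corpus_vecs) for q in query_vecs)

# Toy 2-d node embeddings for a query graph and two corpus graphs.
query = [[1.0, 0.0], [0.0, 1.0]]
good  = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]   # covers both query nodes
bad   = [[0.9, 0.1], [0.8, 0.2]]               # covers only one
print(alignment_score(query, good) > alignment_score(query, bad))  # True
```

Early-interaction methods would instead cross-attend between the two graphs before pooling; the set-of-embeddings form above is what makes ANN-style indexing tractable.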
https://sites.google.com/view/graph-match-tutorial/home
Towards Large Generative Recommendation: A Tokenization Perspective.
Yupeng Hou (University of California, San Diego), An Zhang (University of Science and Technology of China), Leheng Sheng (National University of Singapore), Jiancan Wu (University of Science and Technology of China), Xiang Wang (University of Science and Technology of China), Tat-Seng Chua (National University of Singapore) and Julian McAuley (University of California, San Diego).
Abstract. The emergence of large generative models is transforming the landscape of recommender systems. One of the most fundamental components in building these models is action tokenization, the process of converting human-readable data (e.g., user-item interactions) into machine-readable formats (e.g., discrete token sequences).
In this half-day tutorial, we present a comprehensive overview of existing action tokenization techniques, which convert actions into (1) item IDs, (2) textual descriptions, and (3) semantic IDs, and explore how they relate to the development of large generative recommendation models.
We then provide an in-depth discussion of the challenges, open questions, and potential future directions from the perspective of action tokenization, aiming to inspire the design of next-generation recommender systems.
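To make the tokenization contrast concrete, here is a hypothetical illustration (not the tutorial's methods) of the same interaction history tokenized as opaque item IDs versus semantic IDs; textual-description tokenization would instead emit the item's title text. The codebook here is hand-made, whereas semantic IDs are learned in practice, e.g., by residual quantization of item embeddings.

```python
# One user's interaction history (item identifiers are made up).
history = ["item_42", "item_7", "item_42"]

# (1) Item IDs: one opaque token per item; no structure is shared
# between items, so every item needs its own token.
id_tokens = [f"<id_{h.split('_')[1]}>" for h in history]

# (3) Semantic IDs: each item maps to a short tuple of discrete codes
# drawn from shared codebooks, so similar items share prefix tokens.
semantic_codebook = {"item_42": (3, 1), "item_7": (3, 5)}
sem_tokens = [f"<c{level}_{code}>"
              for item in history
              for level, code in enumerate(semantic_codebook[item])]

print(id_tokens)    # ['<id_42>', '<id_7>', '<id_42>']
print(sem_tokens)   # ['<c0_3>', '<c1_1>', '<c0_3>', '<c1_5>', '<c0_3>', '<c1_1>']
```

Note how the two items share the first-level code `<c0_3>`: this shared structure is what lets a generative model transfer knowledge across items and handle unseen ones.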
https://large-genrec.github.io/cikm2025.html
Uncertain Boundaries: A Tutorial on Copyright Challenges and Cross-Disciplinary Solutions for Generative AI.
Zhipeng Yin (Florida International University), Zichong Wang (Florida International University), Avash Palikhe (Florida International University) and Wenbin Zhang (Florida International University).
Abstract. As generative artificial intelligence (AI) becomes increasingly prevalent in creative industries, intellectual property issues have come to the forefront, especially regarding AI-generated content that closely resembles human-created works.
Recent high-profile incidents involving AI-generated outputs reproducing copyrighted materials underscore the urgent need to reassess current copyright frameworks and establish effective safeguards against infringement.
This tutorial systematically examines copyright-related challenges throughout various AI development stages, providing practical recommendations for developers.
It first outlines fundamental copyright principles and considerations specific to generative AI, followed by methods for detecting and assessing potential infringements in AI-generated content.
It also introduces protective strategies to prevent unauthorized replication of creative materials and training datasets.
Additionally, the tutorial details specialized training methods designed to minimize the likelihood of infringement.
Finally, it reviews current AI copyright regulatory frameworks, identifies open research questions, and proposes directions for future research.
https://aicopyright-tutorial.github.io
A Tutorial on Hypergraph Neural Networks: An In-Depth and Step-By-Step Guide.
Sunwoo Kim (KAIST), Soo Yong Lee (KAIST), Yue Gao (Tsinghua University), Alessia Antelmi (University of Turin), Mirko Polato (University of Turin) and Kijung Shin (KAIST).
Abstract. Higher-order interactions (HOIs) are ubiquitous in real-world networks, such as group discussions on online Q&A platforms, co-purchases of items in e-commerce, and collaborations of researchers.
Investigation of deep learning for networks of HOIs, expressed as hypergraphs, has become an important agenda for the data mining and machine learning communities.
As a result, hypergraph neural networks (HNNs) have emerged as a powerful tool for representation learning on hypergraphs. Given this emerging trend, we provide a timely tutorial dedicated to HNNs.
We cover the following topics: (1) inputs, (2) message passing schemes, (3) training strategies, (4) applications (e.g., recommender systems and time series analysis), and (5) open problems of HNNs.
This tutorial is intended for researchers and practitioners who are interested in hypergraph representation learning and its applications.
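As a flavor of the message passing schemes covered, the sketch below (our illustration, not the tutorial's material) shows the two-stage node-to-hyperedge-to-node scheme that most HNNs build on, with plain averaging standing in for learned aggregation layers and scalar features standing in for embedding vectors.

```python
# Hyperedges are groups of nodes (a group discussion, a co-purchase, ...).
hyperedges = {"e1": ["a", "b", "c"], "e2": ["c", "d"]}
x = {"a": 1.0, "b": 3.0, "c": 5.0, "d": 7.0}   # scalar node features

# Stage 1: each hyperedge aggregates the features of its member nodes.
edge_msg = {e: sum(x[v] for v in mem) / len(mem)
            for e, mem in hyperedges.items()}

# Stage 2: each node aggregates messages from the hyperedges containing it.
new_x = {}
for v in x:
    incident = [edge_msg[e] for e, mem in hyperedges.items() if v in mem]
    new_x[v] = sum(incident) / len(incident)

print(new_x)   # node "c" mixes both groups: (3.0 + 6.0) / 2 = 4.5
```

Real HNNs replace both averages with learned, permutation-invariant functions and stack several such rounds; the key point is that information flows through hyperedges, so an entire group interacts in one step rather than pairwise.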
https://sites.google.com/view/hnn-tutorial