The workshop will take place on November 9, 2025, in room A207 at the Suzhou International Expo Centre. Below is the detailed program of the workshop. All times listed are in local Suzhou time (UTC+8).
Workshop Schedule
- 09:00 - 09:10: Opening remarks
- 09:10 - 09:55: Invited Talk - Maxim Panov
- Title: Uncertainty Quantification for Generative Language Models
- Abstract: The widespread deployment of large language models (LLMs) has made ML-based applications even more vulnerable to risks of causing various forms of harm to users. For example, models often “hallucinate”, i.e., fabricate facts without providing users an apparent means to validate their statements. Uncertainty quantification (UQ) methods could be used to detect unreliable generations, unlocking safer and more responsible use of LLMs in practice. UQ methods for generative LLMs are a subject of cutting-edge research, which is currently quite scarce and scattered. We systematize these efforts, discuss common caveats, and provide suggestions for the development of novel techniques in this area.
- 09:55 - 10:30: Poster lightning round 1 (in-person presenters)
- Uncertainty-driven Partial Diacritization for Arabic Text (Humaid Ali Alblooshi, Artem Shelmanov, and Hanan Aldarmaki)
- The Geometry of Creative Variability: How Credal Sets Expose Calibration Gaps in Language Models (Esteban Garces Arias, Julian Rodemann, and Christian Heumann)
- Asking a Language Model for Diverse Responses (Sergey Troshin, Irina Saparina, Antske Fokkens, and Vlad Niculae)
- Consensus or Conflict? Fine-Grained Evaluation of Conflicting Answers in Question-Answering (Eviatar Nachshoni, Arie Cattan, Shmuel Amar, Ori Shapira, and Ido Dagan)
- Confidence-Based Response Abstinence: Improving LLM Trustworthiness via Activation-Based Uncertainty Estimation (Zhiqi Huang, Vivek Datla, Chenyang Zhu, Alfy Samuel, Daben Liu, Anoop Kumar, and Ritesh Soni)
- Causal Understanding by LLMs: The Role of Uncertainty (Oscar William Lithgow-Serrano, Vani Kanjirangat, and Alessandro Antonucci)
- It Depends: Resolving Referential Ambiguity in Minimal Contexts with Commonsense Knowledge (Lukas Ellinger and Georg Groh)
- Read Your Own Mind: Reasoning Helps Surface Self-Confidence Signals in LLMs (Jakub Podolak and Rajeev Verma)
- Calibrating Language Models for Neural Ranking under Noisy Supervision with Relaxed Labels (Arnab Sharma, Daniel Vollmers, and Axel-Cyrille Ngonga Ngomo)
- ERGO: Entropy-guided Resetting for Generation Optimization in Multi-turn Language Models (Haziq Mohammad Khalid, Athikash Jeyaganthan, Timothy Do, Yicheng Fu, Vasu Sharma, Sean O’Brien, and Kevin Zhu)
- Can Vision-Language Models Infer Speaker’s Ignorance? The Role of Visual and Linguistic Cues (Ye-eun Cho and Yunho Maeng)
- DeLTa: A Decoding Strategy based on Logit Trajectory Prediction Improves Factuality and Reasoning Ability (Yunzhen He, Yusuke Takase, Yoichi Ishibashi, and Hidetoshi Shimodaira)
- Decoding Uncertainty: The Impact of Decoding Strategies for Uncertainty Estimation in Large Language Models (Wataru Hashimoto, Hidetaka Kamigaito, and Taro Watanabe)
- 10:30 - 11:00: Coffee Break
- 11:00 - 12:15: In-person poster session
- Same set of papers as above (in-person presenters).
- 12:15 - 13:15: Lunch break
- 13:15 - 14:00: Invited Talk - Parisa Kordjamshidi
- Title: Reasoning under Uncertainty with Large Multimodal Language Models
- Abstract: Uncertainty in intelligent models has multiple facets. One aspect concerns a model’s own uncertainty or confidence in its generated outputs. Another pertains to factual knowledge about uncertainty within specific concepts. For example, statements such as “10–20% of lifelong smokers will develop lung cancer” express factual uncertainty derived from statistical data analyses and represented in text. A key research question is whether language models can form and convey such factual uncertainties—integrating information, drawing on their internal knowledge, and aligning this with their confidence when expressing opinions. While addressing this question is highly challenging, I will present our research that explores related directions and the following research questions: 1) How do language models understand uncertainty expressions in natural language and perform probabilistic inference over them? 2) How can models be trained to follow the principles of probabilistic reasoning when handling uncertainty in text? 3) How can today’s large models reason over uncertain text, specifically by mapping language into formal probabilistic logic programs? And finally, in the context of grounding natural language in the visual modality, 4) How can uncertainty in perception be explicitly represented in reasoning, specifically through mappings to differentiable probabilistic programs?
- 14:00 - 14:45: Poster lightning round 2 (virtual presenters)
- The Benefits of Being Uncertain: Perplexity as a Signal for Naturalness in Multilingual Machine Translation (Timothy Pistotti, Michael J. Witbrock, Padriac Amato Tahua O’Leary, and Jason Brown)
- Learning to vary: Teaching LMs to reproduce human linguistic variability in next-word prediction (Tobias Groot, Salo Lacunes, and Evgenia Ilia)
- Phases of Uncertainty: Confidence–Calibration Dynamics in Language Model Training (Aneesh Durai)
- Beyond Human Judgment: A Bayesian Evaluation of LLMs’ Moral Values Understanding (Maciej Skorski and Alina Landowska)
- Do Large Language Models Know When Not to Answer in Medical QA? (Sravanthi Machcha, Sushrita Yerra, Sharmin Sultana, Hong Yu, and Zonghai Yao)
- Certain but not Probable? Differentiating Certainty from Probability in LLM Token Outputs for Probabilistic Scenarios (Autumn Toney and Ryan Wails)
- HALLUCINOGEN: Benchmarking Hallucination in Implicit Reasoning within Large Vision Language Models (Ashish Seth, Dinesh Manocha, and Chirag Agarwal)
- Uncertainty in Semantic Language Modeling with PIXELS (Stefania Radu, Marco Zullich, and Matias Valdenegro-Toro)
- Confidence Calibration in Large Language Model-Based Entity Matching (Iris Kamsteeg, Juan Cardenas-Cartagena, Floris van Beers, Tsegaye Misikir Tashu, and Matias Valdenegro-Toro)
- Demystify Verbosity Compensation Behavior of Large Language Models (Yusen Zhang, Sarkar Snigdha Sarathi Das, and Rui Zhang)
- On the Role of Unobserved Sequences on Sample-based Uncertainty Quantification for LLMs (Lucie Kunitomo-Jacquin, Edison Marrese-Taylor, and Ken Fukuda)
- Amortized Bayesian Meta-Learning for Low-Rank Adaptation of Large Language Models (Liyi Zhang, Jake C. Snell, and Thomas L. Griffiths)
- Towards Trustworthy Summarization of Cardiovascular Articles: A Factuality-and-Uncertainty-Aware Biomedical LLM Approach (Eleni Partalidou, Tatiana Passali, Chrysoula Zerva, Grigorios Tsoumakas, and Sophia Ananiadou)
- Towards Open-Ended Discovery for Low-Resource NLP (Bonaventure F. P. Dossou and Henri Aïdasso)
- Investigating Factuality in Long-Form Text Generation: The Roles of Self-Known and Self-Unknown (Lifu Tu, Rui Meng, Shafiq Joty, Yingbo Zhou, and Semih Yavuz)
- 14:45 - 15:30: Invited Talk - Gal Yona
- Title: Beyond Factuality: Improving Trust and Reliability of Large Language Models
- Abstract: Factuality is a cornerstone for trustworthy LLMs, yet despite impressive progress, frontier LLMs still make many confident errors when faced with questions beyond their knowledge boundaries. In this talk, I’ll present Faithful Response Uncertainty, a different desideratum that shifts the focus away from measuring the number of incorrect statements and towards measuring the alignment between the model’s expressed certainty (“decisiveness”) and intrinsic certainty (“confidence”). I’ll conclude with a discussion of open problems and possible next steps at the intersection of factuality and uncertainty in frontier LLMs.
- 15:30 - 16:00: Coffee Break
- 16:00 - 16:45: Invited Talk - Eyke Hüllermeier
- Title: Challenges in Uncertainty Quantification for Large Language Models
- Abstract: Uncertainty quantification is important in the context of large language models (LLMs) because the outputs produced by these models are often incorrect. However, due to the complexity of language and the numerous sources of uncertainty in textual data, quantifying uncertainty in LLMs is challenging. Indeed, simply transferring existing approaches to uncertainty quantification developed for standard machine learning problems, such as classification and regression, is neither straightforward nor appropriate. This is particularly pertinent to the definition of aleatoric and epistemic uncertainty, and how they are distinguished based on the notion of reducibility. This talk will discuss the challenges of uncertainty quantification for LLMs, propose potential solutions, and highlight promising avenues for future research in this emerging field.
- 16:45 - 17:00: Closing remarks