Awards

Best Paper

  • Infini-gram mini: Exact n-gram Search at the Internet Scale with FM-Index
    Hao Xu, Jiacheng Liu, Yejin Choi, Noah A. Smith, Hannaneh Hajishirzi

Outstanding Papers

  1. LingGym: How Far Are LLMs from Thinking Like Field Linguists?
    Changbing Yang, Franklin Ma, Freda Shi, Jian Zhu

  2. Mind the Value-Action Gap: Do LLMs Act in Alignment with Their Values?
    Hua Shen, Nicholas Clark, Tanu Mitra

  3. DiscoSG: Towards Discourse-Level Text Scene Graph Parsing through Iterative Graph Refinement
    Shaoqing Lin, Chong Teng, Fei Li, Donghong Ji, Lizhen Qu, Zhuang Li

  4. Generative or Discriminative? Revisiting Text Classification in the Era of Transformers
    Siva Rajesh Kasa, Karan Gupta, Sumegh Roychowdhury, Ashutosh Kumar,
    Yaswanth Biruduraju, Santhosh Kumar Kasa, Pattisapu Nikhil Priyatam,
    Arindam Bhattacharya, Shailendra Agarwal, Vijay Huddar

  5. Measuring Chain of Thought Faithfulness by Unlearning Reasoning Steps
    Martin Tutek, Fateme Hashemi Chaleshtori, Ana Marasovic, Yonatan Belinkov

  6. MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning
    Jingyan Shen, Jiarui Yao, Rui Yang, Yifan Sun, Feng Luo, Rui Pan, Tong Zhang, Han Zhao

  7. Causal Interventions Reveal Shared Structure Across English Filler-Gap Constructions
    Sasha Boguraev, Christopher Potts, Kyle Mahowald


Best Special Theme Paper

  • InterIDEAS: Philosophical Intertextuality via LLMs
    Yue Yang, Yinzhi Xu, Chenghao Huang, JohnMichael Jurgensen, Han Hu, Hao Wang

Best Resource Paper

  • Autoformalization in the Wild: Assessing LLMs on Real-World Mathematical Definitions
    Lan Zhang, Marco Valentino, Andre Freitas

Social Impact Award

  • AccessEval: Benchmarking Disability Bias in Large Language Models
    Srikant Panda, Amit Agarwal, Hitesh Laxmichand Patel

People’s Choice Award

  • Randomly Removing 50% of Dimensions in Text Embeddings has Minimal Impact on Retrieval and Classification Tasks
    Sotaro Takeshita, Yurina Takeshita, Daniel Ruffinelli, Simone Paolo Ponzetto

SAC Highlights

The Senior Area Chairs highlighted the following 35 papers as particularly noteworthy:

  • PAFT: Prompt-Agnostic Fine-Tuning
    Chenxing Wei, Yao Shu, Mingwen Ou, Ying He, Fei Yu

  • Constructions are Revealed in Word Distributions
    Joshua Rozner, Leonie Weissweiler, Kyle Mahowald, Cory Shain

  • To Mask or to Mirror: Human-AI Alignment in Collective Reasoning
    Crystal Qian, Aaron T. Parisi, Clémentine Bouleau, Vivian Tsai, Maël Lebreton, Lucas Dixon

  • Whisper-UT: A Unified Translation Framework for Speech and Text
    Cihan Xiao, Matthew Wiesner, Debashish Chakraborty, Reno Kriz, Keith Cunningham, Kenton Murray,
    Kevin Duh, Luis Tavarez-Arce, Paul McNamee, Sanjeev Khudanpur

  • Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations
    Ziwei Ji, Lei Yu, Yeskendir Koishekenov, Yejin Bang, Anthony Hartshorn, Alan Schelten, Cheng Zhang,
    Pascale Fung, Nicola Cancedda

  • Quantifying Language Disparities in Multilingual Large Language Models
    Songbo Hu, Ivan Vulić, Anna Korhonen

  • Collab-Overcooked: Benchmarking and Evaluating Large Language Models as Collaborative Agents
    Haochen Sun, Shuwen Zhang, Lujie Niu, Lei Ren, Hao Xu, Hao Fu, Fangkun Zhao, Caixia Yuan, Xiaojie Wang

  • Think in Safety: Unveiling and Mitigating Safety Alignment Collapse in Multimodal Large Reasoning Model
    Xinyue Lou, You Li, Jinan Xu, Xiangyu Shi, Chi Chen, Kaiyu Huang

  • PSET: a Phonetics-Semantics Evaluation Testbed
    Gianluca Sperduti, Dong Nguyen

  • GATEAU: Selecting Influential Samples for Long Context Alignment
    Shuzheng Si, Haozhe Zhao, Gang Chen, Yunshui Li, Kangyang Luo, Chuancheng Lv, Kaikai An, Fanchao Qi,
    Baobao Chang, Maosong Sun

  • AbsVis: Benchmarking How Humans and Vision-Language Models “See” Abstract Concepts in Images
    Tarun Tater, Diego Frassinelli, Sabine Schulte im Walde

  • Analyzing Uncertainty of LLM-as-a-Judge: Interval Evaluations with Conformal Prediction
    Huanxin Sheng, Xinyi Liu, Hangfeng He, Jieyu Zhao, Jian Kang

  • ReSo: A Reward-driven Self-organizing LLM-based Multi-Agent System for Reasoning Tasks
    Heng Zhou, Hejia Geng, Xiangyuan Xue, Li Kang, Yiran Qin, Zhiyong Wang, Zhenfei Yin, Lei Bai

  • Prototypical Human-AI Collaboration Behaviors from LLM-Assisted Writing in the Wild
    Sheshera Mysore, Debarati Das, Hancheng Cao, Bahareh Sarrafzadeh

  • Comparing human and LLM politeness strategies in free production
    Haoran Zhao, Robert D. Hawkins

  • Who Holds the Pen? Caricature and Perspective in LLM Retellings of History
    Lubna Zahan Lamia, Mabsur Fatin Bin Hossain, Md Mosaddek Khan

  • AMACE: Automatic Multi-Agent Chart Evolution for Iteratively Tailored Chart Generation
    Hyuk Namgoong, Jeesu Jung, Hyeonseok Kang, Yohan Lee, Sangkeun Jung

  • HMoE: Heterogeneous Mixture of Experts for Language Modeling
    An Wang, Xingwu Sun, Ruobing Xie, Shuaipeng Li, Jiaqi Zhu, Zhen Yang, Pinxue Zhao, Weidong Han,
    Zhanhui Kang, Di Wang, Naoaki Okazaki, Cheng-zhong Xu

  • Lemmatization of Polish Multi-word Expressions
    Magdalena Król, Aleksander Smywiński-Pohl, Zbigniew Kaleta, Paweł Lewkowicz

  • Discriminating Form and Meaning in Multilingual Models with Minimal-Pair ABX Tasks
    Maureen de Seyssel, Jie Chi, Skyler Seto, Maartje Ter Hoeve, Masha Fedzechkina, Natalie Schluter

  • Tokenization and Representation Biases in Multilingual Models on Dialectal NLP Tasks
    Vani Kanjirangat, Tanja Samardzic, Ljiljana Dolamic, Fabio Rinaldi

  • ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning
    Yu Sun, Xingyu Qian, Weiwen Xu, Hao Zhang, Chenghao Xiao, Long Li, Deli Zhao, Wenbing Huang,
    Tingyang Xu, Qifeng Bai, Yu Rong

  • Detecting Legal Citations in United Kingdom Court Judgments
    Holli Sargeant, Andreas Östling, Måns Magnusson

  • MAviS: A Multimodal Conversational Assistant For Avian Species
    Yevheniia Kryklyvets, Mohammed Irfan Kurpath, Sahal Shaji Mullappilly, Jinxing Zhou,
    Fahad Shahbaz Khan, Rao Muhammad Anwer, Salman Khan, Hisham Cholakkal

  • Cardiverse: Harnessing LLMs for Novel Card Game Prototyping
    Danrui Li, Sen Zhang, Samuel S. Sohn, Kaidong Hu, Muhammad Usman, Mubbasir Kapadia

  • Estimating LLM Consistency: A User Baseline vs Surrogate Metrics
    Xiaoyuan Wu, Weiran Lin, Omer Akgul, Lujo Bauer

  • Beyond WER: Probing Whisper’s Sub-token Decoder Across Diverse Language Resource Levels
    Siyu Liang, Nicolas Ballier, Gina-Anne Levow, Richard Wright

  • RALS: Resources and Baselines for Romanian Automatic Lexical Simplification
    Fabian Anghel, Cristea Petru-Theodor, Claudiu Creanga, Sergiu Nisioi

  • NormGenesis: Multicultural Dialogue Generation via Exemplar-Guided Social Norm Modeling and Violation Recovery
    Minki Hong, Jangho Choi, Jihie Kim

  • LiTEx: A Linguistic Taxonomy of Explanations for Understanding Within-Label Variation in Natural Language Inference
    Pingjun Hong, Beiduo Chen, Siyao Peng, Marie-Catherine de Marneffe, Barbara Plank

  • Liaozhai through the Looking-Glass: On Paratextual Explicitation of Culture-Bound Terms in Machine Translation
    Sherrie Shen, Weixuan Wang, Alexandra Birch

  • Aligning Text/Speech Representations from Multimodal Models with MEG Brain Activity During Listening
    Padakanti Srijith, Khushbu Pahwa, Radhika Mamidi, Bapi Raju Surampudi, Manish Gupta, Subba R Oota

  • FoREST: Frame of Reference Evaluation in Spatial Reasoning Tasks
    Tanawan Premsri, Parisa Kordjamshidi

  • Layer-wise Minimal Pair Probing Reveals Contextual Grammatical-Conceptual Hierarchy in Speech Representations
    Linyang He, Qiaolin Wang, Xilin Jiang, Nima Mesgarani

  • Mind the Blind Spots: A Focus-Level Evaluation Framework for LLM Reviews
    Hyungyu Shin, Jingyu Tang, Yoonjoo Lee, Nayoung Kim, Hyunseung Lim, Ji Yong Cho, Hwajung Hong,
    Moontae Lee, Juho Kim


Outstanding Senior Area Chairs

  • Ashiqur R. KhudaBukhsh
  • Cassandra L. Jacobs
  • Debora Nozza
  • Luciana Benotti
  • Miryam de Lhoneux
  • Richard Sproat
  • Sachin Kumar
  • Usman Naseem
  • Wenpeng Yin

Outstanding Area Chairs

  • Alla Rozovskaya
  • Gaël Guibon
  • Gerasimos Spanakis
  • Jianhui Pang
  • JinYeong Bak
  • Jivnesh Sandhan
  • Lei Li
  • Lucy Li
  • Mark G. Lee
  • Matthieu Labeau
  • Shruti Rijhwani
  • Vered Shwartz
  • Vivek Gupta
  • Wei Zhao
  • Xindi Wang

Outstanding Reviewers

  • Alon Eirew
  • Andrey Sakhovskiy
  • Arkadiusz Janz
  • Chun-Ying Huang
  • Daixuan Cheng
  • Di Wang
  • Dominic Petrak
  • Esra Dönmez
  • George Zerveas
  • Giacomo Frisoni
  • Huachuan Qiu
  • Huimu Wang
  • Jiaqi Chen
  • Jiawei Ma
  • Jiayi Wang
  • Jiho Jin
  • Jingcheng Niu
  • Joel Ruben Antony Moniz
  • Jonathan P. Chang
  • Justin Vasselli
  • Kavin R. V.
  • Kerem Zaman
  • Kevin Duh
  • Liu Chengwu
  • Long H. B. Nguyen
  • Mahan Malihi
  • Mengru Wang
  • Mun Yong Yi
  • Munmun De Choudhury
  • Nishant Balepur
  • Pretam Ray
  • Qing Zong
  • Răzvan-Alexandru Smădu
  • Ruida Wang
  • Sahal Shaji Mullappilly
  • Shaolin Zhu
  • Shuai Zhao
  • Suyash Damle
  • Tanmay Parekh
  • Tiankai Yang
  • Tong Ding
  • Wei Yao
  • Xiulin Yang
  • Xiwen Liang
  • Yaswanth Narsupalli
  • Yeyun Gong
  • Youna Kim
  • Yuxuan Chen
  • Yves Scherrer
  • Zhenzhou Ji
  • Zhicheng Yang

Paper Award Selection Process

The Best Paper Committee was chaired by Owen Rambow (Stony Brook University) and Mirella Lapata (University of Edinburgh). A special committee, whose assignments were based on reviewer expertise, produced 92 reviews and selected four best paper nominations. The final awardees were chosen from these nominations (including ARR-originating submissions with fewer initial nominations).