Accepted Main Conference Papers
- Towards Automated Error Discovery: A Study in Conversational AI
Dominic Petrak, Thy Thy Tran, Iryna Gurevych
- Break the Checkbox: Challenging Closed-Style Evaluations of Cultural Alignment in LLMs
Mohsinul Kabir, Ajwad Abrar, Sophia Ananiadou
- Biased Tales: Cultural and Topic Bias in Generating Children’s Stories
Donya Rooein, Vilém Zouhar, Debora Nozza, Dirk Hovy
- Large Language Models as Realistic Microservice Trace Generators
Donghyun Kim, Sriram Ravula, Taemin Ha, Alex Dimakis, Daehyeok Kim, Aditya Akella
- JUDGEBERT: Assessing Legal Meaning Preservation Between Sentences
David Beauchemin, Michelle Albert-Rochette, Richard Khoury, Pierre-Luc Déziel
- QFrCoLA: a Quebec-French Corpus of Linguistic Acceptability Judgments
David Beauchemin, Richard Khoury
- Revisiting LLM Value Probing Strategies: Are They Robust and Expressive?
Siqi Shen, Mehar Singh, Lajanugen Logeswaran, Moontae Lee, Honglak Lee, Rada Mihalcea
- A Systematic Analysis of Base Model Choice for Reward Modeling
Kian Ahrabian, Pegah Jandaghi, Negar Mokhberian, Sai Praneeth Karimireddy, Jay Pujara
- Comparing Specialised Small and General Large Language Models on Text Classification: 100 Labelled Samples to Achieve Break-Even Performance
Branislav Pecher, Ivan Srba, Maria Bielikova
- Is the Top Still Spinning? Evaluating Subjectivity in Narrative Understanding
Melanie Subbiah, Akankshya Mishra, Grace Kim, Liyan Tang, Greg Durrett, Kathleen McKeown
- MathTutorBench: A Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors
Jakub Macina, Nico Daheim, Ido Hakimi, Manu Kapur, Iryna Gurevych, Mrinmaya Sachan
- Preemptive Detection and Correction of Misaligned Actions in LLM Agents
Haishuo Fang, Xiaodan Zhu, Iryna Gurevych
- Fingerprinting LLMs through Survey Item Factor Correlation: A Case Study on Humor Style Questionnaire
Simon Münker
- Gradient-Attention Guided Dual-Masking Synergetic Framework for Robust Text-based Person Retrieval
Tianlu Zheng, Yifan Zhang, Xiang An, Ziyong Feng, Kaicheng Yang, Qichuan Ding
- From Problem-Solving to Teaching Problem-Solving: Aligning LLMs with Pedagogy using Reinforcement Learning
David Dinucu-Jianu, Jakub Macina, Nico Daheim, Ido Hakimi, Iryna Gurevych, Mrinmaya Sachan
- CompKBQA: Component-wise Task Decomposition for Knowledge Base Question Answering
Yuhang Tian, Dandan Song, Zhijing Wu, Pan Yang, Changzhi Zhou, Jun Yang, Hao Wang, Huipeng Ma, Chenhao Li, Luan Zhang
- Permutative Preference Alignment from Listwise Ranking of Human Judgments
Yang Zhao, Yixin Wang, Mingzhang Yin
- ToneCraft: Cantonese Lyrics Generation with Harmony of Tones and Pitches
Junyu Cheng, Chang Pan, Shuangyin Li
- SensorLLM: Aligning Large Language Models with Motion Sensors for Human Activity Recognition
Zechen Li, Shohreh Deldari, Linyao Chen, Hao Xue, Flora D. Salim
- MixLoRA-DSI: Dynamically Expandable Mixture-of-LoRA Experts for Rehearsal-Free Generative Retrieval over Dynamic Corpora
Tuan-Luc Huynh, Thuy-Trang Vu, Weiqing Wang, Trung Le, Dragan Gasevic, Yuan-Fang Li, Thanh-Toan Do
- ViClaim: A Multilingual Multilabel Dataset for Automatic Claim Detection in Videos
Patrick Giedemann, Pius von Däniken, Jan Milan Deriu, Alvaro Rodrigo, Anselmo Peñas, Mark Cieliebak
- DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments
Yuxiang Zheng, Dayuan Fu, Xiangkun Hu, Xiaojie Cai, Lyumanshan Ye, Pengrui Lu, Pengfei Liu
- Mixture of Length and Pruning Experts for Knowledge Graphs Reasoning
Enjun Du, Siyi Liu, Yongqi Zhang
- MPRF: Interpretable Stance Detection through Multi-Path Reasoning Framework
ZhaoDan Zhang, Jin Zhang, Hui Xu, Jiafeng Guo, Xueqi Cheng
- Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels
Junjie Ye, Yuming Yang, Yang Nan, Shuo Li, Qi Zhang, Tao Gui, Xuanjing Huang, Peng Wang, Zhongchao Shi, Jianping Fan
- J$\text{I}^2$S: Joint Influence‑Aware Instruction Data Selection for Efficient Fine‑Tuning
Jingyu Wei, Bo Liu, Tianjiao Wan, Baoyun Peng, Xingkong Ma, Mengmeng Guo
- SoundMind: RL-Incentivized Logic Reasoning for Audio-Language Models
Xingjian Diao, Chunhui Zhang, Keyi Kong, Weiyi Wu, Chiyu Ma, Zhongyu Ouyang, Peijun Qing, Soroush Vosoughi, Jiang Gui
- Seeing More, Saying More: Lightweight Language Experts are Dynamic Video Token Compressors
Xiangchen Wang, Jinrui Zhang, Teng Wang, Haigang Zhang, Feng Zheng
- RoT: Enhancing Table Reasoning with Iterative Row-Wise Traversals
Xuanliang Zhang, Dingzirui Wang, Keyan Xu, Qingfu Zhu, Wanxiang Che
- T-MAD: Target-driven Multimodal Alignment for Stance Detection
ZhaoDan Zhang, Jin Zhang, Xueqi Cheng, Hui Xu
- Emotion Transfer with Enhanced Prototype for Unseen Emotion Recognition in Conversation
Kun Peng, Cong Cao, Hao Peng, Guanlin Wu, Zhifeng Hao, Lei Jiang, Yanbing Liu, Philip S. Yu
- PBI-Attack: Prior-Guided Bimodal Interactive Black-Box Jailbreak Attack for Toxicity Maximization
Ruoxi Cheng, Yizhong Ding, Shuirong Cao, Ranjie Duan, Xiaoshuang Jia, Shaowei Yuan, Simeng Qin, Zhiqiang Wang, Xiaojun Jia
- Training a Utility-based Retriever Through Shared Context Attribution for Retrieval-Augmented Language Models
Yilong Xu, Jinhua Gao, Xiaoming Yu, Yuanhai Xue, Baolong Bi, Huawei Shen, Xueqi Cheng
- SportReason: Evaluating Retrieval-Augmented Reasoning across Tables and Text for Sports Question Answering
Kaiyue Feng, Siyue Zhang, Bingsen Chen, Yilun Zhao, Chen Zhao
- MAC-Tuning: LLM Multi-Compositional Problem Reasoning with Enhanced Knowledge Boundary Awareness
Junsheng Huang, Zhitao He, Yuchen Huang, Sandeep Polisetty, Qingyun Wang, Yi R. Fung
- CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation
Zhenyi Shen, Hanqi Yan, Linhai Zhang, Zhanghao Hu, Yali Du, Yulan He
- PAFT: Prompt-Agnostic Fine-Tuning
Chenxing Wei, Yao Shu, Mingwen Ou, Ying Tiffany He, Fei Yu
- Theorem-Validated Reverse Chain-of-Thought Problem Generation for Geometric Reasoning
Deng Linger, Linghao Zhu, Yuliang Liu, Yu Wang, Qunyi Xie, Jingjing Wu, Gang Zhang, Yingying Zhu, Xiang Bai
- TACO: Enhancing Multimodal In-context Learning via Task Mapping-Guided Sequence Configuration
Yanshu Li, Jianjiang Yang, Tian Yun, Pinyuan Feng, Jinfa Huang, Ruixiang Tang
- Towards Controllable Speech Synthesis in the Era of Large Language Models: A Systematic Survey
Tianxin Xie, Yan Rong, Pengfei ZHANG, Wenwu Wang, Li Liu
- Automating Steering for Safe Multimodal Large Language Models
Lyucheng Wu, Mengru Wang, Ziwen Xu, Tri Cao, Nay Oo, Bryan Hooi, Shumin Deng
- EMNLP: Educator-role Moral and Normative Large Language Models Profiling
Yilin Jiang, Mingzi Zhang, Sheng Jin, Zengyi Yu, Xiangjie Kong, Binghao Tu
- TracSum: A New Benchmark for Aspect-Based Summarization with Sentence-Level Traceability in Medical Domain
Bohao Chu, Meijie Li, Sameh Frihat, Chengyu Gu, Georg Lodde, Elisabeth Livingstone, Norbert Fuhr
- Context Reasoner: Incentivizing Reasoning Capability for Contextualized Privacy and Safety Compliance via Reinforcement Learning
Wenbin Hu, Haoran Li, Huihao JING, Qi Hu, Ziqian Zeng, Sirui Han, Xu Heli, Tianshu Chu, Peizhao Hu, Yangqiu Song
- Towards General-Domain Word Sense Disambiguation: Distilling Large Language Model into Compact Disambiguator
Liqiang Ming, Sheng-hua Zhong, Yuncong Li
- SLoW: Select Low-frequency Words! Automatic Dictionary Selection for Translation on Large Language Models
Hongyuan Lu, Zixuan Li, Zefan Zhang, Wai Lam
- Parallel Continuous Chain-of-Thought with Jacobi Iteration
Haoyi Wu, Zhihao Teng, Kewei Tu
- EQA-RM: A Generative Embodied Reward Model with Test-time Scaling
Yuhang Chen, Zhen Tan, Tianlong Chen
- Refusal-Aware Red Teaming: Exposing Inconsistency in Safety Evaluations
Yongkang Chen, Xiaohu Du, Xiaotian Zou, Chongyang Zhao, Huan Deng, Hu LI, Xiaohui Kuang
- OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking
Zekun Xi, Wenbiao Yin, Jizhan Fang, Jialong Wu, Runnan Fang, Ningyu Zhang, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen
- LinkAlign: Scalable Schema Linking for Real-World Large-Scale Multi-Database Text-to-SQL
Yihan Wang, Peiyu Liu, Xin Yang
- On Relation-Specific Neurons in Large Language Models
Yihong Liu, Runsheng Chen, Lea Hirlimann, Ahmad Dawar Hakimi, Mingyang Wang, Amir Hossein Kargaran, Sascha Rothe, François Yvon, Hinrich Schuetze
- IPIGuard: A Novel Tool Dependency Graph-Based Defense Against Indirect Prompt Injection in LLM Agents
Hengyu An, Jinghuai Zhang, Tianyu Du, Chunyi Zhou, Qingming Li, Tao Lin, Shouling Ji
- ProtoVQA: An Adaptable Prototypical Framework for Explainable Fine-Grained Visual Question Answering
Xingjian Diao, Weiyi Wu, Keyi Kong, Peijun Qing, Xinwen Xu, Ming Cheng, Soroush Vosoughi, Jiang Gui
- SEA: Supervised Embedding Alignment for Token-Level Visual-Textual Integration in MLLMs
Yuanyang Yin, Yaqi Zhao, Yajie Zhang, Yuanxing Zhang, Ke Lin, Jiahao Wang, Xin Tao, Pengfei Wan, Wentao Zhang, Feng Zhao
- Molecular String Representation Preferences in Pretrained LLMs: A Comparative Study in Zero- & Few-Shot Molecular Property Prediction
George Arthur Baker, Mario Sanz-Guerrero, Katharina von der Wense
- Weight-Aware Activation Sparsity with Constrained Bayesian Optimization Scheduling for Large Language Models
Ming Wang, Miao Zhang, Xuebo Liu, Liqiang Nie
- DatawiseAgent: A Notebook-Centric LLM Agent Framework for Adaptive and Robust Data Science Automation
Ziming You, Yumiao Zhang, Dexuan Xu, Yiwei Lou, Yandong Yan, Wei Wang, Huamin Zhang, Yu Huang
- VC4VG: Optimizing Video Captions for Text-to-Video Generation
Yang Du, Zhuoran Lin, Kaiqiang Song, Biao Wang, Zhicheng Zheng, Tiezheng Ge, Bo Zheng, Qin Jin
- LaMP-QA: A Benchmark for Personalized Long-form Question Answering
Alireza Salemi, Hamed Zamani
- The LLM Already Knows: Estimating LLM-Perceived Question Difficulty via Hidden Representations
Yubo Zhu, Dongrui Liu, Zecheng Lin, Wei Tong, Sheng Zhong, Jing Shao
- MCIP: Protecting MCP Safety via Model Contextual Integrity Protocol
Huihao JING, Haoran Li, Wenbin Hu, Qi Hu, Xu Heli, Tianshu Chu, Peizhao Hu, Yangqiu Song
- SAKI-RAG: Mitigating Context Fragmentation in Long-Document RAG via Sentence-level Attention Knowledge Integration
Wenyu Tao, Xiaofen Xing, Zeliang Li, Xiangmin Xu
- Skeletons Matter: Dynamic Data Augmentation for Text-to-Query
Yuchen Ji, Bo Xu, Jie Shi, Jiaqing Liang, Deqing Yang, Yu Mao, Hai Chen, Yanghua Xiao
- CondenseLM: LLMs-driven Text Dataset Condensation via Reward Matching
Cheng Shen, Yew-Soon Ong, Joey Tianyi Zhou
- MovieCORE: COgnitive REasoning in Movies
Gueter Josmy Faure, Min-Hung Chen, Jia-Fong Yeh, Ying Cheng, Hung-Ting Su, Shang-Hong Lai, Winston H. Hsu
- Think Wider, Detect Sharper: Reinforced Reference Coverage for Document-Level Self-Contradiction Detection
Yuhao Chen, Yuanjie Lyu, Shuochen Liu, Chao Zhang, Junhui Lv, Tong Xu
- DRISHTIKON: A Multimodal Multilingual Benchmark for Testing Language Models’ Understanding on Indian Culture
Arijit Maji, Raghvendra Kumar, Akash Ghosh, Anushka, Nemil Shah, Abhilekh Borah, Vanshika Shah, Nishant Mishra, Sriparna Saha
- Learning from Few Samples: A Novel Approach for High-Quality Malcode Generation
Haijian Ma, Daizong Liu, Xiaowen Cai, Yulai Xie, Pan Zhou
- Personality Matters: User Traits Predict LLM Preferences in Multi-Turn Collaborative Tasks
Sarfaroz Yunusov, Kaige Chen, Kazi Nishat Anwar, Ali Emami
- VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search
Yiming Jia, Jiachen Li, Xiang Yue, Bo Li, Ping Nie, Kai Zou, Wenhu Chen
- Thinking Out Loud: Do Reasoning Models Know When They’re Right?
Qingcheng Zeng, Weihao Xuan, Leyang Cui, Rob Voigt
- Seeing is Believing, but How Much? A Comprehensive Analysis of Verbalized Calibration in Vision-Language Models
Weihao Xuan, Qingcheng Zeng, Heli Qi, Junjue Wang, Naoto Yokoya
- Enhancing Efficiency and Exploration in Reinforcement Learning for LLMs
Mengqi Liao, Xiangyu Xi, Chen Ruinian, Jia Leng, Yangen Hu, Ke Zeng, Shuai Liu, Huaiyu Wan
- LLM Bias Detection and Mitigation through the Lens of Desired Distributions
Ingroj Shrestha, Padmini Srinivasan
- MEBench: Benchmarking Large Language Models for Cross-Document Multi-Entity Question Answering
Teng LIN
- POSITION BIAS MITIGATES POSITION BIAS: Mitigate Position Bias Through Inter-Position Knowledge Distillation
Yifei Wang, Feng Xiong, Yong Wang, Linjing Li, Xiangxiang Chu, Daniel Dajun Zeng
- MMLU-ProX: A Multilingual Benchmark for Advanced Large Language Model Evaluation
Weihao Xuan, Rui Yang, Heli Qi, Qingcheng Zeng, Yunze Xiao, Aosong Feng, Dairui Liu, Yun Xing, Junjue Wang, Fan Gao, Jinghui Lu, Yuang Jiang, Huitao Li, Xin Li, Kunyu Yu, Ruihai Dong, Shangding Gu, Yuekang Li, Xiaofei Xie, Felix Juefei-Xu, Foutse Khomh, Osamu Yoshie, Qingyu Chen, Douglas Teodoro, Nan Liu, Randy Goebel, Lei Ma, Edison Marrese-Taylor, Shijian Lu, Yusuke Iwasawa, Yutaka Matsuo, Irene Li
- NL-Debugging: Exploiting Natural Language as an Intermediate Representation for Code Debugging
Weiming Zhang, Qingyao Li, Xinyi Dai, Jizheng Chen, Kounianhua Du, Weinan Zhang, Weiwen Liu, Yasheng Wang, Ruiming Tang, Yong Yu
- Persuasion Dynamics in LLMs: Investigating Robustness and Adaptability in Knowledge and Safety with DuET-PD
Bryan Chen Zhengyu Tan, Daniel Wai Kit Chin, Zhengyuan Liu, Nancy F. Chen, Roy Ka-Wei Lee
- POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion
Yuan Liu, Zhongyin Zhao, Le Tian, Haicheng Wang, Xubing Ye, Yangxiu You, Zilin Yu, Chuhan Wu, Zhou Xiao, Yang Yu, Jie Zhou
- Large Language Models for Automated Literature Review: An Evaluation of Reference Generation, Abstract Writing, and Review Composition
Xuemei Tang, Xufeng Duan, Zhenguang Cai
- CoBia: Constructed Conversations Can Trigger Otherwise Concealed Societal Biases in LLMs
Nafiseh Nikeghbal, Amir Hossein Kargaran, Jana Diesner
- From Schema to State: Zero-Shot Scheme-Only Dialogue State Tracking via Diverse Synthetic Dialogue and Step-by-Step Distillation
Huan Xu, Zequn Li, Wen Tang, Jian Jun Zhang
- Beyond the Surface: Measuring Self-Preference in LLM Judgments
Zhi-Yuan Chen, Hao Wang, Xinyu Zhang, Enrui Hu, Yankai Lin
- Beyond Input Activations: Identifying Influential Latents by Gradient Sparse Autoencoders
Dong Shu, Xuansheng Wu, Haiyan Zhao, Mengnan Du, Ninghao Liu
- Utility-Focused LLM Annotation for Retrieval and Retrieval-Augmented Generation
Hengran Zhang, Minghao Tang, Keping Bi, Jiafeng Guo, Shihao Liu, Daiting Shi, Dawei Yin, Xueqi Cheng
- CiteBART: Learning to Generate Citations for Local Citation Recommendation
Ege Yiğit Çelik, Selma Tekir
- Culture Cartography: Mapping the Landscape of Cultural Knowledge
Caleb Ziems, William Barr Held, Jane Yu, Amir Goldberg, David Grusky, Diyi Yang
- Interpretability Analysis of Arithmetic In-Context Learning in Large Language Models
Gregory Polyakov, Christian Hepting, Carsten Eickhoff, Seyed Ali Bahrainian
- SwarmAgentic: Towards Fully Automated Agentic System Generation via Swarm Intelligence
Yao Zhang, Chenyang Lin, Shijie Tang, Haokun Chen, Shijie Zhou, Yunpu Ma, Volker Tresp
- We Politely Insist: Your LLM Must Learn the Persian Art of Taarof
Nikta Gohari Sadr, Sahar Heidariasl, Karine Megerdoomian, Laleh Seyyed-Kalantari, Ali Emami
- Unstructured Evidence Attribution for Long Context Query Focused Summarization
Dustin Wright, Zain Muhammad Mujahid, Lu Wang, Isabelle Augenstein, David Jurgens
- RAVEN: Query-Guided Representation Alignment for Question Answering over Audio, Video, Embedded Sensors, and Natural Language
Subrata Biswas, Mohammad Nur Hossain Khan, Bashima Islam
- Cache-of-Thought: Master-Apprentice Framework for Cost-Effective Vision Language Model Reasoning
Mingyuan Wu, Jize Jiang, Haozhen Zheng, Meitang Li, Zhaoheng Li, Beitong Tian, Bo Chen, Yongjoo Park, Minjia Zhang, ChengXiang Zhai, Klara Nahrstedt
- Video Compression Commander: Plug-and-Play Inference Acceleration for Video Large Language Models
Xuyang Liu, Yiyu Wang, Junpeng Ma, Linfeng Zhang
- Router-Tuning: A Simple and Effective Approach for Dynamic Depth
Shwai He, Tao Ge, Guoheng Sun, Bowei Tian, Xiaoyang Wang, Dong Yu
- Foot-In-The-Door: A Multi-turn Jailbreak for LLMs
Zixuan Weng, Xiaolong Jin, Jinyuan Jia, Xiangyu Zhang
- TurnaboutLLM: A Deductive Reasoning Benchmark from Detective Games
Yuan Yuan, Muyu He, Muhammad Adil Shahid, Ziyang Li, Jiani Huang, Li Zhang
- Transferable Direct Prompt Injection via Activation-Guided MCMC Sampling
Minghui Li, Hao Zhang, Yechao Zhang, Wei Wan, Shengshan Hu, Pei Xiaobing, Jing Wang
- Direct Judgement Preference Optimization
PeiFeng Wang, Austin Xu, Yilun Zhou, Caiming Xiong, Shafiq Joty
- WebInject: Prompt Injection Attack to Web Agents
Xilong Wang, John Bloch, Zedian Shao, Yuepeng Hu, Shuyan Zhou, Neil Zhenqiang Gong
- F²Bench: An Open-ended Fairness Evaluation Benchmark for LLMs with Factuality Considerations
Tian Lan, Jiang Li, Yemin Wang, Xu Liu, Xiangdong Su, Guanglai Gao
- Value Profiles for Encoding Human Variation
Taylor Sorensen, Pushkar Mishra, Roma Patel, Michael Henry Tessler, Michiel A. Bakker, Georgina Evans, Iason Gabriel, Noah Goodman, Verena Rieser
- Language Models as Causal Effect Generators
Lucius E.J. Bynum, Kyunghyun Cho
- Constructions are Revealed in Word Distributions
Joshua Rozner, Leonie Weissweiler, Kyle Mahowald, Cory Shain
- CodeMixBench: Evaluating Code-Mixing Capabilities of LLMs Across 18 Languages
Yilun Yang, Yekun Chai
- RBPtool: A Deep Language Model Framework for Multi-Resolution RBP-RNA Binding Prediction and RNA Molecule Design
Jiyue Jiang, Yitao Xu, Zikang Wang, Yihan Ye, Yanruisheng Shao, Yuheng Shan, Jiuming Wang, Xiaodan Fan, Jiao Yuan, Yu Li
- Unveiling Internal Reasoning Modes in LLMs: A Deep Dive into Latent Reasoning vs. Factual Shortcuts with Attribute Rate Ratio
Yiran Yang, Haifeng Sun, Jingyu Wang, Qi Qi, Zirui Zhuang, Huazheng Wang, Pengfei Ren, Jing Wang, Jianxin Liao
- SAE-SSV: Supervised Steering in Sparse Representation Spaces for Reliable Control of Language Models
Zirui He, Mingyu Jin, Bo Shen, Ali Payani, Yongfeng Zhang, Mengnan Du
- BabyLM’s First Constructions: Causal interventions provide a signal of learning
Joshua Rozner, Leonie Weissweiler, Cory Shain
- Effective Red-Teaming of Policy-Adherent Agents
Itay Nakash, George Kour, Koren Lazar, Matan Vetzler, Guy Uziel, Ateret Anaby Tavor
- CondAmbigQA: A Benchmark and Dataset for Conditional Ambiguous Question Answering
Zongxi Li, Yang Li, Haoran Xie, S. Joe Qin
- SafeScientist: Enhancing AI Scientist Safety for Risk-Aware Scientific Discovery
Kunlun Zhu, Jiaxun Zhang, Ziheng Qi, Nuoxing Shang, Zijia Liu, Peixuan Han, Yue Su, Haofei Yu, Jiaxuan You
- Improving Informally Romanized Language Identification
Adrian Benton, Alexander Gutkin, Christo Kirov, Brian Roark
- Integral Transformer: Denoising Attention, Not Too Much Not Too Little
Ivan Kobyzev, Abbas Ghaddar, Dingtao Hu, Boxing Chen
- CHENGYU-BENCH: Benchmarking Large Language Models for Chinese Idiom Understanding and Use
Yicheng Fu, Zhemin Huang, Liuxin Yang, Yumeng Lu, Zhongdongming Dai
- Improving Cross Lingual Transfer by Pretraining with Active Forgetting
Divyanshu Aggarwal, Ashutosh Sathe, Sunayana Sitaram
- Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization
Shuo Xing, Peiran Li, Yuping Wang, Ruizheng Bai, Yueqi Wang, Chan-Wei Hu, Chengxuan Qian, Huaxiu Yao, Zhengzhong Tu
- To Mask or to Mirror: Human-AI Alignment in Collective Reasoning
Crystal Qian, Aaron T Parisi, Clémentine Bouleau, Vivian Tsai, Maël Lebreton, Lucas Dixon
- SWAN: An Efficient and Scalable Approach for Long-Context Language Modeling
Krishna C Puvvada, Faisal Ladhak, Santiago Akle Serano, Cheng-Ping Hsieh, Shantanu Acharya, Somshubra Majumdar, Fei Jia, Samuel Kriman, Simeng Sun, Dima Rekesh, Boris Ginsburg
- LLMs Behind the Scenes: Enabling Narrative Scene Illustration
Melissa Roemmele, John Joon Young Chung, Taewook Kim, Yuqian Sun, Alex Calderwood, Max Kreminski
- REARANK: Reasoning Re-ranking Agent via Reinforcement Learning
Le Zhang, Bo Wang, Xipeng Qiu, Siva Reddy, Aishwarya Agrawal
- Large Language Models Do Multi-Label Classification Differently
Marcus Ma, Georgios Chochlakis, Niyantha Maruthu Pandiyan, Jesse Thomason, Shrikanth Narayanan
- FilBench: Can LLMs Understand and Generate Filipino?
Lester James Validad Miranda, Elyanah Aco, Conner G. Manuel, Jan Christian Blaise Cruz, Joseph Marvin Imperial
- M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis
ChengYan Wu, Bolei Ma, Yihong Liu, Zheyu Zhang, Ningyuan Deng, Yanshu Li, Baolan Chen, Yi Zhang, Yun Xue, Barbara Plank
- RuCCoD: Towards Automated ICD Coding in Russian
Alexandr Nesterov, Andrey Sakhovskiy, Ivan Sviridov, Airat Valiev, Vladimir Makharev, Petr Anokhin, Galina Zubkova, Elena Tutubalina
- Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs
Dayu Yang, Tianyang Liu, Daoan Zhang, Antoine Simoulin, Xiaoyi Liu, Yuwei Cao, Zhaopu Teng, Xin Qian, Grey Yang, Jiebo Luo, Julian McAuley
- Efficient Model Development through Fine-tuning Transfer
Pin-Jie Lin, Rishab Balasubramanian, Fengyuan Liu, Nikhil Kandpal, Tu Vu
- Language Mixing in Reasoning Language Models: Patterns, Impact, and Internal Causes
Mingyang Wang, Lukas Lange, Heike Adel, Yunpu Ma, Jannik Strötgen, Hinrich Schuetze
- User Feedback in Human-LLM Dialogues: A Lens to Understand Users But Noisy as a Learning Signal
Yuhan Liu, Michael JQ Zhang, Eunsol Choi
- Read to Hear: A Zero-Shot Pronunciation Assessment Using Textual Descriptions and LLMs
Yu-Wen Chen, Melody Ma, Julia Hirschberg
- COCO-Tree: Compositional Hierarchical Concept Trees for Enhanced Reasoning in Vision-Language Models
Sanchit Sinha, Guangzhi Xiong, Aidong Zhang
- SurveyGen: Quality-Aware Scientific Survey Generation with Large Language Models
Tong Bao, Mir Tafseer Nayeem, Davood Rafiei, Chengzhi Zhang
- VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing
Zhisheng Zheng, Puyuan Peng, Anuj Diwan, Cong Phuoc Huynh, Xiaohang Sun, Zhu Liu, Vimal Bhat, David Harwath
- From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
Dawei Li, Bohan Jiang, Liangjie Huang, Alimohammad Beigi, Chengshuai Zhao, Zhen Tan, Amrita Bhattacharjee, Yuxuan Jiang, Canyu Chen, Tianhao Wu, Kai Shu, Lu Cheng, Huan Liu
- MultiMatch: Multihead Consistency Regularization Matching for Semi-Supervised Text Classification
Iustin Sirbu, Robert-Adrian Popovici, Cornelia Caragea, Stefan Trausan-Matu, Traian Rebedea
- TTT-Bench: A Benchmark for Evaluating Reasoning Ability with Simple and Novel Tic-Tac-Toe-style Games
Prakamya Mishra, Jiang Liu, Jialian Wu, Xiaodong Yu, Zicheng Liu, Emad Barsoum
- Learning from Diverse Reasoning Paths with Routing and Collaboration
Zhenyu Lei, Zhen Tan, Song Wang, Yaochen Zhu, Zihan Chen, Yushun Dong, Jundong Li
- Ask Patients with Patience: Enabling LLMs for Human-Centric Medical Dialogue with Grounded Reasoning
Jiayuan Zhu, Jiazhen Pan, Yuyuan Liu, Fenglin Liu, Junde Wu
- MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models
Shrey Pandit, Jiawei Xu, Junyuan Hong, Zhangyang Wang, Tianlong Chen, Kaidi Xu, Ying Ding
- NUTMEG: Separating Signal From Noise in Annotator Disagreement
Jonathan Ivey, Susan Gauch, David Jurgens
- Alignment Quality Index (AQI): Beyond Refusals: AQI as an Intrinsic Alignment Diagnostic via Latent Geometry, Cluster Divergence, and Layer wise Pooled Representations
Abhilekh Borah, Chhavi Sharma, Danush Khanna, Utkarsh Bhatt, Gurpreet Singh, Hasnat Md Abdullah, Raghav Kaushik Ravi, Vinija Jain, Jyoti Patel, Shubham Singh, Vasu Sharma, Arpita Vats, Rahul Raja, Aman Chadha, Amitava Das
- MythTriage: Scalable Detection of Opioid Use Disorder Myths on a Video-Sharing Platform
Hayoung Jung, Shravika Mittal, Ananya Aatreya, Navreet Kaur, Munmun De Choudhury, Tanu Mitra
- Demystifying optimized prompts in language models
Rimon Melamed, Lucas Hurley McCabe, H Howie Huang
- Whisper-UT: A Unified Translation Framework for Speech and Text
Cihan Xiao, Matthew Wiesner, Debashish Chakraborty, Reno Kriz, Keith Cunningham, Kenton Murray, Kevin Duh, Luis Tavarez-Arce, Paul McNamee, Sanjeev Khudanpur
- Unleashing the Reasoning Potential of LLMs by Critique Fine-Tuning on One Problem
Yubo Wang, Ping Nie, Kai Zou, Lijun Wu, Wenhu Chen
- Active Layer-Contrastive Decoding Reduces Hallucination in Large Language Model Generation
Hongxiang Zhang, Hao Chen, Muhao Chen, Tianyi Zhang
- BBScoreV2: Learning Time-Evolution and Latent Alignment from Stochastic Representation
Tianhao Zhang, Zhecheng Sheng, Zhexiao Lin, Chen Jiang, Dongyeop Kang
- SAND: Boosting LLM Agents with Self-Taught Action Deliberation
Yu Xia, Yiran Jenny Shen, Junda Wu, Tong Yu, Sungchul Kim, Ryan A. Rossi, Lina Yao, Julian McAuley
- LLMs as World Models: Data-Driven and Human-Centered Pre-Event Simulation for Disaster Impact Assessment
Lingyao Li, Dawei Li, Zhenhui Ou, Xiaoran Xu, Jingxiao Liu, Zihui Ma, Runlong Yu, Min Deng
- Two Heads Are Better Than One: Dual-Model Verbal Reflection at Inference-Time
Jiazheng Li, Yuxiang Zhou, Junru Lu, Gladys Tyen, Lin Gui, Cesare Aloisi, Yulan He
- Image Embedding Sampling Method for Diverse Captioning
Sania Waheed, Na Min An
- Diagnosing Memorization in Chain-of-Thought Reasoning, One Token at a Time
Huihan Li, You Chen, Siyuan Wang, Yixin He, Ninareh Mehrabi, Rahul Gupta, Xiang Ren
- FANS: Formal Answer Selection for LLM Natural Language Math Reasoning Using Lean4
Jiarui Yao, Ruida WANG, Tong Zhang
- Date Fragments: A Hidden Bottleneck of Tokenization for Temporal Reasoning
Gagan Bhatia, Maxime Peyrard, Wei Zhao
- Measuring Risk of Bias in Biomedical Reports: The RoBBR Benchmark
Jianyou Wang, Weili Cao, Longtian Bao, Youze Zheng, Gil Pasternak, Kaicheng Wang, Xiaoyue Wang, Ramamohan Paturi, Leon Bergen
- SHIFT: Selected Helpful Informative Frame for Video-guided Machine Translation
Boyu Guan, Chuang Han, Yining Zhang, Yupu Liang, Zhiyang Zhang, Yang Zhao, Chengqing Zong
- Surge: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors
Bohan Lyu, Siqiao Huang, Zichen Liang, Qian Sun, Jiaming Zhang
- Few-Shot Learning Translation from New Languages
Carlos Mullov, Alexander Waibel
- Humanizing Machines: Rethinking LLM Anthropomorphism Through a Multi-Level Framework of Design
Yunze Xiao, Lynnette Hui Xian Ng, Jiarui Liu, Mona T. Diab
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs
Heming Xia, Chak Tou Leong, Wenjie Wang, Yongqi Li, Wenjie Li
- Are Generative Models Underconfident? Better Quality Estimation with Boosted Model Probability
Tu Anh Dinh, Jan Niehues
- reWordBench: Benchmarking and Improving the Robustness of Reward Models with Transformed Inputs
Zhaofeng Wu, Michihiro Yasunaga, Andrew Cohen, Yoon Kim, Asli Celikyilmaz, Marjan Ghazvininejad
- Why Do Some Inputs Break Low-Bit LLM Quantization?
Ting-Yun Chang, Muru Zhang, Jesse Thomason, Robin Jia
- LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation
Keisuke Kamahori, Jungo Kasai, Noriyuki Kojima, Baris Kasikci
- AROMA: Autonomous Rank-one Matrix Adaptation
Hao Nan SHENG, Zhi-Yong Wang, Hing Cheung So, Mingrui Yang
- Large Language Models Have Intrinsic Meta-Cognition, but Need a Good Lens
Ziyang Ma, Qingyue Yuan, Zhenglin Wang, Deyu Zhou
- Anchoring-Guidance Fine-Tuning (AnGFT): Elevating Professional Response Quality in Role-Playing Conversational Agents
Qibin Li, Zhen Xu, Shengyuan Bai, Nianmin Yao, Kaili Sun, Ying Li, Baoxun Wang, Bowen Wu
- RiTTA: Modeling Event Relations in Text-to-Audio Generation
Yuhang He, Yash Jain, Xubo Liu, Andrew Markham, Vibhav Vineet
- Shallow Focus, Deep Fixes: Enhancing Shallow Layers Vision Attention Sinks to Alleviate Hallucination in LVLMs
Xiaofeng Zhang, Yihao Quan, Chen Shen, Chaochen Gu, Xiaosong Yuan, Shaotian Yan, Jiawei Cao, Hao Cheng, Kaijie Wu, Jieping Ye
- WangchanThaiInstruct: An instruction-following Dataset for Culture-Aware, Multitask, and Multi-domain Evaluation in Thai
Peerat Limkonchotiwat, Pume Tuchinda, Lalita Lowphansirikul, Surapon Nonesung, Panuthep Tasawong, Alham Fikri Aji, Can Udomcharoenchaikit, Sarana Nutanong
- MemeReaCon: Probing Contextual Meme Understanding in Large Vision-Language Models
Zhengyi Zhao, Shubo Zhang, Yuxi Zhang, Yanxi Zhao, Yifan Zhang, Zezhong WANG, Huimin WANG, Yutian Zhao, Bin Liang, Yefeng Zheng, Binyang Li, Kam-Fai Wong, Xian Wu
- A Comprehensive Literary Chinese Reading Comprehension Dataset with an Evidence Curation Based Solution
Dongning Rao, Rongchu Zhou, Peng Chen, Zhihua Jiang
- Dialect-SQL: An Adaptive Framework for Bridging the Dialect Gap in Text-to-SQL
Jie Shi, Xi Cao, Bo Xu, Jiaqing Liang, Yanghua Xiao, Jia Chen, Peng Wang, Wei Wang
- FinMTEB: Finance Massive Text Embedding Benchmark
Yixuan Tang, Yi Yang
- Scaling Rich Style-Prompted Text-to-Speech Datasets
Anuj Diwan, Zhisheng Zheng, David Harwath, Eunsol Choi
- Exploring Changes in Nation Perception with Nationality-Assigned Personas in LLMs
Mahammed Kamruzzaman, Gene Louis Kim
- Eliciting Implicit Acoustic Styles from Open-domain Instructions to Facilitate Fine-grained Controllable Generation of Speech
Jianxing Yu, Gou Zihao, Chen Li, Zhisheng Wang, Peiji Yang, Wenqing Chen, Jian Yin
- OBLIVIATE: Robust and Practical Machine Unlearning for Large Language Models
Xiaoyu Xu, Minxin Du, Qingqing Ye, Haibo Hu
- AdaptThink: Reasoning Models Can Learn When to Think
Jiajie Zhang, Nianyi Lin, Lei Hou, Ling Feng, Juanzi Li
- T$^2$: An Adaptive Test-Time Scaling Strategy for Contextual Question Answering
Zhengyi Zhao, Shubo Zhang, Zezhong WANG, Huimin WANG, Yutian Zhao, Bin Liang, Yefeng Zheng, Binyang Li, Kam-Fai Wong, Xian Wu
- Non-Existent Relationship: Fact-Aware Multi-Level Machine-Generated Text Detection
Yang Wu, Ruijia Wang, Jie Wu
- Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations
Ziwei Ji, Lei Yu, Yeskendir Koishekenov, Yejin Bang, Anthony Hartshorn, Alan Schelten, Cheng Zhang, Pascale Fung, Nicola Cancedda
- JUREX-4E: Juridical Expert-Annotated Four-Element Knowledge Base for Legal Reasoning
Huanghai Liu, Quzhe Huang, Qingjing Chen, Yiran HU, Jiayu Ma, Yun Liu, Weixing Shen, Yansong Feng
- CIE: Controlling Language Model Text Generations Using Continuous Signals
Vinay Samuel, Harshita Diddee, Yiming Zhang, Daphne Ippolito
- Stand on The Shoulders of Giants: Building JailExpert from Previous Attack Experience
Xi Wang, Songlei Jian, Shasha Li, Xiaopeng Li, Bin Ji, Ma Jun, Xiaodong Liu, Jing Wang, Jianfeng Zhang, Jie Yu, Feilong Bao, Wang Baosheng
- Language-to-Space Programming for Training-Free 3D Visual Grounding
Boyu Mi, Hanqing Wang, Tai Wang, Yilun Chen, Jiangmiao Pang
- RAG-Instruct: Boosting LLMs with Diverse Retrieval-Augmented Instructions
Wanlong Liu, Junying Chen, Ke Ji, Li Zhou, Wenyu Chen, Benyou Wang
- AdaRewriter: Unleashing the Power of Prompting-based Conversational Query Reformulation via Test-Time Adaptation
Yilong Lai, Jialong Wu, Zhenglin Wang, Deyu Zhou
- SmartBench: Is Your LLM Truly a Good Chinese Smartphone Assistant?
Xudong Lu, Haohao Gao, Renshou Wu, Shuai Ren, Xiaoxin Chen, Hongsheng Li, Fangyuan Li
- F2TEval: Human-Aligned Multi-Dimensional Evaluation for Figure-to-Text Task
Tan Yue, Rui Mao, Zilong Song, Zonghai Hu, Dongyan Zhao
- Icon$^2$: Aligning Large Language Models Using Self-Synthetic Preference Data via Inherent Regulation
Qiyuan Chen, Hongsen Huang, Qian Shao, Jiahe Chen, Jintai Chen, Hongxia Xu, Renjie Hua, Ren Chuan, Jian Wu
- DSCD: Large Language Model Detoxification with Self-Constrained Decoding
Ming Dong, Jinkui Zhang, Bolong Zheng, Xinhui Tu, Po Hu, Tingting He
- From Reasoning to Answer: Empirical, Attention-Based and Mechanistic Insights into Distilled DeepSeek R1 Models
Jue Zhang, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang
- Quantifying Language Disparities in Multilingual Large Language Models
Songbo Hu, Ivan Vulić, Anna Korhonen
- KoBLEX: Open Legal Question Answering with Multi-hop Reasoning
Jihyung Lee, DaeHee Kim, Seonjeong Hwang, Hyounghun Kim, Gary Lee
- End-to-End Learnable Psychiatric Scale Guided Risky Post Screening for Depression Detection on Social Media
Bichen Wang, Yuzhe Zi, Yixin Sun, Hao Yang, Yanyan Zhao, Bing Qin
- ReAgent: Reversible Multi-Agent Reasoning for Knowledge-Enhanced Multi-Hop QA
Zhao Xinjie, Fan Gao, Xingyu Song, Yingjian Chen, Rui Yang, Yanran Fu, Yuyang Wang, Yusuke Iwasawa, Yutaka Matsuo, Irene Li
- Matter-of-Fact: A Benchmark for Verifying the Feasibility of Literature-Supported Claims in Materials Science
Peter Jansen, Samiah Hassan, Ruoyao Wang
- ModRWKV: Transformer Multimodality in Linear Time
Jiale Kang, Ziyin Yue, Qingyu Yin, Rui Jiang, Weile Li, Zening Lu, Zhouran Ji
- Multimedia Event Extraction with LLM Knowledge Editing
Jiaao Yu, Yijing Lin, Zhipeng Gao, Xuesong Qiu, Lanlan Rui
- Exploring the Impact of Personality Traits on LLM Toxicity and Bias
Shuo Wang, Renhao Li, Xi Chen, Yulin Yuan, Min Yang, Derek F. Wong
- Task-aware Contrastive Mixture of Experts for Quadruple Extraction in Conversations with Code-like Replies and Non-opinion Detection
Chenyuan He, Yuxiang Jia, Fei Gao, Senbin Zhu, Hongde Liu, Hongying Zan, Min Peng
- Mitigating Biases in Language Models via Bias Unlearning
Dianqing Liu, Yi Liu, Guoqing Jin, Zhendong Mao
- UNComp: Can Matrix Entropy Uncover Sparsity? — A Compressor Design from an Uncertainty-Aware Perspective
Jing Xiong, Jianghan Shen, Fanghua Ye, Chaofan Tao, Zhongwei Wan, Jianqiao Lu, Xun Wu, Chuanyang Zheng, Zhijiang Guo, Min Yang, Lingpeng Kong, Ngai Wong
- Superpose Task-specific Features for Model Merging
Haiquan Qiu, You Wu, Dong Li, Jianmin Guo, Quanming Yao
- FinRAGBench-V: A Benchmark for Multimodal RAG with Visual Citation in the Financial Domain
Zhao Suifeng, Zhuoran Jin, Sujian Li, Jun Gao
- BacktrackAgent: Enhancing GUI Agent with Error Detection and Backtracking Mechanism
Qinzhuo Wu, Pengzhi Gao, Wei Liu, Jian Luan
- Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
Siyue Zhang, Yilun Zhao, Liyuan Geng, Arman Cohan, Anh Tuan Luu, Chen Zhao
- BannerAgency: Advertising Banner Design with Multimodal LLM Agents
Heng Wang, Yotaro Shimose, Shingo Takamatsu
- DIDS: Domain Impact-aware Data Sampling for Large Language Model Training
Weijie Shi, Jipeng Zhang, Yaguang Wu, Jingzhi Fang, Shibo Zhang, Yao Zhao, Hao Chen, Ruiyuan Zhang, Yue Cui, Jia Zhu, Sirui Han, Jiajie Xu, Xiaofang Zhou
- Training LLMs to be Better Text Embedders through Bidirectional Reconstruction
Chang Su, Dengliang Shi, Siyuan Huang, Jintao Du, Changhua Meng, Yu Cheng, Weiqiang Wang, Zhouhan Lin
- ReMedy: Learning Machine Translation Evaluation from Human Preferences with Reward Modeling
Shaomu Tan, Christof Monz
- SolEval: Benchmarking Large Language Models for Repository-level Solidity Smart Contract Generation
Zhiyuan Peng, Xin Yin, Rui Qian, Peiqin Lin, YongKang Liu, Hao Zhang, Chenhao Ying, Yuan Luo
- In-Context Learning Boosts Speech Recognition via Human-like Adaptation to Speakers and Language Varieties
Nathan Roll, Calbert Graham, Yuka Tatsumi, Kim Tien Nguyen, Meghan Sumner, Dan Jurafsky
- Reasoning Model Unlearning: Forgetting Traces, Not Just Answers, While Preserving Reasoning Skills
Changsheng Wang, Chongyu Fan, Yihua Zhang, Jinghan Jia, Dennis Wei, Parikshit Ram, Nathalie Baracaldo, Sijia Liu
- Chain-of-Talkers (CoTalk): Fast Human Annotation of Dense Image Captions
Yijun Shen, Delong Chen, Fan Liu, Xingyu Wang, Chuanyi Zhang, Liang Yao, Yuhui Zheng
- DecoupleSearch: Decouple Planning and Search via Hierarchical Reward Modeling
Hao Sun, Zile Qiao, Bo Wang, Guoxin Chen, Yingyan Hou, Yong Jiang, Pengjun Xie, Fei Huang, Yan Zhang
- RewardDS: Privacy-Preserving Fine-Tuning for Large Language Models via Reward Driven Data Synthesis
Jianwei Wang, Chengming Shi, Junyao Yang, Haoran Li, Qianli Ma, Huiping Zhuang, Cen Chen, Ziqian Zeng
- Synergizing Multimodal Temporal Knowledge Graphs and Large Language Models for Social Relation Recognition
Haorui Wang, Zheng Wang, Yuxuan Zhang, Bo Wang, Bin Wu
- LegalSearchLM: Rethinking Legal Case Retrieval as Legal Elements Generation
Chaeeun Kim, Jinu Lee, Wonseok Hwang
- ChartMind: A Comprehensive Benchmark for Complex Real-world Multimodal Chart Question Answering
Jingxuan Wei, Nan Xu, Junnan Zhu, haoyanni, Gaowei Wu, Qi Chen, Bihui Yu, Lei Wang
- COLA: Collaborative Multi-Agent Framework with Dynamic Task Scheduling for GUI Automation
Di Zhao, Longhui Ma, Siwei Wang, Miao Wang, Zhao Lv
- DASA-Trans-STM: Adaptive Efficient Transformer for Short Text Matching using Data Augmentation and Semantic Awareness
Jiguo Liu, Chao Liu, Meimei Li, Nan Li, Shihao Gao, Dali Zhu
- Pruning the Paradox: How CLIP’s Most Informative Heads Enhance Performance While Amplifying Bias
Avinash Madasu, Vasudev Lal, Phillip Howard
- CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation
Ziyue Liu, Ruijie ZHANG, Zhengyang Wang, Zi Yang, Paul D. Hovland, Bogdan Nicolae, Franck Cappello, Zheng Zhang
- TS-CLIP: Time Series Understanding by CLIP
Ziwen Chen, Xiaoyuan Zhang, Ming Zhu
- MultiAgentESC: A LLM-based Multi-Agent Collaboration Framework for Emotional Support Conversation
Yangyang Xu, Jinpeng Hu, Zhuoer Zhao, Zhangling Duan, Xiao Sun, Xun Yang
- Continuously Steering LLMs Sensitivity to Contextual Knowledge with Proxy Models
Yilin Wang, Heng Wang, Yuyang Bai, Minnan Luo
- Probing LLM World Models: Enhancing Guesstimation with Wisdom of Crowds Decoding
Yun-Shiuan Chuang, Sameer Narendran, Nikunj Harlalka, Alexander Cheung, Sizhe Gao, Siddharth Suresh, Junjie Hu, Timothy T. Rogers
- Recall with Reasoning: Chain-of-Thought Distillation for Mamba’s Long-Context Memory and Extrapolation
Jun-Yu Ma, Tianqing Fang, Zhisong Zhang, Hongming Zhang, Haitao Mi, Dong Yu
- Scalable Data Synthesis through Human-like Cognitive Imitation and Data Recombination
Zhongyi Ye, Weitai Zhang, Xinyuan Zhou, Yongxin Zhu, Ninghui Rao, Enhong Chen
- BeSimulator: A Large Language Model Powered Text-based Behavior Simulator
Jianan Wang, Bin Li, Jingtao Qi, Xueying Wang, Fu Li, Lihanxun
- Too Consistent to Detect: A Study of Self-Consistent Errors in LLMs
Hexiang Tan, Fei Sun, Sha Liu, Du Su, Qi Cao, Xin Chen, Jingang Wang, Xunliang Cai, Yuanzhuo Wang, Huawei Shen, Xueqi Cheng
- pFedGPT: Hierarchically Optimizing LoRA Aggregation Weights for Personalized Federated GPT Models
Zhanming Shen, Tianqi Xu, Hao Wang, Jian Li, Miao Pan
- QSpec: Speculative Decoding with Complementary Quantization Schemes
Juntao Zhao, Wenhao Lu, Sheng Wang, Lingpeng Kong, Chuan Wu
- Co-Evolving LLMs and Embedding Models via Density-Guided Preference Optimization for Text Clustering
Zetong Li, Qinliang Su, Minhua Huang, Yin Yang
- P-MMEval: A Parallel Multilingual Multitask Benchmark for Consistent Evaluation of LLMs
Yidan Zhang, Yu Wan, Boyi Deng, Baosong Yang, Hao-Ran Wei, Fei Huang, Bowen Yu, Dayiheng Liu, Junyang Lin, Fei Huang, Jingren Zhou
- Single LLM, Multiple Roles: A Unified Retrieval-Augmented Generation Framework Using Role-Specific Token Optimization
Yutao Zhu, Jiajie Jin, Hongjin Qian, Zheng Liu, Zhicheng Dou, Ji-Rong Wen
- TrInk: Ink Generation with Transformer Network
Zezhong Jin, Shubhang Desai, Xu Chen, Biyi Fang, Zhuoyi Huang, Zhe LI, Chong-Xin Gan, Xiao Tu, Man-Wai Mak, Yan Lu, Shujie LIU
- CalligraphicOCR for Chinese Calligraphy Recognition
Xiaoyi Bao, Zhongqing Wang, Jinghang Gu, Chu-Ren Huang
- When Audio and Text Disagree: Revealing Text Bias in Large Audio-Language Models
Cheng Wang, Gelei Deng, Xianglin Yang, Han Qiu, Tianwei Zhang
- RESF: Regularized-Entropy-Sensitive Fingerprinting for Black-Box Tamper Detection of Large Language Models
Pingyi Hu, Xiaofan Bai, Xiaojing Ma, Chaoxiang He, Dongmei Zhang, Bin Benjamin Zhu
- Model-based Large Language Model Customization as Service
Zhaomin Wu, Jizhou Guo, Junyi Hou, Bingsheng He, Lixin Fan, Qiang Yang
- Collab-Overcooked: Benchmarking and Evaluating Large Language Models as Collaborative Agents
Haochen Sun, Shuwen Zhang, Lujie Niu, Lei Ren, Hao Xu, Hao Fu, Fangkun Zhao, Caixia Yuan, Xiaojie Wang
- Improving Reasoning Capabilities in Small Models through Mixture-of-layers Distillation with Stepwise Attention on Key Information
Yao Chen, Jiawei Sheng, Wenyuan Zhang, Tingwen Liu
- Through the Valley: Path to Effective Long CoT Training for Small Language Models
Renjie Luo, Jiaxi Li, Chen Huang, Wei Lu
- RED: Unleashing Token-Level Rewards from Holistic Feedback via Reward Redistribution
Jiahui Li, Lin Li, Tai-Wei Chang, Kun Kuang, Long Chen, Jun Zhou, Cheng Yang
- SDGO: Self-Discrimination-Guided Optimization for Consistent Safety in Large Language Models
Peng Ding, Wen Sun, Dailin Li, Wei Zou, Jiaming Wang, Jiajun Chen, Shujian Huang
- InMind: Evaluating LLMs in Capturing and Applying Individual Human Reasoning Styles
Zizhen Li, Chuanhao Li, Yibin Wang, Qi Chen, Diping Song, Yukang Feng, Jianwen Sun, Jiaxin Ai, Fanrui Zhang, Mingzhu Sun, Kaipeng Zhang
- MIO: A Foundation Model on Multimodal Tokens
Zekun Moore Wang, King Zhu, Chunpu Xu, Wangchunshu Zhou, Jiaheng Liu, Yibo Zhang, Jiashuo WANG, Ning Shi, Siyu Li, Yizhi LI, Haoran Que, Zhaoxiang Zhang, Yuanxing Zhang, Ge Zhang, Ke Xu, Jie Fu, Wenhao Huang
- DART: Distilling Autoregressive Reasoning to Silent Thought
Nan Jiang, Ziming Wu, De-Chuan Zhan, Fuming Lai, Shaobing Lian
- LeTS: Learning to Think-and-Search via Process-and-Outcome Reward Hybridization
Qi Zhang, Shouqing Yang, Lirong Gao, Hao Chen, Xiaomeng Hu, Jinglei Chen, Jiexiang Wang, Sheng Guo, Bo Zheng, Haobo Wang, Junbo Zhao
- CYCLE-INSTRUCT: Fully Seed-Free Instruction Tuning via Dual Self-Training and Cycle Consistency
Zhanming Shen, Hao Chen, Yulei Tang, Shaolin Zhu, Wentao Ye, Xiaomeng Hu, Haobo Wang, Gang Chen, Junbo Zhao
- Good Intentions Beyond ACL: Who Does NLP for Social Good, and Where?
Grace LeFevre, Qingcheng Zeng, Adam Leif, Jason Jewell, Denis Peskoff, Rob Voigt
- From General Reward to Targeted Reward: Improving Open-ended Long-context Generation Models
Zhihan Guo, Jiele Wu, Wenqian Cui, Yifei Zhang, Minda Hu, Yufei Wang, Irwin King
- Think in Safety: Unveiling and Mitigating Safety Alignment Collapse in Multimodal Large Reasoning Model
Xinyue Lou, You Li, Jinan Xu, Xiangyu Shi, Chi Chen, Kaiyu Huang
- Understanding the Modality Gap: An Empirical Study on the Speech-Text Alignment Mechanism of Large Speech Language Models
Bajian Xiang, Shuaijiang Zhao, Tingwei Guo, Wei Zou
- AssoCiAm: A Benchmark for Evaluating Association Thinking while Circumventing Ambiguity
Yifan Liu, Wenkuan Zhao, Shanshan Zhong, Jinghui Qin, Mingfu Liang, Zhongzhan Huang, Wushao Wen
- M-BRe: Discovering Training Samples for Relation Extraction from Unlabeled Texts with Large Language Models
Zexuan Li, Hongliang Dai, Piji Li
- R-TOFU: Unlearning in Large Reasoning Models
Sangyeon Yoon, Wonje Jeung, Albert No
- Chat-Driven Text Generation and Interaction for Person Retrieval
Zequn Xie, Chuxin Wang, Yeqiang Wang, Sihang Cai, Shulei Wang, Tao Jin
- Spontaneous Giving and Calculated Greed in Language Models
Yuxuan Li, Hirokazu Shirado
- SenDetEX: Sentence-Level AI-Generated Text Detection for Human-AI Hybrid Content via Style and Context Fusion
Lei Jiang, Desheng Wu, Xiaolong Zheng
- Judge and Improve: Towards a Better Reasoning of Knowledge Graphs with Large Language Models
Mo Zhiqiang, yanghua, Jiahui Li, Yuan Liu, Shawn Wong, Jianmin Huang
- Add-One-In: Incremental Sample Selection for Large Language Models via a Choice-Based Greedy Paradigm
Zhuo Li, Yuhao Du, Xiaoqi Jiao, Steven Y. Guo, Yuege Feng, Xiang Wan, Anningzhe Gao, Jinpeng Hu
- QuZO: Quantized Zeroth-Order Fine-Tuning for Large Language Models
Jiajun Zhou, Yifan Yang, Kai Zhen, Ziyue Liu, Yequan Zhao, Ershad Banijamali, Athanasios Mouchtaris, Ngai Wong, Zheng Zhang
- Cost-Optimal Grouped-Query Attention for Long-Context Modeling
Yingfa Chen, Yutong Wu, Chenyang Song, Zhen Leng Thai, Xingyu Shen, Xu Han, Zhiyuan Liu, Maosong Sun
- ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model
Zhongyi Zhou, Yichen Zhu, Minjie Zhu, Junjie Wen, Ning Liu, Zhiyuan Xu, Weibin Meng, Ran Cheng, Yaxin Peng, Chaomin Shen, Feifei Feng
- KG-RAG: Enhancing GUI Agent Decision-Making via Knowledge Graph-Driven Retrieval-Augmented Generation
Ziyi Guan, Jason Chun Lok Li, Zhijian Hou, Pingping Zhang, Donglai Xu, Yuzhi Zhao, Mengyang Wu, Jinpeng Chen, Thanh-Toan Nguyen, Pengfei Xian, Wenao MA, Shengchao Qin, Graziano Chesi, Ngai Wong
- CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling
Jihai Zhang, Xiaoye Qu, Tong Zhu, Yu Cheng
- Search-o1: Agentic Search-Enhanced Large Reasoning Models
Xiaoxi Li, Guanting Dong, Jiajie Jin, Yuyao Zhang, Yujia Zhou, Yutao Zhu, Zhicheng Dou
- From Personas to Talks: Revisiting the Impact of Personas on LLM-Synthesized Emotional Support Conversations
Shenghan Wu, Yimo Zhu, Wynne Hsu, Mong-Li Lee, Yang Deng
- Select-Then-Decompose: From Empirical Analysis to Adaptive Selection Strategy for Task Decomposition in Large Language Models
Shuodi Liu, Yingzhuo Liu, Zi Wang, Yusheng Wang, Huijia Wu, Liuyu Xiang, Zhaofeng He
- TombRaider: Entering the Vault of History to Jailbreak Large Language Models
Junchen Ding, Jiahao Zhang, Yi Liu, Ziqi Ding, Gelei Deng, Yuekang Li
- Text Meets Topology: Rethinking Out-of-distribution Detection in Text-Rich Networks
Danny Wang, Ruihong Qiu, Guangdong Bai, Zi Huang
- APLOT: Robust Reward Modeling via Adaptive Preference Learning with Optimal Transport
Zhuo Li, Yuege Feng, Dandan Guo, Jinpeng Hu, Anningzhe Gao, Xiang Wan
- HS-STaR: Hierarchical Sampling for Self-Taught Reasoners via Difficulty Estimation and Budget Reallocation
Feng Xiong, Hongling Xu, Yifei Wang, Runxi Cheng, Yong Wang, Xiangxiang Chu
- SEPS: A Separability Measure for Robust Unlearning in LLMs
Wonje Jeung, Sangyeon Yoon, Albert No
- TRUST-VL: An Explainable News Assistant for General Multimodal Misinformation Detection
Zehong Yan, Peng Qi, Wynne Hsu, Mong-Li Lee
- Tree-of-Quote Prompting Improves Factuality and Attribution in Multi-Hop and Medical Reasoning
Justin Xu, Yiming Li, Zizheng Zhang, Augustine Yui Hei Luk, Mayank Jobanputra, Samarth Oza, David W Eyre
- UnitCoder: Scalable Code Synthesis from Pre-training Corpora
Yichuan Ma, Yunfan Shao, Peiji Li, Demin Song, Qipeng Guo, Linyang Li, Xipeng Qiu, Kai Chen
- GRPO-LEAD: A Difficulty-Aware Reinforcement Learning Approach for Concise Mathematical Reasoning in Language Models
Jixiao Zhang, Chunsheng Zuo
- Improving Low-Resource Sequence Labeling with Knowledge Fusion and Contextual Label Explanations
Peichao Lai, Jiaxin Gan, Feiyang Ye, Wentao Zhang, Fangcheng Fu, Yilei Wang, Bin CUI
- Rethinking Cross-Subject Data Splitting for Brain-to-Text Decoding
Congchi Yin, Qian Yu, Zhiwei Fang, Changping Peng, Piji Li
- RCScore: Quantifying Response Consistency in Large Language Models
Dongjun Jang, Youngchae Ahn, Hyopil Shin
- A Multi-Agent Framework with Automated Decision Rule Optimization for Cross-Domain Misinformation Detection
Hui Li, Ante Wang, Kunquan Li, Zhihao Wang, Liang Zhang, Delai Qiu, Qingsong Liu, Jinsong Su
- OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain
Shuting Wang, Jiejun Tan, Zhicheng Dou, Ji-Rong Wen
- AQuilt: Weaving Logic and Self-Inspection into Low-Cost, High-Relevance Data Synthesis for Specialist LLMs
Xiaopeng Ke, Hexuan Deng, Xuebo Liu, Jun Rao, Zhenxi Song, Jun Yu, Min Zhang
- MoSEs: Uncertainty-Aware AI-Generated Text Detection via Mixture of Stylistics Experts with Conditional Thresholds
Junxi Wu, Jinpeng Wang, Zheng Liu, Bin Chen, Dongjian Hu, Hao Wu, Shu-Tao Xia
- Merger-as-a-Stealer: Stealing Targeted PII from Aligned LLMs with Model Merging
Lin Lu, Zhigang Zuo, Ziji Sheng, Pan Zhou
- Pragmatic Inference Chain (PIC) Improving LLMs’ Reasoning of Authentic Implicit Toxic Language
Xi Chen, Shuo Wang
- Beyond Demonstrations: Dynamic Vector Construction from Latent Representations
Wang Cai, Hsiu-Yuan Huang, Zhixiang Wang, Yunfang Wu
- Detoxifying Large Language Models via the Diversity of Toxic Samples
Ying Zhao, Yuanzhao Guo, Xuemeng Weng, Yuan Tian, Wei Wang, Yi Chang
- LLM-Driven Implicit Target Augmentation and Fine-Grained Contextual Modeling for Zero-Shot and Few-Shot Stance Detection
Yanxu Ji, Jinzhong Ning, Yijia Zhang, Zhi Liu, Hongfei Lin
- Dial-In LLM: Human-Aligned LLM-in-the-loop Intent Clustering for Customer Service Dialogues
Mengze Hong, Wailing Ng, Chen Jason Zhang, Yuanfeng SONG, Di Jiang
- Superficial Self-Improved Reasoners Benefit from Model Merging
Xiangchi Yuan, Chunhui Zhang, Zheyuan Liu, Dachuan Shi, Leyan Pan, Soroush Vosoughi, Wenke Lee
- CARFT: Boosting LLM Reasoning via Contrastive Learning with Annotated Chain-of-Thought-based Reinforced Fine-Tuning
Wenqiao Zhu, Ji Liu, Rongjunchen Zhang, Haipang WU, Yulun Zhang
- QualBench: Benchmarking Chinese LLMs with Localized Professional Qualifications for Vertical Domain Evaluation
Mengze Hong, Wailing Ng, Chen Jason Zhang, Di Jiang
- VideoEraser: Concept Erasure in Text-to-Video Diffusion Models
Naen Xu, Jinghuai Zhang, Changjiang Li, Zhi Chen, Chunyi Zhou, Qingming Li, Tianyu Du, Shouling Ji
- Diagram-Driven Course Questions Generation
Xinyu Zhang, Lingling Zhang, Yanrui Wu, Muye Huang, Wenjun Wu, Bo Li, Shaowei Wang, Basura Fernando, Jun Liu
- ECC: An Emotion-Cause Conversation Dataset for Empathy Response
Yuanyuan He, Yongsen Pan, Wei Li, Jiali You, Jiawen Deng, Fuji Ren
- ThoughtProbe: Classifier-Guided LLM Thought Space Exploration via Probing Representations
Zijian Wang, Chang Xu
- JOLT-SQL: Joint Loss Tuning of Text-to-SQL with Confusion-aware Noisy Schema Sampling
Jinwang Song, Hongying Zan, Kunli Zhang, Lingling Mu, Yingjie Han, Haobo Hua, Min Peng
- DMDTEval: An Evaluation and Analysis of LLMs on Disambiguation in Multi-domain Translation
Zhibo Man, Yuanmeng Chen, Yujie Zhang, Jinan Xu
- SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature
David Wadden, Kejian Shi, Jacob Morrison, Alan Li, Aakanksha Naik, Shruti Singh, Nitzan Barzilay, Kyle Lo, Tom Hope, Luca Soldaini, Shannon Zejiang Shen, Doug Downey, Hannaneh Hajishirzi, Arman Cohan
- MAKAR: a Multi-Agent framework based Knowledge-Augmented Reasoning for Grounded Multimodal Named Entity Recognition
Xinkui Lin, Yuhui Zhang, Yongxiu Xu, Kun Huang, Hongzhang Mu, Yubin Wang, Gaopeng Gou, Li Qian, Li Peng, Wei Liu, Hongbo Xu
- VisCRA: A Visual Chain Reasoning Attack for Jailbreaking Multimodal Large Language Models
Bingrui Sima, Linhua Cong, Wenxuan Wang, Kun He
- Investigating Neurons and Heads in Transformer-based LLMs for Typographical Errors
Kohei Tsuji, Tatsuya Hiraoka, Yuchang Cheng, Eiji Aramaki, Tomoya Iwakura
- LMR-BENCH: Evaluating LLM Agent’s Ability on Reproducing Language Modeling Research
Shuo Yan, Ziming Luo, Zimu Wang, Ruochen Li, Daoyang Li, Liqiang Jing, Kaiyu He, Peilin Wu, Juntong Ni, George Michalopoulos, Yue Zhang, Ziyang Zhang, Mian Zhang, Zhiyu Chen, Xinya Du
- RAV: Retrieval-Augmented Voting for Tactile Descriptions Without Training
Jinlin Wang, Yulong Ji, Hongyu Yang
- Static Word Embeddings for Sentence Semantic Representation
Takashi Wada, Yuki Hirakawa, Ryotaro Shimizu, Takahiro Kawashima, Yuki Saito
- PropRAG: Guiding Retrieval with Beam Search over Proposition Paths
William Wang, Jiawei Han
- Rethinking Backdoor Detection Evaluation for Language Models
Jun Yan, Wenjie Jacky Mo, Xiang Ren, Robin Jia
- Glider: Global and Local Instruction-Driven Expert Router
Pingzhi Li, Prateek Yadav, Jaehong Yoon, Jie Peng, Yi-Lin Sung, Mohit Bansal, Tianlong Chen
- CoVoGER: A Multilingual Multitask Benchmark for Speech-to-text Generative Error Correction with Large Language Models
Zhengdong Yang, Zhen Wan, Sheng Li, Chao-Han Huck Yang, Chenhui Chu
- Tiny Budgets, Big Gains: Parameter Placement Strategy in Parameter Super-Efficient Fine-Tuning
Jinman Zhao, Xueyan Zhang, Jiaru Li, Jingcheng Niu, Yulan Hu, Erxue Min, Gerald Penn
- Legal Fact Prediction: The Missing Piece in Legal Judgment Prediction
Junkai Liu, Yujie Tong, Hui Huang, Bowen Zheng, Yiran HU, Peicheng Wu, Chuan Xiao, Makoto Onizuka, Muyun Yang, Shuyuan Zheng
- DAMON: A Dialogue-Aware MCTS Framework for Jailbreaking Large Language Models
Xu Zhang, Xunjian Yin, Dinghao Jing, Huixuan Zhang, Xinyu Hu, Xiaojun Wan
- Multilingual Prompting for Improving LLM Generation Diversity
Qihan Wang, Shidong Pan, Tal Linzen, Emily Black
- MOSAIC: Modeling Social AI for Content Dissemination and Regulation in Multi-Agent Simulations
Genglin Liu, Vivian T. Le, Salman Rahman, Elisa Kreiss, Marzyeh Ghassemi, Saadia Gabriel
- Identification of Multiple Logical Interpretations in Counter-Arguments
Wenzhi Wang, Paul Reisert, Shoichi Naito, Naoya Inoue, Machi Shimmei, Surawat Pothong, Jungmin Choi, Kentaro Inui
- LyapLock: Bounded Knowledge Preservation in Sequential Large Language Model Editing
Peng Wang, Biyu Zhou, Xuehai Tang, Jizhong Han, Songlin Hu
- AlignX: Advancing Multilingual Large Language Models with Multilingual Representation Alignment
Mengyu Bu, Shaolei Zhang, Zhongjun He, Hua Wu, Yang Feng
- What Makes a Good Reasoning Chain? Uncovering Structural Patterns in Long Chain-of-Thought Reasoning
Gangwei Jiang, Yahui Liu, Zhaoyi Li, V. W., Fuzheng Zhang, Linqi Song, Ying Wei, Defu Lian
- HD-PiSSA: High-Rank Distributed Orthogonal Adaptation
Yiding Wang, Fanxu Meng, Xuefeng Zhang, Fan Jiang, Pingzhi Tang, Muhan Zhang
- Firewall Routing: Blocking Leads to Better Hybrid Inference for LLMs
Runyu Peng, Yunhua Zhou, Kai Lv, Yang Gao, Qipeng Guo, Xipeng Qiu
- SPE Attention: Making Attention Equivariant to Semantic-Preserving Permutation for Code Processing
Chengyu Jiao, Shuhao Chen, Yu Zhang
- Audio-centric Video Understanding Benchmark without Text Shortcut
Yudong Yang, Jimin Zhuang, Guangzhi Sun, Changli Tang, Yixuan Li, Peihan Li, Yifan Jiang, Wei Li, Zejun MA, Chao Zhang
- TurboRAG: Accelerating Retrieval-Augmented Generation with Precomputed KV Caches for Chunked Text
Songshuo Lu, Hua Wang, Yutian Rong, Zhi Chen, Yaohua Tang
- ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration
Haozhan Shen, Kangjia Zhao, Tiancheng Zhao, Ruochen Xu, Zilun Zhang, Mingwei Zhu, Jianwei Yin
- Learning Like Humans: Advancing LLM Reasoning Capabilities via Adaptive Difficulty Curriculum Learning and Expert-Guided Self-Reformulation
Enci Zhang, Xingang Yan, Wei Lin, Tianxiang Zhang, LU Qianchun
- VersaTune: An Efficient Data Composition Framework for Training Multi-Capability LLMs
Keer Lu, Keshi Zhao, Zhuoran Zhang, Zheng Liang, Bin CUI, Tengjiao Wang, Wentao Zhang
- FlightGPT: Towards Generalizable and Interpretable UAV Vision-and-Language Navigation with Vision-Language Models
Hengxing Cai, Jinhan Dong, Jingjun Tan, Jingcheng Deng, Sihang Li, Zhifeng Gao, Haidong Wang, Zicheng Su, Agachai Sumalee, Renxin ZHONG
- Multimodal Language Models See Better When They Look Shallower
Haoran Chen, Junyan Lin, Xinghao Chen, Yue Fan, Jianfeng Dong, Xin Jin, Hui Su, Jinlan Fu, Xiaoyu Shen
- LoSiA: Efficient High-Rank Fine-Tuning via Subnet Localization and Optimization
Xujia Wang, Yunjia Qi, Bin Xu
- Invisible Entropy: Towards Safe and Efficient Low-Entropy LLM Watermarking
Tianle Gu, Zongqi Wang, Kexin Huang, Yuanqi Yao, Xiangliang Zhang, Yujiu Yang, Xiuying Chen
- Measuring Bias or Measuring the Task: Understanding the Brittle Nature of LLM Gender Biases
Bufan Gao, Elisa Kreiss
- Alignment-Augmented Speculative Decoding with Alignment Sampling and Conditional Verification
Jikai Wang, Zhenxu Tian, Juntao Li, Qingrong Xia, Xinyu Duan, Zhefeng Wang, Baoxing Huai, Min Zhang
- ViLBench: A Suite for Vision-Language Process Reward Modeling
Haoqin Tu, Weitao Feng, Hardy Chen, Hui Liu, Xianfeng Tang, Cihang Xie
- Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering
Hwan Chang, Yumin Kim, Yonghyun Jun, Hwanhee Lee
- Route Sparse Autoencoder to Interpret Large Language Models
Wei Shi, Sihang Li, Tao Liang, Mingyang Wan, Guojun Ma, Xiang Wang, Xiangnan He
- BTS: Harmonizing Specialized Experts into a Generalist LLM
Qizhen Zhang, Prajjwal Bhargava, Chloe Bi, Chris X. Cai, Jakob Nicolaus Foerster, Jeremy Fu, Punit Singh Koura, Ruan Silva, Sheng Shen, Emily Dinan, Suchin Gururangan, Mike Lewis
- CoCoA: Confidence- and Context-Aware Adaptive Decoding for Resolving Knowledge Conflicts in Large Language Models
Anant Khandelwal, Manish Gupta, Puneet Agrawal
- R-Bind: Unified Enhancement of Attribute and Relation Binding in Text-to-Image Diffusion Models
Huixuan Zhang, Xiaojun Wan
- Middo: Model-Informed Dynamic Data Optimization for Enhanced LLM Fine-Tuning via Closed-Loop Learning
Zinan Tang, Xin Gao, Zhuoshi Pan, Qizhi Pei, Mengzhang Cai, Jiang Wu, Conghui He, Lijun Wu
- Information Integration in Large Language Models is Gated by Linguistic Structural Markers
Wei Liu, Nai Ding
- Why and How LLMs Benefit from Knowledge Introspection in Commonsense Reasoning
Chengfeng Zhao, Shizhu He, Shanshan Jiang, Bin Dong, Jun Zhao, Kang Liu
- GraDaSE: Graph-Based Dataset Search with Examples
Jing He, Mingyang Lv, Qing Shi, Gong Cheng
- Confidence-guided Refinement Reasoning for Zero-shot Question Answering
Youwon Jang, Woo Suk Choi, Minjoon Jung, Minsu Lee, Byoung-Tak Zhang
- DICE: Structured Reasoning in LLMs through SLM-Guided Chain-of-Thought Correction
Yiqi Li, Yusheng Liao, Zhe Chen, Yanfeng Wang, Yu Wang
- CTCC: A Robust and Stealthy Fingerprinting Framework for Large Language Models via Cross-Turn Contextual Correlation Backdoor
Zhenhua Xu, Xixiang Zhao, Xubin Yue, Shengwei Tian, Changting Lin, Meng Han
- Realistic Training Data Generation and Rule Enhanced Decoding in LLM for NameGuess
Yikuan Xia, Jiazun Chen, Sujian Li, Jun Gao
- EverTracer: Hunting Stolen Large Language Models via Stealthy and Robust Probabilistic Fingerprint
Zhenhua Xu, Meng Han, Wenpeng Xing
- Selective Preference Optimization via Token-Level Reward Function Estimation
Kailai Yang, Zhiwei Liu, Qianqian Xie, Jimin Huang, Erxue Min, Sophia Ananiadou
- Arena-lite: Efficient and Reliable Large Language Model Evaluation via Tournament-Based Direct Comparisons
Seonil Son, Ju-Min Oh, Heegon Jin, Cheolhun Jang, Jeongbeom Jeong, KunTae Kim
- Addressing Tokenization Inconsistency in Steganography and Watermarking Based on Large Language Models
Ruiyi Yan, Yugo Murawaki
- ExeCoder: Empowering Large Language Models with Executability Representation for Code Translation
Minghua He, Yue Chen, Fangkai Yang, Pu Zhao, Wenjie Yin, Yu Kang, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang
- TableEval: A Real-World Benchmark for Complex, Multilingual, and Multi-Structured Table Question Answering
Junnan Zhu, Jingyi Wang, Bohan Yu, Xiaoyu Wu, Junbo Li, Lei Wang, Nan Xu
- NOVA-63: Native Omni-lingual Versatile Assessments of 63 Disciplines
Jinyang Zhang, Kexin Yang, Yu Wan, Muyang Ye, Baosong Yang, Fei Huang, Junyang Lin, Dayiheng Liu
- InfoGain-RAG: Boosting Retrieval-Augmented Generation through Document Information Gain-based Reranking and Filtering
Zihan Wang, Zihan Liang, Zhou Shao, Yufei Ma, Huangyu Dai, Ben Chen, Mao Lingtao, Chenyi Lei, Yuqing DING, Han Li
- SpecVLM: Enhancing Speculative Decoding of Video LLMs via Verifier-Guided Token Pruning
Yicheng Ji, Jun Zhang, Heming Xia, Jinpeng Chen, Lidan Shou, Gang Chen, Huan Li
- What Do Indonesians Really Need from Language Technology? A Nationwide Survey
Muhammad Dehan Al Kautsar, Lucky Susanto, Derry Tanti Wijaya, Fajri Koto
- LEO-MINI: An Efficient Multimodal Large Language Model using Conditional Token Reduction and Mixture of Multi-Modal Experts
Yimu Wang, Mozhgan Nasr Azadani, Sean Sedwards, Krzysztof Czarnecki
- Confounding Factors in Relating Model Performance to Morphology
Wessel Poelman, Thomas Bauwens, Miryam de Lhoneux
- Context-Aware Membership Inference Attacks against Pre-trained Large Language Models
Hongyan Chang, Ali Shahin Shamsabadi, Kleomenis Katevas, Hamed Haddadi, Reza Shokri
- Formalizing Style in Personal Narratives
Gustave Cortal, Alain Finkel
- TopicAttack: An Indirect Prompt Injection Attack via Topic Transition
Yulin Chen, Haoran Li, Yuexin Li, Yue Liu, Yangqiu Song, Bryan Hooi
- PSET: a Phonetics-Semantics Evaluation Testbed
Gianluca Sperduti, Dong Nguyen
- From Unaligned to Aligned: Scaling Multilingual LLMs with Multi-Way Parallel Corpora
Yingli Shen, Wen Lai, Shuo Wang, Kangyang Luo, Alexander Fraser, Maosong Sun
- GATEAU: Selecting Influential Samples for Long Context Alignment
Shuzheng Si, Haozhe Zhao, Gang Chen, Yunshui Li, Kangyang Luo, Chuancheng Lv, Kaikai An, Fanchao Qi, Baobao Chang, Maosong Sun
- Teach Small Models to Reason by Curriculum Distillation
Wangyi Jiang, Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun
- Enhancing Reasoning Abilities of Small LLMs with Cognitive Alignment
Wenrui Cai, Chengyu Wang, Junbing Yan, Jun Huang, Xiangzhong Fang
- NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning
Wei Liu, Siya Qi, Xinyu Wang, Chen Qian, Yali Du, Yulan He
- Genre Matters: How Text Types Interact with Decoding Strategies and Lexical Predictors in Shaping Reading Behavior
Lena Sophia Bolliger, Lena Ann Jäger
- RTE-GMoE: A Model-agnostic Approach for Relation Triplet Extraction via Graph-based Mixture-of-Expert Mutual Learning
Aziguli Wulamu, Kaiyuan Gong, Lyu Zhengyu, Yu Han, Zhihong Zhu, Bowen Xing
- Avoidance Decoding for Diverse Multi-Branch Story Generation
Kyeongman Park, Nakyeong Yang, Kyomin Jung
- Probabilistic Soundness Guarantees in LLM Reasoning Chains
Weiqiu You, Anton Xue, Shreya Havaldar, Delip Rao, Helen Jin, Chris Callison-Burch, Eric Wong
- SQLWOZ: A Realistic Task-Oriented Dialogue Dataset with SQL-Based Dialogue State Representation for Complex User Requirements
Heng-Da Xu, Xian-Ling Mao, Fanshu Sun, Tian-Yi Che, Cheng-Xin Xin, Heyan Huang
- SURE: Safety Understanding and Reasoning Enhancement for Multimodal Large Language Models
Yuxin Gou, Xiaoning Dong, Qin Li, Shishen Gu, Richang Hong, Wenbo Hu
- EMO: Embedding Model Distillation via Intra-Model Relation and Optimal Transport Alignments
Minh Phuc Truong, Hai An Vu, Tu Vu, Nguyen Thi Ngoc Diep, Linh Ngo Van, Thien Huu Nguyen, Trung Le
- AesBiasBench: Evaluating Bias and Alignment in Multimodal Language Models for Personalized Image Aesthetic Assessment
Kun Li, Lai Man Po, Hongzheng Yang, Xuyuan Xu, Kangcheng Liu, Yuzhi Zhao
- DA-Pred: Performance Prediction for Text Summarization under Domain-Shift and Instruct-Tuning
Anum Afzal, Florian Matthes, Alexander Fabbri
- UnCo: Uncertainty-Driven Collaborative Framework of Large and Small Models for Grounded Multimodal NER
Jielong Tang, Yang Yang, Jianxing Yu, Zhen-Xing Wang, Haoyuan Liang, Liang Yao, Jian Yin
- An Empirical Study of LLM Reasoning Ability Under Strict Output Length Constraint
Yi Sun, Han Wang, Jiaqiang Li, Jiacheng Liu, Xiangyu Li, Hao Wen, Huiwen Zheng, Yan Liang, Yuanchun Li, Yunxin Liu
- Enrich-on-Graph: Query-Graph Alignment for Complex Reasoning with LLM Enriching
Songze Li, Zhiqiang Liu, Zhengke Gui, Huajun Chen, Wen Zhang
- Noise, Adaptation, and Strategy: Assessing LLM Fidelity in Decision-Making
Yuanjun Feng, Vivek Choudhary, Yash Raj Shrestha
- Structuring Radiology Reports: Challenging LLMs with Lightweight Models
Johannes Moll, Louisa Fay, Asfandyar Azhar, Sophie Ostmeier, Sergios Gatidis, Tim C. Lueth, Curtis Langlotz, Jean-Benoit Delbrouck
- PricingLogic: Evaluating LLMs Reasoning on Complex Tourism Pricing Tasks
Yunuo Liu, Dawei Zhu, Zena Al Khalili, Dai Cheng, Yanjun Chen, Dietrich Klakow, Wei Zhang, Xiaoyu Shen
- EcoTune: Token-Efficient Multi-Fidelity Hyperparameter Optimization for Large Language Model Inference
Yuebin XU, Zeyi Wen
- Investigating Value-Reasoning Reliability in Small Large Language Models
杜霞, Shuhan Sun, Pengyuan Liu, Dong Yu
- Can LLMs Explain Themselves Counterfactually?
Zahra Dehghanighobadi, Asja Fischer, Muhammad Bilal Zafar
- Self-Adjust Softmax
Chuanyang Zheng, Yihang Gao, Guoxuan Chen, Han Shi, Jing Xiong, Xiaozhe Ren, Chao Huang, Zhenguo Li, Yu Li
- XAutoLM: Efficient Fine-Tuning of Language Models via Meta-Learning and AutoML
Ernesto Luis Estevanell Valladares, Suilan Estevez-Velarde, Yoan Gutierrez, Andrés Montoyo, Ruslan Mitkov
- UNCERTAINTY-LINE: Length-Invariant Estimation of Uncertainty for Large Language Models
Roman Vashurin, Maiya Goloburda, Preslav Nakov, Maxim Panov
- WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning
Zhepei Wei, Wenlin Yao, Yao Liu, Weizhi Zhang, Qin Lu, Liang Qiu, Changlong Yu, Puyang Xu, Chao Zhang, Bing Yin, Hyokun Yun, Lihong Li
- Same evaluation, more tokens: On the effect of input length for machine translation evaluation using Large Language Models
Tobias Domhan, Dawei Zhu
- PAKTON: A Multi-Agent Framework for Question Answering in Long Legal Agreements
Raptopoulos Petros, Giorgos Filandrianos, Maria Lymperaiou, Giorgos Stamou
- PoSum-Bench: Benchmarking Position Bias in LLM-based Conversational Summarization
Xu Sun, Lionel Delphin-Poulat, Christèle Tarnec, Anastasia Shimorina
- ConCISE: Confidence-guided Compression in Step-by-step Efficient Reasoning
Ziqing Qiao, Yongheng Deng, Jiali Zeng, Dong Wang, Lai Wei, Guanbo Wang, Fandong Meng, Jie Zhou, Ju Ren, Yaoxue Zhang
- Layer-Aware Representation Filtering: Purifying Finetuning Data to Preserve LLM Safety Alignment
Hao Li, Lijun Li, Zhenghao Lu, Xianyi Wei, Rui Li, Jing Shao, Lei Sha
- Cross-domain Rumor Detection via Test-Time Adaptation and Large Language Models
Yuxia Gong, Shuguo Hu, Huaiwen Zhang
- MLWQ: Efficient Small Language Model Deployment via Multi-Level Weight Quantization
Chun Hu, Junhui He, Shangyu Wu, Yuxin He, Chun Jason Xue, Qingan Li
- ToDi: Token-wise Distillation via Fine-Grained Divergence Control
Seongryong Jung, Suwan Yoon, DongGeon Kim, Hwanhee Lee
- RethinkMCTS: Refining Erroneous Thoughts in Monte Carlo Tree Search for Code Generation
Qingyao Li, Wei Xia, Xinyi Dai, Kounianhua Du, Weiwen Liu, Yasheng Wang, Ruiming Tang, Yong Yu, Weinan Zhang
- Probing for Arithmetic Errors in Language Models
Yucheng Sun, Alessandro Stolfo, Mrinmaya Sachan
- NILE: Internal Consistency Alignment in Large Language Models
Minda Hu, Qiyuan Zhang, Yufei Wang, Bowei He, Hongru WANG, Jingyan Zhou, Liangyou Li, Yasheng Wang, Chen Ma, Irwin King
- Mining the Past with Dual Criteria: Integrating Three types of Historical Information for Context-aware Event Forecasting
Rong Ma, Lei Wang, Yating Yang, Bo Ma, Rui Dong, Fengyi Yang, Ahtamjan Ahmat, Kaiwen Lu, Xinyue Wang
- RAGferee: Building Contextual Reward Models for Retrieval-Augmented Generation
Andrei Catalin Coman, Ionut Teodor Sorodoc, Leonardo F. R. Ribeiro, Bill Byrne, James Henderson, Adrià de Gispert
- Large Language Models Discriminate Against Speakers of German Dialects
Minh Duc Bui, Carolin Holtermann, Valentin Hofmann, Anne Lauscher, Katharina von der Wense
- Uncovering Argumentative Flow: A Question-Focus Discourse Structuring Framework
Yini Wang, Xian Zhou, Shengan Zheng, Linpeng Huang, Zhunchen Luo, Wei Luo, Xiaoying Bai
- AbsVis – Benchmarking How Humans and Vision-Language Models “See” Abstract Concepts in Images
Tarun Tater, Diego Frassinelli, Sabine Schulte im Walde
- A Rigorous Evaluation of LLM Data Generation Strategies for Low-Resource Languages
Tatiana Anikina, Jan Cegin, Jakub Simko, Simon Ostermann
- Alignment with Fill-In-the-Middle for Enhancing Code Generation
Houxing Ren, Zimu Lu, Weikang Shi, Haotian Hou, Yunqiao Yang, Ke Wang, Aojun Zhou, Junting Pan, Mingjie Zhan, Hongsheng Li
- A Middle Path for On-Premises LLM Deployment: Preserving Privacy Without Sacrificing Model Confidentiality
Hanbo Huang, Yihan Li, Bowen Jiang, Bo Jiang, Lin Liu, Zhuotao Liu, Ruoyu Sun, Shiyu Liang
- Variance Sensitivity Induces Attention Entropy Collapse and Instability in Transformers
Jonghyun Hong, Sungyoon Lee
- X-FLoRA: Cross-modal Federated Learning with Modality-expert LoRA for Medical VQA
Min Hyuk Kim, Changheon Kim, Seok Bong Yoo
- Robust Native Language Identification through Agentic Decomposition
Ahmet Yavuz Uluslu, Tannon Kew, Tilia Ellendorff, Gerold Schneider, Rico Sennrich
- ConsistentChat: Building Skeleton-Guided Consistent Multi-Turn Dialogues for Large Language Models from Scratch
Jiawei Chen, Xinyan Guan, Qianhao Yuan, Mo Guozhao, Weixiang Zhou, Yaojie Lu, Hongyu Lin, Ben He, Le Sun, Xianpei Han
- Does Acceleration Cause Hidden Instability in Vision Language Models? Uncovering Instance-Level Divergence Through a Large-Scale Empirical Study
Yizheng Sun, Hao Li, Chang Xu, Hongpeng Zhou, Chenghua Lin, Riza Batista-Navarro, Jingyuan Sun
- When Annotators Disagree, Topology Explains: Mapper, a Topological Tool for Exploring Text Embedding Geometry and Ambiguity
Nisrine Rair, Alban Goupil, Valeriu Vrabie, Emmanuel Chochoy
- Self-Critique and Refinement for Faithful Natural Language Explanations
Yingming Wang, Pepa Atanasova
- The Psychology of Falsehood: A Human-Centric Survey of Misinformation Detection
Arghodeep Nandi, Megha Sundriyal, Euna Mehnaz Khan, Jikai Sun, Emily K. Vraga, Jaideep Srivastava, Tanmoy Chakraborty
- SEAL: Structure and Element Aware Learning Improves Long Structured Document Retrieval
Xinhao Huang, Zhibo Ren, Yipeng Yu, Ying Zhou, Zulong Chen, Zeyi Wen
- AnchorAttention: Difference-Aware Sparse Attention with Stripe Granularity
Yu Zhang, Dong Guo, Fang Wu, Dian Ding, Yiming Zhang
- Attacks by Content: Automated Fact-checking is an AI Security Issue
Michael Sejr Schlichtkrull
- MUZO: Leveraging Multiple Queries and Momentum for Zeroth-Order Fine-Tuning of Large Language Models
Yuezhang PENG, Yuxin Liu, Fei Wen, Xie Chen
- Your Language Model Can Secretly Write Like Humans: Contrastive Paraphrase Attacks on LLM-Generated Text Detectors
Hao Fang, Jiawei Kong, Tianqu Zhuang, Yixiang Qiu, Kuofeng Gao, Bin Chen, Shu-Tao Xia, Yaowei Wang, Min Zhang
- Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification to Improve Trustworthy QA
Sergey Pletenev, Maria Marina, Nikolay Ivanov, Daria Galimzianova, Nikita Krayko, Mikhail Salnikov, Vasily Konovalov, Alexander Panchenko, Viktor Moskvoretskii
- Steering Language Models in Multi-Token Generation: A Case Study on Tense and Aspect
Alina Klerings, Jannik Brinkmann, Daniel Ruffinelli, Simone Paolo Ponzetto
- DocReRank: Single-Page Hard Negative Query Generation for Training Multi-Modal RAG Rerankers
Navve Wasserman, Oliver Heinimann, Yuval Golbari, Tal Zimbalist, Eli Schwartz, Michal Irani
- Reason to Rote: Rethinking Memorization in Reasoning
Yupei Du, Philipp Mondorf, Silvia Casola, Yuekun Yao, Robert Litschko, Barbara Plank
- VELA: An LLM-Hybrid-as-a-Judge Approach for Evaluating Long Image Captions
Kazuki Matsuda, Yuiga Wada, Shinnosuke Hirano, Seitaro Otsuki, Komei Sugiura
- LLM-Independent Adaptive RAG: Let the Question Speak for Itself
Maria Marina, Nikolay Ivanov, Sergey Pletenev, Mikhail Salnikov, Daria Galimzianova, Nikita Krayko, Vasily Konovalov, Alexander Panchenko, Viktor Moskvoretskii
- TurnBack: A Geospatial Route Cognition Benchmark for Large Language Models through Reverse Route
Hongyi Luo, Qing Cheng, Daniel Matos, Hari Krishna Gadi, Yanfeng Zhang, Lu Liu, Yongliang Wang, Niclas Zeller, Daniel Cremers, Liqiu Meng
- Certainty in Uncertainty: Reasoning over Uncertain Knowledge Graphs with Statistical Guarantees
Yuqicheng Zhu, Jingcheng Wu, Yizhen Wang, Hongkuan Zhou, Jiaoyan Chen, Evgeny Kharlamov, Steffen Staab
- Beyond Seen Data: Improving KBQA Generalization Through Schema-Guided Logical Form Generation
Shengxiang Gao, Jey Han Lau, Jianzhong Qi
- A Training-Free Length Extrapolation Approach for LLMs: Greedy Attention Logit Interpolation
Yan Li, Tianyi Zhang, Zechuan Li, Caren Han
- Taming Text-to-Image Synthesis for Novices: User-centric Prompt Generation via Multi-turn Guidance
Yilun Liu, Minggui HE, Feiyu Yao, Yuhe Ji, Shimin Tao, Jingzhou DU, Justin Li, Jian Gao, Zhang Li, Hao Yang, Boxing Chen, Osamu Yoshie
- We Need to Measure Data Diversity in NLP — Better and Broader
Dong Nguyen, Esther Ploeger
- Sheaf Discovery with Joint Computation Graph Pruning and Flexible Granularity
Jingcheng Niu, Lei Yu, Zining Zhu, Xi Chen, Gerald Penn
- Hierarchical Bracketing Encodings Work for Dependency Graphs
Ana Ezquerro, Carlos Gómez-Rodríguez, David Vilares
- Multimodal Fine-grained Context Interaction Graph Modeling for Conversational Speech Synthesis
Zhenqi Jia, Rui Liu, Berrak Sisman, Haizhou Li
- Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models
Mehdi Ali, Manuel Brack, Max Lübbering, Elias Wendt, Abbas Goher Khan, Richard Rutmann, Alex Jude, Maurice Kraus, Alexander Arno Weber, Felix Stollenwerk, David Kaczér, Florian Mai, Lucie Flek, Rafet Sifa, Nicolas Flores-Herr, Joachim Koehler, Patrick Schramowski, Michael Fromm, Kristian Kersting
- Conditional [MASK] Discrete Diffusion Language Model
Hyukhun Koh, Minha Jhang, Dohyung Kim, Sangmook Lee, Kyomin Jung
- Language-Guided Temporal Token Pruning for Efficient VideoLLM Processing
Yogesh Kumar
- A Fully Probabilistic Perspective on Large Language Model Unlearning: Evaluation and Optimization
Anda Cheng, Wei Huang, Yinggui Wang
- IIET: Efficient Numerical Transformer via Implicit Iterative Euler Method
Xinyu Liu, Bei Li, Jiahao Liu, Junhao Ruan, Kechen Jiao, Hongyin Tang, Jingang Wang, Tong Xiao, JingBo Zhu
- WebEvolver: Enhancing Web Agent Self-Improvement with Co-evolving World Model
Tianqing Fang, Hongming Zhang, Zhisong Zhang, Kaixin Ma, Wenhao Yu, Haitao Mi, Dong Yu
- Leveraging Semantic Triples for Private Document Generation with Local Differential Privacy Guarantees
Stephen Meisenbacher, Maulik Chevli, Florian Matthes
- HVGuard: Utilizing Multimodal Large Language Models for Hateful Video Detection
Yiheng Jing, Mingming Zhang, Yong Zhuang, Jiacheng Guo, Juan Wang, Xiaoyang Xu, Wenzhe Yi, Keyan Guo, Hongxin Hu
- Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence
Yijiong Yu, Ji Pei, Wei Wang, Ran Chen
- SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design
Wenxin Tang, Jingyu Xiao, Wenxuan Jiang, Xi Xiao, Yuhang Wang, Xuxin Tang, Qing Li, Yuehe Ma, Junliang Liu, Shisong Tang, Michael R. Lyu
- LLM-OREF: An Open Relation Extraction Framework Based on Large Language Models
Hongyao Tu, Liang Zhang, Yujie Lin, Xin Lin, Haibo Zhang, Long Zhang, Jinsong Su
- Ambiguity Awareness Optimization: Towards Semantic Disambiguation for Direct Preference Optimization
Jian Li, Shenglin Yin, Yujia Zhang, Alan Zhao, Xi Chen, Xiaohui Zhou, Pengfei Xu
- Improving Multilingual Retrieval-Augmented Language Models through Dialectic Reasoning Argumentations
Leonardo Ranaldi, Federico Ranaldi, Fabio Massimo Zanzotto, Barry Haddow, Alexandra Birch
- Predicate-Guided Generation for Mathematical Reasoning
Jiajun Chen, Yik-Cheung Tam
- ComplexTempQA: A 100m Dataset for Complex Temporal Question Answering
Raphael Gruber, Abdelrahman Abdallah, Michael Färber, Adam Jatowt
- ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents
Qiuchen Wang, Ruixue Ding, Zehui Chen, Weiqi Wu, Shihang Wang, Pengjun Xie, Feng Zhao
- IndoSafety: Culturally Grounded Safety for LLMs in Indonesian Languages
Muhammad Falensi Azmi, Muhammad Dehan Al Kautsar, Alfan Farizki Wicaksono, Fajri Koto
- Can LLMs Help You at Work? A Sandbox for Evaluating LLM Agents in Enterprise Environments
Harsh Vishwakarma, Ankush Agarwal, Ojas Patil, Chaitanya Devaguptapu, Mahesh Chandran
- Steering LLM Reasoning Through Bias-Only Adaptation
Viacheslav Sinii, Alexey Gorbatovski, Artem Cherepanov, Boris Shaposhnikov, Nikita Balagansky, Daniil Gavrilov
- VLASCD: A Visual Language Action Model for Simultaneous Chatting and Decision Making
Zuojin Tang, Bin Hu, Chenyang Zhao, De Ma, Gang Pan, Bin Liu
- M-LongDoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework
Yew Ken Chia, Liying Cheng, Hou Pong Chan, Maojia Song, Chaoqun Liu, Mahani Aljunied, Soujanya Poria, Lidong Bing
- Look Again, Think Slowly: Enhancing Visual Reflection in Vision-Language Models
Pu Jian, Junhong Wu, Wei Sun, Chen Wang, Shuo Ren, Jiajun Zhang
- FB-Bench: A Fine-Grained Multi-Task Benchmark for Evaluating LLMs’ Responsiveness to Human Feedback
Youquan Li, Miao Zheng, Fan Yang, Guosheng Dong, Bin CUI, Weipeng Chen, Zenan Zhou, Wentao Zhang
- HYDRA: A Multi-Head Encoder-only Architecture for Hierarchical Text Classification
Fabian Karl, Ansgar Scherp
- CARD: Cross-modal Agent Framework for Generative and Editable Residential Design
Pengyu Zeng, Jun Yin, Miao Zhang, Yuqin Dai, Jizhizi Li, Shuai Lu
- DrDiff: Dynamic Routing Diffusion with Hierarchical Attention for Breaking the Efficiency-Quality Trade-off
Jusheng Zhang, Yijia Fan, Kaitong Cai, Zimeng Huang, Xiaofei Sun, Jian Wang, Chengpei Tang, Keze Wang
- FaST: Feature-aware Sampling and Tuning for Personalized Preference Alignment with Limited Data
Thibaut Thonet, Germán Kruszewski, Jos Rozen, Pierre ERBACHER, Marc Dymetman
- On LLM-Based Scientific Inductive Reasoning Beyond Equations
Brian S. Lin, Jiaxin Yuan, Zihan Zhou, Shouli Wang, Shuo Wang, Cunliang Kong, Qi Shi, Yuxuan Li, Liner Yang, Zhiyuan Liu, Maosong Sun
- SPECS: Specificity-Enhanced CLIP-Score for Long Image Caption Evaluation
Xiaofu Chen, Israfel Salazar, Yova Kementchedjhieva
- LM-Searcher: Cross-domain Neural Architecture Search with LLMs via Unified Numerical Encoding
Yuxuan Hu, Jihao Liu, Ke Wang, Jinliang Zheng, Weikang Shi, Manyuan Zhang, Qi Dou, Rui Liu, Aojun Zhou, Hongsheng Li
- Does quantization affect models’ performance on long-context tasks?
Anmol Mekala, Anirudh Atmakuru, Yixiao Song, Marzena Karpinska, Mohit Iyyer
- Token-Aware Editing of Internal Activations for Large Language Model Alignment
Tianbo Wang, Kewei Liao, Yuqing Ma, Chengzhao Yang, Zhange Zhang, Jiakai Wang, Xianglong Liu
- Bitune: Leveraging Bidirectional Attention to Improve Decoder-Only LLMs
Dawid Jan Kopiczko, Tijmen Blankevoort, Yuki M Asano
- Disambiguation in Conversational Question Answering in the Era of LLMs and Agents: A Survey
Mehrab Tanjim, Yeonjun In, Xiang Chen, Victor Bursztyn, Ryan A. Rossi, Sungchul Kim, Guang-Jie Ren, Vaishnavi Muppala, Shun Jiang, Yongsung Kim, Chanyoung Park
- Plan Dynamically, Express Rhetorically: A Debate-Driven Rhetorical Framework for Argumentative Writing
Xueguan Zhao, Wenpeng Lu, Chaoqun Zheng, Weiyu Zhang, Jiasheng Si, Deyu Zhou
- TCPO: Thought-Centric Preference Optimization for Effective Embodied Decision-making
Kechen Jiao, Zhirui Fang, Jiahao Liu, Bei Li, Qifan Wang, Xinyu Liu, Junhao Ruan, Zhongjian Qiao, Yifan Zhu, Yaxin Xu, Jingang Wang, Xiu Li
- Reimagining Safety Alignment with An Image
Yifan Xia, Guorui Chen, Wenqian Yu, Zhijiang Li, Philip Torr, Jindong Gu
- Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection
Miao Ziqi, Yi Ding, Lijun Li, Jing Shao
- Can Large Language Models Win the International Mathematical Games?
Alessio Cocchieri, Luca Ragazzi, Giuseppe Tagliavini, Lorenzo Tordi, Antonella Carbonaro, Gianluca Moro
- CodeArena: Evaluating and Aligning CodeLLMs on Human Preference
Jian Yang, Jiaxi Yang, Wei Zhang, Jin Ke, Yibo Miao, Lei Zhang, Liqun Yang, Zeyu Cui, Yichang Zhang, Zhoujun Li, Binyuan Hui, Junyang Lin
- Language models can learn implicit multi-hop reasoning, but only if they have lots of training data
Yuekun Yao, Yupei Du, Dawei Zhu, Michael Hahn, Alexander Koller
- UniversalCEFR: Enabling Open Multilingual Research on Language Proficiency Assessment
Joseph Marvin Imperial, Abdullah Barayan, Regina Stodden, Rodrigo Wilkens, Ricardo Muñoz Sánchez, GAO Lingyun, Melissa Torgbi, Dawn Knight, Gail Forey, Reka R. Jablonkai, Ekaterina Kochmar, Robert Joshua Reynolds, Eugénio Ribeiro, Horacio Saggion, Elena Volodina, Sowmya Vajjala, Thomas François, Fernando Alva-Manchego, Harish Tayyar Madabushi
- CROP: Contextual Region-Oriented Visual Token Pruning
Jiawei Guo, Feifei Zhai, Pu Jian, Qianrun Wei, Yu Zhou
- CR4-NarrEmote: An Open Vocabulary Dataset of Narrative Emotions Derived Using Citizen Science
Andrew Piper, Robert Budac
- XQuant: Achieving Ultra-Low Bit KV Cache Quantization with Cross-Layer Compression
Haoqi Yang, Yao Yao, Zuchao Li, Baoyuan Qi, Liu Guoming, Hai Zhao
- DINT Transformer
Yueyang Cang, Yuhang Liu, Xiaoteng Zhang, Erlu Zhao, Li Shi
- ICR: Iterative Clarification and Rewriting for Conversational Search
Zhiyu Cao, Peifeng Li, Qiaoming Zhu
- Pre-training CLIP against Data Poisoning with Optimal Transport-based Matching and Alignment
Tong Zhang, Kuofeng Gao, Jiawang Bai, Leo Yu Zhang, Xin Yin, Zonghui Wang, Shouling Ji, Wenzhi CHEN
- Similarity = Value? Consultation Value-Assessment and Alignment for Personalized Search
Weicong Qin, Yi Xu, Weijie Yu, Teng Shi, Chenglei Shen, Ming He, Jianping Fan, Xiao Zhang, Jun Xu
- RTQA: Recursive Thinking for Complex Temporal Knowledge Graph Question Answering with Large Language Models
Zhaoyan Gong, Juan Li, Zhiqiang Liu, Lei Liang, Huajun Chen, Wen Zhang
- Not All Parameters Are Created Equal: Smart Isolation Boosts Fine-Tuning Performance
Yao Wang, Di Liang, Minlong Peng
- AI Knows Where You Are: Exposure, Bias, and Inference in Multimodal Geolocation with KoreaGEO
Xiaonan Wang, Bo Shao, Hansaem Kim
- CAT: Causal Attention Tuning For Injecting Fine-grained Causal Knowledge into Large Language Models
Kairong Han, Wenshuo Zhao, Ziyu Zhao, Ye Jun Jian, Lujia Pan, Kun Kuang
- Enhancing LLM Text Detection with Retrieved Contexts and Logits Distribution Consistency
Zhaoheng Huang, Yutao Zhu, Ji-Rong Wen, Zhicheng Dou
- Stop Looking for “Important Tokens” in Multimodal Language Models: Duplication Matters More
Zichen Wen, Yifeng Gao, Shaobo Wang, Junyuan Zhang, Qintong Zhang, Weijia Li, Conghui He, Linfeng Zhang
- AgentPro: Enhancing LLM Agents with Automated Process Supervision
Yuchen Deng, Shichen Fan, Naibo Wang, Xinkui Zhao, See-Kiong Ng
- PORTS: Preference-Optimized Retrievers for Tool Selection with Large Language Models
Lorenzo Molfetta, Giacomo Frisoni, Nicolò Monaldini, Gianluca Moro
- MusKGC: A Flexible Multi-source Knowledge Enhancement Framework for Open-World Knowledge Graph Completion
Xin Song, Liu Haiyan, Haiyang Wang, Ye Wang, Kai Chen, Bin Zhou
- Towards Transferable Personality Representation Learning based on Triplet Comparisons and Its Applications
Kai Tang, Rui Wang, Renyu Zhu, Minmin Lin, Xiao Ding, Tangjie Lv, Changjie Fan, Runze Wu, Haobo Wang
- Reshaping Representation Space to Balance the Safety and Over-rejection in Large Audio Language Models
Hao Yang, Lizhen Qu, Ehsan Shareghi, Gholamreza Haffari
- Benchmarking Large Language Models Under Data Contamination: A Survey from Static to Dynamic Evaluation
Simin Chen, Yiming Chen, Zexin Li, Yifan Jiang, Zhongwei Wan, Yixin He, Dezhi Ran, Tianle Gu, Haizhou Li, Tao Xie, Baishakhi Ray
- FinTrust: A Comprehensive Benchmark of Trustworthiness Evaluation in Finance Domain
Tiansheng Hu, Tongyan Hu, Liuyang Bai, Yilun Zhao, Arman Cohan, Chen Zhao
- RecGPT: A Foundation Model for Sequential Recommendation
Yangqin Jiang, Xubin Ren, Lianghao Xia, Da Luo, Kangyi Lin, Chao Huang
- Towards Holistic Evaluation of Large Audio-Language Models: A Comprehensive Survey
Chih-Kai Yang, Neo S. Ho, Hung-yi Lee
- Train One Sparse Autoencoder Across Multiple Sparsity Budgets to Preserve Interpretability and Accuracy
Nikita Balagansky, Yaroslav Aksenov, Daniil Laptev, Vadim Kurochkin, Gleb Gerasimov, Nikita Koriagin, Daniil Gavrilov
- Learn and Unlearn: Addressing Misinformation in Multilingual LLMs
TaiMing Lu, Philipp Koehn
- PRISM: Efficient Long-Range Reasoning With Short-Context LLMs
Dulhan Jayalath, James Bradley Wendt, Nicholas Monath, Sandeep Tata, Beliz Gunel
- Augmenting Multi-Agent Communication with State Delta Trajectory
Yichen Tang, Weihang Su, Yujia Zhou, Yiqun LIU, Min Zhang, Shaoping Ma, Qingyao Ai
- SAEs Are Good for Steering – If You Select the Right Features
Dana Arad, Aaron Mueller, Yonatan Belinkov
- CoBA: Counterbias Text Augmentation for Mitigating Various Spurious Correlations via Semantic Triples
Kyohoon Jin, Juhwan Choi, JungMin Yun, Junho Lee, Soojin Jang, YoungBin Kim
- Layered Insights: Generalizable Analysis of Human Authorial Style by Leveraging All Transformer Layers
Milad Alshomary, Nikhil Reddy Varimalla, Vishal Anand, Smaranda Muresan, Kathleen McKeown
- When Long Helps Short: How Context Length in Supervised Fine-tuning Affects Behavior of Large Language Models
Yingming Zheng, Hanqi Li, Lu Chen, Kai Yu
- A Case Against Implicit Standards: Homophone Normalization in Machine Translation for Languages that use the Ge’ez Script
Hellina Hailu Nigatu, Atnafu Lambebo Tonja, Henok Biadglign Ademtew, Hizkiel Mitiku Alemayehu, Negasi Haile Abadi, Tadesse Destaw Belay, Seid Muhie Yimam
- Evaluating Language Translation Models by Playing Telephone
Syeda Jannatus Saba, Steven Skiena
- Doubling Your Data in Minutes: Ultra-fast Tabular Data Generation via LLM-Induced Dependency Graphs
Shuo Yang, Zheyu Zhang, Bardh Prenkaj, Gjergji Kasneci
- SPaRC: A Spatial Pathfinding Reasoning Challenge
Lars Benedikt Kaesberg, Jan Philip Wahle, Terry Ruas, Bela Gipp
- Primus: A Pioneering Collection of Open-Source Datasets for Cybersecurity LLM Training
Yao-Ching Yu, Tsun-Han Chiang, Cheng-Wei Tsai, Chien-Ming Huang, Wen-Kwang Tsao
- Bit-Flip Error Resilience in LLMs: A Comprehensive Analysis and Defense Framework
Yuhang Chen, Zhen Tan, Ajay Kumar Jaiswal, Huaizhi Qu, Xinyu Zhao, Qi Lin, Yu Cheng, Andrew Kwong, Zhichao Cao, Tianlong Chen
- Towards Faithful Natural Language Explanations: A Study Using Activation Patching in Large Language Models
Yeo Wei Jie, Ranjan Satapathy, Erik Cambria
- Calibrating LLM Confidence by Probing Perturbed Representation Stability
Reza Khanmohammadi, Erfan Miahi, Mehrsa Mardikoraem, Simerjot Kaur, Ivan Brugere, Charese Smiley, Kundan S Thind, Mohammad M. Ghassemi
- SATER: A Self-Aware and Token-Efficient Approach to Routing and Cascading
Yuanzhe Shen, Yide Liu, Zisu Huang, Ruicheng Yin, Xiaoqing Zheng, Xuanjing Huang
- DSG-MCTS: A Dynamic Strategy-Guided Monte Carlo Tree Search for Diversified Reasoning in Large Language Models
Rui Ha, Chaozhuo Li, Rui Pu, Litian Zhang, Xi Zhang, Sen Su
- CIFLEX: Contextual Instruction Flow for Sub-task Execution in Multi-Turn Interactions with a Single On-Device LLM
Juntae Lee, Jihwan Bang, Seunghan Yang, Simyung Chang
- On the Role of Model Prior in Real-World Inductive Reasoning
Zhuo Liu, Ding Yu, Hangfeng He
- Viability of Machine Translation for Healthcare in Low-Resourced Languages
Hellina Hailu Nigatu, Nikita Mehandru, Negasi Haile Abadi, Blen Gebremeskel, Ahmed Alaa, Monojit Choudhury
- Latent Inter-User Difference Modeling for LLM Personalization
Yilun Qiu, Tianhao Shi, Xiaoyan Zhao, Fengbin ZHU, Yang Zhang, Fuli Feng
- IG-Pruning: Input-Guided Block Pruning for Large Language Models
Kangyu Qiao, Shaolei Zhang, Yang Feng
- Are Checklists Really Useful for Automatic Evaluation of Generative Tasks?
Momoka Furuhashi, Kouta Nakayama, Takashi Kodama, Saku Sugawara
- Measuring the Effect of Disfluency in Multilingual Knowledge Probing Benchmarks
Kirill Semenov, Rico Sennrich
- Knowledge Editing through Chain-of-Thought
Changyue Wang, Weihang Su, Qingyao Ai, Yichen Tang, Yiqun LIU
- SelfRACG: Enabling LLMs to Self-Express and Retrieve for Code Generation
Qian Dong, Jia Chen, Qingyao Ai, Hongning Wang, Haitao Li, Yi Wu, Yao Hu, Yiqun LIU, Shaoping Ma
- Probing Logical Reasoning of MLLMs in Scientific Diagrams
Yufei Wang, Adriana Kovashka
- AdamS: Momentum Itself Can Be A Normalizer for LLM Pretraining and Post-training
Huishuai Zhang, Bohan Wang, Luoxin Chen
- Demystifying Synthetic Data in LLM Pre-training: A Systematic Study of Scaling Laws, Benefits, and Pitfalls
Feiyang Kang, Newsha Ardalani, Michael Kuchnik, Youssef Emad, Mostafa Elhoushi, Shubhabrata Sengupta, Shang-Wen Li, Ramya Raghavendra, Ruoxi Jia, Carole-Jean Wu
- Static or Dynamic: Towards Query-Adaptive Token Selection for Video Question Answering
Yumeng Shi, Quanyu Long, Wenya Wang
- DischargeSim: A Simulation Benchmark for Educational Doctor–Patient Communication at Discharge
Zonghai Yao, Michael Sun, Won Seok Jang, Sunjae Kwon, Soie Kwon, Hong Yu
- Can Vision-Language Models Solve Visual Math Equations?
Monjoy Narayan Choudhury, Junling Wang, Yifan Hou, Mrinmaya Sachan
- From Scores to Steps: Diagnosing and Improving LLM Performance in Evidence-Based Medical Calculations
Benlu Wang, Iris Xia, Yifan Zhang, Junda Wang, Feiyun Ouyang, Shuo Han, Hong Yu, Zonghai Yao
- Bridging External and Parametric Knowledge: Mitigating Hallucination of LLMs with Shared-Private Semantic Synergy in Dual-Stream Knowledge
Yi Sui, Chaozhuo Li, Chen Zhang, Dawei Song, Qiuchi Li
- Deep Associations, High Creativity: A Simple yet Effective Metric for Evaluating Large Language Models
Ziliang Qiu, Renfen Hu
- Identifying Unlearned Data in LLMs via Membership Inference Attacks
Advit Deepak, Megan Mou, Jing Huang, Diyi Yang
- Feature Extraction and Steering for Enhanced Chain-of-Thought Reasoning in Language Models
Zihao Li, Xu Wang, Yuzhe YANG, Ziyu Yao, Haoyi Xiong, Mengnan Du
- LLMs cannot spot math errors, even when allowed to peek into the solution
KV Aditya Srivatsa, Kaushal Kumar Maurya, Ekaterina Kochmar
- Can LLMs be Good Graph Judge for Knowledge Graph Construction?
Haoyu Huang, Chong Chen, Zeang Sheng, Yang Li, Wentao Zhang
- NeuroAda: Activating Each Neuron’s Potential for Parameter-Efficient Fine-Tuning
Zhi Zhang, Yixian Shen, Congfeng Cao, Ekaterina Shutova
- NileChat: Towards Linguistically Diverse and Culturally Aware LLMs for Local Communities
Abdellah El Mekki, Houdaifa Atou, Omer Nacar, Shady Shehata, Muhammad Abdul-Mageed
- A Computational Simulation of Language Production in First Language Acquisition
Yuan Gao
- Long-Form Information Alignment Evaluation Beyond Atomic Facts
Danna Zheng, Mirella Lapata, Jeff Z. Pan
- Voice of a Continent: Mapping Africa’s Speech Technology Frontier
AbdelRahim A. Elmadany, Sang Yun Kwon, Hawau Olamide Toyin, Alcides Alcoba Inciarte, Hanan Aldarmaki, Muhammad Abdul-Mageed
- Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains
Ibne Farabi Shihab, Sanjeda Akter, Anuj Sharma
- Circuit Complexity Bounds for RoPE-based Transformer Architecture
Bo Chen, Xiaoyu Li, Yingyu Liang, Jiangxuan Long, Zhenmei Shi, Zhao Song, Jiahao Zhang
- Efficient Unstructured Pruning of Mamba State-Space Models for Resource-Constrained Environments
Ibne Farabi Shihab, Sanjeda Akter, Anuj Sharma
- Towards Infinite-Long Prefix in Transformer
Yingyu Liang, Zhenmei Shi, Zhao Song, Chiwun Yang
- LATTE: Learning to Think with Vision Specialists
Zixian Ma, Jianguo Zhang, Zhiwei Liu, Jieyu Zhang, Juntao Tan, Manli Shu, Juan Carlos Niebles, Shelby Heinecke, Huan Wang, Caiming Xiong, Ranjay Krishna, Silvio Savarese
- SUA: Stealthy Multimodal Large Language Model Unlearning Attack
Xianren Zhang, Hui Liu, Delvin Ce Zhang, Xianfeng Tang, Qi He, Dongwon Lee, Suhang Wang
- ResFormer: All-Time Reservoir Memory for Long Sequence Classification
Hongbo Liu, Jia Xu
- Back Attention: Understanding and Enhancing Multi-Hop Reasoning in Large Language Models
Zeping Yu, Yonatan Belinkov, Sophia Ananiadou
- Interdisciplinary Research in Conversation: A Case Study in Computational Morphology for Language Documentation
Enora Rice, Katharina von der Wense, Alexis Palmer
- Analyzing Uncertainty of LLM-as-a-Judge: Interval Evaluations with Conformal Prediction
Huanxin Sheng, Xinyi Liu, Hangfeng He, Jieyu Zhao, Jian Kang
- AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Junyu Zhang, Runpei Dong, Han Wang, Xuying Ning, Haoran Geng, Peihao Li, Xialin He, Yutong Bai, Jitendra Malik, Saurabh Gupta, Huan Zhang
- Dual-Path Dynamic Fusion with Learnable Query for Multimodal Sentiment Analysis
Miao Zhou, Lina Yang, Thomas Wu, Dongnan Yang, Xinru Zhang
- CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners
Yunzhi Yao, Jizhan Fang, Jia-Chen Gu, Ningyu Zhang, Shumin Deng, Huajun Chen, Nanyun Peng
- DEL-ToM: Inference-Time Scaling for Theory-of-Mind Reasoning via Dynamic Epistemic Logic
Yuheng Wu, Jianwen Xie, Denghui Zhang, Zhaozhuo Xu
- Collaborative Beam Search: Enhancing LLM Reasoning via Collective Consensus
Yangyifan Xu, Shuo Ren, Jiajun Zhang
- Deriving Strategic Market Insights with Large Language Models: A Benchmark for Forward Counterfactual Generation
Keane Ong, Rui Mao, Deeksha Varshney, Paul Pu Liang, Erik Cambria, Gianmarco Mengaldo
- Towards Statistical Factuality Guarantee for Large Vision-Language Models
Zhuohang Li, Chao Yan, Nicholas J Jackson, Wendi Cui, Bo Li, Jiaxin Zhang, Bradley A. Malin
- Unlearning vs. Obfuscation: Are We Truly Removing Knowledge?
Guangzhi Sun, Potsawee Manakul, Xiao Zhan, Mark Gales
- Reward-Shifted Speculative Sampling Is An Efficient Test-Time Weak-to-Strong Aligner
Bolian Li, Yanran Wu, Xinyu Luo, Ruqi Zhang
- Stimulate the Critical Thinking of LLMs via Debiasing Discussion
Ruiyu Xiao, Lei Wu, Yuanxing Liu, Weinan Zhang, Ting Liu
- Toward Multi-Session Personalized Conversation: A Large-Scale Dataset and Hierarchical Tree Framework for Implicit Reasoning
Xintong Li, Jalend Bantupalli, Ria Dharmani, Yuwei Zhang, Jingbo Shang
- Improving Instruct Models for Free: A Study on Partial Adaptation
Ozan Irsoy, Pengxiang Cheng, Jennifer L Chen, Daniel Preotiuc-Pietro, Shiyue Zhang, Duccio Pappadopulo
- CoMMIT: Coordinated Multimodal Instruction Tuning
Xintong Li, Junda Wu, Tong Yu, Rui Wang, Yu Wang, Xiang Chen, Jiuxiang Gu, Lina Yao, Julian McAuley, Jingbo Shang
- Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
Tianhao Wu, Weizhe Yuan, Olga Golovneva, Jing Xu, Yuandong Tian, Jiantao Jiao, Jason E Weston, Sainbayar Sukhbaatar
- AnyMAC: Cascading Flexible Multi-Agent Collaboration via Next-Agent Prediction
Song Wang, Zhen Tan, Zihan Chen, Shuang Zhou, Tianlong Chen, Jundong Li
- A Good Plan is Hard to Find: Aligning Models with Preferences is Misaligned with What Helps Users
Nishant Balepur, Matthew Shu, Yoo Yeon Sung, Seraphina Goldfarb-Tarrant, Shi Feng, Fumeng Yang, Rachel Rudinger, Jordan Lee Boyd-Graber
- Words Like Knives: Backstory-Personalized Modeling and Detection of Violent Communication
Jocelyn J Shen, Akhila Yerukola, Xuhui Zhou, Cynthia Breazeal, Maarten Sap, Hae Won Park
- Separate the Wheat from the Chaff: Winnowing Down Divergent Views in Retrieval Augmented Generation
Song Wang, Zihan Chen, Peng Wang, Zhepei Wei, Zhen Tan, Yu Meng, Cong Shen, Jundong Li
- Cognitive Linguistic Identity Fusion Score (CLIFS): A Scalable Cognition‑Informed Approach to Quantifying Identity Fusion from Text
Devin R. Wright, Jisun An, Yong-Yeol Ahn
- SilVar: Speech-Driven Multimodal Model for Reasoning Visual Question Answering and Object Localization
Tan-Hanh Pham, Le Hoang Nam, Phu-Vinh Nguyen, Chris Ngo, Truong-Son Hy
- CEMTM: Contextual Embedding-based Multimodal Topic Modeling
Amirhossein Abaskohi, Raymond Li, Chuyuan Li, Shafiq Joty, Giuseppe Carenini
- RedHerring Attack: Testing the Reliability of Attack Detection
Jonathan Rusert
- Modeling Bottom-up Information Quality during Language Processing
Cui Ding, Yanning Yin, Lena Ann Jäger, Ethan Wilcox
- Data Drives Unstable Hierarchical Generalization in LMs
Tian Qin, Naomi Saphra, David Alvarez-Melis
- EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety
Jiahao Qiu, Yinghui He, Xinzhe Juan, Yimin Wang, Yuhan Liu, Zixin Yao, Yue Wu, Xun Jiang, Ling Yang, Mengdi Wang
- Polysemantic Dropout: Conformal OOD Detection for Specialized LLMs
Ayush Gupta, Ramneet Kaur, Anirban Roy, Adam D. Cobb, Rama Chellappa, Susmit Jha
- Facilitating Cognitive Accessibility with LLMs: A Multi-Task Approach to Easy-to-Read Text Generation
François Ledoyen, Gaël Dias, Jeremie Pantin, Fabrice Maurel, Alexis Lechervy, Youssef Chahir
- D-CoDe: Scaling Image-Pretrained VLMs to Video via Dynamic Compression and Question Decomposition
Yiyang Huang, Yizhou Wang, Yun Fu
- ReEvalMed: Rethinking Medical Report Evaluation by Aligning Metrics with Real-World Clinical Judgment
Ruochen Li, Jun Li, Bailiang Jian, Kun Yuan, Youxiang Zhu
- MultiMed-ST: Large-scale Many-to-many Multilingual Medical Speech Translation
Khai Le-Duc, Tuyen Tran, Bach Phan Tat, Nguyen Kim Hai Bui, Quan Dang Anh, Hung-Phong Tran, Thanh Thuy Nguyen, Ly Nguyen, Tuan Minh Phan, Thi Thu Phuong Tran, Chris Ngo, Khanh Xuan Nguyen, Thanh Nguyen-Tang
- Beyond Checkmate: Exploring the Creative Choke Points for AI Generated Texts
Nafis Irtiza Tripto, Saranya Venkatraman, Mahjabin Nahar, Dongwon Lee
- MoR: Better Handling Diverse Queries with a Mixture of Sparse, Dense, and Human Retrievers
Jushaan Singh Kalra, Xinran Zhao, To Eun Kim, Fengyu Cai, Fernando Diaz, Tongshuang Wu
- Learning Contextual Retrieval for Robust Conversational Search
Seunghan Yang, Juntae Lee, Jihwan Bang, Kyuhong Shim, Minsoo Kim, Simyung Chang
- LIDDIA: Language-based Intelligent Drug Discovery Agent
Reza Averly, Frazier N. Baker, Xia Ning
- Agentic-R1: Distilled Dual-Strategy Reasoning
Weihua Du, Pranjal Aggarwal, Sean Welleck, Yiming Yang
- Proactive Assistant Dialogue Generation from Streaming Egocentric Videos
Yichi Zhang, Xin Luna Dong, Zhaojiang Lin, Andrea Madotto, Anuj Kumar, Babak Damavandi, Joyce Chai, Seungwhan Moon
- Should I Share this Translation? Evaluating Quality Feedback for User Reliance on Machine Translation
Dayeon Ki, Kevin Duh, Marine Carpuat
- ChartGaze: Enhancing Chart Understanding in LVLMs with Eye-Tracking Guided Attention Refinement
Ali Salamatian, Amirhossein Abaskohi, Wan-Cyuan Fan, Mir Rayat Imtiaz Hossain, Leonid Sigal, Giuseppe Carenini
- LogiCoL: Logically-Informed Contrastive Learning for Set-based Dense Retrieval
Yanzhen Shen, Sihao Chen, Xueqiang Xu, Yunyi Zhang, Chaitanya Malaviya, Dan Roth
- ModalPrompt: Towards Efficient Multimodal Continual Instruction Tuning with Dual-Modality Guided Prompt
Fanhu Zeng, Fei Zhu, Haiyang Guo, Xu-Yao Zhang, Cheng-Lin Liu
- Skip-Thinking: Chunk-wise Chain-of-Thought Distillation Enable Smaller Language Models to Reason Better and Faster
Xiaoshu Chen, Sihang Zhou, KE LIANG, Xiaoyu Sun, Xinwang Liu
- Can an Individual Manipulate the Collective Decisions of Multi-Agents?
Fengyuan Liu, Rui Zhao, Shuo Chen, Guohao Li, Philip Torr, Lei Han, Jindong Gu
- Toxicity Red-Teaming: Benchmarking LLM Safety in Singapore’s Low-Resource Languages
Yujia Hu, Ming Shan Hee, Preslav Nakov, Roy Ka-Wei Lee
- Improving Clustering with Positive Pairs Generated from LLM-Driven Labels
Xiaotong Zhang, Ying Li
- Gamma-Guard: Lightweight Residual Adapters for Robust Guardrails in Large Language Models
Lijia Lv, Yuanshu Zhao, Guan Wang, Xuehai Tang, Wen Jie, Jizhong Han, Songlin Hu
- Facilitating Long Context Understanding via Supervised Chain-of-Thought Reasoning
Jingyang Lin, Andy Wong, Tian Xia, Shenghua He, Hui Wei, Mei Han, Jiebo Luo
- Dynamic Energy-Based Contrastive Learning with Multi-Stage Knowledge Verification for Event Causality Identification
Ya Su, Hu Zhang, Yue Fan, Guangjun Zhang, YuJie Wang, Ru Li, Hongye Tan
- ICG: Improving Cover Image Generation via MLLM-based Prompting and Personalized Preference Alignment
Zhipeng Bian, Jieming Zhu, Qijiong Liu, Wang Lin, Guohao Cai, Zhaocheng Du, Jiacheng Sun, Zhou Zhao, Zhenhua Dong
- From Long to Lean: Performance-aware and Adaptive Chain-of-Thought Compression via Multi-round Refinement
JianZhi Yan, Le Liu, Youcheng Pan, Shiwei Chen, Zike Yuan, Yang Xiang, Buzhou Tang
- A Symbolic Adversarial Learning Framework for Evolving Fake News Generation and Detection
Chong Tian, Qirong Ho, Xiuying Chen
- RareSyn: Health Record Synthesis for Rare Disease Diagnosis
Huimin WANG, Yutian Zhao, Yefeng Zheng, Xian Wu
- Sticker-TTS: Learn to Utilize Historical Experience with a Sticker-driven Test-Time Scaling Framework
Jie Chen, Jinhao Jiang, Yingqian Min, Zican Dong, Shijie Wang, Xin Zhao, Ji-Rong Wen
- CMHG: A Dataset and Benchmark for Headline Generation of Minority Languages in China
Guixian Xu, Zeli Su, Ziyin Zhang, Jianing Liu, Xu Han, Ting Zhang, Yushuang Dong
- Understanding the Information Propagation Effects of Communication Topologies in LLM-based Multi-Agent Systems
Xu Shen, Yixin Liu, Yiwei Dai, Yili Wang, Rui Miao, Yue Tan, Shirui Pan, Xin Wang
- Boosting Data Utilization for Multilingual Dense Retrieval
Chao Huang, Fengran Mo, Yufeng Chen, Changhao Guan, Zhenrui Yue, Xinyu Wang, Jinan Xu, Kaiyu Huang
- Self-Augmented Preference Alignment for Sycophancy Reduction in LLMs
Chien Hung Chen, Hen-Hsen Huang, Hsin-Hsi Chen
- TP-RAG: Benchmarking Retrieval-Augmented Large Language Model Agents for Spatiotemporal-Aware Travel Planning
Hang Ni, Fan Liu, Xinyu Ma, Lixin Su, Shuaiqiang Wang, Dawei Yin, Hui Xiong, Hao Liu
- Recontextualizing Revitalization: A Mixed Media Approach to Reviving the Nüshu Language
Ivory Yang, Xiaobo Guo, Yuxin Wang, Hefan Zhang, Yaning Jia, William Dinauer, Soroush Vosoughi
- Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving
Chuxue Cao, Mengze Li, Juntao Dai, Jinluan Yang, Zijian Zhao, Shengyu Zhang, Weijie Shi, Chengzhong LIU, Sirui Han, Yike Guo
- From Tens of Hours to Tens of Thousands: Scaling Back-Translation for Speech Recognition
Tianduo Wang, Lu Xu, Wei Lu, Shanbo Cheng
- CityEQA: A Hierarchical LLM Agent on Embodied Question Answering Benchmark in City Space
Yong Zhao, Kai Xu, Zhengqiu Zhu, Yue Hu, Zhiheng Zheng, Yingfeng Chen, Yatai Ji, Chen Gao, Yong Li, Jincai Huang
- Mitigating Hallucinations in Vision-Language Models through Image-Guided Head Suppression
Sreetama Sarkar, Yue Che, Alex Gavin, Peter Anthony Beerel, Souvik Kundu
- Examining False Positives under Inference Scaling for Mathematical Reasoning
Yu Wang, Nan Yang, Liang Wang, Furu Wei, Fuli Feng
- Translationese-index: Using Likelihood Ratios for Graded and Generalizable Measurement of Translationese
Yikang Liu, Wanyang Zhang, Yiming Wang, Jialong Tang, Pei Zhang, Baosong Yang, Fei Huang, Rui Wang, Hai Hu
- Exploring the Limitations of Mamba in COPY and CoT Reasoning
Ruifeng Ren, Zhicong Li, Yong Liu
- ProcWorld: Benchmarking Large Model Planning in Reachability-Constrained Environments
Dong Wang, Xinghang Li, Zhengshen Zhang, Jirong Liu, Xiao Ma, Hanbo Zhang, Tao Kong, Huaping Liu
- R2I-Bench: Benchmarking Reasoning-Driven Text-to-Image Generation
Kaijie Chen, Zihao Lin, Zhiyang Xu, Ying Shen, Yuguang Yao, Joy Rimchala, Jiaxin Zhang, Lifu Huang
- Can GRPO Boost Complex Multimodal Table Understanding?
Xiaoqiang Kang, Shengen Wu, Zimu Wang, Yilin Liu, Xiaobo Jin, Kaizhu Huang, Wei Wang, Yutao Yue, Xiaowei Huang, Qiufeng Wang
- MoMoE: Mixture of Moderation Experts Framework for AI-Assisted Online Governance
Agam Goyal, Xianyang Zhan, Yilun Chen, Koustuv Saha, Eshwar Chandrasekharan
- Following the Autoregressive Nature of LLM Embeddings via Compression and Alignment
Jingcheng Deng, Zhongtao Jiang, Liang Pang, Zihao Wei, Liwei Chen, Kun Xu, Yang Song, Huawei Shen, Xueqi Cheng
- Evaluating LLM-Generated Diagrams as Graphs
Chumeng Liang, Jiaxuan You
- Breaking Bad Tokens: Detoxification of LLMs Using Sparse Autoencoders
Agam Goyal, Vedant Rathi, William Yeh, Yian Wang, Yuen Chen, Hari Sundaram
- VCSearch: Bridging the Gap Between Well-Defined and Ill-Defined Problems in Mathematical Reasoning
Shi-Yu Tian, Zhi Zhou, Kun-Yang Yu, Ming Yang, Lin-Han Jia, Lan-Zhe Guo, Yu-Feng Li
- How do autoregressive transformers solve full addition?
WANG PEIXU, Chen Yu, Yu Ming, Cheng Xiang
- MAIN: Mutual Alignment Is Necessary for instruction tuning
Fanyi Yang, Jianfeng Liu, Xin Zhang, Haoyu Liu, Xixin Cao, Yuefeng Zhan, Hao Sun, Weiwei Deng, Feng Sun, Qi Zhang
- Expanding before Inferring: Enhancing Factuality in Large Language Models through Premature Layers Interpolation
Dingwei Chen, Ziqiang Liu, Feiteng Fang, Chak Tou Leong, Shiwen Ni, Ahmadreza Argha, Hamid Alinejad-Rokny, Min Yang, Chengming Li
- DeepWell-Adol: A Scalable Expert-Based Dialogue Corpus for Adolescent Positive Mental Health and Wellbeing Promotion
Wenyu Qiu, Yuxiong Wang, Jiajun Tan, Hanchao Hou, Qinda Liu, WEI YAO, Shiguang NI
- Data to Defense: The Role of Curation in Aligning Large Language Models Against Safety Compromise
Xiaoqun Liu, Jiacheng Liang, Luoxi Tang, Muchao Ye, Weicheng Ma, Zhaohan Xi
- Speculative Safety-Aware Decoding
Xuekang Wang, Shengyu Zhu, Xueqi Cheng
- PanicToCalm: A Proactive Counseling Agent for Panic Attacks
Jihyun Lee, Yejin Min, San Kim, Yejin Jeon, Sung Jun Yang, Hyounghun Kim, Gary Lee
- CoPL: Collaborative Preference Learning for Personalizing LLMs
Youngbin Choi, Seunghyuk Cho, Minjong Lee, MoonJeong Park, Yesong Ko, Jungseul Ok, Dongwoo Kim
- Dynamic Collaboration of Multi-Language Models based on Minimal Complete Semantic Units
Chao Hao, Zezheng Wang, Yanhua Huang, Ruiwen Xu, Wenzhe Niu, Xin Liu, Zitong YU
- AI Chatbots as Professional Service Agents: Developing a Professional Identity
Wenwen Li, Kangwei Shi, Yidong Chai
- DeepResonance: Enhancing Multimodal Music Understanding via Music-centric Multi-way Instruction Tuning
Zhuoyuan Mao, Mengjie Zhao, Qiyu Wu, Hiromi Wakaki, Yuki Mitsufuji
- Advancing Oversight Reasoning across Languages for Audit Sycophantic Behaviour via X-Agent
Leonardo Ranaldi, Giulia Pucci
- CAFE: Retrieval Head-based Coarse-to-Fine Information Seeking to Enhance Multi-Document QA Capability
Han Peng, Jinhao Jiang, Zican Dong, Xin Zhao, LEI FANG
- SSA-COMET: Do LLMs Outperform Learned Metrics in Evaluating MT for Under-Resourced African Languages?
Senyu Li, Jiayi Wang, Felermino D. M. A. Ali, Colin Cherry, Daniel Deutsch, Eleftheria Briakou, Rui Sousa-Silva, Henrique Lopes Cardoso, Pontus Stenetorp, David Ifeoluwa Adelani
- FaithUn: Toward Faithful Forgetting in Language Models by Investigating the Interconnectedness of Knowledge
Nakyeong Yang, Minsung Kim, Seunghyun Yoon, Joongbo Shin, Kyomin Jung
- Calibrating Pseudo-Labeling with Class Distribution for Semi-supervised Text Classification
Weiyi Yang, Richong Zhang, Junfan Chen, Jiawei Sheng
- Coarse-to-Fine Grounded Memory for LLM Agent Planning
Wei Yang, Jinwei Xiao, Hongming Zhang, Qingyang Zhang, Yanna Wang, Bo Xu
- From A and B to A+B: Can Large Language Models Solve Compositional Math Problems?
Xisheng Xiao, Hanlin Zhao
- Sycophancy Mitigation Through Reinforcement Learning with Uncertainty-Aware Adaptive Reasoning Trajectories
Mohammad Beigi, Ying Shen, Parshin Shojaee, Qifan Wang, Zichao Wang, Chandan K. Reddy, Ming Jin, Lifu Huang
- SimVBG: Simulating Individual Values by Backstory Generation
Bangde Du, Ziyi Ye, Zhijing Wu, Monika A. Jankowska, Shuqi Zhu, Qingyao Ai, Yujia Zhou, Yiqun LIU
- EvolveSearch: An Iterative Self-Evolving Search Agent
Ding-Chu Zhang, Yida Zhao, Jialong Wu, Liwen Zhang, Baixuan Li, Wenbiao Yin, Yong Jiang, Yu-Feng Li, Kewei Tu, Pengjun Xie, Fei Huang
- Syntax-Aware Retrieval Augmentation for Neural Symbolic Regression
Canmiao Zhou, Han Huang
- Merge then Realign: Simple and Effective Modality-Incremental Continual Learning for Multimodal LLMs
Dingkun Zhang, Shuhan Qi, Xinyu Xiao, Kehai Chen, Xuan Wang
- Graceful Forgetting in Generative Language Models
Chunyang Jiang, Chi-Min Chan, Yiyang Cai, Yulong Liu, Wei Xue, Yike Guo
- Answering Narrative-Driven Recommendation Queries via a Retrieve–Rank Paradigm and the OCG-Agent
Yunxiao Shi, Haoning Shang, Xing Zi, Wujiang Xu, Yue Feng, Min Xu
- Direct Value Optimization: Improving Chain-of-Thought Reasoning in LLMs with Refined Values
Hongbo Zhang, Han Cui, Guangsheng Bao, Linyi Yang, Jun Wang, Yue Zhang
- Jailbreak-Tuning: Models Efficiently Learn Jailbreak Susceptibility
Brendan Murphy, Dillon Bowen, Shahrad Mohammadzadeh, Tom Tseng, Julius Broomfield, Adam Gleave, Kellin Pelrine
- Neural Topic Modeling via Contextual and Graph Information Fusion
Jiyuan Liu, Jiaxing Yan, Chunjiang Zhu, Xingyu Liu, Li Qing, Yanghui Rao
- CARE: A Disagreement Detection Framework with Concept Alignment and Reasoning Enhancement
Jiyuan Liu, Jielin Song, Yunhe Pang, Zhiyu Shen, Yanghui Rao
- Beyond Task-Oriented and Chitchat Dialogues: Proactive and Transition-Aware Conversational Agents
Yejin Yoon, Yuri Son, Namyeong So, Minseo Kim, Minsoo Cho, Chanhee Park, Seungshin Lee, Taeuk Kim
- LightThinker: Thinking Step-by-Step Compression
Jintian Zhang, Yuqi Zhu, Mengshu Sun, Yujie Luo, Shuofei Qiao, Lun Du, Da Zheng, Huajun Chen, Ningyu Zhang
- How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark
Minglai Yang, Ethan Huang, Liang Zhang, Mihai Surdeanu, William Yang Wang, Liangming Pan
- Investigating Pedagogical Teacher and Student LLM Agents: Genetic Adaptation Meets Retrieval-Augmented Generation Across Learning Styles
Debdeep Sanyal, Agniva Maiti, Umakanta Maharana, Dhruv Kumar, Ankur Mali, C. Lee Giles, Murari Mandal
- GeoEdit: Geometric Knowledge Editing for Large Language Models
Yujie Feng, Li-Ming Zhan, ZEXIN LU, Yongxin Xu, Xu Chu, Yasha Wang, Jiannong Cao, Philip S. Yu, Xiao-Ming Wu
- A Generative Pre-Trained Language Model for Channel Prediction in Wireless Communications Systems
Bo Lin, Huanming Zhang, Yuhua Jiang, Yucong Wang, Tengyu Zhang, Shaoqiang Yan, Hongyao Li, Yihong Liu, Feifei Gao
- AIMMerging: Adaptive Iterative Model Merging Using Training Trajectories for Language Model Continual Learning
Yujie Feng, Jian Li, Xiaoyu DONG, Pengfei Xu, Xiaohui Zhou, Yujia Zhang, ZEXIN LU, Yasha Wang, Alan Zhao, Xu Chu, Xiao-Ming Wu
- R-PRM: Reasoning-Driven Process Reward Modeling
Shuaijie She, Junxiao Liu, Yifeng Liu, Jiajun Chen, Xin Huang, Shujian Huang
- RLAE: Reinforcement Learning-Assisted Ensemble for LLMs
Yuqian Fu, Yuanheng Zhu, Jiajun Chai, Guojun Yin, Wei Lin, Qichao Zhang, Dongbin Zhao
- Do Large Language Models Truly Grasp Addition? A Rule-Focused Diagnostic Using Two-Integer Arithmetic
Yang Yan, Yu Lu, Renjun Xu, Zhenzhong Lan
- AskToAct: Enhancing LLMs Tool Use via Self-Correcting Clarification
Xuan Zhang, Yongliang Shen, Zhe Zheng, Linjuan Wu, Wenqi Zhang, Yuchen Yan, Qiuying Peng, Jun Wang, Weiming Lu
- START: Self-taught Reasoner with Tools
Chengpeng Li, Mingfeng Xue, Zhenru Zhang, Jiaxi Yang, Beichen Zhang, Bowen Yu, Binyuan Hui, Junyang Lin, Xiang Wang, Dayiheng Liu
- The Impact of Negated Text on Hallucination with Large Language Models
Jaehyung Seo, Hyeonseok Moon, Heuiseok Lim
- A Probabilistic Inference Scaling Theory for LLM Self-Correction
Zhe Yang, Yichang Zhang, Yudong Wang, Ziyao Xu, Junyang Lin, Zhifang Sui
- MentalGLM Series: Explainable Large Language Models for Mental Health Analysis on Chinese Social Media
Wei Zhai, Nan Bai, Qing Zhao, Jianqiang Li, Fan Wang, Hongzhi Qi, Meng Jiang, Xiaoqin Wang, Bing Xiang Yang, Guanghui FU
- Knowledge-Aware Co-Reasoning for Multidisciplinary Collaboration
Xurui Li, wanghaijiao, Kaisong Song, Rui Zhu, Haixu Tang
- Astra: Efficient Transformer Architecture and Contrastive Dynamics Learning for Embodied Instruction Following
Yueen Ma, DaFeng Chi, Shiguang Wu, Yuecheng Liu, Yuzheng Zhuang, Irwin King
- MAVL: A Multilingual Audio-Video Lyrics Dataset for Animated Song Translation
Woohyun Cho, Youngmin Kim, Sunghyun Lee, Youngjae Yu
- MuTIS: Enhancing Reasoning Efficiency through Multi Turn Intervention Sampling in Reinforcement Learning
Wenshuo Zhao, Haoxing Zhai, Xinyu Qiu, Zhenting Qi, Shuhe Li, Linchao Zhu
- PRIM: Towards Practical In-Image Multilingual Machine Translation
Yanzhi Tian, Zeming Liu, Zhengyang Liu, Chong Feng, Xin Li, Heyan Huang, Yuhang Guo
- Mind the Inclusivity Gap: Multilingual Gender-Neutral Translation Evaluation with mGeNTE
Beatrice Savoldi, Giuseppe Attanasio, Eleonora Cupin, Eleni Gkovedarou, Janiça Hackenbuchner, Anne Lauscher, Matteo Negri, Andrea Piergentili, Manjinder Thind, Luisa Bentivogli
- DiplomacyAgent: Do LLMs Balance Interests and Ethical Principles in International Events?
Jianxiang Peng, Ling Shi, Xinwei Wu, Hanwen Zhang, Fujiang Liu, Haocheng Lyu, Deyi Xiong
- DisLoRA: Task-specific Low-Rank Adaptation via Orthogonal Basis from Singular Value Decomposition
She Yifei, Xinhao Wei, Yulong Wang
- Unmasking Deceptive Visuals: Benchmarking Multimodal Large Language Models on Misleading Chart Question Answering
Zixin CHEN, Sicheng Song, KaShun SHUM, Yanna Lin, Rui SHENG, Weiqi Wang, Huamin Qu
- Textual Aesthetics in Large Language Models
Lingjie Jiang, Shaohan Huang, Xun Wu, Furu Wei
- Section-Level Simplification of Biomedical Abstracts
Jan Bakker, Jaap Kamps
- PoseStitch-SLT: Linguistically Inspired Pose-Stitching for End-to-End Sign Language Translation
Abhinav Joshi, Vaibhav Sharma, Sanjeet Singh, Ashutosh Modi
- Few-Shot Open-Set Classification via Reasoning-Aware Decomposition
Avyav Kumar Singh, Helen Yannakoudakis
- Translation in the Hands of Many: Centering Lay Users in Machine Translation Interactions
Beatrice Savoldi, Alan Ramponi, Matteo Negri, Luisa Bentivogli
- iTool: Reinforced Fine-Tuning with Dynamic Deficiency Calibration for Advanced Tool Use
Yirong Zeng, Xiao Ding, Yuxian Wang, Weiwen Liu, Yutai Hou, Wu Ning, Xu Huang, Duyu Tang, Dandan Tu, Bing Qin, Ting Liu
- Transplant Then Regenerate: A New Paradigm for Text Data Augmentation
Guangzhan Wang, Hongyu Zhang, Beijun Shen, Xiaodong Gu
- Compositional Generalisation for Explainable Hate Speech Detection
Agostina Calabrese, Tom Sherborne, Björn Ross, Mirella Lapata
- CCQA: Generating Question from Solution Can Improve Inference-Time Reasoning in SLMs
Jinyoung Kim, Ji Won Yoon
- TVQACML: Benchmarking Text-Centric Visual Question Answering in Multilingual Chinese Minority Languages
shajiu, Mengxiao Zhu, Chong Feng, LAMA jIe
- Transparent and Coherent Procedural Mistake Detection
Shane Storks, Itamar Bar-Yossef, Yayuan Li, Zheyuan Zhang, Jason J Corso, Joyce Chai
- Teaching Your Models to Understand Code via Focal Preference Alignment
Jie Wu, Haoling Li, Xin Zhang, Xiao Liu, Yangyu Huang, Jianwen Luo, Yizhen Zhang, Zuchao Li, Ruihang Chu, Yujiu Yang, Scarlett Li
- MoLoRAG: Bootstrapping Document Understanding via Multi-modal Logic-aware Retrieval
Xixi Wu, Yanchao Tan, Nan Hou, Ruiyang Zhang, Hong Cheng
- Vision-Free Retrieval: Rethinking Multimodal Search with Textual Scene Descriptions
Ioanna Ntinou, ALEXANDROS XENOS, Yassine Ouali, Adrian Bulat, Georgios Tzimiropoulos
- TableRAG: A Retrieval Augmented Generation Framework for Heterogeneous Document Reasoning
Xiaohan Yu, Pu Jian, Chong Chen
- Retrieval Enhanced Feedback via In-context Neural Error-book
Jongyeop Hyun, Bumsoo Kim
- Improve LLM-as-a-Judge Ability as a General Ability
Jiachen Yu, Shaoning Sun, Xiaohui Hu, Jiaxu Yan, Kaidong Yu, Xuelong Li
- G2: Guided Generation for Enhanced Output Diversity in LLMs
Zhiwen Ruan, Yixia Li, Yefeng Liu, Yun Chen, Weihua Luo, Peng Li, Yang Liu, Guanhua Chen
- ToolSafety: A Comprehensive Dataset for Enhancing Safety in LLM-Based Agent Tool Invocations
Yuejin Xie, Youliang Yuan, Wenxuan Wang, Fan Mo, Jianmin Guo, Pinjia He
- Learning to See through Sound: From VggCaps to Multi2Cap for Richer Automated Audio Captioning
Sangyeon Cho, Mingi Kim, Jinkwon Hwang, Jaehoon Go, Minuk Ma, Sunjae Yoon, Junyeong Kim
- Towards Optimal Evaluation Efficiency for Large Language Models
Guohong Li, Deyi Xiong
- MMAPG: A Training-Free Framework for Multimodal Multi-hop Question Answering via Adaptive Planning Graphs
Yiheng Hu, Xiaoyang Wang, Qing Liu, Xiwei Xu, Qian Fu, Wenjie Zhang, Liming Zhu
- Mixture-of-Clustered-Experts: Advancing Expert Specialization and Generalization in Instruction Tuning
Sugyeong Eo, Jung Jun Lee, Chanjun Park, Heuiseok Lim
- Process-Supervised Reinforcement Learning for Code Generation
Yufan Ye, Ting Zhang, Wenbin Jiang, Hua Huang
- MuCAL: Contrastive Alignment for Preference-Driven KG-to-Text Generation
Yifei Song, Claire Gardent
- Advancing Fine-Grained Visual Understanding with Multi-Scale Alignment in Multi-Modal Models
Wei Wang, Zhaowei Li, Qi Xu, Linfeng Li, YiQing Cai, Botian Jiang, Hang Song, Xingcan Hu, Pengyu Wang, Li Xiao
- Thought calibration: Efficient and confident test-time scaling
Menghua Wu, Cai Zhou, Stephen Bates, Tommi Jaakkola
- Can LLMs Reason Abstractly Over Math Word Problems Without CoT? Disentangling Abstract Formulation From Arithmetic Computation
Ziling Cheng, Meng Cao, Leila Pishdad, Yanshuai Cao, Jackie CK Cheung
- QCRD: Quality-guided Contrastive Rationale Distillation for Large Language Models
Wei Wang, Zhaowei Li, Qi Xu, YiQing Cai, Hang Song, Qi Qi, Ran Zhou, Zhida Huang, Tao Wang, Li Xiao
- SHARP: Steering Hallucination in LVLMs via Representation Engineering
Junfei Wu, Yue Ding, Guofan Liu, Tianze Xia, Ziyue Huang, Dianbo Sui, Qiang Liu, Shu Wu, Liang Wang, Tieniu Tan
- Think, Verbalize, then Speak: Bridging Complex Thoughts and Comprehensible Speech
Sang Hoon Woo, Sehun Lee, Kang-wook Kim, Gunhee Kim
- Warm Up Before You Train: Unlocking General Reasoning in Resource-Constrained Settings
Safal Shrestha, Minwu Kim, Aadim Nepal, Anubhav Shrestha, Keith W. Ross
- PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides
Hao Zheng, Xinyan Guan, Hao Kong, Wenkai Zhang, Jia Zheng, Weixiang Zhou, Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun
- SWAM: Adaptive Sliding Window and Memory-Augmented Attention Model for Rumor Detection
Mei Guo, Chen Chen, Chunyan Hou, Yike Wu, Xiaojie Yuan
- HydraRAG: Structured Cross-Source Enhanced Large Language Model Reasoning
Xingyu Tan, Xiaoyang Wang, Qing Liu, Xiwei Xu, Xin Yuan, Liming Zhu, Wenjie Zhang
- VRoPE: Rotary Position Embedding for Video Large Language Models
Zikang Liu, Longteng Guo, Yepeng Tang, Tongtian Yue, Junxian Cai, Kai Ma, Qingbin Liu, Xi Chen, Jing Liu
- SciNLP: A Domain-Specific Benchmark for Full-Text Scientific Entity and Relation Extraction in NLP
Decheng Duan, Jitong Peng, Yingyi Zhang, Chengzhi Zhang
- Think and Recall: Layer-Level Prompting for Lifelong Model Editing
Jinke Wang, Zenan Ying, Qi Liu, Wei Chen, Tong Xu, Huijun Hou, Zhi Zheng
- SPIRIT: Patching Speech Language Models against Jailbreak Attacks
Amirbek Djanibekov, Nurdaulet Mukhituly, Kentaro Inui, Hanan Aldarmaki, Nils Lukas
- FIRE: Flexible Integration of Data Quality Ratings for Effective Pretraining
Xu Liangyu, Xuemiao Zhang, Feiyu Duan, Sirui Wang, Rongxiang Weng, Jingang Wang, Xunliang Cai
- Multi-Domain Explainability of Preferences
Nitay Calderon, Liat Ein-Dor, Roi Reichart
- Tuning Less, Prompting More: In-Context Preference Learning Pipeline for Natural Language Transformation
Shuyun Yang, Yan Zhang, Zhengmao Ye, Lei Duan, Mingjie Tang
- IL-PCSR: Legal Corpus for Prior Case and Statute Retrieval
Shounak Paul, Dhananjay Ghumare, Pawan Goyal, Saptarshi Ghosh, Ashutosh Modi
- ESGenius: Benchmarking LLMs on Environmental, Social, and Governance (ESG) and Sustainability Knowledge
Chaoyue He, Xin Zhou, Yi Wu, Xinjia Yu, Yan Zhang, Lei Zhang, Di Wang, Shengfei Lyu, Hong Xu, Wang Xiaoqiao, Wei Liu, Chunyan Miao
- How Sememic Components Can Benefit Link Prediction for Lexico-Semantic Knowledge Graphs?
Hansi Wang, Yue Wang, Qiliang Liang, Yang Liu
- WISE: Weak-Supervision-Guided Step-by-Step Explanations for Multimodal LLMs in Image Classification
Yiwen Jiang, Deval Mehta, Siyuan Yan, Yaling Shen, Zimu Wang, Zongyuan Ge
- Calibration Across Layers: Understanding Calibration Evolution in LLMs
Abhinav Joshi, Areeb Ahmad, Ashutosh Modi
- The discordance between embedded ethics and cultural inference in large language models
Aida Ramezani, Yang Xu
- SSA: Semantic Contamination of LLM-Driven Fake News Detection
Cheng Xu, Nan Yan, Shuhao Guan, Yuke Mei, Tahar Kechadi
- Logits-Based Finetuning
Jingyao Li, Senqiao Yang, Sitong Wu, Han Shi, Chuanyang Zheng, Hong Xu, Jiaya Jia
- STARE at the Structure: Steering ICL Exemplar Selection with Structural Alignment
Jiaqian Li, Qisheng Hu, Jing Li, Wenya Wang
- PPC-GPT: Federated Task-Specific Compression of Large Language Models via Pruning and Chain-of-Thought Distillation
Tao Fan, Guoqiang Ma, Yuanfeng SONG, Lixin Fan, Qiang Yang
- Efficient Beam Search for Large Language Models Using Trie-Based Decoding
Brian J Chan, Mao-xun Huang, Jui-Hung Cheng, Chao-Ting Chen, Hen-Hsen Huang
- Power doesn’t reside in size: A Low Parameter Hybrid Language Model (HLM) for Sentiment Analysis in Code-mixed data
Pavan Sai Balaga, Nagasamudram Karthik, Challa Vishwanath, Raksha Sharma, Rudra Murthy, Ashish Mittal
- Evaluating Taxonomy Free Character Role Labeling (TF-CRL) in News Stories using Large Language Models
David G Hobson, Derek Ruths, Andrew Piper
- MIRROR: Multimodal Cognitive Reframing Therapy for Rolling with Resistance
Subin Kim, Hoonrae Kim, Jihyun Lee, Yejin Jeon, Gary Lee
- RETAIL: Towards Real-world Travel Planning for Large Language Models
Bin Deng, Yizhe Feng, Zeming Liu, Qing Wei, Xiangrong Zhu, Shuai Chen, Yuanfang Guo, Yunhong Wang
- Unraveling Interwoven Roles of Large Language Models in Authorship Privacy: Obfuscation, Mimicking, and Verification
Tuc Nguyen, Yifan Hu, Thai Le
- Reward Model Perspectives: Whose Opinions Do Reward Models Reward?
Elle
- FLRC: Fine-grained Low-Rank Compressor for Efficient LLM Inference
Yu-Chen Lu, Chong-Yan Chen, Chi-Chih Chang, Yu-Fang Hu, Kai-Chiang Wu
- Do You Know About My Nation? Investigating Multilingual Language Models’ Cultural Literacy Through Factual Knowledge
Eshaan Tanwar, Anwoy Chatterjee, Michael Saxon, Alon Albalak, William Yang Wang, Tanmoy Chakraborty
- CoEvo: Coevolution of LLM and Retrieval Model for Domain-Specific Information Retrieval
Ang Li, Yiquan Wu, Yinghao Hu, Lizhi Qing, Shihang Wang, Chengyuan Liu, Tao Wu, Adam Jatowt, Ming Cai, Fei Wu, Kun Kuang
- Conan-Embedding-v2: Training an LLM from Scratch for Text Embeddings
Shiyu Li, Yang Tang, Ruijie Liu, Shi-Zhe Chen, Xi Chen
- Vision-and-Language Navigation with Analogical Textual Descriptions in LLMs
Yue Zhang, Tianyi Ma, Zun Wang, Yanyuan Qiao, Parisa Kordjamshidi
- MUCAR: Benchmarking Multilingual Cross-Modal Ambiguity Resolution for Multimodal Large Language Models
Xiaolong Wang, Zhaolu Kang, Wangyuxuan Zhai, Xinyue Lou, Yunghwei Lai, Ziyue Wang, Yawen Wang, Kaiyu Huang, Yile Wang, Peng Li, Yang Liu
- Mind the Gap: How BabyLMs Learn Filler-Gap Dependencies
Chi-Yun Chang, Xueyang Huang, Humaira Nasir, Shane Storks, Olawale Akingbade, Huteng Dai
- Paths Not Taken: Understanding and Mending the Multilingual Factual Recall Pipeline
Meng Lu, Ruochen Zhang, Carsten Eickhoff, Ellie Pavlick
- BTC-SAM: Leveraging LLMs for Generation of Bias Test Cases for Sentiment Analysis Models
Zsolt T. Kardkovács, LYNDA DJENNANE, Anna Field, Boualem Benatallah, Yacine GACI, Fabio Casati, Walid Gaaloul
- Debate-to-Detect: Reformulating Misinformation Detection as a Real-World Debate with Large Language Models
Chen Han, Wenzhen Zheng, Xijin Tang
- Controllable Memorization in LLMs via Weight Pruning
Chenjie Ni, Zhepeng Wang, Runxue Bao, Shangqian Gao, Yanfu Zhang
- Tracing L1 Interference in English Learner Writing: A Longitudinal Corpus with Error Annotations
Poorvi Acharya, J. Elizabeth Liebl, Dhiman Goswami, Kai North, Marcos Zampieri, Antonios Anastasopoulos
- DCIS: Efficient Length Extrapolation of LLMs via Divide-and-Conquer Scaling Factor Search
Lei Yang, Shaoyang Xu, Jianxiang Peng, Shaolin Zhu, Deyi Xiong
- Who is in the Spotlight: The Hidden Bias Undermining Multimodal Retrieval-Augmented Generation
Jiayu Yao, Shenghua Liu, Yiwei Wang, Lingrui Mei, Baolong Bi, Yuyao Ge, Zhecheng Li, Xueqi Cheng
- Let’s Play Across Cultures: A Large Multilingual, Multicultural Benchmark for Assessing Language Models’ Understanding of Sports
Punit Kumar Singh, Nishant Kumar, Akash Ghosh, Kunal Pasad, Khushi Soni, Manisha Jaishwal, Sriparna Saha, Syukron Abu Ishaq Alfarozi, Asres Temam Abagissa, Kitsuchart Pasupa, Jose G Moreno, Haiqin Yang
- Multilingual Federated Low-Rank Adaptation for Collaborative Content Anomaly Detection across Multilingual Social Media Participants
Jiaxin Li, Geng Zhao
- M3Retrieve: Benchmarking Multimodal Retrieval for Medicine
Arkadeep Acharya, Akash Ghosh, Pradeepika Verma, Kitsuchart Pasupa, Sriparna Saha, Dr Priti Singh
- The Hidden Strength of Disagreement: Unraveling the Consensus-Diversity Tradeoff in Adaptive Multi-Agent Systems
Zengqing Wu, Takayuki Ito
- Friend or Foe? A Computational Investigation of Semantic False Friends across Romance Languages
Ana Sabina Uban, Liviu P Dinu, Ioan-Bogdan Iordache, Simona Georgescu, Claudia Vlad
- KLAAD: Refining Attention Mechanisms to Reduce Societal Bias in Generative Language Models
Seorin Kim, Dongyoung Lee, Jaejin Lee
- SeMob: Semantic Synthesis for Dynamic Urban Mobility Prediction
Runfei Chen, Shuyang Jiang, Wei Huang
- DyePack: Provably Flagging Test Set Contamination in LLMs Using Backdoors
Yize Cheng, Wenxiao Wang, Mazda Moayeri, Soheil Feizi
- Minimal, Local, and Robust: Embedding-Only Edits for Implicit Bias in T2I Models
Feng He, Chao Zhang, Zhixue Zhao
- Journalism-Guided Agentic In-context Learning for News Stance Detection
Dahyun Lee, Jonghyeon Choi, Jiyoung Han, Kunwoo Park
- Less Is MuRE: Revisiting Shallow Knowledge Graph Embeddings
Victor Charpenay, Steven Schockaert
- Jailbreak LLMs through Internal Stance Manipulation
Shuangjie Fu, Du Su, Beining Huang, Fei Sun, Jingang Wang, Wei Chen, Huawei Shen, Xueqi Cheng
- Pierce the Mists, Greet the Sky: Decipher Knowledge Overshadowing via Knowledge Circuit Analysis
Haoming Huang, Yibo Yan, Jiahao Huo, Xin Zou, Xinfeng Li, Kun Wang, Xuming Hu
- Complex Numerical Reasoning with Numerical Semantic Pre-training Framework
Jun Zhang, Haihong E, Tianyi Hu, Yifan Zhu, Meina Song, Haoran Luo
- Automated Knowledge Graph Construction using Large Language Models and Sentence Complexity Modelling
Sydney Anuyah, Mehedi Mahmud Kaushik, Sri Rama Krishna Reddy Dwarampudi, Rakesh Shiradkar, Arjan Durresi, Sunandan Chakraborty
- OntologyRAG-Q: Resource Development and Benchmarking for Retrieval-Augmented Question Answering in Qur’anic Tafsir
Sadam Al-Azani, Maad Alowaifeer, Alhanoof Alhunief, Ahmed Abdelali
- The Practical Impacts of Theoretical Constructs on Empathy Modeling
Allison Lahnala, Charles Welch, David Jurgens, Lucie Flek
- RecBase: Generative Foundation Model Pretraining for Zero-Shot Recommendation
Sashuai Zhou, Weinan Gan, Qijiong Liu, Ke Lei, Jieming Zhu, Hai Huang, Yan Xia, Ruiming Tang, Zhenhua Dong, Zhou Zhao
- Grouping Entities with Shared Properties using Multi-Facet Prompting and Property Embeddings
Amit Gajbhiye, Thomas Bailleux, Zied Bouraoui, Luis Espinosa-Anke, Steven Schockaert
- Context-Aware Hierarchical Taxonomy Generation for Scientific Papers via LLM-Guided Multi-Aspect Clustering
Kun Zhu, Lizi Liao, Yuxuan Gu, Lei Huang, Xiaocheng Feng, Bing Qin
- Benchmark Profiling: Mechanistic Diagnosis of LLM Benchmarks
Dongjun Kim, Gyuho Shim, Yongchan Chun, Minhyuk Kim, Chanjun Park, Heuiseok Lim
- TreeReview: A Dynamic Tree of Questions Framework for Deep and Efficient LLM-based Scientific Peer Review
Yuan Chang, Ziyue Li, Hengyuan Zhang, Yuanbo Kong, Yanru Wu, Zhijiang Guo, Ngai Wong
- Improving Chemical Understanding of LLMs via SMILES Parsing
Yunhui Jang, Jaehyung Kim, Sungsoo Ahn
- Can Large Language Models Tackle Graph Partitioning?
Yiheng Wu, Ningchao Ge, Yanmin Li, Liwei Qian, Mengna Zhu, Haoyu Yang, Haiwen Chen, Jibing Wu
- To See a World in a Spark of Neuron: Disentangling Multi-Task Interference for Training-Free Model Merging
Zitao Fang, Guodong DU, Shuyang Yu, Yifei Guo, Yiwei Zhang, Yiyao Cao, Jing Li, Ho-Kin Tang, Sim Kuan Goh
- What You Read Isn’t What You Hear: Linguistic Sensitivity in Deepfake Speech Detection
Binh Nguyen, Shuju Shi, Ryan Ofman, Thai Le
- Task-Aware Resolution Optimization for Visual Large Language Models
Weiqing Luo, Zhen Tan, Yifan Li, Xinyu Zhao, Kwonjoon Lee, Behzad Dariush, Tianlong Chen
- CheckEval: A reliable LLM-as-a-Judge framework for evaluating text generation using checklists
Yukyung Lee, JoongHoon Kim, Jaehee Kim, Hyowon Cho, Jaewook Kang, Pilsung Kang, Najoung Kim
- A Necessary Step toward Faithfulness: Measuring and Improving Consistency in Free-Text Explanations
Lingjun Zhao, Hal Daumé III
- Boosting Multi-modal Keyphrase Prediction with Dynamic Chain-of-Thought in Vision-Language Models
Qihang Ma, Shengyu Li, Jie Tang, Dingkang Yang, Chenshaodong, Yingyi Zhang, Chao Feng, Ran Jiao
- Chart2Code53: A Large-Scale Diverse and Complex Dataset for Enhancing Chart-to-Code Generation
Tianhao Niu, Yiming Cui, Baoxin Wang, Xiao Xu, Xin Yao, Qingfu Zhu, Dayong Wu, Shijin Wang, Wanxiang Che
- The State of Multilingual LLM Safety Research: From Measuring The Language Gap To Mitigating It
Zheng Xin Yong, Beyza Ermis, Marzieh Fadaee, Stephen Bach, Julia Kreutzer
- AIP: Subverting Retrieval-Augmented Generation via Adversarial Instructional Prompt
Saket Sanjeev Chaturvedi, Gaurav Bagwe, Lan Emily Zhang, Xiaoyong Yuan
- From Capabilities to Performance: Evaluating Key Functional Properties of LLM Architectures in Penetration Testing
Lanxiao Huang, Daksh Dave, Tyler Cody, Peter A. Beling, Ming Jin
- Editing Across Languages: A Survey of Multilingual Knowledge Editing
Nadir Durrani, Basel Mousi, Fahim Dalvi
- Your RAG is Unfair: Exposing Fairness Vulnerabilities in Retrieval-Augmented Generation via Backdoor Attacks
Gaurav Bagwe, Saket Sanjeev Chaturvedi, Xiaolong Ma, Xiaoyong Yuan, Kuang-Ching Wang, Lan Emily Zhang
- Drift-Adapter: A Practical Approach to Near Zero-Downtime Embedding Model Upgrades in Vector Databases
Harshil Vejendla
- The Staircase of Ethics: Probing LLM Value Priorities through Multi-Step Induction to Complex Moral Dilemmas
Ya Wu, Qiang Sheng, Danding Wang, Guang Yang, Yifan Sun, Zhengjia Wang, Yuyan Bu, Juan Cao
- SliceMoE: Routing Embedding Slices Instead of Tokens for Fine-Grained and Balanced Transformer Scaling
Harshil Vejendla
- ReSo: A Reward-driven Self-organizing LLM-based Multi-Agent System for Reasoning Tasks
Heng Zhou, Hejia Geng, Xiangyuan Xue, Li Kang, Yiran Qin, Zhiyong Wang, Zhenfei Yin, Lei Bai
- ConstraintLLM: A Neuro-Symbolic Framework for Industrial-Level Constraint Programming
Weichun Shi, Minghao Liu, Wanting Zhang, Langchen Shi, Fuqi Jia, Feifei Ma, Jian Zhang
- VisEscape: A Benchmark for Evaluating Exploration-driven Decision-making in Virtual Escape Rooms
Seungwon Lim, Sungwoong Kim, Jihwan Yu, Sungjae Lee, Jiwan Chung, Youngjae Yu
- ESC-Judge: A Framework for Comparing Emotional Support Conversational Agents
Navid Madani
- Neuron-Level Differentiation of Memorization and Generalization in Large Language Models
Ko-Wei Huang, Yi-Fu Fu, Ching-Yu Tsai, Yu-Chieh Tu, Tzu-Ling Cheng, Cheng-Yu Lin, Yi-Ting Yang, Heng-Yi Liu, Keng-Te Liao, Da-Cheng Juan, Shou-De Lin
- Sparse Neurons Carry Strong Signals of Question Ambiguity in LLMs
Zhuoxuan Zhang, Jinhao Duan, Edward Kim, Kaidi Xu
- Do Slides Help? Multi-modal Context for Automatic Transcription of Conference Talks
Supriti Sinhamahapatra, Jan Niehues
- Promote, Suppress, Iterate: How Language Models Answer One-to-Many Factual Queries
Tianyi Lorena Yan, Robin Jia
- Out of Sight, Not Out of Context? Egocentric Spatial Reasoning in VLMs Across Disjoint Frames
Sahithya Ravi, Gabriel Herbert Sarch, Vibhav Vineet, Andrew D Wilson, Balasaravanan Thoravi Kumaravel
- Enhancing Chain-of-Thought Reasoning via Neuron Activation Differential Analysis
Yiru Tang, Kun Zhou, Yingqian Min, Jing Sha, Zhichao Sheng, Shijin Wang, Xin Zhao
- PakBBQ: A Culturally Adapted Bias Benchmark for QA
Abdullah Hashmat, Muhammad Arham Mirza, Agha Ali Raza
- MULTIGUARD: An Efficient Approach for AI Safety Moderation Across Languages and Modalities
Sahil Verma, Keegan Hines, Jeff Bilmes, Charlotte Siska, Luke Zettlemoyer, Hila Gonen, Chandan Singh
- Comparing human and LLM politeness strategies in free production
Haoran Zhao, Robert D. Hawkins
- ASTRA: A Negotiation Agent with Adaptive and Strategic Reasoning via Tool-integrated Action for Dynamic Offer Optimization
Deuksin Kwon, Jiwon Hae, Emma Clift, Daniel Shamsoddini, Jonathan Gratch, Gale Lucas
- CARMA: Enhanced Compositionality in LLMs via Advanced Regularisation and Mutual Information Alignment
Nura Aljaafari, Danilo Carvalho, Andre Freitas
- MEPT: Mixture of Expert Prompt Tuning as a Manifold Mapper
Runjia Zeng, Guangyan Sun, Qifan Wang, Tong Geng, Sohail Dianat, Xiaotian Han, Raghuveer Rao, Xueling Zhang, Cheng Han, Lifu Huang, Dongfang Liu
- KG-CQR: Leveraging Structured Relation Representations in Knowledge Graphs for Contextual Query Retrieval
Chi Minh Bui, Ngoc Mai Thieu, Vinh Van Nguyen, Jason J. Jung, Khac-Hoai Nam Bui
- SABER: Uncovering Vulnerabilities in Safety Alignment via Cross-Layer Residual Connection
Maithili Joshi, Palash Nandi, Tanmoy Chakraborty
- When Truthful Representations Flip Under Deceptive Instructions?
Xianxuan Long, Yao Fu, Runchao Li, Mu Sheng, Haotian Yu, Xiaotian Han, Pan Li
- Can LLMs simulate the same correct solutions to free-response math problems as real students?
Yuya Asano, Diane Litman, Erin Walker
- Evaluating Behavioral Alignment in Conflict Dialogue: A Multi-Dimensional Comparison of LLM Agents and Humans
Deuksin Kwon, Kaleen Shrestha, Bin Han, Gale Lucas
- RECALL: REpresentation-aligned Catastrophic-forgetting ALLeviation via Hierarchical Model Merging
Bowen Wang, Haiyuan Wan, 石力文, Chen Yang, Peng He, Yue Ma, Haochen Han, Wenhao Li, Tiao Tan, Yongjian Li, Fangming Liu, Gong Yifan, Sheng Zhang
- Not-Just-Scaling Laws: Towards a Better Understanding of the Downstream Impact of Language Model Design Decisions
Emmy Liu, Amanda Bertsch, Lintang Sutawika, Lindia Tjuatja, Patrick Fernandes, Lara Marinov, Michael Chen, Shreya Singhal, Carolin Lawrence, Aditi Raghunathan, Kiril Gashteovski, Graham Neubig
- Synthetic Socratic Debates: Examining Persona Effects on Moral Decision and Persuasion Dynamics
Jiarui Liu, Yueqi Song, Yunze Xiao, Mingqian Zheng, Lindia Tjuatja, Jana Schaich Borg, Mona T. Diab, Maarten Sap
- Linear-Time Demonstration Selection for In-Context Learning via Gradient Estimation
Ziniu Zhang, Zhenshuo Zhang, Dongyue Li, Lu Wang, Jennifer Dy, Hongyang R. Zhang
- Speech Vecalign: an Embedding-based Method for Aligning Parallel Speech Documents
Chutong Meng, Philipp Koehn
- TurBLiMP: A Turkish Benchmark of Linguistic Minimal Pairs
Ezgi Başar, Francesca Padovani, Jaap Jumelet, Arianna Bisazza
- DynamicNER: A Dynamic, Multilingual, and Fine-Grained Dataset for LLM-based Named Entity Recognition
Hanjun Luo, Yingbin Jin, Yiran Wang, Xinfeng Li, Tong Shang, Xuecheng Liu, Ruizhe Chen, Kun Wang, Hanan Salam, Qingsong Wen, Zuozhu Liu
- Reliable and Cost-Effective Exploratory Data Analysis via Graph-Guided RAG
Mossad Helali, Yutai Luo, Tae Jun Ham, Jim Plotts, Ashwin Chaugule, Jichuan Chang, Parthasarathy Ranganathan, Essam Mansour
- Med-PRM: Medical Reasoning Models with Stepwise, Guideline-verified Process Rewards
Jaehoon Yun, Jiwoong Sohn, Jungwoo Park, Hyunjae Kim, Xiangru Tang, Daniel Shao, Yong Hoe Koo, Ko Minhyeok, Qingyu Chen, Mark Gerstein, Michael Moor, Jaewoo Kang
- Graders Should Cheat: Privileged Information Enables Expert-Level Automated Evaluations
Jin Peng Zhou, Séb Arnold, Nan Ding, Kilian Q Weinberger, Nan Hua, Fei Sha
- SAMULE: Self-Learning Agents Enhanced by Multi-level Reflection
Yubin Ge, Salvatore Romeo, Jason Cai, Monica Sunkara, Yi Zhang
- Database-Augmented Query Representation for Information Retrieval
Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, Jong C. Park
- The Enemy from Within: A Study of Political Delegitimization Discourse in Israeli Political Speech
Naama Rivlin-Angert, Guy Mor-Lan
- Attention Eclipse: Manipulating Attention to Bypass LLM Safety-Alignment
Pedram Zaree, Md Abdullah Al Mamun, Quazi Mishkatul Alam, Yue Dong, Ihsen Alouani, Nael Abu-Ghazaleh
- Representation Potentials of Foundation Models for Multimodal Alignment: A Survey
Jianglin Lu, Hailing Wang, Yi Xu, Yizhou Wang, Kuo Yang, Yun Fu
- Draft Model Knows When to Stop: Self-Verification Speculative Decoding for Long-Form Generation
Ziyin Zhang, Jiahao Xu, Tian Liang, Xingyu Chen, Zhiwei He, Rui Wang, Zhaopeng Tu
- Visual-Aware Speech Recognition for Noisy Scenarios
Balaji Darur, Karan Singla
- Advancing Arabic Diacritization: Improved Datasets, Benchmarking, and State-of-the-Art Models
Abubakr Mohamed, Hamdy Mubarak
- Implicit Values Embedded in How Humans and LLMs Complete Subjective Everyday Tasks
Arjun Arunasalam, Madison Pickering, Z. Berkay Celik, Blase Ur
- Dynamic Retriever for In-Context Knowledge Editing via Policy Optimization
Mahmud Wasif Nafee, Maiqi Jiang, Haipeng Chen, Yanfu Zhang
- LVLMs are Bad at Overhearing Human Referential Communication
Zhengxiang Wang, Weiling Li, Panagiotis Kaliosis, Susan Brennan, Owen Rambow
- Let’s Reason Formally: Natural-Formal Hybrid Reasoning Enhances LLM’s Math Capability
Ruida Wang, Yuxin Li, Yi R. Fung, Tong Zhang
- TORSO: Template-Oriented Reasoning Towards General Tasks
Minhyuk Kim, Seungyoon Lee, Heuiseok Lim
- Prototypical Human-AI Collaboration Behaviors from LLM-Assisted Writing in the Wild
Sheshera Mysore, Debarati Das, Hancheng Cao, Bahareh Sarrafzadeh
- WildScore: Benchmarking MLLMs in-the-Wild Symbolic Music Reasoning
Gagan Mundada, Yash Vishe, Amit Namburi, Xin Xu, Zachary Novack, Julian McAuley, Junda Wu
- TRIAL: Token Relations and Importance Aware Late-interaction for Accurate Text Retrieval
Hyukkyu Kang, Injung Kim, Wook-Shin Han
- Do Large Language Models excel in Complex Logical Reasoning with Formal Language?
Jin Jiang, Jianing Wang, Yuchen Yan, Yang Liu, Jianhua Zhu, Mengdi Zhang, Liangcai Gao
- Fair or Framed? Political Bias in News Articles Generated by LLMs
Junho Yoo
- ReviewRL: Towards Automated Scientific Review with RL
Sihang Zeng, Kai Tian, Kaiyan Zhang, Yuru Wang, Junqi Gao, Runze Liu, Sa Yang, Jingxuan Li, Xinwei Long, Jiaheng Ma, Biqing Qi, Bowen Zhou
- Grammar Pruning: Enabling Low-Latency Zero-Shot Task-Oriented Language Models for Edge AI
Octavian Alexandru Trifan, Jason Lee Weber, Marc Titus Trifan, Alexandru Nicolau, Alexander Veidenbaum
- Calibrating LLMs for Text-to-SQL Parsing by Leveraging Sub-clause Frequencies
Terrance Liu, Shuyi Wang, Daniel Preotiuc-Pietro, Yash Chandarana, Chirag Gupta
- REACT: Representation Extraction And Controllable Tuning to Overcome Overfitting in LLM Knowledge Editing
Haitian Zhong, Yuhuan Liu, Ziyang Xu, Guofan Liu, Qiang Liu, Shu Wu, Zhe Zhao, Liang Wang, Tieniu Tan
- ThinkEdit: Interpretable Weight Editing to Mitigate Overly Short Thinking in Reasoning Models
Chung-En Sun, Ge Yan, Tsui-Wei Weng
- Incorporating Diverse Perspectives in Cultural Alignment: Survey of Evaluation Benchmarks Through A Three-Dimensional Framework
Meng-Chen Wu, Si-Chi Chin, Tess Wood, Ayush Goyal, Narayanan Sadagopan
- Are Large Language Models Chronically Online Surfers? A Dataset for Chinese Internet Meme Explanation
Yubo Xie, Chenkai Wang, Zongyang Ma, Fahui Miao
- RoDEval: A Robust Word Sense Disambiguation Evaluation Framework for Large Language Models
Luyang Zhang, Shuaimin Li, Yishuo Li, Kunpeng Kang, Kaiyuan Zhang, Cong Wang, Wenpeng Lu
- PsychoAgent: Psychology-driven LLM Agents for Explainable Panic Prediction on Social Media during Sudden Disaster Events
Mengzhu Liu, Zhengqiu Zhu, Chuan Ai, Chen Gao, Xinghong Li, Lingnan He, Kaisheng Lai, Yingfeng Chen, Xin Lu, Yong Li, Quanjun Yin
- Stepwise Reasoning Checkpoint Analysis: A Test Time Scaling Method to Enhance LLMs’ Reasoning
Zezhong Wang, Xingshan Zeng, Weiwen Liu, Yufei Wang, Liangyou Li, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu, Kam-Fai Wong
- Inter-sentence Context Modeling and Structure-aware Representation Enhancement for Conversational Sentiment Quadruple Extraction
Yu Zhang, Zhaoman Zhong, Huihui Lv
- Igniting Creative Writing in Small Language Models: LLM-as-a-Judge versus Multi-Agent Refined Rewards
Xiaolong Wei, Bo Lu, Xingyu Zhang, Zhejun Zhao, Dongdong Shen, Long Xia, Dawei Yin
- Governance in Motion: Co-evolution of Constitutions and AI models for Scalable Safety
Chenhao Huang, Ziyu Shen, Yicong Ren, Huiyuan Zheng, Jiazheng Zhang, Mingxu Chai, Ming Zhang, Shihan Dou, Fan Mo, Jie Shi, Tao Gui, Qi Zhang, Xuanjing Huang
- Web Intellectual Property at Risk: Preventing Unauthorized Real-Time Retrieval by Large Language Models
Yisheng Zhong, Yizhu Wen, Junfeng Guo, Mehran Kafai, Heng Huang, Hanqing Guo, Zhuangdi Zhu
- SciEvent: Benchmarking Multi-domain Scientific Event Extraction
Bofu Dong, Pritesh Shah, Sumedh Sonawane, Tiyasha Banerjee, Erin Brady, Xinya Du, Ming Jiang
- Media Source Matters More Than Content: Unveiling Political Bias in LLM-Generated Citations
Sunhao Dai, Zhanshuo Cao, Wenjie Wang, Liang Pang, Jun Xu, See-Kiong Ng, Tat-Seng Chua
- RJE: A Retrieval-Judgment-Exploration Framework for Efficient Knowledge Graph Question Answering with LLMs
Can Lin, Zhengwang Jiang, Ling Zheng, Qi Zhao, Yuhang Zhang, Qi Song, Wangqiu Zhou
- Bias Mitigation or Cultural Commonsense? Evaluating LLMs with a Japanese Dataset
Taisei Yamamoto, Ryoma Kumon, Danushka Bollegala, Hitomi Yanaka
- Chameleon LLMs: User Personas Influence Chatbot Personality Shifts
Jane Xing, Tianyi Niu, Shashank Srivastava
- GuessingGame: Measuring the Informativeness of Open-Ended Questions in Large Language Models
Dylan Hutson, Daniel Vennemeyer, Aneesh Deshmukh, Justin Zhan, Tianyu Jiang
- SynC-LLM: Generation of Large-Scale Synthetic Circuit Code with Hierarchical Language Models
Shang Liu, Yao Lu, Wenji Fang, Jing Wang, Zhiyao Xie
- Why Stop at One Error? Benchmarking LLMs as Data Science Code Debuggers for Multi-Hop and Multi-Bug Errors
Zhiyu Yang, Shuo Wang, Yukun Yan, Yang Deng
- Dovetail: A CPU/GPU Heterogeneous Speculative Decoding for LLM inference
Libo Zhang, Zhaoning Zhang, xubaizhou, Rui Li, Zhiliang Tian, Songzhu Mei, Dongsheng Li
- V-SEAM: Visual Semantic Editing and Attention Modulating for Causal Interpretability of Vision-Language Models
Qidong Wang, Junjie Hu, Ming Jiang
- LORAXBENCH: A Multitask, Multilingual Benchmark Suite for 20 Indonesian Languages
Alham Fikri Aji, Trevor Cohn
- SAFE: Schema-Driven Approximate Distance Join for Efficient Knowledge Graph Querying
Sangoh Lee, Sungho Park, Wook-Shin Han
- Structured Preference Optimization for Vision-Language Long-Horizon Task Planning
Xiwen Liang, Min Lin, Weiqi Ruan, Rongtao Xu, Yuecheng Liu, Jiaqi Chen, Bingqian Lin, Yuzheng Zhuang, Xiaodan Liang
- Position: LLMs Can be Good Tutors in English Education
Jingheng Ye, Shen Wang, Deqing Zou, Yibo Yan, Kun Wang, Hai-Tao Zheng, Ruitong Liu, Zenglin Xu, Irwin King, Philip S. Yu, Qingsong Wen
- CLLMate: A Multimodal Benchmark for Weather and Climate Events Forecasting
Haobo Li, Zhaowei Wang, Jiachen Wang, Yueya Wang, Alexis Kai Hon Lau, Huamin Qu
- Extracting and Combining Abilities For Building Multi-lingual Ability-enhanced Large Language Models
Zhipeng Chen, Kun Zhou, Liang Song, Xin Zhao, Bingning Wang, Weipeng Chen, Ji-Rong Wen
- Evaluating the Effectiveness and Scalability of LLM-Based Data Augmentation for Retrieval
Pranjal A Chitale, Bishal Santra, Yashoteja Prabhu, Amit Sharma
- Temporal Referential Consistency: Do LLMs Favor Sequences Over Absolute Time References?
Ashutosh Bajpai, Tanmoy Chakraborty
- MemeArena: Automating Context-Aware Unbiased Evaluation of Harmfulness Understanding for Multimodal Large Language Models
Zixin Chen, Hongzhan Lin, Kaixin Li, Ziyang Luo, Yayue Deng, Jing Ma
- Multi-perspective Analysis of Large Language Model Domain Specialization: An Experiment in Accounting Audit Procedures Generation
Yusuke Noro
- Generator-Assistant Stepwise Rollback Framework for Large Language Model Agent
Xingzuo Li, Kehai Chen, Yunfei Long, Xuefeng Bai, Yong Xu, Min Zhang
- DocAgent: An Agentic Framework for Multi-Modal Long-Context Document Understanding
Li Sun, Liu He, Shuyue Jia, Yangfan He, Chenyu You
- EasyRec: Simple yet Effective Language Models for Recommendation
Xubin Ren, Chao Huang
- From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery
Tianshi Zheng, Zheye Deng, Hong Ting Tsang, Weiqi Wang, Jiaxin Bai, Zihao Wang, Yangqiu Song
- Mapping the Minds of LLMs: A Graph-Based Analysis of Reasoning LLMs
Zhen Xiong, Yujun Cai, Zhecheng Li, Yiwei Wang
- ViPE: Visual Perception in Parameter Space for Efficient Video-Language Understanding
Shichen Lu, Tongtian Yue, Longteng Guo, Handong Li, Xingjian He, Si Liu, Jing Liu
- Alignment for Efficient Tool Calling of Large Language Models
Hongshen Xu, Zihan Wang, Zichen Zhu, Lei Pan, Xingyu Chen, Shuai Fan, Lu Chen, Kai Yu
- ToM: Leveraging Tree-oriented MapReduce for Long-Context Reasoning in Large Language Models
Jiani Guo, Zuchao Li, Jie Wu, Qianren Wang, Yun Li, Lefei Zhang, Hai Zhao, Yujiu Yang
- BANMIME: Misogyny Detection with Metaphor Explanation on Bangla Memes
Md Ayon Mia, Akm Moshiur Rahman Mazumder, Khadiza Sultana Sayma, Md Fahim, Md Tahmid Hasan Fuad, Muhammad Ibrahim Khan, AKM Mahbubur Rahman
- Phi: Preference Hijacking in Multi-modal Large Language Models at Inference Time
Yifan Lan, Yuanpu Cao, Weitong Zhang, Lu Lin, Jinghui Chen
- Retrieval-augmented GUI Agents with Generative Guidelines
Ran Xu, Kaixin Ma, Wenhao Yu, Hongming Zhang, Joyce C. Ho, Carl Yang, Dong Yu
- COAS2W: A Chinese Older-Adults Spoken-to-Written Transformation Corpus with Context Awareness
Chun Kang, Zhigu Qian, Zhen Fu, Jiaojiao Fu, Yangfan Zhou
- Answer Convergence as a Signal for Early Stopping in Reasoning
Xin Liu, Lu Wang
- VeriFact: Enhancing Long-Form Factuality Evaluation with Refined Fact Extraction and Reference Facts
Xin Liu, Lechen Zhang, Sheza Munir, Yiyang Gu, Lu Wang
- SQUAB: Evaluating LLM robustness to Ambiguous and Unanswerable Questions in Semantic Parsing
Simone Papicchio, Luca Cagliero, Paolo Papotti
- Reliable Evaluation and Benchmarks for Statement Autoformalization
Auguste Poiroux, Gail Weiss, Viktor Kunčak, Antoine Bosselut
- VisBias: Measuring Explicit and Implicit Social Biases in Vision Language Models
Jen-tse Huang, Jiantong Qin, Jianping Zhang, Youliang Yuan, Wenxuan Wang, Jieyu Zhao
- Less Is More? Examining Fairness in Pruned Large Language Models for Summarising Opinions
Nannan Huang, Haytham M. Fayek, Xiuzhen Zhang
- AI Sees Your Location—But With A Bias Toward The Wealthy World
Jingyuan Huang, Jen-tse Huang, Ziyi Liu, Xiaoyuan Liu, Wenxuan Wang, Jieyu Zhao
- Faster In-Context Learning for LLMs via N-Gram Trie Speculative Decoding
Jinglin Chen, Qiwei Li, Zuchao Li, Baoyuan Qi, Liu Guoming, Haojun Ai, Hai Zhao, Ping Wang
- From Surveys to Narratives: Rethinking Cultural Value Adaptation in LLMs
Muhammad Farid Adilazuarda, Chen Cecilia Liu, Iryna Gurevych, Alham Fikri Aji
- Iterative Prompt Refinement for Safer Text-to-Image Generation
Jinwoo Jeon, JunHyeok Oh, Hayeong Lee, Byung-Jun Lee
- Language Models as Continuous Self-Evolving Data Engineers
Peidong Wang, Ming Wang, Zhiming Ma, Xiaocui Yang, Shi Feng, Daling Wang, Yifei Zhang, Kaisong Song
- Unilaw-R1: A Large Language Model for Legal Reasoning with Reinforcement Learning and Iterative Inference
Hua Cai, Shuang Zhao, Liang Zhang, Xuli Shen, Qing Xu, Weilin Shen, Zihao Wen, Tianke Ban
- Exploring Response Uncertainty in MLLMs: An Empirical Evaluation under Misleading Scenarios
Yunkai Dang, Mengxi Gao, Yibo Yan, Xin Zou, Yanggan Gu, Jungang Li, Jingyu Wang, Peijie Jiang, Aiwei Liu, Jia Liu, Xuming Hu
- Evaluating and Aligning Human Economic Risk Preferences in LLMs
Jiaxin Liu, Yixuan Tang, Yi Yang, Kar Yan Tam
- Ensembling Prompting Strategies for Zero-Shot Hierarchical Text Classification with Large Language Models
Mingxuan Xia, Zhijie Jiang, Haobo Wang, Junbo Zhao, Tianlei Hu, Gang Chen
- Improbable Bigrams Expose Vulnerabilities of Incomplete Tokens in Byte-Level Tokenizers
Eugene Jang, Kimin Lee, Jin-Woo Chung, Keuntae Park, Seungwon Shin
- UI-Hawk: Unleashing the Screen Stream Understanding for Mobile GUI Agents
Jiwen Zhang, Ya-Qi Yu, Minghui Liao, WenTao Li, Jihao Wu, Zhongyu Wei
- UniDebugger: Hierarchical Multi-Agent Framework for Unified Software Debugging
Cheryl Lee, Chunqiu Steven Xia, Longji Yang, Jen-tse Huang, Zhouruixing Zhu, Lingming Zhang, Michael R. Lyu
- Understanding the Thinking Process of Reasoning Models: A Perspective from Schoenfeld’s Episode Theory
Nan Zhang, Ming Li, Chenrui Fan, Hong Jiao, Yanbin Fu, Sydney Peters, Qingshu Xu, Robert Lissitz, Tianyi Zhou
- Thread: A Logic-Based Data Organization Paradigm for How-To Question Answering with Retrieval Augmented Generation
Kaikai An, Fangkai Yang, Liqun Li, Junting Lu, Sitao Cheng, Shuzheng Si, Lu Wang, Pu Zhao, Lele Cao, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Baobao Chang
- Unsupervised Word-level Quality Estimation for Machine Translation Through the Lens of Annotators (Dis)agreement
Gabriele Sarti, Vilém Zouhar, Malvina Nissim, Arianna Bisazza
- STEER-BENCH: A Benchmark for Evaluating the Steerability of Large Language Models
Kai Chen, Zihao He, Taiwei Shi, Kristina Lerman
- Combining Constrained and Unconstrained Decoding via Boosting: BoostCD and Its Application to Information Extraction
Marija Sakota, Robert West
- MultiLogicNMR(er): A Benchmark and Neural-Symbolic Framework for Non-monotonic Reasoning with Multiple Extensions
Yeliang Xiu, Yongmei Liu
- Beyond Demographics: Enhancing Cultural Value Survey Simulation with Multi-Stage Personality-Driven Cognitive Reasoning
Haijiang Liu, Qiyuan Li, Chao Gao, Yong Cao, Xiangyu Xu, Xun Wu, Daniel Hershcovich, Jinguang Gu
- CrystalICL: Enabling In-Context Learning for Crystal Generation
Ruobing Wang, Qiaoyu Tan, Yili Wang, Ying Wang, Xin Wang
- Towards a Unified Paradigm of Concept Editing in Large Language Models
Zhuowen Han, Xinwei Wu, Dan Shi, Renren Jin, Deyi Xiong
- Step-level Verifier-guided Hybrid Test-Time Scaling for Large Language Models
Kaiyan Chang, Yonghao Shi, Chenglong Wang, Hang Zhou, Chi Hu, Xiaoqian Liu, Yingfeng Luo, Yuan Ge, Tong Xiao, Jingbo Zhu
- Dynamic Expert Specialization: Towards Catastrophic Forgetting-Free Multi-Domain MoE Adaptation
Junzhuo Li, Bo Wang, Xiuze Zhou, Xuming Hu
- RRInf: Efficient Influence Function Estimation via Ridge Regression for Large Language Models and Text-to-Image Diffusion Models
Zhuozhuo Tu, Cheng Chen, Yuxuan Du
- Evaluating Spatiotemporal Consistency in Automatically Generated Sewing Instructions
Luisa Geiger, Mareike Hartmann, Michael Sullivan, Alexander Koller
- MaZO: Masked Zeroth-Order Optimization for Multi-Task Fine-Tuning of Large Language Models
Zhen Zhang, Yifan Yang, Kai Zhen, Nathan Susanj, Athanasios Mouchtaris, Siegfried Kunzmann, Zheng Zhang
- Procedural Environment Generation for Tool-Use Agents
Michael Sullivan, Mareike Hartmann, Alexander Koller
- FacLens: Transferable Probe for Foreseeing Non-Factuality in Fact-Seeking Question Answering of Large Language Models
Yanling Wang, Haoyang Li, Hao Zou, Jing Zhang, Xinlei He, Qi Li, Ke Xu
- OMS: On-the-fly, Multi-Objective, Self-Reflective Ad Keyword Generation via LLM Agent
Bowen Chen, Zhao Wang, Shingo Takamatsu
- Med-VRAgent: A Framework for Medical Visual Reasoning-Enhanced Agents
Guangfu Guo, Xiaoqian Lu, Yue Feng
- TrojanWave: Exploiting Prompt Learning for Stealthy Backdoor Attacks on Large Audio-Language Models
Asif Hanif, Maha Tufail Agro, Fahad Shamshad, Karthik Nandakumar
- Can LLMs be Literary Companions?: Analysing LLMs on Bengali Figures of Speech Identification
Sourav Das, Kripabandhu Ghosh
- Group-SAE: Efficient Training of Sparse Autoencoders for Large Language Models via Layer Groups
Davide Ghilardi, Federico Belotti, Marco Molinari, Tao Ma, Matteo Palmonari
- Retrieval over Classification: Integrating Relation Semantics for Multimodal Relation Extraction
Lei Hei, Tingjing Liao, peiyingxin, Yiyang Qi, Jiaqi Wang, Ruiting Li, Feiliang Ren
- PunMemeCN: A Benchmark to Explore Vision-Language Models’ Understanding of Chinese Pun Memes
Zhijun Xu, Siyu Yuan, Yiqiao Zhang, Jingyu Sun, Tong Zheng, Deqing Yang
- UltraIF: Advancing Instruction Following from the Wild
Kaikai An, Li Sheng, Ganqu Cui, Shuzheng Si, Ning Ding, Yu Cheng, Baobao Chang
- Identifying Pre-training Data in LLMs: A Neuron Activation-Based Detection Framework
Hongyi Tang, Zhihao Zhu, Yi Yang
- TreeRare: Syntax Tree-Guided Retrieval and Reasoning for Knowledge-Intensive Question Answering
Boyi Zhang, Zhuo Liu, Hangfeng He
- Mapping Toxic Comments Across Demographics: A Dataset from German Public Broadcasting
Jan Fillies, Michael Peter Hoffmann, Rebecca Reichel, Roman Salzwedel, Sven Bodemer, Adrian Paschke
- Small Models, Big Results: Achieving Superior Intent Extraction through Decomposition
Danielle Cohen, Yoni Halpern, Anatoly Efros, Noam Kahlon, Joel Oren, Omri Berkovitch, Sapir Caduri, Ido Dagan
- On Pruning State-Space LLMs
Tamer Ghattas, Michael Hassid, Roy Schwartz
- An Orthogonal High-Rank Adaptation for Large Language Models
Xin Zhang, Guang-Ze Chen, Shuzhen Li, Zhulin Liu, C. L. Philip Chen, Tong Zhang
- BSFA: Leveraging the Subspace Dichotomy to Accelerate Neural Network Training
WenJie Zhou, Bohan Wang, Wei Chen, Xueqi Cheng
- Debatable Intelligence: Benchmarking LLM Judges via Debate Speech Evaluation
Noy Sternlicht, Ariel Gera, Roy Bar-Haim, Tom Hope, Noam Slonim
- METok: Multi-Stage Event-based Token Compression for Efficient Long Video Understanding
Mengyue Wang, Shuo Chen, Kristian Kersting, Volker Tresp, Yunpu Ma
- VisiPruner: Decoding Discontinuous Cross-Modal Dynamics for Efficient Multimodal LLMs
Yingqi Fan, Anhao Zhao, Jinlan Fu, Junlong Tong, Hui Su, Yijie Pan, Wei Zhang, Xiaoyu Shen
- Beyond Static Testbeds: An Interaction-Centric Agent Simulation Platform for Dynamic Recommender Systems
Song Jin, Juntian Zhang, Yuhan Liu, Xun Zhang, Yufei Zhang, Guojun Yin, Fei Jiang, Wei Lin, Rui Yan
- SheetDesigner: MLLM-Powered Spreadsheet Layout Generation with Rule-Based and Vision-Based Reflection
Qin Chen, Yuanyi Ren, Xiaojun Ma, Mugeng Liu, Shi Han, Dongmei Zhang
- CAIR: Counterfactual-based Agent Influence Ranker for Agentic AI Workflows
Amit Giloni, Chiara Picardi, Roy Betser, Shamik Bose, Aishvariya Priya Rathina Sabapathy, Roman Vainshtein
- ReSURE: Regularizing Supervision Unreliability for Multi-turn Dialogue Fine-tuning
Yiming Du, Yifan Xiang, Bin Liang, Dahua Lin, Kam-Fai Wong, Fei Tan
- Precise In-Parameter Concept Erasure in Large Language Models
Yoav Gur-Arieh, Clara Haya Suslik, Yihuai Hong, Fazl Barez, Mor Geva
- PhonoThink: Improving Large Language Models’ Reasoning on Chinese Phonological Ambiguities
Jianfei Ma, Zhaoxin Feng, Emmanuele Chersoni, Huacheng Song, Ziqi Zhang
- SAFE-SQL: Self-Augmented In-Context Learning with Fine-grained Example Selection for Text-to-SQL
Jimin Lee, Ingeol Baek, Byeongjeong Kim, Hyunkyung Bae, Hwanhee Lee
- ExpandR: Teaching Dense Retrievers Beyond Queries with LLM Guidance
Sijia Yao, Pengcheng Huang, Zhenghao Liu, Yu Gu, Yukun Yan, Shi Yu, Ge Yu
- Anecdoctoring: Automated Red-Teaming Across Language and Place
Alejandro Cuevas, Saloni Dash, Dan Vann, Madeleine I. G. Daepp
- ACING: Actor-Critic for Instruction Learning in Black-Box LLMs
Salma Kharrat, Fares Fourati, Marco Canini
- Women, Infamous, and Exotic Beings: A Comparative Study of Honorific Usages in Wikipedia and LLMs for Bengali and Hindi
Sourabrata Mukherjee, Atharva Mehta, Sougata Saha, Akhil Arora, Monojit Choudhury
- Process-Supervised Reward Models for Verifying Clinical Note Generation: A Scalable Approach Guided by Domain Expertise
Hanyin Wang, Chufan Gao, Qiping Xu, Bolun Liu, Guleid Hussein, Hariprasad Reddy Korsapati, Mohamad El Labban, Kingsley Iheasirim, Mohamed Hassan, Gokhan Anil, Brian Bartlett, Jimeng Sun
- GCML: Gradient Coherence Guided Meta-Learning for Cross-Domain Emerging Topic Rumor Detection
Zejiang He, Jingyuan Huang, Menglong Lu, Zhen Huang, Shanshan Liu, Zhiliang Tian, Dongsheng Li
- Can LLMs Generate and Solve Linguistic Olympiad Puzzles?
Neh Majmudar, Elena Filatova
- E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning
Zihan Liao, Jun Wang, Hang Yu, Lingxiao Wei, Jianguo Li, Jun Wang, Wei Zhang
- DivScore: Zero-Shot Detection of LLM-Generated Text in Specialized Domains
Zhihui Chen, Kai He, Yucheng Huang, Yunxiao Zhu, Mengling Feng
- Multi-Document Event Extraction Using Large and Small Language Models
Qingkai Min, Zitian Qu, Qipeng Guo, Xiangkun Hu, Zheng Zhang, Yue Zhang
- MA-GTS: A Multi-Agent Framework for Solving Complex Graph Problems in Real-World Applications
Zike Yuan, Ming Liu, Hui Wang, Bing Qin
- Enhancing Speech Large Language Models with Prompt-Aware Mixture of Audio Encoders
Weiqiao Shan, Yuang Li, Yuhao Zhang, Yingfeng Luo, Chen Xu, Xiaofeng Zhao, Long Meng, Yunfei Lu, Min Zhang, Hao Yang, Tong Xiao, Jingbo Zhu
- CIKT: A Collaborative and Iterative Knowledge Tracing Framework with Large Language Models
Runze Li, Siyu Wu, Jun Wang, Wei Zhang
- Mitigating Hallucinations in LM-Based TTS Models via Distribution Alignment Using GFlowNets
Chenlin Liu, Jiqing Han, Minghui Fang, Wei Zhou, Jie Gao
- MolErr2Fix: Benchmarking LLM Trustworthiness in Chemistry via Modular Error Detection, Localization, Explanation, and Correction
Yuyang Wu, Jinhui Ye, Shuhao Zhang, Lu Dai, Yonatan Bisk, Olexandr Isayev
- Shared Path: Unraveling Memorization in Multilingual LLMs through Language Similarities
Xiaoyu Luo, Yiyi Chen, Johannes Bjerva, Qiongxiu Li
- Embedding Domain Knowledge for Large Language Models via Reinforcement Learning from Augmented Generation
Chaojun Nie, Jun Zhou, Guanxiang Wang, Shisong Wu, Zichen Wang
- LLM-Driven Completeness and Consistency Evaluation for Cultural Heritage Data Augmentation in Cross-Modal Retrieval
Jian Zhang, Junyi Guo, Junyi Yuan, Huanda Lu, Yanlin Zhou, Fangyu Wu, Qiufeng Wang, Dongming Lu
- Artificial Impressions: Evaluating Large Language Model Behavior Through the Lens of Trait Impressions
Nicholas Deas, Kathleen McKeown
- Mitigating Hallucinations in Large Vision-Language Models via Entity-Centric Multimodal Preference Optimization
Jiulong Wu, Zhengliang Shi, Shuaiqiang Wang, Jizhou Huang, Dawei Yin, Lingyong Yan, Min Cao, Min Zhang
- 3DS: Medical Domain Adaptation of LLMs via Decomposed Difficulty-based Data Selection
Hongxin Ding, Yue Fang, Runchuan Zhu, Xinke Jiang, Jinyang Zhang, Yongxin Xu, Weibin Liao, Xu Chu, Junfeng Zhao, Yasha Wang
- InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows
Kirolos Ataallah, Eslam Mohamed Bakr, Mahmoud Ahmed, Chenhui Gou, Khushbu Pahwa, Jian Ding, Mohamed Elhoseiny
- Intrinsic Test of Unlearning Using Parametric Knowledge Traces
Yihuai Hong, Lei Yu, Haiqin Yang, Shauli Ravfogel, Mor Geva
- Speculative Streaming: Efficient and Scalable Speculative Decoding with Multi-Stream Attention
Nikhil Bhendawade, Irina Belousova, Qichen Fu, Henry Mason, Antonie Lin, Mohammad Rastegari, Mahyar Najibi
- Evaluating Cognitive-Behavioral Fixation via Multimodal User Viewing Patterns on Social Media
Yujie Wang, Yunwei Zhao, Jing Yang, Han Han, Shiguang Shan, Jie Zhang
- Mind the Gap: A Closer Look at Tokenization for Multiple-Choice Question Answering with LLMs
Mario Sanz-Guerrero, Minh Duc Bui, Katharina von der Wense
- VocalNet: Speech LLMs with Multi-Token Prediction for Faster and High-Quality Generation
Yuhao Wang, Heyang Liu, Ziyang Cheng, Ronghua Wu, Qunshan Gu, Yanfeng Wang, Yu Wang
- Path Drift in Large Reasoning Models: How First-Person Commitments Override Safety
Yuyi Huang
- CBP-Tuning: Efficient Local Customization for Black-box Large Language Models
Jiaxuan Zhao, Naibin Gu, Yuchen Feng, Xiyu Liu, Peng Fu, Zheng Lin, Weiping Wang
- Beyond the Score: Uncertainty-Calibrated LLMs for Automated Essay Assessment
Ahmed Karim, Zheng Yuan, Qiao Wang
- Humans Hallucinate Too: Language Models Identify and Correct Subjective Annotation Errors With Label-in-a-Haystack Prompts
Georgios Chochlakis, Peter Wu, Tikka Arjun Singh Bedi, Marcus Ma, Kristina Lerman, Shrikanth Narayanan
- Do It Yourself (DIY): Modifying Images for Poems in a Zero-Shot Setting Using Weighted Prompt Manipulation
Sofia Jamil, Kotla Sai Charan, Sriparna Saha, Koustava Goswami, Joseph K J
- Looking Beyond Text: Reducing Language Bias in Large Vision-Language Models via Multimodal Dual-Attention and Soft-Image Guidance
Haozhe Zhao, Shuzheng Si, Liang Chen, Yichi Zhang, Maosong Sun, Baobao Chang, Minjia Zhang
- Who Holds the Pen? Caricature and Perspective in LLM Retellings of History
Lubna Zahan Lamia, Mabsur Fatin Bin Hossain, Md Mosaddek Khan
- DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs
Minxuan Lv, Zhenpeng Su, Leiyu Pan, Yizhe Xiong, Zijia Lin, Hui Chen, Wei Zhou, Jungong Han, Guiguang Ding, Wenwu Ou, Di ZHANG, Kun Gai, Songlin Hu
- Search Wisely: Mitigating Sub-optimal Agentic Searches By Reducing Uncertainty
Peilin Wu, Mian Zhang, Xinlu Zhang, Xinya Du, Zhiyu Chen
- Child-Directed Language Does Not Consistently Boost Syntax Learning in Language Models
Francesca Padovani, Jaap Jumelet, Yevgen Matusevych, Arianna Bisazza
- Benchmarking Debiasing Methods for LLM-based Parameter Estimates
Nicolas Audinet de Pieuchon, Adel Daoud, Connor Thomas Jerzak, Moa Johansson, Richard Johansson
- (Almost) Free Modality Stitching of Foundation Models
Jaisidh Singh, Diganta Misra, Boris Knyazev, Antonio Orvieto
- VERITAS: Leveraging Vision Priors and Expert Fusion to Improve Multimodal Data
Tingqiao Xu, Ziru Zeng, Jiayu Chen
- Rescorla-Wagner Steering of LLMs for Undesired Behaviors over Disproportionate Inappropriate Context
Rushi Wang, Jiateng Liu, Cheng Qian, Yifan Shen, Yanzhou Pan, Zhaozhuo Xu, Ahmed Abbasi, Heng Ji, Denghui Zhang
- Exploring Artificial Image Generation for Stance Detection
Zhengkang Zhang, Zhongqing Wang, Guodong Zhou
- Hope vs. Hate: Understanding User Interactions with LGBTQ+ News Content in Mainstream US News Media through the Lens of Hope Speech
Jonathan Pofcher, Christopher M Homan, Randall Sell, Ashiqur R. KhudaBukhsh
- Flaw or Artifact? Rethinking Prompt Sensitivity in Evaluating LLMs
Andong Hua, Kenan Tang, Chenhe Gu, Jindong Gu, Eric Wong, Yao Qin
- Topic Coverage-based Demonstration Retrieval for In-Context Learning
Wonbin Kweon, SeongKu Kang, Runchu Tian, Pengcheng Jiang, Jiawei Han, Hwanjo Yu
- On the Same Wavelength? Evaluating Pragmatic Reasoning in Language Models across Broad Concepts
Linlu Qiu, Cedegao E. Zhang, Joshua B. Tenenbaum, Yoon Kim, Roger P. Levy
- MuseScorer: Idea Originality Scoring At Scale
Ali Sarosh Bangash, Krish Veera, Ishfat Abrar Islam, Raiyan Abdul Baten
- SAFENUDGE: Safeguarding Large Language Models in Real-time with Tunable Safety-Performance Trade-offs
Joao Fonseca, Andrew Bell, Julia Stoyanovich
- RaDeR: Reasoning-aware Dense Retrieval Models
Debrup Das, Sam O’Nuallain, Razieh Rahimi
- A Culturally-diverse Multilingual Multimodal Video Benchmark & Model
Bhuiyan Sanjid Shafique, Ashmal Vayani, Muhammad Maaz, Hanoona Abdul Rasheed, Dinura Dissanayake, Mohammed Irfan Kurpath, Yahya Hmaiti, Go Inoue, Jean Lahoud, Md. Safirur Rashid, Shadid Intisar Quasem, Maheen Fatima, Franco Vidal, Mykola Maslych, Ketan Pravin More, Sanoojan Baliah, Hasindri Watawana, Yuhao Li, Fabian Farestam, Leon Schaller, Roman Tymtsiv, Simon Weber, Hisham Cholakkal, Ivan Laptev, Shin’ichi Satoh, Michael Felsberg, Mubarak Shah, Salman Khan, Fahad Shahbaz Khan
- DRES: Fake news detection by dynamic representation and ensemble selection
Faramarz Farhangian, Leandro Augusto Ensina, George D C Cavalcanti, Rafael M. O. Cruz
- A Graph-Theoretical Framework for Analyzing the Behavior of Causal Language Models
Rashin Rahnamoun, Mehrnoush Shamsfard
- Membership and Memorization in LLM Knowledge Distillation
Ziqi Zhang, Ali Shahin Shamsabadi, Hanxiao Lu, Yifeng Cai, Hamed Haddadi
- Balanced Multi-Factor In-Context Learning for Multilingual Large Language Models
Masahiro Kaneko, Alham Fikri Aji, Timothy Baldwin
- Efficient Context Selection for Long-Context QA: No Tuning, No Iteration, Just Adaptive‑$k$
Chihiro Taguchi, Seiji Maekawa, Nikita Bhutani
- Languages Still Left Behind: Toward a Better Multilingual Machine Translation Benchmark
Chihiro Taguchi, Seng Mai, Keita Kurabe, Yusuke Sakai, Georgina Agyei, Soudabeh Eslami, David Chiang
- Think Globally, Group Locally: Evaluating LLMs Using Multi-Lingual Word Grouping Games
César Guerra-Solano, Zhuochun Li, Xiang Lorraine Li
- Pointing to a Llama and Call it a Camel: On the Sycophancy of Multimodal Large Language Models
Renjie Pi, Kehao Miao, LI PEIHANG, Runtao Liu, Jiahui Gao, Jipeng Zhang, Xiaofang Zhou
- MR. Judge: Multimodal Reasoner as a Judge
Renjie Pi, Haoping Bai, Qibin Chen, Xiaoming Simon Wang, Jiulong Shan, Xiaojiang Liu, Meng Cao
- MobiZO: Enabling Efficient LLM Fine-Tuning at the Edge via Inference Engines
Lei Gao, Amir Ziashahabi, Yue Niu, Salman Avestimehr, Murali Annavaram
- Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs
Wafa Al Ghallabi, Ritesh Thawkar, Sara Ghaboura, Ketan Pravin More, Omkar Thawakar, Hisham Cholakkal, Salman Khan, Rao Muhammad Anwer
- CoMAT: Chain of Mathematically Annotated Thought Improves Mathematical Reasoning
Joshua Ong Jun Leang, Aryo Pradipta Gema, Shay B Cohen
- s1: Simple test-time scaling
Niklas Muennighoff, Zitong Yang, Weijia Shi, Xiang Lisa Li, Li Fei-Fei, Hannaneh Hajishirzi, Luke Zettlemoyer, Percy Liang, Emmanuel Candes, Tatsunori Hashimoto
- Learning Subjective Label Distributions via Sociocultural Descriptors
Mohammed Fayiz Parappan, Ricardo Henao
- COM-BOM: Bayesian Exemplar Search for Efficiently Exploring the Accuracy-Calibration Pareto Frontier
Gaoxiang Luo, Aryan Deshwal
- ML-Promise: A Multilingual Dataset for Corporate Promise Verification
Yohei Seki, Hakusen Shu, Anaïs Lhuissier, Hanwool Lee, Juyeon Kang, Min-Yuh Day, Chung-Chi Chen
- Reading Between the Prompts: How Stereotypes Shape LLM’s Implicit Personalization
Vera Neplenbroek, Arianna Bisazza, Raquel Fernández
- Paired by the Teacher: Turning Unpaired Data into High-Fidelity Pairs for Low-Resource Text Generation
Yen-Ju Lu, Thomas Thebaud, Laureano Moro-Velazquez, Najim Dehak, Jesus Villalba
- Please Translate Again: Two Simple Experiments on Whether Human-Like Reasoning Helps Translation
Di Wu, Seth Aycock, Christof Monz
- How Do Large Vision-Language Models See Text in Image? Unveiling the Distinctive Role of OCR Heads
Ingeol Baek, Hwan Chang, Sunghyun Ryu, Hwanhee Lee
- Explainability and Interpretability of Multilingual Large Language Models: A Survey
Lucas Resck, Isabelle Augenstein, Anna Korhonen
- Decoding the Rule Book: Extracting Hidden Moderation Criteria from Reddit Communities
Youngwoo Kim, Himanshu Beniwal, Steven L. Johnson, Thomas Hartvigsen
- AcT2I: Evaluating and Improving Action Depiction in Text-to-Image Models
Vatsal Malaviya, Agneet Chatterjee, Maitreya Patel, Yezhou Yang
- Assessing French Readability for Adults with Low Literacy: A Global and Local Perspective
Wafa Aissa, Thibault Bañeras-Roux, Elodie Vanzeveren, GAO Lingyun, Rodrigo Wilkens, Thomas François
- LILaC: Late Interacting in Layered Component Graph for Open-domain Multimodal Multihop Retrieval
Joohyung Yun, Doyup Lee, Wook-Shin Han
- DiCoRe: Enhancing Zero-shot Event Detection via Divergent-Convergent LLM Reasoning
Tanmay Parekh, Kartik Mehta, Ninareh Mehrabi, Kai-Wei Chang, Nanyun Peng
- SNaRe: Domain-aware Data Generation for Low-Resource Event Detection
Tanmay Parekh, Yuxuan Dong, Lucas Bandarkar, Artin Kim, I-Hung Hsu, Kai-Wei Chang, Nanyun Peng
- Table-R1: Inference-Time Scaling for Table Reasoning Tasks
Zheyuan Yang, Lyuhao Chen, Arman Cohan, Yilun Zhao
- LimRank: Less is More for Reasoning-Intensive Information Reranking
Tingyu Song, Yilun Zhao, Siyue Zhang, Chen Zhao, Arman Cohan
- PlanGEN: A Multi-Agent Framework for Generating Planning and Reasoning Trajectories for Complex Problem Solving
Mihir Parmar, Xin Liu, Palash Goyal, Yanfei Chen, Long Le, Swaroop Mishra, Hossein Mobahi, Jindong Gu, Zifeng Wang, Hootan Nakhost, Chitta Baral, Chen-Yu Lee, Tomas Pfister, Hamid Palangi
- An Empirical Study on Strong-Weak Model Collaboration for Repo-level Code Generation
Shubham Gandhi, Atharva Naik, Yiqing Xie, Carolyn Rose
- What are Foundation Models Cooking in the Post-Soviet World?
Anton Lavrouk, Tarek Naous, Alan Ritter, Wei Xu
- LogiDynamics: Unraveling the Dynamics of Inductive, Abductive and Deductive Logical Inferences in LLM Reasoning
Tianshi Zheng, Cheng Jiayang, Chunyang Li, Haochen Shi, Zihao Wang, Jiaxin Bai, Yangqiu Song, Ginny Wong, Simon See
- EcoLoRA: Communication-Efficient Federated Fine-Tuning of Large Language Models
Han Liu, Ruoyao Wen, Srijith Nair, Jia Liu, Wenjing Lou, Chongjie Zhang, William Yeoh, Yevgeniy Vorobeychik, Ning Zhang
- Memorization $\neq$ Understanding: Do Large Language Models Have the Ability of Scenario Cognition?
Boxiang Ma, Ru Li, Wang Yuanlong, Hongye Tan, Xiaoli Li
- Priority on High-Quality: Selecting Instruction Data via Consistency Verification of Noise Injection
Hong Zhang, Feng Zhao, Ruilin Zhao, Cheng Yan, Kangzheng Liu
- Can Prompts Rewind Time for LLMs? Evaluating the Effectiveness of Prompted Knowledge Cutoffs
Xin Gao, Ruiyi Zhang, Sai Ashish Somayajula, Daniel Du, Saurabh Mahindre, Pengtao Xie
- DSVD: Dynamic Self-Verify Decoding for Faithful Generation in Large Language Models
YiQiu Guo, Yuchen Yang, Zhe Chen, Pingjie Wang, Yusheng Liao, Ya Zhang, Yanfeng Wang, Yu Wang
- Metric Calculating Benchmark: Code-Verifiable Complicate Instruction Following Benchmark for Large Language Models
Hyeonseok Moon, Seongtae Hong, Jaehyung Seo, Heuiseok Lim
- Generative Annotation for ASR Named Entity Correction
Yuanchang Luo, Daimeng Wei, Shaojun Li, Hengchao Shang, Jiaxin GUO, Zongyao Li, Zhanglin Wu, Xiaoyu Chen, Zhiqiang Rao, Jinlong Yang, Hao Yang
- SOLAR: Towards Characterizing Subjectivity of Individuals through Modeling Value Conflicts and Trade-offs
Younghun Lee, Dan Goldwasser
- LogicTree: Structured Proof Exploration for Coherent and Rigorous Logical Reasoning with Large Language Models
Kang He, Kaushik Roy
- Unmasking Fake Careers: Detecting Machine-Generated Career Trajectories via Multi-layer Heterogeneous Graphs
Michiharu Yamashita, Thanh Tran, Delvin Ce Zhang, Dongwon Lee
- GAP: a Global Adaptive Pruning Method for Large Language Models
Zhihua Ban, Haotian Ma, Siheng Zhang, Shengyu Liu, Xichen Chen, Ming Yang
- Distribution Prompting: Understanding the Expressivity of Language Models Through the Next-Token Distributions They Can Produce
Haojin Wang, Zining Zhu, Freda Shi
- LGA: LLM-GNN Aggregation for Temporal Evolution Attribute Graph Prediction
Feng Zhao, Ruoyu Chai, Kangzheng Liu, Xianggan Liu
- EIFBENCH: Extremely Complex Instruction Following Benchmark for Large Language Models
Tao Zou, Xinghua Zhang, Haiyang Yu, Minzheng Wang, Fei Huang, Yongbin Li
- Tool Preferences in Agentic LLMs are Unreliable
Kazem Faghih, Wenxiao Wang, Yize Cheng, Siddhant Bharti, Gaurang Sriramanan, Sriram Balasubramanian, Parsa Hosseini, Soheil Feizi
- Enhancing Large Language Model for Knowledge Graph Completion via Structure-Aware Alignment-Tuning
Yu Liu, Yanan Cao, Xixun Lin, Yanmin Shang, Shi Wang, Shirui Pan
- MultiDocFusion: Hierarchical and Multimodal Chunking Pipeline for Enhanced RAG on Long Industrial Documents
Joong Min Shin, Chanjun Park, Jeongbae Park, Jaehyung Seo, Heuiseok Lim
- Attention-guided Self-reflection for Zero-shot Hallucination Detection in Large Language Models
Qiang Liu, Xinlong Chen, Yue Ding, Bowen Song, Weiqiang Wang, Shu Wu, Liang Wang
- ‘Rich Dad, Poor Lad’: How do Large Language Models Contextualize Socioeconomic Factors in College Admission?
Huy Nghiem, Phuong-Anh Nguyen-Le, John Prindle, Rachel Rudinger, Hal Daumé III
- Understanding and Mitigating Overrefusal in LLMs from an Unveiling Perspective of Safety Decision Boundary
Licheng Pan, Yongqi Tong, Xin Zhang, Xiaolu Zhang, Jun Zhou, Zhixuan Chu
- MMAG: Multimodal Learning for Mucus Anomaly Grading in Nasal Endoscopy via Semantic Attribute Prompting
Xinpan Yuan, Mingzhu Huang, Liujie Hua, Jianuo Ju, Xu Zhang
- The Emperor’s New Reasoning: Format Imitation Overshadows Genuine Mathematical Understanding in SFT
Linyao Yang, Jian-Tao Huang, Yafei Lu, Zhenhui Jessie Li, Guirong Xue
- Step Guided Reasoning: Improving Mathematical Reasoning using Guidance Generation and Step Reasoning
Lang Cao, Yingtian Zou, Chao Peng, Renhong Chen, Wu Ning, Yitong Li
- Flexibly Utilize Memory for Long-Term Conversation via a Fragment-then-Compose Framework
Cai Ke, Yiming Du, Bin Liang, Yifan Xiang, Lin Gui, Zhongyang Li, Baojun Wang, Yue Yu, Hui Wang, Kam-Fai Wong, Ruifeng Xu
- STRICT: Stress-Test of Rendering Image Containing Text
Tianyu Zhang, Xinyu Wang, Zhenghan Tai, Lu Li, Jijun Chi, Jingrui Tian, Hailin He, Suyuchen Wang
- A Sequential Multi-Stage Approach for Code Vulnerability Detection via Confidence- and Collaboration-based Decision Making
Chung-Nan Tsai, Xin Wang, Cheng-Hsiung Lee, Ching-Sheng Lin
- Leveraging Large Models to Evaluate Novel Content: A Case Study on Advertisement Creativity
Zhaoyi Joey Hou, Adriana Kovashka, Xiang Lorraine Li
- BIRD: Bronze Inscription Restoration and Dating
Wenjie Hua, Hoang H Nguyen, Gangyan Ge
- DCP: Dual-Cue Pruning for Efficient Large Vision-Language Models
Lei Jiang, Zixun Zhang, Yuting Zeng, Chunzhao Xie, Tongxuan Liu, Zhen Li, Lechao Cheng, Xiaohua Xu
- Improving Context Fidelity via Native Retrieval-Augmented Reasoning
Suyuchen Wang, Jinlin Wang, Xinyu Wang, Shiqi Li, Xiangru Tang, Sirui Hong, Xiao-Wen Chang, Chenglin Wu, Bang Liu
- Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance
Shehzeen Samarah Hussain, Paarth Neekhara, Xuesong Yang, Edresson Casanova, Subhankar Ghosh, Roy Fejgin, Mikyas T. Desta, Rafael Valle, Jason Li
- Mixing Inference-time Experts for Enhancing LLM Reasoning
Soumya Sanyal, Tianyi Xiao, Xiang Ren
- Reinforced Query Reasoners for Reasoning-intensive Retrieval Tasks
Xubo Qin, Jun Bai, Jiaqi Li, Zixia Jia, Zilong Zheng
- TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection
Wei Wu, Zhuoshi Pan, Kun Fu, Chao Wang, Liyi Chen, Yunchu Bai, Tianfu Wang, Zheng Wang, Hui Xiong
- MUSE: MCTS-Driven Red Teaming Framework for Enhanced Multi-Turn Dialogue Safety in Large Language Models
Siyu Yan, Long Zeng, Xuecheng Wu, Chengcheng Han, Kongcheng Zhang, Chong Peng, Xuezhi Cao, Xunliang Cai, Chenjuan Guo
- EnAnchored-X2X: English-Anchored Optimization for Many-to-Many Translation
Sen Yang, Yu Bao, Yu Lu, Jiajun Chen, Shujian Huang, Shanbo Cheng
- “I’ve Decided to Leak”: Probing Internals Behind Prompt Leakage Intents
Jianshuo Dong, Yutong Zhang, Liu Yan, Zhenyu Zhong, Tao Wei, Tianwei Zhang, Ke Xu, Minlie Huang, Chao Zhang, Han Qiu
- Nullspace Disentanglement for Red Teaming Language Models
Yi Han, Yuanxing Liu, Weinan Zhang, Ting Liu
- Supervised Attention Mechanism for Low-quality Multimodal Data
Sijie Mai, Shiqin Han, Haifeng Hu
- Reinforcement Learning for Large Language Models via Group Preference Reward Shaping
Huaisheng Zhu, Siyuan Xu, Hangfan Zhang, Teng Xiao, Zhimeng Guo, Shijie Zhou, Shuyue Hu, Vasant G Honavar
- zFLoRA: Zero-Latency Fused Low-Rank Adapters
Dhananjaya Gowda, Seoha Song, Harshith Goka, Junhyun Lee
- PLAN-TUNING: Post-Training Language Models to Learn Step-by-Step Planning for Complex Problem Solving
Mihir Parmar, Palash Goyal, Xin Liu, Yiwen Song, Mingyang Ling, Chitta Baral, Hamid Palangi, Tomas Pfister
- Semantic Inversion, Identical Replies: Revisiting Negation Blindness in Large Language Models
Jinsung Kim, Seonmin Koo, Heuiseok Lim
- AMACE: Automatic Multi-Agent Chart Evolution for Iteratively Tailored Chart Generation
Hyuk Namgoong, Jeesu Jung, Hyeonseok Kang, Sangkeun Jung
- ActionStudio: A Lightweight Framework for Data and Training of Large Action Models
Jianguo Zhang, Thai Quoc Hoang, Ming Zhu, Zuxin Liu, Shiyu Wang, Tulika Manoj Awalgaonkar, Akshara Prabhakar, Haolin Chen, Weiran Yao, Zhiwei Liu, Juntao Tan, Juan Carlos Niebles, Shelby Heinecke, Huan Wang, Silvio Savarese, Caiming Xiong
- Interpretation Meets Safety: A Survey on Interpretation Methods and Tools for Improving LLM Safety
Seongmin Lee, Aeree Cho, Grace C. Kim, ShengYun Peng, Mansi Phute, Duen Horng Chau
- Unveiling the Response of Large Vision-Language Models to Visually Absent Tokens
Sohee Kim, Soohyun Ryu, Joonhyung Park, Eunho Yang
- Improving Task Diversity in Label Efficient Supervised Finetuning of LLMs
Abhinav Arabelly, Jagrut Nemade, Robert D Nowak, Jifan Zhang
- Look Beyond Feeling: Unveiling Latent Needs from Implicit Expressions for Proactive Emotional Support
Xing Fu, Haozhen Li, Bichen Wang, Hao Yang, Yanyan Zhao, Bing Qin
- s3: You Don’t Need That Much Data to Train a Search Agent via RL
Pengcheng Jiang, Xueqiang Xu, Jiacheng Lin, Jinfeng Xiao, Zifeng Wang, Jimeng Sun, Jiawei Han
- FuseChat: Knowledge Fusion of Chat Models
Fanqi Wan, Longguang Zhong, Ziyi Yang, Ruijun Chen, Xiaojun Quan
- Continuous-Time Attention: PDE-Guided Mechanisms for Long-Sequence Transformers
Yukun Zhang, Xueqing Zhou
- Forget What You Know about LLMs Evaluations - LLMs are Like a Chameleon
Nurit Cohen Inger, Yehonatan Elisha, Bracha Shapira, Lior Rokach, Seffi Cohen
- Memorization or Reasoning? Exploring the Idiom Understanding of LLMs
Jisu Kim, Youngwoo Shin, Uiji Hwang, Jihun Choi, Richeng Xuan, Taeuk Kim
- RD-MCSA: A Multi-Class Sentiment Analysis Approach Integrating In-Context Classification Rationales and Demonstrations
Haihua Xie, Yinzhu Cheng, Yaqing Wang, Miao He, Mingming Sun
- Puzzled by Puzzles: When Vision-Language Models Can’t Take a Hint
Heekyung Lee, Jiaxin Ge, Tsung-Han Wu, Minwoo Kang, Trevor Darrell, David M. Chan
- CREPE: Rapid Chest X-ray Report Evaluation by Predicting Multi-category Error Counts
Gihun Cho, Seunghyun Jang, Hanbin Ko, Inhyeok Baek, Chang Min Park
- TIDES: Technical Information Discovery and Extraction System
Jihee Kim, Subeen Park, Hakyung Lee, YongTaek Lim, Hyo-won Suh, Kyungwoo Song
- Learning to Ask: When LLM Agents Meet Unclear Instruction
Wenxuan Wang, SHI Juluan, Zixuan Ling, Yuk-Kit Chan, Chaozheng Wang, Cheryl Lee, Youliang Yuan, Jen-tse Huang, Wenxiang Jiao, Michael R. Lyu
- RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction
Yuchi Wang, Yishuo Cai, Shuhuai Ren, Sihan Yang, Linli Yao, Yuanxin Liu, Yuanxing Zhang, Pengfei Wan, Xu Sun
- StepSearch: Igniting LLMs Search Ability via Step-Wise Proximal Policy Optimization
Xuhui Zheng, Kang An, Ziliang Wang, Yuhang Wang, Yichao Wu
- Dynamic Model-Bank Test-Time Adaptation for Automatic Speech Recognition
Yanshuo Wang, Yanghao Zhou, Yukang Lin, Haoxing Chen, Jin Zhang, Wentao Zhu, Jie Hong, Xuesong Li
- Mitigating Catastrophic Forgetting in Large Language Models with Forgetting-aware Pruning
Wei Huang, Anda Cheng, Yinggui Wang
- Does Localization Inform Unlearning? A Rigorous Examination of Local Parameter Attribution for Knowledge Unlearning in Language Models
Hwiyeong Lee, Uiji Hwang, Hyelim Lim, Taeuk Kim
- ArgCMV: An Argument Summarization Benchmark for the LLM-era
Omkar Gurjar, Agam Goyal, Eshwar Chandrasekharan
- VistaWise: Building Cost-Effective Agent with Cross-Modal Knowledge Graph for Minecraft
Honghao Fu, Junlong Ren, Qi Chai, Deheng Ye, Yujun Cai, Hao Wang
- GraphKV: Breaking the Static Selection Paradigm with Graph-Based KV Cache Eviction
Xuelin Li, Xiangqi Jin, Linfeng Zhang
- Joint Modeling of Entities and Discourse Relations for Coherence Assessment
Wei Liu, Michael Strube
- Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs
Jun Bai, Minghao Tong, Yang Liu, Zixia Jia, Zilong Zheng
- HMoE: Heterogeneous Mixture of Experts for Language Modeling
An Wang, Xingwu Sun, Ruobing Xie, Shuaipeng Li, Jiaqi Zhu, Zhen Yang, Pinxue Zhao, Weidong Han, Zhanhui Kang, Di Wang, Naoaki Okazaki, Cheng-zhong Xu
- The Ranking Blind Spot: Decision Hijacking in LLM-based Text Ranking
Yaoyao Qian, Yifan Zeng, Yuchao Jiang, Chelsi Jain, Huazheng Wang
- Uniform Information Density and Syntactic Reduction: Revisiting that-Mentioning in English Complement Clauses
Hailin Hao, Elsi Kaiser
- GRIT: Guided Relational Integration for Efficient Multi-Table Understanding
Yujin Kang, Park Seong Woo, Yoon-Sik Cho
- RPDR: A Round-trip Prediction-Based Data Augmentation Framework for Long-Tail Question Answering
Yiming Zhang, Siyue Zhang, Junbo Zhao, Chen Zhao
- Discrepancy Detection at the Data Level: Toward Consistent Multilingual Question Answering
Lorena Calvo-Bartolomé, Valérie Aldana, Karla Cantarero, Alonso Madroñal de Mesa, Jerónimo Arenas-García, Jordan Lee Boyd-Graber
- Data-Efficient Selection via Grammatical Complexity in Continual Pre-training of Domain-Specific LLMs
Yizhou Ying, Geng Zhang, Cui Danxin, Chengyu Du, Guanglei Yue, Sihang Jiang, Jiaqing Liang, Yifei Fu, Hailin Hu, Yanghua Xiao
- Comprehensive and Efficient Distillation for Lightweight Sentiment Analysis Models
Guangyu Xie, Yice Zhang, Jianzhu Bao, Qianlong Wang, Yang Sun, Bingbing Wang, Ruifeng Xu
- One Planner To Guide Them All! Learning Adaptive Conversational Planners for Goal-oriented Dialogues
Huy Quang Dao, Lizi Liao
- Unsupervised Hallucination Detection by Inspecting Reasoning Processes
Ponhvoan Srey, Xiaobao Wu, Anh Tuan Luu
- Multimodal Neural Machine Translation: A Survey of the State of the Art
Yi Feng, Chuanyi Li, Jiatong He, Zhenyu Hou, Vincent Ng
- Lemmatization of Polish Multi-word Expressions
Magdalena Król, Aleksander Smywiński-Pohl, Zbigniew Kaleta, Paweł Lewkowicz
- Targeted Distillation for Sentiment Analysis
Yice Zhang, Guangyu Xie, Jingjie Lin, Jianzhu Bao, Qianlong Wang, Xi Zeng, Ruifeng Xu
- DiffusionAttacker: Diffusion-Driven Prompt Manipulation for LLM Jailbreak
Hao Wang, Hao Li, Junda Zhu, Xinyuan Wang, Chengwei Pan, Minlie Huang, Lei Sha
- Rank-Awareness and Angular Constraints: A New Perspective on Learning Sentence Embeddings from NLI Data
Zicheng Zhou, Min Huang, Qinghai Miao
- LLM-Guided Semantic Relational Reasoning for Multimodal Intent Recognition
Qianrui Zhou, Hua Xu, Yifan Wang, Xinzhi Dong, Hanlei Zhang
- Seeing Culture: A Benchmark for Visual Reasoning and Grounding
Burak Satar, Zhixin Ma, Patrick Amadeus Irawan, Wilfried Ariel Mulyawan, Jing Jiang, Ee-Peng Lim, Chong-Wah Ngo
- GRADA: Graph-based Reranking against Adversarial Documents Attack
Jingjie Zheng, Aryo Pradipta Gema, Giwon Hong, Xuanli He, Pasquale Minervini, Youcheng Sun, Qiongkai Xu
- Orchestrating Audio: Multi-Agent Framework for Long-Video Audio Synthesis
Yehang Zhang, Xinli Xu, Xiaojie Xu, Doudou ZHANG, Li Liu, Ying-Cong Chen
- MADAWSD: Multi-Agent Debate Framework for Adversarial Word Sense Disambiguation
Kaiyuan Zhang, Qian Liu, Luyang Zhang, Chaoqun Zheng, Shuaimin Li, Bing Xu, Muyun Yang, Xinxiao Qiao, Wenpeng Lu
- Interpretable Text Embeddings and Text Similarity Explanation: A Survey
Juri Opitz, Lucas Moeller, Andrianos Michail, Sebastian Padó, Simon Clematide
- Dyve: Thinking Fast and Slow for Dynamic Process Verification
Jianyuan Zhong, Zeju Li, Zhijian Xu, Xiangyu Wen, Qiang Xu
- PERSEVAL: A Framework for Perspectivist Classification Evaluation
Soda Marem Lo, Silvia Casola, Erhan Sezerer, Valerio Basile, Franco Sansonetti, Antonio Uva, Davide Bernardi
- Massive Supervised Fine-tuning Experiments Reveal How Data, Layer, and Training Factors Shape LLM Alignment Quality
Yuto Harada, Yusuke Yamauchi, Yusuke Oda, Yohei Oseki, Yusuke Miyao, Yu Takagi
- IndiGEC: Multilingual Grammar Error Correction for Low-Resource Indian Languages
Ujjwal Sharma, Pushpak Bhattacharyya
- Bias Beware: The Impact of Cognitive Biases on LLM-Driven Product Recommendations
Giorgos Filandrianos, Angeliki Dimitriou, Maria Lymperaiou, Konstantinos Thomas, Giorgos Stamou
- T2R-BENCH: A Benchmark for Real World Table-to-Report Task
Jie Zhang, Changzai Pan, Sishi Xiong, Kaiwen Wei, Yu Zhao, Xiangyu Li, Jiaxin Peng, Xiaoyan Gu, Jian Yang, Wenhan Chang, Zhenhe Wu, Jiang Zhong, Shuangyong Song, Xuelong Li
- TCP: a Benchmark for Temporal Constraint-Based Planning
Zifeng Ding, Sikuan Yan, Moy Yuan, Xianglong Hu, Fangru Lin, Andreas Vlachos
- The Role of Outgoing Connection Heterogeneity in Feedforward Layers of Large Language Models
Felix Stahlberg, Shankar Kumar
- Follow the Flow: Fine-grained Flowchart Attribution with Neurosymbolic Agents
Manan Suri, Puneet Mathur, Nedim Lipka, Franck Dernoncourt, Ryan A. Rossi, Vivek Gupta, Dinesh Manocha
- Collaborative Rational Speech Act: Pragmatic Reasoning for Multi-Turn Dialog
Lautaro Estienne, Gabriel Ben Zenou, Nona Naderi, Jackie CK Cheung, Pablo Piantanida
- Understanding Subword Compositionality of Large Language Models
Qiwei Peng, Yekun Chai, Anders Søgaard
- Internal Chain-of-Thought: Empirical Evidence for Layer‑wise Subtask Scheduling in LLMs
Zhipeng Yang, Junzhuo Li, Siyu Xia, Xuming Hu
- From Understanding to Generation: An Efficient Shortcut for Evaluating Language Models
Viktor Hangya, Fabian Küch, Darina Gold
- Debiasing Multilingual LLMs in Cross-lingual Latent Space
Qiwei Peng, Guimin Hu, Yekun Chai, Anders Søgaard
- Context is Gold to find the Gold Passage: Evaluating and Training Contextual Document Embeddings
Max Conti, Manuel Faysse, Gautier Viaud, Antoine Bosselut, Celine Hudelot, Pierre Colombo
- MS-RAG: Simple and Effective Multi-Semantic Retrieval-Augmented Generation
Xiaozhou You, Yahui Luo, Lihong Gu
- Transitive self-consistency evaluation of NLI models without gold labels
Wei Wu, Mark Last
- MiLQ: Benchmarking IR Models for Bilingual Web Search with Mixed Language Queries
Jonghwi Kim, Deokhyung Kang, Seonjeong Hwang, Yunsu Kim, Jungseul Ok, Gary Lee
- Enhancing Chinese Offensive Language Detection with Homophonic Perturbation
Junqi Wu, Jishujie, Kang Zhong, Huiling Peng, Zhendongxiao, Xiongding Liu, Wu Wei
- Persona-Augmented Benchmarking: Evaluating LLMs Across Diverse Writing Styles
Kimberly Truong, Riccardo Fogliato, Hoda Heidari, Steven Wu
- Computational Analysis of Character Development in Holocaust Testimonies
Esther Shizgal, Eitan Wagner, Renana Keydar, Omri Abend
- TASO: Task-Aligned Sparse Optimization for Parameter-Efficient Model Adaptation
Daiye Miao, Yufang Liu, Jie Wang, Changzhi Sun, Yunke Zhang, Demei Yan, Shaokang Dong, Qi Zhang, Yuanbin Wu
- Dual-Path Counterfactual Integration for Multimodal Aspect-Based Sentiment Classification
Rui Liu, Jiahao Cao, Jiaqian Ren, Xu Bai, Yanan Cao
- Job Unfair: An Investigation of Gender and Occupational Bias in Free-Form Text Completions by LLMs
Camilla Casula, Sebastiano Vecellio Salto, Elisa Leonardelli, Sara Tonelli
- C3: A Bilingual Benchmark for Spoken Dialogue Models Exploring Challenges in Complex Conversations
Chengqian Ma, Wei Tao, Steven Y. Guo
- Understanding LLMs’ Cross-Lingual Context Retrieval: How Good It Is And Where It Comes From
Changjiang Gao, Hankun Lin, Xin Huang, Xue Han, Junlan Feng, Chao Deng, Jiajun Chen, Shujian Huang
- Blind Men and the Elephant: Diverse Perspectives on Gender Stereotypes in Benchmark Datasets
Mahdi Zakizadeh, Mohammad Taher Pilehvar
- Linguistic and Embedding-Based Profiling of Texts Generated by Humans and Large Language Models
Sergio E. Zanotto, Segun Aroyehun
- An Interdisciplinary Approach to Human-Centered Machine Translation
Marine Carpuat, Omri Asscher, Kalika Bali, Luisa Bentivogli, Fred Blain, Lynne Bowker, Monojit Choudhury, Hal Daumé III, Kevin Duh, Ge Gao, Alvin C Grissom II, Marzena Karpinska, Elaine C Khoong, William D. Lewis, Andre Martins, Mary Nurminen, Douglas W. Oard, Maja Popovic, Michel Simard, François Yvon
- Exploring the Hidden Capacity of LLMs for One-Step Text Generation
Gleb Mezentsev, Ivan Oseledets
- Mixture of Weight-shared Heterogeneous Group Attention Experts for Dynamic Token-wise KV Optimization
Guanghui Song, Dongping Liao, Yiren Zhao, Kejiang Ye, Cheng-zhong Xu, Xitong Gao
- PathwiseRAG: Multi-Dimensional Exploration and Integration Framework
Hengrui Zhang, Pin-Siang Huang, Zhen Zhang, Peican Lin, Yao-Ching Yu, Bo Hu, Yulu Du
- “Mm, Wat?” Detecting Other-initiated Repair Requests in Dialogue
Anh Ha Ngo, Nicolas Rollet, Catherine Pelachaud, Chloé Clavel
- R-BPE: Improving BPE-Tokenizers with Token Reuse
Nancy Hamdan, Osama Rakan Al Mraikhat, Fadi Zaraket
- Language Models Can be Efficiently Steered via Minimal Embedding Layer Transformations
Diogo Tavares, David Semedo, Joao Magalhaes, Alexander Rudnicky
- Adversarial Attacks Against Automated Fact-Checking: A Survey
Fanzhen Liu, Sharif Abuadbba, Kristen Moore, Surya Nepal, Cecile Paris, Jia Wu, Jian Yang, Quan Z. Sheng
- WildDoc: How Far Are We from Achieving Comprehensive and Robust Document Understanding in the Wild?
An-Lan Wang, Jingqun Tang, Lei Liao, Hao Feng, Qi Liu, Xiang Fei, Jinghui Lu, Han Wang, Hao Liu, Yuliang Liu, Xiang Bai, Can Huang
- DCR: Quantifying Data Contamination in LLMs Evaluation
Cheng Xu, Nan Yan, Shuhao Guan, Changhong Jin, Yuke Mei, Yibing Guo, Tahar Kechadi
- Building Trust in Clinical LLMs: Bias Analysis and Dataset Transparency
Svetlana Maslenkova, Clement Christophe, Marco AF Pimentel, Tathagata Raha, Muhammad Umar Salman, Ahmed Al Mahrooqi, Avani Gupta, Shadab Khan, Ronnie Rajan, Praveenkumar Kanithi
- Surprise Calibration for Better In-Context Learning
Zhihang Tan, Jingrui Hou, Ping Wang, Qibiao Hu, Peng Zhu
- SPARK: Simulating the Co-evolution of Stance and Topic Dynamics in Online Discourse with LLM-based Agents
Bowen Zhang, Yi Yang, Fuqiang Niu, Xianghua Fu, Genan Dai, Hu Huang
- Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Yang Wang, Chenghao Xiao, Chia-Yi Hsiao, Zi Yan Chang, Chi-Li Chen, Tyler Loakman, Chenghua Lin
- Can Large Language Models be Effective Online Opinion Miners?
Ryang Heo, Yongsik Seo, Junseong Lee, Dongha Lee
- Can Large Language Models Translate Unseen Languages in Underrepresented Scripts?
Dianqing Lin, Aruukhan, Hongxu Hou, Shuo Sun, Wei Chen, Yichen Yang, Guo dong Shi
- KCS: Diversify Multi-hop Question Generation with Knowledge Composition Sampling
Yangfan Wang, Jie Liu, Chen Tang, Lian Yan, Jingchi Jiang
- Fooling the LVLM Judges: Visual Biases in LVLM-Based Evaluation
Yerin Hwang, Dongryeol Lee, Kyungmin Min, Taegwan Kang, Yongil Kim, Kyomin Jung
- Disentangled Information Bottleneck for Adversarial Text Defense
Yidan Xu, Xinghao Yang, Wei Liu, Bao-di Liu, Weifeng Liu
- How do Language Models Reshape Entity Alignment? A Survey of LM-Driven EA Methods: Advances, Benchmarks, and Future
Zerui Chen, Huiming Fan, Qianyu Wang, Tao He, Ming Liu, Heng Chang, Weijiang Yu, Ze Li, Bing Qin
- Enhancing LLM-Based Social Bot via an Adversarial Learning Framework
Fanqi Kong, Xiaoyuan Zhang, Xinyu Chen, Yaodong Yang, Song-Chun Zhu, Xue Feng
- GER-LLM: Efficient and Effective Geospatial Entity Resolution with Large Language Model
Haojia Zhu, Zhicheng Li, Jiahui Jin
- CodeRAG: Finding Relevant and Necessary Knowledge for Retrieval-Augmented Repository-Level Code Completion
Sheng Zhang, Yifan Ding, Shuquan Lian, Shun Song, Hui Li
- Searching for the Most Human-like Emergent Language
Brendon Boldt, David R. Mortensen
- Does Context Matter? A Prosodic Comparison of English and Spanish in Monolingual and Multilingual Discourse Settings
Debasmita Bhattacharya, David Sasu, Michela Marchini, Natalie Schluter, Julia Hirschberg
- ZERA: Zero-init Instruction Evolving Refinement Agent – From Zero Instructions to Structured Prompts via Principle-based Optimization
Seungyoun Yi, Minsoo Khang, Sungrae Park
- Toward Machine Interpreting: Lessons from Human Interpreting Studies
Matthias Sperber, Maureen de Seyssel, Jiajun Bao, Matthias Paulik
- FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games
Jaewoo Ahn, Junseo Kim, Heeseung Yun, Jaehyeon Son, Dongmin Park, Jaewoong Cho, Gunhee Kim
- FLARE: Faithful Logic-Aided Reasoning and Exploration
Erik Arakelyan, Pasquale Minervini, Patrick Lewis, Pat Verga, Isabelle Augenstein
- Discourse-Driven Code-Switching: Analyzing the Role of Content and Communicative Function in Spanish-English Bilingual Speech
Debasmita Bhattacharya, Juan Junco, Divya Tadimeti, Julia Hirschberg
- Can Large Language Models Translate Spoken-Only Languages through International Phonetic Transcription?
Jiale Chen, Xuelian Dong, Qihao Yang, Wenxiu Xie, Tianyong Hao
- ClimateViz: A Benchmark for Statistical Reasoning and Fact Verification on Scientific Charts
Ruiran Su, Jiasheng Si, Zhijiang Guo, Janet B. Pierrehumbert
- Bridging the Gap Between Molecule and Textual Descriptions via Substructure-aware Alignment
Hyuntae Park, Yeachan Kim, SangKeun Lee
- SLlama: Parameter-Efficient Language Model Architecture for Enhanced Linguistic Competence Under Strict Data Constraints
Victor Adelakun Omolaoye, Babajide Alamu Owoyele, Gerard de Melo
- What You See is What You Ask: Evaluating Audio Descriptions
Divy Kala, Eshika Khandelwal, Makarand Tapaswi
- TAPS: Tool-Augmented Personalisation via Structured Tagging
Ekaterina Taktasheva, Jeff Dalton
- Investigating How Pre-training Data Leakage Affects Models’ Reproduction and Detection Capabilities
Masahiro Kaneko, Timothy Baldwin
- Walk and Read Less: Improving the Efficiency of Vision-and-Language Navigation via Tuning-Free Multimodal Token Pruning
Wenda Qin, Andrea Burns, Bryan A. Plummer, Margrit Betke
- Connecting the Knowledge Dots: Retrieval-augmented Knowledge Connection for Commonsense Reasoning
Junho Kim, Soyeon Bak, Mingyu Lee, Minju Hong, Songha Kim, Tae-Eui Kam, SangKeun Lee
- Agent-as-Judge for Factual Summarization of Long Narratives
Yeonseok Jeong, Minsoo Kim, Seung-won Hwang, Byung-Hak Kim
- DnDScore: Decontextualization and Decomposition for Factuality Verification in Long-Form Text Generation
Miriam Wanner, Benjamin Van Durme, Mark Dredze
- RAcQUEt: Unveiling the Dangers of Overlooked Referential Ambiguity in Visual LLMs
Alberto Testoni, Barbara Plank, Raquel Fernández
- Resource-Rational Noisy-Channel Language Processing: Testing the Effect of Algorithmic Constraints on Inferences
Thomas Hikaru Clark, Jacob Hoover Vigly, Edward Gibson, Roger P. Levy
- In Benchmarks We Trust … Or Not?
Ine Gevers, Victor De Marez, Jens Van Nooten, Jens Lemmens, Andriy Kosar, Ehsan Lotfi, Nikolay Banar, Pieter Fivez, Luna De Bruyne, Walter Daelemans
- Video2Roleplay: A Multimodal Dataset and Framework for Video-Guided Role-playing Agents
Xueqiao Zhang, Chao Zhang, Jingtao Xu, Yifan Zhu, Xin Shi, Yi Yang, Yawei Luo
- Discriminating Form and Meaning in Multilingual Models with Minimal-Pair ABX Tasks
Maureen de Seyssel, Jie Chi, Skyler Seto, Maartje Ter Hoeve, Masha Fedzechkina, Natalie Schluter
- Rethinking Text-based Protein Understanding: Retrieval or LLM?
Juntong Wu, Zijing Liu, He Cao, Li Hao, Bin Feng, Zishan Shu, Ke Yu, Li Yuan, Yu Li
- Grounded Semantic Role Labelling from Synthetic Multimodal Data for Situated Robot Commands
Claudiu Daniel Hromei, Antonio Scaiella, Danilo Croce, Roberto Basili
- Easy as PIE? Identifying Multi-Word Expressions with LLMs
Kai Golan Hashiloni, Ofri Hefetz, Kfir Bar
- Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking
Wuwei Zhang, Fangcong Yin, Howard Yen, Danqi Chen, Xi Ye
- Robust Adaptation of Large Multimodal Models for Retrieval Augmented Hateful Meme Detection
Jingbiao Mei, Jinghong Chen, Guangyu Yang, Weizhe Lin, Bill Byrne
- Audio-Reasoner: Improving Reasoning Capability in Large Audio Language Models
Xie Zhifei, Mingbao Lin, Zihang Liu, Pengcheng Wu, Shuicheng Yan, Chunyan Miao
- From perception to production: how acoustic invariance facilitates articulatory learning in a self-supervised vocal imitation model
Marvin Lavechin, Thomas Hueber
- REALM: Recursive Relevance Modeling for LLM-based Document Re-Ranking
Pinhuan Wang, Zhiqiu Xia, Chunhua Liao, Feiyi Wang, Hang Liu
- PLLuM-Align: Polish Preference Dataset for Large Language Model Alignment
Karolina Seweryn, Anna Kołos, Agnieszka Karlińska, Katarzyna Lorenc, Katarzyna Dziewulska, Maciej Chrabaszcz, Aleksandra Krasnodębska, Paula Betscher, Zofia Cieślińska, Katarzyna Kowol, Julia Moska, Dawid Motyka, Paweł Walkowiak, Bartosz Żuk, Arkadiusz Janz
- Graph-R1: Incentivizing the Zero-Shot Graph Learning Capability in LLMs via Explicit Reasoning
Yicong Wu, Guangyue Lu, Yuan Zuo, Huarong Zhang, Junjie Wu
- Scalable and Culturally Specific Stereotype Dataset Construction via Human-LLM Collaboration
Weicheng Ma, John J. Guerrerio, Soroush Vosoughi
- Can Large Language Models Be Good Language Teachers?
LiQing Xu, Qiwei Li, Tianshuo Peng, Zuchao Li, Hai Zhao, Ping Wang
- Empowering Math Problem Generation and Reasoning for Large Language Model via Synthetic Data based Continual Learning Framework
Qian Wan, Wangzi Shi, Jintian Feng, Shengyingjie Liu, Luona Wei, Zhicheng Dai, Jianwen Sun
- Tokenization and Representation Biases in Multilingual Models on Dialectal NLP Tasks
Vani Kanjirangat, Tanja Samardzic, Ljiljana Dolamic, Fabio Rinaldi
- Evaluating the Evaluators: Are readability metrics good measures of readability?
Isabel Cachola, Daniel Khashabi, Mark Dredze
- Text Takes Over: A Study of Modality Bias in Multimodal Intent Detection
Ankan Mullick, Saransh Sharma, Abhik Jana, Pawan Goyal
- What’s in a prompt? Language models encode literary style in prompt embeddings
Raphaël Sarfati, Haley Moller, Toni J.B. Liu, Nicolas Boulle, Christopher Earls
- Identifying and Answering Questions with False Assumptions: An Interpretable Approach
Zijie Wang, Eduardo Blanco
- VisFinEval: A Scenario-Driven Chinese Multimodal Benchmark for Holistic Financial Understanding
Zhaowei Liu, Xin Guo, Haotian Xia, Lingfeng Zeng, Fangqi Lou, Jinyi Niu, Mengping Li, Qi Qi, Jiahuan Li, Wei Zhang, Yinglong Wang, Weige Cai, Weining Shen, Liwen Zhang
- Socratic-MCTS: Test-Time Visual Reasoning by Asking the Right Questions
David Acuna, Ximing Lu, Jaehun Jung, Hyunwoo Kim, Amlan Kar, Sanja Fidler, Yejin Choi
- LLMs Don’t Know Their Own Decision Boundaries: The Unreliability of Self-Generated Counterfactual Explanations
Harry Mayne, Ryan Othniel Kearns, Yushi Yang, Andrew M. Bean, Eoin D. Delaney, Chris Russell, Adam Mahdi
- Grounding Multilingual Multimodal LLMs With Cultural Knowledge
Jean de Dieu Nyandwi, Yueqi Song, Simran Khanuja, Graham Neubig
- Following Length Constraints in Instructions
Weizhe Yuan, Ilia Kulikov, Ping Yu, Kyunghyun Cho, Sainbayar Sukhbaatar, Jason E Weston, Jing Xu
- Memory-QA: Answering Recall Questions Based on Multimodal Memories
Hongda Jiang, Xinyuan Zhang, Siddhant Garg, Rishab Arora, Shiun-Zu Kuo, Jiayang Xu, Aaron Colak, Xin Luna Dong
- NEXUS: Network Exploration for eXploiting Unsafe Sequences in Multi-Turn LLM Jailbreaks
Javad Rafiei Asl, Sidhant Narula, Mohammad Ghasemigol, Eduardo Blanco, Daniel Takabi
- Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Simon A. Aytes, Jinheon Baek, Sung Ju Hwang
- From Language to Cognition: How LLMs Outgrow the Human Language Network
Badr AlKhamissi, Greta Tuckute, Yingtian Tang, Taha Osama A Binhuraib, Antoine Bosselut, Martin Schrimpf
- Logos as a Well-Tempered Pre-train for Sign Language Recognition
Ilya Ovodov, Petr Surovtsev, Karina Kvanchiani, Alexander Kapitanov, Alexander Nagaev
- Hallucination Detection in LLMs Using Spectral Features of Attention Maps
Jakub Binkowski, Denis Janiak, Albert Sawczyn, Bogdan Gabrys, Tomasz Jan Kajdanowicz
- Composable Cross-prompt Essay Scoring by Merging Models
Sanwoo Lee, Kun Liang, Yunfang Wu
- Towards a Holistic and Automated Evaluation Framework for Multi-Level Comprehension of LLMs in Book-Length Contexts
Yuho Lee, Jiaqi Deng, Nicole Hee-Yeon Kim, Hyangsuk Min, Taewon Yun, Minjeong Ban, Kim Yul, Hwanjun Song
- Improving Large Language Models Function Calling and Interpretability via Guided-Structured Templates
Hy Dang, Tianyi Liu, Zhuofeng Wu, Jingfeng Yang, Haoming Jiang, Tao Yang, Pei Chen, Zhengyang Wang, Helen Wang, Huasheng Li, Bing Yin, Meng Jiang
- Evaluation and Facilitation of Online Discussions in the LLM Era: A Survey
Katerina Korre, Dimitris Tsirmpas, Nikos Gkoumas, Emma Cabalé, Danai Myrtzani, Theodoros Evgeniou, Ion Androutsopoulos, John Pavlopoulos
- Temporal Scaling Law for Large Language Models
Yizhe Xiong, Xiansheng Chen, Xin Ye, Hui Chen, Zijia Lin, Haoran Lian, Zhenpeng Su, Wei Huang, Jianwei Niu, Jungong Han, Guiguang Ding
- Reframe Your Life Story: Interactive Narrative Therapist and Innovative Moment Assessment with Large Language Models
Yi Feng, Jiaqi Wang, Wenxuan Zhang, Zhuang Chen, Shen Yutong, Xiyao Xiao, Minlie Huang, Liping Jing, Jian Yu
- From Word to World: Evaluate and Mitigate Culture Bias in LLMs via Word Association Test
Xunlian Dai, Li Zhou, Benyou Wang, Haizhou Li
- Mitigating the Privacy Issues in Retrieval-Augmented Generation (RAG) via Pure Synthetic Data
Shenglai Zeng, Jiankun Zhang, Pengfei He, Jie Ren, Tianqi Zheng, Hanqing Lu, Han Xu, Hui Liu, Yue Xing, Jiliang Tang
- AdaSteer: Your Aligned LLM is Inherently an Adaptive Jailbreak Defender
Weixiang Zhao, Jiahe Guo, Yulin Hu, Yang Deng, An Zhang, Xingyu Sui, Xinyang Han, Yanyan Zhao, Bing Qin, Tat-Seng Chua, Ting Liu
- Large Language Models Meet Knowledge Graphs for Question Answering: Synthesis and Opportunities
Chuangtao Ma, Yongrui Chen, Tianxing Wu, Arijit Khan, Haofen Wang
- TFDP: Token-Efficient Disparity Audits for Autoregressive LLMs via Single-Token Masked Evaluation
Inderjeet Singh, Ramya Srinivasan, Roman Vainshtein, Hisashi Kojima
- Hanfu-Bench: A Multimodal Benchmark on Cross-Temporal Cultural Understanding and Transcreation
Li Zhou, Lutong Yu, Dongchu Xie, Shaohuan Cheng, Wenyan Li, Haizhou Li
- MERMAID: Multi-perspective Self-reflective Agents with Generative Augmentation for Emotion Recognition
Zhongyu Yang, Junhao Song, Siyang Song, Wei Pang, Yingfang Yuan
- Personality Vector: Modulating Personality of Large Language Models by Model Merging
Seungjong Sun, Seo Yeon Baek, Jang Hyun Kim
- Beyond Outlining: Heterogeneous Recursive Planning for Adaptive Long-form Writing with Language Models
Ruibin Xiong, Yimeng Chen, Dmitrii Khizbullin, Mingchen Zhuge, Jürgen Schmidhuber
- Hidden in Plain Sight: Reasoning in Underspecified and Misspecified Scenarios for Multimodal LLMs
Qianqi Yan, Hongquan Li, Shan Jiang, Yang Zhao, Xinze Guan, Ching-Chen Kuo, Xin Eric Wang
- PrimeX: A Dataset of Worldview, Opinion, and Explanation
Rik Koncel-Kedziorski, Brihi Joshi, Tim Paek
- LASER: An LLM-based ASR Scoring and Evaluation Rubric
Amruta Parulekar, Preethi Jyothi
- Improving Zero-shot Sentence Decontextualisation with Content Selection and Planning
Zhenyun Deng, Yulong Chen, Andreas Vlachos
- Beyond Text: Unveiling Privacy Vulnerabilities in Multi-modal Retrieval-Augmented Generation
Jiankun Zhang, Shenglai Zeng, Jie Ren, Tianqi Zheng, Hui Liu, Xianfeng Tang, Hui Liu, Yi Chang
- Code Execution as Grounded Supervision for LLM Reasoning
Dongwon Jung, Wenxuan Zhou, Muhao Chen
- Subjective Behaviors and Preferences in LLM: Language of Browsing
Sai Sundaresan, Harshita Chopra, Atanu R. Sinha, Koustava Goswami, Nagasai Saketh Naidu, Raghav Karan, N Anushka
- Pixels Versus Priors: Controlling Knowledge Priors in Vision-Language Models through Visual Counterfacts
Michal Golovanevsky, William Rudman, Michael A. Lepori, Amir Bar, Ritambhara Singh, Carsten Eickhoff
- Balcony: A Lightweight Approach to Dynamic Inference of Generative Language Models
Benyamin Jamialahmadi, Parsa Kavehzadeh, Mehdi Rezagholizadeh, Parsa Farinneya, Hossein Rajabzadeh, Aref Jafari, Boxing Chen, Marzieh S. Tahaei
- Social Genome: Grounded Social Reasoning Abilities of Multimodal Models
Leena Mathur, Marian Qian, Paul Pu Liang, Louis-Philippe Morency
- Profiler: Black-box AI-generated Text Origin Detection via Context-aware Inference Pattern Analysis
Hanxi Guo, Siyuan Cheng, Xiaolong Jin, Zhuo Zhang, Guangyu Shen, Kaiyuan Zhang, Shengwei An, Guanhong Tao, Xiangyu Zhang
- Speech Discrete Tokens or Continuous Features? A Comparative Analysis for Spoken Language Understanding in SpeechLLMs
Dingdong Wang, Junan Li, Mingyu Cui, Dongchao Yang, Xueyuan Chen, Helen M. Meng
- RAG-Zeval: Enhancing RAG Responses Evaluator through End-to-End Reasoning and Ranking-Based Reinforcement Learning
Kun Li, Yunxiang Li, Tianhua Zhang, Hongyin Luo, Xixin Wu, James R. Glass, Helen M. Meng
- Mahānāma: A Unique Testbed for Literary Entity Discovery and Linking
Sujoy Sarkar, Gourav Sarkar, Manoj Balaji Jagadeeshan, Jivnesh Sandhan, Amrith Krishna, Pawan Goyal
- Adaptively profiling models with task elicitation
Davis Brown, Prithvi Balehannina, Helen Jin, Shreya Havaldar, Hamed Hassani, Eric Wong
- TactfulToM: Do LLMs have the Theory of Mind ability to understand White Lies?
Yiwei Liu, Emma Jane Pretty, Jiahao Huang, Saku Sugawara
- Don’t Sweat the Small Stuff: Segment-Level Meta-Evaluation Based on Pairwise Difference Correlation
Colten DiIanni, Daniel Deutsch
- SMART: Simulated Students Aligned with Item Response Theory for Question Difficulty Prediction
Alexander Scarlatos, Nigel Fernandez, Christopher Ormerod, Susan Lottridge, Andrew Lan
- HESEIA: A community-based dataset for evaluating social biases in large language models, co-designed in real school settings in Latin America
Guido Ivetta, Marcos J Gomez, Sofía Martinelli, Pietro Palombini, M Emilia Echeveste, Nair Carolina Mazzeo, Beatriz Busaniche, Luciana Benotti
- WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation
Rabiul Awal, Mahsa Massoud, Aarash Feizi, Zichao Li, Suyuchen Wang, Christopher Pal, Aishwarya Agrawal, David Vazquez, Siva Reddy, Juan A. Rodriguez, Perouz Taslakian, Spandana Gella, Sai Rajeswar
- Analyzing values about gendered language reform in LLMs’ revisions
Jules Watson, Xi Wang, Raymond Liu, Suzanne Stevenson, Barend Beekhuizen
- ALLabel: Three-stage Active Learning for LLM-based Entity Recognition using Demonstration Retrieval
Zihan Chen, Lei Shi, Weize Wu, Qiji Zhou, Yue Zhang
- HyperKGR: Knowledge Graph Reasoning in Hyperbolic Space with Graph Neural Network Encoding Symbolic Path
Lihui Liu
- LLaMP: Large Language Model Made Powerful for High-fidelity Materials Knowledge Retrieval
Yuan Chiang, Elvis Hsieh, Chia-Hong Chou, Janosh Riebesell
- ReSeeding Latent States for Sequential Language Understanding
Stéphane Aroca-Ouellette, Katharina von der Wense, Alessandro Roncone
- DPED: Multi-Layer Noise Distillation for Privacy-Preserving Text Embeddings
Shuya Feng, Yuan Hong
- Identifying & Interactively Refining Ambiguous User Goals for Data Visualization Code Generation
Mert Inan, Anthony Sicilia, Alex Xie, Saujas Vaduguru, Daniel Fried, Malihe Alikhani
- Morpheme Induction for Emergent Language
Brendon Boldt, David R. Mortensen
- Stepwise Informativeness Search for Improving LLM Reasoning
Siyuan Wang, Enda Zhao, Xiang Ren
- Social Good or Scientific Curiosity? Uncovering the Research Framing Behind NLP Artefacts
Eric Chamoun, Nedjma Ousidhoum, Michael Sejr Schlichtkrull, Andreas Vlachos
- FairGen: Controlling Sensitive Attributes for Fair Generations in Diffusion Models via Adaptive Latent Guidance
Mintong Kang, Vinayshekhar Bannihatti Kumar, Shamik Roy, Abhishek Kumar, Sopan Khosla, Balakrishnan Murali Narayanaswamy, Rashmi Gangadharaiah
- Contra4: Evaluating Contrastive Cross-Modal Reasoning in Audio, Video, Image, and 3D
Artemis Panagopoulou, Le Xue, Honglu Zhou, Silvio Savarese, Ran Xu, Caiming Xiong, Chris Callison-Burch, Mark Yatskar, Juan Carlos Niebles
- Proactive Hearing Assistants that Isolate Egocentric Conversations
Guilin Hu, Malek Itani, Tuochao Chen, Shyamnath Gollakota
- fLSA: Learning Semantic Structures in Document Collections Using Foundation Models
Weijia Xu, Nebojsa Jojic, Nicolas Le Roux
- SafeKey: Amplifying Aha-Moment Insights for Safety Reasoning
Kaiwen Zhou, Xuandong Zhao, Jayanth Srinivasa, Gaowen Liu, Aosong Feng, Dawn Song, Xin Eric Wang
- HypER: Literature-grounded Hypothesis Generation and Distillation with Provenance
Rosni Vasu, Chandrayee Basu, Bhavana Dalvi Mishra, Cristina Sarasua, Peter Clark, Abraham Bernstein
- Empowering GraphRAG with Knowledge Filtering and Integration
Kai Guo, Harry Shomer, Shenglai Zeng, Haoyu Han, Yu Wang, Jiliang Tang
- Interpretable Mnemonic Generation for Kanji Learning via Expectation-Maximization
Jaewook Lee, Alexander Scarlatos, Andrew Lan
- Refining Attention for Explainable and Noise-Robust Fact-Checking with Transformers
Jean-Flavien Bussotti, Paolo Papotti
- Harmful Prompt Laundering: Jailbreaking LLMs with Abductive Styles and Symbolic Encoding
Seongho Joo, Hyukhun Koh, Kyomin Jung
- Pathway to Relevance: How Cross-Encoders Implement a Semantic Variant of BM25
Meng Lu, Catherine Chen, Carsten Eickhoff
- Rewarding the Unlikely: Lifting GRPO Beyond Distribution Sharpening
Andre Wang He, Daniel Fried, Sean Welleck
- PhoniTale: Phonologically Grounded Mnemonic Generation for Typologically Distant Language Pairs
Sana Kang, Myeongseok Gwon, Su Young Kwon, Jaewook Lee, Andrew Lan, Bhiksha Raj, Rita Singh
- Amulet: Putting Complex Multi-Turn Conversations on the Stand with LLM Juries
Sahana Ramnath, Anurag Mudgil, Brihi Joshi, Skyler Hallinan, Xiang Ren
- Exploring Chain-of-Thought Reasoning for Steerable Pluralistic Alignment
Yunfan Zhang, Kathleen McKeown, Smaranda Muresan
- CMedCalc-Bench: A Fine-Grained Benchmark for Chinese Medical Calculations in LLM
Yunyan Zhang, Zhihong Zhu, Xian Wu
- Evaluating Robustness of Large Audio Language Models to Audio Injection: An Empirical Study
Guanyu Hou, Jiaming He, Yinhang Zhou, Ji Guo, Yitong Qiao, Rui Zhang, Wenbo Jiang
- How Far Can LLMs Improve from Experience? Measuring Test-Time Learning Ability in LLMs with Human Comparison
Jiayin Wang, Zhiqiang Guo, Weizhi Ma, Min Zhang
- Subtle Risks, Critical Failures: A Framework for Diagnosing Physical Safety of LLMs for Embodied Decision Making
Yejin Son, Minseo Kim, Sungwoong Kim, Seungju Han, Jian Kim, Dongju Jang, Youngjae Yu, Chan Young Park
- SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation
Aurick Qiao, Zhewei Yao, Samyam Rajbhandari, Yuxiong He
- Co-Eval: Augmenting LLM-based Evaluation with Machine Metrics
Ling-I Wu, Weijie Wu, Minyu Chen, Jianxin Xue, Guoqiang Li
- Sali4Vid: Saliency-Aware Video Reweighting and Adaptive Caption Retrieval for Dense Video Captioning
MinJu Jeon, Si-Woo Kim, Ye-Chan Kim, HyunGee Kim, Dong-Jin Kim
- Semantic Networks Extracted from Students’ Think-Aloud Data are Correlated with Students’ Learning Performance
Pingjing Yang, Sullam Jeoung, Jennifer Cromley, Jana Diesner
- Less is More: The Effectiveness of Compact Typological Language Representations
York Hay Ng, Phuong Hanh Hoang, En-Shiun Annie Lee
- Sparse Activation Editing for Reliable Instruction Following in Narratives
Runcong Zhao, Chengyu Cao, Qinglin Zhu, Xiucheng Ly, Shun Shao, Lin Gui, Ruifeng Xu, Yulan He
- Inceptive Transformers: Enhancing Contextual Representations through Multi-Scale Feature Learning Across Domains and Languages
Asif Shahriar, Rifat Shahriyar, M Saifur Rahman
- Causal Tree Extraction from Medical Case Reports: A Novel Task for Experts-like Text Comprehension
Sakiko Yahata, Zhen Wan, Fei Cheng, Sadao Kurohashi, Hisahiko Sato, Ryozo Nagai
- OWL: Probing Cross-Lingual Recall of Memorized Texts via World Literature
Alisha Srivastava, Emir Kaan Korukluoglu, Minh Nhat Le, Duyen Tran, Chau Minh Pham, Marzena Karpinska, Mohit Iyyer
- Enhanced Noun-Noun Compound Interpretation through Textual Enrichment
Bingyang Ye, Jingxuan Tu, James Pustejovsky
- ICL CIPHERS: Quantifying “Learning” in In-Context Learning via Substitution Ciphers
Zhouxiang Fang, Aayush Mishra, Muhan Gao, Anqi Liu, Daniel Khashabi
- Corrupted but Not Broken: Understanding and Mitigating the Negative Impacts of Corrupted Data in Visual Instruction Tuning
Yunhao Gou, Hansi Yang, Zhili Liu, Kai Chen, Yihan Zeng, Lanqing Hong, Zhenguo Li, Qun Liu, Bo Han, James Kwok, Yu Zhang
- Memory OS of AI Agent
Jiazheng Kang, Mingming Ji, Zhe Zhao, Ting Bai
- Rule Discovery for Natural Language Inference Data Generation Using Out-of-Distribution Detection
Juyoung Han, Hyunsun Hwang, Changki Lee
- Jigsaw-Puzzles: From Seeing to Understanding to Reasoning in Vision-Language Models
Zesen Lyu, Dandan Zhang, Wei Ye, Fangdi Li, Zhihang Jiang, Yao Yang
- Definition Generation for Word Meaning Modeling: Monolingual, Multilingual, and Cross-Lingual Perspectives
Francesco Periti, Roksana Goworek, Haim Dubossarsky, Nina Tahmasebi
- Language Model Based Text-to-Audio Generation: Anti-Causally Aligned Collaborative Residual Transformers
Juncheng Wang, Chao Xu, Cheng Yu, Zhe Hu, Haoyu Xie, Guoqi Yu, Lei Shang, Shujun Wang
- HELENE: Hessian Layer-wise Clipping and Gradient Annealing for Accelerating Fine-tuning LLM with Zeroth-order Optimization
Huaqin Zhao, Jiaxi Li, Yi Pan, Shizhe Liang, Xiaofeng Yang, Fei Dou, Tianming Liu, Jin Lu
- Zero-shot Multimodal Document Retrieval via Cross-modal Question Generation
Yejin Choi, Jaewoo Park, Janghan Yoon, Saejin Kim, Jaehyun Jeon, Youngjae Yu
- From Parameters to Performance: A Data-Driven Study on LLM Structure and Development
Suqing Wang, Zuchao Li, Shi Luohe, Bo Du, Hai Zhao, Yun Li, Qianren Wang
- Logical Reasoning with Outcome Reward Models for Test-Time Scaling
Ramya Keerthy Thatikonda, Wray Buntine, Ehsan Shareghi
- Speculating LLMs’ Chinese Training Data Pollution from Their Tokens
Qingjie Zhang, Di Wang, Haoting Qian, Liu Yan, Tianwei Zhang, Ke Xu, Qi Li, Minlie Huang, Hewu Li, Han Qiu
- NovelHopQA: Diagnosing Multi-Hop Reasoning Failures in Long Narrative Contexts
Abhay Gupta, Kevin Zhu, Vasu Sharma, Sean O’Brien, Michael Lu
- Weights-Rotated Preference Optimization for Large Language Models
Chenxu Yang, Ruipeng Jia, Mingyu Zheng, Naibin Gu, Zheng Lin, Siyuan Chen, Weichong Yin, Hua Wu, Weiping Wang
- The Stepwise Deception: Simulating the Evolution from True News to Fake News with LLM Agents
Yuhan Liu, Zirui Song, Juntian Zhang, Xiaoqing Zhang, Xiuying Chen, Rui Yan
- How to inject knowledge efficiently? Knowledge Infusion Scaling Law for Pre-training Large Language Models
Kangtao Lv, Haibin Chen, Yujin Yuan, Langming Liu, Shilei Liu, Yongwei Wang, Wenbo Su, Bo Zheng
- SMEC: Rethinking Matryoshka Representation Learning for Retrieval Embedding Compression
Biao Zhang, Lixin Chen, Tong Liu, Bo Zheng
- Reverse Prompt Engineering: A Zero-Shot, Genetic Algorithm Approach to Language Model Inversion
Hanqing Li, Diego Klabjan
- DiMo-GUI: Advancing Test-time Scaling in GUI Grounding via Modality-Aware Visual Reasoning
Hang Wu, Hongkai Chen, Yujun Cai, Chang Liu, Qingwen Ye, Ming-Hsuan Yang, Yiwei Wang
- SocioBench: Modeling Human Behavior in Sociological Surveys with Large Language Models
Jia Wang, Ziyu Zhao, Tingjuntao Ni, Zhongyu Wei
- Financial Risk Relation Identification through Dual-view Adaptation
Wei-Ning Chiu, Yu-Hsiang Wang, Andy Hsiao, Yu-Shiang Huang, Chuan-Ju Wang
- CopySpec: Accelerating LLMs with Speculative Copy-and-Paste
Razvan-Gabriel Dumitru, Minglai Yang, Vikas Yadav, Mihai Surdeanu
- GRASP: Replace Redundant Layers with Adaptive Singular Parameters for Efficient Model Compression
Kainan Liu, Yong Zhang, Ning Cheng, Zhitao Li, Shaojun Wang, Jing Xiao
- GraphAgent: Agentic Graph Language Assistant
Yuhao Yang, Jiabin Tang, Lianghao Xia, Xingchen Zou, Yuxuan Liang, Chao Huang
- DDO: Dual-Decision Optimization for LLM-Based Medical Consultation via Multi-Agent Collaboration
Zhihao Jia, Mingyi Jia, Junwen Duan, Jianxin Wang
- FedMABench: Benchmarking Mobile GUI Agents on Decentralized Heterogeneous User Data
WenHao Wang, Zijie Yu, Rui Ye, Jianqing Zhang, Guangyi Liu, Liang Liu, Siheng Chen, Yanfeng Wang
- VLA-Mark: A cross modal watermark for large vision-language alignment models
Shuliang Liu, Zheng Qi, Jesse Jiaxi Xu, Yibo Yan, He Geng, Junyan Zhang, Aiwei Liu, Peijie Jiang, Jia Liu, Yik-Cheung Tam, Xuming Hu
- Sentence Smith: Controllable Edits for Evaluating Text Embeddings
Hongji Li, Andrianos Michail, Reto Gubelmann, Simon Clematide, Juri Opitz
- ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning
Yu Sun, Xingyu Qian, Weiwen Xu, Hao Zhang, Chenghao Xiao, Long Li, Yu Rong, Wenbing Huang, Qifeng Bai, Tingyang Xu
- Decoding Dense Embeddings: Sparse Autoencoders for Interpreting and Discretizing Dense Retrieval
Seongwan Park, Taeklim Kim, Youngjoong Ko
- UICOMPASS: UI Map Guided Mobile Task Automation via Adaptive Action Generation
Yuanzhang Lin, Zhe Zhang, He Rui, Qingao Dong, Mingyi Zhou, Jing Zhang, Xiang Gao, Hailong Sun
- Leaky Thoughts: Large Reasoning Models Are Not Private Thinkers
Tommaso Green, Martin Gubri, Haritz Puerto, Sangdoo Yun, Seong Joon Oh
- Model Unlearning via Sparse Autoencoder Subspace Guided Projections
Xu Wang, Zihao Li, Benyou Wang, Yan Hu, Difan Zou
- ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning
Changtai Zhu, Siyin Wang, Ruijun Feng, Kai Song, Xipeng Qiu
- How to Make Large Language Models Generate 100% Valid Molecules?
Wen Tao, Jing Tang, Alvin Chan, Bryan Hooi, Baolong Bi, Nanyun Peng, Yuansheng Liu, Yiwei Wang
- Exploring Quality and Diversity in Synthetic Data Generation for Argument Mining
Jianzhu Bao, Yuqi Huang, Yang Sun, Wenya Wang, Yice Zhang, Bojun Jin, Ruifeng Xu
- Dynamic Jointly Batch Selection for Data Efficient Machine Translation Fine-Tuning
Mohammad Amin Ghanizadeh, Mohammad Javad Dousti
- 3MDBench: Medical Multimodal Multi-agent Dialogue Benchmark
Ivan Sviridov, Amina Miftahova, Tereshchenko Artemiy Vladimirovich, Galina Zubkova, Pavel Blinov, Andrey Savchenko
- OpenTuringBench: An Open-Model-based Benchmark and Framework for Machine-Generated Text Detection and Attribution
Lucio La Cava, Andrea Tagarelli
- CRITICTOOL: Evaluating Self-Critique Capabilities of Large Language Models in Tool-Calling Error Scenarios
Shiting Huang, Zhen Fang, Zehui Chen, Siyu Yuan, Junjie Ye, Yu Zeng, Lin Chen, Qi Mao, Feng Zhao
- Pre-trained Language Models Learn Remarkably Accurate Representations of Numbers
Marek Kadlčík, Michal Štefánik, Timothee Mickus, Josef Kuchař, Michal Spiegel
- Enhancing Large Vision-Language Models with Ultra-Detailed Image Caption Generation
Yu Zeng, Yukun Qi, Yiming Zhao, Xikun Bao, Lin Chen, Zehui Chen, Shiting Huang, Jie Zhao, Feng Zhao
- Translate Smart, not Hard: Cascaded Translation Systems with Quality-Aware Deferral
António Farinhas, Nuno M Guerreiro, Sweta Agrawal, Ricardo Rei, Andre Martins
- iVISPAR — An Interactive Visual-Spatial Reasoning Benchmark for VLMs
Julius Mayer, Mohamad Ballout, Serwan Jassim, Farbod Nosrat Nezami, Elia Bruni
- Are LLMs Better than Reported? Detecting Label Errors and Mitigating Their Effect on Model Performance
Omer Nahum, Nitay Calderon, Orgad Keller, Idan Szpektor, Roi Reichart
- Detecting Legal Citations in United Kingdom Court Judgments
Holli Sargeant, Andreas Östling, Måns Magnusson
- Large Language Models Badly Generalize across Option Length, Problem Types, and Irrelevant Noun Replacements
Guangxiang Zhao, Saier Hu, Xiaoqi Jian, Wu Jinzhu, Yuhan Wu, Lin Sun, Xiangzheng Zhang
- Studying the Role of Input-Neighbor Overlap in Retrieval-Augmented Language Models Training Efficiency
Ehsan Doostmohammadi, Marco Kuhlmann
- Principled Personas: Defining and Measuring the Intended Effects of Persona Prompting on Task Performance
Pedro Henrique Luz de Araujo, Paul Röttger, Dirk Hovy, Benjamin Roth
- HydraOpt: Navigating the Efficiency-Performance Trade-off of Adapter Merging
Taha Ceritli, Ondrej Bohdal, Mete Ozay, Jijoong Moon, Kyenghun Lee, Hyeonmok Ko, Umberto Michieli
- Parrot: A Training Pipeline Enhances Both Program CoT and Natural Language CoT for Reasoning
Senjie Jin, Lu Chen, Zhiheng Xi, Yuhui Wang, Sirui Song, Yuhao Zhou, Xinbo Zhang, Peng Sun, Hong Lu, Tao Gui, Qi Zhang, Xuanjing Huang
- Spec-VLA: Speculative Decoding for Vision-Language-Action Models with Relaxed Acceptance
Songsheng Wang, Rucheng Yu, Zhihang Yuan, Chao Yu, Feng Gao, Yu Wang, Derek F. Wong
- Leveraging Text-to-Text Transformers as Classifier Chain for Few-Shot Multi-Label Classification
Quang Anh Nguyen, Nadi Tomeh, Mustapha Lebbah, Thierry Charnois, Hanane Azzag
- M-Wanda: Improving One-Shot Pruning for Multilingual LLMs
Rochelle Choenni, Ivan Titov
- Beyond Hate Speech: NLP’s Challenges and Opportunities in Uncovering Dehumanizing Language
Hamidreza Saffari, Mohammadamin Shafiei, Hezhao Zhang, Lasana T. Harris, Nafise Sadat Moosavi
- Conflict-Aware Soft Prompting for Retrieval-Augmented Generation
Eunseong Choi, June Park, Hyeri Lee, Jongwuk Lee
- R-CHAR: A Metacognition-Driven Framework for Role-Playing in Large Language Models
Haiming Qin, Jiwei Zhang, Wei Zhang, KeZhong Lu, Mingyang Zhou, Hao Liao, Rui Mao
- Annotating Training Data for Conditional Semantic Textual Similarity Measurement using Large Language Models
Gaifan Zhang, Yi Zhou, Danushka Bollegala
- When Words Smile: Generating Diverse Emotional Facial Expressions from Text
Haidong Xu, Meishan Zhang, Hao Ju, Zhedong Zheng, Erik Cambria, Min Zhang, Hao Fei
- Improving Online Job Advertisement Analysis via Compositional Entity Extraction
Kai Krüger, Stefan Winnige, Alan Akbik, Johanna Binnewitt, Kathrin Ehmann
- Correlation-Aware Example Selection for In-Context Learning with Nonsymmetric Determinantal Point Processes
Qiunan Du, Zhiliang Tian, Zhen Huang, Kailun Bian, Tianlun Liu, Zhaoning Zhang, Xinwang Liu, Feng Liu, Dongsheng Li
- Leveraging Cognitive Complexity of Texts for Contextualization in Dense Retrieval
Effrosyni Sokli, Georgios Peikos, Pranav Kasela, Gabriella Pasi
- Beyond Online Sampling: Bridging Offline-to-Online Alignment via Dynamic Data Transformation for LLMs
Zhang Zhang, Guhao Feng, Jian Guan, Di He, Wei Wu
- CAVE: Detecting and Explaining Commonsense Anomalies in Visual Environments
Rishika Bhagwatkar, Syrielle Montariol, Angelika Romanou, Beatriz Borges, Irina Rish, Antoine Bosselut
- Enhancing LLM Language Adaption through Cross-lingual In-Context Pre-training
Linjuan Wu, Hao-Ran Wei, Huan Lin, Tianhao Li, Baosong Yang, Fei Huang, Weiming Lu
- SemVink: Advancing VLMs’ Semantic Understanding of Optical Illusions via Visual Global Thinking
Sifan Li, Yujun Cai, Yiwei Wang
- Order Doesn’t Matter, But Reasoning Does: Training LLMs with Order-Centric Augmentation
Qianxi He, Qianyu He, Jiaqing Liang, Weikang Zhou, Zeye Sun, Fei Yu, Yanghua Xiao
- Type-Less yet Type-Aware Inductive Link Prediction with Pretrained Language Models
Alessandro De Bellis, Salvatore Bufi, Giovanni Servedio, Vito Walter Anelli, Tommaso Di Noia, Eugenio Di Sciascio
- Extracting Linguistic Information from Large Language Models: Syntactic Relations and Derivational Knowledge
Tsedeniya Kinfe Temesgen, Marion Di Marco, Alexander Fraser
- Beyond Correctness: Confidence-Aware Reward Modeling for Enhancing Large Language Model Reasoning
Qianxi He, Qingyu Ren, Shanzhe Lei, Xuhong Wang, Yingchun Wang
- TrojanStego: Your Language Model Can Secretly Be A Steganographic Privacy Leaking Agent
Dominik Meier, Jan Philip Wahle, Paul Röttger, Terry Ruas, Bela Gipp
- Frequency & Compositionality in Emergent Communication
Jean-Baptiste Sevestre, Emmanuel Dupoux
- Summarizing Speech: A Comprehensive Survey
Fabian Retkowski
- CogDual: Enhancing Dual Cognition of LLMs via Reinforcement Learning with Implicit Rule-Based Rewards
Cheng Liu, Yifei Lu, Fanghua Ye, Jian Li, Xingyu Chen, Feiliang Ren, Zhaopeng Tu, Xiaolong Li
- Assay2Mol: Large Language Model-based Drug Design Using BioAssay Context
Yifan Deng, Spencer S Ericksen, Anthony Gitter
- Frame First, Then Extract: A Frame-Semantic Reasoning Pipeline for Zero-Shot Relation Triplet Extraction
Zehan Li, Fu Zhang, Wenqing Zhang, Jiawei Li, Zhou Li, Jingwei Cheng, Tianyue Peng
- MrGuard: A Multilingual Reasoning Guardrail for Universal LLM Safety
Yahan Yang, Soham Dan, Shuo Li, Dan Roth, Insup Lee
- TALON: A Multi-Agent Framework for Long-Table Exploration and Question Answering
Ruochun Jin, Xiyue Wang, Dong Wang, Haoqi Zheng, Yunpeng Qi, Silin Yang, Meng Zhang
- You Are What You Train: Effects of Data Composition on Training Context-aware Machine Translation Models
Paweł Mąka, Yusuf Can Semerci, Jan Scholtes, Gerasimos Spanakis
- Improving Neutral Point-of-View Generation with Data- and Parameter-Efficient RL
Jessica Hoffmann, Christiane Ahlheim, Zac Yu, Aria Walfrand, Jarvis Jin, Marie Tano, Ahmad Beirami, Erin MacMurray van Liemt, Nithum Thain, Hakim Sidahmed, Lucas Dixon
- Randomized Smoothing Meets Vision-Language Models
Emmanouil Seferis, Changshun Wu, Stefanos Kollias, Saddek Bensalem, Chih-Hong Cheng
- PIIvot: A Lightweight NLP Anonymization Framework for Question-Anchored Tutoring Dialogues
Matthew Zent, Digory Smith, Simon Woodhead
- Trustworthy Medical Question Answering: An Evaluation-Centric Survey
Yinuo Wang, Baiyang Wang, Robert E. Mercer, Frank Rudzicz, Sudipta Singha Roy, Pengjie Ren, Zhumin Chen, Xindi Wang
- Unpacking Let Alone: Human-Scale Models Generalize to a Rare Construction in Form but not Meaning
Wesley Scivetti, Tatsuya Aoyama, Ethan Wilcox, Nathan Schneider
- BOUQuET: dataset, Benchmark and Open initiative for Universal Quality Evaluation in Translation
Pierre Andrews, Mikel Artetxe, Mariano Coria Meglioli, Marta R. Costa-jussà, Joe Chuang, David Dale, Mark Duppenthaler, Nathanial Paul Ekberg, Cynthia Gao, Daniel Edward Licht, Jean Maillard, Alexandre Mourachko, Christophe Ropers, Safiyyah Saleem, Eduardo Sánchez, Ioannis Tsiamas, Arina Turkatenko, Albert Ventayol-Boada, Shireen Yates
- HealthCards: Exploring Text-to-Image Generation as Visual Aids for Healthcare Knowledge Democratizing and Education
Qian Wu, Zheyao Gao, Longfei Gou, Yifan Hou, Qi Dou
- When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs
Ammar Khairi, Daniel D’souza, Ye Shen, Julia Kreutzer, Sara Hooker
- Creativity in LLM-based Multi-Agent Systems: A Survey
Yi-Cheng Lin, Kang-Chieh Chen, Zhe-Yan Li, Tzu-Heng Wu, Tzu-Hsuan Wu, Kuan-Yu Chen, Hung-yi Lee, Yun-Nung Chen
- Context and POS in Action: A Comparative Study of Chinese Homonym Disambiguation in Human and Language Models
Xie Chenwei, Matthew King-Hang Ma, Wenbo Wang, William Shiyuan Wang
- Attacking Misinformation Detection Using Adversarial Examples Generated by Language Models
Piotr Przybyła, Euan McGill, Horacio Saggion
- Leveraging Loanword Constraints for Improving Machine Translation in a Low-Resource Multilingual Context
Felermino D. M. A. Ali, Henrique Lopes Cardoso, Rui Sousa-Silva
- Linguistic Neuron Overlap Patterns to Facilitate Cross-lingual Transfer on Low-resource Languages
Yuemei Xu, Kexin Xu, Jian Zhou, Ling Hu, Lin Gui
- Scaling Low-Resource MT via Synthetic Data Generation with LLMs
Ona De Gibert Bonet, Joseph Attieh, Teemu Vahtola, Mikko Aulamo, Zihao Li, Raúl Vázquez, Tiancheng Hu, Jörg Tiedemann
- Tailoring Table Retrieval from a Field-aware Hybrid Matching Perspective
Da Li, Keping Bi, Jiafeng Guo, Xueqi Cheng
- Randomly Removing 50% of Dimensions in Text Embeddings has Minimal Impact on Retrieval and Classification Tasks
Sotaro Takeshita, Yurina Takeshita, Daniel Ruffinelli, Simone Paolo Ponzetto
- Morables: A Benchmark for Assessing Abstract Moral Reasoning in LLMs with Fables
Matteo Marcuzzo, Alessandro Zangari, Andrea Albarelli, Jose Camacho-Collados, Mohammad Taher Pilehvar
- MessIRve: A Large-Scale Spanish Information Retrieval Dataset
Francisco Valentini, Viviana Cotik, Damián Furman, Ivan Bercovich, Edgar Altszyler, Juan Manuel Pérez
- AFRIDOC-MT: Document-level MT Corpus for African Languages
Jesujoba Oluwadara Alabi, Israel Abebe Azime, Miaoran Zhang, Cristina España-Bonet, Rachel Bawden, Dawei Zhu, David Ifeoluwa Adelani, Clement Oyeleke Odoje, Idris Akinade, Iffat Maab, Davis David, Shamsuddeen Hassan Muhammad, Neo Putini, David O. Ademuyiwa, Andrew Caines, Dietrich Klakow
- Charting the Landscape of African NLP: Mapping Progress and Shaping the Road Ahead
Jesujoba Oluwadara Alabi, Michael A. Hedderich, David Ifeoluwa Adelani, Dietrich Klakow
- GLIMPSE: Do Large Vision-Language Models Truly Think With Videos or Just Glimpse at Them?
Yiyang Zhou, Linjie Li, Shi Qiu, Zhengyuan Yang, Yuyang Zhao, Siwei Han, Yangfan He, Kangqi Li, Haonian Ji, Zihao Zhao, Haibo Tong, Lijuan Wang, Huaxiu Yao
- Social Bias in Multilingual Language Models: A Survey
Lance Calvin Lim Gamboa, Yue Feng, Mark G. Lee
- BYOKG-RAG: Multi-Strategy Graph Retrieval for Knowledge Graph Question Answering
Costas Mavromatis, Soji Adeshina, Vassilis N. Ioannidis, Zhen Han, Qi Zhu, Ian Robinson, Bryan Thompson, Huzefa Rangwala, George Karypis
- Synth-SBDH: A Synthetic Dataset of Social and Behavioral Determinants of Health for Clinical Text
Avijit Mitra, Zhichao Yang, Emily Druhl, Raelene Goodwin, Hong Yu
- Pun Unintended: LLMs and the Illusion of Humor Understanding
Alessandro Zangari, Matteo Marcuzzo, Andrea Albarelli, Mohammad Taher Pilehvar, Jose Camacho-Collados
- RACCooN: Versatile Instructional Video Editing with Auto-Generated Narratives
Jaehong Yoon, Shoubin Yu, Mohit Bansal
- Pre-trained Models Perform the Best When Token Distributions Follow Zipf’s Law
Yanjin He, Qingkai Zeng, Meng Jiang
- Do RAG Systems Really Suffer From Positional Bias?
Florin Cuconasu, Simone Filice, Guy Horowitz, Yoelle Maarek, Fabrizio Silvestri
- Aspect-Oriented Summarization for Psychiatric Short-Term Readmission Prediction
WonJin Yoon, Boyu Ren, Spencer Thomas, Chanhwi Kim, Guergana K Savova, Mei-Hua Hall, Timothy A. Miller
- Adapting Bias Evaluation to Domain Contexts using Generative Models
Tamara Quiroga, Felipe Bravo-Marquez, Valentin Barriere
- Emergent morpho-phonological representations in self-supervised speech models
Jon Gauthier, Canaan Breiss, Matthew K Leonard, Edward F. Chang
- Multilingual Language Model Pretraining using Machine-translated Data
Jiayi Wang, Yao Lu, Maurice Weber, Max Ryabinin, David Ifeoluwa Adelani, Yihong Chen, Raphael Tang, Pontus Stenetorp
- IntentionFrame: A Semi-Structured, Multi-Aspect Framework for Fine-Grained Conversational Intention Understanding
Jinggui Liang, Dung Vo, Lizi Liao
- Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning
Ziyang Wang, Jaehong Yoon, Shoubin Yu, Md Mohaiminul Islam, Gedas Bertasius, Mohit Bansal
- Efficient Compositional Multi-tasking for On-device Large Language Models
Ondrej Bohdal, Mete Ozay, Jijoong Moon, Kyenghun Lee, Hyeonmok Ko, Umberto Michieli
- Improving Large Language Model Safety with Contrastive Representation Learning
Samuel Simko, Mrinmaya Sachan, Bernhard Schölkopf, Zhijing Jin
- Leveraging What’s Overfixed: Post-Correction via LLM Grammatical Error Overcorrection
Taehee Park, Heejin Do, Gary Lee
- Scaling Up Temporal Domain Generalization via Temporal Experts Averaging
Aoming Liu, Kevin Miller, Venkatesh Saligrama, Kate Saenko, Boqing Gong, Ser-Nam Lim, Bryan A. Plummer
- LinguaLens: Towards Interpreting Linguistic Mechanisms of Large Language Models via Sparse Auto-Encoder
Yi Jing, Zijun Yao, Hongzhu Guo, Lingxu Ran, Xiaozhi Wang, Lei Hou, Juanzi Li
- The Strawberry Problem: Emergence of Character-level Understanding in Tokenized Language Models
Adrian Cosma, Stefan Ruseti, Emilian Radoi, Mihai Dascalu
- Improving the Quality of Web-mined Parallel Corpora of Low-Resource Languages using Debiasing Heuristics
Surangika Ranathunga, Aloka Fernando, Menan Velayuthan, Charitha Rathnayaka, Nisansa de Silva
- Weaver: Interweaving SQL and LLM for Table Reasoning
Rohit Khoja, Devanshu Gupta, Yanjie Fu, Dan Roth, Vivek Gupta
- ECO Decoding: Entropy-Based Control for Controllability and Fluency in Controllable Dialogue Generation
Seungmin Shin, Dooyoung Kim, Youngjoong Ko
- Investigating the interaction of linguistic and mathematical reasoning in language models using multilingual number puzzles
Antara Raaghavi Bhattacharya, Isabel Papadimitriou, Kathryn Davidson, David Alvarez-Melis
- Unsupervised Concept Vector Extraction for Bias Control in LLMs
Hannah Cyberey, Yangfeng Ji, David Evans
- Seeing the Same Story Differently: Framing‑Divergent Event Coreference for Computational Framing Analysis
Jin Zhao, Xinrui Hu, Nianwen Xue
- LLMs are Better Than You Think: Label-Guided In-Context Learning for Named Entity Recognition
Fan Bai, Hamid Hassanzadeh, Ardavan Saeedi, Mark Dredze
- COUNTDOWN: Contextually Sparse Activation Filtering Out Unnecessary Weights in Down Projection
Jaewon Cheon, Pilsung Kang
- SimpleDoc: Multi‑Modal Document Understanding with Dual‑Cue Page Retrieval and Iterative Refinement
Chelsi Jain, Yiran Wu, Yifan Zeng, Jiale Liu, Shengyu Dai, Zhenwen Shao, Qingyun Wu, Huazheng Wang
- VLP: Vision-Language Preference Learning for Embodied Manipulation
Runze Liu, Chenjia Bai, Jiafei Lyu, Shengjie Sun, Yali Du, Xiu Li
- QG-CoC: Question-Guided Chain-of-Captions for Large Multimodal Models
Kuei-Chun Kao, Hsu Tzu-Yin, Yunqi Hong, Ruochen Wang, Cho-Jui Hsieh
- EGOILLUSION: Benchmarking Hallucinations in Egocentric Video Understanding
Ashish Seth, Utkarsh Tyagi, Ramaneswaran Selvakumar, Nishit Anand, Sonal Kumar, Sreyan Ghosh, Ramani Duraiswami, Chirag Agarwal, Dinesh Manocha
- MULTIVOX: A Benchmark for Evaluating Voice Assistants for Multimodal Interactions
Ramaneswaran Selvakumar, Ashish Seth, Nishit Anand, Utkarsh Tyagi, Sonal Kumar, Sreyan Ghosh, Dinesh Manocha
- Do All Autoregressive Transformers Remember Facts the Same Way? A Cross-Architecture Analysis of Recall Mechanisms
Minyeong Choe, Haehyun Cho, Changho Seo, Hyunil Kim
- Probing Narrative Morals: A New Character-Focused MFT Framework for Use with Large Language Models
Luca Mitran, Sophie Wu, Andrew Piper
- Probing and Boosting Large Language Models Capabilities via Attention Heads
Dezhi Zhao, Xiaocheng Feng, Xin Liu, Hui Wang, Bing Qin
- A Survey of Link Prediction in N-ary Knowledge Graphs
Jiyao Wei, Saiping Guan, Da Li, Zhongni Hou, Miao Su, Yucan Guo, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng
- Multi-Frequency Contrastive Decoding: Alleviating Hallucinations for Large Vision-Language Models
Bingqian Liu, Fu Zhang, Guoqing Chen, Jingwei Cheng
- ORPP: Self-Optimizing Role-playing Prompts to Enhance Language Model Capabilities
Yifan Duan, Yihong Tang, Kehai Chen, Liqiang Nie, Min Zhang
- BrailleLLM: Braille Instruction Tuning with Large Language Models for Braille Domain Tasks
Tianyuan Huang, Zepeng Zhu, Hangdi Xing, Zirui Shao, Zhi Yu, Chaoxiong Yang, Jiaxian He, Xiaozhong Liu, Jiajun Bu
- MAviS: A Multimodal Conversational Assistant For Avian Species
Yevheniia Kryklyvets, Mohammed Irfan Kurpath, Sahal Shaji Mullappilly, Jinxing Zhou, Fahad Shahbaz Khan, Rao Muhammad Anwer, Salman Khan, Hisham Cholakkal
- Refining Text Generation for Realistic Conversational Recommendation via Direct Preference Optimization
Manato Tajiri, Michimasa Inaba
- Large Language Models Threaten Language’s Epistemic and Communicative Foundations
Shashank Srivastava
- Detecting Knowledge Boundary of Vision Large Language Models by Sampling-Based Inference
Zhuo Chen, Xinyu Wang, Yong Jiang, Zhen Zhang, Xinyu Geng, Pengjun Xie, Fei Huang, Kewei Tu
- Multi-view-guided Passage Reranking with Large Language Models
Jeongwoo Na, Jun Kwon, Eunseong Choi, Jongwuk Lee
- Disentangling Subjectivity and Uncertainty for Hate Speech Annotation and Modeling using Gaze
Özge Alacam, Sanne Hoeken, Andreas Säuberli, Hannes Gröner, Diego Frassinelli, Sina Zarrieß, Barbara Plank
- VoiceBBQ: Investigating Effect of Content and Acoustics in Social Bias of Spoken Language Model
Junhyuk Choi, Ro-hoon Oh, Jihwan Seol, Bugeun Kim
- Explaining Differences Between Model Pairs in Natural Language through Sample Learning
Advaith Malladi, Rakesh R Menon, Yuvraj Jain, Shashank Srivastava
- Compound AI Systems Optimization: A Survey of Methods, Challenges, and Future Directions
Yu-Ang Lee, Guan-Ting Yi, Mei-Yi Liu, Jui-Chao Lu, Guan-Bo Yang, Yun-Nung Chen
- A Multi-Level Benchmark for Causal Language Understanding in Social Media Discourse
Xiaohan Ding, Kaike Ping, Buse Çarık, Eugenia Rho
- Causal Representation Learning from Multimodal Clinical Records under Non-Random Modality Missingness
Zihan Liang, Ziwen Pan, Ruoxuan Xiong
- XLQA: A Benchmark for Locale-Aware Multilingual Open-Domain Question Answering
Keonwoo Roh, Yeong-Joon Ju, Seong-Whan Lee
- Transformer-Based Temporal Information Extraction and Application: A Review
Xin Su, Phillip Howard, Steven Bethard
- How to Protect Yourself from 5G Radiation? Investigating LLM Responses to Implicit Misinformation
Ruohao Guo, Wei Xu, Alan Ritter
- AmpleHate: Amplifying the Attention for Versatile Implicit Hate Detection
Yejin Lee, Joonghyuk Hahn, Hyeseon Ahn, Yo-Sub Han
- Can Large Language Models Act as Ensembler for Multi-GNNs?
Hanqi Duan, Yao Cheng, Jianxiang Yu, Yao Liu, Xiang Li
- Agent-to-Agent Theory of Mind: Testing Interlocutor Awareness among Large Language Models
Younwoo Choi, Changling Li, Yongjin Yang, Zhijing Jin
- From Charts to Fair Narratives: Uncovering and Mitigating Geo-Economic Biases in Chart-to-Text
Ridwan Mahbub, Mohammed Saidul Islam, Mir Tafseer Nayeem, Md Tahmid Rahman Laskar, Mizanur Rahman, Shafiq Joty, Enamul Hoque
- Real-time Ad Retrieval via LLM-generative Commercial Intention for Sponsored Search Advertising
Tongtong Liu, Zhaohui Wang, Meiyue Qin, Zenghui Lu, Xudong Chen, Yuekui Yang, Peng Shu
- Toward Efficient Sparse Autoencoder-Guided Steering for Improved In-Context Learning in Large Language Models
Ikhyun Cho, Julia Hockenmaier
- CLMTracing: Black-box User-level Watermarking for Code Language Model Tracing
Boyu Zhang, Ping He, Tianyu Du, Xuhong Zhang, Lei Yun, Kingsum Chow, Jianwei Yin
- The Good, the Bad and the Constructive: Automatically Measuring Peer Review’s Utility for Authors
Abdelrahman Sadallah, Tim Baumgärtner, Iryna Gurevych, Ted Briscoe
- Evolving Chinese Spelling Correction with Corrector-Verifier Collaboration
Linfeng Liu, Hongqiu Wu, Hai Zhao
- M2Edit: Locate and Edit Multi-Granularity Knowledge in Multimodal Large Language Model
Yang Zhou, Pengfei Cao, Yubo Chen, Qingbin Liu, Dianbo Sui, Xi Chen, Kang Liu, Jun Zhao
- Do LLMs Behave as Claimed? Investigating How LLMs Follow Their Own Claims using Counterfactual Questions
Haochen Shi, Shaobo Li, Guoqing Chao, Xiaoliang Shi, Wentao Chen, Zhenzhou Ji
- Multilingual vs Crosslingual Retrieval of Fact-Checked Claims: A Tale of Two Approaches
Alan Ramponi, Marco Rovera, Robert Moro, Sara Tonelli
- How Much Do LLMs Hallucinate across Languages? On Realistic Multilingual Estimation of LLM Hallucination
Saad Obaid Ul Islam, Anne Lauscher, Goran Glavaš
- LiTransProQA: An LLM-based Literary Translation Evaluation Metric with Professional Question Answering
Ran Zhang, Wei Zhao, Lieve Macken, Steffen Eger
- Improving Handshape Representations for Sign Language Processing: A Graph Neural Network Approach
Alessa Carbo, Eric Nalisnick
- Instructing Large Language Models for Low-Resource Languages: A Systematic Study for Basque
Oscar Sainz, Naiara Perez, Julen Etxaniz, Joseba Fernandez de Landa, Itziar Aldabe, Iker García-Ferrero, Aimar Zabala, Ekhi Azurmendi, German Rigau, Eneko Agirre, Mikel Artetxe, Aitor Soroa
- SOCIAL SCAFFOLDS: A Generalization Framework for Social Understanding Tasks
Ritam Dutt, Carolyn Rose, Maarten Sap
- Beyond A Single AI Cluster: A Survey of Decentralized LLM Training
Haotian Dong, Jingyan Jiang, Rongwei Lu, Jiajun Luo, Jiajun Song, Bowen Li, Ying Shen, Zhi Wang
- Can LLM Agents Maintain a Persona in Discourse?
Pranav Bhandari, Nicolas Fay, Michael J Wise, Amitava Datta, Stephanie Meek, Usman Naseem, Mehwish Nasim
- Iterative Multilingual Spectral Attribute Erasure
Shun Shao, Yftah Ziser, Zheng Zhao, Yifu Qiu, Shay B Cohen, Anna Korhonen
- TinySQL: A Progressive Text-to-SQL Dataset for Mechanistic Interpretability Research
Abir Harrasse, Philip Quirke, Clement Neo, Dhruv Nathawani, Luke Marks, Amir Abdullah
- SCRIBE: Structured Chain Reasoning for Interactive Behaviour Explanations using Tool Calling
Fares Fawzi, Vinitra Swamy, Dominik Glandorf, Tanya Nazaretsky, Tanja Käser
- Logit Space Constrained Fine-Tuning for Mitigating Hallucinations in LLM-Based Recommender Systems
Jianfeng Deng, Qingfeng Chen, Debo Cheng, Jiuyong Li, Lin Liu
- PACHAT: Persona-Aware Speech Assistant for Multi-party Dialogue
Dongjie Fu, Xize Cheng, Linjun Li, Xiaoda Yang, Lujia Yang, Tao Jin
- Reasoning-to-Defend: Safety-Aware Reasoning Can Defend Large Language Models from Jailbreaking
Junda Zhu, Lingyong Yan, Shuaiqiang Wang, Dawei Yin, Lei Sha
- Graph-Guided Textual Explanation Generation Framework
Shuzhou Yuan, Jingyi Sun, Ran Zhang, Michael Färber, Steffen Eger, Pepa Atanasova, Isabelle Augenstein
- The Validation Gap: A Mechanistic Analysis of How Language Models Compute Arithmetic but Fail to Validate It
Leonardo Bertolazzi, Philipp Mondorf, Barbara Plank, Raffaella Bernardi
- A Causal Lens for Evaluating Faithfulness Metrics
Kerem Zaman, Shashank Srivastava
- Sequential-NIAH: A Needle-In-A-Haystack Benchmark for Extracting Sequential Needles from Long Contexts
Yifei Yu, Qian-Wen Zhang, Lingfeng Qiao, Di Yin, Fang Li, Jie Wang, ChenZengXi, Suncong Zheng, Xiaolong Liang, Xing Sun
- FISTAPruner: Layer-wise Post-training Pruning for Large Language Models
Pengxiang Zhao, Hanyu Hu, Ping Li, Yi Zheng, Zhefeng Wang, Xiaoming Yuan
- Do LLMs Encode Frame Semantics? Evidence from Frame Identification
Jayanth Krishna Chundru, Rudrashis Poddar, Jie Cao, Tianyu Jiang
- StepER: Step-wise Knowledge Distillation for Enhancing Reasoning Ability in Multi-Step Retrieval-Augmented Language Models
Kyumin Lee, Minjin Jeon, Sanghwan Jang, Hwanjo Yu
- How Does DPO Reduce Toxicity? A Mechanistic Neuron-Level Analysis
Yushi Yang, Filip Sondej, Harry Mayne, Andrew Lee, Adam Mahdi
- It’s All About In-Context Learning! Teaching Extremely Low-Resource Languages to LLMs
Yue Li, Zhixue Zhao, Carolina Scarton
- Where to show Demos in Your Prompt: A Positional Bias of In-Context Learning
Kwesi Adu Cobbina, Tianyi Zhou
- Multilingual Pretraining for Pixel Language Models
Ilker Kesen, Jonas F. Lotz, Ingo Ziegler, Phillip Rust, Desmond Elliott
- MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs
Gabrielle Kaili-May Liu, Gal Yona, Avi Caciularu, Idan Szpektor, Tim G. J. Rudner, Arman Cohan
- Machine-generated text detection prevents language model collapse
George Drayson, Emine Yilmaz, Vasileios Lampos
- Data-Efficient Hate Speech Detection via Cross-Lingual Nearest Neighbor Retrieval with Limited Labeled Data
Faeze Ghorbanpour, Daryna Dementieva, Alexander Fraser
- V-VAE: A Variational Auto Encoding Framework Towards Fine-Grained Control over Human-Like Chat
Qi Lin, Weikai Xu, Lisi Chen, Bin Dai
- Mixture of Languages: Improved Multilingual Encoders Through Language Grouping
João Maria Janeiro, Belen Alastruey, Francisco Massa, Maha Elbayad, Benjamin Piwowarski, Patrick Gallinari, Loic Barrault
- Too Helpful, Too Harmless, Too Honest or Just Right?
Gautam Siddharth Kashyap, Mark Dras, Usman Naseem
- Cardiverse: Harnessing LLMs for Novel Card Game Prototyping
Danrui Li, Sen Zhang, Samuel S. Sohn, Kaidong Hu, Muhammad Usman, Mubbasir Kapadia
- Assessing effective de-escalation of crisis conversations using transformer-based models and trend statistics
Ignacio J. Tripodi, Greg Buda, Margaret Meagher, Elizabeth A. Olson
- Measuring and Mitigating Media Outlet Name Bias in Large Language Models
Seong-Jin Park, Kang-Min Kim
- The Good, the Bad, and the Debatable: A Survey on the Impacts of Data for In-Context Learning
Stephanie Schoch, Yangfeng Ji
- Where Confabulation Lives: Latent Feature Discovery in LLMs
Thibaud Ardoin, Yi Cai, Gerhard Wunder
- Analysing Chain of Thought Dynamics: Active Guidance or Unfaithful Post-hoc Rationalisation?
Samuel Lewis-Lim, Xingwei Tan, Zhixue Zhao, Nikolaos Aletras
- Playpen: An Environment for Exploring Learning From Dialogue Game Feedback
Nicola Horst, Davide Mazzaccara, Antonia Schmidt, Michael Sullivan, Filippo Momentè, Luca Franceschetti, Philipp Sadler, Sherzod Hakimov, Alberto Testoni, Raffaella Bernardi, Raquel Fernández, Alexander Koller, Oliver Lemon, David Schlangen, Mario Giulianelli, Alessandro Suglia
- GenLink: Generation-Driven Schema-Linking via Multi-Model Learning for Text-to-SQL
Zhifeng Hao, Junqi Huang, Shaobin Shi, Ruichu Cai, Boyan Xu
- TSVer: A Benchmark for Fact Verification Against Time-Series Evidence
Marek Strong, Andreas Vlachos
- Cross-MoE: An Efficient Temporal Prediction Framework Integrating Textual Modality
Ruizheng Huang, Zhicheng Zhang, Yong Wang
- Sparse Autoencoder Features for Classifications and Transferability
Jack Gallifant, Shan Chen, Kuleen Sasse, Hugo Aerts, Thomas Hartvigsen, Danielle Bitterman
- KGE Calibrator: An Efficient Probability Calibration Method of Knowledge Graph Embedding Models for Trustworthy Link Prediction
Yang Yang, Mohan Timilsina, Edward Curry
- LCES: Zero-shot Automated Essay Scoring via Pairwise Comparisons Using Large Language Models
Takumi Shibata, Yuichi Miyamura
- The Arabic Generality Score: Another Dimension of Modeling Arabic Dialectness
Sanad Sha’ban, Nizar Habash
- Lemmatization as a Classification Task: Results from Arabic across Multiple Genres
Mostafa Saeed, Nizar Habash
- A Comprehensive Framework to Operationalize Social Stereotypes for Responsible AI Evaluations
Aida Mostafazadeh Davani, Sunipa Dev, Héctor Pérez-Urbina, Vinodkumar Prabhakaran
- Correct-Detect: Balancing Performance and Ambiguity Through the Lens of Coreference Resolution in LLMs
Amber Shore, Russell Scheinberg, Ameeta Agrawal, So Young Lee
- GRAID: Synthetic Data Generation with Geometric Constraints and Multi-Agentic Reflection for Harmful Content Detection
Melissa Kazemi Rad, Alberto Purpura, Himanshu Kumar, Emily Chen, Mohammad Shahed Sorower
- LaMDAgent: An Autonomous Framework for Post-Training Pipeline Optimization via LLM Agents
Taro Yano
- Finetuning LLMs for Human Behavior Prediction in Social Science Experiments
Akaash Kolluri, Shengguang Wu, Joon Sung Park, Michael S. Bernstein
- How Private are Language Models in Abstractive Summarization?
Anthony Hughes, Ning Ma, Nikolaos Aletras
- Expectation Preference Optimization: Reliable Preference Estimation for Improving the Reasoning Capability of Large Language Models
Zelin Li, Dawei Song
- Split-Merge: Scalable and Memory-Efficient Merging of Expert LLMs
Sruthi Gorantla, Aditya Rawal, Devamanyu Hazarika, Kaixiang Lin, Mingyi Hong, Mahdi Namazifar
- Model Consistency as a Cheap yet Predictive Proxy for LLM Elo Scores
Ashwin Ramaswamy, Nestor Demeure, Ermal Rrapaj
- Plutus: Benchmarking Large Language Models in Low-Resource Greek Finance
Xueqing Peng, Triantafillos Papadopoulos, Efstathia Soufleri, Polydoros Giannouris, Ruoyu Xiang, Yan Wang, Lingfei Qian, Jimin Huang, Qianqian Xie, Sophia Ananiadou
- TaxoAlign: Scholarly Taxonomy Generation Using Language Models
Avishek Lahiri, Yufang Hou, Debarshi Kumar Sanyal
- DiNaM: Disinformation Narrative Mining with Large Language Models
Witold Sosnowski, Arkadiusz Modzelewski, Kinga Skorupska, Adam Wierzbicki
- VeriLocc: End-to-End Cross-Architecture Register Allocation via LLM
Lesheng Jin, Zhenyuan Ruan, Haohui Mai, Jingbo Shang
- MemeIntel: Explainable Detection of Propagandistic and Hateful Memes
Mohamed Bayan Kmainasi, Abul Hasnat, Md Arid Hasan, Ali Ezzat Shahroor, Firoj Alam
- FLUID QA: A Multilingual Benchmark for Figurative Language Usage in Dialogue across English, Chinese, and Korean
Seoyoon Park, Hyeji Choi, Minseon Kim, Subin An, Xiaonan Wang, Gyuri Choi, Hansaem Kim
- Structured Moral Reasoning in Language Models: A Value-Grounded Evaluation Framework
Mohna Chakraborty, Lu Wang, David Jurgens
- VerIF: Verification Engineering for Reinforcement Learning in Instruction Following
Hao Peng, Yunjia Qi, Xiaozhi Wang, Bin Xu, Lei Hou, Juanzi Li
- UNCLE: Benchmarking Uncertainty Expressions in Long-Form Generation
Ruihan Yang, Caiqi Zhang, Zhisong Zhang, Xinting Huang, Dong Yu, Nigel Collier, Deqing Yang
- Enhancing Study-Level Inference from Clinical Trial Papers via Reinforcement Learning-Based Numeric Reasoning
Massimiliano Pronesti, Michela Lorandi, Paul Flanagan, Oisín Redmond, Anya Belz, Yufang Hou
- Context-aware Biases for Length Extrapolation
Ali Veisi, Hamidreza Amirzadeh, Amir M. Mansourian
- AutoSDT: Scaling Data-Driven Discovery Tasks Toward Open Co-Scientists
Yifei Li, Hanane Nour Moussa, Ziru Chen, Shijie Chen, Botao Yu, Mingyi Xue, Benjamin Burns, Tzu-Yao Chiu, Vishal Dey, Zitong Lu, Chen Wei, Qianheng Zhang, Tianyu Zhang, Song Gao, Xuhui Huang, Xia Ning, Nesreen K. Ahmed, Ali Payani, Huan Sun
- Finding your MUSE: Mining Unexpected Solutions Engine
Nir Sweed, Hanit Hakim, Ben Wolfson, Hila Lifshitz, Dafna Shahaf
- Quantized but Deceptive? A Multi-Dimensional Truthfulness Evaluation of Quantized LLMs
Yao Fu, Xianxuan Long, Runchao Li, Haotian Yu, Mu Sheng, Xiaotian Han, Yu Yin, Pan Li
- Leveraging Knowledge Graph-Enhanced LLMs for Context-Aware Medical Consultation
Su-Hyeong Park, Ho-Beom Kim, Seong-Jin Park, Dinara Aliyeva, Kang-Min Kim
- Reflective Agreement: Combining Self-Mixture of Agents with a Sequence Tagger for Robust Event Extraction
Fatemeh Haji, Mazal Bethany, Cho-Yu Jason Chiang, Anthony Rios, Peyman Najafirad
- Simple Yet Effective: An Information-Theoretic Approach to Multi-LLM Uncertainty Quantification
Maya Kruse, Majid Afshar, Saksham Khatwani, Anoop Mayampurath, Guanhua Chen, Yanjun Gao
- Exploring morphology-aware tokenization: A case study on Spanish language modeling
Alba Táboas García, Piotr Przybyła, Leo Wanner
- Studying Rhetorically Ambiguous Questions
Oghenevovwe Ikumariegbe, Eduardo Blanco, Ellen Riloff
- Estimating LLM Consistency: A User Baseline vs Surrogate Metrics
Xiaoyuan Wu, Weiran Lin, Omer Akgul, Lujo Bauer
- Are Vision-Language Models Safe in the Wild? A Meme-Based Benchmark Study
DongGeon Lee, Joonwon Jang, Jihae Jeong, Hwanjo Yu
- Improving Rule-based Reasoning in LLMs using Neurosymbolic Representations
Varun Dhanraj, Chris Eliasmith
- Can LLMs Extract Frame-Semantic Arguments?
Jacob Devasier, Rishabh Mediratta, Chengkai Li
- Accelerated Test-Time Scaling with Model-Free Speculative Sampling
Woomin Song, Saket Dingliwal, Sai Muralidhar Jayanthi, Bhavana Ganesh, Jinwoo Shin, Aram Galstyan, Sravan Babu Bodapati
- Enhancing RLHF with Human Gaze Modeling
Karim Galliamov, Ivan Titov, Ilya Pershin
- Mapping semantic networks to Dutch word embeddings as a diagnostic tool for cognitive decline
Maithe van Noort, Michal Korenar, Jelke Bloem
- CausalVLBench: Benchmarking Visual Causal Reasoning in Large Vision-Language Models
Aneesh Komanduri, Karuna Bhaila, Xintao Wu
- Implicit Behavioral Alignment of Language Agents in High-Stakes Crowd Simulations
Yunzhe Wang, Gale Lucas, Burcin Becerik-Gerber, Volkan Ustun
- Are Language Models Consequentialist or Deontological Moral Reasoners?
Keenan Samway, Max Kleiman-Weiner, David Guzman Piedrahita, Rada Mihalcea, Bernhard Schölkopf, Zhijing Jin
- PatentScore: Multi-dimensional Evaluation of LLM-Generated Patent Claims
Yongmin Yoo, Qiongkai Xu, Longbing Cao
- All for One: LLMs Solve Mental Math at the Last Token With Information Transferred From Other Tokens
Siddarth Mamidanna, Daking Rai, Ziyu Yao, Yilun Zhou
- A Position Paper on the Automatic Generation of Machine Learning Leaderboards
Roelien C. Timmer, Yufang Hou, Stephen Wan
- SimMark: A Robust Sentence-Level Similarity-Based Watermarking Algorithm for Large Language Models
Amirhossein Dabiriaghdam, Lele Wang
- SERVAL: Surprisingly Effective Zero-Shot Visual Document Retrieval Powered by Large Vision and Language Models
Thong Nguyen, Yibin Lei, Jia-Huei Ju, Andrew Yates
- Meta-Semantics Augmented Few-Shot Relational Learning
Han Wu, Jie Yin
- ProLongVid: A Simple but Strong Baseline for Long-context Video Instruction Tuning
Rui Wang, Bohao Li, Xiyang Dai, Jianwei Yang, Yi-Ling Chen, Zhen Xing, Yifan Yang, Dongdong Chen, Xipeng Qiu, Zuxuan Wu, Yu-Gang Jiang
- ModelCitizens: Representing Community Voices in Online Safety
Ashima Suvarna, Christina A Chance, Karolina Naranjo, Hamid Palangi, Sophie Hao, Thomas Hartvigsen, Saadia Gabriel
- UnifiedVisual: A Framework for Constructing Unified Vision-Language Datasets
Pengyu Wang, Shaojun Zhou, Chenkun Tan, Xinghao Wang, Wei Huang, Zhen Ye, Zhaowei Li, Botian Jiang, Dong Zhang, Xipeng Qiu
- The Pursuit of Empathy: Evaluating Small Language Models for PTSD Dialogue Support
Suhas BN, Yash Mahajan, Dominik O. Mattioli, Andrew M. Sherrill, Rosa I. Arriaga, Christopher Wiese, Saeed Abdullah
- Is Cognition Consistent with Perception? Assessing and Mitigating Multimodal Knowledge Conflicts in Document Understanding
Zirui Shao, Feiyu Gao, Zhaoqing Zhu, Chuwei Luo, Hangdi Xing, Zhi Yu, Qi Zheng, Ming Yan, Jiajun Bu
- AutoCT: Automating Interpretable Clinical Trial Prediction with LLM Agents
Fengze Liu, Haoyu Wang, Joonhyuk Cho, Dan Roth, Andrew Lo
- MMDocIR: Benchmarking Multimodal Retrieval for Long Documents
Kuicai Dong, Yujing Chang, Derrick Goh Xin Deik, Dexun Li, Ruiming Tang, Yong Liu
- Program of Thoughts for Financial Reasoning: Leveraging Dynamic In-Context Examples and Generative Retrieval
Subhendu Khatuya, Shashwat Naidu, Pawan Goyal, Niloy Ganguly
- Waste-Bench: A Comprehensive Benchmark for Evaluating VLLMs in Cluttered Environments
Muhammad Ali, Salman Khan
- Demystifying Domain-adaptive Post-training for Financial LLMs
Zixuan Ke, Yifei Ming, Xuan-Phi Nguyen, Caiming Xiong, Shafiq Joty
- HICode: Hierarchical Inductive Coding with LLMs
Mian Zhong, Pristina Wang, Anjalie Field
- Cacheback: Speculative Decoding With Nothing But Cache
Zhiyao Ma, In Gim, Lin Zhong
- MA-DPR: Manifold-aware Distance Metrics for Dense Passage Retrieval
Yifan Liu, Qianfeng Wen, Mark Zhao, Jiazhou Liang, Scott Sanner
- LLM-Guided Co-Training for Text Classification
Md Mezbaur Rahman, Cornelia Caragea
- LeanK: Learnable K Cache Channel Pruning for Efficient Decoding
Yike Zhang, Zhiyuan He, Huiqiang Jiang, Chengruidong Zhang, Yuqing Yang, Jianyong Wang, Lili Qiu
- DELOC: Document Element Localizer
Hammad Ayyubi, Puneet Mathur, Mehrab Tanjim, Vlad I Morariu
- NL2Lean: Translating Natural Language into Lean 4 through Multi-Aspect Reinforcement Learning
Yue Fang, Shaohan Huang, Xin Yu, Haizhen Huang, Zihan Zhang, Weiwei Deng, Furu Wei, Feng Sun, Qi Zhang, Zhi Jin
- A Multilingual, Culture-First Approach to Addressing Misgendering in LLM Applications
Sunayana Sitaram, Adrian de Wynter, Isobel McCrum, Qilong Gu, Si-Qing Chen
- X-CoT: Explainable Text-to-Video Retrieval via LLM-based Chain-of-Thought Reasoning
Prasanna Reddy Pulakurthi, Jiamian Wang, Majid Rabbani, Sohail Dianat, Raghuveer Rao, Zhiqiang Tao
- Token-level Proximal Policy Optimization for Query Generation
Yichen Ouyang, Lu Wang, Fangkai Yang, Pu Zhao, Chenghua Huang, Jianfeng Liu, Bochen Pang, Yaming Yang, Yuefeng Zhan, Hao Sun, Qingwei Lin, Saravan Rajmohan, Weiwei Deng, Dongmei Zhang, Feng Sun
- Prior Prompt Engineering for Reinforcement Fine-Tuning
Pittawat Taveekitworachai, Potsawee Manakul, Sarana Nutanong, Kunat Pipatanakul
- Beyond WER: Probing Whisper’s Sub‑token Decoder Across Diverse Language Resource Levels
Siyu Liang, Nicolas Ballier, Gina-Anne Levow, Richard Wright
- ThinkTuning: Instilling Cognitive Reflections without Distillation
Aswin RRV, Jacob Dineen, Divij Handa, Md Nayem Uddin, Mihir Parmar, Chitta Baral, Ben Zhou
- Droid: A Resource Suite for AI-Generated Code Detection
Daniil Orel, Indraneil Paul, Iryna Gurevych, Preslav Nakov
- LoRACoE: Improving Large Language Model via Composition-based LoRA Expert
Guanyu Li, Zhiheng Xi, Zhihao Zhang, Boyang Hong, Tao Gui, Qi Zhang, Xuanjing Huang
- Same Question, Different Words: A Latent Adversarial Framework for Prompt Robustness
Tingchen Fu, Fazl Barez
- Pluralistic Alignment for Healthcare: A Role-Driven Framework
Jiayou Zhong, Anudeex Shetty, Chao Jia, Xuanrui Lin, Usman Naseem
- Flexible-length Text Infilling for Discrete Diffusion Models
Andrew Zhang, Anushka Sivakumar, Chia-Wei Tang, Chris Thomas
- Beyond the Leaderboard: Understanding Performance Disparities in Large Language Models via Model Diffing
Sabri Boughorbel, Fahim Dalvi, Nadir Durrani, Majd Hawasly
- Explicit Learning and the LLM in Machine Translation
Malik Marmonier, Rachel Bawden, Benoît Sagot
- Towards Language-Agnostic STIPA: Universal Phonetic Transcription to Support Language Documentation at Scale
Jacob Lee Suchardt, Hana El-Shazli, Pierluigi Cassotti
- Beyond Pairwise: Global Zero-shot Temporal Graph Generation
Alon Eirew, Kfir Bar, Ido Dagan
- “Feels Feminine to Me”: Understanding Perceived Gendered Style through Human Annotations
Hongyu Chen, Neele Falk, Michael Roth, Agnieszka Falenska
- RALS: Resources and Baselines for Romanian Automatic Lexical Simplification
Fabian Anghel, Cristea Petru-Theodor, Claudiu Creanga, Sergiu Nisioi
- How Do Social Bots Participate in Misinformation Spread? A Comprehensive Dataset and Analysis
Herun Wan, Minnan Luo, Zihan Ma, Guang Dai, Xiang Zhao
- Are Stereotypes Leading LLMs’ Zero-Shot Stance Detection?
Anthony Dubreuil, Antoine Gourru, Christine Largeron, Amine Trabelsi
- Multi-Modal Framing Analysis of News
Arnav Arora, Srishti Yadav, Maria Antoniak, Serge Belongie, Isabelle Augenstein
- TempParaphraser: “Heating Up” Text to Evade AI-Text Detection through Paraphrasing
Junjie Huang, Ruiquan Zhang, Jinsong Su, Yidong Chen
- ComicScene154: A Scene Dataset for Comic Analysis
Sandro Paval, Pascal Meißner, Ivan P. Yamshchikov
- MedLinkDE – MedDRA Entity Linking for German with Guided Chain of Thought Reasoning
Roman Christof, Farnaz Zeidi, Manuela Messelhäußer, Dirk Mentzer, Renate Koenig, Liam Childs, Alexander Mehler
- HookMoE: A learnable performance compensation strategy of Mixture-of-Experts for LLM inference acceleration
Cheng Longkai, Along He, Mulin Li, Xie xueshuo, Tao Li
- Cross-Document Cross-Lingual NLI via RST-Enhanced Graph Fusion and Interpretability Prediction
Mengying Yuan, WenHao Wang, Zixuan Wang, Yujie Huang, Kangli Wei, Fei Li, Chong Teng, Donghong Ji
- 3R: Enhancing Sentence Representation Learning via Redundant Representation Reduction
Longxuan Ma, Xiao Wu, Yuxin Huang, Shengxiang Gao, Zhengtao Yu
- When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs
Abhirama Subramanyam Penamakuri, Navlika Singh, Piyush Arora, Anand Mishra
- ProReason: Multi-Modal Proactive Reasoning with Decoupled Eyesight and Wisdom
Jingqi Zhou, Sheng Wang, Jingwei Dong, Kai Liu, Lei Li, Jiahui Gao, Jiyue Jiang, Lingpeng Kong, Chuan Wu
- Extractive Fact Decomposition for Interpretable Natural Language Inference in one Forward Pass
Nicholas Popovic, Michael Färber
- Structure-Conditional Minimum Bayes Risk Decoding
Bryan Eikema, Anna Rutkiewicz, Mario Giulianelli
- Label Set Optimization via Activation Distribution Kurtosis for Zero-Shot Classification with Generative Models
Yue Li, Zhixue Zhao, Carolina Scarton
- The Transfer Neurons Hypothesis: An Underlying Mechanism for Language Latent Space Transitions in Multilingual LLMs
Hinata Tezuka, Naoya Inoue
- VEHME: A Vision-Language Model For Evaluating Handwritten Mathematics Expressions
Thu Phuong Nguyen, Duc M. Nguyen, Hyotaek Jeon, Hyunwook Lee, Hyunmin Song, Sungahn Ko, Taehwan Kim
- All Roads Lead to Rome: Graph-Based Confidence Estimation for Large Language Model Reasoning
Caiqi Zhang, Chang Shu, Ehsan Shareghi, Nigel Collier
- SEMMA: A Semantic Aware Knowledge Graph Foundation Model
Arvindh Arun, Sumit Kumar, Mojtaba Nayyeri, Bo Xiong, Ponnurangam Kumaraguru, Antonio Vergari, Steffen Staab
- Text2Vis: A Challenging and Diverse Benchmark for Generating Multimodal Visualizations from Text
Mizanur Rahman, Md Tahmid Rahman Laskar, Shafiq Joty, Enamul Hoque
- Predicting Prosodic Boundaries for Children’s Texts
Mansi Dhamne, Sneha Raman, Preeti Rao
- Enhancing Logical Reasoning in Language Models via Symbolically-Guided Monte Carlo Process Supervision
Xingwei Tan, Marco Valentino, Mahmud Elahi Akhter, Maria Liakata, Nikolaos Aletras
- Can Large Language Models Outperform Non-Experts in Poetry Evaluation? A Comparative Study Using the Consensual Assessment Technique
Piotr Sawicki, Marek Grzes, Dan Brown, Fabricio Goes
- Beyond Human Labels: A Multi-Linguistic Auto-Generated Benchmark for Evaluating Large Language Models on Resume Parsing
Zijian Ling, Han Zhang, Jiahao Cui, Zhequn Wu, Xu Sun, Guohao Li, Xiangjian He
- Orthogonal Finetuning Made Scalable
Zeju Qiu, Weiyang Liu, Adrian Weller, Bernhard Schölkopf
- AIR: Complex Instruction Generation via Automatic Iterative Refinement
Wei Liu, Yancheng He, Yu Li, Hui Huang, Chengwei Hu, Jiaheng Liu, Shilong Li, Wenbo Su, Bo Zheng
- SQUiD: Synthesizing Relational Databases from Unstructured Text
Mushtari Sadia, Zhenning Yang, Yunming Xiao, Ang Chen, Amrita Roy Chowdhury
- RAG+: Enhancing Retrieval-Augmented Generation with Application-Aware Reasoning
Yu Wang, Shiwan Zhao, Zhihu Wang, Ming FAN, Yubo Zhang, Xicheng Zhang, Zhengfan Wang, Heyuan Huang, Ting Liu
- Rapid Word Learning Through Meta In-Context Learning
Wentao Wang, Guangyuan Jiang, Tal Linzen, Brenden Lake
- EuroGEST: Investigating gender stereotypes in multilingual language models
Jacqueline Rowe, Mateusz Klimaszewski, Liane Guillou, Shannon Vallor, Alexandra Birch
- How Persuasive Is Your Context?
Tu Nguyen, Kevin Du, Alexander Miserlis Hoyle, Ryan Cotterell
- The Medium Is Not the Message: Deconfounding Document Embeddings via Linear Concept Erasure
Yu Fan, Yang Tian, Shauli Ravfogel, Mrinmaya Sachan, Elliott Ash, Alexander Miserlis Hoyle
- Measuring scalar constructs in social science with LLMs
Hauke Licht, Rupak Sarkar, Patrick Y. Wu, Pranav Goel, Niklas Stoehr, Elliott Ash, Alexander Miserlis Hoyle
- Text Detoxification: Data Efficiency, Semantic Preservation and Model Generalization
Jing Yu, Yibo Zhao, Jiapeng Zhu, Wenming Shao, Bo Pang, Zhao Zhang, Xiang Li
- Not What the Doctor Ordered: Surveying LLM-based De-identification and Quantifying Clinical Information Loss
Kiana Aghakasiri, Noopur Zambare, JoAnn Thai, Carrie Ye, Mayur Mehta, J Ross Mitchell, Mohamed Abdalla
- Reasoning under Uncertainty: Efficient LLM Inference via Unsupervised Confidence Dilution and Convergent Adaptive Sampling
Zhenning Shi, Yijia Zhu, Yi Xie, Junhan Shi, Guorui Xie, Haotian Zhang, Yong Jiang, Congcong Miao, Qing Li
- Africa Health Check: Probing Cultural Bias in Medical LLMs
Charles Nimo, Shuheng Liu, Irfan Essa, Michael L. Best
- Assumed Identities: Quantifying Gender Bias in Machine Translation of Gender-Ambiguous Occupational Terms
Orfeas Menis Mastromichalakis, Giorgos Filandrianos, Maria Symeonaki, Giorgos Stamou
- REVIVING YOUR MNEME: Predicting The Side Effects of LLM Unlearning and Fine-Tuning via Sparse Model Diffing
Aly M. Kassem, Golnoosh Farnadi, Negar Rostamzadeh, Zhuan Shi
- ToM-SSI: Evaluating Theory of Mind in Situated Social Interactions
Matteo Bortoletto, Constantin Ruhdorfer, Andreas Bulling
- Recursive Training Loops in LLMs: How training data properties modulate distribution shift in generated data?
Grgur Kovač, Jérémy Perez, Rémy Portelas, Peter Ford Dominey, Pierre-Yves Oudeyer
- Detecting LLM Hallucination Through Layer-wise Information Deficiency: Analysis of Ambiguous Prompts and Unanswerable Questions
Hazel Kim, Tom A. Lamb, Adel Bibi, Philip Torr, Yarin Gal
- Extending Automatic Machine Translation Evaluation to Book-Length Documents
Kuang-Da Wang, Shuoyang Ding, Chao-Han Huck Yang, Ping-Chun Hsieh, Wen-Chih Peng, Vitaly Lavrukhin, Boris Ginsburg
- MedFact: A Large-scale Chinese Dataset for Evidence-based Medical Fact-checking of LLM Responses
Tong Chen, Zimu Wang, Yiyi Miao, Haoran Luo, Sun Yuanfei, Wei Wang, Zhengyong Jiang, Procheta Sen, Jionglong Su
- VideoPASTA: 7K Preference Pairs That Matter for Video-LLM Alignment
Yogesh Kulkarni, Pooyan Fazli
- Do LLMs Adhere to Label Definitions? Examining Their Receptivity to External Label Definitions
Seyedali Mohammadi, Bhaskara Hanuma Vedula, Hemank Lamba, Edward Raff, Ponnurangam Kumaraguru, Francis Ferraro, Manas Gaur
- Group-Aware Reinforcement Learning for Output Diversity in Large Language Models
Oron Anschel, Alon Shoshan, Adam Botach, Shunit Haviv Hakimi, Asaf Gendler, Emanuel Ben Baruch, Nadav Bhonker, Igor Kviatkovsky, Manoj Aggarwal, Gerard Medioni
- Model-Based Ranking of Source Languages for Zero-Shot Cross-Lingual Transfer
Abteen Ebrahimi, Adam Wiemerslage, Katharina von der Wense
- PruneCD: Contrasting Pruned Self Model to Improve Decoding Factuality
Byeongho Yu, Changhun Lee, Jun-gyu Jin, Eunhyeok Park
- Crisp: Cognitive Restructuring of Negative Thoughts through Multi-turn Supportive Dialogues
Jinfeng Zhou, Yuxuan Chen, Jianing Yin, Yongkang Huang, Yihan Shi, Xikun Zhang, Libiao Peng, Rongsheng Zhang, Tangjie Lv, Zhipeng Hu, Hongning Wang, Minlie Huang
- The Impact of Language Mixing on Bilingual LLM Reasoning
Yihao Li, Jiayi Xin, Miranda Muqing Miao, Qi Long, Lyle Ungar
- VISaGE: Understanding Visual Generics and Exceptions
Stella Frank, Emily Allaway
- Stronger Baselines for Retrieval-Augmented Generation with Long-Context Language Models
Alex Laitenberger, Christopher D Manning, Nelson F. Liu
- Discursive Circuits: How Do Language Models Understand Discourse Relations?
Yisong Miao, Min-Yen Kan
- Making VLMs More Robot-Friendly: Self-Critical Distillation of Low-Level Procedural Reasoning
Chan Young Park, Jillian Fisher, Marius Memmel, Dipika Khullar, Seoho Yun, Abhishek Gupta, Yejin Choi
- ThinkSLM: Towards Reasoning in Small Language Models
Gaurav Srivastava, Shuxiang Cao, Xuan Wang
- MAgICoRe: Multi-Agent, Iterative, Coarse-to-Fine Refinement for Reasoning
Justin Chen, Archiki Prasad, Swarnadeep Saha, Elias Stengel-Eskin, Mohit Bansal
- Batched Self-Consistency Improves LLM Relevance Assessment and Ranking
Anton Korikov, Pan Du, Scott Sanner, Navid Rekabsaz
- SemCSE: Semantic Contrastive Sentence Embeddings Using LLM-Generated Summaries For Scientific Abstracts
Marc Felix Brinner, Sina Zarrieß
- Controlled Generation for Private Synthetic Text
Zihao Zhao, Anjalie Field
- Towards AI-Assisted Psychotherapy: Emotion-Guided Generative Interventions
Kilichbek Haydarov, Youssef Mohamed, Emilio Goldenhersch, Paul OCallaghan, Li-jia Li, Mohamed Elhoseiny
- From Shortcuts to Balance: Attribution Analysis of Speech-Text Feature Utilization in Distinguishing Original from Machine-Translated Texts
Yongjian Chen, Antonio Toral
- DEBATE, TRAIN, EVOLVE: Self‑Evolution of Language Model Reasoning
Gaurav Srivastava, Zhenyu Bi, Meng Lu, Xuan Wang
- From Chat Logs to Collective Insights: Aggregative Question Answering
Wentao Zhang, Woojeong Kim, Yuntian Deng
- A Text-Based Recommender System that Leverages Explicit Affective State Preferences
Tonmoy Hasan, Razvan Bunescu
- CARE: Multilingual Human Preference Learning for Cultural Awareness
Geyang Guo, Tarek Naous, Hiromi Wakaki, Yukiko Nishimura, Yuki Mitsufuji, Alan Ritter, Wei Xu
- Multilingual Dialogue Generation and Localization with Dialogue Act Scripting
Justin Vasselli, Eunike Andriani Kardinata, Yusuke Sakai, Taro Watanabe
- SUE: Sparsity-based Uncertainty Estimation via Sparse Dictionary Learning
Tamás Ficsor, Gábor Berend
- Planning-Aware Code Infilling via Horizon-Length Prediction
Yifeng Ding, Hantian Ding, Shiqi Wang, Qing Sun, Varun Kumar, Zijian Wang
- SinhalaMMLU: A Comprehensive Benchmark for Evaluating Multitask Language Understanding in Sinhala
Ashmari Pramodya, Nirasha Nelki, Heshan Shalinda, Chamila Liyanage, Yusuke Sakai, Randil Pushpananda, Ruvan Weerasinghe, Hidetaka Kamigaito, Taro Watanabe
- OG-RAG: Ontology-grounded retrieval-augmented generation for large language models
Kartik Sharma, Peeyush Kumar, Yunqing Li
- Convergence and Divergence of Language Models under Different Random Seeds
Finlay Fehlauer, Kyle Mahowald, Tiago Pimentel
- Analyzing and Modeling LLM Response Lengths with Extreme Value Theory: Anchoring Effects and Hybrid Distributions
Liuxuan Jiao, Chen Gao, Yiqian Yang, Chenliang Zhou, YiXian Huang, Yong Li, Xinlei Chen
- Language Models Identify Ambiguities and Exploit Loopholes
Jio Choi, Mohit Bansal, Elias Stengel-Eskin
- Benchmarking LLMs for Translating Classical Chinese Poetry: Evaluating Adequacy, Fluency, and Elegance
Andong Chen, Lianzhang Lou, Kehai Chen, Xuefeng Bai, Yang Xiang, Muyun Yang, Tiejun Zhao, Min Zhang
- AraEval: An Arabic Multi-Task Evaluation Suite for Large Language Models
Alhanoof Althnian, Norah A. Alzahrani, Shaykhah Z. Alsubaie, Eman Albilali, Ahmed Abdelali, Nouf M. Alotaibi, M Saiful Bari, Yazeed Alnumay, Abdulhamed Alothaimen, Maryam Saif, Shahad D. Alzaidi, Faisal Abdulrahman Mirza, Yousef Almushayqih, Mohammed Al Saleem, Ghadah Alabduljabbar, Abdulmohsen Al-Thubaity, Areeb Alowisheq, Nora Al-Twairesh
- QUIDS: Query Intent Description for Exploratory Search via Dual Space Modeling
Yumeng Wang, Xiuying Chen, Suzan Verberne
- A Systematic Survey of Automatic Prompt Optimization Techniques
Kiran Ramnath, Kang Zhou, Sheng Guan, Soumya Smruti Mishra, Xuan Qi, Zhengyuan Shen, Shuai Wang, Sangmin Woo, Sullam Jeoung, Yawei Wang, Haozhu Wang, Han Ding, Yuzhe Lu, Zhichao Xu, Yun Zhou, Balasubramaniam Srinivasan, Qiaojing Yan, Yueyan Chen, Haibo Ding, Panpan Xu, Lin Lee Cheong
- Threading the Needle: Reweaving Chain-of-Thought Reasoning to Explain Human Label Variation
Beiduo Chen, Yang Janet Liu, Anna Korhonen, Barbara Plank
- MemInsight: Autonomous Memory Augmentation for LLM Agents
Rana Salama, Jason Cai, Michelle Yuan, Anna Currey, Monica Sunkara, Yi Zhang, Yassine Benajiba
- Breaking the Noise Barrier: LLM-Guided Semantic Filtering and Enhancement for Multi-Modal Entity Alignment
Chenglong Lu, Chenxiao Li, Jingwei Cheng, Yongquan Ji, Guoqing Chen, Fu Zhang
- ImpliRet: Benchmarking the Implicit Fact Retrieval Challenge
Zeinab Sadat Taghavi, Ali Modarressi, Yunpu Ma, Hinrich Schuetze
- No Need for Explanations: LLMs can implicitly learn from mistakes in-context
Lisa Alazraki, Maximilian Mozes, Jon Ander Campos, Tan Yi-Chern, Marek Rei, Max Bartolo
- MoVa: Towards Generalizable Classification of Human Morals and Values
Ziyu Chen, Junfei Sun, Chenxi Li, Tuan Dung Nguyen, Jing Yao, Xiaoyuan Yi, Xing Xie, Chenhao Tan, Lexing Xie
- GUI-Bee: Align GUI Action Grounding to Novel Environments via Autonomous Exploration
Yue Fan, Handong Zhao, Ruiyi Zhang, Yu Shen, Xin Eric Wang, Gang Wu
- Revealing and Mitigating the Challenge of Detecting Character Knowledge Errors in LLM Role-Playing
Wenyuan Zhang, Shuaiyi Nie, Jiawei Sheng, Zefeng Zhang, Xinghua Zhang, Yongquan He, Tingwen Liu
- Taking Notes Brings Focus? Towards Multi-Turn Multimodal Dialogue Learning
Jiazheng Liu, Sipeng Zheng, Börje F. Karlsson, Zongqing Lu
- Graph-Based Multi-Trait Essay Scoring
Shengjie Li, Vincent Ng
- Benchmarking LLMs on Semantic Overlap Summarization
John Salvador, Naman Bansal, Mousumi Akter, Souvika Sarkar, Anupam Das, Santu Karmaker
- N-CORE: N-View Consistency Regularization for Disentangled Representation Learning in Nonverbal Vocalizations
Siddhant Bikram Shah, Kristina T. Johnson
- Probability Distribution Collapse: A Critical Bottleneck to Compact Unsupervised Neural Grammar Induction
Jinwook Park, Kangil Kim
- Spatial Layouts in News Homepages Capture Human Preferences
Alexander Spangher, Michael Vu, Arda Kaz, Naitian Zhou, Ben Welsh
- KRETA: A Benchmark for Korean Reading and Reasoning in Text-Rich VQA Attuned to Diverse Visual Contexts
Taebaek Hwang, Minseo Kim, Gisang Lee, Seonuk Kim, Hyunjun Eun
- ReflAct: World-Grounded Decision Making in LLM Agents via Goal-State Reflection
Jeonghye Kim, Sojeong Rhee, Minbeom Kim, Dohyung Kim, Sangmook Lee, Youngchul Sung, Kyomin Jung
- CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward
Shudong Liu, Hongwei Liu, Junnan Liu, Linchen Xiao, Songyang Gao, Chengqi Lyu, Yuzhe Gu, Wenwei Zhang, Derek F. Wong, Songyang Zhang, Kai Chen
- A Knowledge-driven Adaptive Collaboration of LLMs for Enhancing Medical Decision-making
Xiao Wu, Ting-Zhu Huang, Liang-Jian Deng, Yanyuan Qiao, Imran Razzak, Yutong Xie
- Castle: Causal Cascade Updates in Relational Databases with Large Language Models
Yongye Su, Yucheng Zhang, Zeru Shi, Bruno Ribeiro, Elisa Bertino
- Idiosyncratic Versus Normative Modeling of Atypical Speech Recognition: Dysarthric Case Studies
Vishnu Raja, Adithya V Ganesan, Anand Syamkumar, Ritwik Banerjee, H. Schwartz
- NESTFUL: A Benchmark for Evaluating LLMs on Nested Sequences of API Calls
Kinjal Basu, Ibrahim Abdelaziz, Kiran Kate, Mayank Agarwal, Maxwell Crouse, Yara Rizk, Kelsey Bradford, Asim Munawar, Sadhana Kumaravel, Saurabh Goyal, Xin Wang, Luis A. Lastras, Pavan Kapanipathi
- Benchmarking and Mitigating MCQA Selection Bias of Large Vision-Language Models
Md. Atabuzzaman, Ali Asgarov, Chris Thomas
- Can Large Language Models Unlock Novel Scientific Research Ideas?
Sandeep Kumar, Tirthankar Ghosal, Vinayak Goyal, Asif Ekbal
- Word Salad Chopper: Reasoning Models Waste A Ton Of Decoding Budget On Useless Repetitions, Self-Knowingly
Wenya Xie, Shaochen Zhong, Hoang Anh Duy Le, Zhaozhuo Xu, Jianwen Xie, Zirui Liu
- DIWALI - Diversity and Inclusivity aWare cuLture specific Items for India: Dataset and Assessment of LLMs for Cultural Text Adaptation in Indian Context
Maharaj Brahma, Pramit Sahoo, Maunendra Sankar Desarkar
- SYNC: A Synthetic Long-Context Understanding Benchmark for Controlled Comparisons of Model Capabilities
Shuyang Cao, Kaijian Zou, Lu Wang
- OpenNER 1.0: Standardized Open-Access Named Entity Recognition Datasets in 50+ Languages
Chester Palen-Michel, Maxwell Pickering, Maya Kruse, Jonne Sälevä, Constantine Lignos
- Mondrian: A Framework for Logical Abstract (Re)Structuring
Elizabeth Grace Orwig, Shinwoo Park, Hyundong Jin, Yo-Sub Han
- Case-Based Decision-Theoretic Decoding with Quality Memories
Hiroyuki Deguchi, Masaaki Nagata
- PRIME: Large Language Model Personalization with Cognitive Dual-Memory and Personalized Thought Process
Xinliang Frederick Zhang, Nicholas Beauchamp, Lu Wang
- Mechanisms vs. Outcomes: Probing for Syntax Fails to Explain Performance on Targeted Syntactic Evaluations
Ananth Agarwal, Jasper Jian, Christopher D Manning, Shikhar Murty
- Image Difference Captioning via Adversarial Preference Optimization
Zihan Huang, Junda Wu, Rohan Surana, Tong Yu, David Arbour, Ritwik Sinha, Julian McAuley
- seqBench: A Tunable Benchmark to Quantify Sequential Reasoning Limits of LLMs
Mohammad Ramezanali, Mo Vazifeh, Paolo Santi
- NormGenesis: Multicultural Dialogue Generation via Exemplar-Guided Social Norm Modeling and Violation Recovery
Minki Hong, Jangho Choi, Jihie Kim
- SATBench: Benchmarking LLMs’ Logical Reasoning via Automated Puzzle Generation from SAT Formulas
Anjiang Wei, Yuheng Wu, Yingjia Wan, Tarun Suresh, Huanmi Tan, Zhanke Zhou, Sanmi Koyejo, Ke Wang, Alex Aiken
- Data Descriptions from Large Language Models with Influence Estimation
Chaeri Kim, Jaeyeon Bae, Taehwan Kim
- EquiBench: Benchmarking Large Language Models’ Reasoning about Program Semantics via Equivalence Checking
Anjiang Wei, Jiannan Cao, Ran Li, Hongyu Chen, Yuhui Zhang, Ziheng Wang, Yuan Liu, Thiago S. F. X. Teixeira, Diyi Yang, Ke Wang, Alex Aiken
- MicroEdit: Neuron-level Knowledge Disentanglement and Localization in Lifelong Model Editing
Shiqi Wang, Qi Wang, Runliang Niu, He Kong, Yi Chang
- Do Large Language Models Understand Word Senses?
Domenico Meconi, Simone Stirpe, Federico Martelli, Leonardo Lavalle, Roberto Navigli
- Diverse, not Short: A Length-Controlled Data Selection Strategy for Improving Response Diversity of Language Models
Vijeta Deshpande, Debasmita Ghose, John D Patterson, Roger E. Beaty, Anna Rumshisky
- Uncovering the Bigger Picture: Comprehensive Event Understanding Via Diverse News Retrieval
Yixuan Tang, Yuanyuan Shi, Yiqun Sun, Anthony Kum Hoe Tung
- Personalized LLM Decoding via Contrasting Personal Preference
Hyungjune Bu, ChanJoo Jung, Minjae Kang, Jaehyung Kim
- The Missing Parts: Augmenting Fact Verification with Half Truth Detection
Yixuan Tang, Jincheng Wang, Anthony Kum Hoe Tung
- Toward Machine Translation Literacy: How Lay Users Perceive and Rely on Imperfect Translations
Yimin Xiao, Yongle Zhang, Dayeon Ki, Calvin Bao, Marianna J. Martindale, Charlotte Vaughn, Ge Gao, Marine Carpuat
- Personalization up to a Point: Why Personalized Content Moderation Needs Boundaries, and How We Can Enforce Them
Emanuele Moscato, Tiancheng Hu, Matthias Orlikowski, Paul Röttger, Debora Nozza
- MPCG: Multi-Round Persona-Conditioned Generation for Modeling the Evolution of Misinformation with LLMs
Chong Jun Rong Brian, Yixuan Tang, Anthony Kum Hoe Tung
- LiTEx: A Linguistic Taxonomy of Explanations for Understanding Within-Label Variation in Natural Language Inference
Pingjun Hong, Beiduo Chen, Siyao Peng, Marie-Catherine de Marneffe, Barbara Plank
- LiteraryQA: Towards Effective Evaluation of Long-document Narrative QA
Tommaso Bonomo, Luca Gioffré, Roberto Navigli
- FillerSpeech: Towards Human-Like Text-to-Speech Synthesis with Filler Insertion and Filler Style Control
Seung-Bin Kim, Jun-Hyeok Cha, Hyung-Seok Oh, Heejin Choi, Seong-Whan Lee
- Multi-LMentry: Can Multilingual LLMs Solve Elementary Tasks Across Languages?
Luca Moroni, Javier Aula-Blasco, Simone Conia, Irene Baucells, Naiara Perez, Silvia Paniagua Suárez, Anna Sallés, Malte Ostendorff, Júlia Falcão, Guijin Son, Aitor Gonzalez-Agirre, Roberto Navigli, Marta Villegas
- Lookahead Q-Cache: Achieving More Consistent KV Cache Eviction via Pseudo Query
Yixuan Wang, Shiyu Ji, Yijun Liu, Yuzhuang Xu, Yang Xu, Qingfu Zhu, Wanxiang Che
- PerspectiveMod: A Perspectivist Resource for Deliberative Moderation
Eva Maria Vecchi, Neele Falk, Carlotta Quensel, Iman Jundi, Gabriella Lapesa
- LoCt-Instruct: An Automatic Pipeline for Constructing Datasets of Logical Continuous Instructions
Hongyu Sun, Yusuke Sakai, Haruki Sakajo, Shintaro Ozaki, Kazuki Hayashi, Hidetaka Kamigaito, Taro Watanabe
- CodeSSM: Towards State Space Models for Code Understanding
Shweta Verma, Abhinav Anand, Mira Mezini
- EduAdapt: A Question Answer Benchmark Dataset for Evaluating Grade-Level Adaptability in LLMs
Numaan Naeem, Abdellah EL MEKKI, Muhammad Abdul-Mageed
- xCoRe: Cross-context Coreference Resolution
Giuliano Martinelli, Bruno Gatti, Roberto Navigli
- Retrieval-Augmented Generation with Estimation of Source Reliability
Jeongyeon Hwang, Junyoung Park, Hyejin Park, Dongwoo Kim, Sangdon Park, Jungseul Ok
- NitiBench: Benchmarking LLM Frameworks on Thai Legal Question Answering Capabilities
Pawitsapak Akarajaradwong, Pirat Pothavorn, Chompakorn Chaksangchaichot, Panuthep Tasawong, Thitiwat Nopparatbundit, Keerakiat Pratai, Sarana Nutanong
- From Input Perception to Predictive Insight: Modeling Model Blind Spots Before They Become Errors
Maggie Mi, Aline Villavicencio, Nafise Sadat Moosavi
- $\mathrm{Wojood^{Relations}}$: Arabic Relation Extraction Corpus and Modeling
Alaa Aljabari, Mohammed Khalilia, Mustafa Jarrar
- Conflicting Needles in a Haystack: How LLMs behave when faced with contradictory information
Murathan Kurfali
- Towards Event Extraction with Massive Types: LLM-based Collaborative Annotation and Partitioning Extraction
Wenxuan Liu, Zixuan Li, Long Bai, Yuxin Zuo, Daozhu Xu, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng
- Liaozhai through the Looking-Glass: On Paratextual Explicitation of Culture-Bound Terms in Machine Translation
Sherrie Shen, Weixuan Wang, Alexandra Birch
- Concept-pedia: a Wide-coverage Semantically-annotated Multimodal Dataset
Karim Ghonim, Andrei Stefan Bejgu, Alberte Fernández-Castro, Roberto Navigli
- RAED: Retrieval-Augmented Entity Description Generation for Emerging Entity Linking and Disambiguation
Karim Ghonim, Pere-Lluís Huguet Cabot, Riccardo Orlando, Roberto Navigli
- Personalized Language Models via Privacy-Preserving Evolutionary Model Merging
Kyuyoung Kim, Jinwoo Shin, Jaehyung Kim
- Aligning Text/Speech Representations from Multimodal Models with MEG Brain Activity During Listening
Padakanti Srijith, Khushbu Pahwa, Radhika Mamidi, Bapi Raju Surampudi, Manish Gupta, Subba Reddy Oota
- STARQA: A Question Answering Dataset for Complex Analytical Reasoning over Structured Databases
Mounica Maddela, Lingjue Xie, Daniel Preotiuc-Pietro, Mausam
- Slim-SC: Thought Pruning for Efficient Scaling with Self-Consistency
Colin Hong Fung Heng, Xu Guo, Anand Chaanan Singh, Esha Choukse, Dmitrii Ustiugov
- Long Chain-of-Thought Fine-tuning via Understanding-to-Reasoning Transition
Chenxin An, Zhihui Xie, Xiaonan Li, Ming Zhong, Shansan Gong, Lei Li, Jun Zhang, Jingjing Xu, Lingpeng Kong
- Exploring Large Language Models for Detecting Mental Disorders
Gleb Kuzmin, Petr Strepetov, Maksim Stankevich, Natalia Chudova, Artem Shelmanov, Ivan Smirnov
- Efficient Real-time Refinement of Language Model Text Generation
Joonho Ko, Jinheon Baek, Sung Ju Hwang
- Reward-Weighted Sampling: Enhancing Non-Autoregressive Characteristics in Masked Diffusion LLMs
Daehoon Gwak, Minseo Jung, Junwoo Park, Minho Park, ChaeHun Park, Junha Hyung, Jaegul Choo
- AI Argues Differently: Distinct Argumentative and Linguistic Patterns of LLMs in Persuasive Contexts
Esra Dönmez, Maximilian Maurer, Gabriella Lapesa, Agnieszka Falenska
- TounsiBench: Benchmarking Large Language Models for Tunisian Arabic
Souha Ben Hassine, Asma Arrak, Marouene Addhoum, Steven R Wilson
- Moral Framing in Politics (MFiP): A new resource and models for moral framing
Ines Rehbein, Ines Reinig, Simone Paolo Ponzetto
- ReDepress: A Cognitive Framework for Detecting Depression Relapse from Social Media
Aakash Kumar Agarwal, Saprativa Bhattacharjee, Mauli Rastogi, Jemima S. Jacob, Biplab Banerjee, Rashmi Gupta, Pushpak Bhattacharyya
- iKnow-audio: Integrating Knowledge Graphs with Audio-Language Models
Michel Olvera, Changhong Wang, Paraskevas Stamatiadis, Gaël Richard, Slim Essid
- EduVidQA: Generating and Evaluating Long-form Answers to Student Questions based on Lecture Videos
Sourjyadip Ray, Shubham Sharma, Somak Aditya, Pawan Goyal
- The Illusion of Progress: Re-evaluating Hallucination Detection in LLMs
Denis Janiak, Jakub Binkowski, Albert Sawczyn, Bogdan Gabrys, Ravid Shwartz-Ziv, Tomasz Jan Kajdanowicz
- Turning Logic Against Itself: Probing Model Defenses Through Contrastive Questions
Rachneet Singh Sachdeva, Rima Hazra, Iryna Gurevych
- CHURRO: Making History Readable with an Open-Weight Large Vision-Language Model for High-Accuracy, Low-Cost Historical Text Recognition
Sina Semnani, Han Zhang, Xinyan He, Merve Tekgurler, Monica Lam
- Towards Author-informed NLP: Mind the Social Bias
Inbar Cohen, Einat Minkov
- Detecting Corpus-Level Knowledge Inconsistencies in Wikipedia with Large Language Models
Sina Semnani, Jirayu Burapacheep, Arpandeep Khatua, Thanawan Atchariyachanvanit, Zheng Wang, Monica Lam
- Leveraging Multilingual Training for Authorship Representation: Enhancing Generalization across Languages and Domains
Junghwan Kim, Haotian Zhang, David Jurgens
- DrFrattn: Directly Learn Adaptive Policy from Attention for Simultaneous Machine Translation
Libo Zhao, Jing Li, Ziqian Zeng
- The Sound of Syntax: Finetuning and Comprehensive Evaluation of Language Models for Speech Pathology
Fagun Patel, Duc Quang Nguyen, Sang T. Truong, Jody Vaynshtok, Sanmi Koyejo, Nick Haber
- NormXLogit: The Head-on-Top Never Lies
Sina Abbasi, Mohammad Reza Modarres, Mohammad Taher Pilehvar
- Doc2Chart: Intent-Driven Zero-Shot Chart Generation from Documents
Akriti Jain, Pritika Ramu, Aparna Garimella, Apoorv Saxena
- Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification
Boyang Zhang, Yicong Tan, Yun Shen, Ahmed Salem, Michael Backes, Savvas Zannettou, Yang Zhang
- FoREST: Frame of Reference Evaluation in Spatial Reasoning Tasks
Tanawan Premsri, Parisa Kordjamshidi
- Multilinguality Does not Make Sense: Investigating Factors Behind Zero-Shot Cross-Lingual Transfer in Sense-Aware Tasks
Roksana Goworek, Haim Dubossarsky
- Translating Domain-Specific Terminology in Typologically-Diverse Languages: A Study in Tax and Financial Education
Arturo Oncevay, Elena Kochkina, Keshav Ramani, Toyin Aguda, Simerjot Kaur, Charese Smiley
- Train It and Forget It: Merge Lists are Unnecessary for BPE Inference in Language Models
Tomohiro Sawada, Kartik Goyal
- Spectral Scaling Laws in Language Models: How Effectively Do Feed-Forward Networks Use Their Latent Space?
Nandan Kumar Jha, Brandon Reagen
- TLUE: A Tibetan Language Understanding Evaluation Benchmark
Fan Gao, Cheng Huang, Yutong Liu, Nyima Tashi, Xiangxiang Wang, Thupten Tsering, BAN Ma-bao, RENZENG Duojie, Gadeng Luosang, Rinchen Dongrub, Dorje Tashi, XiaoFengCD, Yongbin Yu, Hao Wang
- Retrieving Support to Rank Answers in Open-Domain Question Answering
Zeyu Zhang, Alessandro Moschitti, Thuy Vu
- Trojsten Benchmark: Evaluating LLM Problem-Solving in Slovak STEM Competition Problems
Adam Zahradník, Marek Suppa
- BRSpeech-DF: A Deep Fake Synthetic Speech Dataset for Portuguese Zero-Shot TTS
Alexandre Costa Ferro Filho, Rafaello Virgilli, Lucas Alcantara Souza, F S de Oliveira, Marcelo Henrique Lopes Ferreira, Daniel Tunnermann, Gustavo dos Reis Oliveira, Anderson Da Silva Soares, Arlindo Rodrigues Galvão Filho
- A Simple Yet Effective Method for Non-Refusing Context Relevant Fine-grained Safety Steering in LLMs
Shaona Ghosh, Amrita Bhattacharjee, Yftah Ziser, Christopher Parisien
- Statistical and Neural Methods for Hawaiian Orthography Modernization
Jaden Kapali, Keaton Williamson, Winston Wu
- so much depends / upon / a whitespace: Why Whitespace Matters for Poets and LLMs
Sriharsh Bhyravajjula, Melanie Walsh, Anna Preus, Maria Antoniak
- Certified Mitigation of Worst-Case LLM Copyright Infringement
Jingyu Zhang, Jiacan Yu, Marc Marone, Benjamin Van Durme, Daniel Khashabi
- Quantifying Logical Consistency in Transformers via Query-Key Alignment
Eduard Tulchinskii, Laida Kushnareva, Anastasia Voznyuk, Andrei Andriiainen, Irina Piontkovskaya, Evgeny Burnaev, Serguei Barannikov
- SimulatorArena: Are User Simulators Reliable Proxies for Multi-Turn Evaluation of AI Assistants?
Yao Dou, Michel Galley, Baolin Peng, Chris Kedzie, Weixin Cai, Alan Ritter, Chris Quirk, Wei Xu, Jianfeng Gao
- CourtReasoner: Can LLM Agents Reason Like Judges?
Simeng Han, Yoshiki Takashima, Shannon Zejiang Shen, Chen Liu, Yixin Liu, Roque K. Thuo, Sonia Knowlton, Ruzica Piskac, Scott J Shapiro, Arman Cohan
- Not Your Typical Government Tipline: LLM-Assisted Routing of Environmental Protection Agency Citizen Tips
Sharanya Majumder, Zehua Li, Derek Ouyang, Kit T Rodolfa, Elena Eneva, Julian Nyarko, Daniel E. Ho
- Retracing the Past: LLMs Emit Training Data When They Get Lost
Myeongseob Ko, Nikhil Reddy Billa, Adam Nguyen, Charles Fleming, Ming Jin, Ruoxi Jia
- Layer-wise Minimal Pair Probing Reveals Contextual Grammatical-Conceptual Hierarchy in Speech Representations
Linyang He, Qiaolin Wang, Xilin Jiang, Nima Mesgarani
- Current Semantic-change Quantification Methods Struggle with Semantic Change Discovery in the Wild
Khonzoda Umarova, Lillian Lee, Laerdon Kim
- Evaluating Large Language Models for Detecting Antisemitism
Jay Patel, Hrudayangam Mehta, Jeremy Blackburn
- D-RAG: Differentiable Retrieval-Augmented Generation for Knowledge Graph Question Answering
Guangze Gao, Zixuan Li, Chunfeng Yuan, Jiawei Li, Wu Jianzhuo, Yuehao Zhang, Xiaolong Jin, Bing Li, Weiming Hu
- Towards Robust Mathematical Reasoning
Thang Luong, Hoang H Nguyen, Dawsen Hwang, Golnaz Ghiasi, Yuri Chervonyi, Insuk Seo, Garrett Bingham, Jonathan Lee, Swaroop Mishra, Alex Zhai, Huiyi Hu, Henryk Michalewski, Jimin Kim, Jeonghyun Ahn, Junhwi Bae, Quoc V Le, Junehyuk Jung
- Table-LLM-Specialist: Language Model Specialists for Tables using Iterative Fine-tuning
Junjie Xing, Yeye He, Mengyu Zhou, Haoyu Dong, Shi Han, Dongmei Zhang, Surajit Chaudhuri
- Introducing Spotlight: A Novel Approach for Generating Captivating Key Information from Documents
Ankan Mullick, Sombit Bose, Rounak Saha, Ayan Kumar Bhowmick, Aditya Vempaty, Prasenjit Dey, Ravi Kokku, Pawan Goyal, Niloy Ganguly
- Argument Summarization and its Evaluation in the Era of Large Language Models
Moritz Altemeyer, Steffen Eger, Johannes Daxenberger, Yanran Chen, Tim Altendorf, Philipp Cimiano, Benjamin Schiller
- Computational Analysis of Conversation Dynamics through Participant Responsivity
Margaret Hughes, Brandon Roy, Elinor Poole-Dayan, Deb Roy, Jad Kabbara
- AMQ: Enabling AutoML for Mixed-precision Weight-Only Quantization of Large Language Models
Sangjun Lee, Seung-taek Woo, Jun-gyu Jin, Changhun Lee, Eunhyeok Park
- Beyond Averages: Learning with Annotator Disagreement in STS
Alejandro Benito-Santos, Adrian Ghajari
- Dipper: Diversity in Prompts for Producing Large Language Model Ensembles in Reasoning Tasks
Gregory Kang Ruey Lau, Wenyang Hu, Liu Diwen, Chen Jizhuo, See-Kiong Ng, Bryan Kian Hsiang Low
- Constrained Non-negative Matrix Factorization for Guided Topic Modeling of Minority Topics
Seyedeh Fatemeh Ebrahimi, Jaakko Peltonen
- Which Word Orders Facilitate Length Generalization in LMs? An Investigation with GCG-Based Artificial Languages
Nadine El-Naggar, Tatsuki Kuribayashi, Ted Briscoe
- Training compute-optimal transformer encoder models
Megi Dervishi, Alexandre Allauzen, Gabriel Synnaeve, Yann LeCun
- Mind the Blind Spots: A Focus-Level Evaluation Framework for LLM Reviews
Hyungyu Shin, Jingyu Tang, Yoonjoo Lee, Nayoung Kim, Hyunseung Lim, Ji Yong Cho, Hwajung Hong, Moontae Lee, Juho Kim
- Seeing Through Words, Speaking Through Pixels: Deep Representational Alignment Between Vision and Language Models
Zoe Wanying He, Sean Trott, Meenakshi Khosla
- Unconditional Truthfulness: Learning Unconditional Uncertainty of Large Language Models
Artem Vazhentsev, Ekaterina Fadeeva, Rui Xing, Gleb Kuzmin, Ivan Lazichny, Alexander Panchenko, Preslav Nakov, Timothy Baldwin, Maxim Panov, Artem Shelmanov
- Chinese Toxic Language Mitigation via Sentiment Polarity Consistent Rewrites
Xintong Wang, Yixiao Liu, Jingheng Pan, Liang Ding, Longyue Wang, Chris Biemann
- A Head to Predict and a Head to Question: Pre-trained Uncertainty Quantification Heads for Hallucination Detection in LLM Outputs
Artem Shelmanov, Ekaterina Fadeeva, Akim Tsvigun, Ivan Tsvigun, Zhuohan Xie, Igor Kiselev, Nico Daheim, Caiqi Zhang, Artem Vazhentsev, Mrinmaya Sachan, Preslav Nakov, Timothy Baldwin
- Generative or Discriminative? Revisiting Text Classification in the Era of Transformers
Siva Rajesh Kasa, Karan Gupta, Sumegh Roychowdhury, Ashutosh Kumar, Yaswanth Biruduraju, Santhosh Kumar Kasa, Pattisapu Nikhil Priyatam, Arindam Bhattacharya, Shailendra Agarwal, Vijay Huddar
- Measuring Chain of Thought Faithfulness by Unlearning Reasoning Steps
Martin Tutek, Fateme Hashemi Chaleshtori, Ana Marasovic, Yonatan Belinkov
- LingGym: How Far Are LLMs from Thinking Like Field Linguists?
Changbing Yang, Franklin Ma, Freda Shi, Jian Zhu
- MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning
Jingyan Shen, Jiarui Yao, Rui Yang, Yifan Sun, Feng Luo, Rui Pan, Tong Zhang, Han Zhao
- Autoformalization in the Wild: Assessing LLMs on Real-World Mathematical Definitions
Lan Zhang, Marco Valentino, Andre Freitas
- InterIDEAS: Philosophical Intertextuality via LLMs
Yue Yang, Yinzhi Xu, Chenghao Huang, JohnMichael Jurgensen, Han Hu, Hao Wang
- Infini-gram mini: Exact n-gram Search at the Internet Scale with FM-Index
Hao Xu, Jiacheng Liu, Yejin Choi, Noah A. Smith, Hannaneh Hajishirzi
- Causal Interventions Reveal Shared Structure Across English Filler–Gap Constructions
Sasha Boguraev, Christopher Potts, Kyle Mahowald
- Mind the Value-Action Gap: Do LLMs Act in Alignment with Their Values?
Hua Shen, Nicholas Clark, Tanu Mitra
- AccessEval: Benchmarking Disability Bias in Large Language Models
Srikant Panda, Amit Agarwal, Hitesh Laxmichand Patel
- DiscoSG: Towards Discourse-Level Text Scene Graph Parsing through Iterative Graph Refinement
Shaoqing Lin, Chong Teng, Fei Li, Donghong Ji, Lizhen Qu, Zhuang Li