Accepted Findings Papers

  • Automating Alternative Generation in Decision-Making
    Yevhen Kostiuk, Clara Seyfried, Chris Reed
  • Bias Analysis and Mitigation through Protected Attribute Detection and Regard Classification
    Takuma Udagawa, Yang Zhao, Hiroshi Kanayama, Bishwaranjan Bhattacharjee
  • Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions
    Chenming Tang, Zhixiang Wang, Hao Sun, Yunfang Wu
  • Boundary Matters: Leveraging Structured Text Plots for Long Text Outline Generation
    Yuanchi Ma, Jiamou Liu, Hui He, Libo Zhang, Haoyuan Li, Zhendong Niu
  • Can Large Language Models Personalize Dialogues to Generational Styles?
    Pier Felice Balestrucci, Ondrej Dusek, Luca Anselma, Alessandro Mazzei
  • Toward Optimal LLM Alignments Using Two-Player Games
    Rui Zheng, Hongyi Guo, Zhihan Liu, Xiaoying Zhang, Yuanshun Yao, Xiaojun Xu, Zhaoran Wang, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang, Yang Liu, Hang Li
  • Structural Patent Classification Using Label Hierarchy Optimization
    Mengting Gui, Shufeng Hao, Chongyang Shi, Qi Zhang
  • Exploring Hyperbolic Hierarchical Structure for Multimodal Rumor Detection
    Md Mahbubur Rahman, Shufeng Hao, Chongyang Shi, An Lao, Jinyan Liu
  • Multi-Surrogate-Objective Optimization for Neural Topic Models
    Tue Le, Hoang Tran Vuong, Tung Nguyen, Linh Ngo Van, Sang Dinh, Trung Le, Thien Huu Nguyen
  • How Diversely Can Language Models Solve Problems? Exploring the Algorithmic Diversity of Model-Generated Code
    Seonghyeon Lee, HeeJae Chon, Joonwon Jang, Dongha Lee, Hwanjo Yu
  • ReAL: How Can LLMs Simulate the Real Teacher? Retrieval-enhanced Agent for Adaptive Learning
    Rui Lv, Qi Liu, Weibo Gao, Jiatong Li, Kai Zhang, Shiwei Tong
  • LLMsPark: A Benchmark for Evaluating Large Language Models in Strategic Gaming Contexts
    Junhao Chen, Jingbo Sun, Xiang Li, Haidong Xin, Yuhao Xue, Yibin Xu, Hao Zhao
  • Versatile Framework for Song Generation with Prompt-based Control
    Yu Zhang, Wenxiang Guo, Changhao Pan, Zhiyuan Zhu, Ruiqi Li, Jingyu Lu, Rongjie Huang, Ruiyuan Zhang, Zhiqing Hong, Ziyue Jiang, Zhou Zhao
  • InsBank: Evolving Instruction Subset for Ongoing Alignment
    Jiayi Shi, Yiwei Li, Shaoxiong Feng, Peiwen Yuan, Xinglin Wang, Yueqi Zhang, Chuyi Tan, Boyuan Pan, Huan Ren, Yao Hu, Kan Li
  • TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use
    Junjie Ye, Yilong Wu, Sixian Li, Yuming Yang, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang, Peng Wang, Zhongchao Shi, Jianping Fan, Zhengyin Du
  • DCMKC: A Dual Consistency Matching Approach for Multi-hop Question Answering in LLMs
    Xinyi Wang, Yiping Song, Chang Liu, Tingjin Luo, Bo Liu, Zheng Xie, Minlie Huang
  • On Domain-Adaptive Post-Training for Multimodal Large Language Models
    Daixuan Cheng, Shaohan Huang, Ziyu Zhu, Xintong Zhang, Xin Zhao, Zhongzhi Luan, Bo Dai, Zhenliang Zhang
  • CPO: Addressing Reward Ambiguity in Role-playing Dialogue via Comparative Policy Optimization
    Jing Ye, Rui Wang, Yuchuan Wu, Victor Ma, Feiteng Fang, Fei Huang, Yongbin Li
  • SPPD: Self-training with Process Preference Learning Using Dynamic Value Margin
    Hao Yi, Qingyang Li, Yulan Hu, Fuzheng Zhang, Di Zhang, Yong Liu
  • Error Classification of Large Language Models on Math Word Problems: A Dynamically Adaptive Framework
    Zhangyue Yin, YuHong Sun, Xuanjing Huang, Xipeng Qiu, Hui Zhao
  • sudoLLM: On Multi-role Alignment of Language Models
    Soumadeep Saha, Akshay Chaturvedi, Joy Mahapatra, Utpal Garain
  • DAC: Decomposed Automation Correction for Text-to-SQL
    Dingzirui Wang, Longxu Dou, Xuanliang Zhang, Qingfu Zhu
  • VehicleWorld: A Highly Integrated Multi-Device Environment for Intelligent Vehicle Interaction
    Jie Yang, Jiajun Chen, Zhangyue Yin, Shuo Chen, Yuxin Wang, Yiran Guo, Yuan Li, Yining Zheng, Xuanjing Huang, Xipeng Qiu
  • End-to-End Optimization for Multimodal Retrieval-Augmented Generation via Reward Backpropagation
    Zhiyuan Fan, Longfei Yun, Ming Yan, Yumeng Wang, Dadi Guo, Brian Mak, James Kwok, Yi R. Fung
  • Audio-Aware Large Language Models as Judges for Speaking Styles
    Cheng-Han Chiang, Xiaofei Wang, Chung-Ching Lin, Kevin Lin, Linjie Li, Radu Kopetz, Yao Qian, Zhendong Wang, Zhengyuan Yang, Hung-yi Lee, Lijuan Wang
  • Evaluation of Text-to-Image Generation from a Creativity Perspective
    Xinhao Wang, Xinyu Ma, ShengYong Ding, Derek F. Wong
  • Perovskite-LLM: Knowledge-Enhanced Large Language Models for Perovskite Solar Cell Research
    Xiang Liu, Penglei Sun, Shuyan Chen, Longhan Zhang, Peijie Dong, Huajie You, Yongqi Zhang, Chang Yan, Xiaowen Chu, Tong-yi Zhang
  • ProPy: Building Interactive Prompt Pyramids upon CLIP for Partially Relevant Video Retrieval
    Yi Pan, Yujia Zhang, Michael Kampffmeyer, Xiaoguang Zhao
  • Multilingual Datasets for Custom Input Extraction and Explanation Requests Parsing in Conversational XAI Systems
    Qianli Wang, Tatiana Anikina, Nils Feldhus, Simon Ostermann, Fedor Splitt, Jiaao Li, Yoana Tsoneva, Sebastian Möller, Vera Schmitt
  • Toolscaler: Scalable Generative Tool Calling via Structure-Aware Semantic Tokenization
    Yunyue Su, Zhang Jinshuai, Bowen Fang, Wen Ye, Jinghao Zhang, Qiang Liu, Bowen Song, Weiqiang Wang, Liang Wang
  • LaMP-Val: Large Language Models Empower Personalized Valuation in Auction
    Jie Sun, Tianyu Zhang, Houcheng Jiang, Junkang Wu, Xiang Shu, Jiancan Wu, An Zhang, Chi Luo, Zhibo Zhu, Xingyu Lu, Lintao Ma, Xiang Wang
  • Exploring Model Kinship for Merging Large Language Models
    Yedi Hu, Yunzhi Yao, Ningyu Zhang, Huajun Chen, Shumin Deng
  • MULTITAT: Benchmarking Multilingual Table-and-Text Question Answering
    Xuanliang Zhang, Dingzirui Wang, Keyan Xu, Qingfu Zhu, Wanxiang Che
  • LoRA-MGPO: Mitigating Double Descent in Low-Rank Adaptation via Momentum-Guided Perturbation Optimization
    Yupeng Chang, Chenlu Guo, Yi Chang, Yuan Wu
  • R-LoRA: Randomized Multi-Head LoRA for Efficient Multi-task Learning
    Jinda Liu, Yi Chang, Yuan Wu
  • RACQC: Advanced Retrieval-Augmented Generation for Chinese Query Correction
    Jinbo Su, Lingzhe Gao, Wei Li, Shihao Liu, Haojie Lei, Xinyi Wang, Yuanzhao Guo, Ke Wang, Daiting Shi, Dawei Yin
  • Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models
    Ercong Nie, Helmut Schmid, Hinrich Schuetze
  • Assessing and Mitigating Medical Knowledge Drift and Conflicts in Large Language Models
    Weiyi Wu, Xinwen Xu, Chongyang Gao, Xingjian Diao, Lucas A. Salas, Jiang Gui
  • Improving LLM Reasoning through Interpretable Role-Playing Steering
    Anyi Wang, Dong Shu, Yifan Wang, Yunpu Ma, Mengnan Du
  • R2A-TLS: Reflective Retrieval-Augmented Timeline Summarization with Causal-Semantic Integration
    Chenlong Bao, Shijie Li, Minghao Hu, Ming Qiao, Bin Zhang, Jin-Tao Tang, Shasha Li, Ting Wang
  • MedEBench: Diagnosing Reliability in Text-Guided Medical Image Editing
    Minghao Liu, Zhitao He, Zhiyuan Fan, Qingyun Wang, Yi R. Fung
  • FairCoT: Enhancing Fairness in Text-to-Image Generation via Chain of Thought Reasoning with Multimodal Large Language Models
    Zahraa Al Sahili, Ioannis Patras, Matthew Purver
  • Bag of Tricks for Sparse Mixture-of-Experts: A Benchmark Across Reasoning, Efficiency, and Safety
    Mufan Qiu, Zheyu Shen, Pingzhi Li, Ang Li, Tianlong Chen
  • Don’t Take the Premise for Granted: Evaluating the Premise Critique Ability of Large Language Models
    Jinzhe Li, Gengxu Li, Yi Chang, Yuan Wu
  • Mitigating Geospatial Knowledge Hallucination in Large Language Models: Benchmarking and Dynamic Factuality Aligning
    Shengyuan Wang, Jie Feng, Tianhui Liu, Dan Pei, Yong Li
  • The Power of Framing: How News Headlines Guide Search Behavior
    Amrit Poudel, Maria Milkowski, Tim Weninger
  • DivLogicEval: A Framework for Benchmarking Logical Reasoning Evaluation in Large Language Models
    Tsz Ting Chung, Lemao Liu, Mo Yu, Dit-Yan Yeung
  • THCM-CAL: Temporal-Hierarchical Causal Modelling with Conformal Calibration for Clinical Risk Prediction
    Xin Zhang, Qiyu Wei, Yingjie Zhu, Fanyi Wu, Sophia Ananiadou
  • GenPilot: A Multi-Agent System for Test-Time Prompt Optimization in Image Generation
    Wen Ye, Zhaocheng Liu, Gui Yuwei, Tingyu Yuan, Yunyue Su, Bowen Fang, Chaoyang Zhao, Qiang Liu, Liang Wang
  • Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
    Haibo Wang, Zhiyang Xu, Yu Cheng, Shizhe Diao, Yufan Zhou, Yixin Cao, Qifan Wang, Weifeng Ge, Lifu Huang
  • DongbaMIE: A Multimodal Information Extraction Dataset for Evaluating Semantic Understanding of Dongba Pictograms
    Xiaojun Bi, Shuo Li, Junyao Xing, Ziyue Wang, Fuwen Luo, Weizheng Qiao, Lu Han, Ziwei Sun, Peng Li, Yang Liu
  • Optimizing Cross-Client Domain Coverage for Federated Instruction Tuning of Large Language Models
    Zezhou Wang, Yaxin Du, Xingjun Ma, Yu-Gang Jiang, Zhuzhong Qian, Siheng Chen
  • Aligning Black-Box LLMs for Aspect Sentiment Quad Prediction
    Shichen Li, Jiawei Zhang, Zhongqing Wang, Peifeng Li
  • Multifaceted Evaluation of Audio-Visual Capability for MLLMs: Effectiveness, Efficiency, Generalizability and Robustness
    Yusheng Zhao, Xiao Luo, Junyu Luo, Weizhi Zhang, Zhiping Xiao, Wei Ju, Philip S. Yu, Ming Zhang
  • Two Steps from Hell: Compositionality on Chemical LMs
    Veronika Ganeeva, Kuzma Khrabrov, Artur Kadurin, Elena Tutubalina
  • GTA: Supervised-Guided Reinforcement Learning for Text Classification with Large Language Models
    Min Zeng, Jingfei Sun, Xueyou Luo, Shiqi Zhang, Li Xie, Caiquan Liu, Xiaoxin Chen
  • Unearthing Gems from Stones: Policy Optimization with Negative Sample Augmentation for LLM Reasoning
    Zhaohui Yang, Yuxiao Ye, Shilei Jiang, Shihong Deng, Chen Hu, Linjing Li, Daxin Jiang
  • LEAF: Large Language Diffusion Model for Time Series Forecasting
    Yuhang Pei, Yifan Wang, Tao Ren, Zhipeng Sun, Wei Ju, Chong Chen, Xian-Sheng Hua, Xiao Luo
  • SPFT-SQL: Enhancing Large Language Model for Text-to-SQL Parsing by Self-Play Fine-Tuning
    Yuhao Zhang, Shaoming Duan, Jinhang Su, Chuanyi Liu, Peiyi Han
  • Multilingual Verbalisation of Knowledge Graphs
    Yifei Song, William Soto Martinez, Anna Nikiforovskaya, Evan Parker Kelly Chapple, Claire Gardent
  • LAGCL4Rec: When LLMs Activate Interactions Potential in Graph Contrastive Learning for Recommendation
    Leqi Zheng, Chaokun Wang, Canzhi Chen, Jiajun Zhang, Cheng Wu, Zixin Song, Shannan Yan, Ziyang Liu, Hongwei Li
  • English as Defense Proxy: Mitigating Multilingual Jailbreak via Eliciting English Safety Knowledge
    Zekai Zhang, Yiduo Guo, Jiuheng Lin, Shanghaoran Quan, Huishuai Zhang, Dongyan Zhao
  • Dagger Behind Smile: Fool LLMs with a Happy Ending Story
    Xurui Song, Zhixin Xie, Shuo Huai, Jiayi Kong, Jun Luo
  • Mitigating Object Hallucinations in MLLMs via Multi-Frequency Perturbations
    Shuo Li, Jiajun Sun, Guodong Zheng, Xiaoran Fan, Yujiong Shen, Yi Lu, Zhiheng Xi, Yuming Yang, Wenming Tan, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang
  • Natural Context Drift Undermines the Natural Language Understanding of Large Language Models
    Yulong Wu, Viktor Schlegel, Riza Batista-Navarro
  • Minimal Ranks, Maximum Confidence: Parameter-efficient Uncertainty Quantification for LoRA
    Patryk Marszałek, Klaudia Bałazy, Jacek Tabor, Tomasz Kuśmierczyk
  • Chain-of-Thought Prompting Obscures Hallucination Cues in Large Language Models: An Empirical Evaluation
    Jiahao Cheng, Tiancheng Su, Jia Yuan, Guoxiu He, Jiawei Liu, Xinqi Tao, Jingwen Xie, Huaxia Li
  • Large Language Model Evaluation via Matrix Nuclear-Norm
    Yahan Li, Tingyu Xia, Yuan Wu, Yi Chang
  • From Grounding to Manipulation: Case Studies of Foundation Model Integration in Embodied Robotic Systems
    Xiuchao Sui, Daiying Tian, Qi Sun, Ruirui Chen, Dongkyu Choi, Kenneth Kwok, Soujanya Poria
  • Flexible Thinking for Multimodal Emotional Support Conversation via Reinforcement Learning
    Fanfan Wang, Xiangqing Shen, Jianfei Yu, Rui Xia
  • ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion
    Rana Shahroz, Dongwen Tang, Pingzhi Li, Kai Wang, Tianlong Chen
  • NLoRA: Nyström-Initiated Low-Rank Adaptation for Large Language Models
    Chenlu Guo, Yi Chang, Yuan Wu
  • Bhaasha, Bhāṣā, Zaban: A Survey for Low-Resourced Languages in South Asia – Current Stage and Challenges
    Sampoorna Poria, Xiaolei Huang
  • DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data
    Yuhang Zhou, Jing Zhu, Shengyi Qian, Zhuokai Zhao, Xiyao Wang, Xiaoyu Liu, Ming Li, Paiheng Xu, Wei Ai, Furong Huang
  • What Makes for Good Image Captions?
    Delong Chen, Samuel Cahyawijaya, Etsuko Ishii, Ho Shu Chan, Yejin Bang, Pascale Fung
  • What’s Not Said Still Hurts: A Description-Based Evaluation Framework for Measuring Social Bias in LLMs
    Jinhao Pan, Chahat Raj, Ziyu Yao, Ziwei Zhu
  • Identifying Rare Languages in Common Crawl Data is a Needles-in-a-Haystack Problem
    Rasul Dent, Pedro Ortiz Suarez, Thibault Clérice, Benoît Sagot
  • Training Language Models to Critique With Multi-agent Feedback
    Tian Lan, Wenwei Zhang, Chengqi Lyu, Shuaibin Li, Chen Xu, Heyan Huang, Dahua Lin, Xian-Ling Mao, Kai Chen
  • RELIC: Enhancing Reward Model Generalization for Low-Resource Indic Languages with Few-Shot Examples
    Soumya Suvra Ghosal, Vaibhav Singh, Akash Ghosh, Soumyabrata Pal, Subhadip Baidya, Sriparna Saha, Dinesh Manocha
  • Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering
    Jihao Zhao, Chunlai Zhou, Daixuan Li, Shuaishuai Zu, Biao Qin
  • SQLSpace: A Representation Space for Text-to-SQL to Discover and Mitigate Robustness Gaps
    Neha Srikanth, Victor Bursztyn, Puneet Mathur, Ani Nenkova
  • One More Modality: Does Abstract Meaning Representation Benefit Visual Question Answering?
    Shira Wein, Emma Markle, Abhidip Bhattacharyya
  • DP-GTR: Differentially Private Prompt Protection via Group Text Rewriting
    Mingchen Li, Heng Fan, Song Fu, Junhua Ding, Yunhe Feng
  • Legal Mathematical Reasoning with LLMs: Procedural Alignment through Two-Stage Reinforcement Learning
    Kepu Zhang, Guofu Xie, Weijie Yu, Mingyue Xu, Xu Tang, Yaxin Li, Jun Xu
  • ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World Challenges
    Cheng Qian, Hongyi Du, Hongru Wang, Xiusi Chen, Yuji Zhang, Avirup Sil, ChengXiang Zhai, Kathleen McKeown, Heng Ji
  • Beyond Coarse Labels: Fine-Grained Problem Augmentation and Multi-Dimensional Feedback for Emotional Support Conversation
    Yuanchen Shi, Jiawang Hao, Fang Kong
  • FinHEAR: Human Expertise and Adaptive Risk-Aware Temporal Reasoning for Financial Decision-Making
    Jiaxiang Chen, Mingxi Zou, Zhuo Wang, Qifan Wang, Danny Dongning Sun, Zhang Chi, Zenglin Xu
  • EvolKV: Evolutionary KV Cache Compression for LLM Inference
    Bohan Yu, Yekun Chai
  • A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models
    Dong Shu, Xuansheng Wu, Haiyan Zhao, Daking Rai, Ziyu Yao, Ninghao Liu, Mengnan Du
  • Large Vision-Language Model Alignment and Misalignment: A Survey Through the Lens of Explainability
    Dong Shu, Haiyan Zhao, Jingyu Hu, Weiru Liu, Ali Payani, Lu Cheng, Mengnan Du
  • Attention Consistency for LLMs Explanation
    Tian Lan, Jinyuan Xu, Xue He, Jenq-Neng Hwang, Lei Li
  • Confusion is the Final Barrier: Rethinking Jailbreak Evaluation and Investigating the Real Misuse Threat of LLMs
    Yu Yan, Sheng Sun, Zhe Wang, Yijun Lin, Zenghao Duan, Zhifei Zheng, Min Liu, Zhiyi Yin, Jianping Zhang
  • CCL-XCoT: An Efficient Cross-Lingual Knowledge Transfer Method for Mitigating Hallucination Generation
    Zheng Weihua, Roy Ka-Wei Lee, Zhengyuan Liu, Wu Kui, AiTi Aw, Bowei Zou
  • Evaluating Step-by-step Reasoning Traces: A Survey
    Jinu Lee, Julia Hockenmaier
  • Beyond Guilt: Legal Judgment Prediction with Trichotomous Reasoning
    Kepu Zhang, Haoyue Yang, Xu Tang, Weijie Yu, Jun Xu
  • Not Every Token Needs Forgetting: Selective Unlearning Balancing Forgetting and Utility in Large Language Models
    Yixin Wan, Anil Ramakrishna, Kai-Wei Chang, Volkan Cevher, Rahul Gupta
  • DisastIR: A Comprehensive Information Retrieval Benchmark for Disaster Management
    Kai Yin, Xiangjue Dong, Chengkai Liu, Lipai Huang, Yiming Xiao, Zhewei Liu, Ali Mostafavi, James Caverlee
  • Data or Language Supervision: What Makes CLIP Better than DINO?
    Yiming Liu, Yuhui Zhang, Dhruba Ghosh, Ludwig Schmidt, Serena Yeung-Levy
  • Do LLMs Understand Wine Descriptors Across Cultures? A Benchmark for Cultural Adaptations of Wine Reviews
    Chenye Zou, Xingyue Wen, Tianyi Hu, Qian Janice Wang, Daniel Hershcovich
  • DeFT-X: Denoised Sparse Fine-Tuning for Zero-Shot Cross-Lingual Transfer
    Sona Elza Simon, Preethi Jyothi
  • Memory-enhanced Large Language Model for Cross-lingual Dependency Parsing via Deep Hierarchical Syntax Understanding
    Jianjian Liu, Ying Li, Zhengtao Yu, Shun Su, Shengxiang Gao, Yuxin Huang
  • Developing and Utilizing a Large-Scale Cantonese Dataset for Multi-Tasking in Large Language Models
    Jiyue Jiang, Alfred Kar Yin Truong, Yanyu Chen, Qinghang Bao, Sheng Wang, Pengan Chen, Jiuming Wang, Lingpeng Kong, Yu Li, Chuan Wu
  • A Structured Framework for Evaluating and Enhancing Interpretive Capabilities of Multimodal LLMs in Culturally Situated Tasks
    Haorui Yu, Ramon Ruiz-Dolz, Qiufeng Yi
  • Train a Unified Multimodal Data Quality Classifier with Synthetic Data
    Weizhi Wang, Rongmei Lin, Shiyang Li, Colin Lockard, Ritesh Sarkhel, Sanket Lokegaonkar, Jingbo Shang, Xifeng Yan, Nasser Zalmout, Xian Li
  • Self-Improvement in Multimodal Large Language Models: A Survey
    Shijian Deng, Kai Wang, Tianyu Yang, Harsh Singh, Yapeng Tian
  • Towards Achieving Concept Completeness for Textual Concept Bottleneck Models
    Milan Bhan, Yann Choho, Jean-Noël Vittaut, Nicolas Chesneau, Pierre Moreau, Marie-Jeanne Lesot
  • EmoBench-UA: A Benchmark Dataset for Emotion Detection in Ukrainian
    Daryna Dementieva, Nikolay Babakov, Alexander Fraser
  • Scientific Paper Retrieval with LLM-Guided Semantic-Based Ranking
    Yunyi Zhang, Ruozhen Yang, Siqi Jiao, SeongKu Kang, Jiawei Han
  • DLIR: Spherical Adaptation for Cross-Lingual Knowledge Transfer of Sociological Concepts Alignment
    Zeqiang Wang, Jon Johnson, Suparna De
  • Test-Time Steering for Lossless Text Compression via Weighted Product of Experts
    Qihang Zhang, Muchen Li, Ziao Wang, Renjie Liao, Lele Wang
  • Zero-Shot Contextual Embeddings via Offline Synthetic Corpus Generation
    Philip Lippmann, Jie Yang
  • The Hallucination Tax of Reinforcement Finetuning
    Linxin Song, Taiwei Shi, Jieyu Zhao
  • Tracing Multilingual Factual Knowledge Acquisition in Pretraining
    Yihong Liu, Mingyang Wang, Amir Hossein Kargaran, Felicia Körner, Ercong Nie, Barbara Plank, François Yvon, Hinrich Schuetze
  • Exploring the Vulnerability of the Content Moderation Guardrail in Large Language Models via Intent Manipulation
    Jun Zhuang, Haibo Jin, Ye Zhang, Zhengjian Kang, Wenbin Zhang, Gaby G. Dagher, Haohan Wang
  • Examining Multilingual Embedding Models Cross-Lingually Through LLM-Generated Adversarial Examples
    Andrianos Michail, Simon Clematide, Rico Sennrich
  • EmoGist: Efficient In-Context Learning for Visual Emotion Understanding
    Ronald Seoh, Dan Goldwasser
  • Soft Token Attacks Cannot Reliably Audit Unlearning in Large Language Models
    Haokun Chen, Sebastian Szyller, Weilin Xu, Nageen Himayat
  • Bridging the Editing Gap in LLMs: FineEdit for Precise and Targeted Text Modifications
    Yiming Zeng, Wanhao Yu, Zexin Li, Tao Ren, Yu Ma, Jinghan Cao, Xiyan Chen, Tingting Yu
  • LLM-based Conversational Recommendation Agents with Collaborative Verbalized Experience
    Yaochen Zhu, Harald Steck, Dawen Liang, Yinhan He, Nathan Kallus, Jundong Li
  • Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference
    Hao Mark Chen, Wayne Luk, Yiu Ka Fai Cedric, Rui Li, Konstantin Mishchenko, Stylianos Venieris, Hongxiang Fan
  • Measuring Sycophancy of Language Models in Multi-turn Dialogues
    Jiseung Hong, Grace Byun, Seungone Kim, Kai Shu
  • On the Role of Entity and Event Level Conceptualization in Generalizable Reasoning: A Survey of Tasks, Methods, Applications, and Future Directions
    Weiqi Wang, Tianqing Fang, Haochen Shi, Baixuan Xu, Wenxuan Ding, Liyu Zhang, Wei Fan, Jiaxin Bai, Haoran Li, Xin Liu, Yangqiu Song
  • Mitigating Visual Knowledge Forgetting in MLLM Instruction-tuning via Modality-decoupled Gradient Descent
    Junda Wu, Yuxin Xiong, Xintong Li, Yu Xia, Yu Wang, Tong Yu, Sungchul Kim, Ryan A. Rossi, Lina Yao, Jingbo Shang, Julian McAuley
  • PathoHR: Hierarchical Reasoning for Vision-Language Models in Pathology
    Yating Huang, Ziyan Huang, Lintao Xiang, Qijun Yang, Hujun Yin
  • “What’s Up, Doc?”: Analyzing How Users Seek Health Information in Large-Scale Conversational AI Datasets
    Akshay Paruchuri, Maryam Aziz, Rohit Vartak, Ayman Ali, Best Uchehara, Xin Liu, Ishan Chatterjee, Monica Agrawal
  • Dynamic Evaluation for Oversensitivity in LLMs
    Sophia Xiao Pu, Sitao Cheng, Xin Eric Wang, William Yang Wang
  • Self-Correcting Code Generation Using Small Language Models
    Jeonghun Cho, Deokhyung Kang, Hyounghun Kim, Gary Lee
  • A Unified Framework for N-ary Property Information Extraction in Materials Science
    Van-Thuy Phi, Yuji Matsumoto
  • A Benchmark for Translations Across Styles and Language Variants
    Xin Tan, Bowei Zou, AiTi Aw
  • ManuSearch: Democratizing Deep Search in Large Language Models with a Transparent and Open Multi-Agent Framework
    Lisheng Huang, Yichen Liu, Jinhao Jiang, Rongxiang Zhang, Jiahao Yan, Junyi Li, Xin Zhao
  • Proactive User Information Acquisition via Chats on User-Favored Topics
    Shiki Sato, Jun Baba, Asahi Hentona, Shinji Iwata, Akifumi Yoshimoto, Koichiro Yoshino
  • Evaluating Text Generation Quality Using Spectral Distances of Surprisal
    Zhichen Liu, Yongyuan Li, Yang Xu, Yu Wang, Yingfang Yuan, Zuhao Yang
  • NLP-ADBench: NLP Anomaly Detection Benchmark
    Yuangang Li, Jiaqi Li, Zhuo Xiao, Tiankai Yang, Yi Nian, Xiyang Hu, Yue Zhao
  • Toward Inclusive Language Models: Sparsity-Driven Calibration for Systematic and Interpretable Mitigation of Social Biases in LLMs
    Prommy Sultana Hossain, Chahat Raj, Ziwei Zhu, Jessica Lin, Emanuela Marasco
  • Table-Text Alignment: Explaining Claim Verification Against Tables in Scientific Papers
    Xanh Ho, Sunisth Kumar, Yun-Ang Wu, Florian Boudin, Atsuhiro Takasu, Akiko Aizawa
  • DCRM: A Heuristic to Measure Response Pair Quality in Preference Optimization
    Chengyu Huang, Tanya Goyal
  • Advancing Reasoning with Off-the-Shelf LLMs: A Semantic Structure Perspective
    Pengfei He, Zitao Li, Yue Xing, Yaliang Li, Jiliang Tang, Bolin Ding
  • LLM-based Open Domain Planning by Leveraging Entity-Attribute-Level Domain Models
    Dongning Rao, Songlin He, Ruishi Liang, Zhihua Jiang
  • DICP: Deep In-Context Prompt for Event Causality Identification
    Lin Mu, Jun Shen, Li Ni, Lei Sang, Zhize Wu, Peiquan Jin, Yiwen Zhang
  • Seeing is Believing: Emotion-Aware Audio-Visual Language Modeling for Expressive Speech Generation
    Weiting Tan, Jiachen Lian, Hirofumi Inaguma, Paden Tomasello, Philipp Koehn, Xutai Ma
  • GRV-KBQA: A Three-Stage Framework for Knowledge Base Question Answering with Decoupled Logical Structure, Semantic Grounding and Structure-Aware Validation
    Yuhang Tian, Pan Yang, Dandan Song, Zhijing Wu, Hao Wang
  • Improving Prompt Generalization for Cross-prompt Essay Trait Scoring from the Scoring-invariance Perspective
    Jiong Wang, Shengquan Yu
  • When Format Changes Meaning: Investigating Semantic Inconsistency of Large Language Models
    Cheongwoong Kang, Jongeun Baek, Yeonjea Kim, Jaesik Choi
  • ASTPrompter: Preference-Aligned Automated Language Model Red-Teaming to Generate Low-Perplexity Unsafe Prompts
    Amelia Hardy, Houjun Liu, Allie Griffith, Bernard Lange, Duncan Eddy, Mykel Kochenderfer
  • How Do Large Language Models Perform on PDE Discovery: A Coarse-to-fine Perspective
    Xiao Luo, Changhu Wang, Yizhou Sun, Wei Wang
  • Rethinking Data Selection at Scale: Random Selection is Almost All You Need
    Tingyu Xia, Bowen Yu, Kai Dang, An Yang, Yuan Wu, Yi Chang, Junyang Lin
  • PromptKeeper: Safeguarding System Prompts for LLMs
    Zhifeng Jiang, Zhihua Jin, Guoliang He
  • Automating eHMI Action Design with LLMs for Automated Vehicle Communication
    Ding Xia, Xinyue Gui, Fan Gao, Dongyuan Li, Mark Colley, Takeo Igarashi
  • A Dynamic Fusion Model for Consistent Crisis Response
    Xiaoying Song, Anirban Saha Anik, Eduardo Blanco, Vanessa Frias-Martinez, Lingzi Hong
  • UIOrchestra: Generating High-Fidelity Code from UI Designs with a Multi-agent System
    Chuhuai Yue, Jiajun Chai, Yufei Zhang, Zixiang Ding, Xihao Liang, Peixin Wang, Shihai Chen, Wang Yixuan, wangyanping, Wei Lin, Guojun Yin
  • CrossQG: Improving Difficulty-Controllable Question Generation through Consistency Enhancement
    Kunze Li, Yu Zhang
  • Progressive Facial Granularity Aggregation with Bilateral Attribute-based Enhancement for Face-to-Speech Synthesis
    Yejin Jeon, Youngjae Kim, Jihyun Lee, Hyounghun Kim, Gary Lee
  • Speaking at the Right Level: Literacy-Controlled Counterspeech Generation with RAG-RL
    Xiaoying Song, Anirban Saha Anik, Dibakar Barua, Pengcheng Luo, Junhua Ding, Lingzi Hong
  • FNSCC: Fuzzy Neighborhood-Aware Self-Supervised Contrastive Clustering for Short Text
    Zijian Zheng, Yonghe Lu, Jian Yin
  • AuraDial: A Large-Scale Human-Centric Dialogue Dataset for Chinese AI Psychological Counseling
    Xiantao Zhang
  • TS-SQL: Test-driven Self-refinement for Text-to-SQL
    Wenbo Xu, Haifeng Zhu, Liang Yan, Chuanyi Liu, Peiyi Han, Shaoming Duan, Jeff Z. Pan
  • DemonAgent: Dynamically Encrypted Multi-Backdoor Implantation Attack on LLM-based Agent
    Pengyu Zhu, Zhenhong Zhou, Yuanhe Zhang, Shilinlu Yan, Kun Wang, Sen Su
  • MotivGraph-SoIQ: Integrating Motivational Knowledge Graphs and Socratic Dialogue for Enhanced LLM Ideation
    Xinping Lei, Tong Zhou, Yubo Chen, Kang Liu, Jun Zhao
  • ExpertGenQA: Open-ended QA generation in Specialized Domains
    Haz Sameen Shahgir, Chansong Lim, Jia Chen, Evangelos E. Papalexakis, Yue Dong
  • VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation
    Yuansheng Ni, Ping Nie, Kai Zou, Xiang Yue, Wenhu Chen
  • Conversational Education at Scale: A Multi-LLM Agent Workflow for Procedural Learning and Pedagogic Quality Assessment
    Jiahuan Pei, Fanghua Ye, Xin Sun, Wentao Deng, Koen Hindriks, Junxiao Wang
  • Visual Program Distillation with Template-Based Augmentation
    Michal Shlapentokh-Rothman, Yu-Xiong Wang, Derek Hoiem
  • NeighXLM: Enhancing Cross-Lingual Transfer in Low-Resource Languages via Neighbor-Augmented Contrastive Pretraining
    Sicheng Wang, Wenyi Wu, Zibo Zhang
  • ICLER: Intent CLassification with Enhanced Reasoning
    Dezheng Gao, Dong Xiaozheng, Shuangtao Yang, Bo Fu
  • PreGenie: An Agentic Framework for High-quality Visual Presentation Generation
    Xiaojie Xu, Xinli Xu, Sirui Chen, Haoyu Chen, Fan Zhang, Ying-Cong Chen
  • RIVAL: Reinforcement Learning with Iterative and Adversarial Optimization for Machine Translation
    Tianjiao Li, Mengranyu, Chenyu Shi, Yanjun Zhao, Xiaojing Liu, Qi Zhang, Xuanjing Huang, Qiang Zhang, Jiayin Wang
  • MRAG: A Modular Retrieval Framework for Time-Sensitive Question Answering
    Siyue Zhang, Yuxiang Xue, Yiming Zhang, Xiaobao Wu, Anh Tuan Luu, Chen Zhao
  • CoT-RAG: Integrating Chain of Thought and Retrieval-Augmented Generation to Enhance Reasoning in Large Language Models
    Feiyang Li, Peng Fang, Zhan Shi, Arijit Khan, Fang Wang, Weihao Wang, zhangxin-hw, Cui Yongjian
  • TabDSR: Decompose, Sanitize, and Reason for Complex Numerical Reasoning in Tabular Data
    Changjiang Jiang, Fengchang Yu, Haihua Chen, Wei Lu, Jin Zeng
  • Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision
    Dawei Zhu, Xiyu Wei, Guangxiang Zhao, Wenhao Wu, Haosheng Zou, Junfeng Ran, XWang, Lin Sun, Xiangzheng Zhang, Sujian Li
  • Multimodal Document-level Triple Extraction via Dynamic Graph Enhancement and Relation-Aware Reflection
    Xiang Li, Runhai Jiao, Zhou Changyu, Shoupeng Qiao, Ruojiao Qiao, Ruifan Li
  • Distill Visual Chart Reasoning Ability from LLMs to MLLMs
    Wei He, Zhiheng Xi, Wanxu Zhao, Xiaoran Fan, Yiwen Ding, Zifei Shan, Tao Gui, Qi Zhang, Xuanjing Huang
  • FlowMalTrans: Unsupervised Binary Code Translation for Malware Detection Using Flow-Adapter Architecture
    Minghao Hu, Junzhe Wang, Weisen Zhao, Qiang Zeng, Lannan Luo
  • AdaTP: Attention-Debiased Token Pruning for Video Large Language Models
    Fengyuan Sun, Leqi Shen, Hui Chen, Sicheng Zhao, Jungong Han, Guiguang Ding
  • AdaptFlow: Adaptive Workflow Optimization via Meta-Learning
    Runchuan Zhu, Bowen Jiang, Lingrui Mei, Fangkai Yang, Lu Wang, Haoxiang Gao, Fengshuo Bai, Pu Zhao, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang
  • LMUNIT: Fine-grained Evaluation with Natural Language Unit Tests
    Jon Saad-Falcon, Rajan Pathe Vivek, William Berrios, Nandita Shankar Naik, Matija Franklin, Bertie Vidgen, Amanpreet Singh, Douwe Kiela, Shikib Mehri
  • ThinkAnswer Loss: Balancing Semantic Similarity and Exact Matching for LLM Reasoning Enhancement
    Shan Yang, Kun Wu, Zeju Li, Linlin Zhang, Xiangyu Pei, Leike An, Yu Liu
  • Detecting Stealthy Backdoor Samples based on Intra-class Distance for Large Language Models
    Jinwen Chen, Hainan Zhang, Fei Sun, Qinnan Zhang, Sijia Wen, Ziwei Wang, Zhiming Zheng
  • Rust-doctor: Enhanced Feature for Rust Ownership and Lifetime Repair with Balanced Training Data Generation
    Wenzhang Yang, Xiaoning Ren, Cuifeng Gao, Yinxing Xue
  • SLIM: Subtrajectory-Level Elimination for More Effective Reasoning
    Xifeng Yao, Chengyuan Ma, Dongyu Lang, Yinhao Ni, Zhiwei Xu, Huarui Xie, Zihao Chen, Guang Shen, Dandan Tu, Yi Bai, Changzheng Zhang
  • From Cross-Task Examples to In-Task Prompts: A Graph-Based Pseudo-Labeling Framework for In-context Learning
    Zihan Chen, Song Wang, Xingbo Fu, Chengshuai Shi, Zhenyu Lei, Cong Shen, Jundong Li
  • Instance-level Randomization: Toward More Stable LLM Evaluations
    Yiyang Li, Yonghuang Wu, Ying Luo, Liangtai Sun, Zishu Qin, Lin Qiu, Xuezhi Cao, Xunliang Cai
  • Not All Voices Are Rewarded Equally: Probing and Repairing Reward Models across Human Diversity
    Zihao Li, Feihao Fang, Xitong Zhang, Jiaru Zou, Zhining Liu, Wei Xiong, Ziwei Wu, Baoyu Jing, Jingrui He
  • PAMN: Multi-phase Correlation Modeling for Contrast-Enhanced 3D Medical Image Retrieval
    Haonan Tong, Ke Liu, Chuang Zhang, Xinglin Zhang, Tao Chen, Jenq-Neng Hwang, Lei Li
  • Safety in Large Reasoning Models: A Survey
    Cheng Wang, Yue Liu, Baolong Bi, Duzhen Zhang, Zhong-Zhi Li, Yingwei Ma, Yufei He, Shengju Yu, Xinfeng Li, Junfeng Fang, Jiaheng Zhang, Bryan Hooi
  • SafeConf: A Confidence-Calibrated Safety Self-Evaluation Method for Large Language Models
    Bo Zhang, Cong Gao, Linkang Yang, Bingxu Han, Minghao Hu, Zhunchen Luo, Guotong Geng, Xiaoying Bai, Jun Zhang, Wen Yao, Zhong Wang
  • DocAssistant: Integrating Key-region Reading and Step-wise Reasoning for Robust Document Visual Question Answering
    Jinxu Zhang, Qiyuan Fan, Yu Zhang
  • LNE-Blocking: An Efficient Framework for Contamination Mitigation Evaluation on Large Language Models
    Ruijie Hou, Yueyang Jiao, Hanxu Hu, Yingming Li, Wai Lam, Huajian Zhang, Hongyuan Lu
  • Enhancing Hate Speech Classifiers through a Gradient-assisted Counterfactual Text Generation Strategy
    Michael Van Supranes, Shaowen Peng, Shoko Wakamiya, Eiji Aramaki
  • Learning SQL Like a Human: Structure-Aware Curriculum Learning for Text-to-SQL Generation
    Xiaohu Zhu, Qian Li, Lizhen Cui, Yuntao Du
  • Chain-of-Interactions: Multi-step Iterative ICL Framework for Abstractive Task-Oriented Dialogue Summarization of Conversational AI Interactions
    Jason S Lucas, Ali Al Lawati, Mahjabin Nahar, John Chen, Mahnoosh Mehrabani
  • Your Semantic-Independent Watermark is Fragile: A Semantic Perturbation Attack against EaaS Watermark
    Zekun Fei, Biao Yi, Jianing Geng, He Ruiqi, Lihai Nie, Zheli Liu
  • Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models
    Youan Cong, Pritom Saha Akash, Cheng Wang, Kevin Chen-Chuan Chang
  • SKA-Bench: A Fine-Grained Benchmark for Evaluating Structured Knowledge Understanding of LLMs
    Zhiqiang Liu, Enpei Niu, Yin Hua, Mengshu Sun, Lei Liang, Huajun Chen, Wen Zhang
  • $PD^3F$: A Pluggable and Dynamic DoS-Defense Framework against resource consumption attacks targeting Large Language Models
    Yuanhe Zhang, Xinyue Wang, Haoran Gao, Zhenhong Zhou, Fanyu Meng, Yuyao Zhang, Sen Su
  • From Implicit Exploration to Structured Reasoning: Guideline and Refinement for LLMs
    Jiaxiang Chen, Zhuo Wang, Mingxi Zou, Zhucong Li, Zhijian Zhou, Song Wang, Zenglin Xu
  • PIP: Perturbation-based Iterative Pruning for Large Language Models
    Yi Cao, Wei-Jie Xu, Yucheng Shen, Weijie Shi, Chi-Min Chan, Jianfeng Qu, Jiajie Xu
  • Convolutional LoRA Aggregation for Unseen Tasks Adaptation
    Xinhao Wu, Jialin Liu, Yutai Duan, Jie Liu
  • CDT: A Comprehensive Capability Framework for Large Language Models Across Cognition, Domain, and Task
    Haosi Mo, Xinyu Ma, Xuebo Liu, Derek F. Wong, Yu Li, Jie Liu, Min Zhang
  • Multilingual Collaborative Defense for Large Language Models
    Hongliang Li, Jinan Xu, Gengping Cui, Changhao Guan, Fengran Mo, Kaiyu Huang
  • Role-Guided Annotation and Prototype-Aligned Representation Learning for Historical Literature Sentiment Classification
    Hongfei Du, Jiacheng Shi, Jacobo Myerston, Sidi Lu, Gang Zhou, Ashley Gao
  • MetaMixSpeech: Meta Task Augmentation for Low-Resource Speech Recognition
    Yaqi Chen, Hao Zhang, Wenlin Zhang, XuKui Yang, Dan Qu, Yunpeng Liu
  • RECAST: Retrieval-Augmented Contextual ASR via Decoder-State Keyword Spotting
    Ashish Mittal, Sunita Sarawagi, Preethi Jyothi
  • PREE: Towards Harmless and Adaptive Fingerprint Editing in Large Language Models via Knowledge Prefix Enhancement
    Xubin Yue, Zhenhua Xu, Wenpeng Xing, Jiahui Yu, Mohan Li, Meng Han
  • Beyond Spurious Signals: Debiasing Multimodal Large Language Models via Counterfactual Inference and Adaptive Expert Routing
    Zichen Wu, Hsiu-Yuan Huang, Yunfang Wu
  • Text-centric Alignment for Bridging Test-time Unseen Modality
    Yun-Da Tsai, Ting-Yu Yen, Pei-Fu Guo, Zhe-Yan Li, Shou-De Lin
  • HierPrompt: Zero-Shot Hierarchical Text Classification with LLM-Enhanced Prototypes
    Qian Zhang, Qinliang Su, Wei Zhu, Pang Yachun
  • RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs
    Zhongzhan Huang, Guoming Ling, Yupei Lin, Yandong Chen, Shanshan Zhong, Hefeng Wu, Liang Lin
  • Can We Steer Reasoning Direction by Thinking Intervention?
    Xingsheng Zhang, Luxi Xing, Chen Zhang, Yanbing Liu, Yifan Deng, Yunpeng Li, Yue Hu, Chenxu Niu
  • MPO: Boosting LLM Agents with Meta Plan Optimization
    Weimin Xiong, Yifan Song, Qingxiu Dong, Bingchan Zhao, Feifan Song, XWang, Sujian Li
  • Exploring the Generalizability of Factual Hallucination Mitigation via Enhancing Precise Knowledge Utilization
    Siyuan Zhang, Yichi Zhang, Yinpeng Dong, Hang Su
  • Learning What to Remember: Adaptive Probabilistic Memory Retention for Memory-Efficient Language Models
    S M Rafiuddin, Muntaha Nujat Khan
  • Unlocking Smarter Device Control: Foresighted Planning with a World Model-Driven Code Execution Approach
    Xiaoran Yin, Xu Luo, Hao Wu, Lianli Gao, Jingkuan Song
  • RGAR: Recurrence Generation-augmented Retrieval for Factual-aware Medical Question Answering
    Sichu Liang, Linhai Zhang, Hongyu Zhu, Wenwen Wang, Yulan He, Deyu Zhou
  • EcoSafeRAG: Efficient Security through Context Analysis in Retrieval-Augmented Generation
    Ruobing Yao, Yifei Zhang, Shuang Song, Neng Gao, Chenyang Tu
  • StereoDetect: Detecting Stereotypes and Anti-stereotypes the Correct Way Using Social Psychological Underpinnings
    Kaustubh Shivshankar Shejole, Pushpak Bhattacharyya
  • Sparkle: Mastering Basic Spatial Capabilities in Vision Language Models Elicits Generalization to Spatial Reasoning
    Yihong Tang, Ao Qu, Zhaokai Wang, Dingyi Zhuang, Zhaofeng Wu, Wei Ma, Shenhao Wang, Yunhan Zheng, Zhan Zhao, Jinhua Zhao
  • How Does Knowledge Selection Help Retrieval Augmented Generation?
    Xiangci Li, Jessica Ouyang
  • UPLex: Fine-Grained Personality Control in Large Language Models via Unsupervised Lexical Modulation
    Tianlong Li, Xiaoqing Zheng
  • ParetoRAG: Leveraging Sentence-Context Attention for Robust and Efficient Retrieval-Augmented Generation
    Ruobing Yao, Yifei Zhang, Shuang Song, Yuhan Liu, Neng Gao, Chenyang Tu
  • FlexQuant: A Flexible and Efficient Dynamic Precision Switching Framework for LLM Quantization
    Fangxin Liu, Zongwu Wang, Jinhong Xia, Junping Zhao, Jian Liu, Haibing Guan, Li Jiang
  • ReLoop: “Seeing Twice and Thinking Backwards” via Closed-loop Training to Mitigate Hallucinations in Multimodal understanding
    Jianjiang Yang, Yanshu Li, Ziyan Huang
  • Sequence Structure Aware Retriever for Procedural Document Retrieval: A New Dataset and Baseline
    Zhenqi Ye, HaoPeng Ren, Yi Cai, Qingbao Huang, Jing Qin, Pinli Zhu, Songwen Gong
  • The Effect of Language Diversity When Fine-Tuning Large Language Models for Translation
    David Stap, Christof Monz
  • David vs. Goliath: Cost-Efficient Financial QA via Cascaded Multi-Agent Reasoning
    Chenghao Liu, Qian Liu, Ziqin Zhu, Hao Fei, Aniket Mahanti
  • Benchmarking Uncertainty Metrics for LLM Target-Aware Search
    Pei-Fu Guo, Yun-Da Tsai, Shou-De Lin
  • ZOGRASCOPE: A New Benchmark for Semantic Parsing over Property Graphs
    Francesco Cazzaro, Justin Kleindienst, Sofia Márquez Gomez, Ariadna Quattoni
  • FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning
    Ruosen Li, Ziming Luo, Xinya Du
  • Multi-Faceted Self-Consistent Preference Alignment for Query Rewriting in Conversational Search
    Zhiyu Cao, Peifeng Li, Qiaoming Zhu
  • Recipe2Plan: Evaluating Planning Abilities of LLMs for Efficient and Feasible Multitasking with Time Constraints Between Actions
    Zirui Wu, Xiao Liu, Jiayi Li, Lingpeng Kong, Yansong Feng
  • Unlocking the Effectiveness of LoRA-FP for Seamless Transfer Implantation of Fingerprints in Downstream Models
    Zhenhua Xu, Zhaokun Yan, Binhan Xu, Xin Tong, Haitao Xu, Yourong Chen, Meng Han
  • AELC: Adaptive Entity Linking with LLM-Driven Contextualization
    Fang Wang, Zhengwei Tao, Ming Wang, Minghao Hu, Xiaoying Bai
  • MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer
    Honglin Lin, Zhuoshi Pan, Qizhi Pei, Xin Gao, Yu Li, Mengzhang Cai, Conghui He, Lijun Wu
  • GLProtein: Global-and-Local Structure Aware Protein Representation Learning
    Yunqing Liu, Wenqi Fan, Xiaoyong Wei, Li Qing
  • Reward Mixology: Crafting Hybrid Signals for Reinforcement Learning Driven In-Context Learning
    Changshuo Zhang, Ang Gao, Xiao Zhang, Yong Liu, Deyang Li, Fangchao Liu, Xinyu Zhang
  • Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials Characterization
    Zhengzhao Lai, Youbin Zheng, Zhenyang Cai, Haonan Lyu, Jingpu Yang, Hong-Qing Liang, Yan Hu, Benyou Wang
  • GRADE: Generating multi-hop QA and fine-gRAined Difficulty matrix for RAG Evaluation
    Jeongsoo Lee, Daeyong Kwon, Kyohoon Jin
  • FusionDTI: Fine-grained Binding Discovery with Token-level Fusion for Drug-Target Interaction
    Zhaohan Meng, Zaiqiao Meng, Ke Yuan, Iadh Ounis
  • A Survey on Training-free Alignment of Large Language Models
    Birong Pan, Yongqi Li, Weiyu Zhang, Wenpeng Lu, Mayi Xu, Shen Zhou, Yuanyuan Zhu, Ming Zhong, Tieyun Qian
  • CIVET: Systematic Evaluation of Understanding in VLMs
    Massimo Rizzoli, Simone Alghisi, Olha Khomyn, Gabriel Roccabruna, Seyed Mahed Mousavi, Giuseppe Riccardi
  • How Does Cognitive Bias Affect Large Language Models? A Case Study on the Anchoring Effect in Price Negotiation Simulations
    Yoshiki Takenami, Yin Jou Huang, Yugo Murawaki, Chenhui Chu
  • Enhancing Speech-to-Speech Dialogue Modeling with End-to-End Retrieval-Augmented Generation
    Pengchao Feng, Ziyang Ma, Wenxi Chen, Yao Li, Sheng Wang, Kai Yu, Xie Chen
  • Backdoor-Powered Prompt Injection Attacks Nullify Defense Methods
    Yulin Chen, Haoran Li, Yuan Sui, Yangqiu Song, Bryan Hooi
  • Path-enhanced Pre-trained Language Model for Knowledge Graph Completion
    Hao Wang, Dandan Song, Zhijing Wu, Yuhang Tian, Pan Yang
  • Zero-shot Cross-lingual NER via Mitigating Language Difference: An Entity-aligned Translation Perspective
    Zhihao Zhang, Sophia Yat Mei Lee, Dong Zhang, Shoushan Li, Guodong Zhou
  • Zero-Shot Cross-Domain Aspect-Based Sentiment Analysis via Domain-Contextualized Chain-of-Thought Reasoning
    Chuming Shen, Wei Wei, Dong Wang, Zhong-Hao Wang
  • Tree of Agents: Improving Long-Context Capabilities of Large Language Models through Multi-Perspective Reasoning
    Song Yu, Xiaofei Xu, Ke Deng, Li Li, Lin Tian
  • Cross-Cultural Transfer of Commonsense Reasoning in LLMs: Evidence from the Arab World
    Saeed Almheiri, Rania Elbadry, Mena Attia, Chenxi Wang, Preslav Nakov, Timothy Baldwin, Fajri Koto
  • Enhancing Partially Relevant Video Retrieval with Robust Alignment Learning
    Long Zhang, Peipei Song, Jianfeng Dong, Kun Li, Xun Yang
  • Multi-level Diagnosis and Evaluation for Robust Tabular Feature Engineering with Large Language Models
    Yebin Lim, Susik Yoon
  • Prejudge-Before-Think: Enhancing Large Language Models at Test-Time by Process Prejudge Reasoning
    Jianing Wang, Jin Jiang, Yang Liu, Mengdi Zhang, Xunliang Cai
  • FroM: Frobenius Norm-Based Data-Free Adaptive Model Merging
    Zijian Li, Xiaocheng Feng, Huixin Liu, Yichong Huang, Ting Liu, Bing Qin
  • Dynamic Simulation Framework for Disinformation Dissemination and Correction With Social Bots
    Boyu Qiao, Kun Li, Wei Zhou, Songlin Hu
  • Beyond the First Error: Process Reward Models for Reflective Mathematical Reasoning
    Zhaohui Yang, Chenghua He, Xiaowen Shi, Shihong Deng, Linjing Li, Qiyue Yin, Daxin Jiang
  • PrAd: Prompt Adaptive Tuning for Decoder-only Language Models
    Youneng Ma, Junyi He, Haojun Fei
  • Personalized Question Answering with User Profile Generation and Compression
    Hang Su, Yun Yang, Tianyang Liu, Xin Liu, Peng Pu, Xuesong Lu
  • Dream to Chat: Model-based Reinforcement Learning on Dialogues with User Belief Modeling
    Yue Zhao, Xiaoyu Wang, Dan Wang, Zhonglin Jiang, Qingqing Gu, Teng Chen, Ningyuan Xi, Jinxian Qu, Yong Chen, Luo Ji
  • FakeSV-VLM: Taming VLM for Detecting Fake Short-Video News via Progressive Mixture-Of-Experts Adapter
    JunXi Wang, Yaxiong Wang, Lechao Cheng, Zhun Zhong
  • Beyond Inherent Cognition Biases in LLM-Based Event Forecasting: A Multi-Cognition Agentic Framework
    Zhen Wang, Xi Zhou, Yating Yang, Bo Ma, Lei Wang, Rui Dong, Azmat Anwar
  • Breaking the Reviewer: Assessing the Vulnerability of Large Language Models in Automated Peer Review Under Textual Adversarial Attacks
    Tzu-Ling Lin, Wei-Chih Chen, Teng-Fang Hsiao, Hou-I Liu, Ya-Hsin Yeh, Yu-Kai Chan, Wen-Sheng Lien, Po-Yen Kuo, Philip S. Yu, Hong-Han Shuai
  • Watermarking with Low-Entropy POS-Guided Token Partitioning and Z-Score-Driven Dynamic Bias for Large Language Models
    He Li, Xiaojun Chen, Zhendong Zhao, Yunfei Yang, Xin Zhao, Jingcheng He
  • Knowledge Graph-Driven Memory Editing with Directional Interventions
    Jinhu Fu, Kun Wang, Chongye Guo, Junfeng Fang, Wentao Zhang, Sen Su
  • DTDES-KGE: Dual-Teacher Knowledge Distillation with Distinct Embedding Spaces for Knowledge Graph Embeddings
    Bofan Wei, Hongyuan Xu, Yuhang Niu, Jiarui Ren, Yanlong Wen, Xiaojie Yuan
  • LLMEval-Med: A Real-world Clinical Benchmark for Medical LLMs with Physician Validation
    Ming Zhang, Yujiong Shen, Zelin Li, Huayu Sha, Binze Hu, Yuhui Wang, Chenhao Huang, Shichun Liu, Jingqi Tong, Changhao Jiang, Mingxu Chai, Zhiheng Xi, Shihan Dou, Tao Gui, Qi Zhang, Xuanjing Huang
  • Watermark Smoothing Attacks against Language Models
    Hongyan Chang, Hamed Hassani, Reza Shokri
  • PICD-Instruct: A Generative Instruction Learning Framework for Few-Shot Multi-Intent Spoken Language Understanding
    Wenbin Hua, Rui Fan, Tingting He, Ming Dong
  • Forewarned is Forearmed: Pre-Synthesizing Jailbreak-like Instructions to Enhance LLM Safety Guardrail to Potential Attacks
    Sheng Liu, Qiang Sheng, Danding Wang, Yang Li, Guang Yang, Juan Cao
  • Are Knowledge and Reference in Multilingual Language Models Cross-Lingually Consistent?
    Xi Ai, Mahardika Krisna Ihsani, Min-Yen Kan
  • Krikri: Advancing Open Large Language Models for Greek
    Dimitris Roussis, Leon Voukoutis, Georgios Paraskevopoulos, Sokratis Sofianopoulos, Prokopis Prokopidis, Vassilis Papavassileiou, Athanasios Katsamanis, Stelios Piperidis, Vassilis Katsouros
  • Beyond the Scientific Document: A Citation-Aware Multi-Granular Summarization Approach with Heterogeneous Graphs
    Quoc-An Nguyen, Xuan-Hung Le, Thi-Minh-Thu Vu, Hoang-Quynh Le
  • Detecting Continuously Evolving Scam Calls under Limited Annotation: A LLM-Augmented Expert Rule Framework
    Haoyu Ma, Qinliang Su, Minhua Huang, Wu Kai
  • An Empirical Study of Position Bias in Modern Information Retrieval
    Ziyang Zeng, Dun Zhang, Jiacheng Li, zoupanxiang, Yuqing Yang
  • GenPoE: Generative Passage-level Mixture of Experts for Knowledge Enhancement of LLMs
    Xuebing Liu, Shanbao Qiao, Seung-Hoon Na
  • CoRanking: Collaborative Ranking with Small and Large Ranking Agents
    Wenhan Liu, Xinyu Ma, Yutao Zhu, Lixin Su, Shuaiqiang Wang, Dawei Yin, Zhicheng Dou
  • HIRAG: Hierarchical-Thought Instruction-Tuning Retrieval-Augmented Generation
    Yihan Jiao, Zhehao Tan, Dan Yang, Duolin Sun, Jie Feng, Yue Shen, Jian Wang, Peng Wei
  • Towards Personalized Conversational Sales Agents: Contextual User Profiling for Strategic Action
    Tongyoung Kim, Jeongeun Lee, SooJin Yoon, SungHwan Kim, Dongha Lee
  • WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback
    Minda Hu, Tianqing Fang, Jianshu Zhang, Jun-Yu Ma, Zhisong Zhang, Jingyan Zhou, Hongming Zhang, Haitao Mi, Dong Yu, Irwin King
  • Interesting Culture: Social Relation Recognition from Videos via Culture De-confounding
    Yuxuan Zhang, Yangfu Zhu, Haorui Wang, Bin Wu
  • ThinkSwitcher: When to Think Hard, When to Think Fast
    Guosheng Liang, Longguang Zhong, Ziyi Yang, Xiaojun Quan
  • MaGiX: A Multi-Granular Adaptive Graph Intelligence Framework for Enhancing Cross-Lingual RAG
    Nguyen Manh Hieu, Vu Lam Anh, Hung Pham Van, Nam Le Hai, Linh Ngo Van, Nguyen Thi Ngoc Diep, Thien Huu Nguyen
  • LexTime: A Benchmark for Temporal Ordering of Legal Events
    Claire Barale, Leslie Barrett, Vikram Sunil Bajaj, Michael Rovatsos
  • Beyond the Surface: A Solution-Aware Retrieval Model for Competition-level Code Generation
    Zhang Shiwen, Lingxiang Wang, Hainan Zhang, Ziwei Wang, Sijia Wen, Zhiming Zheng
  • X-Boundary: Establishing Exact Safety Boundary to Shield LLMs from Jailbreak Attacks without Compromising Usability
    Xiaoya Lu, Dongrui Liu, Yi Yu, Luxin Xu, Jing Shao
  • Tag&Tab: Pretraining Data Detection in Large Language Models Using Keyword-Based Membership Inference Attack
    Sagiv Antebi, Edan Habler, Asaf Shabtai, Yuval Elovici
  • EcoLANG: Efficient and Effective Agent Communication Language Induction for Social Simulation
    Xinyi Mou, Chen Qian, Wei Liu, Ling Yan, Yao Hu, Xuanjing Huang, Zhongyu Wei
  • Revealing the Inherent Instructability of Pre-Trained Language Models
    Seokhyun An, Minji Kim, Hyounghun Kim
  • What Media Frames Reveal About Stance: A Dataset and Study about Memes in Climate Change Discourse
    Shijia Zhou, Siyao Peng, Simon M. Luebke, Jörg Haßler, Mario Haim, Saif M. Mohammad, Barbara Plank
  • Rethinking Personality Assessment from Human-Agent Dialogues: Fewer Rounds May Be Better Than More
    Baiqiao Zhang, Zhifeng Liao, Xiangxian Li, Chao Zhou, Juan Liu, Xiaojuan Ma, Yulong Bian
  • TailorRPA: A Retrieval-Based Framework for Eliciting Personalized and Coherent Role-Playing Agents in General Domain
    Zhenpeng Gao, Xiaofen Xing, Xiangmin Xu
  • SCE: Semantic Consistency Enhanced Reinforcement Learning for Multi-Hop Knowledge Graph Reasoning
    Huangyw, Yao Liu, Qiao Liu, Rui Hou, Tingting Dai
  • ReGraphRAG: Reorganizing Fragmented Knowledge Graphs for Multi-Perspective Retrieval-Augmented Generation
    Soohyeong Kim, Seok Jun Hwang, JungHyoun Kim, Jeonghyeon Park, Yong Suk Choi
  • GASE: Generatively Augmented Sentence Encoding
    Manuel Frank, Haithem Afli
  • The “r” in “woman” stands for rights. Auditing LLMs in Uncovering Social Dynamics in Implicit Misogyny
    Arianna Muti, Chris Emmery, Debora Nozza, Alberto Barrón-Cedeño, Tommaso Caselli
  • Fact Verification on Knowledge Graph via Programmatic Graph Reasoning
    Yuanzhen Hao, Desheng Wu
  • Agent Trading Arena: A Study on Numerical Understanding in LLM-Based Agents
    Tianmi Ma, Jiawei Du, Wenxin Huang, Wenjie Wang, Liang Xie, Xian Zhong, Joey Tianyi Zhou
  • Why We Feel What We Feel: Joint Detection of Emotions and Their Opinion Triggers in E-commerce
    Arnav Attri, Anuj Attri, Suman Banerjee, Amey Patil, Muthusamy Chelliah, Nikesh Garera, Pushpak Bhattacharyya
  • Use Random Selection for Now: Investigation of Few-Shot Selection Strategies in LLM-based Text Augmentation
    Jan Cegin, Branislav Pecher, Jakub Simko, Ivan Srba, Maria Bielikova, Peter Brusilovsky
  • BanglaByT5: Byte-Level Modelling for Bangla
    Pramit Bhattacharyya, Arnab Bhattacharya
  • XTRA: Cross-Lingual Topic Modeling with Topic and Representation Alignments
    Nguyen Tien Phat, Ngo Vu Minh, Tung Nguyen, Linh Ngo Van, Duc Anh Nguyen, Sang Dinh, Trung Le
  • CodeContests+: High-Quality Test Case Generation for Competitive Programming
    Zihan Wang, Siyao Liu, Yang Sun, Ming Ding, Hongyan Li
  • SPO: Self Preference Optimization with Self Regularization
    Yuhao Sun, Yifan Zhang, Quandong Wang, Qinzhuo Wu, Wei Liu, Jian Luan
  • Long-context Language Models Fail in Basic Retrieval Tasks Without Sufficient Reasoning Steps
    Yijiong Yu, Ma Xiufa, Fang Jianwei, Zhi Xu, Guangyao Su, Wang Jiancheng, Yongfeng Huang, Zhixiao Qi, Wei Wang, weifeng.liu, Ran Chen, Ji Pei
  • Benchmarking Critical Questions Generation: A Challenging Reasoning Task for Large Language Models
    Blanca Calvo Figueras, Rodrigo Agerri
  • ResearchArena: Benchmarking Large Language Models’ Ability to Collect and Organize Information as Research Agents
    Hao Kang, Chenyan Xiong
  • LLMs are Privacy Erasable
    Zipeng Ye, Wenjian Luo
  • How Good are LLM-based Rerankers? An Empirical Analysis of State-of-the-Art Reranking Models
    Abdelrahman Abdallah, Bhawna Piryani, Jamshid Mozafari, Mohammed Ali, Adam Jatowt
  • DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation
    Abdelrahman Abdallah, Jamshid Mozafari, Bhawna Piryani, Adam Jatowt
  • CANDY: Benchmarking LLMs’ Limitations and Assistive Potential in Chinese Misinformation Fact-Checking
    Ruiling Guo, Xinwei Yang, Chen Huang, Tong Zhang, Yong Hu
  • E-Verify: A Paradigm Shift to Scalable Embedding-based Factuality Verification
    Zeyang Liu, Jingfeng Xue, Xiuqi Yang, Wenbiao Du, Jiarun Fu, Junbao Chen, Wenjie Guo, Yong Wang
  • LLM Jailbreak Detection for (Almost) Free!
    Guorui Chen, Yifan Xia, Xiaojun Jia, Zhijiang Li, Philip Torr, Jindong Gu
  • When to Continue Thinking: Adaptive Thinking Mode Switching for Efficient Reasoning
    Xiaoyun Zhang, Jingqing Ruan, Xing Ma, Yawen Zhu, Haodong Zhao, Hao Li, Jiansong Chen, Ke Zeng, Xunliang Cai
  • Plugging Schema Graph into Multi-Table QA: A Human-Guided Framework for Reducing LLM Reliance
    Xixi Wang, Miguel Costa, Jordanka Kovaceva, Shuai Wang, Francisco C. Pereira
  • Evolution in Simulation: AI-Agent School with Dual Memory for High-Fidelity Educational Dynamics
    Sheng Jin, Haoming Wang, Zhiqi Gao, Yongbo Yang, Bao Chunjia, Chengliang Wang
  • Retrieval-Augmented Machine Translation with Unstructured Knowledge
    Jiaan Wang, Fandong Meng, Yingxue Zhang, Jie Zhou
  • MARS-Bench: A Multi-turn Athletic Real-world Scenario Benchmark for Dialogue Evaluation
    Chenghao Yang, Yinbo Luo, Zhoufutu Wen, Qi Chu, Tao Gong, Longxiang Liu, Kaiyuan Zhang, Jianpeng Jiao, Ge Zhang, Wenhao Huang, Nenghai Yu
  • UTMath: A Benchmark for Math Evaluation with Unit Test
    Bo Yang, Qingping Yang, Yingwei Ma, Runtao Liu
  • The Green KNIGHT: Green Machine Translation with Knowledge-Distilled, Narrow, Inexpensive, Greedy, Hybrid Transformers
    Andreas Guta, Frithjof Petrick, Peter Polák
  • Constructing Your Model’s Value Distinction: Towards LLM Alignment with Anchor Words Tuning
    Zhen Yang, Ping Jian, Chengzhi Li, Chenxu Wang, Xinyue Zhang, Wenpeng Lu
  • MCiteBench: A Multimodal Benchmark for Generating Text with Citations
    Caiyu Hu, Yikai Zhang, Tinghui Zhu, Yiwei Ye, Yanghua Xiao
  • Do LLMs Know and Understand Domain Conceptual Knowledge?
    Sijia Shen, Feiyan Jiang, Peiyan Wang, Yuchen Jiang, Chang Liu, Yubo Feng
  • Agent Laboratory: Using LLM Agents as Research Assistants
    Samuel Schmidgall, Yusheng Su, Ze Wang, Ximeng Sun, Jialian Wu, Xiaodong Yu, Jiang Liu, Michael Moor, Zicheng Liu, Emad Barsoum
  • Retrieval-Augmented Generation with Hierarchical Knowledge
    Haoyu Huang, Yongfeng Huang, Yang Junjie, Zhenyu Pan, Yongqiang Chen, Kaili Ma, Hongzhi Chen, James Cheng
  • Regularized Contrastive Decoding with Hard Negative Samples for LLM Hallucination Mitigation
    Haonan Sheng, Dou Hu, Lingwei Wei, Wei Zhou, Songlin Hu
  • CharacterCraft: Bridging the Literature-Reality Dialogue Gap for Practical Role-Playing Agents
    Xuyan Yin, Xinran Yang, Zihao Li, Lixin Zou, Chenliang Li
  • Drift: Decoding-time Personalized Alignments with Implicit User Preferences
    Minbeom Kim, Kang-il Lee, Seongho Joo, Hwaran Lee, Thibaut Thonet, Kyomin Jung
  • Discovering Semantic Subdimensions through Disentangled Conceptual Representations
    Yunhao Zhang, Shaonan Wang, Nan Lin, Xinyi Dong, Chong Li, Chengqing Zong
  • Identifying Aspects in Peer Reviews
    Sheng Lu, Ilia Kuznetsov, Iryna Gurevych
  • Tree-Structured Non-Autoregressive Decoding for Sequence-to-Sequence Text Generation
    Pengyu Ji, Yufei Liu, Xiang Hu, Kewei Tu
  • Towards More Efficient Post-training via Fourier Domain Adapter Framework
    Yijia Fan, Jusheng Zhang, Keze Wang
  • KERAG: Knowledge-Enhanced Retrieval-Augmented Generation for Advanced Question Answering
    Yushi Sun, Kai Sun, Yifan Ethan Xu, Xiao Yang, Xin Luna Dong, Nan Tang, Lei Chen
  • Not All Features Deserve Attention: Graph-Guided Dependency Learning for Tabular Data Generation with Language Models
    Zheyu Zhang, Shuo Yang, Bardh Prenkaj, Gjergji Kasneci
  • CCG: Rare-Label Prediction via Neural SEM–Driven Causal Game
    Yijia Fan, Jusheng Zhang, Kaitong Cai, Jing Yang, Keze Wang
  • Multimodal Emotion Recognition in Conversations: A Survey of Methods, Trends, Challenges and Prospects
    ChengYan Wu, Yiqiang Cai, Yang Liu, Pengxu Zhu, Yun Xue, Ziwei Gong, Julia Hirschberg, Bolei Ma
  • When Allies Turn Foes: Exploring Group Characteristics of LLM-Based Multi-Agent Collaborative Systems Under Adversarial Attacks
    Jiahao Zhang, Baoshuo Kan, Tao Gong, Fu Lee Wang, Tianyong Hao
  • EditID: Training-Free Editable ID Customization for Text-to-Image Generation
    Guandong Li, Zhaobin Chu
  • OSC: Cognitive Orchestration through Dynamic Knowledge Alignment in Multi-Agent LLM Collaboration
    Jusheng Zhang, Yijia Fan, Kaitong Cai, Xiaofei Sun, Keze Wang
  • VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format
    Yueqian Wang, Xiaojun Meng, Yuxuan Wang, Jianxin Liang, Jiansheng Wei, Huishuai Zhang, Dongyan Zhao
  • To Answer or Not to Answer (TAONA): A Robust Textual Graph Understanding and Question Answering Approach
    Yuchen Yan, Aakash Kolekar, Sahika Genc, Wenju Xu, Edward W Huang, Anirudh Srinivasan, Mukesh Jain, Qi He, Hanghang Tong
  • Understanding Refusal in Language Models with Sparse Autoencoders
    Yeo Wei Jie, Nirmalendu Prakash, Clement Neo, Ranjan Satapathy, Roy Ka-Wei Lee, Erik Cambria
  • Where Did That Come From? Sentence-Level Error-Tolerant Attribution
    Ori Ernst, Aviv Slobodkin, Meng Cao, Sihui Wei, Jackie CK Cheung
  • Alleviating Performance Degradation Caused by Out-of-Distribution Issues in Embedding-Based Retrieval
    Haotong Bao, Jianjin Zhang, Qi Chen, Weihao Han, Zhengxin Zeng, Ruiheng Chang, Mingzheng Li, Hao Sun, Weiwei Deng, Feng Sun, Qi Zhang
  • Can LLMs Find a Needle in a Haystack? A Look at Anomaly Detection Language Modeling
    Leslie Barrett, Vikram Sunil Bajaj, Robert John Kingan
  • Beyond Single Frames: Can LMMs Comprehend Implicit Narratives in Comic Strip?
    Xiaochen Wang, Heming Xia, Jialin Song, Longyu Guan, Qingxiu Dong, Yixin Yang, Weiyao Luo, Yifan Pu, Yiru Wang, Xiangdi Meng, Wenjie Li, Zhifang Sui
  • Enhancing Multi-Agent Debate System Performance via Confidence Expression
    Zijie Lin, Bryan Hooi
  • The Face of Persuasion: Analyzing Bias and Generating Culture-Aware Ads
    Aysan Aghazadeh, Adriana Kovashka
  • SIFT: Grounding LLM Reasoning in Contexts via Stickers
    Zihao Zeng, Xuyao Huang, Boxiu Li, Zhijie Deng
  • When Inverse Data Outperforms: Exploring the Pitfalls of Mixed Data in Multi-Stage Fine-Tuning
    Mengyi Deng, Xin Li, Tingyu Zhu, Zhicheng Yang, Zhijiang Guo, Wei Wang
  • LUME: LLM Unlearning with Multitask Evaluations
    Anil Ramakrishna, Yixin Wan, Xiaomeng Jin, Kai-Wei Chang, Zhiqi Bu, Bhanukiran Vinzamuri, Volkan Cevher, Mingyi Hong, Rahul Gupta
  • How do Language Models Generate Slang: A Systematic Comparison between Human and Machine-Generated Slang Usages
    Siyang Wu, Zhewei Sun
  • Bridging the Dynamic Perception Gap: Training-Free Draft Chain-of-Thought for Dynamic Multimodal Spatial Reasoning
    Siqu Ou, Hongcheng Liu, Pingjie Wang, Yusheng Liao, Chuan Xuan, Yanfeng Wang, Yu Wang
  • MedCOD: Enhancing English-to-Spanish Medical Translation of Large Language Models Using Enriched Chain-of-Dictionary Framework
    Md Shahidul Salim, Lian Fu, Arav Adikesh Ramakrishnan, Zonghai Yao, Hong Yu
  • Chatbot To Help Patients Understand Their Health
    Won Seok Jang, Hieu Tran, Manav Shaileshkumar Mistry, Sai Kiran Gandluri, Yifan Zhang, Sharmin Sultana, Sunjae Kwon, Zonghai Yao, Hong Yu
  • A Knapsack by Any Other Name: Presentation impacts LLM performance on NP-hard problems
    Alex Duchnowski, Ellie Pavlick, Alexander Koller
  • Is Safety Standard Same for Everyone? User-Specific Safety Evaluation of Large Language Models
    Yeonjun In, Wonjoong Kim, Kanghoon Yoon, Sungchul Kim, Mehrab Tanjim, Sangwu Park, Kibum Kim, Chanyoung Park
  • Jailbreak Attack Initializations as Extractors of Compliance Directions
    Amit LeVi, Rom Himelstein, Yaniv Nemcovsky, Avi Mendelson, Chaim Baskin
  • Train Once for All: A Transitional Approach for Efficient Aspect Sentiment Triplet Extraction
    Xinmeng Hou, Lingyue Fu, Chenhao Meng, Kounianhua Du, Hai Hu
  • A Comprehensive Survey on the Trustworthiness of Large Language Models in Healthcare
    Manar Aljohani, Jun Hou, Sindhura Kommu, Xuan Wang
  • Self-Correction Makes LLMs Better Parsers
    Ziyan Zhang, Yang Hou, Chen Gong, Zhenghua Li
  • Explaining Length Bias in LLM-Based Preference Evaluations
    Zhengyu Hu, Linxin Song, Jieyu Zhang, Zheyuan Xiao, Tianfu Wang, Zhengyu Chen, Nicholas Jing Yuan, Jianxun Lian, Kaize Ding, Hui Xiong
  • Investigating Controversy Framing across Topics on Social Media
    Maxwell Weinzierl, Sanda M. Harabagiu
  • HEAL: Hybrid Enhancement with LLM-based Agents for Text-attributed Hypergraph Self-supervised Representation Learning
    Ruochang Li, Xiao Luo, Zhiping Xiao, Wei Ju, Ming Zhang
  • ReMamba: Equip Mamba with Effective Long-Sequence Modeling
    Danlong Yuan, Jiahao Liu, Bei Li, Huishuai Zhang, Jingang Wang, Xunliang Cai, Dongyan Zhao
  • QUITO-X: A New Perspective on Context Compression from the Information Bottleneck Theory
    Yihang Wang, Xu Huang, Bowen Tian, Yueyang Su, Lei Yu, Huaming Liao, Yixing Fan, Jiafeng Guo, Xueqi Cheng
  • Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers
    Yingyu Liang, Heshan Liu, Zhenmei Shi, Zhao Song, Zhuoyan Xu, Jiale Zhao, Zhen Zhuang
  • Mitigating Gender Bias via Fostering Exploratory Thinking in LLMs
    Kangda Wei, Hasnat Md Abdullah, Ruihong Huang
  • Beyond the Textual: Generating Coherent Visual Options for MCQs
    Wanqiang Wang, Longzhu He, Wei Zheng
  • SafeSwitch: Steering Unsafe LLM Behavior via Internal Activation Signals
    Peixuan Han, Cheng Qian, Xiusi Chen, Yuji Zhang, Denghui Zhang, Heng Ji
  • MADD: Multi-Agent Drug Discovery Orchestra
    Gleb Vitalevich Solovev, Alina Borisovna Zhidkovskaya, Anastasia Orlova, Nina Gubina, Anastasia Vepreva, Rodion Golovinskii, Ilya Tonkii, Ivan Dubrovsky, Ivan Gurev, Dmitry Gilemkhanov, Denis Chistiakov, Timur A. Aliev, Ivan Poddiakov, Galina Zubkova, Ekaterina V. Skorb, Vladimir Vinogradov, Alexander Boukhanovsky, Nikolay Nikitin, Andrei Dmitrenko, Anna Kalyuzhnaya, Andrey Savchenko
  • PersonaGym: Evaluating Persona Agents and LLMs
    Vinay Samuel, Henry Peng Zou, Yue Zhou, Shreyas Chaudhari, Ashwin Kalyan, Tanmay Rajpurohit, Ameet Deshpande, Karthik R Narasimhan, Vishvak Murahari
  • LM2Protein: A Structure-to-Token Protein Large Language Model
    Chang Zhou, Jiyue Jiang, Pengan Chen, Yuheng Shan, Xiangyu Shi, Zikang Wang, Yanting Li
  • How Well Can Reasoning Models Identify and Recover from Unhelpful Thoughts?
    Sohee Yang, Sang-Woo Lee, Nora Kassner, Daniela Gottesman, Sebastian Riedel, Mor Geva
  • From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval
    Dohyeon Lee, Yeonseok Jeong, Seung-won Hwang
  • Locate-then-Merge: Neuron-Level Parameter Fusion for Mitigating Catastrophic Forgetting in Multimodal LLMs
    Zeping Yu, Sophia Ananiadou
  • Improving Influence-based Instruction Tuning Data Selection for Balanced Learning of Diverse Capabilities
    Qirun Dai, Dylan Zhang, Jiaqi W. Ma, Hao Peng
  • Diagnosing Moral Reasoning Acquisition in Language Models: Pragmatics and Generalization
    Guangliang Liu, Zimo Qi, Xitong Zhang, Lei Jiang, Kristen Johnson
  • Discourse Heuristics For Paradoxically Moral Self-Correction
    Guangliang Liu, Zimo Qi, Xitong Zhang, Kristen Johnson
  • Invisible Prompts, Visible Threats: Malicious Font Injection in External Resources for Large Language Models
    Junjie Xiong, Changjia Zhu, Shuhang Lin, Chong Zhang, Yongfeng Zhang, Yao Liu, Lingyao Li
  • Turning the Tide: Repository-based Code Reflection
    Wei Zhang, Jian Yang, Jiaxi Yang, Ya Wang, Zhoujun Li, Zeyu Cui, Binyuan Hui, Junyang Lin
  • Reinforcement Learning with Supervised Alignment
    João Luís Lins, Jia Xu
  • EmByte: Decomposition and Compression Learning for Small yet Private NLP
    Shenglan Li, Jia Xu, Mengjiao Zhang
  • GUARD: Glocal Uncertainty-Aware Robust Decoding for Effective and Efficient Open-Ended Text Generation
    Yuanhao Ding, Esteban Garces Arias, Meimingwei Li, Julian Rodemann, Matthias Aßenmacher, Danlu Chen, Gaojuan Fan, Christian Heumann, Chongsheng Zhang
  • Efficiently Editing Mixture-of-Experts Models with Compressed Experts
    Yifei He, Yang Liu, Chen Liang, Hany Hassan Awadalla
  • FinGEAR: Financial Mapping-Guided Enhanced Answer Retrieval
    Ying Li, Mengyu Wang, Miguel de Carvalho, Sotirios Sabanis, Tiejun Ma
  • FM2DS: Few-Shot Multimodal Multihop Data Synthesis with Knowledge Distillation for Question Answering
    Amirhossein Abaskohi, Spandana Gella, Giuseppe Carenini, Issam H. Laradji
  • SQUARE: Unsupervised Retrieval Adaptation via Synthetic Data
    Jinsung Yoon, Junhao Zeng, Sercan O Arik
  • Knowledge-enhanced Multimodal ECG Representation Learning with Arbitrary-Lead Inputs
    Che Liu, Cheng Ouyang, Zhongwei Wan, Haozhe Wang, Wenjia Bai, Rossella Arcucci
  • Seeing Race, Feeling Bias: Emotion Stereotyping in Multimodal Language Models
    Mahammed Kamruzzaman, Amanda Cercas Curry, Alba Cercas Curry, Flor Miriam Plaza-del-Arco
  • AdaptMerge: Inference Time Adaptive Visual and Language-Guided Token Merging for Efficient Large Multimodal Models
    Zahidul Islam, Mrigank Rochan
  • Federated Retrieval-Augmented Generation: A Systematic Mapping Study
    Abhijit Chakraborty, Chahana Dahal, Vivek Gupta
  • A Survey of Pun Generation: Datasets, Evaluations and Methodologies
    Yuchen Su, Yonghua Zhu, Ruofan Wang, Zijian Huang, Diana Benavides-Prado, Michael J. Witbrock
  • Evaluating the Robustness and Accuracy of Text Watermarking Under Real-World Cross-Lingual Manipulations
    Mansour Al Ghanim, Jiaqi Xue, Rochana Prih Hastuti, Mengxin Zheng, Yan Solihin, Qian Lou
  • HDiff: Confidence-Guided Denoising Diffusion for Robust Hyper-relational Link Prediction
    Xiangfeng Luo, Ruoxin Zheng, jianqiang huang, Hang Yu
  • Spotlighter: Revisiting Prompt Tuning from a Representative Mining View
    Yutong Gao, Maoyuan Shao, Xinyang Huang, Chuang Zhu, Yu Weng, Xuan Liu, Lijuan Sun, Guoshun Nan
  • Offloaded Reasoning: Efficient Inference for Large Language Models via Modular Reasoning and Refinement
    Ishan Jindal, Jayant Taneja, Badrinath Chandana, Vikas Kapur, SACHIN DEV SHARMA
  • Wait, We Don’t Need to “Wait”! Removing Thinking Tokens Improves Reasoning Efficiency
    Chenlong Wang, Yuanning Feng, Dongping Chen, Zhaoyang Chu, Ranjay Krishna, Tianyi Zhou
  • Towards Reverse Engineering of Language Models: A Survey
    Xinpeng Ti, Wentao Ye, Zhifang Zhang, Junbo Zhao, Chang Yao, Lei Feng, Haobo Wang
  • LIFTED: Multimodal Clinical Trial Outcome Prediction via Large Language Models and Mixture-of-Experts
    Wenhao Zheng, Liaoyaqi Wang, Dongshen Peng, Hongxia Xu, Yun Li, Hongtu Zhu, Tianfan Fu, Huaxiu Yao
  • Addition in Four Movements: Mapping Layer-wise Information Trajectories in LLMs
    Yao Yan
  • CoMoE: Contrastive Representation for Mixture-of-Experts in Parameter-Efficient Fine-tuning
    Jinyuan Feng, ChaoPeng Wei, Tenghai Qiu, Tianyi Hu, Zhiqiang Pu
  • GuiLoMo: Allocating Experts and Ranks for LoRA-MoE via Bilevel Optimization with GuidedSelection Vectors
    Xinrong Chen, Hengyuan Zhang, Yingmin Qiu, Xiao Liang, Ziyue Li, Guanyu Wang, Weiping Li, Tong Mo, Wenyu Lv, Ngai Wong
  • Rotate, Clip, and Partition: Towards W2A4KV4 Quantization by Integrating Rotation and Learnable Non-uniform Quantizer
    Euntae Choi, Sumin Song, Woosang Lim, Sungjoo Yoo
  • Decoding in Latent Spaces for Efficient Inference in LLM-based Recommendation
    Chengbing Wang, Yang Zhang, Zhicheng Wang, Tianhao Shi, Keqin Bao, Fuli Feng, Tat-Seng Chua
  • Forget for Get: A Lightweight Two-phase Gradient Method for Knowledge Editing in Large Language Models
    Yanhong Li, Min Yang, Xiping Hu, Chengming Li
  • AutoEvolve: Automatically Evolving Queries for Applicable and Scalable Retrieval-Augmented Generation Benchmarking
    Ding-Chu Zhang, Xiaowen Zhang, Yue Fei, Renjun Hu, Xiao-Wen Yang, Zhi Zhou, Baixuan Li, Yu-Feng Li, Xing Shi, Wei Lin
  • Temporal Alignment of Time Sensitive Facts with Activation Engineering
    Sanjay Govindan, Maurice Pagnucco, Yang Song
  • ChronoBias: A Benchmark for Evaluating Temporal Group Bias in the Time-sensitive Knowledge of Large Language Models
    Kyungmin Kim, Youngbin Choi, Hyounghun Kim, Dongwoo Kim, Sangdon Park
  • MC^2: A Minimum-Coverage and Dataset-Agnostic Framework for Compositional Generalization of LLMs on Semantic Parsing
    Ziyao Xu, Zhe Yang, Houfeng Wang
  • Learning to Instruct: Fine-Tuning a Task-Aware Instruction Optimizer for Black-Box LLMs
    Yunzhe Qi, Jinjin Tian, Tianci Liu, Ruirui Li, Tianxin Wei, Hui Liu, Xianfeng Tang, Monica Xiao Cheng, Jingrui He
  • Enriching Patent Claim Generation with European Patent Dataset
    Lekang Jiang, Chengzu Li, Stefan Goetz
  • StepKE: Stepwise Knowledge Editing for Multi-Hop Question Answering
    Jaewook Lee, Dahyun Jung, Heuiseok Lim
  • AutoDCWorkflow: LLM-based Data Cleaning Workflow Auto-Generation and Benchmark
    Lan Li, Liri Fang, Bertram Ludäscher, Vetle I Torvik
  • Hidden Ghost Hand: Unveiling Backdoor Vulnerabilities in MLLM-Powered Mobile GUI Agents
    Pengzhou Cheng, Haowen Hu, Zheng Wu, Zongru Wu, Tianjie Ju, Daizong Ding, Zhuosheng Zhang, Gongshen Liu
  • Scale Down to Speed Up: Dynamic Data Selection for Reinforcement Learning
    Zhuoyue Chen, Jihai Zhang, Ben Liu, Fangquan Lin, Wotao Yin
  • Towards Efficient CoT Distillation: Self-Guided Rationale Selector for Better Performance with Fewer Rationales
    JianZhi Yan, Le Liu, Youcheng Pan, Shiwei Chen, Yang Xiang, Buzhou Tang
  • GeoDANO: Geometric VLM with Domain Agnostic Vision Encoder
    Seunghyuk Cho, Zhenyue Qin, Yang Liu, Youngbin Choi, Seungbeom Lee, Dongwoo Kim
  • Leveraging 3D Gaussian for Temporal Knowledge Graph Embedding
    Jiang Li, Xiangdong Su, Guanglai Gao
  • LLMAP: LLM-Assisted Multi-Objective Route Planning with User Preferences
    Liangqi Yuan, Dong-Jun Han, Christopher Brinton, Sabine Brunswicker
  • ZEBRA: Leveraging Model-Behavioral Knowledge for Zero-Annotation Preference Dataset Construction
    Jeesu Jung, Chanjun Park, Sangkeun Jung
  • Token Knowledge: A New Perspective For Knowledge in Large Language Models
    Jieyong Wang, Chunyao Song, Tingjian Ge
  • Adaptive Schema-aware Event Extraction with Retrieval-Augmented Generation
    Sheng Liang, Hang Lv, Zhihao Wen, Yaxiong Wu, Yongyue Zhang, Hao Wang, Yong Liu
  • Enhancing Attributed Question Answering using Tailored Progressive Curriculum Learning
    Yuhan Chen, Bowei Zou, Yifan Fan, Yuchong Chen, Shujun Cao, Yu Hong
  • REAR: Reinforced Reasoning Optimization for Event Argument Extraction with Relation-Aware Support
    Jianwen Luo, Yu Hong, Shuai Yang, Jianmin YAO
  • COMI-LINGUA: Expert Annotated Large-Scale Dataset for Multitask NLP in Hindi-English Code-Mixing
    Rajvee Sheth, Himanshu Beniwal, Mayank Singh
  • Nine Ways to Break Copyright Law and Why Our LLM Won’t: A Fair Use Aligned Generation Framework
    Aakash Sen Sharma, Debdeep Sanyal, Priyansh Srivastava, Sundar Athreya H, Shirish Karande, Mohan Kankanhalli, Murari Mandal
  • InteractSpeech: A Speech Dialogue Interaction Corpus for Spoken Dialogue Model
    Yifu Chen, Shengpeng Ji, Ziqing Wang, Hanting Wang, Zhou Zhao
  • Enhancing SQL Table Acquisition with Reverse Engineering for Text-to-SQL
    Shixin Liu, Haoyu Xu, Yu Hong
  • DynamicKV: Task-Aware Adaptive KV Cache Compression for Long Context LLMs
    Xiabin Zhou, Wenbin Wang, Minyan Zeng, Jiaxian Guo, Xuebo Liu, Li Shen, Min Zhang, Liang Ding
  • ASD-iLLM: An Intervention Large Language Model for Autistic Children based on Real Clinical Dialogue Intervention Dataset
    Shuzhong Lai, Chenxi Li, Junhong Lai, Yucun Zhong, Chenyu Yan, Xiang Li, Haifeng Li, Gang Pan, Lin Yao, Yueming Wang
  • GDLLM: A Global Distance-aware Modeling Approach Based on Large Language Models for Event Temporal Relation Extraction
    Jie Zhao, Wanting Ning, Yuxiao Fei, Yubo Feng, Lishuang Li
  • More Tokens, Lower Precision: Towards the Optimal Token-Precision Trade-off in KV Cache Compression
    Jiebin Zhang, Dawei Zhu, Yifan Song, Wenhao Wu, Chuqiao Kuang, Xiaoguang Li, Lifeng Shang, Qun Liu, Sujian Li
  • cAST: Enhancing Code Retrieval-Augmented Generation with Structural Chunking via Abstract Syntax Tree
    Yilin Zhang, Xinran Zhao, Zora Zhiruo Wang, Chenyang Yang, Jiayi Wei, Tongshuang Wu
  • A Group Fairness Lens for Large Language Models
    Guanqun Bi, Yuqiang Xie, Lei Shen, Yanan Cao
  • VLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced Reranking and Noise-injected Training
    Zhanpeng Chen, Chengjin Xu, Yiyan Qi, Xuhui Jiang, Jian Guo
  • Rethinking DPO: The Role of Rejected Responses in Preference Misalignment
    Jae Hyeon Cho, JunHyeok Oh, Myunsoo Kim, Byung-Jun Lee
  • Enhancing Recommendation Explanations through User-Centric Refinement
    Jingsen Zhang, Zihang Tian, Xueyang Feng, Xu Chen, Chong Chen
  • Distributional Surgery for Language Model Activations
    Bao Nguyen, Binh Nguyen, Duy Nguyen, Viet Anh Nguyen
  • Improving Alignment in LVLMs with Debiased Self-Judgment
    Sihan Yang, Chenhang Cui, Zihao Zhao, Yiyang Zhou, Weilong Yan, Ying Wei, Huaxiu Yao
  • Low-Confidence Gold: Refining Low-Confidence Samples for Efficient Instruction Tuning
    Hongyi Cai, jie li, Mohammad Mahdinur Rahman, Wenzhen Dong
  • Safeguarding Privacy of Retrieval Data against Membership Inference Attacks: Is This Query Too Close to Home?
    Yujin Choi, Youngjoo Park, Junyoung Byun, Jaewook Lee, Jinseong Park
  • Causal-LLM: A Unified One-Shot Framework for Prompt- and Data-Driven Causal Graph Discovery
    Amartya Roy, N Devharish, Shreya Ganguly, Kripabandhu Ghosh
  • LRPLAN: A Multi-Agent Collaboration of Large Language and Reasoning Models for Planning with Implicit & Explicit Constraints
    T Karthikeyan, Om Dehlan, Mausam, Manish Gupta
  • DLPO: Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Learning Perspective
    Dengyun Peng, Yuhang Zhou, Qiguang Chen, JinHao Liu, Jingjing Chen, Libo Qin
  • Towards Robust Few-Shot Relation Classification: Incorporating Relation Description with Agreement
    Mengting Hu, Jianfeng Wu, Ming Jiang, Yalan Xie, Zhunheng Wang, Rui Ying, Xiaoyi Liu, Ruixuan Xu, Hang Gao, Renhong Cheng
  • For a Fistful of Puns: Evaluating a Puns in Multiword Expressions Identification Algorithm Without Dedicated Dataset
    Julien Bezançon, Gaël Lejeune
  • Watermarking for Factuality: Guiding Vision-Language Models Toward Truth via Tri-layer Contrastive Decoding
    Kyungryul Back, Seongbeom Park, Milim Kim, Mincheol Kwon, SangHyeok Lee, Hyunyoung Lee, Junhee Cho, Seunghyun Park, Jinkyu Kim
  • Are the Reasoning Models Good at Automated Essay Scoring?
    Lui Yoshida
  • Rethinking LLM-Based Recommendations: A Personalized Query-Driven Parallel Integration
    Donghee Han, Hwanjun Song, Mun Yong Yi
  • RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation
    Aviv Slobodkin, Hagai Taitelbaum, Yonatan Bitton, Brian Gordon, Michal Sokolik, Nitzan Bitton Guetta, Almog Gueta, Royi Rassin, Dani Lischinski, Idan Szpektor
  • What data should I include in my POS tagging training set?
    Zoey Liu, Masoud Jasbi, Christan Grant, Kenji Sagae, Emily Prud’hommeaux
  • AttnComp: Attention-Guided Adaptive Context Compression for Retrieval-Augmented Generation
    Lvzhou Luo, Yixuan Cao, Ping Luo
  • SafeInt: Shielding Large Language Models from Jailbreak Attacks via Safety-Aware Representation Intervention
    Jiaqi Wu, Chen Chen, Chunyan Hou, Xiaojie Yuan
  • Staged Knowledge Distillation Through Least-to-Most Prompting: Optimizing Teacher Guidance via Difficulty-Aware Training
    Mengxiang Zhang, Lingyuan Liu
  • LLM Distillation for Efficient Few-Shot Multiple Choice Question Answering
    Patrick Sutanto, Joan Santoso, Esther Irawati Setiawan, Aji Prasetya Wibawa
  • Teaching LLMs to Plan, Not Just Solve: Plan Learning Boosts LLMs Generalization in Reasoning Tasks
    Tianlong Wang, Junzhe Chen, Weibin Liao, Xueting Han, Jing Bai
  • FedCoT: Federated Chain-of-Thought Distillation for Large Language Models
    Tao Fan, Weijing Chen, Yan Kang, Guoqiang Ma, Hanlin Gu, Yuanfeng SONG, Lixin Fan, Qiang Yang
  • SalaMAnder: Shapley-based Mathematical Expression Attribution and Metric for Chain-of-Thought Reasoning
    Yue Xin, Chen Shen, Shaotian Yan, Xiaosong Yuan, Yaoming Wang, Xiaofeng Zhang, Chenxi Huang, Jieping Ye
  • Representing LLMs in Prompt Semantic Task Space
    Idan Kashani, Avi Mendelson, Yaniv Nemcovsky
  • PersLLM: A Personified Training Approach for Large Language Models
    Zheni Zeng, Jiayi Chen, Huimin Chen, Yukun Yan, Yuxuan Chen, Zhenghao Liu, Zhiyuan Liu, Maosong Sun
  • The Illusion of Randomness: How LLMs Fail to Emulate Stochastic Decision-Making in Rock-Paper-Scissors Games?
    Zihao Guo, Hongtao Lv, Chaoli Zhang, Yibowen Zhao, Yixin Zhang, Lizhen Cui
  • DAPE-BR: Distance-Aware Positional Encoding for Mitigating Object Hallucination in LVLMs
    Mingrui Xie, Tianxiang Xu, Qianhai Tang, Shanming Yao, Xiaofeng Zhang, Junliang Du
  • From Confidence to Collapse in LLM Factual Robustness
    Alina Fastowski, Bardh Prenkaj, Gjergji Kasneci
  • CtrlNews: LLM-based Multi-Agent Controllable News Writing via Knowledge Gravitational Field
    Yifei Xu, Yingjie Zong, Wang Zhonghua, Sirui Wu, Yuan Rao, Dan Zhang, Shuiguang Deng
  • Joint Enhancement of Relational Reasoning for Long-Context LLMs
    Zhirui Chen, Wei Shen, Jiashui Huang, Ling Shao
  • Training Medical QA Models Based on Mixed Rewards from Multiple-Choice and Open-Ended Questions
    Yue Qiu, Yujan Ting, Pei Dong, Terrence Chen, Weijing Huang
  • Rethink Rumor Detection in the Era of LLMs: A Review
    Chang Yang, Peng Zhang, Jing Zhang, Hui Gao, Changhao Song
  • ScholarBench: A Bilingual Benchmark for Abstraction, Comprehension, and Reasoning Evaluation in Academic Contexts
    Dongwon Noh, Donghyeok Koh, Junghun Yuk, Gyuwan Kim, JAE YONG LEE, KyungTae Lim, Cheoneum Park
  • MAGIC: A Multi-Hop and Graph-Based Benchmark for Inter-Context Conflicts in Retrieval-Augmented Generation
    Jungyeon Lee, Lee Kangmin, Taeuk Kim
  • Align Attention Heads Before Merging Them: An Effective Way for Converting MHA to GQA
    Qingyun Jin, Xiaohui Song, Feng Zhou, Zengchang Qin
  • DRBO: Mitigating the Bottleneck Effect via Dynamic Reward Balancing in Multi-reward LLM Optimization
    Nuo Chen, Yufei Gao, Yongnan Jin, Yan Hu, Anningzhe Gao, Lingyong Yan, Benyou Wang
  • Enhancing LLM Knowledge Learning through Generalization
    Mingkang Zhu, Xi Chen, Zhongdao Wang, Bei Yu, Hengshuang Zhao, Jiaya Jia
  • FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient Training R1-like Reasoning Models
    Mingyang Song, Mao Zheng, Zheng Li, Wenjie Yang, Xuan Luo
  • TR-MTEB: A Comprehensive Benchmark and Embedding Model Suite for Turkish Sentence Representations
    Mehmet Selman Baysan, Tunga Gungor
  • ImpRAG: Retrieval-Augmented Generation with Implicit Queries
    Wenzheng Zhang, Xi Victoria Lin, Karl Stratos, Wen-tau Yih, Mingda Chen
  • HEAL: A Hypothesis-Based Preference-Aware Analysis Framework
    Yifu Huo, Chenglong Wang, Qiren Zhu, Shunjie Xing, Tong Xiao, Chunliang Zhang, Tongran Liu, JingBo Zhu
  • A Survey of Multilingual Reasoning in Language Models
    Akash Ghosh, Debayan Dutta, Sriparna Saha, Chirag Agarwal
  • CLEAR: A Framework Enabling Large Language Models to Discern Confusing Legal Paragraphs
    Qi Xu, Qian Liu, Hao Fei, Hang Yu, Shuhao Guan, Xiao Wei
  • NAP^2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from Human
    Shuo Huang
  • Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents
    Long Li, Weiwen Xu, Jiayan Guo, Ruochen Zhao, Xingxuan Li, Yuqian Yuan, Boqiang Zhang, Yuming Jiang, Yifei Xin, Ronghao Dang, Yu Rong, Deli Zhao, Tian Feng, Lidong Bing
  • Unveiling Multimodal Processing: Exploring Activation Patterns in Multimodal LLMs for Interpretability and Efficiency
    Chuan Wu, Meng Su, Youxuan Fang, shaolin Zhu
  • Self-Supervised Prompt Optimization
    Jinyu Xiang, Jiayi Zhang, Zhaoyang Yu, Xinbing Liang, Fengwei Teng, Jinhao Tu, Fashen Ren, Xiangru Tang, Sirui Hong, Chenglin Wu, Yuyu Luo
  • Polish-English medical knowledge transfer: A new benchmark and results
    Łukasz Grzybowski, Jakub Pokrywka, Michał Ciesiółka, Jeremi Ignacy Kaczmarek, Marek Kubis
  • Hard Negatives, Hard Lessons: Revisiting Training Data Quality for Robust Information Retrieval with LLMs
    Nandan Thakur, Crystina Zhang, Xueguang Ma, Jimmy Lin
  • EventRelBench: A Comprehensive Benchmark for Evaluating Event Relation Understanding in Large Language Models
    Jie Gong, Biaoshuai Zheng, qiwang hu
  • S2LPP: Small-to-Large Prompt Prediction across LLMs
    Liang Cheng, Tianyi Li, Zhaowei Wang, Mark Steedman
  • DroidCall: A Dataset for LLM-powered Android Intent Invocation
    Weikai Xie, Li Zhang, Shihe Wang, Rongjie Yi, Mengwei Xu
  • Tool Zero: Training Tool-Augmented LLMs via Pure RL from Scratch
    Yirong Zeng, Xiao Ding, Yutai Hou, Yuxian Wang, Li Du, Juyi Dai, Qiuyang Ding, Duyu Tang, Dandan Tu, Weiwen Liu, Bing Qin, Ting Liu
  • INREACT: An Inspire-Then-Reinforce Training Framework For Multimodal GUI Agent
    Yuanlei Wang, liuzhou zhang, Haohao Luo, Ying Shen
  • Facts Fade Fast: Evaluating Memorization of Outdated Medical Knowledge in Large Language Models
    Juraj Vladika, Mahdi Dhaini, Florian Matthes
  • Zero-Shot Privacy-Aware Text Rewriting via Iterative Tree Search
    Shuo Huang, Xingliang YUAN, Gholamreza Haffari, Lizhen Qu
  • KoLEG: On-the-Fly Korean Legal Knowledge Editing with Continuous Retrieval
    Jaehyung Seo, Dahyun Jung, Jaewook Lee, Yongchan Chun, Dongjun Kim, Hwijung Ryu, Donghoon Shin, Heuiseok Lim
  • HARE: an entity and relation centric evaluation framework for histopathology reports
    Yunsoo Kim, Michal Wen Sheue Ong, Alex Shavick, Honghan Wu, Adam P. Levine
  • VeriFastScore: Speeding up long-form factuality evaluation
    Rishanth Rajendhran, Amir Zadeh, Matthew Sarte, Chuan Li, Mohit Iyyer
  • B-REASO: A Multi-Level Multi-Faceted Bengali Evaluation Suite for Foundation Models
    Md Tanzib Hosain, Md Kishor Morol
  • Extracting Conceptual Spaces from LLMs Using Prototype Embeddings
    Nitesh Kumar, Usashi Chatterjee, Steven Schockaert
  • FC-Attack: Jailbreaking Multimodal Large Language Models via Auto-Generated Flowcharts
    Ziyi Zhang, Zhen Sun, Zongmin Zhang, Jihui Guo, Xinlei He
  • Multilingual Data Filtering using Synthetic Data from Large Language Models
    Jonas Waldendorf, Barry Haddow, Alexandra Birch, Mateusz Klimaszewski
  • SAFE: A Sparse Autoencoder-Based Framework for Robust Query Enrichment and Hallucination Mitigation in LLMs
    Samir Abdaljalil, Filippo Pallucchini, Andrea Seveso, HASAN KURBAN, Fabio Mercorio, Erchin Serpedin
  • Soteria: Language-Specific Functional Parameter Steering for Multilingual Safety Alignment
    Somnath Banerjee, Sayan Layek, Pratyush Chatterjee, Animesh Mukherjee, Rima Hazra
  • LLMs as a synthesis between symbolic and distributed approaches to language
    Gemma Boleda
  • MIND: Towards Immersive Psychological Healing with Multi-Agent Inner Dialogue
    Yujia Chen, Changsong Li, Yiming Wang, Tianjie Ju, Qingqing Xiao, Nan Zhang, Zifan Kong, Peng Wang, Binyu Yan
  • A Monte-Carlo Sampling Framework For Reliable Evaluation of Large Language Models Using Behavioral Analysis
    Davood Wadi, Marc Fredette
  • Understanding How Value Neurons Shape the Generation of Specified Values in LLMs
    Yi Su, Jiayi Zhang, Shu Yang, Xinhai Wang, Lijie Hu, Di Wang
  • Likelihood Variance as Text Importance for Resampling Texts to Map Language Models
    Momose Oyama, Ryo Kishino, Hiroaki Yamagiwa, Hidetoshi Shimodaira
  • Think Twice, Generate Once: Enhancing LLMs Safety via Progressive Self-Reflection
    Hoang Phan, Victor Li, Qi Lei
  • Efficient Integration of External Knowledge to LLM-based World Models via Retrieval-Augmented Generation and Reinforcement Learning
    Chang Yang, Xinrun Wang, Qinggang Zhang, Qi Jiang, Xiao Huang
  • Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes
    Tyler Loakman, William Thorne, Chenghua Lin
  • Modeling, Evaluating, and Embodying Personality in LLMs: A Survey
    Iago Alves Brito, Julia Soares Dollis, Fernanda Bufon Färber, Pedro Schindler Freire Brasil Ribeiro, Rafael Teixeira Sousa, Arlindo Rodrigues Galvão Filho
  • Benchmarking the Detection of LLMs-Generated Modern Chinese Poetry
    Shanshan Wang, Junchao Wu, Fengying Ye, Derek F. Wong, Jingming Yao, Lidia S. Chao
  • Leveraging the Cross-Domain & Cross-Linguistic Corpus for Low Resource NMT: A Case Study On Bhili-Hindi-English Parallel Corpus
    Pooja Singh, Shashwat Bhardwaj, Vaibhav Sharma, Sandeep Kumar
  • Creative Preference Optimization
    Mete Ismayilzada, Antonio Laverghetta Jr., Simone A. Luchini, Reet Patel, Antoine Bosselut, Lonneke van der Plas, Roger E. Beaty
  • Assistant-Guided Mitigation of Teacher Preference Bias in LLM-as-a-Judge
    Zhuo Liu, Moxin Li, Xun Deng, Qifan Wang, Fuli Feng
  • Uplift-RAG: Uplift-Driven Knowledge Preference Alignment for Retrieval-Augmented Generation
    Changle Qu, Sunhao Dai, Hengyi Cai, Yiyang Cheng, Jun Xu, Shuaiqiang Wang, Dawei Yin
  • Sugar-Coated Poison: Benign Generation Unlocks Jailbreaking
    Yuhang Wu, Yu-Jie Xiong, Hao Zhang, Jia-Chen Zhang, Zheng Zhou
  • DivScene: Towards Open-Vocabulary Object Navigation with Large Vision Language Models in Diverse Scenes
    Zhaowei Wang, Hongming Zhang, Tianqing Fang, Ye Tian, Yue Yang, Kaixin Ma, Xiaoman Pan, Yangqiu Song, Dong Yu
  • Data-scarce Behavior Editing of Language Models
    Joykirat Singh, Subhabrata Dutta, Tanmoy Chakraborty
  • FIER: Fine-Grained and Efficient KV Cache Retrieval for Long-context LLM Inference
    Dongwei Wang, Zijie Liu, Song Wang, Yuxin Ren, Jianing Deng, Jingtong Hu, Tianlong Chen, Huanrui Yang
  • SVeritas: Benchmark for Robust Speaker Verification under Diverse Conditions
    Massa Baali, Sarthak Bisht, Francisco Teixeira, Kateryna Shapovalenko, Rita Singh, Bhiksha Raj
  • CAARMA: Class Augmentation with Adversarial Mixup Regularization
    Massa Baali, Xiang Li, Hao Chen, Syed Abdul Hannan, Rita Singh, Bhiksha Raj
  • Bringing Pedagogy into Focus: Evaluating Virtual Teaching Assistants’ Question-Answering in Asynchronous Learning Environments
    Siyan Li, Zhen Xu, Vethavikashini Chithrra Raghuram, Xuanming Zhang, Renzhe Yu, Zhou Yu
  • Demystifying Multilingual Reasoning in Process Reward Modeling
    Weixuan Wang, Minghao Wu, Barry Haddow, Alexandra Birch
  • BehaviorSFT: Behavioral Token Conditioning for Health Agents Across the Proactivity Spectrum
    Yubin Kim, Zhiyuan Hu, Hyewon Jeong, Eugene W Park, Shuyue Stella Li, Chanwoo Park, Shiyun Xiong, MingYu Lu, Hyeonhoon Lee, Xin Liu, Daniel McDuff, Cynthia Breazeal, Samir Tulebaev, Hae Won Park
  • LaMP-Cap: Personalized Figure Caption Generation With Multimodal Figure Profiles
    Ho Yin Sam Ng, Ting-Yao Hsu, Aashish Anantha Ramakrishnan, Branislav Kveton, Nedim Lipka, Franck Dernoncourt, Dongwon Lee, Tong Yu, Sungchul Kim, Ryan A. Rossi, Ting-Hao Kenneth Huang
  • Efficient Dynamic Clustering-Based Document Compression for Retrieval-Augmented-Generation
    Weitao Li, Xiangyu Zhang, Kaiming Liu, Xuanyu Lei, Weizhi Ma, Yang Liu
  • HebID: Detecting Social Identities in Hebrew-language Political Text
    Guy Mor-Lan, Naama Rivlin-Angert, Yael R. Kaplan, Tamir Sheafer, Shaul R. Shenhav
  • Dub-S2ST: Textless Speech-to-Speech Translation for Seamless Dubbing
    Jeongsoo Choi, Jaehun Kim, Joon Son Chung
  • FinGrAct: A Framework for FINe-GRrained Evaluation of ACTionability in Explainable Automatic Fact-Checking
    Islam Eldifrawi, Shengrui Wang, Amine Trabelsi
  • What Has Been Lost with Synthetic Evaluation?
    Alexander Gill, Abhilasha Ravichander, Ana Marasovic
  • Bold Claims or Self-Doubt? Factuality Hallucination Type Detection via Belief State
    Dongyu Zhang, Qingqing Hong, Bingxuan Hou, Jiayi Lin, Chenyang Zhang, Jialin Li, Junli Wang
  • Proxy Barrier: A Hidden Repeater Layer Defense Against System Prompt Leakage and Jailbreaking
    Pedro Schindler Freire Brasil Ribeiro, Iago Alves Brito, Rafael Teixeira Sousa, Fernanda Bufon Färber, Julia Soares Dollis, Arlindo Rodrigues Galvão Filho
  • AraSafe: Benchmarking Safety in Arabic LLMs
    Hamdy Mubarak, Abubakr Mohamed, Majd Hawasly
  • Nested Named Entity Recognition as Single-Pass Sequence Labeling
    Alberto Muñoz-Ortiz, David Vilares, Caio Corro, Carlos Gómez-Rodríguez
  • DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations
    Aryo Pradipta Gema, Chen Jin, Ahmed Abdulaal, Tom Diethe, Philip Alexander Teare, Beatrice Alex, Pasquale Minervini, Amrutha Saseendran
  • Catch Me If You Can? Not Yet: LLMs Still Struggle to Imitate the Implicit Writing Styles of Everyday Authors
    Zhengxiang Wang, Nafis Irtiza Tripto, Solha Park, Zhenzhen Li, Jiawei Zhou
  • Fine-Tuning Encoder-Decoder Models with Contrastive Learning for In-Context Distractor Generation
    Elaf Alhazmi, Quan Z. Sheng, Wei Emma Zhang, Mohammed I. Thanoon, Haojie Zhuang, Behnaz Soltani, Munazza Zaib
  • Conflicts in Texts: Data, Implications and Challenges
    Siyi Liu, Dan Roth
  • Recognizing Limits: Investigating Infeasibility in Large Language Models
    Wenbo Zhang, Zihang Xu, Hengrui Cai
  • VQA-Augmented Machine Translation with Cross-Modal Contrastive Learning
    Zhihui Zhang, Shiliang Sun, Jing Zhao, Tengfei Song, Hao Yang
  • Learning to Describe Implicit Changes: Noise-robust Pre-training for Image Difference Captioning
    Zixin Guo, Jiayang Sun, Tzu-Jui Julius Wang, Abduljalil Radman, Selen Pehlivan, Min Cao, Jorma Laaksonen
  • SOLAR: Serendipity Optimized Language Model Aligned for Recommendation
    Zichen Yuan, Lifan Sun, Yucen Zhuang, Yue Wang, Xinyuan Song, Tianqi Xu, Siyuan Li, Junchen Fu, Youhua Li, Sirui Hong, Jiaqi Chen, Joemon M. Jose, Yongxin Ni
  • AIRepr: An Analyst-Inspector Framework for Evaluating Reproducibility of LLMs in Data Science
    Qiuhai Zeng, Claire Jin, Xinyue Wang, Yuhan Zheng, Qunhua Li
  • MisinfoBench: A Multi-Dimensional Benchmark for Evaluating LLMs’ Resilience to Misinformation
    Ye Yang, Donghe Li, Zuchen Li, Fengyuan Li, Jingyi Liu, Li Sun, Qingyu Yang
  • Fuzzy Reasoning Chain (FRC): An Innovative Reasoning Framework from Fuzziness to Clarity
    Ping Chen, Xiang Liu, Zhaoxiang Liu, Zezhou Chen, Xingpeng Zhang, Huan Hu, Zipeng Wang, Kai Wang, Shuming Shi, Shiguo Lian
  • HighMATH: Evaluating Math Reasoning of Large Language Models in Breadth and Depth
    Yan Liu, Minghui Zhang, Bojian Xiong, Yifan Xiao, Yinong Sun, Yating Mei, Longyu Zeng, Jingchao Yang, Yang Wang, Deyi Xiong
  • CATCH: A Novel Data Synthesis Framework for High Therapy Fidelity and Memory-Driven Planning Chain of Thought in AI Counseling
    Mingyu Chen, Jingkai Lin, Zhaojie Chu, Xiaofen Xing, Yirong Chen, Xiangmin Xu
  • MediVLM: A Vision Language Model for Radiology Report Generation from Medical Images
    Debanjan Goswami, Ronast Subedi, Shayok Chakraborty
  • AdDriftBench: A Benchmark for Detecting Data Drift and Label Drift in Short Video Advertising
    Yinghao Song, Xiangji Zeng, Shuai Cui, Lu Sun, Zhaowei Liu, Yuan Yuan, Hai Zhou, Zhaohan Gong
  • NIM: Neuro-symbolic Ideographic Metalanguage for Inclusive Communication
    Prawaal Sharma, Poonam Goyal, Navneet Goyal, Vidisha Sharma
  • ViFT: Towards Visual Instruction-Free Fine-tuning for Large Vision-Language Models
    Zikang Liu, Kun Zhou, Xin Zhao, Dawei Gao, Yaliang Li, Ji-Rong Wen
  • Do Code Semantics Help? A Comprehensive Study on Execution Trace-Based Information for Code Large Language Models
    Jian Jornbowrl Wang, Xiaofei Xie, Qiang Hu, Shangqing Liu, Yi Li
  • LongWeave: A Long-Form Generation Benchmark Bridging Real-World Relevance and Verifiability
    Zikai Xiao, Fei Huang, Jianhong Tu, Jianhui Wei, Wen MA, Yuxuan Zhou, Jian Wu, Bowen Yu, Zuozhu Liu, Junyang Lin
  • XL-Suite: Cross-Lingual Synthetic Training and Evaluation Data for Open-Ended Generation
    Vivek Iyer, Ricardo Rei, Pinzhen Chen, Alexandra Birch
  • Accelerating LLM Reasoning via Early Rejection with Partial Reward Modeling
    Seyyed Saeid Cheshmi, Azal Ahmad Khan, Xinran Wang, Zirui Liu, Ali Anwar
  • CultureSynth: A Hierarchical Taxonomy-Guided and Retrieval-Augmented Framework for Cultural Question-Answer Synthesis
    Xinyu Zhang, Pei Zhang, Shuang Luo, Jialong Tang, Yu Wan, Baosong Yang, Fei Huang
  • DesignCLIP: Multimodal Learning with CLIP for Design Patent Understanding
    Zhu Wang, Homaira Huda Shomee, Sathya N. Ravi, Sourav Medya
  • R3-RAG: Learning Step-by-Step Reasoning and Retrieval for LLMs via Reinforcement Learning
    Yuan Li, Qi Luo, Xiaonan Li, Bufan Li, Qinyuan Cheng, Bo Wang, Yining Zheng, Yuxin Wang, Zhangyue Yin, Xipeng Qiu
  • ‘Hello, World!’: Making GNNs Talk with LLMs
    Sunwoo Kim, Soo Yong Lee, Jaemin Yoo, Kijung Shin
  • Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM
    Dingjie Song, Sicheng Lai, Mingxuan Wang, Shunian Chen, Lichao Sun, Benyou Wang
  • NLKI: A Lightweight Natural Language Knowledge Integration Framework for Improving Small VLMs in Commonsense VQA Tasks
    Aritra Dutta, Swapnanil Mukherjee, Deepanway Ghosal, Somak Aditya
  • Text or Pixels? Evaluating Efficiency and Understanding of LLMs with Visual Text Inputs
    Yanhong Li, Zixuan Lan, Jiawei Zhou
  • Assessing Socio-Cultural Alignment and Technical Safety of Sovereign LLMs
    Kyubyung Chae, Gihoon Kim, Gyuseong Lee, Taesup Kim, Jaejin Lee, Heejin Kim
  • Sample Efficient Alignment Learning With Episodic Control
    Van Dai Do, Quan Hung Tran, Ahmed Kirmani, Lu Zhang, Hung Le
  • Evaluating Automatic Speech Recognition Systems for Korean Meteorological Experts
    ChaeHun Park, Hojun Cho, Jaegul Choo
  • 3D-Aware Vision-Language Models Fine-Tuning with Geometric Distillation
    Seonho Lee, Jiho Choi, Inha Kang, Jiwook Kim, Junsung Park, Hyunjung Shim
  • CAPE: Context-Aware Personality Evaluation Framework for Large Language Models
    Jivnesh Sandhan, Fei Cheng, Tushar Sandhan, Yugo Murawaki
  • AgentThink: A Unified Framework for Tool-Augmented Chain-of-Thought Reasoning in Vision-Language Models for Autonomous Driving
    Kangan Qian, Sicong Jiang, Yang Zhong, Ziang Luo, Zilin Huang, Tianze Zhu, Kun Jiang, mengmeng yang, Zheng Fu, Jinyu Miao, Yining Shi, He Zhe Lim, Li Liu, Tianbao Zhou, Hongyi Wang, Huang Yu, Yifei HU, Guang Li, Guang Chen, Hao Ye, Lijun Sun, Diange Yang
  • Select to Know: An Internal-External Knowledge Self-Selection Framework for Domain-Specific Question Answering
    Bolei He, Xinran He, Run Shao, Shanfu Shu, xianwei xue, MingQuan Cheng, Haifeng Li, Zhen-Hua Ling
  • GenPTQ: Green Post-Training Quantization for Large-Scale ASR Models with Mixed-Precision Bit Allocation
    Beom Jin Kang, Hyun Kim
  • Where Does This Strange Smell Come from?: Enabling Conversational Interfaces for Artificial Olfaction
    Xueyi Zhou, Qi Lu, Dong-Kyu Chae
  • LightRAG: Simple and Fast Retrieval-Augmented Generation
    ZIRUI GUO, Lianghao Xia, Yanhua Yu, Tu Ao, Chao Huang
  • Beyond Distribution: Investigating Language Models’ Understanding of Sino-Korean Morphemes
    Taehee Jeon
  • Sarcasm-R1: Enhancing Sarcasm Detection through Focused Reasoning
    Qi Yang, Jingjie Zeng, Kai Ma, Liang Yang, Hongfei Lin
  • ISACL: Internal State Analyzer for Copyrighted Training Data Leakage
    Guangwei Zhang, Qisheng Su, Jiateng Liu, Cheng Qian, Yanzhou Pan, Manling Li, Yanjie Fu, Zhaozhuo Xu, Denghui Zhang
  • Steering LVLMs via Sparse Autoencoder for Hallucination Mitigation
    Zhenglin Hua, Jinghan He, Zijun Yao, Tianxu Han, Haiyun Guo, Yuheng Jia, Junfeng Fang
  • On the Perception Bottleneck of VLMs for Chart Understanding
    Junteng Liu, Weihao Zeng, Xiwen Zhang, Yijun Wang, Zifei Shan, Junxian He
  • Self-Guided Function Calling in Large Language Models via Stepwise Experience Recall
    Sijia Cui, Aiyao He, Shuai Xu, Hongming Zhang, Yanna Wang, Qingyang Zhang, Yajing Wang, bo xu
  • Multilingual Generative Retrieval via Cross-lingual Semantic Compression
    Simeng Wu, Ran Song, Yuxin Huang, Yan Xiang, Yantuan Xian, Shengxiang Gao, Zhengtao Yu
  • Towards Multi-Document Question Answering in Scientific Literature: Pipeline, Dataset, and Evaluation
    Hui Huang, Julien Velcin, Yacine Kessaci
  • Multilingual Knowledge Graph Completion via Efficient Multilingual Knowledge Sharing
    Xiaofei Gao, Ran Song, Shizhu He, Cunli Mao, Shengxiang Gao, Kang Liu, Zhengtao Yu
  • Mitigating Attention Localization in Small Scale: Self-Attention Refinement via One-step Belief Propagation
    Nakyung Lee, Yeongoon Kim, Minhae Oh, Jin Woo Koo, Hyewon Jo, Jungwoo Lee
  • Imagination and Contemplation: A Balanced Framework for Semantic-Augmented Multimodal Machine Translation
    Zhuang Yu, Shiliang Sun, Jing Zhao, Tengfei Song, Hao Yang
  • NeLLCom-Lex: A Neural-agent Framework to Study the Interplay between Lexical Systems and Language Use
    Yuqing Zhang, Ecesu Ürker, Tessa Verhoef, Gemma Boleda, Arianna Bisazza
  • RLMEval: Evaluating Research-Level Neural Theorem Proving
    Auguste Poiroux, Antoine Bosselut, Viktor Kunčak
  • KaeDe: Progressive Generation of Logical Forms via Knowledge-Aware Question Decomposition for Improved KBQA
    Ranran Bu, Jian Cao, Jianqi Gao, Shiyou Qian, Hongming Cai
  • Where Fact Ends and Fairness Begins: Redefining AI Bias Evaluation through Cognitive Biases
    Jen-tse Huang, Yuhang Yan, Linqi Liu, Yixin Wan, Wenxuan Wang, Kai-Wei Chang, Michael R. Lyu
  • Equal Truth: Rumor Detection with Invariant Group Fairness
    Junyi Chen, Mengjia Wu, Qian Liu, Jing Sun, Ying Ding, Yi Zhang
  • STEAM: A Semantic-Level Knowledge Editing Framework for Large Language Models
    Geunyeong Jeong, Juoh Sun, Seonghee Lee, Harksoo Kim
  • SoT: Structured-of-Thought Prompting Guides Multilingual Reasoning in Large Language Models
    Rui Qi, Zhibo Man, Yufeng Chen, Fengran Mo, Jinan Xu, Kaiyu Huang
  • How Reliable is Multilingual LLM-as-a-Judge?
    Xiyan Fu, Wei Liu
  • Cognitive-Level Adaptive Generation via Capability-Aware Retrieval and Style Adaptation
    Qingsong Wang, Tao Wu, Wang Lin, Yueying Feng, Gongsheng Yuan, Chang Yao, Jingyuan Chen
  • Data Doping or True Intelligence? Evaluating the Transferability of Injected Knowledge in LLMs
    Essa Jan, Muhammad Fareed Zaffar, Yasir Zaki, Moiz Ali, Muhammad Saram Hassan
  • INDOORWORLD: Integrating Physical Task Solving and Social Simulation in A Heterogeneous Multi-Agent Environment
    Dekun Wu, Frederik Brudy, Bang Liu, Yi Wang
  • ARXSA: A General Negative Feedback Control Theory in Vision-Language Models
    Zeyu Zhang, Tianqi Chen, Yuki Todo
  • Breaking the Attention Trap in Code LLMs: A Rejection Sampling Approach to Enhance Code Execution Prediction
    Xingcheng Ruan, Haoxiang Geng, Yunhui Xia, Bingran Zhao
  • HiMATE: A Hierarchical Multi-Agent Framework for Machine Translation Evaluation
    Shijie Zhang, Renhao Li, Songsheng Wang, Philipp Koehn, Min Yang, Derek F. Wong
  • ReliableEval: A Recipe for Stochastic LLM Evaluation via Method of Moments
    Gili Lior, Eliya Habba, Shahar Levy, Avi Caciularu, Gabriel Stanovsky
  • From Characters to Tokens: Dynamic Grouping with Hierarchical BPE
    Rares Dolga, Lucas Maystre, Tudor Berariu, David Barber
  • Auto-SLURP: A Benchmark Dataset for Evaluating Multi-Agent Frameworks in Smart Personal Assistant
    Lei Shen, Xiaoyu Shen
  • NER Retriever: Zero-Shot Named Entity Retrieval with Type-Aware Embeddings
    Or Shachar, Uri Katz, Yoav Goldberg, Oren Glickman
  • MMATH: A Multilingual Benchmark for Mathematical Reasoning
    Wenyang Luo, Xin Zhao, Jing Sha, Shijin Wang, Ji-Rong Wen
  • MultiClaimNet: A Massively Multilingual Dataset of Fact-Checked Claim Clusters
    Rrubaa Panchendrarajan, Rubén Míguez Pérez, Arkaitz Zubiaga
  • DS-MHP: Improving Chain-of-Thought through Dynamic Subgraph-Guided Multi-Hop Path
    Yongqiang Liu, Wenjun Wang, Binrong Liu, Qiyao Peng, Hongtao Liu, XueWei Li
  • LongTail-Swap: benchmarking language models’ abilities on rare words
    Robin Algayres, Mahi Luthra, Jiayi Shen, Youssef Benchekroun, Dongyan Lin, Rashel Moritz, Juan Pino, Emmanuel Dupoux
  • TF-Mamba: Text-enhanced Fusion Mamba with Missing Modalities for Robust Multimodal Sentiment Analysis
    Xiang Li, Xianfu Cheng, Dezhuang Miao, Xiaoming Zhang, Zhoujun Li
  • Are Economists Always More Introverted? Analyzing Consistency in Persona-Assigned LLMs
    Manon Reusens, Bart Baesens, David Jurgens
  • Can you SPLICE it together? A Human Curated Benchmark for Probing Visual Reasoning in VLMs
    Mohamad Ballout, Okajevo Wilfred, Seyedalireza Yaghoubi, Nohayr Muhammad Abdelmoneim, Julius Mayer, Elia Bruni
  • On the Effectiveness of Prompt-Moderated LLMs for Math Tutoring at the Tertiary Level
    Sebastian Steindl, Fabian Brunner, Nada Sissouno, Dominik Schwagerl, Florian Schöler-Niewiera, Ulrich Schäfer
  • SkewRoute: Training-Free LLM Routing for Knowledge Graph Retrieval-Augmented Generation via Score Skewness of Retrieved Context
    Hairu Wang, Yuan Feng, Yukun Cao, Xike Xie, S Kevin Zhou
  • Acquiescence Bias in Large Language Models
    Daniel Braun
  • Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games
    Niv Eckhaus, Uri Berger, Gabriel Stanovsky
  • How Sampling Affects the Detectability of Machine-written texts: A Comprehensive Study
    Matthieu Dubois, François Yvon, Pablo Piantanida
  • An Improved, Strong Baseline for Pre-Trained Large Language Models as Task-Oriented Dialogue Systems
    Sebastian Steindl, André Kestler, Ulrich Schäfer, Bernd Ludwig
  • MATCH: Task-Driven Code Evaluation through Contrastive Learning
    Marah Ghoummaid, Vladimir Tchuiev, Ofek Glick, Michal Moshkovitz, Dotan Di Castro
  • Evaluating Large Language Models for Cross-Lingual Retrieval
    Longfei Zuo, Pingjun Hong, Oliver Kraus, Barbara Plank, Robert Litschko
  • SGCD: Subtask-Guided Causal-Debiasing Framework for Robust Cross-Utterance Sentiment Quadruple Extraction in Dialogues
    Xiang Li, Keyu Yao, Gang Shen
  • FaMTEB: Massive Text Embedding Benchmark in Persian Language
    Erfan Zinvandi, Morteza Alikhani, Mehran Sarmadi, Zahra Pourbahman, Sepehr Arvin, Reza Kazemi, Arash Amini
  • Leveraging High-Resource English Corpora for Cross-lingual Domain Adaptation in Low-Resource Japanese Medicine via Continued Pre-training
    Kazuma Kobayashi, Zhen Wan, Fei Cheng, Yuma Tsuta, Xin Zhao, Junfeng Jiang, Jiahao Huang, Zhiyi Huang, Yusuke Oda, Rio Yokota, Yuki Arase, Daisuke Kawahara, Akiko Aizawa, Sadao Kurohashi
  • Structure Trumps Size: Rethinking Data Quality for LLM Reasoning
    Hu Xu, Zeyan Li, Rui Wang, Jianfeng Xu
  • A Zero-Shot Neuro-Symbolic Approach for Complex Knowledge Graph Question Answering
    Prerna Agarwal, Srikanta Bedathur
  • Making Every Step Effective: Jailbreaking Large Vision-Language Models Through Hierarchical KV Equalization
    Shuyang Hao, Yiwei Wang, Bryan Hooi, Jun Liu, Muhao Chen, Zi Huang, Yujun Cai
  • MT-Mol: Multi Agent System with Tool-based Reasoning for Molecular Optimization
    Hyomin Kim, Yunhui Jang, Sungsoo Ahn
  • A Survey on LLM-powered Agents for Recommender Systems
    Qiyao Peng, Hongtao Liu, Hua Huang, Jian Yang, Qing Yang, Minglai Shao
  • Efficiently Selecting Response Generation Strategies for Synthetic Data Construction by Self-Aligned Perplexity
    Xuan Ren, Lingqiao Liu, Qi Chen
  • Benchmarking for Domain-Specific LLMs: A Case Study on Academia and Beyond
    Rubing Chen, Jiaxin Wu, Jian Wang, Xulu Zhang, Wenqi Fan, Chenghua Lin, Xiaoyong Wei, Li Qing
  • FrameEOL: Semantic Frame Induction using Causal Language Models
    Chihiro Yano, Kosuke Yamada, Hayato Tsukagoshi, Ryohei Sasano, Koichi Takeda
  • CaTER: A Framework for Context-aware Topology Entity Retrieval Contrastive Learning in End-to-End Task-Oriented Dialogue Systems
    Di Wu hebeu, Zhizhi Yu
  • Attribution and Application of Multiple Neurons in Multimodal Large Language Models
    Feiyu Wang, Ziran Zhao, Pengyuan Liu, Dong Yu
  • When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA
    Elisei Rykov, Kseniia Petrushina, Maksim Savkin, Valerii Olisov, Artem Vazhentsev, Kseniia Titova, Alexander Panchenko, Vasily Konovalov, Julia Belikova
  • Unraveling Misinformation Propagation in LLM Reasoning
    Yiyang Feng, Yichen Wang, Shaobo Cui, Boi Faltings, Mina Lee, Jiawei Zhou
  • RAISE: Reinforced Adaptive Instruction Selection For Large Language Models
    Qingsong Lv, Yangning Li, Zihua Lan, Zishan Xu, Jiwei Tang, Tingwei Lu, Yinghui Li, Wenhao Jiang, Hong-Gee Kim, Hai-Tao Zheng, Philip S. Yu
  • Teaching According to Talents! Instruction Tuning LLMs with Competence-Aware Curriculum Learning
    Yangning Li, Tingwei Lu, Yinghui Li, Yankai Chen, Wei-Chieh Huang, Wenhao Jiang, Hui Wang, Hai-Tao Zheng, Philip S. Yu
  • Let Them Down Easy! Contextual Effects of LLM Guardrails on User Perceptions and Preferences
    Mingqian Zheng, Wenjia Hu, Patrick Zhao, Motahhare Eslami, Jena D. Hwang, Faeze Brahman, Carolyn Rose, Maarten Sap
  • From Hypothesis to Publication: A Comprehensive Survey of AI-Driven Research Support Systems
    Zekun Zhou, Xiaocheng Feng, Lei Huang, Xiachong Feng, Ziyun Song, Ruihan Chen, Liang Zhao, Weitao Ma, Yuxuan Gu, Baoxin Wang, Dayong Wu, Guoping Hu, Ting Liu, Bing Qin
  • Enhancing Model Privacy in Federated Learning with Random Masking and Quantization
    Zhibo Xu, Zhu JianHao, Jingwen Xu, Changze Lv, Zhenghua Wang, Zisu Huang, Xiaohua Wang, Muling Wu, Qi Qian, Xiaoqing Zheng, Xuanjing Huang
  • SuPreME: A Supervised Pre-training Framework for Multimodal ECG Representation Learning
    Mingsheng Cai, Jiuming Jiang, Wenhao Huang, Che Liu, Rossella Arcucci
  • Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique
    Tej Deep Pala, Vernon Toh, Rishabh Bhardwaj, Soujanya Poria
  • Do What? Teaching Vision-Language-Action Models to Reject the Impossible
    Wen-Han Hsieh, Elvis Hsieh, Dantong Niu, Trevor Darrell, Roei Herzig, David M. Chan
  • AgentInit: Initializing LLM-based Multi-Agent Systems via Diversity and Expertise Orchestration for Effective and Efficient Collaboration
    Chunhao Tian, Yutong Wang, Xuebo Liu, Zhexuan Wang, Liang Ding, Miao Zhang, Min Zhang
  • Time to Revisit Exact Match
    Auss Abbood, Zaiqiao Meng, Nigel Collier
  • LongTableBench: Benchmarking Long-Context Table Reasoning across Real-World Formats and Domains
    Liyao Li, Jiaming Tian, Hao Chen, Wentao Ye, Chao Ye, Haobo Wang, Ningtao Wang, Xing Fu, Gang Chen, Junbo Zhao
  • Exploring and Evaluating Multimodal Knowledge Reasoning Consistency of Multimodal Large Language Models
    Boyu Jia, Junzhe Zhang, Huixuan Zhang, Xiaojun Wan
  • MPTA: MultiTask Personalization Assessment
    Matthieu Tehenan, Eric Chamoun, Andreas Vlachos
  • Semantic Geometry of Sentence Embeddings
    Matthieu Tehenan
  • ReAlign: Structured Revision for Small Language Model Alignment
    Ruijun Chen, Jiajian Guo, Hongzhan Chen, Fanqi Wan, Qifan Wang, Xiaojun Quan
  • Curr-ReFT: Overcoming Training Bottlenecks in Small-scale Vision-Language Models via Curriculum Reinforcement Finetuning
    Huilin Deng, Ding Zou, Xinghao Zhao, Rui Ma, Yanming Guo, Yang Cao, Yu Kang
  • Layer-Aware Task Arithmetic: Disentangling Task-Specific and Instruction-Following Knowledge
    Yan-Lun Chen, Yi-Ru Wei, Chia-Yi Hsu, Chia-Mu Yu, Chun-Ying Huang, Ying-Dar Lin, Yu-Sung Wu, Wei-Bin Lee
  • Revisiting Pruning vs Quantization for Small Language Models
    Zihan Zhou, Simon Kurz, Zhixue Zhao
  • CLaw: Benchmarking Chinese Legal Knowledge in Large Language Models - A Fine-grained Corpus and Reasoning Analysis
    Xinzhe Xu, Liang Zhao, Hongshen Xu, chenchenc
  • polyBART: A Chemical Linguist for Polymer Property Prediction and Generative Design
    Anagha Savit, Harikrishna Sahu, Shivank S. Shukla, Wei Xiong, Rampi Ramprasad
  • A Survey of RAG-Reasoning Systems in Large Language Models
    Yangning Li, Weizhi Zhang, Yuyao Yang, Wei-Chieh Huang, Yaozu Wu, Junyu Luo, Yuanchen Bei, Henry Peng Zou, Xiao Luo, Yusheng Zhao, Chunkit Chan, Yankai Chen, Zhongfen Deng, Yinghui Li, Hai-Tao Zheng, Dongyuan Li, Renhe Jiang, Ming Zhang, Yangqiu Song, Philip S. Yu
  • REGen: A Reliable Evaluation Framework for Generative Event Argument Extraction
    Omar Sharif, Joseph Gatto, Madhusudan Basak, Sarah Masud Preum
  • Mitigating Interviewer Bias in Multimodal Depression Detection: An Approach with Adversarial Learning and Contextual Positional Encoding
    Enshi Zhang, Christian Poellabauer
  • AMIA: Automatic Masking and Joint Intention Analysis Makes LVLMs Robust Jailbreak Defenders
    Yuqi Zhang, Yuchun Miao, Zuchao Li, Liang Ding
  • Disentangling Language Understanding and Reasoning Structures in Cross-lingual Chain-of-Thought Prompting
    Khanh-Tung Tran, Nguyet-Hang Vu, Barry O’Sullivan, Hoang D. Nguyen
  • MoRoVoc: A Large Dataset for Geographical Variation Identification of the Spoken Romanian Language
    Andrei-Marius Avram, Bănescu Ema-Ioana, Anda-Teodora Robea, Dumitru-Clementin Cercel
  • Language-Informed Synthesis of Rational Agent Models for Grounded Theory-of-Mind Reasoning On-the-fly
    Lance Ying, Ryan Truong, Katherine M. Collins, Cedegao E. Zhang, Megan Wei, Tyler Brooke-Wilson, Tan Zhi-Xuan, Lionel Wong, Joshua B. Tenenbaum
  • MOLE: Metadata Extraction and Validation in Scientific Papers Using LLMs
    Zaid Alyafeai, Maged S. Al-shaibani, Bernard Ghanem
  • MASSV: Multimodal Adaptation and Self-Data Distillation for Speculative Decoding of Vision-Language Models
    Mugilan Ganesan, Shane Segal, Ankur Aggarwal, Nish Sinnadurai, Sean Lie, Vithursan Thangarasa
  • FESTA: Functionally Equivalent Sampling for Trust Assessment of Multimodal LLMs
    Debarpan Bhattacharya, Apoorva Kulkarni, Sriram Ganapathy
  • ClaimGen-CN: A Large-scale Chinese Dataset for Legal Claim Generation
    Siying Zhou, Yiquan Wu, Hui Chen, Xueyu Hu, Kun Kuang, Adam Jatowt, Chunyan Zheng, Fei Wu
  • Summarize-Exemplify-Reflect: Data-driven Insight Distillation Empowers LLMs for Few-shot Tabular Classification
    Yifei Yuan, Jiatong Li, Weijia Zhang, Mohammad Aliannejadi, Evangelos Kanoulas, Renjun Hu
  • Rethinking LLM Uncertainty: A Multi-Agent Approach to Estimating Black-Box Model Uncertainty
    Yu Feng, Phu Mon Htut, Zheng Qi, Wei Xiao, Manuel Mager, Nikolaos Pappas, Kishaloy Halder, Yang Li, Yassine Benajiba, Dan Roth
  • Stress-Testing the Reasoning Competence of Language Models With Formal Proofs
    Konstantine Arkoudas, Serafim Batzoglou
  • Topic-Guided Reinforcement Learning with LLMs for Enhancing Multi-Document Summarization
    Chuyuan Li, Austin Xu, Shafiq Joty, Giuseppe Carenini
  • FACTCHECKMATE: Preemptively Detecting and Mitigating Hallucinations in LMs
    Deema Alnuhait, Neeraja Kirtane, Muhammad Khalifa, Hao Peng
  • Dialectal Toxicity Detection: Evaluating LLM-as-a-Judge Consistency Across Language Varieties
    Fahim Faisal, Md Mushfiqur Rahman, Antonios Anastasopoulos
  • Mitigate One, Skew Another? Tackling Intersectional Biases in Text-to-Image Models
    Pushkar Shukla, Aditya Chinchure, Emily Diana, Alexander Tolbert, Kartik Hosanagar, Vineeth N. Balasubramanian, Leonid Sigal, Matthew A. Turk
  • Language-Specific Layer Matters: Efficient Multilingual Enhancement for Large Vision-Language Models
    Yuchun Fan, Yilin Wang, Yongyu Mu, Lei Huang, Bei Li, Xiaocheng Feng, Tong Xiao, JingBo Zhu
  • InfAL: Inference Time Adversarial Learning for Improving Research Ideation
    Sikun Guo, Amir Hassan Shariatmadari, Peng Wang, Albert Huang, Aidong Zhang
  • Speculative Decoding for Multi-Sample Inference
    Yiwei Li, Jiayi Shi, Shaoxiong Feng, Peiwen Yuan, Xinglin Wang, Yueqi Zhang, Ji Zhang, Chuyi Tan, Boyuan Pan, Yao Hu, Kan Li
  • LSRL: Process-Supervised GRPO on Latent Recurrent States Improves Mathematical Reasoning
    Hangliang Ren
  • Multi-token Mask-filling and Implicit Discourse Relations
    Meinan Liu, Yunfang Dong, Xixian Liao, Bonnie Webber
  • Schema Generation for Large Knowledge Graphs Using Large Language Models
    Bohui Zhang, Yuan He, Lydia Pintscher, Albert Meroño-Peñuela, Elena Simperl
  • MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search
    Yunhai Hu, Yilun Zhao, Chen Zhao, Arman Cohan
  • What if Othello-Playing Language Models Could See?
    Xinyi Chen, Yifei Yuan, Jiaang Li, Serge Belongie, Maarten de Rijke, Anders Søgaard
  • LLM-Based Web Data Collection for Research Dataset Creation
    Thomas Berkane, Marie-Laure Charpignon, Maimuna S. Majumder
  • PsyScam: A Benchmark for Psychological Techniques in Real-World Scams
    Shang Ma, Tianyi Ma, Jiahao Liu, Wei Song, Zhenkai Liang, Xusheng Xiao, Yanfang Ye
  • LoRaDA: Low-Rank Direct Attention Adaptation for Efficient LLM Fine-tuning
    Zhangming Li, Qinghao Hu, Yiqun Chen, Peisong Wang, Yifan Zhang, Jian Cheng
  • Inductive Reasoning on Few-Shot Knowledge Graphs with Task-Aware Language Models
    Cheng Yan, Feng Zhao, Ruilin Zhao, Hong Zhang
  • ForestCast: Open-Ended Event Forecasting with Semantic News Forest
    Zi Yu, Shaoxiang Wang, Guozheng Li, Yu Zhang, Chi Harold Liu
  • Agentic Medical Knowledge Graphs Enhance Medical Question Answering: Bridging the Gap Between LLMs and Evolving Medical Knowledge
    Mohammad Reza Rezaei, Reza Saadati Fard, Jayson Lee Parker, Rahul Krishnan, Milad Lankarany
  • Text Anomaly Detection with Simplified Isolation Kernel
    Yang Cao, Sikun Yang, Yujiu Yang, Lianyong Qi, Ming Liu
  • Idola Tribus of AI: Large Language Models tend to perceive order where none exists
    Shin-nosuke Ishikawa, Masato Todo, Taiki Ogihara, Hirotsugu Ohba
  • Thunder-DeID: Accurate and Efficient De-identification Framework for Korean Court Judgments
    Sungeun Hahm, Heejin Kim, Gyuseong Lee, Hyunji M. Park, Jaejin Lee
  • Multi-Agent Autonomous Driving Systems with Large Language Models: A Survey of Recent Advances, Resources, and Future Directions
    Yaozu Wu, Dongyuan Li, Yankai Chen, Renhe Jiang, Henry Peng Zou, Wei-Chieh Huang, Yangning Li, Liancheng Fang, Zhen Wang, Philip S. Yu
  • Comprehensive Evaluation on Lexical Normalization: Boundary-Aware Approaches for Unsegmented Languages
    Shohei Higashiyama, Masao Utiyama
  • Explainable Text Classification with LLMs: Enhancing Performance through Dialectical Prompting and Explanation-Guided Training
    Huaming Du, Lei Yuan, Guisong Liu, Carl Yang, Gang Kou
  • MultiPL-MoE: Multi-Programming-Lingual Extension of Large Language Models through Hybrid Mixture-of-Experts
    Qing Wang, Xue Han, Jiahui Wang, Lehao Xing, Qian Hu, Lianlian Zhang, Junlan Feng, Chao Deng
  • AutoSpec: An Agentic Framework for Automatically Drafting Patent Specification
    Ryan Shea, Zhou Yu
  • LimaCost: Data Valuation for Instruction Tuning of Large Language Models
    Hyeonseok Moon, Jaehyung Seo, Seonmin Koo, Jinsung Kim, Young-kyoung Ham, Jiwon Moon, Heuiseok Lim
  • Two Challenges, One Solution: Robust Multimodal Learning through Dynamic Modality Recognition and Enhancement
    Lanxin Bi, Yunqi Zhang, Luyi Wang, Yake Niu, Hui Zhao
  • SwiftPrune: Hessian-Free Weight Pruning for Large Language Models
    Yuhan Kang, Yang Shi, Mei Wen, Jun He, Jianchao Yang, Zeyu Xue, Jing Feng, Xinwang Liu
  • Training LLMs for Optimization Modeling via Iterative Data Synthesis and Structured Validation
    Yang Wu, Yifan Zhang, Yurong Wu, Yuran Wang, Junkai Zhang, Jian Cheng
  • Exploiting Prompt-induced Confidence for Black-Box Attacks on LLMs
    Meina Chen, Yihong Tang, Kehai Chen
  • DPF-CM: A Data Processing Framework with Privacy-Preserving Vector Databases for Chinese Medical LLMs Training and Deployment
    Wei Huang, Anda Cheng, Zhao Zhang, Yinggui Wang
  • Graph-Reward-SQL: Execution-Free Reinforcement Learning for Text-to-SQL via Graph Matching and Stepwise Reward
    Han Weng, Puzhen Wu, Cui Longjie, Yi Zhan, Boyi Liu, Yuanfeng Song, Dun Zeng, Yingxiang Yang, Qianru Zhang, Dong Huang, Xiaoming Yin, Yang Sun, Xing Chen
  • StatsChartMWP: A Dataset for Evaluating Multimodal Mathematical Reasoning Abilities on Math Word Problems with Statistical Charts
    Dan Zhu, Tianqiao Liu, Zitao Liu
  • Logic-Thinker: Teaching Large Language Models to Think more Logically
    Chengyao Wen, Qiang Cheng, Shaofei Wang, Zhizhen Liu, Lei Liang, Deng Zhao
  • ACEBench: A Comprehensive Evaluation of LLM Tool Usage
    Chen Chen, Xinlong Hao, Weiwen Liu, Xu Huang, Xingshan Zeng, Shuai Yu, Dexun Li, Yuefeng Huang, Xiangcheng Liu, Wang Xinzhi, Wu Liu
  • RevPRAG: Revealing Poisoning Attacks in Retrieval-Augmented Generation through LLM Activation Analysis
    Xue Tan, Hao Luan, Mingyu Luo, Xiaoyan Sun, Ping Chen, Jun Dai
  • DaMoC: Efficiently Selecting the Optimal Large Language Model for Fine-tuning Domain Tasks Based on Data and Model Compression
    Wei Huang, Huang Wei, Yinggui Wang
  • CoAT: Chain-of-Associated-Thoughts Framework for Enhancing Large Language Models Reasoning
    Jianfeng Pan, Senyou Deng, Shaomang Huang
  • ChartM$^3$: A Multi-Stage Code-Driven Pipeline for Constructing Multi-Dimensional and Multi-Step Visual Reasoning Data in Chart Comprehension
    Duo Xu, Hao Cheng, Xin Lin, Zhen Xie, Hao Henry Wang
  • Can LLMs Truly Plan? A Comprehensive Evaluation of Planning Capabilities
    Gayeon Jung, HyeonSeok Lim, Minjun Kim, Joon-Ho Lim, KyungTae Lim, Hansaem Kim
  • MARIO-0.5B: A Multi-Agent Lightweight Model for Real-Time Open Information Extraction in Low-Resource Settings
    Donghai Zhang, Shuangtao Yang, Bo Fu, Dong Xiaozheng, Wei Song
  • BiMax: Bidirectional MaxSim Score for Document-Level Alignment
    Xiaotian Wang, Takehito Utsuro, Masaaki Nagata
  • DocMMIR: A Framework for Document Multi-modal Information Retrieval
    Zirui Li, Siwei Wu, Yizhi Li, Xingyu Wang, Yi Zhou, Chenghua Lin
  • MoVoC: Morphology-Aware Subword Construction for Ge’ez Script Languages
    Hailay Kidu Teklehaymanot, Dren Fazlija, Wolfgang Nejdl
  • MMA: Cross-Domain Knowledge Integration via Mixture of Multi-Domain Agents
    Kehang Jia, Juntao Li, Xiaobo Liang, Yisheng Xiao, Yixuan Yang, Min Zhang
  • HAWK: Highlighting Entity-aware Knowledge for Alleviating Information Sparsity in Long Contexts
    Seonmin Koo, Jinsung Kim, Chanjun Park, Heuiseok Lim
  • Sensitivity-LoRA: Low-Load Sensitivity-Based Fine-Tuning for Large Language Models
    Hao Zhang, Bo Huang, Zhenjia Li, Xi Xiao, Hui Yi Leong, Zumeng Zhang, Xinwei Long, Tianyang Wang, Hao Xu
  • ROSE: A Reward-Oriented Data Selection Framework for LLM Task-Specific Instruction Tuning
    Yang Wu, Huayi Zhang, Yizheng Jiao, Lin Ma, Xiaozhong Liu, Jinhong Yu, Dongyu Zhang, Dezhi Yu, Wei Xu
  • SimBA: Simplifying Benchmark Analysis Using Performance Matrices Alone
    Nishant Subramani, Alfredo Gomez, Mona T. Diab
  • MarathiEmoExplain: A Dataset for Sentiment, Emotion, and Explanation in Low-Resource Marathi
    Anuj Kumar, Mohammed Faisal Sayed, Satyadev Ahlawat, Yamuna Prasad
  • Active Domain Knowledge Acquisition with 100-Dollar Budget: Enhancing LLMs via Cost-Efficient, Expert-Involved Interaction in Sensitive Domains
    Yang Wu, Raha Moraffah, Rujing Yao, Jinhong Yu, Zhimin Tao, Xiaozhong Liu
  • Structure-aware Propagation Generation with Large Language Models for Fake News Detection
    Mengyang Chen, Lingwei Wei, Wei Zhou, Songlin Hu
  • UniCoM: A Universal Code-Switching Speech Generator
    Sangmin Lee, Woojin Chung, Seyun Um, Hong-Goo Kang
  • Mitigating Sequential Dependencies: A Survey of Algorithms and Systems for Generation-Refinement Frameworks in Autoregressive Models
    Yunhai Hu, Zining Liu, Zhenyuan Dong, Tianfan Peng, Bradley McDanel, Sai Qian Zhang
  • Do We Really Need All Those Dimensions? An Intrinsic Evaluation Framework for Compressed Embeddings
    Nathan Inkiriwang, Necva Bölücü, Garth Tarr, Maciej Rybinski
  • Mixture of LoRA Experts for Continual Information Extraction with LLMs
    Zitao Wang, Xinyi Wang, Wei Hu
  • Spelling-out is not Straightforward: LLMs’ Capability of Tokenization from Token to Characters
    Tatsuya Hiraoka, Kentaro Inui
  • OAgents: An Empirical Study of Building Effective Agents
    He Zhu, Tianrui Qin, King Zhu, Heyuan Huang, Yeyi Guan, Jinxiang Xia, Hanhao Li, Yi Yao, Ningning Wang, Pai Liu, Tianhao Peng, Sunny Gui, Li Xiaowan, Yuhui Liu, Xiangru Tang, Jian Yang, Ge Zhang, Xitong Gao, Yuchen Eleanor Jiang, Changwang Zhang, Jun Wang, Jiaheng Liu, Wangchunshu Zhou
  • 2Columns1Row: A Russian Benchmark for Textual and Multimodal Table Understanding and Reasoning
    Vildan Saburov, Daniil Vodolazsky, Danil Sazanakov, Alena Fenogenova
  • Permitted Knowledge Boundary: Evaluating the Knowledge-Constrained Responsiveness of Large Language Models
    Wenrui Bao, Kai Wang, Siqiang Luo, Xiang Li
  • A Closer Look at Bias and Chain-of-Thought Faithfulness of Large (Vision) Language Models
    Sriram Balasubramanian, Samyadeep Basu, Soheil Feizi
  • From Remembering to Metacognition: Do Existing Benchmarks Accurately Evaluate LLMs?
    Geng Zhang, Sihang Jiang, Yizhou Ying, Guanglei Yue, Jiaqing Liang, Yifei Fu, Hailin Hu, Yanghua Xiao
  • How a Bilingual LM Becomes Bilingual: Tracing Internal Representations with Sparse Autoencoders
    Tatsuro Inaba, Go Kamoda, Kentaro Inui, Masaru Isonuma, Yusuke Miyao, Yohei Oseki, Yu Takagi, Benjamin Heinzerling
  • MultiConIR: Towards Multi-Condition Information Retrieval
    Xuan Lu, Sifan Liu, Bochao Yin, Yongqi Li, Xinghao Chen, Hui Su, Yaohui Jin, Wenjun Zeng, Xiaoyu Shen
  • HMCL: Task-Optimal Text Representation Adaptation through Hierarchical Contrastive Learning
    Zhenyi Wang, Yapeng Jia, Haiyan Ning, Peng Wang, Dan Wang, Yitao Cao
  • KBAlign: Efficient Self Adaptation on Specific Textual Knowledge Bases
    Zheni Zeng, Yuxuan Chen, Shi Yu, Ruobing Wang, Yukun Yan, Zhenghao Liu, Shuo Wang, Xu Han, Zhiyuan Liu, Maosong Sun
  • Revisiting Chain-of-Thought Prompting: Zero-shot Can Be Stronger than Few-shot
    Xiang Cheng, Chengyan Pan, Minjun Zhao, Deyang Li, Fangchao Liu, Xinyu Zhang, Xiao Zhang, Yong Liu
  • RMTBench: Benchmarking LLMs Through Multi-Turn User-Centric Role-Playing
    Hao Xiang, Tianyi Tang, Yang Su, Bowen Yu, An Yang, Fei Huang, Yichang Zhang, Yaojie Lu, Hongyu Lin, Xianpei Han, Jingren Zhou, Junyang Lin, Le Sun
  • Smart-Searcher: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning
    Huatong Song, Jinhao Jiang, Wenqing Tian, Zhipeng Chen, Yuhuan Wu, Jiahao Zhao, Yingqian Min, Xin Zhao, Lei Fang, Ji-Rong Wen
  • InteGround: On the Evaluation of Verification and Retrieval Planning in Integrative Grounding
    Cheng Jiayang, Qianqian Zhuang, Haoran Li, Chunkit Chan, Xin Liu, Lin Qiu, Yangqiu Song
  • MM-CRITIC: A Holistic Evaluation of Large Multimodal Models as Multimodal Critique
    Gailun Zeng, Ziyang Luo, Hongzhan Lin, Yuchen Tian, Kaixin Li, Ziyang Gong, Jianxiong Guo, Jing Ma
  • On the Correspondence between the Squared Norm and Information Content in Text Embeddings
    Enrique Amigo, Adrian Ghajari, Alejandro Benito-Santos, Diego De la Fuente Rodríguez
  • Adversary-Aware DPO: Enhancing Safety Alignment in Vision Language Models via Adversarial Training
    Fenghua Weng, Jian Lou, Jun Feng, Minlie Huang, Wenjie Wang
  • SLiNT: Structure-aware Language Model with Injection and Contrastive Training for Knowledge Graph Completion
    Mengxue Yang, Chun Yang, Jiaqi Zhu, Jiafan Li, Jingqi Zhang, Yuyang Li, Ying Li
  • LAVa: Layer-wise KV Cache Eviction with Dynamic Budget Allocation
    Yiqun Shen, Song Yuan, Zhengze Zhang, Xiaoliang Wang, Daxin Jiang, Nguyen Cam-Tu
  • LoRA-PAR: A Flexible Dual-System LoRA Partitioning Approach to Efficient LLM Fine-Tuning
    Yining Huang, Bin Li, Keke Tang, Meilian Chen
  • SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis
    Shuang Sun, Huatong Song, Yuhao Wang, Ruiyang Ren, Jinhao Jiang, Junjie Zhang, Fei Bai, Jia Deng, Xin Zhao, Zheng Liu, Lei Fang, Zhongyuan Wang, Ji-Rong Wen
  • LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning
    Zhibin Lan, Liqiang Niu, Fandong Meng, Jie Zhou, Jinsong Su
  • SampleMix: A Sample-wise Pre-training Data Mixing Strategy by Coordinating Data Quality and Diversity
    Xiangyu Xi, Deyang Kong, Jian Yang, Jiawei Yang, Zhengyu Chen, Wei Wang, Jingang Wang, Xunliang Cai, Shikun Zhang, Wei Ye
  • Evaluating Test-Time Scaling LLMs for Legal Reasoning: OpenAI o1, DeepSeek-R1, and Beyond
    Yinghao Hu, Yaoyao Yu, Leilei Gan, Bin Wei, Kun Kuang, Fei Wu
  • LLM Agents for Education: Advances and Applications
    Zhendong Chu, Shen Wang, Jian Xie, Tinghui Zhu, Yibo Yan, Jingheng Ye, Aoxiao Zhong, Xuming Hu, Jing Liang, Philip S. Yu, Qingsong Wen
  • Modeling Subjectivity in Cognitive Appraisal with Language Models
    Yuxiang Zhou, Hainiu Xu, Desmond Ong, Maria Liakata, Petr Slovak, Yulan He
  • Dementia Through Different Eyes: Explainable Modeling of Human and LLM Perceptions for Early Awareness
    Lotem Peled-Cohen, Maya Zadok, Nitay Calderon, Hila Gonen, Roi Reichart
  • Mitigating Hallucinations in Large Vision-Language Models by Self-Injecting Hallucinations
    Yifan Lu, Ziqi Zhang, Chunfeng Yuan, Jun Gao, Congxuan Zhang, Xiaojuan Qi, Bing Li, Weiming Hu
  • How Much Do Large Language Models Know about Human Motion? A Case Study in 3D Avatar Control
    Kunhang Li, Jason Naradowsky, Yansong Feng, Yusuke Miyao
  • The Search for Conflicts of Interest: Open Information Extraction in Scientific Publications
    Garima Gaur, Oana Balalau, Ioana Manolescu, Prajna Devi Upadhyay
  • On Collaborating Small and Large Models For Few-shot Intent Detection
    Peng Chen, Bang Wang
  • A Survey on LLMs for Story Generation
    Maria Teleki, Vedangi Bengali, Xiangjue Dong, Sai Tejas Janjur, Haoran Liu, Tian Liu, Cong Wang, Ting Liu, Yin Zhang, Frank Shipman, James Caverlee
  • From Knowledge to Treatment: Large Language Model Assisted Biomedical Concept Representation for Drug Repurposing
    Chengrui Xiang, Tengfei Ma, Xiangzheng Fu, Yiping Liu, Bosheng Song, Xiangxiang Zeng
  • SKRAG: A Retrieval-Augmented Generation Framework Guided by Reasoning Skeletons over Knowledge Graphs
    Xiaotong Xu, Yizhao Wang, Yunfei Liu, Shengyang Li
  • A Generative Framework for Personalized Sticker Retrieval
    Changjiang Zhou, Ruqing Zhang, Jiafeng Guo, Yu-An Liu, Fan Zhang, Ganyuan Luo, Xueqi Cheng
  • Bridging Semantic and Modality Gaps in Zero-Shot Captioning via Retrieval from Synthetic Data
    Zhiyue Liu, Wenkai Zhou
  • Humor in Pixels: Benchmarking Large Multimodal Models Understanding of Online Comics
    Yuriel Wang Jun Long Ryan, Rui Yang Tan, Kenny Tsu Wei Choo, Roy Ka-Wei Lee
  • BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities
    Sahal Shaji Mullappilly, Mohammed Irfan Kurpath, Sara Pieri, Saeed Yahya Alseiari, Shanavas Cholakkal, Khaled M Aldahmani, Fahad Shahbaz Khan, Rao Muhammad Anwer, Salman Khan, Timothy Baldwin, Hisham Cholakkal
  • DeMAC: Enhancing Multi-Agent Coordination with Dynamic DAG and Manager-Player Feedback
    Yuhan Liu, Cong Xu, Lu Liu, Yihua Wang, Feiyu Chen, Qi Jia, Yaqian Zhao, Zhichun Wang, Xiang Li
  • Coherence of Argumentative Dialogue Snippets: A New Method for Large Scale Evaluation with an Application to Inference Anchoring Theory
    Paul Piwek, Jacopo Amidei, Svetlana Stoyanchev
  • Angular Dispersion Accelerates $k$-Nearest Neighbors Machine Translation
    Evgeniia Tokarchuk, Sergey Troshin, Vlad Niculae
  • Benchmarking Contextual and Paralinguistic Reasoning in Speech-LLMs: A Case Study with In-the-Wild Data
    Qiongqiong Wang, Hardik Bhupendra Sailor, Tianchi Liu, Wenyu Zhang, Muhammad Huzaifah, Nattadaporn Lertcheva, Shuo Sun, Nancy F. Chen, Jinyang Wu, AiTi Aw
  • This is not a Disimprovement: Improving Negation Reasoning in Large Language Models via Prompt Engineering
    Joshua Jose Dias Barreto, Abhik Jana
  • Make Every Letter Count: Building Dialect Variation Dictionaries from Monolingual Corpora
    Robert Litschko, Verena Blaschke, Diana Burkhardt, Barbara Plank, Diego Frassinelli
  • SelfAug: Mitigating Catastrophic Forgetting in Retrieval-Augmented Generation via Distribution Self-Alignment
    Yuqing Huang, Rongyang Zhang, Qimeng Wang, Chengqiang Lu, Yan Gao, Yi Wu, Yao Hu, Xuyang Zhi, Guiquan Liu, Xin Li, Hao Wang, Enhong Chen
  • SEKE: Specialised Experts for Keyword Extraction
    Matej Martinc, Thi Hong Hanh TRAN, Senja Pollak, Boshko Koloski
  • 1+1>2: A Synergistic Sparse and Low-Rank Compression Method for Large Language Models
    Zeliang Zong, Kai Zhang, Zheyang Li, Wenming Tan, Ye Ren, Yiyan Zhai, Jilin Hu
  • InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning
    Xiaotian Han, Yiren Jian, Xuefeng Hu, Haogeng Liu, Yiqi Wang, Qihang Fan, Yuang Ai, Huaibo Huang, Ran He, Zhenheng Yang, Quanzeng You
  • Zero-Shot Defense Against Toxic Images via Inherent Multimodal Alignment in LVLMs
    Wei Zhao, Zhe Li, Yige Li, Jun Sun
  • Retrieval Augmented Generation based context discovery for ASR
    Siskos Dimitrios, Stavros Papadopoulos, Pablo Peso Parada, Jisi Zhang, Karthikeyan Saravanan, Anastasios Drosou
  • pFedRAG: A Personalized Federated Retrieval-Augmented Generation System with Depth-Adaptive Tiered Embedding Tuning
    Hangyu He, Xin Yuan, Kai Wu, Ren Ping Liu, Wei Ni
  • ReCUT: Balancing Reasoning Length and Accuracy in LLMs via Stepwise Trails and Preference Optimization
    Zhensheng Jin, Xinze Li, Yifan Ji, Chunyi Peng, Zhenghao Liu, Qi Shi, Yukun Yan, Shuo Wang, Furong Peng, Ge Yu
  • CURE: Controlled Unlearning for Robust Embeddings — Mitigating Conceptual Shortcuts in Pre-Trained Language Models
    Aysenur Kocak, Shuo Yang, Bardh Prenkaj, Gjergji Kasneci
  • MLAlgo-Bench: Can Machines Implement Machine Learning Algorithms?
    Yunfei Wang, Yeqin Zhang, Yuyang Wu, Liang Lu, Phi Le Nguyen, Xiaoliang Wang, Nguyen Cam-Tu
  • Fair Text-Attributed Graph Representation Learning
    Ruilin Luo, Tianle Gu, Lin Wang, Yunfeng Zhou, Songtao Jiang, Lei Wang, Yujiu Yang
  • Human-Inspired Obfuscation for Model Unlearning: Local and Global Strategies with Hyperbolic Representations
    Zekun Wang, Jingjie Zeng, Yingxu Li, Liang Yang, Hongfei Lin
  • Do Influence Functions Work on Large Language Models?
    Zhe Li, Wei Zhao, Yige Li, Jun Sun
  • TRUEBench: Can LLM Response Meet Real-world Constraints as Productivity Assistant?
    Jiho Park, Jongyoon Song, Minjin Choi, Kyuho Heo, Taehun Huh, Ji Won Kim
  • CausalMACE: Causality Empowered Multi-Agents in Minecraft Cooperative Tasks
    Qi Chai, Zhang Zheng, Junlong Ren, Deheng Ye, Zichuan Lin, Hao Wang
  • Harry Potter is Still Here! Probing Knowledge Leakage in Targeted Unlearned Large Language Models
    Bang Trinh Tran To, Thai Le
  • Learning Trajectories of Figurative Language for Pre-Trained Language Models
    Nicola Arici, Luca Putelli, Ejdis Gjinika, Ivan Serina, Alfonso Gerevini
  • BcQLM: Efficient Vision-Language Understanding with Distilled Q-Gated Cross-Modal Fusion
    Sike Xiang, Shuang Chen, Amir Atapour-Abarghouei
  • HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals
    Guimin Hu, Daniel Hershcovich, Hasti Seifi
  • SubDocTrans: Enhancing Document-level Machine Translation with Plug-and-play Multi-granularity Knowledge Augmentation
    Hanghai Hong, Yibo Xie, Jiawei Zheng, Xiaoli Wang
  • Social Bias Evaluation for Large Language Models Requires Prompt Variations
    Rem Hida, Masahiro Kaneko, Naoaki Okazaki
  • Training with Fewer Bits: Unlocking Edge LLMs Training with Stochastic Rounding
    Taowen Liu, Marta Andronic, Deniz Gunduz, George Anthony Constantinides
  • FactReasoner: A Probabilistic Approach to Long-Form Factuality Assessment for Large Language Models
    Radu Marinescu, Debarun Bhattacharjya, Junkyu Lee, Tigran T. Tchrakian, Javier Carnerero-Cano, Yufang Hou, Elizabeth M. Daly, Alessandra Pascale
  • Robust Knowledge Editing via Explicit Reasoning Chains for Distractor-Resilient Multi-Hop QA
    Yuchen Wu, Liang Ding, Li Shen, Dacheng Tao
  • RadialRouter: Structured Representation for Efficient and Robust Large Language Models Routing
    Ruihan Jin, Pengpeng Shao, Zhengqi Wen, Jinyang Wu, Mingkuan Feng, Shuai Zhang, Jianhua Tao
  • Decoding Uncertainty: The Impact of Decoding Strategies for Uncertainty Estimation in Large Language Models
    Wataru Hashimoto, Hidetaka Kamigaito, Taro Watanabe
  • Elucidating Mechanisms of Demographic Bias in LLMs for Healthcare
    Hiba Ahsan, Arnab Sen Sharma, Silvio Amir, David Bau, Byron C Wallace
  • Can You Trick the Grader? Adversarial Persuasion of LLM Judges
    Yerin Hwang, Dongryeol Lee, Taegwan Kang, Yongil Kim, Kyomin Jung
  • Navigating the Unknown: Intent Classification and Out-of-Distribution Detection Using Large Language Models
    Yusuf Sali, Sıtkı Can Toraman
  • Trust Me, I’m Wrong: LLMs Hallucinate with Certainty Despite Knowing the Answer
    Adi Simhi, Itay Itzhak, Fazl Barez, Gabriel Stanovsky, Yonatan Belinkov
  • QUARTZ: QA-based Unsupervised Abstractive Refinement for Task-oriented Dialogue Summarization
    GHEBRIOUT Mohamed Imed Eddine, Gaël Guibon, Ivan Lerner, Emmanuel Vincent
  • MDSEval: A Meta-Evaluation Benchmark for Multimodal Dialogue Summarization
    Yinhong Liu, Jianfeng He, Hang Su, Ruixue Lian, Yi Nian, Jake W. Vincent, Srikanth Vishnubhotla, Robinson Piramuthu, Saab Mansour
  • PMPO: Probabilistic Metric Prompt Optimization for Small and Large Language Models
    ChenZhuo Zhao, Ziqian Liu, Xinda Wang, Junting Lu, Chaoyi Ruan
  • Evaluating the Creativity of LLMs in Persian Literary Text Generation
    Armin Tourajmehr, Mohammad Reza Modarres, Yadollah Yaghoobzadeh
  • SCDTour: Embedding Axis Ordering and Merging for Interpretable Semantic Change Detection
    Taichi Aida, Danushka Bollegala
  • Resolving UnderEdit & OverEdit with Iterative & Neighbor-Assisted Model Editing
    Bhiman Kumar Baghel, Scott M. Jordan, Zheyuan Ryan Shi, Xiang Lorraine Li
  • LLM-empowered Dynamic Prompt Routing for Vision-Language Models Tuning under Long-Tailed Distributions
    Yongju Jia, Jiarui Ma, Xiangxian Li, Baiqiao Zhang, Xianhui Cao, Juan Liu, Yulong Bian
  • HGAdapter: Hypergraph-based Adapters in Language Models for Code Summarization and Clone Detection
    Guang Yang, Yujie Zhu
  • Evaluating distillation methods for data-efficient syntax learning
    Takateru Yamakoshi, Thomas L. Griffiths, R. Thomas McCoy, Robert D. Hawkins
  • “Going to a trap house” conveys more fear than “Going to a mall”: Benchmarking Emotion Context Sensitivity for LLMs
    Eojin Jeon, Mingyu Lee, Sangyun Kim, Junho Kim, Wanzee Cho, Tae-Eui Kam, SangKeun Lee
  • [MASK]ED - Language Modeling for Explainable Classification and Disentangling of Socially Unacceptable Discourse
    Dimitra Niaouri, Mohamed Rayane GHILENE, Michele Linardi, Julien Longhi
  • A Survey of Cognitive Distortion Detection and Classification in NLP
    Archie Sage, Jeroen Keppens, Helen Yannakoudakis
  • Curse of Knowledge: Your Guidance and Provided Knowledge are biasing LLM Judges in Complex Evaluation
    Weiyuan Li, Xintao Wang, Siyu Yuan, Rui Xu, Jiangjie Chen, Qingqing Dong, Yanghua Xiao, Deqing Yang
  • Self-Training Large Language Models with Confident Reasoning
    Hyosoon Jang, Yunhui Jang, Sungjae Lee, Jungseul Ok, Sungsoo Ahn
  • Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical Supervision
    Tej Deep Pala, Panshul Sharma, Amir Zadeh, Chuan Li, Soujanya Poria
  • Enhancing LLM-Based Persuasion Simulations with Cultural and Speaker-Specific Information
    Weicheng Ma, Hefan Zhang, Shiyu Ji, Farnoosh Hashemi, Qichao Wang, Ivory Yang, Joice Chen, Juanwen Pan, Michael Macy, Saeed Hassanpour, Soroush Vosoughi
  • An LLM-based Temporal-spatial Data Generation and Fusion Approach for Early Detection of Late Onset Alzheimer’s Disease (LOAD) Stagings Especially in Chinese and English-speaking Populations
    Yang Han, Jacqueline C.K. Lam, Victor O.K. Li, Lawrence Y. L. Cheung
  • Side Effects of Erasing Concepts from Diffusion Models
    Shaswati Saha, Sourajit Saha, Manas Gaur, Tejas Gokhale
  • SaCa: A Highly Compatible Reinforcing Framework for Knowledge Graph Embedding via Structural Pattern Contrast
    Jiashi Lin, Changhong Jiang, Yixiao Wang, Xinyi Zhu, Zhongtian Hu, Wei Zhang
  • Real, Fake, or Manipulated? Detecting Machine-Influenced Text
    Yitong Wang, Zhongping Zhang, Margherita Piana, Zheng Zhou, Peter Gerstoft, Bryan A. Plummer
  • Character is Destiny: Can Persona-assigned Language Models Make Personal Choices?
    Rui Xu, Xintao Wang, Jiangjie Chen, Siyu Yuan, Xinfeng Yuan, Jiaqing Liang, Zulong Chen, xiaoqingdong, Yanghua Xiao
  • Neutral Is Not Unbiased: Evaluating Implicit and Intersectional Identity Bias in LLMs Through Structured Narrative Scenarios
    Saba Ghanbari Haez, Mauro Dragoni
  • BTW: A Non-Parametric Variance Stabilization Framework for Multimodal Model Integration
    Jun Hou, Le Wang, Xuan Wang
  • Can LLMs Be Efficient Predictors of Conversational Derailment?
    Kaustubh Olpadkar, Vikram Sunil Bajaj, Leslie Barrett
  • Q-PRM: Adaptive Query Rewriting for Retrieval-Augmented Generation via Step-level Process Supervision
    Xiaopeng Ye, Chen Xu, Chaoliang Zhang, Zhaocheng Du, Jun Xu, Gang Wang, Zhenhua Dong
  • Factuality Beyond Coherence: Evaluating LLM Watermarking Methods for Medical Texts
    Rochana Prih Hastuti, Rian Adam Rajagede, Mansour Al Ghanim, Mengxin Zheng, Qian Lou
  • Guess What I am Thinking: A Benchmark for Inner Thought Reasoning of Role-Playing Language Agents
    Rui Xu, Mingyu Wang, Xintao Wang, Dakuan Lu, Xiaoyu Tan, Wei Chu, Xu Yinghui
  • Dropping Experts, Recombining Neurons: Retraining-Free Pruning for Sparse Mixture-of-Experts LLMs
    Yixiao Zhou, Ziyu Zhao, Dongzhou Cheng, Zhiliang Wu, Jie Gui, Yi Yang, Fei Wu, Yu Cheng, Hehe Fan
  • BiasFilter: An Inference-Time Debiasing Framework for Large Language Models
    Xiaoqing Cheng, Ruizhe Chen, Hongying Zan, Yuxiang Jia, Min Peng
  • X-LeBench: A Benchmark for Extremely Long Egocentric Video Understanding
    Wenqi Zhou, Kai Cao, Hao Zheng, Yunze Liu, XINYI ZHENG, Miao Liu, Per Ola Kristensson, Walterio W. Mayol-Cuevas, Fan Zhang, Weizhe Lin, Junxiao Shen
  • A Survey on Multi-modal Intent Recognition: Recent Advances and New Frontiers
    Zhihong Zhu, Fan Zhang, Yunyan Zhang, Jinghan Sun, Zhiqi Huang, Qingqing Long, Bowen Xing, Xian Wu
  • Will Annotators Disagree? Identifying Subjectivity in Value-Laden Arguments
    Amir Homayounirad, Enrico Liscio, Tong Wang, Catholijn M Jonker, Luciano Cavalcante Siebert
  • LLMs Can Compensate for Deficiencies in Visual Representations
    Sho Takishita, Jay Gala, Abdelrahman Mohamed, Kentaro Inui, Yova Kementchedjhieva
  • Adapting Large Language Models for Character-based Augmentative and Alternative Communication
    Dylan Gaines, Keith Vertanen
  • Token-Level Metrics for Detecting Incorrect Gold Annotations in Named Entity Recognition
    Elena Merdjanovska, Alan Akbik
  • Exploring Paraphrasing Strategies for CEFR A1-Level Constraints in LLMs
    Eugenio Marzona, Maria Goikhman, Alessio Palmero Aprosio, Massimo Zancanaro
  • Efficient Layer-wise LLM Fine-tuning for Revision Intention Prediction
    Zhexiong Liu, Diane Litman
  • ConText-LE: Cross-Distribution Generalization for Longitudinal Experiential Data via Narrative-Based LLM Representations
    Ahatsham Hayat, Bilal Khan, Mohammad Rashedul Hasan
  • Chain of Strategy Optimization Makes Large Language Models Better Emotional Supporter
    Weixiang Zhao, Xingyu Sui, Xinyang Han, Yang Deng, Yulin Hu, Jiahe Guo, Libo Qin, Qianyun Du, Shijin Wang, Yanyan Zhao, Bing Qin, Ting Liu
  • Unlocking Legal Knowledge: A Multilingual Dataset for Judicial Summarization in Switzerland
    Luca Rolshoven, Vishvaksenan Rasiah, Srinanda Brügger Bose, Sarah Hostettler, Lara Burkhalter, Matthias Stürmer, Joel Niklaus
  • Context Minimization for Resource-Constrained Text Classification: Optimizing Performance-Efficiency Trade-offs through Linguistic Features
    Nahid Hossain, Md Faisal Kabir
  • FLAIRR-TS - Forecasting LLM-Agents with Iterative Refinement and Retrieval for Time Series
    Gunjan Jalori, Preetika Verma, Sercan O Arik
  • ULTRABENCH: Benchmarking LLMs under Extreme Fine-grained Text Generation
    Longfei Yun, Letian Peng, Jingbo Shang
  • The Price of Format: Diversity Collapse in LLMs
    Longfei Yun, Chenyang An, Zilong Wang, Letian Peng, Jingbo Shang
  • Zipf’s and Heaps’ Laws for Tokens and LLM-generated Texts
    Nikolay Mikhaylovskiy
  • LLMs for Bayesian Optimization in Scientific Domains: Are We There Yet?
    Rushil Gupta, Jason Hartford, Bang Liu
  • A Comprehensive Taxonomy of Negation for NLP and Neural Retrievers
    Roxana Petcu, Samarth Bhargav, Maarten de Rijke, Evangelos Kanoulas
  • Identifying Noise in Human-Created Datasets using Training Dynamics from Generative Models
    Maeda Hanafi, Ishan Jindal, Yannis Katsis, Lucian Popa, Huaiyu Zhu
  • Can Multiple Responses from an LLM Reveal the Sources of Its Uncertainty?
    Yang Nan, Pengfei He, Ravi Tandon, Han Xu
  • AfroXLMR-Social: Adapting Pre-trained Language Models for African Languages Social Media Text
    Tadesse Destaw Belay, Israel Abebe Azime, Ibrahim Said Ahmad, David Ifeoluwa Adelani, Idris Abdulmumin, Abinew Ali Ayele, Shamsuddeen Hassan Muhammad, Seid Muhie Yimam
  • Teaching Language Models To Gather Information Proactively
    Tenghao Huang, Sihao Chen, Muhao Chen, Jonathan May, Longqi Yang, Mengting Wan, Pei Zhou
  • Linguistic Alignment Predicts Learning in Small Group Tutoring Sessions
    Dorothea French, Robert Moulder, Kelechi Ezema, Katharina von der Wense, Sidney K. D’Mello
  • EfficientXLang: Towards Improving Token Efficiency Through Cross-Lingual Reasoning
    Sanchit Ahuja, Praneetha Vaddamanu, Barun Patra
  • Not Lost After All: How Cross-Encoder Attribution Challenges Position Bias Assumptions in LLM Summarization
    Elahe Rahimi, Hassan Sajjad, Domenic Rosati, Abeer Badawi, Elham Dolatabadi, Frank Rudzicz
  • FuzzAug: Data Augmentation by Coverage-guided Fuzzing for Neural Test Generation
    Yifeng He, JICHENG WANG, Yuyang Rong, Hao Chen
  • DrAgent: Empowering Large Language Models as Medical Agents for Multi-hop Medical Reasoning
    Fenglin Liu, Zheng Li, Hongjian Zhou, Qingyu Yin, Jingfeng Yang, Xin Liu, Zhengyang Wang, Xianfeng Tang, Shiyang Li, Xiang He, Ruijie Wang, Bing Yin, Lei Clifton, David A. Clifton
  • XRAG: Cross-lingual Retrieval-Augmented Generation
    Wei Liu, Sony Trenous, Leonardo F. R. Ribeiro, Bill Byrne, Felix Hieber
  • Can VLMs Recall Factual Associations From Visual References?
    Dhananjay Ashok, Ashutosh Chaubey, Hirona Jacqueline Arai, Jonathan May, Jesse Thomason
  • MFTCXplain: A Multilingual Benchmark Dataset for Evaluating the Moral Reasoning of LLMs through Multi-hop Hate Speech Explanation
    Jackson Trager, Francielle Vargas, Diego Alves, Matteo Guida, Mikel K. Ngueajio, Ameeta Agrawal, Flor Miriam Plaza-del-Arco, Yalda Daryani, Farzan Karimi Malekabadi
  • Large Language Models for Multilingual Previously Fact-Checked Claim Detection
    Ivan Vykopal, Matúš Pikuliak, Simon Ostermann, Tatiana Anikina, Michal Gregor, Marian Simko
  • Debating for Better Reasoning in Vision-Language Models
    Ashutosh Adhikari, Mirella Lapata
  • Fine-tuning LLMs with Cross-Attention-based Weight Decay for Bias Mitigation
    Farsheed Haque, Zhe Fu, Depeng Xu, Shuhan Yuan, Xi Niu
  • Profiling LLM’s Copyright Infringement Risks under Adversarial Persuasive Prompting
    Jikai Long, Ming Liu, Xiusi Chen, Jialiang Xu, Shenglan Li, Zhaozhuo Xu, Denghui Zhang
  • Residualized Similarity for Faithfully Explainable Authorship Verification
    Peter Zeng, Pegah Alipoormolabashi, Jihu Mun, Gourab Dey, Nikita Soni, Niranjan Balasubramanian, Owen Rambow, H. Schwartz
  • Post-hoc Study of Climate Microtargeting on Social Media Ads with LLMs: Thematic Insights and Fairness Evaluation
    Tunazzina Islam, Dan Goldwasser
  • MRFD: Multi-Region Fusion Decoding with Self-Consistency for Mitigating Hallucinations in LVLMs
    Haonan Ge, Yiwei Wang, Ming-Hsuan Yang, Yujun Cai
  • SIMBA UQ: Similarity-Based Aggregation for Uncertainty Quantification in Large Language Models
    Debarun Bhattacharjya, Balaji Ganesan, Junkyu Lee, Radu Marinescu, Katya Mirylenka, Michael Glass, Xiao Shou
  • Mind the Dialect: NLP Advancements Uncover Fairness Disparities for Arabic Users in Recommendation Systems
    Abdulla Alshabanah, Murali Annavaram
  • Hopscotch: Discovering and Skipping Redundancies in Language Models
    Mustafa Eyceoz, Nikhil Shivakumar Nayak, Hao Wang, Ligong Han, Akash Srivastava
  • CLEAR: A Clinically Grounded Tabular Framework for Radiology Report Evaluation
    Yuyang Jiang, Chacha Chen, Shengyuan Wang, Feng Li, Zecong Tang, Benjamin M. Mervak, Lydia Chelala, Christopher M Straus, Reve Chahine, Samuel G. Armato III, Chenhao Tan
  • Parsing the Switch: LLM-Based UD Annotation for Complex Code-Switched and Low-Resource Languages
    Olga Kellert, Nemika Tyagi, Muhammad Imran, Nelvin Licona-Guevara, Carlos Gómez-Rodríguez
  • HetGCoT: Heterogeneous Graph-Enhanced Chain-of-Thought LLM Reasoning for Academic Question Answering
    Runsong Jia, Mengjia Wu, Ying Ding, Jie Lu, Yi Zhang
  • S*: Test Time Scaling for Code Generation
    Dacheng Li, Shiyi Cao, Chengkun Cao, Xiuyu Li, Shangyin Tan, Kurt Keutzer, Jiarong Xing, Joseph E. Gonzalez, Ion Stoica
  • Language Models Can Easily Learn to Reason from Demonstrations
    Dacheng Li, Shiyi Cao, Tyler Griggs, Shu Liu, Xiangxi Mo, Eric Tang, Sumanth Hegde, Shishir G Patil, Matei Zaharia, Joseph E. Gonzalez, Ion Stoica
  • FSTs vs ICL: Generalisation in LLMs for an under-resourced language
    Ximena Gutierrez, Mikel Segura Elizalde, Victor Mijangos
  • SRM-LLM: Semantic Relationship Mining with LLMs for Temporal Knowledge Graph Extrapolation
    Fu Zhang, Panfeng Zhang, Jingwei Cheng
  • Captioning for Text-Video Retrieval via Dual-Group Direct Preference Optimization
    Ji Soo Lee, Byungoh Ko, Jaewon Cho, Howoong Lee, Jaewoon Byun, Hyunwoo J. Kim
  • Benchmarking and Improving LLM Robustness for Personalized Generation
    Chimaobi Okite, Naihao Deng, Kiran Bodipati, Huaidian Hou, Joyce Chai, Rada Mihalcea
  • MemeInterpret: Towards an All-in-One Dataset for Meme Understanding
    Jeongsik Park, Khoi P. N. Nguyen, Jihyung Park, Minseok Kim, Jaeheon Lee, Jae Won Choi, Kalyani Ganta, Phalgun Ashrit Kasu, Rohan Sarakinti, Sanjana Vipperla, Sai Sathanapalli, Nishan Vaghani, Vincent Ng
  • CoRAG: Enhancing Hybrid Retrieval-Augmented Generation through a Cooperative Retriever Architecture
    Zaiyi Zheng, Song Wang, Zihan Chen, Yaochen Zhu, Yinhan He, Liangjie Hong, Qi Guo, Jundong Li
  • Hallucination Detection in Structured Query Generation via LLM Self-Debating
    Miaoran Li, Jiangning Chen, Minghua Xu, Xiaolong Wang
  • Not All Options Are Created Equal: Textual Option Weighting for Token-Efficient LLM-Based Knowledge Tracing
    Jong Woo Kim, SeongYeub Chu, Bryan Wong, Mun Yong Yi
  • Public Data Assisted Differentially Private In-Context Learning
    Seongho Joo, Hyukhun Koh, Kyomin Jung
  • Inducing Argument Facets for Faithful Opinion Summarization
    Jian Wang, Yanjie Liang, YUQING SUN, Bin Gong
  • Scaling Laws Are Unreliable for Downstream Tasks: A Reality Check
    Nicholas Lourie, Michael Y. Hu, Kyunghyun Cho
  • Familiarity-Aware Evidence Compression for Retrieval-Augmented Generation
    Dongwon Jung, Qin Liu, Tenghao Huang, Ben Zhou, Muhao Chen
  • O_O-VC: Synthetic Data-Driven One-to-One Alignment for Any-to-Any Voice Conversion
    Huu Tuong Tu, Huan Vu, Cuong Tien Nguyen, Dien Hy Ngo, Nguyen Thi Thu Trang
  • Simple Factuality Probes Detect Hallucinations in Long-Form Natural Language Generation
    Jiatong Han, Neil Band, Muhammed Razzak, Jannik Kossen, Tim G. J. Rudner, Yarin Gal
  • CESRec: Constructing Pseudo Interactions for Sequential Recommendation via Conversational Feedback
    Yifan Wang, Shen Gao, Jiabao Fang, Rui Yan, Billy Chiu, Shuo Shang
  • TTPA: Token-level Tool-use Preference Alignment Training Framework with Fine-grained Evaluation
    Chengrui Huang, Shen Gao, Zhengliang Shi, Dongsheng Wang, Shuo Shang
  • Avoiding Knowledge Edit Skipping in Multi-hop Question Answering with Guided Decomposition
    Yi Liu, Xiangrong Zhu, Xiangyu Liu, Wei Wei, Wei Hu
  • Bridging the Creativity Understanding Gap: Small-Scale Human Alignment Enables Expert-Level Humor Ranking in LLMs
    Kuan Lok Zhou, Jiayi Chen, Siddharth Suresh, Reuben Narad, Timothy T. Rogers, Lalit K Jain, Robert D Nowak, Bob Mankoff, Jifan Zhang
  • SMARTMiner: Extracting and Evaluating SMART Goals from Low-Resource Health Coaching Notes
    Iva Bojic, Qi Chwen Ong, Stephanie Hilary Xinyi Ma, Lin Ai, Zheng Liu, Ziwei Gong, Julia Hirschberg, Andy Hau Yan HO, Andy W. H. Khong
  • GRIL: Knowledge Graph Retrieval-Integrated Learning with Large Language Models
    Jialin Chen, Houyu Zhang, Seongjun Yun, Alejandro Mottini, Rex Ying, Xiang Song, Vassilis N. Ioannidis, Zheng Li, Qingjun Cui
  • Exploring Deductive and Inductive Reasoning Capabilities of Large Language Models in Procedural Planning
    Jiabao Kang, Xinye Li, Liyan Xu, Qingbin Liu, Xi Chen, Zhiying Tu, Dianhui Chu, Dianbo Sui
  • KELE: A Multi-Agent Framework for Structured Socratic Teaching with Large Language Models
    Xian Peng, Pan Yuan, Dong Li, Junlong Cheng, Qin Fang, Zhi Liu
  • VisualEDU: A Benchmark for Assessing Coding and Visual Comprehension through Educational Problem-Solving Video Generation
    Hao Chen, TIANYU SHI, Pengran Huang, Zeyuan Li, Jiahui Pan, Qianglong Chen, Lewei He
  • OkraLong: A Flexible Retrieval-Augmented Framework for Long-Text Question Answering
    Yulong Hui, Yihao Liu, Yao Lu, Huanchen Zhang
  • VerifiAgent: a Unified Verification Agent in Language Model Reasoning
    Jiuzhou Han, Wray Buntine, Ehsan Shareghi
  • DrKGC: Dynamic Subgraph Retrieval-Augmented LLMs for Knowledge Graph Completion across General and Biomedical Domains
    Yongkang Xiao, Sinian Zhang, Yi Dai, Huixue Zhou, Jue Hou, Jie Ding, Rui Zhang
  • Understanding the Language Model to Solve the Symbolic Multi-Step Reasoning Problem from the Perspective of Buffer Mechanism
    Zhiwei Wang, Yunji Wang, Zhongwang Zhang, Zhangchen Zhou, Hui Jin, Tianyang Hu, Jiacheng Sun, Zhenguo Li, Yaoyu Zhang, Zhi-Qin John Xu
  • TwT: Thinking without Tokens by Habitual Reasoning Distillation with Multi-Teachers’ Guidance
    Jingxian Xu, Mengyu Zhou, Weichang Liu, Hanbing Liu, Shi Han, Dongmei Zhang
  • DAVIS: Planning Agent with Knowledge Graph-Powered Inner Monologue
    Minh Pham Dinh, Michael G Yankoski, Munira Syed, Trenton W. Ford
  • When Instructions Multiply: Measuring and Estimating LLM Capabilities of Multiple Instructions Following
    Keno Harada, Yudai Yamazaki, Masachika Taniguchi, Edison Marrese-Taylor, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo
  • FormosanBench: Benchmarking Low-Resource Austronesian Languages in the Era of Large Language Models
    Kaiying Kevin Lin, Hsi-Yu Chen, Haopeng Zhang
  • SeaPO: Strategic Error Amplification for Robust Preference Optimization of Large Language Models
    Jun Rao, Yunjie Liao, Xuebo Liu, Zepeng Lin, Lian Lian, Dong Jin, Shengjun Cheng, Jun Yu, Min Zhang
  • FigEx: Aligned Extraction of Scientific Figures and Captions
    Jifeng Song, Arun Das, Ge Cui, Yufei Huang
  • PATIMT-Bench: A Multi-Scenario Benchmark for Position-Aware Text Image Machine Translation in Large Vision-Language Models
    Wanru Zhuang, Wenbo Li, Zhibin Lan, Xu Han, Peng Li, Jinsong Su
  • Safeguard Fine-Tuned LLMs Through Pre- and Post-Tuning Model Merging
    Hua Farn, Hsuan Su, Shachi H. Kumar, Saurav Sahay, Shang-Tse Chen, Hung-yi Lee
  • Self-Ensemble: Mitigating Confidence Distortion for Large Language Models
    Zicheng Xu, Guanchu Wang, Guangyao Zheng, Yu-Neng Chuang, Alex Szalay, Xia Hu, Vladimir Braverman
  • Annotation-Efficient Language Model Alignment via Diverse and Representative Response Texts
    Yuu Jinnai, Ukyo Honda
  • Explainable Chain-of-Thought Reasoning: An Empirical Analysis on State-Aware Reasoning Dynamics
    Sheldon Yu, Yuxin Xiong, Junda Wu, Xintong Li, Tong Yu, Xiang Chen, Ritwik Sinha, Jingbo Shang, Julian McAuley
  • DecisionFlow: Advancing Large Language Model as Principled Decision Maker
    Xiusi Chen, Shanyong Wang, Cheng Qian, Hongru WANG, Peixuan Han, Heng Ji
  • M-Ped: Multi-Prompt Ensemble Decoding for Large Language Models
    Jiaxin GUO, Daimeng Wei, Yuanchang Luo, Hengchao Shang, Zongyao Li, Jinlong Yang, Zhanglin Wu, Zhiqiang Rao, Shimin Tao, Hao Yang
  • Butterfly Effects in Toolchains: A Comprehensive Analysis of Failed Parameter Filling in LLM Tool-Agent Systems
    Qian Xiong, Yuekai Huang, Ziyou Jiang, Zhiyuan Chang, Yujia Zheng, Tianhao Li, Mingyang Li
  • FinLFQA: Evaluating Attributed Text Generation of LLMs in Financial Long-Form Question Answering
    Yitao Long, Tiansheng Hu, Yilun Zhao, Arman Cohan, Chen Zhao
  • BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models
    Xu Huang, Wenhao Zhu, Hanxu Hu, Conghui He, Lei Li, Shujian Huang, Fei Yuan
  • Assessing the Sensitivity and Alignment of FOL Closeness Metrics
    Ramya Keerthy Thatikonda, Wray Buntine, Ehsan Shareghi
  • FoodSafeSum: Enabling Natural Language Processing Applications for Food Safety Document Summarization and Analysis
    Juli Bakagianni, Korbinian Randl, Guido Rocchietti, Cosimo Rulli, Franco Maria Nardini, Salvatore Trani, Aron Henriksson, Anna Romanova, John Pavlopoulos
  • Self-adaptive Dataset Construction for Real-World Multimodal Safety Scenarios
    Jingen Qu, Lijun Li, Bo Zhang, Yichen Yan, Jing Shao
  • EnDive: A Cross-Dialect Benchmark for Fairness and Performance in Large Language Models
    Abhay Gupta, Jacob Cheung, Philip Meng, Shayan Sayyed, Kevin Zhu, Austen Liao, Sean O’Brien
  • FAEDKV: Infinite-Window Fourier Transform for Unbiased KV Cache Compression
    Runchao Li, Yao Fu, Mu Sheng, Xianxuan Long, Haotian Yu, Pan Li
  • Dynamic Injection of Entity Knowledge into Dense Retrievers
    Ikuya Yamada, Ryokan Ri, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo
  • When Personalization Meets Reality: A Multi-Faceted Analysis of Personalized Preference Learning
    Yijiang River Dong, Tiancheng Hu, Yinhong Liu, Ahmet Üstün, Nigel Collier
  • MASTER: Multi-Agent Security Through Exploration of Roles and Topological Structures - A Comprehensive Framework
    Yifan Zhu, Chao Zhang, Xin Shi, Xueqiao Zhang, Yi Yang, Yawei Luo
  • MONAQ: Multi-Objective Neural Architecture Querying for Time-Series Analysis on Resource-Constrained Devices
    Patara Trirat, Jae-Gil Lee
  • StandUp4AI: A New Multilingual Dataset for Humor Detection in Stand-up Comedy Videos
    Valentin Barriere, Nahuel Gomez, Léo Hemamou, Sofia Callejas, Brian Ravenet
  • Does Visual Grounding Enhance the Understanding of Embodied Knowledge in Large Language Models?
    Zhihui Yang, Yupei Wang, Kaijie Mo, Zhe Zhao, Renfen Hu
  • Semantic Contribution-Aware Adaptive Retrieval for Black-Box Models
    Qinhong Lin, Yuang Cai, Linna Zhou, Zhongliang Yang, Dingfu Yu, Xuan Xu, Yu Li
  • On Guardrail Models’ Robustness to Mutations and Adversarial Attacks
    Elias Bassani, Ignacio Sanchez
  • IP-Dialog: Evaluating Implicit Personalization in Dialogue Systems with Synthetic Data
    Bo Peng, Zhiheng Wang, Heyang Gong, Chaochao Lu
  • Zero-shot Graph Reasoning via Retrieval Augmented Framework with LLMs
    Hanqing Li, Sharika Mahadevan, Kiran Jyothi Sheena, Henry Liang, Diego Klabjan
  • Privacy in Action: Towards Realistic Privacy Mitigation and Evaluation for LLM-Powered Agents
    Shouju Wang, Fenglin Yu, Xirui Liu, Xiaoting Qin, Jue Zhang, Qingwei Lin, Dongmei Zhang, Saravan Rajmohan
  • Dissecting Logical Reasoning in LLMs: A Fine-Grained Evaluation and Supervision Study
    Yujun Zhou, Jiayi Ye, Zipeng Ling, Yufei Han, Yue Huang, Haomin Zhuang, Zhenwen Liang, Kehan Guo, Taicheng Guo, Xiangqi Wang, Xiangliang Zhang
  • ConciseRL: Conciseness-Guided Reinforcement Learning for Efficient Reasoning Models
    Razvan-Gabriel Dumitru, Darius Peteleaza, Vikas Yadav, Liangming Pan
  • Faster and Better LLMs via Latency-Aware Test-Time Scaling
    Zili Wang, Tianyu Zhang, Haoli Bai, Lu Hou, Xianzhi Yu, Wulong Liu, Shiming Xiang, Lei Zhu
  • Reasoning-Augmented Conversation for Multi-Turn Jailbreak Attacks on Large Language Models
    Zonghao Ying, Deyue Zhang, Zonglei Jing, Yisong Xiao, Quanchen Zou, Aishan Liu, Siyuan Liang, Xiangzheng Zhang, Xianglong Liu, Dacheng Tao
  • Distilling Many-Shot In-Context Learning into a Cheat Sheet
    Ukyo Honda, Soichiro Murakami, Peinan Zhang
  • Tracing Training Footprints: A Calibration Approach for Membership Inference Attacks Against Multimodal Large Language Models
    Xiaofan Zheng, Huixuan Zhang, Xiaojun Wan
  • PolBiX: Detecting LLMs’ Political Bias in Fact-Checking through X-phemisms
    Charlott Jakob, David Harbecke, Patrick Parschan, Pia Wenzel Neves, Vera Schmitt
  • URO-Bench: Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models
    Ruiqi Yan, Xiquan Li, Wenxi Chen, Zhikang Niu, Chen Yang, Ziyang Ma, Kai Yu, Xie Chen
  • Low-Hallucination and Efficient Coreference Resolution with LLMs
    Yujian Gan, Yuan Liang, Jinxia Xie, Yanni Lin, Juntao Yu, Massimo Poesio
  • Your Mileage May Vary: How Empathy and Demographics Shape Human Preferences in LLM Responses
    Yishan Wang, Amanda Cercas Curry, Flor Miriam Plaza-del-Arco
  • Diving into Mitigating Hallucinations from a Vision Perspective for Large Vision-Language Models
    Weihang Wang, Xinhao Li, Ziyue Wang, Yan Pang, Jielei Zhang, Peiyi Li, Qiang Zhang, Longwen Gao
  • PhysicsArena: The First Multimodal Physics Reasoning Benchmark Exploring Variable, Process, and Solution Dimensions
    Song Dai, Yibo Yan, Jiamin Su, Zihao Dongfang, Yubo Gao, Yonghua Hei, Jungang Li, Junyan Zhang, Sicheng Tao, Zhuoran Gao, Xuming Hu
  • Ko-LongRAG: A Korean Long-Context RAG Benchmark Built with a Retrieval-Free Approach
    Yongil Kim, Heuiyeen Yeen, Hyeongu Yun, Jinsik Lee
  • Choosing a Model, Shaping a Future: Comparing LLM Perspectives on Sustainability and its Relationship with AI
    Annika Bush, Meltem Aksoy, Markus Pauly, Greta Ontrup
  • Optimising Factual Consistency in Summarisation via Preference Learning from Multiple Imperfect Metrics
    Yuxuan Ye, Raul Santos-Rodriguez, Edwin Simpson
  • Judging with Many Minds: Do More Perspectives Mean Less Prejudice? On Bias Amplification and Resistance in Multi-Agent Based LLM-as-Judge
    Chiyu Ma, Enpei Zhang, Yilun Zhao, Wenjun Liu, Yaning Jia, Peijun Qing, Lin Shi, Arman Cohan, Yujun Yan, Soroush Vosoughi
  • A Regex Minimization Benchmark: A PSPACE-Complete Challenge for Language Models
    Hyundong Jin, Joonghyuk Hahn, Yo-Sub Han
  • Investigating the Impact of Conceptual Metaphors on LLM-based NLI through Shapley Interactions
    Meghdut Sengupta, Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer, Eyke Hüllermeier, Debanjan Ghosh, Henning Wachsmuth
  • KurTail: Kurtosis-based LLM Quantization
    Mohammad Sadegh Akhondzadeh, Aleksandar Bojchevski, Evangelos Eleftheriou, Martino Dazzi
  • VIVA+: Human-Centered Situational Decision-Making
    Zhe Hu, Yixiao Ren, Guanzhong Liu, Jing Li, Yu Yin
  • QuantAgents: Towards Multi-agent Financial System via Simulated Trading
    Xiangyu Li, Yawen Zeng, Xiaofen Xing, Jin Xu, Xiangmin Xu
  • LLMs Reproduce Stereotypes of Sexual and Gender Minorities
    Ruby Ostrow, Adam Lopez
  • Accept or Deny? Evaluating LLM Fairness and Performance in Loan Approval across Table-to-Text Serialization Approaches
    Israel Abebe Azime, Deborah D. Kanubala, Tejumade Afonja, Mario Fritz, Isabel Valera, Dietrich Klakow, Philipp Slusallek
  • Transfer-Aware Data Selection for Domain Adaptation in Text Retrieval
    Linzhu Yu, Huan Li, Ke Chen, Lidan Shou
  • Understanding and Improving Information Preservation in Prompt Compression for LLMs
    Weronika Łajewska, Laura Aina, Momchil Hardalov, Neha Anna John, Hang Su, Lluis Marquez
  • A Benchmark for Hindi Verb-Argument Structure Alternations
    Kanishka Jain, Ashwini Vaidya
  • Beyond Binary Preferences: Semi-Online Label-Free GRACE-KTO with Group-Wise Adaptive Calibration for High-Quality Long-Text Generation
    Jingyang Deng, Ran Chen, Jo-Ku Cheng, Jinwen Ma
  • Representation-based Broad Hallucination Detectors Fail to Generalize Out of Distribution
    Zuzanna Dubanowska, Maciej Żelaszczyk, Michał Brzozowski, Paolo Mandica, Michal P. Karpowicz
  • MAFMO: Multi-modal Adaptive Fusion with Meta-template Optimization for Vision-Language Models
    Mingrui Xie, Lulu Xu, Junliang Du
  • Multimodal UNcommonsense: From Odd to Ordinary and Ordinary to Odd
    Yejin Son, Saejin Kim, Dongjun Min, Youngjae Yu
  • Analyzing Gambling Addictions: A Spanish Corpus for Understanding Pathological Behavior
    Manuel Couto, Marcos Fernández-Pichel, Mario Ezra Aragon, David E. Losada
  • Beyond Surface Alignment: Rebuilding LLMs Safety Mechanism via Probabilistically Ablating Refusal Direction
    Yuanbo Xie, Yingjie Zhang, Tianyun Liu, Duohe Ma, Tingwen Liu
  • Distributed LLM Serving on Consumer-Grade GPUs by Reconciling Computation and Communication
    Lewei Jin, kui zhang, Yongqi Chen, Zhuoyifan, Renjie Li, Yi Gao, Bowei Yang, Zhengong Cai, Wei Dong
  • SafeToolBench: Pioneering a Prospective Benchmark to Evaluating Tool Utilization Safety in LLMs
    Hongfei Xia, Hongru WANG, Zeming Liu, Qian Yu, Yuhang Guo, Haifeng Wang
  • Sparsifying Mamba
    An Wang, Ruobing Xie, Shuaipeng Li, Xingwu Sun, Zhanhui Kang
  • Beneath the Facade: Probing Safety Vulnerabilities in LLMs via Auto-Generated Jailbreak Prompts
    Heehyeon Kim, Kyeongryul Lee, Joyce Jiyoung Whang
  • ET-MIER: Entity Type-guided Key Mention Identification and Evidence Retrieval for Document-level Relation Extraction
    Xin Li, Huangming Xu, Fu Zhang, Jingwei Cheng
  • Position IDs Matter: An Enhanced Position Layout for Efficient Context Compression in Large Language Models
    Runsong Zhao, Xin Liu, Xinyu Liu, Pengcheng Huang, Chunyang Xiao, Tong Xiao, JingBo Zhu
  • Can Role Vectors Affect LLM Behaviour?
    Daniele Potertì, Andrea Seveso, Fabio Mercorio
  • Semantic Component Analysis: Introducing Multi-Topic Distributions to Clustering-Based Topic Modeling
    Florian Eichin, Carolin M. Schuster, Georg Groh, Michael A. Hedderich
  • ThinkQE: Query Expansion via an Evolving Thinking Process
    Yibin Lei, Tao Shen, Andrew Yates
  • Hierarchical Reward Modeling for Fault Localization in Large Code Repositories
    Jiwei Zhang, Jianxun Lian, Haiming Qin, Mingyang Zhou, KeZhong Lu, Rui Mao, Hao Liao
  • Layer Duplication in LLMs
    Neo Eyal, Nachum Dershowitz, Kfir Bar
  • Semantic-Aware Action Space Compression via LLM-DRL Synergy for Efficient Task-oriented Dialogue Policy Exploration
    Yangyang Zhao, Ben Niu, Yuxuan Tan, Shihan Wang, Libo Qin
  • Linear Steerability in Language Models: When It Emerges and How It Evolves
    Jianshu She, Xinyue Li, Eric P. Xing, Zhengzhong Liu, Qirong Ho
  • A Comprehensive Survey on Learning from Rewards for Large Language Models: Reward Models and Learning Strategies
    Xiaobao Wu
  • InFact: Informativeness Alignment for Improved LLM Factuality
    Roi Cohen, Russa Biswas, Gerard de Melo
  • Large Language Model Agents in Finance: A Survey Bridging Research, Practice, and Real-World Deployment
    Yifei Dong, Fengyi Wu, Kunlin Zhang, Yilong Dai, sanjian zhang, Wanghao Ye, Sihan Chen, Zhi-Qi Cheng
  • Problem Solved? Information Extraction Design Space for Layout-Rich Documents using LLMs
    Gaye Colakoglu, Gürkan Solmaz, Jonathan Fürst
  • Generation-Augmented Retrieval: Rethinking the Role of Large Language Models in Zero-Shot Relation Extraction
    Zehan Li, Fu Zhang, Tianyue Peng, He Liu, Jingwei Cheng
  • Following Occam’s Razor: Dynamic Combination of Structured Knowledge for Multi-Hop Question Answering using LLMs
    Wei Chen, Zhi Zheng, Lili Zhao, huijun hou, Tong Xu
  • Large Language Models as Reader for Bias Detection
    Xuan Luo, Jing Li, Zhong Wenzhong, Geng Tu, Ruifeng Xu
  • LOHRec: Leveraging Order and Hierarchy in Generative Sequential Recommendation
    Jiawen Xie, Haiyang Wu, Deyi Ji, Yuekui Yang, Shaoping Ma
  • Biology-Instructions: A Dataset and Benchmark for Multi-Omics Sequence Understanding Capability of Large Language Models
    haonan he, Yuchen Ren, Yining Tang, Ziyang Xu, Junxian Li, Minghao Yang, Di Zhang, Yuan Dong, Tao Chen, Shufei Zhang, Yuqiang Li, Nanqing Dong, Wanli Ouyang, Dongzhan Zhou, Peng Ye
  • AssistedDS: Benchmarking How External Domain Knowledge Assists LLMs in Automated Data Science
    An Luo, Xun Xian, Jin Du, Fangqiao Tian, Ganghua Wang, Ming Zhong, Shengchun ZHAO, Xuan Bi, Zirui Liu, Jiawei Zhou, Jayanth Srinivasa, Ashish Kundu, Charles Fleming, Mingyi Hong, Jie Ding
  • Are you sure? Measuring models bias in content moderation through uncertainty
    Alessandra Urbinati, Mirko Lai, Simona Frenda, Marco Antonio Stranisci
  • FOSSIL: Harnessing Feedback on Suboptimal Samples for Data-Efficient Generalisation with Imitation Learning for Embodied Vision-and-Language Tasks
    Sabrina McCallum, Amit Parekh, Alessandro Suglia
  • Assess and Prompt: A Generative RL Framework for Improving Engagement in Online Mental Health Communities
    Bhagesh Gaur, Karan Gupta, Aseem Srivastava, Manish Gupta, Md Shad Akhtar
  • Logic: Long-form Outline Generation via Imitative and Critical Self-refinement
    Hengwei Liu, Yongliang Shen, Zhe Zheng, Haoyuan Ma, Xingyu Wu, Yin Zhang, Weiming Lu
  • No Free Lunch: Retrieval-Augmented Generation Undermines Fairness in LLMs, Even for Vigilant Users
    Mengxuan Hu, Hongyi Wu, Ronghang Zhu, Zihan Guan, Dongliang Guo, Daiqing Qi, Sheng Li
  • LegoSLM: Connecting LLM with Speech Encoder using CTC Posteriors
    Rao Ma, Tongzhou Chen, Kartik Audhkhasi, Bhuvana Ramabhadran
  • Skeleton-Guided-Translation: A Benchmarking Framework for Code Repository Translation with Fine-Grained Quality Evaluation
    Xing Zhang, Jiaheng Wen, Fangkai Yang, Yu Kang, Pu Zhao, Junhao Wang, Maoquan Wang, Yufan Huang, Shengyu Fu, Elsie Nallipogu, Qingwei Lin, Yingnong Dang, Saravan Rajmohan, Dongmei Zhang
  • Parallel Communities Across the Surface Web and the Dark Web
    Wenchao Dong, Megha Sundriyal, Seongchan Park, Jaehong Kim, Meeyoung Cha, Tanmoy Chakraborty, Wonjae Lee
  • Lemma Dilemma: On Lemma Generation Without Domain- or Language-Specific Training Data
    Olia Toporkov, Alan Akbik, Rodrigo Agerri
  • LlmFixer: Fix the Helpfulness of Defensive Large Language Models
    Zelong Yu, Xiaoming Zhang, Litian Zhang, Yu Yuan, Chaozhuo Li
  • Universal Acoustic Adversarial Attacks for Flexible Control of Speech-LLMs
    Rao Ma, Mengjie Qian, Vyas Raina, Mark Gales, Kate Knill
  • Probing Semantic Routing in Large Mixture-of-Expert Models
    Matthew Lyle Olson, Neale Ratzlaff, Musashi Hinck, Sungduk Yu, Man Luo, Chendi Xue, Vasudev Lal
  • CMT-Eval: A Novel Chinese Multi-turn Dialogue Evaluation Dataset Addressing Real-world Conversational Challenges
    Siyu Tian, Kaijie Mo, Yupei Wang, Renfen Hu
  • LastingBench: Defend Benchmarks Against Knowledge Leakage
    Yixiong Fang, Tianran Sun, Yuling Shi, Min Wang, Xiaodong Gu
  • Learning API Functionality from In-Context Demonstrations for Tool-based Agents
    Bhrij Patel, Ashish Jagmohan, Aditya Vempaty
  • Predicting Language Models’ Success at Zero-Shot Probabilistic Prediction
    Kevin Ren, Santiago Cortes-Gomez, Carlos Miguel Patiño, Ananya Joshi, Ruiqi Lyu, Jingjing Tang, Alistair Turcan, Khurram Yamin, Steven Wu, Bryan Wilder
  • GAMIC: Graph-Aligned Molecular In-context Learning for Molecule Analysis via LLMs
    ALI AL LAWATI, Jason S Lucas, Zhiwei Zhang, Prasenjit Mitra, Suhang Wang
  • Rethinking Sign Language Translation: The Impact of Signer Dependence on Model Evaluation
    Keren Artiaga, Sabyasachi Kamila, Haithem Afli, Conor Lynch, Mohammed Hasanuzzaman
  • Can Large Language Models Identify Implicit Suicidal Ideation? An Empirical Evaluation
    Tong Li, Shu Yang, Junchao Wu, Jiyao Wei, Lijie Hu, Mengdi Li, Derek F. Wong, Joshua R. Oltmanns, Di Wang
  • Adaptive Platt Scaling with Causal Interpretations for Self-Reflective Language Model Uncertainty Estimates
    Anthony Sicilia, Malihe Alikhani
  • Treble Counterfactual VLMs: A Causal Approach to Hallucination
    Li Li, Jiashu Qu, Linxin Song, Yuxiao Zhou, Yuehan Qin, Tiankai Yang, Yue Zhao
  • Video-Skill-CoT: Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning
    Daeun Lee, Jaehong Yoon, Jaemin Cho, Mohit Bansal
  • Glitter: A Multi-Sentence, Multi-Reference Benchmark for Gender-Fair German Machine Translation
    A Pranav, Janiça Hackenbuchner, Giuseppe Attanasio, Manuel Lardelli, Anne Lauscher
  • From n-gram to Attention: How Model Architectures Learn and Propagate Bias in Language Modeling
    Mohsinul Kabir, Tasfia Tahsin, Sophia Ananiadou
  • SENTRA: Selected-Next-Token Transformer for LLM Text Detection
    Mitchell Plyler, Yilun Zhang, Alexander Tuzhilin, Saoud Khalifah, Sen Tian
  • Automate Strategy Finding with LLM in Quant Investment
    Zhizhuo KOU, Holam Yu, Junyu Luo, Jingshu Peng, Xujia Li, Chengzhong LIU, Juntao Dai, Lei Chen, Sirui Han, Yike Guo
  • Does Reasoning Introduce Bias? A Study of Social Bias Evaluation and Mitigation in LLM Reasoning
    Xuyang Wu, Jinming Nian, Ting-Ruen Wei, Zhiqiang Tao, Hsin-Tai Wu, Yi Fang
  • MT-RewardTree: A Comprehensive Framework for Advancing LLM-Based Machine Translation via Reward Modeling
    Zhaopeng Feng, Jiahan Ren, Jiayuan Su, Jiamei Zheng, Hongwei Wang, Zuozhu Liu
  • Bias after Prompting: Persistent Discrimination in Large Language Models
    Nivedha Sivakumar, Natalie Mackraz, Samira Khorshidi, Krishna Patel, Barry-John Theobald, Luca Zappella, Nicholas Apostoloff
  • CARVQ: Corrective Adaptor with Group Residual Vector Quantization for LLM Embedding Compression
    Dayin Gou, Sanghyun Byun, Nilesh Malpeddi, Gabrielle De Micheli, Prathamesh Vaste, Jacob Song, Woo Seong Chung
  • Consistent Discourse-level Temporal Relation Extraction Using Large Language Models
    Yi Fan, Michael Strube
  • MMPlanner: Zero-Shot Multimodal Procedural Planning with Chain-of-Thought Object State Reasoning
    Afrina Tabassum, Bin Guo, Xiyao Ma, Hoda Eldardiry, Ismini Lourentzou
  • Internal states before wait modulate reasoning patterns
    Dmitrii Troitskii, Koyena Pal, Chris Wendler, Callum Stuart McDougall
  • Sparsity May Be All You Need: Sparse Random Parameter Adaptation
    Jesus Rios, Pierre Dognin, Ronny Luss, Karthikeyan Natesan Ramamurthy
  • Learning to Align: Addressing Character Frequency Distribution Shifts in Handwritten Text Recognition
    Panagiotis Kaliosis, John Pavlopoulos
  • MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning
    Zhaopeng Feng, Shaosheng Cao, Jiahan Ren, Jiayuan Su, Ruizhe Chen, Yan Zhang, Jian Wu, Zuozhu Liu
  • Discrete Minds in a Continuous World: Do Language Models Know Time Passes?
    Minghan Wang, Ye Bai, Thuy-Trang Vu, Ehsan Shareghi, Gholamreza Haffari
  • DLTKG: Denoising Logic-based Temporal Knowledge Graph Reasoning
    Xiaoke Wang, Fu Zhang, Jingwei Cheng, Yiwen Chi, Jiashun Peng, Yingsong Ning
  • EMO-RL: Emotion-Rule-Based Reinforcement Learning Enhanced Audio-Language Model for Generalized Speech Emotion Recognition
    Pengcheng Li, Botao Zhao, Zuheng Kang, Junqing Peng, Xiaoyang Qu, Yayun He, Jianzong Wang
  • MANTA: A Scalable Pipeline for Transmuting Massive Web Corpora into Instruction Datasets
    Heuiyeen Yeen, Seokhee Hong, Hyeongu Yun, Jinsik Lee
  • Fast Quiet-STaR: Thinking Without Thought Tokens
    Wei Huang, Yizhe Xiong, Xin Ye, Zhijie Deng, Hui Chen, Zijia Lin, Guiguang Ding
  • Lock on Target! Precision Unlearning via Directional Control
    Yuntao Wen, Shen Gao, Ruixiang Feng, Feng Guo, Yifan Wang, Ran Le, Yang Song, Shuo Shang
  • UniRAG: A Unified RAG Framework for Knowledge-Intensive Queries with Decomposition, Break-Down Reasoning, and Iterative Rewriting
    Gun Il Kim, Jong Wook Kim, Beakcheol Jang
  • One Shot Dominance: Knowledge Poisoning Attack on Retrieval-Augmented Generation Systems
    Zhiyuan Chang, Mingyang Li, Xiaojun Jia, Junjie Wang, Yuekai Huang, Ziyou Jiang, Yang Liu, Qing Wang
  • From Generic Empathy to Personalized Emotional Support: A Self-Evolution Framework for User Preference Alignment
    Jing Ye, Lu Xiang, Yaping Zhang, Chengqing Zong
  • MaskCD: Mitigating LVLM Hallucinations by Image Head Masked Contrastive Decoding
    Jingyuan Deng, Yujiu Yang
  • ClusterUCB: Efficient Gradient-Based Data Selection for Targeted Fine-Tuning of LLMs
    Zige Wang, Qi Zhu, Fei Mi, Minghui Xu, Ruochun Jin, Wenjing Yang
  • TrapDoc: Deceiving LLM Users by Injecting Imperceptible Phantom Tokens into Documents
    Hyundong Jin, Sicheol Sung, Shinwoo Park, SeungYeop Baik, Yo-Sub Han
  • AraReasoner: Evaluating Reasoning-Based LLMs for Arabic NLP
    Ahmed Abul Hasanaath, Aisha Alansari, Ahmed Ashraf, Salmane Chafik, Hamzah Luqman, Saad Ezzini
  • Tales of Morality: Comparing Human- and LLM-Generated Moral Stories from Visual Cues
    Rezvaneh Rezapour, Sullam Jeoung, Zhiwen You, Jana Diesner
  • AirRAG: Autonomous Strategic Planning and Reasoning Steer Retrieval Augmented Generation
    Wenfeng Feng, Chuzhan Hao, Yuewei Zhang, Guochao Jiang, Jingyi Song
  • Evaluating NL2SQL via SQL2NL
    Mohammadtaher Safarzadeh, Afshin Oroojlooy, Dan Roth
  • DB-Explore: Automated Database Exploration and Instruction Synthesis for Text-to-SQL
    Haoyuan Ma, Yongliang Shen, Hengwei Liu, Wenqi Zhang, Haolei Xu, Qiuying Peng, Jun Wang, Weiming Lu
  • Do BERT-Like Bidirectional Models Still Perform Better on Text Classification in the Era of LLMs?
    Junyan Zhang, Yiming Huang, Shuliang Liu, Yubo Gao, Xuming Hu
  • Divide, Optimize, Merge: Scalable Fine-Grained Generative Optimization for LLM Agents
    Jiale Liu, Yifan Zeng, Shaokun Zhang, Chi Zhang, Malte Højmark-Bertelsen, Marie Normann Gadeberg, Huazheng Wang, Qingyun Wu
  • Evaluating Evaluation Metrics – The Mirage of Hallucination Detection
    Atharva Kulkarni, Yuan Zhang, Joel Ruben Antony Moniz, Xiou Ge, Bo-Hsiang Tseng, Dhivya Piraviperumal, Swabha Swayamdipta, Hong Yu
  • The Progress Illusion: Revisiting meta-evaluation standards of LLM evaluators
    Tianruo Rose Xu, Vedant Gaur, Liu Leqi, Tanya Goyal
  • MidPO: Dual Preference Optimization for Safety and Helpfulness in Large Language Models via a Mixture of Experts Framework
    Yupeng Qi, Ziyu Lyu, Min Yang, Yanlin Wang, Lu Bai, Lixin Cui
  • From KMMLU-Redux to Pro: A Professional Korean Benchmark Suite for LLM Evaluation
    Seokhee Hong, Sunkyoung Kim, Guijin Son, Soyeon Kim, Yeonjung Hong, Jinsik Lee
  • RealBench: A Chinese Multi-image Understanding Benchmark Close to Real-world Scenarios
    Fei Zhao, Chengqiang Lu, Yufan Shen, Qimeng Wang, Yicheng Qian, Haoxin Zhang, Yan Gao, YIWU, Yao Hu, Zhen Wu, Shangyu Xing, Xinyu Dai
  • The More, The Better? A Critical Study of Multimodal Context in Radiology Report Summarization
    Mong Yuan Sim, Wei Emma Zhang, Xiang Dai, Biaoyan Fang, Sarbin Ranjitkar, Arjun Burlakoti, Jamie Taylor, Haojie Zhuang
  • Localizing Malicious Outputs from CodeLLM
    Mayukh Borana, Liang Junyi, Sai Sathiesh Rajan, Sudipta Chattopadhyay
  • Knowing More, Acting Better: Hierarchical Representation for Embodied Decision-Making
    Chunhui Zhang, Zhongyu Ouyang, Xingjian Diao, Zheyuan Liu, Soroush Vosoughi
  • Culture is Everywhere: A Call for Intentionally Cultural Evaluation
    Juhyun Oh, Inha Cha, Michael Saxon, Hyunseung Lim, Shaily Bhatt, Alice Oh
  • Fairness in Automatic Speech Recognition Isn’t a One-Size-Fits-All
    Hend ElGhazaly, Bahman Mirheidari, Heidi Christensen, Nafise Sadat Moosavi
  • Uncovering Factor-Level Preference to Improve Human-Model Alignment
    Juhyun Oh, Eunsu Kim, Jiseon Kim, Wenda Xu, William Yang Wang, Alice Oh
  • Adaptive Preference Optimization with Uncertainty-aware Utility Anchor
    Xiaobo Wang, Zixia Jia, Jiaqi Li, Qi Liu, Zilong Zheng
  • GRAD: Generative Retrieval-Aligned Demonstration Sampler for Efficient Few-Shot Reasoning
    Oussama Gabouj, Kamel Charaf, Ivan Zakazov, Nicolas Baldwin, Robert West
  • IoTMigrator: LLM-driven Embedded IoT Code Migration across Different OSes for Cloud-device Integration
    YQ, Kaijie Gong, Yi Gao, Hao Wang, Wei Dong
  • ClueAnchor: Clue-Anchored Knowledge Reasoning Exploration and Optimization for Retrieval-Augmented Generation
    Hao Chen, Yukun Yan, Sen Mei, Wanxiang Che, Zhenghao Liu, Qi Shi, Xinze Li, Yuchun Fan, Pengcheng Huang, Qiushi Xiong, Zhiyuan Liu, Maosong Sun
  • BAGELS: Benchmarking the Automated Generation and Extraction of Limitations from Scholarly Text
    Ibrahim Al Azher, Miftahul Jannat Mokarrama, Zhishuai Guo, Sagnik Ray Choudhury, Hamed Alhoori
  • Dense Retrievers Can Fail on Simple Queries: Revealing The Granularity Dilemma of Embeddings
    Liyan Xu, Zhenlin Su, Mo Yu, Jiangnan Li, Fandong Meng, Jie Zhou
  • Over-Generation and Compaction: A Prompting Strategy for Procedural Text Adaptation with Large Language Models
    HyeongSik Kim, XU Yanheng, Chaoqun Dong, Fei Du
  • TransBERT: A Framework for Synthetic Translation in Domain-Specific Language Modeling
    Julien Knafou, Luc Mottin, Anaïs Mottaz, Alexandre Flament, Patrick Ruch
  • Beyond Fixed-Length Calibration for Post-Training Compression of LLMs
    Jaehoon Oh, Dokwan Oh
  • Attributes as Textual Genes: Leveraging LLMs as Genetic Algorithm Simulators for Conditional Synthetic Data Generation
    Guangzeng Han, Weisi Liu, Xiaolei Huang
  • ReCoVeR the Target Language: Language Steering without Sacrificing Task Performance
    Hannah Sterz, Fabian David Schmidt, Goran Glavaš, Ivan Vulić
  • LC-Eval: A Bilingual Multi-Task Evaluation Benchmark for Long-Context Understanding
    Sheikh Jubair, Arwa Omayrah, Amal Alshammari, Alhanoof Althnian, Abdulhamed Alothaimen, Norah A. Alzahrani, Shahad D. Alzaidi, Nora Al-Twairesh, Abdulmohsen Al-Thubaity
  • OVFact: Measuring and Improving Open-Vocabulary Factuality for Long Caption Models
    Monika Wysoczańska, Shyamal Buch, Anurag Arnab, Cordelia Schmid
  • GRPO-Guided Modality Selection Enhanced LoRA-Tuned LLMs for Multimodal Emotion Recognition
    Yang Chen, Shuwan Yang, Yan Xiang, Ran Song, Yuxin Huang, Zhengtao Yu
  • Defending against Indirect Prompt Injection by Instruction Detection
    Tongyu Wen, Chenglong Wang, Xiyuan Yang, Haoyu Tang, Yueqi Xie, Lingjuan Lyu, Zhicheng Dou, Fangzhao Wu
  • MUG-Eval: A Proxy Evaluation Framework for Multilingual Generation Capabilities in Any Language
    Seyoung Song, Seogyeong Jeong, Eunsu Kim, Jiho Jin, Dongkwan Kim, Jay Shin, Alice Oh
  • CAC-CoT: Connector-Aware Compact Chain-of-Thought for Efficient Reasoning Data Synthesis Across Dual-System Cognitive Tasks
    Sunguk Choi, Yonghoon Kwon, Heondeuk Lee
  • On the Versatility of Sparse Autoencoders for In-Context Learning
    Ikhyun Cho, Gaeul Kwon, Julia Hockenmaier
  • More Documents, Same Length: Isolating the Challenge of Multiple Documents in RAG
    Shahar Levy, Nir Mazor, Lihi Shalmon, Michael Hassid, Gabriel Stanovsky
  • CLEAR: A Comprehensive Linguistic Evaluation of Argument Rewriting by Large Language Models
    Thomas Huber, Christina Niklaus
  • ALRPHFS: Adversarially Learned Risk Patterns with Hierarchical Fast & Slow Reasoning for Robust Agent Defense
    Shiyu Xiang, Tong Zhang, Ronghao Chen
  • Stop Playing the Guessing Game! Evaluating Conversational Recommender Systems via Target-free User Simulation
    SungHwan Kim, Kwangwook Seo, Tongyoung Kim, Jinyoung Yeo, Dongha Lee
  • Out-of-Context Reasoning in Large Language Models
    Jonathan Shaki, Emanuele La Malfa, Michael J. Wooldridge, Sarit Kraus
  • CodeComplex: Dataset for Worst-Case Time Complexity Prediction
    SeungYeop Baik, Joonghyuk Hahn, Jungin Kim, Aditi, Mingi Jeon, Yo-Sub Han, Sang-Ki Ko
  • Weak2Wise: An Automated, Lightweight Framework for Weak-LLM-Friendly Reasoning Synthesis
    Jianing Lin, Yuanfang Guo, Shunning Liu, Zeming Liu, Yunhong Wang
  • From Tower to Spire: Adding the Speech Modality to a Translation-Specialist LLM
    Kshitij Ambilduke, Ben Peters, Sonal Sannigrahi, Anil Keshwani, Tsz Kin Lam, Bruno Martins, Marcely Zanon Boito, Andre Martins
  • LLM Agents at the Roundtable: A Multi-Perspective and Dialectical Reasoning Framework for Essay Scoring
    Jinhee Jang, Ayoung Moon, Minkyoung Jung, YoungBin Kim, Seung Jin Lee
  • DeepNote: Note-Centric Deep Retrieval-Augmented Generation
    Ruobing Wang, Qingfei Zhao, Yukun Yan, Daren Zha, Yuxuan Chen, Shi Yu, Zhenghao Liu, Yixuan Wang, Shuo Wang, Xu Han, Zhiyuan Liu, Maosong Sun
  • NormAL LoRA: What is the perfect size?
    Aastik, Topu Sai Meghana, Chinmay Prakash Kulkarni, Pragya Paramita Sahu
  • Inclusive Leadership in the Age of AI: A Dataset and Comparative Study of LLMs vs. Real-Life Leaders in Workplace Action Planning
    Vindhya Singh, Sabine Schulte im Walde, Ksenia Keplinger
  • Token Preference Optimization with Self-Calibrated Visual-Anchored Rewards for Hallucination Mitigation
    Jihao Gu, Yingyao Wang, Meng Cao, Pi Bu, Jun Song, Bo Zheng, Yancheng He, Shilong Li
  • EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion
    Advait Joglekar, Divyanshu Singh, Rooshil Rohit Bhatia, Srinivasan Umesh
  • Length Representations in Large Language Models
    Sangjun Moon, Dasom choi, Jingun Kwon, Hidetaka Kamigaito, Manabu Okumura
  • MultiLingPoT: Boosting Mathematical Reasoning in LLMs through Multilingual Program Integration
    Nianqi Li, Zujie Liang, Siyu Yuan, Jiaqing Liang, Feng Wei, Yanghua Xiao
  • Simulating Identity, Propagating Bias: Abstraction and Stereotypes in LLM-Generated Text
    Pia Sommerauer, Giulia Rambelli, Tommaso Caselli
  • Do LVLMs Know What They Know? A Systematic Study of Knowledge Boundary Perception in LVLMs
    Zhikai Ding, Shiyu Ni, Keping Bi
  • Benchmarking Large Language Models for Cryptanalysis and Side-Channel Vulnerabilities
    Utsav Maskey, Chencheng ZHU, Usman Naseem
  • MTabVQA: Evaluating Multi-Tabular Reasoning of Language Models in Visual Space
    Anshul Singh, Chris Biemann, Jan Strich
  • TurnBench-MS: A Benchmark for Evaluating Multi-Turn, Multi-Step Reasoning in Large Language Models
    Yiran Zhang, Mo Wang, Xiaoyang Li, Kaixuan Ren, Chencheng ZHU, Usman Naseem
  • Assessing LLM Reasoning Steps via Principal Knowledge Grounding
    Hyeon Hwang, Yewon Cho, Chanwoong Yoon, Yein Park, Minju Song, Kyungjae Lee, Gangwoo Kim, Jaewoo Kang
  • Stratified Selective Sampling for Instruction Tuning with Dedicated Scoring Strategy
    Paramita Mirza, Lucas Weber, Fabian Küch
  • CoTD-PO: Chain-of-Thought Distillation with Preference Optimization
    Lujie Niu, Haochen Sun, Fangkun Zhao, Sheng Chen, Zimeng Bai, Zhang jiawei, Caixia Yuan, Xiaojie Wang
  • Intelligent Document Parsing: Towards End-to-end Document Parsing via Decoupled Content Parsing and Layout Grounding
    Hangdi Xing, Feiyu Gao, Qi Zheng, Zhaoqing Zhu, Zirui Shao, Ming Yan
  • Feel the Difference? A Comparative Analysis of Emotional Arcs in Real and LLM-Generated CBT Sessions
    Xiaoyi Wang, Jiwei Zhang, Guangtao Zhang, Honglei Guo
  • Beyond Single-User Dialogue: Assessing Multi-User Dialogue State Tracking Capabilities of Large Language Models
    Sangmin Song, Juhwan Choi, JungMin Yun, YoungBin Kim
  • All-in-one: Understanding and Generation in Multimodal Reasoning with the MAIA Benchmark
    Davide Testa, Giovanni Bonetta, Raffaella Bernardi, Alessandro Bondielli, Alessandro Lenci, Alessio Miaschi, Lucia Passaro, Bernardo Magnini
  • Triangulating LLM Progress through Benchmarks, Games, and Cognitive Tests
    Filippo Momentè, Alessandro Suglia, Mario Giulianelli, Ambra Ferrari, Alexander Koller, Oliver Lemon, David Schlangen, Raquel Fernández, Raffaella Bernardi
  • Entity Profile Generation and Reasoning with LLMs for Entity Alignment
    Rumana Ferdous Munne, Md Mostafizur Rahman, Yuji Matsumoto
  • Re-FRAME the Meeting Summarization SCOPE: Fact-Based Summarization and Personalization via Questions
    Frederic Kirstein, Sonu Kumar, Terry Ruas, Bela Gipp
  • Attack as Defense: Safeguarding Large Vision-Language Models from Jailbreaking by Adversarial Attacks
    Chongxin Li, Hanzhang Wang, Yuchun Fang
  • Emphasising Structured Information: Integrating Abstract Meaning Representation into LLMs for Enhanced Open-Domain Dialogue Evaluation
    Bohao Yang, Kun Zhao, Dong Liu, Chen Tang, Liang Zhan, Chenghua Lin
  • Differentiated Vision: Unveiling Entity-Specific Visual Modality Requirements for Multimodal Knowledge Graph
    Minghang Liu, Yinghan Shen, Zihe Huang, Yuanzhuo Wang, Xuhui Jiang, Huawei Shen
  • Post Persona Alignment for Multi-Session Dialogue Generation
    Yi-Pei Chen, Noriki Nishida, Hideki Nakayama, Yuji Matsumoto
  • MASSIVE-Agents: A Benchmark for Multilingual Function-Calling in 52 Languages
    Mayank Kulkarni, Vittorio Mazzia, Judith Gaspers, Chris Hench, Jack FitzGerald
  • Crafting Customisable Characters with LLMs: A Persona-Driven Role-Playing Agent Framework
    Bohao Yang, Dong Liu, Chenghao Xiao, Kun Zhao, Chen Tang, Chao Li, Lin Yuan, YANG GUANG, Lanxiao Huang, Chenghua Lin
  • Can LLMs Express Personality Across Cultures? Introducing CulturalPersonas for Evaluating Trait Alignment
    Priyanka Dey, Aayush Bothra, Yugal Khanter, Emilio Ferrara, Jieyu Zhao
  • Exploring the Hidden Reasoning Process of Large Language Models by Misleading Them
    Guanyu Chen, Peiyang Wang, Yizhou Jiang, Yuqian Liu, Chujie Zhao, Ying Fang, Tianren Zhang, Feng Chen
  • When Models Reason in Your Language: Controlling Thinking Language Comes at the Cost of Accuracy
    Jirui Qi, Shan Chen, Zidi Xiong, Raquel Fernández, Danielle Bitterman, Arianna Bisazza
  • The Role of Model Confidence on Bias Effects in Measured Uncertainties for Vision-Language Models
    Xinyi Liu, Weiguang Wang, Hangfeng He
  • GAttention: Gated Attention for the Detection of Abusive Language
    Horacio Jarquín Vásquez, Hugo Jair Escalante, Manuel Montes, Mario Ezra Aragon
  • Towards Low-Resource Alignment to Diverse Perspectives with Sparse Feedback
    Chu Fei Luo, Samuel Dahan, Xiaodan Zhu
  • ProtoXTM: Cross-Lingual Topic Modeling with Document-Level Prototype-based Contrastive Learning
    Seung-Won Seo, Soon-Sun Kwon
  • One More Question is Enough, Expert Question Decomposition (EQD) Model for Domain Quantitative Reasoning
    Mengyu Wang, Sotirios Sabanis, Miguel de Carvalho, Shay B Cohen, Tiejun Ma
  • When Punctuation Matters: A Large-Scale Comparison of Prompt Robustness Methods for LLMs
    Mikhail Seleznyov, Mikhail Chaichuk, Gleb Ershov, Alexander Panchenko, Elena Tutubalina, Oleg Somov
  • RAR²: Retrieval-Augmented Medical Reasoning via Thought-Driven Retrieval
    Kaishuai Xu, Wenjun Hou, Yi Cheng, Wenjie Li
  • The Security Threat of Compressed Projectors in Large Vision-Language Models
    Yudong Zhang, Ruobing Xie, Xingwu Sun, Jiansheng Chen, Zhanhui Kang, Di Wang, Yu Wang
  • NarratEX Dataset: Explaining the Dominant Narratives in News Texts
    Nuno Guimarães, Purificação Silvano, Ricardo Campos, Alipio Jorge, Ana Filipa Pacheco, Dimitar Iliyanov Dimitrov, Nikolaos Nikolaidis, Roman Yangarber, Elisa Sartori, Nicolas Stefanovitch, Preslav Nakov, Jakub Piskorski, Giovanni Da San Martino
  • Radical Allomorphy: Phonological Surface Forms without Phonology
    Salam Khalifa, Nizar Habash, Owen Rambow
  • Model Calibration for Emotion Detection
    Mihaela Petre-Vlad, Cornelia Caragea, Florentina Hristea
  • From Benchmark to Better Embeddings: Leveraging Synonym Substitution to Enhance Multimodal Models in Ukrainian
    Volodymyr Mudryi, Yurii Laba
  • Context Copying Modulation: The Role of Entropy Neurons in Managing Parametric and Contextual Knowledge Conflicts
    Zineddine Tighidet, Andrea Mogini, Hedi Ben younes, Jiali Mei, Patrick Gallinari, Benjamin Piwowarski
  • A Generalizable Rhetorical Strategy Annotation Model Using LLM-based Debate Simulation and Labelling
    Shiyu Ji, Farnoosh Hashemi, Joice Chen, Juanwen Pan, Weicheng Ma, Hefan Zhang, Sophia Pan, Ming Cheng, Shubham Mohole, Saeed Hassanpour, Soroush Vosoughi, Michael Macy
  • SecDecoding: Steerable Decoding for Safer LLM Generation
    Jiayou Wang, Rundong Liu, Yue Hu, Huijia Wu, Zhaofeng He
  • GENUINE: Graph Enhanced Multi-level Uncertainty Estimation for Large Language Models
    Tuo Wang, Adithya Kulkarni, Tyler Cody, Peter A. Beling, Yujun Yan, Dawei Zhou
  • ReviewEval: An Evaluation Framework for AI-Generated Reviews
    Madhav Krishan Garg, Tejash Prasad, Tanmay Singhal, Chhavi Kirtani, Murari Mandal, Dhruv Kumar
  • Overcoming Black-box Attack Inefficiency with Hybrid and Dynamic Select Algorithms
    Abhinay Shankar Belde, Rohit Ramkumar, Jonathan Rusert
  • GmSLM: Generative Marmoset Spoken Language Modeling
    Talia Sternberg, Yossi Adi, Michael London, David Omer
  • QA-LIGN: Aligning LLMs through Constitutionally Decomposed QA
    Jacob Dineen, Aswin RRV, Qin Liu, Zhikun Xu, Xiao Ye, Ming Shen, Zhaonan Li, Shijie Lu, Chitta Baral, Muhao Chen, Ben Zhou
  • Characterizing Positional Bias in Large Language Models: A Multi-Model Evaluation of Prompt Order Effects
    Patrick Schilcher, Dominik Karasin, Michael Schöpf, Haisam Saleh, Antonela Tommasel, Markus Schedl
  • You Only Use Reactive Attention Slice When Retrieving From Long Context
    Yun Joon Soh, Hanxian Huang, Yuandong Tian, Jishen Zhao
  • Fine-Tuned Thoughts: Leveraging Chain-of-Thought Reasoning for Industrial Asset Health Monitoring
    Shuxin Lin, Dhaval C Patel, Christodoulos Constantinides
  • CoViPAL: Layer-wise Contextualized Visual Token Pruning for Large Vision-Language Models
    Zicong Tang, Ziyang Ma, Suqing Wang, Zuchao Li, Lefei Zhang, hai zhao, Yun Li, Qianren Wang
  • Large Language Models with Temporal Reasoning for Longitudinal Clinical Summarization and Prediction
    Maya Kruse, Shiyue Hu, Nicholas Derby, Yifu Wu, Samantha Stonbraker, Bingsheng Yao, Dakuo Wang, Elizabeth M. Goldberg, Yanjun Gao
  • TransAlign: Machine Translation Encoders are Strong Word Aligners, Too
    Benedikt Ebing, Christian Goldschmied, Goran Glavaš
  • Pruning Weights but Not Truth: Safeguarding Truthfulness While Pruning LLMs
    Yao Fu, Runchao Li, Xianxuan Long, Haotian Yu, Xiaotian Han, Yu Yin, Pan Li
  • Augment before You Try: Knowledge-Enhanced Table Question Answering via Table Expansion
    Yujian Liu, Jiabao Ji, Tong Yu, Ryan A. Rossi, Sungchul Kim, Handong Zhao, Ritwik Sinha, Yang Zhang, Shiyu Chang
  • Evaluating Large Language Models for Belief Inference: Mapping Belief Networks at Scale
    Trisevgeni Papakonstantinou, Antonina Zhiteneva, Ana Yutong Ma, Derek Powell, Zachary Horne
  • Distinguishing fair from unfair compositional generalization tasks
    Ahmad Jabbar, Cleo Condoravdi, Christopher Potts
  • SA-CLIP: Language Guided Image Spatial and Action Feature Learning
    Guanlin Li, Wenhao SHAO, Praboda Rajapaksha, Noel Crespi
  • Inefficiencies of Meta Agents for Agent Design
    Batu El, Mert Yuksekgonul, James Zou
  • SCoder: Progressive Self-Distillation for Bootstrapping Small-Scale Data Synthesizers to Empower Code LLMs
    Xinyu Zhang, Changzhi Zhou, Linmei Hu, Luhao Zhang, Xiancai Chen, Haomin Fu, Yang Yang, Mengdi Zhang
  • Linguistically-Controlled Paraphrase Generation
    Mohamed Elgaar, Hadi Amiri
  • LAWCAT: Efficient Distillation from Quadratic to Linear Attention with Convolution across Tokens for Long Context Modeling
    Zeyu Liu, Souvik Kundu, Lianghao Jiang, Anni Li, Srikanth Ronanki, Sravan Babu Bodapati, Gourav Datta, Peter Anthony Beerel
  • Analyzing Dialectical Biases in LLMs for Knowledge and Reasoning Benchmarks
    Eileen Pan, Anna Seo Gyeong Choi, Maartje Ter Hoeve, Skyler Seto, Allison Koenecke
  • TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling
    Jiahao Qiu, Yifu Lu, Yifan Zeng, Jiacheng Guo, Jiayi Geng, Chenhao Zhu, Xinzhe Juan, Ling Yang, Huazheng Wang, Kaixuan Huang, Yue Wu, Mengdi Wang
  • CulturalFrames: Assessing Cultural Expectation Alignment in Text-to-Image Models and Evaluation Metrics
    Shravan Nayak, Mehar Bhatia, Xiaofeng Zhang, Verena Rieser, Lisa Anne Hendricks, Sjoerd van Steenkiste, Yash Goyal, Karolina Stanczak, Aishwarya Agrawal
  • Decoupled Proxy Alignment: Mitigating Language Prior Conflict for Multimodal Alignment in MLLMs
    Chenkun Tan, Pengyu Wang, Shaojun Zhou, Botian Jiang, Zhaowei Li, Dong Zhang, Xinghao Wang, Yaqian Zhou, Xipeng Qiu
  • Riemannian Optimization for LoRA on the Stiefel Manifold
    JuneYoung Park, Minjae Kang, Seongbae Lee, Haegang Lee, Seongwan Kim, Jaeho Lee
  • How Real Are Synthetic Therapy Conversations? Evaluating Fidelity in Prolonged Exposure Dialogues
    Suhas BN, Dominik O. Mattioli, Andrew M. Sherrill, Rosa I. Arriaga, Christopher Wiese, Saeed Abdullah
  • Large Language Models for Controllable Multi-property Multi-objective Molecule Optimization
    Vishal Dey, Xiao Hu, Xia Ning
  • Measuring Lexical Diversity of Synthetic Data Generated through Fine-Grained Persona Prompting
    Gauri Kambhatla, Chantal Shaib, Venkata S Govindarajan
  • Beyond Function-Level Search: Repository-Aware Dual-Encoder Code Retrieval with Adversarial Verification
    Aofan Liu, Shiyuan SONG, Haoxuan Li, Cehao Yang, Yiyan Qi
  • Watermark under Fire: A Robustness Evaluation of LLM Watermarking
    Jiacheng Liang, Zian Wang, Spencer Hong, Shouling Ji, Ting Wang
  • PEPE: Long-context Extension for Large Language Models via Periodic Extrapolation Positional Encodings
    Jikun Hu, Dongsheng Guo, Yuli LIU, Qingyao Ai, Lixuan Wang, Xuebing Sun, Qilei Zhang, Quan Zhou, Cheng Luo
  • Beyond Self-Reports: Multi-Observer Agents for Personality Assessment in Large Language Models
    Yin Jou Huang, Rafik Hadfi
  • Controlled Retrieval-augmented Context Evaluation for Long-form RAG
    Jia-Huei Ju, Suzan Verberne, Maarten de Rijke, Andrew Yates
  • Humanity’s Last Code Exam: Can Advanced LLMs Conquer Human’s Hardest Code Competition?
    Xiangyang Li, Xiaopeng Li, Kuicai Dong, Zhangquanhu, Rongju Ruan, Xinyi Dai, Yasheng Wang, Ruiming Tang
  • False Friends Are Not Foes: Investigating Vocabulary Overlap in Multilingual Language Models
    Julie Kallini, Dan Jurafsky, Christopher Potts, Martijn Bartelds
  • Rule-Guided Extraction: A Hierarchical Rule Optimization Framework for Document-Level Event Argument Extraction
    Yue Zuo, Yuxiao Fei, Wanting Ning, Jiayi Huang, Yubo Feng, Lishuang Li
  • SOPL: A Sequential Optimal Learning Approach to Automated Prompt Engineering in Large Language Models
    Shuyang Wang, Somayeh Moazeni, Diego Klabjan
  • CLIP-UP: A Simple and Efficient Mixture-of-Experts CLIP Training Recipe with Sparse Upcycling
    Xinze Wang, Chen Chen, Yinfei Yang, Hong-You Chen, Bowen Zhang, Aditya Pal, Xiangxin Zhu, Xianzhi Du
  • A Category-Theoretic Approach to Neural-Symbolic Task Planning with Bidirectional Search
    Shuhui Qu, Jie Wang, Kincho Law
  • HEAL: An Empirical Study on Hallucinations in Embodied Agents Driven by Large Language Models
    Trishna Chakraborty, Udita Ghosh, Xiaopan Zhang, Fahim Faisal Niloy, Yue Dong, Jiachen Li, Amit Roy-Chowdhury, Chengyu Song
  • Can LLMs Judge Debates? Evaluating Non-Linear Reasoning via Argumentation Theory Semantics
    Reza Sanayei, Srdjan Vesic, Eduardo Blanco, Mihai Surdeanu
  • How Jailbreak Defenses Work and Ensemble? A Mechanistic Investigation
    Zhuohan Long, Siyuan Wang, Shujun Liu, Yuhang Lai
  • Visual Self-Refinement for Autoregressive Models
    Jiamian Wang, Ziqi Zhou, Chaithanya Kumar Mummadi, Sohail Dianat, Majid Rabbani, Raghuveer Rao, Chen Qiu, Zhiqiang Tao
  • Retrieval-Augmented Language Models are Mimetic Theorem Provers
    Wenjie Yang, Ruiyuan Huang, Jiaxing Guo, Zicheng Lyu, Tongshan Xu, Shengzhong Zhang, Lun Du, Da Zheng, Zengfeng Huang
  • LORE: Continual Logit Rewriting Fosters Faithful Generation
    Charles Yu, Qingyun Wang, Yuting Hu, Jinjun Xiong, Heng Ji
  • PRINCIPLES: Synthetic Strategy Memory for Proactive Dialogue Agents
    Namyoung Kim, Kai Tzu-iunn Ong, Yeonjun Hwang, Minseok Kang, Iiseo Jihn, Gayoung Kim, Minju Kim, Jinyoung Yeo
  • SLM-Bench: A Comprehensive Benchmark of Small Language Models on Environmental Impacts
    Nghiem Thanh Pham, Tung Kieu, Duc Manh Nguyen, Son Ha Xuan, Nghia Duong-Trung, Danh Le-Phuoc
  • A Decoupled Multi-Agent Framework for Complex Text Style Transfer
    Lingxi Zhang, Yu-Neng Chuang, Guanchu Wang, Ruixiang Tang, Xuanting Cai, Rajesh Shenoy, Xia Hu
  • Mamba Drafters for Speculative Decoding
    Daewon Choi, Seunghyuk Oh, Saket Dingliwal, Jihoon Tack, Kyuyoung Kim, Woomin Song, Seojin Kim, Insu Han, Jinwoo Shin, Aram Galstyan, Shubham Katiyar, Sravan Babu Bodapati
  • LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture
    Xidong Wang, Dingjie Song, Shunian Chen, Junying Chen, Zhenyang Cai, Chen Zhang, Lichao Sun, Benyou Wang
  • Think Clearly: Improving Reasoning via Redundant Token Pruning
    Daewon Choi, Jimin Lee, Jihoon Tack, Woomin Song, Saket Dingliwal, Sai Muralidhar Jayanthi, Bhavana Ganesh, Jinwoo Shin, Aram Galstyan, Sravan Babu Bodapati
  • A Systematic Survey of Claim Verification: Corpora, Systems, and Case Studies
    Zhaxi Zerong, Chenxi Li, Xinyi Liu, Ju-hui Chen, Fei Xia
  • Automated Creativity Evaluation for Large Language Models: A Reference-Based Approach
    Ruizhe Li, Chiwei Zhu, Benfeng Xu, Xiaorui Wang, Zhendong Mao
  • LangProBe: a Language Program Benchmark
    Shangyin Tan, Lakshya A Agrawal, Arnav Singhvi, Liheng Lai, Michael J Ryan, Dan Klein, Omar Khattab, Koushik Sen, Matei Zaharia
  • Exploring and Detecting Self-disclosure in Multi-modal posts on Chinese Social Media
    Jingbao Luo, Ming Liu, Aoli Huo, hufujing, Gang Li, WupengNjust
  • MV-CLAM: Multi-View Molecular Interpretation with Cross-Modal Projection via Language Model
    Sumin Ha, Jun Hyeong Kim, Yinhua Piao, Changyun Cho, Sun Kim
  • Mind the Style Gap: Meta-Evaluation of Style and Attribute Transfer Metrics
    Amalie Brogaard Pauli, Isabelle Augenstein, Ira Assent
  • ExtremeAIGC: Benchmarking LMM Vulnerability to AI-Generated Extremist Content
    Bhavik Chandna, Mariam Aboujenane, Usman Naseem
  • Data Augmentation for Maltese NLP using Transliterated and Machine Translated Arabic Data
    Kurt Micallef, Nizar Habash, Claudia Borg
  • Do LLMs Align Human Values Regarding Social Biases? Judging and Explaining Social Biases with LLMs
    Yang Liu, Chenhui Chu
  • CoEx – Co-evolving World-model and Exploration
    Minsoo Kim, Seung-won Hwang
  • BrainLoc: Brain Signal-Based Object Detection with Multi-modal Alignment
    Kaixuan Luan, Xiaoda Yang, Hongshun Qiu, Weicai Yan, Xueyi Zhang, Jiaqi Duan, Youliang Zhang, Zhaoyang Li, Donglin Huang, Zejian Xie, JunYu Lu, Ziyue Jiang
  • PVTNL: Prompting Vision Transformers with Natural Language for Generalizable Person Re-identification
    WANGNING, Lei Xie, Sanglu Lu, Shiwei Gan
  • RingFormer: Rethinking Recurrent Transformer with Adaptive Level Signals
    Jaemu Heo, Eldor Fozilov, Hyunmin Song, Taehwan Kim
  • TriSPrompt: A Hierarchical Soft Prompt Model for Multimodal Rumor Detection with Incomplete Modalities
    Jiajun Chen, Yangyang Wu, Xiaoye Miao, Mengying Zhu, Meng Xi
  • Evaluating Uncertainty Quantification Methods in Argumentative Large Language Models
    Kevin Zhou, Adam Dejl, Gabriel Freedman, Lihu Chen, Antonio Rago, Francesca Toni
  • Med-RewardBench: Benchmarking Reward Models and Judges for Medical Multimodal Large Language Models
    Meidan Ding, Jipeng Zhang, Wenxuan Wang, Cheng-Yi Li, Wei-Chieh Fang, Hsin-Yu Wu, Haiqin Zhong, Wenting Chen, Linlin Shen
  • CLAIMCHECK: How Grounded are LLM Critiques of Scientific Papers?
    Jiefu Ou, William Gantt Walden, Kate Sanders, Zhengping Jiang, Kaiser Sun, Jeffrey Cheng, William Jurayj, Miriam Wanner, Shaobo Liang, Candice Morgan, Seunghoon Han, Weiqi Wang, Chandler May, Hannah Recknor, Daniel Khashabi, Benjamin Van Durme
  • From Noise to Clarity: Filtering Real and LLM-Generated Samples for Enhanced Intent Detection
    Junbao Huang, Weizhen Li, Peijie Huang, Yuhong Xu
  • Improving Language Model Personas via Rationalization with Psychological Scaffolds
    Brihi Joshi, Xiang Ren, Swabha Swayamdipta, Rik Koncel-Kedziorski, Tim Paek
  • KBM: Delineating Knowledge Boundary for Adaptive Retrieval in Large Language Models
    Zhen Zhang, Xinyu Wang, Yong Jiang, Zile Qiao, Zhuo Chen, Guangyu Li, Feiteng Mu, Mengting Hu, Pengjun Xie, Fei Huang
  • TABARD: A Novel Benchmark for Tabular Anomaly Analysis, Reasoning and Detection
    Manan Roy Choudhury, Shikhhar Siingh, Anirudh Iyengar Kaniyar Narayana Iyengar, Sugeeth Puranam, Vivek Gupta
  • Aspect-based Sentiment Analysis via Synthetic Image Generation
    Ge Chen, Zhongqing Wang, Guodong Zhou
  • IntrEx: A Dataset for Modeling Engagement in Educational Conversations
    Xingwei Tan, Mahathi Parvatham, Chiara Gambi, Gabriele Pergola
  • Bridging the Capability Gap: Joint Alignment Tuning for Harmonizing LLM-based Multi-Agent Systems
    Minghang Zhu, Zhengliang Shi, Zhiwei Xu, Shiguang Wu, Lingjie Wang, Pengjie Ren, Zhaochun Ren, Zhumin Chen
  • Safety Through Reasoning: An Empirical Study of Reasoning Guardrail Models
    Makesh Narsimhan Sreedhar, Traian Rebedea, Christopher Parisien
  • Context-Aware Reasoning On Parametric Knowledge for Inferring Causal Variables
    Ivaxi Sheth, Sahar Abdelnabi, Mario Fritz
  • LoRE-Merging: Exploring Low-Rank Estimation For Large Language Model Merging
    Zehua Liu, Han Wu, Yuxuan Yao, Xiaojin Fu, Ruifeng She, Xiongwei Han, Tao Zhong, Mingxuan Yuan
  • Benchmarking Foundation Models with Retrieval-Augmented Generation in Olympic-Level Physics Problem Solving
    Shunfeng Zheng, Yudi Zhang, Meng Fang, Zihan Zhang, Zhitan Wu, Mykola Pechenizkiy, Ling Chen
  • FiRST: Finetuning Router-Selective Transformers for Input-Adaptive Latency Reduction
    Akriti Jain, Saransh Sharma, Koyel Mukherjee, Soumyabrata Pal
  • Explaining Sources of Uncertainty in Automated Fact-Checking
    Jingyi Sun, Greta Warren, Irina Shklovski, Isabelle Augenstein
  • PolitiSky24: U.S. Political Bluesky Dataset with User Stance Labels
    Peyman Rostami, Vahid Rahimzadeh, Ali Adibi, Azadeh Shakery
  • From Ground Trust to Truth: Disparities in Offensive Language Judgments on Contemporary Korean Political Discourse
    Seunguk Yu, JungMin Yun, Jinhee Jang, YoungBin Kim
  • Misalignment Attack on Text-to-Image Models via Text Embedding Optimization and Inversion
    Zhijie Du, Daizong Liu, Pan Zhou
  • Domain Pre-training Impact on Representations
    Cesar Gonzalez-Gutierrez, Ariadna Quattoni
  • KoACD: The First Korean Adolescent Dataset for Cognitive Distortion Analysis via Role-Switching Multi-LLM Negotiation
    Jun Seo Kim, Hye Hyeon Kim
  • Refined Assessment for Translation Evaluation: Rethinking Machine Translation Evaluation in the Era of Human-Level Systems
    Dmitry Popov, Vladislav Negodin, Ekaterina Enikeeva, Iana Matrosova, Nikolay Karpachev, Max Ryabinin
  • Pre-Storage Reasoning for Episodic Memory: Shifting Inference Burden to Memory for Personalized Dialogue
    Sangyeop Kim, Yohan Lee, Sanghwa Kim, Hyunjong Kim, Sungzoon Cho
  • Temporal Consistency for LLM Reasoning Process Error Identification
    Jiacheng Guo, Yue Wu, Jiahao Qiu, Kaixuan Huang, Xinzhe Juan, Ling Yang, Mengdi Wang
  • Quantifying Compositionality of Classic and State-of-the-Art Embeddings
    Zhijin Guo, Chenhao Xue, Zhaozhen Xu, Hongbo Bo, Yuxuan Ye, Janet B. Pierrehumbert, Martha Lewis
  • Presumed Cultural Identity: How Names Shape LLM Responses
    Siddhesh Milind Pawar, Arnav Arora, Lucie-Aimée Kaffee, Isabelle Augenstein
  • I-GUARD: Interpretability-Guided Parameter Optimization for Adversarial Defense
    Mamta Mamta, Oana Cocarascu
  • DecoupledESC: Enhancing Emotional Support Generation via Strategy-Response Decoupled Preference Optimization
    Chao Zhang, Xin Shi, Xueqiao Zhang, Yifan Zhu, Yi Yang, Yawei Luo
  • Local Normalization Distortion and the Thermodynamic Formalism of Decoding Strategies for Large Language Models
    Tom Kempton, Stuart Burrell
  • BRIT: Bidirectional Retrieval over Unified Image-Text Graph
    Ainulla Khan, Moyuru Yamada, Srinidhi Akella
  • ReTAG: Retrieval-Enhanced, Topic-Augmented Graph-Based Global Sensemaking
    Boyoung Kim, Dosung Lee, Sumin An, Jinseong Jeong, Paul Hongsuck Seo
  • Vocab Diet: Reshaping the Vocabularies of LLMs with Vector Arithmetic
    Yuval Reif, Guy Kaplan, Roy Schwartz
  • Capturing Latent Modal Association For Multimodal Entity Alignment
    Yongquan Ji, Jingwei Cheng, Fu Zhang, Chenglong Lu
  • Explaining novel senses using definition generation with open language models
    Mariia Fedorova, Andrey Kutuzov, Francesco Periti, Yves Scherrer
  • Can Code-Switched Texts Activate a Knowledge Switch in LLMs? A Case Study on English-Korean Code-Switching
    Seoyeon Kim, Huiseo Kim, Chanjun Park, Jinyoung Yeo, Dongha Lee
  • Compositional Translation: A Novel LLM-based Approach for Low-resource Machine Translation
    Armel Randy Zebaze, Benoît Sagot, Rachel Bawden
  • TopXGen: Topic-Diverse Parallel Data Generation for Low-Resource Machine Translation
    Armel Randy Zebaze, Benoît Sagot, Rachel Bawden
  • Fast, Not Fancy: Rethinking G2P with Rich Data and Statistical Models
    Mahta Fetrat Qharabagh, Zahra Dehghanian, Hamid R. Rabiee
  • Personalized open world plan generation for safety-critical human centered autonomous systems: A case study on Artificial Pancreas
    Ayan Banerjee, Sandeep Gupta
  • CaMMT: Benchmarking Culturally Aware Multimodal Machine Translation
    Emilio Villa-Cueva, Sholpan Bolatzhanova, Diana Turmakhan, Kareem Elzeky, Henok Biadglign Ademtew, Alham Fikri Aji, Israel Abebe Azime, Jinheon Baek, Frederico Belcavello, Fermin Cristobal, Jan Christian Blaise Cruz, Mary Dabre, Raj Dabre, Toqeer Ehsan, Naome A Etori, Fauzan Farooqui, Jiahui Geng, Guido Ivetta, Thanmay Jayakumar, Soyeong Jeong, Zheng Wei Lim, Aishik Mandal, Sofía Martinelli, Mihail Minkov Mihaylov, Daniil Orel, Aniket Pramanick, Sukannya Purkayastha, Israfel Salazar, Haiyue Song, Tiago Timponi Torrent, Debela Desalegn Yadeta, Injy Hamed, Atnafu Lambebo Tonja, Thamar Solorio
  • Training Text-to-Molecule Models with Context-Aware Tokenization
    Seojin Kim, Hyeontae Song, Jaehyun Nam, Jinwoo Shin
  • Challenging the Evaluator: LLM Sycophancy Under User Rebuttal
    Sung Won Kim, Daniel Khashabi
  • Perspective-driven Preference Optimization with Entropy Maximization for Diverse Argument Generation
    Yilin Cao, Ruike Zhang, Penghui Wei, Qingchao Kong, Wenji Mao
  • Spoken Document Retrieval for an Unwritten Language: A Case Study on Gormati
    Sanjay Booshanam, Kelly Chen, Ondrej Klejch, Thomas Reitmaier, Dani Kalarikalayil Raju, Electra Wallington, Nina Markl, Jennifer Pearson, Matt Jones, Simon Robinson, Peter Bell
  • M-Help: Using Social Media Data to Detect Mental Health Help-Seeking Signals
    MSVPJ Sathvik, Zuhair Hasan Shaik, Vivek Gupta
  • Brittle Minds, Fixable Activations: Understanding Belief Representations in Language Models
    Matteo Bortoletto, Constantin Ruhdorfer, Lei Shi, Andreas Bulling
  • Golden Touchstone: A Comprehensive Bilingual Benchmark for Evaluating Financial Large Language Models
    Xiaojun Wu, Junxi Liu, Huan-Yi Su, Zhouchi Lin, Yiyan Qi, Chengjin Xu, Jiajun Su, Jiajie Zhong, Fuwei Wang, Saizhuo Wang, Fengrui Hua, Jia Li, Jian Guo
  • Quantifying the Risks of LLM- and Tool-assisted Rephrasing to Linguistic Diversity
    Mengying Wang, Andreas Spitz
  • NUMINA: A Natural Understanding Benchmark for Multi-dimensional Intelligence and Numerical Reasoning Abilities
    Changyu Zeng, Yifan Wang, Zimu Wang, Wei Wang, Zhengni Yang, Muyi Bao, Jimin XIAO, Anh Nguyen, Yutao Yue
  • MoMentS: A Comprehensive Multimodal Benchmark for Theory of Mind
    Emilio Villa-Cueva, S M Masrur Ahmed, Rendi Chevi, Jan Christian Blaise Cruz, Kareem Elzeky, Fermin Cristobal, Alham Fikri Aji, Skyler Wang, Rada Mihalcea, Thamar Solorio
  • Code Like Humans: A Multi-Agent Solution for Medical Coding
    Andreas Geert Motzfeldt, Joakim Edin, Casper L. Christensen, Christian Hardmeier, Lars Maaløe, Anna Rogers
  • Can Out-of-Distribution Evaluations Uncover Reliance on Prediction Shortcuts? A Case Study in Question Answering
    Michal Štefánik, Timothee Mickus, Michal Spiegel, Marek Kadlčík, Josef Kuchař
  • MEXA: Towards General Multimodal Reasoning with Dynamic Multi-Expert Aggregation
    Shoubin Yu, Yue Zhang, Ziyang Wang, Jaehong Yoon, Mohit Bansal
  • Lifelong Knowledge Editing requires Better Regularization
    Akshat Gupta, Phudish Prateepamornkul, Maochuan Lu, Ahmed Alaa, Thomas Hartvigsen, Gopala Anumanchipalli
  • Lost in Embeddings: Information Loss in Vision–Language Models
    Wenyan Li, Raphael Tang, Chengzu Li, Clemente Pasti, Vésteinn Snæbjarnarson, Caiqi Zhang, Ivan Vulić, Ryan Cotterell, Anders Søgaard
  • Assessing the Role of Data Quality in Training Bilingual Language Models
    Skyler Seto, Maartje Ter Hoeve, Maureen de Seyssel, David Grangier
  • DORM: Preference Data Weights Optimization for Reward Modeling in LLM Alignment
    Rongzhi Zhang, Chenwei Zhang, Xinyang Zhang, Liang Qiu, Haoming Jiang, Yuchen Zhuang, Qingru Zhang, Hyokun Yun, Xian Li, Bing Yin, Tuo Zhao, Chao Zhang
  • Enhancing Domain-Specific Encoder Models with LLM-Generated Data: How to Leverage Ontologies, and How to Do Without Them
    Marc Felix Brinner, Tarek Al Mustafa, Sina Zarrieß
  • Aligning Dialogue Agents with Global Feedback via Large Language Model Multimodal Reward Decomposition
    Dong Won Lee, Hae Won Park, Cynthia Breazeal, Louis-Philippe Morency
  • UrduFactCheck: An Agentic Fact-Checking Framework for Urdu with Evidence Boosting and Benchmarking
    Sarfraz Ahmad, Hasan Iqbal, Momina Ahsan, Numaan Naeem, Muhammad Ahsan Riaz Khan, Arham Riaz, Muhammad Arslan Manzoor, Yuxia Wang, Preslav Nakov
  • Echoes of Agreement: Argument Driven Sycophancy in Large Language Models
    Avneet Kaur
  • Rethinking NLP for Chemistry: A Critical Look at the USPTO Benchmark
    Derin Ozer, Nicolas Gutowski, Benoit Da Mota, Thomas Cauchy, Sylvain Lamprier
  • Investigating Dictionary Expansion for Video-based Sign Language Dictionaries
    Aashaka Desai, Daniela Massiceti, Richard Ladner, Hal Daumé III, Danielle Bragg, Alex Xijie Lu
  • From Insight to Exploit: Leveraging LLM Collaboration for Adaptive Adversarial Text Generation
    Najrin Sultana, Md Rafi Ur Rashid, Kang Gu, Shagufta Mehnaz
  • Beyond Contrastive Learning: Synthetic Data Enables List-wise Training with Multiple Levels of Relevance
    Reza Esfandiarpoor, George Zerveas, Ruochen Zhang, Macton Mgonzo, Carsten Eickhoff, Stephen Bach
  • Instability in Downstream Task Performance During LLM Pretraining
    Yuto Nishida, Masaru Isonuma, Yusuke Oda
  • A Comparison of Independent and Joint Fine-tuning Strategies for Retrieval-Augmented Generation
    Neal Gregory Lawton, Alfy Samuel, Anoop Kumar, Daben Liu
  • mrCAD: Multimodal Communication to Refine Computer-aided Designs
    William P McCarthy, Saujas Vaduguru, Karl D.D. Willis, Justin Matejka, Judith E Fan, Daniel Fried, Yewen Pu
  • MOCHA: Are Code Language Models Robust Against Multi-Turn Malicious Coding Prompts?
    Muntasir Wahed, Xiaona Zhou, Kiet A. Nguyen, Tianjiao Yu, Nirav Diwan, Gang Wang, Dilek Hakkani-Tür, Ismini Lourentzou
  • How Can Input Reformulation Improve Tool Usage Accuracy in a Complex Dynamic Environment? A Study on τ-bench
    Venkatesh Mishra, Amir Saeidi, Satyam Raj, Mutsumi Nakamura, Jayanth Srinivasa, Gaowen Liu, Ali Payani, Chitta Baral
  • Evaluating Fairness in Large Vision-Language Models Across Diverse Demographic Attributes and Prompts
    Xuyang Wu, Yuan Wang, Hsin-Tai Wu, Zhiqiang Tao, Yi Fang
  • VIBE: Can a VLM Read the Room?
    Tania Chakraborty, Eylon Caplan, Dan Goldwasser
  • LoRATK: LoRA Once, Backdoor Everywhere in the Share-and-Play Ecosystem
    Hongyi Liu, Shaochen Zhong, Xintong Sun, Minghao Tian, Mohsen Hariri, Zirui Liu, Ruixiang Tang, Zhimeng Jiang, Jiayi Yuan, Yu-Neng Chuang, Li Li, Soo-Hyun Choi, Rui Chen, Vipin Chaudhary, Xia Hu
  • Pearl: A Multimodal Culturally-Aware Arabic Instruction Dataset
    Fakhraddin Alwajih, Samar M. Magdy, Abdellah EL MEKKI, OMER NACAR, Youssef Nafea, Safaa Taher Abdelfadil, Abdulfattah Mohammed Yahya, Hamzah Luqman, Nada Almarwani, Samah Aloufi, Baraah Qawasmeh, Houdaifa Atou, Serry Sibaee, Hamzah A. Alsayadi, Walid Al-Dhabyani, Maged S. Al-shaibani, Aya El aatar, Nour Qandos, Rahaf Alhamouri, Samar Ahmad, Razan khassib, Lina Hamad, Mohammed Anwar AL-Ghrawi, Fatimah Alshamari, Cheikh Malainine, Doaa Qawasmeh, Aminetou Yacoub, Tfeil moilid, Ruwa AbuHweidi, Ahmed Aboeitta, Vatimetou Mohamed Lemin, Reem Abdel-Salam, Ahlam Bashiti, Adel Ammar, Aisha Alansari, Ahmed Ashraf, Nora Alturayeif, Sara Shatnawi, Alcides Alcoba Inciarte, AbdelRahim A. Elmadany, Mohamedou cheikh tourad, Ismail Berrada, Mustafa Jarrar, Shady Shehata, Muhammad Abdul-Mageed
  • Protein Large Language Models: A Comprehensive Survey
    Yijia Xiao, Wanjia Zhao, Junkai Zhang, Yiqiao Jin, Han Zhang, Zhicheng Ren, Renliang Sun, Haixin Wang, Guancheng Wan, Pan Lu, Xiao Luo, Yu Zhang, James Zou, Yizhou Sun, Wei Wang
  • MAKIEval: A Multilingual Automatic WiKidata-based Framework for Cultural Awareness Evaluation for LLMs
    Raoyuan Zhao, Beiduo Chen, Barbara Plank, Michael A. Hedderich
  • Looking Beyond the Pixels: Evaluating Visual Metaphor Understanding in VLMs
    Manishit Kundu, Sumit Shekhar, Pushpak Bhattacharyya
  • AGENTVIGIL: Automatic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents
    Zhun Wang, Vincent Siu, Zhe Ye, Tianneng Shi, Yuzhou Nie, Xuandong Zhao, Chenguang Wang, Wenbo Guo, Dawn Song
  • Improving LLM-as-a-Judge Inference with the Judgment Distribution
    Victor Wang, Michael JQ Zhang, Eunsol Choi
  • Learning Is Not A Race: Improving Retrieval in Language Models via Equal Learning
    Wanqian Yang, Aahlad Manas Puli, Rajesh Ranganath
  • The Prompt Makes the Person(a): A Systematic Evaluation of Sociodemographic Persona Prompting for Large Language Models
    Marlene Lutz, Indira Sen, Georg Ahnert, Elisa Rogers, Markus Strohmaier
  • Spiral of Silence in Large Language Model Agents
    Mingze Zhong, Meng Fang, Zijing Shi, Yuxuan Huang, Shunfeng Zheng, Yali Du, Ling Chen, Jun Wang
  • Do We Know What LLMs Don’t Know? A Study of Consistency in Knowledge Probing
    Raoyuan Zhao, Abdullatif Köksal, Ali Modarressi, Michael A. Hedderich, Hinrich Schuetze
  • Context Length Alone Hurts LLM Performance Despite Perfect Retrieval
    Yufeng Du, Minyang Tian, Srikanth Ronanki, Subendhu Rongali, Sravan Babu Bodapati, Aram Galstyan, Azton Wells, Roy Schwartz, Eliu A Huerta, Hao Peng
  • DebUnc: Improving Large Language Model Agent Communication With Uncertainty Metrics
    Luke Yoffe, Alfonso Amayuelas, William Yang Wang
  • ProcVQA: Benchmarking the Effects of Structural Properties in Mined Process Visualizations on Vision–Language Model Performance
    Kazi Tasnim Zinat, Saad Mohammad Abrar, Shoumik Saha, Sharmila Duppala, Saimadhav Naga Sakhamuri, Zhicheng Liu
  • Probing Political Ideology in Large Language Models: How Latent Political Representations Generalize Across Tasks
    Tianyi Zhang
  • Understanding GUI Agent Localization Biases through Logit Sharpness
    Xingjian Tao, Yiwei Wang, Zhicheng Yang, Jing Tang
  • The Language of Interoception: Examining Embodiment and Emotion Through a Corpus of Body Part Mentions
    Sophie Wu, Jan Philip Wahle, Saif M. Mohammad
  • HomoGraphAdapter: A Homogeneous Graph Neural Network as an Effective Adapter for Vision-Language Models
    Chuan He, Zhuozhao Li, Song Guo, Xiaocheng Lu, Jinxiang Lai
  • No Black Boxes: Interpretable and Interactable Predictive Healthcare with Knowledge-Enhanced Agentic Causal Discovery
    Xiaoxue Han, Pengfei Hu, Chang Lu, Jun-En Ding, Feng Liu, Yue Ning
  • PROOD: A Simple LLM Out-of-Distribution Guardrail Leveraging Response Semantics
    Joshua Tint
  • ICL-Bandit: Relevance Labeling in Advertisement Recommendation Systems via LLM
    Lu Wang, Chiming Duan, Pu Zhao, Fangkai Yang, Yong Shi, Xuefeng Luo, Bingjing Xu, Weiwei Deng, Qingwei Lin, Dongmei Zhang
  • Intent-aware Schema Generation and Refinement for Literature Review Tables
    Vishakh Padmakumar, Joseph Chee Chang, Kyle Lo, Doug Downey, Aakanksha Naik
  • NLP Needs Diversity outside of ‘Diversity’
    Joshua Tint
  • Anatomy of a Feeling: Narrating Embodied Emotions via Large Vision-Language Models
    Mohammad Saim, Phan Anh Duong, Cat Luong, Aniket Bhanderi, Tianyu Jiang
  • Towards Universal Debiasing for Language Models-based Tabular Data Generation
    Tianchun Li, Tianci Liu, Xingchen Wang, Rongzhe Wei, Pan Li, Lu Su, Jing Gao
  • Beyond Linear Steering: Unified Multi-Attribute Control for Language Models
    Narmeen Fatimah Oozeer, Luke Marks, Fazl Barez, Amir Abdullah
  • Unequal Scientific Recognition in the Age of LLMs
    Yixuan Liu, Abel Elekes, Jianglin Lu, Rodrigo Dorantes-Gilardi, Albert-Laszlo Barabasi
  • Zero-Shot Fine-Grained Image Classification Using Large Vision-Language Models
    Md. Atabuzzaman, Andrew Zhang, Chris Thomas
  • Using tournaments to calculate AUROC for zero-shot classification with LLMs
    WonJin Yoon, Ian Bulovic, Timothy A. Miller
  • Exploration-Driven Reinforcement Learning for Expert Routing Improvement in Mixture-of-Experts Language Models
    Gyunyeop Kim, Sangwoo Kang
  • D2CS - Documents Graph Clustering using LLM supervision
    Yoel Ashkenazi, Etzion Harari, Regev Yehezkel Imra, Naphtali Abudarham, Dekel Cohen, Yoram Louzoun
  • GeoChain: Multimodal Chain-of-Thought for Geographic Reasoning
    Sahiti Yerramilli, Nilay Pande, Rynaa Grover, Jayant Sravan Tamarapalli
  • SteerVLM: Robust Model Control through Lightweight Activation Steering for Vision Language Models
    Anushka Sivakumar, Andrew Zhang, Zaber Ibn Abdul Hakim, Chris Thomas
  • FractalLLM: Lossless Self-Speculative Decoding with Layer Embedded Self-Compression
    Juhyeong Kim, Sangyeon Yu, Gyunyeop Kim, Sangwoo Kang
  • Saten: Sparse Augmented Tensor Networks for Post-Training Compression of Large Language Models
    Ryan Solgi, Kai Zhen, Rupak Vignesh Swaminathan, Nathan Susanj, Athanasios Mouchtaris, Siegfried Kunzmann, Zheng Zhang
  • Third-Person Appraisal Agent: Simulating Human Emotional Reasoning in Text with Large Language Models
    Simin Hong, Jun Sun, Hongyang Chen
  • Source-primed Multi-turn Conversation Helps Large Language Models Translate Documents
    Hanxu Hu, Jannis Vamvas, Rico Sennrich
  • Mitigating Spurious Correlations via Counterfactual Contrastive Learning
    Fengxiang Cheng, Chuan Zhou, Xiang Li, Alina Leidinger, Haoxuan Li, Robert Van Rooij
  • The RAG Paradox: A Black-Box Attack Exploiting Unintentional Vulnerabilities in Retrieval-Augmented Generation Systems
    Chanwoo Choi, Jinsoo Kim, Sukmin Cho, Soyeong Jeong, Buru Chang
  • Guiding Large Language Models for Biomedical Entity Linking via Restrictive and Contrastive Decoding
    Zhenxi Lin, Ziheng Zhang, Jian Wu, Yefeng Zheng, Xian Wu
  • Cut the Deadwood Out: Backdoor Purification via Guided Module Substitution
    Yao Tong, Weijun Li, Xuanli He, Haolan Zhan, Qiongkai Xu
  • RepoDebug: Repository-Level Multi-Task and Multi-Language Debugging Evaluation of Large Language Models
    Jingjing Liu, Zeming Liu, Zihao Cheng, Mengliang He, Xiaoming Shi, Yuhang Guo, Xiangrong Zhu, Yuanfang Guo, Yunhong Wang, Haifeng Wang
  • FaStFact: Faster, Stronger Long-Form Factuality Evaluations in LLMs
    Yingjia Wan, Haochen Tan, Xiao Zhu, Xinyu Zhou, Zhiwei Li, Qingsong Lv, Changxuan Sun, Yi Xu, Jianqiao Lu, Yinhong Liu, Zhijiang Guo
  • PropXplain: Can LLMs Enable Explainable Propaganda Detection?
    Maram Hasanain, Md Arid Hasan, Mohamed Bayan Kmainasi, Elisa Sartori, Ali Ezzat Shahroor, Giovanni Da San Martino, Firoj Alam
  • EoT: Evolution of Thoughts for Complex Reasoning Tasks
    Qin Hua, Jiaqi Sun, Shiyou Qian, Dingyu Yang, Jian Cao, Guangtao Xue
  • Reveal and Release: Iterative LLM Unlearning with Self-generated Data
    Linxi Xie, Xin Teng, Shichang Ke, Hongyi Wen, Shenji Wan
  • An Evaluation Resource for Grounding Translation Errors
    Sujin Chen, Kang Wang, Zixuan Zhou, Xiangyu Duan, Wanqun Zhang, Hao Yang, Jinsong Su, Min Zhang
  • Enhancing Time Awareness in Generative Recommendation
    Sunkyung Lee, Seongmin Park, Jonghyo Kim, Mincheol Yoon, Jongwuk Lee
  • Adaptive LLM Routing under Budget Constraints
    Pranoy Panda, Raghav Magazine, Chaitanya Devaguptapu, Sho Takemori, Vishal Sharma
  • Promptception: How Sensitive Are Large Multimodal Models to Prompts?
    Mohamed Insaf Ismithdeen, Muhammad Uzair Khattak, Salman Khan
  • Can Federated Learning Safeguard Private Data in LLM Training? Vulnerabilities, Attacks, and Defense Evaluation
    Wenkai Guo, Xuefeng Liu, Haolin Wang, Jianwei Niu, Shaojie Tang, Jing Yuan
  • Runaway is Ashamed, But Helpful: On the Early-Exit Behavior of Large Language Model-based Agents in Embodied Environments
    Qingyu Lu, Liang Ding, Siyi Cao, Xuebo Liu, Kanjian Zhang, Jinxia Zhang, Dacheng Tao
  • AutoMIR: Effective Zero-Shot Medical Information Retrieval without Relevance Labels
    Lei Li, Xiangxu Zhang, Xiao Zhou, Zheng Liu
  • RG-VQA: Leveraging Retriever-Generator Pipelines for Knowledge Intensive Visual Question Answering
    Settaluri Lakshmi Sravanthi, Pulkit Agarwal, Debjyoti Mondal, Rituraj Singh, Subhadarshi Panda, Ankit Mishra, Kiran Pradeep, Srihari K B, Godawari Sudhakar Rao, Pushpak Bhattacharyya
  • Enhancing RAG Efficiency with Adaptive Context Compression
    Shuyu Guo, Shuo Zhang, Zhaochun Ren
  • Revealing the impact of synthetic native samples and multi-tasking strategies in Hindi-English code-mixed humour and sarcasm detection
    Debajyoti Mazumder, Aakash Kumar, Jasabanta Patro
  • CogAtom: From Cognitive Atoms to Olympiad-level Mathematical Reasoning in Large Language Models
    Zhuofan Chen, Jiyuan He, Yichi Zhang, Xing Hu, Haoxing Wen, Jun Bai, Wenge Rong
  • Efficient Latent Semantic Clustering for Scaling Test-Time Computation of LLMs
    Sungjae Lee, Hoyoung Kim, Jeongyeon Hwang, Eunhyeok Park, Jungseul Ok
  • BannerBench: Benchmarking Vision Language Models for Multi-Ad Selection with Human Preferences
    Hiroto Otake, Peinan Zhang, Yusuke Sakai, Masato Mita, Hiroki Ouchi, Taro Watanabe
  • DeKeyNLU: Enhancing Natural Language to SQL Generation through Task Decomposition and Keyword Extraction
    Jian Chen, Zhenyan Chen, Xuming Hu, Peilin Zhou, Yining Hua, Han Fang, Cissy Hing Yee Choy, Xinmei Ke, Jingfeng Luo, Zixuan Yuan
  • Facilitating Cross-lingual Transfer of Empathy through Language-independent Latent Diffusion: A Case Study in Chinese
    Junlin Li, PENG Bo, Yu-Yin Hsu
  • Evaluating Compound AI Systems through Behaviors, Not Benchmarks
    Pranav Bhagat, K N Ajay Shastry, Pranoy Panda, Chaitanya Devaguptapu
  • SciCompanion: Graph-Grounded Reasoning for Structured Evaluation of Scientific Arguments
    Joshua Alan Flashner, Adithya Kulkarni, Dawei Zhou
  • From Generation to Detection: A Multimodal Multi-Task Dataset for Benchmarking Health Misinformation
    Zhihao Zhang, Yiran Zhang, Xiyue Zhou, Liting Huang, Imran Razzak, Preslav Nakov, Usman Naseem
  • Estimating Machine Translation Difficulty
    Lorenzo Proietti, Stefano Perrella, Vilém Zouhar, Roberto Navigli, Tom Kocmi
  • TIU-Bench: A Benchmark for Evaluating Large Multimodal Models on Text-rich Image Understanding
    Kun Zhang, Liqiang Niu, Zhen Cao, Fandong Meng, Jie Zhou
  • Breaking Token Into Concepts: Exploring Extreme Compression in Token Representation Via Compositional Shared Semantics
    Kavin R V, Pawan Goyal
  • ExeSQL: Self-Taught Text-to-SQL Models with Execution-Driven Bootstrapping for SQL Dialects
    Jipeng Zhang, Haolin Yang, Kehao Miao, Ruiyuan Zhang, Renjie Pi, Jiahui Gao, Xiaofang Zhou
  • Under the Shadow of Babel: How Language Shapes Reasoning in LLMs
    Chenxi Wang, Yixuan Zhang, Lang Gao, Zixiang Xu, Zirui Song, Yanbo Wang, Xiuying Chen
  • Think Right, Not More: Test-Time Scaling for Numerical Claim Verification
    Primakov Chungkham, Venktesh V, Vinay Setty, Avishek Anand
  • Nexus: Adaptive Upcycling to Efficiently Pretrain Mixture of Experts
    Nikolas Gritsch, Qizhen Zhang, Acyr Locatelli, Sara Hooker, Ahmet Üstün
  • Exploring Context Strategies in LLMs for Discourse-Aware Machine Translation
    Ritvik Choudhary, Rem Hida, Masaki Hamada, Hayato Futami, Toshiyuki Sekiya
  • Insights into using temporal coordinated behaviour to explore connections between social media posts and influence
    Elisa Sartori, Serena Tardelli, Maurizio Tesconi, Mauro Conti, Alessandro Galeazzi, Stefano Cresci, Giovanni Da San Martino
  • SpecCoT: Accelerating Chain-of-Thought Reasoning through Speculative Exploration
    Junhan Shi, Yijia Zhu, Zhenning Shi, Dan Zhao, Qing Li, Yong Jiang
  • A Similarity Measure for Comparing Conversational Dynamics
    Sang Min Jung, Kaixiang Zhang, Cristian Danescu-Niculescu-Mizil
  • AgentDrug: Utilizing Large Language Models in an Agentic Workflow for Zero-Shot Molecular Optimization
    Le Huy Khiem, Ting Hua, Nitesh V Chawla
  • Improving Preference Alignment of LLM with Inference-Free Self-Refinement
    Fukun Ma, Kaibin Tian, Jieting Xue, Xiaoyi Wang, Ye Ma, Quan Chen, Peng Jiang, Lijie Wen
  • Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing Guarantees
    Ahmed Heakl, Sarim Hashmi, Chaimaa Abi, Celine Lee, Abdulrahman Mahmoud
  • StructuThink: Reasoning with Task Transition Knowledge for Autonomous LLM-Based Agents
    Haiyu Zhao, Zhenyu Guo, Chunhong Zhang, Ziyu Zhou, Zheng Hu
  • Leveraging Unpaired Feedback for Long-Term LLM-based Recommendation Tuning
    Jizhi Zhang, Chongming Gao, Wentao Shi, Xin Chen, Jingang Wang, Xunliang Cai, Fuli Feng
  • Investigating Multi-layer Representations for Dense Passage Retrieval
    Zhongbin Xie, Thomas Lukasiewicz
  • KELE: Residual Knowledge Erasure for Enhanced Multi-hop Reasoning in Knowledge Editing
    Mengqi Zhang, Bowen Fang, Qiang Liu, Xiaotian Ye, Shu Wu, Pengjie Ren, Zhumin Chen, Liang Wang
  • Dissecting Persona-Driven Reasoning in Language Models via Activation Patching
    Ansh Poonia, Maeghal Jain
  • PUER: Boosting Few-shot Positive-Unlabeled Entity Resolution with Reinforcement Learning
    Yaoshu Wang, Mengyi Yan, Wei Wang
  • Toward the Automatic Detection of Word Meaning Negotiation Indicators in Conversation
    Aina Garí Soler, Matthieu Labeau, Chloé Clavel
  • Forget the Unneeded: Backdooring Large Language Models via Contrastive-enhanced Machine Unlearning
    Shiji Yang, Shu Zhao, Congyao Mei, Zhen Yang, Jie Chen, Fulan Qian, Zhen Duan, Yanping Zhang
  • Equipping Retrieval-Augmented Large Language Models with Document Structure Awareness
    Lingnan Xu, Chong Feng, Kaiyuan Zhang, Liu Zhengyong, Wenqiang Xu, Fanqing Meng
  • QEVA: A Reference-Free Evaluation Metric for Narrative Video Summarization with Multimodal Question Answering
    Woojun Jung, Junyeong Kim
  • Thinking Before You Speak: A Proactive Test-time Scaling Approach
    Cong Liu, Wenchang Chai, Hejun Wu, Yan Pan, Pengxu Wei, Liang Lin
  • Do Before You Judge: Self-Reference as a Pathway to Better LLM Evaluation
    Wei-Hsiang Lin, Sheng-Lun Wei, Hen-Hsen Huang, Hsin-Hsi Chen
  • Beyond Content: How Grammatical Gender Shapes Visual Representation in Text-to-Image Models
    Muhammed Saeed, Shaina Raza, Ashmal Vayani, Muhammad Abdul-Mageed, Ali Emami, Shady Shehata
  • ToolHaystack: Stress-Testing Tool-Augmented Language Models in Realistic Long-Term Interactions
    Beong-woo Kwak, Minju Kim, Dongha Lim, Hyungjoo Chae, Dongjin Kang, Sunghwan Kim, Dongil Yang, Jinyoung Yeo
  • GraphCheck: Multipath Fact-Checking with Entity-Relationship Graphs
    Hyewon Jeon, Jay-Yoon Lee
  • FLAMES: Improving LLM Math Reasoning via a Fine-Grained Analysis of the Data Synthesis Pipeline
    Parker Seegmiller, Kartik Mehta, Soumya Saha, Chenyang Tao, Shereen Oraby, Arpit Gupta, Tagyoung Chung, Mohit Bansal, Nanyun Peng
  • POW: Political Overton Windows of Large Language Models
    Leif Azzopardi
  • Columbo: Expanding Abbreviated Column Names for Tabular Data Using Large Language Models
    Ting Cai, Stephen Sheen, AnHai Doan
  • RTTC: Reward-Guided Collaborative Test-Time Compute
    Juan Pablo Munoz, Jinjie Yuan
  • AMANDA: Agentic Medical Knowledge Augmentation for Data-Efficient Medical Visual Question Answering
    Ziqing Wang, Chengsheng Mao, Yuan Luo, Kaize Ding
  • Mixed Signals: Decoding VLMs’ Reasoning and Underlying Bias in Vision-Language Conflict
    Pouya Pezeshkpour, Moin Aminnaseri, Estevam Hruschka
  • Mitigating Hallucination in Large Vision-Language Models through Aligning Attention Distribution to Information Flow
    Jianfei Zhao, Feng Zhang, Xin Sun, Chong Feng
  • OptiSeq: Ordering Examples On-The-Fly for In-Context Learning
    Rahul Atul Bhope, Praveen Venkateswaran, K. R. Jayaram, Vatche Isahagian, Vinod Muthusamy, Nalini Venkatasubramanian
  • Dependency Parsing-Based Syntactic Enhancement of Relation Extraction in Scientific Texts
    Devvrat Joshi, Islem Rekik
  • DIPLomA: Efficient Adaptation of Instructed LLMs to Low-Resource Languages via Post-Training Delta Merging
    Ixak Sarasua Antero, Ander Corral, Xabier Saralegi
  • Reliability Crisis of Reference-free Metrics for Grammatical Error Correction
    Takumi Goto, Yusuke Sakai, Taro Watanabe
  • Who Speaks Matters: Analysing the Influence of the Speaker’s Linguistic Identity on Hate Classification
    Ananya Malik, Kartik Sharma, Lynnette Hui Xian Ng, Shaily Bhatt
  • Are LLMs Empathetic to All? Investigating the Influence of Multi-Demographic Personas on a Model’s Empathy
    Ananya Malik, Nazanin Sabri, Melissa M. Karnaze, Mai ElSherief
  • Active Learning for Multidialectal Arabic POS Tagging
    Diyam Akra, Mohammed Khalilia, Mustafa Jarrar
  • Embedding-Free RAG
    Jessica Maghakian, Raunak Sinha, Max Schettewi, Gunkirat Kaur
  • Rating Roulette: Self-Inconsistency in LLM-As-A-Judge Frameworks
    Rajarshi Haldar, Julia Hockenmaier
  • Quantifying Uncertainty in Natural Language Explanations of Large Language Models for Question Answering
    Yangyi Li, Mengdi Huai
  • Real-World Summarization: When Evaluation Reaches Its Limits
    Patrícia Schmidtová, Ondrej Dusek, Saad Mahamood
  • Open-DeBias: Toward Mitigating Open-Set Bias in Language Models
    Nihar Ranjan Sahoo, Arti Rani, Shweta Singh, Gaurav Kumar Nayak
  • SACL: Understanding and Combating Textual Bias in Code Retrieval with Semantic-Augmented Reranking and Localization
    Dhruv Gupta, Yiqing Xie, Gayathri Ganesh Lakshmy
  • Jailbreak Distillation: Renewable Safety Benchmarking
    Jingyu Zhang, Ahmed Elgohary, Xiawei Wang, A S M Iftekhar, Ahmed Magooda, Benjamin Van Durme, Daniel Khashabi, Kyle Jackson
  • Uncertainty-Aware Answer Selection for Improved Reasoning in Multi-LLM Systems
    Aakriti Agrawal, Rohith Aralikatti, Anirudh Satheesh, Souradip Chakraborty, Amrit Singh Bedi, Furong Huang
  • GreekBarBench: A Challenging Benchmark for Free-Text Legal Reasoning and Citations
    Odysseas S. Chlapanis, Dimitris Galanis, Nikolaos Aletras, Ion Androutsopoulos
  • Pi-SQL: Enhancing Text-to-SQL with Fine-Grained Guidance from Pivot Programming Languages
    Yongdong Chi, Hanqing Wang, Yun Chen, Yan Yang, Jian Yang, Zonghan Yang, Xiao Yan, Guanhua Chen
  • RAC: Efficient LLM Factuality Correction with Retrieval Augmentation
    Changmao Li, Jeffrey Flanigan
  • Does It Run and Is That Enough? Revisiting Text-to-Chart Generation with a Multi-Agent Approach
    James Ford, Anthony Rios
  • WiNELL: Wikipedia Never-Ending Updating with LLM Agents
    Revanth Gangi Reddy, Tanay Dixit, Jiaxin Qin, Cheng Qian, Daniel Lee, Jiawei Han, Kevin Small, Xing Fan, Ruhi Sarikaya, Heng Ji
  • GeLoRA: Geometric Adaptive Ranks For Efficient LoRA Fine-tuning
    Abdessalam Ed-dib, Zhanibek Datbayev, Amine M. Aboussalah
  • Uncovering Scaling Laws for Large Language Models via Inverse Problems
    Arun Verma, Zhaoxuan Wu, Zijian Zhou, Xiaoqiang Lin, Zhiliang Chen, Rachael Hwee Ling Sim, Rui Qiao, Jingtan Wang, Nhung Bui, Xinyuan Niu, Wenyang Hu, Gregory Kang Ruey Lau, Zi-Yu Khoo, Zitong Zhao, Xinyi Xu, Apivich Hemachandra, See-Kiong Ng, Bryan Kian Hsiang Low
  • UIPE: Enhancing LLM Unlearning by Removing Knowledge Related to Forgetting Targets
    Wenyu Wang, Mengqi Zhang, Xiaotian Ye, Zhaochun Ren, Pengjie Ren, Zhumin Chen
  • FicSim: A Dataset for Multi-Faceted Semantic Similarity in Long-Form Fiction
    Natasha Johnson, Amanda Bertsch, Maria-Emil Deal, Emma Strubell
  • Masked Diffusion Captioning for Visual Feature Learning
    Chao Feng, Zihao Wei, Andrew Owens
  • Diverse Multi-tool Aggregation with Large Language Models for Enhanced Math Reasoning
    Bohan Yao, Vikas Yadav
  • Enhancing Goal-oriented Proactive Dialogue Systems via Dynamic Multi-dimensional Consistency Optimization
    Didi Zhang, Yaxin Fan, Peifeng Li, Qiaoming Zhu
  • Injecting Domain-Specific Knowledge into Large Language Models: A Comprehensive Survey
    Zirui Song, Bin Yan, Yuhan Liu, Miao Fang, Mingzhe Li, Rui Yan, Xiuying Chen
  • Who’s the Author? How Explanations Impact User Reliance in AI-Assisted Authorship Attribution
    Calvin Bao, Connor Baumler, Hal Daumé III, Marine Carpuat
  • UniSpeaker: A Unified Approach for Multimodality-driven Speaker Generation
    Zhengyan Sheng, Zhihao Du, Heng Lu, ShiLiang Zhang, Zhen-Hua Ling
  • On the Fine-Grained Planning Abilities of VLM Web Agents
    Surgan Jandial, Yinong Oliver Wang, Andrea Bajcsy, Fernando De la Torre
  • InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models with Human Feedback
    Henry Hengyuan Zhao, Wenqi Pei, Yifei Tao, Haiyang Mei, Mike Zheng Shou
  • ReFLAIR: Enhancing Multimodal Reasoning via Structured Reflection and Reward-Guided Learning
    Jiazhou Ji
  • Exploring Cross-Client Memorization of Training Data in Language Models for Federated Learning
    Tinnakit Udsa, Can Udomcharoenchaikit, Patomporn Payoungkhamdee, Sarana Nutanong, Norrathep Rattanavipanon
  • ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations
    Bowen Jiang, Yuan Yuan, Xinyi Bai, Zhuoqun Hao, Alyson Yin, Yaojie Hu, Wenyu Liao, Lyle Ungar, Camillo Jose Taylor
  • STA-CoT: Structured Target-Centric Agentic Chain-of-Thought for Consistent Multi-Image Geological Reasoning
    Beibei Yu, Tao Shen, Ling Chen
  • Can Language Models Follow Multiple Turns of Entangled Instructions?
    Chi Han, Xin Liu, Haodong Wang, Shiyang Li, Jingfeng Yang, Haoming Jiang, Zhengyang Wang, Qingyu Yin, Liang Qiu, Changlong Yu, Yifan Gao, Zheng Li, Bing Yin, Jingbo Shang, Heng Ji
  • How to Generalize the Detection of AI-Generated Text: Confounding Neurons
    Claudio Borile, Carlo Abrate
  • SparsePO: Controlling Preference Alignment of LLMs via Sparse Token Masks
    Fenia Christopoulou, Ronald Cardenas, Gerasimos Lampouras, Haitham Bou Ammar, Jun Wang
  • We Argue to Agree: Towards Personality-Driven Argumentation-Based Negotiation Dialogue Systems for Tourism
    Priyanshu Priya, Saurav Dudhate, Desai Vishesh Yasheshbhai, Asif Ekbal
  • Towards the Roots of the Negation Problem: A Multilingual NLI Dataset and Model Scaling Analysis
    Tereza Vrabcová, Marek Kadlčík, Petr Sojka, Michal Štefánik, Michal Spiegel
  • Improving the Language Understanding Capabilities of Large Language Models Using Reinforcement Learning
    Sai Ashish Somayajula, Bokai Hu, Xin Pan, Pengtao Xie
  • HATECAT-TR: A Hate Speech Span Detection and Categorization Dataset for Turkish
    Hasan Kerem Şeker, Gökçe Uludoğan, Pelin Önal, Arzucan Özgür
  • DM-Codec: Distilling Multimodal Representations for Speech Tokenization
    Md Mubtasim Ahasan, Md Fahim, Tasnim Mohiuddin, AKM Mahbubur Rahman, Aman Chadha, Tariq Iqbal, M Ashraful Amin, Md Mofijul Islam, Amin Ahsan Ali
  • LCAN: A Label-Aware Contrastive Attention Network for Multi-Intent Recognition and Slot Filling in Task-Oriented Dialogue Systems
    Shuli Zhang, Zhiqiang You, XiaoXiangQi, Peng Liu, Gaode Wu, Kan Xia, Shenguang Huang
  • Low-Resource Languages LLM Disinformation is Within Reach: The Case of Walliserdeutsch
    Andrei Kucharavy, Sherine Seppey, Cyril Vallez, Dimitri Percia David, Ljiljana Dolamic
  • Exploring and Controlling Diversity in LLM-Agent Conversation
    KuanChao Chu, Yi-Pei Chen, Hideki Nakayama
  • Agentic-ToM: Cognition-Inspired Agentic Processing For Enhancing Theory of Mind Reasoning
    Sneheel Sarangi, Chetan Talele, Hanan Salam
  • Can We Edit LLMs for Long-Tail Biomedical Knowledge?
    Xinhao Yi, Jake Lever, Kevin Bryson, Zaiqiao Meng
  • GeoPQA: Bridging the Visual Perception Gap in MLLMs for Geometric Reasoning
    Guizhen Chen, Weiwen Xu, Hao Zhang, Hou Pong Chan, Deli Zhao, Anh Tuan Luu, Yu Rong
  • CM-Align: Consistency-based Multilingual Alignment for Large Language Models
    Xue Zhang, Yunlong Liang, Fandong Meng, Songming Zhang, Yufeng Chen, Jinan Xu, Jie Zhou
  • Cache Saver: A Modular Framework for Efficient, Affordable, and Reproducible LLM Inference
    Nearchos Potamitis, Lars Henning Klein, Chongyang Xu, Attreyee Mukherjee, Bardia Mohammadi, Niket Tandon, Laurent Bindschaedler, Akhil Arora
  • Evaluating Cultural Knowledge and Reasoning in LLMs Through Persian Allusions
    Melika Nobakhtian, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar
  • Evolving Stances on Reproducibility: A Longitudinal Study of NLP and ML Researchers’ Views and Experience of Reproducibility
    Craig Thomson, Ehud Reiter, João Sedoc, Anya Belz
  • KAHAN: Knowledge-Augmented Hierarchical Analysis and Narration for Financial Data Narration
    Yajing Yang, Tony Deng, Min-Yen Kan