- Automating Alternative Generation in Decision-Making
Yevhen Kostiuk, Clara Seyfried, Chris Reed
- Bias Analysis and Mitigation through Protected Attribute Detection and Regard Classification
Takuma Udagawa, Yang Zhao, Hiroshi Kanayama, Bishwaranjan Bhattacharjee
- Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions
Chenming Tang, Zhixiang Wang, Hao Sun, Yunfang Wu
- Boundary Matters: Leveraging Structured Text Plots for Long Text Outline Generation
Yuanchi Ma, Jiamou Liu, Hui He, Libo Zhang, Haoyuan Li, Zhendong Niu
- Can Large Language Models Personalize Dialogues to Generational Styles?
Pier Felice Balestrucci, Ondrej Dusek, Luca Anselma, Alessandro Mazzei
- Toward Optimal LLM Alignments Using Two-Player Games
Rui Zheng, Hongyi Guo, Zhihan Liu, Xiaoying Zhang, Yuanshun Yao, Xiaojun Xu, Zhaoran Wang, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang, Yang Liu, Hang Li
- Structural Patent Classification Using Label Hierarchy Optimization
Mengting Gui, Shufeng Hao, Chongyang Shi, Qi Zhang
- Exploring Hyperbolic Hierarchical Structure for Multimodal Rumor Detection
Md Mahbubur Rahman, Shufeng Hao, Chongyang Shi, An Lao, Jinyan Liu
- Multi-Surrogate-Objective Optimization for Neural Topic Models
Tue Le, Hoang Tran Vuong, Tung Nguyen, Linh Ngo Van, Sang Dinh, Trung Le, Thien Huu Nguyen
- How Diversely Can Language Models Solve Problems? Exploring the Algorithmic Diversity of Model-Generated Code
Seonghyeon Lee, HeeJae Chon, Joonwon Jang, Dongha Lee, Hwanjo Yu
- ReAL: How Can LLMs Simulate the Real Teacher? Retrieval-enhanced Agent for Adaptive Learning
Rui Lv, Qi Liu, Weibo Gao, Jiatong Li, Kai Zhang, Shiwei Tong
- LLMsPark: A Benchmark for Evaluating Large Language Models in Strategic Gaming Contexts
Junhao Chen, Jingbo Sun, Xiang Li, Haidong Xin, Yuhao Xue, Yibin Xu, Hao Zhao
- Versatile Framework for Song Generation with Prompt-based Control
Yu Zhang, Wenxiang Guo, Changhao Pan, Zhiyuan Zhu, Ruiqi Li, Jingyu Lu, Rongjie Huang, Ruiyuan Zhang, Zhiqing Hong, Ziyue Jiang, Zhou Zhao
- InsBank: Evolving Instruction Subset for Ongoing Alignment
Jiayi Shi, Yiwei Li, Shaoxiong Feng, Peiwen Yuan, Xinglin Wang, Yueqi Zhang, Chuyi Tan, Boyuan Pan, Huan Ren, Yao Hu, Kan Li
- TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use
Junjie Ye, Yilong Wu, Sixian Li, Yuming Yang, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang, Peng Wang, Zhongchao Shi, Jianping Fan, Zhengyin Du
- DCMKC: A Dual Consistency Matching Approach for Multi-hop Question Answering in LLMs
Xinyi Wang, Yiping Song, Chang Liu, Tingjin Luo, Bo Liu, Zheng Xie, Minlie Huang
- On Domain-Adaptive Post-Training for Multimodal Large Language Models
Daixuan Cheng, Shaohan Huang, Ziyu Zhu, Xintong Zhang, Xin Zhao, Zhongzhi Luan, Bo Dai, Zhenliang Zhang
- CPO: Addressing Reward Ambiguity in Role-playing Dialogue via Comparative Policy Optimization
Jing Ye, Rui Wang, Yuchuan Wu, Victor Ma, Feiteng Fang, Fei Huang, Yongbin Li
- SPPD: Self-training with Process Preference Learning Using Dynamic Value Margin
Hao Yi, Qingyang Li, Yulan Hu, Fuzheng Zhang, Di Zhang, Yong Liu
- Error Classification of Large Language Models on Math Word Problems: A Dynamically Adaptive Framework
Zhangyue Yin, YuHong Sun, Xuanjing Huang, Xipeng Qiu, Hui Zhao
- sudoLLM: On Multi-role Alignment of Language Models
Soumadeep Saha, Akshay Chaturvedi, Joy Mahapatra, Utpal Garain
- DAC: Decomposed Automation Correction for Text-to-SQL
Dingzirui Wang, Longxu Dou, Xuanliang Zhang, Qingfu Zhu
- VehicleWorld: A Highly Integrated Multi-Device Environment for Intelligent Vehicle Interaction
Jie Yang, Jiajun Chen, Zhangyue Yin, Shuo Chen, Yuxin Wang, Yiran Guo, Yuan Li, Yining Zheng, Xuanjing Huang, Xipeng Qiu
- End-to-End Optimization for Multimodal Retrieval-Augmented Generation via Reward Backpropagation
Zhiyuan Fan, Longfei Yun, Ming Yan, Yumeng Wang, Dadi Guo, Brian Mak, James Kwok, Yi R. Fung
- Audio-Aware Large Language Models as Judges for Speaking Styles
Cheng-Han Chiang, Xiaofei Wang, Chung-Ching Lin, Kevin Lin, Linjie Li, Radu Kopetz, Yao Qian, Zhendong Wang, Zhengyuan Yang, Hung-yi Lee, Lijuan Wang
- Evaluation of Text-to-Image Generation from a Creativity Perspective
Xinhao Wang, Xinyu Ma, ShengYong Ding, Derek F. Wong
- Perovskite-LLM: Knowledge-Enhanced Large Language Models for Perovskite Solar Cell Research
Xiang Liu, Penglei Sun, Shuyan Chen, Longhan Zhang, Peijie Dong, Huajie You, Yongqi Zhang, Chang Yan, Xiaowen Chu, Tong-yi Zhang
- ProPy: Building Interactive Prompt Pyramids upon CLIP for Partially Relevant Video Retrieval
Yi Pan, Yujia Zhang, Michael Kampffmeyer, Xiaoguang Zhao
- Multilingual Datasets for Custom Input Extraction and Explanation Requests Parsing in Conversational XAI Systems
Qianli Wang, Tatiana Anikina, Nils Feldhus, Simon Ostermann, Fedor Splitt, Jiaao Li, Yoana Tsoneva, Sebastian Möller, Vera Schmitt
- Toolscaler: Scalable Generative Tool Calling via Structure-Aware Semantic Tokenization
Yunyue Su, Zhang Jinshuai, Bowen Fang, Wen Ye, Jinghao Zhang, Qiang Liu, Bowen Song, Weiqiang Wang, Liang Wang
- LaMP-Val: Large Language Models Empower Personalized Valuation in Auction
Jie Sun, Tianyu Zhang, Houcheng Jiang, Junkang Wu, Xiang Shu, Jiancan Wu, An Zhang, Chi Luo, Zhibo Zhu, Xingyu Lu, Lintao Ma, Xiang Wang
- Exploring Model Kinship for Merging Large Language Models
Yedi Hu, Yunzhi Yao, Ningyu Zhang, Huajun Chen, Shumin Deng
- MULTITAT: Benchmarking Multilingual Table-and-Text Question Answering
Xuanliang Zhang, Dingzirui Wang, Keyan Xu, Qingfu Zhu, Wanxiang Che
- LoRA-MGPO: Mitigating Double Descent in Low-Rank Adaptation via Momentum-Guided Perturbation Optimization
Yupeng Chang, Chenlu Guo, Yi Chang, Yuan Wu
- R-LoRA: Randomized Multi-Head LoRA for Efficient Multi-task Learning
Jinda Liu, Yi Chang, Yuan Wu
- RACQC: Advanced Retrieval-Augmented Generation for Chinese Query Correction
Jinbo Su, Lingzhe Gao, Wei Li, Shihao Liu, Haojie Lei, Xinyi Wang, Yuanzhao Guo, Ke Wang, Daiting Shi, Dawei Yin
- Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models
Ercong Nie, Helmut Schmid, Hinrich Schuetze
- Assessing and Mitigating Medical Knowledge Drift and Conflicts in Large Language Models
Weiyi Wu, Xinwen Xu, Chongyang Gao, Xingjian Diao, Lucas A. Salas, Jiang Gui
- Improving LLM Reasoning through Interpretable Role-Playing Steering
Anyi Wang, Dong Shu, Yifan Wang, Yunpu Ma, Mengnan Du
- R2A-TLS: Reflective Retrieval-Augmented Timeline Summarization with Causal-Semantic Integration
Chenlong Bao, Shijie Li, Minghao Hu, Ming Qiao, Bin Zhang, Jin-Tao Tang, Shasha Li, Ting Wang
- MedEBench: Diagnosing Reliability in Text-Guided Medical Image Editing
Minghao Liu, Zhitao He, Zhiyuan Fan, Qingyun Wang, Yi R. Fung
- FairCoT: Enhancing Fairness in Text-to-Image Generation via Chain of Thought Reasoning with Multimodal Large Language Models
Zahraa Al Sahili, Ioannis Patras, Matthew Purver
- Bag of Tricks for Sparse Mixture-of-Experts: A Benchmark Across Reasoning, Efficiency, and Safety
Mufan Qiu, Zheyu Shen, Pingzhi Li, Ang Li, Tianlong Chen
- Don’t Take the Premise for Granted: Evaluating the Premise Critique Ability of Large Language Models
Jinzhe Li, Gengxu Li, Yi Chang, Yuan Wu
- Mitigating Geospatial Knowledge Hallucination in Large Language Models: Benchmarking and Dynamic Factuality Aligning
Shengyuan Wang, Jie Feng, Tianhui Liu, Dan Pei, Yong Li
- The Power of Framing: How News Headlines Guide Search Behavior
Amrit Poudel, Maria Milkowski, Tim Weninger
- DivLogicEval: A Framework for Benchmarking Logical Reasoning Evaluation in Large Language Models
Tsz Ting Chung, Lemao Liu, Mo Yu, Dit-Yan Yeung
- THCM-CAL: Temporal-Hierarchical Causal Modelling with Conformal Calibration for Clinical Risk Prediction
Xin Zhang, Qiyu Wei, Yingjie Zhu, Fanyi Wu, Sophia Ananiadou
- GenPilot: A Multi-Agent System for Test-Time Prompt Optimization in Image Generation
Wen Ye, Zhaocheng Liu, Gui Yuwei, Tingyu Yuan, Yunyue Su, Bowen Fang, Chaoyang Zhao, Qiang Liu, Liang Wang
- Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
Haibo Wang, Zhiyang Xu, Yu Cheng, Shizhe Diao, Yufan Zhou, Yixin Cao, Qifan Wang, Weifeng Ge, Lifu Huang
- DongbaMIE: A Multimodal Information Extraction Dataset for Evaluating Semantic Understanding of Dongba Pictograms
Xiaojun Bi, Shuo Li, Junyao Xing, Ziyue Wang, Fuwen Luo, Weizheng Qiao, Lu Han, Ziwei Sun, Peng Li, Yang Liu
- Optimizing Cross-Client Domain Coverage for Federated Instruction Tuning of Large Language Models
Zezhou Wang, Yaxin Du, Xingjun Ma, Yu-Gang Jiang, Zhuzhong Qian, Siheng Chen
- Aligning Black-Box LLMs for Aspect Sentiment Quad Prediction
Shichen Li, Jiawei Zhang, Zhongqing Wang, Peifeng Li
- Multifaceted Evaluation of Audio-Visual Capability for MLLMs: Effectiveness, Efficiency, Generalizability and Robustness
Yusheng Zhao, Xiao Luo, Junyu Luo, Weizhi Zhang, Zhiping Xiao, Wei Ju, Philip S. Yu, Ming Zhang
- Two Steps from Hell: Compositionality on Chemical LMs
Veronika Ganeeva, Kuzma Khrabrov, Artur Kadurin, Elena Tutubalina
- GTA: Supervised-Guided Reinforcement Learning for Text Classification with Large Language Models
Min Zeng, Jingfei Sun, Xueyou Luo, Shiqi Zhang, Li Xie, Caiquan Liu, Xiaoxin Chen
- Unearthing Gems from Stones: Policy Optimization with Negative Sample Augmentation for LLM Reasoning
Zhaohui Yang, Yuxiao Ye, Shilei Jiang, Shihong Deng, Chen Hu, Linjing Li, Daxin Jiang
- LEAF: Large Language Diffusion Model for Time Series Forecasting
Yuhang Pei, Yifan Wang, Tao Ren, Zhipeng Sun, Wei Ju, Chong Chen, Xian-Sheng Hua, Xiao Luo
- SPFT-SQL: Enhancing Large Language Model for Text-to-SQL Parsing by Self-Play Fine-Tuning
Yuhao Zhang, Shaoming Duan, Jinhang Su, Chuanyi Liu, Peiyi Han
- Multilingual Verbalisation of Knowledge Graphs
Yifei Song, William Soto Martinez, Anna Nikiforovskaya, Evan Parker Kelly Chapple, Claire Gardent
- LAGCL4Rec: When LLMs Activate Interactions Potential in Graph Contrastive Learning for Recommendation
Leqi Zheng, Chaokun Wang, Canzhi Chen, Jiajun Zhang, Cheng Wu, Zixin Song, Shannan Yan, Ziyang Liu, Hongwei Li
- English as Defense Proxy: Mitigating Multilingual Jailbreak via Eliciting English Safety Knowledge
Zekai Zhang, Yiduo Guo, Jiuheng Lin, Shanghaoran Quan, Huishuai Zhang, Dongyan Zhao
- Dagger Behind Smile: Fool LLMs with a Happy Ending Story
Xurui Song, Zhixin Xie, Shuo Huai, Jiayi Kong, Jun Luo
- Mitigating Object Hallucinations in MLLMs via Multi-Frequency Perturbations
Shuo Li, Jiajun Sun, Guodong Zheng, Xiaoran Fan, Yujiong Shen, Yi Lu, Zhiheng Xi, Yuming Yang, Wenming Tan, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang
- Natural Context Drift Undermines the Natural Language Understanding of Large Language Models
Yulong Wu, Viktor Schlegel, Riza Batista-Navarro
- Minimal Ranks, Maximum Confidence: Parameter-efficient Uncertainty Quantification for LoRA
Patryk Marszałek, Klaudia Bałazy, Jacek Tabor, Tomasz Kuśmierczyk
- Chain-of-Thought Prompting Obscures Hallucination Cues in Large Language Models: An Empirical Evaluation
Jiahao Cheng, Tiancheng Su, Jia Yuan, Guoxiu He, Jiawei Liu, Xinqi Tao, Jingwen Xie, Huaxia Li
- Large Language Model Evaluation via Matrix Nuclear-Norm
Yahan Li, Tingyu Xia, Yuan Wu, Yi Chang
- From Grounding to Manipulation: Case Studies of Foundation Model Integration in Embodied Robotic Systems
Xiuchao Sui, Daiying Tian, Qi Sun, Ruirui Chen, Dongkyu Choi, Kenneth Kwok, Soujanya Poria
- Flexible Thinking for Multimodal Emotional Support Conversation via Reinforcement Learning
Fanfan Wang, Xiangqing Shen, Jianfei Yu, Rui Xia
- ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion
Rana Shahroz, Dongwen Tang, Pingzhi Li, Kai Wang, Tianlong Chen
- NLoRA: Nyström-Initiated Low-Rank Adaptation for Large Language Models
Chenlu Guo, Yi Chang, Yuan Wu
- Bhaasha, Bhāṣā, Zaban: A Survey for Low-Resourced Languages in South Asia – Current Stage and Challenges
Sampoorna Poria, Xiaolei Huang
- DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data
Yuhang Zhou, Jing Zhu, Shengyi Qian, Zhuokai Zhao, Xiyao Wang, Xiaoyu Liu, Ming Li, Paiheng Xu, Wei Ai, Furong Huang
- What Makes for Good Image Captions?
Delong Chen, Samuel Cahyawijaya, Etsuko Ishii, Ho Shu Chan, Yejin Bang, Pascale Fung
- What’s Not Said Still Hurts: A Description-Based Evaluation Framework for Measuring Social Bias in LLMs
Jinhao Pan, Chahat Raj, Ziyu Yao, Ziwei Zhu
- Identifying Rare Languages in Common Crawl Data is a Needles-in-a-Haystack Problem
Rasul Dent, Pedro Ortiz Suarez, Thibault Clérice, Benoît Sagot
- Training Language Models to Critique With Multi-agent Feedback
Tian Lan, Wenwei Zhang, Chengqi Lyu, Shuaibin Li, Chen Xu, Heyan Huang, Dahua Lin, Xian-Ling Mao, Kai Chen
- RELIC: Enhancing Reward Model Generalization for Low-Resource Indic Languages with Few-Shot Examples
Soumya Suvra Ghosal, Vaibhav Singh, Akash Ghosh, Soumyabrata Pal, Subhadip Baidya, Sriparna Saha, Dinesh Manocha
- Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering
Jihao Zhao, Chunlai Zhou, Daixuan Li, Shuaishuai Zu, Biao Qin
- SQLSpace: A Representation Space for Text-to-SQL to Discover and Mitigate Robustness Gaps
Neha Srikanth, Victor Bursztyn, Puneet Mathur, Ani Nenkova
- One More Modality: Does Abstract Meaning Representation Benefit Visual Question Answering?
Shira Wein, Emma Markle, Abhidip Bhattacharyya
- DP-GTR: Differentially Private Prompt Protection via Group Text Rewriting
Mingchen Li, Heng Fan, Song Fu, Junhua Ding, Yunhe Feng
- Legal Mathematical Reasoning with LLMs: Procedural Alignment through Two-Stage Reinforcement Learning
Kepu Zhang, Guofu Xie, Weijie Yu, Mingyue Xu, Xu Tang, Yaxin Li, Jun Xu
- ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World Challenges
Cheng Qian, Hongyi Du, Hongru Wang, Xiusi Chen, Yuji Zhang, Avirup Sil, ChengXiang Zhai, Kathleen McKeown, Heng Ji
- Beyond Coarse Labels: Fine-Grained Problem Augmentation and Multi-Dimensional Feedback for Emotional Support Conversation
Yuanchen Shi, Jiawang Hao, Fang Kong
- FinHEAR: Human Expertise and Adaptive Risk-Aware Temporal Reasoning for Financial Decision-Making
Jiaxiang Chen, Mingxi Zou, Zhuo Wang, Qifan Wang, Danny Dongning Sun, Zhang Chi, Zenglin Xu
- EvolKV: Evolutionary KV Cache Compression for LLM Inference
Bohan Yu, Yekun Chai
- A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models
Dong Shu, Xuansheng Wu, Haiyan Zhao, Daking Rai, Ziyu Yao, Ninghao Liu, Mengnan Du
- Large Vision-Language Model Alignment and Misalignment: A Survey Through the Lens of Explainability
Dong Shu, Haiyan Zhao, Jingyu Hu, Weiru Liu, Ali Payani, Lu Cheng, Mengnan Du
- Attention Consistency for LLMs Explanation
Tian Lan, Jinyuan Xu, Xue He, Jenq-Neng Hwang, Lei Li
- Confusion is the Final Barrier: Rethinking Jailbreak Evaluation and Investigating the Real Misuse Threat of LLMs
Yu Yan, Sheng Sun, Zhe Wang, Yijun Lin, Zenghao Duan, Zhifei Zheng, Min Liu, Zhiyi Yin, Jianping Zhang
- CCL-XCoT: An Efficient Cross-Lingual Knowledge Transfer Method for Mitigating Hallucination Generation
Zheng Weihua, Roy Ka-Wei Lee, Zhengyuan Liu, Wu Kui, AiTi Aw, Bowei Zou
- Evaluating Step-by-step Reasoning Traces: A Survey
Jinu Lee, Julia Hockenmaier
- Beyond Guilt: Legal Judgment Prediction with Trichotomous Reasoning
Kepu Zhang, Haoyue Yang, Xu Tang, Weijie Yu, Jun Xu
- Not Every Token Needs Forgetting: Selective Unlearning Balancing Forgetting and Utility in Large Language Models
Yixin Wan, Anil Ramakrishna, Kai-Wei Chang, Volkan Cevher, Rahul Gupta
- DisastIR: A Comprehensive Information Retrieval Benchmark for Disaster Management
Kai Yin, Xiangjue Dong, Chengkai Liu, Lipai Huang, Yiming Xiao, Zhewei Liu, Ali Mostafavi, James Caverlee
- Data or Language Supervision: What Makes CLIP Better than DINO?
Yiming Liu, Yuhui Zhang, Dhruba Ghosh, Ludwig Schmidt, Serena Yeung-Levy
- Do LLMs Understand Wine Descriptors Across Cultures? A Benchmark for Cultural Adaptations of Wine Reviews
Chenye Zou, Xingyue Wen, Tianyi Hu, Qian Janice Wang, Daniel Hershcovich
- DeFT-X: Denoised Sparse Fine-Tuning for Zero-Shot Cross-Lingual Transfer
Sona Elza Simon, Preethi Jyothi
- Memory-enhanced Large Language Model for Cross-lingual Dependency Parsing via Deep Hierarchical Syntax Understanding
Jianjian Liu, Ying Li, Zhengtao Yu, Shun Su, Shengxiang Gao, Yuxin Huang
- Developing and Utilizing a Large-Scale Cantonese Dataset for Multi-Tasking in Large Language Models
Jiyue Jiang, Alfred Kar Yin Truong, Yanyu Chen, Qinghang Bao, Sheng Wang, Pengan Chen, Jiuming Wang, Lingpeng Kong, Yu Li, Chuan Wu
- A Structured Framework for Evaluating and Enhancing Interpretive Capabilities of Multimodal LLMs in Culturally Situated Tasks
Haorui Yu, Ramon Ruiz-Dolz, Qiufeng Yi
- Train a Unified Multimodal Data Quality Classifier with Synthetic Data
Weizhi Wang, Rongmei Lin, Shiyang Li, Colin Lockard, Ritesh Sarkhel, Sanket Lokegaonkar, Jingbo Shang, Xifeng Yan, Nasser Zalmout, Xian Li
- Self-Improvement in Multimodal Large Language Models: A Survey
Shijian Deng, Kai Wang, Tianyu Yang, Harsh Singh, Yapeng Tian
- Towards Achieving Concept Completeness for Textual Concept Bottleneck Models
Milan Bhan, Yann Choho, Jean-Noël Vittaut, Nicolas Chesneau, Pierre Moreau, Marie-Jeanne Lesot
- EmoBench-UA: A Benchmark Dataset for Emotion Detection in Ukrainian
Daryna Dementieva, Nikolay Babakov, Alexander Fraser
- Scientific Paper Retrieval with LLM-Guided Semantic-Based Ranking
Yunyi Zhang, Ruozhen Yang, Siqi Jiao, SeongKu Kang, Jiawei Han
- DLIR: Spherical Adaptation for Cross-Lingual Knowledge Transfer of Sociological Concepts Alignment
Zeqiang Wang, Jon Johnson, Suparna De
- Test-Time Steering for Lossless Text Compression via Weighted Product of Experts
Qihang Zhang, Muchen Li, Ziao Wang, Renjie Liao, Lele Wang
- Zero-Shot Contextual Embeddings via Offline Synthetic Corpus Generation
Philip Lippmann, Jie Yang
- The Hallucination Tax of Reinforcement Finetuning
Linxin Song, Taiwei Shi, Jieyu Zhao
- Tracing Multilingual Factual Knowledge Acquisition in Pretraining
Yihong Liu, Mingyang Wang, Amir Hossein Kargaran, Felicia Körner, Ercong Nie, Barbara Plank, François Yvon, Hinrich Schuetze
- Exploring the Vulnerability of the Content Moderation Guardrail in Large Language Models via Intent Manipulation
Jun Zhuang, Haibo Jin, Ye Zhang, Zhengjian Kang, Wenbin Zhang, Gaby G. Dagher, Haohan Wang
- Examining Multilingual Embedding Models Cross-Lingually Through LLM-Generated Adversarial Examples
Andrianos Michail, Simon Clematide, Rico Sennrich
- EmoGist: Efficient In-Context Learning for Visual Emotion Understanding
Ronald Seoh, Dan Goldwasser
- Soft Token Attacks Cannot Reliably Audit Unlearning in Large Language Models
Haokun Chen, Sebastian Szyller, Weilin Xu, Nageen Himayat
- Bridging the Editing Gap in LLMs: FineEdit for Precise and Targeted Text Modifications
Yiming Zeng, Wanhao Yu, Zexin Li, Tao Ren, Yu Ma, Jinghan Cao, Xiyan Chen, Tingting Yu
- LLM-based Conversational Recommendation Agents with Collaborative Verbalized Experience
Yaochen Zhu, Harald Steck, Dawen Liang, Yinhan He, Nathan Kallus, Jundong Li
- Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference
Hao Mark Chen, Wayne Luk, Yiu Ka Fai Cedric, Rui Li, Konstantin Mishchenko, Stylianos Venieris, Hongxiang Fan
- Measuring Sycophancy of Language Models in Multi-turn Dialogues
Jiseung Hong, Grace Byun, Seungone Kim, Kai Shu
- On the Role of Entity and Event Level Conceptualization in Generalizable Reasoning: A Survey of Tasks, Methods, Applications, and Future Directions
Weiqi Wang, Tianqing Fang, Haochen Shi, Baixuan Xu, Wenxuan Ding, Liyu Zhang, Wei Fan, Jiaxin Bai, Haoran Li, Xin Liu, Yangqiu Song
- Mitigating Visual Knowledge Forgetting in MLLM Instruction-tuning via Modality-decoupled Gradient Descent
Junda Wu, Yuxin Xiong, Xintong Li, Yu Xia, Yu Wang, Tong Yu, Sungchul Kim, Ryan A. Rossi, Lina Yao, Jingbo Shang, Julian McAuley
- PathoHR: Hierarchical Reasoning for Vision-Language Models in Pathology
Yating Huang, Ziyan Huang, Lintao Xiang, Qijun Yang, Hujun Yin
- “What’s Up, Doc?”: Analyzing How Users Seek Health Information in Large-Scale Conversational AI Datasets
Akshay Paruchuri, Maryam Aziz, Rohit Vartak, Ayman Ali, Best Uchehara, Xin Liu, Ishan Chatterjee, Monica Agrawal
- Dynamic Evaluation for Oversensitivity in LLMs
Sophia Xiao Pu, Sitao Cheng, Xin Eric Wang, William Yang Wang
- Self-Correcting Code Generation Using Small Language Models
Jeonghun Cho, Deokhyung Kang, Hyounghun Kim, Gary Lee
- A Unified Framework for N-ary Property Information Extraction in Materials Science
Van-Thuy Phi, Yuji Matsumoto
- A Benchmark for Translations Across Styles and Language Variants
Xin Tan, Bowei Zou, AiTi Aw
- ManuSearch: Democratizing Deep Search in Large Language Models with a Transparent and Open Multi-Agent Framework
Lisheng Huang, Yichen Liu, Jinhao Jiang, Rongxiang Zhang, Jiahao Yan, Junyi Li, Xin Zhao
- Proactive User Information Acquisition via Chats on User-Favored Topics
Shiki Sato, Jun Baba, Asahi Hentona, Shinji Iwata, Akifumi Yoshimoto, Koichiro Yoshino
- Evaluating Text Generation Quality Using Spectral Distances of Surprisal
Zhichen Liu, Yongyuan Li, Yang Xu, Yu Wang, Yingfang Yuan, Zuhao Yang
- NLP-ADBench: NLP Anomaly Detection Benchmark
Yuangang Li, Jiaqi Li, Zhuo Xiao, Tiankai Yang, Yi Nian, Xiyang Hu, Yue Zhao
- Toward Inclusive Language Models: Sparsity-Driven Calibration for Systematic and Interpretable Mitigation of Social Biases in LLMs
Prommy Sultana Hossain, Chahat Raj, Ziwei Zhu, Jessica Lin, Emanuela Marasco
- Table-Text Alignment: Explaining Claim Verification Against Tables in Scientific Papers
Xanh Ho, Sunisth Kumar, Yun-Ang Wu, Florian Boudin, Atsuhiro Takasu, Akiko Aizawa
- DCRM: A Heuristic to Measure Response Pair Quality in Preference Optimization
Chengyu Huang, Tanya Goyal
- Advancing Reasoning with Off-the-Shelf LLMs: A Semantic Structure Perspective
Pengfei He, Zitao Li, Yue Xing, Yaliang Li, Jiliang Tang, Bolin Ding
- LLM-based Open Domain Planning by Leveraging Entity-Attribute-Level Domain Models
Dongning Rao, Songlin He, Ruishi Liang, Zhihua Jiang
- DICP: Deep In-Context Prompt for Event Causality Identification
Lin Mu, Jun Shen, Li Ni, Lei Sang, Zhize Wu, Peiquan Jin, Yiwen Zhang
- Seeing is Believing: Emotion-Aware Audio-Visual Language Modeling for Expressive Speech Generation
Weiting Tan, Jiachen Lian, Hirofumi Inaguma, Paden Tomasello, Philipp Koehn, Xutai Ma
- GRV-KBQA: A Three-Stage Framework for Knowledge Base Question Answering with Decoupled Logical Structure, Semantic Grounding and Structure-Aware Validation
Yuhang Tian, Pan Yang, Dandan Song, Zhijing Wu, Hao Wang
- Improving Prompt Generalization for Cross-prompt Essay Trait Scoring from the Scoring-invariance Perspective
Jiong Wang, Shengquan Yu
- When Format Changes Meaning: Investigating Semantic Inconsistency of Large Language Models
Cheongwoong Kang, Jongeun Baek, Yeonjea Kim, Jaesik Choi
- ASTPrompter: Preference-Aligned Automated Language Model Red-Teaming to Generate Low-Perplexity Unsafe Prompts
Amelia Hardy, Houjun Liu, Allie Griffith, Bernard Lange, Duncan Eddy, Mykel Kochenderfer
- How Do Large Language Models Perform on PDE Discovery: A Coarse-to-fine Perspective
Xiao Luo, Changhu Wang, Yizhou Sun, Wei Wang
- Rethinking Data Selection at Scale: Random Selection is Almost All You Need
Tingyu Xia, Bowen Yu, Kai Dang, An Yang, Yuan Wu, Yi Chang, Junyang Lin
- PromptKeeper: Safeguarding System Prompts for LLMs
Zhifeng Jiang, Zhihua Jin, Guoliang He
- Automating eHMI Action Design with LLMs for Automated Vehicle Communication
Ding Xia, Xinyue Gui, Fan Gao, Dongyuan Li, Mark Colley, Takeo Igarashi
- A Dynamic Fusion Model for Consistent Crisis Response
Xiaoying Song, Anirban Saha Anik, Eduardo Blanco, Vanessa Frias-Martinez, Lingzi Hong
- UIOrchestra: Generating High-Fidelity Code from UI Designs with a Multi-agent System
Chuhuai Yue, Jiajun Chai, Yufei Zhang, Zixiang Ding, Xihao Liang, Peixin Wang, Shihai Chen, Wang Yixuan, wangyanping, Wei Lin, Guojun Yin
- CrossQG: Improving Difficulty-Controllable Question Generation through Consistency Enhancement
Kunze Li, Yu Zhang
- Progressive Facial Granularity Aggregation with Bilateral Attribute-based Enhancement for Face-to-Speech Synthesis
Yejin Jeon, Youngjae Kim, Jihyun Lee, Hyounghun Kim, Gary Lee
- Speaking at the Right Level: Literacy-Controlled Counterspeech Generation with RAG-RL
Xiaoying Song, Anirban Saha Anik, Dibakar Barua, Pengcheng Luo, Junhua Ding, Lingzi Hong
- FNSCC: Fuzzy Neighborhood-Aware Self-Supervised Contrastive Clustering for Short Text
Zijian Zheng, Yonghe Lu, Jian Yin
- AuraDial: A Large-Scale Human-Centric Dialogue Dataset for Chinese AI Psychological Counseling
Xiantao Zhang
- TS-SQL: Test-driven Self-refinement for Text-to-SQL
Wenbo Xu, Haifeng Zhu, Liang Yan, Chuanyi Liu, Peiyi Han, Shaoming Duan, Jeff Z. Pan
- DemonAgent: Dynamically Encrypted Multi-Backdoor Implantation Attack on LLM-based Agent
Pengyu Zhu, Zhenhong Zhou, Yuanhe Zhang, Shilinlu Yan, Kun Wang, Sen Su
- MotivGraph-SoIQ: Integrating Motivational Knowledge Graphs and Socratic Dialogue for Enhanced LLM Ideation
Xinping Lei, Tong Zhou, Yubo Chen, Kang Liu, Jun Zhao
- ExpertGenQA: Open-ended QA Generation in Specialized Domains
Haz Sameen Shahgir, Chansong Lim, Jia Chen, Evangelos E. Papalexakis, Yue Dong
- VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation
Yuansheng Ni, Ping Nie, Kai Zou, Xiang Yue, Wenhu Chen
- Conversational Education at Scale: A Multi-LLM Agent Workflow for Procedural Learning and Pedagogic Quality Assessment
Jiahuan Pei, Fanghua Ye, Xin Sun, Wentao Deng, Koen Hindriks, Junxiao Wang
- Visual Program Distillation with Template-Based Augmentation
Michal Shlapentokh-Rothman, Yu-Xiong Wang, Derek Hoiem
- NeighXLM: Enhancing Cross-Lingual Transfer in Low-Resource Languages via Neighbor-Augmented Contrastive Pretraining
Sicheng Wang, Wenyi Wu, Zibo Zhang
- ICLER: Intent CLassification with Enhanced Reasoning
Dezheng Gao, Dong Xiaozheng, Shuangtao Yang, Bo Fu
- PreGenie: An Agentic Framework for High-quality Visual Presentation Generation
Xiaojie Xu, Xinli Xu, Sirui Chen, Haoyu Chen, Fan Zhang, Ying-Cong Chen
- RIVAL: Reinforcement Learning with Iterative and Adversarial Optimization for Machine Translation
Tianjiao Li, Mengranyu, Chenyu Shi, Yanjun Zhao, Xiaojing Liu, Qi Zhang, Xuanjing Huang, Qiang Zhang, Jiayin Wang
- MRAG: A Modular Retrieval Framework for Time-Sensitive Question Answering
Siyue Zhang, Yuxiang Xue, Yiming Zhang, Xiaobao Wu, Anh Tuan Luu, Chen Zhao
- CoT-RAG: Integrating Chain of Thought and Retrieval-Augmented Generation to Enhance Reasoning in Large Language Models
Feiyang Li, Peng Fang, Zhan Shi, Arijit Khan, Fang Wang, Weihao Wang, zhangxin-hw, Cui Yongjian
- TabDSR: Decompose, Sanitize, and Reason for Complex Numerical Reasoning in Tabular Data
Changjiang Jiang, Fengchang Yu, Haihua Chen, Wei Lu, Jin Zeng
- Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision
Dawei Zhu, Xiyu Wei, Guangxiang Zhao, Wenhao Wu, Haosheng Zou, Junfeng Ran, XWang, Lin Sun, Xiangzheng Zhang, Sujian Li
- Multimodal Document-level Triple Extraction via Dynamic Graph Enhancement and Relation-Aware Reflection
Xiang Li, Runhai Jiao, Zhou Changyu, Shoupeng Qiao, Ruojiao Qiao, Ruifan Li
- Distill Visual Chart Reasoning Ability from LLMs to MLLMs
Wei He, Zhiheng Xi, Wanxu Zhao, Xiaoran Fan, Yiwen Ding, Zifei Shan, Tao Gui, Qi Zhang, Xuanjing Huang
- FlowMalTrans: Unsupervised Binary Code Translation for Malware Detection Using Flow-Adapter Architecture
Minghao Hu, Junzhe Wang, Weisen Zhao, Qiang Zeng, Lannan Luo
- AdaTP: Attention-Debiased Token Pruning for Video Large Language Models
Fengyuan Sun, Leqi Shen, Hui Chen, Sicheng Zhao, Jungong Han, Guiguang Ding
- AdaptFlow: Adaptive Workflow Optimization via Meta-Learning
Runchuan Zhu, Bowen Jiang, Lingrui Mei, Fangkai Yang, Lu Wang, Haoxiang Gao, Fengshuo Bai, Pu Zhao, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang
- LMUNIT: Fine-grained Evaluation with Natural Language Unit Tests
Jon Saad-Falcon, Rajan Pathe Vivek, William Berrios, Nandita Shankar Naik, Matija Franklin, Bertie Vidgen, Amanpreet Singh, Douwe Kiela, Shikib Mehri
- ThinkAnswer Loss: Balancing Semantic Similarity and Exact Matching for LLM Reasoning Enhancement
Shan Yang, Kun Wu, Zeju Li, Linlin Zhang, Xiangyu Pei, Leike An, Yu Liu
- Detecting Stealthy Backdoor Samples based on Intra-class Distance for Large Language Models
Jinwen Chen, Hainan Zhang, Fei Sun, Qinnan Zhang, Sijia Wen, Ziwei Wang, Zhiming Zheng
- Rust-doctor: Enhanced Feature for Rust Ownership and Lifetime Repair with Balanced Training Data Generation
Wenzhang Yang, Xiaoning Ren, Cuifeng Gao, Yinxing Xue
- SLIM: Subtrajectory-Level Elimination for More Effective Reasoning
Xifeng Yao, Chengyuan Ma, Dongyu Lang, Yinhao Ni, Zhiwei Xu, Huarui Xie, Zihao Chen, Guang Shen, Dandan Tu, Yi Bai, Changzheng Zhang
- From Cross-Task Examples to In-Task Prompts: A Graph-Based Pseudo-Labeling Framework for In-context Learning
Zihan Chen, Song Wang, Xingbo Fu, Chengshuai Shi, Zhenyu Lei, Cong Shen, Jundong Li
- Instance-level Randomization: Toward More Stable LLM Evaluations
Yiyang Li, Yonghuang Wu, Ying Luo, Liangtai Sun, Zishu Qin, Lin Qiu, Xuezhi Cao, Xunliang Cai
- Not All Voices Are Rewarded Equally: Probing and Repairing Reward Models across Human Diversity
Zihao Li, Feihao Fang, Xitong Zhang, Jiaru Zou, Zhining Liu, Wei Xiong, Ziwei Wu, Baoyu Jing, Jingrui He
- PAMN: Multi-phase Correlation Modeling for Contrast-Enhanced 3D Medical Image Retrieval
Haonan Tong, Ke Liu, Chuang Zhang, Xinglin Zhang, Tao Chen, Jenq-Neng Hwang, Lei Li
- Safety in Large Reasoning Models: A Survey
Cheng Wang, Yue Liu, Baolong Bi, Duzhen Zhang, Zhong-Zhi Li, Yingwei Ma, Yufei He, Shengju Yu, Xinfeng Li, Junfeng Fang, Jiaheng Zhang, Bryan Hooi
- SafeConf: A Confidence-Calibrated Safety Self-Evaluation Method for Large Language Models
Bo Zhang, Cong Gao, Linkang Yang, Bingxu Han, Minghao Hu, Zhunchen Luo, Guotong Geng, Xiaoying Bai, Jun Zhang, Wen Yao, Zhong Wang
- DocAssistant: Integrating Key-region Reading and Step-wise Reasoning for Robust Document Visual Question Answering
Jinxu Zhang, Qiyuan Fan, Yu Zhang
- LNE-Blocking: An Efficient Framework for Contamination Mitigation Evaluation on Large Language Models
Ruijie Hou, Yueyang Jiao, Hanxu Hu, Yingming Li, Wai Lam, Huajian Zhang, Hongyuan Lu
- Enhancing Hate Speech Classifiers through a Gradient-assisted Counterfactual Text Generation Strategy
Michael Van Supranes, Shaowen Peng, Shoko Wakamiya, Eiji Aramaki
- Learning SQL Like a Human: Structure-Aware Curriculum Learning for Text-to-SQL Generation
Xiaohu Zhu, Qian Li, Lizhen Cui, Yuntao Du
- Chain-of-Interactions: Multi-step Iterative ICL Framework for Abstractive Task-Oriented Dialogue Summarization of Conversational AI Interactions
Jason S Lucas, Ali Al Lawati, Mahjabin Nahar, John Chen, Mahnoosh Mehrabani
- Your Semantic-Independent Watermark is Fragile: A Semantic Perturbation Attack against EaaS Watermark
Zekun Fei, Biao Yi, Jianing Geng, He Ruiqi, Lihai Nie, Zheli Liu
- Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models
Youan Cong, Pritom Saha Akash, Cheng Wang, Kevin Chen-Chuan Chang
- SKA-Bench: A Fine-Grained Benchmark for Evaluating Structured Knowledge Understanding of LLMs
Zhiqiang Liu, Enpei Niu, Yin Hua, Mengshu Sun, Lei Liang, Huajun Chen, Wen Zhang
- $PD^3F$: A Pluggable and Dynamic DoS-Defense Framework against Resource Consumption Attacks Targeting Large Language Models
Yuanhe Zhang, Xinyue Wang, Haoran Gao, Zhenhong Zhou, Fanyu Meng, Yuyao Zhang, Sen Su
- From Implicit Exploration to Structured Reasoning: Guideline and Refinement for LLMs
Jiaxiang Chen, Zhuo Wang, Mingxi Zou, Zhucong Li, Zhijian Zhou, Song Wang, Zenglin Xu
- PIP: Perturbation-based Iterative Pruning for Large Language Models
Yi Cao, Wei-Jie Xu, Yucheng Shen, Weijie Shi, Chi-Min Chan, Jianfeng Qu, Jiajie Xu
- Convolutional LoRA Aggregation for Unseen Tasks Adaptation
Xinhao Wu, Jialin Liu, Yutai Duan, Jie Liu
- CDT: A Comprehensive Capability Framework for Large Language Models Across Cognition, Domain, and Task
Haosi Mo, Xinyu Ma, Xuebo Liu, Derek F. Wong, Yu Li, Jie Liu, Min Zhang
- Multilingual Collaborative Defense for Large Language Models
Hongliang Li, Jinan Xu, Gengping Cui, Changhao Guan, Fengran Mo, Kaiyu Huang
- Role-Guided Annotation and Prototype-Aligned Representation Learning for Historical Literature Sentiment Classification
Hongfei Du, Jiacheng Shi, Jacobo Myerston, Sidi Lu, Gang Zhou, Ashley Gao
- MetaMixSpeech: Meta Task Augmentation for Low-Resource Speech Recognition
Yaqi Chen, Hao Zhang, Wenlin Zhang, XuKui Yang, Dan Qu, Yunpeng Liu
- RECAST: Retrieval-Augmented Contextual ASR via Decoder-State Keyword Spotting
Ashish Mittal, Sunita Sarawagi, Preethi Jyothi
- PREE: Towards Harmless and Adaptive Fingerprint Editing in Large Language Models via Knowledge Prefix Enhancement
Xubin Yue, Zhenhua Xu, Wenpeng Xing, Jiahui Yu, Mohan Li, Meng Han
- Beyond Spurious Signals: Debiasing Multimodal Large Language Models via Counterfactual Inference and Adaptive Expert Routing
Zichen Wu, Hsiu-Yuan Huang, Yunfang Wu
- Text-centric Alignment for Bridging Test-time Unseen Modality
Yun-Da Tsai, Ting-Yu Yen, Pei-Fu Guo, Zhe-Yan Li, Shou-De Lin
- HierPrompt: Zero-Shot Hierarchical Text Classification with LLM-Enhanced Prototypes
Qian Zhang, Qinliang Su, Wei Zhu, Pang Yachun
- RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs
Zhongzhan Huang, Guoming Ling, Yupei Lin, Yandong Chen, Shanshan Zhong, Hefeng Wu, Liang Lin
- Can We Steer Reasoning Direction by Thinking Intervention?
Xingsheng Zhang, Luxi Xing, Chen Zhang, Yanbing Liu, Yifan Deng, Yunpeng Li, Yue Hu, Chenxu Niu
- MPO: Boosting LLM Agents with Meta Plan Optimization
Weimin Xiong, Yifan Song, Qingxiu Dong, Bingchan Zhao, Feifan Song, XWang, Sujian Li
- Exploring the Generalizability of Factual Hallucination Mitigation via Enhancing Precise Knowledge Utilization
Siyuan Zhang, Yichi Zhang, Yinpeng Dong, Hang Su
- Learning What to Remember: Adaptive Probabilistic Memory Retention for Memory-Efficient Language Models
S M Rafiuddin, Muntaha Nujat Khan
- Unlocking Smarter Device Control: Foresighted Planning with a World Model-Driven Code Execution Approach
Xiaoran Yin, Xu Luo, Hao Wu, Lianli Gao, Jingkuan Song
- RGAR: Recurrence Generation-augmented Retrieval for Factual-aware Medical Question Answering
Sichu Liang, Linhai Zhang, Hongyu Zhu, Wenwen Wang, Yulan He, Deyu Zhou
- EcoSafeRAG: Efficient Security through Context Analysis in Retrieval-Augmented Generation
Ruobing Yao, Yifei Zhang, Shuang Song, Neng Gao, Chenyang Tu
- StereoDetect: Detecting Stereotypes and Anti-stereotypes the Correct Way Using Social Psychological Underpinnings
Kaustubh Shivshankar Shejole, Pushpak Bhattacharyya
- Sparkle: Mastering Basic Spatial Capabilities in Vision Language Models Elicits Generalization to Spatial Reasoning
Yihong Tang, Ao Qu, Zhaokai Wang, Dingyi Zhuang, Zhaofeng Wu, Wei Ma, Shenhao Wang, Yunhan Zheng, Zhan Zhao, Jinhua Zhao
- How Does Knowledge Selection Help Retrieval Augmented Generation?
Xiangci Li, Jessica Ouyang
- UPLex: Fine-Grained Personality Control in Large Language Models via Unsupervised Lexical Modulation
Tianlong Li, Xiaoqing Zheng
- ParetoRAG: Leveraging Sentence-Context Attention for Robust and Efficient Retrieval-Augmented Generation
Ruobing Yao, Yifei Zhang, Shuang Song, Yuhan Liu, Neng Gao, Chenyang Tu
- FlexQuant: A Flexible and Efficient Dynamic Precision Switching Framework for LLM Quantization
Fangxin Liu, Zongwu Wang, Jinhong Xia, Junping Zhao, Jian Liu, Haibing Guan, Li Jiang
- ReLoop: “Seeing Twice and Thinking Backwards” via Closed-loop Training to Mitigate Hallucinations in Multimodal Understanding
Jianjiang Yang, Yanshu Li, Ziyan Huang
- Sequence Structure Aware Retriever for Procedural Document Retrieval: A New Dataset and Baseline
Zhenqi Ye, HaoPeng Ren, Yi Cai, Qingbao Huang, Jing Qin, Pinli Zhu, Songwen Gong
- The Effect of Language Diversity When Fine-Tuning Large Language Models for Translation
David Stap, Christof Monz
- David vs. Goliath: Cost-Efficient Financial QA via Cascaded Multi-Agent Reasoning
Chenghao Liu, Qian Liu, Ziqin Zhu, Hao Fei, Aniket Mahanti
- Benchmarking Uncertainty Metrics for LLM Target-Aware Search
Pei-Fu Guo, Yun-Da Tsai, Shou-De Lin
- ZOGRASCOPE: A New Benchmark for Semantic Parsing over Property Graphs
Francesco Cazzaro, Justin Kleindienst, Sofia Márquez Gomez, Ariadna Quattoni
- FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning
Ruosen Li, Ziming Luo, Xinya Du
- Multi-Faceted Self-Consistent Preference Alignment for Query Rewriting in Conversational Search
Zhiyu Cao, Peifeng Li, Qiaoming Zhu
- Recipe2Plan: Evaluating Planning Abilities of LLMs for Efficient and Feasible Multitasking with Time Constraints Between Actions
Zirui Wu, Xiao Liu, Jiayi Li, Lingpeng Kong, Yansong Feng
- Unlocking the Effectiveness of LoRA-FP for Seamless Transfer Implantation of Fingerprints in Downstream Models
Zhenhua Xu, Zhaokun Yan, Binhan Xu, Xin Tong, Haitao Xu, Yourong Chen, Meng Han
- AELC: Adaptive Entity Linking with LLM-Driven Contextualization
Fang Wang, Zhengwei Tao, Ming Wang, Minghao Hu, Xiaoying Bai
- MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer
Honglin Lin, Zhuoshi Pan, Qizhi Pei, Xin Gao, Yu Li, Mengzhang Cai, Conghui He, Lijun Wu
- GLProtein: Global-and-Local Structure Aware Protein Representation Learning
Yunqing Liu, Wenqi Fan, Xiaoyong Wei, Li Qing
- Reward Mixology: Crafting Hybrid Signals for Reinforcement Learning Driven In-Context Learning
Changshuo Zhang, Ang Gao, Xiao Zhang, Yong Liu, Deyang Li, Fangchao Liu, Xinyu Zhang
- Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials Characterization
Zhengzhao Lai, Youbin Zheng, Zhenyang Cai, Haonan Lyu, Jingpu Yang, Hong-Qing Liang, Yan Hu, Benyou Wang
- GRADE: Generating multi-hop QA and fine-gRAined Difficulty matrix for RAG Evaluation
Jeongsoo Lee, Daeyong Kwon, Kyohoon Jin
- FusionDTI: Fine-grained Binding Discovery with Token-level Fusion for Drug-Target Interaction
Zhaohan Meng, Zaiqiao Meng, Ke Yuan, Iadh Ounis
- A Survey on Training-free Alignment of Large Language Models
Birong Pan, Yongqi Li, Weiyu Zhang, Wenpeng Lu, Mayi Xu, Shen Zhou, Yuanyuan Zhu, Ming Zhong, Tieyun Qian
- CIVET: Systematic Evaluation of Understanding in VLMs
Massimo Rizzoli, Simone Alghisi, Olha Khomyn, Gabriel Roccabruna, Seyed Mahed Mousavi, Giuseppe Riccardi
- How Does Cognitive Bias Affect Large Language Models? A Case Study on the Anchoring Effect in Price Negotiation Simulations
Yoshiki Takenami, Yin Jou Huang, Yugo Murawaki, Chenhui Chu
- Enhancing Speech-to-Speech Dialogue Modeling with End-to-End Retrieval-Augmented Generation
Pengchao Feng, Ziyang Ma, Wenxi Chen, Yao Li, Sheng Wang, Kai Yu, Xie Chen
- Backdoor-Powered Prompt Injection Attacks Nullify Defense Methods
Yulin Chen, Haoran Li, Yuan Sui, Yangqiu Song, Bryan Hooi
- Path-enhanced Pre-trained Language Model for Knowledge Graph Completion
Hao Wang, Dandan Song, Zhijing Wu, Yuhang Tian, Pan Yang
- Zero-shot Cross-lingual NER via Mitigating Language Difference: An Entity-aligned Translation Perspective
Zhihao Zhang, Sophia Yat Mei Lee, Dong Zhang, Shoushan Li, Guodong Zhou
- Zero-Shot Cross-Domain Aspect-Based Sentiment Analysis via Domain-Contextualized Chain-of-Thought Reasoning
Chuming Shen, Wei Wei, Dong Wang, Zhong-Hao Wang
- Tree of Agents: Improving Long-Context Capabilities of Large Language Models through Multi-Perspective Reasoning
Song Yu, Xiaofei Xu, Ke Deng, Li Li, Lin Tian
- Cross-Cultural Transfer of Commonsense Reasoning in LLMs: Evidence from the Arab World
Saeed Almheiri, Rania Elbadry, Mena Attia, Chenxi Wang, Preslav Nakov, Timothy Baldwin, Fajri Koto
- Enhancing Partially Relevant Video Retrieval with Robust Alignment Learning
Long Zhang, Peipei Song, Jianfeng Dong, Kun Li, Xun Yang
- Multi-level Diagnosis and Evaluation for Robust Tabular Feature Engineering with Large Language Models
Yebin Lim, Susik Yoon
- Prejudge-Before-Think: Enhancing Large Language Models at Test-Time by Process Prejudge Reasoning
Jianing Wang, Jin Jiang, Yang Liu, Mengdi Zhang, Xunliang Cai
- FroM: Frobenius Norm-Based Data-Free Adaptive Model Merging
Zijian Li, Xiaocheng Feng, Huixin Liu, Yichong Huang, Ting Liu, Bing Qin
- Dynamic Simulation Framework for Disinformation Dissemination and Correction With Social Bots
Boyu Qiao, Kun Li, Wei Zhou, Songlin Hu
- Beyond the First Error: Process Reward Models for Reflective Mathematical Reasoning
Zhaohui Yang, Chenghua He, Xiaowen Shi, Shihong Deng, Linjing Li, Qiyue Yin, Daxin Jiang
- PrAd: Prompt Adaptive Tuning for Decoder-only Language Models
Youneng Ma, Junyi He, Haojun Fei
- Personalized Question Answering with User Profile Generation and Compression
Hang Su, Yun Yang, Tianyang Liu, Xin Liu, Peng Pu, Xuesong Lu
- Dream to Chat: Model-based Reinforcement Learning on Dialogues with User Belief Modeling
Yue Zhao, Xiaoyu Wang, Dan Wang, Zhonglin Jiang, Qingqing Gu, Teng Chen, Ningyuan Xi, Jinxian Qu, Yong Chen, Luo Ji
- FakeSV-VLM: Taming VLM for Detecting Fake Short-Video News via Progressive Mixture-Of-Experts Adapter
JunXi Wang, Yaxiong Wang, Lechao Cheng, Zhun Zhong
- Beyond Inherent Cognition Biases in LLM-Based Event Forecasting: A Multi-Cognition Agentic Framework
Zhen Wang, Xi Zhou, Yating Yang, Bo Ma, Lei Wang, Rui Dong, Azmat Anwar
- Breaking the Reviewer: Assessing the Vulnerability of Large Language Models in Automated Peer Review Under Textual Adversarial Attacks
Tzu-Ling Lin, Wei-Chih Chen, Teng-Fang Hsiao, Hou-I Liu, Ya-Hsin Yeh, Yu-Kai Chan, Wen-Sheng Lien, Po-Yen Kuo, Philip S. Yu, Hong-Han Shuai
- Watermarking with Low-Entropy POS-Guided Token Partitioning and Z-Score-Driven Dynamic Bias for Large Language Models
He Li, Xiaojun Chen, Zhendong Zhao, Yunfei Yang, Xin Zhao, Jingcheng He
- Knowledge Graph-Driven Memory Editing with Directional Interventions
Jinhu Fu, Kun Wang, Chongye Guo, Junfeng Fang, Wentao Zhang, Sen Su
- DTDES-KGE: Dual-Teacher Knowledge Distillation with Distinct Embedding Spaces for Knowledge Graph Embeddings
Bofan Wei, Hongyuan Xu, Yuhang Niu, Jiarui Ren, Yanlong Wen, Xiaojie Yuan
- LLMEval-Med: A Real-world Clinical Benchmark for Medical LLMs with Physician Validation
Ming Zhang, Yujiong Shen, Zelin Li, Huayu Sha, Binze Hu, Yuhui Wang, Chenhao Huang, Shichun Liu, Jingqi Tong, Changhao Jiang, Mingxu Chai, Zhiheng Xi, Shihan Dou, Tao Gui, Qi Zhang, Xuanjing Huang
- Watermark Smoothing Attacks against Language Models
Hongyan Chang, Hamed Hassani, Reza Shokri
- PICD-Instruct: A Generative Instruction Learning Framework for Few-Shot Multi-Intent Spoken Language Understanding
Wenbin Hua, Rui Fan, Tingting He, Ming Dong
- Forewarned is Forearmed: Pre-Synthesizing Jailbreak-like Instructions to Enhance LLM Safety Guardrail to Potential Attacks
Sheng Liu, Qiang Sheng, Danding Wang, Yang Li, Guang Yang, Juan Cao
- Are Knowledge and Reference in Multilingual Language Models Cross-Lingually Consistent?
Xi Ai, Mahardika Krisna Ihsani, Min-Yen Kan
- Krikri: Advancing Open Large Language Models for Greek
Dimitris Roussis, Leon Voukoutis, Georgios Paraskevopoulos, Sokratis Sofianopoulos, Prokopis Prokopidis, Vassilis Papavassileiou, Athanasios Katsamanis, Stelios Piperidis, Vassilis Katsouros
- Beyond the Scientific Document: A Citation-Aware Multi-Granular Summarization Approach with Heterogeneous Graphs
Quoc-An Nguyen, Xuan-Hung Le, Thi-Minh-Thu Vu, Hoang-Quynh Le
- Detecting Continuously Evolving Scam Calls under Limited Annotation: A LLM-Augmented Expert Rule Framework
Haoyu Ma, Qinliang Su, Minhua Huang, Wu Kai
- An Empirical Study of Position Bias in Modern Information Retrieval
Ziyang Zeng, Dun Zhang, Jiacheng Li, zoupanxiang, Yuqing Yang
- GenPoE: Generative Passage-level Mixture of Experts for Knowledge Enhancement of LLMs
Xuebing Liu, Shanbao Qiao, Seung-Hoon Na
- CoRanking: Collaborative Ranking with Small and Large Ranking Agents
Wenhan Liu, Xinyu Ma, Yutao Zhu, Lixin Su, Shuaiqiang Wang, Dawei Yin, Zhicheng Dou
- HIRAG: Hierarchical-Thought Instruction-Tuning Retrieval-Augmented Generation
Yihan Jiao, Zhehao Tan, Dan Yang, Duolin Sun, Jie Feng, Yue Shen, Jian Wang, Peng Wei
- Towards Personalized Conversational Sales Agents: Contextual User Profiling for Strategic Action
Tongyoung Kim, Jeongeun Lee, SooJin Yoon, SungHwan Kim, Dongha Lee
- WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback
Minda Hu, Tianqing Fang, Jianshu Zhang, Jun-Yu Ma, Zhisong Zhang, Jingyan Zhou, Hongming Zhang, Haitao Mi, Dong Yu, Irwin King
- Interesting Culture: Social Relation Recognition from Videos via Culture De-confounding
Yuxuan Zhang, Yangfu Zhu, Haorui Wang, Bin Wu
- ThinkSwitcher: When to Think Hard, When to Think Fast
Guosheng Liang, Longguang Zhong, Ziyi Yang, Xiaojun Quan
- MaGiX: A Multi-Granular Adaptive Graph Intelligence Framework for Enhancing Cross-Lingual RAG
Nguyen Manh Hieu, Vu Lam Anh, Hung Pham Van, Nam Le Hai, Linh Ngo Van, Nguyen Thi Ngoc Diep, Thien Huu Nguyen
- LexTime: A Benchmark for Temporal Ordering of Legal Events
Claire Barale, Leslie Barrett, Vikram Sunil Bajaj, Michael Rovatsos
- Beyond the Surface: A Solution-Aware Retrieval Model for Competition-level Code Generation
Zhang Shiwen, Lingxiang Wang, Hainan Zhang, Ziwei Wang, Sijia Wen, Zhiming Zheng
- X-Boundary: Establishing Exact Safety Boundary to Shield LLMs from Jailbreak Attacks without Compromising Usability
Xiaoya Lu, Dongrui Liu, Yi Yu, Luxin Xu, Jing Shao
- Tag&Tab: Pretraining Data Detection in Large Language Models Using Keyword-Based Membership Inference Attack
Sagiv Antebi, Edan Habler, Asaf Shabtai, Yuval Elovici
- EcoLANG: Efficient and Effective Agent Communication Language Induction for Social Simulation
Xinyi Mou, Chen Qian, Wei Liu, Ling Yan, Yao Hu, Xuanjing Huang, Zhongyu Wei
- Revealing the Inherent Instructability of Pre-Trained Language Models
Seokhyun An, Minji Kim, Hyounghun Kim
- What Media Frames Reveal About Stance: A Dataset and Study about Memes in Climate Change Discourse
Shijia Zhou, Siyao Peng, Simon M. Luebke, Jörg Haßler, Mario Haim, Saif M. Mohammad, Barbara Plank
- Rethinking Personality Assessment from Human-Agent Dialogues: Fewer Rounds May Be Better Than More
Baiqiao Zhang, Zhifeng Liao, Xiangxian Li, Chao Zhou, Juan Liu, Xiaojuan Ma, Yulong Bian
- TailorRPA: A Retrieval-Based Framework for Eliciting Personalized and Coherent Role-Playing Agents in General Domain
Zhenpeng Gao, Xiaofen Xing, Xiangmin Xu
- SCE: Semantic Consistency Enhanced Reinforcement Learning for Multi-Hop Knowledge Graph Reasoning
Huangyw, Yao Liu, Qiao Liu, Rui Hou, Tingting Dai
- ReGraphRAG: Reorganizing Fragmented Knowledge Graphs for Multi-Perspective Retrieval-Augmented Generation
Soohyeong Kim, Seok Jun Hwang, JungHyoun Kim, Jeonghyeon Park, Yong Suk Choi
- GASE: Generatively Augmented Sentence Encoding
Manuel Frank, Haithem Afli
- The “r” in “woman” stands for rights. Auditing LLMs in Uncovering Social Dynamics in Implicit Misogyny
Arianna Muti, Chris Emmery, Debora Nozza, Alberto Barrón-Cedeño, Tommaso Caselli
- Fact Verification on Knowledge Graph via Programmatic Graph Reasoning
Yuanzhen Hao, Desheng Wu
- Agent Trading Arena: A Study on Numerical Understanding in LLM-Based Agents
Tianmi Ma, Jiawei Du, Wenxin Huang, Wenjie Wang, Liang Xie, Xian Zhong, Joey Tianyi Zhou
- Why We Feel What We Feel: Joint Detection of Emotions and Their Opinion Triggers in E-commerce
Arnav Attri, Anuj Attri, Suman Banerjee, Amey Patil, Muthusamy Chelliah, Nikesh Garera, Pushpak Bhattacharyya
- Use Random Selection for Now: Investigation of Few-Shot Selection Strategies in LLM-based Text Augmentation
Jan Cegin, Branislav Pecher, Jakub Simko, Ivan Srba, Maria Bielikova, Peter Brusilovsky
- BanglaByT5: Byte-Level Modelling for Bangla
Pramit Bhattacharyya, Arnab Bhattacharya
- XTRA: Cross-Lingual Topic Modeling with Topic and Representation Alignments
Nguyen Tien Phat, Ngo Vu Minh, Tung Nguyen, Linh Ngo Van, Duc Anh Nguyen, Sang Dinh, Trung Le
- CodeContests+: High-Quality Test Case Generation for Competitive Programming
Zihan Wang, Siyao Liu, Yang Sun, Ming Ding, Hongyan Li
- SPO: Self Preference Optimization with Self Regularization
Yuhao Sun, Yifan Zhang, Quandong Wang, Qinzhuo Wu, Wei Liu, Jian Luan
- Long-context Language Models Fail in Basic Retrieval Tasks Without Sufficient Reasoning Steps
Yijiong Yu, Ma Xiufa, Fang Jianwei, Zhi Xu, Guangyao Su, Wang Jiancheng, Yongfeng Huang, Zhixiao Qi, Wei Wang, weifeng.liu, Ran Chen, Ji Pei
- Benchmarking Critical Questions Generation: A Challenging Reasoning Task for Large Language Models
Blanca Calvo Figueras, Rodrigo Agerri
- ResearchArena: Benchmarking Large Language Models’ Ability to Collect and Organize Information as Research Agents
Hao Kang, Chenyan Xiong
- LLMs are Privacy Erasable
Zipeng Ye, Wenjian Luo
- How Good are LLM-based Rerankers? An Empirical Analysis of State-of-the-Art Reranking Models
Abdelrahman Abdallah, Bhawna Piryani, Jamshid Mozafari, Mohammed Ali, Adam Jatowt
- DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation
Abdelrahman Abdallah, Jamshid Mozafari, Bhawna Piryani, Adam Jatowt
- CANDY: Benchmarking LLMs’ Limitations and Assistive Potential in Chinese Misinformation Fact-Checking
Ruiling Guo, Xinwei Yang, Chen Huang, Tong Zhang, Yong Hu
- E-Verify: A Paradigm Shift to Scalable Embedding-based Factuality Verification
Zeyang Liu, Jingfeng Xue, Xiuqi Yang, Wenbiao Du, Jiarun Fu, Junbao Chen, Wenjie Guo, Yong Wang
- LLM Jailbreak Detection for (Almost) Free!
Guorui Chen, Yifan Xia, Xiaojun Jia, Zhijiang Li, Philip Torr, Jindong Gu
- When to Continue Thinking: Adaptive Thinking Mode Switching for Efficient Reasoning
Xiaoyun Zhang, Jingqing Ruan, Xing Ma, Yawen Zhu, Haodong Zhao, Hao Li, Jiansong Chen, Ke Zeng, Xunliang Cai
- Plugging Schema Graph into Multi-Table QA: A Human-Guided Framework for Reducing LLM Reliance
Xixi Wang, Miguel Costa, Jordanka Kovaceva, Shuai Wang, Francisco C. Pereira
- Evolution in Simulation: AI-Agent School with Dual Memory for High-Fidelity Educational Dynamics
Sheng Jin, Haoming Wang, Zhiqi Gao, Yongbo Yang, Bao Chunjia, Chengliang Wang
- Retrieval-Augmented Machine Translation with Unstructured Knowledge
Jiaan Wang, Fandong Meng, Yingxue Zhang, Jie Zhou
- MARS-Bench: A Multi-turn Athletic Real-world Scenario Benchmark for Dialogue Evaluation
Chenghao Yang, Yinbo Luo, Zhoufutu Wen, Qi Chu, Tao Gong, Longxiang Liu, Kaiyuan Zhang, Jianpeng Jiao, Ge Zhang, Wenhao Huang, Nenghai Yu
- UTMath: A Benchmark for Math Evaluation with Unit Test
Bo Yang, Qingping Yang, Yingwei Ma, Runtao Liu
- The Green KNIGHT: Green Machine Translation with Knowledge-Distilled, Narrow, Inexpensive, Greedy, Hybrid Transformers
Andreas Guta, Frithjof Petrick, Peter Polák
- Constructing Your Model’s Value Distinction: Towards LLM Alignment with Anchor Words Tuning
Zhen Yang, Ping Jian, Chengzhi Li, Chenxu Wang, Xinyue Zhang, Wenpeng Lu
- MCiteBench: A Multimodal Benchmark for Generating Text with Citations
Caiyu Hu, Yikai Zhang, Tinghui Zhu, Yiwei Ye, Yanghua Xiao
- Do LLMs Know and Understand Domain Conceptual Knowledge?
Sijia Shen, Feiyan Jiang, Peiyan Wang, Yuchen Jiang, Chang Liu, Yubo Feng
- Agent Laboratory: Using LLM Agents as Research Assistants
Samuel Schmidgall, Yusheng Su, Ze Wang, Ximeng Sun, Jialian Wu, Xiaodong Yu, Jiang Liu, Michael Moor, Zicheng Liu, Emad Barsoum
- Retrieval-Augmented Generation with Hierarchical Knowledge
Haoyu Huang, Yongfeng Huang, Yang Junjie, Zhenyu Pan, Yongqiang Chen, Kaili Ma, Hongzhi Chen, James Cheng
- Regularized Contrastive Decoding with Hard Negative Samples for LLM Hallucination Mitigation
Haonan Sheng, Dou Hu, Lingwei Wei, Wei Zhou, Songlin Hu
- CharacterCraft: Bridging the Literature-Reality Dialogue Gap for Practical Role-Playing Agents
Xuyan Yin, Xinran Yang, Zihao Li, Lixin Zou, Chenliang Li
- Drift: Decoding-time Personalized Alignments with Implicit User Preferences
Minbeom Kim, Kang-il Lee, Seongho Joo, Hwaran Lee, Thibaut Thonet, Kyomin Jung
- Discovering Semantic Subdimensions through Disentangled Conceptual Representations
Yunhao Zhang, Shaonan Wang, Nan Lin, Xinyi Dong, Chong Li, Chengqing Zong
- Identifying Aspects in Peer Reviews
Sheng Lu, Ilia Kuznetsov, Iryna Gurevych
- Tree-Structured Non-Autoregressive Decoding for Sequence-to-Sequence Text Generation
Pengyu Ji, Yufei Liu, Xiang Hu, Kewei Tu
- Towards More Efficient Post-training via Fourier Domain Adapter Framework
Yijia Fan, Jusheng Zhang, Keze Wang
- KERAG: Knowledge-Enhanced Retrieval-Augmented Generation for Advanced Question Answering
Yushi Sun, Kai Sun, Yifan Ethan Xu, Xiao Yang, Xin Luna Dong, Nan Tang, Lei Chen
- Not All Features Deserve Attention: Graph-Guided Dependency Learning for Tabular Data Generation with Language Models
Zheyu Zhang, Shuo Yang, Bardh Prenkaj, Gjergji Kasneci
- CCG: Rare-Label Prediction via Neural SEM–Driven Causal Game
Yijia Fan, Jusheng Zhang, Kaitong Cai, Jing Yang, Keze Wang
- Multimodal Emotion Recognition in Conversations: A Survey of Methods, Trends, Challenges and Prospects
ChengYan Wu, Yiqiang Cai, Yang Liu, Pengxu Zhu, Yun Xue, Ziwei Gong, Julia Hirschberg, Bolei Ma
- When Allies Turn Foes: Exploring Group Characteristics of LLM-Based Multi-Agent Collaborative Systems Under Adversarial Attacks
Jiahao Zhang, Baoshuo Kan, Tao Gong, Fu Lee Wang, Tianyong Hao
- EditID: Training-Free Editable ID Customization for Text-to-Image Generation
Guandong Li, Zhaobin Chu
- OSC: Cognitive Orchestration through Dynamic Knowledge Alignment in Multi-Agent LLM Collaboration
Jusheng Zhang, Yijia Fan, Kaitong Cai, Xiaofei Sun, Keze Wang
- VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format
Yueqian Wang, Xiaojun Meng, Yuxuan Wang, Jianxin Liang, Jiansheng Wei, Huishuai Zhang, Dongyan Zhao
- To Answer or Not to Answer (TAONA): A Robust Textual Graph Understanding and Question Answering Approach
Yuchen Yan, Aakash Kolekar, Sahika Genc, Wenju Xu, Edward W Huang, Anirudh Srinivasan, Mukesh Jain, Qi He, Hanghang Tong
- Understanding Refusal in Language Models with Sparse Autoencoders
Yeo Wei Jie, Nirmalendu Prakash, Clement Neo, Ranjan Satapathy, Roy Ka-Wei Lee, Erik Cambria
- Where Did That Come From? Sentence-Level Error-Tolerant Attribution
Ori Ernst, Aviv Slobodkin, Meng Cao, Sihui Wei, Jackie CK Cheung
- Alleviating Performance Degradation Caused by Out-of-Distribution Issues in Embedding-Based Retrieval
Haotong Bao, Jianjin Zhang, Qi Chen, Weihao Han, Zhengxin Zeng, Ruiheng Chang, Mingzheng Li, Hao Sun, Weiwei Deng, Feng Sun, Qi Zhang
- Can LLMs Find a Needle in a Haystack? A Look at Anomaly Detection Language Modeling
Leslie Barrett, Vikram Sunil Bajaj, Robert John Kingan
- Beyond Single Frames: Can LMMs Comprehend Implicit Narratives in Comic Strip?
Xiaochen Wang, Heming Xia, Jialin Song, Longyu Guan, Qingxiu Dong, Yixin Yang, Weiyao Luo, Yifan Pu, Yiru Wang, Xiangdi Meng, Wenjie Li, Zhifang Sui
- Enhancing Multi-Agent Debate System Performance via Confidence Expression
Zijie Lin, Bryan Hooi
- The Face of Persuasion: Analyzing Bias and Generating Culture-Aware Ads
Aysan Aghazadeh, Adriana Kovashka
- SIFT: Grounding LLM Reasoning in Contexts via Stickers
Zihao Zeng, Xuyao Huang, Boxiu Li, Zhijie Deng
- When Inverse Data Outperforms: Exploring the Pitfalls of Mixed Data in Multi-Stage Fine-Tuning
Mengyi Deng, Xin Li, Tingyu Zhu, Zhicheng Yang, Zhijiang Guo, Wei Wang
- LUME: LLM Unlearning with Multitask Evaluations
Anil Ramakrishna, Yixin Wan, Xiaomeng Jin, Kai-Wei Chang, Zhiqi Bu, Bhanukiran Vinzamuri, Volkan Cevher, Mingyi Hong, Rahul Gupta
- How do Language Models Generate Slang: A Systematic Comparison between Human and Machine-Generated Slang Usages
Siyang Wu, Zhewei Sun
- Bridging the Dynamic Perception Gap: Training-Free Draft Chain-of-Thought for Dynamic Multimodal Spatial Reasoning
Siqu Ou, Hongcheng Liu, Pingjie Wang, Yusheng Liao, Chuan Xuan, Yanfeng Wang, Yu Wang
- MedCOD: Enhancing English-to-Spanish Medical Translation of Large Language Models Using Enriched Chain-of-Dictionary Framework
Md Shahidul Salim, Lian Fu, Arav Adikesh Ramakrishnan, Zonghai Yao, Hong Yu
- Chatbot To Help Patients Understand Their Health
Won Seok Jang, Hieu Tran, Manav Shaileshkumar Mistry, Sai Kiran Gandluri, Yifan Zhang, Sharmin Sultana, Sunjae Kwon, Zonghai Yao, Hong Yu
- A Knapsack by Any Other Name: Presentation impacts LLM performance on NP-hard problems
Alex Duchnowski, Ellie Pavlick, Alexander Koller
- Is Safety Standard Same for Everyone? User-Specific Safety Evaluation of Large Language Models
Yeonjun In, Wonjoong Kim, Kanghoon Yoon, Sungchul Kim, Mehrab Tanjim, Sangwu Park, Kibum Kim, Chanyoung Park
- Jailbreak Attack Initializations as Extractors of Compliance Directions
Amit LeVi, Rom Himelstein, Yaniv Nemcovsky, Avi Mendelson, Chaim Baskin
- Train Once for All: A Transitional Approach for Efficient Aspect Sentiment Triplet Extraction
Xinmeng Hou, Lingyue Fu, Chenhao Meng, Kounianhua Du, Hai Hu
- A Comprehensive Survey on the Trustworthiness of Large Language Models in Healthcare
Manar Aljohani, Jun Hou, Sindhura Kommu, Xuan Wang
- Self-Correction Makes LLMs Better Parsers
Ziyan Zhang, Yang Hou, Chen Gong, Zhenghua Li
- Explaining Length Bias in LLM-Based Preference Evaluations
Zhengyu Hu, Linxin Song, Jieyu Zhang, Zheyuan Xiao, Tianfu Wang, Zhengyu Chen, Nicholas Jing Yuan, Jianxun Lian, Kaize Ding, Hui Xiong
- Investigating Controversy Framing across Topics on Social Media
Maxwell Weinzierl, Sanda M. Harabagiu
- HEAL: Hybrid Enhancement with LLM-based Agents for Text-attributed Hypergraph Self-supervised Representation Learning
Ruochang Li, Xiao Luo, Zhiping Xiao, Wei Ju, Ming Zhang
- ReMamba: Equip Mamba with Effective Long-Sequence Modeling
Danlong Yuan, Jiahao Liu, Bei Li, Huishuai Zhang, Jingang Wang, Xunliang Cai, Dongyan Zhao
- QUITO-X: A New Perspective on Context Compression from the Information Bottleneck Theory
Yihang Wang, Xu Huang, Bowen Tian, Yueyang Su, Lei Yu, Huaming Liao, Yixing Fan, Jiafeng Guo, Xueqi Cheng
- Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers
Yingyu Liang, Heshan Liu, Zhenmei Shi, Zhao Song, Zhuoyan Xu, Jiale Zhao, Zhen Zhuang
- Mitigating Gender Bias via Fostering Exploratory Thinking in LLMs
Kangda Wei, Hasnat Md Abdullah, Ruihong Huang
- Beyond the Textual: Generating Coherent Visual Options for MCQs
Wanqiang Wang, Longzhu He, Wei Zheng
- SafeSwitch: Steering Unsafe LLM Behavior via Internal Activation Signals
Peixuan Han, Cheng Qian, Xiusi Chen, Yuji Zhang, Denghui Zhang, Heng Ji
- MADD: Multi-Agent Drug Discovery Orchestra
Gleb Vitalevich Solovev, Alina Borisovna Zhidkovskaya, Anastasia Orlova, Nina Gubina, Anastasia Vepreva, Rodion Golovinskii, Ilya Tonkii, Ivan Dubrovsky, Ivan Gurev, Dmitry Gilemkhanov, Denis Chistiakov, Timur A. Aliev, Ivan Poddiakov, Galina Zubkova, Ekaterina V. Skorb, Vladimir Vinogradov, Alexander Boukhanovsky, Nikolay Nikitin, Andrei Dmitrenko, Anna Kalyuzhnaya, Andrey Savchenko
- PersonaGym: Evaluating Persona Agents and LLMs
Vinay Samuel, Henry Peng Zou, Yue Zhou, Shreyas Chaudhari, Ashwin Kalyan, Tanmay Rajpurohit, Ameet Deshpande, Karthik R Narasimhan, Vishvak Murahari
- LM2Protein: A Structure-to-Token Protein Large Language Model
Chang Zhou, Jiyue Jiang, Pengan Chen, Yuheng Shan, Xiangyu Shi, Zikang Wang, Yanting Li
- How Well Can Reasoning Models Identify and Recover from Unhelpful Thoughts?
Sohee Yang, Sang-Woo Lee, Nora Kassner, Daniela Gottesman, Sebastian Riedel, Mor Geva
- From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval
Dohyeon Lee, Yeonseok Jeong, seung-won hwang
- Locate-then-Merge: Neuron-Level Parameter Fusion for Mitigating Catastrophic Forgetting in Multimodal LLMs
Zeping Yu, Sophia Ananiadou
- Improving Influence-based Instruction Tuning Data Selection for Balanced Learning of Diverse Capabilities
Qirun Dai, Dylan Zhang, Jiaqi W. Ma, Hao Peng
- Diagnosing Moral Reasoning Acquisition in Language Models: Pragmatics and Generalization
Guangliang Liu, Zimo Qi, Xitong Zhang, Lei Jiang, Kristen Johnson
- Discourse Heuristics For Paradoxically Moral Self-Correction
Guangliang Liu, Zimo Qi, Xitong Zhang, Kristen Johnson
- Invisible Prompts, Visible Threats: Malicious Font Injection in External Resources for Large Language Models
Junjie Xiong, Changjia Zhu, Shuhang Lin, Chong Zhang, Yongfeng Zhang, Yao Liu, Lingyao Li
- Turning the Tide: Repository-based Code Reflection
Wei Zhang, Jian Yang, Jiaxi Yang, Ya Wang, Zhoujun Li, Zeyu Cui, Binyuan Hui, Junyang Lin
- Reinforcement Learning with Supervised Alignment
João Luís Lins, Jia Xu
- EmByte: Decomposition and Compression Learning for Small yet Private NLP
Shenglan Li, Jia Xu, Mengjiao Zhang
- GUARD: Glocal Uncertainty-Aware Robust Decoding for Effective and Efficient Open-Ended Text Generation
Yuanhao Ding, Esteban Garces Arias, Meimingwei Li, Julian Rodemann, Matthias Aßenmacher, Danlu Chen, Gaojuan Fan, Christian Heumann, Chongsheng Zhang
- Efficiently Editing Mixture-of-Experts Models with Compressed Experts
Yifei He, Yang Liu, Chen Liang, Hany Hassan Awadalla
- FinGEAR: Financial Mapping-Guided Enhanced Answer Retrieval
Ying Li, Mengyu Wang, Miguel de Carvalho, Sotirios Sabanis, Tiejun Ma
- FM2DS: Few-Shot Multimodal Multihop Data Synthesis with Knowledge Distillation for Question Answering
Amirhossein Abaskohi, Spandana Gella, Giuseppe Carenini, Issam H. Laradji
- SQUARE: Unsupervised Retrieval Adaptation via Synthetic Data
Jinsung Yoon, Junhao Zeng, Sercan O Arik
- Knowledge-enhanced Multimodal ECG Representation Learning with Arbitrary-Lead Inputs
Che Liu, Cheng Ouyang, Zhongwei Wan, Haozhe Wang, Wenjia Bai, Rossella Arcucci
- Seeing Race, Feeling Bias: Emotion Stereotyping in Multimodal Language Models
Mahammed Kamruzzaman, Amanda Cercas Curry, Alba Cercas Curry, Flor Miriam Plaza-del-Arco
- AdaptMerge: Inference Time Adaptive Visual and Language-Guided Token Merging for Efficient Large Multimodal Models
Zahidul Islam, Mrigank Rochan
- Federated Retrieval-Augmented Generation: A Systematic Mapping Study
Abhijit Chakraborty, Chahana Dahal, Vivek Gupta
- A Survey of Pun Generation: Datasets, Evaluations and Methodologies
Yuchen Su, Yonghua Zhu, Ruofan Wang, Zijian Huang, Diana Benavides-Prado, Michael J. Witbrock
- Evaluating the Robustness and Accuracy of Text Watermarking Under Real-World Cross-Lingual Manipulations
Mansour Al Ghanim, Jiaqi Xue, Rochana Prih Hastuti, Mengxin Zheng, Yan Solihin, Qian Lou
- HDiff: Confidence-Guided Denoising Diffusion for Robust Hyper-relational Link Prediction
Xiangfeng Luo, Ruoxin Zheng, Jianqiang Huang, Hang Yu
- Spotlighter: Revisiting Prompt Tuning from a Representative Mining View
Yutong Gao, Maoyuan Shao, Xinyang Huang, Chuang Zhu, Yu Weng, Xuan Liu, Lijuan Sun, Guoshun Nan
- Offloaded Reasoning: Efficient Inference for Large Language Models via Modular Reasoning and Refinement
Ishan Jindal, Jayant Taneja, Badrinath Chandana, Vikas Kapur, Sachin Dev Sharma
- Wait, We Don’t Need to “Wait”! Removing Thinking Tokens Improves Reasoning Efficiency
Chenlong Wang, Yuanning Feng, Dongping Chen, Zhaoyang Chu, Ranjay Krishna, Tianyi Zhou
- Towards Reverse Engineering of Language Models: A Survey
Xinpeng Ti, Wentao Ye, Zhifang Zhang, Junbo Zhao, Chang Yao, Lei Feng, Haobo Wang
- LIFTED: Multimodal Clinical Trial Outcome Prediction via Large Language Models and Mixture-of-Experts
Wenhao Zheng, Liaoyaqi Wang, Dongshen Peng, Hongxia Xu, Yun Li, Hongtu Zhu, Tianfan Fu, Huaxiu Yao
- Addition in Four Movements: Mapping Layer-wise Information Trajectories in LLMs
Yao Yan
- CoMoE: Contrastive Representation for Mixture-of-Experts in Parameter-Efficient Fine-tuning
Jinyuan Feng, ChaoPeng Wei, Tenghai Qiu, Tianyi Hu, Zhiqiang Pu
- GuiLoMo: Allocating Experts and Ranks for LoRA-MoE via Bilevel Optimization with GuidedSelection Vectors
Xinrong Chen, Hengyuan Zhang, Yingmin Qiu, Xiao Liang, Ziyue Li, Guanyu Wang, Weiping Li, Tong Mo, Wenyu Lv, Ngai Wong
- Rotate, Clip, and Partition: Towards W2A4KV4 Quantization by Integrating Rotation and Learnable Non-uniform Quantizer
Euntae Choi, Sumin Song, Woosang Lim, Sungjoo Yoo
- Decoding in Latent Spaces for Efficient Inference in LLM-based Recommendation
Chengbing Wang, Yang Zhang, Zhicheng Wang, Tianhao Shi, Keqin Bao, Fuli Feng, Tat-Seng Chua
- Forget for Get: A Lightweight Two-phase Gradient Method for Knowledge Editing in Large Language Models
Yanhong Li, Min Yang, Xiping Hu, Chengming Li
- AutoEvolve: Automatically Evolving Queries for Applicable and Scalable Retrieval-Augmented Generation Benchmarking
Ding-Chu Zhang, Xiaowen Zhang, Yue Fei, Renjun Hu, Xiao-Wen Yang, Zhi Zhou, Baixuan Li, Yu-Feng Li, Xing Shi, Wei Lin
- Temporal Alignment of Time Sensitive Facts with Activation Engineering
Sanjay Govindan, Maurice Pagnucco, Yang Song
- ChronoBias: A Benchmark for Evaluating Temporal Group Bias in the Time-sensitive Knowledge of Large Language Models
Kyungmin Kim, Youngbin Choi, Hyounghun Kim, Dongwoo Kim, Sangdon Park
- MC^2: A Minimum-Coverage and Dataset-Agnostic Framework for Compositional Generalization of LLMs on Semantic Parsing
Ziyao Xu, Zhe Yang, Houfeng Wang
- Learning to Instruct: Fine-Tuning a Task-Aware Instruction Optimizer for Black-Box LLMs
Yunzhe Qi, Jinjin Tian, Tianci Liu, Ruirui Li, Tianxin Wei, Hui Liu, Xianfeng Tang, Monica Xiao Cheng, Jingrui He
- Enriching Patent Claim Generation with European Patent Dataset
Lekang Jiang, Chengzu Li, Stefan Goetz
- StepKE: Stepwise Knowledge Editing for Multi-Hop Question Answering
Jaewook Lee, Dahyun Jung, Heuiseok Lim
- AutoDCWorkflow: LLM-based Data Cleaning Workflow Auto-Generation and Benchmark
Lan Li, Liri Fang, Bertram Ludäscher, Vetle I Torvik
- Hidden Ghost Hand: Unveiling Backdoor Vulnerabilities in MLLM-Powered Mobile GUI Agents
Pengzhou Cheng, Haowen Hu, Zheng Wu, Zongru Wu, Tianjie Ju, Daizong Ding, Zhuosheng Zhang, Gongshen Liu
- Scale Down to Speed Up: Dynamic Data Selection for Reinforcement Learning
Zhuoyue Chen, Jihai Zhang, Ben Liu, Fangquan Lin, Wotao Yin
- Towards Efficient CoT Distillation: Self-Guided Rationale Selector for Better Performance with Fewer Rationales
JianZhi Yan, Le Liu, Youcheng Pan, Shiwei Chen, Yang Xiang, Buzhou Tang
- GeoDANO: Geometric VLM with Domain Agnostic Vision Encoder
Seunghyuk Cho, Zhenyue Qin, Yang Liu, Youngbin Choi, Seungbeom Lee, Dongwoo Kim
- Leveraging 3D Gaussian for Temporal Knowledge Graph Embedding
Jiang Li, Xiangdong Su, Guanglai Gao
- LLMAP: LLM-Assisted Multi-Objective Route Planning with User Preferences
Liangqi Yuan, Dong-Jun Han, Christopher Brinton, Sabine Brunswicker
- ZEBRA: Leveraging Model-Behavioral Knowledge for Zero-Annotation Preference Dataset Construction
Jeesu Jung, Chanjun Park, Sangkeun Jung
- Token Knowledge: A New Perspective For Knowledge in Large Language Models
Jieyong Wang, Chunyao Song, Tingjian Ge
- Adaptive Schema-aware Event Extraction with Retrieval-Augmented Generation
Sheng Liang, Hang Lv, Zhihao Wen, Yaxiong Wu, Yongyue Zhang, Hao Wang, Yong Liu
- Enhancing Attributed Question Answering using Tailored Progressive Curriculum Learning
Yuhan Chen, Bowei Zou, Yifan Fan, Yuchong Chen, Shujun Cao, Yu Hong
- REAR: Reinforced Reasoning Optimization for Event Argument Extraction with Relation-Aware Support
Jianwen Luo, Yu Hong, Shuai Yang, Jianmin YAO
- COMI-LINGUA: Expert Annotated Large-Scale Dataset for Multitask NLP in Hindi-English Code-Mixing
Rajvee Sheth, Himanshu Beniwal, Mayank Singh
- Nine Ways to Break Copyright Law and Why Our LLM Won’t: A Fair Use Aligned Generation Framework
Aakash Sen Sharma, Debdeep Sanyal, Priyansh Srivastava, Sundar Athreya H, Shirish Karande, Mohan Kankanhalli, Murari Mandal
- InteractSpeech: A Speech Dialogue Interaction Corpus for Spoken Dialogue Model
Yifu Chen, Shengpeng Ji, Ziqing Wang, Hanting Wang, Zhou Zhao
- Enhancing SQL Table Acquisition with Reverse Engineering for Text-to-SQL
Shixin Liu, Haoyu Xu, Yu Hong
- DynamicKV: Task-Aware Adaptive KV Cache Compression for Long Context LLMs
Xiabin Zhou, Wenbin Wang, Minyan Zeng, Jiaxian Guo, Xuebo Liu, Li Shen, Min Zhang, Liang Ding
- ASD-iLLM: An Intervention Large Language Model for Autistic Children based on Real Clinical Dialogue Intervention Dataset
Shuzhong Lai, Chenxi Li, Junhong Lai, Yucun Zhong, Chenyu Yan, Xiang Li, Haifeng Li, Gang Pan, Lin Yao, Yueming Wang
- GDLLM: A Global Distance-aware Modeling Approach Based on Large Language Models for Event Temporal Relation Extraction
Jie Zhao, Wanting Ning, Yuxiao Fei, Yubo Feng, Lishuang Li
- More Tokens, Lower Precision: Towards the Optimal Token-Precision Trade-off in KV Cache Compression
Jiebin Zhang, Dawei Zhu, Yifan Song, Wenhao Wu, Chuqiao Kuang, Xiaoguang Li, Lifeng Shang, Qun Liu, Sujian Li
- cAST: Enhancing Code Retrieval-Augmented Generation with Structural Chunking via Abstract Syntax Tree
Yilin Zhang, Xinran Zhao, Zora Zhiruo Wang, Chenyang Yang, Jiayi Wei, Tongshuang Wu
- A Group Fairness Lens for Large Language Models
Guanqun Bi, Yuqiang Xie, Lei Shen, Yanan Cao
- VLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced Reranking and Noise-injected Training
Zhanpeng Chen, Chengjin Xu, Yiyan Qi, Xuhui Jiang, Jian Guo
- Rethinking DPO: The Role of Rejected Responses in Preference Misalignment
Jae Hyeon Cho, JunHyeok Oh, Myunsoo Kim, Byung-Jun Lee
- Enhancing Recommendation Explanations through User-Centric Refinement
Jingsen Zhang, Zihang Tian, Xueyang Feng, Xu Chen, Chong Chen
- Distributional Surgery for Language Model Activations
Bao Nguyen, Binh Nguyen, Duy Nguyen, Viet Anh Nguyen
- Improving Alignment in LVLMs with Debiased Self-Judgment
Sihan Yang, Chenhang Cui, Zihao Zhao, Yiyang Zhou, Weilong Yan, Ying Wei, Huaxiu Yao
- Low-Confidence Gold: Refining Low-Confidence Samples for Efficient Instruction Tuning
Hongyi Cai, jie li, Mohammad Mahdinur Rahman, Wenzhen Dong
- Safeguarding Privacy of Retrieval Data against Membership Inference Attacks: Is This Query Too Close to Home?
Yujin Choi, Youngjoo Park, Junyoung Byun, Jaewook Lee, Jinseong Park
- Causal-LLM: A Unified One-Shot Framework for Prompt- and Data-Driven Causal Graph Discovery
Amartya Roy, N Devharish, Shreya Ganguly, Kripabandhu Ghosh
- LRPLAN: A Multi-Agent Collaboration of Large Language and Reasoning Models for Planning with Implicit & Explicit Constraints
T Karthikeyan, Om Dehlan, Mausam, Manish Gupta
- DLPO: Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Learning Perspective
Dengyun Peng, Yuhang Zhou, Qiguang Chen, JinHao Liu, Jingjing Chen, Libo Qin
- Towards Robust Few-Shot Relation Classification: Incorporating Relation Description with Agreement
Mengting Hu, Jianfeng Wu, Ming Jiang, Yalan Xie, Zhunheng Wang, Rui Ying, Xiaoyi Liu, Ruixuan Xu, Hang Gao, Renhong Cheng
- For a Fistful of Puns: Evaluating a Puns in Multiword Expressions Identification Algorithm Without Dedicated Dataset
Julien Bezançon, Gaël Lejeune
- Watermarking for Factuality: Guiding Vision-Language Models Toward Truth via Tri-layer Contrastive Decoding
Kyungryul Back, Seongbeom Park, Milim Kim, Mincheol Kwon, SangHyeok Lee, Hyunyoung Lee, Junhee Cho, Seunghyun Park, Jinkyu Kim
- Are the Reasoning Models Good at Automated Essay Scoring?
Lui Yoshida
- Rethinking LLM-Based Recommendations: A Personalized Query-Driven Parallel Integration
Donghee Han, Hwanjun Song, Mun Yong Yi
- RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation
Aviv Slobodkin, Hagai Taitelbaum, Yonatan Bitton, Brian Gordon, Michal Sokolik, Nitzan Bitton Guetta, Almog Gueta, Royi Rassin, Dani Lischinski, Idan Szpektor
- What data should I include in my POS tagging training set?
Zoey Liu, Masoud Jasbi, Christan Grant, Kenji Sagae, Emily Prud’hommeaux
- AttnComp: Attention-Guided Adaptive Context Compression for Retrieval-Augmented Generation
Lvzhou Luo, Yixuan Cao, Ping Luo
- SafeInt: Shielding Large Language Models from Jailbreak Attacks via Safety-Aware Representation Intervention
Jiaqi Wu, Chen Chen, Chunyan Hou, Xiaojie Yuan
- Staged Knowledge Distillation Through Least-to-Most Prompting: Optimizing Teacher Guidance via Difficulty-Aware Training
Mengxiang Zhang, Lingyuan Liu
- LLM Distillation for Efficient Few-Shot Multiple Choice Question Answering
Patrick Sutanto, Joan Santoso, Esther Irawati Setiawan, Aji Prasetya Wibawa
- Teaching LLMs to Plan, Not Just Solve: Plan Learning Boosts LLMs Generalization in Reasoning Tasks
Tianlong Wang, Junzhe Chen, Weibin Liao, Xueting Han, Jing Bai
- FedCoT: Federated Chain-of-Thought Distillation for Large Language Models
Tao Fan, Weijing Chen, Yan Kang, Guoqiang Ma, Hanlin Gu, Yuanfeng SONG, Lixin Fan, Qiang Yang
- SalaMAnder: Shapley-based Mathematical Expression Attribution and Metric for Chain-of-Thought Reasoning
Yue Xin, Chen Shen, Shaotian Yan, Xiaosong Yuan, Yaoming Wang, Xiaofeng Zhang, Chenxi Huang, Jieping Ye
- Representing LLMs in Prompt Semantic Task Space
Idan Kashani, Avi Mendelson, Yaniv Nemcovsky
- PersLLM: A Personified Training Approach for Large Language Models
Zheni Zeng, Jiayi Chen, Huimin Chen, Yukun Yan, Yuxuan Chen, Zhenghao Liu, Zhiyuan Liu, Maosong Sun
- The Illusion of Randomness: How LLMs Fail to Emulate Stochastic Decision-Making in Rock-Paper-Scissors Games?
Zihao Guo, Hongtao Lv, Chaoli Zhang, Yibowen Zhao, Yixin Zhang, Lizhen Cui
- DAPE-BR: Distance-Aware Positional Encoding for Mitigating Object Hallucination in LVLMs
Mingrui Xie, Tianxiang Xu, Qianhai Tang, Shanming Yao, Xiaofeng Zhang, Junliang Du
- From Confidence to Collapse in LLM Factual Robustness
Alina Fastowski, Bardh Prenkaj, Gjergji Kasneci
- CtrlNews: LLM-based Multi-Agent Controllable News Writing via Knowledge Gravitational Field
Yifei Xu, Yingjie Zong, Wang Zhonghua, Sirui Wu, Yuan Rao, Dan Zhang, Shuiguang Deng
- Joint Enhancement of Relational Reasoning for Long-Context LLMs
Zhirui Chen, Wei Shen, Jiashui Huang, Ling Shao
- Training Medical QA Models Based on Mixed Rewards from Multiple-Choice and Open-Ended Questions
Yue Qiu, Yujan Ting, Pei Dong, Terrence Chen, Weijing Huang
- Rethink Rumor Detection in the Era of LLMs: A Review
Chang Yang, Peng Zhang, Jing Zhang, Hui Gao, Changhao Song
- ScholarBench: A Bilingual Benchmark for Abstraction, Comprehension, and Reasoning Evaluation in Academic Contexts
Dongwon Noh, Donghyeok Koh, Junghun Yuk, Gyuwan Kim, JAE YONG LEE, KyungTae Lim, Cheoneum Park
- MAGIC: A Multi-Hop and Graph-Based Benchmark for Inter-Context Conflicts in Retrieval-Augmented Generation
Jungyeon Lee, Lee Kangmin, Taeuk Kim
- Align Attention Heads Before Merging Them: An Effective Way for Converting MHA to GQA
Qingyun Jin, Xiaohui Song, Feng Zhou, Zengchang Qin
- DRBO: Mitigating the Bottleneck Effect via Dynamic Reward Balancing in Multi-reward LLM Optimization
Nuo Chen, Yufei Gao, Yongnan Jin, Yan Hu, Anningzhe Gao, Lingyong Yan, Benyou Wang
- Enhancing LLM Knowledge Learning through Generalization
Mingkang Zhu, Xi Chen, Zhongdao Wang, Bei Yu, Hengshuang Zhao, Jiaya Jia
- FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient Training R1-like Reasoning Models
Mingyang Song, Mao Zheng, Zheng Li, Wenjie Yang, Xuan Luo
- TR-MTEB: A Comprehensive Benchmark and Embedding Model Suite for Turkish Sentence Representations
Mehmet Selman Baysan, Tunga Gungor
- ImpRAG: Retrieval-Augmented Generation with Implicit Queries
Wenzheng Zhang, Xi Victoria Lin, Karl Stratos, Wen-tau Yih, Mingda Chen
- HEAL: A Hypothesis-Based Preference-Aware Analysis Framework
Yifu Huo, Chenglong Wang, Qiren Zhu, Shunjie Xing, Tong Xiao, Chunliang Zhang, Tongran Liu, JingBo Zhu
- A Survey of Multilingual Reasoning in Language Models
Akash Ghosh, Debayan Dutta, Sriparna Saha, Chirag Agarwal
- CLEAR: A Framework Enabling Large Language Models to Discern Confusing Legal Paragraphs
Qi Xu, Qian Liu, Hao Fei, Hang Yu, Shuhao Guan, Xiao Wei
- NAP^2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from Human
Shuo Huang
- Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents
Long Li, Weiwen Xu, Jiayan Guo, Ruochen Zhao, Xingxuan Li, Yuqian Yuan, Boqiang Zhang, Yuming Jiang, Yifei Xin, Ronghao Dang, Yu Rong, Deli Zhao, Tian Feng, Lidong Bing
- Unveiling Multimodal Processing: Exploring Activation Patterns in Multimodal LLMs for Interpretability and Efficiency
Chuan Wu, Meng Su, Youxuan Fang, shaolin Zhu
- Self-Supervised Prompt Optimization
Jinyu Xiang, Jiayi Zhang, Zhaoyang Yu, Xinbing Liang, Fengwei Teng, Jinhao Tu, Fashen Ren, Xiangru Tang, Sirui Hong, Chenglin Wu, Yuyu Luo
- Polish-English medical knowledge transfer: A new benchmark and results
Łukasz Grzybowski, Jakub Pokrywka, Michał Ciesiółka, Jeremi Ignacy Kaczmarek, Marek Kubis
- Hard Negatives, Hard Lessons: Revisiting Training Data Quality for Robust Information Retrieval with LLMs
Nandan Thakur, Crystina Zhang, Xueguang Ma, Jimmy Lin
- EventRelBench: A Comprehensive Benchmark for Evaluating Event Relation Understanding in Large Language Models
Jie Gong, Biaoshuai Zheng, qiwang hu
- S2LPP: Small-to-Large Prompt Prediction across LLMs
Liang Cheng, Tianyi Li, Zhaowei Wang, Mark Steedman
- DroidCall: A Dataset for LLM-powered Android Intent Invocation
Weikai Xie, Li Zhang, Shihe Wang, Rongjie Yi, Mengwei Xu
- Tool Zero: Training Tool-Augmented LLMs via Pure RL from Scratch
Yirong Zeng, Xiao Ding, Yutai Hou, Yuxian Wang, Li Du, Juyi Dai, Qiuyang Ding, Duyu Tang, Dandan Tu, Weiwen Liu, Bing Qin, Ting Liu
- INREACT: An Inspire-Then-Reinforce Training Framework For Multimodal GUI Agent
Yuanlei Wang, liuzhou zhang, Haohao Luo, Ying Shen
- Facts Fade Fast: Evaluating Memorization of Outdated Medical Knowledge in Large Language Models
Juraj Vladika, Mahdi Dhaini, Florian Matthes
- Zero-Shot Privacy-Aware Text Rewriting via Iterative Tree Search
Shuo Huang, Xingliang YUAN, Gholamreza Haffari, Lizhen Qu
- KoLEG: On-the-Fly Korean Legal Knowledge Editing with Continuous Retrieval
Jaehyung Seo, Dahyun Jung, Jaewook Lee, Yongchan Chun, Dongjun Kim, Hwijung Ryu, Donghoon Shin, Heuiseok Lim
- HARE: an entity and relation centric evaluation framework for histopathology reports
Yunsoo Kim, Michal Wen Sheue Ong, Alex Shavick, Honghan Wu, Adam P. Levine
- VeriFastScore: Speeding up long-form factuality evaluation
Rishanth Rajendhran, Amir Zadeh, Matthew Sarte, Chuan Li, Mohit Iyyer
- B-REASO: A Multi-Level Multi-Faceted Bengali Evaluation Suite for Foundation Models
Md Tanzib Hosain, Md Kishor Morol
- Extracting Conceptual Spaces from LLMs Using Prototype Embeddings
Nitesh Kumar, Usashi Chatterjee, Steven Schockaert
- FC-Attack: Jailbreaking Multimodal Large Language Models via Auto-Generated Flowcharts
Ziyi Zhang, Zhen Sun, Zongmin Zhang, Jihui Guo, Xinlei He
- Multilingual Data Filtering using Synthetic Data from Large Language Models
Jonas Waldendorf, Barry Haddow, Alexandra Birch, Mateusz Klimaszewski
- SAFE: A Sparse Autoencoder-Based Framework for Robust Query Enrichment and Hallucination Mitigation in LLMs
Samir Abdaljalil, Filippo Pallucchini, Andrea Seveso, HASAN KURBAN, Fabio Mercorio, Erchin Serpedin
- Soteria: Language-Specific Functional Parameter Steering for Multilingual Safety Alignment
Somnath Banerjee, Sayan Layek, Pratyush Chatterjee, Animesh Mukherjee, Rima Hazra
- LLMs as a synthesis between symbolic and distributed approaches to language
Gemma Boleda
- MIND: Towards Immersive Psychological Healing with Multi-Agent Inner Dialogue
Yujia Chen, Changsong Li, Yiming Wang, Tianjie Ju, Qingqing Xiao, Nan Zhang, Zifan Kong, Peng Wang, Binyu Yan
- A Monte-Carlo Sampling Framework For Reliable Evaluation of Large Language Models Using Behavioral Analysis
Davood Wadi, Marc Fredette
- Understanding How Value Neurons Shape the Generation of Specified Values in LLMs
Yi Su, Jiayi Zhang, Shu Yang, Xinhai Wang, Lijie Hu, Di Wang
- Likelihood Variance as Text Importance for Resampling Texts to Map Language Models
Momose Oyama, Ryo Kishino, Hiroaki Yamagiwa, Hidetoshi Shimodaira
- Think Twice, Generate Once: Enhancing LLMs Safety via Progressive Self-Reflection
Hoang Phan, Victor Li, Qi Lei
- Efficient Integration of External Knowledge to LLM-based World Models via Retrieval-Augmented Generation and Reinforcement Learning
Chang Yang, Xinrun Wang, Qinggang Zhang, Qi Jiang, Xiao Huang
- Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes
Tyler Loakman, William Thorne, Chenghua Lin
- Modeling, Evaluating, and Embodying Personality in LLMs: A Survey
Iago Alves Brito, Julia Soares Dollis, Fernanda Bufon Färber, Pedro Schindler Freire Brasil Ribeiro, Rafael Teixeira Sousa, Arlindo Rodrigues Galvão Filho
- Benchmarking the Detection of LLMs-Generated Modern Chinese Poetry
Shanshan Wang, Junchao Wu, Fengying Ye, Derek F. Wong, Jingming Yao, Lidia S. Chao
- Leveraging the Cross-Domain & Cross-Linguistic Corpus for Low Resource NMT: A Case Study On Bhili-Hindi-English Parallel Corpus
Pooja Singh, Shashwat Bhardwaj, Vaibhav Sharma, Sandeep Kumar
- Creative Preference Optimization
Mete Ismayilzada, Antonio Laverghetta Jr., Simone A. Luchini, Reet Patel, Antoine Bosselut, Lonneke van der Plas, Roger E. Beaty
- Assistant-Guided Mitigation of Teacher Preference Bias in LLM-as-a-Judge
Zhuo Liu, Moxin Li, Xun Deng, Qifan Wang, Fuli Feng
- Uplift-RAG: Uplift-Driven Knowledge Preference Alignment for Retrieval-Augmented Generation
Changle Qu, Sunhao Dai, Hengyi Cai, Yiyang Cheng, Jun Xu, Shuaiqiang Wang, Dawei Yin
- Sugar-Coated Poison: Benign Generation Unlocks Jailbreaking
Yuhang Wu, Yu-Jie Xiong, Hao Zhang, Jia-Chen Zhang, Zheng Zhou
- DivScene: Towards Open-Vocabulary Object Navigation with Large Vision Language Models in Diverse Scenes
Zhaowei Wang, Hongming Zhang, Tianqing Fang, Ye Tian, Yue Yang, Kaixin Ma, Xiaoman Pan, Yangqiu Song, Dong Yu
- Data-scarce Behavior Editing of Language Models
Joykirat Singh, Subhabrata Dutta, Tanmoy Chakraborty
- FIER: Fine-Grained and Efficient KV Cache Retrieval for Long-context LLM Inference
Dongwei Wang, Zijie Liu, Song Wang, Yuxin Ren, Jianing Deng, Jingtong Hu, Tianlong Chen, Huanrui Yang
- SVeritas: Benchmark for Robust Speaker Verification under Diverse Conditions
Massa Baali, Sarthak Bisht, Francisco Teixeira, Kateryna Shapovalenko, Rita Singh, Bhiksha Raj
- CAARMA: Class Augmentation with Adversarial Mixup Regularization
Massa Baali, Xiang Li, Hao Chen, Syed Abdul Hannan, Rita Singh, Bhiksha Raj
- Bringing Pedagogy into Focus: Evaluating Virtual Teaching Assistants’ Question-Answering in Asynchronous Learning Environments
Siyan Li, Zhen Xu, Vethavikashini Chithrra Raghuram, Xuanming Zhang, Renzhe Yu, Zhou Yu
- Demystifying Multilingual Reasoning in Process Reward Modeling
Weixuan Wang, Minghao Wu, Barry Haddow, Alexandra Birch
- BehaviorSFT: Behavioral Token Conditioning for Health Agents Across the Proactivity Spectrum
Yubin Kim, Zhiyuan Hu, Hyewon Jeong, Eugene W Park, Shuyue Stella Li, Chanwoo Park, Shiyun Xiong, MingYu Lu, Hyeonhoon Lee, Xin Liu, Daniel McDuff, Cynthia Breazeal, Samir Tulebaev, Hae Won Park
- LaMP-Cap: Personalized Figure Caption Generation With Multimodal Figure Profiles
Ho Yin Sam Ng, Ting-Yao Hsu, Aashish Anantha Ramakrishnan, Branislav Kveton, Nedim Lipka, Franck Dernoncourt, Dongwon Lee, Tong Yu, Sungchul Kim, Ryan A. Rossi, Ting-Hao Kenneth Huang
- Efficient Dynamic Clustering-Based Document Compression for Retrieval-Augmented-Generation
Weitao Li, Xiangyu Zhang, Kaiming Liu, Xuanyu Lei, Weizhi Ma, Yang Liu
- HebID: Detecting Social Identities in Hebrew-language Political Text
Guy Mor-Lan, Naama Rivlin-Angert, Yael R. Kaplan, Tamir Sheafer, Shaul R. Shenhav
- Dub-S2ST: Textless Speech-to-Speech Translation for Seamless Dubbing
Jeongsoo Choi, Jaehun Kim, Joon Son Chung
- FinGrAct: A Framework for FINe-GRrained Evaluation of ACTionability in Explainable Automatic Fact-Checking
Islam Eldifrawi, Shengrui Wang, Amine Trabelsi
- What Has Been Lost with Synthetic Evaluation?
Alexander Gill, Abhilasha Ravichander, Ana Marasovic
- Bold Claims or Self-Doubt? Factuality Hallucination Type Detection via Belief State
Dongyu Zhang, Qingqing Hong, Bingxuan Hou, Jiayi Lin, Chenyang Zhang, Jialin Li, Junli Wang
- Proxy Barrier: A Hidden Repeater Layer Defense Against System Prompt Leakage and Jailbreaking
Pedro Schindler Freire Brasil Ribeiro, Iago Alves Brito, Rafael Teixeira Sousa, Fernanda Bufon Färber, Julia Soares Dollis, Arlindo Rodrigues Galvão Filho
- AraSafe: Benchmarking Safety in Arabic LLMs
Hamdy Mubarak, Abubakr Mohamed, Majd Hawasly
- Nested Named Entity Recognition as Single-Pass Sequence Labeling
Alberto Muñoz-Ortiz, David Vilares, Caio Corro, Carlos Gómez-Rodríguez
- DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations
Aryo Pradipta Gema, Chen Jin, Ahmed Abdulaal, Tom Diethe, Philip Alexander Teare, Beatrice Alex, Pasquale Minervini, Amrutha Saseendran
- Catch Me If You Can? Not Yet: LLMs Still Struggle to Imitate the Implicit Writing Styles of Everyday Authors
Zhengxiang Wang, Nafis Irtiza Tripto, Solha Park, Zhenzhen Li, Jiawei Zhou
- Fine-Tuning Encoder-Decoder Models with Contrastive Learning for In-Context Distractor Generation
Elaf Alhazmi, Quan Z. Sheng, Wei Emma Zhang, Mohammed I. Thanoon, Haojie Zhuang, Behnaz Soltani, Munazza Zaib
- Conflicts in Texts: Data, Implications and Challenges
Siyi Liu, Dan Roth
- Recognizing Limits: Investigating Infeasibility in Large Language Models
Wenbo Zhang, Zihang Xu, Hengrui Cai
- VQA-Augmented Machine Translation with Cross-Modal Contrastive Learning
Zhihui Zhang, Shiliang Sun, Jing Zhao, Tengfei Song, Hao Yang
- Learning to Describe Implicit Changes: Noise-robust Pre-training for Image Difference Captioning
Zixin Guo, Jiayang Sun, Tzu-Jui Julius Wang, Abduljalil Radman, Selen Pehlivan, Min Cao, Jorma Laaksonen
- SOLAR: Serendipity Optimized Language Model Aligned for Recommendation
Zichen Yuan, Lifan Sun, Yucen Zhuang, Yue Wang, Xinyuan Song, Tianqi Xu, Siyuan Li, Junchen Fu, Youhua Li, Sirui Hong, Jiaqi Chen, Joemon M. Jose, Yongxin Ni
- AIRepr: An Analyst-Inspector Framework for Evaluating Reproducibility of LLMs in Data Science
Qiuhai Zeng, Claire Jin, Xinyue Wang, Yuhan Zheng, Qunhua Li
- MisinfoBench: A Multi-Dimensional Benchmark for Evaluating LLMs’ Resilience to Misinformation
Ye Yang, Donghe Li, Zuchen Li, Fengyuan Li, Jingyi Liu, Li Sun, Qingyu Yang
- Fuzzy Reasoning Chain (FRC): An Innovative Reasoning Framework from Fuzziness to Clarity
Ping Chen, Xiang Liu, Zhaoxiang Liu, Zezhou Chen, Xingpeng Zhang, Huan Hu, Zipeng Wang, Kai Wang, Shuming Shi, Shiguo Lian
- HighMATH: Evaluating Math Reasoning of Large Language Models in Breadth and Depth
Yan Liu, Minghui Zhang, Bojian Xiong, Yifan Xiao, Yinong Sun, Yating Mei, Longyu Zeng, Jingchao Yang, Yang Wang, Deyi Xiong
- CATCH: A Novel Data Synthesis Framework for High Therapy Fidelity and Memory-Driven Planning Chain of Thought in AI Counseling
Mingyu Chen, Jingkai Lin, Zhaojie Chu, Xiaofen Xing, Yirong Chen, Xiangmin Xu
- MediVLM: A Vision Language Model for Radiology Report Generation from Medical Images
Debanjan Goswami, Ronast Subedi, Shayok Chakraborty
- AdDriftBench: A Benchmark for Detecting Data Drift and Label Drift in Short Video Advertising
Yinghao Song, Xiangji Zeng, Shuai Cui, Lu Sun, Zhaowei Liu, Yuan Yuan, Hai Zhou, Zhaohan Gong
- NIM: Neuro-symbolic Ideographic Metalanguage for Inclusive Communication
Prawaal Sharma, Poonam Goyal, Navneet Goyal, Vidisha Sharma
- ViFT: Towards Visual Instruction-Free Fine-tuning for Large Vision-Language Models
Zikang Liu, Kun Zhou, Xin Zhao, Dawei Gao, Yaliang Li, Ji-Rong Wen
- Do Code Semantics Help? A Comprehensive Study on Execution Trace-Based Information for Code Large Language Models
Jian Jornbowrl Wang, Xiaofei Xie, Qiang Hu, Shangqing Liu, Yi Li
- LongWeave: A Long-Form Generation Benchmark Bridging Real-World Relevance and Verifiability
Zikai Xiao, Fei Huang, Jianhong Tu, Jianhui Wei, Wen MA, Yuxuan Zhou, Jian Wu, Bowen Yu, Zuozhu Liu, Junyang Lin
- XL-Suite: Cross-Lingual Synthetic Training and Evaluation Data for Open-Ended Generation
Vivek Iyer, Ricardo Rei, Pinzhen Chen, Alexandra Birch
- Accelerating LLM Reasoning via Early Rejection with Partial Reward Modeling
Seyyed Saeid Cheshmi, Azal Ahmad Khan, Xinran Wang, Zirui Liu, Ali Anwar
- CultureSynth: A Hierarchical Taxonomy-Guided and Retrieval-Augmented Framework for Cultural Question-Answer Synthesis
Xinyu Zhang, Pei Zhang, Shuang Luo, Jialong Tang, Yu Wan, Baosong Yang, Fei Huang
- DesignCLIP: Multimodal Learning with CLIP for Design Patent Understanding
Zhu Wang, Homaira Huda Shomee, Sathya N. Ravi, Sourav Medya
- R3-RAG: Learning Step-by-Step Reasoning and Retrieval for LLMs via Reinforcement Learning
Yuan Li, Qi Luo, Xiaonan Li, Bufan Li, Qinyuan Cheng, Bo Wang, Yining Zheng, Yuxin Wang, Zhangyue Yin, Xipeng Qiu
- ‘Hello, World!’: Making GNNs Talk with LLMs
Sunwoo Kim, Soo Yong Lee, Jaemin Yoo, Kijung Shin
- Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM
Dingjie Song, Sicheng Lai, Mingxuan Wang, Shunian Chen, Lichao Sun, Benyou Wang
- NLKI: A Lightweight Natural Language Knowledge Integration Framework for Improving Small VLMs in Commonsense VQA Tasks
Aritra Dutta, Swapnanil Mukherjee, Deepanway Ghosal, Somak Aditya
- Text or Pixels? Evaluating Efficiency and Understanding of LLMs with Visual Text Inputs
Yanhong Li, Zixuan Lan, Jiawei Zhou
- Assessing Socio-Cultural Alignment and Technical Safety of Sovereign LLMs
Kyubyung Chae, Gihoon Kim, Gyuseong Lee, Taesup Kim, Jaejin Lee, Heejin Kim
- Sample Efficient Alignment Learning With Episodic Control
Van Dai Do, Quan Hung Tran, Ahmed Kirmani, Lu Zhang, Hung Le
- Evaluating Automatic Speech Recognition Systems for Korean Meteorological Experts
ChaeHun Park, Hojun Cho, Jaegul Choo
- 3D-Aware Vision-Language Models Fine-Tuning with Geometric Distillation
Seonho Lee, Jiho Choi, Inha Kang, Jiwook Kim, Junsung Park, Hyunjung Shim
- CAPE: Context-Aware Personality Evaluation Framework for Large Language Models
Jivnesh Sandhan, Fei Cheng, Tushar Sandhan, Yugo Murawaki
- AgentThink: A Unified Framework for Tool-Augmented Chain-of-Thought Reasoning in Vision-Language Models for Autonomous Driving
Kangan Qian, Sicong Jiang, Yang Zhong, Ziang Luo, Zilin Huang, Tianze Zhu, Kun Jiang, mengmeng yang, Zheng Fu, Jinyu Miao, Yining Shi, He Zhe Lim, Li Liu, Tianbao Zhou, Hongyi Wang, Huang Yu, Yifei HU, Guang Li, Guang Chen, Hao Ye, Lijun Sun, Diange Yang
- Select to Know: An Internal-External Knowledge Self-Selection Framework for Domain-Specific Question Answering
Bolei He, Xinran He, Run Shao, Shanfu Shu, xianwei xue, MingQuan Cheng, Haifeng Li, Zhen-Hua Ling
- GenPTQ: Green Post-Training Quantization for Large-Scale ASR Models with Mixed-Precision Bit Allocation
Beom Jin Kang, Hyun Kim
- Where Does This Strange Smell Come from?: Enabling Conversational Interfaces for Artificial Olfaction
Xueyi Zhou, Qi Lu, Dong-Kyu Chae
- LightRAG: Simple and Fast Retrieval-Augmented Generation
ZIRUI GUO, Lianghao Xia, Yanhua Yu, Tu Ao, Chao Huang
- Beyond Distribution: Investigating Language Models’ Understanding of Sino-Korean Morphemes
Taehee Jeon
- Sarcasm-R1: Enhancing Sarcasm Detection through Focused Reasoning
Qi Yang, Jingjie Zeng, Kai Ma, Liang Yang, Hongfei Lin
- ISACL: Internal State Analyzer for Copyrighted Training Data Leakage
Guangwei Zhang, Qisheng Su, Jiateng Liu, Cheng Qian, Yanzhou Pan, Manling Li, Yanjie Fu, Zhaozhuo Xu, Denghui Zhang
- Steering LVLMs via Sparse Autoencoder for Hallucination Mitigation
Zhenglin Hua, Jinghan He, Zijun Yao, Tianxu Han, Haiyun Guo, Yuheng Jia, Junfeng Fang
- On the Perception Bottleneck of VLMs for Chart Understanding
Junteng Liu, Weihao Zeng, Xiwen Zhang, Yijun Wang, Zifei Shan, Junxian He
- Self-Guided Function Calling in Large Language Models via Stepwise Experience Recall
Sijia Cui, Aiyao He, Shuai Xu, Hongming Zhang, Yanna Wang, Qingyang Zhang, Yajing Wang, bo xu
- Multilingual Generative Retrieval via Cross-lingual Semantic Compression
Simeng Wu, Ran Song, Yuxin Huang, Yan Xiang, Yantuan Xian, Shengxiang Gao, Zhengtao Yu
- Towards Multi-Document Question Answering in Scientific Literature: Pipeline, Dataset, and Evaluation
Hui Huang, Julien Velcin, Yacine Kessaci
- Multilingual Knowledge Graph Completion via Efficient Multilingual Knowledge Sharing
Xiaofei Gao, Ran Song, Shizhu He, Cunli Mao, Shengxiang Gao, Kang Liu, Zhengtao Yu
- Mitigating Attention Localization in Small Scale: Self-Attention Refinement via One-step Belief Propagation
Nakyung Lee, Yeongoon Kim, Minhae Oh, Jin Woo Koo, Hyewon Jo, Jungwoo Lee
- Imagination and Contemplation: A Balanced Framework for Semantic-Augmented Multimodal Machine Translation
Zhuang Yu, Shiliang Sun, Jing Zhao, Tengfei Song, Hao Yang
- NeLLCom-Lex: A Neural-agent Framework to Study the Interplay between Lexical Systems and Language Use
Yuqing Zhang, Ecesu Ürker, Tessa Verhoef, Gemma Boleda, Arianna Bisazza
- RLMEval: Evaluating Research-Level Neural Theorem Proving
Auguste Poiroux, Antoine Bosselut, Viktor Kunčak
- KaeDe: Progressive Generation of Logical Forms via Knowledge-Aware Question Decomposition for Improved KBQA
Ranran Bu, Jian Cao, Jianqi Gao, Shiyou Qian, Hongming Cai
- Where Fact Ends and Fairness Begins: Redefining AI Bias Evaluation through Cognitive Biases
Jen-tse Huang, Yuhang Yan, Linqi LIU, Yixin Wan, Wenxuan Wang, Kai-Wei Chang, Michael R. Lyu
- Equal Truth: Rumor Detection with Invariant Group Fairness
Junyi Chen, Mengjia Wu, Qian Liu, Jing Sun, Ying Ding, Yi Zhang
- STEAM: A Semantic-Level Knowledge Editing Framework for Large Language Models
Geunyeong Jeong, Juoh Sun, Seonghee Lee, Harksoo Kim
- SoT: Structured-of-Thought Prompting Guides Multilingual Reasoning in Large Language Models
Rui Qi, Zhibo Man, Yufeng Chen, Fengran Mo, Jinan Xu, Kaiyu Huang
- How Reliable is Multilingual LLM-as-a-Judge?
Xiyan Fu, Wei Liu
- Cognitive-Level Adaptive Generation via Capability-Aware Retrieval and Style Adaptation
Qingsong Wang, Tao Wu, Wang Lin, Yueying Feng, Gongsheng Yuan, Chang Yao, Jingyuan Chen
- Data Doping or True Intelligence? Evaluating the Transferability of Injected Knowledge in LLMs
Essa Jan, Muhammad fareed zaffar, Yasir Zaki, Moiz Ali, Muhammad Saram Hassan
- INDOORWORLD: Integrating Physical Task Solving and Social Simulation in A Heterogeneous Multi-Agent Environment
Dekun Wu, Frederik Brudy, Bang Liu, Yi Wang
- ARXSA: A General Negative Feedback Control Theory in Vision-Language Models
Zeyu Zhang, Tianqi Chen, Yuki Todo
- Breaking the Attention Trap in Code LLMs: A Rejection Sampling Approach to Enhance Code Execution Prediction
Xingcheng Ruan, Haoxiang Geng, Yunhui Xia, Bingran Zhao
- HiMATE: A Hierarchical Multi-Agent Framework for Machine Translation Evaluation
Shijie Zhang, Renhao Li, Songsheng Wang, Philipp Koehn, Min Yang, Derek F. Wong
- ReliableEval: A Recipe for Stochastic LLM Evaluation via Method of Moments
Gili Lior, Eliya Habba, Shahar Levy, Avi Caciularu, Gabriel Stanovsky
- From Characters to Tokens: Dynamic Grouping with Hierarchical BPE
Rares Dolga, Lucas Maystre, Tudor Berariu, David Barber
- Auto-SLURP: A Benchmark Dataset for Evaluating Multi-Agent Frameworks in Smart Personal Assistant
Lei Shen, Xiaoyu Shen
- NER Retriever: Zero-Shot Named Entity Retrieval with Type-Aware Embeddings
Or Shachar, Uri Katz, Yoav Goldberg, Oren Glickman
- MMATH: A Multilingual Benchmark for Mathematical Reasoning
Wenyang Luo, Xin Zhao, Jing Sha, Shijin Wang, Ji-Rong Wen
- MultiClaimNet: A Massively Multilingual Dataset of Fact-Checked Claim Clusters
Rrubaa Panchendrarajan, Rubén Míguez Pérez, Arkaitz Zubiaga
- DS-MHP: Improving Chain-of-Thought through Dynamic Subgraph-Guided Multi-Hop Path
Yongqiang Liu, Wenjun Wang, Binrong Liu, Qiyao Peng, Hongtao Liu, XueWei Li
- LongTail-Swap: benchmarking language models’ abilities on rare words
Robin Algayres, Mahi Luthra, Jiayi Shen, Youssef Benchekroun, Dongyan Lin, Rashel Moritz, Juan Pino, Emmanuel Dupoux
- TF-Mamba: Text-enhanced Fusion Mamba with Missing Modalities for Robust Multimodal Sentiment Analysis
Xiang Li, Xianfu Cheng, Dezhuang Miao, Xiaoming Zhang, Zhoujun Li
- Are Economists Always More Introverted? Analyzing Consistency in Persona-Assigned LLMs
Manon Reusens, Bart Baesens, David Jurgens
- Can you SPLICE it together? A Human Curated Benchmark for Probing Visual Reasoning in VLMs
Mohamad Ballout, Okajevo Wilfred, Seyedalireza Yaghoubi, Nohayr Muhammad Abdelmoneim, Julius Mayer, Elia Bruni
- On the Effectiveness of Prompt-Moderated LLMs for Math Tutoring at the Tertiary Level
Sebastian Steindl, Fabian Brunner, Nada Sissouno, Dominik Schwagerl, Florian Schöler-Niewiera, Ulrich Schäfer
- SkewRoute: Training-Free LLM Routing for Knowledge Graph Retrieval-Augmented Generation via Score Skewness of Retrieved Context
Hairu Wang, Yuan Feng, Yukun Cao, Xike Xie, S Kevin Zhou
- Acquiescence Bias in Large Language Models
Daniel Braun
- Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games
Niv Eckhaus, Uri Berger, Gabriel Stanovsky
- How Sampling Affects the Detectability of Machine-written texts: A Comprehensive Study
Matthieu Dubois, François Yvon, Pablo Piantanida
- An Improved, Strong Baseline for Pre-Trained Large Language Models as Task-Oriented Dialogue Systems
Sebastian Steindl, André Kestler, Ulrich Schäfer, Bernd Ludwig
- MATCH: Task-Driven Code Evaluation through Contrastive Learning
Marah Ghoummaid, Vladimir Tchuiev, Ofek Glick, Michal Moshkovitz, Dotan Di Castro
- Evaluating Large Language Models for Cross-Lingual Retrieval
Longfei Zuo, Pingjun Hong, Oliver Kraus, Barbara Plank, Robert Litschko
- SGCD: Subtask-Guided Causal-Debiasing Framework for Robust Cross-Utterance Sentiment Quadruple Extraction in Dialogues
Xiang Li, Keyu Yao, Gang Shen
- FaMTEB: Massive Text Embedding Benchmark in Persian Language
Erfan Zinvandi, Morteza Alikhani, Mehran Sarmadi, Zahra Pourbahman, Sepehr Arvin, Reza Kazemi, Arash Amini
- Leveraging High-Resource English Corpora for Cross-lingual Domain Adaptation in Low-Resource Japanese Medicine via Continued Pre-training
Kazuma Kobayashi, Zhen Wan, Fei Cheng, Yuma Tsuta, Xin Zhao, Junfeng Jiang, Jiahao Huang, Zhiyi Huang, Yusuke Oda, Rio Yokota, Yuki Arase, Daisuke Kawahara, Akiko Aizawa, Sadao Kurohashi
- Structure Trumps Size: Rethinking Data Quality for LLM Reasoning
Hu Xu, Zeyan Li, Rui Wang, Jianfeng Xu
- A Zero-Shot Neuro-Symbolic Approach for Complex Knowledge Graph Question Answering
Prerna Agarwal, Srikanta Bedathur
- Making Every Step Effective: Jailbreaking Large Vision-Language Models Through Hierarchical KV Equalization
Shuyang Hao, Yiwei Wang, Bryan Hooi, Jun Liu, Muhao Chen, Zi Huang, Yujun Cai
- MT-Mol: Multi Agent System with Tool-based Reasoning for Molecular Optimization
Hyomin Kim, Yunhui Jang, Sungsoo Ahn
- A Survey on LLM-powered Agents for Recommender Systems
Qiyao Peng, Hongtao Liu, Hua Huang, Jian Yang, Qing Yang, Minglai Shao
- Efficiently Selecting Response Generation Strategies for Synthetic Data Construction by Self-Aligned Perplexity
Xuan Ren, Lingqiao Liu, Qi Chen
- Benchmarking for Domain-Specific LLMs: A Case Study on Academia and Beyond
Rubing Chen, Jiaxin Wu, Jian Wang, Xulu Zhang, Wenqi Fan, Chenghua Lin, Xiaoyong Wei, Li Qing
- FrameEOL: Semantic Frame Induction using Causal Language Models
Chihiro Yano, Kosuke Yamada, Hayato Tsukagoshi, Ryohei Sasano, Koichi Takeda
- CaTER: A Framework for Context-aware Topology Entity Retrieval Contrastive Learning in End-to-End Task-Oriented Dialogue Systems
Di Wu hebeu, Zhizhi Yu
- Attribution and Application of Multiple Neurons in Multimodal Large Language Models
Feiyu Wang, Ziran Zhao, Pengyuan Liu, Dong Yu
- When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA
Elisei Rykov, Kseniia Petrushina, Maksim Savkin, Valerii Olisov, Artem Vazhentsev, Kseniia Titova, Alexander Panchenko, Vasily Konovalov, Julia Belikova
- Unraveling Misinformation Propagation in LLM Reasoning
Yiyang Feng, Yichen Wang, Shaobo Cui, Boi Faltings, Mina Lee, Jiawei Zhou
- RAISE: Reinforced Adaptive Instruction Selection For Large Language Models
Qingsong Lv, Yangning Li, Zihua Lan, Zishan Xu, Jiwei Tang, Tingwei Lu, Yinghui Li, Wenhao Jiang, Hong-Gee Kim, Hai-Tao Zheng, Philip S. Yu
- Teaching According to Talents! Instruction Tuning LLMs with Competence-Aware Curriculum Learning
Yangning Li, Tingwei Lu, Yinghui Li, Yankai Chen, Wei-Chieh Huang, Wenhao Jiang, Hui Wang, Hai-Tao Zheng, Philip S. Yu
- Let Them Down Easy! Contextual Effects of LLM Guardrails on User Perceptions and Preferences
Mingqian Zheng, Wenjia Hu, Patrick Zhao, Motahhare Eslami, Jena D. Hwang, Faeze Brahman, Carolyn Rose, Maarten Sap
- From Hypothesis to Publication: A Comprehensive Survey of AI-Driven Research Support Systems
Zekun Zhou, Xiaocheng Feng, Lei Huang, Xiachong Feng, Ziyun Song, Ruihan Chen, Liang Zhao, Weitao Ma, Yuxuan Gu, Baoxin Wang, Dayong Wu, Guoping Hu, Ting Liu, Bing Qin
- Enhancing Model Privacy in Federated Learning with Random Masking and Quantization
Zhibo Xu, Zhu JianHao, Jingwen Xu, Changze Lv, Zhenghua Wang, Zisu Huang, Xiaohua Wang, Muling Wu, Qi Qian, Xiaoqing Zheng, Xuanjing Huang
- SuPreME: A Supervised Pre-training Framework for Multimodal ECG Representation Learning
Mingsheng Cai, Jiuming Jiang, Wenhao Huang, Che Liu, Rossella Arcucci
- Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique
Tej Deep Pala, Vernon Toh, Rishabh Bhardwaj, Soujanya Poria
- Do What? Teaching Vision-Language-Action Models to Reject the Impossible
Wen-Han Hsieh, Elvis Hsieh, Dantong Niu, Trevor Darrell, Roei Herzig, David M. Chan
- AgentInit: Initializing LLM-based Multi-Agent Systems via Diversity and Expertise Orchestration for Effective and Efficient Collaboration
Chunhao Tian, Yutong Wang, Xuebo Liu, Zhexuan Wang, Liang Ding, Miao Zhang, Min Zhang
- Time to Revisit Exact Match
Auss Abbood, Zaiqiao Meng, Nigel Collier
- LongTableBench: Benchmarking Long-Context Table Reasoning across Real-World Formats and Domains
Liyao Li, Jiaming Tian, Hao Chen, Wentao Ye, Chao Ye, Haobo Wang, Ningtao Wang, Xing Fu, Gang Chen, Junbo Zhao
- Exploring and Evaluating Multimodal Knowledge Reasoning Consistency of Multimodal Large Language Models
Boyu Jia, Junzhe Zhang, Huixuan Zhang, Xiaojun Wan
- MPTA: MultiTask Personalization Assessment
Matthieu Tehenan, Eric Chamoun, Andreas Vlachos
- Semantic Geometry of Sentence Embeddings
Matthieu Tehenan
- ReAlign: Structured Revision for Small Language Model Alignment
Ruijun Chen, Jiajian Guo, Hongzhan Chen, Fanqi Wan, Qifan Wang, Xiaojun Quan
- Curr-ReFT: Overcoming Training Bottlenecks in Small-scale Vision-Language Models via Curriculum Reinforcement Finetuning
Huilin Deng, Ding Zou, Xinghao Zhao, Rui Ma, Yanming Guo, Yang Cao, Yu Kang
- Layer-Aware Task Arithmetic: Disentangling Task-Specific and Instruction-Following Knowledge
Yan-Lun Chen, Yi-Ru Wei, Chia-Yi Hsu, Chia-Mu Yu, Chun-Ying Huang, Ying-Dar Lin, Yu-Sung Wu, Wei-Bin Lee
- Revisiting Pruning vs Quantization for Small Language Models
Zihan Zhou, Simon Kurz, Zhixue Zhao
- CLaw: Benchmarking Chinese Legal Knowledge in Large Language Models - A Fine-grained Corpus and Reasoning Analysis
Xinzhe Xu, Liang Zhao, Hongshen Xu, chenchenc
- polyBART: A Chemical Linguist for Polymer Property Prediction and Generative Design
Anagha Savit, Harikrishna Sahu, Shivank S. Shukla, Wei Xiong, Rampi Ramprasad
- A Survey of RAG-Reasoning Systems in Large Language Models
Yangning Li, Weizhi Zhang, Yuyao Yang, Wei-Chieh Huang, Yaozu Wu, Junyu Luo, Yuanchen Bei, Henry Peng Zou, Xiao Luo, Yusheng Zhao, Chunkit Chan, Yankai Chen, Zhongfen Deng, Yinghui Li, Hai-Tao Zheng, Dongyuan Li, Renhe Jiang, Ming Zhang, Yangqiu Song, Philip S. Yu
- REGen: A Reliable Evaluation Framework for Generative Event Argument Extraction
Omar Sharif, Joseph Gatto, Madhusudan Basak, Sarah Masud Preum
- Mitigating Interviewer Bias in Multimodal Depression Detection: An Approach with Adversarial Learning and Contextual Positional Encoding
Enshi Zhang, Christian Poellabauer
- AMIA: Automatic Masking and Joint Intention Analysis Makes LVLMs Robust Jailbreak Defenders
Yuqi Zhang, Yuchun Miao, Zuchao Li, Liang Ding
- Disentangling Language Understanding and Reasoning Structures in Cross-lingual Chain-of-Thought Prompting
Khanh-Tung Tran, Nguyet-Hang Vu, Barry O’Sullivan, Hoang D. Nguyen
- MoRoVoc: A Large Dataset for Geographical Variation Identification of the Spoken Romanian Language
Andrei-Marius Avram, Bănescu Ema-Ioana, Anda-Teodora Robea, Dumitru-Clementin Cercel
- Language-Informed Synthesis of Rational Agent Models for Grounded Theory-of-Mind Reasoning On-the-fly
Lance Ying, Ryan Truong, Katherine M. Collins, Cedegao E. Zhang, Megan Wei, Tyler Brooke-Wilson, Tan Zhi-Xuan, Lionel Wong, Joshua B. Tenenbaum
- MOLE: Metadata Extraction and Validation in Scientific Papers Using LLMs
Zaid Alyafeai, Maged S. Al-shaibani, Bernard Ghanem
- MASSV: Multimodal Adaptation and Self-Data Distillation for Speculative Decoding of Vision-Language Models
Mugilan Ganesan, Shane Segal, Ankur Aggarwal, Nish Sinnadurai, Sean Lie, Vithursan Thangarasa
- FESTA: Functionally Equivalent Sampling for Trust Assessment of Multimodal LLMs
Debarpan Bhattacharya, Apoorva Kulkarni, Sriram Ganapathy
- ClaimGen-CN: A Large-scale Chinese Dataset for Legal Claim Generation
Siying Zhou, Yiquan Wu, Hui Chen, Xueyu Hu, Kun Kuang, Adam Jatowt, Chunyan Zheng, Fei Wu
- Summarize-Exemplify-Reflect: Data-driven Insight Distillation Empowers LLMs for Few-shot Tabular Classification
Yifei Yuan, Jiatong Li, Weijia Zhang, Mohammad Aliannejadi, Evangelos Kanoulas, Renjun Hu
- Rethinking LLM Uncertainty: A Multi-Agent Approach to Estimating Black-Box Model Uncertainty
Yu Feng, Phu Mon Htut, Zheng Qi, Wei Xiao, Manuel Mager, Nikolaos Pappas, Kishaloy Halder, Yang Li, Yassine Benajiba, Dan Roth
- Stress-Testing the Reasoning Competence of Language Models With Formal Proofs
Konstantine Arkoudas, Serafim Batzoglou
- Topic-Guided Reinforcement Learning with LLMs for Enhancing Multi-Document Summarization
Chuyuan Li, Austin Xu, Shafiq Joty, Giuseppe Carenini
- FACTCHECKMATE: Preemptively Detecting and Mitigating Hallucinations in LMs
Deema Alnuhait, Neeraja Kirtane, Muhammad Khalifa, Hao Peng
- Dialectal Toxicity Detection: Evaluating LLM-as-a-Judge Consistency Across Language Varieties
Fahim Faisal, Md Mushfiqur Rahman, Antonios Anastasopoulos
- Mitigate One, Skew Another? Tackling Intersectional Biases in Text-to-Image Models
Pushkar Shukla, Aditya Chinchure, Emily Diana, Alexander Tolbert, Kartik Hosanagar, Vineeth N. Balasubramanian, Leonid Sigal, Matthew A. Turk
- Language-Specific Layer Matters: Efficient Multilingual Enhancement for Large Vision-Language Models
Yuchun Fan, Yilin Wang, Yongyu Mu, Lei Huang, Bei Li, Xiaocheng Feng, Tong Xiao, Jingbo Zhu
- InfAL: Inference Time Adversarial Learning for Improving Research Ideation
Sikun Guo, Amir Hassan Shariatmadari, Peng Wang, Albert Huang, Aidong Zhang
- Speculative Decoding for Multi-Sample Inference
Yiwei Li, Jiayi Shi, Shaoxiong Feng, Peiwen Yuan, Xinglin Wang, Yueqi Zhang, Ji Zhang, Chuyi Tan, Boyuan Pan, Yao Hu, Kan Li
- LSRL: Process-Supervised GRPO on Latent Recurrent States Improves Mathematical Reasoning
Hangliang Ren
- Multi-token Mask-filling and Implicit Discourse Relations
Meinan Liu, Yunfang Dong, Xixian Liao, Bonnie Webber
- Schema Generation for Large Knowledge Graphs Using Large Language Models
Bohui Zhang, Yuan He, Lydia Pintscher, Albert Meroño-Peñuela, Elena Simperl
- MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search
Yunhai Hu, Yilun Zhao, Chen Zhao, Arman Cohan
- What if Othello-Playing Language Models Could See?
Xinyi Chen, Yifei Yuan, Jiaang Li, Serge Belongie, Maarten de Rijke, Anders Søgaard
- LLM-Based Web Data Collection for Research Dataset Creation
Thomas Berkane, Marie-Laure Charpignon, Maimuna S. Majumder
- PsyScam: A Benchmark for Psychological Techniques in Real-World Scams
Shang Ma, Tianyi Ma, Jiahao Liu, Wei Song, Zhenkai Liang, Xusheng Xiao, Yanfang Ye
- LoRaDA: Low-Rank Direct Attention Adaptation for Efficient LLM Fine-tuning
Zhangming Li, Qinghao Hu, Yiqun Chen, Peisong Wang, Yifan Zhang, Jian Cheng
- Inductive Reasoning on Few-Shot Knowledge Graphs with Task-Aware Language Models
Cheng Yan, Feng Zhao, Ruilin Zhao, Hong Zhang
- ForestCast: Open-Ended Event Forecasting with Semantic News Forest
Zi Yu, Shaoxiang Wang, Guozheng Li, Yu Zhang, Chi Harold Liu
- Agentic Medical Knowledge Graphs Enhance Medical Question Answering: Bridging the Gap Between LLMs and Evolving Medical Knowledge
Mohammad Reza Rezaei, Reza Saadati Fard, Jayson Lee Parker, Rahul Krishnan, Milad Lankarany
- Text Anomaly Detection with Simplified Isolation Kernel
Yang Cao, Sikun Yang, Yujiu Yang, Lianyong Qi, Ming Liu
- Idola Tribus of AI: Large Language Models tend to perceive order where none exists
Shin-nosuke Ishikawa, Masato Todo, Taiki Ogihara, Hirotsugu Ohba
- Thunder-DeID: Accurate and Efficient De-identification Framework for Korean Court Judgments
Sungeun Hahm, Heejin Kim, Gyuseong Lee, Hyunji M. Park, Jaejin Lee
- Multi-Agent Autonomous Driving Systems with Large Language Models: A Survey of Recent Advances, Resources, and Future Directions
Yaozu Wu, Dongyuan Li, Yankai Chen, Renhe Jiang, Henry Peng Zou, Wei-Chieh Huang, Yangning Li, Liancheng Fang, Zhen Wang, Philip S. Yu
- Comprehensive Evaluation on Lexical Normalization: Boundary-Aware Approaches for Unsegmented Languages
Shohei Higashiyama, Masao Utiyama
- Explainable Text Classification with LLMs: Enhancing Performance through Dialectical Prompting and Explanation-Guided Training
Huaming Du, Lei Yuan, Guisong Liu, Carl Yang, Gang Kou
- MultiPL-MoE: Multi-Programming-Lingual Extension of Large Language Models through Hybrid Mixture-of-Experts
Qing Wang, Xue Han, Jiahui Wang, Lehao Xing, Qian Hu, Lianlian Zhang, Junlan Feng, Chao Deng
- AutoSpec: An Agentic Framework for Automatically Drafting Patent Specification
Ryan Shea, Zhou Yu
- LimaCost: Data Valuation for Instruction Tuning of Large Language Models
Hyeonseok Moon, Jaehyung Seo, Seonmin Koo, Jinsung Kim, Young-kyoung Ham, Jiwon Moon, Heuiseok Lim
- Two Challenges, One Solution: Robust Multimodal Learning through Dynamic Modality Recognition and Enhancement
Lanxin Bi, Yunqi Zhang, Luyi Wang, Yake Niu, Hui Zhao
- SwiftPrune: Hessian-Free Weight Pruning for Large Language Models
Yuhan Kang, Yang Shi, Mei Wen, Jun He, Jianchao Yang, Zeyu Xue, Jing Feng, Xinwang Liu
- Training LLMs for Optimization Modeling via Iterative Data Synthesis and Structured Validation
Yang Wu, Yifan Zhang, Yurong Wu, Yuran Wang, Junkai Zhang, Jian Cheng
- Exploiting Prompt-induced Confidence for Black-Box Attacks on LLMs
Meina Chen, Yihong Tang, Kehai Chen
- DPF-CM: A Data Processing Framework with Privacy-Preserving Vector Databases for Chinese Medical LLMs Training and Deployment
Wei Huang, Anda Cheng, Zhao Zhang, Yinggui Wang
- Graph-Reward-SQL: Execution-Free Reinforcement Learning for Text-to-SQL via Graph Matching and Stepwise Reward
Han Weng, Puzhen Wu, Cui Longjie, Yi Zhan, Boyi Liu, Yuanfeng Song, Dun Zeng, Yingxiang Yang, Qianru Zhang, Dong Huang, Xiaoming Yin, Yang Sun, Xing Chen
- StatsChartMWP: A Dataset for Evaluating Multimodal Mathematical Reasoning Abilities on Math Word Problems with Statistical Charts
Dan Zhu, Tianqiao Liu, Zitao Liu
- Logic-Thinker: Teaching Large Language Models to Think More Logically
Chengyao Wen, Qiang Cheng, Shaofei Wang, Zhizhen Liu, Lei Liang, Deng Zhao
- ACEBench: A Comprehensive Evaluation of LLM Tool Usage
Chen Chen, Xinlong Hao, Weiwen Liu, Xu Huang, Xingshan Zeng, Shuai Yu, Dexun Li, Yuefeng Huang, Xiangcheng Liu, Wang Xinzhi, Wu Liu
- RevPRAG: Revealing Poisoning Attacks in Retrieval-Augmented Generation through LLM Activation Analysis
Xue Tan, Hao Luan, Mingyu Luo, Xiaoyan Sun, Ping Chen, Jun Dai
- DaMoC: Efficiently Selecting the Optimal Large Language Model for Fine-tuning Domain Tasks Based on Data and Model Compression
Wei Huang, Huang Wei, Yinggui Wang
- CoAT: Chain-of-Associated-Thoughts Framework for Enhancing Large Language Models Reasoning
Jianfeng Pan, Senyou Deng, Shaomang Huang
- ChartM$^3$: A Multi-Stage Code-Driven Pipeline for Constructing Multi-Dimensional and Multi-Step Visual Reasoning Data in Chart Comprehension
Duo Xu, Hao Cheng, Xin Lin, Zhen Xie, Hao Henry Wang
- Can LLMs Truly Plan? A Comprehensive Evaluation of Planning Capabilities
Gayeon Jung, HyeonSeok Lim, Minjun Kim, Joon-Ho Lim, KyungTae Lim, Hansaem Kim
- MARIO-0.5B: A Multi-Agent Lightweight Model for Real-Time Open Information Extraction in Low-Resource Settings
Donghai Zhang, Shuangtao Yang, Bo Fu, Dong Xiaozheng, Wei Song
- BiMax: Bidirectional MaxSim Score for Document-Level Alignment
Xiaotian Wang, Takehito Utsuro, Masaaki Nagata
- DocMMIR: A Framework for Document Multi-modal Information Retrieval
Zirui Li, Siwei Wu, Yizhi Li, Xingyu Wang, Yi Zhou, Chenghua Lin
- MoVoC: Morphology-Aware Subword Construction for Ge’ez Script Languages
Hailay Kidu Teklehaymanot, Dren Fazlija, Wolfgang Nejdl
- MMA: Cross-Domain Knowledge Integration via Mixture of Multi-Domain Agents
Kehang Jia, Juntao Li, Xiaobo Liang, Yisheng Xiao, Yixuan Yang, Min Zhang
- HAWK: Highlighting Entity-aware Knowledge for Alleviating Information Sparsity in Long Contexts
Seonmin Koo, Jinsung Kim, Chanjun Park, Heuiseok Lim
- Sensitivity-LoRA: Low-Load Sensitivity-Based Fine-Tuning for Large Language Models
Hao Zhang, Bo Huang, Zhenjia Li, Xi Xiao, Hui Yi Leong, Zumeng Zhang, Xinwei Long, Tianyang Wang, Hao Xu
- ROSE: A Reward-Oriented Data Selection Framework for LLM Task-Specific Instruction Tuning
Yang Wu, Huayi Zhang, Yizheng Jiao, Lin Ma, Xiaozhong Liu, Jinhong Yu, Dongyu Zhang, Dezhi Yu, Wei Xu
- SimBA: Simplifying Benchmark Analysis Using Performance Matrices Alone
Nishant Subramani, Alfredo Gomez, Mona T. Diab
- MarathiEmoExplain: A Dataset for Sentiment, Emotion, and Explanation in Low-Resource Marathi
Anuj Kumar, Mohammed Faisal Sayed, Satyadev Ahlawat, Yamuna Prasad
- Active Domain Knowledge Acquisition with 100-Dollar Budget: Enhancing LLMs via Cost-Efficient, Expert-Involved Interaction in Sensitive Domains
Yang Wu, Raha Moraffah, Rujing Yao, Jinhong Yu, Zhimin Tao, Xiaozhong Liu
- Structure-aware Propagation Generation with Large Language Models for Fake News Detection
Mengyang Chen, Lingwei Wei, Wei Zhou, Songlin Hu
- UniCoM: A Universal Code-Switching Speech Generator
Sangmin Lee, Woojin Chung, Seyun Um, Hong-Goo Kang
- Mitigating Sequential Dependencies: A Survey of Algorithms and Systems for Generation-Refinement Frameworks in Autoregressive Models
Yunhai Hu, Zining Liu, Zhenyuan Dong, Tianfan Peng, Bradley McDanel, Sai Qian Zhang
- Do We Really Need All Those Dimensions? An Intrinsic Evaluation Framework for Compressed Embeddings
Nathan Inkiriwang, Necva Bölücü, Garth Tarr, Maciej Rybinski
- Mixture of LoRA Experts for Continual Information Extraction with LLMs
Zitao Wang, Xinyi Wang, Wei Hu
- Spelling-out is not Straightforward: LLMs’ Capability of Tokenization from Token to Characters
Tatsuya Hiraoka, Kentaro Inui
- OAgents: An Empirical Study of Building Effective Agents
He Zhu, Tianrui Qin, King Zhu, Heyuan Huang, Yeyi Guan, Jinxiang Xia, Hanhao Li, Yi Yao, Ningning Wang, Pai Liu, Tianhao Peng, Sunny Gui, Li Xiaowan, Yuhui Liu, Xiangru Tang, Jian Yang, Ge Zhang, Xitong Gao, Yuchen Eleanor Jiang, Changwang Zhang, Jun Wang, Jiaheng Liu, Wangchunshu Zhou
- 2Columns1Row: A Russian Benchmark for Textual and Multimodal Table Understanding and Reasoning
Vildan Saburov, Daniil Vodolazsky, Danil Sazanakov, Alena Fenogenova
- Permitted Knowledge Boundary: Evaluating the Knowledge-Constrained Responsiveness of Large Language Models
Wenrui Bao, Kai Wang, Siqiang Luo, Xiang Li
- A Closer Look at Bias and Chain-of-Thought Faithfulness of Large (Vision) Language Models
Sriram Balasubramanian, Samyadeep Basu, Soheil Feizi
- From Remembering to Metacognition: Do Existing Benchmarks Accurately Evaluate LLMs?
Geng Zhang, Sihang Jiang, Yizhou Ying, Guanglei Yue, Jiaqing Liang, Yifei Fu, Hailin Hu, Yanghua Xiao
- How a Bilingual LM Becomes Bilingual: Tracing Internal Representations with Sparse Autoencoders
Tatsuro Inaba, Go Kamoda, Kentaro Inui, Masaru Isonuma, Yusuke Miyao, Yohei Oseki, Yu Takagi, Benjamin Heinzerling
- MultiConIR: Towards Multi-Condition Information Retrieval
Xuan Lu, Sifan Liu, Bochao Yin, Yongqi Li, Xinghao Chen, Hui Su, Yaohui Jin, Wenjun Zeng, Xiaoyu Shen
- HMCL: Task-Optimal Text Representation Adaptation through Hierarchical Contrastive Learning
Zhenyi Wang, Yapeng Jia, Haiyan Ning, Peng Wang, Dan Wang, Yitao Cao
- KBAlign: Efficient Self Adaptation on Specific Textual Knowledge Bases
Zheni Zeng, Yuxuan Chen, Shi Yu, Ruobing Wang, Yukun Yan, Zhenghao Liu, Shuo Wang, Xu Han, Zhiyuan Liu, Maosong Sun
- Revisiting Chain-of-Thought Prompting: Zero-shot Can Be Stronger than Few-shot
Xiang Cheng, Chengyan Pan, Minjun Zhao, Deyang Li, Fangchao Liu, Xinyu Zhang, Xiao Zhang, Yong Liu
- RMTBench: Benchmarking LLMs Through Multi-Turn User-Centric Role-Playing
Hao Xiang, Tianyi Tang, Yang Su, Bowen Yu, An Yang, Fei Huang, Yichang Zhang, Yaojie Lu, Hongyu Lin, Xianpei Han, Jingren Zhou, Junyang Lin, Le Sun
- Smart-Searcher: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning
Huatong Song, Jinhao Jiang, Wenqing Tian, Zhipeng Chen, Yuhuan Wu, Jiahao Zhao, Yingqian Min, Xin Zhao, Lei Fang, Ji-Rong Wen
- InteGround: On the Evaluation of Verification and Retrieval Planning in Integrative Grounding
Cheng Jiayang, Qianqian Zhuang, Haoran Li, Chunkit Chan, Xin Liu, Lin Qiu, Yangqiu Song
- MM-CRITIC: A Holistic Evaluation of Large Multimodal Models as Multimodal Critique
Gailun Zeng, Ziyang Luo, Hongzhan Lin, Yuchen Tian, Kaixin Li, Ziyang Gong, Jianxiong Guo, Jing Ma
- On the Correspondence between the Squared Norm and Information Content in Text Embeddings
Enrique Amigo, Adrian Ghajari, Alejandro Benito-Santos, Diego De la Fuente Rodríguez
- Adversary-Aware DPO: Enhancing Safety Alignment in Vision Language Models via Adversarial Training
Fenghua Weng, Jian Lou, Jun Feng, Minlie Huang, Wenjie Wang
- SLiNT: Structure-aware Language Model with Injection and Contrastive Training for Knowledge Graph Completion
Mengxue Yang, Chun Yang, Jiaqi Zhu, Jiafan Li, Jingqi Zhang, Yuyang Li, Ying Li
- LAVa: Layer-wise KV Cache Eviction with Dynamic Budget Allocation
Yiqun Shen, Song Yuan, Zhengze Zhang, Xiaoliang Wang, Daxin Jiang, Nguyen Cam-Tu
- LoRA-PAR: A Flexible Dual-System LoRA Partitioning Approach to Efficient LLM Fine-Tuning
Yining Huang, Bin Li, Keke Tang, Meilian Chen
- SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis
Shuang Sun, Huatong Song, Yuhao Wang, Ruiyang Ren, Jinhao Jiang, Junjie Zhang, Fei Bai, Jia Deng, Xin Zhao, Zheng Liu, Lei Fang, Zhongyuan Wang, Ji-Rong Wen
- LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning
Zhibin Lan, Liqiang Niu, Fandong Meng, Jie Zhou, Jinsong Su
- SampleMix: A Sample-wise Pre-training Data Mixing Strategy by Coordinating Data Quality and Diversity
Xiangyu Xi, Deyang Kong, Jian Yang, Jiawei Yang, Zhengyu Chen, Wei Wang, Jingang Wang, Xunliang Cai, Shikun Zhang, Wei Ye
- Evaluating Test-Time Scaling LLMs for Legal Reasoning: OpenAI o1, DeepSeek-R1, and Beyond
Yinghao Hu, Yaoyao Yu, Leilei Gan, Bin Wei, Kun Kuang, Fei Wu
- LLM Agents for Education: Advances and Applications
Zhendong Chu, Shen Wang, Jian Xie, Tinghui Zhu, Yibo Yan, Jingheng Ye, Aoxiao Zhong, Xuming Hu, Jing Liang, Philip S. Yu, Qingsong Wen
- Modeling Subjectivity in Cognitive Appraisal with Language Models
Yuxiang Zhou, Hainiu Xu, Desmond Ong, Maria Liakata, Petr Slovak, Yulan He
- Dementia Through Different Eyes: Explainable Modeling of Human and LLM Perceptions for Early Awareness
Lotem Peled-Cohen, Maya Zadok, Nitay Calderon, Hila Gonen, Roi Reichart
- Mitigating Hallucinations in Large Vision-Language Models by Self-Injecting Hallucinations
Yifan Lu, Ziqi Zhang, Chunfeng Yuan, Jun Gao, Congxuan Zhang, Xiaojuan Qi, Bing Li, Weiming Hu
- How Much Do Large Language Models Know about Human Motion? A Case Study in 3D Avatar Control
Kunhang Li, Jason Naradowsky, Yansong Feng, Yusuke Miyao
- The Search for Conflicts of Interest: Open Information Extraction in Scientific Publications
Garima Gaur, Oana Balalau, Ioana Manolescu, Prajna Devi Upadhyay
- On Collaborating Small and Large Models For Few-shot Intent Detection
Peng Chen, Bang Wang
- A Survey on LLMs for Story Generation
Maria Teleki, Vedangi Bengali, Xiangjue Dong, Sai Tejas Janjur, Haoran Liu, Tian Liu, Cong Wang, Ting Liu, Yin Zhang, Frank Shipman, James Caverlee
- From Knowledge to Treatment: Large Language Model Assisted Biomedical Concept Representation for Drug Repurposing
Chengrui Xiang, Tengfei Ma, Xiangzheng Fu, Yiping Liu, Bosheng Song, Xiangxiang Zeng
- SKRAG: A Retrieval-Augmented Generation Framework Guided by Reasoning Skeletons over Knowledge Graphs
Xiaotong Xu, Yizhao Wang, Yunfei Liu, Shengyang Li
- A Generative Framework for Personalized Sticker Retrieval
Changjiang Zhou, Ruqing Zhang, Jiafeng Guo, Yu-An Liu, Fan Zhang, Ganyuan Luo, Xueqi Cheng
- Bridging Semantic and Modality Gaps in Zero-Shot Captioning via Retrieval from Synthetic Data
Zhiyue Liu, Wenkai Zhou
- Humor in Pixels: Benchmarking Large Multimodal Models Understanding of Online Comics
Yuriel Wang Jun Long Ryan, Rui Yang Tan, Kenny Tsu Wei Choo, Roy Ka-Wei Lee
- BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities
Sahal Shaji Mullappilly, Mohammed Irfan Kurpath, Sara Pieri, Saeed Yahya Alseiari, Shanavas Cholakkal, Khaled M Aldahmani, Fahad Shahbaz Khan, Rao Muhammad Anwer, Salman Khan, Timothy Baldwin, Hisham Cholakkal
- DeMAC: Enhancing Multi-Agent Coordination with Dynamic DAG and Manager-Player Feedback
Yuhan Liu, Cong Xu, Lu Liu, Yihua Wang, Feiyu Chen, Qi Jia, Yaqian Zhao, Zhichun Wang, Xiang Li
- Coherence of Argumentative Dialogue Snippets: A New Method for Large Scale Evaluation with an Application to Inference Anchoring Theory
Paul Piwek, Jacopo Amidei, Svetlana Stoyanchev
- Angular Dispersion Accelerates $k$-Nearest Neighbors Machine Translation
Evgeniia Tokarchuk, Sergey Troshin, Vlad Niculae
- Benchmarking Contextual and Paralinguistic Reasoning in Speech-LLMs: A Case Study with In-the-Wild Data
Qiongqiong Wang, Hardik Bhupendra Sailor, Tianchi Liu, Wenyu Zhang, Muhammad Huzaifah, Nattadaporn Lertcheva, Shuo Sun, Nancy F. Chen, Jinyang Wu, AiTi Aw
- This is not a Disimprovement: Improving Negation Reasoning in Large Language Models via Prompt Engineering
Joshua Jose Dias Barreto, Abhik Jana
- Make Every Letter Count: Building Dialect Variation Dictionaries from Monolingual Corpora
Robert Litschko, Verena Blaschke, Diana Burkhardt, Barbara Plank, Diego Frassinelli
- SelfAug: Mitigating Catastrophic Forgetting in Retrieval-Augmented Generation via Distribution Self-Alignment
Yuqing Huang, Rongyang Zhang, Qimeng Wang, Chengqiang Lu, Yan Gao, Yi Wu, Yao Hu, Xuyang Zhi, Guiquan Liu, Xin Li, Hao Wang, Enhong Chen
- SEKE: Specialised Experts for Keyword Extraction
Matej Martinc, Thi Hong Hanh Tran, Senja Pollak, Boshko Koloski
- 1+1>2: A Synergistic Sparse and Low-Rank Compression Method for Large Language Models
Zeliang Zong, Kai Zhang, Zheyang Li, Wenming Tan, Ye Ren, Yiyan Zhai, Jilin Hu
- InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning
Xiaotian Han, Yiren Jian, Xuefeng Hu, Haogeng Liu, Yiqi Wang, Qihang Fan, Yuang Ai, Huaibo Huang, Ran He, Zhenheng Yang, Quanzeng You
- Zero-Shot Defense Against Toxic Images via Inherent Multimodal Alignment in LVLMs
Wei Zhao, Zhe Li, Yige Li, Jun Sun
- Retrieval Augmented Generation based context discovery for ASR
Siskos Dimitrios, Stavros Papadopoulos, Pablo Peso Parada, Jisi Zhang, Karthikeyan Saravanan, Anastasios Drosou
- pFedRAG: A Personalized Federated Retrieval-Augmented Generation System with Depth-Adaptive Tiered Embedding Tuning
Hangyu He, Xin Yuan, Kai Wu, Ren Ping Liu, Wei Ni
- ReCUT: Balancing Reasoning Length and Accuracy in LLMs via Stepwise Trails and Preference Optimization
Zhensheng Jin, Xinze Li, Yifan Ji, Chunyi Peng, Zhenghao Liu, Qi Shi, Yukun Yan, Shuo Wang, Furong Peng, Ge Yu
- CURE: Controlled Unlearning for Robust Embeddings — Mitigating Conceptual Shortcuts in Pre-Trained Language Models
Aysenur Kocak, Shuo Yang, Bardh Prenkaj, Gjergji Kasneci
- MLAlgo-Bench: Can Machines Implement Machine Learning Algorithms?
Yunfei Wang, Yeqin Zhang, Yuyang Wu, Liang Lu, Phi Le Nguyen, Xiaoliang Wang, Nguyen Cam-Tu
- Fair Text-Attributed Graph Representation Learning
Ruilin Luo, Tianle Gu, Lin Wang, Yunfeng Zhou, Songtao Jiang, Lei Wang, Yujiu Yang
- Human-Inspired Obfuscation for Model Unlearning: Local and Global Strategies with Hyperbolic Representations
Zekun Wang, Jingjie Zeng, Yingxu Li, Liang Yang, Hongfei Lin
- Do Influence Functions Work on Large Language Models?
Zhe Li, Wei Zhao, Yige Li, Jun Sun
- TRUEBench: Can LLM Response Meet Real-world Constraints as Productivity Assistant?
Jiho Park, Jongyoon Song, Minjin Choi, Kyuho Heo, Taehun Huh, Ji Won Kim
- CausalMACE: Causality Empowered Multi-Agents in Minecraft Cooperative Tasks
Qi Chai, Zhang Zheng, Junlong Ren, Deheng Ye, Zichuan Lin, Hao Wang
- Harry Potter is Still Here! Probing Knowledge Leakage in Targeted Unlearned Large Language Models
Bang Trinh Tran To, Thai Le
- Learning Trajectories of Figurative Language for Pre-Trained Language Models
Nicola Arici, Luca Putelli, Ejdis Gjinika, Ivan Serina, Alfonso Gerevini
- BcQLM: Efficient Vision-Language Understanding with Distilled Q-Gated Cross-Modal Fusion
Sike Xiang, Shuang Chen, Amir Atapour-Abarghouei
- HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals
Guimin Hu, Daniel Hershcovich, Hasti Seifi
- SubDocTrans: Enhancing Document-level Machine Translation with Plug-and-play Multi-granularity Knowledge Augmentation
Hanghai Hong, Yibo Xie, Jiawei Zheng, Xiaoli Wang
- Social Bias Evaluation for Large Language Models Requires Prompt Variations
Rem Hida, Masahiro Kaneko, Naoaki Okazaki
- Training with Fewer Bits: Unlocking Edge LLMs Training with Stochastic Rounding
Taowen Liu, Marta Andronic, Deniz Gunduz, George Anthony Constantinides
- FactReasoner: A Probabilistic Approach to Long-Form Factuality Assessment for Large Language Models
Radu Marinescu, Debarun Bhattacharjya, Junkyu Lee, Tigran T. Tchrakian, Javier Carnerero-Cano, Yufang Hou, Elizabeth M. Daly, Alessandra Pascale
- Robust Knowledge Editing via Explicit Reasoning Chains for Distractor-Resilient Multi-Hop QA
Yuchen Wu, Liang Ding, Li Shen, Dacheng Tao
- RadialRouter: Structured Representation for Efficient and Robust Large Language Models Routing
Ruihan Jin, Pengpeng Shao, Zhengqi Wen, Jinyang Wu, Mingkuan Feng, Shuai Zhang, Jianhua Tao
- Decoding Uncertainty: The Impact of Decoding Strategies for Uncertainty Estimation in Large Language Models
Wataru Hashimoto, Hidetaka Kamigaito, Taro Watanabe
- Elucidating Mechanisms of Demographic Bias in LLMs for Healthcare
Hiba Ahsan, Arnab Sen Sharma, Silvio Amir, David Bau, Byron C Wallace
- Can You Trick the Grader? Adversarial Persuasion of LLM Judges
Yerin Hwang, Dongryeol Lee, Taegwan Kang, Yongil Kim, Kyomin Jung
- Navigating the Unknown: Intent Classification and Out-of-Distribution Detection Using Large Language Models
Yusuf Sali, Sıtkı Can Toraman
- Trust Me, I’m Wrong: LLMs Hallucinate with Certainty Despite Knowing the Answer
Adi Simhi, Itay Itzhak, Fazl Barez, Gabriel Stanovsky, Yonatan Belinkov
- QUARTZ: QA-based Unsupervised Abstractive Refinement for Task-oriented Dialogue Summarization
GHEBRIOUT Mohamed Imed Eddine, Gaël Guibon, Ivan Lerner, Emmanuel Vincent
- MDSEval: A Meta-Evaluation Benchmark for Multimodal Dialogue Summarization
Yinhong Liu, Jianfeng He, Hang Su, Ruixue Lian, Yi Nian, Jake W. Vincent, Srikanth Vishnubhotla, Robinson Piramuthu, Saab Mansour
- PMPO: Probabilistic Metric Prompt Optimization for Small and Large Language Models
ChenZhuo Zhao, Ziqian Liu, Xinda Wang, Junting Lu, Chaoyi Ruan
- Evaluating the Creativity of LLMs in Persian Literary Text Generation
Armin Tourajmehr, Mohammad Reza Modarres, Yadollah Yaghoobzadeh
- SCDTour: Embedding Axis Ordering and Merging for Interpretable Semantic Change Detection
Taichi Aida, Danushka Bollegala
- Resolving UnderEdit & OverEdit with Iterative & Neighbor-Assisted Model Editing
Bhiman Kumar Baghel, Scott M. Jordan, Zheyuan Ryan Shi, Xiang Lorraine Li
- LLM-empowered Dynamic Prompt Routing for Vision-Language Models Tuning under Long-Tailed Distributions
Yongju Jia, Jiarui Ma, Xiangxian Li, Baiqiao Zhang, Xianhui Cao, Juan Liu, Yulong Bian
- HGAdapter: Hypergraph-based Adapters in Language Models for Code Summarization and Clone Detection
Guang Yang, Yujie Zhu
- Evaluating distillation methods for data-efficient syntax learning
Takateru Yamakoshi, Thomas L. Griffiths, R. Thomas McCoy, Robert D. Hawkins
- “Going to a trap house” conveys more fear than “Going to a mall”: Benchmarking Emotion Context Sensitivity for LLMs
Eojin Jeon, Mingyu Lee, Sangyun Kim, Junho Kim, Wanzee Cho, Tae-Eui Kam, SangKeun Lee
- [MASK]ED - Language Modeling for Explainable Classification and Disentangling of Socially Unacceptable Discourse
Dimitra Niaouri, Mohamed Rayane GHILENE, Michele Linardi, Julien Longhi
- A Survey of Cognitive Distortion Detection and Classification in NLP
Archie Sage, Jeroen Keppens, Helen Yannakoudakis
- Curse of Knowledge: Your Guidance and Provided Knowledge are biasing LLM Judges in Complex Evaluation
Weiyuan Li, Xintao Wang, Siyu Yuan, Rui Xu, Jiangjie Chen, Qingqing Dong, Yanghua Xiao, Deqing Yang
- Self-Training Large Language Models with Confident Reasoning
Hyosoon Jang, Yunhui Jang, Sungjae Lee, Jungseul Ok, Sungsoo Ahn
- Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical Supervision
Tej Deep Pala, Panshul Sharma, Amir Zadeh, Chuan Li, Soujanya Poria
- Enhancing LLM-Based Persuasion Simulations with Cultural and Speaker-Specific Information
Weicheng Ma, Hefan Zhang, Shiyu Ji, Farnoosh Hashemi, Qichao Wang, Ivory Yang, Joice Chen, Juanwen Pan, Michael Macy, Saeed Hassanpour, Soroush Vosoughi
- An LLM-based Temporal-spatial Data Generation and Fusion Approach for Early Detection of Late Onset Alzheimer’s Disease (LOAD) Stagings Especially in Chinese and English-speaking Populations
Yang Han, Jacqueline C.K. Lam, Victor O.K. Li, Lawrence Y. L. Cheung
- Side Effects of Erasing Concepts from Diffusion Models
Shaswati Saha, Sourajit Saha, Manas Gaur, Tejas Gokhale
- SaCa: A Highly Compatible Reinforcing Framework for Knowledge Graph Embedding via Structural Pattern Contrast
Jiashi Lin, Changhong Jiang, Yixiao Wang, Xinyi Zhu, Zhongtian Hu, Wei Zhang
- Real, Fake, or Manipulated? Detecting Machine-Influenced Text
Yitong Wang, Zhongping Zhang, Margherita Piana, Zheng Zhou, Peter Gerstoft, Bryan A. Plummer
- Character is Destiny: Can Persona-assigned Language Models Make Personal Choices?
Rui Xu, Xintao Wang, Jiangjie Chen, Siyu Yuan, Xinfeng Yuan, Jiaqing Liang, Zulong Chen, xiaoqingdong, Yanghua Xiao
- Neutral Is Not Unbiased: Evaluating Implicit and Intersectional Identity Bias in LLMs Through Structured Narrative Scenarios
Saba Ghanbari Haez, Mauro Dragoni
- BTW: A Non-Parametric Variance Stabilization Framework for Multimodal Model Integration
Jun Hou, Le Wang, Xuan Wang
- Can LLMs Be Efficient Predictors of Conversational Derailment?
Kaustubh Olpadkar, Vikram Sunil Bajaj, Leslie Barrett
- Q-PRM: Adaptive Query Rewriting for Retrieval-Augmented Generation via Step-level Process Supervision
Xiaopeng Ye, Chen Xu, Chaoliang Zhang, Zhaocheng Du, Jun Xu, Gang Wang, Zhenhua Dong
- Factuality Beyond Coherence: Evaluating LLM Watermarking Methods for Medical Texts
Rochana Prih Hastuti, Rian Adam Rajagede, Mansour Al Ghanim, Mengxin Zheng, Qian Lou
- Guess What I am Thinking: A Benchmark for Inner Thought Reasoning of Role-Playing Language Agents
Rui Xu, Mingyu Wang, Xintao Wang, Dakuan Lu, Xiaoyu Tan, Wei Chu, Xu Yinghui
- Dropping Experts, Recombining Neurons: Retraining-Free Pruning for Sparse Mixture-of-Experts LLMs
Yixiao Zhou, Ziyu Zhao, Dongzhou Cheng, Zhiliang Wu, Jie Gui, Yi Yang, Fei Wu, Yu Cheng, Hehe Fan
- BiasFilter: An Inference-Time Debiasing Framework for Large Language Models
Xiaoqing Cheng, Ruizhe Chen, Hongying Zan, Yuxiang Jia, Min Peng
- X-LeBench: A Benchmark for Extremely Long Egocentric Video Understanding
Wenqi Zhou, Kai Cao, Hao Zheng, Yunze Liu, XINYI ZHENG, Miao Liu, Per Ola Kristensson, Walterio W. Mayol-Cuevas, Fan Zhang, Weizhe Lin, Junxiao Shen
- A Survey on Multi-modal Intent Recognition: Recent Advances and New Frontiers
Zhihong Zhu, Fan Zhang, Yunyan Zhang, Jinghan Sun, Zhiqi Huang, Qingqing Long, Bowen Xing, Xian Wu
- Will Annotators Disagree? Identifying Subjectivity in Value-Laden Arguments
Amir Homayounirad, Enrico Liscio, Tong Wang, Catholijn M Jonker, Luciano Cavalcante Siebert
- LLMs Can Compensate for Deficiencies in Visual Representations
Sho Takishita, Jay Gala, Abdelrahman Mohamed, Kentaro Inui, Yova Kementchedjhieva
- Adapting Large Language Models for Character-based Augmentative and Alternative Communication
Dylan Gaines, Keith Vertanen
- Token-Level Metrics for Detecting Incorrect Gold Annotations in Named Entity Recognition
Elena Merdjanovska, Alan Akbik
- Exploring Paraphrasing Strategies for CEFR A1-Level Constraints in LLMs
Eugenio Marzona, Maria Goikhman, Alessio Palmero Aprosio, Massimo Zancanaro
- Efficient Layer-wise LLM Fine-tuning for Revision Intention Prediction
Zhexiong Liu, Diane Litman
- ConText-LE: Cross-Distribution Generalization for Longitudinal Experiential Data via Narrative-Based LLM Representations
Ahatsham Hayat, Bilal Khan, Mohammad Rashedul Hasan
- Chain of Strategy Optimization Makes Large Language Models Better Emotional Supporter
Weixiang Zhao, Xingyu Sui, Xinyang Han, Yang Deng, Yulin Hu, Jiahe Guo, Libo Qin, Qianyun Du, Shijin Wang, Yanyan Zhao, Bing Qin, Ting Liu
- Unlocking Legal Knowledge: A Multilingual Dataset for Judicial Summarization in Switzerland
Luca Rolshoven, Vishvaksenan Rasiah, Srinanda Brügger Bose, Sarah Hostettler, Lara Burkhalter, Matthias Stürmer, Joel Niklaus
- Context Minimization for Resource-Constrained Text Classification: Optimizing Performance-Efficiency Trade-offs through Linguistic Features
Nahid Hossain, Md Faisal Kabir
- FLAIRR-TS - Forecasting LLM-Agents with Iterative Refinement and Retrieval for Time Series
Gunjan Jalori, Preetika Verma, Sercan O Arik
- ULTRABENCH: Benchmarking LLMs under Extreme Fine-grained Text Generation
Longfei Yun, Letian Peng, Jingbo Shang
- The Price of Format: Diversity Collapse in LLMs
Longfei Yun, Chenyang An, Zilong Wang, Letian Peng, Jingbo Shang
- Zipf’s and Heaps’ Laws for Tokens and LLM-generated Texts
Nikolay Mikhaylovskiy
- LLMs for Bayesian Optimization in Scientific Domains: Are We There Yet?
Rushil Gupta, Jason Hartford, Bang Liu
- A Comprehensive Taxonomy of Negation for NLP and Neural Retrievers
Roxana Petcu, Samarth Bhargav, Maarten de Rijke, Evangelos Kanoulas
- Identifying Noise in Human-Created Datasets using Training Dynamics from Generative Models
Maeda Hanafi, Ishan Jindal, Yannis Katsis, Lucian Popa, Huaiyu Zhu
- Can Multiple Responses from an LLM Reveal the Sources of Its Uncertainty?
Yang Nan, Pengfei He, Ravi Tandon, Han Xu
- AfroXLMR-Social: Adapting Pre-trained Language Models for African Languages Social Media Text
Tadesse Destaw Belay, Israel Abebe Azime, Ibrahim Said Ahmad, David Ifeoluwa Adelani, Idris Abdulmumin, Abinew Ali Ayele, Shamsuddeen Hassan Muhammad, Seid Muhie Yimam
- Teaching Language Models To Gather Information Proactively
Tenghao Huang, Sihao Chen, Muhao Chen, Jonathan May, Longqi Yang, Mengting Wan, Pei Zhou
- Linguistic Alignment Predicts Learning in Small Group Tutoring Sessions
Dorothea French, Robert Moulder, Kelechi Ezema, Katharina von der Wense, Sidney K. D’Mello
- EfficientXLang: Towards Improving Token Efficiency Through Cross-Lingual Reasoning
Sanchit Ahuja, Praneetha Vaddamanu, Barun Patra
- Not Lost After All: How Cross-Encoder Attribution Challenges Position Bias Assumptions in LLM Summarization
Elahe Rahimi, Hassan Sajjad, Domenic Rosati, Abeer Badawi, Elham Dolatabadi, Frank Rudzicz
- FuzzAug: Data Augmentation by Coverage-guided Fuzzing for Neural Test Generation
Yifeng He, JICHENG WANG, Yuyang Rong, Hao Chen
- DrAgent: Empowering Large Language Models as Medical Agents for Multi-hop Medical Reasoning
Fenglin Liu, Zheng Li, Hongjian Zhou, Qingyu Yin, Jingfeng Yang, Xin Liu, Zhengyang Wang, Xianfeng Tang, Shiyang Li, Xiang He, Ruijie Wang, Bing Yin, Lei Clifton, David A. Clifton
- XRAG: Cross-lingual Retrieval-Augmented Generation
Wei Liu, Sony Trenous, Leonardo F. R. Ribeiro, Bill Byrne, Felix Hieber
- Can VLMs Recall Factual Associations From Visual References?
Dhananjay Ashok, Ashutosh Chaubey, Hirona Jacqueline Arai, Jonathan May, Jesse Thomason
- MFTCXplain: A Multilingual Benchmark Dataset for Evaluating the Moral Reasoning of LLMs through Multi-hop Hate Speech Explanation
Jackson Trager, Francielle Vargas, Diego Alves, Matteo Guida, Mikel K. Ngueajio, Ameeta Agrawal, Flor Miriam Plaza-del-Arco, Yalda Daryani, Farzan Karimi Malekabadi
- Large Language Models for Multilingual Previously Fact-Checked Claim Detection
Ivan Vykopal, Matúš Pikuliak, Simon Ostermann, Tatiana Anikina, Michal Gregor, Marian Simko
- Debating for Better Reasoning in Vision-Language Models
Ashutosh Adhikari, Mirella Lapata
- Fine-tuning LLMs with Cross-Attention-based Weight Decay for Bias Mitigation
Farsheed Haque, Zhe Fu, Depeng Xu, Shuhan Yuan, Xi Niu
- Profiling LLM’s Copyright Infringement Risks under Adversarial Persuasive Prompting
Jikai Long, Ming Liu, Xiusi Chen, Jialiang Xu, Shenglan Li, Zhaozhuo Xu, Denghui Zhang
- Residualized Similarity for Faithfully Explainable Authorship Verification
Peter Zeng, Pegah Alipoormolabashi, Jihu Mun, Gourab Dey, Nikita Soni, Niranjan Balasubramanian, Owen Rambow, H. Schwartz
- Post-hoc Study of Climate Microtargeting on Social Media Ads with LLMs: Thematic Insights and Fairness Evaluation
Tunazzina Islam, Dan Goldwasser
- MRFD: Multi-Region Fusion Decoding with Self-Consistency for Mitigating Hallucinations in LVLMs
Haonan Ge, Yiwei Wang, Ming-Hsuan Yang, Yujun Cai
- SIMBA UQ: Similarity-Based Aggregation for Uncertainty Quantification in Large Language Models
Debarun Bhattacharjya, Balaji Ganesan, Junkyu Lee, Radu Marinescu, Katya Mirylenka, Michael Glass, Xiao Shou
- Mind the Dialect: NLP Advancements Uncover Fairness Disparities for Arabic Users in Recommendation Systems
Abdulla Alshabanah, Murali Annavaram
- Hopscotch: Discovering and Skipping Redundancies in Language Models
Mustafa Eyceoz, Nikhil Shivakumar Nayak, Hao Wang, Ligong Han, Akash Srivastava
- CLEAR: A Clinically Grounded Tabular Framework for Radiology Report Evaluation
Yuyang Jiang, Chacha Chen, Shengyuan Wang, Feng Li, Zecong Tang, Benjamin M. Mervak, Lydia Chelala, Christopher M Straus, Reve Chahine, Samuel G. Armato III, Chenhao Tan
- Parsing the Switch: LLM-Based UD Annotation for Complex Code-Switched and Low-Resource Languages
Olga Kellert, Nemika Tyagi, Muhammad Imran, Nelvin Licona-Guevara, Carlos Gómez-Rodríguez
- HetGCoT: Heterogeneous Graph-Enhanced Chain-of-Thought LLM Reasoning for Academic Question Answering
Runsong Jia, Mengjia Wu, Ying Ding, Jie Lu, Yi Zhang
- S*: Test Time Scaling for Code Generation
Dacheng Li, Shiyi Cao, Chengkun Cao, Xiuyu Li, Shangyin Tan, Kurt Keutzer, Jiarong Xing, Joseph E. Gonzalez, Ion Stoica
- Language Models Can Easily Learn to Reason from Demonstrations
Dacheng Li, Shiyi Cao, Tyler Griggs, Shu Liu, Xiangxi Mo, Eric Tang, Sumanth Hegde, Shishir G Patil, Matei Zaharia, Joseph E. Gonzalez, Ion Stoica
- FSTs vs ICL: Generalisation in LLMs for an under-resourced language
Ximena Gutierrez, Mikel Segura Elizalde, Victor Mijangos
- SRM-LLM: Semantic Relationship Mining with LLMs for Temporal Knowledge Graph Extrapolation
Fu Zhang, Panfeng Zhang, Jingwei Cheng
- Captioning for Text-Video Retrieval via Dual-Group Direct Preference Optimization
Ji Soo Lee, Byungoh Ko, Jaewon Cho, Howoong Lee, Jaewoon Byun, Hyunwoo J. Kim
- Benchmarking and Improving LLM Robustness for Personalized Generation
Chimaobi Okite, Naihao Deng, Kiran Bodipati, Huaidian Hou, Joyce Chai, Rada Mihalcea
- MemeInterpret: Towards an All-in-One Dataset for Meme Understanding
Jeongsik Park, Khoi P. N. Nguyen, Jihyung Park, Minseok Kim, Jaeheon Lee, Jae Won Choi, Kalyani Ganta, Phalgun Ashrit Kasu, Rohan Sarakinti, Sanjana Vipperla, Sai Sathanapalli, Nishan Vaghani, Vincent Ng
- CoRAG: Enhancing Hybrid Retrieval-Augmented Generation through a Cooperative Retriever Architecture
Zaiyi Zheng, Song Wang, Zihan Chen, Yaochen Zhu, Yinhan He, Liangjie Hong, Qi Guo, Jundong Li
- Hallucination Detection in Structured Query Generation via LLM Self-Debating
Miaoran Li, Jiangning Chen, Minghua Xu, Xiaolong Wang
- Not All Options Are Created Equal: Textual Option Weighting for Token-Efficient LLM-Based Knowledge Tracing
Jong Woo Kim, SeongYeub Chu, Bryan Wong, Mun Yong Yi
- Public Data Assisted Differentially Private In-Context Learning
Seongho Joo, Hyukhun Koh, Kyomin Jung
- Inducing Argument Facets for Faithful Opinion Summarization
Jian Wang, Yanjie Liang, YUQING SUN, Bin Gong
- Scaling Laws Are Unreliable for Downstream Tasks: A Reality Check
Nicholas Lourie, Michael Y. Hu, Kyunghyun Cho
- Familiarity-Aware Evidence Compression for Retrieval-Augmented Generation
Dongwon Jung, Qin Liu, Tenghao Huang, Ben Zhou, Muhao Chen
- O_O-VC: Synthetic Data-Driven One-to-One Alignment for Any-to-Any Voice Conversion
Huu Tuong Tu, Huan Vu, Cuong Tien Nguyen, Dien Hy Ngo, Nguyen Thi Thu Trang
- Simple Factuality Probes Detect Hallucinations in Long-Form Natural Language Generation
Jiatong Han, Neil Band, Muhammed Razzak, Jannik Kossen, Tim G. J. Rudner, Yarin Gal
- CESRec: Constructing Pseudo Interactions for Sequential Recommendation via Conversational Feedback
Yifan Wang, Shen Gao, Jiabao Fang, Rui Yan, Billy Chiu, Shuo Shang
- TTPA: Token-level Tool-use Preference Alignment Training Framework with Fine-grained Evaluation
Chengrui Huang, Shen Gao, Zhengliang Shi, Dongsheng Wang, Shuo Shang
- Avoiding Knowledge Edit Skipping in Multi-hop Question Answering with Guided Decomposition
Yi Liu, Xiangrong Zhu, Xiangyu Liu, Wei Wei, Wei Hu
- Bridging the Creativity Understanding Gap: Small-Scale Human Alignment Enables Expert-Level Humor Ranking in LLMs
Kuan Lok Zhou, Jiayi Chen, Siddharth Suresh, Reuben Narad, Timothy T. Rogers, Lalit K Jain, Robert D Nowak, Bob Mankoff, Jifan Zhang
- SMARTMiner: Extracting and Evaluating SMART Goals from Low-Resource Health Coaching Notes
Iva Bojic, Qi Chwen Ong, Stephanie Hilary Xinyi Ma, Lin Ai, Zheng Liu, Ziwei Gong, Julia Hirschberg, Andy Hau Yan HO, Andy W. H. Khong
- GRIL: Knowledge Graph Retrieval-Integrated Learning with Large Language Models
Jialin Chen, Houyu Zhang, Seongjun Yun, Alejandro Mottini, Rex Ying, Xiang Song, Vassilis N. Ioannidis, Zheng Li, Qingjun Cui
- Exploring Deductive and Inductive Reasoning Capabilities of Large Language Models in Procedural Planning
Jiabao Kang, Xinye Li, Liyan Xu, Qingbin Liu, Xi Chen, Zhiying Tu, Dianhui Chu, Dianbo Sui
- KELE: A Multi-Agent Framework for Structured Socratic Teaching with Large Language Models
Xian Peng, Pan Yuan, Dong Li, Junlong Cheng, Qin Fang, Zhi Liu
- VisualEDU: A Benchmark for Assessing Coding and Visual Comprehension through Educational Problem-Solving Video Generation
Hao Chen, TIANYU SHI, Pengran Huang, Zeyuan Li, Jiahui Pan, Qianglong Chen, Lewei He
- OkraLong: A Flexible Retrieval-Augmented Framework for Long-Text Question Answering
Yulong Hui, Yihao Liu, Yao Lu, Huanchen Zhang
- VerifiAgent: a Unified Verification Agent in Language Model Reasoning
Jiuzhou Han, Wray Buntine, Ehsan Shareghi
- DrKGC: Dynamic Subgraph Retrieval-Augmented LLMs for Knowledge Graph Completion across General and Biomedical Domains
Yongkang Xiao, Sinian Zhang, Yi Dai, Huixue Zhou, Jue Hou, Jie Ding, Rui Zhang
- Understanding the Language Model to Solve the Symbolic Multi-Step Reasoning Problem from the Perspective of Buffer Mechanism
Zhiwei Wang, Yunji Wang, Zhongwang Zhang, Zhangchen Zhou, Hui Jin, Tianyang Hu, Jiacheng Sun, Zhenguo Li, Yaoyu Zhang, Zhi-Qin John Xu
- TwT: Thinking without Tokens by Habitual Reasoning Distillation with Multi-Teachers’ Guidance
Jingxian Xu, Mengyu Zhou, Weichang Liu, Hanbing Liu, Shi Han, Dongmei Zhang
- DAVIS: Planning Agent with Knowledge Graph-Powered Inner Monologue
Minh Pham Dinh, Michael G Yankoski, Munira Syed, Trenton W. Ford
- When Instructions Multiply: Measuring and Estimating LLM Capabilities of Multiple Instructions Following
Keno Harada, Yudai Yamazaki, Masachika Taniguchi, Edison Marrese-Taylor, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo
- FormosanBench: Benchmarking Low-Resource Austronesian Languages in the Era of Large Language Models
Kaiying Kevin Lin, Hsi-Yu Chen, Haopeng Zhang
- SeaPO: Strategic Error Amplification for Robust Preference Optimization of Large Language Models
Jun Rao, Yunjie Liao, Xuebo Liu, Zepeng Lin, Lian Lian, Dong Jin, Shengjun Cheng, Jun Yu, Min Zhang
- FigEx: Aligned Extraction of Scientific Figures and Captions
Jifeng Song, Arun Das, Ge Cui, Yufei Huang
- PATIMT-Bench: A Multi-Scenario Benchmark for Position-Aware Text Image Machine Translation in Large Vision-Language Models
Wanru Zhuang, Wenbo Li, Zhibin Lan, Xu Han, Peng Li, Jinsong Su
- Safeguard Fine-Tuned LLMs Through Pre- and Post-Tuning Model Merging
Hua Farn, Hsuan Su, Shachi H. Kumar, Saurav Sahay, Shang-Tse Chen, Hung-yi Lee
- Self-Ensemble: Mitigating Confidence Distortion for Large Language Models
Zicheng Xu, Guanchu Wang, Guangyao Zheng, Yu-Neng Chuang, Alex Szalay, Xia Hu, Vladimir Braverman
- Annotation-Efficient Language Model Alignment via Diverse and Representative Response Texts
Yuu Jinnai, Ukyo Honda
- Explainable Chain-of-Thought Reasoning: An Empirical Analysis on State-Aware Reasoning Dynamics
Sheldon Yu, Yuxin Xiong, Junda Wu, Xintong Li, Tong Yu, Xiang Chen, Ritwik Sinha, Jingbo Shang, Julian McAuley
- DecisionFlow: Advancing Large Language Model as Principled Decision Maker
Xiusi Chen, Shanyong Wang, Cheng Qian, Hongru WANG, Peixuan Han, Heng Ji
- M-Ped: Multi-Prompt Ensemble Decoding for Large Language Models
Jiaxin GUO, Daimeng Wei, Yuanchang Luo, Hengchao Shang, Zongyao Li, Jinlong Yang, Zhanglin Wu, Zhiqiang Rao, Shimin Tao, Hao Yang
- Butterfly Effects in Toolchains: A Comprehensive Analysis of Failed Parameter Filling in LLM Tool-Agent Systems
Qian Xiong, Yuekai Huang, Ziyou Jiang, Zhiyuan Chang, Yujia Zheng, Tianhao Li, Mingyang Li
- FinLFQA: Evaluating Attributed Text Generation of LLMs in Financial Long-Form Question Answering
Yitao Long, Tiansheng Hu, Yilun Zhao, Arman Cohan, Chen Zhao
- BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models
Xu Huang, Wenhao Zhu, Hanxu Hu, Conghui He, Lei Li, Shujian Huang, Fei Yuan
- Assessing the Sensitivity and Alignment of FOL Closeness Metrics
Ramya Keerthy Thatikonda, Wray Buntine, Ehsan Shareghi
- FoodSafeSum: Enabling Natural Language Processing Applications for Food Safety Document Summarization and Analysis
Juli Bakagianni, Korbinian Randl, Guido Rocchietti, Cosimo Rulli, Franco Maria Nardini, Salvatore Trani, Aron Henriksson, Anna Romanova, John Pavlopoulos
- Self-adaptive Dataset Construction for Real-World Multimodal Safety Scenarios
Jingen Qu, Lijun Li, Bo Zhang, Yichen Yan, Jing Shao
- EnDive: A Cross-Dialect Benchmark for Fairness and Performance in Large Language Models
Abhay Gupta, Jacob Cheung, Philip Meng, Shayan Sayyed, Kevin Zhu, Austen Liao, Sean O’Brien
- FAEDKV: Infinite-Window Fourier Transform for Unbiased KV Cache Compression
Runchao Li, Yao Fu, Mu Sheng, Xianxuan Long, Haotian Yu, Pan Li
- Dynamic Injection of Entity Knowledge into Dense Retrievers
Ikuya Yamada, Ryokan Ri, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo
- When Personalization Meets Reality: A Multi-Faceted Analysis of Personalized Preference Learning
Yijiang River Dong, Tiancheng Hu, Yinhong Liu, Ahmet Üstün, Nigel Collier
- MASTER: Multi-Agent Security Through Exploration of Roles and Topological Structures - A Comprehensive Framework
Yifan Zhu, Chao Zhang, Xin Shi, Xueqiao Zhang, Yi Yang, Yawei Luo
- MONAQ: Multi-Objective Neural Architecture Querying for Time-Series Analysis on Resource-Constrained Devices
Patara Trirat, Jae-Gil Lee
- StandUp4AI: A New Multilingual Dataset for Humor Detection in Stand-up Comedy Videos
Valentin Barriere, Nahuel Gomez, Léo Hemamou, Sofia Callejas, Brian Ravenet
- Does Visual Grounding Enhance the Understanding of Embodied Knowledge in Large Language Models?
Zhihui Yang, Yupei Wang, Kaijie Mo, Zhe Zhao, Renfen Hu
- Semantic Contribution-Aware Adaptive Retrieval for Black-Box Models
Qinhong Lin, Yuang Cai, Linna Zhou, Zhongliang Yang, Dingfu Yu, Xuan Xu, Yu Li
- On Guardrail Models’ Robustness to Mutations and Adversarial Attacks
Elias Bassani, Ignacio Sanchez
- IP-Dialog: Evaluating Implicit Personalization in Dialogue Systems with Synthetic Data
Bo Peng, Zhiheng Wang, Heyang Gong, Chaochao Lu
- Zero-shot Graph Reasoning via Retrieval Augmented Framework with LLMs
Hanqing Li, Sharika Mahadevan, Kiran Jyothi Sheena, Henry Liang, Diego Klabjan
- Privacy in Action: Towards Realistic Privacy Mitigation and Evaluation for LLM-Powered Agents
Shouju Wang, Fenglin Yu, Xirui Liu, Xiaoting Qin, Jue Zhang, Qingwei Lin, Dongmei Zhang, Saravan Rajmohan
- Dissecting Logical Reasoning in LLMs: A Fine-Grained Evaluation and Supervision Study
Yujun Zhou, Jiayi Ye, Zipeng Ling, Yufei Han, Yue Huang, Haomin Zhuang, Zhenwen Liang, Kehan Guo, Taicheng Guo, Xiangqi Wang, Xiangliang Zhang
- ConciseRL: Conciseness-Guided Reinforcement Learning for Efficient Reasoning Models
Razvan-Gabriel Dumitru, Darius Peteleaza, Vikas Yadav, Liangming Pan
- Faster and Better LLMs via Latency-Aware Test-Time Scaling
Zili Wang, Tianyu Zhang, Haoli Bai, Lu Hou, Xianzhi Yu, Wulong Liu, Shiming Xiang, Lei Zhu
- Reasoning-Augmented Conversation for Multi-Turn Jailbreak Attacks on Large Language Models
Zonghao Ying, Deyue Zhang, Zonglei Jing, Yisong Xiao, Quanchen Zou, Aishan Liu, Siyuan Liang, Xiangzheng Zhang, Xianglong Liu, Dacheng Tao
- Distilling Many-Shot In-Context Learning into a Cheat Sheet
Ukyo Honda, Soichiro Murakami, Peinan Zhang
- Tracing Training Footprints: A Calibration Approach for Membership Inference Attacks Against Multimodal Large Language Models
Xiaofan Zheng, Huixuan Zhang, Xiaojun Wan
- PolBiX: Detecting LLMs’ Political Bias in Fact-Checking through X-phemisms
Charlott Jakob, David Harbecke, Patrick Parschan, Pia Wenzel Neves, Vera Schmitt
- URO-Bench: Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models
Ruiqi Yan, Xiquan Li, Wenxi Chen, Zhikang Niu, Chen Yang, Ziyang Ma, Kai Yu, Xie Chen
- Low-Hallucination and Efficient Coreference Resolution with LLMs
Yujian Gan, Yuan Liang, Jinxia Xie, Yanni Lin, Juntao Yu, Massimo Poesio
- Your Mileage May Vary: How Empathy and Demographics Shape Human Preferences in LLM Responses
Yishan Wang, Amanda Cercas Curry, Flor Miriam Plaza-del-Arco
- Diving into Mitigating Hallucinations from a Vision Perspective for Large Vision-Language Models
Weihang Wang, Xinhao Li, Ziyue Wang, Yan Pang, Jielei Zhang, Peiyi Li, Qiang Zhang, Longwen Gao
- PhysicsArena: The First Multimodal Physics Reasoning Benchmark Exploring Variable, Process, and Solution Dimensions
Song Dai, Yibo Yan, Jiamin Su, Zihao Dongfang, Yubo Gao, Yonghua Hei, Jungang Li, Junyan Zhang, Sicheng Tao, Zhuoran Gao, Xuming Hu
- Ko-LongRAG: A Korean Long-Context RAG Benchmark Built with a Retrieval-Free Approach
Yongil Kim, Heuiyeen Yeen, Hyeongu Yun, Jinsik Lee
- Choosing a Model, Shaping a Future: Comparing LLM Perspectives on Sustainability and its Relationship with AI
Annika Bush, Meltem Aksoy, Markus Pauly, Greta Ontrup
- Optimising Factual Consistency in Summarisation via Preference Learning from Multiple Imperfect Metrics
Yuxuan Ye, Raul Santos-Rodriguez, Edwin Simpson
- Judging with Many Minds: Do More Perspectives Mean Less Prejudice? On Bias Amplification and Resistance in Multi-Agent Based LLM-as-Judge
Chiyu Ma, Enpei Zhang, Yilun Zhao, Wenjun Liu, Yaning Jia, Peijun Qing, Lin Shi, Arman Cohan, Yujun Yan, Soroush Vosoughi
- A Regex Minimization Benchmark: A PSPACE-Complete Challenge for Language Models
Hyundong Jin, Joonghyuk Hahn, Yo-Sub Han
- Investigating the Impact of Conceptual Metaphors on LLM-based NLI through Shapley Interactions
Meghdut Sengupta, Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer, Eyke Hüllermeier, Debanjan Ghosh, Henning Wachsmuth
- KurTail: Kurtosis-based LLM Quantization
Mohammad Sadegh Akhondzadeh, Aleksandar Bojchevski, Evangelos Eleftheriou, Martino Dazzi
- VIVA+: Human-Centered Situational Decision-Making
Zhe Hu, Yixiao Ren, Guanzhong Liu, Jing Li, Yu Yin
- QuantAgents: Towards Multi-agent Financial System via Simulated Trading
Xiangyu Li, Yawen Zeng, Xiaofen Xing, Jin Xu, Xiangmin Xu
- LLMs Reproduce Stereotypes of Sexual and Gender Minorities
Ruby Ostrow, Adam Lopez
- Accept or Deny? Evaluating LLM Fairness and Performance in Loan Approval across Table-to-Text Serialization Approaches
Israel Abebe Azime, Deborah D. Kanubala, Tejumade Afonja, Mario Fritz, Isabel Valera, Dietrich Klakow, Philipp Slusallek
- Transfer-Aware Data Selection for Domain Adaptation in Text Retrieval
Linzhu Yu, Huan Li, Ke Chen, Lidan Shou
- Understanding and Improving Information Preservation in Prompt Compression for LLMs
Weronika Łajewska, Laura Aina, Momchil Hardalov, Neha Anna John, Hang Su, Lluis Marquez
- A Benchmark for Hindi Verb-Argument Structure Alternations
Kanishka Jain, Ashwini Vaidya
- Beyond Binary Preferences: Semi-Online Label-Free GRACE-KTO with Group-Wise Adaptive Calibration for High-Quality Long-Text Generation
Jingyang Deng, Ran Chen, Jo-Ku Cheng, Jinwen Ma
- Representation-based Broad Hallucination Detectors Fail to Generalize Out of Distribution
Zuzanna Dubanowska, Maciej Żelaszczyk, Michał Brzozowski, Paolo Mandica, Michal P. Karpowicz
- MAFMO: Multi-modal Adaptive Fusion with Meta-template Optimization for Vision-Language Models
Mingrui Xie, Lulu Xu, Junliang Du
- Multimodal UNcommonsense: From Odd to Ordinary and Ordinary to Odd
Yejin Son, Saejin Kim, Dongjun Min, Youngjae Yu
- Analyzing Gambling Addictions: A Spanish Corpus for Understanding Pathological Behavior
Manuel Couto, Marcos Fernández-Pichel, Mario Ezra Aragon, David E. Losada
- Beyond Surface Alignment: Rebuilding LLMs Safety Mechanism via Probabilistically Ablating Refusal Direction
Yuanbo Xie, Yingjie Zhang, Tianyun Liu, Duohe Ma, Tingwen Liu
- Distributed LLM Serving on Consumer-Grade GPUs by Reconciling Computation and Communication
Lewei Jin, Kui Zhang, Yongqi Chen, Zhuoyifan, Renjie Li, Yi Gao, Bowei Yang, Zhengong Cai, Wei Dong
- SafeToolBench: Pioneering a Prospective Benchmark to Evaluating Tool Utilization Safety in LLMs
Hongfei Xia, Hongru WANG, Zeming Liu, Qian Yu, Yuhang Guo, Haifeng Wang
- Sparsifying Mamba
An Wang, Ruobing Xie, Shuaipeng Li, Xingwu Sun, Zhanhui Kang
- Beneath the Facade: Probing Safety Vulnerabilities in LLMs via Auto-Generated Jailbreak Prompts
Heehyeon Kim, Kyeongryul Lee, Joyce Jiyoung Whang
- ET-MIER: Entity Type-guided Key Mention Identification and Evidence Retrieval for Document-level Relation Extraction
Xin Li, Huangming Xu, Fu Zhang, Jingwei Cheng
- Position IDs Matter: An Enhanced Position Layout for Efficient Context Compression in Large Language Models
Runsong Zhao, Xin Liu, Xinyu Liu, Pengcheng Huang, Chunyang Xiao, Tong Xiao, JingBo Zhu
- Can Role Vectors Affect LLM Behaviour?
Daniele Potertì, Andrea Seveso, Fabio Mercorio
- Semantic Component Analysis: Introducing Multi-Topic Distributions to Clustering-Based Topic Modeling
Florian Eichin, Carolin M. Schuster, Georg Groh, Michael A. Hedderich
- ThinkQE: Query Expansion via an Evolving Thinking Process
Yibin Lei, Tao Shen, Andrew Yates
- Hierarchical Reward Modeling for Fault Localization in Large Code Repositories
Jiwei Zhang, Jianxun Lian, Haiming Qin, Mingyang Zhou, KeZhong Lu, Rui Mao, Hao Liao
- Layer Duplication in LLMs
Neo Eyal, Nachum Dershowitz, Kfir Bar
- Semantic-Aware Action Space Compression via LLM-DRL Synergy for Efficient Task-oriented Dialogue Policy Exploration
Yangyang Zhao, Ben Niu, Yuxuan Tan, Shihan Wang, Libo Qin
- Linear Steerability in Language Models: When It Emerges and How It Evolves
Jianshu She, Xinyue Li, Eric P. Xing, Zhengzhong Liu, Qirong Ho
- A Comprehensive Survey on Learning from Rewards for Large Language Models: Reward Models and Learning Strategies
Xiaobao Wu
- InFact: Informativeness Alignment for Improved LLM Factuality
Roi Cohen, Russa Biswas, Gerard de Melo
- Large Language Model Agents in Finance: A Survey Bridging Research, Practice, and Real-World Deployment
Yifei Dong, Fengyi Wu, Kunlin Zhang, Yilong Dai, Sanjian Zhang, Wanghao Ye, Sihan Chen, Zhi-Qi Cheng
- Problem Solved? Information Extraction Design Space for Layout-Rich Documents using LLMs
Gaye Colakoglu, Gürkan Solmaz, Jonathan Fürst
- Generation-Augmented Retrieval: Rethinking the Role of Large Language Models in Zero-Shot Relation Extraction
Zehan Li, Fu Zhang, Tianyue Peng, He Liu, Jingwei Cheng
- Following Occam’s Razor: Dynamic Combination of Structured Knowledge for Multi-Hop Question Answering using LLMs
Wei Chen, Zhi Zheng, Lili Zhao, Huijun Hou, Tong Xu
- Large Language Models as Reader for Bias Detection
Xuan Luo, Jing Li, Zhong Wenzhong, Geng Tu, Ruifeng Xu
- LOHRec: Leveraging Order and Hierarchy in Generative Sequential Recommendation
Jiawen Xie, Haiyang Wu, Deyi Ji, Yuekui Yang, Shaoping Ma
- Biology-Instructions: A Dataset and Benchmark for Multi-Omics Sequence Understanding Capability of Large Language Models
Haonan He, Yuchen Ren, Yining Tang, Ziyang Xu, Junxian Li, Minghao Yang, Di Zhang, Yuan Dong, Tao Chen, Shufei Zhang, Yuqiang Li, Nanqing Dong, Wanli Ouyang, Dongzhan Zhou, Peng Ye
- AssistedDS: Benchmarking How External Domain Knowledge Assists LLMs in Automated Data Science
An Luo, Xun Xian, Jin Du, Fangqiao Tian, Ganghua Wang, Ming Zhong, Shengchun ZHAO, Xuan Bi, Zirui Liu, Jiawei Zhou, Jayanth Srinivasa, Ashish Kundu, Charles Fleming, Mingyi Hong, Jie Ding
- Are you sure? Measuring models bias in content moderation through uncertainty
Alessandra Urbinati, Mirko Lai, Simona Frenda, Marco Antonio Stranisci
- FOSSIL: Harnessing Feedback on Suboptimal Samples for Data-Efficient Generalisation with Imitation Learning for Embodied Vision-and-Language Tasks
Sabrina McCallum, Amit Parekh, Alessandro Suglia
- Assess and Prompt: A Generative RL Framework for Improving Engagement in Online Mental Health Communities
Bhagesh Gaur, Karan Gupta, Aseem Srivastava, Manish Gupta, Md Shad Akhtar
- Logic: Long-form Outline Generation via Imitative and Critical Self-refinement
Hengwei Liu, Yongliang Shen, Zhe Zheng, Haoyuan Ma, Xingyu Wu, Yin Zhang, Weiming Lu
- No Free Lunch: Retrieval-Augmented Generation Undermines Fairness in LLMs, Even for Vigilant Users
Mengxuan Hu, Hongyi Wu, Ronghang Zhu, Zihan Guan, Dongliang Guo, Daiqing Qi, Sheng Li
- LegoSLM: Connecting LLM with Speech Encoder using CTC Posteriors
Rao Ma, Tongzhou Chen, Kartik Audhkhasi, Bhuvana Ramabhadran
- Skeleton-Guided-Translation: A Benchmarking Framework for Code Repository Translation with Fine-Grained Quality Evaluation
Xing Zhang, Jiaheng Wen, Fangkai Yang, Yu Kang, Pu Zhao, Junhao Wang, Maoquan Wang, Yufan Huang, Shengyu Fu, Elsie Nallipogu, Qingwei Lin, Yingnong Dang, Saravan Rajmohan, Dongmei Zhang
- Parallel Communities Across the Surface Web and the Dark Web
Wenchao Dong, Megha Sundriyal, Seongchan Park, Jaehong Kim, Meeyoung Cha, Tanmoy Chakraborty, Wonjae Lee
- Lemma Dilemma: On Lemma Generation Without Domain- or Language-Specific Training Data
Olia Toporkov, Alan Akbik, Rodrigo Agerri
- LlmFixer: Fix the Helpfulness of Defensive Large Language Models
Zelong Yu, Xiaoming Zhang, Litian Zhang, Yu Yuan, Chaozhuo Li
- Universal Acoustic Adversarial Attacks for Flexible Control of Speech-LLMs
Rao Ma, Mengjie Qian, Vyas Raina, Mark Gales, Kate Knill
- Probing Semantic Routing in Large Mixture-of-Expert Models
Matthew Lyle Olson, Neale Ratzlaff, Musashi Hinck, Sungduk Yu, Man Luo, Chendi Xue, Vasudev Lal
- CMT-Eval: A Novel Chinese Multi-turn Dialogue Evaluation Dataset Addressing Real-world Conversational Challenges
Siyu Tian, Kaijie Mo, Yupei Wang, Renfen Hu
- LastingBench: Defend Benchmarks Against Knowledge Leakage
Yixiong Fang, Tianran Sun, Yuling Shi, Min Wang, Xiaodong Gu
- Learning API Functionality from In-Context Demonstrations for Tool-based Agents
Bhrij Patel, Ashish Jagmohan, Aditya Vempaty
- Predicting Language Models’ Success at Zero-Shot Probabilistic Prediction
Kevin Ren, Santiago Cortes-Gomez, Carlos Miguel Patiño, Ananya Joshi, Ruiqi Lyu, Jingjing Tang, Alistair Turcan, Khurram Yamin, Steven Wu, Bryan Wilder
- GAMIC: Graph-Aligned Molecular In-context Learning for Molecule Analysis via LLMs
Ali Al Lawati, Jason S Lucas, Zhiwei Zhang, Prasenjit Mitra, Suhang Wang
- Rethinking Sign Language Translation: The Impact of Signer Dependence on Model Evaluation
Keren Artiaga, Sabyasachi Kamila, Haithem Afli, Conor Lynch, Mohammed Hasanuzzaman
- Can Large Language Models Identify Implicit Suicidal Ideation? An Empirical Evaluation
Tong Li, Shu Yang, Junchao Wu, Jiyao Wei, Lijie Hu, Mengdi Li, Derek F. Wong, Joshua R. Oltmanns, Di Wang
- Adaptive Platt Scaling with Causal Interpretations for Self-Reflective Language Model Uncertainty Estimates
Anthony Sicilia, Malihe Alikhani
- Treble Counterfactual VLMs: A Causal Approach to Hallucination
Li Li, Jiashu Qu, Linxin Song, Yuxiao Zhou, Yuehan Qin, Tiankai Yang, Yue Zhao
- Video-Skill-CoT: Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning
Daeun Lee, Jaehong Yoon, Jaemin Cho, Mohit Bansal
- Glitter: A Multi-Sentence, Multi-Reference Benchmark for Gender-Fair German Machine Translation
A Pranav, Janiça Hackenbuchner, Giuseppe Attanasio, Manuel Lardelli, Anne Lauscher
- From n-gram to Attention: How Model Architectures Learn and Propagate Bias in Language Modeling
Mohsinul Kabir, Tasfia Tahsin, Sophia Ananiadou
- SENTRA: Selected-Next-Token Transformer for LLM Text Detection
Mitchell Plyler, Yilun Zhang, Alexander Tuzhilin, Saoud Khalifah, Sen Tian
- Automate Strategy Finding with LLM in Quant Investment
Zhizhuo Kou, Holam Yu, Junyu Luo, Jingshu Peng, Xujia Li, Chengzhong Liu, Juntao Dai, Lei Chen, Sirui Han, Yike Guo
- Does Reasoning Introduce Bias? A Study of Social Bias Evaluation and Mitigation in LLM Reasoning
Xuyang Wu, Jinming Nian, Ting-Ruen Wei, Zhiqiang Tao, Hsin-Tai Wu, Yi Fang
- MT-RewardTree: A Comprehensive Framework for Advancing LLM-Based Machine Translation via Reward Modeling
Zhaopeng Feng, Jiahan Ren, Jiayuan Su, Jiamei Zheng, Hongwei Wang, Zuozhu Liu
- Bias after Prompting: Persistent Discrimination in Large Language Models
Nivedha Sivakumar, Natalie Mackraz, Samira Khorshidi, Krishna Patel, Barry-John Theobald, Luca Zappella, Nicholas Apostoloff
- CARVQ: Corrective Adaptor with Group Residual Vector Quantization for LLM Embedding Compression
Dayin Gou, Sanghyun Byun, Nilesh Malpeddi, Gabrielle De Micheli, Prathamesh Vaste, Jacob Song, Woo Seong Chung
- Consistent Discourse-level Temporal Relation Extraction Using Large Language Models
Yi Fan, Michael Strube
- MMPlanner: Zero-Shot Multimodal Procedural Planning with Chain-of-Thought Object State Reasoning
Afrina Tabassum, Bin Guo, Xiyao Ma, Hoda Eldardiry, Ismini Lourentzou
- Internal states before wait modulate reasoning patterns
Dmitrii Troitskii, Koyena Pal, Chris Wendler, Callum Stuart McDougall
- Sparsity May Be All You Need: Sparse Random Parameter Adaptation
Jesus Rios, Pierre Dognin, Ronny Luss, Karthikeyan Natesan Ramamurthy
- Learning to Align: Addressing Character Frequency Distribution Shifts in Handwritten Text Recognition
Panagiotis Kaliosis, John Pavlopoulos
- MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning
Zhaopeng Feng, Shaosheng Cao, Jiahan Ren, Jiayuan Su, Ruizhe Chen, Yan Zhang, Jian Wu, Zuozhu Liu
- Discrete Minds in a Continuous World: Do Language Models Know Time Passes?
Minghan Wang, Ye Bai, Thuy-Trang Vu, Ehsan Shareghi, Gholamreza Haffari
- DLTKG: Denoising Logic-based Temporal Knowledge Graph Reasoning
Xiaoke Wang, Fu Zhang, Jingwei Cheng, Yiwen Chi, Jiashun Peng, Yingsong Ning
- EMO-RL: Emotion-Rule-Based Reinforcement Learning Enhanced Audio-Language Model for Generalized Speech Emotion Recognition
Pengcheng Li, Botao Zhao, Zuheng Kang, Junqing Peng, Xiaoyang Qu, Yayun He, Jianzong Wang
- MANTA: A Scalable Pipeline for Transmuting Massive Web Corpora into Instruction Datasets
Heuiyeen Yeen, Seokhee Hong, Hyeongu Yun, Jinsik Lee
- Fast Quiet-STaR: Thinking Without Thought Tokens
Wei Huang, Yizhe Xiong, Xin Ye, Zhijie Deng, Hui Chen, Zijia Lin, Guiguang Ding
- Lock on Target! Precision Unlearning via Directional Control
Yuntao Wen, Shen Gao, Ruixiang Feng, Feng Guo, Yifan Wang, Ran Le, Yang Song, Shuo Shang
- UniRAG: A Unified RAG Framework for Knowledge-Intensive Queries with Decomposition, Break-Down Reasoning, and Iterative Rewriting
Gun Il Kim, Jong Wook Kim, Beakcheol Jang
- One Shot Dominance: Knowledge Poisoning Attack on Retrieval-Augmented Generation Systems
Zhiyuan Chang, Mingyang Li, Xiaojun Jia, Junjie Wang, Yuekai Huang, Ziyou Jiang, Yang Liu, Qing Wang
- From Generic Empathy to Personalized Emotional Support: A Self-Evolution Framework for User Preference Alignment
Jing Ye, Lu Xiang, Yaping Zhang, Chengqing Zong
- MaskCD: Mitigating LVLM Hallucinations by Image Head Masked Contrastive Decoding
Jingyuan Deng, Yujiu Yang
- ClusterUCB: Efficient Gradient-Based Data Selection for Targeted Fine-Tuning of LLMs
Zige Wang, Qi Zhu, Fei Mi, Minghui Xu, Ruochun Jin, Wenjing Yang
- TrapDoc: Deceiving LLM Users by Injecting Imperceptible Phantom Tokens into Documents
Hyundong Jin, Sicheol Sung, Shinwoo Park, SeungYeop Baik, Yo-Sub Han
- AraReasoner: Evaluating Reasoning-Based LLMs for Arabic NLP
Ahmed Abul Hasanaath, Aisha Alansari, Ahmed Ashraf, Salmane Chafik, Hamzah Luqman, Saad Ezzini
- Tales of Morality: Comparing Human- and LLM-Generated Moral Stories from Visual Cues
Rezvaneh Rezapour, Sullam Jeoung, Zhiwen You, Jana Diesner
- AirRAG: Autonomous Strategic Planning and Reasoning Steer Retrieval Augmented Generation
Wenfeng Feng, Chuzhan Hao, Yuewei Zhang, Guochao Jiang, Jingyi Song
- Evaluating NL2SQL via SQL2NL
Mohammadtaher Safarzadeh, Afshin Oroojlooy, Dan Roth
- DB-Explore: Automated Database Exploration and Instruction Synthesis for Text-to-SQL
Haoyuan Ma, Yongliang Shen, Hengwei Liu, Wenqi Zhang, Haolei Xu, Qiuying Peng, Jun Wang, Weiming Lu
- Do BERT-Like Bidirectional Models Still Perform Better on Text Classification in the Era of LLMs?
Junyan Zhang, Yiming Huang, Shuliang Liu, Yubo Gao, Xuming Hu
- Divide, Optimize, Merge: Scalable Fine-Grained Generative Optimization for LLM Agents
Jiale Liu, Yifan Zeng, Shaokun Zhang, Chi Zhang, Malte Højmark-Bertelsen, Marie Normann Gadeberg, Huazheng Wang, Qingyun Wu
- Evaluating Evaluation Metrics – The Mirage of Hallucination Detection
Atharva Kulkarni, Yuan Zhang, Joel Ruben Antony Moniz, Xiou Ge, Bo-Hsiang Tseng, Dhivya Piraviperumal, Swabha Swayamdipta, Hong Yu
- The Progress Illusion: Revisiting meta-evaluation standards of LLM evaluators
Tianruo Rose Xu, Vedant Gaur, Liu Leqi, Tanya Goyal
- MidPO: Dual Preference Optimization for Safety and Helpfulness in Large Language Models via a Mixture of Experts Framework
Yupeng Qi, Ziyu Lyu, Min Yang, Yanlin Wang, Lu Bai, Lixin Cui
- From KMMLU-Redux to Pro: A Professional Korean Benchmark Suite for LLM Evaluation
Seokhee Hong, Sunkyoung Kim, Guijin Son, Soyeon Kim, Yeonjung Hong, Jinsik Lee
- RealBench: A Chinese Multi-image Understanding Benchmark Close to Real-world Scenarios
Fei Zhao, Chengqiang Lu, Yufan Shen, Qimeng Wang, Yicheng Qian, Haoxin Zhang, Yan Gao, YIWU, Yao Hu, Zhen Wu, Shangyu Xing, Xinyu Dai
- The More, The Better? A Critical Study of Multimodal Context in Radiology Report Summarization
Mong Yuan Sim, Wei Emma Zhang, Xiang Dai, Biaoyan Fang, Sarbin Ranjitkar, Arjun Burlakoti, Jamie Taylor, Haojie Zhuang
- Localizing Malicious Outputs from CodeLLM
Mayukh Borana, Liang Junyi, Sai Sathiesh Rajan, Sudipta Chattopadhyay
- Knowing More, Acting Better: Hierarchical Representation for Embodied Decision-Making
Chunhui Zhang, Zhongyu Ouyang, Xingjian Diao, Zheyuan Liu, Soroush Vosoughi
- Culture is Everywhere: A Call for Intentionally Cultural Evaluation
Juhyun Oh, Inha Cha, Michael Saxon, Hyunseung Lim, Shaily Bhatt, Alice Oh
- Fairness in Automatic Speech Recognition Isn’t a One-Size-Fits-All
Hend ElGhazaly, Bahman Mirheidari, Heidi Christensen, Nafise Sadat Moosavi
- Uncovering Factor-Level Preference to Improve Human-Model Alignment
Juhyun Oh, Eunsu Kim, Jiseon Kim, Wenda Xu, William Yang Wang, Alice Oh
- Adaptive Preference Optimization with Uncertainty-aware Utility Anchor
Xiaobo Wang, Zixia Jia, Jiaqi Li, Qi Liu, Zilong Zheng
- GRAD: Generative Retrieval-Aligned Demonstration Sampler for Efficient Few-Shot Reasoning
Oussama Gabouj, Kamel Charaf, Ivan Zakazov, Nicolas Baldwin, Robert West
- IoTMigrator: LLM-driven Embedded IoT Code Migration across Different OSes for Cloud-device Integration
YQ, Kaijie Gong, Yi Gao, Hao Wang, Wei Dong
- ClueAnchor: Clue-Anchored Knowledge Reasoning Exploration and Optimization for Retrieval-Augmented Generation
Hao Chen, Yukun Yan, Sen Mei, Wanxiang Che, Zhenghao Liu, Qi Shi, Xinze Li, Yuchun Fan, Pengcheng Huang, Qiushi Xiong, Zhiyuan Liu, Maosong Sun
- BAGELS: Benchmarking the Automated Generation and Extraction of Limitations from Scholarly Text
Ibrahim Al Azher, Miftahul Jannat Mokarrama, Zhishuai Guo, Sagnik Ray Choudhury, Hamed Alhoori
- Dense Retrievers Can Fail on Simple Queries: Revealing The Granularity Dilemma of Embeddings
Liyan Xu, Zhenlin Su, Mo Yu, Jiangnan Li, Fandong Meng, Jie Zhou
- Over-Generation and Compaction: A Prompting Strategy for Procedural Text Adaptation with Large Language Models
HyeongSik Kim, Xu Yanheng, Chaoqun Dong, Fei Du
- TransBERT: A Framework for Synthetic Translation in Domain-Specific Language Modeling
Julien Knafou, Luc Mottin, Anaïs Mottaz, Alexandre Flament, Patrick Ruch
- Beyond Fixed-Length Calibration for Post-Training Compression of LLMs
Jaehoon Oh, Dokwan Oh
- Attributes as Textual Genes: Leveraging LLMs as Genetic Algorithm Simulators for Conditional Synthetic Data Generation
Guangzeng Han, Weisi Liu, Xiaolei Huang
- ReCoVeR the Target Language: Language Steering without Sacrificing Task Performance
Hannah Sterz, Fabian David Schmidt, Goran Glavaš, Ivan Vulić
- LC-Eval: A Bilingual Multi-Task Evaluation Benchmark for Long-Context Understanding
Sheikh Jubair, Arwa Omayrah, Amal Alshammari, Alhanoof Althnian, Abdulhamed Alothaimen, Norah A. Alzahrani, Shahad D. Alzaidi, Nora Al-Twairesh, Abdulmohsen Al-Thubaity
- OVFact: Measuring and Improving Open-Vocabulary Factuality for Long Caption Models
Monika Wysoczańska, Shyamal Buch, Anurag Arnab, Cordelia Schmid
- GRPO-Guided Modality Selection Enhanced LoRA-Tuned LLMs for Multimodal Emotion Recognition
Yang Chen, Shuwan Yang, Yan Xiang, Ran Song, Yuxin Huang, Zhengtao Yu
- Defending against Indirect Prompt Injection by Instruction Detection
Tongyu Wen, Chenglong Wang, Xiyuan Yang, Haoyu Tang, Yueqi Xie, Lingjuan Lyu, Zhicheng Dou, Fangzhao Wu
- MUG-Eval: A Proxy Evaluation Framework for Multilingual Generation Capabilities in Any Language
Seyoung Song, Seogyeong Jeong, Eunsu Kim, Jiho Jin, Dongkwan Kim, Jay Shin, Alice Oh
- CAC-CoT: Connector-Aware Compact Chain-of-Thought for Efficient Reasoning Data Synthesis Across Dual-System Cognitive Tasks
Sunguk Choi, Yonghoon Kwon, Heondeuk Lee
- On the Versatility of Sparse Autoencoders for In-Context Learning
Ikhyun Cho, Gaeul Kwon, Julia Hockenmaier
- More Documents, Same Length: Isolating the Challenge of Multiple Documents in RAG
Shahar Levy, Nir Mazor, Lihi Shalmon, Michael Hassid, Gabriel Stanovsky
- CLEAR: A Comprehensive Linguistic Evaluation of Argument Rewriting by Large Language Models
Thomas Huber, Christina Niklaus
- ALRPHFS: Adversarially Learned Risk Patterns with Hierarchical Fast & Slow Reasoning for Robust Agent Defense
Shiyu Xiang, Tong Zhang, Ronghao Chen
- Stop Playing the Guessing Game! Evaluating Conversational Recommender Systems via Target-free User Simulation
SungHwan Kim, Kwangwook Seo, Tongyoung Kim, Jinyoung Yeo, Dongha Lee
- Out-of-Context Reasoning in Large Language Models
Jonathan Shaki, Emanuele La Malfa, Michael J. Wooldridge, Sarit Kraus
- CodeComplex: Dataset for Worst-Case Time Complexity Prediction
SeungYeop Baik, Joonghyuk Hahn, Jungin Kim, Aditi, Mingi Jeon, Yo-Sub Han, Sang-Ki Ko
- Weak2Wise: An Automated, Lightweight Framework for Weak-LLM-Friendly Reasoning Synthesis
Jianing Lin, Yuanfang Guo, Shunning Liu, Zeming Liu, Yunhong Wang
- From Tower to Spire: Adding the Speech Modality to a Translation-Specialist LLM
Kshitij Ambilduke, Ben Peters, Sonal Sannigrahi, Anil Keshwani, Tsz Kin Lam, Bruno Martins, Marcely Zanon Boito, Andre Martins
- LLM Agents at the Roundtable: A Multi-Perspective and Dialectical Reasoning Framework for Essay Scoring
Jinhee Jang, Ayoung Moon, Minkyoung Jung, YoungBin Kim, Seung Jin Lee
- DeepNote: Note-Centric Deep Retrieval-Augmented Generation
Ruobing Wang, Qingfei Zhao, Yukun Yan, Daren Zha, Yuxuan Chen, Shi Yu, Zhenghao Liu, Yixuan Wang, Shuo Wang, Xu Han, Zhiyuan Liu, Maosong Sun
- NormAL LoRA: What is the perfect size?
Aastik, Topu Sai Meghana, Chinmay Prakash Kulkarni, Pragya Paramita Sahu
- Inclusive Leadership in the Age of AI: A Dataset and Comparative Study of LLMs vs. Real-Life Leaders in Workplace Action Planning
Vindhya Singh, Sabine Schulte im Walde, Ksenia Keplinger
- Token Preference Optimization with Self-Calibrated Visual-Anchored Rewards for Hallucination Mitigation
Jihao Gu, Yingyao Wang, Meng Cao, Pi Bu, Jun Song, Bo Zheng, Yancheng He, Shilong Li
- EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion
Advait Joglekar, Divyanshu Singh, Rooshil Rohit Bhatia, Srinivasan Umesh
- Length Representations in Large Language Models
Sangjun Moon, Dasom Choi, Jingun Kwon, Hidetaka Kamigaito, Manabu Okumura
- MultiLingPoT: Boosting Mathematical Reasoning in LLMs through Multilingual Program Integration
Nianqi Li, Zujie Liang, Siyu Yuan, Jiaqing Liang, Feng Wei, Yanghua Xiao
- Simulating Identity, Propagating Bias: Abstraction and Stereotypes in LLM-Generated Text
Pia Sommerauer, Giulia Rambelli, Tommaso Caselli
- Do LVLMs Know What They Know? A Systematic Study of Knowledge Boundary Perception in LVLMs
Zhikai Ding, Shiyu Ni, Keping Bi
- Benchmarking Large Language Models for Cryptanalysis and Side-Channel Vulnerabilities
Utsav Maskey, Chencheng Zhu, Usman Naseem
- MTabVQA: Evaluating Multi-Tabular Reasoning of Language Models in Visual Space
Anshul Singh, Chris Biemann, Jan Strich
- TurnBench-MS: A Benchmark for Evaluating Multi-Turn, Multi-Step Reasoning in Large Language Models
Yiran Zhang, Mo Wang, Xiaoyang Li, Kaixuan Ren, Chencheng Zhu, Usman Naseem
- Assessing LLM Reasoning Steps via Principal Knowledge Grounding
Hyeon Hwang, Yewon Cho, Chanwoong Yoon, Yein Park, Minju Song, Kyungjae Lee, Gangwoo Kim, Jaewoo Kang
- Stratified Selective Sampling for Instruction Tuning with Dedicated Scoring Strategy
Paramita Mirza, Lucas Weber, Fabian Küch
- CoTD-PO: Chain-of-Thought Distillation with Preference Optimization
Lujie Niu, Haochen Sun, Fangkun Zhao, Sheng Chen, Zimeng Bai, Zhang Jiawei, Caixia Yuan, Xiaojie Wang
- Intelligent Document Parsing: Towards End-to-end Document Parsing via Decoupled Content Parsing and Layout Grounding
Hangdi Xing, Feiyu Gao, Qi Zheng, Zhaoqing Zhu, Zirui Shao, Ming Yan
- Feel the Difference? A Comparative Analysis of Emotional Arcs in Real and LLM-Generated CBT Sessions
Xiaoyi Wang, Jiwei Zhang, Guangtao Zhang, Honglei Guo
- Beyond Single-User Dialogue: Assessing Multi-User Dialogue State Tracking Capabilities of Large Language Models
Sangmin Song, Juhwan Choi, JungMin Yun, YoungBin Kim
- All-in-one: Understanding and Generation in Multimodal Reasoning with the MAIA Benchmark
Davide Testa, Giovanni Bonetta, Raffaella Bernardi, Alessandro Bondielli, Alessandro Lenci, Alessio Miaschi, Lucia Passaro, Bernardo Magnini
- Triangulating LLM Progress through Benchmarks, Games, and Cognitive Tests
Filippo Momentè, Alessandro Suglia, Mario Giulianelli, Ambra Ferrari, Alexander Koller, Oliver Lemon, David Schlangen, Raquel Fernández, Raffaella Bernardi
- Entity Profile Generation and Reasoning with LLMs for Entity Alignment
Rumana Ferdous Munne, Md Mostafizur Rahman, Yuji Matsumoto
- Re-FRAME the Meeting Summarization SCOPE: Fact-Based Summarization and Personalization via Questions
Frederic Kirstein, Sonu Kumar, Terry Ruas, Bela Gipp
- Attack as Defense: Safeguarding Large Vision-Language Models from Jailbreaking by Adversarial Attacks
Chongxin Li, Hanzhang Wang, Yuchun Fang
- Emphasising Structured Information: Integrating Abstract Meaning Representation into LLMs for Enhanced Open-Domain Dialogue Evaluation
Bohao Yang, Kun Zhao, Dong Liu, Chen Tang, Liang Zhan, Chenghua Lin
- Differentiated Vision: Unveiling Entity-Specific Visual Modality Requirements for Multimodal Knowledge Graph
Minghang Liu, Yinghan Shen, Zihe Huang, Yuanzhuo Wang, Xuhui Jiang, Huawei Shen
- Post Persona Alignment for Multi-Session Dialogue Generation
Yi-Pei Chen, Noriki Nishida, Hideki Nakayama, Yuji Matsumoto
- MASSIVE-Agents: A Benchmark for Multilingual Function-Calling in 52 Languages
Mayank Kulkarni, Vittorio Mazzia, Judith Gaspers, Chris Hench, Jack FitzGerald
- Crafting Customisable Characters with LLMs: A Persona-Driven Role-Playing Agent Framework
Bohao Yang, Dong Liu, Chenghao Xiao, Kun Zhao, Chen Tang, Chao Li, Lin Yuan, Yang Guang, Lanxiao Huang, Chenghua Lin
- Can LLMs Express Personality Across Cultures? Introducing CulturalPersonas for Evaluating Trait Alignment
Priyanka Dey, Aayush Bothra, Yugal Khanter, Emilio Ferrara, Jieyu Zhao
- Exploring the Hidden Reasoning Process of Large Language Models by Misleading Them
Guanyu Chen, Peiyang Wang, Yizhou Jiang, Yuqian Liu, Chujie Zhao, Ying Fang, Tianren Zhang, Feng Chen
- When Models Reason in Your Language: Controlling Thinking Language Comes at the Cost of Accuracy
Jirui Qi, Shan Chen, Zidi Xiong, Raquel Fernández, Danielle Bitterman, Arianna Bisazza
- The Role of Model Confidence on Bias Effects in Measured Uncertainties for Vision-Language Models
Xinyi Liu, Weiguang Wang, Hangfeng He
- GAttention: Gated Attention for the Detection of Abusive Language
Horacio Jarquín Vásquez, Hugo Jair Escalante, Manuel Montes, Mario Ezra Aragon
- Towards Low-Resource Alignment to Diverse Perspectives with Sparse Feedback
Chu Fei Luo, Samuel Dahan, Xiaodan Zhu
- ProtoXTM: Cross-Lingual Topic Modeling with Document-Level Prototype-based Contrastive Learning
Seung-Won Seo, Soon-Sun Kwon
- One More Question is Enough, Expert Question Decomposition (EQD) Model for Domain Quantitative Reasoning
Mengyu Wang, Sotirios Sabanis, Miguel de Carvalho, Shay B Cohen, Tiejun Ma
- When Punctuation Matters: A Large-Scale Comparison of Prompt Robustness Methods for LLMs
Mikhail Seleznyov, Mikhail Chaichuk, Gleb Ershov, Alexander Panchenko, Elena Tutubalina, Oleg Somov
- RAR²: Retrieval-Augmented Medical Reasoning via Thought-Driven Retrieval
Kaishuai Xu, Wenjun Hou, Yi Cheng, Wenjie Li
- The Security Threat of Compressed Projectors in Large Vision-Language Models
Yudong Zhang, Ruobing Xie, Xingwu Sun, Jiansheng Chen, Zhanhui Kang, Di Wang, Yu Wang
- NarratEX Dataset: Explaining the Dominant Narratives in News Texts
Nuno Guimarães, Purificação Silvano, Ricardo Campos, Alipio Jorge, Ana Filipa Pacheco, Dimitar Iliyanov Dimitrov, Nikolaos Nikolaidis, Roman Yangarber, Elisa Sartori, Nicolas Stefanovitch, Preslav Nakov, Jakub Piskorski, Giovanni Da San Martino
- Radical Allomorphy: Phonological Surface Forms without Phonology
Salam Khalifa, Nizar Habash, Owen Rambow
- Model Calibration for Emotion Detection
Mihaela Petre-Vlad, Cornelia Caragea, Florentina Hristea
- From Benchmark to Better Embeddings: Leveraging Synonym Substitution to Enhance Multimodal Models in Ukrainian
Volodymyr Mudryi, Yurii Laba
- Context Copying Modulation: The Role of Entropy Neurons in Managing Parametric and Contextual Knowledge Conflicts
Zineddine Tighidet, Andrea Mogini, Hedi Ben Younes, Jiali Mei, Patrick Gallinari, Benjamin Piwowarski
- A Generalizable Rhetorical Strategy Annotation Model Using LLM-based Debate Simulation and Labelling
Shiyu Ji, Farnoosh Hashemi, Joice Chen, Juanwen Pan, Weicheng Ma, Hefan Zhang, Sophia Pan, Ming Cheng, Shubham Mohole, Saeed Hassanpour, Soroush Vosoughi, Michael Macy
- SecDecoding: Steerable Decoding for Safer LLM Generation
Jiayou Wang, Rundong Liu, Yue Hu, Huijia Wu, Zhaofeng He
- GENUINE: Graph Enhanced Multi-level Uncertainty Estimation for Large Language Models
Tuo Wang, Adithya Kulkarni, Tyler Cody, Peter A. Beling, Yujun Yan, Dawei Zhou
- ReviewEval: An Evaluation Framework for AI-Generated Reviews
Madhav Krishan Garg, Tejash Prasad, Tanmay Singhal, Chhavi Kirtani, Murari Mandal, Dhruv Kumar
- Overcoming Black-box Attack Inefficiency with Hybrid and Dynamic Select Algorithms
Abhinay Shankar Belde, Rohit Ramkumar, Jonathan Rusert
- GmSLM: Generative Marmoset Spoken Language Modeling
Talia Sternberg, Yossi Adi, Michael London, David Omer
- QA-LIGN: Aligning LLMs through Constitutionally Decomposed QA
Jacob Dineen, Aswin RRV, Qin Liu, Zhikun Xu, Xiao Ye, Ming Shen, Zhaonan Li, Shijie Lu, Chitta Baral, Muhao Chen, Ben Zhou
- Characterizing Positional Bias in Large Language Models: A Multi-Model Evaluation of Prompt Order Effects
Patrick Schilcher, Dominik Karasin, Michael Schöpf, Haisam Saleh, Antonela Tommasel, Markus Schedl
- You Only Use Reactive Attention Slice When Retrieving From Long Context
Yun Joon Soh, Hanxian Huang, Yuandong Tian, Jishen Zhao
- Fine-Tuned Thoughts: Leveraging Chain-of-Thought Reasoning for Industrial Asset Health Monitoring
Shuxin Lin, Dhaval C Patel, Christodoulos Constantinides
- CoViPAL: Layer-wise Contextualized Visual Token Pruning for Large Vision-Language Models
Zicong Tang, Ziyang Ma, Suqing Wang, Zuchao Li, Lefei Zhang, Hai Zhao, Yun Li, Qianren Wang
- Large Language Models with Temporal Reasoning for Longitudinal Clinical Summarization and Prediction
Maya Kruse, Shiyue Hu, Nicholas Derby, Yifu Wu, Samantha Stonbraker, Bingsheng Yao, Dakuo Wang, Elizabeth M. Goldberg, Yanjun Gao
- TransAlign: Machine Translation Encoders are Strong Word Aligners, Too
Benedikt Ebing, Christian Goldschmied, Goran Glavaš
- Pruning Weights but Not Truth: Safeguarding Truthfulness While Pruning LLMs
Yao Fu, Runchao Li, Xianxuan Long, Haotian Yu, Xiaotian Han, Yu Yin, Pan Li
- Augment before You Try: Knowledge-Enhanced Table Question Answering via Table Expansion
Yujian Liu, Jiabao Ji, Tong Yu, Ryan A. Rossi, Sungchul Kim, Handong Zhao, Ritwik Sinha, Yang Zhang, Shiyu Chang
- Evaluating Large Language Models for Belief Inference: Mapping Belief Networks at Scale
Trisevgeni Papakonstantinou, Antonina Zhiteneva, Ana Yutong Ma, Derek Powell, Zachary Horne
- Distinguishing fair from unfair compositional generalization tasks
Ahmad Jabbar, Cleo Condoravdi, Christopher Potts
- SA-CLIP: Language Guided Image Spatial and Action Feature Learning
Guanlin Li, Wenhao Shao, Praboda Rajapaksha, Noel Crespi
- Inefficiencies of Meta Agents for Agent Design
Batu El, Mert Yuksekgonul, James Zou
- SCoder: Progressive Self-Distillation for Bootstrapping Small-Scale Data Synthesizers to Empower Code LLMs
Xinyu Zhang, Changzhi Zhou, Linmei Hu, Luhao Zhang, Xiancai Chen, Haomin Fu, Yang Yang, Mengdi Zhang
- Linguistically-Controlled Paraphrase Generation
Mohamed Elgaar, Hadi Amiri
- LAWCAT: Efficient Distillation from Quadratic to Linear Attention with Convolution across Tokens for Long Context Modeling
Zeyu Liu, Souvik Kundu, Lianghao Jiang, Anni Li, Srikanth Ronanki, Sravan Babu Bodapati, Gourav Datta, Peter Anthony Beerel
- Analyzing Dialectical Biases in LLMs for Knowledge and Reasoning Benchmarks
Eileen Pan, Anna Seo Gyeong Choi, Maartje Ter Hoeve, Skyler Seto, Allison Koenecke
- TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling
Jiahao Qiu, Yifu Lu, Yifan Zeng, Jiacheng Guo, Jiayi Geng, Chenhao Zhu, Xinzhe Juan, Ling Yang, Huazheng Wang, Kaixuan Huang, Yue Wu, Mengdi Wang
- CulturalFrames: Assessing Cultural Expectation Alignment in Text-to-Image Models and Evaluation Metrics
Shravan Nayak, Mehar Bhatia, Xiaofeng Zhang, Verena Rieser, Lisa Anne Hendricks, Sjoerd van Steenkiste, Yash Goyal, Karolina Stanczak, Aishwarya Agrawal
- Decoupled Proxy Alignment: Mitigating Language Prior Conflict for Multimodal Alignment in MLLMs
Chenkun Tan, Pengyu Wang, Shaojun Zhou, Botian Jiang, Zhaowei Li, Dong Zhang, Xinghao Wang, Yaqian Zhou, Xipeng Qiu
- Riemannian Optimization for LoRA on the Stiefel Manifold
JuneYoung Park, Minjae Kang, Seongbae Lee, Haegang Lee, Seongwan Kim, Jaeho Lee
- How Real Are Synthetic Therapy Conversations? Evaluating Fidelity in Prolonged Exposure Dialogues
Suhas BN, Dominik O. Mattioli, Andrew M. Sherrill, Rosa I. Arriaga, Christopher Wiese, Saeed Abdullah
- Large Language Models for Controllable Multi-property Multi-objective Molecule Optimization
Vishal Dey, Xiao Hu, Xia Ning
- Measuring Lexical Diversity of Synthetic Data Generated through Fine-Grained Persona Prompting
Gauri Kambhatla, Chantal Shaib, Venkata S Govindarajan
- Beyond Function-Level Search: Repository-Aware Dual-Encoder Code Retrieval with Adversarial Verification
Aofan Liu, Shiyuan Song, Haoxuan Li, Cehao Yang, Yiyan Qi
- Watermark under Fire: A Robustness Evaluation of LLM Watermarking
Jiacheng Liang, Zian Wang, Spencer Hong, Shouling Ji, Ting Wang
- PEPE: Long-context Extension for Large Language Models via Periodic Extrapolation Positional Encodings
Jikun Hu, Dongsheng Guo, Yuli Liu, Qingyao Ai, Lixuan Wang, Xuebing Sun, Qilei Zhang, Quan Zhou, Cheng Luo
- Beyond Self-Reports: Multi-Observer Agents for Personality Assessment in Large Language Models
Yin Jou Huang, Rafik Hadfi
- Controlled Retrieval-augmented Context Evaluation for Long-form RAG
Jia-Huei Ju, Suzan Verberne, Maarten de Rijke, Andrew Yates
- Humanity’s Last Code Exam: Can Advanced LLMs Conquer Human’s Hardest Code Competition?
Xiangyang Li, Xiaopeng Li, Kuicai Dong, Zhangquanhu, Rongju Ruan, Xinyi Dai, Yasheng Wang, Ruiming Tang
- False Friends Are Not Foes: Investigating Vocabulary Overlap in Multilingual Language Models
Julie Kallini, Dan Jurafsky, Christopher Potts, Martijn Bartelds
- Rule-Guided Extraction: A Hierarchical Rule Optimization Framework for Document-Level Event Argument Extraction
Yue Zuo, Yuxiao Fei, Wanting Ning, Jiayi Huang, Yubo Feng, Lishuang Li
- SOPL: A Sequential Optimal Learning Approach to Automated Prompt Engineering in Large Language Models
Shuyang Wang, Somayeh Moazeni, Diego Klabjan
- CLIP-UP: A Simple and Efficient Mixture-of-Experts CLIP Training Recipe with Sparse Upcycling
Xinze Wang, Chen Chen, Yinfei Yang, Hong-You Chen, Bowen Zhang, Aditya Pal, Xiangxin Zhu, Xianzhi Du
- A Category-Theoretic Approach to Neural-Symbolic Task Planning with Bidirectional Search
Shuhui Qu, Jie Wang, Kincho Law
- HEAL: An Empirical Study on Hallucinations in Embodied Agents Driven by Large Language Models
Trishna Chakraborty, Udita Ghosh, Xiaopan Zhang, Fahim Faisal Niloy, Yue Dong, Jiachen Li, Amit Roy-Chowdhury, Chengyu Song
- Can LLMs Judge Debates? Evaluating Non-Linear Reasoning via Argumentation Theory Semantics
Reza Sanayei, Srdjan Vesic, Eduardo Blanco, Mihai Surdeanu
- How Jailbreak Defenses Work and Ensemble? A Mechanistic Investigation
Zhuohan Long, Siyuan Wang, Shujun Liu, Yuhang Lai
- Visual Self-Refinement for Autoregressive Models
Jiamian Wang, Ziqi Zhou, Chaithanya Kumar Mummadi, Sohail Dianat, Majid Rabbani, Raghuveer Rao, Chen Qiu, Zhiqiang Tao
- Retrieval-Augmented Language Models are Mimetic Theorem Provers
Wenjie Yang, Ruiyuan Huang, Jiaxing Guo, Zicheng Lyu, Tongshan Xu, Shengzhong Zhang, Lun Du, Da Zheng, Zengfeng Huang
- LORE: Continual Logit Rewriting Fosters Faithful Generation
Charles Yu, Qingyun Wang, Yuting Hu, Jinjun Xiong, Heng Ji
- PRINCIPLES: Synthetic Strategy Memory for Proactive Dialogue Agents
Namyoung Kim, Kai Tzu-iunn Ong, Yeonjun Hwang, Minseok Kang, Iiseo Jihn, Gayoung Kim, Minju Kim, Jinyoung Yeo
- SLM-Bench: A Comprehensive Benchmark of Small Language Models on Environmental Impacts
Nghiem Thanh Pham, Tung Kieu, Duc Manh Nguyen, Son Ha Xuan, Nghia Duong-Trung, Danh Le-Phuoc
- A Decoupled Multi-Agent Framework for Complex Text Style Transfer
Lingxi Zhang, Yu-Neng Chuang, Guanchu Wang, Ruixiang Tang, Xuanting Cai, Rajesh Shenoy, Xia Hu
- Mamba Drafters for Speculative Decoding
Daewon Choi, Seunghyuk Oh, Saket Dingliwal, Jihoon Tack, Kyuyoung Kim, Woomin Song, Seojin Kim, Insu Han, Jinwoo Shin, Aram Galstyan, Shubham Katiyar, Sravan Babu Bodapati
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture
Xidong Wang, Dingjie Song, Shunian Chen, Junying Chen, Zhenyang Cai, Chen Zhang, Lichao Sun, Benyou Wang
- Think Clearly: Improving Reasoning via Redundant Token Pruning
Daewon Choi, Jimin Lee, Jihoon Tack, Woomin Song, Saket Dingliwal, Sai Muralidhar Jayanthi, Bhavana Ganesh, Jinwoo Shin, Aram Galstyan, Sravan Babu Bodapati
- A Systematic Survey of Claim Verification: Corpora, Systems, and Case Studies
Zhaxi Zerong, Chenxi Li, Xinyi Liu, Ju-hui Chen, Fei Xia
- Automated Creativity Evaluation for Large Language Models: A Reference-Based Approach
Ruizhe Li, Chiwei Zhu, Benfeng Xu, Xiaorui Wang, Zhendong Mao
- LangProBe: a Language Program Benchmark
Shangyin Tan, Lakshya A Agrawal, Arnav Singhvi, Liheng Lai, Michael J Ryan, Dan Klein, Omar Khattab, Koushik Sen, Matei Zaharia
- Exploring and Detecting Self-disclosure in Multi-modal posts on Chinese Social Media
Jingbao Luo, Ming Liu, Aoli Huo, hufujing, Gang Li, WupengNjust
- MV-CLAM: Multi-View Molecular Interpretation with Cross-Modal Projection via Language Model
Sumin Ha, Jun Hyeong Kim, Yinhua Piao, Changyun Cho, Sun Kim
- Mind the Style Gap: Meta-Evaluation of Style and Attribute Transfer Metrics
Amalie Brogaard Pauli, Isabelle Augenstein, Ira Assent
- ExtremeAIGC: Benchmarking LMM Vulnerability to AI-Generated Extremist Content
Bhavik Chandna, Mariam Aboujenane, Usman Naseem
- Data Augmentation for Maltese NLP using Transliterated and Machine Translated Arabic Data
Kurt Micallef, Nizar Habash, Claudia Borg
- Do LLMs Align Human Values Regarding Social Biases? Judging and Explaining Social Biases with LLMs
Yang Liu, Chenhui Chu
- CoEx – Co-evolving World-model and Exploration
Minsoo Kim, Seung-won Hwang
- BrainLoc: Brain Signal-Based Object Detection with Multi-modal Alignment
Kaixuan Luan, Xiaoda Yang, Hongshun Qiu, Weicai Yan, Xueyi Zhang, Jiaqi Duan, Youliang Zhang, Zhaoyang Li, Donglin Huang, Zejian Xie, JunYu Lu, Ziyue Jiang
- PVTNL: Prompting Vision Transformers with Natural Language for Generalizable Person Re-identification
WANGNING, Lei Xie, Sanglu Lu, Shiwei Gan
- RingFormer: Rethinking Recurrent Transformer with Adaptive Level Signals
Jaemu Heo, Eldor Fozilov, Hyunmin Song, Taehwan Kim
- TriSPrompt: A Hierarchical Soft Prompt Model for Multimodal Rumor Detection with Incomplete Modalities
Jiajun Chen, Yangyang Wu, Xiaoye Miao, Mengying Zhu, Meng Xi
- Evaluating Uncertainty Quantification Methods in Argumentative Large Language Models
Kevin Zhou, Adam Dejl, Gabriel Freedman, Lihu Chen, Antonio Rago, Francesca Toni
- Med-RewardBench: Benchmarking Reward Models and Judges for Medical Multimodal Large Language Models
Meidan Ding, Jipeng Zhang, Wenxuan Wang, Cheng-Yi Li, Wei-Chieh Fang, Hsin-Yu Wu, Haiqin Zhong, Wenting Chen, Linlin Shen
- CLAIMCHECK: How Grounded are LLM Critiques of Scientific Papers?
Jiefu Ou, William Gantt Walden, Kate Sanders, Zhengping Jiang, Kaiser Sun, Jeffrey Cheng, William Jurayj, Miriam Wanner, Shaobo Liang, Candice Morgan, Seunghoon Han, Weiqi Wang, Chandler May, Hannah Recknor, Daniel Khashabi, Benjamin Van Durme
- From Noise to Clarity: Filtering Real and LLM-Generated Samples for Enhanced Intent Detection
Junbao Huang, Weizhen Li, Peijie Huang, Yuhong Xu
- Improving Language Model Personas via Rationalization with Psychological Scaffolds
Brihi Joshi, Xiang Ren, Swabha Swayamdipta, Rik Koncel-Kedziorski, Tim Paek
- KBM: Delineating Knowledge Boundary for Adaptive Retrieval in Large Language Models
Zhen Zhang, Xinyu Wang, Yong Jiang, Zile Qiao, Zhuo Chen, Guangyu Li, Feiteng Mu, Mengting Hu, Pengjun Xie, Fei Huang
- TABARD: A Novel Benchmark for Tabular Anomaly Analysis, Reasoning and Detection
Manan Roy Choudhury, Shikhhar Siingh, Anirudh Iyengar Kaniyar Narayana Iyengar, Sugeeth Puranam, Vivek Gupta
- Aspect-based Sentiment Analysis via Synthetic Image Generation
Ge Chen, Zhongqing Wang, Guodong Zhou
- IntrEx: A Dataset for Modeling Engagement in Educational Conversations
Xingwei Tan, Mahathi Parvatham, Chiara Gambi, Gabriele Pergola
- Bridging the Capability Gap: Joint Alignment Tuning for Harmonizing LLM-based Multi-Agent Systems
Minghang Zhu, Zhengliang Shi, Zhiwei Xu, Shiguang Wu, Lingjie Wang, Pengjie Ren, Zhaochun Ren, Zhumin Chen
- Safety Through Reasoning: An Empirical Study of Reasoning Guardrail Models
Makesh Narsimhan Sreedhar, Traian Rebedea, Christopher Parisien
- Context-Aware Reasoning On Parametric Knowledge for Inferring Causal Variables
Ivaxi Sheth, Sahar Abdelnabi, Mario Fritz
- LoRE-Merging: Exploring Low-Rank Estimation For Large Language Model Merging
Zehua Liu, Han Wu, Yuxuan Yao, Xiaojin Fu, Ruifeng She, Xiongwei Han, Tao Zhong, Mingxuan Yuan
- Benchmarking Foundation Models with Retrieval-Augmented Generation in Olympic-Level Physics Problem Solving
Shunfeng Zheng, Yudi Zhang, Meng Fang, Zihan Zhang, Zhitan Wu, Mykola Pechenizkiy, Ling Chen
- FiRST: Finetuning Router-Selective Transformers for Input-Adaptive Latency Reduction
Akriti Jain, Saransh Sharma, Koyel Mukherjee, Soumyabrata Pal
- Explaining Sources of Uncertainty in Automated Fact-Checking
Jingyi Sun, Greta Warren, Irina Shklovski, Isabelle Augenstein
- PolitiSky24: U.S. Political Bluesky Dataset with User Stance Labels
Peyman Rostami, Vahid Rahimzadeh, Ali Adibi, Azadeh Shakery
- From Ground Trust to Truth: Disparities in Offensive Language Judgments on Contemporary Korean Political Discourse
Seunguk Yu, JungMin Yun, Jinhee Jang, YoungBin Kim
- Misalignment Attack on Text-to-Image Models via Text Embedding Optimization and Inversion
Zhijie Du, Daizong Liu, Pan Zhou
- Domain Pre-training Impact on Representations
Cesar Gonzalez-Gutierrez, Ariadna Quattoni
- KoACD: The First Korean Adolescent Dataset for Cognitive Distortion Analysis via Role-Switching Multi-LLM Negotiation
Jun Seo Kim, Hye Hyeon Kim
- Refined Assessment for Translation Evaluation: Rethinking Machine Translation Evaluation in the Era of Human-Level Systems
Dmitry Popov, Vladislav Negodin, Ekaterina Enikeeva, Iana Matrosova, Nikolay Karpachev, Max Ryabinin
- Pre-Storage Reasoning for Episodic Memory: Shifting Inference Burden to Memory for Personalized Dialogue
Sangyeop Kim, Yohan Lee, Sanghwa Kim, Hyunjong Kim, Sungzoon Cho
- Temporal Consistency for LLM Reasoning Process Error Identification
Jiacheng Guo, Yue Wu, Jiahao Qiu, Kaixuan Huang, Xinzhe Juan, Ling Yang, Mengdi Wang
- Quantifying Compositionality of Classic and State-of-the-Art Embeddings
Zhijin Guo, Chenhao Xue, Zhaozhen Xu, Hongbo Bo, Yuxuan Ye, Janet B. Pierrehumbert, Martha Lewis
- Presumed Cultural Identity: How Names Shape LLM Responses
Siddhesh Milind Pawar, Arnav Arora, Lucie-Aimée Kaffee, Isabelle Augenstein
- I-GUARD: Interpretability-Guided Parameter Optimization for Adversarial Defense
Mamta Mamta, Oana Cocarascu
- DecoupledESC: Enhancing Emotional Support Generation via Strategy-Response Decoupled Preference Optimization
Chao Zhang, Xin Shi, Xueqiao Zhang, Yifan Zhu, Yi Yang, Yawei Luo
- Local Normalization Distortion and the Thermodynamic Formalism of Decoding Strategies for Large Language Models
Tom Kempton, Stuart Burrell
- BRIT: Bidirectional Retrieval over Unified Image-Text Graph
Ainulla Khan, Moyuru Yamada, Srinidhi Akella
- ReTAG: Retrieval-Enhanced, Topic-Augmented Graph-Based Global Sensemaking
Boyoung Kim, Dosung Lee, Sumin An, Jinseong Jeong, Paul Hongsuck Seo
- Vocab Diet: Reshaping the Vocabularies of LLMs with Vector Arithmetic
Yuval Reif, Guy Kaplan, Roy Schwartz
- Capturing Latent Modal Association For Multimodal Entity Alignment
Yongquan Ji, Jingwei Cheng, Fu Zhang, Chenglong Lu
- Explaining novel senses using definition generation with open language models
Mariia Fedorova, Andrey Kutuzov, Francesco Periti, Yves Scherrer
- Can Code-Switched Texts Activate a Knowledge Switch in LLMs? A Case Study on English-Korean Code-Switching
Seoyeon Kim, Huiseo Kim, Chanjun Park, Jinyoung Yeo, Dongha Lee
- Compositional Translation: A Novel LLM-based Approach for Low-resource Machine Translation
Armel Randy Zebaze, Benoît Sagot, Rachel Bawden
- TopXGen: Topic-Diverse Parallel Data Generation for Low-Resource Machine Translation
Armel Randy Zebaze, Benoît Sagot, Rachel Bawden
- Fast, Not Fancy: Rethinking G2P with Rich Data and Statistical Models
Mahta Fetrat Qharabagh, Zahra Dehghanian, Hamid R. Rabiee
- Personalized open world plan generation for safety-critical human centered autonomous systems: A case study on Artificial Pancreas
Ayan Banerjee, Sandeep Gupta
- CaMMT: Benchmarking Culturally Aware Multimodal Machine Translation
Emilio Villa-Cueva, Sholpan Bolatzhanova, Diana Turmakhan, Kareem Elzeky, Henok Biadglign Ademtew, Alham Fikri Aji, Israel Abebe Azime, Jinheon Baek, Frederico Belcavello, Fermin Cristobal, Jan Christian Blaise Cruz, Mary Dabre, Raj Dabre, Toqeer Ehsan, Naome A Etori, Fauzan Farooqui, Jiahui Geng, Guido Ivetta, Thanmay Jayakumar, Soyeong Jeong, Zheng Wei Lim, Aishik Mandal, Sofía Martinelli, Mihail Minkov Mihaylov, Daniil Orel, Aniket Pramanick, Sukannya Purkayastha, Israfel Salazar, Haiyue Song, Tiago Timponi Torrent, Debela Desalegn Yadeta, Injy Hamed, Atnafu Lambebo Tonja, Thamar Solorio
- Training Text-to-Molecule Models with Context-Aware Tokenization
Seojin Kim, Hyeontae Song, Jaehyun Nam, Jinwoo Shin
- Challenging the Evaluator: LLM Sycophancy Under User Rebuttal
Sung Won Kim, Daniel Khashabi
- Perspective-driven Preference Optimization with Entropy Maximization for Diverse Argument Generation
Yilin Cao, Ruike Zhang, Penghui Wei, Qingchao Kong, Wenji Mao
- Spoken Document Retrieval for an Unwritten Language: A Case Study on Gormati
Sanjay Booshanam, Kelly Chen, Ondrej Klejch, Thomas Reitmaier, Dani Kalarikalayil Raju, Electra Wallington, Nina Markl, Jennifer Pearson, Matt Jones, Simon Robinson, Peter Bell
- M-Help: Using Social Media Data to Detect Mental Health Help-Seeking Signals
MSVPJ Sathvik, Zuhair Hasan Shaik, Vivek Gupta
- Brittle Minds, Fixable Activations: Understanding Belief Representations in Language Models
Matteo Bortoletto, Constantin Ruhdorfer, Lei Shi, Andreas Bulling
- Golden Touchstone: A Comprehensive Bilingual Benchmark for Evaluating Financial Large Language Models
Xiaojun Wu, Junxi Liu, Huan-Yi Su, Zhouchi Lin, Yiyan Qi, Chengjin Xu, Jiajun Su, Jiajie Zhong, Fuwei Wang, Saizhuo Wang, Fengrui Hua, Jia Li, Jian Guo
- Quantifying the Risks of LLM- and Tool-assisted Rephrasing to Linguistic Diversity
Mengying Wang, Andreas Spitz
- NUMINA: A Natural Understanding Benchmark for Multi-dimensional Intelligence and Numerical Reasoning Abilities
Changyu Zeng, Yifan Wang, Zimu Wang, Wei Wang, Zhengni Yang, Muyi Bao, Jimin Xiao, Anh Nguyen, Yutao Yue
- MoMentS: A Comprehensive Multimodal Benchmark for Theory of Mind
Emilio Villa-Cueva, S M Masrur Ahmed, Rendi Chevi, Jan Christian Blaise Cruz, Kareem Elzeky, Fermin Cristobal, Alham Fikri Aji, Skyler Wang, Rada Mihalcea, Thamar Solorio
- Code Like Humans: A Multi-Agent Solution for Medical Coding
Andreas Geert Motzfeldt, Joakim Edin, Casper L. Christensen, Christian Hardmeier, Lars Maaløe, Anna Rogers
- Can Out-of-Distribution Evaluations Uncover Reliance on Prediction Shortcuts? A Case Study in Question Answering
Michal Štefánik, Timothee Mickus, Michal Spiegel, Marek Kadlčík, Josef Kuchař
- MEXA: Towards General Multimodal Reasoning with Dynamic Multi-Expert Aggregation
Shoubin Yu, Yue Zhang, Ziyang Wang, Jaehong Yoon, Mohit Bansal
- Lifelong Knowledge Editing requires Better Regularization
Akshat Gupta, Phudish Prateepamornkul, Maochuan Lu, Ahmed Alaa, Thomas Hartvigsen, Gopala Anumanchipalli
- Lost in Embeddings: Information Loss in Vision–Language Models
Wenyan Li, Raphael Tang, Chengzu Li, Clemente Pasti, Vésteinn Snæbjarnarson, Caiqi Zhang, Ivan Vulić, Ryan Cotterell, Anders Søgaard
- Assessing the Role of Data Quality in Training Bilingual Language Models
Skyler Seto, Maartje Ter Hoeve, Maureen de Seyssel, David Grangier
- DORM: Preference Data Weights Optimization for Reward Modeling in LLM Alignment
Rongzhi Zhang, Chenwei Zhang, Xinyang Zhang, Liang Qiu, Haoming Jiang, Yuchen Zhuang, Qingru Zhang, Hyokun Yun, Xian Li, Bing Yin, Tuo Zhao, Chao Zhang
- Enhancing Domain-Specific Encoder Models with LLM-Generated Data: How to Leverage Ontologies, and How to Do Without Them
Marc Felix Brinner, Tarek Al Mustafa, Sina Zarrieß
- Aligning Dialogue Agents with Global Feedback via Large Language Model Multimodal Reward Decomposition
Dong Won Lee, Hae Won Park, Cynthia Breazeal, Louis-Philippe Morency
- UrduFactCheck: An Agentic Fact-Checking Framework for Urdu with Evidence Boosting and Benchmarking
Sarfraz Ahmad, Hasan Iqbal, Momina Ahsan, Numaan Naeem, Muhammad Ahsan Riaz Khan, Arham Riaz, Muhammad Arslan Manzoor, Yuxia Wang, Preslav Nakov
- Echoes of Agreement: Argument Driven Sycophancy in Large Language Models
Avneet Kaur
- Rethinking NLP for Chemistry: A Critical Look at the USPTO Benchmark
Derin Ozer, Nicolas Gutowski, Benoit Da Mota, Thomas Cauchy, Sylvain Lamprier
- Investigating Dictionary Expansion for Video-based Sign Language Dictionaries
Aashaka Desai, Daniela Massiceti, Richard Ladner, Hal Daumé III, Danielle Bragg, Alex Xijie Lu
- From Insight to Exploit: Leveraging LLM Collaboration for Adaptive Adversarial Text Generation
Najrin Sultana, Md Rafi Ur Rashid, Kang Gu, Shagufta Mehnaz
- Beyond Contrastive Learning: Synthetic Data Enables List-wise Training with Multiple Levels of Relevance
Reza Esfandiarpoor, George Zerveas, Ruochen Zhang, Macton Mgonzo, Carsten Eickhoff, Stephen Bach
- Instability in Downstream Task Performance During LLM Pretraining
Yuto Nishida, Masaru Isonuma, Yusuke Oda
- A Comparison of Independent and Joint Fine-tuning Strategies for Retrieval-Augmented Generation
Neal Gregory Lawton, Alfy Samuel, Anoop Kumar, Daben Liu
- mrCAD: Multimodal Communication to Refine Computer-aided Designs
William P McCarthy, Saujas Vaduguru, Karl D.D. Willis, Justin Matejka, Judith E Fan, Daniel Fried, Yewen Pu
- MOCHA: Are Code Language Models Robust Against Multi-Turn Malicious Coding Prompts?
Muntasir Wahed, Xiaona Zhou, Kiet A. Nguyen, Tianjiao Yu, Nirav Diwan, Gang Wang, Dilek Hakkani-Tür, Ismini Lourentzou
- How Can Input Reformulation Improve Tool Usage Accuracy in a Complex Dynamic Environment? A Study on τ-bench
Venkatesh Mishra, Amir Saeidi, Satyam Raj, Mutsumi Nakamura, Jayanth Srinivasa, Gaowen Liu, Ali Payani, Chitta Baral
- Evaluating Fairness in Large Vision-Language Models Across Diverse Demographic Attributes and Prompts
Xuyang Wu, Yuan Wang, Hsin-Tai Wu, Zhiqiang Tao, Yi Fang
- VIBE: Can a VLM Read the Room?
Tania Chakraborty, Eylon Caplan, Dan Goldwasser
- LoRATK: LoRA Once, Backdoor Everywhere in the Share-and-Play Ecosystem
Hongyi Liu, Shaochen Zhong, Xintong Sun, Minghao Tian, Mohsen Hariri, Zirui Liu, Ruixiang Tang, Zhimeng Jiang, Jiayi Yuan, Yu-Neng Chuang, Li Li, Soo-Hyun Choi, Rui Chen, Vipin Chaudhary, Xia Hu
- Pearl: A Multimodal Culturally-Aware Arabic Instruction Dataset
Fakhraddin Alwajih, Samar M. Magdy, Abdellah El Mekki, Omer Nacar, Youssef Nafea, Safaa Taher Abdelfadil, Abdulfattah Mohammed Yahya, Hamzah Luqman, Nada Almarwani, Samah Aloufi, Baraah Qawasmeh, Houdaifa Atou, Serry Sibaee, Hamzah A. Alsayadi, Walid Al-Dhabyani, Maged S. Al-shaibani, Aya El aatar, Nour Qandos, Rahaf Alhamouri, Samar Ahmad, Razan Khassib, Lina Hamad, Mohammed Anwar AL-Ghrawi, Fatimah Alshamari, Cheikh Malainine, Doaa Qawasmeh, Aminetou Yacoub, Tfeil moilid, Ruwa AbuHweidi, Ahmed Aboeitta, Vatimetou Mohamed Lemin, Reem Abdel-Salam, Ahlam Bashiti, Adel Ammar, Aisha Alansari, Ahmed Ashraf, Nora Alturayeif, Sara Shatnawi, Alcides Alcoba Inciarte, AbdelRahim A. Elmadany, Mohamedou cheikh tourad, Ismail Berrada, Mustafa Jarrar, Shady Shehata, Muhammad Abdul-Mageed
- Protein Large Language Models: A Comprehensive Survey
Yijia Xiao, Wanjia Zhao, Junkai Zhang, Yiqiao Jin, Han Zhang, Zhicheng Ren, Renliang Sun, Haixin Wang, Guancheng Wan, Pan Lu, Xiao Luo, Yu Zhang, James Zou, Yizhou Sun, Wei Wang
- MAKIEval: A Multilingual Automatic WiKidata-based Framework for Cultural Awareness Evaluation for LLMs
Raoyuan Zhao, Beiduo Chen, Barbara Plank, Michael A. Hedderich
- Looking Beyond the Pixels: Evaluating Visual Metaphor Understanding in VLMs
Manishit Kundu, Sumit Shekhar, Pushpak Bhattacharyya
- AGENTVIGIL: Automatic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents
Zhun Wang, Vincent Siu, Zhe Ye, Tianneng Shi, Yuzhou Nie, Xuandong Zhao, Chenguang Wang, Wenbo Guo, Dawn Song
- Improving LLM-as-a-Judge Inference with the Judgment Distribution
Victor Wang, Michael JQ Zhang, Eunsol Choi
- Learning Is Not A Race: Improving Retrieval in Language Models via Equal Learning
Wanqian Yang, Aahlad Manas Puli, Rajesh Ranganath
- The Prompt Makes the Person(a): A Systematic Evaluation of Sociodemographic Persona Prompting for Large Language Models
Marlene Lutz, Indira Sen, Georg Ahnert, Elisa Rogers, Markus Strohmaier
- Spiral of Silence in Large Language Model Agents
Mingze Zhong, Meng Fang, Zijing Shi, Yuxuan Huang, Shunfeng Zheng, Yali Du, Ling Chen, Jun Wang
- Do We Know What LLMs Don’t Know? A Study of Consistency in Knowledge Probing
Raoyuan Zhao, Abdullatif Köksal, Ali Modarressi, Michael A. Hedderich, Hinrich Schuetze
- Context Length Alone Hurts LLM Performance Despite Perfect Retrieval
Yufeng Du, Minyang Tian, Srikanth Ronanki, Subendhu Rongali, Sravan Babu Bodapati, Aram Galstyan, Azton Wells, Roy Schwartz, Eliu A Huerta, Hao Peng
- DebUnc: Improving Large Language Model Agent Communication With Uncertainty Metrics
Luke Yoffe, Alfonso Amayuelas, William Yang Wang
- ProcVQA: Benchmarking the Effects of Structural Properties in Mined Process Visualizations on Vision–Language Model Performance
Kazi Tasnim Zinat, Saad Mohammad Abrar, Shoumik Saha, Sharmila Duppala, Saimadhav Naga Sakhamuri, Zhicheng Liu
- Probing Political Ideology in Large Language Models: How Latent Political Representations Generalize Across Tasks
Tianyi Zhang
- Understanding GUI Agent Localization Biases through Logit Sharpness
Xingjian Tao, Yiwei Wang, Zhicheng Yang, Jing Tang
- The Language of Interoception: Examining Embodiment and Emotion Through a Corpus of Body Part Mentions
Sophie Wu, Jan Philip Wahle, Saif M. Mohammad
- HomoGraphAdapter: A Homogeneous Graph Neural Network as an Effective Adapter for Vision-Language Models
Chuan He, Zhuozhao Li, Song Guo, Xiaocheng Lu, Jinxiang Lai
- No Black Boxes: Interpretable and Interactable Predictive Healthcare with Knowledge-Enhanced Agentic Causal Discovery
Xiaoxue Han, Pengfei Hu, Chang Lu, Jun-En Ding, Feng Liu, Yue Ning
- PROOD: A Simple LLM Out-of-Distribution Guardrail Leveraging Response Semantics
Joshua Tint
- ICL-Bandit: Relevance Labeling in Advertisement Recommendation Systems via LLM
Lu Wang, Chiming Duan, Pu Zhao, Fangkai Yang, Yong Shi, Xuefeng Luo, Bingjing Xu, Weiwei Deng, Qingwei Lin, Dongmei Zhang
- Intent-aware Schema Generation and Refinement for Literature Review Tables
Vishakh Padmakumar, Joseph Chee Chang, Kyle Lo, Doug Downey, Aakanksha Naik
- NLP Needs Diversity outside of ‘Diversity’
Joshua Tint
- Anatomy of a Feeling: Narrating Embodied Emotions via Large Vision-Language Models
Mohammad Saim, Phan Anh Duong, Cat Luong, Aniket Bhanderi, Tianyu Jiang
- Towards Universal Debiasing for Language Models-based Tabular Data Generation
Tianchun Li, Tianci Liu, Xingchen Wang, Rongzhe Wei, Pan Li, Lu Su, Jing Gao
- Beyond Linear Steering: Unified Multi-Attribute Control for Language Models
Narmeen Fatimah Oozeer, Luke Marks, Fazl Barez, Amir Abdullah
- Unequal Scientific Recognition in the Age of LLMs
Yixuan Liu, Abel Elekes, Jianglin Lu, Rodrigo Dorantes-Gilardi, Albert-Laszlo Barabasi
- Zero-Shot Fine-Grained Image Classification Using Large Vision-Language Models
Md. Atabuzzaman, Andrew Zhang, Chris Thomas
- Using tournaments to calculate AUROC for zero-shot classification with LLMs
WonJin Yoon, Ian Bulovic, Timothy A. Miller
- Exploration-Driven Reinforcement Learning for Expert Routing Improvement in Mixture-of-Experts Language Models
Gyunyeop Kim, Sangwoo Kang
- D2CS - Documents Graph Clustering using LLM supervision
Yoel Ashkenazi, Etzion Harari, Regev Yehezkel Imra, Naphtali Abudarham, Dekel Cohen, Yoram Louzoun
- GeoChain: Multimodal Chain-of-Thought for Geographic Reasoning
Sahiti Yerramilli, Nilay Pande, Rynaa Grover, Jayant Sravan Tamarapalli
- SteerVLM: Robust Model Control through Lightweight Activation Steering for Vision Language Models
Anushka Sivakumar, Andrew Zhang, Zaber Ibn Abdul Hakim, Chris Thomas
- FractalLLM: Lossless Self-Speculative Decoding with Layer Embedded Self-Compression
Juhyeong Kim, Sangyeon Yu, Gyunyeop Kim, Sangwoo Kang
- Saten: Sparse Augmented Tensor Networks for Post-Training Compression of Large Language Models
Ryan Solgi, Kai Zhen, Rupak Vignesh Swaminathan, Nathan Susanj, Athanasios Mouchtaris, Siegfried Kunzmann, Zheng Zhang
- Third-Person Appraisal Agent: Simulating Human Emotional Reasoning in Text with Large Language Models
Simin Hong, Jun Sun, Hongyang Chen
- Source-primed Multi-turn Conversation Helps Large Language Models Translate Documents
Hanxu Hu, Jannis Vamvas, Rico Sennrich
- Mitigating Spurious Correlations via Counterfactual Contrastive Learning
Fengxiang Cheng, Chuan Zhou, Xiang Li, Alina Leidinger, Haoxuan Li, Robert Van Rooij
- The RAG Paradox: A Black-Box Attack Exploiting Unintentional Vulnerabilities in Retrieval-Augmented Generation Systems
Chanwoo Choi, Jinsoo Kim, Sukmin Cho, Soyeong Jeong, Buru Chang
- Guiding Large Language Models for Biomedical Entity Linking via Restrictive and Contrastive Decoding
Zhenxi Lin, Ziheng Zhang, Jian Wu, Yefeng Zheng, Xian Wu
- Cut the Deadwood Out: Backdoor Purification via Guided Module Substitution
Yao Tong, Weijun Li, Xuanli He, Haolan Zhan, Qiongkai Xu
- RepoDebug: Repository-Level Multi-Task and Multi-Language Debugging Evaluation of Large Language Models
Jingjing Liu, Zeming Liu, Zihao Cheng, Mengliang He, Xiaoming Shi, Yuhang Guo, Xiangrong Zhu, Yuanfang Guo, Yunhong Wang, Haifeng Wang
- FaStFact: Faster, Stronger Long-Form Factuality Evaluations in LLMs
Yingjia Wan, Haochen Tan, Xiao Zhu, Xinyu Zhou, Zhiwei Li, Qingsong Lv, Changxuan Sun, Yi Xu, Jianqiao Lu, Yinhong Liu, Zhijiang Guo
- PropXplain: Can LLMs Enable Explainable Propaganda Detection?
Maram Hasanain, Md Arid Hasan, Mohamed Bayan Kmainasi, Elisa Sartori, Ali Ezzat Shahroor, Giovanni Da San Martino, Firoj Alam
- EoT: Evolution of Thoughts for Complex Reasoning Tasks
Qin Hua, Jiaqi Sun, Shiyou Qian, Dingyu Yang, Jian Cao, Guangtao Xue
- Reveal and Release: Iterative LLM Unlearning with Self-generated Data
Linxi Xie, Xin Teng, Shichang Ke, Hongyi Wen, Shenji Wan
- An Evaluation Resource for Grounding Translation Errors
Sujin Chen, Kang Wang, Zixuan Zhou, Xiangyu Duan, Wanqun Zhang, Hao Yang, Jinsong Su, Min Zhang
- Enhancing Time Awareness in Generative Recommendation
Sunkyung Lee, Seongmin Park, Jonghyo Kim, Mincheol Yoon, Jongwuk Lee
- Adaptive LLM Routing under Budget Constraints
Pranoy Panda, Raghav Magazine, Chaitanya Devaguptapu, Sho Takemori, Vishal Sharma
- Promptception: How Sensitive Are Large Multimodal Models to Prompts?
Mohamed Insaf Ismithdeen, Muhammad Uzair Khattak, Salman Khan
- Can Federated Learning Safeguard Private Data in LLM Training? Vulnerabilities, Attacks, and Defense Evaluation
Wenkai Guo, Xuefeng Liu, Haolin Wang, Jianwei Niu, Shaojie Tang, Jing Yuan
- Runaway is Ashamed, But Helpful: On the Early-Exit Behavior of Large Language Model-based Agents in Embodied Environments
Qingyu Lu, Liang Ding, Siyi Cao, Xuebo Liu, Kanjian Zhang, Jinxia Zhang, Dacheng Tao
- AutoMIR: Effective Zero-Shot Medical Information Retrieval without Relevance Labels
Lei Li, Xiangxu Zhang, Xiao Zhou, Zheng Liu
- RG-VQA: Leveraging Retriever-Generator Pipelines for Knowledge Intensive Visual Question Answering
Settaluri Lakshmi Sravanthi, Pulkit Agarwal, Debjyoti Mondal, Rituraj Singh, Subhadarshi Panda, Ankit Mishra, Kiran Pradeep, Srihari K B, Godawari Sudhakar Rao, Pushpak Bhattacharyya
- Enhancing RAG Efficiency with Adaptive Context Compression
Shuyu Guo, Shuo Zhang, Zhaochun Ren
- Revealing the impact of synthetic native samples and multi-tasking strategies in Hindi-English code-mixed humour and sarcasm detection
Debajyoti Mazumder, Aakash Kumar, Jasabanta Patro
- CogAtom: From Cognitive Atoms to Olympiad-level Mathematical Reasoning in Large Language Models
Zhuofan Chen, Jiyuan He, Yichi Zhang, Xing Hu, Haoxing Wen, Jun Bai, Wenge Rong
- Efficient Latent Semantic Clustering for Scaling Test-Time Computation of LLMs
Sungjae Lee, Hoyoung Kim, Jeongyeon Hwang, Eunhyeok Park, Jungseul Ok
- BannerBench: Benchmarking Vision Language Models for Multi-Ad Selection with Human Preferences
Hiroto Otake, Peinan Zhang, Yusuke Sakai, Masato Mita, Hiroki Ouchi, Taro Watanabe
- DeKeyNLU: Enhancing Natural Language to SQL Generation through Task Decomposition and Keyword Extraction
Jian Chen, Zhenyan Chen, Xuming Hu, Peilin Zhou, Yining Hua, Han Fang, Cissy Hing Yee Choy, Xinmei Ke, Jingfeng Luo, Zixuan Yuan
- Facilitating Cross-lingual Transfer of Empathy through Language-independent Latent Diffusion: A Case Study in Chinese
Junlin Li, PENG Bo, Yu-Yin Hsu
- Evaluating Compound AI Systems through Behaviors, Not Benchmarks
Pranav Bhagat, K N Ajay Shastry, Pranoy Panda, Chaitanya Devaguptapu
- SciCompanion: Graph-Grounded Reasoning for Structured Evaluation of Scientific Arguments
Joshua Alan Flashner, Adithya Kulkarni, Dawei Zhou
- From Generation to Detection: A Multimodal Multi-Task Dataset for Benchmarking Health Misinformation
Zhihao Zhang, Yiran Zhang, Xiyue Zhou, Liting Huang, Imran Razzak, Preslav Nakov, Usman Naseem
- Estimating Machine Translation Difficulty
Lorenzo Proietti, Stefano Perrella, Vilém Zouhar, Roberto Navigli, Tom Kocmi
- TIU-Bench: A Benchmark for Evaluating Large Multimodal Models on Text-rich Image Understanding
Kun Zhang, Liqiang Niu, Zhen Cao, Fandong Meng, Jie Zhou
- Breaking Token Into Concepts: Exploring Extreme Compression in Token Representation Via Compositional Shared Semantics
Kavin R V, Pawan Goyal
- ExeSQL: Self-Taught Text-to-SQL Models with Execution-Driven Bootstrapping for SQL Dialects
Jipeng Zhang, Haolin Yang, Kehao Miao, Ruiyuan Zhang, Renjie Pi, Jiahui Gao, Xiaofang Zhou
- Under the Shadow of Babel: How Language Shapes Reasoning in LLMs
Chenxi Wang, Yixuan Zhang, Lang Gao, Zixiang Xu, Zirui Song, Yanbo Wang, Xiuying Chen
- Think Right, Not More: Test-Time Scaling for Numerical Claim Verification
Primakov Chungkham, Venktesh V, Vinay Setty, Avishek Anand
- Nexus: Adaptive Upcycling to Efficiently Pretrain Mixture of Experts
Nikolas Gritsch, Qizhen Zhang, Acyr Locatelli, Sara Hooker, Ahmet Üstün
- Exploring Context Strategies in LLMs for Discourse-Aware Machine Translation
Ritvik Choudhary, Rem Hida, Masaki Hamada, Hayato Futami, Toshiyuki Sekiya
- Insights into using temporal coordinated behaviour to explore connections between social media posts and influence
Elisa Sartori, Serena Tardelli, Maurizio Tesconi, Mauro Conti, Alessandro Galeazzi, Stefano Cresci, Giovanni Da San Martino
- SpecCoT: Accelerating Chain-of-Thought Reasoning through Speculative Exploration
Junhan Shi, Yijia Zhu, Zhenning Shi, Dan Zhao, Qing Li, Yong Jiang
- A Similarity Measure for Comparing Conversational Dynamics
Sang Min Jung, Kaixiang Zhang, Cristian Danescu-Niculescu-Mizil
- AgentDrug: Utilizing Large Language Models in an Agentic Workflow for Zero-Shot Molecular Optimization
Le Huy Khiem, Ting Hua, Nitesh V Chawla
- Improving Preference Alignment of LLM with Inference-Free Self-Refinement
Fukun Ma, Kaibin Tian, Jieting Xue, Xiaoyi Wang, Ye Ma, Quan Chen, Peng Jiang, Lijie Wen
- Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing Guarantees
Ahmed Heakl, Sarim Hashmi, Chaimaa Abi, Celine Lee, Abdulrahman Mahmoud
- StructuThink: Reasoning with Task Transition Knowledge for Autonomous LLM-Based Agents
Haiyu Zhao, Zhenyu Guo, Chunhong Zhang, Ziyu Zhou, Zheng Hu
- Leveraging Unpaired Feedback for Long-Term LLM-based Recommendation Tuning
Jizhi Zhang, Chongming Gao, Wentao Shi, Xin Chen, Jingang Wang, Xunliang Cai, Fuli Feng
- Investigating Multi-layer Representations for Dense Passage Retrieval
Zhongbin Xie, Thomas Lukasiewicz
- KELE: Residual Knowledge Erasure for Enhanced Multi-hop Reasoning in Knowledge Editing
Mengqi Zhang, Bowen Fang, Qiang Liu, Xiaotian Ye, Shu Wu, Pengjie Ren, Zhumin Chen, Liang Wang
- Dissecting Persona-Driven Reasoning in Language Models via Activation Patching
Ansh Poonia, Maeghal Jain
- PUER: Boosting Few-shot Positive-Unlabeled Entity Resolution with Reinforcement Learning
Yaoshu Wang, Mengyi Yan, Wei Wang
- Toward the Automatic Detection of Word Meaning Negotiation Indicators in Conversation
Aina Garí Soler, Matthieu Labeau, Chloé Clavel
- Forget the Unneeded: Backdooring Large Language Models via Contrastive-enhanced Machine Unlearning
Shiji Yang, Shu Zhao, Congyao Mei, Zhen Yang, Jie Chen, Fulan Qian, Zhen Duan, Yanping Zhang
- Equipping Retrieval-Augmented Large Language Models with Document Structure Awareness
Lingnan Xu, Chong Feng, Kaiyuan Zhang, Liu Zhengyong, Wenqiang Xu, Fanqing Meng
- QEVA: A Reference-Free Evaluation Metric for Narrative Video Summarization with Multimodal Question Answering
Woojun Jung, Junyeong Kim
- Thinking Before You Speak: A Proactive Test-time Scaling Approach
Cong Liu, Wenchang Chai, Hejun Wu, Yan Pan, Pengxu Wei, Liang Lin
- Do Before You Judge: Self-Reference as a Pathway to Better LLM Evaluation
Wei-Hsiang Lin, Sheng-Lun Wei, Hen-Hsen Huang, Hsin-Hsi Chen
- Beyond Content: How Grammatical Gender Shapes Visual Representation in Text-to-Image Models
Muhammed Saeed, Shaina Raza, Ashmal Vayani, Muhammad Abdul-Mageed, Ali Emami, Shady Shehata
- ToolHaystack: Stress-Testing Tool-Augmented Language Models in Realistic Long-Term Interactions
Beong-woo Kwak, Minju Kim, Dongha Lim, Hyungjoo Chae, Dongjin Kang, Sunghwan Kim, Dongil Yang, Jinyoung Yeo
- GraphCheck: Multipath Fact-Checking with Entity-Relationship Graphs
Hyewon Jeon, Jay-Yoon Lee
- FLAMES: Improving LLM Math Reasoning via a Fine-Grained Analysis of the Data Synthesis Pipeline
Parker Seegmiller, Kartik Mehta, Soumya Saha, Chenyang Tao, Shereen Oraby, Arpit Gupta, Tagyoung Chung, Mohit Bansal, Nanyun Peng
- POW: Political Overton Windows of Large Language Models
Leif Azzopardi
- Columbo: Expanding Abbreviated Column Names for Tabular Data Using Large Language Models
Ting Cai, Stephen Sheen, AnHai Doan
- RTTC: Reward-Guided Collaborative Test-Time Compute
Juan Pablo Munoz, Jinjie Yuan
- AMANDA: Agentic Medical Knowledge Augmentation for Data-Efficient Medical Visual Question Answering
Ziqing Wang, Chengsheng Mao, Yuan Luo, Kaize Ding
- Mixed Signals: Decoding VLMs’ Reasoning and Underlying Bias in Vision-Language Conflict
Pouya Pezeshkpour, Moin Aminnaseri, Estevam Hruschka
- Mitigating Hallucination in Large Vision-Language Models through Aligning Attention Distribution to Information Flow
Jianfei Zhao, Feng Zhang, Xin Sun, Chong Feng
- OptiSeq: Ordering Examples On-The-Fly for In-Context Learning
Rahul Atul Bhope, Praveen Venkateswaran, K. R. Jayaram, Vatche Isahagian, Vinod Muthusamy, Nalini Venkatasubramanian
- Dependency Parsing-Based Syntactic Enhancement of Relation Extraction in Scientific Texts
Devvrat Joshi, Islem Rekik
- DIPLomA: Efficient Adaptation of Instructed LLMs to Low-Resource Languages via Post-Training Delta Merging
Ixak Sarasua Antero, Ander Corral, Xabier Saralegi
- Reliability Crisis of Reference-free Metrics for Grammatical Error Correction
Takumi Goto, Yusuke Sakai, Taro Watanabe
- Who Speaks Matters: Analysing the Influence of the Speaker’s Linguistic Identity on Hate Classification
Ananya Malik, Kartik Sharma, Lynnette Hui Xian Ng, Shaily Bhatt
- Are LLMs Empathetic to All? Investigating the Influence of Multi-Demographic Personas on a Model’s Empathy
Ananya Malik, Nazanin Sabri, Melissa M. Karnaze, Mai ElSherief
- Active Learning for Multidialectal Arabic POS Tagging
Diyam Akra, Mohammed Khalilia, Mustafa Jarrar
- Embedding-Free RAG
Jessica Maghakian, Raunak Sinha, Max Schettewi, Gunkirat Kaur
- Rating Roulette: Self-Inconsistency in LLM-As-A-Judge Frameworks
Rajarshi Haldar, Julia Hockenmaier
- Quantifying Uncertainty in Natural Language Explanations of Large Language Models for Question Answering
Yangyi Li, Mengdi Huai
- Real-World Summarization: When Evaluation Reaches Its Limits
Patrícia Schmidtová, Ondrej Dusek, Saad Mahamood
- Open-DeBias: Toward Mitigating Open-Set Bias in Language Models
Nihar Ranjan Sahoo, Arti Rani, Shweta Singh, Gaurav Kumar Nayak
- SACL: Understanding and Combating Textual Bias in Code Retrieval with Semantic-Augmented Reranking and Localization
Dhruv Gupta, Yiqing Xie, Gayathri Ganesh Lakshmy
- Jailbreak Distillation: Renewable Safety Benchmarking
Jingyu Zhang, Ahmed Elgohary, Xiawei Wang, A S M Iftekhar, Ahmed Magooda, Benjamin Van Durme, Daniel Khashabi, Kyle Jackson
- Uncertainty-Aware Answer Selection for Improved Reasoning in Multi-LLM Systems
Aakriti Agrawal, Rohith Aralikatti, Anirudh Satheesh, Souradip Chakraborty, Amrit Singh Bedi, Furong Huang
- GreekBarBench: A Challenging Benchmark for Free-Text Legal Reasoning and Citations
Odysseas S. Chlapanis, Dimitris Galanis, Nikolaos Aletras, Ion Androutsopoulos
- Pi-SQL: Enhancing Text-to-SQL with Fine-Grained Guidance from Pivot Programming Languages
Yongdong Chi, Hanqing Wang, Yun Chen, Yan Yang, Jian Yang, Zonghan Yang, Xiao Yan, Guanhua Chen
- RAC: Efficient LLM Factuality Correction with Retrieval Augmentation
Changmao Li, Jeffrey Flanigan
- Does It Run and Is That Enough? Revisiting Text-to-Chart Generation with a Multi-Agent Approach
James Ford, Anthony Rios
- WiNELL: Wikipedia Never-Ending Updating with LLM Agents
Revanth Gangi Reddy, Tanay Dixit, Jiaxin Qin, Cheng Qian, Daniel Lee, Jiawei Han, Kevin Small, Xing Fan, Ruhi Sarikaya, Heng Ji
- GeLoRA: Geometric Adaptive Ranks For Efficient LoRA Fine-tuning
Abdessalam Ed-dib, Zhanibek Datbayev, Amine M. Aboussalah
- Uncovering Scaling Laws for Large Language Models via Inverse Problems
Arun Verma, Zhaoxuan Wu, Zijian Zhou, Xiaoqiang Lin, Zhiliang Chen, Rachael Hwee Ling Sim, Rui Qiao, Jingtan Wang, Nhung Bui, Xinyuan Niu, Wenyang Hu, Gregory Kang Ruey Lau, Zi-Yu Khoo, Zitong Zhao, Xinyi Xu, Apivich Hemachandra, See-Kiong Ng, Bryan Kian Hsiang Low
- UIPE: Enhancing LLM Unlearning by Removing Knowledge Related to Forgetting Targets
Wenyu Wang, Mengqi Zhang, Xiaotian Ye, Zhaochun Ren, Pengjie Ren, Zhumin Chen
- FicSim: A Dataset for Multi-Faceted Semantic Similarity in Long-Form Fiction
Natasha Johnson, Amanda Bertsch, Maria-Emil Deal, Emma Strubell
- Masked Diffusion Captioning for Visual Feature Learning
Chao Feng, Zihao Wei, Andrew Owens
- Diverse Multi-tool Aggregation with Large Language Models for Enhanced Math Reasoning
Bohan Yao, Vikas Yadav
- Enhancing Goal-oriented Proactive Dialogue Systems via Dynamic Multi-dimensional Consistency Optimization
Didi Zhang, Yaxin Fan, Peifeng Li, Qiaoming Zhu
- Injecting Domain-Specific Knowledge into Large Language Models: A Comprehensive Survey
Zirui Song, Bin Yan, Yuhan Liu, Miao Fang, Mingzhe Li, Rui Yan, Xiuying Chen
- Who’s the Author? How Explanations Impact User Reliance in AI-Assisted Authorship Attribution
Calvin Bao, Connor Baumler, Hal Daumé III, Marine Carpuat
- UniSpeaker: A Unified Approach for Multimodality-driven Speaker Generation
Zhengyan Sheng, Zhihao Du, Heng Lu, ShiLiang Zhang, Zhen-Hua Ling
- On the Fine-Grained Planning Abilities of VLM Web Agents
Surgan Jandial, Yinong Oliver Wang, Andrea Bajcsy, Fernando De la Torre
- InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models with Human Feedback
Henry Hengyuan Zhao, Wenqi Pei, Yifei Tao, Haiyang Mei, Mike Zheng Shou
- ReFLAIR: Enhancing Multimodal Reasoning via Structured Reflection and Reward-Guided Learning
Jiazhou Ji
- Exploring Cross-Client Memorization of Training Data in Language Models for Federated Learning
Tinnakit Udsa, Can Udomcharoenchaikit, Patomporn Payoungkhamdee, Sarana Nutanong, Norrathep Rattanavipanon
- ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations
Bowen Jiang, Yuan Yuan, Xinyi Bai, Zhuoqun Hao, Alyson Yin, Yaojie Hu, Wenyu Liao, Lyle Ungar, Camillo Jose Taylor
- STA-CoT: Structured Target-Centric Agentic Chain-of-Thought for Consistent Multi-Image Geological Reasoning
Beibei Yu, Tao Shen, Ling Chen
- Can Language Models Follow Multiple Turns of Entangled Instructions?
Chi Han, Xin Liu, Haodong Wang, Shiyang Li, Jingfeng Yang, Haoming Jiang, Zhengyang Wang, Qingyu Yin, Liang Qiu, Changlong Yu, Yifan Gao, Zheng Li, Bing Yin, Jingbo Shang, Heng Ji
- How to Generalize the Detection of AI-Generated Text: Confounding Neurons
Claudio Borile, Carlo Abrate
- SparsePO: Controlling Preference Alignment of LLMs via Sparse Token Masks
Fenia Christopoulou, Ronald Cardenas, Gerasimos Lampouras, Haitham Bou Ammar, Jun Wang
- We Argue to Agree: Towards Personality-Driven Argumentation-Based Negotiation Dialogue Systems for Tourism
Priyanshu Priya, Saurav Dudhate, Desai Vishesh Yasheshbhai, Asif Ekbal
- Towards the Roots of the Negation Problem: A Multilingual NLI Dataset and Model Scaling Analysis
Tereza Vrabcová, Marek Kadlčík, Petr Sojka, Michal Štefánik, Michal Spiegel
- Improving the Language Understanding Capabilities of Large Language Models Using Reinforcement Learning
Sai Ashish Somayajula, Bokai Hu, Xin Pan, Pengtao Xie
- HATECAT-TR: A Hate Speech Span Detection and Categorization Dataset for Turkish
Hasan Kerem Şeker, Gökçe Uludoğan, Pelin Önal, Arzucan Özgür
- DM-Codec: Distilling Multimodal Representations for Speech Tokenization
Md Mubtasim Ahasan, Md Fahim, Tasnim Mohiuddin, AKM Mahbubur Rahman, Aman Chadha, Tariq Iqbal, M Ashraful Amin, Md Mofijul Islam, Amin Ahsan Ali
- LCAN: A Label-Aware Contrastive Attention Network for Multi-Intent Recognition and Slot Filling in Task-Oriented Dialogue Systems
Shuli Zhang, Zhiqiang You, XiaoXiangQi, Peng Liu, Gaode Wu, Kan Xia, Shenguang Huang
- Low-Resource Languages LLM Disinformation is Within Reach: The Case of Walliserdeutsch
Andrei Kucharavy, Sherine Seppey, Cyril Vallez, Dimitri Percia David, Ljiljana Dolamic
- Exploring and Controlling Diversity in LLM-Agent Conversation
KuanChao Chu, Yi-Pei Chen, Hideki Nakayama
- Agentic-ToM: Cognition-Inspired Agentic Processing For Enhancing Theory of Mind Reasoning
Sneheel Sarangi, Chetan Talele, Hanan Salam
- Can We Edit LLMs for Long-Tail Biomedical Knowledge?
Xinhao Yi, Jake Lever, Kevin Bryson, Zaiqiao Meng
- GeoPQA: Bridging the Visual Perception Gap in MLLMs for Geometric Reasoning
Guizhen Chen, Weiwen Xu, Hao Zhang, Hou Pong Chan, Deli Zhao, Anh Tuan Luu, Yu Rong
- CM-Align: Consistency-based Multilingual Alignment for Large Language Models
Xue Zhang, Yunlong Liang, Fandong Meng, Songming Zhang, Yufeng Chen, Jinan Xu, Jie Zhou
- Cache Saver: A Modular Framework for Efficient, Affordable, and Reproducible LLM Inference
Nearchos Potamitis, Lars Henning Klein, Chongyang Xu, Attreyee Mukherjee, Bardia Mohammadi, Niket Tandon, Laurent Bindschaedler, Akhil Arora
- Evaluating Cultural Knowledge and Reasoning in LLMs Through Persian Allusions
Melika Nobakhtian, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar
- Evolving Stances on Reproducibility: A Longitudinal Study of NLP and ML Researchers’ Views and Experience of Reproducibility
Craig Thomson, Ehud Reiter, João Sedoc, Anya Belz
- KAHAN: Knowledge-Augmented Hierarchical Analysis and Narration for Financial Data Narration
Yajing Yang, Tony Deng, Min-Yen Kan