Accepted Industry Track Papers

  • RAVEN++: Pinpointing Fine-Grained Violations in Advertisement Videos with Active Reinforcement Reasoning
    Deyi Ji, Yuekui Yang, Haiyang Wu, Shaogang Tang, Peng Shu, Xudong Chen, Shaoping Ma, Tianrun Chen, Lanyun Zhu
  • SAGE: A Generic Framework for LLM Safety Evaluation
    Madhur Jindal, Hari Shrawgi, Parag Agrawal, Sandipan Dandapat
  • CRAB: A Benchmark for Evaluating Curation of Retrieval-Augmented LLMs in Biomedicine
    Hanmeng Zhong, Linqing Chen, Weilei Wang, Wentao Wu
  • VENUS: A VLLM-driven Video Content Moderation System for Real Application Scenarios
    Minyi Zhao, Yi Liu, Jianfeng Wen, Boshen Zhang, Hailang Chang, Zhiheng Ouyang, Jie Wang, Wensong He, Shuigeng Zhou
  • Text2MDT: Extracting Decision Trees from Medical Texts Using Large Language Models
    Yuheng Li, Wei Zhu, Jiechao Gao
  • PolyNorm: Few-Shot LLM-Based Text Normalization for Text-to-Speech
    Michel Wong, Ali Alshehri, Sophia Kao, Haotian He
  • Audio Query Handling System with Integrated Expert Models and Contextual Understanding
    Naveen Vakada, Yinyi Guo, Erik Visser, Arvind Krishna Sridhar
  • Generative Reviewer Agents: Scalable Simulacra of Peer Review
    Nicolas Bougie, Narimawa Watanabe
  • Aligning LLMs for Multilingual Consistency in Enterprise Applications
    Amit Agarwal, Hansa Meghwani, Hitesh Laxmichand Patel, Tao Sheng, Sujith Ravi, Dan Roth
  • RCI: A Score for Evaluating Global and Local Reasoning in Multimodal Benchmarks
    Amit Agarwal, Hitesh Laxmichand Patel, Srikant Panda, Hansa Meghwani, Jyotika Singh, Karan Dua, Paul Li, Tao Sheng, Sujith Ravi, Dan Roth
  • LP Data Pipeline: Lightweight, Purpose-driven Data Pipeline for Large Language Models
    Yungi Kim, Hyunsoo Ha, Seonghoon Yang, Sukyung Lee, Jihoo Kim, Chanjun Park
  • Toward Reliable Clinical Coding with Language Models: Verification and Lightweight Adaptation
    Moy Yuan, Han-Chin Shing, Mitch Strong, Chaitanya Shivade
  • Enhancing Talent Search Ranking with Role-Aware Expert Mixtures and LLM-based Fine-Grained Job Descriptions
    Jihang Li, Bing Xu, Minping Chen, Zulong Chen, Chuanfei Xu, Suyu Liu, Zeyi Wen
  • PCRI: Measuring Context Robustness in Multimodal Models for Enterprise Applications
    Hitesh Laxmichand Patel, Srikant Panda, Amit Agarwal, Hansa Meghwani, Karan Dua, Paul Li, Tao Sheng, Sujith Ravi, Dan Roth
  • CitySim: Modeling Urban Behaviors and City Dynamics with Large-Scale LLM-Driven Agent Simulation
    Nicolas Bougie, Narimawa Watanabe
  • Evaluating Conversational Agents with Persona-driven User Simulations based on Large Language Models: A Sales Bot Case Study
    Justyna Gromada, Alicja Kasicka, Ewa Komkowska, Lukasz Krajewski, Natalia Krawczyk, Morgan Veyret, Bartosz Przybył, Lina M. Rojas-Barahona, Michał K. Szczerbak
  • Mirror in the Model: Ad Banner Image Generation via Reflective Multi-LLM and Multi-modal Agents
    Zhao Wang, Bowen Chen, Yotaro Shimose, Sota Moriyama, Heng Wang, Shingo Takamatsu
  • PatternRAG: Leveraging Product Catalog Patterns for Multilingual E-commerce Product Attribute Prediction
    Bryan Zhang, Suleiman A. Khan, Stephan Walter
  • ECom-Bench: Can LLM Agent Resolve Real-World E-commerce Customer Support Issues?
    Haoxin Wang, Xianhan Peng, Xucheng Huang, Yizhe Huang, Ming Gong, Chenghan Yang, Yang Liu, Ling Jiang
  • ProCut: LLM Prompt Compression via Attribution Estimation
    Zhentao Xu, Fengyi Li, Albert C. Chen, Xiaofeng Wang
  • A Reasoner for Real-World Event Detection: Scaling Reinforcement Learning via Adaptive Perplexity-Aware Sampling Strategy
    Xiaoyun Zhang, Jingqing Ruan, Xing Ma, Yawen Zhu, Jiansong Chen, Ke Zeng, Xunliang Cai
  • Detecting Omissions in LLM-Generated Medical Summaries
    Achir Oukelmoun, Nasredine Semmar, Gaël de Chalendar, Clément Cormi, Mariame Oukelmoun, Eric Vibert, Marc-Antoine Allard
  • LEAF: Learning and Evaluation Augmented by Fact-Checking to Improve Factualness in Large Language Models
    Hieu Tran, Junda Wang, Yujan Ting, Hong Yu, Weijing Huang, Terrence Chen
  • ReAct Meets Industrial IoT: Language Agents for Data Access
    James T Rayfield, Shuxin Lin, Nianjun Zhou, Dhaval C Patel
  • ProductAgent: Benchmarking Conversational Product Search Agent with Asking Clarification Questions
    Jingheng Ye, Yong Jiang, Xiaobin Wang, Yinghui Li, Yangning Li, Pengjun Xie, Fei Huang
  • MADS: Multi-Agent Dialogue Simulation for Diverse and Persuasive Training Data Generation
    Mingjin Li, Yu Liu, Huayi Liu, Hongguang Zhang, Xiang Ye
  • On-device System of Compositional Multi-tasking in Large Language Models
    Ondrej Bohdal, Konstantinos Theodosiadis, Asterios Mpatziakas, Dimitrios Filippidis, Iro Spyrou, Christos Zonios, Anastasios Drosou, Dimosthenis Ioannidis, Kyenghun Lee, Jijoong Moon, Hyeonmok Ko, Mete Ozay, Umberto Michieli
  • Select-then-Route: Taxonomy guided Routing for LLMs
    Soham Shah, Kumar Shridhar
  • FABRIC: Fully-Automated Broad Intent Categorization in E-commerce
    Anna Tigunova, Philipp Schmidt, Damla Ezgi Akcora
  • MKT: A Multi-Stage Knowledge Transfer Framework to Mitigate Catastrophic Forgetting in Multi-Domain Chinese Spelling Correction
    Peng Xing, Yinghui Li, Shirong Ma, Xinnian Liang, Haojing Huang, Yangning Li, Shu-Yu Guo, Hai-Tao Zheng, Wenhao Jiang, Ying Shen
  • End-to-End Aspect-Guided Review Summarization at Scale
    Ilya Boytsov, Vinny DeGenova, Mikhail Balyasin, Joseph Walt, Caitlin Eusden, Marie-Claire Rochat, Margaret Pierson
  • SLOT: Structuring the Output of Large Language Models
    Zhengyuan Shen, Darren Yow-Bang Wang, Soumya Smruti Mishra, Zhichao Xu, Yifei Teng, Haibo Ding
  • QuackIR: Retrieval in DuckDB and Other Relational Database Management Systems
    Yijun Ge, Zijian Chen, Jimmy Lin
  • Benchmarking Deep Search over Heterogeneous Enterprise Data
    Prafulla Kumar Choubey, Xiangyu Peng, Shilpa Bhagavath, Kung-Hsiang Huang, Caiming Xiong, Chien-Sheng Wu
  • RLHF Algorithms Ranked: An Extensive Evaluation Across Diverse Tasks, Rewards, and Hyperparameters
    Lucas Spangher, Rama Kumar Pasumarthi, Nick Masiewicki, William F. Arnold, Aditi Kaushal, Dale Johnson, Peter Grabowski, Eugene Ie
  • Predicting Cross-lingual Trends in Microblogs
    Satoshi Akasaki
  • Generating Fine Details of Entity Interactions
    Xinyi Gu, Jiayuan Mao
  • AutoCVSS: Assessing the Performance of LLMs for Automated Software Vulnerability Scoring
    Davide Sanvito, Giovanni Arriciati, Giuseppe Siracusano, Roberto Bifulco, Michele Carminati
  • SFAL: Semantic-Functional Alignment Scores for Distributional Evaluation of Auto-Interpretability in Sparse Autoencoders
    Fabio Mercorio, Filippo Pallucchini, Daniele Potertì, Antonio Serino, Andrea Seveso
  • Just One is Enough: An Existence-based Alignment Check for Robust Japanese Pronunciation Estimation
    Hayate Nakano, Nobuhiro Kaji
  • Towards Enforcing Company Policy Adherence in Agentic Workflows
    Naama Zwerdling, David Boaz, Ella Rabinovich, Guy Uziel, David Amid, Ateret Anaby Tavor
  • Learning to Translate Ambiguous Terminology by Preference Optimization on Post-Edits
    Nathaniel Berger, Johannes Eschbach-Dymanus, Miriam Exel, Matthias Huck, Stefan Riezler
  • More Data or Better Data? A Critical Analysis of Data Selection and Synthesis for Mathematical Reasoning
    Yike Zhao, Simin Guo, Ziqing Yang, Shifan Han, Dahua Lin, Fei Tan
  • SRS-Stories: Vocabulary-constrained multilingual story generation for language learning
    Wiktor Kamzela, Mateusz Lango, Ondrej Dusek
  • Banking Done Right: Redefining Retail Banking with Language-Centric AI
    Xin Jie Chua, Jeraelyn Ming Li Tan, Jia Xuan Tan, Soon Chang Poh, Yi Xian Goh, Debbie Hui Tian Choong, Foong Chee Mun, Sze Jue Yang, Chee Seng Chan
  • Graph of Attacks with Pruning: Optimizing Stealthy Jailbreak Prompt Generation for Enhanced LLM Content Moderation
    Daniel Schwartz, Dmitriy Bespalov, Zhe Wang, Ninad Kulkarni, Yanjun Qi
  • Structural Reward Model: Enhancing Interpretability, Efficiency, and Scalability in Reward Modeling
    Xiaoyu Liu, Di Liang, Hongyu Shan, Peiyang Liu, Yonghao Liu, Muling Wu, Yuntao Li, Xianjie Wu, Li Miao, Jiangrong Shen, Minlong Peng
  • Controllable Clustering with LLM-driven Embeddings
    Kerria Pang-Naylor, Shivani Manivasagan, Aitong Zhong, Mehak Garg, Nicholas Mondello, Blake Buckner, Jonathan Chang, Khyati Mahajan, Masoud Hashemi, Fabio Casati
  • SpeechLLMs for Large-scale Contextualized Zero-shot Slot Filling
    Kadri Hacioglu, Manjunath K E, Andreas Stolcke
  • NurseLLM: The First Specialized Language Model for Nursing
    Md Tawkat Islam Khondaker, Julia Harrington, Shady Shehata
  • Augmenting Compliance Guaranteed Conversational AI: Context-Aware Knowledge Base Expansion with LLMs and Combinatorial Optimization
    Mengze Hong, Chen Jason Zhang, Di Jiang, Yuanqin He
  • Memory-Efficient Backpropagation for Fine-Tuning LLMs on Resource-Constrained Mobile Devices
    Congzheng Song, Xinyu Tang
  • PrismRAG: Boosting RAG Factuality with Distractor Resilience and Strategized Reasoning
    Mohammad Kachuee, Teja Gollapudi, Minseok Kim, Yin Huang, Kai Sun, Xiao Yang, Jiaqi Wang, Nirav Shah, Yue Liu, Aaron Colak, Anuj Kumar, Wen-tau Yih, Xin Luna Dong
  • Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards
    Manveer Singh Tamber, Forrest Sheng Bao, Chenyu Xu, Ge Luo, Suleman Kazi, Minseok Bae, Miaoran Li, Ofer Mendelevitch, Renyi Qu, Jimmy Lin
  • A Multi-Agent Framework for Quantitative Finance: An Application to Portfolio Management Analytics
    Sayani Kundu, Dushyant Sahoo, Victor Li, Jennifer Rabowsky, Amit Varshney
  • Group Preference Alignment: Customizing LLM Responses from In-Situ Conversations Only When Needed
    Ishani Mondal, Jack W Stokes, Sujay Kumar Jauhar, Longqi Yang, Mengting Wan, Xiaofeng Xu, Xia Song, Jordan Lee Boyd-Graber, Jennifer Neville
  • DASR: Distributed Adaptive Scene Recognition - A Multi-Agent Cloud-Edge Framework for Language-Guided Scene Detection
    Can Cui, Yongkang Liu, Seyhan Ucar, Juntong Peng, Ahmadreza Moradipari, Maryam Khabazi, Ziran Wang
  • Empowering Healthcare Practitioners with Language Models: Structuring Speech Transcripts in Two Real-World Clinical Applications
    Jean-Philippe Corbeil, Asma Ben Abacha, George Michalopoulos, Phillip Swazinna, Miguel Del-Agua, Jérôme Tremblay, Akila Jeeson Daniel, Cari Bader, Kevin Cho, Pooja Krishnan, Nathan Bodenstab, Thomas Lin, Wenxuan Teng, Francois Beaulieu, Paul Vozila
  • Leveraging the Power of Large Language Models in Entity Linking via Adaptive Routing and Targeted Reasoning
    Yajie Li, Albert Galimov, Mitra Datta Ganapaneni, Pujitha Thejaswi, De Meng, Priyanshu Kumar, Saloni Potdar
  • Can LLMs Narrate Tabular Data? An Evaluation Framework for Natural Language Representations of Text-to-SQL System Outputs
    Jyotika Singh, Weiyi Sun, Amit Agarwal
  • Enhancing Foundation Models in Transaction Understanding with LLM-based Sentence Embeddings
    Xiran Fan, Zhimeng Jiang, Chin-Chia Michael Yeh, Yuzhong Chen, Yingtong Dou, Menghai Pan, Yan Zheng
  • Agent vs. Agent: Automated Data Generation and Red-Teaming for Custom Agentic Workflows
    Ninad Kulkarni, Xian Wu, Siddharth Varia, Dmitriy Bespalov
  • Auto prompting without training labels: An LLM cascade for product quality assessment in e-commerce catalogs
    Soham Satyadharma, Fatemeh Sheikholeslami, Swati Kaul, Aziz Umit Batur, Suleiman A. Khan
  • Harmonizing Diverse Models: A Layer-wise Merging Strategy for Consistent Generation
    Xujun Peng, Anoop Kumar, Jingyu Wu, Parker Glenn, Daben Liu
  • Transparent Reference-free Automated Evaluation of Open-Ended User Survey Responses
    Subin An, Yugyeong Ji, Junyoung Kim, Heejin Kook, Yang Lu, Josh Seltzer
  • Datasets and Recipes for Video Temporal Grounding via Reinforcement Learning
    Ruizhe Chen, Tianze Luo, Zhiting Fan, Heqing Zou, Zhaopeng Feng, Guiyang Xie, Hansheng Zhang, Wang Zhuochen, Zuozhu Liu, Zhang Huaijian
  • SEARA: An Automated Approach for Obtaining Optimal Retrievers
    Zou Yuheng
  • UniEDU: Toward Unified and Efficient Large Multimodal Models for Educational Tasks
    Zhendong Chu, Jian Xie, Shen Wang, Zichao Wang, Qingsong Wen
  • Truth, Trust, and Trouble: Medical AI on the Edge
    Mohammad Anas Azeez, Rafiq Ali, Ebad Shabbir, Zohaib Hasan Siddiqui, Gautam Siddharth Kashyap, Jiechao Gao, Usman Naseem
  • An Address Intelligence Framework for E-commerce Deliveries
    Gokul Swamy, Aman Gulati, Srinivas Virinchi, Anoop Saladi
  • LLMs on a Budget? Say HOLA
    Zohaib Hasan Siddiqui, Jiechao Gao, Ebad Shabbir, Mohammad Anas Azeez, Rafiq Ali, Gautam Siddharth Kashyap, Usman Naseem
  • LLM-Based Dialogue Labeling for Multiturn Adaptive RAG
    Zhiyu Chen, Biancen Xie, Sidarth Srinivasan, Manikandarajan Ramanathan, Rajashekar Maragoud, Qun Liu
  • RAGulator: Lightweight Out-of-Context Detectors for Grounded Text Generation
    Ian Poey, Jiajun Liu, Qishuai Zhong
  • REIC: RAG-Enhanced Intent Classification at Scale
    Ziji Zhang, Michael Yang, Zhiyu Chen, Yingying Zhuang, Shu-Ting Pi, Qun Liu, Rajashekar Maragoud, Vy Nguyen, Anurag Beniwal
  • Mapping Smarter, Not Harder: A Test-Time Reinforcement Learning Agent That Improves Without Labels or Model Updates
    Wen-Kwang Tsao, Yao-Ching Yu, Chien-Ming Huang
  • On Assigning Product and Software Codes to Customer Service Requests with Large Language Models
    Sujatha Das Gollapalli, Mouad Hakam, Mingzhe Du, See-Kiong Ng, Mohammed Hamzeh
  • Reasoning-Enhanced Domain-Adaptive Pretraining of Multimodal Large Language Models for Short Video Content Moderation
    Zixuan Wang, Yu Sun, Hongwei Wang, Baoyu Jing, Xiang Shen, Xin Dong, Zhuolin Hao, Hongyu Xiong, Yang Song
  • GSID: Generative Semantic Indexing for E-Commerce Product Understanding
    Haiyang Yang, Qinye Xie, Qingheng Zhang, Chen Li Yu, Huike Zou, Chengbao Lian, Shuguang Han, Fei Huang, Jufeng Chen, Bo Zheng
  • Learning from LLM Agents: In-Context Generative Models for Text Casing in E-Commerce Ads
    Yingxue Zhou, Tan Zhu, Tao Zeng, Wei Shen
  • Auto-Weighted Group Relative Preference Optimization for Multi-Objective Text Generation Tasks
    Yuki Ichihara, Yuu Jinnai
  • Cost-Effective E-Commerce Catalog Translation at Scale Ensuring Named Entity Protection
    Asier Gutiérrez-Fandiño, Jorge Yero Salazar, Clement Ruin, Alejandro Quintero-Roba, Shang Ravichandran, Jesus Perez-Martin, Pankaj Adsul, Suruchi Garg, Leonardo Lezcano
  • InstaJudge: Aligning Judgment Bias of LLM-as-Judge with Humans in Industry Applications
    Myeongjun Erik Jang, Fran Silavong
  • TelAgentBench: A Multi-faceted Benchmark for Evaluating LLM-based Agents in Telecommunications
    Sunwoo Lee, Dhammiko Arya, Seung-Mo Cho, Gyoung-eun Han, Seokyoung Hong, Daseong Jang, Wonbeom Jang, Sangjin Kim, SaeRom Kim, Seojin Lee, Sohee Park, Sereimony Sek, Injee Song, Sungbin Yoon, Eric Davis
  • Taming the Real-world Complexities in CPT E/M Coding with Large Language Models
    Islam Nassar, Yang Lin, Yuan Jin, Rongxin Zhu, Chang Wei Tan, Zenan Zhai, Thanh Tien Vu, Xu Zhong, Long Duong, Yuan-Fang Li
  • Classifier-Augmented Generation for Structured Workflow Prediction
    Thomas Gschwind, Shramona Chakraborty, Nitin Gupta, Sameep Mehta
  • Efficient and Versatile Model for Multilingual Information Retrieval of Islamic Text: Development and Deployment in Real-World Scenarios
    Vera Pavlova, Mohammed Makhlouf
  • AutoQual: An LLM Agent for Automated Discovery of Interpretable Features for Review Quality Assessment
    Xiaochong Lan, Jie Feng, Yinxing Liu, Xinlei Shi, Yong Li
  • JSON Whisperer: Efficient JSON Editing with LLMs
    Sarel Duanis, Asnat Greenstein-Messica, Eliya Habba
  • L4: Mutual Learning Helps Lifelong Language Learning
    Jiyong Li, Dilshod Azizov, Shangsong Liang
  • TTD-SQL: Tree-Guided Token Decoding for Efficient and Schema-Aware SQL Generation
    Chetan Sharma, Ramasuri Narayanam, Soumyabrata Pal, Kalidas Yeturu, Shiv Kumar Saini, Koyel Mukherjee
  • Spot the BlindSpots: Systematic Identification and Quantification of Fine-Grained LLM Biases in Contact Center Summaries
    Kawin Mayilvaghanan, Siddhant Gupta, Ayush Kumar
  • HierDiffuse: Progressive Diffusion for Robust Interest Fusion in CTR Prediction
    Ziheng Ni, Congcong Liu, Yuying Chen, Zhiwei Fang, Changping Peng, Zhangang Lin, Ching Law, Jingping Shao
  • TOBUGraph: Knowledge Graph-Based Retrieval for Enhanced LLM Performance Beyond RAG
    Savini Kashmira, Jayanaka L. Dantanarayana, Joshua Brodsky, Ashish Mahendra, Yiping Kang, Krisztian Flautner, Lingjia Tang, Jason Mars
  • Thinking with DistilQwen: A Tale of Four Distilled Reasoning and Reward Model Series
    Wenrui Cai, Chengyu Wang, Junbing Yan, Jun Huang, Xiangzhong Fang
  • Crossing Domains without Labels: Distant Supervision for Term Extraction
    Elena Senger, Yuri Campbell, Rob van der Goot, Barbara Plank
  • I-SEE: An Instruction-tuned, SOP-Enhanced Quality Evaluator for Product Content
    Aniket Joshi, Cyrus Andre DSouza, Sejal Jain, Jitenkumar Babubhai Rana, Promod Yenigalla
  • Computational Blueprints: Generating Isomorphic Math Problems with Large Language Models
    Jeong-hoon Kim, Jinwoo Nam, Geunsik Jo
  • Fin-ExBERT: User Intent based Text Extraction in Financial Context using Graph-Augmented BERT and trainable Plugin
    Soumick Sarker, Abhijit Kumar Rai
  • DecEx-RAG: Boosting Agentic Retrieval-Augmented Generation with Decision and Execution Optimization via Process Supervision
    Yongqi Leng, Yikun Lei, Xikai Liu, Meizhi Zhong, Bojian Xiong, Yurong Zhang, Yan Gao, Yi Wu, Yao Hu, Deyi Xiong
  • FLOW-BENCH: Towards Conversational Generation of Enterprise Workflows
    Evelyn Duesterwald, Siyu Huo, Vatche Isahagian, Ritesh Kumar, Vinod Muthusamy, Punleuk Oum, K. R. Jayaram, Debashish Saha, Gegi Thomas, Praveen Venkateswaran
  • Format Inertia: A Failure Mechanism of LLMs in Medical Pre-Consultation
    Seungseop Lim, Gibaeg Kim, Wooseok Han, Jean Seo, Hyunkyung Lee, Jaehyo Yoo, Eunho Yang
  • Extraction of Information Provision Activity Requirements from EU-Acquis
    Jakub Piskorski, Dominik Skotarczak
  • Contrastive Learning Using Graph Embeddings for Domain Adaptation of Language Models in the Process Industry
    Anastasia Zhukova, Jonas Lührs, Christian E. Matt, Bela Gipp
  • From Feedback to Checklists: Grounded Evaluation of AI-Generated Clinical Notes
    Karen Zhou, John Michael Giorgi, Pranav Mani, Peng Xu, Davis Liang, Chenhao Tan
  • FlexDoc: Parameterized Sampling for Diverse Multilingual Synthetic Documents for Training Document Understanding Models
    Karan Dua, Hitesh Laxmichand Patel, Puneet Mittal, Ranjeet Gupta, Amit Agarwal, Praneet Pabolu, Srikant Panda, Hansa Meghwani
  • GEMMAS: Graph-based Evaluation Metrics for Multi Agent Systems
    Jisoo Lee, Raeyoung Chang, Dongwook Kwon, Harmanpreet Singh, Nikhil Verma
  • Knowledge-Augmented Question Error Correction for Chinese Question Answer System with QuestionRAG
    Longpeng Qiu, Ting Li, Shuai Mao, Nan Yang, Xiaohui Yan
  • SMART: Scalable Multilingual Approach for a Robust TOD System
    Karan Malhotra, Arihant Jain, Purav Aggarwal, Anoop Saladi
  • Think-Search-Patch: A Retrieval-Augmented Reasoning Framework for Repository-Level Code Repair
    Bojian Xiong, Yikun Lei, Xikai Liu, Shaowei Zhang, Pengyun Zhu, Yan Liu, Yongqi Leng, Ling Shi, Meizhi Zhong, Yurong Zhang, Yan Gao, Yi Wu, Yao Hu, Deyi Xiong
  • ASR-EC Benchmark: Evaluating Large Language Models on Chinese ASR Error Correction
    Victor Junqiu Wei, Weicheng Wang, Di Jiang, Yuanfeng Song, Lu Wang
  • Bidirectional Reasoning Supervision for Multilingual Financial Decision Making
    Muhammad Rafsan Kabir, Jawad Ibn Ahad, Robin Krambroeckers, Silvia Ahmed, M M Lutfe Elahi, Nabeel Mohammed, Shafin Rahman
  • Automotive Document Labeling Using Large Language Models
    Dang Van Thin, Cuong Xuan Chu, Christian Graf, Tobias Kaminski, Trung-Kien Tran
  • Building Data-Driven Occupation Taxonomies: A Bottom-Up Multi-Stage Approach via Semantic Clustering and Multi-Agent Collaboration
    Nan Li, Bo Kang, Tijl De Bie
  • AutoPenBench: A Vulnerability Testing Benchmark for Generative Agents
    Luca Gioacchini, Alexander Delsanto, Idilio Drago, Marco Mellia, Giuseppe Siracusano, Roberto Bifulco
  • Enabling Self-Improving Agents to Learn at Test Time With Human-In-The-Loop Guidance
    Yufei He, Ruoyu Li, Alex Chen, Yue Liu, Yulin Chen, Yuan Sui, Cheng Chen, Yi Zhu, Luca Luo, Frank Yang, Bryan Hooi
  • Encouraging Good Processes Without the Need for Good Answers: Reinforcement Learning for LLM Agent Planning
    Zhiwei Li, Yong Hu, Wenqing Wang
  • Experience report: Implementing Machine Translation in a Regulated Industry
    Marco Zocca, Per Fallgren, David Buffoni
  • Multi-Task Pre-Finetuning of Lightweight Transformer Encoders for Text Classification and NER
    Junyi Zhu, Savas Ozkan, Andrea Maracani, Sinan Mutlu, Cho Jung Min, Mete Ozay
  • Scaling Down, Serving Fast: Compressing and Deploying Efficient LLMs for Recommendation Systems
    Zhipeng Wang, Kayhan Behdin, Qingquan Song, Yun Dai, Ata Fatahibaarzi, Aman Gupta, Hejian Sang, Shao Tang, Gregory Dexter, Sirou Zhu, Siyu Zhu, Tejas Dharamsi, Vignesh Kothapalli, Zhoutong Fu, Yihan Cao, Pin-Lun Hsu, Fedor Borisyuk, Rahul Mazumder, Natesh S. Pillai, Luke Simon
  • Group, Embed and Reason: A Hybrid LLM and Embedding Framework for Semantic Attribute Alignment
    Shramona Chakraborty, Shashank Mujumdar, Nitin Gupta, Sameep Mehta, Ronen Kat, Itay Etelis, Mohamed Mahameed, Itai Guez, Rachel Brill
  • STREAQ: Selective Tiered Routing for Effective and Affordable Contact Center Quality Assurance
    Prajwal Sood, Rajdeep Agrawal, Mayank Sati, Digvijay Anil Ingle, Cijo George
  • Divide, Link, and Conquer: Recall-oriented Schema Linking for NL-to-SQL via Question Decomposition
    Kiran Pradeep, Kirushikesh DB, Nishtha Madaan, Sameep Mehta, Pushpak Bhattacharyya
  • Declarative Techniques for NL Queries over Heterogeneous Data
    Elham Khabiri, Jeffrey O. Kephart, Fenno F. Heath III, Srideepika Jayaraman, Yingjie Li, Fateh A. Tipu, Dhruv Shah, Achille Fokoue, Anu Bhamidipaty
  • Taxonomy of Comprehensive Safety for Clinical Agents
    Jean Seo, Hyunkyung Lee, Gibaeg Kim, Wooseok Han, Jaehyo Yoo, Seungseop Lim, Kihun Shin, Eunho Yang
  • Dr. Copilot: A Multi-Agent Prompt Optimized Assistant for Improving Patient-Doctor Communication in Romanian
    Andrei Niculae, Adrian Cosma, Cosmin Dumitrache, Emilian Radoi
  • Data-Efficient Active Prompt Optimization for Memory-Enhanced Conversational Agents
    Ervine Zheng, Yikuan Li, Geoffrey Jay Tso, Jilong Kuang
  • CLARITY: Clinical Assistant for Routing, Inference, and Triage
    Vladimir Shaposhnikov, Alexandr Nesterov, Ilia Kopanichuk, Ivan Bakulin, Zhelvakov Egor, Ruslan Abramov, Tsapieva Ekaterina Olegovna, Iaroslav Radionovich Bespalov, Dmitry V. Dylov, Ivan Oseledets
  • HalluDetect: Detecting, Mitigating, and Benchmarking Hallucinations in Conversational Systems
    Spandan Anaokar, Shrey Ganatra, Swapnil Bhattacharyya, Harshvivek Kashid, Shruthi N Nair, Reshma Sekhar, Siddharth Manohar, Rahul Hemrajani, Pushpak Bhattacharyya
  • How Accurate Are LLMs at Multi-Question Answering on Conversational Transcripts?
    Xiliang Zhu, Shi Zong, David Rossouw
  • AI Knowledge Assist: An Automated Approach for the Creation of Knowledge Bases for Conversational AI Agents
    Md Tahmid Rahman Laskar, Julien Bouvier Tremblay, Xue-Yong Fu, Cheng Chen, Shashi Bhushan TN
  • DACIP-RC: Domain Adaptive Continual Instruction Pre-Training via Reading Comprehension on Business Conversations
    Elena Khasanova, Harsh Saini, Md Tahmid Rahman Laskar, Xue-Yong Fu, Cheng Chen, Shashi Bhushan TN
  • Analysis of Automated Document Relevance Annotation for Information Retrieval in Oil and Gas Industry
    João Vitor Mariano Correia, Murilo Missano Bell, João Vitor Robiatti Amorim, Jonas Queiroz, Daniel Pedronette, Ivan Rizzo Guilherme, Felipe Lima de Oliveira
  • Mind the Query: A Benchmark Dataset towards Text2Cypher Task
    Vashu Chauhan, Shobhit Raj, Shashank Mujumdar, Avirup Saha, Anannay Jain
  • Deploying Tiny LVLM Judges for Real-World Evaluation of Chart Models: Lessons Learned and Best Practices
    Md Tahmid Rahman Laskar, Mohammed Saidul Islam, Ridwan Mahbub, Mizanur Rahman, Amran Bhuiyan, Israt Jahan, Mir Tafseer Nayeem, Shafiq Joty, Enamul Hoque, Jimmy Huang
  • Agent-in-the-Loop: A Data Flywheel for Continuous Improvement in LLM-based Customer Support
    Cen Zhao, Tiantian Zhang, Hanchen Su, Yufeng Zhang, Shaowei Su, Mingzhi Xu, Yu Liu, Wei Han, Jeremy Werner, Claire Na Cheng, Yashar Mehdad
  • Beyond Pointwise Scores: Decomposed Criteria-Based Evaluation of LLM Responses
    Fangyi Yu, Nabeel Seedat, Drahomira Herrmannova, Frank Schilder, Jonathan Richard Schwarz
  • Scalable and Cost Effective High-Cardinality Classification with LLMs via Multi-View Label Representations and Retrieval Augmentation
    Anup Pattnaik, Cijo George, Sasanka Vutla, Hamvir Dev, Jeevesh Nandan
  • How to Fine-Tune Safely on a Budget: Model Adaptation Using Minimal Resources
    Anh C. Pham, Mihir Thalanki, Michael Sun, Aditya Chaloo, Ankita Gupta, Tian Xia, Aditya Mate, Ehi Nosakhare, Soundararajan Srinivasan
  • Zero-knowledge LLM hallucination detection and mitigation through fine-grained cross-model consistency
    Aman Goel, Daniel Schwartz, Yanjun Qi
  • Incremental Summarization for Customer Support via Progressive Note-Taking and Agent Feedback
    Yisha Wu, Cen Zhao, Yuanpei Cao, Xiaoqing Xu, Yashar Mehdad, Mindy Ji, Claire Na Cheng
  • R3D - Reasoning for Search Relevance using Reinforcement Learning and Distillation
    Sourab Mangrulkar, Ankith M S, Vijay Huddar, Atul Saroop, Sumit Negi, Rahul Bhagat
  • LLMInit: A Free Lunch from Large Language Models for Selective Initialization of Recommendation
    Weizhi Zhang, Liangwei Yang, Wooseong Yang, Henry Peng Zou, Yuqing Liu, Ke Xu, Sourav Medya, Philip S. Yu
  • LLM Agents Implement an NLG System from Scratch: Building Interpretable Rule-Based RDF-to-Text Generators
    Mateusz Lango, Ondrej Dusek
  • Leveraging LLMs to Streamline the Review of Public Funding Applications
    João DS Marques, Andre Vicente Duarte, André Mendes Marques de Carvalho, Gil Rocha, Bruno Martins, Arlindo L. Oliveira
  • AdaSwarm: Adaptive Graph Structure Selection for LLM-based Multi-agent System
    Hui Yi Leong, Yuheng Li, Yuqing Wu, Wei Zhu, Jiechao Gao
  • ColMate: Contrastive Late Interaction and Masked Text for Multimodal Document Retrieval
    Ahmed Masry, Megh Thakkar, Patrice Bechard, Sathwik Tejaswi Madhusudhan, Rabiul Awal, Shambhavi Mishra, Akshay Kalkunte Suresh, Srivatsava Daruru, Enamul Hoque, Spandana Gella, Torsten Scholak, Sai Rajeswar
  • Confidence-Aware Reasoning: Optimizing Self-Guided Thinking Trajectories in Large Reasoning Models
    Jiaxin Zhang
  • Multi-Value-Product Retrieval-Augmented Generation for Industrial Product Attribute Value Identification
    Huike Zou, Haiyang Yang, Yindu Su, Chen Li Yu, Qinye Xie, Chengbao Lian, Qingheng Zhang, Shuguang Han, Fei Huang, Jufeng Chen
  • AttributeForge: An Agentic LLM Framework for Automated Product Schema Modeling
    Yunhan Huang, Klevis Ramo, Andrea Iovine, Melvin Monteiro, Sedat Gokalp, Arjun Bakshi, Hasan Turalic, Arsh Kumar, Jona Neumeier, Rejaul Monir, Simon Hartmann, Mohamed Yakout
  • VestaBench: An Embodied Benchmark for Safe Long-Horizon Planning Under Multi-Constraint and Adversarial Settings
    Tanmana Sadhu, Yanan Chen, Ali Pesaranghader
  • Advancing E-commerce Merchants Telemarketing with Synthetic Data-Driven LLMs
    Qi Gou, Zehua Xia, Li Juan, Qingyang Zhao, Wenjing Yang
  • Medical Knowledge-Guided Depression Detection on Social Media with Large Language Models
    Xiaochong Lan, Zhiguang Han, Yiming Cheng, Li Sheng, Jie Feng, Chen Gao, Yong Li
  • BullyBench: Youth & Experts-in-the-loop Framework for Intrinsic and Extrinsic Cyberbullying NLP Benchmarking
    Kanishk Verma, Sri Balaaji Natarajan Kalaivendan, Joachim Wagner, Arefeh Kazemi, Sinan Asci, Sayani Basak, Isobel Walsh, Darragh McCashin, Alexandros Poulis, Yelena Cherkasova, James O’Higgins Norman, Rebecca Umbach, Tijana Milosevic, Brian Davis
  • Tagging-Augmented Generation: Assisting Language Models in Finding Intricate Knowledge In Long Contexts
    Anwesan Pal, Karen Hovsepian, Tinghao Guo, Mengnan Zhao, Somendra Tripathi, George Mihaila, Nikos Kanakaris, Sumit Nigam
  • DispatchQA: A Benchmark for Small Function Calling Language Models in E-Commerce Applications
    Joachim Daiber, Victor Maricato, Ayan Sinha, Andrew Rabinovich
  • Generalized Embedding Models for Industry 4.0 Applications
    Christodoulos Constantinides, Shuxin Lin, Dhaval C Patel
  • ECHO-LLaMA: Efficient Caching for High-Performance LLaMA Training
    Hossein Rajabzadeh, Maryam Dialameh, Rezaul Karim, Omar Mohamed Awad, Hyock Ju Kwon, Boxing Chen, Walid Ahmed, Yang Liu
  • Generating Spatial Knowledge Graphs from Automotive Diagrams for Question Answering
    Steve Bakos, Chen Xing, Heidar Davoudi, Aijun An, Ron DiCarlantonio
  • Enhancing Persuasive Dialogue Agents by Synthesizing Cross‑Disciplinary Communication Strategies
    Shinnosuke Nozue, Yuto Nakano, Yotaro Watanabe, Meguru Takasaki, Shoji Moriya, Reina Akama, Jun Suzuki
  • BIOPSY - Biomarkers In Oncology: Pipeline for Structured Yielding
    Sanya A. Chetwani, Jaseem Mahmmdla
  • DAST: Difficulty-Adaptive Slow-Thinking for Large Reasoning Models
    Yi Shen, Jian Zhang, Jieyun Huang, Shuming Shi, Wenjing Zhang, Jiangze Yan, Ning Wang, Kai Wang, Zhaoxiang Liu, Shiguo Lian
  • pEBR: A Probabilistic Approach to Embedding Based Retrieval
    Han Zhang, Yunjiang Jiang, Mingming Li, Haowei Yuan, Yiming Qiu, Wen-Yun Yang
  • Finding Diamonds in Conversation Haystacks: A Benchmark for Conversational Data Retrieval
    Yohan Lee, Yongwoo Song, Sangyeop Kim
  • : Structuring Your Natural Language SOPs into Tailored Ambiguity-Resolved Code Templates
    Sachin Kumar Giroh, Pushpendu Ghosh, Aryan Jain, Harshal Giridhari Paunikar, Aditi Rastogi, Promod Yenigalla
  • Recover-LoRA: Data-Free Accuracy Recovery of Degraded Language Models via Low-Rank Adaptation
    Devleena Das, Rajeev Patwari, Ashish Sirasao
  • Efficient Industrial sLLMs through Domain Adaptive Continual Pretraining: Method, Evaluation and Applications
    Seonwu Kim, Yohan Na, Kihun Kim, Hanhee Cho, Geun Lim, Mintae Kim, Seongik Park, Ki Hyun Kim, Youngsub Han, Byoung-Ki Jeon
  • GRAFT: A Graph-based Flow-aware Agentic Framework for Document-level Machine Translation
    Himanshu Dutta, Sunny Manchanda, Prakhar Bapat, Meva Ram Gurjar, Pushpak Bhattacharyya
  • Recon, Answer, Verify: Agents in Search of Truth
    Satyam Shukla, Himanshu Dutta, Pushpak Bhattacharyya
  • T-VEC: A Telecom-Specific Vectorization Model with Enhanced Semantic Understanding via Deep Triplet Loss Fine-Tuning
    Vignesh Ethiraj, Ashwath D, Sidhanth Menon, Divya Vijay, Vidhyakshaya Kannan
  • PlanGPT-VL: Enhancing Urban Planning with Domain-Specific Vision-Language Models
    He Zhu, Junyou Su, Minxin Chen, Wen Wang, Yijie Deng, Guanhua Chen, Wenjia Zhang
  • IPR: Intelligent Prompt Routing with User-Controlled Quality-Cost Trade-offs
    Aosong Feng, Zhichao Xu, Xian Wu, Kang Zhou, Sheng Guan, Yueyan Chen, Ninad Kulkarni, Yun Zhou, Balasubramaniam Srinivasan, Haibo Ding, Lin Lee Cheong
  • Semantic Agreement Enables Efficient Open-Ended LLM Cascades
    Duncan Soiffer, Steven Kolawole, Virginia Smith
  • Lost in Pronunciation: Detecting Chinese Offensive Language Disguised by Phonetic Cloaking Replacement
    Haotan Guo, Jianfei He, Jiayuan Ma, Hongbin Na, Zimu Wang, Haiyang Zhang, Qi Chen, Wei Wang, Zijing Shi, Tao Shen, Ling Chen
  • Distilling Cross-Modal Knowledge into Domain-Specific Retrievers for Enhanced Industrial Document Understanding
    Jinhyeong Lim, Jeongwan Shin, Seeun Lee, Seongdeok Kim, Joungsu Choi, Jongbae Kim, Chun Hwan Jung, Youjin Kang
  • Don’t Forget the Base Retriever! A Low-Resource Graph-based Retriever for Multi-hop Question Answering
    Andre Melo, Enting Chen, Pavlos Vougiouklis, Chenxin Diao, Shriram Piramanayagam, Ruofei Lai, Jeff Z. Pan
  • Beyond Dynamic Quantization: An Efficient Static Hierarchical Mix-precision Framework for Near-Lossless LLM Compression
    Yi Zhang, Kai Zhang, Zheyang Li, Wenming Tan, Ye Ren, Jilin Hu
  • STACKFEED: Structured Textual Actor-Critic Knowledge base editing with FEEDback
    Shashank Kirtania, Naman Gupta, Priyanshu Gupta, Sumit Gulwani, Arun Iyer, Suresh Parthasarathy Iyengar, Arjun Radhakrishna, Sriram K. Rajamani, Gustavo Soares
  • JaCorpTrack: Corporate History Event Extraction for Tracking Organizational Changes
    Yuya Sawada, Hiroki Ouchi, Yuichiro Yasui, Hiroki Teranishi, Yuji Matsumoto, Taro Watanabe, Masayuki Ishii
  • CTR-Guided Generative Query Suggestion in Conversational Search
    Erxue Min, Hsiu-Yuan Huang, Xihong Yang, Min Yang, Xin Jia, Yunfang Wu, Hengyi Cai, Junfeng Wang, Shuaiqiang Wang, Dawei Yin
  • LATTE: Learning Aligned Transactions and Textual Embeddings for Bank Clients
    Egor Fadeev, Dzhambulat Mollaev, Aleksei Shestov, Dima Korolev, Omar Zoloev, Ivan A Kireev, Andrey Savchenko, Maksim Makarenko
  • RedOne: Revealing Domain-specific LLM Post-Training in Social Networking Services
    Fei Zhao, Chonggang Lu, wangyue, Zheyong Xie, Ziyan Liu, Haofu Qian, Jianzhao Huang, Fangcheng Shi, Zijie Meng, Hongcheng Guo, Mingqian He, Xinze Lyu, Zheyu Ye, Weiting Liu, Boyang Wang, Shaosheng Cao
  • High-Quality Medical Dialogue Synthesis for Improving EMR Generation
    Chengze Ge, Yu Xu, Qi Shao
  • Z1: Efficient Test-time Scaling with Code
    Zhaojian Yu, Yinghao Wu, Yilun Zhao, Arman Cohan, Xiao-Ping Zhang
  • Quality Assessment of Tabular Data using Large Language Models and Code Generation
    Ashlesha Akella, Akshar Kaul, Krishnasuri Narayanam, Sameep Mehta
  • PARSE: Parameter Automated Refinement and Schema Extraction
    Anubhav Shrimal, Aryan Jain, Soumyajit Chowdhury, Promod Yenigalla
  • From Long Videos to Engaging Clips: A Human-Inspired Video Editing Framework with Multimodal Narrative Understanding
    Xiangfeng Wang, Xiao Li, Yadong Wei, songxueyu, Yang Song, xiaxiaoqiang, Fangrui Zeng, Zaiyi Chen, liuliu, Gu Xu, Tong Xu
  • Efficiency-Effectiveness Reranking FLOPs for LLM-based Rerankers
    Zhiyuan Peng, Ting-Ruen Wei, Tingyu Song, Yilun Zhao, Yi Fang
  • GEAR: A Scalable and Interpretable Evaluation Framework for RAG-Based Car Assistant Systems
    Niloufar Beyranvand, Hamidreza Dastmalchi, Aijun An, Heidar Davoudi, Winston Chan, Ron DiCarlantonio
  • FQ-Eval: Building Evaluation Dataset for User-centered Follow-up Question Generation
    Sanghyun Seo, Bumsoo Kang, Dahm Lee, Jaeheon Kim, Joongbo Shin, Euisoon Kim, Kijeong Jeon
  • Evaluating AI for Finance: Is AI Credible at Assessing Investment Risk Appetite?
    Divij Chawla, Ashita Bhutada, Duc Anh Do, Abhinav Raghunathan, Vinod SP, Cathy Guo, Dar Win Liew, Prannaya Gupta, Rishabh Bhardwaj, Rajat Bhardwaj, Soujanya Poria
  • A Proactive Reliability Metric for Detecting Failures in Language Model Training
    Maryam Fatima
  • CAPSTONE: Composable Attribute‑Prompted Scene Translation for Zero‑Shot Vision–Language Reasoning
    Md. Ismail Hossain, Shahriyar Zaman Ridoy, Moshiur Farazi, Nabeel Mohammed, Shafin Rahman
  • Building Resource-Constrained Language Agents: A Korean Case Study on Chemical Toxicity Information
    Hojun Cho, Donghu Kim, Soyoung Yang, Chan Lee, Hunjoo Lee, Jaegul Choo
  • AutoDSPy: Automating Modular Prompt Design with Reinforcement Learning for Small and Large Language Models
    Nafew Azim, Abrar Ur Alam, Hasan Bin Omar, Abdullah Mohammad Muntasir Adnan Jami, Jawad Ibn Ahad, Muhammad Rafsan Kabir, Md. Ismail Hossain, Fuad Rahman, Mohammad Ruhul Amin, Shafin Rahman, Nabeel Mohammed