2026 Papers
- [Speech-LLM] [ICML] ADEPT: RL-Aligned Agentic Decoding of Emotion via Evidence Probing Tools – From Consensus Learning to Ambiguity-Driven Emotion Reasoning. In ICML 2026
- [Dialogue] [ICML] Stream RAG: Instant and Accurate Spoken Dialogue Systems with Streaming Tool Usage. In ICML 2026
- [Evaluation] [ICML] LALM-as-a-Judge: Benchmarking Large Audio-Language Models for Safety Evaluation in Multi-Turn Spoken Dialogues. In ICML 2026
- [Speech-LLM] [ICML] AudioChat: Unified Audio Storytelling, Editing, and Understanding with Transfusion Forcing. In ICML 2026
- [Evaluation] [ACL] Full-Duplex-Bench-v2: A Multi-Turn Evaluation Framework for Duplex Dialogue Systems with an Automated Examiner. In ACL 2026
- [ASR] [ACL] POWSM: A Phonetic Open Whisper-Style Speech Foundation Model. In ACL 2026
- [Evaluation] [ACL] PRiSM: Benchmarking Phone Realization in Speech Models. In ACL 2026
- [SLU] [ACL Findings] PlanRAG-Audio: Planning and Retrieval Augmented Generation for Long-form Audio Understanding. In ACL Findings 2026
- [ASR] [ACL] Speech-Hands: A Self-Reflection Voice Agentic Approach to Speech Recognition and Audio Reasoning with Omni Perception. In ACL 2026
- [Dialogue] [ACL Findings] Optimizing Conversational Quality in Spoken Dialogue Systems with Reinforcement Learning from AI Feedback. In ACL Findings 2026
- [Speech-LLM] [ICLR] UALM: Unified Audio Language Model for Understanding, Generation and Reasoning. In ICLR 2026
- [SE] [ICLR] MAPSS: Manifold-based Assessment of Perceptual Source Separation. In ICLR 2026
- [SE] [ICASSP] ICASSP 2026 URGENT Speech Enhancement Challenge. In ICASSP 2026
- [ASR] [ICASSP] SSVD-O: Parameter-Efficient Fine-Tuning with Structured SVD for Speech Recognition. In ICASSP 2026
- [Speech-LLM] [ICASSP] Reasoning Beyond Majority Vote: An Explainable SpeechLM Framework for Speech Emotion Recognition. In ICASSP 2026
- [SE] [ICASSP] 2025 URGENT Speech Enhancement Challenge Multilingual P.808 Listening Tests: Approach and Results. In ICASSP 2026
- [Evaluation] [ICASSP] Full-Duplex-Bench v1.5: Evaluating Overlap Handling for Full-Duplex Speech Models. In ICASSP 2026
- [ASR] [ICASSP] CALM: Joint Contextual Acoustic-Linguistic Modeling for Personalization of Multi-Speaker ASR. In ICASSP 2026
- [Tokenizer] [ICASSP] Phonological Tokenizer: Prosody-Aware Phonetic Token via Multi-Objective Fine-Tuning with Differentiable K-Means. In ICASSP 2026
- [SSL] [ICASSP] Online Register for Dual-Mode Self-Supervised Speech Models: Mitigating the Lack of Future Context. In ICASSP 2026
- [Evaluation] [EACL] CSPB: Conversational Speech Processing Benchmark for Self-supervised Speech Models. In EACL 2026
- [Tokenizer] [EACL Findings] BSCodec: A Band-Split Neural Codec for High-Quality Universal Audio Reconstruction. In EACL Findings 2026