arXiv AI · March 21, 2026
DEAF: A Benchmark for Diagnostic Evaluation of Acoustic Faithfulness in Audio Language Models