Learning Conformal Abstention Policies for Adaptive Risk Management in Large Language and Vision-Language Models

Abstract

Large Language and Vision-Language Models (LLMs/VLMs) are increasingly used in safety-critical applications, yet their opaque decision-making complicates risk assessment and reliability. Uncertainty quantification (UQ) helps assess prediction confidence and enables abstention when uncertainty is high. Conformal prediction (CP), a leading UQ method, provides statistical guarantees but relies on static thresholds, which fail to adapt to task complexity and evolving data distributions, leading to suboptimal trade-offs among accuracy, coverage, and informativeness. To address this, we propose learnable conformal abstention, which integrates reinforcement learning (RL) with CP to optimize abstention thresholds dynamically. By treating CP thresholds as adaptive actions, our approach balances multiple objectives, minimizing prediction set size while maintaining reliable coverage. Extensive evaluations across diverse LLM/VLM benchmarks show that our method outperforms Least Ambiguous Classifiers (LAC) and Adaptive Prediction Sets (APS), improving accuracy by up to 3.2%, boosting AUROC for hallucination detection by 22.19%, enhancing uncertainty-guided selective generation (AUARC) by 21.17%, and reducing calibration error by 70%-85%. These improvements hold across multiple models and datasets while consistently meeting the 90% coverage target, establishing our approach as a more effective and flexible solution for reliable decision-making in safety-critical applications. The code is available at: https://github.com/sinatayebati/vlm-uncertainty.
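For readers unfamiliar with the static-threshold baseline that the abstract contrasts against, the following is a minimal sketch of split conformal prediction with an LAC-style score (one minus the probability assigned to the true class) and a simple abstention rule. It is not the learnable RL policy proposed here and is not taken from the linked repository. The function names, the `max_set_size` parameter, and the rule "abstain when the prediction set is empty or larger than `max_set_size`" are assumptions made for illustration.

```python
import math


def lac_threshold(cal_probs, cal_labels, alpha=0.1):
    """Static split-conformal threshold for the score s = 1 - p(true class).

    cal_probs: list of per-class probability lists (held-out calibration set).
    cal_labels: list of true class indices for the calibration set.
    alpha: target miscoverage (0.1 corresponds to a 90% coverage target).
    """
    n = len(cal_labels)
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    # Finite-sample corrected rank of the conformal quantile.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points: every class must be included.
        return float("inf")
    return scores[k - 1]


def prediction_set(probs, qhat):
    """Classes whose score 1 - p_k does not exceed the threshold."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= qhat]


def decide(probs, qhat, max_set_size=1):
    """Return the top class, or None (abstain) if the set is empty or too large.

    The abstention rule is an illustrative assumption, not the paper's policy.
    """
    pred_set = prediction_set(probs, qhat)
    if 1 <= len(pred_set) <= max_set_size:
        return max(pred_set, key=lambda k: probs[k])
    return None
```

In this sketch the threshold is computed once from the calibration data and then reused for every input, which is the static behaviour the abstract describes. A learned policy would instead adjust the threshold (and hence the set size and abstention rate) according to the task and data.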

