Logit Fingerprinting: A Novel, Accuracy-Independent Method for Validating Large Language Model Stability in High-Stakes Clinical Applications

Source: PubMed "apis"

AMIA Jt Summits Transl Sci Proc. 2026 Jun 1;2026:287-292. eCollection 2026.

ABSTRACT

The integration of Large Language Models (LLMs) into clinical settings requires quality assurance mechanisms capable of detecting the hidden effects of model compression and architectural instability. Conventional accuracy metrics often fail to capture the behavioral volatility introduced by quantization, distillation, and sparse architectures. We propose the "Single-Token Forced-Choice Logit Probe," a method that generates a "behavioral fingerprint" of a model by analyzing its decision-making stability on a domain-specific (MedQA) benchmark. Validated on 11 local model families, our approach achieved 100% accuracy in distinguishing full-precision models from quantized variants. Furthermore, a longitudinal audit of commercial APIs revealed a distinct "Stability Gap": distilled "Nano" models exhibited nearly double the decision instability (2.82% vs. 1.58% Flip Rate) of their standard counterparts. Forensic classification identified the underlying compression techniques (Q8 vs. FP8), while analysis suggests the inherent non-determinism stems from Sparse Mixture-of-Experts (SMoE) routing. We conclude that Flip Rate is a critical safety metric and that distilled and quantized models require rigorous stability auditing before clinical deployment.

PMID: 42317854 | PMC: PMC13274300
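The abstract does not define Flip Rate formally. As a minimal sketch, the Python below assumes it is the share of benchmark questions whose forced-choice answer (the option letter with the highest logit) is not identical across repeated, otherwise identical runs. The names `forced_choice`, `flip_rate`, and `CHOICES`, the four-option format, and the toy logit values are illustrative assumptions, not the authors' implementation or data.

```python
from typing import Dict, List, Sequence

# Assumption: four-option multiple-choice items, as is common for MedQA-style questions.
CHOICES = ("A", "B", "C", "D")


def forced_choice(logits: Dict[str, float], choices: Sequence[str] = CHOICES) -> str:
    """Return the answer letter with the highest logit among the allowed choices."""
    return max(choices, key=lambda c: logits[c])


def flip_rate(runs: List[List[str]]) -> float:
    """Fraction of questions whose chosen answer differs in at least one run.

    runs[r][q] is the letter chosen in run r for question q.
    This definition is an assumption; the paper may define Flip Rate differently.
    """
    if len(runs) < 2:
        raise ValueError("need at least two runs to measure flips")
    n_questions = len(runs[0])
    if n_questions == 0 or any(len(r) != n_questions for r in runs):
        raise ValueError("runs must be non-empty and cover the same questions")
    # zip(*runs) groups the answers given to each question across all runs.
    flips = sum(1 for answers in zip(*runs) if len(set(answers)) > 1)
    return flips / n_questions


# Toy logits for two questions, queried twice (hypothetical values).
run_1 = [{"A": 2.1, "B": 0.3, "C": -1.0, "D": 0.0},
         {"A": 0.2, "B": 1.9, "C": 1.8, "D": -0.5}]
run_2 = [{"A": 2.0, "B": 0.4, "C": -0.9, "D": 0.1},
         {"A": 0.2, "B": 1.7, "C": 1.8, "D": -0.5}]

answers = [[forced_choice(q) for q in run] for run in (run_1, run_2)]
rate = flip_rate(answers)
```

In this toy case the first question is answered "A" in both runs, while the second flips from "B" to "C", so one of two questions flips. Comparing such rates between a reference model and a compressed or distilled variant is the kind of stability audit the abstract argues for.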
