https://www.mediafire.com/file/uh0xkunc9s23fts/pdf-95592-4365.pdf/file
In evaluating AI hallucination (the propensity of language models to generate factually incorrect or fabricated information), benchmark data plays a critical role in assessing and comparing model reliability.