Nature Medicine: An evaluation framework for clinical use of large language models in patient interaction tasks
NeurIPS: ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for LLMs