AI versus Faculty: A Comparative Study of Narrative Feedback on Medical Students' Written Mental Status Exams
Abstract
Objective: The authors evaluated narrative feedback generated by ChatGPT and faculty members for medical students' Mental Status Exam (MSE) write-ups. The study compared feedback quality and usefulness and assessed whether students and academic psychiatrists could identify the feedback source.
Methods: Medical students (N=164) wrote MSEs and received blinded feedback from either a faculty member (for low-scoring write-ups, n=43) or ChatGPT (for high-scoring write-ups, n=121). Students rated the feedback's quality and usefulness and guessed its origin. Three academic psychiatrists also conducted a blinded evaluation, rating both feedback types for the low-scoring MSEs, choosing the superior version, and guessing the source.
Results: Students rated AI feedback quality significantly higher than faculty feedback (mean=4.22 vs. 3.50). Academic psychiatrists preferred the AI-generated feedback in 93% of cases. Only 29% of students receiving AI feedback correctly identified its source. Psychiatrists correctly identified AI feedback only 23% of the time and misattributed faculty feedback as AI-generated 71% of the time.
Conclusions: AI-generated feedback was perceived as high-quality by students and preferred by expert raters. The difficulty in distinguishing AI from faculty feedback suggests generative AI can produce feedback comparable or superior to human experts, offering a scalable tool to support medical education and reduce faculty workload.
Article Details
How to Cite
S. CLEAVES, Elle et al.
AI versus Faculty: A Comparative Study of Narrative Feedback on Medical Students' Written Mental Status Exams.
Medical Research Archives, [S.l.], v. 14, n. 3, apr. 2026.
ISSN 2375-1924.
Available at: <https://esmed.org/MRA/mra/article/view/7347>. Date accessed: 06 apr. 2026.
doi: https://doi.org/10.18103/mra.v14i3.7347.
Keywords
Artificial Intelligence, Medical Education, Feedback, Mental Status Examination, Psychiatry
Section
Research Articles
The Medical Research Archives grants authors the right to publish and reproduce the unrevised contribution in whole or in part at any time and in any form for any scholarly non-commercial purpose with the condition that all publications of the contribution include a full citation to the journal as published by the Medical Research Archives.