A surgery oral examination

Interrater agreement and the influence of rater characteristics

Kenneth W. Burchard, Pamela A. Rowland-Morin, Nicholas P.W. Coe, Jane L. Garb

Research output: Contribution to journal › Article

31 Citations (Scopus)

Abstract

Background. Poor interrater reliability is a common objection to the use of oral examinations. Method. In 1990 the authors measured the agreement of 140 U.S. and Canadian surgical raters and the influences, if any, of age, years in practice, and experience as an examiner on individual oral examination scores. Eight actor examinees memorized transcripts of actual oral examinations and were videotaped using a single examiner. Examinee verbal style, dress, content of answers, and gender were purposefully adjusted. A repeated-measures analysis of variance was used for data analysis. Results. Three aspects of examinee performance influenced scores (verbal style, dress, and content of answers). No rater characteristic significantly affected scores. Raters showed high agreement (86%) when rating “good” performances but less agreement (67%) when rating “poor” performances. Conclusion. The oral examination scores were not influenced by rater selection. The raters ranked good performances more consistently than poor performances. Therefore, more than one examiner appears necessary to confirm a poor performance during an examination.
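The scoring analysis described in the abstract is a repeated-measures ANOVA. As an illustration only (not the authors' actual analysis; the data, factor structure, and variable names below are entirely hypothetical), a design in which every rater scores every videotaped performance could be analyzed along these lines in Python:

```python
# Illustrative sketch only: a repeated-measures ANOVA in the spirit of the
# study design (each rater scores every videotaped performance), using
# hypothetical data and variable names -- not the authors' actual analysis.
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(0)
n_raters = 140                                      # raters act as the repeated "subjects"
performances = [f"tape_{i}" for i in range(1, 9)]   # 8 videotaped examinee performances

rows = []
for rater in range(n_raters):
    for tape in performances:
        rows.append({
            "rater": rater,
            "performance": tape,
            "score": rng.integers(1, 10),           # hypothetical oral-exam score
        })
df = pd.DataFrame(rows)

# Within-subject factor: performance (every rater rates every tape exactly once).
result = AnovaRM(df, depvar="score", subject="rater",
                 within=["performance"]).fit()
print(result)
```

In the study itself, the manipulated examinee attributes (verbal style, dress, content of answers, gender) would enter as crossed factors rather than the single "performance" factor used in this simplified sketch.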

Original language: English (US)
Pages (from-to): 1044-1046
Number of pages: 3
Journal: Academic Medicine
Volume: 70
Issue number: 11
State: Published - Jan 1 1995

Fingerprint

  • surgery
  • examination
  • examiner
  • performance
  • rating
  • analysis of variance
  • data analysis
  • gender
  • experience

All Science Journal Classification (ASJC) codes

  • Education

Cite this

Burchard, K. W., Rowland-Morin, P. A., Coe, N. P. W., & Garb, J. L. (1995). A surgery oral examination: Interrater agreement and the influence of rater characteristics. Academic Medicine, 70(11), 1044-1046.

@article{9ecc9d59cf804065ba898461cf920c22,
title = "A surgery oral examination: Interrater agreement and the influence of rater characteristics",
abstract = "Background. Poor interrater reliability is a common objection to the use of oral examinations. Method. In 1990 the authors measured the agreement of 140 U.S. and Canadian surgical raters and the influences, if any, of age, years in practice, and experience as an examiner on individual oral examination scores. Eight actor examinees memorized transcripts of actual oral examinations and were videotaped using a single examiner. Examinee verbal style, dress, content of answers, and gender were purposefully adjusted. A repeated-measures analysis of variance was used for data analysis. Results. Three aspects of examinee performance influenced scores (verbal style, dress, and content of answers). No rater characteristic significantly affected scores. Raters showed high agreement (86{\%}) when rating “good” performances but less agreement (67{\%}) when rating “poor” performances. Conclusion. The oral examination scores were not influenced by rater selection. The raters ranked good performances more consistently than poor performances. Therefore, more than one examiner appears necessary to confirm a poor performance during an examination.",
author = "Burchard, {Kenneth W.} and Rowland-Morin, {Pamela A.} and Coe, {Nicholas P.W.} and Garb, {Jane L.}",
year = "1995",
month = "1",
day = "1",
language = "English (US)",
volume = "70",
pages = "1044--1046",
journal = "Academic Medicine",
issn = "1040-2446",
publisher = "Lippincott Williams and Wilkins",
number = "11",

}
