Using the Bayesian improved surname geocoding method (BISG) to create a working classification of race and ethnicity in a diverse managed care population

A validation study

Dzifa Adjaye-Gbewonyo, Robert A. Bednarczyk, Robert Davis, Saad B. Omer

Research output: Contribution to journal (Article)

14 Citations (Scopus)

Abstract

Objective: To validate classification of race/ethnicity based on the Bayesian Improved Surname Geocoding method (BISG) and assess variations in validity by gender and age.

Data Sources/Study Setting: Secondary data on members of Kaiser Permanente Georgia, an integrated managed care organization, through 2010.

Study Design: For 191,494 members with self-reported race/ethnicity, probabilities for belonging to each of six race/ethnicity categories predicted from the BISG algorithm were used to assign individuals to a race/ethnicity category over a range of cutoffs greater than a probability of 0.50. Overall as well as gender- and age-stratified sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated. Receiver operating characteristic (ROC) curves were generated and used to identify optimal cutoffs for race/ethnicity assignment.

Principal Findings: The overall cutoffs for assignment that optimized sensitivity and specificity ranged from 0.50 to 0.57 for the four main racial/ethnic categories (White, Black, Asian/Pacific Islander, Hispanic). Corresponding sensitivity, specificity, PPV, and NPV ranged from 64.4 to 81.4 percent, 80.8 to 99.7 percent, 75.0 to 91.6 percent, and 79.4 to 98.0 percent, respectively. Accuracy of assignment was better among males and individuals of 65 years or older.

Conclusions: BISG may be useful for classifying race/ethnicity of health plan members when needed for health care studies.
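The validation measures named in the abstract can be illustrated with a small sketch. This is not the authors' code: the function names, the toy probabilities, and the use of Youden's J (sensitivity + specificity - 1) as the "optimal" criterion are all assumptions made for illustration, since the abstract does not say which criterion was applied to the ROC curves. The sketch also assumes a member is assigned to a category when the predicted probability is at or above the cutoff.

```python
from typing import Dict, Sequence


def confusion_metrics(probs: Sequence[float],
                      is_member: Sequence[bool],
                      cutoff: float) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV for one category at one cutoff.

    probs      -- predicted probability of belonging to the category
    is_member  -- self-reported membership (the reference standard)
    cutoff     -- assign to the category when prob >= cutoff (assumed rule)
    """
    tp = fp = tn = fn = 0
    for p, truth in zip(probs, is_member):
        predicted = p >= cutoff
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif not predicted and truth:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def best_cutoff(probs: Sequence[float],
                is_member: Sequence[bool],
                cutoffs: Sequence[float]) -> float:
    """Return the cutoff with the largest Youden's J (an assumed criterion)."""
    def youden_j(c: float) -> float:
        m = confusion_metrics(probs, is_member, c)
        return m["sensitivity"] + m["specificity"] - 1.0
    return max(cutoffs, key=youden_j)


# Made-up toy data for a single category; not taken from the study.
toy_probs = [0.95, 0.80, 0.56, 0.52, 0.40, 0.20, 0.10, 0.58]
toy_truth = [True, True, True, False, False, False, True, False]
toy_cutoffs = [0.50, 0.55, 0.57]

if __name__ == "__main__":
    chosen = best_cutoff(toy_probs, toy_truth, toy_cutoffs)
    print(chosen, confusion_metrics(toy_probs, toy_truth, chosen))
```

In a stratified analysis like the one described, the same two functions would simply be applied to each gender or age subgroup separately.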

Original language: English (US)
Pages (from-to): 268-283
Number of pages: 16
Journal: Health Services Research
Volume: 49
Issue number: 1
DOI: 10.1111/1475-6773.12089
State: Published - Feb 1 2014
Externally published: Yes

Fingerprint

Geographic Mapping
Validation Studies
Managed Care Programs
Population
Sensitivity and Specificity
Information Storage and Retrieval
Hispanic Americans
ROC Curve
Organizations
Delivery of Health Care
Health

All Science Journal Classification (ASJC) codes

  • Health Policy

Cite this

Using the Bayesian improved surname geocoding method (BISG) to create a working classification of race and ethnicity in a diverse managed care population: A validation study. / Adjaye-Gbewonyo, Dzifa; Bednarczyk, Robert A.; Davis, Robert; Omer, Saad B.

In: Health Services Research, Vol. 49, No. 1, 01.02.2014, p. 268-283.


@article{5ee9d0b295ff4be5b1005a6b5b1d1e00,
title = "Using the {B}ayesian improved surname geocoding method ({BISG}) to create a working classification of race and ethnicity in a diverse managed care population: A validation study",
abstract = "Objective To validate classification of race/ethnicity based on the Bayesian Improved Surname Geocoding method (BISG) and assess variations in validity by gender and age. Data Sources/Study Setting Secondary data on members of Kaiser Permanente Georgia, an integrated managed care organization, through 2010. Study Design For 191,494 members with self-reported race/ethnicity, probabilities for belonging to each of six race/ethnicity categories predicted from the BISG algorithm were used to assign individuals to a race/ethnicity category over a range of cutoffs greater than a probability of 0.50. Overall as well as gender- and age-stratified sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated. Receiver operating characteristic (ROC) curves were generated and used to identify optimal cutoffs for race/ethnicity assignment. Principal Findings The overall cutoffs for assignment that optimized sensitivity and specificity ranged from 0.50 to 0.57 for the four main racial/ethnic categories (White, Black, Asian/Pacific Islander, Hispanic). Corresponding sensitivity, specificity, PPV, and NPV ranged from 64.4 to 81.4 percent, 80.8 to 99.7 percent, 75.0 to 91.6 percent, and 79.4 to 98.0 percent, respectively. Accuracy of assignment was better among males and individuals of 65 years or older. Conclusions BISG may be useful for classifying race/ethnicity of health plan members when needed for health care studies.",
author = "Adjaye-Gbewonyo, Dzifa and Bednarczyk, {Robert A.} and Davis, Robert and Omer, {Saad B.}",
year = "2014",
month = "2",
day = "1",
doi = "10.1111/1475-6773.12089",
language = "English (US)",
volume = "49",
pages = "268--283",
journal = "Health Services Research",
issn = "0017-9124",
publisher = "Wiley-Blackwell",
number = "1",

}

TY  - JOUR
T1  - Using the Bayesian improved surname geocoding method (BISG) to create a working classification of race and ethnicity in a diverse managed care population
T2  - A validation study
AU  - Adjaye-Gbewonyo, Dzifa
AU  - Bednarczyk, Robert A.
AU  - Davis, Robert
AU  - Omer, Saad B.
PY  - 2014/2/1
Y1  - 2014/2/1
N2 - Objective To validate classification of race/ethnicity based on the Bayesian Improved Surname Geocoding method (BISG) and assess variations in validity by gender and age. Data Sources/Study Setting Secondary data on members of Kaiser Permanente Georgia, an integrated managed care organization, through 2010. Study Design For 191,494 members with self-reported race/ethnicity, probabilities for belonging to each of six race/ethnicity categories predicted from the BISG algorithm were used to assign individuals to a race/ethnicity category over a range of cutoffs greater than a probability of 0.50. Overall as well as gender- and age-stratified sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated. Receiver operating characteristic (ROC) curves were generated and used to identify optimal cutoffs for race/ethnicity assignment. Principal Findings The overall cutoffs for assignment that optimized sensitivity and specificity ranged from 0.50 to 0.57 for the four main racial/ethnic categories (White, Black, Asian/Pacific Islander, Hispanic). Corresponding sensitivity, specificity, PPV, and NPV ranged from 64.4 to 81.4 percent, 80.8 to 99.7 percent, 75.0 to 91.6 percent, and 79.4 to 98.0 percent, respectively. Accuracy of assignment was better among males and individuals of 65 years or older. Conclusions BISG may be useful for classifying race/ethnicity of health plan members when needed for health care studies.

UR  - http://www.scopus.com/inward/record.url?scp=84892825940&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84892825940&partnerID=8YFLogxK
U2  - 10.1111/1475-6773.12089
DO  - 10.1111/1475-6773.12089
M3  - Article
VL  - 49
SP  - 268
EP  - 283
JO  - Health Services Research
JF  - Health Services Research
SN  - 0017-9124
IS  - 1
ER  -