Abstract
Background
This study evaluated the psychometric properties of a new, comprehensive measure of knowledge about genomic sequencing, the University of North Carolina Genomic Knowledge Scale (UNC-GKS).
Methods
The UNC-GKS assesses knowledge in four domains thought to be critical for informed decision making about genomic sequencing. The scale was validated using classical test theory and item response theory in 286 adult patients and 132 parents of pediatric patients undergoing diagnostic whole exome sequencing (WES) in the NCGENES study.
Results
The UNC-GKS assessed a single underlying construct (genomic knowledge) with good internal reliability (Cronbach’s alpha = 0.90). Scores were most informative (able to discriminate between individuals with different levels of genomic knowledge) at one standard deviation above the scale mean or lower, a range that included most participants. Convergent validity was supported by associations with health literacy and numeracy (rs = 0.40–0.46). The scale functioned well across subgroups differing in sex, race/ethnicity, education, and English proficiency.
Discussion
Findings supported the promise of the UNC-GKS as a valid and reliable measure of genomic knowledge among people facing complex decisions about WES and comparable sequencing methods. It is neither disease- nor population-specific, and it functioned well across important subgroups, making it usable in diverse populations.
Keywords: genomic sequencing, knowledge, whole exome sequencing, informed decision making
Rapidly evolving genetic testing practices have begun to include panels of dozens of genes in some clinical scenarios, or even more comprehensive tests ranging from thousands of genes to the entire genome or exome (e.g., whole exome sequencing, or WES).1 Genome-scale tests (“genomic sequencing”) are more complex than single gene tests in important ways. For instance, they yield a wide spectrum of potential results, many of which have uncertain meaning. Providing informed consent for genomic sequencing requires that people have knowledge that includes, but goes beyond, knowledge needed for informed consent for single gene testing.2,3 People with greater knowledge of the nature of genes and their effects on health, how genes are inherited in families, and the potential benefits, harms, and limitations of genomic sequencing are better equipped than their less knowledgeable peers to make informed decisions about undergoing sequencing, comprehend the meaning and limitations of their results, and take appropriate actions upon learning these results4,5. Yet, many people currently offered genomic sequencing have inadequate knowledge and misconceptions about basic genetics2,6 and are unfamiliar with genomic sequencing. The unique and complex issues raised by genome-scale tests are not well-covered by existing knowledge measures. Having a validated, comprehensive measure of genomic knowledge could help identify knowledge gaps and reduce the chance that people’s decisions and responses to genomic sequencing are based on false assumptions, unrealistic expectations, or misconceptions.
To meet the need for a valid way to assess genomic knowledge, we conducted the present study to evaluate a new measure, the University of North Carolina Genomic Knowledge Scale (UNC-GKS). We conceptualized “genomic knowledge” as encompassing four domains: the structure and function of genes, how they are inherited, their relation to health, and potential benefits, harms, and limitations of WES—a sequencing method that identifies variants in the subset of the genome that encodes the genes. These domains are based on our clinical experiences with patients offered sequencing and reflect a pragmatic approach that is highly relevant to typical research applications of genomic sequencing. The domains are also consistent with a framework discussed by Smerecnik and colleagues7 that includes awareness knowledge (knowing that there are genetic risk factors for disease), how-to knowledge (knowing how those risk factors influence risk for developing a disease), and principles knowledge (knowledge of pathways through which genes are theorized to influence health). Our comprehensive definition allows evaluation of the extent to which people have a basic working knowledge they can use to evaluate pros and cons, risks, uncertainties, and alternatives to genomic sequencing.
The UNC-GKS was also designed to address several limitations in existing measures. First, many existing knowledge measures focus on testing for mutations in a single gene (e.g., 8) or are specific to a particular disease (e.g., 9,10). However, genome-scale tests, in addition to becoming increasingly common, raise unique and complex issues not well-covered by these kinds of existing measures. Moreover, because the UNC-GKS is general rather than disease-specific, scores can be compared across populations affected by different diseases. The UNC-GKS can also be used in populations affected by diseases for which disease-specific measures do not exist. A second issue addressed by the UNC-GKS relates to the sociodemographic diversity of most patient populations. Given recognized subgroup differences in knowledge and views of genomic sequencing,11,12 it is often useful to compare knowledge scores across subgroups in a study’s sample. However, these comparisons are only informative if observed subgroup knowledge differences reflect real differences rather than measurement artifacts. It is rare to see formal analyses investigating the psychometric functioning of a knowledge measure across different subgroups. Finally, some existing knowledge measures assess agreement or disagreement with statements about genetics or genomics13—a common approach for measuring beliefs or attitudes that may, but does not necessarily, correspond to knowledge. Studies in educational testing and knowledge assessment more often use items with multiple choice or true/false response options that can be unambiguously scored as correct or incorrect14,15.
Accordingly, our goal was to develop a measure that met the following criteria: (1) it covers domains of knowledge relevant to the complex decision contexts created by genome-scale sequencing; (2) it applies to a broad range of contexts and populations rather than being specific to a particular disease or population; (3) it has adequate validity and reliability across important sociodemographic subgroups; and (4) it uses a true/false response scale. No existing measures meet these criteria. We note that a recently introduced measure of basic knowledge about genetics and genetic causes of disease16 used true/false response options; however, it was evaluated in a sample with little diversity and analyses did not examine the psychometric functioning of its items across subgroups, so it is unclear whether the measure can assess knowledge with similar validity and reliability across important subgroups. Moreover, it does not include items about genomic sequencing and implications of potential sequencing results. The UNC-GKS includes those types of items, and we examined its item functioning across subgroups varying by sex, race/ethnicity, education, and English language proficiency. Consequently, the present study was expected to yield a new tool for research with diverse populations offered genomic sequencing.
Methods
Participants
Participants were enrolled in the North Carolina Clinical Genomic Evaluation by Next-generation Exome Sequencing (NCGENES) study, which is investigating the performance and best use of WES in the diagnosis and clinical care of patients with suspected genetic disorders3. Adult and pediatric patients evaluated at a UNC-affiliated hospital or at Vidant Medical Center (Greenville, NC) were eligible for NCGENES if they had symptoms or an illness with a possible genetic etiology (as determined by the referring physician) and if they were in one of these diagnostic groups: hereditary cancers, cardiovascular disorders (mainly cardiomyopathies), neurodevelopmental disorders, congenital disorders, retinopathies, or selected other disorders (e.g., mitochondrial disorders). Adult patients and parents of pediatric patients sequenced in NCGENES completed study measures including the UNC-GKS. Participants also included a small sample of guardians of adult patients whose cognitive or physical functioning precluded completion of study procedures; however, none were included in the present sample. We recruited 418 participants (286 adult patients, 132 parents) for the present study’s sample between August 2012 and December 2014. All completed study measures in English.
Procedures
Eligible individuals were contacted by study staff to schedule a study visit and then were mailed an appointment letter, consent and HIPAA forms, educational brochures designed for the study, and an intake questionnaire that included the UNC-GKS. Potential participants then met with a certified genetic counselor who obtained informed consent for sequencing. At this meeting, consenting participants returned their completed intake questionnaire, completed health literacy measures, and had their blood drawn. Data for the present study came from the intake questionnaire, the literacy measures, and UNC Hospitals chart abstraction. Prior to completing the UNC-GKS, participants had the opportunity to read the mailed educational brochures, which provided an overview of genomic sequencing and potential sequencing results. They had not yet received the more specific, personalized information provided during informed consent procedures. Their intake UNC-GKS scores therefore may not reflect the level of genomic knowledge in the general public; instead, these scores approximate knowledge likely to be found in candidates for sequencing. The institutional review boards of the University of North Carolina and Vidant Medical Center approved the study protocols.
Measures
Development of UNC-GKS
The UNC-GKS was developed in an iterative process that gathered feedback on measure domains and on item content and clarity from a team that included certified genetic counselors and medical geneticists with extensive clinical experience educating patients, behavioral scientists with formal training in communication and measure development, and others with and without genetics expertise. This team identified four key domains for the measure: the structure and function of genes, how they are inherited in families, their relation to health, and strengths and limitations of WES. We viewed the latter domain as a potentially separate module that future users could adapt or replace for other sequencing contexts (e.g., newborn or population screening). We reviewed existing measures and adapted or drafted knowledge items with the goal of ensuring good content validity across the four domains and cohesion across the underlying construct of genomic testing knowledge17. Some items within each domain specifically addressed misconceptions that could affect informed decision making (e.g., A mother and daughter who look alike are more genetically similar than a mother and daughter who do not look alike). We used genetic terms that participants were exposed to throughout the study (e.g., in consent procedures, brochures, and counseling). Because the term “gene variant” was especially important, the instructions included the following reminder: “We are using the term ‘gene variant’ to mean a version of a gene. Sometimes two people have the same version of a gene (they have the same gene variant) and other times two people have different versions of a gene (they have different gene variants).”
The resulting measure includes 25 items framed as statements and uses three response categories: true, false, and not sure/don’t know (the last provided to minimize guessing). The statements and correct answers appear in Table 1. We scored correct responses as 1 and both incorrect and not sure/don’t know responses as 0.
Table 1. UNC-GKS items by content area.

| Content Area | Item |
|---|---|
| Genes | 1. Genes are made of DNA. |
| | 2. Genes affect health by influencing the proteins our bodies make. |
| | 3. All of a person’s genetic information is called his or her genome. |
| | 4. A person’s genes change completely every 7 years.* |
| | 5. The DNA in a gene is made of four building blocks (A, C, T, and G). |
| | 6. Everyone has about 20,000 to 25,000 genes. |
| Genes and health | 7. Gene variants can have positive effects, harmful effects, or no effects on health. |
| | 8. Most gene variants will affect a person’s health.* |
| | 9. Everyone who has a harmful gene variant will eventually have symptoms.* |
| | 10. Some gene variants have a large effect on health while others have a small effect. |
| | 11. Some gene variants decrease the chance of developing a disorder. |
| | 12. Two unrelated people with the same genetic variant will always have the same symptoms.* |
| How genes are inherited in families | 13. Genetic disorders are always inherited from a parent.* |
| | 14. If only one person in the family has a disorder it can’t be genetic.* |
| | 15. Everyone has a chance for having a child with a genetic disorder. |
| | 16. A girl inherits most of her genes from her mother while a boy inherits most of his genes from his father.* |
| | 17. A mother and daughter who look alike are more genetically similar than a mother and daughter who do not look alike.* |
| | 18. If a parent has a harmful gene variant, all of his or her children will inherit it.* |
| | 19. If one of your parents has a gene variant, your brother or sister may also have it. |
| Whole exome sequencing | 20. Whole exome sequencing can find variants in many genes at once. |
| | 21. Whole exome sequencing will find variants that cannot be interpreted at the present time. |
| | 22. Whole exome sequencing could find that you have a high risk for a disorder even if you do not have symptoms. |
| | 23. Your whole exome sequencing may not find the cause of your disorder even if it is genetic. |
| | 24. The gene variants that whole exome sequencing can find today could have different meanings in the future as scientists learn more about how genes work. |
| | 25. Whole exome sequencing will not find any variants in people who are healthy.* |
Note: Correct answer to the items is true unless followed by an asterisk (*).
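To make the scoring rule concrete, the sketch below scores one participant’s responses against the answer key in Table 1. This is a minimal illustration rather than study code; the response coding ("true"/"false"/"not_sure") and the function name are assumptions.

```python
# Answer key from Table 1: items are keyed "true" unless asterisked.
FALSE_KEYED = {4, 8, 9, 12, 13, 14, 16, 17, 18, 25}
ANSWER_KEY = {i: ("false" if i in FALSE_KEYED else "true") for i in range(1, 26)}

def score_unc_gks(responses: dict) -> int:
    """Summed score: 1 per correct response; incorrect and
    'not sure/don't know' responses both score 0."""
    return sum(responses.get(item, "not_sure") == key
               for item, key in ANSWER_KEY.items())

# Example: a participant answers items 1-3 correctly, item 4 incorrectly,
# and is unsure about the rest.
print(score_unc_gks({1: "true", 2: "true", 3: "true", 4: "true"}))  # -> 3
```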
Sociodemographic and medical characteristics
Sociodemographic variables came from clinical records or self-report in the intake questionnaire. They included participant sex, race/ethnicity, educational attainment, and annual household income. Nominating clinicians reported patients’ diagnosis or symptoms; this information was supplemented and confirmed during the informed consent session.
English proficiency
We used the 3-item subscale of the Cultural Identity Scale18 to assess English proficiency in speaking, reading, and writing. Responses, ranging from 1 (Poor) to 4 (Excellent), were summed to create a single score for which higher scores indicate greater proficiency (Cronbach’s alpha=0.90).
Health literacy
Study staff assessed general health literacy and genetics-related health literacy in person using the 66-item Rapid Estimate of Adult Literacy in Medicine (REALM)19 and the 8-item Rapid Estimate of Adult Literacy in Genetics (REAL-G)6. For both scales, we created scores by summing the number of words a participant pronounced correctly; words that a participant pronounced incorrectly or skipped were not counted. Higher scores indicate greater general health literacy and genetics-related health literacy, respectively. REALM raw scores can be used to categorize people as having low health literacy (scores of 0–44, ≤ sixth grade reading level), marginal health literacy (scores of 45–60, seventh to eighth grade reading level), or functional health literacy (scores of 61–66, ≥ ninth grade reading level)19. REAL-G scores of three or less have been interpreted as indicating low genetics-related health literacy (≤ sixth grade level)6.
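For reference, the published cut points above map directly to categories; a brief sketch (illustrative only, with labels paraphrased from the text):

```python
def realm_category(score: int) -> str:
    """Categorize a REALM raw score (0-66 correctly pronounced words)."""
    if score <= 44:
        return "low (6th grade and below)"
    if score <= 60:
        return "marginal (7th-8th grade)"
    return "functional (9th grade and above)"

def real_g_low_literacy(score: int) -> bool:
    """REAL-G scores of 3 or less indicate low genetics-related literacy."""
    return score <= 3
```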
Numeracy
We measured subjective numeracy with a validated 3-item version of the Subjective Numeracy Scale20–22. Items assess perceived numerical aptitude and preference for numbers on a scale from 1 (Not at all good/helpful) to 6 (Extremely good/helpful). Summing responses yields a single score for which higher scores indicate stronger preference for numerical over textual information and greater perceived ability to perform mathematical tasks (Cronbach’s alpha=0.89). We measured objective numeracy with a validated measure that presents three arithmetic problems testing the use of proportions, fractions, and percentages23. Summing correct responses yields an objective numeracy score.
Data analysis
We examined the psychometric properties, factor structure (to evaluate the assumption that all items reflect a single underlying construct—in this case, genomic knowledge), and convergent validity (to evaluate the scale’s association with conceptually related variables) of the UNC-GKS. We also conducted item response theory (IRT) analyses that offer more in-depth information than classical test theory methods, including evaluation of variation in item performance (differential item functioning) across demographic subgroups.
Item-level descriptive statistics
First, we examined the proportion of participants correctly answering each UNC-GKS item to evaluate whether items were too easy (ceiling effects, indicated by >90% of participants answering them correctly) or too hard (floor effects, indicated by >90% of participants answering them incorrectly). Second, we computed inter-item tetrachoric correlations24 to evaluate whether UNC-GKS items were positively associated with each other, as we would expect. Third, we evaluated whether responses to each item were consistent with the sum of the responses to the remaining items by examining whether item-total correlations were positive. Negative or low item-total correlations indicate items that may need to be reworded or discarded.
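A minimal sketch of these item-level checks, assuming a complete-case matrix of 0/1 item scores (proportion correct and corrected item-total correlations are shown; tetrachoric correlations require specialized estimation routines and are omitted here):

```python
import numpy as np

def item_descriptives(scores: np.ndarray):
    """scores: n_participants x n_items matrix of 0/1 item scores.
    Returns (proportion correct per item, corrected item-total correlations)."""
    p_correct = scores.mean(axis=0)
    total = scores.sum(axis=1)
    # Correlate each item with the sum of the *remaining* items (rest score),
    # so the item does not correlate with itself.
    item_total = np.array([
        np.corrcoef(scores[:, j], total - scores[:, j])[0, 1]
        for j in range(scores.shape[1])
    ])
    return p_correct, item_total
```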
Factor analyses
Before completing IRT analyses, we checked the assumption that the measure was unidimensional by conducting a confirmatory factor analysis of the inter-item tetrachoric correlation matrix using the Mplus25 “weighted least squares with robust standard errors, mean- and variance-adjusted” algorithm. We evaluated model fit using the root mean square error of approximation (RMSEA, acceptable if <0.05)26, the Tucker-Lewis index (TLI, acceptable if >0.95)27, the comparative fit index (CFI, acceptable if >0.95)28, and residual correlations between items via modification indices. Large modification indices (>10) reveal possible local dependence between sets of items, indicating possible violation of the local independence assumption of IRT. Local dependencies indicate content redundancy or similar wording between two or more items and may suggest additional factors exist in the scale.
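For readers unfamiliar with these indices, the standard closed forms are as follows, with χ²_M and df_M from the fitted model, χ²_B and df_B from the baseline (independence) model, and N the sample size:

```latex
\mathrm{RMSEA} = \sqrt{\frac{\max(\chi^2_M - df_M,\,0)}{df_M\,(N - 1)}},
\qquad
\mathrm{CFI} = 1 - \frac{\max(\chi^2_M - df_M,\,0)}{\max(\chi^2_M - df_M,\;\chi^2_B - df_B,\;0)}
```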
IRT analyses
We evaluated performance of the UNC-GKS items by fitting one-, two-, and three-parameter logistic IRT models (1PL, 2PL, and 3PL, respectively) using the software program IRTPRO29. The 1PL model30 characterizes each item by a single parameter—the difficulty parameter, b, which indicates the level of genomic knowledge at which there is a 50% chance of answering the item correctly (that is, how difficult the item is). The 2PL model31 estimates both b and an additional parameter, the discrimination parameter, a, which reflects the degree to which item responses are associated with the latent construct being measured (how effectively an item discriminates between individuals with higher versus lower genomic knowledge). The 3PL model31 estimates the a and b parameters and an additional parameter, c, which accounts for guessing. We chose the best-fitting model by examining likelihood ratio chi-square tests for each pair of nested models, then examined this model’s goodness of fit to the data using Orlando and Thissen’s S-X2 statistic32,33, for which a nonsignificant result indicates adequate model fit at the item level (i.e., how well each item fits the model). We controlled for multiple comparisons using the Benjamini-Hochberg procedure34,35.
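In standard notation, with θ denoting a person’s latent genomic knowledge, the probability of a correct response to item i under each model is (the 1PL is shown with its slope fixed at 1; in practice a single common slope may be estimated):

```latex
P_{\mathrm{1PL}}(\theta) = \frac{1}{1 + e^{-(\theta - b_i)}},
\qquad
P_{\mathrm{2PL}}(\theta) = \frac{1}{1 + e^{-a_i(\theta - b_i)}},
\qquad
P_{\mathrm{3PL}}(\theta) = c_i + \frac{1 - c_i}{1 + e^{-a_i(\theta - b_i)}}
```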
In addition to items flagged for potential local dependence in the confirmatory factor analysis, we used the IRT-based local dependence statistic36 to identify items that were excessively related after controlling for the underlying construct (genomic knowledge)—an undesirable characteristic. Values >10 indicate substantial local dependence. We then conducted an additional check on the dimensionality of the data by estimating a bifactor IRT model in which each locally dependent set of items was specified as a second-order factor. Violations of local independence were deemed negligible if the explained common variance accounted for by the first-order, or general, factor37–40 was at least 0.85.
Next, we examined differential item functioning (DIF), which enables evaluation of whether items behave differently across subgroups after holding the underlying construct (genomic knowledge) constant41. It detects a form of measurement bias that occurs when people in different groups with the same level of the underlying construct have a different probability of getting a particular score on a scale. DIF may indicate that attributes other than those the scale is intended to measure are affecting responses. In the present study, we examined DIF across sex, race/ethnicity, education, and English proficiency groups. For each item, we used a logistic regression model to evaluate whether item responses were associated with group membership after controlling for participants’ IRT score on the UNC-GKS. Uniform DIF (of a similar magnitude across the range of the underlying construct) was evaluated with a likelihood ratio test comparing a logistic regression model with one predictor (IRT score) to a model with both IRT score and an additional predictor (group membership); this approach allowed us to evaluate whether, after controlling for overall level of genomic knowledge, one group was more likely than the other to answer the item correctly. Non-uniform DIF (for which magnitude may differ across the range of the underlying construct) was evaluated with a likelihood ratio test comparing a model with both predictors (IRT score and group membership) to a model that also included their interaction term. This model allowed us to evaluate whether an item provided better measurement of genomic knowledge for one group versus another. We used the Benjamini-Hochberg procedure to make inferential decisions in multiple comparisons.
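A minimal sketch of this nested logistic regression comparison for a single item and a binary grouping variable (illustrative only; variable and function names are assumptions):

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

def dif_tests(item: np.ndarray, theta: np.ndarray, group: np.ndarray):
    """Likelihood ratio tests for uniform and non-uniform DIF on one item.
    item: 0/1 responses; theta: IRT scores; group: 0/1 group membership.
    Returns (p_uniform, p_nonuniform)."""
    # Model 1: IRT score only.
    base = sm.Logit(item, sm.add_constant(theta)).fit(disp=0)
    # Model 2: adds group membership (uniform DIF).
    uniform = sm.Logit(
        item, sm.add_constant(np.column_stack([theta, group]))).fit(disp=0)
    # Model 3: adds the score-by-group interaction (non-uniform DIF).
    nonuniform = sm.Logit(
        item, sm.add_constant(np.column_stack([theta, group, theta * group]))
    ).fit(disp=0)
    lr_uniform = 2 * (uniform.llf - base.llf)           # df = 1
    lr_nonuniform = 2 * (nonuniform.llf - uniform.llf)  # df = 1
    return stats.chi2.sf(lr_uniform, 1), stats.chi2.sf(lr_nonuniform, 1)
```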
Following common practice, we planned to drop items if they did not fit well, substantially violated local independence, or functioned differently across key groups. The remaining items would then be used to calibrate a final IRT model for use in subsequent analyses.
IRT scoring and reliability
We computed IRT scores for the UNC-GKS based on the parameters from the final IRT model. These scores are relative to this sample’s population, whose latent trait is assumed to be normally distributed with a mean of 0 and standard deviation of 1. To make scores easier to interpret, we rescaled them to the T-score metric, with a mean of 50 and a standard deviation of 10. Analysts typically compute IRT scaled scores based on response patterns, essentially weighting item responses by their IRT a parameters so that items more strongly related to the underlying construct have a greater impact on the score. However, analysts also often use summed scores because they do not require special software. To enable practical use of scaled scores, we computed a scoring table to convert summed scores to expected scaled scores. We also computed a scoring table for a 19-item version of the UNC-GKS that omits the WES items, for use when those items are not needed. The 19-item version was scored using the corresponding 19 IRT parameters from the 25-item calibration so that scores would be on the same scale and comparable.
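In practice, a scoring-table lookup like the following lets users convert summed scores without IRT software. The values are transcribed from Table 4 (in the Results) for the 25-item version; the 19-item version would be handled analogously.

```python
# T-score lookup for the 25-item UNC-GKS, transcribed from Table 4.
T_SCORE_25 = {
    0: 21.7, 1: 24.3, 2: 26.7, 3: 28.8, 4: 30.6, 5: 32.2, 6: 33.7,
    7: 35.0, 8: 36.2, 9: 37.3, 10: 38.4, 11: 39.5, 12: 40.6, 13: 41.6,
    14: 42.7, 15: 43.9, 16: 45.1, 17: 46.3, 18: 47.7, 19: 49.3, 20: 51.1,
    21: 53.1, 22: 55.5, 23: 58.2, 24: 61.6, 25: 65.4,
}

def t_score_25(summed_score: int) -> float:
    """Convert a 25-item summed score (0-25) to its expected T-score."""
    return T_SCORE_25[summed_score]
```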
Next, we used the IRT test information function (TIF) to examine the precision of scale scores—the extent to which an estimate of genomic knowledge at a given scale score is reasonably close to the true value. Given that these scores estimate individuals’ genomic knowledge, greater precision improves the scale’s ability to distinguish between individuals with different levels of genomic knowledge, in addition to providing other useful information. The TIF sums the information functions for the individual items into a single function. Greater test “information” indicates greater precision42. The TIF is depicted graphically, with the amount of information plotted against the latent construct (here, genomic knowledge), showing how well the test estimates the construct over the full range of individuals’ ability or knowledge. The areas of greatest measurement precision are indicated by the highest points of the curve.
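Under the 2PL model, each item’s information at θ is I_i(θ) = a_i² P_i(θ)[1 − P_i(θ)], and the TIF is their sum. A minimal sketch, shown here with the a and b estimates for the first two items in Table 3:

```python
import numpy as np

def tif_2pl(a, b, theta):
    """Test information for a 2PL model over a grid of theta values.
    a, b: item discrimination and difficulty arrays; theta: 1-D grid."""
    a, b, theta = map(np.asarray, (a, b, theta))
    p = 1.0 / (1.0 + np.exp(-a[:, None] * (theta[None, :] - b[:, None])))
    info = (a[:, None] ** 2) * p * (1.0 - p)  # item information curves
    return info.sum(axis=0)                   # test information function

theta = np.linspace(-3, 3, 121)
information = tif_2pl([0.92, 1.72], [-2.56, -0.83], theta)
se = 1.0 / np.sqrt(information)  # conditional standard error of measurement
```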
Classical test theory reliability
We evaluated internal consistency reliability of the 25 UNC-GKS items by computing Cronbach’s coefficient α43. Acceptable α values are at least 0.7044, indicating a set of items that are strongly related to one another.
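Cronbach’s α can be computed directly from the item-score matrix; a minimal sketch:

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score),
    for an n_participants x n_items matrix of 0/1 item scores."""
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)
    total_variance = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)
```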
Convergent validity
We calculated Pearson correlations between the UNC-GKS scale score and the REAL-G, REALM, and subjective and objective numeracy scales to evaluate convergent validity, or the extent to which measures that should be associated with each other are in fact associated. We predicted that the UNC-GKS would correlate positively with genetics-related literacy and general health literacy, because individuals with greater ability to obtain, process, and understand health information should be more able to learn the domains of information assessed by the UNC-GKS. We also predicted positive correlations between the UNC-GKS and both measures of numeracy because the ability to reason and apply numerical concepts influences ability to learn these domains of information45.
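As an illustration of this analysis, here is a small sketch computing a Pearson correlation with a Fisher z-transform 95% confidence interval (the format reported in Table 5); variable names are placeholders and complete cases are assumed:

```python
import numpy as np
from scipy import stats

def pearson_with_ci(x, y, alpha=0.05):
    """Pearson r with p-value and a Fisher z-transform confidence interval."""
    r, p = stats.pearsonr(x, y)
    z = np.arctanh(r)                      # Fisher z-transform of r
    se = 1.0 / np.sqrt(len(x) - 3)         # standard error of z
    zcrit = stats.norm.ppf(1 - alpha / 2)
    lo, hi = np.tanh(z - zcrit * se), np.tanh(z + zcrit * se)
    return r, p, (lo, hi)
```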
Results
Sample
The final sample included 286 adult patients and 132 parents (Table 2). Three-quarters were women and 17% were racial/ethnic minorities. Participants’ mean age was 47 years (range 17–84 years). Nearly 20% of participants had not attended college; just over half had a college degree. The median annual household income category was $60,000–74,999. About 13% of the sample had marginal or worse general health literacy and 6% had low genetics-related health literacy. About 18% reported less than “excellent” proficiency in speaking, writing, and/or reading English.
Table 2. Sample characteristics (N = 418).

| Characteristic | n (%) | Mean (SD) |
|---|---|---|
| Role | | |
|   Adult patient | 286 (68.4) | |
|   Parent of pediatric patient | 132 (31.6) | |
| Age (years)a | | 46.5 (14.3) |
| Sex | | |
|   Female | 315 (75.4) | |
|   Male | 103 (24.6) | |
| Ethnicity | | |
|   Non-Hispanic | 391 (93.5) | |
|   Hispanic | 19 (4.5) | |
|   Missing | 8 (1.9) | |
| Race | | |
|   White | 345 (82.5) | |
|   Non-White | 71 (17.0) | |
|   Missing | 2 (0.5) | |
| Education | | |
|   Less than high school | 28 (6.7) | |
|   High school graduate | 52 (12.4) | |
|   Some college | 88 (21.1) | |
|   Associates degree or vocational program | 69 (16.5) | |
|   4-year college degree | 108 (25.8) | |
|   Graduate degree | 71 (17.0) | |
|   Missing | 2 (0.5) | |
| Income | | |
|   <$30,000 | 107 (25.6) | |
|   $30,000–$59,999 | 83 (19.9) | |
|   $60,000–$89,999 | 84 (20.1) | |
|   $90,000–$104,999 | 17 (4.1) | |
|   >$105,000 | 97 (23.2) | |
|   Missing | 30 (7.2) | |
| Clinical group | | |
|   Hereditary cancers | 100 (23.9) | |
|   Cardiovascular disorders | 46 (11.0) | |
|   Neurodevelopmental disorders | 112 (26.8) | |
|   Congenital disorders | 32 (7.7) | |
|   Other | 128 (30.6) | |
| General health literacy | | 63.0 (6.7) |
|   Functional (9th grade and above) | 358 (85.6) | |
|   Marginal (7th or 8th grade) | 44 (10.5) | |
|   Low (6th grade and below) | 12 (2.9) | |
|   Missing | 4 (1.0) | |
| Genetics-related health literacy | | 7.1 (1.6) |
|   High (above 6th grade) | 384 (91.9) | |
|   Low (6th grade and below) | 23 (5.5) | |
|   Missing | 11 (2.6) | |
| Objective numeracy | | 1.7 (1.0) |
| Subjective numeracy | | 4.6 (1.3) |
aAges for participating parents of pediatric patients were not collected early in the study; therefore, descriptive statistics for participant age are based on all 286 adult patients and 27 of the 132 participating parents.
Scale and item descriptive statistics
The number of missing responses was small, ranging from 2 to 8 of 418 participants per item. The mean proportion of participants correctly answering each item was 0.73 (SD=0.12), with a range of 0.48 to 0.89 across items. Figure 1 shows the distribution of responses for each item, revealing the items for which knowledge and uncertainty were highest and lowest. Items from the four domains assessed by the measure (the structure and function of genes, how they are inherited, their relation to health, and the strengths and limitations of WES) were well distributed across the range of items answered correctly, incorrectly, or with uncertainty. There were no floor or ceiling effects; thus, no items were too easy or too hard for this sample. All inter-item tetrachoric correlations were positive and statistically significant, with a mean of r=0.46 (SD=0.15) and a range of 0.10 to 0.90. Similarly, all item-total correlations were positive and of medium to large magnitude, with a mean of r=0.49 (SD=0.11) and a range of 0.29 to 0.68. Thus, item-level statistics did not identify any items as candidates for revision or removal, and subsequent analyses considered all 25 items. Distributions of the summed scores for the 19- and 25-item versions of the UNC-GKS appear in Figures 2 and 3. Scores are skewed to the left, indicating that participants correctly answered most items.
Factor analyses
The one-factor confirmatory factor analysis model fit the data well (Χ2=476.4, df=275, p<0.001; RMSEA=0.04; CFI=0.96; TLI=0.95), indicating that the items represented a single underlying construct. The standardized factor loadings were positive, statistically significant, and of moderate to large magnitude (0.40–0.95); thus, all items were associated with the underlying construct of genomic knowledge (Table 3). Two modification indices indicated possible local dependence between items 5 and 6, and between items 9 and 12. Fitting a model that allowed the errors for these two item pairs to correlate led to good fit (Χ2=456.0, df=273, p<0.001; RMSEA=0.04; CFI=0.96; TLI=0.96), with a statistically significant improvement in fit over the simpler model (Δχ2 test, df=2, p<0.001). However, the residual correlations were not large (r=0.33 for items 5 and 6, r=0.42 for items 9 and 12) and the items’ text did not suggest content redundancy, so we retained them for further evaluation.
Table 4. Conversion of UNC-GKS summed scores to IRT-scaled T-scores.

| Summed score | T-score (25-item) | Standard error (25-item) | T-score (19-item) | Standard error (19-item) |
|---|---|---|---|---|
| 0 | 21.7 | 5.0 | 21.9 | 5.1 |
| 1 | 24.3 | 4.6 | 24.6 | 4.7 |
| 2 | 26.7 | 4.2 | 27.1 | 4.3 |
| 3 | 28.8 | 3.8 | 29.4 | 4.0 |
| 4 | 30.6 | 3.5 | 31.4 | 3.8 |
| 5 | 32.2 | 3.2 | 33.3 | 3.6 |
| 6 | 33.7 | 3.0 | 35.1 | 3.5 |
| 7 | 35.0 | 2.8 | 36.8 | 3.5 |
| 8 | 36.2 | 2.7 | 38.4 | 3.5 |
| 9 | 37.3 | 2.6 | 40.1 | 3.5 |
| 10 | 38.4 | 2.5 | 41.8 | 3.6 |
| 11 | 39.5 | 2.4 | 43.5 | 3.7 |
| 12 | 40.6 | 2.4 | 45.4 | 3.8 |
| 13 | 41.6 | 2.4 | 47.3 | 4.0 |
| 14 | 42.7 | 2.5 | 49.4 | 4.2 |
| 15 | 43.9 | 2.6 | 51.8 | 4.5 |
| 16 | 45.1 | 2.7 | 54.4 | 4.9 |
| 17 | 46.3 | 2.9 | 57.4 | 5.4 |
| 18 | 47.7 | 3.1 | 61.0 | 5.9 |
| 19 | 49.3 | 3.4 | 65.0 | 6.5 |
| 20 | 51.1 | 3.7 | | |
| 21 | 53.1 | 4.1 | | |
| 22 | 55.5 | 4.5 | | |
| 23 | 58.2 | 5.1 | | |
| 24 | 61.6 | 5.7 | | |
| 25 | 65.4 | 6.3 | | |
Note. This table can be used to convert the summed count of a participant’s correct responses (summed scores) to item response theory scaled scores in the T score metric, allowing easy comparison across studies and populations. These T scores were calculated using this study’s sample, a population mean of 50, and standard deviation of 10. To convert a score, find a participant’s summed score in column 1 and follow the row to determine the IRT-scaled T score and standard error for the 25- and 19-item versions of the UNC-GKS.
Item response theory analyses
The UNC-GKS items varied in their ability to discriminate between participants with differing amounts of genomic knowledge, as indicated by the better fit of the 2PL model relative to the 1PL model (Δχ2 test, df=24, p<0.001). The 2PL model fit the items well; no items exhibited significant misfit (all S-X2 ps>0.05) or substantial local dependence. Participants were not likely to have been guessing when they answered “true” or “false,” given that the 3PL model did not fit better than the 2PL model (Δχ2 test, df=24, p=1.00); participants who were unsure of their responses were likely selecting “not sure/don’t know.” The S-X2 statistics appear in Table 3.
Table 3. Confirmatory factor analysis loadings, 2PL IRT model parameters, and item fit statistics.

| Item number | λ | a | s.e. | b | s.e. | S-X2 | df |
|---|---|---|---|---|---|---|---|
| 1 | 0.55 | 0.92 | 0.18 | −2.56 | 0.43 | 21.55 | 18 |
| 2 | 0.67 | 1.72 | 0.23 | −0.83 | 0.11 | 20.82 | 16 |
| 3 | 0.60 | 1.27 | 0.18 | −0.94 | 0.14 | 20.83 | 19 |
| 4 | 0.59 | 1.19 | 0.17 | −0.84 | 0.14 | 18.69 | 18 |
| 5 | 0.64 | 1.56 | 0.21 | −0.49 | 0.10 | 14.78 | 15 |
| 6 | 0.52 | 0.96 | 0.15 | −0.10 | 0.12 | 15.86 | 17 |
| 7 | 0.75 | 1.80 | 0.28 | −1.66 | 0.17 | 26.53 | 17 |
| 8 | 0.58 | 1.35 | 0.19 | 0.09 | 0.10 | 15.16 | 15 |
| 9 | 0.67 | 1.42 | 0.20 | −0.69 | 0.11 | 19.87 | 15 |
| 10 | 0.84 | 1.98 | 0.30 | −1.52 | 0.15 | 13.01 | 15 |
| 11 | 0.57 | 1.26 | 0.18 | −0.07 | 0.10 | 18.99 | 16 |
| 12 | 0.78 | 1.82 | 0.26 | −1.21 | 0.13 | 18.69 | 15 |
| 13 | 0.40 | 0.59 | 0.13 | −0.62 | 0.22 | 18.27 | 19 |
| 14 | 0.70 | 1.51 | 0.22 | −1.44 | 0.17 | 17.43 | 18 |
| 15 | 0.48 | 0.77 | 0.14 | −1.48 | 0.27 | 23.40 | 20 |
| 16 | 0.65 | 1.40 | 0.20 | −1.19 | 0.15 | 22.67 | 17 |
| 17 | 0.53 | 0.94 | 0.15 | −0.53 | 0.14 | 14.25 | 18 |
| 18 | 0.81 | 1.72 | 0.26 | −1.59 | 0.17 | 17.50 | 16 |
| 19 | 0.77 | 1.80 | 0.26 | −1.40 | 0.15 | 18.06 | 17 |
| 20 | 0.74 | 2.20 | 0.29 | −0.55 | 0.09 | 16.71 | 13 |
| 21 | 0.73 | 2.01 | 0.27 | −0.53 | 0.09 | 9.28 | 14 |
| 22 | 0.95 | 4.90 | 0.90 | −0.90 | 0.07 | 16.04 | 9 |
| 23 | 0.90 | 3.58 | 0.56 | −0.84 | 0.08 | 6.10 | 12 |
| 24 | 0.94 | 3.80 | 0.63 | −1.01 | 0.09 | 13.93 | 11 |
| 25 | 0.86 | 2.91 | 0.42 | −0.79 | 0.08 | 17.36 | 12 |
Note. CFA = confirmatory factor analysis; λ = standardized factor loading; 2PL IRT = two-parameter logistic item response theory; a = discrimination parameter; b = difficulty parameter; s.e. = standard error; df = degrees of freedom.
To further evaluate the residual correlations found in the confirmatory factor analysis model between items 5 and 6 and items 9 and 12, we estimated a bifactor 2PL model with each of these item pairs loading on a second-order factor in addition to the overall genomic knowledge factor. Explained common variance for this model was 0.91; thus, most of the variance in the items was attributable to the overall factor. This finding supported our decision to retain these items and led us to consider the data to be “essentially unidimensional,” meaning that there were no other meaningful underlying dimensions. Thus, these four items could be considered part of the overall factor.
DIF detection analyses, estimated with logistic regression models, did not reveal statistically significant uniform or non-uniform DIF for any item. That is, after controlling for level of genomic knowledge, the scale items did not function differently in their difficulty or in how strongly they related to genomic knowledge when answered by individuals who differed in sex, race/ethnicity, education, or English proficiency. The measure performed comparably across these groups.
Our analytic plan called for items to be dropped prior to calibration of a final IRT model if they did not fit well, substantially violated local dependence, or functioned differently for key groups. Because no items were flagged for removal using these criteria, the original 2PL calibration was the final IRT model used for scoring. As shown in Table 3, the difficulty (b) parameters for the items ranged from −2.56 to 0.09; higher b parameters indicate more difficult items. The discrimination (a) parameters indicated that all items were highly related to the underlying construct of genomic knowledge and that they were able to discriminate between individuals with different levels of genomic knowledge. Items specific to WES (items 20–25) had a parameters >2.0 and thus were particularly good at discriminating between these individuals, perhaps because the items covered information that was relatively unfamiliar to participants, many of whom had not known about WES prior to the study.
Item response theory scoring and reliability
We examined score precision (TIF) and reliability for both forms of the scale (see Figure 4). For the 25-item UNC-GKS, precision and reliability were especially high for T-scores between 32 and 49, and reliability was good (≥ 0.70) for all scores below 61 (approximately one standard deviation above the mean). For the 19-item UNC-GKS, precision and reliability were highest for T-scores between 36 and 40, and reliability was good (≥ 0.70) for all T-scores below 60. Thus, reliability was above accepted cut-offs for individuals who correctly answered ≤ 23 items on the 25-item UNC-GKS or ≤ 17 items on the 19-item UNC-GKS. Table 4 provides a scoring translation table for converting summed scale scores to T-scores based on the final (2PL) IRT model using the overall sample as a reference group. The correlation between T-scores from the two versions of the scale was 0.97.
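These reliability cut-offs map onto the test information function through the standard IRT relations: on the T-score metric (SD = 10), the conditional standard error and reliability at a given level of genomic knowledge θ are

```latex
SE(\theta) = \frac{10}{\sqrt{I(\theta)}},
\qquad
\mathrm{reliability}(\theta) = 1 - \frac{SE(\theta)^2}{10^2}
```

so the 0.70 reliability threshold corresponds to a conditional standard error of about 5.5 T-score units, or I(θ) ≥ 3.3.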
Classical test theory reliability
Internal consistency reliability was high (Cronbach’s α=0.90) for the 25-item UNC-GKS and for the 19-item UNC-GKS (α=0.86).
Convergent validity
As expected, the UNC-GKS correlated moderately and positively with general health literacy, genetics-related health literacy, and objective and subjective numeracy (Table 5), providing evidence of convergent validity.
Table 5. Correlations between the UNC-GKS and convergent validity measures.

| Scale | Correlation with UNC-GKS | 95% CI | n |
|---|---|---|---|
| Genetics-related health literacy | 0.46* | 0.38–0.53 | 407 |
| Health literacy | 0.40* | 0.31–0.47 | 410 |
| Subjective numeracy | 0.43* | 0.34–0.52 | 412 |
| Objective numeracy | 0.41* | 0.32–0.48 | 418 |
*p<0.001
Discussion
In order to make informed decisions about accepting or declining sequencing, understand their sequencing results, and decide on appropriate ways to apply these results, people need knowledge about the structure and function of genes, how genes are inherited in families, effects of genes on health, and potential benefits, harms, and limitations of sequencing4,5. Our study used rigorous methods to evaluate the psychometric properties of the UNC-GKS, which assesses these domains of knowledge. Findings indicate that all of the scale’s items measured genomic knowledge well and, taken together, they cover a broad range of critical genomic knowledge. Thus, the UNC-GKS can generate a single score representing a person’s genomic knowledge.
Importantly, findings indicated that the UNC-GKS measures genomic knowledge in the same way regardless of people’s sociodemographic background. That is, different subgroup scores indicate different levels of genomic knowledge rather than differences in the way people in certain subgroups understood scale items or differences in their response patterns that are unrelated to their level of genomic knowledge. As such, the UNC-GKS is valid for a variety of subgroups and researchers can use it with diverse populations. For instance, group differences in UNC-GKS scores could be used to advance understanding of disparities in genomic sequencing.
Furthermore, UNC-GKS scores were most informative at or below one standard deviation above the scale’s mean. Thus, the scores are precise for most individuals. Reliability is slightly lower, although still well within an acceptable range, for those who answer all items correctly or give only one incorrect answer (i.e., those with scores at least one standard deviation above the scale’s mean). These high-scoring individuals may have less need for educational interventions, so this psychometric property of the UNC-GKS does not detract from its usefulness. In sum, the measure is reliable and especially sensitive for identifying people with varying degrees of low to moderate genomic knowledge—the group in greatest need of education and decision support when offered genomic sequencing.
We created two versions of the UNC-GKS: a 25-item version that includes items assessing knowledge about strengths and limitations of WES (the context for which we developed the scale) and a 19-item version that excludes these WES-specific items. T-scores were comparable for the two forms—an advantage for researchers and clinicians who use only the first 19 items because they are not offering WES, or who wish to develop a new set of items specific to the type of testing they are using. Our team has developed, and will validate, versions of the scale that replace the WES items with new items relevant to the application of genomic sequencing to newborn screening and to screening adults for highly penetrant mutations that confer risk for treatable or preventable diseases.
Moreover, findings indicated that participants were not guessing, likely as a positive result of including the “not sure/don’t know” response option. This response option also allows evaluation of the extent to which individuals are unsure of responses versus having been forced to indicate that each item is true or false, without being able to indicate uncertainty. Being able to evaluate uncertainty may have practical utility in some applications. For instance, items with high unsure/don’t know responses may represent specific areas that should be targeted with educational efforts.
Limited literacy and numeracy—basic abilities necessary for seeking, comprehending, and using health information46,47—are prevalent problems and highly relevant for researchers and clinicians who need to educate individuals about genomic sequencing. Both lower literacy and lower numeracy were associated with lower scores on the UNC-GKS. These associations have been identified as important in the context of genetic testing and genomic sequencing48, and they have contributed to calls to increase the educational efforts needed to make scientific knowledge about genetics and genomics more broadly accessible4.
Additional research will be needed to evaluate additional aspects of validity, including other correlates of genomic knowledge measured by the UNC-GKS, group differences in genomic knowledge that may contribute to disparities in decision making and decision outcomes, and whether UNC-GKS scores identify people at risk for experiencing poor psychosocial or medical outcomes after testing. Research using this measure may also help guide development of psychoeducational interventions to address knowledge deficits, targeting clinicians (e.g., training in communicating effectively about genomic sequencing), patients and research participants (e.g., more useful and accessible educational resources), or both.
Future studies should also address limitations of the present study. Some of our subgroups were smaller than a suggested subgroup size of 100 for DIF analyses49. This rule of thumb is an estimate and, given the distribution of our item responses, we believe our results to be reliable. Yet, a more diverse sample would enable more in-depth examination of differential item functioning across subgroups, including the potential for more difficult or complex items to function poorly in some subgroups. We note that we did not find evidence for this type of problem, perhaps in part because our items were iteratively reviewed by measurement and clinical experts. However, it would be valuable to recruit a larger proportion of underserved minority groups, individuals with less than a high school education, and those with low health literacy. In addition, the research was conducted primarily at an academic medical center in a single geographic area. Future research should evaluate the measure in community-based samples and other geographic areas. Finally, the scale is likely to be too long for some purposes, and validation of a briefer form could increase its usefulness.
Conclusion
This study used a rigorous approach to demonstrate that the UNC-GKS is a promising tool for advancing research on people’s decisions about and responses to genomic sequencing. The scale has promising psychometric properties across different sociodemographic subgroups, making it appropriate for research in diverse populations. In addition, it covers domains of knowledge considered by a multidisciplinary team of experts to be critical for informed decision making about genomic sequencing, which is rapidly replacing single gene testing in some populations. Moreover, the scale is not disease or population specific, making it usable across a wide range of populations facing complex decisions involving next generation genomic sequencing.
Acknowledgments
This research was supported by the National Human Genome Research Institute of the National Institutes of Health under award number U01HG006487 (PIs: James P. Evans, Jonathan S. Berg, Karen E. Weck, Kirk C. Wilhelmsen, and Gail E. Henderson). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
We would also like to thank Kristy Lee and Kate Foreman for piloting and providing helpful feedback on the scale.
Footnotes
Conflict of interest statement
None of the authors have conflicts of interest to disclose. The study sponsor was not involved in study design, data collection, analysis or interpretation of data, writing this report, or the decision to submit the report for publication.
References
- 1.Warman CJ, Beaulieu C, Hartley T, et al. Axons to Exons: the Molecular Diagnosis of Rare Neurological Diseases by Next-Generation Sequencing. Curr Neurol Neurosci Rep. 2015;15(9):64. doi: 10.1007/s11910-015-0584-7.
- 2.Bernhardt BA, Roche MI, Perry DL, et al. Experiences with obtaining informed consent for genomic sequencing. Am J Med Genet A. 2015;167A(11):2635–2646. doi: 10.1002/ajmg.a.37256.
- 3.Roche MI, Berg JS. Incidental findings with genomic testing: implications for genetic counseling practice. Curr Genet Med Rep. 2015;3(4):166–176. doi: 10.1007/s40142-015-0075-9.
- 4.Secretary's Advisory Committee on Genetics, Health, and Society. Genetics Education and Training. Washington, DC: Department of Health & Human Services; 2011.
- 5.Lautenbach DM, Christensen KD, Sparks JA, et al. Communicating genetic risk information for common disorders in the era of genomic medicine. Annu Rev Genomics Hum Genet. 2013;14:491–513. doi: 10.1146/annurev-genom-092010-110722.
- 6.Erby LH, Roter D, Larson S, et al. The rapid estimate of adult literacy in genetics (REAL-G): a means to assess literacy deficits in the context of genetics. Am J Med Genet A. 2008;146A(2):174–181. doi: 10.1002/ajmg.a.32068.
- 7.Smerecnik CM, Mesters I, de Vries NK, et al. Educating the general public about multifactorial genetic disease: applying a theory-based framework to understand current public knowledge. Genet Med. 2008;10(4):251–258. doi: 10.1097/GIM.0b013e31816b4ffd.
- 8.Green MJ, Biesecker BB, McInerney AM, et al. An interactive computer program can effectively educate patients about genetic testing for breast cancer susceptibility. Am J Med Genet. 2001;103(1):16–23. doi: 10.1002/ajmg.1500.
- 9.Richman AR, Tzeng JP, Carey LA, et al. Knowledge of genomic testing among early-stage breast cancer patients. Psychooncology. 2011;20(1):28–35. doi: 10.1002/pon.1699.
- 10.Scuffham TM, McInerny-Leo A, Ng SK, et al. Knowledge and attitudes towards genetic testing in those affected with Parkinson's disease. J Community Genet. 2014;5(2):167–177. doi: 10.1007/s12687-013-0168-7.
- 11.Singer E, Antonucci T, Van Hoewyk J. Racial and ethnic variations in knowledge and attitudes about genetic testing. Genet Test. 2004;8(1):31–43. doi: 10.1089/109065704323016012.
- 12.Haga SB, Barry WT, Mills R, et al. Public knowledge of and attitudes toward genetics and genetic testing. Genet Test Mol Biomarkers. 2013;17(4):327–335. doi: 10.1089/gtmb.2012.0350.
- 13.Kaphingst KA, Facio FM, Cheng MR, et al. Effects of informed consent for individual genome sequencing on relevant knowledge. Clin Genet. 2012;82(5):408–415. doi: 10.1111/j.1399-0004.2012.01909.x.
- 14.Frisbie DA. Multiple choice versus true-false: a comparison of reliabilities and concurrent validities. J Educ Meas. 1973;10(4):297–304.
- 15.Ebel RL, Frisbie DA. Essentials of Educational Measurement. 5th ed. Englewood Cliffs, NJ: Prentice Hall; 1991.
- 16.Fitzgerald-Butt SM, Bodine A, Fry KM, et al. Measuring genetic knowledge: a brief survey instrument for adolescents and adults. Clin Genet. 2016;89(2):235–243. doi: 10.1111/cge.12618.
- 17.DeVellis RF. Scale Development: Theory and Applications. 2nd ed. Vol. 26. Newbury Park, CA: Sage Publications; 2003.
- 18.Félix-Ortiz M, Newcomb MD, Myers H. A multidimensional measure of cultural identity for Latino and Latina adolescents. Hisp J Behav Sci. 1994;16(2):99–115.
- 19.Davis TC, Crouch MA, Long SW, et al. Rapid assessment of literacy levels of adult primary care patients. Fam Med. 1991;23(6):433–435.
- 20.Fagerlin A, Zikmund-Fisher BJ, Ubel PA, et al. Measuring numeracy without a math test: development of the Subjective Numeracy Scale. Med Decis Making. 2007;27(5):672–680. doi: 10.1177/0272989X07304449.
- 21.McNaughton CD, Wallston KA, Rothman RL, et al. Short, subjective measures of numeracy and general health literacy in an adult emergency department. Acad Emerg Med. 2011;18(11):1148–1155. doi: 10.1111/j.1553-2712.2011.01210.x.
- 22.McNaughton CD, Cavanaugh KL, Kripalani S, et al. Validation of a short, 3-item version of the Subjective Numeracy Scale. Med Decis Making. 2015;35(8):932–936. doi: 10.1177/0272989X15581800.
- 23.Schwartz LM, Woloshin S, Black WC, et al. The role of numeracy in understanding the benefit of screening mammography. Ann Intern Med. 1997;127(11):966–972. doi: 10.7326/0003-4819-127-11-199712010-00003.
- 24.Wirth RJ, Edwards MC. Item factor analysis: current approaches and future directions. Psychol Methods. 2007;12(1):58–79. doi: 10.1037/1082-989X.12.1.58.
- 25.Muthén LK, Muthén BO. Mplus User's Guide. 7th ed. Los Angeles, CA: Muthén & Muthén; 1998–2012.
- 26.Steiger JH. Structural model evaluation and modification: an interval estimation approach. Multivar Behav Res. 1990;25(2):173–180. doi: 10.1207/s15327906mbr2502_4.
- 27.Tucker LR, Lewis C. A reliability coefficient for maximum likelihood factor analysis. Psychometrika. 1973;38(1):1–10.
- 28.Bentler PM. Comparative fit indexes in structural models. Psychol Bull. 1990;107(2):238–246. doi: 10.1037/0033-2909.107.2.238.
- 29.Cai L, Thissen D, du Toit SHC. IRTPRO for Windows. Lincolnwood, IL: Scientific Software International; 2011.
- 30.Thissen D. Marginal maximum likelihood estimation for the one-parameter logistic model. Psychometrika. 1982;47:201–214.
- 31.Birnbaum A. Some latent trait models and their use in inferring an examinee's ability. In: Lord FM, Novick MR, editors. Statistical Theories of Mental Test Scores. Reading, MA: Addison-Wesley; 1968. pp. 392–479.
- 32.Orlando M, Thissen D. Further investigation of the performance of S-X2: an item fit index for use with dichotomous item response theory models. Appl Psychol Meas. 2003;27(4):289–298.
- 33.Orlando M, Thissen D. Likelihood-based item-fit indices for dichotomous item response theory models. Appl Psychol Meas. 2000;24(1):48–62.
- 34.Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Roy Stat Soc B Met. 1995;57(1):289–300.
- 35.Williams VSL, Jones LV, Tukey JW. Controlling error in multiple comparisons, with examples from state-to-state differences in educational achievement. J Educ Behav Stat. 1999;24(1):42–69.
- 36.Chen WH, Thissen D. Local dependence indexes for item pairs: using item response theory. J Educ Behav Stat. 1997;22(3):265–289.
- 37.Reise SP, Moore TM, Haviland MG. Bifactor models and rotations: exploring the extent to which multidimensional data yield univocal scale scores. J Pers Assess. 2010;92(6):544–559. doi: 10.1080/00223891.2010.496477.
- 38.ten Berge JMF, Socan G. The greatest lower bound to the reliability of a test and the hypothesis of unidimensionality. Psychometrika. 2004;69(4):613–625.
- 39.Bentler PM. Alpha, dimension-free, and model-based internal consistency reliability. Psychometrika. 2009;74(1):137–143. doi: 10.1007/s11336-008-9100-1.
- 40.Sijtsma K. On the use, the misuse, and the very limited usefulness of Cronbach's alpha. Psychometrika. 2009;74(1):107–120. doi: 10.1007/s11336-008-9101-0.
- 41.Thissen D, Steinberg L, Wainer H. Detection of differential item functioning using the parameters of item response models. In: Holland PW, Wainer H, editors. Differential Item Functioning. Hillsdale, NJ: Erlbaum; 1993. pp. 67–113.
- 42.Thissen D, Wainer H. Test Scoring. J Educ Meas. 2002;39(3):265–268.
- 43.Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika. 1951;16(3):297–334.
- 44.Streiner DL, Norman GR. Health Measurement Scales: A Practical Guide to Their Development and Use. 2nd ed. Oxford: Oxford University Press; 1995.
- 45.Lea DH, Kaphingst KA, Bowen D, et al. Communicating genetic and genomic information: health literacy and numeracy considerations. Public Health Genomics. 2011;14(4–5):279–289. doi: 10.1159/000294191.
- 46.Institute of Medicine, Committee on Health Literacy. Health Literacy: A Prescription to End Confusion. Washington, DC: National Academies Press; 2004.
- 47.Kutner M, Greenberg E, Jin Y, et al. The Health Literacy of America's Adults: Results from the 2003 National Assessment of Adult Literacy (NCES 2006–483). Washington, DC: U.S. Department of Education; 2006.
- 48.Lea DH, Kaphingst KA, Bowen D, et al. Communicating genetic and genomic information: health literacy and numeracy considerations. Public Health Genomics. 2011;14(4–5):279–289. doi: 10.1159/000294191.
- 49.Lai JS, Teresi J, Gershon R. Procedures for the analysis of differential item functioning (DIF) for small sample sizes. Eval Health Prof. 2005;28(3):283–294. doi: 10.1177/0163278705278276.