Abstract
Objective. The majority of practicing pharmacists and student pharmacists are women. However, instruments to assess perceptions of gender equity within pharmacy academia are not available. The objective of this research was to describe the psychometric analysis of a questionnaire developed by a Gender Equity Task Force to assess perceptions of gender equity and to report reliability and validity evidence.
Methods. A questionnaire with 21 items addressing teaching, research, service, advancement, mentoring, recruitment, and the gender of college leaders was created. The survey was distributed via email in December 2020 to all social and administrative science section members of two professional associations. Rasch analysis was performed to evaluate the reliability and validity evidence for the questionnaire.
Results. After reverse coding, all items met parameters for unidimensionality necessary for Rasch analysis. Once adjacent categories were merged to create a 3-point scale, the scale and items met parameters for appropriate functionality. Items were ordered hierarchically in order of difficulty. The modified instrument and scale can be treated as interval level data for future use.
Conclusion. This analysis provides reliability and validity evidence supporting use of the gender equity questionnaire in the social and administrative academic pharmacy population if recommended edits such as the 3-point scale are used. Future research on gender equity can benefit from use of a psychometrically sound questionnaire for data collection.
Keywords: gender equity, academia, pharmacy, Rasch analysis
INTRODUCTION
Women are participating in the pharmacy workforce in greater numbers than ever before, with approximately 60% of pharmacists (all settings) identifying as women.1 In the 2020-2021 academic year, women comprised 51% of the academic pharmacy workforce, an increase from 46.4% in the 2010-2011 academic year.2 Given that 64.9% of 2020 applicants to schools and colleges of pharmacy were female, this trend is expected to continue.3 The 2019 National Pharmacist Workforce Study of primarily practicing pharmacists revealed that approximately 75% of women experienced some form of discrimination in the workplace.4 In the academic pharmacy setting, previous evidence suggests that there may be different barriers to advancement between men and women,5 such as differences in responsibilities that may be valued disproportionately in promotion and tenure processes.6,7 While some of these studies were published 3-6 years ago, anecdotal evidence suggests the gender gap may still be an issue within academic pharmacy. Additionally, a 2021 commentary on gender inequity identified the need for future studies to establish more current baseline measures.1
Within academic pharmacy, the extent of the issue(s) among social and administrative science faculty members is not known. As a result, a task force with members representing the Economic, Social and Administrative Science (ESAS) section of the American Pharmacists Association Academy of Pharmaceutical Research and Science (APhA-APRS ESAS) and Social and Administrative Sciences (SAS) section of the American Association of Colleges of Pharmacy (AACP SAS), hereafter referred to as the Gender Equity Task Force, investigated the following questions: Is there evidence to suggest that there is gender disparity in pay, responsibilities, leadership opportunities, rank, tenure status, career advancement, and “treatment” by faculty colleagues, students, and administration for social and administrative science faculty members? If so, what should the ESAS and SAS Sections do to address these disparities? To accomplish these goals, the Gender Equity Task Force completed the following tasks: created and administered a questionnaire to assess faculty perceptions of gender equity; created and conducted semi-structured interviews to gather qualitative information on perceptions of gender equity; and provided suggested actions for APhA and AACP to take to achieve gender equity. This paper focuses on the psychometric analysis of a questionnaire the task force developed that specifically focused on ESAS/SAS members of two professional organizations. Quantitative results from the survey and qualitative results from semi-structured interviews were presented in separate manuscripts. The objective of this study was to assess and describe the psychometric analysis of a gender equity questionnaire and to report reliability and validity evidence.
METHODS
Data were collected using a cross-sectional survey design. The University of Arizona Institutional Review Board determined that this quality improvement project was exempt.
The survey items and focus group guide were derived by the task force members from work conducted by The University of Texas (UT) at Austin Gender Equity Task Force.8 One of the study authors led the design and conduct of a focus group of African-American women in that previous UT task force, and items were developed to address the issues that were identified by the committee. The final instrument included 20 questions addressing the following areas: teaching, research, service, advancement, mentoring, and recruitment, as well as one item on the gender of college leaders. Each item was measured on a 5-point Likert scale with response options ranging from “favoring men much more/more,” through “same for men and women,” to “favoring women more/much more.” An “I don’t know” response was also included. One question asked about the participant’s experience with gender inequity and, if they had experienced it, at what academic title and type of institution it occurred. Additional questions included the perceived impact of the department chair’s gender on department collegiality and demographic questions. A copy of the questionnaire is provided in Appendix 1.
The survey was distributed via email in December 2020 to all AACP SAS Section members (n=609) and APhA-APRS ESAS Section members (n=326). The survey recruitment email and one reminder email were sent out two weeks apart by AACP and APhA to their members. The survey was available for one month. Because there is overlap in membership between the two associations, the survey settings in Qualtrics were adjusted so that respondents could answer the survey only once.
Rasch analysis was performed to evaluate the reliability and validity evidence for the questionnaire. The Rasch rating scale model was used due to the polytomous nature of the response choices. If data fit the Rasch model, the ordinal scale data can produce interval level measurement.9 For this analysis, Winsteps, version 4.8.2, was used.10
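For context on the rating scale model, the probability that a respondent selects a given category is governed by the person measure, the item difficulty, and a set of Andrich thresholds shared across items, all expressed in logits. The following is a minimal sketch, not an output of Winsteps; the numeric values are illustrative, not estimates from this study.

```python
import math

def rsm_category_probs(theta, delta, thresholds):
    """Andrich rating scale model: probability of each response category
    given person ability theta, item difficulty delta, and shared
    category thresholds tau_1..tau_m (all in logits)."""
    # Category 0 has a cumulative exponent of 0 by convention; each
    # higher category adds (theta - delta - tau_k) for its threshold.
    exponents = [0.0]
    total = 0.0
    for tau in thresholds:
        total += theta - delta - tau
        exponents.append(total)
    numerators = [math.exp(e) for e in exponents]
    denom = sum(numerators)
    return [n / denom for n in numerators]

# Hypothetical case: a person 0.5 logits above an item's difficulty on a
# 3-point scale with thresholds at -1.0 and +1.0 logits; the middle
# category is the most probable response.
probs = rsm_category_probs(theta=0.5, delta=0.0, thresholds=[-1.0, 1.0])
```

The probabilities sum to one across categories, which is what the category probability curves in Figure 1 display across the range of person measures.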
Because unidimensionality is required to perform Rasch analysis, dimensionality was assessed using the parameters established in the Winsteps Manual.11 The standardized residual variance was evaluated. Unidimensionality was supported when the raw variance explained by measures was 60% or more, when the unexplained variance in the first contrast had an eigenvalue below two, and when the modeled and empirical raw variance explained by measures were similar. Then, the item-total correlations were assessed to ensure they were all positive and at least 0.2.
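The item-total correlation screen can be illustrated as follows. This sketch uses a simple correlation between each item and the total of the remaining items as a stand-in for Winsteps' point-measure correlation, and the simulated responses are hypothetical; a reversed item produces a negative correlation, which is the pattern that flags it for reverse coding.

```python
import numpy as np

def corrected_item_total_correlations(responses):
    """For each item, correlate its scores with the total score of the
    remaining items. responses: respondents x items array."""
    responses = np.asarray(responses, dtype=float)
    n_items = responses.shape[1]
    corrs = []
    for i in range(n_items):
        # Exclude the item itself from the total to avoid inflation.
        rest_total = responses.sum(axis=1) - responses[:, i]
        corrs.append(np.corrcoef(responses[:, i], rest_total)[0, 1])
    return np.array(corrs)

# Illustrative data: three items share a common trait; the fourth is
# scored in the opposite direction, so its correlation comes out
# negative and flags it for reverse coding.
rng = np.random.default_rng(0)
base = rng.integers(1, 6, size=(50, 1))          # latent 5-point level
noise = rng.integers(-1, 2, size=(50, 3))        # item-level noise
data = np.clip(base + noise, 1, 5)
data = np.column_stack([data, 6 - data[:, 0]])   # reverse-scored item
corrs = corrected_item_total_correlations(data)
```

Reverse coding the flagged item (recoding score x to 6 - x on a 5-point scale) turns the negative correlation positive, after which the 0.2 floor can be checked.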
Next, scale functionality was assessed. The response options were analyzed as a 5-point scale (ie, responses of “I do not know” were not analyzed). Scale functionality was assessed using the Linacre criteria.12 If a rating scale meets these criteria for Rasch model fit, the ordinal level data approximate interval level data. The Linacre criteria indicate that each response category should contain at least 10 observations; the response categories should have a regular observation distribution; average measures should advance monotonically with category; outfit mean squares (MNSQ), which are sensitive to outliers, should be less than 2; MNSQs should range from 0.6 to 1.4 for each response category; Andrich thresholds (the points at which adjacent category probability curves cross) should be ordered and increase from negative to positive; ratings should imply measures and measures should imply ratings; and Andrich thresholds should advance by at least 1.4 logits but less than 5 logits between adjacent categories. If the scale did not meet these requirements, Linacre’s recommendation to merge adjacent categories and reassess scale functionality was followed.
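The category-merging step amounts to a simple recoding of the response data before refitting the model. The numeric codes in this sketch are illustrative, not the study's actual coding.

```python
# Collapse the original 5-point scale into a 3-point scale: the two
# "men"-favoring categories merge, the middle stays, and the two
# "women"-favoring categories merge (codes 1/2/3 are illustrative).
MERGE = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}

def merge_categories(responses):
    """Recode 5-point responses onto the merged 3-point scale.
    None stands in for 'I do not know,' which was excluded from the
    Rasch analysis and is passed through unchanged here."""
    return [MERGE[r] if r is not None else None for r in responses]

merged = merge_categories([1, 2, 3, 4, 5, None])
# merged -> [1, 1, 2, 3, 3, None]
```

After recoding, the Linacre criteria are re-checked on the 3-point scale, which is the step at which this study's scale met all functionality parameters.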
Person and item model fit were then assessed. When participants and items fit the model, their Z standard (ZSTD) values are between -2 and 2 and their MNSQ values are between 0.7 and 1.3.13 If items did not fit, the most misfitting item was considered for removal. If an item was removed, the analysis was rerun.
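Both mean-square statistics are functions of the squared residuals between observed responses and the responses the model expects. The sketch below uses hypothetical observed values, expectations, and model variances rather than output from this study's analysis.

```python
import numpy as np

def fit_statistics(observed, expected, variance):
    """Mean-square fit statistics from Rasch residuals.
    outfit: unweighted mean of squared standardized residuals
    (sensitive to outlying responses); infit: information-weighted
    version (sensitive to misfit near the person's or item's own
    measure). Values below 1 indicate less variation than the model
    expects; values above 1 indicate excess noise."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    variance = np.asarray(variance, dtype=float)
    resid_sq = (observed - expected) ** 2
    outfit = np.mean(resid_sq / variance)
    infit = resid_sq.sum() / variance.sum()
    return infit, outfit

# Hypothetical responses to one item that track the model expectations
# closely, yielding mean-squares below 1 (overfit rather than noise).
obs = np.array([1.0, 2.0, 2.0, 3.0])
exp = np.array([1.2, 1.9, 2.1, 2.8])
var = np.array([0.5, 0.6, 0.6, 0.5])
infit, outfit = fit_statistics(obs, exp, var)
```

An item such as the classroom civility item discussed in the Results, with an acceptable infit but an elevated outfit, would show a few large residuals from respondents far from the item's difficulty.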
Next, content validity and item/subject targeting were assessed using an item-person map in which all person scores and item difficulties are converted to logits and placed in the same figure. Item-person targeting was assessed by comparing the logit distributions for subjects and items; a perfect mirror of the two distributions would indicate perfect targeting of the items to the subjects. In addition, the items were placed in order of difficulty on this figure. Content redundancy was assessed by identifying items at the same level on the latent construct.
Finally, reliability was assessed. The item separation index (ie, the extent to which items are sufficiently spread out to define distinct levels of difficulty on the latent construct) was calculated; the reliability associated with it should be greater than 0.8.11 The person separation index, whose associated reliability is analogous to Cronbach alpha, was also assessed.
RESULTS
A total of 159 responses to the questionnaire were received. It was assumed that all APhA ESAS members were included in the AACP SAS list, resulting in a response rate of 26.1% (159/609). Participant characteristics are listed in Table 1. Sixty-one percent of respondents were female and 78% were non-Hispanic White. Most respondents had a PhD degree (63.6%), were tenured (60.4%), and many (45.5%) were at the professor rank. Additional data can be found in Table 1. Descriptive statistics and gender differences of the survey items are presented elsewhere.
Table 1.
Demographic Characteristics of Respondents to a Gender Equity Questionnaire

When assessing the dimensionality, we noted that one item, “How do you believe the gender of your department chair influences the collegiality of your department?” was negatively correlated with other items. The item was reverse coded and then all items met parameters for unidimensionality as described in the Methods.
The original 5-point scale did not meet the criteria for rating scale functionality. As suggested by Linacre, adjacent categories were merged to create a 3-point scale where the options “Men get much more” and “Men get a little more” were merged into one category, “Men get more.” The same was done on the other side of the scale, where a “Women get more” category was created. Once this was accomplished, the scale met all parameters for appropriate functionality (see Figure 1).
Figure 1.
Category probability curves showing category response peaks for the merged 3-point response scale.
Most subjects fit the model within the suggested parameters. While a few responses had more error than the model expected (ie, INFIT MNSQ 2.39, OUTFIT MNSQ 3.85), the authors deemed that including these responses would not bias the results because there were so few. In addition, most items fit the Rasch model parameters. However, the item assessing classroom civility had a larger OUTFIT MNSQ than expected (ie, 1.59). Because this item performed well against the INFIT criteria (ie, for respondents near the mean) and because its content was important to the construct, the item was retained.
Figure 2 depicts the hierarchical ordering of item endorsement and what can be expected from each person/item interaction (the expected score map). The respondents’ perception of gender inequity and item difficulty are shown on the Y axis (using a logit scale). Gender inequity is shown in ascending order (ie, least to most), and item difficulty is shown in descending order (ie, most difficult to agree with to least difficult to agree with). Each respondent is represented by an X. The easiest item to agree with, “there is equity for …” is at the bottom, and the most difficult item is at the top. The mean item difficulty, standard deviation, and two standard deviations are noted in Figure 2, and distribution curves are presented. There is some content duplication where several items fall on the same logit level, such as Classroom Civility, Student Respect, and Teaching Evaluation, and there are content gaps above the most difficult question (ie, Startup Package).
Figure 2.
Item difficulty and person ability on the logit scale (n=159).
The item separation index (ie, the extent to which items are sufficiently spread out to define distinct levels of difficulty) for the 21-item questionnaire was 4.28, which translates to an item reliability of 0.95. The person separation index was 2.75, which translates to a person reliability of 0.88 (analogous to Cronbach alpha).
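The conversion between a separation index and its reliability follows the standard Rasch relationship R = G²/(1 + G²), where G is the separation index; a minimal check against the values reported here:

```python
def separation_to_reliability(g):
    """Convert a Rasch separation index G to its reliability,
    R = G**2 / (1 + G**2)."""
    return g ** 2 / (1 + g ** 2)

# The reported item separation of 4.28 and person separation of 2.75
# correspond to reliabilities of about 0.95 and 0.88, respectively.
item_rel = separation_to_reliability(4.28)    # ~0.948
person_rel = separation_to_reliability(2.75)  # ~0.883
```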
DISCUSSION
The Rasch evaluation of the gender equity questionnaire, which focused on the social and administrative sciences, has several implications for the future use of the instrument. The results suggest a 3-point scale should be used in lieu of the original 5-point scale. The 3-point scale approximates interval level data; therefore, mathematical manipulations and parametric tests can be used for analysis. In addition, when cleaning the data, the “How do you believe the gender of your department chair influences the collegiality of your department?” question should be reverse coded to ensure the instrument is unidimensional. Because the modified instrument and scale fit the Rasch model, they can be treated as interval level data for future use. Future research on gender equity in other areas of academia, such as the basic and clinical sciences, is essential to maintain attention on this critical issue. Access to psychometrically sound survey instruments for evaluating this topic will allow colleagues in other academic disciplines to conduct similar research and compare trends over time. However, future research in which the modified instrument is used should include that study’s own reliability and validity evidence.
The gender equity items are placed in hierarchical order in Figure 2. This means that it was easiest for participants to agree there is gender equity in the items at the bottom of the figure (ie, grant expectations and research expectations) and hardest for them to agree that there is gender equity for the items at the top (ie, start-up package, recruitment courting). Understanding this hierarchical order can be important for informing future research and interventions to improve gender equity perceptions. An organization can assess its faculty members’ perceptions, determine where they lie on the latent construct (perceptions of gender equity), and then use this information to select the items that need to be addressed. For example, if an organization finds that most of its faculty perceive gender inequity in service workload, it can develop policies and procedures to address that and then move on to the next most difficult item in the hierarchy. There was evidence of a content gap at the top of the scale (ie, there are respondents above the most difficult item and no items at the level of these respondents, as can be seen in Figure 2). Thus, additional, more difficult items should be considered in the future; perhaps such items could assess social acceptance or inclusion. Lastly, the person reliability of 0.88 (analogous to Cronbach alpha) indicates excellent reliability.
Future researchers who use the instrument might benefit from removing some of the duplication as indicated by the person item map (Figure 2) as a shorter instrument can often result in a higher response rate. Groups using this instrument might consider removing some of the questions where several fall at the same difficulty level. For example, instead of questions on classroom civility, student respect, and teaching evaluations, the analysis suggests that the same information can be collected using only one of these questions rather than all three. However, it might be important to retain some other items that are at the same difficulty level if they represent different concepts (ie, leadership opportunities, resources for research, and salary).
This project does have limitations. For example, only ESAS/SAS faculty were surveyed. Future projects should assess gender equity perceptions among other groups of pharmacy faculty, such as those in the clinical and basic science sections. While this study captured binary and non-binary gender identities, only binary gender was used in the analysis due to small cell sizes. Future studies may consider methods for oversampling non-binary participants. While the response rate was suboptimal, Rasch polytomous measures have been found to be stable with samples of 50.14 Thus, for the purposes of this study, the sample size obtained had the power to address the stated objectives.
CONCLUSION
This analysis provides reliability and validity evidence supporting use of the gender equity questionnaire in the ESAS/SAS population, provided recommended edits such as the 3-point scale are adopted. Future research on gender equity can benefit from use of a psychometrically sound questionnaire for data collection, as well as examination of gender equity perceptions among pharmacy faculty in the basic and clinical sciences.
ACKNOWLEDGMENTS
The authors thank Terri S. Moore, PhD, Senior Director, AACP Academic Services, and Margaret H. Tomecki, PharmD, Senior Director, APhA Practice and Science Academies, for their support and assistance with this project. The opinions expressed are those of the authors and do not represent those of the Department of Veterans Affairs or the US Government.
REFERENCES
- 1. Bissell BD, Johnston JP, Smith RR, et al. Gender inequity and sexual harassment in the pharmacy profession: evidence and call to action. Am J Health Syst Pharm. 2021;78(22):2059-2076. doi: 10.1093/ajhp/zxab275
- 2. American Association of Colleges of Pharmacy. 2010-11 Profile of Pharmacy Faculty. Accessed April 15, 2023. https://www.aacp.org/research/institutional-research/pharmacy-faculty-demographics-and-salaries
- 3. American Association of Colleges of Pharmacy. Profile of Pharmacy Students Fall 2020. Accessed April 15, 2023. https://www.aacp.org/sites/default/files/2021-05/pps-2020-introduction.pdf
- 4. American Association of Colleges of Pharmacy. National Pharmacist Workforce Study 2019. Published January 10, 2020. Accessed April 15, 2023. https://www.aacp.org/sites/default/files/2020-03/2019_NPWS_Final_Report.pdf
- 5. Draugalis JR, Plaza CM, Taylor DA, Meyer SM. The status of women in US academic pharmacy. Am J Pharm Educ. 2014;78(10):178. doi: 10.5688/ajpe7810178
- 6. Guarino CM, Borden VMH. Faculty service loads and gender: are women taking care of the academic family? Res High Educ. 2017;58(6):672-694. doi: 10.1007/s11162-017-9454-2
- 7. Sarna KV, Griffin T, Tarlov E, Gerber BS, Gabay MP, Suda KJ. Trends in gender composition on editorial boards in leading medicine, nursing, and pharmacy journals. J Am Pharm Assoc. 2020;60(4):565-570. doi: 10.1016/j.japh.2019.12.018
- 8. Moore JS, Ritter G. Final Report of the Gender Equity Task Force, The University of Texas at Austin. Published 2008. Accessed April 15, 2023. https://utexas.app.box.com/s/g55h0isyw1tj78qauhd2up9nuku3ghqd
- 9. Smith EV, Wakely MB, deKruif REL, Swartz CW. Optimizing rating scales for self-efficacy (and other) research. Educ Psychol Meas. 2003;63(3):369-391. doi: 10.1177/0013164403063003002
- 10. Winsteps® Rasch Measurement [Computer Software]. Version 4.8.2. Beaverton, OR: Winsteps; 2021.
- 11. Linacre JM. Winsteps® Rasch Measurement Computer Program User's Guide. Published 2021. Accessed January 15, 2022. Winsteps.com/manuals.htm
- 12. Linacre JM. Investigating rating scale category utility. J Outcome Meas. 1999;3(2):103-122. Accessed April 15, 2023. https://pubmed.ncbi.nlm.nih.gov/10204322/
- 13. Smith RM. Polytomous mean-square fit statistics. Rasch Measurement Transactions. 1996;10(3):516-517. Accessed April 15, 2023. https://www.rasch.org/rmt/rmt103a.htm
- 14. Linacre JM. Sample size and item calibration stability. Rasch Measurement Transactions. 1994;7(4):328. Accessed April 15, 2023. https://www.rasch.org/rmt/rmt74m.htm


