Abstract
This paper presents results of a consumer health vocabulary study of text appearing on Web-based bulletin boards. Consumers used obscenities and euphemisms to refer to certain body parts, functions, and behaviors. The female genitalia are the body region most often described with an obscenity (29% of all instances); male genitalia, in contrast, were rendered as obscene only 3% of the time. Consumers responding on the bulletin boards appear genuinely to prefer euphemistic slang and baby talk (62%) over obscenities (24%) when referring to the buttocks. From an anatomical perspective, this large dataset reveals a consumer health vocabulary of euphemisms and outright obscenities coexisting with professional medical terminology. The evident preference for euphemisms and slang for some anatomical parts has important implications for the design of health information controlled vocabularies and translation systems, faced with a lay language more informal than expected.
Introduction
Between January and March 2007, two intellectual freedom controversies erupted over medical terminology and its reception and interpretation by the general public. The Higher Power of Lucky, a young adult fiction book about a 10-year-old girl, used the word scrotum.1 Three high school juniors in Cross River, NY, used the word vagina in reading from Eve Ensler’s The Vagina Monologues.2 In each case, the controversy centered not on a vulgarism, but on the professionally “correct” term for a normal feature of human anatomy. The theologian C.S. Lewis expressed the problem well: when one deals explicitly with sex, he wrote, one is “forced to choose between the language of the nursery, the gutter, and the anatomy class.”3
This paper presents results of a consumer health vocabulary study of electronic communication about health problems. It is the first in a series of reports on data extracted from virtual conversations on health information-focused, Web-based bulletin boards. Among the recurring features of this data was consumers’ use of obscenity to describe certain body parts, functions, and behaviors. The evident preference for some terms in the online consumer population studied has important implications for the design of health information vocabularies intended to mediate between formal and informal language.
Literature Review
The confusion of obscenity with anatomy is a very old problem. Words for the human buttocks, lower abdomen and genitals have for centuries engendered not only vulgarisms but also euphemisms; those regions of the body considered private and sensitive have traditionally required coded means of description. For example, the phrase hayward of corps dale (literally: “guardian of the body’s dale”), dated ca. 1425, is an extremely veiled reference to the clitoris.4 More than 500 years later, little has changed; sexuality researchers Braun and Kitzinger identified more than 1200 synonyms for vagina used by women speakers. It has been argued that these synonyms arise, in part, because of a reaction against clinical terminology: anatomical terms are considered “clinical and impersonal”.5 Braun and Kitzinger see this reaction as producing “a lexical gap in female genital terms .. divided between the anatomical, the coy and the euphemistic”.6
This diversity in expression has been documented as a problem for Internet searching. Richardson, Resnick, Hansen, Derry and Rideout7 investigated potential filter blocking of legitimate health information. Of 3987 unique URLs generated through submission of specific sexual health-related keywords, 2467 (62%) contained health information, not pornography; 516 (13%) really were pornographic; and 1004 (25%) had neither kind of content. This suggested either a need for no filters, or a need for more intelligent filters. Two years later, Su, Li, Ma and Li tested Google’s™ keyword filtering ability with a corpus of 4000 pieces of pornographic text in Chinese. Examining the first 8000 results, these authors found that only 22% were pornographic pages, even though the queries themselves were pornographic; 30% of the results were “scientific texts about sex or medicine”.8 Clearly, the conflation of obscenity with anatomy cannot be solved through use of a search engine.
In the literature of healthcare, there is some evidence of a patient preference for slang when discussing certain kinds of problems; this may represent a potential advance in communication for physicians. Williams and Ogden published a study of 60 patients with “sexual or excretory” issues. One half received clinical explanations in medical terminology, while the other half heard “slang and euphemisms”. The latter group reported higher satisfaction with their quality of care.9 This finding is consistent with the work of linguists Gibbs and Nagaoka, who wrote in 1985 that “the use of slang metaphors permits speakers to not only convey specific propositional information, but also some indication of their attitude towards this information”.10 The present study examines the use of a particular kind of slang (obscenity and euphemism) in the verbalization of consumer health.
Methods
Data Sources
The data reported here was gathered as part of the Ten Thousand Questions Project (TTQ). Ten thousand posts, ranging from sentence fragments to multiple pages in length, were selected from 36 health-focused Web-based bulletin boards between November 2003 and December 2004. Table 1 shows two examples of these posts, analyzed as “text units” (all spellings and capitalizations original). These bulletin boards are all sponsored by English-language online communities. They were identified through seven major search engines’ subject directories, classed under Health Advice and synonymous categories. These subject directories revealed 374 unique boards, to which four selection criteria were then applied. To qualify for data collection in TTQ, a board had to be English-language, no-cost, and functional in October 2003; finally, postings could not be edited by the site’s moderators or sponsors. While this last criterion eliminated otherwise excellent sources of consumer health information such as Columbia’s Go Ask Alice (www.goaskalice.columbia.edu), it ensured that the vocabulary and questions captured represented real and raw consumer communication. Posts in which the writer self-identified as a healthcare professional, or a student in a healthcare field, were removed from analysis.
Table 1.
Examples of TTQ text units.
TTQ Text Units |
---|
Hello, I am writing a novel in which one character tries to kill himself using a hypo filled with heroine. I need to know how much heroine it would take to cause death in an average-sized male, what would likely be in the syringe (typically... 100% heroine?), and what the fluid would look like. Thank you in advance for your service. |
IS leukimia contageous? Do Many people die from it ? |
The focus of these 36 boards ranges from general parenting issues (childhood illness, learning disabilities, etc.) through specific pediatric and adult diagnoses, to overall fitness and wellness, including sexual health.
The 10,000 text units collected were indexed by people employed in professional librarian positions (9 MLS-holders and 1 MD) with experience in indexing and/or consumer health. These indexers coded 1000 text units each over the duration of the study. They were asked to annotate for three types of content in each post using a previously established methodology11: Features (naturally occurring part of a person, such as a nose or a lung); Findings (something a physician might discover or diagnose in the course of an examination, such as a rash or a fever); and Therapies/Procedures (something done or recommended to be done for the treatment of a health concern, such as acupuncture or chemotherapy).
Text annotated as fitting one of these three categories was then processed by computer scientists at the Lister Hill National Center for Biomedical Communications, using MetaMap.12 Results for each category appear in Table 2.
Table 2.
Breakdown of TTQ text units according to semantic category.
Text Unit Category | Total Phrases |
---|---|
Features | 1931 |
Findings | 13302 |
Therapies/Procedures | 9716 |
TOTAL | 24,949 |
This data is currently being analyzed for consonance with the Unified Medical Language System (UMLS). The present study reports on analysis of TTQ terms and phrases categorized as Features, Findings or Therapies/Procedures that specifically referred to anatomical parts.
When words offend, they are seldom formally collected. As Hughes reminds us, “Hesitancy over accepting obscenity as a proper topic for public display or serious discussion” has made it a difficult subject for academic research.13 Few publicly available compilations of dirty words exist; the most comprehensive available, Jay’s Cursing in America,14 identifies 10 categories of “curse” words: profanity, blasphemy, taboo, obscenity, vulgarity, slang, epithets, insults, slurs, and scatology.
The author and a second coder with an MLS reviewed all 24,949 TTQ phrases, checking each phrase in the context of its complete text unit. They reached initial inter-rater agreement of 86%, 100% and 100% on the nature of terms and phrases found in the Features, Findings, and Therapies/Procedures categories, respectively. The published collection by Jay14 was used to validate final coding of a word as “taboo” and to resolve disagreements. Table 3 presents the final list of terms and phrases coded as either obscene or as a euphemism, ordered according to their MeSH (Medical Subject Headings) classification in the Anatomy tree.
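The text does not say how inter-rater agreement was measured. A minimal sketch, assuming it was simple percent agreement (the share of phrases given the same label by both coders), might look like the following. The function name `percent_agreement` and the labels are hypothetical and are not TTQ data.

```python
def percent_agreement(codes_a, codes_b):
    """Share of items on which two coders assigned the same label."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must rate the same items")
    if not codes_a:
        raise ValueError("no items to compare")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Hypothetical labels for five phrases (illustration only, not TTQ data).
coder_1 = ["obscene", "euphemism", "neither", "neither", "obscene"]
coder_2 = ["obscene", "neither", "neither", "neither", "obscene"]

agreement = percent_agreement(coder_1, coder_2)
print(f"{agreement:.0%}")  # 4 of 5 labels match
```

Percent agreement does not correct for agreement expected by chance; a chance-corrected statistic would be a different design choice.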
Table 3.
Obscene and euphemistic terms and phrases in TTQ data.
MeSH category | Dictionary definition | Instances of standard term | Obscenities/Euphemisms [instances] |
---|---|---|---|
Body Regions (Female) | Woman’s breast15 | 177 Breast/s | Tit, tits, titties, teat, teats [17]; Boobies, boobs [3] |
Body Regions (Male) | Male breast* | 0 | Man-boobies [1]; Man-tits [1] |
Body Regions (Unspecified) | Middle finger* | 6 Middle finger/s | “Flip the bird” finger [1] |
Digestive System | Anus15 | 85 | Asshole [1]; Back there [1] |
Genitalia (Female) | Labia* | 19 | Nether lips [1] |
Genitalia (Female) | Virginity16 | 4 | Cherry [1] |
Genitalia (Female) | Vulva17 | 8 | Pussy [1] |
Genitalia (Female) | Clitoris16 | 86 | Clit [31]; Clit hood [1] |
Genitalia (Male) | Erection of the penis16 | 64 | Manhood [2]; Boner [1] |
Genitalia (Male) | Penis15 | 294 Penis/es | Cock [7]; Prick [1]; Dick/s [2] |
Genitalia (Male) | Scrotum* | 18 | Ball sack [1] |
Genitalia (Male) | Testicles18 | 34 Testicle/s | Balls [1] |
Genitalia (Male) | Urethra* | 12 Urethra | Pee hole [1] |
Genitalia (Unspecified) | Genitals19 | 62 Genital/s | Privates [1]; Private area [2]; part [1] |
Lower Extremity | Buttocks15; Gluteal cleft20 | 10 Buttock/s | Butt [29]; Ass [16]; Bum [8]; Bottom [4]; Buns [1]; Fanny [1]; Rear end [1]; Crack [1] |
Note: * No dictionary definition found; coder-defined.
For each term or phrase identified as obscene or euphemistic, the coders sought a dictionary definition. When such a definition could not be found (22% of terms), coders reached agreement on a non-obscene equivalent for the same concept. The frequency of occurrence of each obscenity or euphemism in the TTQ data was then compared with the frequency of its plain-language dictionary counterpart. For example, Chambers 21st Century Dictionary defines pussy (which appears once) as “vulva” (which appears 8 times) (see Table 3).
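The frequency comparison described above can be sketched as a whole-word count over text units. This is an illustration under stated assumptions, not the procedure used in TTQ: the posts, the `pairs` mapping, and the helper `count_terms` are all hypothetical.

```python
import re
from collections import Counter


def count_terms(text_units, terms):
    """Count whole-word, case-insensitive occurrences of each term."""
    counts = Counter()
    for term in terms:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        counts[term] = sum(len(pattern.findall(unit)) for unit in text_units)
    return counts


# Hypothetical posts (illustration only, not TTQ text units).
posts = [
    "My bum hurts after sitting all day.",
    "Is pain in the buttocks normal after a fall? My bum is bruised too.",
    "The rash is on her bottom.",
]

# Hypothetical mapping from a dictionary term to informal equivalents.
pairs = {"buttocks": ["bum", "bottom"]}

for standard, informal in pairs.items():
    counts = count_terms(posts, [standard] + informal)
    informal_total = sum(counts[term] for term in informal)
    print(standard, counts[standard], "vs informal", informal_total)
```

A string count of this kind cannot tell an anatomical use of a word such as bottom from its other senses, and it does not merge singular and plural forms. In the study, each phrase was checked by human coders against its full text unit.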
Results
One hundred forty-one instances were found of 36 obscenities and euphemisms. They are classified as shown in Table 4. The phrases and terms themselves appear in Table 3. The female genitalia are the body region most often described with an obscenity (29% of all instances); in contrast, male genitalia are rendered as obscene only 3% of the time.
Table 4.
Classification and frequency of obscenities and euphemisms in TTQ data.
MeSH Category | Obscenities: instances [% of all instances] | Euphemisms: instances [% of all instances] |
---|---|---|
Body region: female | 20 [10%] | 0 |
Body region: male | 2 [100%] | 0 |
Body region: unspecified | 0 | 1 [14%] |
Digestive system | 1 [1%] | 1 [1%] |
Genitalia: female | 34 [29%] | 1 [1%] |
Genitalia: male | 14 [3%] | 2 [0.5%] |
Genitalia: unspecified | 0 | 4 [6%] |
Lower extremity | 17 [24%] | 44 [62%] |
Total | 88 | 53 |
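As a worked check on the Lower extremity row, the sketch below computes the share of obscene and euphemistic references among all references to the buttocks, using the counts in Tables 3 and 4. The denominator (standard term plus obscenities plus euphemisms) is an assumption; it reproduces the figures reported for this row, but the text does not spell out the denominator used for every row. The function name `shares` is hypothetical.

```python
def shares(standard, obscene, euphemistic):
    """Return (obscene %, euphemistic %) of all references, rounded."""
    total = standard + obscene + euphemistic
    if total == 0:
        raise ValueError("no references to this body region")
    return round(100 * obscene / total), round(100 * euphemistic / total)


# Counts for the buttocks, taken from Tables 3 and 4:
# 10 standard-term instances, 17 obscenities, 44 euphemisms.
buttocks = shares(standard=10, obscene=17, euphemistic=44)
print(buttocks)  # (24, 62)
```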
Discussion
From an anatomical perspective, this large dataset reveals a consumer health vocabulary of euphemisms and outright obscenities coexisting with professional medical terminology.
Braun and Kitzinger found that questionnaire respondents of both genders (N=156 F; 125 M) were more likely to generate a “standard slang” reference to female than to male genitalia. The ten most frequent standard slang terms submitted to these authors included cunt (82%), fanny (76%), and pussy (60%), of which the latter two are present in the TTQ data. In contrast, Braun and Kitzinger admit a paucity of slang terms available to describe the clitoris: “there is, as far as I know, no word to refer to the clitoris in a non-medical way”.6 In the TTQ data, the abbreviation clit (like scrotum, a word rendered obscene for some listeners by its sexual associations) appears frequently (31 instances).
Conversely, while the use of manhood and privates reveals a continuing coyness of consumer speech, the body region most often euphemized is gender-neutral: the buttocks. Consumers responding on the TTQ bulletin boards appear genuinely to prefer slang and baby talk (butt, bum, bottom, buns, fanny, rear end; 62%) over obscenities (ass, crack; 24%), and certainly over the technical term (buttocks or gluteal cleft), when referring to this body part.
Finally, this study reveals coining of words: terms used by two apparently different writers in reference to the male breast, which is never referred to in any other way than Man-tits and Man-boobies; and one instance of a euphemistic digit (“Flip the bird” finger).
Braun and Kitzinger complain about a “curious imprecision” in female genital slang: “Coy or euphemistic terms, such as down there, privates, and crotch ..6 ‘strengthen the view that a woman's genitalia are something mysterious, vague and taboo: 'eclipsed' through the avoidance of naming’”.21 This statement applies not only to female genital terminology but to all terminology representing the potentially sensitive, private and stigmatizing. This is the language that is particularly liable to repression and suppression; it is blocked by Internet filters and human censors alike. As Richardson et al. have noted,7 the confusion of obscene and pornographic information-seeking with legitimate health information-seeking has negative effects on the availability of that health information. Controlled health vocabularies for information retrieval can enhance health information delivery through incorporation of consumer-friendly terms that include not only the informal, but the coy and the obscene. To make this enhancement happen, however, vocabulary developers must, like dictionary makers, make a choice between description (capturing what exists) and prescription (capturing the words that “should” be used).
Conclusion
Consumer health vocabulary developers must examine language that consumers actually use to describe health concepts. Web-based bulletin boards are rich resources. Without the space and time restrictions imposed by a search box, without requiring the high level of information literacy necessary for effective online searching, bulletin board posts can provide a consumer’s comprehensive self-representation of health information needs and health status — and thus of their information requirements and the vocabulary used to express them.
Consumer health informatics researchers who are interested in enhancing consumer access to health information must, like sexuality researchers, begin to approach the taboo and the un-nameable. As consumer health vocabulary receives increased attention in informatics,22 the breadth of the domain in which vocabulary is investigated has already expanded from the clinic to the street. We now may need to move from the street to the nursery and the gutter.
Acknowledgments
The TTQ Project was funded by the Donald A.B. Lindberg Research Fellowship awarded by the Medical Library Association in 2003. The author also thanks Guy Divita and Allen Browne at the Lister Hill National Center for Biomedical Communications.
References
- 1. Bosman J. With one word, children’s book sets off uproar. NY Times. 2007 Feb 18:1.
- 2. O’Connor A. ‘Monologues’ spurs dialogue on taste and speech. NY Times. 2007 Mar 8:B1.
- 3. Lewis CS. Studies in words. Cambridge: Cambridge University Press; 1960.
- 4. Norri J. Names of body parts in English, 1400–1550. Helsinki: Academia Scientiarum Fennica; 1998. p. 350.
- 5. Sanders JS, Robinson WL. Talking and not talking about sex: Male and female vocabularies. J Commun. 1979;29:29.
- 6. Braun V, Kitzinger C. ‘Snatch,’ ‘hole,’ or ‘honey-pot’? Semantic categories and the problem of nonspecificity in female genital slang. J Sex Research. 2001;38:146–158.
- 7. Richardson CR, Resnick PJ, Hansen DL, Derry HA, Rideout VJ. Does pornography-blocking software block access to health information on the Internet? JAMA. 2002;288(22):2887–2894. doi: 10.1001/jama.288.22.2887.
- 8. Su G-Y, Li J-H, Ma Y-H, Li S-H. Improving the precision of the keyword-matching pornographic text filtering method using a hybrid model. J Zhejiang Univ Sci. 2004;5(9):1106–1113. doi: 10.1631/jzus.2004.1106.
- 9. Williams N, Ogden J. The impact of matching the patient's vocabulary: a randomized control trial. Fam Pract. 2004;21:632–637. doi: 10.1093/fampra/cmh610.
- 10. Gibbs RW, Nagaoka A. Getting the hang of American slang: Studies on understanding and remembering slang metaphors. Lang Spch. 1985;28:178.
- 11. Smith CA, Stavri PZ, Chapman WW. In their own words? A terminological analysis of e-mail to a cancer information service. Proceedings of AMIA Annual Fall Symposium. 2002:697–701.
- 12. Research Programs [homepage on the Internet]. Bethesda, MD: NLM [updated 2007 Feb 16; cited 2007 Mar 15]. Available from: http://skr.nlm.nih.gov/papers/index.shtml#MetaMap
- 13. Hughes G. Swearing: A social history of foul language, oaths and profanity in English. London: Penguin; 1991. p. vii.
- 14. Jay T. Cursing in America: A psycholinguistic study of dirty language in the courts, in the movies, in the schoolyards and on the streets. Amsterdam: John Benjamins; 1993.
- 15. The American Heritage® Dictionary of the English Language. http://www.xreferplus.com/entry/4140598
- 16. OED [Oxford English Dictionary] Online.
- 17. Chambers 21st Century Dictionary. http://www.xreferplus.com/entry/1223122
- 18. Rawson's Wicked Words. http://www.xreferplus.com/entry/1055332
- 19. Bloomsbury Dictionary of Contemporary Slang. http://www.xreferplus.com/entry/341435
- 20. Stedman’s Medical Dictionary, online edition. http://www.drugs.com
- 21. Mills S. Feminist stylistics. London: Routledge; 1995. p. 104.
- 22. Zeng QT, Tse T. Exploring and developing consumer health vocabularies. J Am Med Inform Assoc. 2006 Jan-Feb;13(1):24–9. doi: 10.1197/jamia.M1761.