AMIA Annual Symposium Proceedings. 2005;2005:540–544.

Analyzing the Structure and Content of Public Health Messages

Frances P Morrison 1, Rita Kukafka 1,2, Stephen B Johnson 1
PMCID: PMC1560424  PMID: 16779098

Abstract

Background

Health messages are crucial to the field of public health in effecting behavior change, but little research is available to assist writers in composing the overall structure of a message. In order to develop software to assist individuals in constructing effective messages, the structure of existing health messages must be understood, and an appropriate method for analyzing health message structure developed.

Methods

72 messages from expert sources were used for development of the method, which was then tested for reproducibility using ten randomly selected health messages. Four raters analyzed the messages and inter-coder agreement was calculated.

Results

A method for analyzing the structure of the messages was developed using sublanguage analysis and discourse analysis. Overall kappa between four coders was 0.69.

Conclusion

A novel framework for characterizing health message structure and a method for analyzing messages appear to be reproducible and potentially useful for creating an authoring tool.

INTRODUCTION

Evidence supporting the impact of preventable factors on disease has been building for decades. The most common causes of death and many causes of disability in the U.S. are linked to lifestyle and behavior.1 Health messages have become a major tool in assisting patients to change behavior. The public health community has recognized the potential for health communication to modify beliefs and behavior and has been working for many years to develop communication campaigns to change health behavior. This recognition is reflected in Healthy People 2010, which includes a chapter on health communication whose objectives highlight the integral role communication plays in modifying behavioral risks and in meeting the overall health goals of the nation.2

The term “health message” can apply to persuasive messages that are designed to change any behavior within the realm of health care. Messages created in an attempt to change behavior can potentially be used not only for changing patients’ health behaviors, such as smoking or avoiding physical activity, but also potentially for changing the behavior of caregivers, such as adherence to preventive care reminders or reporting of diseases to a public health department.

Researchers have investigated the potential for entirely automated creation of messages, but the most likely solution is a system that allows a human to create messages in a structured manner, described as text planning.3 This can be achieved by creating software to guide message structure. However, in order to create such a tool, the optimal structure for a message must be determined. This requires a framework for formal representation of health messages and a reproducible method to describe the content and structure of these messages. Both can be approached by understanding current methods for developing and analyzing messages. We can then determine whether existing methods can be applied, and if not, develop our own method.

Currently, health messages tend to be developed by experts in the health issue addressed in the message. These messages typically have been created using models and theories available from behavior change research. Constructs used in these models describe recipient characteristics that may affect the recipient’s barriers to behavior change. Using behavioral constructs for creating messages can be challenging because doing so requires a high degree of knowledge of the message recipient, obtained by assessing impermanent characteristics such as attitudes, motivation, and perceived self-efficacy (people's beliefs about their capabilities to produce effects). Although these models have been useful in providing general guidance for creating health messages, they offer a broad conceptual basis rather than specific guidance in message construction. In practice, overall message structure typically has been left to the individual author.

Message writers in public health also have relied to a great extent on guidance from research on fear appeals, which are communications intended to evoke fear in the recipient. Four components of fear appeal have been elucidated: severe threat, vulnerability of the target to the threat, personal efficacy (target’s perception that he or she has the ability to follow the message recommendations), and response efficacy (ability of the message recommendation to eliminate or reduce the threat depicted in the message).4 Behavior change has been associated with strong fear appeal combined with high efficacy (or capacity to produce a desired effect).5 Some authors recommend a “problem-solution pattern”4 with threat, vulnerability, and severity messages followed by recommended actions. However, this guidance is fairly broad when considering the complexity of constructing an entire health message.

Despite the relative lack of guidance from the field of public health in terms of message structure, there is evidence from other fields that structure can affect the impact of a message. Research in marketing and communication demonstrates a few principles, including the techniques of refuting opposing arguments, placing requests that are sequentially larger or smaller, and including explicit conclusions and recommendations.6 However, these basic principles also fall short of guiding the construction of an entire message. Although research has demonstrated that, in general, a structured message is more effective than an unstructured message,7 the specific structure of a health message has yet to be elucidated.

One area that has the potential to offer guidance in the analysis of persuasive messages is computer science, which is the source of Rhetorical Structure Theory (RST).8 RST was originally developed to create a framework that would enable natural language generation, and is an effort to explain how each text span within a message fills a role that contributes to an overall rhetorical argument. The Toulmin Method9 is an additional technique for analyzing argument text. It provides a way to dissect a text and describe each portion of text in terms of its function.

Within the field of linguistics, discourse analysis10 has been used to gain information from patterns of words in a specific document. It illustrates how repeating elements (or morphemes) can form a structure. The structure consists of morphemes appearing in a similar context in a variety of sentences and forming a thread through the text that establishes a formal conclusion. The key benefit of this method is that it is specifically designed to be operational and reproducible.

One issue with discourse analysis is that it is intended for analysis of one text, and information gained from that analysis does not necessarily apply to additional texts. A sublanguage analysis can be developed to determine word class patterns within a large body of texts. This technique has been used to demonstrate that texts created by a community of speakers with common interests or profession will have elements in common, which can be significantly different from texts originating in other domains.11

Even if a single existing theory or method were available to establish the ideal message structure, research has demonstrated that authors can have limited success in translating theoretical constructs into actual messages.12,13 The goal of creating an automated authoring tool is to use real-time guidance to assist authors in creating messages. This has the potential benefit of reducing the variability in quality of messages being created.

METHODS

Developing a method to describe message structure

Research indicates that a knowledge acquisition approach may be limited in the arena of message writing because there is a disconnect between the intended structure planned by an author and the actual structure of a message.13 Therefore, we took the approach of learning directly from messages created by experts. Messages were selected that were readily available and from a source that was presumed to be expert. These included health messages from reputable organizations such as the Centers for Disease Control and Prevention (CDC) and the National Heart, Lung, and Blood Institute. These 72 messages covered a wide variety of topics and were targeted at a number of different audiences. In order to determine a framework for analyzing and characterizing health messages, we attempted to apply known text analysis techniques to the collected health messages. From exploring and applying these techniques and finding patterns that existed through the entire corpus, the final method was derived.

Evaluating the method

As a first stage in determining reproducibility of the method, an evaluation of inter-rater agreement in analyzing messages was performed. Ten test messages were randomly selected from a corpus of messages provided by an outside expert in health communication. After undergoing six hours of training in the method, four individuals from the fields of linguistics, cognitive science, and public health analyzed the messages using the method that was developed. Inter-rater agreement was assessed at the sentence level. In order to accommodate multiple categories and multiple raters, a method developed by Landis and Koch14 was used to calculate inter-coder agreement.
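As an illustration of agreement among several raters over several categories, the sketch below computes Fleiss' kappa in Python. This is an assumption about the kind of statistic involved: the text names a Landis and Koch method and cites Fleiss (reference 14), but does not give the exact formula, and the function name and toy ratings here are hypothetical.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a fixed number of raters per item.

    ratings: list of items (e.g. sentences); each item is a list holding
             one category label per rater.
    categories: list of all category labels that raters may assign.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    # counts[i][c] = number of raters who assigned item i to category c
    counts = [Counter(item) for item in ratings]

    # Observed agreement: mean over items of the proportion of agreeing rater pairs
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions
    proportions = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    expected = sum(p ** 2 for p in proportions)

    if expected == 1.0:
        return 1.0  # only one category was ever used, so agreement is trivial
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: three sentences, four coders each
toy = [
    ["RT", "RT", "RT", "RT"],
    ["AB", "AB", "AB", "AT"],
    ["RA", "RA", "RA", "RA"],
]
print(round(fleiss_kappa(toy, ["RT", "AB", "AT", "RA"]), 3))
```

The toy ratings are invented for demonstration and are not data from the study.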

RESULTS

Message analysis framework and method

After collecting a wide variety of health messages, we attempted to apply the message analysis methods available to us. Although they appeared useful for general text, RST and the Toulmin method did not characterize the content of health messages well, and there was a high level of disagreement on the overall structure of each message when using these two methods. Instead, we applied sublanguage analysis to health messages, identifying patterns of words that are common through multiple texts. The analysis revealed several concepts as essential elements in health messages that existed throughout the entire corpus:

  • a message recipient

  • threats to health

  • actions to be performed to reduce the threat

  • benefits achieved from performing the actions

This is demonstrated by the following message, a public service announcement from the CDC:15

Most people don’t think about colorectal cancer. But it’s the second leading cancer killer in the U.S. This year, more than 135,000 men and women will learn they have colorectal cancer. Nearly 57,000 will die of it. But regular screening tests could save thousands of lives... including yours. So talk to your doctor and Screen for Life.

In this message, several elements can be identified as described in Table 1.

Table 1. Elementary concepts in health messages

Elementary concept | Example in message
-------------------|-----------------------------------------------
Recipient (R)      | people, men and women
Threat (T)         | cancer, colorectal cancer
Action (A)         | regular screening tests, talk to your doctor
Benefit (B)        | lives, life

Further, it became evident that all sentences could be reduced to one or more patterns of these core concepts. Patterns within the example message can be identified as seen in Table 2. The simplest example is the sentence, “But regular screening tests could save thousands of lives.” This links the concept of an action taken with a potential benefit and can be signified with AB (Action produces Benefit). In the sentence, “This year, more than 135,000 men and women will learn they have colorectal cancer,” the term “men and women” can be used to represent the message recipient in the context of the message. Therefore the sentence can be assigned two elements: Recipient (for “men and women”) and Threat (for “colorectal cancer”). This sentence would be assigned RT (Recipient is exposed to Threat). Other patterns detected in the health messages in our corpus include: AT (Action reduces Threat), TB (Threat precludes Benefit), and RB (Recipient desires Benefit). Virtually all sentences in the health messages we collected can be broken down into these essential patterns by identifying key words, assigning each one to a category, and determining their relationship within a sentence.

Table 2. Examples of elementary relationships in colorectal cancer health message

Concept pair | Relationship                   | Example in message
-------------|--------------------------------|----------------------------------------------
RT           | Recipient is exposed to Threat | men and women have colorectal cancer
AB           | Action produces Benefit        | screening tests could save thousands of lives
RA           | Recipient performs Action      | talk to your doctor
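The keyword-to-category step described above can be sketched in Python. The lexicon holds only the example terms from Tables 1 and 2; the function names, the whole-phrase matching, and the canonical R, A, T, B ordering used to build pair codes are assumptions made for illustration. In the study, the relationship within each sentence was determined by trained human raters, so this sketch only proposes candidate pairs.

```python
import re
from itertools import combinations

# Example terms taken from Tables 1 and 2; a real lexicon would be far larger.
LEXICON = {
    "R": ["people", "men and women"],
    "T": ["colorectal cancer", "cancer"],
    "A": ["regular screening tests", "screening tests", "talk to your doctor"],
    "B": ["lives", "life"],
}

# Assumed ordering so that pair codes match the forms named in the text
# (RA, RT, RB, AT, AB, TB).
CANONICAL_ORDER = "RATB"


def find_concepts(sentence, lexicon=LEXICON):
    """Return the set of concept codes whose terms occur in the sentence."""
    found = set()
    for concept, terms in lexicon.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            if re.search(pattern, sentence, flags=re.IGNORECASE):
                found.add(concept)
                break
    return found


def candidate_pairs(sentence, lexicon=LEXICON):
    """List every concept pair present in the sentence, in canonical order."""
    present = find_concepts(sentence, lexicon)
    ordered = [code for code in CANONICAL_ORDER if code in present]
    return [first + second for first, second in combinations(ordered, 2)]


print(candidate_pairs("But regular screening tests could save thousands of lives."))
print(candidate_pairs(
    "This year, more than 135,000 men and women will learn they have colorectal cancer."
))
```

A lexicon lookup of this kind cannot recover an implicit recipient: the imperative "talk to your doctor" is coded RA in Table 2 even though no recipient term appears in it, which is one reason the relationship step is left to a human coder here.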

The method as described so far is useful for categorizing the main assertions or statement types in health messages. However, it does not capture the rational flow of the argument through the message. Additional information about the argument structure can be detected in the connective phrases (e.g. but, so, if, therefore). These connective phrases can be evaluated for their contribution to an overall argument structure, much as in discourse analysis.

The phrasing of connective clauses reveals the rational argument structure that flows through the message. Instead of simply saying: “You should think about colorectal cancer,” the message is phrased: “Most people don’t think about colorectal cancer.” The message begins with a counter argument, which is followed by an argument. Toulmin described this technique as a rebuttal, or an anticipated objection. This strategy is supported by researchers in the field of communication, who have demonstrated that a refute-then-support order of argument is more effective than the reverse.6 The pattern of argument in the example message is demonstrated in Figure 1. The connective phrases are separated from the other text (Figure 1a). Connective phrases in each sentence can be normalized and phrases assigned semantic patterns (Figure 1b), and the overall argument pattern can be extracted (Figure 1c). Once the connecting phrases are normalized, several patterns of the argument structure emerge. Examples of these from a series of messages are seen in Table 3.
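The separation and normalization of connective phrases can be sketched as follows. The normalization table is an assumption (for instance, treating "therefore" as equivalent to "so"), as are the function names; the element pairs are supplied by hand, as they were by the coders, and only sentence-initial connectives are handled.

```python
import re

# Assumed normalization of connectives; the text lists "but, so, if, therefore"
# as examples but does not publish a full mapping.
NORMALIZED = {
    "but": "but",
    "however": "but",
    "so": "so",
    "therefore": "so",
    "if": "if",
    "although": "although",
}


def split_connective(sentence):
    """Split a sentence into (normalized leading connective or None, remainder)."""
    match = re.match(r"\s*([A-Za-z]+)\b[,\s]*(.*)", sentence, flags=re.DOTALL)
    if match and match.group(1).lower() in NORMALIZED:
        return NORMALIZED[match.group(1).lower()], match.group(2)
    return None, sentence.strip()


def argument_pattern(coded_sentences):
    """Join connectives and hand-coded element pairs into one pattern string.

    coded_sentences: list of (sentence, [element pairs]) tuples.
    """
    parts = []
    for sentence, pairs in coded_sentences:
        connective, _ = split_connective(sentence)
        if connective:
            parts.append(connective)
        parts.extend(pairs)
    return " ".join(parts)


coded = [
    ("But regular screening tests could save thousands of lives.", ["AB"]),
    ("So talk to your doctor and Screen for Life.", ["RA"]),
]
print(argument_pattern(coded))
```

The two sentences and their codes come from Table 2; the resulting string is in the same style as the patterns listed in Table 3, though it is not claimed to reproduce Figure 1.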

Figure 1. Example of message structure analysis. Rectangles contain element pairs and ovals contain connective phrases.

Table 3. Examples of argument structure patterns found in health messages.

Not RT but RT so RA AB
Although RT, however AT, so
RA RB RB so RA, RT so RA
RB RT so RA AB

Inter-coder agreement

The four coders identified element pairs within messages at the sentence level; connective phrases were not coded for this evaluation. After analysis, an overall inter-coder agreement of 0.69 was achieved. This demonstrates “substantial agreement” as described by Landis and Koch.16
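For reference, the Landis and Koch descriptive bands for kappa can be written as a small lookup. The band labels follow the 1977 paper cited above; the function name and the treatment of exact boundary values are assumptions.

```python
def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch (1977) descriptive label."""
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper_bound, label in bands:
        if kappa <= upper_bound:
            return label
    raise ValueError("kappa cannot exceed 1")


print(landis_koch_label(0.69))  # substantial
```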

DISCUSSION

The method that has been developed is unique in that it was created using concepts from public health, linguistics, and computer science. In addition to providing a platform for future research, several interesting findings result from this work.

A key observation stemming from the work is that health messages have a distinct set of elements and argument structure patterns in common. In the next phase of the research, it is necessary to determine similarity of structure between different messages. A possible way to group together messages with similar structure is to apply tools developed in the Bioinformatics domain, for example the Needleman-Wunsch or Smith-Waterman string alignment methods that are used for detecting gene homology. Another potentially useful method is cluster analysis. Once messages can be grouped or clustered, overall message structure could be associated with message characteristics such as audience, health topic, or length of message. A tool for authoring messages could leverage this information by providing guidance to an author in creating the most appropriate message for a given situation.
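To make the alignment idea concrete, the sketch below applies Needleman-Wunsch global alignment to sequences of pattern tokens rather than nucleotides. This is only one way such a comparison might be set up: the scoring values (match +1, mismatch -1, gap -1), the tokenization, and the function name are assumptions, and the two sequences are patterns from Table 3 written as token lists.

```python
def needleman_wunsch_score(seq_a, seq_b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of two token sequences."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against nothing costs one gap per token
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            same = seq_a[i - 1] == seq_b[j - 1]
            diagonal = score[i - 1][j - 1] + (match if same else mismatch)
            gap_in_b = score[i - 1][j] + gap  # token of seq_a left unpaired
            gap_in_a = score[i][j - 1] + gap  # token of seq_b left unpaired
            score[i][j] = max(diagonal, gap_in_b, gap_in_a)

    return score[-1][-1]


pattern_1 = ["not", "RT", "but", "RT", "so", "RA", "AB"]
pattern_2 = ["RB", "RT", "so", "RA", "AB"]
print(needleman_wunsch_score(pattern_1, pattern_2))
```

Pairwise scores of this kind could feed a cluster analysis, with higher scores indicating messages whose argument structures are more alike.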

Another important outcome of this research is that it merges the concepts in public health behavior change theory with argument structure derived from linguistic analysis. The patterns found in this analysis can be mapped to established public health constructs. For example, the statement type RT (Recipient is exposed to Threat) can map to both susceptibility and severity constructs, which are crucial to fear appeals. An interesting finding was that, although susceptibility and severity appear to be very distinct concepts, they were often inextricably joined within the messages. An example is the statement: “Nearly 57,000 will die of it.” This statement combines a sense of susceptibility, conveyed by a specific number implying risk to the recipient, with severity, indicated by use of the word “die.” Another statement type, RA (Recipient performs Action), can be mapped to the concept of personal efficacy or self-efficacy, as it can give the recipient a feasible way to accomplish the goal. AB (Action produces Benefit) and AT (Action reduces Threat) can be considered to correspond to response efficacy, giving recipients the confidence that an action will achieve a desired effect.
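The mapping described in this paragraph can be captured as a lookup table, which also allows a simple check of which constructs a coded message touches. The mapping entries come from the paragraph above; the coverage check, the names, and the choice to leave TB and RB unmapped (the text does not assign them) are assumptions for illustration.

```python
# Statement type -> behavior change constructs, as stated in the text.
CONSTRUCTS_BY_STATEMENT = {
    "RT": {"susceptibility", "severity"},
    "RA": {"self-efficacy"},
    "AB": {"response efficacy"},
    "AT": {"response efficacy"},
}

# Constructs a checker might look for in a fear-appeal message (assumed set).
FEAR_APPEAL_CONSTRUCTS = {
    "susceptibility",
    "severity",
    "self-efficacy",
    "response efficacy",
}


def constructs_covered(statement_types):
    """Return every construct addressed by the given statement types."""
    covered = set()
    for statement_type in statement_types:
        covered |= CONSTRUCTS_BY_STATEMENT.get(statement_type, set())
    return covered


def constructs_missing(statement_types):
    """Return the fear-appeal constructs that no statement type addresses."""
    return FEAR_APPEAL_CONSTRUCTS - constructs_covered(statement_types)


print(sorted(constructs_missing(["RT"])))
```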

An interesting finding is that key words did not fall into categories consistently between messages. One example is the word “medicine.” In most messages, this falls into the category of A (Action), as an action performed by a recipient (i.e. recipient takes medicine). However, in a message about falls, the word “medicine” is categorized as a T (Threat) word, with the implication that the side-effects of medicines can increase falls. This is consistent with other research using sublanguage analysis, where the word meaning depends on its context, but it highlights the fact that care must be taken when assigning key words to categories, particularly if this is to be done in an automated way.
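One way an automated coder might respect this context dependence is to let a message topic override a default category. The sketch below is hypothetical: the table contents are limited to the "medicine" example from the text, and the topic key "falls" and the function name are illustrative.

```python
# Default category for a key word, used when no topic-specific rule applies.
DEFAULT_CATEGORY = {"medicine": "A"}

# Topic-specific overrides; "falls" reflects the example given in the text.
TOPIC_OVERRIDES = {"falls": {"medicine": "T"}}


def categorize(word, topic=None):
    """Return the concept code for a word, preferring topic-specific rules."""
    key = word.lower()
    overrides = TOPIC_OVERRIDES.get(topic, {})
    if key in overrides:
        return overrides[key]
    return DEFAULT_CATEGORY.get(key)  # None when the word is not in the lexicon


print(categorize("medicine"))           # A
print(categorize("medicine", "falls"))  # T
```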

Further work may also assist in elucidating the overall pattern in a discourse, specifically, how word choices evolve through a message. Using the previous example, one can observe that the terms “colorectal cancer” and “cancer killer” are used in a similar context, implying some level of equivalence. Another example is that the terms “people” and “men and women” are used at first, eventually to be replaced by “you” (as in: “...including yours”). Understanding how the selection of words evolves through a message can allow the further step of assisting in the construction of a message, guiding the use of contextually equivalent words to allow a recipient to come to a particular conclusion. In this case, the recipient is subtly guided to equate colorectal cancer screening with life.

Demonstrating the elements that exist within a health message as well as the underlying argument structure is crucial for future work in message construction and the development of an authoring tool. An important first step has been achieved in the development of a framework for describing the content and structure of health messages. In addition, a reproducible method for analyzing the messages was demonstrated.

Once the ability exists to represent messages in a standardized way, automated functions such as decision support and natural language processing can be employed. An authoring tool to assist authors in constructing a persuasive message could leverage the knowledge gained from this research. A system could be designed to elicit information such as recipient characteristics and targeted behavior from the user. It could then present a series of argument structures drawn from the knowledge base that could be used to construct an argument. The tool would then provide sentence-level guidance on how to include the essential elements of the argument. In addition, a variety of views of the message could be provided, including diagrams that reveal the underlying message structure or marked-up text with the basic elements of the argument presented with color coding or in tabular form. An additional function would be to provide feedback on stylistic message elements, such as how specific the words used to describe recipient, threat, action, and benefit should be for a particular section of the message.

CONCLUSION

Public health messages are crucial to implementing behavior change. They have clear structures that can be described reliably using a simple technique. This knowledge can be used in future computer-assisted message authoring systems. Use of such a system could reduce variability of message quality, which could facilitate research on the impact of health messages on behavior.

ACKNOWLEDGEMENTS

Dr. Morrison is supported by NLM Training Grant (5T15LM007079-12). We would like to thank Alla Keselman and Laura Slaughter for their contributions to this project.

REFERENCES

  • 1. Mokdad AH, Marks JS, Stroup DF, Gerberding JL. Actual causes of death in the United States, 2000. JAMA. 2004;291(10):1238–1245. doi: 10.1001/jama.291.10.1238.
  • 2. Office of Disease Prevention and Health Promotion. Health Communication. In: Healthy People 2010. Washington, DC: U.S. Dept. of Health and Human Services; 2000. p. 2 v.
  • 3. Cawsey AJ, Webber BL, Jones RB. Natural language generation in health care. JAMIA. 1997;4(6):473–82. doi: 10.1136/jamia.1997.0040473.
  • 4. Hale JL, Dillard JP. Fear appeals in health promotion campaigns. In: Maibach E, Parrott RL, editors. Designing Health Messages. Thousand Oaks, CA: Sage Publications; 1995.
  • 5. Witte K, Allen M. A meta-analysis of fear appeals: implications for effective public health campaigns. Health Education & Behavior. 2000;27(5):591–615. doi: 10.1177/109019810002700506.
  • 6. O'Keefe DJ. Message Factors. In: Persuasion: theory and research. Newbury Park, Calif.: Sage Publications; 1990. p. 270.
  • 7. Burgoon M, Bettinghaus EP. Persuasive Message Strategies. In: Roloff ME, Miller GR, editors. Persuasion: new directions in theory and research. Beverly Hills: Sage Publications; 1980. p. 311.
  • 8. Mann W, Thompson S. Rhetorical Structure Theory: Towards a Functional Theory of Text Organization. Text. 1988;8(3):243–281.
  • 9. Toulmin SE. The uses of argument. Updated ed. Cambridge, U.K.; New York: Cambridge University Press; 2003.
  • 10. Harris ZS, Hiz H. Discourse Analysis. In: Papers on syntax, Synthese language library; v. 14. Dordrecht, Holland: Reidel; 1981. p. vi, 479.
  • 11. Harris ZS. The Form of information in science: analysis of an immunology sublanguage. Dordrecht; Boston: Kluwer Academic Publishers; 1989.
  • 12. Kline K, Mattson M. Breast Self-Examination Pamphlets: A content analysis grounded in fear appeal research. Health Communication. 2000;12(1):1–21. doi: 10.1207/S15327027HC1201_01.
  • 13. Reiter E, Cawsey A, Osman L, Roff Y. Knowledge acquisition for content selection. In: Proceedings of the 5th European Workshop on Natural Language Generation; 1997. p. 117–126.
  • 14. Fleiss JL. Statistical methods for rates and proportions. 2d ed. New York: Wiley; 1981.
  • 15. Centers for Disease Control and Prevention. Screen for life. [cited 5/17/2004]; Available from: http://www.cdc.gov/cancer/screenforlife/transcripts.htm#smalltown
  • 16. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159–74.
