PHAST: A Collaborative Machine Translation and Post-Editing Tool for Public Health

Kristin Dew; Anne M Turner; Loma Desai; Nathalie Martin; Adrian Laurenzi; Katrin Kirchhoff

2015 Nov 5;2015:492–501.

PMCID: PMC4765627 PMID: 26958182

Abstract

This paper describes a novel collaborative machine translation (MT) plus post-editing system called PHAST (Public Health Automatic System for Translation, phastsystem.org), tailored for use in producing multilingual education materials for public health. Its collaborative features highlight a new approach in public health informatics: sharing limited bilingual translation resources via a groupware system. We report here on the design methods and requirements used to develop PHAST and on its evaluation with potential public health users. Our results indicate such a system could be a feasible means of increasing the production of multilingual public health materials by reducing the barriers of time and cost. PHAST’s design can serve as a model for other communities interested in assuring the accuracy of MT through shared language expertise.

Introduction and Background

There are more than 300 languages spoken in the US, with approximately 25.2 million people having limited English proficiency (LEP), defined as a limited ability to speak, read, write, or understand English.¹ For LEP individuals, accessing accurate and up-to-date health information can be difficult. LEP status contributes to poor health literacy and to a greater incidence of health disparities, such as less preventative health screening.²⁻⁵

As providers of health care, public health departments are required by law to provide language-appropriate materials for individuals who do not speak English.⁶⁻⁸ This includes translating health education materials, such as handouts, web resources, and flyers, into the various languages used in the departments’ respective communities. Translations are typically performed in-house by bilingual employees or ordered from professional translation vendors. The high cost of translations, whether commissioned through professional vendors or produced by limited bilingual in-house staff, limits the number and type of language translations produced. These constraints are particularly acute for smaller health departments with limited resources and staff.⁹ Unfortunately, the decentralized organization of public health in the United States means that local health departments often use their limited funds to translate content similar to that of other health departments, because they lack systems for sharing translation resources.

Over the last four years, as part of an NLM-funded research project, we have been investigating the potential role of machine translation (MT), i.e. the fully automatic translation of text or speech from one natural language into another by a computer program, in producing multilingual public health materials,¹⁰ and how MT could be integrated in current public health practice.¹¹ In this paper we describe the design and evaluation of PHAST, a collaborative translation tool developed to facilitate the integration of MT into the typical workflow of public health practice (phastsystem.org). An early prototype of PHAST was described previously.¹² This paper describes additional design work, the updated PHAST system, and an evaluation via user testing with public health staff.

Our target population comprises lay users – bilingual public health professionals who have a deep understanding of their area of public health or a particular community they serve, but no formal linguistic or technological expertise. In our initial studies,¹¹⁻¹³ we found that core characteristics required of any MT system for public health materials include: 1) a simple, intuitive user interface to facilitate the various steps involved without extensive training or maintenance; 2) support for MT quality control via post-editing, i.e. the manual review and correction of machine-translated text; 3) document sharing among different health departments; and 4) a means for users to track the progress of their translations.

Technical progress in MT over the past decade makes such a system possible. With statistical MT – presently the most promising approach – models are trained automatically on large bodies of parallel text, i.e. text in the source language paired with its translation in the target language;¹⁴ a more detailed introduction to statistical MT technology can be found in Cancedda et al.¹⁵ Statistical MT has been shown to be more powerful than older approaches like rule-based MT and translation memories, which rely on sophisticated linguistic analyses or on large databases of stored examples that need to match the input text. The best statistical MT systems typically outperform rule-based or example-based MT systems, even on specialized technical text (see e.g., Bojar et al.).¹⁶

However, statistical MT alone is still too full of errors to be used reliably for end-to-end translation, especially in sensitive domains like health. In applications where high translation accuracy is required, statistical MT is combined with a human post-editing step, i.e. human readers correct the translation errors created by the MT engine. The MT plus post-editing approach has so far been under-utilized by local and regional health departments. Our previous work was the first to demonstrate the feasibility and advantages of using MT plus post-editing in the public health domain.¹⁰ In a study of Spanish MT plus post-editing for public health materials, documents blindly rated as equivalent in quality to a professional, manual, human translation could be produced for 5% of the cost using MT plus post-editing by a bilingual public health professional, based on an average public health professional’s wage.⁹

MT is now in common use amongst translation vendors and in various business applications,¹⁷ but has been slow to be adopted in clinical settings. MT followed by human post-editing, i.e. correction of the machine-translated output, can yield substantial time and cost savings over manual human-only translation.¹⁸⁻²¹ Even with the required step of post-editing, our prior studies have shown that MT could help public health workers create multilingual documents in less time and at dramatically lower costs. In blind ratings by public health professionals, post-edited MT translations were found to be equivalent to manual human translations.⁹ In tandem with document sharing, MT plus post-editing could ultimately help health departments produce a greater number of low-cost multilingual health materials.

The barriers of time and cost are compounded by a lack of coordination across health care authorities. Our prior studies of Washington (WA) State health departments found that translated health materials generated at one health department were rarely shared among other health departments.⁹ While there is a wide range of bilingual language expertise across health departments (e.g. interpreters, translators, community health workers, etc.), no single health department has the language coverage to meet its translation needs. There have been studies of collaborative translation by monolinguals with machine translation systems, whereby two non-bilingual people who use different languages collaborate to perform translation using MT services.²²⁻²³ As a novel approach to translation and current public health practices, we designed PHAST to enable the remote collaboration of health department workers in editing translated documents produced through MT. To the best of our knowledge, the way in which language expertise and translated documents are shared in PHAST has not been tried until now.

We describe here the design and evaluation of the PHAST tool. While PHAST was developed as a public health informatics tool, we believe it contains useful ideas for designing novel technologies for lay users in other governmental domains, creating collaborative translation applications, and bootstrapping low-cost MT applications.

Methods

Design

User-centered design is essential for increasing the efficiency with which users interact with MT systems.²⁴ We relied on several common user-centered design methods to develop PHAST. Its initial design requirements stemmed from information gathered with qualitative methods, including interviews and focus groups, within five health departments in Washington State (see Table 1). We used the Cognitive Work Analysis framework²⁵ to model current translation workflow as part of a larger project investigating translation practices in Washington State public health agencies. This was the basis of the step-by-step translation workflow used in PHAST, as well as the scenarios used in its evaluation. The workflow study was supplemented with feedback from state health department staff, preliminary usability testing with lay users, and post-editing studies, forming an iterative design and evaluation process that culminated with the evaluation described below.

Table 1: Errors by task and by participant

Task	P1	P2	P3	P4	P5	P6	P7	P8	P9	P10	Total by task
Login			2								2
Identify who’s working on a document									2		2
How complete is post-editing		2									2
Upload a document			2		4					1	7
Claim		2	1		1		1			1	6
Post-edit		1	1	1						1	4
Identified lines edited and saved		2									2
Leave a note											0
Unclaim	1	2	1								4
Download		1			1		1				3
Total by participant	1	10	7	1	6	0	2	0	2	3	32


While some materials may be specific to the local health department, more general health education materials – such as those on vaccines, basic food safety, and so on – are frequently produced independently by each health department and remain in the gray literature.²⁶ We decided to extend the design requirements for PHAST to include collaboration on producing and storing translations, in the hope of allowing health departments to pool their bilingual staff’s language expertise and their translated documents (see System Overview below).

We used the findings from our studies of public health translation processes and those of the subsequent post-editing studies to develop a translation system that is easy to use, efficient, and flexible, and that pooled limited bilingual public health staff language resources. The system is designed for lay users rather than professional translators. As such, its design deviates from that of traditional translation management systems: based on the results of our workflow studies, we removed irrelevant features and created a step-by-step translation workflow mirroring that of health departments’ current practices.¹¹ Because the public health translation staff frequently cited time and cost constraints, we attempted to create a lightweight, simple and intuitive design, which would remove the need for extensive training and minimize adoption headaches for staff; deployment would simply consist of staff signing up and undergoing brief training, for which we have prepared a tutorial.

To minimize costs, we created PHAST as a web-based application built on freely available tools and requiring only minimal maintenance costs and technical resources. PHAST was built using the Kohana PHP framework and a MySQL database. The front-end interface was built using jQuery, Twitter Bootstrap, HTML and CSS.

PHAST supports the following four main tasks that our workflow studies found to be essential for users: (1) uploading a text document requiring translation from the source language (English), (2) statistical MT translation, (3) human post-editing of the MT output to ensure quality, and (4) saving the original and post-edited text documents in an archive from which users can download either the original or the complete post-edited version (see Figure 1). An important constraint is that PHAST only handles raw text, so figures, bullet points, images, and other formatting must be removed before uploading a document to the system. Uploaded documents are visible in the main archive and can be sorted by date or language. Any user can see the document’s post-editing status, which is tracked with a progress bar that is updated as each segment (delimited by line breaks) is saved (see Post-editing Interface section). When all lines have been saved, the document is marked with a green check, indicating that it is available for download.

Figure 1: Overview of documents and their progress through post-editing

Our prior studies of WA State health departments found that, in general, health materials generated at one health department were rarely shared among other health departments.⁹ The PHAST system applies a collaborative approach to encourage sharing of bilingual resources across health departments. Public health workers using the system can upload and post-edit documents from within their own department and can volunteer to post-edit documents that have been uploaded by other departments. Because the MT step itself is automatic, most of the cost of translation stems from the cost of the post-editor’s time. Sharing bilingual staff expertise allows collaboration amongst different departments in translating and sharing health materials. PHAST pools both language expertise and translated documents with the features described below.

When a user creates an account, their agency affiliation, language expertise, and information about experience level and professional certifications are saved in a profile. Once a user has a profile, she can upload a document in PHAST, which uses the Microsoft Translator API. PHAST covers all 39 languages supported by the Microsoft Translator API. At present we are using a free license with a limit of 2,000,000 characters per month. If PHAST were to be adopted widely, the modest monthly subscription cost of $40–160 (4,000,000–16,000,000 translated characters per month) would be one of the only recurring expenses of using the system. It is most likely that a central authority, such as the state health department, would act as a host institution for PHAST and pay the subscription cost; it is also possible that member departments would share the cost among themselves. In either case, the monthly subscription fee is at the low end of the cost range for a single translation by traditional methods ($130–1220).⁹ Its breadth of supported languages, low usage costs, and translation quality are the main reasons we used the Microsoft Translator API instead of the Google Translate API, though a system like PHAST could be built using either option.
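The subscription figures above can be put on a common footing with a little arithmetic. The following is a minimal sketch that uses only the numbers quoted in this section; the function names and the assumed document length of 6,000 characters are illustrative and are not taken from PHAST.

```python
# Figures quoted in the text above; all names are illustrative only.
SUBSCRIPTION_TIERS = [(40, 4_000_000), (160, 16_000_000)]  # (USD/month, characters/month)
TRADITIONAL_COST_RANGE = (130, 1220)  # USD for a single traditionally translated document


def cost_per_million_chars(monthly_fee: float, monthly_chars: int) -> float:
    """MT subscription cost in USD per one million translated characters."""
    return monthly_fee / (monthly_chars / 1_000_000)


def documents_per_month(monthly_chars: int, chars_per_document: int) -> int:
    """How many whole documents of a given length fit in the monthly character quota."""
    return monthly_chars // chars_per_document


ASSUMED_DOC_CHARS = 6_000  # assumption: length of a short health handout

for fee, chars in SUBSCRIPTION_TIERS:
    rate = cost_per_million_chars(fee, chars)
    docs = documents_per_month(chars, ASSUMED_DOC_CHARS)
    print(f"${fee}/month: ${rate:.2f} per million characters, about {docs} documents")

low, high = TRADITIONAL_COST_RANGE
print(f"Traditional translation: ${low}-{high} for one document")
```

These figures cover only the MT step; the post-editor's time is a separate cost and is not modeled here.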

A single source document can be added to the system and translated into multiple languages at once using the upload page. At present, PHAST only supports unformatted .txt and .docx files, because it uses PHPDOCX to convert documents into text format that is interpretable by Microsoft Translator. The upload page’s interface contains text entry fields for the document’s title, topic, intended audience, desired reading level, and other notes. Below these fields, the user selects the file to upload and checks the appropriate boxes next to the MT output languages available. The translated versions are then added to the pool of documents that are ready for post-editing (see Figure 1) and are tracked through the post-editing process. PHAST sends an email to users who have language expertise in the target language, notifying them that a new document has been added and is ready for post-editing.
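The notification step described above amounts to matching each target language against the language expertise stored in user profiles. A minimal sketch of that matching follows; the `User` class, its field names, and the example addresses are hypothetical and do not reflect PHAST's actual data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class User:
    """Hypothetical user profile holding only what the matching needs."""
    email: str
    languages: Set[str] = field(default_factory=set)  # languages the user can post-edit


def recipients_by_language(users: List[User], target_languages: List[str]) -> Dict[str, List[str]]:
    """Map each target language to the e-mail addresses of users with that expertise."""
    return {
        lang: sorted(u.email for u in users if lang in u.languages)
        for lang in target_languages
    }


users = [
    User("ana@example.org", {"Spanish"}),
    User("linh@example.org", {"Spanish", "Vietnamese"}),
]
print(recipients_by_language(users, ["Spanish", "Russian"]))
```

A language with no matching post-editor maps to an empty list, which a deployment could use to flag documents that nobody is able to claim.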

Users can start post-editing by “claiming” the document, which prevents others from post-editing it simultaneously (see [1] in Figure 2), protecting their work and allowing for version control. Any user in the system can claim a document, allowing staff from one health department to add a document and bilingual staff from another to claim and post-edit the document. Because bilingual public health staff often have so many competing demands and perform translation work between other tasks, it may take them several post-editing sessions to complete the process, or they may need to hand the document off to someone else to finish the post-editing. They may also need someone else to perform a quality check of their work.⁹ When the user clicks the “unclaim” button, the document is released so that other users can access it to revise or finish the post-editing (2).
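The claim/unclaim behavior described above works like a simple exclusive lock on a document. The sketch below illustrates that idea under stated assumptions: the `Document` class and its methods are hypothetical, and PHAST's server-side implementation is not described in this paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Document:
    """Hypothetical document record with a single-holder claim."""
    title: str
    claimed_by: Optional[str] = None  # user name of the current post-editor, if any

    def claim(self, user: str) -> bool:
        """Give `user` exclusive post-editing rights; fail if someone else holds the claim."""
        if self.claimed_by not in (None, user):
            return False
        self.claimed_by = user
        return True

    def unclaim(self, user: str) -> bool:
        """Release the claim so another user can revise or finish the document."""
        if self.claimed_by != user:
            return False
        self.claimed_by = None
        return True


doc = Document("Food safety handout (Spanish)")
print(doc.claim("ana"))     # first claim succeeds
print(doc.claim("linh"))    # blocked while ana holds the claim
print(doc.unclaim("ana"))   # ana releases the document
print(doc.claim("linh"))    # now linh can continue the post-editing
```

In a design like this, a claim that is never released keeps the document locked, so the unclaim step matters as much as the claim step.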

Once a user initiates the post-editing process (see Figure 2), PHAST automatically separates the document into segments based on delimiters (line breaks) in the source document (3). For each segment, the source text (4) is shown above its corresponding MT output, which is available for editing. When the post-editor has finished editing a segment and saves it (5), the system highlights the segment in green. During our interviews, we found that public health workflow requires intermittent post-editing and that staff require a method for easily saving their progress in order to resume post-editing at a later time. The “postedited” bar at the top of the screen tracks a document’s progress through the translation process (6), while the green highlighted sections remind post-editors which parts they have already worked on. The post-editing interface includes a discussion board where users can post comments about the document for reference (7). Our user interviews indicated that staff requiring a document translation may need to communicate particular instructions to the post-editors. Post-editors could also use the board to resolve disagreements about how to translate specific parts of the document.
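The segmentation and progress tracking described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and skipping blank lines is an assumption rather than documented PHAST behavior.

```python
from typing import List, Set


def split_into_segments(source_text: str) -> List[str]:
    """Split raw text on line breaks, skipping blank lines (an assumption)."""
    return [line.strip() for line in source_text.splitlines() if line.strip()]


def progress_percent(saved_segments: Set[int], total_segments: int) -> float:
    """Share of segments the post-editor has saved, as a percentage."""
    if total_segments == 0:
        return 0.0
    return 100.0 * len(saved_segments) / total_segments


source = "Wash your hands often.\n\nCover your cough.\nStay home when you are sick.\n"
segments = split_into_segments(source)
saved = {0, 2}  # indices of segments already saved (shown in green in the interface)

print(len(segments), "segments")
print(f"{progress_percent(saved, len(segments)):.0f}% post-edited")
print("ready for download:", len(saved) == len(segments))
```

Tracking saved segments by index lets a post-editor stop and resume later without losing track of which parts are done.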

Evaluation

We have studied the PHAST tool’s functionality and usability with public health department staff in Washington State. Prior to usability testing, we performed unit testing, whereby we tested the functionality of each feature in isolation; scenario testing, whereby we identified and performed a set of scenarios and verified that the system produced the desired output; and stress testing, whereby multiple users logged into the system at once and performed tasks with the desired outcome.

Before presenting our prototype to target users, we performed early usability testing with three bilingual volunteers who, like health department staff, were not professional translators. Appropriate changes were made and, when PHAST was determined to be ready for our target users, we presented it to a Washington Department of Health focus group comprising various health education and translation staff. Participants validated the translation workflow around which PHAST was designed, watched a demonstration of the system, and tested the main functions of the PHAST website. We noted critical incidents and user preferences to inform changes to PHAST. Once we made this series of initial changes, we performed formal user testing. Staff from two public health agencies in Washington were given scenarios based on the workflow findings and asked to perform a series of tasks that reflect the main PHAST functions and the typical translation workflow. Our one-on-one, in-person user tests with health department translation staff took place at their offices, on the current live PHAST website, phastsystem.org.

We would like to emphasize that fully testing the collaborative aspects of PHAST would require a complete deployment to several health departments. This would require a push from a state or federal authority, which was beyond the scope of this research project. Instead, we used a novel strategy for groupware usability testing to simulate two components of statewide deployment: the test scenarios and an analysis framework focused on evaluating collaboration.²⁷ The test scenarios took users through the collaborative aspects of PHAST and encouraged them to think about performing each of the tasks within the context of inter-agency cooperation (e.g. “You want to leave the document for someone else to continue editing. How would you do that?”). As part of the analysis, we used Gutwin et al.’s mechanics of collaboration framework to evaluate how well PHAST in its current form supports collaboration.²⁸

In total, we completed the user tests with 10 staff members from two mid-sized health departments. All participants were involved in producing translated health materials in some capacity, but their job titles varied and included communications, education, nursing, and design roles. Their experience with translation work and technical expertise also varied widely; seven of the 10 had tried statistical MT systems, primarily Google Translate. Participants sat at laptops in their departments’ respective conference rooms to perform tasks and were video recorded to capture audio and actions on the laptop screen. In addition to the moderator, there was a note taker present for each session. Participants were asked to think aloud and were asked probe questions when appropriate.

Because participants were asked to think aloud and to answer probe questions during the tasks, we do not report the time spent on each task. However, it is worth noting that the time required to complete all tasks ranged from around 7 to 28 minutes, depending on whether the users encountered difficulties and on the extent of the think-aloud commentary and probes. Data were collected on task success/failure; error count; error severity on a scale of 1–5, with 1 indicating completion with no problems and 5 indicating failure; as expected/not as expected statements; and perceived satisfaction, with 1 indicating completing the task with no problems and statements that it was exactly as expected or better and 5 indicating failure, with complaints.

The tasks were situated within scenarios based on our translation workflow findings and were read by the moderator. The scenarios emphasized the collaborative aspects of PHAST, in order to simulate realistic interactions across health departments that would likely occur if it were deployed widely. In order, the tasks were: login, identify who’s working on a document, identify the progress bar showing how complete the post-editing is, upload a document and fill in the metadata fields associated with it, claim a document, post-edit three lines of the claimed document and save them, identify the lines that have been edited and saved, leave a note for another post-editor in the discussion box, unclaim the document, and download a completed document. After each task, the moderator asked, “Is that what you would expect?”

Data Analysis

We analyzed the data by code in Excel (e.g., error count per task and participant) and with the code co-occurrence table feature in Atlas.ti, in order to find correlations between codes (e.g. claim a document and as expected/not as expected). Three members of the study team used Atlas.ti to blindly code the videos. We used a closed codebook containing three main types of codes: codes for each task; codes for the aforementioned metrics that point to effectiveness, efficiency and satisfaction with PHAST; and codes for each of the mechanics of collaboration (explicit communication, consequential communication, coordination of action, planning, monitoring, assistance, and protection). Though some mechanics of collaboration were present across several tasks, such as coordination of action, we found planning to be too high-level for the purposes of evaluating PHAST, and therefore excluded it. We also coded suggestions and comments on heuristics. Open coding was encouraged for interesting quotes and additional themes, though it did not ultimately produce any useful findings. We merged the blindly coded Atlas.ti hermeneutic unit files and cleaned the merged version for overlaps and redundant coding.

Results

Participants voiced satisfaction with PHAST by indicating that each task occurred as they expected in the vast majority of instances, and four participants explicitly said they would like to use such a system. One of the less technically proficient participants, P5, told the moderator: “I just think it would be an incredible tool for us. And I like the fact that you have whatever document you have and that people can get on and they can claim it and they can do their thing and then they can unclaim it and someone else can go look at that. That’s very easy.”

About 66% of errors (21 of 32) were concentrated in four of the ten tasks: uploading a document, claiming a document for post-editing, post-editing and saving three lines, and unclaiming a document. We also found errors in downloading a completed translation, but all were due to the participant not understanding the meaning of the term “download.” A summary of the errors by task and participant is in Table 1.

The “uploading a new document” task had a high number of errors and two failures out of ten participant attempts. The errors were concentrated amongst the seemingly less technically proficient participants; four of the ten did not understand what “upload” means and needed prompts or further explanation to add a document to the system. Just over half of all errors were from two participants.
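The concentration figures reported above can be recomputed from the row and column totals of Table 1. The sketch below does so; the counts are copied from Table 1, while the helper name `share_of_top` is ours.

```python
# Totals copied from Table 1 (errors by task and by participant).
errors_by_task = {
    "Login": 2,
    "Identify who's working on a document": 2,
    "How complete is post-editing": 2,
    "Upload a document": 7,
    "Claim": 6,
    "Post-edit": 4,
    "Identified lines edited and saved": 2,
    "Leave a note": 0,
    "Unclaim": 4,
    "Download": 3,
}
errors_by_participant = {
    "P1": 1, "P2": 10, "P3": 7, "P4": 1, "P5": 6,
    "P6": 0, "P7": 2, "P8": 0, "P9": 2, "P10": 3,
}


def share_of_top(counts: dict, n: int) -> float:
    """Fraction of all errors accounted for by the n largest entries."""
    total = sum(counts.values())
    top = sum(sorted(counts.values(), reverse=True)[:n])
    return top / total


print(f"Top four tasks: {share_of_top(errors_by_task, 4):.0%} of errors")
print(f"Top two participants: {share_of_top(errors_by_participant, 2):.0%} of errors")
```

The four largest task totals are those for uploading, claiming, post-editing, and unclaiming, and the two largest participant totals belong to P2 and P3.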

Claiming a document had the second-highest number of errors. While all ten participants completed the task successfully, five needed prompting. A majority voiced confusion over the need to claim the document before post-editing. We believe that this task has a high degree of trainability and that users would not have issues in subsequent attempts. Much of the confusion stemmed from the terminology of claim/unclaim; participant 10 said: “I’m not sure that claimed is the best word. Probably like a library checked in and checked out… We use SharePoint here and I think that’s also a common terminology with SharePoint is checking out and checking in a document.”

Most failures (8 of 13) stemmed from two of the users. Post-editing and saving a line had three failures, all of which stemmed from the participant not saving an edited line before moving on to the next. Identifying edited and saved lines also had three failures, one of which was due to not recognizing that the saved lines changed color. We believe this has a high degree of learnability and would not be an issue in repeated use. This task also evoked many suggestions for a track changes feature or other ways to see what other post-editors have changed in the document. Unclaiming a document had two failures due to the users logging out or closing the window instead of clicking the unclaim button. This could cause critical bottlenecks to a translation’s progress through the system and will require design changes. A summary of the failures by task and by participant is in Table 2 below.

Table 2: Failures by task and by participant

Task	P1	P2	P3	P4	P5	P6	P7	P8	P9	P10	Total by task
Login											0
Identify who’s working on a document									1		1
How complete is post-editing											0
Upload a document		1			1						2
Claim		1									1
Post-edit	1		1	1							3
Identified lines edited and saved		1	1							1	3
Leave a note											0
Unclaim		1	1								2
Download		1									1
Total by participant	1	5	3	1	1	0	0	0	1	1	13


To evaluate whether PHAST allows users to work collaboratively to complete common translation tasks, we plotted each of the mechanics of collaboration codes against the effectiveness (success/failure), efficiency (error count and severity) and satisfaction (as expected/not as expected and satisfaction 1–5) codes in Atlas.ti’s code co-occurrence table, in a novel groupware prototype evaluation method.²⁷ The table gives a correlation coefficient that shows the degree to which two codes overlap. We used the table’s correlation coefficients qualitatively, as a way of finding themes by drawing attention to the extreme values, rather than as correlation coefficients per se. The login and download tasks were not considered collaborative enough for inclusion in this part of the analysis, so only eight of the ten tasks were analyzed. We also excluded the “planning” mechanic code, after deciding it was too high level for PHAST.
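For readers unfamiliar with code co-occurrence measures, the sketch below shows one common way to quantify how much two codes overlap. We assume the coefficient in question is ATLAS.ti's c-coefficient, c = n12 / (n1 + n2 - n12), where n1 and n2 are the numbers of quotations carrying each code and n12 is the number carrying both; the paper does not state the formula. The quotation IDs in the example are invented for illustration and are not study data.

```python
from typing import Set


def c_coefficient(quotes_a: Set[int], quotes_b: Set[int]) -> float:
    """Overlap of two codes: shared quotations divided by quotations carrying either code."""
    n12 = len(quotes_a & quotes_b)
    either = len(quotes_a) + len(quotes_b) - n12
    return n12 / either if either else 0.0


# Invented quotation IDs, for illustration only.
monitoring = {1, 2, 3, 4}
success = {2, 3, 4, 5, 6}

print(c_coefficient(monitoring, success))  # 3 shared out of 6 distinct quotations
```

The value ranges from 0 (the codes never co-occur) to 1 (the codes always co-occur), which is why extreme values are useful for drawing attention to themes.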

We found that PHAST adequately supports explicit communication, as embodied in the document upload and leave a note tasks. Effectiveness, efficiency and satisfaction were all deemed adequate because the positive codes (success, error 1–2, as expected, satisfaction 1–2) were the most strongly associated with the explicit communication code. The two upload failures were related to technical proficiency, rather than to any major design flaws, while all participants were successful in leaving a note for a fellow post-editor in the discussion box. Both tasks had high co-occurrence with positive satisfaction codes. Aside from leaving a note, all tasks included in this analysis involved consequential communication; the results for this code were thus somewhat mixed. The code co-occurrence table showed consequential communication to have a great degree of overlap with both success and failure codes. While there were 30+ errors in the tasks considered consequential communication, two participants accounted for more than half of them, so efficiency was also mixed. Satisfaction was high: the “as expected” code occurred with a much greater frequency than “not as expected” and satisfaction scores 1 and 2 were the most frequently associated with consequential communication.

Similarly, the codes “coordination of action” and “monitoring” spanned all tasks included in this analysis. This is mostly due to the fact that the entire system and corresponding scenarios follow the translation workflow model, and these mechanics are embedded to some degree in every task. We found need for improvement in both, which is in line with our overall individual user experience findings. In terms of effectiveness, there were failures in seven of the eight tasks, with two or more each in uploading, post-editing, identifying edited and saved lines, and unclaiming. Coordination of action showed only moderate overlap with both success and failure in the code co-occurrence table. The success code was most strongly associated with monitoring, and failures had the third-strongest association. With regard to efficiency, there were 30+ errors, over the ten participants, in both coordination of action and monitoring, but the severity was relatively low and, as mentioned above, two participants accounted for more than half the errors. Satisfaction, on the other hand, was relatively high, with “as expected” and satisfaction scores of 3 and above being the most commonly associated with the coordination of action and monitoring codes.

PHAST has adequate support for assistance, which was practiced with the task of leaving a note in the discussion box. There were no failures (effectiveness) and no errors (efficiency), but participants made several suggestions for track changes or line-by-line commenting. Satisfaction was mixed, with “not as expected” co-occurring slightly more frequently than “as expected” and satisfaction scores mostly around 3.

Protection was part of all tasks except uploading a new document, and support for this code was mixed. There were seven failures, so effectiveness for tasks considered protection could be improved. Efficiency was mixed, with 24 errors but low severity ratings. Satisfaction was moderately high, with a greater frequency of “as expected” than “not as expected,” and a majority of satisfaction scores clustered around 1–3.

Discussion and implications for design

The usability testing findings indicate that PHAST could be successfully integrated in public health practice and that public health staff are receptive to using such a system. Still, additional changes should be made prior to any full deployment of the system. The vast majority of changes are relatively minor adjustments to the user interface, e.g. revising the use of claim/unclaim and prompting users to unclaim a document before logging out or closing the browser if they intend to stop post-editing at that time. These types of changes could increase PHAST’s usability and consequently improve its support for collaboration. Even with a design process that emphasized the needs of lay users, the findings show that the system could still be difficult for less technically proficient users in their first encounter. While we accounted for this with the step-by-step translation flow, highly learnable features, and design updates such as changing the upload page name from “Upload a document” to “Add new document,” there are limitations to how much the design can accommodate users unfamiliar with common terms such as “download.” Additional training may be required for these users.

In reviewing the findings and considering them in the context of our past and ongoing interactions with public health translation staff, we believe developing a way to better track changes or post-edited versions may be critical to the success of a full deployment. In the earlier workflow studies and in the post-editing tasks, public health employees voiced concern about quality assurance.¹¹ This concern surfaced again during the user testing; one participant said, “My concern is that, will it save it as this is the change that I’ve made, but if somebody else didn’t want that change, the other version was still there… If I made a change and somebody else came back and said ‘Well what happened to my version?’ You know, I just think there needs to be some documentation there.” While quality assurance is also a concern with traditional translation vendors, our prior studies showed that health departments formed relationships with trusted vendors. A similar relationship could be built into PHAST with post-editor ratings by users. Post-editors would then have a greater incentive to produce quality translations and departments uploading documents would have more control over quality assurance. This addition would require careful consideration so as not to produce excessive demand for certain post-editors while discouraging others or damaging morale.
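The version-tracking requirement raised by this participant can be illustrated with a minimal sketch. It is not part of PHAST; the class and field names (`Revision`, `TranslationHistory`, `editor`) are hypothetical, and the sketch only assumes that each save is stored as a full copy of the translation text so that no post-editor's version is overwritten.

```python
import difflib
from dataclasses import dataclass, field


@dataclass
class Revision:
    editor: str  # hypothetical user identifier
    text: str    # full translation text as saved by this editor


@dataclass
class TranslationHistory:
    """Keeps every saved version so earlier post-edits remain available."""
    revisions: list = field(default_factory=list)

    def save(self, editor: str, text: str) -> int:
        """Append a new revision and return its index."""
        self.revisions.append(Revision(editor, text))
        return len(self.revisions) - 1

    def changes(self, old: int, new: int) -> list:
        """List word-level differences between two revisions.

        Each item is (tag, old_words, new_words), where tag is
        'replace', 'delete', or 'insert'.
        """
        a = self.revisions[old].text.split()
        b = self.revisions[new].text.split()
        matcher = difflib.SequenceMatcher(a=a, b=b)
        return [
            (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"
        ]


history = TranslationHistory()
first = history.save("editor_a", "wash you hands with soap")
second = history.save("editor_b", "wash your hands with soap")
print(history.changes(first, second))
```

Storing full copies rather than overwriting in place keeps the earlier version retrievable, which addresses the documentation concern directly; the word-level comparison is one simple way to show a reviewer what changed between two post-editors.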

These concerns over quality assurance and respect for bilingual employees’ workload are compounded by the funding structure of local public health in the US: funding generally occurs at the county or city government level, so funding authority is highly decentralized. It would likely take a boost from a central authority to fully implement and maintain PHAST in Washington State. To this end, we have built an ongoing partnership with the Washington Department of Health, which has voiced interest in hosting and driving the adoption of PHAST.

Conclusion, limitations and future work

We have presented a collaborative MT system that can significantly reduce the time and cost of producing multilingual health materials and was deemed usable by current public health staff. The system’s collaborative aspects provide a new way to share limited translation and post-editing resources via a groupware system. We have relied on user-centered design methods to iteratively develop PHAST and to better ensure that it meets the needs of lay users and their work context. Usability testing results indicate that additional minor training would be required only for the least technically proficient users. We identified design requirements that should be implemented before attempting a full deployment, including additional quality assurance features. Offline support policies, such as a partnership with a central authority that can help drive adoption, would also be necessary for full implementation.

PHAST has been developed based on user studies with personnel from health departments in WA State. Our findings may not reflect the practices of health departments in other states. A full evaluation of PHAST’s support for collaborative work was limited by the fact that it was not yet fully implemented across health departments. However, this provided an opportunity to simulate and then evaluate the system with the mechanics of collaboration framework, which we believe to be a useful contribution to our study and to the public health informatics community.

Statistical MT combined with post-editing has already been shown to reduce the errors that remain after machine translation. Currently, PHAST uses a generic MT engine, Microsoft Translator (http://www.microsoft.com/en-us/translator/). The Microsoft Translator Hub (http://hub.microsofttranslator.com/) allows users to train their own customized models for a particular domain or style of text, provided they can supply a sufficiently large training set (a minimum of 2000 sentences, though larger sets will yield better results). We have run initial experiments with customized models produced by the Translator Hub and have found that translation performance (as measured by automatic evaluation procedures) improves over generic models. A more widespread and regular deployment of the PHAST system would enable the collection of a large number of examples of translated materials, which in the future can be used to generate more accurate automated translation models. Thus, one of the long-term goals of this work is to collect a sufficient number of quality translated documents to produce customized translation models. Currently, the PHAST system does provide functionality for collecting documents it has processed in a parallel text format, but the procedure for building new models has not been automated or integrated into the PHAST system. This growing collection would allow future development of dynamic, customized MT in the public health domain, and the approach could be replicated for other domain-specific MT systems.
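As an illustration of the collection step, the following sketch gathers post-edited documents into a parallel text format and checks the result against the 2000-sentence minimum mentioned above. It does not describe PHAST's actual implementation. The layout (one sentence per line in two aligned files), the file names, the language codes, and all function names are assumptions made for this example, and sentence segmentation is assumed to have been done beforehand.

```python
from pathlib import Path

# Minimum training set size stated in the text for a customized model.
MIN_TRAINING_SENTENCES = 2000


def collect_pairs(documents):
    """Flatten (source_sentences, target_sentences) documents into aligned pairs.

    A document whose two sides have different sentence counts is skipped,
    because its line-by-line alignment cannot be trusted.
    """
    pairs = []
    for source_sentences, target_sentences in documents:
        if len(source_sentences) != len(target_sentences):
            continue
        for src, tgt in zip(source_sentences, target_sentences):
            # Collapse whitespace so each sentence stays on a single line.
            src, tgt = " ".join(src.split()), " ".join(tgt.split())
            if src and tgt:
                pairs.append((src, tgt))
    return pairs


def write_parallel_files(pairs, out_dir, src_lang="en", tgt_lang="es"):
    """Write corpus.<src_lang> and corpus.<tgt_lang>; line i of each file aligns."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    src_path = out / f"corpus.{src_lang}"
    tgt_path = out / f"corpus.{tgt_lang}"
    src_path.write_text("".join(s + "\n" for s, _ in pairs), encoding="utf-8")
    tgt_path.write_text("".join(t + "\n" for _, t in pairs), encoding="utf-8")
    return src_path, tgt_path


def ready_for_training(pairs, minimum=MIN_TRAINING_SENTENCES):
    """Report whether enough sentence pairs have been collected."""
    return len(pairs) >= minimum
```

A deployment could run such a check periodically and flag a language pair for model customization only once the threshold is met; how the resulting files would be submitted to a training service is outside the scope of this sketch.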

Acknowledgments

This study was funded by grant #1R01LM010811-01 from the National Library of Medicine (NLM). Its content is the sole responsibility of the authors and does not necessarily represent the view of the NLM. We would like to thank the WA health department staff members who volunteered their time during various stages of this project.

References

1. Pandya C, McHugh M, Batalova J. Limited English Proficient Individuals in the United States: Number, Share, Growth, and Linguistic Diversity. LEP Data Brief. Migration Policy Institute; 2011.
2. Peña-Purcell N. Hispanics’ use of Internet health information: an exploratory study. Journal of the Medical Library Association: JMLA. 2008;96(2):101. doi: 10.3163/1536-5050.96.2.101.
3. Cardelle AJ F, Rodriguez EG. The quality of Spanish health information websites: an emerging disparity. Journal of Prevention & Intervention in the Community. 2005;29(1–2):85–102.
4. Cheng EM, Chen A, Cunningham W. Primary language and receipt of recommended health care among Hispanics in the United States. Journal of General Internal Medicine. 2007;22(2):283–288. doi: 10.1007/s11606-007-0346-6.
5. Shi L, Lebrun LA, Tsai J. The influence of English proficiency on access to care. Ethnicity & Health. 2009;14(6):625–642. doi: 10.1080/13557850903248639.
6. Civil Rights Act of 1964 § 7, 42 U.S.C. § 2000d et seq (1964)
7. Office of Civil Rights Guidance to Federal Financial Assistance Recipients Regarding Title VI Prohibition against National Origin Discrimination Affecting Limited English Proficient Persons. 2002. Available from: http://www.lep.gov/guidance/guidance_index.html.
8. Office of Minority Health National Standards for Culturally and Linguistically Appropriate Services (CLAS) in Health and Health Care. Available from: https://www.thinkculturalhealth.hhs.gov/pdfs/EnhancedNationalCLASStandards.pdf.
9. Turner AM, Bergman M, Brownstein M, Cole K, Kirchhoff K. A Comparison of Human and Machine Translation of Health Promotion Materials for Public Health Practice: Time, Costs, and Quality. Journal of Public Health Management and Practice. 2014;20(5):523–529. doi: 10.1097/PHH.0b013e3182a95c87.
10. Kirchhoff K, Turner AM, Axelrod A, Saavedra F. Application of statistical machine translation to public health information: a feasibility study. Journal of the American Medical Informatics Association. 2011;18(4):473–478. doi: 10.1136/amiajnl-2011-000176.
11. Turner AM, Brownstein M, Cole K, Karasz H, Kirchhoff K. Modeling workflow to design machine translation applications for public health practice. Journal of Biomedical Informatics. 2015;53:136–146. doi: 10.1016/j.jbi.2014.10.005.
12. Laurenzi A, Brownstein M, Turner AM, Kientz JA, Kirchhoff K. A web-based collaborative translation management system for public health workers; In proceedings of CHI’13 Extended Abstracts on Human Factors in Computing Systems; ACM; 2013. pp. 511–516.
13. Laurenzi A, Brownstein M, Turner AM, Kirchhoff K. Integrated Post-Editing and Translation Management for Lay User Communities; In proceedings of the 2nd Workshop on Post-editing Technology and Practice; France: Nice; 2013. pp. 27–34.
14. Koehn P. Statistical Machine Translation. Cambridge University Press; 2009.
15. Cancedda N, Dymetman M, Foster G, Goutte C. A statistical machine translation primer. In: Goutte C, Cancedda N, Dymetman M, Foster G, editors. Learning Machine Translation. Cambridge, MA: The MIT Press; 2009. pp. 1–38.
16. Bojar O, Buck C, Federmann C, Haddow B, Koehn P, Leveling J, et al. Findings of the 2014 Workshop on Statistical Machine Translation; In proceedings of the Ninth Workshop on Statistical Machine Translation; 2014. pp. 12–58.
17. Muller C. Machine translation post-editing holds the key to MT success. Available from: http://www.csoftintl.com/knowledge_vault/machine_translation_post_editing_holds_the_key_to_mt_success.
18. Allen J. Postediting: an integrated part of a translation software program. Language International. 2001;13(2):26–29.
19. Doyon J, Doran C, Means CD, Parr D. Automated machine translation improvement through post-editing techniques: analyst and translator experiments; In proceedings of the Conference of the American Machine Translation Association; Waikiki, Hawai’i: AMTA; 2008.
20. Green S, Heer J, Manning CD. The efficacy of human post-editing for language translation; In proceedings of the SIGCHI Conference on Human Factors in Computing Systems; ACM; 2013.
21. Green S, Chuang J, Heer J, Manning CD. Predictive Translation Memory: A mixed-initiative system for human language translation; In proceedings of the 27th annual ACM symposium on User Interface Software and Technology; ACM; 2014. pp. 177–187.
22. Morita D, Ishida T. Collaborative translation by monolinguals with machine translators; In proceedings of the 14th International Conference on Intelligent User Interfaces; ACM; 2009. pp. 361–366.
23. Hu C, Bederson BB, Resnik P. Translation by iterative collaboration between monolingual users; In proceedings of Graphics Interface 2010; Canadian Information Processing Society; 2010.
24. Gaspari F. Machine Translation: From Real Users to Research. Springer Berlin; Heidelberg: 2004. Online MT services and real users’ needs: An empirical usability evaluation; pp. 74–85.
25. Vicente KJ. Cognitive work analysis: Toward safe, productive, and healthy computer-based work. CRC Press; 1999.
26. Turner AM, Liddy ED, Bradley J, Wheatley JA. Modeling public health interventions for improved access to the gray literature. Journal of the Medical Library Association. 2005;93(4):487.
27. Dew K, Turner AM, Desai L, Martin N, Kirchhoff K. Evaluating groupware prototypes with discount methods; In proceedings of the 18th ACM Conference on Computer-Supported Cooperative Work and Social Computing; ACM; 2015. in press.
28. Gutwin C, Greenberg S. The mechanics of collaboration: Developing low cost usability evaluation methods for shared workspaces; In proceedings of Enabling Technologies: Infrastructure for Collaborative Enterprises; IEEE; 2000. pp. 98–103.

