J Am Med Inform Assoc. 2021 Jun 21;28(9):1970–1976. doi: 10.1093/jamia/ocab086

Figure 1.

An overview of the pretraining and transfer phases vs a traditional training approach. The traditional training approach initializes a new model for each task without sharing knowledge between tasks. In contrast, during the pretraining phase, a model learns parameters on a pretraining task that can generalize to other natural language processing tasks. Pretraining datasets can be large, allowing tasks with smaller datasets to take advantage of the “warm start” provided through pretraining. Pretraining is a one-time cost, allowing a pretrained model to be transferred to multiple new tasks. During the transfer phase, the pretrained model is updated to perform a new task, which can result in better performance with less data than if the model were randomly initialized.
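A minimal sketch of the transfer phase described in the caption, assuming a Hugging Face Transformers–style workflow; the model name, example texts, and labels are illustrative placeholders rather than details from the article. The pretrained weights are downloaded (the one-time pretraining cost has already been paid) and then updated on a small labeled dataset for a new task.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Reuse publicly pretrained weights instead of pretraining from scratch.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # new task-specific head, randomly initialized
)

# Hypothetical small labeled dataset for the new task.
texts = ["patient denies chest pain", "severe chest pain on exertion"]
labels = torch.tensor([0, 1])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# Transfer phase: fine-tune the pretrained parameters (and the new head) on the new task.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few fine-tuning epochs over the tiny example batch
    outputs = model(**batch, labels=labels)
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```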