| Generation methods | Usually autoregressive. | Usually non-autoregressive. |
| Discrete text handling | One-hot encoding, distributed representation, bag-of-words representation, and word embedding representation. | Discrete text diffusion and continuous text diffusion. |
| Time complexity | Depends on factors such as the number of model layers, the number of attention heads, the hidden-layer dimension, and the size of the training data. | Usually depends on the number of sampling steps and the model complexity. |
| Diversity of generated results | Tends to favor high-probability words, which can make generated outputs relatively conservative and similar. | Introduces more randomness, so the generated text tends to be more diverse. |
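
The diversity contrast in the comparison above can be illustrated with a toy sketch. It is not a diffusion model or a real language model: the vocabulary, the scores in `LOGITS`, and the helper names are all hypothetical, chosen only to show how always taking the highest-probability word collapses outputs to one choice, while injecting randomness into the choice spreads them out.

```python
import math
import random

# Hypothetical next-word scores for a toy vocabulary (illustration only).
VOCAB = ["the", "a", "this", "one"]
LOGITS = [2.0, 1.0, 0.5, 0.1]


def softmax(logits, temperature=1.0):
    """Turn scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_greedy(logits):
    """Always return the highest-scoring word."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return VOCAB[best]


def pick_sampled(logits, temperature, rng):
    """Draw one word at random, weighted by its softmax probability."""
    probs = softmax(logits, temperature)
    return rng.choices(VOCAB, weights=probs, k=1)[0]


def distinct_outputs(picker, trials=200):
    """Count how many different words a picker produces over many trials."""
    return len({picker() for _ in range(trials)})


rng = random.Random(0)  # fixed seed so the run is repeatable
greedy_distinct = distinct_outputs(lambda: pick_greedy(LOGITS))
sampled_distinct = distinct_outputs(lambda: pick_sampled(LOGITS, 1.5, rng))

print("distinct words, greedy:", greedy_distinct)
print("distinct words, sampled:", sampled_distinct)
```

Greedy selection yields a single distinct word no matter how many trials are run, whereas the randomized picker produces several. The randomness in diffusion models enters through the noising and denoising process rather than through a temperature parameter, so this sketch only mirrors the general principle stated in the table.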