2024 Mar 15;19:27. doi: 10.1186/s13012-024-01357-9

Table 1.

Generative AI models [20–23]

Generative AI model	Description	Applications
Generative adversarial networks (GANs)	GANs consist of two neural networks, a generator and a discriminator, that are trained in competition: the generator produces synthetic samples and the discriminator tries to distinguish them from real data. GANs are often used in image synthesis, super-resolution, and style transfer	Image synthesis, style transfer, face ageing, data augmentation, 3D object creation
Variational autoencoders (VAEs)	VAEs are a type of autoencoder that adds constraints to the encoding process, causing the network to learn continuous, structured latent representations. This makes them useful for tasks such as generating new images or other data points	Image generation, anomaly detection, image denoising, exploration of latent spaces, content generation in gaming
Autoregressive models	These models predict the next output in a sequence based on previous outputs. They have been used extensively in language modelling tasks (like text generation), as well as in generating music and even images	Text generation (e.g., GPT models), music composition, image generation (e.g., PixelRNN), time-series forecasting
Flow-based models	These models leverage the change of variables formula to model complex distributions. They are characterised by their ability to both generate new samples and perform efficient inference	High-quality image synthesis, speech and music modelling, density estimation, anomaly detection
Energy-based models (EBMs)	In EBMs, the aim is to learn an energy function that assigns low energy values to data points from the data distribution and higher energies to other points. EBMs can be used for a wide range of applications, including image synthesis, denoising and inpainting	Image synthesis and restoration, pattern recognition, unsupervised and semi-supervised learning, structured prediction
Diffusion models	These models learn to generate data by reversing a diffusion process that gradually adds noise until the data approach a Gaussian distribution. They have been shown to generate high-quality, diverse samples	High-fidelity image generation (e.g., DALL-E 2), audio synthesis, molecular structure generation
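
As an illustration of the diffusion row above, the following is a minimal sketch of the forward (noising) half of a diffusion process only. It is not taken from the article or its references: the function name `forward_diffuse`, the fixed noise level `beta`, the number of steps and the toy two-cluster data are all assumptions chosen for illustration. It shows that repeated small Gaussian perturbations drive arbitrary data towards a standard Gaussian distribution; a real diffusion model would additionally train a neural network to reverse these steps, which is omitted here.

```python
import math
import random
import statistics


def forward_diffuse(samples, steps=200, beta=0.05, seed=0):
    """Apply `steps` rounds of Gaussian noising to each value in `samples`.

    Hypothetical helper for illustration; `beta` is an assumed constant
    noise level (practical models usually vary it across steps).
    """
    rng = random.Random(seed)
    keep = math.sqrt(1.0 - beta)  # share of the current value retained
    noise = math.sqrt(beta)       # scale of the fresh Gaussian noise
    out = list(samples)
    for _ in range(steps):
        out = [keep * x + noise * rng.gauss(0.0, 1.0) for x in out]
    return out


if __name__ == "__main__":
    # Toy data: two tight clusters, clearly not Gaussian.
    data = [5.0] * 1000 + [-3.0] * 1000
    noised = forward_diffuse(data)
    # After many steps the mean is close to 0 and the variance close to 1.
    print(statistics.mean(noised), statistics.pvariance(noised))
```

Generation in a trained model runs in the opposite direction: it starts from Gaussian noise and applies learned denoising steps to arrive at a new sample.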