2024 Mar 15;19:27. doi: 10.1186/s13012-024-01357-9

Table 1.

Generative AI models [2023]

| Generative AI model | Description | Applications |
| --- | --- | --- |
| Generative adversarial networks (GANs) | GANs consist of two neural networks, a generator and a discriminator, that compete against each other. GANs are often used in image synthesis, super-resolution, style transfer, and more | Image synthesis, style transfer, face ageing, data augmentation, 3D object creation |
| Variational autoencoders (VAEs) | VAEs are a type of autoencoder that adds constraints to the encoding process, causing the network to generate continuous, structured representations. This makes them useful for tasks such as generating new images or other data points | Image generation, anomaly detection, image denoising, exploration of latent spaces, content generation in gaming |
| Autoregressive models | These models predict the next output in a sequence based on previous outputs. They have been used extensively in language modelling tasks (like text generation), as well as in generating music and even images | Text generation (e.g., GPT models), music composition, image generation (e.g., PixelRNN), time-series forecasting |
| Flow-based models | These models leverage the change of variables formula to model complex distributions. They are characterised by their ability to both generate new samples and perform efficient inference | High-quality image synthesis, speech and music modelling, density estimation, anomaly detection |
| Energy-based models (EBMs) | In EBMs, the aim is to learn an energy function that assigns low energy values to data points from the data distribution and higher energies to other points. EBMs can be used for a wide range of applications, including image synthesis, denoising and inpainting | Image synthesis and restoration, pattern recognition, unsupervised and semi-supervised learning, structured prediction |
| Diffusion models | These models gradually learn to construct data by reversing a diffusion process, which transforms data into a Gaussian distribution. They have shown remarkable results in generating high-quality, diverse samples | High-fidelity image generation (e.g., DALL-E 2), audio synthesis, molecular structure generation |
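
As an illustration of the final row, the forward half of a diffusion process (the part that "transforms data into a Gaussian distribution") can be sketched in a few lines. This is a minimal sketch, not a method described in the table: it assumes a DDPM-style variance-preserving formulation with a linear noise schedule, and the function names (`linear_beta_schedule`, `cumulative_alpha`, `diffuse`), the schedule endpoints, and the toy data are all hypothetical choices made for illustration. A trained diffusion model would learn to reverse these steps; that learning part is not shown.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, linearly spaced (an assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alpha(betas):
    """alpha_bar_t: running product of (1 - beta_s) for all s up to t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def diffuse(x0, t, alpha_bars, rng):
    """Sample the noised data x_t directly from x_0:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise."""
    ab = alpha_bars[t]
    signal, noise = math.sqrt(ab), math.sqrt(1.0 - ab)
    return [signal * v + noise * rng.gauss(0.0, 1.0) for v in x0]


rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alpha(betas)

x0 = [5.0] * 10000  # toy "data": every value is 5.0, far from Gaussian
x_T = diffuse(x0, len(betas) - 1, alpha_bars, rng)

# After the last step the original signal is almost entirely replaced by
# noise, so the sample statistics should sit close to a standard Gaussian.
mean = sum(x_T) / len(x_T)
var = sum((v - mean) ** 2 for v in x_T) / len(x_T)
print(f"mean={mean:.3f} var={var:.3f}")
```

Under these assumptions the toy data, which starts as a constant, ends up with a mean near 0 and a variance near 1. Generation then runs the other way: a model starts from Gaussian noise and removes it step by step.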