Generative adversarial networks (GANs) |
GANs consist of two neural networks, a generator and a discriminator, trained against each other: the generator produces synthetic samples and the discriminator tries to tell them apart from real data. GANs are often used for image synthesis, super-resolution and style transfer |
Image synthesis, style transfer, face ageing, data augmentation, 3D object creation |
Variational autoencoders (VAEs) |
VAEs are autoencoders that add a probabilistic constraint to the encoding process, regularising the latent space so that the network learns continuous, structured representations. New images or other data points can then be generated by sampling from that latent space |
Image generation, anomaly detection, image denoising, exploration of latent spaces, content generation in gaming |
Autoregressive models |
These models predict the next output in a sequence based on previous outputs. They have been used extensively in language modelling tasks (like text generation), as well as in generating music and even images |
Text generation (e.g., GPT models), music composition, image generation (e.g., PixelRNN), time-series forecasting |
Flow-based models |
These models use the change-of-variables formula to model complex distributions through invertible transformations of a simple base distribution. They support both sampling and efficient inference, including exact likelihood evaluation |
High-quality image synthesis, speech and music modelling, density estimation, anomaly detection |
Energy-based models (EBMs) |
In EBMs, the aim is to learn an energy function that assigns low energy to points from the data distribution and higher energy to other points. EBMs can be used for a wide range of applications, including image synthesis, denoising and inpainting |
Image synthesis and restoration, pattern recognition, unsupervised and semi-supervised learning, structured prediction |
Diffusion models |
These models learn to generate data by reversing a diffusion process that gradually adds noise until the data become approximately Gaussian. They have shown remarkable results in generating high-quality, diverse samples |
High-fidelity image generation (e.g., DALL-E 2), audio synthesis, molecular structure generation
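
The change-of-variables formula mentioned in the flow-based models row can be made concrete with a very small example. The sketch below is illustrative only: it assumes a one-dimensional affine transformation x = scale * z + shift applied to a standard normal base variable, and the names `AffineFlow`, `standard_normal_logpdf` and `normal_logpdf` are hypothetical helpers, not part of any library. Practical flow-based models stack many learned invertible layers, but the same rule applies: log p(x) = log p(z) + log |dz/dx|.

```python
import math
import random


def standard_normal_logpdf(z):
    """Log-density of the base distribution N(0, 1)."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


def normal_logpdf(x, mean, std):
    """Closed-form log-density of N(mean, std^2), used only as a reference."""
    u = (x - mean) / std
    return -0.5 * u * u - math.log(std) - 0.5 * math.log(2.0 * math.pi)


class AffineFlow:
    """Hypothetical one-layer flow: x = scale * z + shift, with z ~ N(0, 1)."""

    def __init__(self, scale, shift):
        if scale == 0:
            raise ValueError("scale must be non-zero so the map is invertible")
        self.scale = scale
        self.shift = shift

    def forward(self, z):
        # Generation direction: base sample -> data sample.
        return self.scale * z + self.shift

    def inverse(self, x):
        # Inference direction: data point -> base variable.
        return (x - self.shift) / self.scale

    def log_prob(self, x):
        # Change of variables: log p(x) = log p(z) + log |dz/dx|,
        # where dz/dx = 1 / scale for this affine map.
        z = self.inverse(x)
        log_abs_det = -math.log(abs(self.scale))
        return standard_normal_logpdf(z) + log_abs_det

    def sample(self, rng):
        return self.forward(rng.gauss(0.0, 1.0))


flow = AffineFlow(scale=2.0, shift=1.0)
rng = random.Random(0)
x = flow.sample(rng)          # draw a new sample
log_px = flow.log_prob(x)     # evaluate its exact log-likelihood
```

With these assumed parameters the flow describes a normal distribution with mean 1 and standard deviation 2, so `flow.log_prob(x)` should agree with `normal_logpdf(x, 1.0, 2.0)`. This shows both properties named in the table: the same invertible map is used for sampling (`forward`) and for exact density evaluation (`inverse` plus the Jacobian term).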