Nanophotonics. 2023 Oct 4;12(20):3871–3881. doi: 10.1515/nanoph-2023-0292

Diffusion probabilistic model based accurate and high-degree-of-freedom metasurface inverse design

Zezhou Zhang 1,2, Chuanchuan Yang 3, Yifeng Qin 2, Hao Feng 1,2, Jiqiang Feng 4, Hongbin Li 3
PMCID: PMC11501780  PMID: 39635197

Abstract

Conventional meta-atom designs rely heavily on researchers’ prior knowledge and trial-and-error searches using full-wave simulations, resulting in time-consuming and inefficient processes. Inverse design methods based on optimization algorithms, such as evolutionary algorithms and topology optimization, have been introduced to design metamaterials. However, none of these algorithms is general enough to fulfill multi-objective tasks. Recently, deep learning methods represented by generative adversarial networks (GANs) have been applied to the inverse design of metamaterials and can directly generate high-degree-of-freedom meta-atoms based on S-parameters requirements. However, the adversarial training process of GANs makes the network unstable and results in high modeling costs. This paper proposes a novel metamaterial inverse design method based on diffusion probability theory. By learning the Markov process that transforms the original structure into a Gaussian distribution, the proposed method can gradually remove noise starting from the Gaussian distribution and generate new high-degree-of-freedom meta-atoms that meet S-parameters conditions. This avoids the model instability introduced by the adversarial training process of GANs and ensures more accurate and higher-quality generation results. Experiments demonstrate that our method is superior to representative GAN-based methods in terms of model convergence speed, generation accuracy, and quality.

Keywords: deep learning, metasurfaces, inverse design, diffusion probabilistic model

1. Introduction

The prevalent metamaterial/functional-metasurface design workflow is an iterative trial-and-error process guided by designers’ intuition, in which designers select unit cell structures, perform parameter sweeps, and create arrays to achieve specific functions [1–6]. However, this method is inefficient and struggles with wideband multi-frequency targets due to increased nonlinearity [7]. Recently, optimization-based inverse design methods, such as adjoint-based topology optimization [8], genetic algorithms [9], and ant-colony optimization [10], have emerged, but they cannot deal with multi-objective tasks and incur high computational costs because each targeted unit cell must be optimized individually.

Compared to optimization-based inverse design methods, deep learning can effectively explore the global design space [11], enabling fast generation of meta-atom structures that meet S-parameters requirements and can be reused after a single training. Early research employed a tandem approach connecting forward and inverse networks to solve the one-to-many problem, using fixed structure parameters encoded as one-dimensional vectors for inverse design [12–16]. As performance requirements grew, researchers explored free-form structures with higher degrees of freedom and introduced the generative adversarial network (GAN) [17] and the variational autoencoder (VAE) [18] as generation methods.

VAEs produce blurry, low-quality results due to the trade-off between the reconstruction loss and the KL divergence loss [7, 19], and require additional optimization algorithms for extensive latent-space searches [20–22], increasing design complexity. GANs, through adversarial training between a generator and a discriminator [23], enable the direct generation of high-quality structures. Conditional GANs guide the network by providing conditional inputs and have been applied to various scenarios such as metasurfaces [24–27], 1D metagratings [11], free-form metagratings [28, 29], transmission surfaces [30], nanostructures [31], structural color [32], and single-frequency XY-polarization multiplexing surfaces [33].

However, the adversarial training of GANs requires maintaining a dynamic balance between the capabilities of the generator and the discriminator throughout training, which is challenging. Consequently, GANs experience inherent convergence difficulties during training, ultimately limiting their ability to generate accurate structures. Despite attempts to alleviate the instability through methods such as deep convolutional GANs (DCGANs) [34], Wasserstein GANs (WGANs) [35], and WGANs with gradient penalty (WGAN-GP) [36], the inherent instability of the adversarial process remains unresolved. Model stability and accuracy still rely on careful model structure design and hyperparameter selection during training [37]. When input and output dimensions change, numerous experiments are needed to adjust the model structure, increasing network design costs and limiting GANs’ applicability across scenarios [38].

Recently, diffusion probability theory [39] has demonstrated superior performance over GANs in terms of accuracy, diversity, and precision for generated samples in image generation tasks [38]. Consequently, it has been successfully applied to text-to-image generation, including OpenAI’s DALLE [40], Google’s IMAGEN [41], and image super-resolution [42]. However, the application of diffusion probability theory as a novel generation method for on-demand direct inverse generation of metasurfaces remains unexplored.

In this paper, we propose a novel metasurface inverse design method based on the diffusion probabilistic model, called MetaDiffusion, capable of directly generating free-form meta-atom structures with high degrees of freedom according to broadband amplitude and phase requirements. Our method, illustrated in Figure 1, defines a forward Markov process q(x_t|x_{t−1}) that gradually adds noise to the meta-atom x_0 until it becomes Gaussian noise x_T. A neural network p_θ(x_{t−1}|x_t) then approximates the denoising process q(x_{t−1}|x_t), iteratively denoising a sample x_T drawn from the Gaussian distribution to generate a new structure x_0 that meets the S-parameters requirements. This eliminates the unstable adversarial process employed by GANs. Consequently, MetaDiffusion bypasses the need for fine-tuning stable network architectures and hyperparameters for different data and can be easily applied to various metasurface design tasks. We also fuse the S-parameters and extra parameters (meta-atom size W1, thickness H2, material refractive index N2) as a condition in the decoder of our model and utilize a classifier-free condition control strategy [43], enabling accurate direct generation for wideband S-parameters targets.

Figure 1:

Overview of MetaDiffusion. (a) Purple panel: schematic diagram of the training process. From left to right, noise is gradually added to the original meta-atom (black arrow); from right to left, meta-atoms are regenerated by gradually removing noise from samples drawn from a Gaussian distribution (green arrow). (b) Green panel: a neural network is used to approximate the posterior probability q(x_{t−1}|x_t) required in the denoising process. Given the noisy meta-atom x_t as input, the noise ϵ_θ to be removed is predicted. The S-parameters, extra parameters, and time step t are used as conditions and are integrated into the neural network after feature extraction to control conditional denoising. (c) Blue panel: on-demand inverse design process. The well-trained neural network is used in the denoising process to generate new meta-atoms that meet the requirements, starting from random noise and performing directional denoising according to the S-parameters conditions.

To the best of our knowledge, we are the first to apply the diffusion probabilistic model to the inverse design of meta-atoms, overcoming the instability of GANs’ adversarial process. Our method, MetaDiffusion, incorporates S-parameters conditions and utilizes a classifier-free condition control strategy in the neural network to achieve precise generation according to S-parameters requirements. Importantly, through rigorous tests, we demonstrate that our proposed method converges faster, exhibits higher conditional accuracy, and is more stable than GAN-based methods such as SLMGAN [44] and WGAN-GP. Specifically, for a wideband S-parameters target with 52 sampling points between 30 and 60 THz, our method shows a 43 % and 48 % improvement in the average mean absolute error of the S-parameters compared to SLMGAN and WGAN-GP, respectively. Furthermore, the mean absolute errors of the top 95 % of samples are below 0.071, which is 38 % better than SLMGAN and 49 % better than WGAN-GP.

2. The proposed MetaDiffusion method

2.1. Diffusion model

To train the neural network to gradually denoise Gaussian noise and generate a new meta-atom according to the S-parameters requirements, noise is first added to the meta-atoms in the training set through a forward Markov process based on diffusion probability theory [39]. Starting from the original meta-atom x_0, random noise ϵ is added at each time step t ∈ {1, 2, …, T} according to the variance schedule β_1, …, β_T:

q(x_t \mid x_{t-1}) = \mathcal{N}\big(x_t;\, \sqrt{1-\beta_t}\, x_{t-1},\, \beta_t \mathbf{I}\big),  (1)
q(x_{1:T} \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1}),  (2)

where the variance β_t increases linearly from β_1 = 10^{−4} at the first time step to β_T = 0.02 at the last time step, indicating that progressively more noise is added to the data so that the image eventually approaches a Gaussian distribution. By using the reparameterization trick and defining α_t = 1 − β_t and ᾱ_t = \prod_{s=1}^{t} α_s, the noisy meta-atom matrix x_t at any time step t can be expressed directly in closed form based on x_0 (see Section 1.1 of the Supplementary Materials for the detailed derivation):

x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1-\bar{\alpha}_t}\, \epsilon, \quad \epsilon \sim \mathcal{N}(0, \mathbf{I}).  (3)
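As a concrete illustration, the following PyTorch sketch (our own, not the authors’ released code) implements the linear variance schedule and the closed-form noising step of Equation (3); the tensor shapes assume the 32 × 32 meta-atom representation described in Section 3.

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)          # beta_1 ... beta_T, linear schedule
alphas = 1.0 - betas                           # alpha_t = 1 - beta_t
alpha_bars = torch.cumprod(alphas, dim=0)      # bar(alpha)_t = prod_{s<=t} alpha_s

def q_sample(x0: torch.Tensor, t: torch.Tensor, noise: torch.Tensor) -> torch.Tensor:
    """Closed-form forward process, Eq. (3): x_t = sqrt(abar_t) x_0 + sqrt(1 - abar_t) eps."""
    abar = alpha_bars[t].view(-1, 1, 1, 1)     # broadcast over (B, C, H, W)
    return abar.sqrt() * x0 + (1.0 - abar).sqrt() * noise

# Example: noise a batch of 32 x 32 binary meta-atom quarters at random time steps.
x0 = torch.randint(0, 2, (8, 1, 32, 32)).float()
t = torch.randint(0, T, (8,))
xt = q_sample(x0, t, torch.randn_like(x0))
```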

To gradually remove noise from a sample x_T drawn from the Gaussian distribution and recover an x_0 from the original meta-atom distribution, the posterior probability q(x_{t−1}|x_t) must be known, which is intractable. Therefore, we use a learnable function p_θ(x_{t−1}|x_t) to approximate q(x_{t−1}|x_t) instead. The denoising process from x_T to x_0 is defined as p_θ(x_{0:T}):

p_\theta(x_{0:T}) = p(x_T) \prod_{t=1}^{T} p_\theta(x_{t-1} \mid x_t),  (4)
p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\big(x_{t-1};\, \mu_\theta(x_t, t),\, \Sigma_\theta(x_t, t)\big).  (5)

During the training process, we encourage p_θ(x_{t−1}|x_t) to approximate q(x_{t−1}|x_t) by optimizing the variational lower bound on the negative log-likelihood, where the variational lower bound L can be simplified as (see Section 1.2 of the Supplementary Materials for the detailed derivation):

\mathbb{E}\big[-\log p_\theta(x_0)\big] \le \mathbb{E}_q\!\left[-\log \frac{p_\theta(x_{0:T})}{q(x_{1:T} \mid x_0)}\right] \equiv L,  (6)
L \simeq \sum_{t>1} D_{\mathrm{KL}}\big(q(x_{t-1} \mid x_t, x_0) \,\big\|\, p_\theta(x_{t-1} \mid x_t)\big).  (7)

Minimizing the KL divergence can be intuitively understood as using the neural network ϵ_θ(x_t, t) to estimate the noise ϵ_t added at each time step of the forward process (see Section 1.3 of the Supplementary Materials for the detailed derivation),

\arg\min_\theta L(\theta) \;\simeq\; \arg\min_\theta \mathbb{E}\big[\|\epsilon_t - \epsilon_\theta(x_t, t)\|^2\big],  (8)

where θ represents the learnable parameters of the neural network, and x_t can be obtained from Equation (3).
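The simplified objective of Equation (8) reduces to a mean-squared error between the injected noise and the network prediction. A minimal training-loss sketch, assuming a PyTorch model eps_theta(x_t, t) and the q_sample helper above, could look as follows (the conditional variant additionally passes the condition c, as described in Section 2.2):

```python
import torch
import torch.nn.functional as F

def diffusion_loss(model, x0, T=1000):
    """Simplified objective of Eq. (8): MSE between injected noise and predicted noise."""
    t = torch.randint(0, T, (x0.shape[0],), device=x0.device)  # uniform random time steps
    eps = torch.randn_like(x0)                                  # ground-truth noise eps_t
    xt = q_sample(x0, t, eps)                                   # x_t from Eq. (3)
    return F.mse_loss(model(xt, t), eps)                        # || eps_t - eps_theta(x_t, t) ||^2
```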

2.2. Incorporating S-parameters conditions

The original diffusion model was used to generate images similar to the training dataset without introducing conditional control over the generation. In the meta-atom generation task, however, it is essential to incorporate the S-parameters and additional parameters (such as the meta-atom size W1, thickness H2, and material refractive index N2) to control the generation process of the model. This ensures that the generated structure, with the specified material, thickness, and size, meets the requirements of the S-parameters condition. Therefore, we fuse these conditions with the time-step information t and use them as part of the neural network input during training, represented as ϵ_θ(x_t, t, c).

In addition, to further improve the accuracy of the model, we adopt a classifier-free guidance strategy [43], which trains a conditional and an unconditional denoising diffusion model simultaneously in the same neural network as an implicit guide. Specifically, during training, 10 % of the data samples are randomly selected and their corresponding conditions are masked with a special label 0. The complete training process can be found in Algorithm S1 in the Supplementary Materials.
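A possible implementation of this condition masking, assuming the 1 × 55 condition vectors are stacked into a batch tensor, is sketched below; the 10 % drop probability matches the text, while the function name is ours.

```python
import torch

def mask_conditions(cond: torch.Tensor, p_uncond: float = 0.1) -> torch.Tensor:
    """cond: (B, 55) S-parameters + extra-parameter conditions.
    With probability p_uncond a sample's condition is replaced by the label 0,
    so one network learns both eps_theta(x_t, t, c) and eps_theta(x_t, t, 0)."""
    keep = (torch.rand(cond.shape[0], 1, device=cond.device) > p_uncond).float()
    return cond * keep          # dropped rows become the all-zero "unconditional" label
```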

After the neural network is trained, during the on-demand generation stage we mix the conditional estimate ϵ_θ(x_t, t, c) and the unconditional estimate ϵ_θ(x_t, t, 0) as the network’s noise prediction according to Equation (9), where w is a hyperparameter that controls the trade-off between accuracy and diversity. Unlike typical image generation tasks that emphasize diversity and image quality, the meta-atom generation task places more emphasis on conditional accuracy. Therefore, we select a relatively large value w = 6.0.

\tilde{\epsilon}_\theta(x_t, t, c) = (1+w)\,\epsilon_\theta(x_t, t, c) - w\,\epsilon_\theta(x_t, t, 0).  (9)

Then, we apply the reparameterization trick to the posterior q(x_{t−1}|x_t, x_0) and substitute the noise estimate ϵ̃_θ(x_t, t, c) approximated by the neural network to obtain the image at the previous time step (see Section 1.4 of the Supplementary Materials for the detailed derivation):

x_{t-1} = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}}\, \tilde{\epsilon}_\theta(x_t, t, c)\right) + \sqrt{\frac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_t}\,\beta_t}\; z,  (10)

where z ∼ \mathcal{N}(0, \mathbf{I}).

Through the iterative denoising process, new meta-atoms that meet the requirements of the S-parameters can be generated. The complete process of the on-demand generation stage can be found in Algorithm S2 in the Supplementary Materials.
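For illustration, the following sketch (our paraphrase of the sampling procedure, not the authors’ Algorithm S2) combines the guidance mixing of Equation (9) with the denoising update of Equation (10); the model interface, schedule tensors, and shapes are assumptions.

```python
import torch

@torch.no_grad()
def generate(model, cond, betas, alphas, alpha_bars, shape=(1, 1, 32, 32), w=6.0):
    """Denoise from x_T ~ N(0, I) to x_0 under the condition `cond` of shape (B, 55)."""
    x = torch.randn(shape)
    uncond = torch.zeros_like(cond)                      # masked condition "0"
    for t in reversed(range(betas.shape[0])):
        tt = torch.full((shape[0],), t, dtype=torch.long)
        eps_c = model(x, tt, cond)                       # eps_theta(x_t, t, c)
        eps_u = model(x, tt, uncond)                     # eps_theta(x_t, t, 0)
        eps = (1 + w) * eps_c - w * eps_u                # guidance mixing, Eq. (9)
        mean = (x - (1 - alphas[t]) / (1 - alpha_bars[t]).sqrt() * eps) / alphas[t].sqrt()
        if t > 0:                                        # posterior noise term of Eq. (10)
            var = (1 - alpha_bars[t - 1]) / (1 - alpha_bars[t]) * betas[t]
            x = mean + var.sqrt() * torch.randn_like(x)
        else:
            x = mean
    return x
```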

2.3. Network architecture

As shown in Figure 2, we employ a generic encoder-decoder U-Net architecture to predict the noise ϵ for a given noisy image x_t. This enables us to approximate the posterior probability q(x_{t−1}|x_t) and subsequently obtain x_{t−1} using Equation (10). The network first uses an encoder composed of residual blocks and pooling layers for downsampling compression, and then uses transposed convolutions and residual blocks for upsampling recovery, extracting key features during compression and recovery. The encoder and decoder are connected by skip connections to alleviate the information loss incurred during the encoder’s downsampling.

Figure 2:

MetaDiffusion network architecture. The input x_t is first downsampled by an encoder composed of residual blocks (Resblock) and pooling layers, reducing the matrix size while increasing the depth of the feature channels. At the bottleneck layer, it becomes a 512-length vector. It then passes through a decoder composed of multiple transposed convolutional layers (Tconv) and residual blocks, gradually reducing the number of channels and restoring the size. Skip connections between the encoder and decoder retain information. The S-parameters and extra parameters are concatenated into a condition vector. The condition vector and the time step information are separately passed through feature extractors composed of fully connected layers (Linear) before being added to the decoder. Finally, the network predicts the noise ϵ_θ with the same size as the input.

In addition, considering the complexity of the conditions in the meta-atom generation task, we utilize two fully connected layers to extract features from the S-parameters, the extra parameters, and the time step t. These extracted features are then concatenated and integrated into the upsampling process of the decoder, ensuring both low computational complexity and effective fusion of the conditions. (For the specific network structure and training hyperparameters, see the Supplementary Materials.)
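A minimal sketch of such condition fusion is shown below; the module and layer names are ours, and the paper states only that fully connected feature extractors embed the condition and time step before they are added in the decoder.

```python
import torch
import torch.nn as nn

class ConditionEmbed(nn.Module):
    """Embed the 1 x 55 condition vector and the time step, then add them to a decoder feature map."""
    def __init__(self, cond_dim=55, emb_dim=128):
        super().__init__()
        self.cond_mlp = nn.Sequential(nn.Linear(cond_dim, emb_dim), nn.SiLU(),
                                      nn.Linear(emb_dim, emb_dim))
        self.time_mlp = nn.Sequential(nn.Linear(1, emb_dim), nn.SiLU(),
                                      nn.Linear(emb_dim, emb_dim))

    def forward(self, cond, t, feat):
        """cond: (B, 55); t: (B,) integer steps; feat: (B, emb_dim, H, W) decoder features."""
        emb = self.cond_mlp(cond) + self.time_mlp(t.float().unsqueeze(-1) / 1000.0)
        return feat + emb[:, :, None, None]   # broadcast-add over the spatial dimensions
```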

3. Results and discussion

To verify and compare model performance, we use the all-dielectric free-form unit cell dataset [45]. As shown in Figure 1, each unit cell consists of two dielectric layers with different refractive indices: the upper layer is a high-refractive-index free-form structure symmetric about the X and Y axes, and the lower layer has a fixed refractive index of N1 = 1.4 and a thickness of H1 = 2 μm. The entire unit cell can be characterized by a 64 × 64 binary matrix (0 represents air, 1 represents dielectric) describing the shape of the free-form structure and a 1 × 3 parameter vector describing the substrate size W1 ∈ [2.5 μm, 3 μm], thickness H2 ∈ [0.5 μm, 1 μm], and refractive index N2 ∈ [3.5, 5]. The dataset assumes that all meta-atoms are polarization-independent and reciprocal, with symmetric configurations. This allows us to simplify the structure representation to the 32 × 32 two-dimensional matrix in the upper-left corner, thereby introducing physical symmetry as prior knowledge, which is conducive to network training. In the future, we plan to use our MetaDiffusion model to generate meta-atoms with more sophisticated properties, such as chiral metamaterials or non-Hermitian metamaterials.

The transmission responses from 30 to 60 THz, obtained by simulating the structures in CST Microwave Studio, are used as the labels for the structure data. The real and imaginary parts of the transmission response are each sampled at 26 points and combined into a 1 × 52 one-dimensional vector. In addition, we incorporate the 1 × 3 structural parameters [W1, H2, N2] as extra conditions, adding a constraint that guides the model to generate meta-atoms with the desired material, size, and thickness. As a result, we concatenate the 1 × 3 parameter vector with the 1 × 52 transmission response to form a 1 × 55 vector used as the target condition input to the generating network. The dataset contains a total of 174,883 data points and is divided into training, validation, and test sets at a ratio of 8 : 1 : 1.
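The data representation can be summarized by the following sketch (ours): the 32 × 32 quarter pattern is mirrored into the full 64 × 64 symmetric cell, and the sampled transmission response is concatenated with [W1, H2, N2] into the 1 × 55 condition vector (the ordering of the real and imaginary blocks is our assumption).

```python
import numpy as np

def expand_quarter(quarter: np.ndarray) -> np.ndarray:
    """quarter: (32, 32) binary upper-left block -> (64, 64) symmetric unit cell."""
    top = np.concatenate([quarter, np.fliplr(quarter)], axis=1)   # mirror left-right
    return np.concatenate([top, np.flipud(top)], axis=0)          # mirror up-down

def build_condition(s_real, s_imag, w1, h2, n2) -> np.ndarray:
    """s_real, s_imag: 26 sampled points each over 30-60 THz; returns the (55,) condition."""
    return np.concatenate([s_real, s_imag, [w1, h2, n2]])
```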

During the training and comparison of generative neural networks, the model is instructed to generate structures for the S-parameters targets on the validation and test sets, and the generated results must then be forward-simulated to obtain the corresponding frequency responses, so that the accuracy of the network can be evaluated quantitatively. However, numerical simulation methods such as the finite-difference time-domain (FDTD) method are very time-consuming and much slower than the generation speed of the generative network, making it difficult to evaluate the large amounts of data produced during training and testing. Therefore, we instead use a predicting neural network (PNN) [45] as a fast solver that accurately predicts the spectral response of unit cells within milliseconds. We have verified through qualitative and quantitative experiments that the PNN can serve as a reliable surrogate solver: its mean squared error (MSE) on the test set is 0.001835 for the real part and 0.001789 for the imaginary part. (See the Supplementary Materials for more experimental results.)

To fully demonstrate the accuracy and stability of the proposed method’s on-demand direct generation capabilities, we compare it with current state-of-the-art methods for on-demand direct generation of meta-atoms, including WGAN-GP [28], SLMGAN with a simulator [44], and the conditional VAE (cVAE) [22]. (1) WGAN-GP enhances the stability of WGAN by employing a gradient penalty to restrict the parameters of the network that approximates the Wasserstein distance. (2) The recently proposed SLMGAN with a simulator (hereinafter referred to as SLMGAN) further enhances training stability and conditional accuracy by employing Sinkhorn iteration to explicitly calculate optimal transport and by incorporating a pre-trained forward prediction network as one of the training loss terms. (3) The cVAE is a representative VAE-based approach that feeds the S-parameters conditions to the decoder, enabling the VAE to be applied to the direct generation of meta-atoms. (See the Supplementary Materials for implementation details of SLMGAN with a simulator, WGAN-GP, and cVAE.)

After binarizing the generated results of the four models (using 0.5 as the threshold, setting values less than 0.5 to 0 and values greater than or equal to 0.5 to 1), we use the surrogate solver to perform forward calculation on them and obtain the corresponding S-parameters. The mean absolute error (MAE) is used as the accuracy indicator for each sample, where the MAE of each sample is expressed as:

L = \frac{1}{n} \sum_{i=1}^{n} \big| s_{\mathrm{target}}(i) - s_{\mathrm{gen}}(i) \big|,  (11)

where s_target(i) is the target S-parameter at the ith frequency point and s_gen(i) is the S-parameter of the generated result at the ith frequency point. The calculation includes 26 points each for the real part and the imaginary part.
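Putting the binarization and Equation (11) together, a per-sample evaluation step might look like the following sketch, where surrogate stands in for the PNN forward solver:

```python
import numpy as np

def sample_mae(generated: np.ndarray, target_s: np.ndarray, surrogate) -> float:
    """generated: raw model output in [0, 1]; target_s: (52,) target S-parameters;
    surrogate: callable mapping a binarized structure to its (52,) predicted response."""
    binary = (generated >= 0.5).astype(np.float32)     # 0.5-threshold binarization
    s_gen = surrogate(binary)                          # PNN forward prediction (real + imag)
    return float(np.mean(np.abs(target_s - s_gen)))    # MAE of Eq. (11), n = 52
```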

During training, at the end of each epoch we use the model parameters of the current epoch to generate samples on the validation set according to the S-parameters conditions and evaluate the MAE of each generated sample relative to its input condition. The mean MAE over all samples gives the accuracy of the generated results on the entire validation set for the current epoch. The MAE results of the four methods on the validation set throughout training are shown in Figure 3. All four methods gradually converge within 500 epochs, with MetaDiffusion converging to around 0.03, and SLMGAN, WGAN-GP, and cVAE converging to around 0.05, 0.05, and 0.06, respectively. More importantly, the figure also shows that our method converges significantly faster than SLMGAN and WGAN-GP. Specifically, MetaDiffusion converges to 0.068 within the first 14 epochs and to 0.052 within the first 30 epochs, indicating that our method is more stable and easier to train than GAN-based methods and requires less training time to achieve the same effect. Although cVAE has a convergence speed similar to MetaDiffusion, it converges to a higher error than MetaDiffusion and the GAN-based methods, suggesting weaker learning ability.

Figure 3:

Validation loss of four models during the training process.

Then, the four trained models are used for qualitative and quantitative comparisons on the test set. In Figures 4 and 5, generation results of the four models are randomly selected and presented qualitatively. Each row represents one method, from top to bottom: MetaDiffusion, SLMGAN, WGAN-GP, and cVAE. For each method, the figure shows the original structure in the dataset, the direct output of the model, and the structure after binarization. Finally, the binarized structure is forward-simulated to compare the S-parameters of the generated structure with the target S-parameters. From the figures, we can see that all four methods can generate structures that roughly follow the target S-parameters trend. However, our proposed MetaDiffusion method is the most accurate of the four, while SLMGAN, WGAN-GP, and cVAE are inaccurate in some frequency ranges. Although the binarized results are used as the model outputs, directly observing the raw outputs reflects the generation ability and stability of each model to a certain extent. From the figures, it can be seen that MetaDiffusion directly outputs clear structural images, while WGAN-GP shows some blurring in its generated results due to the training instability caused by the adversarial process. SLMGAN’s results are further blurred by its use of special interpolation operations to introduce symmetry information. The cVAE also tends to generate blurry images, because the VAE must balance the image reconstruction error against the KL divergence between the latent space and the standard distribution during training. This inherent problem limits the ability of the cVAE to generate stable and uniform patterns.

Figure 4:

Qualitative comparison example 1 of on-demand inverse design. (a) The generated results of MetaDiffusion. (b) The generated results of SLMGAN with a simulator. (c) The generated results of WGAN-GP. (d) The generated results of cVAE. For each method, we display the structure in the original dataset (Original), the direct output of the model (Model Output), and the output after binarization (After Binarization). We perform forward simulation on the binarized structure to obtain the real part (Generated-real) and imaginary part (Generated-imag) of the corresponding S-parameters and compare it with the real part (Target-real) and imaginary part (Target-imag) of the target S-parameters. The error in the real part (Error-real) and imaginary part (Error-imag) at each frequency point is shown as a bar graph. Note that the gray portion of the figure represents the overlap of blue and orange.

Figure 5:

Qualitative comparison example 2 of on-demand inverse design. (a) The generated results of MetaDiffusion. (b) The generated results of SLMGAN with a simulator. (c) The generated results of WGAN-GP. (d) The generated results of cVAE.

Additionally, as depicted in Figures 4 and 5, our proposed method tends to produce more uniform structures, whereas the structures generated by the other three methods occasionally exhibit scattered pixels, complicating the fabrication process. To further demonstrate the advantage of our method in generating stable structures with respect to ease of fabrication, we employ our approach to perform one-to-many generations for specific targets, as illustrated in Figure 6.

Figure 6:

One-to-many generations. Panels (a) and (b) show two sets of examples in which multiple unit cell structures meeting the given S-parameters target (top of each panel) are generated. It can be observed that MetaDiffusion tends to generate regular structures that are easy to fabricate.

Figure 7 quantitatively compares the accuracy of the four models in generating unit cells according to S-parameters conditions on the entire test set. Figure 7a shows the mean MAE on the entire test set. The average error of MetaDiffusion is 0.02824, significantly lower than the average errors of SLMGAN (0.04955), WGAN-GP (0.05454), and cVAE (0.06142). This suggests that MetaDiffusion is a more effective meta-atom inverse design model than the other three. In particular, MetaDiffusion achieves 43 %, 48 %, and 54 % accuracy improvements over SLMGAN, WGAN-GP, and cVAE, respectively.

Figure 7:

Quantitative comparison of the four methods. (a) Mean MAE error on the test set. The improvement of MetaDiffusion relative to SLMGAN with a simulator, WGAN-GP and cVAE is indicated by arrows in the figure. (b) Error distribution on the test set. The vertical dashed line indicates the position corresponding to 95 % error.

In addition to the average error, we also examine the distribution of errors across samples, to rule out situations where a few samples perform extremely well and mask the poor performance of most samples in the mean indicator. We calculate the error of each individual sample for the four models on the entire test set. In the error distribution graph in Figure 7b, the green, blue, orange, and purple bars represent the proportions of MetaDiffusion, SLMGAN, WGAN-GP, and cVAE in the corresponding error ranges; the solid lines are kernel density estimates (KDE) of the corresponding distributions, visualizing their overall trends. Figure 7b shows that the error distribution of our proposed method is more concentrated towards 0, while SLMGAN, WGAN-GP, and cVAE are spread more evenly. The dashed lines mark the 95 % sample error of each model. For our proposed MetaDiffusion model, 95 % of the samples have an error below 0.071, while the 95 % errors of SLMGAN, WGAN-GP, and cVAE are 0.115, 0.139, and 0.166, respectively. Therefore, in terms of the error of the top 95 % of samples, our method improves on SLMGAN, WGAN-GP, and cVAE by 38 %, 49 %, and 57 %, respectively.
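For reference, the 95 % error reported here corresponds to the 95th percentile of the per-sample MAE distribution, which can be computed as in the following small sketch (ours):

```python
import numpy as np

def error_at_95(per_sample_mae: np.ndarray) -> float:
    """Error value below which 95 % of the per-sample MAEs fall."""
    return float(np.percentile(per_sample_mae, 95))
```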

Finally, we also compare the performance of the models under reduced training data sizes. As shown in Figure 8, we construct limited training sets containing 80 %, 60 %, 40 %, and 20 % of the original training set; the results on the test set are shown in the same figure. As the training dataset size decreases, the performance of all four methods degrades. However, MetaDiffusion consistently outperforms the other three methods, demonstrating its advantage under limited sample sizes.

Figure 8:

Test loss with limited training data. The four models are trained with 100 %, 80 %, 60 %, 40 %, and 20 % of the training dataset, respectively, and tested on the complete test dataset.

4. Conclusions

In conclusion, we present a novel inverse design method for metasurfaces, MetaDiffusion, based on diffusion probability theory. This method facilitates the direct generation of high-quality, diverse, and accurate freeform meta-atoms, conforming to broadband amplitude and phase requirements. By employing a neural network to learn the noise diffusion process, our approach generates new meta-atoms that meet S-parameters conditions via denoising, thereby avoiding the labor-intensive traditional iterative search process.

Experiments show that the widely used GAN-based methods achieve higher accuracy than VAE-based methods in recent applications of on-demand direct meta-atom generation. However, GANs suffer from inherent model instability due to the adversarial training process. Our method, in contrast, does not require an adversarial process and thus avoids this instability. As a result, our approach obviates the need for multiple experiments to select stable network structures for different scenarios, allowing for easy extension to various situations while ensuring accurate and high-quality meta-atom generation results. Experimental validation demonstrates that MetaDiffusion outperforms representative GAN methods in terms of model convergence speed and the quality and accuracy of generated structures.

The methodology, MetaDiffusion, establishes a novel direction in inverse design and fosters further inquiry into the utilization of diffusion probability models in the realm of meta-atoms design. In subsequent research, we plan to harness the exceptional conditional accuracy inherent in diffusion probability models to develop and fabricate intricate freeform metasurfaces, such as those required for complex applications in real-world scenarios.

Supplementary Material


This article contains supplementary material (https://doi.org/10.1515/nanoph-2023-0292).

Footnotes

Research funding: This research was supported by the National Key Research and Development Program of China (2020YFB1806405 and 2020YFB1806400) and the Major Key Project of PCL (PCL2023AS2-4).

Author contributions: Zezhou Zhang: Conceptualization, Methodology, Software, Writing – original draft, Writing – review & editing; Chuanchuan Yang: Writing – review & editing, Supervision, Funding acquisition; Yifeng Qin: Writing – original draft, Writing – review & editing; Hao Feng: Visualization; Jiqiang Feng: Writing – review & editing, Resources, Funding acquisition; Hongbin Li: Supervision, Funding acquisition. All authors have accepted responsibility for the entire content of this manuscript and approved its submission.

Conflict of interest: Authors state no conflict of interest.

Data availability: The datasets used during the current study are available in the open-source repository at https://github.com/SensongAn/Meta-atoms-data-sharing.

Contributor Information

Zezhou Zhang, Email: zezhou.zhang@stu.pku.edu.cn.

Chuanchuan Yang, Email: yangchuanchuan@pku.edu.cn.

Yifeng Qin, Email: qinyf@pcl.ac.cn.

Hao Feng, Email: hfeng@pku.edu.cn.

Jiqiang Feng, Email: fengjq@szu.edu.cn.

Hongbin Li, Email: lihb@pku.edu.cn.

References

[1] He Q., Sun S., Xiao S., Zhou L. High-efficiency metasurfaces: principles, realizations, and applications. Adv. Opt. Mater. 2018;6(19):1800415. doi: 10.1002/adom.201800415.
[2] Chen S., Liu W., Li Z., Cheng H., Tian J. Metasurface-empowered optical multiplexing and multifunction. Adv. Mater. 2020;32(3):1805912. doi: 10.1002/adma.201805912.
[3] Khorasaninejad M., Capasso F. Broadband multifunctional efficient meta-gratings based on dielectric waveguide phase shifters. Nano Lett. 2015;15(10):6709–6715. doi: 10.1021/acs.nanolett.5b02524.
[4] Li Z., Pestourie R., Park J.-S., Huang Y.-W., Johnson S. G., Capasso F. Inverse design enables large-scale high-performance meta-optics reshaping virtual reality. Nat. Commun. 2022;13(1):2409. doi: 10.1038/s41467-022-29973-3.
[5] Hu G., Ma W., Hu D., et al. Real-space nanoimaging of hyperbolic shear polaritons in a monoclinic crystal. Nat. Nanotechnol. 2023;18(1):64–70. doi: 10.1038/s41565-022-01264-4.
[6] Dong S., Hu G., Wang Q., et al. Loss-assisted metasurface at an exceptional point. ACS Photonics. 2020;7(12):3321–3327. doi: 10.1021/acsphotonics.0c01440.
[7] Khatib O., Ren S., Malof J., Padilla W. J. Deep learning the electromagnetic properties of metamaterials—a comprehensive review. Adv. Funct. Mater. 2021;31(31):2101748. doi: 10.1002/adfm.202101748.
[8] Jensen J., Sigmund O. Topology optimization for nano-photonics. Laser Photonics Rev. 2011;5(2):308–321. doi: 10.1002/lpor.201000014.
[9] Jafar-Zanjani S., Inampudi S., Mosallaei H. Adaptive genetic algorithm for optical metasurfaces design. Sci. Rep. 2018;8(1):11040. doi: 10.1038/s41598-018-29275-z.
[10] Lewis A., Weis G., Randall M., Galehdar A., Thiel D. Optimising efficiency and gain of small meander line RFID antennas using ant colony system. In: 2009 IEEE Congress on Evolutionary Computation. IEEE; 2009. pp. 1486–1492.
[11] Jiang J., Fan J. A. Global optimization of dielectric metasurfaces using a physics-driven neural network. Nano Lett. 2019;19(8):5366–5372. doi: 10.1021/acs.nanolett.9b01857.
[12] An X., Cao Y., Wei Y., et al. Broadband achromatic metalens design based on deep neural networks. Opt. Lett. 2021;46(16):3881–3884. doi: 10.1364/ol.427221.
[13] An S., Fowler C., Zheng B., et al. A deep learning approach for objective-driven all-dielectric metasurface design. ACS Photonics. 2019;6(12):3196–3207. doi: 10.1021/acsphotonics.9b00966.
[14] Liu D., Tan Y., Khoram E., Yu Z. Training deep neural networks for the inverse design of nanophotonic structures. ACS Photonics. 2018;5(4):1365–1369. doi: 10.1021/acsphotonics.7b01377.
[15] Yuan L., Wang L., Yang X.-S., Huang H., Wang B.-Z. An efficient artificial neural network model for inverse design of metasurfaces. IEEE Antennas Wirel. Propag. Lett. 2021;20(6):1013–1017. doi: 10.1109/lawp.2021.3069713.
[16] Yeung C., Tsai J.-M., King B., et al. Multiplexed supercell metasurface design and optimization with tandem residual networks. Nanophotonics. 2021;10(3):1133–1143. doi: 10.1515/nanoph-2020-0549.
[17] Goodfellow I., Pouget-Abadie J., Mirza M., et al. Generative adversarial nets. In: Advances in Neural Information Processing Systems, vol. 27. Curran Associates, Inc.; 2014. pp. 2672–2680. Available at: https://papers.nips.cc/paper_files/paper/2014/hash/5ca3e9b122f61f8f06494c97b1afccf3-Abstract.html.
[18] Kingma D. P., Welling M. Auto-encoding variational bayes. 2013. arXiv preprint arXiv:1312.6114.
[19] Xiao Z., Kreis K., Vahdat A. Tackling the generative learning trilemma with denoising diffusion gans. 2021. arXiv preprint arXiv:2112.07804.
[20] Tanriover I., Lee D., Chen W., Aydin K. Deep generative modeling and inverse design of manufacturable free-form dielectric metasurfaces. ACS Photonics. 2022;10:875–883. doi: 10.1021/acsphotonics.2c01006.
[21] Zandehshahvar M., Kiarashinejad Y., Zhu M., Maleki H., Brown T., Adibi A. Manifold learning for knowledge discovery and intelligent inverse design of photonic nanostructures: breaking the geometric complexity. ACS Photonics. 2022;9(2):714–721. doi: 10.1021/acsphotonics.1c01888.
[22] Ma W., Cheng F., Xu Y., Wen Q., Liu Y. Probabilistic representation and inverse design of metamaterials based on a deep generative model with semi-supervised learning strategy. Adv. Mater. 2019;31(35):1901111. doi: 10.1002/adma.201901111.
[23] Mirza M., Osindero S. Conditional generative adversarial nets. 2014. arXiv preprint arXiv:1411.1784.
[24] Liu Z., Zhu D., Rodrigues S. P., Lee K.-T., Cai W. Generative model for the inverse design of metasurfaces. Nano Lett. 2018;18(10):6570–6576. doi: 10.1021/acs.nanolett.8b03171.
[25] So S., Rho J. Designing nanophotonic structures using conditional deep convolutional generative adversarial networks. Nanophotonics. 2019;8(7):1255–1261. doi: 10.1515/nanoph-2019-0117.
[26] Wang H. P., Li Y. B., Li H., et al. Deep learning designs of anisotropic metasurfaces in ultrawideband based on generative adversarial networks. Adv. Intell. Syst. 2020;2(9):2000068. doi: 10.1002/aisy.202000068.
[27] Yeung C., Tsai R., Pham B., et al. Global inverse design across multiple photonic structure classes using generative deep learning. Adv. Opt. Mater. 2021;9(20):2100548. doi: 10.1002/adom.202100548.
[28] Jiang J., Sell D., Hoyer S., Hickey J., Yang J., Fan J. A. Free-form diffractive metagrating design based on generative adversarial networks. ACS Nano. 2019;13(8):8872–8878. doi: 10.1021/acsnano.9b02371.
[29] Wen F., Jiang J., Fan J. A. Robust freeform metasurface design based on progressively growing generative networks. ACS Photonics. 2020;7(8):2098–2104. doi: 10.1021/acsphotonics.0c00539.
[30] Liu P., Chen L., Chen Z. N. Prior-knowledge-guided deep-learning-enabled synthesis for broadband and large phase shift range metacells in metalens antenna. IEEE Trans. Antennas Propag. 2022;70(7):5024–5034. doi: 10.1109/tap.2021.3138517.
[31] Baucour A., Kim M., Shin J. Data-driven concurrent nanostructure optimization based on conditional generative adversarial networks. Nanophotonics. 2022;11(12):2865–2873. doi: 10.1515/nanoph-2022-0005.
[32] Dai P., Sun K., Yan X., et al. Inverse design of structural color: finding multiple solutions via conditional generative adversarial networks. Nanophotonics. 2022;11(13):3057–3069. doi: 10.1515/nanoph-2022-0095.
[33] An S., Zheng B., Tang H., et al. Multifunctional metasurface design with a generative adversarial network. Adv. Opt. Mater. 2021;9(5):2001433. doi: 10.1002/adom.202001433.
[34] Radford A., Metz L., Chintala S. Unsupervised representation learning with deep convolutional generative adversarial networks. 2015. arXiv preprint arXiv:1511.06434.
[35] Arjovsky M., Chintala S., Bottou L. Wasserstein generative adversarial networks. In: International Conference on Machine Learning. PMLR; 2017. pp. 214–223.
[36] Gulrajani I., Ahmed F., Arjovsky M., Dumoulin V., Courville A. C. Improved training of wasserstein gans. Adv. Neural Inf. Process. Syst. 2017;30:5767–5777.
[37] Li Z., Pestourie R., Lin Z., Johnson S. G., Capasso F. Empowering metasurfaces with inverse design: principles and applications. ACS Photonics. 2022;9(7):2178–2192. doi: 10.1021/acsphotonics.1c01850.
[38] Dhariwal P., Nichol A. Diffusion models beat gans on image synthesis. Adv. Neural Inf. Process. Syst. 2021;34:8780–8794.
[39] Ho J., Jain A., Abbeel P. Denoising diffusion probabilistic models. Adv. Neural Inf. Process. Syst. 2020;33:6840–6851.
[40] Ramesh A., Pavlov M., Goh G., et al. Zero-shot text-to-image generation. In: International Conference on Machine Learning. PMLR; 2021. pp. 8821–8831.
[41] Saharia C., Chan W., Saxena S., et al. Photorealistic text-to-image diffusion models with deep language understanding. Adv. Neural Inf. Process. Syst. 2022;35:36479–36494.
[42] Saharia C., Ho J., Chan W., Salimans T., Fleet D. J., Norouzi M. Image super-resolution via iterative refinement. IEEE Trans. Pattern Anal. Mach. Intell. 2022;45:4713–4726. doi: 10.1109/tpami.2022.3204461.
[43] Ho J., Salimans T. Classifier-free diffusion guidance. In: NeurIPS 2021 Workshop on Deep Generative Models and Downstream Applications. 2021.
[44] Dai M., Jiang Y., Yang F., et al. SLMGAN: single-layer metasurface design with symmetrical free-form patterns using generative adversarial networks. Appl. Soft Comput. 2022;130:109646. doi: 10.1016/j.asoc.2022.109646.
[45] An S., Zheng B., Shalaginov M. Y., et al. Deep learning modeling approach for metasurfaces with high degrees of freedom. Opt. Express. 2020;28(21):31932–31942. doi: 10.1364/oe.401960.


