Nature Communications. 2022 Nov 29;13:6831. doi: 10.1038/s41467-022-34308-3

Asymptotically fault-tolerant programmable photonics

Ryan Hamerly 1,2, Saumil Bandyopadhyay 1, Dirk Englund 1
PMCID: PMC9708693  PMID: 36446762

Abstract

Component errors limit the scaling of programmable coherent photonic circuits. These errors arise because the standard tunable photonic coupler—the Mach-Zehnder interferometer (MZI)—cannot be perfectly programmed to the cross state. Here, we introduce two modified circuit architectures that overcome this limitation: (1) a 3-splitter MZI mesh for generic errors, and (2) a broadband MZI+Crossing design for correlated errors. Because these designs allow for perfect realization of the cross state, the matrix fidelity no longer degrades with increased mesh size, allowing scaling to arbitrarily large meshes. The proposed architectures support progressive self-configuration, are more compact than previous MZI-doubling schemes, and do not require additional phase shifters. This removes a key limitation to the development of very-large-scale programmable photonic circuits.

Subject terms: Silicon photonics, Electrical and electronic engineering, Integrated optics


Fabrication errors limit the scaling of programmable photonic circuits. Here the authors show how a broad class of circuits can be made asymptotically fault-tolerant, where the effect of errors remains controlled regardless of the circuit’s size.

Introduction

Large-scale programmable photonic circuits are opening up radical new possibilities for optics. Of key importance in many devices is the universal multiport interferometer, which functions as an N × N reconfigurable feedforward linear circuit. This device, typically constructed with a compact mesh of Mach–Zehnder interferometers (MZIs, Fig. 1a, b)1,2, is widely employed in applications ranging from spatially multiplexed optical communications to machine learning and quantum computing3–7. Unfortunately, component errors (Fig. 1c) are a critical factor limiting the size of such circuits. Since the circuit depth of MZI meshes scales as O(N), the effect of errors grows with mesh size, meaning that, in practice, even modestly sized circuits cannot be programmed to high accuracy. Motivated by this challenge, a large body of recent work has focused on “correcting” hardware errors by global optimization8–10, self-configuration11–18, or local correction19,20. For conventional MZI meshes, correction reduces errors by a quadratic factor16,19; however, the effect of errors still grows with mesh size and poses a fundamental limit to the scaling of these circuits.

Fig. 1. Multiport interferometers with imperfect components.

Fig. 1

a Universal 6 × 6 circuit realized by a triangular (Reck1) mesh. b Constituent components of the mesh include phase shifters (ψ) and programmable MZI couplers (θ, ϕ). c Fabrication imperfections lead to splitting-ratio errors α, β. d, e Alternative error-resilient coupler designs proposed in this paper: d 3-splitter MZI and e MZI+crossing.

To overcome this limit, various alternative mesh architectures have been proposed. Non-compact structures such as binary trees avoid the extreme splitting-ratio requirements21,22, but suffer from large chip area and the need for many crossings. A complementary approach is to stick to conventional geometries1,2, but insert redundant MZIs to realize the full range of splitting ratios even in imperfect hardware23–25. This solves the scaling problem, but at the cost of a 1.5–2× increase in the number of splitters and phase shifters. The resulting effects on chip area (particularly on emerging high-speed platforms where phase shifters have a large footprint26,27), waveguide length (which affects insertion loss and latency28), and electronic complexity (number of pads, traces, DACs/drivers, etc.) make this option unappealing.

In this paper, we propose two mesh architectures that achieve the same perfect scaling without significant added complexity: a 3-splitter MZI that corrects all hardware errors (Fig. 1d) and an MZI+crossing design that only corrects correlated errors, but has the added advantage of broader bandwidth (Fig. 1e). These designs take up significantly less chip area than the “perfect” redundant MZIs23,24, and do not require additional phase shifters. Moreover, the proposed architectures support progressive self-configuration16,17, allowing for error correction even when the hardware errors are unknown. This work will enable the development of freely scalable, broadband, and compact linear photonic circuits.

This paper is structured as follows: first we introduce the formalism of error correction in MZI meshes, focusing on the self-configuration approach. Splitting ratios are visualized as points on the Riemann sphere, where forbidden regions emerge as a result of hardware imperfections; these regions are centered at the poles (bar- and cross-state), where the probability density is at a maximum. To avoid this unfortunate coincidence, our architectures “rotate” the Riemann sphere to move the forbidden regions away from this peak, so that a larger fraction of MZIs are perfectly realized. Based on this concept, we introduce the 3-splitter MZI, which can correct arbitrary errors by rotating the forbidden regions to the equator. Using a benchmark optical neural network, we show that this modified MZI mesh is >3× more robust to hardware errors, enabling accurate inference in a regime where standard interferometric circuits struggle. Finally, we introduce the MZI+crossing, which flips the poles of the Riemann sphere. While this design is only robust against correlated errors, it has the added advantage of broader intrinsic bandwidth. For both architectures, we compare the matrix fidelity to the standard MZI to demonstrate the scaling advantage of both schemes.

Results

Error correction formalism

To correctly configure an MZI mesh in the presence of errors, one uses a nulling method based on physical measurements16,17. Figure 2a illustrates the case of the triangular mesh1, where the procedure is more straightforward. The transfer matrix for this system is a product of a phase screen D and a sequence of 2 × 2 unitaries W:

$$U = D \underbrace{\prod_{mn} T_{mn}}_{W} \tag{1}$$

where $T_{mn}$ is the nth MZI of the mth rising diagonal.

Fig. 2. Nulling method of self-configuration.

Fig. 2

a Configuring MZI $T_{mn}$ updates matrix $W$. b Corresponding nulling update to $X = UW^\dagger$, which is c equivalent to zeroing an output of $T_{mn}$ given a fixed input. d Allowed range of $s = T_{11}/T_{12} \in \mathbb{C}$, showing the forbidden regions centered at $s = 0$ and $s = \infty$ that arise from hardware imperfections (α = 0.23, β = 0.07 chosen for illustrative purposes). Contours for (θ, ϕ) are plotted in the accessible region (gray). e Probability density $P(s)$ plotted on the Riemann sphere for meshes of size N = 4, 16, 64, and 256.

We program the mesh using a sequence of steps (Givens rotations), which build up $W$ in order to diagonalize the “target” matrix $X = UW^\dagger$. For every step, we add one 2 × 2 unitary to $W$, performing the update $W \to T_{mn}W$, which right-multiplies the target matrix $X \to XT_{mn}^\dagger$ (Fig. 2b). The 2 × 2 unitary $T_{mn}$ is chosen to null a specific matrix element $v \to 0$ (shaded green in Fig. 2b), which is equivalent to the equation (indices m, n suppressed for notational simplicity):

$$[u \;\; v]\,T^\dagger = [* \;\; 0] \;\Leftrightarrow\; T_{11}/T_{12} = u/v \tag{2}$$

This condition is visualized in Fig. 2c. In hardware, nulling of the (i, j) element of $X$ is implemented by inputting the field $w_j^*$ (jth column of W) and adjusting the MZI parameters (θ, ϕ) to zero the output power at the ith port. If all nulling steps are performed exactly, the mesh will perfectly realize the target matrix $U$ (see Methods and Supplementary Section 1 for details).
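The nulling sequence can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MESHES implementation: the 2 × 2 unitaries are ideal (no splitter errors), the target is a DFT matrix chosen only because it is an easy unitary to build, and the nulling order (bottom row first, left element of each adjacent column pair set to zero) is one valid choice among several. All function names are hypothetical.

```python
import cmath
import math

def dft_unitary(n):
    """Unitary DFT matrix, used here only as a convenient test target."""
    return [[cmath.exp(2j * math.pi * r * c / n) / math.sqrt(n)
             for c in range(n)] for r in range(n)]

def null_lower_triangle(x):
    """Diagonalize a unitary in place by right-multiplying 2x2 column rotations.

    Each step plays the role of X -> X T^dagger in the text, with an ideal
    (error-free) 2x2 unitary. Returns the number of rotations applied.
    """
    n = len(x)
    steps = 0
    for i in range(n - 1, 0, -1):            # rows, bottom to top
        for j in range(i):                    # null x[i][j] using columns j, j+1
            u, v = x[i][j], x[i][j + 1]
            r = math.sqrt(abs(u) ** 2 + abs(v) ** 2)
            if r < 1e-15:                     # both entries already zero
                continue
            # 2x2 unitary g chosen so that [u v] g = [0 r]
            g = [[v / r, u.conjugate() / r],
                 [-u / r, v.conjugate() / r]]
            for row in x:
                a, b = row[j], row[j + 1]
                row[j] = a * g[0][0] + b * g[1][0]
                row[j + 1] = a * g[0][1] + b * g[1][1]
            steps += 1
    return steps

def off_diagonal_norm(x):
    """Frobenius norm of the off-diagonal part of a square matrix."""
    n = len(x)
    return math.sqrt(sum(abs(x[r][c]) ** 2
                         for r in range(n) for c in range(n) if r != c))

x = dft_unitary(6)
null_lower_triangle(x)
print("residual off-diagonal norm:", off_diagonal_norm(x))
```

Because every rotation is unitary and the lower triangle ends up zero, the result is an upper-triangular unitary, which must be diagonal. With imperfect MZIs some rotations cannot be realized, and the leftover entries are the residuals discussed in the text.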

Mathematically, nulling involves setting the (complex-valued) splitting ratio $s \equiv T_{11}/T_{12} = -(T_{22}/T_{21})^*$ of the physical MZI to match the target value $\hat{s} \equiv u/v$ required for diagonalization. In many cases, this is not possible, because the range of admissible splitting ratios $\tan|\alpha+\beta| \le |s| \le \cot|\alpha-\beta|$ is restricted in the presence of hardware imperfections, namely the splitting-angle errors for the 50:50 couplers in a real MZI (α, β in Fig. 1c). Owing to these imperfections, forbidden regions emerge for small and large $|s|$ where perfect nulling is impossible (Fig. 2d). It is also instructive to view this chart on the Riemann sphere, which shows that these forbidden regions are centered around the poles (Fig. 2b), highlighting the well-known fact that imperfect MZIs generally have finite extinction ratio and cannot realize a perfect cross ($s = 0$) or bar ($s = \infty$) state.
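As a small worked example, the admissible window can be evaluated directly. The sketch below assumes the range reads tan|α+β| ≤ |s| ≤ cot|α−β|; the function names are hypothetical, and the sample errors are the illustrative values quoted for Fig. 2d.

```python
import math

def admissible_range(alpha, beta):
    """Return (lower, upper) bounds on |s| for splitter errors alpha, beta."""
    lower = math.tan(abs(alpha + beta))
    diff = abs(alpha - beta)
    # When alpha == beta the bar-state forbidden region vanishes.
    upper = math.inf if diff == 0 else 1.0 / math.tan(diff)
    return lower, upper

def is_realizable(s_hat, alpha, beta):
    """True if the target ratio s_hat lies outside both forbidden regions."""
    lower, upper = admissible_range(alpha, beta)
    return lower <= abs(s_hat) <= upper

lo, hi = admissible_range(0.23, 0.07)
print(f"admissible |s| between {lo:.3f} and {hi:.3f}")
print(is_realizable(0.1 + 0.0j, 0.23, 0.07))   # target near the cross state
```

Targets close to the cross state (small |s|) are the ones most often rejected, which is the source of the scaling problem analyzed next.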

If in a given nulling step $\hat{s}$ falls within the forbidden region, nulling is imperfect, and an off-diagonal residual prevents perfect diagonalization of the matrix, leading to an “uncorrectable” error. This residual is proportional to $d(s,\hat{s})$, the Euclidean distance on the Riemann sphere between the target ratio and the closest realizable $s$. The overall error is the quadrature sum of all such residuals.

For linear photonic circuits, two important fidelity figures of merit are (1) the coverage $C$, i.e., the probability that a matrix is realized exactly, and (2) the normalized matrix error $E = \langle\|\Delta U\|\rangle_{\rm rms}/\sqrt{N}$, a scaled Frobenius norm which is approximately equal to the average relative error for a given matrix element. $C$ and $E$ depend on the error model and the distribution of target matrices. Here, consistent with prior work16,17,19,29, we sample target matrices randomly over the Haar measure30,31 and consider an uncorrelated Gaussian error model $\langle\alpha\rangle_{\rm rms} = \langle\beta\rangle_{\rm rms} = \sigma$.

Analytic expressions for $E$ and $C$ are derived in the Methods, which we summarize here. If a mesh is straightforwardly programmed without taking any account of the imperfections (“uncorrected” error), the normalized error is $E_0 = \sqrt{2N}\,\sigma$ (refs. 16,19). The coverage $C = e^{-N^3\sigma^2/3}$ (Eq. (16)) decreases sufficiently fast that even moderately sized meshes have vanishingly small coverage, and error correction is generally imperfect. In this case, the residual “corrected” error $E_c = (2/3)N\sigma^2$ (Eq. (19)) is the more relevant metric. Since $E_c \propto (E_0)^2$, self-configuration correction affords a quadratic suppression of errors, which is a significant advantage when errors are below a threshold. However, for sufficiently large meshes $N \gtrsim 1/\sigma^2$, error correction will be ineffective and the mesh cannot realize most matrices at high fidelity. Thus, even with error correction, hardware imperfections set a fundamental scaling limit for standard MZI meshes.
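To get a feel for these scalings, the three closed-form expressions for the standard MZI mesh can be tabulated. This sketch assumes the formulas read E0 = sqrt(2N)·σ, Ec = (2/3)·N·σ², and C = exp(−N³σ²/3); the function names are hypothetical.

```python
import math

def uncorrected_error(n, sigma):
    """E_0 for a standard MZI mesh with uncorrelated Gaussian errors."""
    return math.sqrt(2 * n) * sigma

def corrected_error_mzi(n, sigma):
    """E_c after self-configuration on a standard MZI mesh."""
    return (2.0 / 3.0) * n * sigma ** 2

def coverage_mzi(n, sigma):
    """Probability that a Haar-random unitary is realized exactly."""
    return math.exp(-n ** 3 * sigma ** 2 / 3.0)

for n in (16, 64, 256):
    sigma = 0.02
    print(n,
          uncorrected_error(n, sigma),
          corrected_error_mzi(n, sigma),
          coverage_mzi(n, sigma))
```

With these expressions Ec = E0²/3, which is the quadratic suppression mentioned above, and Ec reaches order unity once N is comparable to 1/σ².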

Asymptotically perfect photonic circuits

The main challenge limiting error correction here is that the forbidden regions overlap with the peak of the probability distribution, which clusters tightly around the cross state $s = 0$ (Fig. 2e)29. This clustering happens because light must propagate all the way down a mesh’s diagonals to realize generic unitaries; the forbidden regions disrupt this ballistic transport, leading to clipping of off-diagonal matrix elements10. Adding redundant components (MZI doubling) solves this problem by eliminating the forbidden regions altogether23,24, but at the cost of added optical and electrical complexity. Here, we take the alternative approach of displacing the forbidden regions away from the cross state. This can be performed by placing a third splitter at the input of the MZI, as shown in Fig. 3a. The extra splitter performs a Möbius transformation $s \to (s + i\tan\eta)/(1 + is\tan\eta)$, which for a 50:50 splitting ratio (η = π/4) maps the bar and cross states to $s = \pm i$ (Fig. 3b). This can be visualized as a 90° rotation on the Riemann sphere, which pushes the forbidden regions to the equator, while the probability density is still concentrated at the poles (small errors γ in the third splitter perturb this rotation angle slightly, but this does not change the structure of the forbidden regions and has little effect on the error correction).
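The effect of the third splitter is easy to check numerically. The sketch below assumes the transformation is s → (s + i tan η)/(1 + i s tan η) and treats the point at infinity as a special case; the function name is hypothetical.

```python
import math

def splitter_mobius(s, eta=math.pi / 4):
    """Moebius map applied to the splitting ratio by an input splitter of angle eta.

    Pass math.inf for the point at infinity.
    """
    t = math.tan(eta)
    if s == math.inf:
        return 1 / (1j * t)          # limit of the map as |s| -> infinity
    return (s + 1j * t) / (1 + 1j * s * t)

print(splitter_mobius(0))            # image of s = 0
print(splitter_mobius(math.inf))     # image of s = infinity
print(splitter_mobius(1))            # s = 1 lies on the rotation axis
```

For η = π/4 the two poles land on ±i and the point s = 1 stays fixed, which matches the picture of a 90° rotation of the Riemann sphere about the x axis.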

Fig. 3. 3-splitter MZI design and simulated performance.

Fig. 3

a Schematic of 3-MZI. b Splitter Möbius transformation on $s \in \mathbb{C}$, which pushes the forbidden regions away from $s = \{0, \infty\}$, corresponding to a Riemann sphere rotation. c Dependence of matrix error $E_0$, $E_c$ on the splitter variation σ, contrasting the standard and 3-splitter MZIs (fixed mesh size N = 256). d Scaling of corrected error $E_c$ with mesh size N, showing the qualitative scaling difference between MZI and 3-MZI (fixed splitter variation σ = 0.05). e Corrected error $E_c$ as function of both σ and N. The sudden onset of “perfect” hardware error correction ($E_c = 0$) occurs when the coverage approaches unity ($C \to 1$).

This “3-splitter MZI” (3-MZI) can therefore access the complete range of splitting-ratio magnitudes $|s| \in [0, \infty)$, and can thus function as a high-contrast optical switch24,32. However, forbidden regions are still present for the 3-MZI, which implies that for some configurations, the relative phase of the splitter arg(s) is constrained by hardware errors (unlike the MZI-doubled “perfect” couplers of refs. 23–25, which cure this defect with redundant phase shifters). Nevertheless, the distributions in Fig. 2e show that, for large meshes, $\hat{s}$ will fall into the 3-MZI’s forbidden regions only rarely. The normalized matrix error, calculated in the Methods (Eq. (22)), takes the following form:

$$E_c \approx 8\sigma^2 \left[\frac{2(\log(N) - 1.366)}{N}\right]^{1/2} \tag{3}$$

In Fig. 3c, d, we numerically simulate self-configuration on imperfect meshes using the MESHES package (see Methods and Supplementary code); the realized $E_c$ shows good agreement with Eq. (3). For large meshes N ≳ 64, the matrix error is approximately 1–2 orders of magnitude lower with the 3-MZI. Moreover, the 3-MZI exhibits more favorable error scaling, with the error remarkably decreasing with mesh size as $E_c \propto \sqrt{\log(N)/N}$. This leads to asymptotically fault-tolerant hardware error correction: in the limit $N \to \infty$, matrices can be programmed perfectly.

This non-intuitive effect arises from the fact that, under the Haar measure, only a small fraction of MZIs have significant probability density near $s = \pm i$, where the forbidden regions are centered29. This probability decreases exponentially with the distance from the triangle’s base (see Methods for details). Therefore, although the mesh has N(N − 1)/2 MZIs, only O(N) contribute significantly to the matrix error under self-configuration. A naïve estimate assuming uncorrelated errors would give $\|\Delta U\| \propto \sqrt{N}\sigma^2$, which would lead to a constant $E_c$. However, during the self-configuration process, subsequent MZIs can partially correct for errors in earlier MZIs that cannot be properly configured; the end result is to reduce the overall error of each MZI by a factor proportional to $\sqrt{\log(N)/N}$ (see Methods), yielding the result Eq. (3).

A second benefit to the 3-MZI is its higher threshold for perfect error correction. One obtains this threshold by computing the coverage $C = e^{-16N\sigma^2}$ (see Methods, Eq. (20)). This is much larger than the coverage of the regular MZI mesh, and the threshold scales as $\sigma_{\rm th} \propto N^{-1/2}$, in contrast to the $N^{-3/2}$ dependence seen for the conventional mesh. Consequently, errors are perfectly correctable under a much broader range of circumstances, as shown in Fig. 3e.
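The contrast between the two architectures can be made concrete by evaluating the closed forms side by side. The sketch assumes Eq. (3) reads Ec ≈ 8σ²·sqrt(2(log N − 1.366)/N) and that the coverages are exp(−16Nσ²) for the 3-MZI and exp(−N³σ²/3) for the standard MZI; these are asymptotic expressions, so the numbers are indicative only, and the function names are hypothetical.

```python
import math

def corrected_error_mzi(n, sigma):
    """Standard MZI mesh, E_c = (2/3) N sigma^2."""
    return (2.0 / 3.0) * n * sigma ** 2

def corrected_error_3mzi(n, sigma):
    """3-splitter MZI mesh, assumed form of Eq. (3); needs log(n) > 1.366."""
    return 8 * sigma ** 2 * math.sqrt(2 * (math.log(n) - 1.366) / n)

def coverage_mzi(n, sigma):
    return math.exp(-n ** 3 * sigma ** 2 / 3.0)

def coverage_3mzi(n, sigma):
    return math.exp(-16 * n * sigma ** 2)

sigma = 0.05
for n in (16, 64, 256, 1024):
    print(n, corrected_error_mzi(n, sigma), corrected_error_3mzi(n, sigma))
print(coverage_mzi(256, 0.01), coverage_3mzi(256, 0.01))
```

The standard-mesh error grows linearly with N while the 3-MZI error falls slowly, and at σ = 0.01 the 3-MZI coverage stays of order one where the standard-mesh coverage is negligible.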

Error-resilient optical neural networks

To highlight the significance of this error reduction, consider as a concrete example deep neural network (DNN) inference on coherent optical hardware. A DNN is a sequence of layers, consisting of linear synaptic connections and nonlinear neuron activations. An emerging application of photonics seeks to use optical interference to accelerate this process, encoding neuron activations in coherent optical amplitudes, while a programmable MZI mesh implements the synaptic weights and activations are performed with an all-optical or electro-optic nonlinearity5. Scaling remains the major challenge to constructing practical optical neural networks, as large mesh sizes (N > 100) are required to achieve a significant advantage over electronic hardware, and such large meshes are especially susceptible to fabrication errors. A recent numerical study showed that even with state-of-the-art process tolerances, hardware errors can significantly degrade DNN inference accuracy33, a difficulty that has spurred investigations into alternatives to the MZI mesh, which all have their own limitations34–37.

Figure 4a depicts a benchmark neural network. Here, 28 × 28 images from the MNIST digit dataset38 are preprocessed by a Fourier transform and cropped to a window of size $\sqrt{N} \times \sqrt{N}$, which forms the input to a two-layer unitary DNN. The DNN can be implemented optically with rectangular MZI meshes for synaptic weighting2 and electro-optic nonlinearities for the activation (see refs. 17,39 for details). Models with inner-layer sizes N = 64 and N = 256 are pre-trained using the NEUROPHOX package40, and inference is subsequently simulated on imperfect meshes with Gaussian splitter errors to calculate the classification accuracy.

Fig. 4. Effect of hardware errors on DNN inference.

Fig. 4

a Benchmark neural network consisting of FFT preprocessing, windowing, and two DNN layers, where the linear connections U1 and U2 are realized with MZI meshes17,39. b Inference accuracy as a function of MZI error.

This accuracy is plotted in Fig. 4b for three cases: straightforwardly programming an MZI mesh without error correction, with error correction, and with the modified 3-MZI architecture. Even for small device errors σ = 1–2%, which is considered state-of-the-art for directional couplers in highly controlled fabrication processes41, hardware errors significantly degrade the model’s inference accuracy relative to its canonical value (σ = 0). For small σ, this is recovered using error correction17,19. However, many broadband coupler designs42–47 trade bandwidth for fabrication sensitivity and are in practice very sensitive to process variations, meaning larger splitter errors σ ≳ 5% are common. In this moderate-error regime, error correction alone is not sufficient and the network shows reduced accuracy, a problem that becomes more pronounced as the size N increases. Moving to the 3-MZI architecture overcomes this limitation, enabling effectively error-free inference (relative to the canonical model) even out to very large splitter errors σ ≈ 10–15%, far beyond what is likely to be encountered in practice.

Broadband mesh for correlated errors

For generic, uncorrelated component errors the 3-splitter MZI is well-suited. However, since the correlation lengths of process variations tend to be larger than a single MZI48, errors are correlated in practice. This is especially true for broadband couplers based on multimode interference (MMI)42,43, subwavelength gratings44,45, and asymmetric designs46,47, all of which are highly dependent on the device geometry, which can vary slightly from run to run. Moreover, even with perfect 50:50 couplers, the splitting ratios are still wavelength-dependent. Operating the mesh away from its design wavelength leads to correlated device errors, so sensitivity to these errors is closely tied to the operational bandwidth of the device.

Consider the case of a constant offset μ for all splitting ratios: α = β = μ. In a standard MZI, the bar-state forbidden region (around $s = \infty$) disappears since ∣α − β∣ = 0, while the cross-state region (around $s = 0$, the peak of the probability distribution) remains in place (Fig. 2). This is consistent with the common observation that the extinction ratio in an MZI is much higher in the cross port than in the bar port. The optimal error reduction strategy, illustrated in Fig. 5a, was previously proposed in the context of broadband optical switching: place a waveguide crossing before the MZI49. The added crossing performs the Möbius transformation $s \to 1/s$, rotating the Riemann sphere by 180° to move the forbidden region to the minimum of the probability distribution (Fig. 5b, c).

Fig. 5. MZI+crossing architecture.

Fig. 5

a Schematic of MZI+X. b Effect of the crossing is to flip the $s = 0$ and $s = \infty$ forbidden regions. For correlated errors, the forbidden region around $s = 0$ disappears. c Riemann sphere projection.

As before, we can calculate the coverage and matrix error of this “MZI+crossing” (MZI+X) mesh by performing the nulling procedure on target unitaries, obtaining $C$ from the probabilities that splitting ratios fall within the forbidden regions, and $E_c$ from the residuals arising from imperfect diagonalization. In this case, there is only one forbidden region, centered at $s = \infty$. The calculation is worked out in the Methods. For the normalized error, we find (Eq. (27)):

$$E_c = 4\mu^2 \left[\frac{2}{3}\,\frac{\log(N) - 0.423}{N}\right]^{1/2} \tag{4}$$

This is plotted in Fig. 6. Like the 3-MZI design, this metric scales as $E_c \propto \sqrt{\log(N)/N}\,\mu^2$, in contrast to the trend $E_c = (4/3^{3/2})N\mu^2$ calculated for the standard MZI under correlated errors. The coverage also increases (Eq. (25)), so that the threshold for perfect correction likewise scales as $\mu_{\rm th} \propto N^{-1/2}$, in contrast to the $\mu_{\rm th} \propto N^{-3/2}$ dependence seen in the standard mesh.
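The two correlated-error expressions can be compared numerically. The sketch assumes Eq. (4) reads Ec = 4μ²·sqrt((2/3)(log N − 0.423)/N) for the MZI+X mesh and Ec = (4/3^{3/2})·N·μ² for the standard mesh; the function names are hypothetical.

```python
import math

def corrected_error_mzi_correlated(n, mu):
    """Standard MZI mesh with a common splitter offset mu."""
    return 4.0 / 3.0 ** 1.5 * n * mu ** 2

def corrected_error_mzi_x(n, mu):
    """MZI+crossing mesh, assumed form of Eq. (4); needs log(n) > 0.423."""
    return 4 * mu ** 2 * math.sqrt((2.0 / 3.0) * (math.log(n) - 0.423) / n)

mu = 0.1
for n in (16, 64, 256):
    ratio = corrected_error_mzi_correlated(n, mu) / corrected_error_mzi_x(n, mu)
    print(n, corrected_error_mzi_x(n, mu), ratio)
```

The ratio is independent of μ and grows roughly as N^{3/2}/sqrt(log N), which is the origin of the tuning-range scaling quoted later.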

Fig. 6. Advantages of MZI+crossing architecture for correlated component errors.

Fig. 6

a Dependence of matrix error $E_0$, $E_c$ on splitter error μ (fixed N = 256). b Dependence of $E_c$ on mesh size N (fixed μ = 0.1).

Ultimately, the scalability of the MZI+X architecture is limited by differential errors ∣α − β∣ that arise from local fluctuations in waveguide dimensions. The effect of such errors is analyzed in Supplementary Section 2. For typical photonic process variations, ∣α − β∣ ≪ μ and differential errors are insignificant for mesh sizes up to at least N = 512.

As an added bonus, the MZI+X design also reduces the effect of errors in the absence of correction. To see how, we can make an analogy to Bloch-sphere rotations. The transfer matrix of a standard MZI is (up to a phase factor) the product of four rotations:

$$T(\theta,\phi) \propto R_x\!\left(\tfrac{\pi}{4}+\mu\right) R_z(\theta)\, R_x\!\left(\tfrac{\pi}{4}+\mu\right) R_z(\phi) \tag{5}$$

where $R_k(\eta) = e^{i\sigma_k\eta}$ is a Pauli rotation and $\sigma_k$ is a Pauli matrix. For the cross state (θ = 0), the errors μ add up constructively, while for the bar state (θ = π), they cancel out (the latter is a simple example of dynamical decoupling of spins using a pulse sequence). Most crossings in large meshes are close to the cross state, which leads to constructive addition of the errors in the standard MZI mesh. However, for the MZI+X, the input ports of each MZI are exchanged, so the physical MZIs are close to the bar state where the errors cancel out. The resulting uncorrected matrix error is (see Methods):

$$E_0 = \begin{cases} 2\sqrt{N}\,\mu & \text{(MZI)} \\ 2\sqrt{2(\log N - 1.423)}\,\mu & \text{(MZI+X)} \end{cases} \tag{6}$$

Correlated errors (both corrected and uncorrected) are important because they are tightly connected to the operational bandwidth of the mesh, a critical design parameter for machine learning schemes that require broadband operation, e.g., for parallel processing on wavelength-multiplexed data50–53. All beamsplitters are dispersive, and this dispersion leads to a correlated wavelength-dependent splitter error, which can usually be expanded to first order $\mu \approx (d\mu/d\lambda)\,\Delta\lambda$. Two important wavelength-dependent figures of merit are (1) the tuning range, which refers to the range of λ over which the mesh can be programmed to a given accuracy, Fig. 7a, c, and (2) the bandwidth, which is related to the number of wavelength channels that can be (simultaneously) processed by the mesh, Fig. 7b, d. The tuning range is limited by the corrected error $E_c$, while the bandwidth is limited by the uncorrected error $E_0$, since a mesh cannot simultaneously error-correct at two different wavelengths. Since the MZI+X design reduces both $E_0$ and $E_c$, it leads to enhancements in both the bandwidth and tuning range. The enhancement factors scale as

$$F_{\rm BW} \propto \sqrt{N/\log N}, \qquad F_{\rm TR} \propto (N^3/\log N)^{1/4} \tag{7}$$

and are listed for several mesh sizes in Table 1 (see Methods for details). As Fig. 7c, d illustrates, the MZI+X architecture enjoys a significantly larger tuning range, in addition to modestly greater bandwidth.

Fig. 7. Tuning range and bandwidth for MZI+X and standard MZI mesh, N = 64.

Fig. 7

a, b Contrast between single- and multi-wavelength operation, which are limited by tuning range and bandwidth, respectively. c Plot of Ec(λ), which dictates the tuning range for a target matrix error Emax. d Corresponding plot of E0(λ), which dictates the bandwidth. Platform: 500 × 220 nm Si:SiO2 directional coupler with 200 nm gap, dμ/dλ ≈ 3.27/μm.

Table 1.

Approximate tuning range and bandwidth enhancement factors for mesh sizes up to N = 512, Eqs. (7), (32) and (33)

N       16      32      64      128     256     512
F_TR    5.6×    10×     18×     33×     61×     114×
F_BW    2.4×    2.8×    3.4×    4.3×    5.6×    7.3×
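The bandwidth row of Table 1 can be cross-checked against the uncorrected-error expressions. The sketch assumes Eq. (6) reads E0 = 2·sqrt(N)·μ for the standard mesh and E0 = 2·sqrt(2(log N − 1.423))·μ for the MZI+X mesh, and that the bandwidth enhancement is their ratio at equal μ (reasonable because E0 is linear in μ, and μ is linear in the wavelength offset). The tuning-range row relies on Methods equations not reproduced here and is not checked. Function names are hypothetical.

```python
import math

def e0_mzi(n, mu=1.0):
    """Uncorrected error of the standard mesh under a common offset mu."""
    return 2 * math.sqrt(n) * mu

def e0_mzi_x(n, mu=1.0):
    """Uncorrected error of the MZI+X mesh; needs log(n) > 1.423."""
    return 2 * math.sqrt(2 * (math.log(n) - 1.423)) * mu

def bandwidth_enhancement(n):
    """Ratio of uncorrected errors, independent of mu."""
    return e0_mzi(n) / e0_mzi_x(n)

for n in (16, 32, 64, 128, 256, 512):
    print(n, round(bandwidth_enhancement(n), 2))
```

Under these assumptions the ratio agrees with the tabulated F_BW values to the printed precision for all six mesh sizes.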

Real crossings have a small amount of nonzero crosstalk, quantified by the S-matrix element $S_{21}$; scattering into the forward-facing port leads to a perturbation $R_x(\pi/2) \to R_x(\pi/2 + \gamma)$ in the transfer matrix, where $\gamma = 10^{S_{21}[\mathrm{dB}]/20}$. This does not degrade the effectiveness of self-configuration, since the additional scattering angle merely rotates the Riemann sphere (Fig. 5c) by an additional angle γ ≪ 1, and the forbidden region is still far from $s = 0$. In-plane crossings in silicon can achieve sub-40 dB crosstalk suppression (γ < 0.01) with insertion losses well below 0.1 dB54–58. Unlike directional couplers, crossings are inherently broadband; the insertion loss and crosstalk depend only very weakly on λ, so any crossing imperfections can be treated as (correctable) wavelength-independent errors that do not affect the bandwidth enhancements of the MZI+crossing scheme. In addition to the forward-scattered light, a 90° crossing will scatter light into the backward-facing port. Back-reflected light can be subsequently reflected in other crossings, leading to a spurious signal that interferes with the forward-propagating light. Provided that the phases of reflected beams are random, these add in quadrature: with amplitude $\gamma^2$ and $O(N^2)$ scattering paths, we expect this to induce an $O(N\gamma^2)$ error, which may be uncorrectable and set a limit on scaling. However, if this effect is small, gradient-based methods or iterative self-configuration may enable correction of these errors.
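A quick numerical reading of these crosstalk figures, as a sketch: it assumes γ = 10^(S21[dB]/20) and uses the O(Nγ²) estimate with the unknown prefactor set to one, so the second number is an order-of-magnitude scale only. Function names are hypothetical.

```python
def crosstalk_angle(s21_db):
    """Amplitude crosstalk gamma for a crossing with S21 given in dB."""
    return 10 ** (s21_db / 20)

def backreflection_error_scale(n, gamma):
    """Order-of-magnitude scale N * gamma^2 (prefactor assumed to be 1)."""
    return n * gamma ** 2

gamma = crosstalk_angle(-40.0)
print(gamma, backreflection_error_scale(256, gamma))
```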

Discussion

As photonic circuits grow larger, error tolerance becomes increasingly important. Many techniques exist to manage hardware errors, but all involve a tradeoff between accuracy and complexity. At opposite poles lie “zero-change” error correction, which has limited scalability16,17,19,59, and “perfect” photonic circuits, which require a larger number of photonic and electronic components23,24. This paper has introduced two designs for programmable circuits that strike a tradeoff between these extremes, as shown in Fig. 8 and Table 2, achieving performance that is almost as good as the perfect designs, but with less added complexity (see Supplementary Section 3 for details).

Fig. 8. Comparison of crossing types.

Fig. 8

a MZI, b symmetric (S-MZI)64, c 3-splitter (3-MZI)32, d port-exchanged (MZI+X)49, e Suzuki24, and f Miller23.

Table 2.

Characteristics of the major tunable crossing types

          Complexity                     Features
          Passives   Actives   Area
MZI       2          2         1.0       S
S-MZI     2          2         0.8
3-MZI     3          2         1.2       S (P)
MZI+X     3          2         1.2       S B (P)
Suzuki    3          3         1.5       S P
Miller    4          4         2.0       S P

S self-configuration, B broadband, (P) asymptotically perfect, P perfect.

The main insight from this paper is that, by adding a single passive component (either a splitter or a waveguide crossing) to the MZI, we can recover behavior that is asymptotically perfect—that is, the average normalized matrix error decreases with size. Our design choices are motivated by the elegant theory of self-configuration by matrix diagonalization17, where splitting ratios are set to successively zero the off-diagonal elements of the target unitary. By visualizing the MZI state on the Riemann sphere, we can intuitively understand the increased error robustness of our designs in terms of “rotating” the forbidden regions away from the peak probability density. This leads to a several-orders-of-magnitude reduction in post-correction errors compared to the standard MZI mesh. The ability to achieve near-perfect and freely scalable MZI meshes with less complexity than the MZI-doubled designs23,24 (especially with respect to the number of active components and pads) removes a major roadblock to the realization of very-large-scale nanophotonic systems.

An interesting direction for future work is to explore to what extent multiport interferometers can be made robust to imperfections in the absence of error correction. For example, previous studies of 3-MZI splitters have noted a wavelength-independent coupling ratio for certain parameter choices32. Likewise, the near-cancellation of correlated errors in the MZI+crossing architecture explains the $O(\sqrt{N/\log N})$ reduction in the uncorrected error, and corresponding increase in bandwidth. Further design modifications based on the theory of composite pulse sequences60–62 may allow this imperfect cancellation to be made exact, further improving the bandwidth (and multiplexing capabilities) of linear photonics.

Methods

Unitaries and the Riemann sphere

A generic 2 × 2 complex-valued matrix has eight degrees of freedom, and a 2 × 2 unitary has four. However, the space of 2 × 2 unitaries can be divided into equivalence classes based on the splitting ratio $s = T_{11}/T_{12}$. Specifically, any two unitaries are equivalent up to output phases, i.e., $T = \mathrm{diag}(e^{i\psi_1}, e^{i\psi_2})\,\hat{T}$, if and only if the splitting ratios are the same, $s = \hat{s}$. As a complex number, $s$ can be visualized on the Riemann sphere (Fig. 2d), where the mapping is performed by the stereographic projection $s = (x + iy)/(1 + z)$ (which inverts to $x + iy = 2s/(1 + |s|^2)$, $z = (1 - |s|^2)/(1 + |s|^2)$).

Ordinarily, the distance between matrices is defined as the Frobenius ($L_2$) norm $\|\Delta U\| = (\sum_{mn} |\Delta U_{mn}|^2)^{1/2}$. However, since output phases are corrected in subsequent steps, the most relevant distance metric for a 2 × 2 block is the Frobenius norm modulo these phase shifts,

$$d(T,\hat{T}) \equiv \min_{\psi} \left\| T - \begin{bmatrix} e^{i\psi_1} & \\ & e^{i\psi_2} \end{bmatrix} \hat{T} \right\| = \frac{d(s,\hat{s})}{\sqrt{2}} \tag{8}$$

where $d(s,\hat{s}) = 2|s-\hat{s}|/\sqrt{(|s|^2+1)(|\hat{s}|^2+1)}$ is the Euclidean distance between two points on the Riemann sphere.

A common parameterization is $s = e^{i\phi}\tan(\theta/2)$, which represents the splitting ratio of the standard MZI, Fig. 1b. On the Riemann sphere, (θ, ϕ) map to the standard polar coordinates, i.e., $x = \sin(\theta)\cos(\phi)$, $y = \sin(\theta)\sin(\phi)$, $z = \cos(\theta)$.
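These mappings are straightforward to verify numerically. The sketch below uses the inverse projection x + iy = 2s/(1 + |s|²), z = (1 − |s|²)/(1 + |s|²), which is the form consistent with s = e^{iϕ}tan(θ/2) and z = cos θ, and checks that the closed-form d(s, ŝ) equals the straight-line distance between the two sphere points. Function names are hypothetical.

```python
import cmath
import math

def splitting_ratio(theta, phi):
    """s = exp(i*phi) * tan(theta/2) for a standard MZI."""
    return cmath.exp(1j * phi) * math.tan(theta / 2)

def to_sphere(s):
    """Inverse stereographic projection of a finite complex s onto the unit sphere."""
    a = abs(s) ** 2
    return (2 * s.real / (1 + a), 2 * s.imag / (1 + a), (1 - a) / (1 + a))

def chord_distance(s, t):
    """Closed-form Euclidean (chord) distance between the sphere images of s and t."""
    return 2 * abs(s - t) / math.sqrt((abs(s) ** 2 + 1) * (abs(t) ** 2 + 1))

s = splitting_ratio(0.7, 1.1)
t = splitting_ratio(2.0, -0.4)
print(to_sphere(s))
print(chord_distance(s, t), math.dist(to_sphere(s), to_sphere(t)))
```

The point s = 0 maps to the north pole (z = 1) and |s| → ∞ approaches the south pole, so the two forbidden regions of the standard MZI sit at opposite poles.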

Coverage and matrix error derivation

The nulling method relies on successive zeroing of off-diagonal elements to diagonalize the matrix X (initialized to U). Each nulling step zeros a single element, increasing the size of the zeroed-out off-diagonal region. Nulling steps are performed in a particular order to ensure that zeroed-out elements remain zero after all subsequent steps1,2,17. In a given step, if nulling cannot be achieved perfectly, the “zeroed-out” region of matrix X is left with a residual of magnitude:

$$r = |T_{11}v - T_{12}u| = \sqrt{|u|^2 + |v|^2}\;\frac{d(s,\hat{s})}{2} \tag{9}$$

where $\hat{s}$ is the target splitting ratio, $s$ is the closest physically realizable value, and $d(s,\hat{s})$ is the Euclidean distance on the Riemann sphere, the same metric used in Eq. (8). The coverage and matrix error depend on (1) the distribution $P(s)$ of target splitting ratios, a function of the distribution of target unitaries, and (2) the locations and sizes of the forbidden regions, a function of the specific mesh implementation (MZI, 3-MZI, MZI+X). For the Haar measure, $P(s)$ depends on an MZI’s location in the mesh; for a given $T_{mn}$ it takes the following form29:

$$P_{mn}(s) = \frac{n}{4\pi}\left(\frac{z+1}{2}\right)^{n-1} = \frac{n}{4\pi}\,(1+|s|^2)^{-(n-1)} \tag{10}$$

Here, the density is defined with respect to the area measure on the Riemann sphere

$$d\mu = \sin(\theta)\,d\theta\,d\phi = \frac{4}{(1+|s|^2)^2}\,d^2s \tag{11}$$

so that $\int P_{mn}(s)\,d\mu(s) = 1$. Note that, under Eq. (10), $P_{mn}$ is uniform for the lowest row of crossings, and becomes increasingly concentrated as one approaches the triangle’s apex; as a result, the overall distribution is strongly biased towards the cross state for large meshes, as shown in Fig. 2e (the same distribution also holds for the rectangular mesh, up to a reordering of the MZIs).
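The normalization and the concentration toward the cross state can be checked with a one-dimensional integral, since the density depends only on z = cos θ. The sketch assumes the density is (n/4π)((z+1)/2)^(n−1) with area element dμ = 2π dz after integrating over ϕ; function names are hypothetical.

```python
import math

def haar_density(z, n):
    """P_mn at height z = cos(theta) on the sphere, for MZI index n."""
    return n / (4 * math.pi) * ((z + 1) / 2) ** (n - 1)

def sphere_integral(f, z_min=-1.0, z_max=1.0, steps=20000):
    """Midpoint-rule integral of an azimuthally symmetric f(z) over the band z_min..z_max."""
    dz = (z_max - z_min) / steps
    total = sum(f(z_min + (k + 0.5) * dz) for k in range(steps))
    return 2 * math.pi * total * dz

def mass_above(n, z0):
    """Closed-form probability that z > z0: 1 - ((1 + z0)/2)^n."""
    return 1 - ((1 + z0) / 2) ** n

for n in (1, 4, 16):
    norm = sphere_integral(lambda z: haar_density(z, n))
    print(n, norm, mass_above(n, 0.0))
```

For n = 1 the density is uniform (half the mass lies above the equator), while for n = 16 essentially all of it lies in the hemisphere around s = 0.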

The forbidden regions F± are centered at opposite poles of the Riemann sphere

(s+,s)=(0,)(MZI)(+i,i)(3-MZI)(,0)(MZI+X) 12

and have radii R± = 2∣α ± β∣. In the case of small hardware errors, where P(s) ≈ P(s±) inside each F±, the probability that s^ falls inside the region is given by πR±2P(s±). The coverage C is the probability that every s^ avoids the forbidden regions, and is well approximated by

$$C = \exp\left[-\sum_{mn}\pi\left(P_{mn}(s_+)\langle R_+^2\rangle + P_{mn}(s_-)\langle R_-^2\rangle\right)\right] \tag{13}$$

The normalized matrix error $E_c = \|\Delta U\|_{\rm rms}/\sqrt{N}$ is approximately the quadrature sum of the residuals accumulated during nulling:

$$(E_c)^2 = \frac{\langle\|\Delta U\|^2\rangle}{N} = \frac{2}{N}\sum_{mn}\langle r_{mn}^2\rangle \tag{14}$$

Here, 〈…〉 refers to the ensemble average over both Haar-distributed target unitaries U (refs. 30, 31) and the distribution of hardware errors α, β. We calculate the mean residual $\langle r^2\rangle$ by averaging Eq. (9) over the distribution P(s). This is simplified in the case of small hardware errors, because the forbidden region is correspondingly small and P(s) can be assumed approximately constant within it:

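$$\langle r_{mn}^2\rangle = \frac{\pi}{24}\,\underbrace{\langle |u|^2+|v|^2\rangle}_{q_{mn}}\left[P_{mn}(s_+)\langle R_+^4\rangle + P_{mn}(s_-)\langle R_-^4\rangle\right] \tag{15}$$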

This residual depends on the quantity $q_{mn} = \langle |u|^2 + |v|^2\rangle$, where (u, v) are the matrix elements highlighted in green in Fig. 2b. Following the Gaussian elimination procedure of a Haar matrix, this evaluates to $q_{mn} = (n + 1)/(N + 1 - m)$.
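The prefactor π/24 in Eq. (15) can be recovered with a short calculation. The sketch below assumes that a forbidden region is small enough to be treated as a flat disk of radius R with constant density P, that a target $\hat{s}$ at distance ρ from the center is mapped to the nearest boundary point so that $d(s,\hat{s}) = R - \rho$, and that q is independent of where $\hat{s}$ falls; the symbol ρ is introduced here only for illustration.

```latex
\begin{align*}
\int_{F} d(s,\hat{s})^2\, P\, d\mu
  &\approx P \int_0^{R} (R-\rho)^2\, 2\pi\rho\, d\rho
   = 2\pi P R^4\left(\tfrac{1}{2}-\tfrac{2}{3}+\tfrac{1}{4}\right)
   = \frac{\pi}{6}\, P R^4, \\
\langle r^2 \rangle
  &\approx \frac{q}{4}\cdot\frac{\pi}{6}\, P R^4
   = \frac{\pi}{24}\, q\, P R^4 .
\end{align*}
```

The factor 1/4 comes from squaring the d/2 in Eq. (9). Adding the contributions of both regions and averaging $R^4$ over the hardware errors gives the form of Eq. (15).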

A detailed description of the nulling algorithm, including a comparison to the local method (ref. 19) and global optimization (refs. 8–10), which has a much longer convergence time, is presented in Supplementary Section 1.
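The two sides of the residual formula, Eq. (9), can be compared numerically. The sketch below is illustrative only and rests on assumptions not spelled out in this section: the splitting ratios are taken as s = T11/T12 and ŝ = u/v (so that r vanishes when s = ŝ), the row (T11, T12) is normalized to unit length, and the metric is the chordal distance on a unit-radius sphere. All function names are hypothetical.

```python
import numpy as np

def chordal_distance(s, s_hat):
    """Euclidean (chordal) distance between two points on the unit Riemann sphere."""
    return 2 * abs(s - s_hat) / np.sqrt((1 + abs(s) ** 2) * (1 + abs(s_hat) ** 2))

def residual_direct(T11, T12, u, v):
    """Left-hand side of Eq. (9)."""
    return abs(T11 * v - T12 * u)

def residual_from_distance(T11, T12, u, v):
    """Right-hand side of Eq. (9), with s = T11/T12 and s_hat = u/v (assumed)."""
    d = chordal_distance(T11 / T12, u / v)
    return np.sqrt(abs(u) ** 2 + abs(v) ** 2) * d / 2

rng = np.random.default_rng(0)
# Random unit-norm row (T11, T12) and a random complex pair (u, v)
t = rng.normal(size=2) + 1j * rng.normal(size=2)
T11, T12 = t / np.linalg.norm(t)
u, v = rng.normal(size=2) + 1j * rng.normal(size=2)
print(residual_direct(T11, T12, u, v), residual_from_distance(T11, T12, u, v))
```

Under these assumptions the two printed numbers should agree to machine precision.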

Gaussian errors: MZI and 3-MZI

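For the uncorrelated Gaussian perturbation model with $\langle\alpha\rangle_{\rm rms} = \langle\beta\rangle_{\rm rms} = \sigma$, the forbidden regions are (statistically) symmetric, with moments $\langle R_\pm^2\rangle = 8\sigma^2$ and $\langle R_\pm^4\rangle = 192\sigma^4$.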
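These moments follow from R± = 2∣α ± β∣ with α ± β Gaussian of variance 2σ². A minimal check using Gauss-Hermite quadrature is sketched below; the function name `radius_moments` and the quadrature order are choices made here for illustration.

```python
import numpy as np

def radius_moments(sigma, order=20):
    """Second and fourth moments of R = 2|alpha + beta| for independent
    zero-mean Gaussian alpha, beta with standard deviation sigma."""
    # Nodes/weights for the weight function exp(-x^2 / 2)
    x, w = np.polynomial.hermite_e.hermegauss(order)
    w = w / np.sqrt(2 * np.pi)                 # normalize to a standard normal
    R = 2 * np.abs(np.sqrt(2) * sigma * x)     # alpha + beta ~ N(0, 2 sigma^2)
    return np.sum(w * R ** 2), np.sum(w * R ** 4)

sigma = 0.05
R2, R4 = radius_moments(sigma)
print(R2 / sigma ** 2, R4 / sigma ** 4)   # compare with 8 and 192
```

The same values hold for 2∣α − β∣, since α − β has the same distribution as α + β.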

For the MZI mesh, the coverage expression Eq. (13) is dominated by the s = 0 term, where Pmn(0) = n/4π. Considering only this term, we calculate:

$$C_{\rm MZI} = \exp\left(-\frac{\langle R_+^2\rangle}{4}\sum_{mn} n\right) \approx e^{-N^3\sigma^2/3} \tag{16}$$

where we have replaced the discrete sum by an integral

$$\sum_{mn}(\ldots) \to \int_0^N\!\!\int_0^{N-m}(\ldots)\,dn\,dm \tag{17}$$

which is valid in the limit of large N. Likewise, the top forbidden region dominates the matrix error, so we evaluate Eq. (15) including only the first term in the sum:

$$\langle r_{mn}^2\rangle_{\rm MZI} = \frac{n(n+1)}{N+1-m}\,\frac{\langle R_+^4\rangle}{96} \tag{18}$$

Converting the sum to an integral and substituting $\langle R_+^4\rangle$, we find:

$$(E_c)_{\rm MZI} = \sqrt{\frac{N^2}{432}\langle R_+^4\rangle} = \frac{2}{3}N\sigma^2 \tag{19}$$
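The large-N expressions in Eqs. (16) and (19) can be compared against the discrete sums they approximate. The sketch below is a rough check under an assumed index convention (m = 1, …, N − 1 and n = 1, …, N − m for the triangular mesh); the function name and the example values of N and σ are hypothetical.

```python
import numpy as np

def mzi_gaussian(N, sigma):
    """Discrete sums behind Eqs. (16) and (19), keeping only the s = 0 region."""
    R2, R4 = 8 * sigma ** 2, 192 * sigma ** 4      # moments for Gaussian errors
    log_cov, err2 = 0.0, 0.0
    for m in range(1, N):                          # assumed index range of the triangle
        n = np.arange(1, N - m + 1)
        P0 = n / (4 * np.pi)                       # Eq. (10) at s = 0
        q = (n + 1) / (N + 1 - m)                  # q_mn
        log_cov -= np.sum(np.pi * P0 * R2)         # exponent of Eq. (13)
        err2 += np.sum(np.pi / 24 * q * P0 * R4)   # Eq. (15), first term only
    return log_cov, np.sqrt(2 / N * err2)          # log-coverage and Eq. (14)

N, sigma = 256, 0.01
log_cov, err = mzi_gaussian(N, sigma)
print(log_cov, -N ** 3 * sigma ** 2 / 3)   # discrete sum vs. exponent of Eq. (16)
print(err, 2 / 3 * N * sigma ** 2)         # discrete sum vs. Eq. (19)
```

The log-coverage is returned instead of the coverage itself because the latter underflows toward zero for large meshes.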

Now we redo the calculation for the 3-MZI. In this case, the forbidden regions are located at s± = ± i and contribute equally to the problem. Following Eq. (13), the coverage is given by:

$$C_{\rm 3MZI} = \exp\left(-2\times\sum_{mn}\pi\langle R_\pm^2\rangle P_{mn}(\pm i)\right) \approx e^{-16N\sigma^2} \tag{20}$$

Applying Eq. (15), the mean residual left by crossing Tmn is:

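$$\langle r_{mn}^2\rangle_{\rm 3MZI} = 2\times\frac{\pi}{24}\,\underbrace{\frac{n+1}{N+1-m}}_{q_{mn}}\,\underbrace{\frac{n}{2^{n+1}\pi}}_{P_{mn}(\pm i)}\,\underbrace{(192\sigma^4)}_{\langle R_+^4\rangle} \tag{21}$$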

The factors of two in Eqs. (20) and (21) arise because both forbidden regions contribute equally. This $\langle r_{mn}^2\rangle$ is not slowly varying with (m, n), so we cannot convert the sums to integrals. We first perform the summation over n, which converges rapidly due to the $1/2^{n+1}$ factor (approximating the upper bound as infinity because of the rapid convergence), followed by summation over m. We find the normalized error:

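$$(E_c)_{\text{3-MZI}} = \left[\frac{128\sigma^4}{N}\left(\sum_{n=1}^{N}\frac{1}{n} - \frac{5}{4} - \log(2)\right)\right]^{1/2} \approx 8\sigma^2\left[\frac{2\left(\log(N)+\gamma_e-\frac{5}{4}-\log(2)\right)}{N}\right]^{1/2} \tag{22}$$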

where the discrete sum is approximated using the relation $\sum_{n=1}^{N} n^{-1} \approx \log(N)+\gamma_e$, which defines the Euler–Mascheroni constant $\gamma_e \approx 0.5772$.
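Because the 3-MZI sums cannot be replaced by integrals, a direct numerical evaluation is a useful cross-check of Eqs. (20) and (22). The sketch below assumes the same index convention as before (m = 1, …, N − 1, n = 1, …, N − m), which is not stated explicitly in this section; function names are hypothetical.

```python
import numpy as np

def three_mzi_gaussian(N, sigma):
    """Discrete double sums for the 3-MZI mesh with Gaussian errors."""
    R2, R4 = 8 * sigma ** 2, 192 * sigma ** 4
    log_cov, err2 = 0.0, 0.0
    for m in range(1, N):
        n = np.arange(1, N - m + 1)
        P_i = n / (2.0 ** (n + 1) * np.pi)             # Eq. (10) at s = +i or -i
        q = (n + 1) / (N + 1 - m)                      # q_mn
        log_cov -= 2 * np.sum(np.pi * P_i * R2)        # both regions, Eq. (13)
        err2 += 2 * np.sum(np.pi / 24 * q * P_i * R4)  # both regions, Eq. (15)
    return log_cov, np.sqrt(2 / N * err2)

def three_mzi_closed_form(N, sigma):
    """Exponent of Eq. (20) and the asymptotic form of Eq. (22)."""
    x = np.log(N) + np.euler_gamma - 5 / 4 - np.log(2)
    return -16 * N * sigma ** 2, 8 * sigma ** 2 * np.sqrt(2 * x / N)

for N in (64, 256):
    print(N, three_mzi_gaussian(N, 0.01), three_mzi_closed_form(N, 0.01))
```

With this convention the error agrees with the closed form to well under a percent, while the coverage exponent differs by a relative amount of order 1/N from edge effects in the small-n rows.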

Correlated errors: MZI and MZI+X

Under a correlated error model, α = β = μ. In this case, there is only one forbidden region, which for the MZI is centered at $s_+ = 0$, with $R_+ = 4\mu$. The coverage and matrix error for the standard MZI can then be calculated from Eqs. (16) and (19) with the appropriate substitutions for $\langle R_+^2\rangle$ and $\langle R_+^4\rangle$:

$$C_{\rm MZI} = e^{-(2/3)N^3\mu^2} \tag{23}$$
$$(E_c)_{\rm MZI} = (4/3^{3/2})\,N\mu^2 \tag{24}$$

Now consider the MZI+X. The additional crossing rotates the forbidden region to $s_+ \to \infty$. Only the MZIs in the bottom row of the triangle (n = 1) contribute to the sums in Eqs. (13) and (14), because the probability distribution Eq. (10) vanishes at $s = \infty$ for the upper rows.

In this case, there is only one forbidden region, centered at $s_+ = \infty$, with $R_+ = 4\mu$. The coverage is:

$$C_{\rm MZI+X} = \exp\left(-\sum_{m}\pi R_+^2 P_{m1}(\infty)\right) \approx e^{-4N\mu^2} \tag{25}$$

As before, we use the residual formula Eq. (15) to calculate the matrix error. With the mean residual given by

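$$\langle r_{m1}^2\rangle_{\rm MZI+X} = \frac{\pi}{24}\,\underbrace{\frac{2}{N+1-m}}_{q_{m1}}\,\underbrace{\frac{1}{4\pi}}_{P_{m1}(\infty)}\,\underbrace{(256\mu^4)}_{R_+^4} \tag{26}$$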

and $\langle r_{mn}^2\rangle = 0$ for n > 1, the matrix error evaluates to:

$$(E_c)_{\rm MZI+X} = 4\mu^2\left[\frac{2}{3}\,\frac{\log(N)+\gamma_e-1}{N}\right]^{1/2} \tag{27}$$
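To illustrate how differently the two corrected errors scale, the sketch below evaluates Eqs. (24) and (27) and compares Eq. (27) with the bottom-row sum it approximates. It is a minimal example: the function names and the value of μ are hypothetical, and the row index is assumed to run over m = 1, …, N − 1.

```python
import numpy as np

def corrected_error_mzi(N, mu):
    """Eq. (24): corrected error of the standard MZI mesh, correlated errors."""
    return 4 / 3 ** 1.5 * N * mu ** 2

def corrected_error_mzi_x(N, mu):
    """Eq. (27): corrected error of the MZI+X mesh, correlated errors."""
    return 4 * mu ** 2 * np.sqrt(2 / 3 * (np.log(N) + np.euler_gamma - 1) / N)

def corrected_error_mzi_x_sum(N, mu):
    """Same quantity from the discrete sum over the bottom row (n = 1)."""
    m = np.arange(1, N)                                       # assumed index range
    q = 2 / (N + 1 - m)                                       # q_m1
    r2 = np.pi / 24 * q * (1 / (4 * np.pi)) * (4 * mu) ** 4   # Eq. (26)
    return np.sqrt(2 / N * np.sum(r2))                        # Eq. (14)

mu = 0.02
for N in (16, 64, 256):
    print(N, corrected_error_mzi(N, mu), corrected_error_mzi_x(N, mu),
          corrected_error_mzi_x_sum(N, mu))
```

In this sketch the MZI error grows linearly with N, while the MZI+X error decreases with N, which is the asymptotic behavior the derivation above describes.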

Now we consider the uncorrected matrix error. For the standard MZI mesh, this is $E_0 = 2\sqrt{N}\mu$ (ref. 17). Using the transfer matrix of the standard MZI

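$$T_{\alpha,\beta}(\theta,\phi) = R_x\!\left(\frac{\pi}{4}+\beta\right)\begin{bmatrix} e^{i\theta} & 0 \\ 0 & 1 \end{bmatrix} R_x\!\left(\frac{\pi}{4}+\alpha\right)\begin{bmatrix} e^{i\phi} & 0 \\ 0 & 1 \end{bmatrix} \tag{28}$$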

to first order in (α, β), the norm of the matrix error is:

$$\|\Delta T_{\rm MZI}\|^2 = 2\left[\cos^2(\theta/2)(\alpha+\beta)^2 + \sin^2(\theta/2)(\alpha-\beta)^2\right] \tag{29}$$

which is maximized when the MZI is in the cross state θ = 0. For the MZI+crossing (Fig. 5a), we find:

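$$T^{(X)}_{\alpha,\beta}(\theta,\phi) = R_x\!\left(\frac{\pi}{4}+\beta\right)\begin{bmatrix} e^{i\theta} & 0 \\ 0 & -1 \end{bmatrix} R_x\!\left(\frac{\pi}{4}+\alpha\right)\begin{bmatrix} e^{i\phi} & 0 \\ 0 & 1 \end{bmatrix} R_x\!\left(\frac{\pi}{2}\right) = e^{i(\theta+\phi)}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} T_{\alpha,-\beta}(-\theta,-\phi) \tag{30}$$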

Up to irrelevant output phases, the effect of the crossing is to flip the relative sign of α and β, so the component errors appear anticorrelated. As a result, $\|\Delta T_{\rm MZI+X}\| \propto \sin(\theta/2)\,\mu$, which is zero for the cross state. The actual error is found by adding the $\|\Delta T_{mn}\|$ in quadrature and averaging over the probability distribution $P_{mn}(\theta) = n\sin(\theta/2)\cos(\theta/2)^{2n-1}$ (equivalent to Eq. (10)):

$$E_0 = 2\sqrt{2(\log N + \gamma_e - 2)}\,\mu \tag{31}$$
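The first-order error formula, Eq. (29), and the sign-flip identity for the MZI+crossing can both be checked with 2 × 2 matrices. The sketch below is illustrative and depends on conventions assumed here: the splitter matrix `rx` has cos(η) on the diagonal and i sin(η) off the diagonal, the norm is the Frobenius norm, and the internal phase matrix of the MZI+X is taken as diag(e^{iθ}, −1) so that θ = 0 is the cross state of the combined device. Function names are hypothetical.

```python
import numpy as np

def rx(eta):
    """Assumed splitter matrix: cos(eta) on the diagonal, i*sin(eta) off it."""
    return np.array([[np.cos(eta), 1j * np.sin(eta)],
                     [1j * np.sin(eta), np.cos(eta)]])

def mzi(alpha, beta, theta, phi):
    """Eq. (28) with splitter angles pi/4 + alpha and pi/4 + beta."""
    return (rx(np.pi / 4 + beta) @ np.diag([np.exp(1j * theta), 1])
            @ rx(np.pi / 4 + alpha) @ np.diag([np.exp(1j * phi), 1]))

def mzi_x(alpha, beta, theta, phi):
    """MZI with a crossing rx(pi/2) appended on the right of the product.
    The -1 in the internal phase matrix is an assumed sign placement."""
    return (rx(np.pi / 4 + beta) @ np.diag([np.exp(1j * theta), -1])
            @ rx(np.pi / 4 + alpha) @ np.diag([np.exp(1j * phi), 1])
            @ rx(np.pi / 2))

a, b, theta, phi = 1e-3, -4e-4, 0.7, 0.3

# Eq. (29): squared Frobenius norm of the error, to leading order in (a, b)
err2 = np.linalg.norm(mzi(a, b, theta, phi) - mzi(0, 0, theta, phi)) ** 2
pred = 2 * (np.cos(theta / 2) ** 2 * (a + b) ** 2
            + np.sin(theta / 2) ** 2 * (a - b) ** 2)
print(err2, pred)

# Sign-flip identity: crossing maps (alpha, beta) -> (alpha, -beta)
lhs = mzi_x(a, b, theta, phi)
rhs = np.exp(1j * (theta + phi)) * np.diag([1, -1]) @ mzi(a, -b, -theta, -phi)
print(np.allclose(lhs, rhs))
```

With correlated errors (a = b = μ), the same functions give an MZI+X error norm close to 2√2 sin(θ/2) μ, which vanishes at θ = 0.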

For a wavelength-dependent splitter error $\mu \approx (d\mu/d\lambda)\,\Delta\lambda$, the tuning range and bandwidth can be calculated from the expressions for $E_c$ (Eq. (27)) and $E_0$ (Eq. (31)), respectively: the tuning range is the range over which $E_c(\lambda) < E_{\max}$, while the bandwidth is the range over which $E_0(\lambda) < E_{\max}$:

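$$\Delta\lambda_{\rm TR} = \sqrt{E_{\max}}\,\frac{d\lambda}{d\mu}\times\begin{cases} 3^{3/4}/\sqrt{N} & \text{(MZI)} \\ \left[\dfrac{3N}{2(\log N - 0.42)}\right]^{1/4} & \text{(MZI+X)} \end{cases} \tag{32}$$
$$\Delta\lambda_{\rm BW} = E_{\max}\,\frac{d\lambda}{d\mu}\times\begin{cases} 1/\sqrt{N} & \text{(MZI)} \\ 1/\sqrt{2(\log N - 1.42)} & \text{(MZI+X)} \end{cases} \tag{33}$$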

From these expressions, we derive the enhancement factors reported in Eq. (7) and Table 1.
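The tuning-range and bandwidth expressions can be evaluated directly, as in the sketch below. The function names and the example values of `E_max` and `dlam_dmu` are hypothetical, and the constants 0.42 and 1.42 are written as 1 − γe and 2 − γe. The closing check assumes that Δλ denotes a full width, so that the largest splitter error inside the range is μ = (dμ/dλ)Δλ/2; this interpretation is inferred from the prefactors and is not stated in the text.

```python
import numpy as np

def tuning_range(N, E_max, dlam_dmu, design):
    """Eq. (32): range over which the corrected error stays below E_max."""
    if design == "MZI":
        factor = 3 ** 0.75 / np.sqrt(N)
    elif design == "MZI+X":
        factor = (3 * N / (2 * (np.log(N) + np.euler_gamma - 1))) ** 0.25
    else:
        raise ValueError(f"unknown design: {design}")
    return np.sqrt(E_max) * dlam_dmu * factor

def bandwidth(N, E_max, dlam_dmu, design):
    """Eq. (33): range over which the uncorrected error stays below E_max."""
    if design == "MZI":
        factor = 1 / np.sqrt(N)
    elif design == "MZI+X":
        # Requires log(N) + gamma_e - 2 > 0, i.e. N of at least 5
        factor = 1 / np.sqrt(2 * (np.log(N) + np.euler_gamma - 2))
    else:
        raise ValueError(f"unknown design: {design}")
    return E_max * dlam_dmu * factor

N, E_max, dlam_dmu = 64, 0.1, 1.0   # hypothetical values, arbitrary units
for design in ("MZI", "MZI+X"):
    print(design, tuning_range(N, E_max, dlam_dmu, design),
          bandwidth(N, E_max, dlam_dmu, design))

# Consistency with Eq. (24) under the full-width assumption
mu_edge = tuning_range(N, E_max, dlam_dmu, "MZI") / (2 * dlam_dmu)
print(4 / 3 ** 1.5 * N * mu_edge ** 2, E_max)
```

The ratios of the MZI+X to MZI values give the enhancement of each figure of merit for a chosen N.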

Neural network model

The optical neural network model is based on the architecture described in ref. 11. Input images are first Fourier transformed and cropped to a $\sqrt{N}\times\sqrt{N}$ window, where N is the DNN’s inner layer size. The signal from this window (N input neurons) passes through two optical layers, with unitary connectivity realized with rectangular meshes. The activation function at the inner layer is realized electro-optically: a fraction of each output field is tapped off and sent to a detector, whose photocurrent modulates the remaining output light (refs. 28, 39), implementing the activation function:

$$f(E) = \sqrt{1-\alpha}\; e^{-i(g|E|^2+\phi-\pi)/2}\cos\!\left(\tfrac{1}{2}\left(g|E|^2+\phi\right)\right) \tag{34}$$

where α is the power tap fraction, g is the modulator response, and ϕ is the phase at zero power. Here, we choose α = 0.1, g = π/20, and ϕ = π, so that f(E) approximates a leaky ReLU in the right power regime. Models of sizes N = 64 and N = 256 were trained using the NEUROPHOX package40.
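For reference, the expression in Eq. (34) can be evaluated as in the sketch below, using the parameter values quoted above. The function name `eo_response` is hypothetical, the field is taken in units where ∣E∣² is the optical power entering the tap, and no claim is made here about how this factor is combined with the field in the trained models.

```python
import numpy as np

def eo_response(E, alpha=0.1, g=np.pi / 20, phi=np.pi):
    """Eq. (34) for a complex field amplitude E (scalar or array)."""
    x = g * np.abs(E) ** 2 + phi
    return np.sqrt(1 - alpha) * np.exp(-1j * (x - np.pi) / 2) * np.cos(x / 2)

power = np.array([0.0, 1.0, 5.0, 10.0, 20.0])   # |E|^2, arbitrary units
print(np.abs(eo_response(np.sqrt(power))))
```

With ϕ = π the magnitude vanishes at zero power and reaches its maximum of √(1 − α) when g∣E∣² = π, that is at ∣E∣² = 20 for g = π/20.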

Simulations and data analysis

All simulations were performed using the MESHES package, an open-source simulator for feedforward photonic circuits that can account for hardware imperfections63. Figures 3, 4, 6, and 7 plot multiple instances (usually ≥100) per point; dots show medians while shaded regions show the interquartile range. Source code to produce the plots for this manuscript is provided in the Supplementary material.

Supplementary information

Supplementary Information (699.5KB, pdf)
Peer review file (3.2MB, pdf)

Acknowledgements

S.B. is supported by an NSF Graduate Research Fellowship. D.E. acknowledges funding from AFOSR (no. FA9550-20-1-0113, FA9550-16-1-0391). The authors thank Prof. David A. B. Miller and Dr Sunil Pai for helpful discussions.

Source data

Source Data (2.7MB, zip)

Author contributions

S.B. and R.H. jointly conceived the idea. R.H. developed the theory, performed the simulations and data analysis, and wrote the manuscript. R.H., S.B., and D.E. contributed to discussion of the results.

Peer review

Peer review information

Nature Communications thanks Sinan Gündoğdu and the other anonymous reviewer(s) for their contribution to the peer review of this work. Peer review reports are available.

Data availability

All data from this paper can be generated using the MESHES package (ref. 63) and source code files provided in the Supplementary material. Source Data are provided with this paper.

Code availability

Source code files are provided in the Supplementary material.

Competing interests

R.H., S.B., and D.E. are inventors on patent applications No. 63/151,103 and 63/196,301 describing methods for self-configuration and error correction in linear photonic circuits.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

The online version contains supplementary material available at 10.1038/s41467-022-34308-3.

References

  • 1. Reck M, Zeilinger A, Bernstein HJ, Bertani P. Experimental realization of any discrete unitary operator. Phys. Rev. Lett. 1994;73:58. doi: 10.1103/PhysRevLett.73.58.
  • 2. Clements WR, Humphreys PC, Metcalf BJ, Kolthammer WS, Walmsley IA. Optimal design for universal multiport interferometers. Optica. 2016;3:1460–1465. doi: 10.1364/OPTICA.3.001460.
  • 3. Carolan J, et al. Universal linear optics. Science. 2015;349:711–716. doi: 10.1126/science.aab3642.
  • 4. Zhong H-S, et al. Quantum computational advantage using photons. Science. 2020;370:1460–1463.
  • 5. Shen Y, et al. Deep learning with coherent nanophotonic circuits. Nat. Photon. 2017;11:441. doi: 10.1038/nphoton.2017.93.
  • 6. Marpaung D, et al. Integrated microwave photonics. Laser Photonics Rev. 2013;7:506–538. doi: 10.1002/lpor.201200032.
  • 7. Zhuang L, Roeloffzen CG, Hoekman M, Boller K-J, Lowery AJ. Programmable photonic signal processor chip for radiofrequency applications. Optica. 2015;2:854–859. doi: 10.1364/OPTICA.2.000854.
  • 8. Burgwal R, et al. Using an imperfect photonic network to implement random unitaries. Opt. Express. 2017;25:28236–28245. doi: 10.1364/OE.25.028236.
  • 9. Mower J, Harris NC, Steinbrecher GR, Lahini Y, Englund D. High-fidelity quantum state evolution in imperfect photonic integrated circuits. Phys. Rev. A. 2015;92:032322. doi: 10.1103/PhysRevA.92.032322.
  • 10. Pai S, Bartlett B, Solgaard O, Miller DA. Matrix optimization on universal unitary photonic devices. Phys. Rev. Appl. 2019;11:064044. doi: 10.1103/PhysRevApplied.11.064044.
  • 11. Pai S, et al. Parallel programming of an arbitrary feedforward photonic network. IEEE J. Sel. Top. Quantum Electron. 2020;26:1–13.
  • 12. Hughes TW, Minkov M, Shi Y, Fan S. Training of photonic neural networks through in situ backpropagation and gradient measurement. Optica. 2018;5:864–871. doi: 10.1364/OPTICA.5.000864.
  • 13. Miller DA. Self-aligning universal beam coupler. Opt. Express. 2013;21:6360–6370. doi: 10.1364/OE.21.006360.
  • 14. Miller DA. Self-configuring universal linear optical component. Photonics Res. 2013;1:1–15. doi: 10.1364/PRJ.1.000001.
  • 15. Miller DA. Setting up meshes of interferometers–reversed local light interference method. Opt. Express. 2017;25:29233–29248. doi: 10.1364/OE.25.029233.
  • 16. Hamerly R, Bandyopadhyay S, Englund D. Stability of self-configuring large multiport interferometers. Phys. Rev. Appl. 2022;18:024018. doi: 10.1103/PhysRevApplied.18.024018.
  • 17. Hamerly R, Bandyopadhyay S, Englund D. Accurate self-configuration of rectangular multiport interferometers. Phys. Rev. Appl. 2022;18:024019. doi: 10.1103/PhysRevApplied.18.024019.
  • 18. Annoni A, et al. Unscrambling light—automatically undoing strong mixing between modes. Light Sci. Appl. 2017;6:e17110. doi: 10.1038/lsa.2017.110.
  • 19. Bandyopadhyay S, Hamerly R, Englund D. Hardware error correction for programmable photonics. Optica. 2021;8:1247–1255. doi: 10.1364/OPTICA.424052.
  • 20. Kumar SP, et al. Mitigating linear optics imperfections via port allocation and compilation. Preprint at arXiv:2103.03183 (2021).
  • 21. López-Pastor VJ, Lundeen JS, Marquardt F. Arbitrary optical wave evolution with fourier transforms and phase masks. Opt. Express. 2021;29:38441–38450. doi: 10.1364/OE.432787.
  • 22. Basani JR, Vadlamani SK, Bandyopadhyay S, Englund DR, Hamerly R. A self-similar sine-cosine fractal architecture for multiport interferometers. Preprint at arXiv:2209.03335 (2022).
  • 23. Miller DA. Perfect optics with imperfect components. Optica. 2015;2:747–750. doi: 10.1364/OPTICA.2.000747.
  • 24. Suzuki K, et al. Ultra-high-extinction-ratio 2 × 2 silicon optical switch with variable splitter. Opt. Express. 2015;23:9086–9092. doi: 10.1364/OE.23.009086.
  • 25. Wilkes CM, et al. 60 dB high-extinction auto-configured Mach-Zehnder interferometer. Opt. Lett. 2016;41:5318–5321. doi: 10.1364/OL.41.005318.
  • 26. Wu R, et al. Fabrication of a multifunctional photonic integrated chip on lithium niobate on insulator using femtosecond laser-assisted chemomechanical polish. Opt. Lett. 2019;44:4698–4701. doi: 10.1364/OL.44.004698.
  • 27. Dong M, et al. High-speed programmable photonic circuits in a cryogenically compatible, visible–near-infrared 200 mm CMOS architecture. Nat. Photon. 2022;16:59–65. doi: 10.1038/s41566-021-00903-x.
  • 28. Bandyopadhyay S, et al. Single chip photonic deep neural network with accelerated training. Preprint at arXiv:2208.01623 (2022).
  • 29. Russell NJ, Chakhmakhchyan L, O’Brien JL, Laing A. Direct dialling of Haar random unitary matrices. N. J. Phys. 2017;19:033007. doi: 10.1088/1367-2630/aa60ed.
  • 30. Haar A. Der massbegriff in der theorie der kontinuierlichen gruppen. Ann. Math. 1933;34:147–169.
  • 31. Tung W-K. Group Theory in Physics: An Introduction to Symmetry Principles, Group Representations, and Special Functions in Classical and Quantum Physics (World Scientific Publishing Company, 1985).
  • 32. Wang M, Ribero A, Xing Y, Bogaerts W. Tolerant, broadband tunable 2 × 2 coupler circuit. Opt. Express. 2020;28:5555–5566. doi: 10.1364/OE.384018.
  • 33. Fang MY-S, Manipatruni S, Wierzynski C, Khosrowshahi A, DeWeese MR. Design of optical neural networks with component imprecisions. Opt. Express. 2019;27:14009–14029. doi: 10.1364/OE.27.014009.
  • 34. Tait AN, et al. Neuromorphic photonic networks using silicon photonic weight banks. Sci. Rep. 2017;7:7430. doi: 10.1038/s41598-017-07754-z.
  • 35. Hamerly R, Bernstein L, Sludds A, Soljačić M, Englund D. Large-scale optical neural networks based on photoelectric multiplication. Phys. Rev. X. 2019;9:021032.
  • 36. Bernstein L, et al. Single-shot optical neural network. Preprint at arXiv:2205.09103 (2022).
  • 37. Chen Z, et al. Deep learning with coherent VCSEL neural networks. Preprint at arXiv:2207.05329 (2022).
  • 38. LeCun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proc. IEEE. 1998;86:2278–2324. doi: 10.1109/5.726791.
  • 39. Williamson IA, et al. Reprogrammable electro-optic nonlinear activation functions for optical neural networks. IEEE J. Sel. Top. Quantum Electron. 2019;26:1–12. doi: 10.1109/JSTQE.2019.2930455.
  • 40. Pai S. Neurophox: A Simulation Framework for Unitary Neural Networks and Photonic Devices. https://github.com/solgaardlab/neurophox (2020).
  • 41. Mikkelsen JC, Sacher WD, Poon JK. Dimensional variation tolerant silicon-on-insulator directional couplers. Opt. Express. 2014;22:3145–3150. doi: 10.1364/OE.22.003145.
  • 42. Soldano LB, Pennings EC. Optical multi-mode interference devices based on self-imaging: principles and applications. J. Light. Technol. 1995;13:615–627. doi: 10.1109/50.372474.
  • 43. Maese-Novo A, et al. Wavelength independent multimode interference coupler. Opt. Express. 2013;21:7033–7040. doi: 10.1364/OE.21.007033.
  • 44. Wang Y, et al. Compact broadband directional couplers using subwavelength gratings. IEEE Photonics J. 2016;8:1–8. doi: 10.1109/JPHOT.2016.2633560.
  • 45. Ye C, Dai D. Ultra-compact broadband 2 × 2 3 dB power splitter using a subwavelength-grating-assisted asymmetric directional coupler. J. Light. Technol. 2020;38:2370–2375. doi: 10.1109/JLT.2020.2973663.
  • 46. Morino H, Maruyama T, Iiyama K. Reduction of wavelength dependence of coupling characteristics using Si optical waveguide curved directional coupler. J. Light. Technol. 2014;32:2188–2192. doi: 10.1109/JLT.2014.2321660.
  • 47. Lu Z, et al. Broadband silicon photonic directional coupler using asymmetric-waveguide based phase control. Opt. Express. 2015;23:3795–3808. doi: 10.1364/OE.23.003795.
  • 48. Bogaerts W, Xing Y, Khan U. Layout-aware variability analysis, yield prediction, and optimization in photonic integrated circuits. IEEE J. Sel. Top. Quantum Electron. 2019;25:1–13. doi: 10.1109/JSTQE.2019.2906271.
  • 49. Suzuki K, et al. Low-insertion-loss and power-efficient 32 × 32 silicon photonics switch with extremely high-δ silica PLC connector. J. Light. Technol. 2018;37:116–122. doi: 10.1109/JLT.2018.2867575.
  • 50. Feldmann J, et al. Parallel convolutional processing using an integrated photonic tensor core. Nature. 2021;589:52–58. doi: 10.1038/s41586-020-03070-1.
  • 51. Xu X, et al. 11 tops photonic convolutional accelerator for optical neural networks. Nature. 2021;589:44–51. doi: 10.1038/s41586-020-03063-0.
  • 52. Sludds A, et al. Delocalized photonic deep learning on the internet’s edge. Science. 2022;378:270–276.
  • 53. Davis R III, Chen Z, Hamerly R, Englund D. Frequency-encoded deep learning with speed-of-light dominated latency. Preprint at arXiv:2207.06883 (2022).
  • 54. Fukazawa T, Hirano T, Ohno F, Baba T. Low loss intersection of Si photonic wire waveguides. Jpn. J. Appl. Phys. 2004;43:646. doi: 10.1143/JJAP.43.646.
  • 55. Chen H, Poon AW. Low-loss multimode-interference-based crossings for silicon wire waveguides. IEEE Photon. Technol. Lett. 2006;18:2260–2262. doi: 10.1109/LPT.2006.884726.
  • 56. Ma Y, et al. Ultralow loss single layer submicron silicon waveguide crossing for SOI optical interconnect. Opt. Express. 2013;21:29374–29382. doi: 10.1364/OE.21.029374.
  • 57. Dumais P, Goodwill D, Celo D, Jiang J, Bernier E. Three-mode synthesis of slab gaussian beam in ultra-low-loss in-plane nanophotonic silicon waveguide crossing. In 2017 IEEE 14th International Conference on Group IV Photonics (GFP) 97–98 (IEEE, 2017).
  • 58. Wu S, Mu X, Cheng L, Mao S, Fu H. State-of-the-art and perspectives on silicon waveguide crossings: a review. Micromachines. 2020;11:326. doi: 10.3390/mi11030326.
  • 59. Vadlamani SK, Englund D, Hamerly R. Transferable learning on analog hardware. Preprint at arXiv:2210.06632 (2022).
  • 60. Brown KR, Harrow AW, Chuang IL. Arbitrarily accurate composite pulse sequences. Phys. Rev. A. 2004;70:052318. doi: 10.1103/PhysRevA.70.052318.
  • 61. Bulmer J, Jones J, Walmsley I. Drive-noise tolerant optical switching inspired by composite pulses. Opt. Express. 2020;28:8646–8657. doi: 10.1364/OE.378469.
  • 62. Little BE, Murphy T. Design rules for maximally flat wavelength-insensitive optical power dividers using Mach-Zehnder structures. IEEE Photon. Technol. Lett. 1997;9:1607–1609. doi: 10.1109/68.643284.
  • 63. Hamerly R. Meshes: Tools for Modeling Photonic Beamsplitter Mesh Networks. https://github.com/QPG-MIT/meshes (2021).
  • 64. Bell BA, Walmsley IA. Further compactifying linear optical unitaries. APL Photonics. 2021;6:070804. doi: 10.1063/5.0053421.


Articles from Nature Communications are provided here courtesy of Nature Publishing Group
