Entropy. 2018 Aug 23;20(9):628. doi: 10.3390/e20090628

Frequentist and Bayesian Quantum Phase Estimation

Yan Li 1, Luca Pezzè 2, Manuel Gessner 2, Zhihong Ren 1, Weidong Li 1,*, Augusto Smerzi 2
PMCID: PMC7513152  PMID: 33265717

Abstract

Frequentist and Bayesian phase estimation strategies lead to conceptually different results on the state of knowledge about the true value of an unknown parameter. We compare the two frameworks and their sensitivity bounds to the estimation of an interferometric phase shift limited by quantum noise, considering both the cases of a fixed and a fluctuating parameter. We point out that frequentist precision bounds, such as the Cramér–Rao bound, for instance, do not apply to Bayesian strategies and vice versa. In particular, we show that the Bayesian variance can overcome the frequentist Cramér–Rao bound, which appears to be a paradoxical result if the conceptual differences between the two approaches are overlooked. Similarly, bounds for fluctuating parameters make no statement about the estimation of a fixed parameter.

Keywords: quantum metrology, Bayesian estimation, parameter estimation

1. Introduction

The estimation of a phase shift using interferometric techniques is at the core of metrology and sensing [1,2,3]. Applications range from the definition of the standard of time [4] to the detection of gravitational waves [5,6]. The general problem can be concisely stated as the search for optimal strategies to minimize the phase estimation uncertainty. The noise that limits the achievable phase sensitivity can have a "classical" or a "quantum" nature. Classical noise originates from the coupling of the interferometer with some external source of disturbance, such as seismic vibrations or parasitic magnetic fields, or from incoherent interactions within the interferometer. Such noise can, in principle, be arbitrarily reduced, e.g., by shielding the interferometer from external noise or by tuning interaction parameters to ensure a fully coherent time evolution. The second source of uncertainty has an irreducible quantum origin [7,8]. Quantum noise cannot be fully suppressed, even in the idealized case of the creation and manipulation of pure quantum states. Using classically-correlated probe states, it is possible to reach the so-called shot noise or standard quantum limit, which is the limiting factor for the current generation of interferometers and sensors [9,10,11,12]. Strategies involving probe states characterized by squeezed quadratures [13] or entanglement between particles [14,15,16,17,18,19] are able to overcome the shot noise, the ultimate quantum bound being the so-called Heisenberg limit. Quantum noise reduction in phase estimation has been demonstrated in several proof-of-principle experiments with atoms and photons [20,21].

There is a vast amount of literature dealing with the parameter estimation problem that has been mostly developed following two different approaches [22,23,24]: frequentist and Bayesian. Both approaches have been investigated in the context of quantum phase estimation [18,20,25,26,27,28,29,30,31] and implemented/tested experimentally [32,33,34,35,36]. They build on conceptually different meanings attached to the word “probability” and their respective results provide conceptually different information on the estimated parameters and their uncertainties.

In the limit of a large number of repeated measurements, the sensitivities reached by the frequentist and Bayesian methods generally agree: this fact has often induced the belief that the two paradigms can be used interchangeably in phase estimation theory, without acknowledging their irreconcilable nature. Overlooking these differences is not only conceptually inconsistent but can even create paradoxes, such as the apparent violation, in one paradigm, of ultimate sensitivity bounds proven in the other.

In this manuscript, we directly compare the frequentist and the Bayesian parameter estimation theories. We study different sensitivity bounds obtained in the two frameworks and highlight the conceptual differences between them. Besides the asymptotic regime of many repeated measurements, we also study bounds that are relevant for small samples. In particular, we show that the Bayesian variance can overcome the frequentist Cramér–Rao bound. The Cramér–Rao bound is a mathematical theorem providing the highest possible sensitivity in a phase estimation problem, so the fact that the Bayesian sensitivity can be higher appears paradoxical. The paradox is solved by clarifying the conceptual differences between the frequentist and the Bayesian approaches, which therefore cannot be directly compared. Such differences should be considered when discussing theoretical and experimental figures of merit in interferometric phase estimation.

Our results are illustrated with a simple test model [37,38]. We consider N qubits with basis states |0⟩ and |1⟩, initially prepared in a (generalized) GHZ state $|{\rm GHZ}\rangle = (|0\rangle^{\otimes N} + |1\rangle^{\otimes N})/\sqrt{2}$, with all particles being either in |1⟩ or in |0⟩. The phase encoding is a rotation of each qubit in the Bloch sphere, $|0\rangle \to e^{-i\theta/2}|0\rangle$ and $|1\rangle \to e^{+i\theta/2}|1\rangle$, which transforms the |GHZ⟩ state into $|{\rm GHZ}(\theta)\rangle = (e^{-iN\theta/2}|0\rangle^{\otimes N} + e^{+iN\theta/2}|1\rangle^{\otimes N})/\sqrt{2}$. The phase is estimated by measuring the parity $(-1)^{N_0}$, where N0 is the number of particles in the state |0⟩ [37,39,40,41]. The parity measurement has two possible results μ = ±1 that are conditioned by the "true value of the phase shift" θ0 with probability $p(\pm 1|\theta_0) = (1 \pm \cos N\theta_0)/2$. The probability to observe the sequence of results μ = {μ1, μ2, …, μm} in m independent repetitions of the experiment (with the same probe state and phase-encoding transformation) is

$p(\mu|\theta_0) = \prod_{i=1}^{m} p(\mu_i|\theta_0) = \left(\frac{1+\cos N\theta_0}{2}\right)^{m_+}\left(\frac{1-\cos N\theta_0}{2}\right)^{m_-},$ (1)

where m± is the number of observed results ±1, respectively. Notice that p(μ|θ0) is the conditional probability for the measurement outcome μ, given that the true value of the phase shift is θ0 (which we consider to be unknown in the estimation protocol). Equation (1) provides the probability that will be used in the following sections for the case N = 2 and θ0 ∈ [0, π/2]. Section 2 and Section 3 deal with the case where θ0 has a fixed value, and in Section 4 we discuss precision bounds for a fluctuating phase shift.
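As a concrete illustration, the likelihood of Equation (1) is easy to simulate numerically. The following Python sketch is our own code (names such as `sample_outcomes` are not from the paper); it draws m parity outcomes for N = 2 and evaluates Equation (1) for a given data record:

```python
import numpy as np

# Minimal sketch of the measurement model of Equation (1) for N = 2 qubits:
# p(+1|theta0) = (1 + cos(N*theta0))/2, with outcomes mu_i = +/-1.
N_QUBITS = 2

def p_plus(theta0):
    """Probability of parity outcome +1 given the true phase theta0."""
    return (1.0 + np.cos(N_QUBITS * theta0)) / 2.0

def sample_outcomes(theta0, m, rng):
    """Draw m independent parity outcomes +/-1."""
    return np.where(rng.random(m) < p_plus(theta0), 1, -1)

def likelihood(mu, theta0):
    """p(mu|theta0) of Equation (1), with m_+ and m_- counted from the data."""
    m_plus, m_minus = np.sum(mu == 1), np.sum(mu == -1)
    return p_plus(theta0)**m_plus * (1.0 - p_plus(theta0))**m_minus

rng = np.random.default_rng(seed=1)
mu = sample_outcomes(theta0=np.pi / 4, m=20, rng=rng)
print(likelihood(mu, np.pi / 4))
```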

2. Frequentist Approach

In the frequentist paradigm, the phase (assumed to have a fixed but unknown value θ0) is estimated via an arbitrarily chosen function of the measurement results, θest(μ), called the estimator. Typically, θest(μ) is chosen by maximizing the likelihood of the observed data (see below). The estimator, being a function of random outcomes, is itself a random variable. It is characterized by a statistical distribution that has an objective, measurable character. The relative frequency with which the event θest occurs converges to a probability asymptotically with the number of repeated experimental trials.

2.1. Frequentist Risk Functions

Statistical fluctuations of the data reflect the statistical uncertainty of the estimation. This is quantified by the variance,

$(\Delta^2\theta_{\rm est})_{\mu|\theta_0} = \sum_{\mu}\left(\theta_{\rm est}(\mu) - \langle\theta_{\rm est}\rangle_{\mu|\theta_0}\right)^2 p(\mu|\theta_0),$ (2)

around the mean value $\langle\theta_{\rm est}\rangle_{\mu|\theta_0} = \sum_{\mu}\theta_{\rm est}(\mu)\, p(\mu|\theta_0)$, the sum extending over all possible measurement sequences (for fixed θ0 and m). An important class is that of locally unbiased estimators, namely those satisfying $\langle\theta_{\rm est}\rangle_{\mu|\theta_0} = \theta_0$ and $d\langle\theta_{\rm est}\rangle_{\mu|\theta}/d\theta\,|_{\theta=\theta_0} = 1$ (see, for instance, [42]). An estimator is unbiased if and only if it is locally unbiased at every θ0.

The quality of the estimator can also be quantified by the mean square error (MSE) [23]

$\mathrm{MSE}(\theta_{\rm est})_{\mu|\theta_0} = \sum_{\mu}\left(\theta_{\rm est}(\mu) - \theta_0\right)^2 p(\mu|\theta_0),$ (3)

giving the deviation of θest from the true value of the phase shift θ0. It is related to Equation (2) by the relation

$\mathrm{MSE}(\theta_{\rm est})_{\mu|\theta_0} = (\Delta^2\theta_{\rm est})_{\mu|\theta_0} + \left(\langle\theta_{\rm est}\rangle_{\mu|\theta_0} - \theta_0\right)^2.$ (4)

In the frequentist approach, the variance alone is often not considered a proper way to quantify the goodness of an estimator. For instance, an estimator that always gives the same value independently of the measurement outcomes is strongly biased: it has zero variance but a large MSE that does not scale with the number of repeated measurements. Notice that the MSE cannot be accessed from the experimentally available data, since the true value θ0 is unknown. In this sense, only the fluctuations of θest around its mean value, i.e., the variance (Δ²θest)μ|θ0, have experimental relevance. For unbiased estimators, Equations (2) and (4) coincide. In general, since the bias term in Equation (4) is never negative, MSE(θest)μ|θ0 ≥ (Δ²θest)μ|θ0, and any lower bound on (Δ²θest)μ|θ0 automatically provides a lower bound on MSE(θest)μ|θ0, but not vice versa. In the following section, we therefore limit our attention to bounds on (Δ²θest)μ|θ0. The distinction between the two quantities becomes more important in the case of a fluctuating phase shift θ0, where the bias can affect the corresponding bounds in different ways. We will see this explicitly in Section 4.

2.2. Frequentist Bounds on Phase Sensitivity

2.2.1. Barankin Bound

The Barankin bound (BB) provides the tightest lower bound to the variance (2) [43]. It can be proven to be always (for any m) saturable, in principle, by a specific local (i.e., dependent on θ0) estimator and measurement observable. Of course, since the estimator that saturates the BB depends on the true value of the parameter (which is unknown), the bound is not of much use in practice. Nevertheless, the BB plays a central role from the theoretical point of view, as it provides a hierarchy of weaker bounds which can be used in practice with estimators that are asymptotically unbiased. The BB can be written as [44]

$(\Delta^2\theta_{\rm est})_{\mu|\theta_0} \geq \Delta^2\theta_{\rm BB} \equiv \sup_{\{\theta_i, a_i, n\}} \frac{\left(\sum_{i=1}^{n} a_i\left[\langle\theta_{\rm est}\rangle_{\mu|\theta_i} - \langle\theta_{\rm est}\rangle_{\mu|\theta_0}\right]\right)^2}{\sum_{\mu}\left(\sum_{i=1}^{n} a_i L(\mu|\theta_i,\theta_0)\right)^2 p(\mu|\theta_0)},$ (5)

where L(μ|θi,θ) = p(μ|θi)/p(μ|θ) is generally referred to as the likelihood ratio, and the supremum is taken over the n parameters ai ∈ ℝ, which are arbitrary real numbers, and θi, which are arbitrary phase values in the parameter domain. For unbiased estimators, we can replace ⟨θest⟩μ|θi = θi for all i, and the BB becomes independent of the estimator:

$(\Delta^2\theta_{\rm est})_{\mu|\theta_0} \geq \Delta^2\theta_{\rm BB}^{\rm ub} \equiv \sup_{\{\theta_i, a_i, n\}} \frac{\left(\sum_{i=1}^{n} a_i\left[\theta_i - \theta_0\right]\right)^2}{\sum_{\mu}\left(\sum_{i=1}^{n} a_i L(\mu|\theta_i,\theta_0)\right)^2 p(\mu|\theta_0)}.$ (6)

A derivation of the BB is presented in Appendix A.

The explicit calculation of Δ2θBB is impractical in most applications due to the number of free variables that must be optimized. However, the BB provides a strict hierarchy of bounds of increasing complexity that can be of great practical importance. Restricting the number of variables in the optimization can provide local lower bounds that are much simpler to determine at the expense of not being saturable in general, namely, for an arbitrary number of measurements. Below, we demonstrate the following hierarchy of bounds:

$(\Delta^2\theta_{\rm est})_{\mu|\theta_0} \geq \Delta^2\theta_{\rm BB} \geq \Delta^2\theta_{\rm EChRB} \geq \Delta^2\theta_{\rm ChRB} \geq \Delta^2\theta_{\rm CRLB},$ (7)

where Δ2θCRLB is the Cramér–Rao lower bound (CRLB) [45,46] and Δ2θChRB is the Hammersley–Chapman–Robbins bound (ChRB) [47,48]. We will also introduce a novel extended version of the ChRB, indicated as Δ2θEChRB.

2.2.2. Cramér–Rao Lower Bound and Maximum Likelihood Estimator

The CRLB is the most common frequentist bound in parameter estimation. It is given by [45,46]:

$\Delta^2\theta_{\rm CRLB} = \frac{\left(d\langle\theta_{\rm est}\rangle_{\mu|\theta_0}/d\theta_0\right)^2}{m F(\theta_0)}.$ (8)

The inequality (Δ²θest)μ|θ0 ≥ Δ²θCRLB is obtained by differentiating ⟨θest⟩μ|θ0 with respect to θ0 and using a Cauchy–Schwarz inequality:

$\left(\frac{d\langle\theta_{\rm est}\rangle_{\mu|\theta_0}}{d\theta_0}\right)^2 = \left(\sum_{\mu}\left(\theta_{\rm est}(\mu) - \langle\theta_{\rm est}\rangle_{\mu|\theta_0}\right)\frac{dp(\mu|\theta_0)}{d\theta_0}\right)^2 \leq m F(\theta_0)\,(\Delta^2\theta_{\rm est})_{\mu|\theta_0},$ (9)

where we have used $\sum_{\mu} dp(\mu|\theta_0)/d\theta_0 = 0$ and, for m independent measurements, the additivity property $\sum_{\mu}\frac{1}{p(\mu|\theta_0)}\left(\frac{\partial p(\mu|\theta)}{\partial\theta}\big|_{\theta_0}\right)^2 = m F(\theta_0)$, where the sum runs over the measurement sequences μ, and

$F(\theta_0) = \sum_{\mu}\frac{1}{p(\mu|\theta_0)}\left(\frac{\partial p(\mu|\theta)}{\partial\theta}\Big|_{\theta_0}\right)^2$ (10)

is the Fisher information (of a single measurement). The equality (Δ²θest)μ|θ0 = Δ²θCRLB is achieved if and only if

$\theta_{\rm est}(\mu) - \langle\theta_{\rm est}\rangle_{\mu|\theta_0} = \lambda_{\theta_0}\frac{d\log p(\mu|\theta_0)}{d\theta_0},$ (11)

with λθ0 a parameter independent of μ (while it may depend on θ0). Noticing that $\frac{d\langle\theta_{\rm est}\rangle_{\mu|\theta_0}}{d\theta_0} = \sum_{\mu}\left(\theta_{\rm est}(\mu) - f(\theta_0)\right)\frac{dp(\mu|\theta_0)}{d\theta_0}$ for any function f(θ0) independent of μ, the CRLB can be straightforwardly generalized. In particular, choosing f(θ0) = θ0, we can directly prove that MSE(θest)μ|θ0 ≥ Δ²θCRLB, which also depends on the bias.

Asymptotically in m, the saturation of Equation (8) is obtained for the maximum likelihood estimator (MLE) [22,23,49]. This is the value θMLE(μ) that maximizes the likelihood p(μ|θ0) (as a function of the parameter θ0) for the observed measurement sequence μ,

$\theta_{\rm MLE}(\mu) \equiv \arg\max_{\theta_0}\{p(\mu|\theta_0)\}.$ (12)

For a sufficiently large sample size m (in the central limit), independently of the probability distribution p(μ|θ0), the MLE becomes normally distributed [18,22,23,49]:

$p(\theta_{\rm MLE}|\theta_0) = \sqrt{\frac{m F(\theta_0)}{2\pi}}\, e^{-\frac{m F(\theta_0)}{2}\left(\theta_0 - \theta_{\rm MLE}\right)^2} \quad (m \gg 1),$ (13)

with mean given by the true value θ0 and variance equal to the inverse of the Fisher information of the full sample, 1/(mF(θ0)). The MLE is well defined provided that there is a unique maximum in the considered phase interval. In the case of Equation (1), this condition is fulfilled provided that one restricts the phase domain, for instance to [0, π/(2N)].

In Figure 1, we plot the results of a maximum likelihood analysis for the example considered in this manuscript. In this case, the MLE is readily calculated and given by $\theta_{\rm MLE}(\mu) = \frac{1}{2}\arccos\left(\frac{m_+ - m_-}{m_+ + m_-}\right)$, and the Fisher information is F(θ0) = N², independent of θ0 (we recall that N = 2 in our example). In Figure 1a, we plot the bias ⟨θMLE⟩μ|θ0 − θ0 (dots) as a function of m, for θ0 = π/4. Error bars are ±ΔθCRLB. Notice that ⟨θMLE⟩μ|θ0 = θ0 for every m. This does not mean that the estimator is locally unbiased: indeed, the derivative d⟨θMLE⟩μ|θ0/dθ0 [see panel (b)] is different from 1 for every value of m, and d⟨θMLE⟩μ|θ0/dθ0 → 1 only asymptotically in m. In Figure 1b, we plot mF(θ0)(Δ²θMLE)μ|θ0 as a function of the number of independent measurements m (red dots). This quantity is compared to mF(θ0)Δ²θCRLB = (d⟨θMLE⟩μ|θ0/dθ0)² (red line). With increasing sample size m, (Δ²θMLE)μ|θ0 → 1/(mF(θ0)), corresponding to the CRLB for unbiased estimators.
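The convergence of mF(θ0)(Δ²θMLE)μ|θ0 to 1 can be checked with a quick Monte Carlo simulation. The sketch below is our own illustration (not the code used for Figure 1); it relies on m+ being a sufficient statistic of the data record:

```python
import numpy as np

# Our own Monte Carlo check of the MLE theta_MLE = arccos((m_+ - m_-)/m)/2
# against the CRLB for unbiased estimators, 1/(m F) with F = N**2 = 4.
N, theta0 = 2, np.pi / 4
rng = np.random.default_rng(seed=2)

def mle(m_plus, m):
    """MLE from the number of +1 outcomes (the sufficient statistic)."""
    return 0.5 * np.arccos((2.0 * m_plus - m) / m)

p_plus = (1 + np.cos(N * theta0)) / 2
for m in (10, 100, 1000):
    m_plus = rng.binomial(m, p_plus, size=200_000)
    est = mle(m_plus, m)
    print(m, m * N**2 * np.var(est))   # approaches 1 as m grows
```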

Figure 1. (a) Bias ⟨θMLE⟩μ|θ0 − θ0 (green dots) as a function of m, with error bars (ΔθMLE)μ|θ0. The red lines are $\pm\Delta\theta_{\rm CRLB} = \pm|d\langle\theta_{\rm MLE}\rangle_{\mu|\theta_0}/d\theta_0|/\sqrt{m F(\theta_0)}$; (b) variance of the maximum likelihood estimator multiplied by the Fisher information, mF(θ0)(Δ²θMLE)μ|θ0 (red circles), as a function of the sample size m, compared to (d⟨θMLE⟩μ|θ0/dθ0)² (red dashed line). We recall that θ0 = π/4 and F(θ0) = 4 here.

2.2.3. Hammersley–Chapman–Robbins Bound

The ChRB is obtained from Equation (5) by taking n = 2, a1 = 1, a2 = −1, θ1 = θ0 + λ, θ2 = θ0, and can be written as [47,48]

$\Delta^2\theta_{\rm ChRB} = \sup_{\lambda}\frac{\left(\langle\theta_{\rm est}\rangle_{\mu|\theta_0+\lambda} - \langle\theta_{\rm est}\rangle_{\mu|\theta_0}\right)^2}{\sum_{\mu}\frac{p(\mu|\theta_0+\lambda)^2}{p(\mu|\theta_0)} - 1}.$ (14)

Clearly, restricting the number of parameters in the optimization in Equation (5) leads to a less strict bound. We thus have Δ²θBB ≥ Δ²θChRB. For unbiased estimators, we obtain

$\Delta^2\theta_{\rm ChRB}^{\rm ub} = \sup_{\lambda}\frac{\lambda^2}{\sum_{\mu}\frac{p(\mu|\theta_0+\lambda)^2}{p(\mu|\theta_0)} - 1}.$ (15)

Furthermore, the supremum over λ on the right-hand side of Equation (14) is always larger than or equal to its limit for λ → 0:

$\sup_{\lambda}\frac{\left(\langle\theta_{\rm est}\rangle_{\mu|\theta_0+\lambda} - \langle\theta_{\rm est}\rangle_{\mu|\theta_0}\right)^2}{\sum_{\mu}\frac{p(\mu|\theta_0+\lambda)^2}{p(\mu|\theta_0)} - 1} \geq \lim_{\lambda\to 0}\frac{\left(\langle\theta_{\rm est}\rangle_{\mu|\theta_0+\lambda} - \langle\theta_{\rm est}\rangle_{\mu|\theta_0}\right)^2}{\sum_{\mu}\frac{p(\mu|\theta_0+\lambda)^2}{p(\mu|\theta_0)} - 1} = \frac{\left(\frac{d\langle\theta_{\rm est}\rangle_{\mu|\theta_0}}{d\theta_0}\right)^2}{m\sum_{\mu}\frac{1}{p(\mu|\theta_0)}\left(\frac{dp(\mu|\theta_0)}{d\theta_0}\right)^2},$ (16)

provided that the derivatives on the right-hand side exist. We thus recover the CRLB as a limiting case of the ChRB. The ChRB is always stricter than the CRLB, and we obtain the last inequality in the chain (7). Notice that the CRLB requires the probability distribution p(μ|θ0) to be differentiable [24], a condition that can be dropped for the ChRB and the more general BB. Even if the distribution is regular, the above derivation shows that the ChRB and, more generally, the BB provide tighter error bounds than the CRLB. With increasing n, the BB becomes tighter and tighter, and the CRLB represents the weakest bound in this hierarchy, as can be observed in Figure 2a. Next, we determine a stricter bound in this hierarchy.

Figure 2. (a) Comparison between unbiased frequentist bounds for the example considered in this manuscript, Equation (1): the CRLB mΔ²θCRLB^ub = 1/F(θ0) (black line), the Hammersley–Chapman–Robbins bound mΔ²θChRB^ub (Equation (15), filled triangles) and the extended Hammersley–Chapman–Robbins bound mΔ²θEChRB^ub (Equation (18), empty triangles); (b) values of λ achieving the supremum in Equation (15), as a function of m.

2.2.4. Extended Hammersley–Chapman–Robbins Bound

We obtain the extended Hammersley–Chapman–Robbins bound (EChRB) as a special case of Equation (5), by taking n = 3, a1 = 1, a2 = A, a3 = −(1+A), θ1 = θ0 + λ1, θ2 = θ0 + λ2, and θ3 = θ0, giving

$\Delta^2\theta_{\rm EChRB} = \sup_{\lambda_1,\lambda_2,A}\frac{\left(\langle\theta_{\rm est}\rangle_{\mu|\theta_0+\lambda_1} + A\langle\theta_{\rm est}\rangle_{\mu|\theta_0+\lambda_2} - (1+A)\langle\theta_{\rm est}\rangle_{\mu|\theta_0}\right)^2}{\sum_{\mu}\frac{\left(p(\mu|\theta_0+\lambda_1) + A\, p(\mu|\theta_0+\lambda_2) - (1+A)\, p(\mu|\theta_0)\right)^2}{p(\mu|\theta_0)}},$ (17)

where the supremum is taken over all possible λ1, λ2 and A ∈ ℝ. Since the ChRB is obtained from Equation (17) in the specific case A = 0, we have that Δ²θEChRB ≥ Δ²θChRB. For unbiased estimators, we obtain

$\Delta^2\theta_{\rm EChRB}^{\rm ub} = \sup_{\lambda_1,\lambda_2,A}\frac{\left(\lambda_1 + A\lambda_2\right)^2}{\sum_{\mu}\frac{\left(p(\mu|\theta_0+\lambda_1) + A\, p(\mu|\theta_0+\lambda_2) - (1+A)\, p(\mu|\theta_0)\right)^2}{p(\mu|\theta_0)}}.$ (18)

In Figure 2a, we compare the different bounds for unbiased estimators and for the example considered in the manuscript: the CRLB (black line), the ChRB (filled triangles) and the EChRB (empty triangles), satisfying the chain of inequalities (7). In Figure 2b, we show the values of λ in Equation (15) for which the supremum is achieved in our case.
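For the i.i.d. model of Equation (1), the sums over measurement sequences in Equations (15) and (18) factorize, since $\sum_{\mu} p(\mu|\theta_a)\,p(\mu|\theta_b)/p(\mu|\theta_0) = s(\theta_a,\theta_b)^m$ with a single-measurement sum s. The following sketch is our own code (the grid ranges are our choice, not taken from the paper) and evaluates the unbiased CRLB, ChRB and EChRB numerically:

```python
import numpy as np
from itertools import product

# Our own sketch: unbiased CRLB, ChRB and EChRB for the model of Equation (1),
# at theta0 = pi/4, N = 2. For i.i.d. sequences the sums over mu factorize:
# sum_mu p(mu|a) p(mu|b) / p(mu|theta0) = s(a, b)**m.
N, theta0, m = 2, np.pi / 4, 10

def probs(theta):
    p = (1 + np.cos(N * theta)) / 2
    return np.array([p, 1 - p])          # single-shot outcome probabilities

def s(a, b):
    """Single-shot sum over outcomes of p(mu|a) p(mu|b) / p(mu|theta0)."""
    return float(np.sum(probs(a) * probs(b) / probs(theta0)))

def chrb_ub(lams):
    """Equation (15), supremum on a grid of lambda values."""
    return max(l**2 / (s(theta0 + l, theta0 + l)**m - 1) for l in lams)

def echrb_ub(lams, As):
    """Equation (18), supremum on a grid of (lambda1, lambda2, A)."""
    best = 0.0
    for l1, l2, A in product(lams, lams, As):
        den = (s(theta0 + l1, theta0 + l1)**m
               + 2 * A * s(theta0 + l1, theta0 + l2)**m
               + A**2 * s(theta0 + l2, theta0 + l2)**m
               - (1 + A)**2)
        if den > 1e-12:
            best = max(best, (l1 + A * l2)**2 / den)
    return best

lams = np.linspace(0.01, np.pi / 4 - 0.01, 50)
As = np.linspace(-3, 3, 61)
print("CRLB :", 1 / (m * N**2))
print("ChRB :", chrb_ub(lams))
print("EChRB:", echrb_ub(lams, As))
```

Since A = 0 is included in the grid, the printed EChRB is never smaller than the ChRB, in agreement with the chain (7).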

3. Bayesian Approach

The Bayesian approach makes use of the Bayes–Laplace theorem, which can be very simply stated and proved. The joint probability of two stochastic variables μ and θ is symmetric: p(μ,θ)=p(μ|θ)p(θ)=p(θ|μ)p(μ)=p(θ,μ), where p(θ) and p(μ) are the marginal distributions, obtained by integrating the joint probability over one of the two variables, while p(μ|θ) and p(θ|μ) are conditional distributions.

We recall that in a phase inference problem, the set of measurement results μ is generated by a fixed and unknown value θ0 according to the likelihood p(μ|θ0). In the Bayesian approach to the estimation of θ0, one introduces a random variable θ and uses the Bayes–Laplace theorem to define the conditional probability

$p_{\rm post}(\theta|\mu) = \frac{p(\mu|\theta)\, p_{\rm pri}(\theta)}{p_{\rm mar}(\mu)}.$ (19)

The posterior probability ppost(θ|μ) provides a degree of belief, or plausibility, that θ0 = θ (i.e., that θ is the true value of the phase), in the light of the measurement data μ [50]. In Equation (19), the prior distribution ppri(θ) expresses the a priori state of knowledge on θ, p(μ|θ) is the likelihood determined by the quantum mechanical measurement postulate, e.g., as in Equation (1), and the marginal probability $p_{\rm mar}(\mu) = \int_a^b d\theta\, p(\theta,\mu)$ is obtained from the normalization of the posterior, where a and b are the boundaries of the phase domain. The posterior probability describes the current knowledge about the random variable θ based on the available information, i.e., the measurement results μ.

3.1. Noninformative Prior

In the Bayesian approach, the information on θ provided by the posterior probability always depends on the prior distribution ppri(θ). It is possible to account for the available a priori information on θ by choosing a prior distribution accordingly. However, if no a priori information is available, it is not obvious how to choose a "noninformative" prior [51]. The flat prior ppri(θ) = const was first introduced by Laplace to express the absence of information on θ [51]. However, this prior would not be flat for other functions of θ and, in the complete absence of a priori information, it seems unreasonable that some information is available for different parametrizations of the problem. To see this, recall that a transformation of variables requires $p_{\rm pri}(\varphi) = p_{\rm pri}(\theta)\left|df^{-1}(\varphi)/d\varphi\right|$ for any function φ = f(θ). Hence, if ppri(θ) is flat, one obtains a $p_{\rm pri}(\varphi) \propto \left|df^{-1}(\varphi)/d\varphi\right|$ that is, in general, not flat.

Notice that $p_{\rm pri}(\theta) \propto \sqrt{F(\theta)}$, called Jeffreys prior [52,53], where F(θ) is the Fisher information (10), remains invariant under re-parametrization. For arbitrary transformations φ = f(θ), the Fisher information obeys the transformation property $F(\varphi) = F(\theta)\left(d\theta/d\varphi\right)^2 = F(\theta)\left(df^{-1}(\varphi)/d\varphi\right)^2$. Therefore, if $p_{\rm pri}(\theta) \propto \sqrt{F(\theta)}$ and we perform the change of variable φ = f(θ), the transformation property of the Fisher information ensures that $p_{\rm pri}(\varphi) = p_{\rm pri}(\theta)\left|df^{-1}(\varphi)/d\varphi\right| \propto \sqrt{F(\varphi)}$. Notice that, as in our case, the Fisher information F(θ) may actually be independent of θ. In this case, the invariance property does not imply that Jeffreys prior is flat for arbitrary re-parametrizations φ = f(θ); instead, $\sqrt{F(\varphi)} \propto \left|df^{-1}(\varphi)/d\varphi\right|$.
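As a toy numerical check of the transformation rule ppri(φ) = ppri(θ)|df⁻¹(φ)/dφ|, the sketch below (our own example, with the arbitrary choice f(θ) = θ²) verifies that both densities remain normalized while the transformed prior is no longer flat:

```python
import numpy as np

# Our own toy check: a flat prior on theta is not flat in phi = f(theta) = theta**2,
# but the transformed density p(phi) = p(theta) |d f^{-1}(phi)/d phi| stays normalized.

def integrate(y, x):
    """Trapezoidal integration (small helper to avoid version-specific APIs)."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

theta = np.linspace(1e-3, np.pi / 2, 100_000)
p_theta = np.full_like(theta, 2 / np.pi)      # flat prior on [0, pi/2]

phi = theta**2                                 # phi = f(theta), monotonic here
p_phi = p_theta / (2 * np.sqrt(phi))           # |d f^{-1}(phi)/d phi| = 1/(2 sqrt(phi))

print(integrate(p_theta, theta), integrate(p_phi, phi))   # both ~1.0
```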

3.2. Posterior Bounds

From the posterior probability (19), we can provide an estimate θBL(μ) of θ0. This can be the maximum a posteriori, $\theta_{\rm BL}(\mu) = \arg\max_{\theta} p_{\rm post}(\theta|\mu)$, which coincides with the maximum likelihood estimator, Equation (12), when the prior is flat, ppri(θ) = const, or the mean of the distribution, $\theta_{\rm BL}(\mu) = \int_a^b d\theta\, \theta\, p_{\rm post}(\theta|\mu)$.

With the Bayesian approach, it is possible to provide a confidence interval around the estimator, given an arbitrary measurement sequence μ, even with a single measurement. The variance

$(\Delta^2\theta_{\rm BL}(\mu))_{\theta|\mu} = \int_a^b d\theta\, p_{\rm post}(\theta|\mu)\left(\theta - \theta_{\rm BL}(\mu)\right)^2$ (20)

can be taken as a measure of the fluctuation of our degree of belief around θBL(μ). There is no such concept in the frequentist paradigm. The Bayesian posterior variance (Δ²θBL(μ))θ|μ and the frequentist variance (Δ²θBL)μ|θ0 have entirely different operational meanings. Equation (20) provides a degree of plausibility that θBL(μ) = θ0, given the measurement results μ. There is no notion of bias in this case. On the other hand, the quantity (Δ²θBL)μ|θ0 measures the statistical fluctuations of θBL(μ) when repeating the sequence of m measurements infinitely many times.

Ghosh Bound

In the following, we derive a lower bound to Equation (20), first introduced by Ghosh [54]. Using $\int_a^b d\theta\, p_{\rm post}(\theta|\mu) = 1$ and integrating by parts, we have

$\int_a^b d\theta\left(\theta - \theta_{\rm BL}(\mu)\right)\frac{dp_{\rm post}(\theta|\mu)}{d\theta} = \Big[\left(\theta - \theta_{\rm BL}(\mu)\right) p_{\rm post}(\theta|\mu)\Big]_a^b - \int_a^b d\theta\, p_{\rm post}(\theta|\mu) = f_{\mu,a,b} - 1,$ (21)

where $f_{\mu,a,b} = b\, p_{\rm post}(b|\mu) - a\, p_{\rm post}(a|\mu) - \theta_{\rm BL}(\mu)\left(p_{\rm post}(b|\mu) - p_{\rm post}(a|\mu)\right)$ depends on the values of the posterior distribution at the boundaries. If ppri(a) = ppri(b) = 0, we have fμ,a,b = 0. Analogously to the derivation of the (frequentist) CRLB, we exploit the Cauchy–Schwarz inequality,

$\int_a^b d\theta\, \frac{1}{p_{\rm post}(\theta|\mu)}\left(\frac{dp_{\rm post}(\theta|\mu)}{d\theta}\right)^2 \int_a^b d\theta\, p_{\rm post}(\theta|\mu)\left(\theta - \theta_{\rm BL}(\mu)\right)^2 \geq \left(f_{\mu,a,b} - 1\right)^2,$

leading to (Δ²θBL(μ))θ|μ ≥ Δ²θGB(μ), where [54]

$\Delta^2\theta_{\rm GB}(\mu) = \frac{\left(f_{\mu,a,b} - 1\right)^2}{\int_a^b d\theta\, \frac{1}{p_{\rm post}(\theta|\mu)}\left(\frac{dp_{\rm post}(\theta|\mu)}{d\theta}\right)^2}.$ (22)

The above bound is a function of the specific measurement sequence μ and depends on $\int_a^b d\theta\, \frac{1}{p_{\rm post}(\theta|\mu)}\left(\frac{dp_{\rm post}(\theta|\mu)}{d\theta}\right)^2$, which we can identify as a "Fisher information of the posterior distribution". The Ghosh bound is saturated if and only if

$\theta - \theta_{\rm BL}(\mu) = \lambda_{\mu}\frac{d\log p_{\rm post}(\theta|\mu)}{d\theta},$ (23)

where λμ does not depend on θ while it may depend on μ.
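For the example of Equation (1), the posterior, its variance (20) and the Ghosh bound (22) can be evaluated on a grid. The sketch below is our own code, assuming N = 2, a flat prior on [0, π/2] and an illustrative data record with m+ = 7 and m− = 3:

```python
import numpy as np

# Our own sketch of Equations (19), (20) and (22) for the model (1) with N = 2
# and a flat prior on [0, pi/2]; all integrals are grid-based.
N, a, b = 2, 0.0, np.pi / 2
theta = np.linspace(a, b, 20_001)

def integrate(y, x):
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

def posterior(m_plus, m_minus):
    """Normalized posterior for m_plus results +1 and m_minus results -1."""
    p = (1 + np.cos(N * theta)) / 2
    post = p**m_plus * (1 - p)**m_minus       # likelihood times a flat prior
    return post / integrate(post, theta)      # normalization = p_mar(mu)

post = posterior(m_plus=7, m_minus=3)         # an example data record
theta_BL = integrate(theta * post, theta)     # posterior-mean estimator
var_BL = integrate((theta - theta_BL)**2 * post, theta)   # Equation (20)

# Ghosh bound, Equation (22): boundary term f and posterior Fisher information
f = b * post[-1] - a * post[0] - theta_BL * (post[-1] - post[0])
dpost = np.gradient(post, theta)
integrand = np.where(post > 1e-12, dpost**2 / np.maximum(post, 1e-12), 0.0)
ghosh = (f - 1)**2 / integrate(integrand, theta)
print(var_BL >= ghosh, var_BL, ghosh)         # the variance respects the bound
```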

3.3. Average Posterior Bounds

While Equation (20) depends on the specific μ, it is natural to consider its average over all possible measurement sequences at fixed θ0 and m, weighted by the likelihood p(μ|θ0):

$(\Delta^2\theta_{\rm BL})_{\mu,\theta|\theta_0} = \sum_{\mu}(\Delta^2\theta_{\rm BL}(\mu))_{\theta|\mu}\, p(\mu|\theta_0) = \sum_{\mu}\int_a^b d\theta\, p(\theta,\mu|\theta_0)\left(\theta - \theta_{\rm BL}(\mu)\right)^2,$ (24)

which we indicate as the average Bayesian posterior variance, where p(θ,μ|θ0) = ppost(θ|μ) p(μ|θ0).

We would be tempted to compare the average posterior sensitivity (Δ2θBL)μ,θ|θ0 to the frequentist Cramér–Rao bound Δ2θCRLB. However, because of the different operational meanings of the frequentist and the Bayesian paradigms, there is no reason for Equation (24) to fulfill the Cramér–Rao bound: indeed, it does not, as we show below.

Likelihood-Averaged Ghosh Bound

A lower bound to Equation (24) is obtained by averaging the Ghosh bound Equation (22) over the likelihood function. We have (Δ2θBL)μ,θ|θ0Δ2θaGB, where [18]

$\Delta^2\theta_{\rm aGB} = \sum_{\mu}\frac{\left(f_{\mu,a,b} - 1\right)^2}{\int_a^b d\theta\, \frac{1}{p_{\rm post}(\theta|\mu)}\left(\frac{\partial p_{\rm post}(\theta|\mu)}{\partial\theta}\right)^2}\, p(\mu|\theta_0).$ (25)

This likelihood-averaged Ghosh bound is independent of μ because of the statistical average.

3.4. Numerical Comparison of Bayesian and Frequentist Phase Estimation

In the numerical calculations shown in Figure 3, we consider a Bayesian estimator given by $\theta_{\rm BL}(\mu) = \int_a^b d\theta\, \theta\, p_{\rm post}(\theta|\mu)$ with the family of prior distributions

$p_{\rm pri}(\theta) = \frac{2}{\pi}\,\frac{e^{\alpha\sin^2(2\theta)} - 1}{e^{\alpha/2}\, I_0(\alpha/2) - 1},$ (26)

where I0(α) is the modified Bessel function of the first kind. This choice of prior distribution can continuously turn from a peaked function into a flat one when changing α, while being differentiable in the full phase interval. The more negative α is, the more ppri(θ) broadens over [0, π/2]. In particular, in the limit α → −∞, the prior approaches the flat distribution, which in our case coincides with Jeffreys prior, since the Fisher information is independent of θ. In the limit α → 0, the prior is given by $\lim_{\alpha\to 0} p_{\rm pri}(\theta) = 4\sin^2(2\theta)/\pi$. For positive values of α, the larger α, the more peaked ppri(θ) becomes around θ0 = π/4; in particular, $p_{\rm pri}(\theta) \simeq e^{-4\alpha(\theta - \pi/4)^2}/\sqrt{\pi/4\alpha}$ for α ≫ 1. Equation (26) is normalized to one for θ ∈ [0, π/2]. In the insets of the different panels of Figure 3, we plot ppri(θ) for α = −100 [panel (a)], α = −10 (b), α = 1 (c) and α = 10 (d).
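A minimal sketch of the prior family (26), including a numerical check of its normalization over [0, π/2], follows (our own code, using scipy's modified Bessel function i0):

```python
import numpy as np
from scipy.special import i0

# Our own sketch of the prior family of Equation (26) with a normalization check.

def integrate(y, x):
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

def prior(theta, alpha):
    num = np.exp(alpha * np.sin(2 * theta)**2) - 1
    den = np.exp(alpha / 2) * i0(alpha / 2) - 1
    return (2 / np.pi) * num / den

theta = np.linspace(0, np.pi / 2, 10_001)
for alpha in (-100, -10, 1, 10):
    p = prior(theta, alpha)
    print(alpha, integrate(p, theta))   # ~1.0 for each alpha
```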

Figure 3. Comparison of the phase estimation variance as a function of the sample size for Bayesian and frequentist data analysis under different prior distributions: (a) α = −100, (b) α = −10, (c) α = 1, (d) α = 10. In all panels, red circles (frequentist) are m(Δ²θBL)μ|θ0 and the red dashed line is the Cramér–Rao lower bound mΔ²θCRLB, Equation (8). Blue circles (Bayesian) are m(Δ²θBL)μ,θ|θ0 and the blue solid line is the likelihood-averaged Ghosh bound mΔ²θaGB, Equation (25). The inset in each panel shows ppri(θ), Equation (26), for the corresponding value of α.

In Figure 3, we plot, as a function of m, the posterior variance (Δ²θBL)μ,θ|θ0 (blue circles) that, as expected, is always larger than the likelihood-averaged Ghosh bound, Equation (25) (solid blue lines). For comparison, we also plot the frequentist variance $(\Delta^2\theta_{\rm BL})_{\mu|\theta_0} = \sum_{\mu}\left(\theta_{\rm BL}(\mu) - \langle\theta_{\rm BL}\rangle_{\mu|\theta_0}\right)^2 p(\mu|\theta_0)$ (red circles) around the mean value $\langle\theta_{\rm BL}\rangle_{\mu|\theta_0} = \sum_{\mu}\theta_{\rm BL}(\mu)\, p(\mu|\theta_0)$ of the estimator. This quantity obeys the Cramér–Rao theorem (Δ²θBL)μ|θ0 ≥ Δ²θCRLB and the more general chain of inequalities (7). This is confirmed in the figure, where we show $\Delta^2\theta_{\rm CRLB} = |d\langle\theta_{\rm BL}\rangle_{\mu|\theta_0}/d\theta_0|^2/(m F(\theta_0))$ (red dashed line). Notice that, when the prior narrows around θ0, the variance (Δ²θBL)μ|θ0 decreases but, at the same time, the estimator becomes more and more biased, i.e., |d⟨θBL⟩μ|θ0/dθ0| decreases as well (note indeed that the red dashed line is proportional to |d⟨θBL⟩μ|θ0/dθ0|²).

Interestingly, in Figure 3, we clearly see that the Bayesian posterior variance (Δ²θBL)μ,θ|θ0 and the likelihood-averaged Ghosh bound may in some cases stay below the (frequentist) Δ²θCRLB [see panels (a) and (b)], even if the prior is almost flat. The discrepancy with the CRLB is remarkable and can be quite large for small values of m. Still, there is no contradiction, since (Δ²θBL)μ,θ|θ0 and (Δ²θBL)μ|θ0 have different operational meanings and interpretations. They both respect their corresponding sensitivity bounds.

Asymptotically in the number of measurements m, the Ghosh bound as well as its likelihood average converge to the Cramér–Rao bound. Indeed, it is well known that in this limit the posterior probability becomes a Gaussian centered at the true value of the phase shift and with variance given by the inverse of the Fisher information,

$p_{\rm post}(\theta|\mu) = \sqrt{\frac{m F(\theta_0)}{2\pi}}\, e^{-\frac{m F(\theta_0)}{2}\left(\theta - \theta_0\right)^2} \quad (m \gg 1),$ (27)

a result known as the Laplace–Bernstein–von Mises theorem [18,23,55]. By replacing Equation (27) in Equation (22), we recover a posterior variance given by 1/(mF(θ0)).
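This asymptotic Gaussianity can be verified numerically. The sketch below (our own code, for a flat prior) shows that the posterior variance times mF(θ0) approaches one as m grows:

```python
import numpy as np

# Our own sketch: with a flat prior, the posterior approaches the Gaussian of
# Equation (27), with variance 1/(m F(theta0)) = 1/(4m) for N = 2.
N, theta0 = 2, np.pi / 4
rng = np.random.default_rng(seed=4)
theta = np.linspace(0, np.pi / 2, 20_001)

def integrate(y, x):
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

for m in (10, 100, 1000):
    m_plus = rng.binomial(m, (1 + np.cos(N * theta0)) / 2)
    q = (1 + np.cos(N * theta)) / 2
    post = q**m_plus * (1 - q)**(m - m_plus)
    post /= integrate(post, theta)
    mean = integrate(theta * post, theta)
    var = integrate((theta - mean)**2 * post, theta)
    print(m, var * m * N**2)    # tends to 1, i.e. var -> 1/(m F(theta0))
```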

4. Bounds for Random Parameters

In this section, we derive bounds on the phase sensitivity obtained when θ0 is a random variable distributed according to p(θ0). Operationally, this corresponds to the situation where θ0 remains fixed (but unknown) when collecting a single sequence of m measurements μ. In between measurement sequences, θ0 fluctuates according to p(θ0).

4.1. Frequentist Risk Functions for Random Parameters

Let us first consider the frequentist estimation of a fluctuating parameter θ0 with the estimator θest. The mean sensitivity obtained by averaging (Δ²θest)μ|θ0, Equation (2), over p(θ0) is

$(\Delta^2\theta_{\rm est})_{\mu,\theta_0} = \int_a^b d\theta_0\, (\Delta^2\theta_{\rm est})_{\mu|\theta_0}\, p(\theta_0) = \sum_{\mu}\int_a^b d\theta_0\, p(\mu|\theta_0)\, p(\theta_0)\left(\langle\theta_{\rm est}\rangle_{\mu|\theta_0} - \theta_{\rm est}(\mu)\right)^2 = \sum_{\mu}\int_a^b d\theta_0\, p(\mu,\theta_0)\left(\langle\theta_{\rm est}\rangle_{\mu|\theta_0} - \theta_{\rm est}(\mu)\right)^2,$ (28)

where μ and θ0 are both random variables and we have used p(μ|θ0)p(θ0)=p(μ,θ0).

An averaged risk function for the efficiency of the estimator is given by averaging the mean square error (3) over p(θ0), leading to

$\mathrm{MSE}(\theta_{\rm est})_{\mu,\theta_0} = \int d\theta_0\, \mathrm{MSE}(\theta_{\rm est})_{\mu|\theta_0}\, p(\theta_0) = \int d\theta_0 \sum_{\mu}\left(\theta_{\rm est}(\mu) - \theta_0\right)^2 p(\mu,\theta_0).$ (29)

Analogously to Equation (4), we can write

$\mathrm{MSE}(\theta_{\rm est})_{\mu,\theta_0} = (\Delta^2\theta_{\rm est})_{\mu,\theta_0} + \int d\theta_0\left(\langle\theta_{\rm est}\rangle_{\mu|\theta_0} - \theta_0\right)^2 p(\theta_0).$ (30)

In the following, we derive lower bounds for both (Δ²θest)μ,θ0 and MSE(θest)μ,θ0. Notice that bounds on (Δ²θest)μ,θ0 hold also for MSE(θest)μ,θ0, due to MSE(θest)μ,θ0 ≥ (Δ²θest)μ,θ0. Nevertheless, bounds on the average mean square error are widely used (and are often called Bayesian bounds [56]) since they can be expressed independently of the bias.

4.2. Bounds on the Mean Square Error

We first consider bounds on MSE(θest)μ,θ0, Equation (29), for arbitrary estimators.

4.2.1. Van Trees Bound

It is possible to derive a general lower bound on the mean square error (29) based on the following assumptions:

  1. $\frac{\partial p(\mu,\theta_0)}{\partial\theta_0}$ and $\frac{\partial^2 p(\mu,\theta_0)}{\partial\theta_0^2}$ are absolutely integrable with respect to μ and θ0;

  2. $p(a)\xi(a) = p(b)\xi(b) = 0$, where $\xi(\theta_0) = \sum_{\mu}\left(\theta_{\rm est}(\mu) - \theta_0\right) p(\mu|\theta_0)$.

Multiplying ξ(θ0) by p(θ0) and differentiating with respect to θ0, we have

$\frac{\partial\left[p(\theta_0)\,\xi(\theta_0)\right]}{\partial\theta_0} = \sum_{\mu}\left(\theta_{\rm est}(\mu) - \theta_0\right)\frac{\partial p(\mu,\theta_0)}{\partial\theta_0} - p(\theta_0).$

Integrating over θ0 in the range [a,b] and using the above properties, we find

$\sum_{\mu}\int_a^b d\theta_0\left(\theta_{\rm est}(\mu) - \theta_0\right)\frac{\partial p(\mu,\theta_0)}{\partial\theta_0} = 1.$ (31)

Finally, using the Cauchy–Schwarz inequality, we arrive at MSE(θest)μ,θ0Δ2θVTB, where

$\Delta^2\theta_{\rm VTB} = \frac{1}{\sum_{\mu}\int_a^b d\theta_0\, \frac{1}{p(\mu,\theta_0)}\left(\frac{\partial p(\mu,\theta_0)}{\partial\theta_0}\right)^2}$ (32)

is generally indicated as the Van Trees bound [24,56,57]. The equality holds if and only if

$\theta_{\rm est}(\mu) - \theta_0 = \lambda\, \frac{d\log p(\mu,\theta_0)}{d\theta_0},$ (33)

where λ does not depend on θ0 and μ. It is easy to show that

$\sum_{\mu}\int_a^b d\theta_0\, \frac{1}{p(\mu,\theta_0)}\left(\frac{\partial p(\mu,\theta_0)}{\partial\theta_0}\right)^2 = m\int_a^b d\theta_0\, p(\theta_0)\, F(\theta_0) + \int_a^b d\theta_0\, \frac{1}{p(\theta_0)}\left(\frac{\partial p(\theta_0)}{\partial\theta_0}\right)^2,$ (34)

where the first term is the Fisher information F(θ0), defined by Equation (10), averaged over p(θ0), and the second term can be interpreted as a Fisher information of the prior [24]. Asymptotically in the number of measurements m and for regular distributions p(θ0), the first term in Equation (34) dominates over the second one.
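Equation (34) makes the Van Trees bound straightforward to evaluate for our example, where F(θ0) = N² so that the likelihood term is simply mN². A sketch (our own code, with the prior family (26), which vanishes at both borders):

```python
import numpy as np
from scipy.special import i0

# Our own sketch of the Van Trees bound via Equation (34): 1/(m <F> + F_prior),
# with <F> = N**2 here since the Fisher information is constant.

def integrate(y, x):
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

N, alpha = 2, 10
theta = np.linspace(0, np.pi / 2, 100_001)
p = (2 / np.pi) * (np.exp(alpha * np.sin(2 * theta)**2) - 1) \
    / (np.exp(alpha / 2) * i0(alpha / 2) - 1)

dp = np.gradient(p, theta)
F_prior = integrate(np.where(p > 1e-12, dp**2 / np.maximum(p, 1e-12), 0.0), theta)

for m in (1, 10, 100):
    print(m, 1.0 / (m * N**2 + F_prior))   # Van Trees bound on the MSE
```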

4.2.2. Ziv–Zakai Bound

A further bound on MSE(θest)μ,θ0 can be derived by mapping the phase estimation problem to a continuous series of binary hypothesis testing problems. A detailed derivation of the Ziv–Zakai bound [24,58,59] is provided in Appendix B. The final result reads MSE(θest)μ,θ0Δ2θZZB, where

$\Delta^2\theta_{\rm ZZB} = \frac{1}{2}\int_0^{\infty} dh\, h\int d\theta_0\left(p(\theta_0) + p(\theta_0+h)\right) P_{\min}(\theta_0,\theta_0+h),$ (35)

and

$P_{\min}(\theta_0,\theta_0+h) = \frac{1}{2}\left(1 - \sum_{\mu}\left|\frac{p(\theta_0)\, p(\mu|\theta_0)}{p(\theta_0) + p(\theta_0+h)} - \frac{p(\theta_0+h)\, p(\mu|\theta_0+h)}{p(\theta_0) + p(\theta_0+h)}\right|\right)$ (36)

is the minimum error probability of the binary hypothesis testing problem. This bound has been adopted for quantum phase estimation in Ref. [26]. To this end, the probability Pmin(θ0,θ0+h) can be minimized over all possible quantum measurements, which leads to the trace distance [7]. As the optimal measurement may depend on θ0 and h, the bound (35), which involves integration over all values of θ0 and h, is usually not saturable. We remark that the trace distance also defines a saturable frequentist bound for a different risk function than the variance [60].
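For the model of Equation (1), the Ziv–Zakai bound can be evaluated numerically by summing Equation (36) over the sufficient statistic m+ (with binomial multiplicities) and discretizing the integrals in Equation (35). The following is our own sketch, with p(θ0) given by Equation (26):

```python
import numpy as np
from math import comb
from scipy.special import i0

# Our own numerical sketch of the Ziv-Zakai bound (35)-(36) for m parity
# measurements; the sum over mu runs over m_+ with binomial multiplicities.
N, alpha, m = 2, 10, 10

def prior(t):
    if t < 0 or t > np.pi / 2:
        return 0.0
    num = np.exp(alpha * np.sin(2 * t)**2) - 1
    return (2 / np.pi) * num / (np.exp(alpha / 2) * i0(alpha / 2) - 1)

def seq_prob(t, k):
    """p(mu|t) for any sequence with k outcomes +1 (and m - k outcomes -1)."""
    q = (1 + np.cos(N * t)) / 2
    return q**k * (1 - q)**(m - k)

def p_min(t0, h):
    """Minimum error probability of Equation (36)."""
    w0, wh = prior(t0), prior(t0 + h)
    if w0 + wh == 0.0:
        return 0.0
    s = sum(comb(m, k) * abs(w0 * seq_prob(t0, k) - wh * seq_prob(t0 + h, k))
            for k in range(m + 1))
    return 0.5 * (1.0 - s / (w0 + wh))

t0s = np.linspace(0, np.pi / 2, 100)
hs = np.linspace(1e-3, np.pi / 2, 100)
dt, dh = t0s[1] - t0s[0], hs[1] - hs[0]
zzb = 0.5 * sum(h * (prior(t) + prior(t + h)) * p_min(t, h) * dt * dh
                for h in hs for t in t0s)
print(zzb)   # lower bound on the MSE for a fluctuating theta0
```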

4.3. Bounds on the Average Estimator Variance

We now consider bounds on (Δ2θest)μ,θ0, Equation (28), for arbitrary estimators.

4.3.1. Average CRLB

Taking the average over p(θ0) of Equation (7), we obtain a chain of bounds for (Δ²θest)μ,θ0. In particular, in its simplest form, we have (Δ²θest)μ,θ0 ≥ Δ²θaCRLB, where

$\Delta^2\theta_{\rm aCRLB} = \int_a^b d\theta_0\, \frac{\left(d\langle\theta_{\rm est}\rangle_{\mu|\theta_0}/d\theta_0\right)^2}{m F(\theta_0)}\, p(\theta_0)$ (37)

is the average CRLB.

4.3.2. Van Trees Bound for the Average Estimator Variance

We can derive a general lower bound for the variance (28) by following the derivation of the Van Trees bound, which was discussed in Section 4.2.1. In contrast to the standard Van Trees bound for the mean square error, here the bias enters explicitly. Defining $\xi(\theta_0) = \sum_{\mu}\left(\theta_{\rm est}(\mu) - \langle\theta_{\rm est}\rangle_{\mu|\theta_0}\right) p(\mu|\theta_0)$ and assuming the same requirements as in the derivation of the Van Trees bound for the MSE, we arrive at

$\sum_{\mu}\int_a^b d\theta_0\left(\theta_{\rm est}(\mu) - \langle\theta_{\rm est}\rangle_{\mu|\theta_0}\right)\frac{\partial p(\mu,\theta_0)}{\partial\theta_0} = \int_a^b d\theta_0\, \frac{d\langle\theta_{\rm est}\rangle_{\mu|\theta_0}}{d\theta_0}\, p(\theta_0).$

Finally, a Cauchy–Schwarz inequality gives (Δ²θest)μ,θ0 ≥ Δ²θfVTB, where

$\Delta^2\theta_{\rm fVTB} = \frac{\left(\int_a^b d\theta_0\, \frac{d\langle\theta_{\rm est}\rangle_{\mu|\theta_0}}{d\theta_0}\, p(\theta_0)\right)^2}{\sum_{\mu}\int_a^b d\theta_0\, \frac{1}{p(\mu,\theta_0)}\left(\frac{\partial p(\mu,\theta_0)}{\partial\theta_0}\right)^2},$ (38)

with equality if and only if

$\theta_{\rm est}(\mu) - \langle\theta_{\rm est}\rangle_{\mu|\theta_0} = \lambda\, \frac{d\log p(\mu,\theta_0)}{d\theta_0},$ (39)

where λ is independent of θ0 and μ.

We can compare Equation (38) with the average CRLB Equation (37). We find

$\int_a^b d\theta_0\, \frac{\left(\frac{d\langle\theta_{\rm est}\rangle_{\mu|\theta_0}}{d\theta_0}\right)^2}{m F(\theta_0)}\, p(\theta_0) \geq \frac{\left(\int_a^b d\theta_0\, \left|\frac{d\langle\theta_{\rm est}\rangle_{\mu|\theta_0}}{d\theta_0}\right|\, p(\theta_0)\right)^2}{m\int_a^b d\theta_0\, p(\theta_0)\, F(\theta_0)} \geq \frac{\left(\int_a^b d\theta_0\, \frac{d\langle\theta_{\rm est}\rangle_{\mu|\theta_0}}{d\theta_0}\, p(\theta_0)\right)^2}{\sum_{\mu}\int_a^b d\theta_0\, \frac{1}{p(\mu,\theta_0)}\left(\frac{\partial p(\mu,\theta_0)}{\partial\theta_0}\right)^2},$

where in the first step we use Jensen's inequality, and the second step follows from $\left(\int_a^b d\theta_0\, \left|\frac{d\langle\theta_{\rm est}\rangle_{\mu|\theta_0}}{d\theta_0}\right| p(\theta_0)\right)^2 \geq \left(\int_a^b d\theta_0\, \frac{d\langle\theta_{\rm est}\rangle_{\mu|\theta_0}}{d\theta_0}\, p(\theta_0)\right)^2$ together with Equation (34), which implies $m\int_a^b d\theta_0\, p(\theta_0) F(\theta_0) \leq \sum_{\mu}\int_a^b d\theta_0\, \frac{1}{p(\mu,\theta_0)}\left(\frac{\partial p(\mu,\theta_0)}{\partial\theta_0}\right)^2$ since $\int_a^b d\theta_0\, \frac{1}{p(\theta_0)}\left(\frac{dp(\theta_0)}{d\theta_0}\right)^2 \geq 0$.

We thus arrive at

$(\Delta^2\theta_{\rm est})_{\mu,\theta_0} \geq \Delta^2\theta_{\rm aCRLB} \geq \Delta^2\theta_{\rm fVTB},$ (40)

which is valid for generic estimators.

4.4. Bayesian Framework for Random Parameters

The Bayesian posterior variance, (Δ2θBL)μ,θ|θ0, Equation (24), averaged over p(θ0) is

$(\Delta^2\theta_{\rm BL})_{\mu,\theta,\theta_0} = \int_a^b d\theta_0\, (\Delta^2\theta_{\rm BL})_{\mu,\theta|\theta_0}\, p(\theta_0) = \sum_{\mu}\int_a^b d\theta\int_a^b d\theta_0\, p_{\rm post}(\theta|\mu)\, p(\mu|\theta_0)\, p(\theta_0)\left(\theta - \theta_{\rm BL}(\mu)\right)^2 = \sum_{\mu}\int_a^b d\theta\, p_{\rm post}(\theta|\mu)\, p(\mu)\left(\theta - \theta_{\rm BL}(\mu)\right)^2,$ (41)

where $p(\mu) = \int_a^b d\theta_0\, p(\mu|\theta_0)\, p(\theta_0)$ is the average probability of observing μ, taking into account the fluctuations of θ0.

A bound on Equation (41) can be obtained by averaging Equation (25) over p(θ0) or, equivalently, by averaging the Ghosh bound, Equation (22), over p(μ). We obtain the average Ghosh bound for random parameters θ0, (Δ²θBL)μ,θ,θ0 ≥ Δ²θaGBr, where

$\Delta^2\theta_{\rm aGBr} = \int_a^b d\theta_0\sum_{\mu}\frac{\left(f_{\mu,a,b} - 1\right)^2}{\int_a^b d\theta\, \frac{1}{p_{\rm post}(\theta|\mu)}\left(\frac{dp_{\rm post}(\theta|\mu)}{d\theta}\right)^2}\, p(\mu|\theta_0)\, p(\theta_0) = \sum_{\mu}\frac{\left(f_{\mu,a,b} - 1\right)^2}{\int_a^b d\theta\, \frac{1}{p_{\rm post}(\theta|\mu)}\left(\frac{dp_{\rm post}(\theta|\mu)}{d\theta}\right)^2}\, p(\mu).$ (42)

The bound holds for any prior ppri(θ) and is saturated if and only if, for every value of μ, there exists a λμ such that Equation (23) holds.

Bayesian Bounds

In Equation (41), the prior used to define the posterior ppost(θ|μ) via the Bayes–Laplace theorem is arbitrary. In general, such a prior ppri(θ) is different from the statistical distribution of θ0, which can be unknown. If p(θ0) is known, then one can use it as a prior in the Bayesian posterior probability, i.e., ppri(θ)=p(θ0). In this specific case, we have pmar(μ)=p(μ), and thus ppost(θ|μ)p(μ)=ppost(θ|μ)pmar(μ)=p(μ,θ). In other words, for this specific choice of prior, the physical joint probability p(μ,θ0) of random variables θ0 and μ coincides with the Bayesian p(μ,θ). Equation (41) thus simplifies to

$(\Delta^2\theta_{\rm BL})_{\mu,\theta} = \sum_{\mu}\int_a^b d\theta\, p(\mu,\theta)\left(\theta - \theta_{\rm BL}(\mu)\right)^2.$ (43)

Notice that this expression is mathematically equivalent to the frequentist average mean square error (29) if we replace θ with θ0 and θBL(μ) with θest(μ). This means that precision bounds for Equation (29), e.g., the Van Trees and Ziv–Zakai bounds can also be applied to Equation (43). These bounds are indeed often referred to as “Bayesian bounds” (see Ref. [24]).

We emphasize that the average over the marginal distribution pmar(μ), which connects Equations (24) and (43), has an operational meaning if we consider that θ0 is a random variable distributed according to p(θ0), and p(θ0) is used as the prior in the Bayes–Laplace theorem to define the posterior distribution. In this case, and under the condition fμ,a,b = 0 (for instance, if the prior distribution vanishes at the borders of the phase domain), using Jensen's inequality, we find

$\Delta^2\theta_{\rm aGBr} = \sum_{\mu}\frac{p(\mu)}{\int_a^b d\theta\, \frac{1}{p_{\rm post}(\theta|\mu)}\left(\frac{dp_{\rm post}(\theta|\mu)}{d\theta}\right)^2} \geq \frac{1}{\sum_{\mu} p(\mu)\int_a^b d\theta\, \frac{1}{p_{\rm post}(\theta|\mu)}\left(\frac{dp_{\rm post}(\theta|\mu)}{d\theta}\right)^2} = \frac{1}{\sum_{\mu}\int_a^b d\theta\, \frac{1}{p(\theta,\mu)}\left(\frac{\partial p(\theta,\mu)}{\partial\theta}\right)^2},$ (44)

which coincides with the Van Trees bound discussed above. We thus find that the averaged Ghosh bound for random parameters (42) is sharper than the Van Trees bound (32):

$(\Delta^2\theta_{\rm BL})_{\mu,\theta} \geq \Delta^2\theta_{\rm aGBr} \geq \Delta^2\theta_{\rm VTB},$ (45)

which is also confirmed by the numerical data shown in Figure 4.

Figure 4. Comparison of the average posterior Bayesian variance m(Δ²θBL)μ,θ (dots) as a function of the sample size m under different prior distributions: (a) α = −100, (b) α = −10, (c) α = 1, (d) α = 10. This variance is compared to the average Ghosh bound for random parameters mΔ²θaGBr (grey line), the Van Trees bound mΔ²θVTB (green line), the Ziv–Zakai bound mΔ²θZZB (red line) and 1/F(θ0) (black horizontal line). The inset in each panel shows the prior ppri(θ), Equation (26), for the corresponding value of α.

In Figure 4, we compare (Δ²θBL)μ,θ with the various bounds discussed in this section. As p(θ0), we consider the same prior (26) used in Figure 3. We observe that all bounds approach the Van Trees bound with increasing sharpness of the prior distribution. Asymptotically in the number of measurements m, all bounds converge to the Cramér–Rao bound.

5. Discussion and Conclusions

In this manuscript, we have clarified the differences between frequentist and Bayesian approaches to phase estimation. The two paradigms provide statistical results that have a different conceptual meaning and cannot be compared. We have also reviewed and discussed phase sensitivity bounds in the frequentist and Bayesian frameworks, when the true value of the phase shift θ0 is fixed or fluctuates. These bounds are summarized in Table 1.

Table 1.

Frequentist vs Bayesian bounds for fixed and random parameters.

| | Paradigm | Risk Function | Bounds | Remarks |
|---|---|---|---|---|
| θ0 fixed | Frequentist | (Δ²θest)μ\|θ0 and MSE(θest)μ\|θ0 | BB, Equation (5); EChRB, Equation (17); ChRB, Equation (14); CRLB, Equation (8) | hierarchy of bounds, Equation (7) |
| | Bayesian | (Δ²θBL(μ))θ\|μ | GB, Equation (22) | function of μ |
| | Bayesian | (Δ²θBL)μ,θ\|θ0 | aGB, Equation (25) | average over the likelihood p(μ\|θ0) |
| θ0 random | Frequentist | (Δ²θest)μ,θ0 | aCRLB, Equation (37); fVTB, Equation (38) | hierarchy of bounds, Equation (40) |
| | Frequentist | MSE(θest)μ,θ0 | VTB, Equation (32); ZZB, Equation (35) | bounds are independent of the bias |
| | Bayesian | (Δ²θBL)μ,θ,θ0 | aGBr, Equation (42) | prior ppri(θ) and fluctuations p(θ0) arbitrary |
| | Bayesian | (Δ²θBL)μ,θ | VTB, Equation (32); ZZB, Equation (35) | prior ppri(θ) and fluctuations p(θ0) coincide; hierarchy of bounds, Equation (45) |

In the frequentist approach, for a fixed θ0, the phase sensitivity is determined from the width of the probability distribution of the estimator. The physical content of the distribution is that, when repeating the estimation protocol, the obtained θest(μ) will fall, with a certain confidence, in an interval around the mean value ⟨θest⟩μ|θ0 (e.g., 68% of the time within an interval of width 2(Δθest)μ|θ0 for a Gaussian distribution) that, for unbiased estimators, coincides with the true value of the phase shift.

In the Bayesian case, the posterior ppost(θ|μ) provides a degree of plausibility that the phase shift θ equals the interferometer phase θ0, given that the data μ were obtained. This allows the Bayesian approach to provide statistical information for any number of measurements, even a single one. To be sure, this is not a sign of failure or superiority of one approach with respect to the other, since the two frameworks manipulate conceptually different quantities. The experimentalist can choose to use one or both approaches, keeping in mind the necessity to clearly state the nature of the statistical significance of the reported results.

The two predictions converge asymptotically in the limit of a large number of measurements. This does not mean that in this limit the significance of the two approaches is interchangeable (it cannot be stated that, in the limit of a large number of repeated measurements, the frequentist and Bayesian analyses provide the same results). In this respect, it is quite instructive to notice that the Bayesian 2σ confidence may be below that of the Cramér–Rao bound, as shown in Figure 3. This, at first sight, seems paradoxical, since the CRLB is a theorem about the minimum error achievable in parameter estimation theory. However, the CRLB is a frequentist bound and, again, the paradox is solved by taking into account that the frequentist and the Bayesian approaches provide information about different quantities.

Finally, a different class of estimation problems with different precision bounds is encountered if θ0 is itself a random variable. In this case, the frequentist bounds for the mean-square error (Van Trees, Ziv–Zakai) become independent of the bias, while those on the estimator variance are still functions of the bias. The Van Trees and Ziv–Zakai bounds can be applied to the Bayesian paradigm if the average of the posterior variance over the marginal distribution is the relevant risk function. This is only meaningful if the prior ppri(θ) that enters the Bayes–Laplace theorem coincides with the actual distribution p(θ0) of the phase shift θ0.

We conclude with a remark regarding the so-called Heisenberg limit, which is a saturable lower bound on the CRLB over arbitrary quantum states with a fixed number of particles. For instance, for a collection of N two-level systems, the CRLB can be further bounded by $\Delta\theta_{\rm est} \geq 1/\sqrt{m F(\theta_0)} \geq 1/(\sqrt{m}\, N)$ [18,20]. This bound is often called the ultimate precision bound, since no quantum state is able to achieve a scaling better than 1/N. From the discussions presented in this article, it becomes apparent that Bayesian approaches (as discussed in Section 3) or precision bounds for random parameters (Section 4) are expected to lead to entirely different types of 'ultimate' lower bounds. Such bounds are interesting within the respective paradigm for which they are derived, but they cannot replace or improve the Heisenberg limit, since they address fundamentally different scenarios that cannot be compared in general.

Acknowledgments

This work was supported by the National Key R & D Program of China (No. 2017YFA0304500 and No. 2017YFA0304203), the National Natural Science Foundation of China (Grant No. 11874247), the 111 plan of China (No. D18001), the Hundred Talent Program of the Shanxi Province (2018), the Program of State Key Laboratory of Quantum Optics and Quantum Optics Devices (No. KF201703), and the QuantEra project Q-Clocks. M.G. acknowledges support by the Alexander von Humboldt Foundation.

Appendix A. Derivation of the Barankin Bound

Let θest be an arbitrary estimator for θ. Its mean value

$\langle\theta_{\rm est}\rangle_{\mu|\theta} = \sum_{\mu}\theta_{\rm est}(\mu)\, p(\mu|\theta)$ (A1)

coincides with θ if and only if the estimator is unbiased (for arbitrary values of θ). In the following, we make no assumption about the bias of θest and therefore do not replace ⟨θest⟩μ|θ by θ.

Introducing the likelihood ratio

$L(\mu|\theta_i,\theta_0) = \frac{p(\mu|\theta_i)}{p(\mu|\theta_0)}$ (A2)

under the condition p(μ|θ0)>0 for all μ, we obtain with Equation (A1) that

$\sum_{\mu}\theta_{\rm est}(\mu)\, L(\mu|\theta_i,\theta_0)\, p(\mu|\theta_0) = \langle\theta_{\rm est}\rangle_{\mu|\theta_i},$ (A3)

for an arbitrary family of phase values θ1,,θn picked from the parameter domain. Furthermore, we have

$\sum_{\mu} L(\mu|\theta_i,\theta_0)\, p(\mu|\theta_0) = \sum_{\mu} p(\mu|\theta_i) = 1$ (A4)

for all θi. Multiplying both sides of Equation (A4) by ⟨θest⟩μ|θ0 and subtracting the result from (A3) yields

$\sum_{\mu}\left(\theta_{\rm est}(\mu) - \langle\theta_{\rm est}\rangle_{\mu|\theta_0}\right) L(\mu|\theta_i,\theta_0)\, p(\mu|\theta_0) = \langle\theta_{\rm est}\rangle_{\mu|\theta_i} - \langle\theta_{\rm est}\rangle_{\mu|\theta_0}.$ (A5)

Let us now pick a family of n finite coefficients a1,,an. From Equation (A5), we obtain

$\sum_{\mu}\left(\theta_{\rm est}(\mu) - \langle\theta_{\rm est}\rangle_{\mu|\theta_0}\right)\sum_{i=1}^{n} a_i L(\mu|\theta_i,\theta_0)\, p(\mu|\theta_0) = \sum_{i=1}^{n} a_i\left(\langle\theta_{\rm est}\rangle_{\mu|\theta_i} - \langle\theta_{\rm est}\rangle_{\mu|\theta_0}\right).$ (A6)

The Cauchy–Schwarz inequality now yields

$\left(\sum_{i=1}^{n} a_i\left(\langle\theta_{\rm est}\rangle_{\mu|\theta_i} - \langle\theta_{\rm est}\rangle_{\mu|\theta_0}\right)\right)^2 \leq (\Delta^2\theta_{\rm est})_{\mu|\theta_0}\sum_{\mu}\left(\sum_{i=1}^{n} a_i L(\mu|\theta_i,\theta_0)\right)^2 p(\mu|\theta_0),$ (A7)

where

$(\Delta^2\theta_{\rm est})_{\mu|\theta_0} = \sum_{\mu}\left(\theta_{\rm est}(\mu) - \langle\theta_{\rm est}\rangle_{\mu|\theta_0}\right)^2 p(\mu|\theta_0)$ (A8)

is the variance of the estimator θest. We thus obtain

$(\Delta^2\theta_{\rm est})_{\mu|\theta_0} \geq \frac{\left(\sum_{i=1}^{n} a_i\left(\langle\theta_{\rm est}\rangle_{\mu|\theta_i} - \langle\theta_{\rm est}\rangle_{\mu|\theta_0}\right)\right)^2}{\sum_{\mu}\left(\sum_{i=1}^{n} a_i L(\mu|\theta_i,\theta_0)\right)^2 p(\mu|\theta_0)},$ (A9)

for all n, ai, and θi. The Barankin bound then follows by taking the supremum over these variables.
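The inequality (A9) can be checked numerically for any estimator. The sketch below (our own code) evaluates both sides for the MLE of the main text, with n = 2 and randomly drawn coefficients ai and test points θi:

```python
import numpy as np
from math import comb

# Our own numerical check of inequality (A9) for the MLE of the main text,
# with n = 2, random coefficients a_i and random test points theta_i.
N, theta0, m = 2, np.pi / 4, 10
rng = np.random.default_rng(seed=3)

def probs(t):
    """Distribution of m_+ (binomial), which carries the full likelihood."""
    q = (1 + np.cos(N * t)) / 2
    return np.array([comb(m, k) * q**k * (1 - q)**(m - k) for k in range(m + 1)])

est = np.array([0.5 * np.arccos((2 * k - m) / m) for k in range(m + 1)])  # MLE

p0 = probs(theta0)
mean0 = np.sum(est * p0)
var0 = np.sum((est - mean0)**2 * p0)

for _ in range(5):
    th = theta0 + rng.uniform(-0.5, 0.5, size=2)      # stays inside [0, pi/2]
    a = rng.uniform(-1, 1, size=2)
    L = np.array([probs(t) / p0 for t in th])          # likelihood ratios
    num = (a @ (np.array([np.sum(est * probs(t)) for t in th]) - mean0))**2
    den = np.sum((a @ L)**2 * p0)
    print(var0 >= num / den - 1e-12)                   # always True
```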

Appendix B. Derivation of the Ziv–Zakai Bound

Derivations of the Ziv–Zakai bound can be found in the literature (see, for instance, Refs. [24,58,59]). This Appendix follows these derivations closely and provides additional background, which may be useful for readers less familiar with the field of hypothesis testing.

Let X ∈ [0,a] be a random variable with probability density p(x). We can formally write $p(x) = -dP(X\geq x)/dx$, where $P(X\geq x) \equiv \int_x^a p(y)\, dy$ is the probability that X is larger than or equal to x. We obtain from integration by parts

$\langle X^2\rangle = \int_0^a x^2 p(x)\, dx = \left[-x^2 P(X\geq x)\right]_0^a + 2\int_0^a P(X\geq x)\, x\, dx = 2\int_0^a P(X\geq x)\, x\, dx = \frac{1}{2}\int_0^{2a} P\!\left(X\geq\frac{h}{2}\right) h\, dh,$ (A10)

where we assume that a is finite [if a → ∞, the above relation holds when $\lim_{a\to\infty} a^2 P(X\geq a) = 0$]. Finally, we can formally extend the above integral up to ∞, since P(X ≥ h/2) = 0 for h > 2a:

$\langle X^2\rangle = \frac{1}{2}\int_0^{\infty} P\!\left(X\geq\frac{h}{2}\right) h\, dh.$ (A11)

Following Ref. [59], we now take ϵ = θest(μ) − θ0 and X = |ϵ|. We thus have

$\mathrm{MSE}(\theta_{\rm est})_{\mu,\theta_0} = \langle|\epsilon|^2\rangle = \frac{1}{2}\int_0^{\infty} P\!\left(|\epsilon|\geq\frac{h}{2}\right) h\, dh.$ (A12)

We express the probability as

$P\!\left(|\epsilon|\geq\frac{h}{2}\right) = P\!\left(\epsilon > \frac{h}{2}\right) + P\!\left(\epsilon \leq -\frac{h}{2}\right) = P\!\left(\theta_{\rm est}(\mu) - \theta_0 > \frac{h}{2}\right) + P\!\left(\theta_{\rm est}(\mu) - \theta_0 \leq -\frac{h}{2}\right) = \int P\!\left(\theta_{\rm est}(\mu) - \theta_0 > \frac{h}{2}\,\Big|\,\theta_0\right) p(\theta_0)\, d\theta_0 + \int P\!\left(\theta_{\rm est}(\mu) - \theta_0 \leq -\frac{h}{2}\,\Big|\,\theta_0\right) p(\theta_0)\, d\theta_0.$

Next, we replace θ0 with θ0 + h in the second integral:

$P\!\left(|\epsilon|\geq\frac{h}{2}\right) = \int P\!\left(\theta_{\rm est}(x) - \theta_0 > \frac{h}{2}\,\Big|\,\theta_0\right) p(\theta_0)\, d\theta_0 + \int P\!\left(\theta_{\rm est}(x) - \theta_0 \leq \frac{h}{2}\,\Big|\,\theta_0+h\right) p(\theta_0+h)\, d\theta_0 = \int\left(p(\varphi) + p(\varphi+h)\right)\left[\frac{p(\varphi)}{p(\varphi)+p(\varphi+h)}\, P\!\left(\theta_{\rm est}(x) - \varphi > \frac{h}{2}\,\Big|\,\theta_0=\varphi\right) + \frac{p(\varphi+h)}{p(\varphi)+p(\varphi+h)}\, P\!\left(\theta_{\rm est}(x) - \varphi \leq \frac{h}{2}\,\Big|\,\theta_0=\varphi+h\right)\right] d\varphi.$

We now take a closer look at the expression within the square brackets and interpret it in the framework of hypothesis testing. Suppose that we try to discriminate between the two cases θ0 = φ (hypothesis 1, denoted H1) and θ0 = φ + h (denoted H2). We decide between the two hypotheses H1 and H2 on the basis of the measurement result x, using the estimator θest(x). One possible strategy consists in choosing the hypothesis whose value is closest to the obtained estimate. Hence, if θest(x) ≤ φ + h/2, we assume H1 to be correct and, otherwise, if θest(x) > φ + h/2, we pick H2.

Let us now determine the probability of making an erroneous decision using this strategy. There are two scenarios that lead to a mistake. First, our strategy fails whenever θest(x) ≤ φ + h/2 while θ0 = φ + h. In this case, H2 is true, but our strategy leads us to choose H1. The probability for this to happen, given that θ0 = φ + h, is P(θest(x) − φ ≤ h/2 | θ0 = φ+h). To obtain the error probability of our strategy, we need to multiply this by the probability with which θ0 assumes the value φ + h, which is given by p(H2) = p(φ+h)/[p(φ) + p(φ+h)]. Second, our strategy also fails if θest(x) > φ + h/2 for θ0 = φ. This occurs with the conditional probability P(θest(x) − φ > h/2 | θ0 = φ), and θ0 = φ with probability p(H1) = p(φ)/[p(φ) + p(φ+h)]. The total probability of making a mistake is consequently given by

$P_{\rm err}(\varphi,\varphi+h) = P\!\left(\theta_{\rm est}(x) - \varphi > \frac{h}{2}\,\Big|\,H_1\right) p(H_1) + P\!\left(\theta_{\rm est}(x) - \varphi \leq \frac{h}{2}\,\Big|\,H_2\right) p(H_2) = \frac{p(\varphi)}{p(\varphi)+p(\varphi+h)}\, P\!\left(\theta_{\rm est}(x) - \varphi > \frac{h}{2}\,\Big|\,\theta_0=\varphi\right) + \frac{p(\varphi+h)}{p(\varphi)+p(\varphi+h)}\, P\!\left(\theta_{\rm est}(x) - \varphi \leq \frac{h}{2}\,\Big|\,\theta_0=\varphi+h\right),$ (A13)

and we can rewrite Equation (A13) as

$P\!\left(|\epsilon|\geq\frac{h}{2}\right) = \int\left(p(\varphi) + p(\varphi+h)\right) P_{\rm err}(\varphi,\varphi+h)\, d\varphi.$ (A14)

The strategy described above depends on the estimator θest and may not be optimal. In general, a binary hypothesis testing strategy can be characterized in terms of the separation of the possible values of x into the two disjoint subsets X1 and X2 which are used to choose hypothesis H1 or H2, respectively. That is, if xX1 we pick H1 and otherwise H2. Since one of the two hypotheses must be true, we have

$1 = p(H_1) + p(H_2) = \int_{X_1} dx\, p(x|H_1)p(H_1) + \int_{X_2} dx\, p(x|H_1)p(H_1) + \int_{X_1} dx\, p(x|H_2)p(H_2) + \int_{X_2} dx\, p(x|H_2)p(H_2) = \int_{X_1} dx\, p(x|H_1)p(H_1) + \int_{X_2} dx\, p(x|H_2)p(H_2) + P_{\rm err}^{X_1}(H_1,H_2),$ (A15)

where the error made by such a strategy is given by

$P_{\rm err}^{X_1}(H_1,H_2) = P(x\in X_2|H_1)\, p(H_1) + P(x\in X_1|H_2)\, p(H_2) = \int_{X_2} p(x|H_1)p(H_1)\, dx + \int_{X_1} p(x|H_2)p(H_2)\, dx = p(H_1) + \int_{X_1}\left(p(x|H_2)p(H_2) - p(x|H_1)p(H_1)\right) dx.$ (A16)

This probability is minimized if p(x|H2)p(H2) < p(x|H1)p(H1) for x ∈ X1 and, consequently, p(x|H2)p(H2) ≥ p(x|H1)p(H1) for x ∈ X2. This identifies an optimal strategy for hypothesis testing, known as the likelihood ratio test: if the likelihood ratio p(x|H1)/p(x|H2) is larger than the threshold value p(H2)/p(H1), we pick H1, whereas, if it is smaller, we pick H2. With this choice, the error probability is minimal and reads

$P_{\min}(H_1,H_2) = \int_{X_2} p(x|H_1)p(H_1)\, dx + \int_{X_1} p(x|H_2)p(H_2)\, dx = \frac{1}{2} - \frac{1}{2}\int\left|p(x|H_1)p(H_1) - p(x|H_2)p(H_2)\right| dx,$ (A17)

where the second equality follows from writing $2\int_{X_2} p(x|H_1)p(H_1)\, dx = \int_{X_2}\left(p(x|H_1)p(H_1) - p(x|H_2)p(H_2)\right) dx + \int_{X_2}\left(p(x|H_1)p(H_1) + p(x|H_2)p(H_2)\right) dx$ (and similarly on X1), using the optimality of the sets X1 and X2, and Equation (A15).

Applied to our case, we obtain

$P_{\min}(\varphi,\varphi+h) = \frac{1}{2}\left(1 - \sum_{\mu}\left|\frac{p(\mu|\theta_0=\varphi)\, p(\varphi)}{p(\varphi)+p(\varphi+h)} - \frac{p(\mu|\theta_0=\varphi+h)\, p(\varphi+h)}{p(\varphi)+p(\varphi+h)}\right|\right).$ (A18)

This result represents a lower bound on PerrX1(φ,φ+h) for arbitrary choices of X1, which includes the case discussed in Equation (A13). Thus, using

$P_{\rm err}(\varphi,\varphi+h) \geq P_{\min}(\varphi,\varphi+h)$ (A19)

in Equation (A14) and inserting back into Equation (A12), we finally obtain the Ziv–Zakai bound for the mean square error:

$\mathrm{MSE}(\theta_{\rm est})_{\mu,\theta_0} \geq \frac{1}{2}\int_0^{\infty} h\, dh\int d\theta_0\left(p(\theta_0) + p(\theta_0+h)\right) P_{\min}(\theta_0,\theta_0+h).$ (A20)

This bound can be further sharpened by introducing a valley-filling function [61], which is not considered here.

Author Contributions

Y.L., L.P., M.G., W.L. and A.S. conceived the study, performed theoretical calculations and drafted the article. All authors have read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Zehnder L. Ein neuer Interferenzrefraktor. Zeitschrift für Instrumentenkunde. 1891;11:275. (In German)
2. Mach L. Ueber einen Interferenzrefraktor. Zeitschrift für Instrumentenkunde. 1892;12:89. (In German)
3. Ramsey N.F. Molecular Beams. Oxford University Press; London, UK: 1963.
4. Wynands R. Atomic Clocks. In: Muga G., Ruschhaupt A., Campo A., editors. Lecture Notes in Physics. Volume 789. Springer; Berlin/Heidelberg, Germany: 2009.
5. Barish B.C., Weiss R. LIGO and the Detection of Gravitational Waves. Phys. Today. 1999;52:44–50. doi: 10.1063/1.882861.
6. Pitkin M., Reid S., Rowan S., Hough J. Gravitational Wave Detection by Interferometry (Ground and Space). Living Rev. Relativ. 2011;14:5. doi: 10.12942/lrr-2011-5.
7. Helstrom C.W. Quantum detection and estimation theory. J. Stat. Phys. 1969;1:231. doi: 10.1007/BF01007479.
8. Holevo A.S. Probabilistic and Statistical Aspects of Quantum Theory. North-Holland Publishing Company; Amsterdam, The Netherlands: 1982.
9. Ludlow A.D., Boyd M.M., Ye J., Peik E., Schmidt P.O. Optical atomic clocks. Rev. Mod. Phys. 2015;87:637–701. doi: 10.1103/RevModPhys.87.637.
10. Schnabel R., Mavalvala N., McClelland D.E., Lam P.K. Quantum metrology for gravitational wave astronomy. Nat. Commun. 2010;1:121. doi: 10.1038/ncomms1122.
11. Aasi J., Abadie J., Abbott B.P., Abbott R., Abbott T.D., Abernathy M.R., Adams C., Adams T., Addesso P., Adhikari R.X., et al. Enhanced sensitivity of the LIGO gravitational wave detector by using squeezed states of light. Nat. Photon. 2013;7:613–619. doi: 10.1038/nphoton.2013.177.
12. Cronin A.D., Schmiedmayer J., Pritchard D.E. Optics and interferometry with atoms and molecules. Rev. Mod. Phys. 2009;81:1051–1129. doi: 10.1103/RevModPhys.81.1051.
13. Caves C.M. Quantum-mechanical noise in an interferometer. Phys. Rev. D. 1981;23:1693–1708. doi: 10.1103/PhysRevD.23.1693.
14. Giovannetti V., Lloyd S., Maccone L. Quantum metrology. Phys. Rev. Lett. 2006;96:010401. doi: 10.1103/PhysRevLett.96.010401.
15. Pezzè L., Smerzi A. Entanglement, nonlinear dynamics, and the Heisenberg limit. Phys. Rev. Lett. 2009;102:100401. doi: 10.1103/PhysRevLett.102.100401.
16. Hyllus P., Laskowski W., Krischek R., Schwemmer C., Wieczorek W., Weinfurter H., Pezzè L., Smerzi A. Fisher information and multiparticle entanglement. Phys. Rev. A. 2012;85:022321. doi: 10.1103/PhysRevA.85.022321.
17. Tóth G. Multipartite entanglement and high-precision metrology. Phys. Rev. A. 2012;85:022322. doi: 10.1103/PhysRevA.85.022322.
18. Pezzè L., Smerzi A. Quantum theory of phase estimation. In: Tino G.M., Kasevich M.A., editors. Atom Interferometry, Proceedings of the International School of Physics "Enrico Fermi", Italy, 15–20 July 2013. IOS Press; 2014. Course 188, p. 691.
19. Tóth G., Apellaniz I. Quantum metrology from a quantum information science perspective. J. Phys. A Math. Theor. 2014;47:424006. doi: 10.1088/1751-8113/47/42/424006.
20. Giovannetti V., Lloyd S., Maccone L. Advances in quantum metrology. Nat. Photon. 2011;5:222–229. doi: 10.1038/nphoton.2011.35.
21. Pezzè L., Smerzi A., Oberthaler M.K., Schmied R., Treutlein P. Quantum metrology with nonclassical states of atomic ensembles. Rev. Mod. Phys. 2018, in press.
22. Kay S.M. Fundamentals of Statistical Signal Processing: Estimation Theory, Volume I. Prentice Hall; Upper Saddle River, NJ, USA: 1993.
23. Lehmann E.L., Casella G. Theory of Point Estimation. Springer; Berlin, Germany: 1998.
24. Van Trees H.L., Bell K.L. Bayesian Bounds for Parameter Estimation and Nonlinear Filtering/Tracking. Wiley; New York, NY, USA: 2007.
25. Lane A.S., Braunstein S.L., Caves C.M. Maximum-likelihood statistics of multiple quantum phase measurements. Phys. Rev. A. 1993;47:1667. doi: 10.1103/PhysRevA.47.1667.
26. Tsang M. Ziv–Zakai error bounds for quantum parameter estimation. Phys. Rev. Lett. 2012;108:230401. doi: 10.1103/PhysRevLett.108.230401.
27. Lu X.M., Tsang M. Quantum Weiss-Weinstein bounds for quantum metrology. Quantum Sci. Technol. 2016;1:015002. doi: 10.1088/2058-9565/1/1/015002.
28. Hall M.J., Wiseman H.M. Heisenberg-style bounds for arbitrary estimates of shift parameters including prior information. New J. Phys. 2012;14:033040. doi: 10.1088/1367-2630/14/3/033040.
29. Giovannetti V., Maccone L. Sub-Heisenberg estimation strategies are ineffective. Phys. Rev. Lett. 2012;108:210404. doi: 10.1103/PhysRevLett.108.210404.
30. Pezzè L. Sub-Heisenberg phase uncertainties. Phys. Rev. A. 2013;88:060101(R). doi: 10.1103/PhysRevA.88.060101.
31. Pezzè L., Hyllus P., Smerzi A. Phase-sensitivity bounds for two-mode interferometers. Phys. Rev. A. 2015;91:032103. doi: 10.1103/PhysRevA.91.032103.
32. Hradil Z., Myška R., Peřina J., Zawisky M., Hasegawa Y., Rauch H. Quantum phase in interferometry. Phys. Rev. Lett. 1996;76:4295. doi: 10.1103/PhysRevLett.76.4295.
33. Pezzè L., Smerzi A., Khoury G., Hodelin J.F., Bouwmeester D. Phase detection at the quantum limit with multiphoton Mach-Zehnder interferometry. Phys. Rev. Lett. 2007;99:223602. doi: 10.1103/PhysRevLett.99.223602.
34. Kacprowicz M., Demkowicz-Dobrzanski R., Wasilewski W., Banaszek K., Walmsley I.A. Experimental quantum-enhanced estimation of a lossy phase shift. Nat. Photon. 2010;4:357. doi: 10.1038/nphoton.2010.39.
35. Krischek R., Schwemmer C., Wieczorek W., Weinfurter H., Hyllus P., Pezzè L., Smerzi A. Useful multiparticle entanglement and sub-shot-noise sensitivity in experimental phase estimation. Phys. Rev. Lett. 2011;107:080504. doi: 10.1103/PhysRevLett.107.080504.
36. Xiang G.Y., Higgins B.L., Berry D.W., Wiseman H.M., Pryde G.J. Entanglement-enhanced measurement of a completely unknown optical phase. Nat. Photon. 2011;5:43–47. doi: 10.1038/nphoton.2010.268.
37. Bollinger J.J., Itano W.M., Wineland D.J., Heinzen D.J. Optimal frequency measurements with maximally correlated states. Phys. Rev. A. 1996;54:R4649–R4652. doi: 10.1103/PhysRevA.54.R4649.
38. Pezzè L., Smerzi A. Sub shot-noise interferometric phase sensitivity with beryllium ions Schrödinger cat states. Europhys. Lett. 2007;78:30004. doi: 10.1209/0295-5075/78/30004.
39. Gerry C.C., Mimih J. The parity operator in quantum optical metrology. Contemp. Phys. 2010;51:497. doi: 10.1080/00107514.2010.509995.
40. Sackett C.A., Kielpinski D., King B.E., Langer C., Meyer V., Myatt C.J., Rowe M., Turchette Q.A., Itano W.M., et al. Experimental entanglement of four particles. Nature. 2000;404:256–259. doi: 10.1038/35005011.
41. Monz T., Schindler P., Barreiro J.T., Chwalla M., Nigg D., Coish W., Harlander M., Hänsel W., Hennrich M., Blatt R. 14-Qubit Entanglement: Creation and Coherence. Phys. Rev. Lett. 2011;106:130506. doi: 10.1103/PhysRevLett.106.130506.
42. Hayashi M. Asymptotic Theory of Quantum Statistical Inference: Selected Papers. World Scientific Publishing; Singapore: 2005.
43. Barankin E.W. Locally best unbiased estimates. Ann. Math. Stat. 1949;20:477. doi: 10.1214/aoms/1177729943.
44. Mcaulay R.J., Hofstetter E.M. Barankin bounds on parameter estimation. IEEE Trans. Inf. Theory. 1971;17:669–676. doi: 10.1109/TIT.1971.1054719.
45. Cramér H. Mathematical Methods of Statistics. Princeton University Press; Princeton, NJ, USA: 1946.
46. Rao C.R. Information and the accuracy attainable in the estimation of statistical parameters. Bull. Calcutta Math. Soc. 1945;37:81–91.
47. Hammersley J.M. On estimating restricted parameters. J. R. Stat. Soc. Ser. B. 1950;12:192.
48. Chapman D.G., Robbins H. Minimum variance estimation without regularity assumptions. Ann. Math. Stat. 1951;22:581. doi: 10.1214/aoms/1177729548.
49. Pfanzagl J., Hamböker R. Parametric Statistical Theory. De Gruyter; Berlin, Germany: 1994.
50. Sivia D.S., Skilling J. Data Analysis: A Bayesian Tutorial. Oxford University Press; London, UK: 2006.
51. Robert C.P. The Bayesian Choice: From Decision-Theoretic Foundations to Computational Implementation. Springer; New York, NY, USA: 2007.
52. Jeffreys H. An invariant form for the prior probability in estimation problems. Proc. R. Soc. Lond. A. 1946;186:453. doi: 10.1098/rspa.1946.0056.
53. Jeffreys H. Theory of Probability. Oxford University Press; London, UK: 1961.
54. Ghosh M. Cramér–Rao bounds for posterior variances. Stat. Probabil. Lett. 1993;17:173. doi: 10.1016/0167-7152(93)90164-E.
55. Le Cam L. Asymptotic Methods in Statistical Decision Theory. Springer; New York, NY, USA: 1986.
56. Van Trees H.L. Detection, Estimation, and Modulation Theory, Part I. Wiley; New York, NY, USA: 1968.
57. Schutzenberger M.P. A generalization of the Fréchet-Cramér inequality to the case of Bayes estimation. Bull. Am. Math. Soc. 1957;63:142.
58. Ziv J., Zakai M. Some lower bounds on signal parameter estimation. IEEE Trans. Inf. Theory. 1969;15:386–391. doi: 10.1109/TIT.1969.1054301.
59. Bell K.L., Steinberg Y., Ephraim Y., Van Trees H.L. Extended Ziv–Zakai lower bound for vector parameter estimation. IEEE Trans. Inf. Theory. 1997;43:624–637. doi: 10.1109/18.556118.
60. Gessner M., Smerzi A. Statistical speed of quantum states: Generalized quantum Fisher information and Schatten speed. Phys. Rev. A. 2018;97:022109. doi: 10.1103/PhysRevA.97.022109.
61. Bellini S., Tartara G. Bounds on error in signal parameter estimation. IEEE Trans. Commun. 1974;22:340–342. doi: 10.1109/TCOM.1974.1092192.
