Generalised Entropy Accumulation

Tony Metger; Omar Fawzi; David Sutter; Renato Renner

doi:10.1007/s00220-024-05121-4

. 2024 Oct 12;405(11):261. doi: 10.1007/s00220-024-05121-4

Generalised Entropy Accumulation

Tony Metger ^1,^✉, Omar Fawzi ², David Sutter ³, Renato Renner ¹

PMCID: PMC11470903 PMID: 39403569

Abstract

Consider a sequential process in which each step outputs a system $A_{i}$ and updates a side information register E. We prove that if this process satisfies a natural “non-signalling” condition between past outputs and future side information, the min-entropy of the outputs $A_{1}, \dots, A_{n}$ conditioned on the side information E at the end of the process can be bounded from below by a sum of von Neumann entropies associated with the individual steps. This is a generalisation of the entropy accumulation theorem (EAT) (Dupuis et al. in Commun Math Phys 379: 867–913, 2020), which deals with a more restrictive model of side information: there, past side information cannot be updated in subsequent rounds, and newly generated side information has to satisfy a Markov condition. Due to its more general model of side-information, our generalised EAT can be applied more easily and to a broader range of cryptographic protocols. As examples, we give the first multi-round security proof for blind randomness expansion and a simplified analysis of the E91 QKD protocol. The proof of our generalised EAT relies on a new variant of Uhlmann’s theorem and new chain rules for the Rényi divergence and entropy, which might be of independent interest.

Introduction

Suppose that Alice and Eve share a quantum state $ρ_{A^{n} E}$ . From her systems $A^{n} : = A_{1} \dots A_{n}$ , Alice would like to extract bits that look uniformly random to Eve, except with some small failure probability $ε$ [1]. The number of such random bits that Alice can extract is given by the smooth min-entropy $H_{min}^{ε} {(A^{n} | E)}_{ρ}$ [2]. This quantity plays a central role in quantum cryptography: for example, the main task in security proofs of quantum key distribution (QKD) protocols is usually finding a lower bound for the smooth min-entropy.

Unfortunately, for many cryptographic protocols deriving such a bound is challenging. Intuitively, the reason is the following: the state $ρ_{A^{n} E}$ is usually created as the output of a multi-round protocol, where each round produces one of Alice’s systems $A_{i}$ and allows Eve to execute some attack to gain information about $A_{1}, \dots, A_{i}$ . These attacks can depend on each other, i.e., Eve may use what she learnt in round $i - 1$ to plan her attack in round i. This non-i.i.d. nature of the attacks makes it hard to find a lower bound on $H_{min}^{ε} {(A^{n} | E)}_{ρ}$ that holds for any possible attack that Eve can execute. In contrast, it is typically much easier to compute a conditional von Neumann entropy associated with a single-round of the protocol, where the non-i.i.d. nature of Eve’s attack plays no role. Therefore, it is desirable to relate the smooth min-entropy of the output of the multi-round protocol to the von Neumann entropies associated with the individual rounds.

From an information-theoretic point of view, this question can be phrased as follows: can the smooth min-entropy $H_{min}^{ε} {(A^{n} | E)}_{ρ}$ be bounded from below in terms of von Neumann entropies $H {(A_{i} | E_{i})}_{ρ_{A_{i} E_{i}}^{i}}$ for some (yet to be determined) systems $E_{i}$ and states $ρ_{A_{i} E_{i}}^{i}$ related to $ρ$ ? While for general states $ρ_{A^{n} E}$ no useful lower bound can be found, previous works have established such bounds under additional assumptions on the state $ρ_{A^{n} E}$ .

The first bound of this form was proven via the asymptotic equipartition property (AEP) [3]. It assumes that the system E is n-partite (i.e., we replace E by $E^{n} = E_{1} \dots E_{n}$ ) and that the state $ρ_{A^{n} E^{n}} = ρ_{A_{1} E_{1}} \otimes \dots \otimes ρ_{A_{n} E_{n}}$ is a product of identical states. Then, the AEP shows that1

\begin{matrix} H_{min}^{ε} {(A^{n} | E^{n})}_{ρ} \geq \sum_{i = 1}^{n} H {(A_{i} | E_{i})}_{ρ} - O (\sqrt{n}) . \end{matrix}

For applications in cryptography, the assumption that $ρ$ is an i.i.d. product state is usually too strong: it corresponds to the (unrealistic) assumption that Eve executes the same independent attack in each round, a so-called collective attack.

The entropy accumulation theorem (EAT) [1] is a generalisation of the AEP which requires far weaker assumptions on the state $ρ_{A^{n} E}$ . Specifically, the EAT considers states that result from a sequential process that starts with a state $ρ_{R_{0} E^{'}}^{0}$ and in every step outputs a system $A_{i}$ and a piece of side information $I_{i}$ . The system $E^{'}$ is not acted upon during the process. The full side information at the end of this process is $E = I_{1} \dots I_{n} E^{'}$ . We can represent such a process by the following diagram, where $M_{i}$ are quantum channels. graphic file with name 220_2024_5121_Figa_HTML.jpg The EAT requires an additional condition on the side information: the new side information $I_{i}$ generated in round i must be independent from the past outputs $A^{i - 1}$ conditioned on the existing side information $I^{i - 1} E^{'}$ . Mathematically, this is captured by the condition that the systems $A^{i - 1} \leftrightarrow I^{i - 1} E^{'} \leftrightarrow I_{i}$ form a Markov chain for any initial state $ρ_{R_{0} E^{'}}^{0}$ . With this Markov condition, the EAT states that2

\begin{matrix} H_{min}^{ε} {(A^{n} | I^{n} E^{'})}_{M_{n} \circ \dots \circ M_{1} (ρ_{R_{0} E^{'}}^{0})} \geq \sum_{i = 1}^{n} inf_{ω} H {(A_{i} | I_{i} \tilde{E})}_{M_{i} (ω)} - O (\sqrt{n}), \end{matrix}

1.1

where $\tilde{E}$ is a purifying system isomorphic to $R_{i - 1}$ and the infimum is taken over all states $ω$ on systems $R_{i - 1} \tilde{E}$ .3

Let us discuss the model of side information used by the EAT in more detail. The EAT considers side information consisting of two parts: the initial side information $E^{'}$ (which is not acted upon during the process) and the outputs $I^{n} = I_{1} \dots I_{n}$ . This splitting of side information into a “static” part $E^{'}$ and a part $I^{n}$ which is generated in each step of the process is particularly suited to device-independent cryptography: there, Eve prepares a device in an initial state $ρ_{R_{0} E^{'}}^{0}$ , where $R_{0}$ is the device’s internal memory and $E^{'}$ is Eve’s initial side information from preparing the device. Then, Alice (and Bob, though we only consider Alice’s system here) executes a multi-round protocol with this device, where each round leaks some additional piece of information $I_{i}$ to Eve, so that Eve’s side information at the end of the protocol is $I^{n} E^{'}$ . Indeed, the EAT has been used to establish tight security proofs in the device-independent setting, see e.g., [4, 5].

The Markov condition in the EAT captures the following intuition: if we want to find a bound on $H_{min}^{ε} (A^{n} | I^{n} E^{'})$ in terms of single-round quantities, it is required that side information about $A_{i}$ is itself output in step i, as otherwise we cannot hope to estimate the contribution to the total entropy from step i. To illustrate what could happen without such a condition, consider a case where $A_{i}$ is classical and no side information is output in the first $n - 1$ rounds, but the side information $I_{n}$ in the last round contains a copy of the systems $A^{n}$ (which can be passed along during the process in the systems $R_{i}$ ). Then, clearly $H_{min}^{ε} (A^{n} | I^{n} E^{'}) = 0$ , but for the first $n - 1$ rounds, each single-round entropy bound that only considers the systems $A_{i}$ and $I_{i}$ can be positive.

Main result In this work, we further relax the assumptions on how the final state $ρ_{A^{n} E}$ is generated. Specifically, we consider sequential processes as in the EAT, but with a fully general model of side information, i.e., the side information can be updated in each step in the process. Diagrammatically, such a process can be represented as follows:

Our generalised EAT then states the following.

Theorem 1.1

Consider quantum channels $M_{i} : R_{i - 1} E_{i - 1} \to A_{i} R_{i} E_{i}$ that satisfy the following “non-signalling” condition (discussed in detail below): for each $M_{i}$ , there must exist a quantum channel $R_{i} : E_{i - 1} \to E_{i}$ such that

\begin{matrix} {Tr}_{A_{i} R_{i}} \circ M_{i} = R_{i} \circ {Tr}_{R_{i - 1}} . \end{matrix}

1.2

Then, the min-entropy of the outputs $A^{n}$ conditioned on the final side information $E_{n}$ can be bounded as

\begin{matrix} H_{min}^{ε} {(A^{n} | E_{n})}_{M_{n} \circ \dots \circ M_{1} (ρ_{R_{0} E_{0}}^{0})} \geq \sum_{i = 1}^{n} inf_{ω} H {(A_{i} | E_{i} {\tilde{E}}_{i - 1})}_{M_{i} (ω)} - O (\sqrt{n}), \end{matrix}

1.3

where ${\tilde{E}}_{i - 1} \equiv R_{i - 1} E_{i - 1}$ is a purifying system for the input to $M_{i}$ and the infimum is taken over all states $ω$ on systems $R_{i - 1} E_{i - 1} {\tilde{E}}_{i - 1}$ .4

We give a formal statement and proof in Sect. 4 and also show that, similarly to the EAT, statistics collected during the process can be used to restrict the minimization over $ω$ (see Theorem 4.3 for the formal statement). By a simple duality argument, Eq. (1.3) also implies an upper bound on the smooth max-entropy $H_{max}$ , which we explain in Appendix A. This generalises a similar result from [1], although in [1] one could not make use of duality due to the Markov condition and instead had to prove the statement about $H_{max}$ separately, again highlighting that our generalised EAT is easier to work with.

The intuition behind the non-signalling condition in our generalised EAT is similar to the Markov condition in the original EAT: by the same reasoning as for the Markov condition, since the lower bound is made up of terms of the form $H {(A_{i} | E_{i} {\tilde{E}}_{i - 1})}_{M_{i} (ω)}$ , it is required that side information about $A_{i}$ that is present in the final system $E_{n}$ is already present in $E_{i}$ . This means that side information about $A_{i}$ should not be passed on via the R-systems and later be included in the E-systems. The non-signalling condition captures this requirement: it demands that if one only considers the marginal of the new side information $E_{i}$ (without the new output $A_{i}$ ), it must be possible to generate this state from the past side information $E_{i - 1}$ alone, without access to the system $R_{i - 1}$ . This means that any side information that $E_{i}$ contains about the past outputs $A_{1} \dots A_{i - 1}$ must have essentially already been present in $E_{i - 1}$ and could not have been stored in $R_{i - 1}$ .

The name “non-signalling condition” is due to the fact that Eq. (1.2) is a natural generalisation of the standard non-signalling conditions in non-local games: if we view the systems $R_{i - 1}$ and $R_{i} A_{i}$ as the inputs and outputs on “Alice’s side” of $M_{i}$ , and $E_{i - 1}$ and $E_{i}$ as the inputs and outputs on “Eve’s side”, then Eq. (1.2) states that the marginal of the output on Eve’s side cannot depend on the input on Alice’s side. This is exactly the non-signalling condition in non-local games, except that here the inputs and outputs are allowed to be fully quantum.

To understand the relation between the Markov and non-signalling conditions, it is instructive to consider the setting of the original EAT as a special case of our generalised EAT. In the original EAT, the full side information available after step i is $E^{'} I^{i}$ , and past side information is not updated during the process. For our generalised EAT, we therefore set $E_{i} = E^{'} I^{i}$ and consider maps $M_{i} = M_{i}^{'} \otimes {id}_{E_{i - 1}}$ , where $M_{i}^{'} : R_{i - 1} \to A_{i} I_{i} R_{i}$ is the map used in the original EAT. We need to check that with this choice of systems and maps, the Markov condition of the original EAT implies the non-signalling condition of our generalised EAT. The Markov condition requires that for any state input $ω_{A^{i - 1} I^{i - 1} R_{i - 1} E^{'}}^{i - 1}$ , the output state $ω_{A^{i} I^{i} R_{i} E^{'}}^{i} = M_{i} (ω^{i - 1})$ satisfies $A^{i - 1} \leftrightarrow I^{i - 1} E^{'} \leftrightarrow I_{i}$ .5 It is then a standard result on quantum Markov chains [6] that there must exist a quantum channel $R_{i} : I^{i - 1} E^{'} \to I^{i} E^{'}$ such that $ω_{I^{i} E^{'}}^{i} = R_{i} (ω_{I^{i - 1} E^{'}}^{i - 1})$ . Remembering that we defined $E_{i} = E^{'} I^{i}$ (so that $R_{i} : E_{i - 1} \to E_{i}$ ) and adding the systems $A^{i - 1}$ (on which both $M_{i}$ and $R_{i}$ act as identity), we find that $M_{i}$ satisfies the non-signalling condition:

\begin{matrix} {Tr}_{A_{i} R_{i}} \circ M_{i} (ω_{A^{i - 1} R_{i - 1} E_{i - 1}}^{i - 1}) = ω_{A^{i - 1} E_{i}}^{i} = R_{i} (ω_{A^{i - 1} E_{i - 1}}^{i - 1}) = R_{i} \circ {Tr}_{R_{i - 1}} (ω_{A^{i - 1} R_{i - 1} E_{i - 1}}^{i - 1}) . \end{matrix}

Then, noting that all conditioning systems on which $M_{i}$ acts as the identity map can collectively be replaced by a single purifying system isomorphic to the input, we see that we recover the original EAT (Eq. (1.1)) from our generalised EAT (Eq. (1.3)).

We emphasise that while the original EAT with the Markov condition can be recovered as a special case, our model of side information and the non-signalling condition are much more general than the original EAT; arguably, for a sequential process they are the most natural and general way of expressing the notion that future side information should not contain new information about past outputs, which appears to be necessary for an EAT-like result. To demonstrate the greater generality of our result, in Sect. 5 we use it to give the first multi-round proof for blind randomness expansion, a task to which the original EAT could not be applied, and a more direct proof of the E91 QKD protocol than was possible with the original EAT. Our generalised EAT can also be used to prove security of a much larger class of QKD protocols than the original EAT. Interestingly, for (device-dependent) QKD protocols, no “hidden system” R is needed and therefore the non-signalling condition is trivially satisfied, i.e., the advantage of our generalised EAT for QKD security proofs stems entirely from the more general model of side information, not from replacing the Markov condition by the non-signalling condition; see Sect. 5.2 for an informal comparison of how the original and generalised EAT can be applied to QKD, and [7] for a detailed treatment of the application of our generalised EAT to QKD, including protocols to which the original EAT could not be applied.

Proof sketch. The generalised EAT involves both the min-entropy, which can be viewed as a “worst-case entropy”, and the von Neumann entropy, which can be viewed as an “average case entropy”. These two entropies are special cases of a more general family of entropies called Rényi entropies, which are denoted by $H_{α}$ for a parameter $α > 1$ (see Sect. 2.2 for a formal definition).6 The min-entropy can be obtained from the Rényi entropy by taking $α \to \infty$ , whereas the von Neumann entropy corresponds to the limit $α \to 1$ . Hence, the Rényi entropies interpolate between the min-entropy and the von Neumann entropy, and they will play a crucial role in our proof.

The key technical ingredient for our generalised EAT is a new chain rule for Rényi entropies (Theorem 3.6 in the main text).

Lemma 1.2

Let $α \in (1, 2)$ , $ρ_{ARE}$ a quantum state, and $M : R E \to A^{'} R^{'} E^{'}$ a quantum channel which satisfies the non-signalling condition in Eq. (1.2), i.e. there exists a channel $R : E \to E^{'}$ such that ${Tr}_{A^{'} R^{'}} \circ M = R \circ {Tr}_{R}$ . Then

\begin{matrix} H_{α} {(A A^{'} | E^{'})}_{M (ρ)} \geq H_{α} {(A | E)}_{ρ} + inf_{ω_{R E \tilde{E}}} H_{\frac{1}{2 - α}} {(A^{'} | E^{'} \tilde{E})}_{M (ω)} \end{matrix}

1.4

for a purifying system $\tilde{E} \equiv R E$ , where the infinimum is over all quantum states $ω$ on systems $R E \tilde{E}$ .

We first describe how this chain rule implies our generalised EAT, following the same idea as in [1, 8]. For this, recall that our goal is to find a lower bound on $H_{min}^{ε} {(A^{n} | E_{n})}_{M_{n} \circ \dots \circ M_{1} (ρ_{R_{0} E_{0}}^{0})}$ for a sequence of maps satisfying the non-signalling condition ${Tr}_{A_{i} R_{i}} \circ M_{i} = R_{i} \circ {Tr}_{R_{i - 1}}$ . As a first step, we use a known relation between the smooth min-entropy and the Rényi entropy [3], which (up to a small penalty term depending on $ε$ and $α$ ) reduces the problem to lower-bounding

\begin{matrix} H_{α} {(A^{n} | E_{n})}_{M_{n} \circ \dots \circ M_{1} (ρ_{R_{0} E_{0}}^{0})} = H_{α} {(A_{n} A^{n - 1} | E_{n})}_{M_{n} \circ \dots \circ M_{1} (ρ_{R_{0} E_{0}}^{0})} . \end{matrix}

To this, we can apply Lemma 1.2 by choosing $A = A^{n - 1}$ , $A^{'} = A_{n}$ , $E = E_{n - 1}$ , $E^{'} = E_{n}$ , $R = R_{n - 1}$ , $R^{'} = R_{n}$ , and $ρ = M_{n - 1} \circ \dots \circ M_{1} (ρ_{R_{0} E_{0}}^{0})$ . Then, since the map $M_{n}$ satisfies the non-signalling condition, Lemma 1.2 implies that

\begin{matrix} H_{α} {(A_{1}^{n} | E_{n})}_{M_{n} \circ \dots \circ M_{1} (ρ_{R_{0} E_{0}})} & \geq H_{α} {(A_{1}^{n - 1} | E_{n - 1})}_{M_{n - 1} \circ \dots \circ M_{1} (ρ_{R_{0} E_{0}})} \\ + inf_{ω \in S (R_{n - 1} E_{n - 1} {\tilde{E}}_{n - 1})} H_{\frac{1}{2 - α}} {(A_{n} | E_{n} {\tilde{E}}_{n - 1})}_{M_{n} (ω)} . \end{matrix}

We can now repeat this argument for the term $H_{α} {(A_{1}^{n - 1} | E_{n - 1})}_{M_{n - 1} \circ \dots \circ M_{1} (ρ_{R_{0} E_{0}})}$ . After n applications of Lemma 1.2, we find that

\begin{matrix} H_{α} {(A_{1}^{n} | E_{n})}_{M_{n} \circ \dots \circ M_{1} (ρ_{R_{0} E_{0}})} & \geq \sum_{i = 1}^{n} inf_{ω \in S (R_{i - 1} E_{i - 1} {\tilde{E}}_{i - 1})} H_{\frac{1}{2 - α}} {(A_{i} | E_{i} {\tilde{E}}_{i - 1})}_{M_{i} (ω)} . \end{matrix}

To conclude, we use a continuity bound from [8] to relate $H_{\frac{1}{2 - α}} {(A_{i} | E_{i} {\tilde{E}}_{i - 1})}_{M_{i} (ω)}$ to $H {(A_{i} | E_{i} {\tilde{E}}_{i - 1})}_{M_{i} (ω)}$ . It can be shown that for a suitable choice of $α$ , the penalty terms we incur by switching from the min-entropy to the Rényi entropy and then to the von Neumann entropy scale as $O (\sqrt{n})$ . Therefore, we obtain Eq. (1.3). We also provide a version that allows for “testing” (which is crucial for application in quantum cryptography and explained in detail in Sect. 4.2) and features explicit second-order terms similar to those in [8].

We now turn our attention to the proof of Lemma 1.2. For this, we need to introduce the (sandwiched) Rényi divergence of order $α$ between two (possibly unnormalised) quantum states $ρ$ and $σ$ , denoted by $D_{α} (ρ, ‖, σ)$ . We refer to Sect. 2.2 for a formal definition; for this overview, it suffices to know that $D_{α} (ρ, ‖, σ)$ is a measure of how different $ρ$ is from $σ$ , and that the conditional Rényi entropy is related to the Rényi divergence by

\begin{matrix} H_{α} {(A | B)}_{ρ} = - D_{α} (ρ_{AB} ‖ 1_{A} \otimes ρ_{B}) . \end{matrix}

Our starting point for proving Lemma 1.2 is the following chain rule for the Rényi divergence from [9]:

\begin{matrix} D_{α} (M (ρ) ‖ F (σ)) \leq D_{α} (ρ_{ARE}, ‖, σ_{ARE}) + lim_{n \to \infty} \frac{1}{n} sup_{ω_{R^{n} E^{n} {\tilde{E}}^{n}}} D_{α} (M^{\otimes n}, (ω), ‖, F^{\otimes n}, (ω)), \end{matrix}

1.5

where $M$ and $F$ are (not necessarily trace preserving) quantum channels from RE to $A^{'} R^{'} E^{'}$ , and $ρ$ and $σ$ are any quantum states on ARE. The optimization is over all quantum states $ω$ on n copies of the systems $R E \tilde{E}$ (with $\tilde{E} \equiv R E$ as before).

Making a suitable choice of $F$ (which depends on $M$ ) and $σ$ (which depends on $ρ$ ), one can turn Eq. (1.5) into the following chain rule for the conditional Rényi entropy:

\begin{matrix} H_{α} {(A A^{'} | E^{'})}_{M (ρ)} \geq H_{α} {(A | R E)}_{ρ} + lim_{n \to \infty} \frac{1}{n} inf_{ω_{R^{n} E^{n} {\tilde{E}}^{n}}} H_{α} {({(A^{'})}^{n} | {(E^{'})}^{n} {\tilde{E}}^{n})}_{M^{\otimes n} (ω)} . \end{matrix}

1.6

This chain rule resembles Lemma 1.2, but is significantly weaker and cannot be used to prove a useful entropy accumulation theorem. The reason for this is twofold:

(i)
Equation (1.6) provides a lower bound in terms of $H_{α} (A | R E)$ , not $H_{α} (A | E)$ . The additional conditioning on the R-system can drastically lower the entropy: for example, in a device-independent scenario, R would describe the internal memory of the device. Then, Alice’s output A contains no entropy when conditioned on the internal memory of the device that produced the output, i.e. $H_{α} (A | R E) = 0$ . On the other hand, Alice’s output conditioned only on Eve’s side information E may be quite large (and can usually be certified by playing a non-local game), i.e. $H_{α} (A | E) > 0$ .
(ii)
Equation (1.6) contains the regularised quantity ${lim}_{n \to \infty} \frac{1}{n} {inf}_{ω_{R^{n} E^{n} {\tilde{E}}^{n}}} H_{α} {({(A^{'})}^{n} | {(E^{'})}^{n} {\tilde{E}}^{n})}_{M^{\otimes n} (ω)}$ . Due to the limit $n \to \infty$ , this quantity cannot be computed numerically and therefore the bound in Eq. (1.6) cannot be evaluated for concrete examples.

We now describe how we overcome each of these issues in turn.

(i)
We prove a new variant of Uhlmann’s theorem [10], a foundational result in quantum information theory. The original version of Uhlmann’s theorem deals with the case of $α = 1 / 2$ ; we show that for $α > 1$ , a similar result holds, but an additional regularisation is required. Concretely, we prove that for any states $ρ_{ARE}$ and $σ_{AE}$ :
$\begin{matrix} lim_{k \to \infty} \frac{1}{k} inf_{\begin{matrix} {\hat{σ}}_{A^{k} R^{k} E^{k}} \\ s.t. {\hat{σ}}_{A^{k} E^{k}} = σ_{AE}^{\otimes k} \end{matrix}} D_{α} (ρ_{ARE}^{\otimes k}, ‖, {\hat{σ}}_{A^{k} R^{k} E^{k}}) = D_{α} (ρ_{AE}, ‖, σ_{AE}) . \end{matrix}$ 1.7
The proof of this result relies heavily on the spectral pinching technique [11, 12] and we refer to Lemma 3.3 for details as well as a non-asymptotic statement with explicit error bounds. We make use of this extended Uhlmann’s theorem as follows: for the case we are interested in, the map $F$ in Eq. (1.5) satisfies a non-signalling condition. We can show that this condition implies that for any state ${\hat{σ}}_{A^{k} R^{k} E^{k}} s.t. {\hat{σ}}_{A^{k} E^{k}} = σ_{AE}^{\otimes k}$ :
$\begin{matrix} D_{α} (M (ρ) ‖ F (σ)) = \frac{1}{k} D_{α} (M^{\otimes k}, (ρ_{ARE}^{\otimes k}), ‖, F^{\otimes k}, ({\hat{σ}}_{A^{k} R^{k} E^{k}})) . \end{matrix}$
Applying Eq. (1.5) to the r.h.s. of this equality results in a bound that contains $D_{α} (ρ_{ARE}^{\otimes k}, ‖, {\hat{σ}}_{A^{k} R^{k} E^{k}})$ . We can now minimise over all states ${\hat{σ}}_{A^{k} R^{k} E^{k}} s.t. {\hat{σ}}_{A^{k} E^{k}} = σ_{AE}^{\otimes k}$ and take the limit $k \to \infty$ . Then, Eq. (1.7) allows us to drop the R-system. Therefore, under the non-signalling condition on $F$ , we obtain the following improved chain rule for the sandwiched Rènyi divergence, which might be of independent interest:
$\begin{matrix} D_{α} (M (ρ) ‖ F (σ)) \leq D_{α} (ρ_{AE}, ‖, σ_{AE}) + lim_{n \to \infty} \frac{1}{n} sup_{ω_{R^{n} E^{n} {\tilde{E}}^{n}}} D_{α} (M^{\otimes n}, (ω), ‖, F^{\otimes n}, (ω)) . \end{matrix}$
Using this chain rule, we can show that Eq. (1.6) still holds if $H_{α} (A | R E)$ is replaced by $H_{α} (A | E)$ .
(ii)
To remove the need for a regularisation in Eq. (1.6), we show that due to the permutation-invariance of $M^{\otimes n}$ and $F^{\otimes n}$ , for $α > 1$ and $n \to \infty$ one can replace the optimization over $ω_{R^{n} E^{n} {\tilde{E}}^{n}}$ with a fixed input state, namely the projector onto the symmetric subspace of $R^{n} E^{n} {\tilde{E}}^{n}$ . For this replacement, one incurs a small loss in $α$ , replacing it by $\frac{1}{2 - α}$ (which is only slightly larger than $α$ in the typical regime where $α$ is close to 1). The projector onto the symmetric subspace has a known representation as a mixture of tensor product states [13]. Combining these two steps, we show that the optimization over $ω_{R^{n} E^{n} {\tilde{E}}^{n}}$ can be restricted to tensor product states, which means that the regularisation in Eq. (1.6) can be removed (see Sect. 3.2 for details):
$\begin{matrix} lim_{n \to \infty} \frac{1}{n} inf_{ω_{R^{n} E^{n} {\tilde{E}}^{n}}} H_{α} {({(A^{'})}^{n} | {(E^{'})}^{n} {\tilde{E}}^{n})}_{M^{\otimes n} (ω)} \geq inf_{ω_{R E \tilde{E}}} H_{\frac{1}{2 - α}} {(A^{'} | E^{'} \tilde{E})}_{M (ω)} . \end{matrix}$

Combining these results yields Lemma 1.2 and, as a result, our generalised EAT.

Sample application: blind randomness expansion. The main advantage of the generalised EAT over previous results is its broader applicability. For example, as demonstrated in [7], the generalised EAT can be used to prove the security of prepare-and-measure QKD protocols, which is of immediate practical relevance, and can also simplify the analysis of entanglement-based QKD protocols as discussed in Sect. 5.2. Here, we focus on the application of our generalised EAT to mistrustful device-independent (DI) cryptography. In mistrustful DI cryptography, multiple parties each use a quantum device to execute a protocol with one another. Each party trusts neither its quantum device nor the other parties in the protocol. Hence, from the point of view of one party, say Alice, all the remaining parties in the protocol are collectively treated as an adversary Eve, who may also have prepared Alice’s untrusted device.

While the original EAT could be used to analyse DI protocols in which the parties trust each other, e.g. DIQKD [14], the setting of mistrustful DI cryptography is significantly harder to analyse because the adversary Eve actively participates in the protocol and may update her side information during the protocol in arbitrary ways. Analysing such protocols requires the more general model of side information we deal with in this paper. As a concrete example for mistrustful DI cryptography, we consider blind randomness expansion, a primitive introduced in [15]. Previous work [15, 16] could only analyse blind randomness expansion under the i.i.d. assumption. Here, we give the first proof that blind randomness expansion is possible for general adversaries. The proof is a straightforward application of our generalised EAT and briefly sketched below; we refer to Sect. 5.1 for a detailed treatment.

In blind randomness expansion, Alice receives an untrusted quantum device from the adversary Eve. Alice then plays a non-local game, e.g. the CHSH game, with this device and Eve, and wants to extract certified randomness from her outputs of the non-local game, i.e. we need to show that Alice’s outputs contain a certain amount of min-entropy conditioned on Eve’s side information. Concretely, in each round of the protocol Alice samples inputs x and y for the non-local game, inputs x into her device to receive outcome a, and sends y to Eve to receive outcome b; Alice then checks whether (x, y, a, b) satisfies the winning condition of the non-local game. For comparison, recall that in standard DI randomness expansion [17–21], Alice receives two devices from Eve and uses them to play the non-local game. This means that in standard DI randomness expansion, Eve never learns any of the inputs and outputs of the game. In contrast, in blind randomness expansion Eve learns one of the inputs, y, and is free to choose one of the outputs, b, herself. Hence, Eve can choose the output b based on past side information and update her side information in each round of the protocol using the values of y and b.

To analyse such a protocol, we use the setting of Theorem 1.1, with $A_{i}$ representing the output of Alice’s device D from the non-local game in the i-th round, $R_{i}$ the internal memory of D after the i-th round, and $E_{i}$ Eve’s side information after the i-th round, which can be generated arbitrarily from entanglement shared between Eve and D at the start of the protocol and information Eve gathered during the first i rounds of the protocol. The map $M_{i}$ describes one round of the protocol, and because Alice’s device and Eve cannot communicate during the protocol it is easy to show that the non-signalling condition from Theorem 1.1 is satisfied. Therefore, we can apply Theorem 1.1 to lower-bound Alice’s conditional min-entropy $H_{min} (A^{n} | E_{n})$ in terms of the single-round quantities ${inf}_{ω} H {(A_{i} | E_{i} {\tilde{E}}_{i - 1})}_{M_{i} (ω)}$ .7 This single-round quantity corresponds to the i.i.d. scenario, i.e. the generalised EAT has reduced the problem of showing blind randomness expansion against general adversaries to the (much simpler) problem of showing it against i.i.d. adversaries. The quantity ${inf}_{ω} H {(A_{i} | E_{i} {\tilde{E}}_{i - 1})}_{M_{i} (ω)}$ can be computed using a general numerical technique [22], and for certain classes of non-local games it may also be possible to find an analytical lower bound using ideas from [15, 16]. Inserting the single-round bound, we obtain a lower bound on $H_{min} (A^{n} | E_{n})$ that scales linearly with n, showing that blind randomness expansion is possible against general adversaries. We also note that as explained in [15], this result immediately implies that unbounded randomness expansion is possible with only three devices, whereas previous works required four devices [21, 23, 24].

Future work In this work, we have developed a new information-theoretic tool, the generalised EAT. The generalised EAT deals with a more general model of side information than previous techniques and is therefore more broadly and easily applicable. In particular, our generalised EAT can be used to analyse mistrustful DI cryptography. We have demonstrated this by giving the first proof of blind randomness expansion against general adversaries. We expect that the generalised EAT could similarly be used for other protocols such as two-party cryptography in the noisy storage model [25] or certified deletion [16, 26, 27]. In addition to mistrustful DI cryptography, our result can also be used to give new proofs for device-dependent QKD, as demonstrated in Sect. 5.2 and [7], and is applicable to proving the security of commercial quantum random number generators, which typically have correlations between rounds due to experimental imperfections [28].

Beyond cryptography, the generalised EAT is useful whenever one is interested in bounding the min-entropy of a large system that can be decomposed in a sequential way. Such problems are abundant in physics. For example, the dynamics of an open quantum system can be described in terms of interactions that take place sequentially with different parts of the system’s environment [29]. In quantum thermodynamics, such a description is commonly employed to model the thermalisation of a system that is brought in contact with a thermal bath. For a lack of techniques, the entropy flow during a thermalisation process of this type is usually quantified in terms of von Neumann entropy rather than the operationally more relevant smooth min- and max-entropies [30]. The generalised EAT may be used to remedy this situation. A similar situation arises in quantum gravity, where smooth entropies play a role in the study of black holes [31].

In a different direction, one can also try to further improve the generalised EAT itself. Compared to the original EAT [1], our generalised EAT features a more general model of side information and a weaker condition on the relation between different rounds, replacing the Markov condition of [1] with our weaker non-signalling condition in Eq. (1.2). It is natural to ask whether a further step in this direction is possible: while the model of side information we consider is fully general, it may be possible to replace the non-signalling condition with a weaker requirement. We have argued above that our non-signalling condition appears to be the most general way of stating the requirement that future side information does not reveal information about past outputs, which seems necessary for an EAT-like theorem.8 It would be interesting to formalise this intuition and see whether our theorem is provably “tight” in terms of the conditions placed on the sequential process. Furthermore, it might be possible to improve the way the statistical condition in Theorem 4.3 is dealt with in the proof, e.g. using ideas from [33, 34].

Finally, one could attempt to extend entropy accumulation from conditional entropies to relative entropies. Such a relative entropy accumulation theorem (REAT) would be the following statement: for two sequences of channels ${E_{1}, \dots, E_{n}}$ and ${F_{1}, \dots, F_{n}}$ (where $F_{i}$ need not necessarily be trace-preserving), and $ε > 0$ ,

\begin{matrix} D_{max}^{ε} (E_{n} \circ \dots \circ E_{1} ‖ F_{n} \circ \dots \circ F_{1}) \overset{?}{\leq} \sum_{i = 1}^{n} D^{reg} (E_{i}, ‖, F_{i}) + O (\sqrt{n}) . \end{matrix}

Here, $D_{max}^{ε}$ is the $ε$ -smooth max-relative entropy [11] and we used the (regularised) channel divergences defined in Definition 2.5. The key technical challenge in proving this result is to show that the regularised channel divergence $D_{α}^{reg} (E_{i}, ‖, F_{i})$ is continuous in $α$ at $α = 1$ , which is an important technical open question. If one had such a continuity statement and the maps $F_{i}$ additionally satisfied a non-signalling condition (which is not required for the statement above), one could also use our Theorem 3.1 to derive a more general REAT, which would imply our generalised EAT.

Preliminaries

Notation

Throughout this paper, we restrict ourselves to finite-dimensional Hilbert spaces. The set of positive semidefinite operators on a quantum system A (with associated Hilbert space $H_{A}$ ) is denoted by $Pos (A)$ . The set of quantum states is given by $S (A) = {ρ \in Pos (A) | Tr [ρ] = 1}$ . The set of completely positive maps from linear operators on A to linear operators on $A^{'}$ is denoted by $CP (A, A^{'})$ . If such a map is additionally trace preserving, we call it a quantum channel and denote the set of such maps by $CPTP (A, A^{'})$ . The identity channel on system A is denoted as ${id}_{A}$ . The spectral norm is denoted by ${∥\cdot∥}_{\infty}$ .

A cq-state is a quantum state $ρ \in S (X A)$ on a classical system X (with alphabet $X$ ) and a quantum system A, i.e. a state that can be written as

\begin{matrix} ρ_{XA} = \sum_{x \in X} | x ⟩ ⟨ x | \otimes ρ_{A, x} \end{matrix}

for subnormalised $ρ_{A, x} \in Pos (A)$ . For $Ω \subset X$ , we define the conditional state

\begin{matrix} ρ_{X A | Ω} = \frac{1}{{Pr}_{ρ} [Ω]} \sum_{x \in Ω} | x ⟩ ⟨ x | \otimes ρ_{A, x}, where {Pr}_{ρ} [Ω] : = \sum_{x \in Ω} Tr [ρ_{A, x}] . \end{matrix}

If $Ω = {x}$ , we also write $ρ_{X A | x}$ for $ρ_{X A | Ω}$ .

Rényi divergence and entropy

We will make extensive use of the sandwiched Rényi divergence [35, 36] and quantities associated with it, namely Rényi entropies and channel divergences. We recall the relevant definitions here.

Definition 2.1

(Rényi divergence). For $ρ \in S (A)$ , $σ \in Pos (A)$ , and $α \in [1 / 2, 1) \cup (1, \infty)$ the (sandwiched) Rényi divergence is defined as

\begin{matrix} D_{α} (ρ, ‖, σ) : = \frac{1}{α - 1} log Tr [(, σ^{\frac{1 - α}{2 α}}, ρ, σ^{\frac{1 - α}{2 α}}, )^{α}] \end{matrix}

for $supp (ρ) \subseteq supp (σ)$ , and $+ \infty$ otherwise.

From the Rényi divergence, one can define the conditional Rényi entropies as follows (see [11] for more details).

Definition 2.2

(Conditional Rényi entropy). For a bipartite state $ρ_{AB} \in S (A B)$ and $α \in [1 / 2, 1) \cup (1, \infty)$ , we define the following two conditional Rényi entropies:

From the definition it is clear that Inline graphic . Importantly, a relation for the other direction also holds.

Lemma 2.3

([11, Corollary 5.3]). For $ρ_{AB} \in S (A B)$ and $α \in (1, 2)$ :

In the limit $α \to 1$ the sandwiched Rényi divergence converges to the relative entropy:

\begin{matrix} lim_{α \to 1} D_{α} (ρ, ‖, σ) = D (ρ ‖ σ) = Tr [ρ (log ρ - log σ)] . \end{matrix}

Accordingly, the conditional Rényi entropy converges to the conditional von Neumann entropy:

\begin{matrix} lim_{α \to 1} H_{α} {(A | B)}_{ρ} = H {(A | B)}_{ρ} = H {(A B)}_{ρ} - H {(B)}_{ρ} = - Tr [ρ_{AB} log ρ_{AB}] + Tr [ρ_{B} log ρ_{B}] . \end{matrix}

Conversely, in the limit $α \to \infty$ , the Rényi entropy Inline graphic converges to the min-entropy. We will make use of a smoothed version of the min-entropy, which is defined as follows [2].

Definition 2.4

(Smoothed min-entropy). For $ρ_{AB} \in S (A B)$ and $ε \in [0, 1]$ , the $ε$ -smoothed min-entropy of A conditioned on B is

\begin{matrix} H_{min}^{ε} {(A | B)}_{ρ} = - log inf_{{\tilde{ρ}}_{AB} \in B_{ε} (ρ_{AB})} inf_{σ_{B} \in S (B)} {∥σ_{B}^{- \frac{1}{2}}, {\tilde{ρ}}_{AB}, σ_{B}^{- \frac{1}{2}}∥}_{\infty}, \end{matrix}

where ${∥\cdot∥}_{\infty}$ denotes the spectral norm and $B_{ε} (ρ_{AB})$ is the $ε$ -ball around $ρ_{AB}$ in term of the purified distance [11].

Finally, we can extend the definition of the Rényi divergence from states to channels. The resulting quantity, the channel divergence (and its regularised version), will play an important role in the rest of the manuscript.

Definition 2.5

(Channel divergence). For $E \in CPTP (A, A^{'})$ , $F \in CP (A, A^{'})$ , and $α \in [1 / 2, 1) \cup (1, \infty)$ , the (stabilised) channel divergence9 is defined as

\begin{matrix} D_{α} (E, ‖, F) = sup_{ω \in S (A \tilde{A})} D_{α} (E (ω) ‖ F (ω)), \end{matrix}

2.1

where without loss of generality $\tilde{A} \equiv A$ . The regularised channel divergence is defined as

\begin{matrix} D_{α}^{reg} (E, ‖, F) : = lim_{n \to \infty} \frac{1}{n} D_{α} (E^{\otimes n}, ‖, F^{\otimes n}) = sup_{n} \frac{1}{n} D_{α} (E^{\otimes n}, ‖, F^{\otimes n}) . \end{matrix}

We note that the channel divergence is in general not additive under the tensor product [37, Proposition 3.1], so the regularised channel divergence can be strictly larger that the non-regularised one, i.e., $D_{α}^{reg} (E, ‖, F) > D_{α} (E, ‖, F)$ . The regularised channel divergence, however, does satisfy an additivity property:

\begin{matrix} D_{α}^{reg} (E^{\otimes k}, ‖, F^{\otimes k}) \\ = lim_{n \to \infty} \frac{1}{n} D_{α} (E^{\otimes k n}, ‖, F^{\otimes k n}) = k lim_{n \to \infty} \frac{1}{n^{'}} D_{α} (E^{\otimes n^{'}}, ‖, F^{\otimes n^{'}}) = k D_{α}^{reg} (E, ‖, F), \end{matrix}

2.2

where we switched to the index $n^{'} = k n$ for the second equality.

Spectral pinching

A key technical tool in our proof will be the use of spectral pinching maps [38], which are defined as follows (see [12, Chapter 3] for a more detailed introduction).

Definition 2.6

(Spectral pinching map). Let $ρ \in Pos (A)$ with spectral decomposition $ρ = \sum_{λ} λ P_{λ}$ , where $λ \in Spec (ρ) \subset R_{\geq 0}$ are the distinct eigenvalues of $ρ$ and $P_{λ}$ are mutually orthogonal projectors. The (spectral) pinching map $P_{ρ} \in CPTP (A, A)$ associated with $ρ$ is given by

\begin{matrix} P_{ρ} (ω) : = \sum_{λ \in Spec (ρ)} P_{λ} ω P_{λ} . \end{matrix}

We will need a few basic properties of pinching maps.

Lemma 2.7

(Properties of pinching maps). For any $ρ, σ \in Pos (A)$ , the following properties hold:

(i)
Invariance: $P_{ρ} (ρ) = ρ$ .
(ii)
Commutation of pinched state: $[σ, P_{σ} (ρ)] = 0$ .
(iii)
Pinching inequality: $P_{σ} (ρ) \geq \frac{1}{| Spec (σ) |} ρ$ .
(iv)
Commutation of pinching maps: if $[ρ, σ] = 0$ , then $P_{ρ} \circ P_{σ} = P_{σ} \circ P_{ρ}$ .
(v)
Partial trace: ${Tr}_{B} [P_{ρ_{A} \otimes 1_{B}}, (ω_{AB})] = P_{ρ_{A}} (ω_{A}) \forall ω_{AB} \in Pos (A B)$ .

Proof

Properties (i)–(iii) follow from the definition and [3, Chapter 2.6.3] or [12, Lemma 3.5].

For the fourth statement, note that since $[ρ, σ] = 0$ , there exists a joint orthonormal eigenbasis ${| x_{i} ⟩}$ of $ρ$ and $σ$ . Let $P_{λ}$ be the projector onto the eigenspace of $ρ$ with eigenvalue $λ$ , and $Q_{μ}$ the projector onto the eigenspace of $σ$ with eigenvalue $μ$ . We can expand

\begin{matrix} P_{λ} = \sum_{i s.t. ρ | x_{i} ⟩ = λ | x_{i} ⟩} | x_{i} ⟩ ⟨ x_{i} | and Q_{μ} = \sum_{j s.t. σ | x_{j} ⟩ = μ | x_{j} ⟩} | x_{j} ⟩ ⟨ x_{j} | . \end{matrix}

Since ${| x_{i} ⟩}$ is a family of orthonormal vectors,

\begin{matrix} P_{λ} Q_{μ} = \sum_{\begin{matrix} i s.t. ρ | x_{i} ⟩ = λ | x_{i} ⟩ \\ and σ | x_{i} ⟩ = μ | x_{i} ⟩ \end{matrix}} | x_{i} ⟩ ⟨ x_{i} | = Q_{μ} P_{λ}, \end{matrix}

which implies commutation of the pinching maps.

For the fifth statement, note that if we write $ρ = \sum_{λ} λ P_{λ}$ with eigenprojectors $P_{λ}$ , then the set of eigenprojectors of $ρ_{A} \otimes 1_{B}$ is simply ${P_{λ} \otimes 1_{B}}$ . Hence,

\begin{matrix} {Tr}_{B} [P_{ρ_{A} \otimes 1_{B}}, (ω_{AB})] & = \sum_{λ} {Tr}_{B} [P_{λ} \otimes 1_{B} ω_{AB} P_{λ} \otimes 1_{B}] \\ = \sum_{λ} P_{λ} {Tr}_{B} [ω_{AB}] P_{λ} = P_{ρ_{A}} (ω_{A}) . \end{matrix}

$□$

It is often useful to use the pinching map associated with tensor power states, i.e., $P_{ρ^{\otimes n}}$ . This is because for $ρ \in Pos (A)$ , the factor $| Spec (ρ^{\otimes n}) |$ from the pinching inequality (see Lemma 2.7) only scales polynomially in n (see e.g. [12, Remark 3.9]):

\begin{matrix} | Spec (ρ^{\otimes n}) | \leq {(n + 1)}^{dim (A) - 1} . \end{matrix}

2.3

In fact, we can show a similar property for all permutation-invariant states, not just tensor product states.

Lemma 2.8

Let $ρ \in Pos (A^{\otimes n})$ be permutation invariant and denote $d = dim (A)$ . Then

\begin{matrix} | Spec (ρ) | \leq {(n + d)}^{d (d + 1) / 2} . \end{matrix}

Proof

By Schur-Weyl duality and Schur’s lemma (see e.g. [39, Lemma 0.8 and Theorem 1.10]), since $ρ$ is permutation-invariant, we have

\begin{matrix} ρ ≅ ⨁_{λ \in I_{d, n}} ρ {(λ)}_{Q_{λ}} \otimes 1_{P_{λ}}, \end{matrix}

where $≅$ denotes equality up to unitary conjugation (which leaves the spectrum invariant), $I_{d, n}$ is the set of Young diagrams with n boxes and at most d rows, $Q_{λ}$ and $P_{λ}$ are systems whose details need not concern us, and $ρ (λ) \in Pos (Q_{λ})$ . From this it is clear that

\begin{matrix} | Spec (ρ) | \leq \sum_{λ \in I_{d, n}} | Spec (ρ (λ)) | \leq \sum_{λ \in I_{d, n}} dim (Q_{λ}) . \end{matrix}

It is known that $| I_{d, n} {| \leq (n + 1)}^{d}$ and $dim (Q_{λ}) \leq {(n + d)}^{d (d - 1) / 2}$ (see e.g. [40, Section 6.2]). Hence

\begin{matrix} | Spec (ρ) | \leq {(n + 1)}^{d} {(n + d)}^{d (d - 1) / 2} \leq {(n + d)}^{d (d + 1) / 2} . \end{matrix}

$□$

Corollary 2.9

Let $ρ, σ \in Pos (A)$ and $d = dim (A)$ . Then

\begin{matrix} | Spec (P_{ρ^{\otimes n}}, (σ^{\otimes n})) {| \leq (n + d)}^{d (d + 1) / 2} . \end{matrix}

Proof

Note that $P_{ρ^{\otimes n}} (σ^{\otimes n})$ is itself not a product state because the eigenprojectors of $ρ^{\otimes n}$ do not have a product form. However, since every eigenspace of $ρ^{\otimes n}$ is permutation-invariant, $P_{ρ^{\otimes n}} (σ^{\otimes n})$ is permutation-invariant, too, so we can apply Lemma 2.8. $□$

Strengthened Chain Rules

One of the crucial properties of entropies are chain rules, which allow us to relate entropies of large composite systems to sums of entropies of the individual subsystems. In this section, we prove two new such chain rules, one for the Rényi divergence (Theorem 3.1, which is a generalisation of [9, Corollary 5.1]) and one for the conditional entropy (Theorem 3.6). The chain rule from Theorem 3.6 is the key ingredient for our generalised EAT, to which we will turn our attention in Sect. 4. Theorem 3.6 plays a similar role for our generalised EAT as [1, Corollary 3.5] does for the original EAT, but while the latter requires a Markov condition, the former does not. As a result, our generalised EAT based on Theorem 3.6 also avoids the Markov condition.

The outline of this section is as follows: we first prove a generalised chain rule for the Rényi divergence (Theorem 3.1). This chain rule contains a regularised channel divergence. As the next step, we show that in the special case of conditional entropies, we can drop the regularisation (Sect. 3.2). This allows us to derive a chain rule for conditional entropies from the chain rule for channels (Sect. 3.3).

Strengthened chain rule for Rényi divergence

The main result of this section is the following chain rule for the Rényi divergence.

Theorem 3.1

Let $α > 1$ , $ρ \in S (A R)$ , $σ \in Pos (A R)$ , $E \in CPTP (A R, B)$ , and $F \in CP (A R, B)$ . Suppose that there exists $R \in CP (A, B)$ such that $F = R \circ {Tr}_{R}$ . Then

\begin{matrix} D_{α} (E, (ρ_{AR}), ‖, F, (σ_{AR})) \leq D_{α} (ρ_{A}, ‖, σ_{A}) + D_{α}^{reg} (E, ‖, F) . \end{matrix}

3.1

This is a stronger version of an existing chain rule due to [9], which we will use in our proof of Theorem 3.1:

Lemma 3.2

([9, Corollary 5.1]). Let $α > 1$ , $ρ \in S (A)$ , $σ \in Pos (A)$ , $E \in CPTP (A, B)$ , and $F \in CP (A, B)$ . Then

\begin{matrix} D_{α} (E (ρ) ‖ F (σ)) \leq D_{α} (ρ, ‖, σ) + D_{α}^{reg} (E, ‖, F) . \end{matrix}

3.2

The difference between Theorem 3.1 and Lemma 3.2 is that on the r.h.s. of Eq. (3.1), we only have the divergence $D_{α} (ρ_{A}, ‖, σ_{A})$ between the two reduced states on system A. In contrast, if we used Eq. (3.2) with systems AR, then we would get the divergence $D_{α} (ρ_{AR}, ‖, σ_{AR})$ between the full states. In particular, the weaker Lemma 3.2 can easily be recovered from Theorem 3.1 by taking the system R to be trivial, in which case the condition $F = R \circ {Tr}_{R}$ becomes trivial, too.

While the difference between Theorem 3.1 and Lemma 3.2 may look minor at first sight, the two chain rules can give considerably different results: in general, the data processing inequality ensures that $D_{α} (ρ_{A}, ‖, σ_{A}) \leq D_{α} (ρ_{AR}, ‖, σ_{AR})$ , but the gap between the two quantities can be significant, i.e., there exist states for which $D_{α} (ρ_{A}, ‖, σ_{A}) ≪ D_{α} (ρ_{AR}, ‖, σ_{AR})$ . In such cases, Theorem 3.1 yields a significantly tighter bound. This turns out to be crucial if we want to apply this chain rule repeatedly to get an EAT.

We also note that the statement of Theorem 3.1 is known to be correct also for $α = 1$ [37, Theorem 3.5]. However, this requires a separate proof and does not follow from Theorem 3.1 as it is currently not known whether the function $α \mapsto D_{α}^{reg} (E, ‖, F)$ is continuous in the limit $α ↘ 1$ .10

We now turn to the proof of Theorem 3.1. The key question for the proof is the following: given states $ρ_{AR}$ and $σ_{A}$ , does there exist an extension $σ_{AR}$ of $σ_{A}$ such that $D_{α} (ρ_{A}, ‖, σ_{A}) = D_{α} (ρ_{AR}, ‖, σ_{AR})$ ? For the special case of $α = 1 / 2$ , an affirmative answer is given by Uhlmann’s theorem [10] (see also [11, Corollary 3.14]). This also holds for $α = \infty$ , but not in general for $α \geq 1$ as discussed in Sect. B. The following lemma shows that a similar property still holds for $α > 1$ on a regularised level.

Lemma 3.3

Consider quantum systems A and R with $d = dim (A)$ . For $n \in N$ , we define $A^{n} = A_{1} \dots A_{n}$ , where $A_{i}$ are copies of the system A, and likewise $R^{n} = R_{1} \dots R_{n}$ . Then for $ρ \in S (A R)$ , $σ \in Pos (A)$ , and $α > 1$ we have

\begin{matrix} D_{α} (ρ_{A}, ‖, σ_{A}) \leq inf_{{\hat{σ}}_{A^{n} R^{n}} s.t. {\hat{σ}}_{A^{n}} = σ_{A}^{\otimes n}} \frac{1}{n} D_{α} (ρ_{AR}^{\otimes n}, ‖, {\hat{σ}}_{A^{n} R^{n}}) \\ \leq D_{α} (ρ_{A}, ‖, σ_{A}) + \frac{α}{α - 1} \frac{d (d + 1) log (n + d)}{n} . \end{matrix}

Proof

The inequality

\begin{matrix} D_{α} (ρ_{A}, ‖, σ_{A}) \leq inf_{{\hat{σ}}_{A^{n} R^{n}} s.t. {\hat{σ}}_{A^{n}} = σ_{A}^{\otimes n}} \frac{1}{n} D_{α} (ρ_{AR}^{\otimes n}, ‖, {\hat{σ}}_{A^{n} R^{n}}) \end{matrix}

follows directly from the data processing inequality for taking the partial trace over $R^{n}$ , and additivity of $D_{α}$ under tensor product [11].

For the other direction, we consider n-fold tensor copies of $ρ_{AR}$ and $σ_{A}$ , which we denote by $ρ_{A^{n} R^{n}} = ρ_{A_{1} R_{1}} \otimes \dots \otimes ρ_{A_{n} R_{n}}$ and $σ_{A^{n}} = σ_{A_{1}} \otimes \dots \otimes σ_{A_{n}}$ . We define the following two pinched states

\begin{matrix} ρ_{A^{n} R^{n}}^{'} = P_{σ_{A^{n}} \otimes 1_{R^{n}}} (ρ_{A^{n} R^{n}}) and {\hat{ρ}}_{A^{n} R^{n}} = P_{ρ_{n}^{'} \otimes 1_{R^{n}}} (ρ_{A^{n} R^{n}}^{'}) . \end{matrix}

3.3

By definition of ${\hat{ρ}}_{A^{n} R^{n}}$ and using the pinching inequality (see Lemma 2.7(iii)) twice, we have

\begin{matrix} ρ_{A^{n} R^{n}} \leq | Spec (σ_{A^{n}}) | | Spec (ρ_{n}^{'}) | {\hat{ρ}}_{A^{n} R^{n}} . \end{matrix}

Using the operator monotonicity of the sandwiched Rényi divergence in the first argument [11] we find for any state ${\hat{σ}}_{A^{n} R^{n}}$

\begin{matrix} \frac{1}{n} D_{α} (ρ_{AR}^{\otimes n}, ‖, {\hat{σ}}_{A^{n} R^{n}}) \leq \frac{1}{n} D_{α} ({\hat{ρ}}_{A^{n} R^{n}}, ‖, {\hat{σ}}_{A^{n} R^{n}}) + \frac{1}{n} \frac{α}{α - 1} η (n), \end{matrix}

3.4

with the error term

\begin{matrix} η (n) = log | Spec (σ_{A^{n}}) | + log | Spec (ρ_{n}^{'}) | . \end{matrix}

To prove the lemma, we now need to bound the error term $η (n)$ and construct a specific choice for ${\hat{σ}}_{A^{n} R^{n}}$ for which ${\hat{σ}}_{A^{n}} = σ_{A}^{\otimes n}$ and $\frac{1}{n} D_{α} ({\hat{ρ}}_{A^{n} R^{n}}, ‖, {\hat{σ}}_{A^{n} R^{n}}) \leq D_{α} (ρ_{A}, ‖, σ_{A})$ . We first bound $η (n)$ . Since $σ_{A^{n}} = σ_{A}^{\otimes n}$ , we have from Eq. (2.3) that $| Spec (σ_{A^{n}}) | \leq {(n + 1)}^{d - 1}$ , where $d = dim (A)$ . To bound $| Spec (ρ_{n}^{'}) |$ , we note that by Eq. (3.3) and Lemma 2.7(v)

\begin{matrix} ρ_{n}^{'} = {Tr}_{R^{n}} [P_{σ_{A^{n}} \otimes 1_{R^{n}}}, (ρ_{A^{n} R^{n}})] = P_{σ_{A^{n}}} (ρ_{A^{n}}) = P_{σ_{A}^{\otimes n}} (ρ_{A}^{\otimes n}) . \end{matrix}

3.5

We can therefore use Lemma 2.9 to obtain $| Spec (ρ_{n}^{'}) | \leq {(n + d)}^{d (d + 1) / 2}$ . Hence,

\begin{matrix} η (n) \leq d (d + 1) log (n + d) . \end{matrix}

3.6

It thus remains to construct ${\hat{σ}}_{A^{n} R^{n}}$ satisfying the properties mentioned above. To do so we first establish a number of commutation statements.

(i)
From Lemma 2.7(ii) we have that $[{\hat{ρ}}_{A^{n} R^{n}}, ρ_{n}^{'} \otimes 1_{R^{n}}] = 0$ . Recalling the definition of $ρ^{'}$ from Eq. (3.3), we get
$\begin{matrix} {\hat{ρ}}_{A^{n}} = {Tr}_{R^{n}} [P_{ρ_{n}^{'} \otimes 1_{R^{n}}}, (ρ_{A^{n} R^{n}}^{'})] = P_{ρ_{n}^{'}} (ρ_{n}^{'}) = ρ_{n}^{'}, \end{matrix}$ 3.7
where the final step uses Lemma 2.7(i). As a result we find
$\begin{matrix} [{\hat{ρ}}_{A^{n} R^{n}}, {\hat{ρ}}_{A^{n}} \otimes 1_{R^{n}}] = 0 . \end{matrix}$ 3.8
(ii)
From Lemma 2.7(ii) we have that $[ρ_{A^{n} R^{n}}^{'}, σ_{A^{n}} \otimes 1_{R^{n}}] = 0$ . Taking the partial trace over $R^{n}$ , this implies $[ρ_{n}^{'}, σ_{A^{n}}] = 0$ , so by Lemma 2.7(iv) and Eq. (3.3)
$\begin{matrix} {\hat{ρ}}_{A^{n} R^{n}} = P_{ρ_{n}^{'} \otimes 1_{R^{n}}} (P_{σ_{A^{n}} \otimes 1_{R^{n}}}, (ρ_{A^{n} R^{n}})) = P_{σ_{A^{n}} \otimes 1_{R^{n}}} (P_{ρ_{n}^{'} \otimes 1_{R^{n}}}, (ρ_{A^{n} R^{n}})) . \end{matrix}$
Therefore, by Lemma 2.7(ii),
$\begin{matrix} [{\hat{ρ}}_{A^{n} R^{n}}, σ_{A^{n}} \otimes 1_{R^{n}}] = 0 . \end{matrix}$ 3.9
(iii)
Taking the partial trace over $R^{n}$ in Eq. (3.9), we get
$\begin{matrix} [{\hat{ρ}}_{A^{n}}, σ_{A^{n}}] = 0 . \end{matrix}$ 3.10

Having established these commutation relations, we define $T \in CPTP (A^{n}, A^{n} R^{n})$ by11

\begin{matrix} T (ω_{A^{n}}) = {\hat{ρ}}_{A^{n} R^{n}}^{1 / 2} {\hat{ρ}}_{A^{n}}^{- 1 / 2} ω_{A^{n}} {\hat{ρ}}_{A^{n}}^{- 1 / 2} {\hat{ρ}}_{A^{n} R^{n}}^{1 / 2} . \end{matrix}

By construction,

\begin{matrix} T ({\hat{ρ}}_{A^{n}}) = {\hat{ρ}}_{A^{n} R^{n}} . \end{matrix}

3.11

We define

\begin{matrix} {\hat{σ}}_{A^{n} R^{n}} = T (σ_{A^{n}}) . \end{matrix}

3.12

To see that this is a valid choice of $\hat{σ}$ , i.e., that ${\hat{σ}}_{A^{n}} = σ_{A^{n}} = σ_{A}^{\otimes n}$ , we use Eqs. (3.8), (3.9) and (3.10) to find

\begin{matrix} {\hat{σ}}_{A^{n}} = {Tr}_{R^{n}} [{\hat{ρ}}_{A^{n} R^{n}}^{1 / 2}, {\hat{ρ}}_{A^{n}}^{- 1 / 2}, σ_{A^{n}}, {\hat{ρ}}_{A^{n}}^{- 1 / 2}, {\hat{ρ}}_{A^{n} R^{n}}^{1 / 2}] = {Tr}_{R^{n}} [{\hat{ρ}}_{A^{n} R^{n}}, {\hat{ρ}}_{A^{n}}^{- 1}, σ_{A^{n}}] = σ_{A^{n}} . \end{matrix}

Using Eqs. (3.11) and (3.12) followed by the data processing inequality [11], we obtain

\begin{matrix} \frac{1}{n} D_{α} ({\hat{ρ}}_{A^{n} R^{n}}, ‖, {\hat{σ}}_{A^{n} R^{n}}) = \frac{1}{n} D_{α} (T, ({\hat{ρ}}_{A^{n}}), ‖, T, (σ_{A^{n}})) \leq \frac{1}{n} D_{α} ({\hat{ρ}}_{A^{n}}, ‖, σ_{A^{n}}) . \end{matrix}

3.13

By Eqs. (3.7) and (3.3) we have ${\hat{ρ}}_{A^{n}} = ρ_{n}^{'} = P_{σ_{A^{n}}} (ρ_{A^{n}})$ . Therefore, continuing from Eq. (3.13) and using $σ_{A^{n}} = P_{σ_{A^{n}}} (σ_{A^{n}})$ followed by the data processing inequality gives

\begin{matrix} \frac{1}{n} D_{α} ({\hat{ρ}}_{A^{n} R^{n}}, ‖, {\hat{σ}}_{A^{n} R^{n}}) \leq \frac{1}{n} D_{α} (ρ_{A^{n}}, ‖, σ_{A^{n}}) = \frac{1}{n} D_{α} (ρ_{A}^{\otimes n}, ‖, σ_{A}^{\otimes n}) = D_{α} (ρ_{A}, ‖, σ_{A}) . \end{matrix}

Inserting this and our error bound from Eq. (3.6) into Eq. (3.4) proves the desired statement. $□$

With this, we can now prove Theorem 3.1.

Proof of Theorem 3.1

Because $D_{α}$ is additive under tensor products, for any $n \in N$ we have

\begin{matrix} D_{α} (E, (ρ_{AR}), ‖, F, (σ_{AR})) & = \frac{1}{n} D_{α} (E^{\otimes n}, (ρ_{AR}^{\otimes n}), ‖, F^{\otimes n}, (σ_{AR}^{\otimes n})) \\ = inf_{{\hat{σ}}_{A^{n} R^{n}} s.t. {\hat{σ}}_{A^{n}} = σ_{A}^{\otimes n}} \frac{1}{n} D_{α} (E^{\otimes n}, (ρ_{AR}^{\otimes n}), ‖, F^{\otimes n}, ({\hat{σ}}_{A^{n} R^{n}})), \end{matrix}

3.14

where the second equality holds because $F = R \circ {Tr}_{R}$ , so $F^{\otimes n} (σ_{AR}^{\otimes n}) = F^{\otimes n} ({\hat{σ}}_{A^{n} R^{n}})$ for any ${\hat{σ}}_{A^{n} R^{n}}$ that satisfies ${\hat{σ}}_{A^{n}} = σ_{A}^{\otimes n}$ . From the chain rule in Lemma 3.2 we get that for any ${\hat{σ}}_{A^{n} R^{n}}$ :

\begin{matrix} \frac{1}{n} D_{α} (E^{\otimes n}, (ρ_{AR}^{\otimes n}), ‖, F^{\otimes n}, ({\hat{σ}}_{A^{n} R^{n}})) & \leq \frac{1}{n} D_{α} (ρ_{AR}^{\otimes n}, ‖, {\hat{σ}}_{A^{n} R^{n}}) + \frac{1}{n} D_{α}^{reg} (E^{\otimes n}, ‖, F^{\otimes n}) \\ = \frac{1}{n} D_{α} (ρ_{AR}^{\otimes n}, ‖, {\hat{σ}}_{A^{n} R^{n}}) + D_{α}^{reg} (E, ‖, F), \end{matrix}

where for the second line we used additivity of the regularised channel divergence (see Eq. (2.2)). Combining this with Eq. (3.14), we get

\begin{matrix} D_{α} (E, (ρ_{AR}), ‖, F, (σ_{AR})) \leq inf_{{\hat{σ}}_{A^{n} R^{n}} s.t. {\hat{σ}}_{A^{n}} = σ_{A}^{\otimes n}} \frac{1}{n} D_{α} (ρ_{AR}^{\otimes n}, ‖, {\hat{σ}}_{A^{n} R^{n}}) + D_{α}^{reg} (E, ‖, F) . \end{matrix}

3.15

Finally, using Lemma 3.3 and the fact that $d : = dim (A)$ and $α > 1$ are constants independent of n, we have

\begin{matrix} lim_{n \to \infty} inf_{{\hat{σ}}_{A^{n} R^{n}} s.t. {\hat{σ}}_{A^{n}} = σ_{A}^{\otimes n}} \frac{1}{n} D_{α} (ρ_{AR}^{\otimes n}, ‖, {\hat{σ}}_{A^{n} R^{n}}) \\ \leq D_{α} (ρ_{A}, ‖, σ_{A}) + lim_{n \to \infty} \frac{α}{α - 1} \frac{d (d + 1) log (n + d)}{n} \\ = D_{α} (ρ_{A}, ‖, σ_{A}) . \end{matrix}

Therefore, taking $n \to \infty$ in Eq. (3.15) and inserting this yields the theorem statement. $□$

Removing the regularisation

The chain rule presented in Theorem 3.1 contains a regularised channel divergence term, which cannot be computed easily and whose behaviour as $α ↘ 1$ is not understood. In this section we show that in the specific case relevant for entropy accumulation, this regularisation can be removed. From this, we then derive a chain rule for Rényi entropies in Theorem 3.6.

Definition 3.4

(Replacer map). The replacer map $S_{A} \in CP (A, A)$ is defined by its action on an arbitrary state $ω_{AR}$ :

\begin{matrix} S_{A} (ω_{AR}) = 1_{A} \otimes ω_{R} . \end{matrix}

Note that as usual, when we write $S_{A} (ω_{AR})$ , we include an implicit tensoring with the identity channel, i.e. $S_{A} (ω_{AR}) = (S_{A} \otimes {id}_{R}) (ω_{AR})$ .

Lemma 3.5

Let $α \in (1, 2)$ , $E \in CPTP (A R, A^{'} R^{'})$ , and $F = S_{A^{'}} \circ E$ , where $S_{A^{'}}$ is the replacer map. Then we have

\begin{matrix} D_{α}^{reg} (E, ‖, F) \leq D_{\frac{1}{2 - α}} (E, ‖, F) . \end{matrix}

Proof

Due to the choice of $F$ , we have that for any state $ψ^{n} \in S (A^{n} R^{n} {\tilde{R}}^{n})$ (with $\tilde{R} \equiv A R$ ):

\begin{matrix} D_{α} (E^{\otimes n}, (ψ^{n}), ‖, F^{\otimes n}, (ψ^{n})) = - H_{α} {({(A^{'})}^{n}, |, {(R^{'})}^{n}, {\tilde{R}}^{n})}_{E^{\otimes n} (ψ^{n})} . \end{matrix}

From [41, Proposition II.4] and [2, Lemma 4.2.2] we know that for every n, there exists a symmetric pure state $| {\hat{ψ}}^{n} ⟩ \in {Sym}^{n} (A R \tilde{R})$ such that

\begin{matrix} D_{α} (E^{\otimes n}, ‖, F^{\otimes n}) = D_{α} (E^{\otimes n}, ({\hat{ψ}}^{n}), ‖, F^{\otimes n}, ({\hat{ψ}}^{n})) = - H_{α} {({(A^{'})}^{n}, |, {(R^{'})}^{n}, {\tilde{R}}^{n})}_{E^{\otimes n} (ψ^{n})}, \end{matrix}

where ${\hat{ψ}}^{n} = | {\hat{ψ}}^{n} ⟩ ⟨ {\hat{ψ}}^{n} |$ and the supremum in the definition of the channel divergence is achieved because the conditional entropy is continuous in the state. Let $d = dim (A R \tilde{R})$ and $g_{n, d} = dim ({Sym}^{n} (A R \tilde{R})) \leq {(n + 1)}^{d^{2} - 1}$ . We define the state

\begin{matrix} τ_{A^{n} R^{n} {\tilde{R}}^{n}}^{n} = \int μ (σ_{A R \tilde{R}}) σ_{A R \tilde{R}}^{\otimes n}, \end{matrix}

3.16

where $μ$ is the Haar measure on pure states. We now claim that in the limit $n \to \infty$ , we can essentially replace the optimizer ${\hat{ψ}}_{A^{n} R^{n} {\tilde{R}}^{n}}^{n}$ by the state $τ_{A^{n} R^{n} {\tilde{R}}^{n}}^{n}$ in Eq. (3.16). More precisely, we claim that

\begin{matrix} lim_{n \to \infty} \frac{1}{n} H_{α} {({(A^{'})}^{n} | {(R^{'})}^{n} {\tilde{R}}^{n})}_{E^{\otimes n} ({\hat{ψ}}^{n})} \geq lim_{n \to \infty} \frac{1}{n} H_{\frac{1}{2 - α}} {({(A^{'})}^{n} | {(R^{'})}^{n} {\tilde{R}}^{n})}_{E^{\otimes n} (τ^{n})} . \end{matrix}

3.17

To show this, we first use Lemma 2.3 to get

It is know that $τ_{A^{n} R^{n} {\tilde{R}}^{n}}^{n}$ is the maximally mixed state on ${Sym}^{n} (A R \tilde{R})$ (see e.g. [13]). Therefore,

\begin{matrix} ρ_{A^{n} R^{n} {\tilde{R}}^{n}}^{n} : = \frac{g_{n, d} τ^{n} - {\hat{ψ}}^{n}}{g_{n, d} - 1} \end{matrix}

is a valid quantum state (i.e. positive and normalised). Hence, we can write

\begin{matrix} τ^{n} = (1 - \frac{1}{g_{n, d}}) ρ^{n} + \frac{1}{g_{n, d}} {\hat{ψ}}^{n} . \end{matrix}

Using [1, Lemma B.5], it follows that

Since $\frac{log (g_{n, d})}{n} \leq (d^{2} - 1) \frac{log n}{n}$ vanishes as $n \to \infty$ , taking the limit and using Inline graphic proves Eq. (3.17).

Having established Eq. (3.17), we can now conclude the proof of the lemma as follows

\begin{matrix} D_{α}^{reg} (E, ‖, F) & = - lim_{n \to \infty} \frac{1}{n} H_{α} {({(A^{'})}^{n} | {(R^{'})}^{n} {\tilde{R}}^{n})}_{E^{\otimes n} ({\hat{ψ}}^{n})} \\ \leq - lim_{n \to \infty} \frac{1}{n} H_{\frac{1}{2 - α}} {({(A^{'})}^{n} | {(R^{'})}^{n} {\tilde{R}}^{n})}_{E^{\otimes n} (τ^{n})} \\ = lim_{n \to \infty} \frac{1}{n} D_{\frac{1}{2 - α}} (E^{\otimes n} (\int μ (σ_{A R \tilde{R}}) σ_{A R \tilde{R}}^{\otimes n}) ‖ F^{\otimes n} (\int μ (σ_{A R \tilde{R}}) σ_{A R \tilde{R}}^{\otimes n})) \\ \leq lim_{n \to \infty} sup_{σ_{A R \tilde{R}} \in S (A R \tilde{R})} \frac{1}{n} D_{\frac{1}{2 - α}} (E^{\otimes n}, (σ_{A R \tilde{R}}^{\otimes n}), ‖, F^{\otimes n}, (σ_{A R \tilde{R}}^{\otimes n})) \\ = D_{\frac{1}{2 - α}} (E, ‖, F), \end{matrix}

where we used joint quasi-convexity [11, Proposition 4.17] in the fourth line and additivity under tensor products in the last line. $□$

Strengthened chain rule for conditional Rényi entropy

We next combine Theorem 3.1 with Lemma 3.5 to derive a new chain rule for the conditional Rényi entropy which then allows us to prove the generalised EAT in Sect. 4.

Lemma 3.6

Let $α \in (1, 2)$ , $ρ \in S (A R E)$ , and $M \in CPTP (R E, A^{'} R^{'} E^{'})$ such that there exists $R \in CPTP (E, E^{'})$ such that ${Tr}_{A^{'} R^{'}} \circ M = R \circ {Tr}_{R}$ . Then

\begin{matrix} H_{α} {(A A^{'} | E^{'})}_{M (ρ)} \geq H_{α} {(A | E)}_{ρ} + inf_{ω \in S (R E \tilde{E})} H_{\frac{1}{2 - α}} {(A^{'} | E^{'} \tilde{E})}_{M (ω)} \end{matrix}

3.18

for a purifying system $\tilde{E} \equiv R E$ .

Proof

We define the following maps12

\begin{matrix} N & = S_{A^{'}} \circ M & \in CP (R E, A^{'} R^{'} E^{'}), \\ \tilde{M} & = {id}_{A} \otimes {Tr}_{R^{'}} \circ M & \in CPTP (A R E, A A^{'} E^{'}), \\ \tilde{N} & = S_{A^{'}} \circ \tilde{M} & \in CP (A R E, A A^{'} E^{'}) . \end{matrix}

Note that in Eq. (3.18), we can replace $M$ by $\tilde{M}$ , as the system $R^{'}$ does not appear in Eq. (3.18). With $σ_{ARE} = 1_{A} \otimes ρ_{RE}$ and $\tilde{N} = S_{A^{'}} \circ \tilde{M}$ , we can write

\begin{matrix} - H_{α} {(A A^{'} | E^{'})}_{M (ρ)} = D_{α} (\tilde{M}, (ρ_{ARE}), ‖, \tilde{N}, (σ_{ARE})) . \end{matrix}

We now claim that there exists a map $\tilde{R} \in CP (A E, A A^{'} E)$ such that $\tilde{N} = \tilde{R} \circ {Tr}_{R}$ . To see this, observe that by assumption, ${Tr}_{A^{'}} \circ \tilde{M} = {id}_{A} \otimes R \circ {Tr}_{R}$ for some $R \in CP (E, E^{'})$ . Then, we can define $\tilde{R} \in CP (A E, A A^{'} E)$ by its action on an arbitrary state $ω_{AE}$ :

\begin{matrix} \tilde{R} (ω_{AE}) : = 1_{A^{'}} \otimes ({id}_{A} \otimes R) (ω_{AE}) = 1_{A^{'}} \otimes {Tr}_{A^{'}} \circ \tilde{M} (ω_{ARE}) = \tilde{N} (ω_{ARE}) \end{matrix}

for any extension $ω_{ARE}$ of $ω_{AE}$ . Therefore, we can apply Theorem 3.1 to find

\begin{matrix} D_{α} (\tilde{M}, (ρ_{ARE}), ‖, \tilde{N}, (σ_{ARE})) \leq D_{α} (ρ_{AE}, ‖, σ_{AE}) + D_{α}^{reg} (\tilde{M}, ‖, \tilde{N}) . \end{matrix}

By definition of $σ$ , we have $D_{α} (ρ_{AE}, ‖, σ_{AE}) = - H_{α} {(A | E)}_{ρ}$ . Since the channel divergence is stabilised (see Footnote 9), tensoring with ${id}_{A}$ has no effect, i.e.,

\begin{matrix} D_{α}^{reg} (\tilde{M}, ‖, \tilde{N}) = D_{α}^{reg} ({Tr}_{R^{'}} \circ M ‖ {Tr}_{R^{'}} \circ N) = D_{α}^{reg} ({Tr}_{R^{'}} \circ M ‖ S_{A^{'}} \circ {Tr}_{R^{'}} \circ M) . \end{matrix}

To this, we can apply Lemma 3.5 and obtain

\begin{matrix} D_{α}^{reg} (\tilde{M}, ‖, \tilde{N}) \leq D_{\frac{1}{2 - α}} ({Tr}_{R^{'}} \circ M ‖ S_{A^{'}} \circ {Tr}_{R^{'}} \circ M) = - inf_{ω \in S (R E \tilde{E})} H_{\frac{1}{2 - α}} {(A^{'} | E^{'} \tilde{E})}_{M (ω)} \end{matrix}

with $\tilde{E} \equiv R E$ . Combining all the steps yields the desired statement. $□$

Generalised Entropy Accumulation

We are finally ready to state and prove the main result of this work which is a generalisation of the EAT proven in [1]. We first state a simple version of this theorem, which follows readily from the chain rule Theorem 3.6 and captures the essential feature of entropy accumulation: the min-entropy of a state $M_{n} \circ \dots \circ M_{1} (ρ)$ produced by applying a sequence of n channels can be lower-bounded by a sum of entropy contributions of each channel $M_{i}$ . However, for practical applications, it is desirable not to consider the state $M_{n} \circ \dots \circ M_{1} (ρ)$ , but rather that state conditioned on some classical event, for example “success” in a key distribution protocol – a concept called “testing”. Analogously to [1], we present an EAT adapted to that setting in Sect. 4.2.

Generalised EAT

Theorem 4.1

(Generalised EAT). Consider a sequence of channels $M_{i} \in$ $CPTP (R_{i - 1} E_{i - 1}, A_{i} R_{i} E_{i})$ such that for all $i \in {1, \dots, n}$ , there exists $R_{i} \in CPTP (E_{i - 1}, E_{i})$ such that ${Tr}_{A_{i} R_{i}} \circ M_{i} = R_{i} \circ {Tr}_{R_{i - 1}}$ . Then for any $ε \in (0, 1)$ and any $ρ_{R_{0} E_{0}} \in S (R_{0} E_{0})$

\begin{matrix} H_{min}^{ε} {(A^{n} | E_{n})}_{M_{n} \circ \dots \circ M_{1} (ρ_{R_{0} E_{0}})} \geq \sum_{i = 1}^{n} inf_{ω \in S (R_{i - 1} E_{i - 1} {\tilde{E}}_{i - 1})} H {(A_{i} | E_{i} {\tilde{E}}_{i - 1})}_{M_{i} (ω)} - O (\sqrt{n}) \end{matrix}

for a purifying system ${\tilde{E}}_{i - 1} \equiv R_{i - 1} E_{i - 1}$ . For a statement with explicit constants, see Eq. (4.1) in the proof.

Proof

By [1, Lemma B.10], we have for $α \in (1, 2)$

\begin{matrix} H_{min}^{ε} {(A_{1}^{n} | E_{n})}_{M_{n} \circ \dots \circ M_{1} (ρ_{R_{0} E_{0}})} \geq H_{α} {(A_{1}^{n} | E_{n})}_{M_{n} \circ \dots \circ M_{1} (ρ_{R_{0} E_{0}})} - \frac{g (ε)}{α - 1} \end{matrix}

with $g (ε) = log (1 - \sqrt{1 - ε^{2}})$ . From Theorem 3.6, we have

\begin{matrix} H_{α} {(A_{1}^{n} | E_{n})}_{M_{n} \circ \dots \circ M_{1} (ρ_{R_{0} E_{0}})} \\ \geq H_{α} {(A_{1}^{n - 1} | E_{n - 1})}_{M_{n - 1} \circ \dots \circ M_{1} (ρ_{R_{0} E_{0}})} \\ + inf_{ω \in S (R_{n - 1} E_{n - 1} {\tilde{E}}_{n - 1})} H_{\frac{1}{2 - α}} {(A_{n} | E_{n} {\tilde{E}}_{n - 1})}_{M_{n} (ω)} . \end{matrix}

Repeating this step $n - 1$ times, we get

\begin{matrix} H_{α} {(A_{1}^{n} | E_{n})}_{M_{n} \circ \dots \circ M_{1} (ρ_{R_{0} E_{0}})} & \geq H_{α} {(A_{1} | E_{1})}_{M_{1} (ρ_{R_{0} E_{0}})} \\ + \sum_{i = 2}^{n} inf_{ω \in S (R_{i - 1} E_{i - 1} {\tilde{E}}_{i - 1})} H_{\frac{1}{2 - α}} {(A_{i} | E_{i} {\tilde{E}}_{i - 1})}_{M_{i} (ω)} \\ \geq \sum_{i = 1}^{n} inf_{ω \in S (R_{i - 1} E_{i - 1} {\tilde{E}}_{i - 1})} H_{\frac{1}{2 - α}} {(A_{i} | E_{i} {\tilde{E}}_{i - 1})}_{M_{i} (ω)}, \end{matrix}

where the final step uses the monotonicity of the Rényi divergence in $α$ [11, Corollary 4.3]. From [1, Lemma B.9] we have for each $i \in {1, \dots, n}$ and $α$ sufficiently close to 1,

\begin{matrix} inf_{ω \in S (R_{i - 1} E_{i - 1} {\tilde{E}}_{i - 1})} H_{\frac{1}{2 - α}} {(A_{i} | E_{i} {\tilde{E}}_{i - 1})}_{M_{i} (ω)} \\ \geq inf_{ω \in S (R_{i - 1} E_{i - 1} {\tilde{E}}_{i - 1})} H {(A_{i} | E_{i} {\tilde{E}}_{i - 1})}_{M_{i} (ω)} - \frac{α - 1}{2 - α} {log}^{2} (1 + 2 dim (A_{i})) . \end{matrix}

Setting $d_{A} = {max}_{i} dim (A_{i})$ and combining the previous steps, we obtain

\begin{matrix} H_{min} {(A_{1}^{n} | E_{n})}_{M_{n} \circ \dots \circ M_{1} (ρ_{R_{0} E_{0}})} \\ \geq \sum_{i = 1}^{n} inf_{ω_{i} \in S (R_{i - 1} E_{i - 1} {\tilde{E}}_{i - 1})} H {(A_{i} | E_{i} {\tilde{E}}_{i - 1})}_{M_{i} (ω_{i})} - n \frac{α - 1}{2 - α} {log}^{2} (1 + 2 d_{A}) - \frac{g (ε)}{α - 1} . \end{matrix}

4.1

Using $α = 1 + O (1 / \sqrt{n})$ yields the result. $□$

Generalised EAT with testing

In this section, we will extend Theorem 4.1 to include the possibility of “testing”, i.e., of computing the min-entropy of a cq-state conditioned on some classical event. This analysis is almost identical to that of [8]; we give the full proof for completeness, but will appeal to [8] for specific tight bounds. The resulting EAT (Theorem 4.3) has (almost) the same tight bounds as the result in [8], but replaces the Markov condition with the more general non-signalling condition. Hence, relaxing the Markov condition does not result in a significant loss in parameters (including second-order terms).

Consider a sequence of channels $M_{i} \in CPTP (R_{i - 1} E_{i - 1}, C_{i} A_{i} R_{i} E_{i})$ for $i \in {1, \dots, n}$ , where $C_{i}$ are classical systems with common alphabet $C$ . We require that these channels $M_{i}$ satisfy the following condition: defining $M_{i}^{'} = {Tr}_{C_{i}} \circ M_{i}$ , there exist channels $T_{i} \in CPTP (A_{i} E_{i}, C_{i} A_{i} E_{i})$ and $T \in CPTP (A^{n} E_{n}, C^{n} A^{n} E_{n})$ such that $M_{i} = T_{i} \circ M_{i}^{'}$ and $M_{n} \circ \dots \circ M_{1} = T \circ M_{n}^{'} \circ \dots \circ M_{1}^{'}$ , where $T_{i}$ and $T$ have the form

\begin{matrix} T_{i} (ω_{A_{i} E_{i}}) = \sum_{y \in Y_{i}, z \in Z_{i}} (Π_{A_{i}}^{(y)} \otimes Π_{E_{i}}^{(z)}) ω_{A_{i} E_{i}} (Π_{A_{i}}^{(y)} \otimes Π_{E_{i}}^{(z)}) \otimes | r_{i} (y, z) ⟩ ⟨ r_{i} (y, z) |_{C_{i}} \\ T (ω_{A^{n} E_{n}}) = \sum_{y \in Y, z \in Z} (Π_{A^{n}}^{(y)} \otimes Π_{E_{n}}^{(z)}) ω_{A^{n} E_{n}} (Π_{A^{n}}^{(y)} \otimes Π_{E_{n}}^{(z)}) \otimes | r (y, z) ⟩ ⟨ r (y, z) |_{C^{n}}, \end{matrix}

4.2

where ${Π_{A_{i}}^{(y)}}_{y}$ and ${Π_{E_{i}}^{(z)}}_{z}$ are families of mutually orthogonal projectors on $A_{i}$ and $E_{i}$ , and $r_{i} : Y_{i} \times Z_{i} \to C$ is a deterministic function Similarly, ${Π_{A^{n}}^{(y)}}_{y}$ and ${Π_{E_{n}}^{(z)}}_{z}$ are families of mutually orthogonal projectors on $A^{n}$ and $E_{n}$ , and $r : Y \times Z \to C$ is a deterministic function. (Note that even though we use the same symbol for both, in principle there does not have to be any relationship between the single-round projectors $Π_{A_{i}}$ and the projector $Π_{A^{n}}$ (and likewise for $Π_{E_{i}}$ and $Π_{E_{n}}$ ), although in practice the latter will usually be the tensor product of the former.) Intuitively, this condition says that for each round, the classical statistics can be reconstructed “in a projective way” from the systems $A_{i}$ and $E_{i}$ in that round, and furthermore the full statistics information $C^{n}$ can be reconstructed in a projective way from the systems $A^{n}$ and $E_{n}$ at the end of the process. The latter condition is not implied by the former because future rounds may modify the $E_{i}$ -system in such a way that $C_{i}$ can no longer be reconstructed from the side information $E_{n}$ at the end of the protocol. To rule this out, we need to specify the latter condition separately. In particular, this requirement is always satisfied if the statistics $C_{i}$ are computed from classical information contained in $A_{i}$ and $E_{i}$ and this classical information is not deleted from $E_{i}$ in future rounds. This is the scenario in all applications that we are aware of, but we state Eq. (4.2) more generally to allow for the possibility of protocols where the statistics are constructed in a more general way.

Let $P$ be the set of probability distributions on the alphabet $C$ of $C_{i}$ , and let ${\tilde{E}}_{i - 1}$ be a system isomorphic to $R_{i - 1} E_{i - 1}$ . For any $q \in P$ we define the set of states

\begin{matrix} Σ_{i} (q) = {ν_{C_{i} A_{i} R_{i} E_{i} {\tilde{E}}_{i - 1}} = M_{i} (ω_{R_{i - 1} E_{i - 1} {\tilde{E}}_{i - 1}}) | ω \in S (R_{i - 1} E_{i - 1} {\tilde{E}}_{i - 1}) and ν_{C_{i}} = q}, \end{matrix}

4.3

where $ν_{C_{i}}$ denotes the probability distribution over $C$ with the probabilities given by $Pr [c] = ⟨ c | ν_{C_{i}} | c ⟩$ . In other words, $Σ_{i} (q)$ is the set of states that can be produced at the output of the channel $M_{i}$ and whose reduced state on $C_{i}$ is equal to the probability distribution q.

Definition 4.2

A function $f : P \to R$ is called a min-tradeoff function for ${M_{i}}$ if it satisfies

\begin{matrix} f (q) \leq min_{ν \in Σ_{i} (q)} H {(A_{i} | E_{i} {\tilde{E}}_{i - 1})}_{ν} \forall i = 1, \dots, n . \end{matrix}

Note that if $Σ_{i} (q) = \emptyset$ , then f(q) can be chosen arbitrarily.

Our result will depend on some simple properties of the tradeoff function, namely the maximum and minimum of f, the minimum of f over valid distributions, and the maximum variance of f:

\begin{matrix} Max (f) & : = max_{q \in P} f (q), \\ Min (f) & : = min_{q \in P} f (q), \\ {Min}_{Σ} (f) & : = min_{q : Σ (q) \neq \emptyset} f (q), \\ Var (f) & : = max_{q : Σ (q) \neq \emptyset} \sum_{x \in C} q (x) f {(δ_{x})}^{2} - {(\sum_{x \in C}, q, (x), f, (δ_{x}))}^{2}, \end{matrix}

where $Σ (q) = ⋃_{i} Σ_{i} (q)$ and $δ_{x}$ is the distribution with all the weight on element x. We write $freq (C^{n})$ for the distribution on $C$ defined by $freq (C^{n}) (c) = \frac{| {i \in {1, \dots, n} : C_{i} = c} |}{n}$ . We also recall that in this context, an event $Ω$ is defined by a subset of $C^{n}$ , and for a state $ρ_{C^{n} A^{n} E_{n} R_{n}}$ we write ${Pr}_{ρ} [Ω] = \sum_{c^{n} \in Ω} Tr [ρ_{A_{1}^{n} E_{n} R_{n}, c^{n}}]$ for the probability of the event $Ω$ and

\begin{matrix} ρ_{C^{n} A^{n} E_{n} R_{n} | Ω} = \frac{1}{{Pr}_{ρ} [Ω]} \sum_{c^{n} \in Ω} | c^{n} ⟩ ⟨ c^{n} |_{C^{n}} \otimes ρ_{A^{n} E_{n} R_{n}, c^{n}} \end{matrix}

for the state conditioned on $Ω$ .

Theorem 4.3

Consider a sequence of channels $M_{i} \in CPTP (R_{i - 1} E_{i - 1}, C_{i} A_{i} R_{i} E_{i})$ for $i \in {1, \dots, n}$ , where $C_{i}$ are classical systems with common alphabet $C$ and the sequence ${M_{i}}$ satisfies Eq. (4.2) and the non-signalling condition: for each $M_{i}$ , there exists $R_{i} \in CPTP (E_{i - 1}, E_{i})$ such that ${Tr}_{A_{i} R_{i} C_{i}} \circ M_{i} = R_{i} \circ {Tr}_{R_{i - 1}}$ . Let $ε \in (0, 1)$ , $α \in (1, 3 / 2)$ , $Ω \subset C^{n}$ , $ρ_{R_{0} E_{0}} \in S (R_{0} E_{0})$ , and f be an affine13 min-tradeoff function with $h = {min}_{c^{n} \in Ω} f (freq (c^{n}))$ . Then,

\begin{matrix} H_{min}^{ε} {(A^{n} | E_{n})}_{M_{n} \circ \dots \circ M_{1} {(ρ_{R_{0} E_{0}})}_{| Ω}} & \geq n h - n \frac{α - 1}{2 - α} \frac{ln (2)}{2} V^{2} - \frac{g (ε) + α log (1 / {Pr}_{ρ^{n}} [Ω])}{α - 1} \\ - n {(\frac{α - 1}{2 - α})}^{2} K^{'} (α), \end{matrix}

4.4

where $Pr [Ω]$ is the probability of observing event $Ω$ , and

\begin{matrix} g (ε) & = - log (1 - \sqrt{1 - ε^{2}}), \\ V & = log (2 d_{A}^{2} + 1) + \sqrt{2 + Var (f)}, \\ K^{'} (α) & = \frac{{(2 - α)}^{3}}{6 {(3 - 2 α)}^{3} ln 2} 2^{\frac{α - 1}{2 - α} (2 log d_{A} + Max (f) - {Min}_{Σ} (f))} {ln}^{3} (2^{2 log d_{A} + Max (f) - {Min}_{Σ} (f)} + e^{2}), \end{matrix}

with $d_{A} = {max}_{i} dim (A_{i})$ .

Remark 4.4

The parameter in $α$ in Theorem 4.3 can be optimized for specific problems, which leads to tighter bounds. Alternatively, it is possible to make a generic choice for $α$ to recover a theorem that looks much more like Theorem 4.1, which is done in Corollary 4.6. We also remark that even tighter second order terms have been derived in [42]. To keep our theorem statement and proofs simpler, we do not carry out this additional optimization explicitly, but note that this can be done in complete analogy to [42].

To prove Theorem 4.3, we will need the following lemma (which is already implicit in [1, Claim 4.6], but we give a simplified proof here).

Lemma 4.5

Consider a quantum state $ρ \in S (C A D E)$ that has the form

\begin{matrix} ρ_{CADE} = \sum_{c \in Ω} | c ⟩ ⟨ c | \otimes ρ_{A E, c} \otimes ρ_{D | c}, \end{matrix}

where $Ω \subset C$ is a subset of the alphabet $C$ of the classical system C, and for each c, $ρ_{A E, c} \in Pos (A E)$ is subnormalised and $ρ_{D | c} \in S (D)$ is a quantum state. Then for $α > 1$ ,

Proof

Let $σ_{E} \in S (E)$ such that

Then

\begin{matrix} {(σ_{E}^{\frac{1 - α}{2 α}}, ρ_{CADE}, σ_{E}^{\frac{1 - α}{2 α}})}^{α} = \sum_{c \in Ω} | c ⟩ ⟨ c | \otimes {(σ_{E}^{\frac{1 - α}{2 α}}, ρ_{A E, c}, σ_{E}^{\frac{1 - α}{2 α}})}^{α} \otimes ρ_{D | c}^{α} . \end{matrix}

Hence,

\begin{matrix} Tr [{(σ_{E}^{\frac{1 - α}{2 α}}, ρ_{CADE}, σ_{E}^{\frac{1 - α}{2 α}})}^{α}] & = \sum_{c \in Ω} Tr [{(σ_{E}^{\frac{1 - α}{2 α}}, ρ_{A E, c}, σ_{E}^{\frac{1 - α}{2 α}})}^{α}] Tr [ρ_{D | c}^{α}] \\ \leq sup_{{\tilde{σ}}_{E} \in S (E)} Tr [\sum_{c \in Ω} | c ⟩ ⟨ c | \otimes {({\tilde{σ}}_{E}^{\frac{1 - α}{2 α}}, ρ_{A E, c}, {\tilde{σ}}_{E}^{\frac{1 - α}{2 α}})}^{α}] \\ \times max_{c \in Ω} Tr [ρ_{D | c}^{α}] \\ = sup_{{\tilde{σ}}_{E} \in S (E)} Tr [{({\tilde{σ}}_{E}^{\frac{1 - α}{2 α}}, ρ_{CAE}, {\tilde{σ}}_{E}^{\frac{1 - α}{2 α}})}^{α}] max_{c \in Ω} Tr [ρ_{D | c}^{α}] \end{matrix}

Recalling the definitions of $D_{α}$ (Definition 2.1) and Inline graphic (Definition 2.2), we see that the lemma follows by taking the logarithm and multiplying by $\frac{1}{α - 1}$ . $□$

Proof of Theorem 4.3

As in the proof of Theorem 4.1, we first use [1, Lemma B.10] to get

4.5

for $α \in (1, 2]$ and $g (ε) = log (1 - \sqrt{1 - ε^{2}})$ . We therefore need to find a lower bound for

4.6

where the equality holds because of Eq. (4.2) and [1, Lemma B.7].

Before proceeding with the formal proof, let us explain the main difficulty compared to Theorem 4.1. The state for which we need to compute the entropy in Eq. (4.6) is conditioned on the event $Ω \subset C^{n}$ . This is a global event, in the sense that it depends on the classical outputs $C_{1}, \dots, C_{n}$ of all rounds. We essentially seek a lower bound that involves ${min}_{ν \in Σ_{i} (freq (c^{n}))} H_{α} {(A_{i} | E_{i})}_{ν}$ for some $c^{n} \in Ω$ , i.e., for every round we only want to minimize over output states of the channel $M_{i}$ whose distribution on $C_{i}$ matches the frequency distribution $freq (c^{n})$ of the n rounds we observed. This means that we must use the global conditioning on $Ω$ to argue that in each round, we can restrict our attention to states whose outcome distribution matches the (worst-case) frequency distribution associated with $Ω$ . The chain rule Theorem 3.1 does not directly allow us to do this as the r.h.s. of Eq. (3.18) always minimizes over all possible input states.

To circumvent this, we follow a strategy that was introduced in [1] and optimized in [8] (see also [16, 21, 43] for related ideas and [44] for follow-up work). For every i, we introduce a quantum system $D_{i}$ with $dim (D_{i}) = ⌈ 2^{Max (f) - Min (f)} ⌉$ and define $D_{i} \in CPTP (C_{i}, C_{i} D_{i})$ by

\begin{matrix} D_{i} (ω_{C_{i}}) = \sum_{c \in C} ⟨ c | ω_{C_{i}} | c ⟩ \cdot | c ⟩ ⟨ c | \otimes τ_{D_{i} | c} . \end{matrix}

For every $c \in C$ , the state $τ_{D_{i} | c} \in S (D)$ is defined as the mixture between a uniform distribution on ${1, \dots, ⌊ 2^{Max (f) - f (δ_{c})} ⌋}$ and a uniform distribution on ${1, \dots, ⌈ 2^{Max (f) - f (δ_{c})} ⌉}$ that satisfies

\begin{matrix} H {(D_{i})}_{τ_{D_{i} | c}} = Max (f) - f (δ_{c}), \end{matrix}

where $δ_{x}$ stands for the distribution with all the weight on element x. This is clearly possible if $dim (D_{i}) = ⌈ 2^{Max (f) - Min (f)} ⌉$ .

We define ${\bar{M}}_{i} = D_{i} \circ M_{i}$ and denote

\begin{matrix} ρ_{C^{n} A^{n} R_{n} E_{n}}^{n} = M_{n} \circ \dots \circ M_{1} (ρ_{R_{0} E_{0}}) and {\bar{ρ}}_{C^{n} A^{n} D^{n} R_{n} E_{n}}^{n} = {\bar{M}}_{n} \circ \dots \circ {\bar{M}}_{1} (ρ_{R_{0} E_{0}}) . \end{matrix}

The state ${\bar{ρ}}_{| Ω}^{n}$ has the right form for us to apply Lemma 4.5 and get

4.7

where

\begin{matrix} {\bar{ρ}}_{D^{n} | c^{n}}^{n} = τ_{D_{1} | c_{1}} \otimes \dots \otimes τ_{D_{n} | c_{n}} . \end{matrix}

We treat each term in Eq. (4.7) in turn.

(i)
For the term on the l.h.s., it is easy to see that ${\bar{ρ}}_{C^{n} A^{n} R_{n} E_{n} | Ω}^{n} = ρ_{C^{n} A^{n} R_{n} E_{n} | Ω}^{n}$ , so
4.8
(ii)
For the first term on the r.h.s., we compute
$\begin{matrix} H_{α} {(D^{n})}_{{\bar{ρ}}_{D^{n} | c^{n}}^{n}} = \sum_{i} H_{α} {(D_{i})}_{τ_{D_{i} | c_{i}}} \leq \sum_{i} H {(D_{i})}_{τ_{D_{i} | c_{i}}} & = n Max (f) - \sum_{i} f (δ_{c_{i}}) \\ = n Max (f) - n f (freq (c^{n})), \end{matrix}$ 4.9
where the last equality holds because f is affine.
(iii)
For the second term on the r.h.s., we first use [1, Lemma B.5] to remove the conditioning on the event $Ω$ , and then use that removing the classical system $C^{n}$ and switching from to $H_{α}$ can only decrease the entropy:
where we used ${Pr}_{ρ^{n}} [Ω] = {Pr}_{{\bar{ρ}}^{n}} [Ω]$ . Now noting that ${Tr}_{D_{i}} \circ {\bar{M}}_{i} = M_{i}$ , we see that the non-signalling condition ${Tr}_{A_{i} R_{i} C_{i}} \circ M_{i} = R_{i} \circ {Tr}_{R_{i - 1}}$ on $M_{i}$ implies the non-signalling condition ${Tr}_{A_{i} R_{i} C_{i} D_{i}} \circ {\bar{M}}_{i} = R_{i} \circ {Tr}_{R_{i - 1}}$ on ${\bar{M}}_{i}$ . We can therefore apply the chain rule in Theorem 3.6 to find
$\begin{matrix} H_{α} {(A^{n} D^{n} | E_{n})}_{{\bar{ρ}}^{n}} \geq \sum_{i = 1}^{n} min_{ω_{i - 1} \in S (R_{i - 1} E_{i - 1} {\tilde{E}}_{i - 1})} H_{β} {(A_{i} D_{i} | E_{i} {\tilde{E}}_{i - 1})}_{{\bar{M}}_{i} (ω_{i - 1})}, \end{matrix}$
where we introduced the shorthand $β : = \frac{1}{2 - α}$ and the purifying system ${\tilde{E}}_{i - 1} \equiv R_{i - 1} E_{i - 1}$ . Noting that for $α \in (1, 3 / 2)$ we have $β \in (1, 2)$ , we can now use [8, Corollary IV.2] to obtain
$\begin{matrix} H_{β} {(A_{i} D_{i} | E_{i} {\tilde{E}}_{i - 1})}_{{\bar{M}}_{i} (ω_{i - 1})} \\ \geq H {(A_{i} D_{i} | E_{i} {\tilde{E}}_{i - 1})}_{{\bar{M}}_{i} (ω_{i - 1})} - (β - 1) \frac{ln (2)}{2} V^{2} - {(β - 1)}^{2} K (β), \end{matrix}$
where $V^{2}$ and $K (β)$ are quantities from [8, Proposition V.3] that satisfy
$\begin{matrix} K (β) & \leq \frac{1}{6 {(2 - β)}^{3} ln 2} \\ 2^{(β - 1) (2 log d_{A} + Max (f) - {Min}_{Σ} (f))} {ln}^{3} (2^{2 log d_{A} + Max (f) - {Min}_{Σ} (f)} + e^{2}), \\ V^{2} & = {(log (2 d_{A}^{2} + 1) + \sqrt{2 + Var (f)})}^{2}, \end{matrix}$
where $d_{A} = {max}_{i} dim (A_{i})$ . Note that the above expressions derived in [8, Proposition V.3] also hold in our case due to the first part of Eq. (4.2). Furthermore, as in the proof of [8, Proposition V.3], we have
$\begin{matrix} H {(A_{i} D_{i} | E_{i} {\tilde{E}}_{i - 1})}_{{\bar{M}}_{i} (ω_{i - 1})} \geq Max (f) . \end{matrix}$
Therefore, the second term on the r.h.s. of Eq. (4.7) is bounded by
4.10

Combining our results for each of the three terms (i.e. Eqs. (4.8), (4.9) and (4.10)) and recalling $h = {min}_{x^{n} \in Ω} f (freq (x^{n}))$ , Eq. (4.7) becomes

Inserting this into Eqs. (4.5) and (4.6), and defining $K^{'} (α) = K (β) = K (\frac{1}{2 - α})$ we obtain

\begin{matrix} H_{min}^{ε} {(A^{n} | E_{n})}_{M_{n} \circ \dots \circ M_{1} {(ρ_{R_{0} E_{0}})}_{| Ω}} & \geq n h - n (β - 1) \frac{ln (2)}{2} V^{2} - \frac{g (ε) + α log (1 / {Pr}_{ρ^{n}} [Ω])}{α - 1} \\ - n {(β - 1)}^{2} K (β) \end{matrix}

4.11

as desired. $□$

Corollary 4.6

For the setting given in Theorem 4.3 we have

\begin{matrix} H_{min}^{ε} {(A^{n} | E_{n})}_{M_{n} \circ \dots \circ M_{1} {(ρ_{R_{0} E_{0}})}_{| Ω}} \geq n h - c_{1} \sqrt{n} - c_{0}, \end{matrix}

where the quantities $c_{1}$ and $c_{0}$ are given by

\begin{matrix} c_{1} & = \sqrt{\frac{2 ln (2) V^{2}}{η} (g (ε) + (2 - η) log (1 / {Pr}_{ρ^{n}} [Ω]))}, \\ c_{0} & = \frac{(2 - η) η^{2} log (1 / {Pr}_{ρ^{n}} [Ω]) + η^{2} g (ε)}{3 {(ln 2)}^{2} V^{2} {(2 η - 1)}^{3}} \\ 2^{\frac{1 - η}{η} (2 log d_{A} + Max (f) - {Min}_{Σ} (f))} {ln}^{3} (2^{2 log d_{A} + Max (f) - {Min}_{Σ} (f)} + e^{2}) \end{matrix}

with

\begin{matrix} η & = \frac{2 ln (2)}{1 + 2 ln (2)}, g (ε) = log (1 - \sqrt{1 - ε^{2}}), V = log (2 d_{A}^{2} + 1) + \sqrt{2 + Var (f)} . \end{matrix}

Proof

We first note that for any $Ω$ with non-zero probability, $h \leq log d_{A}$ . Therefore, if $n \leq {(\frac{c_{1}}{2 log d_{A}})}^{2}$ , it is easy to check that $n h - c_{1} \sqrt{n} \leq - n log d_{A}$ , so the statement of Corollary 4.6 becomes trivial. We may therefore assume that $n \geq {(\frac{c_{1}}{2 log d_{A}})}^{2}$ .

As in the proof of Theorem 4.3, we define $β = \frac{1}{2 - α}$ . The first part of the proof works for any $α \in (1, 2 - η)$ for $η = \frac{2 ln (2)}{1 + 2 ln (2)} \approx 0.58$ ; later we will make a specific choice of $α$ in this interval. Then, $β - 1 = \frac{1}{2 - α} - 1 \leq \frac{α - 1}{η}$ and $β \in (1, 1 / η)$ . Therefore, using $K (β)$ as defined in the proof of Theorem 4.3 and noting that in the interval $β \in (1, 1 / η) \subset (1, 2)$ this quantity is monotonically increasing in $β$ , we have

\begin{matrix} K (β) \leq K : = \\ \frac{η^{3}}{6 {(2 η - 1)}^{3} ln 2} 2^{\frac{1 - η}{η} (2 log d_{A} + Max (f) - {Min}_{Σ} (f))} {ln}^{3} (2^{2 log d_{A} + Max (f) - {Min}_{Σ} (f)} + e^{2}), \end{matrix}

Hence, we can simplify the statement of Theorem 4.3 to

\begin{matrix} H_{min}^{ε} {(A^{n} | E_{n})}_{M_{n} \circ \dots \circ M_{1} {(ρ_{R_{0} E_{0}})}_{| Ω}} \\ \geq n h - n (α - 1) \frac{ln (2)}{2 η} V^{2} - \frac{g (ε) + (2 - η) \cdot log (1 / {Pr}_{ρ^{n}} [Ω])}{α - 1} - n {(α - 1)}^{2} \frac{K}{η^{2}} . \end{matrix}

4.12

We now choose $α > 1$ as a function of n and $ε$ so that the terms proportional to $α - 1$ and $\frac{1}{α - 1}$ match:

\begin{matrix} α = 1 + \sqrt{\frac{2 η}{n ln (2) V^{2}} (g (ε) + (2 - η) log (1 / {Pr}_{ρ^{n}} [Ω]))} . \end{matrix}

Inserting this choice of $α$ into Eq. (4.12) and combining terms yields the constants in Corollary 4.6. The final step is to show that this choice of $α$ indeed satisfies $α \leq 2 - η$ for $n \geq {(\frac{c_{1}}{2 log d_{A}})}^{2}$ . For this, we note that for $n \geq {(\frac{c_{1}}{2 log d_{A}})}^{2}$ , we have

\begin{matrix} α = 1 + \frac{η}{ln (2) V^{2}} \frac{c_{1}}{\sqrt{n}} \leq 1 + \frac{2 η log d_{A}}{ln (2) V^{2}} . \end{matrix}

We can now use that $V^{2} \geq {(log (2 d_{A}^{2}))}^{2} \geq 4 log d_{A}$ since $d_{A} \geq 2$ , so

\begin{matrix} α \leq 1 + \frac{2 η log d_{A}}{ln (2) V^{2}} \leq 1 + \frac{η}{2 ln (2)} = 2 - η, \end{matrix}

where the last inequality holds because $η = \frac{2 ln (2)}{1 + 2 ln (2)}$ . $□$

In many applications, e.g. randomness expansion or QKD, a round can either be a “data generation round” (e.g. to generate bits of randomness or key) or a “test round” (e.g. to test whether a device used in the protocol behaves as intended). More formally, in this case the maps $M_{i} \in CPTP (R_{i - 1} E_{i - 1}, C_{i} A_{i} R_{i} E_{i})$ can be written as

\begin{matrix} M_{i} = γ M_{i, R_{i - 1} E_{i - 1} \to C_{i} A_{i} R_{i} E_{i}}^{test} + (1 - γ) M_{i, R_{i - 1} E_{i - 1} \to A_{i} R_{i} E_{i}}^{data} \otimes {| ⊥ ⟩ ⟨ ⊥ |}_{C_{i}}, \end{matrix}

4.13

where the output of $M_{i}^{test}$ on system $C_{i}$ is from some alphabet $C^{'}$ that does not include $⊥$ , so the alphabet of system $C_{i}$ is $C = C^{'} \cup {⊥}$ . The parameter $γ$ is called the testing probability, and for efficient protocols we usually want $γ$ to be as small as possible.

For maps of the form in Eq. (4.13), there is a general way of constructing a min-tradeoff function for the map $M_{i}$ based only on the statistics generated by the map $M_{i}^{test}$ . This was shown in [8] and we reproduce their result (adapted to our notation) here for the reader’s convenience.

Lemma 4.7

([8, Lemma V.5]). Let $M_{i} \in CPTP (R_{i - 1} E_{i - 1}, C_{i} A_{i} R_{i} E_{i})$ be channels satisfying the same conditions as in Theorem 4.3 that can furthermore be decomposed as in Eq. (4.13). Suppose that an affine function $g : P (C^{'}) \to R$ satisfies for any $q^{'} \in P (C^{'})$ and any $i = 1, \dots, n$

\begin{matrix} g (q^{'}) \leq min_{ω \in S (R_{i - 1} E_{i - i} {\tilde{E}}_{i - 1})} {H {(A_{i} | E_{i} {\tilde{E}}_{i - 1})}_{M_{i} (ω)} : {(M_{i}^{test}, (ω))}_{C_{i}} = q^{'}} \end{matrix}

4.14

where ${\tilde{E}}_{i - 1} \equiv R_{i - 1} E_{i - 1}$ is a purifying system. Then, the affine function $f : P (C) \to R$ defined by

\begin{matrix} f (δ_{x}) & = Max (g) + \frac{1}{γ} (g (δ_{x}) - Max (g)) \forall x \in C^{'} \\ f (δ_{⊥}) & = Max (g) \end{matrix}

is a min-tradeoff function for ${M_{i}}$ . Moreover,

\begin{matrix} Max (f) & = Max (g) \\ Min (f) & = (1 - \frac{1}{γ}) Max (g) + \frac{1}{γ} Min (g) \\ {Min}_{Σ} (f) & \geq Min (g) \\ Var (f) & \leq \frac{1}{γ} (Max (g) - Min (g))^{2} . \end{matrix}

Sample Applications

To demonstrate the utility of our generalised EAT, we provide two sample applications. Firstly, in Sect. 5.1 we prove security of blind randomness expansion against general attacks. The notion of blind randomness was defined in [15] and has potential applications in mistrustful cryptography (see [15, 16] for a detailed motivation). Until now, no security proof against general attacks was known. In particular, the original EAT is not applicable because its model of side information is too restrictive. With our generalised EAT, we can show that security against general attacks follows straightforwardly from a single-round security statement.

Secondly, in Sect. 5.2 we give a simplified security proof for the E91 QKD protocol [45], which was also treated with the original EAT [1]. This example is meant to help those familiar with the original EAT understand the difference between that result and our generalised EAT. In particular, this application highlights the utility of our more general model of side information: in our proof, the non-signalling condition is satisfied trivially and the advantage over the original EAT stems purely from being able to update the side information register $E_{i}$ . We point out that while here we focus on the E91 protocol to allow an easy comparison with the original EAT, our generalised EAT can be used for a large class of QKD protocols for which the original EAT was not applicable at all. A comprehensive treatment of this is given in [7].

Blind randomness expansion

We start by recalling the idea of standard (non-blind) device-independent randomness expansion [17–21]. Alice would like to generate a uniformly random bit string using devices $D_{1}$ and $D_{2}$ prepared by an adversary Eve. To this end, in her local lab (which Eve cannot access) she isolates the devices from one another and plays multiple round of a non-local game with them, e.g. the CHSH game. On a subset of the rounds of the game, she checks whether the CHSH condition is satisfied. If this is the case on a sufficiently high proportion of rounds, she can conclude that the devices’ outputs on the remaining rounds must contain a certain amount of entropy, conditioned on the input to the devices and any quantum side information that Eve might have kept from preparing the devices. Using a quantum-proof randomness extractor, Alice can then produce a uniformly random string.

Blind randomness expansion [15, 16] is a significant strengthening of the above idea. Here, Alice only receives one device $D_{1}$ , which she again places in her local lab isolated from the outside world. Now, Alice plays a non-local game with her device $D_{1}$ and the adversary Eve: she samples questions for a non-local game as before, inputs one of the questions to $D_{1}$ , and sends the other question to Eve. $D_{1}$ and Eve both provide an output. Alice then proceeds as in standard randomness expansion, checking whether the winning condition of the non-local game is satisfied on a subset of rounds and concluding that the output of her device $D_{1}$ must contain a certain amount of entropy conditioned on the adversary’s side information.

For the purpose of applying the EAT, the crucial difference between the two notions of randomness expansion is the following: in standard randomness expansion, the adversary’s quantum side information is not acted upon during the protocol, and additional side information (the inputs to the devices, which we also condition on) are generated independently in a round-by-round manner. This allows a relatively straightforward application of the standard EAT [4]. In contrast, in blind randomness expansion, the adversary’s quantum side information gets updated in every round of the protocol and is not generated independently in a round-by-round fashion. This does not fit in the framework of the standard EAT, which requires the side information to be generated round-by-round subject to a Markov condition. As a result, [15, 16] were not able to prove a general multi-round blind randomness expansion result.

In the rest of this section, we will show that our generalised EAT is capable of treating multi-round blind randomness expansion, using a protocol similar to [14, Protocol 3.1]. A formal description of the protocol is given in Protocol 1.

The following proposition shows a lower bound on on the amount of randomness Alice can extract from this protocol, as specified by the min-entropy. For this, we assume a lower-bound on the single-round von Neumann entropy. Such a single-round bound can be found numerically using a generic method as explained after the proof of Lemma 5.1.

Proposition 5.1

Suppose Alice executes Protocol 1 with a device D that cannot communicate with Eve. We denote by $R_{i}$ and $E_{i}^{'}$ the (arbitrary) quantum systems of the device D and the adversary Eve after the i-th round, respectively. Eve’s full side-information after the i-th round is $E_{i} : = T^{i} X^{i} Y^{i} B^{i} E_{i}^{'}$ . A single round of the protocol can be described by a quantum channel $N_{i} \in CPTP (R_{i - 1} E_{i - 1}, C_{i} A_{i} R_{i} E_{i})$ . We also define $N_{i}^{test}$ to be the same as $N_{i}$ , except that $N_{i}^{test}$ always picks $T_{i} = 1$ . Let $ρ_{A^{n} C^{n} R_{n} E_{n}}$ be the state at the end of the protocol and $Ω$ the event that Alice does not abort.

Let $g : P ({0, 1}) \to R$ be an affine function satisfying the conditions

\begin{matrix} g (p) \leq inf_{ω \in S (R_{i - 1} E_{i - 1} {\tilde{E}}_{i - 1}) : N_{i}^{test} {(ω)}_{C_{i}} = p} H {(A_{i} | E_{i} {\tilde{E}}_{i - 1})}_{N_{i} (ω)}, Max (g) = g (δ_{1}), \end{matrix}

5.1

where ${\tilde{E}}_{i - 1} \equiv R_{i - 1} E_{i - 1}$ is a purifying system. Then, for any $ε_{a}, ε_{s} \in (0, 1)$ , either $Pr [Ω] \leq ε_{a}$ or

\begin{matrix} H_{min}^{s} {(A^{n} | E_{n})}_{ρ_{| Ω}} \geq n h - c_{1} \sqrt{n} - c_{0} \end{matrix}

for $c_{1}, c_{0} \geq 0$ independent of n and

\begin{matrix} h & = min_{p^{'} \in P ({0, 1}) : p^{'} (0) \leq 1 - ω_{exp} + δ} g (p^{'}), \end{matrix}

where $ω_{exp}$ is the expected winning probability and $δ$ the error tolerance from Protocol 1. If we treat $ε_{s}, ε_{a}, dim (A_{i}), δ, Max (g),$ and $Min (g)$ as constants, then $c_{1} = O (1 / \sqrt{γ})$ and $c_{0} = O (1)$ .

Furthermore, if there exists a quantum strategy that wins the game G with probability $ω_{exp}$ , there is an honest behaviour of D and Eve for which $Pr [Ω] \geq 1 - exp (- \frac{δ^{2}}{1 - ω_{exp} + δ} γ n)$ .

Remark 5.2

The condition on g(p) in Eq. (5.1) is formulated in terms of the entropy

\begin{matrix} H {(A_{i} | E_{i} {\tilde{E}}_{i - 1})}_{N_{i} (ω)} = H {(A_{i} | T^{i} X^{i} Y^{i} B^{i} E_{i}^{'} {\tilde{E}}_{i - 1})}_{N_{i} (ω)} \end{matrix}

with ${\tilde{E}}_{i - 1} \equiv R_{i - 1} E_{i - 1}$ . However, the map $N_{i}$ corresponding to the i-th round does not act on the systems $T^{i - 1} X^{i - 1} Y^{i - 1} B^{i - 1}$ . Therefore, we can view these systems as part of the purifying system. Since the infimum in Eq. (5.1) already includes a purifying ${\tilde{E}}_{i - 1}$ , we can drop these additional systems and without loss of generality choose ${\tilde{E}}_{i - 1}$ to be isomorphic to those input systems on which $N_{i}$ acts non-trivially, i.e. ${\tilde{E}}_{i - 1} \equiv R_{i - 1} E_{i - 1}^{'}$ . This means that we can replace the upper bound on g in Eq. (5.1) by the equivalent condition

\begin{matrix} g (p) \leq inf_{ω \in S (R_{i - 1} E_{i - 1} {\tilde{E}}_{i - 1}) : N_{i}^{test} {(ω)}_{C_{i}} = p} H {(A_{i} | B_{i} X_{i} Y_{i} T_{i} E_{i}^{'} {\tilde{E}}_{i - 1})}_{N_{i} (ω)} \end{matrix}

5.2

with ${\tilde{E}}_{i - 1} \equiv R_{i - 1} E_{i - 1}^{'}$ . For the proof of Lemma 5.1 we will use Eq. (5.1) since it more closely matches the notation of Theorem 4.3, but intuitively, Eq. (5.2) is more natural as it only involves quantities related to the i-th round of the protocol.

Proof of Lemma 5.1

To show the min-entropy lower bound, we will make use of Corollary 4.6. For this, we first check that the maps $N_{i}$ satisfy the required conditions. Since $C_{i}$ is a deterministic function of the (classical) variables $X_{i}, Y_{i}, A_{i},$ and $B_{i}$ , it is clear that Eq. (4.2) is satisfied. For the non-signalling condition, we define the map $R_{i} \in CPTP (E_{i - 1}, E_{i})$ as follows: $R_{i}$ samples $T_{i}, X_{i}$ and $Y_{i}$ as Alice does in Step 5.1 of Protocol 1. $R$ then performs Eve’s actions in the protocol (which only act on $Y_{i}$ and $E_{i - 1}^{'}$ , which is part of $E_{i - 1}$ ). It is clear that the distribution on $X_{i}$ and $Y_{i}$ produced by $R_{i}$ is the same as for $N_{i}$ . By the assumption that D and Eve cannot communicate, the marginal of the output of $N_{i}$ on Eve’s side must be independent of the device’s system $R_{i - 1}$ . Hence, ${Tr}_{A_{i} R_{i} C_{i}} \circ N_{i} = R_{i} \circ {Tr}_{R_{i - 1}}$ .

To construct a min-tradeoff function, we note that we can split $N_{i} = γ N_{i}^{test} + (1 - γ) N_{i}^{data}$ , with $N_{i}^{test}$ always picking $T_{i} = 1$ and $N_{i}^{data}$ always picking $T_{i} = 0$ . Then, we get from Lemma 4.7 and the condition $Max (g) = g (δ_{1})$ that the affine function f defined by

\begin{matrix} f (δ_{0}) = g (δ_{1}) + \frac{1}{γ} (g (δ_{0}) - g (δ_{1})), f (δ_{1}) = f (δ_{⊥}) = g (δ_{1}) \end{matrix}

is an affine min-tradeoff function for ${N_{i}}$ .

Viewing the event $Ω$ as a subset of the range ${0, 1}^{n}$ of the random variable $C^{n}$ and comparing with the abort condition in Protocol 1, we see that $c^{n} \in Ω$ implies $freq (c^{n}) (0) \leq (1 - ω_{exp} + δ) γ$ . Therefore, for $c^{n} \in Ω$ and denoting $p = freq (c^{n})$ ,

\begin{matrix} f (freq (c^{n})) = p (0) f (δ_{0}) + (1 - p (0)) f (δ_{1}) = \frac{p (0)}{γ} g (δ_{0}) + (1 - \frac{p (0)}{γ}) g (δ_{1}) \geq h, \end{matrix}

where the last inequality holds because g is affine and the distribution $p^{'} (0) = p (0) / γ, p^{'} (1) = 1 - p (0) / γ$ satisfies $p^{'} (0) \leq 1 - ω_{exp} + δ$ . The proposition now follows directly from Corollary 4.6 and the scaling of $c_{1}$ and $c_{0}$ is easily obtained from the expressions in Corollary 4.6.

To show that an honest strategy succeeds in the protocol with high probability, we define a random variable $F_{i}$ by $F_{i} = 1$ if $C_{i} = 0$ , and $F_{i} = 0$ otherwise. If D and Eve execute the quantum strategy that wins the game G with probability $ω_{exp}$ in each round, then $E [F_{i}] = (1 - ω_{exp}) γ$ . Using the abort condition in the protocol, we then find

\begin{matrix} Pr [abort] & = Pr [\sum_{i = 1}^{n} F_{i} > (1 - ω_{exp} + δ) \cdot γ n] \\ = Pr [\sum_{i = 1}^{n} F_{i} > (1 + \frac{δ}{1 - ω_{exp}}) \cdot E [\sum_{i = 1}^{n} F_{i}]] \\ \leq e^{- \frac{δ^{2}}{1 - ω_{exp} + δ} γ n}, \end{matrix}

where in the last line we used a Chernoff bound. $□$

To make use of Lemma 5.1, we need to construct a function g(p) that satisfies the condition in Eq. (5.1). For this, we will use the equivalent condition Eq. (5.2). A general way of obtaining such a bound automatically is using the recent numerical method [22].14 Specifically, using the assumption that Alice’s lab is isolated, the maps $N_{i}$ describing a single round of the protocol take the form described in Fig. 1.

Fig. 1 — Circuit diagram of $N : R_{i - 1} E_{i - 1}^{'} \to A_{i} R_{i} T_{i} X_{i} Y_{i} B_{i} E_{i}^{'}$ . For every round of the protocol, a circuit of this form is applied, where $A$ and $B$ are the (arbitrary) channels applied by Alice’s device and Eve, respectively. As in the protocol, $T_{i}$ is a bit equal to 1 with probability $γ$ , and $X_{i}$ and $Y_{i}$ are generated according to q whenever $T_{i} = 1$ , and are fixed to $x^{*}, y^{*}$ otherwise. We did not include the register $C_{i}$ in the figure as it is a deterministic function of $T_{i} X_{i} Y_{i} A_{i} B_{i}$

The method of [22] allows one to obtain lower bounds on the infimum of

\begin{matrix} H {(A_{i} | B_{i} X_{i} Y_{i} T_{i} E_{i}^{'} {\tilde{E}}_{i - 1})}_{N_{i} (ω_{R_{i - 1} E_{i - 1}^{'} {\tilde{E}}_{i - 1}})} \end{matrix}

over all input states $ω_{R_{i - 1} E_{i - 1}^{'} {\tilde{E}}_{i - 1}}$ and for any map $N_{i}$ of the form depicted in Fig. 1. Importantly, for any $N_{i}$ we may also restrict the infimum to states $ω$ that are consistent with the observed statistics, i.e., $N^{test} {(ω)}_{C_{i}} = p$ for some distribution p on $C_{i}$ , using the notation of Lemma 5.1. Using this numerical method for the CHSH game, we obtain the values shown in Fig. 2. From this, one can also construct an explicit affine min-tradeoff function g(p) in an automatic way using the same method as in [46]. As our focus is on illustrating the use of the generalised EAT, not the single-round bound, we do not carry out these steps in detail here.

Fig. 2 — Lower bound on the conditional entropy $H {(A_{i} | B_{i} X_{i} Y_{i} T_{i} E_{i}^{'})}_{ρ_{| T_{i} = 0}}$ for any state generated as in Fig. 1 and such that on test rounds the obtained winning probability for the CHSH game is $ω$ . This lower bound was obtained by using the method from [22]. For each input $y \in Y$ , the channel $B_{y}$ is modelled as $B_{y} (ω) = \sum_{b} Π_{y}^{(b)} ω Π_{y}^{(b)}$ , where ${Π_{y}^{(b)}}_{b \in B}$ are orthogonal projectors summing to the identity, and similarly for the map $A$ . It is simple to see that this is without loss of generality

Combining this single-round bound and Lemma 5.1, one obtains that for Protocol 1 instantiated with the CHSH game, $ω_{exp}$ sufficiently close to the maximal winning probability of $\frac{1}{2} + \frac{1}{2 \sqrt{2}}$ , and $γ = Θ (\frac{log n}{n})$ , one can extract $Ω (n)$ bits of uniform randomness from $A_{1} \dots A_{n}$ while using only $polylog (n)$ bits of randomness to run the protocol. In other words, Protocol 1 achieves exponential blind randomness expansion with the CHSH game.

E91 quantum key distribution protocol

The E91 protocol is one of the simplest entanglement-based QKD protocols [45, 47]. This protocol was already treated using the original EAT in [1]. Here, we do not give a formal security definition and proof, only an informal comparison of how the original EAT and our generalised EAT can be applied to this problem; the remainder of the security proof is then exactly as in [1]. For a detailed treatment of the application of our generalised EAT to QKD, see [7]. To facilitate the comparison with [1], in this section we label systems the same as in [1] even though this differs from the system labels used earlier in this paper. The protocol we are considering is described explicitly in Protocol 2. It is the same as in [1] except for minor modifications to simplify the notation.

We consider the systems $B_{i}, {\bar{B}}_{i}, A_{i}, {\bar{A}}_{i}, Q_{i}, {\bar{Q}}_{i}$ as in Protocol 2 and additionally define the system $X_{i}$ storing the statistical information used in the parameter estimation step:

\begin{matrix} X_{i} = \{\begin{matrix} A_{i} \oplus {\bar{A}}_{i} & if B_{i} = {\bar{B}}_{i} = 1, \\ ⊥ & otherwise. \end{matrix}) \end{matrix}

Denoting by E the side information gathered by Eve during the distribution step, we can follow the same steps as for [1, Equation (57)] to show that the security of Protocol 2 follows from a lower bound on

\begin{matrix} H_{min}^{ε} {(A^{n} | B^{n} {\bar{B}}^{n} E)}_{ρ_{| Ω}} . \end{matrix}

5.3

Here, $ρ_{| Ω}$ is the state at the end of the protocol conditioned on acceptance.

We first sketch how the original EAT (whose setup was described in Sect. 1) is applied to this problem in [1]. One cannot bound $H_{min}^{ε} {(A^{n} | B^{n} {\bar{B}}^{n} E)}_{ρ_{| Ω}}$ directly using the EAT because a condition similar to Eq. (4.2) has to be satisfied. Therefore, one modifies the systems ${\bar{A}}_{i}$ from Protocol 2 by setting ${\bar{A}}_{i} = ⊥$ if $B_{i} = {\bar{B}}_{i} = 0$ and then applies the EAT to find a lower bound on

\begin{matrix} H_{min}^{ε} {(A^{n} {\bar{A}}^{n} | B^{n} {\bar{B}}^{n} E)}_{ρ_{| Ω}} . \end{matrix}

5.4

For this, a round of Protocol 2 is viewed as a map $M_{i} : Q_{i}^{n} {\bar{Q}}_{i}^{n} \to Q_{i + 1}^{n} {\bar{Q}}_{i + 1}^{n} A_{i} {\bar{A}}_{i} B_{i} {\bar{B}}_{i} X_{i}$ , which chooses $B_{i} {\bar{B}}_{i}$ as in Protocol 2, applies Alice and Bob’s (trusted) measurements on systems $Q_{i} {\bar{Q}}_{i}$ to generate $A_{i} {\bar{A}}_{i}$ , and generates $X_{i}$ as described before. To apply the EAT, $R_{i - 1} : = Q_{i}^{n} {\bar{Q}}_{i}^{n}$ takes the role of the “hidden sytem”, and $A_{i} {\bar{A}}_{i}$ and $B_{i} {\bar{B}}_{i}$ are the output and side information of the i-th round, respectively. It is easy to see that with this choice of systems, the Markov condition of the EAT is satisfied, so, using a min-tradeoff function derived from an entropic uncertainty relation [48], one can find a lower bound on Eq. (5.4).

However, adding the system ${\bar{A}}_{i}$ in this manner has the following disadvantage: to relate the lower bound on $H_{min}^{ε} {(A^{n} {\bar{A}}^{n} | B^{n} {\bar{B}}^{n} E)}_{ρ_{| Ω}}$ to the desired lower bound on $H_{min}^{ε} {(A^{n} | B^{n} {\bar{B}}^{n} E)}_{ρ_{| Ω}}$ one needs to use a chain rule for min-entropies, incurring a penalty term of the form $H_{max}^{ε} {({\bar{A}}^{n} | A^{n} B^{n} {\bar{B}}^{n} E)}_{ρ_{| Ω}}$ . This penalty term is relatively easy to bound for the case of the E91 protocol, but can cause problems in general.15

We now turn our attention to proving Eq. (5.3) using our generalised EAT. For this, we first observe that

\begin{matrix} H_{min}^{ε} {(A^{n} | B^{n} {\bar{B}}^{n} E)}_{ρ_{| Ω}} \geq H_{min}^{ε} {(A^{n} | B^{n} {\bar{B}}^{n} X^{n} E)}_{ρ_{| Ω}}, \end{matrix}

so it suffices to find a lower bound on the r.h.s. This step is similar to adding the ${\bar{A}}_{i}$ systems in Eq. (5.4) in that its purpose is to satisfy Eq. (4.2). However, it has the advantage that here, $X^{n}$ can be added to the conditioning system and therefore lowers the entropy, not raises it like going from Eqs. (5.3) to (5.4). The same step is not possible in the original EAT due to the restrictive Markov condition.

Using the same system names as before, we define $E_{i} : = Q_{i + 1}^{n} {\bar{Q}}_{i + 1}^{n} B^{i} {\bar{B}}^{i} X^{i} E$ .16 Then, analogously to the original EAT, we can describe a single round of Protocol 2 by a map $M_{i} : E_{i - 1} \to A_{i} E_{i} X_{i}$ . (Compared to the map $M_{i}$ we described above for the original EAT, we have traced out ${\bar{A}}_{i}$ , added a copy of $X_{i}$ , and added identity maps on the other additional systems in $E_{i - 1}$ .) Denoting by $ρ_{Q^{n} {\bar{Q}}^{n} E}^{0}$ the joint state of Alice and Bob’s systems $Q^{n} {\bar{Q}}^{n}$ before measurement and the information E that Eve gathered during the distribution step, the state at the end of the protocol is $ρ = M_{n} \circ \dots \circ M_{1} (ρ^{0})$ . To apply Corollary 4.6 to find a lower bound on

\begin{matrix} H_{min}^{ε} {(A^{n} | E_{n})}_{M_{n} \circ \dots \circ M_{1} {(ρ^{0})}_{| Ω}}, \end{matrix}

we first observe that the condition in Eq. (4.2) is satisfied because the system $X^{n}$ is part of $E_{n}$ , and the non-signalling condition is trivially satisfied because there is no $R_{i}$ -system. A min-tradeoff function can be constructed in exactly the same way as in [1, Claim 5.2] by noting that all systems in $E_{i}$ on which $M_{i}$ does not act can be viewed as part of the purifying system.

This comparison highlights the advantage of the more general model of side information in our generalised EAT: for the original EAT, one has to first bound $H_{min}^{ε} (A^{n} {\bar{A}}^{n} | B^{n} {\bar{B}}^{n} E)$ (rather than $H_{min}^{ε} (A^{n} | B^{n} {\bar{B}}^{n} E)$ ) in order to be able to satisfy the Markov condition, and then perform a separate step to remove the ${\bar{A}}^{n}$ system. In our case, the non-signalling condition, the analogue of the Markov condition, is trivially satisfied because we need no $R_{i}$ -system. This is because we can add the quantum systems $Q^{n} {\bar{Q}}^{n}$ to the side information register $E_{0}$ at the start and then, since we allow side information to be updated and Alice and Bob act on $Q_{i} {\bar{Q}}_{i}$ using trusted measurement devices, we can remove the systems $Q_{i} {\bar{Q}}_{i}$ one by one during the rounds of the protocol.

Acknowledgements

We thank Rotem Arnon-Friedman, Peter Brown, Kun Fang, Raban Iten, Joseph M. Renes, Martin Sandfuchs, Ernest Tan, Jinzhao Wang, John Wright, and Yuxiang Yang for helpful discussions. We further thank Mario Berta and Marco Tomamichel for insights on Lemma 3.2, and Frédéric Dupuis and Carl Miller for discussions about blind randomness expansion.

Dual Statement for Smooth Max-Entropy

In the main text we have focused on deriving a lower bound on the smooth min-entropy. Here, we show that this also implies an upper bound on the smooth max-entropy by applying a simple duality relation between min- and max-entropy. A similar upper bound was also derived in [1]. However, that bound is subject to a Markov condition and cannot be derived by a simple duality argument since the “dual version” of the Markov condition is unwieldy. We show that the bound from [1] follows as a special case of our more general bound even without any Markov conditions or other non-signalling constraints. For simplicity, we restrict ourselves to an asymptotic statement without “testing”, i.e. we derive an $H_{max}^{ε}$ -version of Theorem 4.1. By applying the same duality relation to the more involved statement in Theorem 4.3, one can also obtain an $H_{max}^{ε}$ -bound with explicit constants and testing.

Recall that for $ρ_{AB} \in S (A B)$ and $ε \in [0, 1]$ , the $ε$ -smoothed max-entropy of A conditioned on B is defined as

\begin{matrix} H_{max}^{ε} {(A | B)}_{ρ} = log inf_{{\tilde{ρ}}_{AB} \in B_{ε} (ρ_{AB})} sup_{σ_{B} \in S (B)} {∥{\tilde{ρ}}_{AB}^{\frac{1}{2}}, σ_{B}^{\frac{1}{2}}∥}_{1}^{2}, \end{matrix}

where ${∥\cdot∥}_{1}$ denotes the trace norm and $B_{ε} (ρ_{AB})$ is the $ε$ -ball around $ρ_{AB}$ in terms of the purified distance [11]. The smooth min- and max-entropy satisfy the following duality relation [11, Proposition 6.2]: for a pure quantum state $ψ_{ABC}$ ,

\begin{matrix} H_{min}^{ε} {(A | B)}_{ψ} = - H_{max}^{ε} {(A | C)}_{ψ} . \end{matrix}

For the setting of Theorem 4.1, let $V_{i} : R_{i - 1} E_{i - 1} \to A_{i} R_{i} E_{i} F_{i}$ be the Stinespring dilation of the map $M_{i}$ , and let $| ρ^{0} ⟩_{R_{0} E_{0} F_{0}}$ be a purification of the input state $ρ_{R_{0} E_{0}}^{0}$ . Then, $V_{n} \dots V_{1} | ρ^{0} ⟩$ is a purification of $M_{n} \circ \dots \circ M_{1} (ρ^{0})$ , so by the duality of the smooth min- and max-entropy,

\begin{matrix} H_{min}^{ε} {(A^{n} | E_{n})}_{M_{n} \circ \dots \circ M_{1} (ρ^{0})} = - H_{max}^{ε} {(A^{n} | F^{n} R_{n})}_{V_{n} \dots V_{1} | ρ^{0} ⟩} . \end{matrix}

Furthermore, by concavity of the conditional entropy the infimum in Theorem 4.1 can be restricted to pure states ${| ω ⟩}_{R_{i - 1} E_{i - 1} {\tilde{E}}_{i - 1}}$ , so $V_{i} | ω ⟩$ is a purification of $M_{i} (ω)$ . Then, by the duality relation for von Neumann entropies,

\begin{matrix} H {(A_{i} | E_{i} {\tilde{E}}_{i - 1})}_{M_{i} (ω)} = - H {(A_{i} | R_{i} F_{i})}_{V_{i} | ω ⟩} . \end{matrix}

Therefore, we obtain the following dual statement to Theorem 4.1:

\begin{matrix} H_{max}^{ε} {(A^{n} | F^{n} R_{n})}_{V_{n} \dots V_{1} | ρ^{0} ⟩} \leq \sum_{i = 1}^{n} max_{| ω ⟩} H {(A_{i} | R_{i} F_{i})}_{V_{i} | ω ⟩} + O (\sqrt{n}), \end{matrix}

A.1

where the maximisation is over pure states on $R_{i - 1} E_{i - 1} {\tilde{E}}_{i - 1}$ . This holds for any sequence of isometries $V_{i}$ for which the maps $M_{V_{i}} : R_{i - 1} E_{i - 1} \to A_{i} R_{i} E_{i}$ given by $M_{V_{i}} (ρ) = {Tr}_{F_{i}} [V_{i}, ρ, V_{i}^{†}]$ satisfy the non-signalling condition of Theorem 4.1: for each i, there must exist a map $R_{i} \in CPTP (E_{i - 1}, E_{i})$ such that ${Tr}_{A_{i} R_{i}} \circ M_{V_{i}} = R_{i} \circ {Tr}_{R_{i - 1}}$ .

To gain some intuition for the above statement, consider a setting where an information source generates systems $A_{1}, \dots, A_{n}$ and $F_{1}, \dots, F_{n}$ by applying isometries $V_{i} : S_{i - 1} \to A_{i} F_{i} S_{i}$ to some pure intial state $| ρ^{0} ⟩_{S_{0}}$ . We might be interested in compressing the information in $A^{n}$ in such a way that given $F^{n}$ , one can reconstruct $A^{n}$ except with some small failure probability $ε$ . Then, the amount of storage needed for the compressed information is given by $H_{max}^{ε} (A^{n} | F^{n})$ . To apply Eq. (A.1), for $i < n$ we split the systems $S_{i}$ into $R_{i} E_{i}$ in such a way that the channel $M_{V_{i}}$ defined above satisfies the non-signalling condition, and set $E_{n} = S_{n}$ (so that $R_{n}$ is trivial). Then Eq. (A.1) gives an upper bound on $H_{max}^{ε} (A^{n} | F^{n})$ . Note that this bound depends on how we split the systems $S_{i} = R_{i} E_{i}$ : the non-signalling condition can always be trivially satisfied by choosing $R_{i}$ to be trivial, but Eq. (A.1) tells us that if we can describe the source in such a way that $E_{i}$ is relatively small and $R_{i}$ is relatively large while still satisfying the non-signalling condition, we obtain a tighter bound on the amount of required storage.

From Eq. (A.1) we can also recover the max-entropy version of the original EAT, but without requiring a Markov condition. To facilitate the comparison with [1], we first re-state their theorem with their choice of system labels, but add a bar to every system label to avoid confusion with our notation from before. The max-entropy statement in [1] considers a sequence of channels ${\bar{M}}_{i} : {\bar{R}}_{i - 1} \to {\bar{A}}_{i} {\bar{B}}_{i} {\bar{R}}_{i}$ and asserts that under a Markov condition, for any initial state $ρ_{{\bar{R}}_{0} \bar{E}}$ with a purifying system $\bar{E} \equiv {\bar{R}}_{0}$ :

\begin{matrix} H_{max}^{ε} {({\bar{A}}^{n} | {\bar{B}}^{n} \bar{E})}_{{\bar{M}}_{n} \circ \dots \circ {\bar{M}}_{1} (ρ_{{\bar{R}}_{0} \bar{E}})} \leq \sum_{i = 1}^{n} max_{ω \in S ({\bar{R}}_{i - 1} \bar{R})} H {({\bar{A}}_{i} | {\bar{B}}_{i} \bar{R})}_{{\bar{M}}_{i} (ω)} + O (\sqrt{n}), \end{matrix}

A.2

where $\bar{R} \equiv {\bar{R}}_{i - 1}$ . We want to recover this statement from Eq. (A.1) without any Markov condition. For this, we consider the Stinepring dilations ${\bar{V}}_{i} : {\bar{R}}_{i - 1} \to {\bar{R}}_{i} {\bar{A}}_{i} {\bar{B}}_{i} {\bar{F}}_{i}$ of ${\bar{M}}_{i}$ . We make the following choice of systems:

\begin{matrix} R_{i} = {\bar{B}}^{i} \bar{E}, A_{i} = {\bar{A}}_{i}, E_{i} = {\bar{R}}_{i} {\bar{F}}_{i}, \end{matrix}

and choose $F_{i}$ to be trivial. By tensoring with the identity, we can then extend ${\bar{V}}_{i}$ to an isometry $V_{i} : R_{i - 1} E_{i - 1} \to A_{i} R_{i} E_{i}$ . Then, the maps $M_{V_{i}}$ satisfy the non-signalling condition since $V_{i}$ acts as identity on $R_{i - 1}$ . Therefore, remembering that $R_{n} = {\bar{B}}^{n} \bar{E}$ and $F^{n}$ is trivial, we see that Eq. (A.1) implies Eq. (A.2). Note that our derivation did not require any conditions on the channels ${\bar{M}}_{i}$ we started with, i.e. we have shown Eq. (A.2) holds for any sequence of channels ${\bar{M}}_{i}$ , not just channels satisfying a Markov or non-signalling condition.

Uhlmann Property for the Rényi Divergence

We establish that for the max-divergence (where $α \to \infty$ ), Uhlmann’s theorem holds.

Proposition B.1

Let $σ_{A} \in S (A)$ and $ρ_{AR} \in S (A R)$ . Then we have

\begin{matrix} D_{max} (ρ_{A} ‖ σ_{A}) = inf_{{\hat{σ}}_{AR} : {\hat{σ}}_{A} = σ_{A}} D_{max} (ρ_{AR} ‖ {\hat{σ}}_{AR}) . \end{matrix}

B.1

In addition, if $ρ_{AR}, ρ_{A} \otimes {id}_{R}$ and $σ_{A} \otimes {id}_{R}$ all commute, then for any $α \in [\frac{1}{2}, \infty)$ , we have

\begin{matrix} D_{α} (ρ_{A} ‖ σ_{A}) = inf_{{\hat{σ}}_{AR} : {\hat{σ}}_{A} = σ_{A}} D_{α} (ρ_{AR} ‖ {\hat{σ}}_{AR}) . \end{matrix}

B.2

Proof

We start with Eq. (B.1). The inequality $\leq$ is a direct consequence of the data-processing inequality for $D_{max}$ . For the inequality $\geq$ , we use semidefinite programming duality, see e.g., [50]. Observe that we can write $2^{D_{max} (ρ_{A} ‖ σ_{A})}$ as the following semidefinite program

\begin{matrix} min_{τ_{A} \in Pos (A), λ \in R} {Tr [τ_{A}] subject to ρ_{A} \leq τ_{A} and τ_{A} = λ σ_{A}} . \end{matrix}

Using semidefinite programming duality, this is also equal to

\begin{matrix} max_{X_{A} \in Pos (A), Y_{A} \in Herm (A)} {Tr [X_{A}, ρ_{A}] subject to {id}_{A} + Y_{A} = X_{A} and Tr [Y_{A}, σ_{A}] = 0} . \end{matrix}

B.3

We can also write a semidefinite program for ${inf}_{{\hat{σ}}_{AR} : {\hat{σ}}_{A} = σ_{A}} 2^{D_{max} (ρ_{AR} ‖ {\hat{σ}}_{AR})}$ . We introduce the variable $θ_{AR} = λ {\hat{σ}}_{AR}$ and get

\begin{matrix} min_{θ \in Pos (A \otimes R), λ \in R} {Tr [θ_{AR}] subject to ρ_{AR} \leq θ_{AR} and θ_{A} = λ σ_{A}} . \end{matrix}

Again, by semidefinite programming duality, we get that it is equal to

\begin{matrix} max_{X_{AR} \in Pos (A \otimes R), Y_{A} \in Herm (A)} {Tr [X_{AR}, ρ_{AR}] subject to ({id}_{A} + Y_{A}) \otimes {id}_{R} \\ = X_{AR} and Tr [Y_{A}, σ_{A}] = 0} . \end{matrix}

B.4

Eliminating the variable $X_{AR}$ , we can write this last program as

\begin{matrix} max_{Y_{A} \in Herm (A)} {Tr [({id}_{A} + Y_{A}), ρ_{A}] subject to {id}_{A} + Y_{A} \in Pos (A) and Tr [Y_{A}, σ_{A}] = 0}, \end{matrix}

which is the same as Eq. (B.3). This proves Eq. (B.1). Equation (B.2) follows immediately by choosing ${\hat{σ}}_{AR} = σ_{A} ρ_{A}^{- 1} ρ_{AR}$ and using the commutation conditions. $□$

However, for $α \geq 1$ and arbitrary $σ_{A} \in S (A)$ , $ρ_{AE} \in S (A E)$ , the Uhlmann property given by Eq. (B.2) does not hold. A concrete example is $ρ_{AR} = {| ψ ⟩ ⟨ ψ |}_{AR}$ with

\begin{matrix} {| ψ ⟩}_{AR} = \sqrt{\frac{1}{4}} {| 00 ⟩}_{AR} + \sqrt{\frac{3}{4}} {| 11 ⟩}_{AR} \end{matrix}

and $σ_{A} = \frac{1}{3} | + ⟩ ⟨ + | + \frac{2}{3} | - ⟩ ⟨ - |$ . In this case, $D_{2} (ρ_{A} ‖ σ_{A}) < 0.476$ whereas

\begin{matrix} inf_{{\hat{σ}}_{AR} : {\hat{σ}}_{A} = σ_{A}} D_{2} (ρ_{AR} ‖ {\hat{σ}}_{AR}) \geq inf_{{\hat{σ}}_{AR} : {\hat{σ}}_{A} = σ_{A}} D (ρ_{AR} ‖ {\hat{σ}}_{AR}) > 0.48 . \end{matrix}

This computation was performed by numerically solving the semidefinite programs via CVXQUAD [51]. Putting everything together shows that Eq. (B.2) does not hold for $α \in {1, 2}$ :

\begin{matrix} D (ρ_{A} ‖ σ_{A}) \leq D_{2} (ρ_{A} ‖ σ_{A}) < inf_{{\hat{σ}}_{AR} : {\hat{σ}}_{A} = σ_{A}} D (ρ_{AR} ‖ {\hat{σ}}_{AR}) \leq inf_{{\hat{σ}}_{AR} : {\hat{σ}}_{A} = σ_{A}} D_{2} (ρ_{AR} ‖ {\hat{σ}}_{AR}) . \end{matrix}

Funding

Open access funding provided by Swiss Federal Institute of Technology Zurich. TM and RR acknowledge support from the National Centres of Competence in Research (NCCRs) QSIT (funded by the Swiss National Science Foundation under grant number 51NF40-185902) and SwissMAP, the Air Force Office of Scientific Research (AFOSR) via project No. FA9550-19-1-0202, the SNSF project No. 200021_188541 and the QuantERA project eDICT. OF acknowledges funding from the European Research Council (ERC Grant AlgoQIP, Agreement No. 851716), from the European Union’s Horizon 2020 QuantERA II Programme (VERIqTAS, Agreement No 101017733) and from a government grant managed by the Agence Nationale de la Recherche under the Plan France 2030 with the reference ANR-22-PETQ-0009. Part of this work was carried out when DS was with the Institute for Theoretical Physics at ETH Zurich.

Data Availability

No experimental data has been generated as part of this project. The introduction of this work has been published as an extended abstract in the proceedings of FOCS 2022 [49].

Declarations

Conflict of interest

The authors have no Conflict of interest to declare.

Footnotes

Since $ρ$ is a product of identical states, all of the terms $H {(A_{i} | E_{i})}_{ρ}$ are equal, i.e., $\sum_{i = 1}^{n} H {(A_{i} | E_{i})}_{ρ} = n H {(A_{i} | E_{i})}_{ρ}$ for any i. We write the sum here explicitly to highlight the analogy with the EAT presented below.

The EAT from [1] also makes an analogous statement about an upper bound on the max-entropy $H_{max}$ . We derive a generalisation of that statement in Appendix A but only focus on $H_{min}$ in the introduction and main text since that is the case that is typically relevant for applications.

In fact, the EAT is more general in that it allows taking into account observed statistics to restrict the minimization over $ω_{A_{i} B_{i} E}$ , but we restrict ourselves to the simpler case without statistics in this introduction.

⁴

As usual, the channels $M_{i}$ act as identity on any additional systems that may be part of the input state, i.e. $M_{i} (ω_{R_{i - 1} E_{i - 1} {\tilde{E}}_{i - 1}}) = (M_{i} \otimes {id}_{{\tilde{E}}_{i - 1}}) (ω_{R_{i - 1} E_{i - 1} {\tilde{E}}_{i - 1}})$ is a state on $A_{i} R_{i} E_{i} {\tilde{E}}_{i - 1}$ . In particular, the register ${\tilde{E}}_{i - 1}$ containing a purification of the input is also part of the output state.

⁵

Strictly speaking, the EAT as stated in [1] only requires that this Markov property holds for any input state $ω^{i - 1}$ in the image of the previous maps $M_{i - 1} \circ \dots \circ M_{1}$ . The same is true for the non-signalling condition, i.e., one can check that our proof of the generalised EAT still works if the map $R_{i}$ only satisfies Eq. (1.2) on states in the image of $M_{i - 1} \circ \dots \circ M_{1}$ . To simplify the presentation, we use the stronger condition Eq. (1.2) throughout this paper.

⁶

We note that the definition of Rényi entropies can be extended to $α < 1$ , but we will only need the case $α > 1$ .

⁷

In fact, in order for this single-round quantity to be positive one has to restrict the infimum to input states that allow the non-local game to be won with a certain probability. This requires using the generalised EAT with testing (Sect. 4.2), not Theorem 1.1. We refer to Sect. 5.1 for details.

⁸

In an EAT-like theorem, the entropy contribution from a particular round i has to be calculated conditioned on the side information revealed in that round because we want to analyse the process round-by-round, not globally. If a future round revealed additional side information, then the total entropy contributed by round i would decrease, but there is no way of accounting for that in an EAT-like theorem that simply sums up single-round contributions. As an extreme case, the last round of the process could reveal all prior outputs as side information, so that the total amount of conditional entropy produced by the process is 0, but single-round entropy contributions could be positive. This demonstrates the need for some condition that enforces that future side information does not reveal information about past outputs. We note that this does not mean that there is no way of proving an entropy lower bound in more general settings: for example, [32] do show a bound on the entropy produced by parallel repeated non-local games, but this requires a global analysis.

⁹

“Stabilised” refers to the fact that the supremum in Eq. (2.1) maximises over states in $S (A \tilde{A})$ , not just $S (A)$ , i.e. the maximisation includes a purifying system $\tilde{A}$ . One can also consider non-stabilised channel divergences, where the supremum is only over states in $S (A)$ . However, in this paper we only use the stabilised channel divergence.

¹⁰

It is well-known [3, Lemma 8] that ${lim}_{α ↘ 1} D_{α} (E, ‖, F) = D (E ‖ F)$ , but it is unclear whether the same holds for the regularised quantity.

¹¹

In case ${\hat{ρ}}_{A^{n}}$ does not have full support, we only take the inverse on the support of ${\hat{ρ}}_{A^{n}}$ .

¹²

The map $M$ in the theorem statement is also implicitly tensored with an identity map on A, but for the definition of $\tilde{M}$ we make this explicit to avoid confusion when applying Theorem 3.1.

¹³

A function f on the convex set $P (C)$ is called affine if it is linear under convex combinations, i.e., for $λ \in [0, 1]$ and $p_{1}, p_{2} \in P (C)$ , $λ f (p_{1}) + (1 - λ) f (p_{2}) = f (λ p_{1} + (1 - λ) p_{2})$ . Such functions are also sometimes called convex-linear.

¹⁴

The main result of [15] (Theorem 14) does not appear to be sufficient for this. The reason is that the statement made in [15] essentially concerns the randomness produced on average over the question distribution q of the game G. However, choosing a question at random consumes randomness, so to achieve exponential randomness expansion, in Protocol 1 we fix the inputs $x^{*}, y^{*}$ used for generation rounds. To the best of our knowledge, the results of [15] do not give a bound on the randomness produced in the non-local game for any fixed inputs $x^{*}, y^{*}$ . If one could prove an analogous statement to [15, Theorem 14] that also certifies randomness on fixed inputs for a large class of games, our Lemma 5.1 would then imply exponential blind randomness expansion for any such game. Alternatively, one can also assume that public (non-blind) randomness is a free resource and use this to choose the inputs for the non-local game. Then, no special inputs $x^{*}, y^{*}$ are needed in Protocol 1 to “save randomness” and the result of [15] combined with our generalised EAT implies that such a conversion from public to blind randomness is possible for any complete-support game.

¹⁵

An error correction scheme is reliable if, except with negligible probability, either Bob’s guess of Alice’s string is correct or the protocol aborts.

¹⁶

In Protocol 2, instead of Alice distributing the systems $Q_{i} {\bar{Q}}_{i}$ and Eve gathering side information E by intercepting ${\bar{Q}}_{i}$ , we can equivalently imagine that Eve first prepares a state $ρ_{Q^{n} {\bar{Q}}^{n} E}^{0}$ and distributes $Q_{i} {\bar{Q}}_{i}$ to Alice and Bob in each round. Then, the choice of $E_{i}$ intuitively captures the side information available to Eve from the first i rounds: Eve still possesses the systems $Q_{i + 1}^{n} {\bar{Q}}_{i + 1}^{n}$ to be distributed in future rounds, has gathered classical information $B^{i} {\bar{B}}^{i} X^{i}$ , and keeps the static side information E from preparing the initial state.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1.Dupuis, F., Fawzi, O., Renner, R.: Entropy accumulation. Commun. Math. Phys. 379(3), 867–913 (2020) [DOI] [PMC free article] [PubMed] [Google Scholar]
2.Renner, R.: Security of quantum key distribution. Int. J. Quantum Inf. 6(01), 1–127 (2008) [Google Scholar]
3.Tomamichel, M., Colbeck, R., Renner, R.: A fully quantum asymptotic equipartition property. IEEE Trans. Inf. Theory 55(12), 5840–5847 (2009) [Google Scholar]
4.Arnon-Friedman, R., Dupuis, F., Fawzi, O., Renner, R., Vidick, T.: Practical device-independent quantum cryptography via entropy accumulation. Nat. Commun. 9(1), 459 (2018) [DOI] [PMC free article] [PubMed] [Google Scholar]
5.Bamps, C., Massar, S., Pironio, S.: Device-independent randomness generation with sublinear shared quantum resources. Quantum 2, 86 (2018) [Google Scholar]
6.Petz, D.: Sufficient subalgebras and the relative entropy of states of a von Neumann algebra. Commun. Math. Phys. 105(1), 123–131 (1986) [Google Scholar]
7.Metger, T., Renner, R.: Security of quantum key distribution from generalised entropy accumulation. Preprint at arXiv:2203.04993 (2022) [DOI] [PMC free article] [PubMed]
8.Dupuis, F., Fawzi, O.: Entropy accumulation with improved second-order term. IEEE Trans. Inf. Theory 65(11), 7596–7612 (2019) [Google Scholar]
9.Fawzi, H., Fawzi, O.: Defining quantum divergences via convex optimization. Quantum 5, 387 (2021) [Google Scholar]
10.Uhlmann, A.: The “transition probability’’ in the state space of a -algebra. Rep. Math. Phys. 9(2), 273–279 (1976) [Google Scholar]
11.Tomamichel, M.: Quantum Information Processing with Finite Resources: Mathematical Foundations, vol. 5. Springer, Cham, Switzerland (2015) [Google Scholar]
12.Sutter, D.: Approximate Quantum Markov Chains. Springer, Cham (2018) [Google Scholar]
13.Christandl, M., König, R., Renner, R.: Postselection technique for quantum channels with applications to quantum cryptography. Phys. Rev. Lett. 102(2), 020504 (2009) [DOI] [PubMed] [Google Scholar]
14.Arnon-Friedman, R., Renner, R., Vidick, T.: Simple and tight device-independent security proofs. SIAM J. Comput. 48(1), 181–225 (2019) [Google Scholar]
15.Miller, C.A., Shi, Y.: Randomness in nonlocal games between mistrustful players. Quantum Inf. Comput. 17(7), 595 (2017) [PMC free article] [PubMed] [Google Scholar]
16.Honghao, F., Miller, C.A.: Local randomness: examples and application. Phys. Rev. A 97(3), 032324 (2018) [DOI] [PMC free article] [PubMed] [Google Scholar]
17.Colbeck, R.: Quantum and relativistic protocols for secure multi-party computation. PhD Thesis, University of Cambridge (2006)
18.Colbeck, R., Kent, A.: Private randomness expansion with untrusted devices. J. Phys. A Math. Theor. 44(9), 095305 (2011) [Google Scholar]
19.Pironio, S., Acín, A., Massar, S., de La Giroday, A.B., Matsukevich, D.N., Maunz, P., Olmschenk, S., Hayes, D., Le Luo, L., Manning, T.A., et al.: Random numbers certified by Bell’s theorem. Nature 464(7291), 1021–1024 (2010) [DOI] [PubMed] [Google Scholar]
20.Vazirani, U., Vidick, T..: Certifiable quantum dice: or, true random number generation secure against quantum adversaries. In: Proceedings of the Forty-Fourth Annual ACM Symposium on Theory of Computing, pp. 61–76 (2012)
21.Miller, C.A., Shi, Y.: Robust protocols for securely expanding randomness and distributing keys using untrusted quantum devices. J. ACM (JACM) 63(4), 1–63 (2016) [Google Scholar]
22.Brown, P., Fawzi, H., Fawzi, O.: Computing conditional entropies for quantum correlations. Nat. Commun. 12(1), 1–12 (2021) [DOI] [PMC free article] [PubMed] [Google Scholar]
23.Chung, K.M., Shi, Y., Wu, X.: Physical randomness extractors: generating random numbers with minimal assumptions. Preprint at arXiv:1402.4797 (2014)
24.Coudron, M., Yuen, H..: Infinite randomness expansion with a constant number of devices. In: Proceedings of the Forty-Sixth Annual ACM Symposium on Theory of Computing, STOC ’14, pp. 427-436, New York, NY, USA. Association for Computing Machinery (2014)
25.Kaniewski, J., Wehner, S.: Device-independent two-party cryptography secure against sequential attacks. New J. Phys. 18(5), 055004 (2016) [Google Scholar]
26.Broadbent, A., Islam, R.: Quantum encryption with certified deletion. In: Theory of Cryptography Conference, pp. 92–122. Springer (2020)
27.Kundu, S., Tan, E.: Composably secure device-independent encryption with certified deletion. Preprint at arXiv:2011.12704 (2020)
28.Frauchiger, D., Renner, R., Troyer, M.: True randomness from realistic quantum devices. Preprint at arXiv:1311.4547 (2013)
29.Campbell, S., Vacchini, B.: Collision models in open system dynamics: A versatile tool for deeper insights? Europhys. Lett. 133(6), 60001 (2021) [Google Scholar]
30.del Rio, L., Hutter, A., Renner, R., Wehner, S.: Relative thermalization. Phys. Rev. E 94(2), 022104 (2016) [DOI] [PubMed] [Google Scholar]
31.Akers, C., Penington, G.: Leading order corrections to the quantum extremal surface prescription. J. High Energy Phys. 2021(4), 1–73 (2021)35342281 [Google Scholar]
32.Jain, R., Kundu, S.: A direct product theorem for quantum communication complexity with applications to device-independent QKD. In: 2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS), pp. 1285–1295. IEEE (2022)
33.Zhang, Y., Fu, H., Knill, E.: Efficient randomness certification by quantum probability estimation. Phys. Rev. Res. 2, 013016 (2020) [DOI] [PMC free article] [PubMed] [Google Scholar]
34.Knill, E., Zhang, Y., Bierhorst, P.: Generation of quantum randomness by probability estimation with classical side information. Phys. Rev. Res. 2, 033465 (2020) [Google Scholar]
35.Müller-Lennert, M., Dupuis, F., Szehr, O., Fehr, S., Tomamichel, M.: On quantum Rényi entropies: a new generalization and some properties. J. Math. Phys. 54(12), 122203 (2013) [Google Scholar]
36.Wilde, M.M., Winter, A., Yang, D.: Strong converse for the classical capacity of entanglement-breaking and Hadamard channels via a sandwiched Rényi relative entropy. Commun. Math. Phys. 331(2), 593–622 (2014) [Google Scholar]
37.Fang, K., Fawzi, O., Renner, R., Sutter, D.: Chain rule for the quantum relative entropy. Phys. Rev. Lett. 124, 100501 (2020) [DOI] [PubMed] [Google Scholar]
38.Hayashi, M.: Quantum Information Theory. Springer, Berlin Heidelberg (2017) [Google Scholar]
39.Christandl, M.: The Structure of Bipartite Quantum States-Insights from Group Theory and Cryptography. Ph. D. Thesis (2006)
40.Harrow, A.W.: Applications of coherent classical communication and the Schur transform to quantum information theory. Preprint at arXiv:quant-ph/0512255 (2005)
41.Leditzky, F., Kaur, E., Datta, N., Wilde, M.M.: Approaches for approximate additivity of the Holevo information of quantum channels. Phys. Rev. A 97(1), 012332 (2018) [Google Scholar]
42.Liu, W.-Z., Li, M.-H., Ragy, S., Zhao, S.-R., Bai, B., Liu, Y., Brown, P.J., Zhang, J., Colbeck, R., Fan, J., et al.: Device-independent randomness expansion against quantum side information. Nat. Phys. 17(4), 448–451 (2021) [Google Scholar]
43.Miller, C.A., Shi, Y.: Universal security for randomness expansion from the spot-checking protocol. SIAM J. Comput. 46(4), 1304–1335 (2017) [Google Scholar]
44.Arqand, A., Hahn, T.A., Tan, E.Y.-Z.: Generalized renyi entropy accumulation theorem and generalized quantum probability estimation. Preprint at arXiv:2405.05912 (2024)
45.Ekert, A.K.: Quantum cryptography based on Bell’s theorem. Phys. Rev. Lett. 67, 661–663 (1991) [DOI] [PubMed] [Google Scholar]
46.Brown, P., Ragy, S., Colbeck, R.: A framework for quantum-secure device-independent randomness expansion. IEEE Trans. Inf. Theory 66(5), 2964–2987 (2019) [Google Scholar]
47.Christandl, M., Renner, R., Ekert, A.: A generic security proof for quantum key distribution. Preprint at arXiv:quant-ph/0402131 (2004)
48.Berta, M., Christandl, M., Colbeck, R., Renes, J.M., Renner, R.: The uncertainty principle in the presence of quantum memory. Nat. Phys. 6(9), 659–662 (2010) [Google Scholar]
49.Metger, T., Fawzi, O., Sutter, D., Renner, R.: Generalised entropy accumulation. In: 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pp. 844–850. IEEE (2022)
50.Watrous, J.: The Theory of Quantum Information. Cambridge University Press, Cambridge (2018) [Google Scholar]
51.Fawzi, H., Fawzi, O.: Efficient optimization of the quantum relative entropy. J. Phys. A Math. Theor. 51(15), 154003 (2018) [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

No experimental data has been generated as part of this project. The introduction of this work has been published as an extended abstract in the proceedings of FOCS 2022 [49].

[CR1] 1.Dupuis, F., Fawzi, O., Renner, R.: Entropy accumulation. Commun. Math. Phys. 379(3), 867–913 (2020) [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR2] 2.Renner, R.: Security of quantum key distribution. Int. J. Quantum Inf. 6(01), 1–127 (2008) [Google Scholar]

[CR3] 3.Tomamichel, M., Colbeck, R., Renner, R.: A fully quantum asymptotic equipartition property. IEEE Trans. Inf. Theory 55(12), 5840–5847 (2009) [Google Scholar]

[CR4] 4.Arnon-Friedman, R., Dupuis, F., Fawzi, O., Renner, R., Vidick, T.: Practical device-independent quantum cryptography via entropy accumulation. Nat. Commun. 9(1), 459 (2018) [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR5] 5.Bamps, C., Massar, S., Pironio, S.: Device-independent randomness generation with sublinear shared quantum resources. Quantum 2, 86 (2018) [Google Scholar]

[CR6] 6.Petz, D.: Sufficient subalgebras and the relative entropy of states of a von Neumann algebra. Commun. Math. Phys. 105(1), 123–131 (1986) [Google Scholar]

[CR7] 7.Metger, T., Renner, R.: Security of quantum key distribution from generalised entropy accumulation. Preprint at arXiv:2203.04993 (2022) [DOI] [PMC free article] [PubMed]

[CR8] 8.Dupuis, F., Fawzi, O.: Entropy accumulation with improved second-order term. IEEE Trans. Inf. Theory 65(11), 7596–7612 (2019) [Google Scholar]

[CR9] 9.Fawzi, H., Fawzi, O.: Defining quantum divergences via convex optimization. Quantum 5, 387 (2021) [Google Scholar]

[CR10] 10.Uhlmann, A.: The “transition probability’’ in the state space of a -algebra. Rep. Math. Phys. 9(2), 273–279 (1976) [Google Scholar]

[CR11] 11.Tomamichel, M.: Quantum Information Processing with Finite Resources: Mathematical Foundations, vol. 5. Springer, Cham, Switzerland (2015) [Google Scholar]

[CR12] 12.Sutter, D.: Approximate Quantum Markov Chains. Springer, Cham (2018) [Google Scholar]

[CR13] 13.Christandl, M., König, R., Renner, R.: Postselection technique for quantum channels with applications to quantum cryptography. Phys. Rev. Lett. 102(2), 020504 (2009) [DOI] [PubMed] [Google Scholar]

[CR14] 14.Arnon-Friedman, R., Renner, R., Vidick, T.: Simple and tight device-independent security proofs. SIAM J. Comput. 48(1), 181–225 (2019) [Google Scholar]

[CR15] 15.Miller, C.A., Shi, Y.: Randomness in nonlocal games between mistrustful players. Quantum Inf. Comput. 17(7), 595 (2017) [PMC free article] [PubMed] [Google Scholar]

[CR16] 16.Honghao, F., Miller, C.A.: Local randomness: examples and application. Phys. Rev. A 97(3), 032324 (2018) [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR17] 17.Colbeck, R.: Quantum and relativistic protocols for secure multi-party computation. PhD Thesis, University of Cambridge (2006)

[CR18] 18.Colbeck, R., Kent, A.: Private randomness expansion with untrusted devices. J. Phys. A Math. Theor. 44(9), 095305 (2011) [Google Scholar]

[CR19] 19.Pironio, S., Acín, A., Massar, S., de La Giroday, A.B., Matsukevich, D.N., Maunz, P., Olmschenk, S., Hayes, D., Le Luo, L., Manning, T.A., et al.: Random numbers certified by Bell’s theorem. Nature 464(7291), 1021–1024 (2010) [DOI] [PubMed] [Google Scholar]

[CR20] 20.Vazirani, U., Vidick, T..: Certifiable quantum dice: or, true random number generation secure against quantum adversaries. In: Proceedings of the Forty-Fourth Annual ACM Symposium on Theory of Computing, pp. 61–76 (2012)

[CR21] 21.Miller, C.A., Shi, Y.: Robust protocols for securely expanding randomness and distributing keys using untrusted quantum devices. J. ACM (JACM) 63(4), 1–63 (2016) [Google Scholar]

[CR22] 22.Brown, P., Fawzi, H., Fawzi, O.: Computing conditional entropies for quantum correlations. Nat. Commun. 12(1), 1–12 (2021) [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR23] 23.Chung, K.M., Shi, Y., Wu, X.: Physical randomness extractors: generating random numbers with minimal assumptions. Preprint at arXiv:1402.4797 (2014)

[CR24] 24.Coudron, M., Yuen, H..: Infinite randomness expansion with a constant number of devices. In: Proceedings of the Forty-Sixth Annual ACM Symposium on Theory of Computing, STOC ’14, pp. 427-436, New York, NY, USA. Association for Computing Machinery (2014)

[CR25] 25.Kaniewski, J., Wehner, S.: Device-independent two-party cryptography secure against sequential attacks. New J. Phys. 18(5), 055004 (2016) [Google Scholar]

[CR26] 26.Broadbent, A., Islam, R.: Quantum encryption with certified deletion. In: Theory of Cryptography Conference, pp. 92–122. Springer (2020)

[CR27] 27.Kundu, S., Tan, E.: Composably secure device-independent encryption with certified deletion. Preprint at arXiv:2011.12704 (2020)

[CR28] 28.Frauchiger, D., Renner, R., Troyer, M.: True randomness from realistic quantum devices. Preprint at arXiv:1311.4547 (2013)

[CR29] 29.Campbell, S., Vacchini, B.: Collision models in open system dynamics: A versatile tool for deeper insights? Europhys. Lett. 133(6), 60001 (2021) [Google Scholar]

[CR30] 30.del Rio, L., Hutter, A., Renner, R., Wehner, S.: Relative thermalization. Phys. Rev. E 94(2), 022104 (2016) [DOI] [PubMed] [Google Scholar]

[CR31] 31.Akers, C., Penington, G.: Leading order corrections to the quantum extremal surface prescription. J. High Energy Phys. 2021(4), 1–73 (2021)35342281 [Google Scholar]

[CR32] 32.Jain, R., Kundu, S.: A direct product theorem for quantum communication complexity with applications to device-independent QKD. In: 2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS), pp. 1285–1295. IEEE (2022)

[CR33] 33.Zhang, Y., Fu, H., Knill, E.: Efficient randomness certification by quantum probability estimation. Phys. Rev. Res. 2, 013016 (2020) [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR34] 34.Knill, E., Zhang, Y., Bierhorst, P.: Generation of quantum randomness by probability estimation with classical side information. Phys. Rev. Res. 2, 033465 (2020) [Google Scholar]

[CR35] 35.Müller-Lennert, M., Dupuis, F., Szehr, O., Fehr, S., Tomamichel, M.: On quantum Rényi entropies: a new generalization and some properties. J. Math. Phys. 54(12), 122203 (2013) [Google Scholar]

[CR36] 36.Wilde, M.M., Winter, A., Yang, D.: Strong converse for the classical capacity of entanglement-breaking and Hadamard channels via a sandwiched Rényi relative entropy. Commun. Math. Phys. 331(2), 593–622 (2014) [Google Scholar]

[CR37] 37.Fang, K., Fawzi, O., Renner, R., Sutter, D.: Chain rule for the quantum relative entropy. Phys. Rev. Lett. 124, 100501 (2020) [DOI] [PubMed] [Google Scholar]

[CR38] 38.Hayashi, M.: Quantum Information Theory. Springer, Berlin Heidelberg (2017) [Google Scholar]

[CR39] 39.Christandl, M.: The Structure of Bipartite Quantum States-Insights from Group Theory and Cryptography. Ph. D. Thesis (2006)

[CR40] 40.Harrow, A.W.: Applications of coherent classical communication and the Schur transform to quantum information theory. Preprint at arXiv:quant-ph/0512255 (2005)

[CR41] 41.Leditzky, F., Kaur, E., Datta, N., Wilde, M.M.: Approaches for approximate additivity of the Holevo information of quantum channels. Phys. Rev. A 97(1), 012332 (2018) [Google Scholar]

[CR42] 42.Liu, W.-Z., Li, M.-H., Ragy, S., Zhao, S.-R., Bai, B., Liu, Y., Brown, P.J., Zhang, J., Colbeck, R., Fan, J., et al.: Device-independent randomness expansion against quantum side information. Nat. Phys. 17(4), 448–451 (2021) [Google Scholar]

[CR43] 43.Miller, C.A., Shi, Y.: Universal security for randomness expansion from the spot-checking protocol. SIAM J. Comput. 46(4), 1304–1335 (2017) [Google Scholar]

[CR44] 44.Arqand, A., Hahn, T.A., Tan, E.Y.-Z.: Generalized renyi entropy accumulation theorem and generalized quantum probability estimation. Preprint at arXiv:2405.05912 (2024)

[CR45] 45.Ekert, A.K.: Quantum cryptography based on Bell’s theorem. Phys. Rev. Lett. 67, 661–663 (1991) [DOI] [PubMed] [Google Scholar]

[CR46] 46.Brown, P., Ragy, S., Colbeck, R.: A framework for quantum-secure device-independent randomness expansion. IEEE Trans. Inf. Theory 66(5), 2964–2987 (2019) [Google Scholar]

[CR47] 47.Christandl, M., Renner, R., Ekert, A.: A generic security proof for quantum key distribution. Preprint at arXiv:quant-ph/0402131 (2004)

[CR48] 48.Berta, M., Christandl, M., Colbeck, R., Renes, J.M., Renner, R.: The uncertainty principle in the presence of quantum memory. Nat. Phys. 6(9), 659–662 (2010) [Google Scholar]

[CR49] 49.Metger, T., Fawzi, O., Sutter, D., Renner, R.: Generalised entropy accumulation. In: 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pp. 844–850. IEEE (2022)

[CR50] 50.Watrous, J.: The Theory of Quantum Information. Cambridge University Press, Cambridge (2018) [Google Scholar]

[CR51] 51.Fawzi, H., Fawzi, O.: Efficient optimization of the quantum relative entropy. J. Phys. A Math. Theor. 51(15), 154003 (2018) [Google Scholar]

PERMALINK

Generalised Entropy Accumulation

Tony Metger

Omar Fawzi

David Sutter

Renato Renner

Abstract

Introduction

Theorem 1.1

Lemma 1.2

Preliminaries

Notation

Rényi divergence and entropy

Definition 2.1

Definition 2.2

Lemma 2.3

Definition 2.4

Definition 2.5

Spectral pinching

Definition 2.6

Lemma 2.7

Proof

Lemma 2.8

Proof

Corollary 2.9

Proof

Strengthened Chain Rules

Strengthened chain rule for Rényi divergence

Theorem 3.1

Lemma 3.2

Lemma 3.3

Proof

Proof of Theorem 3.1

Removing the regularisation

Definition 3.4

Lemma 3.5

Proof

Strengthened chain rule for conditional Rényi entropy

Lemma 3.6

Proof

Generalised Entropy Accumulation

Generalised EAT

Theorem 4.1

Proof

Generalised EAT with testing

Definition 4.2

Theorem 4.3

Remark 4.4

Lemma 4.5

Proof

Proof of Theorem 4.3

Corollary 4.6

Proof

Lemma 4.7

Sample Applications

Blind randomness expansion

Proposition 5.1

Remark 5.2

Proof of Lemma 5.1

Fig. 1.

Fig. 2.

E91 quantum key distribution protocol

Acknowledgements

Dual Statement for Smooth Max-Entropy

Uhlmann Property for the Rényi Divergence

Proposition B.1

Proof

Funding

Data Availability

Declarations

Conflict of interest

Footnotes

References

Associated Data

Data Availability Statement

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles