Abstract
The localization receiver operating characteristic (LROC) curve is a standard method to quantify performance for the task of detecting and locating a signal. This curve is generalized to arbitrary detection/estimation tasks to give the estimation ROC (EROC) curve. For a two-alternative forced-choice study, where the observer must decide which of a pair of images has the signal and then estimate parameters pertaining to the signal, it is shown that the average value of the utility on those image pairs where the observer chooses the correct image is an estimate of the area under the EROC curve (AEROC). The ideal LROC observer is generalized to the ideal EROC observer, whose EROC curve lies above those of all other observers for the given detection/estimation task. When the utility function is nonnegative, the ideal EROC observer is shown to share many mathematical properties with the ideal observer for the pure detection task. When the utility function is concave, the ideal EROC observer makes use of the posterior mean estimator. Other estimators that arise as special cases include maximum a posteriori estimators and maximum-likelihood estimators.
1. INTRODUCTION
The evaluation of imaging systems based on observer performance for a task that combines detection and estimation has been studied extensively when the parameters to be estimated specify the location of a signal in an image. A useful figure of merit in this situation is the ALROC, the area under the localization receiver operating characteristic (LROC) curve [1,2]. Recently, Khurd and Gindi [3,4] determined the ideal LROC observer, whose LROC curve lies above those of all other observers for the given task. Of course, this also implies that the ideal LROC observer maximizes the ALROC for the given task. For a given imaging system, the ideal ALROC can be used as a figure of merit for the optimization of system parameters in order to improve detection and localization performance.
We are proposing a general framework, the estimation ROC curve (EROC), for the evaluation of observers on more general combined detection and estimation tasks. We define the EROC curve for the detection of a signal and the estimation of a set of signal parameters. This curve is a straightforward generalization of the LROC curve. The location of the signal is replaced with an arbitrary set of signal parameters to be estimated. In addition, the binary correct-localization function, which is used in LROC analysis to determine whether a location estimate is within the tolerance limit, is replaced with a utility function, which measures the usefulness of a particular estimate given the true parameter vector. The expected utility for the true-positive detections may then be plotted versus the false-positive fraction as the detection threshold is varied to generate an EROC curve.
We will show how the area under the EROC curve (AEROC) is related to a two-alternative forced-choice (2AFC) test. For this 2AFC test, the observer is shown two images, one of which has the signal. The observer must decide which image has the signal and then estimate the parameter vector for that signal. When the observer picks the wrong image, the score for that pair is zero. When the observer picks the right image, the score for that pair is the utility of the estimate of the parameter vector as compared with the true parameter vector. The average of these scores over a large number of trials is an estimate of the AEROC for this observer.
We next formulate the ideal EROC observer and study its properties. This is a mathematical observer for a detection/estimation task whose EROC curve lies above those of all other observers for the given task. The ideal observer for a pure detection task requires full knowledge of the probability distributions of the data under the signal-absent and signal-present hypotheses. Similarly, for the detection/estimation task, the ideal observer must know the distribution of the data under the signal-absent hypothesis and the joint distribution of the data and signal parameters under the signal-present hypothesis. If these distributions are known and the utility function is specified, then the ideal EROC observer uses them to compute a test statistic, to compare with a threshold for the detection step, and to obtain an estimate of the signal parameters. This observer maximizes the AEROC for the given task, and this maximum value may be used as a figure of merit for system or reconstruction algorithm design.
For the pure detection task, the ideal observer calculates the likelihood ratio for a test statistic. This statistic possesses certain mathematical properties that lead to alternative expressions [5,6] for the ideal AUC, the area under the ROC curve for the ideal observer, in terms of moments of the likelihood ratio. These alternative expressions in turn lead to upper and lower bounds on, and approximate expressions for, the ideal AUC [7,8]. These bounds and approximations can also be computed from certain moments of the likelihood ratio. When the utility function is nonnegative, the ideal EROC observer exhibits similar mathematical properties, which in turn lead to bounds and approximations for the ideal AEROC in terms of moments of the detection test statistic. Another result of the similarity in the mathematics between these two ideal observers is that the ideal AEROC can be approximated, for a weak signal, by an expression involving a generalization of the Fisher information matrix. A similar Fisher information approximation has been derived for the ideal AUC in the pure detection context [9,10]. We will present details of these alternative expressions, bounds, and approximations below.
Finally, we will examine some special cases of the ideal EROC observer. When the utility function is very narrow, the detection statistic becomes a scanning likelihood ratio, and the estimator becomes a maximum a posteriori (MAP) or a maximum-likelihood (ML) estimator. When the utility function is symmetric and concave, the estimator is the posterior mean of the parameter, and the test statistic is a likelihood-weighted average utility of the estimate. For a narrow utility function and normal distribution of the data, the detection test statistic is a scanning Hotelling observer, and the estimator could be called a scanning linear estimator. These results show that many common detection and estimation strategies are paired together as optimal EROC observers when the utility function satisfies certain constraints.
2. EROC CURVE
An observer performing a combined detection and estimation task is given a data vector g that is drawn from either a signal-absent ensemble with probability density pr(g|H0) or a signal-present ensemble with probability density pr(g|θ,H1). The symbol θ represents a parameter vector associated with the signal. This parameter vector may have variable dimensions to accommodate situations where the number of scalar parameters that specify the signal may vary. For example, the parameter vector may be the number of small lesions in an image and their locations. In this case, the dimension of the parameter vector is twice the number of lesions (for a two-dimensional image) plus one for the number of lesions.
Part of the observer’s task is to decide whether the signal is present or absent. If we assume that the observer is not subject to internal noise, then this decision can be reduced in the usual way [5] to the comparison of a test statistic T(g) with a threshold T0. If T(g) >T0, then the observer declares the signal to be present. Otherwise, the signal is declared to be absent. For those data vectors where the observer decides that the signal is present, an estimate θ̂(g) of the parameter θ must be produced in order to complete the task.
The utility of the estimate θ̂(g) is denoted by u[θ̂(g), θ] when the signal is actually present and the true parameter vector is θ. In general, we would expect this function to have high values when the estimate is close to the true parameter vector and low values when it is far from the true parameter vector. The choice of the utility function will affect the EROC curve for the given observer and should be based on the value of a good estimate. In general, whenever parameter estimation is involved, a utility function (or its opposite, a cost function) must be specified in order to measure the performance of an estimator. Later, we will examine some consequences of making more specific assumptions about the shape of the utility function.
To plot the EROC curve, we define the false-positive fraction at a given threshold T0 in the usual way as
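| $P_{FP}(T_0)=\int \mathrm{pr}(\mathbf{g}\mid H_0)\,\mathrm{step}[T(\mathbf{g})-T_0]\,d\mathbf{g}$ | (1) |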
where the integration in this equation is over all of data space. Similar integrals in succeeding equations will also be over all of data space unless otherwise specified. We can also write the above expression using expectations
| $P_{FP}(T_0)=\langle \mathrm{step}[T(\mathbf{g})-T_0]\rangle_{\mathbf{g}\mid H_0}$ | (2) |
This number is the probability of deciding that the signal is present when it is absent and is the abscissa of the point on the EROC curve corresponding to the threshold value T0. For the corresponding ordinate of this point on the curve, we use the expected utility for those data vectors where the estimation occurs and the signal is present, i.e., the true-positive fraction. To compute this expectation, we need the prior distribution pr(θ) on the signal parameter vector, since this is an unknown random vector. The expected utility for the true-positive fraction is given by
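| $U_{TP}(T_0)=\int \mathrm{pr}(\boldsymbol{\theta})\int \mathrm{pr}(\mathbf{g}\mid\boldsymbol{\theta},H_1)\,u[\hat{\boldsymbol{\theta}}(\mathbf{g}),\boldsymbol{\theta}]\,\mathrm{step}[T(\mathbf{g})-T_0]\,d\mathbf{g}\,d\boldsymbol{\theta}$ | (3) |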
where the outer integral is over all of parameter space. Similar integrals in subsequent equations will also be over all parameter space unless otherwise specified. Using the angle bracket notation, the ordinate may be written as
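| $U_{TP}(T_0)=\langle\langle u[\hat{\boldsymbol{\theta}}(\mathbf{g}),\boldsymbol{\theta}]\,\mathrm{step}[T(\mathbf{g})-T_0]\rangle_{\mathbf{g}\mid\boldsymbol{\theta},H_1}\rangle_{\boldsymbol{\theta}}$ | (4) |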
A plot of UTP(T0) versus PFP(T0) as the threshold is varied generates the EROC curve [11]. Each point on the EROC curve gives the expected utility of our estimate of the parameter vector for the true-positive cases at a given false-positive fraction.
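The construction of the curve can be sketched numerically. The following is a minimal sample-based sketch, not a prescribed implementation: the function name `empirical_eroc` and its array arguments are hypothetical, and the expectations defining the false-positive fraction and the expected utility are replaced with sample averages over signal-absent and signal-present cases.

```python
import numpy as np


def empirical_eroc(t_absent, t_present, utility_present):
    """Sweep the threshold T0 and return (P_FP, U_TP) along the EROC curve.

    t_absent        : T(g) for signal-absent samples
    t_present       : T(g) for signal-present samples
    utility_present : u[theta_hat(g), theta] for the same signal-present samples
    """
    t_absent = np.asarray(t_absent, dtype=float)
    t_present = np.asarray(t_present, dtype=float)
    utility_present = np.asarray(utility_present, dtype=float)

    # Thresholds from +inf down to -inf, so P_FP runs from 0 up to 1.
    observed = np.sort(np.concatenate((t_absent, t_present)))[::-1]
    thresholds = np.concatenate(([np.inf], observed, [-np.inf]))

    # Abscissa: fraction of signal-absent samples with T(g) > T0.
    p_fp = np.array([(t_absent > t0).mean() for t0 in thresholds])
    # Ordinate: utility averaged over ALL signal-present samples,
    # counting zero for samples that fall below the threshold.
    u_tp = np.array([(utility_present * (t_present > t0)).mean()
                     for t0 in thresholds])
    return p_fp, u_tp


# Tiny illustrative data set (made-up numbers).
p_fp, u_tp = empirical_eroc([0.1, 0.4], [0.3, 0.9], [0.5, 1.0])
```

Note that the ordinate does not generally reach 1 at the right end of the curve: at the lowest threshold it equals the mean utility of the estimator over all signal-present cases.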
3. AREA UNDER THE EROC CURVE
The area under the EROC curve is given by
| $\mathrm{AEROC}=-\int U_{TP}(T_0)\,\frac{dP_{FP}(T_0)}{dT_0}\,dT_0$ | (5) |
The range of integration here is the range of values of the test statistic T0. The AEROC can be used as a figure of merit for the observer on the combined detection and estimation tasks. By taking the derivative of the false-positive fraction with respect to the threshold, we arrive at an alternative expression for the AEROC:
| $\mathrm{AEROC}=\langle U_{TP}[T(\mathbf{g}')]\rangle_{\mathbf{g}'\mid H_0}$ | (6) |
One useful property of the AEROC as a figure of merit is that it can be computed from a 2AFC test. This fact can be derived from Eq. (6) by writing out the expected utility inside the angle brackets:
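| $\mathrm{AEROC}=\langle\langle u[\hat{\boldsymbol{\theta}}(\mathbf{g}),\boldsymbol{\theta}]\,\mathrm{step}[T(\mathbf{g})-T(\mathbf{g}')]\rangle_{\mathbf{g},\boldsymbol{\theta}\mid H_1}\rangle_{\mathbf{g}'\mid H_0}$ | (7) |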
where the inner expectation is over the joint distribution of the data vector and the parameter vector under the signal-present hypothesis. For the 2AFC test, the observer is shown a large number of pairs of data vectors or images, with each pair consisting of a signal-absent image and a signal-present image. The observer must decide which of the pair of images is from the signal-present ensemble and estimate the parameter vector for that image. For the pairs where the correct image was chosen, the utility of the estimate as compared with the true value is computed. These utility values are then summed, and the sum is divided by the total number of image pairs. The end result is an estimate of the AEROC for this observer.
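The 2AFC scoring rule described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `aeroc_2afc` and the example numbers are hypothetical, and the observer is assumed to choose the image with the larger test statistic.

```python
import numpy as np


def aeroc_2afc(t_absent, t_present, utility_present):
    """Estimate the AEROC from paired 2AFC trials.

    Entry i of each array belongs to trial i:
    t_absent[i], t_present[i] : test statistic on the two images of the pair
    utility_present[i]        : utility of the estimate made on the
                                signal-present image of the pair
    """
    t_absent = np.asarray(t_absent, dtype=float)
    t_present = np.asarray(t_present, dtype=float)
    utility_present = np.asarray(utility_present, dtype=float)

    # The observer is correct when the signal-present image scores higher.
    correct = t_present > t_absent
    # Wrong choice scores zero; correct choice scores the utility.
    scores = np.where(correct, utility_present, 0.0)
    # Divide by the TOTAL number of pairs, not the number of correct ones.
    return scores.mean()


estimate = aeroc_2afc([0.1, 0.8, 0.2], [0.5, 0.3, 0.9], [1.0, 0.7, 0.4])
```

In this toy example the observer is correct on the first and third pairs, so the estimate is (1.0 + 0.4)/3.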
4. IDEAL EROC OBSERVER
Another useful property of the EROC curve is that there is an ideal EROC observer for any given detection/estimation task and utility function. The EROC curve for this ideal observer lies above all others for the given probability distributions and utility function. Of course, this implies that the AEROC for this ideal observer is the maximum possible; therefore, this ideal AEROC can be used as a figure of merit for an imaging system on the given detection/estimation task relative to the specified utility function.
To define the test statistic and estimator for the ideal EROC observer, we first define a conditional likelihood ratio as
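| $\Lambda(\mathbf{g}\mid\boldsymbol{\theta})=\frac{\mathrm{pr}(\mathbf{g}\mid\boldsymbol{\theta},H_1)}{\mathrm{pr}(\mathbf{g}\mid H_0)}$ | (8) |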
The ideal EROC observer test statistic is given by the maximum value of a likelihood-ratio-weighted average of the utility function:
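| $T_I(\mathbf{g})=\max_{\boldsymbol{\theta}'}\int \mathrm{pr}(\boldsymbol{\theta})\,\Lambda(\mathbf{g}\mid\boldsymbol{\theta})\,u(\boldsymbol{\theta}',\boldsymbol{\theta})\,d\boldsymbol{\theta}$ | (9) |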
This integral could also be viewed as a utility-weighted average of the conditional likelihood ratio. A third interpretation is that the integral is the weighted inner product of the conditional likelihood with the utility as a function of its second argument. This will be maximized when the first argument θ′ is such that these two functions align as closely as possible as vectors in the weighted Hilbert space defined by the prior probability on the parameter vector. The ideal EROC observer estimator is actually computed along with the test statistic:
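| $\hat{\boldsymbol{\theta}}_I(\mathbf{g})=\arg\max_{\boldsymbol{\theta}'}\int \mathrm{pr}(\boldsymbol{\theta})\,\Lambda(\mathbf{g}\mid\boldsymbol{\theta})\,u(\boldsymbol{\theta}',\boldsymbol{\theta})\,d\boldsymbol{\theta}$ | (10) |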
This equation implies that we may also write the ideal test statistic in the form
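| $T_I(\mathbf{g})=\int \mathrm{pr}(\boldsymbol{\theta})\,\Lambda(\mathbf{g}\mid\boldsymbol{\theta})\,u[\hat{\boldsymbol{\theta}}_I(\mathbf{g}),\boldsymbol{\theta}]\,d\boldsymbol{\theta}$ | (11) |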
where the ideal estimator is defined by Eq. (10). This form is useful for studying the mathematical properties of the ideal EROC test statistic.
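When the parameter space is discretized, the maximization that yields both the test statistic and the estimator reduces to a matrix-vector product followed by an argmax. The sketch below assumes a finite grid of parameter values; the function name `ideal_eroc_observer`, the grid, and the example numbers are hypothetical, and the conditional likelihood ratios are taken as given inputs rather than computed from a data model.

```python
import numpy as np


def ideal_eroc_observer(lik_ratio, prior, utility):
    """Ideal EROC test statistic and estimate on a discrete parameter grid.

    lik_ratio[k]  : conditional likelihood ratio Lambda(g | theta_k)
    prior[k]      : prior probability of theta_k (sums to one)
    utility[j, k] : u(theta_j, theta_k), candidate estimate j, true value k
    Returns (T_I, index of the estimate on the grid).
    """
    lik_ratio = np.asarray(lik_ratio, dtype=float)
    prior = np.asarray(prior, dtype=float)
    utility = np.asarray(utility, dtype=float)

    # expected[j] = sum_k pr(theta_k) * Lambda(g | theta_k) * u(theta_j, theta_k)
    expected = utility @ (prior * lik_ratio)
    j = int(np.argmax(expected))
    return float(expected[j]), j


# Identity utility: full credit only for the exact grid point. With this
# choice the estimate is the grid point maximizing prior * likelihood ratio.
t_ideal, j_hat = ideal_eroc_observer(
    lik_ratio=[0.5, 2.0, 1.0],
    prior=[0.2, 0.3, 0.5],
    utility=np.eye(3),
)
```

The estimate and the test statistic come out of the same computation, which mirrors the remark above that the estimator is computed along with the test statistic.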
The proof that these expressions give the ideal EROC observer is an easy adaptation of the proof presented by Khurd and Gindi for the ideal LROC observer [3,4]. For a given value P of the false-positive fraction, we have a constrained maximization problem for UTP(T0), considered as a functional of the test statistic T(g) and the estimator θ̂(g), and as a function of the threshold T0. By using a Lagrange multiplier λ, this optimization problem is equivalent to choosing the functions θ̂(g) and T(g) and the numbers T0 and λ that maximize
| $U_{TP}(T_0)+\lambda\,[P-P_{FP}(T_0)]$ | (12) |
This quantity may be written as
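| $\lambda P+\int \mathrm{pr}(\mathbf{g}\mid H_0)\,\mathrm{step}[T(\mathbf{g})-T_0]\left\{\int \mathrm{pr}(\boldsymbol{\theta})\,\Lambda(\mathbf{g}\mid\boldsymbol{\theta})\,u[\hat{\boldsymbol{\theta}}(\mathbf{g}),\boldsymbol{\theta}]\,d\boldsymbol{\theta}-\lambda\right\}d\mathbf{g}$ | (13) |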
For a fixed function T(g) and fixed numbers T0 and λ, the estimator in Eq. (10) maximizes this quantity. Now that we have the estimator, we choose T(g) and T0 so that
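| $\mathrm{step}[T(\mathbf{g})-T_0]=\mathrm{step}\left[\int \mathrm{pr}(\boldsymbol{\theta})\,\Lambda(\mathbf{g}\mid\boldsymbol{\theta})\,u[\hat{\boldsymbol{\theta}}_I(\mathbf{g}),\boldsymbol{\theta}]\,d\boldsymbol{\theta}-\lambda\right]$ | (14) |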
This is most easily achieved by using Eq. (9) for the test statistic and setting λ=T0. Finally, the threshold T0 is chosen so that the false-positive fraction is P. Of course, as with ordinary detection tasks, any monotonic transformation applied to this test statistic and threshold will work just as well. This derivation does not tell us whether any ideal EROC observer must use a test statistic that is a monotonic transformation of Eq. (9).
5. ALTERNATE EXPRESSIONS FOR THE IDEAL AEROC
One interesting result of the definition of the ideal EROC observer is that the AEROC for this observer can be computed by sampling independent pairs from the signal-absent ensemble:
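| $\mathrm{AEROC}_I=\langle\langle T_I(\mathbf{g})\,\mathrm{step}[T_I(\mathbf{g})-T_I(\mathbf{g}')]\rangle_{\mathbf{g}\mid H_0}\rangle_{\mathbf{g}'\mid H_0}$ | (15) |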
The proof of this statement has been given elsewhere [11] and is summarized in the Appendix. If we have a good model for the distribution of the ideal EROC test statistic under the signal-absent hypothesis, then the equation
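| $\mathrm{AEROC}_I=\int\!\!\int \mathrm{pr}(T_I\mid H_0)\,\mathrm{pr}(T_I'\mid H_0)\,T_I\,\mathrm{step}(T_I-T_I')\,dT_I\,dT_I'$ | (16) |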
can be used to estimate the ideal AEROC. This equation is similar to an expression for the AUC of the ideal observer for a pure detection task,
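| $\mathrm{AUC}_I=\langle\langle \Lambda(\mathbf{g})\,\mathrm{step}[\Lambda(\mathbf{g})-\Lambda(\mathbf{g}')]\rangle_{\mathbf{g}\mid H_0}\rangle_{\mathbf{g}'\mid H_0}$ | (17) |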
where Λ (g) is the likelihood ratio. Equation (17) leads to other equalities and inequalities that relate the ideal AUC to various moments of the likelihood ratio under the signal-absent hypothesis [5–8]. Analogous relations can be derived for the AEROC and signal-absent moments of TI.
For example, evaluating the step function immediately gives
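| $\mathrm{AEROC}_I=\int_{-\infty}^{\infty}\mathrm{pr}(T_I'\mid H_0)\int_{T_I'}^{\infty}T_I\,\mathrm{pr}(T_I\mid H_0)\,dT_I\,dT_I'$ | (18) |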
If the lower limit in the inner integral is extended to negative infinity, then the double integral gives the mean value of TI. This means that Eq. (18) can also be written as
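| $\mathrm{AEROC}_I=\langle T_I\rangle_{T_I\mid H_0}-\int_{-\infty}^{\infty}\mathrm{pr}(T_I'\mid H_0)\int_{-\infty}^{T_I'}T_I\,\mathrm{pr}(T_I\mid H_0)\,dT_I\,dT_I'$ | (19) |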
This equation is meaningful as long as the mean value of the ideal test statistic is finite under the signal-absent hypothesis.
Symmetric versions of these two equations can be derived also. First, we interchange the order of integration in Eq. (18) and interchange the integration variables to obtain
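| $\mathrm{AEROC}_I=\int_{-\infty}^{\infty}T_I'\,\mathrm{pr}(T_I'\mid H_0)\int_{-\infty}^{T_I'}\mathrm{pr}(T_I\mid H_0)\,dT_I\,dT_I'$ | (20) |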
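Then we add this equation to Eq. (18) and divide by 2. The result is
| $\mathrm{AEROC}_I=\frac{1}{2}\int_{-\infty}^{\infty}\mathrm{pr}(T_I'\mid H_0)\left[\int_{T_I'}^{\infty}T_I\,\mathrm{pr}(T_I\mid H_0)\,dT_I+T_I'\int_{-\infty}^{T_I'}\mathrm{pr}(T_I\mid H_0)\,dT_I\right]dT_I'$ | (21) |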
A similar procedure with Eq. (19) leads to
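| $\mathrm{AEROC}_I=\langle T_I\rangle_{T_I\mid H_0}-\frac{1}{2}\int_{-\infty}^{\infty}\mathrm{pr}(T_I'\mid H_0)\left[\int_{-\infty}^{T_I'}T_I\,\mathrm{pr}(T_I\mid H_0)\,dT_I+T_I'\int_{T_I'}^{\infty}\mathrm{pr}(T_I\mid H_0)\,dT_I\right]dT_I'$ | (22) |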
Either of these last two equations gives us the lower bound
| $\mathrm{AEROC}_I\geq\frac{1}{2}\langle T_I\rangle_{T_I\mid H_0}$ | (23) |
We will assume that the ideal AEROC is a finite number. This lower bound then implies that the mean value of the ideal test statistic is also finite.
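The pairwise expressions in this section can be checked on samples. The sketch below is an illustration under stated assumptions, not part of the derivation: it takes an array of signal-absent values of the ideal test statistic (the name `pairwise_aeroc_forms` and the example values are hypothetical) and compares three quantities over all ordered pairs of distinct samples. The first is the average of T step(T − T′). The second is half the average of max(T, T′), which is one way to read the symmetrized expression. The third is half the sample mean, which is the lower bound.

```python
import numpy as np


def pairwise_aeroc_forms(t):
    """Return (step form, symmetrized form, lower bound) from H0 samples of T_I."""
    t = np.asarray(t, dtype=float)
    n = t.size
    ti = t[:, None]  # T_I(g)  along rows
    tj = t[None, :]  # T_I(g') along columns
    off_diag = ~np.eye(n, dtype=bool)  # use only pairs of distinct samples

    # Average of T step(T - T') over ordered pairs.
    step_form = (ti * (ti > tj))[off_diag].mean()
    # Half the average of max(T, T') over the same pairs.
    max_form = 0.5 * np.maximum(ti, tj)[off_diag].mean()
    # Half the sample mean of T.
    half_mean = 0.5 * t.mean()
    return step_form, max_form, half_mean


step_form, max_form, half_mean = pairwise_aeroc_forms([1.0, 2.0, 4.0])
```

For samples without ties the first two numbers agree exactly, and both are at least as large as the third, since max(T, T′) is never smaller than T.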
6. SPECIAL CASES
We will summarize briefly some special cases of the ideal EROC observer. When the utility function is δ(θ′ – θ), the ideal EROC observer uses MAP estimation. In this case, we have
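| $T_I(\mathbf{g})=\max_{\boldsymbol{\theta}}\,\mathrm{pr}(\boldsymbol{\theta})\,\Lambda(\mathbf{g}\mid\boldsymbol{\theta})$ | (24) |
| $\hat{\boldsymbol{\theta}}_I(\mathbf{g})=\arg\max_{\boldsymbol{\theta}}\,\mathrm{pr}(\boldsymbol{\theta})\,\Lambda(\mathbf{g}\mid\boldsymbol{\theta})$ | (25) |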
These results show that MAP estimation is close to optimal in the EROC sense when close tolerances are required for the parameter estimation.
If, in addition to the δ utility function, we also have a flat prior on the parameters, then the ideal EROC observer uses ML estimation
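| $T_I(\mathbf{g})=\max_{\boldsymbol{\theta}}\,\Lambda(\mathbf{g}\mid\boldsymbol{\theta})=\Lambda[\mathbf{g}\mid\hat{\boldsymbol{\theta}}_I(\mathbf{g})]$ | (26) |
| $\hat{\boldsymbol{\theta}}_I(\mathbf{g})=\arg\max_{\boldsymbol{\theta}}\,\Lambda(\mathbf{g}\mid\boldsymbol{\theta})$ | (27) |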
In this case, the decision statistic is the conditional likelihood ratio at the estimated parameter value. These equations show that ML estimation and what we will call “likelihood windowing” are close to optimal in the EROC sense when we have close tolerances for our estimates and complete ignorance about the true parameter values.
If we define a Bayesian cost function to be the negative of our utility function, then our ideal EROC estimator also minimizes the Bayesian risk. This means that, under very general conditions on the utility function and the posterior density on the parameter vector [12], the posterior mean estimator
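| $\hat{\boldsymbol{\theta}}_I(\mathbf{g})=\int \boldsymbol{\theta}\,\mathrm{pr}(\boldsymbol{\theta}\mid\mathbf{g},H_1)\,d\boldsymbol{\theta}$ | (28) |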
is the ideal EROC estimator with the corresponding test statistic
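| $T_I(\mathbf{g})=\int \mathrm{pr}(\boldsymbol{\theta})\,\Lambda(\mathbf{g}\mid\boldsymbol{\theta})\,u[\hat{\boldsymbol{\theta}}_I(\mathbf{g}),\boldsymbol{\theta}]\,d\boldsymbol{\theta}$ | (29) |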
An example of a set of such conditions is the following: The utility function is a concave symmetric function of the difference of the parameter vectors, and the posterior distribution of the parameters is symmetric about its mean. The concavity of the utility function implies that it cannot be positive unless the range of each parameter is bounded. An advantage of this type of utility function is that we do not need to perform the maximization calculation to compute the test statistic or the estimator.
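A small numerical check illustrates the posterior mean case. The sketch assumes a scalar parameter on the bounded interval [−1, 1], a hypothetical quadratic utility that is concave, symmetric in the difference of its arguments, and nonnegative on that interval, and a made-up discretized posterior; none of these choices comes from the text. The posterior is proportional to the prior times the conditional likelihood ratio, so maximizing the posterior-weighted utility gives the same estimate as the ideal EROC maximization. For the quadratic utility in particular, the maximizer is the posterior mean whether or not the posterior is symmetric.

```python
import numpy as np


def best_estimate_on_grid(theta, posterior, utility):
    """Grid search for the estimate maximizing the posterior-expected utility."""
    expected = np.array([np.sum(posterior * utility(c, theta)) for c in theta])
    return theta[np.argmax(expected)]


def quadratic_utility(estimate, truth):
    # Concave and symmetric in (estimate - truth); nonnegative on [-1, 1]
    # because the largest possible difference there is 2.
    return 1.0 - 0.25 * (estimate - truth) ** 2


theta = np.linspace(-1.0, 1.0, 201)                    # bounded parameter grid
posterior = np.exp(-0.5 * ((theta - 0.3) / 0.2) ** 2)  # made-up posterior shape
posterior /= posterior.sum()                           # normalize on the grid

best = best_estimate_on_grid(theta, posterior, quadratic_utility)
posterior_mean = np.sum(posterior * theta)
```

The grid search and the posterior mean agree to within half a grid step, which shows why no explicit maximization is needed for this kind of utility.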
Finally, consider the case of a normal probability distribution for the data
| (30) |
with the signal in the mean
| (31) |
Again, we will assume a δ utility function and a flat prior. The decision statistic in this case can be replaced with its logarithm, which is given by
| (32) |
The corresponding estimator is given by
| (33) |
Note that, in contrast to the ideal AUC observer in this situation, the ideal EROC observer does not employ a linear estimator or test statistic due to the maximization step. Instead, an affine function of the data is scanned in the parameter space. The decision test statistic could be called a scanning Hotelling observer and is optimal in the EROC sense with the assumptions given above.
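The scanning operation can be sketched for a finite set of candidate parameter values. The notation here is an assumption made for illustration: the data are taken to be normal with a known mean `background` and covariance `cov` under the signal-absent hypothesis, the signal `templates[k]` is added to the mean under the signal-present hypothesis, and the log of the conditional likelihood ratio is then the affine function s(θ)ᵀK⁻¹(g − b) − ½ s(θ)ᵀK⁻¹s(θ). The function name `scanning_hotelling` is hypothetical.

```python
import numpy as np


def scanning_hotelling(g, background, cov, templates):
    """Scan the log conditional likelihood ratio over candidate signals.

    g          : data vector, shape (M,)
    background : signal-absent mean, shape (M,)
    cov        : data covariance matrix K, shape (M, M)
    templates  : templates[k] = s(theta_k), shape (n_candidates, M)
    Returns (maximum log likelihood ratio, index of the maximizing candidate).
    """
    g = np.asarray(g, dtype=float)
    background = np.asarray(background, dtype=float)
    cov = np.asarray(cov, dtype=float)
    templates = np.asarray(templates, dtype=float)

    # Solve K x = s(theta_k) for all candidates instead of inverting K.
    kinv_s = np.linalg.solve(cov, templates.T)        # shape (M, n_candidates)
    # Affine function of the data, one value per candidate theta_k.
    linear_term = (g - background) @ kinv_s
    quadratic_term = 0.5 * np.einsum('km,mk->k', templates, kinv_s)
    log_lr = linear_term - quadratic_term

    k = int(np.argmax(log_lr))
    return float(log_lr[k]), k


# Four-pixel toy problem: white noise, signal of amplitude 2 at one pixel.
stat, k_hat = scanning_hotelling(
    g=[0.1, 2.2, -0.3, 0.4],
    background=np.zeros(4),
    cov=np.eye(4),
    templates=2.0 * np.eye(4),
)
```

With a flat prior and a very narrow utility, the maximum is the detection statistic and the maximizing index is the estimate, so the nonlinearity enters only through the final max and argmax.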
7. POSITIVE UTILITY FUNCTIONS
If the utility function is positive, then the ideal decision test statistic will be positive and the ideal EROC observer shares many properties with the likelihood ratio, which is also necessarily a positive quantity. For example, the ideal AEROC may be expressed in terms of complex moments of the ideal EROC test statistic as
| (34) |
This equation is not obvious, and the steps leading to it involve the Fourier transform and contour integration. A detailed derivation of this equation has been presented elsewhere [11] and is summarized in the Appendix. Equation (34) can be used to relate the ideal AEROC to Fisher information when the signal is weak. This relation will be discussed below.
Another way to derive Eq. (34), and other relations that we will soon see, is to define probability distributions
| (35) |
The second PDF here is a purely artificial mathematical construction that is nevertheless useful for studying the mathematical properties of the ideal AEROC. If we form a likelihood ratio with these two densities, we have
| $\Lambda(\mathbf{g})=\frac{T_I(\mathbf{g})}{\langle T_I\rangle_{T_I\mid H_0}}$ | (36) |
Therefore, the normalized ideal EROC test statistic is a likelihood ratio, albeit for an artificially constructed detection task. The AUC for this likelihood ratio is related to the ideal AEROC via
| $\mathrm{AEROC}_I=\langle T_I\rangle_{T_I\mid H_0}\,\mathrm{AUC}_{\Lambda}$ | (37) |
This equation implies that there are many properties of the ideal AUC that transfer over to the ideal AEROC.
As a first example of the use of Eq. (37), consider the following equation for the AUC [7]:
| (38) |
where
| (39) |
Setting $T_I=\langle T_I\rangle_{T_I\mid H_0}\,\Lambda$, we obtain
| (40) |
This equation is in fact related to Eq. (34) by Parseval’s theorem for the Mellin transform.
There are several inequalities that can be derived from the various expressions for the ideal AEROC presented above. One example that follows easily from Eq. (34) is
| (41) |
By the Schwarz inequality, we can show that this lower bound is an improvement over the one given above in Eq. (23) for a general utility function.
We can obtain another inequality by defining a function μ(β) by
| (42) |
Adapting an inequality in [8] for the ideal AUC, we have
| (43) |
Note that the function in the center of this inequality is a monotonically increasing function of AEROCI. The lower bound in Eq. (43) is actually the same as the lower bound in Eq. (41). The upper bound in Eq. (43) is a considerable improvement over the upper bound in Eq. (41). If μ(β) is slowly varying near , then Eq. (43) will give a good approximation to the ideal AEROC in terms of the β=1 and moments of the ideal EROC test statistic.
A different approximation to the ideal AEROC follows from the G(0) approximation to the ideal AUC [5]. When translated into the notation used here, this approximation yields
| (44) |
The same Schwarz inequality mentioned above shows that the quantity in the square root is positive.
In summary then, all of the inequalities and approximations for ideal AUC observers have their counterparts for ideal AEROC observers when the utility function is nonnegative. This fact, together with the 2AFC interpretation of the AEROC, means that much of the mathematical machinery that has been developed to estimate the ideal AUC [13] and test these estimates for bias and variance [14] carries over with minor modifications to the ideal AEROC when the utility function is positive.
8. IDEAL AEROC AND FISHER INFORMATION
The relation between the ideal AUC for weak signals and Fisher information has been worked out for general signal parameters in detail elsewhere [9,10]. For simplicity, we will confine ourselves to the signal amplitude parameter. Let α be the signal amplitude, which is fixed, small, positive, and not one of the parameters that we are trying to estimate. Then pr(g|θ,H1) is replaced with pr(g|α,θ,H1) throughout, and Eq. (35) is modified to
| (45) |
Under these circumstances, the ideal AUC for the artificial detection task is given to lowest order in α by
| (46) |
where F0 is the Fisher information
| (47) |
We want to use this to obtain the lowest-order approximation for the ideal AEROC when the signal is weak.
To compute this Fisher information, we start with the conditional likelihood ratio Λ(g|α,θ) = pr(g|α,θ,H1)/pr(g|H0), which now depends on the signal amplitude, and the corresponding ideal AEROC test statistic
| (48) |
We will assume that θ is an ordinary vector parameter. The derivative of the test statistic is
| (49) |
The last term in the square brackets is zero by definition of the ideal EROC estimator. When α=0, the ideal EROC estimate is independent of the data. We will call this estimate θ0. Therefore,
| (50) |
and
| (51) |
We also have
| (52) |
in which the constant on the right is defined.
Now we are ready to put the pieces together to compute F0. After some rearranging of integrals, we find that an important function is the conditional score, defined as
| (53) |
We may now write the derivative in Eq. (50) as
| (54) |
The function s(g|θ) may be regarded as a random function of θ; in other words, a random process on the parameter space. The random nature of this process is due to the random vector g. For the mean of this random process, we have 〈s(g|θ)〉 = 0 under the signal-absent hypothesis, because pr(g|α,θ,H1) integrates to one for every value of α.
This implies that the following expectation is the covariance for this random process:
| (55) |
This function might be called a Fisher information kernel and generalizes the standard notion of the Fisher information. Finally, the Fisher information that we seek is given by the expectation
| (56) |
This number is necessarily nonnegative since the covariance operator of a random process is a nonnegative definite operator. With this value for F0, and using Eqs. (51) and (54), we have, to lowest order in the signal amplitude,
| (57) |
for weak signals. If we plot the ideal AEROC versus signal amplitude, this equation tells us that the slope is determined by ū0 and F0. This can be used to relate the ideal AEROC to the ideal AUC for the pure detection task when the signal is weak.
9. CONCLUSIONS
The EROC curve is a straightforward generalization of the LROC curve to the task of detecting a signal and estimating an arbitrary parameter vector associated with it. The use of a general utility function also takes us beyond the standard LROC paradigm. The relation between AEROC and 2AFC studies provides an easy way to measure observer performance, as measured by the AEROC, in combined detection/estimation tasks.
The formulas for the ideal EROC observer are also easy generalizations of those for the ideal LROC observer. The ideal AEROC can, therefore, be computed as a figure of merit that is task dependent but independent of any particular detection/estimation algorithm. Of course, this figure of merit also depends on the choice of utility function.
There is a relation between the ideal AEROC, when the utility function is positive, and the ideal AUC for an artificially constructed detection task. The bounds and approximations derived from this relation allow us to import many of the methods that have been developed to compute the ideal AUC to the computation of the ideal AEROC. In particular, for weak signals, the ideal AEROC can be approximated from a Fisher information kernel to lowest order in the signal amplitude.
Finally, with particular choices for the prior and the utility function, many commonly used detection test statistics and estimators turn up as special cases of the ideal EROC observer. This gives conditions under which many popular detection algorithms and parameter estimators are optimal in the EROC sense.
Acknowledgments
The author acknowledges support from NIH/NIBIB grants R01-EB002146 and P41 EB002035.
APPENDIX A
1. Proof of Equation (15)
First, we want to prove that
| (A1) |
To see why this equation is true, we start with
| (A2) |
This expression then gives us
| (A3) |
On the right in this equation is the inner expectation in the 2AFC expression for the ideal AEROC in Eq. (7). Since the outer expectations in the two expressions for the ideal AEROC are the same, this shows their equivalence.
2. Proof of Equation (34)
For a nonnegative utility function, we may define an equivalent test statistic by
| (A4) |
Then the ideal AEROC is given by
| (A5) |
The following chain of equalities leads to an expression for the ideal AEROC in terms of complex moments of TI. We start with the characteristic function
| (A6) |
and write the ideal AEROC as an integral in frequency space
| (A7) |
Evaluating the delta function gives
| (A8) |
In terms of the ideal test statistic TI, this expression becomes
| (A9) |
If we let β=−2πω, the integral can be written as
| (A10) |
This last integral can be viewed as a contour integral along the imaginary axis. We shift the contour one-half unit to the right in order to dispense with the principal value. The integrand is analytic on the strip between these two contours due to two inequalities. The first inequality is
| (A11) |
which is true for any x and y, and the second inequality is
| (A12) |
which is true for 0≤x≤1. When we shift the contour, iβ is replaced by 1/2+iβ. After taking into account the pole at the origin and then keeping only the real part, we arrive at
| (A13) |
Footnotes
OCIS codes: 110.3000, 110.2960.
References
- 1. Gifford HC, Wells RG, King MA. A comparison of human observer LROC and numerical observer ROC tumor detection in SPECT images. IEEE Trans Nucl Med. 1999;46:1032–1037.
- 2. Swensson RG. Using localization data from image interpretations to improve estimates of performance accuracy. Med Decis Making. 2000;20:170–185. doi: 10.1177/0272989X0002000203.
- 3. Khurd P, Gindi G. Decision strategies maximizing the area under the LROC curve. Proc SPIE. 2005;5749:150–161. doi: 10.1109/TMI.2005.859210.
- 4. Khurd P, Gindi G. Decision strategies that maximize the area under the LROC curve. IEEE Trans Med Imaging. 2005;24:1626–1636. doi: 10.1109/TMI.2005.859210.
- 5. Barrett HH, Abbey CK, Clarkson E. Objective assessment of image quality III: ROC metrics, ideal observers and likelihood-generating functions. J Opt Soc Am A. 1998;15:1520–1535. doi: 10.1364/josaa.15.001520.
- 6. Clarkson E, Barrett HH. Statistical decision theory and tumor detection. In: Strickland R, editor. Image Processing Techniques for Tumor Detection. Dekker; 2002. Chap. 4.
- 7. Clarkson E, Barrett HH. Approximations to ideal-observer performance on signal detection tasks. Appl Opt. 2000;39:1783–1793. doi: 10.1364/ao.39.001783.
- 8. Clarkson E. Bounds on the area under the receiver operating characteristic curve for the ideal observer. J Opt Soc Am A. 2002;19:1963–1968. doi: 10.1364/josaa.19.001963.
- 9. Shen F, Clarkson E. Using Fisher information to compute ideal observer performance on detection tasks. Proc SPIE. 2004;5372:22–30. doi: 10.1364/josaa.23.002406.
- 10. Shen F, Clarkson E. Using Fisher information to approximate ideal observer performance on detection tasks for lumpy backgrounds. J Opt Soc Am A. 2006;23:2406–2414. doi: 10.1364/josaa.23.002406.
- 11. Clarkson E. Estimation ROC curves and their corresponding ideal observers. Proc SPIE. 2007;6515:651504-1–651504-7.
- 12. Van Trees HL, editor. Detection, Estimation, and Modulation Theory (Part I). Academic; 1968.
- 13. Kupinski MA, Hoppin J, Clarkson E, Barrett HH. Ideal observer computation using Markov-chain Monte Carlo. J Opt Soc Am A. 2003;20:430–438. doi: 10.1364/josaa.20.000430.
- 14. Clarkson E, Kupinski MA, Hoppin J. Assessing the accuracy of estimates of the likelihood ratio. Proc SPIE. 2003;5034:135–143.
