Abstract
We study the problem of interactive segmentation and contour completion for multiple objects. The constraints our model incorporates come from user scribbles (interior or exterior constraints) as well as from information regarding the topology of the 2-D space after partitioning (the number of closed contours desired). We discuss how concepts from discrete calculus and a simple identity using the Euler characteristic of a planar graph can be utilized to derive a practical algorithm for this problem. We also present specialized branch and bound methods for the case of single contour completion under such constraints. On an extensive dataset of ~ 1000 images, our experiments suggest that a small amount of side knowledge can give strong improvements over fully unsupervised contour completion methods. We show that by interpreting user indications topologically, user effort is substantially reduced.
1. Introduction
This paper is focused on developing optimization models for the problem of multiple contour completion/segmentation subject to side constraints. The types of constraints our algorithm incorporates are (a) those relating to inside (or outside) seed indications given via user scribbles; (b) global constraints on the topology, i.e., information which reflects the number of unique closed contours a user is looking for. Given the output from a boundary detector (e.g., Probability of Boundary or Pb [25]), we obtain a large set of weighted locally-based contours (or edgelets), as shown in Fig. 1. The objective then is to find k closed “legal” contour cycles with desirable properties (e.g., curvilinear continuity, strong edge gradient, small curvature), where legal solutions are those that satisfy the side constraints (Fig. 1). The basic primitives in our construction are contour fragments, not pixels. The motivation for this choice is similar to that of most works on contour detection for image segmentation – by moving from predominantly region-based terms to a function that utilizes the strength of edges, we seek to partly mitigate the dependence of the final segmentation on the homogeneity of the regions alone and on the number of seeds. Additionally, in at least some circumstances, one expects benefits in terms of running time by utilizing a few hundred edges instead of a million pixels in the image. Our high level goal is the design of practical contour completion algorithms that take advice – which in a sense parallels a powerful suite of methods that have recently demonstrated how global knowledge can be incorporated within popular region-based image segmentation methods [26].
Figure 1:
Left to right: input images, edgelets or contours with seed indications, and final contour. Foreground is marked in green; background is marked in red; boundary is marked in white. Best viewed in color.
Related Work.
The study of methods for detection of salient edges and object boundaries from images has a long history in computer vision [37]. The associated body of literature is vast – methods range from performing edge detection at the level of local patches [32], to taking the continuity of edge contours into account [37, 29], to incorporating high-level cues [36] such as those derived from shape and/or appearance [25]. While the appropriateness of a specific contour detector is governed by the downstream application, developments in recent years have given a number of powerful methods that yield high quality boundary detection on a large variety of images and perform well on established benchmarks [25]. Broadly, this class of methods uses local measurements to estimate the likelihood of a boundary at a pixel location. To do this, the conventional approach was to identify discontinuities in the brightness channel, whereas newer methods exploit significantly more information. For instance, [27] suggests a logistic regression on brightness, color, and texture, and [9, 24] learn a classifier by operating on a large number of features derived from image patches or filter responses at multiple orientations. Contemporary to this line of research, there are also a variety of existing algorithms that integrate (or group) local edge information into a globally salient contour. Since one expects the global contour to be smooth, the well known Snakes formulation introduced an objective function based on the first and second derivatives of the curve. Others have proposed utilizing the ratio of two line integrals [18], incorporating curvature [31, 10], joining pre-extracted line segments [40, 35], and using CRFs to ensure the continuity of contours [30]. Note that despite similarities, contour detection on its own is not the same as image segmentation. In fact, even when formalized as contour completion, an algorithm may not always produce a closed contour. Nonetheless, from most “edge-based” methods one can obtain a partition of the image into object and background regions. Without getting into the merits of edges versus regions, one can view edge-based contours as a viable alternative to “region-based” image segmentation methods in many applications.
The success of the above developments notwithstanding, the applicability of these methods has been somewhat limited by their inability to successfully discriminate between contours of different classes of objects. To address this limitation, there has been a noticeable shift recently towards the incorporation of additional information within the contour completion process. In particular, several groups have presented frameworks that leverage category specific (or semantic) information in the process of obtaining closed object boundaries. Specific examples of this line of work include semantic contours [16], the hierarchical ultrametric contour map [2], and particle filtering based object detection via edges [23]. The basic idea here is to achieve a balance between bottom-up edge/boundary detection and top-down supervision, for simultaneous image segmentation and recognition. While semantic knowledge based contour completion is quite powerful, its performance invariably depends on the richness of the underlying training corpus. Indeed, if the shape epitomes do not reflect the object of interest accurately enough (significant pose variations), if there is clutter/occlusion, or when a novel class is not well represented in the training data, the results may be unsatisfactory. In these circumstances, it seems natural to endow contour completion models with the capability to leverage some form of user supervision (foreground and background seeds) [15]. Further, knowledge provided in the form of the number of closed contours a user requires can be a powerful form of user guidance as well. Notice that the adoption of Grabcut type methods suggests that a nominal amount of “interactive scribbles” is readily available in many applications, and may significantly improve the quality of solutions. While there are many mechanisms which incorporate such constraints in region based segmentation, only a few methods take such information explicitly into account for edge-based contour completion. In this work, we leverage a discrete calculus based toolset to incorporate such topological and seed-indication supervision within a practical contour completion algorithm.
The primary contributions of this paper are: (i) We present a unified optimization model for multiple contour completion/segmentation which incorporates topological constraints as well as inclusion/exclusion of foreground and background seeds. The topological knowledge is included by using the Euler characteristic of the edgelet graph, whereas the inclusion/exclusion constraints utilize concepts from discrete calculus. (ii) For an extensive dataset, we provide strong evidence that with a small amount of user interaction, one can obtain high quality segmentations based on edge contour information alone. We give an easy to use implementation, as well as user scribble data corresponding to varying levels of interaction on this large (~ 1000) set of images.
2. Preliminaries
The tools of discrete calculus provide a powerful formalism to represent the topological information in an image [14, 20, 7]. We use the conventions of discrete calculus to describe our problem of finding multiple contour closures. In this section, we introduce the idea of cell complexes, which are the fundamental building blocks of our construction. The following text also introduces the necessary notation, which will be used throughout the rest of the text.
2.1. Discrete Calculus
The domain of an image is decomposed into a set of cells. If the decomposition is such that (i) the interiors of the cells are disjoint and (ii) the boundary between any two p-dimensional cells is a (p − 1)-dimensional cell, then we have a cell complex. As an example, consider a planar graph G = ⟨V, E, F⟩ with vertices V, edges E, and faces F. Such a graph has incidence relationships between each face and its bounding edges, and between each edge and its endpoint vertices. Similarly, each vertex is incident on two or more edges and each edge is incident on two faces. Notice that the interiors of any pair of faces are disjoint, and the boundary between any two faces gives an edge, where the dimension is reduced by one. As a consequence, we get a 2D cell complex for a planar graph, and also a set of incidence relationships among cells of different dimensions.
A cell complex may be oriented such that we can describe directions on each cell relative to its orientation, see Fig. 2(a). Each type of cell has a corresponding pair of possible orientations: a vertex (0-cell) is either a source or a sink, while an edge (1-cell) may be directed toward either endpoint. Further, each cell induces a corresponding orientation on incident cells; for example, a directed edge has a source endpoint vertex at one end and a sink at the other. The orientations of a cell and a member of its boundary are coherent if the induced orientations agree; an example is shown in Fig. 2(b).
Figure 2:
Visualization of the orientations on cells of different dimensionalities (a). In (b), the left column shows p-cells with all of their boundary (p − 1)-cells coherently oriented, and the right column shows them with all boundary cells anti-coherently oriented.
We may represent the two-dimensional image as an oriented complex. All faces are given the same orientation, while edges and vertices are given arbitrary orientations. After enumerating its constituent vertices, edges and faces, a selection of some subset of faces is specified with an indicator vector x ∈ {0,1}|F|: xi = 1 denotes that the candidate face Fi ∈ F is in the foreground, and xi = 0 otherwise. Similarly, we represent the edge and vertex configurations of G by indicator vectors y ∈ {0, 1}|E| and z ∈ {0, 1}|V| respectively. We require that the indicator vectors x, y, z on each level of cell consistently describe a segmentation. The key relationship is consistency between the labels on incident cells. These relationships can be expressed algebraically using the notion of a dimension-appropriate incidence matrix. The edge-face incidence matrix (also called the boundary operator) C1 ∈ {−1, 0, 1}|E|×|F| is defined by
C1;ij = +1 if edge i lies on the boundary of face j with coherent orientation, −1 if edge i lies on the boundary of face j with anti-coherent orientation, and 0 if edge i is not incident to face j.   (1)
Here, C1;ij refers to entry (i, j) of C1. Similarly, by discarding orientation information, we can define the edge-face corresponding matrix C2 ∈ {0, 1}|E|×|F|, which records which edges are incident to which face. It can be calculated as the element-wise absolute value of C1, such that C2;ij = |C1;ij|. The node-edge incidence matrix A1 ∈ {−1, 0, 1}|V|×|E| is defined analogously to (1), where A1;ij = ±1 iff node i is incident to edge j (with the sign given by the orientation). As with C2, we define the node-edge corresponding matrix A2 = |A1| ∈ {0, 1}|V|×|E|. We further use a node-edge degree matrix A3 ∈ [0, 1]|V|×|E|, where A3;ij = A2;ij/di and di denotes the degree of node i.
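To make these definitions concrete, the following minimal sketch (not part of the original paper) builds C1, C2, A1, A2, and A3 for a toy 2-cell complex consisting of two unit squares that share an edge; the numbering of vertices and edges and the chosen orientations are illustrative.

```python
import numpy as np

# Toy complex: two unit squares sharing edge 5.
# Vertices 0,1,2 on the top row and 3,4,5 on the bottom row.
edges = [(0, 1), (1, 2), (3, 4), (4, 5), (0, 3), (1, 4), (2, 5)]  # (tail, head)
# Each face is a counter-clockwise boundary: (edge ids, +1/-1 coherence signs).
faces = [([2, 5, 0, 4], [+1, -1, -1, +1]),   # left square
         ([3, 6, 1, 5], [+1, -1, -1, +1])]   # right square
nV, nE, nF = 6, len(edges), len(faces)

# Edge-face incidence (boundary operator) C1 and its unsigned version C2.
C1 = np.zeros((nE, nF), dtype=int)
for j, (es, signs) in enumerate(faces):
    C1[es, j] = signs
C2 = np.abs(C1)

# Node-edge incidence A1, unsigned A2, and the degree-normalized A3.
A1 = np.zeros((nV, nE), dtype=int)
for j, (tail, head) in enumerate(edges):
    A1[tail, j], A1[head, j] = -1, +1
A2 = np.abs(A1)
A3 = A2 / A2.sum(axis=1, keepdims=True)   # A3_ij = A2_ij / d_i

# The shared edge (edge 5) has opposite signs in the two columns of C1,
# reflecting that it is coherent with one face and anti-coherent with the other.
print(C1[5])   # -> [-1  1]
```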
Discrete calculus describes the notion of duality between cell complexes. In a p-complex, each q-cell (q ≤ p) has a corresponding dual (p − q)-cell. For any given cell complex, we can construct its dual in a way that preserves incidence relationships between cells, see Fig. 3. Using these concepts, in the following sections, we will formalize the required constraints within a contour completion objective function.
Figure 3:
Duality relationships between 2D cell complexes.
3. Problem Formulation
As described in Section 2.1, our model works with selections of the cells constituting the foreground. Since the notion of foreground for a face is self-evident, we will describe the labeling of vertices and edges, starting from a face labeling x. We enforce the following condition:
Condition 1.
A p-cell is in the foreground if and only if it is incident to a (p + 1)-cell in the foreground.
This condition ensures that each connected component of the foreground is itself a cell complex, a property we will use shortly.
First, we introduce an auxiliary indicator variable w ∈ {0, 1}|E| which selects the boundary edges. These edges are those which are incident to both a foreground and a background face. W.l.o.g., consider edge 1 incident to faces 1 and 2; then w1 = |x1 − x2|. Taken together, the full set of boundary edges precisely represents the contour of the selected foreground. We can now use the boundary operator from Section 2.1 to derive the identity
w = |C1x|, where the absolute value is taken element-wise.   (2)
Observe that each edge is incident to exactly two faces, and we specified that all faces have identical orientation. It follows that an edge must be coherent with one face and anti-coherent with the other. Therefore, for all internal edges (non-boundary edges in the foreground), the C1 operator, when multiplied with x, cancels the contribution from these two faces, leaving non-zero values only for the boundary edges. The internal edges (which are incident to foreground faces on both sides) can be identified in a different manner: the vector C2x counts the inside edges twice and the boundary edges once, since we discard orientation (and thus sign) information. In the preceding example, w.l.o.g., (C2x)1 = x1 + x2. Thus, Condition 1 will be satisfied for edges if the following identity holds:
2y − w = C2x.   (3)
We use the matrices A2, A3 to write a pair of linear inequalities which are equivalent to Condition 1 for vertices. Observe that (A2y)i is the number of foreground edges incident to vertex (or node) i. Similarly, scaling by the degree di of vertex i, (A3y)i ∈ [0, 1] is the proportion of edges incident to i which are in the foreground. Enforcing Condition 1 is then equivalent to:
A3y ≤ z ≤ A2y.   (4)
Since zi ∈ {0, 1}, the condition zi ≥ (A3y)i forces zi = 1 whenever any edge incident to i is in the foreground. Conversely, if no edge incident to i is selected in the solution, then (A2y)i = (A3y)i = 0 and (4) is satisfied only for zi = 0.
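As a sanity check (again not from the paper), the snippet below verifies on the toy two-square complex from the earlier sketch that selecting the left square as foreground yields w, y, z satisfying the relations above, assuming the identities take the forms given in (2)–(4).

```python
import numpy as np

# Toy complex from the earlier sketch: two unit squares sharing edge 5.
edges = [(0, 1), (1, 2), (3, 4), (4, 5), (0, 3), (1, 4), (2, 5)]
faces = [([2, 5, 0, 4], [+1, -1, -1, +1]), ([3, 6, 1, 5], [+1, -1, -1, +1])]
C1 = np.zeros((7, 2), dtype=int)
for j, (es, signs) in enumerate(faces):
    C1[es, j] = signs
C2 = np.abs(C1)
A2 = np.zeros((6, 7), dtype=int)
for j, (u, v) in enumerate(edges):
    A2[[u, v], j] = 1
A3 = A2 / A2.sum(axis=1, keepdims=True)

x = np.array([1, 0])                     # select the left square as foreground
w = np.abs(C1 @ x)                       # boundary edges via identity (2)
y = ((C2 @ x) >= 1).astype(int)          # edges incident to a foreground face
z = ((A2 @ y) >= 1).astype(int)          # vertices incident to a foreground edge

assert np.array_equal(2 * y - w, C2 @ x)              # identity (3)
assert np.all(A3 @ y <= z) and np.all(z <= A2 @ y)    # inequalities (4)
```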
The expressions introduced above allow us to identify whether a user-provided seed falls “inside” or “outside” the contour completion given by w, and they will serve as constraints for our multiple contour completion model. Fig. 4 shows an illustrative example for an image, where the input to the contour completion is the set of edgelets (or edgels) obtained from the boundaries of globalPb-derived superpixels.
Figure 4:
A superpixel-based segmentation with the foreground subgraph consistent under Condition 1. Selected faces are shaded, foreground edges are bold, and foreground vertices are highlighted in yellow. Internal edges (yi = 1, wi = 0) are bold/black; boundary edges (yi = wi = 1) are red.
Euler Characteristic.
Our final requirement is to be able to specify the number of closed contours desired. The existing literature on region-based image segmentation provides some ideas on how this can be accomplished for random field based models – in the form of so-called connectedness constraints. TopologyCuts is an extension of graph-cuts and utilizes certain level-set ideas to preserve topology [41]. DijkstraGC [38] finds a segmentation where two manually indicated seed points are connected via the foreground, whereas [28] makes use of an LP relaxation. Very recently, [8] proposed selectively perturbing the energy function to ensure topological properties. Here, we show how a much simpler form can capture the desired topological properties, as described next.
For any graph we can define the Euler characteristic as
χ = |V| − |E| + |F|,   (5)
where χ = 2 for any planar embedding of a connected graph. If we explicitly constrain the Euler characteristic of the subgraph induced by the selected foreground to be exactly two, we obtain a foreground region that is connected and simple in a geometric sense. For multiple connected regions, we can use the generalized form of this formula for arbitrary planar graphs:
|V| − |E| + |F| = 1 + n,   (6)
where n is the number of connected components.
Lemma 3.1. Let x, y, z denote indicator vectors for the selection of faces, edges, and vertices for planar graph G. The selected subgraph will satisfy (6) if
1Tz − 1Ty + 1Tx = n.   (7)
Proof. (Sketch) The left-hand side of this formula counts each quantity relevant to the Euler characteristic of the selected subgraph, but it neglects to count the “outside” face. Subtracting one from the right-hand side of (6) to account for this yields the equality. □
The count will not include the extra outside faces corresponding to any “holes”. This was not a problem in our experiments, but it can be explicitly avoided by requiring that the background be connected using the spanning tree constraints of [33]. Using (7) as a constraint in our model will guarantee that we recover n simply connected foreground regions.
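The following small example (not from the paper) illustrates the count in (7) on a 1×3 strip of faces: the quantity 1Tz − 1Ty + 1Tx equals the number of simply connected foreground components for each selection, under the same Condition 1 labeling of edges and vertices.

```python
import numpy as np

# A 1x3 strip of square faces; vertices 0-3 on the top row, 4-7 on the bottom.
edges = [(0, 1), (1, 2), (2, 3),           # top edges    e0 e1 e2
         (4, 5), (5, 6), (6, 7),           # bottom edges e3 e4 e5
         (0, 4), (1, 5), (2, 6), (3, 7)]   # vertical     e6 e7 e8 e9
face_edges = [[0, 3, 6, 7], [1, 4, 7, 8], [2, 5, 8, 9]]

C2 = np.zeros((len(edges), 3), dtype=int)
for j, es in enumerate(face_edges):
    C2[es, j] = 1
A2 = np.zeros((8, len(edges)), dtype=int)
for j, (u, v) in enumerate(edges):
    A2[[u, v], j] = 1

def euler_count(x):
    """1^T z - 1^T y + 1^T x for the foreground induced by the face selection x."""
    y = ((C2 @ x) >= 1).astype(int)   # edges touching a foreground face
    z = ((A2 @ y) >= 1).astype(int)   # vertices touching a foreground edge
    return int(z.sum() - y.sum() + x.sum())

print(euler_count(np.array([1, 0, 0])))   # one component  -> 1
print(euler_count(np.array([1, 0, 1])))   # two components -> 2
print(euler_count(np.array([1, 1, 1])))   # one component  -> 1
```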
3.1. Optimization Model
Before we introduce the contour completion model, we briefly describe the procedure for deriving the components of the graph from an image. This process follows existing algorithms for contour and boundary detection. First, we run the globalPb detector on the image, which provides the probability of boundary for each pixel. Next, we generate a set of superpixels from the image using the globalPb output in conjunction with TurboPixels (which uses local information and compactness). Each superpixel corresponds to a face, and the boundary of the superpixel corresponds to edges in the graph (these are the basic primitives of the closed contours we will derive). Where two or more edges meet, we introduce a node in the graph. With this construction, the problem of finding multiple contour closures reduces to finding multiple cycles in the graph. To select the cycles corresponding to the strongest contours, we must weight the edges appropriately. For this purpose, we calculate two types of weight measures following [21]. The first, denoted by N, measures the “goodness” of edges: the better edge i is, the smaller Ni will be. The second, denoted by D, counts the pixels on the superpixel boundary. We use an objective function which is the ratio of these quantities, NTw/DTw. This quantity is the fraction of the contour (w.r.t. arc length) which does not lie on a true image edge. Minimizing it has been shown to provide a contour with strong edge support in the image.
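One plausible reading of these weights (a sketch, not the paper’s implementation) is given below: N[i] counts the pixels of edgelet i that lack boundary support in the pb map and D[i] is the edgelet’s length in pixels, so NTw/DTw is the unsupported fraction of the selected contour. The threshold pb_thresh and the edgelet representation are assumptions.

```python
import numpy as np

def edgelet_weights(pb, edgelets, pb_thresh=0.3):
    """Per-edgelet weights, assuming each edgelet is an (M_i, 2) integer array
    of (row, col) pixel coordinates along a superpixel boundary fragment.
    N[i]: number of edgelet pixels with weak boundary support ("badness").
    D[i]: total number of pixels on the edgelet (its arc length in pixels)."""
    N = np.empty(len(edgelets))
    D = np.empty(len(edgelets))
    for i, px in enumerate(edgelets):
        on_edge = pb[px[:, 0], px[:, 1]] >= pb_thresh   # pixels on a detected boundary
        D[i] = len(px)
        N[i] = D[i] - on_edge.sum()
    return N, D
```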
Finally, the user indications are represented in terms of indicator vectors x0 and x1, where x0;i = 1 if face i contains a background seed and x1;i = 1 if face i contains a foreground seed. With these basic components (or constraints) in hand, we can state the main optimization model.
minimize   NTw / DTw   over x ∈ {0,1}|F|, y ∈ {0,1}|E|, z ∈ {0,1}|V|, w ∈ {0,1}|E|   (8a)
subject to   w = |C1x|   (boundary edges)   (8b)
2y − w = C2x   (edge consistency, Condition 1)   (8c)
A3y ≤ z ≤ A2y   (vertex consistency, Condition 1)   (8d)
1Tz − 1Ty + 1Tx = n   (Euler characteristic / topology)   (8e)
xi = 1 for faces containing foreground seeds, xi = 0 for faces containing background seeds   (seed constraints)   (8f)
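A sketch of the fixed-t subproblem behind (8) is shown below using the open-source PuLP modeler (the paper reports using CPLEX); the constraint forms follow (8b)–(8f) as written above, and the helper name and argument layout are our own.

```python
import numpy as np
import pulp

def solve_fixed_t(t, N, D, C1, C2, A2, A3, fg_seeds, bg_seeds, n_contours):
    """Sketch: minimize (N - t*D)^T w under the consistency, topology, and
    seed constraints of model (8), for one fixed value of t."""
    nE, nF = C1.shape
    nV = A2.shape[0]
    prob = pulp.LpProblem("contour_completion", pulp.LpMinimize)
    x = [pulp.LpVariable(f"x{i}", cat="Binary") for i in range(nF)]
    y = [pulp.LpVariable(f"y{i}", cat="Binary") for i in range(nE)]
    z = [pulp.LpVariable(f"z{i}", cat="Binary") for i in range(nV)]
    w = [pulp.LpVariable(f"w{i}", cat="Binary") for i in range(nE)]

    prob += pulp.lpSum((N[i] - t * D[i]) * w[i] for i in range(nE))   # objective

    for i in range(nE):
        Cx = pulp.lpSum(int(C1[i, j]) * x[j] for j in range(nF))
        prob += Cx <= w[i]                     # (8b), linearized as w >= |C1 x|
        prob += -Cx <= w[i]
        prob += 2 * y[i] - w[i] == pulp.lpSum(int(C2[i, j]) * x[j] for j in range(nF))  # (8c)
    for i in range(nV):
        prob += pulp.lpSum(float(A3[i, j]) * y[j] for j in range(nE)) <= z[i]           # (8d)
        prob += z[i] <= pulp.lpSum(int(A2[i, j]) * y[j] for j in range(nE))
    prob += pulp.lpSum(z) - pulp.lpSum(y) + pulp.lpSum(x) == n_contours                 # (8e)
    for i in fg_seeds:
        prob += x[i] == 1                      # (8f) foreground seed faces included
    for i in bg_seeds:
        prob += x[i] == 0                      # (8f) background seed faces excluded

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return (pulp.value(prob.objective),
            np.array([pulp.value(v) for v in x]),
            np.array([pulp.value(v) for v in w]))
```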
3.2. Optimizing Ratio Objective
Since the objective in (8) is in ratio form, we transform it into a linear function with a free variable t. The ratio objective is minimized by repeatedly minimizing f(t, u) = (N − tD)Tu over admissible u for a sequence of chosen values of t. Here, u denotes the concatenated vector of all indicator variables in the model. Assume D ≥ 0 and DTu ≠ 0. For an initial finite bounding interval [tl, tu], let t0 be the initial value and let f*(t0) denote the optimal value of the subproblem at t0. The procedure proceeds as follows: if f*(t0) = 0, stop with solution t0; if f*(t0) < 0, set tu ← t0; otherwise set tl ← t0; then update t0 ← (tl + tu)/2 and repeat.
Each iteration is easily solved in a few seconds using the CPLEX IP solver on a standard workstation.
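The bisection described above can be illustrated end-to-end on toy data (not from the paper), with a brute-force enumeration standing in for the ILP solve; the stopping tolerance and the data are arbitrary.

```python
import itertools
import numpy as np

def minimize_ratio(N, D, admissible, tol=1e-6):
    """Bisection on t for min_u N^T u / D^T u over a finite admissible set,
    using the fact that f*(t) = min_u (N - t*D)^T u is non-increasing in t
    and equals zero exactly at the optimal ratio t*."""
    def inner(t):                               # stand-in for the ILP solve
        vals = [((N - t * D) @ u, u) for u in admissible]
        return min(vals, key=lambda p: p[0])

    t_lo, t_hi = 0.0, max(N @ u / (D @ u) for u in admissible)
    best_u = None
    while t_hi - t_lo > tol:
        t0 = 0.5 * (t_lo + t_hi)
        f_star, u = inner(t0)
        if f_star <= 0:        # some u attains ratio <= t0: optimum is at or below t0
            t_hi, best_u = t0, u
        else:                  # every u has ratio > t0: optimum is above t0
            t_lo = t0
    return t_hi, best_u

# Toy data: 4 "edges"; the admissible set is every nonempty binary selection.
N = np.array([3.0, 1.0, 2.0, 5.0])
D = np.array([4.0, 2.0, 8.0, 5.0])
subsets = [np.array(u) for u in itertools.product([0, 1], repeat=4) if any(u)]
print(minimize_ratio(N, D, subsets))   # ratio 0.25, achieved by selecting edge 2 only
```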
4. Beyond Superpixel-derived Edgelets
Recall that the model in Section 3.1 constructs a cell complex using a superpixel decomposition of the image domain. While fast algorithms for finding this decomposition are available [22], it is known that superpixels are not robust for all types of images. Occlusion or weak boundaries give cases where the set of superpixel boundary primitives (the input to our optimization) does not include some valid edgelets (ones which have not been picked up by either the contour detector or the superpixel method). The natural solution is to supplement the basic set of edgelet primitives with additional contour pieces that bridge the ‘gaps’ and allow a more accurate contour closure even in the presence of very weak signal variations. Next, we present such an extension to find completions using a base set of disconnected edgelets. However, introducing completions between all pairs of edgelets is prohibitive and leads to a problem with a large number of variables (especially for multiple contours). The following model, while applicable to the multiple contour setting, is most effective for finding a single contour which encloses a simply connected foreground region.
Euler Spirals.
A key subcomponent of this problem is how to join two edgelets which follow each other on the contour. This is the problem solved by [19], which proposes to use segments of the Euler spiral. This spiral can be shown to be the curve C with minimal total squared curvature TC² = ∫C k(s)² ds, where k(s) is the curvature at a given point on the curve parameterized by arc length. For any pair of points along with tangents, we can construct a segment of an Euler spiral which connects these points with consistent tangents. They show that these completions satisfy the conditions given by [17] for a “pleasing” curve (invariance to similarity transformations, symmetry, extensibility, smoothness, roundness).
We parameterize the spiral by the turning angle θ as in [39]. To form a completion, we consider the Euler spiral under a similarity transformation determined by the position and Frenet frame (P0, T0, N0) at the spiral’s inflection point, and a scaling factor a. The transformed spiral is P(θ) = P0 + a [ C(√(2θ/π)) T0 + S(√(2θ/π)) N0 ],
where S and C are the Fresnel integrals. A choice of interval [θ1, θ2] selects a given segment. [39] gives a set of equations to determine these free variables, given segment endpoints P1, P2 and their tangents T1, T2. We solve these equations using a modified Newton’s method. The most expensive step, the computation of the Fresnel integrals, is sped up considerably using [12], augmented with pre-computed tables. We can compute an average completion in 30μs, versus 1ms for [19] on the same machine, making it an attractive option for quickly calculating a large number of completions within the core contour completion engine.
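For illustration (not the implementation from [39] or [19]), the snippet below evaluates points on a scaled clothoid segment with scipy’s Fresnel integrals; the formula assumes scipy’s sin/cos(πt²/2) convention, and the values of a, θ1, θ2 are arbitrary stand-ins for the quantities obtained from the G1 interpolation equations.

```python
import numpy as np
from scipy.special import fresnel

def clothoid_point(theta, P0, T0, N0, a):
    """Point on a scaled Euler spiral at turning angle theta >= 0, expressed in
    the Frenet frame (T0, N0) at the inflection point P0.  With scipy's
    convention S(u), C(u) = integrals of sin, cos(pi*t^2/2), the turning angle
    after unit-spiral parameter u is pi*u^2/2, hence u = sqrt(2*theta/pi)."""
    u = np.sqrt(2.0 * theta / np.pi)
    S, C = fresnel(u)                       # scipy returns (S(u), C(u))
    return np.asarray(P0) + a * (C * np.asarray(T0) + S * np.asarray(N0))

# Sample a completion segment between two turning angles theta1 < theta2.
P0, T0, N0, a = np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0]), 5.0
segment = np.array([clothoid_point(th, P0, T0, N0, a)
                    for th in np.linspace(0.2, 1.0, 20)])
```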
Euler-Spirals for One Contour Completion.
We are given a set of image edgelets derived from an edge detector as before, as well as user-provided foreground and background seeds. The core object considered by the algorithm is an alternating path p which consists of a sequence of edgelets joined by Euler spiral segments. The goal is to find a closed contour that minimizes an objective function which increases with the addition of each contour segment.
Our solution strategy is to iteratively build upon the current partial path until we get a cycle that encloses a feasible region. To do this, we adopt a specialized branch and bound procedure. Here, each node v of the branch-and-bound tree corresponds to some alternating path p. If p is a cycle, then v is a leaf node and thus a candidate solution; in this case, p is checked for feasibility w.r.t. the seed constraints. If p is not a cycle, we construct the children of this node by considering each image edgelet in sequence and calculating the Euler completion on the fly. The path for a child is then p with the current completion and edgelet appended to the end. Children are discarded if they give rise to a self-intersecting partial path; therefore, entire subtrees can be discarded directly. Any partial path with objective worse than the best candidate solution found so far may be ignored. Otherwise, we descend the tree to each child in turn, ordered by the cost of their partial contours.
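An abstract sketch of this search is given below (not the paper’s implementation); the geometric tests and the completion cost are placeholder callbacks, and the closing completion back to the starting edgelet is assumed to be handled inside closes_cycle.

```python
def branch_and_bound(start, edgelets, completion_cost,
                     closes_cycle, self_intersects, satisfies_seeds):
    """Depth-first branch and bound over alternating paths.  Each node is a
    partial path (a list of edgelet ids); children append one more edgelet via
    an Euler-spiral completion.  All callbacks are placeholders for the
    geometric tests described in the text."""
    best_cost, best_path = float("inf"), None
    stack = [(0.0, [start])]                       # (cost so far, partial path)
    while stack:
        cost, path = stack.pop()
        if cost >= best_cost:                      # bound: cannot beat incumbent
            continue
        if closes_cycle(path):                     # leaf: candidate solution
            if satisfies_seeds(path):
                best_cost, best_path = cost, path
            continue
        children = []
        for e in edgelets:                         # branch on the next edgelet
            if e in path:
                continue
            child = path + [e]
            if self_intersects(child):             # prunes the whole subtree
                continue
            children.append((cost + completion_cost(path[-1], e), child))
        # Push the costliest child first so the cheapest is explored next.
        for c in sorted(children, key=lambda p: p[0], reverse=True):
            stack.append(c)
    return best_path, best_cost
```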
This algorithm implicitly solves a model of the form in (8), with a linear objective function on w and smoothness constraints on the solution contour. We can construct a planar graph for this model using the extensibility property of Euler spirals and splitting any two intersecting segments.
5. Experiments
We first provide evaluations of the model from Section 3 on images from the Weizmann Horse Database (WHD) [5], the Weizmann Segmentation Database (WSD) [1], and the Berkeley Segmentation Data Set (BSDS500) [3]. We then turn to experiments with a robot user on the ISEG dataset. These experiments show that combining interaction with a contour-based method can achieve high accuracy with minimal user effort.
We compare our approach (which we refer to as EulerSeg) with three other contour grouping methods: (i) Ratio Region Cut (RRC) from [34], (ii) Superpixel Closure (SC) from [21], and (iii) an adaptive grouping method (EJ) [11]. We note that these are unsupervised whereas our algorithm incorporates user interaction; SC and EJ produce multiple segmentations, of which we select the most favorable. We compute the F-measure based on region overlap and report quantitative results in Fig. 9.
Figure 9:
F-measure scores on datasets described in Section 5.
The cell complex is generated from superpixels via [22], using the same number of superpixels as SC in all our experiments. We typically indicate 1–2 interior seeds for the sought objects, but in the presence of ≥ 2 objects, we may need 3–7 points including both interior and exterior seeds. The indicated seeds are shown in the images: green marks are foreground and red marks are background.
RRC was run using the default parameters λ = 0, α = 1. That method has an additional parameter to indicate an arbitrary number of objects; however, it frequently fails to find a second boundary even when the image includes 2 objects. For SC, we use their reported best parameters, with the number of superpixels set to 200 and Te = 0.05. That algorithm generates K = 10 possible solutions; here we report results for the best one.
WHD Results:
WHD consists of 328 side-view images of horses, with exactly one horse in each image. Fig. 6 shows that both RRC and SC select large regions of ground between the horses’ legs due to their large-region bias. As the examples show, our objective function minimizes gaps in the closure and leverages user seeds to handle slender objects better, outperforming both methods with ≤ 5 seeds.
Figure 6:
Sample results from WHD. Best viewed in color.
WSD Results:
WSD contains 200 images and is divided into 2 subsets of images with one or two foreground objects, respectively. As shown in Fig. 7, our algorithm is comparable to RRC and SC when there is one object and only one seed. However, when the image contains 2 objects, our Euler characteristic constraint kicks in and we correctly segment both objects of interest, while RRC and SC either select one of the objects or segment one large region which includes both.
Figure 7:
Sample results from WSD. Best viewed in color.
BSDS500 Results:
Compared with WSD and WHD, images in this dataset are more complicated. We note that in some images of BSDS500, there are no salient objects or closed contours (e.g., images of sky or streets). In these cases our algorithm cannot find a meaningful closed contour, but where one is present our model performs at least as well as any of the compared methods. Another challenging class of images in BSDS500 consists of those that depict a large number of foreground objects; here our algorithm significantly improves upon previous results with a small amount of user guidance and the topological constraint. An example of this can be seen in the bottom row of Fig. 8, where RRC and SC fail whereas our method is able to find the correct solution easily.
Figure 8:
Sample results from BSDS500. Best viewed in color.
ISEG Results:
We compare our algorithm with state-of-the-art interactive segmentation methods on the ISEG dataset [15]. These include Boykov & Jolly (BJ) with no shape constraints [6], the shortest paths method (SP) [4], Random Walker (RW) [13], and the Geodesic Star Convexity sequential system (GSCseq) [15]. We measure the effects of user interactions using a robot user setting. All algorithms are run with the default settings using the robot engine from [15]. The question we ask is how much user interaction is required to reach a region F-measure score of 0.95 on the ISEG dataset (restricted to cases where all algorithms can achieve F = 0.95 within 20 strokes). Table 1 shows that EulerSeg requires the fewest strokes to reach a reasonable segmentation. On the other hand, ISEG already provides a good initialization, which benefits the other methods in building up an appearance model, so the extra effort they need for a good segmentation is reduced. It is important to note that seeds in EulerSeg play a purely geometric role and enable segmentation with fewer stroke pixels. When starting with no initialization (which we refer to as EulerSeg-0), EulerSeg is still able to segment the object(s) within 5–10 strokes. These results are shown in Fig. 10.
Table 1:
Average interaction effort (number of strokes) required to reach F = 0.95
Method | BJ | RW | SP | GSCseq | EulerSeg
---|---|---|---|---|---
Avg. Effort | 5.51 | 6.48 | 4.54 | 2.30 | 2.06
Figure 10:
Sample results from ISEG. Red strokes are background seeds while green strokes are foreground seeds. Strokes for columns 3–7 use the default setting in the robot engine [15] with a brush radius of 8 pixels, while strokes fed to EulerSeg-0 are simple point seeds with a radius of one pixel. We mark seeds for EulerSeg-0 as crosses for visibility. Best viewed in color.
Running Time
The preprocessing to generate superpixels is the primary computational cost, and it is the only resolution-dependent component of our method. The total number of variables in our ILP is typically about 2000 (with residuals); on a 3GHz i7 CPU, each iteration of the linear ratio objective solver takes < 1s. Given superpixels, our implementation produces a segmentation usually within 15 iterations, though for some exceptionally textured images or those with a large number of components our algorithm may take more than 1 minute to solve.
6. Discussion
We present a framework based on discrete calculus which unifies the contour completion and segmentation settings. This is augmented with an Euler characteristic constraint which allows us to specify the topology of the segmented foreground. Our model easily accommodates user indications and multiple foreground regions. Two solvers specialized toward different aspects of the problem are derived: one based on an ILP over superpixels and the other a branch-and-bound using spiral completions to join edgelets. We demonstrate that our model finds salient contours across a large dataset, showing significant improvement over similar methods.
Supplementary Material
Figure 5:
Branch-and-bound result on a BSD image.
Acknowledgments:
This work is funded via grants NIH R01 AG040396 and NSF RI 1116584. Partial support was provided by UW-ICTR and Wisconsin ADRC. Collins was supported by a CIBM fellowship (NLM 5T15LM007359).
References
- [1] Alpert S, Galun M, Basri R, and Brandt A. Image segmentation by probabilistic bottom-up aggregation and cue integration. In CVPR, 2007.
- [2] Arbelaez P, Maire M, Fowlkes C, and Malik J. From contours to regions: An empirical evaluation. In CVPR, 2009.
- [3] Arbelaez P, Maire M, Fowlkes C, and Malik J. Contour detection and hierarchical image segmentation. PAMI, 33(5):898–916, 2011.
- [4] Bai X and Sapiro G. Geodesic matting: A framework for fast interactive image and video segmentation and matting. IJCV, 82(2):113–132, 2009.
- [5] Borenstein E and Ullman S. Class-specific, top-down segmentation. In ECCV, 2002.
- [6] Boykov Y and Jolly M. Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images. In ICCV, 2001.
- [7] Chauve A-L, Labatut P, and Pons J-P. Robust piecewise-planar 3d reconstruction and completion from large-scale unstructured point data. In CVPR, 2010.
- [8] Chen C, Freedman D, and Lampert C. Enforcing topological constraints in random field image segmentation. In CVPR, 2011.
- [9] Dollar P, Tu Z, and Belongie S. Supervised learning of edges and object boundaries. In CVPR, 2006.
- [10] El-Zehiry NY and Grady L. Fast global optimization of curvature. In CVPR, 2010.
- [11] Estrada FJ and Jepson AD. Robust boundary detection with adaptive grouping. In POCV, 2006.
- [12] Fleckner OL. A method for the computation of the Fresnel integrals and related functions. Mathematics of Computation, 22(103):635–640, 1968.
- [13] Grady L. Random walks for image segmentation. PAMI, 28(11):1768–1783, 2006.
- [14] Grady L and Polimeni JR. Discrete Calculus: Applied Analysis on Graphs for Computational Science. Springer, 2010.
- [15] Gulshan V, Rother C, Criminisi A, Blake A, and Zisserman A. Geodesic star convexity for interactive image segmentation. In CVPR, 2010.
- [16] Hariharan B, Arbeláez P, Bourdev L, Maji S, and Malik J. Semantic contours from inverse detectors. In ICCV, 2011.
- [17] Horn B. The curve of least energy. ACM Trans. Math. Soft., 9(4):441–460, 1983.
- [18] Jermyn I and Ishikawa H. Globally optimal regions and boundaries as minimum ratio weight cycles. PAMI, 23(10):1075–1088, 2001.
- [19] Kimia BB, Frankel I, and Popescu A-M. Euler spiral for shape completion. IJCV, 54(1-3):159–182, 2003.
- [20] Kovalevsky VA. Finite topology as applied to image analysis. Computer Vision, Graphics, and Image Processing, 46(2):141–161, 1989.
- [21] Levinshtein A, Sminchisescu C, and Dickinson S. Optimal contour closure by superpixel grouping. In ECCV, 2010.
- [22] Levinshtein A, Stere A, Kutulakos K, Fleet D, Dickinson S, and Siddiqi K. Turbopixels: Fast superpixels using geometric flows. PAMI, 31(12):2290–2297, 2009.
- [23] Lu C, Latecki L, Adluru N, Yang X, and Ling H. Shape guided contour grouping with particle filters. In ICCV, 2009.
- [24] Mairal J, Leordeanu M, Bach F, Hebert M, et al. Discriminative sparse image models for class-specific edge detection and image interpretation. In ECCV, 2008.
- [25] Maire M, Arbeláez P, Fowlkes C, and Malik J. Using contours to detect and localize junctions in natural images. In CVPR, 2008.
- [26] Maji S, Vishnoi N, and Malik J. Biased normalized cuts. In CVPR, 2011.
- [27] Martin DR, Fowlkes C, and Malik J. Learning to detect natural image boundaries using local brightness, color, and texture cues. PAMI, 26(5):530–549, 2004.
- [28] Nowozin S and Lampert C. Global interactions in random field models: A potential function ensuring connectedness. SIAM J. Imag. Sci., 3(4):1048–1074, 2010.
- [29] Parent P and Zucker S. Trace inference, curvature consistency, and curve detection. PAMI, 11(8):823–839, 1989.
- [30] Ren X, Fowlkes C, and Malik J. Scale-invariant contour completion using conditional random fields. In ICCV, 2005.
- [31] Schoenemann T and Cremers D. Introducing curvature into globally optimal image segmentation: Minimum ratio cycles on product graphs. In ICCV, 2007.
- [32] Shotton J, Blake A, and Cipolla R. Contour-based learning for object detection. In ICCV, 2005.
- [33] Singh M and Lau LC. Approximating minimum bounded degree spanning trees to within one of optimal. In STOC, 2007.
- [34] Stahl JS and Wang S. Edge grouping combining boundary and region information. TIP, 16(10):2590–2606, 2007.
- [35] Stahl JS and Wang S. Globally optimal grouping for symmetric closed boundaries by combining boundary and region information. PAMI, 30(3):395–411, 2008.
- [36] Tu Z, Chen X, Yuille A, and Zhu S. Image parsing: Unifying segmentation, detection, and recognition. IJCV, 63(2):113–140, 2005.
- [37] Ullman S and Shaashua A. Structural saliency: The detection of globally salient structures using a locally connected network. Technical report, MIT, 1988.
- [38] Vicente S, Kolmogorov V, and Rother C. Graph cut based image segmentation with connectivity priors. In CVPR, 2008.
- [39] Walton DJ and Meek DS. G1 interpolation with a single Cornu spiral segment. Journal of Computational and Applied Mathematics, 223(1):86–96, 2009.
- [40] Wang S, Kubota T, Siskind JM, and Wang J. Salient closed boundary extraction with ratio contour. PAMI, 27(4):546–561, 2005.
- [41] Zeng Y, Samaras D, Chen W, et al. Topology cuts: A novel min-cut/max-flow algorithm for topology preserving segmentation. CVIU, 112:81–90, 2008.