Abstract
A new global stereo matching method is presented that focuses on the handling of disparity discontinuities and occlusions. A Bayesian approach is used in which dense stereo matching is formulated as a maximum a posteriori Markov Random Field (MAP-MRF) problem. To improve stereo matching performance, edges are incorporated into the Bayesian model as a soft constraint. Accelerated belief propagation is applied to obtain the maximum a posteriori estimates in the Markov random field. The proposed algorithm is evaluated on the Middlebury stereo benchmark. Experimental comparisons with several state-of-the-art stereo matching methods demonstrate that the proposed method provides more accurate disparity maps at subpixel precision.
Keywords: MAP-MRF, local affine model, edge classification, belief propagation
1. Introduction
Binocular stereo vision infers 3D scene geometry from a pair of images taken from different viewpoints. Dense stereo matching is a key computational step in 3D reconstruction and has been one of the most active research topics in stereo vision. Given a pair of images, the aim of stereo matching is to determine the disparity values of the pixels in the image selected as the reference view. However, dense stereo matching is difficult for the following reasons: (1) light variations, optical blurring, and sensor noise corrupt the input images; (2) a lack of texture in the input images makes intensity consistency constraints ineffective; (3) disparity discontinuities, caused by sudden disparity changes, often appear at object boundaries and are difficult to detect, and the smoothness constraint may break down at these boundaries; and (4) some portions of objects are visible to one camera but not to the other (occlusion), which makes it difficult to compute reasonable disparity values in the affected regions.
In general, depending on whether the matching method relies on local window-based computation or the minimization of a global energy function, dense stereo matching can be broadly classified into two categories: local methods and global methods. In local methods, the disparity computed at a given point depends only on intensity values within a finite window. Many effective local algorithms for stereo matching have been reported [1–3]. However, when local structures in an image are similar, it may be very difficult to find their correspondences in the other image without global reasoning. Although a high accuracy is not guaranteed in local methods, they have advantages in computational speed and parallelism over global methods.
A stereo algorithm is a global method if it optimizes a global cost function that accounts for the correlation of disparities in a neighborhood. Hence, the key to a global algorithm is not only to define a good cost function but also to provide an effective computational method for global optimization. Among the various global methods, the Bayesian approach [4] treats the stereo matching problem as finding the “best guessed” solution. Depending on the computational model, this approach can be classified into two categories: dynamic programming-based and MRF-based.
The dynamic programming method [5] assumes that the occlusion cost is identical in each scanline. Ignoring the dependency between scanlines leads to the well-known “streaking” artifacts. In contrast, the MRF-based methods rely on energy minimization algorithms such as Graph Cut (GC) [6] and Belief Propagation (BP) [4]. Both algorithms can be implemented efficiently to compute the minimum for a MAP-MRF whose energy function is Potts or Generalized Potts [4]. Experiments have shown [7] that the solution produced by GC tends to be smoother than that of BP, whereas BP is significantly faster than GC, even reaching real-time rates [8], while achieving performance similar to that of GC.
Traditional global algorithms [4][6][9] usually give good results in most regions of an image. However, in the regions containing occlusions and disparity discontinuities, disparity errors often increase substantially. Hence, explicit smoothness assumptions have to be incorporated in these particular regions. Segment-based priors have been widely explored [4][10–13] due to their first-rank performance. The matching results greatly depend on whether the segmentation results are coincident in the locations where the occlusions and disparity discontinuities take place. But the consistency can hardly be achieved when tackling the highly-textured areas or the nonplanar surfaces with a uniform color. This problem can be solved by planar fitting after segmentation with impressive results [10][11]. However, the computational complexity increases.
Unlike the methods mentioned above, our algorithm emphasizes the edge information in the images. Assuming that discontinuities and occlusions occur on some edges, we explore the characteristics of normal edges and disparity discontinuities in the reference image of the input image pair. The edge information is incorporated into our MAP-MRF stereo matching model to obtain better stereo matching results. Another contribution of the proposed algorithm is that a local affine model is used to aggregate the matching costs. The aggregated matching costs are updated based on the current understanding of which pixels in the reference image are occluded.
This paper is organized as follows: Section 2 gives a mathematical model of the proposed algorithm. In Section 3, the framework of the proposed algorithm is described. Section 4 presents experimental results in which our algorithm and other algorithms are compared on images from the Middlebury dataset [14]. Finally, conclusions and future work are discussed in Section 5.
2. Related Works
In this section, we review the related models, the Maximum a posteriori Markov Random Field (MAP-MRF) model and the local affine model, which are used in the proposed algorithm.
2.1 MAP-MRF Model
The Bayesian approach provides a promising way to solve the stereo matching problem, namely, to find the best disparity map D given the observations IL and IR by maximizing the posterior probability P(D|IL, IR). By Bayes’ rule, P(D|IL, IR) ∝ P(IL, IR|D)P(D), where P(D) is called the prior. Accordingly, the disparity problem can be modeled as a pairwise Markov Random Field (MRF) that encodes the smoothness. Let N(p) be the neighborhood of pixel p, let pixel q belong to N(p), and let dp and dq be the disparities of p and q, respectively. According to the Markovian property and the Gibbs distribution, the basic Bayesian model can be expanded as:
$$P(D \mid I_L, I_R) \propto \prod_{p} \Phi(d_p) \prod_{p} \prod_{q \in N(p)} \psi(d_p, d_q) \tag{1}$$
where ψ(dp, dq) penalizes the different assignments of neighbor sites when no discontinuity exists between them. Φ(dp) is the matching cost function of pixel p with disparity dp. If (1) is written as a Gibbs distribution with energy function: E(D|IL, IR) = −log(P(D|IL, IR)), then, (1) has the familiar energy form:
$$E(D \mid I_L, I_R) = E_{data}(D) + E_{smooth}(D) \tag{2}$$
where Esmooth is defined to measure the disparity smoothness between neighboring segment pairs, and Edata measures the disagreement of segments and their matching regions based on the assumed disparity planes. Depending on the different solving methods, we can maximize the posterior probability P(D|IL, IR) or minimize the energy function E(D|IL, IR).
Sun et al [4] incorporated segmentation results into (1) as soft constraints under a probabilistic framework. They point out that low-level visual cues (e.g. edge, corner) can also be incorporated into the basic Bayesian model. Motivated by Sun’s method, we incorporate the edge classification result into (1) as a soft constraint. In this way, the probabilistic framework becomes:
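$$P(D \mid I_L, I_R) \propto \prod_{p} \Phi(d_p) \prod_{p} \prod_{q \in N(p)} \psi(d_p, d_q)\, \rho_{edge}(d_p, d_q) \tag{3}$$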
where ρedge(dp, dq) is the edge constraint between neighbor sites, which can be defined as:
| (4) |
where edge(p) is a label that indicates whether pixel p lies on a discontinuity, and λedge is a constant. In general, the larger λedge is, the more difficult it is to pass messages between neighboring sites.
2.2 Local Affine Model
An aggregation step is needed to aggregate each pixel’s matching cost over a weighted region in order to reduce the matching ambiguities and noise in the initial cost volume. To deal with disparity discontinuities and ambiguous regions (textureless areas, repetitive patterns, etc.), an ideal cost aggregation strategy should adapt its support at each position according to the image content so that only those points with the same (unknown) disparity are included. Unlike the traditional weight-based aggregation [10], which adjusts the color weight and the distance weight to calculate the support weights, the proposed method aggregates the cost with a local affine model, which has proven useful in image denoising [15] and image matting [16][17]. Assume that there are two small local matching regions (Ωp and Ωq) centered on pixel p and pixel q in the two rectified images L and R (obtained from the input pair IL and IR), respectively. An affine relation between the intensities of the local matching regions is described as:
$$R_{k'} = M_p L_k + B_p, \quad \forall k \in \Omega_p \tag{5}$$
where Mp and Bp are linear coefficients that are constant in Ωp. Assuming that the left image is the reference image, we can obtain the disparity information from Dk = Rk′ − Lk. Combining this with (5), Dk is given by
$$D_k = A_p L_k + B_p, \quad \forall k \in \Omega_p \tag{6}$$
with Ap = Mp − 1. Equation (6) gives the relationship between the disparity image D and the input image L in the local region Ωp. Additionally, the local affine model satisfies ∇Dk = Ap∇Lk, which ensures that the image D has an edge only if the image L also has an edge. In this way, disparity discontinuities are also parts of edges in the reference image.
In order to obtain Ap and Bp, we minimize the difference between the input disparity pk and the output disparity Dk using a least-squares estimator in a local window:
$$E(A_p, B_p) = \sum_{k \in \Omega_p} \left( (A_p L_k + B_p - p_k)^2 + \varepsilon A_p^2 \right) \tag{7}$$
where pk is a pixel in the window Ωp in an initial input disparity map, denoted by D̂. ε is a factor which prevents Ap from being too large. A guided filter (GF) has been utilized to solve (7) using a least-squares estimator [15]. The optimal Ap and Bp are given by:
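$$A_p = \frac{\frac{1}{|\Omega_p|} \sum_{k \in \Omega_p} L_k p_k - \mu_p \bar{p}_p}{\sigma_p^2 + \varepsilon}, \qquad B_p = \bar{p}_p - A_p \mu_p \tag{8}$$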
where μp and σp² are the mean and variance, respectively, of the input image L in window Ωp, |Ωp| is the number of pixels in Ωp, and p̄p is the mean of p in Ωp. Since a pixel k is covered by many windows Ωp centered on its neighbors, there are correspondingly many coefficient pairs (Ap, Bp) for the same pixel. Hence, the GF simply averages all possible values of Dk by
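$$D_k = \frac{1}{|\Omega_p|} \sum_{p:\, k \in \Omega_p} (A_p L_k + B_p) = \bar{A}_k L_k + \bar{B}_k \tag{9}$$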
Additionally, all the summations in (9) can be calculated by box filtering. Therefore, these terms can be computed in time proportional to the number of pixels N, i.e., in O(N) time.
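The box-filter formulation can be illustrated with a short sketch. It is a minimal gray-scale version of the guided filter of [15] written with NumPy, not the authors' implementation; the function names `box_mean` and `guided_filter`, the use of a summed-area table, and the truncation of windows at the image border are assumptions made for illustration.

```python
import numpy as np

def box_mean(img, r):
    """Mean over a (2r+1) x (2r+1) window using a summed-area table, O(N)."""
    h, w = img.shape
    # A leading zero row/column turns every window sum into four look-ups.
    sat = np.zeros((h + 1, w + 1), dtype=np.float64)
    sat[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    y0 = np.clip(np.arange(h) - r, 0, h)
    y1 = np.clip(np.arange(h) + r + 1, 0, h)
    x0 = np.clip(np.arange(w) - r, 0, w)
    x1 = np.clip(np.arange(w) + r + 1, 0, w)
    total = sat[y1][:, x1] - sat[y0][:, x1] - sat[y1][:, x0] + sat[y0][:, x0]
    # Windows are truncated at the border, so divide by the true window area.
    area = (y1 - y0)[:, None] * (x1 - x0)[None, :]
    return total / area

def guided_filter(guide, src, r, eps):
    """Fit src = A * guide + B in every window, then average A and B."""
    mean_g = box_mean(guide, r)
    mean_s = box_mean(src, r)
    cov_gs = box_mean(guide * src, r) - mean_g * mean_s
    var_g = box_mean(guide * guide, r) - mean_g ** 2
    a = cov_gs / (var_g + eps)       # per-window slope
    b = mean_s - a * mean_g          # per-window offset
    # Average the coefficients of all windows that cover each pixel.
    return box_mean(a, r) * guide + box_mean(b, r)
```

With the settings reported in Section 4.1 (a 15 × 15 window and ε = 0.0001), a call would look like `guided_filter(L, cost_slice, r=7, eps=1e-4)`, applied to one disparity slice of the cost volume at a time.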
In the proposed algorithm, we expand (9) to disparity space image. In addition, we use a color image as the reference image in order to obtain a more reliable data term. Then, (9) is rewritten as:
| (10) |
where the boldface Li and Ak are both 3 × 1 vectors whose components correspond to the r, g and b color channels.
3. Proposed Algorithm
In order to ease description, we partition our algorithm into three modules: initialization, global optimization and enhancement.
In the initialization module, the initial matching cost is first computed. Then, the reference image and the local affine model are used to obtain the aggregated matching costs. This process is applied twice, because both the left and the right input images must serve as the reference image. The initial data term Φ(dp) is computed from the aggregated matching cost. Hence, this step generates the following outputs: the initial left disparity map DL, the initial right disparity map DR, the initial disparity space image, the aggregated disparity space image, and the data term Φ(dp). Another important output of the initialization module is the set of edge classification labels, from which ρedge is incorporated as the constraint in the basic Bayesian model. In the global optimization module, accelerated belief propagation is employed for an iterative optimization that trades off the data term against the prior term. The prior is decreased at discontinuities using the edge constraint, since the discontinuities are likely to coincide with edges in the reference image. Finally, the goal of the enhancement module is to improve the disparity result to subpixel precision.
3.1 Initialization
Step1: Matching cost computation
Typically, a matching cost is computed at each pixel for all disparities under consideration, which constructs the disparity space image. Recently, Klaus et al. proposed a simple self-adapting dissimilarity measure that linearly combines SAD and a gradient term for computing the matching cost [11]. Considering both the good performance and the simplicity of this method, we use this measure to construct the disparity space image.
Assume that the left image is the reference image, CCSAD is the absolute color difference and CCGRAD is the color gradient difference. The matching cost C(x, y, d) can be defined by
| (11) |
where CCSAD and CCGRAD are given by:
| (12) |
where C(x, y, d) is the matching cost of pixel (x, y) in the disparity d, α balances the color and gradient terms, ∇x and ∇y represent the horizontal and vertical gradients, respectively, and τc and τg are truncation values. In order to suppress noise, a small supporting window Ω(x, y) centered at (x, y) is used to calculate the matching cost. After this step, the initial disparity space image is constructed by C(x,y,d) for all the pixels in the reference image.
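As a rough illustration of this step, the sketch below builds a disparity space image from a truncated color difference and a truncated gradient difference. Several choices are assumptions and not taken from the paper: the weighting `(1 - alpha) * color + alpha * gradient`, gradients computed on the channel mean with `numpy.gradient`, the omission of the small 3 × 3 support window, and the name `cost_volume`. The default parameter values are those reported in Section 4.1.

```python
import numpy as np

def cost_volume(left, right, max_disp, alpha=0.92, tau_c=0.0028, tau_g=0.016):
    """Truncated color + gradient cost for the left reference view.

    left, right: float arrays of shape (H, W, 3) with values in [0, 1].
    Returns an array C of shape (H, W, max_disp + 1).
    """
    h, w, _ = left.shape
    gray_l, gray_r = left.mean(axis=2), right.mean(axis=2)
    gy_l, gx_l = np.gradient(gray_l)   # derivatives along rows, then columns
    gy_r, gx_r = np.gradient(gray_r)
    cost = np.empty((h, w, max_disp + 1))
    for d in range(max_disp + 1):
        # Pixel (x, y) in the left view is compared with (x - d, y) in the right.
        # Columns with x < d have no partner and reuse the first right column.
        xs = np.clip(np.arange(w) - d, 0, w - 1)
        color = np.abs(left - right[:, xs]).mean(axis=2)
        grad = np.abs(gx_l - gx_r[:, xs]) + np.abs(gy_l - gy_r[:, xs])
        # Assumed weighting: alpha favors the gradient term.
        cost[:, :, d] = ((1 - alpha) * np.minimum(color, tau_c)
                         + alpha * np.minimum(grad, tau_g))
    return cost
```

Truncating both terms limits the influence of outliers such as occluded pixels, which is why the thresholds τc and τg are small relative to the [0, 1] intensity range.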
Step2: Aggregation
To obtain more accurate results in both smooth and discontinuous regions, the local affine model is used to initialize a reliable matching cost. The local affine model can be expanded to the RGB space as well by rewriting (10) as:
| (13) |
Ā(x, y) and B̄(x, y) are given by:
| (14) |
In (13) and (14), I is a 3 × 1 vector, and μ and Λ are the mean and the 3 × 3 covariance matrix, respectively, of I in the local window Ω. A and Ā are 3 × 1 coefficient vectors, and U is a 3 × 3 identity matrix. Cnew(x, y, d) constructs the new disparity space image, which replaces the initial disparity space image.
In a similar manner, we compute the disparity space image for the right reference image. When two disparity space images are obtained, the Winner-Take-All (WTA) [1] strategy is applied to form two initial disparity maps DL and DR.
In a reliable disparity map, any stable pixel is expected to satisfy the mutual consistency condition, which requires the pixels on the pixel grids in the left and right disparity maps to be perfectly consistent with each other (i.e., having the same disparity value). This is performed by a subsequent mutual consistency check (often called left-right checking [1]) that divides all the pixels into stable or unstable pixels:
$$|D_L(x, y) - D_R(x - D_L(x, y), y)| \le T_{lf} \tag{15}$$
where Tlf is a threshold. We mark pixels in the left disparity map as occluded pixels if (15) does not hold true.
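A minimal sketch of the left-right check follows. It assumes that a left pixel (x, y) with disparity d corresponds to the right pixel (x − d, y), that disparities are rounded to the nearest integer for the look-up, and that pixels mapped outside the right image are treated as unstable; the function name `left_right_check` is hypothetical.

```python
import numpy as np

def left_right_check(disp_left, disp_right, t_lf=1):
    """Return a boolean mask that is True for stable (mutually consistent) pixels."""
    h, w = disp_left.shape
    xs = np.arange(w)[None, :]
    rows = np.arange(h)[:, None]
    # Column in the right view that each left pixel maps to.
    x_right = xs - np.rint(disp_left).astype(int)
    in_bounds = (x_right >= 0) & (x_right < w)
    x_right = np.clip(x_right, 0, w - 1)
    diff = np.abs(disp_left - disp_right[rows, x_right])
    return in_bounds & (diff <= t_lf)
```

Pixels where the mask is False correspond to the occluded (unstable) pixels described above.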
After the left-right checking step, the data term can be defined by the occluded pixels and stable pixels as:
| (16) |
The constant 0 in (16) reflects the fact that the occluded pixels need the most regularization.
Step3: Edge classification
This step is based on the assumption that an intensity edge in an image indicates a depth discontinuity in the scene. In our case, the edges in the reference image are detected by the Sobel edge detector [18]. If D̄k is the mean disparity value of the window Ωk centered at pixel k, and i is a pixel in the neighborhood of pixel k with disparity Di, then, the standard deviation (Std) is computed to estimate the edge variation in the disparity map DL:
$$\mathrm{Std}(k) = \sqrt{\frac{1}{|\Omega_k|} \sum_{i \in \Omega_k} (D_i - \bar{D}_k)^2} \tag{17}$$
Hence, if Std(k) is smaller than the threshold λ, the edge pixel is classified as a normal edge. Otherwise, it is classified as a disparity discontinuity. The edge classification is thus given by
| (18) |
Fig. 1 provides an example of edge classification using (17) and (18). It can be observed that most disparity discontinuities are successfully identified.
Figure 1.

(a) Left image in a pair of stereo images; (b) Edges detected by Sobel detector; (c) Edges corresponding to disparity discontinuities detected by Eq.(17) and Eq.(18).
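The edge classification step can be sketched as follows. The Sobel threshold `edge_thresh`, the window radius `r`, the label values (0 for no edge, 1 for a normal edge, 2 for a disparity discontinuity) and all function names are illustrative assumptions; only the rule that compares the local standard deviation of the disparity with λ comes from the text.

```python
import numpy as np

def sobel_magnitude(gray):
    """Gradient magnitude from 3 x 3 Sobel kernels with edge-replicated borders."""
    p = np.pad(gray, 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.hypot(gx, gy)

def local_std(disp, r):
    """Standard deviation of the disparity in a (2r+1) x (2r+1) window."""
    p = np.pad(disp.astype(np.float64), r, mode="edge")
    win = np.lib.stride_tricks.sliding_window_view(p, (2 * r + 1, 2 * r + 1))
    return win.std(axis=(-2, -1))

def classify_edges(gray, disp, edge_thresh, lam, r=1):
    """Label each pixel: 0 = no edge, 1 = normal edge, 2 = discontinuity."""
    edges = sobel_magnitude(gray) > edge_thresh
    std = local_std(disp, r)
    labels = np.zeros(gray.shape, dtype=np.uint8)
    labels[edges & (std < lam)] = 1
    labels[edges & (std >= lam)] = 2
    return labels
```

An intensity edge over a flat disparity region is thus kept as a normal edge, while the same edge over a disparity step is marked as a discontinuity.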
3.2 Global Optimization
The data term Φ(dp) is determined by updating the matching cost in Eq. (16), and the prior term ψ(dp, dq) in (3) is defined by the Potts model. We assume that two neighboring pixels p and q are likely to have the same disparity if their intensities satisfy I(p) ≈ I(q). This contextual information is incorporated into ψ(dp, dq), which is computed in the same fashion as in [7]:
| (19) |
where T is a gradient threshold, ΔI is the difference between Ip and Iq, s is a penalty term for violating the smoothness constraint, and P is a penalty term applied when the gradient has a small magnitude. Note that T, P and s are constant over the whole image. To use belief propagation, a matching cost C must be converted to a compatibility by calculating e^(−C). For numerical reasons (e.g., e^(−C) can be extremely small), we include a positive constant M and define the compatibility as e^(−C/M).
Belief Propagation (BP) is an iterative inference algorithm that propagates messages in a network. Among several high-performance algorithms, we adopted the max-product BP algorithm [4][7], which works by passing messages within a graph defined by an image with a four-connected pixel grid. Each message is a vector whose dimension is the number of possible labels. During an iteration, each node uses the messages received from its neighboring nodes in the previous iteration. A new message is then calculated and sent to its neighbors.
A message update scheme determines when a message is sent to a node, where it will be used to compute subsequent messages for the neighbors of that node. One such scheme is to propagate messages in one direction and update each node immediately. In this work, we use the “accelerated” BP updating scheme (see Fig. 2) proposed by Tappen et al. [7]. The “up-down-left-right” message passing scheme enables the BP algorithm to converge rapidly.
Figure 2.

Accelerated BP message updating scheme: (a) from right to left, (b) from left to right, (c) from down to up, (d) from up to down.
Let us consider the case where node p is located to the right of node q (Fig. 2(a)). The message that node p sends to q in the current iteration t contains node p’s belief about each possible state of node q. This message is computed from the messages that p has received from its neighbors (up, down, right) at iteration t − 1. Then, the new message (for the right side case) is updated as:
| (20) |
where the incoming messages are those received by p from its right, upper and lower neighbors at iteration t − 1. The updated left, above and below messages (Fig. 2(b)–(d)) are computed similarly to Eq. (20). After T iterations, the belief at node q can be computed by
| (21) |
Then, the best disparity dqMAP in pixel q is obtained from the maximum belief:
| (22) |
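The sweep-based message passing can be sketched as below. This is a simplified illustration, not the authors' implementation: it works with costs in the min-sum form (equivalent to max-product on the compatibilities e^(−C/M)), uses a plain Potts penalty with a constant `s`, and leaves out the intensity-dependent penalty of Eq. (19) and the edge constraint. The names `potts_message` and `accelerated_bp` are hypothetical.

```python
import numpy as np

def potts_message(h, s):
    """For every d_q: min over d_p of h(d_p) + s * [d_p != d_q]."""
    m = np.minimum(h, h.min(axis=-1, keepdims=True) + s)
    return m - m.min(axis=-1, keepdims=True)   # normalize to keep values bounded

def accelerated_bp(cost, s, n_iters=5):
    """Min-sum BP on a 4-connected grid with sequential directional sweeps."""
    cost = np.asarray(cost, dtype=np.float64)
    h_, w_, _ = cost.shape
    # from_l[y, x] is the message arriving at (y, x) from its left neighbor, etc.
    from_l = np.zeros_like(cost)
    from_r = np.zeros_like(cost)
    from_u = np.zeros_like(cost)
    from_d = np.zeros_like(cost)
    for _ in range(n_iters):
        for x in range(w_ - 1):                 # sweep left to right
            h = cost[:, x] + from_l[:, x] + from_u[:, x] + from_d[:, x]
            from_l[:, x + 1] = potts_message(h, s)
        for x in range(w_ - 1, 0, -1):          # sweep right to left
            h = cost[:, x] + from_r[:, x] + from_u[:, x] + from_d[:, x]
            from_r[:, x - 1] = potts_message(h, s)
        for y in range(h_ - 1):                 # sweep top to bottom
            h = cost[y] + from_u[y] + from_l[y] + from_r[y]
            from_u[y + 1] = potts_message(h, s)
        for y in range(h_ - 1, 0, -1):          # sweep bottom to top
            h = cost[y] + from_d[y] + from_l[y] + from_r[y]
            from_d[y - 1] = potts_message(h, s)
    # Belief in cost form; the best label minimizes it (maximizes the belief).
    belief = cost + from_l + from_r + from_u + from_d
    return belief.argmin(axis=2)
```

Because each sweep reuses the message it has just written for the previous pixel, information travels across the whole image within a single iteration, which is the reason the sequential schedule converges in few iterations.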
3.3 Enhancement
To reduce the errors caused by the quantization of the disparity, a method producing subpixel precision is proposed. Let dc be the disparity obtained after accelerated BP. The final disparity df is then given by:
| (23) |
Since a disparity map has been produced from the belief propagation module and the matching cost of the three candidates C(x, y, dc − 1), C(x, y, dc + 1) and C(x, y, dc) are obtained from the aggregated disparity space image in the aggregation step, the enhanced disparity map can be easily computed.
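One common way to use the three candidate costs is to fit a parabola through them and take its minimum. The sketch below shows that quadratic fit; whether it matches Eq. (23) term by term is an assumption, as are the clamping of the offset to ±0.5, the handling of border disparities, and the name `subpixel_refine`.

```python
import numpy as np

def subpixel_refine(cost, disp):
    """Refine integer disparities with a parabola through C(d-1), C(d), C(d+1).

    cost: aggregated cost volume of shape (H, W, n_disp).
    disp: integer disparity map of shape (H, W).
    """
    h, w, n = cost.shape
    d = np.clip(disp, 1, n - 2)            # both neighbors d-1 and d+1 must exist
    rows, cols = np.indices((h, w))
    c0 = cost[rows, cols, d]
    cm = cost[rows, cols, d - 1]
    cp = cost[rows, cols, d + 1]
    denom = cm + cp - 2.0 * c0
    # Refine only interior disparities where the parabola opens upward.
    ok = (disp == d) & (denom > 0)
    offset = np.zeros((h, w))
    offset[ok] = np.clip((cm[ok] - cp[ok]) / (2.0 * denom[ok]), -0.5, 0.5)
    return disp + offset
```

For a cost that is exactly quadratic around its minimum, this recovers the true minimum; for example, costs sampled from (d − 2.3)² at d = 1, 2, 3 yield a refined disparity of 2.3.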
Our algorithm is summarized in Fig. 3.
Figure 3.
Framework of the proposed method
4. Experimental Results
In this section, we evaluate the performance of our stereo matching algorithm on the Middlebury stereo benchmark dataset [14]. This dataset consists of four pairs of images: Tsukuba, Venus, Teddy and Cones. Our experimental study includes the following components: parameter setting, evaluation of results, and comparison with several state-of-the-art stereo matching methods [10][11][13][19][20].
4.1 Parameter Setting
We first describe the parameter settings used in our algorithm. Note that the parameters are kept constant across all datasets. In order to improve stability, we normalize the pixel intensities to [0, 1].
Cost computation
The computation of matching cost is parallel at each pixel and each disparity level. To obtain appropriate thresholds, we analyze the parameters statistically by experiments shown in Fig. 4. Then, the truncation values τc and τg are set, respectively, to 0.0028 and 0.016 in (12). The balance term α is set to 0.92. Ω is chosen to be a 3 × 3 local window.
Figure 4.

Intermediate results of our algorithm on four standard test images, compared to the ground truth. (a) Left images of the stereo image pairs; (b) edge detection results; (c) WTA disparity maps after the local affine model; (d) disparity maps after applying left-right checking; (e) disparity maps after BP propagation; (f) disparity maps after refinement; (g) ground truth.
Cost aggregation
We use Eq. (13) for the aggregation. The parameters are chosen empirically as follows: Ωk is a 15 × 15 window and ε is 0.0001.
Occlusion detection and edge constraint
Since there is a certain error in computation, we choose Tlf to be 1 in Eq. (15). After experiments, λ in Eq.(18) is set to 0.18 in order to distinguish the disparity discontinuity from normal edges.
Belief Propagation
Parameters s, T and P in Eq. (19) of the BP module are set to 0.000153, 0.06 and 8, respectively. The constant M of the global optimization module (Section 3.2) is set to 0.1961. The number of iterations is set to 5 based on experiments.
Fig. 4 shows the disparity maps obtained by our algorithm using the above parameters. The results after the different intermediate stages illustrate how each stage of the algorithm improves the result. The ground truth is also given for comparison.
4.2 Evaluation
The performance of stereo matching algorithms can be evaluated quantitatively. Two methods are often utilized based on the RMS (root-mean-squared) error and the percentage of bad matching pixels with respect to the ground truth data included in the Middlebury datasets [1]. In this work, the second method is utilized. First, we compute B which reflects the percentage of bad matching pixels:
$$B = \frac{1}{N} \sum_{(x, y)} \big[\, |d_C(x, y) - d_T(x, y)| > \delta_d \,\big] \tag{24}$$
where dC is the computed disparity, dT is the ground-truth disparity, N is the number of evaluated pixels, the bracket equals 1 when the condition holds and 0 otherwise, and δd is the disparity error tolerance (Table 1 lists results for δd = 0.5, 0.75 and 1). The result for each image pair in the Middlebury set is computed by measuring B over the pixels of three regions: nonoccluded pixels (denoted by nocc), pixels near discontinuities (denoted by disc), and pixels that are either nonoccluded or half-occluded (denoted by all). The overall value of B is also reported (the second column, marked Avg. Error, in Tables 1 and 2). It can be seen that the result for the Venus image pair is the most accurate when δd = 1 (Table 1, in the rows with bold face). This is because our algorithm performs well when the scene is mainly composed of planar objects. Note that, in both tables, the subscript of each number denotes the rank in the Middlebury stereo benchmark, which contains approximately 140 algorithms.
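The bad-pixel measure is straightforward to compute. The following sketch assumes that the disparity maps are NumPy arrays and that an optional boolean `mask` selects the region of interest (for example nocc, all, or disc); the function name `bad_pixel_rate` is hypothetical.

```python
import numpy as np

def bad_pixel_rate(disp, truth, delta=1.0, mask=None):
    """Percentage of pixels whose absolute disparity error exceeds delta."""
    bad = np.abs(disp - truth) > delta
    if mask is not None:
        bad = bad[mask]          # restrict the evaluation to one region
    return 100.0 * bad.mean()
```

Lowering `delta` from 1 to 0.5 makes the measure sensitive to subpixel errors, which is why the same disparity map scores a higher error percentage at the tighter tolerance.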
Table 1.
Result of the proposed method with different error thresholds
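| Error Threshold | Avg. Error | Tsukuba Nocc | Tsukuba All | Tsukuba Disc | Venus Nocc | Venus All | Venus Disc | Teddy Nocc | Teddy All | Teddy Disc | Cones Nocc | Cones All | Cones Disc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| δd = 1 | 5.88 | 1.29<sub>37</sub> | 2.08<sub>52</sub> | 5.71<sub>18</sub> | 0.09<sub>3</sub> | 0.38<sub>19</sub> | 1.31<sub>4</sub> | 7.14<sub>61</sub> | 13.2<sub>64</sub> | 15.7<sub>35</sub> | 3.49<sub>46</sub> | 9.62<sub>59</sub> | 10.3<sub>64</sub> |
| δd = 0.75 | 8.26 | 7.15<sub>23</sub> | 8.27<sub>22</sub> | 12.6<sub>7</sub> | 0.43<sub>17</sub> | 0.83<sub>22</sub> | 3.62<sub>16</sub> | 7.91<sub>38</sub> | 14.7<sub>53</sub> | 17.6<sub>25</sub> | 4.05<sub>34</sub> | 10.3<sub>42</sub> | 11.6<sub>50</sub> |
| δd = 0.5 | 10.5 | 7.15<sub>9</sub> | 8.27<sub>11</sub> | 12.6<sub>5</sub> | 3.76<sub>28</sub> | 4.34<sub>27</sub> | 9.83<sub>19</sub> | 10.3<sub>13</sub> | 17.5<sub>24</sub> | 22.7<sub>15</sub> | 4.64<sub>6</sub> | 11.8<sub>15</sub> | 13.1<sub>14</sub> |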
Table 2.
Results of Comparison (δd=0.5)
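| Algorithm | Avg. Error | Tsukuba Nocc | Tsukuba All | Tsukuba Disc | Venus Nocc | Venus All | Venus Disc | Teddy Nocc | Teddy All | Teddy Disc | Cones Nocc | Cones All | Cones Disc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Our method (final) | 10.5 | 7.15<sub>9</sub> | 8.27<sub>11</sub> | 12.6<sub>5</sub> | 3.76<sub>28</sub> | 4.34<sub>27</sub> | 9.83<sub>19</sub> | 10.3<sub>13</sub> | 17.5<sub>24</sub> | 22.7<sub>15</sub> | 4.64<sub>6</sub> | 11.8<sub>15</sub> | 13.1<sub>14</sub> |
| Our method (cons) | 10.7 | 7.36<sub>10</sub> | 8.48<sub>11</sub> | 13.8<sub>7</sub> | 3.69<sub>27</sub> | 4.27<sub>27</sub> | 9.51<sub>19</sub> | 10.4<sub>13</sub> | 17.4<sub>22</sub> | 23.0<sub>16</sub> | 4.79<sub>8</sub> | 12.0<sub>15</sub> | 13.3<sub>16</sub> |
| Our method (no) | 10.8 | 7.53<sub>10</sub> | 8.73<sub>13</sub> | 14.1<sub>8</sub> | 3.72<sub>28</sub> | 4.30<sub>27</sub> | 9.88<sub>19</sub> | 10.3<sub>13</sub> | 17.7<sub>25</sub> | 23.0<sub>16</sub> | 4.93<sub>10</sub> | 12.4<sub>19</sub> | 13.7<sub>18</sub> |
| SubPixDoubleBP [19] | 10.7 | 8.78<sub>18</sub> | 9.45<sub>15</sub> | 14.9<sub>10</sub> | 0.72<sub>3</sub> | 1.12<sub>5</sub> | 5.24<sub>3</sub> | 10.1<sub>12</sub> | 16.4<sub>15</sub> | 21.3<sub>7</sub> | 8.49<sub>45</sub> | 14.7<sub>43</sub> | 16.5<sub>41</sub> |
| AdaptOvrSegBP [13] | 11.9 | 5.98<sub>3</sub> | 6.56<sub>3</sub> | 9.09<sub>1</sub> | 3.66<sub>25</sub> | 3.96<sub>22</sub> | 13.2<sub>52</sub> | 13.0<sub>43</sub> | 18.9<sub>37</sub> | 26.4<sub>35</sub> | 9.48<sub>53</sub> | 14.9<sub>45</sub> | 17.2<sub>48</sub> |
| OvrSegBP [20] | 12.4 | 7.75<sub>14</sub> | 8.17<sub>9</sub> | 13.8<sub>7</sub> | 4.33<sub>33</sub> | 4.73<sub>30</sub> | 16.8<sub>84</sub> | 13.2<sub>44</sub> | 19.3<sub>41</sub> | 27.5<sub>40</sub> | 6.53<sub>24</sub> | 12.6<sub>22</sub> | 14.0<sub>21</sub> |
| AdaptingBP [11] | 13.6 | 19.1<sub>69</sub> | 19.3<sub>65</sub> | 17.4<sub>27</sub> | 4.84<sub>39</sub> | 5.08<sub>35</sub> | 7.84<sub>11</sub> | 12.8<sub>39</sub> | 16.7<sub>17</sub> | 26.3<sub>34</sub> | 7.02<sub>30</sub> | 13.2<sub>28</sub> | 14.0<sub>20</sub> |
| DoubleBP [10] | 15.7 | 18.7<sub>64</sub> | 19.1<sub>62</sub> | 15.8<sub>18</sub> | 7.82<sub>77</sub> | 8.22<sub>73</sub> | 11.3<sub>32</sub> | 14.4<sub>54</sub> | 19.9<sub>47</sub> | 24.4<sub>24</sub> | 11.8<sub>77</sub> | 17.6<sub>68</sub> | 19.7<sub>68</sub> |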
4.3 Comparison
Table 2 summarizes the quantitative performance of our method and that of other global BP stereo methods [10][11][13][19][20]. The results are listed roughly in descending order of overall performance. The evaluation of our final results after disparity enhancement is presented in the first row. Additionally, the evaluation results with and without the edge constraint incorporated into stereo matching are shown in the second and third rows (in bold face), respectively. After incorporating the edge constraint, the matching result improves, as indicated by the smaller average error in Table 2 (10.7 versus 10.8). Note that our algorithm produces its best performance at the subpixel level (e.g., δd = 0.5), where its rank reaches 5 among approximately 140 algorithms.
The Tsukuba image pair contains some dark and noisy regions near the lamp and the desk, which usually lead to incorrect support regions for aggregation. Our method performs very well on Tsukuba, including both highly textured parts (such as the lamp stem in Fig. 4(f)) and textureless parts (such as the box and the desk). This is because our method preserves edges while smoothing the relatively flat regions.
Although a reliable local affine model prevents edge smoothing, obtaining an accurate affine model is very difficult. Therefore, in the proposed algorithm, we simply average the possible local affine model parameters, which leads to the inequality ∇Dk < Āk ∇ Lk in highly textured regions. In this case, certain details are lost, as can be seen for Tsukuba in Fig. 4(c), where the lamp holder appears to be broken.
Additionally, our algorithm performs well when the scene is mainly composed of planar objects. However, if the scene is mainly composed of smooth, curved surfaces, the performance of the proposed algorithm may decrease. This is because the smoothness prior used in the proposed method is a first-order prior [21][22], which favors low-curvature fronto-parallel surfaces. Even in man-made scenes (see the “stair-case” in Venus of Fig. 4(g)), the prior is maximized by fronto-parallel planes, leading to inaccurate depth estimates.
In the left-right checking stage, all unstable disparities are replaced by 0. Since noise causes inconsistencies between the left and right disparity maps, some of the discarded unstable pixels may have had a correct disparity. This explains why the error increases after left-right checking (see Fig. 4(c)–(d)).
5. Conclusion
In this paper, stereo matching is formulated as a Bayesian inference problem. The edge information is integrated into the basic MAP-MRF stereo model as a soft constraint. Then, the accelerated belief propagation algorithm is applied to solve this MAP-MRF problem efficiently. In order to obtain a more reliable cost volume, a local affine model is used to aggregate the matching costs. Our experimental results have demonstrated the high performance of the proposed algorithm. Additionally, our algorithm performs well when the scene is mainly composed of planar objects. However, the performance of the proposed algorithm may decrease if the scene is mainly composed of smooth and curved surfaces. Hence, combining more visibility reasoning and second-order smoothness in an optimization framework will be considered in future work.
Acknowledgments
This research was supported by the Chinese National Natural Science Foundation under grant No.61072135 and National Institutes of Health grants U01HL91736 and R01CA165255 of the United States.
Contributor Information
Jie Li, Email: jielonline@gmail.com.
Wenxuan Shi, Email: shiwxdsp@gmail.com.
Dexiang Deng, Email: whuddx@gmail.com.
Wenyan Jia, Email: jiawenyan@gmail.com.
Mingui Sun, Email: drsun@pitt.edu.
References
- 1. Scharstein Daniel, Szeliski Richard. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. International Journal of Computer Vision, Springer. 2002;47(1–3):7–42.
- 2. Yoon Kuk-Jin, Kweon In-So. Adaptive Support-Weight Approach for Correspondence Search. IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE. 2006;28(4):650–656. doi: 10.1109/TPAMI.2006.70.
- 3. Rhemann Christoph, Hosni Asmaa, Bleyer Michael, Rother Carsten, Gelautz Margrit. Fast cost-volume filtering for visual correspondence and beyond. Proceedings of International Conference on Computer Vision; 2011. pp. 3017–3024.
- 4. Sun Jian, Shum Heung-Yeung, Zheng Nan-Ning. Stereo Matching using Belief Propagation. IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE. 2003;25(7):787–800.
- 5. Veksler Olga. Stereo Correspondence by Dynamic Programming on a Tree. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition; IEEE. 2005. pp. 384–390.
- 6. Boykov Yuri, Veksler Olga, Zabih Ramin. Fast Approximate Energy Minimization via Graph Cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE. 2001;23(11):1222–1239.
- 7. Tappen Marshall F, Freeman William T. Comparison of Graph Cuts with Belief Propagation for Stereo, using Identical MRF Parameters. Proceedings of IEEE International Conference on Computer Vision; IEEE. 2003. pp. 900–907.
- 8. Yang Qingxiong, Wang Liang, Yang Ruigang, Wang Shengnan, Liao Miao, Nistér David. Real-time Global Stereo Matching Using Hierarchical Belief Propagation. Proceedings of British Machine Vision Conference; Springer. 2006. pp. 989–998.
- 9. Felzenszwalb Pedro F, Huttenlocher Daniel P. Efficient Belief Propagation for Early Vision. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition; IEEE. 2004. pp. 261–268.
- 10. Yang Qingxiong, Wang Liang, Yang Ruigang, Stewenius Henrik, Nister David. Stereo matching with color-weighted correlation, hierarchical belief propagation and occlusion handling. IEEE Transactions on Pattern Analysis and Machine Intelligence; IEEE. 2009. pp. 492–504.
- 11. Klaus Andreas, Sormann Mario, Karner Konrad F. Segment-based stereo matching using belief propagation and a self-adapting dissimilarity measure. Proceedings of International Conference on Pattern Recognition; IEEE. 2006. pp. 15–18.
- 12. Zhang Shuying, Li Dongmei. Stereo Match Algorithm Based on Image Color Segments. AISS: Advances in Information Sciences and Service Sciences, AICIT. 2012;4(17):519–526.
- 13. Taguchi Yuichi, Wilburn Bennett, Zitnick Lawrence. Stereo reconstruction with mixed pixels using adaptive over-segmentation. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition; IEEE. 2008. pp. 1–8.
- 14. http://vision.middlebury.edu/stereo.
- 15. He Kaiming, Sun Jian, Tang Xiaoou. Guided Image Filtering. Proceedings of European Conference on Computer Vision; Springer. 2010. pp. 1–14.
- 16. Levin Anat, Rav-Acha Alex, Lischinski Dani. Spectral Matting. IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE. 2008;30:1699–1712. doi: 10.1109/TPAMI.2008.168.
- 17. Levin Anat, Lischinski Dani, Weiss Yair. A Closed Form Solution to Natural Image Matting. IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE. 2008;30:228–242. doi: 10.1109/TPAMI.2007.1177.
- 18. González Rafael C, Woods Richard E. Digital Image Processing. Addison Wesley; USA: 1992. pp. 414–428.
- 19. Yang Qingxiong, Yang Ruigang, Davis James, Nistér David. Spatial-depth super resolution for range images. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition; IEEE. 2007. pp. 1–8.
- 20. Zitnick Lawrence, Kang Sing Bing. Stereo for image-based rendering using image over-segmentation. International Journal of Computer Vision, Springer. 2007;75:49–65.
- 21. Woodford Oliver J, Torr Philip HS, Reid Ian D, Fitzgibbon Andrew W. Global Stereo Reconstruction under Second Order Smoothness Priors. IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE. 2009;31(12):2115–2128. doi: 10.1109/TPAMI.2009.131.
- 22. Fu Jibin, Wu Huixin, Zhai Zhengang. A Probabilistic Stereo Matching Model for Slanted Surface. IJACT: International Journal of Advancements in Computing Technology, AICIT. 2012;4(15):226–234.


