Sensors (Basel, Switzerland). 2016 Feb 6;16(2):217. doi: 10.3390/s16020217

Design and Analysis of a Single-Camera Omnistereo Sensor for Quadrotor Micro Aerial Vehicles (MAVs)

Carlos Jaramillo 1, Roberto G Valenti 2, Ling Guo 3, Jizhong Xiao 2,*
Editor: Xin Zhao
PMCID: PMC4801593  PMID: 26861351

Abstract

We describe the design and 3D sensing performance of an omnidirectional stereo (omnistereo) vision system applied to Micro Aerial Vehicles (MAVs). The proposed omnistereo sensor employs a monocular camera that is co-axially aligned with a pair of hyperboloidal mirrors (a vertically-folded catadioptric configuration). We show that this arrangement provides a compact solution for omnidirectional 3D perception while mounted on top of propeller-based MAVs (not capable of large payloads). The theoretical single viewpoint (SVP) constraint helps us derive analytical solutions for the sensor’s projective geometry and generate SVP-compliant panoramic images to compute 3D information from stereo correspondences (in a truly synchronous fashion). We perform an extensive analysis of various system characteristics such as its size, catadioptric spatial resolution, and field-of-view. In addition, we pose a probabilistic model for the uncertainty estimation of 3D information from triangulation of back-projected rays. We validate the projection error of the design using both synthetic and real-life images against ground-truth data. Qualitatively, we show 3D point clouds (dense and sparse) resulting from a single image captured in a real-life experiment. We expect our sensor to be reproducible, as its model parameters can be optimized to suit other catadioptric-based omnistereo vision systems under different circumstances.

Keywords: catadioptrics, omnistereo, 3D perception, Micro Aerial Vehicles (MAVs)

1. Introduction

Micro aerial vehicles (MAVs), such as quadrotor helicopters, are popular platforms for unmanned aerial vehicle (UAV) research due to their structural simplicity, small form factor, vertical take-off and landing (VTOL) capability, and high omnidirectional maneuverability. In general, UAVs have plenty of military and civilian applications, such as target localization and tracking, 3-dimensional (3D) mapping, terrain and infrastructural inspection, disaster monitoring, environmental and traffic surveillance, search and rescue, deployment of instrumentation, and cinematography, among other uses. However, MAVs have size, payload, and on-board computation limitations, which involve the use of compact and lightweight sensors. The most commonly used perception sensors on MAVs are laser scanners and cameras in various configurations such as monocular, stereo, or omnidirectional. We present a vision-based omnidirectional stereo (omnistereo) sensor motivated by several aspects of MAV robotics.

1.1. Sensor Motivation

We justify the need for the proposed omnistereo sensor after observing two basic differences in the sensor requirements between MAVs and ground vehicles:

  1. Size and payload—In MAV applications, the sensor’s physical dimensions and weight are always a great concern due to payload constraints. Generally, MAVs require fewer and lighter sensors that are compactly designed, while larger robots (including high-payload UAVs) have greater freedom of sensor choice.

  2. Field-of-view (FOV)—Due to their omnidirectional motion model, MAVs require a simultaneous observation of the 3D surroundings. Conversely, most ground robots can safely rely upon narrow vision as their motion control on the plane is more stable.

1.2. Existing Range Sensors for MAVs

In addition to specifying our sensor requirements, it is important to note the most prevalent range sensors used today by MAVs and their limitations. For example, lightweight 2.5D laser scanners can accurately measure distances at fast rates; however, their instantaneous sensing is limited to plane sweeps, which in turn requires the quadrotor to move vertically in order to generate 3D maps or to foresee obstacles and free space during navigation. More recently, 3D laser rangefinders and LiDARs have been developed, such as the sensor presented in [1], but that one is not compact enough for MAVs. Another disadvantage of laser-based technologies is their active sensing nature: they require more power to operate, and their measurements are more vulnerable to detection and corruption (e.g., due to dark/reflective surfaces) than those of vision-based solutions. Time-of-flight (ToF) cameras as well as red, green, blue plus depth (RGB-D) sensors like the Microsoft Kinect® are also very popular for robot navigation. They have been adopted mainly for indoor navigation of MAVs in low-sunlight conditions [2] due to their structured infrared light projection and short sensing range (under 5 m) [3]. Hence, a lightweight imaging system capable of instantly providing a large field of view (FOV) with acceptable resolution is essential for MAV applications in 3D space. The pitfalls of these state-of-the-art sensors motivate the design and analysis of our omnistereo sensor.

1.3. Related Work

Approaches that combine omnidirectional images alone with motion, like those taken in [4,5], have been proposed to map and localize a robot. Omnidirectional vision using a single mirror for the flight of large UAVs was first attempted in [6]. In [7], Hrabar proposed the use of traditional horizontal stereo-based obstacle avoidance and path planning for UAVs, but these techniques were only tested in a scaled-down air vehicle simulator (AVS). Omnidirectional catadioptric cameras can be aided by structured light, such as the prototypes presented in [8] and the more flexible configurations demonstrated in [9]. Alternatively, stereo cameras can provide passive, instantaneous 3D information for robot mapping and navigation (including UAVs [10]). Intuitively, omnidirectional stereo (omnistereo) can be achieved through circular arrangements of multiple perspective cameras with overlapping views. Higher resolution panoramas can be achieved by rotating a linear camera as presented in [11], but this approach suffers from motion blur in dynamic environments. We point the reader to [12] for a detailed study of multiple view geometry, and to [13] for a compendium of geometric computer vision concepts. Instead, our solution to omnistereo vision consists of a ‘catadioptric’ system that employs cameras and mirrors [14].

Over the years, various omnistereo catadioptric configurations have been applied to ground mobile robots [15,16,17,18,19,20]. Unfortunately, these systems are not compact since they use separate camera-mirror pairs, which are known to experience synchronization issues. In [21], Yi and Ahuja described a configuration using a mirror and a concave lens for omnistereo, but it rendered a very short baseline in comparison to the two-mirror configurations. Originally, Nayar and Peri [22] studied 9 possible folded-catadioptric configurations for a single-camera omnistereo imaging system. Eventually, a catadioptric system using two hyperbolic mirrors in a vertical configuration was implemented by He et al. [23]. Their omnistereo sensor provides a lengthy baseline at the expense of a very tall system. In the past [24], we developed a novel omnistereo catadioptric rig consisting of a perspective camera coaxially aligned with two spherical mirrors of distinct radii (in a “folded” configuration). One caveat of spherical mirrors is their non-centrality: they do not satisfy the single effective viewpoint (SVP) constraint (discussed in Section 2.2), and a locus of viewpoints is obtained instead [25].

1.4. Proposed Sensor

We design an SVP-compliant omnistereo system based on the folded, catadioptric configuration with hyperboloidal mirrors. Our approach resembles the work of Jang, Kim, and Kweon [26], who first implemented an omnistereo system using a pair of hyperbolic mirrors and a single camera. However, they did not study their sensor’s characteristics in order to justify its design parameters and capabilities, which we do in our case.

It is true that an omnidirectional catadioptric system sacrifices spatial resolution on the imaging sensor (analyzed in Section 3.4). However, our sensor offers practical advantages such as reduced cost, acceptable weight, and truly instantaneous pixel-disparity correspondences: since the same single camera-lens operates for both views, mis-synchronization issues do not exist. In fact, we believe we are the first to present a single-camera catadioptric omnistereo solution for MAVs. The initial geometry of our model was proposed in [27]. Here, we perform an extensive analysis of our model’s parameters (Section 2) and its geometric projection (Section 3). The parameter values are obtained as the solution to a constrained numerical optimization aimed at the sensor’s real-life application to passive range sensing on MAVs (Section 4). We also show how the panoramic images are obtained, where we find correspondences and triangulate 3D points for which an uncertainty model is introduced (Section 5). Finally, we present our experimental results and evaluation for 3D sensing with the proposed omnistereo sensor (Section 6), and we discuss the future direction of our work in Section 7.

2. Sensor Design

Figure 1 shows the single-camera catadioptric omnistereo vision system that we specifically design to be mounted on top of our micro quadrotors (manufactured by Ascending Technologies [28]). It consists of (1) one hyperboloid-planar mirror at the top; (2) one hyperboloidal mirror at the bottom; and (3) a high-resolution USB camera also at the bottom (inside the bottom mirror and looking up). The components are housed and supported by (4) a transparent tube or plastic standoffs (for the real-life prototype shown in Figure 13). The choice of hyperboloidal reflectors owes to three reasons: the hyperboloid is one of the four non-degenerate conic shapes satisfying the SVP constraint [29]; it allows a wider vertical FOV than elliptical and planar mirrors; and it does not require a telescopic (orthographic) lens for imaging as paraboloidal mirrors do (so our system can be downsized). In addition, the planar part of mirror 1 works as a reflex mirror, which in part reduces the distortion caused by dual conic reflections. Based on the SVP property, the system obtains two radial images of the omnidirectional views in the form of an inner and an outer ring, as illustrated in Figure 2a,b. Nevertheless, the unique set of parameters describing the entire system categorizes it as a “global camera model” in the sense of [13], because changing the value of any parameter in the model affects the overall projection function of visible light rays in the scene as well as other computational imaging factors such as depth resolution and overlapping field of view, which we attempt to optimize in the following design subsections. Please refer to Appendix A for clarification on our symbolic notation.

Figure 1. Synthetic and real prototypes for the catadioptric single-camera omnistereo system.

Figure 13. Real-life prototype of the omnistereo sensor.

Figure 2. Photo-realistic synthetic scene: (a) Side-view of the quadrotor with the omnistereo rig in an office environment; (b) the image captured by the system’s camera using this pose.

2.1. Model Parameters

In the configuration of Figure 3, mirror 1’s real or primary focus is $F_1$, which is separated by a distance $c_1$ from its virtual or secondary focus, $F_1'$, at the bottom. Without loss of generality, we make both the camera’s pinhole and $F_1'$ coincide with the origin of the camera’s coordinate system, $O_C$. This way, the position of the primary focus, $F_1$, can be referenced by the vector ${}^{C}\mathbf{f}_1=[0,0,c_1]^T$ in Cartesian coordinates with respect to the camera frame, $C$. Similarly, the distance between the foci of mirror 2, $F_2$ and $F_2'$, is measured by $c_2$. Here, we use the planar (reflex) mirror of radius $r_{\text{ref}}$ and unit normal vector

$$ {}^{C}\hat{\mathbf{n}}_{\text{ref}}=[0,0,-1]^T \tag{1} $$

in order to project the real camera’s pinhole located at $O_C$ as a virtual camera $O_C'$ coinciding with the virtual focal point $F_2'$ positioned at ${}^{C}\mathbf{f}_{2v}=[0,0,d]^T$. We achieve this by setting $d/2$ as the symmetrical distance from the reflex mirror to $O_C$ and from the reflex mirror to $O_C'$. With respect to $C$, mirror 2’s primary focus, $F_2$, results in position ${}^{C}\mathbf{f}_2=[0,0,d-c_2]^T$. It yields the following expression for the reflective plane, whose points have position ${}^{C}\mathbf{x}$:

$$ {}^{C}\hat{\mathbf{n}}_{\text{ref}}^{T}\,{}^{C}\mathbf{x}=-\frac{d}{2} \tag{2} $$

Figure 3. Geometric model and observable design parameters.

The profile of each hyperboloid is determined by independent parameters $k_1$ and $k_2$, respectively. Their reflective vertical fields of view (vFOV) are indicated by angles $\alpha_1$ and $\alpha_2$. They play an important role when designing the total vFOV of the system, $\alpha_{\text{sys}}$, formally defined by Equation (54) and illustrated in Figure 5. Also, while performing stereo vision, it is important to consider the angle $\alpha_{\text{SROI}}$, which measures the common (overlapping) vFOV of the omnistereo system. The camera’s nominal field of view $\alpha_{\text{cam}}$ and its opening radius $r_{\text{cam}}$ also determine the physical areas of the mirrors that can be fully imaged. Theoretically, the mirrors’ vertical axis of symmetry (coaxial configuration) produces two image points that are radially collinear. This property is advantageous for the correspondence search during stereo sensing (Section 5) with a baseline measured as

$$ b=c_1+c_2-d \tag{3} $$

Figure 5. Vertical Field of View (vFOV) angles: $\alpha_1$ and $\alpha_2$ are the individual angles of the mirrors formed by their respective elevation limits $\theta_{1/2,\min/\max}$; $\alpha_{\text{sys}}$ is the overall vFOV angle of the system; and $\alpha_{\text{SROI}}$ measures the overlapping region between $\alpha_1$ and $\alpha_2$.

Among design parameters, we also include the total height of the system, hsys, and weight msys, both being formulated in Section 2.3.

To summarize, the model has 6 primary design parameters given as a vector

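$$ \boldsymbol{\theta}=[c_1,\,c_2,\,k_1,\,k_2,\,d,\,r_{\text{sys}}] \tag{4} $$

in addition to by-product parameters such as

$$ b,\; h_{\text{sys}},\; r_{\text{ref}},\; r_{\text{cam}},\; m_{\text{sys}},\; \alpha_1,\; \alpha_2,\; \alpha_{\text{sys}},\; \alpha_{\text{SROI}},\; \alpha_{\text{cam}} $$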

In Section 4, we perform a numerical optimization of the parameters in θ with the goal to maximize the baseline, b, required for life-size navigational stereopsis. At the same time, we restrict the overall size of the rig (Section 2.3) without sacrificing sensing performance characteristics such as vertical field of view, spatial resolution, and depth resolution. In the upcoming subsections, we first derive the analytical solutions for the forward projection problem in our coaxial stereo configuration as a whole. In Section 3.2, we derive the back-projection equations for lifting 2D image points into 3D space.

2.2. Single Viewpoint (SVP) Configuration for OmniStereo

As a central catadioptric system, its projection geometry must obey the existence of the so-called single effective viewpoint (SVP). While the SVP guarantees that true perspective geometry can always be recovered from the original image, it limits the selection of mirror profiles to a set of conics. Generally, a circular hyperboloid of revolution (about its axis of symmetry) conforms to the SVP constraint as demonstrated by Baker and Nayar in [30]. Since a hyperboloidal mirror has two foci, the effective viewpoint is the primary focus $F$ inside the physical mirror, and the secondary (outer) focus $F'$ is where the centre (pinhole) of the perspective camera should be placed for depicting a scene obeying the SVP configuration discussed in this section.

First of all, a hyperboloid $i$ can be described by the following parametric equation:

$$ \frac{(z_i-z_{0i})^2}{a_i^2}-\frac{r_i^2}{b_i^2}=1,\quad\text{with}\quad a_i=\frac{c_i}{2}\sqrt{\frac{k_i-2}{k_i}},\quad b_i=\frac{c_i}{2}\sqrt{\frac{2}{k_i}} \tag{5} $$

where $z_{0i}=c_i/2$ is the offset (shift) position of the focus along the Z-axis from the origin $O_C$, and $r_i$ is the orthogonal distance to the axis of revolution / symmetry (i.e., the Z-axis) from a point $P_i$ on its surface.

In fact, the position of a valid point $P_i$ is constrained within the mirror’s physical surface of reflection, which is radially limited by $r_{i,\min}$ and $r_{i,\max}$, such that:

$$ r_i=\sqrt{x_i^2+y_i^2},\quad\text{for}\quad r_{i,\min}\le r_i\le r_{i,\max},\quad i\in\{1,2\} \tag{6} $$

and $r_{1,\min}=r_{\text{ref}}$, $r_{1,\max}=r_{\text{sys}}$, $r_{2,\min}=r_{\text{cam}}$, $r_{2,\max}=r_{\text{sys}}$. Observe that the radius of the system is the upper bound for both mirrors (Figure 3). In addition, the hyperboloids profiled by Equation (5) must obey the following conical constraints:

$$ \forall i\in\{1,2\}:\quad c_i>0\;\wedge\;k_i>2 \tag{7} $$

$k$ is a unit-less constant parameter inversely related to the mirror’s curvature or, more precisely, to the eccentricity $\varepsilon_c$ of the conic. In fact, $\varepsilon_c>1$ for hyperbolas, yet a plane is produced when $\varepsilon_c\to\infty$ or $k=2$.
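As a quick numerical check of Equations (5) and (7), the following Python sketch computes the semi-axes $a_i$ and $b_i$ and confirms that they place the two foci a distance $c_i$ apart. The function name and the sample values of $c_1$ and $k_1$ are illustrative assumptions, not values from this work.

```python
import math

def hyperboloid_semi_axes(c, k):
    """Semi-axes (a, b) of Equation (5) for focal separation c and profile parameter k."""
    if c <= 0 or k <= 2:
        raise ValueError("Equation (7) requires c > 0 and k > 2")
    a = (c / 2.0) * math.sqrt((k - 2.0) / k)
    b = (c / 2.0) * math.sqrt(2.0 / k)
    return a, b

# Illustrative values only (assumed, not the optimized design values).
c1, k1 = 120.0, 6.0
a1, b1 = hyperboloid_semi_axes(c1, k1)

# The foci of a hyperbola lie sqrt(a^2 + b^2) from its centre, so this must equal c/2.
half_focal_distance = math.hypot(a1, b1)

# Eccentricity = (c/2) / a = sqrt(k / (k - 2)); it grows without bound as k -> 2.
eccentricity = half_focal_distance / a1
```

The eccentricity expression follows directly from Equation (5) and makes the inverse relation between $k$ and $\varepsilon_c$ explicit.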

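We devise $M_i$ as the set of all the reflection points $P_i$ with coordinates $(x_i,y_i,z_i)$ lying on the surface of the respective mirror $i$ within bounds. Formally,

$$ M_i:=\left\{P_i\in\mathbb{R}^3\;\middle|\;\frac{(z_i-z_{0i})^2}{a_i^2}-\frac{r_i^2}{b_i^2}=1\;\wedge\;\text{Equation (6)}\;\wedge\;\text{Equation (7)}\right\} \tag{8} $$

In our model, we describe both hyperboloidal mirrors, 1 and 2, with respect to the camera frame $C$, which acts as the common origin of the coordinate system. Therefore,

$$ z_{01}=\frac{c_1}{2} \tag{9} $$
$$ z_{02}=d-\frac{c_2}{2} \tag{10} $$

By expanding Equation (5) with their respective index terms, it becomes

$$ \left(z_1-\frac{c_1}{2}\right)^2-r_1^2\left(\frac{k_1}{2}-1\right)=\frac{c_1^2}{4}\left(\frac{k_1-2}{k_1}\right) \tag{11} $$
$$ \left(z_2-d+\frac{c_2}{2}\right)^2-r_2^2\left(\frac{k_2}{2}-1\right)=\frac{c_2^2}{4}\left(\frac{k_2-2}{k_2}\right) \tag{12} $$

Additionally, we define the function $f_{z_i}: r\mapsto z_i$ to find the corresponding $z_i$ component from a given $r$ value as

$$ f_{z_i}(r):=\begin{cases} z_{0i}+\gamma_i & \text{if } i=1\;\wedge\;\text{Equation (6)}\\ z_{0i}-\gamma_i & \text{if } i=2\;\wedge\;\text{Equation (6)}\\ \text{None} & \text{otherwise}\end{cases} \tag{13} $$

where $\gamma_i=\dfrac{a_i}{b_i}\sqrt{b_i^2+r_i^2}$.

The inverse relation $f_{r_i}: z\mapsto\{+r_i,-r_i\}$ can also be implemented as

$$ f_{r_i}(z):=\begin{cases} \pm b_i\,\Gamma_i & \text{if } i\in\{1,2\}\;\wedge\;\text{Equation (6)}\\ \text{None} & \text{otherwise}\end{cases} \tag{14} $$

where $\Gamma_i=\sqrt{\dfrac{(z-z_{0i})^2}{a_i^2}-1}$, so a valid input $z$ can be associated with both positive and negative solutions $r_i$.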
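The two profile functions of Equations (13) and (14) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the radial bounds of Equation (6) are not enforced, and only the positive radius is returned.

```python
import math

def semi_axes(c, k):
    """Semi-axes (a, b) from Equation (5)."""
    return (c / 2.0) * math.sqrt((k - 2.0) / k), (c / 2.0) * math.sqrt(2.0 / k)

def f_z(i, r, c, k, d=0.0):
    """Equation (13): height z of mirror i at radius r (Equation (6) bounds not checked)."""
    if i not in (1, 2):
        return None
    a, b = semi_axes(c, k)
    gamma = (a / b) * math.sqrt(b * b + r * r)
    if i == 1:
        return c / 2.0 + gamma       # z_01 = c1/2, upper sheet
    return d - c / 2.0 - gamma       # z_02 = d - c2/2, lower sheet

def f_r(i, z, c, k, d=0.0):
    """Equation (14), positive branch: radius of mirror i at height z, or None."""
    if i not in (1, 2):
        return None
    a, b = semi_axes(c, k)
    z0 = c / 2.0 if i == 1 else d - c / 2.0
    under_root = ((z - z0) / a) ** 2 - 1.0
    if under_root < 0.0:
        return None                  # z lies between the two sheets
    return b * math.sqrt(under_root)
```

Because Equation (14) depends on $(z-z_{0i})^2$, `f_r` alone cannot tell which sheet of the hyperboloid a height belongs to; a caller would check that separately.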

2.3. Rig Size

To evaluate the overall system size, we consider the height and weight that result from the primary design parameters, $\boldsymbol{\theta}$.

First, the height of the system, $h_{\text{sys}}$, can be estimated from the functional relationships $f_{z_1}$ and $f_{z_2}$ defined in Equation (13), which provide the respective $z$-component values at the outermost point on each mirror’s surface. More specifically, knowing $r_{\text{sys}}$, we get

$$ h_{\text{sys}}=z_{\max}-z_{\min} \tag{15} $$

where $z_{\max}=f_{z_1}(r_{\text{sys}})$ and $z_{\min}=f_{z_2}(r_{\text{sys}})$.
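To make Equation (15) concrete, the sketch below evaluates the system height and the baseline of Equation (3) for one parameter vector. All numeric values (in millimetres) are assumed for illustration and are not the optimized values reported later in this work.

```python
import math

def f_z(i, r, c, k, d):
    """Equation (13) without bound checks: mirror 1 curves upward, mirror 2 downward."""
    a = (c / 2.0) * math.sqrt((k - 2.0) / k)
    b = (c / 2.0) * math.sqrt(2.0 / k)
    gamma = (a / b) * math.sqrt(b * b + r * r)
    return c / 2.0 + gamma if i == 1 else d - c / 2.0 - gamma

# Assumed design vector theta = [c1, c2, k1, k2, d, r_sys], in millimetres.
c1, c2, k1, k2, d, r_sys = 120.0, 240.0, 6.0, 10.0, 230.0, 37.0

z_max = f_z(1, r_sys, c1, k1, d)   # rim of the top mirror
z_min = f_z(2, r_sys, c2, k2, d)   # rim of the bottom mirror
h_sys = z_max - z_min              # Equation (15)
baseline = c1 + c2 - d             # Equation (3)
```

With these assumed values the rim of mirror 2 falls slightly below the camera pinhole at $z=0$, which matches the description of the camera sitting inside the bottom mirror.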

The rig’s weight can be indicated by the total resulting mass of the main “tangible” components:

$$ m_{\text{sys}}=m_{\text{cam}}+m_{\text{tub}}+m_{\text{mir}} \tag{16} $$

where the mass of the camera-lens combination is $m_{\text{cam}}$; the mass of the support tube, $m_{\text{tub}}$, can be estimated from its cylindrical volume $V_{\text{tub}}$ and material density $\rho_{\text{tub}}$; and the mass due to the mirrors is

$$ m_{\text{mir}}=V_{\text{mir}}\,\rho_{\text{mir}}=\left(V_1+V_{\text{ref}}+V_2\right)\rho_{\text{mir}} \tag{17} $$

For computing the volume of the hyperboloidal shell, $V_i$ for mirror $i$, we apply a “ring method” of volume integration. By assuming all mirror material has the same wall thickness $\tau_m$, we acquire $V_i$ by integrating the area of the horizontal cross-sections along the Z-axis. Each ring area depends on its outer and inner circumferences, which vary according to the radius $r_z$ for a given height $z$. Equation (14) establishes the functional relation $r_i^{+}=f_{r_i}(z)$, from which we only need the positive answer. We let $A$ be the function that computes the ring area of constant thickness $\tau_m$ for a variable outer radius $r_i$:

$$ A(r_i)=\pi r_i^2-\pi\left(r_i-\tau_m\right)^2=\pi\tau_m\left(2r_i-\tau_m\right) \tag{18} $$

We consider the definite integral evaluated over the $z$ interval bounded by the mirror’s height limits, which correspond to its radial limits in Equation (6) and can be obtained via the $f_{z_i}$ defined in Equation (13), such that

$$ z_{i,\min}=f_{z_i}(r_{i,\min})\quad\text{and}\quad z_{i,\max}=f_{z_i}(r_{i,\max}) \tag{19} $$

Then, we proceed to integrate Equation (18), so the shell volume for each hyperboloidal mirror is defined as

$$ V_i=\int_{z_{i,\min}}^{z_{i,\max}}A(r_i)\,dz \tag{20} $$

Finally, since the reflex mirror piece is just a solid cylinder of thickness $\tau_m$, its volume is simply

$$ V_{\text{ref}}=\tau_m\,\pi\,r_{\text{ref}}^2 \tag{21} $$
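The ring method of Equations (17) to (21) can be approximated numerically. The sketch below swaps in a simple midpoint rule for the integral in Equation (20), which is an illustrative choice rather than the method used in this work. The thickness, radii, and density are assumed values, and the function names are hypothetical.

```python
import math

def ring_area(r, tau):
    """Equation (18): cross-section area of a ring with outer radius r and thickness tau."""
    return math.pi * tau * (2.0 * r - tau)

def shell_volume(c, k, z0, sign, r_min, r_max, tau, steps=2000):
    """Equation (20) by the midpoint rule; sign=+1 for mirror 1, sign=-1 for mirror 2."""
    a = (c / 2.0) * math.sqrt((k - 2.0) / k)
    b = (c / 2.0) * math.sqrt(2.0 / k)

    def f_z(r):   # Equation (13)
        return z0 + sign * (a / b) * math.sqrt(b * b + r * r)

    def f_r(z):   # positive branch of Equation (14)
        return b * math.sqrt(max(((z - z0) / a) ** 2 - 1.0, 0.0))

    z_lo, z_hi = sorted((f_z(r_min), f_z(r_max)))
    dz = (z_hi - z_lo) / steps
    return dz * sum(ring_area(f_r(z_lo + (n + 0.5) * dz), tau) for n in range(steps))

# Assumed values: lengths in mm, density in g/mm^3 (roughly that of acrylic).
c1, c2, k1, k2, d = 120.0, 240.0, 6.0, 10.0, 230.0
r_sys, r_ref, r_cam, tau_m, rho_mir = 37.0, 10.0, 7.0, 2.0, 1.2e-3

V_1 = shell_volume(c1, k1, c1 / 2.0, +1, r_ref, r_sys, tau_m)
V_2 = shell_volume(c2, k2, d - c2 / 2.0, -1, r_cam, r_sys, tau_m)
V_ref = tau_m * math.pi * r_ref ** 2      # Equation (21)
m_mir = (V_1 + V_ref + V_2) * rho_mir     # Equation (17)
```

Sorting the two height limits lets the same routine serve both mirrors, since for mirror 2 the height decreases as the radius grows.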

3. Projective Geometry

3.1. Analytical Solutions to Projection (Forward)

Assuming a central catadioptric configuration for the mirrors and camera system (Section 2.2), we derive the closed-form solution to the imaging process (forward projection) for an observable point $P_w$, positioned in three-dimensional Euclidean space, $\mathbb{R}^3$, with respect to the reference frame, $C$, as vector ${}^{C}\mathbf{p}_w=[x_w,y_w,z_w]^T$. In addition, we assume all reference frames such as $F_1$ and $F_2$ have the same orientation as $C$.

For mathematical stability, we require that all projected world points lie outside the mirror’s volume:

$$ f_{r_i}(z_w)<\rho_w,\quad\text{where}\quad\rho_w=\sqrt{x_w^2+y_w^2} \tag{22} $$

where $f_{r_i}$ is defined by Equation (14) and $\rho_w$ measures the horizontal range to $P_w$.

$P_w$ is imaged at pixel position ${}^{I}\mathbf{m}_1$ after its reflection as point $P_1$ on the hyperboloidal surface of mirror 1 (Figure 4). On the other hand, the second image point’s position, ${}^{I}\mathbf{m}_2$, due to reflection point $P_2$ on mirror 2 is obtained indirectly, after an additional reflection at point $P_r$ with position ${}^{C}\mathbf{p}_{\text{ref}}$ on the reflex mirror, given by Equation (32).

Figure 4. Omnistereo projection of a 3D point $P_w$ to obtain image points ${}^{I}\mathbf{m}_1$ and ${}^{I}\mathbf{m}_2$.

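First, for $P_w$’s reflection point via mirror 1 at position vector ${}^{C}\mathbf{p}_1$, we use $\lambda_1$ as the parametrization term for the line equation passing through $F_1$ toward $P_w$ with direction ${}^{F_1}\mathbf{d}_1={}^{C}\mathbf{p}_w-{}^{C}\mathbf{f}_1$. The position of any point $P_1$ on this line is given by:

$$ {}^{C}\mathbf{p}_1={}^{C}\mathbf{f}_1+\lambda_1\,{}^{F_1}\mathbf{d}_1 \tag{23} $$

Substituting Equation (23) into Equation (11), we obtain:

$$ \left(\lambda_1(z_w-c_1)+\frac{c_1}{2}\right)^2-\left(\lambda_1^2x_w^2+\lambda_1^2y_w^2\right)\left(\frac{k_1}{2}-1\right)-\frac{c_1^2}{4}\left(\frac{k_1-2}{k_1}\right)=0 $$

in order to solve for $\lambda_1$, which turns out to be

$$ \lambda_1=\frac{c_1}{\left\|{}^{F_1}\mathbf{d}_1\right\|\sqrt{k_1\cdot(k_1-2)}-k_1\,(z_w-c_1)} \tag{24} $$

where $\left\|{}^{F_1}\mathbf{d}_1\right\|=\sqrt{x_w^2+y_w^2+(z_w-c_1)^2}$ is the Euclidean distance between $P_w$ and mirror 1’s focus, $F_1$.

In practice, we represent the reflection point’s position ${}^{C}\mathbf{p}_1$ as a matrix-vector multiplication between the $3\times4$ transformation matrix $\mathbf{K}_1=\left[\lambda_1\mathbf{I}(3),\;(1-\lambda_1)\,{}^{C}\mathbf{f}_1\right]$ and the point’s position vector ${}^{C}\mathbf{p}_{w,h}=[x_w,y_w,z_w,1]^T$ in homogeneous coordinates:

$$ {}^{C}\mathbf{p}_1=\mathbf{K}_1\,{}^{C}\mathbf{p}_{w,h} \tag{25} $$

Note that ${}^{C}\mathbf{p}_1$’s elevation angle, $\theta_1$, must be bounded as

$$ \theta_{1,\min}\le\theta_1\le\theta_{1,\max} \tag{26} $$

where $\theta_{1,\min}$ and $\theta_{1,\max}$ are the angular elevation limits for the real reflective area of the hyperboloid.

Finally, the reflection point $P_1$ with position ${}^{C}\mathbf{p}_1$ can now be perspectively projected as a pixel point located at ${}^{I}\mathbf{m}_1=[u_1,v_1]^T$ on the image. In fact, the entire imaging process of $P_w$ via mirror 1 can be expressed in homogeneous coordinates as:

$$ {}^{I}\mathbf{m}_{1,h}=\zeta_1\,\mathbf{K}_c\,\mathbf{K}_1\,{}^{C}\mathbf{p}_{w,h} \tag{27} $$

where the scalar $\zeta_1=1/z_1=1/\left(c_1+\lambda_1(z_w-c_1)\right)$ is the perspective normalizer that maps the principal ray passing through $\mathbf{p}_1$ onto a point ${}^{C}\mathbf{q}_1=[x_{q_1},y_{q_1},1]^T$ on the normalized projection plane $\hat{\pi}_{\text{img}_1}$. The traditional $3\times3$ intrinsic matrix of the camera’s pinhole model is

$$ \mathbf{K}_c=\begin{bmatrix} f_u & s & u_c\\ 0 & f_v & v_c\\ 0 & 0 & 1\end{bmatrix} \tag{28} $$

in which $f_u=f/h_x$ and $f_v=f/h_y$ are based on the focal length $f$ and the pixel dimensions $(h_x,h_y)$, $s$ is the skew parameter, and ${}^{I}\mathbf{m}_c=[u_c,v_c]^T$ is the optical center position on the image $I$. Figure 4 illustrates the projection point $f\,{}^{C}\mathbf{q}_1$ on the respective image plane $\pi_{\text{img}_1}$.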
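The chain of Equations (23) to (28) can be traced with a short Python sketch. The intrinsic matrix, mirror parameters, and world point below are assumed values chosen for illustration, and the function name is hypothetical; the elevation bounds of Equation (26) are not checked.

```python
import math

def project_via_mirror1(p_w, c1, k1, K_c):
    """Return (pixel, p1): image point of p_w via mirror 1 and its reflection point."""
    x_w, y_w, z_w = p_w
    norm_d1 = math.sqrt(x_w**2 + y_w**2 + (z_w - c1)**2)
    lam1 = c1 / (norm_d1 * math.sqrt(k1 * (k1 - 2.0)) - k1 * (z_w - c1))  # Eq. (24)
    # Eq. (25): p1 = lam1 * p_w + (1 - lam1) * f1, with f1 = [0, 0, c1]
    p1 = (lam1 * x_w, lam1 * y_w, lam1 * z_w + (1.0 - lam1) * c1)
    zeta1 = 1.0 / p1[2]                      # perspective normalizer of Eq. (27)
    x_q, y_q = zeta1 * p1[0], zeta1 * p1[1]  # point on the normalized plane
    (f_u, s, u_c), (_, f_v, v_c), _ = K_c    # rows of Eq. (28)
    return (f_u * x_q + s * y_q + u_c, f_v * y_q + v_c), p1

# Assumed intrinsics (pixels) and geometry (millimetres), for illustration only.
K_c = ((800.0, 0.0, 640.0), (0.0, 800.0, 480.0), (0.0, 0.0, 1.0))
c1, k1 = 120.0, 6.0
pixel1, p1 = project_via_mirror1((1000.0, 0.0, 100.0), c1, k1, K_c)
```

A useful sanity check is that the returned reflection point satisfies Equation (11), which confirms that the chosen root of the quadratic in $\lambda_1$ lands on the mirror surface.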

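Similarly, we provide the analytical solution for the forward projection of $P_w$ via mirror 2 by first considering the position of reflection point $P_2$:

$$ {}^{C}\mathbf{p}_2=\mathbf{K}_2\,{}^{C}\mathbf{p}_{w,h} \tag{29} $$

where $\mathbf{K}_2=\left[\lambda_2\mathbf{I}(3),\;(1-\lambda_2)\,{}^{C}\mathbf{f}_2\right]$ is similar to the transformation matrix $\mathbf{K}_1$, but it now uses ${}^{C}\mathbf{f}_2$ and

$$ \lambda_2=\frac{c_2}{\left\|{}^{F_2}\mathbf{d}_2\right\|\sqrt{k_2\cdot(k_2-2)}+k_2\left(z_w-(d-c_2)\right)} \tag{30} $$

with direction vector’s norm

$$ \left\|{}^{F_2}\mathbf{d}_2\right\|=\left\|{}^{C}\mathbf{p}_w-{}^{C}\mathbf{f}_2\right\|=\sqrt{x_w^2+y_w^2+\left(z_w-(d-c_2)\right)^2} \tag{31} $$

For completeness, note that the physical projection via mirror 2 is incident to the reflex mirror at

$$ {}^{C}\mathbf{p}_{\text{ref}}={}^{C}\mathbf{f}_{2v}+\lambda_{\text{ref}}\left({}^{C}\mathbf{p}_2-{}^{C}\mathbf{f}_{2v}\right) \tag{32} $$

where $\lambda_{\text{ref}}=\dfrac{d}{2(d-z_2)}$ according to Equation (2) in the theoretical model. Ultimately, ignoring any astigmatism and chromatic aberrations introduced by the reflex mirror, and because the same (and only) real camera with $\mathbf{K}_c$ is used for imaging, we obtain the projected pixel position ${}^{I}\mathbf{m}_{2,h}=[u_2,v_2,1]^T$:

$$ {}^{I}\mathbf{m}_{2,h}=\zeta_2\,\mathbf{K}_c\,\mathbf{K}_{\text{ref}}\,\mathbf{K}_2\,{}^{C}\mathbf{p}_{w,h} \tag{33} $$

where $\zeta_2=1/(d-z_2)$ is the perspective normalizer to find ${}^{C}\mathbf{q}_2$ on the normalized projection plane, $\hat{\pi}_{\text{img}_2}$.

Due to planar mirroring via the reflex mirror, ${}^{C'}_{\;C}\mathbf{K}_{\text{ref}}$ is used to change the coordinates of $P_2$ from $C$ onto the virtual camera frame, $C'$, located at ${}^{C}\mathbf{f}_{2v}$. Hence,

$$ {}^{C'}_{\;C}\mathbf{K}_{\text{ref}}=\left[\mathbf{I}(3)+2\mathbf{D}_{\hat{n}_{\text{ref}}},\;{}^{C}\mathbf{f}_{2v}\right] \tag{34} $$

where the $3\times1$ unit normal vector of the reflex mirror plane, ${}^{C}\hat{\mathbf{n}}_{\text{ref}}$ given in Equation (1), is mapped into its corresponding $3\times3$ diagonal matrix $\mathbf{D}_{\hat{n}_{\text{ref}}}$ via the relationship:

$$ \mathbf{D}_{\hat{n}_{\text{ref}}}=\mathbf{I}(3)\,\mathrm{diag}\left({}^{C}\hat{\mathbf{n}}_{\text{ref}}\right) \tag{35} $$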
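The mirror-2 path of Equations (29) to (35) can be sketched the same way. The code below assumes that the reflex mirror lies on the plane $z=d/2$, so that mirroring sends $(x,y,z)$ to $(x,y,d-z)$; the intrinsics, mirror parameters, and world point are again assumed values, and the function name is hypothetical.

```python
import math

def project_via_mirror2(p_w, c2, k2, d, K_c):
    """Return (pixel, p2, p_ref) for world point p_w imaged via mirror 2."""
    x_w, y_w, z_w = p_w
    dz = z_w - (d - c2)
    norm_d2 = math.sqrt(x_w**2 + y_w**2 + dz**2)                          # Eq. (31)
    lam2 = c2 / (norm_d2 * math.sqrt(k2 * (k2 - 2.0)) + k2 * dz)          # Eq. (30)
    p2 = (lam2 * x_w, lam2 * y_w, lam2 * z_w + (1.0 - lam2) * (d - c2))   # Eq. (29)
    # Eq. (32): where the ray from the virtual pinhole [0, 0, d] to p2 meets z = d/2
    lam_ref = d / (2.0 * (d - p2[2]))
    p_ref = (lam_ref * p2[0], lam_ref * p2[1], d + lam_ref * (p2[2] - d))
    # Eq. (34): mirroring about z = d/2 expresses p2 in the virtual camera frame
    p2_virtual = (p2[0], p2[1], d - p2[2])
    zeta2 = 1.0 / p2_virtual[2]                                           # 1 / (d - z2)
    x_q, y_q = zeta2 * p2_virtual[0], zeta2 * p2_virtual[1]
    (f_u, s, u_c), (_, f_v, v_c), _ = K_c
    return (f_u * x_q + s * y_q + u_c, f_v * y_q + v_c), p2, p_ref

# Assumed intrinsics (pixels) and geometry (millimetres), for illustration only.
K_c = ((800.0, 0.0, 640.0), (0.0, 800.0, 480.0), (0.0, 0.0, 1.0))
c2, k2, d = 240.0, 10.0, 230.0
pixel2, p2, p_ref = project_via_mirror2((1000.0, 0.0, 100.0), c2, k2, d, K_c)
```

For the same world point and these assumed parameters, the mirror-2 pixel falls closer to the image center than a mirror-1 projection would, consistent with the inner and outer ring layout described in Section 2.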

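It is convenient to define the forward projection functions $f_{\varphi_1}({}^{C}\mathbf{p})$ and $f_{\varphi_2}({}^{C}\mathbf{p})$ for a 3D point $P$ whose position vector is known with respect to $C$ and which is situated within the vertical field of view $\alpha_i$ of mirror $i$ (for $i\in\{1,2\}$) indicated in Figure 5. Function $f_{\varphi_i}({}^{C}\mathbf{p})$ maps ${}^{C}\mathbf{p}$ to image point ${}^{I}\mathbf{m}_i$ on frame $I$, such that $f_{\varphi_i}:\mathbb{R}^3\to\mathbb{R}^2$,

$$ f_{\varphi_i}({}^{C}\mathbf{p}):=\begin{cases} {}^{C}\mathbf{p}\xrightarrow{\text{Equation (27)}}{}^{I}\mathbf{m}_1 & \text{if } i=1\;\wedge\;\text{Equations (37) and (22)}\\ {}^{C}\mathbf{p}\xrightarrow{\text{Equation (33)}}{}^{I}\mathbf{m}_2 & \text{if } i=2\;\wedge\;\text{Equations (37) and (22)}\\ \text{None} & \text{otherwise}\end{cases} \tag{36} $$

In fact, ${}^{I}\mathbf{m}_i$ is considered valid if it is located within the imaged radial bounds, such that:

$$ \left\|{}^{IC_i}\mathbf{m}_{r_{i,\min}}\right\|\le\left\|{}^{IC_i}\mathbf{m}_i\right\|\le\left\|{}^{IC_i}\mathbf{m}_{r_{i,\max}}\right\| \tag{37} $$

where the frame of reference $IC_i$ implies that its origin is the image center ${}^{I}\mathbf{m}_c=[u_{c_i},v_{c_i}]^T$ of the masked image $I_i$ (Figure 7). Therefore, the magnitude (norm) of any position ${}^{IC_i}\mathbf{m}$ in pixel space $IC_i$ can be measured as

$$ \left\|{}^{IC_i}\mathbf{m}\right\|:=\left\|{}^{I_i}\mathbf{m}-{}^{I_i}\mathbf{m}_c\right\|=\sqrt{(u-u_c)^2+(v-v_c)^2} \tag{38} $$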

Figure 7. The omnidirectional image $I$ shown in Figure 2b is now annotated for the separate regions of interest in $I_1$ and $I_2$. In addition, we indicate the corresponding radial heights $h_{I_1}$ and $h_{I_2}$ of the SROI, so we can determine the imaging ratio $\chi_{I_{1:2}}=h_{I_1}/h_{I_2}$. For the optimal parameter values listed in Table 1, we find that $\chi_{I_{1:2}}\approx2$.

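In particular, $\left\|{}^{IC_i}\mathbf{m}_{r_{i,\lim}}\right\|$ is the image radius obtained from the projection ${}^{I}\mathbf{m}_{r_{i,\lim}}\leftarrow f_{\varphi_i}({}^{C}\mathbf{p}_{i,\lim})$ corresponding to a particular point coincident with the line of sight of the radial limit $r_{i,\lim}$, which is either $r_{\text{sys}}$, $r_{\text{ref}}$, or $r_{\text{cam}}$ as indicated by Equation (6).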

3.2. Analytical Solutions to Back Projection

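The back projection procedure establishes the relationship between the 2D position of a pixel point ${}^{I}\mathbf{m}_i=[u,v]^T$ on the image $I_i$ and its corresponding 3D projective direction vector $\mathbf{v}_i$ toward the observed point $P_w$ in the world.

Initially, the pixel point ${}^{I}\mathbf{m}_1$ (imaged via mirror 1) is mapped as $Q_1$ onto the normalized projection plane $\hat{\pi}_{\text{img}_1}$ with coordinates ${}^{C}\mathbf{q}_1=[x_{q_1},y_{q_1},1]^T$ by applying the inverse of the camera intrinsic matrix in Equation (28) as follows:

$$ {}^{C}\mathbf{q}_1={}^{C}_{I}\mathbf{K}_c^{-1}\,{}^{I}\mathbf{m}_{1,h}=\begin{bmatrix}\dfrac{1}{f_u} & -\dfrac{s}{f_uf_v} & \dfrac{s\,v_c-f_v\,u_c}{f_uf_v}\\ 0 & \dfrac{1}{f_v} & -\dfrac{v_c}{f_v}\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}u_1\\ v_1\\ 1\end{bmatrix} \tag{39} $$

For simplicity, we assume no distortion parameters exist, so we can proceed with the lifting step along the principal ray that passes through three points: the camera’s pinhole $O_C$, point $Q_1$ on the projection plane, and the reflection point $P_1$ (Figure 4). The vector form of this line equation can be written as:

$$ {}^{C}\mathbf{p}_1={}^{C}\mathbf{o}_c+t_1\left({}^{C}\mathbf{q}_1-{}^{C}\mathbf{o}_c\right)=t_1\,{}^{C}\mathbf{q}_1 \tag{40} $$

By substituting Equation (40) into Equation (11), we solve for the parameter $t_1$, to get

$$ t_1=\frac{c_1}{k_1-\left\|{}^{C}\mathbf{q}_1\right\|\sqrt{k_1\cdot(k_1-2)}} \tag{41} $$

where $\left\|{}^{C}\mathbf{q}_1\right\|=\sqrt{x_{q_1}^2+y_{q_1}^2+1}$ is the distance between $Q_1$ and $O_C$.

Let ${}^{F_1}\mathbf{v}_1$ be the direction vector leaving focal point $F_1$ toward the world point $P_w$. Through the frame transformation ${}^{F_1}_{\;C}\mathbf{T}_1$ applied to ${}^{C}\mathbf{p}_{1,h}$, we get

$$ {}^{F_1}\mathbf{v}_1={}^{F_1}_{\;C}\mathbf{T}_1\,{}^{C}\mathbf{p}_{1,h},\quad\text{where}\quad{}^{F_1}_{\;C}\mathbf{T}_{1\,(3\times4)}=\left[\mathbf{I}(3),\;-{}^{C}\mathbf{f}_1\right] \tag{42} $$

for ${}^{C}\mathbf{p}_{1,h}$ as the homogeneous form of Equation (40). In fact, ${}^{F_1}\mathbf{v}_1$ provides the back-projected angles (elevation $\theta_1$, azimuth $\psi_1$) from focus $F_1$ toward $P_w$:

$$ {}^{F_1}\theta_1=\arcsin\left(\frac{z_{v_1}}{\left\|{}^{F_1}\mathbf{v}_1\right\|}\right)=\arcsin\left(\frac{z_1-c_1}{\left\|{}^{F_1}\mathbf{v}_1\right\|}\right) \tag{43} $$
$$ {}^{F_1}\psi_1=\arctan\left(\frac{y_{v_1}}{x_{v_1}}\right)=\arctan\left(\frac{y_1}{x_1}\right) \tag{44} $$

where $\left\|{}^{F_1}\mathbf{v}_1\right\|$ is the norm of the back-projection vector up to the mirror surface.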
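Equations (39) to (44) can be exercised with the Python sketch below. The intrinsics and mirror parameters are assumed values, the function name is hypothetical, and `math.atan2` is used in place of the plain arctangent of Equation (44) so that the azimuth quadrant is resolved; that substitution is an implementation choice, not part of the derivation above.

```python
import math

def back_project_mirror1(pixel, c1, k1, K_c):
    """Return (theta1, psi1, p1) for a pixel imaged via mirror 1 (no lens distortion)."""
    (f_u, s, u_c), (_, f_v, v_c), _ = K_c
    u, v = pixel
    y_q = (v - v_c) / f_v                      # second row of Eq. (39)
    x_q = (u - u_c - s * y_q) / f_u            # first row of Eq. (39)
    norm_q = math.sqrt(x_q**2 + y_q**2 + 1.0)
    # Eq. (41); valid while the denominator stays positive
    t1 = c1 / (k1 - norm_q * math.sqrt(k1 * (k1 - 2.0)))
    p1 = (t1 * x_q, t1 * y_q, t1)              # Eq. (40)
    v1 = (p1[0], p1[1], p1[2] - c1)            # Eq. (42): subtract f1 = [0, 0, c1]
    norm_v1 = math.sqrt(v1[0]**2 + v1[1]**2 + v1[2]**2)
    theta1 = math.asin(v1[2] / norm_v1)        # Eq. (43)
    psi1 = math.atan2(v1[1], v1[0])            # Eq. (44), quadrant-aware
    return theta1, psi1, p1

# Assumed intrinsics (pixels) and geometry (millimetres), for illustration only.
K_c = ((800.0, 0.0, 640.0), (0.0, 800.0, 480.0), (0.0, 0.0, 1.0))
c1, k1 = 120.0, 6.0
theta1, psi1, p1 = back_project_mirror1((800.0, 480.0), c1, k1, K_c)
```

With these assumed values the pixel lifts to an elevation of roughly $-0.02$ rad, that is, a ray leaving $F_1$ slightly below the horizontal.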

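Using the same approach, we lift a pixel point ${}^{I}\mathbf{m}_2$ imaged via mirror 2. Because the virtual camera $O_C'$ located at ${}^{C}\mathbf{f}_{2v}=[0,0,d]^T$ uses the same intrinsic matrix $\mathbf{K}_c$, we can safely back-project pixel ${}^{I}\mathbf{m}_2$ to $Q_{2v}$ on the normalized projection plane $\hat{\pi}_{\text{img}_2}$ as follows:

$$ {}^{C'}\mathbf{q}_{2v}={}^{C'}\mathbf{q}_2=\mathbf{K}_c^{-1}\,{}^{I}\mathbf{m}_{2,h} \tag{45} $$

where $\mathbf{K}_c^{-1}$ is the inverse of the camera intrinsic matrix of Equation (28), written out in Equation (39). Since the reflection matrix $\mathbf{K}_{\text{ref}}$ defined in Equation (34) is bidirectional due to the symmetric position of the reflex mirror about $C$ and $C'$, we can find the desired position of ${}^{C'}\mathbf{q}_{2v}$ with respect to $C$:

$$ {}^{C}\mathbf{q}_{2v}={}^{C}_{C'}\mathbf{K}_{\text{ref}}\,{}^{C'}\mathbf{q}_{2v,h} \tag{46} $$

which is equivalent to ${}^{C}\mathbf{q}_{2v}=[x_{q_{2v}},y_{q_{2v}},d-1]^T$.

In Figure 4, we can see the principal ray that passes through the virtual camera’s pinhole $O_C'$ and the reflection point $P_2$, so this line equation can be written as:

$$ {}^{C}\mathbf{p}_2={}^{C}\mathbf{f}_{2v}+t_2\left({}^{C}\mathbf{q}_{2v}-{}^{C}\mathbf{f}_{2v}\right) \tag{47} $$

Solving Equations (47) and (12) for $t_2$, we get

$$ t_2=\frac{c_2}{k_2-\left\|{}^{C'}\mathbf{q}_2\right\|\sqrt{k_2\cdot(k_2-2)}} \tag{48} $$

where $\left\|{}^{C'}\mathbf{q}_2\right\|=\sqrt{x_{q_2}^2+y_{q_2}^2+1}$ is the distance between the normalized projection point $Q_2$ and the camera $O_C'$ while considering Equation (46). Note that the newly found location of $P_2$ is given with respect to the real camera frame, $C$.

Again, we obtain the back-projection ray

$$ {}^{F_2}\mathbf{v}_2={}^{F_2}_{\;C}\mathbf{T}_2\,{}^{C}\mathbf{p}_{2,h},\quad\text{where}\quad{}^{F_2}_{\;C}\mathbf{T}_{2\,(3\times4)}=\left[\mathbf{I}(3),\;-{}^{C}\mathbf{f}_2\right] \tag{49} $$

in order to indicate the direction leaving from the primary focus $F_2$ toward $P_w$ through $P_2$. Here, with $z_2=d-t_2$ from Equation (47), the corresponding elevation and azimuth angles are respectively given by

$$ {}^{F_2}\theta_2=\arcsin\left(\frac{z_{v_2}}{\left\|{}^{F_2}\mathbf{v}_2\right\|}\right)=\arcsin\left(\frac{c_2-t_2}{\left\|{}^{F_2}\mathbf{v}_2\right\|}\right) \tag{50} $$
$$ {}^{F_2}\psi_2=\arctan\left(\frac{y_{v_2}}{x_{v_2}}\right)=\arctan\left(\frac{y_2}{x_2}\right) \tag{51} $$

where $\left\|{}^{F_2}\mathbf{v}_2\right\|=\sqrt{x_2^2+y_2^2+(c_2-t_2)^2}$ is the magnitude of the direction vector from its reflection point $P_2$.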
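The mirror-2 lifting of Equations (45) to (51) differs from the mirror-1 case only in the virtual pinhole at $[0,0,d]^T$. The sketch below assumes the same illustrative intrinsics as before and hypothetical mirror parameters; it takes $z_2=d-t_2$ from Equation (47), so that the vertical component of the ray from $F_2$ is $c_2-t_2$.

```python
import math

def back_project_mirror2(pixel, c2, k2, d, K_c):
    """Return (theta2, psi2, p2) for a pixel imaged via mirror 2 (no lens distortion)."""
    (f_u, s, u_c), (_, f_v, v_c), _ = K_c
    u, v = pixel
    y_q = (v - v_c) / f_v                      # Eq. (45), using the inverse in Eq. (39)
    x_q = (u - u_c - s * y_q) / f_u
    norm_q = math.sqrt(x_q**2 + y_q**2 + 1.0)
    # Eq. (48); valid while the denominator stays positive
    t2 = c2 / (k2 - norm_q * math.sqrt(k2 * (k2 - 2.0)))
    # Eqs. (46)-(47): q2v = [x_q, y_q, d - 1] in C, and the ray starts at [0, 0, d]
    p2 = (t2 * x_q, t2 * y_q, d - t2)
    v2 = (p2[0], p2[1], p2[2] - (d - c2))      # Eq. (49): subtract f2 = [0, 0, d - c2]
    norm_v2 = math.sqrt(v2[0]**2 + v2[1]**2 + v2[2]**2)
    theta2 = math.asin(v2[2] / norm_v2)        # Eq. (50)
    psi2 = math.atan2(v2[1], v2[0])            # Eq. (51), quadrant-aware
    return theta2, psi2, p2

# Assumed intrinsics (pixels) and geometry (millimetres), for illustration only.
K_c = ((800.0, 0.0, 640.0), (0.0, 800.0, 480.0), (0.0, 0.0, 1.0))
c2, k2, d = 240.0, 10.0, 230.0
theta2, psi2, p2 = back_project_mirror2((720.0, 480.0), c2, k2, d, K_c)
```

As in the mirror-1 case, the lifted point can be substituted into Equation (12) to confirm that it lies on the surface of mirror 2.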

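As with the (forward) projection, it is convenient to define the back-projection functions $f_{\beta_1}$ and $f_{\beta_2}$ for lifting a 2D pixel point ${}^{I}\mathbf{m}$ within the radial bounds validated by Equation (37) to its angular components $\left({}^{F_i}\theta_i,\psi_i\right)$ with respect to the respective focus frame $F_i$ (oriented like $C$) as indicated by Equations (43), (44), (50) and (51), such that $f_{\beta_i}:\mathbb{R}^2\to\mathbb{R}^2$,

$$ f_{\beta_i}({}^{I}\mathbf{m}):=\begin{cases}\left({}^{I}\mathbf{m}\xrightarrow{\text{Equation (43)}}{}^{F_1}\theta_1,\;{}^{I}\mathbf{m}\xrightarrow{\text{Equation (44)}}{}^{F_1}\psi_1\right) & \text{if } i=1\\ \left({}^{I}\mathbf{m}\xrightarrow{\text{Equation (50)}}{}^{F_2}\theta_2,\;{}^{I}\mathbf{m}\xrightarrow{\text{Equation (51)}}{}^{F_2}\psi_2\right) & \text{if } i=2\\ \text{None} & \text{if } \neg\,\text{Equation (37)}\end{cases} \tag{52} $$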

3.3. Field-of-View

The horizontal FOV is clearly 360° for both mirrors. In other words, azimuths $\psi$ can be measured in the interval from $0$ to $2\pi$ rad. As discussed previously, there exists a positive correlation between the vertical field of view (vFOV) angle $\alpha_i$ of mirror $i$ and its profile parameter $k_i$, such that $\alpha_i\to180°$ as $k_i\to\infty$ (see Figure 9). As demonstrated in Figure 5, $\alpha_i$ is physically bounded by its corresponding elevation angles, $\theta_{i,\max}$ and $\theta_{i,\min}$. Both vFOV angles, $\alpha_1$ and $\alpha_2$, are computed from their elevation limits as follows:

$$ \alpha_1=\theta_{1,\max}-\theta_{1,\min} \tag{53a} $$
$$ \alpha_2=\theta_{2,\max}-\theta_{2,\min} \tag{53b} $$

Figure 9. The effect that parameter $k_i$ (showing mirror 1 only) has over the system radius $r_{\text{sys}}$ for various values of the vertical field of view angle $\alpha_1$. In order to maintain a vertical field of view $\alpha_i$ that is bounded by $z_{\max}$ at $r_{\text{sys}}$, the value of $r_{\text{sys}}$ must change accordingly. Inherently, the system’s height, $h_{\text{sys}}$, and its mass, $m_{\text{sys}}$, are also affected by $k_i$ (see Section 2.3).

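The overall vFOV of the system is also given from these elevation limits:

$$ \alpha_{\text{sys}}=\max\left(\theta_{1,\max},\theta_{2,\max}\right)-\min\left(\theta_{1,\min},\theta_{2,\min}\right) \tag{54} $$

Figure 6 highlights the so-called common vFOV angle, $\alpha_{\text{SROI}}$, for the stereo region of interest (SROI) where the same point can be seen from both mirrors so point correspondences can be found (Section 5). In our model, $\alpha_{\text{SROI}}$ can be decided from the value of the three prevailing elevation angles ($\theta_{1,\max}$, $\theta_{1,\min}$, and $\theta_{2,\min}$), such that:

$$ \alpha_{\text{SROI}}=\theta_{\text{SROI},\max}-\theta_{\text{SROI},\min} \tag{55} $$

where generally,

$$ \theta_{\text{SROI},\min}=\max\left(\theta_{1,\min},\theta_{2,\min}\right) \tag{56a} $$
$$ \theta_{\text{SROI},\max}=\min\left(\theta_{1,\max},\theta_{2,\max}\right) \tag{56b} $$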
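Equations (53) to (56) reduce to a few comparisons, sketched below in Python. The elevation limits (in degrees) are hypothetical values chosen for illustration, and clamping a non-overlapping SROI to zero is an added convention rather than part of the equations above.

```python
def vfov_angles(theta1_min, theta1_max, theta2_min, theta2_max):
    """Equations (53)-(56): vFOV angles from elevation limits (any consistent unit)."""
    alpha1 = theta1_max - theta1_min                                        # Eq. (53a)
    alpha2 = theta2_max - theta2_min                                        # Eq. (53b)
    alpha_sys = max(theta1_max, theta2_max) - min(theta1_min, theta2_min)   # Eq. (54)
    theta_sroi_min = max(theta1_min, theta2_min)                            # Eq. (56a)
    theta_sroi_max = min(theta1_max, theta2_max)                            # Eq. (56b)
    # Eq. (55), clamped so that disjoint views report no common vFOV (assumption)
    alpha_sroi = max(theta_sroi_max - theta_sroi_min, 0.0)
    return {"alpha1": alpha1, "alpha2": alpha2,
            "alpha_sys": alpha_sys, "alpha_sroi": alpha_sroi}

# Hypothetical elevation limits in degrees.
angles = vfov_angles(theta1_min=-20.0, theta1_max=15.0,
                     theta2_min=-15.0, theta2_max=20.0)
```

For these sample limits the two mirrors each span 35°, the system spans 40°, and the overlapping region spans 30°.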

Figure 6. A cross section of the SROI (shaded area) formed by the intersection of view rays for the limiting elevations $\theta_{1/2,\min/\max}$. The nearest stereo (ns) points are labeled $P_{ns}^{\text{high}}$, $P_{ns}^{\text{mid}}$ and $P_{ns}^{\text{low}}$ since they are the vertices of the hull that near-bounds the set of usable points for depth computation from triangulation (Section 5.2). See Table 3 for the proposed sensor’s values.

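The shaded area in Figure 6 illustrates the SROI that is far-bounded by the set of triangulated points found at the maximum range due to minimum disparity $\Delta m_{12}=1\,\text{px}$ in the discrete case (refer to Figure 17), such that

$$ P_{fs}=\left\{P_w\leftarrow f_\Delta\big((\theta_1,\psi_1),(\theta_2,\psi_2)\big)\;\middle|\;(\theta_1,\psi_1)\leftarrow f_{\beta_1}(\mathbf{m}_1)\;\wedge\;(\theta_2,\psi_2)\leftarrow f_{\beta_2}(\mathbf{m}_2)\;\wedge\;\Delta m_{12}=1\,\text{px}\right\} \tag{57} $$

where functions $f_{\beta_i}$ and $f_\Delta$ are provided in Equations (52) and (89).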

Figure 17. Variation of horizontal range, $\Delta\rho_w$, due to change in pixel disparity $\Delta m_{12}$ on the omnidirectional image, $I$. There exists a “nonlinear & inverse” relation between the change in depth from triangulation ($\Delta\rho_w$) and the number of disparity pixels ($\Delta m_{12}$) available from the omnistereo image pair $(I_1,I_2)$, which are exclusive subspaces of $I$.

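The SROI is near-bounded (to the Z-axis of radial symmetry) by its vertices $P_{ns}^{\text{high}}$, $P_{ns}^{\text{mid}}$ and $P_{ns}^{\text{low}}$, which result from the following ray-intersection cases:

  • (a) $P_{ns}^{\text{high}}\leftarrow f_\Delta\big((\theta_{1,\max},\psi_1),(\theta_{2,\max},\psi_2)\big)$

  • (b) $P_{ns}^{\text{mid}}\leftarrow f_\Delta\big((\theta_{1,\min},\psi_1),(\theta_{2,\max},\psi_2)\big)$

  • (c) $P_{ns}^{\text{low}}\leftarrow f_\Delta\big((\theta_{1,\min},\psi_1),(\theta_{2,\min},\psi_2)\big)$

where the intersection function $f_\Delta$ is implemented for direction rays (or angles) as defined in the Triangulation Section 5.2.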

Assuming radial symmetry, the camera’s field of view $\alpha_{\text{cam}}$ should allow for a complete view of the mirror surface at its outermost diameter of $2r_{\text{sys}}$ according to Equation (6). At the same time, as depicted in Figure 6, $\alpha_{\text{cam}}$ is upper-bounded by the camera hole radius $r_{\text{cam}}$ selected according to Equation (78). The following inequality constraint emerges

$$ 2\arctan\left(\frac{r_{\text{sys}}}{f_{z_1}(r_{\text{sys}})}\right)\le\alpha_{\text{cam}}\le2\arctan\left(\frac{r_{\text{cam}}}{f_{z_2}(r_{\text{cam}})}\right) \tag{58} $$

where the respective functions $f_{z_i}$ are defined in Equation (13).

Our specific viewing requirements when mounting the omnidirectional sensor along the central axis of the quadrotor ensure that objects located at 15 cm under the rig’s base and at 1 meter away (from the central axis) can be viewed. Thus, angles θ1,min and θ2,min should only be large enough as to avoid occlusions from the MAV’s propellers (Figure 5) and to produce inner and outer ring images at a useful ratio (Figure 7).

3.4. Spatial Resolution

The resolution of the images acquired by our system is not space invariant. In fact, an omnidirectional camera producing spatial resolution-invariant images can only be obtained through a non-analytical function of the mirror profile, as shown in [31]. In this section, we study the effect our design has on its spatial resolution, which depends on position parameters like $d$ and $c_i$ introduced in Section 2.1 and directly on the characteristics (e.g., focal length $f$) of the camera obtaining the image.

Let ηcam be the spatial resolution for a conventional perspective camera as defined by Baker and Nayar in [25,29]. It measures the ratio between the infinitesimal solid angle dωi (usually measured in steradians) that is directed toward a point Pi at an angle θi,pix (formed with the optical axis ZC) and the infinitesimal element of image area dApix that dωi subtends (as shown in Figure 8). Accordingly, we have:

$$\eta_{cam} = \frac{dA_{pix}}{d\omega_i} = \frac{f^2}{\cos^3\theta_{i,pix}} \tag{59}$$

which falls to its minimum, $f^2$, as $\theta_{pix} \to 0$, so the resolution on the sensor plane increases continuously with distance from the optical center imaged at ${}^{I}m_c$. For ease of visualization, we plot only the $u$ pixel coordinates corresponding to the 2D spatial resolution $\eta_{2D}$, which is obtained by projecting the solid angle $\Omega$ onto a planar angle $\theta_\Omega$ (the apex angle in 2D of the solid cone of view). This yields $\theta_\Omega = 2\arccos\left(1 - \Omega/2\pi\right)$, and we reduce the image area $A$ to its circular diameter, $2\sqrt{A/\pi}$. Generally, our conversion from the 3D spatial resolution $\eta$, in $m^2/sr$ units, to 2D proceeds as follows:

$$\eta_{2D} = \frac{2\sqrt{\eta/\pi}}{\theta_{\Omega=1\,sr}} \tag{60}$$

where $\theta_{\Omega=1\,sr} \approx 1.14390752211$ rad. More specifically, Equation (59) is manipulated to provide $\eta_{i,cam}$ as the indicator of spatial resolution toward any specific point on the mirror, ${}^{C}P_i \in M_i$ according to Equation (8), as follows:

$$\eta_{i,cam} = \begin{cases} f^2 \left( \dfrac{\sqrt{r_1^2 + z_1^2}}{z_1} \right)^{3} & \text{if } i = 1 \\[2ex] f^2 \left( \dfrac{\sqrt{r_2^2 + (d - z_2)^2}}{d - z_2} \right)^{3} & \text{if } i = 2 \end{cases} \tag{61}$$

where $r_i$ is the radial length defined in Equation (6), $z_i$ is its associated coordinate, $f$ is the camera's focal length, and the design parameters $d$ and $c_i$ relate to the position of the mirror focal points $F_i$ with respect to the camera frame $C$.
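The 3D-to-2D conversion of Equation (60) can be checked numerically. The following is a minimal sketch, not the authors' code; the function names are hypothetical and chosen only for illustration. It reproduces the planar angle of a 1 sr cone quoted above and evaluates the perspective-camera resolution of Equation (59).

```python
import math


def planar_angle_from_solid_angle(omega_sr: float) -> float:
    """Apex angle (rad) of a right circular cone subtending omega_sr steradians."""
    return 2.0 * math.acos(1.0 - omega_sr / (2.0 * math.pi))


def eta_cam(f: float, theta_pix: float) -> float:
    """Equation (59): perspective-camera resolution f^2 / cos^3(theta_pix)."""
    return f ** 2 / math.cos(theta_pix) ** 3


def eta_2d(eta: float) -> float:
    """Equation (60): diameter of the imaged area per planar angle of a 1 sr cone."""
    return 2.0 * math.sqrt(eta / math.pi) / planar_angle_from_solid_angle(1.0)


# Planar angle that corresponds to a solid angle of 1 sr.
THETA_1SR = planar_angle_from_solid_angle(1.0)
```

Because `eta_2d` is monotonic in `eta`, the 2D plots preserve the ordering of the 3D resolutions; only the units change.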

Figure 8.

The spatial resolution for a central catadioptric sensor is the ratio between an infinitesimal image area dA and its corresponding solid angle dν1 that views a point Pw. (Note: infinitesimal elements are exaggerated in the figure for better visualization.)

Thus, for a conventional perspective camera, $\eta_{i,cam}$ grows as $\theta_{i,pix} \to \pi/2$ due to the foreshortening effect, which stretches the image representation around the sensor plane's periphery, where spatial information gets collected onto a larger number of pixels. Therefore, image areas farther from the optical axis are considered to have higher spatial resolutions.

Baker and Nayar also defined the resolution, ηi, of a catadioptric sensor in order to quantify the view of the world or dνi, an infinitesimal element of the solid angle subtended by the mirror’s effective viewpoint Fi, which is consequently imaged onto a pixel area dApix. Again, here we provide the resolution according to our model:

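$$\eta_1 = \frac{dA_{pix}}{d\nu_1} = \frac{r_1^2 + (c_1 - z_1)^2}{r_1^2 + z_1^2}\,\eta_{1,cam} \tag{62a}$$
$$\eta_2 = \frac{dA_{pix}}{d\nu_2} = \frac{r_2^2 + (c_2 - d + z_2)^2}{r_2^2 + (d - z_2)^2}\,\eta_{2,cam} \tag{62b}$$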

for our mirror-perspective camera configuration, where OC is the origin of coordinates as shown in Figure 8 and ηi,cam is given in Equation (61).
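As a rough illustration of Equations (61) and (62), the sketch below evaluates the catadioptric resolution for a reflection point given by its radius and height. The function names are hypothetical. It relies on the observation that Equation (62b) is Equation (62a) with $z$ replaced by $(d - z_2)$ and $c$ by $c_2$. The mirror profile that ties $r_i$ to $z_i$ (Equation (14)) is not reproduced here, so both values are passed in explicitly.

```python
import math


def eta_perspective(f: float, r: float, z: float) -> float:
    """Equation (61) form: f^2 * (sqrt(r^2 + z^2) / z)^3, z measured along the optical axis."""
    return f ** 2 * (math.hypot(r, z) / z) ** 3


def eta_catadioptric(f: float, r: float, z: float, c: float) -> float:
    """Equation (62a) form: camera resolution scaled by the mirror factor."""
    factor = (r ** 2 + (c - z) ** 2) / (r ** 2 + z ** 2)
    return factor * eta_perspective(f, r, z)


def eta_mirror2(f: float, r2: float, z2: float, c2: float, d: float) -> float:
    """Equation (62b): same expression with z -> (d - z2) and c -> c2."""
    return eta_catadioptric(f, r2, d - z2, c2)
```

When the reflection point sits halfway between the two foci ($z = c/2$), the mirror factor equals one and the catadioptric resolution coincides with that of the camera alone.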

As demonstrated by the plot of Figure 12 in Section 4.2.2, ηi grows accordingly towards the periphery of each mirror (the equatorial region). This aspect of our sensor design is very important because it indicates that the common field of view, αSROI, where stereo vision is employed (Section 5), is imaged at a relatively higher resolution than the unused polar regions closer to the optical axis (the ZC axis).

Figure 12.

Using the formula given in Equation (60), we plot the 2D version of the spatial resolution of our proposed omnistereo catadioptric sensor (37mm-radius rig). Both resolutions η1 and η2 increase towards the equatorial region where they are physically limited by rsys. This verifies the spatial resolution theory given in [29], and it justifies our coaxial configuration useful for omnistereo sensing within the SROI indicated in Figure 6.

If we modify ηi by substituting ri with its equivalent fri(zi) function defined in Equation (14), using mirror 1 for example, we get:

$$\eta_1 = \frac{f_{r_1}^2(z_1) + (c_1 - z_1)^2}{f_{r_1}^2(z_1) + z_1^2}\,\eta_{1,cam} = f^2\,\frac{\sqrt{f_{r_1}^2(z_1) + z_1^2}\,\left(f_{r_1}^2(z_1) + (c_1 - z_1)^2\right)}{z_1^3} \tag{63}$$

which inherently indicates how the resolution $\eta_i$ for a reflection point $P_i$ increases with $k_i$ (Figure 11). Conversely, the smaller the $k_i$ parameter gets (related to eccentricity as discussed in Section 2.2), the flatter the mirror becomes, so its resolution more closely resembles that of the perspective camera alone. Mathematically, $\lim_{k_i \to 2} \eta_i = \eta_{i,cam}$.

Figure 11.

Comparison of $k_i$ values and their effect on spatial resolution $\eta_i$ for $i = \{1,2\}$. For the big rig, the optimal focal dimensions $c_1$ and $c_2$ (from Table 1) were used as well as the angular span of the common vertical FOV, $\alpha_{SROI} \approx 28^\circ$. Although resolution $\eta_i$ (Opt.) for the optimal values of $k_i$ could be improved by employing smaller $k$ values (lower curvature profiles indicated on the left plot of the figure), this would in turn increase the system radius, $r_{sys}$, in order to maintain $\alpha_i$ (Figure 9). As expected, the plot on the right helps us appreciate how the spatial resolutions, $\eta_i$, increase towards the equatorial regions ($\theta_1 \to \theta_{SROI,max}$ and $\theta_2 \to \theta_{SROI,min}$).

As shown in Figure 9, a smaller ki would require a wider radius rsys in order to achieve the same omnidirectional vertical field of view, αsys. Even worse, in order to image such a wider reflector, either the camera’s field of view, αcam, would have to increase (by decreasing the focal length f and perhaps requiring a larger camera hole rcam and sensor size), or the distance ci between the effective pinhole and the viewpoint would have to increase accordingly. Another consequence is the effect on the baseline b, which must change in order to maintain the same vertical field of view (Figure 10). As a result, the depth resolution of the stereo system would suffer as well.

Figure 10.

The effect that parameter $k_1$ has over the omnistereo system's baseline $b$ for several common FOV angles ($\alpha_{SROI}$) and a fixed camera with $\alpha_{cam}$. An inverse relationship exists between $k$ and $b$ as plotted here (using a logarithmic scale for the vertical axis). Intuitively, the flatter the mirror gets ($k \to 2$), the farther $F_1$ must be translated in order to fit within the camera's view, $\alpha_{SROI}$, causing $b$ to increase.

4. Parameter Optimization and Prototyping

The nonlinear nature of this system makes it very difficult to balance among its desirable performance aspects. The optimal vector of design parameters, θ*, can be found by posing a constrained maximization problem for the objective function

$$f_b(\theta) = c_1 + c_2 - d \tag{64}$$

which measures the baseline according to Equation (3). Indeed, the optimization problem is subject to the set of constraints C, which we enumerate in Section 4.1. Formally,

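$$\theta^* = \underset{\theta \in \Theta}{\arg\max}\; f_b(\theta) \quad \text{subject to } C \tag{65}$$

where $\Theta \subset \mathbb{R}^6$ is the 6-dimensional solution space for $\theta \in \mathbb{R}^6$ given in Equation (4) as $\theta = [c_1, c_2, k_1, k_2, d, r_{sys}]$.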
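As a quick sanity check of the objective, the sketch below evaluates the baseline $b = c_1 + c_2 - d$ for the optimal parameters reported in Table 1. The function and dictionary names are hypothetical; only the numeric values come from the table.

```python
def baseline(c1: float, c2: float, d: float) -> float:
    """Objective of Equation (64): baseline between the two effective viewpoints, in mm."""
    return c1 + c2 - d


# Optimal design parameters from Table 1 (all in mm).
BIG_RIG = {"c1": 123.49, "c2": 241.80, "d": 233.68}
SMALL_RIG = {"c1": 104.59, "c2": 204.34, "d": 200.00}

b_big = baseline(**BIG_RIG)
b_small = baseline(**SMALL_RIG)
```

Both results agree with the baselines listed in Table 1 to within the rounding of the tabulated parameters.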

4.1. Optimization Constraints

We discuss the constraints that the proposed omnistereo sensor is subject to. Overall, we mainly take the following into account:

  • (a)

    geometrical constraints — including SVP and reflex constraints described by Equations (11), (12) and (2);

  • (b)

    physical constraints — the rig’s dimensions, which include the mirrors radii as well as by-product parameters such as system height hsys and mass msys;

  • (c)

    performance constraints — the spatial resolution and range from triangulation determined by parameters k1, k2, and c1; the desired viewing angles for an optimal SROI field of view, αSROI.

Following the design model described throughout Section 2, we now list the pertaining linear and nonlinear constraints that compose the set $C$. We separate the linear constraints into a subset $C_L$ and the non-linear constraints into a subset $C_{NL}$, so $C = C_L \cup C_{NL}$. Within each subset, we generalize equality constraints as functions $h: \mathbb{R}^6 \to \mathbb{R}$ that obey

$$h(\theta) = 0 \tag{66}$$

whereas inequality functions $g: \mathbb{R}^6 \to \mathbb{R}$ satisfy

$$g(\theta) \le 0 \tag{67}$$

4.1.1. Linear Constraints

We have only set up linear inequalities for constraints in $C_L$. Specifically, we require the following:

  • g1:
    In order to set the position of $F_2$ below the origin $O_C$ of the pinhole camera frame $C$, the focal distance $c_2$ of mirror 2 must be larger than $d$ (distance between $O_C$ and $F_{2v}$),
    $$d \le c_2 \tag{68}$$
  • g2:
    Because the hyperboloidal mirror should reflect light towards its effective viewpoint $F_1$ without being occluded by the reflex mirror, mirror 1's focal distance, $c_1$, needs to exceed the placement of the reflex mirror,
    $$d/2 \le c_1 \tag{69}$$
  • g3:
    The empirical constraint
    $$\frac{5}{3} \le \frac{k_2}{k_1} \tag{70}$$
    pertains to our rig dimensions in order to assign a greater curvature to mirror 2's profile (located at the bottom), so its view is directed toward the equatorial region rather than up. Complementarily, this constraint flattens mirror 1's profile, so it can possess a greater view of the ground. This curvature inequality allows the SROI to be bounded by a wider vertical field of view when the sensor must be mounted above the MAV's propellers as depicted in Figure 5.
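The three linear constraints can be written in the $g(\theta) \le 0$ form of Equation (67) and evaluated for the optimal parameters of Table 1. This is a minimal sketch with hypothetical function names. It assumes the constraints read $d \le c_2$, $d/2 \le c_1$ and $5/3 \le k_2/k_1$, and it allows a small tolerance because the tabulated values are rounded.

```python
def linear_constraints(c1, c2, d, k1, k2):
    """Return g1..g3 so that each constraint holds when its value is <= 0."""
    return {
        "g1": d - c2,             # F2 below the camera origin: d <= c2
        "g2": d / 2.0 - c1,       # F1 beyond the reflex mirror: d/2 <= c1
        "g3": 5.0 / 3.0 - k2 / k1,  # curvature ratio: 5/3 <= k2/k1
    }


def is_feasible(values, tol=1e-3):
    """True if every inequality is satisfied within a rounding tolerance."""
    return all(v <= tol for v in values.values())


# Optimal parameters from Table 1 (lengths in mm, k values unitless).
big = linear_constraints(c1=123.49, c2=241.80, d=233.68, k1=5.73, k2=9.74)
small = linear_constraints(c1=104.59, c2=204.34, d=200.00, k1=6.88, k2=11.47)
```

For the small rig, `g3` is very close to zero, which suggests the curvature-ratio constraint is active at that optimum.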

4.1.2. Non-Linear Constraints

For the non-linear design constraints, we establish the following inequalities:

  • g4:
    The AscTec Pelican quadrotor has a maximum payload of 650 g (according to the manufacturer specifications [28]). Therefore, we must satisfy the system mass computed via Equation (16), such that
    $$m_{sys} \le 650\ \mathrm{g} \tag{71}$$
  • g5:
    Similarly, we limit the system's height obtained with Equation (15) by a height limit $h_{sys,max}$,
    $$h_{sys} \le h_{sys,max} \tag{72}$$
    For example, we set $h_{sys,max} = 150$ mm for the 37 mm-radius rig.
  • g6:
    The origin of coordinates for the camera frame is set at its viewpoint, $O_C$. In order to fit the camera enclosure under mirror 2, it is realistic to position the focus $F_2$ on the vertical transverse axis at more than 5 mm away from $O_C$:
    $$5\ \mathrm{mm} \le z_{0_2} - a_2 \tag{73}$$
    where $z_{0_2}$ is defined in Equation (10), and $a_2$ pertains to Equation (5).

Next, we determine the bounds for the limiting angles that partake in the computation of the system's vertical field of view $\alpha_{sys}$, which is based on Equation (54). Our application has specific viewing requirements that can be achieved with the following application conditions:

  • g7:
    Let $\Lambda_{1,max} = 14^\circ$ be an acceptable upper-bound for angle $\theta_{1,max}$, such that
    $$\theta_{1,max} \le \Lambda_{1,max} \tag{74}$$
  • g8:
    Because we desire a larger view towards the ground from mirror 1, we empirically set $\Lambda_{1,min} = -25^\circ$ as a lower-bound for the minimum elevation $\theta_{1,min}$,
    $$\Lambda_{1,min} \le \theta_{1,min} \tag{75}$$
  • g9:
    In order to avoid occlusions with the MAV's propellers while being capable of imaging objects located about 5 cm under the rig's base and 20 cm away (horizontally) from the central axis, we limit mirror 2's lowest angle by a lower-bound $\Lambda_{2,min} = -14^\circ$,
    $$\Lambda_{2,min} \le \theta_{2,min} \tag{76}$$

Finally, we restrict the radius of the system, $r_{sys}$, to be identical for both hyperboloids by satisfying the following equality condition:

  • h1:
    With functions $f_{r_1}$ and $f_{r_2}$ defined in Equation (14), we set
    $$r_{sys} = r_{i,max} = f_{r_i}(z_{i,max}), \quad \forall\, i \in \{1,2\}$$
    where we imply that $z_{i,max} \leftarrow f_{z_i}(r_{sys})$ using Equation (13). Thus, the entire function composition for this equality becomes
    $$f_{r_1}\big(f_{z_1}(r_{sys})\big) = f_{r_2}\big(f_{z_2}(r_{sys})\big) \tag{77}$$

4.2. Optimal Results

Applying the aforementioned constraints (Section 4.1) and using an iterative nonlinear optimization method, such as one of those surveyed in [32], a bounded solution vector $\theta^*$ converges to the values shown in Table 1 for two rig sizes. Table 2 contains the by-product parameters corresponding to the dimensions listed in Table 1.

Table 1.

Optimal System Design Parameters.

Parameter               Big Rig    Small Rig
b = max f_b(θ*) [mm]    131.61     108.92
r_sys [mm]              37.0       28.0
c_1 [mm]                123.49     104.59
c_2 [mm]                241.80     204.34
d [mm]                  233.68     200.00
k_1                     5.73       6.88
k_2                     9.74       11.47

Table 2.

By-product Length Parameters.

Parameter     Big Rig    Small Rig
r_ref [mm]    17.23      11.74
r_cam [mm]    7          7
h_sys [mm]    150.00     120.00

As Figure 3 illustrates, a realistic dimension for the radius of the camera hole, rcam, must consider the maximum value between a physical micro-lens radius (rlens) and the radius rαcamrsys for an unoccluded field of view of the camera αcam imaging the complete surface of mirror 1. Practically,

$$r_{cam} = \max\left( r_{lens},\; r_{\alpha_{cam}}(r_{sys}) \right) \tag{78}$$

For both rigs, the expected vertical fields of view are $\alpha_{sys} = 75^\circ - (-21^\circ) \approx 96^\circ$ according to Equation (54), and $\alpha_{SROI} = 14^\circ - (-14^\circ) \approx 28^\circ$ using Equation (55). Note that $\theta_{2,max}$ may actually be limited by the camera hole radius, so in reality $\theta_{cam} \approx 59^\circ$ and $\alpha_{sys} \approx 80^\circ$. For the big rig, Table 3 shows the nearest vertices of the SROI that result from these angles (Figure 6).

Table 3.

Near Vertices of the SROI for the Big Rig.

Vertex      ρ_w [mm] (frame C)    z_w [mm] (frame C)
P_ns,high   93.5                  144.4
P_ns,mid    65.2                  98.4
P_ns,low    763.4                 -170.3

4.2.1. Optimality of Parameters k1 and k2

Finally, we study the effect parameter $k_i$ has over the system radius $r_{sys}$ (Figure 9), the omnistereo baseline $b$ (Figure 10), and the spatial resolution (Figure 11 and Figure 12). Figure 9 addresses the relation between $k_i$ and radius $r_{sys}$ (recall the rig size specified in Section 2.3). In Figure 11, it can be seen that for the same $r_{sys}$, realistic values for $k_1$ fall in the range $3 < k_1 < 13$, and the vertical field of view $\alpha_1 \to 0$ as $k \to 2$, which is expected according to the SVP property specified in Section 2.2. In fact, the left part of Figure 11 also demonstrates the necessary $r_{sys}$ to maintain $\alpha_{SROI} \approx 28^\circ$ for various values of $k_i$.

Figure 10 shows the inverse relationship between values of k1 and the baseline, b, as we attempt to fit the view of a wider/narrower mirror profile (due to k1) on the constant camera field of view, αcam. In order to make a fair comparison, let

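$$k_1' = k_1 + \varepsilon_k, \quad \forall\, k_1 > 2,\ \varepsilon_k > 0$$

for which we find its new focal length $c_1'$ while solving for the new $r_{sys}$ and $z_{max}$. Provided with a function such that $c_1 \leftarrow f_{c_1}(k_1)$, we perform the analysis for a given $\alpha_{SROI}$ and $\alpha_{cam}$ shown in Figure 10. Given the baseline function $f_b$ defined in Equation (64), the following implication holds true:

$$f_b\big(c_1 \leftarrow f_{c_1}(k_1)\big) > f_b\big(c_1 \leftarrow f_{c_1}(k_1 + \varepsilon_k)\big), \quad \forall\, k_1 > 2,\ \varepsilon_k > 0 \tag{79}$$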

Notice that k2, c2 and d are kept constant through this last analysis, and we ignore possible occlusions from the reflex mirror fixed at d/2.

4.2.2. Spatial Resolution Optimality

In this section, we compare the sensor’s spatial resolution, ηi, defined in Section 3.4 for the optimal parameters listed in Table 1 (for the big rig, only). In Figure 12, we verify how both resolutions η1 and η2 increase towards the equatorial region according to the spatial resolution theory presented in [29]. Indeed, the increase in spatial resolution within the SROI that covers the equatorial region (as indicated in Figure 6) justifies our model’s coaxial configuration intended for omnistereo applications.

In Figure 11, we compare the effect on $\eta_i$ for various mirror profiles, which depend directly on $k_i$. We illustrate the change in curvature due to parameters $k_1$ and $k_2$ and also show (in the legend) the respective $r_{sys}$ achieving a common vFOV of $\alpha_{SROI} \approx 28^\circ$, as for the optimal parameters of the big rig. From this plot, we appreciate the compromise due to the optimal parameters, $k_1$ (Opt.) = 5.7 and $k_2$ (Opt.) = 9.7, for a realistic system size due to $r_{sys}$ and a suitable range of spatial resolutions, $\eta_i$, within the SROI.

4.3. Prototypes

We validate our design with both synthetic and real-life models.

4.3.1. Synthetic Prototype (Simulation)

After converging to an optimal solution θ*, we employ these parameters (Table 1) to describe synthetic models using POV-Ray, an open-source ray-tracer. We render 3D scenes via the camera of the synthetic omnistereo sensor like the example shown in Figure 2b. The simulation stage plays two important roles in our investigation:

  • (1)

    to acquire ground-truth 3D-scene information in order to evaluate the computed range by the omnistereo system (as explained in Section 5); and

  • (2)

    to provide an almost accurate geometrical representation of the model by discounting some real-life computer vision artifacts such as assembly misalignments, glare from the support tube (motivating the use of standoffs on the real prototype), as well as the camera’s shallow depth-of-field. All of these artifacts can affect the quality of the real-life results shown in Section 6.

4.3.2. Real-Life Prototypes

We have also produced two physical prototypes that can be installed on the Pelican quadrotor (made by Ascending Technologies [28]). Figure 13a shows the rig constructed with hyperboloidal mirrors of $r_{sys} = 37$ mm and a Logitech® HD Pro Webcam C910 camera capable of 2592 × 1944 pixel images at 15 to 20 FPS. We decided against using the acrylic glass tube to separate the mirrors at the specified $h_{sys}$ distance; instead, we constructed a lighter 3-standoff mount in order to avoid glare and cross-reflections. This support was designed in 3D-CAD and printed for assembly. The three areas of occlusion due to the 3 mm-wide standoffs are non-invasive for the purpose of omnidirectional sensing and can be ignored with simple masks during image processing. In fact, we stamped fiducial markers onto the vertical standoffs to aid with panorama generation (Section 5.1) and future calibration methods. To image the entire surface of mirror 1, we require a camera with a (minimum) field of view of $\alpha_{cam} > 31^\circ$, which is achieved by $r_{\alpha_{cam}} > 1.4$ mm. In practice, as noted by Equation (78), microlenses measure around $r_{lens} \approx 7$ mm. Therefore, we set $r_{cam} > 7$ mm as a safe specification to fit a standard microlens through the opening of mirror 2, as shown in Figure 3.

Recall that $m_{sys}$ is limited by the maximum 650 g payload that the AscTec Pelican quadrotor is capable of flying with (according to the manufacturer specifications [28]). The camera with lens weighs approximately 25 g. A cylindrical tube made of acrylic has an average density $\rho_{tub} \approx 1.18\ \mathrm{g \cdot cm^{-3}}$, whereas the mirrors machined out of brass have a density $\rho_{mir} \approx 8.5\ \mathrm{g \cdot cm^{-3}}$. Empirically, we verify a close estimate of the entire system's mass, such that $m_{sys} \approx 550$ g for the big rig and $m_{sys} \approx 150$ g for the small rig.

5. 3D Sensing from Omnistereo Images

Stereo vision from point correspondences on images at distinct locations is a popular method for obtaining 3D range information via triangulation. Techniques for image point matching are generally divided between dense (area-based scanning [32]) and sparse (feature description [33]) approaches. Due to parallax, the disparity in point positions for objects close to the vision system must be larger than for objects that are farther away. As illustrated in Figure 6, the nearsightedness of the sensor is determined mainly by the common observable space (a.k.a. SROI) acquired by the limiting elevation angles of the mirrors (Section 3.3). In addition, we will see next (Section 5.2) that the baseline b also plays a major role in range computation.

Due to our model’s coaxial configuration, we could scan for pixel correspondences radially between a given pair of warped images I1,I2 like in the approach taken by similar works such as [34]. However, it seems more convenient to work on a rectified image space, such as with panoramic images, where the search for correspondences can be performed using any of the various existing methods for perspective stereo views. Hence, we first demonstrate how these rectified panoramic images are produced (Section 5.1) and used for establishing point correspondences. Then, we proceed to study our triangulation method for the range computation from a given set of point correspondences (Section 5.2). Last, we show preliminary 3D point clouds as the outcome from such procedure.

5.1. Panoramic Images

Figure 14 illustrates how we form the respective panoramic image $\Xi_1$ out of its warped omnidirectional image $I_1$. As illustrated in Figure 7, $I_i$ is simply the region of interest out of the full image $I$ where projection occurs via mirror $i$. However, we can safely refer to $I$ because it will never be the case that projections via different mirrors overlap on the same pixel position ${}^{I}m$. In a few words, we obtain a panorama $\Xi_i$ by reverse-mapping each discretized 3D point $P_{cyl_i} \in S_{cyl_i}$ to its projected pixel coordinates ${}^{I}m$ on $I$ according to Section 3.2.

Figure 14.

An example for the formation of panoramic image Ξ1 out of the omnidirectional image I1 (showing only the masked region of interest on the back of image plane πimg1). Any particular ray, v1 indicated by its elevation and azimuth such as F1ψ1,θ1 that is directed towards the focus F1 must traverse the projection cylinder Scyl1 at point Pcyl1. More abstractly, the figure also shows how a pixel position Ξ1mα on the panoramic pixel space gets mapped from its corresponding pixel position I1mα via function hΞ1 defined in Equation (85). Although not up to scale, it’s crucial to notice the relative orientation between Scyl1 and the back of the projection plane πimg1 where the omnidirectional image I1 is found.

More thoroughly, for $i = \{1,2\}$, $S_{cyl_i}$ is the set of all valid 3D points $P_{cyl_i}$ that lie on an imaginary unit cylinder centered along the Z-axis and positioned with respect to the mirror's primary focus $F_i$. Recall that the radius of a unit cylinder is $r_{cyl} = 1$, so its circumference becomes $w_{cyl} = 2\pi r_{cyl} = 2\pi$. Notice that the imaging ratio, $\chi_{I_{1:2}} = h_{I_1} / h_{I_2}$, illustrated in Figure 7 provides a way of inferring the scale between pairs of point correspondences. However, we achieve conforming scales among both panoramic representations by simply setting both cylinders to an equal height $h_{cyl}$, which is determined from the system's elevation limits, $(\theta_{sys,min}, \theta_{sys,max})$, since they partake in the measurement of the system's vertical field of view given by Equation (54). Hence, we obtain

$$h_{cyl} = z_{cyl,max} - z_{cyl,min}, \quad \text{where } z_{cyl,max} = \tan\theta_{sys,max},\ z_{cyl,min} = \tan\theta_{sys,min} \tag{80}$$

Consequently, to achieve panoramic images Ξi of the same dimensions by maintaining a true aspect ratio wΞ:hΞ, it suffices to indicate either the width (number of columns) wΞ or the height (number of rows) hΞ as number of pixels. Here, we propose a custom method for resolving the panoramic image dimensions by setting the equality for the length lpx of an individual “square” pixel in the cylinder (behaving like a panoramic camera sensor):

$$l_{px} = \frac{w_{cyl}}{w_\Xi} = \frac{h_{cyl}}{h_\Xi} \tag{81}$$

For instance, if the width wΞ is given, then the height is simply hΞ=wΞhcyl/wcyl.
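The sizing rule of Equations (80) and (81) can be sketched as follows. The function names are hypothetical, the cylinder is assumed to be the unit cylinder, and the row count is rounded to the nearest integer, which is a choice made here for illustration rather than something the text specifies.

```python
import math


def cylinder_height(theta_max_deg: float, theta_min_deg: float) -> float:
    """Equation (80) for a unit cylinder: h_cyl = tan(theta_max) - tan(theta_min)."""
    return math.tan(math.radians(theta_max_deg)) - math.tan(math.radians(theta_min_deg))


def panorama_height(width_px: int, h_cyl: float, w_cyl: float = 2.0 * math.pi) -> int:
    """Equation (81): rows that keep pixels square, h = w * h_cyl / w_cyl (rounded)."""
    return round(width_px * h_cyl / w_cyl)


def pixel_length(width_px: int, w_cyl: float = 2.0 * math.pi) -> float:
    """Side length l_px of one square pixel on the cylinder surface."""
    return w_cyl / width_px
```

For example, `panorama_height(1280, cylinder_height(75.0, -21.0))` gives the row count for a 1280-column panorama spanning the elevation limits quoted in Section 4.2.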

To increase the processing speed for each panoramic image Ξi, we fill up its corresponding look-up-table LUTΞi of size wΞ×hΞ that encodes the mapping for each panoramic pixel coordinates Ξim=Ξi[u,v]T to its respective projection Iim=Ii[u,v]T on the distorted image Ii. Each pixel Ξim gets associated with its cylinder’s 3D point positioned at Fipcyli, which can inherently be indicated by its elevation Fiθi and azimuth Fiψi (relative to the mirror’s primary focus Fi) as illustrated in Figure 4. Thus, the ray Fivi of a particular 3D point directed about Fiψi,θi must pass through Pcyli in order to get imaged as pixel Imi.

Since the circumference of the cylinder, wcyl, is discretized with respect to the number of pixel columns or width wΞ, we use the pixel length lpx as the factor to obtain the arc length lψi spanned by the azimuth Fiψi out of a given Ξiu coordinate on the panoramic image. Generally,

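$${}^{F_i}\psi_i = \frac{l_{\psi_i}}{r_{cyl}} = \frac{w_{cyl} - {}^{\Xi_i}u\, l_{px}}{r_{cyl}} \tag{82}$$

or simply ${}^{F_i}\psi_i = 2\pi - {}^{\Xi_i}u\, l_{px}$ for the unit cylinder case.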

An order reversal in the columns of the panorama is performed by Equation (82) because we account for the relative position between Scyli and the projection plane πimg. For Ξ1, Figure 14 depicts the unrolling of the cylindrical panoramic image onto a planar panoramic image. However, note that πimg is shown from above (or its back) in Figure 14, so the panorama visualization places the viewer inside the cylinder at F1.

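Similarly, the elevation angle ${}^{F_i}\theta_i$ is inferred from the row, or ${}^{\Xi_i}v$ coordinate, which is scaled to its cylindrical representation by $l_{px}$. Recall that both cylinders have the same height, $h_{cyl}$, computed by Equation (80). By taking into account any row offset from the maximum height position, ${}^{F_i}z_{cyl,max}$, of the cylinder, we get

$${}^{F_i}\theta_i = \arctan\left( {}^{F_i}z_{cyl,max} - {}^{\Xi_i}v\, l_{px} \right) \tag{83}$$

Given these angles and assuming coaxial alignment, we evaluate the position vector ${}^{C}\mathbf{p}_{cyl_i}$ for a point on the panoramic cylinder with respect to the camera frame $C$:

$${}^{C}\mathbf{p}_{cyl_i} = r_{cyl} \begin{bmatrix} \cos {}^{F_i}\psi_i \\ \sin {}^{F_i}\psi_i \\ \tan {}^{F_i}\theta_i \end{bmatrix} + {}^{C}\mathbf{f}_i \tag{84}$$

where $r_{cyl}$ drops out for a unit cylinder. Equations (82) and (83) lead to Equation (84) as the process ${}^{\Xi_i}m \xrightarrow{\text{Equations (82), (83)}} {}^{F_i}(\psi, \theta) \xrightarrow{\text{Equation (84)}} {}^{C}\mathbf{p}_{cyl_i}$, whose result is eventually used as the input argument to Equation (36) in order to determine pixel ${}^{I_i}m$ via the mapping function $h_{\Xi_i}: \mathbb{R}^2 \to \mathbb{R}^2$,

$${}^{I_i}m \leftarrow h_{\Xi_i}({}^{\Xi_i}m) := f_{\varphi_i}\left( {}^{C}\mathbf{p}_{cyl_i} \leftarrow {}^{\Xi_i}m \right) \tag{85}$$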
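The first two stages of this mapping, from a panoramic pixel to its direction angles and then to a point on the cylinder, can be sketched as below for the unit cylinder. The function names are hypothetical, and the final projection onto the omnidirectional image (Equation (36)) is not reproduced because it is defined outside this section.

```python
import math


def panorama_pixel_to_angles(u: float, v: float, width_px: int, z_cyl_max: float):
    """Equations (82) and (83) on a unit cylinder: pixel (u, v) -> (azimuth, elevation)."""
    w_cyl = 2.0 * math.pi            # circumference of the unit cylinder
    l_px = w_cyl / width_px          # side of one square pixel, Equation (81)
    psi = w_cyl - u * l_px           # column order is reversed, Equation (82)
    theta = math.atan(z_cyl_max - v * l_px)  # row offset from the top, Equation (83)
    return psi, theta


def cylinder_point(psi: float, theta: float, focus):
    """Equation (84) with r_cyl = 1: point on the cylinder in the camera frame C."""
    fx, fy, fz = focus               # position of the mirror focus F_i in frame C
    return (math.cos(psi) + fx, math.sin(psi) + fy, math.tan(theta) + fz)
```

In a look-up-table implementation, these two functions would be evaluated once per panoramic pixel and the resulting point passed to the projection function of the corresponding mirror.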

5.1.1. Stereo Matching on Panoramas

We understand that the algorithm chosen for finding matches is crucial to attain correct pixel disparity results. We refer the reader to [35] for a detailed survey of stereo correspondence methods. After comparing various block matching algorithms, we were able to obtain acceptable disparity maps with the semi-global block matching (SGBM) method introduced by [36], which can find subpixel matches in real time. As a result of this stereo block matcher among the pair of panoramic images Ξ1,Ξ2, we get the dense disparity map ΞΔm12 visualized as an image in Figure 15 and Figure 21a. Note that valid disparity values must be positive (Δm12Ξim1>0) and they are given with respect to the reference image, in this case, Ξ1. In addition, recall that no stereo matching algorithm (as far as we are aware) is totally immune to mismatches due to several well-known reasons in the literature such as ambiguity of cyclic patterns.

Figure 15.

For the synthetic omnidirectional image I shown in Figure 2b, we generate its pair of panoramic images Ξ1,Ξ2 using the procedure explained in Section 5.1. Note that we only work on the SROI (shown here) to perform a semi-global block match between the panoramas as indicated in Section 5.1.1. The resulting disparity map, ΞΔm12, is visualized at the bottom as a gray-scale panoramic image normalized about its 256 intensity levels, where brighter colors imply larger disparity values. To distinguish the relative vertical view of both panoramas, we have annotated the row position of the zero-elevation.

Figure 21.

Real-life experiment using the 37 mm-radius prototype and a single 2592 × 1944 pixel image, where the rig was positioned in the middle of the room observed in Figure 13a. Some landmarks of the scene are annotated as follows: Ⓐ appliances, Ⓑ monitors and shelf, Ⓒ back wall, Ⓓ chair, Ⓔ monitors and shelf, Ⓕ book, Ⓖ monitors, Ⓗ person, Ⓘ hallway, Ⓙ supplies. For the point cloud, the grid size is 0.50 m in all directions and points are thickened for clarity.

An advantage of the block (window) search for correspondences is that it can be narrowed along epipolar lines. Unlike the traditional horizontal stereo configuration, our system captures panoramic images whose views differ in a vertical fashion. As shown in [14], the unwrapped panoramas contain vertical, parallel epipolar lines that facilitate the pixel correlation search. Thus, given a pixel position Ξim1 on the reference panorama Ξ1 and its disparity value Δm12Ξim1, we can resolve the correspondence Ξ2m2 pixel coordinate on the target image, Ξ2, by simply offsetting the v-coordinate with the disparity value:

$${}^{\Xi_2}m_2 = \begin{bmatrix} u_1 \\ v_1 + \Delta m_{12}\big|_{{}^{\Xi_1}m_1} \end{bmatrix} \tag{86}$$
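The vertical offset of Equation (86) amounts to a one-line lookup. The sketch below uses a hypothetical function name and treats non-positive disparities as invalid, following the statement above that valid disparity values must be positive.

```python
def corresponding_pixel(u1: float, v1: float, disparity: float):
    """Equation (86): matching pixel on the target panorama, or None if the disparity is invalid."""
    if disparity <= 0:
        return None                   # no valid match for this reference pixel
    return (u1, v1 + disparity)       # same column, row shifted by the disparity
```

With a subpixel matcher, the returned row coordinate is generally not an integer, so any later lookup on the target panorama would need interpolation or rounding.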

5.2. Range from Triangulation

Recall the duality that states a point $P_w$ as the intersection of a pair of lines. Regardless of the correspondence search technique employed, such as block stereo matching between panoramas $\Xi_i$ (Section 5.1.1) or feature detection directly on $I$, we can resolve the correspondence ${}^{I}(m_1, m_2)$. From Equations (42) and (49), we obtain the respective pair of back-projected rays $({}^{F_1}\mathbf{v}_1, {}^{F_2}\mathbf{v}_2)$, emanating from their respective physical viewpoints, $F_1$ and $F_2$, which are separated by baseline $b$. We can compute elevation angles $\theta_1$ and $\theta_2$ using Equations (43) and (50). Then, we can triangulate the back-projected rays in order to calculate the horizontal range $\rho_w$ defined in Equation (22), as follows:

$$\rho_w = \frac{b \cos(\theta_1) \cos(\theta_2)}{\sin(\theta_1 - \theta_2)} \tag{87}$$

Finally, we obtain the 3D position of Pw:

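$${}^{C}\mathbf{p}_w = \begin{bmatrix} \rho_w \cos(\psi_{12}) \\ \rho_w \sin(\psi_{12}) \\ c_1 - \rho_w \tan(\theta_1) \end{bmatrix} \tag{88}$$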

where ψ12 is the common azimuthal angle (on the XY-plane) for coplanar rays, so it can be determined either by Equation (44) or Equation (51). Functionally, we define the “naive” intersection function that implements Equations (87) and (88) such that

$${}^{C}\mathbf{p}_w \leftarrow f_\Delta\big((\theta_1,\psi_1),(\theta_2,\psi_2),\theta\big) \tag{89}$$

where θ is the model parameters vector defined in Equation (4) and can be omitted when calling this function because the model parameters should not change (ideally).
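The coplanar ("naive") intersection can be illustrated with a short sketch. The function name is hypothetical. The sketch assumes that both elevations are measured positive upward from the horizontal plane through each viewpoint, with $F_1$ located a baseline $b$ above $F_2$ on the common axis. The sign convention actually used in the paper is fixed by Equations (43) and (50), which are not reproduced here, so the signs below may differ from those of Equations (87) and (88).

```python
import math


def triangulate_coaxial(theta1, theta2, psi, z_f1, baseline):
    """Intersect two coplanar rays leaving F1 (at height z_f1) and F2 (at z_f1 - baseline).

    Assumed convention: elevations in radians, positive above the horizontal.
    Returns the point (x, y, z) in the frame whose Z-axis is the common axis.
    """
    denom = math.tan(theta2) - math.tan(theta1)
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; range is unbounded")
    rho = baseline / denom            # horizontal range from the common axis
    return (rho * math.cos(psi), rho * math.sin(psi), z_f1 + rho * math.tan(theta1))


# Round trip with the big-rig values: c1 and b from Table 1, vertex P_ns,mid from Table 3.
C1, B = 123.49, 131.61
RHO, Z = 65.2, 98.4
t1 = math.atan2(Z - C1, RHO)          # elevation seen from F1
t2 = math.atan2(Z - (C1 - B), RHO)    # elevation seen from F2
x, y, z = triangulate_coaxial(t1, t2, 0.0, C1, B)
```

Since $\tan\theta_2 - \tan\theta_1 = \sin(\theta_2 - \theta_1) / (\cos\theta_1 \cos\theta_2)$, the range computed here has the same closed form as Equation (87) up to the sign convention of the angles.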

5.2.1. Common Perpendicular Midpoint Triangulation Method

Because the coplanarity of these rays cannot be guaranteed (skew rays case), a better triangulation approximation while considering coaxial misalignments is to find the midpoint of their common perpendicular line segment (as attempted in [23]). As illustrated in Figure 16, we define the common perpendicular line segment G1G2¯ as the parametrized vector v12=λ12v^12, for the unit vector normal to the back-projected rays, v1 and v2, such that:

$$\hat{\mathbf{v}}_{12} = \frac{\mathbf{v}_1 \times \mathbf{v}_2}{\|\mathbf{v}_1 \times \mathbf{v}_2\|} \tag{90}$$

Figure 16.

The more realistic case of skew back-projection rays (v1,v2) approximates the triangulated point Pw by getting the midpoint PwG on the common perpendicular line segment G1G2¯:λ12v^12. Note that the visualized skew rays were formed from a pixel correspondence pair I(m1,m2) and by offsetting the coordinate u2 by 15 pixels.

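If the rays are not parallel ($\|\mathbf{v}_1 \times \mathbf{v}_2\| \neq 0$), we can compute the "exact" solution, $\boldsymbol{\lambda} = [\lambda_{G_1}, \lambda_{G_2}, \lambda_{12}]^T$, of the well-determined linear matrix equation

$$\mathbf{V}\boldsymbol{\lambda} = \mathbf{b}, \quad \text{where } \mathbf{V} = \left[\mathbf{v}_1, -\mathbf{v}_2, \hat{\mathbf{v}}_{12}\right] \text{ and } \mathbf{b} = {}^{C}\mathbf{f}_2 - {}^{C}\mathbf{f}_1 \tag{91}$$

It follows that the location of the midpoint $P_{w_G}$ on the common perpendicular $\mathbf{v}_{12}$ with respect to the common frame $C$ is

$${}^{C}\mathbf{p}_{w_G} = {}^{C}\mathbf{f}_1 + \lambda_{G_1}\,{}^{F_1}\mathbf{v}_1 + \frac{1}{2}\lambda_{12}\,{}^{G_1}\hat{\mathbf{v}}_{12} \tag{92}$$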
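The midpoint method can be sketched in a few lines of plain Python. The helper names are hypothetical, and the 3×3 system $[\mathbf{v}_1, -\mathbf{v}_2, \hat{\mathbf{v}}_{12}]\boldsymbol{\lambda} = \mathbf{f}_2 - \mathbf{f}_1$ is solved here with Cramer's rule purely for compactness; any linear solver would do.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def det3(c0, c1, c2):
    """Determinant of the 3x3 matrix whose columns are c0, c1, c2."""
    return dot(c0, cross(c1, c2))


def midpoint_triangulation(f1, v1, f2, v2):
    """Midpoint of the common perpendicular between rays f1 + s*v1 and f2 + t*v2."""
    n = cross(v1, v2)
    norm = math.sqrt(dot(n, n))
    if norm < 1e-12:
        raise ValueError("rays are parallel")
    n_hat = tuple(x / norm for x in n)              # unit normal to both rays
    neg_v2 = tuple(-x for x in v2)
    b = tuple(q - p for p, q in zip(f1, f2))        # right-hand side f2 - f1
    d = det3(v1, neg_v2, n_hat)
    lam_g1 = det3(b, neg_v2, n_hat) / d             # position of G1 along ray 1
    lam_12 = det3(v1, neg_v2, b) / d                # signed length of segment G1G2
    return tuple(f + lam_g1 * v + 0.5 * lam_12 * h for f, v, h in zip(f1, v1, n_hat))
```

For rays that actually intersect, the segment length is zero and the function returns the intersection point itself.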

5.2.2. Range Variation

Before we introduce an uncertainty model for triangulation (Section 5.3), we briefly analyze how range varies according to the possible combinations of pixel correspondences, ${}^{I}(m_1, m_2)$, on the image $I$. Here, we demonstrate how a radial variation of discretized pixel disparities, $\Delta m_{12}$, affects the 3D position of a point obtained from triangulation (Section 5.2). Figure 17 demonstrates the nonlinear characteristics of the variation in horizontal range, $\Delta\rho_w$, from the discrete relation between pixel positions ${}^{I}m_i$ and their respective back-projected (direction) rays obtained from $f_{\beta_i}$ and triangulated via function $f_\Delta$ defined in Equation (89). It can be observed that the horizontal range variation, $\Delta\rho_w$, increases quadratically as $\Delta m_{12} \to 1$ px, the minimum discrete pixel disparity, which provides a maximum horizontal range $\rho_{w,max} \approx 18.28$ m (computed analytically). The main plot of Figure 17 shows the small disparity values in the interval $\overline{\Delta m_{12}} = [1, 20]$ px, whereas the subplot is a zoomed-in extension of the large disparity cases in the interval $\overline{\Delta m_{12}} = [20, 100]$ px.

The current analysis indicates that triangulation error (e.g., due to false pixel correspondences) may have a severe effect on range accuracy that increases quadratically with distance, as can be appreciated from the 8 m variation over the disparity interval $\overline{\Delta m_{12}} = [1, 2]$ px. Also, observe the example of Figure 20 for a reconstructed point cloud, where this range sensing characteristic is more noticeable for faraway points. In fact, the following uncertainty model provides a probabilistic framework for the triangulation error (uncertainty) that agrees with the current numerical claims.

Figure 20. A 3D dense point cloud computed out of the synthetic model that rendered the omnidirectional image shown in Figure 2b. Pixel correspondences are established via the panoramic depth map visualized in Figure 15. The 3D point triangulation implements the common perpendicular midpoint method indicated in Section 5.2.1. The position of the omnistereo sensor mounted on the quadrotor is annotated as frame $C$ with respect to the scene’s coordinate frame $S$. (a) 3D visualization of the point cloud (the quadrotor with the omnistereo rig has been added for visualization only); (b) Orthographic projection of the point cloud to the XY-plane of the visualization grid.

5.3. Triangulation Uncertainty Model

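Let $\mathbf{f}_{P_w}$ be the vector-valued function that computes the 3D coordinates of point $P_{w_G}$ with respect to $C$ as the common perpendicular midpoint defined in Equation (92). We express this triangulation function component-wise as follows:

$${}^{C}\mathbf{p}_{w_G} \leftarrow \mathbf{f}_{P_w}(\mathbf{m}_{12}) := \begin{bmatrix} f_{x_w}(\mathbf{m}_{12}) \\ f_{y_w}(\mathbf{m}_{12}) \\ f_{z_w}(\mathbf{m}_{12}) \end{bmatrix} \tag{93}$$

where $\mathbf{m}_{12} = [u_1, v_1, u_2, v_2]$ is composed of the pixel coordinates of the correspondence ${}^{I}(\mathbf{m}_1, \mathbf{m}_2)$ upon which to base the triangulation (Section 5.2).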

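Without loss of generality, we adopt a multivariate Gaussian uncertainty model for triangulation, so that the position vector ${}^{C}\mathbf{p}_{w_G}$ of any world point is centered at its mean ${}^{C}\boldsymbol{\mu}_{f_{P_w}}$ with a $3 \times 3$ covariance matrix $\boldsymbol{\Sigma}_{f_{P_w}}$:

$${}^{C}\boldsymbol{\mu}_{f_{P_w}} = \begin{bmatrix} x_w \\ y_w \\ z_w \end{bmatrix}, \quad \boldsymbol{\Sigma}_{f_{P_w}} = \begin{bmatrix} \sigma_{f_{x_w}}^2 & \sigma_{f_{x_w}}\sigma_{f_{y_w}} & \sigma_{f_{x_w}}\sigma_{f_{z_w}} \\ \sigma_{f_{x_w}}\sigma_{f_{y_w}} & \sigma_{f_{y_w}}^2 & \sigma_{f_{y_w}}\sigma_{f_{z_w}} \\ \sigma_{f_{x_w}}\sigma_{f_{z_w}} & \sigma_{f_{y_w}}\sigma_{f_{z_w}} & \sigma_{f_{z_w}}^2 \end{bmatrix} \tag{94}$$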

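However, since $\mathbf{f}_{P_w}$ is a non-linear vector-valued function, we linearize it with a first-order Taylor expansion and use its Jacobian matrix to propagate the uncertainty (covariance) as in the linear case:

$$\boldsymbol{\Sigma}_{f_{P_w}} = \mathbf{J}_{f_{P_w}}\,\boldsymbol{\Omega}_{m_{12}}\,\mathbf{J}_{f_{P_w}}^{T} \tag{95}$$

where the $3 \times 4$ Jacobian matrix for the triangulation function is

$$\mathbf{J}_{f_{P_w}} = \begin{bmatrix} \dfrac{\partial f_{x_w}}{\partial u_1} & \dfrac{\partial f_{x_w}}{\partial v_1} & \dfrac{\partial f_{x_w}}{\partial u_2} & \dfrac{\partial f_{x_w}}{\partial v_2} \\ \dfrac{\partial f_{y_w}}{\partial u_1} & \dfrac{\partial f_{y_w}}{\partial v_1} & \dfrac{\partial f_{y_w}}{\partial u_2} & \dfrac{\partial f_{y_w}}{\partial v_2} \\ \dfrac{\partial f_{z_w}}{\partial u_1} & \dfrac{\partial f_{z_w}}{\partial v_1} & \dfrac{\partial f_{z_w}}{\partial u_2} & \dfrac{\partial f_{z_w}}{\partial v_2} \end{bmatrix} \tag{96}$$

and the $4 \times 4$ covariance matrix of the pixel arguments is

$$\boldsymbol{\Omega}_{m_{12}} = \sigma_{px}^2\,\mathbf{I}_4 \tag{97}$$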

where we assume $\sigma_{px} = 1$ px for the standard deviation of each pixel coordinate in the discretized pixel space. The complete symbolic solution of $\boldsymbol{\Sigma}_{f_{P_w}}$ is too involved to appear in this manuscript. However, in Figure 18, we show the top view of the covariance ellipsoid drawn at a three-$\sigma_{f_{P_w}}$ level for a point triangulated at a range of $\rho_w \approx 100$ mm. Figure 19 visualizes uncertainty ellipsoids drawn at a one-$\sigma_{f_{P_w}}$ level for several triangulation ranges. We refer the reader to the end of Section 6.3, where we validate the safety of this 1-pixel deviation assumption through experimental results using subpixel precision.
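Since the symbolic covariance is not reproduced here, the propagation of Equations (95)–(97) can at least be sketched numerically. The following is an illustration under stated assumptions only: the Jacobian is approximated with central differences rather than derived symbolically, and `toy_triangulation` is a hypothetical rectified pinhole-stereo stand-in (with made-up `focal_px` and `baseline_m` values), not the omnistereo function $\mathbf{f}_{P_w}$ of Equation (93).

```python
import math


def numerical_jacobian(f, m, h=1e-6):
    """Central-difference Jacobian of f: R^n -> R^k evaluated at m."""
    k = len(f(m))
    jac = [[0.0] * len(m) for _ in range(k)]
    for c in range(len(m)):
        m_plus, m_minus = list(m), list(m)
        m_plus[c] += h
        m_minus[c] -= h
        f_plus, f_minus = f(m_plus), f(m_minus)
        for r in range(k):
            jac[r][c] = (f_plus[r] - f_minus[r]) / (2.0 * h)
    return jac


def propagate_covariance(jac, sigma_px=1.0):
    """Sigma = J * (sigma_px^2 * I) * J^T, i.e. sigma_px^2 * J * J^T."""
    s2 = sigma_px ** 2
    return [[s2 * sum(a * b for a, b in zip(row_i, row_j)) for row_j in jac]
            for row_i in jac]


def toy_triangulation(m, focal_px=500.0, baseline_m=0.1):
    """Hypothetical stand-in for f_Pw: rectified pinhole stereo, NOT the
    omnistereo model. Input m = [u1, v1, u2, v2], output [x, y, z] in meters."""
    u1, v1, u2, v2 = m
    disparity = u1 - u2
    z = focal_px * baseline_m / disparity
    return [u1 * z / focal_px, 0.5 * (v1 + v2) * z / focal_px, z]


if __name__ == "__main__":
    for disparity in (20.0, 10.0, 5.0):
        m12 = [100.0 + disparity, 50.0, 100.0, 50.0]
        jac = numerical_jacobian(toy_triangulation, m12)
        cov = propagate_covariance(jac, sigma_px=1.0)
        depth = toy_triangulation(m12)[2]
        print(f"z = {depth:.2f} m, sigma_z = {math.sqrt(cov[2][2]):.3f} m")
```

For this stand-in model the depth standard deviation grows with the square of the range, which is qualitatively the behavior discussed in Section 5.2.2; the numbers themselves say nothing about the actual sensor.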

Figure 18. Top view of the three-sigma level ellipsoid for the triangulation uncertainty of a pixel pair ${}^{I}(\mathbf{m}_1, \mathbf{m}_2)$ with an assumed standard deviation $\sigma_{px} = 1$ px.

Figure 19. Uncertainty ellipsoids for triangulated points at ranges $\rho_w \in \{0.3, 0.5, 1.0\}$ m.

6. Experiment Results

In this section, we demonstrate the capabilities of the omnistereo sensor to provide 3D information, either as dense point clouds or for the registration of sparse 2D features and 3D points. We also evaluate the precision of both projection and triangulation of a few detected corners from a chessboard whose various 3D poses are given as ground truth.

6.1. Dense 3D Point Clouds

By implementing the process described in Section 5, we begin by visualizing the dense point cloud obtained from the omnidirectional synthetic image given in Figure 2b, whose actual size is 1280 × 960 pixels. The associated panoramic images, $\Xi_i$, were obtained using function $\mathbf{h}_{\Xi_i}$ defined in Equation (85) and are shown in Figure 15. Pixel correspondences $({}^{\Xi_1}\mathbf{m}_1, {}^{\Xi_2}\mathbf{m}_2)$ on the panoramic representations are mapped via $\mathbf{h}_{\Xi_i}$ into their respective image positions ${}^{I}(\mathbf{m}_1, \mathbf{m}_2)$. Then, these are triangulated with ${}^{C}\mathbf{f}_{P_w}$ given in Equation (93), resulting in the set (cloud) of colored 3D points $P_{\Delta}$ visualized in Figure 20. Here, the synthetic scene (Figure 2a) is a room 5.0 m wide (along its X-axis), 8.0 m long (along its Y-axis), and 2.5 m high (along its Z-axis). With respect to the scene center of coordinates, $S$, the catadioptric omnistereo sensor, $C$, is positioned at ${}^{S}_{C}\mathbf{t} = [1.60, 2.85, 0.16]^T$ in meters.

We also present results from a real experiment using the prototype described in Section 4.3.2 and shown in Figure 13a. The panoramic images and dense point cloud shown in Figure 21 are obtained by implementing the pertinent functions described throughout this manuscript and by holding the SVP assumption of an ideal configuration. We provide these qualitative results as a preliminary proof of concept for the proposed sensor after employing a calibration procedure based on the generalized unified model proposed in [37].

6.2. Sparse 3D Points from Features

Figure 22 shows 44 correct matches, obtained with the SURF feature detector and descriptors [38], that are triangulated with Equation (93). Sparse 3D points can be useful for visual odometry applications, where the sensor changes pose and the registered point features can be matched against new images. Please refer to [39] for a tutorial on visual odometry.

Figure 22. Sparse point correspondences for the real-life image from Figure 13b. Point correspondences are identifiable by random colors that persist in both the panoramic image and the respective triangulated 3D points (scaled up for visualization).

6.3. Triangulation Evaluation

6.3.1. Evaluation of Synthetic Rig

Due to the unstructured nature of the dense point clouds previously discussed, we proceed to triangulate sets of sparse 3D points whose positions with respect to the omnistereo sensor camera frame, $C$, are known in advance. We synthesize a calibration chessboard pattern $G$ containing $m \times n$ square cells for various predetermined poses ${}^{C}_{G}\mathbf{T}_h$. Since the sensor is assumed to be rotationally symmetric, it suffices to experiment with groups of $L = 4$ chessboard patterns situated at a given horizontal range. A total of $L \cdot m \cdot n$ 3D points are available for each range group. Each corner point’s position ${}^{C}\mathbf{p}_j$ is found with respect to $C$ via the frame transformation ${}^{C}\mathbf{p}_{j,g} = {}^{C}_{G_g}\mathbf{T}_h\,{}^{G_g}\mathbf{p}_j$ for all indices $j \in \{1, \ldots, m \cdot n\}$, $g \in \{1, \ldots, L\}$.
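The frame transformation of the pattern corners into $C$ can be illustrated with a short sketch. The helper names are hypothetical, the corner layout (a planar grid on the pattern's $z = 0$ plane) is an assumption, and the pose is restricted to a yaw rotation plus a translation for brevity; the poses used in the experiment are not specified at this level of detail.

```python
import math


def chessboard_corners(rows, cols, cell_size):
    """Corner positions on the pattern plane (z = 0) in the pattern frame G."""
    return [(c * cell_size, r * cell_size, 0.0)
            for r in range(rows) for c in range(cols)]


def planar_pose(yaw_rad, translation):
    """4x4 homogeneous transform: rotation about Z by yaw, then translation.

    A simplified, assumed pose parameterization for illustration only.
    """
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    tx, ty, tz = translation
    return [[c, -s, 0.0, tx],
            [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


def transform_points(T, points):
    """Apply a 4x4 homogeneous transform to a list of 3D points."""
    out = []
    for x, y, z in points:
        ph = (x, y, z, 1.0)  # homogeneous coordinates
        out.append(tuple(sum(T[r][k] * ph[k] for k in range(4))
                         for r in range(3)))
    return out


if __name__ == "__main__":
    corners_G = chessboard_corners(rows=3, cols=4, cell_size=0.14)
    # L = 4 patterns placed around the sensor at a 2 m horizontal range.
    for g in range(4):
        yaw = g * math.pi / 2.0
        T = planar_pose(yaw, (2.0 * math.cos(yaw), 2.0 * math.sin(yaw), 0.0))
        corners_C = transform_points(T, corners_G)
        print(g, corners_C[0], corners_C[-1])
```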

Figure 23 shows the set of detected corner points on the image from the group of patterns set to a range of ${}^{C}\rho_G = 2$ m. We adjust the pattern’s cell sizes accordingly so its points can be safely discerned by an automated corner detector [35]. We systematically establish correspondences of pattern points on the omnidirectional image, and proceed to triangulate with Equation (93). For each range group of points, we compute the root-mean-square error (RMSE) of the 3D positions between the observed (triangulated) points ${}^{C}\tilde{\mathbf{p}}_j \leftarrow \mathbf{f}_{P_w}(\tilde{\mathbf{m}}_1, \tilde{\mathbf{m}}_2)$ and the true (known) points ${}^{C}\mathbf{p}_j$ that were used to describe the ray-traced image. Table 4 compiles the RMSE results and the standard deviation (SD) for groups of patterns whose frames $G_g$ are located at specified horizontal ranges ${}^{C}\rho_G \in [0.25, 8.0]$ m away from $C$.
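The error statistics for one range group can be computed as in the sketch below. It assumes that the per-point error is the Euclidean distance between a triangulated point and its ground-truth position, and that SD refers to the (population) standard deviation of those distances; the text does not state which SD convention was used, and the function names are hypothetical.

```python
import math


def position_errors(estimated, truth):
    """Euclidean distance between each estimated point and its ground truth."""
    return [math.dist(p, q) for p, q in zip(estimated, truth)]


def rmse(errors):
    """Root-mean-square of the per-point errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def std_dev(errors):
    """Population standard deviation of the per-point errors (assumed convention)."""
    mean = sum(errors) / len(errors)
    return math.sqrt(sum((e - mean) ** 2 for e in errors) / len(errors))


if __name__ == "__main__":
    # Made-up points in millimeters, purely to exercise the functions.
    truth = [(0.0, 0.0, 1000.0), (100.0, 0.0, 1000.0), (0.0, 100.0, 1000.0)]
    estimated = [(1.0, -2.0, 1003.0), (99.0, 1.0, 996.0), (0.5, 101.0, 1002.0)]
    errs = position_errors(estimated, truth)
    print(f"RMSE = {rmse(errs):.2f} mm, SD = {std_dev(errs):.2f} mm")
```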

Figure 23. Example of sparse point correspondences detected with subpixel precision from corners on the chessboard patterns around the omnistereo sensor. The size of the rendered images for this experiment is 1280 × 960 pixels. For this example’s patterns, the square cell size is 140 mm. The RMSE for this set of points at ${}^{C}\rho_G = 2$ m is approximately 15 mm (Table 4).

Table 4. Results of RMSE from the synthetic triangulation experiment.

Range ${}^{C}\rho_G$ [m]    RMSE [mm]    SD [mm]
0.25                        0.46         0.31
0.50                        1.20         0.71
1.0                         4.62         2.55
2.0                         14.85        9.06
4.0                         57.67        31.34
8.0                         219.09       129.92
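As a quick consistency check on the tabulated values, one can fit a power law, RMSE ≈ a·ρ^k, to the six rows of Table 4 with ordinary least squares in log-log space. This is only an illustrative post hoc fit (the function name is hypothetical and no such fit is reported in the analysis itself); for these rows the fitted exponent comes out a little under 2, in line with the near-quadratic growth of range error discussed in Section 5.2.2.

```python
import math

# Values copied from Table 4.
RANGES_M = [0.25, 0.50, 1.0, 2.0, 4.0, 8.0]
RMSE_MM = [0.46, 1.20, 4.62, 14.85, 57.67, 219.09]


def fit_power_law(x, y):
    """Least-squares fit of y = a * x**k in log-log space; returns (a, k)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    k = sxy / sxx
    a = math.exp(mean_y - k * mean_x)
    return a, k


if __name__ == "__main__":
    coefficient, exponent = fit_power_law(RANGES_M, RMSE_MM)
    print(f"RMSE [mm] ~ {coefficient:.2f} * range[m] ** {exponent:.2f}")
```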

We notice that, for all the 3D points in the synthetic patterns, we obtained an average error of 0.1 px with a standard deviation $\tilde{\sigma}_{px} = 0.05$ px for the subpixel detection of corners on the image versus their theoretical values obtained from $\mathbf{f}_{\varphi_i}$ defined in Equation (36). Because this measured detection error is well below one pixel, this last experiment supports the pessimistic choice of $\sigma_{px} = 1$ px for the discrete pixel space in the triangulation uncertainty model proposed in Section 5.3.

6.3.2. Evaluation of Real-Life Rig

The following experiment uses $L = 5$ different poses of a real chessboard pattern with $5 \times 8$ corner points, where the square cell size is 24 mm. As done in Section 6.3.1, the evaluated error is the Euclidean norm of the difference between the triangulated points and the ground-truth positions of the chessboard poses captured via a motion capture system. The RMSE for all projected points in this set of chessboard patterns is 2.5 pixels with a standard deviation of 1.5 pixels. The RMSE for all triangulated points in this set is 3.5 mm with a standard deviation of 1.4 mm. Figure 24 visually confirms the proximity of the triangulated chessboard poses to the ground-truth pose information.

Figure 24. Visualization of estimated 3D poses for some chessboard patterns using the real-life omnistereo rig. Color annotations: ground-truth poses (green), estimated triangulated poses (red).

7. Discussion and Future Work

The portable aspect of the proposed omnistereo sensor is one of its greatest advantages, as discussed in the introduction section. The total weight of the big rig using 37 mm-radius mirrors is about 550 g, so it can be carried by the AscTec Pelican quadrotor within its payload limit of 650 g. The mirror profiles maximize the stereo baseline while obeying the various design constraints such as size and field of view. Currently, the mirrors are custom-manufactured out of brass using CNC machining. However, it is possible to reduce the system’s weight dramatically by employing lighter materials.

In reality, it is almost impossible to assemble a perfect imaging system that fulfills the SVP assumption and avoids the triangulation uncertainty studied in Section 5.3 on top of the error already introduced by any feature matching technique. The coaxial misalignment of the folded mirrors-camera system, defocus blur of the lens, and the inauspicious glare from the support tube are all practical caveats we need to overcome for better 3D sensing tasks. As described in the text for the real-life rig, we have avoided the traditional use of a support cylinder in order to work around the cross-reflection and glare issues. Possible vibrations caused by the robot dynamics are reduced by vibration pads placed on the sensor-body interface. Details about our tentative calibration method for vertically-folded omnistereo systems have not been included in the current study since we would like the reader’s attention to be devoted to the sensor characteristics defended by this analysis.

Our ongoing research is also focusing on the development of efficient software algorithms for real-time 3D pose estimation from point clouds. Bear in mind that all the experimental results demonstrated in this manuscript rely upon a single camera snapshot. We understand that the narrow vertical field-of-view where stereo vision operates is a limiting factor for dense scene reconstruction from a single image, so we have also considered non-optimal geometries for the quadrotor’s view. In fact, increasing the region of interest for stereo (SROI) while maintaining the wide baseline implies an enlargement of each mirror’s radius. We believe that our omnidirectional system is more advantageous than forward-looking sensors because it can provide a robust pose estimation by extracting 3D point features from all around the scene at once. As in our past work [24], fusing multiple modalities (e.g., stereo and optical-flow) is a possibility in order to resolve the scale-factor problem inherent while performing structure from motion over the non-stereo regions of each mirror (near the poles).

In this work, we performed an extensive study of the proposed omnistereo sensor’s properties, such as its spatial resolution and triangulation uncertainty. We validated the projection accuracy of the synthetic model (the ideal case) where 3D points in the world are given exactly. In order to validate the precision of the real sensor, we require a perfectly constructed and assembled device so point projections can be accepted as the ultimate truth. This is hard to achieve at a low-cost prototyping stage. Although we acquired ground-truth 3D points via a position capture system alone, we deem this insufficient to validate the imaging accuracy of the real sensor because the precision of the calibration method is truly what is being accounted for. For reproducibility purposes, source code is available for the implementation of the theoretical omnistereo model, optimization, plots and figures presented in this analysis [40].

Acknowledgments

This work was supported in part by U.S. Army Research Office grant No. W911NF-09-1-0565, U.S. National Science Foundation grant No. IIS-0644127, and a Ford Foundation Pre-doctoral Fellowship awarded to Carlos Jaramillo.

Appendix A. Symbolic Notation

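$P_i$: a point in $\mathbb{R}^3$, where the post-subscript $i$ is a unique identifier.

$A$: a reference frame or image space with origin $O_A$.

${}^{A}\mathbf{p}_i$: the position vector of $P_i$ with respect to reference frame $A$.

${}^{A}\mathbf{p}_{i,h}$: the same position vector in homogeneous coordinates.

${}^{I}\mathbf{m}_i$: a 2D point or pixel position on image frame $I$.

$\|\mathbf{p}_i\|$: the magnitude (Euclidean norm) of $\mathbf{p}_i$.

$\hat{\mathbf{q}}$: a unit vector, so $\|\hat{\mathbf{q}}\| = 1$.

$\mathbf{M}_i$: a $3 \times 3$ matrix, or $\mathbf{M}_{i,h}$ in homogeneous coordinates.

$f_s$: a scalar-valued function that outputs some $s$.

$\mathbf{f}_v$: a vector-valued function for the computation of $\mathbf{v}$.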

All coordinate systems obey the right-hand rule unless otherwise indicated.

Author Contributions

The work presented in this paper is a collaborative development by all of the authors. C. Jaramillo wrote this manuscript, carried out all the experiments and conceived the extensive analysis of the omnistereo sensor studied here. R.G. Valenti contributed with the analytical derivation of various equations and manuscript revisions. L. Guo established the geometrical model and rules for the constrained optimization of design parameters. J. Xiao funded and guided this entire study and helped with revisions.

Conflicts of Interest

The authors declare no conflict of interest.

References

  • 1. Marani R., Renò V., Nitti M., D’Orazio T., Stella E. A Compact 3D Omnidirectional Range Sensor of High Resolution for Robust Reconstruction of Environments. Sensors. 2015;15:2283–2308. doi: 10.3390/s150202283.
  • 2. Valenti R.G., Dryanovski I., Jaramillo C., Strom D.P., Xiao J. Autonomous quadrotor flight using onboard RGB-D visual odometry; Proceedings of the International Conference on Robotics and Automation (ICRA 2014); Hong Kong, China. 31 May–7 June 2014; pp. 5233–5238.
  • 3. Khoshelham K., Elberink S.O. Accuracy and resolution of Kinect depth data for indoor mapping applications. Sensors. 2012;12:1437–1454. doi: 10.3390/s120201437.
  • 4. Payá L., Fernández L., Gil A., Reinoso O. Map building and monte carlo localization using global appearance of omnidirectional images. Sensors. 2010;10:11468–11497. doi: 10.3390/s101211468.
  • 5. Berenguer Y., Payá L., Ballesta M., Reinoso O. Position Estimation and Local Mapping Using Omnidirectional Images and Global Appearance Descriptors. Sensors. 2015;15:26368–26395. doi: 10.3390/s151026368.
  • 6. Hrabar S., Sukhatme G. Omnidirectional vision for an autonomous helicopter; Proceedings of the International Conference on Robotics and Automation (ICRA); Taipei, Taiwan. 14–19 September 2003; pp. 3602–3609.
  • 7. Hrabar S. 3D path planning and stereo-based obstacle avoidance for rotorcraft UAVs; Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); Nice, France. 22–26 September 2008; pp. 807–814.
  • 8. Orghidan R., Mouaddib E.M., Salvi J. Omnidirectional depth computation from a single image; Proceedings of the IEEE International Conference on Robotics and Automation; Barcelona, Spain. 18–22 April 2005; pp. 1222–1227.
  • 9. Paniagua C., Puig L., Guerrero J.J. Omnidirectional structured light in a flexible configuration. Sensors. 2013;13:13903–13916. doi: 10.3390/s131013903.
  • 10. Byrne J., Cosgrove M., Mehra R. Stereo based obstacle detection for an unmanned air vehicle; Proceedings of the International Conference on Robotics and Automation; Orlando, FL, USA. 15–19 May 2006.
  • 11. Smadja L., Benosman R., Devars J. Hybrid stereo configurations through a cylindrical sensor calibration. Mach. Vis. Appl. 2006;17:251–264. doi: 10.1007/s00138-006-0032-4.
  • 12. Hartley R., Zisserman A. Multiple View Geometry in Computer Vision. 2nd ed. Volume 2. Cambridge University Press; Cambridge, UK: 2004.
  • 13. Sturm P., Ramalingam S., Tardif J.P., Gasparini S., Barreto J.P. Camera models and fundamental concepts used in geometric computer vision. Found. Trends® Comput. Graph. Vis. 2010;6:1–183. doi: 10.1561/0600000023.
  • 14. Gluckman J., Nayar S.K., Thoresz K.J. Real-Time Omnidirectional and Panoramic Stereo. Comput. Vis. Image Underst. 1998.
  • 15. Koyasu H., Miura J., Shirai Y. Realtime omnidirectional stereo for obstacle detection and tracking in dynamic environments; Proceedings of the International Conference on Intelligent Robots and Systems (IROS); Maui, HI, USA. 29 October–3 November 2001; pp. 31–36.
  • 16. Bajcsy R., Lin S.S. High resolution catadioptric omni-directional stereo sensor for robot vision; Proceedings of the 2003 IEEE International Conference on Robotics and Automation; Taipei, Taiwan. 14–19 September 2003; pp. 1694–1699.
  • 17. Cabral E.E., de Souza J.C.J., Hunold M.C. Omnidirectional stereo vision with a hyperbolic double lobed mirror; Proceedings of the 17th International Conference on Pattern Recognition (ICPR); Cambridge, UK. 23–26 August 2004; pp. 0–3.
  • 18. Su L., Zhu F. Design of a novel stereo vision navigation system for mobile robots; Proceedings of the IEEE Robotics and Biomimetics (ROBIO); Hong Kong, China. 5–9 July 2005; pp. 611–614.
  • 19. Mouaddib E.M., Sagawa R. Stereovision with a single camera and multiple mirrors; Proceedings of the International Conference on Robotics and Automation; Barcelona, Spain. 18–22 April 2005; pp. 800–805.
  • 20. Schönbein M., Kitt B., Lauer M. Environmental Perception for Intelligent Vehicles Using Catadioptric Stereo Vision Systems; Proceedings of the European Conference on Mobile Robots (ECMR); Örebro, Sweden. 7–9 September 2011; pp. 1–6.
  • 21. Yi S., Ahuja N. An Omnidirectional Stereo Vision System Using a Single Camera; Proceedings of the 18th International Conference on Pattern Recognition (ICPR’06); Hong Kong, China. 20–24 August 2006; pp. 861–865.
  • 22. Nayar S.K., Peri V. Folded catadioptric cameras; Proceedings of the 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition; Fort Collins, CO, USA. 23–25 June 1999; pp. 217–223.
  • 23. He L., Luo C., Zhu F., Hao Y. Motion Planning. In-Tech Education and Publishing; Vienna, Austria: 2008. Stereo Matching and 3D Reconstruction via an Omnidirectional Stereo Sensor; pp. 123–142. Number 60575024.
  • 24. Labutov I., Jaramillo C., Xiao J. Generating near-spherical range panoramas by fusing optical flow and stereo from a single-camera folded catadioptric rig. Mach. Vis. Appl. 2011;24:1–12. doi: 10.1007/s00138-011-0368-2.
  • 25. Swaminathan R., Grossberg M.D., Nayar S.K. Caustics of catadioptric cameras; Proceedings of the Eighth IEEE International Conference on Computer Vision (ICCV 2001); Vancouver, BC, Canada. 7–14 July 2001; pp. 2–9.
  • 26. Jang G., Kim S., Kweon I. Single camera catadioptric stereo system; Proceedings of the Workshop on Omnidirectional Vision, Camera Networks and Nonclassical Cameras (OMNIVIS2005); Beijing, China. 21 October 2005.
  • 27. Jaramillo C., Guo L., Xiao J. A Single-Camera Omni-Stereo Vision System for 3D Perception of Micro Aerial Vehicles (MAVs); Proceedings of the IEEE Conference on Industrial Electronics and Applications (ICIEA); Melbourne, Australia. 19–21 June 2013.
  • 28. Ascending Technologies (AscTec) [(accessed on 23 May 2014)]. Available online: http://www.asctec.de/en/uav-uas-drones-rpas-roav/
  • 29. Baker S., Nayar S.K. A theory of single-viewpoint catadioptric image formation. Int. J. Comput. Vis. 1999;35:175–196. doi: 10.1023/A:1008128724364.
  • 30. Nayar S.K., Baker S. Catadioptric Image Formation; Proceedings of the 1997 DARPA Image Understanding Workshop; New Orleans, LA, USA. May 1997; pp. 1431–1437.
  • 31. Gaspar J., Deccó C., Okamoto J.J., Santos-Victor J., Sistemas I.D., Pais A.R., Brazil S.P. Constant resolution omnidirectional cameras; Proceedings of the OMNIVIS’02 Workshop on Omni-directional Vision; Copenhagen, Denmark. 2 June 2002.
  • 32. Forsgren A., Gill P., Wright M. Interior Methods for Nonlinear Optimization. Soc. Ind. Appl. Math. (SIAM Rev.) 2002;44:525–597. doi: 10.1137/S0036144502414942.
  • 33. Tuytelaars T.T., Mikolajczyk K. Local Invariant Feature Detectors: A Survey. Found. Trends® Comput. Graph. Vis. 2008;3:177–280. doi: 10.1561/0600000017.
  • 34. Spacek L. Coaxial Omnidirectional Stereopsis. Computer Vision-ECCV 2004. Springer Berlin Heidelberg; Berlin, Heidelberg: 2004. pp. 354–365.
  • 35. Bradski G., Kaehler A. Learning OpenCV: Computer vision with the OpenCV library. O’Reilly Media, Inc.; Sebastopol, California: 2008.
  • 36. Hirschmüller H. Stereo processing by semiglobal matching and mutual information. IEEE Trans. Pattern Anal. Mach. Intell. 2008;30:328–341. doi: 10.1109/TPAMI.2007.1166.
  • 37. Xiang Z., Dai X., Gong X. Noncentral catadioptric camera calibration using a generalized unified model. Opt. Lett. 2013;38:1367–1369. doi: 10.1364/OL.38.001367.
  • 38. Bay H., Ess A., Tuytelaars T., Van Gool L. Speeded-Up Robust Features (SURF). Comput. Vis. Image Underst. 2008;110:346–359. doi: 10.1016/j.cviu.2007.09.014.
  • 39. Scaramuzza D., Fraundorfer F. Visual Odometry Part 1: The First 30 Years and Fundamentals. IEEE Robot. Autom. Mag. 2011;18:80–92. doi: 10.1109/MRA.2011.943233.
  • 40. Source Code Repository. [(accessed on 5 February 2016)]. Available online: https://github.com/ubuntuslave/omnistereo_sensor_design.

Articles from Sensors (Basel, Switzerland) are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)
