Fig. 3.
Schematic summary of all the computations included in the model. Red frames indicate dynamic (video) rather than static scenes. Top: the background capacities (C1–C6) applied to the familiarization videos (a detailed illustration of the capacities is shown in Fig. 4). Optical flow is computed for the input, and capacities C1–C6 are then applied to create a representation of the input. The capacities are shown on the right, with later capacities added on top of earlier ones. Bottom: recognition processes at three stages – dynamic, static at low view, and static at high view (a detailed illustration of the recognition process is shown in Figs. 5A–D and 7). Dynamic: the algorithm detects a switch in boundary ownership. Static: the algorithm detects mixed boundary ownership. High view: the algorithm detects that all boundaries lie within the container’s back region. Right column: detected boundary ownership in the output is indicated in blue (object) or red (container). In the high view, the container’s detected front region is indicated in red, and its back region in orange. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
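The dynamic recognition cue described in the caption – a switch in boundary ownership over the course of the video – can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that per-frame boundary-ownership labels ('object' vs. 'container') have already been extracted by the earlier capacities, and the function name is hypothetical.

```python
# Hypothetical sketch of the dynamic recognition step: given a sequence of
# per-frame boundary-ownership labels (assumed precomputed by the model's
# capacities C1-C6), report whether ownership switches from the moving
# object to the container, the caption's cue for a containment event.
def detects_ownership_switch(labels):
    """Return True if boundary ownership flips from 'object' to 'container'
    at any point in the frame sequence."""
    for prev, curr in zip(labels, labels[1:]):
        if prev == 'object' and curr == 'container':
            return True
    return False

# Example: a ball lowered into a cup; the occluding boundary is owned by
# the ball early on, then by the cup once the ball is inside.
print(detects_ownership_switch(['object', 'object', 'container', 'container']))  # True
print(detects_ownership_switch(['object', 'object', 'object']))  # False
```

The static cues in the caption follow the same pattern on a single frame: mixed ownership within one scene (low view), or all boundaries falling inside the container's back region (high view).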
