Author manuscript; available in PMC: 2023 Jan 9.
Published in final edited form as: Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit. 2022 Sep 27;2022:2161–2170. doi: 10.1109/cvpr52688.2022.00221

Figure 2. B-KinD, an approach for keypoint discovery from spatiotemporal difference reconstruction.

I_t and I_{t+T} are video frames at times t and t + T. Both frames are fed to an appearance encoder Φ and a pose decoder Ψ. Given the appearance feature from I_t and the geometry features from both I_t and I_{t+T} (Sec 3.1), our model uses the reconstruction decoder ψ to reconstruct the spatiotemporal difference (Sec 3.2.1) computed from the two frames.
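
To make the reconstruction target concrete, the following is a minimal sketch of one way to compute a difference image from two frames. It assumes a pixelwise absolute difference on grayscale frames; the caption does not say which difference measure is used (that is defined in Sec 3.2.1), so the function name `abs_difference` and the `Frame` type are hypothetical and chosen only for illustration.

```python
from typing import List

# Hypothetical representation: a grayscale frame as rows of pixel intensities.
Frame = List[List[float]]


def abs_difference(frame_t: Frame, frame_t_plus_T: Frame) -> Frame:
    """Return the pixelwise |I_{t+T} - I_t|.

    Assumption: absolute difference is used here as a simple stand-in for
    the spatiotemporal difference; the source caption does not specify it.
    """
    same_height = len(frame_t) == len(frame_t_plus_T)
    same_width = all(len(a) == len(b) for a, b in zip(frame_t, frame_t_plus_T))
    if not (same_height and same_width):
        raise ValueError("frames must have the same shape")
    return [
        [abs(later - earlier) for earlier, later in zip(row_t, row_tT)]
        for row_t, row_tT in zip(frame_t, frame_t_plus_T)
    ]


# Toy example: a single bright pixel moves one column to the right.
frame_t = [[0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0]]
frame_t_plus_T = [[0.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]]

target = abs_difference(frame_t, frame_t_plus_T)
```

In this toy case the difference is nonzero only where the pixel left and where it arrived, which illustrates why reconstructing such a target could push keypoints toward regions of motion.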