2021 Sep 7;21(18):5996. doi: 10.3390/s21185996

Table A2.

3D Evaluation Datasets.

Columns: Dataset; Size & Source; N of Joints/N of People; Summary
KTH Multiview Football dataset II (3D part) (extended version of the original)
Size & Source:
  • 2400 images (800 time frames captured from 3 views)
  • 2 different players and 2 different sequences per player
  • Less than 1 GB
  • Images from a football match

N of Joints/N of People: 14 (x,y,z)/1
Summary:
  • Multiview (3 orthographic cameras).
  • Filmed at 25 Hz and a resolution of 1920 × 1080.
  • 1 person per video sequence, playing football on the field during a match.
  • 3 cameras record the player from 3 different angles; the cameraman rotates the cameras to follow the player and zooms in on him.
  • The 2D annotation, as indicated in Table A1, is done by hand, and the 3D positions are reconstructed using the method described in [67].
  • Images in jpg format + joints in txt format.
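
The figures in this entry can be checked with a short Python sketch. It confirms that 800 time frames seen from 3 views give 2400 images, and it parses one frame of 14 (x,y,z) joints from text. The table does not describe the layout of the txt files, so the one-frame-per-line, whitespace-separated layout assumed here is hypothetical, as are the function names.

```python
from typing import List, Tuple

Joint = Tuple[float, float, float]

N_JOINTS = 14   # from the table: 14 (x, y, z) joints, 1 person
N_VIEWS = 3     # from the table: 3 cameras
N_FRAMES = 800  # from the table: 800 time frames


def expected_image_count(n_frames: int = N_FRAMES, n_views: int = N_VIEWS) -> int:
    """Each time frame is captured once per view."""
    return n_frames * n_views


def parse_pose_line(line: str, n_joints: int = N_JOINTS) -> List[Joint]:
    """Parse one frame of 3D joints.

    ASSUMPTION: a frame is stored as n_joints * 3 whitespace-separated
    floats in the order x1 y1 z1 x2 y2 z2 ...; the real file layout is
    not given in the table and may differ.
    """
    values = [float(token) for token in line.split()]
    if len(values) != n_joints * 3:
        raise ValueError(f"expected {n_joints * 3} values, got {len(values)}")
    # Group the flat list into (x, y, z) triples, one per joint.
    return [(values[i], values[i + 1], values[i + 2])
            for i in range(0, len(values), 3)]
```

Raising on a wrong value count, rather than padding or truncating, makes a mismatch between the assumed and the actual file layout visible immediately.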

Martial Arts, Dancing and Sports (MADS) [68]
Size & Source:
  • 30 video sequences of different people, 53,000 frames
  • 24 GB
  • Own images recorded using a MoCap system

N of Joints/N of People: 19 (x,y,z)/1
Summary:
  • Multiview (3 cameras).
  • Cameras capturing at 15 fps and a resolution of 1024 × 768.
  • 1 person per video sequence, 6 sequences per category, and 5 action categories: Tai-chi, Karate, Jazz dance, Hip-hop dance, and different sports.
  • Recorded in a lab.
  • Ground truth was obtained using a MoCap system by Motion Analysis working at 60 fps.
  • Video in avi format + joint information in MATLAB data format.
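
Because the MADS video runs at 15 fps while the MoCap ground truth is captured at 60 fps, there are four MoCap samples per video frame. The Python sketch below shows one way to pair them. It assumes both streams start at the same instant and stay synchronized, which the table does not state; the released ground truth may already be aligned to the video. The function names are illustrative only.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

VIDEO_FPS = 15  # from the table: cameras capture at 15 fps
MOCAP_FPS = 60  # from the table: MoCap system works at 60 fps


def mocap_index_for_frame(frame_idx: int,
                          video_fps: int = VIDEO_FPS,
                          mocap_fps: int = MOCAP_FPS) -> int:
    """Index of the MoCap sample taken at the same time as a video frame.

    ASSUMPTION: both streams start together and mocap_fps is an integer
    multiple of video_fps.
    """
    if mocap_fps % video_fps != 0:
        raise ValueError("mocap_fps must be a multiple of video_fps")
    return frame_idx * (mocap_fps // video_fps)


def downsample_mocap(samples: Sequence[T],
                     video_fps: int = VIDEO_FPS,
                     mocap_fps: int = MOCAP_FPS) -> List[T]:
    """Keep every (mocap_fps // video_fps)-th sample, one per video frame."""
    if mocap_fps % video_fps != 0:
        raise ValueError("mocap_fps must be a multiple of video_fps")
    step = mocap_fps // video_fps
    return list(samples[::step])


# Sequence count implied by the table: 5 categories x 6 sequences each.
N_SEQUENCES = 5 * 6
```

With these rates, video frame 10 corresponds to MoCap sample 40, and the 5 categories with 6 sequences each account for the 30 sequences listed under Size & Source.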