Scientific Data. 2026 Feb 24;13:511. doi: 10.1038/s41597-026-06853-9

CzechLynx: A Dataset for Individual Identification and Pose Estimation of the Eurasian Lynx

Lukas Picek 1,2,, Jakub Straka 1, Miroslav Jirik 1, Elisa Belotti 3,4, Martin Duľa 5,6, Josefa Krausová 4,6, Michal Bojda 5,6, Vojtech Cermak 7, Luděk Bufka 4, Rostislav Dvořák 6, Luboslav Hrdý 6, Václav Kocourek 6, Jiří Labuda 5,6, Luděk Toman 6, Vlado Trulík 6, Martin Váňa 6, Miroslav Kutal 5,6
PMCID: PMC13043752  PMID: 41735338

Abstract

We introduce CzechLynx, the first large-scale, open-access dataset for individual identification, pose estimation, and instance segmentation of the Eurasian lynx (Lynx lynx). CzechLynx contains 39,760 camera trap images annotated with segmentation masks, identity labels, and 20-point skeletons and covers 319 unique individuals across 15 years of systematic monitoring in two geographically distinct regions: southwest Bohemia and the Western Carpathians. In addition to the real camera trap data, we provide a large complementary set of photorealistic synthetic images and a Unity-based generation pipeline with diffusion-based text-to-texture modeling, capable of producing arbitrarily large amounts of synthetic data spanning diverse environments, poses, and coat-pattern variations. To enable systematic testing across realistic ecological scenarios, we define three complementary evaluation protocols: (i) geo-aware, (ii) time-aware open-set, and (iii) time-aware closed-set, covering cross-regional and long-term monitoring settings. With the provided resources, CzechLynx offers a unique, flexible benchmark for robust evaluation of computer vision and machine learning models across realistic ecological scenarios.

Subject terms: Scientific data, Computer science, Conservation biology, Information technology, Animal behaviour

Background & Summary

Understanding species movement, population structure, and habitat connectivity is critical for effective conservation, especially for wide-ranging, protected carnivores1,2. In Europe, the Eurasian lynx (Lynx lynx) exemplifies these challenges, being ecologically important, legally protected, and indicative of broader biodiversity concerns. This medium-sized, solitary, and territorial carnivore lives at low population densities, with substantial home ranges estimated at 443.36 ± 283.14 km2 for males and 191.92 ± 116.34 km2 for females in Europe2,3. Although primarily inhabiting forested areas and rugged terrains, the lynx demonstrates notable adaptability by using refuge areas to mitigate human pressures4 and can disperse across Central Europe’s cultural landscapes, covering distances greater than 98 km from source populations5. Monitoring such wide-ranging, low-density populations across heterogeneous landscapes requires extensive, long-term effort, and camera trap networks have become a key tool for this purpose. As a result, long-term camera trap monitoring has generated rich data that are valuable not only for ecological research, such as estimating population densities and demographic parameters6–9, but also for the development and evaluation of novel machine learning methods10,11. This tight connection creates a mutually beneficial loop in which ecological monitoring provides the data that drive advances in machine learning, and these advances, in turn, enable more accurate and efficient processing of ever larger and more complex conservation data. Consolidating and standardizing these datasets for machine learning is, therefore, key to unlocking their full potential for both domains and has already been taken up in several recent studies built on the data described here.

Straka et al12 used a subset of the CzechLynx dataset as a case study for 2D pose estimation of endangered species. They treated Eurasian lynx camera trap images as a testbed to compare pre-trained backbones, fine-tuning strategies, data augmentation schemes, and different mixes of real and synthetic training data, using HRNet-w32 13 and related architectures as base models. Their experiments revealed how real and synthetic data can be combined to support pose estimation under data-limited conditions.

Picek et al14 used the part of the CzechLynx dataset from Friends of the Earth Czech Republic (FoE CZ) to evaluate a method for individual animal identification, which models foreground and background information separately. They focused on the challenges of identifying lynx individuals over time and across different locations, using data from the 15 years of camera trap monitoring in Central Europe. The dataset enabled them to demonstrate how incorporating spatial and temporal priors, along with their Per-Instance Temperature Scaling (PITS), improves identification accuracy, particularly when animals appear in new areas or exhibit changes in appearance over time.

Dula et al8 and Palmero et al9 used parts of the CzechLynx dataset to estimate population densities (via spatial capture-recapture models) and other demographic parameters of the Eurasian lynx in the Western Carpathians on the Czech-Slovak borderland and in the Bohemian Forest Ecosystem: the Bavarian Forest National Park in Germany and the Šumava National Park in the Czech Republic. By identifying lynx individuals, Dula et al8 estimated apparent survival, transition rates, and population turnover. Fluctuating densities and high turnover rates indicated human-caused mortality, which could limit population growth and the dispersal of lynx to adjacent areas, thus undermining the favourable conservation status of the Carpathian population.

Adam et al15,16 used the CzechLynx dataset as part of the AnimalCLEF 2025 competition on individual animal identification. Their goal was to establish a standardized benchmark for evaluating algorithms under realistic open-set conditions across multiple species. The CzechLynx dataset was chosen because its scale, temporal depth, and natural visual diversity provide a realistic and challenging testbed for evaluating the generalization and robustness of identification models. Its use in the challenge demonstrated the dataset’s suitability for large-scale benchmarking and its potential to support community-driven development of transferable wildlife monitoring methods.

Related Work

Animal-focused image datasets are now well cataloged and serve as a central resource for developing and evaluating methods for individual identification of animals across species 17. Available resources cover a wide range of taxa, including large terrestrial mammals (e.g., zebras, elephants), marine species (e.g., belugas, whale sharks, turtles), and carnivores (e.g., leopards, hyenas, tigers). However, geographical coverage is not evenly distributed: most publicly available datasets originate from North America and Africa, while European resources remain relatively scarce, in part because of data privacy and conservation constraints. Large carnivores on EU red lists are particularly underrepresented in open, image-based identification datasets. A summary of non-public datasets is provided in Table 1, and public datasets with their key statistics are listed in Table 2. Short descriptions of selected public datasets are provided below.

Table 1.

Non-public animal re-id datasets with large carnivores.

Dataset Individuals Images
Bears 40 132 4,674
Dogs 41 60 625
Jaguars 1 42 16 176
Lynxes 43 51 252
Polar bears 2 44 15 42
Ocelots 45 N/A 503
Jaguars 2 45 N/A 680
Cheetahs 46 N/A N/A
Polar bears 1 47 N/A N/A

Table 2.

All publicly available datasets with large carnivores.

Name Species Images Identities Span Bounding Boxes Masks Pose Wild
PolarBearVidID21 Ursus maritimus 1,431 13 N/A ° ° °
LionData20 Panthera leo 750 98 N/A ° ° °
Hyena ID 202219 Crocuta crocuta 3,129 256 N/A ° °
Leopard ID 202219 Panthera pardus 6,806 430 N/A ° °
ATRW18 Panthera tigris 8,076 92 1 day °
(our) CzechLynx22 Lynx lynx 39,760 319 15 years • • • •

For each dataset, we report the number of images, unique individual identities, temporal span (Span), and the availability of bounding boxes, segmentation masks (Masks), pose keypoints (Pose), and in-the-wild imagery (Wild). A filled bullet (•) indicates that the annotation type is provided, while an open circle (°) indicates it is not. Some datasets are derived from video recordings. Statistics exclude synthetic data.

The ATRW dataset 18 contains 8,076 images collected from ten zoos in China and covers 92 Amur tigers (Panthera tigris). Each image is annotated with bounding boxes, identity labels, and pose keypoints, allowing development of re-identification and pose estimation methods for controlled environments and populations.

The Hyena ID 2022 and Leopard ID 2022 datasets 19, developed by the Botswana Predator Conservation Trust, contain 3,129 images of 256 individual spotted hyenas (Crocuta crocuta) and 6,795 images of 430 African leopards (Panthera pardus), respectively. Images were taken in natural settings and annotated with bounding boxes and viewpoints, making these datasets valuable for testing in-the-wild generalization performance for identification of individuals.

The LionData dataset 20 was compiled as part of the Mara Masai project in Kenya and includes 750 images of 98 lions (Panthera leo). Each image was manually annotated to support training and evaluation of animal re-identification models under natural variability, such as lighting changes and different camera angles.

The PolarBearVidID dataset 21 covers 13 individual polar bears recorded in 6 German zoos. It contains approximately 138k images extracted from 1,431 video sequences. To reduce the risk of background overfitting caused by fixed camera positions, the individuals were cropped from the original photographs.

Methods

This section describes how the CzechLynx dataset22 was built, including detailed descriptions of data acquisition, annotation, and pre-processing steps. The dataset combines data collected from long-term camera trapping surveys across two distinct regions with synthetic samples generated to enhance variability. Where applicable, previously published methods are cited, and only novel technical processes are described in detail.

Dataset Collection

The CzechLynx dataset22 comprises data collected from two independent conservation projects operating within distinct regions of the Eurasian lynx’s (Lynx lynx) Central European range. Each project used its own field methodology and data management pipeline. To maintain transparency and reproducibility, all data sources are described separately below.

Carnivore Conservation Programme, Friends of the Earth Czech Republic (FoE CZ), and Department of Forest Ecology, Mendel University in Brno

This source includes data collected through long-term monitoring in the Western Carpathians (since 2009) and Southwest Bohemia (since 2015); labeled as “FoE CZ – The Western Carpathians” and “FoE CZ – Southwest Bohemia” in the following text. Camera traps were deployed on forest roads, trails, mountain ridges, as well as marking and resting sites, identified via snow-tracking, signs of occurrence, or telemetry research 2,23. From 2015 onward, the Western Carpathians area followed a systematic camera trapping and spatial capture-recapture model to estimate lynx densities and evaluate density fluctuations, apparent survival, transition rate, and individual turnover during five consecutive seasons 8. Earlier data and those collected by opportunistic camera trapping allowed for estimation of minimum population size, social structure, and evidence of reproduction. Cameras used included both white-flash and infrared models (video-enabled). Most individuals were manually identified based on distinctive coat patterns visible on their flanks, forelimbs, and hindlimbs 8.

Šumava National Park Administration

This source includes data collected from the areas of Šumava National Park and the Protected Landscape Area in southwestern Bohemia. The data were gathered by the Šumava National Park Administration within two long-term international monitoring projects. The first project, started in 2009, covers roughly the northwestern two-thirds of Šumava National Park. Its main goal is to obtain yearly density estimates of independent lynx individuals living in the core area of the Bohemian-Bavarian-Austrian lynx population. Within the scope of the first project, data have been collected using a 2.7 × 2.7 km grid. White-flash camera traps were installed seasonally in every second grid cell at suitable locations along forest paths, roads, and trails. Additional details about this methodology can be found in studies by Weingarth et al24 and Palmero et al9. The second project (initiated in 2017) aims to provide as complete an overview as possible of the entire Bohemian-Bavarian-Austrian lynx population throughout its full range. Data for this project were collected year-round at various sites, including forest paths, roads, trails, and lynx scent-marking locations, using both white-flash and infrared video-enabled camera traps. These cameras were set up continuously based on a 10 × 10 km grid (ETRS LAEA 5210). In each grid cell, predominantly within the Šumava National Park and Protected Landscape Area, between 4 and 8 camera sites were continuously monitored throughout the year. Further information about this project is available in a separate report 25.

Raw Data Pre-processing

The pre-processing of raw camera trap data started with the manual removal of empty images, which frequently resulted from false triggers caused by wind, moving vegetation, or fluctuating light conditions. Once these non-informative images were discarded, the remaining photographs were screened by a team of experts and trained volunteers, who classified the content by species, while removing data with humans or vehicles. Only images positively identified as containing Eurasian lynxes were used for further analysis. All pre-selected images then underwent a second review by specialists with long-term experience in lynx monitoring. Using the species’ distinctive coat patterns, especially on the hind limbs, forelimbs, and flanks, these experts manually identified individual lynxes. In each study area (southwest Bohemia and the Western Carpathians), at least three well-trained local experts independently processed and cross-checked identifications in line with established camera trapping standards 8,26. The resulting set of expert-verified images, each linked to a specific, individually recognized lynx, formed the basis for all subsequent processing.

Video Pre-processing

The CzechLynx dataset22 contains Lynx lynx encounters collected using different camera trap models. Some devices capture one or more still images per trigger, whereas others record short videos (35% of observations). Because consecutive frames within a burst or video clip are often nearly identical, using all frames would introduce substantial redundancy and would make manual annotation time-consuming. Therefore, a small number of the most informative frames (up to three) is automatically selected for each detection event (i.e., observation).

The selection was based on automated animal detection using MegaDetector 27, a YOLOv5-based model widely used in ecological research for reliably detecting any animal in camera trap imagery 28–30. Each video frame was processed, and those with a detection confidence below 0.7 were discarded. From the remaining frames, the first and last with valid detections were identified, and the interval between them was divided into three equal parts. One frame was then selected from each segment (specifically, the frame with the largest detected bounding box), ensuring that the chosen images capture the animal prominently and are distributed across the video timeline (see Fig. 1). Selected frames were used as representative samples for downstream tasks, e.g., individual identification and pose estimation. The following criteria guided the frame selection process: (i) the animal’s full body should be visible; (ii) the animal should be captured from a side view, where identifying coat patterns are most apparent; (iii) the animal should be close enough to the camera to reveal fine details without any part being cropped; and (iv) the selected frames should be temporally spaced to capture variation in pose and movement.
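The segment-based selection can be sketched as follows; the `Detection` container and its field names are illustrative stand-ins, not part of any released codebase:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    frame_idx: int      # position of the frame within the video
    confidence: float   # MegaDetector confidence score
    bbox_area: float    # area of the detected bounding box in pixels

def select_frames(detections, conf_threshold=0.7, n_segments=3):
    """Keep one frame per temporal segment, preferring the largest detection."""
    valid = [d for d in detections if d.confidence >= conf_threshold]
    if not valid:
        return []
    first, last = valid[0].frame_idx, valid[-1].frame_idx
    span = max(last - first, 1)
    # Bucket the valid detections into n equal temporal segments.
    buckets = [[] for _ in range(n_segments)]
    for d in valid:
        seg = min(int((d.frame_idx - first) * n_segments / span), n_segments - 1)
        buckets[seg].append(d)
    # From each non-empty segment, pick the frame with the largest bounding box.
    return [max(b, key=lambda d: d.bbox_area).frame_idx for b in buckets if b]
```

Frames below the 0.7 confidence threshold never enter a segment, mirroring the filtering step described above.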

Fig. 1.

Fig. 1

Example of frame selection from video sequences based on bounding box area and temporal spacing. The video is divided into three segments (visualized as delimited by red lines), and one frame with the highest relative area is selected from each segment.

Dataset Annotation

All extracted images were further processed, i.e., manually annotated, to support evaluation of three computer vision and machine learning tasks: (i) instance segmentation, (ii) individual identification, and (iii) animal pose estimation. The Computer Vision Annotation Tool (CVAT), an open-source, web-based labeling platform, was used to create polygon, segmentation mask, keypoint, and attribute annotations.

The data for instance segmentation were annotated using a semi-automated, human-in-the-loop approach based on the Segment Anything Model (SAM) 31. Annotators used positive and negative points to prompt SAM to create initial masks, which were then checked and corrected (if needed) in CVAT to ensure accurate outlines of each lynx. To evaluate this approach, human-only and SAM-only masks were compared against the final human-verified annotations. SAM alone reached a mean Intersection over Union (IoU) of 0.96, while fully manual annotation reached an IoU of 0.87. The human-in-the-loop workflow was both faster and more accurate (about 30 seconds vs. 5 minutes per image), as SAM helped annotators avoid the small boundary errors common in manual segmentation.
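The reported mean IoU values correspond to the standard binary-mask IoU, which can be computed as below; this is a generic sketch, not the authors' evaluation code:

```python
import numpy as np

def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection over Union between two binary masks of equal shape."""
    a, b = a.astype(bool), b.astype(bool)
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    # Two empty masks are conventionally treated as a perfect match.
    return float(inter) / union if union else 1.0
```

The mean IoU over a set of annotations is then simply the average of `mask_iou` over all image pairs.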

The data targeted for animal pose estimation were also annotated using a semi-automated approach. Initial keypoint predictions were generated with a pre-trained AnimalPose model 32 and subsequently updated manually in CVAT if needed. Up to 20 keypoints were assigned to each individual based on the MMPose animal skeleton specification, covering joints, facial features, and the tail base. A detailed description of all keypoints and sample poses is provided in Fig. 2 and Table 3. In cases of partial visibility, keypoints not visible in the image were left unannotated. In addition to spatial labels, descriptive image metadata, including the lynx’s viewpoint (e.g., left flank, right flank, frontal, or rear), was annotated.

Fig. 2.

Fig. 2

Lynx lynx pose. 20-keypoint skeletal structure based on the AnimalPose 32 standard.

Table 3.

Numbered anatomical definitions of used keypoints for Eurasian lynx skeleton (i.e., pose), following the AnimalPose32 standard and the MMPose animal skeleton definition.

1 2 3 4 5 6 7 8 9 10
eye-left eye-right ear-left ear-right nose throat tail-base withers elbow-left-front elbow-right-front
11 12 13 14 15 16 17 18 19 20
elbow-left-back elbow-right-back knee-left-front knee-right-front knee-left-back knee-right-back paw-left-front paw-right-front paw-left-back paw-right-back

Synthetic Data Generation

Synthetic data has proven valuable in improving performance in many scenarios, especially when real-world data is scarce, difficult to collect, or lacks sufficient diversity 12,33,34. In the context of individual animal identification or pose estimation, however, generating synthetic data that accurately captures the complexity of real-world conditions remains a significant challenge. Despite its usefulness, existing methods for generating synthetic animal pose data often struggle with realism in appearance, motion, and environmental context, which can lead to reduced accuracy in downstream applications 35–37. To complement the real data and broaden the variability in both poses and textures, a synthetic dataset using rendered 2D images of a 3D lynx model was created. The pipeline, built with the Unity game engine, enables the creation of highly realistic synthetic samples representing multiple individuals of the Lynx lynx species across a wide range of conditions. By leveraging detailed environmental modeling, including vegetation, terrain, and lighting, along with lifelike animations and diverse camera setups, the production of high-fidelity synthetic images that resemble real-world scenarios was ensured.

Algorithm 1

Synthetic Data Generation Pipeline.

Data generation pipeline

The proposed pipeline generates synthetic animal images with rich metadata, including precise pose, segmentation masks, and individual IDs. The process starts with building a 3D environment, populated with a rigged lynx model and realistic scenery. Four control points are manually placed in each scene, some of which are outside the camera’s field of view, and the animal is animated to move between them on predefined paths. Along each path, a random set of n “stop” points is sampled. At every stop, both the scene and the model are modified: environment variants (e.g., tree type, grass, snow coverage) and small changes to the camera viewpoint (e.g., rotation) are applied, while the model’s orientation and animation state (e.g., walking, sitting) are adjusted. If multiple textures corresponding to a specific identity are available, one is randomly selected for the current frame. For each valid view, a 2D snapshot is recorded together with keypoints, bounding box, and the active texture (identity), and two samples are stored per stop (before and after modifications). Frames in which most of the model lies outside the camera’s view are discarded. This procedure produces a diverse set of images that mimics real camera trap conditions. See pseudo code in Algorithm 1 for a simple overview.
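The generation loop above can be schematically re-expressed as follows; the scene dictionary, parameter names, and stub randomization are illustrative stand-ins for the Unity-side operations, not the actual Algorithm 1 implementation:

```python
import random

def generate_samples(paths, textures, n_stops=5, seed=0):
    """Schematic generation loop: walk each path, stop n times, and record
    one sample before and one after randomizing scene, camera, and texture."""
    rng = random.Random(seed)
    samples = []
    for path in paths:
        # Sample n random "stop" positions along the path (parametrized 0..1).
        stops = sorted(rng.uniform(0.0, 1.0) for _ in range(n_stops))
        for t in stops:
            for variant in ("before", "after"):
                if variant == "after":
                    # Randomize environment and jitter the camera viewpoint.
                    scene = {"snow": rng.random() < 0.3,
                             "cam_yaw": rng.uniform(-5.0, 5.0)}
                else:
                    scene = {"snow": False, "cam_yaw": 0.0}
                samples.append({
                    "path": path, "t": t, "variant": variant,
                    "identity": rng.choice(textures),  # active coat texture
                    **scene,
                })
                # Frames with the model mostly outside the camera's view
                # would be discarded at this point in the real pipeline.
    return samples
```

In the actual pipeline, each recorded sample would additionally carry the projected keypoints and bounding box of the 3D model.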

Animal model

A textured model of Lynx lynx is used as the animal model. The model’s skeletal structure consists of 20 keypoints and follows the Animal Pose dataset specification, which allows training on both real and synthetic datasets with the same model and enables direct performance comparisons. Since the selected model included only a walking animation, an additional sitting animation was manually developed to increase pose diversity. With proper setup, other animal models can also be used.

Textures

To increase appearance diversity between synthetic individuals, an additional set of coat textures was generated using the Paint-It diffusion-based text-to-texture model 38,39. Such a model enables the creation of textures from short text descriptions. As illustrated in Fig. 3, the variability of the generated textures depended strongly on the choice of prompt rather than on the random seed. Therefore, a library of 28 prompts was defined by combining species terms (e.g., lynx, bobcat) with descriptors of common feline coat patterns (e.g., spotted, tabby, marbled) and additional details. Each prompt was sampled with multiple random seeds. While this approach increases visual variability, it also has practical limitations, as the diffusion model sometimes produces artifacts or unrealistic color combinations. To mitigate this, the set of prompts was iteratively refined: prompts that consistently led to low-quality textures were replaced, whereas prompts that produced stable, realistic patterns were kept. Using the final prompt set and seed combinations, 299 textures were generated and used to represent different synthetic individuals in the simulation.
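A prompt library of this kind can be built combinatorially; the terms below are illustrative examples, not the paper's actual 28 prompts:

```python
from itertools import product

# Hypothetical prompt components; the paper's exact prompt set is not listed here.
SPECIES = ["lynx", "bobcat"]
PATTERNS = ["spotted", "tabby", "marbled", "rosetted"]
DETAILS = ["", "with dense dark spots", "with pale winter coat"]

def build_prompts():
    """Combine species terms, coat-pattern descriptors, and extra details."""
    prompts = []
    for species, pattern, detail in product(SPECIES, PATTERNS, DETAILS):
        prompt = f"a {pattern} {species} fur texture"
        if detail:
            prompt += f" {detail}"
        prompts.append(prompt)
    return prompts
```

Iterative refinement, as described above, would then remove combinations that consistently yield low-quality textures and re-sample the remaining prompts with multiple seeds.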

Fig. 3.

Fig. 3

Examples of generated textures and given text prompts. The first two rows show textures generated from the simple prompts lynx and bobcat with different random seeds, illustrating that changes in the seed lead to only minor visual differences for a fixed prompt, while changing the prompt already alters the overall color and pattern. The third row shows an example of a more complex prompt that results in a noticeably different texture. The first column indicates how many textures were generated for each prompt (across both lynx and bobcat variants and seeds). In total, 299 textures were generated using 28 different prompts.

Environment

The synthetic data were designed to resemble real camera trap images closely. To achieve this, highly realistic, freely available assets from Unity’s Book Of The Dead were used, including terrain, trees, logs, and other vegetation. Several real-world scenes were selected as references, and four photorealistic scenes were recreated based on them. To enhance data variability, each scene was rendered in multiple versions that differed in tree types, grass density, and snow coverage. A comparison between real and synthetic scenes is shown in Fig. 4.

Fig. 4.

Fig. 4

Selected synthetic data samples. Inspired by the real camera trap views, we have created four highly realistic scenes. All scenes are made publicly available for further use.

Data Records

The CzechLynx dataset22 includes real camera trap photographs and synthetic samples of the Eurasian lynx (Lynx lynx), organized around three downstream tasks: (i) individual identification, (ii) pose estimation, and (iii) instance segmentation. The main part of the dataset, consisting of 39,760 manually verified and labeled camera-trap images, is fixed, whereas the synthetic part, in practice, can be scaled to any size (for simple use, a synthetic subset with a similar number of individuals and images is provided on CzechLynx Zenodo handle). The real images span more than 15 years and come from two geographically distinct regions in Central Europe: Southwest Bohemia and the Western Carpathians. As the monitoring network expanded, the yearly volume of real images increased: fewer than 300 images were collected per year before 2012, compared to more than 5,000 images per year after 2020. Observations are recorded throughout the year, with the highest capture rates between January and March (about 40% of images) and fewer images during summer months (June-August). This seasonal pattern is consistent with standard lynx monitoring designs in Central Europe, where winter camera trapping is often preferred because animals are more detectable before and during the mating season and against snow-covered backgrounds 7,9. Detailed statistics on image counts, identity coverage, and geographic distribution are provided in Table 4.

Table 4.

Summary of camera-trap sources contributing to the CzechLynx dataset. For each source, we report the number of images, observation events, identified individuals (Ids), camera sites, spatial locations (Locs; 10×10 km grid cells), and sampling period. The FoE CZ – Southwest Bohemia and Šumava National Park Administration sources partially overlap geographically and therefore share 47 identities and 12 grid cells.

Source Images Observations Ids Sites Locs Period
FoE CZ – The Western Carpathians 17,997 9,753 95 361 39 2009–2025
FoE CZ – Southwest Bohemia 6,822 1,957 102 79 32 2015–2023
Šumava National Park Administration 14,941 7,072 169 219 27 2016–2024
Total 39,760 18,782 319 659 86 2009–2025

All images are stored in JPEG format (with 90% compression), with metadata provided in a structured CSV file. To simplify access to the data and support standardized development and evaluation of downstream tasks, i.e., instance segmentation, individual identification, and pose estimation, all real data are distributed in a single package, even though not all components are required for every task. The synthetic data are provided in the same repository in a separate archive. Instead of maintaining separate annotation files for each downstream task, a single shared CSV file with all annotations and necessary information is provided. It contains one row per image and, for each task, indicates whether the record is used and in which split and subset (i.e., training or test).
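Selecting a task-specific subset from the shared CSV then reduces to filtering on the relevant split column. The column names and values below are assumptions for illustration; the released CSV header should be consulted for the exact names:

```python
import csv
import io

# Hypothetical miniature CSV mimicking the shared annotation file.
EXAMPLE = """path,unique_name,geo_aware_split,time_aware_closed_split,pose_split
img/a.jpg,lynx_1,train,train,train
img/b.jpg,lynx_2,test,train,
img/c.jpg,lynx_1,train,test,test
"""

def task_subset(csv_text, split_column, subset):
    """Return rows belonging to `subset` (train/test) of a given task split.
    An empty value means the image is not used for that task."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [r for r in rows if r[split_column] == subset]

pose_train = task_subset(EXAMPLE, "pose_split", "train")
```

Because every task reads from the same file, the same row can appear in the training subset of one split and the test subset of another, as in the example above.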

Metadata: The CzechLynx dataset22 includes additional metadata about the origin of the data, temporal context (e.g., observation date, relative age since first sighting, and encounter sequence), spatial context (e.g., 10 × 10 km grid-cell code, location and trap identifiers, and GPS coordinates), phenotypic annotation, and dataset partitioning flags (i.e., geo-aware, time-aware open, and time-aware closed splits). These annotations support flexible filtering and grouping by identity, time, space, and experimental split, thereby enabling reproducible, domain-aware evaluation. Most metadata attributes are available for all images; observation dates are available for 98.9% of images, location attributes for approximately 92%, and coat-pattern annotations for about 40%. See Table 5 for a detailed description of all metadata fields.

Table 5.

Available metadata and their definitions. Besides annotations, the CzechLynx dataset includes standardized information on observation identity, timing, spatial context, phenotypic annotations, and dataset splits. Spatial attributes follow a hierarchical structure: (i) WGS84 coordinates of the grid-cell centroid (coordinates), (ii) a 10×10 km ETRS89-LAEA grid cell (cell code), and (iii) the nearest administrative region to the grid-cell center (location). For the definitions of the geo-aware, time-open, and time-closed splits see Section Predefined Splits.

Metadata Description
Source Data provider. The string foe_carpaths, foe_bohemia, or snpa corresponds to FoE CZ - The Western Carpathians, FoE CZ - Southwest Bohemia, and Šumava National Park Administration sources, respectively.
Unique name Unique identification of Lynx lynx individual. The format is lynx_<integer>.
Path Relative path to the file in the dataset.
Date Date when the animal was observed in yyyy-mm-dd format.
Relative age Relative age derived from the difference between the actual date and the first observation of the individual in the dataset.
Encounter ID of unique sequence of images in the same camera trap location.
Coat pattern Describes lynx’s coat pattern with values marbled and spotted.
Latitude, Longitude WGS84 coordinates of the center of the 10 × 10 km grid cell containing observation.
Cell code 10 × 10 km grid-cell identifier in the ETRS89-LAEA (EPSG:3035) pan-European coordinate system. Each entry has the form 10kmE<easting_index>N<northing_index>
Location Unique location identifier. The closest geopolitical region to the center of the 10 × 10 km cell.
Trap ID Unique identification of camera trap. There could be multiple in each grid cell.
Geo-aware split Training/test split. Distinct populations belong to one or the other.
Time-aware closed split Training/test split. All individuals are included in both the training and test subsets.
Time-aware open split Training/test split. Individuals unseen in the training subset are included in the test subset.
Pose split Training/test split. Empty if the image is not used for pose estimation.
Mask Pixel-level instance segmentation mask, stored as a COCO-style RLE.
Pose 2D pose annotation, with up to 20 visible keypoints per individual stored as a dict {<keypoint_name>: [x, y]}; empty if no pose annotation is available.
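As an illustration of the spatial hierarchy in Table 5, a grid-cell code and its centroid can be derived from projected EPSG:3035 coordinates, assuming the standard EEA reference-grid convention of truncating coordinates to whole 10 km cells (a sketch, not the dataset's own tooling):

```python
import math

def cell_code(easting: float, northing: float, size_km: int = 10) -> str:
    """Derive an ETRS89-LAEA (EPSG:3035) grid-cell code from projected
    coordinates, with indices equal to the coordinates truncated to cells."""
    step = size_km * 1000
    e_idx = math.floor(easting / step)
    n_idx = math.floor(northing / step)
    return f"{size_km}kmE{e_idx}N{n_idx}"

def cell_center(e_idx: int, n_idx: int, size_km: int = 10):
    """Centroid of a grid cell in EPSG:3035 coordinates."""
    step = size_km * 1000
    return (e_idx * step + step / 2, n_idx * step + step / 2)
```

The published Latitude/Longitude fields would correspond to this centroid reprojected from EPSG:3035 to WGS84 (e.g., with a library such as `pyproj`).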

Task-specific subsets

The CzechLynx dataset22 is organized into three task-specific subsets designed to support the development and evaluation of (i) individual identification, (ii) pose estimation, and (iii) instance segmentation. For individual identification and instance segmentation, the same set of images with clearly visible coat patterns, for which human experts confirmed the identity, is provided. Each image is paired with an identity label and a pixel-level mask outlining the lynx body, which enables the training of segmentation models while providing suitable input for re-identification. The pose estimation part is a subset of the identification/segmentation images and is smaller due to the labor-intensive annotation process. Images are selected to cover a broad range of viewpoints and behaviors and also include challenging samples with rare poses, partial occlusions, and low-visibility conditions.

Individual identification

This subset contains all real images for which individual identity could be reliably determined, together with a large synthetic complement. The real images span more than 15 years and come from two primary sources: FoE CZ (covering the Western Carpathians and southwest Bohemia) and the Šumava National Park Administration (covering the Šumava National Park and Protected Landscape Area in southwest Bohemia). All images are also part of the instance-segmentation subset and are paired with a pixel-level mask of the lynx body. Spatial and temporal metadata enable evaluation of identity-recognition models under large domain shifts, including protocols based on geo-aware and time-aware splits (see Section Predefined Splits).

Instance segmentation

This subset uses the same real images as the individual-identification subset. Every image is provided with a binary instance mask delineating each lynx's full visible body. The masks enable training and evaluation of models that separate lynx from complex natural backgrounds, including snow, forest undergrowth, and shadows. All masks are stored in COCO format using run-length encoding (RLE) and are spatially aligned with their corresponding images.
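In practice, compressed COCO RLE masks are usually decoded with pycocotools (`pycocotools.mask.decode`), but the uncompressed form is simple enough to decode directly. A minimal sketch, assuming an uncompressed RLE dict with column-major runs that start with a run of background pixels (the COCO convention):

```python
import numpy as np

def decode_uncompressed_rle(rle):
    """Decode an uncompressed COCO-style RLE dict into a binary mask.

    COCO stores runs in column-major (Fortran) order, starting with a
    run of zeros; 'size' is [height, width].
    """
    height, width = rle["size"]
    flat = np.zeros(height * width, dtype=np.uint8)
    pos, value = 0, 0
    for run in rle["counts"]:
        flat[pos:pos + run] = value
        pos += run
        value = 1 - value
    return flat.reshape((height, width), order="F")

# Toy example: a 2x3 mask whose middle column is foreground.
mask = decode_uncompressed_rle({"size": [2, 3], "counts": [2, 2, 2]})
```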

Animal pose

The pose-estimation subset consists of around 5k real and a large set of synthetically generated images, each annotated with up to 20 keypoints per individual (only visible keypoints are provided). The real images primarily capture side-view walking postures, reflecting the natural bias of camera trap data. Only images with a single visible individual are included, and annotations are provided in a standardized COCO format.
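The per-image pose dict can be flattened into COCO's [x, y, v] keypoint triplets for use with standard pose toolkits. A minimal sketch; the keypoint names below are illustrative placeholders, not the dataset's actual 20-point skeleton definition:

```python
# Illustrative keypoint names only -- the dataset defines its own
# 20-point skeleton; consult the released annotation files.
KEYPOINT_ORDER = ["nose", "left_ear", "right_ear"]

def pose_dict_to_coco(pose, keypoint_order):
    """Flatten a {name: [x, y]} pose dict into a COCO [x, y, v] list.

    Visible keypoints get visibility v=2; missing (invisible) keypoints
    become (0, 0, 0), following the COCO convention.
    """
    triplets = []
    for name in keypoint_order:
        if name in pose:
            x, y = pose[name]
            triplets.extend([x, y, 2])
        else:
            triplets.extend([0, 0, 0])
    return triplets

coco_keypoints = pose_dict_to_coco({"nose": [10, 20]}, KEYPOINT_ORDER)
```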

Predefined Splits

To support robust evaluation under real-world constraints, the CzechLynx dataset22 is partitioned into three distinct data splits: one geo-aware and two time-aware. Each split is designed to test model generalization across spatial or temporal domains and comes with a fixed training/test partition that reflects practical conservation and ecological-monitoring scenarios. A summary of image counts and individual identities per split is presented in Table 6.

Table 6.

Number of identities, locations, sites, and images across three provided splits; two time-aware and one geo-aware. All splits were created to prevent training-to-test data leakage, i.e., samples that are close in time and/or location will never appear in both the training and test subsets. In the open-set split, 44 new identities appear in the test set.

Split               # of images        # of identities    # of sites         # of locations
                    training / test    training / test    training / test    training / test
Geo-aware-open      21,763 / 17,997    224 / 95           298 / 361          47 / 39
Time-aware-open     27,587 / 12,173    275 / 126          565 / 313          82 / 63
Time-aware-closed   27,836 / 11,924    319 / 319          603 / 464          83 / 77

Time-aware closed-set split

In this split, data are divided by time, and the same individuals appear in both the training and test sets. Although this setup is less ecologically realistic, since wild populations are rarely closed, it provides a controlled evaluation scenario comparable to standard machine-learning benchmarks. The data are separated using clear time cutoffs: no samples captured within the same encounter or in temporally neighboring periods appear in both the training and test sets, which prevents train-to-test data leakage.

Time-aware open-set split

In this split, data are also divided by time, with most individuals in the test set (82 of 126) also present in the training set. However, the test images come from later time periods, separated from the training data by season. This setup reflects realistic long-term monitoring scenarios in which populations naturally change over time (e.g., new individuals are born, some die, and others age or change appearance). The split therefore evaluates a model's ability to generalize across temporal variation and population drift. A clean time-cutoff strategy prevents data leakage, ensuring that no images from neighboring seasons appear in both the training and test sets.

Geo-aware split

This split is designed to evaluate spatial generalization. Training data come from the Western Carpathians, while test data come from the geographically distinct southwest Bohemia. The individuals in the training and test sets are completely disjoint, so no data leakage can occur. This setup reflects an ecological scenario in which re-identification models are deployed in a new area without locally labeled data, testing generalization and transferability to different populations of the same species observed in other regions.
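Applying any of the predefined splits then reduces to filtering the metadata by the corresponding split column. A sketch using pandas, where the column names ('identity', 'geo_aware_split') are illustrative; consult the released metadata files for the exact schema:

```python
import pandas as pd

def split_frame(df, column):
    """Partition metadata rows by one of the predefined split columns."""
    return df[df[column] == "train"], df[df[column] == "test"]

# Toy metadata mimicking the released schema (column names assumed).
meta = pd.DataFrame({
    "image": ["a.jpg", "b.jpg", "c.jpg"],
    "identity": ["lynx_01", "lynx_02", "lynx_03"],
    "geo_aware_split": ["train", "train", "test"],
})
train, test = split_frame(meta, "geo_aware_split")

# For the geo-aware split, identities must be disjoint across subsets.
assert set(train["identity"]).isdisjoint(set(test["identity"]))
```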

Technical Validation

To ensure the technical quality of the CzechLynx dataset22, we performed a series of validation steps focused on the consistency, accuracy, and usefulness of the data for both computer vision and ecological research.

Annotation Quality Control

The dataset was annotated using standard tools such as CVAT. To reduce labeling errors, a dual-pass validation strategy was used. First, annotations were created or verified by trained annotators. Then, a second round of validation was performed by a separate team to ensure consistency in identity assignment, keypoint placement, and segmentation masks.

Data Consistency and Identity Validation

The subset for individual identification includes only images where individual identity could be confirmed by at least two independent observers, based on distinct coat patterns on limbs and flanks. Any ambiguous cases were excluded. This process ensured a high level of reliability in the identity labels across the dataset.

Synthetic Data Comparison

To address pose diversity limitations, a synthetic subset was generated using a 3D model and the Unity engine. We performed side-by-side comparisons of real and synthetic images (see Fig. 4) and confirmed that the synthetic data closely mirrors real-world camera trap imagery. The synthetic samples also include similar metadata and annotations, such as keypoints and segmentation masks.

Data Distribution and Coverage

The dataset spans more than 15 years and covers 319 individual lynx across two populations and multiple regions. Images were collected from 659 sites, with the majority sampled systematically in a 10 × 10 km grid. This ensures representative coverage of habitats and individual variation. The dataset also includes time- and geographically-aware splits, enabling robust open- and closed-set evaluation.

Acknowledgements

This research was supported by the Technology Agency of the Czech Republic, project No. SS05010008. We would like to express our gratitude to Peter Drengubiak, Martina Dušková, Šárka Frýbová, Martin Gendiar, Beňadik Machciník, Michal Králik, Michal Kudlák, Leona Marčáková, Martin Špilák, Zdeněk Tyller, Gabriela Váňová, and the dedicated volunteers of the Carnivore Tracking Project for their help with data collection and fieldwork.

Author contributions

Conceptualization and Methodology: Lukas Picek. Software: Lukas Picek, Miroslav Jirik, Jakub Straka, and Vojtech Cermak. Data Acquisition: Miroslav Kutal, Elisa Belotti, Luděk Bufka, Martin Duľa, Rostislav Dvořák, Michal Bojda, Václav Kocourek, Josefa Krausová, Jiří Labuda, Luděk Toman, and Martin Váňa. Data Curation: Lukas Picek, Miroslav Jirik, and Jakub Straka. Writing - Original Draft: Lukas Picek, Miroslav Kutal, Martin Duľa, Miroslav Jirik, Jakub Straka, Vojtech Cermak, Elisa Belotti, and Josefa Krausová. Writing - Review & Editing: All authors read and approved the final manuscript.

Data availability

The full dataset is publicly available on Zenodo. A mirrored copy, with example notebooks, is also hosted on Kaggle.

Code availability

Code to load and work with the CzechLynx dataset22, including baseline training and evaluation scripts, is available in the open-source WildlifeDatasets repository. Executable baseline notebooks are provided on Kaggle in the Code section associated with the dataset. The Unity-based pipeline and diffusion-based tools used to generate the synthetic images and annotations are available in a separate repository, WildlifeSynthetic.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Rabinowitz, A. & Zeller, K. A range-wide model of landscape connectivity and conservation for the jaguar, Panthera onca. Biological Conservation 143, 939–945 (2010).
  • 2. Kubala, J. et al. Factors shaping home ranges of Eurasian lynx (Lynx lynx) in the Western Carpathians. Scientific Reports 14, 21600 (2024).
  • 3. Ripari, L. et al. Human disturbance is the most limiting factor driving habitat selection of a large carnivore throughout Continental Europe. Biological Conservation 266, 109446 (2022).
  • 4. Oeser, J. et al. Prerequisites for coexistence: human pressure and refuge habitat availability shape continental-scale habitat use patterns of a large carnivore. Landscape Ecology 38, 1713–1728 (2023).
  • 5. Gajdárová, B. et al. Long-distance Eurasian lynx dispersal - a prospect for connecting native and reintroduced populations in Central Europe. Conservation Genetics 22, 1–11, 10.1007/s10592-021-01363-0 (2021).
  • 6. Weingarth, K., Heibl, C., Knauer, F., Zimmermann, F., Bufka, L. & Heurich, M. First estimation of Eurasian lynx (Lynx lynx) abundance and density using digital cameras and capture-recapture techniques in a German national park. Animal Biodiversity and Conservation 35, 197–207 (2012).
  • 7. Pesenti, E. & Zimmermann, F. Density estimations of the Eurasian lynx (Lynx lynx) in the Swiss Alps. Journal of Mammalogy 94, 73–81 (2013).
  • 8. Duľa, M. et al. Multi-seasonal systematic camera-trapping reveals fluctuating densities and high turnover rates of Carpathian lynx on the western edge of its native range. Scientific Reports 11, 9236, 10.1038/s41598-021-88348-8 (2021).
  • 9. Palmero, S. et al. Demography of a Eurasian lynx (Lynx lynx) population within a strictly protected area in Central Europe. Scientific Reports 11, 19868 (2021).
  • 10. Norouzzadeh, M. et al. Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning. Proceedings of the National Academy of Sciences 115, E5716–E5725 (2018).
  • 11. Schneider, S., Taylor, G., Linquist, S. & Kremer, S. Past, present and future approaches using computer vision for animal re-identification from camera trap data. Methods in Ecology and Evolution 10, 461–470 (2019).
  • 12. Straka, J., Hruz, M. & Picek, L. The Hitchhiker's Guide to Endangered Species Pose Estimation. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 50–59 (2024).
  • 13. Wang, J. et al. Deep high-resolution representation learning for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 43, 3349–3364 (2020).
  • 14. Picek, L., Neumann, L. & Matas, J. Animal identification with independent foreground and background modeling. DAGM German Conference on Pattern Recognition, pp. 241–257 (2024).
  • 15. Adam, L., Papafitsoros, K., Kovář, R., Čermák, V. & Picek, L. Overview of AnimalCLEF 2025: recognizing individual animals in images. Working Notes of CLEF (2025).
  • 16. Picek, L. et al. Overview of LifeCLEF 2025: challenges on species presence prediction and identification, and individual animal identification. International Conference of the Cross-Language Evaluation Forum for European Languages, pp. 338–362 (2025).
  • 17. Cermak, V., Picek, L., Adam, L. & Papafitsoros, K. WildlifeDatasets: An open-source toolkit for animal re-identification. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 5953–5963 (2024).
  • 18. Li, S., Li, J., Tang, H., Qian, R. & Lin, W. ATRW: A Benchmark for Amur Tiger Re-identification in the Wild. Proceedings of the 28th ACM International Conference on Multimedia, pp. 2590–2598, 10.1145/3394171.3413569 (2020).
  • 19. Trust, B. Panthera pardus CSV custom export. https://africancarnivore.wildbook.org/ (2022).
  • 20. Dlamini, N. & Zyl, T. Automated Identification of Individuals in Wildlife Population Using Siamese Neural Networks. 2020 7th International Conference on Soft Computing & Machine Intelligence (ISCMI), pp. 224–228, https://ieeexplore.ieee.org/document/9311574 (2020).
  • 21. Zuerl, M. et al. PolarBearVidID: A Video-Based Re-Identification Benchmark Dataset for Polar Bears. Animals 13, 801 (2023).
  • 22. Picek, L. et al. CzechLynx Dataset (v1.0). Zenodo, 10.5281/zenodo.17592004 (2025).
  • 23. Duľa, M. et al. The first insight into hunting and feeding behaviour of the Eurasian lynx in the Western Carpathians. Mammal Research 68, 237–242 (2023).
  • 24. Weingarth, K. et al. First estimation of Eurasian lynx (Lynx lynx) abundance and density using digital cameras and capture-recapture techniques in a German national park. Animal Biodiversity and Conservation 35, 197–207 (2012).
  • 25. Wölfl, S. et al. Lynx Monitoring Report for the Bohemian-Bavarian-Austrian Lynx Population in 2018/2019. Report prepared within the 3Lynx Project (2020).
  • 26. Choo, Y. et al. Best practices for reporting individual identification using camera trap photographs. Global Ecology and Conservation 24, e01294 (2020).
  • 27. Beery, S., Morris, D. & Yang, S. Efficient Pipeline for Camera Trap Image Review. arXiv preprint arXiv:1907.06772 (2019).
  • 28. Fennell, M., Beirne, C. & Burton, A. Use of object detection in camera trap image identification: assessing a method to rapidly and accurately classify human and animal detections for research and application in recreation ecology. Global Ecology and Conservation 35, e02104 (2022).
  • 29. Leorna, S. & Brinkman, T. Human vs. machine: detecting wildlife in camera trap images. Ecological Informatics 72, 101876 (2022).
  • 30. Henrich, M. et al. A semi-automated camera trap distance sampling approach for population density estimation. Remote Sensing in Ecology and Conservation 10, 156–171 (2024).
  • 31. Kirillov, A. et al. Segment Anything. Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4015–4026 (2023).
  • 32. Cao, J. et al. Cross-domain adaptation for animal pose estimation. Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9498–9507 (2019).
  • 33. Peng, X., Sun, B., Ali, K. & Saenko, K. Learning deep object detectors from 3D models. Proceedings of the IEEE International Conference on Computer Vision, pp. 1278–1286 (2015).
  • 34. Azizi, S., Kornblith, S., Saharia, C., Norouzi, M. & Fleet, D. Synthetic Data from Diffusion Models Improves ImageNet Classification. Transactions on Machine Learning Research, https://openreview.net/forum?id=DlRsoxjyPm (2023).
  • 35. Jiang, L., Liu, S., Bai, X. & Ostadabbas, S. Prior-Aware Synthetic Data to the Rescue: Animal Pose Estimation with Very Limited Real Data. 33rd British Machine Vision Conference (BMVC 2022), London, UK, https://bmvc2022.mpi-inf.mpg.de/0868.pdf (2022).
  • 36. Shooter, M., Malleson, C. & Hilton, A. SyDog: A Synthetic Dog Dataset for Improved 2D Pose Estimation (2021).
  • 37. Bonetto, E. & Ahmad, A. Synthetic data-based detection of zebras in drone imagery. 2023 European Conference on Mobile Robots (ECMR), pp. 1–8 (2023).
  • 38. Youwang, K., Oh, T. & Pons-Moll, G. Paint-it: Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2024).
  • 39. Chen, D., Siddiqui, Y., Lee, H., Tulyakov, S. & Nießner, M. Text2Tex: Text-driven texture synthesis via diffusion models. Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 18558–18568 (2023).
  • 40. Clapham, M., Miller, E., Nguyen, M. & Darimont, C. Automated facial recognition for wildlife that lack unique markings: a deep learning approach for brown bears. Ecology and Evolution 10, 12883–12892, 10.1002/ece3.6840 (2020).
  • 41. Moreira, T., Perez, M., Werneck, R. & Valle, E. Where is my puppy? Retrieving lost dogs by facial features. Multimedia Tools and Applications 76, 15325–15340, 10.1007/s11042-016-3824-1 (2017).
  • 42. Timm, M., Maji, S. & Fuller, T. Large-scale ecological analyses of animals in the wild using computer vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 1896–1898 (2018).
  • 43. Thornton, D. & Pekins, C. Spatially explicit capture-recapture analysis of bobcat (Lynx rufus) density: implications for mesocarnivore monitoring. Wildlife Research 42, 394–404 (2015).
  • 44. Prop, J., Staverløkk, A. & Moe, B. Identifying individual polar bears at safe distances: a test with captive animals. PLoS ONE 15, e0228991, 10.1371/journal.pone.0228991 (2020).
  • 45. Nipko, R., Holcombe, B. & Kelly, M. Identifying individual jaguars and ocelots via pattern-recognition software: comparing HotSpotter and Wild-ID. Wildlife Society Bulletin 44, 424–433, 10.1002/wsb.1086 (2020).
  • 46. Kelly, M. Computer-aided photograph matching in studies using individual identification: an example from Serengeti cheetahs. Journal of Mammalogy 82, 440–449 (2001).
  • 47. Anderson, C., Da Vitoria Lobo, N., Roth, J. & Waterman, J. Computer-aided photo-identification system with an application to polar bears based on whisker spot patterns. Journal of Mammalogy 91, 1350–1359 (2010).



Articles from Scientific Data are provided here courtesy of Nature Publishing Group
