Abstract
Background
Phenotypic characterization of mature soybean pods is a crucial aspect of breeding programs, yet efficiently obtaining accurate pod phenotypic parameters remains a major challenge. Recent advances in deep learning, particularly in keypoint detection models, have introduced innovative methods for pod phenotype extraction. However, precise identification and analysis of fine-scale phenotypic traits in soybean pods remain challenging in current research.
Results
We propose Pod-pose, an innovative top-down keypoint detection model for precise soybean pod phenotyping that adapts human pose estimation techniques to plant phenotyping. Specifically, Pod-pose integrates the architectural strengths of various advanced YOLO (You Only Look Once) models through bottleneck structure optimization and positional feature enhancement to achieve superior detection accuracy. Furthermore, we implemented a two-stage detection method augmented with transfer learning, which not only reduces training complexity but also significantly enhances the model's performance. Extensive evaluation on our custom-built dataset demonstrated Pod-pose's superior performance, with the X variant achieving an Average Precision of 0.912 at an IoU threshold of 0.5 (AP@IoU = 0.5). Notably, four critical pod-related phenotypic traits were successfully quantified: pod length, bending length, curvature, and inflection point width.
Conclusions
This study establishes Pod-pose as a viable solution for pod phenotyping, with potential applications in soybean breeding optimization.
Keywords: Soybean pod, Deep learning, Keypoint detection, Digital breeding
Introduction
Soybeans are among the world's most important food crops, forming the foundation of the global food and agricultural industries [1]. Advanced plant phenotyping technology is critically needed for soybean breeding, as accurate and efficient phenotypic measurements enhance genetic insights and significantly increase genetic gains, enabling the selection and breeding of high-yielding genotypes [2]. Among various phenotypic traits, pod-related characteristics play a pivotal role in material selection, yield prediction, and breeding strategy optimization [3, 4]. Previous studies have predominantly employed image segmentation techniques to calculate traits such as the number of pods per plant, pod size, and pod color. However, these methods often overlook finer pod-related phenotypic parameters, including pod length, bending length, inflection point width, and curvature [5–7]. Thus, there is a pressing need to develop a fast and accurate method for measuring soybean pod phenotypes.
In recent years, rapid advancements in intelligent agriculture have increased expectations for the application of deep learning (DL) techniques in soybean phenotyping. For instance, convolutional neural networks (CNNs) have demonstrated effectiveness in soybean feature identification, facilitating accurate localization of key phenotypic traits [8]. The integration of rotational object detection and region-based search algorithms has demonstrated robust performance in stem node and branch identification, supporting reliable plant skeleton reconstruction essential for accurate growth modeling [9]. The SPP-extractor algorithm has enhanced the capability of the YOLOv5s model to recognize pods and stems, enabling automated phenotyping extraction in densely planted soybean crops [10]. Soybean pod counting challenges in dense populations were addressed through a novel DARFP-SD model, which enhances feature learning while handling pod occlusion and uneven distribution [11]. The FEI-YOLO model has optimized the recognition of different pod types, thereby improving overall detection efficiency and accuracy [12]. Additionally, a soybean stem and leaf recognition method based on the DFSP algorithm, combined with the ISS-CPD-ICP algorithm, has enabled efficient 3D reconstruction of the soybean canopy, providing valuable insights into soybean growth status [13]. The DSBEAN framework was designed to tackle complex soybean phenotype analysis by integrating deep learning approaches [14].
However, there has been limited exploration of DL algorithms for measuring soybean pod-related phenotypes, with fewer than ten published studies addressing this challenge. One study advanced Mask R-CNN to achieve annotation-free segmentation and morphological analysis of densely clustered soybean pods, overcoming key challenges in agricultural image analysis [15]. Another study developed an FPN-based SPM-IS algorithm for automated soybean phenotyping, addressing manual measurement constraints in breeding research [16]. An enhanced YOLOv5 model was developed for accurate soybean pod detection and yield estimation, addressing real-time monitoring and intelligent breeding challenges [6]. Additionally, four YOLOv8-based models were utilized to segment mature soybean plants, identify pods, determine the number of soybeans in each pod, and derive soybean phenotypes [17]. A bottom-up model, DEKR-SPrior, was introduced to address the challenge of accurately phenotyping densely packed and overlapping soybean pods and seeds in situ [18]. Collectively, these studies highlight the extensive application and substantial potential of DL methodologies in advancing pod phenotype research.
While substantial research efforts have been dedicated to pod phenotypic characterization, current segmentation-dependent deep learning approaches often face inherent limitations in measurement accuracy and spatial precision, particularly in complex pod arrangements [19–21]. Recent studies have demonstrated that keypoint detection-based approaches offer enhanced measurement precision and localization accuracy [16, 18]. Keypoints in image analysis represent both spatial locations and contextual relationships, capturing fixed regions and neighborhood characteristics essential for phenotypic analysis [22]. This technique has demonstrated remarkable success in crop phenotyping, with established applications across staple crops including rice [23, 24], wheat [25, 26], maize [27, 28], and soybean [29, 30]. Keypoint detection improves localization precision and effectively captures subtle structural variations critical for phenotype analysis. Its widespread adoption is driven by a well-established research foundation and readily available technological frameworks. In particular, the YOLO series [31–37] has gained prominence for optimizing keypoint detection, enhancing both real-time performance and accuracy, and solidifying its role as a vital tool in this field.
However, directly applying the YOLO series to keypoint detection in mature soybean pods often yields suboptimal performance due to several challenges: (i) Variability in pod orientations complicates accurate localization and identification; (ii) The high color similarity between mature soybean plants and pods increases susceptibility to background interference, reducing detection accuracy; (iii) The dense arrangement of pods on the plant heightens the risk of misidentification, making it difficult to distinguish individual pods; (iv) Existing YOLO models may not effectively utilize contextual information to improve recognition accuracy in complex scenarios [38, 39]. Therefore, developing optimized keypoint detection methods is essential for accurately measuring mature pod phenotypes.
Drawing inspiration from previous studies [40–43], a top-down keypoint detection framework is developed specifically for soybean pod analysis, providing a robust foundation for phenotypic characterization. Through comprehensive evaluation of state-of-the-art detection models and systematic analysis of various YOLO architectures, we developed Pod-pose, a novel model that combines the strengths of YOLOv3 [32] and YOLOv8 [36] to address the limitations of single-model approaches. This model aims to enhance the accuracy and efficiency of keypoint detection, providing new technological tools for pod keypoint detection and phenotype research.
Materials and methods
This study achieves automated extraction of fine-grained pod phenotypic traits through precise keypoint localization and detection in soybean pods, with the technical workflow encompassing critical stages of data acquisition, model development, and performance evaluation, as illustrated in Fig. 1.
Fig. 1.
The experimental procedure of the Pod-pose method. A Training with separate object and keypoint detection heads. B Detecting keypoints using the trained model. C Identifying phenotypic traits based on detected keypoints
Step A: A top-down approach is used to train the Pod-pose network, integrating both the object detection head and the keypoint detection head within a two-stage training framework. First, the soybean pod detection network is trained. Then, a transfer learning strategy is employed to transfer the optimal model parameters from the pod detection task to the keypoint detection task, improving both convergence speed and keypoint detection accuracy.
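This weight transfer can be illustrated with a minimal PyTorch sketch. The toy modules below are illustrative stand-ins rather than the actual Pod-pose code; the point is that loading the detector checkpoint with `strict=False` copies every shared tensor while leaving the new keypoint head randomly initialized.

```python
import torch.nn as nn

# Illustrative sketch of the Step A transfer-learning strategy (toy modules,
# not the authors' implementation): a backbone trained for pod detection
# initializes the keypoint model; the pose head keeps its random init.

def make_backbone():
    return nn.Sequential(nn.Conv2d(3, 16, 3, 2, 1), nn.SiLU(),
                         nn.Conv2d(16, 32, 3, 2, 1), nn.SiLU())

class DetectModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = make_backbone()
        self.det_head = nn.Conv2d(32, 5, 1)             # box (4) + objectness (1)

class PoseModel(nn.Module):
    def __init__(self, num_kpts=4):
        super().__init__()
        self.backbone = make_backbone()
        self.det_head = nn.Conv2d(32, 5, 1)
        self.kpt_head = nn.Conv2d(32, num_kpts * 3, 1)  # (x, y, visibility)

detector = DetectModel()
# ... stage 1: train `detector` on the pod detection task ...
pose = PoseModel()
missing, unexpected = pose.load_state_dict(detector.state_dict(), strict=False)
print("freshly initialized tensors:", missing)          # only kpt_head.* remain
```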
Step B: The Pod-pose network trained in Step A is used to detect and localize keypoints on soybean pods. An RGB image of mature soybean plants is fed into the model, where the detection head of the Pod-pose network extracts the region of interest (ROI) for the pods. The detected pod ROI is then processed by the Pod-pose network, which is reconfigured with a keypoint detection head and optimized parameters to achieve precise keypoint detection and localization.
Step C: A keypoint-driven phenotypic identification algorithm is developed based on the extracted keypoint coordinates of soybean pods. This algorithm extracts four key phenotypic traits: pod length, bending length, curvature, and inflection point width. To ensure result accuracy and reliability, statistical metrics such as the Pearson correlation coefficient (R) are used to quantify the relationships between these phenotypic traits and relevant variables, providing a scientific basis for subsequent breeding research.
Image acquisition and pre-processing
The training of DL networks relies on large-scale datasets to mitigate overfitting and underfitting, ensuring optimal model performance. However, acquiring and annotating datasets is labor-intensive and time-consuming. Therefore, leveraging publicly available datasets to augment the image corpus is an effective strategy. The dataset used in this study comprises publicly available and proprietary datasets, as outlined in Table 1. To ensure robust model generalization and reliable evaluation, the 1320 images were partitioned into training (70%), validation (15%), and test (15%) sets.
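A reproducible split of this kind takes only a few lines; the seed and file names below are illustrative assumptions.

```python
import random

# Illustrative 70/15/15 split of the 1320 images; a fixed seed keeps the
# partition reproducible across experiments. File names are placeholders.
random.seed(0)
images = [f"img_{i:04d}.jpg" for i in range(1320)]
random.shuffle(images)

n_train = int(0.70 * len(images))            # 924 images
n_val = int(0.15 * len(images))              # 198 images
train = images[:n_train]
val = images[n_train:n_train + n_val]
test = images[n_train + n_val:]              # remaining 198 images
print(len(train), len(val), len(test))       # 924 198 198
```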
Table 1.
Composition of the datasets
| Categories | DataSource | Image number | Pod number | Keypoint number |
|---|---|---|---|---|
| Public datasets | https://www.kaggle.com/datasets/soberguo/soybeannode | 820 | 9518 | 38072 |
| Custom datasets | Dongying experimental base, Shandong Province (Latitude and Longitude: 37.31 N, 118.65 E) | 500 | 6070 | 24280 |
| Total | | 1320 | 15588 | 62352 |
For the proprietary dataset, the experimental subjects included the soybean varieties “QiHuang34” and “QiHuang641,” along with their derivatives. These varieties were cultivated at the Shandong Dongying Guangrao National Saline-Alkali Land Comprehensive Utilization Technology Innovation Center, located at coordinates 118.65°E, 37.31°N and 118.49°E, 37.83°N. This center is located in an area with mildly saline-alkali soil, where the pH value is 8.62 (though it may fluctuate with precipitation). The planting density was carefully controlled at 20,000 plants per acre, with two seeds sown per hole, and row and plant spacings maintained at 50 and 13 cm, respectively. To ensure data reliability, three replicates were used for each variety. Sowing occurred in mid-May, and harvesting was completed by mid-October.
To ensure compatibility with public datasets, this study adopted the image acquisition protocol described in [8]. RGB images were captured in a 100 × 100 × 100 cm cubic darkroom (Fig. 2B). The darkroom featured a black synthetic exterior and a silver reflective interior to ensure uniform lighting. A Canon EOS 1200D camera was positioned at the top opening, while 480 high-brightness 5500 K LED beads provided illumination. Soybean plants were laid flat and photographed from above to prevent image distortion, with a one-meter ruler placed along the edges for scale. Two images per plant were captured from different angles to document morphological traits. Data augmentation techniques were employed to expand the dataset and enhance model robustness. Augmentation methods such as rotation and flipping (Fig. 2C) were used to generate diverse image samples and enhance model training efficacy. Finally, keypoints on soybean pods in mature soybean images were annotated by trained graduate students within the team using the LabelMe tool, as depicted in Fig. 2D. Each annotation task was independently performed by three researchers. Subsequently, a Python script was used to calculate relevant phenotypic parameters based on the keypoint coordinates, and the average of the three sets of results was taken as the final value to ensure data accuracy and consistency.
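The three-annotator averaging step might look as follows. The LabelMe JSON fields (`shapes`, `label`, `points`, `shape_type`) are the tool's standard format, while the keypoint label names and file layout are assumptions for illustration.

```python
import json
import numpy as np

def load_keypoints(path):
    """Return {label: (x, y)} for the point shapes in one LabelMe file."""
    with open(path) as f:
        data = json.load(f)
    return {s["label"]: np.array(s["points"][0], dtype=float)
            for s in data["shapes"] if s["shape_type"] == "point"}

def average_annotations(paths, labels=("point", "node", "l-point", "r-point")):
    """Average each keypoint over the three annotators' files (the label
    names for the four pod keypoints are assumed for illustration)."""
    per_file = [load_keypoints(p) for p in paths]
    return {lbl: np.mean([kp[lbl] for kp in per_file], axis=0)
            for lbl in labels}

# final = average_annotations(["a1/img_0001.json", "a2/img_0001.json",
#                              "a3/img_0001.json"])
```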
Fig. 2.

Workflow of image acquisition and pre-processing
Pod-pose network
Given the substantial differences in features between soybean pod keypoint detection and human pose keypoint detection, along with the increased difficulty of plant keypoint detection, this study combines YOLOv3 and YOLOv8-pose to present the Pod-pose network. Pod-pose is constructed by integrating the backbone and neck of YOLOv3 with the head of YOLOv8, incorporating key architectural enhancements to optimize soybean pod keypoint detection. Figure 3 illustrates the architecture of the proposed Pod-pose network. The YOLOv3 backbone and neck, built upon Darknet-53, provide robust feature extraction capabilities by effectively capturing multi-scale representations. Meanwhile, the YOLOv8 head leverages an advanced anchor-free mechanism and dynamic label assignment, significantly improving detection accuracy under complex field conditions. These integrations collectively enhance the model's precision and robustness.
Fig. 3.
The architecture of Pod-pose network
The network comprises three main components: the Convolutional Neural Network (CNN) block, the BottleNeck module, and the Sequential BasicBlock. The CNN block systematically extracts and abstracts deep features from the input data through a series of convolutional layers, enabling the model to capture spatial hierarchies and intricate patterns within the data. This hierarchical approach enhances the model's ability to generalize and perform well on diverse tasks. In the Pod-pose network, a total of 14 instances of this structure are employed across different layers to optimize feature extraction and improve overall performance.
The primary function of the BottleNeck module is feature extraction and enhancement. It processes the input feature maps through convolution operations and incorporates residual connections to ensure efficient information propagation between network layers. Specifically, the input feature maps undergo a series of convolutions, normalization, and activation operations, with the final output feature maps merged with the directly passed feature maps in the concatenation block (Concat). This structure significantly reduces computational load and memory usage, enabling deep convolutional neural networks to effectively handle large-scale datasets. In Pod-pose, the BottleNeck module has been optimized to improve feature extraction and enhance the model's ability to detect objects at various scales. The module is employed seven times in the network architecture, enhancing overall performance and efficiency to support efficient object detection tasks.
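One plausible PyTorch reading of this module is sketched below; the channel squeeze ratio and SiLU activation are assumptions, but the structure follows the description above: a reduced-width convolutional branch is merged with the directly passed input via Concat, then fused back to the original width.

```python
import torch
import torch.nn as nn

class ConvBNAct(nn.Module):
    """Conv -> BatchNorm -> SiLU, the basic CNN unit assumed throughout."""
    def __init__(self, c_in, c_out, k=3, s=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class BottleNeck(nn.Module):
    """Sketch of the described BottleNeck: a reduced-width conv branch whose
    output is concatenated with the directly passed input (Concat merge),
    then fused back to the target channel count."""
    def __init__(self, c, reduction=2):
        super().__init__()
        hidden = c // reduction                  # channel squeeze cuts FLOPs
        self.branch = nn.Sequential(ConvBNAct(c, hidden, k=1),
                                    ConvBNAct(hidden, hidden, k=3))
        self.fuse = ConvBNAct(c + hidden, c, k=1)

    def forward(self, x):
        return self.fuse(torch.cat([x, self.branch(x)], dim=1))
```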
The Sequential BasicBlock is a structure composed of multiple sequentially arranged basic blocks (BottleNeck), designed to enhance the model's expressive and generalization capabilities, thereby improving overall performance. Each basic block introduces deep non-linear transformations during feature extraction, enabling the model to learn more complex feature representations. By stacking multiple BottleNeck modules, the Sequential BasicBlock forms a deeper network structure that captures multi-level features within the data. In the YOLO architecture, the Sequential BasicBlock not only optimizes computational efficiency but also enhances the model's ability to detect objects across different scales and complex backgrounds. This structure allows for flexible adjustments in the number and configuration of basic blocks to suit specific task requirements. Additionally, the design of the Sequential BasicBlock facilitates more efficient information flow within the network, mitigating the vanishing gradient problem and improving the stability of the training process. In Pod-pose, the Sequential BasicBlock further advances real-time object detection performance, enabling the model to adapt to various application scenarios with greater flexibility.
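Continuing the previous sketch, the Sequential BasicBlock then reduces to a configurable stack of such bottlenecks:

```python
# Reuses torch, nn, and BottleNeck from the previous snippet.
class SequentialBasicBlock(nn.Module):
    """Sketch: n BottleNeck blocks stacked in sequence; the depth n is a
    per-stage design choice."""
    def __init__(self, c, n=2):
        super().__init__()
        self.blocks = nn.Sequential(*[BottleNeck(c) for _ in range(n)])

    def forward(self, x):
        return self.blocks(x)

# A 64-channel stage with two stacked bottlenecks preserves spatial shape.
stage = SequentialBasicBlock(64, n=2)
print(stage(torch.randn(1, 64, 80, 80)).shape)   # torch.Size([1, 64, 80, 80])
```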
The Bottleneck structure significantly improves computational efficiency by reducing the number of parameters and computational complexity, while the Sequential BasicBlock enhances the model's expressive power by strengthening feature propagation. These improvements have been validated in multiple related studies (e.g., [44, 45]) and have demonstrated significant advantages in similar tasks. Therefore, this study adopts these enhanced architectures to ensure performance while improving the model's practicality.
The Head section of the Pod-pose network is designed to support both object detection and keypoint detection. Unlike previous YOLO versions, Pod-pose employs an anchor-free detection model, directly predicting object center positions without relying on predefined anchor box offsets. This approach reduces the number of bounding box predictions and simplifies post-processing, thereby improving detection efficiency. Additionally, the Pose class, which inherits from the Detect class, is specifically designed for keypoint detection. It first computes keypoint predictions and then invokes the forward propagation method of the parent Detect class to obtain the fundamental detection results. Finally, the detection outputs are integrated with keypoint predictions, enabling more precise object detection and keypoint localization. This structural design enhances the adaptability and accuracy of Pod-pose across diverse tasks.
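The inheritance pattern described here can be sketched as follows; the tensor layouts are simplified stand-ins, but the control flow mirrors the text: keypoints are computed first, the parent's forward produces the detection outputs, and the two are concatenated.

```python
import torch
import torch.nn as nn

class Detect(nn.Module):
    """Simplified anchor-free detection head: per-location box + class."""
    def __init__(self, ch=64, nc=1):
        super().__init__()
        self.box = nn.Conv2d(ch, 4, 1)    # distances to the four box edges
        self.cls = nn.Conv2d(ch, nc, 1)

    def forward(self, x):
        return torch.cat([self.box(x), self.cls(x)], dim=1)

class Pose(Detect):
    """Keypoint head inheriting from Detect, as described above."""
    def __init__(self, ch=64, nc=1, nk=4):
        super().__init__(ch, nc)
        self.kpt = nn.Conv2d(ch, nk * 3, 1)   # (x, y, visibility) per keypoint

    def forward(self, x):
        kpts = self.kpt(x)                    # 1) keypoint predictions
        det = super().forward(x)              # 2) parent detection outputs
        return torch.cat([det, kpts], dim=1)  # 3) integrated predictions

head = Pose()
print(head(torch.randn(1, 64, 80, 80)).shape)  # torch.Size([1, 17, 80, 80])
```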
Phenotypic identification
This study utilizes keypoint coordinates from the pod keypoint detection model for automatic pod phenotype acquisition, eliminating the need to dismantle soybean plants, reducing manual intervention, and improving convenience and accuracy. Figure 4 is a schematic diagram of soybean pod morphological traits, providing a detailed representation of four key types of phenotypic characteristics based on keypoints. Algorithm 1 calculates soybean pod phenotypic traits from the keypoint coordinates. The algorithm efficiently extracts key phenotypic traits of soybean pods, including length, bending length, curvature, and pod inflection point width, which are critical for soybean breeding. Pod length directly influences yield, as longer pods typically contain more seeds. Bending length reflects pod growth status and lodging resistance, with excessive bending potentially reducing yield. Curvature indicates the pod’s adaptability to environmental stress, with smaller curvature often signifying healthier growth. Pod inflection point width affects mechanical harvesting efficiency and the stability of pod transport. Thus, these phenotypic traits offer essential insights for optimizing breeding strategies and improving soybean production efficiency.
Fig. 4.

Schematic diagram of soybean pod morphological traits. Keypoints include Point, Node, L-point, and R-point
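Based on the keypoint layout in Fig. 4, one plausible formalization of Algorithm 1 is sketched below. The geometric definitions (chord-based bending offset, bend ratio as curvature) are reasonable readings of the trait descriptions in this paper, not the authors' verbatim algorithm.

```python
import numpy as np

def pod_traits(point, node, l_point, r_point):
    """Derive the four traits from one pod's keypoints (Fig. 4 layout).
    `point` and `node` are taken as the two pod endpoints; `l_point` and
    `r_point` as the waist (inflection) keypoints. Definitions are assumed."""
    point, node = np.asarray(point, float), np.asarray(node, float)
    l_point, r_point = np.asarray(l_point, float), np.asarray(r_point, float)

    pod_length = np.linalg.norm(point - node)        # endpoint-to-endpoint
    width = np.linalg.norm(l_point - r_point)        # inflection point width

    # Bending length: perpendicular offset of the farthest inflection point
    # from the chord joining the two endpoints.
    chord = (node - point) / pod_length
    def offset(p):
        v = p - point
        return np.linalg.norm(v - np.dot(v, chord) * chord)
    bending_length = max(offset(l_point), offset(r_point))

    curvature = bending_length / pod_length          # dimensionless bend ratio
    return dict(pod_length=pod_length, inflection_width=width,
                bending_length=bending_length, curvature=curvature)

print(pod_traits((0, 0), (50, 0), (22, 6), (24, -4)))
```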
Evaluation standard
To validate the performance of the Pod-pose model, this study assesses the proposed model from two aspects: performance and complexity. Performance metrics include precision (P), recall (R), and Average Precision (AP) at both 50% and 95% Intersection over Union (IoU) thresholds, which measure the model's accuracy across various levels. Complexity is evaluated by the number of model parameters and floating-point operations (FLOPs), reflecting the model's computational requirements. Integrating these evaluation results aids in gaining a deeper understanding of the efficiency and accuracy of the DL model in soybean pod detection. The formulas are presented in (1) and (2):
$$O = \frac{I - K}{S} + 1 \tag{1}$$

$$\mathrm{FLOPs} = M \times M \times K \times K \times C_{in} \times C_{out} \tag{2}$$

where $I$ is the input size, $K$ is the size of the convolution kernel, $O$ is the output size, $M$ is the size of the output feature map, $C_{in}$ is the number of input channels, $S$ is the stride, and $C_{out}$ is the number of output channels.
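As a concrete check of formulas (1) and (2), the snippet below evaluates them for a single convolution layer with illustrative dimensions.

```python
# Worked example of (1)-(2) for one conv layer on a 640-pixel input,
# stride 1, no padding, 3 -> 16 channels (illustrative dimensions).
I, K, S, C_in, C_out = 640, 3, 1, 3, 16

M = (I - K) // S + 1                    # (1) output feature map size: 638
flops = M * M * K * K * C_in * C_out    # (2) multiply-accumulates: ~1.76e8
params = K * K * C_in * C_out           # layer weights (bias omitted): 432
print(M, flops, params)
```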
The mean absolute error (MAE), root mean squared error (RMSE), and the Pearson correlation coefficient (R) were used as the evaluation metrics to assess the agreement between predicted and manually measured phenotypic values. They take the following forms:
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right| \tag{3}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2} \tag{4}$$

$$R = \frac{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)\left(\hat{y}_i - \bar{\hat{y}}\right)}{\sqrt{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2}\ \sqrt{\sum_{i=1}^{N}\left(\hat{y}_i - \bar{\hat{y}}\right)^2}} \tag{5}$$

where $N$ denotes the number of test samples, $y_i$ is the ground-truth value for the $i$th sample, $\hat{y}_i$ is the corresponding predicted value, and $\bar{y}$ and $\bar{\hat{y}}$ are their respective means.
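Metrics (3)-(5) map directly onto a few lines of NumPy, applied to predicted versus manually measured trait values:

```python
import numpy as np

def evaluate(y_true, y_pred):
    """MAE (3), RMSE (4), and Pearson R (5) between ground truth and
    predictions (e.g., pod lengths in mm)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    mae = np.mean(np.abs(y_true - y_pred))             # (3)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))    # (4)
    r = np.corrcoef(y_true, y_pred)[0, 1]              # (5)
    return mae, rmse, r

print(evaluate([42.0, 55.5, 38.2], [40.8, 57.1, 39.0]))
```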
Experimental platform
This study was conducted on a cloud server using the PyTorch (1.11.0) deep learning framework and the Python (3.8) programming language. The experiments were performed on a machine running the Windows 10 64-bit operating system, equipped with an Intel(R) Xeon(R) Gold 5320 CPU @ 2.20GHz (12 vCPUs), an RTX A4000 (16GB) GPU, and 32GB of RAM. The input image size for the model was set to 640 × 640 pixels, and a batch size of 16 was used during training. Pre-trained weights were not utilized, and the model was trained from scratch to ensure it effectively learned the fine-grained features of soybean pods. The training process consisted of 300 epochs, with a learning rate of 0.01, momentum of 0.937, and a weight decay coefficient of 0.0005. The stochastic gradient descent (SGD) optimization strategy was employed throughout the training.
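These hyperparameters map onto an Ultralytics-style training call as sketched below; the model and dataset YAML file names are hypothetical, and the actual Pod-pose code base may differ.

```python
from ultralytics import YOLO

# Sketch of the training configuration above (hypothetical file names).
model = YOLO("pod-pose.yaml")            # build from config, random init
model.train(
    data="soybean-pods.yaml",            # dataset definition (hypothetical)
    imgsz=640, batch=16, epochs=300,
    optimizer="SGD", lr0=0.01, momentum=0.937, weight_decay=0.0005,
    pretrained=False,                    # trained from scratch, per the text
)
```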
Results and analysis
Performance comparison with state-of-the-art methods
As shown in Table 2, our Pod-pose achieves state-of-the-art performance across various model scales. We first compare Pod-pose with the latest model, YOLOv10-pose, across the five variants N/S/M/L/X. Pod-pose achieves AP@0.5 improvements of 0.9%/1.3%/2.2%/2.2%/1.4% on the test set and 0.3%/2.0%/2.6%/2.6%/1.2% on the validation set. Compared with other methods, Pod-pose also exhibits a superior trade-off between accuracy and computational cost. Specifically, for lightweight and small models, Pod-pose-N/S outperforms YOLOv5-N/S by 3.0 and 2.7 AP@0.5_test, respectively. For medium models, Pod-pose-M/L outperforms YOLOv8-M/L by 2.4 and 1.2 AP@0.5_test, respectively, at comparable or lower computational cost. For large models, Pod-pose-X improves on YOLOv9-pose-X by 0.7% AP@0.5_test. Furthermore, compared with RT-DETR, Pod-pose obtains higher accuracy at similar computational cost. These results demonstrate the superiority of Pod-pose as a keypoint detector for soybean pods.
Table 2.
Comparison with state-of-the-art methods
| Method | Variant | #Param. (M) | FLOPs (G) | P_test (%) | R_test (%) | AP@0.5_test (%) | AP@0.95_test (%) | P_val (%) | R_val (%) | AP@0.5_val (%) | AP@0.95_val (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| YOLOv5-pose | N | 2.6 | 7.3 | 81.5 | 82.5 | 84.0 | 77.0 | 80.4 | 86.4 | 85.0 | 78.9 |
| | S | 9.4 | 24.8 | 84.4 | 84.6 | 86.7 | 82.3 | 83.1 | 88.0 | 86.9 | 81.6 |
| | M | 25.7 | 66.2 | 85.0 | 86.0 | 87.6 | 82.8 | 84.4 | 88.0 | 86.9 | 82.8 |
| | L | 54.3 | 138.6 | 85.6 | 86.0 | 87.9 | 82.1 | 84.2 | 89.0 | 87.5 | 82.5 |
| | X | 99.0 | 252.1 | 86.8 | 88.2 | 90.3 | 85.6 | 87.0 | 88.8 | 88.9 | 84.7 |
| YOLOv6-pose | N | 4.3 | 11.9 | 79.0 | 84.6 | 82.3 | 71.4 | 79.4 | 88.4 | 84.0 | 74.7 |
| | S | 16.4 | 44.3 | 82.7 | 85.5 | 85.9 | 78.5 | 84.1 | 86.7 | 86.0 | 79.5 |
| | L | 52.2 | 161.7 | 84.9 | 87.3 | 87.1 | 81.4 | 85.4 | 88.2 | 87.0 | 81.8 |
| | X | 111.2 | 392.2 | 85.0 | 88.0 | 88.2 | 82.3 | 86.7 | 87.4 | 87.2 | 82.0 |
| YOLOv8-pose | N | 3.1 | 8.3 | 78.7 | 79.8 | 81.1 | 70.0 | 80.1 | 87.4 | 83.8 | 75.8 |
| | S | 11.4 | 29.4 | 84.0 | 86.8 | 88.1 | 82.3 | 83.8 | 89.1 | 87.6 | 82.2 |
| | M | 26.5 | 83.7 | 86.0 | 87.9 | 88.6 | 83.2 | 87.2 | 88.1 | 88.1 | 83.6 |
| | L | 44.6 | 176.9 | 86.6 | 89.5 | 89.9 | 85.8 | 87.8 | 89.5 | 89.1 | 85.8 |
| | X | 69.7 | 276.3 | 86.9 | 87.9 | 90.1 | 86.5 | 88.3 | 88.7 | 89.3 | 85.3 |
| YOLOv9-pose | N | 11.6 | 43.7 | 85.7 | 84.8 | 86.9 | 78.1 | 84.0 | 85.9 | 87.1 | 80.9 |
| | S | 13.8 | 53.2 | 84.0 | 85.5 | 87.4 | 81.4 | 84.2 | 86.2 | 87.6 | 82.5 |
| | M | 21.2 | 84.9 | 84.1 | 88.6 | 89.8 | 85.4 | 84.2 | 88.9 | 88.8 | 84.9 |
| | L | 26.2 | 106 | 87.2 | 88.9 | 89.3 | 84.8 | 86.0 | 90.1 | 89.4 | 85.5 |
| | X | 58.6 | 248 | 88.1 | 89.0 | 90.5 | 87.5 | 87.4 | 90.6 | 89.9 | 86.9 |
| YOLOv10-pose | N | 2.7 | 8 | 82.8 | 86.8 | 86.1 | 79.1 | 82.8 | 89.8 | 86.3 | 80.1 |
| | S | 8.8 | 27.8 | 84.7 | 86.1 | 88.1 | 82.4 | 86.3 | 86.8 | 86.9 | 82.1 |
| | M | 13.8 | 66.1 | 84.5 | 87.3 | 88.8 | 83.9 | 87.1 | 86.6 | 87.7 | 83.9 |
| | L | 22.5 | 73.8 | 84.4 | 88.4 | 88.9 | 83.8 | 86.2 | 88.0 | 87.9 | 83.7 |
| | X | 73.6 | 247.9 | 86.5 | 87.7 | 89.8 | 85.0 | 88.1 | 88.5 | 89.3 | 84.9 |
| RT-DETR | N | 8.8 | 28.7 | 84.0 | 87.2 | 88.3 | 82.3 | 86.1 | 87.6 | 87.6 | 82.5 |
| | S | 18.8 | 70.5 | 84.7 | 86.6 | 88.9 | 83.8 | 85.1 | 88.2 | 88.0 | 83.5 |
| | M | 29.3 | 117.3 | 84.9 | 86.2 | 89.0 | 84.0 | 84.4 | 88.2 | 88.0 | 83.6 |
| | L | 45.9 | 194.5 | 86.9 | 87.1 | 89.7 | 84.2 | 87.3 | 88.8 | 88.6 | 84.3 |
| | X | 57.6 | 247.4 | 86.7 | 87.5 | 90.3 | 86.1 | 86.1 | 89.9 | 89.6 | 85.9 |
| **Pod-pose (Ours)** | N | 2.6 | 7.3 | 80.4 | 86.3 | 87.0 | 80.1 | 80.8 | 89.5 | 86.6 | 81.3 |
| | S | 7.6 | 24.7 | 87.2 | 86.8 | 89.4 | 83.2 | 88.6 | 90.7 | 88.9 | 83.5 |
| | M | 28.5 | 81.9 | 87.4 | 89.3 | 91.0 | 86.6 | 88.8 | 90.8 | 90.3 | 86.7 |
| | L | 41.5 | 111.1 | 89.4 | 89.2 | 91.1 | 87.1 | 89.0 | 91.0 | 90.5 | 86.9 |
| | X | 94.8 | 286.1 | 87.2 | 89.9 | 91.2 | 87.2 | 89.1 | 91.4 | 90.5 | 86.6 |

N, S, M, L, and X denote model variants of increasing size and complexity
Boldface highlights the method proposed in this article
We also compare Pod-pose with other models. As shown in Table 2 and Fig. 5, Pod-pose exhibits state-of-the-art performance and efficiency across different model scales, indicating the effectiveness of our architectural designs. Moreover, Pod-pose outperforms other models in various aspects: in both the validation and testing phases, alternative models fail to match Pod-pose's keypoint prediction accuracy. Considering the stringent accuracy requirements for soybean pod keypoint detection applications, we strategically integrated the strengths of the YOLOv3 and YOLOv8 architectures, particularly through the implementation of an optimized Bottleneck structure. This architectural integration resulted in substantial performance gains for Pod-pose, as exemplified by the N variant, which demonstrated a 5.9% improvement in AP@0.5 on the test set and a 2.8% enhancement in AP@0.5 on the validation set compared to YOLOv8-pose.
Fig. 5.
Computational efficiency comparison: GFLOPs (Top) and parameter count (Bottom) trade-offs among different models
To further validate the effectiveness of the Pod-pose method, we conducted a comprehensive evaluation across all epochs using the X variant, comparing the proposed model against state-of-the-art approaches. The results show that Pod-pose demonstrated outstanding performance across multiple epochs, particularly after 100 epochs, where its advantage over other models became more pronounced, as illustrated in Fig. 6. The performance curve indicates that as the number of training epochs increases, the accuracy of the Pod-pose model continues to improve, surpassing other models. This indicates that our proposed model excels in detecting and localizing keypoints of soybean pods. The performance improvement can be attributed to the integration of several advanced techniques in the model architecture. First, the Pod-pose network employs an innovative BottleNeck module, which more effectively focuses on critical regions of soybean pods, enhancing the accuracy of feature representation. Second, the use of transfer learning strategies enables the model to capture both global and local contextual information, thereby improving the robustness of the pod detection task.
Fig. 6.
Performance comparison of Pod-pose and other models across epochs (X variant)
To better evaluate the performance of different network models, this study selected various soybean plant varieties as test samples. As shown in Fig. 7, the results (X Variant) indicate that the Pod-pose model accurately detects all key points of the soybean pods and correctly determines orientation (i.e., whether the pod is facing up or down). Notably, in the last row of images, even when two key points are in close proximity, Pod-pose can still distinguish them precisely and identify their specific positions. Compared to state-of-the-art keypoint detection models, Pod-pose not only achieves higher accuracy but also exhibits exceptional robustness, effectively handling keypoint detection tasks even under complex backgrounds or noisy conditions. This superior performance is reflected in both the precise localization of key points and the overall efficiency of the model. The advantages of Pod-pose in accuracy and efficiency can be attributed to several key factors. First, the innovative design of its architecture optimizes the utilization of key point information. For example, the model effectively integrates global and local contextual information, enhancing its robustness when dealing with complex structures and closely spaced key points. Second, the adoption of a top-down transfer learning strategy accelerates the convergence process, significantly reducing training time.
Fig. 7.
Comparison of different models (X Variant). The white-background sections represent actual detection results, while the blue-background sections highlight instances of inaccurately detected pods
Figure 8 presents the feature visualization and validation results of the Pod-pose-X model on field-collected images. These images were obtained from the QAU Field High-Throughput Phenotyping Platform (QAU Field HTP Platform), a newly developed gantry-based system named TraitDiscover. Spanning 25 by 200 meters and located in the Dongying region of the Yellow River Delta, this platform is the first high-throughput phenotyping platform in China dedicated to saline agriculture. The test images, captured in real and complex field environments, demonstrate the Pod-pose model’s excellent detection performance. It accurately identifies and localizes key points, with predicted positions closely matching the actual ones, highlighting the model’s strong adaptability. Even under challenging natural conditions, such as varying lighting, the model maintains high detection accuracy, further validating its robustness and wide applicability in practical field scenarios.
Fig. 8.

Examples of field-based soybean pod keypoint detection by the proposed Pod-pose-X model. The images in dotted boxes are zoomed-in sections of images inside the solid boxes
In summary, Pod-pose excels in the task of soybean pod keypoint detection with its high accuracy and efficiency, surpassing other existing keypoint detection models. These results demonstrate that Pod-pose provides a powerful tool for the automatic detection of soybean pod keypoints.
Phenotypic identification results
Using the proposed Pod-pose model, keypoint detection and phenotype measurements were performed on RGB images of mature soybean plants to extract pod-related phenotypic parameters. Four pod-related phenotypic parameters were successfully extracted: pod length, inflection point width, bending length, and curvature. To validate the results, 120 pod samples were manually measured, and a correlation analysis was performed between the manual measurements and the algorithm's predictions. The results are visually represented by scatter plots and regression lines, as shown in Fig. 9. The specific analyses are as follows:
Fig. 9.
Correlation analysis between actual and predicted values of soybean pod phenotypic traits. A Pod length; B Inflection point width; C Bending length; D Curvature
Figure 9A illustrates the correlation between actual and predicted pod lengths. As pod length is a critical phenotypic trait in soybeans, the model achieved an MAE of 2.16, an MSE of 7.24, an RMSE of 2.69, and an R of 0.95. These metrics indicate strong agreement between predicted and actual measurements, further validating the model's reliability. Additionally, the pod lengths of the randomly selected soybean plants are predominantly concentrated within the 30–60 mm range, reflecting the biological characteristics of this trait.
Figure 9B presents the correlation between actual and predicted inflection point widths, defined by the distances between the two lowest inflection points and reflecting pod shape characteristics. For this parameter, the model achieved an R of 0.93, demonstrating its effectiveness in capturing variations in inflection point width. This high correlation validates the Pod-pose model's feature extraction capability and underscores the importance of inflection point width as a phenotypic parameter in breeding and genetic research. Furthermore, potential associations between inflection point width and environmental factors (e.g., moisture, nutrient availability) or genetic background could offer valuable insights for future studies.
Figure 9C presents the correlation between actual and predicted bending lengths, defined as the distance between the two endpoints and the most distant inflection point. The model achieved an R of 0.94 for this parameter, demonstrating its effectiveness in estimating bending length. Bending length reflects the physical characteristics of the pods and may influence their biomechanical performance during growth. The model’s high accuracy enables reliable assessment of morphological changes in pods across different genetic types, providing a basis for breeding selection.
Figure 9D illustrates the correlation between actual and predicted curvatures, with an R of 0.94. Curvature is a key parameter for describing pod shape, reflecting growth status and environmental adaptability. The bubble sizes in the figure represent the frequency of occurrences between actual and predicted values, highlighting data distribution patterns. Larger bubbles indicate more common curvature values in actual measurements, which may be associated with the biological characteristics of soybean varieties and their growth environments. By analyzing these curvature data, researchers can gain deeper insights into the influence of pod shape on growth and development, offering valuable guidance for optimizing pod morphological traits.
Overall, these results demonstrate the high effectiveness of the Pod-pose model in accurately measuring key soybean phenotypic traits, highlighting its potential for broad application in agricultural research and breeding programs.
Discussion
Importance of pod phenotype
The significance of soybean pod phenotypes spans agricultural production, genetic research, and market demand. First, pod traits are key determinants of yield, directly influencing seed quality and overall production efficiency [46, 47]. Analyzing pod phenotypes enables the identification of high-yield and high-quality traits, providing a scientific basis for breeding superior cultivars, which enhances farmers' income and promotes sustainable agriculture. In genetic research, pod phenotypes serve as phenotypic manifestations of genetic variation, offering insights into the genetic mechanisms underlying soybean traits [48, 49]. Integrating phenotypic and genomic data facilitates a deeper understanding of soybean genetic diversity, accelerating the selection of desirable traits [50], improving breeding efficiency, and supporting germplasm conservation. Moreover, pod morphology is closely linked to stress resistance, as studies have shown that pod shape and structure significantly influence soybean’s resilience to drought, disease, and pest infestations [51]. Investigating these phenotypes aids in developing stress-tolerant cultivars, ensuring stable yields and strengthening food security. Finally, increasing consumer demand for high-quality soybeans highlights the economic importance of pod characteristics such as size and shape, shaping breeding priorities toward cultivars that align with market preferences. In summary, soybean pod phenotype research is essential for enhancing yield, optimizing breeding strategies, and advancing sustainable agriculture.
Benefits of key-point detection
Existing automated pod phenotyping techniques primarily rely on image segmentation algorithms; however, several challenges remain. Color similarity between pods and other plant structures often leads to segmentation errors, variations in environmental lighting affect image quality and model performance, and significant differences in pod morphology across cultivars and growth conditions hinder model generalization [52–54]. In contrast, keypoint detection technology offers distinct advantages for plant feature recognition. By identifying salient feature points (e.g., pod tips, bases, and inflection points), keypoint detection eliminates the need for pixel-level precision required by traditional segmentation methods, demonstrating greater robustness in complex agricultural environments [55, 56]. This approach has been widely applied in human pose estimation and crop phenotyping, underscoring its potential for high-precision phenotypic measurement [57–59]. In crop research, keypoint detection enables accurate extraction of morphological traits in leaves, stems, and fruits, providing an efficient tool for phenotypic analysis. Beyond improving data accuracy, it facilitates the measurement of novel phenotypic traits at finer granularity—such as pod curvature, bending angles, and spatial distribution—thereby enhancing genome-phenome association studies. Furthermore, integrating keypoint detection with multi-view imaging or 3D reconstruction expands its applications in complex phenotyping [60, 61], offering comprehensive data support for precision agriculture and digital breeding.
Significance of pod-pose method
Despite significant advancements in computer vision, its application in soybean pod phenotyping remains limited. Existing methods, ranging from traditional image processing to deep learning-based approaches, face challenges in both efficiency and accuracy. Although state-of-the-art YOLO variants demonstrate excellent performance in general object detection [62–64], they encounter specific limitations in soybean pod phenotyping. Their single-stage detection paradigm struggles to capture fine-grained individual features under occlusion and effectively handle densely packed pods. Additionally, the global feature extraction approach fails to fully leverage contextual information, leading to frequent misdetections, particularly in cases of occlusion or clustering. These limitations are quantitatively validated through experimental comparisons presented in the “Performance Comparison with State-of-the-Art Methods” section, highlighting the challenges existing YOLO variants face in adapting to soybean pod keypoint detection tasks. To address these issues, we propose an innovative Pod-pose method that employs a top-down paradigm, combining the strengths of multiple advanced networks and specifically designed to tackle the unique challenges of soybean pod detection. This method accurately locates key points of the pods, such as pod endpoints, corner points, and waist endpoints, enabling more detailed capture of phenotypic features. Compared to traditional methods, Pod-pose demonstrates enhanced capabilities in handling pod occlusion, dense distribution, and fine-grained feature extraction, significantly improving detection accuracy and efficiency. This approach not only advances pod phenotyping technology but also provides robust support for molecular breeding.
Limitation and future work
While non-destructive phenotyping offers significant advantages for soybean breeding, its application faces several challenges. Currently, data acquisition is primarily limited to controlled laboratory environments, with three main limitations in complex field conditions: (1) environmental factors such as varying lighting and occlusions may compromise the accuracy of pod dimension measurements; (2) existing methods rely on high-resolution imaging, hindering their scalability for large-scale field applications; and (3) current techniques require soybean plants to be imaged in a lying position, restricting practical application scenarios. To address these technical bottlenecks, supported by field high-throughput phenotyping platforms, future research will focus on developing precise measurement technologies adapted to complex field environments, with key priorities including (1) developing robust field phenotyping algorithms, (2) optimizing measurement accuracy under low-resolution imaging, and (3) enabling phenotyping of plants in their natural growth states. Building on successful experiences in stem and pod phenotyping, future research will expand the application scope of keypoint detection technology, achieving higher-throughput data collection through integrated stem-pod measurements. Additionally, dynamic pod morphology monitoring based on keypoint detection technology will provide critical support for precision agriculture management. By enabling real-time monitoring of pod morphological changes, this approach will facilitate accurate tracking of growth and development status, offering technical support for the sustainable development of the soybean industry. An in-depth exploration of these research directions will establish a solid foundation for enhancing crop production efficiency and agricultural sustainability.
Conclusion
The development of automated tools for precise measurement of crop phenotypes is crucial for improving breeding efficiency. This study addresses a key challenge in soybean breeding: the accurate acquisition of fine-grained pod phenotypic traits. A novel top-down keypoint detection model, Pod-pose, is proposed, specifically designed for detailed phenotypic analysis of soybean pods. By integrating the strengths of YOLOv3 and YOLOv8 and optimizing bottleneck structures and positional features, the detection accuracy is significantly improved. Furthermore, the combination of a two-stage detection method and a transfer learning strategy enhances model performance while reducing training complexity. Experimental results on a self-constructed dataset demonstrate that Pod-pose outperforms several state-of-the-art models, achieving an AP@0.5 of 0.912. Additionally, four critical pod-related phenotypic traits—pod length, bending length, curvature, and inflection point width—were extracted, with R values of 0.95, 0.94, 0.94, and 0.93, respectively. These results indicate that Pod-pose holds significant potential for fine-grained pod phenotypic analysis, providing a powerful tool for advancing future soybean breeding efforts.
Author contributions
FL and HL developed the methodology, implemented the computer code and algorithms, and wrote the original draft. QW planted the experimental material and acquired the data. ZZH, SDW and SCP reviewed and edited the draft. GLZ administered the project. All authors read and approved the final manuscript.
Funding
This work was supported in part by the Key R&D Program of Shandong Province (2023LZGC008), the Shandong Soybean Industrial Technology System of China (SDAIT-28-02) and the Seed-Industrialized Development Program in Shandong Province (2023LZGC008-001, 2024LZGC030-002, 2024LZGC010-05).
Availability of data and materials
No datasets were generated or analysed during the current study.
Declarations
Ethics approval and consent to participate
All authors agreed to publish this manuscript.
Consent for publication
Consent and approval for publication were obtained from all authors.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Shudong Wang, Email: shudongwang2013@sohu.com.
Longgang Zhao, Email: zhaolonggang@qau.edu.cn.
References
- 1.Erickson DR, editor. Practical handbook of soybean processing and utilization. UK: Elsevier; 2015. [Google Scholar]
- 2.Nuthalapati CS, Kumar A, Birthal PS, Sonkar VK. Demand-side and supply-side factors for accelerating varietal turnover in smallholder soybean farms. J Clean Prod. 2024;447:141372. [Google Scholar]
- 3.Chang F, Lv W, Lv P, Xiao Y, Yan W, Chen S, Zheng L, Xie P, Wang L, Karikari B, Abou-Elwafa SF. Exploring genetic architecture for pod-related traits in soybean using image-based phenotyping. Mol Breed. 2021;41:1–21. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Chen Y, Xiong Y, Hong H, Li G, Gao J, Guo Q, Sun R, Ren H, Zhang F, Wang J, Song J. Genetic dissection of and genomic selection for seed weight, pod length, and pod width in soybean. The Crop J. 2023;11(3):832–41. [Google Scholar]
- 5.Zhou W, Chen Y, Li W, Zhang C, Xiong Y, Zhan W, Huang L, Wang J, Qiu L. SPP-extractor: Automatic phenotype extraction for densely grown soybean plants. The Crop J. 2023;11(5):1569–78. [Google Scholar]
- 6.He H, Ma X, Guan H, Wang F, Shen P. Recognition of soybean pods and yield prediction based on improved deep learning model. Front Plant Sci. 2023;13(13):1096619. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Yang S, Zheng L, Wu T, Sun S, Zhang M, Li M, Wang M. High-throughput soybean pods high-quality segmentation and seed-per-pod estimation for soybean plant breeding. Eng Appl Artif Intell. 2024;1(129):107580. [Google Scholar]
- 8.Guo X, Li J, Zheng L, Zhang M, Wang M. Acquiring soybean phenotypic parameters using Re-YOLOv5 and area search algorithm. Trans Chin Soc Agric Eng. 2022;38:186–94. [Google Scholar]
- 9.Guo Y, Gao Z, Zhang Z, Li Y, Hu Z, Xin D, Zhu R. Automatic and accurate acquisition of stem-related phenotypes of mature soybean based on deep learning and directed search algorithms. Front Plant Sci. 2022;13:906751. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Zhou W, Chen Y, Li W, Zhang C, Xiong Y, Zhan W, Qiu L. SPP-extractor: Automatic phenotype extraction for densely grown soybean plants. Crop J. 2023;11(5):1569–78. [Google Scholar]
- 11.Xu C, Lu Y, Jiang H, Liu S, Ma Y, Zhao T. Counting Crowded Soybean Pods Based on Deformable Attention Recursive Feature Pyramid. Agronomy. 2023;13(6):1507. [Google Scholar]
- 12.Li Y, Teng S, Chen J, Zhou W, Zhan W, Wang J, Huang L, Qiu L. FEI-YOLO: A Lightweight Soybean Pod-Type Detection Model. Agronomy. 2024; 14(11): 2526.
- 13.Ma X, Wei B, Guan H, Cheng Y, Zhuo Z. A method for calculating and simulating phenotype of soybean based on 3D reconstruction. Eur J Agron. 2024;154:127070. [Google Scholar]
- 14.Zhang Z, Jin X, Rao Y, Wan T, Wang X, Li J, Shao X. DSBEAN: An innovative framework for intelligent soybean breeding phenotype analysis based on various main stem structures and deep learning methods. Comput Electron Agric. 2024;224:109135. [Google Scholar]
- 15.Yang S, Zheng L, Yang H, Zhang M, Wu T, Sun S, Wang M. A synthetic datasets based instance segmentation network for High-throughput soybean pods phenotype investigation. Expert Syst Appl. 2022;192:116403. [Google Scholar]
- 16.Li S, Yan Z, Guo Y, Su X, Cao Y, Jiang B, Zhu R. SPM-IS: An auto-algorithm to acquire a mature soybean phenotype based on instance segmentation. Crop J. 2022;10(5):1412–23. [Google Scholar]
- 17.Zhang QY, Fan KJ, Tian Z, Guo K, Su WH. High-Precision Automated Soybean Phenotypic Feature Extraction Based on Deep Learning and Computer Vision. Plants. 2024;13(18):2613. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.He J, Weng L, Xu X, Chen R, Peng B, Li N, Feng X. DEKR-SPrior: An Efficient Bottom-Up Keypoint Detection Model for Accurate Pod Phenotyping in Soybean. Plant Phenom. 2024;6:0198. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Chopin J, Laga H, Miklavcic SJ. A hybrid approach for improving image segmentation: application to phenotyping of wheat leaves. PLoS One. 2016;11(12):e0168496. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Murphy KM, Ludwig E, Gutierrez J, Gehan MA. Deep learning in image-based plant phenotyping. Ann Rev Plant Biol. 2024;75:771. [DOI] [PubMed] [Google Scholar]
- 21.Jain S, Ramesh D, Damodar Reddy E, Rathod S, Ondrasek G. A fast high throughput plant phenotyping system using YOLO and Chan-Vese segmentation. Soft Comput. 2024;28:1–14. [Google Scholar]
- 22.Law H, Teng Y, Russakovsky O, Deng J. CornerNet-Lite: Efficient keypoint based object detection. arXiv preprint arXiv:1904.08900, 2019.
- 23.Seng X, Liu T, Yang X, Zhang R, Yuan C, Guo T, Liu W. Measurement of the Angle between stems and leaves of rice based on key point detection. J Comput Electron Inform Manag. 2024;13(1):30–7. [Google Scholar]
- 24.Wang X, Yang W, Lv Q, Huang C, Liang X, Chen G, Duan L. Field rice panicle detection and counting based on deep learning. Front Plant Sci. 2022;13:966495. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Khaki S, Safaei N, Pham H, Wang L. WheatNet: A lightweight convolutional neural network for high-throughput image-based wheat head detection and counting. Neurocomputing. 2022;489:78–89. [Google Scholar]
- 26.Zou R, Zhang Y, Chen J, Li J, Dai W, Mu S. Density estimation method of mature wheat based on point cloud segmentation and clustering. Comput Electron Agr. 2023;205:107626. [Google Scholar]
- 27.Xiang L, Gai J, Bao Y, Yu J, Schnable PS, Tang L. Field-based robotic leaf angle detection and characterization of maize plants using stereo vision and deep convolutional neural networks. J Field Robot. 2023;40(5):1034–53. [Google Scholar]
- 28.Jia Y, Fu K, Lan H, Wang X, Su Z. Maize tassel detection with CA-YOLO for UAV images in complex field environments. Comput Electron Agr. 2024;217:108562. [Google Scholar]
- 29.Li J, Magar RT, Chen D, Lin F, Wang D, Yin X, Li Z. SoybeanNet: Transformer-based convolutional neural network for soybean pod counting from Unmanned Aerial Vehicle (UAV) images. Comput Electron Agr. 2024;220:108861. [Google Scholar]
- 30.Zhao J, Kaga A, Yamada T, Komatsu K, Hirata K, Kikuchi A, Guo W. Improved field-based soybean seed counting and localization with feature level considered. Plant Phenom. 2023;5:0026. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Maji D, Nagori S, Mathew M, et al. Yolo-pose: Enhancing yolo for multi person pose estimation using object keypoint similarity loss[C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2022: 2637-2646.
- 32.Redmon J, Farhadi A. Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767, 2018.
- 33.Ultralytics, "YOLOv5," 2020. [Online]. Available: https://github.com/ultralytics/yolov5.
- 34.Li C, Li L, Jiang H, Weng K, Geng Y, Li L, Wei X. YOLOv6: A single-stage object detection framework for industrial applications. arXiv preprint arXiv:2209.02976, 2022.
- 35.Wang CY, Bochkovskiy A, Liao HYM. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2023; 7464-7475.
- 36.Ultralytics, "YOLOv8," 2023. [Online]. Available: https://github.com/ultralytics/ultralytics.
- 37.Wang A, Chen H, Liu L, Chen K, Lin Z, Han J, Ding G. Yolov10: Real-time end-to-end object detection. arXiv preprint arXiv:2405.14458, 2024.
- 38.Xiang S, Wang S, Xu M, Wang W, Liu W. YOLO POD: a fast and accurate multi-task model for dense Soybean Pod counting. Plant Methods. 2023;19(1):8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Fu X, Li A, Meng Z, Yin X, Zhang C, Zhang W, Qi L. A dynamic detection method for phenotyping pods in a soybean population based on an improved yolo-v5 network. Agronomy. 2022;12(12):3209. [Google Scholar]
- 40.Zaji A, Liu Z, Xiao G, Bhowmik P, Sangha JS, Ruan Y. Wheat spike localization and counting via hybrid UNet architectures. Comput Electron Agr. 2022;203:107439. [Google Scholar]
- 41.Xie Q, Luong MT, Hovy E, Le QV. Self-training with noisy student improves imagenet classification. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020;10687-10698.
- 42.Huang H, Zhou X, Cao J, He R, Tan T. Vision transformer with super token sampling. arXiv preprint arXiv:2211.11167, 2022.
- 43.Taghavi M, Russello H, Ouweltjes W, Kamphuis C, Adriaens I. Cow key point detection in indoor housing conditions with a deep learning model. J Dairy Sci. 2024;107(4):2374–89. [DOI] [PubMed] [Google Scholar]
- 44.Hamza A, Khan MA, Ur Rehman S, Al-Khalidi M, Alzahrani AI, Alalwan N, Masood A. A novel bottleneck residual and self-attention fusion-assisted architecture for land use recognition in remote sensing images. IEEE J Select Top Appl Earth Observ Remote Sens. 2024;17:2995–3009. [Google Scholar]
- 45.Zhong Z, Yang Z, Deng B, Yan J, Wu W, Shao J, Liu CL. Blockqnn: Efficient block-wise neural network architecture generation. IEEE Trans Pattern Anal Mach Intell. 2020;43(7):2314–28. [DOI] [PubMed] [Google Scholar]
- 46.Carvalho IR, Nardino M, Demari GH, Szareski VJ, Follmann DN, de Pelegrin AJ, de Souza VQ. Relations among phenotypic traits of soybean pods and growth habit. African J Agr Res. 2017;12(6):450–8. [Google Scholar]
- 47.Chang F, Lv W, Lv P, Xiao Y, Yan W, Chen S, Zhao T. Exploring genetic architecture for pod-related traits in soybean using image-based phenotyping. Mol Breed. 2021;41:1–21. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Chen Y, Xiong Y, Hong H, Li G, Gao J, Guo Q, Qiu L. Genetic dissection of and genomic selection for seed weight, pod length, and pod width in soybean. Crop J. 2023;11(3):832–41. [Google Scholar]
- 49.Yu Z, Wang Y, Ye J, Liufu S, Lu D, Zhu X, Tan Q. Accurate and fast implementation of soybean pod counting and localization from high-resolution image. Front Plant Sci. 2024;15:1320109. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Bhat JA, Yu D. High-throughput NGS-based genotyping and phenotyping: Role in genomics-assisted breeding for soybean improvement. Legume Sci. 2021;3(3):e81. [Google Scholar]
- 51.Yang S, Zheng L, Chen X, Zabawa L, Zhang M, Wang M. Transfer learning from synthetic in-vitro soybean pods dataset for in-situ segmentation of on-branch soybean pods. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022; 1666-1675.
- 52.He H, Ma X, Guan H. A calculation method of phenotypic traits of soybean pods based on image processing technology. Ecol Inform. 2022;69:101676. [Google Scholar]
- 53.Ning S, Zhao Q, Liu K. Soybean Pods and Stems Segmentation Based on an Improved Watershed In International Conference on 5G for Future Wireless Networks. Springer Nature Switzerland. Cham. 2022;166-181.
- 54.Duc NT, Ramlal A, Rajendran A, Raju D, Lal SK, Kumar S, Chinnusamy V. Image-based phenotyping of seed architectural traits and prediction of seed weight using machine learning models in soybean. Front Plant Sci. 2023;14:1206357. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Cui J, Zhang J, Sun G, Zheng B. Extraction and research of crop feature points based on computer vision. Sensors. 2019;19(11):2553. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Cai G, Qian J, Song T, Zhang Q, Liu B. A deep learning-based algorithm for crop Disease identification positioning using computer vision. Int J Comput Sci Inform Technol. 2023;1(1):85–92. [Google Scholar]
- 57.Zhang Y, You S, Karaoglu S, Gevers T. 3D human pose estimation and action recognition using fisheye cameras: A survey and benchmark. Pattern Recogn. 2025;162:111334. [Google Scholar]
- 58.Dai S, Bai T, Zhao Y. Keypoint Detection and 3D Localization Method for Ridge-Cultivated Strawberry Harvesting Robots. Agriculture. 2025;15(4):372. [Google Scholar]
- 59.Meng Z, Du X, Sapkota R, Ma Z, Cheng H. YOLOv10-pose and YOLOv9-pose: Real-time strawberry stalk pose detection models. Comput Indust. 2025;165:104231. [Google Scholar]
- 60.Ci J, Wang X, Rapado-Rincón D, Burusa AK, Kootstra G. 3D pose estimation of tomato peduncle nodes using deep keypoint detection and point cloud. Biosyst Eng. 2024;243:57–69. [Google Scholar]
- 61.Gao Y, Li Z, Li B, Zhang L. Extraction of corn plant phenotypic parameters with keypoint detection and stereo images. Agronomy. 2024;14(6):1110. [Google Scholar]
- 62.Badgujar CM, Poulose A, Gan H. Agricultural object detection with You Only Look Once (YOLO) Algorithm: A bibliometric and systematic literature review. Comput Electron Agr. 2024;223:109090. [Google Scholar]
- 63.Zhang Y, Zhang H, Huang Q, Han Y, Zhao M. DsP-YOLO: An anchor-free network with DsPAN for small object detection of multiscale defects. Expert Syst Appl. 2024;241:122669. [Google Scholar]
- 64.Fan Q, Li Y, Deveci M, Zhong K, Kadry S. LUD-YOLO: A novel lightweight object detection network for unmanned aerial vehicle. Inform Sci. 2025;686:121366. [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Data Availability Statement
No datasets were generated or analysed during the current study.