PLoS One. 2024 Apr 18;19(4):e0298107. doi: 10.1371/journal.pone.0298107

Event detection in football: Improving the reliability of match analysis

Jonas Bischofberger 1,*, Arnold Baca 1, Erich Schikuta 2
Editor: Ersan Arslan3
PMCID: PMC11025972  PMID: 38635802

Abstract

With recent technological advancements, quantitative analysis has become an increasingly important area within professional sports. However, the manual process of collecting data on relevant match events like passes, goals and tackles comes with considerable costs and limited consistency across providers, affecting both research and practice. In football, automatic detection of events from positional data of the players and the ball could alleviate these issues, but it is not entirely clear what accuracy current state-of-the-art methods realistically achieve, because high-quality validations on realistic and diverse data sets are lacking. This paper adds context to existing research by validating a two-step rule-based pass and shot detection algorithm on four different data sets using a comprehensive validation routine that accounts for the temporal, hierarchical and imbalanced nature of the task. Our evaluation shows that pass and shot detection performance is highly dependent on the specifics of the data set. In accordance with previous studies, we achieve F-scores of up to 0.92 for passes, but only when there is an inherent dependency between event and positional data. We find a significantly lower accuracy, with F-scores of 0.71 for passes and 0.65 for shots, if event and positional data are independent. This result, together with a critical evaluation of existing methodologies, suggests that the accuracy of current football event detection algorithms operating on positional data is overestimated. Further analysis reveals that the temporal extraction of passes and shots from positional data poses the main challenge for rule-based approaches. Our results further indicate that the classification of plays into shots and passes is a relatively straightforward task, achieving F-scores between 0.83 and 0.91 for rule-based classifiers and up to 0.95 for machine learning classifiers. We show that there exist simple classifiers that accurately differentiate shots from passes in different data sets using a low number of human-understandable rules. Operating on basic spatial features, our classifiers provide a simple, objective event definition that can be used as a foundation for more reliable event-based match analysis.

1. Introduction

The objective evaluation of performance is a ubiquitous goal in modern professional sports environments. When recruiting players or analyzing opponents, it is crucial to be able to assess the overall capabilities as well as strengths and weaknesses of teams and athletes. Furthermore, objective, quantitative analysis has the power to reduce the impact of individual and societal biases which could ultimately lead to more truthful and healthy relationships between athletes, coaches, and the public.

With growing sophistication and decreasing costs, technologies such as video-based systems, electronic tracking systems and match analysis software are becoming more and more widespread, giving quantitative analysis an increasingly important role in sports. In football, tactical and technical performance analysis traditionally focuses on player actions such as shots, passes and dribbles [1]. The data for these analyses is collected through manual tagging of events, which is a time-consuming and cost-intensive process. Additionally, while the reliability of manual event detection systems can be ensured by extensive training of the data collectors [2], their validity is harder to guarantee. Data providers are not required to publish accurate and detailed definitions of the events they annotate. Definitions also vary across providers, because most football concepts are not prescribed by the rules of the game but emerged empirically, which makes their definitions subject to opinion. For example, the term recovery is used for very different sets of actions, ranging from a player picking up a loose ball [3] to any action by which a player wins the ball [4]. Even foundational actions such as passes, dribbles and shots are typically ambiguous: For example, the provider Wyscout treats crosses as a subset of passes whereas Stats Perform does not. Stats Perform also requires a pass to be intentional, which is hardly an objective qualification.

Without universal definitions, different studies which seemingly use the same categories of events are not necessarily comparable. Also, the event definition that is required or expected by an analyst or researcher might differ from the definition used by the data collector. For that reason, an automated and customizable data collection process would increase the validity of both scientific and applied sports performance analysis, provided that such a process is sufficiently accurate.

So far, various methods to automatically extract events from raw video footage or positional data of the players and the ball have been proposed, using either machine learning or rule-based detection routines. A rule-based algorithm operating on positional data would be particularly well suited to not only alleviate the burden of manual data collection, but also provide a simple, objective definition of events as a foundation for further analysis. Multiple machine-learning- and rule-based methods have been proposed to detect events in positional data, reporting promising accuracies of 90% and above [5–8]. However, most studies did not evaluate their algorithms across multiple data sets, so it is not guaranteed that these algorithms pick up the underlying structure of the football game rather than the error profile or other specifics of the respective data set. More importantly, the data sets that were used for validation are typically not independent of the positional data, as they both come from a common intermediate source or are partially derived from each other. Using such data for the evaluation of an algorithm inevitably inflates its estimated performance, since information from the reference data spills over into the input data for the model.

This article complements and enriches those previous findings by providing a strong validation of a simple rule-based algorithm for detecting passes and shots, two of the most important events in football, from positional data. We propose a highly robust validation routine and use it to evaluate the algorithm across four different data sets, one of which includes independent positional and event data. We also compare different algorithms for further distinguishing passes and shots, including both rule-based and machine-learning classifiers, to determine whether there exists a simple, human-understandable set of rules which accurately distinguishes shots from passes.

Designing a proper validation routine for this problem is technically challenging, because it involves detecting composite events from a continuous stream of positional data. It is a temporal, hierarchical and imbalanced classification task with unreliable reference data. The suggested validation routine is therefore relevant beyond the scope of football event detection for problems with a similar structure, such as object detection from videos [9] or sentiment analysis from streams of social media posts [10].

Overall, the main novel contributions of this paper are:

  • The presentation of a reliable validation routine for football event detection as a temporal, hierarchical, and imbalanced classification problem.

  • A reliable estimate of the performance of different pass and shot detection algorithms based on positional data across four diverse data sets.

  • A quantification and exploration of the difference in performance between independent and dependent reference data, which adds important context to existing findings.

  • An accurate pass and shot classifier that can be used as an adjustable foundation for event-based match analysis.

The remaining paper is structured as follows: Section 2 reviews existing approaches to automatic event detection in football. Section 3 elaborates the pass and shot detection algorithms evaluated in this paper. Section 4 describes the data sets used and lays out the design of the validation procedure. Section 5 presents the validation results. Section 6 provides a discussion of the results. Section 7 summarizes the paper and proposes directions for future research.

2. State of the art

In football, there are different types of events that are relevant for performance analysis: Player-level actions such as runs, tackles and passes, team- or group-level events such as counterattacks, offside traps and changes of the tactical formation, and events prescribed by the rules of football, for instance game interruptions when the ball leaves the bounds of the pitch and substitutions. Player actions form the building blocks upon which the majority of technical and tactical performance analysis in football is built, such as the analysis of pass completion rates [1], expected goals [11] and passing networks [12].

Event recognition from sports videos is an active area of research. It can involve either machine learning [13] or basic image recognition techniques in conjunction with manual classification rules [14, 15]. Given that positional data becomes more and more widely available and that many events can be defined in terms of spatio-temporal interactions between players and the ball, it becomes increasingly feasible to perform event detection on positional data on a large scale. In fact, some approaches in video-based event detection even involve the recognition of players and the ball as a preprocessing step [14].

One of the earliest attempts at rule-based event detection on positional data in football can be attributed to Tovinkere and Qian [8], who used rules and heuristics derived from domain knowledge to identify complex events in positional data. They achieved F-scores very close or equal to 1.0 for kicks, receptions, saves and goals. However, their methods are described too sparsely to draw general conclusions from this result. Also, they evaluated their algorithm on a very small sample which contained only 101 player actions.

More recently, Morra et al. [6] proposed and evaluated a rule-based algorithm on positional data, which is able to extract complex events like passes from atomic events using sets of temporal and logical rules. The achieved F-scores of 0.89 (passes) and 0.81 (shots) are likely inflated by the use of synthetic positional and reference event data, which have been jointly generated from a football simulation engine. Khaustov and Mozgovoy [7] applied another rule-based algorithm to positional data from 5 football matches and achieved an F-score of 0.93 for successful passes, 0.86 for unsuccessful passes, and 0.95 for shots on goal. Even higher values were achieved, such as 0.998 for successful passes, on another data set containing several short sequences of play. However, they also generated their gold standard by hand by watching the game “in a 2D soccer simulator”, i.e. likely using the same positional data that also underlies the event detection process.

Richly, Moritz and Schwarz [16] used a three-layer artificial neural network to detect events in positional data and achieved an average F-score of 0.89 for kicks and receptions. However, they also used positional data, specifically the acceleration data of the ball, to assist the manual creation of their gold standard. They also used a very small data set with only 194 events in total. Vidal-Codina et al. [5] proposed a two-step rule-based detector and evaluated it on a very heterogeneous data set, but without discussing possible data dependencies or differences in the algorithm’s performance across the included providers. Among other events, they achieved a total F-score of 0.92 for passes and 0.63 for shots.

Overall, while the achieved F-scores beyond 90% for passes and shots appear promising, the currently available evaluation results do not necessarily reflect a practical setting in which manual event data is supposed to be replaced and is therefore not available to pre-process the positional data. It is likely that the existing studies tend to overestimate the accuracy of their algorithms due to information from the reference data leaking into the input data. For that reason, it is currently not clear what merit rule-based classification routines hold for event detection in football and whether their accuracy is sufficient for industrial and research purposes. Also, there is a lack of agreed-upon standards for validating a given algorithm, for example the specific conditions under which a detected event can be considered to match a reference event. Other problems like low sample sizes, a lack of variety in the evaluated data sets and the use of synthetic data further emphasize the need for new perspectives on the topic.

3. Approach to detection and classification

We propose a simple rule-based algorithm using positional data to detect passes and shots, two of the most important and widely analyzed actions in football. The structure of the algorithm is hierarchical, as passes and shots can both be viewed as actions where the ball is kicked by a player. Therefore, in a first step, plays (defined as actions that are either a pass or a shot) are detected from the positional data. In the second step, plays are classified into passes and shots using three different methods: a rule-based decision routine based on expert knowledge, decision trees of various complexity, and various black-box machine learning classifiers. The rule-based routine and decision trees are used to estimate the achievable performance of a human-understandable classifier, which is desirable to obtain event definitions. The black-box classifiers are used to estimate whether a higher accuracy is possible without imposing this requirement.

All models have been implemented in Python 3.9 and rely on the packages numpy (1.21.5), pandas (1.4.4) and scikit-learn (1.0.2).

Step 1: Play detection

The basic idea of detecting passes and shots is to view them as composed of two actions where a player exerts a force upon the ball, i.e. a kick followed by either a reception or deflection. Parsimoniously, a hit is defined as an instance where the ball is accelerated beyond a certain threshold (min_acc) while at least one player is in sufficient vicinity of the ball (min_vicinity) to have realistically carried out the hit. The player performing the hit is defined as the closest player to the ball during the hit. With the acceleration a_ball(f) of the ball in frame f and the minimal ball-player distance over all players d_closest(f), the occurrence of a hit in f is defined as:

Hit(f) ⟺ a_ball(f) > min_acc and d_closest(f) < min_vicinity (1)
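As an illustration, the following minimal numpy sketch marks candidate hit frames according to Eq (1). The array layout (ball_xy, players_xy) and the finite-difference acceleration estimate are assumptions made for this sketch, not necessarily the paper's exact implementation.

import numpy as np

def detect_hits(ball_xy, players_xy, fps, min_acc, min_vicinity):
    """Return the frame indices satisfying Eq (1): ball acceleration above
    min_acc while at least one player is within min_vicinity of the ball.

    ball_xy:    array of shape (n_frames, 2), ball positions in metres
    players_xy: array of shape (n_frames, n_players, 2), player positions
    """
    dt = 1.0 / fps
    # Acceleration magnitude as the second numerical derivative of the ball position
    vel = np.gradient(ball_xy, dt, axis=0)
    acc = np.linalg.norm(np.gradient(vel, dt, axis=0), axis=1)
    # Distance from the ball to the closest player in every frame
    d_closest = np.linalg.norm(players_xy - ball_xy[:, None, :], axis=2).min(axis=1)
    return np.flatnonzero((acc > min_acc) & (d_closest < min_vicinity))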

A play is then defined as either two subsequent hits by different players (which corresponds to a pass or shot followed by a reception, deflection or goalkeeper save) or a hit followed by a game interruption (e.g. a shot that misses the target or a misplaced pass that crosses the sideline).

This definition of a play is broad and captures all “pass-like” events such as crosses, deflections, clearances, misplaced touches and throw-ins which may or may not be considered passes in different contexts. If one wants to further subdivide the pass event into categories, this can be done explicitly through additional rules. For example, a cross could be defined as a pass that originates in a specified zone lateral to the penalty box, is received in the penalty box, and reaches a certain height during its trajectory.

Since crosses, clearances and throw-ins are commonly recorded in football event data, we include those events as passes in the evaluation. Deflections and misplaced touches, on the other hand, are not always recorded, so they should be excluded algorithmically. Misplaced touches are difficult to detect because they differ from passes mainly in the intention of the player rather than in directly observable parameters of the play. While the same is true for deflections, deflections have more distinct kinematic features. Intuitively, deflections can be thought of as plays that directly follow another play, where the deflecting player has not had sufficient time to make a conscious play. We therefore use the following rule to classify plays as deflections and exclude those from the final set of detected plays.

Deflection(play) ⟺ play.frame = previous_play.target_frame and previous_play.duration < max_deflection_time (2)

Algorithm 1 describes the procedure programmatically. Given that all calculations per frame run in constant time, its time complexity is O(n) where n is the number of frames in the positional data. The space complexity is also O(n).

Algorithm 1 Play Detection Algorithm

1: function isDeflection(play, previousPlay)
2:  return play.frame = previousPlay.target_frame and previousPlay.duration < max_deflection_time
3:
4: function isHit(f, min_acc, min_vicinity)
5:  return a_ball(f) > min_acc and d_closest(f) < min_vicinity
6:
7: function detectPlays(game, min_acc, min_vicinity, max_deflection_time)
8:  plays ← []
9:  firstHit ← −1
10:  previousClosestPlayer ← −1
11:  previousPlay ← −1
12:  for each frame f in game do
13:   closestPlayer ← closestPlayerToBall(f)
14:   if isHit(f, min_acc, min_vicinity) and previousClosestPlayer ≠ closestPlayer then
15:    if firstHit = −1 then
16:     firstHit ← f
17:    else if not isDeflection((firstHit, f), previousPlay) then
18:     Append (firstHit, f) to plays
19:     previousPlay ← (firstHit, f)
20:     firstHit ← −1
21:    previousClosestPlayer ← closestPlayer
22:   else if f is Interruption and firstHit ≠ −1 then
23:    Append (firstHit, f) to plays
24:    previousPlay ← (firstHit, f)
25:    firstHit ← −1
26:  return plays

Step 2: Shot classification

To classify the detected plays into passes and shots, we first extract a manually defined set of features. The features are shown in Table 1 and were selected to be both simple and informative for differentiating shots from passes. These features serve as input for all classifiers.

Table 1. Features used for the pass/shot classification.

Feature Type Rationale
Distance from end position to opposition goal line float Shots are likely to cross the goal line
Distance to opposition goal float Shots are likely to be taken close to the opposition goal
Initial speed float Shots tend to be kicked more forcefully than passes
Receiver is opposition goalkeeper boolean Shots are often caught by the opposition goalkeeper.
Receiver is opposition field player boolean Shots are often blocked by opponents
Has receiver boolean Shots are not intended to be received by a player.
Goal angle float Shots are rarely taken from acute angles
Extrapolated lateral deviation float Shots are usually aimed in the direction of the goal rather than away from it.
Angle to goal float Shots are usually aimed in the direction of the goal rather than away from it.
Direct play or deflection follows boolean Shots are often blocked or deflected.
Progressed distance towards goal float Shots are usually taken towards the goal.
Distance to closest opponent float Shots are often taken under pressure.
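To make the feature definitions concrete, the sketch below computes a handful of the Table 1 features for a single play. The pitch convention (105 m × 68 m with the attacked goal centred at x = 105) and the function signature are illustrative assumptions.

import numpy as np

# Assumed pitch convention: 105 m x 68 m, attacked goal centred at x = 105.
GOAL = np.array([105.0, 34.0])
POSTS = np.array([[105.0, 30.34], [105.0, 37.66]])  # 7.32 m goal width

def play_features(origin_xy, end_xy, initial_speed):
    """Compute a few of the Table 1 features for one detected play."""
    v1, v2 = POSTS[0] - origin_xy, POSTS[1] - origin_xy
    cos_a = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return {
        "dist_to_goal": np.linalg.norm(GOAL - origin_xy),
        "dist_end_to_goalline": 105.0 - end_xy[0],
        # Opening angle of the goal as seen from the play's origin
        "goal_angle": np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))),
        # Progressed distance towards the goal
        "progression": np.linalg.norm(GOAL - origin_xy) - np.linalg.norm(GOAL - end_xy),
        "initial_speed": initial_speed,
    }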

Manual algorithm

For the first algorithm, we use the extracted features to carefully build a set of rules that classifies plays as shots or passes. These rules are designed to capture expert intuition about what constitutes a shot in football. A play is classified as a shot if and only if it satisfies all of the following rules (a code sketch follows the list):

  1. play.progressed_dist_toward_goal > min_progression: A shot has to bring the ball closer to the opponent’s goal.

  2. play.origin.distance_to_goal < max_dist_to_goal: A shot has to be taken within sufficient vicinity of the opponent’s goal.

  3. play.origin.opening_angle_to_goal > min_opening_angle or play.initial_speed > min_speed_from_bad_angle: The opening angle from the position of the play towards the goal posts must be large enough, or otherwise the ball has to be kicked particularly forcefully.

  4. play.extrapolated_position_on_goalline < max_lateral_deviation: The play must have been aimed sufficiently close towards the goal.

  5. (play.receiver is None and play.target.distance_to_goalline < max_target_dist_to_goalline) or play.initial_speed > min_speed_general or (play.receiver is opposition goalkeeper and play.initial_speed > min_speed_gk): A shot must either end at the goalline or be kicked forcefully enough. The required speed threshold differs depending on whether the ball hits an outfield player or the opposition goalkeeper.
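The sketch below condenses rules 1–5 into a single predicate. The dictionary keys for the play features (matching the sketch after Table 1) and for the eight fitted thresholds are illustrative names rather than the paper's exact identifiers.

def is_shot(play, p):
    """Classify a play as a shot iff it satisfies all of rules 1-5.
    play: dict of play features, p: dict of fitted thresholds."""
    return (
        play["progression"] > p["min_progression"]                      # rule 1
        and play["dist_to_goal"] < p["max_dist_to_goal"]                # rule 2
        and (play["goal_angle"] > p["min_opening_angle"]                # rule 3
             or play["initial_speed"] > p["min_speed_from_bad_angle"])
        and play["lateral_deviation"] < p["max_lateral_deviation"]      # rule 4
        and ((play["receiver"] is None                                  # rule 5
              and play["dist_end_to_goalline"] < p["max_target_dist_to_goalline"])
             or play["initial_speed"] > p["min_speed_general"]
             or (play["receiver_is_gk"]
                 and play["initial_speed"] > p["min_speed_gk"]))
    )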

Machine learning

To automatically learn rulesets of various complexity, we also evaluate decision trees with a fixed number of leaves. The structure of the decision trees is optimized by fitting the remaining hyperparameters to the data.

Additionally, we use different black-box machine learning models to estimate whether and how much additional structure in the data can be uncovered when human-understandable rules are not required. These models are a random forest, an SVM, and AdaBoost with decision trees as the base classifier.
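A minimal sketch of how these classifiers could be instantiated with scikit-learn 1.0.2, the version stated above. The particular leaf counts and the depth-one base trees for AdaBoost are our assumptions; the remaining hyperparameters are tuned as described in Section 4.

from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Interpretable rulesets: decision trees with a fixed maximum number of leaves
trees = {n: DecisionTreeClassifier(max_leaf_nodes=n) for n in (2, 3, 4, 8, 16)}

# Black-box reference models (remaining hyperparameters tuned separately)
black_box = {
    "random_forest": RandomForestClassifier(),
    "svm": SVC(),
    # scikit-learn 1.0.2 uses the `base_estimator` keyword
    "adaboost": AdaBoostClassifier(base_estimator=DecisionTreeClassifier(max_depth=1)),
}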

Baseline

Baseline performance is measured using a dummy predictor that always predicts the most frequent class in the training data, i.e. “Pass”.

4. Evaluation

Data sets

We use positional and event data from four different providers for the evaluation.

  • Metrica [17]: Anonymized sample data published by Metrica Sports consisting of 3 games with synchronized positional and event data.

  • Stats [18]: Synchronized positional and event data of consecutive games of a professional men’s national team in various competitions, provided by Stats Perform, 14 games.

  • Euro [19]: Positional data from the men’s European Championship 2021, provided by Tracab, complemented with independent Hudl Sportscode event data, 4 games.

  • Subsequent [20]: Synchronized positional and event data of consecutive games of a professional men’s national team in various competitions, provided by Subsequent, 6 games.

The positional data from all four providers was collected using optical tracking. Tracab and Stats Perform use in-venue camera systems, whereas Metrica and Subsequent generate positional data from a single video recording, which is therefore expected to be of lower quality. The positional data contains the x-y coordinates of the players and the ball during the match, captured at 25 Hz (Metrica, Euro, Subsequent) and 10 Hz (Stats) respectively. Due to the nature of the data, the event information contained in the four data sets is heterogeneous. Nevertheless, all four data sets record passes and shots, including a timestamp which can be used to synchronize the respective action with the positional data. All data sets also include the identity of the player who performed the pass or shot. The success of a pass, the identity of the receiver, and the locations at which the pass or shot starts and ends are not present in all data sets. The success of a pass and the identity of its receiver can, however, be deduced from the information given about the next ball-related event after the pass.

From qualitative inspection, it is obvious that the bundled positional and event data have not been generated independently from each other. For example, in the Metrica data set, the position of the ball is typically exactly equal to the position of the player who is currently in possession of the ball—a phenomenon that has also been observed in previous studies on data from other providers [5]. This observation strongly suggests that the position of the ball has been partly or even entirely reconstructed from manually annotated events. To a lesser degree, such artifacts are also apparent in the Stats and Subsequent datasets, but not in the in-venue positional data from Tracab within the Euro data set.

In the Euro dataset, the event data was obtained from the online platform Wyscout. However, the timestamps of the events were not accurately aligned with the actual events. For that reason, we corrected the timestamps of all passes, shots and game interruptions manually using broadcast footage of the games. This process also involved some minor corrections to the data, for example for events that were clearly missing or duplicate. Around 3 percent of events have been added or removed for such reasons. No positional data was used during this process.

Game segmentation

Since detected events have to be matched with reference events, the validation routine needs to operate on contiguous segments of play in which to search for matching events. Since our models contain parameters to be fitted, we need at least two such segments in order to obtain a training set and a test set. Naturally, the data could be divided into games or halves. But since our smallest data set contains only 6 halves, a subdivision along halves would be too coarse to obtain a representative test set.

A finer and better-mixed segmentation is obtained by instead dividing the game at any sufficiently long interruption. The interruptions must be long enough that a detected event and its true corresponding reference event almost certainly cannot end up in different segments. A longer minimum interruption therefore minimizes the risk of unwanted separations, while a shorter one increases the number of available segments. We found a minimum interruption time of 2 seconds to be a good compromise.
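A sketch of this segmentation step, assuming interruptions are available as (start_frame, end_frame) pairs:

def segment_game(n_frames, interruptions, fps, min_interruption_s=2.0):
    """Split a match into contiguous segments of play, cutting at every
    interruption that lasts at least min_interruption_s seconds."""
    min_len = int(min_interruption_s * fps)
    segments, seg_start = [], 0
    for start, end in sorted(interruptions):
        if end - start >= min_len:
            segments.append((seg_start, start))
            seg_start = end
    segments.append((seg_start, n_frames))
    return [(a, b) for a, b in segments if b > a]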

Temporal matching

To determine whether a detected event matches a reference event, they have to be temporally matched. Since we treat passes and shots as composed of two atomic events which are modeled without a duration (hits and game interruptions), it is sensible to match two plays by individually matching their constituent events. A play is matched if both of its constituent atomic events are matched. Two atomic events match if they are no further than a certain time span (the matching window) apart from each other.

The choice of the optimal matching window involves the following trade-off: If the matching window is too small, it misses events that actually correspond and underestimates the performance of the algorithm. If it is too large, it could mistakenly match unrelated plays and overestimate the performance of the algorithm. Therefore, additional information like the player and the location of the play should be used to establish truthful matching conditions. The shots and passes in our data sets share only one additional variable: the player who performed the play. We therefore further require that this player be identical for two events to be matched.

The dependency of detection performance on the choice of matching window is depicted in Fig 1. We qualitatively estimate the optimal matching window as 500 milliseconds for Stats [18], Metrica [17], and Subsequent [20], and 1000 milliseconds for Euro [19]. This is roughly where the scores begin to increase much more slowly than before, which indicates that the majority of actually corresponding events have been matched.

Fig 1. Relationship between matching window and F-score in the training data.


Any ambiguities where an event matches multiple candidates are resolved by finding a maximum cardinality matching for each segment using the Hopcroft-Karp algorithm.
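A sketch of this matching step using the Hopcroft-Karp implementation from networkx. The event representation (dicts with a frame index and a player id) is an assumption made for the example.

import networkx as nx

def match_events(detected, reference, fps, window_s):
    """Maximum-cardinality matching between detected and reference atomic
    events within one segment, subject to the matching window and the
    same-player constraint."""
    max_gap = window_s * fps
    g = nx.Graph()
    g.add_nodes_from((("d", i) for i in range(len(detected))), bipartite=0)
    g.add_nodes_from((("r", j) for j in range(len(reference))), bipartite=1)
    for i, d in enumerate(detected):
        for j, r in enumerate(reference):
            if abs(d["frame"] - r["frame"]) <= max_gap and d["player"] == r["player"]:
                g.add_edge(("d", i), ("r", j))
    top = [("d", i) for i in range(len(detected))]
    matching = nx.bipartite.hopcroft_karp_matching(g, top_nodes=top)
    # Keep each matched pair once, oriented as (detected index, reference index)
    return [(i, j) for (side, i), (_, j) in matching.items() if side == "d"]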

Choice of evaluation metrics

Play detection

The relevant raw performance metrics for play detection are as follows:

  • Precision P_play = (# detected plays matched with a reference pass or shot) / (# detected plays)

  • Pass Recall R_play,pass = (# detected plays matched with a reference pass) / (# reference passes)

  • Shot Recall R_play,shot = (# detected plays matched with a reference shot) / (# reference shots)

Passes and shots form a heavily imbalanced class distribution, as passes are about 40 times more common than shots in football. Since different categories of events are typically of separate interest in analysis rather than being mixed together, it is most appropriate to assign equal importance to passes and shots as categories, i.e. to assign more weight to an individual shot than to an individual pass. This way, the algorithm is optimized such that it can be used for the analysis of both types of events rather than being tuned to recognize mostly passes.

Based on that line of reasoning, we compute the macro-averaged recall R_play:

R_play = (R_play,pass + R_play,shot) / 2 (3)

R_play is then used to compute the F1-score F_play, which serves as the optimization target balancing overall recall and precision:

F_play = 2 · R_play · P_play / (R_play + P_play) = 2 · P_play · (R_play,pass + R_play,shot) / (2 · P_play + R_play,pass + R_play,shot) (4)
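In code, Eqs (3) and (4) reduce to a few lines; the count variables correspond to the quantities defined in the bullet list above.

def play_detection_scores(n_matched_pass, n_matched_shot, n_detected,
                          n_ref_pass, n_ref_shot):
    """Compute P_play, the macro-averaged recall R_play (Eq 3) and F_play (Eq 4)."""
    p_play = (n_matched_pass + n_matched_shot) / n_detected
    r_pass = n_matched_pass / n_ref_pass
    r_shot = n_matched_shot / n_ref_shot
    r_play = (r_pass + r_shot) / 2
    f_play = 2 * r_play * p_play / (r_play + p_play)
    return p_play, r_play, f_play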

Pass/shot classification

The classification into passes and shots can be evaluated independently of the preceding play detection step using the precision and recall of passes and shots relative to the set of successfully matched plays.

  • Shot Precision P_shot = (# classified shots matched with a reference shot) / (# classified shots matched with a reference shot or pass)

  • Pass Precision P_pass = (# classified passes matched with a reference pass) / (# classified passes matched with a reference shot or pass)

  • Shot Recall R_shot = (# classified shots matched with a reference shot) / (# detected shots and passes matched with a reference shot)

  • Pass Recall R_pass = (# classified passes matched with a reference pass) / (# detected shots and passes matched with a reference pass)

Again, the optimization target must account for class imbalance. In this case, since precision and recall are available for both classes, we can use the macro-average of the two regular F1-scores as our optimization target F_avg.

  • F_shot = 2 · R_shot · P_shot / (R_shot + P_shot)

  • F_pass = 2 · R_pass · P_pass / (R_pass + P_pass)

  • F_avg = (F_shot + F_pass) / 2
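Since F_avg is simply the macro-averaged F1-score over the matched plays, it can also be computed directly with scikit-learn (the labels below are made-up examples):

from sklearn.metrics import f1_score

y_true = ["pass", "pass", "shot", "pass", "shot"]  # reference labels of matched plays
y_pred = ["pass", "shot", "shot", "pass", "shot"]  # classifier output
favg = f1_score(y_true, y_pred, average="macro")   # mean of F_pass and F_shot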

To quantify the overall performance of the classifier, we also report variants of the above metrics relative to the total number of reference and detected events, respectively.

  • P_shot = (# correctly classified shots) / (# classified shots, including those among falsely detected plays)

  • P_pass = (# correctly classified passes) / (# classified passes, including those among falsely detected plays)

  • R_shot = (# correctly classified shots) / (# reference shots)

  • R_pass = (# correctly classified passes) / (# reference passes)

  • F_shot = 2 · P_shot · R_shot / (P_shot + R_shot)

  • F_pass = 2 · P_pass · R_pass / (P_pass + R_pass)

  • F_avg = (F_pass + F_shot) / 2

Parameter optimization

Each data set is split into a training set and a test set with a 65-35 ratio of game segments. The resulting number of shots and passes is shown in Table 2.

Table 2. Overview of training and test sets.

Dataset Games Training passes Test passes Training shots Test shots
Stats [18] 14 9831 5053 212 117
Euro [19] 4 2813 1495 62 27
Metrica [17] 3 2422 1277 48 20
Subsequent [20] 6 4364 2375 105 51

The parameters of the play detector are fitted on the entire training set using 300 iterations of Bayesian optimization within the following bounds.

min_vicinity ∈ [0.01 m, 10 m]
min_acc ∈ [0 m/s², 120 m/s²]
max_deflection_time ∈ [0 ms, 1000 ms]
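The paper does not name a specific optimization library; as one possible realization, the sketch below uses gp_minimize from scikit-optimize, with detect_plays and f_play standing in for the detector and the F-score of Eq (4) on the training set.

from skopt import gp_minimize
from skopt.space import Real

space = [Real(0.01, 10.0, name="min_vicinity"),          # metres
         Real(0.0, 120.0, name="min_acc"),               # m/s^2
         Real(0.0, 1000.0, name="max_deflection_time")]  # milliseconds

def objective(params):
    min_vicinity, min_acc, max_deflection_time = params
    # detect_plays and f_play are placeholders for the detector and Eq (4)
    plays = detect_plays(training_set, min_acc, min_vicinity, max_deflection_time)
    return -f_play(plays, training_reference)  # gp_minimize minimizes

result = gp_minimize(objective, space, n_calls=300, random_state=0)
best_vicinity, best_acc, best_deflection_time = result.x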

Similarly, the 8 parameters of the manual rules classifier are fitted using Bayesian optimization with 120 iterations and the following bounds.

min_progression ∈ [−100 m, 50 m]
max_dist_to_goal ∈ [0 m, 50 m]
min_opening_angle ∈ [0°, 180°]
max_lateral_deviation ∈ [0 m, 34 m]
max_target_dist_to_goalline ∈ [0 m, 10 m]
min_speed_general ∈ [0 m/s, 50 m/s]
min_speed_gk ∈ [0 m/s, 100 m/s]
min_speed_from_bad_angle ∈ [0 m/s, 100 m/s]

The hyperparameters of the machine learning models are fitted using a 10 times repeated 10-fold stratified cross-validation on the training set using 250 iterations of Bayesian parameter search. For the decision trees, the parameter max_leaves is instead fixed to various values.
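As an illustration of this tuning setup, the following sketch combines scikit-learn's RepeatedStratifiedKFold with BayesSearchCV from scikit-optimize; the search space and the placeholder training data are assumptions made for the example.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold
from skopt import BayesSearchCV
from skopt.space import Integer

# Placeholder training data: 12 features per play, as in Table 1
X_train = np.random.rand(500, 12)
y_train = np.random.choice(["pass", "shot"], size=500, p=[0.9, 0.1])

cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=0)
search = BayesSearchCV(
    RandomForestClassifier(),
    {"n_estimators": Integer(50, 500), "max_depth": Integer(2, 30)},
    n_iter=250, cv=cv, scoring="f1_macro", random_state=0,
)
search.fit(X_train, y_train)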

5. Results

Play detection

As shown in Table 3, play detection performs well for the Stats, Subsequent and Metrica data sets, achieving macro-averaged F-scores of 0.87, 0.88, and 0.83 respectively. The more realistic Euro data set, where positional and event data are decoupled, achieves a significantly weaker score of 0.70. Shots display a lower class-specific recall than passes across all data sets.

Table 3. Evaluation results for play detection.

Dataset Precision P_play Recall R_play,pass Recall R_play,shot F-score F_play
Stats [18] 0.84 0.91 0.88 0.87
Euro [19] 0.63 0.82 0.72 0.70
Metrica [17] 0.89 0.90 0.70 0.83
Subsequent [20] 0.89 0.96 0.80 0.88

As can be seen from Table 4, the optimized values for min_acc, min_vicinity, and max_deflection_time vary significantly between the data sets.

Table 4. Optimized parameter values for play detection.

Dataset min_acc [m/s²] min_vicinity [m] max_deflection_time [ms]
Stats [18] 25.9 3.0 748
Euro [19] 63.3 1.5 99
Metrica [17] 26.8 1.4 91
Subsequent [20] 9.5 3.0 487

Pass/shot classification

The results of the pass and shot classifier are shown in Fig 2.

Fig 2. F-scores of pass/shot classifier relative to the correctly detected plays.


Due to the strong imbalance of the data, the baseline model, which always predicts the majority class, yields a macro average F-score of roughly 0.5. All classifiers easily outperform this baseline.

AdaBoost and Random Forest show the strongest performance with F-scores Favg ranging from 0.93 to 0.95 for Stats, Euro and Metrica, and 0.85 to 0.87 for Subsequent. The performance of the rule-based classifiers is almost as strong with F-scores between 0.83 and 0.91.

Fig 3 shows the performance of the decision trees depending on the fixed maximum number of leaves. The performance converges after only 3-6 leaves; beyond that, the possibility to add more splitting rules does not lead to a clear performance improvement.

Fig 3. Performance of decision trees by number of leaves.


General insights into feature importance are drawn by inspecting the decision trees and the random forest. As shown in Table 5, the 2-leaf trees either use the rule “Distance End Position to Goalline < X” where X is around 2-5 meters away from the goal line or “Distance Start Position to Goal < X” where X is around 20-30 meters to identify shots. There are only two other rules used in the decision trees up to 4 leaves, namely “Opening angle < 12.4°” and “Lateral end position (projected) < X” with X between 5 and 10m. Beginning with the third split, the decision trees begin to learn redundant splits, assigning the same class to both child nodes. Inspecting the impurity-based feature importance of the random forest (Fig 4) confirms the paramount role of these four features in classification, while some of the remaining features such as the initial speed of the ball, the distance of the closest attacker to the goal and the progressed distance towards the goal also appear to be relevant.

Table 5. The logical rules learned by the first three decision trees to classify a play as a shot, for each data set.

D_start,goal: Distance from play origin to goal. D_end,gl: Distance from play end position to goal line. A_open: Opening angle of the goal from play origin. Y*_end: End position of the play, projected onto the goal line.

Data set 2 Leaves 3 Leaves 4 Leaves
Euro [19] D_start,goal < 30.1 m A_open > 12.4° and Y*_end < 9.48 m D_end,gl < 2.14 m and A_open > 12.1°
Stats [18] D_end,gl < 3.08 m D_end,gl < 3.08 m and Y*_end < 8.15 m A_open > 12.6° and Y*_end < 7.16 m
Metrica [17] D_end,gl < 2.46 m D_end,gl < 4.18 m and A_open > 12.9° D_end,gl < 2.46 m and A_open > 10.9°
Subsequent [20] D_end,gl < 3.91 m D_end,gl < 3.91 m and A_open > 10.4° D_end,gl < 3.91 m and A_open > 10.4°

Fig 4. Impurity-based feature importance for random forest.


The combined performance of the event detection routine is shown in Table 6, using AdaBoost for shot classification. The total macro-averaged F-scores for detecting passes and shots range from 0.67 to 0.82, depending on the data set. As is also evident from the evaluation of the detector alone (Table 3), shots achieve much lower scores than passes. Passes are detected with an overall F-score of around 0.9, except for the Euro dataset, which achieves a lower score of 0.71.

Table 6. Total classification performance of play detector + AdaBoost shot classifier.

Dataset P_pass R_pass F_pass P_shot R_shot F_shot F_avg
Stats [18] 0.84 0.91 0.87 0.78 0.76 0.77 0.82
Euro [19] 0.63 0.82 0.71 0.65 0.59 0.62 0.67
Metrica [17] 0.89 0.90 0.89 0.65 0.65 0.65 0.77
Subsequent [20] 0.88 0.96 0.92 0.61 0.55 0.58 0.75

6. Discussion

Our results show that the performance of pass and shot detection is heavily dependent on the characteristics of the data set. Regarding the data sets Subsequent, Metrica and Stats, our study reproduces the previously observed F-scores in pass detection, while using a minimalistic detection algorithm. Our scores between 0.87 and 0.92 for those data sets are in line with the results from Morra et al. (0.89) [6], Khaustov and Mozgovoy (0.86 unsuccessful passes, 0.93 successful passes) [7], Richly et al. (0.89) [16], and Vidal-Codina et al. (0.92) [5].

The large differences in the optimal parameter values (Table 4) indicate that the utilized data sets are heterogeneous. The large difference in the optimal acceleration threshold stems from the acceleration being computed by us as the second numerical derivative of the position, so noisier positional data produces larger spurious accelerations. The particularly high optimal threshold for the Euro data set therefore indicates that its positional data is indeed the most raw among the four providers.

In contrast, Subsequent, Stats and Metrica likely used event data to post-process their positional data. Therefore, the performance of the pass and shot detection algorithm on these data sets is likely an overestimation of its true ability to identify these plays in raw positional data. Its performance on the Euro data set (0.71 for passes, 0.62 for shots) is a more truthful reflection of its capabilities as the positional and event data within this data set are independent and the positional data appears to contain few event artifacts.

Given these results and assuming that other algorithms would experience a similar drop in performance when evaluated on independent data (see the results of Vidal-Codina et al. [5] for a rough impression), the current state of the art in detecting events from positional data seems unsatisfying. One third of the detected passes or shots would not appear in the manual event data that analysts are used to, and conversely, around one third of the manually collected events would be missing. Even when factoring in the inherent subjectivity of manual event data, this appears to be a troubling deficit in accuracy.

Our two-step event detection pipeline exposes play detection rather than the subsequent classification as the primary issue. Qualitative post-hoc inspection of the detector’s mistakes on the Euro data set reveals the following causes of suboptimal performance:

  • Inaccuracies of the positional data: For example, the ball position in the data from Tracab contains small artifacts where the ball sometimes changes its velocity abruptly during a pass. This is falsely recognized as a hit if some player happens to be close by, for example when a pass goes slightly past or over a player. The algorithm can account for that by reducing the required player-ball distance for a hit. However, the required player-ball distance also needs to be large enough to account for the reach of the player and noise in the ball and player positions. The algorithm cannot account for both at the same time.

  • The algorithm struggles when many players are located around the ball and the ball is accelerated either through dribbling or artifacts. In these situations, the closest player to the ball can change frequently without an actual possession change. This effect is much less prevalent in the other data sets because the ball “sticks” to the currently possessing player as presumably determined from manually collected event information.

  • The lack of ball height data makes it difficult to identify irrelevant x-y-accelerations due to bouncing.

  • Errors in the reference data, in particular missing events.

Shot classification on the other hand performs well across all data sets. Given the quick convergence of the decision trees, it seems that for most data sets, one to three human-understandable rules are already sufficient to differentiate shots from passes with an accuracy of around 90%. These rules primarily operate upon the start and end position relative to the opponent’s goal. At least a small additional boost in accuracy can be achieved using machine learning. A small set of rules is therefore sufficient for differentiating shots from passes and can be used as a more objective definition for this kind of event.

Also, like other rule-based methods, the proposed algorithm runs in linear time, which makes it suitable for real-time applications, an essential requirement in the industry, where data must be streamed to clients during matches.

7. Conclusions and future work

We proposed an evaluation routine for event detection in football that deals with the temporal, hierarchical, and imbalanced nature of the task. It demonstrates solutions for the problems that this type of classification task poses, like temporal matching and the choice of evaluation metrics. It can be applied to football event detection as well as related tasks like object detection and sentiment analysis.

As evaluated by the novel validation routine, the proposed two-step event detection algorithm effectively detects passes and shots from a stream of positional data, reaching state-of-the-art performance in the majority of examined data sets while using a rule-based algorithm with minimal complexity. Using a small set of rules leads to an easily interpretable and extendable definition of the detected events, which is an essential requirement for improving the objectivity and accuracy of further insights gained by researchers and practitioners.

For the most realistic data set examined, the detection of plays from raw positional data proved to be the main obstacle to achieving high-accuracy results. Further analysis suggests that this problem could be partially mitigated in the future by richer and more accurate positional data. However, once plays have been detected, the task of differentiating passes and shots is relatively simple and can be performed using a low number of human-understandable rules. This is a promising insight to help put event-based performance analysis of passes and shots onto a more reliable foundation.

While rule-based pass and shot detection on positional data appears to achieve high accuracy given the current state of the art, our study found that this impression is most likely distorted by the fact that information from the reference data commonly spills over into the input data of the models. More research that performs high-quality evaluations on suitable algorithms in a realistic data setting is needed to determine viable solutions for the automation of manual event data collection.

While we provide a broad perspective on event detection performance by using four data sets from different providers, a limitation of this study is that the data sets themselves are relatively small, especially regarding the number of shots, which are a much rarer event in football than passes. Also, our study focuses only on shots and passes; more complex and subjective events like tackles and dribbles, as well as events like yellow cards that are virtually impossible to define in terms of spatio-temporal interactions, would presumably be detected with far lower accuracy. For that reason, it seems hard to imagine rule-based event detection as a full-blown solution to the automation of event detection processes. However, it can serve as a complement to more comprehensive detection systems, especially for applications where flexibility, interpretability, and objectivity are paramount, such as academic studies or when existing game models of football clubs and federations need to be accommodated.

Acknowledgments

Thanks to Philipp Schmid for his diligent assistance with the timestamp correction.

Data Availability

The data underlying this research is partially available under the following URL: https://github.com/metrica-sports/sample-data. The remaining data comes from commercial sources and we are not allowed to share it. It can be requested from the Austrian Football Federation (https://www.oefb.at/), via the “Abteilung für Wissenschaft, Analyse und Entwicklung”, or directly from the various data providers/rights holders, i.e. Stats Perform (https://www.statsperform.com/), the UEFA (https://www.uefa.com/), and Subsequent (https://subsequent.ai/).

Funding Statement

The authors received no specific funding for this work.

References

  • 1. Sarmento H, Marcelino R, Campanico J, Matos N, Leitão J. Match analysis in football: a systematic review. Journal of Sports Sciences. 2014;32:1831–1843. doi: 10.1080/02640414.2014.898852
  • 2. Liu H, Hopkins W, Gómez AM, Molinuevo SJ. Inter-operator reliability of live football match statistics from OPTA Sportsdata. International Journal of Performance Analysis in Sport. 2013;13(3):803–821. doi: 10.1080/24748668.2013.11868690
  • 3. StatsBomb Data Specification v1.1; 2019. Available from: https://github.com/statsbomb/open-data/blob/master/doc/StatsBomb%20Open%20Data%20Specification%20v1.1.pdf.
  • 4. Wyscout Glossary; n.d. Available from: https://dataglossary.wyscout.com/recovery/.
  • 5. Vidal-Codina F, Evans N, El Fakir B, Billingham J. Automatic event detection in football using tracking data. Sports Engineering. 2022;25(1):18. doi: 10.1007/s12283-022-00381-6
  • 6. Morra L, Manigrasso F, Canto G, Gianfrate C, Guarino E, Lamberti F. Slicing and dicing soccer: automatic detection of complex events from spatio-temporal data. In: Image Analysis and Recognition: 17th International Conference, ICIAR 2020, Póvoa de Varzim, Portugal, June 24–26, 2020, Proceedings, Part I. Springer; 2020. p. 107–121.
  • 7. Khaustov V, Mozgovoy M. Recognizing events in spatiotemporal soccer data. Applied Sciences. 2020;10(22):8046. doi: 10.3390/app10228046
  • 8. Tovinkere V, Qian RJ. Detecting semantic events in soccer games: Towards a complete solution. In: IEEE International Conference on Multimedia and Expo, 2001. ICME 2001. IEEE Computer Society; 2001. p. 212–212.
  • 9. Nascimento JC, Marques JS. Performance evaluation of object detection algorithms for video surveillance. IEEE Transactions on Multimedia. 2006;8(4):761–774. doi: 10.1109/TMM.2006.876287
  • 10. Xu QA, Chang V, Jayne C. A systematic review of social media-based sentiment analysis: Emerging trends and challenges. Decision Analytics Journal. 2022;3:100073. doi: 10.1016/j.dajour.2022.100073
  • 11. Brechot M, Flepp R. Dealing with randomness in match outcomes: how to rethink performance evaluation in European club football using expected goals. Journal of Sports Economics. 2020;21(4):335–362. doi: 10.1177/1527002519897962
  • 12. Pena JL, Touchette H. A network theory analysis of football strategies. arXiv preprint arXiv:1206.6904. 2012.
  • 13. Sorano D, Carrara F, Cintia P, Falchi F, Pappalardo L. Automatic pass annotation from soccer video streams based on object detection and LSTM. In: Machine Learning and Knowledge Discovery in Databases. Applied Data Science and Demo Track: European Conference, ECML PKDD 2020, Ghent, Belgium, September 14–18, 2020, Proceedings, Part V. Springer; 2021. p. 475–490.
  • 14. Khan A, Lazzerini B, Calabrese G, Serafini L. Soccer Event Detection. In: 4th International Conference on Image Processing and Pattern Recognition; 2018. p. 119–129.
  • 15. Chen SC, Shyu ML, Chen M, Zhang C. A decision tree-based multimodal data mining framework for soccer goal detection. In: 2004 IEEE International Conference on Multimedia and Expo (ICME). vol. 1. IEEE; 2004. p. 265–268.
  • 16. Richly K, Moritz F, Schwarz C. Utilizing Artificial Neural Networks to Detect Compound Events in Spatio-Temporal Soccer Data. In: 3rd SIGKDD Workshop on Mining and Learning from Time Series; 2017.
  • 17. Dagnino B. Metrica Sports Sample Data; 2021. GitHub. Available from: https://github.com/metrica-sports/sample-data/commit/e706dd506b360d69d9d123d5b8026e7294b13996.
  • 18. Stats Perform. Proprietary data set; 2021.
  • 19. ChyronHego; Wyscout. Proprietary data set; 2021.
  • 20. Subsequent. Proprietary data set; 2022.

Decision Letter 0

Ersan Arslan

26 Oct 2023

PONE-D-23-27502
Event detection in football: Improving the reliability of match analysis
PLOS ONE

Dear Dr. Bischofberger,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Dec 10 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Ersan Arslan, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at 

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Note from Emily Chenette, Editor in Chief of PLOS ONE, and Iain Hrynaszkiewicz, Director of Open Research Solutions at PLOS: Did you know that depositing data in a repository is associated with up to a 25% citation advantage (https://doi.org/10.1371/journal.pone.0230416)? If you’ve not already done so, consider depositing your raw data in a repository to ensure your work is read, appreciated and cited by the largest possible audience. You’ll also earn an Accessible Data icon on your published paper if you deposit your data in any participating repository (https://plos.org/open-science/open-data/#accessible-data).

3. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, all author-generated code must be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

4.  We note you have included a table to which you do not refer in the text of your manuscript. Please ensure that you refer to Table 5 in your text; if accepted, production will need this reference to link the reader to the Table.

Additional Editor Comments:

ACADEMIC EDITOR:

Dear corresponding Author, the Reviewers found some concerns with your manuscript. Please take care of their suggestion and send back the manuscript as soon as possibile. 


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: No

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Include the problem statement, objective, motivation, and paper organization in the introduction section.

List out the strengths, limitations, and techniques of recent related existing methods in the related work section.

Provide a detailed analysis of the datasets with their data details, feature details, and volume of datasets.

This work is not compared with any existing methods. The authors should provide any statistical, mathematical, or comparative proof to claim that this work is state-of-the-art.

Include findings, strengths, limitations, application area, and future work of the proposed method in the conclusion section.

Reviewer #2: Review Comments

Presented paper emphasizes existing research by validating a two-step rule-based pass and shot detection algorithm on four different data sets using a comprehensive validation routine that accounts for the temporal, hierarchical and imbalanced nature of the task. Our evaluation shows that pass and shot detection performance is highly dependent on the specifics of the data set. However, the following suggestions can be considered by the authors to further improve the quality of the manuscript.

I have some corrections and suggestions below:-

1. Authors must show explain the novel contribution of the work with proper justification of the outcomes. What novelty is established in this work compared to existing works?

2. The computational complexity in terms of time and space must be discussed. Also, compare the proposed method in terms of computational complexity?

3. The literature survey needs to be updated with current state-of-the-art methods. Some more papers on event detection in football should be included.

4. The abstract needs to be improved, and the outcomes of the work in terms of the various performance measures achieved must be included in the abstract.

5. Explain the problem and the gaps in the existing literature in a concise but self-contained way (although readers might wish to consult references, they should not be forced to do so).

6. The organization of the paper can be added at the end of the introduction.

7. Comparative analysis of various performance parameters with respect to state-of-the-art methods must be discussed. More recent state-of-the-art approaches should be compared, and the experiments should use more sizable real-world data sets from public repositories (if any).

8. Add the industrial significance of the proposed approach.

9. Results should be verified on additional data sets. Describe in detail the data sets used and the expected outcomes, widening the experimental comparison to include other data and methods.

10. Comparative analysis of various performance parameters with respect to the various data sets must be discussed. The comparison can be somewhat unfair if different data sets are not used for the comparative analysis.

11. Limitations of the proposed work must be included.

12. Precision vs. recall curves of the proposed algorithms with respect to different data sets must be included.

13. Implementation platforms with complete specifications of the system must be included.

14. How much data should be considered for training and testing for model implementation? Details of training and testing data sets must be tabulated.

15. To make the proposed algorithm of this article more readable, use pseudo-code.

16. In all results tables/figures, the utilized data sets (e.g., in Tables 2, 3, and 4) must be cited with proper and specific citations.

17. Various visualized results based on the proposed work must be added, and the results compared with existing work.

18. Comparative analysis with respect to real-time analysis is missing.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Dr. Mahendra Prasad

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2024 Apr 18;19(4):e0298107. doi: 10.1371/journal.pone.0298107.r002

Author response to Decision Letter 0


28 Dec 2023

Dear editors, dear reviewers,

I wish to express my sincere thanks to you for taking the time to review our work. The review process has raised valid issues, which we have tried to address in order to improve our manuscript accordingly. Below is our response to each of the points that have been raised.

Include the problem statement, objective, motivation, and paper organization in the introduction section.

- Extended the introduction accordingly.

List the strengths, limitations, and techniques of recent related methods in the related work section.

- Added detail about the methods employed in previous studies.

Provide a detailed analysis of the data sets, including their data details, feature details, and volume.

- Added details about the data sets, in particular details about relevant features and the number of relevant events in the training and test sets.

This work is not compared with any existing methods. The authors should provide statistical, mathematical, or comparative evidence to support the claim that this work is state-of-the-art.

- Added a discussion of time and space complexity, and a more explicit comparison of detection performance with previous results.

Include findings, strengths, limitations, application area, and future work of the proposed method in the conclusion section.

- Added details to the conclusion section.

1. Authors must clearly explain the novel contribution of the work, with proper justification of the outcomes. What novelty is established in this work compared to existing works?

- Added some clarity concerning our contributions throughout the paper. The novel contributions have been listed at the end of the introduction.

2. The computational complexity in terms of time and space must be discussed. Also, compare the proposed method with existing methods in terms of computational complexity.

- Added a discussion of time and space complexity. All rule-based event detection algorithms should trivially be of linear time complexity, as events are tightly localized within the positional data.
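To illustrate the argument, consider the following minimal Python sketch of a single-pass, rule-based detector. The frame fields and the de-bounce threshold are hypothetical and chosen for illustration only; this is not the actual implementation from the paper.

from dataclasses import dataclass

@dataclass
class Frame:
    t: float        # timestamp in seconds
    player_id: str  # id of the player currently closest to the ball
    team: str       # team of that player

def detect_candidate_plays(frames, min_gap=0.5):
    # Single linear sweep over the frames: a new candidate play starts
    # whenever ball control switches to a different player, de-bounced
    # by min_gap seconds so noisy frames do not produce spurious events.
    events = []
    last_player = None
    last_t = float("-inf")
    for f in frames:
        if f.player_id != last_player and f.t - last_t >= min_gap:
            events.append({"t": f.t, "player": f.player_id, "team": f.team})
            last_t = f.t
        last_player = f.player_id
    return events

Each frame is visited exactly once and only constant state is carried between frames, which is why such detectors run in linear time with constant additional space beyond the output.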

3. The literature survey needs to be updated with current state-of-the-art methods. Some more papers on event detection in football should be included.

- We decided to limit ourselves to studies that operate on positional data and believe our selection of such studies to be complete. In case we missed scientific work on that specific topic, we would be happy to learn about it and add it to the paper! We did not explicitly include papers where events are detected directly from video, due to the large categorical differences between the two approaches, especially regarding their main goals and areas of application (comprehensive industry-scale automation vs. research purposes and internal model building in clubs/federations), as touched upon in the introduction section. A paragraph about the practical role of the proposed algorithm has been added to the conclusion.

4. The abstract needs to be improved, and the outcomes of the work in terms of the various performance measures achieved must be included in the abstract.

- Added more performance calculations in the abstract, mentioning the outcomes more explicitly.

5. Explain the problem and the gaps in the existing literature in a concise but self-contained way (although readers might wish to consult references, they should not be forced to do so).

- Added more details to the literature background and pointed out the research gap more clearly.

6. The organization of the paper can be added at the end of the introduction.

- Added a corresponding paragraph.

7. Comparative analysis of various performance parameters with respect to state-of-the-art methods must be discussed. More recent state-of-the-art approaches should be compared, and the experiments should use more sizable real-world data sets from public repositories (if any).

- Added a more explicit comparison of performance parameters with the state of the art and mentioned the size of the data sets as a limitation. While larger data sets would certainly be beneficial, these are not available to us due to the proprietary nature of most football match data. We also added a table with an overview of the number of reference events for each of the four data sets to provide a more explicit impression of our sample sizes.

8. Add the industrial significance of the proposed approach.

- Added a paragraph in the conclusion section.

9. Results should be verified on additional data sets. Describe in detail the data sets used and the expected outcomes, widening the experimental comparison to include other data and methods.

- Added detail about the data used. More data is not available due to the proprietary nature of football match data.

10. Comparative analysis of various performance parameters with respect to the various data sets must be discussed. The comparison can be somewhat unfair if different data sets are not used for the comparative analysis.

- Completed the comparison such that each comment in the results section covers all the analysed data sets, without leaving out any single one.

11. Limitations of the proposed work must be included.

- Added a paragraph in the conclusion discussing limitations.

12. Precision vs. recall curves of the proposed algorithms with respect to different data sets must be included.

- Since not all classifiers are probabilistic, we decided to use F1-scores to allow for a comparison between classifiers.
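For reference, the F1-score is the harmonic mean of precision and recall. A minimal Python sketch of the computation from raw event counts follows; the counts are purely illustrative, not results from the paper.

def f1_score(tp: int, fp: int, fn: int) -> float:
    precision = tp / (tp + fp)  # fraction of detected events that are correct
    recall = tp / (tp + fn)     # fraction of reference events that are detected
    return 2 * precision * recall / (precision + recall)

print(f1_score(tp=85, fp=20, fn=15))  # approximately 0.83

Because the F1-score is computed from hard detections rather than continuous scores, it applies equally to probabilistic and non-probabilistic classifiers.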

13. Implementation platforms with complete specifications of the system must be included.

- Added information about the software framework used for implementation. Hardware specifics would not be informative since the code is executable on any regular computer with a compatible Python environment and sufficient RAM.

14. How much data should be considered for training and testing for model implementation? Details of training and testing data sets must be tabulated.

- Added more information about the training and test sets in tabular form.

15. To make the proposed algorithm of this article more readable, use pseudo-code.

- Added pseudo-code.

16. In all results tables/figures, the utilized data sets (e.g., in Tables 2, 3, and 4) must be cited with proper and specific citations.

- Added the suggested citations in all tables.

17. Various visualized results based on the proposed work must be added, and the results compared with existing work.

- Added a more explicit performance comparison in the discussion section.

18. Comparative analysis with respect to real-time analysis is missing.

- Added a corresponding part in the discussion section.

If there remain any issues with our revised manuscript, we would be happy to address those!

Best regards

Jonas Bischofberger, corresponding author

Attachment

Submitted filename: rebuttal_letter.pdf

pone.0298107.s001.pdf (237.6KB, pdf)

Decision Letter 1

Ersan Arslan

20 Jan 2024

Event detection in football: Improving the reliability of match analysis

PONE-D-23-27502R1

Dear Dr. Bischofberger,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Ersan Arslan, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors have addressed the raised queries and improved the manuscript. I do not have any more queries related to this manuscript.

Reviewer #2: All my comments have been addressed and the manuscript has been successfully modified based on them. I accept it in its current form.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Dr. Mahendra Prasad

Reviewer #2: Yes: MOHAMMAD FARUKH HASHMI

**********

Acceptance letter

Ersan Arslan

21 Mar 2024

PONE-D-23-27502R1

PLOS ONE

Dear Dr. Bischofberger,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

If revisions are needed, the production department will contact you directly to resolve them. If no revisions are needed, you will receive an email when the publication date has been set. At this time, we do not offer pre-publication proofs to authors during production of the accepted work. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few weeks to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Ersan Arslan

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    Attachment

    Submitted filename: rebuttal_letter.pdf

    pone.0298107.s001.pdf (237.6KB, pdf)

    Data Availability Statement

    The data underlying this research is partially available under the following URL: https://github.com/metrica-sports/sample-data. The remaining data comes from commercial sources and we are not allowed to share it. It can be requested from the Austrian Football Federation (https://www.oefb.at/), via the “Abteilung für Wissenschaft, Analyse und Entwicklung”, or directly from the various data providers/rights holders, i.e. Stats Perform (https://www.statsperform.com/), the UEFA (https://www.uefa.com/), and Subsequent (https://subsequent.ai/).

