Forecasting Events in Soccer Matches Through Language

This paper introduces an approach to predicting the next event in a soccer match, a challenge bearing remarkable similarities to the problem faced by Large Language Models (LLMs). Unlike other methods that severely limit event dynamics in soccer, often abstracting from many variables or relying on a mix of sequential models, our research proposes a novel technique inspired by the methodologies used in LLMs. These models predict a complete chain of variables that compose an event, significantly simplifying the construction of Large Event Models (LEMs) for soccer. Utilizing deep learning on the publicly available WyScout dataset, the proposed approach notably surpasses the performance of previous LEM proposals in critical areas, such as the prediction accuracy of the next event type. This paper highlights the utility of LEMs in various applications, including betting and match analytics. Moreover, we show that LEMs provide a simulation backbone on which many analytics pipelines can be built, an approach opposite to the current specialized single-purpose models. LEMs represent a pivotal advancement in soccer analytics, establishing a foundational framework for multifaceted analytics pipelines through a singular machine-learning model.

1. Introduction

Soccer, despite its global popularity and economic impact, has lagged behind other sports in leveraging data analytics for insights. This gap stems from the inherent dynamism of the game, exceeding other team sports due to the larger number of players and constant interactions. As a result, soccer presents a vast, fertile ground for advancements in sports analytics, particularly through leveraging the power of artificial intelligence.

Recent developments in AI, specifically the emergence of foundational models capable of learning the full spectrum of a game and facilitating insightful interpretation, offer exciting possibilities for revolutionizing soccer analytics. Large Event Models (LEMs) (Mendes-Neves et al., 2024) exemplify this paradigm by presenting a novel approach to generating soccer events, serving as a valuable tool for developing generative models in this domain. Still, LEMs have several limitations.

We present a novel approach to building LEMs that addresses limitations of existing LEMs. Instead of relying on sequential inferences from multiple models, we draw inspiration from the core methodologies of language models. By tokenizing soccer event data, we enable a single model to effectively learn the "language" of soccer events. This single model performs sequential inferences, with each inference corresponding to a token representing a part of an event. Our model captures the most relevant aspects of soccer events with seven inferences, achieving superior performance in forecasting key variables compared to conventional LEMs.

Furthermore, we explore the extensive potential of LEMs. By leveraging their ability to model complex event sequences, LEMs can unlock valuable insights applicable to various domains within the soccer analytics ecosystem. These include:

Betting: LEMs can predict the likelihood of future events, empowering users to make informed betting decisions based on a comprehensive understanding of game dynamics.

Sports Analytics Tools: Integrating LEMs within advanced analytics tools can enable automated and in-depth analysis of match strategies, player performance, and tactical trends.

Simulation and Scenario Planning: LEMs can generate realistic simulations of hypothetical scenarios, aiding coaches and analysts in evaluating potential strategies and planning future matches.

By demonstrating the versatility and efficacy of our LEMs across diverse use cases and through comparison with prominent frameworks, we aim to solidify their position as a valuable tool for enriching the landscape of soccer analytics and generating a deeper understanding of the sport.

This paper is organized as follows:

Section 2 provides an overview of the current state of LEMs, including their limitations and areas to improve.

Section 3 details our approach to build a single-model LEM, including datasets used, encoding, formulation, and hyperparameters.

Section 4 presents several experiments, detailing results and applications.

Section 5 presents the concluding remarks.

2. Related Work

Soccer has witnessed a growing trend in data-driven analysis, aiming to uncover patterns and predict future outcomes. From injury prediction (Kakavas et al., 2020) to transfer values (McHale and Holmes, 2023), there are several applications where machine learning can help soccer teams and players improve their performance.

However, these applications are driven by specialized models. For each new application, a new model needs to be built from scratch, which is time-consuming and error-prone. To solve this, simulation environments (Kurach et al., 2020) and approaches (Mendes-Neves et al., 2021) have been proposed. However, they are still limited in scope.

LEMs show promise as progress toward a general approach (Mendes-Neves et al., 2024). They represent significant progress over previous approaches (Simpson et al., 2022), which were limited in scope, mostly because they used only a small portion of offensive events. By (1) extending the covered events and (2) significantly accelerating the inference process, the original LEM proposal enabled large-scale simulations of soccer matches. These simulations can serve many use cases, including those stated before: injury prediction, valuing player contribution, and forecasting probabilities, among others.

However, the current iteration of LEMs still faces significant challenges that limit their practical applicability. While the models advanced over previous iterations by enabling complete event forecasts, crucial issues remain. A major shortcoming lies in their reliance on three separate models performing three inferences, encompassing seven variables. This complex architecture leads to several drawbacks:

Complex Architecture: LEMs use three different deep learning models to predict different aspects of a soccer event.

Synchronization Challenges: Independent training can lead to incompatibility between the models and convergence to divergent local optima.

Partial Information: Inferences for certain variables may lack the full context required for optimal accuracy.

Hyperparameter Tuning Complexity: Tuning independent hyperparameters for each model significantly complicates the optimization process.

LLMs can predict complete sentences: they do not need a model to forecast nouns and another to forecast adjectives. The ability of LLMs to learn general knowledge and generate coherent data inspired the exploration of approaches similar to LLMs with the aim of solving the problem of forecasting the next event in a soccer match.

The application of techniques used in LLMs to other areas is not novel. LLMs have transcended text generation, demonstrating success in diverse domains like music composition (van den Oord et al., 2016) and physical simulation (Jain et al., 2022).

Generative artificial intelligence is still in the early stages for soccer. Our proposal provides a foundation to advance LEMs toward real applications in soccer by providing a significant improvement in terms of model accuracy and ease of implementation.

3. A Language-based Approach to LEMs

We use the public Wyscout dataset, which contains data from the Wyscout V2 API (Pappalardo et al., 2019). The dataset contains data for the 5 most valuable leagues in European football: England, Spain, Germany, Italy, and France. This dataset does not represent the most up-to-date data standards available in the industry. However, no publicly available dataset follows the most up-to-date data standards. For each event, we extract the key features highlighted in Figure 2.

We use the France, Germany, and Italy leagues in our training set, leaving the complete seasons of England and Spain for testing. Although not including data from the testing leagues in the training set leads to worse performance, we opted to leave these leagues completely out of the training set so they could be used for season-long application development using our models, without the risk of overfitting.

The 11 features used per event are the following:

Event Type: Categorical variable indicating the type of the event, e.g., pass, shot, take on, tackle.

isGoal: Binary variable indicating whether the event is related to a goal.

isAccurate: Binary variable indicating if the action was successful.

isHomeTeam: Binary variable indicating if the action was performed by the home team.

Period: Integer variable describing the current half being played, i.e., 1st or 2nd.

Minute: Integer variable with the current minute.

Second: Integer variable with the current second.

X: Integer variable with the X coordinate of the event.

Y: Integer variable with the Y coordinate of the event.

Home Score: Integer variable indicating the current score of the home team.

Away Score: Integer variable indicating the current score of the away team.

It is important to note that event data has several limitations, some of which our model inherits. Examples are (1) the absence of off-the-ball data, which represents over 98% of the time players spend on the pitch, (2) biases regarding annotation edge cases, and (3) the imprecision of manually annotated continuous variables, among others.

3.2. Leveraging Language Models for Event Prediction

Soccer can be abstracted to Markov chains (Rudd, 2011; Decroos et al., 2019). A soccer match is a sequence of events performed by the players on the field. This sequential nature of soccer makes it a fertile ground for Markov-based approaches to forecast many aspects of the game, e.g., next event (Simpson et al., 2022; Mendes-Neves et al., 2024), action value (Decroos et al., 2019). It is easy to draw the parallel between generating the next events and generating the next word. As LLMs use the previous words as context to forecast the next word, LEMs use the previous events as context to forecast the next event.

There are several aspects of LLMs that are relevant for our approach. LLMs provide a basis for sequential generation, which is crucial: we do not want to forecast only the next event, but rather a chain of future events and their outcomes. Sequential generation is what enables LEMs to generate full soccer matches, enabling the extraction of multiple metrics from the data. LLMs are also very good at understanding previous context. Context in soccer is key. The next event should reflect the current state of the game, including several long-term aspects such as the current score.

In the abstract, LLMs tokenize and encode the text to facilitate the learning problem. Tokenization is the process of breaking down a large piece of text or data into smaller units called tokens (Otter et al., 2021). For example, the word "learning" can be split into the tokens "learn" and "ing". This facilitates the algorithm's learning process in multiple aspects. The "ing" token is repeated across multiple instances in the English language: decoupling the "ing" from "learning" enables the reuse of the same token across all gerund words. Then, the tokens need to be encoded into numbers, which can be used by deep learning models.

For our proposal, we do not perform tokenization of event data. The vocabulary in event data is extremely limited, with only 140 different words used for the selected data points. There is no semantic aspect to our data, other than sequentiality. Therefore, in this use case, tokenization does not have a significant impact on the performance of our models.

The same rationale can be applied to the encoding. The simplicity of our approach makes advanced approaches, like embeddings (Otter et al., 2021), inefficient. We employed an ordinal encoding approach, where tokens are encoded in the following way:

numeric values 0 to 100 occupy the first 101 positions of the encoder.

categorical values regarding action types occupy the next 37 positions of the encoder, ordered from the most to least frequent.

<PERIOD_OVER> and <GAME_OVER> tokens occupy the next 2 positions of the encoder.

the <NaN> token occupies the last position of the encoder.
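The ordinal encoding described in the list above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, and because the full frequency-ordered list of 37 action types is not given in the text, the list is filled with placeholder names except for the four action types whose codes can be read from the encoded sample (101, 102, 103, and 116).

```python
from typing import Dict, List, Optional, Union

NUMERIC_SLOTS = 101  # numeric values 0..100 keep their own value as code
SPECIAL_TOKENS = ("<PERIOD_OVER>", "<GAME_OVER>", "<NaN>")


def build_vocabulary(action_types_by_frequency: List[str]) -> Dict[str, int]:
    """Assign codes to action types (most frequent first), then special tokens."""
    vocab: Dict[str, int] = {}
    next_code = NUMERIC_SLOTS
    for name in list(action_types_by_frequency) + list(SPECIAL_TOKENS):
        vocab[name] = next_code
        next_code += 1
    return vocab


def encode(value: Optional[Union[int, bool, str]], vocab: Dict[str, int]) -> int:
    """Encode one field of an event as a single integer token."""
    if value is None:
        return vocab["<NaN>"]
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return int(value)
    if isinstance(value, int):
        if not 0 <= value < NUMERIC_SLOTS:
            raise ValueError(f"numeric value out of range: {value}")
        return value
    return vocab[value]


# Placeholder action-type list (assumption): only four positions are known.
action_types = [f"type_{i}" for i in range(37)]
action_types[0] = "simple_pass"
action_types[1] = "ground_attacking_duel"
action_types[2] = "ground_defending_duel"
action_types[15] = "shot"
vocab = build_vocabulary(action_types)

# First row of the sample of event data.
raw_event = ["ground_attacking_duel", False, True, False, 2, 47, 11, 93, 14, 0, 2]
encoded_event = [encode(field, vocab) for field in raw_event]
```

Under these assumptions, `encoded_event` reproduces the first encoded row of the sample, and the `<NaN>` token receives the last code of the encoder.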

An example of the resulting data format is presented in Table 2 , extracted from the sample of event data presented in Table 1 .

Table 1: Sample of event data.
EventType IsGoal IsAccurate IsHome Period Minute Second X Y HomeScore AwayScore
ground_attacking_duel False True False 2 47 11 93 14 0 2
ground_defending_duel False False True 2 47 11 7 86 0 2
simple_pass False True False 2 47 12 97 36 0 2
shot True True False 2 47 14 87 43 0 3

Table 2: The same sample after encoding.
EventType IsGoal IsAccurate IsHome Period Minute Second X Y HomeScore AwayScore
102 0 1 0 2 47 11 93 14 0 2
103 0 0 1 2 47 11 7 86 0 2
101 0 1 0 2 47 12 97 36 0 2
116 1 1 0 2 47 14 87 43 0 3

Our approach fixes several limitations of previous proposals: (1) it simulates all action types, overcoming the limitation of only being able to simulate the attacking portion of the game (Simpson et al., 2022), and (2) it generates events from a single model, greatly simplifying the architecture and preserving complete sequentiality of the generation process.

The first limitation is fixed by modeling the use of all event types in our model, enriching the amount of captured event dynamics. The second limitation is overcome by our methodology, using a similar approach to language generation models to address the problem of event forecasting.

3.3. Formulation

LEMs use a set ($S$) of previous events ($e$) to forecast the next event, as formalized in Equations 1 and 2. This set of events has size $k$, defining how many previous events are used to forecast the next event.

$$S = \{ e_{i-k+1}, \ldots, e_{i-1}, e_{i} \} \tag{1}$$
$$e_{i+1} = \mathrm{LEM}(S) \tag{2}$$

Our approach models the probability ($p$) for each possible token ($t$). Therefore, we can obtain the probability vector ($\hat{p}$) according to Equation 3.

$$\hat{p} = \left[ p(t_1 \mid S), p(t_2 \mid S), \ldots, p(t_n \mid S) \right] \tag{3}$$

The predicted $\hat{p}$ vector is then used to sample the next token in the event through a multinomial sampler. The sampler is restricted according to which part of the event is being predicted. The restrictions act as a diversity control system. If it is predicting the first token within an event, corresponding to the event type, then only the probabilities related to event type tokens are passed to the sampler. The restrictions are required to ensure consistent outputs from the models. As the model will be used for millions of inferences and simulations, any residual probability that is not consistent with the event data format can derail the prediction of future events, making the simulation unusable.

However, our approach can only forecast one token at a time. To solve this problem, we need to provide the model with information about the current composition of the event. The model needs to be informed about which position of the event tuple is being predicted. To achieve this, the predicted elements are added to the set $S$. This way, the model is informed about which positions of the event were already predicted and can then better forecast the next position within the event. For modeling purposes, we pad the flattened vector with the <NaN> token, which the model can use to determine which predictions are valid for the next token.
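The restricted sampler and the token-by-token construction of an event can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code. Several details are assumptions made for the example: the order of the seven tokens after the event type, the inclusion of the two special tokens among the valid event-type choices, the uniform fallback when all allowed probabilities are zero, and the stand-in `uniform_model`, which replaces the trained network.

```python
import random
from typing import Callable, List, Sequence

VOCAB_SIZE = 141
NAN_TOKEN = 140
BINARY = [0, 1]
NUMERIC = list(range(101))
# Assumption: 37 action types plus <PERIOD_OVER> and <GAME_OVER> (codes 101..139).
EVENT_TYPES = list(range(101, 140))

# Assumed token order: type, isGoal, isAccurate, isHomeTeam, TimeElapsed, X, Y.
ALLOWED_PER_POSITION = [EVENT_TYPES, BINARY, BINARY, BINARY, NUMERIC, NUMERIC, NUMERIC]

Model = Callable[[List[int]], Sequence[float]]


def sample_restricted(probs: Sequence[float], allowed: List[int], rng: random.Random) -> int:
    """Multinomial sampling over the allowed tokens only."""
    weights = [probs[token] for token in allowed]
    if sum(weights) <= 0.0:
        return rng.choice(allowed)  # fallback if the model gives no mass to valid tokens
    return rng.choices(allowed, weights=weights, k=1)[0]


def generate_event(model: Model, previous_events: List[int], rng: random.Random) -> List[int]:
    """Fill a <NaN>-padded event one token at a time, feeding back each prediction."""
    partial = [NAN_TOKEN] * len(ALLOWED_PER_POSITION)
    for position, allowed in enumerate(ALLOWED_PER_POSITION):
        probs = model(previous_events + partial)  # one inference per token
        partial[position] = sample_restricted(probs, allowed, rng)
    return partial


def uniform_model(flat_input: List[int]) -> Sequence[float]:
    """Stand-in for the trained network: equal probability for every token."""
    return [1.0 / VOCAB_SIZE] * VOCAB_SIZE


previous = [102, 0, 1, 0, 2, 47, 11, 93, 14, 0, 2]  # one encoded previous event (k = 1)
event = generate_event(uniform_model, previous, random.Random(0))
```

Even with a model that spreads probability over every token, the restrictions guarantee that each position of the generated event holds a token that is valid for that position. In a PyTorch setting, the same idea could be expressed by zeroing the disallowed entries of the probability vector before calling `torch.multinomial`.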

There are two exceptions where the predicted event does not follow the same format as the input:

Current scores for home and away teams are computed externally, avoiding the burden of extra inferences to calculate a variable that is deterministically computable.

The Period, Minute, and Second variables are transformed into a TimeElapsed variable. This reduces the number of inferences required by two. All three variables are then deterministically computed from their previous values and the predicted Event Type and TimeElapsed variables.
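The two deterministic updates above can be sketched as plain functions. This is an illustration under explicit assumptions, not the authors' implementation: TimeElapsed is taken to be a whole number of seconds since the previous event, a period change is assumed to restart the clock at minute 45 times the period just ended, and a goal is credited to the team performing the event (own goals and goalkeeper events tagged as goals are ignored here). The function names are hypothetical.

```python
from typing import Tuple


def advance_clock(period: int, minute: int, second: int,
                  time_elapsed: int, period_over: bool = False) -> Tuple[int, int, int]:
    """Rebuild Period/Minute/Second from the previous clock and TimeElapsed."""
    if period_over:
        # Assumption: the next period starts at 45:00, 90:00, ...
        return period + 1, 45 * period, 0
    total_seconds = minute * 60 + second + time_elapsed
    return period, total_seconds // 60, total_seconds % 60


def update_score(home_score: int, away_score: int,
                 is_goal: bool, is_home_team: bool) -> Tuple[int, int]:
    """Update the running score outside the model (simplified, see assumptions)."""
    if not is_goal:
        return home_score, away_score
    if is_home_team:
        return home_score + 1, away_score
    return home_score, away_score + 1


# Pass at 47:12 in the 2nd half, followed by a shot 2 seconds later.
clock_after_shot = advance_clock(2, 47, 12, time_elapsed=2)
# Away team scores from 0-2.
score_after_shot = update_score(0, 2, is_goal=True, is_home_team=False)
```

With the values taken from the sample of event data, the clock moves from 47:12 to 47:14 and the score from 0-2 to 0-3, matching the last row of the sample.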

3.4. Deep Learning

Recent advances in deep learning architectures, such as RNNs (Hochreiter and Schmidhuber, 1997) and Transformers (Radford et al., 2018), enabled large performance increases in LLMs. However, these architectures increase the number of learnable parameters in the neural networks. In LLMs, the increase in the number of parameters can be compensated with an increase in computational power and larger datasets. For example, GPT-2 models (Radford et al., 2019) use 40 GB of data to train a 1.5B-parameter model.

Due to the constraints in publicly available soccer event data, we opted to use a multi-layer perceptron architecture for our model. The available event data in our dataset, after pre-processing, is 100 MB. Although we see significant gains to be made using more advanced architectures, the size of data available to us is not sufficient to thoroughly explore these approaches.

Our architecture was determined empirically, detailed as follows:

Hidden Layers: The model contains 3 layers with 512 neurons each, with the lite version of the model containing 2 layers with 256 neurons each. For the k=1 model, these architectures contain around 600k and 100k parameters respectively. We use ReLU as the activation function.

Learning Rate: Initiated at a value of 0.001, the learning rate undergoes dynamic adjustment each epoch, following a cosine annealing schedule (Huang et al., 2017). This approach aids in stabilizing the learning process over epochs.

Epochs: The training was conducted over 50 epochs. This number of epochs was chosen to allow sufficient time for the model to converge.

Loss Function: Binary Cross-Entropy Loss (BCELoss) was used as the criterion for training the model. This choice is particularly suitable for binary classification problems, offering a probabilistic approach to classification tasks.

Optimizer: The Adam optimizer (Kingma and Ba, 2017) was employed for its efficiency in handling sparse gradients and adaptive learning rate capabilities.
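Two of the settings listed above lend themselves to a quick worked check: the parameter counts of the two architectures and the shape of the cosine annealing schedule. The sketch below uses only the Python standard library. It assumes that "layers" refers to hidden layers, that the k=1 input has 18 values (11 fields of the previous event plus 7 slots for the event being built), that the output has one unit per token (141), and that the learning rate decays to a minimum of 0. None of these values are stated explicitly in the text.

```python
import math
from typing import List


def count_mlp_parameters(input_dim: int, hidden: List[int], output_dim: int) -> int:
    """Weights plus biases of a fully connected network."""
    sizes = [input_dim] + list(hidden) + [output_dim]
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


def cosine_annealing_lr(epoch: int, total_epochs: int,
                        lr_max: float = 0.001, lr_min: float = 0.0) -> float:
    """Learning rate at a given epoch under a single cosine annealing cycle."""
    cosine = math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cosine)


INPUT_DIM = 18    # assumption: 11 previous-event fields + 7 partial-event slots
OUTPUT_DIM = 141  # one output per token of the encoder

full_params = count_mlp_parameters(INPUT_DIM, [512, 512, 512], OUTPUT_DIM)
lite_params = count_mlp_parameters(INPUT_DIM, [256, 256], OUTPUT_DIM)

schedule = [cosine_annealing_lr(epoch, 50) for epoch in range(51)]
```

Under these assumptions the counts come out at 607,373 and 106,893 parameters, in line with the "around 600k and 100k" figures quoted above, and the schedule starts at 0.001, passes through 0.0005 at epoch 25, and reaches 0 after 50 epochs.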

One of the main purposes of LEMs is to enable the large-scale simulation of soccer games. To do so, the user only needs to query the model iteratively, updating the current state of the game with each new prediction.

Moreover, the LEM is implemented in such a way that the only limiting factor for the number of parallel simulations is the amount of available GPU memory. This architecture allows the simulation of millions of soccer matches concurrently, enabling simulation-based insights at large scale.

In some sports, the current event frequently has a long-term impact on the outcome of the match. Due to soccer's low-scoring nature, only a few factors can significantly change the current game status and completely alter the context of the match. This is a useful property for our models: soccer contains many "reset points", such as set pieces or the ball going out of bounds, after which a new sequence starts. With the start of a new sequence, the cumulative error of our predictions has a soft reset. While some variables, such as the current score, have a significant impact on the following outcome, these resets provide our model with an opportunity to rebalance its predictions in cases where the model starts to hallucinate.

3.5. Implementation

For the implementation, we used PyTorch running on an NVIDIA 3060 12GB. The code required to reproduce this work is available at https://github.com/nvsclub/LargeEventsModel.

4. Experiments

4.1. Training the LEMs

For this paper, we trained three models:

K=1: a model which only uses the previous event as the game state, similar to the original proposal of LEM.

K=1s: a lite model trained with a significantly smaller number of parameters (~6x fewer parameters than K=1).

K=3: a model using the last 3 events as the game state.

The performance of the three models is presented in Table 3 .

Variable Metric BL LEM K=1 K=1s K=3
Type ACC 40.8% 55.7% 57.5% 57.3% 62.2%
Type F1 0.24 0.5 0.52 0.52 0.57
Goal ACC 99.7% 99.8% 99.8% 99.8% 99.8%
Goal F1 0.81 0.87 0.68 0.68 0.68
Accurate ACC 67.8% 81.7% 82.7% 82.5% 82.8%
Accurate F1 0 0.69 0.87 0.87 0.87
Home ACC 50.9% 93.8% 92.1% 91.5% 93.6%
Home F1 0.67 0.94 0.92 0.92 0.94
Time MAE 3.1 1.6 1.6 1.7 1.7
Time R2 0 0.55 0.45 0.39 0.42
X MAE 21.2 8.5 6.7 7.4 6.5
X R2 0 0.29 0.81 0.77 0.82
Y MAE 26.5 15.6 12.1 12.8 11.4
Y R2 0 0.64 0.54 0.50 0.59

Our proposed LEMs significantly improve soccer event prediction accuracy compared to a traditional baseline model and the previous iteration of LEMs. Table 3 reveals substantial gains across various metrics, including event type, accuracy prediction, and spatial coordinates. Only in two variables, isHome and TimeElapsed, did our proposal not improve over the previous iteration. Moreover, some improvements are substantial: (1) event type accuracy increased by 6.5 percentage points for the K=3 model, and (2) the error in spatial coordinates X and Y decreased by 24% and 28%, respectively.

The results regarding the Goal/Accurate variables offer a glimpse into the trade-off this approach has against the original LEM proposal. The greater granularity of predictions, which are now made individually rather than jointly, leads to increased performance in the variable that is predicted later, at a performance cost for the variable that is predicted earlier. The original LEM proposal leads to better predictive performance. However, the methodology requires multiple variables to be forecasted simultaneously. This is where our current proposal allows for a significant leap. We can now forecast all variables sequentially, increasing the amount of information available in each event prediction, making the predicted events more cohesive than in the original approach.

We also ran tests to evaluate the inference speed of each model. Compared to K=1, K=1s offers a 21% increase in inference speed, while K=3 slows down inference by 62%.

4.2. Model Inspection

This section focuses on the model’s ability to predict the next event location, a crucial aspect for understanding player movement patterns and team strategies. We evaluated the K=1 model by analyzing the learned transition matrices, representing the probability of transitioning from one location to another, presented in Figure 3 .

The transition matrices exhibit significant matching, particularly from bottom left to top right. This indicates the model accurately predicts transitions. A prominent diagonal extends from the top left to the bottom right, highlighting the high probability of possession changes. As events are always annotated from the team attacking perspective, the coordinates change to the symmetrical point when there is a change of possession between teams.

4.3. Computing Situational Expected Goals Maps

The expected goals metric (xG) is a widely used measure of opportunity quality in soccer. It measures the probability of a given shot situation leading to a goal. Figure 4 shows the expected goals map from two different game states. The first game state happens after a pass in the coordinates (80, 50), while the second happens after a cross in the coordinates (90, 80). For the K=3 model, both actions are preceded by two passes from coordinates (20, 50) and (50, 50), emulating a fast transition situation.

Several insights are visible in Figure 4. First, we observe that there are data points far from the goal with anomalous values for expected goals. These anomalies occur due to a bias in event data: when a cross goes in the direction of the goal, it is labeled as a shot if it ends in a goal but is still labeled as a cross if it gets claimed by the keeper. This discrepancy leads to divergences between real goal expectations and our model's goal expectation.

The K=1s model struggles to learn the basic patterns in data. The results are less precise than the peer models. The trade-off of lowering the number of parameters for increased inference speed affects performance substantially.

Interesting differences arise between K=1 and K=3. Both models are large enough to learn the patterns in the data, but the larger context lets the K=3 model learn different patterns from the K=1 model. The K=3 model has access to two previous actions that indicate progressive passes from the shooting team. This means that the model is expecting a shot from a situation that is likely to be a fast transition. This added information makes its maps more precise than those of the K=1 model, which is unaware of this context.

4.4. Forecasting Short-term Probabilities

LEMs also have the ability to forecast probabilities at multiple time scales. In soccer, we define short-term probability as the probability of scoring a goal in a short period after the event.

The most famous use case for short-term probabilities is the momentum indicator in live score apps (e.g., Sofascore). Figure 5 shows an example of a momentum indicator.

4.5. Forecasting Long-term Probabilities

In contrast to short-term probabilities, long-term probabilities refer to a larger time scale, usually to predict the outcome of a full match. With the simulations run on LEMs, we can calculate the frequency of outcomes across a range of variables: everything from who wins (see Figure 6) to goals (see Figure 7), corners, and others.

The forecasted probabilities for both game outcome and the probability of the game ending with over or under 2.5 goals follow the expected pattern. The current outcome of the match increases its probability until a score change happens. When a goal happens, the probabilities sharply move, benefiting the team that scored.

Both Figures 6 and 7 appear noisy. The noise represents the small fluctuations of probabilities that happen due to the flow of the match: as a team becomes likelier to score, its probabilities increase at a much smaller scale than when it scores a goal. Therefore, the apparent noise actually reflects the changes in probability between events.

One of the advantages of using LEMs for such purposes is that with the same simulations we can calculate a wide range of probabilities. For each new variable we want to forecast, we do not need to run a new simulation. We can reuse the generated event data from previous simulations and extract the probabilities. As for the limitations, the model cannot capture the shift in probabilities due to a red card in the 63rd minute, right before Barcelona's second goal. This factor is absent from the context given to LEMs.
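The reuse of simulations described above can be illustrated with a short sketch: a single set of simulated final scores yields both the match-outcome probabilities and the over/under 2.5 goals probability as simple frequencies. The function name and the four simulated scores are hypothetical, chosen only to make the arithmetic easy to follow. They are not results from the paper.

```python
from typing import Dict, List, Tuple


def outcome_probabilities(final_scores: List[Tuple[int, int]]) -> Dict[str, float]:
    """Estimate several probabilities from the same simulated (home, away) scores."""
    n = len(final_scores)
    if n == 0:
        raise ValueError("at least one simulation is required")
    home_wins = sum(1 for home, away in final_scores if home > away)
    draws = sum(1 for home, away in final_scores if home == away)
    over_2_5 = sum(1 for home, away in final_scores if home + away > 2.5)
    return {
        "home_win": home_wins / n,
        "draw": draws / n,
        "away_win": (n - home_wins - draws) / n,
        "over_2.5": over_2_5 / n,
        "under_2.5": (n - over_2_5) / n,
    }


# Hypothetical final scores from four simulated continuations of a match.
simulated_scores = [(1, 0), (2, 2), (0, 1), (3, 1)]
probabilities = outcome_probabilities(simulated_scores)
```

Adding a new market, such as corners, would only require counting another quantity over the same simulated matches rather than running a new model.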

By estimating both short-term and long-term probabilities in soccer matches, we enable the calculation of action values via the Valuing Actions by Estimating Probabilities (VAEP) framework (Decroos et al., 2019). VAEP defines the value of an action as the resultant increase in the probability of scoring minus the increase in the probability of conceding. Using this framework, we calculated the VAEP value for the actions before the last goal in the Real Madrid - Barcelona match on December 23, 2017. These results are presented in Table 4. We compare VAEP scores with ST/10 values, which assess the short-term scoring impact of actions within the time frame of 10 actions, to understand their alignment. Furthermore, we examine long-term probabilities (LT/inf) and a modified version (LT*/inf) calculated with a hypothetical 0-0 score to evaluate their effectiveness in different contexts. All values use the K=1 model.

ID Type Team VAEP ST/10 LT/inf LT*/inf
912061 pass away 0 -0.011 -0.002 0.002
912062 pass away -0.01 -0.001 0.002 -0.02
912063 pass away 0.01 -0.003 -0.002 -0.003
912064 take on away 0.05 -0.003 0.002 -0.003
912065 tackle home - -0.012 0.001 -0.029
912066 pass away 0.09 0.109 0.002 0.18
912067 shot away 0.83 0.865 0 1.727
912068 reflexes home - 0.003 0 0.001
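The VAEP definition used for these values, the increase in scoring probability minus the increase in conceding probability, can be written as a one-line function. The sketch below is only an illustration of that definition: the function name and the probabilities in the example are hypothetical, and in practice the four inputs would come from a probability estimator such as frequencies over LEM simulations.

```python
def action_value(p_score_before: float, p_score_after: float,
                 p_concede_before: float, p_concede_after: float) -> float:
    """Value of an action: change in P(score) minus change in P(concede)."""
    delta_score = p_score_after - p_score_before
    delta_concede = p_concede_after - p_concede_before
    return delta_score - delta_concede


# Hypothetical example: a pass raises the scoring probability from 0.10 to 0.19
# and leaves the conceding probability unchanged at 0.02.
pass_value = action_value(0.10, 0.19, 0.02, 0.02)

# Hypothetical example: a lost ball raises the conceding probability instead.
turnover_value = action_value(0.10, 0.04, 0.02, 0.05)
```

The first call gives a value of about 0.09, while the second gives about -0.09, since the action both lowers the chance of scoring and raises the chance of conceding.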

VAEP values closely resemble ST/10 scores for most actions, indicating their effectiveness in capturing the immediate impact on scoring opportunities. As expected, the pass directly assisting the goal exhibits a significantly high VAEP score, reflecting its critical role in creating the scoring chance. Furthermore, there is an agreement on the high impact of the shot action, with both VAEP and ST/10 giving similar scores.

One action where VAEP and ST/10 disagree is the value of the dribble. The difference is explained by the better context given to the original VAEP probability estimator: the estimator knows where the dribble ends, while LEMs currently do not accept this information as input. Therefore, the added information of VAEP leads to a different valuation for the dribble. The initial LT/inf values, considering the entire game and its outcome, underestimate the value of the final shot since the match was already decided. However, when we adjust for a hypothetical 0-0 score (LT*/inf), the value of the shot aligns more accurately with the expected VAEP impact, although using a different scale.

These results are very important from the point of view of player valuation. Currently, the most relevant metrics for player valuation aggregate the values of a player's events. Therefore, by evaluating events we are effectively evaluating the players.

5. Conclusion

This paper presented a LEM-based approach for predicting event sequences in soccer matches with remarkable results. The proposed model achieved good accuracy and surpassed previous iterations, demonstrating the effectiveness and promise of this methodology.

Future research focusing on contextual enrichment has the potential to unlock even greater accuracy and broader applicability of LEMs. Exploring advanced architectures like RNNs and Transformers can also propel this approach to new heights. This approach is a significant step forward in developing LEMs, opening doors for groundbreaking applications in soccer research. The possibilities are vast, from enhanced tactical analysis to betting applications.

References

  • Decroos et al. (2019) Tom Decroos, Lotte Bransen, Jan Van Haaren, and Jesse Davis. 2019. Actions Speak Louder than Goals: Valuing Player Actions in Soccer. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, Anchorage AK USA. https://doi.org/10.1145/3292500.3330758
  • Hochreiter and Schmidhuber (1997) Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long Short-Term Memory. Neural Computation 9, 8 (Nov. 1997), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
  • Huang et al. (2017) Gao Huang, Yixuan Li, Geoff Pleiss, Zhuang Liu, John E. Hopcroft, and Kilian Q. Weinberger. 2017. Snapshot Ensembles: Train 1, get M for free. arXiv:1704.00109 [cs.LG]
  • Jain et al. (2022) Ajay Jain, Ben Mildenhall, Jonathan T. Barron, Pieter Abbeel, and Ben Poole. 2022. Zero-Shot Text-Guided Object Generation with Dream Fields. arXiv:2112.01455 [cs.CV]
  • Kakavas et al. (2020) Georgios Kakavas, Nikolaos Malliaropoulos, Ricard Pruna, and Nicola Maffulli. 2020. Artificial intelligence: A tool for sports trauma prediction. Injury 51 (Aug. 2020), S63–S65. https://doi.org/10.1016/j.injury.2019.08.033
  • Kingma and Ba (2017) Diederik P. Kingma and Jimmy Ba. 2017. Adam: A Method for Stochastic Optimization. arXiv:1412.6980 [cs] (Jan. 2017). http://arxiv.org/abs/1412.6980
  • Kurach et al. (2020) Karol Kurach, Anton Raichuk, Piotr Stańczyk, Michał Zając, Olivier Bachem, Lasse Espeholt, Carlos Riquelme, Damien Vincent, Marcin Michalski, Olivier Bousquet, and Sylvain Gelly. 2020. Google Research Football: A Novel Reinforcement Learning Environment. arXiv:1907.11180 [cs.LG]
  • McHale and Holmes (2023) Ian G. McHale and Benjamin Holmes. 2023. Estimating transfer fees of professional footballers using advanced performance metrics and machine learning. European Journal of Operational Research 306, 1 (April 2023), 389–399. https://doi.org/10.1016/j.ejor.2022.06.033
  • Mendes-Neves et al. (2024) Tiago Mendes-Neves, Luís Meireles, and João Mendes-Moreira. 2024. Towards a Foundation Large Events Model for Soccer. [Manuscript under revision at the Machine Learning Journal] (2024).
  • Mendes-Neves et al. (2021) Tiago Mendes-Neves, João Mendes-Moreira, and Rosaldo J. F. Rossetti. 2021. A Data-Driven Simulator for Assessing Decision-Making in Soccer. In Progress in Artificial Intelligence, Goreti Marreiros, Francisco S. Melo, Nuno Lau, Henrique Lopes Cardoso, and Luís Paulo Reis (Eds.). Vol. 12981. Springer International Publishing, Cham, 687–698. https://doi.org/10.1007/978-3-030-86230-5_54
  • Otter et al. (2021) Daniel W. Otter, Julian R. Medina, and Jugal K. Kalita. 2021. A Survey of the Usages of Deep Learning for Natural Language Processing. IEEE Transactions on Neural Networks and Learning Systems 32, 2 (2021), 604–624. https://doi.org/10.1109/TNNLS.2020.2979670
  • Pappalardo et al. (2019) Luca Pappalardo, Paolo Cintia, Alessio Rossi, Emanuele Massucco, Paolo Ferragina, Dino Pedreschi, and Fosca Giannotti. 2019. A public data set of spatio-temporal match events in soccer competitions. Scientific Data 6, 1 (Dec. 2019), 236. https://doi.org/10.1038/s41597-019-0247-7
  • Radford et al. (2018) Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving Language Understanding by Generative Pre-Training. (2018). https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf
  • Radford et al. (2019) Alec Radford, Jeff Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language Models are Unsupervised Multitask Learners. (2019).
  • Rudd (2011) Sarah Rudd. 2011. A Framework for Tactical Analysis and Individual Offensive Production Assessment in Soccer Using Markov Chains. In New England Symposium on Statistics in Sports. http://nessis.org/nessis11/rudd.pdf
  • Simpson et al. (2022) Ian Simpson, Ryan J. Beal, Duncan Locke, and Timothy J. Norman. 2022. Seq2Event: Learning the Language of Soccer Using Transformer-based Match Event Prediction. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. ACM, Washington DC USA, 3898–3908. https://doi.org/10.1145/3534678.3539138
  • van den Oord et al . (2016) Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, and Koray Kavukcuoglu. 2016. WaveNet: A Generative Model for Raw Audio. arXiv:1609.03499 [cs.SD]

Kendall correlations and radar charts to include goals for and goals against in soccer rankings

  • Original Paper
  • Open access
  • Published: 17 September 2024

  • Roy Cerqueti 1 , 2 ,
  • Raffaele Mattera   ORCID: orcid.org/0000-0001-8770-7049 1 &
  • Valerio Ficcadenti 3  

This paper addresses the way sporting teams and athletes are ranked in sports competitions. Starting from the paradigmatic case of soccer, we advance a new method for ranking teams in the official national championships through computational statistics methods based on Kendall correlations and radar charts. In detail, we consider the goals for and against the teams in the individual matches as a further source of score assignment beyond the usual win-tie-lose trichotomy. Our approach overcomes some biases in the scoring rules that are currently employed. The methodological proposal is tested on the relevant case of the Italian “Serie A” championships played during 1930–2023.

1 Introduction

Sports competitions are often structured in official championships, where individual athletes or sporting teams compete to win. In such championships, one has a list of matches where the same teams/players play different games under a plethora of rules to be respected. In some cases, the team/player losing a match is eliminated from the competition—like in Grand Slam tennis tournaments, such as Wimbledon, the US Open, the Australian Open, and the French Open, where the winning cup is assigned at the final game to one of the last two surviving players. In other cases, all the teams/players play the same number of matches, and the winner comes out from the outcomes of all the matches—this is the case of football, with the official championships of European countries like Serie A in Italy, Premier League in the UK, Ligue 1 in France and Campeonato Nacional de Liga de Primera División in Spain. In the former case, there is no real need to quantify the performance of the players to identify the winner and the other positions in the final ranking—the winner is the player who wins the final match, while her/his competitor takes a “silver medal”. In the latter case, one has to state some rules that assign a score to the playing teams at the end of each match. The analysis of the existing scoring rules and the proposal of new criteria—more reasonable, from different perspectives—offers room to carry out scientific research at a methodological level but also in the context of applications, see Sziklai et al. ( 2022 ) for an overview on tournaments’ efficacy.

In general, sports statistics is a widely acknowledged field of science (see, e.g. Albert and Koning 2007 ). One of the most famous papers in sports statistics is Reep and Benjamin ( 1968 ), where the authors analysed more than 2500 football matches and found that fewer passes are associated with a higher probability of a goal. This paper is the root of the football philosophy of the so-called “long ball system”, for which the ball should be kicked over long distances to avoid a high number of passes. There was (and still is) a long debate on Reep and Benjamin’s results regarding the presence of some biases in the analysis. From our point of view, debating the outcomes of the analysis does not affect the universal validity of Reep and Benjamin’s research question. Intuitively, statistics in sports might be efficiently exploited to advance methods to predict the outcomes of different matches. On this, Baker and Scarf ( 2006 ) face the case of 20 annual sporting contexts by including the heterogeneity of the prediction criteria in their investigation. More recently, Mattera ( 2023 ) provides a forecasting exercise of football outcomes by employing score-driven models. Still, in the context of football, Heuer et al. ( 2010 ) advance a Poisson Process-based model for predicting the outcomes of football matches.

One can deal with the players’ scores and performance from a perspective still related to the outcomes. In this respect, Volf ( 2009 ) provides a view of the scores in sports matches as the realizations of a point process based on the plethora of elements surrounding matches and players. Along the same line, Gabel and Redner ( 2012 ) deal with the scoring procedure of basketball games and elaborate on a random walk-type stochastic process behind the evolution of such a procedure over time. According to Volf ( 2009 ), Higham et al. ( 2014 ) identify the performance indicators for the case of rugby by highlighting their roles in the formation of the scores of the teams. Also, Boys and Philipson ( 2019 ) discuss the ranking procedures of sportsmen in the special context of cricket. A relevant contribution is Sandri et al. ( 2020 ), where the authors explore game performance variability through Markov switching models. In Ausloos ( 2024 ), the author offers a new perspective on the way the final ranking of cyclist rides should be carried out. The interested reader is also signposted to, e.g., Strauss and Arnold ( 1987 ), Merritt and Clauset ( 2014 ), Migliorati et al. ( 2023 ) and references therein contained.

This paper adds and contributes to the literature on scoring procedures with an application to the relevant case of football championships. Specifically, we propose a novel method for assigning a score to the teams to identify the winner and the final ranking of every season of the considered championship. In doing so, we are close to several studies dealing with the analysis of the performance and the scoring procedures in football matches. We mention Ausloos et al. ( 2014 ), where the authors deal with the analysis of the structure of the ranking when considering UEFA and FIFA championships at a country level. Ausloos ( 2014 ) provides a view of the football rankings as unified frameworks through a rank-size analysis, with specific attention to the illustrative power of the Lavalette law. Ribeiro et al. ( 2010 ) build a model based on random walks for describing the scores of the soccer leagues. More recently and on the same line, Vernon-Carter et al. ( 2023 ) present soccer leagues as competitive complex systems where competitiveness can be measured through the scoring of individual soccer teams. We refer the interested reader also to Glickman and Stern ( 2005 ), Mendes et al. ( 2007 ), Thakkar and Shah ( 2021 ), Ficcadenti et al. ( 2023 ), Cefis and Carpita ( 2024 ). The methodological proposal is tested on the paradigmatic case of all the seasons of the Italian championship, Serie A.

Our starting point is the disappointing evidence that the current rule for Serie A admits special circumstances in which the championship is mathematically assigned to a specific team before all the matches of the season have been played. That happens because of simple arithmetic consequences of the rule set. Indeed, the rigidity of a score based on the win-tie-lose trichotomy makes recoveries impossible when the distance in scores is large enough. This often implies a deterioration in the quality of the matches played toward the end of the season, when the team that is mathematically the winner of the championship starts losing matches against low-level teams. An example can be taken from the 2018–2019 championship. Juventus won the Serie A title with five games to spare. They clinched the title on April 20, 2019, after a 2–1 victory over Fiorentina, which put them 20 points clear of their closest challengers at the time, Napoli, with only 15 points left to play for. This significant points gap made it mathematically impossible for any other team to catch up with Juventus in the season’s remaining fixtures.

Following this victory, the matches that Juventus played for the rest of the season lacked the same competitive edge, at least from their perspective, as the title was already secured. They tied against Internazionale, Torino and Atalanta and lost against Roma first and Sampdoria later. This situation illustrates the potential downside of having a team win the league so early: the intensity and competitive nature of their remaining matches can diminish, potentially affecting the overall quality of the league’s competition towards the end of the season. While Juventus continued to compete professionally, the urgency and high stakes associated with their matches were notably reduced, aligning with the concerns expressed about the impact of early championship wins on the quality of the game.

We hypothesize a novel scoring rule for which scored and conceded goals play a relevant role in determining the final ranking of the considered championship season. In so doing, we are not far from Cerqueti et al. ( 2022 ), where there is an application on football data to rank teams according to their goals. The following approach is of data science-computational type. We consider the sample of all the Italian Serie A championship seasons, from 1929–1930 to 2022–2023, with specific reference to the official final rankings of the teams. We then implement a four-step procedure based on a combination of Kendall \(\tau\) and radar charts. The procedure leads to what we call New Rankings of the seasons, which are grounded on a different way to assign the scores to the teams also including the goals—hence, removing or reducing the cases of mathematical certainty of being the winner well-before the end of the championship. The approach is quite close to Gorgi et al. ( 2023 ), where the authors advance a pair of comparison methods for reconstructing the rankings of the football championships in the presence of forced interruption. This is particularly relevant in that there has been evidence of different cases of interrupted championship—the most recent one being linked to the COVID-19 pandemic. However, the quoted paper uses Kendall \(\tau\) only to test the proposed framework’s validity. Differently, we here base the analysis on such a statistical correlation measure along with a radar chart-based evaluation of the multidimensional performance of specific entities—the seasons of the considered championship, in our case.

The paper is organized as follows. Section 2 describes the considered dataset. Section 3 presents the methodology used for dealing with the data, outlining the four-step procedure used to obtain the New Rankings. Section 4 collects the main results of the analysis, along with some related discussions. The last section offers some concluding remarks and traces lines of future research.

2 Data

This study utilises a comprehensive dataset encompassing the outcomes of football matches from the Italian Serie A championship, spanning from its inception, 1929–1930, to the present day, amounting to \(M=90\) seasons. For brevity, we refer to the seasons by mentioning only the related last year, so that e.g. 1929–1930 becomes 1930 for us.

The summary statistics of the dataset in Table A1 offer a glimpse into the volume and nature of the data analysed. These statistics encompass various metrics to understand football dynamics, including the number of goals scored by home and away teams (“Goals For” identified with GF and “Goals Against” with GA), points accumulated throughout the season, and the number of wins, draws, and losses. The dataset comprises records from the 1930 season of Serie A, in which each team played 34 matches, to the 2023 season, in which each team played 38 matches. For each match, the dataset records the date, the teams involved, the goals scored by each team, and the final result (win, draw, or loss), allowing for a detailed examination of team performances at the season level and the evolution of the championship over time.

In addition to the standard metrics, we have developed rankings based on GF and GA each season, identified by the variables \(GF_r\) and \(GA_r\) , respectively. These rankings provide an alternative perspective on team performance at the end of the season, emphasizing, for example, offensive and defensive capabilities beyond the traditional league standings.

In preparing the data for analysis, several preprocessing steps were undertaken to ensure the dataset was fit for purpose. These steps included verifying and correcting match outcomes, normalising team names to account for historical changes, and identifying and treating any missing or incomplete records. Finally, only official rankings (including penalties applied by authorities), GF and GA were employed to serve as the primary basis for our analysis. Developing rankings based on GF and GA for each season involved taking the total number of goals scored by and against each team, allowing us to derive \(GF_r\) and \(GA_r\) . This approach offers a nuanced view of team strategy and performance, fitting the objective of our study and complementing the official league standings with metrics highlighting each team’s offensive and defensive strengths.
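The derivation of \(GF_r\) and \(GA_r\) from season totals can be sketched as follows. This is a minimal illustration, not the authors' code: the team labels and goal totals are hypothetical, the helper name `competition_rank` is an assumption, and so is the tie rule (tied teams share the best available rank), since the text does not state how ties are ranked.

```python
def competition_rank(values, descending):
    """Rank values from 1 to n; tied values share the best (lowest) rank.

    The tie rule is an assumption: the source does not specify it.
    """
    ordered = sorted(values, reverse=descending)
    # list.index returns the first position of a value, so ties share a rank.
    return [ordered.index(v) + 1 for v in values]


# Hypothetical season totals (not taken from the Serie A dataset).
teams = ["A", "B", "C", "D"]
goals_for = [70, 55, 70, 40]
goals_against = [30, 45, 38, 45]

# Most goals scored -> rank 1; fewest goals conceded -> rank 1.
gf_r = competition_rank(goals_for, descending=True)
ga_r = competition_rank(goals_against, descending=False)

for team, gf, ga in zip(teams, gf_r, ga_r):
    print(team, gf, ga)
```

Ties appear naturally here (teams A and C share the top \(GF_r\) rank), which is why a tie-adjusted correlation is needed when these rankings are compared with the official one.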

3 Methodology

This section illustrates the four steps of the procedure for obtaining the New Rankings for the Italian Serie A championship seasons. First, we consider two unofficial rankings, with teams ordered on the basis of the scored goals or, simply, Goals For (in decreasing order, so that rank=1 is associated with the team with the highest number of scored goals in the championship season) and of the conceded goals or, simply, Goals Against (in increasing order, so that rank=1 is assigned to the team with the lowest number of conceded goals). Together with the official ranking, we thus have three rankings for each season on the same set of teams. Second, we compute the Kendall \(\tau\) for all the possible pairs of rankings, hence obtaining three values of the Kendall \(\tau\) for each season. Third, we build a radar chart for each season, whose axes are associated with the three Kendall values. Therefore, a triangle describes each championship season. We compute the area of the obtained triangles, and then we suitably normalise it so that the areas range from 0 (the case in which all Kendall \(\tau\) values equal \(-1\)) to 1 (the case in which all Kendall \(\tau\) values equal \(+1\)). The areas of the triangles represent the target (normalised) Kendall \(\tau\). Fourth, we detect the rankings whose Kendall \(\tau\) correlation with the official one matches the target. The obtained rankings are the New Rankings. As we will see, the New Rankings are often far from the official ones.

We enter the details.

3.1 Kendall \(\tau\) correlation analysis

The association between official team rankings and goal metrics (GF and GA) is measured through Kendall \(\tau\) correlation analysis. The \(\tau\) correlation coefficient measures the strength and direction of the association between two ranked variables. It is defined as:

\[ \tau = \frac{2}{n(n-1)} \sum_{i<j} \text{sign}(x_i - x_j)\,\text{sign}(y_i - y_j) \tag{1} \]

where n is the number of observations, \(x_i\) and \(x_j\) are the ranks of the i-th and j-th observations for the first variable, and \(y_i\) and \(y_j\) are the ranks of the i-th and j-th observations for the second variable. The sign function, \(\text{sign}(\cdot)\), returns \(-1\), 0, or 1 depending on the sign of its argument.

In our case, we use the \(\tau_b\) variant of Kendall (1945). This coefficient is a measure of association based on the ranks of the data and is adjusted for ties. We point out that ties occur when two or more items have the same rank. It is defined by the formula:

\[ \tau_b = \frac{P - Q}{\sqrt{(P + Q + T)(P + Q + U)}} \tag{2} \]

where P and Q are the numbers of concordant and discordant pairs, respectively, and T and U are the numbers of pairs tied only in x and only in y, respectively. If a pair is tied in both x and y, it is counted in neither T nor U.

The \(\tau_b\) coefficient accounts for ties by adjusting the denominator to reflect the number of tied ranks, which can affect the distribution of concordant and discordant pairs. This adjustment makes \(\tau_b\) a more accurate reflection of the association between two variables when ties are present in the data. In datasets where ties are common, as in our case, \(\tau_b\) offers a more reliable correlation estimate than the standard Kendall \(\tau\) coefficient, which does not adjust for ties.
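As a hedged illustration of how the tie adjustment works, the following pure-Python sketch counts concordant pairs (P), discordant pairs (Q) and pairs tied only in x (T) or only in y (U), and then forms the ratio with the tie-adjusted denominator. The function name `kendall_tau_b` and the example rankings are illustrative assumptions; the paper's own computation relies on `scipy.stats.kendalltau`.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall tau-b from pair counts: (P - Q) / sqrt((P+Q+T) * (P+Q+U)).

    Not handled: the degenerate case where one ranking is constant,
    which makes the denominator zero.
    """
    P = Q = T = U = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            continue      # tied in both rankings: counted in neither T nor U
        if dx == 0:
            T += 1        # tied only in x
        elif dy == 0:
            U += 1        # tied only in y
        elif dx * dy > 0:
            P += 1        # concordant pair
        else:
            Q += 1        # discordant pair
    return (P - Q) / sqrt((P + Q + T) * (P + Q + U))


# Hypothetical example: a tie-free official ranking against a goals-for
# ranking in which two teams share rank 1.
official = [1, 2, 3, 4]
gf_r = [1, 3, 1, 4]
tau_b = kendall_tau_b(official, gf_r)
print(round(tau_b, 4))
```

In this example P = 4, Q = 1, T = 0 and U = 1, so the coefficient is 3/√30. Without the tie term U in the denominator the value would be 3/5, which shows how the adjustment changes the estimate when ties are present.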

This coefficient is computed for each season to analyse the relationships between the official rankings ( Rank ) and the rankings based on goals for ( \(GF_r\) ) and goals against ( \(GA_r\) ), as well as the relationship between \(GF_r\) and \(GA_r\) themselves. It is worth recalling that in our data, the ties can be met only in \(GF_r\) and \(GA_r\) , as the official ranking is built on a set of rules that avoid the presence of ties. Table 1 reports a summary of the instances considered, and Fig.  1 shows correlations over time.

The Kendall correlation coefficients are computed in Python with the scipy.stats.kendalltau function. As will become clear when we introduce the radar charts, a normalisation is needed before the correlations can be used to form the areas. The normalisation maps the correlation values to a [0, 1] scale centred at 0.5, allowing for a consistent geometric interpretation across different years. The normalisation formula applied to each correlation coefficient \(\tau_b\) is defined as follows:

\[ \tau_{b;N} = \frac{\tau_b + 1}{2} \tag{3} \]

Here, \(\tau_{b;N}\) represents the normalised correlation coefficient. By adding 1 to the original correlation coefficient \(\tau_b\), the new range starts from 0 (previously \(-1\)) and ends at 2 (previously 1). Dividing this result by 2 adjusts the scale to range from 0 to 1. If the quantity \(\tau_{b;N}\) in (3) has a value of 0.5, there is no correlation; values closer to 1 indicate a strong positive correlation, while values closer to 0 suggest a strong negative correlation.

To facilitate the understanding of the steps, we report in Table 2 a snapshot of the 1939 and 2023 cases, being chosen as explicative instances; in Table 3 one can see the rank correlations and their respective normalisation.

Fig. 1 The different correlation analyses reported over the seasons. We use the non-normalised version of the Kendall tau in formula (2). Rank is the official ranking, \(GA_r\) is the ranking when “Goals Against” is considered, and \(GF_r\) indicates the ranking when “Goals For” is considered

Fig. 2 The vertices are suitably labelled, and the area of the triangle is shaded to visually represent the correlations’ magnitude. The radial grid lines are set at intervals of 0.1 to indicate the scale of the normalised correlation (according to Formula 3), ranging from negative correlation (zero in the graph) to perfect positive correlation (one in the graph). The shaded areas, calculated from the triangles formed by these correlations, provide a quantitative measure of the combined correlation strength among attributes for each year (1939 and 2023 in the graph)

Fig. 3 The areas calculated with Eq. (7). Each point represents a season

3.2 Mapping correlations into radar charts

In the analysis, the normalised correlation coefficients for each year are represented geometrically, forming triangles that encapsulate the relationship among different performance metrics. The triangles are constructed as radar charts, by plotting points on axes that extend from a central point, with each axis representing one of the analysis types. The process is described as follows:

For each year under analysis, the normalised correlation coefficients ( \(\tau _{b;N}\) ) for the selected metrics are retrieved. These coefficients range from 0 to 1, and are centred at 0.5.

Angles for the vertices of the triangles are calculated so that the analysed pairs of metrics are evenly distributed around a centre. This is achieved using the formula:

\[ \theta_h = \frac{2\pi h}{n} \tag{4} \]

where \(\theta_h\) represents the angle for the h-th vertex, n is the total number of analysis types, and h ranges from 0 to \(n-1\). In our context, the three angles in radians are approximately 0, 2.09 and 4.19.

Each correlation coefficient is then used as a point on its respective axis, determined by the corresponding angle \(\theta _h\) .

The points are connected in sequence, forming a closed shape that, in the context of this analysis, is a triangle; one gets three vertices thanks to the three types of analyses considered. See Fig. 2, where the examples of the 1939 and 2023 cases are reported on the basis of the data presented in Table 3.

3.3 Area calculation from the resulting triangles and correlation target

The area of each triangle described above is calculated to quantify the combined strength of the correlations. Given the vertices positioned at angles \(\theta_1\), \(\theta_2\), and \(\theta_3\) with their respective normalised correlation coefficients, the Cartesian coordinates for each vertex are determined by:

\[ x_h = \tau^{(h)}_{b;N} \cos(\theta_h) + \text{shift}_x \tag{5} \]

\[ y_h = \tau^{(h)}_{b;N} \sin(\theta_h) + \text{shift}_y \tag{6} \]

where \(\tau^{(h)}_{b;N}\) is the normalised correlation coefficient for vertex h, and \(\text{shift}_x = \text{shift}_y = 10\) are used to ensure all points are positioned in the positive quadrant to simplify the area calculation.

The area of the triangle formed by the three points representing the considered normalised correlations is calculated to provide a geometric representation of these correlations. Given the vertex coordinates \((x_1, y_1)\), \((x_2, y_2)\), and \((x_3, y_3)\), the area (A) of the triangle is given by:

\[ A = \frac{1}{2}\left| x_1(y_2 - y_3) + x_2(y_3 - y_1) + x_3(y_1 - y_2) \right| \tag{7} \]

The procedure of computing A is applied to each of the \(M=90\) seasons considered in our dataset. An example of the calculations can be found in Table 4 for the cases 1930 and 2023.
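The chain from correlations to area (normalisation, vertex placement, triangle area) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the three correlation values are hypothetical, the shift is taken to be additive, and the function names are not from the paper. The final division by the maximal area 3√3/4 is only one possible way to obtain areas in [0, 1]; the source says the areas are "suitably" normalised without giving the rule.

```python
from math import cos, pi, sin, sqrt


def radar_vertices(radii, shift=10.0):
    """Place each radius on its own axis at angle 2*pi*h/n, then shift."""
    n = len(radii)
    return [
        (r * cos(2 * pi * h / n) + shift, r * sin(2 * pi * h / n) + shift)
        for h, r in enumerate(radii)
    ]


def triangle_area(p1, p2, p3):
    """Area of a triangle from its vertex coordinates (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2


# Hypothetical tau_b values for the three pairs of rankings of one season.
taus = [0.6, 0.2, -0.1]
normalised = [(t + 1) / 2 for t in taus]          # maps [-1, 1] to [0, 1]

area = triangle_area(*radar_vertices(normalised))
area_unshifted = triangle_area(*radar_vertices(normalised, shift=0.0))

# Assumption: rescale by the largest possible area (all radii equal to 1).
max_area = triangle_area(*radar_vertices([1.0, 1.0, 1.0]))
area_rescaled = area / max_area

print(round(area, 4), round(area_rescaled, 4))
```

The shift does not change the area, since a translation moves all three vertices by the same amount; it only keeps the coordinates positive. With three axes 120 degrees apart, the area also equals (√3/4)(r₁r₂ + r₂r₃ + r₃r₁), which gives a quick check on the result.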

A time series version of the calculated areas can be found in Fig.  3 , where there is a clear view of the time-evolution of the considered areas.

We then hypothesise that the area of the triangles represents the (normalised) Kendall correlation targets of the New Rankings with the official ranks. So, considering goals for and against would give a ranking of the teams whose correlation with the official one is the area of the triangle of the related radar chart.

3.4 Finding the new rankings

This section contains the computational strategy devised to discover alternative ranking systems that may more accurately reflect football championships’ dynamics and a performance-centric nature in the final rankings. The goal is to pinpoint permutations of team rankings yielding Kendall \(\tau _b\) correlations that match the geometric areas previously calculated with Eq. ( 7 ), suggesting a ranking better-representing team performance throughout the seasons.

3.4.1 Generating permutations and calculating Kendall’s tau correlations

For a given number of teams, \(n\), in the football championship (indicated in Table A1 as “N. Teams”), we systematically generate permutations to simulate the possible positions of the teams in the final ranking, thereby simulating various possible season outcomes. Owing to computational constraints and the factorial growth (\(n!\)) of the number of permutations with \(n\), our exploration is confined to a subset of permutations. In this way, we can still show to what extent the official rankings are affected by the only partial account of “Goals For” and “Goals Against”. Specifically, we run the first 362,880 permutations for the 2023 case. This threshold is based on system capabilities and the aim of covering a broad spectrum of potential rankings.

For each permutation, we compute Kendall \(\tau_b\) with respect to the original ranking sequence using Eq. (2). As already noted, Kendall’s tau measures the ordinal association between two quantities.

One should use the original sequence of numbers between 1 and n for each season against the permuted sequence of the same set of numbers. The permuted series should be ordered according to the lexicographic criterion to simplify the process. This computation yields a distribution of \(\tau _b\) values formed by the correlations associated with all the possible permutations. The target \(\tau _b\) associated with a given permutation illustrates the degree of correlation of such a permutation with the original team ranking. Such a permutation of the original ranking can be viewed as the outcome of a championship where the teams are ranked according to the permuted ranking. To illustrate this statement, we refer to Fig.  4 , where 362,880 permutations are evaluated with different values of n . This exemplifies the idea of having championships of n teams whose rankings are shuffled and compared to the original one, which is assumed to be \((1,\dots , n)\) .
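The enumeration described above can be sketched with `itertools.permutations`, which yields tuples in lexicographic order when its input is sorted. The helper names and the small sizes (n = 6, first 3! permutations) are illustrative assumptions chosen to keep the example fast; since a permutation of \((1,\dots,n)\) has no ties, \(\tau_b\) reduces to one minus twice the share of discordant pairs.

```python
from itertools import combinations, islice, permutations


def tau_vs_identity(perm):
    """Kendall tau between (1, ..., n) and a tie-free permutation of it."""
    n = len(perm)
    discordant = sum(
        1 for i, j in combinations(range(n), 2) if perm[i] > perm[j]
    )
    total_pairs = n * (n - 1) // 2
    return 1 - 2 * discordant / total_pairs


def first_permutation_taus(n, limit):
    """(permutation, tau) for the first `limit` lexicographic permutations."""
    perms = islice(permutations(range(1, n + 1)), limit)
    return [(p, tau_vs_identity(p)) for p in perms]


results = first_permutation_taus(n=6, limit=6)    # 6 = 3!
for perm, tau in results:
    print(perm, round(tau, 3))
```

One consequence of the lexicographic order is worth keeping in mind when reading Fig. 4: the first k! permutations only rearrange the last k positions, so the leading n − k entries stay fixed. In the example all six permutations start with (1, 2, 3). By the same reasoning, if the permutations are generated this way, the first 9! = 362,880 permutations of a 20-team ranking leave the first 11 positions unchanged.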

Fig. 4 Variations in Kendall’s \(\tau^{(j)}_b\) correlation with permutation index (j) for different sample sizes (n). This figure illustrates how the correlation coefficients change as a function of the permutation index, when permutations are in lexicographic order, across various sample sizes. Each subplot represents a different value of n, with red dashed lines marking factorial milestones to highlight significant permutations. Annotations indicate the factorial values of the first integers (\(362{,}880=9!\)), providing insights into correlation trends and permutation complexity as n increases

3.4.2 Identifying the optimal permutations

At the heart of our analysis lies the quest for permutations that realise a Kendall correlation aligned with the target \(\tau_b\) values stemming from the geometric correlation analysis. For each season, we may have different \(n\in \{16,18,20,21\}\), as can be grasped from Table A1. The procedure is as follows:

We extract the target \(\tau_b\) value for the year from the preceding geometric analysis, as the area of the triangle in the radar chart. Each area A is transformed into a target \(\tau_b\).

We calculate the absolute difference between the target \(\tau_b\) and each \(\tau^{(j)}_b\) obtained by comparing the j-th permutation with the original ranking, where j is the index of the permutation.

We isolate permutations whose \(\tau^{(j)}_b\) values are nearest to the target \(\tau_b\), implementing a tolerance threshold to facilitate a meaningful comparison. This tolerance is the interval \(\tau^{(1)}_b - \tau^{(2)}_b\) between the values of the first two lexicographically ordered permutations, rounded to the third decimal digit; in this way, the tolerance depends on the number of teams n and hence on the \(n!\) possible permutations.
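The three steps above can be sketched as a small search routine. Several details are assumptions made for illustration rather than the authors' implementation: the area is mapped to a target through \(\tau_b = 2A - 1\) (the inverse of the normalisation in Formula 3), a permutation is kept when its absolute distance from the target is within the tolerance, and the example uses n = 5 with a hypothetical area so that all 120 permutations can be enumerated.

```python
from itertools import combinations, islice, permutations


def count_inversions(perm):
    """Number of discordant pairs between perm and the sorted order."""
    return sum(1 for a, b in combinations(perm, 2) if a > b)


def tau_vs_identity(perm):
    """Kendall tau of a tie-free permutation against (1, ..., n)."""
    n = len(perm)
    return 1 - 4 * count_inversions(perm) / (n * (n - 1))


def permutations_near_target(n, area, limit=None):
    """Permutations whose tau lies within the tolerance of the target."""
    target_tau = 2 * area - 1                  # assumption: inverse of Formula 3
    # tau(1) - tau(2): identity versus one swap of neighbours, rounded.
    tolerance = round(4 / (n * (n - 1)), 3)
    candidates = islice(permutations(range(1, n + 1)), limit)
    matches = [
        p for p in candidates
        if abs(tau_vs_identity(p) - target_tau) <= tolerance
    ]
    return target_tau, tolerance, matches


# Hypothetical area for a five-team season; limit=None enumerates all 5!.
target_tau, tolerance, matches = permutations_near_target(n=5, area=0.83)
print(round(target_tau, 2), tolerance, len(matches))
```

With these inputs the target is 0.66 and the tolerance is 0.2, so the routine keeps the rankings with one or two inversions. Passing a `limit` reproduces the truncated search described in the text, at the cost of only seeing permutations that rearrange the tail of the ranking.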

This methodology empowers us to single out some permutations (i.e., hypothetical rankings accounting for GA and GF) that most accurately conform to the theoretical ideals elucidated by our prior analysis. These optimal permutations shed light on alternative ranking methodologies that more faithfully mirror team performances and the competitive dynamics across the football season.

3.4.3 Implementation and challenges

The implementation was conducted using Python, with the assistance of libraries such as itertools for permutation generation, scipy for statistical computations, and pandas for data handling. This computational framework facilitated a thorough exploration of ranking permutations against predefined criteria, offering a fresh perspective on the assessment of football championships.

A significant challenge in this process is identifying all the permutations whose Kendall \(\tau_b\) matches the target correlations derived from the areas of the triangles, which makes the problem computationally demanding. In fact, to find the best permutations in the case of \(n=21\), one potentially has to explore 21! possibilities, which is prohibitively expensive.

4 Results and discussion

This section elaborates on the outcomes derived from our methodological framework and discusses their implications within the realm of football analytics, specifically in the context of Serie A. The analysis provides insightful revelations about team performance dynamics and proposes a novel perspective on ranking methodologies.

Our geometric representation of team performances allows a different view of the Serie A history and is illustrated in Fig.  3 . It unveils a trend where the last ten seasons exhibit a distinguishable pattern. This observation suggests a shift towards more balanced team strategies, aiming to optimise both offensive and defensive plays (see Okada and Takagi 2008 , on the impact of various strategies on GA and GF). The gap between the calculated areas and the ideal scenario (Area = 1) quantifies the extent to which the official rankings might overlook the intricate balance between goals scored (GF) and goals conceded (GA) when accounting for more granular teams’ performance in forming the final ranking.

The transformation of these areas into target Kendall \(\tau _{b}\) coefficients provides a foundation for empirical analysis. As depicted in Fig.  5 , the majority of seasons align positively with the official rankings, indicating a generally robust system but not completely accounting for GA and GF. Anomalies identified in the negative range (1943, 1956, 1957) call for a closer examination of those particular seasons and potentially underline the need for a refined ranking mechanism that better captures team performance nuances.

Our exploration into the New Rankings, facilitated by the examination of permutations and Kendall \(\tau _b\) correlations, highlights the potential for alternative standings that deviate significantly from the official rankings. As shown in Fig.  6 for 2023, incorporating goals scored and conceded into the rankings can result in substantial shifts in team positions. This variability underscores the impact of evaluating team performances beyond mere wins, draws, and losses, advocating for a more granular approach to ranking that acknowledges the multifaceted nature of football competitions. The case presented in Fig.  6 is already meaningful even though the number of permutations tested stops at 362,880; to complete the exercise, one should have gone to 20!, as 20 teams were competing. Another way to observe the impact of targeting a level of Kendall \(\tau _b\) that captures a more comprehensive set of features in the ranking is provided by Fig.  7 . This figure contains the Kendall \(\tau _b\) in the cases of inversion of the first f elements as well as of the last l elements of the rankings. We take n ranging from 10 to 30 to include the cases of interest for the analysed championships. For example, if \(f=5\) and \(n=10\) , we have the “reverted” ranking (5, 4, 3, 2, 1, 6, 7, 8, 9, 10) and when \(l=5\) and \(n=10\) , we have (1, 2, 3, 4, 5, 10, 9, 8, 7, 6). One can notice that in the case of \(n=16\) teams playing in a championship, the inversion of the ranking for the last \(l=4\) teams means pursuing a target Kendall \(\tau _b\) of 0.9, indicating that with very small variations in the ranking system, teams may or may not face relegation. This illustrates well the impact of such inversions when GF and GA are taken into full consideration.
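The two inversion examples above can be checked with a short computation. The following is a minimal sketch, assuming the standard tie-corrected definition of Kendall \(\tau _b\); the helper names `invert_first` and `invert_last` are illustrative and do not come from the paper.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall tau_b between two equally long rank sequences (ties allowed)."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            ties_x += 1
        if dy == 0:
            ties_y += 1
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / sqrt((n_pairs - ties_x) * (n_pairs - ties_y))


def invert_first(n, f):
    """Ranking 1..n with the order of the first f positions reversed."""
    return list(range(f, 0, -1)) + list(range(f + 1, n + 1))


def invert_last(n, l):
    """Ranking 1..n with the order of the last l positions reversed."""
    return list(range(1, n - l + 1)) + list(range(n, n - l, -1))


# "f. 2 inv." with n = 10: one discordant pair out of 45, i.e. 43/45
print(round(kendall_tau_b(list(range(1, 11)), invert_first(10, 2)), 2))  # 0.96
# last 4 of 16 inverted: six discordant pairs out of 120, i.e. 108/120
print(round(kendall_tau_b(list(range(1, 17)), invert_last(16, 4)), 2))   # 0.9
```

Without ties, \(\tau _b\) reduces to one minus twice the share of discordant pairs, which is why reversing only a few positions keeps the correlation close to 1.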

By arranging team permutations in lexicographical order, we systematically explore variations from the initial ranking, incrementally adjusting team positions. In Fig.  4 , the permutation index j is plotted along the x -axis, and the corresponding Kendall’s tau correlation coefficient, \(\tau ^{(j)}_b\) , is plotted along the y -axis. This arrangement reveals a pattern of regular fluctuations in \(\tau ^{(j)}_b\) values, manifesting as seasonal cycles across the permutation index. We argue that the number of solutions depends on the target correlation. The extreme cases of \(\tau _b=-1\) and \(\tau _b=1\) are each associated with a single solution to the problem: for \(\tau _b=-1\) it is the complete reversal of the ranking, while for \(\tau _b=1\) it is the original series itself. More specifically, \(\tau _b=1\) is the perfect agreement with the official ranking, representing a scenario where the permutation does not alter the original team order, so that the equivalence class contains only the official ranking itself. As one moves away from these two corner cases, the number of solutions that lead to the target increases as the target approaches \(\tau _b=0\) , by the definition of Kendall correlation.

The cycles presented in Fig.  4 reflect the comprehensive range of permutations explored, with the length of each cycle corresponding to the total number of teams involved; for instance, with 18 teams, the permutation space, and hence the cycle length, expands to 18!. Due to computational constraints, the analysis presented in the figure samples only a fraction of the total permutation space.
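A small case shows what sampling only the beginning of a lexicographically ordered permutation space implies. The sketch below (the helper name `positions_changed` is hypothetical) enumerates the first k! permutations of a sorted ranking and reports which positions ever differ from the starting order. Since 362,880 = 9!, the same reasoning suggests that a sample of that size over 20 teams, started from the sorted ranking, rearranges only the last nine positions.

```python
from itertools import islice, permutations
from math import factorial


def positions_changed(n, n_tested):
    """Positions (1-based) that differ from 1..n in the first n_tested
    permutations; itertools.permutations yields lexicographic order
    when its input is sorted."""
    identity = tuple(range(1, n + 1))
    changed = set()
    for perm in islice(permutations(identity), n_tested):
        changed.update(
            pos + 1 for pos, (a, b) in enumerate(zip(identity, perm)) if a != b
        )
    return sorted(changed)


# With 6 teams, the first 3! = 6 permutations only touch the last 3 positions.
print(positions_changed(6, factorial(3)))  # [4, 5, 6]
print(factorial(9))                        # 362880
```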

This cyclical pattern underscores that multiple permutations yield identical Kendall \(\tau _b\) correlations, indicating that multiple rankings could feasibly represent the data with equivalent statistical validity. Thus, as already said above, a specific \(\tau _b\) value may correspond to a set of rankings rather than a single, unique order. This set forms an equivalence class, each member sharing the same Kendall \(\tau _b\) correlation with the official ranking. The diversity within these classes illustrates the potential for alternative interpretations of team performance and rankings.
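The size of these equivalence classes can be illustrated by brute force for a small number of teams. The sketch below is only an illustration under the assumption of rankings without ties, where \(\tau _b = 1 - 2D/\binom{n}{2}\) and D is the number of discordant pairs; the function names are illustrative.

```python
from collections import Counter
from itertools import combinations, permutations


def discordant_pairs(perm):
    """Number of pairs that appear in the opposite order to 1..n."""
    return sum(1 for a, b in combinations(perm, 2) if a > b)


def tau_class_sizes(n):
    """List of (tau_b, number of rankings with that tau_b) for n teams."""
    n_pairs = n * (n - 1) // 2
    counts = Counter(
        discordant_pairs(p) for p in permutations(range(1, n + 1))
    )
    return [(1 - 2 * d / n_pairs, size) for d, size in sorted(counts.items())]


for tau, size in tau_class_sizes(4):
    print(f"tau_b = {tau:+.3f}: {size} ranking(s)")
```

For four teams the class sizes are 1, 3, 5, 6, 5, 3, 1: a single ranking at each extreme and the largest class at \(\tau _b=0\).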

figure 5

Histogram of the resulting \(\tau _b\) obtained from mapping the areas back to the [− 1,1] correlation range along the Italian Serie A history, 1930–2023

figure 6

In this box plot, the results for the 2023 season are reported. The target correlation is \(\tau _b = 0.842962\) and the optimal correlation is \(\tau ^j_b = 0.842105\) , with j taking suitable values in the set \(\{0,\dots ,362880\}\) , where 362,880 is the number of permutations tested in this case, with the generated permutations stored in lexicographic order. The whiskers of each box represent the minimum and maximum rank obtained, respectively, across the permutations considered to meet the target correlation. The vertical line that splits the box in two is the median of the ranks obtained for that position ( y -axis), and the left and right sides of the box are the 25th and 75th percentiles of the ranks assigned to that ranking position. When there is a single bar, as for cases 1, \(\dots\) ,11, no change has been recorded. It is worth recalling that only 362,880 permutations are considered here, out of the 20! that should have been explored to complete the plot

figure 7

The x-axis ticks indicate that the Kendall tau correlation reported in the cells is calculated by comparing the original series 1,2,..., n (y-axis) with the series where the first (‘f.’) or the last (‘l.’) k elements have been permuted, inverting their order. For example, when \(n = 10\) , for “f. 2 inv.”, the value 0.96 is obtained by applying Formula (2) to the series (1,2,3,4,5,6,7,8,9,10) and (2,1,3,4,5,6,7,8,9,10)

5 Concluding remarks

This study embarked on a novel exploration of Serie A football championship rankings by introducing a comprehensive methodology that integrates geometric analysis with Kendall’s \(\tau _b\) correlation coefficients. Through this approach, we scrutinised the alignment between official team rankings and those derived from goals scored and conceded, unveiling the potential for alternative rankings that might offer a deeper insight into team performance dynamics.

Our findings reveal that while the performance metrics-based rankings (here being Goals For and Against, GF and GA) broadly align with the official ranking system, there are distinct seasons where alternative ranking methodologies could provide a more nuanced understanding of team capabilities. Introducing a geometric representation to visualise the relationship between different ranking metrics not only enriches our analytical toolkit but also highlights the multifaceted nature of football competitions, where outcomes are influenced by a complex interplay of offensive and defensive strategies mirrored in the GA and GF-based rankings, here indicated with \(GA_r\) and \(GF_r\) respectively.

Furthermore, identifying equivalence classes among rankings underscores the notion that multiple valid perspectives can exist regarding team performance, challenging the singular narrative often presented by official standings. This observation invites a broader discussion on the criteria and metrics used to assess and compare teams, suggesting that there is room for innovation in ranking methodologies that more accurately reflect the competitive landscape of Serie A football.

5.1 Implications for stakeholders

For stakeholders in the football community, ranging from team management and coaches to analysts and fans, our study offers a fresh lens through which to evaluate team performance. By considering alternative rankings, stakeholders can better understand a team’s strengths and weaknesses, guiding strategic decisions from player development to game tactics. This is not meant to question the official outcome of a season, but rather to recognise different teams for their outstanding defensive or aggressive tactics, as manifested in GA or GF, also addressing some challenges described in Sziklai et al. ( 2022 ).

5.2 Limitations and future research

While our study contributes valuable insights into football ranking systems, it is not without limitations. The computational complexity of analysing all possible permutations of team rankings poses a challenge, necessitating further methodological innovations to explore the full permutation space efficiently. In particular, applying suitably defined heuristics to reduce the cardinality of the set of permutations as n grows is a possible way to make the problem tractable. This opens the gate to more operational research-oriented studies, which are beyond the scope of the present paper.

Additionally, our focus on Serie A limits the generalizability of our findings. Future research could extend this methodology to other leagues and sports, examining the universality of our observations across different competitive contexts and combining it with existing methods such as Sum of Ranking Differences, presented in Sziklai and Héberger ( 2020 ).

Moreover, incorporating other performance metrics (in fact, the radar charts can have more than three axes), such as aggregated player statistics or situational variables (e.g., weather conditions during matches), could enhance the robustness and relevance of alternative ranking systems.

5.3 Final thoughts

In conclusion, our study highlights the potential for alternative perspectives in evaluating football team performances, inviting a reconsideration of conventional ranking systems. As we continue to navigate the rich and evolving landscape of sports analytics, the pursuit of more sophisticated and representative methodologies for assessing team success remains a compelling and worthwhile endeavour.

A simple count of the years between 1929 and 2023 (the last completed season) would not work because of the Championship suspensions that occurred during the Second World War and the COVID-19 pandemic.

We use the Dense Ranking Method, namely elements with the same score receive the same rank.

Albert J, Koning RH (eds) (2007) Statistical thinking in sports. CRC Press

Ausloos M, Cloots R, Gadomski A, Vitanov NK (2014) Ranking structures and rank-rank correlations of countries: the FIFA and UEFA cases. Int J Mod Phys C 25(11):1450060

Ausloos M (2014) Intrinsic classes in the Union of European Football Associations soccer team ranking. Cent Eur J Phys 12:773–779

Ausloos M (2024) Hierarchy selection: new team ranking indicators for cyclist multi-stage races. Eur J Oper Res 314(2):807–816

Baker R, Scarf P (2006) Predicting the outcomes of annual sporting contests. J R Stat Soc: Ser C: Appl Stat 55(2):225–239

Boys RJ, Philipson PM (2019) On the ranking of test match batsmen. J R Stat Soc: Ser C: Appl Stat 68(1):161–179

Cefis M, Carpita M (2024) The higher-order PLS-SEM confirmatory approach for composite indicators of football performance quality. Comput Stat 39(1):93–116

Cerqueti R, D’Urso P, De Giovanni L, Mattera R, Vitale V (2022) INGARCH-based fuzzy clustering of count time series with a football application. Mach Learn Appl 10:100417

Ficcadenti V, Cerqueti R, Varde’i CH (2023) A rank-size approach to analyse soccer competitions and teams: the case of the Italian football league “Serie A”. Ann Oper Res 325(1):85–113

Gabel A, Redner S (2012) Random walk picture of basketball scoring. J Quant Anal Sports 8(1):6

Gorgi P, Koopman SJ, Lit R (2023) Estimation of final standings in football competitions with a premature ending: the case of COVID-19. AStA Adv Stat Anal 107(1):233–250

Glickman ME, Stern HS (2005) A state-space model for National Football League scores. In Anthology of statistics in sports. Society for Industrial and Applied Mathematics, pp 23–33

Heuer A, Mueller C, Rubner O (2010) Soccer: is scoring goals a predictable Poissonian process? Europhys Lett 89(3):38007

Higham DG, Hopkins WG, Pyne DB, Anson JM (2014) Performance indicators related to points scoring and winning in international rugby sevens. J Sports Sci Med 13(2):358

Kendall MG (1945) The treatment of ties in ranking problems. Biometrika 33(3):239–251

Mattera R (2023) Forecasting binary outcomes in soccer. Ann Oper Res 325(1):115–134

Mendes RS, Malacarne LC, Anteneodo C (2007) Statistics of football dynamics. Eur Phys J B 57:357–363

Merritt S, Clauset A (2014) Scoring dynamics across professional team sports: tempo, balance and predictability. EPJ Data Sci 3:1–21

Migliorati M, Manisera M, Zuccolotto P (2023) Integration of model-based recursive partitioning with bias reduction estimation: a case study assessing the impact of Oliver’s four factors on the probability of winning a basketball game. AStA Adv Stat Anal 107(1):271–293

Okada H, Takagi T (2008) Evaluation of multi-objective genetic algorithm for RoboCupSoccer team evolution. In SICE annual conference, Chofu, Japan 2008, pp 151–154. https://doi.org/10.1109/SICE.2008.4654639

Reep C, Benjamin B (1968) Skill and chance in association football. J R Stat Soc Ser A 131(4):581–585

Ribeiro HV, Mendes RS, Malacarne LC, Picoli S, Santoro PA (2010) Dynamics of tournaments: the soccer case: a random walk approach modeling soccer leagues. Eur Phys J B 75:327–334

Sandri M, Zuccolotto P, Manisera M (2020) Markov switching modelling of shooting performance variability and teammate interactions in basketball. J R Stat Soc Ser C Appl Stat 69(5):1337–1356

Strauss D, Arnold BC (1987) The rating of players in racquetball tournaments. J R Stat Soc Ser C Appl Stat 36(2):163–173

Sziklai BR, Biró P, Csató L (2022) The efficacy of tournament designs. Comput Oper Res 144:105821

Sziklai BR, Héberger K (2020) Apportionment and districting by sum of ranking differences. PLoS ONE 15(3):e0229209

Thakkar P, Shah M (2021) An assessment of football through the lens of data science. Ann Data Sci 8:823–836

Vernon-Carter EJ, Ochoa-Tapia JA, Alvarez-Ramirez J (2023) Singular value decomposition entropy of the standing matrix for quantifying competitiveness of soccer leagues. Phys A 625:129007

Volf P (2009) A random point process model for the score in sport matches. IMA J Manag Math 20(2):121–131

Open access funding provided by Università degli Studi di Roma La Sapienza within the CRUI-CARE Agreement.

Author information

Authors and affiliations.

Department of Social and Economic Sciences, Sapienza University of Rome, Rome, Italy

Roy Cerqueti & Raffaele Mattera

GRANEM, University of Angers, Angers, France

Roy Cerqueti

School of Business, London South Bank University, London, UK

Valerio Ficcadenti

Corresponding author

Correspondence to Raffaele Mattera .

Additional information

See Table A1 .

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cerqueti, R., Mattera, R. & Ficcadenti, V. Kendall correlations and radar charts to include goals for and goals against in soccer rankings. Comput Stat (2024). https://doi.org/10.1007/s00180-024-01542-w

Received : 28 March 2024

Accepted : 12 August 2024

Published : 17 September 2024

DOI : https://doi.org/10.1007/s00180-024-01542-w

  • Sport statistics
  • Radar charts

  • Open access
  • Published: 20 September 2024

Research and application on modeling and landing point prediction technology for water jet trajectory of fire trucks under large-scale scenarios

  • Qing Fan 1 ,
  • Qianwang Deng 1 &
  • Qin Liu 2  

Scientific Reports volume 14, Article number: 21950 (2024)

  • Computer science
  • Mechanical engineering

To make water jets land on the fire, the pitch angle of the fire monitor has to be adjusted manually over successive rounds, which seriously reduces firefighting efficiency. To improve efficiency, this paper proposes a technology for water jet trajectory modeling and landing point prediction that supports automatic extinguishing. First, trajectories are analyzed with the fragmentation and atomization of water jets taken into account, so that the predicted trajectory is closer to the real situation. Second, a compensation method for the prediction is proposed to further reduce the deviation between the predicted and the actual landing point, taking into account the combined effects of high altitude, initial jet velocity, and wind. On this basis, since the target initial pitch angle of the jets is difficult to solve analytically, a search method is also proposed, which greatly improves the solving efficiency. Finally, in practical experiments and verification, the proposed model takes an average time of 0.00292 s, far less than other methods. The prediction error is reduced by at least 45.3%, and the average deviation is less than 2 m.

Introduction

The intelligent fire monitor, which aims and shoots automatically, is a significant appliance in the development of fire extinguishing systems. Whether the fire is extinguished depends on the aiming accuracy, hence this paper is concerned with the landing point of the jets.

After ejection and before hitting the ground or another target, the trajectory of water jets resembles a parabola. The exact trajectory, and hence the landing point, depends on the real-time forces acting on the water jets, which usually consist of monitor thrust, air resistance, gravity and other forces. Moreover, the initial force condition depends on the pitch angle of the monitor, which sets the balance between forces at the beginning. In other words, the jet range can be adjusted by adjusting the initial pitch angle of the monitor. Once the target jet range is decided, the corresponding initial pitch angle has to be obtained and applied. As a result, an intelligent fire water monitor can be achieved by adjusting the pitch angle so that water jets land where the fire is located.

Trajectory modeling and landing point prediction are the basis of automated pitch angle adjustment. Whether a monitor can point to the fire is closely related to the prediction precision, which influences extinguishing efficiency. Consequently, trajectory modeling and landing point prediction are an important subject in the development of intelligent extinguishing systems. Previous research falls mainly into 3 categories of solution: prediction using neural networks 1 , 2 , 3 , 4 , modeling using classical mechanics 5 , 6 , 7 , 8 , and modeling using fluid dynamics 9 , 10 .

Neural networks, known for their human-like hierarchical cognition, have been widely applied in recent years 1 . To predict the landing point, Hao 11 and other researchers 2 proposed trajectory models of water jets that are solved using BP neural networks combined with a genetic algorithm. However, when a neural network is trained for landing point prediction on a small and limited dataset, it is hard to exploit its advantages and difficult to obtain an accurate landing point. Moreover, the performance of the genetic algorithm depends on the chromosome coding, population size, number of iterations and other hyperparameters, which makes it difficult to balance high precision against a large, time-consuming computational cost. Thus, Lin et al. 3 first fitted a linear prediction model under ideal conditions and then updated the coefficients of the model with a BP neural network, which saves a lot of time but is hard to apply in other complex environments.

The ideal condition without air resistance assumed in classical mechanics barely exists in real situations, especially in emergency response systems. Therefore, Chen et al. 7 considered the influence of air resistance on the trajectory of water jets and assumed that the magnitude of air resistance was proportional to velocity. Min et al. 8 held that the air resistance was related to the initial pitch angle: a different pitch angle gives a different horizontal velocity and a different vertical velocity, and thus a different air resistance coefficient in the horizontal and in the vertical direction. However, Sun 9 argued that the fragmentation of water jets prevents such models from producing a precise trajectory even when air resistance is considered. He proposed a jet trajectory with the help of Fluent, a popular computational fluid dynamics (CFD) tool, considering jet fragmentation and the monitor’s internal structure, though the range error still reached around 10%.

The movement of water jets can also be explained by hydrodynamics. Sun 9 simulated it with Fluent, while Zhu 10 simulated it with a moving particle semi-implicit (MPS) method, in which the movement of each particle at the current moment is updated from the coupling relations between particles at the previous moment, so that the movement at any time is obtained step by step. Indeed, predicting the landing point with hydrodynamics brings high precision. Nevertheless, its computation takes a lot of time, making it extremely hard to meet the emergency requirement of quick response. As a result, it is normally applied in model validation instead of in real-time computation.

In summary, the precision and the time efficiency of jet landing point prediction in emergency firefighting still need to be improved. This paper proposes a jet trajectory model for large-scale scenarios, which updates the classical jet trajectory with respect to fragmentation and air resistance, and also proposes a compensation method for landing point prediction, which greatly improves the accuracy without noticeably increasing the computation time. This paper takes the magnitude of air resistance to be simply proportional to the velocity and, unlike the study of Min et al. 8 , keeps the air resistance coefficient the same in each direction. The main focus of this paper is the influence of fragmentation on the water trajectory, using a core stable jet coefficient to represent the change of trajectory after the jets crush, which performs well in the experiments. Finally, this paper also builds a system that predicts the initial pitch angle to help with emergency firefighting, as illustrated in Fig. 1 .

figure 1

Process of cannon controlling.

Water jet trajectory modeling

Water jet trajectory considering air resistance.

The movement of a water jet in the air can also be seen as a special projectile motion. Considering gravity and air resistance but neglecting fragmentation and other influences, the movement follows Newton’s Second Law.

Normally, there are 2 assumptions about the magnitude of air resistance: that it is directly proportional to velocity 12 , and that it is proportional to the square of velocity 8 , 9 .

Proportional to velocity

That the air resistance is proportional to velocity is represented by Eq. (1).

After force decomposition, the velocity \(v_{x,t}\) in the horizontal direction, denoted by x , at time t is given by Eq. (2).

When jetting upwards, the velocity \(v_{y,t}\) in the vertical direction, denoted by y , at time t is given by Eq. (3).

Thus, the displacements in x and y at time t are given by Eq. (4).

In Eqs. (1) to (4), m denotes the mass of the water jet, k denotes the coefficient of air resistance, \(\Delta t\) denotes the time unit, and \(g_{n}\) denotes the gravitational acceleration.

To let x directly correspond to t , Eq. (2) can be transformed to:

Assume a water jet of mass m is ejected at speed \(v_{out}\) at a pitch angle \(\theta\) ( \(\theta >0\) means ejecting upwards while \(\theta <0\) means ejecting downwards), so that \(x_{0}=0,y_{0}=H_{0},v_{0x}=v_{out}\cdot \cos {\theta },v_{0y}=v_{out}\cdot \sin {\theta }\) . Integrating Eq. (5) over time \(\left( 0,t\right)\) gives Eq. (6).

Similarly, y can also be solved, thus the trajectory can be represented as:

where \(H_0\) is the height where jets eject, briefly named as the initial ejecting height.
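The linear-drag model described above can be sketched numerically. The following is a minimal illustration, assuming the standard form in which each velocity component is decelerated by \((k/m)\,v\) and gravity \(g_n\) acts on the vertical component. The function names, the parameter `k_over_m` (the ratio k / m) and the numerical values are assumptions made for illustration, not values from the paper. The step-by-step update is compared with the well-known closed form \(x(t)=\frac{m}{k}v_{0x}\left( 1-e^{-kt/m}\right)\) for linear drag.

```python
import math


def simulate_linear_drag(v_out, theta_deg, h0, k_over_m, g=9.81, dt=1e-4):
    """Step the jet forward in time until it reaches the ground (y <= 0).

    Assumed model: a_x = -(k/m) * v_x, a_y = -g - (k/m) * v_y.
    Returns the horizontal range and the flight time.
    """
    theta = math.radians(theta_deg)
    vx = v_out * math.cos(theta)   # v_0x = v_out * cos(theta)
    vy = v_out * math.sin(theta)   # v_0y = v_out * sin(theta)
    x, y, t = 0.0, h0, 0.0         # x_0 = 0, y_0 = H_0
    while y > 0.0:
        vx += -k_over_m * vx * dt
        vy += (-g - k_over_m * vy) * dt
        x += vx * dt
        y += vy * dt
        t += dt
    return x, t


def closed_form_x(v_out, theta_deg, k_over_m, t):
    """Closed-form horizontal displacement under linear drag."""
    v0x = v_out * math.cos(math.radians(theta_deg))
    return v0x / k_over_m * (1.0 - math.exp(-k_over_m * t))


# Illustrative values only: 30 m/s at 30 degrees from 1 m height, k/m = 0.1 1/s.
x_num, t_land = simulate_linear_drag(30.0, 30.0, 1.0, 0.1)
print(x_num, closed_form_x(30.0, 30.0, 0.1, t_land))
```

The two printed values should agree closely, which is a simple check that the discrete update and the integrated form describe the same motion.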

Proportional to the square of velocity

That the air resistance magnitude is proportional to the square of velocity is defined by Eq. (8).

Similarly, the velocity \(v_{x,t}\) is defined as: \(v_{x,t} = v_{x,t-1} - \frac{k}{m} \cdot \Vert v_{t-1} \Vert \cdot \Delta t\)

If jetting upwards, the velocity \(v_{y,t}\) is defined as:

And if jetting downwards, \(v_{y,t}\) is defined as:

So, the velocity \(v_{t}\) is defined as:

and the displacement in x and y is defined as:

In Eqs. (8) to (12), m is the mass of the water jet, k denotes the coefficient of air resistance, \(\Delta t\) denotes the time unit, and \(g_{n}\) denotes the gravitational acceleration.

Though the integral of Eq. (12) cannot be solved analytically, similar to Eq. (7) it can be reasonably inferred that, for given m and k , x and y depend only on \(v_{out}\) and \(\theta\) .

However, as said before, the ideal condition that only considers air resistance and gravity barely exists in the real world. The classical jet trajectories defined in Eqs. (4) and (12) are still far from the real trajectory. One cause is that the process from jet ejection to landing consists of 3 stages: the stability stage, the decline stage and the crushing stage. In the stability stage, there is a region of supersonic flow with constant velocity at the cannon nozzle outlet, where the axial velocity and density of the jets remain basically unchanged and the jet formation is stable and insensitive to wind and other influences. In the decline stage, affected by air resistance, the axial velocity of the water jets declines regularly and the part far from the jet axis gradually deviates from it, though the formation is still quite stable. This is totally different from the crushing stage, where the velocity decreases sharply and the jets crush into water droplets that mix with the air and are susceptible to air resistance, wind and other influences.

figure 2

As illustrated in Fig. 2 , fragmentation at the end of the trajectory makes the jet vulnerable, so it is more easily affected by air resistance and wind. Consequently, compared to landing under ideal conditions, the landing area expands, the landing happens earlier and the range is shorter.

In view of these characteristics, Min et al. 8 updated the jet movement model starting from the coefficient of air resistance. Their adaptation is hard to execute in a practical application, but it motivated this paper.

Jet trajectory proposed

The movement of jets consists of upward movement and downward movement, from ejection to landing. Moving upwards means that at time t , the direction \(\theta _{t}\) of velocity v is greater than 0, while moving downwards means \(\theta _{t}<0\) . Considering that the air resistance direction is opposite to the movement direction, these 2 movements are discussed separately.

Jet trajectory based on core stable jets during moving upward

In the classical jet trajectory models defined by Eqs. (4) and (12), during upward movement the jets receive a steady negative effect from air resistance and gravity, which decreases the vertical velocity uniformly. However, this is contrary to the fact that the vertical velocity decreases with increasing acceleration, pushing the jets quickly into the crushing stage, where the force condition changes drastically and the acceleration undergoes a qualitative change. Meanwhile, the horizontal motion of the jets is also negatively affected by air resistance, but its acceleration is far smaller than that in the vertical direction. Thus, the impact on the horizontal acceleration is negligible. To conclude, in the process of moving forward, the vertical acceleration increases gradually and the formation becomes unstable, which in turn affects the acceleration; this repeats until the jets crush totally or the vertical velocity reaches 0.

As a result, a definition of core stable jets , which are the part of the jet from ejection to the last moment before crushing begins, is proposed. During upward movement, it is mainly represented by the vertical displacement, directly affecting vertical velocity. Thus, the proposed upward movement is:

where \(\sigma _{u}\left( \sigma _{u}>0\right)\) is a hyperparameter, representing the degree of impact of core stable jets on vertical velocity. Thus, the larger the vertical displacement is, the larger the vertical velocity is, in line with reality.

Jet trajectory based on core stable jet combined with integrated coefficient during moving downward

During downward movement, gravity accelerates the jet vertically while air resistance opposes it, so the acceleration is smaller than during upward movement and the downward phase lasts longer. The effect of core stable jets on horizontal velocity is therefore no longer negligible. However, the impact of core stable jets is smaller in downward movement than in upward movement because of the smaller acceleration. Hence, the proposed downward movement is:

where \(\sigma _{d}\left( 0<\sigma _{d}<\sigma _{u}\right)\) and \(\sigma _{x}(0<\sigma _{x})\) are hyperparameters representing the influence of core stable jets on the vertical velocity and on the horizontal velocity, respectively; \(\theta _{t}\) is the pitch angle of the water jet at time t ; \(v_{y,t-1}\) and \(v_{x,t-1}\) are the vertical and horizontal velocities at time \((t-1)\) , respectively; and \(\Vert \cdot \Vert\) denotes the norm.

Landing point prediction and pitch angle computation

Wang et al. 13 and Wang et al. 14 each obtained the landing point from an equation, and Hao 11 used a neural network to predict it. None of these methods suits this paper, because Eq. (14) is too complex to solve in closed form. This paper therefore follows Liu et al. 15 and solves Eq. (14) iteratively rather than with a one-step formula.
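As a rough illustration of what solving a trajectory iteratively can look like, the sketch below steps a point mass forward in time under gravity and a drag force proportional to velocity until it reaches the ground. It does not reproduce Eq. (14): the drag coefficient `k`, the time step `dt`, the example inputs and the function name `landing_point` are all assumptions chosen for illustration.

```python
import math

def landing_point(v0, pitch_deg, h0, k=0.1, dt=0.001, g=9.81):
    """Horizontal distance (m) at which a point mass launched from height h0 reaches y = 0.

    Illustrative only: the drag acceleration is -k * v (k in 1/s is an assumed
    value) and stands in for the paper's model, which is not reproduced here.
    """
    if h0 <= 0.0:
        raise ValueError("h0 must be positive")
    theta = math.radians(pitch_deg)
    vx, vy = v0 * math.cos(theta), v0 * math.sin(theta)
    x, y = 0.0, h0
    x_prev, y_prev = x, y
    while y > 0.0:
        x_prev, y_prev = x, y
        # Semi-implicit Euler: update the velocity first, then move with the new velocity.
        vx += -k * vx * dt
        vy += (-g - k * vy) * dt
        x += vx * dt
        y += vy * dt
    # Interpolate linearly between the last point above ground and the first point at or below it.
    frac = y_prev / (y_prev - y)
    return x_prev + frac * (x - x_prev)

# Hypothetical inputs: 30 m/s outlet velocity, 15 degree pitch, 10 m ejecting height.
print(round(landing_point(30.0, 15.0, 10.0), 1))
```

Setting `k=0` recovers the drag-free parabola, which gives a simple sanity check on the stepping scheme before any drag or breakup terms are added.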

figure 3

Water jet trajectory computed by models. IPV represents the initial pitch angle, MCARPTV represents the model considering air resistance proportional to velocity, MCARPTSV represents the model considering air resistance proportional to the square of velocity, MP represents the model proposed.

This paper therefore compares the landing points \(\{x_{t},y_{t}|y_{t}=0\}\) obtained from Eqs. ( 4 , 12 and 14). Fig. 3 shows the resulting trajectories and Table 1 lists the values. The initial ejecting height is 10 m, the working water pressure is 0.9 MPa and the initial pitch angles are \(-15^\circ\) , \(0^\circ\) , \(15^\circ\) , \(30^\circ\) and \(45^\circ\) .

As shown in Table 1 , for the model considering air resistance proportional to velocity, the jet height range is from 10 to 41.2 m and the jet distance range is from 22.8 to 89 m. The largest height range occurs at a pitch angle of \(45^\circ\) and the largest distance range at \(30^\circ\) . The height range varies from 5 to 31 m and the distance range varies from 5 to 56 m. Its trajectory is nearly symmetrical.

For the model considering air resistance proportional to the square of velocity, the jet height range is from 10 to 13.8 m and the jet distance range is from 7.8 to 8.9 m. The largest height range occurs at a pitch angle of \(45^\circ\) and the largest distance range at \(15^\circ\) . The height range varies from 1 to 4 m and the distance range varies from 0.5 to 2 m. Its trajectory is strongly asymmetrical, and the jet falls almost straight down before y reaches 0.

For the model proposed in this paper, the jet height range is from 10 to 38.8 m and the jet distance range is from 20 to 66.8 m. The largest height range occurs at a pitch angle of \(45^\circ\) and the largest distance range at \(15^\circ\) . The height range varies from 5 to 29 m and the distance range varies from 4 to 46 m. Its trajectory is moderately asymmetrical, and at large pitch angles the jet falls almost straight down before landing. In terms of trajectory characteristics, the proposed model best matches the real trajectory in Fig. 2 .

The compensation of landing point prediction based on samples

In practical applications, predicting the landing point directly with the proposed model still produces errors under some special conditions, because complex environments influence the jet in different ways, as follows:

When the jet is ejected upward, the combined influence of gravity and air resistance makes the velocity decrease quickly, so the jet breaks up more easily into droplets, which are much lighter than the jet. The lighter the droplets, the more easily they are affected by air resistance, wind and air flow, which increases the velocity loss and thus affects the range. Normally, when ejection takes place at a low initial height, greenery and buildings block the wind and air flow and provide a fairly calm environment for the jet. When ejection takes place higher up, the jet loses this protection and is easily affected by air flow and wind.

When the jet is ejected downward, gravity and air resistance oppose each other, so the velocity is lost slowly. The core stable jet is governed mainly by the initial velocity, which is determined by the working water pressure. The range therefore depends mainly on the working water pressure.

When the wind velocity is low, its effect on the jet can be ignored; when it is high, the effect cannot be ignored.

Considering the three factors listed above, the jet distance range is updated as:

where x represents the distance range computed by the model, in meters, and \(\delta _{i}\left( i=1,2,3\right)\) denote the influence coefficients of height, working water pressure and wind, respectively. Eqs. (16 , 17 and 18 ) are fitted from the samples.

where x represents the distance range computed by the model and is measured in meters, h represents the height range and is measured in meters, and \(v_{out}\) represents the initial velocity of jet and is measured in meters per second.

where x represents the distance range computed by the model and is measured in meters, \(P_{0}\) represents the rated working water pressure and is measured in MPa, and P represents the actual working water pressure and is also measured in MPa.

where x represents the distance range computed by the model and is measured in meters, h represents the height range and is measured in meters, and \(v_{wind}\) represents the actual wind velocity and is measured in meters per second.

The \(\beta _{i}\left( i=1,2,3\right)\) in Eq. (15) represent the respective influences and are defined as:

where \(\epsilon\) represents the height threshold and \(H_{0}\) denotes the initial ejecting height, both in meters.

where \(\theta\) represents the initial pitch angle, in degrees.

where \(\omega\) represents the wind velocity threshold, in meters per second.

Computation of jet’s initial pitch angle

So far, given an initial pitch angle, the model yields the landing point. In practice, however, the inverse problem must be solved: the operator targets the fire, measures the distance, sets an initial pitch angle and then shoots. In other words, what has to be obtained and applied is the initial pitch angle rather than the landing point.

Liu 16 obtained an analytical solution directly from the classical trajectory equation. For the proposed model, however, an analytical solution is not easy to obtain. This paper therefore proposes an iterative solution: a search algorithm that looks for a pitch angle meeting the height and distance requirements.

figure 4

The variation of distance range with initial pitch angle.

0. When the initial ejecting height is 2.0 m, the max distance range is 54.8 m and the corresponding initial pitch angle is \(13.7^\circ\) .

1. When the initial ejecting height is 8.0 m, the max distance range is 49.6 m and the corresponding initial pitch angle is \(4.6^\circ\) .

2. When the initial ejecting height is 14.0 m, the max distance range is 57.6 m and the corresponding initial pitch angle is \(8.6^\circ\) .

3. When the initial ejecting height is 20.0 m, the max distance range is 57.1 m and the corresponding initial pitch angle is \(7.6^\circ\) .

4. When the initial ejecting height is 26.0 m, the max distance range is 53.8 m and the corresponding initial pitch angle is \(4.6^\circ\) .

5. When the initial ejecting height is 32.0 m, the max distance range is 49.4 m and the corresponding initial pitch angle is \(0.5^\circ\) .

As illustrated in Fig. 4 , under a given working water pressure the distance range increases monotonically with the initial pitch angle while the angle is below a certain value, and fluctuates once the angle exceeds that value. Thus, for a monitor at a given water pressure and initial height, several initial pitch angles can produce the same distance range. The monotonically increasing interval already covers every distance the jet can reach. This paper therefore uses bisection (dichotomy) to search for the target initial pitch angle within the monotonically increasing interval. As shown in Fig. 5 , the procedure is as follows:

Convert displacement \(x_{target}\) and \(L_{0}\) to target distance range x :

Predict \(x_{pred}\) for every \(\theta\) given the actual \(H_{0}\) and the actual p , and obtain \(\{x_{pred}|\theta ,H_{0},p, \theta \in \left( \theta _{min}, \theta _{max}\right) \}\) ;

Sort \(\{x_{pred}\}\) and obtain the maximum \(x_{pred_{max}}\) and the corresponding \(\theta '\) ;

If \(x_{pred_{max}} < x\) , then output null and finish; otherwise continue;

Initialize \(low=\theta _{low}\) ( \(\theta _{low}\) is the minimum pitch angle that the cannon can reach) and \(high=\theta '\) , then use bisection to obtain a \(\theta _{target}\) satisfying \(|x_{pred}-x|<\epsilon\) ( \(\epsilon\) is an acceptable error threshold), output it and finish. The bisection steps are as follows:

Compute \(\theta _{mid}=\frac{low+high}{2}\) ;

Retrieve \(x_{pred}\) corresponding to \(\theta _{mid}\) ;

If \(|x_{pred} - x| < \epsilon\) , then \(\theta _{target}=\theta _{mid}\) and finish; otherwise, continue;

If \(x_{pred} < x\) , set low to \(\theta _{mid}\) ; if \(x_{pred} > x\) , set high to \(\theta _{mid}\) ; then repeat from the computation of \(\theta _{mid}\) .

figure 5

Process of initial pitch angle computation.
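The search procedure can be sketched as follows. This is a minimal illustration rather than the authors' implementation: `predict_range` is a hypothetical callable standing in for the proposed model with compensation (at a fixed ejecting height and pressure), and the angle grid step and iteration cap are assumed values.

```python
def find_pitch_angle(predict_range, x_target, theta_low, theta_max,
                     eps=1.0, grid_step=1.0, max_iter=60):
    """Return a pitch angle (degrees) whose predicted range is within eps of
    x_target, or None if no such angle is found.

    predict_range: hypothetical callable, theta_deg -> predicted distance (m).
    """
    # Scan the admissible angles and locate the angle giving the maximum range.
    n = int((theta_max - theta_low) / grid_step)
    grid = [theta_low + i * grid_step for i in range(n + 1)]
    theta_peak = max(grid, key=predict_range)

    # The target is out of reach if even the maximum range falls short.
    if predict_range(theta_peak) < x_target:
        return None

    # Bisection on [theta_low, theta_peak], where the range is assumed to
    # increase monotonically with the pitch angle.
    low, high = theta_low, theta_peak
    for _ in range(max_iter):
        mid = (low + high) / 2.0
        x_pred = predict_range(mid)
        if abs(x_pred - x_target) < eps:
            return mid
        if x_pred < x_target:
            low = mid
        else:
            high = mid
    return None  # e.g. the target is closer than the range at theta_low


# Toy stand-in for the model: range peaks at 15 degrees (purely illustrative).
toy_model = lambda theta: 50.0 - 0.05 * (theta - 15.0) ** 2
angle = find_pitch_angle(toy_model, 40.0, -45.0, 90.0)
print(angle, toy_model(angle))
```

Restricting the bisection to the interval below the peak angle is what guarantees a unique answer, since above the peak the same range can be produced by more than one angle.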

Experiments

The experiments use a Zoomlion JP65 fire truck fitted with an Akron 3480 fire water monitor. The velocity at the nozzle inlet is the average velocity in the gun pipe, which is determined by the working water pressure, working water discharge, rated water pressure and rated water discharge. The relevant parameters are as follows:

rated water discharge: the maximum of effective water discharge, and the value is 0.13 \(\frac{m^{3}}{s}\) ;

rated water pressure: the maximum of effective water pressure, and the value is 1.7 MPa;

nozzle inlet diameter: the value is 100 mm;

nozzle outlet diameter: the value is 89 mm;

jet thickness: the value is 5 mm;

the error threshold of searching for the target pitch angle: the value is 1 m;
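As a quick plausibility check on the listed values, the mean velocity through a circular section follows from continuity, v = Q / A. The sketch below applies this at the rated discharge only; how the working discharge follows from the working pressure is not specified here and is left out. Treating each section as a full circle and the function name `mean_velocity` are assumptions made for illustration.

```python
import math

def mean_velocity(discharge_m3_s, diameter_mm):
    """Mean flow velocity (m/s) through a full circular section, v = Q / A."""
    radius_m = diameter_mm / 1000.0 / 2.0
    area_m2 = math.pi * radius_m ** 2
    return discharge_m3_s / area_m2

# Rated discharge 0.13 m^3/s through the 100 mm inlet and the 89 mm outlet.
v_inlet = mean_velocity(0.13, 100.0)
v_outlet = mean_velocity(0.13, 89.0)
print(f"inlet: {v_inlet:.1f} m/s, outlet: {v_outlet:.1f} m/s")
```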

For convenience of testing and recording, the truck conditions and operations are as follows:

Initial ejecting heights are taken every 5 m over the interval 1–31 m;

Initial pitch angles are taken every 5 degrees over the interval \(-45\) to \(90^\circ\) ;

The working water pressure is kept at its normal value of 0.8 MPa and is not varied;

The distance range is measured as the distance between the fire cannon and the center of the core jet landing area.

The data in Table 2 come from experiments and model calculations, where \(H_{0}\) is the initial ejecting height in meters; \(\theta _{0}\) is the initial pitch angle in degrees; P is the working water pressure in MPa; x is the distance range measured in the experiment, in meters; \(x'\) is the distance range calculated by the model; \(\Delta\) is the difference between x and \(x'\) ; and relative \(\Delta\) is \(\Delta\) expressed as a percentage of x . The maximum x , around 70 m, occurs for \(H_{0}=24, 10\le \theta _{0}\le 20\) and for \(H_{0}=16, 15\le \theta _{0}\le 25\) .
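The error columns can be computed as in the sketch below. It assumes \(\Delta\) is taken as an absolute difference, which the text does not state explicitly, and the sample values are hypothetical placeholders rather than entries from Table 2.

```python
def error_stats(measured, predicted):
    """Per-sample absolute error and relative error (%), plus their summary values.

    Assumes delta = |x - x'| and relative delta = 100 * delta / x.
    """
    deltas = [abs(x - xp) for x, xp in zip(measured, predicted)]
    relative = [100.0 * d / x for d, x in zip(deltas, measured)]
    return deltas, relative, max(deltas), sum(relative) / len(relative)

# Hypothetical measured and model-computed distance ranges, in meters.
x_measured = [50.0, 60.0, 40.0]
x_model = [48.0, 63.0, 41.0]
deltas, relative, max_delta, mean_relative = error_stats(x_measured, x_model)
print(deltas, relative, max_delta, round(mean_relative, 2))
```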

As shown in Table 2 , for each \(H_{0}\) a change of 1 degree in \(\theta _{0}\) changes x by between 0.5 and 1, and the largest variation occurs when \(H_{0}\) is between 10 and 25. Table 3 further shows that, even at the same initial ejecting height, the change in distance range differs between initial pitch angle ranges; in general, for every initial ejecting height, the change in distance range starts to decrease once the initial pitch angle passes a certain value, which matches the computed result in Fig. 4 . The change also differs between initial ejecting heights within the same initial pitch angle range. For example, at a height of 1.30 m the range of \(\theta _{0}\) from 0 to 10 is where x is most sensitive, whereas at an initial ejecting height of 6.14 m the most sensitive range is from 10 to 20. Table 4 gives a sensitivity analysis: across all \(\theta _{0}\) , a change of 1 meter in \(H_{0}\) changes x by 0.4 to 1.9. The \(\theta _{0}\) ranges with large variations in x are \(-15\) to \(0^\circ\) and 10 to \(20^\circ\) .

To conclude, x is influenced by both \(H_{0}\) and \(\theta _{0}\) , and the influence is non-linear. In practice, to improve efficiency, it is best to choose a \(\theta _{0}\) range in which x changes greatly but the sensitivity is low, such as the range from \(-15\) to \(0^\circ\) , which is also the range searched in this paper. Choosing a proper \(H_{0}\) together with \(\theta _{0}\) makes the jet land exactly on the target.

As shown in Table 5 , the maximum prediction error of the model considering air resistance proportional to velocity reaches 53 m and its average relative error is 24.5%. This is still lower than for the model considering air resistance proportional to the square of velocity, whose maximum prediction error is 65 m and whose average relative error is 81.6%. By contrast, the proposed model keeps the maximum prediction error within 5 m and the average error within 2 m, and reduces the relative error to 4.6%.

This paper studies water jet trajectory and landing point prediction, centered on solving for the monitor’s initial pitch angle, and realizes an automatic system that aims the monitor at the target without human involvement, thereby improving firefighting efficiency.

Taking into account the breakup and atomization of water jets as well as air resistance, wind and height, this paper proposes the definition of core stable jets, uses it to predict the landing point and compensates the prediction, which limits the distance error to within 4.6%. This is satisfactory for engineering applications and helps achieve an intelligent fire monitor. The proposed model takes 0.00292 s per computation on average, clearly less than neural network prediction or hydrodynamic computation.

Given the complexity and variability of real situations, the experiments in this paper cover only some typical urban situations and not others, such as forests. The jet trajectory model and landing point prediction method will therefore be updated in future work.

Data availability

All data generated or analysed during this study are included in this published article.

References

1. Jiao, L., Yang, S., Liu, F., Wang, S. & Feng, Z. Beyond neural networks: Retrospect and prospect. Chin. J. Comput. 39, 1697–1716 (2016).

2. Zhang, C., Zhang, R., Dai, Z., He, B. & Yao, Y. Prediction model for the water jet falling point in fire extinguishing based on a GA-BP neural network. PLoS ONE 14, e0221729. https://doi.org/10.1371/journal.pone.0221729 (2019).

3. Lin, Y., Ji, W., He, H. & Chen, Y. Two-stage water jet landing point prediction model for intelligent water shooting robot. Sensors https://doi.org/10.3390/s21082704 (2021).

4. Tang, J. Study on Cutting Model for Abrasive Water Jet Based on Artificial Neural Network. Master’s thesis, Xihua University (2011).

5. Wang, F., Chen, X., Min, Y. & Wang, W. Fitting model of water jet track of fire monitors. Fire Sci. Technol. 656–658 (2007).

6. Liao, X., Liu, P. & Chen, W. Study on fire monitor’s jet track based on matlab. Fire Sci. Technol. 33, 1169–1172 (2014).

7. Chen, X. & Yang, Y. Jet trajectory model and positioning compensation method for fire water cannon. Chin. J. Eng. Des. 23, 558–563 (2016).

8. Min, Y., Chen, X., Chen, C. & Hu, C. Pitching angle-based theoretical model for the track simulation of water jet out from water fire monitors. Chin. J. Mech. Eng. 47, 134–138 (2011).

9. Sun, J. Research of the trajectory of fire-fighting monitor’s Jet. Master’s thesis, Shanghai Jiao Tong University (2010).

10. Zhu, J. Research on Intelligent Fire Monitor Control Based on Machine Vision. Ph.D. thesis, China University of Mining and Technology (2021).

11. Hao, W. Modeling and control of water jet for forest fire truck. Master’s thesis, Beijing Forestry University (2020).

12. Wan, F. The analysis on the jet track and positioning performance of the fire monitor. Master’s thesis, Shanghai University (2008).

13. Wang, K., Guo, L. & Yang, W. A methodology to position jet landing point of fire monitor and the fire-fighting robot (2016).

14. Wang, Y., Li, C. & Zhang, J. Automatic controlling method to jet for aerial fire vehicles (2020).

15. Liu, X., Zhang, M. & Liang, Z. Method to confirm landing point of fire monitor water jet (2018).

16. Liu, B., Chen, Y. & Ren, S. Method to calculate jet pitch angle of auto tracking and targeting jet suppression system (2014).

Author information

Authors and affiliations.

College of Mechanical and Vehicle Engineering, Hunan University, Changsha, 410000, China

Qing Fan & Qianwang Deng

Central Research Center, Zoomlion Heavy Industry Science & Technology Co., Ltd, Changsha, 410000, China

Contributions

Q.F., Q.D. and Q.L. conceived and designed the experiments. Q.L. analyzed the data. All authors interpreted the results. Q.F. and Q.L. wrote the first and revised drafts of the manuscript. All authors contributed to the manuscript and reviewed the manuscript. All the authors agreed with the results and conclusions of the manuscript.

Corresponding author

Correspondence to Qing Fan .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

About this article

Cite this article.

Fan, Q., Deng, Q. & Liu, Q. Research and application on modeling and landing point prediction technology for water jet trajectory of fire trucks under large-scale scenarios. Sci Rep 14 , 21950 (2024). https://doi.org/10.1038/s41598-024-72476-y

Received : 04 July 2024

Accepted : 09 September 2024

Published : 20 September 2024

DOI : https://doi.org/10.1038/s41598-024-72476-y

  • Large scenes
  • Jet trajectory
  • Prediction of landing points
  • Fire water cannon
