Socceraction documentation

socceraction is a Python package for objectively quantifying the impact of the individual actions performed by soccer players using event stream data. It contains the following components:

  • Converters from event stream data to the SPADL and atomic-SPADL formats, which are unified and expressive languages for on-the-ball player actions.

  • An implementation of the VAEP framework to value actions by their expected impact on the scoreline.

  • An implementation of the xT framework to value ball-progressing actions using a possession-based Markov model.
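To give a feel for what a possession-based Markov model computes, here is a minimal sketch of the expected-threat recursion on a toy pitch. It does not use socceraction and is not the package's implementation. The three zones, the function name `expected_threat`, and every probability below are hypothetical illustration values. The recursion assumed here is the commonly cited xT formulation: the value of a zone is the chance of shooting and scoring from it, plus the chance of moving the ball times the value of the zone it ends up in.

```python
# Toy expected-threat (xT) computation on a hypothetical 3-zone pitch.
# All numbers are made up for illustration; nothing here is fitted to data.


def expected_threat(shoot_prob, goal_prob, move_prob, transition, n_iter=100):
    """Iterate xT(z) = s_z * g_z + m_z * sum_k T[z][k] * xT(k)."""
    n_zones = len(shoot_prob)
    xt = [0.0] * n_zones  # start from zero threat everywhere
    for _ in range(n_iter):
        xt = [
            shoot_prob[z] * goal_prob[z]
            + move_prob[z] * sum(transition[z][k] * xt[k] for k in range(n_zones))
            for z in range(n_zones)
        ]
    return xt


# Zone 0 = own half, zone 1 = midfield, zone 2 = near the opponent's goal.
shoot_prob = [0.0, 0.1, 0.4]   # probability of shooting from each zone
goal_prob = [0.0, 0.05, 0.3]   # probability that a shot from the zone scores
move_prob = [1.0, 0.9, 0.6]    # probability of moving the ball instead

# transition[z][k]: probability that a move from zone z successfully reaches
# zone k. Rows sum to less than 1; the remainder is a loss of possession.
transition = [
    [0.5, 0.3, 0.0],
    [0.2, 0.4, 0.2],
    [0.1, 0.3, 0.3],
]

xt = expected_threat(shoot_prob, goal_prob, move_prob, transition)
print([round(v, 4) for v in xt])
```

Under this formulation, a ball-progressing action from zone `a` to zone `b` would be valued as `xt[b] - xt[a]`.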

First steps

Are you new to socceraction? Check out the Quickstart guide or watch Lotte Bransen and Jan Van Haaren’s series of tutorials on how to use socceraction:

  • Introduction in Friends of Tracking (video)

    This introductory presentation motivates the use of data for player recruitment in football, shows the limitations of traditional statistics for assessing player performance, introduces several frameworks for valuing player actions, gives an intuitive explanation of the VAEP framework, and outlines the content of this series of hands-on video tutorials.

  • Presentation: Valuing actions in football (video, slides)

    This presentation expands on the introductory presentation by covering the technical details of the VAEP framework and the content of the hands-on video tutorials in more depth.

  • Tutorial 1: Run pipeline (video, notebook, notebook on Google Colab)

    This tutorial demonstrates the entire pipeline, from ingesting the raw Wyscout match event data to producing ratings for football players. It touches upon the following four topics: downloading and preprocessing the data, valuing game states, valuing actions and rating players.

  • Tutorial 2: Generate features (video, notebook, notebook on Google Colab)

    This tutorial demonstrates the process of generating features and labels. This tutorial touches upon the following three topics: exploring the data in the SPADL representation, constructing features to represent actions and constructing features to represent game states.

  • Tutorial 3: Learn models (video, notebook, notebook on Google Colab)

    This tutorial demonstrates the process of splitting the dataset into a training set and a test set, learning baseline models using conservative hyperparameters for the learning algorithm, optimizing the hyperparameters for the learning algorithm and learning the final models.

  • Tutorial 4: Analyze models and results (video, notebook, notebook on Google Colab)

    This tutorial demonstrates the process of analyzing the importance of the features that are included in the trained machine learning models, analyzing the predictions for specific game states, and analyzing the resulting player ratings.
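The last two steps of the pipeline in the tutorials, valuing actions and rating players, can be sketched in a few lines. This is a simplified illustration and not socceraction's API: the record layout, the function names `vaep_value` and `player_ratings`, and all probabilities are hypothetical stand-ins for the outputs of trained models. It assumes the same team is in possession before and after each action, values an action as the change in scoring probability minus the change in conceding probability, and rates a player as the sum of action values per 90 minutes played.

```python
from collections import defaultdict

# Hypothetical records: estimated probabilities that the acting player's team
# scores or concedes in the near future, before and after each action.
actions = [
    {"player": "A", "p_score_before": 0.01, "p_score_after": 0.03,
     "p_concede_before": 0.01, "p_concede_after": 0.01},
    {"player": "B", "p_score_before": 0.03, "p_score_after": 0.05,
     "p_concede_before": 0.01, "p_concede_after": 0.02},
    {"player": "A", "p_score_before": 0.05, "p_score_after": 0.12,
     "p_concede_before": 0.02, "p_concede_after": 0.01},
]
minutes_played = {"A": 90, "B": 45}  # made-up playing time per player


def vaep_value(action):
    """Value of one action: gain in scoring chance minus gain in conceding chance."""
    offensive = action["p_score_after"] - action["p_score_before"]
    defensive = -(action["p_concede_after"] - action["p_concede_before"])
    return offensive + defensive


def player_ratings(actions, minutes_played):
    """Sum each player's action values and normalize to 90 minutes."""
    totals = defaultdict(float)
    for action in actions:
        totals[action["player"]] += vaep_value(action)
    return {
        player: 90 * total / minutes_played[player]
        for player, total in totals.items()
    }


values = [vaep_value(a) for a in actions]
ratings = player_ratings(actions, minutes_played)
print(ratings)
```

Normalizing per 90 minutes keeps players with different playing time comparable; the tutorials build the probability estimates themselves from features of the game state.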

Getting help

Having trouble? We’d like to help!

Contributing

Learn about the development process itself and about how you can contribute in our developer guide.

Research

If you make use of this package in your research, please consider citing the following papers.

  • Decroos, Tom, Lotte Bransen, Jan Van Haaren, and Jesse Davis. “Actions speak louder than goals: Valuing player actions in soccer.” In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851-1861. 2019. [link]

  • Van Roy, Maaike, Pieter Robberechts, Tom Decroos, and Jesse Davis. “Valuing on-the-ball actions in soccer: a critical comparison of XT and VAEP.” In Proceedings of the AAAI-20 Workshop on Artificial Intelligence in Team Sports. AI in Team Sports Organising Committee, 2020. [link]