A Soccer Action Valuation Toolkit

socceraction is a Python package for objectively quantifying the value of the individual actions performed by soccer players using event stream data. It contains the following components:

A set of API clients for loading event stream data from StatsBomb, Wyscout and Opta.
Converters from each of these providers' proprietary data formats to the SPADL and atomic-SPADL formats, which are unified and expressive languages for on-the-ball player actions.
An implementation of the Expected Threat (xT) possession value framework.
An implementation of the VAEP and Atomic-VAEP possession value frameworks.
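
To make the idea of a possession value framework concrete, here is a minimal pure-Python sketch of the Expected Threat recursion on a toy pitch. It does not use socceraction's own classes. The three-zone pitch, every probability, and the function names expected_threat and move_value are assumptions invented for illustration, and treating the unassigned transition probability as lost possession is a simplification.

```python
def expected_threat(shoot_prob, goal_prob, move_prob, transition, n_iter=100):
    """Iterate xT(z) = s_z * g_z + m_z * sum_t T[z][t] * xT(t), starting from xT = 0."""
    n_zones = len(shoot_prob)
    xt = [0.0] * n_zones
    for _ in range(n_iter):
        xt = [
            shoot_prob[z] * goal_prob[z]
            + move_prob[z] * sum(transition[z][t] * xt[t] for t in range(n_zones))
            for z in range(n_zones)
        ]
    return xt


def move_value(xt, start_zone, end_zone):
    """Value of moving the ball: the difference in xT between end and start zone."""
    return xt[end_zone] - xt[start_zone]


# Toy pitch with three zones: 0 = own half, 1 = midfield, 2 = final third.
# All numbers below are made up for this example.
shoot_prob = [0.0, 0.1, 0.4]   # probability of shooting from each zone
move_prob = [1.0, 0.9, 0.6]    # probability of moving the ball instead
goal_prob = [0.0, 0.05, 0.3]   # probability that a shot from the zone scores
transition = [                 # T[z][t]: a move from z ends in t; rows may sum
    [0.5, 0.4, 0.0],           # to less than 1, the remainder being treated
    [0.2, 0.4, 0.2],           # as lost possession
    [0.0, 0.3, 0.4],
]

xt = expected_threat(shoot_prob, goal_prob, move_prob, transition)
pass_value = move_value(xt, start_zone=0, end_zone=2)
```

In this toy setting, zones closer to goal end up with a higher xT value, so a pass from the own half into the final third receives a positive value.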

First steps

Are you new to soccer event stream data and possession value frameworks? Check out our interactive explainer and watch Lotte Bransen’s and Jan Van Haaren’s presentation in the Friends of Tracking series. Once you are familiar with the basic concepts, you can move on to the quickstart guide or continue with the hands-on video tutorials of the Friends of Tracking series:

Valuing actions in soccer (video, slides)
This presentation expands on the introductory presentation. It covers the technical details of the VAEP framework for valuing the actions of soccer players and discusses the content of the hands-on video tutorials in more depth.
Tutorial 1: Run pipeline (video, notebook, notebook on Google Colab)
This tutorial demonstrates the entire pipeline, from ingesting the raw Wyscout match event data to producing ratings for soccer players. It touches upon the following four topics: downloading and preprocessing the data, valuing game states, valuing actions and rating players.
Tutorial 2: Generate features (video, notebook, notebook on Google Colab)
This tutorial demonstrates the process of generating features and labels. This tutorial touches upon the following three topics: exploring the data in the SPADL representation, constructing features to represent actions and constructing features to represent game states.
Tutorial 3: Learn models (video, notebook, notebook on Google Colab)
This tutorial demonstrates the process of splitting the dataset into a training set and a test set, learning baseline models using conservative hyperparameters for the learning algorithm, optimizing the hyperparameters for the learning algorithm and learning the final models.
Tutorial 4: Analyze models and results (video, notebook, notebook on Google Colab)
This tutorial demonstrates the process of analyzing the importance of the features that are included in the trained machine learning models, analyzing the predictions for specific game states, and analyzing the resulting player ratings.
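
The last steps of the pipeline covered in the tutorials (valuing game states, valuing actions and rating players) can be sketched in plain Python. This is an illustration of the VAEP idea under stated assumptions, not the library's implementation: the probabilities p_scores and p_concedes stand in for the output of trained models and are invented, as are the players, the minutes played, and the function names vaep_values and player_ratings.

```python
from collections import defaultdict


def vaep_values(actions):
    """Value each action as the change in scoring probability minus the
    change in conceding probability, relative to the previous game state."""
    values = []
    prev_scores, prev_concedes, prev_team = 0.0, 0.0, None  # assumed baseline
    for action in actions:
        # After a change of possession, the previous state is viewed from the
        # other team's side, so its two probabilities swap roles.
        if prev_team is not None and action["team"] != prev_team:
            prev_scores, prev_concedes = prev_concedes, prev_scores
        offensive = action["p_scores"] - prev_scores
        defensive = -(action["p_concedes"] - prev_concedes)
        values.append(offensive + defensive)
        prev_scores = action["p_scores"]
        prev_concedes = action["p_concedes"]
        prev_team = action["team"]
    return values


def player_ratings(actions, values, minutes_played):
    """Sum the action values per player and normalize per 90 minutes."""
    totals = defaultdict(float)
    for action, value in zip(actions, values):
        totals[action["player"]] += value
    return {p: 90 * total / minutes_played[p] for p, total in totals.items()}


# Hypothetical model outputs for three consecutive actions.
actions = [
    {"player": "A", "team": "home", "p_scores": 0.02, "p_concedes": 0.01},
    {"player": "B", "team": "home", "p_scores": 0.05, "p_concedes": 0.01},
    {"player": "C", "team": "away", "p_scores": 0.01, "p_concedes": 0.03},
]
minutes_played = {"A": 90, "B": 45, "C": 90}

values = vaep_values(actions)
ratings = player_ratings(actions, values, minutes_played)
```

Player C's action gains value purely on the defensive side here: after winning the ball, the away team's probability of conceding drops from 0.05 to 0.03.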

Note

The video tutorials are based on version 0.2.0 of the socceraction library. If a more recent version of the library is installed, the code may need to be adapted.

Getting help

Having trouble? We’d like to help!

Try the FAQ – it’s got answers to many common questions.
Looking for specific information? Try the Index or Module Index.
Report bugs in our ticket tracker.

Contributing

Learn about the development process itself and about how you can contribute in our developer guide.

Research

If you make use of this package in your research, please consider citing the following papers.

Tom Decroos, Lotte Bransen, Jan Van Haaren, and Jesse Davis. “Actions speak louder than goals: Valuing player actions in soccer.” In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1851-1861. 2019.
[pdf, bibtex]

Maaike Van Roy, Pieter Robberechts, Tom Decroos, and Jesse Davis. “Valuing on-the-ball actions in soccer: a critical comparison of xT and VAEP.” In Proceedings of the AAAI-20 Workshop on Artificial Intelligence in Team Sports. AI in Team Sports Organising Committee, 2020.
[pdf, bibtex]