Loading data
Socceraction provides API clients for various popular event stream data sources. These clients fetch event streams and their corresponding metadata as Pandas DataFrames using a unified data model. Alternatively, you can use kloppy to load data.
Loading data with socceraction
All API clients implemented in socceraction inherit from the EventDataLoader interface. This interface provides the following methods to retrieve data as Pandas DataFrames with a unified data model (i.e., a Schema). The schema defines the minimal set of columns and their types that each method returns. Implementations of the EventDataLoader interface may add further columns.
Method | Output schema | Description |
---|---|---|
competitions() | CompetitionSchema | All available competitions and seasons |
games() | GameSchema | All available games in a season |
teams() | TeamSchema | Both teams that participated in a game |
players() | PlayerSchema | All players that participated in a game |
events() | EventSchema | The event stream of a game |
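As a rough illustration of how these methods fit together, the sketch below collects the per-game tables from any loader object. It assumes the loader exposes `teams`, `players` and `events` methods that each take a game identifier. The helper names `load_game` and `example_statsbomb` are hypothetical and not part of socceraction. The usage function additionally assumes that the StatsBomb loader can be imported as `socceraction.data.statsbomb.StatsBombLoader`, that its dependencies are installed, and that the StatsBomb open data is reachable over the network.

```python
def load_game(loader, game_id):
    """Collect the per-game tables from an EventDataLoader-style client.

    Hypothetical helper, not part of socceraction. It relies only on the
    loader's `teams(game_id)`, `players(game_id)` and `events(game_id)`
    methods and returns their results unchanged.
    """
    return {
        "teams": loader.teams(game_id),
        "players": loader.players(game_id),
        "events": loader.events(game_id),
    }


def example_statsbomb(game_id=8657):
    """Usage sketch; assumes socceraction is installed and network access."""
    # Imported lazily so that `load_game` itself has no dependencies.
    from socceraction.data.statsbomb import StatsBombLoader

    # Assumption: getter="remote" reads the public StatsBomb open data.
    loader = StatsBombLoader(getter="remote")
    return load_game(loader, game_id)
```

Because `load_game` depends only on the shared interface, the same call should work with a loader for another provider, although the extra columns in each table may differ.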
Currently, the following data providers are supported: StatsBomb, Wyscout, and Opta.
Loading data with kloppy
Like socceraction, kloppy implements a unified data model for soccer data. The main differences between the two are: (1) kloppy supports more data sources (including tracking data), (2) kloppy uses a more flexible object-based data model, in contrast to socceraction’s dataframe-based model, and (3) kloppy covers a more complete set of events, while socceraction focuses on on-the-ball events. Thus, we recommend using kloppy if you want to load data from a source that socceraction does not support or if your analysis is not limited to on-the-ball events.
The following code snippet shows how to load data from StatsBomb using kloppy:
from kloppy import statsbomb
dataset = statsbomb.load_open_data(match_id=8657)
Instructions for loading data from other sources can be found in the kloppy documentation.
You can then convert the data to the SPADL format using the convert_to_actions() function:
from socceraction.spadl.kloppy import convert_to_actions
spadl_actions = convert_to_actions(dataset, game_id=8657)
Note
Currently, the data model of kloppy is only complete for StatsBomb data. If you use kloppy to load data from other sources and convert it to the SPADL format, you may lose some information.