What would be the most likely result if the match were played 10,000 times?

We ask this question of every Veikkausliiga match and we hope to spark and inform debate by showing you our findings.

The outcome of every football match consists of signal and noise. The signals are related to certain types of match events, e.g. goal-scoring attempts and their locations. If the signal is strong enough to cut through the noise, we can find valuable information about a team's performance and its consistency.

Using Veikkausliiga data from 2017 as an initial dataset, our dynamic expected goals models crunch the match-event data from each new round of fixtures and predict the goals scored (expected goals – xG) by each team against their opponent of the week.

The xG predictions are then used to simulate each match 10,000 times, given its match events, and the most frequent scoreline is taken as the most likely match score.
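As a rough illustration of this kind of simulation, the sketch below replays a match 10,000 times by treating every shot as an independent chance that scores with probability equal to its xG value. This is a common textbook approach and only an assumption here, not a description of our actual models; the function name `simulate_match` and the shot values are made up for the example.

```python
import random
from collections import Counter


def simulate_match(home_shots, away_shots, n_sims=10_000, seed=0):
    """Replay a match n_sims times and count how often each scoreline occurs.

    home_shots / away_shots are lists of per-shot xG values (illustrative).
    Assumption: shots are independent and each scores with probability xG.
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    scores = Counter()
    for _ in range(n_sims):
        home_goals = sum(rng.random() < p for p in home_shots)
        away_goals = sum(rng.random() < p for p in away_shots)
        scores[(home_goals, away_goals)] += 1
    return scores


# Hypothetical shot data: five home attempts, three away attempts.
home_shots = [0.35, 0.10, 0.08, 0.05, 0.40]
away_shots = [0.12, 0.07, 0.03]

scores = simulate_match(home_shots, away_shots)
most_likely_score, count = scores.most_common(1)[0]
print(f"Most likely score: {most_likely_score[0]}-{most_likely_score[1]} "
      f"({count / sum(scores.values()):.1%} of simulations)")
```

The `Counter` keeps every simulated scoreline, so the less frequent outcomes remain available for inspection alongside the most likely one.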

Get in Touch

We are a sports technology and data analytics company. Our mission is to support the development of team sports through the opportunities provided by modern technology and advances in scientific research.

Contact us: info (at) kvantia.com or call us +358 (0)44 550 7414.

Tables and Graphs

Table of Justice

The league table according to the simulated results. The ‘justice’ of the table refers to the teams’ successes (or failures) as defined by our models in cyberspace, and takes nothing away from the actual successes (or failures) achieved by the teams on the field.
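To make the idea concrete, here is a minimal sketch of how a table could be built from simulated scorelines. The team names and scores are placeholders, and both the points scheme (3/1/0) and the tie-break order (points, goal difference, goals scored) are assumptions for the example rather than a statement of how our table is produced.

```python
from collections import defaultdict


def build_table(results):
    """Build a league table from (home, away, home_goals, away_goals) tuples."""
    table = defaultdict(lambda: {"played": 0, "points": 0, "gf": 0, "ga": 0})
    for home, away, home_goals, away_goals in results:
        for team, scored, conceded in ((home, home_goals, away_goals),
                                       (away, away_goals, home_goals)):
            row = table[team]
            row["played"] += 1
            row["gf"] += scored
            row["ga"] += conceded
            if scored > conceded:
                row["points"] += 3
            elif scored == conceded:
                row["points"] += 1
    # Assumed tie-break order: points, then goal difference, then goals scored.
    return sorted(
        table.items(),
        key=lambda item: (item[1]["points"],
                          item[1]["gf"] - item[1]["ga"],
                          item[1]["gf"]),
        reverse=True,
    )


# Placeholder simulated results, not real data.
simulated_results = [
    ("Team A", "Team B", 1, 0),
    ("Team B", "Team C", 1, 1),
    ("Team C", "Team A", 2, 1),
]

table = build_table(simulated_results)
for position, (team, row) in enumerate(table, start=1):
    print(position, team, row["played"], row["points"], row["gf"] - row["ga"])
```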

Results

The simulated results of each league match in 2018

Form xG: Scored

Each team’s expected goals scored per game according to their form in the last five matches

Form xG: Conceded

Each team’s expected goals conceded per game according to their form in the last five matches

Form xG: Diff

Each team’s expected goal difference per game according to their form in the last five matches
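The three form figures above can be sketched as simple averages over a team's five most recent matches. The plain unweighted mean, the function name `form_xg` and the sample numbers are all assumptions made for illustration.

```python
def form_xg(matches, window=5):
    """Return (xG scored, xG conceded, xG difference) per game over recent matches.

    matches is a chronological list of (xg_for, xg_against) pairs for one team.
    Assumption: form is an unweighted mean over the last `window` matches.
    """
    if not matches:
        raise ValueError("at least one match is required")
    recent = matches[-window:]
    scored = sum(xg_for for xg_for, _ in recent) / len(recent)
    conceded = sum(xg_against for _, xg_against in recent) / len(recent)
    return scored, conceded, scored - conceded


# Illustrative values for six matches; only the last five count towards form.
matches = [(0.9, 1.6), (1.2, 1.0), (1.8, 0.7), (1.1, 1.3), (2.0, 0.9), (1.4, 1.1)]

scored, conceded, diff = form_xg(matches)
print(f"Form xG scored {scored:.2f}, conceded {conceded:.2f}, diff {diff:+.2f}")
```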

Technical notes about the models

Because we select each expected match result as the most probable outcome after a high number of simulations, the selected results will always regress towards the mean/mode. This does not mean that the other outcomes did not occur in the simulations; they just occurred less frequently.
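A small worked example shows why the selected result gravitates to a common scoreline. It assumes, purely for illustration, that each team's goals follow independent Poisson distributions with the team's xG as the mean; the xG values 1.4 and 1.1 are invented, and this is not claimed to be how our models work.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number is lam."""
    return exp(-lam) * lam ** k / factorial(k)


def score_distribution(home_xg, away_xg, max_goals=10):
    """Probability of every scoreline up to max_goals per team.

    Assumption: home and away goals are independent Poisson variables.
    """
    return {
        (h, a): poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg)
        for h in range(max_goals + 1)
        for a in range(max_goals + 1)
    }


dist = score_distribution(1.4, 1.1)  # invented xG values
modal_score = max(dist, key=dist.get)
home_win = sum(p for (h, a), p in dist.items() if h > a)

print(f"Most probable score: {modal_score[0]}-{modal_score[1]} "
      f"with probability {dist[modal_score]:.3f}")
print(f"Probability of a home win: {home_win:.3f}")
```

Under these assumptions the single most probable scoreline is a draw even though the home side has the higher xG, and that scoreline accounts for only a small share of all outcomes. The remaining probability is spread across many less frequent results.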

The models are purely based on available match event data – no other sources of information are used.

The model parameters were initially tuned purely on data from the previous season, which induces a certain degree of bias. This bias will be ‘corrected’ gradually throughout 2018 by adjusting certain team-specific model parameters. For example, the models do not have any specific prior information about the promoted teams (TPS and FC Honka) but learn from each round of matches and adapt accordingly.

As the famous saying goes, all models are more or less wrong. Please acknowledge this when using the information we present – we look forward to hearing your critique!