.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "gallery/lesson1/plot_PassHeatMap.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_gallery_lesson1_plot_PassHeatMap.py:


Pass heat maps
===========================

Make a heat map of a team's passes during a tournament. To add context, we focus on
danger passes, defined here as passes made in the 15 seconds leading up to a shot.

.. youtube:: cgn4fvWo5l0
   :width: 640
   :height: 349

.. GENERATED FROM PYTHON SOURCE LINES 18-19

We will need these libraries.

.. GENERATED FROM PYTHON SOURCE LINES 19-24

.. code-block:: default

    import matplotlib.pyplot as plt
    from mplsoccer import Pitch, Sbopen, VerticalPitch
    import pandas as pd

.. GENERATED FROM PYTHON SOURCE LINES 25-30

Opening the dataset
----------------------------
To get the games played by the England Women's team, we filter the match dataframe for
rows where they were either the home or the away team. We also count the games so that
we can normalize the diagrams later on.

.. GENERATED FROM PYTHON SOURCE LINES 30-41

.. code-block:: default

    #open the data
    parser = Sbopen()
    df_match = parser.match(competition_id=72, season_id=30)
    #our team
    team = "England Women's"
    #get list of games by our team, either home or away
    match_ids = df_match.loc[(df_match["home_team_name"] == team) | (df_match["away_team_name"] == team)]["match_id"].tolist()
    #calculate number of games
    no_games = len(match_ids)

.. GENERATED FROM PYTHON SOURCE LINES 42-49

Finding danger passes
----------------------------
First, we open the event data for each game using the mplsoccer parser. Note that we
index the result with ``[0]`` to keep only the event data. Then, we extract all the
shots from the games, together with their time in seconds and possession identifier.
We use that information, together with the ``match_id``, to keep only the passes that
were made in the same possession as a shot. Lastly, we calculate the time difference
between each pass and the shot and keep only the passes made within 15 seconds before
the shot. Note that some possessions contain multiple shots, so we make sure each pass
is included only once, paired with the next upcoming shot.

.. GENERATED FROM PYTHON SOURCE LINES 49-85

.. code-block:: default

    #Open event data for all matches and concatenate them
    df_all_events = pd.concat([parser.event(match_id)[0] for match_id in match_ids], ignore_index=True)

    #Identify danger passes
    #Add time in seconds column
    df_all_events["time_seconds"] = df_all_events["minute"]*60 + df_all_events["second"]

    #Take out the shots
    df_shots = df_all_events[(df_all_events['type_name'] == 'Shot')]
    #Only keep the necessary columns about shots
    df_shots = df_shots[['match_id', 'possession', 'time_seconds']]

    #Take out the open play successful passes from the possession team
    #(in this data a missing outcome_name marks a completed pass)
    df_passes = df_all_events[
        (df_all_events['type_name'] == 'Pass') &
        (df_all_events['outcome_name'].isnull()) &
        (df_all_events['possession_team_id'] == df_all_events['team_id']) &
        (~df_all_events.sub_type_name.isin(['Throw-in', 'Corner', 'Free Kick', 'Kick Off', 'Goal Kick']))
    ]

    # Merge shots and passes on possession and match_id
    # Use an inner join to keep only passes that have a matching shot in the same possession
    df_merged = df_shots.merge(df_passes, on=['possession', 'match_id'], how='inner', suffixes=('_shot', ''))

    # Calculate time difference between pass and shot
    df_merged['time_diff'] = df_merged['time_seconds_shot'] - df_merged['time_seconds']

    # Keep only passes that occurred within 15 seconds before the shot
    df_danger_passes = df_merged[df_merged['time_diff'].between(0, 15)]

    # Some possessions may have multiple shots, keep only the shot with the smallest time_diff to each pass
    first_shot = df_danger_passes.groupby('id')['time_diff'].idxmin()
    df_danger_passes = df_danger_passes.loc[first_shot].reset_index(drop=True)

    # Filter for our team
    df_danger_passes = df_danger_passes[df_danger_passes['team_name'] == team]

    # Only keep necessary columns (copy so that later column edits do not touch a view)
    danger_passes = df_danger_passes[['x', 'y', 'end_x', 'end_y', 'minute', 'second', 'player_name']].copy()

.. GENERATED FROM PYTHON SOURCE LINES 86-90

Plotting location of danger passes
----------------------------------
First, we create a pitch using the mplsoccer *Pitch* class. Then we plot the pass start
locations with its *scatter* method. If you want to investigate the direction of the
passes, uncomment the *arrows* line below!

.. GENERATED FROM PYTHON SOURCE LINES 90-103

.. code-block:: default

    #plot pitch
    pitch = Pitch(line_color='black')
    fig, ax = pitch.grid(grid_height=0.9, title_height=0.06, axis=False,
                         endnote_height=0.04, title_space=0, endnote_space=0)
    #scatter the location on the pitch
    pitch.scatter(danger_passes.x, danger_passes.y, s=100, color='blue', edgecolors='grey', linewidth=1, alpha=0.2, ax=ax["pitch"])
    #uncomment it to plot arrows
    #pitch.arrows(danger_passes.x, danger_passes.y, danger_passes.end_x, danger_passes.end_y, color = "blue", ax=ax['pitch'])
    #add title
    fig.suptitle('Location of danger passes by ' + team, fontsize = 30)
    plt.show()

.. image-sg:: /gallery/lesson1/images/sphx_glr_plot_PassHeatMap_001.png
   :alt: Location of danger passes by England Women's
   :srcset: /gallery/lesson1/images/sphx_glr_plot_PassHeatMap_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 104-109

Making a heat map
----------------------------
To make a heat map, we first draw a pitch. Then we count the passes in each bin using
the *bin_statistic* method and normalize the counts by the number of games. Next, we
plot the heat map and add a colorbar as a legend. As the last step, we add the title.

.. GENERATED FROM PYTHON SOURCE LINES 109-127

.. code-block:: default

    #plot pitch, drawing the lines on top of the heat map
    pitch = Pitch(line_zorder=2, line_color='black')
    fig, ax = pitch.grid(grid_height=0.9, title_height=0.06, axis=False,
                         endnote_height=0.04, title_space=0, endnote_space=0)
    #get the 2D histogram
    bin_statistic = pitch.bin_statistic(danger_passes.x, danger_passes.y, statistic='count', bins=(6, 5), normalize=False)
    #normalize by number of games
    bin_statistic["statistic"] = bin_statistic["statistic"]/no_games
    #make a heatmap
    pcm = pitch.heatmap(bin_statistic, cmap='Reds', edgecolor='grey', ax=ax['pitch'])
    #legend to our plot
    ax_cbar = fig.add_axes((1, 0.093, 0.03, 0.786))
    cbar = plt.colorbar(pcm, cax=ax_cbar)
    fig.suptitle('Danger passes by ' + team + " per game", fontsize = 30)
    plt.show()

.. image-sg:: /gallery/lesson1/images/sphx_glr_plot_PassHeatMap_002.png
   :alt: Danger passes by England Women's per game
   :srcset: /gallery/lesson1/images/sphx_glr_plot_PassHeatMap_002.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 128-133

Making a diagram of most involved players
-----------------------------------------
To find out who was the most involved in danger passes, we keep only the surnames of the
players to make the visualization clearer. Then, we group the passes by player and count
them. We also divide the counts by the number of games so that the diagram shows values
per game. As the last step, we label the axes of the diagram.

.. GENERATED FROM PYTHON SOURCE LINES 133-145

.. code-block:: default

    #keep only surnames
    danger_passes["player_name"] = danger_passes["player_name"].apply(lambda x: str(x).split()[-1])
    #count passes by player and normalize them
    pass_count = danger_passes.groupby(["player_name"]).x.count()/no_games
    #make a bar chart
    ax = pass_count.plot.bar()
    #label the axes
    ax.set_xlabel("")
    ax.set_ylabel("Number of danger passes per game")
    plt.show()

.. image-sg:: /gallery/lesson1/images/sphx_glr_plot_PassHeatMap_003.png
   :alt: plot PassHeatMap
   :srcset: /gallery/lesson1/images/sphx_glr_plot_PassHeatMap_003.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 146-150

Challenge
----------------------------
1) Improve the code so that only shots with high xG (> 0.07) are included!
2) Make a heat map only for the Sweden player who was the most involved in danger passes!

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 1.848 seconds)
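
As a quick sanity check of the danger-pass logic described under "Finding danger passes", the sketch below runs the same merge, 15-second window, and de-duplication steps on a tiny hand-made table. The data and the helper name ``find_danger_passes`` are made up for illustration and are not part of mplsoccer or the StatsBomb data; only the column names mirror the ones used in the tutorial.

```python
import pandas as pd


def find_danger_passes(df_passes, df_shots, window=15):
    """Hypothetical helper (not an mplsoccer function).

    Keeps passes made at most `window` seconds before a shot in the same
    possession of the same match, with exactly one row per pass.
    """
    # Pair every pass with every shot from the same match and possession
    merged = df_shots.merge(df_passes, on=["match_id", "possession"],
                            how="inner", suffixes=("_shot", ""))
    # Positive values mean the pass happened before the shot
    merged["time_diff"] = merged["time_seconds_shot"] - merged["time_seconds"]
    in_window = merged[merged["time_diff"].between(0, window)]
    # A pass can fall in the window of several shots: keep the closest one
    nearest = in_window.groupby("id")["time_diff"].idxmin()
    return in_window.loc[nearest].reset_index(drop=True)


# Synthetic data (made up): one possession with shots at 100 s and 110 s
shots = pd.DataFrame({
    "match_id": [1, 1],
    "possession": [7, 7],
    "time_seconds": [100, 110],
})
passes = pd.DataFrame({
    "id": ["a", "b", "c"],
    "match_id": [1, 1, 1],
    "possession": [7, 7, 7],
    "time_seconds": [80, 97, 105],
})

danger = find_danger_passes(passes, shots)
print(danger[["id", "time_seconds", "time_seconds_shot", "time_diff"]])
```

Pass ``a`` is dropped because it was played more than 15 seconds before either shot, pass ``b`` falls in the window of both shots and is kept once with the earlier one, and pass ``c`` is paired with the second shot.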