2023 Boston Marathon - Variability in Finish Times

histograms
summary statistics
bimodal data
Describing finish time for runners in the 2023 Boston Marathon
Authors (Affiliation)

Ivan Ramler (St. Lawrence University)

Jack Fay (St. Lawrence University)

Published

May 13, 2024

Introduction

For this activity, you will explore the finish times of female and male runners who completed the 2023 Boston Marathon.

In particular, you will examine both visualizations and summary statistics to explore the variation in finish times, and you will use comparative techniques, such as z-scores, to compare and contrast male and female participants.

Investigating these trends is useful for several reasons. First, it can deepen our understanding of how factors such as gender affect marathon performance. Second, comparing the distribution of finish times with the performance of the top finishers provides insight into the competitive landscape of the marathon: it can identify outliers or exceptional performances and show how elite athletes compare to average participants. Although not directly connected to these data, analyses like these can inform training strategies, highlight the effectiveness of different preparation methods, and inspire both new and experienced runners by showcasing the range of achievable performances.

This activity would be suitable for an in-class example or quiz.

By the end of the activity, you will be able to:

  1. Analyze distributions using histograms
  2. Identify potential confounding variables to explain bimodal data
  3. Compare and contrast distributions for a pair of groups
  4. Calculate and compare z-scores for individual cases

For this activity, students will primarily use basic concepts of histograms and summary statistics to analyze distributions. Students will also likely require knowledge of z-scores.
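As a reminder of the z-score calculation, here is a minimal Python sketch. The function name `z_score` and every number in it are hypothetical illustrations, not values from the 2023 results.

```python
def z_score(value, mean, sd):
    """Return how many standard deviations `value` lies from `mean`."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / sd


# Hypothetical numbers for illustration only (NOT taken from the 2023 data):
# a runner finishing in 150 minutes, in a group with mean 230 and SD 40.
z = z_score(150, mean=230, sd=40)
print(z)  # -2.0
```

A negative z-score here means the runner finished faster than the group average. Computing each top finisher's z-score within their own group puts the two performances on a common scale.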

The provided worksheets do not require any specific statistical software, although students will likely need access to a calculator.

Since the data are provided, instructors are encouraged to modify the worksheets to have students construct visualizations and calculate summary statistics using whichever software they choose.
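For instructors who take this route, one possible starting point is sketched below in Python using only the standard library. The helper names `summarize` and `histogram_counts`, the 30-minute bin width, and the sample times are all assumptions made for illustration; they are not part of the provided worksheets.

```python
import statistics
from collections import Counter


def summarize(times):
    """Return basic summary statistics for a list of finish times (minutes)."""
    return {
        "n": len(times),
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "sd": statistics.stdev(times),  # sample standard deviation
        "min": min(times),
        "max": max(times),
    }


def histogram_counts(times, bin_width=30):
    """Count observations per bin; keys are the lower edge of each bin.

    Bins with no observations are simply absent from the result.
    """
    counts = Counter(int(t // bin_width) * bin_width for t in times)
    return dict(sorted(counts.items()))


# Hypothetical finish times in minutes (NOT taken from the 2023 data).
sample = [185.0, 200.5, 212.0, 214.5, 240.0, 251.0, 263.5, 301.0]

stats = summarize(sample)
bins = histogram_counts(sample)

# Print a rough text histogram, one row per non-empty bin.
for lower, count in bins.items():
    print(f"{lower:>3}-{lower + 30:<3} {'#' * count}")
```

Running the same two helpers separately on the female and male finish times gives the side-by-side comparison the activity asks for.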

Data

The data set contains 26,598 rows and 15 columns. Each row represents a runner who completed the 2023 Boston Marathon.

Download data:

Available on the SCORE Data Repository: boston_marathon_2023.csv

Variable Descriptions
Variable            Description
age_group           age group of the runner
place_overall       finishing place of the runner out of all runners
place_gender        finishing place of the runner among runners of the same gender
place_division      finishing place of the runner among runners of the same gender and age group
name                name of the runner
gender              gender of the runner
team                team the runner is affiliated with
bib_number          bib number of the runner
half_time           half marathon time of the runner
finish_net          finishing time of the runner, timed from when they cross the starting line
finish_gun          finishing time of the runner, timed from when the starter gun is fired
half_time_sec       half marathon time in seconds
finish_net_sec      net finish time in seconds
finish_gun_sec      gun finish time in seconds
finish_net_minutes  net finish time in minutes
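To show how these columns might be used, here is a hedged Python sketch that groups net finish times by gender. It uses only the column names `gender` and `finish_net_sec` from the table above. The function name, the gender codes "F" and "M", and the three demo rows are assumptions for illustration and are not taken from the actual file.

```python
import csv
import io
from collections import defaultdict


def net_minutes_by_gender(csv_file):
    """Group net finish times (converted to minutes) by the `gender` column."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Assumption: finish_net_sec holds a plain number of seconds.
        groups[row["gender"]].append(float(row["finish_net_sec"]) / 60)
    return dict(groups)


# Hypothetical two-column excerpt standing in for the real file;
# the gender codes "F" and "M" are assumed, not confirmed.
demo = io.StringIO(
    "gender,finish_net_sec\n"
    "F,9000\n"
    "M,8400\n"
    "F,12600\n"
)

groups = net_minutes_by_gender(demo)
print(groups)  # {'F': [150.0, 210.0], 'M': [140.0]}
```

With the downloaded file in the working directory, the same function could be called as `net_minutes_by_gender(open("boston_marathon_2023.csv", newline=""))`, assuming the file parses as a standard comma-separated file.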

Data Source

Boston Athletic Association

Materials

We provide editable MS Word handouts along with their solutions.

Class handout

Class handout - with solutions

In conclusion, the Boston Marathon Times worksheet provides valuable learning opportunities for students in several key areas. It allows them to understand reasons why variability might exist and to discover that multimodal distributions can occur simply because an important explanatory variable, one that otherwise confounds the analysis, has been excluded. Calculating z-scores or similar measures of relative location enables students to compare and contrast the remarkable achievements of the top female and male finishers relative to their respective fields. Overall, this worksheet allows students to critically analyze the 2023 marathon result data and draw meaningful conclusions about the extraordinary performances of athletes in the race.
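The point about multimodal distributions can be demonstrated with a small simulation. The Python sketch below uses made-up, normally distributed groups (not the marathon data) whose centers, spreads, and sizes are arbitrary assumptions. It pools the two groups and shows that the combined data have two peaks and a much larger standard deviation than either group alone.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Simulated, hypothetical groups (NOT the marathon data): same spread,
# different centers, standing in for a grouping variable left out of a plot.
group_a = [random.gauss(180, 10) for _ in range(500)]
group_b = [random.gauss(240, 10) for _ in range(500)]
pooled = group_a + group_b


def count_in(values, low, high):
    """Number of values in the half-open interval [low, high)."""
    return sum(low <= v < high for v in values)


# Rough text histogram of the pooled data; each '#' stands for 5 values.
for low in range(150, 270, 10):
    bar = "#" * (count_in(pooled, low, low + 10) // 5)
    print(f"{low}-{low + 10}: {bar}")

# The pooled spread is inflated by the gap between the two group centers.
print(statistics.stdev(group_a), statistics.stdev(group_b), statistics.stdev(pooled))
```

Plotting each group separately removes the second peak, which is the same effect students see when the marathon finish times are split by an explanatory variable.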