TLDR: There are important statistical oversights in Women in Tri UK’s recently released “Performance versus Participation” report. These theoretical issues are confirmed by real-world data. If we calculate the report’s key statistic for Kona Ironman World Championships where slots were allocated proportionally, we find that men have been almost as likely as women to finish within 15% of their age-group winning time. Assigning Kona slots based on finish times relative to age group winners would be (statistically speaking) unfair to athletes competing in larger fields.
As a recovering data scientist, would-be Kona competitor, and father of a triathlete daughter, I was interested to read Women in Tri UK’s recently released “Performance versus Participation” report. I strongly support their goal of increasing inclusivity in our sport and commend them for their use of data in attempting to answer an important question about the relative competitiveness of women’s versus men’s fields.
Despite my belief in the goals of Women in Tri, I am compelled to point out two issues with the report’s key statistic that women are much more likely than men to finish within 15% of their age-group winner.
The first problem is that we should expect this result in smaller fields regardless of the absolute competitiveness of the fields themselves.
To illustrate, consider what would happen if we compared men’s races with large fields to men’s races with small fields, where the men competing are randomly drawn from exactly the same population of triathletes. If the distribution of this population’s finish times has a central peak (as is the case in the real world), then the age group winners at the smaller races will tend to be slower than the age group winners at larger races – in other words, age-group winners at a smaller race are more likely to have a finish time nearer the center of the distribution. And because there are more competitors nearer the center of the distribution, there will on average be more finishers closer to this time than to a faster time.
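The thought experiment above can be sketched in a few lines of Python. This is only an illustration, not the code behind the results in this article: the triangular distribution (8 to 17 hours, peak at 12.5 hours), the field sizes of 20 and 200, and the function name `mean_winning_time` are all assumptions chosen for the example.

```python
import random
import statistics

def mean_winning_time(field_size, trials=2000, seed=0):
    """Average fastest finish time (hours) over many simulated races of one size."""
    rng = random.Random(seed)
    winners = []
    for _ in range(trials):
        # Assumed population: peaked distribution from 8 to 17 hours, mode at 12.5.
        times = [rng.triangular(8.0, 17.0, 12.5) for _ in range(field_size)]
        winners.append(min(times))
    return statistics.mean(winners)

# Both field sizes draw from exactly the same population of athletes.
small = mean_winning_time(20)
large = mean_winning_time(200)
print(f"mean winning time with 20 starters:  {small:.2f} h")
print(f"mean winning time with 200 starters: {large:.2f} h")
```

Because the winner is simply the minimum of the sample, a smaller sample tends to produce a slower minimum, which sits closer to the crowded middle of the distribution.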
As an empirical verification, I collected data on finishing times in full Ironman races (from http://www.coachcox.com). These real-world data confirm that the gap in measured “competitiveness” effectively disappears if we downsample men’s data to roughly equalize men’s and women’s field sizes.
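One plausible way to carry out this kind of downsampling is sketched below. It is not the exact procedure used for the results above: the helper names, the number of random draws, and the placeholder finish times (generated here rather than taken from real results) are assumptions for illustration, and counting the winner among those "within 15%" is also an assumption.

```python
import random

def share_within_15pct(times):
    """Share of finishers within 15% of the fastest time in the group."""
    cutoff = 1.15 * min(times)
    return sum(t <= cutoff for t in times) / len(times)

def downsampled_share(men_times, women_count, draws=1000, seed=0):
    """Average the statistic over random men's subsamples the size of the women's field."""
    rng = random.Random(seed)
    k = min(women_count, len(men_times))
    shares = [share_within_15pct(rng.sample(men_times, k)) for _ in range(draws)]
    return sum(shares) / draws

# Placeholder data only: 200 synthetic men's finish times in hours.
data_rng = random.Random(1)
men_times = [data_rng.triangular(8.0, 17.0, 12.5) for _ in range(200)]

full_share = share_within_15pct(men_times)
matched_share = downsampled_share(men_times, women_count=40)
print(f"full men's field:       {full_share:.3f}")
print(f"downsampled to 40 men:  {matched_share:.3f}")
```

Averaging over many random subsamples keeps the comparison from depending on one lucky or unlucky draw.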
To further demonstrate this effect, I generated synthetic race data. I created two groups within this dataset, a “large field” group and a “small field” group, where the number of competitors in each age category was randomly drawn from a distribution with mean and standard deviation equal to that observed in full Ironman race results between 2012 and 2025. Finishing times for both small- and large-field competitors were then drawn from another distribution ranging from 8 to 17 hours and featuring a central peak. After assembling this synthetic data, I calculated Women in Tri’s key statistic for “large field competitors” versus “small field competitors”. “Small field competitors” were much more likely to be within 15% of their age-group winning time even though these competitors were statistically identical to “large field competitors.” I re-ran this analysis many times, using multiple parameter values, and the result, not surprisingly, is quite robust to changes. All that is necessary to ensure this result is that the distribution of relative finishing times increases smoothly toward a central peak that lies more than 15% behind age-group winners’ times.
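A stripped-down version of this synthetic experiment might look like the following. It is a sketch under stated assumptions, not the original analysis code: the field-size means and standard deviations (25 ± 10 and 150 ± 50) are made-up placeholders rather than the values observed in real race results, and the triangular finish-time distribution is one simple choice of a peaked distribution between 8 and 17 hours.

```python
import random

def draw_field_sizes(mean, sd, races, rng):
    """Draw one field size per race, with at least two finishers per field."""
    return [max(2, round(rng.gauss(mean, sd))) for _ in range(races)]

def share_within_15pct(field_sizes, rng):
    """Pooled share of finishers within 15% of their own field's winning time."""
    within = total = 0
    for n in field_sizes:
        # Every athlete comes from the same assumed population of finish times.
        times = [rng.triangular(8.0, 17.0, 12.5) for _ in range(n)]
        cutoff = 1.15 * min(times)
        within += sum(t <= cutoff for t in times)
        total += n
    return within / total

rng = random.Random(42)
small_fields = draw_field_sizes(mean=25, sd=10, races=2000, rng=rng)   # placeholder values
large_fields = draw_field_sizes(mean=150, sd=50, races=2000, rng=rng)  # placeholder values

small_share = share_within_15pct(small_fields, rng)
large_share = share_within_15pct(large_fields, rng)
print(f"small fields: {small_share:.3f} within 15% of the winner")
print(f"large fields: {large_share:.3f} within 15% of the winner")
```

Changing the seed, the field-size parameters, or the location of the peak is a quick way to check how sensitive the gap is to these choices.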
A second related problem with the report is that defining “competitiveness” relative to individual races’ age-group winners does not permit comparison across races.
This again has to do with field sizes. A slower competitor is more likely to win in a smaller field. Finishing within 15% of the age group winner’s time in a larger field does not mean the same thing as finishing within 15% of the winner’s time in a smaller field. All else equal, larger fields tend to have stronger winners than smaller fields.
One way to understand this problem is to use a less relative benchmark for comparison than individual races’ age-group winners, and then to adjust for age and gender. For instance, instead of comparing to age-group winners, we could compare to the overall race winner. While overall winners are still not perfectly comparable across races, they should be much less subject to the “small field” issue than individual age-group winners. To adjust for age and gender, we could calculate the percentage time gap between the age-group winner and the overall race winner at each race for each age group, and then select, say, the 98th or 99th percentile fastest relative age-group winner for each age group across all races (we can’t drop too far in quantile or we will again run into a “small field issue”). This athlete’s percentage time gap can be used as an age-gender adjustment factor relative to the overall race winner. This allows us to define a “theoretical benchmark finish time” for any race. (Note that there are myriad issues to consider here, and I am not proposing use of this benchmark in any real-world application, especially not in assigning Kona slots. The point is merely to illustrate what happens if we use a less relative benchmark.)
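The benchmark construction described above could be sketched roughly as follows. This is one possible reading of the steps, not the code used for this article: the data layout, the function names, the nearest-rank percentile rule, and the tiny placeholder numbers are all assumptions for illustration.

```python
def adjustment_factors(races, fastest_pct=2.0):
    """Per age group, pick a near-best gap between age-group winner and overall winner."""
    gaps = {}
    for race in races:
        overall = race["overall_winner"]
        for group, winner_time in race["age_group_winners"].items():
            # Percentage gap expressed as a fraction of the overall winner's time.
            gaps.setdefault(group, []).append(winner_time / overall - 1.0)
    factors = {}
    for group, values in gaps.items():
        values.sort()
        # Nearest-rank pick: a gap that only about fastest_pct% of races beat.
        index = min(len(values) - 1, int(len(values) * fastest_pct / 100.0))
        factors[group] = values[index]
    return factors

def benchmark_time(overall_winner, group, factors):
    """Theoretical benchmark finish time for one age group at one race."""
    return overall_winner * (1.0 + factors[group])

def share_within_15pct_of_benchmark(times, benchmark):
    """Share of finishers within 15% of the benchmark rather than the actual winner."""
    return sum(t <= 1.15 * benchmark for t in times) / len(times)

# Placeholder numbers only (hours), not real results.
races = [
    {"overall_winner": 8.0, "age_group_winners": {"F40-44": 10.0, "M40-44": 9.0}},
    {"overall_winner": 8.5, "age_group_winners": {"F40-44": 11.05, "M40-44": 9.775}},
]
factors = adjustment_factors(races)
benchmark = benchmark_time(8.5, "F40-44", factors)
share = share_within_15pct_of_benchmark([11.0, 12.0, 13.0], benchmark)
print(factors, benchmark, share)
```

With only two placeholder races the percentile pick collapses to the smallest gap; with many races it would select a near-best rather than the single best performance, which is the point of not relying on the extreme value.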
One might think of this as posing a similar question to the Women in Tri analysis: “How near the age-group winner would competitors have finished if one of the world’s best athletes in the age group (on one of their best days) had won?”
I applied the above approach to real-world Ironman finishing time data. I ran this analysis over varying time periods and found that when we compare performance to a less relative benchmark time, the gap in competitiveness between males and females effectively disappears.
Finally, I conducted a test of proportional slot allocation’s effect on the competitive balance between male and female fields at Kona Ironman World Championships. To do so, I recalculated the key statistic from the “Performance versus Participation” report using Kona results from 2003 to 2019 (years where slot allocation was largely proportional). If it is true that female fields tend to be more competitive than male fields at qualifying races, then the qualifying standard for women would effectively be higher than for men. We should expect that any disparity in competitiveness between female and male fields would persist at the World Championship. However, this is not the case. When slots were allocated proportionally, 46.8% of women at the Kona World Championships finished within 15% of their age-group winner; this was true for 44.6% of men. This indicates that Ironman’s slot allocation over this period was slightly biased in favor of men, but only slightly. Presumably, additional Women For Tri slots would narrow or reverse this imbalance.
It is important to note that the above findings do not rule out the possibility that women’s finish times tend to be “shifted left” relative to men’s, meaning that it is effectively harder to qualify for Kona as a woman under proportional slot allocation. The above analysis does, however, demonstrate the unfairness of allocating slots based on age-group winners’ times. It would be statistically incorrect to rely on Women in Tri’s key statistic as a measure of competitiveness or as a basis for a slot allocation methodology. It might make sense to employ this statistic at the Kona Ironman World Championship as one measure of how Ironman’s slot allocation is working to ensure competitive balance across female and male fields at World Championships.
I applaud Women in Tri and Ironman for their exploration of data and their efforts to increase female field sizes. I hope further data analysis work will continue.
If I can provide more information or a copy of the Python code used to produce the above results, please do not hesitate to reach out.