Towards a better method of slot allocation

Guys, did you not just read my absurdly long post?! These theories about the women’s field being fundamentally different-shaped are simply not backed up by the data.

No I agree with your analysis 100%.

The issue though is that any system is going to have its flaws, just as the current system does - you’re allocating a scarce resource, and any way of distributing that resource is going to have edge cases: people who should have gotten the KQ slot but lost out because this particular method advantaged someone else.

Even Boston doesn’t let you know you got in when you ran your race, you have to wait until the applications are in and then there’s the buffer you have to beat. Boston has an advantage here since they don’t control the feeder races and they don’t really care if one course is faster than another. They just accept applications and pick the fastest 30k people vs the time qualifier. Whereas if they put this in, everyone would flock to Maryland and Florida over the hillier courses (more than they already do).

Short of having a Thorsten-type mega model that crunches a bunch of data, cross references weather conditions and course elevation, and spits out some sort of athlete score, we’re always going to have to make some trade-offs in how KQ slots get allocated. (as we do today under the current system)

I’d be more than fine with a multi-factor athlete score, and then you just pick the top 50 or 75 scores at each race and be done with it, but I suspect it would be much too hard to communicate to the general public.

I think there is no chance Ironman moves to any slot allocation model that would not allow them to do the slot allocation same day or next day and have people pay immediately.

2 Likes

100% agree, but I suspect maybe for different reasons. I don’t want to put words in your mouth so please correct me if I misinterpret your wording, but I think you would lean on the pay immediately as their primary motivator. While I agree that has a part in it I also think making it 100% based on being the top performers removes a great deal of the fun of the roll down. That hope, that anticipation that the 5-50th place competitor feels that maybe, just maybe they have a chance on that day to secure a spot is a MASSIVE marketing and retention tool. There is no way they give that up.

Boston is a horrible comparison, as their business model is now fed by other races rather than by hosting the feeder races. A better comparison is UTMB. While you can earn your way in by being the best, you also have other pathways through the stones lottery. If Ironman were to take a purely performance-based approach to at-race slot allocation, they would need to implement something like the UTMB stones to keep that hope alive.

2 Likes

Oh yes, the rolldown is something they would also be loath to see go away.

I’ll point out also that even Boston’s objective sporting criteria are themselves policy-based. The times are set, and modified, in order to get the age/gender distribution desired. If there are too many old guys and not enough older women, then Boston makes the adjustment every so often to the time standard (when they drop times every 5 years or so, they don’t do it uniformly).

We can’t have this discussion outside of the lens of - what policy do we want for our championship? Again, whatever method is used will advantage one group over another, and at the end of the day, IM will have to sort out its policy choices vs its desired outcome.

1 Like

Replying generally to the thread- Ironman has years upon years of data on these races. They should do it like they do swim meet qualifications and the Boston Marathon- automatic and consideration times. They could create times for each race and age group. Give away half the slots with automatic times, and give away the rest in June or July before the race.

1 Like

Agree with pretty much everything you’re saying here. Good callout that even with Boston you don’t really know until later.

The current system has flaws and tradeoffs, just like any system. Some benefits (simplicity, same-day decision, roughly equal proportional rewards) some downsides (dependent on turnout, “swingy” decisions). Some “quirks” that IMO are kind of charming and add to the magic (the chance of a super deep rolldown, the tense air at awards waiting to see who takes their slot, getting stuck in traffic driving to awards and wondering if your Kona dreams are doomed by the broken down car…).

The downsides 100% exist, what bugs me is when everyone rushes to declare them “women’s problems.” Nobody has shown any evidence that the downsides systematically affect women uniquely. They systematically affect small AGs, many of which are in fact male.

“Well that’s just semantics. If almost all women race in small AGs then ‘small AG problems’ are, practically speaking, tantamount to ‘women’s problems’”

That’s IMO an important point, and worth considering. Let’s not forget the older male AGs, though.

I think my response to this is to point out that the “bad breaks” (which harm small AGs worse) are balanced by an equal amount of “good breaks” (which benefit small AGs more). Small AGs are hit worse by swingy results, but the swing goes both ways. Statistically speaking, for every race that’s unusually stacked with top athletes, there is a race where unusually few top athletes show up. Folks in small AGs are much more likely to get lucky with a soft field and nab a slot even if they aren’t quite as objectively strong. In large AGs it’s much harder to get lucky, because the bigger sample size means things tend to hew closer to the expected distribution.

I think this works in theory, but in practice you get lots of stories about how women tend to need to win the age group or they don’t KQ. Whether or not the complaint is valid, the perception is that the smallish age groups which fluctuate between 1-2 slots tend to be harsher for those on the bubble: the cohort of people who won’t win the AG, but who you could probably argue deserve to go (pick your metric). This is tougher when you add in human choice - it can be rough if there’s an outlier athlete in your age group in your area who tends to pick the same races as you do.

Maybe it is just a perception, but it’s hard to stomach the bad break when you only do an IM every other year or whatever. At least for 70.3 Worlds, you can get 2-3 chances per year and a bad break only stings for a few months.

TLDR: There are important statistical oversights in Women in Tri UK’s recently released “Performance versus Participation” report. These theoretical issues are confirmed by real-world data. If we calculate the report’s key statistic for Kona Ironman World Championships where slots were allocated proportionally, we find that men have been almost as likely as women to finish within 15% of their age-group winning time. Assigning Kona slots based on finish times relative to age group winners would be (statistically speaking) unfair to athletes competing in larger fields.


As a recovering data scientist, would-be Kona competitor, and father of a triathlete daughter, I was interested to read Women in Tri UK’s recently released “Performance versus Participation” report. I strongly support their goal of increasing inclusivity in our sport and commend them for their use of data in attempting to answer an important question about the relative competitiveness of women’s versus men’s fields.

Despite my belief in the goals of Women in Tri, I am compelled to point out two issues with the report’s key statistic that women are much more likely than men to finish within 15% of their age-group winner.

The first problem is that we should expect this result in smaller fields regardless of the absolute competitiveness of the fields themselves.

To illustrate, consider what would happen if we compared men’s races with large fields to men’s races with small fields, where the men competing are randomly drawn from exactly the same population of triathletes. If the distribution of this population’s finish times has a central peak (as is the case in the real world), then the age group winners at the smaller races will tend to be slower than the age group winners at larger races – in other words, age-group winners at a smaller race are more likely to have a finish time nearer the center of the distribution. And because there are more competitors nearer the center of the distribution, there will on average be more finishers closer to this time than to a faster time.

As an empirical verification, I collected data on Ironman finishing times in full Ironman races between (from http://www.coachcox.com). These real-world data confirm that the gap in measured “competitiveness” effectively disappears if we downsample men’s data to roughly equalize men’s and women’s field sizes.

To further demonstrate this effect, I generated synthetic race data. I created two groups within this dataset, a “large field” group and a “small field” group, where the number of competitors in each age category was randomly drawn from a distribution with mean and standard deviation equal to that observed in full Ironman race results between 2012 and 2025. Finishing times for both small- and large-field competitors were then drawn from another distribution ranging from 8 to 17 hours and featuring a central peak. After assembling this synthetic data, I calculated Women in Tri’s key statistic for “large field competitors” versus “small field competitors”. “Small field competitors” were much more likely to be within 15% of their age-group winning time even though these competitors were statistically identical to “large field competitors.” I re-ran this analysis many times, using multiple parameter values, and the result, not surprisingly, is quite robust to changes. All that is necessary to ensure this result is that the distribution of relative finishing times is smoothly increasing toward a central peak located more than 15% behind age-group winners’ times.
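The small-field effect described above can be reproduced with a few lines of Python. This is only a rough sketch, not the code behind the post: the fixed field sizes (10 and 100), the triangular distribution peaking at 12.5 hours, and the function name `share_within_15pct` are all assumptions chosen for illustration.

```python
import random

def share_within_15pct(field_size, n_races, rng):
    """Average share of non-winning finishers within 15% of the age-group winner."""
    within = 0
    total = 0
    for _ in range(n_races):
        # Assumed shape: triangular on 8-17 hours with a central peak at 12.5 hours.
        times = sorted(rng.triangular(8.0, 17.0, 12.5) for _ in range(field_size))
        cutoff = times[0] * 1.15  # 15% slower than the winner of this race
        # Exclude the winner so small fields are not flattered by counting them.
        within += sum(1 for t in times[1:] if t <= cutoff)
        total += field_size - 1
    return within / total

rng = random.Random(42)  # fixed seed so the run is repeatable
small = share_within_15pct(field_size=10, n_races=5000, rng=rng)
large = share_within_15pct(field_size=100, n_races=5000, rng=rng)
print(f"small fields: {small:.1%} within 15% of the winner")
print(f"large fields: {large:.1%} within 15% of the winner")
```

Both groups are drawn from the same population, so any gap between the two printed shares comes from field size alone.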

A second related problem with the report is that defining “competitiveness” relative to individual races’ age-group winners does not permit comparison across races.

This again has to do with field sizes. A slower competitor is more likely to win in a smaller field. Finishing within 15% of the age group winner’s time in a larger field does not mean the same thing as finishing within 15% of the winner’s time in a smaller field. All else equal, larger fields tend to have stronger winners than smaller fields.

One way to understand this problem is to use a less relative benchmark for comparison than individual races’ age-group winners, and then to adjust for age and gender. For instance, instead of comparing to age-group winners, we could compare to the overall race winner. While overall winners are still not perfectly comparable across races, they should be much less subject to the “small field” issue than individual age-group winners. To adjust for age and gender, we could calculate the percentage time gap between the age-group winner and the overall winner at each race for each age group, and then select, say, the 98th or 99th percentile fastest relative age-group winner for each age group across all races (we can’t drop too far in quantile or we will again run into a “small field issue”). This athlete’s percentage time gap can be used as an age-gender adjustment factor relative to the overall race winner. This allows us to define a “theoretical benchmark finish time” for any race. (Note that there are myriad issues to consider here, and I am not proposing use of this benchmark in any real-world application, especially not in assigning Kona slots. The point is merely to illustrate what happens if we use a less relative benchmark.)

One might think of this as posing a similar question to the Women in Tri analysis: “How near the age-group winner would competitors have finished if one of the world’s best athletes in the age group (on one of their best days) had won?”
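One possible reading of the benchmark idea, sketched in Python for a single age group. This is not the code behind the post: the race results are made-up toy numbers, the nearest-rank quantile rule is an assumption, and the function names are hypothetical. With only five toy races the "98th percentile fastest" gap collapses to the smallest gap; a real dataset would have many more races per age group.

```python
import math

# Hypothetical results for one age group: (overall winner, age-group winner), in hours.
races = [(8.0, 10.0), (8.0, 9.6), (8.5, 10.2), (8.2, 10.66), (7.9, 9.875)]

def relative_gap(overall_winner, ag_winner):
    """Fractional time gap of the age-group winner behind the overall winner."""
    return ag_winner / overall_winner - 1.0

def adjustment_factor(gaps, top_quantile=0.98):
    """Gap of the age-group winner who was relatively faster than about
    `top_quantile` of all age-group winners (nearest-rank rule, an assumption)."""
    ordered = sorted(gaps)  # smallest gap first = strongest relative performance
    rank = max(1, math.ceil((1.0 - top_quantile) * len(ordered)))
    return ordered[rank - 1]

def benchmark_time(overall_winner, factor):
    """Theoretical benchmark finish time for this age group at a given race."""
    return overall_winner * (1.0 + factor)

gaps = [relative_gap(overall, ag) for overall, ag in races]
factor = adjustment_factor(gaps)
for overall, ag in races:
    bench = benchmark_time(overall, factor)
    print(f"overall {overall:.2f}h  AG winner {ag:.2f}h  benchmark {bench:.2f}h")
```

Finishers would then be compared against the benchmark for their race rather than against whoever happened to win the age group that day.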

I applied the above approach to real-world Ironman finishing time data. I ran this analysis over varying time periods and found that when we compare performance to a less relative benchmark time, the gap in competitiveness between males and females effectively disappears.

Finally, I conducted a test of proportional slot allocation’s effect on the competitive balance between male and female fields at Kona Ironman World Championships. To do so, I recalculated the key statistic from the “Performance versus Participation” report using Kona results from 2003 to 2019 (years where slot allocation was largely proportional). If it is true that female fields tend to be more competitive than male fields at qualifying races, then the qualifying standard for women would effectively be higher than for men. We should expect that any disparity in competitiveness between female and male fields would persist at the World Championship. However, this is not the case. When slots were allocated proportionally, 46.8% of women at the Kona World Championships finished within 15% of their age-group winner; this was true for 44.6% of men. This indicates that Ironman’s slot allocation over this period was slightly biased in favor of men, but only slightly. Presumably, additional Women For Tri slots would narrow or reverse this imbalance.

It is important to note that the above findings do not rule out the possibility that women’s finish times tend to be “shifted left” relative to men’s, meaning that it is effectively harder to qualify for Kona as a woman under proportional slot allocation. The above analysis does, however, demonstrate the unfairness of allocating slots based on age-group winner’s times. It would be statistically incorrect to rely on Women in Tri’s key statistic as a measure of competitiveness or as a basis for a slot allocation methodology. It might make sense to employ this statistic at the Kona Ironman World Championship as one measure of how Ironman’s slot allocation is working to ensure competitive balance across female and male fields at World Championships.

I applaud Women in Tri and Ironman for their exploration of data and their efforts to increase female field sizes. I hope further data analysis work will continue.

If I can provide more information or a copy of the Python code used to produce the above results, please do not hesitate to reach out.

Best regards and happy training,

Jesse Czelusta

3 Likes

Thanks Jesse. This aligns with my own analysis posted earlier, glad to see someone else arriving at the same result. The report is well intentioned, but comes to conclusions that simply aren’t supported by the data.

Downsampling the men’s data was the next experiment I wanted to try. Thank you for doing this.

So, I just saw the news of what they are now changing it to.
Not to brag, but it seems like pretty much what I suggested.
Where what I called the “expected winning time” is now calculated purely based on “past data on other courses” those other courses being Kona only and past data being the last 5 years. I like how they carry that information just as a factor compared to the fastest AG. Makes it computationally simple. That is, if you have a way to calculate 9:14:32 times 0.9873.

It is just (9x60 + 14.5) x 0.9873

or

(540+14.5) x 0.9873

or

0.9873 (554.5)

This gets it close enough (probably).
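For anyone who wants the exact figure rather than a rounded one, here is a minimal Python sketch that does the multiplication in seconds. The helper names are made up for illustration, and 0.9873 is simply the example factor from the post above, not a confirmed value for any particular age group.

```python
def to_seconds(clock):
    """Convert an 'h:mm:ss' finish time to total seconds."""
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def to_clock(total_seconds):
    """Format a whole number of seconds as 'h:mm:ss'."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"

def graded_time(clock, factor):
    """Multiply a finish time by an age-grading factor, rounded to the second."""
    return to_clock(round(to_seconds(clock) * factor))

print(graded_time("9:14:32", 0.9873))  # 9:07:29
```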

In any case, it should be calculated and shown for everyone in the Ironman app shortly

Yeah, thanks for explaining :wink:
My point was: it’s not something most people immediately do in their head.

haha, I can write down how to do the calculation, but I would not be able to compute it after I cross the finish line.

From what I understand, sportstats will do the math and push your age graded time and age graded position into the fields beside your name in the Ironman app, so it will take the guesswork out. It’s easy for them to calculate that in the back end anyway
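As a rough idea of what that back-end step might look like, here is a Python sketch that grades each finish time and ranks everyone on the graded result. Everything in it is hypothetical: the factors, athletes, and times are made up for illustration, and this is not how sportstats or Ironman are known to implement it.

```python
# Made-up grading factors per age group; the real published values will differ.
FACTORS = {"M30-34": 1.0000, "M45-49": 0.9500, "F65-69": 0.6100}

# Made-up finishers: (name, age group, finish time as h:mm:ss).
finishers = [
    ("A", "M30-34", "9:20:00"),
    ("B", "F65-69", "15:05:00"),
    ("C", "M45-49", "9:50:00"),
]

def to_seconds(clock):
    """Convert an 'h:mm:ss' finish time to total seconds."""
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def graded_ranking(results, factors):
    """Return (name, graded seconds) pairs sorted by age-graded time, fastest first."""
    graded = [
        (name, to_seconds(clock) * factors[age_group])
        for name, age_group, clock in results
    ]
    return sorted(graded, key=lambda pair: pair[1])

ranking = graded_ranking(finishers, FACTORS)
for position, (name, seconds) in enumerate(ranking, start=1):
    print(f"{position}. {name}  graded {seconds / 3600:.2f} h")
```

With these invented numbers the F65 athlete ranks first on graded time despite finishing hours later on the clock, so the final order cannot be known until the slower raw times are in.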

If they were really smart, they’d be able to calculate virtual placement at each timing mat, so that even when you finished you could go back and at least know where you stood at the end of the bike - the same way you can do now vs people who started behind you in the swim.

For extra bonus points, they’d have separate Swim/T1/Bike/T2/Run coefficients, to recognize there’s probably a difference between the sports by age/gender, but I know that’s asking for too much. (E.g. if men go faster proportionally on the bike, the bike would have a different coefficient but the sum would add to the published total)

1 Like

It might be quite a wait for an M30 competing with an F65 for the last slot from the pool. Her ungraded time will be hours behind his.

What we need is a hot seat in the finish area for the current last qualifier!

2 Likes

Yes! And stick a camera on them and do a live stream like they do with the finish line camera

1 Like