Re: A New Approach To Predict Performance [brbbiking] [ In reply to ]
Maybe I missed it, but once more, what are the individual data points in this new model?

If each workout is a single data point, then I'd still think that the NN overfits, and quite badly at that.
Because even with logistic regression, which is much more stable and, with appropriate controls, much less likely to overfit, you need at least 100 data points just to estimate the model intercept reliably. There's no way around the maths, unfortunately, so while a proposed model might work quite well on a limited sample of athletes (note that in this particular case it is people with a VO2 max of 65+), it might be much less predictive when generalized to the "mortal" population.
And an NN is, in the end, just a bunch of polynomial regressions put into a single model ;) https://arxiv.org/abs/1806.06850
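
For those following along, a figure of roughly 100 can be reproduced with a back-of-envelope calculation. This is only a sketch under an assumption that is not stated in the post: with no predictors, estimating the intercept amounts to estimating a single proportion, and we ask for a 95% margin of error of +/-0.1 in the worst case (p = 0.5). The function name and the chosen margin are illustrative.

```python
import math
from statistics import NormalDist


def min_n_for_proportion(margin, confidence=0.95, p=0.5):
    """Smallest n for which a normal-approximation confidence interval
    for a proportion p has a half-width of at most `margin`."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z * z * p * (1 - p) / margin ** 2)


# Assumed target: +/-0.1 on the predicted probability, worst case p = 0.5.
print(min_n_for_proportion(0.10))  # 97
```

Tightening the margin to +/-0.05 roughly quadruples the requirement, which is why small per-athlete data sets are a concern.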

On the other hand, I absolutely applaud the effort to modernize performance prediction models that are now 30+ years old. As correctly noted by Alan, real-life experience clearly shows that performance doesn't depend only on such a simplified metric as TRIMPS/TSS, so incorporating additional parameters ("features") in the model should improve prediction accuracy, provided an individual athlete has a sufficient training sample size (which might be the real issue here).

One should also, however, think carefully about model selection. While the "nonlinearity" of NNs seems attractive at first, logistic regression with penalized splines can model nonlinear effects quite well, provided the additivity assumptions still hold, with the big added bonus that each parameter's impact on the final model is clearly explainable. Going a couple of steps further, I'd guess that adopting a Bayesian modelling framework should work even better here, because at the end of the day it would give a full predictive distribution of performance for a particular athlete, and credible intervals are that much easier to explain in practice.
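
To make "full predictive distribution" concrete, here is a deliberately simplified sketch. It is not the model proposed in this thread: it assumes a single athlete, normally distributed test results with a known noise standard deviation, and a normal prior on the athlete's true level (a conjugate normal-normal model). All names and numbers are hypothetical.

```python
from statistics import NormalDist


def predictive_distribution(results, noise_sd, prior_mean, prior_sd):
    """Posterior predictive distribution for the next test result under a
    normal likelihood with known noise_sd and a normal prior on the mean."""
    prior_precision = 1 / prior_sd ** 2
    data_precision = len(results) / noise_sd ** 2
    post_var = 1 / (prior_precision + data_precision)
    post_mean = post_var * (prior_mean * prior_precision
                            + sum(results) / noise_sd ** 2)
    # Predictive variance = uncertainty about the mean + test-to-test noise.
    return NormalDist(post_mean, (post_var + noise_sd ** 2) ** 0.5)


# Hypothetical athlete: three past test results in watts.
pred = predictive_distribution([300, 310, 290], noise_sd=10,
                               prior_mean=280, prior_sd=30)
low, high = pred.inv_cdf(0.025), pred.inv_cdf(0.975)
print(f"95% predictive interval: {low:.0f} to {high:.0f} W")
```

The interval reads directly as "given what we know, the next result should fall in this range with 95% probability", which is the explainability advantage mentioned above.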

----------------------------
Need more W/CdA.
Re: A New Approach To Predict Performance [mrlobber] [ In reply to ]
Thanks for the support and the feedback.

Data points in the NN described in the blog are a 28-day rolling average of sessions, with inputs of volume and intensity for the NN and TSS for the Banister model. I have also tested a plain feed-forward neural network with multiple 28-day windows (marginally better fit) and a recurrent neural network with multiple (sequential) time windows (marginally better fit again). Still, I was underwhelmed by the addition of longer time frames, and I agree with you that the greatest improvement in the model will come from the addition of more features.
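
For those following along, the two kinds of input can be sketched roughly as below. This is an illustration rather than the code behind the blog: it assumes one value per day (rest days entered as 0), a trailing window, and the standard two-component Banister impulse-response form. The function names and the default constants (k1, k2, tau1, tau2) are placeholders, not fitted values.

```python
import math
from collections import deque


def rolling_mean(daily_values, window=28):
    """Trailing mean over the last `window` days; None until a full window exists."""
    buf = deque(maxlen=window)
    total = 0.0
    out = []
    for v in daily_values:
        if len(buf) == window:
            total -= buf[0]  # this value is about to drop out of the window
        buf.append(v)
        total += v
        out.append(total / window if len(buf) == window else None)
    return out


def nn_inputs(daily_volume, daily_intensity, window=28):
    """Pairs of (mean volume, mean intensity), one per day with a full window."""
    vol = rolling_mean(daily_volume, window)
    inten = rolling_mean(daily_intensity, window)
    return [(v, i) for v, i in zip(vol, inten) if v is not None]


def banister_predict(daily_tss, p0=0.0, k1=1.0, k2=2.0, tau1=42.0, tau2=7.0):
    """Predicted performance for each day from all load on earlier days:
    p(t) = p0 + k1 * fitness(t) - k2 * fatigue(t),
    where both terms are exponentially decayed sums of past TSS."""
    fitness = fatigue = 0.0
    preds = []
    for w in daily_tss:
        preds.append(p0 + k1 * fitness - k2 * fatigue)
        # Add today's load, then decay everything by one day.
        fitness = (fitness + w) * math.exp(-1 / tau1)
        fatigue = (fatigue + w) * math.exp(-1 / tau2)
    return preds
```

Note that the Banister model sees only a single number per day (TSS), while the rolling-window inputs keep volume and intensity separate.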

Very fair point on overfitting. Importantly, all accuracy comparisons between models were made on the test set for each model. (For those following along: neural networks are so flexible that they can approximate pretty much any data set, so it's important to 'hold out' a portion of the data and 'test' the model we've created against data it hasn't seen.) Hyper-parameters of the NN were also tuned against a validation set (via k-fold cross-validation).
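
A minimal sketch of that workflow, using only the standard library. The helper names, the 20% test fraction and k = 5 are assumptions for illustration; the actual split used for the blog isn't stated here.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of the data and hold out a fraction as the test set."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def k_fold(items, k=5):
    """Yield (train, validation) pairs; every item is used for validation once."""
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, folds[i]


data = list(range(100))              # stand-in for 100 data points
train, test = train_test_split(data)
for fit_part, val_part in k_fold(train):
    pass  # tune hyper-parameters on fit_part, score them on val_part
# Only after tuning is finished is the final model scored once on `test`.
```

One caveat: with overlapping rolling windows, neighbouring data points share sessions, so a random shuffle can leak information between the sets. A chronological split is the more conservative choice in that case.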

Also fair points on the benefits of a simpler model. I do think, though, that the flexibility of the NN might be important in this case given the variety of load/response patterns: steady improvement, diminishing returns, failing adaptation and every combination thereof. But I aim to continue testing as many model types as possible in the never-ending quest for that tiny RMSE :)

Thanks again for the feedback.

Alan Couzens, M.Sc. (Sports Science)
Exercise Physiologist/Coach
Twitter: https://twitter.com/Alan_Couzens
Web: https://alancouzens.com
Last edited by: Alan Couzens: Mar 5, 19 18:30
Re: A New Approach To Predict Performance [olmec] [ In reply to ]
olmec wrote:

PS While I love calling all of this AI, these NNs are just ML... (complex function approximators)


Difference between machine learning (ML) and AI:
If it is written in Python, it's probably machine learning
If it is written in PowerPoint, it's probably AI

https://twitter.com/...379612282885?lang=en

;-)
Re: A New Approach To Predict Performance [doug in co] [ In reply to ]
There's an older version of the same joke. If it works, it's machine learning. If it doesn't, it's AI.
Re: A New Approach To Predict Performance [Alan Couzens] [ In reply to ]
Can you plug nutrition, height, weight, stress level, personality traits, etc. into the neural network model to increase its predictive power?

I assume that eventually we will be able to plug in DNA info as well, to understand how various genetic traits correlate with performance.
Re: A New Approach To Predict Performance [hadukla] [ In reply to ]
hadukla wrote:
LAI wrote:
AdamL2424 wrote:
Wonder if Coggan will chime in on this thread.


I think he might have been part of the purge.


Last logged on Oct. 20 so yeah

There was a purge and I wasn't purged? Please tell me more...

Indoor Triathlete - I thought I was right, until I realized I was wrong.
Re: A New Approach To Predict Performance [Anton84] [ In reply to ]
Absolutely! That is the major strength of a Neural Network - the number of 'features' it can potentially handle. At the extreme, even individual pixels can be considered separate features for the purpose of image recognition - thousands upon thousands of individual inputs. So, a NN (even a very basic NN) has no problem with testing out all of the features you can throw at it - body comp/weight, HRV, life stress, sleep hours/quality etc.

More features are certainly no guarantee of better performance (at least on small datasets). The more complex you make the model, the greater the risk of over-fitting to the training data (and thus of predicting poorly on the test data). But compared with the current 'standard' of using only one input variable (TSS), and knowing the performance improvement we can already get by simply separating TSS into two variables, volume and intensity, I believe there is *A LOT* of room for performance model improvement with the addition of some of the features that you mention.

Alan Couzens, M.Sc. (Sports Science)
Exercise Physiologist/Coach
Twitter: https://twitter.com/Alan_Couzens
Web: https://alancouzens.com
Last edited by: Alan Couzens: Mar 7, 19 23:30