AI and ML in the cycling peloton

A lot of blah-blah and not much boom-bam?

 

In the third edition of my book "Trainen op vermogen", a Dutch-language edition from 2018, I referred by way of introduction to an article by Timandra Harkness.

"Give me your data and I'll tell you what performance you'll achieve."

She concluded: "There's a lot of confidence that smart machines, with all those millions of data points and hundreds of variables, will help us find patterns we could never find ourselves. This fuels the suggestion that we can let the data decide what's best for us. In reality, we distrust our own judgment and shy away from our responsibility to make choices and decisions. It's a dangerous evolution."

Now, seven years later, data analysts want us to believe that performance analysis without AI or ML is unthinkable.
But first of all: what exactly do those terms mean?

We asked ChatGPT. Who else?

"Artificial Intelligence is the broader concept of programming machines to perform tasks that normally require human intelligence."

"Machine Learning is a branch (subfield) of AI. It focuses specifically on learning from data. Instead of programming everything step by step, a computer learns by itself, based on examples, how something works."

ChatGPT concludes:

• AI is the big picture.

• ML is a way in which AI becomes smarter by learning from data.

Everything seems to revolve around data, and we can't deny that modern cycling has a wealth of data at its disposal. Heart rate and heart rate variability, power output, VO2max, Critical Power and W’, lactate production, muscle fiber composition, cadence and torque, ambient temperature and body temperature, hydration regulation, elevation gain, energy consumption and calorie intake, air resistance and rolling resistance… You can't imagine all the things being measured these days. Technology has certainly advanced to the point where we have instruments that measure very precisely, but the question must be: are we actually getting the right information? Or, put differently:

Big Data = Big Insight?

It sounded like music to the ears of (wealthy) cycling teams: by linking this seemingly endless amount of data, ML could make predictions that far exceed the computing power and therefore the decision-making power of a human (the coach). The data analyst in cycling was born.

Do I, a coach from a computer-free past, believe in the processing power of computers? Yes, I do. It wouldn't surprise me if Eddy Merckx appeared in the peloton during a live broadcast of the 2026 Tour de France and dropped Pogacar, Vingegaard, and Evenepoel. This pixel-based approach to the cyclist is, of course, irrelevant in this discussion. It would be more relevant if, based on

  • weather conditions such as temperature, rain and wind,

  • the surface such as asphalt, cobblestones or gravel,

  • the course,

  • and the rider's body type

the rider's position on the bike, the bike itself, the tires, the clothing, the helmet, the nutrition, the hydration… could be adjusted to create the best external conditions for optimal performance. I believe in this too because, after all, it's "only" about processing data that are independent of the rider's physical and mental characteristics.

Unfortunately, we're not there yet, and once the physical and mental characteristics of the rider must also be taken into account, we are not there at all. The question in that case is:

what requirements must all the Performance Determinants meet for a rider to perform optimally?
Or, alternatively: what are the factors that will cause a rider to underperform and thus not be ready to compete?

  • Did AI predict that Evenepoel would suffer a serious setback during the 2025 Tour de France, let alone that he would have to abandon, completely exhausted?

  • Did AI predict that Evenepoel would humiliate Pogacar just two months later at the World Time Trial Championships by overtaking him?

  • Did AI predict that in the road race, just a few days after his dismal time trial, Pogacar would need only one acceleration to drop everyone else and successfully hold off a chasing elite group for almost 100 km?

 

So, what are the real achievements of AI and/or ML in the cycling peloton in 2025? I asked a data analyst from a professional team. He referred me to this article:

"Did you know that Kévin Vauquelin's journey to the top was aided by data?"

The story goes like this.

ML was unleashed on several months of training and race data from 650 cyclists from 38 countries. Among other things, the "Tired 20"—a cyclist's best average power output over 20 minutes, measured after an effort of 3000 kcal—was used to identify the most promising cyclists. This data-driven approach yielded two "confirmed talents." In other words, Kévin Vauquelin was the top talent suggested by ML and was therefore eligible for a professional contract.

That study refers to the best average power, which TrainingPeaks has promoted for 25 years as the benchmark in power data collection. Specifically, this Mean Maximum Power curve was used to determine durability—the ability to perform at a high level even under fatigue. Hence, the “Tired 20” as the performance standard in this study, as it provides a better picture than performance over 20 minutes when fresh. So far, so good.
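For readers who want to see what such a calculation looks like, here is a minimal sketch in Python. It is not the method used in the study, whose exact definitions are not given here. It assumes power is recorded once per second in watts, and it converts mechanical work to kilocalories with an assumed gross efficiency of 0.24 (with that value, 1 kJ of work is roughly 1 kcal expended). All function names are made up for this illustration.

```python
from itertools import accumulate

KJ_PER_KCAL = 4.184  # physical constant: 1 kcal = 4.184 kJ


def best_average_power(power, window):
    """Highest mean power (W) over any `window` consecutive samples.

    Assumes one sample per second. Returns None if the ride is too short.
    """
    if window <= 0 or len(power) < window:
        return None
    # Prefix sums let us get the sum of every window in constant time.
    prefix = [0.0, *accumulate(power)]
    best = max(prefix[i + window] - prefix[i]
               for i in range(len(power) - window + 1))
    return best / window


def mmp_curve(power, durations):
    """Mean Maximum Power: best average power for each duration (seconds)."""
    return {d: best_average_power(power, d) for d in durations}


def tired_best(power, window, prior_kcal, gross_efficiency=0.24):
    """Best average power over `window` s, counting only efforts that start
    after `prior_kcal` have been expended.

    Assumption: expended energy = mechanical work / gross_efficiency.
    Returns None if the threshold is never reached or too little ride remains.
    """
    expended_kcal = 0.0
    for i, watts in enumerate(power):
        if expended_kcal >= prior_kcal:
            return best_average_power(power[i:], window)
        # 1 W held for 1 s is 1 J of mechanical work.
        expended_kcal += watts / 1000.0 / gross_efficiency / KJ_PER_KCAL
    return None
```

With a real ride file, `best_average_power(power, 20 * 60)` would give the fresh 20-minute value and `tired_best(power, 20 * 60, 3000)` a "Tired 20" under these assumptions. Note that the calculation only looks at how much work came before, not at how hard that work was.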

According to ChatGPT, AI gets smarter by learning from data. But what if these data are wrong? Not because the measurement itself is wrong, but because the data provide us with incorrect information? No one will doubt that a "machine" can process this enormous amount of data better and, above all, faster than a human. The question, however, is: are these data reliable?

In other words, do they provide us with the correct information?

After 20 years of experience with the MMP curve, I dare say the answer is usually NO, for the following reasons.

  1. The MMP curve built from training and competition data, especially that of non-professional cyclists, does not sufficiently capture the true maximum TEST values needed to determine the cyclist's physical capabilities (and far more of these are needed than the 20-minute value alone). It is therefore unreliable in many cases.

  2. An article on durability https://doi.org/10.1002/ejsc.12077 concludes:
    "These findings therefore suggest that quantifying previously performed work solely based on energy expenditure may not be sufficient to quantify the described previous work."
    This means that a decrease in the measured value after 3000 kcal does not properly define "durability" because the preceding intensity of the effort, in other words, the percentage of anaerobic reserve still available at that moment, is not taken into account.

  3. What is measured after "x" of work performed in a race is highly dependent on the rider's role (helper or leader, lead-out rider or sprinter), the tactics or the development of the race, and even the rider's motivation. Declining values are therefore certainly not always due to poor durability. And what about rising values? The 10”, 30”, 1' and even 2' values of the lead-out rider and the sprinter are usually highest at the end of the race. Should it then be decided that short, explosive efforts are not subject to a fatigue factor?
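The second reason can be made concrete with a small sketch in Python. It compares two made-up one-hour rides that contain exactly the same amount of work but at very different intensities. The 300 W threshold is a hypothetical stand-in for something like Critical Power, the function names are invented for this illustration, and work above a threshold is only a crude proxy: it is not a model of how the anaerobic reserve is spent and recovered.

```python
def total_work_kj(power):
    """Total mechanical work in kJ, assuming one power sample (W) per second."""
    return sum(power) / 1000.0


def work_above_kj(power, threshold_watts):
    """Work done above `threshold_watts`, in kJ (crude intensity proxy)."""
    return sum(max(p - threshold_watts, 0) for p in power) / 1000.0


# Two hypothetical one-hour rides, sampled once per second.
steady = [200] * 3600                    # constant 200 W
surges = ([100] * 50 + [700] * 10) * 60  # each minute: 50 s easy, 10 s at 700 W

# Same total work, so a "work done so far" criterion cannot tell them apart.
print(total_work_kj(steady), total_work_kj(surges))            # 720.0 720.0

# The intensity of that work is completely different.
print(work_above_kj(steady, 300), work_above_kj(surges, 300))  # 0.0 240.0
```

Both riders would cross the same work threshold at the same moment, yet one of them has spent the hour repeatedly sprinting. A power value measured afterwards means something different in each case.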

Performance losses, which undoubtedly do exist, are therefore not easily detectable using the classical interpretation of the MMP curve. ML learns from flawed examples. It's therefore possible that good performances by other riders escaped the radar in this study.

Shouldn't we be impressed if AI were to point out not only questionable data but also gaps in information gathering? After all, is it really sufficient to use only power data for talent detection or talent identification?

AI and ML in the cycling peloton: lots of blah-blah but little boom-bam? Yes, at least as things stand now in 2025.

If you, as a reader, know of an AI or ML application that can actually positively influence the performance of a rider or a team, or predict certain things, and that can't be devised by the coach's common sense, I'd love to hear about it.

We already have one that comes close: the SuperCycle app (which is underpinned by the ECP/EXREC concept) predicts on the go, based on your current performance profile, how long you can keep up the effort and the moment when you will become exhausted.
