Most workout apps are passive. They store what you did. When you hit your rep target, they add weight. When you miss, some of them do nothing. A few knock the weight back.
That's not coaching. That's a spreadsheet with a nicer interface.
A coach watches you. Not just the set in front of them, but the pattern across the last few weeks. They notice when your RPE is creeping up without a corresponding increase in load. They notice when you've missed two sessions in a row and show up looking flat. They notice when the program that was working three months ago has quietly stopped working, and they do something about it before you dig yourself into a hole you need six weeks to climb out of.
That gap between a passive logger and an active coach is the thing we've been trying to build. Here's what it actually looks like inside Yuge.
What "training intervention" means
An intervention is any deliberate change to your program in response to a signal from your training. Not a scheduled deload. Not the progression jump the spreadsheet was always going to make on week 4. An unplanned response to real data.
Interventions break down into two categories: exercise-level adjustments (load reductions, set and rep tweaks, exercise swaps) and program-level restructuring (volume reduction across the board, frequency cuts, training max resets, or switching programs entirely).
Most apps, if they do anything, only do the first category. And they do it crudely. Yuge has a layered system for both, with explicit logic for when each type of intervention is appropriate and when to escalate.
The rules engine
The first layer is a set of deterministic rules. These fire automatically after sessions based on performance data. No AI involved in these decisions. They're computed, not generated.
Consecutive failure deload. If you keep failing sets on the same exercise across multiple sessions, the load comes down. The threshold depends on your sensitivity setting. The rationale gets logged with the change so you can see exactly why it happened.
RPE drift. If your RPE on an exercise has been consistently above or below target across recent sessions, the load adjusts. This works in both directions — if you've been coasting, load goes up. A coach doesn't just back you off when things get hard. They push when you're sandbagging. One caveat: RPE accuracy varies with training experience. Beginners tend to underestimate proximity to failure, so the system learns your individual calibration over time rather than treating every RPE 8 as equal.
Missed session volume. Miss a few sessions in a short window and the weekly volume comes down across the board. The system won't strip a movement down to nothing. It just lightens the demand enough to give you a realistic chance of completing the work when you do show up.
Fatigue accumulation deload. This is the whole-program version. When the system reads multiple signals pointing to accumulated fatigue — completion rates, RPE trends, missed workouts, declining performance, stuff you've mentioned in coaching conversations about sleep or stress — it triggers a proper deload. You don't have to ask for one. The system recognises when you need one.
Exercise substitution. When health or pain notes flag an issue with a specific movement, the system swaps to an alternative in the same movement pattern. It maintains the training stimulus while removing the thing causing the problem.
Stale accessory rotation. Accessories that have stalled or accumulated repeated failures get rotated to a different exercise targeting the same muscle group. Main lifts are protected. Accessories aren't.
These rules run in priority order. If consecutive failures trigger a deload on an exercise, the RPE drift rule won't also fire on that same exercise. The system doesn't pile adjustments on top of each other.
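To make the priority ordering concrete, here is a minimal Python sketch of two of the rules above. It is an illustration, not Yuge's actual code: the names `ExerciseHistory`, `Adjustment`, and `evaluate`, and every threshold value, are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class ExerciseHistory:
    name: str
    load_kg: float
    failed_sessions_in_a_row: int    # consecutive sessions with a failed set
    recent_rpe: Tuple[float, ...]    # logged RPE values, most recent last
    target_rpe: float

@dataclass(frozen=True)
class Adjustment:
    rule: str
    new_load_kg: float
    rationale: str                   # logged so the user can see why

# Illustrative thresholds only; the real values are not given in this post.
FAILURE_THRESHOLD = 3     # failed sessions in a row before a deload
DELOAD_FACTOR = 0.90      # cut the load by 10%
RPE_TOLERANCE = 1.0       # average drift from target that counts as "consistent"
RPE_STEP = 0.025          # 2.5% load change per RPE adjustment

def consecutive_failure_deload(h: ExerciseHistory) -> Optional[Adjustment]:
    if h.failed_sessions_in_a_row < FAILURE_THRESHOLD:
        return None
    return Adjustment(
        rule="consecutive_failure_deload",
        new_load_kg=round(h.load_kg * DELOAD_FACTOR, 1),
        rationale=f"{h.failed_sessions_in_a_row} failed sessions in a row on {h.name}",
    )

def rpe_drift(h: ExerciseHistory) -> Optional[Adjustment]:
    if len(h.recent_rpe) < 3:
        return None  # not enough data to call it a trend
    avg = sum(h.recent_rpe) / len(h.recent_rpe)
    drift = avg - h.target_rpe
    if abs(drift) < RPE_TOLERANCE:
        return None
    direction = -1 if drift > 0 else 1   # too hard: load down; coasting: load up
    return Adjustment(
        rule="rpe_drift",
        new_load_kg=round(h.load_kg * (1 + direction * RPE_STEP), 1),
        rationale=f"average RPE {avg:.1f} vs target {h.target_rpe:.1f} on {h.name}",
    )

# Priority order: earlier rules win.
RULES = (consecutive_failure_deload, rpe_drift)

def evaluate(h: ExerciseHistory) -> Optional[Adjustment]:
    """Return at most one adjustment per exercise: the first rule that fires."""
    for rule in RULES:
        adjustment = rule(h)
        if adjustment is not None:
            return adjustment
    return None
```

Because `evaluate` returns as soon as one rule fires, an exercise that trips both the failure rule and the RPE rule only gets the deload. Every function here is a pure computation over logged data, so the same history always produces the same adjustment and the same rationale.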
Why deterministic rules instead of LLM decisions
This is a deliberate choice worth explaining.
We could have the AI reason about every adjustment. Some fitness products do something like this. The problem is that LLM outputs are non-deterministic: the same inputs don't always produce the same outputs. For decisions about load and volume, where someone is about to pick up a bar and lift, you want predictable behaviour.
The rules engine always produces the same decision given the same data. That makes it auditable. You can see exactly what triggered a change and what the system computed. When load or volume changes, it should be because the math said so, not because the AI was in a different mood.
The AI's job is explanation, coaching conversation, and program-level reasoning — not per-exercise load arithmetic.
The intervention ladder
The rules engine handles exercise-level adjustments well. But sometimes the problem isn't a specific exercise. Sometimes the program itself is the problem, and patching individual sets and loads is treating symptoms while the underlying issue gets worse.
That's what the intervention ladder is for.
The stress-driver case is its own intervention path at Level 2: when the limiter is recovery dysregulation rather than overwork, the right move is cutting a training day, not trimming sets.
The health score
The intervention ladder is driven by a program health score built from four signals.
Set completion. Are you completing your prescribed work? The system looks at both the overall rate and how failures are distributed. A high-volume program where failures scatter across many exercises is a different signal from a low-volume program where one exercise keeps failing. Broad failure spread suggests the overall demand exceeds current recovery capacity.
Adjustment stability. How much has the system been tweaking things? High churn indicates exercise-level fixes aren't working. But it also catches the opposite problem: minimal adjustments despite low completion rates. Both excessive churn and neglect are warning signs.
Recovery signals. Missed workouts, RPE creeping above target across multiple lifts, declining trends over consecutive weeks, and stuff you've mentioned in coaching conversations — sleep problems, stress, nutrition issues. If you've mentioned poor sleep three times in two weeks, that shows up.
Phase appropriateness. A bunch of failed sets during an intensification block is different from failed sets during a deload week. Failures are expected when you're pushing hard. Failures during a deload are a red flag. The health score accounts for where you are in the training cycle.
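As a rough illustration of how four signals like these could fold into a single number, here is a small Python sketch. The equal weighting, the per-phase expectations, and every constant are assumptions for the example, not Yuge's real scoring. It also leaves out the failure-spread analysis described under set completion.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WeekSnapshot:
    completion_rate: float   # completed sets / prescribed sets, from 0 to 1
    adjustments_made: int    # rule-driven changes applied this week
    recovery_flags: int      # missed workouts, RPE creep, sleep or stress mentions
    phase: str               # "deload", "accumulation", or "intensification"

# Hypothetical tuning values.
EXPECTED_COMPLETION = {"deload": 0.98, "accumulation": 0.90, "intensification": 0.80}
MAX_HEALTHY_ADJUSTMENTS = 3
MAX_RECOVERY_FLAGS = 5
NEGLECT_COMPLETION = 0.75

def clamp01(x: float) -> float:
    return max(0.0, min(1.0, x))

def completion_score(w: WeekSnapshot) -> float:
    return clamp01(w.completion_rate)

def stability_score(w: WeekSnapshot) -> float:
    if w.adjustments_made > MAX_HEALTHY_ADJUSTMENTS:
        # Churn: lots of tweaks means exercise-level fixes aren't sticking.
        return clamp01(MAX_HEALTHY_ADJUSTMENTS / w.adjustments_made)
    if w.adjustments_made == 0 and w.completion_rate < NEGLECT_COMPLETION:
        # Neglect: poor completion, yet nothing has been adjusted.
        return 0.5
    return 1.0

def recovery_score(w: WeekSnapshot) -> float:
    return clamp01(1.0 - w.recovery_flags / MAX_RECOVERY_FLAGS)

def phase_score(w: WeekSnapshot) -> float:
    # Judge completion against what the current phase expects.
    return clamp01(w.completion_rate / EXPECTED_COMPLETION[w.phase])

def health_score(w: WeekSnapshot) -> int:
    """Equal-weight average of the four signals, scaled to 0-100."""
    parts = (completion_score(w), stability_score(w),
             recovery_score(w), phase_score(w))
    return round(100 * sum(parts) / len(parts))
```

With these numbers, an 80% completion week scores higher during intensification than the same 80% during a deload, which is the phase-appropriateness idea in miniature.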
What the AI does with all of this
The rules engine and health score are inputs to the coaching AI, not substitutes for it.
When a rule fires, Yuge tells you what changed, what triggered it, and why the adjustment makes sense in the context of your current program and training history. Not a generic message about load reductions. A specific coaching note about your squat, your recent RPE trend, and what the next few sessions should feel like.
When the ladder escalates, the AI's coaching posture changes. At Level 1, it starts drawing connections you might not have noticed. At Level 2, it asks about your life before proposing program changes. At Level 3, it frames a fresh start as a smart strategic decision rather than a failure.
The interventions are never silent. Every adjustment surfaces in the coaching conversation with a reason. You can ask why. You can push back. You can override. The system proposes. You decide. That's intentional — the goal is a coach that earns your trust by being transparent about its reasoning, not one that quietly shuffles your program around while you're not looking. We go deeper on exactly how we test these constraints in Robert's story — a simulated 58-year-old with bad knees who taught us more about safety than any prompt ever could.
What this means in practice
This system catches a lot of things, but it's not omniscient.
It's good at patterns that show up in logged data. Consecutive failures. RPE trends. Missed sessions. Load stagnation. It's less good at things that don't make it into the log — the session you completed but felt terrible during, the underlying injury you haven't mentioned yet, the context that explains why you've been off for three weeks. The more information flows into the system, the more useful it gets.
This is also why the intervention ladder escalates slowly. We could set lower thresholds or require fewer sessions before escalating. But a system that jumps to "you need a program reset" after a couple of bad weeks is crying wolf. Most bad weeks are just bad weeks. The ladder is calibrated to be conservative about escalation, because the cost of a false positive — disrupting a program that was actually fine — is higher than the cost of waiting an extra week to be sure.
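One way to picture that conservatism is a rule that only escalates when a low health score is sustained. The sketch below is a hypothetical Python illustration: the threshold, the three-week window, and the use of level 0 for "no intervention" are all assumptions, not the app's real calibration.

```python
from typing import Sequence

# Hypothetical calibration values.
ESCALATE_BELOW = 60    # weekly health score that counts as a bad week
WEEKS_REQUIRED = 3     # consecutive bad weeks needed before escalating
MAX_LEVEL = 3          # top rung of the ladder

def next_level(current_level: int, weekly_scores: Sequence[int]) -> int:
    """Move up one rung only if the last WEEKS_REQUIRED weeks were all bad."""
    recent = weekly_scores[-WEEKS_REQUIRED:]
    sustained = (
        len(recent) == WEEKS_REQUIRED
        and all(score < ESCALATE_BELOW for score in recent)
    )
    if sustained:
        return min(current_level + 1, MAX_LEVEL)
    return current_level
```

Under these assumptions, scores of 80, 55, 50 leave the level unchanged because only two weeks were bad, while 55, 50, 58 moves it up one rung. Lowering `WEEKS_REQUIRED` would make the ladder react faster, at the price of more false positives.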
The goal isn't to replace your own sense of how training is going. It's to notice the signals you might be too close to see, and to give them a shape you can do something with.
