Project · Surface EMG · Signal processing · Combat sport

Spectral signatures of biceps brachii fatigue

A classical sEMG fatigue analysis, with a calibrated ML layer on top.

Sustained isometric grip is the limiting motor task in wrestling, judo, and karate. Surface electromyography (sEMG) of the biceps brachii is the canonical non-invasive window into when grip starts to fail — the classical indicator is a decline in median frequency (MDF) as motor-unit conduction velocity slows. I rebuilt the full pipeline in pure Python (custom FFT, Welch-style PSD, spectral features) and added a small calibrated classifier on top that out-predicts the classical MDF-threshold baseline on held-out athletes.

Step-by-step Python tutorial: reproduce the project from scratch with numpy + scipy + scikit-learn (≈45 min).

— Window-level accuracy (held-out athletes)

— vs classical MDF-threshold baseline

— ROC AUC on held-out athletes

— Athletes × isometric trial

Background

When a wrestler grips a sleeve or a judoka clamps a lapel, the biceps brachii and forearm flexors fire isometrically and continuously. Lactate accumulates, intracellular pH drops, muscle-fibre conduction velocity slows — and the surface EMG power spectrum shifts toward lower frequencies. Two scalars summarise this shift: median frequency (MDF) and mean frequency (MNF). Both decline monotonically with fatigue in the classical Lindström / De Luca / Merletti tradition, and remain the workhorses of applied EMG kinesiology — the methods that, for example, the Foro Italico group has used for two decades to study fatigue in elite karateka and judoka.

The classical decision rule is simple: fatigued ≡ MDF below 90% of the initial value. It's interpretable and widely used. But on noisy single-channel data it is a coarse switch: it fires late, it is sensitive to the exact threshold, and it ignores correlated information in RMS amplitude and short-term spectral slopes. The question this project asks is the obvious one: can a calibrated model do better than the threshold while staying interpretable?
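A minimal sketch of the threshold rule in plain Python. The function name and the choice to average the first 10 windows as the "initial value" are assumptions for illustration; the baseline comparison later on this page refers to the first 10 windows, but the exact reference statistic is not stated.

```python
def classical_fatigue_flags(mdf_series, n_ref=10, fraction=0.90):
    """Flag a window as fatigued when its MDF drops below fraction * reference.

    Assumption: the reference is the mean MDF of the first n_ref windows.
    """
    if len(mdf_series) < n_ref:
        raise ValueError("need at least n_ref windows to form the reference")
    reference = sum(mdf_series[:n_ref]) / n_ref
    cutoff = fraction * reference
    return [mdf < cutoff for mdf in mdf_series]


# Toy MDF trace in Hz: flat reference period, then a decline.
mdf_trace = [100.0] * 10 + [95.0, 91.0, 89.0, 85.0]
flags = classical_fatigue_flags(mdf_trace)
```

With this toy trace the cutoff is 90 Hz, so only the last two windows are flagged. A single noisy window near the cutoff would flip the flag, which is the brittleness described above.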

Data & methodology

Cohort. 20 simulated combat-sport athletes performing a 60-second sustained isometric biceps-brachii contraction at 1 kHz sampling. Each athlete has individual baselines for MDF (90–135 Hz), fatigue half-time t₅₀, and RMS amplification — sampled to match the inter-individual variability reported in published isometric protocols. The signals are synthesised in the frequency domain so that the time-course of the spectral content is ground-truthed; the analysis pipeline operates on the raw time-domain signal and has no access to the generator parameters.

Pipeline. Non-overlapping 512-sample (0.512 s) windows. Hann-tapered radix-2 Cooley–Tukey FFT (custom, pure-Python ~25 LOC) → one-sided periodogram → median frequency, mean frequency, RMS amplitude, and the 5-window short-term slope of MDF.
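The feature extraction for one window can be sketched as follows, using only the standard library. This is an illustrative reconstruction, not the project's actual code: the function names, the doubling of interior bins for the one-sided periodogram, the "first bin reaching half the cumulative power" definition of MDF, and the least-squares definition of the 5-window slope (set to 0.0 until five windows exist) are all assumptions.

```python
import cmath
import math


def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out


def window_features(samples, fs=1000.0):
    """Return MDF (Hz), MNF (Hz) and RMS for one analysis window."""
    n = len(samples)
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]
    spectrum = fft_radix2([s * w for s, w in zip(samples, hann)])
    # One-sided periodogram: keep bins 0..n/2 and double the interior bins.
    power = [abs(c) ** 2 for c in spectrum[: n // 2 + 1]]
    for k in range(1, n // 2):
        power[k] *= 2.0
    freqs = [k * fs / n for k in range(n // 2 + 1)]
    total = sum(power)
    mnf = sum(f * p for f, p in zip(freqs, power)) / total
    # MDF: first bin at which cumulative power reaches half the total.
    cumulative, mdf = 0.0, freqs[-1]
    for f, p in zip(freqs, power):
        cumulative += p
        if cumulative >= total / 2:
            mdf = f
            break
    rms = math.sqrt(sum(s * s for s in samples) / n)
    return {"mdf": mdf, "mnf": mnf, "rms": rms}


def trailing_slope(values, k=5):
    """Least-squares slope (units per window) over the k values ending at each index."""
    xbar = (k - 1) / 2
    sxx = sum((i - xbar) ** 2 for i in range(k))
    out = []
    for end in range(len(values)):
        if end + 1 < k:
            out.append(0.0)  # assumption: no slope until k windows exist
            continue
        seg = values[end - k + 1 : end + 1]
        ybar = sum(seg) / k
        out.append(sum((i - xbar) * (y - ybar) for i, y in enumerate(seg)) / sxx)
    return out


# Sanity check on a pure 100 Hz tone sampled at 1 kHz.
tone = [math.sin(2 * math.pi * 100.0 * i / 1000.0) for i in range(512)]
features = window_features(tone)
slopes = trailing_slope([100.0, 98.0, 96.0, 94.0, 92.0, 90.0])
```

With 512 samples at 1 kHz the bin spacing is about 1.95 Hz, so MDF is quantised to that grid; interpolating between bins would be a natural refinement.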

Model. Logistic regression on the four features above. Target label is fatigued (t ≥ athlete-specific t₅₀). Athletes are split 12 train / 4 calibrate / 4 test — disjoint by individual, not by time, because cross-athlete generalisation is the realistic deployment problem.
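One way to build an athlete-disjoint split is to shuffle athlete identifiers rather than windows. The sketch below is hypothetical (function name, seed, and integer athlete IDs are assumptions). It uses 117 windows per athlete, which is what 60 s at 1 kHz yields with non-overlapping 512-sample windows (60000 // 512).

```python
import random


def split_by_athlete(athlete_ids, n_train=12, n_cal=4, n_test=4, seed=0):
    """Partition unique athlete IDs into disjoint train / calibrate / test sets."""
    ids = sorted(set(athlete_ids))
    if len(ids) != n_train + n_cal + n_test:
        raise ValueError("split sizes must add up to the number of athletes")
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    train = set(ids[:n_train])
    cal = set(ids[n_train : n_train + n_cal])
    test = set(ids[n_train + n_cal :])
    return train, cal, test


# One athlete ID per analysis window: 20 athletes x 117 windows each.
window_athlete = [a for a in range(20) for _ in range(60000 // 512)]
train, cal, test = split_by_athlete(window_athlete)
train_idx = [i for i, a in enumerate(window_athlete) if a in train]
```

Splitting on the athlete ID first and selecting windows second guarantees that no athlete contributes windows to more than one partition.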

Decision threshold and uncertainty. A 0.5 cutoff is rarely optimal for an imbalanced binary target, so the calibration athletes are used to pick the threshold τ* that maximises balanced accuracy. The same calibration set yields split-conformal probability half-widths at 80% and 90% nominal coverage, which you can read as "the model thinks the probability is p, give or take this much".
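Both calibration steps fit in a few lines. The sketch below assumes the nonconformity score is the absolute residual |y − p| on the calibration windows and that τ* is searched over the observed probabilities; neither detail is stated on this page, and the function names and toy data are illustrative.

```python
import math


def balanced_accuracy(probs, labels, tau):
    """Mean of sensitivity and specificity when predicting 1 iff p >= tau."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= tau)
    tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p < tau)
    return 0.5 * (tp / n_pos + tn / n_neg)


def best_threshold(probs, labels):
    """Scan observed probabilities; ties resolve to the smallest candidate."""
    candidates = sorted(set(probs))
    return max(candidates, key=lambda t: balanced_accuracy(probs, labels, t))


def conformal_half_width(probs, labels, coverage):
    """Split-conformal quantile of |y - p| at the requested nominal coverage."""
    scores = sorted(abs(y - p) for p, y in zip(probs, labels))
    n = len(scores)
    # Finite-sample rank ceil((n + 1) * coverage); epsilon guards float round-up.
    rank = math.ceil((n + 1) * coverage - 1e-12)
    if rank > n:
        return 1.0  # too few calibration points: the interval is trivially wide
    return scores[rank - 1]


def interval(p, half_width):
    """Probability interval clipped to [0, 1]."""
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Toy calibration set (illustrative numbers, not project results).
cal_p = [0.1, 0.2, 0.3, 0.4, 0.45, 0.6, 0.7, 0.8, 0.9]
cal_y = [0, 0, 0, 0, 1, 1, 1, 1, 1]
tau_star = best_threshold(cal_p, cal_y)
q80 = conformal_half_width(cal_p, cal_y, 0.80)
q90 = conformal_half_width(cal_p, cal_y, 0.90)
band = interval(0.5, q90)
```

On this toy set the half-width at 90% exceeds 0.5, so a prediction of 0.5 gets the full interval from 0 to 1: the clipped band touches both ends exactly when the prediction sits near the decision boundary.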

The classical finding: PSD shifts left as fatigue accumulates

Power spectra for one held-out athlete, computed from one analysis window early in the trial (≈8 s, fresh) and one near exhaustion (≈52 s, fatigued). The spectral mass migrates from the 110–130 Hz neighbourhood toward 70–90 Hz — a textbook EMG fatigue signature, and the variable the classical MDF/MNF indicators were designed to track.

Predicting fatigue on an unseen athlete

Below: the model's per-window fatigue probability for a test athlete the classifier never saw, overlaid on the athlete's MDF time series. The shaded band is the 90% conformal probability interval. The vertical dashed line is the ground-truth fatigue half-time t₅₀ — i.e., where labels in this athlete flip from 0 → 1.

Two things happen at once: MDF decays monotonically (the classical indicator), and the model's predicted probability crosses τ* roughly when the ground truth flips. The conformal band sometimes reaches both 0 and 1. That is the model honestly telling you that uncertainty is highest near the transition, exactly where the decision is hardest.

Calibration: are the predicted probabilities meaningful?

Each point: one analysis window from a held-out athlete; horizontal axis is the model's predicted probability of fatigue, vertical axis is the actual binary label (jittered so points don't pile up). Probabilities cluster near 0 or 1 — the model is decisive — and ranking is essentially monotonic (ROC AUC —).
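ROC AUC can be computed without any plotting library through its rank (Mann-Whitney) form: it is the probability that a randomly chosen fatigued window receives a higher score than a randomly chosen fresh one. A minimal sketch, with tied scores given their average rank; the function name and the four-point example are illustrative, not project data.

```python
def roc_auc(probs, labels):
    """ROC AUC via the Mann-Whitney rank-sum identity (ties get average ranks)."""
    pairs = sorted(zip(probs, labels))
    ranks = [0.0] * len(pairs)
    i = 0
    while i < len(pairs):
        j = i
        # Extend j over a run of tied scores.
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n_pos = sum(y for _, y in pairs)
    n_neg = len(pairs) - n_pos
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

In this example three of the four (fatigued, fresh) pairs are ranked correctly, giving an AUC of 0.75.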

What the model leaned on

Standardised logistic-regression coefficients. The four features carry roughly the expected sport-science direction:

MDF — strongly negative: higher median frequency ⇒ less likely to be fatigued. The classical indicator carries most of the signal, as it should.
MNF — same direction, partly redundant with MDF (they're correlated by construction).
RMS — positive: amplitude rises as motor units are recruited to compensate.
5-window MDF slope — adds a short-term trend that the static MDF value can't capture, particularly useful around the transition.

vs. the classical baseline

The classical decision rule (fatigued ≡ MDF < 90% of the first 10 windows) reaches — on the same held-out athletes. The calibrated logistic regression reaches — — a — percentage-point improvement, mostly recovered around the transition where the classical threshold is brittle and the short-term slope feature actually helps.

That's the whole point of this project. The classical method is not wrong; it's coarse. Adding one or two correlated features and one calibrated decision threshold buys real accuracy without losing interpretability — and the conformal interval makes the uncertainty visible to the coach reading the output.

Limitations & honest caveats

Synthetic signals. The signals are physiologically motivated (frequency-domain construction with realistic MDF decay) but not measured. The pipeline applies unchanged to recorded sEMG from a real isometric protocol — that's the next step, not the demo.
Single channel, single muscle. Real grip endurance involves the forearm flexors and brachioradialis as well; multi-channel spectral fatigue would tell a richer story (and is, in fact, what the published literature in this lab's tradition uses).
Binary label. "Fatigued / not fatigued" with a hard cutoff at t₅₀ is a useful simplification but real fatigue is a continuum and a coach needs the probability, not the threshold. The model already outputs probabilities — the calibration plot is the honest summary.
Cross-athlete generalisation. Test athletes were never seen in training, but the cohort is small (n = 4 test). A real evaluation would use leave-one-athlete-out across a larger sample.

How this connects to the research I want to do

This project sits inside the toolkit the Foro Italico group has used for two decades: surface electromyography, median/mean frequency, isometric protocols, elite combat athletes (karateka, judoka). The natural extension, and the proposal I would bring to that group, is to take this pipeline off synthetic data and onto multi-channel sEMG recorded on the Iranian national wrestling team, where I have direct access to a cohort that academic labs typically can't reach. The methodological extensions then write themselves: athlete-stratified conformal, transfer learning from sub-elite to elite cohorts, and the same calibrated-uncertainty framing applied to a richer fatigue label (force-failure onset, dropout from match simulation, or RPE).

Source

Single Python file, standard library only, including a hand-rolled radix-2 FFT: notebooks/emg_fatigue/build_dataset.py in the repo. Generates the cohort, computes spectral features, fits logistic regression by Newton-Raphson, calibrates the decision threshold and conformal probability intervals, exports this page's JSON. End-to-end in under 10 seconds on a laptop.
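For readers curious what a standard-library Newton-Raphson fit looks like, here is a minimal sketch. It is not the code in build_dataset.py: the function names, the tiny ridge term added for numerical stability, the stopping tolerance, and the one-feature toy data are all assumptions.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_newton(X, y, n_iter=25, ridge=1e-6):
    """Newton-Raphson on the (lightly ridge-penalised) negative log-likelihood."""
    rows = [[1.0] + list(r) for r in X]  # prepend an intercept column
    d = len(rows[0])
    w = [0.0] * d
    for _ in range(n_iter):
        grad = [ridge * wi for wi in w]
        hess = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
        for r, t in zip(rows, y):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, r)))
            for i in range(d):
                grad[i] += (p - t) * r[i]
                for j in range(d):
                    hess[i][j] += p * (1.0 - p) * r[i] * r[j]
        step = solve(hess, grad)
        w = [wi - si for wi, si in zip(w, step)]
        if max(abs(si) for si in step) < 1e-10:
            break
    return w


# Toy one-feature problem with overlapping classes (illustrative only).
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 1, 0, 1, 1]
w = fit_logistic_newton(X, y)
probs = [sigmoid(w[0] + w[1] * row[0]) for row in X]
```

Each iteration solves one small linear system (5 × 5 for four features plus an intercept), which is why a pure-Python fit stays fast at this scale.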