Figure 03 vs. Intern Aime, May 17: Human Wins the 10-Hour Sort 12,924 to 12,732; Margin Is 192 Packages, 2.79 vs. 2.83 Seconds Per Box; California Labor Law’s Mandatory Meal & Rest Breaks Are the Entire Difference; Aime’s Left Forearm “Basically Broken,” Fleet Past 116h Continuous; Adcock: “This Is the Last Time a Human Will Ever Win”

Figure AI pitted intern Aime against a Figure 03 fleet for 10 hours. The human won by 192 packages. The fleet took no breaks, got no blistered fingers, and is still going at hour 116.

On Sunday, May 17, Figure AI CEO Brett Adcock staged a 10-hour package-sorting head-to-head between an intern named Aime and a small fleet of Figure 03 humanoids that had already been running continuously since May 13. Same conveyor, same task: detect the barcode, pick up the box, place it face-down on the belt. Same shift length on paper.

The human won. The margin was 192 packages across 10 hours. Aime finished with blistered fingers and a left forearm he described as “basically broken.” Adcock posted the final score on X the same day and added the line the headline writers were waiting for: “This is the last time a human will ever win.”

It is a useful sentence for a Figure investor deck. It also obscures what the trial actually proved.

The numbers

Metric | Aime (human) | Figure 03 fleet | Delta
Total packages, 10h | 12,924 | 12,732 | +192 human
Average seconds per package | 2.79 | 2.83 | 0.04 s faster, human
Mandatory meal break (CA labor law) | 30 min | 0 | -
Mandatory rest breaks (CA labor law) | 2 × 10 min | 0 | -
Mechanical / physical failures | 1 (forearm) | 0 | -
Continuous runtime, total to date | 10h | 116h 49m | -
Cumulative fleet output, livestream-to-date | - | 145,320 packages | -

The headline number is “human wins 12,924 to 12,732.” The structural number is “192 packages, 0.04 seconds per box, across 10 hours of competition.” The arbitrage is the 50 minutes of mandatory California-law meal and rest breaks the human had to take and the robot did not. Both per-box averages are computed over the full 10 hours, so Aime’s 2.79 already absorbs those 50 idle minutes; what the breaks bought him was recovery. Strip the recovery out and the argument is that the fatigue curve hands the crossover to the robot. Leave it in and the human won by about 1.5 percent.
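
For readers who want to check the table's arithmetic, here is a minimal sketch in Python. The package counts, shift length, and break lengths come from the table; the assumption that Aime sorted for every minute outside the mandated breaks is ours, and the script does not attempt to model fatigue or recovery.

```python
# Figures taken from the table above; everything else is derived.
SHIFT_MINUTES = 10 * 60
HUMAN_PACKAGES = 12_924
FLEET_PACKAGES = 12_732
BREAK_MINUTES = 30 + 2 * 10  # one meal break plus two rest breaks


def seconds_per_package(packages: int, minutes: float) -> float:
    """Average pace over a window, in seconds per package."""
    return minutes * 60 / packages


def packages_per_minute(packages: int, minutes: float) -> float:
    """Average throughput over a window, in packages per minute."""
    return packages / minutes


margin = HUMAN_PACKAGES - FLEET_PACKAGES
margin_pct = 100 * margin / FLEET_PACKAGES

# Pace over the full shift, breaks included (how the table reports it).
human_pace = seconds_per_package(HUMAN_PACKAGES, SHIFT_MINUTES)
fleet_pace = seconds_per_package(FLEET_PACKAGES, SHIFT_MINUTES)

# Assumption: Aime worked every minute that was not a mandated break.
human_active_minutes = SHIFT_MINUTES - BREAK_MINUTES
human_active_rate = packages_per_minute(HUMAN_PACKAGES, human_active_minutes)
fleet_rate = packages_per_minute(FLEET_PACKAGES, SHIFT_MINUTES)

print(f"margin: {margin} packages ({margin_pct:.2f}% of the fleet total)")
print(f"pace over 10h: human {human_pace:.2f}s, fleet {fleet_pace:.2f}s")
print(f"per working minute: human {human_active_rate:.2f}, fleet {fleet_rate:.2f}")
```

The per-working-minute line is raw arithmetic on the reported totals. It says nothing about what Aime's pace would have been without the recovery the breaks provided, which is the question the rest of this piece turns on.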

What Helix-02 had to do to come within 0.04 seconds

The Figure 03 fleet runs Helix-02, Figure’s end-to-end vision-language-action model — raw camera pixels in, motor actuator commands out, no intermediate scripted pick-and-place logic. The autonomous battery hot-swap protocol is what lets the cell present as one continuous worker: when Bob’s pack hits 20% the rotation puts Frank in his place without breaking the conveyor cadence. The “fleet” branding is a labor abstraction, not a hardware one — five physical units (Bob, Frank, Gary, Rose, Jim) rotating in and out behind a single workstation, which is exactly how the cohort wants the future warehouse manager to think about humanoid headcount.
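
The rotation can be pictured as a simple round-robin queue. The sketch below is illustrative only and is not Figure's scheduler: the 20% hand-off threshold and the unit names come from the description above, while the drain rate, the recharge rate, the fixed rotation order, and the function name `simulate` are hypothetical values and names chosen for the example.

```python
from collections import deque
from dataclasses import dataclass

SWAP_THRESHOLD = 20.0  # percent; the one figure taken from the article
DRAIN_PER_MIN = 0.5    # hypothetical: the active unit loses 0.5% per minute
CHARGE_PER_MIN = 1.0   # hypothetical: benched units regain 1% per minute


@dataclass
class Unit:
    name: str
    charge: float = 100.0


def simulate(names, minutes):
    """Return a list of (minute, outgoing, incoming) hand-offs."""
    bench = deque(Unit(n) for n in names)
    active = bench.popleft()
    swaps = []
    for minute in range(1, minutes + 1):
        active.charge -= DRAIN_PER_MIN
        for unit in bench:
            unit.charge = min(100.0, unit.charge + CHARGE_PER_MIN)
        if active.charge <= SWAP_THRESHOLD:
            # Next unit in line steps in; the drained unit joins the back.
            incoming = bench.popleft()
            bench.append(active)
            swaps.append((minute, active.name, incoming.name))
            active = incoming
    return swaps


swaps = simulate(["Bob", "Frank", "Gary", "Rose", "Jim"], minutes=600)
for minute, outgoing, incoming in swaps:
    print(f"minute {minute}: {outgoing} -> {incoming}")
```

With these made-up rates the first hand-off lands at minute 160, Bob to Frank, and a 10-hour shift sees three hand-offs. The point of the sketch is only that the workstation never sits empty; real pack life and charge times would change the numbers.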

Two things in the result deserve to be flagged.

The first is the peak-vs-cumulative shape of the curve. Aime’s first three hours were significantly faster than 2.79 seconds per package — Adcock has previously shown the human peaking around 2.4s in the first ninety minutes. The cumulative average of 2.79 reflects the standard human fatigue curve dragging him down through hours seven, eight, and nine, with a modest end-of-shift recovery. The Figure 03 cadence is flat across all ten hours. Peak speed is not the metric the robot is optimising for; consistency is.

The second is the failure mode. Aime’s “basically broken” left forearm is the kind of repetitive-strain injury workers’ comp would normally categorise as a chronic claim, not an acute one. The competition compressed weeks of warehouse-floor exposure into a single shift. None of the Figure 03 units logged a mechanical failure across the same window or, more importantly, across the 116 hours of continuous runtime the livestream has now accumulated. At the point where the human’s forearm gives out, the robot is operating as if the shift had never started.

The labor-law gap, restated

The piece of the result that the cohort will quietly absorb is which side of the table needed legal protection.

Aime got 30 minutes of meal break and two 10-minute rest periods because California labor code says he had to. Figure 03 got zero breaks because the legal framework for humanoid worker rest does not exist. The 192-package margin is, in operational terms, the legal-protections premium — the throughput a worker is allowed to retain by virtue of having a body and being subject to wage-and-hour law.

That is the part of the result that does not generalise. In a state with weaker break rules — Texas, Florida, most non-union deployments — the same competition over the same task with the same human in the same physical condition produces a tied result at best, with the robot winning on shift two when Aime’s forearm doesn’t recover overnight. The “last time a human will ever win” line is not really about the next 10-hour test. It is about whether the next workstation Figure 03 ships into is in a jurisdiction that protects breaks at all.

What it does not prove

The competition was a transparent piece of marketing — Adcock said “we got bored” out loud — and on its own terms it works. The fleet survived the human-comparable scrutiny without a failure, on the same workflow, in front of a live audience.

But three things the trial deliberately did not test are the three things the warehouse manager is actually buying.

The first is task switching. Both sides ran an identical, pre-loaded, single-skill workflow for the full 10 hours. The cohort’s bottleneck in real deployments has consistently been the cost of changing what the robot is doing — switching from “pick package and place face-down” to “decant from a tote into a kit” requires either a different Helix-02 prompt and validation cycle or a human in the loop. Aime can be reassigned to a new task in 30 seconds. Figure 03 cannot.

The second is error mode. Helix-02 has demonstrated 24/7 fully autonomous operation in the package-sort domain. What “zero robot failures” across 10 hours did not record was the failure rate on edge cases (torn packaging, mislabeled barcodes, mixed-density boxes that change the grip plan), which is the part of the warehouse Aime is still strictly required for and the part the next Helix update is supposed to capture.

The third is economics. Adcock will quote a Robots-as-a-Service price for an industrial humanoid that approaches a low-end warehouse wage on a per-hour basis. The trial did not surface the actual cost stack — capex, charging infrastructure, the engineer-hours of Helix updates, the human supervisor watching the fleet — that the buyer would have to absorb. The BMW Spartanburg pilot, 11 months and 30,000 X3 vehicles in, is closer to the real-world version of this calculation. The 10-hour sort is the version optimised for shareability.

The cohort-level read

The robotics layer of the AI-jobs cycle we are tracking is now generating its own physical-evidence cadence — Figure’s livestream, Schaeffler’s thousands-of-units deal with Humanoid, Toyota Canada’s seven Digit units, Hyundai’s announced 30,000-Atlas-a-year Georgia metaplant for 2028 — and the pattern is consistent. The headline event is always a transparent demo. The thing the demo is actually selling is the elimination of the legal-protection premium, the break time, the shift differential, the workers’ comp claim, the hiring funnel.

Aime won 12,924 to 12,732 on Sunday. The trial demonstrated that on apples-to-apples task-execution metrics in a Bay Area facility subject to California labor code, the gap between a healthy intern and the current generation of humanoid is 0.04 seconds per box and one forearm. It is not the gap most warehouse operators are looking at.

What to watch

  • Figure 03 shift two. The fleet has cleared 116 hours of continuous runtime as of writing. The 200-hour mark — five complete 40-hour workweeks back-to-back, the unit nobody else has shown — is the inflection point at which the Adcock line stops being marketing.
  • The first non-California rematch. If Figure runs the same trial in Texas, Florida, or Tennessee against a worker without mandatory meal-and-rest protections, the script of the next press release writes itself.
  • The first multi-skill challenge. A 10-hour split — five hours sorting, five hours decanting, no Helix-02 retraining between them — is the trial Adcock has not staged. The day he does, and the fleet clears it, the warehouse-manager economics shift in a way “boxes per second” never will.
  • Aime’s next job. Figure has a non-trivial PR incentive to show what happens to the human intern after the shift. The cohort’s class-of-2026 problem is that the answer is currently not obvious.

The robot did not win on Sunday. The robot does not need to. The robot needs the legal-protection premium to be priced higher than 0.04 seconds per box, and as of May 17 it is.
