Figure AI scheduled an eight-hour livestream. The three humanoid robots inside the warehouse, all running Figure's in-house neural network called Helix-02, were supposed to pick up small packages, find barcodes, and place each item facing down on a conveyor belt. Around hour eight, when the test was supposed to wrap, nothing had broken. The company kept the stream going. Twenty-four hours later it was still running, and the robots had sorted more than 28,000 packages. Viewers, by then, had named them Bob, Frank and Gary. Figure leaned into it and added physical name tags.
The reporting on the run, syndicated through Fox News and several regional outlets via the CyberGuy column, hit the obvious beats. Speeds close to human throughput. A claimed automatic-reset feature that lets a stuck robot recover without intervention. A second robot ready to swap in if the first needs maintenance, so the line keeps moving. CEO Brett Adcock made a point of stressing that nobody was teleoperating the machines; every action came from the model. The implicit message was that Figure has crossed a threshold from impressive clips to sustained, lights-out work.
It is worth pausing on what was actually shown and what was not. The task itself, package-to-conveyor with a barcode read, is closer to industrial robotics than to general-purpose dexterity. The packages were presumably consistent in size, the lighting was controlled, the floor did not move. Figure has tested similar systems at a BMW plant in South Carolina, and that environment is the natural near-term market: structured industrial space, repetitive motion, clear payoff per replaced labour-hour. The leap from there to the home, or even to a less predictable warehouse, remains huge.
An essay published the same week by Steven Strauss at Daily Kos, drawing on three talks at the Asian Leadership Conference in Seoul, made the harder argument. Humanoid robots, Strauss writes, are simultaneously the form factor that grabs the press and one of the least practical robot designs. The reason is data. Language models had the internet to train on. A robot trying to learn how a strawberry resists crushing or how a folded sweater behaves has to collect that information physically, slowly, in the real world. Moravec's paradox, formulated in 1988, still bites: the things humans find easy are the things robots find hardest.
Strauss cites a Wall Street Journal review of the 1X Neo, a $20,000 humanoid also offered on a $499 monthly subscription. In tests it took the robot over a minute to pick up a water bottle, five minutes to load three dishes, and two minutes to fold a single sweater. It nearly fell while closing a dishwasher, and parts of the demo were controlled by a remote human in a VR headset. Self-driving cars, Strauss notes, started gathering training data in 2016 and took roughly thirty years from the first lab demos before they were even somewhat reliable. Humanoids, by his clock, are still at the lab-demo stage.
Both pictures can be true at once. Figure's run shows that constrained warehouse tasks are reaching the kind of reliability at which companies will start running pilots. The Daily Kos piece is a useful corrective to the broader claim that household humanoids are around the corner. The two ends of the market are diverging: structured industrial work where the economics already pencil out, and unstructured domestic work where the data problem may genuinely take decades.
There is a softer story underneath, too. Figure did not name the robots; the audience did. Once Bob, Frank and Gary existed as characters, the demo stopped being a tech showcase and became a workplace. That is exactly the dynamic researchers warn about. People trust anthropomorphised machines more than they should, then react more harshly when those machines fail. Much of the humanoid hype cycle is built on the same instinct that made viewers in the livestream chat type out the names. The next question is what happens when Bob drops a package, hits a person, or quietly stops showing up because the maintenance economics turned out worse than the demo suggested.
For now, the headline is honest enough on its own terms. Three machines, no human input, more than twenty-eight thousand packages, no reported failure across a full day. The interesting bit is everything that test was not asked to do.