Forth VM and compiler written in C++ and Scryer Prolog
12 by triska | 0 comments on Hacker News.
Humans Times
Stay up to date with hourly news from Humanstimes
Tuesday, March 31, 2026
New top story on Hacker News: Show HN: PhAIL – Real-robot benchmark for AI models. The gap to humans is 20x
6 by vertix | 7 comments on Hacker News.
I built this because I couldn't find honest numbers on how well VLA models actually work on commercial tasks. I come from search ranking at Google, where you measure everything, and in robotics nobody seemed to know.

PhAIL runs four models (OpenPI/pi0.5, GR00T, ACT, SmolVLA) on bin-to-bin order picking – one of the most common warehouse operations. Same robot (Franka FR3), same objects, hundreds of blind runs. The operator doesn't know which model is running.

Best model: 64 UPH. Human teleoperating the same robot: 330. Human by hand: 1,300+.

Everything is public – every run with synced video and telemetry, the fine-tuning dataset, training scripts. The leaderboard is open for submissions.

Happy to answer questions about methodology, the models, or what we observed.
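For readers who want to check the "20x" in the title, here is a minimal sketch of the arithmetic using only the three throughput figures quoted in the post. It assumes UPH means units per hour and treats "1,300+" as a lower bound of 1,300. The variable names and the `gap` helper are illustrative and are not taken from the PhAIL codebase.

```python
# Throughput figures quoted in the post (assumed to be units per hour, UPH).
best_model_uph = 64    # best of the four evaluated models
teleop_uph = 330       # human teleoperating the same robot
by_hand_uph = 1300     # human by hand; the post says "1,300+", so a lower bound


def gap(human_uph: float, model_uph: float) -> float:
    """Return how many times faster the human baseline is than the model."""
    return human_uph / model_uph


teleop_gap = gap(teleop_uph, best_model_uph)
by_hand_gap = gap(by_hand_uph, best_model_uph)

print(f"teleoperation vs. best model: {teleop_gap:.1f}x")  # 5.2x
print(f"by hand vs. best model: {by_hand_gap:.1f}x")       # 20.3x
```

On these numbers, the headline gap appears to compare the best model against a human working by hand; against a human teleoperating the same robot the gap is closer to 5x.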