31 October: 2D-to-3D Benchmark Metrics
On 24 October, 2D image inputs went live on the Subnet 17 mainnet, enabling direct like-for-like comparison with existing generative 3D foundation models. This is a snapshot of the results after one week.
This benchmark evaluates leading generative 3D foundation models. It is inspired by 3D Arena and uses Visual Language Model (VLM) judges as a non-human evaluation mechanism.
Read more about evaluation methodology here.
Motivation
Quantitatively evaluating the quality of generated 3D assets is challenging and subjective, and there is no standard practice for evaluating aesthetics in real-world applications.
This head-to-head competition leverages the strength of reasoning models at like-for-like comparisons, displays the final results side by side, and allows all files to be downloaded for further human evaluation.
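The exact judging prompt and model are not detailed in this post; the following is a minimal sketch of how a single pairwise comparison with a VLM judge could be wired up. It assumes the OpenAI Python client, gpt-4o as the judge, and illustrative URLs for the input image and the two candidate renders; none of these are the benchmark's confirmed setup.

```python
# Minimal sketch of one pairwise VLM judgement. Model name, prompt wording,
# and URLs are illustrative assumptions, not the benchmark's actual configuration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def judge_pair(input_image_url: str, render_a_url: str, render_b_url: str) -> str:
    """Ask a vision-capable judge which candidate render better matches the input image."""
    prompt = (
        "You are judging two 3D reconstructions of the same source image. "
        "The first image is the 2D input; the next two are renders of candidates A and B. "
        "Reply with exactly one word: A, B, or Draw."
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # assumption: any vision-capable reasoning model could serve as judge
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": input_image_url}},
                {"type": "image_url", "image_url": {"url": render_a_url}},
                {"type": "image_url", "image_url": {"url": render_b_url}},
            ],
        }],
    )
    return response.choices[0].message.content.strip()
```

In practice a judging pipeline would run this over many input images and aggregate the A/B/Draw verdicts into the tallies shown in the Results section.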
Criteria
Models must handle image (.png, .jpg) inputs and produce mesh (.obj, .glb) or splat (.splat, .ply) outputs. They should run end-to-end without human intervention, including UV unwrapping, texture mapping, and other post-processing.
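As an illustration of these criteria, a submission can be checked mechanically: the sketch below verifies that an output file uses one of the accepted extensions and, for mesh formats, that it parses into non-empty geometry. The trimesh library is an assumption for illustration, not the benchmark's stated tooling.

```python
# Sketch of a mechanical check on a generated asset: accepted extension,
# and (for meshes) that the file loads as non-empty geometry.
# trimesh is an assumption here, not the benchmark's stated tooling.
from pathlib import Path

import trimesh

MESH_EXTS = {".obj", ".glb"}
SPLAT_EXTS = {".splat", ".ply"}


def check_output(path: str) -> bool:
    ext = Path(path).suffix.lower()
    if ext in SPLAT_EXTS:
        # Splat formats are point-based; existence plus extension is the cheap check.
        return Path(path).is_file()
    if ext in MESH_EXTS:
        loaded = trimesh.load(path)
        # .glb files typically load as a Scene; .obj may load as a single Trimesh.
        if isinstance(loaded, trimesh.Scene):
            return len(loaded.geometry) > 0
        return isinstance(loaded, trimesh.Trimesh) and len(loaded.faces) > 0
    return False
```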
Contributing
All inputs and outputs are publicly available here. Input image URLs are provided here.
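For anyone reproducing or extending the evaluation, the sketch below pulls the input images locally. It assumes the linked URL list has been saved as input_image_urls.txt with one URL per line; the file name is hypothetical, since the post links the list rather than naming it.

```python
# Sketch for downloading the benchmark's input images from a local list of URLs.
# input_image_urls.txt is a hypothetical file name (one URL per line).
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve


def download_inputs(url_list_path: str, out_dir: str = "inputs") -> None:
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    for url in Path(url_list_path).read_text().splitlines():
        url = url.strip()
        if not url:
            continue
        filename = Path(urlparse(url).path).name or "image.png"
        urlretrieve(url, str(out / filename))
```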
Results
404 v. CSM Cube: 43 wins for 404, 16 wins for CSM Cube, 42 draws. Results & Data
404 v. Trellis: 39 wins for 404, 17 wins for Trellis, 45 draws. Results & Data
404 v. Hunyuan 2.1: 70 wins for 404, 15 wins for Hunyuan, 16 draws. Results & Data
404 v. Meshy: 97 wins for 404, 2 wins for Meshy, 1 draw. Results & Data
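For a quick summary of the tallies above, the snippet below computes 404's win rate across all comparisons and across decisive (non-draw) comparisons only; the counts are copied directly from the results listed here.

```python
# Win rates for 404 computed from the tallies above: (404 wins, opponent wins, draws).
results = {
    "CSM Cube":    (43, 16, 42),
    "Trellis":     (39, 17, 45),
    "Hunyuan 2.1": (70, 15, 16),
    "Meshy":       (97, 2, 1),
}

for opponent, (wins, losses, draws) in results.items():
    total = wins + losses + draws
    decisive = wins + losses
    print(
        f"404 vs {opponent}: {wins / total:.0%} of all comparisons, "
        f"{wins / decisive:.0%} of decisive comparisons"
    )
```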
Future Benchmarks
This set of generations (one week after launching on mainnet) has been submitted to 3D Arena for human-in-the-loop ranking on their leaderboard.
This benchmark will be updated on an ongoing basis with additional closed-source models as they are released.