SkeletonLLM Visualization Gallery

Universal Skeleton Understanding via Differentiable Rendering and MLLMs

This page collects qualitative DrAction renderings prepared for the ICML 2026 paper. It shows visual samples only; for release status and paper information, see the official code repository.

Rendering Overview

DrAction converts skeleton sequences into visual tokens that multimodal large language models (MLLMs) can process directly, while preserving motion cues across different skeleton formats.
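
As a rough illustration of what rendering a skeleton sequence into image frames can look like, the sketch below splats 2D joints as Gaussian blobs with NumPy. It is not DrAction's learned renderer, whose details are not given on this page. The function names `render_joints_gaussian` and `render_sequence`, the normalized [0, 1] joint coordinates, and the `size` and `sigma` parameters are all assumptions made for this example.

```python
import numpy as np

def render_joints_gaussian(joints_xy, size=32, sigma=1.5):
    """Hypothetical helper: splat (J, 2) joints in [0, 1] coords into a (size, size) heatmap."""
    joints_xy = np.asarray(joints_xy, dtype=np.float64)
    # Pixel coordinate grids, each of shape (size, size); rows are y, columns are x.
    ys, xs = np.meshgrid(np.arange(size), np.arange(size), indexing="ij")
    # Scale normalized joint coordinates to pixel units.
    px = joints_xy[:, 0] * (size - 1)
    py = joints_xy[:, 1] * (size - 1)
    # Squared distance from every pixel to every joint, shape (J, size, size).
    d2 = (xs[None] - px[:, None, None]) ** 2 + (ys[None] - py[:, None, None]) ** 2
    # Smooth Gaussian falloff; the max over joints keeps values in [0, 1].
    return np.exp(-d2 / (2.0 * sigma ** 2)).max(axis=0)

def render_sequence(seq_xy, size=32, sigma=1.5):
    """Hypothetical helper: render a (T, J, 2) sequence into (T, size, size) frames."""
    return np.stack([render_joints_gaussian(frame, size, sigma) for frame in seq_xy])

# Example: 3 frames, 2 joints placed at the top-left and top-right corners.
seq = np.array([[[0.0, 0.0], [1.0, 0.0]]] * 3)
frames = render_sequence(seq)
print(frames.shape)
```

Each pixel value here is a smooth function of the joint coordinates, which is the general property a differentiable renderer relies on. A learned renderer would additionally carry trainable parameters, which this fixed sketch does not attempt to model.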

SkeletonLLM framework overview: SkeletonLLM uses DrAction to render skeleton motion before visual-language reasoning.
Qualitative comparison of rendering methods: Learned rendering highlights motion-relevant regions more clearly than fixed renderers.

Each animation shows a rendered skeleton sequence with its submitted ground-truth motion description.