URML primitives as V-JEPA 2-AC action conditioning, and V-JEPA 2 predictions as a URML predictive-safety lane

Hi V-JEPA 2 team,

Posting this as a research-collaboration ask, not a feature request. I work on URML, an open spec for substrate-neutral robot intent (Apache 2.0, urml.dev). Two integration vectors with V-JEPA 2 look interesting from URML's side, and I wanted to surface the shape before writing code.

Vector A: URML primitives as V-JEPA 2-AC action conditioning input. Each URML primitive maps to one or more action-conditioning tokens, and the model's prediction proceeds normally over those tokens. If URML annotation lands on Droid (the OXE annotation proposal, RFC-0046, is the upstream path), V-JEPA 2-AC can fine-tune on URML-annotated trajectories and emit URML primitive sequences as its action representation.
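To make Vector A concrete, here is a minimal sketch of one candidate encoding, not a proposal for the final one. It assumes each URML primitive expands into a run of fixed-width continuous action vectors, and that each token carries a primitive index plus a boundary flag so the primitive boundaries survive the flattening. The names `Primitive`, `ActionToken`, `encode_program`, the primitive names in the usage note, and the width `ACTION_DIM = 7` are all hypothetical; none of them come from the URML spec or the V-JEPA 2 codebase.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Assumption: a fixed action width per token (illustrative value only).
ACTION_DIM = 7


@dataclass(frozen=True)
class Primitive:
    """Hypothetical URML primitive: a name plus its per-step action vectors."""
    name: str
    actions: Sequence[Sequence[float]]


@dataclass(frozen=True)
class ActionToken:
    """One action-conditioning token with primitive bookkeeping attached."""
    vector: Tuple[float, ...]
    primitive_index: int
    is_boundary: bool  # True on the first token of each primitive


def encode_program(program: Sequence[Primitive]) -> List[ActionToken]:
    """Flatten a primitive sequence into tokens, marking primitive boundaries."""
    tokens: List[ActionToken] = []
    for index, primitive in enumerate(program):
        if not primitive.actions:
            raise ValueError(f"primitive {primitive.name!r} has no actions")
        for step, vector in enumerate(primitive.actions):
            if len(vector) != ACTION_DIM:
                raise ValueError(
                    f"primitive {primitive.name!r} step {step}: "
                    f"expected {ACTION_DIM} values, got {len(vector)}"
                )
            tokens.append(
                ActionToken(
                    vector=tuple(float(x) for x in vector),
                    primitive_index=index,
                    is_boundary=(step == 0),
                )
            )
    return tokens
```

For example, a program of a two-step `move_to` followed by a one-step `grasp` (both names made up here) yields three tokens, with the boundary flag set on the first and third. Whether boundaries belong in a side channel like this or in dedicated separator tokens is exactly the open question in item 1 below.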

Vector B: V-JEPA 2 predictions as URML's predictive-safety lane. Before URML's validator accepts a candidate program, the model predicts the end-state video embedding, and URML's safety envelope checks the prediction. The pre-execution simulation runs against a learned model of the world rather than an analytical one. No other URML target offers this shape, which is why V-JEPA 2 is the one in URML's outreach landscape I'm most curious to discuss.
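The gate in Vector B could take roughly the following shape. This is a sketch under stated assumptions, not URML's validator or a V-JEPA 2 API: the predictor is passed in as a plain callable (`predict_end_state`, hypothetical), and the safety envelope is modeled as a minimum cosine distance from a set of known-unsafe reference embeddings, which is only one of several ways an envelope check could be defined.

```python
import math
from typing import Callable, Sequence

Embedding = Sequence[float]


def cosine_distance(a: Embedding, b: Embedding) -> float:
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (norm_a * norm_b)


def passes_predictive_gate(
    program: object,
    predict_end_state: Callable[[object], Embedding],  # hypothetical model hook
    unsafe_references: Sequence[Embedding],            # assumed envelope data
    min_distance: float,
) -> bool:
    """Accept the program only if its predicted end state stays at least
    min_distance away from every known-unsafe reference embedding."""
    predicted = predict_end_state(program)
    return all(
        cosine_distance(predicted, reference) >= min_distance
        for reference in unsafe_references
    )
```

In the evaluation-only variant raised in item 2 below, the same function would run but its result would be logged or visualized rather than used to block execution.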

Full write-up with the proposed encoding, mapping, drawbacks, and alternatives:
https://github.com/URML-MARS/URML/blob/main/docs/rfcs/0052-meta-fair-vjepa2.md

Things I'd want input on before building:

1. What is the right token-level encoding for URML primitives in V-JEPA 2-AC's action conditioning? The Droid action representation is straightforward; the URML-primitive boundaries are the open question.
2. The predictive-safety lane (Vector B) is novel. Would FAIR be comfortable with framing it as a pre-execution gate, or would you prefer that URML keep it evaluation-only (visualize predictions, but don't gate execution on them)?
3. Is URML annotation on Droid trajectories, matching the OXE sidecar shape, acceptable from FAIR's side?
4. Bridge home: standalone urml-vjepa2-bridge on PyPI, or a contributed example in facebookresearch/vjepa2?
5. Anything coming up on the FAIR side (workshop, benchmark, paper) where a URML conformance lane would slot in usefully?

No rush. The world-model angle is the most distinctive thing across this outreach program and I'd rather get the shape right than ship a wrapper you'd want rebuilt.

Ido
greenvh@gmail.com

