Stable-Layers: Fine-Tuning Image Layer Decomposition Models with VLM-Scored Reinforcement Learning
An RL framework that fine-tunes image layer decomposition models using VLM-as-judge rewards, eliminating paired supervision.
An RL framework that fine-tunes image layer decomposition models using VLM-as-judge rewards, eliminating paired supervision.
A rotation-preconditioned KV cache codec that jointly quantizes coordinate triplets via an octahedral map, achieving state-of-the-art compression across text, video, and audio …
We present Stable Video Materials 3D (SViM3D), a framework to predict multi-view consistent physically based rendering (PBR) materials, given a single image. Recently, video …
We present Stable Virtual Camera (Seva), a generalist diffusion model that creates novel views of a scene, given any number of input views and target cameras. Existing works …
We present Stable Video 3D (SV3D) - a latent video diffusion model for high-resolution, image-to-multi-view generation of orbital videos around a 3D object. Recent work on 3D …