Vikram Voleti

Stable-Layers: Fine-Tuning Image Layer Decomposition Models with VLM-Scored Reinforcement Learning

An RL framework that fine-tunes image layer decomposition models using VLM-as-judge rewards, eliminating the need for paired supervision.

ciara-rowles • May 28, 2026 • 1 min read

KV Cache

OCTOPUS: Optimized KV Cache for Transformers via Octahedral Parametrization Under Optimal Squared Error Quantization

A rotation-preconditioned KV cache codec that jointly quantizes coordinate triplets via an octahedral map, achieving state-of-the-art compression across text, video, and audio …

Mark Boss

• May 21, 2026 • 1 min read

Video Diffusion

SViM3D: Stable Video Material Diffusion for Single Image 3D Generation

We present Stable Video Materials 3D (SViM3D), a framework to predict multi-view consistent physically based rendering (PBR) materials, given a single image. Recently, video …

andreas-engelhardt • Sep 30, 2025 • 1 min read

Novel View Synthesis

Stable Virtual Camera: Generative View Synthesis with Diffusion Models

We present Stable Virtual Camera (Seva), a generalist diffusion model that creates novel views of a scene, given any number of input views and target cameras. Existing works …

jensen-zhou • Mar 19, 2025 • 1 min read

Video Diffusion

SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion

We present Stable Video 3D (SV3D), a latent video diffusion model for high-resolution, image-to-multi-view generation of orbital videos around a 3D object. Recent work on 3D …

vikram-voleti • Mar 18, 2024 • 1 min read
