- Surfer-H Meets Holo1: Cost-Efficient Web Agent Powered by Open Weights
  Paper • 2506.02865 • Published • 33
- Chatterbox TTS
  🍿 1.67k • Expressive Zeroshot TTS
- Loacky/Animator2D-v3.0.0-alpha
  Text-to-Image • Updated • 5 • 2
- ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model
  Paper • 2304.01116 • Published • 2
Collections
Collections including paper arxiv:2407.06188
- MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model
  Paper • 2405.20222 • Published • 11
- ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation
  Paper • 2406.00908 • Published • 12
- CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation
  Paper • 2406.02509 • Published • 10
- I4VGen: Image as Stepping Stone for Text-to-Video Generation
  Paper • 2406.02230 • Published • 18

- WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
  Paper • 2401.09985 • Published • 18
- CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects
  Paper • 2401.09962 • Published • 9
- Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution
  Paper • 2401.10404 • Published • 10
- ActAnywhere: Subject-Aware Video Background Generation
  Paper • 2401.10822 • Published • 13

- 4K4DGen: Panoramic 4D Generation at 4K Resolution
  Paper • 2406.13527 • Published • 9
- Style-NeRF2NeRF: 3D Style Transfer From Style-Aligned Multi-View Images
  Paper • 2406.13393 • Published • 5
- YouDream: Generating Anatomically Controllable Consistent Text-to-3D Animals
  Paper • 2406.16273 • Published • 43
- EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model
  Paper • 2406.20076 • Published • 10

- ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models
  Paper • 2403.01807 • Published • 9
- TripoSR: Fast 3D Object Reconstruction from a Single Image
  Paper • 2403.02151 • Published • 16
- OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
  Paper • 2403.01779 • Published • 30
- MagicClay: Sculpting Meshes With Generative Neural Fields
  Paper • 2403.02460 • Published • 8