VINO: Unified Visual Generator (Official Weights)

VINO: A Unified Visual Generator with Interleaved OmniModal Context

🌐 Project Page • 📑 Paper • 💻 Code • 📺 Demo Video


🔥 What is VINO?

VINO is a unified framework for image and video generation and editing, powered by a Vision-Language Model (VLM) and a Multi-Modal Diffusion Transformer (MMDiT).

A single set of weights supports:

  • Text-to-Image
  • Text-to-Video
  • Image-to-Video
  • Multi-Image-to-Video
  • Image Editing
  • Video Editing
  • Element Cloning

One model. All visual generation & editing tasks.


📦 Contents of This Repository

This Hugging Face repository provides the official VINO model weights, including:

  • MMDiT backbone
  • Learnable multimodal tokens

These weights are intended to be used with:

👉 https://github.com/SOTAMak1r/VINO-code


🧩 Required Base Models

VINO depends on the following public checkpoints:

| Component | Source |
|-----------|--------|
| VLM | Qwen/Qwen3-VL-4B-Instruct |
| Video VAE | hunyuanvideo-community/HunyuanVideo |

They will be automatically downloaded by the VINO codebase.
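
If you are preparing a machine that will later run without network access, it may help to fetch the two base checkpoints ahead of time. The sketch below is only an illustration: `BASE_MODELS` and `prefetch_base_models` are hypothetical names, and it assumes the VINO codebase resolves these repos through the standard Hugging Face cache, which the repository does not state here.

```python
from typing import Callable, Dict, List

# Repo ids copied from the "Required Base Models" table.
BASE_MODELS: Dict[str, str] = {
    "VLM": "Qwen/Qwen3-VL-4B-Instruct",
    "Video VAE": "hunyuanvideo-community/HunyuanVideo",
}


def prefetch_base_models(download: Callable[[str], str]) -> List[str]:
    """Call download(repo_id) for each base model and return the local paths."""
    return [download(repo_id) for repo_id in BASE_MODELS.values()]


if __name__ == "__main__":
    # Needs `pip install huggingface_hub` and network access.
    from huggingface_hub import snapshot_download

    # Assumption: the default Hugging Face cache is where the codebase looks.
    for path in prefetch_base_models(lambda repo_id: snapshot_download(repo_id=repo_id)):
        print(path)
```

Note that `snapshot_download` fetches each repository in full, so this pulls the whole HunyuanVideo repo rather than only the VAE files.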


⬇️ Download

Option 1: Hugging Face CLI

huggingface-cli download SOTAMak1r/VINO-weight \
  --local-dir ./checkpoints/SOTAMak1r/VINO-weight \
  --local-dir-use-symlinks False
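
The same download can also be scripted from Python with `huggingface_hub.snapshot_download`. This is a minimal sketch rather than part of the official instructions: `local_dir_for` and `download_weights` are hypothetical helper names, and the target directory simply mirrors the CLI example above. No symlink option is passed, on the assumption that recent `huggingface_hub` releases write real files into `local_dir`; check this against your installed version.

```python
from pathlib import Path
from typing import Optional

REPO_ID = "SOTAMak1r/VINO-weight"


def local_dir_for(repo_id: str, root: str = "./checkpoints") -> Path:
    """Mirror the CLI example's layout: <root>/<owner>/<name>."""
    return Path(root) / repo_id


def download_weights(repo_id: str = REPO_ID, token: Optional[str] = None) -> str:
    """Download the full repo snapshot and return the local directory."""
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = local_dir_for(repo_id)
    return snapshot_download(repo_id=repo_id, local_dir=str(target), token=token)


if __name__ == "__main__":
    # Needs `pip install huggingface_hub` and network access.
    print(download_weights())
```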

Option 2: Inside VINO Repo (recommended)

python download.py --ak YOUR_HF_TOKEN

🚀 Quick Start

See full instructions in:

👉 https://github.com/SOTAMak1r/VINO-code


📄 License

  • Model Weights: CC BY-NC 4.0 (Non-Commercial Only)
  • Code: Apache 2.0

📝 Citation

@article{chen2026vino,
  title={VINO: A Unified Visual Generator with Interleaved OmniModal Context},
  author={Chen, Junyi and He, Tong and Fu, Zhoujie and Wan, Pengfei and Gai, Kun and Ye, Weicai},
  journal={arXiv preprint arXiv:2601.02358},
  year={2026}
}