Stable Diffusion v2 GGUF Model Card

Quantized versions of stable-diffusion-2 in GGUF format for use with stable-diffusion.cpp.

At the time of publishing, no ready-made GGUF weights for SD 2.0 were available for the stable-diffusion.cpp runtime — so here we are.

Sample generation: "A lovely cat" · Q8_0 · 768x768


Available Quantizations

File                 Quantization  Description
768-v-ema-Q8_0.gguf  Q8_0          High quality, roughly half the size of bf16
768-v-ema-Q4_K.gguf  Q4_K          Balanced quality/size
768-v-ema-Q4_0.gguf  Q4_0          Smallest and fastest, slight quality loss
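The size differences follow from GGUF's block-quantization layout. As a rough sketch (the ~1e9 parameter count and the effective bits-per-weight figures below are approximations, not numbers from this card):

```python
# Rough GGUF file-size estimate from effective bits per weight.
# Assumed block layouts (approximate):
#   Q8_0: 32 int8 weights + one fp16 scale per block -> 8 + 16/32 = 8.5 bpw
#   Q4_0: 32 4-bit weights + one fp16 scale per block -> 4 + 16/32 = 4.5 bpw
#   Q4_K: 256-weight super-blocks, ~4.5 bpw in practice
PARAMS = 1_000_000_000  # assumption: ~1B parameters

def est_size_gib(bits_per_weight: float, params: int = PARAMS) -> float:
    """Approximate file size in GiB for a given quantization density."""
    return params * bits_per_weight / 8 / 2**30

for name, bpw in [("bf16", 16.0), ("Q8_0", 8.5), ("Q4_K", 4.5), ("Q4_0", 4.5)]:
    print(f"{name}: ~{est_size_gib(bpw):.2f} GiB")
```

This is why Q8_0 comes out at roughly half the bf16 size (8.5 vs 16 bits per weight), while the 4-bit variants land near a quarter.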

Quick Start

1. Download the model

wget https://huggingface.co/kostakoff/stable-diffusion-2-GGUF/resolve/main/768-v-ema-Q8_0.gguf
# Other quantizations:
# wget https://huggingface.co/kostakoff/stable-diffusion-2-GGUF/resolve/main/768-v-ema-Q4_K.gguf
# wget https://huggingface.co/kostakoff/stable-diffusion-2-GGUF/resolve/main/768-v-ema-Q4_0.gguf

2. Build stable-diffusion.cpp

Requirements: CUDA-capable GPU, CMake ≥ 3.18, CUDA Toolkit

git clone https://github.com/leejet/stable-diffusion.cpp
cd stable-diffusion.cpp
git submodule init
git submodule update
mkdir build && cd build
cmake .. -DSD_CUDA=ON
cmake --build . --config Release

This was tested on commit d950627 (version master-520-d950627). Check your version with:

./build/bin/sd-cli --version

3. Start the server

export CUDA_VISIBLE_DEVICES=0
./stable-diffusion.cpp/build/bin/sd-server \
  -m ./768-v-ema-Q8_0.gguf \
  --listen-ip 0.0.0.0 \
  --listen-port 8081 \
  --seed -1

The server exposes an OpenAI-compatible /v1/images/generations endpoint.

4. Generate an image

curl -s http://127.0.0.1:8081/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sd2.0",
    "prompt": "A lovely cat",
    "n": 1,
    "size": "768x768",
    "response_format": "b64_json"
  }' | jq -r '.data[0].b64_json' | base64 --decode > output.png

Extra parameters can be passed via <sd_cpp_extra_args> as a JSON snippet embedded directly in the prompt field.
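The same endpoint can be called programmatically. A minimal Python sketch, assuming the server above is running on 127.0.0.1:8081 (only the standard library is used; the helper names are illustrative):

```python
import base64
import json
import urllib.request

def build_payload(prompt: str, size: str = "768x768", n: int = 1) -> dict:
    """Build an OpenAI-style images/generations request body."""
    return {"model": "sd2.0", "prompt": prompt, "n": n,
            "size": size, "response_format": "b64_json"}

def decode_first_image(response: dict) -> bytes:
    """Base64-decode the first image in an images/generations response."""
    return base64.b64decode(response["data"][0]["b64_json"])

def generate(prompt: str,
             url: str = "http://127.0.0.1:8081/v1/images/generations") -> bytes:
    """POST a generation request and return the PNG bytes."""
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return decode_first_image(json.load(resp))

# Usage (requires the server to be running):
#   png = generate("A lovely cat")
#   open("output.png", "wb").write(png)
```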


How the weights were created

Converted from the original 768-v-ema.safetensors weights using the built-in sd-cli conversion tool:

# Q4_0
./stable-diffusion.cpp/build/bin/sd-cli -M convert \
  -m ~/llm/models/sd2.0/768-v-ema.safetensors \
  -o 768-v-ema-Q4_0.gguf -v --type q4_0
# Q4_K
./stable-diffusion.cpp/build/bin/sd-cli -M convert \
  -m ~/llm/models/sd2.0/768-v-ema.safetensors \
  -o ./768-v-ema-Q4_K.gguf -v --type q4_K
# Q8_0
./stable-diffusion.cpp/build/bin/sd-cli -M convert \
  -m ~/llm/models/sd2.0/768-v-ema.safetensors \
  -o ./768-v-ema-Q8_0.gguf -v --type q8_0
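The three invocations differ only in the `--type` flag and output name, so they can be expressed as a loop. The sketch below is a dry run that prints each command instead of executing it — remove the leading `echo` once the paths check out:

```shell
#!/bin/sh
# Dry run: print one sd-cli convert command per quantization type.
SD_CLI=./stable-diffusion.cpp/build/bin/sd-cli
SRC=~/llm/models/sd2.0/768-v-ema.safetensors

for t in q4_0 q4_K q8_0; do
  # q4_0 -> Q4_0, q4_K -> Q4_K, q8_0 -> Q8_0
  out="768-v-ema-$(printf '%s' "$t" | tr '[:lower:]' '[:upper:]').gguf"
  echo "$SD_CLI" -M convert -m "$SRC" -o "./$out" -v --type "$t"
done
```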

License

This model inherits the license of the original: CreativeML Open RAIL++-M
