Instructions for using diffutron/DiffutronLM-0.3B-1st-Stage with libraries and local apps.

Libraries

How to use diffutron/DiffutronLM-0.3B-1st-Stage with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="diffutron/DiffutronLM-0.3B-1st-Stage")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("diffutron/DiffutronLM-0.3B-1st-Stage")
model = AutoModelForMaskedLM.from_pretrained("diffutron/DiffutronLM-0.3B-1st-Stage")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

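# Note: this is a masked diffusion model. The model card recommends the `dllm`
# library or a custom diffusion loop instead of standard causal generation,
# so `generate` may not produce meaningful text here.
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))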

Local Apps

vLLM

How to use diffutron/DiffutronLM-0.3B-1st-Stage with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "diffutron/DiffutronLM-0.3B-1st-Stage"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "diffutron/DiffutronLM-0.3B-1st-Stage",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/diffutron/DiffutronLM-0.3B-1st-Stage

SGLang

How to use diffutron/DiffutronLM-0.3B-1st-Stage with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "diffutron/DiffutronLM-0.3B-1st-Stage" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "diffutron/DiffutronLM-0.3B-1st-Stage",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "diffutron/DiffutronLM-0.3B-1st-Stage" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "diffutron/DiffutronLM-0.3B-1st-Stage",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use diffutron/DiffutronLM-0.3B-1st-Stage with Docker Model Runner:
```
docker model run hf.co/diffutron/DiffutronLM-0.3B-1st-Stage
```

DiffutronLM-0.3B-1st-Stage / README.md

suayptalha

Update README.md

c13e391 verified 2 months ago


	---
	library_name: transformers
	tags:
	- mdlm
	- diffusion
	license: apache-2.0
	datasets:
	- metunlp/LlamaTurk-Instruction-Set
	language:
	- tr
	base_model:
	- diffutron/DiffutronLM-0.3B-Base
	pipeline_tag: text-generation
	new_version: diffutron/DiffutronLM-0.3B-Instruct
	---
	# DiffutronLM-0.3B-1st-Stage

	DiffutronLM-0.3B-1st-Stage is an intermediate checkpoint of the Diffutron series, a parameter-efficient Masked Diffusion Language Model (MDLM) for Turkish.

	This checkpoint marks the end of the first stage of instruction fine-tuning. It is trained to follow general instructions in Turkish and serves as the starting point for the more complex, domain-specific specialization handled in the final `Instruct` model.

	## 📌 Model Details

	* Model Type: Masked Diffusion Language Model (MDLM)
	* Base Architecture: `jhu-clsp/mmBERT-base` (Multilingual Encoder)
	* Language: Turkish
	* Parameter Count: 307M (0.3B)
	* Context Length: 256 tokens
	* Training Libraries: `dllm`, PyTorch
	* Status: Intermediate Checkpoint (Stage 1 SFT)

	## 🚀 Training Pipeline for This Checkpoint

	Diffutron replaces traditional next-token autoregressive generation with a discrete diffusion process, generating text by iteratively refining sequences in parallel. To reach this checkpoint, the model underwent two main phases:

	### 1. Continual Pre-training (CPT)
	The multilingual backbone was adapted to Turkish using a high-rank LoRA strategy (r=256, α=256) on ~2 million sequences sourced from Havadis, Temiz-OSCAR, and Turkish Wikipedia. This adapts the model to Turkish morphology while avoiding catastrophic forgetting of the multilingual backbone.
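
As a rough illustration of what r=256 and α=256 imply, the sketch below computes the LoRA scaling factor and the adapter size for a single weight matrix. The 768x768 projection shape is an assumed example, not a figure stated in this card, and the helper names are hypothetical.

```python
# Illustrative LoRA arithmetic for r=256, alpha=256 (values from the model card).
# The 768x768 projection shape is an assumed example, not taken from the card.

def lora_scaling(r: int, alpha: int) -> float:
    """Scaling applied to the low-rank update: delta_W = (alpha / r) * B @ A."""
    return alpha / r


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for one adapted weight: A is (r x d_in), B is (d_out x r)."""
    return r * d_in + d_out * r


R, ALPHA = 256, 256
D_IN = D_OUT = 768  # hypothetical projection size

scaling = lora_scaling(R, ALPHA)
adapter_params = lora_param_count(D_IN, D_OUT, R)
full_params = D_IN * D_OUT

print(f"scaling = {scaling}")
print(f"adapter params = {adapter_params}, full matrix params = {full_params}")
print(f"adapter / full = {adapter_params / full_params:.2f}")
```

With α equal to r, the low-rank update is applied unscaled. Under the assumed shape, the adapter holds about two thirds as many parameters as the full matrix, which is what makes the rank "high" compared with typical single- or double-digit LoRA ranks.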

	### 2. Stage 1: Foundational Instruction Tuning
	Following CPT, the model underwent full supervised fine-tuning (SFT) to align it with human intent.
	* Dataset: `metunlp/LlamaTurk-Instruction-Set`
	* Objective: Introduce the model to a broad range of general instructions and establish basic response coherence.
	* Hyperparameters: 20 Epochs, Batch Size 16, AdamW optimizer (lr=1e-4), Max Sequence Length 256.

	(Note: For the most advanced instruction-following capabilities, including complex reasoning, we recommend using the final `DiffutronLM-0.3B-Instruct` model, which includes a second stage of tuning on `InstrucTurca`.)

	## 📊 Evaluation Results

	Despite being an intermediate checkpoint, the 1st-Stage model averages 32.41 on the CETVEL Benchmark Suite. This is close to TURNA (1.1B, 33.19) and Kumru (2B, 34.09) and ahead of Aya-101 (13B, 31.78), although it trails Kanarya (2B), Llama-3.2 (3B), and Trendyol (7B).

	| Benchmark | Diffutron-1st-Stage (0.3B) | Diffutron-2nd-Stage (0.3B) | TURNA (1.1B) | Kumru (2B) | Kanarya (2B) | Llama-3.2 (3B) | Trendyol (7B) | Aya-101 (13B) |
	| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
	| Belebele_TR | 22.22 | 27.00 | 22.56 | 29.00 | 28.11 | 55.78 | 36.22 | 22.89 |
	| EXAMS_TR | 25.95 | 27.74 | 23.66 | 30.03 | 30.03 | 26.21 | 28.50 | 22.90 |
	| IronyTR | 50.67 | 52.00 | 48.33 | 51.00 | 50.00 | 50.17 | 50.00 | 52.17 |
	| News_Cat | 23.20 | 32.40 | 32.80 | 26.40 | 66.80 | 64.00 | 81.20 | 20.00 |
	| MNLI_TR | 33.29 | 32.81 | 34.94 | 36.42 | 33.40 | 34.76 | 35.19 | 27.90 |
	| STS_TR | 17.77 | 18.78 | 14.21 | 11.75 | 12.91 | 12.91 | 15.52 | 16.97 |
	| XCOPA_TR | 53.80 | 52.00 | 55.80 | 54.00 | 64.20 | 54.60 | 61.00 | 59.60 |
	| Average | 32.41 | 34.68 | 33.19 | 34.09 | 40.78 | 42.63 | 43.95 | 31.78 |
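
The Average row appears to be the unweighted mean of the seven benchmark scores. A small sketch that recomputes it from the numbers in the table; the unweighted-mean assumption and the variable names are ours, not stated in the card.

```python
# Scores copied from the table above, one list per model, in row order:
# Belebele_TR, EXAMS_TR, IronyTR, News_Cat, MNLI_TR, STS_TR, XCOPA_TR
scores = {
    "Diffutron-1st-Stage (0.3B)": [22.22, 25.95, 50.67, 23.20, 33.29, 17.77, 53.80],
    "Diffutron-2nd-Stage (0.3B)": [27.00, 27.74, 52.00, 32.40, 32.81, 18.78, 52.00],
    "TURNA (1.1B)": [22.56, 23.66, 48.33, 32.80, 34.94, 14.21, 55.80],
    "Kumru (2B)": [29.00, 30.03, 51.00, 26.40, 36.42, 11.75, 54.00],
    "Kanarya (2B)": [28.11, 30.03, 50.00, 66.80, 33.40, 12.91, 64.20],
    "Llama-3.2 (3B)": [55.78, 26.21, 50.17, 64.00, 34.76, 12.91, 54.60],
    "Trendyol (7B)": [36.22, 28.50, 50.00, 81.20, 35.19, 15.52, 61.00],
    "Aya-101 (13B)": [22.89, 22.90, 52.17, 20.00, 27.90, 16.97, 59.60],
}


def mean_score(values):
    """Unweighted mean of the benchmark scores, rounded to two decimals."""
    return round(sum(values) / len(values), 2)


averages = {model: mean_score(vals) for model, vals in scores.items()}

# Print models from highest to lowest average.
for model, avg in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model:28s} {avg:.2f}")
```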

	## 💻 Usage

	Because Diffutron is a Masked Diffusion Language Model, it requires inference strategies distinct from standard causal generation. We recommend using the `dllm` library or custom generation loops tailored for discrete diffusion.

	### 1. Install the dllm Library:
	```bash
	git clone https://github.com/Diffutron/dllm.git
	cd dllm
	pip install -e .
	```
	### 2. Chat via Interaction Mode:

	```bash
	python -u examples/bert/chat.py \
	--model_name_or_path "diffutron/DiffutronLM-0.3B-1st-Stage" \
	--chat True \
	--steps 64 \
	--max_new_tokens 64 \
	--temperature 0.1 \
	--block_length 32 \
	--repetition_penalty 1.2 \
	--remasking "low_confidence" \
	--stochastic_transfer False \
	--cfg_scale 0.0
	```

	For other inference modes, see the [dllm](https://github.com/Diffutron/dllm) library.
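
To give a feel for what `--steps` and `--remasking "low_confidence"` control, here is a toy sketch of iterative unmasking: start from an all-mask sequence, predict every masked position in parallel, commit only the most confident predictions, and leave the rest masked for the next step. This is a conceptual illustration, not `dllm`'s implementation. The `toy_predict` function is a hypothetical stand-in for the model, and `--block_length`, `--temperature`, `--repetition_penalty`, and `--cfg_scale` are not modeled.

```python
import random

MASK = "[MASK]"


def toy_predict(sequence, target, rng):
    """Hypothetical stand-in for the denoiser.

    Returns {position: (token, confidence)} for every masked position.
    A real model would score the whole vocabulary; here the token comes
    from a fixed target and the confidence is random.
    """
    return {
        i: (target[i], rng.random())
        for i, tok in enumerate(sequence)
        if tok == MASK
    }


def diffusion_decode(target, steps, seed=0):
    """Fill an all-mask sequence over `steps` rounds, most confident first."""
    rng = random.Random(seed)
    sequence = [MASK] * len(target)
    for step in range(steps):
        masked = [i for i, tok in enumerate(sequence) if tok == MASK]
        if not masked:
            break
        predictions = toy_predict(sequence, target, rng)
        # Spread the remaining masked positions evenly over the remaining steps.
        remaining_steps = steps - step
        n_commit = -(-len(masked) // remaining_steps)  # ceiling division
        # Commit the most confident predictions; the low-confidence
        # positions stay masked and are predicted again next step.
        ranked = sorted(predictions, key=lambda i: predictions[i][1], reverse=True)
        for i in ranked[:n_commit]:
            sequence[i] = predictions[i][0]
        print(f"step {step + 1}: {' '.join(sequence)}")
    return sequence


target = "Merhaba , ben bir dil modeliyim .".split()
result = diffusion_decode(target, steps=4)
```

More steps mean fewer tokens committed per round, which in a real sampler trades speed for the chance to condition each prediction on more already-decoded context.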

	## ⚠️ Limitations

	* Intermediate State: This model has not undergone the final specialization phase and may struggle with highly complex or multi-turn instructions compared to the final Instruct model.
	* Context Window: Restricted to a 256-token context window.
	* Multilingual Backbone: Inherits representations from a multilingual encoder rather than from a natively trained Turkish foundation model.
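
One practical consequence of the 256-token limit is that long prompts leave little room for the response. A minimal sketch of the budget check, assuming the prompt and the generated tokens share the same 256-token window (the card does not spell this out); the helper names are hypothetical.

```python
CONTEXT_LENGTH = 256  # from the model card
MAX_NEW_TOKENS = 64   # value used in the chat command above


def prompt_budget(context_length: int, max_new_tokens: int) -> int:
    """Tokens left for the prompt once room for the response is reserved."""
    if max_new_tokens >= context_length:
        raise ValueError("max_new_tokens must be smaller than the context length")
    return context_length - max_new_tokens


def fits(prompt_token_count: int, context_length: int, max_new_tokens: int) -> bool:
    """True if a prompt of this many tokens leaves room for the full response."""
    return prompt_token_count <= prompt_budget(context_length, max_new_tokens)


budget = prompt_budget(CONTEXT_LENGTH, MAX_NEW_TOKENS)
print(f"prompt budget: {budget} tokens")
print(fits(150, CONTEXT_LENGTH, MAX_NEW_TOKENS))
print(fits(200, CONTEXT_LENGTH, MAX_NEW_TOKENS))
```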

	## 📝 Citation

	```bibtex
	@misc{diffutron2026,
	title={Diffutron: A Masked Diffusion Language Model for Turkish Language},
	author={Şuayp Talha Kocabay and Talha Rüzgar Akkuş},
	year={2026},
	eprint={2603.20466},
	archivePrefix={arXiv},
	primaryClass={cs.CL},
	url={https://arxiv.org/abs/2603.20466},
	}
	```