Qwen3-Coder-30B-A3B-Kubernetes-Instruct

Model Description

Qwen3-Coder-30B-A3B-Kubernetes-Instruct is a specialized fine-tune of Qwen3-Coder-30B-A3B-Instruct.

It is designed to assist with troubleshooting your Kubernetes-related configuration and is specifically optimized for YAML generation.

Developed by: Doğaç Eldenk, Robin Luo – Northwestern University
Model type: Large Language Model (Fine-tune)
Language(s) (NLP): English, YAML
License: Apache-2.0
Finetuned from model: Qwen/Qwen3-Coder-30B-A3B-Instruct

Model Sources

Repository: https://huggingface.co/Dogacel/Qwen3-Coder-30B-A3B-Kubernetes-Instruct
Paper: (In-progress)

Uses

Direct Use

This model is designed to assist Site Reliability Engineers (SREs), DevOps professionals, and Platform Engineers. It acts as an expert assistant for:

Kubernetes Troubleshooting: Analyzing error logs, kubectl describe outputs, and CrashLoopBackOff scenarios.
YAML Generation: Writing production-ready manifests for Deployments, Services, Ingresses, StatefulSets, and NetworkPolicies.
Infrastructure as Code: Converting natural language requirements into valid Kubernetes configurations.
Architecture Q&A: Answering questions about cluster architecture, networking (CNI), and storage (CSI).
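
A troubleshooting request of the kind listed above can be packaged as chat messages before it is passed to the tokenizer's chat template. The sketch below is a minimal illustration: `build_messages` and the "Diagnostics:" label are hypothetical and not part of the model or any library. The system prompt is the one used in the getting-started example of this card.

```python
# System prompt taken from the getting-started example in this card.
SYSTEM_PROMPT = (
    "You are a Kubernetes expert. Diagnose issues step-by-step, "
    "then provide the fixed YAML configuration."
)


def build_messages(question: str, diagnostics: str = "") -> list[dict]:
    """Build a chat message list (hypothetical helper).

    `diagnostics` is optional pasted text such as error logs or
    `kubectl describe` output; it is appended below the question.
    """
    user_content = question.strip()
    if diagnostics.strip():
        user_content += "\n\nDiagnostics:\n" + diagnostics.strip()
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_content},
    ]


messages = build_messages(
    "My Pod is in Pending state. How do I fix this?",
    diagnostics="Insufficient cpu",
)
```

The resulting list has the same shape as the `messages` variable in the getting-started code, so it can be passed to `tokenizer.apply_chat_template` unchanged.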

Bias, Risks, and Limitations

YAML Hallucination: Like all LLMs, this model can generate syntactically correct but logically flawed YAML. Always validate generated manifests (kubectl apply --dry-run=client -f ...) before applying to production.
Version Bias: The model's knowledge is based on Kubernetes versions available up to the training cutoff. It may hallucinate deprecated APIs (e.g., extensions/v1beta1) or be unaware of very recent Alpha features.
Security Risks: The model might suggest configurations with relaxed security contexts (e.g., privileged: true) if not explicitly instructed otherwise.
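
Two of the risks above (deprecated API versions and privileged containers) can be flagged with a rough text-level check before a manifest reaches a cluster. The sketch below is only an illustration: `lint_manifest` is a hypothetical helper, the list of deprecated versions is deliberately incomplete, and the check does not parse YAML, so it is no substitute for `kubectl apply --dry-run=client` or a real static analyzer.

```python
import re

# Illustrative, incomplete list. extensions/v1beta1 is the example named above.
DEPRECATED_API_VERSIONS = {"extensions/v1beta1", "apps/v1beta1", "apps/v1beta2"}


def lint_manifest(manifest: str) -> list[str]:
    """Return warnings for a YAML manifest given as plain text."""
    warnings = []
    for number, line in enumerate(manifest.splitlines(), start=1):
        # Naive comment stripping; a '#' inside a quoted value would confuse it.
        stripped = line.split("#", 1)[0].strip()
        match = re.fullmatch(r"apiVersion:\s*['\"]?([^'\"\s]+)['\"]?", stripped)
        if match and match.group(1) in DEPRECATED_API_VERSIONS:
            warnings.append(f"line {number}: deprecated apiVersion {match.group(1)}")
        if re.fullmatch(r"privileged:\s*true", stripped):
            warnings.append(f"line {number}: privileged container")
    return warnings
```

An empty list means only that these two patterns were not found, not that the manifest is safe or valid.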

Recommendations

Users should treat this model as a "Copilot" rather than an autonomous operator. All generated YAML should be reviewed by a human and scanned by static code analyzers before deployment.

How to Get Started with the Model

Use the code below to get started with the model.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Path to your merged model (no base model needed)
model_id = "Dogacel/Qwen3-Coder-30B-A3B-Kubernetes-Instruct"

# 1. Load the Full Model
# Use device_map="auto" to handle the 30B size efficiently
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    low_cpu_mem_usage=True
)

tokenizer = AutoTokenizer.from_pretrained(model_id)

# 2. Run Inference
messages = [
    {"role": "system", "content": "You are a Kubernetes expert. Diagnose issues step-by-step, then provide the fixed YAML configuration."},
    {"role": "user", "content": "My Pod is in Pending state and describing it says 'Insufficient cpu'. How do I fix this?"}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

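outputs = model.generate(**inputs, max_new_tokens=2048)

# Decode only the newly generated tokens so the prompt is not echoed back
generated = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))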

Training Details

Training Data

This model is fine-tuned on over 20,000 Kubernetes-related Q&A pairs collected from community forums.

Training Procedure

Training ran for 2 epochs on 4 x H100 GPUs and took 14 hours.

Preprocessing

All Q&A pairs were rewritten by another LLM (GPT-4.1 mini) so that every answer follows the same format:

Identification: (one sentence - what's wrong)
Reasoning: (root cause explanation)
Remediation: (fix approach)
Fixed YAML configuration in ```yaml code block
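
Because the training answers follow this four-part layout, model responses can usually be split back into their sections. The sketch below is one possible parser, assuming the model reproduces the labels exactly as written above; `parse_response` is a hypothetical helper, and real responses may deviate from the format.

```python
import re

FENCE = "`" * 3  # built at runtime so the example contains no literal fence
SECTION_NAMES = ("Identification", "Reasoning", "Remediation")


def parse_response(text: str) -> dict:
    """Split a response in the four-part format into its sections."""
    result = {}
    # The YAML fix is whatever sits inside the first yaml-tagged fence.
    yaml_match = re.search(rf"{FENCE}yaml\s*\n(.*?){FENCE}", text, flags=re.DOTALL)
    result["yaml"] = yaml_match.group(1).rstrip() if yaml_match else None
    prose = text[: yaml_match.start()] if yaml_match else text
    # Each labelled section runs until the next label or the end of the prose.
    labels = "|".join(SECTION_NAMES)
    pattern = rf"^({labels}):\s*(.*?)(?=^(?:{labels}):|\Z)"
    for name, body in re.findall(pattern, prose, flags=re.DOTALL | re.MULTILINE):
        result[name.lower()] = body.strip()
    return result
```

Sections missing from a response are simply absent from the returned dictionary, and `yaml` is `None` when no yaml-tagged fence is found, so callers should check for both.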

Training Hyperparameters

Training regime: BF16 Mixed Precision (Bfloat16)
Method: QLoRA 4-Bit (Quantized Low-Rank Adaptation)
Rank (r): 64
Alpha: 32
Target Modules: Attention-only – ["v_proj", "q_proj", "k_proj", "o_proj"]
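
To put the rank and alpha values in context: conventional LoRA scales the low-rank update by alpha / r and adds r * (d_in + d_out) trainable parameters per adapted weight matrix. The sketch below works through that arithmetic. It assumes the conventional scaling (not a variant such as rsLoRA), and the projection shapes are assumptions made for illustration that are not stated in this card; read the real values from the base model's configuration before relying on the totals.

```python
RANK = 64   # Rank (r) from the list above
ALPHA = 32  # Alpha from the list above


def lora_scaling(rank: int, alpha: int) -> float:
    """Conventional LoRA scaling factor applied to the low-rank update."""
    return alpha / rank


def lora_params(rank: int, d_in: int, d_out: int) -> int:
    """Trainable parameters for one adapted matrix: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# ASSUMED (d_in, d_out) shapes for the attention projections, for illustration only.
ASSUMED_SHAPES = {
    "q_proj": (2048, 4096),
    "k_proj": (2048, 512),
    "v_proj": (2048, 512),
    "o_proj": (4096, 2048),
}

per_layer = sum(lora_params(RANK, d_in, d_out) for d_in, d_out in ASSUMED_SHAPES.values())
print(f"scaling factor: {lora_scaling(RANK, ALPHA)}")
print(f"adapter parameters per layer (assumed shapes): {per_layer:,}")
```

With alpha at half the rank, the adapter update is scaled by 0.5 before it is added to the frozen weights.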

Evaluation

The model was evaluated on a validation set of 100 Q&A pairs by measuring the YAML similarity of the generated fixes.
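
The card does not say which similarity measure was used. As one possible illustration only, the sketch below compares a generated manifest against a reference manifest line by line with Python's `difflib`, after dropping blank lines and comments. Both the comparison against a reference and the choice of `SequenceMatcher` are assumptions, and `yaml_similarity` is a hypothetical helper rather than the authors' metric.

```python
from difflib import SequenceMatcher


def normalize(manifest: str) -> list[str]:
    """Drop blank lines, full-line comments, and trailing whitespace."""
    lines = []
    for line in manifest.splitlines():
        line = line.rstrip()
        if line.strip() and not line.lstrip().startswith("#"):
            lines.append(line)
    return lines


def yaml_similarity(generated: str, reference: str) -> float:
    """Line-level similarity in [0, 1]; 1.0 means identical after normalization."""
    matcher = SequenceMatcher(None, normalize(generated), normalize(reference))
    return matcher.ratio()
```

A line-based score is sensitive to key order and indentation style; a structural comparison of the parsed YAML would be stricter about meaning and more forgiving about layout.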

Citation

BibTeX:

[TODO]


Safetensors

Model size: 31B params
Tensor type: F32
