lfm2_350m_commit_diff_summarizer (LoRA)

A lightweight helper model that turns Git diffs into Conventional Commit–style messages. It outputs strict JSON with a short title (≤ 65 chars) and up to 3 bullets, so CLI tools and agents can parse it deterministically.

Model Details

Model Description

  • Purpose: Summarize git diff patches into concise, Conventional Commit–compliant titles with optional bullets.

  • I/O format:

    • Input: prompt containing the diff (plain text).
    • Output: JSON object: {"title": "...", "bullets": ["...", "..."]}.
  • Model type: LoRA adapter for causal LM (text generation)

  • Language(s): English (commit message conventions)

  • Finetuned from: unsloth/LFM2-350M-unsloth-bnb-4bit (a 4-bit quantization of LiquidAI/LFM2-350M; trained with QLoRA)

Model Sources

  • Repository: This model card + adapter on the Hub under ethanke/lfm2_350m_commit_diff_summarizer

Uses

Direct Use

  • Convert patch diffs into Conventional Commit messages for PR titles, commits, and changelogs.
  • Provide human-readable summaries in agent UIs with a predictable JSON structure.

Recommendations

  • Enforce JSON validation; if the output is invalid, retry with a JSON-repair prompt.
  • Keep a regex gate for Conventional Commit titles in your pipeline (a sketch covering both checks follows this list).
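
A minimal validation gate might look like the following sketch. The helper name and error messages are illustrative; the title regex is the same one used to filter the training data (see Training Data below).

import json
import re

# Conventional Commit title pattern (same regex used to filter the training data)
CC_TITLE = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(\([^)]+\))?(!)?:\s.+$"
)

def validate_commit_json(raw: str) -> dict:
    """Parse model output and enforce the schema; raise ValueError on failure."""
    obj = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    title = obj.get("title", "")
    bullets = obj.get("bullets", [])
    if not isinstance(title, str) or not (0 < len(title) <= 65):
        raise ValueError("title missing, non-string, or longer than 65 chars")
    if not CC_TITLE.match(title):
        raise ValueError("title is not Conventional Commit compliant")
    if not isinstance(bullets, list) or len(bullets) > 3:
        raise ValueError("bullets must be a list with at most 3 items")
    return obj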

How to Get Started

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch, json

BASE = "unsloth/LFM2-350M-unsloth-bnb-4bit"
ADAPTER = "ethanke/lfm2_350m_commit_diff_summarizer"  # or your own fine-tuned adapter

bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                         bnb_4bit_use_double_quant=True, bnb_4bit_compute_dtype=torch.float16)

tok = AutoTokenizer.from_pretrained(BASE, use_fast=True)
mdl = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map="auto")
mdl = PeftModel.from_pretrained(mdl, ADAPTER)
mdl.eval()  # disable dropout; inference only

diff = "...your git diff text..."
prompt = (
  "You are a commit message summarizer.\n"
  "Return a concise JSON object with fields 'title' (<=65 chars) and 'bullets' (0-3 items).\n"
  "Follow the Conventional Commit style for the title.\n\n"
  "### DIFF\n" + diff + "\n\n### OUTPUT JSON\n"
)

inputs = tok(prompt, return_tensors="pt").to(mdl.device)
with torch.no_grad():
    out = mdl.generate(**inputs, max_new_tokens=200, do_sample=False)

# decode only the newly generated tokens, not the echoed prompt
gen = out[0][inputs["input_ids"].shape[-1]:]
text = tok.decode(gen, skip_special_tokens=True)

# naive JSON extraction: first "{" to last "}" of the generated text
js = text[text.find("{"): text.rfind("}") + 1]
obj = json.loads(js)
print(obj)
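
If the bare json.loads(js) call above fails, a single repair retry often recovers the output. This continues from the snippet above; the repair-prompt wording is an assumption, not part of the training format:

try:
    obj = json.loads(js)
except json.JSONDecodeError:
    # ask the model to fix its own output, then re-extract
    repair = (
        prompt + text +
        "\n\nThe JSON above is invalid. Return only the corrected JSON object:\n"
    )
    inputs = tok(repair, return_tensors="pt").to(mdl.device)
    with torch.no_grad():
        out = mdl.generate(**inputs, max_new_tokens=200, do_sample=False)
    text = tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
    obj = json.loads(text[text.find("{"): text.rfind("}") + 1])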

Training Details

Training Data

  • Dataset: Maxscha/commitbench (diff → commit message).
  • Filtering: kept only samples whose first non-empty line of the message matches Conventional Commits (a filter sketch follows this list): ^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\([^)]+\))?(!)?:\s.+$
  • Note: The dataset card indicates non-commercial licensing. Confirm before commercial deployment.
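
The filtering step might look like this sketch; the "message" field name is an assumption, so check the CommitBench dataset card for the exact column names:

import re
from datasets import load_dataset

CC_TITLE = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(\([^)]+\))?(!)?:\s.+$"
)

ds = load_dataset("Maxscha/commitbench", split="train")

def first_line_is_cc(example):
    # keep only samples whose first non-empty message line is CC-compliant
    first = next((l for l in example["message"].splitlines() if l.strip()), "")
    return bool(CC_TITLE.match(first))

ds = ds.filter(first_line_is_cc)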

Training Procedure

  • Method: Supervised fine-tuning (SFT) with TRL SFTTrainer + QLoRA (PEFT).

  • Prompting: Instruction + ### DIFF + ### OUTPUT JSON target (title/bullets).

  • Precision: fp16 compute on 4-bit base.

  • Hyperparameters (v0.1; sketched as code after this list):

    • max_length=2048, per_device_train_batch_size=2, grad_accum=4
    • lr=2e-4, scheduler=cosine, warmup_ratio=0.03
    • epochs=1 over capped subset
    • LoRA: r=16, alpha=32, dropout=0.05, targets: q/k/v/o + MLP proj
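
Those hyperparameters map onto TRL/PEFT configs roughly as follows. The target module names and output_dir are assumptions; inspect the base model's named_modules() to confirm LFM2's actual projection names before training:

from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    # assumed names for the q/k/v/o and MLP projections; verify against
    # the real LFM2 module names via model.named_modules()
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

args = SFTConfig(
    output_dir="lfm2-commit-summarizer",  # assumption
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=1,
    max_length=2048,  # called max_seq_length in older TRL releases
    fp16=True,
)

# trainer = SFTTrainer(model=model, args=args, train_dataset=train_ds, peft_config=lora)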

Evaluation

  • Validation: filtered split from CommitBench.

  • Metrics (example run):

    • eval_loss ≈ 1.18 → perplexity = exp(eval_loss) ≈ 3.26
    • eval_mean_token_accuracy ≈ 0.77
    • Suggested task metrics: JSON validity rate, CC-title compliance, title length ≤ 65 chars, bullets ≤ 3 (a measurement sketch follows).
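
A sketch for computing those task metrics over a list of raw model outputs, reusing the CC_TITLE regex from the validation sketch above (function and key names are illustrative):

import json

def task_metrics(outputs):
    """Fraction of outputs passing each suggested check."""
    n = max(len(outputs), 1)
    valid = cc = length_ok = bullets_ok = 0
    for raw in outputs:
        try:
            obj = json.loads(raw[raw.find("{"): raw.rfind("}") + 1])
        except ValueError:  # covers json.JSONDecodeError too
            continue
        valid += 1
        title = str(obj.get("title", ""))
        bl = obj.get("bullets", [])
        cc += bool(CC_TITLE.match(title))
        length_ok += len(title) <= 65
        bullets_ok += isinstance(bl, list) and len(bl) <= 3
    return {"json_validity": valid / n, "cc_title_compliance": cc / n,
            "title_len_le_65": length_ok / n, "bullets_le_3": bullets_ok / n}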

Environmental Impact

  • Hardware: 1× NVIDIA RTX 3060 12 GB (local)
  • Hours used: ~2 h (prototype)

Technical Specifications

  • Architecture: LFM2-350M (decoder-only) + LoRA adapter
  • Libraries: transformers, trl, peft, bitsandbytes, datasets, unsloth

Contact

  • Open an issue on the Hub repo or message ethanke on Hugging Face.

Framework versions

  • PEFT 0.17.1
  • TRL (SFTTrainer)
  • Transformers (recent version)