All HF Hub posts

AxionLab-official
posted an update 2 days ago
ajibawa-2023
posted an update 2 days ago
Shell-Code-Large
Dataset: ajibawa-2023/Shell-Code-Large

Shell-Code-Large is a large-scale corpus of Shell scripting source code comprising approximately 640,000 code samples stored in JSON Lines (.jsonl) format. The dataset is designed to support research in large language model (LLM) pretraining, code intelligence, DevOps automation, cloud infrastructure engineering, system administration, and software engineering automation.

By providing a high-volume, language-specific corpus focused exclusively on Shell scripting, Shell-Code-Large enables systematic experimentation in automation workflows, deployment pipelines, infrastructure management, and command-line tooling. These domains remain foundational to Linux systems, cloud-native platforms, CI/CD environments, and modern DevOps practices.

Shell-Code-Large addresses the need for a dedicated Shell-focused dataset at substantial scale, enabling targeted research into scripting patterns, command composition, workflow orchestration, infrastructure automation, and operational engineering practices.
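Since the samples are stored as JSON Lines, one record per line, a minimal sketch of streaming such a file in Python might look like the following. The `code` field name and the sample records are assumptions for illustration only; the post does not describe the dataset's actual schema.

```python
import json
from typing import Iterable, Iterator


def iter_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of JSON Lines text."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines between records
            yield json.loads(line)


# Hypothetical records; the real dataset's field names may differ.
sample = [
    '{"code": "#!/bin/sh\\necho hello"}',
    '',
    '{"code": "ls -la | wc -l"}',
]

records = list(iter_jsonl(sample))

# Count how many samples start with a shebang line.
shebang_count = sum(r["code"].startswith("#!") for r in records)
```

Because `iter_jsonl` accepts any iterable of strings, an open file handle can be passed in directly, so a corpus of this size never has to be loaded into memory at once.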
sequelbox
posted an update 1 day ago
NEW RELEASE: Esper 4 is here for Qwen 3.6 27b, along with our new datasets!

- NEW DATASET: Titanium 4 maximizes DevOps and architecture helpfulness, powered by high-difficulty agentic-focused DevOps and architecture data generated with DeepSeek-V4-Pro!
- NEW DATASET: Mitakihara 2 brings AI coding and expertise data for AI development, research, deployment, interpretability, operation and experimentation!
- Improved coding performance: challenging agentic coding queries from Tachibana 4 allow Esper 4 to tackle harder coding tasks across a variety of languages!

GET ESPER 4: ValiantLabs/Qwen3.6-27B-Esper4

Get the datasets for your own training:
sequelbox/Titanium4-DeepSeek-V4-Pro
sequelbox/Mitakihara2-DeepSeek-V4-Pro
sequelbox/Tachibana4-DeepSeek-V4-Pro

We've been working hard on Esper 4 - it's so exciting to finally bring it to everyone! We hope it helps you build.

We'll be expanding Esper 4 to more models as funding allows - donate for more, faster, better models and datasets: sequelbox/SupportOpenSource

The revolution is coming - we're here to fight for AI you can use and build on your own computer, not a giant corporation charging you for access at their discretion. We've seen what OpenAI, Anthropic, and the ultra-rich taking charge of the AI future looks like, and it's already very clear you won't like living in it. Choose a different future while you still can.

Open source must win.

More to come soon!

love, always,
allegra
projectlosangeles
posted an update about 14 hours ago
🔥 Check out HeartMuLa!!! 🔥

The best open-sourced music generation model in terms of lyrics controllability and music quality!!!

🤗 https://huggingface.co/HeartMuLa/HeartMuLa-oss-3B-happy-new-year 🤗

❤️ Listen to amazing HeartMuLa output samples here:
https://soundcloud.com/aleksandr-sigalov-61/sets/heartmula ❤️

@victor
Anran-MLLM
posted an update 4 days ago
🚀 Introducing PerceptionDLM, the first multimodal diffusion LLM for parallel region perception!

Most MLLMs are autoregressive, so captioning N regions costs N sequential passes. PerceptionDLM instead describes ALL masked regions in a single denoising process. 🧩

✨ Highlights
• ⚡ Up to 3.4× faster on dense multi-region captioning, with stable per-image latency
• 🏆 PerceptionDLM-Base beats LLaDA-V on 15/16 multimodal benchmarks (new SOTA among open diffusion VLMs)
• 📊 New benchmark: ParaDLC-Bench, which jointly evaluates caption quality AND inference efficiency
• 🔓 Code, models & benchmark all open-sourced

🤖 Models
MSALab/PerceptionDLM-Base
MSALab/PerceptionDLM

📊 Benchmark
MSALab/ParaDLC-Bench

📄 Paper: PerceptionDLM: Parallel Region Perception with Multimodal Diffusion Language Models (2606.19534)
💻 Code: https://github.com/MSALab-PKU/PerceptionDLM

Diffusion LLMs aren't just for text: they unlock efficient, parallel visual perception. 👁️✨

#multimodal #diffusion #VLM #perception
AmelieSchreiber
posted an update 3 days ago
Latest OpenAI Parameter Golf Competition training-run BPB (<1K steps on a single 4090). See the ToricBLM, ToricGT, and TropicalGT methods.
kanaria007
posted an update about 21 hours ago
✅ Article highlight: Reflexes with Receipts: Fast Paths, Safe Paths, and Ethical Interrupts (art-60-181, v0.1)

TL;DR:
This article argues that reflexes are not hidden shortcuts.

In embodied systems, some actions must happen faster than full deliberation. But “fast” cannot mean opaque. 181 defines reflexive action as a governed fast path: trigger-bounded, ethically interrupted, latency-aware, safe-mode capable, and closed by post-hoc receipts.

Read:
kanaria007/agi-structural-intelligence-protocols

Why it matters:
• makes emergency action reviewable without making it too slow
• separates reflex zones from ordinary reasoning paths
• keeps low-latency action inside bounded ethics and rollback discipline
• gives safe-stop / safe-mode a first-class runtime role
• turns “the agent reacted” into an auditable event

What’s inside:
• reflex trigger records and reflex zone registries
• REFLEXIA-style fast jump emission under constrained checks
• KINETICA-bound execution receipts for actuator-safe action
• HOMEODYNA signals for pressure, suppression, and urgency
• ethical interrupt results for blocking, modifying, or safe-stopping reflexes
• latency-envelope receipts for proving the fast path stayed within bounds
• post-hoc review and reentry receipts after the reflex event

Key idea:
Do not say:

“the system reacted automatically.”

Say:

“this reflex was triggered by this parsed condition, within this reflex zone, under this latency envelope, with this ethical interrupt result, safe-mode fallback, execution receipt, and post-hoc review.”

Fast paths can be safe paths only when they leave receipts.
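To make the "receipt" idea concrete, here is a rough sketch of what such a record could look like as a Python dataclass. Every name here (`ReflexReceipt`, its fields, the zone and condition strings) is hypothetical and chosen only to mirror the items listed in the "Say:" sentence above; it is not the article's actual schema.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ReflexReceipt:
    """Hypothetical record of one reflex event (illustrative fields only)."""
    trigger_condition: str      # the parsed condition that fired the reflex
    reflex_zone: str            # zone the reflex is allowed to act in
    latency_budget_ms: float    # upper bound of the latency envelope
    observed_latency_ms: float  # what the fast path actually took
    ethical_interrupt: str      # e.g. "allow", "modify", "block", "safe_stop"
    safe_mode_fallback: str     # what happens if the reflex is stopped
    execution_receipt_id: str   # pointer to the actuator-level record
    post_hoc_review: Optional[str] = None  # filled in after the event

    def within_envelope(self) -> bool:
        """True if the fast path stayed inside its latency bound."""
        return self.observed_latency_ms <= self.latency_budget_ms

    def is_closed(self) -> bool:
        """True once a post-hoc review has been attached."""
        return self.post_hoc_review is not None


receipt = ReflexReceipt(
    trigger_condition="obstacle_within_0.5m",
    reflex_zone="collision_avoidance",
    latency_budget_ms=20.0,
    observed_latency_ms=12.5,
    ethical_interrupt="allow",
    safe_mode_fallback="full_stop",
    execution_receipt_id="exec-0001",
)

# The receipt is immutable; attaching the review produces a new record.
reviewed = replace(receipt, post_hoc_review="reviewed: no issues found")
```

Freezing the dataclass is one way to reflect the append-only spirit of the post: the original receipt is never edited, and the review is recorded as a new value.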
kanaria007
posted an update 3 days ago
✅ Article highlight: *Embodied SI-Core: Observation, Homeostasis, Reflexes, and Safe Actuation* (art-60-178, v0.1)

TL;DR:
This article argues that SI-Core does not stop at text, tools, or simulated policy.

Once a system can sense, self-regulate, react, and actuate, governance must reach the sensing and motion boundary. Embodied SI-Core keeps observation, ethics, rollback, memory, and evaluation alive across perception, internal state, reflex paths, and actuator-safe execution.

Read:
kanaria007/agi-structural-intelligence-protocols

Why it matters:
• treats perception and actuation as governed runtime surfaces
• keeps fast reflex paths inside bounded ethics and rollback discipline
• makes internal state part of routing, not just telemetry
• blocks under-observed motion from becoming a world effect
• connects robots, avatars, vehicles, prosthetics, edge devices, and simulated actors under one frame

What’s inside:
• embodied observation bundles with coverage and confidence
• HOMEODYNA-style internal-state tension and jump suppression
• REFLEXIA-style bounded low-latency reflex routing
• KINETICA-style intent-to-actuation planning
• execution monitoring, safe-stop, rollback, and reentry logs
• an embodied runtime arc from raw sensory inputs to append-only memory

Key idea:
Do not say:

*“the agent saw something and acted.”*

Say:

*“this embodied system parsed the observation, checked internal-state tension, selected a governed route, bound action through ethics and reversibility, monitored execution, and reentered memory with receipts.”*

Sense structurally.
Regulate internally.
React only within bounds.
Actuate with receipts.
Hari5115
posted an update 3 days ago
A bit addictive. Fair warning!!!
Chain combos, fever mode, daily leaderboard. Free, runs in your browser.
Beat the score if you can 🫧

🎮 Hari5115/neon-pop

#SendHelp #JustOneMoreGame #NeonPop #NotAddicted

Reubencf
posted an update 4 days ago
Shadows of Tomorrow is finally live on Hugging Face Spaces with Gradio.

It’s a browser-playable RPG built with Godot, set in a post-nuclear future where players explore Magnus Province, collect medicinal plants, craft medicine, and help cure NPCs.

Play it here: Reubencf/Shadows_of_Tomorrow