FlashPack — a new, high-throughput file format and loading mechanism for PyTorch that makes model checkpoint I/O blazingly fast, even on systems without access to GPU Direct Storage (GDS).
With FlashPack, loading any model can be 3–6× faster than with the current state-of-the-art methods like accelerate or the standard load_state_dict() and to() flow — all wrapped in a lightweight, pure-Python package that works anywhere.
We're excited to share TryOffDiff v2, extending our approach to support multiple garment categories. Key updates include: - Training on the multi-garment DressCode dataset, covering upper-body, lower-body, and dresses. - A simplified adapter design for improved training efficiency and modularity. - Introduction of four specialized models: - One model per category (upper, lower, dress), - Plus a multi-garment model capable of generating multiple garments sequentially from a single image.
*PS:* Visit us this Friday at 10:30 AM in ExHall-B for our live demo @CVPR'25!
Zonos is flying up the trending tab, and for good reason - it's the most expressive and emotive open-source TTS I've used to date. I'm happy to say it's now supported in Taproot, with added long-form synthesis support and other goodies.
I'm thrilled to be releasing the first iteration of a project I've been working on for quite awhile now. It's called Taproot, and it's a seamlessly scalable open-source AI/ML inference engine designed for letting developers build real-time experiences clustered across a small-to-mid-sized cluster, without the burden of hyperscale infrastructure.
Along with the server and task framework is a client library for node and the browser. And what good is a server and client without an app to go alongside it? To that end, I'm also releasing Anachrovox, a fun, real-time hands-free voice assistant that can run on mid-level devices in <12GB VRAM, with web search, weather, and other tools. It uses my real-time browser wake-word library to detect utterances of the phrase 'Hey Vox', 'Hi Vox', 'Okay Vox', 'Anachrovox' or just 'Vox' (alongside some others.)
Releasing this many things at once will definitely result in bugs, so please report them when sighted! Thank you all!
The Anachrovox Spaces are networked together, balancing load across them to keep all front-ends responsive. You only have to choose what color you like the most!
Introducing Virtual Try-Off (VTOFF), a novel task focused on generating standardized garment images from single photos of clothed individuals. Unlike traditional Virtual Try-On (VTON), which digitally dresses models, VTOFF aims to extract a canonical garment image, posing unique challenges in capturing garment shape, texture, and intricate patterns.