Genos-m
Genos-m is a foundation model for human-associated microbial genomes. It is trained to model microbial DNA sequences at single-nucleotide resolution and supports ultra-long genomic contexts of up to one million tokens.
For instructions, details, benchmarks, and examples, please refer to the Genos-m GitHub repository and paper.
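To make "single-nucleotide resolution" concrete, here is a minimal sketch of one-token-per-base encoding and the prefix/target pairs that next-token prediction trains on. The `VOCAB` mapping, `encode`, and `next_token_pairs` are hypothetical names chosen for illustration; the actual Genos-m tokenizer and its token ids may differ.

```python
# Hypothetical single-nucleotide vocabulary; the real Genos-m tokenizer
# and its token ids may differ.
VOCAB = {"<pad>": 0, "A": 1, "C": 2, "G": 3, "T": 4, "N": 5}


def encode(sequence: str) -> list[int]:
    """Map each base to one token id; unknown symbols fall back to N."""
    return [VOCAB.get(base, VOCAB["N"]) for base in sequence.upper()]


def next_token_pairs(token_ids: list[int]) -> list[tuple[list[int], int]]:
    """Pair every non-empty prefix with the token that follows it."""
    return [(token_ids[:i], token_ids[i]) for i in range(1, len(token_ids))]


ids = encode("ACGTN")
print(ids)  # [1, 2, 3, 4, 5]
for context, target in next_token_pairs(ids):
    print(context, "->", target)
```

Under this one-token-per-base assumption, a sequence of n bases occupies n tokens, so a one-million-token context would span roughly one million bases.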
Model Specification
| Specification | Genos-m-4.7B |
|---|---|
| Total parameters | 4.7B |
| Activated parameters | 0.33B |
| Architecture type | MoE |
| Number of experts | 32 |
| Selected experts per token | 2 |
| Number of layers | 12 |
| Attention hidden size | 1024 |
| Number of attention heads | 16 |
| Query groups | 8 |
| MoE hidden size per expert | 4096 |
| Vocabulary size | 128 (padded) |
| Context length | up to 1M tokens |
| Training objective | next-token prediction |
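A few quantities follow directly from the table, and a short sketch can show what "2 selected experts out of 32" means in practice. The head-dimension formula assumes the common convention `head_dim = hidden_size / num_heads`, and `route_top_k` is a generic, hypothetical top-k gating function; the routing used inside Genos-m may differ in its details.

```python
import math

# Values copied from the specification table.
TOTAL_PARAMS = 4.7e9
ACTIVE_PARAMS = 0.33e9
NUM_EXPERTS = 32
TOP_K = 2
HIDDEN_SIZE = 1024
NUM_HEADS = 16
QUERY_GROUPS = 8

# Assumes the common convention head_dim = hidden_size / num_heads.
head_dim = HIDDEN_SIZE // NUM_HEADS              # 64
heads_per_group = NUM_HEADS // QUERY_GROUPS      # 2 query heads per group
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS   # about 0.07
expert_fraction = TOP_K / NUM_EXPERTS            # 0.0625


def route_top_k(logits, k=TOP_K):
    """Generic top-k gating: softmax over experts, keep k, renormalize."""
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    probs = [w / total for w in weights]
    chosen = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    kept = sum(probs[i] for i in chosen)
    return {i: probs[i] / kept for i in chosen}


# Example: expert 30 has the highest router logit, expert 31 the second highest.
gates = route_top_k([0.0] * 30 + [2.0, 1.0])
print(sorted(gates))  # [30, 31]
```

Only the selected experts run for a given token, which is why the activated parameter count is a small fraction of the total.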
Training Data
Genos-m was pretrained on curated microbial genome resources, including GTDB R220 representative prokaryotic genomes, public human-associated microbial genomes, in-house high-quality human gut MAGs, and UHGV human gut phage genomes. The final pretraining corpus contains approximately 1.2T tokens and covers 186 phyla, 3,448 families, and 69,056 species. Within this corpus, the retained human-associated prokaryotic subset covers 45 phyla, 585 families, and 12,273 species across major human microbial habitats, including the gut, oral cavity, skin, respiratory tract, and female reproductive tract.
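The taxonomic counts above can be turned into rough shares of the corpus. This is a back-of-the-envelope sketch: the corpus totals may include taxa outside the prokaryotic subset (for example phage), so the ratios are indicative only. The dictionary and function names are illustrative.

```python
# Counts quoted in the Training Data paragraph.
CORPUS = {"phyla": 186, "families": 3448, "species": 69056}
HUMAN_ASSOCIATED = {"phyla": 45, "families": 585, "species": 12273}


def coverage(subset: dict, whole: dict) -> dict:
    """Share of each taxonomic rank in `whole` that is covered by `subset`."""
    return {rank: subset[rank] / whole[rank] for rank in whole}


shares = coverage(HUMAN_ASSOCIATED, CORPUS)
for rank, share in shares.items():
    print(f"{rank}: {share:.1%}")
# phyla: 24.2%
# families: 17.0%
# species: 17.8%
```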
Checkpoints
- HF-Transformers checkpoint: BGI-HangzhouAI/Genos-m-4.7B
- Megatron-LM checkpoint: BGI-HangzhouAI/Genos-m-Megatron-4.7B
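A minimal sketch of loading the HF-Transformers checkpoint, assuming the checkpoint name above is its Hugging Face repository id and that it works with the standard `AutoTokenizer` and `AutoModelForCausalLM` classes. The `load_genos_m` helper is hypothetical, and whether `trust_remote_code=True` is required is not stated here, so it is exposed as a parameter.

```python
HF_REPO_ID = "BGI-HangzhouAI/Genos-m-4.7B"  # taken from the checkpoint list


def load_genos_m(repo_id: str = HF_REPO_ID, trust_remote_code: bool = False):
    """Load the tokenizer and causal LM; needs `transformers` and `torch` installed."""
    # Imported lazily so this file can be imported without the dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        repo_id, trust_remote_code=trust_remote_code
    )
    model = AutoModelForCausalLM.from_pretrained(
        repo_id, trust_remote_code=trust_remote_code
    )
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling `load_genos_m()` downloads the weights on first use. The Megatron-LM checkpoint is a separate format and is not covered by this sketch.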
License
The Genos-m model and code are released under the Apache License 2.0.
Contact
For questions and suggestions, please open an issue.