Seounghyup Claude committed on
Commit 8a1a81a · 0 Parent(s)

ChatBIA API Server for Hugging Face Spaces


- FastAPI server with dual AI models (General + BSL)
- Auto-download models from Hugging Face Hub on startup
- Qwen2.5-3B-Instruct-Q4_K_M (General mode)
- ChatBIA-3B-v0.1-Q4_K_M (BSL accounting mode)
- Docker deployment configuration
- CORS support for Android app
- No model files in repo (downloaded from HF Hub)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Files changed (9)
  1. .gitattributes +3 -0
  2. .gitignore +25 -0
  3. DEPLOY.md +116 -0
  4. Dockerfile +26 -0
  5. README.md +129 -0
  6. README_HF.md +53 -0
  7. app.py +200 -0
  8. main.py +193 -0
  9. requirements.txt +6 -0
.gitattributes ADDED
@@ -0,0 +1,3 @@
*.gguf filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text

.gitignore ADDED
@@ -0,0 +1,25 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
venv/
env/
ENV/

# Models (large files)
models/*.gguf

# IDE
.vscode/
.idea/
*.swp
*.swo

# OS
.DS_Store
Thumbs.db

# Logs
*.log

DEPLOY.md ADDED
@@ -0,0 +1,116 @@
# Hugging Face Spaces Deployment Guide

## Step 1: Create a Hugging Face Account

1. Go to https://huggingface.co/
2. Sign up (a GitHub account works)
3. A free account is enough!

## Step 2: Create a Space

1. Go to https://huggingface.co/new-space
2. Settings:
   - **Space name**: `chatbia-api` (or any name you like)
   - **License**: MIT
   - **Space SDK**: Docker
   - **Space hardware**: CPU basic (free)
   - **Visibility**: Public

## Step 3: Set Up Git

### Initialize Git locally

```bash
cd ChatBIA-Server

# Initialize Git
git init
git lfs install
git lfs track "*.gguf"

# Add the Hugging Face remote
git remote add origin https://huggingface.co/spaces/YOUR-USERNAME/chatbia-api

# Add the files
git add .
git commit -m "Initial commit"
```

## Step 4: Add the Model Files

**Important**: the model files must be uploaded with Git LFS!

```bash
# Copy the GGUF files into the project root
cp ../ChatBIA-Windows/models/*.gguf .

# Add them via Git LFS
git add Qwen2.5-3B-Instruct-Q4_K_M.gguf
git add ChatBIA-3B-v0.1-Q4_K_M.gguf

git commit -m "Add model files"
```

## Step 5: Push

```bash
# Set up a Hugging Face token (first time only)
# Create one at https://huggingface.co/settings/tokens

git push origin main
```

**Note**: the model files are large, so the upload takes a while (~10-20 minutes).
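
If `git push` asks for credentials, two common options (a sketch; `hf_xxxx` stands in for the token created above):

```bash
# Option 1: log in once with the CLI bundled with huggingface_hub
huggingface-cli login

# Option 2: embed the token in the remote URL (hf_xxxx is a placeholder)
git remote set-url origin https://YOUR-USERNAME:[email protected]/spaces/YOUR-USERNAME/chatbia-api
```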

## Step 6: Verify the Deployment

1. Check for "Building" status on the Space page
2. After the build finishes, check for "Running"
3. Test the API:

```bash
curl https://YOUR-USERNAME-chatbia-api.hf.space/
```
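
A healthy Space answers with the JSON produced by app.py's root endpoint, along these lines (the model flags reflect whether the startup downloads succeeded):

```json
{
  "status": "online",
  "service": "ChatBIA API",
  "version": "1.0.0",
  "platform": "Hugging Face Spaces",
  "models": {"general": true, "bsl": true}
}
```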

## Step 7: Confirm the API URL

After deployment, the Space URL is:
```
https://YOUR-USERNAME-chatbia-api.hf.space
```

Use this URL in the Android app!

## Troubleshooting

### Build failure
- Check the errors in the Logs tab
- Usually a requirements.txt problem

### Model load failure
- Check that Git LFS is configured correctly
- Check that the model files uploaded completely

### Slow responses
- CPU basic is free but slow
- You can upgrade to a paid GPU (~$0.60/hour)

## Alternative: Test Locally First

```bash
# Local test with Docker
docker build -t chatbia-api .
docker run -p 7860:7860 chatbia-api

# Test
curl http://localhost:7860/
```

## Pricing

- **CPU basic**: free ✅
- **CPU upgrade**: $0.03/hour
- **GPU T4**: $0.60/hour
- **GPU A10G**: $3.15/hour

Start with the free CPU!
Dockerfile ADDED
@@ -0,0 +1,26 @@
FROM python:3.10-slim

WORKDIR /app

# Install system packages
RUN apt-get update && apt-get install -y \
    build-essential \
    cmake \
    git \
    && rm -rf /var/lib/apt/lists/*

# Install Python packages
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application
COPY app.py .

# Model files are not baked into the image;
# app.py downloads them from the Hugging Face Hub at startup

# Expose the port
EXPOSE 7860

# Run FastAPI
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
README.md ADDED
@@ -0,0 +1,129 @@
# ChatBIA Server

24/7 accounting AI FastAPI server

## Local Testing

```bash
# Create a virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install the packages
pip install -r requirements.txt

# Download the models (put the GGUF files in the models folder,
# e.g. with the sketch after this block)
mkdir models
# Qwen2.5-3B-Instruct-Q4_K_M.gguf
# ChatBIA-3B-v0.1-Q4_K_M.gguf

# Run the server
python main.py
```
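
One way to fetch the GGUF files is `huggingface-cli`, installed together with the `huggingface_hub` package (a sketch; the Hub stores the Qwen file under a lowercase name, so it is renamed to the path main.py expects):

```bash
huggingface-cli download Qwen/Qwen2.5-3B-Instruct-GGUF \
  qwen2.5-3b-instruct-q4_k_m.gguf --local-dir models
mv models/qwen2.5-3b-instruct-q4_k_m.gguf models/Qwen2.5-3B-Instruct-Q4_K_M.gguf

huggingface-cli download Seounghyup/ChatBIA-3B-v0.1 \
  ChatBIA-3B-v0.1-Q4_K_M.gguf --local-dir models
```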

Server address: http://localhost:8000

## API Usage

### 1. Health check
```bash
curl http://localhost:8000/
```

### 2. Chat
```bash
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Hello",
    "mode": "bsl",
    "max_tokens": 1024,
    "temperature": 0.7
  }'
```
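
The same request from Python, for reference (a minimal sketch; the `requests` package is an extra dependency, not listed in requirements.txt):

```python
import requests

resp = requests.post(
    "http://localhost:8000/chat",
    json={
        "message": "Hello",
        "mode": "bsl",          # "general" or "bsl"
        "max_tokens": 1024,
        "temperature": 0.7,
    },
    timeout=300,  # CPU inference can take a while
)
resp.raise_for_status()
data = resp.json()  # ChatResponse: {"response": ..., "mode": ..., "tokens": ...}
print(data["response"])
```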

### 3. Check model status
```bash
curl http://localhost:8000/models
```

## Oracle Cloud Deployment

### 1. Create a VM
- Always Free: VM.Standard.E2.1.Micro (1 OCPU, 1GB RAM)
- Or: VM.Standard.A1.Flex (4 OCPU, 24GB RAM) - ARM

### 2. Configure the firewall
```bash
# Ubuntu firewall
sudo ufw allow 8000/tcp
sudo ufw enable

# Oracle Cloud security list
# Ingress Rule: 0.0.0.0/0, TCP, Port 8000
```

### 3. Install the server
```bash
# Copy the project over
scp -r ChatBIA-Server ubuntu@<IP>:~/

# SSH in
ssh ubuntu@<IP>

# Install Python
sudo apt update
sudo apt install python3.10 python3-pip -y

# Install the packages
cd ChatBIA-Server
pip3 install -r requirements.txt

# Upload the models (with scp)
# From your local machine:
scp models/*.gguf ubuntu@<IP>:~/ChatBIA-Server/models/
```

### 4. Create a systemd service
```bash
sudo nano /etc/systemd/system/chatbia.service
```

```ini
[Unit]
Description=ChatBIA FastAPI Server
After=network.target

[Service]
Type=simple
User=ubuntu
WorkingDirectory=/home/ubuntu/ChatBIA-Server
ExecStart=/usr/bin/python3 main.py
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

```bash
# Start the service
sudo systemctl daemon-reload
sudo systemctl enable chatbia
sudo systemctl start chatbia

# Check the status
sudo systemctl status chatbia
```
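
To follow the server logs while it runs under systemd:

```bash
sudo journalctl -u chatbia -f
```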

## Resource Usage

- CPU: ~80% (during inference)
- RAM: ~1.5GB (with both models loaded)
- Disk: ~4GB (model files)

## Security

- Adding API key authentication is recommended (see the sketch below)
- Set up HTTPS (Let's Encrypt)
- Consider rate limiting
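
A minimal API-key guard could look like the following (a sketch only: the `CHATBIA_API_KEY` environment variable and the `X-Api-Key` header are assumptions, not part of the current code):

```python
import os
from fastapi import Depends, Header, HTTPException

API_KEY = os.environ.get("CHATBIA_API_KEY")  # assumed env var; unset disables the check

async def require_api_key(x_api_key: str = Header(default=None)):
    # FastAPI maps the x_api_key parameter to the X-Api-Key request header
    if API_KEY and x_api_key != API_KEY:
        raise HTTPException(status_code=401, detail="Invalid API key")

# Usage on the endpoint:
#   @app.post("/chat", dependencies=[Depends(require_api_key)])
```
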
README_HF.md ADDED
@@ -0,0 +1,53 @@
---
title: ChatBIA API
emoji: 💼
colorFrom: purple
colorTo: pink
sdk: docker
pinned: false
license: mit
---

# ChatBIA API Server

24/7 accounting AI server

## Features

- 🤖 Dual AI models (General + BSL)
- 💼 Specialized in accounting calculations
- 🔄 RESTful API
- 🌐 CORS support

## API Usage

### 1. Health check
```bash
curl https://YOUR-SPACE-NAME.hf.space/
```

### 2. Chat
```bash
curl -X POST https://YOUR-SPACE-NAME.hf.space/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Calculate 5-year depreciation on equipment worth 50 million won",
    "mode": "bsl",
    "max_tokens": 1024,
    "temperature": 0.7
  }'
```
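
The response follows the ChatResponse model in app.py (values here are illustrative; `tokens` is a rough whitespace-based count, not a true token count):

```json
{
  "response": "...",
  "mode": "bsl",
  "tokens": 42
}
```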

### 3. Model status
```bash
curl https://YOUR-SPACE-NAME.hf.space/models
```

## Models

- **General**: Qwen2.5-3B-Instruct-Q4_K_M
- **BSL**: ChatBIA-3B-v0.1-Q4_K_M

## License

MIT
app.py ADDED
@@ -0,0 +1,200 @@
"""
ChatBIA Hugging Face Spaces API
24/7 accounting AI server
"""
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from typing import Optional
import os
from llama_cpp import Llama
from huggingface_hub import hf_hub_download

app = FastAPI(
    title="ChatBIA API",
    description="Accounting AI server (Hugging Face Spaces)",
    version="1.0.0"
)

# CORS settings
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Hugging Face model repositories
GENERAL_MODEL_REPO = "Qwen/Qwen2.5-3B-Instruct-GGUF"
GENERAL_MODEL_FILE = "qwen2.5-3b-instruct-q4_k_m.gguf"
BSL_MODEL_REPO = "Seounghyup/ChatBIA-3B-v0.1"
BSL_MODEL_FILE = "ChatBIA-3B-v0.1-Q4_K_M.gguf"

# Global model handles
general_model = None
bsl_model = None
general_model_path = None
bsl_model_path = None


class ChatRequest(BaseModel):
    message: str
    mode: str = "bsl"  # "general" or "bsl"
    max_tokens: int = 1024
    temperature: float = 0.7


class ChatResponse(BaseModel):
    response: str
    mode: str
    tokens: int


@app.on_event("startup")
async def load_models():
    """Download and load the models when the server starts."""
    global general_model, bsl_model, general_model_path, bsl_model_path

    # Download the general model
    print(f"🔄 Downloading general-mode model: {GENERAL_MODEL_REPO}/{GENERAL_MODEL_FILE}")
    try:
        general_model_path = hf_hub_download(
            repo_id=GENERAL_MODEL_REPO,
            filename=GENERAL_MODEL_FILE,
            repo_type="model"
        )
        print(f"✅ General-mode model downloaded: {general_model_path}")

        # Load the model
        general_model = Llama(
            model_path=general_model_path,
            n_ctx=2048,
            n_threads=2,  # Spaces CPU limit
            n_gpu_layers=0,
            verbose=False
        )
        print("✅ General-mode model loaded")
    except Exception as e:
        print(f"❌ Failed to load general-mode model: {e}")

    # Download the BSL model
    print(f"🔄 Downloading BSL-mode model: {BSL_MODEL_REPO}/{BSL_MODEL_FILE}")
    try:
        bsl_model_path = hf_hub_download(
            repo_id=BSL_MODEL_REPO,
            filename=BSL_MODEL_FILE,
            repo_type="model"
        )
        print(f"✅ BSL-mode model downloaded: {bsl_model_path}")

        # Load the model
        bsl_model = Llama(
            model_path=bsl_model_path,
            n_ctx=2048,
            n_threads=2,
            n_gpu_layers=0,
            verbose=False
        )
        print("✅ BSL-mode model loaded")
    except Exception as e:
        print(f"❌ Failed to load BSL model: {e}")


def build_prompt(message: str, mode: str) -> str:
    """Build the ChatML prompt."""
    if mode == "bsl":
        return f"""<|im_start|>system
You are a professional accounting AI assistant. Respond naturally in Korean.

Important: Only generate BSL DSL code when the user explicitly requests calculations (e.g., "계산해줘", "코드 작성해줘", "BSL로 작성해줘"). For general questions or greetings, respond conversationally without code.<|im_end|>
<|im_start|>user
{message}<|im_end|>
<|im_start|>assistant
"""
    else:
        return f"""<|im_start|>system
You are a helpful AI assistant. Respond naturally in Korean.<|im_end|>
<|im_start|>user
{message}<|im_end|>
<|im_start|>assistant
"""


@app.get("/")
async def root():
    """Health check."""
    return {
        "status": "online",
        "service": "ChatBIA API",
        "version": "1.0.0",
        "platform": "Hugging Face Spaces",
        "models": {
            "general": general_model is not None,
            "bsl": bsl_model is not None
        }
    }


@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    """Chat endpoint."""
    # Select the model
    if request.mode == "general":
        model = general_model
        model_name = "General"
    else:
        model = bsl_model
        model_name = "BSL"

    # Error out if the model is unavailable
    if model is None:
        raise HTTPException(
            status_code=503,
            detail=f"{model_name} model is not loaded."
        )

    try:
        # Build the prompt
        prompt = build_prompt(request.message, request.mode)

        # Run inference
        response = model(
            prompt,
            max_tokens=request.max_tokens,
            temperature=request.temperature,
            top_p=0.9,
            top_k=40,
            repeat_penalty=1.1,
            stop=["<|im_end|>", "###", "\n\n\n"]
        )

        text = response["choices"][0]["text"].strip()
        # Rough count: whitespace-split words, not actual model tokens
        tokens = len(response["choices"][0]["text"].split())

        return ChatResponse(
            response=text,
            mode=request.mode,
            tokens=tokens
        )

    except Exception as e:
        raise HTTPException(
            status_code=500,
            detail=f"Error while running the AI model: {str(e)}"
        )


@app.get("/models")
async def get_models():
    """List the available models."""
    return {
        "general": {
            "loaded": general_model is not None,
            "path": general_model_path
        },
        "bsl": {
            "loaded": bsl_model is not None,
            "path": bsl_model_path
        }
    }
main.py ADDED
@@ -0,0 +1,193 @@
"""
ChatBIA FastAPI Server
24/7 accounting AI server
"""
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from typing import Optional, List
import os
from llama_cpp import Llama

app = FastAPI(
    title="ChatBIA API",
    description="Accounting AI server",
    version="1.0.0"
)

# CORS settings (allow access from the Android app and the web)
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Model paths
MODEL_DIR = "models"
GENERAL_MODEL_PATH = os.path.join(MODEL_DIR, "Qwen2.5-3B-Instruct-Q4_K_M.gguf")
BSL_MODEL_PATH = os.path.join(MODEL_DIR, "ChatBIA-3B-v0.1-Q4_K_M.gguf")

# Global model handles
general_model = None
bsl_model = None


class ChatRequest(BaseModel):
    message: str
    mode: str = "bsl"  # "general" or "bsl"
    max_tokens: int = 1024
    temperature: float = 0.7


class ChatResponse(BaseModel):
    response: str
    mode: str
    tokens: int


@app.on_event("startup")
async def load_models():
    """Load the models when the server starts."""
    global general_model, bsl_model

    os.makedirs(MODEL_DIR, exist_ok=True)

    # Load the general model
    if os.path.exists(GENERAL_MODEL_PATH):
        print(f"🔄 Loading general-mode model: {GENERAL_MODEL_PATH}")
        try:
            general_model = Llama(
                model_path=GENERAL_MODEL_PATH,
                n_ctx=2048,
                n_threads=4,
                n_gpu_layers=0,  # Oracle Cloud is CPU-only
                verbose=False
            )
            print("✅ General-mode model loaded")
        except Exception as e:
            print(f"❌ Failed to load general-mode model: {e}")

    # Load the BSL model
    if os.path.exists(BSL_MODEL_PATH):
        print(f"🔄 Loading BSL-mode model: {BSL_MODEL_PATH}")
        try:
            bsl_model = Llama(
                model_path=BSL_MODEL_PATH,
                n_ctx=2048,
                n_threads=4,
                n_gpu_layers=0,
                verbose=False
            )
            print("✅ BSL-mode model loaded")
        except Exception as e:
            print(f"❌ Failed to load BSL-mode model: {e}")


def build_prompt(message: str, mode: str) -> str:
    """Build the ChatML prompt."""
    if mode == "bsl":
        return f"""<|im_start|>system
You are a professional accounting AI assistant. Respond naturally in Korean.

Important: Only generate BSL DSL code when the user explicitly requests calculations (e.g., "계산해줘", "코드 작성해줘", "BSL로 작성해줘"). For general questions or greetings, respond conversationally without code.<|im_end|>
<|im_start|>user
{message}<|im_end|>
<|im_start|>assistant
"""
    else:
        return f"""<|im_start|>system
You are a helpful AI assistant. Respond naturally in Korean.<|im_end|>
<|im_start|>user
{message}<|im_end|>
<|im_start|>assistant
"""


@app.get("/")
async def root():
    """Health check."""
    return {
        "status": "online",
        "service": "ChatBIA API",
        "version": "1.0.0",
        "models": {
            "general": general_model is not None,
            "bsl": bsl_model is not None
        }
    }


@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    """Chat endpoint."""
    # Select the model
    if request.mode == "general":
        model = general_model
        model_name = "General"
    else:
        model = bsl_model
        model_name = "BSL"

    # Error out if the model is unavailable
    if model is None:
        raise HTTPException(
            status_code=503,
            detail=f"{model_name} model is not loaded."
        )

    try:
        # Build the prompt
        prompt = build_prompt(request.message, request.mode)

        # Run inference
        response = model(
            prompt,
            max_tokens=request.max_tokens,
            temperature=request.temperature,
            top_p=0.9,
            top_k=40,
            repeat_penalty=1.1,
            stop=["<|im_end|>", "###", "\n\n\n"]
        )

        text = response["choices"][0]["text"].strip()
        # Rough count: whitespace-split words, not actual model tokens
        tokens = len(response["choices"][0]["text"].split())

        return ChatResponse(
            response=text,
            mode=request.mode,
            tokens=tokens
        )

    except Exception as e:
        raise HTTPException(
            status_code=500,
            detail=f"Error while running the AI model: {str(e)}"
        )


@app.get("/models")
async def get_models():
    """List the available models."""
    return {
        "general": {
            "loaded": general_model is not None,
            "path": GENERAL_MODEL_PATH if os.path.exists(GENERAL_MODEL_PATH) else None
        },
        "bsl": {
            "loaded": bsl_model is not None,
            "path": BSL_MODEL_PATH if os.path.exists(BSL_MODEL_PATH) else None
        }
    }


if __name__ == "__main__":
    import uvicorn
    uvicorn.run(
        "main:app",
        host="0.0.0.0",
        port=8000,
        reload=False  # keep reload off in production
    )
requirements.txt ADDED
@@ -0,0 +1,6 @@
fastapi==0.115.0
uvicorn[standard]==0.30.6
pydantic==2.9.2
llama-cpp-python==0.2.90
python-multipart==0.0.9
huggingface_hub  # app.py imports hf_hub_download at startup