Seounghyup Claude committed on
Commit 8a1a81a · 0 Parent(s)

ChatBIA API Server for Hugging Face Spaces


- FastAPI server with dual AI models (General + BSL)
- Auto-download models from Hugging Face Hub on startup
- Qwen2.5-3B-Instruct-Q4_K_M (General mode)
- ChatBIA-3B-v0.1-Q4_K_M (BSL accounting mode)
- Docker deployment configuration
- CORS support for Android app
- No model files in repo (downloaded from HF Hub)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Files changed (9)
  1. .gitattributes +3 -0
  2. .gitignore +25 -0
  3. DEPLOY.md +116 -0
  4. Dockerfile +26 -0
  5. README.md +129 -0
  6. README_HF.md +53 -0
  7. app.py +200 -0
  8. main.py +193 -0
  9. requirements.txt +6 -0
.gitattributes ADDED
@@ -0,0 +1,3 @@
*.gguf filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text

.gitignore ADDED
@@ -0,0 +1,25 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
venv/
env/
ENV/

# Models (large files)
models/*.gguf

# IDE
.vscode/
.idea/
*.swp
*.swo

# OS
.DS_Store
Thumbs.db

# Logs
*.log

DEPLOY.md ADDED
@@ -0,0 +1,116 @@
# Hugging Face Spaces Deployment Guide

## Step 1: Create a Hugging Face Account

1. Go to https://huggingface.co/
2. Sign up (a GitHub account works)
3. A free account is enough!

## Step 2: Create a Space

1. Go to https://huggingface.co/new-space
2. Settings:
   - **Space name**: `chatbia-api` (or any name you like)
   - **License**: MIT
   - **Space SDK**: Docker
   - **Space hardware**: CPU basic (free)
   - **Visibility**: Public

## Step 3: Set Up Git

### Initialize Git locally

```bash
cd ChatBIA-Server

# Initialize Git
git init
git lfs install
git lfs track "*.gguf"

# Add the Hugging Face remote
git remote add origin https://huggingface.co/spaces/YOUR-USERNAME/chatbia-api

# Add the files
git add .
git commit -m "Initial commit"
```

## Step 4: Add the Model Files

**Important**: the model files must be uploaded with Git LFS!

```bash
# Copy the GGUF files into the project root
cp ../ChatBIA-Windows/models/*.gguf .

# Add them via Git LFS
git add Qwen2.5-3B-Instruct-Q4_K_M.gguf
git add ChatBIA-3B-v0.1-Q4_K_M.gguf

git commit -m "Add model files"
```

## Step 5: Push

```bash
# Set up a Hugging Face token (first time only)
# Create one at https://huggingface.co/settings/tokens

git push origin main
```

**Note**: the model files are large, so the upload takes a while (~10-20 minutes).
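
If `git push` asks for credentials, two common options (a sketch; `hf_xxxx` stands in for the token created above):

```bash
# Option 1: log in once with the CLI bundled with huggingface_hub
huggingface-cli login

# Option 2: embed the token in the remote URL (hf_xxxx is a placeholder)
git remote set-url origin https://YOUR-USERNAME:[email protected]/spaces/YOUR-USERNAME/chatbia-api
```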

## Step 6: Verify the Deployment

1. Check for "Building" status on the Space page
2. After the build finishes, check for "Running"
3. Test the API:

```bash
curl https://YOUR-USERNAME-chatbia-api.hf.space/
```
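
A healthy Space answers with the JSON produced by app.py's root endpoint, along these lines (the model flags reflect whether the startup downloads succeeded):

```json
{
  "status": "online",
  "service": "ChatBIA API",
  "version": "1.0.0",
  "platform": "Hugging Face Spaces",
  "models": {"general": true, "bsl": true}
}
```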

## Step 7: Confirm the API URL

After deployment, the Space URL is:
```
https://YOUR-USERNAME-chatbia-api.hf.space
```

Use this URL in the Android app!

## Troubleshooting

### Build failure
- Check the errors in the Logs tab
- Usually a requirements.txt problem

### Model load failure
- Check that Git LFS is configured correctly
- Check that the model files uploaded completely

### Slow responses
- CPU basic is free but slow
- You can upgrade to a paid GPU (~$0.60/hour)

## Alternative: Test Locally First

```bash
# Local test with Docker
docker build -t chatbia-api .
docker run -p 7860:7860 chatbia-api

# Test
curl http://localhost:7860/
```

## Pricing

- **CPU basic**: free ✅
- **CPU upgrade**: $0.03/hour
- **GPU T4**: $0.60/hour
- **GPU A10G**: $3.15/hour

Start with the free CPU!
Dockerfile ADDED
@@ -0,0 +1,26 @@
FROM python:3.10-slim

WORKDIR /app

# Install system packages
RUN apt-get update && apt-get install -y \
    build-essential \
    cmake \
    git \
    && rm -rf /var/lib/apt/lists/*

# Install Python packages
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application
COPY app.py .

# Model files are not baked into the image;
# app.py downloads them from the Hugging Face Hub at startup

# Expose the port
EXPOSE 7860

# Run FastAPI
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
README.md ADDED
@@ -0,0 +1,129 @@
# ChatBIA Server

24/7 accounting AI FastAPI server

## Local Testing

```bash
# Create a virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install the packages
pip install -r requirements.txt

# Download the models (put the GGUF files in the models folder,
# e.g. with the sketch after this block)
mkdir models
# Qwen2.5-3B-Instruct-Q4_K_M.gguf
# ChatBIA-3B-v0.1-Q4_K_M.gguf

# Run the server
python main.py
```
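
One way to fetch the GGUF files is `huggingface-cli`, installed together with the `huggingface_hub` package (a sketch; the Hub stores the Qwen file under a lowercase name, so it is renamed to the path main.py expects):

```bash
huggingface-cli download Qwen/Qwen2.5-3B-Instruct-GGUF \
  qwen2.5-3b-instruct-q4_k_m.gguf --local-dir models
mv models/qwen2.5-3b-instruct-q4_k_m.gguf models/Qwen2.5-3B-Instruct-Q4_K_M.gguf

huggingface-cli download Seounghyup/ChatBIA-3B-v0.1 \
  ChatBIA-3B-v0.1-Q4_K_M.gguf --local-dir models
```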

Server address: http://localhost:8000

## API Usage

### 1. Health check
```bash
curl http://localhost:8000/
```

### 2. Chat
```bash
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Hello",
    "mode": "bsl",
    "max_tokens": 1024,
    "temperature": 0.7
  }'
```
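
The same request from Python, for reference (a minimal sketch; the `requests` package is an extra dependency, not listed in requirements.txt):

```python
import requests

resp = requests.post(
    "http://localhost:8000/chat",
    json={
        "message": "Hello",
        "mode": "bsl",          # "general" or "bsl"
        "max_tokens": 1024,
        "temperature": 0.7,
    },
    timeout=300,  # CPU inference can take a while
)
resp.raise_for_status()
data = resp.json()  # ChatResponse: {"response": ..., "mode": ..., "tokens": ...}
print(data["response"])
```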

### 3. Check model status
```bash
curl http://localhost:8000/models
```

## Oracle Cloud Deployment

### 1. Create a VM
- Always Free: VM.Standard.E2.1.Micro (1 OCPU, 1GB RAM)
- Or: VM.Standard.A1.Flex (4 OCPU, 24GB RAM) - ARM

### 2. Configure the firewall
```bash
# Ubuntu firewall
sudo ufw allow 8000/tcp
sudo ufw enable

# Oracle Cloud security list
# Ingress Rule: 0.0.0.0/0, TCP, Port 8000
```

### 3. Install the server
```bash
# Copy the project over
scp -r ChatBIA-Server ubuntu@<IP>:~/

# SSH in
ssh ubuntu@<IP>

# Install Python
sudo apt update
sudo apt install python3.10 python3-pip -y

# Install the packages
cd ChatBIA-Server
pip3 install -r requirements.txt

# Upload the models (with scp)
# From your local machine:
scp models/*.gguf ubuntu@<IP>:~/ChatBIA-Server/models/
```

### 4. Create a systemd service
```bash
sudo nano /etc/systemd/system/chatbia.service
```

```ini
[Unit]
Description=ChatBIA FastAPI Server
After=network.target

[Service]
Type=simple
User=ubuntu
WorkingDirectory=/home/ubuntu/ChatBIA-Server
ExecStart=/usr/bin/python3 main.py
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

```bash
# Start the service
sudo systemctl daemon-reload
sudo systemctl enable chatbia
sudo systemctl start chatbia

# Check the status
sudo systemctl status chatbia
```
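
To follow the server logs while it runs under systemd:

```bash
sudo journalctl -u chatbia -f
```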

## Resource Usage

- CPU: ~80% (during inference)
- RAM: ~1.5GB (with both models loaded)
- Disk: ~4GB (model files)

## Security

- Adding API key authentication is recommended (see the sketch below)
- Set up HTTPS (Let's Encrypt)
- Consider rate limiting
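
A minimal API-key guard could look like the following (a sketch only: the `CHATBIA_API_KEY` environment variable and the `X-Api-Key` header are assumptions, not part of the current code):

```python
import os
from fastapi import Depends, Header, HTTPException

API_KEY = os.environ.get("CHATBIA_API_KEY")  # assumed env var; unset disables the check

async def require_api_key(x_api_key: str = Header(default=None)):
    # FastAPI maps the x_api_key parameter to the X-Api-Key request header
    if API_KEY and x_api_key != API_KEY:
        raise HTTPException(status_code=401, detail="Invalid API key")

# Usage on the endpoint:
#   @app.post("/chat", dependencies=[Depends(require_api_key)])
```
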
README_HF.md ADDED
@@ -0,0 +1,53 @@
---
title: ChatBIA API
emoji: 💼
colorFrom: purple
colorTo: pink
sdk: docker
pinned: false
license: mit
---

# ChatBIA API Server

24/7 accounting AI server

## Features

- 🤖 Dual AI models (General + BSL)
- 💼 Specialized in accounting calculations
- 🔄 RESTful API
- 🌐 CORS support

## API Usage

### 1. Health check
```bash
curl https://YOUR-SPACE-NAME.hf.space/
```

### 2. Chat
```bash
curl -X POST https://YOUR-SPACE-NAME.hf.space/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Calculate 5-year depreciation on equipment worth 50 million won",
    "mode": "bsl",
    "max_tokens": 1024,
    "temperature": 0.7
  }'
```
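
The response follows the ChatResponse model in app.py (values here are illustrative; `tokens` is a rough whitespace-based count, not a true token count):

```json
{
  "response": "...",
  "mode": "bsl",
  "tokens": 42
}
```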

### 3. Model status
```bash
curl https://YOUR-SPACE-NAME.hf.space/models
```

## Models

- **General**: Qwen2.5-3B-Instruct-Q4_K_M
- **BSL**: ChatBIA-3B-v0.1-Q4_K_M

## License

MIT
app.py ADDED
@@ -0,0 +1,200 @@
"""
ChatBIA Hugging Face Spaces API
24/7 accounting AI server
"""
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from typing import Optional
import os
from llama_cpp import Llama
from huggingface_hub import hf_hub_download

app = FastAPI(
    title="ChatBIA API",
    description="Accounting AI server (Hugging Face Spaces)",
    version="1.0.0"
)

# CORS settings
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Hugging Face model repositories
GENERAL_MODEL_REPO = "Qwen/Qwen2.5-3B-Instruct-GGUF"
GENERAL_MODEL_FILE = "qwen2.5-3b-instruct-q4_k_m.gguf"
BSL_MODEL_REPO = "Seounghyup/ChatBIA-3B-v0.1"
BSL_MODEL_FILE = "ChatBIA-3B-v0.1-Q4_K_M.gguf"

# Global model handles
general_model = None
bsl_model = None
general_model_path = None
bsl_model_path = None


class ChatRequest(BaseModel):
    message: str
    mode: str = "bsl"  # "general" or "bsl"
    max_tokens: int = 1024
    temperature: float = 0.7


class ChatResponse(BaseModel):
    response: str
    mode: str
    tokens: int


@app.on_event("startup")
async def load_models():
    """Download and load the models when the server starts."""
    global general_model, bsl_model, general_model_path, bsl_model_path

    # Download the general model
    print(f"🔄 Downloading general-mode model: {GENERAL_MODEL_REPO}/{GENERAL_MODEL_FILE}")
    try:
        general_model_path = hf_hub_download(
            repo_id=GENERAL_MODEL_REPO,
            filename=GENERAL_MODEL_FILE,
            repo_type="model"
        )
        print(f"✅ General-mode model downloaded: {general_model_path}")

        # Load the model
        general_model = Llama(
            model_path=general_model_path,
            n_ctx=2048,
            n_threads=2,  # Spaces CPU limit
            n_gpu_layers=0,
            verbose=False
        )
        print("✅ General-mode model loaded")
    except Exception as e:
        print(f"❌ Failed to load general-mode model: {e}")

    # Download the BSL model
    print(f"🔄 Downloading BSL-mode model: {BSL_MODEL_REPO}/{BSL_MODEL_FILE}")
    try:
        bsl_model_path = hf_hub_download(
            repo_id=BSL_MODEL_REPO,
            filename=BSL_MODEL_FILE,
            repo_type="model"
        )
        print(f"✅ BSL-mode model downloaded: {bsl_model_path}")

        # Load the model
        bsl_model = Llama(
            model_path=bsl_model_path,
            n_ctx=2048,
            n_threads=2,
            n_gpu_layers=0,
            verbose=False
        )
        print("✅ BSL-mode model loaded")
    except Exception as e:
        print(f"❌ Failed to load BSL model: {e}")


def build_prompt(message: str, mode: str) -> str:
    """Build the ChatML prompt."""
    if mode == "bsl":
        return f"""<|im_start|>system
You are a professional accounting AI assistant. Respond naturally in Korean.

Important: Only generate BSL DSL code when the user explicitly requests calculations (e.g., "계산해줘", "코드 작성해줘", "BSL로 작성해줘"). For general questions or greetings, respond conversationally without code.<|im_end|>
<|im_start|>user
{message}<|im_end|>
<|im_start|>assistant
"""
    else:
        return f"""<|im_start|>system
You are a helpful AI assistant. Respond naturally in Korean.<|im_end|>
<|im_start|>user
{message}<|im_end|>
<|im_start|>assistant
"""


@app.get("/")
async def root():
    """Health check."""
    return {
        "status": "online",
        "service": "ChatBIA API",
        "version": "1.0.0",
        "platform": "Hugging Face Spaces",
        "models": {
            "general": general_model is not None,
            "bsl": bsl_model is not None
        }
    }


@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    """Chat endpoint."""
    # Select the model
    if request.mode == "general":
        model = general_model
        model_name = "General"
    else:
        model = bsl_model
        model_name = "BSL"

    # Error out if the model is unavailable
    if model is None:
        raise HTTPException(
            status_code=503,
            detail=f"{model_name} model is not loaded."
        )

    try:
        # Build the prompt
        prompt = build_prompt(request.message, request.mode)

        # Run inference
        response = model(
            prompt,
            max_tokens=request.max_tokens,
            temperature=request.temperature,
            top_p=0.9,
            top_k=40,
            repeat_penalty=1.1,
            stop=["<|im_end|>", "###", "\n\n\n"]
        )

        text = response["choices"][0]["text"].strip()
        # Rough count: whitespace-split words, not actual model tokens
        tokens = len(response["choices"][0]["text"].split())

        return ChatResponse(
            response=text,
            mode=request.mode,
            tokens=tokens
        )

    except Exception as e:
        raise HTTPException(
            status_code=500,
            detail=f"Error while running the AI model: {str(e)}"
        )


@app.get("/models")
async def get_models():
    """List the available models."""
    return {
        "general": {
            "loaded": general_model is not None,
            "path": general_model_path
        },
        "bsl": {
            "loaded": bsl_model is not None,
            "path": bsl_model_path
        }
    }
main.py ADDED
@@ -0,0 +1,193 @@
"""
ChatBIA FastAPI Server
24/7 accounting AI server
"""
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from typing import Optional, List
import os
from llama_cpp import Llama

app = FastAPI(
    title="ChatBIA API",
    description="Accounting AI server",
    version="1.0.0"
)

# CORS settings (allow access from the Android app and the web)
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Model paths
MODEL_DIR = "models"
GENERAL_MODEL_PATH = os.path.join(MODEL_DIR, "Qwen2.5-3B-Instruct-Q4_K_M.gguf")
BSL_MODEL_PATH = os.path.join(MODEL_DIR, "ChatBIA-3B-v0.1-Q4_K_M.gguf")

# Global model handles
general_model = None
bsl_model = None


class ChatRequest(BaseModel):
    message: str
    mode: str = "bsl"  # "general" or "bsl"
    max_tokens: int = 1024
    temperature: float = 0.7


class ChatResponse(BaseModel):
    response: str
    mode: str
    tokens: int


@app.on_event("startup")
async def load_models():
    """Load the models when the server starts."""
    global general_model, bsl_model

    os.makedirs(MODEL_DIR, exist_ok=True)

    # Load the general model
    if os.path.exists(GENERAL_MODEL_PATH):
        print(f"🔄 Loading general-mode model: {GENERAL_MODEL_PATH}")
        try:
            general_model = Llama(
                model_path=GENERAL_MODEL_PATH,
                n_ctx=2048,
                n_threads=4,
                n_gpu_layers=0,  # Oracle Cloud is CPU-only
                verbose=False
            )
            print("✅ General-mode model loaded")
        except Exception as e:
            print(f"❌ Failed to load general-mode model: {e}")

    # Load the BSL model
    if os.path.exists(BSL_MODEL_PATH):
        print(f"🔄 Loading BSL-mode model: {BSL_MODEL_PATH}")
        try:
            bsl_model = Llama(
                model_path=BSL_MODEL_PATH,
                n_ctx=2048,
                n_threads=4,
                n_gpu_layers=0,
                verbose=False
            )
            print("✅ BSL-mode model loaded")
        except Exception as e:
            print(f"❌ Failed to load BSL-mode model: {e}")


def build_prompt(message: str, mode: str) -> str:
    """Build the ChatML prompt."""
    if mode == "bsl":
        return f"""<|im_start|>system
You are a professional accounting AI assistant. Respond naturally in Korean.

Important: Only generate BSL DSL code when the user explicitly requests calculations (e.g., "계산해줘", "코드 작성해줘", "BSL로 작성해줘"). For general questions or greetings, respond conversationally without code.<|im_end|>
<|im_start|>user
{message}<|im_end|>
<|im_start|>assistant
"""
    else:
        return f"""<|im_start|>system
You are a helpful AI assistant. Respond naturally in Korean.<|im_end|>
<|im_start|>user
{message}<|im_end|>
<|im_start|>assistant
"""


@app.get("/")
async def root():
    """Health check."""
    return {
        "status": "online",
        "service": "ChatBIA API",
        "version": "1.0.0",
        "models": {
            "general": general_model is not None,
            "bsl": bsl_model is not None
        }
    }


@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    """Chat endpoint."""
    # Select the model
    if request.mode == "general":
        model = general_model
        model_name = "General"
    else:
        model = bsl_model
        model_name = "BSL"

    # Error out if the model is unavailable
    if model is None:
        raise HTTPException(
            status_code=503,
            detail=f"{model_name} model is not loaded."
        )

    try:
        # Build the prompt
        prompt = build_prompt(request.message, request.mode)

        # Run inference
        response = model(
            prompt,
            max_tokens=request.max_tokens,
            temperature=request.temperature,
            top_p=0.9,
            top_k=40,
            repeat_penalty=1.1,
            stop=["<|im_end|>", "###", "\n\n\n"]
        )

        text = response["choices"][0]["text"].strip()
        # Rough count: whitespace-split words, not actual model tokens
        tokens = len(response["choices"][0]["text"].split())

        return ChatResponse(
            response=text,
            mode=request.mode,
            tokens=tokens
        )

    except Exception as e:
        raise HTTPException(
            status_code=500,
            detail=f"Error while running the AI model: {str(e)}"
        )


@app.get("/models")
async def get_models():
    """List the available models."""
    return {
        "general": {
            "loaded": general_model is not None,
            "path": GENERAL_MODEL_PATH if os.path.exists(GENERAL_MODEL_PATH) else None
        },
        "bsl": {
            "loaded": bsl_model is not None,
            "path": BSL_MODEL_PATH if os.path.exists(BSL_MODEL_PATH) else None
        }
    }


if __name__ == "__main__":
    import uvicorn
    uvicorn.run(
        "main:app",
        host="0.0.0.0",
        port=8000,
        reload=False  # keep reload off in production
    )
requirements.txt ADDED
@@ -0,0 +1,6 @@
fastapi==0.115.0
uvicorn[standard]==0.30.6
pydantic==2.9.2
llama-cpp-python==0.2.90
python-multipart==0.0.9
huggingface_hub  # app.py imports hf_hub_download at startup