hrhub / SETUP_GUIDE.md
Roger Surf
Refactor: Professional Streamlit MVP
f15d7db
# πŸš€ HRHUB SETUP GUIDE
**Quick Start Guide for Deployment**
---
## πŸ“¦ What You Have
A complete, production-ready Streamlit application with:
- βœ… Professional code structure
- βœ… Mock data for MVP demo
- βœ… Interactive UI with network graphs
- βœ… Ready for GitHub + Streamlit Cloud deployment
---
## ⚑ OPTION 1: Quick Local Test (2 minutes)
### For Mac/Linux:
```bash
cd hrhub
./run.sh
```
### For Windows:
```bash
cd hrhub
run.bat
```
**That's it!** Open `http://localhost:8501` in your browser.
---
## 🌐 OPTION 2: Deploy to Internet (10 minutes)
### Step 1: Install Git (if not already)
- **Windows**: Download from https://git-scm.com/
- **Mac**: Install Xcode Command Line Tools
- **Linux**: `sudo apt install git`
### Step 2: Create GitHub Repository
1. Go to https://github.com/new
2. Repository name: `hrhub`
3. Keep it PUBLIC
4. Don't initialize with README (we have one)
5. Click "Create repository"
### Step 3: Push Your Code
Open terminal/command prompt in the `hrhub` folder:
```bash
# Initialize git
git init
# Add all files
git add .
# Commit
git commit -m "Initial HRHUB MVP deployment"
# Connect to GitHub (replace YOUR-USERNAME)
git remote add origin https://github.com/YOUR-USERNAME/hrhub.git
# Push
git branch -M main
git push -u origin main
```
### Step 4: Deploy on Streamlit Cloud
1. Go to https://share.streamlit.io
2. Click "Sign in" β†’ Sign in with GitHub
3. Click "New app"
4. Fill in:
- **Repository**: `YOUR-USERNAME/hrhub`
- **Branch**: `main`
- **Main file path**: `app.py`
5. Click "Deploy!"
**Wait 2-3 minutes** and your app will be live! πŸŽ‰
You'll get a URL like: `https://hrhub-YOUR-USERNAME.streamlit.app`
---
## 🎯 Testing Your Deployment
### What You Should See:
1. **Header**: "🏒 HRHUB - HR Matching System"
2. **Demo Mode Banner**: Blue info box saying mock data is active
3. **Statistics**: 4 metric cards showing:
- Total Matches: 10
- Average Score: ~65%
- Excellent Matches: 4
- Best Match: ~70%
4. **Two Columns**:
- **Left**: Candidate profile with expandable sections
- **Right**: Company matches (table or cards)
5. **Network Graph**: Interactive visualization at the bottom
### Interaction Tests:
- βœ… Change slider in sidebar (matches 5-20)
- βœ… Change min score slider
- βœ… Switch view modes (Overview/Cards/Table)
- βœ… Expand candidate sections
- βœ… Hover over network graph nodes
- βœ… Drag nodes in the graph
---
## πŸ”§ Common Issues & Solutions
### Issue 1: "streamlit: command not found"
**Solution:**
```bash
pip install streamlit
```
### Issue 2: "Module not found"
**Solution:**
```bash
pip install -r requirements.txt
```
### Issue 3: Port 8501 already in use
**Solution:**
```bash
streamlit run app.py --server.port 8502
```
### Issue 4: Git push fails (authentication)
**Solution:**
1. Generate GitHub Personal Access Token:
- Settings β†’ Developer settings β†’ Personal access tokens β†’ Generate new token
- Select "repo" scope
- Copy the token
2. When prompted for password, paste the token (not your GitHub password)
### Issue 5: Streamlit Cloud deployment fails
**Solution:**
- Check `requirements.txt` has all dependencies
- Ensure `app.py` is in root directory
- Check logs in Streamlit Cloud dashboard
- Make sure repository is PUBLIC
---
## πŸ“ Next Steps (After Demo Works)
### Phase 1: Generate Real Embeddings
1. Run your original code with save functionality:
```python
import numpy as np
import pickle
# After generating embeddings...
np.save('candidate_embeddings.npy', candidate_embeddings)
np.save('company_embeddings.npy', company_embeddings)
with open('candidates_processed.pkl', 'wb') as f:
pickle.dump(candidates, f)
with open('companies_processed.pkl', 'wb') as f:
pickle.dump(companies_full, f)
```
2. Place files in `hrhub/data/` folder
### Phase 2: Create Real Data Loader
Create `data/data_loader.py`:
```python
import numpy as np
import pickle
from utils.matching import find_top_matches
def load_embeddings():
"""Load pre-computed embeddings."""
candidate_emb = np.load('data/candidate_embeddings.npy')
company_emb = np.load('data/company_embeddings.npy')
with open('data/candidates_processed.pkl', 'rb') as f:
candidates = pickle.load(f)
with open('data/companies_processed.pkl', 'rb') as f:
companies = pickle.load(f)
return candidate_emb, company_emb, candidates, companies
# Add functions matching mock_data.py structure
```
### Phase 3: Swap Data Sources
In `app.py`, change:
```python
# FROM:
from data.mock_data import get_candidate_data, get_company_matches
# TO:
from data.data_loader import get_candidate_data, get_company_matches
```
In `config.py`, change:
```python
DEMO_MODE = False # Turn off demo banner
```
**That's it!** The UI stays exactly the same.
---
## πŸŽ“ For Your Teachers Demo
### What to Show:
1. **Start the app**: Show the clean UI loading
2. **Explain the candidate**: "This represents a real data scientist profile"
3. **Point out match scores**: "70% means strong alignment"
4. **Show the graph**: "Green = candidate, Red = companies, thickness = match strength"
5. **Demonstrate interaction**: Drag nodes, zoom, hover
6. **Highlight the concept**: "No hardcoded rules - pure semantic similarity"
### Key Points to Emphasize:
- βœ… **Scalable**: Works for 9.5K Γ— 180K matching
- βœ… **Fast**: Real-time similarity computation
- βœ… **Bilateral**: Can work both directions
- βœ… **No manual rules**: NLP understands semantics
- βœ… **Production-ready**: Clean code, modular design
---
## πŸ“Š Project Structure Explained
```
hrhub/
β”œβ”€β”€ app.py # Main app - teachers see this running
β”œβ”€β”€ config.py # Easy to tweak settings
β”œβ”€β”€ requirements.txt # All dependencies listed
β”‚
β”œβ”€β”€ data/
β”‚ └── mock_data.py # Demo data (swap later)
β”‚
β”œβ”€β”€ utils/
β”‚ β”œβ”€β”€ matching.py # Core algorithm - your innovation
β”‚ β”œβ”€β”€ visualization.py # Network graphs
β”‚ └── display.py # UI components
β”‚
└── README.md # Documentation
```
**Why this structure?**
- **Modular**: Easy to swap mock β†’ real data
- **Professional**: Industry-standard layout
- **Maintainable**: Clear separation of concerns
- **Scalable**: Ready to add features
---
## 🎯 Timeline Suggestion
**Tuesday (Today):**
- βœ… Test locally: `./run.sh`
- βœ… Deploy to GitHub
- βœ… Deploy to Streamlit Cloud
- βœ… Share link with team
**Wednesday:**
- Run your original code
- Generate & save embeddings
- Test loading saved files
**Thursday:**
- Create `data_loader.py`
- Swap to real data
- Test end-to-end
- Fix any bugs
**Friday:**
- βœ… **DEMO READY**
- Polish presentation
- Prepare talking points
**Weekend:**
- Focus 100% on report
- App already deployed!
---
## πŸ†˜ Need Help?
### Quick Checks:
1. **Is Python 3.8+ installed?** `python --version`
2. **Are dependencies installed?** `pip list | grep streamlit`
3. **Is the file structure correct?** `ls -la`
4. **Are you in the right directory?** `pwd`
### Still Stuck?
Check these in order:
1. Error message in terminal
2. Streamlit Cloud logs (if deployed)
3. GitHub Actions (if using)
4. This guide's "Common Issues" section
---
## βœ… Deployment Checklist
Before presenting to teachers:
- [ ] Local test works: `./run.sh`
- [ ] Pushed to GitHub
- [ ] Deployed on Streamlit Cloud
- [ ] Can access via public URL
- [ ] All sections display correctly
- [ ] Graph is interactive
- [ ] No error messages
- [ ] Screenshot/video of working app
- [ ] Link shared with team
- [ ] Backup plan (run locally if cloud fails)
---
## πŸŽ‰ You're Done!
You now have:
- βœ… Professional codebase
- βœ… Working demo
- βœ… Online deployment
- βœ… Easy path to production
**The hard part is done. Now focus on your report!** πŸ“
---
**Good luck with your presentation!** πŸš€
*Questions? Check README.md for more details.*