Hermes Agent is designed to run anywhere — from your laptop to a $5 VPS, a GPU cluster, or serverless infrastructure. This guide covers all major deployment options and the security best practices you need for production use.
## Local Development Setup

For development and testing, running Hermes locally is the fastest path:

```bash
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
source ~/.bashrc
hermes doctor
```

The one-line installer handles Python, Node.js, dependencies, and global command registration.
## Running Hermes Fully Offline

One of Hermes' biggest advantages is the ability to run completely offline using local models via Ollama.
### Install Ollama

```bash
curl -fsSL https://ollama.com/install.sh | sh
```
### Pull a Compatible Model

Choose a model with sufficient context (minimum 64K tokens recommended):

```bash
ollama pull qwen2.5-coder:32b
ollama serve
```
### Connect Hermes to Ollama

```bash
hermes model
# Select "Custom endpoint (self-hosted / VLLM / etc.)"
# Enter URL: http://localhost:11434/v1
# Skip API key
# Enter model name: qwen2.5-coder:32b
```
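To sanity-check the connection outside of Hermes, you can POST a minimal OpenAI-style chat completions body to `http://localhost:11434/v1/chat/completions` (for example with `curl`); a request this small should come back with a completion:

```json
{
  "model": "qwen2.5-coder:32b",
  "messages": [
    { "role": "user", "content": "Say hello" }
  ]
}
```

If this returns a JSON response containing a `choices` array, Hermes will be able to talk to the endpoint too.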
### Increase Context Window

The model needs a large context window to hold the system prompt and tool definitions while still leaving room to generate responses:

```bash
OLLAMA_CONTEXT_LENGTH=32768 ollama serve
```

Or make the setting persistent via systemd:

```bash
sudo systemctl edit ollama.service
# Add: Environment="OLLAMA_CONTEXT_LENGTH=32768"
sudo systemctl daemon-reload && sudo systemctl restart ollama
```
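The drop-in that `systemctl edit` creates is just a small override file; note that the `Environment=` line must sit under a `[Service]` section header:

```ini
# /etc/systemd/system/ollama.service.d/override.conf
[Service]
Environment="OLLAMA_CONTEXT_LENGTH=32768"
```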
## VPS Deployment

For an always-on agent that survives reboots, deploy to a VPS:

### Recommended Providers
- Hetzner — reliable, inexpensive (~€5/month)
- DigitalOcean — easy to use with one-click apps
- Linode — good performance for the price
### Installation on VPS

SSH into your server and run the same one-line installer:

```bash
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
```
### Install as a Persistent Service

For the gateway to run 24/7:

```bash
# Linux systemd user service
hermes gateway install

# Linux system-wide service
sudo hermes gateway install --system
```

This creates a systemd service that auto-starts on boot and restarts on crash.
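As an illustration, the auto-start and crash-restart behavior comes from standard systemd directives. The unit below is a hypothetical sketch (the file Hermes actually generates, including the `ExecStart` path, may differ):

```ini
# Hypothetical sketch -- not the exact unit Hermes writes
[Unit]
Description=Hermes Agent gateway
After=network-online.target

[Service]
ExecStart=/usr/local/bin/hermes gateway run
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

`Restart=on-failure` handles crash recovery, and `WantedBy=multi-user.target` is what makes the service start on boot once enabled.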
## Serverless Deployment

For infrastructure that costs nearly nothing when idle, use serverless platforms:

### Modal

Modal provides serverless Python execution with persistent volumes:

```bash
# Deploy Hermes to Modal
modal deploy hermes-modal.py
```
Benefits:
- Scales to zero when not in use
- Persistent volumes for memory and skills
- GPU access for local model inference
### Daytona

Daytona offers cloud development environments:

```bash
daytona create hermes-agent
daytona open hermes-agent
```
Ideal for teams that need collaborative agent development.
## Security Best Practices

When deploying Hermes in production, follow these guidelines:

### 1. Use the Container Backend

Container isolation prevents the agent from affecting the host system:

```bash
hermes config set backend container
```
### 2. Set Explicit Allowlists

Restrict gateway access to specific users:

```bash
export TELEGRAM_ALLOWED_USERS=123456789
```
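The allowlist is a comma-separated list of Telegram user IDs. As an illustrative sketch only (not Hermes' actual gateway code), a membership check against such a list can be done in pure shell:

```shell
# Illustrative sketch -- not Hermes' actual implementation.
# Checks whether a user ID appears in a comma-separated allowlist.
is_allowed() {
  case ",${TELEGRAM_ALLOWED_USERS}," in
    *",$1,"*) return 0 ;;   # ID found between commas
    *)        return 1 ;;   # not on the list
  esac
}

TELEGRAM_ALLOWED_USERS="123456789,987654321"
is_allowed 123456789 && echo "allowed"
is_allowed 555       || echo "denied"
```

Wrapping both the list and the candidate ID in commas avoids false matches on partial IDs (e.g. `555` against `987654321`).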
### 3. Secure Secret Storage

Store API keys in `~/.hermes/.env` with restricted permissions:

```bash
chmod 600 ~/.hermes/.env
```
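A quick way to verify the permissions actually took is to stat the file; on Linux, `stat -c '%a'` prints the octal mode:

```shell
# Create the secrets dir/file if needed, lock it down, then verify.
mkdir -p ~/.hermes
touch ~/.hermes/.env
chmod 600 ~/.hermes/.env
stat -c '%a' ~/.hermes/.env   # prints 600 (owner read/write only)
```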
### 4. Regular Updates

Keep Hermes and its dependencies up to date:

```bash
hermes update
```
### 5. Run as Non-Root User

Never run the agent as root. Create a dedicated user:

```bash
sudo useradd -m hermes
sudo su - hermes
curl -fsSL https://raw.githubusercontent.com/... | bash
```
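If you script the install, it's worth adding a guard so the agent can't accidentally be started as root. A small sketch follows; it takes the UID as a parameter so it is easy to test (`id -u` prints `0` for root):

```shell
# Refuse to proceed for UID 0; pass the caller's UID explicitly.
require_non_root() {
  if [ "$1" -eq 0 ]; then
    echo "refusing to run as root" >&2
    return 1
  fi
}

# Typical usage in a script: require_non_root "$(id -u)" || exit 1
require_non_root 1000 && echo "ok"        # prints ok
require_non_root 0    || echo "blocked"   # prints blocked
```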
### 6. Monitor Logs

Watch for unauthorized access attempts:

```bash
journalctl -u hermes-gateway -f
```
## Which Deployment Should You Choose?
| Use Case | Recommended Deployment |
|---|---|
| Personal experimentation | Local laptop |
| Privacy-first, no cloud | Local + Ollama |
| Always-on personal assistant | $5 VPS |
| Team collaboration | Daytona or shared VPS |
| Cost-optimized, intermittent use | Modal (serverless) |
| Heavy local model inference | GPU server or Modal with GPU |
## Summary
Hermes Agent's flexibility is one of its greatest strengths. Whether you need a fully offline local setup, an always-on VPS deployment, or a cost-efficient serverless configuration, the same agent code runs everywhere. Start local, scale to the cloud when you're ready, and always follow security best practices for production deployments.
