Homelab Inference Server
Quiet headless server for Ollama, Open WebUI, embeddings, and small internal AI services. The budget prioritizes 16GB of CUDA VRAM, 96GB of RAM, and 4TB of storage over an oversized case or cooler.
GPU: NVIDIA RTX 4070 Ti SUPER
CPU: AMD Ryzen 9 7900
RAM: 96GB | Storage: 4TB
Target: 13B-34B q4
Good for 13B-class models
Strong everyday local LLM tier. 30B-class models may exceed the 16GB of VRAM at q4 and need heavier quantization or partial offload to system RAM.
Good for everyday local LLM use. Roughly suitable for:
- local coding assistants and 7B/8B models
- 13B/14B quantized models
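
The sizing above can be sanity-checked with rough arithmetic. The sketch below is a back-of-envelope estimate, not a benchmark: it assumes about 4.5 bits per weight for a q4 quantization and a flat 2GB allowance for KV cache and runtime overhead. Both figures are illustrative assumptions, and the function names are hypothetical. Real usage varies with the quantization variant, context length, and runtime.

```python
# Rough VRAM estimate for quantized LLM weights.
# Assumptions (not from the build listing): ~4.5 bits per weight for q4,
# and a flat 2 GB allowance for KV cache and runtime overhead.

def estimate_weights_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_vram(
    params_billion: float,
    vram_gb: float = 16.0,       # VRAM of the GPU in this build
    overhead_gb: float = 2.0,    # assumed KV cache + runtime allowance
    bits_per_weight: float = 4.5,
) -> bool:
    """True if weights plus the assumed overhead fit entirely in VRAM."""
    needed = estimate_weights_gb(params_billion, bits_per_weight) + overhead_gb
    return needed <= vram_gb


for size in (8, 13, 34):
    weights = estimate_weights_gb(size)
    verdict = "fits in 16GB VRAM" if fits_in_vram(size) else "needs offload or heavier quantization"
    print(f"{size}B at q4: ~{weights:.1f} GB weights, {verdict}")
```

Under these assumptions, 8B and 13B models fit fully on the GPU, while a 34B model at q4 comes out around 19GB of weights alone, so part of it would have to live in the 96GB of system RAM.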
€3,627
6 market-priced parts, 2 reference estimates