Llama 3.2 3B Instruct
First local chat, prompt experiments, short summaries
A small, friendly starter model for learning local AI without needing a large GPU.
Likely good memory headroom for this quantized model at normal context sizes.
AI Workstations in Estonia
Serious LoRA fine-tuning with 16GB VRAM and 96GB system RAM
Profile: LLM Fine-Tune Starter
Honest visual overview
A quick summary of the main AI buying decisions: GPU memory, system RAM, model target, and power class.
This is a schematic summary, not a photo of the exact build.
AI capability fit
Moderate
Good for everyday local LLM use
GPU
NVIDIA RTX 4070 Ti SUPER
VRAM
16GB
RAM
96GB
Model target
7B-13B LoRA/QLoRA
CPU
AMD Ryzen 9 9950X
GPU
NVIDIA RTX 4070 Ti SUPER
VRAM
16GB
RAM
96GB
Storage
4000GB
Model target
7B-13B LoRA/QLoRA
Throughput
Training: varies by batch size
System power
~400W
Recommended PSU
850W
Cooling
Premium dual-tower air cooling
Good for 13B-class models
Strong everyday local LLM tier; 30B may need more memory or heavier quantization.
AI capability fit
Moderate
Good for everyday local LLM use
Based on VRAM and system RAM; quantization, context, and runtime can change the result.
Estimated build market price history
The chart includes all listed components. Observed prices are preferred; missing points are filled with conservative estimates from current prices and category trends. This does not include the assembly markup.
Latest estimated market total
€3,441
4/8 observed · 4 estimated
Not enough trusted price history yet.
This range has 31.6% average trusted value coverage and 35% latest coverage. The line chart appears once trusted value coverage reaches at least 60%.
The summary below is a directional market-planning estimate, not the checkout/order price and not a daily scraped history.
Latest estimated market total
€3,441
4/8 observed · 4 estimated · 0 unknown
GPU: NVIDIA RTX 4070 Ti SUPER
GPU fallback: current market/reference anchor with a conservative 8% twelve-month decline toward today.
Confidence: low
CPU: AMD Ryzen 9 9950X
CPU fallback: current market/reference anchor with a conservative 6% twelve-month decline toward today.
Confidence: medium
RAM: G.Skill Trident Z5 RGB 96GB (2x48GB) DDR5-6400 CL32
Memory fallback: current market/reference anchor with a conservative 5% twelve-month category adjustment.
Confidence: low
Storage: Samsung 990 Pro 4TB
Storage fallback: current market/reference anchor with a conservative 5% twelve-month category adjustment.
Confidence: medium
Motherboard: MSI MAG X670E Tomahawk WiFi
Motherboard fallback: current market/reference anchor with a conservative 4% twelve-month decline toward today.
Confidence: low
PSU: SeaSonic Focus GX-850 850W
PSU fallback: current market/reference anchor with a very small 2% twelve-month category adjustment.
Confidence: medium
Case: Fractal Design Meshify 2
Case fallback: flat estimate from the current market/reference anchor.
Confidence: medium
Cooler: Noctua NH-D15 G2
Cooler fallback: flat estimate from the current market/reference anchor.
Confidence: medium
Prices use Estonian market data when available, otherwise reference estimates. Displayed component prices include the assembly/configuration markup; payable order price applies only when the purchase panel allows online checkout.
| Component | Product | Displayed price |
|---|---|---|
| GPU | NVIDIA RTX 4070 Ti SUPER Planning reference price | €884 |
| CPU | AMD Ryzen 9 9950X Updated todayVerified pricing input | €532 |
| RAM | G.Skill Trident Z5 RGB 96GB (2x48GB) DDR5-6400 CL32 Stale market dataLast checked 42 days ago | €1,149 |
| Storage | Samsung 990 Pro 4TB Updated todayVerified pricing input | €566 |
| Motherboard | MSI MAG X670E Tomahawk WiFi Planning reference price | €367 |
| PSU | SeaSonic Focus GX-850 850W Updated todayVerified pricing input | €154 |
| Case | Fractal Design Meshify 2 Updated yesterdayVerified pricing input | €134 |
| Cooler | Noctua NH-D15 G2 Stale market dataLast checked 42 days ago | €171 |
| Estimated build configuration total | €3,957 | |
Serious 7B-13B LoRA/QLoRA workstation with CUDA, 96GB RAM, and 4TB fast storage for datasets and checkpoints. The board is sized for reliability rather than prestige; the single 16GB GPU keeps training plans adapter-based.
Source refs: nvidia.com, amd.com
Quote reference price
€3,957
Shown for planning. Direct checkout remains quote-only until fresh market pricing and availability are checked.
Price chart shows Estonian market averages before assembly/configuration markup; quote-only pricing is manually confirmed before payment.
Direct checkout blocker
CPU: AMD Ryzen 9 9950X - price failed safety checks
Quote-only because component pricing could not be verified.
Fresh, non-fallback Estonian market pricing is required before Stripe payment can be opened. Use the verified quote request on this page for manual review.
What happens after your quote request
Support and questions continue through the order or quote email thread.
Request a verified quote
The request does not take payment. We manually verify price and availability, then confirm substitutions or changes before offering any payment link.
Practical model fit
Examples for 16GB VRAM Fine-Tune Workhorse, based mainly on GPU VRAM and system memory.
Starter pick
A practical first coding assistant for most LLMLab desktop builds.
It is small enough for mainstream GPUs but tuned specifically for code.
Likely good memory headroom for this quantized model at normal context sizes.
First local chat, prompt experiments, short summaries
A small, friendly starter model for learning local AI without needing a large GPU.
Likely good memory headroom for this quantized model at normal context sizes.
Light chat, multilingual prompts, compact reasoning tests
A compact Qwen model that gives beginners a taste of newer reasoning-style local models.
Likely good memory headroom for this quantized model at normal context sizes.
Fast general chat and simple assistant tasks
A fast classic 7B model that is easy to run and compare against newer models.
Likely good memory headroom for this quantized model at normal context sizes.
Everyday private chat and document summaries
A widely supported everyday local chat model when the machine has at least an 8GB to 12GB GPU.
Likely good memory headroom for this quantized model at normal context sizes.
Assumptions
Family: Qwen2.5-Coder
Parameters: 7B
Quantization: Q4_K_M
Approx. model size: 4.68GB
CPU-only: Not recommended
VRAM: 8GB min / 12GB recommended
RAM: 16GB min / 32GB recommended
Full GPU offload: Should be possible when memory fits
Context warning: Large files and many open tabs can push memory use above the model size.
Research sources
Researched: 2026-06-22
Family: Meta Llama 3.2
Parameters: 3B
Quantization: Q4_K_M
Approx. model size: 2GB
CPU-only: Possible
VRAM: 0GB min / 4GB recommended
RAM: 8GB min / 16GB recommended
Full GPU offload: Should be possible when memory fits
Context warning: Long documents can still push memory use up, even with a small model.
Research sources
Researched: 2026-06-22
Family: Qwen3
Parameters: 4B
Quantization: Q4_K_M
Approx. model size: 2.5GB
CPU-only: Possible
VRAM: 4GB min / 6GB recommended
RAM: 8GB min / 16GB recommended
Full GPU offload: Should be possible when memory fits
Context warning: Keep the context window modest on 8GB to 16GB systems.
Family: Mistral 7B
Parameters: 7.3B
Quantization: Q4_K_M
Approx. model size: 4.4GB
CPU-only: Not recommended
VRAM: 8GB min / 12GB recommended
RAM: 16GB min / 32GB recommended
Full GPU offload: Should be possible when memory fits
Context warning: Long context support does not mean every machine should use the maximum context.
Research sources
Researched: 2026-06-22
Family: Meta Llama 3.1
Parameters: 8B
Quantization: Q4_K_M
Approx. model size: 4.9GB
CPU-only: Not recommended
VRAM: 8GB min / 12GB recommended
RAM: 16GB min / 32GB recommended
Full GPU offload: Should be possible when memory fits
Context warning: The Q4 model is under 5GB, but KV cache grows with context length.
Research sources
Researched: 2026-06-22
Local AI performance is approximate. Results depend on quantization, context length, backend, drivers, RAM, and whether the model fits fully in VRAM.
Order and handover
After payment or quote request
You receive a confirmation email. We check availability and confirm any practical substitutions before continuing.
Assembly and QA
Planned work includes software, drivers, a GPU/AI smoke test, thermal load sanity check, and memory/storage health checks.
Handover in Estonia
Pickup or local delivery method and timing are agreed after availability is checked.
Warranty and support
Warranty depends on the component, manufacturer, and retailer. Questions continue through the order or quote email thread.
Changes and cancellations
Changes are confirmed in writing; after sourcing or assembly begins, custom-order handling may depend on order state.
Payment security
Card details are entered in Stripe checkout. LLMLab.ee does not collect or store full card numbers.
AI terms in plain language
VRAM
Memory on the graphics card; usually the main limit for local AI model size.
Unified memory
Apple Silicon memory shared by CPU and GPU. Useful for local AI, but not identical to NVIDIA VRAM.
7B / 13B / 70B
A rough model-size signal. Larger numbers usually need more memory and may run slower.
q4 / quantization
A compressed 4-bit model that uses less memory, sometimes with quality or speed tradeoffs.