NVIDIA H200 SXM
The Preferred Choice for Large Model Inference
| Architecture | Hopper (H200) |
|---|---|
| VRAM | 141GB HBM3e |
| Memory Bandwidth | 4.8 TB/s |
| FP8 Tensor | 3,958 TFLOPS (with sparsity) |
| Form Factor | SXM5 |
| Tenancy | Dedicated |
| Location | Missoula, Montana |
Ideal workloads: inference on large foundation models (70B–405B parameters), retrieval-augmented generation (RAG) over large knowledge bases, long-context document processing, multimodal AI, and CubDen at scale.
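To see why the 141GB of HBM3e matters for the model sizes above, here is a rough, hedged back-of-envelope sketch (the `gpus_needed` helper and the flat 20% overhead margin are assumptions for illustration, not a sizing guarantee): weight memory is roughly parameter count times bytes per parameter, so at FP8 (1 byte/param) a 70B model fits on a single H200, while a 405B model must be sharded across several.

```python
import math

def gpus_needed(params_b: float, bytes_per_param: float = 1.0,
                vram_gb: float = 141.0, overhead: float = 0.20) -> int:
    """Rough H200 count to hold a model's weights.

    Hypothetical estimator: weights ≈ params (billions) × bytes/param in GB,
    with a flat 20% margin standing in for KV cache and activations.
    """
    weights_gb = params_b * bytes_per_param
    return math.ceil(weights_gb * (1 + overhead) / vram_gb)

print(gpus_needed(70))    # 70B @ FP8 → 1 (fits on a single H200)
print(gpus_needed(405))   # 405B @ FP8 → 4 (must be sharded)
```

Real deployments vary with context length, batch size, and quantization scheme, so treat this only as a first-pass feasibility check.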