Optimized AI Inference on Your GPU

GPUStack unlocks the full potential of your hardware with intelligent optimization for LLM deployment and inference.

The Challenge of LLM Deployment

Open-source inference engines deliver state-of-the-art performance, but they require significant expertise and effort to unlock their full potential. GPUStack eliminates this complexity.

Without Optimization

Manual tuning, complex configurations, and suboptimal performance across different hardware platforms.

Time-consuming setup process
Inconsistent performance across GPUs
Limited hardware utilization

With GPUStack

Automated optimization that delivers significant performance gains out of the box.

Up to 3x performance improvement
Seamless hardware compatibility
Maximum GPU utilization

Tailored for Every Use Case

Different scenarios demand different optimization strategies. GPUStack provides flexible deployment modes.

Throughput Mode

Optimized for maximum throughput under heavy request concurrency. Perfect for batch processing and high-volume APIs.

Latency Mode

Optimized for minimal latency under light request concurrency. Ideal for real-time interactive applications.

Standard Mode

Runs at full precision and prioritizes compatibility. Ensures maximum model accuracy and stability.

Custom Mode

Fully customizable optimization parameters tailored to your specific requirements and constraints.
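
Not sure which mode fits your workload? The sketch below load-tests a deployed model through GPUStack's OpenAI-compatible API at several concurrency levels, so you can see the throughput/latency trade-off on your own hardware. The server URL, API key, and model name are placeholders for your own deployment.

```python
import time
from concurrent.futures import ThreadPoolExecutor

import requests

BASE_URL = "http://your-gpustack-server/v1"  # placeholder
API_KEY = "your-api-key"                     # placeholder
MODEL = "qwen2.5-7b-instruct"                # placeholder model name

def one_request() -> float:
    """Send a single chat completion and return its wall-clock latency."""
    start = time.perf_counter()
    resp = requests.post(
        f"{BASE_URL}/chat/completions",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": MODEL,
            "messages": [{"role": "user", "content": "Say hello."}],
            "max_tokens": 32,
        },
        timeout=60,
    )
    resp.raise_for_status()
    return time.perf_counter() - start

def measure(concurrency: int, total: int = 32) -> None:
    """Run `total` requests at the given concurrency and report stats."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda _: one_request(), range(total)))
    elapsed = time.perf_counter() - start
    print(f"concurrency={concurrency:3d}  "
          f"throughput={total / elapsed:6.2f} req/s  "
          f"mean latency={sum(latencies) / len(latencies):6.2f} s")

# Latency mode should win at concurrency 1; throughput mode should
# pull ahead as concurrency grows.
for c in (1, 4, 16):
    measure(c)
```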

Fast Innovation with AI

Pluggable backends and inference engines let you run state-of-the-art open-source models from day one.

DeepSeek
Qwen
GLM
Kimi
Llama
Mistral
Gemma
Phi
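
Every model GPUStack serves is reachable through its OpenAI-compatible API, so existing tooling works unchanged. A minimal sketch using the standard openai Python client; the base URL, API key, and deployed model name are placeholders:

```python
from openai import OpenAI

# Point the standard OpenAI client at your GPUStack deployment.
client = OpenAI(
    base_url="http://your-gpustack-server/v1",  # placeholder
    api_key="your-api-key",                     # placeholder
)

response = client.chat.completions.create(
    model="deepseek-r1",  # placeholder: whichever name you deployed under
    messages=[{"role": "user", "content": "Summarize what GPUStack does."}],
)
print(response.choices[0].message.content)
```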

Scalable GPU Clusters

Deploy across any infrastructure, anywhere in the world.

Scale on demand. Anytime, anywhere.

On-Premise

Leverage your existing GPU servers with full control over your infrastructure.

Kubernetes

Deploy on any Kubernetes cluster with seamless orchestration and management.

Multi-Cloud

AWS, DigitalOcean, Alibaba Cloud, and more — dynamically scale GPU resources.
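
If you are targeting Kubernetes, a quick way to see which nodes expose GPUs is the official kubernetes Python client. This sketch assumes NVIDIA hardware advertised under the standard nvidia.com/gpu resource name; adjust the key for other vendors:

```python
from kubernetes import client, config

# Load credentials from ~/.kube/config (use load_incluster_config()
# when running inside a pod).
config.load_kube_config()
v1 = client.CoreV1Api()

# Report each node's advertised GPU capacity.
for node in v1.list_node().items:
    gpus = node.status.capacity.get("nvidia.com/gpu", "0")
    print(f"{node.metadata.name}: {gpus} GPU(s)")
```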

Monitor, Measure, Optimize

GPUStack gives you the power to see everything your models do. From real-time performance metrics to historical trends, track every inference, every millisecond, and every resource your LLMs consume.

Real-time Metrics

Live performance tracking

Historical Trends

Performance analytics

Resource Usage

GPU/CPU utilization

Optimization

Automated tuning
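
For programmatic access, a polling sketch like the one below works against any Prometheus-style exporter. The /metrics path and the metric names are assumptions here; check your GPUStack version's documentation for the exact endpoint it exposes:

```python
import requests
from prometheus_client.parser import text_string_to_metric_families

METRICS_URL = "http://your-gpustack-server/metrics"  # assumed path

# Fetch the exporter's text output and print any GPU-related series,
# e.g. utilization or VRAM usage gauges.
text = requests.get(METRICS_URL, timeout=10).text
for family in text_string_to_metric_families(text):
    if "gpu" in family.name:
        for sample in family.samples:
            print(sample.name, sample.labels, sample.value)
```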

Management & Collaboration

GPUStack enables teams to efficiently manage models, inference engines, and compute resources while collaborating seamlessly.

Team Management

Manage Models
Deploy, update, and monitor your models effortlessly
Compute Resource Management
Allocate GPUs, workers, and clusters efficiently
Failover & Autoscaling
Ensure smooth operations during high demand or failures
Collaboration Tools
Share models, track usage, and coordinate workflows
Unified Management Interface

From GPU clusters to users and API keys, keep everything organized and under control.
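
Because the API is OpenAI-compatible, an API key issued by your administrator can also enumerate the models it is allowed to see. A minimal sketch, with the URL and key again as placeholders:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://your-gpustack-server/v1",  # placeholder
    api_key="your-api-key",                     # placeholder
)

# List every model this key is authorized to use.
for model in client.models.list():
    print(model.id)
```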

Enterprise Ready

GPUStack provides the enterprise capabilities needed to deploy LLMs at scale with confidence.

SSO Integration

Seamless enterprise login and identity management

Access Control

Manage who can access or modify models within your team

Token Quotas

Control usage organization-wide with rate limits (see the sketch below)

High Availability

Ensure uninterrupted model service with failover
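
When a token quota or rate limit is hit, well-behaved clients back off and retry. The sketch below assumes, as is conventional for rate limiting, that the server answers with HTTP 429 and an optional Retry-After header:

```python
import time

import requests

def post_with_backoff(url: str, headers: dict, payload: dict,
                      retries: int = 5) -> requests.Response:
    """POST, retrying with exponential backoff on HTTP 429 responses."""
    for attempt in range(retries):
        resp = requests.post(url, headers=headers, json=payload, timeout=60)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp
        # Honor Retry-After if the server provides it; otherwise back off
        # exponentially (1s, 2s, 4s, ...).
        wait = float(resp.headers.get("Retry-After", 2 ** attempt))
        time.sleep(wait)
    raise RuntimeError("still rate-limited after all retries")
```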

Ready to Optimize Your AI Inference?

Join thousands of developers and enterprises already using GPUStack to deploy LLMs at scale.