What is GPUStack?
GPUStack is an open-source GPU cluster manager for running Large Language Models (LLMs). GPUStack allows you to create a unified cluster from any brand of GPUs in Apple MacBooks, Windows PCs, and Linux servers. Administrators can deploy LLMs from popular repositories such as Hugging Face. Developers can then access LLMs just as easily as accessing public LLM services from vendors like OpenAI or Microsoft Azure.
For more details about GPUStack, visit:
Introducing GPUStack: https://gpustack.ai/introducing-gpustack
GitHub repo: https://github.com/gpustack/gpustack
User guide: https://docs.gpustack.ai
Getting Started with GPUStack
GPUStack requires Python 3.10 or later.
Installation
Linux or macOS
GPUStack provides a script that installs it as a service on systemd- or launchd-based systems. To install GPUStack this way, run:
```bash
curl -sfL https://get.gpustack.ai | sh -
```

You have now deployed and started the GPUStack server, which also serves as the first worker node. You can access the GPUStack UI at http://myserver (replace myserver with the IP address or domain name of the host where you installed GPUStack).
Log in to GPUStack with the username admin and the default password. In the default setup, you can retrieve the password with the following command:
```bash
cat /var/lib/gpustack/initial_admin_password
```

To add additional worker nodes and form a GPUStack cluster, run the following command on each worker node:
```bash
curl -sfL https://get.gpustack.ai | sh - --server-url http://myserver --token mytoken
```

Replace http://myserver with your GPUStack server URL and mytoken with your secret token for adding workers. In the default setup, retrieve the token from the GPUStack server with the following command:
```bash
cat /var/lib/gpustack/token
```

Alternatively, follow the instructions in the GPUStack UI to add workers:

Windows
Run PowerShell as administrator, then run the following command to install GPUStack:
```powershell
Invoke-Expression (Invoke-WebRequest -Uri "https://get.gpustack.ai" -UseBasicParsing).Content
```

You can access the GPUStack UI at http://myserver (replace myserver with the IP address or domain name of the host where you installed GPUStack).
Log in to GPUStack with the username admin and the default password. In the default setup, you can retrieve the password with the following command:
```powershell
Get-Content -Path (Join-Path -Path $env:APPDATA -ChildPath "gpustack\initial_admin_password") -Raw
```

Optionally, you can add extra workers to form a GPUStack cluster by running the following command on other nodes:
```powershell
Invoke-Expression "& { $((Invoke-WebRequest -Uri "https://get.gpustack.ai" -UseBasicParsing).Content) } --server-url http://myserver --token mytoken"
```

In the default setup, retrieve the token used for adding workers with the following command:
```powershell
Get-Content -Path (Join-Path -Path $env:APPDATA -ChildPath "gpustack\token") -Raw
```

For other installation scenarios, refer to the installation documentation at: https://docs.gpustack.ai/docs/quickstart
Serving LLMs
As an LLM administrator, you can log in to GPUStack as the default system admin, navigate to Resources to monitor your GPU status and capacity, and then go to Models to deploy any open-source LLM to the GPUStack cluster. You can then provide these LLMs to regular users for integration into their applications. This approach helps you efficiently utilize your existing resources and deliver stable LLM services for a variety of needs and scenarios.
- Access GPUStack to deploy the LLMs you need. Choose models from Hugging Face (only the GGUF format is currently supported) or the Ollama Library, download them to your local environment, and run them:

- GPUStack will automatically schedule the model to run on the appropriate Worker:

- You can manage and maintain LLMs by checking API requests, token consumption, token throughput, resource utilization status, and more. This helps you decide whether to scale up or upgrade LLMs to ensure service stability.

Integrating with your applications
As an AI application developer, you can log in to GPUStack as a regular user and navigate to Playground from the menu. Here, you can interact with the LLM using the UI playground.

Next, visit API Keys to generate and save your API key. Return to Playground to customize your LLM by adjusting the system prompt, adding few-shot learning examples, or tuning inference parameters. When you're done, click View Code and select your preferred code format (curl, Python, Node.js) along with the API key. Use this code in your applications to communicate with your private LLMs.
You can now access the OpenAI-compatible API. For example, using curl:
```bash
export GPUSTACK_API_KEY=myapikey
curl http://myserver/v1-openai/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $GPUSTACK_API_KEY" \
  -d '{
    "model": "llama3",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ],
    "stream": true
  }'
```
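The same request can be made from Python. This is a minimal sketch using only the standard library, mirroring the curl example above; http://myserver, myapikey, and llama3 are placeholders for your own server URL, API key, and deployed model name:

```python
import json
import urllib.request

def build_request(server="http://myserver", api_key="myapikey",
                  model="llama3", user_text="Hello!"):
    """Build a chat-completion request against GPUStack's OpenAI-compatible API."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_text},
        ],
        # Set to True for streamed responses, as in the curl example.
        "stream": False,
    }
    return urllib.request.Request(
        f"{server}/v1-openai/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )

req = build_request()
# To actually send it (requires a reachable GPUStack server):
#   with urllib.request.urlopen(req) as resp:
#       print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Because the API is OpenAI-compatible, the official OpenAI client libraries should also work once their base URL points at your server's /v1-openai path.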
Manage GPUStack
For macOS
On macOS, GPUStack runs as a launchd service. Use launchctl to manage the GPUStack service:
- View the configuration
```bash
sudo launchctl print system/ai.gpustack
```
- Stop service
```bash
sudo launchctl unload /Library/LaunchDaemons/ai.gpustack.plist
ps -ef | grep gpustack
```
- Start service
```bash
sudo launchctl load /Library/LaunchDaemons/ai.gpustack.plist
ps -ef | grep gpustack
```
- Edit configuration and restart service
```bash
sudo launchctl unload /Library/LaunchDaemons/ai.gpustack.plist
sudo vim /Library/LaunchDaemons/ai.gpustack.plist
sudo launchctl load /Library/LaunchDaemons/ai.gpustack.plist
ps -ef | grep gpustack
```
- View logs
You can view GPUStack logs using the following path and command:
```bash
tail -200f /var/log/gpustack.log
```
- Uninstall
Run the following command to uninstall GPUStack:
```bash
/var/lib/gpustack/uninstall.sh
```
For Linux
On Linux, GPUStack runs as a systemd service. Use systemctl to manage the GPUStack service:
- View the configuration
```bash
sudo cat /etc/systemd/system/gpustack.service
```
- Stop service
```bash
sudo systemctl stop gpustack
ps -ef | grep gpustack
```
- Start service
```bash
sudo systemctl start gpustack
ps -ef | grep gpustack
```
- Edit configuration and restart service
```bash
sudo vim /etc/systemd/system/gpustack.service
sudo systemctl daemon-reload
sudo systemctl restart gpustack
ps -ef | grep gpustack
```
- View logs
You can view GPUStack logs using the following path and command:
```bash
tail -200f /var/log/gpustack.log
```
- Uninstall
Run the following command to uninstall GPUStack:
```bash
/var/lib/gpustack/uninstall.sh
```
For Windows
On Windows, you can use PowerShell to manage the GPUStack service:
- View the configuration
```powershell
Get-WmiObject Win32_Process -Filter "Name = 'gpustack.exe'"
```
- Stop service
```powershell
Stop-Service -Name "GPUStack"
Get-WmiObject Win32_Process -Filter "Name = 'gpustack.exe'" | Select-Object ProcessId, CommandLine
```
- Start service
```powershell
Start-Service -Name "GPUStack"
Get-WmiObject Win32_Process -Filter "Name = 'gpustack.exe'" | Select-Object ProcessId, CommandLine
```
- Edit the configuration using nssm and restart the service
```powershell
nssm edit GPUStack
```

Restart the service after editing the configuration:
```powershell
Restart-Service -Name "GPUStack"
Get-Service -Name "GPUStack"
```
- View logs
You can view GPUStack logs using the following path and command:
```powershell
Get-Content "$env:APPDATA\gpustack\log\gpustack.log" -Tail 200 -Wait
```
- Uninstall
Run the following PowerShell command to uninstall GPUStack:
```powershell
Set-ExecutionPolicy Bypass -Scope Process -Force; & "$env:APPDATA\gpustack\uninstall.ps1"
```
Join Our Community
Please find more information about GPUStack at: https://gpustack.ai.
If you encounter any issues or have suggestions for GPUStack, feel free to join our Community for support from the GPUStack team and to connect with fellow users globally.
We are actively enhancing the GPUStack project and plan to introduce new features in the near future, including support for multimodal models, additional accelerators like AMD ROCm or Intel oneAPI, and more inference engines. Before getting started, we encourage you to follow and star our project on GitHub at gpustack/gpustack to receive instant notifications about all future releases. We welcome your contributions to the project.
About Us
GPUStack is brought to you by Seal, Inc., a team dedicated to enabling AI access for all. Our mission is to enable enterprises to use AI to conduct their business, and GPUStack is a significant step towards achieving that goal.
Quickly build your own LLMaaS platform with GPUStack! Start experiencing the ease of creating GPU clusters locally, running and using LLMs, and integrating them into your applications.