What is GPUStack?
GPUStack is an open-source GPU cluster manager for running Large Language Models (LLMs). GPUStack allows you to create a unified cluster from any brand of GPUs in Apple MacBooks, Windows PCs, and Linux servers. Administrators can deploy LLMs from popular repositories such as Hugging Face. Developers can then access LLMs just as easily as accessing public LLM services from vendors like OpenAI or Microsoft Azure.
For more details about GPUStack, visit:
Introducing GPUStack: https://gpustack.ai/introducing-gpustack
GitHub repo: https://github.com/gpustack/gpustack
User guide: https://docs.gpustack.ai
Getting Started with GPUStack
GPUStack requires Python 3.10 or later.
Installation
Linux or macOS
GPUStack provides a script that installs it as a service on systemd- or launchd-based systems. To install GPUStack this way, run:

```shell
curl -sfL https://get.gpustack.ai | sh -
```
You have now deployed and started the GPUStack server, which also serves as the first worker node. You can access the GPUStack UI via http://myserver (replace myserver with the IP address or domain name of the host where you installed GPUStack).
Log in to GPUStack with the username admin and the default password. In the default setup, you can retrieve the password with:

```shell
cat /var/lib/gpustack/initial_admin_password
```
To add additional worker nodes and form a GPUStack cluster, run the following command on each worker node:

```shell
curl -sfL https://get.gpustack.ai | sh - --server-url http://myserver --token mytoken
```
Replace http://myserver with your GPUStack server URL and mytoken with the secret token used for adding workers. In the default setup, retrieve the token on the GPUStack server with:

```shell
cat /var/lib/gpustack/token
```
Alternatively, follow the instructions shown in the GPUStack UI to add workers.
Windows
Run PowerShell as administrator, then run the following command to install GPUStack:

```powershell
Invoke-Expression (Invoke-WebRequest -Uri "https://get.gpustack.ai" -UseBasicParsing).Content
```
You can access the GPUStack UI via http://myserver (replace myserver with the IP address or domain name of the host where you installed GPUStack).
Log in to GPUStack with the username admin and the default password. In the default setup, you can retrieve the password with:

```powershell
Get-Content -Path (Join-Path -Path $env:APPDATA -ChildPath "gpustack\initial_admin_password") -Raw
```
Optionally, you can add extra workers to form a GPUStack cluster by running the following command on other nodes:

```powershell
Invoke-Expression "& { $((Invoke-WebRequest -Uri "https://get.gpustack.ai" -UseBasicParsing).Content) } --server-url http://myserver --token mytoken"
```
In the default setup, you can run the following to get the token used for adding workers:

```powershell
Get-Content -Path (Join-Path -Path $env:APPDATA -ChildPath "gpustack\token") -Raw
```
For other installation scenarios, please refer to our installation documentation at: https://docs.gpustack.ai/docs/quickstart
Serving LLMs
As an LLM administrator, you can log in to GPUStack as the default system admin, navigate to Resources to monitor your GPU status and capacity, and then go to Models to deploy any open-source LLM into the GPUStack cluster. You can then provide these LLMs to regular users for integration into their applications, making efficient use of your existing resources and delivering stable LLM services for a variety of needs and scenarios.
- Access GPUStack to deploy the LLMs you need. Choose models from Hugging Face (only the GGUF format is currently supported) or the Ollama Library, download them to your local environment, and run them.
- GPUStack automatically schedules the model to run on an appropriate worker.
- You can manage and maintain LLMs by monitoring API requests, token consumption, token throughput, resource utilization, and more. This helps you decide whether to scale out or upgrade LLMs to keep the service stable.
Integrating with your applications
As an AI application developer, you can log in to GPUStack as a regular user and navigate to Playground from the menu, where you can interact with the LLM through the UI.

Next, visit API Keys to generate and save your API key. Return to Playground to customize your LLM by adjusting the system prompt, adding few-shot learning examples, or tuning prompt parameters. When you're done, click View Code and select your preferred code format (curl, Python, Node.js) along with the API key. Use this code in your applications to communicate with your private LLMs.
You can now access the OpenAI-compatible API. For example, with curl:
```shell
export GPUSTACK_API_KEY=myapikey
curl http://myserver/v1-openai/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $GPUSTACK_API_KEY" \
  -d '{
        "model": "llama3",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "Hello!"}
        ],
        "stream": true
      }'
```
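The same call can be driven from application code. The sketch below uses only the Python standard library to build the request body from the curl example and to parse the Server-Sent Events lines that an OpenAI-compatible endpoint emits when `"stream": true`; the URL and API key are the same placeholders as above, and the function names are illustrative, not part of GPUStack.

```python
import json
import urllib.request

# Placeholders from the curl example above -- replace with your deployment.
API_URL = "http://myserver/v1-openai/chat/completions"
API_KEY = "myapikey"

def build_payload(user_message, model="llama3"):
    """Build the same request body as the curl example."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "stream": True,
    }

def parse_sse_line(line):
    """Extract the content delta from one streamed 'data: ...' line.

    With "stream": true the server replies with Server-Sent Events in the
    standard OpenAI streaming format: each event carries a JSON chunk whose
    choices[0].delta may hold a piece of the generated text. Returns None
    for the '[DONE]' sentinel and for lines without a content delta.
    """
    if not line.startswith("data: "):
        return None
    data = line[len("data: "):].strip()
    if data == "[DONE]":
        return None
    chunk = json.loads(data)
    return chunk["choices"][0]["delta"].get("content")

def stream_chat(user_message):
    """POST the request and yield content pieces as they arrive."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(user_message)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        for raw in resp:
            piece = parse_sse_line(raw.decode())
            if piece:
                yield piece

# Local demonstration of the parser (no server needed):
sample = 'data: {"choices": [{"delta": {"content": "Hello"}}]}'
print(parse_sse_line(sample))  # -> Hello
```

Against a running GPUStack server, iterating over `stream_chat("Hello!")` would print the reply incrementally as the model generates it.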
Manage GPUStack
For macOS
In macOS, GPUStack runs as a launchd service. Use launchctl to manage the GPUStack service:
- View the configuration

```shell
sudo launchctl print system/ai.gpustack
```
- Stop service

```shell
sudo launchctl unload /Library/LaunchDaemons/ai.gpustack.plist
ps -ef | grep gpustack
```
- Start service

```shell
sudo launchctl load /Library/LaunchDaemons/ai.gpustack.plist
ps -ef | grep gpustack
```
- Edit configuration and restart service

```shell
sudo launchctl unload /Library/LaunchDaemons/ai.gpustack.plist
sudo vim /Library/LaunchDaemons/ai.gpustack.plist
sudo launchctl load /Library/LaunchDaemons/ai.gpustack.plist
ps -ef | grep gpustack
```
- View logs

You can view GPUStack logs using the following path and command:

```shell
tail -200f /var/log/gpustack.log
```
- Uninstall

Run the following command to uninstall GPUStack:

```shell
/var/lib/gpustack/uninstall.sh
```
For Linux
In Linux, GPUStack runs as a systemd service. Use systemctl to manage the GPUStack service:
- View the configuration

```shell
sudo cat /etc/systemd/system/gpustack.service
```
- Stop service

```shell
sudo systemctl stop gpustack
ps -ef | grep gpustack
```
- Start service

```shell
sudo systemctl start gpustack
ps -ef | grep gpustack
```
- Edit configuration and restart service

```shell
sudo vim /etc/systemd/system/gpustack.service
sudo systemctl daemon-reload
sudo systemctl restart gpustack
ps -ef | grep gpustack
```
- View logs

You can view GPUStack logs using the following path and command:

```shell
tail -200f /var/log/gpustack.log
```
- Uninstall

Run the following command to uninstall GPUStack:

```shell
/var/lib/gpustack/uninstall.sh
```
For Windows
In Windows, you can use PowerShell to manage the GPUStack service:
- View the configuration

```powershell
Get-WmiObject Win32_Process -Filter "Name = 'gpustack.exe'"
```
- Stop service

```powershell
Stop-Service -Name "GPUStack"
Get-WmiObject Win32_Process -Filter "Name = 'gpustack.exe'" | Select-Object ProcessId, CommandLine
```
- Start service

```powershell
Start-Service -Name "GPUStack"
Get-WmiObject Win32_Process -Filter "Name = 'gpustack.exe'" | Select-Object ProcessId, CommandLine
```
- Edit the configuration using nssm and restart the service

```powershell
nssm edit GPUStack
```
Restart the service after editing the configuration:

```powershell
Restart-Service -Name "GPUStack"
Get-Service -Name "GPUStack"
```
- View logs

You can view GPUStack logs using the following path and command:

```powershell
Get-Content "$env:APPDATA\gpustack\log\gpustack.log" -Tail 200 -Wait
```
- Uninstall

Run the following PowerShell command to uninstall GPUStack:

```powershell
Set-ExecutionPolicy Bypass -Scope Process -Force; & "$env:APPDATA\gpustack\uninstall.ps1"
```
Join Our Community
Please find more information about GPUStack at: https://gpustack.ai.
If you encounter any issues or have suggestions for GPUStack, feel free to join our Community for support from the GPUStack team and to connect with fellow users globally.
We are actively enhancing the GPUStack project and plan to introduce new features in the near future, including support for multimodal models, additional accelerator stacks such as AMD ROCm and Intel oneAPI, and more inference engines. Before getting started, we encourage you to follow and star our project on GitHub at gpustack/gpustack to receive instant notifications about all future releases. We welcome your contributions to the project.
About Us
GPUStack is brought to you by Seal, Inc., a team dedicated to enabling AI access for all. Our mission is to enable enterprises to use AI to conduct their business, and GPUStack is a significant step towards achieving that goal.
Quickly build your own LLMaaS platform with GPUStack! Start experiencing the ease of creating GPU clusters locally, running and using LLMs, and integrating them into your applications.