Setting up an external GPU on Ubuntu server

This post contains instructions on how to set up an external GPU (eGPU) on Ubuntu Server so that you can expose it to your containers. My specific use case was to leverage an old GPU to assist with processing for my LLMs and with Plex transcoding.

The hardware and software setup:

  • NUC with a Thunderbolt 3 port
  • AKiTiO Node Titan eGPU enclosure
  • NVIDIA GeForce GTX 1070 Ti
  • Ubuntu Server with Docker

Use cases:

  • Run open-webui and ollama containers that can leverage the GPU
  • Expose the GPU to the Plex Docker container to assist with transcoding

The steps to get there:

  1. Set up the eGPU hardware
  2. Authorize the Thunderbolt device in Ubuntu
  3. List available drivers for the GPU in Ubuntu
  4. Install the correct driver for the GPU in Ubuntu and reboot
  5. Confirm the driver is loaded and the device is working
  6. Install the nvidia-container-toolkit and reboot
  7. Confirm the nvidia-container-toolkit is working
  8. Run open-webui and ollama with GPU support enabled
  9. Run Plex with GPU transcoding support enabled

Now for the detailed instructions:

Set up the eGPU hardware

This was fairly easy and painless with the AKiTiO Node Titan. All I had to do was:

  1. open up the top cover,
  2. plug in the GPU (which required removing one of the brackets from the enclosure),
  3. plug in the enclosure power supply cables to the GPU,
  4. close the top cover,
  5. plug in the Thunderbolt cable (for the NUC, you need to use the port at the back with the Thunderbolt icon),
  6. power on the enclosure (button at the back) and that was it.

Setting up Thunderbolt in Ubuntu

Thunderbolt devices, just like USB devices, must be authorized before the operating system can use them.

In Ubuntu, this is done with the boltctl utility: list the device, then enroll it so that it is always authorized (making the change persistent across reboots).

Use this command to list the Thunderbolt devices on your host and note the device uuid:

Bash
boltctl

 ● AKiTiO Node Titan
   ├─ type:          peripheral
   ├─ name:          Node Titan
   ├─ vendor:        AKiTiO
   ├─ uuid:          c2010000-0082-8c0e-8359-11de2c441109
   ├─ generation:    Thunderbolt 3
   ├─ status:        connected
   │  ├─ domain:     d0010000-0000-a508-a219-32cad4134017
   │  ├─ rx speed:   40 Gb/s = 2 lanes * 20 Gb/s
   │  ├─ tx speed:   40 Gb/s = 2 lanes * 20 Gb/s
   │  └─ authflags:  none
   ├─ connected:     Sun Nov 24 09:55:57 2024
   └─ stored:        no

Next, authorize the device, then enroll it with the auto policy so that it stays authorized across reboots:

Bash
sudo boltctl authorize c2010000-0082-8c0e-8359-11de2c441109
sudo boltctl enroll --policy auto c2010000-0082-8c0e-8359-11de2c441109

 ● AKiTiO Node Titan
   ├─ type:          peripheral
   ├─ name:          Node Titan
   ├─ vendor:        AKiTiO
   ├─ uuid:          c2010000-0082-8c0e-8359-11de2c441109
   ├─ dbus path:     /org/freedesktop/bolt/devices/c2010000_0082_8c0e_8359_11de2c441109
   ├─ generation:    Thunderbolt 3
   ├─ status:        authorized
   │  ├─ domain:     d0010000-0000-a508-a219-32cad4134017
   │  ├─ parent:     d0010000-0000-a508-a219-32cad4134017
   │  ├─ syspath:    /sys/devices/pci0000:00/0000:00:1c.4/0000:04:00.0/0000:05:00.0/0000:06:00.0/domain0/0-0/0-1
   │  ├─ rx speed:   40 Gb/s = 2 lanes * 20 Gb/s
   │  ├─ tx speed:   40 Gb/s = 2 lanes * 20 Gb/s
   │  └─ authflags:  none
   ├─ authorized:    Sun Nov 24 10:18:53 2024
   ├─ connected:     Sun Nov 24 10:06:23 2024
   └─ stored:        Sun Nov 24 10:22:07 2024
      ├─ policy:     auto
      └─ key:        no

Note that if you use Secure Boot, this may require additional steps to add the key to your Secure Boot setup.
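
If you're not sure whether Secure Boot is enabled on your machine, you can check with mokutil (assuming the package is installed):

Bash
mokutil --sb-state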

You can verify that the device is authorized by running the boltctl command and making sure the circle icon is green instead of yellow.

You can also double-check that the PCI device is now visible (this may require a reboot):

Bash
sudo lspci
Bash
...
08:04.0 PCI bridge: Intel Corporation JHL7540 Thunderbolt 3 Bridge [Titan Ridge DD 2018] (rev 06)
09:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070 Ti] (rev a1)
09:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
...
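
You can also check which kernel driver is currently bound to the card; before the NVIDIA driver is installed this will typically be nouveau:

Bash
# the bus address comes from the lspci output above, adjust it to match yours
sudo lspci -nnk -s 09:00.0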

Installing NVIDIA drivers

This is the step I struggled with the most: if you simply follow the default instructions and let Ubuntu install the drivers it thinks you need, things just don't work.

You have to list the available drivers, then manually select the server driver instead of letting Ubuntu decide which one to install.

Bash
sudo ubuntu-drivers list
Bash
nvidia-driver-560, (kernel modules provided by linux-modules-nvidia-560-generic)
nvidia-driver-535-server, (kernel modules provided by linux-modules-nvidia-535-server-generic)
Bash
sudo apt install nvidia-driver-535-server

You should reboot at this stage; once everything is back up, you should see that the GPU is available and using the NVIDIA driver:

Bash
sudo lshw -C display
Bash
  ...
  *-display
       description: VGA compatible controller
       product: GP104 [GeForce GTX 1070 Ti]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:09:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress vga_controller bus_master cap_list rom
       configuration: driver=nvidia latency=0
       resources: irq:170 memory:c6000000-c6ffffff memory:b0000000-bfffffff memory:c0000000-c1ffffff ioport:3000(size=128) memory:c7000000-c707ffff

You can confirm everything is working by running the nvidia-smi command:

Bash
nvidia-smi
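
For a more compact check, you can also query just the device name and driver version:

Bash
nvidia-smi --query-gpu=name,driver_version --format=csv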

Installing nvidia-container-toolkit

Now that the device is installed correctly, we need to make it available to Docker containers, which requires setting up the nvidia-container-toolkit.

Configure the production repository:

Bash
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

Update the packages list from the repository:

Bash
sudo apt-get update

Install the NVIDIA Container Toolkit packages:

Bash
sudo apt-get install -y nvidia-container-toolkit

Configure the toolkit:

Bash
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
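
Since the container examples below rely on Docker's --gpus flag and the NVIDIA runtime, you will most likely also want the Docker integration step from NVIDIA's install guide, which updates /etc/docker/daemon.json and restarts the daemon:

Bash
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker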

Verify devices are configured:

Bash
nvidia-ctk cdi list
Bash
INFO[0000] Found 3 CDI devices
nvidia.com/gpu=0
nvidia.com/gpu=GPU-154b7fc8-0ab0-da07-5527-60f8f24e92c7
nvidia.com/gpu=all

At this stage, I would recommend one last reboot before moving on to the next step.
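
After the reboot, a quick way to confirm that Docker can see the GPU is to run a throwaway container and call nvidia-smi inside it (this is the sample workload from NVIDIA's documentation):

Bash
sudo docker run --rm --gpus all ubuntu nvidia-smi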

Running containers with the GPU enabled

Now everything is ready to start running our containers with the GPU device enabled.

For Plex:

To run the Plex container with hardware transcoding enabled on your NVIDIA GPU, make sure the following environment variables are set in your docker run command or your docker compose file:

YAML
- NVIDIA_VISIBLE_DEVICES=all
- NVIDIA_DRIVER_CAPABILITIES=compute,video,utility

Example of a complete docker-compose.yml file:

YAML
---
services:
  plex:
    image: lscr.io/linuxserver/plex:latest
    container_name: plex
    network_mode: bridge
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=Europe/Paris
      - VERSION=docker
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,video,utility
    volumes:
      - plex-database:/config
      - plex-transcode:/transcode
      - plex-data:/media
    restart: unless-stopped
    ports:
      - 32400:32400/tcp
      - 3005:3005/tcp
      - 8324:8324/tcp
      - 32469:32469/tcp
      - 1900:1900/udp
      - 32410:32410/udp
      - 32412:32412/udp
      - 32413:32413/udp
      - 32414:32414/udp
volumes:
  plex-database:
    external: true
    name: plex-database
  plex-transcode:
    external: true
    name: plex-transcode
  plex-data:
    external: true
    name: plex-data
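
Depending on your Docker configuration, the environment variables alone may not be enough; you may also need to hand the GPU to the container explicitly, either with runtime: nvidia or with a device reservation on the plex service, for example:

YAML
    # added under the plex service, alongside the existing keys
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]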

Note that in this example, I'm using my QNAP NAS for the Docker volumes, so external is set to true.
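
If you are not using pre-existing volumes, you can drop external: true and the name: entries and let Docker create the volumes for you:

YAML
volumes:
  plex-database:
  plex-transcode:
  plex-data: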

For ollama:

To run ollama with GPU support, use the following command:

Bash
sudo docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --restart always --name ollama ollama/ollama
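
Once the container is up, you can pull a model and ask it something to confirm everything works end to end:

Bash
# llama3.2 is just an example, use whichever model you want
sudo docker exec -it ollama ollama run llama3.2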

For open-webui:

To run open-webui with GPU support, use the following command:

Bash
docker run -d -p 8080:8080 --gpus all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda 

You can use watch nvidia-smi (or nvidia-smi -l 1, or nvtop) to monitor GPU usage on your NUC while transcoding in Plex or generating answers in open-webui with your models.
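
For example, to refresh the view every second:

Bash
watch -n 1 nvidia-smi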
