diff --git a/doc/source/operations/index.rst b/doc/source/operations/index.rst
index 818b5fa76..3ae8ef002 100644
--- a/doc/source/operations/index.rst
+++ b/doc/source/operations/index.rst
@@ -33,6 +33,7 @@ Kubernetes Operation
    k8s_persistent_vol_claims
    k8s_sriov_config
    k8s_qat_device_plugin
+   k8s_gpu_device_plugin
 
 -------------------
 OpenStack Operation
diff --git a/doc/source/operations/k8s_gpu_device_plugin.rst b/doc/source/operations/k8s_gpu_device_plugin.rst
new file mode 100644
index 000000000..57233176f
--- /dev/null
+++ b/doc/source/operations/k8s_gpu_device_plugin.rst
@@ -0,0 +1,77 @@
+================================================
+Kubernetes Intel GPU Device Plugin Configuration
+================================================
+
+This document describes how to enable the Intel GPU device plugin in StarlingX
+and schedule pods on nodes with an Intel GPU.
+
+------------------------------
+Enable Intel GPU device plugin
+------------------------------
+
+You can pre-install the ``intel-gpu-plugin`` daemonset as follows:
+
+#. Launch the ``intel-gpu-plugin`` daemonset.
+
+   Add the following lines to the ``localhost.yaml`` file before running the
+   Ansible bootstrap playbook to configure the system.
+
+   ::
+
+     k8s_plugins:
+       intel-gpu-plugin: intelgpu=enabled
+
+#. Assign the ``intelgpu`` label to each node that should have the Intel GPU
+   plugin enabled. This makes any GPU devices on that node available for
+   scheduling to containers. The following example assigns the ``intelgpu``
+   label to the compute-0 node.
+
+   ::
+
+     $ NODE=compute-0
+     $ system host-lock $NODE
+     $ system host-label-assign $NODE intelgpu=enabled
+     $ system host-unlock $NODE
+
+#. After the node becomes available, verify that the GPU device plugin is
+   registered and that the GPU devices on the node have been discovered and
+   reported.
+
+   ::
+
+     $ kubectl describe node $NODE | grep gpu.intel.com
+       gpu.intel.com/i915: 1
+       gpu.intel.com/i915: 1
+
+-------------------------------------
+Schedule pods on nodes with Intel GPU
+-------------------------------------
+
+Add ``gpu.intel.com/i915`` under ``resources.limits`` in your container
+specification to request an available GPU device for your container. A
+complete, illustrative manifest is included at the end of this document.
+
+::
+
+  ...
+  spec:
+    containers:
+    - name: ...
+      ...
+      resources:
+        limits:
+          gpu.intel.com/i915: 1
+
+Pods with this limit are scheduled on nodes that have available Intel GPU
+devices. A GPU device is allocated to the container, and the node's available
+GPU devices are updated accordingly.
+
+::
+
+  $ kubectl describe node $NODE | grep gpu.intel.com
+    gpu.intel.com/i915: 1
+    gpu.intel.com/i915: 0
+
+For more details, refer to the following examples:
+
+* `Kubernetes manifest file example `_
+* `Scheduling pods on nodes with Intel GPU example `_
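+
+In addition to the examples above, the following minimal pod manifest is an
+illustrative sketch of a workload that requests a single Intel GPU device. The
+pod name, container name, and image are placeholders rather than StarlingX
+defaults; replace them with your own workload. The ``ls`` command simply lists
+the GPU device nodes that the plugin is expected to expose under ``/dev/dri``
+inside the container.
+
+::
+
+  apiVersion: v1
+  kind: Pod
+  metadata:
+    name: gpu-test-pod            # hypothetical name, for illustration only
+  spec:
+    restartPolicy: Never
+    containers:
+    - name: gpu-test              # hypothetical container name
+      image: ubuntu:22.04         # placeholder image; substitute your GPU workload image
+      command: ["ls", "-l", "/dev/dri"]
+      resources:
+        limits:
+          gpu.intel.com/i915: 1   # request one Intel GPU device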
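+
+Assuming the manifest above is saved as ``gpu-test-pod.yaml`` (a hypothetical
+file name), it can be applied and inspected with standard ``kubectl`` commands.
+If no node has a free ``gpu.intel.com/i915`` resource, the pod remains in the
+``Pending`` state until a device becomes available.
+
+::
+
+  $ kubectl apply -f gpu-test-pod.yaml
+  $ kubectl get pod gpu-test-pod -o wide
+  $ kubectl logs gpu-test-pod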