sensu-plugins-nvidia
Functionality
Plugin to collect metrics from your NVIDIA GPU.
metrics-nvidia.rb
Collects metrics by calling nvidia-smi
internally.
parameters
-
-s, --scheme
: The scheme to concatenate the metrics with (default:HOSTNAME.nvidia
)
metrics
Multiple GPUs are supported and labeled by BusID in Hex (e.g. nvidia.bus0x02.temperature.gpu
). For each PCI bus you will get the following metrics:
-
nvidia.busBUS_IN_HEX.fan.speed
: Speed of the card's fan (RPM) -
nvidia.busBUS_IN_HEX.memory.free
: Unused memory available to the card (MiB, mebibyte) -
nvidia.busBUS_IN_HEX.memory.total
: Total amount of memory available to the card (MiB, mebibyte) -
nvidia.busBUS_IN_HEX.memory.used
: Memory used by the card (MiB, mebibyte) -
nvidia.busBUS_IN_HEX.power.draw
: Power draw of the card (Watt) -
nvidia.busBUS_IN_HEX.temperature.gpu
: Temperature of the card (Degree Celsius) -
nvidia.busBUS_IN_HEX.utilization.gpu
: GPU utilization of the card (percent) -
nvidia.busBUS_IN_HEX.utilization.memory
: memory utilization of the card (percent)
If you do not get all of the listed values, you GPU probably does not support the feature. To check, you can query it yourself by running nvidia-smi --query-gpu=METRIC --format=csv
.
(e.g. nvidia-smi --query-gpu=power.draw --format=csv
)
Usage
To collect metrics with the scheme of your choice:
metrics-nvidia.rb --scheme YOUR_SCHEME_HERE
Installation
sensu-install --plugin sensu-plugins-nvidia
In order to use this plugin you will need the official NVIDIA command line interface nvidia-smi
which is distributed with their drivers.
For more help see Installation and Setup.