Monitoring NVIDIA GPU Usage on Ubuntu

This article explains how to monitor NVIDIA GPU usage on Ubuntu. Whether you play games, run graphics-intensive applications, or train machine learning models, knowing how to check your GPU’s utilization, memory consumption, and temperature helps you spot bottlenecks and diagnose problems. We cover two methods: the command-line nvidia-smi utility, suited to quick checks and scripting, and the interactive nvtop monitor.

In this tutorial you will learn:

  • How to use the NVIDIA System Management Interface (nvidia-smi) to monitor GPU usage
  • How to install and utilize nvtop for a more interactive monitoring experience
Software Requirements and Linux Command Line Conventions

Category      Requirements, Conventions or Software Version Used
System        Ubuntu with NVIDIA Graphics Card
Software      NVIDIA Drivers, nvidia-smi, nvtop
Other         Terminal access
Conventions   # – requires given Linux commands to be executed with root privileges, either directly as the root user or by use of the sudo command
              $ – requires given Linux commands to be executed as a regular non-privileged user

Monitoring Methods

Below are the methods to monitor the performance and usage of NVIDIA GPUs on Ubuntu:

  1. Using NVIDIA System Management Interface (nvidia-smi): The NVIDIA System Management Interface, known as nvidia-smi, is a powerful command-line utility included with NVIDIA GPU drivers. It’s designed to provide vital statistics about the GPU’s performance, including current utilization, memory consumption, GPU temperature, and more. To utilize this tool, ensure that your system has the NVIDIA drivers installed. For a detailed guide on installing NVIDIA drivers on Ubuntu, refer to our specialized installation guide.
    $ nvidia-smi
    or
    $ watch -n 1 nvidia-smi

    The first command prints a one-time snapshot of the current GPU stats, which is handy for quick checks. The second uses watch to rerun nvidia-smi every second, which gives continuous, near real-time observation.

    Using NVIDIA System Management Interface (nvidia-smi)

    In this example output, nvidia-smi reports a GPU temperature of 50°C, performance state P0, a power draw of 140W against a 370W limit, memory usage of 1471MiB out of 10240MiB, and overall GPU utilization of 27%. It also lists the active processes, such as the system’s Xorg server and user applications like Firefox, together with the GPU memory each one uses, which shows which applications are consuming GPU resources.
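
For scripting, nvidia-smi can print selected fields as CSV instead of the full table. The following is a minimal sketch, assuming a driver whose nvidia-smi supports the --query-gpu option. The summarize_gpus helper is a hypothetical name introduced here for illustration and is not part of nvidia-smi.

```shell
#!/bin/sh
# summarize_gpus (hypothetical helper): reads CSV rows of the form
#   index, name, temperature, utilization, memory used, memory total
# on stdin and prints one readable line per GPU.
summarize_gpus() {
    awk -F', ' '{ printf "GPU %s (%s): %s C, %s%% util, %s/%s MiB\n", $1, $2, $3, $4, $5, $6 }'
}

# Query only the fields of interest; noheader and nounits keep rows easy to parse.
# The query is skipped when nvidia-smi is not installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi \
        --query-gpu=index,name,temperature.gpu,utilization.gpu,memory.used,memory.total \
        --format=csv,noheader,nounits | summarize_gpus
fi
```

For a row such as "0, NVIDIA GeForce RTX 3080, 50, 27, 1471, 10240", the helper prints "GPU 0 (NVIDIA GeForce RTX 3080): 50 C, 27% util, 1471/10240 MiB".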



  2. Monitoring with nvtop: nvtop stands out as an interactive monitoring tool, akin to htop but focused on NVIDIA GPUs. It offers an in-depth view of the processes utilizing the GPU, detailed memory usage statistics, and other critical metrics. To start with nvtop, installation is required. This can be easily done via the package manager.
    $ sudo apt install nvtop
    $ nvtop

    After installation, running nvtop presents a comprehensive, user-friendly interface for monitoring your GPU’s activity in real time, offering insights into the performance and usage of your NVIDIA graphics card.

    Monitoring NVIDIA GPU with nvtop

    The terminal output showcases the nvtop utility, providing a graphical representation of an NVIDIA GPU’s performance metrics. It highlights Device 0 [NVIDIA GeForce RTX 3080] with its core GPU clock at 1800MHz and memory clock at 9501MHz. The temperature is stable at 44°C, and the power draw is at 94W from a possible 370W. In the graphical section, the GPU utilization and memory utilization percentages are visualized over time. Below, the processes are listed with details such as process ID, user, device, type, individual GPU and CPU usage, host memory usage, and the specific command/process name.
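
The two methods can be combined in a small launcher that prefers nvtop and falls back to nvidia-smi when nvtop is not installed. This is only a sketch under that assumption. The first_available helper is a hypothetical name, and the fallback simply reuses the watch command shown earlier.

```shell
#!/bin/sh
# first_available (hypothetical helper): print the first argument that
# names a command found on PATH; return non-zero if none is found.
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1
}

# Prefer nvtop, fall back to nvidia-smi.
monitor=$(first_available nvtop nvidia-smi) || monitor=""

# Interactive monitors need a terminal, so only launch when stdout is one.
if [ -t 1 ]; then
    case "$monitor" in
        nvtop)      nvtop ;;
        nvidia-smi) watch -n 1 nvidia-smi ;;
        *)          echo "Neither nvtop nor nvidia-smi was found; install nvtop or the NVIDIA driver first." >&2 ;;
    esac
fi
```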

Conclusion

Understanding and monitoring your NVIDIA GPU’s usage on Ubuntu can significantly enhance your experience, whether for gaming, professional applications, or computational tasks. By employing tools like nvidia-smi and nvtop, you gain valuable insights into your GPU’s performance, ensuring optimal usage and troubleshooting potential issues. Both methods discussed offer unique advantages, catering to different user needs and preferences for system monitoring.

Frequently Asked Questions about Monitoring NVIDIA GPU Usage on Ubuntu

1. How do I check if my Ubuntu system recognizes my NVIDIA GPU?
You can use the ‘lspci | grep -i nvidia’ command to verify that the system detects your GPU; the -i flag makes the match case-insensitive.

2. Can I use nvidia-smi to manage GPU fan speeds?
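nvidia-smi is primarily a monitoring and management tool and is not the usual way to set fan speeds. Manual fan control is typically done through nvidia-settings after enabling the ‘Coolbits’ option in the X server configuration.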

3. Is there a way to log GPU usage over time?
Yes, you can use ‘nvidia-smi --query-gpu=utilization.gpu --format=csv --loop-ms=1000 > gpu_usage.log’ to record the utilization once per second. Each option starts with two hyphens.
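
A log is easier to analyze when each sample carries a timestamp. The sketch below assumes the log is written with the timestamp and utilization.gpu query fields in headerless, unitless CSV form. The average_util helper is a hypothetical name, and gpu_usage.log is the file name from the answer above.

```shell
#!/bin/sh
# average_util (hypothetical helper): print the mean of the second CSV
# column (GPU utilization in percent) of the given log file.
average_util() {
    awk -F', ' '{ sum += $2; n++ } END { if (n > 0) printf "%.1f\n", sum / n }' "$1"
}

# To record one timestamped sample per second (stop with Ctrl+C):
#   nvidia-smi --query-gpu=timestamp,utilization.gpu \
#              --format=csv,noheader,nounits --loop-ms=1000 >> gpu_usage.log

# Summarize an existing log, if there is one.
if [ -f gpu_usage.log ]; then
    average_util gpu_usage.log
fi
```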

4. Does monitoring GPU usage impact system performance?
Monitoring tools use minimal resources and typically do not significantly affect overall performance.

5. Can I monitor multiple GPUs simultaneously?
Both nvidia-smi and nvtop support monitoring multiple GPUs if they are installed in your system.

6. How do I interpret the ‘P0’ state in nvidia-smi?
‘P0’ is the highest performance state for NVIDIA GPUs. States range from P0 (maximum performance) to P12 (minimum performance), and the driver moves between them according to load.

7. What does the ‘Persistence-M’ column indicate in nvidia-smi?
It shows whether persistence mode is enabled. When it is on, the NVIDIA driver stays loaded even when no application is using the GPU, which reduces the startup delay for GPU programs.

8. Are there any GUI tools for monitoring NVIDIA GPU on Ubuntu?
Yes, tools like ‘GreenWithEnvy’ provide a graphical interface for monitoring and tweaking NVIDIA GPUs.

9. Can nvtop show which processes are using the GPU?
Absolutely, nvtop displays the list of processes utilizing the GPU along with their memory usage.

10. What does ‘Volatile GPU Utilization’ mean in nvidia-smi?
It is the percentage of time over the past sample period during which one or more kernels were executing on the GPU. In the nvidia-smi table it appears in the ‘GPU-Util’ column.

11. How can I find out the total memory of my NVIDIA GPU?
The total memory appears in the ‘Memory-Usage’ column of the nvidia-smi output as the second value of the used / total pair, measured in MiB (mebibytes).

12. Can I use nvtop to monitor GPU temperature?
Yes, nvtop provides real-time data on GPU temperature along with other statistics.

13. How often does nvidia-smi update its output?
By default, nvidia-smi prints a single snapshot and exits. For continuous updates, use the ‘watch’ command or the built-in loop option, for example ‘nvidia-smi -l 1’ to refresh every second.

14. Is it possible to monitor GPU power efficiency?
nvidia-smi reports the current power draw and the power limit in the ‘Pwr:Usage/Cap’ column. Comparing these values with GPU utilization gives a rough gauge of power efficiency.
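
As a rough illustration of the last point, the power draw can be expressed as a percentage of the power limit. This sketch assumes the power.draw and power.limit query fields are reported for your GPU; some cards return "[N/A]" instead. The power_percent helper is a hypothetical name introduced here.

```shell
#!/bin/sh
# power_percent (hypothetical helper): given power draw and power limit
# in watts, print the draw as a percentage of the limit, or "n/a" when
# the limit is missing or not a positive number.
power_percent() {
    awk -v draw="$1" -v limit="$2" 'BEGIN {
        if (limit + 0 > 0) printf "%.1f\n", 100 * draw / limit
        else print "n/a"
    }'
}

# Print one line per GPU; skipped when nvidia-smi is not installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=power.draw,power.limit --format=csv,noheader,nounits |
    while IFS=', ' read -r draw limit; do
        echo "Power: ${draw} W of ${limit} W ($(power_percent "$draw" "$limit")%)"
    done
fi
```

With the figures from the nvidia-smi example earlier in this article, power_percent 140 370 prints 37.8.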


