I was trying to launch a GPU VM instance on Google Cloud. But T4, L4, and V100 all failed with “exceeding resource limit”, which I took to mean that these GPU types were in heavy demand in my region.
With no other choice, I launched a VM instance with an old Nvidia Tesla P100 (I first used one about 5 years ago). Next, I needed to install its driver, but the installer reported errors:
*** Failed CC version check. ***
SYMLINK /tmp/selfgz26389/NVIDIA-Linux-x86_64-515.105.01/kernel/nvidia/nv-kernel.o
SYMLINK /tmp/selfgz26389/NVIDIA-Linux-x86_64-515.105.01/kernel/nvidia-modeset/nv-modeset-kernel.o
CONFTEST: hash__remap_4k_pfn
CONFTEST: set_pages_uc
CONFTEST: list_is_first
CONFTEST: set_memory_uc
...
cc: error: unrecognized command-line option '-ftrivial-auto-var-init=zero'
At first, I suspected that the GCC version was the problem. But after switching to gcc-10 and gcc-9, the error was still there.
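The option named in the error message can also be tested against each installed compiler directly, without running the whole driver installer. The following is only a minimal sketch: the helper name accepts_flag and the list of compiler names are my own assumptions, and it simply asks each compiler to syntax-check an empty C file with the flag.

```shell
#!/bin/sh
# accepts_flag COMPILER FLAG
# Hypothetical helper: succeeds if COMPILER exits with status 0 when asked to
# syntax-check an empty C file (/dev/null) with FLAG on the command line.
accepts_flag() {
    "$1" "$2" -fsyntax-only -x c /dev/null >/dev/null 2>&1
}

# Try every candidate compiler that is actually installed on this machine.
# The names below are assumptions; adjust them to what your system provides.
for compiler in cc gcc-9 gcc-10 gcc-12; do
    command -v "$compiler" >/dev/null 2>&1 || continue
    if accepts_flag "$compiler" -ftrivial-auto-var-init=zero; then
        echo "$compiler: accepts -ftrivial-auto-var-init=zero"
    else
        echo "$compiler: rejects -ftrivial-auto-var-init=zero"
    fi
done
```

A compiler that rejects the flag here would be expected to fail the driver build in the same way as in the log above.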
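Finally, I noticed that the driver for the Tesla P100 is very new (release date: 2023.3.30) and that this page mentioned “gcc-12”. That fits the error message: the -ftrivial-auto-var-init option was only added in GCC 12, so older compilers do not recognize it. Therefore, I upgraded GCC to 12: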
sudo apt install gcc-12
sudo ln -sf /usr/bin/gcc-12 /etc/alternatives/cc
Now the driver can be installed successfully.
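Before rerunning the installer, it may be worth confirming that cc really resolves to GCC 12 or newer. Here is a small sketch of such a check; the function names major_of and version_at_least are hypothetical, and it assumes that cc is GCC, so that "cc -dumpversion" prints a version such as 12 or 12.2.0.

```shell
#!/bin/sh
# Hypothetical helper: print the major part of a version string,
# e.g. "12.2.0" gives "12" and "12" stays "12".
major_of() {
    printf '%s\n' "${1%%.*}"
}

# Hypothetical helper: succeed if version string $1 has a major version
# greater than or equal to $2.
version_at_least() {
    [ "$(major_of "$1")" -ge "$2" ] 2>/dev/null
}

# Report on the default compiler only if one is installed.
if command -v cc >/dev/null 2>&1; then
    cc_version=$(cc -dumpversion)
    if version_at_least "$cc_version" 12; then
        echo "cc is version $cc_version: new enough for this driver build"
    else
        echo "cc is version $cc_version: older than 12, the build may fail"
    fi
fi
```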