After I changed my dataset for my code, the training failed:

/tmp/pip-req-build-_tx3iysr/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:310: operator(): block: [0,0,0], thread: [59,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/tmp/pip-req-build-_tx3iysr/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:310: operator(): block: [0,0,0], thread: [60,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/tmp/pip-req-build-_tx3iysr/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:310: operator(): block: [0,0,0], thread: [61,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/tmp/pip-req-build-_tx3iysr/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:310: operator(): block: [0,0,0], thread: [62,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/tmp/pip-req-build-_tx3iysr/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:310: operator(): block: [0,0,0], thread: [63,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
Traceback (most recent call last):
  File "train.py", line 337, in <module>
    train(args, train_loader, eval_loader)
  File "train.py", line 189, in train
    sounds = aug(sounds)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 881, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/sanbai/birds_sound_classification/utils/augment.py", line 13, in forward
    image = (image - image.mean()) / image.std()
RuntimeError: CUDA error: device-side assert triggered 

It’s terribly hard to find out the reason for this common error “RuntimeError: CUDA error: device-side assert triggered”. But someone on Github recommends a method: adding CUDA_LAUNCH_BLOCKING=1 before the program.

Now the real error behind RuntimeError shows up: it’s the wrong number of categories I set to the model.