The problem encountered is as follows:
Traceback (most recent call last):
  File "run_warmup_a.py", line 431, in <module>
    main()
  File "run_warmup_a.py", line 142, in main
    return main_worker(args, logger)
  File "run_warmup_a.py", line 207, in main_worker
    loss = train(lb_train_loader, model, ema_m, optimizer, scheduler, epoch, args, logger, criterion)
  File "run_warmup_a.py", line 368, in train
    scaler.scale(loss).backward()
  File "/home/algroup/anaconda3/envs/chenao/lib/python3.7/site-packages/torch/_tensor.py", line 489, in backward
    self, gradient, retain_graph, create_graph, inputs=inputs
  File "/home/algroup/anaconda3/envs/chenao/lib/python3.7/site-packages/torch/autograd/__init__.py", line 199, in backward
    allow_unreachable=True, accumulate_grad=True)  # Calls into the C++ engine to run the backward pass
RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DFALT_TENSOR_OP)`
Solution
Run: unset LD_LIBRARY_PATH
This error often indicates that the dynamic linker is picking up a CUDA/cuBLAS library from LD_LIBRARY_PATH that does not match the version PyTorch was built against; clearing the variable lets the libraries bundled with the PyTorch install be used instead.
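A minimal sketch of applying the fix in the current shell session, assuming the training script is launched as shown in the traceback (the `run_warmup_a.py` invocation at the end is illustrative):

```shell
# Inspect what LD_LIBRARY_PATH currently points to; a stale or mismatched
# CUDA/cuBLAS library on this path can shadow the one bundled with PyTorch.
echo "before: LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-<unset>}"

# Clear it for this session only (does not modify ~/.bashrc or the system).
unset LD_LIBRARY_PATH

# Verify it is gone, then rerun the training script.
echo "after: LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-<unset>}"
# python run_warmup_a.py
```

Note that `unset` only affects the current shell; to make the change permanent you would remove the export from your shell startup file instead.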
Reference
[Solved] RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasSgemm( handle, opa, o