cublas runtime error on torch.bmm() with CUDA10 and RTX2080Ti

🐛 Bug

When running torch.bmm(a, b) on CUDA tensors, I get the following error:

RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/

I'm running the appropriate CUDA for the card (plenty of other models have trained fine). The same CUDA version on a machine with a 1080Ti works fine.

To Reproduce

Here is a minimal script to reproduce the problem:

import torch

x = torch.randn(32, 60, 60)
torch.bmm(x, x)  # no problem without cuda

x = x.cuda()
(x ** 2).sum()   # no problem with other cuda operations
torch.bmm(x, x)  # crash here


Collecting environment information...
PyTorch version: 1.1.0
Is debug build: No
CUDA used to build PyTorch: 9.0.176

OS: CentOS Linux release 7.4.1708 (Core)
GCC version: (GCC) 6.3.0
CMake version: version

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 10.0.130
GPU models and configuration: GPU 0: GeForce RTX 2080 Ti
Nvidia driver version: 410.72
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] numpy==1.15.4
[pip3] torch==1.1.0
[pip3] torchsample==0.1.2
[pip3] torchvision==0.2.1
[conda] cuda80 1.0 0 soumith
[conda] mkl 2017.0.3 0
[conda] pytorch 0.2.0 py36h53baedd_4cu80 [cuda80] soumith
[conda] torch 1.1.0 <pip>
[conda] torchsample 0.1.2 <pip>
[conda] torchvision 0.1.9 py36h7584368_1 soumith
[conda] torchvision 0.2.1 <pip>
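One thing that stands out in the dump above is that PyTorch was built against CUDA 9.0.176 while the card is an RTX 2080 Ti. As a rough diagnostic (not a confirmed root cause for this report), the sketch below checks whether the CUDA version a PyTorch build was compiled with is new enough for a given compute capability. The table and the helper names `parse_cuda_version` and `build_supports_gpu` are my own assumptions for illustration, based on NVIDIA's release notes (Turing, capability 7.5, needs CUDA 10.0 or later); they are not part of PyTorch.

```python
# Minimum CUDA toolkit release able to target each compute capability.
# Assumption: hand-written from NVIDIA release notes, not taken from this report.
MIN_CUDA_FOR_CAPABILITY = {
    (6, 0): (8, 0),   # Pascal (e.g. P100)
    (6, 1): (8, 0),   # Pascal (e.g. GTX 1080 Ti)
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing (e.g. RTX 2080 Ti)
}


def parse_cuda_version(version):
    """Turn a string such as '9.0.176' into a (major, minor) tuple."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def build_supports_gpu(build_cuda, capability):
    """Return True/False if the table knows the capability, else None."""
    required = MIN_CUDA_FOR_CAPABILITY.get(tuple(capability))
    if required is None:
        return None  # capability not in this hypothetical table
    return parse_cuda_version(build_cuda) >= required


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.version.cuda and torch.cuda.is_available():
        capability = torch.cuda.get_device_capability(0)
        ok = build_supports_gpu(torch.version.cuda, capability)
        print("PyTorch built with CUDA:", torch.version.cuda)
        print("GPU 0 compute capability:", capability)
        print("Build new enough for this GPU:", ok)
    else:
        print("No CUDA-enabled PyTorch found; nothing to check.")
```

With the values from this report, `build_supports_gpu("9.0.176", (7, 5))` is False, while the same build passes for a 1080 Ti at capability (6, 1), which would be consistent with the 1080 Ti machine working. If that is what is going on here, installing a PyTorch binary built for CUDA 10.0 would be the thing to try.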


Answer by mrshenli

@pbloem please let us know if @ngimel's reply solves the problem.

