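An M4Singer environment that was set up on Ubuntu 22.04 with an RTX 2060 was copied to a machine running Ubuntu 22.04 with an RTX 4060 Ti 16 GB. Running it there fails with the following error: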
Traceback (most recent call last):
   File "data_gen/tts/bin/binarize.py", line 20, in <module>
     binarize()
   File "data_gen/tts/bin/binarize.py", line 15, in binarize
     binarizer_cls().process()
   File "/home/yeqiang/下载/ai/M4Singer/code/data_gen/singing/binarize.py", line 98, in process
     self.process_data('valid')
   File "/home/yeqiang/下载/ai/M4Singer/code/data_gen/tts/base_binarizer.py", line 131, in process_data
     voice_encoder = VoiceEncoder().cuda()
   File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/resemblyzer/voice_encoder.py", line 40, in __init__
     self.to(device)
   File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/nn/modules/module.py", line 607, in to
     return self._apply(convert)
   File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/nn/modules/module.py", line 354, in _apply
     module._apply(fn)
   File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 161, in _apply
     self.flatten_parameters()
   File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 151, in flatten_parameters
     self.batch_first, bool(self.bidirectional))
RuntimeError: CUDA error: no kernel image is available for execution on the device
  
Testing torch on its own
$ python
 Python 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:21) 
 [GCC 9.4.0] on linux
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import torch
 >>> torch.cuda.is_avaliable()
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 AttributeError: module 'torch.cuda' has no attribute 'is_avaliable'
 >>> torch.cuda.is_available()
 True
 >>> torch.zeros(1).cuda()
 /home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/cuda/__init__.py:125: UserWarning: 
 NVIDIA GeForce RTX 4060 Ti with CUDA capability sm_89 is not compatible with the current PyTorch installation.
 The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70 sm_75.
 If you want to use the NVIDIA GeForce RTX 4060 Ti GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
  warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/tensor.py", line 153, in __repr__
     return torch._tensor_str._str(self)
   File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/_tensor_str.py", line 371, in _str
     return _str_intern(self)
   File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/_tensor_str.py", line 351, in _str_intern
     tensor_str = _tensor_str(self, indent)
   File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/_tensor_str.py", line 241, in _tensor_str
     formatter = _Formatter(get_summarized_data(self) if summarize else self)
   File "/home/yeqiang/下载/ai/M4Singer/code/venv3712/lib/python3.7/site-packages/torch/_tensor_str.py", line 89, in __init__
     nonzero_finite_vals = torch.masked_select(tensor_view, torch.isfinite(tensor_view) & tensor_view.ne(0))
RuntimeError: CUDA error: no kernel image is available for execution on the device
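
The warning in the log above already points at the cause: the installed PyTorch build only ships kernels for sm_37 through sm_75, while the RTX 4060 Ti reports sm_89. Below is a minimal sketch of that comparison, not PyTorch's own check. The names `is_arch_supported` and `check_current_device` are made up for illustration, and the rule used (same major version, device minor version at least the built minor version) is a simplification that ignores PTX `compute_XY` entries. It also assumes `torch.cuda.get_arch_list()` is available, which may not hold on very old PyTorch releases.

```python
import re


def is_arch_supported(capability, arch_list):
    """Return True if a device with `capability` (major, minor) can run
    kernels from a build whose arch list looks like ["sm_70", "sm_75"].

    Simplified rule (an assumption of this sketch): code built for sm_XY
    runs on devices with the same major version X and a minor version
    greater than or equal to Y. Entries that are not plain "sm_<digits>"
    (for example PTX "compute_XY" entries) are ignored.
    """
    major, minor = capability
    for arch in arch_list:
        match = re.fullmatch(r"sm_(\d+)(\d)", arch)
        if match is None:
            continue  # not a plain sm_XY entry, skip it
        built_major, built_minor = int(match.group(1)), int(match.group(2))
        if built_major == major and built_minor <= minor:
            return True
    return False


def check_current_device():
    """Compare the local GPU with the installed torch build.

    Needs torch and a visible GPU, so it is not called at import time.
    """
    import torch  # imported lazily so the helper above works without torch

    capability = torch.cuda.get_device_capability(0)
    return is_arch_supported(capability, torch.cuda.get_arch_list())


# Arch list copied from the warning in the log above.
OLD_BUILD_ARCHS = ["sm_37", "sm_50", "sm_60", "sm_70", "sm_75"]
print(is_arch_supported((8, 9), OLD_BUILD_ARCHS))  # RTX 4060 Ti (sm_89): False
print(is_arch_supported((7, 5), OLD_BUILD_ARCHS))  # RTX 2060 (sm_75): True
```

Under this rule, any build whose arch list contains an sm_8x entry no higher than sm_89 would be treated as usable on the 4060 Ti. Calling `check_current_device()` on the affected machine runs the same comparison against the real installation.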
  
The same test works on the 2060 machine
$ python
 Python 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:21) 
 [GCC 9.4.0] on linux
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import torch
 >>> torch.cuda.is_available()
 True
 >>> torch.zeros(1).cuda()
tensor([0.], device='cuda:0')
 >>> 
  
Tried installing nvidia-cuda-toolkit (this package is not installed on the 2060 machine)
apt install nvidia-cuda-toolkit
The error persists, so the system CUDA toolkit appears to be unrelated to the problem.
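
One likely reason the system package makes no difference: PyTorch wheels installed with pip bundle their own CUDA runtime libraries, so what matters is the CUDA version recorded in `torch.version.cuda`, not what apt installed. A small sketch for printing that information follows. `describe_torch_build` is a hypothetical helper name, and it takes the module as a parameter only so that it can be exercised without a GPU.

```python
def describe_torch_build(torch_module):
    """Summarize which CUDA runtime a torch build was compiled against.

    `torch_module` is the imported `torch` module (or any object with the
    same `__version__` and `version.cuda` attributes). `torch.version.cuda`
    is None for CPU-only builds.
    """
    cuda_version = torch_module.version.cuda
    if cuda_version is None:
        return "torch {} (CPU-only build)".format(torch_module.__version__)
    return "torch {} built against CUDA {}".format(
        torch_module.__version__, cuda_version
    )


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        print(describe_torch_build(torch))
```

Running this inside the virtual environment on both machines would show whether they use the same torch build, independent of any system-wide CUDA packages.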
Tried upgrading torch
Using the Aliyun PyPI mirror
(venv3712) (python3.7.12) yeqiang@yeqiang-Default-string:~/Downloads/ai/M4Singer/code$ pip install --upgrade torch
 Looking in indexes: http://mirrors.aliyun.com/pypi/simple/
 Collecting torch
   Downloading http://mirrors.aliyun.com/pypi/packages/00/86/77a9eddbf46f1bca2468d16a401911f58917f95b63402d6a7a4522521e5d/torch-1.13.1-cp37-cp37m-manylinux1_x86_64.whl (887.5 MB)
      |████████████████████████████████| 887.5 MB 3.2 MB/s 
 Collecting nvidia-cublas-cu11==11.10.3.66; platform_system == "Linux"
   Downloading http://mirrors.aliyun.com/pypi/packages/ce/41/fdeb62b5437996e841d83d7d2714ca75b886547ee8017ee2fe6ea409d983/nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)
      |████████████████████████████████| 317.1 MB 2.7 MB/s 
 Collecting nvidia-cudnn-cu11==8.5.0.96; platform_system == "Linux"
   Downloading http://mirrors.aliyun.com/pypi/packages/dc/30/66d4347d6e864334da5bb1c7571305e501dcb11b9155971421bb7bb5315f/nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)
      |████████████████████████████████| 557.1 MB 3.2 MB/s 
 Requirement already satisfied, skipping upgrade: typing-extensions in ./venv3712/lib/python3.7/site-packages (from torch) (4.7.1)
 Collecting nvidia-cuda-nvrtc-cu11==11.7.99; platform_system == "Linux"
   Downloading http://mirrors.aliyun.com/pypi/packages/ef/25/922c5996aada6611b79b53985af7999fc629aee1d5d001b6a22431e18fec/nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)
      |████████████████████████████████| 21.0 MB 3.6 MB/s 
 Collecting nvidia-cuda-runtime-cu11==11.7.99; platform_system == "Linux"
   Downloading http://mirrors.aliyun.com/pypi/packages/36/92/89cf558b514125d2ebd8344dd2f0533404b416486ff681d5434a5832a019/nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
      |████████████████████████████████| 849 kB 4.1 MB/s 
 Requirement already satisfied, skipping upgrade: wheel in ./venv3712/lib/python3.7/site-packages (from nvidia-cublas-cu11==11.10.3.66; platform_system == "Linux"->torch) (0.41.2)
 Requirement already satisfied, skipping upgrade: setuptools in ./venv3712/lib/python3.7/site-packages (from nvidia-cublas-cu11==11.10.3.66; platform_system == "Linux"->torch) (47.1.0)
 ERROR: torchvision 0.7.0 has requirement torch==1.6.0, but you'll have torch 1.13.1 which is incompatible.
 ERROR: torchaudio 0.6.0 has requirement torch==1.6.0, but you'll have torch 1.13.1 which is incompatible.
 Installing collected packages: nvidia-cublas-cu11, nvidia-cudnn-cu11, nvidia-cuda-nvrtc-cu11, nvidia-cuda-runtime-cu11, torch
   Attempting uninstall: torch
     Found existing installation: torch 1.6.0
     Uninstalling torch-1.6.0:
       Successfully uninstalled torch-1.6.0
 Successfully installed nvidia-cublas-cu11-11.10.3.66 nvidia-cuda-nvrtc-cu11-11.7.99 nvidia-cuda-runtime-cu11-11.7.99 nvidia-cudnn-cu11-8.5.0.96 torch-1.13.1
 WARNING: You are using pip version 20.1.1; however, version 23.2.1 is available.
 You should consider upgrading via the '/home/yeqiang/下载/ai/M4Singer/code/venv3712/bin/python3 -m pip install --upgrade pip' command.
 (venv3712) (python3.7.12) yeqiang@yeqiang-Default-string:~/Downloads/ai/M4Singer/code$ pip install --upgrade torch torchvision torchaudio
 Looking in indexes: http://mirrors.aliyun.com/pypi/simple/
 Requirement already up-to-date: torch in ./venv3712/lib/python3.7/site-packages (1.13.1)
 Collecting torchvision
   Downloading http://mirrors.aliyun.com/pypi/packages/8a/88/e83d51deb96de0847884fddb82ac0958fdc06f814c846878489aa5857a91/torchvision-0.14.1-cp37-cp37m-manylinux1_x86_64.whl (24.2 MB)
      |████████████████████████████████| 24.2 MB 2.0 MB/s 
 Collecting torchaudio
   Downloading http://mirrors.aliyun.com/pypi/packages/f6/d4/5e898f626c73f5e9a2ae15be92186e2bb090fa7441c5c00f45549a8cb13d/torchaudio-0.13.1-cp37-cp37m-manylinux1_x86_64.whl (4.2 MB)
      |████████████████████████████████| 4.2 MB 2.2 MB/s 
 Requirement already satisfied, skipping upgrade: nvidia-cublas-cu11==11.10.3.66; platform_system == "Linux" in ./venv3712/lib/python3.7/site-packages (from torch) (11.10.3.66)
 Requirement already satisfied, skipping upgrade: typing-extensions in ./venv3712/lib/python3.7/site-packages (from torch) (4.7.1)
 Requirement already satisfied, skipping upgrade: nvidia-cudnn-cu11==8.5.0.96; platform_system == "Linux" in ./venv3712/lib/python3.7/site-packages (from torch) (8.5.0.96)
 Requirement already satisfied, skipping upgrade: nvidia-cuda-nvrtc-cu11==11.7.99; platform_system == "Linux" in ./venv3712/lib/python3.7/site-packages (from torch) (11.7.99)
 Requirement already satisfied, skipping upgrade: nvidia-cuda-runtime-cu11==11.7.99; platform_system == "Linux" in ./venv3712/lib/python3.7/site-packages (from torch) (11.7.99)
 Requirement already satisfied, skipping upgrade: requests in ./venv3712/lib/python3.7/site-packages (from torchvision) (2.25.1)
 Requirement already satisfied, skipping upgrade: pillow!=8.3.*,>=5.3.0 in ./venv3712/lib/python3.7/site-packages (from torchvision) (8.0.1)
 Requirement already satisfied, skipping upgrade: numpy in ./venv3712/lib/python3.7/site-packages (from torchvision) (1.19.4)
 Requirement already satisfied, skipping upgrade: setuptools in ./venv3712/lib/python3.7/site-packages (from nvidia-cublas-cu11==11.10.3.66; platform_system == "Linux"->torch) (47.1.0)
 Requirement already satisfied, skipping upgrade: wheel in ./venv3712/lib/python3.7/site-packages (from nvidia-cublas-cu11==11.10.3.66; platform_system == "Linux"->torch) (0.41.2)
 Requirement already satisfied, skipping upgrade: certifi>=2017.4.17 in ./venv3712/lib/python3.7/site-packages (from requests->torchvision) (2020.12.5)
 Requirement already satisfied, skipping upgrade: urllib3<1.27,>=1.21.1 in ./venv3712/lib/python3.7/site-packages (from requests->torchvision) (1.26.2)
 Requirement already satisfied, skipping upgrade: idna<3,>=2.5 in ./venv3712/lib/python3.7/site-packages (from requests->torchvision) (2.10)
 Requirement already satisfied, skipping upgrade: chardet<5,>=3.0.2 in ./venv3712/lib/python3.7/site-packages (from requests->torchvision) (4.0.0)
 Installing collected packages: torchvision, torchaudio
   Attempting uninstall: torchvision
     Found existing installation: torchvision 0.7.0
     Uninstalling torchvision-0.7.0:
       Successfully uninstalled torchvision-0.7.0
   Attempting uninstall: torchaudio
     Found existing installation: torchaudio 0.6.0
     Uninstalling torchaudio-0.6.0:
       Successfully uninstalled torchaudio-0.6.0
 Successfully installed torchaudio-0.13.1 torchvision-0.14.1
 WARNING: You are using pip version 20.1.1; however, version 23.2.1 is available.
 You should consider upgrading via the '/home/yeqiang/下载/ai/M4Singer/code/venv3712/bin/python3 -m pip install --upgrade pip' command.
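
The two ERROR lines in the first pip run show why torch should not be upgraded alone: torchvision and torchaudio each pin an exact torch version. Below is a rough sketch of checking that the three packages form a matching set. The table contains only the two combinations that appear in the pip output above, and `KNOWN_SETS` and `find_mismatches` are hypothetical names used for illustration.

```python
# Version sets taken from the pip output above: torchvision 0.7.0 and
# torchaudio 0.6.0 require torch 1.6.0; the upgrade installed
# torchvision 0.14.1 and torchaudio 0.13.1 alongside torch 1.13.1.
KNOWN_SETS = {
    "1.6.0": {"torchvision": "0.7.0", "torchaudio": "0.6.0"},
    "1.13.1": {"torchvision": "0.14.1", "torchaudio": "0.13.1"},
}


def find_mismatches(installed):
    """Return (package, installed_version, expected_version) tuples for
    companion packages that do not match the installed torch version.

    `installed` maps package names to version strings. A KeyError is
    raised if the torch version is not listed in KNOWN_SETS.
    """
    expected = KNOWN_SETS[installed["torch"]]
    return [
        (name, installed.get(name), wanted)
        for name, wanted in sorted(expected.items())
        if installed.get(name) != wanted
    ]


# State of the environment after the first `pip install --upgrade torch`.
after_first_upgrade = {
    "torch": "1.13.1",
    "torchvision": "0.7.0",
    "torchaudio": "0.6.0",
}
print(find_mismatches(after_first_upgrade))
```

For the state after the first upgrade this reports both companion packages as mismatched, which is what the second command (`pip install --upgrade torch torchvision torchaudio`) resolves.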
  
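With torch 1.13.1, torchvision 0.14.1 and torchaudio 0.13.1 installed, M4Singer now runs on the 4060 Ti machine.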
