solve "Could not load dynamic li

Author: 寽虎非虫003 | Published: 2020-06-12 01:51

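1. Problem description

I had previously installed CUDA 10.1 and cuDNN using the script from the official TensorFlow site, and TensorFlow imported normally in Python, so I assumed everything was fine. Today, when I started training on data, the following errors appeared: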

2020-05-10 20:51:10.929736: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-05-10 20:51:10.930780: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-05-10 20:51:10.931057: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: G5-5587
2020-05-10 20:51:10.931156: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: G5-5587
2020-05-10 20:51:10.932180: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:200] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
2020-05-10 20:51:10.934319: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:204] kernel reported version is: 418.87.1
2020-05-10 20:51:10.942166: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-05-10 20:51:11.080120: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2208000000 Hz
2020-05-10 20:51:11.084013: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x18214d5a0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-05-10 20:51:11.084074: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
Epoch 1/100
2020-05-10 20:51:39.497298: W tensorflow/core/framework/op_kernel.cc:1730] OP_REQUIRES failed at cast_op.cc:123 : Unimplemented: Cast string to float is not supported

In short, 'libcuda.so.1' cannot be opened.
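
One way to confirm this diagnosis outside TensorFlow is to ask the dynamic loader for the library directly. The following is a minimal sketch, assuming Python 3 on Linux; the helper name `can_load_library` is hypothetical and not part of TensorFlow.

```python
import ctypes


def can_load_library(name):
    """Return (True, "") if the dynamic loader can open `name`, else (False, reason)."""
    try:
        ctypes.CDLL(name)
    except OSError as exc:
        # The loader's own message, comparable to the dlerror text in the log above.
        return False, str(exc)
    return True, ""


if __name__ == "__main__":
    ok, reason = can_load_library("libcuda.so.1")
    print("libcuda.so.1 loadable:", ok, reason)
```

If this reports that the library is not loadable, the problem lies with the driver's user-space library rather than with TensorFlow itself.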

2. Troubleshooting

Check the GPU information:

nvidia-smi  # input

Output:

Sun May 10 20:41:16 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.01    Driver Version: 418.87.01    CUDA Version: N/A      |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   57C    P8     6W /  N/A |    224MiB /  6078MiB |     17%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1183      G   /usr/lib/xorg/Xorg                           132MiB |
|    0      2058      G   compiz                                        78MiB |
|    0      2374      G   fcitx-qimpanel                                 6MiB |
|    0      2889      G   /usr/lib/firefox/firefox                       1MiB |
|    0      3462      G   /usr/lib/firefox/firefox                       1MiB |
+-----------------------------------------------------------------------------+

Next, run:

nvcc --version

Output:

The program 'nvcc' is currently not installed. You can install it by typing:
sudo apt install nvidia-cuda-toolkit

The message suggests installing nvidia-cuda-toolkit, so I did:

sudo apt install nvidia-cuda-toolkit

After the installation succeeded, I reran the code that had failed and got this output:

2020-05-10 21:07:53.675736: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-05-10 21:07:53.686611: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: CUDA_ERROR_COMPAT_NOT_SUPPORTED_ON_DEVICE: forward compatibility was attempted on non supported HW
2020-05-10 21:07:53.687012: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: G5-5587
2020-05-10 21:07:53.687077: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: G5-5587
2020-05-10 21:07:53.688438: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:200] libcuda reported version is: 440.64.0
2020-05-10 21:07:53.689387: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:204] kernel reported version is: 418.87.1
2020-05-10 21:07:53.689492: E tensorflow/stream_executor/cuda/cuda_diagnostics.cc:313] kernel version 418.87.1 does not match DSO version 440.64.0 -- cannot find working devices in this configuration
2020-05-10 21:07:53.696530: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-05-10 21:07:53.863791: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2208000000 Hz
2020-05-10 21:07:53.866341: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x18334c140 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-05-10 21:07:53.866364: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version

Note the following version mismatch message:

kernel version 418.87.1 does not match DSO version 440.64.0
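
Here "kernel version" is the driver module reported by the kernel (418.87.1, consistent with the nvidia-smi output), while "DSO version" is the libcuda.so.1 that became loadable after the toolkit install (440.64.0). To spot this condition in a long training log, the line can be picked out with a regular expression. This is only a sketch, assuming the message format shown above; the function name `find_driver_mismatch` is made up for illustration.

```python
import re

# Pattern for the mismatch line printed by cuda_diagnostics.cc in the log above.
_MISMATCH_RE = re.compile(
    r"kernel version (?P<kernel>[\d.]+) does not match DSO version (?P<dso>[\d.]+)"
)


def find_driver_mismatch(log_text):
    """Return (kernel_version, dso_version) if the log reports a mismatch, else None."""
    match = _MISMATCH_RE.search(log_text)
    if match is None:
        return None
    return match.group("kernel"), match.group("dso")


if __name__ == "__main__":
    line = (
        "kernel version 418.87.1 does not match DSO version 440.64.0 "
        "-- cannot find working devices in this configuration"
    )
    print(find_driver_mismatch(line))
```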

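So the fix is to uninstall nvidia-cuda-toolkit and then reinstall a specific version that matches the installed driver.

As a side note, I also learned another way to deal with the nvcc message "The program 'nvcc' is currently not installed. You can install it by typing:". Just add the following configuration to ~/.bashrc: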

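    # cuda 10.1
    # Prepend to any existing LD_LIBRARY_PATH instead of overwriting it.
    # On 64-bit Linux the CUDA libraries are usually under lib64, not lib.
    export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
    export PATH=$PATH:/usr/local/cuda/bin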