**cuda**

====== pycuda ======

  * see [[http://wiki.tiker.net/PyCuda/Installation/Linux/Ubuntu]]
  * also drew on [[https://wiki.calculquebec.ca/w/Python/fr]]

<Code>
tar xzf Downloads/pycuda-2015.1.3.tar.gz
cd pycuda-2015.1.3/
export PATH=/local/apps/cuda-7.5/bin:$PATH
export LD_LIBRARY_PATH=/local/apps/cuda-7.5/lib64:$LD_LIBRARY_PATH
./configure.py --python-exe=/usr/bin/python3 --cuda-root=/local/apps/cuda-7.5 --cudadrv-lib-dir=/usr/lib/x86_64-linux-gnu --boost-inc-dir=/usr/include --boost-lib-dir=/usr/lib --boost-python-libname=boost_python-py34 --boost-thread-libname=boost_thread --no-use-shipped-boost
</Code>

Error:

<Code>
ImportError: No module named 'setuptools'
</Code>

fixed by installing:

<Code>
apt-get install python3-setuptools
</Code>

then:

<Code>
ImportError: No module named 'numpy'
# apt-get install python3-scipy
Reading package lists... Done
Building dependency tree
Reading state information...
Done
The following extra packages will be installed:
  python3-decorator python3-numpy
</Code>

plus:

<Code>
apt-get install libpython3.4-dev
</Code>

Once that is done, install into a virtualenv so that no system files are touched:

  * add an alias in .bashrc so that ''python'' launches python3

===== venv python3 =====

  * create the venv with python3:

<Code>
pyvenv-3.4 env_pycuda
</Code>

which creates the env_pycuda directory; activate it with:

<Code>
source ~/env_pycuda/bin/activate
</Code>

In this venv, numpy has to be added:

<Code>
pip install numpy
</Code>

and finally:

<Code>
export PATH=/local/apps/cuda-7.5/bin:$PATH
export LD_LIBRARY_PATH=/local/apps/cuda-7.5/lib64:$LD_LIBRARY_PATH
python setup.py install
</Code>

which gives:

<Code>
$ pip list
appdirs (1.4.0)
decorator (4.0.6)
numpy (1.10.4)
pip (1.5.4)
py (1.4.31)
pycuda (2015.1.3)
pytest (2.8.7)
pytools (2016.1)
setuptools (3.3)
six (1.10.0)
</Code>

The examples can be fetched with:

<Code>
./pycuda-2015.1.3/examples/download-examples-from-wiki.py
</Code>

They end up in the wiki-examples/ directory.

===== with python2 =====

<Code>
virtualenv env_pycuda_py2
source env_pycuda_py2/bin/activate
pip install numpy
cd pycuda-2015.1.3/
rm siteconf.py
./configure.py --cuda-root=/local/apps/cuda-7.5/ --cudadrv-lib-dir=/usr/lib/x86_64-linux-gnu --boost-inc-dir=/usr/include --boost-lib-dir=/usr/lib --boost-python-libname=boost_python --boost-thread-libname=boost_thread --no-use-shipped-boost
python setup.py install
pip install .
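# Optional smoke test once the install finishes (assumes the NVIDIA driver is
# loaded and a GPU is present; pycuda.driver.init() and Device.count() are
# part of PyCUDA's documented driver API):
python -c "import pycuda.driver as drv; drv.init(); print('%d CUDA device(s)' % drv.Device.count())"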
</Code>

and check:

<Code>
pip list
appdirs (1.4.0)
argparse (1.2.1)
decorator (4.0.6)
numpy (1.10.4)
pip (1.5.4)
py (1.4.31)
pycuda (2015.1.3)
pytest (2.8.7)
pytools (2016.1)
setuptools (2.2)
six (1.10.0)
wsgiref (0.1.2)
</Code>

====== Usage ======

<Code>
export PATH=/local/apps/cuda-7.5/bin:$PATH
export LD_LIBRARY_PATH=/local/apps/cuda-7.5/lib64:$LD_LIBRARY_PATH
</Code>

The examples live in /local/admin1/NVIDIA_CUDA-7.5_Samples/

  * to copy them over: ''sh cuda-install-samples-7.5.sh ~''
  * example with nbody:

<Code>
./nbody -bench
GPU Device 0: "Quadro K620" with compute capability 5.0

> Compute 5.0 CUDA device: [Quadro K620]
3072 bodies, total time for 10 iterations: 4.661 ms
= 20.246 billion interactions per second
= 404.929 single-precision GFLOP/s at 20 flops per interaction
</Code>

====== K620 card in a Dell 7100 ======

In the ~/NVIDIA_CUDA-7.5_Samples/1_Utilities/ directory:

<Code>
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Quadro K620"
  CUDA Driver Version / Runtime Version          7.5 / 7.5
  CUDA Capability Major/Minor version number:    5.0
  Total amount of global memory:                 2047 MBytes (2146762752 bytes)
  ( 3) Multiprocessors, (128) CUDA Cores/MP:     384 CUDA Cores
  GPU Max Clock rate:                            1124 MHz (1.12 GHz)
  Memory Clock rate:                             900 Mhz
  Memory Bus Width:                              128-bit
  L2 Cache Size:                                 2097152 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z):  (1024, 1024, 64)
  Max dimension size of a grid size   (x,y,z):
(2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 7.5, CUDA Runtime Version = 7.5, NumDevs = 1, Device0 = Quadro K620
Result = PASS
</Code>

<Code>
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: Quadro K620
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)    Bandwidth(MB/s)
   33554432                 6417.3

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)    Bandwidth(MB/s)
   33554432                 6471.0

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)    Bandwidth(MB/s)
   33554432                 26349.3

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
</Code>

cuda.txt · Last modified: 2017/08/25 09:56 by 127.0.0.1
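As a sanity check on the nbody figures reported earlier: with N bodies, each iteration evaluates N² pairwise interactions, so the printed throughput follows directly from the timing line. A small sketch in Python, using the numbers from the benchmark output:

```python
# Recompute the throughput the nbody sample reports from its own timing line.
n_bodies = 3072
iterations = 10
total_time_s = 4.661e-3           # "total time for 10 iterations: 4.661 ms"
flops_per_interaction = 20        # as stated by the sample

interactions_per_s = n_bodies ** 2 * iterations / total_time_s
gflops = interactions_per_s * flops_per_interaction / 1e9

print("%.3f billion interactions per second" % (interactions_per_s / 1e9))
print("%.3f single-precision GFLOP/s" % gflops)
```

Both values land on the 20.246 billion interactions/s and ~405 GFLOP/s that the sample prints, up to rounding.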
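For a quick end-to-end check that the pycuda install actually works, a minimal kernel launch in the style of PyCUDA's own tutorial can be used (hedged: this needs a CUDA-capable GPU and a loaded driver, so it will not run elsewhere; ''SourceModule'', ''pycuda.autoinit'' and ''drv.InOut'' are PyCUDA's documented API, the kernel name ''double_them'' is just an example):

```python
# Minimal PyCUDA check: compile a kernel with SourceModule and run it
# on a NumPy array, verifying the result on the host.
import numpy
import pycuda.autoinit          # creates a context on the first GPU
import pycuda.driver as drv
from pycuda.compiler import SourceModule

mod = SourceModule("""
__global__ void double_them(float *a)
{
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    a[i] *= 2.0f;
}
""")
double_them = mod.get_function("double_them")

a = numpy.random.randn(256).astype(numpy.float32)
expected = a * 2

# drv.InOut copies the array to the device and back after the launch.
double_them(drv.InOut(a), block=(256, 1, 1), grid=(1, 1))

assert numpy.allclose(a, expected)
print("OK: kernel doubled %d values on the GPU" % a.size)
```

This works in both the python3 and python2 venvs built above, as long as the CUDA PATH/LD_LIBRARY_PATH exports are in place so that ''nvcc'' can be found.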