Differences
Below are the differences between two revisions of the page.
Next revision | Previous revision | ||
cuda [2016/01/06 10:41] – created gerard | cuda [2017/08/25 09:56] (current version) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
**cuda** | **cuda** | ||
+ | |||
+ | ====== pycuda ====== | ||
+ | * see [[http:// | ||
+ | * I also drew on [[https:// | ||
+ | < | ||
+ | tar xzf Downloads/ | ||
+ | cd pycuda-2015.1.3/ | ||
+ | export PATH=/ | ||
+ | export LD_LIBRARY_PATH=/ | ||
+ | ./ | ||
+ | </ | ||
+ | error: | ||
+ | < | ||
+ | ImportError: | ||
+ | </ | ||
+ | resolved by installing: | ||
+ | < | ||
+ | apt-get install python3-setuptools | ||
+ | </ | ||
+ | then: | ||
+ | < | ||
+ | ImportError: | ||
+ | |||
+ | # apt-get install python3-scipy | ||
+ | Reading package lists... Done | ||
+ | Building dependency tree | ||
+ | Reading state information... Done | ||
+ | The following extra packages will be installed: | ||
+ | python3-decorator python3-numpy | ||
+ | | ||
+ | </ | ||
+ | and also: | ||
+ | < | ||
+ | apt-get install libpython3.4-dev | ||
+ | </ | ||
+ | once that is done, we install into a virtualenv so as not to touch the system files: | ||
+ | * add an alias in .bashrc so that python runs python3 | ||
+ | |||
+ | ===== venv python3 ===== | ||
+ | |||
+ | * create the venv with python3 | ||
+ | |||
+ | < | ||
+ | pyvenv-3.4 env_pycuda | ||
+ | </ | ||
+ | this creates the env_pycuda folder, which we then activate: | ||
+ | < | ||
+ | source ~/ | ||
+ | </ | ||
+ | in this venv, numpy must be added: | ||
+ | < | ||
+ | pip install numpy | ||
+ | </ | ||
+ | and finally: | ||
+ | < | ||
+ | export PATH=/ | ||
+ | export LD_LIBRARY_PATH=/ | ||
+ | python setup.py install | ||
+ | </ | ||
+ | which gives: | ||
+ | < | ||
+ | $ pip list | ||
+ | appdirs (1.4.0) | ||
+ | decorator (4.0.6) | ||
+ | numpy (1.10.4) | ||
+ | pip (1.5.4) | ||
+ | py (1.4.31) | ||
+ | pycuda (2015.1.3) | ||
+ | pytest (2.8.7) | ||
+ | pytools (2016.1) | ||
+ | setuptools (3.3) | ||
+ | six (1.10.0) | ||
+ | </ | ||
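As a complement to the pip list above, here is a minimal sketch that checks from Python whether the expected modules can be found in the active venv. The helper name `check_modules` is hypothetical; it relies only on `importlib.util.find_spec` from the standard library, so it neither imports the modules nor touches the GPU.

```python
import importlib.util


def check_modules(names):
    """Return {module name: True/False} depending on whether it can be found."""
    found = {}
    for name in names:
        # find_spec returns None when a top-level module is not installed
        found[name] = importlib.util.find_spec(name) is not None
    return found


if __name__ == "__main__":
    # Module names taken from the pip list above
    for name, ok in check_modules(["numpy", "pycuda", "pytools"]).items():
        print(name, "found" if ok else "MISSING")
```

Run it with the venv activated; a module reported as MISSING usually means the install went into a different interpreter.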
+ | we fetch the examples into the folder: | ||
+ | < | ||
+ | ./ | ||
+ | </ | ||
+ | the examples are in the wiki-examples/ folder | ||
+ | |||
+ | ===== with python2 ===== | ||
+ | < | ||
+ | virtualenv env_pycuda_py2 | ||
+ | source env_pycuda_py2/ | ||
+ | pip install numpy | ||
+ | cd pycuda-2015.1.3/ | ||
+ | rm siteconf.py | ||
+ | |||
+ | ./ | ||
+ | |||
+ | python setup.py install | ||
+ | |||
+ | pip install . | ||
+ | </ | ||
+ | and we check: | ||
+ | < | ||
+ | pip list | ||
+ | appdirs (1.4.0) | ||
+ | argparse (1.2.1) | ||
+ | decorator (4.0.6) | ||
+ | numpy (1.10.4) | ||
+ | pip (1.5.4) | ||
+ | py (1.4.31) | ||
+ | pycuda (2015.1.3) | ||
+ | pytest (2.8.7) | ||
+ | pytools (2016.1) | ||
+ | setuptools (2.2) | ||
+ | six (1.10.0) | ||
+ | wsgiref (0.1.2) | ||
+ | </ | ||
====== Usage ====== | ====== Usage ====== | ||
< | < | ||
- | export PATH=/usr/ | + | export PATH=/local/apps/ |
- | export LD_LIBRARY_PATH=/usr/ | + | export LD_LIBRARY_PATH=/ |
</ | </ | ||
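To confirm that the exports above took effect, a small sketch such as the following can help. It assumes the CUDA compiler `nvcc` is the tool expected on PATH (the source does not name it), and the helper `find_tool` is a hypothetical name wrapping `shutil.which`.

```python
import os
import shutil


def find_tool(name):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    # Assumption: nvcc is the CUDA tool we expect the PATH export to expose
    nvcc = find_tool("nvcc")
    print("nvcc:", nvcc if nvcc else "not found on PATH")
    # LD_LIBRARY_PATH may be unset; show an empty list in that case
    raw = os.environ.get("LD_LIBRARY_PATH", "")
    entries = [p for p in raw.split(os.pathsep) if p]
    print("LD_LIBRARY_PATH entries:", entries)
```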
the examples are in / | the examples are in / | ||
+ | * to copy them: '' | ||
* Example with nbody: | * Example with nbody: | ||
< | < | ||
Line 20: | Line 130: | ||
= 404.929 single-precision GFLOP/s at 20 flops per interaction | = 404.929 single-precision GFLOP/s at 20 flops per interaction | ||
</ | </ | ||
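The GFLOP/s figure reported by nbody follows from the interaction rate and the stated 20 flops per interaction. The sketch below works that arithmetic in both directions; the function names are hypothetical, and only the 404.929 value and the factor 20 come from the output above.

```python
def nbody_gflops(interactions_per_second, flops_per_interaction=20):
    """Convert an interaction rate into GFLOP/s."""
    return interactions_per_second * flops_per_interaction / 1e9


def nbody_interactions_per_second(gflops, flops_per_interaction=20):
    """Inverse conversion: recover the interaction rate from GFLOP/s."""
    return gflops * 1e9 / flops_per_interaction


if __name__ == "__main__":
    # 404.929 GFLOP/s is the figure shown in the nbody output above
    rate = nbody_interactions_per_second(404.929)
    print("interactions per second:", rate)
    print("back to GFLOP/s:", nbody_gflops(rate))
```

With the reported 404.929 GFLOP/s, this works out to roughly 20.2 billion interactions per second.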
+ | ====== K620 card on Dell 7100 ====== | ||
+ | in the folder ~/ | ||
+ | < | ||
+ | ./ | ||
+ | |||
+ | CUDA Device Query (Runtime API) version (CUDART static linking) | ||
+ | |||
+ | Detected 1 CUDA Capable device(s) | ||
+ | |||
+ | Device 0: " | ||
+ | CUDA Driver Version / Runtime Version | ||
+ | CUDA Capability Major/Minor version number: | ||
+ | Total amount of global memory: | ||
+ | ( 3) Multiprocessors, | ||
+ | GPU Max Clock rate: 1124 MHz (1.12 GHz) | ||
+ | Memory Clock rate: 900 Mhz | ||
+ | Memory Bus Width: | ||
+ | L2 Cache Size: | ||
+ | Maximum Texture Dimension Size (x, | ||
+ | Maximum Layered 1D Texture Size, (num) layers | ||
+ | Maximum Layered 2D Texture Size, (num) layers | ||
+ | Total amount of constant memory: | ||
+ | Total amount of shared memory per block: | ||
+ | Total number of registers available per block: 65536 | ||
+ | Warp size: 32 | ||
+ | Maximum number of threads per multiprocessor: | ||
+ | Maximum number of threads per block: | ||
+ | Max dimension size of a thread block (x,y,z): (1024, 1024, 64) | ||
+ | Max dimension size of a grid size (x,y,z): (2147483647, | ||
+ | Maximum memory pitch: | ||
+ | Texture alignment: | ||
+ | Concurrent copy and kernel execution: | ||
+ | Run time limit on kernels: | ||
+ | Integrated GPU sharing Host Memory: | ||
+ | Support host page-locked memory mapping: | ||
+ | Alignment requirement for Surfaces: | ||
+ | Device has ECC support: | ||
+ | Device supports Unified Addressing (UVA): | ||
+ | Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0 | ||
+ | Compute Mode: | ||
+ | < Default (multiple host threads can use :: | ||
+ | |||
+ | deviceQuery, | ||
+ | Result = PASS | ||
+ | </ | ||
+ | |||
+ | < | ||
+ | [CUDA Bandwidth Test] - Starting... | ||
+ | Running on... | ||
+ | |||
+ | | ||
+ | Quick Mode | ||
+ | |||
+ | Host to Device Bandwidth, 1 Device(s) | ||
+ | | ||
+ | | ||
+ | | ||
+ | |||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | |||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | |||
+ | Result = PASS | ||
+ | |||
+ | NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled. | ||
+ | </ | ||
+ | |||
+ | ====== Installation ====== |