Differences

Below are the differences between two revisions of the page.

--- umfpack [2009/11/25 14:26] – gerard
+++ umfpack [2017/08/25 09:56] (current version) – external edit 127.0.0.1
@@ Line 6: / Line 6: @@
 | nemo | 4.4 | /usr/local/UMFPACKv4.4 | compiled with sunperflib |
 | shrek | | /usr/local/UMFPACKv4.4 | compiled with [[http://www.cs.utexas.edu/users/flame/goto|K. Goto's BLAS]] |
+| octopus | 4.4 | /usr/local/UMFPACKv4.4 | compiled with sunperflib |
 ====== Usage ======
@@ Line 12: / Line 13: @@
   * /usr/local/UMFPACKv4.4/UMFPACK/Include
   * /usr/local/UMFPACKv4.4/AMD/Include
+  * /local/apps/src/UMFPACKv4.4/UMFPACK/Demo/
-Using umfpack in Fortran code:
+===== Using umfpack in Fortran code =====
+:
   * umfpack is written in C
   * a Fortran 77 interface exists and can be used from Fortran 90
@@ Line 32: / Line 36: @@
 To do this, you must of course link your program with the umf4_f77wrapper.c wrapper, as follows:
 <code>
-cc -o umf4_f77wrapper.o -DDLONG -m64 -I/local2/fboyer/UMFPACKv4.4/UMFPACK/Include -c umf4_f77wrapper.c
+cc -o umf4_f77wrapper.o -DDLONG -m64 -I/usr/local/UMFPACKv4.4/UMFPACK/Include -c umf4_f77wrapper.c
 f90 -o poisson3d_umfpack.o -g -fast -C -e -fpp -stackvar -xcheck=init_local -fpover -ftrap=%none -Xlist -fsimple=0 -fns=no -dalign -O4 -KPIC -xmodel=medium -m64 -c poisson3d_umfpack.f90
 f90 -g -fast -C -e -fpp -stackvar -xcheck=init_local -fpover -ftrap=%none -Xlist -fsimple=0 -fns=no -dalign -O4 -KPIC -xmodel=medium -m64 -o poisson3d_umfpack poisson3d_umfpack.o umf4_f77wrapper.o /usr/local/UMFPACKv4.4/UMFPACK/Lib/libumfpack.a /usr/local/UMFPACKv4.4/AMD/Lib/libamd.a -xlic_lib=sunperf
@@ Line 75: / Line 79: @@
   * 64-bit only
+  * there is a bug in the test programs, fixed in umf4hb64.f
+<code>
+diff  /local/apps/src/UMFPACKv4.4/UMFPACK/Demo/umf4hb64.f-ori  /local/apps/src/UMFPACKv4.4/UMFPACK/Demo/umf4hb64.f
+c331
+<      $      n, nz, Ap (n+1), Ai (n), j, i, p
+---
+>      $      n, nz, Ap (n+1), Ai (nz), j, i, p
+</code>
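The fix above changes the declared size of Ai from n to nz. A minimal sketch in C of why that matters, assuming the usual compressed-column layout (Ap holds n+1 column pointers, Ai holds one row index per stored entry) and 0-based indices. The function names csc_nnz and csc_pattern_ok are hypothetical helpers for illustration, not part of UMFPACK:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of row indices Ai must hold.
 * In compressed-column form this is Ap[n], i.e. nz, not n. */
long csc_nnz(long n, const long *Ap)
{
    assert(n >= 0 && Ap != NULL);
    return Ap[n];
}

/* Hypothetical helper: returns 1 if (Ap, Ai) describe a consistent
 * n-by-n compressed-column pattern, 0 otherwise. */
int csc_pattern_ok(long n, const long *Ap, const long *Ai)
{
    if (Ap[0] != 0)
        return 0;
    for (long j = 0; j < n; j++) {
        if (Ap[j + 1] < Ap[j])          /* column pointers must not decrease */
            return 0;
        for (long p = Ap[j]; p < Ap[j + 1]; p++) {
            if (Ai[p] < 0 || Ai[p] >= n) /* row index out of range */
                return 0;
        }
    }
    return 1;
}
```

For the arc130 test matrix, whose output is listed further down (130 rows and columns, 1282 entries), Ai needs 1282 entries, so a declaration sized by n = 130 is too small.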
 ====== Tests ======
+===== in C =====
   * get the source [[http://iusti.polytech.univ-mrs.fr/~jobic/dokuwiki/doku.php?id=librairies_installees&#umfpack|here]]
 <code>
+module load ss12
 cc -o umfpack_simple -m64 umfpack_simple.c -I/usr/local/UMFPACKv4.4/UMFPACK/Include -R/usr/local/UMFPACKv4.4/UMFPACK/Lib -L/usr/local/UMFPACKv4.4/UMFPACK/Lib -R/usr/local/UMFPACKv4.4/AMD/Lib -L/usr/local/UMFPACKv4.4/AMD/Lib -lumfpack -lamd -xlic_lib=sunperf
+</code>
+or
+<code>
+module load ss12u1
+cc -o umfpack_simple -m64 umfpack_simple.c -I/usr/local/UMFPACKv4.4/UMFPACK/Include -R/usr/local/UMFPACKv4.4/UMFPACK/Lib -L/usr/local/UMFPACKv4.4/UMFPACK/Lib -R/usr/local/UMFPACKv4.4/AMD/Lib -L/usr/local/UMFPACKv4.4/AMD/Lib -lumfpack -lamd -xlic_lib=sunperf -lm
 </code>
+===== in Fortran =====
+  * use the file umf4hb64.f in /local/apps/src/UMFPACKv4.4/UMFPACK/Demo/ on nemo as an example
+<code>
+f90 -o umf4hb64.o -g -fast -C -e -fpp -stackvar -xcheck=init_local -fpover -ftrap=%none -Xlist -fsimple=0 -fns=no -xmodel=medium -dalign -m64 -c umf4hb64.f
+cc -o umf4_f77wrapper.o -DDLONG -m64 -I/usr/local/UMFPACKv4.4/UMFPACK/Include -c umf4_f77wrapper.c
+f90 -g -fast -C -e -fpp -stackvar -xcheck=init_local -fpover -ftrap=%none -Xlist -fsimple=0 -fns=no -xmodel=medium -dalign -m64 -o umf4hb64 umf4hb64.o umf4_f77wrapper.o /usr/local/UMFPACKv4.4/UMFPACK/Lib/libumfpack.a /usr/local/UMFPACKv4.4/AMD/Lib/libamd.a -xlic_lib=sunperf
+</code>
+then run it (the test matrices are in the Demo/HB directory):
+<code>
+./umf4hb64 < arc130.rua
+ Matrix key: ARC130
+UMFPACK V4.4 (Jan. 28, 2005), Control:
+    Matrix entry defined as: double
+    Int (generic integer) defined as: long
+: print level: 2
+: dense row parameter:    0.2
+        "dense" rows have    > max (16, (0.2)*16*sqrt(n_col) entries)
+: dense column parameter: 0.2
+        "dense" columns have > max (16, (0.2)*16*sqrt(n_row) entries)
+: pivot tolerance: 0.1
+: block size for dense matrix kernels: 32
+: strategy: 0 (auto)
+: initial allocation ratio: 0.7
+: max iterative refinement steps: 2
+: 2-by-2 pivot tolerance: 0.01
+: Q fixed during numerical factorization: 0 (auto)
+: AMD dense row/col parameter:    10
+       "dense" rows/columns have > max (16, (10)*sqrt(n)) entries
+        Only used if the AMD ordering is used.
+: diagonal pivot tolerance: 0.001
+        Only used if diagonal pivoting is attempted.
+: scaling: 1 (divide each row by sum of abs. values in each row)
+: frontal matrix allocation ratio: 0.5
+: drop tolerance: 0
+: AMD and COLAMD aggressive absorption: 1 (yes)
+    The following options can only be changed at compile-time:
+: BLAS library used:  Sun Performance Library BLAS.
+: compiled for ANSI C (uses malloc, free, realloc, and printf)
+: CPU timer is POSIX times ( ) routine.
+: compiled for normal operation (debugging disabled)
+    computer/operating system: Sun Solaris
+    size of int: 4 long: 8 Int: 8 pointer: 8 double: 8 Entry: 8 (in bytes)
+symbolic analysis:
+   status:     0.
+   time:      0.00E+00 (sec)
+   estimates (upper bound) for numeric LU:
+   size of LU:          0.14 (MB)
+   memory needed:       0.29 (MB)
+   flop count:      0.94E+05
+   nnz (L):            1009.
+   nnz (U):            7849.
+numeric factorization:
+   status:     0.
+   time:      0.00E+00
+   actual numeric LU statistics:
+   size of LU:          0.02 (MB)
+   memory needed:       0.11 (MB)
+   flop count:      0.42E+04
+   nnz (L):             417.
+   nnz (U):             787.
+UMFPACK V4.4 (Jan. 28, 2005), Info:
+    matrix entry defined as:          double
+    Int (generic integer) defined as: long
+    BLAS library used:                Sun Performance Library BLAS.
+    MATLAB:                           no.
+    CPU timer:                        POSIX times ( ) routine.
+    number of rows in matrix A:       130
+    number of columns in matrix A:    130
+    entries in matrix A:              1282
+    memory usage reported in:         16-byte Units
+    size of int:                      4 bytes
+    size of long:                     8 bytes
+    size of pointer:                  8 bytes
+    size of numerical entry:          8 bytes
+    strategy used:                    symmetric
+    ordering used:                    amd on A+A'
+    modify Q during factorization:    no
+    prefer diagonal pivoting:         yes
+    pivots with zero Markowitz cost:               6
+    submatrix S after removing zero-cost pivots:
+        number of "dense" rows:                    7
+        number of "dense" columns:                 0
+        number of empty rows:                      0
+        number of empty columns                    0
+        submatrix S square and diagonal preserved
+    pattern of square submatrix S:
+        number rows and columns                    124
+        symmetry of nonzero pattern:               0.841193
+        nz in S+S' (excl. diagonal):               1204
+        nz on diagonal of matrix S:                124
+        fraction of nz on diagonal:                1.000000
+    AMD statistics, for strict diagonal pivoting:
+        est. flops for LU factorization:           8.27000e+03
+        est. nz in L+U (incl. diagonal):           1336
+        est. largest front (# entries):            324
+        est. max nz in any column of L:            18
+        number of "dense" rows/columns in S+S':    2
+    symbolic factorization defragmentations:       0
+    symbolic memory usage (Units):                 4690
+    symbolic memory usage (MBytes):                0.1
+    Symbolic size (Units):                         633
+    Symbolic size (MBytes):                        0
+    symbolic factorization CPU time (sec):         0.00
+    symbolic factorization wallclock time(sec):    0.00
+    matrix scaled: yes (divided each row by sum of abs values in each row)
+    minimum sum (abs (rows of A)):              7.94859e-01
+    maximum sum (abs (rows of A)):              1.08460e+06
+    symbolic/numeric factorization:      upper bound               actual      %
+    variable-sized part of Numeric object:
+        initial size (Units)                    4013                 3870    96%
+        peak size (Units)                      16281                 4884    30%
+        final size (Units)                      8566                  596     7%
+    Numeric final size (Units)                  9317                 1282    14%
+    Numeric final size (MBytes)                  0.1                  0.0    14%
+    peak memory usage (Units)                  18734                 7337    39%
+    peak memory usage (MBytes)                   0.3                  0.1    39%
+    numeric factorization flops          9.41610e+04          4.20900e+03     4%
+    nz in L (incl diagonal)                     1009                  417    41%
+    nz in U (incl diagonal)                     7849                  787    10%
+    nz in L+U (incl diagonal)                   8728                 1074    12%
+    largest front (# entries)                   2337                  270    12%
+    largest # rows in front                       19                   18    95%
+    largest # columns in front                   123                   15    12%
+    initial allocation ratio used:                 0.36
+    # of forced updates due to frontal growth:     0
+    number of off-diagonal pivots:                 0
+    nz in L (incl diagonal), if none dropped       417
+    nz in U (incl diagonal), if none dropped       796
+    number of small entries dropped                9
+    nonzeros on diagonal of U:                     130
+    min abs. value on diagonal of U:               9.22e-07
+    max abs. value on diagonal of U:               1.00e+00
+    estimate of reciprocal of condition number:    9.22e-07
+    indices in compressed pattern:                 74
+    numerical values stored in Numeric object:     979
+    numeric factorization defragmentations:        1
+    numeric factorization reallocations:           1
+    costly numeric factorization reallocations:    0
+    numeric factorization CPU time (sec):          0.00
+    numeric factorization wallclock time (sec):    0.05
+    numeric factorization mflops (wallclock):      0.08
+    symbolic + numeric CPU time (sec):             0.00
+    symbolic + numeric wall clock time (sec):      0.05
+    symbolic + numeric mflops (wall clock):        0.08
+    solve flops:                                   2.14800e+03
+    iterative refinement steps taken:              0
+    iterative refinement steps attempted:          0
+    solve CPU time (sec):                          0.00
+    solve wall clock time (sec):                   0.00
+    total symbolic + numeric + solve flops:        6.35700e+03
+    total symbolic + numeric + solve CPU time:     0.00
+    total symbolic+numeric+solve wall clock time:  0.05
+    total symbolic+numeric+solve mflops(wallclock) 0.13
+ norm (A*x-b):  1.8917489796876907E-10
+ norm (A*x-b):  5.838675376512725E-10
+ norm (A*x-b):  5.838675376512725E-10
+</code>
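The last lines of the output report a residual norm for the computed solution. As an illustration only, the sketch below computes max_i |(A*x-b)_i| for a matrix stored in compressed-column form with 0-based indices. The choice of the infinity norm and the function name csc_residual_inf are assumptions, not taken from umf4hb64.f:

```c
#include <stddef.h>

/* Hypothetical helper: infinity norm of A*x - b for an n-by-n matrix in
 * compressed-column form (Ap: n+1 column pointers, Ai: row indices,
 * Ax: numerical values). r is a caller-supplied work array of length n. */
double csc_residual_inf(long n, const long *Ap, const long *Ai,
                        const double *Ax, const double *x,
                        const double *b, double *r)
{
    /* Start from r = -b, then accumulate A*x column by column. */
    for (long i = 0; i < n; i++)
        r[i] = -b[i];
    for (long j = 0; j < n; j++) {
        for (long p = Ap[j]; p < Ap[j + 1]; p++)
            r[Ai[p]] += Ax[p] * x[j];
    }
    /* Largest absolute component of the residual. */
    double rmax = 0.0;
    for (long i = 0; i < n; i++) {
        double a = (r[i] < 0.0) ? -r[i] : r[i];
        if (a > rmax)
            rmax = a;
    }
    return rmax;
}
```

A value close to machine precision relative to the size of the entries of A and b suggests the solve succeeded. The reciprocal condition estimate printed in the output gives some context for judging a norm of order 1e-10.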