HPC Challenge Benchmark Record

System Information
Affiliation:   Oak Ridge National Lab   URL:   http://nccs.gov/
Location:   USA, Tennessee, Oak Ridge   System Use:   Research
System Manufacturer:   Cray Inc.   System Name:   XT3 Dual-Core
Interconnect Manufacturer:   Cray Inc.   Interconnect Type:   Cray SeaStar
Operating System:   Unicos/lc 1.5.25   MPI:   xt-mpt 1.5.25
MPI Wtick:   0.000001   BLAS:   ACML 3.0
Language:   C   Compiler:   PGI 6.1.4
Compiler Flags:   -fastsse   Processor Type:   AMD Opteron
Processor Speed:   2.6 GHz   Total Processors:   10424
Processors Entered:   10404   Processors determined:   10404
Cores per chip:   2   HPL Processes:   10404
MPI Processes:   10404   Threads Entered:   1
Threads determined:   1   FLOPs per cycle:   2
Theoretical peak:   54.1 TFlop/s   Total memory:   GiB
FFT library:    
Explain Optimizations:
* Replaced the STREAM C code with equivalent assembly code.
* Used the vendor MPI optimization of the MPI_Cart_* functions to place communicating neighbors on the same physical node.
* Used the MPIRandomAccess optimization from Sandia, which combines many small messages into fewer large messages that are then exchanged via alltoall operations. This work is documented at: http://www.cs.sandia.gov/~sjplimp/algorithms.html#gups
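
The Theoretical peak entry above appears to follow from the other System Information fields. A minimal sketch of that arithmetic, assuming peak = processes x clock rate x FLOPs per cycle and that the 10404 entered processes (not the 10424 total) are the ones counted; the function name is hypothetical:

```python
def theoretical_peak_tflops(processes, clock_ghz, flops_per_cycle):
    """Peak rate in TFlop/s, assuming every process runs on its own core."""
    gflops = processes * clock_ghz * flops_per_cycle  # GFlop/s
    return gflops / 1000.0

# Values copied from the record: Processors Entered, Processor Speed, FLOPs per cycle.
peak = theoretical_peak_tflops(10404, 2.6, 2)
print(f"Theoretical peak: {peak:.1f} TFlop/s")
```

Under these assumptions the result rounds to the 54.1 TFlop/s listed in the record.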

HPL
HPL:   43.5056 Tflop/s   HPL time:   18485.3
HPL eps:   1.11022e-16   HPL Rnorm1:   0.000000226069
HPL Anorm1:   266722   HPL AnormI:   266717
HPL Xnorm1:   2223320   HPL XnormI:   13.0527
HPL N:   1064520   HPL NB:   60
HPL NProw:   102   HPL NPcol:   102
HPL depth:   1   HPL NBdiv:   2
HPL NBmin:   4   HPL CPfact:   R
HPL CRfact:   R   HPL CPtop:   1
HPL order:   R
HPL dMach EPS:   1.110223e-16   HPL sMach EPS:   0.00000005960464
HPL dMach sfMin:   0   HPL sMach sfMin:   1.175494e-38
HPL dMach Base:   2   HPL sMach Base:   2
HPL dMach Prec:   2.220446e-16   HPL sMach Prec:   0.0000001192093
HPL dMach mLen:   53   HPL sMach mLen:   24
HPL dMach Rnd:   1   HPL sMach Rnd:   1
HPL dMach eMin:   -1021   HPL sMach eMin:   -125
HPL dMach rMin:   0   HPL sMach rMin:   1.175494e-38
HPL dMach eMax:   1024   HPL sMach eMax:   128
HPL dMach rMax:   1.797693e308   HPL sMach rMax:   3.402823e38
dweps:   1.110223e-16   sweps:   0.00000005960464
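
The HPL rate can be cross-checked against HPL N and HPL time. The sketch below assumes the usual HPL operation count of (2/3)N^3 + (3/2)N^2 floating-point operations and compares the result with the theoretical peak derived from the System Information fields; the variable names are illustrative only:

```python
# Values copied from the record.
hpl_n = 1064520          # HPL N (matrix order)
hpl_time = 18485.3       # HPL time in seconds
nprow, npcol = 102, 102  # HPL process grid

# Assumed operation count for the LU solve: (2/3)N^3 + (3/2)N^2.
flops = (2.0 / 3.0) * hpl_n**3 + 1.5 * hpl_n**2
hpl_tflops = flops / hpl_time / 1e12

# Assumed peak: 10404 processes x 2.6 GHz x 2 FLOPs per cycle.
peak_tflops = 10404 * 2.6 * 2 / 1000.0
efficiency = hpl_tflops / peak_tflops

print(f"process grid: {nprow * npcol} processes")
print(f"HPL rate:     {hpl_tflops:.4f} Tflop/s")
print(f"efficiency:   {efficiency:.1%} of theoretical peak")
```

The 102 x 102 grid accounts for all 10404 HPL processes, and the computed rate agrees with the reported 43.5056 Tflop/s to within rounding, which is roughly 80% of the theoretical peak.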

PTRANS
PTRANS:   2038.92 GB/s   PTRANS time:   1.11157 seconds
PTRANS residual:   0   PTRANS N:   532260
PTRANS NB:   63   PTRANS NProw:   102
PTRANS NPcol:   102
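
The PTRANS rate is consistent with moving every element of the N x N matrix once during the timed transpose. A rough check, assuming 8-byte doubles (as listed under Size of Data Types) and GB = 10^9 bytes:

```python
# Values copied from the record.
ptrans_n = 532260      # PTRANS N (matrix order)
ptrans_time = 1.11157  # PTRANS time in seconds
bytes_per_double = 8   # from the "Size of Data Types" section

# Assumption: the reported rate is total matrix bytes divided by elapsed time.
matrix_bytes = bytes_per_double * ptrans_n**2
ptrans_gbs = matrix_bytes / ptrans_time / 1e9

print(f"PTRANS rate: {ptrans_gbs:.2f} GB/s")
print(f"PTRANS N is HPL N / 2: {ptrans_n * 2 == 1064520}")
```

The result lands within rounding of the reported 2038.92 GB/s.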

STREAM
S-STREAM Copy:   5.38758 GB/s   S-STREAM Scale:   4.13775 GB/s
S-STREAM Add:   3.44623 GB/s   S-STREAM Triad:   5.17433 GB/s
EP-STREAM Copy:   2.54167 GB/s   EP-STREAM Scale:   2.23672 GB/s
EP-STREAM Add:   2.05365 GB/s   EP-STREAM Triad:   2.55092 GB/s
STREAM Vector Size:   36305920   STREAM Threads:   1

RandomAccess
S-RandomAccess:   0.0181075 Gup/s   EP-RandomAccess:   0.0101491 Gup/s
G-RandomAccess:   10.6711 Gup/s   G-RandomAccess N:   1099511627776
G-RandomAccess time:   4.06674 seconds   G-RandomAccess Check Time:   23.9385 seconds
G-RandomAccess Errors:   0   G-RandomAccess Errors Fraction:   0
G-RandomAccess TimeBound:   4621.32   G-RandomAccess ExeUpdates:   43396665408
RandomAccess N:   67108864
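
The G-RandomAccess rate can be recovered from the executed update count and the elapsed time. A small sketch, assuming Gup/s means 10^9 updates per second and that the rate is computed from G-RandomAccess ExeUpdates rather than from the nominal table size:

```python
# Values copied from the record.
exe_updates = 43396665408  # G-RandomAccess ExeUpdates
elapsed = 4.06674          # G-RandomAccess time in seconds
table_size = 1099511627776 # G-RandomAccess N

gups = exe_updates / elapsed / 1e9

# The table size is a power of two; bit_length() - 1 gives its exponent.
table_log2 = table_size.bit_length() - 1
updates_per_process = exe_updates // 10404  # 10404 MPI processes

print(f"G-RandomAccess: {gups:.4f} Gup/s")
print(f"table size:     2^{table_log2} words")
print(f"updates/process: {updates_per_process}")
```

The computed rate matches the reported 10.6711 Gup/s to within rounding, and the update count divides evenly across the 10404 MPI processes.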

FFT
S-FFT:   0.735931 GFlop/s   EP-FFT:   0.653607 GFlop/s
MPIFFT:   1122.7 GFlop/s   MPIFFT N:   68719476736
MPIFFT Max Error:   0.00000000000000221632   MPIFFT time0:   0 seconds
MPIFFT time1:   2.58289 seconds   MPIFFT time2:   1.32588 seconds
MPIFFT time3:   2.12467 seconds   MPIFFT time4:   2.33529 seconds
MPIFFT time5:   2.47716 seconds   MPIFFT time6:   0.00000190735 seconds
FFTEnblk:   16   FFTEnp:   4
FFTEl2size:   1048576

DGEMM
S-DGEMM:   4.79513 GFlop/s   EP-DGEMM:   4.79356 GFlop/s
DGEMM N:   5218

RandomRing Latency/Bandwidth
RandomRing Latency:   17.0356 usec   RandomRing Bandwidth:   0.0820073 GB/s

NaturalRing Latency/Bandwidth
NaturalRing Latency:   16.1443 usec   NaturalRing Bandwidth:   0.201726 GB/s

PingPong Latency/Bandwidth
Maximum PingPong Latency:   8.68738 usec   Maximum PingPong Bandwidth:   1.15307 GB/s
Minimum PingPong Latency:   5.36442 usec   Minimum PingPong Bandwidth:   1.14708 GB/s
Average PingPong Latency:   7.00607 usec   Average PingPong Bandwidth:   1.15009 GB/s

Size of Data Types
char:   1 byte     short:   2 bytes
int:   4 bytes   long:   8 bytes
void ptr:   8 bytes   float:   4 bytes
double:   8 bytes   size_t:   8 bytes
s64Int:   8 bytes   u64Int:   8 bytes

OpenMP
M OpenMP:   -1   OpenMP Num Threads:   0
OpenMP Num Procs:   0   OpenMP Max Threads:   0

Memory
MemProc:   -1   MemSpec:   -1
MemVal:   -1

CPS
CPS_HPCC_FFT_235:     CPS_HPCC_FFTW_ESTIMATE:  
CPS_HPCC_MEMALLCTR:     CPS_HPL_USE_GETPROCESSTIMES:  
CPS_RA_SANDIA_NOPT:     CPS_RA_SANDIA_OPT2:  


Version: 1.0.0.b - Run Type: opt - Parent ID: 194
Created: 2006-11-06 - Exported: Fri Oct 24 07:28:18 2014