HPC Challenge Benchmark Record

System Information
Affiliation:   U.S. Army Engineer Research and Development Center Major Shared Resource Center   URL:  
Location:   Vicksburg, MS   System Use:   Government
System Manufacturer:   Cray Inc.   System Name:   X1
Interconnect Manufacturer:   Cray   Interconnect Type:   Cray modified 2D torus
Operating System:   Unicos/MP 2.4 (Site dependent)   MPI:   MPT 2.4
MPI Wtick:     BLAS:   Cray libsci 5.2
Language:   C   Compiler:  
Compiler Flags:     Processor Type:   Cray X1 MSP
Processor Speed:   0.8 GHz   Total Processors:   0
Processors Entered:   60   Processors determined:   60
Cores per chip:     HPL Processes:   60
MPI Processes:   60   Threads Entered:   1
Threads determined:   1   FLOPs per cycle:   16
Theoretical peak:   0.768 TFlop/s   Total memory:   GiB
FFT library:    
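
The theoretical peak listed above is consistent with the product of the processor count, clock rate, and FLOPs per cycle given in this record. A minimal sketch of that arithmetic follows; the function and variable names are illustrative and not part of the record.

```python
# Values copied from the System Information fields of this record.
processors = 60        # "Processors Entered" / "HPL Processes"
clock_ghz = 0.8        # "Processor Speed"
flops_per_cycle = 16   # "FLOPs per cycle"


def theoretical_peak_tflops(n_procs, ghz, flops_cycle):
    """Peak rate in TFlop/s: processors x clock (GHz) x FLOPs per cycle / 1000."""
    return n_procs * ghz * flops_cycle / 1000.0


peak_tflops = theoretical_peak_tflops(processors, clock_ghz, flops_per_cycle)
print(f"Theoretical peak: {peak_tflops:.3f} TFlop/s")
```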
Explain Optimizations:
STREAM: Aligned the data to cache-line boundaries and added no_cache_alloc directives.
Single-CPU RandomAccess: Changed the vector length to 1024 and added a concurrent directive.
MPI RandomAccess: Changed the distribution so that all processors hold an equal number of elements except the last CPU, which eliminated the need for an if test. Implemented "Extra Buckets" to vectorize.
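
The MPI RandomAccess note above describes a table distribution in which every process except the last holds the same number of elements. The sketch below illustrates one way such a layout removes the branch from the owner lookup. It is an assumption for illustration only: the submitter's actual code is not part of this record, and the function names and the example table size are hypothetical.

```python
def block_size(table_size, n_procs):
    """Elements held by every process except the last (ceiling division).

    Assumes table_size is much larger than n_procs, so the last
    process still receives a positive number of elements.
    """
    return -(-table_size // n_procs)


def owner(global_index, table_size, n_procs):
    """Owning process of a table element: one integer division, no if test."""
    return global_index // block_size(table_size, n_procs)


def local_count(rank, table_size, n_procs):
    """Number of elements stored on a given rank; only the last rank differs."""
    size = block_size(table_size, n_procs)
    return min(size, table_size - rank * size)


# Hypothetical example: a 2**30-element table spread over 60 processes.
table_size = 2 ** 30
n_procs = 60
counts = [local_count(r, table_size, n_procs) for r in range(n_procs)]
print(counts[0], counts[-1], owner(table_size - 1, table_size, n_procs))
```

With a uniform block size, the owner of an index is a single division, which is easier for a vectorizing compiler than a lookup that branches on whether the index falls in the larger or smaller blocks.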

HPL:   0.578874 TFlop/s   HPL time:   2868.66 seconds
HPL eps:   1.110223e-16   HPL Rnorm1:  
HPL Anorm1:     HPL AnormI:  
HPL Xnorm1:     HPL XnormI:  
HPL N:   135555   HPL NB:   112
HPL NProw:   4   HPL NPcol:   15
HPL depth:   1   HPL NBdiv:   2
HPL NBmin:   4   HPL CPfact:   R
HPL CRfact:   R   HPL CPtop:   1
HPL order:   R
HPL dMach EPS:     HPL sMach EPS:  
HPL dMach sfMin:     HPL sMach sfMin:  
HPL dMach Base:     HPL sMach Base:  
HPL dMach Prec:     HPL sMach Prec:  
HPL dMach mLen:     HPL sMach mLen:  
HPL dMach Rnd:     HPL sMach Rnd:  
HPL dMach eMin:     HPL sMach eMin:  
HPL dMach rMin:     HPL sMach rMin:  
HPL dMach eMax:     HPL sMach eMax:  
HPL dMach rMax:     HPL sMach rMax:  
dweps:     sweps:  
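
The HPL rate above can be cross-checked from the problem size and run time using the standard HPL operation count, 2/3 N^3 + 3/2 N^2. The sketch below does so and also derives the fraction of theoretical peak and the matrix footprint. The variable names are illustrative, and the small difference from the reported rate is expected because the run time is rounded in this record.

```python
# Values copied from the HPL and System Information fields of this record.
hpl_n = 135555
hpl_time_s = 2868.66
hpl_reported_tflops = 0.578874
peak_tflops = 0.768
nprow, npcol = 4, 15


def hpl_flops(n):
    """Standard HPL operation count for an n x n solve."""
    return (2.0 / 3.0) * n ** 3 + 1.5 * n ** 2


hpl_rate_tflops = hpl_flops(hpl_n) / hpl_time_s / 1e12
hpl_efficiency = hpl_reported_tflops / peak_tflops
matrix_gib = hpl_n * hpl_n * 8 / 2 ** 30   # 8-byte doubles

print(f"Recomputed HPL rate: {hpl_rate_tflops:.4f} TFlop/s")
print(f"Fraction of peak:    {hpl_efficiency:.1%}")
print(f"Matrix footprint:    {matrix_gib:.1f} GiB")
print(f"Process grid:        {nprow} x {npcol} = {nprow * npcol} processes")
```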

PTRANS:   31.0723 GB/s   PTRANS time:   1.18 seconds
PTRANS residual:   0   PTRANS N:   67777
PTRANS NB:   112   PTRANS NProw:   4
PTRANS NPcol:   15
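
The PTRANS rate can be related to the matrix size and time in the same way, assuming the rate counts N x N 8-byte elements moved once per transpose. The sketch below is a consistency check under that assumption; the names are illustrative. Because the time is listed to only two decimals, the check compares the time implied by the reported rate rather than expecting an exact match.

```python
# Values copied from the PTRANS and HPL fields of this record.
ptrans_n = 67777
ptrans_time_s = 1.18
ptrans_reported_gbs = 31.0723
hpl_n = 135555

ptrans_bytes = ptrans_n * ptrans_n * 8             # one N x N matrix of doubles
ptrans_rate_gbs = ptrans_bytes / ptrans_time_s / 1e9
implied_time_s = ptrans_bytes / (ptrans_reported_gbs * 1e9)

print(f"Rate from rounded time:    {ptrans_rate_gbs:.2f} GB/s")
print(f"Time implied by the rate:  {implied_time_s:.4f} s")
print(f"PTRANS N equals HPL N // 2: {ptrans_n == hpl_n // 2}")
```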

S-STREAM Copy:   21.737 GB/s   S-STREAM Scale:   21.7695 GB/s
S-STREAM Add:   23.3945 GB/s   S-STREAM Triad:   23.9422 GB/s
EP-STREAM Copy:   19.4427 GB/s   EP-STREAM Scale:   19.4533 GB/s
EP-STREAM Add:   20.5811 GB/s   EP-STREAM Triad:   21.768 GB/s
STREAM Vector Size:   102084192   STREAM Threads:   1
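
The STREAM figures above are per-process rates. A commonly derived quantity is the aggregate bandwidth, obtained by multiplying the EP-STREAM rate by the process count. The sketch below computes it along with the size of one array and the time for one Triad pass. It assumes the vector size is a per-process element count of 8-byte doubles and uses the usual STREAM byte counts per kernel; the names are illustrative.

```python
# Values copied from the STREAM and System Information fields of this record.
n_procs = 60
vector_size = 102084192   # elements per array, assumed per process
single = {"Copy": 21.737, "Scale": 21.7695, "Add": 23.3945, "Triad": 23.9422}
ep = {"Copy": 19.4427, "Scale": 19.4533, "Add": 20.5811, "Triad": 21.768}

# Bytes moved per array element: Copy/Scale touch 2 arrays, Add/Triad touch 3.
BYTES_PER_ELEMENT = {"Copy": 16, "Scale": 16, "Add": 24, "Triad": 24}


def seconds_per_pass(kernel, rate_gbs):
    """Time for one sweep over the arrays at the given rate in GB/s."""
    return BYTES_PER_ELEMENT[kernel] * vector_size / (rate_gbs * 1e9)


aggregate_triad_gbs = ep["Triad"] * n_procs
array_gib = vector_size * 8 / 2 ** 30
triad_pass_s = seconds_per_pass("Triad", single["Triad"])

print(f"Aggregate EP-STREAM Triad: {aggregate_triad_gbs:.2f} GB/s")
print(f"One array:                 {array_gib:.3f} GiB")
print(f"One S-STREAM Triad pass:   {triad_pass_s:.4f} s")
```

The EP rates sit below the single-process rates for every kernel, which is what one would expect when all processes stream from memory at the same time.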

S-RandomAccess:   0.211953 Gup/s   EP-RandomAccess:   0.210385 Gup/s
G-RandomAccess:   Gup/s   G-RandomAccess N:  
G-RandomAccess time:   52.610146 seconds   G-RandomAccess Check Time:   seconds
G-RandomAccess Errors:     G-RandomAccess Errors Fraction:  
G-RandomAccess TimeBound:     G-RandomAccess ExeUpdates:  
RandomAccess N:  

S-FFT:   GFlop/s   EP-FFT:   GFlop/s
MPIFFT:   GFlop/s   MPIFFT N:  
MPIFFT Max Error:     MPIFFT time0:   seconds
MPIFFT time1:   seconds   MPIFFT time2:   seconds
MPIFFT time3:   seconds   MPIFFT time4:   seconds
MPIFFT time5:   seconds   MPIFFT time6:   seconds
FFTEnblk:     FFTEnp:  

S-DGEMM:   GFlop/s   EP-DGEMM:   GFlop/s

RandomRing Latency/Bandwidth
RandomRing Latency:   21.158 usec   RandomRing Bandwidth:   1.00986 GB/s

NaturalRing Latency/Bandwidth
NaturalRing Latency:   20.969 usec   NaturalRing Bandwidth:   3.4332 GB/s

PingPong Latency/Bandwidth
Maximum PingPong Latency:   9.26937 usec   Maximum PingPong Bandwidth:   9.459843 GB/s
Minimum PingPong Latency:   7.992 usec   Minimum PingPong Bandwidth:   4.40747 GB/s
Average PingPong Latency:   8.672 usec   Average PingPong Bandwidth:   8.431709 GB/s
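
One way to read the latency and bandwidth pairs above is through the simple linear cost model, time = latency + bytes / bandwidth. This model is a common approximation and is not part of the record. The sketch below applies it to the average PingPong figures and computes the message size at which half of the listed bandwidth would be reached under that model; the function names are illustrative.

```python
# Values copied from the PingPong and ring fields of this record.
avg_latency_us = 8.672
avg_bandwidth_gbs = 8.431709
natural_ring_gbs = 3.4332
random_ring_gbs = 1.00986


def transfer_time_s(message_bytes, latency_us, bandwidth_gbs):
    """Linear cost model: fixed latency plus size divided by bandwidth."""
    return latency_us * 1e-6 + message_bytes / (bandwidth_gbs * 1e9)


def half_performance_bytes(latency_us, bandwidth_gbs):
    """Message size at which the model delivers half of the bandwidth."""
    return latency_us * 1e-6 * bandwidth_gbs * 1e9


n_half = half_performance_bytes(avg_latency_us, avg_bandwidth_gbs)
ring_ratio = natural_ring_gbs / random_ring_gbs

print(f"Half-performance message size: {n_half:.0f} bytes")
print(f"NaturalRing / RandomRing bandwidth: {ring_ratio:.2f}x")
```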

Size of Data Types
char:   byte     short:   bytes
int:   bytes   long:   bytes
void ptr:   bytes   float:   bytes
double:   bytes   size_t:   bytes
s64Int:   bytes   u64Int:   bytes

M OpenMP:     OpenMP Num Threads:  
OpenMP Num Procs:     OpenMP Max Threads:  

MemProc:     MemSpec:  


Version: 0.5.1.b - Run Type: opt - Parent ID: 24
Created: 2004-04-26 - Exported: Sun Apr 30 10:53:43 2017