HPC Challenge Benchmark Record

System Information
Affiliation:   Oak Ridge National Laboratory   URL:  
Location:   Tennessee, USA   System Use:   Government
System Manufacturer:   Cray Inc.   System Name:   X1
Interconnect Manufacturer:   Cray   Interconnect Type:   X1
Operating System:   UNICOS/mp 2.4   MPI:   MPT 2.4
MPI Wtick:     BLAS:   libsci 5.2
Language:   C   Compiler:  
Compiler Flags:     Processor Type:   Cray X1 MSP
Processor Speed:   0.8 GHz   Total Processors:   256
Processors Entered:   252   Processors determined:   252
Cores per chip:     HPL Processes:   252
MPI Processes:   252   Threads Entered:   1
Threads determined:   1   FLOPs per cycle:   16
Theoretical peak:   3225.6 GFlop/s   Total memory:   GiB
FFT library:    
Explain Optimizations:
STREAM: Aligned the data to cache-line boundaries and added no_cache_alloc directives. Single-CPU RandomAccess: Changed the vector length to 1024 and added a concurrent directive. MPI RandomAccess: Changed the distribution so that all processors have an equal number of elements except the last CPU, which eliminated the need for an if test. Implemented "Extra Buckets" to vectorize.
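
The MPI RandomAccess change described above can be illustrated with a small sketch. The record does not include the modified source, so this is only one plausible reading, written in Python for brevity: every rank owns a block of the same size and the last rank holds whatever remains, so the owner of a table index is a single integer division with no branch. The function names, the ceiling-division block size, and the example table size are assumptions, not taken from the benchmark code.

```python
def block_size(table_size, num_ranks):
    """Elements per rank under the assumed distribution (ceiling division)."""
    return -(-table_size // num_ranks)


def owner(index, table_size, num_ranks):
    """Rank owning a global table index: one division, no if test."""
    return index // block_size(table_size, num_ranks)


def local_counts(table_size, num_ranks):
    """Elements held by each rank; only the last rank differs."""
    block = block_size(table_size, num_ranks)
    last = table_size - block * (num_ranks - 1)
    # Setup-time sanity check only; it is not on the per-update path.
    if last <= 0:
        raise ValueError("table too small for this distribution")
    return [block] * (num_ranks - 1) + [last]


# Hypothetical table size; 252 matches the MPI process count in the record.
counts = local_counts(2**20, 252)
print(counts[0], counts[-1], sum(counts))
```

By contrast, a distribution that spreads the remainder over the first few ranks needs a comparison to decide which of two block sizes applies to a given index, which is the kind of test the optimization note says was removed.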

HPL:   2.36782 Tflop/s   HPL time:   5610.54 seconds
HPL eps:   1.110223e-16   HPL Rnorm1:  
HPL Anorm1:     HPL AnormI:  
HPL Xnorm1:     HPL XnormI:  
HPL N:   271111   HPL NB:   112
HPL NProw:   4   HPL NPcol:   63
HPL depth:   1   HPL NBdiv:   2
HPL NBmin:   4   HPL CPfact:   R
HPL CRfact:   R   HPL CPtop:   1
HPL order:   R
HPL dMach EPS:     HPL sMach EPS:  
HPL dMach sfMin:     HPL sMach sfMin:  
HPL dMach Base:     HPL sMach Base:  
HPL dMach Prec:     HPL sMach Prec:  
HPL dMach mLen:     HPL sMach mLen:  
HPL dMach Rnd:     HPL sMach Rnd:  
HPL dMach eMin:     HPL sMach eMin:  
HPL dMach rMin:     HPL sMach rMin:  
HPL dMach eMax:     HPL sMach eMax:  
HPL dMach rMax:     HPL sMach rMax:  
dweps:     sweps:  
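
The HPL figures can be cross-checked against the system fields. The sketch below recomputes the theoretical peak as processes × clock × FLOPs per cycle (which gives 3225.6 GFlop/s), rebuilds the HPL rate from N and the run time, and derives the fraction of peak achieved. It assumes the usual HPL operation count of 2/3·N³ + 3/2·N² and that the HPL time is in seconds; neither is stated in the record, and the variable names are illustrative.

```python
# Values copied from the record.
processes = 252
clock_ghz = 0.8
flops_per_cycle = 16
hpl_n = 271111
hpl_time_s = 5610.54          # assumed to be seconds
hpl_reported_tflops = 2.36782
nprow, npcol = 4, 63

# Peak in GFlop/s: processes * clock (GHz) * FLOPs per cycle.
peak_gflops = processes * clock_ghz * flops_per_cycle

# Assumed HPL operation count: 2/3 N^3 + 3/2 N^2.
hpl_flop_count = (2.0 / 3.0) * hpl_n**3 + 1.5 * hpl_n**2
hpl_tflops = hpl_flop_count / hpl_time_s / 1e12

# Fraction of theoretical peak reached by the reported HPL rate.
efficiency = hpl_reported_tflops * 1000.0 / peak_gflops

print(f"process grid   = {nprow} x {npcol} = {nprow * npcol}")
print(f"peak           = {peak_gflops:.1f} GFlop/s")
print(f"recomputed HPL = {hpl_tflops:.4f} Tflop/s")
print(f"efficiency     = {efficiency:.1%}")
```

The recomputed rate agrees with the reported 2.36782 Tflop/s to within the rounding of the run time, and the process grid accounts for all 252 HPL processes.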

PTRANS:   96.1372 GB/s   PTRANS time:   1.53 seconds
PTRANS residual:   0   PTRANS N:   135555
PTRANS NB:   317   PTRANS NProw:   4
PTRANS NPcol:   63
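
A similar check applies to PTRANS. The sketch assumes the rate is the size of the N × N matrix in bytes divided by the run time, with 8-byte double-precision elements (the data type sizes are blank in this record), and it notes that the PTRANS dimension equals half the HPL dimension. Variable names are illustrative.

```python
# Values copied from the record.
ptrans_n = 135555
ptrans_time_s = 1.53
ptrans_reported_gbs = 96.1372
hpl_n = 271111

bytes_per_element = 8  # assumption: double precision

# Assumed definition: bytes in the N x N matrix divided by elapsed time.
bytes_moved = ptrans_n * ptrans_n * bytes_per_element
ptrans_gbs = bytes_moved / ptrans_time_s / 1e9

print(f"PTRANS N == HPL N // 2: {ptrans_n == hpl_n // 2}")
print(f"recomputed PTRANS = {ptrans_gbs:.2f} GB/s")
```

The small gap between the recomputed and reported rates is consistent with the time being rounded to two decimal places.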

S-STREAM Copy:   21.0248 GB/s   S-STREAM Scale:   21.1629 GB/s
S-STREAM Add:   23.907 GB/s   S-STREAM Triad:   24.0163 GB/s
EP-STREAM Copy:   19.5648 GB/s   EP-STREAM Scale:   18.8986 GB/s
EP-STREAM Add:   21.0922 GB/s   EP-STREAM Triad:   21.741 GB/s
STREAM Vector Size:   97223744   STREAM Threads:   1
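
To put the STREAM vector size in context, the sketch below estimates the per-process working set and compares the embarrassingly parallel (EP) Triad rate with the single-process (S) rate. It assumes the three arrays of the standard STREAM kernels and 8-byte elements; neither is stated in the record.

```python
# Values copied from the record.
stream_vector_size = 97223744
s_triad_gbs = 24.0163
ep_triad_gbs = 21.741

bytes_per_element = 8  # assumption: double precision
num_arrays = 3         # assumption: STREAM arrays a, b, c

footprint_bytes = stream_vector_size * bytes_per_element * num_arrays
footprint_gib = footprint_bytes / 2**30

# Per-process Triad rate with all processes active, relative to one alone.
ep_to_single = ep_triad_gbs / s_triad_gbs

print(f"STREAM working set = {footprint_gib:.2f} GiB per process")
print(f"EP/S Triad ratio   = {ep_to_single:.3f}")
```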

S-RandomAccess:   0.20847 Gup/s   EP-RandomAccess:   0.208553 Gup/s
G-RandomAccess:   Gup/s   G-RandomAccess N:  
G-RandomAccess time:   249.15811 seconds   G-RandomAccess Check Time:   seconds
G-RandomAccess Errors:     G-RandomAccess Errors Fraction:  
G-RandomAccess TimeBound:     G-RandomAccess ExeUpdates:  
RandomAccess N:  

S-FFT:   GFlop/s   EP-FFT:   GFlop/s
MPIFFT:   GFlop/s   MPIFFT N:  
MPIFFT Max Error:     MPIFFT time0:   seconds
MPIFFT time1:   seconds   MPIFFT time2:   seconds
MPIFFT time3:   seconds   MPIFFT time4:   seconds
MPIFFT time5:   seconds   MPIFFT time6:   seconds
FFTEnblk:     FFTEnp:  

S-DGEMM:   GFlop/s   EP-DGEMM:   GFlop/s

RandomRing Latency/Bandwidth
RandomRing Latency:   22.643 usec   RandomRing Bandwidth:   0.438279 GB/s

NaturalRing Latency/Bandwidth
NaturalRing Latency:   18.821 usec   NaturalRing Bandwidth:   2.59659 GB/s

PingPong Latency/Bandwidth
Maximum PingPong Latency:   10.285 usec   Maximum PingPong Bandwidth:   9.222965 GB/s
Minimum PingPong Latency:   8.039 usec   Minimum PingPong Bandwidth:   4.89147 GB/s
Average PingPong Latency:   9.151 usec   Average PingPong Bandwidth:   8.410201 GB/s

Size of Data Types
char:   byte     short:   bytes
int:   bytes   long:   bytes
void ptr:   bytes   float:   bytes
double:   bytes   size_t:   bytes
s64Int:   bytes   u64Int:   bytes

M OpenMP:     OpenMP Num Threads:  
OpenMP Num Procs:     OpenMP Max Threads:  

MemProc:     MemSpec:  


Version: 0.5.1.b - Run Type: opt - Parent ID: 22
Created: 2004-04-26 - Exported: Mon Nov 20 19:35:29 2017
