blob: 4cc33c00935f1c1e6e4b07b7ac997ea37132c3c9 (
plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
|
MACHINE: Intel Core2 Duo T9600 @ 2.80GHz
COMPILER: gcc 5.3.0 (as shipped with Slackware Linux 14.2)
The times labeled Reference are for ATLAS as installed by the authors.
NAMING ABBREVIATIONS:
kSelMM : selected matmul kernel (may be hand-tuned)
kGenMM : generated matmul kernel
kMM_NT : worst no-copy kernel
kMM_TN : best no-copy kernel
BIG_MM : large GEMM timing (usually N=1600); estimate of asymptotic peak
kMV_N : NoTranspose matvec kernel
kMV_T : Transpose matvec kernel
kGER : GER (rank-1 update) kernel
Kernel routines are not called by the user directly, and their
performance is often somewhat different than the total
algorithm (eg, dGER perf may differ from dkGER)
AFTER A PARTIAL SEARCH, ARCH IDENTIFIED AS Core232SSE3
======================================================
Reference clock rate=2493Mhz, new rate=2801Mhz
Refrenc : % of clock rate achieved by reference install
Present : % of clock rate achieved by present ATLAS install
single precision double precision
******************************** *******************************
real complex real complex
--------------- --------------- --------------- ---------------
Benchmark Refrenc Present Refrenc Present Refrenc Present Refrenc Present
========= ======= ======= ======= ======= ======= ======= ======= =======
kSelMM 578.5 363.2 564.7 577.7 334.6 352.5 325.1 336.5
kGenMM 156.3 101.2 156.5 102.0 159.9 159.2 161.7 97.3
kMM_NT 134.3 125.8 133.0 127.1 151.6 140.7 151.2 152.9
kMM_TN 154.8 101.3 152.6 101.1 142.4 90.8 149.7 94.2
BIG_MM 554.0 350.7 554.6 352.2 318.9 330.7 312.3 324.5
kMV_N 63.6 71.7 106.8 62.5 29.7 40.3 56.5 71.8
kMV_T 64.7 74.7 108.0 79.3 32.5 44.9 60.5 65.8
kGER 45.9 37.9 88.6 61.2 22.1 19.7 45.5 44.5
AFTER A FULL SEARCH
===================
Reference clock rate=2493Mhz, new rate=2801Mhz
Refrenc : % of clock rate achieved by reference install
Present : % of clock rate achieved by present ATLAS install
single precision double precision
******************************** *******************************
real complex real complex
--------------- --------------- --------------- ---------------
Benchmark Refrenc Present Refrenc Present Refrenc Present Refrenc Present
========= ======= ======= ======= ======= ======= ======= ======= =======
kSelMM 578.5 624.7 564.7 572.9 334.6 347.2 325.1 334.3
kGenMM 156.3 156.0 156.5 155.4 159.9 163.2 161.7 163.2
kMM_NT 134.3 104.8 133.0 96.9 151.6 140.5 151.2 144.5
kMM_TN 154.8 170.8 152.6 163.5 142.4 122.0 149.7 127.9
BIG_MM 554.0 527.8 554.6 558.3 318.9 331.3 312.3 331.0
kMV_N 63.6 72.1 106.8 118.8 29.7 44.8 56.5 79.1
kMV_T 64.7 78.8 108.0 134.4 32.5 45.5 60.5 88.3
kGER 45.9 40.2 88.6 74.6 22.1 21.7 45.5 44.8
|