Dhrystone howto
Revisions by Mjeamiguel (46 intermediate revisions by the same user not shown)
Latest revision as of 15:25, 30 December 2010
This page serves as a guide for running the Dhrystone benchmark on ARM machines. Please visit the Main Project Page (http://zenit.senecac.on.ca/wiki/index.php/Supporting_Architectures_above_armv5tel).
About Dhrystone
From Wikipedia, the free encyclopedia
Dhrystone is a synthetic computing benchmark program developed in 1984 by Reinhold P. Weicker intended to be representative of system (integer) programming. The Dhrystone grew to become representative of general processor (CPU) performance. Later the CPU89 benchmark suite from the Standard Performance Evaluation Corporation, today known as the "SPECint" suite was introduced, but SPEC programs are quite expensive whereas Dhrystone is free, therefore Dhrystone remains popular.
The name "Dhrystone" is a pun on a different benchmark algorithm called Whetstone.
With Dhrystone, Weicker gathered meta-data from a broad range of software, including programs written in FORTRAN, PL/1, SAL, ALGOL 68, and Pascal. He then characterized these programs in terms of various common constructs: procedure calls, pointer indirections, assignments, etc. From this he wrote the Dhrystone benchmark to correspond to a representative mix. Dhrystone was published in Ada, with the C version for Unix developed by Rick Richardson ("version 1.1") greatly contributing to its popularity.
The Dhrystone benchmark contains no floating point operations, thus the name is a pun on the then-popular Whetstone benchmark for floating point operations. The output from the benchmark is the number of Dhrystones per second (the number of iterations of the main code loop per second).
Both Whetstone and Dhrystone are synthetic benchmarks, meaning that they are simple programs that are carefully designed to statistically mimic the processor usage of some common set of programs. Whetstone, developed in 1972, originally strove to mimic typical Algol 60 programs based on measurements from 1970, but eventually became most popular in its Fortran version, reflecting the highly numerical orientation of computing in the 1960s.
Dhrystone Fundamentals
From ARM White Paper
Dhrystone has a number of attributes that have led to it being widely used in the past as a measure of CPU performance. Foremost, Dhrystone is compact, widely available in the public domain, and simple to run. Significantly, there are no lengthy certification processes to go through before citing Dhrystone figures. Dhrystone compares the performance of the processor under benchmark to that of a reference machine. This is an advantage over quoting ‘straight’ MIPS numbers since using a reference machine effectively compensates for differences in the richness of competing instruction sets. For example, literal comparison of the ‘millions of instructions per second’ numbers for a RISC architecture and a CISC architecture is not meaningful.
The industry has adopted the VAX 11/780 as the reference 1 MIP machine. The VAX 11/780 achieves 1757 Dhrystones per second. The Dhrystone figure is calculated by measuring the number of Dhrystones per second for the system, and dividing that by 1757. So "80 MIPS" means "80 Dhrystone VAX MIPS", which means 80 times faster than a VAX 11/780. A DMIPS/MHz rating takes this normalization process one step further, enabling comparison of processor performance at different clock rates. For all of these reasons, in the past, Dhrystone has been a widely quoted benchmark figure. In theory, Dhrystone should provide a basis for the comparison of processor performances.
However, some of the apparent advantages of Dhrystone are also significant weaknesses of the benchmark. Dhrystone numbers actually reflect the performance of the C compiler and libraries, probably more so than the performance of the processor itself. Also, lack of independent certification means that customers are dependent on processor vendors to quote accurate and meaningful Dhrystone data.
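The normalization described above can be sketched in a few lines of Python. This is only an illustration: the function names `dmips` and `dmips_per_mhz`, the example score, and the 100 MHz clock are assumptions made for this sketch, not part of Dhrystone or the ARM white paper. Only the reference value 1757 comes from the text.

```python
# Reference score of the VAX 11/780 quoted in the text above.
VAX_11_780_DHRYSTONES_PER_SECOND = 1757


def dmips(dhrystones_per_second: float) -> float:
    """Normalize a raw Dhrystones/second score against the VAX 11/780."""
    return dhrystones_per_second / VAX_11_780_DHRYSTONES_PER_SECOND


def dmips_per_mhz(dhrystones_per_second: float, clock_mhz: float) -> float:
    """Normalize further by the CPU clock so different clock rates can be compared."""
    if clock_mhz <= 0:
        raise ValueError("clock_mhz must be positive")
    return dmips(dhrystones_per_second) / clock_mhz


if __name__ == "__main__":
    # Hypothetical machine: exactly 80 times the VAX score, clocked at 100 MHz.
    score = 80 * VAX_11_780_DHRYSTONES_PER_SECOND
    print(dmips(score))               # the "80 MIPS" case from the text
    print(dmips_per_mhz(score, 100))  # DMIPS/MHz for the assumed 100 MHz clock
```

Dividing by the clock rate is what lets two processors running at different frequencies be compared on the same scale.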
What Dhrystone really does
From Clarify.doc (Included in Dhrystone 2.1), Rick Richardson
- DHRYSTONE is a measure of processor+compiler efficiency in executing a 'typical' program. The 'typical' program was designed by measuring statistics on a great number of 'real' programs. The 'typical' program was then written by Reinhold P. Weicker using these statistics. The program is balanced according to statement type, as well as data type.
- DHRYSTONE does not use floating point. Typical programs don't.
- DHRYSTONE does not do I/O. Typical programs do, but then we'd have a whole can of worms opened up.
- DHRYSTONE does not contain much code that can be optimized by vector processors. That is why a CRAY doesn't look real fast, they weren't built to do this sort of computing.
- DHRYSTONE does not measure OS performance, as it avoids calling the O.S. The O.S. is indicated in the results only to help in identifying the compiler technology.
- DHRYSTONE is not perfect, but is a hell of a lot better than the "sieve", or "SI".
- DHRYSTONE gives results in dhrystones/second. Bigger numbers are better. As a baseline, the original IBM PC gives around 300-400 dhrystones/second with a good compiler. The fastest machines today are approaching 100,000.
Dhrystone Characteristics
Strengths
- Written in C language Code (Allows code portability)
- Small in size (An easy to understand program)
- Single easy to report score (DMIPS which uses a reference VAX MIPS)
- Potentially useful for 8 and 16-bit microcontroller benchmark
Weaknesses
- Cannot hope to mimic the breadth of applications encountered by a processor-based system
- Dhrystone only measures a few mathematical and basic operations
- Does not measure multiply-accumulate, floating-point, SIMD, or any other type of operations
- Dhrystone’s execution time is largely spent in standard C library functions, such as strcmp(), strcpy(), and memcpy(). Compiler vendors generally provide these libraries, which are typically optimized and hand-written in assembly language. While you may think you are benchmarking a processor, you are really benchmarking the compiler writer’s optimizations of the C library functions for a particular platform
Installation
1. Obtaining the Source Code
One of the most important defects in Dhrystone is that it is often unclear what version is being quoted. Furthermore, since there are no "disclosure rules" or independent certification of scores, companies and individuals are free to state, or not state, anything. Because Dhrystone is non-proprietary, individuals and companies have produced their own modified versions, so many variants of the original source code exist.
The following package is the most widely quoted and used Dhrystone release. It is also the cleanest and most customisable version available on the internet.
Dhrystone-2.1.tar.gz: http://www.sfr-fresh.com/unix/privat/old/dhrystone-2.1.tar.gz/
2. Extract the file
Extract the tarball using the command:
tar xvf dhrystone-2.1.tar.gz -C destination_directory/
There will be a total of 19 files once extracted. Move to the directory where the extracted files are.
3. Edit the Makefile
Open the Makefile with any text editor, then UNCOMMENT (if commented) and EDIT the following fields using the GIVEN values:
Line #25: On Fedora, use -DTIME, which selects time(2) for the measurement. This field is commented out by default.
TIME_FUNC= -DTIME # Use time(2) for measurement
Line #28: The Makefile describes HZ as the frequency of times(2) clock ticks, so it should only affect the result when -DTIMES is selected. This guide sets it to 166, the DDR memory clock of the beagleboardXM in MHz.
HZ= 166 # Frequency of times(2) clock ticks
Line #39: This option is for the C compiler.
OPTIMIZE= -O2 -march=armv7-a -mtune=cortex-a8 -mfpu=neon -mfloat-abi=softfp -fomit-frame-pointer # Optimization Level (generic UNIX)
Line #40: This option is for the GCC compiler.
GCCOPTIM= -O2 -march=armv7-a -mtune=cortex-a8 -mfpu=neon -mfloat-abi=softfp -fomit-frame-pointer
Comment out/disable the following lines:
Line #26
TIME_FUNC= -DTIMES # Use times(2) for measurement
Line #38
OPTIMIZE= -Ox -G2 # Optimization Level (MSC, 80286)
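After editing, it can help to confirm which assignments are actually active. The following is a minimal sketch, assuming the Makefile uses plain `NAME= value # comment` lines as shown above. The helper `active_settings` and the abbreviated `SAMPLE` text are hypothetical and written only for this illustration; they are not part of the Dhrystone distribution.

```python
import re

# Variables this guide edits; every other Makefile line is ignored here.
WATCHED = ("TIME_FUNC", "HZ", "OPTIMIZE", "GCCOPTIM")

# NAME= value, optionally followed by a trailing "# comment".
ASSIGNMENT = re.compile(r"^\s*([A-Za-z_]+)\s*=\s*(.*?)\s*(?:#.*)?$")


def active_settings(makefile_text: str) -> dict:
    """Return the last uncommented value of each watched variable."""
    settings = {}
    for line in makefile_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # a commented-out assignment is not active
        match = ASSIGNMENT.match(line)
        if match and match.group(1) in WATCHED:
            settings[match.group(1)] = match.group(2)
    return settings


# Abbreviated, hypothetical excerpt of an edited Makefile (flags shortened).
SAMPLE = """\
TIME_FUNC= -DTIME # active
#TIME_FUNC= -DTIMES # disabled
HZ= 166 # clock ticks
#OPTIMIZE= -Ox -G2 # disabled
OPTIMIZE= -O2 -march=armv7-a # active
"""

if __name__ == "__main__":
    for name, value in active_settings(SAMPLE).items():
        print(f"{name} = {value}")
```

To check a real file, the text could be read with `open("Makefile").read()` and passed to the same function. A variable that does not show up in the result is either commented out or missing.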
[Image: Makefile snapshot (Dhry21.png)]
Compiler Optimization Options
For more detail, see GCC ARM-Options (http://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html).
The options used on lines #39 and #40 tune the Dhrystone build specifically for the armv7 architecture (Cortex-A8 in this case). These optimizations improve the program's performance. Removing them would give a baseline, unoptimized result.
4. Run "make"
Running make in the current directory should produce only warnings, no errors. Below is the output of the make command; the warnings relate to C library functions and can be ignored.
[mjeamiguel@cdot-beagleXM-0-3 dhrystone]$ make
gcc -O2 -march=armv7-a -mtune=cortex-a8 -mfpu=neon -mfloat-abi=softfp -fomit-frame-pointer -DTIME -DHZ=166 dhry_1.c dhry_2.c -o gcc_dry2
dhry_1.c:31: warning: conflicting types for built-in function ‘malloc’
dhry_1.c: In function ‘main’:
dhry_1.c:98: warning: incompatible implicit declaration of built-in function ‘strcpy’
gcc -O2 -march=armv7-a -mtune=cortex-a8 -mfpu=neon -mfloat-abi=softfp -fomit-frame-pointer -DTIME -DHZ=166 -DREG=register dhry_1.c dhry_2.c -o gcc_dry2reg
dhry_1.c:31: warning: conflicting types for built-in function ‘malloc’
dhry_1.c: In function ‘main’:
dhry_1.c:98: warning: incompatible implicit declaration of built-in function ‘strcpy’
Running the benchmark and gathering results
The make command produces two executables named gcc_dry2 and gcc_dry2reg. The author of this Dhrystone version builds the benchmark twice: gcc_dry2reg uses register variables (it is compiled with -DREG=register) and gcc_dry2 does not. Either one works for the benchmark, so feel free to try both.
1. Run the executable by typing ./gcc_dry2
The program starts by asking for a number of runs. By convention, about 100000000 (100 million) runs are used on newer machines (>= 1 GHz). There are no rules or standards for this number. Some people derive it from "Dhrystone run time errors", which is beyond the scope of this guide; what matters is the consistency of the result. For consistent results, run Dhrystone more than 5 times with the same number of runs each time.
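One simple way to judge that consistency is to compare the scores from the repeated runs. The sketch below is an illustration only: the helper `summarize_runs`, the choice of (max - min) / mean as the spread measure, and the scores in `EXAMPLE_SCORES` are assumptions made here, not measured data and not part of Dhrystone.

```python
import statistics


def summarize_runs(scores):
    """Return (mean, relative spread) for repeated Dhrystones/second scores.

    The relative spread is (max - min) / mean, a simple consistency indicator.
    """
    if len(scores) < 2:
        raise ValueError("need at least two runs to judge consistency")
    mean = statistics.mean(scores)
    return mean, (max(scores) - min(scores)) / mean


# Hypothetical scores from six runs, all using the same number of runs.
EXAMPLE_SCORES = [1333333.4, 1331000.0, 1335000.0, 1333000.0, 1332500.0, 1334200.0]

if __name__ == "__main__":
    mean, spread = summarize_runs(EXAMPLE_SCORES)
    print(f"mean = {mean:.1f} Dhrystones/s, spread = {spread:.2%}")
```

A small spread suggests the runs are stable enough to report. A large one suggests repeating the test, for example with more runs or fewer background processes.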
2. Calculate for DMIPS
One common representation of the Dhrystone benchmark is DMIPS (Dhrystone MIPS). It is obtained by dividing the Dhrystone score by 1757 (the number of Dhrystones per second obtained on the VAX 11/780, nominally a 1 MIPS machine).
Given the result:
Microseconds for one run through Dhrystone: 0.8
Dhrystones per Second: 1333333.4
Using the formula:
1333333.4 / 1757 = 758.87 DMIPS
The result shown comes from an actual test on a beagleboardXM with a 1 GHz processor.
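The same calculation can be scripted when results are collected from several runs. The following is a minimal sketch that assumes the program's output contains a "Dhrystones per Second:" line as in the result above. The helper name `parse_dhrystones_per_second` is hypothetical, and `OUTPUT` simply repeats the two result lines quoted in this section.

```python
import re

VAX_DHRYSTONES_PER_SECOND = 1757  # VAX 11/780 reference score from this section

# The two result lines quoted above, as the program might print them.
OUTPUT = """\
Microseconds for one run through Dhrystone: 0.8
Dhrystones per Second: 1333333.4
"""


def parse_dhrystones_per_second(output: str) -> float:
    """Extract the Dhrystones/second figure from the benchmark output text."""
    match = re.search(r"Dhrystones per Second:\s*([0-9.]+)", output)
    if match is None:
        raise ValueError("no 'Dhrystones per Second' line found")
    return float(match.group(1))


if __name__ == "__main__":
    score = parse_dhrystones_per_second(OUTPUT)
    print(round(score / VAX_DHRYSTONES_PER_SECOND, 2))  # 758.87 DMIPS
```

The sketch reads the "Dhrystones per Second" line rather than inverting the microseconds figure. 1 / 0.8 microseconds would give 1250000, which does not match the reported score, presumably because the microseconds value is printed to only one decimal place.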
And...
From ARM White paper
"When first released, the Dhrystone benchmark fulfilled a useful function – at least it gave an alternative indicator to vendors’ literal MIPS ratings. However, more than twenty years later, there are undoubtedly better benchmarks available for measuring processor performance."