Difference between revisions of "Supporting Architectures above armv5tel"

(Project Plan)
([0.1] Optimize and run a benchmark program specifically designed to make use of glibc)
 
(101 intermediate revisions by the same user not shown)
Line 15: Line 15:
 
== Project Contributor(s) ==
 
== Project Contributor(s) ==
 
IRC:<br />
 
IRC:<br />
'''#ubuntu-arm''' (persia)<br />
+
<ul>
'''#gentoo-embedded''' (solar and steev_)
+
<li>'''#ubuntu-arm''' (persia)<br /></li>
 +
<li>'''#gentoo-embedded''' (solar and steev_)</li>
 +
</ul>
 +
 
 +
SENECA:
 +
 
 +
'''Tae Hee (Tyler) Lee'''
 +
 
 +
<blockquote>
 +
My fellow classmate, a great and helpful guy working on the [http://zenit.senecac.on.ca/wiki/index.php/To_Thumb_or_Not_to_Thumb To Thumb or Not to Thumb Project]. We both have the same project objective; to benchmark a system. (his project deals with 16-bit codes) We share access to the test builder cdot-beagleXM-0-3 and even ran tests both at the same time that could have toasted it! Tyler helped me out with with scripting/'makefiles' and using 'screen'.</blockquote>
  
 
== Project Details ==
 
== Project Details ==
This project aims to test the significance of technologies that exist in '''ARMv7''' architecture by conducting different benchmark techniques and compiler optimisation. This project covers performance comparison between '''ARMv5''' and '''ARMv7'''. Results can help the Fedora community decide on how to go about with the compilation of the fedora universe on ARMv7 devices.
+
'''What the project is all about'''
 +
 
 +
Currently, Fedora only supports armv5tel codes. With the release of armv7 architecture (with beagleboardXM) Fedora-ARM is pressed with the decision of upgrading its Fedora Universe to use armv7 code. While it seems logical, re-compiling the whole Fedora package is a strenuous task. Before deciding to recompile the whole universe, Fedora-ARM can test if optimizing certain system binaries to use armv7 architecture provides significant performance difference against the currently used armv5tel codes. The test would clarify if armv7 codes used for armv7 hardware really improves system performance.
 +
 
 +
This project aims for that sole purpose. By running a benchmark and compiler optimizations on system binaries, Fedora-ARM can contrast both technologies and use the results to decide if it's really worth to recompile the whole Fedora Universe to use armv7  codes.
 +
 
 +
Below is a list of technologies by armv7.
  
 
'''ARMv7''' Technologies:
 
'''ARMv7''' Technologies:
Line 29: Line 44:
  
  
There are currently 2 '''ARMv7''' (beagleboard & beagleboard XM) builders in the Fedora ARM farm. These builders are running builds on '''ARMv5tel''' architecture which means optimum performance is not achieved.
+
There are currently 2 '''ARMv7''' (beagleboard & beagleboard XM) builders in the Fedora ARM farm. These builders are running builds on '''ARMv5tel'''. This project will focus in using the beagleboardXM builder <code>cdot-beagleXM-0-3</code>
 +
 
 +
===Specifications for the cdot-beagleboardXM-0-3 builder===
 +
----
 +
[[Image:Bb.jpg|thumb|right|400px| cdot-beagleboardXM-0-3 builder]]
 +
'''beagleboardXM specific'''<br />
 +
<blockquote>[http://beagleboard.org/hardware-xM beagleboardXM hardware page]</blockquote>
 +
'''cat /proc/version'''
 +
<blockquote>Linux version 2.6.32 (ubuntu@ip-10-204-115-71) (gcc version 4.3.3 (GCC) ) #3 PREEMPT Wed Aug 18 15:53:03 UTC 2010</blockquote>
 +
'''cat /proc/cpuinfo'''
 +
<blockquote>
 +
Processor      : ARMv7 Processor rev 2 (v7l)<br />
 +
BogoMIPS        : 515.72<br />
 +
Features        : swp half thumb fastmult vfp edsp thumbee neon vfpv3<br />
 +
CPU implementer : 0x41<br />
 +
CPU architecture: 7<br />
 +
CPU variant    : 0x3<br />
 +
CPU part        : 0xc08<br />
 +
CPU revision    : 2<br />
 +
Hardware        : OMAP3 Beagle Board<br />
 +
Revision        : 0020<br />
 +
Serial          : 0000000000000000<br />
 +
</blockquote>
 +
'''cat /proc/meminfo'''<br />
 +
<blockquote>MemTotal:        498716 kB</blockquote>
 +
 
 +
'''rpm -q glibc'''<br />
 +
<blockquote>glibc-2.11-2.fa3.armv5tel (Subjected for upgrade to armv7)</blockquote>
  
 
== Project Plan ==
 
== Project Plan ==
  
 
Goals for each release:
 
Goals for each release:
*<b><font style="font-size:100%"> 0.1 Optimize and run a benchmark program specifically designed to make use of <code>glibc</code></font></b>
 
  
Any packages compiled for the beagleboards can be installed without optimization. So far, in the case of <code>cdot-beagleXM-0-3</code> packages are compiled without it. In order to make use of ARMv7 architecture features, editing the <code>CFLAGS</code> to use <code>arm</code> optimization options  will let the compiler attempt to improve the performance and/or code size of the program.
+
===[0.1] Optimize and run a benchmark program specifically designed to make use of <code>glibc</code>===
There has been a big focus on arm specific options for gcc recently. In gcc-4.4 the value <code>vfpv3</code> for the <code>-mfpu</code> option was included. More updates are expected for gcc as more and more companies focus their sights on ARM processors.
 
  
Dhrystone is selected as the benchmark program for this release. Several reasons make Dhrystone a good program to test the general processor performance of the beagleboardXM.
+
Any packages compiled for the beagleboards can be installed without optimization. So far, in the case of <code>cdot-beagleXM-0-3</code> packages are compiled without it. Without optimizations, software installed in a system can only run on sub-optimal performance. In order to make use of ARMv7 architecture features, editing the <code>CFLAGS</code> to use <code>arm</code> optimization options  will let the compiler attempt to improve the performance and/or code size of the program; resulting in a more efficient/faster system.
  
 +
The goal of this release is to run a benchmark software named [http://en.wikipedia.org/wiki/Dhrystone Dhrystone] on <code>cdot-beagleXM-0-3</code> and record the results. Three (3) runs are required: '''No optimization''', '''Optimized for armv5tel''', and '''Optimized for armv7'''. Dhrystone is chosen as the benchmark software mainly, to test the general system performance of <code>cdot-beagleXM-0-3</code> and to test how much performance gain can be expected from optimizing a program. Other reasons include:
 +
<blockquote>
 
<b><font style="font-size:100%">Reasons for using Dhrystone</font></b>
 
<b><font style="font-size:100%">Reasons for using Dhrystone</font></b>
 
*ARM® recognizes the program and uses it as a performance attribute of their processors.
 
*ARM® recognizes the program and uses it as a performance attribute of their processors.
 
*Dhrystone provides a more meaningful MIPS (Million Instructions Per Second) because results are compared to a reference machine.
 
*Dhrystone provides a more meaningful MIPS (Million Instructions Per Second) because results are compared to a reference machine.
 
*Dhrystone numbers reflect the performance of the C compiler and libraries more so than the performance of the processor itself. (considered as a weakness of the program)
 
*Dhrystone numbers reflect the performance of the C compiler and libraries more so than the performance of the processor itself. (considered as a weakness of the program)
 +
*Check if armv7 optimization options and armv5tel optimization options differ significantly in program performance
 +
</blockquote>
 +
 +
 +
[[Image:graph.png|thumb|450px|right| A graph showing the overall system performance of cdot-beagleXM-0-3]]
 +
 +
 +
'''Test Result (in DMIPS):'''<br />
 +
 +
Normal                = 758.869322709 DMIPS<br />
 +
Optimized for armv5tel = 1034.82179852 DMIPS<br />
 +
Optimized for armv7    = 1034.82179852 DMIPS<br />
 +
 +
 +
 +
 +
 +
The benchmark graph shows that optimization increased the overall performance of cdot-beagleXM-0-3 by '''36%''' (Normal run vs. Optimized for armv5tel/armv7). The results for both armv5tel and armv7 optimizations are the same. (It can be assumed that the armv5tel glibc impacts the performance of C library dependent programs such as Dhrystone). Another possible reason is that the compiler used is already armv7 optimized (since dhrystone also relies on the compiler efficiency) The data gathered can be used as a reference for conducting 0.2 project release.
 +
<br />
 +
<br />
 +
<br />
 +
<br />
 +
<br />
 +
<br />
 +
<br />
 +
<br />
 +
 +
===[0.2] Install an armv7 glibc and re-run the benchmark using dhrystone===
 +
 +
Since optimizing the benchmark program using both architecture optimizations revealed the same results, optimizing the system binaries and re-running the benchmark should be able to provide a more concrete contrast between the 2 architectures.
 +
 +
The goal of this release is to find out if upgrading the armv5tel glibc would affect the performance of a C library dependent program such as dhrystone. By re-installing glibc using armv7 optimizations and re-running an armv7 optimized dhrystone; a better benchmark result is expected (Higher DMIPS). The result would be beneficial for Fedora-ARM, for it will help the community decide if armv5tel codes should continually be supported.
 +
 +
 +
'''Requires:'''
  
 +
<blockquote>[http://arm.koji.fedoraproject.org/koji/buildinfo?buildID=8721 glibc binaries]</blockquote>
  
Test Result:
+
In order for a successful glibc armv7 build, the file ''/usr/lib/rpm/redhat/rpmrc'' needs edit to use armv7 build options. Using mock; a downloaded glibc source can be rebuilt. Once finished, the binaries can then be installed locally to the system using rpm OR a local repository can be created enabling the use of yum.
  
* 0.2
+
 
* 0.3
+
'''Test Result:'''
 +
 
 +
The Dhrystone (DMIPS) results never changed. The benchmark brought the same, exact number of DMIPS. Although Dhrystone does make use of C library functions and is assumed that the glibc would have effects on the program; the results proved that upgrading the glibc did not bring what's expected.
 +
 
 +
It is proven that armv7 and armv5 arch optimizations provide the same level of performance especially when running C library dependent programs on cdot-beagleXM-0-3 builder. Why is it possible when armv7 architecture is supposed to be better than armv5tel? One big answer is that the system tested was built to use an [http://en.wikipedia.org/wiki/Application_binary_interface ABI] called "softfp". Although beagleboardXM (cortex-a8)supports the "hard floating-point" ABI, The Fedora-ARM currently can't afford to waste time building a system that supports "hardfp" solely on test purposes. To make things a little clearer, cdot-beagleXM-0-3 can't use the technology offered by ARM-cortex-a8 processor because of how the system (down to the lowest level) is built.
 +
 
 +
 
 +
ARM Floating point is a pretty big topic, provided are some links to help you understand more.
 +
 
 +
<ul>
 +
<li>[http://en.wikipedia.org/wiki/IEEE_754-2008 Standard for floating point arithmetic]</li>
 +
<li>[http://www.arm.com/products/processors/technologies/vector-floating-point.php ARM Floating Point]</li>
 +
<li>[https://wiki.linaro.org/Linaro-arm-hardfloat arm-hardfloat wiki]</li>
 +
<li>[http://gcc.gnu.org/wiki/Software_floating_point Software floating point in GCC]</li>
 +
</ul>
 +
 
 +
In conclusion, Fedora-ARM can continue to use armv5tel codes on armv7 machines. Deciding to recompile the whole universe for armv7 is somehow an inconvenient recommendation for now.
 +
 
 +
===[0.3] Future project prospect===
 +
The previous test didn't leave an opportunity for me to work on [0.3] Release. Although the comparison is done, and the results are gathered; One last option to test armv7 technology still remains: "Rebuild everything to use a hardfp ABI!" This recommendation would undoubtedly reveal the performance difference of armv7 against armv5tel; but at the same time would be a big project "not suitable for a single person to work on".
 +
 
 +
I hope that this project page including the [http://zenit.senecac.on.ca/wiki/index.php/Dhrystone_howto Dhrystone How To] page can be of use for future ARM based project reference.
  
 
== Things to learn ==
 
== Things to learn ==
 +
-rpmmacros
 +
 +
-Dhrystone 2.1<br />
 +
-gcc ARM optimizations<br />
 
-Ways of benchmarking ARM processors<br />
 
-Ways of benchmarking ARM processors<br />
 
-gcc install options<br />
 
-gcc install options<br />
 
-Compiling kernel and glibc<br />
 
-Compiling kernel and glibc<br />
 
-Familiarization with ARM hardware<br />
 
-Familiarization with ARM hardware<br />
 +
 +
=== HOW TOs ===
 +
[http://zenit.senecac.on.ca/wiki/index.php/Dhrystone_howto A guide for using Dhrystone benchmark]
  
 
== Project News ==
 
== Project News ==
 +
''December 16th, 2010'' - [0.2] and [0.3] Release updated and finalized
 +
 +
''December 15th, 2010'' - Added the [http://zenit.senecac.on.ca/wiki/index.php/Dhrystone_howto Dhrystone How to Page]
 +
 +
''December 9th, 2010'' - Project page update (0.1 Release)
 +
 +
''November 22nd, 2010'' - Release 0.1 test results posted
 +
 
''November 4th, 2010'' - Compiler optimization options ready for testing, project page updated
 
''November 4th, 2010'' - Compiler optimization options ready for testing, project page updated
  

Latest revision as of 18:51, 22 December 2010

Project Name

Supporting Architectures Above armv5tel

Project Description

The armv5tel architecture version is supported by some common devices such as the Marvell Feroceon processors used in most plug computers. However, later versions of the architecture support advanced features, and using armv5tel code on those processors may result in suboptimal performance.

This project will research ways that Fedora-ARM could support higher processor versions effectively without recompiling the entire Fedora package universe -- for example, by providing an armv7 + hardfp glibc and kernel. This involves performance testing across multiple devices.

Initial contacts: ctyler, PaulW

Project Leader(s)

Mark Eamiguel

Project Contributor(s)

IRC:

  • #ubuntu-arm (persia)
  • #gentoo-embedded (solar and steev_)

SENECA:

Tae Hee (Tyler) Lee

A classmate working on the To Thumb or Not to Thumb project. Our projects share the same objective, benchmarking a system; his focuses on 16-bit Thumb code. We share access to the test builder cdot-beagleXM-0-3 and once ran tests on it at the same time, which could have overloaded it. Tyler helped me with scripting, makefiles, and using screen.

Project Details

What the project is all about

Currently, Fedora-ARM supports only armv5tel code. With armv7 hardware such as the beagleboardXM now available, Fedora-ARM has to decide whether to upgrade the Fedora universe to armv7 code. While that seems logical, recompiling every Fedora package is a strenuous task. Before committing to it, Fedora-ARM can test whether optimizing selected system binaries for armv7 gives a significant performance difference over the armv5tel code currently in use. The test would show whether armv7 code on armv7 hardware really improves system performance.

That is the sole purpose of this project. By benchmarking binaries built with different compiler optimizations, Fedora-ARM can compare the two architecture versions and use the results to decide whether recompiling the whole Fedora universe for armv7 is worth the effort.

Below is a list of technologies introduced with armv7.

ARMv7 Technologies:


There are currently two ARMv7 builders (beagleboard and beagleboard XM) in the Fedora ARM farm. These builders run armv5tel builds. This project focuses on the beagleboardXM builder cdot-beagleXM-0-3.

Specifications for the cdot-beagleboardXM-0-3 builder


cdot-beagleboardXM-0-3 builder

beagleboardXM specific

beagleboardXM hardware page

cat /proc/version

Linux version 2.6.32 (ubuntu@ip-10-204-115-71) (gcc version 4.3.3 (GCC) ) #3 PREEMPT Wed Aug 18 15:53:03 UTC 2010

cat /proc/cpuinfo

Processor  : ARMv7 Processor rev 2 (v7l)
BogoMIPS  : 515.72
Features  : swp half thumb fastmult vfp edsp thumbee neon vfpv3
CPU implementer : 0x41
CPU architecture: 7
CPU variant  : 0x3
CPU part  : 0xc08
CPU revision  : 2
Hardware  : OMAP3 Beagle Board
Revision  : 0020
Serial  : 0000000000000000
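
The Features line above is what shows that the board exposes vfpv3 and neon. As a minimal sketch (the function name `cpu_features` is hypothetical, not part of the project), the same check could be scripted by parsing the text of /proc/cpuinfo:

```python
def cpu_features(cpuinfo_text):
    """Return the set of flags found on the 'Features' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()  # no Features line found


# Excerpt of the cpuinfo output listed above.
SAMPLE = """Processor : ARMv7 Processor rev 2 (v7l)
Features : swp half thumb fastmult vfp edsp thumbee neon vfpv3
CPU architecture: 7
"""

features = cpu_features(SAMPLE)
print(sorted(features & {"vfpv3", "neon"}))
```

On the builder itself, the text would come from reading /proc/cpuinfo instead of the sample string.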

cat /proc/meminfo

MemTotal: 498716 kB

rpm -q glibc

glibc-2.11-2.fa3.armv5tel (to be upgraded to armv7)

Project Plan

Goals for each release:

[0.1] Optimize and run a benchmark program specifically designed to make use of glibc

Packages can be compiled for and installed on the beagleboards without optimization, and so far that is how packages on cdot-beagleXM-0-3 have been built. Without optimization, software runs at sub-optimal performance. To make use of ARMv7 architecture features, the CFLAGS can be edited to include ARM optimization options, which let the compiler try to improve the performance and/or code size of the program, resulting in a faster, more efficient system.

The goal of this release is to run the Dhrystone benchmark on cdot-beagleXM-0-3 and record the results. Three runs are required: no optimization, optimized for armv5tel, and optimized for armv7. Dhrystone was chosen mainly to test the general system performance of cdot-beagleXM-0-3 and to see how much performance gain can be expected from optimizing a program. Other reasons include:
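
The CFLAGS change described above could look something like the following sketch. The page does not record the exact options used, so the flag sets below are assumptions built from standard GCC ARM options (they need a GCC release that recognises them), and the source file names `dhry_1.c` and `dhry_2.c` are assumed from the usual Dhrystone 2.1 distribution.

```python
# Hypothetical flag sets: one unoptimized build and one build tuned for each
# architecture version. These are illustrative, not the flags actually used.
BUILDS = {
    "normal": [],
    "armv5tel": ["-O2", "-march=armv5te"],
    "armv7": ["-O2", "-march=armv7-a", "-mtune=cortex-a8",
              "-mfpu=vfpv3", "-mfloat-abi=softfp"],
}


def gcc_command(build, sources, output):
    """Build the gcc argument list for one of the named builds."""
    return ["gcc", *BUILDS[build], "-o", output, *sources]


# Print one compile command per build.
for name in BUILDS:
    print(" ".join(gcc_command(name, ["dhry_1.c", "dhry_2.c"], "dhry_" + name)))
```

Keeping the flag sets in one table makes it easy to confirm that the builds differ only in their architecture options.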

Reasons for using Dhrystone

  • ARM® recognizes the benchmark and quotes Dhrystone figures as a performance attribute of its processors.
  • Dhrystone provides a more meaningful MIPS (Million Instructions Per Second) figure because results are compared to a reference machine.
  • Dhrystone numbers reflect the performance of the C compiler and libraries more than the performance of the processor itself (this is considered a weakness of the benchmark).
  • It can show whether armv7 and armv5tel optimization options differ significantly in program performance.
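
To illustrate the reference-machine comparison mentioned in the list: DMIPS is conventionally obtained by dividing the measured Dhrystones per second by 1757, the score of the VAX 11/780 reference machine. A minimal sketch, with hypothetical function names:

```python
# Dhrystones per second scored by the VAX 11/780, the nominal 1 MIPS machine.
VAX_11_780_DHRYSTONES_PER_SEC = 1757


def dmips(dhrystones_per_second):
    """Convert a raw Dhrystone score into DMIPS."""
    return dhrystones_per_second / VAX_11_780_DHRYSTONES_PER_SEC


def dmips_from_run(runs, elapsed_seconds):
    """Convert a loop count and the time it took into DMIPS."""
    return dmips(runs / elapsed_seconds)


# A machine scoring exactly the reference value rates 1 DMIPS.
print(dmips(1757))
```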


A graph showing the overall system performance of cdot-beagleXM-0-3


Test Result (in DMIPS):

Normal = 758.869322709 DMIPS
Optimized for armv5tel = 1034.82179852 DMIPS
Optimized for armv7 = 1034.82179852 DMIPS



The benchmark graph shows that optimization increased the overall performance of cdot-beagleXM-0-3 by 36% (normal run vs. optimized for armv5tel/armv7). The results for the armv5tel and armv7 optimizations are identical. One possible explanation is that the armv5tel glibc limits the performance of C-library-dependent programs such as Dhrystone. Another is that the compiler used is already armv7-optimized, since Dhrystone also depends on compiler efficiency. The data gathered serves as a reference for the 0.2 release.
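
The 36% figure can be reproduced from the DMIPS values listed above. A small sketch (the helper name `percent_gain` is hypothetical):

```python
# DMIPS results reported for the three runs.
RESULTS_DMIPS = {
    "normal": 758.869322709,
    "armv5tel": 1034.82179852,
    "armv7": 1034.82179852,
}


def percent_gain(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


for name in ("armv5tel", "armv7"):
    gain = percent_gain(RESULTS_DMIPS["normal"], RESULTS_DMIPS[name])
    print(f"{name}: {gain:.1f}% over normal")
```

Both optimized runs work out to roughly 36.4% over the unoptimized run, which matches the rounded figure quoted in the text.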







[0.2] Install an armv7 glibc and re-run the benchmark using dhrystone

Since optimizing the benchmark program for both architectures produced the same results, optimizing the system binaries and re-running the benchmark should provide a more concrete contrast between the two architectures.

The goal of this release is to find out whether upgrading the armv5tel glibc affects the performance of a C-library-dependent program such as Dhrystone. With glibc rebuilt using armv7 optimizations and an armv7-optimized Dhrystone re-run against it, a better benchmark result (higher DMIPS) is expected. The result will help the Fedora-ARM community decide whether armv5tel code should continue to be supported.


Requires:

glibc binaries

For a successful armv7 glibc build, the file /usr/lib/rpm/redhat/rpmrc needs to be edited to use armv7 build options. A downloaded glibc source package can then be rebuilt with mock. Once the build finishes, the binaries can be installed locally with rpm, or a local repository can be created so that yum can be used.
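
The rebuild-and-install sequence above might be scripted roughly as follows. This is only a sketch: the mock configuration name, file names, and directories are hypothetical placeholders, the rpmrc edit is assumed to have been made already, and the function only assembles command lines without running them.

```python
def glibc_rebuild_commands(srpm, mock_config, result_dir, rpm_files=(), use_yum=False):
    """Return argument lists for rebuilding glibc with mock and installing it.

    Nothing is executed here; each list could be passed to subprocess.run().
    """
    commands = [
        # Rebuild the source package in a clean chroot.
        ["mock", "-r", mock_config, "--resultdir", result_dir, "--rebuild", srpm],
    ]
    if use_yum:
        # Turn the result directory into a repository that yum can read.
        commands.append(["createrepo", result_dir])
    else:
        # Install or upgrade the freshly built packages directly.
        commands.append(["rpm", "-Uvh", *rpm_files])
    return commands


# Hypothetical names, for illustration only.
for cmd in glibc_rebuild_commands("glibc.src.rpm", "example-arm-config",
                                  "/tmp/glibc-armv7", rpm_files=["glibc.rpm"]):
    print(" ".join(cmd))
```

For the yum route, a repository definition pointing at the result directory would still have to be added before yum can see the packages.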


Test Result:

The Dhrystone results did not change: the benchmark produced exactly the same number of DMIPS. Although Dhrystone uses C library functions, and glibc was therefore assumed to affect the program, upgrading glibc did not bring the expected improvement.

The results indicate that armv7 and armv5tel optimizations provide the same level of performance when running C-library-dependent programs on the cdot-beagleXM-0-3 builder. How is this possible when the armv7 architecture is supposed to be better than armv5tel? One major reason is that the system tested was built to use an ABI called "softfp". Although the beagleboardXM (Cortex-A8) supports the "hard floating-point" ABI, Fedora-ARM currently cannot afford the time to build a system that supports "hardfp" solely for test purposes. In other words, cdot-beagleXM-0-3 cannot use the technology offered by the ARM Cortex-A8 processor because of how the system is built, down to the lowest level.
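
One way to check which floating-point calling convention a binary uses is to look at its ARM build attributes as printed by `readelf -A`. The sketch below assumes the usual binutils output, where hard-float binaries carry a `Tag_ABI_VFP_args: VFP registers` line; the function name and sample text are hypothetical.

```python
def float_abi(readelf_attributes):
    """Classify an ARM ELF file from the text printed by `readelf -A`."""
    for line in readelf_attributes.splitlines():
        if "Tag_ABI_VFP_args" in line and "VFP registers" in line:
            return "hardfp"
    # Without that attribute, arguments are passed in core registers.
    return "soft or softfp"


# Illustrative attribute listings, not captured from the builder.
HARD_SAMPLE = "  Tag_CPU_arch: v7\n  Tag_ABI_VFP_args: VFP registers\n"
SOFT_SAMPLE = "  Tag_CPU_arch: v5TE\n  Tag_ABI_PCS_wchar_t: 4\n"

print(float_abi(HARD_SAMPLE))
print(float_abi(SOFT_SAMPLE))
```

A system built like cdot-beagleXM-0-3 would be expected to fall into the second category for its libraries and programs.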


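ARM floating point is a big topic; the following links provide more background:

  • Standard for floating point arithmetic: http://en.wikipedia.org/wiki/IEEE_754-2008
  • ARM Floating Point: http://www.arm.com/products/processors/technologies/vector-floating-point.php
  • arm-hardfloat wiki: https://wiki.linaro.org/Linaro-arm-hardfloat
  • Software floating point in GCC: http://gcc.gnu.org/wiki/Software_floating_point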

In conclusion, Fedora-ARM can continue to use armv5tel code on armv7 machines. Recompiling the whole universe for armv7 is not a practical recommendation for now.

[0.3] Future project prospect

The previous test left no opportunity for me to work on the [0.3] release. Although the comparison is done and the results are gathered, one option for testing armv7 technology remains: rebuild everything to use a hardfp ABI. This would likely reveal the performance difference between armv7 and armv5tel, but it would be a big project, not suitable for a single person to work on.

I hope that this project page, together with the Dhrystone How To page, can serve as a reference for future ARM-based projects.

Things to learn

-rpmmacros

-Dhrystone 2.1
-gcc ARM optimizations
-Ways of benchmarking ARM processors
-gcc install options
-Compiling kernel and glibc
-Familiarization with ARM hardware

HOW TOs

A guide for using Dhrystone benchmark

Project News

December 16th, 2010 - [0.2] and [0.3] Release updated and finalized

December 15th, 2010 - Added the Dhrystone How to Page

December 9th, 2010 - Project page update (0.1 Release)

November 22nd, 2010 - Release 0.1 test results posted

November 4th, 2010 - Compiler optimization options ready for testing, project page updated

October 19th, 2010 - Chris Tyler explained more about the project, including goal 0.1

October 15th, 2010 - Project page updated (Things to learn)