Winter 2014 SPO600 Weekly Schedule

10,027 bytes added, 01:00, 5 September 2014
[[Category:Winter 2014 SPO600]]
{{Admon/obsolete|the [[Current SPO600 Weekly Schedule]]}} <!-- {{Admon/important|It's Alive!|This [[SPO600]] weekly schedule will be updated as the course proceeds - dates and content are subject to change. The cells in the summary table will be linked to relevant resources and labs as the course progresses.}}-->
== Summary Table ==
|4||Jan 27||[[#Tuesday (Jan 28)|Lab 3 results, inline assembler, and compiler optimizations]]||[[#Friday (Jan 31)|Analyzing a codebase for assembler and non-portable code]]||[[#Week 3 Deliverables|Blog post about codebase analysis]]
|-
|5||Feb 3||[[#Tuesday (Feb 4)|Memory Barriers and Atomics]]||[[#Friday (Feb 7)|Potential Project Analysis]]||[[#Week 5 Deliverables|Blog about your selected projects]]
|-
|6||Feb 10||[[#Tuesday (Feb 11)|Architecture-specific Code for Performance]]||Group hack session - Porting||[[#Week 6 Deliverables|Identify and contact your upstream communities]]
|-
|7||Feb 17||Portability - Removing platform-specific code||Group hack session - Portability||Remove platform-specific code from your projects
|-
|8||Mar 3||Project Work||Project Work||Get code into review
|-
|9||Mar 10||[[#Tuesday (March 11)|Status Update]]||[[#Friday (March 14)|Foundation Models]]||[[#Week 9 Deliverables|Install and Test With Foundation Model]]
|-
|10||Mar 17||[[#Tuesday (March 18)|Profiling]]||Baseline Profiling||[[#Week 10 Deliverables|Post baseline stats for your software]]
|-
|11||Mar 24||Optimizing Code||Group hack - Profiling and optimizing||Code review update
|-
|12||Mar 31||Using compiler optimizations||Project Work||Code review update
|-
|13||Apr 7||Final Presentations||(No class - Exams start)||Code accepted upstream
|-style="background: #f0f0ff"
|Exam Week||Apr 14||colspan="3" align="center"|Exam Week - No exam in this course!
* '''Reminder:''' Week 1-3 blog posts are due for marking on Friday, January 31.
* Blog about the [[Codebase Analysis Lab]]
 
== Week 5 ==
 
=== Tuesday (Feb 4) ===
 
Platform-specific code is often used to implement '''Memory Barriers''' and '''Atomic Operations'''.
 
==== Memory Barriers ====
'''Memory Barriers''' ensure that memory accesses are sequenced so that multiple threads, processes, cores, or IO devices see a predictable view of memory.
* Leif Lindholm provides an excellent explanation of memory barriers.
** Blog series - I recommend this series, especially the introduction, as a very clear explanation of memory barrier issues.
*** Part 1 - [http://community.arm.com/groups/processors/blog/2011/03/22/memory-access-ordering--an-introduction Memory Access Ordering - An Introduction]
*** Part 2 - [http://community.arm.com/groups/processors/blog/2011/04/11/memory-access-ordering-part-2--barriers-and-the-linux-kernel Memory Access Ordering Part 2 - Barriers and the Linux Kernel]
*** Part 3 - [http://community.arm.com/groups/processors/blog/2011/10/19/memory-access-ordering-part-3--memory-access-ordering-in-the-arm-architecture Memory Access Ordering Part 3 - Memory Access Ordering in the ARM Architecture]
** Presentation at Embedded Linux Conference 2010 (Note: Acquire/Release in C++11 and ARMv8 aarch64 appeared after this presentation):
*** [http://elinux.org/images/f/fa/Software_implications_memory_systems.pdf Slides]
*** [http://free-electrons.com/pub/video/2010/elce/elce2010-lindholm-memory-450p.webm Video]
* [http://www.rdrop.com/users/paulmck/scalability/paper/whymb.2010.07.23a.pdf Memory Barriers - A Hardware View for Software Hackers] - This is a highly-rated paper that explains memory barrier issues - as the title suggests, it is designed to describe the hardware origin of the problem to software developers. Despite the fact that it is an introduction to the topic, it is still very technical.
* [http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka14041.html ARM Technical Support Knowledge Article - In what situations might I need to insert memory barrier instructions?] - Note that there are some additional mechanisms present in ARMv8 aarch64, including Acquire/Release.
* [https://www.kernel.org/doc/Documentation/memory-barriers.txt Kernel Documentation on Memory Barriers] - discusses the memory barrier issue generally, and the solutions used within the Linux kernel. This is part of the kernel documentation.
* Acquire-Release mechanisms
** [http://blogs.msdn.com/b/oldnewthing/archive/2008/10/03/8969397.aspx MSDN Blog Post] with a very clear explanation of Acquire-Release.
** [http://preshing.com/20130922/acquire-and-release-fences/ Preshing on Programming post] with a good explanation.
** [http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.genc010197a/index.html ARMv8 Instruction Set Architecture Manual] (ARM InfoCentre registration required) - See the section on Acquire/Release and Load/Store.
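
The acquire/release idea described in the resources above can be sketched in portable C using the C11 <code>&lt;stdatomic.h&gt;</code> header (the C counterpart of the C++11 facility; it needs a compiler that ships this header, for example GCC 4.9 or later). This is a minimal illustration, not code from any of the linked articles: the names <code>payload</code>, <code>ready</code>, <code>publish</code>, and <code>try_consume</code> are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state: ordinary data plus a flag that publishes it. */
static int payload;                /* plain, non-atomic data */
static atomic_bool ready = false;  /* set once payload is valid */

/* Producer side: write the data, then set the flag with release semantics.
 * The release store prevents the payload write from being reordered after
 * the flag write. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer side: read the flag with acquire semantics. If the flag is seen
 * as set, the payload write that preceded the release store is visible too.
 * Returns true and fills *out when data is available, false otherwise. */
bool try_consume(int *out)
{
    assert(out != NULL);
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

In a real program <code>publish()</code> would run in one thread and <code>try_consume()</code> would be polled from another. On aarch64 the compiler can typically map these operations to the load-acquire and store-release instructions rather than full barriers, though the exact code generated depends on the compiler version.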
 
==== Atomics ====
'''Atomics''' are operations which must be completed in a single step (or appear to be completed in a single step) without potential interruption.
* Wikipedia has a good basic overview of the need for atomicity in the article on [http://en.wikipedia.org/wiki/Linearizability Linearizability].
* GCC provides intrinsics (built-in functions) for atomic operations, as documented in the GCC manual:
** [http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/_005f_005fsync-Builtins.html#_005f_005fsync-Builtins Legacy __sync Built-in Functions for Atomic Memory Access]
** [http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/_005f_005fatomic-Builtins.html#_005f_005fatomic-Builtins Built-in functions for memory model aware atomic operations]
* The Fedora project has some guidelines/recommendations for the use of these GCC builtins:
** http://fedoraproject.org/wiki/Architectures/ARM/GCCBuiltInAtomicOperations
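
As a rough sketch of how the GCC builtins listed above are used, the following C code increments a counter atomically with both the memory-model-aware <code>__atomic</code> form and the legacy <code>__sync</code> form, and updates a shared maximum with a compare-and-swap loop. The function names (<code>counter_increment</code>, <code>counter_increment_legacy</code>, <code>store_max</code>) are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Atomically add 1 to *counter and return the new value.
 * The last argument selects the memory ordering. */
long counter_increment(long *counter)
{
    assert(counter != NULL);
    return __atomic_add_fetch(counter, 1, __ATOMIC_SEQ_CST);
}

/* Legacy form: __sync builtins take no ordering argument and act as a
 * full barrier. */
long counter_increment_legacy(long *counter)
{
    assert(counter != NULL);
    return __sync_add_and_fetch(counter, 1);
}

/* Atomically raise *maximum to at least `candidate` using a
 * compare-and-swap loop. Returns true if this call changed the value. */
bool store_max(long *maximum, long candidate)
{
    assert(maximum != NULL);
    long seen = __atomic_load_n(maximum, __ATOMIC_RELAXED);
    while (candidate > seen) {
        /* On failure, `seen` is refreshed with the current value and the
         * loop re-checks whether an update is still needed. */
        if (__atomic_compare_exchange_n(maximum, &seen, candidate,
                                        true, /* weak: may fail spuriously */
                                        __ATOMIC_ACQ_REL, __ATOMIC_RELAXED))
            return true;
    }
    return false;
}
```

These functions may be called concurrently from several threads on the same variable. Because the builtins are provided by the compiler, the same source builds on x86_64 and aarch64 without any inline assembler.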
 
=== Friday (Feb 7) ===
 
==== Hack Session: Potential Project Analysis ====
 
Select a project from the [[Winter 2014 SPO600 Software List]] and perform these steps:
# Edit that page to put your name in the "Claimed by" column.
# Investigate the package to determine:
#* If the current version has been built for ARM (e.g., exists in the Fedora aarch64 port - fastest way to test is to use 'yum' inside the arm64 emulation environment on Ireland)
#* What the platform-specific code in the software does
#* Whether portable work-arounds exist
#* The need for an aarch64 port or for platform-specific code elimination
#* Opportunities for optimization
#* The amount of work involved in porting and optimizing, and your skills for performing that work
# Based on the result of your investigation, decide on your interest in the project.
#* If you wish to choose this project for yourself, place it on your row in the [[Winter 2014 SPO600 Participants|Participants]] page.
#* If you do not wish to choose this project, remove your name from the "Claimed by" column in the [[Winter 2014 SPO600 Software List|Software List]] page.
# Repeat until you have two packages.
 
{{Admon/note|Overload|It is strongly recommended that you choose two projects with a total scope sum of 0-1. If you wish to try a higher or lower sum, or more or fewer than two projects, please talk to your professor.}}
 
{{Admon/tip|RPM Packages|For software that is present in the rpmfusion repositories but not in Fedora, you can use <code>yumdownloader --source ''packagename''</code> to grab the source RPM and then examine it using the RPM tools. See [[RPM Packaging Process]] for information.}}
 
=== Week 5 Deliverables ===
 
* Blog about your two selected projects, including your detailed initial analysis of them.
** You may want to break this into a couple of posts - e.g., post about your first package while you're working on your second.
** Feel free to also blog about why you did '''not''' choose particular packages, too.
 
== Week 6 ==
 
=== Tuesday (Feb 11) ===
 
* Architecture-specific code for Performance
** Sometimes assembler is used in a C/C++ program for performance. However, modern versions of C/C++ (such as C++11) and recent compilers provide portable ways of accessing high-performance processor capabilities, such as Single Instruction/Multiple Data (SIMD) instructions (known by marketing names such as SSE, Neon, MMX, 3DNow, or AltiVec on various processors).
** Linaro engineer Matthew Gretton-Dann gave a good presentation on [http://www.linaro.org/linaro-blog/2013/09/20/introduction-to-porting-and-optimising-code/ Porting and Optimizing Code] for aarch64. The vectorization portion, beginning at 28:10, provides a good introduction to SIMD and autovectorization using GCC on aarch64. (Note that the earlier portion of the presentation includes good information about atomics.)
*** [http://www.youtube.com/watch?v=epzYErIIx0Y YouTube Video] direct link
*** [http://www.linaro.org/assets/common/campus-party-presentation-Sept_2013.pdf Slides] direct link
** Note that in the presentation above, Matthew goes beyond portable code (using compiler-specific, architecture-specific intrinsics) while stopping short of assembler. It is possible to achieve almost all of the performance gains without becoming architecture-specific, and most of those gains without becoming compiler-specific either.
* For full details on the SIMD instructions in aarch64, refer to the [http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.genc010197a/index.html ARMv8 Instruction Set Overview], particularly section 5.7.
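
To make the portability levels discussed above concrete, here is a small C sketch with hypothetical function names. <code>scale_add</code> is plain C99 that GCC can usually autovectorize when optimization is enabled (<code>-O3</code> turns on <code>-ftree-vectorize</code>), while <code>scale_add4</code> uses GCC's generic vector extension, which is compiler-specific but not tied to any one architecture.

```c
#include <assert.h>
#include <stddef.h>

/* Portable C99: y[i] += a * x[i].
 * `restrict` tells the compiler that x and y do not overlap, which makes
 * it easier for the autovectorizer to emit SIMD instructions. */
void scale_add(float *restrict y, const float *restrict x, float a, size_t n)
{
    assert(y != NULL && x != NULL);
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* GCC vector extension: a 16-byte vector holding four floats.
 * Compiler-specific, but the same source builds for SSE, NEON, and so on. */
typedef float v4f __attribute__((vector_size(16)));

/* Computes y + a * x on four lanes at once. */
v4f scale_add4(v4f y, v4f x, float a)
{
    v4f va = { a, a, a, a };
    return y + va * x;
}
```

Neither function mentions an instruction set by name; the choice of SIMD instructions is left to the compiler for whichever target it is building.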
 
=== Week 6 Deliverables ===
* Complete your analysis of your two selected software projects (if you haven't already) - see [[#Week 5|Week 5]]. Blog in detail about your findings.
* Identify the upstream communities that develop and maintain the software you have selected to work on. Figure out how they are structured, how they communicate, how code is maintained, and how patches are accepted. Introduce yourself to each of the two communities (one for each of the two software projects you have selected). Blog about your findings.
 
== Week 7 ==
* Project Work
 
== Week 8 ==
* Project Work ([[User:Chris Tyler|Chris Tyler]] is at [http://www.linaro.org/connect-lca14 Linaro Connect] this week).
* Aim at getting your code changes upstream to your communities
 
== Week 9 ==
=== Tuesday (March 11) ===
* Status updates
* Update from Linaro Connect
* Discussion of useful tools
** screen
** time
 
=== Friday (March 14) ===
* Comparison of Emulation
** QEMU
** Fast Model and Foundation Model
* Install and configure the Foundation Model
** [[:fedora:Architectures/ARM/AArch64/QuickStart|Fedora AArch64 Quick Start]]
** [http://www.linaro.org/engineering/engineering-projects/armv8 Linaro Foundation Model Instructions]
* Baseline Benchmarking
 
==== Resources ====
* Foundation Model
** [http://www.arm.com/products/tools/models/fast-models/ ARM Fast Models] - Note that "fast" here refers to the modelling approach, not execution speed!
* Benchmarking
** [http://www.tokutek.com/wp-content/uploads/2013/05/20130424-percona-live-benchmarking.pdf Benchmarking Talk by Tim Callaghan]
 
=== Week 9 Deliverables ===
* Set up the Foundation Model
* Upstream your proposed code changes
* Blog about your work
 
== Week 10 ==
 
=== Tuesday (March 18) ===
* Profiling with <code>gprof</code>
** Build with profiling enabled (<code>-pg</code>)
** Run the profile-enabled executable
** Analyze the data in the <code>gmon.out</code> file
*** <code>gprof ''nameOfBinary''</code> # Displays text profile including call graph
*** <code>gprof ''nameOfBinary'' | gprof2dot | dot | display -</code> # Displays visualization of call graph
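
If you want something small to practise the steps above on before profiling your own project, a workload along the following lines should give <code>gprof</code> an obvious hot spot to report. This is a made-up example (the names <code>prefix_total_slow</code>, <code>prefix_total_fast</code>, and <code>run_workload</code> are hypothetical); the build commands in the comment assume the file is saved as <code>workload.c</code> and that you add a <code>main()</code> which calls <code>run_workload()</code>.

```c
/* Suggested build and run, assuming a main() that calls run_workload(20000):
 *   gcc -pg -O0 -o workload workload.c
 *   ./workload                 (writes gmon.out in the current directory)
 *   gprof workload gmon.out
 * -O0 keeps the helper functions from being inlined, so each one appears
 * separately in the profile. */
#include <assert.h>
#include <stddef.h>

/* Deliberately slow, O(n^2): re-adds every earlier element for each i. */
static unsigned long prefix_total_slow(const unsigned *v, size_t n)
{
    unsigned long total = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j <= i; j++)
            total += v[j];
    return total;
}

/* Same result in O(n): carries the running sum forward. */
static unsigned long prefix_total_fast(const unsigned *v, size_t n)
{
    unsigned long total = 0, running = 0;
    for (size_t i = 0; i < n; i++) {
        running += v[i];
        total += running;
    }
    return total;
}

/* Fills a buffer with small values and runs both versions on it.
 * Returns the sum of all prefix sums. */
unsigned long run_workload(size_t n)
{
    enum { MAX_N = 20000 };
    static unsigned values[MAX_N];

    assert(n <= MAX_N);
    for (size_t i = 0; i < n; i++)
        values[i] = (unsigned)(i % 7);

    unsigned long slow = prefix_total_slow(values, n);
    unsigned long fast = prefix_total_fast(values, n);
    assert(slow == fast);  /* both versions must agree */
    return slow;
}
```

In the flat profile, nearly all of the time should be attributed to <code>prefix_total_slow</code>, which is the kind of result you are looking for when deciding where optimization effort in your own software will pay off.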
 
==== Resources ====
* [https://sourceware.org/binutils/docs-2.16/gprof/ GProf Manual]
* [http://www.thegeekstuff.com/2012/08/gprof-tutorial/ Profiling with GProf]
 
=== Friday (March 21) ===
* Gather baseline statistics for your software
 
=== Week 10 Deliverables ===
* Blog your baseline benchmark results
