Difference between revisions of "SPO600 Vectorization Lab"

From CDOT Wiki
 
(10 intermediate revisions by the same user not shown)
Line 1: Line 1:
[[Category:SPO600 Labs]]
+
[[Category:SPO600 Labs - Retired]]
 
{{Admon/lab|Purpose of this Lab|This lab is designed to explore single instruction/multiple data (SIMD) vectorization, and the auto-vectorization capabilities of the GCC compiler.}}
 
{{Admon/lab|Purpose of this Lab|This lab is designed to explore single instruction/multiple data (SIMD) vectorization, and the auto-vectorization capabilities of the GCC compiler.}}
 +
{{Admon/tip|Tiny Lab|This is intended to be a very short lab. Don't overcomplicate it!}}
 +
{{Admon/important|This lab is not used in the current semester.|Please refer to the other labs in the [[:Category:SPO600 Labs|SPO600 Labs]] category.}}
  
== Lab 6 ==
 
  
# Write a short program that creates two 1000-element integer arrays and fills them with random numbers, then sums those two arrays to a third array, and finally sums the third array to a long int and prints the result.
+
== Optional Lab (Recommended!) ==
# Compile this program on [[SPO600 Servers#AArch64|aarchie]] in such a way that the code is auto-vectorized.
+
 
 +
# Write a short program that creates two 1000-element integer arrays and fills them with random numbers in the range -1000 to +1000, then sums those two arrays element-by-element to a third array, and finally sums the third array and prints the result.
 +
# Compile this program on one of the AArch64/ARM64 [[SPO600 Servers]] in such a way that the code is auto-vectorized.
 
# Annotate the emitted code (i.e., obtain a disassembly via <code>objdump -d</code> and add comments to the instructions in <code>&lt;main&gt;</code> explaining what the code does).
 
# Annotate the emitted code (i.e., obtain a disassembly via <code>objdump -d</code> and add comments to the instructions in <code>&lt;main&gt;</code> explaining what the code does).
# Review the vector instructions for AArch64. Find a way to scale an array of sound samples (see Lab 5) by a factor between 0.000-1.000 using SIMD. (Note: you may need to convert some data types).
+
# Write a blog post discussing your findings. Include:
# '''Write a blog post discussing your findings'''. Include:
 
 
#* The source code
 
#* The source code
 
#* The compiler command line used to build the code
 
#* The compiler command line used to build the code
#* Your annotated disassembly listing
+
#* Your annotated disassembly listing - '''Prove that the code is vectorized''', for example, by pointing out the use of vector registers and SIMD instructions.
 
#* Your reflections on the experience and the results
 
#* Your reflections on the experience and the results
#* Your proposed volume-sampling-via-SIMD solution.
 
  
 
=== Resources ===
 
=== Resources ===
 
* [https://gcc.gnu.org/projects/tree-ssa/vectorization.html Auto-Vectorization in GCC] - Main project page for the GCC auto-vectorizer.
 
* [https://gcc.gnu.org/projects/tree-ssa/vectorization.html Auto-Vectorization in GCC] - Main project page for the GCC auto-vectorizer.
 
* [http://locklessinc.com/articles/vectorize/ Auto-vectorization with gcc 4.7] - An excellent discussion of the capabilities and limitations of the GCC auto-vectorizer, intrinsics for providing hints to GCC, and other code pattern changes that can improve results. Note that there has been some improvement in the auto-vectorizer since this article was written. '''This article is strongly recommended.'''
 
* [http://locklessinc.com/articles/vectorize/ Auto-vectorization with gcc 4.7] - An excellent discussion of the capabilities and limitations of the GCC auto-vectorizer, intrinsics for providing hints to GCC, and other code pattern changes that can improve results. Note that there has been some improvement in the auto-vectorizer since this article was written. '''This article is strongly recommended.'''
 +
* [https://software.intel.com/sites/default/files/8c/a9/CompilerAutovectorizationGuide.pdf Intel (Auto)Vectorization Tutorial] - this deals with the Intel compiler (ICC) but the general technical discussion is valid for other compilers such as gcc and llvm

Latest revision as of 11:52, 2 October 2019

Purpose of this Lab
This lab is designed to explore single instruction/multiple data (SIMD) vectorization, and the auto-vectorization capabilities of the GCC compiler.
Tiny Lab
This is intended to be a very short lab. Don't overcomplicate it!
This lab is not used in the current semester.
Please refer to the other labs in the SPO600 Labs category.


Optional Lab (Recommended!)

  1. Write a short program that creates two 1000-element integer arrays and fills them with random numbers in the range -1000 to +1000, then sums those two arrays element-by-element to a third array, and finally sums the third array and prints the result.
  2. Compile this program on one of the AArch64/ARM64 SPO600 Servers in such a way that the code is auto-vectorized.
  3. Annotate the emitted code (i.e., obtain a disassembly via objdump -d and add comments to the instructions in <main> explaining what the code does).
  4. Write a blog post discussing your findings. Include:
    • The source code
    • The compiler command line used to build the code
    • Your annotated disassembly listing - Prove that the code is vectorized, for example, by pointing out the use of vector registers and SIMD instructions.
    • Your reflections on the experience and the results

Resources

  • Auto-Vectorization in GCC - Main project page for the GCC auto-vectorizer.
  • Auto-vectorization with gcc 4.7 - An excellent discussion of the capabilities and limitations of the GCC auto-vectorizer, intrinsics for providing hints to GCC, and other code pattern changes that can improve results. Note that there has been some improvement in the auto-vectorizer since this article was written. This article is strongly recommended.
  • Intel (Auto)Vectorization Tutorial - This deals with the Intel compiler (ICC), but the general technical discussion is valid for other compilers such as GCC and LLVM.