SPO600 Vectorization Lab
Revision as of 09:46, 12 February 2016
Lab 6
- Write a short program that creates two 1000-element integer arrays and fills them with random numbers, then sums those two arrays to a third array, and finally sums the third array to a long int and prints the result.
- Compile this program on aarchie in such a way that the code is auto-vectorized.
- Annotate the emitted code (i.e., obtain a disassembly via objdump -d and add comments to the instructions in <main> explaining what the code does).
- Review the vector instructions for AArch64. Find a way to scale an array of sound samples (see Lab 5) by a factor between 0.000 and 1.000 using SIMD. (Note: you may need to convert some data types.)
- Write a blog post discussing your findings. Include:
  - The source code
  - The compiler command line used to build the code
  - Your annotated disassembly listing
  - Your reflections on the experience and the results
  - Your proposed volume-sampling-via-SIMD solution.
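As a starting point for the first step, the array work can be sketched as two simple counted loops, which is the shape the GCC auto-vectorizer tends to handle best. This is only a sketch under assumptions: the function names fill_random and add_and_total, the seed parameter, and the 0-999 value range are hypothetical choices, not part of the lab.

```c
#include <stdlib.h>

#define N 1000  /* array length given in the lab description */

/* Fill both input arrays with pseudo-random values.
 * The 0..999 range is an assumption chosen so the final sum
 * cannot overflow; the lab does not specify a range. */
void fill_random(int *a, int *b, unsigned seed)
{
    srand(seed);
    for (int i = 0; i < N; i++) {
        a[i] = rand() % 1000;
        b[i] = rand() % 1000;
    }
}

/* Sum a and b element-wise into c, then reduce c to a single long.
 * The loops are kept separate and free of function calls, and the
 * pointers are marked restrict, so the compiler can prove the
 * arrays do not overlap and is free to vectorize each loop. */
long add_and_total(const int *restrict a, const int *restrict b,
                   int *restrict c)
{
    long total = 0;

    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    for (int i = 0; i < N; i++)
        total += c[i];

    return total;
}
```

A main function that declares the three arrays, calls these two functions, and prints the result with printf("%ld\n", ...) completes the program. With GCC, auto-vectorization is enabled at -O3, so a command along the lines of gcc -O3 -fopt-info-vec lab6.c -o lab6 (the file name is hypothetical) is a reasonable first attempt; -fopt-info-vec asks the compiler to report which loops it vectorized, which can then be confirmed in the objdump -d output.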
Resources
- Auto-Vectorization in GCC (https://gcc.gnu.org/projects/tree-ssa/vectorization.html) - Main project page for the GCC auto-vectorizer.
- Auto-vectorization with gcc 4.7 (http://locklessinc.com/articles/vectorize/) - An excellent discussion of the capabilities and limitations of the GCC auto-vectorizer, intrinsics for providing hints to GCC, and other code pattern changes that can improve results. Note that the auto-vectorizer has improved somewhat since this article was written. This article is strongly recommended.
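For the sample-scaling step of the lab, one possible approach (not necessarily the intended one) is to convert the floating-point volume factor to a 16-bit fixed-point value once, then scale every sample with an integer multiply and shift. The sketch below assumes signed 16-bit samples and uses hypothetical function names; it is plain C so that it can be compared against what the auto-vectorizer emits.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert a volume factor in [0.0, 1.0] to Q15 fixed point (0..32767).
 * Out-of-range inputs are clamped; this clamping is an assumption. */
int16_t volume_to_q15(float factor)
{
    if (factor < 0.0f) factor = 0.0f;
    if (factor > 1.0f) factor = 1.0f;
    return (int16_t)(factor * 32767.0f);
}

/* Scale n signed 16-bit samples by a Q15 volume factor.
 * Each product is widened to 32 bits and shifted right by 15 to
 * return to the 16-bit range. Right-shifting a negative value is
 * implementation-defined in C; GCC performs an arithmetic shift. */
void scale_samples(int16_t *restrict out, const int16_t *restrict in,
                   size_t n, int16_t vol_q15)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int16_t)(((int32_t)in[i] * vol_q15) >> 15);
}
```

The data-type conversion mentioned in the lab shows up here as the one-time float to Q15 conversion. On AArch64, the multiply-and-shift pattern corresponds closely to the SQDMULH vector instruction (signed saturating doubling multiply returning the high half), which processes eight 16-bit lanes per 128-bit register; whether the compiler selects it or a widening multiply instead is worth checking in the disassembly.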