Fall 2019 SPO600 Weekly Schedule

1,376 bytes added, 09:37, 2 October 2019
=== Week 5 - Class I ===
* SIMD and Auto-vectorization
** SIMD is an acronym for "Single Instruction, Multiple Data", and refers to a class of instructions which perform the same operation on several separate pieces of data in parallel. SIMD instructions also include related instructions to set up data for SIMD processing, and to summarize results.
** SIMD is based on very wide registers (128 bits to 2048 bits on implementations current as of 2019), and these wide registers can be treated as multiple "lanes" of similar data. These SIMD registers, also called vector registers, can therefore be thought of as small arrays of values.
*#* This works for the basic SIMD operations, but may not be applicable to advanced SIMD instructions, which don't clearly map to C statements.
*#* The compiler will be very cautious about vectorizing code. See the Resources section below for insight into these challenges.
*#** In order to vectorize a loop, among other things, the number of loop iterations needs to be known before the loop starts, the memory layout must meet SIMD alignment requirements, and memory accesses in the loop must not overlap in a way that is affected by vectorization.
*#** The compiler will also calculate a cost for the vectorization.
*#* Vectorization is applied by default only at the -O3 level in most compilers. In GCC:
*#** The main individual feature flag to turn on vectorization is <code>-ftree-vectorize</code> (enabled by default at -O3, disabled at other levels).
*#** You can see all of the vectorization decisions using <code>-fopt-info-vec-all</code>, or you can see just the missed vectorizations using <code>-fopt-info-vec-missed</code> (which is usually what you want to focus on, because it shows only the loops where vectorization was ''not'' applied, and the reason that it was not).
*#* This approach is generally very portable.
*# We can explicitly include SIMD instructions in a C program by using [[Inline Assembly Language|Inline Assembler]]. This is architecture-specific, so it is important to use C preprocessor directives to include/exclude this code, and to use a generic C implementation on any platform for which you are not providing an inline assembler version.
*# ''C Intrinsics'' are function-like capabilities built into the C compiler. There is a group of intrinsics which provide access to SIMD instructions. However, the benefit of using these over inline assembler is debatable. SIMD intrinsics are not portable, and should be guarded with C preprocessor directives, like inline assembler.
* [[SPO600 Vectorization Lab|Vectorization Lab]] (Optional lab - recommended)
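As a rough illustration of the points above, the following sketch adds two arrays of 16-bit integers. It is not taken from the course material: the function name <code>add_int16</code> is hypothetical, and the choice of AArch64 Advanced SIMD (NEON) intrinsics from <code>arm_neon.h</code> is an assumption made for the example. The intrinsics path is selected with a C preprocessor directive, and the plain C loop serves both as the generic implementation on other platforms and as a candidate for auto-vectorization.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>   /* AArch64 Advanced SIMD (NEON) intrinsics */
#endif

/*
 * Hypothetical helper (an assumption for this sketch): dst[i] = a[i] + b[i].
 * "restrict" promises the compiler that the three arrays do not overlap,
 * which removes one common obstacle to vectorization.
 */
void add_int16(int16_t *restrict dst, const int16_t *restrict a,
               const int16_t *restrict b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    size_t i = 0;

#if defined(__aarch64__)
    /* Intrinsics path: a 128-bit vector register holds 8 lanes of int16_t. */
    for (; i + 8 <= n; i += 8) {
        int16x8_t va = vld1q_s16(a + i);         /* load 8 lanes from a */
        int16x8_t vb = vld1q_s16(b + i);         /* load 8 lanes from b */
        vst1q_s16(dst + i, vaddq_s16(va, vb));   /* lane-wise add, store */
    }
#endif

    /*
     * Generic C: handles the whole array on other platforms and any
     * leftover elements on AArch64. The iteration count is known before
     * the loop starts, so the compiler may choose to vectorize it.
     */
    for (; i < n; i++) {
        dst[i] = (int16_t)(a[i] + b[i]);
    }
}
```

Assuming the code is saved in a file named <code>add.c</code> (a hypothetical name), compiling it with <code>gcc -O3 -fopt-info-vec-missed -c add.c</code> should report any loops that GCC decided not to vectorize, along with the reason.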
=== Week 5 - Class II ===