Fall 2022 SPO600 Weekly Schedule

6,689 bytes added, 00:06, 13 November 2022
|6||Oct 10||[[#Week 6 - Class I|Mid-semester Sync Discussion]]||[[#Week 6 - Class II|Algorithm Selection / In-line Assembler / SIMD]]||[[#Week 6 Deliverables|Lab 5]]
|-
|7||Oct 17||[[#Week 7 - Class I|Exploring 64-bit Code]]||[[#Week 7 - Class II|SVE2]]||[[#Week 7 Deliverables|Wrap up lab 5]]
|-
|Reading||Oct 24||style="background: #f0f0ff" colspan="3" align="center"|Reading Week
|-
|8||Oct 31||[[#Week 8 - Class I|Optimization Trade-Offs / Algorithm Selection / Inline Assembler / SIMD]]||[[#Week 8 - Class II|Scalable Vector Extensions (SVE/SVE2) via Inline Assembler and C Intrinsics]]||[[#Week 8 Deliverables|Lab 6, October blog posts]]
|-
|9||Nov 7||[[#Week 9 - Class I|iFunc & Project Overview]]||[[#Week 9 - Class II|Project Detail]]||[[#Week 9 Deliverables|Blog about iFunc and your project work]]
|-
|10||Nov 14||[[#Week 10 - Class I|Project Discussion]]||[[#Week 10 - Class II|Memory Barriers]]||[[#Week 10 Deliverables|Blog about project work]]
==== Lab 5 ====
* [https://wiki.cdot.senecacollege.ca/wiki/SPO600_Algorithm_Selection_Lab#Deliverables Algorithm Selection Lab] (Lab 5)
=== Week 6 Deliverables ===
* [https://wiki.cdot.senecacollege.ca/wiki/SPO600_Algorithm_Selection_Lab Lab 5]

== Week 7 ==

=== Week 7 - Class I ===

==== Video ====
* Video summary will be posted after editing

=== Week 7 - Class II ===

'''Please catch up on course material to this point. If you are fully caught up, you can start to take a look at SVE2:'''

==== Reading ====
* [[SVE2]]

==== SVE2 Demonstration ====
* Code available here: https://github.com/ctyler/sve2-test
* This is an implementation of a very simple program which takes an image file, adjusts the red/green/blue channels of that file, and then writes an output file. Each channel is adjusted by a factor in the range 0.0 to 2.0 (with saturation).
* The image adjustment is performed in the function <code>adjust_channels()</code> in the file <code>adjust_channels.c</code>. There are four implementations:
*# A basic (naive) implementation in C. Although this is a very basic implementation, it is potentially subject to autovectorization.
*# An implementation using inline assembler for SVE2 with structure loads.
*# An implementation using inline assembler for SVE2 with an interleaved factor table.
*# An implementation using ACLE compiler intrinsics.
* The implementation built depends on the value of the <code>ADJUST_CHANNEL_IMPLEMENTATION</code> macro.
* The provided Makefile will build four versions of the binary -- one using each of the four implementations -- and it will run through 3 tests with each binary. The tests use the input image file <code>tests/input/bree.jpg</code> (a picture of a cat) and place the output in the files <code>tests/output/bree[1234][abc].jpg</code>. The output files are processed with adjustment factors of 0.5/0.5/0.5, 1.0/1.0/1.0, and 2.0/2.0/2.0.
* '''Please examine, build, and test the code, compare the implementations, and note how it works - there are extensive comments in the code, especially for implementation 2.'''
* Your observations about the code might make a good blog post!
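
The per-channel adjustment with saturation described above can be sketched in plain C. This is only an illustration of the idea, not the code from the repository: the function names <code>scale_saturate()</code> and <code>adjust_channels_sketch()</code>, the packed R,G,B byte layout, and the parameter list are all assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Scale one 8-bit channel value by a factor and saturate the result
 * to the range 0..255 instead of letting it wrap around. */
static uint8_t scale_saturate(uint8_t value, float factor)
{
    float scaled = (float)value * factor;
    if (scaled > 255.0f) {
        return 255;
    }
    if (scaled < 0.0f) {
        return 0;
    }
    return (uint8_t)scaled; /* truncates toward zero */
}

/* Hypothetical naive adjustment: the image is assumed to be stored as
 * packed R,G,B bytes, three bytes per pixel. Each factor is expected
 * to be in the range 0.0 to 2.0. */
void adjust_channels_sketch(uint8_t *image, size_t pixel_count,
                            float red, float green, float blue)
{
    for (size_t i = 0; i < pixel_count; i++) {
        image[3 * i + 0] = scale_saturate(image[3 * i + 0], red);
        image[3 * i + 1] = scale_saturate(image[3 * i + 1], green);
        image[3 * i + 2] = scale_saturate(image[3 * i + 2], blue);
    }
}
```

A simple loop of this shape is the kind of code a compiler may be able to autovectorize, which is why even a naive version is worth comparing against the hand-written SVE2 versions.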
=== Week 7 Deliverables ===
* Complete [[SPO600 64-bit Assembly Language Lab|Lab 4]] and [https://wiki.cdot.senecacollege.ca/wiki/SPO600_Algorithm_Selection_Lab Lab 5]
* Remember that October blogs are due soon.

== Week 8 ==

=== Week 8 - Class I ===

==== Video ====
* [https://web.microsoftstream.com/video/f67c0185-fc67-43fb-ac39-57cae26792a8 SIMD - Edited Summary Video]

=== Week 8 - Class II ===

==== Video ====
* [https://web.microsoftstream.com/video/a6b892e4-b408-4bc7-9fc1-d78e4efb8e0e SVE & SVE2 - Edited Summary Video]

==== Reading ====
* [[SVE2]]

==== SVE2 Demonstration ====
* Code available here: https://github.com/ctyler/sve2-test
** You can clone this to israel.cdot.systems with: <code>git clone https://github.com/ctyler/sve2-test.git</code>
* This is an implementation of a very simple program which takes an image file, adjusts the red/green/blue channels of that file, and then writes an output file. Each channel is adjusted by a factor in the range 0.0 to 2.0 (with saturation).
* The image adjustment is performed in the function <code>adjust_channels()</code> in the file <code>adjust_channels.c</code>. There are four implementations:
*# A basic (naive) implementation in C. Although this is a very basic implementation, it is potentially subject to autovectorization.
*# An implementation using inline assembler for SVE2 with structure loads.
*# An implementation using inline assembler for SVE2 with an interleaved factor table.
*# An implementation using ACLE compiler intrinsics.
* The implementation built depends on the value of the <code>ADJUST_CHANNEL_IMPLEMENTATION</code> macro.
* The provided Makefile will build four versions of the binary -- one using each of the four implementations -- and it will run through 3 tests with each binary. The tests use the input image file <code>tests/input/bree.jpg</code> (a picture of a cat) and place the output in the files <code>tests/output/bree[1234][abc].jpg</code>. The output files are processed with adjustment factors of 0.5/0.5/0.5, 1.0/1.0/1.0, and 2.0/2.0/2.0.
* '''Please examine, build, and test the code, compare the implementations, and note how it works - there are extensive comments in the code, especially for implementation 2.'''
* Your observations about the code might make a good blog post!

=== Week 8 Deliverables ===
* Continue your blogging
* Include blogging on SVE/SVE2
* The second group of blog posts is due on or before this Sunday (November 6, 11:59 pm)

== Week 9 ==

=== Week 9 - Class I ===

==== Video ====
* Will be posted after editing

==== iFunc ====

GNU iFunc is a facility for handling indirect functions. The basic premise is that you prototype the function to be called, add the <code>ifunc</code> attribute to that prototype, and provide the name of a resolver function. The resolver function is called at program initialization, and returns a pointer to the function to be executed when the function referenced in the prototype is called. The resolver typically picks one of several implementations based on the capabilities of the machine on which the code is running; for example, it could return a pointer to a non-SVE, SVE, or SVE2 implementation of a function based on CPU capabilities (on an AArch64 system) or it could return a pointer to an SSE, SSE2, AVX, or AVX512 implementation (on an x86_64 system).

There is a [https://github.com/ctyler/ifunc-aarch64-demo GitHub repository] available with example iFunc code -- please clone this to [[SPO600 Servers#AArch64:_israel.cdot.systems|israel.cdot.systems]] and build and test the code there. You should see different results if you run the output executable directly (<code>./ifunc-test</code>) and run it through the qemu-aarch64 tool, which will emulate SVE2 capabilities (<code>qemu-aarch64 ./ifunc-test</code>). Make sure you understand how the code works.

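
The prototype / resolver pattern described above can be sketched as follows. This is a minimal illustration, not the code from the demo repository: the names <code>sum()</code>, <code>sum_generic()</code>, <code>sum_unrolled()</code>, and <code>resolve_sum()</code> are hypothetical, and the resolver only checks for AVX2 on x86_64, falling back to the generic version everywhere else. An AArch64 resolver would test the hardware capability bits for SVE or SVE2 in the same place. The sketch assumes GCC or Clang on an ELF system with glibc.

```c
#include <stddef.h>

/* Common signature shared by every implementation (hypothetical). */
typedef int sum_fn(const int *values, size_t count);

/* Baseline implementation that works on any CPU. */
static int sum_generic(const int *values, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* Stand-in for a "tuned" implementation. A real project would use
 * CPU-specific instructions here; this one is just unrolled by two. */
static int sum_unrolled(const int *values, size_t count)
{
    int even = 0, odd = 0;
    size_t i = 0;
    for (; i + 1 < count; i += 2) {
        even += values[i];
        odd += values[i + 1];
    }
    if (i < count) {
        even += values[i];
    }
    return even + odd;
}

/* Resolver: runs once, during program initialization, and returns a
 * pointer to the implementation that later calls to sum() will use. */
static sum_fn *resolve_sum(void)
{
#if defined(__x86_64__)
    __builtin_cpu_init(); /* needed before __builtin_cpu_supports() in a resolver */
    if (__builtin_cpu_supports("avx2")) {
        return sum_unrolled;
    }
#endif
    return sum_generic;
}

/* The prototype carries the ifunc attribute naming the resolver. */
int sum(const int *values, size_t count) __attribute__((ifunc("resolve_sum")));
```

Callers simply call <code>sum()</code>; which implementation runs is decided once by the resolver, so there is no per-call capability check.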
==== Reading/Resources ====
* [https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Common-Function-Attributes.html#index-ifunc-function-attribute GNU iFunc attribute in GCC manual]
* [https://sourceware.org/glibc/wiki/GNU_IFUNC iFunc on the glibc wiki]

=== Week 9 - Class II ===

==== Video ====
* [https://web.microsoftstream.com/video/edc09b0a-1a7f-45d1-a27e-7f4901bba03d Edited summary video] - '''Important!''' This video contains a detailed discussion of the requirements for the course project.
** Project discussion starts at beginning of video
** Demo of what the project needs to do (manually performing the same steps) starts at 0:27:47
** Recap/summary of the demo starts around 1:02:05

==== Project ====
* [[Fall 2022 SPO600 Project]]

=== Week 9 Deliverables ===
* Investigate the iFunc example code
* Blog about your investigation
* Start blogging about your project

<!-- Memory System Design - Paging ; Memory - Cache/Numa ; Memory - Observability, Barriers -->
* The image adjustment is performed in the function <code>adjust_channels()</code> in the file <code>adjust_channels.c</code>. There are four implementations:
*# A basic (naive) implementation in C. Although this is a very basic implementation, it is potentially subject to autovectorization.
*# An implementation using inline assembler for SVE2 with structure loads.
*# An implementation using inline assembler for SVE2 with an interleaved factor table.
*# An implementation using ACLE compiler intrinsics.
* The implementation built depends on the value of the <code>ADJUST_CHANNEL_IMPLEMENTATION</code> macro.
* The provided Makefile will build four versions of the binary -- one using each of the four implementations -- and it will run through 3 tests with each binary. The tests use the input image file <code>tests/input/bree.jpg</code> (a picture of a cat) and place the output in the files <code>tests/output/bree[1234][abc].jpg</code>. The output files are processed with adjustment factors of 0.5/0.5/0.5, 1.0/1.0/1.0, and 2.0/2.0/2.0.
* '''Please examine, build, and test the code, compare the implementations, and note how it works - there are extensive comments in the code, especially for implementation 2.'''
* Your observations about the code might make a good blog post!
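
Selecting an implementation at build time from a macro value can be sketched with the C preprocessor. This is only an assumption about the general technique, not a copy of the repository's code: the macro name comes from the text above, but the function <code>adjust_channels_variant()</code>, the default value of 1, and the numbering of the branches are hypothetical.

```c
/* Default to the naive implementation when the build does not choose one.
 * (The default is an assumption made for this sketch.) */
#ifndef ADJUST_CHANNEL_IMPLEMENTATION
#define ADJUST_CHANNEL_IMPLEMENTATION 1
#endif

/* Exactly one of these definitions is compiled into the binary. */
#if ADJUST_CHANNEL_IMPLEMENTATION == 1
const char *adjust_channels_variant(void) { return "naive C"; }
#elif ADJUST_CHANNEL_IMPLEMENTATION == 2
const char *adjust_channels_variant(void) { return "inline assembler, structure loads"; }
#elif ADJUST_CHANNEL_IMPLEMENTATION == 3
const char *adjust_channels_variant(void) { return "inline assembler, interleaved factor table"; }
#elif ADJUST_CHANNEL_IMPLEMENTATION == 4
const char *adjust_channels_variant(void) { return "ACLE intrinsics"; }
#else
#error "ADJUST_CHANNEL_IMPLEMENTATION must be 1, 2, 3, or 4"
#endif
```

With this layout, a build system can produce one binary per implementation by compiling the same source several times with a different <code>-DADJUST_CHANNEL_IMPLEMENTATION=N</code> option each time.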