Fall 2022 SPO600 Project
Author: Chris Tyler
Revision as of 11:29, 11 November 2022
This page describes the SPO600 project in the Fall 2022 semester.
Overview
The autovectorizer in gcc (and other compilers, such as llvm/clang) has become very good -- to the point that it is automatically enabled at optimization level -O2 (standard optimization level) in recent versions of gcc.
However, there are many different implementations of SIMD instructions on various CPUs -- on 64-bit Arm systems, there's Advanced SIMD, SVE, and SVE2; on x86, there's SSE, SSE2, AVX, AVX512, and more. It is desirable to be able to build a single binary that takes optimal advantage of the available CPU capabilities.
There is a tool provided by the gcc compiler to allow the run-time selection of one of several different implementations of a function (or procedure or method or subroutine): ifunc. However, ifunc requires additional setup by the software developer.
The goal for this project is to produce a proof-of-concept tool that will take code that meets specific conditions and automatically build it with ifunc capability to select between multiple, autovectorized versions of a function, to take advantage of the best SIMD implementation available on the CPU on which the code is running.
Imagine that you have two source files:
  main.c        # contains main() and possibly other functions
  function.c    # contains one function named foo()
The file function.c will be built three times, each time using the autovectorizer, targeting different SIMD implementations for aarch64 (advanced SIMD, SVE, and SVE2). The appropriate ifunc code will be inserted so that the correct build of the foo() function is executed based on the capabilities of the computer on which it runs.
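One way to picture the "built three times" step is as a small build planner. The sketch below is only an illustration under stated assumptions, not the required design: the `VARIANTS` table, the `plan_builds` helper, the `_asimd`/`_sve`/`_sve2` suffixes, and the trick of renaming `foo` with `-D` are all hypothetical choices. It uses `-O3` because gcc 11 enables the autovectorizer at that level, and the `-march` values are the standard gcc spellings for aarch64 with the SVE and SVE2 extensions.

```python
# Hypothetical variant table: symbol suffix -> gcc -march value.
# These names are assumptions for illustration, not part of the assignment.
VARIANTS = {
    "asimd": "armv8-a",
    "sve": "armv8-a+sve",
    "sve2": "armv8-a+sve2",
}


def plan_builds(source="function.c", func="foo", cc="gcc"):
    """Return one compile command (as an argument list) per SIMD variant.

    Each build renames the function with -D (for example foo -> foo_sve)
    so the three object files can be linked into one binary without
    duplicate-symbol errors.
    """
    commands = []
    for suffix, march in VARIANTS.items():
        commands.append([
            cc, "-O3", f"-march={march}",
            f"-D{func}={func}_{suffix}",
            "-c", source,
            "-o", f"{func}_{suffix}.o",
        ])
    return commands


if __name__ == "__main__":
    # Print the planned commands instead of running them.
    for cmd in plan_builds():
        print(" ".join(cmd))
```

Renaming with `-D` is a blunt instrument: it rewrites every occurrence of the token `foo` in function.c and in any header it includes, so a real tool may prefer a different renaming strategy.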
Limitations
Since the goal of this project is to produce a proof-of-concept, these limitations are accepted:
- This tool only operates on aarch64 systems
- There are three targets of interest: machines with advanced SIMD, SVE, and SVE2 capabilities.
- There are only two input source files: one containing main (and optionally other functions) (main.c) and one containing a function to be optimized (function.c)
- Only function.c is built multiple times for different SIMD implementations
- The file function.c may only contain one function
Requirements
The finished project:
- Can be written in any language that runs on the target environment, which is a 64-bit Arm system running Fedora 35 (such as israel.cdot.systems) with gcc 11.3.1. This means that the tool itself can be written in C, Python, Perl, Bash, JS/Node, Haskell, or any other language available for that platform
- Once started with the appropriate arguments, the tool must produce an output file which will use advanced SIMD, SVE, or SVE2 instructions for the function contained in the function.c file according to the capabilities of the platform on which it is run. Thus, if the code is executed on israel.cdot.systems directly, it will execute with advanced SIMD (non-SVE) instructions only. If it is executed on israel.cdot.systems using the qemu-aarch64 emulation tool, it will use SVE2 instructions.
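The run-time selection described above could be sketched as follows. This is a hedged illustration, not the expected solution: `select_variant`, `resolver_source`, the `SKETCH_*` macros, and the `foo_asimd`/`foo_sve`/`foo_sve2` names are hypothetical, and the `void foo(void)` prototype is a placeholder because the real signature comes from function.c. The bit positions are the ones the Linux arm64 headers use for `HWCAP_SVE` (in `AT_HWCAP`) and `HWCAP2_SVE2` (in `AT_HWCAP2`). The generated C assumes the glibc aarch64 convention where the ifunc resolver receives the hwcap value and a `__ifunc_arg_t` pointer from `<sys/ifunc.h>`.

```python
# Bit positions as defined by the Linux arm64 hwcap header.
HWCAP_SVE = 1 << 22    # reported in AT_HWCAP
HWCAP2_SVE2 = 1 << 1   # reported in AT_HWCAP2


def select_variant(hwcap, hwcap2):
    """Mirror of the resolver logic: pick the most capable variant."""
    if hwcap2 & HWCAP2_SVE2:
        return "sve2"
    if hwcap & HWCAP_SVE:
        return "sve"
    return "asimd"  # advanced SIMD is the aarch64 baseline


# C text a tool might emit next to the three renamed builds (assumed names).
RESOLVER_TEMPLATE = r"""
#include <stdint.h>
#include <sys/ifunc.h>   /* glibc on aarch64: __ifunc_arg_t, _IFUNC_ARG_HWCAP */

#define SKETCH_HWCAP_SVE   (1UL << 22)
#define SKETCH_HWCAP2_SVE2 (1UL << 1)

@RET@ @FUNC@_asimd(@PARAMS@);
@RET@ @FUNC@_sve(@PARAMS@);
@RET@ @FUNC@_sve2(@PARAMS@);

/* Called by the dynamic linker when the symbol is bound. */
static void *resolve_@FUNC@(uint64_t hwcap, const __ifunc_arg_t *arg)
{
    unsigned long hwcap2 = (hwcap & _IFUNC_ARG_HWCAP) ? arg->_hwcap2 : 0;
    if (hwcap2 & SKETCH_HWCAP2_SVE2) return (void *)@FUNC@_sve2;
    if (hwcap & SKETCH_HWCAP_SVE)    return (void *)@FUNC@_sve;
    return (void *)@FUNC@_asimd;
}

@RET@ @FUNC@(@PARAMS@) __attribute__((ifunc("resolve_@FUNC@")));
"""


def resolver_source(func="foo", ret="void", params="void"):
    """Fill the template; the prototype pieces are supplied by the caller."""
    return (RESOLVER_TEMPLATE
            .replace("@FUNC@", func)
            .replace("@RET@", ret)
            .replace("@PARAMS@", params))


if __name__ == "__main__":
    print(resolver_source())
```

With this logic, a machine that reports neither capability bit gets the advanced SIMD build, and an environment that reports the SVE2 bit gets the SVE2 build, which matches the two cases in the requirement above.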
Test Code
To test your solution, use the code available at https://github.com/ctyler/spo600-fall2022-project-test-code as the input.
Project Stages
Stage 1
What is required:
- Provide a plan for your project
- Specify the language that you're going to use
- Specify the overall operation of your project -- how you're going to approach the problem
- Describe the challenges you expect to face as you implement the code
- Submit your plan in one or more clear and detailed blog posts
Due: Sunday, November 18, 11:59 pm
Mark: 15%
Stage 2
What is required:
- Provide the initial implementation of your project
- The initial implementation must be able to produce a usable output binary that correctly uses the best available SIMD implementation, but it may have additional limitations or bugs. These limitations and bugs must be appropriately documented
- Provide clear documentation on what the project does and how to test it
- Submit the implementation as one or more blog posts linked to your code hosted appropriately (recommendation: place it in an accessible git repository)
Due: Sunday, December 4, 11:59 pm
Mark: 20%
Stage 3
What is required:
- A final implementation of your project
- This implementation should not have any limitations beyond those listed in the Limitations section above
- Bonus points will be awarded if your project works well and is not subject to some of the Limitations listed above. For example, your project could work on aarch64 and x86_64 systems, or it could accept multiple function files
- Bonus points will be awarded if your project has additional useful features, such as notifying the user if autovectorization could not be applied to the code
Due: Wednesday, December 14, 11:59 pm
Mark: 25%
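For the bonus feature of notifying the user when autovectorization could not be applied, one possible approach is to compile with gcc's `-fopt-info-vec-optimized` option and scan the report it prints. The sketch below assumes report lines of the form `file:line:col: optimized: loop vectorized using ...`; the exact wording can differ between gcc versions, so treat the matching rule and the helper names (`vectorized_locations`, `vectorization_warning`) as assumptions to verify on the target system.

```python
def vectorized_locations(report):
    """Return the 'file:line:col' prefix of each line reporting a vectorized loop."""
    hits = []
    for line in report.splitlines():
        # Assumed message shape; check it against real gcc output.
        if "optimized:" in line and "vectorized" in line:
            location, _, _ = line.partition(": optimized:")
            hits.append(location)
    return hits


def vectorization_warning(report, source="function.c"):
    """Return a warning string if no loop was vectorized, else None."""
    if vectorized_locations(report):
        return None
    return f"warning: no loops in {source} were autovectorized"


if __name__ == "__main__":
    sample = (
        "function.c:12:5: optimized: loop vectorized using 16 byte vectors\n"
        "function.c:20:3: missed: couldn't vectorize loop\n"
    )
    print(vectorized_locations(sample))
    print(vectorization_warning("function.c:20:3: missed: couldn't vectorize loop"))
```

A tool could run this check once per variant build, since a loop that vectorizes for advanced SIMD is not guaranteed to vectorize the same way for SVE or SVE2.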