Difference between revisions of "SPO600 Algorithm Selection Lab"
Chris Tyler (talk | contribs) |
Chris Tyler (talk | contribs) (→Lab 5) |
||
Line 4: | Line 4: | ||
1. Write two different approaches to adjusting the volume of a sequence of sound samples: | 1. Write two different approaches to adjusting the volume of a sequence of sound samples: | ||
− | * The first one should scale a signed 16-bit integer by multiplying it by a volume scaling factor expressed as a floating point number in the range of 0-1. This should be implemented as a function that accepts the sample (int16) and scaling factor (float) and returns the scaled sample (int16). | + | * The first one should scale a signed 16-bit integer by multiplying it by a volume scaling factor expressed as a floating point number in the range of 0.000-1.000. This should be implemented as a function that accepts the sample (int16) and scaling factor (float) and returns the scaled sample (int16). |
− | * The second | + | * The second version of the function should do the same thing, using a lookup table (a pre-computed array of all 65536 possible values). The lookup table should be initialized every time a different volume factor is observed. This should be implemented as a drop-in replacement for the function above (same parameters and return value). |
− | 2. Test which approach is faster. Control the variables and use a large run of data (at least millions of samples). Use both [[SPO600 Servers| | + | 2. Test which approach is faster. Control the variables and use a large run of data (at least hundreds of millions of samples). Use both [[SPO600 Servers|x86 and AArch64]] systems for testing - DO NOT compare results between the architectures (because they are different classes of systems) but DO compare the relative performance of the algorithms on each architecture. For example, you might note that "Algorithm I is NN% faster on Architecture A, but NN% slower on Architecture B". |
3. Blog about your results. Important! -- explain what you're doing so that a reader coming across your blog post understands the context (in other words, don't just jump into a discussion of optimization results -- give your post some context). | 3. Blog about your results. Important! -- explain what you're doing so that a reader coming across your blog post understands the context (in other words, don't just jump into a discussion of optimization results -- give your post some context). | ||
Line 13: | Line 13: | ||
=== Things to consider === | === Things to consider === | ||
+ | ==== Design of Your Test ==== | ||
+ | |||
+ | * Most solutions for a problem of this type involve generating a large amount of data in an array, processing that array using the function being evaluated, and then storing that data back into an array. Make sure that you measure the time taken in the test function only -- you need to be able to remove the rest of the processing time from your evaluation. | ||
+ | * You may need to run a very large amount of sample data through the function to be able to detect its performance. | ||
+ | * If you do not use the output from your calculation (e.g., do something with the output array), the compiler may recognize that, and remove the code you're trying to test. Be sure to process the results in some way so that the optimizer preserves the code you want to test. It is a good idea to calculate some sort of verification value to ensure that both approaches generate the same results. | ||
+ | |||
+ | ==== Analyzing Results ==== | ||
* Does the distribution of data matter? | * Does the distribution of data matter? | ||
* If samples are fed at CD rate (44100 samples per second x 2 channels), can both algorithms keep up? | * If samples are fed at CD rate (44100 samples per second x 2 channels), can both algorithms keep up? | ||
Line 22: | Line 29: | ||
=== Competition === | === Competition === | ||
− | * | + | * How fast can you scale 500 million int16 PCM sound samples? |
=== Tips === | === Tips === |
Revision as of 09:37, 5 February 2016
Lab 5
1. Write two different approaches to adjusting the volume of a sequence of sound samples:
- The first one should scale a signed 16-bit integer by multiplying it by a volume scaling factor expressed as a floating point number in the range of 0.000-1.000. This should be implemented as a function that accepts the sample (int16) and scaling factor (float) and returns the scaled sample (int16).
- The second version of the function should do the same thing, using a lookup table (a pre-computed array of all 65536 possible values). The lookup table should be initialized every time a different volume factor is observed. This should be implemented as a drop-in replacement for the function above (same parameters and return value).
2. Test which approach is faster. Control the variables and use a large run of data (at least hundreds of millions of samples). Use both x86 and AArch64 systems for testing - DO NOT compare results between the architectures (because they are different classes of systems) but DO compare the relative performance of the algorithms on each architecture. For example, you might note that "Algorithm I is NN% faster on Architecture A, but NN% slower on Architecture B".
3. Blog about your results. Important! -- explain what you're doing so that a reader coming across your blog post understands the context (in other words, don't just jump into a discussion of optimization results -- give your post some context).
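As a starting point, the two approaches in step 1 might look something like the sketch below. This is only one possible reading of the requirements, not a reference solution: the function names `scale_sample_float` and `scale_sample_lookup`, the `-1.0f` "no table yet" sentinel, and the choice to truncate toward zero when converting back to int16 are all assumptions made for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Approach 1: multiply the sample by a float factor in the range 0.0-1.0.
   With a factor in that range the product always fits in an int16_t, so the
   conversion back simply truncates toward zero. (Names are illustrative.) */
int16_t scale_sample_float(int16_t sample, float volume)
{
    return (int16_t)(sample * volume);
}

/* Approach 2: drop-in replacement with the same parameters and return value,
   backed by a table of all 65536 possible int16_t inputs. The table is
   rebuilt whenever a different volume factor is observed. */
int16_t scale_sample_lookup(int16_t sample, float volume)
{
    static int16_t table[65536];
    static float table_volume = -1.0f; /* assumed sentinel: outside 0.0-1.0 */

    if (volume != table_volume) {
        for (int32_t s = INT16_MIN; s <= INT16_MAX; s++) {
            /* Index by the sample's bit pattern viewed as unsigned (0..65535). */
            table[(uint16_t)s] = (int16_t)(s * volume);
        }
        table_volume = volume;
    }
    return table[(uint16_t)sample];
}
```

Because both functions compute each entry with the same expression, they should return identical values for every sample at a given volume factor, which makes a later verification step straightforward.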
Things to consider
Design of Your Test
- Most solutions for a problem of this type involve generating a large amount of data in an array, processing that array using the function being evaluated, and then storing that data back into an array. Make sure that you measure the time taken in the test function only -- you need to be able to remove the rest of the processing time from your evaluation.
- You may need to run a very large amount of sample data through the function to be able to detect its performance.
- If you do not use the output from your calculation (for example, by doing something with the output array), the compiler may recognize that and remove the code you're trying to test. Be sure to process the results in some way so that the optimizer preserves the code you want to test. It is a good idea to calculate some sort of verification value to ensure that both approaches generate the same results.
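The points above could be combined into a small harness along the following lines. This is a hedged sketch rather than a prescribed design: `fill_samples`, `time_scaling`, the `scale_fn` typedef, and the stand-in `example_scale` are hypothetical names, and the standard C `clock()` call used here reports processor time (a POSIX wall-clock timer would be a reasonable alternative). Data generation and the checksum are kept outside the timed region so that only the scaling loop is measured.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Signature shared by both scaling approaches (hypothetical typedef). */
typedef int16_t (*scale_fn)(int16_t sample, float volume);

/* Hypothetical stand-in for the function under test. */
int16_t example_scale(int16_t sample, float volume)
{
    return (int16_t)(sample * volume);
}

/* Fill the buffer with pseudo-random samples. A fixed seed means every run,
   and every algorithm, sees the same data. Assumes RAND_MAX >= 65535,
   otherwise only part of the int16_t range is produced. */
void fill_samples(int16_t *buf, size_t count, unsigned seed)
{
    srand(seed);
    for (size_t i = 0; i < count; i++) {
        buf[i] = (int16_t)((rand() % 65536) - 32768);
    }
}

/* Time only the scaling loop; return elapsed processor time in seconds.
   The checksum is computed afterwards so the optimizer cannot discard the
   loop, and so two approaches can be checked against each other. */
double time_scaling(scale_fn fn, const int16_t *in, int16_t *out,
                    size_t count, float volume, int64_t *checksum)
{
    clock_t start = clock();
    for (size_t i = 0; i < count; i++) {
        out[i] = fn(in[i], volume);
    }
    clock_t end = clock();

    int64_t sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += out[i];
    }
    *checksum = sum;

    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

A caller would allocate the input and output arrays once, call `fill_samples` once, then call `time_scaling` for each approach with the same volume factor and compare both the elapsed times and the checksums.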
Analyzing Results
- Does the distribution of data matter?
- If samples are fed at CD rate (44100 samples per second x 2 channels), can both algorithms keep up?
- What is the memory footprint of each approach?
- What is the performance of each approach?
- What is the energy consumption of each approach?
- Xerxes and Aarchie have different performance profiles, so it's not reasonable to compare performance between the machines, but it is reasonable to compare the relative performance of the two algorithms in each context. Do you get similar results?
- What other optimizations can be applied to this problem?
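Two of the questions above reduce to simple arithmetic, sketched here in C. The CD-rate figure comes directly from the question (44100 samples per second x 2 channels = 88200 samples per second); the helper names are hypothetical, and the table size assumes a lookup table holding one int16 entry for each of the 65536 possible inputs.

```c
#include <stdint.h>
#include <stddef.h>

#define CD_SAMPLE_RATE 44100 /* samples per second per channel */
#define CD_CHANNELS    2

/* Samples per second an algorithm must sustain to keep up with CD audio. */
double required_samples_per_second(void)
{
    return (double)CD_SAMPLE_RATE * CD_CHANNELS; /* 88200 */
}

/* Ratio of measured throughput to the CD rate; a value above 1.0 means the
   algorithm keeps up with real time on the machine where it was measured. */
double realtime_margin(size_t samples_processed, double elapsed_seconds)
{
    double achieved = (double)samples_processed / elapsed_seconds;
    return achieved / required_samples_per_second();
}

/* Extra memory used by a full int16_t lookup table: 65536 * 2 bytes. */
size_t lookup_table_bytes(void)
{
    return 65536u * sizeof(int16_t); /* 131072 bytes = 128 KiB */
}
```

Whether a table of that size stays in cache on a given system is one factor that may affect how the lookup approach compares with the multiplication.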
Competition
- How fast can you scale 500 million int16 PCM sound samples?