Difference between revisions of "SPO600 Algorithm Selection Lab"

From CDOT Wiki

Revision as of 11:38, 11 October 2017

Purpose of this Lab
In this lab, you will select one of several algorithms for adjusting the volume of PCM audio samples, based on benchmarking each of the possible approaches.

Lab 6

Background:

  • Digital sound is typically represented, uncompressed, as signed 16-bit integer signal samples. There is one stream of samples for each of the left and right stereo channels, at typical sample rates of 44.1 or 48 thousand samples per second per channel, for a total of 88.2 or 96 thousand samples per second.
  • To change the volume of sound, each sample can be scaled by a volume factor, in the range of 0.0000 to 1.0000 (silence to full volume).
  • On a mobile device, the amount of processing required to scale sound will affect battery life.


Task:

A. Create a large (500M?) array of int16_t numbers to represent sound samples.

B. Scale each sample by the volume factor (0.75). Store the results into the original array or into a separate result array.

C. Sum the results and display the total (just to keep the optimizer from eliminating the scaling code).

D. Determine the time taken for step B of each approach. You can add instrumentation to your program or you can use the 'time' command.
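Steps A through D can be sketched as a handful of small functions. This is only a minimal sketch under stated assumptions, not the required solution: the function names, the linear congruential generator used to produce repeatable data, and the use of the POSIX clock_gettime() call for instrumentation are all choices made for illustration.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Step A: allocate n samples on the heap and fill them with repeatable
   pseudo-random values. The generator is a simple LCG that always starts
   from the same state, so every run processes the same data. */
int16_t *make_samples(size_t n) {
    int16_t *samples = malloc(n * sizeof *samples);
    if (samples == NULL) return NULL;
    uint32_t state = 1;
    for (size_t i = 0; i < n; i++) {
        state = state * 1664525u + 1013904223u;
        /* Top 16 bits, shifted into the range -32768..32767. */
        samples[i] = (int16_t)((int32_t)(state >> 16) - 32768);
    }
    return samples;
}

/* Step B: scale every sample in place (naive floating-point version). */
void scale_samples(int16_t *data, size_t n, float volume) {
    for (size_t i = 0; i < n; i++)
        data[i] = (int16_t)(data[i] * volume);
}

/* Step C: sum the results so the optimizer cannot discard step B. */
int64_t sum_samples(const int16_t *data, size_t n) {
    int64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Step D: return the wall-clock seconds spent in step B only. */
double time_scaling(int16_t *data, size_t n, float volume) {
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    scale_samples(data, n, volume);
    clock_gettime(CLOCK_MONOTONIC, &end);
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

A main() function would call make_samples() with the chosen array size, pass the array to time_scaling() with a volume of 0.75f, and then print both the elapsed time and the value returned by sum_samples(). Timing inside the program excludes the cost of generating the data, which the 'time' command would include.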


Try using each of these three approaches to step B, and compare the results:

  1. Multiply each sample by the floating point volume factor 0.75.
  2. Pre-calculate a lookup table (array) of all possible sample values multiplied by the volume factor, and look up each sample in that table to get the scaled values.
  3. Convert the volume factor 0.75 to a fixed-point integer by multiplying it by a binary number representing the fixed-point value "1". For example, you could use 0b100000000 (= 256 in decimal), which gives 0.75 x 256 = 192. Multiply each sample by that integer, then shift the result to the right by the required number of bits (>>8 if you're using 256 as the multiplier).
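The three approaches might look roughly like the following. This is a sketch rather than a reference implementation: the function names are hypothetical, the lookup table is indexed by the sample reinterpreted as an unsigned 16-bit value (one of several possible indexing schemes), and the fixed-point version assumes that right-shifting a negative value is an arithmetic shift, which is implementation-defined in C but is the behaviour of common compilers on x86 and AArch64.

```c
#include <stdint.h>
#include <stdlib.h>

/* Approach 1: multiply each sample by the floating-point factor. */
void scale_float(const int16_t *in, int16_t *out, size_t n, float volume) {
    for (size_t i = 0; i < n; i++)
        out[i] = (int16_t)(in[i] * volume);
}

/* Approach 2, setup: precompute the scaled value of all 65536 possible
   samples. The caller frees the table. */
int16_t *build_table(float volume) {
    int16_t *table = malloc(65536 * sizeof *table);
    if (table == NULL) return NULL;
    for (int32_t s = INT16_MIN; s <= INT16_MAX; s++)
        table[(uint16_t)s] = (int16_t)(s * volume);
    return table;
}

/* Approach 2, use: one table lookup per sample. */
void scale_lookup(const int16_t *in, int16_t *out, size_t n,
                  const int16_t *table) {
    for (size_t i = 0; i < n; i++)
        out[i] = table[(uint16_t)in[i]];
}

/* Approach 3: fixed-point multiply, where 256 represents 1.0
   (so 0.75 becomes 192), followed by a shift right of 8 bits. */
void scale_fixed(const int16_t *in, int16_t *out, size_t n, float volume) {
    const int32_t factor = (int32_t)(volume * 256.0f);
    for (size_t i = 0; i < n; i++)
        out[i] = (int16_t)((in[i] * factor) >> 8);
}
```

Note that the float-to-integer cast truncates toward zero while an arithmetic shift rounds toward negative infinity, so approach 3 can differ from approaches 1 and 2 by one for negative samples that are not multiples of 4.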


Blog about your results. Important! -- explain what you're doing so that a reader coming across your blog post understands the context (in other words, don't just jump into a discussion of optimization results -- give your post some context).


Things to consider

Design of Your Test

  • Most solutions for a problem of this type involve generating a large amount of data in an array, processing that array using the function being evaluated, and then storing that data back into an array. Make sure that you measure the time taken in the test function only -- you need to be able to remove the rest of the processing time from your evaluation.
  • You may need to run a very large amount of sample data through the function to be able to detect its performance.
  • If you do not use the output from your calculation (for example, by doing something with the output array), the compiler may recognize that and remove the code you're trying to test. Be sure to process the results in some way so that the optimizer preserves the code you want to test. It is a good idea to calculate some sort of verification value to ensure that all approaches generate the same results.
  • You can test using actual sound data (see the tips section, below) or using generated data. If you're generating data, it is best to use a pseudo-random number generator which is seeded with the same value every time, so that each run processes the same data.
  • Be aware of what other tasks the system is handling during your test run.
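One possible way to compute a verification value, and to compare the output of two approaches, is sketched below. The helper names are hypothetical and the checksum is just a simple order-sensitive hash chosen for illustration. Because different approaches may round differently, it can be useful to report the largest difference rather than only checking for exact equality.

```c
#include <stdint.h>
#include <stdlib.h>

/* Order-sensitive checksum of an output array. Using the result (for
   example, printing it) also keeps the optimizer from removing the
   scaling code. */
uint64_t checksum(const int16_t *data, size_t n) {
    uint64_t h = 0;
    for (size_t i = 0; i < n; i++)
        h = h * 31u + (uint16_t)data[i];
    return h;
}

/* Number of positions where two output arrays disagree. */
size_t count_mismatches(const int16_t *a, const int16_t *b, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i]) count++;
    return count;
}

/* Largest absolute difference between corresponding outputs. */
int32_t max_abs_diff(const int16_t *a, const int16_t *b, size_t n) {
    int32_t worst = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t d = (int32_t)a[i] - (int32_t)b[i];
        if (d < 0) d = -d;
        if (d > worst) worst = d;
    }
    return worst;
}
```

If two approaches produce matching checksums, their outputs are almost certainly identical. If they do not match, count_mismatches() and max_abs_diff() help decide whether the difference is a rounding effect or a bug.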

Analyzing Results

  • What is the impact of various optimization levels on the software performance?
  • Does the distribution of data matter?
  • If samples are fed at CD rate (44100 samples per second x 2 channels), can both algorithms keep up?
  • What is the memory footprint of each approach?
  • What is the performance of each approach?
  • What is the energy consumption of each approach? (What information do you need to calculate this?)
  • Aarchie and Betty have different performance profiles, so it's not reasonable to compare performance between the machines, but it is reasonable to compare the relative performance of the algorithms in each context. Do you get similar results?
  • What other optimizations can be applied to this problem?
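Some of the questions above come down to simple arithmetic on your measurements. The sketch below shows one way to express the CD-rate and memory-footprint calculations; the function names are hypothetical, and the inputs (sample count and elapsed seconds) are whatever your own benchmark reports.

```c
#include <stddef.h>
#include <stdint.h>

/* How many times faster than CD real time a run was. CD audio needs
   44100 samples per second on each of 2 channels. A result above 1.0
   means the algorithm can keep up. */
double realtime_factor(size_t samples_processed, double seconds) {
    const double required_per_second = 44100.0 * 2.0;
    return ((double)samples_processed / seconds) / required_per_second;
}

/* Bytes needed to hold n 16-bit samples. */
size_t array_bytes(size_t n) {
    return n * sizeof(int16_t);
}

/* Extra bytes needed by a lookup table with one entry for every
   possible 16-bit sample value. */
size_t table_bytes(void) {
    return (size_t)65536 * sizeof(int16_t);
}
```

For example, an array of 500 million samples occupies 1,000,000,000 bytes, and a full lookup table adds 131,072 bytes (128 KiB), which may or may not fit in a given processor's cache.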


Tips

SOX
If you want to try this with actual sound samples, you can convert a sound file of your choice to raw 16-bit signed integer PCM data using the sox utility present on most Linux systems and available for a wide range of platforms.
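Once you have a raw PCM file, loading it into memory could look something like this. It is a minimal sketch: the function name is hypothetical, and it assumes the file contains nothing but 16-bit signed samples in the same byte order as the machine running the program.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Load a headerless file of 16-bit signed samples into a heap buffer.
   On success, returns the buffer and stores the sample count in *count.
   On failure, returns NULL. The caller frees the buffer. */
int16_t *load_raw_pcm(const char *path, size_t *count) {
    FILE *f = fopen(path, "rb");
    if (f == NULL) return NULL;

    /* Find the file size to know how many samples to allocate. */
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long bytes = ftell(f);
    if (bytes < 0) { fclose(f); return NULL; }
    rewind(f);

    size_t n = (size_t)bytes / sizeof(int16_t);
    int16_t *samples = malloc(n ? n * sizeof *samples : 1);
    if (samples == NULL) { fclose(f); return NULL; }

    size_t got = fread(samples, sizeof *samples, n, f);
    fclose(f);
    if (got != n) { free(samples); return NULL; }

    *count = n;
    return samples;
}
```

A short sound file will not contain hundreds of millions of samples, so you may need to process the same buffer many times to get a measurable run time.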
Stack Limit
Fixed-size, non-static local arrays are placed on the stack. The size of the stack is controlled by per-process limits, inherited from the shell, and adjustable with the ulimit command. Allocating an array larger than the stack size limit will cause a segmentation fault, usually on the first write. To see the current stack limit, use ulimit -s (the displayed value is in KB; the default is usually 8192 KB, or 8 MB). To set the stack limit, place a new size in KB or the keyword unlimited after the -s argument.

Alternate (and preferred) approach: allocate the array space with malloc() or calloc().
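A heap allocation for the sample array, along with a programmatic check of the stack limit, might be sketched as follows. The function names are hypothetical; getrlimit() is the POSIX call that reports the same limit that ulimit -s displays, though in bytes rather than KB.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Return the current (soft) stack limit in bytes. Returns 0 if the
   limit is unlimited or cannot be read. */
unsigned long long stack_limit_bytes(void) {
    struct rlimit lim;
    if (getrlimit(RLIMIT_STACK, &lim) != 0) return 0;
    if (lim.rlim_cur == RLIM_INFINITY) return 0;
    return (unsigned long long)lim.rlim_cur;
}

/* Allocate n zero-initialized samples on the heap instead of the stack.
   Returns NULL (after printing the reason) if the allocation fails.
   The caller frees the buffer. */
int16_t *alloc_samples(size_t n) {
    int16_t *samples = calloc(n, sizeof *samples);
    if (samples == NULL) perror("calloc");
    return samples;
}
```

With a typical 8 MB stack limit, a local int16_t array can hold only about 4 million samples, far fewer than the task calls for, which is why heap allocation is preferred here.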
stdint.h
The stdint.h header provides definitions for many specialized integer size types. Use int16_t for 16-bit signed integers.
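As a small illustration of why the exact-width types matter here, the sketch below uses the limits defined in stdint.h to show that the intermediate product in a fixed-point calculation does not fit in 16 bits. The function names are hypothetical, and the factor of 256 is taken from the example multiplier in the task description.

```c
#include <stdint.h>

/* The most negative intermediate value when a 16-bit sample is
   multiplied by a fixed-point factor of up to 256. It needs a 32-bit
   type such as int32_t. */
int32_t widest_product(void) {
    return (int32_t)INT16_MIN * 256;
}

/* Returns 1 if v can be stored in an int16_t without loss, else 0. */
int fits_in_int16(int32_t v) {
    return v >= INT16_MIN && v <= INT16_MAX;
}
```

In C, an int16_t operand is promoted to int before arithmetic, so on the lab systems (where int is 32 bits) the product is computed at 32-bit width automatically. Only the final value, after scaling back down, needs to fit in an int16_t.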