Changes

Fall 2022 SPO600 Weekly Schedule

1 byte removed, 12:04, 17 September 2022

spelling, 6502 emulator link fix

|3||Sep 19||[[#Week 3 - Class I|6502 Strings]]||[[#Week 3 - Class II|Building Code / Make and Makefiles / Autotools and Friends]]||[[#Week 3 Deliverables|Lab 3]]

|-

|4||Sep 26||[[#Week 4 - Class I|Compiler Optimizations]]||[[#Week 4 - Class II|ELF Files / Shared ~~Libaries~~Libraries]]||[[#Week 4 Deliverables|Lab 3, September blog posts]]

|-

|5||Oct 3||[[#Week 5 - Class I|Introduction to 64-bit Architectures and Assembly Language (x86_64 and AArch64)]]||[[#Week 5 - Class II|Memory on 64-bit Systems]]||[[#Week 5 Deliverables|Lab 3]]

*** [[Endian|Endianness]]

** Code that takes advantage of platform-specific features

* Reasons for writing code in Assembly ~~Langauge~~ Language include:

** Performance

** [[Atomic Operation|Atomic Operations]]

===== Build Process =====

Building software is a complex task that many developers gloss over. The simple act of compiling a program invokes a process with five or more stages, including pre-~~proccessing~~processing, compiling, optimizing, assembling, and linking. However, a complex software system will have hundreds or even thousands of source files, as well as dozens or hundreds of build configuration options, auto configuration scripts (cmake, autotools), build scripts (such as Makefiles) to coordinate the process, test suites, and more.

The build process varies significantly between software packages. Most software distribution projects (including Linux distributions such as Ubuntu and Fedora) use a packaging system that further wraps the build process in a standardized script format, so that different software packages can be built using a consistent process.

*** Logically: false or true.

** Binary numbers are resistant to errors, especially when compared to other systems such as analog voltages.

*** To represent the numbers 0-10 as an analog ~~electical~~ electrical value, we could use a voltage from 0 - 10 volts. However, if we use a long cable, there will be signal loss and the voltage will drop: we could apply 10 volts on one end of the cable, but only observe (say) 9.1 volts on the other end of the cable. Alternately, electromagnetic interference from nearby devices could slightly increase the signal.

*** If we instead use the same voltages and cable length to carry a binary signal, where 0 volts == off == "0" and 10 volts == on == "1", a signal that had degraded from 10 volts to 9.1 volts would still be counted as a "1" and a 0 volt signal with some stray electromagnetic interference presenting as (say) 0.4 volts would still be counted as "0". However, we will need to use multiple bits to carry larger numbers -- either in parallel (multiple wires side-by-side), or sequentially (multiple bits presented over the same wire in sequence).
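The error-resistance argument above can be sketched in a few lines of Python. This is only an illustration: the midpoint threshold (half of the "on" voltage) and the function names `decode_bits` and `bits_to_int` are assumptions made for the example, not something the notes specify.

```python
def decode_bits(voltages, high=10.0):
    """Map each received voltage to a bit using a midpoint threshold."""
    threshold = high / 2  # assumed decision point, halfway between "off" and "on"
    return [1 if v >= threshold else 0 for v in voltages]


def bits_to_int(bits):
    """Combine a sequence of bits (most significant first) into an integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


# Degraded "on" signals (9.1 V) and a noisy "off" signal (0.4 V) still decode cleanly.
received = [9.1, 0.4, 9.1, 0.0]
bits = decode_bits(received)
print(bits, bits_to_int(bits))
```

Because each wire (or time slot) only has to distinguish two states, moderate loss or interference does not change the decoded value; the cost is that several bits are needed to carry a larger number.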

* Integers

*** Instead of fixed-length numbers, variable-length numbers are used, with the most common values encoded in the smallest number of bits. This is an effective strategy if the distribution of values in the data set is uneven.

** Repeated sequence encoding (1D, 2D, 3D)

*** Run length encoding is an encoding scheme that records the number of repeated values. For example, fax messages are encoded as a series of numbers representing the number of white pixels, then the number of black pixels, then white pixels, then black pixels, alternating to the end of each line. These numbers are then represented with adaptive ~~artithmetic~~ arithmetic encoding.

*** Text data can be compressed by building a dictionary of common sequences, which may represent words or complete phrases, where each entry in the dictionary is numbered. The compressed data contains the dictionary plus a sequence of numbers which represent the occurrence of the sequences in the original text. On standard text, this typically enables 10:1 compression.
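As a rough sketch of the run-length idea described above, the following Python encodes one line of black-and-white pixels as alternating run lengths. The conventions used here (0 for white, 1 for black, every line starting with a white run that may have length zero) and the function names are assumptions for illustration; this is not the actual fax encoding, and it leaves out the later step of compactly encoding the run lengths themselves.

```python
def rle_encode_line(pixels):
    """Encode a line of 0 (white) / 1 (black) pixels as alternating run lengths.

    Runs alternate white, black, white, ... and always start with white,
    so a line that begins with black starts with a white run of length 0.
    """
    runs = []
    current = 0  # colour of the run being counted
    count = 0
    for p in pixels:
        if p == current:
            count += 1
        else:
            runs.append(count)
            current = p
            count = 1
    runs.append(count)
    return runs


def rle_decode_line(runs):
    """Rebuild the pixel line from alternating white/black run lengths."""
    pixels = []
    colour = 0
    for n in runs:
        pixels.extend([colour] * n)
        colour = 1 - colour
    return pixels


line = [0, 0, 0, 1, 1, 0]
encoded = rle_encode_line(line)
print(encoded, rle_decode_line(encoded) == line)
```

The scheme pays off when runs are long, as on a mostly white page; on data that changes at nearly every pixel, the list of run lengths can be longer than the original.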

** Decomposition

*** Compound audio ~~wavforms~~ waveforms can be decomposed into individual signals, which can then be modelled as repeated sequences. For example, a waveform consisting of two notes being played at different frequencies can be decomposed into those separate notes; since each note consists of a number of repetitions of a particular wave pattern, they can individually be represented in a more compact format by describing the frequency, waveform shape, and amplitude characteristics.

** Palettization

*** Images often contain repeated colours, and rarely use all of the available colours in the original encoding scheme. For example, a 1920x1080 "full HD" image contains about 2 million pixels, so if every pixel were a different colour, there would be a maximum of 2 million colours. But it's likely that many of the pixels in the image are the same colour, so there might only be (perhaps) 4000 colours in the image. If each pixel is encoded as a 24-bit value, there are potentially 16 million colours available, and there is no possibility that they are all used. Instead, a palette can be provided which specifies each of the 4000 colours used in the picture, and then each pixel can be encoded as a 12-bit number which selects one of the colours from the palette. The total storage requirement for the original 24-bit scheme is 1920*1080 pixels * 3 bytes per pixel = 5.9 MB. Using a 12-bit palette, the storage requirement is 3 * 4096 bytes for the palette plus 1920*1080*1.5 bytes for the image, for a total of 3 MB -- a reduction of almost 50%.
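The storage figures in the palette example can be checked with a short calculation. The sketch below assumes that "MB" means 2**20 bytes, which is the reading that reproduces the 5.9 MB figure; the variable names are illustrative only.

```python
WIDTH, HEIGHT = 1920, 1080
pixels = WIDTH * HEIGHT

# Direct 24-bit colour: 3 bytes for every pixel.
direct_bytes = pixels * 3

# 12-bit palette: 2**12 = 4096 entries of 3 bytes each,
# plus a 12-bit (1.5 byte) index for every pixel.
palette_bytes = 4096 * 3
indexed_bytes = pixels * 12 // 8
total_palettized = palette_bytes + indexed_bytes

MIB = 1024 * 1024  # assumption: "MB" in the text means 2**20 bytes
saving = 1 - total_palettized / direct_bytes

print(f"direct:     {direct_bytes / MIB:.2f} MB")
print(f"palettized: {total_palettized / MIB:.2f} MB")
print(f"saving:     {saving:.1%}")
```

The palette itself is only about 12 KB, so nearly all of the saving comes from shrinking each pixel from 24 bits to 12 bits.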

{{Admon/tip|Follow the Links!|To get the full benefit of the following material, please follow the links embedded within it. For additional detail, see the Category links at the bottom of those pages -- for example, the [[Category:Computer Architecture|Computer Architecture]] category linked from many of the following pages has over 30 pages of content.}}

* Although we program computers in a variety of languages, they can really only execute one ~~langauge~~language: [[Machine Language]], which is encoded in an architecture-specific binary code, sometimes called object code.

* Machine language is not easy to read. [[Assembly Language]] corresponds very closely to machine language, but is (sort of!) human-readable.

* Assembly language is converted into machine code by a particular type of compiler called an [[Assembler]] (sometimes the language itself is also referred to as "Assembler").

==== 6502 ====

Modern processors are complex - the reference manual for 64-bit ARM processors is over 11000 pages long! - so we're going to look at assembly ~~lanaguage~~ language on a much simpler processor to get started. This processor is the 6502, a processor used in many early home and personal computers as well as video game systems, including the Commodore PET, VIC-20, C64; the Apple II; the Atari 400 and 800 computers and 2600 video game systems; and many others.

* Introduction to the [[6502]] (note the Resources links on that page)


== Week 2 ==

{{Admon/tip|Follow the Links!|To get the full benefit of the following material, please follow the links embedded within it. For additional detail, see the Category links at the bottom of those pages -- for example, the [[Category:Computer Architecture|Computer Architecture]] category linked from many of the following pages has over 30 pages of content.}}

* Although we program computers in a variety of languages, they can really only execute one ~~langauge~~language: [[Machine Language]], which is encoded in an architecture-specific binary code, sometimes called object code.

* Machine language is not easy to read. [[Assembly Language]] corresponds very closely to machine language, but is (sort of!) human-readable.

* Assembly language is converted into machine code by a particular type of compiler called an [[Assembler]] (sometimes the language itself is also referred to as "Assembler").

==== 6502 ====

Modern processors are complex - the reference manual for 64-bit ARM processors is over 7000 pages long! - so we're going to look at assembly ~~lanaguage~~ language on a much simpler processor to get started. This processor is the 6502.

* Introduction to the [[6502]] (note the Resources links on that page)

# Study the [[6502 Instructions - Introduction|6502 Instructions]] and make sure you understand what each one does

# Complete [[6502 Assembly Language Lab|Lab 2]] and blog your results

* Reasons for writing code in Assembly ~~Langauge~~ Language include:

** Performance

** [[Atomic Operation|Atomic Operations]]

===== Build Process =====

Building software is a complex task that many developers gloss over. The simple act of compiling a program invokes a process with five or more stages, including pre-~~proccessing~~processing, compiling, optimizing, assembling, and linking. However, a complex software system will have hundreds or even thousands of source files, as well as dozens or hundreds of build configuration options, auto configuration scripts (cmake, autotools), build scripts (such as Makefiles) to coordinate the process, test suites, and more.

The build process varies significantly between software packages. Most software distribution projects (including Linux distributions such as Ubuntu and Fedora) use a packaging system that further wraps the build process in a standardized script format, so that different software packages can be built using a consistent process.

*** Logically: false or true.

** Binary numbers are resistant to errors, especially when compared to other systems such as analog voltages.

*** To represent the numbers 0-5 as an analog ~~electical~~ electrical value, we could use a voltage from 0 - 5 volts. However, if we use a long cable, there will be signal loss and the voltage will drop: we could apply 5 volts on one end of the cable, but only observe (say) 4.1 volts on the other end of the cable. Alternately, electromagnetic interference from nearby devices could slightly increase the signal.

*** If we instead use the same voltages and cable length to carry a binary signal, where 0 volts == off == "0" and 5 volts == on == "1", a signal that had degraded from 5 volts to 4.1 volts would still be counted as a "1" and a 0 volt signal with some stray electromagnetic interference presenting as (say) 0.4 volts would still be counted as "0". However, we will need to use multiple bits to carry larger numbers -- either in parallel (multiple wires side-by-side), or sequentially (multiple bits presented over the same wire in sequence).

* Integers


{{Admon/tip|Follow the Links!|To get the full benefit of the following material, please follow the links embedded within it. For additional detail, see the Category links at the bottom of those pages -- for example, the [[Category:Computer Architecture|Computer Architecture]] category linked from many of the following pages has over 30 pages of content.}}

* Although we program computers in a variety of languages, they can really only execute one ~~langauge~~language: [[Machine Language]], which is encoded in an architecture-specific binary code, sometimes called object code.

* Machine language is not easy to read. [[Assembly Language]] corresponds very closely to machine language, but is (sort of!) human-readable.

* Assembly language is converted into machine code by a particular type of compiler called an [[Assembler]] (sometimes the language itself is also referred to as "Assembler").

==== 6502 ====

Modern processors are complex - the reference manual for 64-bit ARM processors is over 7000 pages long! - so we're going to look at assembly ~~lanaguage~~ language on a much simpler processor to get started. This processor is the 6502.

* Introduction to the [[6502]] (note the Resources links on that page)

* Introduction to the [[6502 Instructions - Introduction|6502 Instructions]]

* Information about the [[6502 Emulator]] which we will use in this course, and some [[6502_Emulator_Example_Code|example code]]

* Link to the actual [http://6502.cdot.~~system~~systems 6502 emulator]

=== Week 2 - Class II ===

Pstaley (5 edits)
