Changes


Winter 2022 SPO600 Weekly Schedule

241 bytes added, 10:42, 6 September 2022

Binary Representation of Data

*** Logically: false or true.

** Binary numbers are resistant to errors, especially when compared to other systems such as analog voltages.

*** To represent the numbers 0-10 as an analog electrical value, we could use a voltage from 0 to 10 volts. However, if we use a long cable, there will be signal loss and the voltage will drop: we could apply 10 volts on one end of the cable, but only observe (say) 9.1 volts on the other end of the cable. Alternatively, electromagnetic interference from nearby devices could slightly increase the signal.

*** If we instead use the same voltages and cable length to carry a binary signal, where 0 volts == off == "0" and 10 volts == on == "1", a signal that had degraded from 10 volts to 9.1 volts would still be counted as a "1" and a 0 volt signal with some stray electromagnetic interference presenting as (say) 0.4 volts would still be counted as "0". However, we will need to use multiple bits to carry larger numbers -- either in parallel (multiple wires side-by-side), or sequentially (multiple bits presented over the same wire in sequence).
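The noise-tolerance argument above can be sketched as a small simulation. This is only an illustration under stated assumptions: the cable model (a 9% loss plus up to 0.4 volts of interference), the 5-volt decision threshold, and all function names are hypothetical and chosen to match the figures in the text.

```python
import random

HIGH_VOLTS = 10.0           # nominal "1" level, as in the text
THRESHOLD = HIGH_VOLTS / 2  # assumed decision point, midway between the two levels
ATTENUATION = 0.91          # assumed cable loss: 10 volts arrives as about 9.1 volts
NOISE_VOLTS = 0.4           # assumed worst-case interference, in volts


def degrade(volts, rng):
    """Model a long cable: attenuate the signal, then add random interference."""
    return volts * ATTENUATION + rng.uniform(-NOISE_VOLTS, NOISE_VOLTS)


def to_bits(value, width):
    """Split an integer into `width` bits, most significant bit first."""
    return [(value >> i) & 1 for i in reversed(range(width))]


def from_bits(bits):
    """Reassemble an integer from bits, most significant bit first."""
    result = 0
    for bit in bits:
        result = (result << 1) | bit
    return result


def send_analog(value, rng):
    """Send 0-10 as 0-10 volts; the receiver sees the degraded voltage."""
    return degrade(float(value), rng)


def send_binary(value, width, rng):
    """Send the value one bit at a time (sequentially) over the same cable."""
    received = [degrade(bit * HIGH_VOLTS, rng) for bit in to_bits(value, width)]
    return from_bits([1 if volts > THRESHOLD else 0 for volts in received])


if __name__ == "__main__":
    rng = random.Random(600)
    for value in range(11):
        print(value, round(send_analog(value, rng), 2), send_binary(value, 4, rng))
```

In this model the analog reading drifts away from the value that was sent, while the four-bit binary transfer recovers it exactly, at the cost of sending four signals instead of one.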

* Integers

** Integers are the basic building block of binary numbering schemes.

** The most commonly-used floating point formats are defined in the [[IEEE 754]] standard.

** IEEE 754 floating point numbers have three parts: a ''sign bit'' (0 for positive, 1 for negative), a ''mantissa'' or ''significand'', and an ''exponent''. The significand has an implied 1 and radix point preceding the stored value. The exponent is stored as an unsigned integer to which a ''bias'' value has been added; the bias value is 2<sup>(number of exponent bits - 1)</sup> - 1. The floating point value is interpreted in normal cases as <code>''sign'' mantissa * 2<sup>(exponent - bias)</sup></code>. Exponent values which are all-zeros or all-ones encode four categories of special cases: zero, infinity, Not a Number (NaN), and subnormal numbers (numbers which are close to zero, where the significand does not have an implied 1 to the left of the radix point); in these special cases, the sign bit and significand values may have special meanings.

** Some newer floating-point formats are appearing, such as ''Brain Float 16'' (bfloat16), a 16-bit format with the same dynamic range as 32-bit IEEE 754 floating point but with less precision, intended for use in machine learning applications.
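As an illustration of the floating point layout described above, the following sketch unpacks a 32-bit IEEE 754 value (8 exponent bits, 23 stored significand bits) into its three fields and rebuilds the value from them. The function names are hypothetical. The bfloat16 conversion simply keeps the upper 16 bits of the 32-bit pattern without rounding, which is a simplifying assumption rather than a description of how any particular hardware converts.

```python
import struct

EXP_BITS = 8                    # exponent width of 32-bit IEEE 754
FRAC_BITS = 23                  # stored significand bits (implied 1 not stored)
BIAS = 2 ** (EXP_BITS - 1) - 1  # 127 for 32-bit floats


def float32_raw(x):
    """Return the 32-bit pattern of x stored as an IEEE 754 single."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    return raw


def float32_fields(x):
    """Split x into (sign, biased exponent, stored fraction)."""
    raw = float32_raw(x)
    sign = raw >> 31
    exponent = (raw >> FRAC_BITS) & ((1 << EXP_BITS) - 1)
    fraction = raw & ((1 << FRAC_BITS) - 1)
    return sign, exponent, fraction


def classify(exponent, fraction):
    """Name the category selected by the exponent and fraction fields."""
    if exponent == 0:
        return "zero" if fraction == 0 else "subnormal"
    if exponent == (1 << EXP_BITS) - 1:
        return "infinity" if fraction == 0 else "NaN"
    return "normal"


def decode_normal(sign, exponent, fraction):
    """Apply sign * significand * 2^(exponent - bias) for the normal case."""
    significand = 1 + fraction / 2 ** FRAC_BITS  # restore the implied 1
    return (-1) ** sign * significand * 2.0 ** (exponent - BIAS)


def to_bfloat16_bits(x):
    """Keep only the top 16 bits: sign, 8 exponent bits, 7 fraction bits."""
    return float32_raw(x) >> 16


def from_bfloat16_bits(bits):
    """Widen a bfloat16 bit pattern back to a float by zero-filling."""
    (x,) = struct.unpack(">f", struct.pack(">I", bits << 16))
    return x


if __name__ == "__main__":
    print(float32_fields(-6.5))
    print(decode_normal(*float32_fields(-6.5)))
    print(from_bfloat16_bits(to_bfloat16_bits(3.14159)))
```

Because bfloat16 keeps all eight exponent bits, very large and very small values survive the conversion; only the low-order significand bits are lost.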

* Characters

** Characters are encoded as integers, where each integer corresponds to one "code point" in a character table (e.g., code 65 in ASCII corresponds to the character "A").

** Historically, many different coding schemes have been used, but the two most common ones were the American Standard Code for Information Interchange (ASCII), and Extended Binary Coded Decimal Interchange Code (EBCDIC - primarily used on IBM midrange and mainframe systems).

** ASCII characters occupy seven bits (code points 0-127) and include only the characters used in North American English. ASCII characters are usually stored in bytes, so many vendors of ASCII-based systems used the remaining codes 128-255 for special characters such as graphics, line symbols (horizontal, vertical, connector, and corner line symbols for drawing tables), and accented characters; these were called "extended ASCII".

** Several ISO standards attempt to standardize the "extended ASCII" characters, such as ISO8859, which was intended to enable the encoding of European languages by adding currency symbols and accented characters. However, the original ISO8859-1 does not include all accented characters and was created before the Euro symbol was standardized, so ISO8859 has multiple parts, ranging from ISO8859-1 through ISO8859-16.

** The Unicode and ISO10646 initiatives were started to create a single character code set that would encode all symbols used in human writing, for both current and obsolete languages. These initiatives were merged, and the Unicode and ISO10646 standards define a common character set with 2<sup>32</sup> potential code points. However, Unicode also describes transformation formats for data interchange, rendering and composition/decomposition recommendations, and font symbol recommendations.
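The relationship between code points and encodings described above can be explored with Python's built-in codecs. This is a small sketch: the helper name <code>encodings</code> is hypothetical, <code>cp037</code> is used as one example of an EBCDIC code page, and UTF-8 stands in for the Unicode transformation formats.

```python
CODECS = ("ascii", "cp037", "iso8859-1", "iso8859-15", "utf-8")


def encodings(character):
    """Encode one character with each codec; None means it is not representable."""
    results = {}
    for codec in CODECS:
        try:
            results[codec] = character.encode(codec)
        except UnicodeEncodeError:
            results[codec] = None
    return results


if __name__ == "__main__":
    print(ord("A"))  # the code point for "A"
    for character in ("A", "é", "€"):
        print(character, encodings(character))
    # The same byte means different things in different ISO8859 parts.
    print(b"\xa4".decode("iso8859-1"), b"\xa4".decode("iso8859-15"))
```

The letter "A" is code 65 in ASCII but a different byte value in EBCDIC, "é" needs an 8-bit or multi-byte encoding, and the Euro symbol is missing from ISO8859-1 but present in ISO8859-15 and UTF-8.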

Chris Tyler

