Difference between revisions of "Assembly Language"

From CDOT Wiki
m (arm64 section format fix)
 
(15 intermediate revisions by 2 users not shown)
Line 1: Line 1:
[[Category:Computer Architecture]]
+
[[Category:Computer Architecture]][[Category:Assembly Language]]
''Assembly language'' is a [[Symbol|symbolic]] representation of [[Machine Language|machine language]]. It is therefore [[Portable|architecture-specific]].
+
''Assembly language'' is a [[Symbol|symbolic]] representation of [[Machine Language|machine language]]. It is therefore very [[Portable|architecture-specific]].
  
Each instruction is represented by a short mnemonic word such as "LDR" for ''Load Register'', "MOV" for ''move'', or "MUL" for ''multiply'', followed by (optional) arguments. The [[Addressing Mode|addressing mode]] is implied by the format of the arguments.
+
Each instruction is represented by a short mnemonic word such as "LDR" for ''Load Register'', "MOV" for ''move'', or "MUL" for ''multiply'', followed by (optional) arguments. The [[Addressing Mode|addressing mode]] is implied by the format of the arguments. Different [[Assembler|assemblers]] use slightly different syntax.
  
 
== Examples ==
 
== Examples ==
Line 8: Line 8:
 
=== x86 ===
 
=== x86 ===
  
Here is a "Hello, World!" program in x86 assembler for a Linux system, using the Nasm syntax:
+
Here is a "Hello, World!" program written for an x86_64 Linux system using the [https://sourceware.org/binutils/docs/as/ GNU Assembler (gas/as)] syntax (which is the main assembler used in open source projects such as the Linux kernel, as well as the [[SPO600]] course), using Linux [[Syscalls]]:
 +
 
 +
.text
 +
.globl _start
 +
_start:
 +
    mov $len, %edx    /* message length (bytes) */
 +
    mov $msg, %ecx    /* message location (memory address) */
 +
    mov $1, %ebx      /* file descriptor: 1 is stdout */
 +
    mov $4, %eax      /* write is syscall #4 */
 +
    int $0x80          /* invoke syscall */
 +
 +
    mov $0, %ebx      /* exit status: 0 (good) */
 +
    mov $1, %eax      /* kernel syscall number: 1 is sys_exit */
 +
    int $0x80          /* invoke syscall */
 +
 +
.data
 +
msg:
 +
    .ascii "Hello, World!\n"
 +
    len = . - msg
 +
 
 +
Here is a similar program for a 32-bit x86 system using the [http://www.nasm.us/xdoc/2.11/html/nasmdoc1.html#section-1.1 Nasm] syntax:
  
 
  section    .text
 
  section    .text
Line 29: Line 49:
 
  len    equ    $ - msg
 
  len    equ    $ - msg
  
Here is the same program with GNU Assembler (gas/as) syntax:
 
  
.text
+
Notice that the order of the arguments in some lines is reversed between the two assemblers, and the prefixes to symbols and values also change.
.globl _start
+
 
_start:
 
    mov $len, %edx    /* message length (bytes) */
 
    mov $msg, %ecx    /* message location (memory address) */
 
    mov $1, %ebx      /* file descriptor: 1 is stdout */
 
    mov $4, %eax      /* write is syscall #4 */
 
    int $0x80          /* invoke syscall */
 
 
    mov $0, %ebx      /* exit status: 0 (good) */
 
    mov $1, %eax      /* kernel syscall number: 1 is sys_exit */
 
    int $0x80          /* invoke syscall */
 
 
.data
 
msg:
 
    .ascii "Hello, World!\n"
 
    len = . - msg
 
  
 
=== ARM (32-bit) ===
 
=== ARM (32-bit) ===
  
This is written in GNU Assembler (gas/as) syntax:
+
This is written in [https://sourceware.org/binutils/docs/as/ GNU Assembler (gas/as)] syntax using Linux [[Syscalls]]:
  
 
  .text
 
  .text
Line 72: Line 76:
 
       len = . - msg
 
       len = . - msg
  
=== External Examples ===
 
  
"Hello World" in many different types of assembler: http://leto.net/code/asm/hw_assembler.php
+
=== ARM (64-bit) - AArch64 ===
 +
 
 +
This is also in [https://sourceware.org/binutils/docs/as/ GNU Assembler (gas/as)] syntax using Linux [[Syscalls]]:
 +
 
 +
.text
 +
.globl _start
 +
_start:
 +
 +
mov    x0, 1          /* file descriptor: 1 is stdout */
 +
adr    x1, msg  /* message location (memory address) */
 +
  mov    x2, len  /* message length (bytes) */
 +
 
 +
  mov    x8, 64    /* write is syscall #64 */
 +
svc    0          /* invoke syscall */
 +
 +
mov    x0, 0    /* status -> 0 */
 +
mov    x8, 93    /* exit is syscall #93 */
 +
svc    0          /* invoke syscall */
 +
 +
.data
 +
msg: .ascii      "Hello, world!\n"
 +
len= . - msg
 +
 
 +
 
 +
=== 6502 ===
 +
 
 +
Here is the same "Hello World" program in [[6502]] assembler as used in the [[6502 Emulator]], using the [[6502_Emulator#ROM_Routines|ROM routines]] for output:
 +
 
 +
define SCINIT $ff81 ; initialize/clear screen
 +
define CHROUT $ffd2 ; output character to screen
 +
 +
JSR SCINIT ; clear screen
 +
LDY #$00 ; set Y index to zero
 +
 +
loop: LDA msg,Y ; get a character
 +
BEQ done ; quit if character is null
 +
JSR CHROUT ; output the character
 +
INY ; increment index
 +
JMP loop ; get next character
 +
 +
done: BRK ; break (stop program)
 +
 +
msg:
 +
DCB "H","e","l","l","o",$2C,$20
 +
DCB "W","o","r","l","d","!",$0d, $00
 +
 
 +
 
 +
== Resources ==
 +
 
 +
* [[Assembler Basics]]
 +
* [http://leto.net/code/asm/hw_assembler.php "Hello World" in many different types of assembler]
 +
* [[x86_64 Register and Instruction Quick Start]]
 +
* [[aarch64 Register and Instruction Quick Start]]

Latest revision as of 12:22, 26 October 2022

Assembly language is a symbolic representation of machine language. It is therefore very architecture-specific.

Each instruction is represented by a short mnemonic word such as "LDR" for Load Register, "MOV" for move, or "MUL" for multiply, followed by (optional) arguments. The addressing mode is implied by the format of the arguments. Different assemblers use slightly different syntax.
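To make the idea of a "symbolic representation" concrete, here is a minimal Python sketch that maps a few 6502 mnemonics to the opcode bytes they stand for. The OPCODES table and the assemble() helper are hypothetical names chosen for illustration, and the addressing mode is passed explicitly here, whereas a real assembler infers it from the format of the arguments (for example, a leading # for an immediate value).

```python
# Toy illustration: a few 6502 mnemonics and the opcode bytes they stand for.
# OPCODES and assemble() are hypothetical helpers, not part of any real assembler.
OPCODES = {
    ("LDY", "immediate"): 0xA0,  # LDY #value
    ("INY", "implied"): 0xC8,    # INY
    ("BRK", "implied"): 0x00,    # BRK
    ("JMP", "absolute"): 0x4C,   # JMP address
    ("JSR", "absolute"): 0x20,   # JSR address
}


def assemble(mnemonic, mode, operand=None):
    """Return the machine-code bytes for one instruction."""
    code = [OPCODES[(mnemonic, mode)]]
    if mode == "immediate":
        code.append(operand & 0xFF)         # one-byte value
    elif mode == "absolute":
        code.append(operand & 0xFF)         # low byte first (little-endian)
        code.append((operand >> 8) & 0xFF)  # then high byte
    return bytes(code)


print(assemble("LDY", "immediate", 0x00).hex())   # a000
print(assemble("JSR", "absolute", 0xFFD2).hex())  # 20d2ff
```

Each line of assembly source corresponds to a short, fixed byte sequence, which is why the same source text cannot be reused on a different architecture.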

Examples

x86

Here is a "Hello, World!" program written for an x86_64 Linux system in GNU Assembler (gas/as) syntax, the main assembler used in open source projects such as the Linux kernel as well as in the SPO600 course. It uses Linux Syscalls through the 32-bit int $0x80 interface:

.text
.globl _start
_start:
    mov $len, %edx     /* message length (bytes) */
    mov $msg, %ecx     /* message location (memory address) */
    mov $1, %ebx       /* file descriptor: 1 is stdout */
    mov $4, %eax       /* write is syscall #4 */
    int $0x80          /* invoke syscall */

    mov $0, %ebx       /* exit status: 0 (good) */
    mov $1, %eax       /* kernel syscall number: 1 is sys_exit */
    int $0x80          /* invoke syscall */

.data
msg:
    .ascii "Hello, World!\n"
    len = . - msg
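For readers more familiar with high-level languages, the write syscall above takes three inputs: a file descriptor, a buffer address, and a length. The following Python sketch shows the same operation through os.write, which passes the descriptor and the buffer (whose length is implied) to the operating system. The hello() function name is an assumption made for this example.

```python
import os

MSG = b"Hello, World!\n"


def hello(fd=1):
    """Write MSG to file descriptor fd and return the number of bytes written."""
    # Same inputs as the syscall: file descriptor, buffer, and (implicitly) length.
    return os.write(fd, MSG)


hello()  # writes the message to stdout (file descriptor 1)
```

In the assembly version each of these inputs is placed in a specific register before the syscall is invoked, and a second syscall (exit) ends the program explicitly.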

Here is a similar program for a 32-bit x86 system using NASM syntax:

section    .text
global    _start

_start:
    mov    edx,len          ; message length (bytes)
    mov    ecx,msg          ; message location (memory address)
    mov    ebx,1            ; file descriptor: 1 is stdout
    mov    eax,4            ; kernel syscall number: 4 is sys_write
    int    0x80             ; invoke syscall

    mov    ebx,0            ; exit status: 0 (good)
    mov    eax,1            ; kernel syscall number: 1 is sys_exit
    int    0x80             ; invoke syscall

section    .rodata

msg    db    'Hello, world!',10    ; 10 is the newline character (NASM does not expand \n inside single quotes)
len    equ    $ - msg


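Notice that the order of the arguments is reversed between the two assemblers: gas places the source first and the destination second (mov $len, %edx), while NASM places the destination first (mov edx,len). The prefixes also differ: gas marks registers with % and immediate values with $, while NASM uses neither.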
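The difference in operand order and prefixes can be sketched as a small text transformation. The att_to_intel() function below is a hypothetical toy that handles only simple two-operand lines with register (%reg) and immediate or symbol ($value) operands; it is not a real syntax converter.

```python
def att_to_intel(line):
    """Convert a simple two-operand gas-style line to NASM-style operand order.

    Toy sketch: supports only %register and $immediate/$symbol operands.
    """
    mnemonic, operands = line.split(None, 1)
    src, dst = [op.strip() for op in operands.split(",")]

    def strip_prefix(op):
        # Drop the gas prefixes: % on registers, $ on immediates and symbols.
        return op.lstrip("%$")

    # gas order is source, destination; NASM order is destination, source.
    return f"{mnemonic} {strip_prefix(dst)}, {strip_prefix(src)}"


print(att_to_intel("mov $len, %edx"))  # mov edx, len
```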


ARM (32-bit)

This is written in GNU Assembler (gas/as) syntax using Linux Syscalls:

.text
.globl _start
_start:

     mov     %r0, $1     /* file descriptor: 1 is stdout */
     ldr     %r1, =msg   /* message location (memory address) */
     ldr     %r2, =len   /* message length (bytes) */
     mov     %r7, $4     /* write is syscall #4 */
     swi     $0          /* invoke syscall */

     mov     %r0, $0     /* exit status: 0 (good) */
     mov     %r7, $1     /* kernel syscall number: 1 is sys_exit */
     swi     $0          /* invoke syscall */

.data
msg:
     .ascii      "Hello, world!\n"
     len = . - msg


ARM (64-bit) - AArch64

This is also in GNU Assembler (gas/as) syntax using Linux Syscalls:

.text
.globl _start
_start:

	mov     x0, 1           /* file descriptor: 1 is stdout */
	adr     x1, msg         /* message location (memory address) */
	mov     x2, len         /* message length (bytes) */

	mov     x8, 64          /* write is syscall #64 */
	svc     0               /* invoke syscall */

	mov     x0, 0           /* exit status: 0 (good) */
	mov     x8, 93          /* exit is syscall #93 */
	svc     0               /* invoke syscall */

.data
msg: 	.ascii      "Hello, world!\n"
len= 	. - msg
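The Linux examples above all follow the same pattern: put the syscall number in one register, the arguments in others, then execute a trap instruction. The Python sketch below collects the conventions used in those listings into one table. The SYSCALL_ABI name, its keys, and the describe_write() helper are illustrative choices, not an official API; the values are taken from the listings in this article.

```python
# Linux syscall conventions as used by the examples in this article.
# The dictionary layout and names are illustrative only.
SYSCALL_ABI = {
    "x86 (int 0x80)": {
        "invoke": "int $0x80", "number_reg": "eax",
        "arg_regs": ("ebx", "ecx", "edx"), "write": 4, "exit": 1,
    },
    "ARM 32-bit": {
        "invoke": "swi 0", "number_reg": "r7",
        "arg_regs": ("r0", "r1", "r2"), "write": 4, "exit": 1,
    },
    "AArch64": {
        "invoke": "svc 0", "number_reg": "x8",
        "arg_regs": ("x0", "x1", "x2"), "write": 64, "exit": 93,
    },
}


def describe_write(arch):
    """Describe how the write syscall is set up on the given architecture."""
    abi = SYSCALL_ABI[arch]
    fd_reg, buf_reg, len_reg = abi["arg_regs"]
    return (
        f"{arch}: put {abi['write']} in {abi['number_reg']}, fd in {fd_reg}, "
        f"buffer address in {buf_reg}, length in {len_reg}, then {abi['invoke']}"
    )


for arch in SYSCALL_ABI:
    print(describe_write(arch))
```

The structure of the program is identical across architectures; only the register names, syscall numbers, and trap instruction change.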


6502

Here is the same "Hello World" program in 6502 assembler as used in the 6502 Emulator, using the ROM routines for output:

define SCINIT $ff81 ; initialize/clear screen
define CHROUT $ffd2 ; output character to screen

	JSR SCINIT	; clear screen
	LDY #$00	; set Y index to zero

loop:	LDA msg,Y	; get a character
	BEQ done	; quit if character is null
	JSR CHROUT	; output the character
	INY		; increment index
	JMP loop	; get next character

done:	BRK		; break (stop program)

msg:	
	DCB "H","e","l","l","o",$2C,$20
	DCB "W","o","r","l","d","!",$0d, $00
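Unlike the Linux examples, the 6502 program has no length value: it outputs characters until it reaches a null byte. The following Python sketch mirrors that loop, with each statement annotated with the instruction it models. The print_message() function and its chrout callback are hypothetical stand-ins for the program and the CHROUT ROM routine.

```python
# Memory image of the msg data from the listing: two DCB lines ending in a null byte.
MSG = b"Hello" + bytes([0x2C, 0x20]) + b"World!" + bytes([0x0D, 0x00])


def print_message(memory, chrout):
    """Model the loop: LDA msg,Y / BEQ done / JSR CHROUT / INY / JMP loop."""
    y = 0                  # LDY #$00
    while True:
        a = memory[y]      # LDA msg,Y
        if a == 0:         # BEQ done (loading a null byte sets the zero flag)
            break
        chrout(a)          # JSR CHROUT
        y += 1             # INY
    return y               # number of characters output


out = bytearray()
count = print_message(MSG, out.append)
print(count, bytes(out))
```

The zero byte at the end of the second DCB line is what terminates the loop; without it, the program would continue reading whatever follows the message in memory.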


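Resources

* Assembler Basics
* "Hello World" in many different types of assembler: http://leto.net/code/asm/hw_assembler.php
* x86_64 Register and Instruction Quick Start
* aarch64 Register and Instruction Quick Start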