&& Developing an Operating System

 This is a booklet about the process of developing an operating system.
 First a 512 byte bootloader is created which performs a simple function
 such as printing a message or getting a key from the keyboard. Then that
 bootloader is used to load and execute a larger os 'kernel'. Really, the
 snippets here just contain information about writing simple 'bootable'
 programs for x86 computers. The source code is in x86 assembler (nasm
 syntax) and was inspired by the MikeOS code.

 == main bios interrupt functions
 .. int 13h, disk access (via sectors, cylinders etc)
 ..

GETTING HELP

 wiki.osdev.org is a good site

THE BOOT PROCESS

 The computer powers on and starts executing the bios. The bios then
 looks for a bootable sector (512 bytes) on a suitable 'boot medium'.
 That could be an old floppy disk (in days gone by) or now a USB memory
 stick, CD, or hard-drive. The boot medium is supposed to contain a
 'boot signature', which is the two bytes 55h and AAh at the very end of
 the boot sector (written as 'dw 0xAA55' in the code in this booklet).
 This is to ensure that the computer does not attempt to boot something
 which is not supposed to be booted.

 http://board.flatassembler.net/topic.php?p=124387
   interesting information about booting from usb by mike gonta

STEP BY STEP

 This section describes, step by step, how to create a bootable x86
 usb key.

 * install the correct programs
 >> sudo apt-get install qemu-system-x86 nasm dosfstools ...

 On debian the 'dosfstools' package provides mkdosfs and the
 'qemu-system-x86' package provides qemu-system-i386.

 * make a new floppy image called 'os.flp'
 >> mkdosfs -C os.flp 1440

 The above line will not overwrite an existing file. Another way is to
 copy an existing floppy disk image.

 * compile the assembler source into a flat binary executable
 >> nasm -f bin -o first.bin first.asm
 >> nasm -o first.bin first.asm   ##(the same)

 The 'bin' format is the default for the nasm assembler.
 * insert the compiled kernel 'first.bin' into the floppy image
 >> dd status=noxfer conv=notrunc if=first.bin of=os.flp

 * boot the operating system in the qemu virtual machine
 >> qemu -fda os.flp
 >> qemu-system-i386 -fda os.flp

 * create an iso file, which can be burnt to a cd, from the 'cdiso' folder
 >> mkisofs -o myfirst.iso -b myfirst.flp cdiso/

 Use 'df' or 'dmesg' to find out the device name of a usb key which you
 have inserted, eg '/dev/sdc'

 * unmount the usb key
 >> umount /dev/sdc

 WARNING: the following command will delete all previous data on the
 usb memory stick. When you execute the command below, the little light
 on the usb memory stick should flash a few times, indicating that data
 is being written to the stick.

 * write the new operating system to the boot sector of the usb key
 >> sudo dd if=os.flp of=/dev/sdc
 >> su; dd if=os.flp of=/dev/sdc   ##(on a non debian system)

 Be Very, Very careful where you write the floppy image file to. If you
 write it to your hard-disk (for example /dev/hda or /dev/sda) that is
 more or less the end of the data and operating system on that
 hard-disk.

 If the usb memory stick device name has a number, do not use the
 number, just the letters of the device name, eg 'sdc1' becomes 'sdc'
 (remove the number 1 from the name).

 The usb memory stick can now be used to boot the new operating system
 by changing the computer boot order in the bios. Eg press the boot-menu
 key on an asus eee pc, or the ibm key on a thinkpad.

MEMORY ADDRESSES IN X86 REAL MODE

 Real mode addresses are 20 bits wide but are made up of a 16 bit
 segment address and a 16 bit offset. It is not that complicated: the
 physical address is the segment multiplied by 16 (ie shifted left by
 one hex digit) plus the offset. So the segment address 07C0h, which
 needs to be loaded into ds or another segment register with something
 like

   mov ax, 07C0h
   mov ds, ax

 is really the physical address 07C00h (with an offset of 0). Since the
 result is a 20 bit address, real mode can only address 1 megabyte of
 memory, no more. Hence one of the needs for protected mode...
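 The segment arithmetic can be tried out in a boot sector. The following
 is only a sketch and is not part of the original notes. It assumes that
 the bios has loaded the sector at physical address 07C00h and that the
 code is assembled with the nasm default origin of 0. The label 'marker'
 is a made-up name for a scratch byte. The byte is written through one
 segment:offset pair and read back through another.

 * read one physical address through two segment:offset pairs
 -----------------
   BITS 16
   ; physical address = segment * 16 + offset
   ;   07C0h:0000h  ->  07C0h * 16 + 0000h = 07C00h
   ;   0000h:7C00h  ->  0000h * 16 + 7C00h = 07C00h
 start:
   mov ax, 07C0h
   mov ds, ax                         ; DS = 07C0h
   xor ax, ax
   mov es, ax                         ; ES = 0000h
   mov byte [es:7C00h + marker], 'S'  ; write via 0000h:(7C00h+marker)
   mov al, [marker]                   ; read via 07C0h:marker
   mov ah, 0Eh                        ; bios teletype function
   int 10h                            ; 'S' if both pairs name the same byte
   jmp $                              ; infinite loop
   marker db 'D'                      ; stays 'D' if the addresses differ
   times 510-($-$$) db 0              ; pad the boot sector with 0s
   dw 0xAA55                          ; boot signature
 ,,,

 If the two pairs refer to the same physical byte the program should
 print an 'S' (same), otherwise a 'D' (different).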
SIMPLE BOOT PROGRAM

 The boot sector from a floppy (or usb) or hard-disk is loaded by the
 bios to the physical memory location 07C00h, which corresponds to
 offset 0 of the 'segment' (in real mode) 07C0h. The code below seems to
 have a problem with register indirect jumps (code segment not set
 properly).

 * a simple example of a bootable program with stack and a function
 -----------------
   BITS 16
 start:
   mov ax, 07C0h        ; Set up 4K stack space after this bootloader
   add ax, 288          ; (4096 + 512) / 16 bytes per paragraph
   mov ss, ax           ; this creates a 4K gap between stack and code
   mov sp, 4096
   mov ax, 07C0h        ; Set data segment to where we're loaded
   mov ds, ax
   mov si, text_string  ; Put string position into SI
   call print_string    ; Call our string-printing routine
   jmp $                ; Jump here - infinite loop!
   text_string db 'This is my cool new OS!!!', 0
 print_string:          ; Routine: output string in SI to screen
   mov ah, 0Eh          ; int 10h 'print char' function
 .repeat:
   lodsb                ; Get character from string
   cmp al, 0
   je .done             ; If char is zero, end of string
   int 10h              ; Otherwise, print it
   jmp .repeat
 .done:
   ret
   times 510-($-$$) db 0  ; Pad remainder of boot sector with 0s
   dw 0xAA55              ; The standard PC boot signature
 ,,,

GOTCHAS ....

 Some bioses require a short jump (+/-128 bytes) followed by a 'nop'
 (no operation) instruction at the start of the boot sector in order to
 execute it, even though there is no obvious logical reason for this.

 * start the boot sector like this
 -------
   jmp short start
   nop
 start:
 ,,,

BOOTLOADERS

 The initial bootable program may only be 512 bytes long since it must
 fit into 1 sector of the 'floppy'. This is limiting. The answer is to
 use these 512 bytes to load a bigger program into memory and jump to
 it. The code below shows how. Sector 2, head 0, cylinder 0 is the
 sector (512 bytes) immediately following the sector occupied by the
 boot program, and it contains the code we want to execute.

 After booting, the DL register contains the number of the boot drive.
 For example for a usb memory stick on my asus eee pc DL=128.
 This number should be saved for use with the read and write functions
 of INT 13h.

 * a simple working bootloader
 -----------------
   BITS 16
   jmp start
   drive db 0           ; a variable to hold boot drive number
 start:
   mov ax, 07C0h        ; Set data segment to where we're loaded
   mov ds, ax
   mov [drive], dl      ; save the boot drive number
   mov ax, 07C0h        ; Set up 4K stack space after this bootloader
   add ax, 288          ; (4096 + 512) / 16 bytes per paragraph
   mov ss, ax           ; with a 4K gap between stack and code
   mov sp, 4096
   ; save the DL register or else do not modify it.
   ; it contains the number of the boot medium (hard disk,
   ; usb memory stick etc)
   ; The 'floppy' drive is NOT necessarily 0!!!
 reset:                 ; Reset the floppy drive
   mov ax, 0            ;
   mov dl, [drive]      ; the boot drive number (eg for usb 128)
   int 13h              ;
   jc reset             ; ERROR => reset again
 read:
   mov ax, 1000h        ; ES:BX = 1000:0000
   mov es, ax           ; es:bx determines where data is loaded to
   mov bx, 0            ;
   mov ah, 2            ; Load disk data to ES:BX
   mov al, 5            ; Load 5 sectors (only 1 used here)
   mov ch, 0            ; Cylinder=0
   mov cl, 2            ; Sector=2 (sector 1 is the boot sector)
   mov dh, 0            ; Head=0
   mov dl, [drive]      ;
   int 13h              ; Read!
   jc read              ; ERROR => Try again
   jmp 1000h:0000       ; Jump to the loaded code
   times 510-($-$$) db 0  ; pad out the boot sector (512 bytes)
   dw 0AA55h              ; end with standard boot signature

   ; the code to be loaded and executed
   mov ah, 0x0A
   mov al, '!'
   mov cx, 10
   int 10h
 hang:
   jmp hang
 ,,,

 The only difference in the code below is that the loaded program is
 contained in a separate file, which is handy for organisational
 reasons.

 * another way of writing the bootloader, almost identical
 -----------------------------
   ; 3.ASM
   ; Load a program off the disk and jump to it

   ; Tells the assembler that this is offset 0.
   ; It isn't offset 0, but it will be after the jump.
   [ORG 0]
   jmp 07C0h:start      ; Goto segment 07C0
 start:
   push dx              ; save the boot medium drive number
   ; Update the segment registers
   mov ax, cs
   mov ds, ax
   mov es, ax
 reset:                 ; Reset the floppy drive
   ; drive number in DL, unmodified since boot
   mov ax, 0            ;
   int 13h              ;
   jc reset             ; ERROR => reset again
 read:
   mov ax, 1000h        ; ES:BX = 1000:0000
   mov es, ax           ;
   mov bx, 0            ;
   mov ah, 2            ; Load disk data to ES:BX
   mov al, 5            ; Load 5 sectors
   mov ch, 0            ; Cylinder=0
   mov cl, 2            ; Sector=2
   mov dh, 0            ; Head=0
   ; drive number in DL, unmodified since boot
   int 13h              ; Read!
   jc read              ; ERROR => Try again
   jmp 1000h:0000       ; Jump to the program
   times 510-($-$$) db 0
   dw 0AA55h

   ; This is a small loadable program.
   ; PROG.ASM
   mov ah, 9
   mov al, '='
   mov bx, 7
   mov cx, 10
   int 10h
 hang:
   jmp hang

   ; This program creates a disk image file that contains both
   ; the bootstrap and the small loadable program.
   ; IMAGE.ASM
   ; Disk image
   %include '3.asm'
   %include 'prog.asm'
 ,,,

 The code below does not modify the DL register, which contains the
 drive number of the boot medium immediately after boot (the bios
 places it there). It would be better and safer to save DL for use with
 the int 13h read/write functions.

 * a boot loader which shows what it is up to
 -----------------
   BITS 16
   jmp start
   %include 'prints.asm'
   %include 'printi8.asm'
   m.reset db 'resetting floppy',13,10,0
   m.read db 'reading sector 2 of floppy',13,10,0
   m.dlstate db 'dl is ',0
 start:
   mov ax, 07C0h        ; Set up 4K stack space after this bootloader
   add ax, 288          ; (4096 + 512) / 16 bytes per paragraph
   mov ss, ax
   mov sp, 4096
   mov ax, 07C0h        ; Set data segment to where we're loaded
   mov ds, ax
   mov si, m.dlstate
   call prints
   mov bl, 10
   mov al, dl
   call printi8
   mov cx, 4            ; try to reset drive 4 times
 .reset:                ; Reset the floppy drive
   mov si, m.reset
   call prints
   mov ax, 0            ;
   ;mov dl, 0           ; Drive=0 (=A), no!
   ; use the DL value after boot
   int 13h
   jnc .startread
   loop .reset          ; on error (carry flag) reset again 3 times
 .startread:
   mov di, 4            ; try to read 4 times (DI is the retry counter
                        ; because CH and CL are needed for the read)
 .read:
   mov si, m.read
   call prints
   mov ax, 1000h        ; ES:BX = 1000:0000
   mov es, ax           ; es:bx determines where data is loaded to
   mov bx, 0            ;
   mov ah, 2            ; Load disk data to ES:BX
   mov al, 5            ; Load 5 sectors (only 1 used here)
   mov ch, 0            ; Cylinder=0
   mov cl, 2            ; Sector=2 (sector 1 is the boot sector)
   mov dh, 0            ; Head=0
   ;mov dl, 0           ; Drive=0, 'floppy' (or usb memory stick)
   int 13h              ; Read!
   jnc .done
   dec di
   jnz .read            ; on error (carry flag) try again 3 times
   jmp $                ; all attempts failed, give up
 .done:
   jmp 1000h:0000       ; Jump to the loaded code
   times 510-($-$$) db 0
   dw 0AA55h

   ; the code to be loaded and executed
   jmp start2
   m.loaded db 'loaded data!',13,10,0
 start2:
   mov ah, 0x0A
   mov al, '!'
   mov cx, 10
   int 10h
 hang:
   jmp hang
 ,,,

 ; boot1.asm stand alone program for floppy boot sector
 ; Compiled using nasm -f bin boot1.asm
 ; Written to floppy with dd if=boot1 of=/dev/fd0

REBOOTING ....

 * reboot the computer by jumping to FFFF:0
 ------------------
   ; The boot record is loaded at physical address 07C00h.
   ; This code is assembled for segment 0, offset 7c00h
   org 7c00h
   xor ax, ax
   mov ds, ax           ; make sure DS matches the 'org' above
   lea si,[msg]         ; load message address into SI register:
   mov ah,0eh
 print:
   mov al,[si]
   cmp al,0
   jz done              ; zero byte at end of string
   int 10h              ; write character to screen.
   inc si
   jmp print
 done:
   mov ah,0             ; wait for any key:
   int 16h              ; waits for key press
   ; store magic value at 0040h:0072h to reboot:
   ;   0000h - cold boot.
   ;   1234h - warm boot.
   mov ax,0040h
   mov ds,ax
   mov word[0072h],0000h   ; cold boot.
   jmp 0ffffh:0000h        ; reboot!
   msg db 'welcome, i have control of the computer.',13,10
       db 'press any key to reboot.',13,10
       db '(after removing the floppy)',13,10,0
   times 510-($-$$) db 0   ; pad to 510 bytes
   dw 0AA55h               ; boot signature
 ,,,

 * reboot the computer, but this may lock up the computer.
 >> int 19h

 * reboot the computer after a user keypress with int 19h, may lock!
 -------------
   mov ah, 0            ; x86 bios wait for keypress function
   int 16h
   mov ah, 0eH          ; echo the key just pressed
   int 10H
   int 19h              ; reboot the computer
   times 510-($-$$) db 0  ; Pad remainder of boot sector with 0s
   dw 0xAA55              ; PC boot signature
 ,,,

SEGMENTS

STACK SEGMENT ....

 The stack segment register SS, together with the stack pointer SP,
 determines the memory used by the PUSH and POP instructions. The bios
 appears to leave it initialized to something usable, but we can
 initialize it explicitly if we need a big stack etc.

DATA SEGMENT ....

 The DS or data segment register needs to be initialized before
 accessing a variable with [var], since the offsets of these variables
 are calculated relative to the value in the DS register. The following
 2 lines are sufficient. The number 07C0h works because the boot sector
 is loaded at physical address 07C00h (= 07C0h * 16) and nasm assembles
 the code with an origin of 0 unless told otherwise. With 'org 7c00h'
 the matching value for DS would be 0 instead.

 * Initialize the data segment register DS
 ----------
   mov ax, 07C0h
   mov ds, ax           ; load DS with correct value
 ,,,

 * Error!! prints rubbish not 1st and 2nd char of 'message'
 ------------
   jmp start
   message db 'hello!'
 start:
   mov ah, 0eh
   mov al, [message]    ;! DS has not been initialized
   int 10h              ; will display garbage
   mov al, [message+1]  ; same...
   int 10h
 hang:
   jmp hang
 ,,,

 * Correct! print the first two characters of a string
 ------------
   jmp start
   message db 'hello!'
 start:
   mov ax, 07C0h        ; Initialize data segment DS register
   mov ds, ax           ; load DS with correct value
   mov ah, 0eh          ; bios teletype function
   mov al, [message]    ; first char of 'message'
   int 10h              ; invoke bios
   mov al, [message+1]  ; 2nd char of 'message'
   int 10h
 hang:
   jmp hang
 ,,,

 * also works in qemu with no boot signature
 ------------
   mov ax, 07C0h        ; Initialize data segment DS register
   mov ds, ax           ; load DS with correct value
   mov ah, 0eh          ; bios teletype function
   mov al, [message]    ; first char of 'message'
   int 10h              ; invoke bios
   mov al, [message+1]  ; 2nd char of 'message'
   int 10h
 hang:
   jmp hang
   message db 'hello!'
 ,,,

EXTENDED READ WRITE FUNCTIONS

 For disk sectors outside of the range that cylinder, head and sector
 numbers can reach, use the extended INT 13h functions with a DAP (disk
 address packet) data structure.
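 A minimal sketch of an extended read follows. It is not from the
 original notes and has not been tried on real hardware. It assumes the
 usual 16 byte layout of the disk address packet and the INT 13h
 functions AH=41h (check that the extensions are present) and AH=42h
 (extended read, with DS:SI pointing to the packet). The labels 'dap',
 'drive' and 'fail' are made-up names. Bioses often provide the extended
 functions only for hard-disk style drives (DL=80h and above), so in
 qemu the image probably has to be attached with '-hda os.flp' rather
 than '-fda os.flp'.

 * load the second sector with the extended read function 42h
 -----------------
   BITS 16
   jmp short start
   nop
   drive db 0           ; boot drive number saved from DL
   align 4, db 0
 dap:                   ; disk address packet, 16 bytes
   db 10h               ; size of this packet
   db 0                 ; reserved, must be 0
   dw 1                 ; number of sectors to read
   dw 0000h             ; buffer offset  \ data is loaded to 1000h:0000h
   dw 1000h             ; buffer segment /
   dq 1                 ; first sector as an LBA number (0 = boot sector)
 start:
   mov ax, 07C0h        ; Set data segment to where we're loaded
   mov ds, ax
   add ax, 288          ; 4K stack after the bootloader
   mov ss, ax
   mov sp, 4096
   mov [drive], dl      ; save the boot drive number
   mov ah, 41h          ; are the extended functions available?
   mov bx, 55AAh
   int 13h
   jc fail              ; carry set => not supported
   cmp bx, 0AA55h
   jne fail
   mov ah, 42h          ; extended read
   mov dl, [drive]
   mov si, dap          ; DS:SI -> disk address packet
   int 13h
   jc fail              ; carry set => read error
   jmp 1000h:0000       ; Jump to the loaded code
 fail:
   mov ax, 0E00h + 'X'  ; teletype function, print 'X' on failure
   int 10h
   jmp $
   times 510-($-$$) db 0  ; pad out the boot sector
   dw 0AA55h              ; boot signature

   ; second sector (LBA 1): the code to be loaded and executed
   mov ah, 0Eh
   mov al, '!'
   int 10h
 hang:
   jmp hang
 ,,,

 The packet uses a plain sector number (LBA) instead of cylinder, head
 and sector values, which is why no CH, CL or DH setup is needed.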
 http://forum.osdev.org/viewtopic.php?f=13&t=27510
   good posts about this topic

 Use INT 13h with AH=42h (read) or AH=43h (write). These extended
 functions are used with a DAP, a data structure in memory which
 describes the transfer.

MOVING DATA

 == data moving instructions
 .. mov - move data around
 .. xchg - exchange the contents of 2 registers/memory
 ..

MOV ...

 The 'mov' x86 instruction is perhaps the simplest and most fundamental
 instruction.

XCHG ...

 When one of the operands is AX this instruction is encoded in a single
 byte, which is shorter than a register to register mov (2 bytes), so it
 is more desirable in some circumstances (eg when a program has to fit
 into 512 bytes).

STRINGS AND TEXT

 A 'string' in this context is just a series of bytes, words or double
 words which exist in contiguous memory locations. The bytes may
 represent characters in some human language, or they may not. It is up
 to you. x86 assembly language has special instructions for dealing with
 strings such as movs, movsb etc. But each instruction only deals with
 one byte, word etc at a time (unless you combine these instructions
 with a 'rep' prefix).

STRING INSTRUCTIONS ....

 == summary
 .. cmpsb - compare bytes from 2 strings (in DS:SI and ES:DI)
 .. cmpsw - compare double bytes from 2 strings (DS:SI and ES:DI)
 .. lodsb - load a byte from a string into AL
 .. lodsw - load 2 bytes from a string into AX
 .. lodsd - load 4 bytes from a string into EAX
 ..

 * load a byte character from a string into AL and update SI
 >> lodsb

STOS ....

 This is the "store a string" instruction and includes stosb, store a
 byte, stosw, store a word etc. The destination is always ES:DI, so ES
 has to be initialised first.

 * initialise an array with -1
 -----------
   BITS 16
   jmp start
   array resw 100
 start:
   mov ax, 07C0h
   mov es, ax           ; stosw writes to ES:DI
   mov cx, 100          ; number of words to store
   mov di, array
   mov ax, -1
   cld                  ; clear direction flag, ie go forward not backward
   rep stosw
 here:
   jmp here
 ,,,

 * convert a string to lowercase without changing blank characters
 -----------
   BITS 16
   jmp start
   string.a db 'Hello Big World'
   stringlength equ $-string.a
   string.b times stringlength db 0
 start:
   mov ax, 07C0h
   mov ds, ax           ; lodsb reads from DS:SI
   mov es, ax           ; stosb writes to ES:DI
   mov cx, stringlength
   mov si, string.a
   mov di, string.b
   cld                  ; clear direction flag, ie go forward not backward
 .again:
   lodsb
   or al, 20h           ; set bit 5: 'A' -> 'a', a space (20h) is unchanged
   stosb
   loop .again
 here:
   jmp here
 ,,,

PRINTING STRINGS ....
 Normally strings are 'printed' or displayed by loading the address of
 the first byte (or word) of a string into the SI register and then
 using 'lodsb', the load string byte instruction, to get successive
 characters into the AL register (or 'lodsw' for the AX register).
 lodsb also advances the pointer in the SI register automatically
 (forwards if the direction flag is clear). It does not decrement the CX
 counter: that is done by a 'loop' instruction or a 'rep' prefix.

 * print the first two characters of a string
 ------------
   jmp start
   message db 'hello!'
 start:
   mov ax, 07C0h        ; Set data segment to where we're loaded
   mov ds, ax
   mov ah, 0eh          ; print character function
   mov al, [message]    ; first char of 'message'
   int 10h
   mov al, [message+1]
   int 10h
 here:
   jmp here
 ,,,

 * print the 1st three characters with lodsb
 ------------
   jmp start
   message db 'hello!'
 start:
   mov ax, 07C0h        ; Set data segment to where we're loaded
   mov ds, ax
   cld                  ; set dir flag to forwards
   lea si, [message]
   mov cx, 3            ; loop count 3
   mov ah, 0eh          ; print character function
 .again:
   lodsb                ; get next char from message
   int 10h
   loop .again
 here:
   jmp here
   times 510-($-$$) db 0  ; Pad remainder of boot sector with 0s
   dw 0xAA55              ; The standard PC boot signature
 ,,,

 * print a zero terminated string with address in the SI register
 -----------------
   BITS 16
   jmp start
   message db 'A function to print',13,10,0
 start:
   mov ax, 07C0h        ; Set up 4K stack space after this bootloader
   add ax, 288          ; (4096 + 512) / 16 bytes per paragraph
   mov ss, ax
   mov sp, 4096
   mov ax, 07C0h        ; Set data segment to where we're loaded
   mov ds, ax
   mov si, message      ; Put string position into SI
   call prints          ; Call our string-printing routine
 hang:
   jmp hang             ; Jump here - infinite loop!

 ;# prints
 ; output zero terminated string in SI to screen
 prints:
   mov ah, 0Eh          ; int 10h 'print char' function
 .again:
   lodsb                ; Get next character from string
   cmp al, 0            ; Char == 0 ?
   je .done             ; If char is zero, end of string
   int 10h              ; Otherwise, print it
   jmp .again
 .done:
   ret
   times 510-($-$$) db 0  ; Pad remainder of boot sector with 0s
   dw 0xAA55              ; The standard PC boot signature
 ,,,

ZERO TERMINATED STRINGS ....

 A zero terminated string simply has a byte with the value 0 (zero) in
 it at the end of the characters stored in memory. This is the system
 used by the C language.

COUNTED STRINGS ....

 One method of storing a string is to include the count of the number
 of characters in a string next to the string where it is stored in
 memory. This system is used in the old 'forth' language and in modern
 languages where strings are stored as objects.

 * store the length (in byte characters) after the string
 --------
   message db 'abcdefghijklmnop'
   count dw $-message
 ,,,

 * print a counted string with lodsb
 -----------------
   BITS 16
   jmp start
   message db 'Counted String'
   count dw 14
 start:
   mov ax, 07C0h        ; set the data segment
   mov ds, ax
   mov si, message      ; Put string position into SI
   mov cx, [count]      ; how many chars to print
   mov ah, 0Eh          ; int 10h 'print char' function
 .again:
   lodsb                ; Get character from string into AL
   int 10h              ;
   loop .again          ; loop while CX > 0
 hang:
   jmp hang             ; Jump here - infinite loop!
   times 510-($-$$) db 0  ; Pad remainder of boot sector with 0s
   dw 0xAA55              ; The standard PC boot signature
 ,,,

 The code below is very similar but only uses 1 byte for the count
 (thus limiting the string length to 255 characters) and also has the
 count preceding the string. This type of counted string is what is
 used in the Forth language.

 * print a preceding counted string with lodsb
 -----------------
   BITS 16
   jmp start
   message db 16,'Counted String!!'
 start:
   mov ax, 07C0h        ; set the data segment
   mov ds, ax
   cld                  ; move forward through message
   sub cx, cx           ; set CX = 0
   mov si, message      ; Put start of string position into SI
   lodsb                ; get [SI] into AL; increment SI
   mov cl, al           ; cl now contains the count
   mov ah, 0Eh          ; bios int 10h 'print char' function
 .again:
   lodsb                ; Get character from string into AL
   int 10h              ; invoke bios
   loop .again          ; loop while CX > 0
 here:
   jmp here             ; Jump here - infinite loop!
   times 510-($-$$) db 0  ; Pad remainder of boot sector with 0s
   dw 0xAA55              ; The standard PC boot signature
 ,,,

 * print a string with an automatic preceding count
 -----------------
   BITS 16
   jmp start
   count dw message.end - message
   message db 'abcdefghijklmnopqrstuvwxyz'
   message.end:
 start:
   mov ax, 07C0h        ; set the data segment
   mov ds, ax
   mov si, message      ; Put string position into SI
   mov cx, [count]      ; how many chars to print
   mov ah, 0Eh          ; int 10h 'print char' function
 .again:
   lodsb                ; Get character from string into AL
   int 10h              ;
   loop .again          ; loop while CX > 0
 hang:
   jmp hang             ; Jump here - infinite loop!
   times 510-($-$$) db 0  ; Pad remainder of boot sector with 0s
   dw 0xAA55              ; The standard PC boot signature
 ,,,

 The code below prints a counted string, where the count is calculated
 by the assembler and is located in the 2 bytes just before the string
 itself in memory. This system allows the use of only one label (instead
 of one for the message and one for the count).

 * a counted string with only one label
 -----------------
   BITS 16
   jmp start
   message dw message.end-$-2
           db 'abcdefghijklmnopqrstuvwxyz'
   message.end:
 start:
   mov ax, 07C0h        ; set the data segment
   mov ds, ax
   mov si, message+2    ; Put string position into SI (after count)
   mov cx, [message]    ; how many chars to print (message length)
   mov ah, 0Eh          ; int 10h 'print char' function
 .again:
   lodsb                ; Get character from string into AL
   int 10h              ; x86 bios interrupt, do it!
   loop .again          ; loop while CX > 0
 hang:
   jmp hang             ; Jump here - infinite loop!
   times 510-($-$$) db 0  ; Pad remainder of boot sector with 0s
   dw 0xAA55              ; The standard PC boot signature
 ,,,

CMPS COMPARE STRING ....

 This includes cmpsb, cmpsw, cmpsd. These instructions compare the byte
 (or word etc) at [ds:si] with the one at [es:di] and set the flags just
 as 'cmp' does (the zero flag is set if the two are equal). They also
 advance the 2 pointers by one byte or word etc. This means the cmps
 instructions can be used in a loop or with a rep prefix to compare an
 entire string. The instructions can be used with repe, repne etc.

 The 'std' (set direction flag) and 'cld' (clear direction flag)
 instructions determine which way the ds:si and es:di pointers advance
 after the compare instruction.

 * initialise ds and es registers and si and di
 ------
   jmp start
   first db 'x'
   second db 'y'
 start:
   mov ax, 07C0h        ; Set data segment to where we're loaded
   mov ds, ax
   mov es, ax           ; ! must initialise for cmpsb
   lea si, [first]      ; DS:SI -> first string
   lea di, [second]     ; ES:DI -> second string
 ,,,

 We can also initialise ds and es with lds and les, which load a segment
 register and an offset register together from a 4 byte far pointer
 stored in memory.

SCAS SCAN STRING ....

 The scan string instructions are used to locate a particular
 'character' (or value) within a string. They use the ES:DI register
 pair (not the DS:SI pair).
 This instruction has the variants scasb, scasw, scasd

 * scan a string for a particular character
 -----------
   BITS 16
   jmp start
   message db 'abcdefghijklmnop'
   count dw $-message
 start:
   mov ax, 07C0h        ; Set up 4K stack space after this bootloader
   add ax, 288          ; (4096 + 512) / 16 bytes per paragraph
   mov ss, ax
   mov sp, 4096
   mov ax, 07C0h
   mov ds, ax           ; data segment = code segment
   mov es, ax           ; extra segment = code segment
   mov di, message      ; ES:DI -> string to scan
   cld                  ; search forward (set direction flag = 0)
   mov al, 'p'          ; the character to scan for
   mov cx, [count]      ; search within string length
   repne scasb
   je .found
 .notfound:
   mov al, 'N'
   mov ah, 0eH          ; bios 'teletype' function
   int 10H              ; bios output interrupt
   jmp hang
 .found:
   dec di               ; If found, DI points 1 byte further, as with 'cmps'
   mov al, [di]         ; print the character found
   mov ah, 0eH          ; teletype AL bios function
   int 10H
   mov al, 'Y'
   int 10H
 hang:
   jmp hang             ; loop forever
   times 510-($-$$) db 0  ; padding
   dw 0AA55h              ; boot signature
 ,,,

 The code below should be modified to skip all leading white space.
 This could be done with 'repe scasb' with al=' '.

 * find the length of the 1st word of a sentence
 -----------
   BITS 16
   jmp start
   message db 'tree one 2 three'
   count dw $-message
 start:
   mov ax, 07C0h
   mov ds, ax           ; data segment = code segment
   mov es, ax           ; extra segment = code segment
   mov di, message      ; ES:DI -> string to scan
   cld                  ; search forward (set direction flag = 0)
   mov al, ' '          ; scan for next space
   mov cx, [count]      ; search within string length
   repne scasb
   je .found
 .notfound:
   mov al, 'N'
   mov ah, 0eH          ; bios 'teletype' function
   int 10H              ; bios output interrupt
   jmp hang
 .found:
   dec di               ; DI points 1 byte further, as with 'cmps'
   mov ax, [count]      ; how many characters scanned
   sub ax, cx           ; or do DI - message
   dec ax               ; do not count the space itself
   add al, '0'          ; convert count digit to ascii (lengths 0 to 9)
   mov ah, 0eH          ; teletype AL bios function
   int 10H
   mov al, 'Y'
   int 10H
 hang:
   jmp hang             ; loop forever
   times 510-($-$$) db 0  ; padding
   dw 0AA55h              ; boot signature
 ,,,

NULL TERMINATED STRINGS ....

 The 'null' or zero terminated string is a series of (usually) ascii
 characters (traditionally bytes) with the last byte being the value
 zero 0. This is the standard C programming language string
 representation and therefore is pretty common. The advantage is that
 string manipulation functions do not have to know how long the strings
 are before doing something with them.

 * define a string with a unix newline
 >> prompt db "ENTER OPERAND:", 10, 0

 * define a string with a dos newline
 >> prompt db "ENTER OPERAND:", 13, 10, 0

 * define a null terminated string
 >> message db 'This is my cool new OS!!!', 0

COMPARING STRINGS ....

 Below we use 'dw' for the count of the two words because the count is
 loaded into the CX loop register (and so has to be a word, not a byte).
 The code below could be better written with 'cmpsb' ie "compare string
 byte".

 Use 'std' (set direction flag) to scan through a string backwards.

 In the fragment below string.a, string.b and stringlength, as well as
 the jump targets .same and .above, are assumed to be defined elsewhere.

 * eg with cmpsb
 ------
   mov ax, 07C0h
   mov ds, ax             ; string.a is read from DS:SI
   mov es, ax             ; string.b is read from ES:DI
   mov si, string.a
   mov di, string.b
   mov cx, stringlength   ; number of bytes to compare
   cld                    ; clear direction flag, search forward
   repe cmpsb             ; stops at the first difference
   je .same
   ja .above              ; if string.a is greater than string.b
 ,,,

 The si and di registers must hold the offsets of the two strings. Use
 'lea si, [word.a]' or 'mov si, word.a' (both load the same offset).
 The es register must also be initialised, since 'cmpsb' compares ds:si
 with es:di.

 * compare with cmpsb
 ---------
   jmp start
   word.a db 'an elephantaa',0
   length dw $-word.a
   word.b db 'an elephanTaa',0
 start:
   mov ax, 07C0h        ; Set data segment to where we're loaded
   mov ds, ax
   mov es, ax           ; ! must initialise for cmpsb
   mov cx, [length]     ; this length includes the null termination
   lea si, [word.a]
   lea di, [word.b]
   cld                  ; search forwards (clear direction flag)
   repe cmpsb
   ; below is the more verbose version of repe cmpsb
   ;.again:
   ;  cmpsb
   ;  loope .again
   dec si               ; point to last different letter
   dec di
   mov al, [si]         ; get character into al register
   mov ah, 0eH          ; print al
   int 10H
 here:
   jmp here
   times 510-($-$$) db 0
   dw 0AA55h
 ,,,

 The code below is unnecessarily long. It should use cmpsb etc.

 * compare 2 counted strings for equality
 ----------
   BITS 16
   jmp start
   word.a dw 18
          db 'five hundred and 2'
   word.b dw 18
          db 'five hundred and 1'
 start:
   mov ax, 07C0h        ; set the data segment
   mov ds, ax
   mov si, word.a       ; Put address of 1st byte of word.a into SI
   mov di, word.b       ; same, but for word.b into DI
   mov ax, [si]         ; get word.a count into AX
   mov bx, [di]
   cmp ax, bx           ; see if 2 words have the same count
   jne different        ; print message and terminate
   mov cx, ax           ; put the loop count into cx
   add si, 2            ; skip the 2 byte count of word.a
   add di, 2            ; skip the 2 byte count of word.b
   cld                  ; lodsb must move forwards
 .again:
   lodsb                ; Get character from word.a into AL
   mov bl, [di]         ; get next char from word.b into BL
   inc di
   cmp al, bl           ; see if the character is the same
   jne different        ; print message and terminate
   loop .again          ; loop while CX > 0
 same:                  ; the words must be the same
   mov ah, 0Eh          ; int 10h 'print char' function
   mov al, 'S'
   int 10h              ; x86 bios interrupt, do it!
 hang:
   jmp hang             ; Jump here - infinite loop!
 different:
   mov ah, 0Eh          ; int 10h 'print char' function
   mov al, 'D'
   int 10h              ; x86 bios interrupt, do it!
   jmp hang
   times 510-($-$$) db 0  ; Pad remainder of boot sector with 0s
   dw 0xAA55              ; The standard PC boot signature
 ,,,

 Todo!

 * enter a string and check if it is in a dictionary
 -------------
   receive characters, count them, store them in a buffer
   then compare them to words in a dictionary
 ,,,

COPYING STRINGS ....

 What is the difference between 'lea', 'lds', 'mov' etc? 'lea si, [x]'
 and 'mov si, x' both put the offset of x into si without reading
 memory. 'lds si, [p]' reads a 4 byte far pointer stored at p and loads
 both ds and si from it.

 In the fragment below 'count', 'destination' and 'source' stand for
 your own values.

 * copy a number of bytes from one location to another
 -----------------
   mov cx, count          ; number of bytes to move
   lea di, [destination]  ; ES:DI = destination address
   lea si, [source]       ; DS:SI = source address
   cld                    ; clear direction flag, copy forwards
   rep movsb
 ,,,

 * a complete copy example
 ---------
   jmp start
   text db 'an elephant',0
   textlength EQU $-text
   buffer times 80 db 0
 start:
   mov ax, 07C0h        ; Set data segment to where we're loaded
   mov ds, ax
   mov es, ax           ; movsb copies from DS:SI to ES:DI
   mov cx, textlength   ; this length includes the null termination
   mov si, text
   mov di, buffer
   cld                  ; copy forwards (clear direction flag)
   rep movsb
 here:
   jmp here
   times 510-($-$$) db 0
   dw 0AA55h
 ,,,

CONVERTING TO AND FROM STRINGS ....

 * convert from a digit to an ascii character by adding '0' or 48
 ------
   mov al, 9
   add al, '0'
 ,,,

 * convert a digit to hexadecimal using xlatb
 -----------
   ; Tell the assembler that this is offset 0.
   ; It isn't offset 0, but it will be after the jump.
   [ORG 0]
   jmp 07C0h:start      ; Goto segment 07C0
   hextable db "0123456789ABCDEF"
 start:
   ; Update the segment registers
   mov ax, cs
   mov ds, ax
   mov es, ax
   mov al, 15
   mov bx, hextable     ; translation table (xlatb reads DS:[BX+AL])
   xlatb                ; replace al with hex digit
   mov ah, 0eH          ; print al
   int 10H
 hang:
   jmp hang
   times 510-($-$$) db 0
   dw 0AA55h
 ,,,

 The code below may be one of the most concise ways to print a number
 in assembly language.

 * print a 2 byte number in hexadecimal
 -----------
   ; Tell the assembler that this is offset 0.
   ; It isn't offset 0, but it will be after the jump.
   [ORG 0]
   jmp 07C0h:start      ; Goto segment 07C0
   hextable db "0123456789ABCDEF"
 start:
   mov ax, cs           ; cs is 07C0 after the far jump
   mov ds, ax           ; point data segment -> code segment
   mov ah, 0x0E         ; bios teletype function
   mov bx, hextable     ; translation table
   mov dx, 0xFEDC       ; the number to print
   mov cx, 4
 .again:
   rol dx, 4
   mov al, dl
   and al, 0x0F
   xlatb                ; replace al with hex digit
   int 10H
   loop .again
 hang:
   jmp hang
   times 510-($-$$) db 0
   dw 0AA55h
 ,,,

 The code below is just a variation on the code above where the stack
 is used to pass the number to the function or procedure. Hopefully this
 will allow us to reuse this procedure in other code. We have to juggle
 the stack to get the parameter off without hurting the return address.

 * print a 2 byte number but use the stack to pass number
 -----------
   [ORG 0]
   jmp 07C0h:start      ; Goto code segment 07C0
 start:
   mov ax, cs           ; cs is 07C0 after the far jump
   mov ds, ax           ; point data segment -> code segment
   ; does not seem necessary to initialize the stack
   ;add ax, 288         ; (4096 + 512) / 16 bytes per paragraph
   ;mov ss, ax          ; initialise stack pointers
   ;mov sp, 4096
   push 0xABCD          ; the number to print
   call hexprint        ; print the number
 here:
   jmp here             ; and the rest is silence

 hexprint.data:
   hextable db "0123456789ABCDEF"
 hexprint:
   pop bx               ; get off the return call address
   pop dx               ; retrieve the number to print
   push bx              ; save the return call address
   mov ah, 0x0E         ; x86 bios print char function
   mov bx, hextable     ; translation table
   mov cx, 4
 .again:
   rol dx, 4
   mov al, dl
   and al, 0x0F
   xlatb                ; replace al with hex digit
   int 10H
   loop .again
   ret
   times 510-($-$$) db 0  ; pad to 512 bytes total
   dw 0AA55h              ; standard x86 bootloader signature
 ,,,

 Conversion to decimal display: divide by the base (10) and convert the
 remainder to ascii using xlatb. Push it onto the stack. Divide the
 quotient again by the base, convert to ascii ... and so on until the
 quotient is 0. Then pop the stack and display each character.

 The code below seems to be working.
 To adapt it for 16 bit unsigned ints we need to use DX:AX as the
 dividend and BX as the base or divisor.

 * convert an unsigned 8 bit number to ascii in any base 2 to 16
 -----------
   [ORG 0]
   jmp 07C0h:start      ; Goto segment 07C0
   number db 255
   base db 16
 start:
   ; Update the segment registers
   mov ax, cs
   mov ds, ax
   mov es, ax
   ;mov ax, 07C0h       ; Set up 4K stack space after this bootloader
   add ax, 288          ; (4096 + 512) / 16 bytes per paragraph
   mov ss, ax
   mov sp, 4096
   mov al, [number]     ; number/base
   mov bl, [base]       ; bl is the divisor
   call printi8
 hang:
   jmp hang

 ;proc printi8
 ; expects the 8 bit number to display in AL and
 ; the base in BL register
   hextable db "0123456789ABCDEF"
 printi8:
   push cx
   push bx
   push ax
   sub cx, cx           ; set counter = 0
 .again:
   sub ah, ah           ; ah = 0, ax is the dividend
   div bl               ; does ax/bl. quotient --> al, remainder --> ah
   push ax              ; save remainder:quotient on the stack
   inc cx               ; increment the digit counter
   cmp al, 0            ; if the quotient != 0 do the next digit
   jne .again           ; loop while quotient > 0
 .print:
   pop ax               ; get digit from the stack
   mov al, ah           ; the remainder is the digit
   mov bx, hextable     ; translation table
   xlatb                ; replace al with hex digit from table
   mov ah, 0eH          ; print digit in al
   int 10H
   loop .print          ; using cx the digit counter to loop
   pop ax
   pop bx
   pop cx
   ret
   times 510-($-$$) db 0
   dw 0AA55h
 ,,,

LOGICAL AND BIT OPERATIONS

 and, or, not, xor, test

 Use the OR instruction to turn on 1 or more bits of a register.
 Use the AND instruction to turn off 1 or more bits of a register.

OR INSTRUCTION ....

 * turn on the high bit of the bl register
 -----------
   mov bl, color        ; 'color' is assumed to be defined with equ
   or bl, 10000000b
 ,,,

 * cut and paste bits with 'or'
 -------
   and AL, 55H          ; keep bits 0,2,4,6 of AL (clear the odd bits)
   and BL, 0AAH         ; keep bits 1,3,5,7 of BL (clear the even bits)
   or AL, BL            ; paste the registers together
 ,,,

XOR INSTRUCTION ....

 Toggles one or more bits. etc

 * initialize the AX register to zero
 >> xor AX, AX

 * toggle the lowest bit of the AX register
 >> xor AX, 1

 In the code the value 0A6h is the encryption 'key' (any key may be
 used, or chosen by the user).
The data can be unencrypted using the same function, because xoring twice
with the same key gives back the original byte.

* encrypt data with xor
---
 input db 'unencrypted'
 output times 11 db ' '  ; output buffer, same length as input
  ...
  cld                 ; make the string instructions count upwards
  lea si, [input]
  lea di, [output]    ; stosb writes to ES:DI, so ES must equal DS
  mov cx, 11          ; number of bytes to encrypt
 .next:
  lodsb               ; read a data byte (or character) into AL
  xor AL, 0A6H
  stosb               ; write data byte from AL to output buffer
  loop .next
,,,

AND INSTRUCTION ....

TEST INSTRUCTION ....

The test instruction can be used to test the value of one or more bits
of a register. It computes the same bitwise AND as the AND instruction
and sets the flags in an identical way, but the result is thrown away so
the register value is not changed.

If none of the tested bits are set, the result of the AND is zero and
TEST sets the Zero flag to 1. If at least one tested bit is set, the
Zero flag is cleared. So jz, je, loopz and loope are taken when all the
tested bits are clear, and jnz, jne, loopnz and loopne when at least one
of them is set. This can be a bit confusing!!

* check if AL == 0 (ZF is set if it is)
>> test al, al

* jump if AL is odd
-------
  test AL, 1
  jnz .odd      ; bit 0 is set, so the number is odd
,,,

TEST can only say whether any of the masked bits is set. To check that
several bits are all set, use AND followed by CMP on a copy.

* jump if the least and most significant bits of AL are both set
-------
  mov AH, AL          ; work on a copy so AL is preserved
  and AH, 10000001b   ; keep only bit 7 and bit 0
  cmp AH, 10000001b   ; are both of them set?
  je .exit
,,,

* test if the lsb (lowest bit) is set
-----------
  jmp start
 start:
  mov bl, 0b10101011  ; pattern to test
  mov dl, 0b00000000  ; test pattern (not used here)
  and bl, 1           ; see if bit 0 is set
  cmp bl, 1
  jne here
  mov ah, 0eH         ; bios teletype function
  mov al, 'y'
  int 10H             ; do it
 here:
  jmp here
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

We can use the TEST instruction to see if a number is divisible by
a power of 2 (eg 2,4,8,16...). Use 2^n - 1 as the operand to
TEST (1,3,7,15...). The number is divisible when the Zero flag is set.

* print a "!" if CX is divisible by 4
---------------------------------
 start:
  mov cx, 9
 .again:
  mov al, cl
  add al, '0'   ; convert digit to ascii
  mov ah, 0eH   ; bios teletype function
  int 10H       ; invoke bios
  test cl, 3
  jnz .here     ; low 2 bits not zero: not divisible by 4
  mov al, '!'
  mov ah, 0eH   ; bios teletype function
  int 10H
 .here:
  loop .again
  jmp $         ; keep looping!
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

* print a "!"
if CX is divisible by 8 --------------------------------- start: mov cx, 31 .again: mov al, 'o' ; the character to print mov ah, 0eH ; bios teletype function int 10H ; invoke bios test cx, 0x0007 jne .here mov al, '!' mov ah, 0eH ; bios teletype function int 10H .here loop .again jmp $ ; an infinite loop times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, XOR INSTRUCTION .... NOT INSTRUCTION .... * reverse every bit of a register >> not ESI SHIFT INSTRUCTIONS .... Shift operations are useful for multiplying and (integer) dividing by powers of 2. eg 2, 4, 8, 16 etc This should be faster than using the MUL and DIV instructions The code below is more easily done with the ROR or ROL instructions. * encrypt data by swapping nibbles with shl and shr ----------- ; al contains the byte to be encrypted mov AH, AL shl AL, 4 shr AH, 4 or AL, AH ; al has encrypted byte ,,, ROTATE INSTRUCTIONS .... == rotate sumary rol - rotate left ror - rotate right rcl - rotate left through carry rcr - rotate right through carry. ,,, * encrypt a byte by swapping nibbles (4 bits in a byte) ---- mov CL, 4 ror AL, CL ; or rol AL, Cl (no difference) ,,, DISPLAYING BIT PATTERNS .... strangely the following 3 lines are not equivalent ??? 
------- ;test bl, dl ; see if bit is set and bl, dl ; see if bit is set cmp bl, dl ,,, * display a bit pattern ----------- block equ 0xFE ; ascii code for small block alpha equ 224 ; Greek letter alpha jmp start start: mov ax, 07C0h ; Initialize data segment DS register mov ds, ax ; load DS with correct value mov cx, 8 ; number of bits to display mov dl, 0b10000000 ; test pattern .again mov bl, 0b10101010 ; pattern to display mov ah, 0eH ; bios teletype ;test bl, dl ; see if bit is set and bl, dl ; see if bit is set cmp bl, dl jz .fill mov al, ' ' ; print space int 10H ; do it jmp .ll .fill mov al, block ; char to print int 10H ; do it .ll ror dl, 1 ; move the bit pattern loop .again ; go again here: jmp here times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, PROCEEDURES Proceedures are similar to jumps in that the IP instruction pointer is modified implicitly by the instruction. The 'call' and 'ret' instructions implement proceedures in x86 assembler. * call proceedure located by a pointer >> call [BX] The following is a simple command interpreter. The next step is to create a linked list dictionary with a function which searches through and executes. Another step is to have some kind of self-referentialism, that is, so that the use can look up what functions are available. This self referentialism can be provided with a 'command name' which is just a counted string before the code to be executed. 
The code below only checks that the key pressed is in the range 'a' to 'c';
anything else just prints the prompt again.

* a simple indirect procedure call with jump-table
---------
 BITS 16
 [ORG 0]
 alpha equ 224         ; Greek letter alpha
 beta equ 225          ; Greek letter beta
 gamma equ 226         ; Greek letter gamma
  jmp 07C0h:start      ; Goto segment 07C0
 jumptable dw aa,bb,cc
 aa:
  mov al, alpha        ; letter to print
  mov ah, 0eH          ; bios teletype
  int 10H              ; do it
  ret
 bb:
  mov al, beta
  mov ah, 0eH          ; bios teletype
  int 10H              ; do it
  ret                  ; return from 'call'
 cc:
  mov al, gamma
  mov ah, 0eH
  int 10H
  ret
 start:
  mov ax, cs
  mov ds, ax
 .again:
  mov al, '?'          ; print a prompt
  mov ah, 0eH
  int 10H
  mov ah, 0            ; wait for keypress function
  int 16h
  cmp al, 'a'          ; check for valid command (a-c)
  jb .again            ; just print prompt if invalid
  cmp al, 'c'
  ja .again
  sub al, 'a'          ; convert letter to an index into jump table
  sub bx, bx           ; set bx:=0
  mov bl, al           ; ax cant be used in effective addresses
  shl bl, 1            ; do bl:=bl*2
  call [jumptable+bx]  ; jump-table is word (2 byte) cells
  jmp .again           ; back to the prompt once the procedure returns
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

JUMPS

direct, indirect, short long etc, conditional, unconditional

CONDITIONAL JUMPS ....

jne/jnz, ja/jnbe, je/jz, jae/jnb, jb/jnae, jbe/jna

je jz - jump if ZF=1

* example with lots of jumps
-------
  mov AX, 10
  mov BX, 9
  cmp AX, BX
  je .equal           ; if ax=bx jump
  jz .equal           ; same
  jne .unequal        ; if ax!=bx jump
  jnz .unequal        ; same
  ja .above           ; if ax>bx jump (unsigned comparison)
  jae .greaterequal   ; if ax>=bx jump (unsigned comparison)
  jnb ...             ; same
,,,

== summary of jump instructions
.. jecxz - jump if ecx is 0
.. jc - jump if carry
.. jnc - jump if no carry
.. jo - jump if overflow
.. jno - jump if no overflow
.. js - jump if negative sign
.. jns - jump if not negative sign
.. jp - jump if parity
.. jpe - jump if even parity
.. jnp - jump if no parity
.. jpo - jump if odd parity
,,,

INDIRECT JUMPS ....

Indirect jumps may be used to simulate 'switch' or 'case' language
syntax from higher level languages.
See also indirect call statements. The techniques of indirect jumps and
calls allow a very simple command interpreter to be written (in the style
of a forth system).

Code below is hanging why??? some code is now working eg jmp di
Many hours of frustration later, it seems the problem was two-fold.
Firstly register and register indirect jumps use the CS code segment
implicitly (the offset is calculated from the start of the CS segment).
The technique used in the MikeOs primer doesnt seem to set the code segment
properly, whereas a far jump such as 'jmp 07C0h:start' does set it...
Also there are 2 forms ...

* jump to a memory location contained in register di
>> jmp di

* and jump to location specified by register pointer
>> jmp [di]

The second form (jmp [di]) can be used with jump-tables for example

* perhaps the simplest register jump
---------
 BITS 16
 [ORG 0]
  jmp 07C0h:start     ; far jump: sets the code segment CS to 07C0
 nip:
  mov al, '!'         ; print something
  mov ah, 0eH
  int 10H
  jmp $
 start:
  mov ax, cs
  mov ds, ax          ; may not be necessary?
 .again:
  mov bx, nip
  jmp bx
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

The code below is a good template for possibly the simplest
command interpreter that can be written in an assembly program.
The program prompts the user for a letter (command) and then executes
some code based on the letter entered. The code is selected with a
jump-table and an indirect jump. It may be more sensible to
use procedure 'calls' in this case rather than jumps.

* a simple register-indirect jump with jump-table
---------
 BITS 16
 [ORG 0]
  jmp 07C0h:start     ; Goto segment 07C0
 jumptable dw aa,bb,cc
 aa:
  mov al, 'A'         ; print A
  mov ah, 0eH
  int 10H
  jmp start.again
 bb:
  mov al, 'B'         ; print B
  mov ah, 0eH
  int 10H
  jmp start.again
 cc:
  mov al, 'C'         ; print C
  mov ah, 0eH
  int 10H
  jmp start.again
 start:
  mov ax, cs
  mov ds, ax
  mov es, ax
 .again:
  mov al, '?'
; print a prompt mov ah, 0eH int 10H mov ah, 0 ; wait for keypress function int 16h cmp al, 'a' jb .again cmp al, 'c' ja .again sub al, 'a' ; convert letter to a digit sub bx, bx ; set bx:=0 mov bl, al ; ax cant be used in effective addresses shl bl, 1 ; do bl:=bl*2 jmp [jumptable+bx] ; jump-table is word (2 byte) cells jmp .again ; loop forever times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, Add a code name to each function, this provides some self referentialism. * a simple register-indirect jump with jump-table --------- BITS 16 [ORG 0] jmp 07C0h:start ; Goto segment 07C0 jumptable dw aa,bb,cc aa: mov al, 'A' ; print A mov ah, 0eH int 10H jmp start.again bb: mov al, 'B' ; print B mov ah, 0eH int 10H jmp start.again cc: mov al, 'C' ; print C mov ah, 0eH int 10H jmp start.again start: mov ax, cs mov ds, ax mov es, ax .again: mov al, '?' ; print a prompt mov ah, 0eH int 10H mov ah, 0 ; wait for keypress function int 16h cmp al, 'a' jb .again cmp al, 'c' ja .again sub al, 'a' ; convert letter to a digit sub bx, bx ; set bx:=0 mov bl, al ; ax cant be used in effective addresses shl bl, 1 ; do bl:=bl*2 jmp [jumptable+bx] ; jump-table is word (2 byte) cells jmp .again ; loop forever times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, In the examples below we jump to the third item in a word cell jump table * jumptable examples, --------- mov di, [jumptable+4] call di ; is equivalent to mov di, jumptable+4 call [di] ; which is equivalent to call [jumptable+4] ,,, code below is not working because of code segment issues but we can do jmp [table+esi*4] which is good! 
* indirect jump example
-----
 [org 0]
  jmp 07C0h:start     ; a far jump is the way to load CS ('mov cs, ax' is not valid)
 jumptable dd apple
           dd orange
           dd pear
           dd lemon   ; the 16 bit jump below uses only the low word of each entry
 start:
  mov ax, cs          ; cs is 07C0 after the far jump
  mov ds, ax          ; Set data segment to where we're loaded
  ; get a digit (0-3) into AX
  sub eax, eax        ; set eax := 0
  mov ah, 0           ; wait for keypress function
  int 16h
  mov ah, 0eH         ; echo the keypress
  int 10H
  cmp al, '0'         ; check if digit is 0,1,2 or 3
  jb start            ;
  cmp al, '3'         ;
  ja start            ;
  sub ah, ah          ; set ah := 0
  sub al, '0'         ; convert from ascii to a digit 0-3
  mov esi, eax
  jmp [jumptable+ESI*4]  ; indirect jump
 apple:
  mov al, 'A'         ; print A
  mov ah, 0eH
  int 10H
  jmp start
 orange:
  mov al, 'B'         ; print B
  mov ah, 0eH
  int 10H
  jmp start
 pear:
  jmp start
 lemon:
  jmp start
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

LOOPS

* print digits 0-9 in ascending order
---------------------------------
 start:
  mov cx, 10
 .again:
  mov al, 10
  sub al, cl    ; al := 10 - cl counts up while cx counts down
  add al, '0'   ; convert digit to ascii
  mov ah, 0eH   ; bios teletype function
  int 10H       ; invoke bios
  loop .again
  jmp $         ; keep looping!
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

* attempt to read from a disk 3 times into a data buffer
-----------------
  mov cx, 3             ; countdown of read attempts
 read_loop:
  push cx               ; int 13h uses CX for cylinder/sector, so save the counter
  mov dl, DriveNumber   ; the reset function also needs the drive in DL
  xor ah, ah            ; set ah to zero - reset drive function
  int 0x13              ; call drive reset
  mov ax, ds
  mov es, ax            ; es == ds
  mov bx, BlahBlah      ; set BX to the address (not the value) of BlahBlah
  mov dl, DriveNumber
  mov dh, HeadNumber
  mov al, NumSectors
  mov ch, CylNumLow     ; low 8 bits of the cylinder number
  mov cl, CylNumHigh    ; high part of the cylinder number, already in bits 6 and 7
  or cl, Sector         ; merge in the sector number, bits 0-5
  mov ah, 0x2           ; set function 2h
  int 0x13              ; call the interrupt
  pop cx                ; restore the counter (pop leaves the flags alone)
  jnc exit              ; if the carry flag is clear, it worked
  loop read_loop        ; try three times, then give up - error code is left in ah
 exit:
  ;;; whatever other code you need

 [segment data]
 BlahBlah resb 512
,,,

The CX register with the loop command, counts down.
So we need some extra logic to make it count up

* print extended ascii characters in ascending order
---------------------------------------------------------
 start:
  mov cx, 0x00FF
 .again:
  mov al, 0xFF
  sub al, cl    ; the character to print goes in AL
  mov ah, 0eH   ; bios teletype function
  int 10H       ; invoke bios function
  loop .again
  jmp $         ; keep looping!
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

* print all ascii characters in descending order
---------------------------------------------------------
 start:
  mov cx, 0x00FF
 .again:
  mov al, cl
  mov ah, 0eH
  int 10H
  loop .again
  jmp $         ; keep looping!
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

LOOP INSTRUCTIONS ....

loop, loopz, loope, loopnz, loopne

GOTCHAS FOR LOOPS ....

LOOP decrements CX first and then tests it. So if CX is already 0 when
the loop starts, it wraps around to 0xFFFF and the loop body runs
65536 times instead of not at all !! Use 'jcxz' before the loop to skip
it when CX is 0.

REP INSTRUCTIONS ....

REP is a prefix rather than an instruction on its own. It is similar to
the LOOP instructions except that only one instruction is repeated
(multiple in the case of loop), and that instruction has to be one of the
string instructions (movs, stos, lods, cmps, scas, ins, outs). REP does
not repeat ordinary instructions such as INC.

The prefixed instruction is repeated while CX is not 0 (CX is decremented
on each repetition of the instruction). The cousins repe/repz and
repne/repnz also stop early depending on the Zero flag, and are used
with cmps and scas.

* rep instructions
>> rep, repe, repz, repne, repnz

* fill 4 bytes at ES:DI with the byte in AL
-------
  mov CX, 4
  rep stosb
,,,

* move 8 bytes of data from DS:SI to ES:DI
-------
  mov CX, 8
  rep movsb
,,,

* move 8 double words (32bits) of data from DS:SI to ES:DI (386 or later)
-------
  mov CX, 8
  rep movsd
,,,

STACK

The stack is a useful thing but you may have to allocate space for it.
It grows 'down', towards low memory, and toward SS. The stack contains
either word (16 bit) or double word (32 bit) data items (but never 8 bit).

Apparently an x86 bios automatically initialises a 512 byte (one sector)
stack immediately after the boot code sector, although the values of SS
and SP at boot are not something to rely on.
If you need more than this you have to initialise SS (stack segment register) and SP (the stack pointer register) to something sensible and useful. INITIALIZING THE STACK .... * set up a stack after a bootloader ------------------- mov ax, 07C0h ; Set up 4K stack space after this bootloader add ax, 288 ; (4096 + 512) / 16 bytes per paragraph mov ss, ax mov sp, 4096 ,,, POP INSTRUCTIONS .... pops a 16 or 32 bit data item from the stack (depending on the 'address size attribute'. * pop the 2 bytes at top of stack and place in CX register >> pop cx (flags: none) * pop a saved flags register into the flags register >> popf * pop all registers >> popa >> popad TWO STACKS .... The x86 architecture includes one built in stack (accessed with the 'push' and 'pop' instructions). But it would be nice to have another stack. For example to pass parameters to functions without having to worry about stack frames etc. This is another forth idea. * create a second stack ---------- ; The push instructions sub edx, 4 ; Decrement the stack pointer one position (4 bytes) mov dword [edx], eax ; Store the value at the new location ; The pop instructions ;Popping 3 steps: Getting value, incrementing the stack, ; and returning the value. We will return the value simply by leaving it in eax. mov eax, dword [edx] ; Load the value off of the stack add edx, 4 ; Increment the stack pointer one position (4 bytes) ; Leave the result in eax to return it ,,, DATA STRUCTURES CODE WITH HEADER .... Code blocks (proceedures) can be given a header to describe the following code. This header can be used to provide a description of the code. 
Compared to a forth style linked list dictionary, we see the extra maintenance involved in maintaining a 'jump-table' of pointers to the beginning of each function * code with some header text --------- BITS 16 [ORG 0] cr equ 13 ; carriage return lf equ 10 ; line feed bell equ 7 ; bell (sort of) jmp 07C0h:start ; Goto segment 07C0 jumptable dw asc,beep,reboot,colours,help asc: db 11, 'ascii chars' ; function name in text mov cx, 0x00FF asc.again: mov al, cl ; print ascii char in CL register mov ah, 0eH int 10H and al, 0x0F ; using 'AND' with 'CMP' to cmp al, 0x0F ; create a simple modulus test jne asc.ll mov al, cr ; print chars 16 to a line int 10H mov al, lf int 10H asc.ll: loop asc.again nop ret beep: db 4, 'beep' mov al, bell ; beep mov ah, 0eH int 10H ret reboot: db 6, 'reboot' int 19h ; a dodgy way to reboot the computer colours: db 7, 'colours' mov al, 'C' ; print Colours ... mov ah, 0eH int 10H ret help: ; the code below has a problem with the CX index used ; is only printing 4 valid counts db 4, 'help' mov al, 'H' ; print help ... mov ah, 0eH int 10H mov cx, 5 ; loop through 5 functions help.again: mov di, jumptable add di, cx ; do si:=si+cx*2 (jumptable is word cell) add di, cx mov si, [di] ; di now points to start of function mov al, byte [si] ; get the text count mov ah, 0eH add al, '0' ; convert to ascii digit int 10H loop help.again ret start: mov ax, cs ; the code segment is magically correct mov ds, ax ; establish data segment mov es, ax ; do we need ES extended segment? .again: mov ah, 0eH mov al, cr ; print a prompt on a newline int 10H mov al, lf int 10H mov al, '?' 
int 10H mov ah, 0 ; wait for keypress function int 16h cmp al, 'a' ; check for valid command (a-c) jb .again ; just print prompt again if invalid cmp al, 'e' ja .again sub al, 'a' ; convert letter to a index into jump table mov bx, jumptable add bl, al ; set bx:=bx+al*2 add bl, al ; set the pointer to point to code mov si, [bx] ; si -> start of proceedure mov bl, byte [si] ; jump over name of proceedure ; or we could print the function name here inc bl ; the first code byte sub bh, bh ; set bh := 0 add si, bx call si jmp .again ; loop back to prompt times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, LINKED LISTS .... Linked lists can be implemented easily using the syntax of the assembler itself (nasm style syntax) * a linked list in assembler ----------- liststart dw '0' w1 dw liststart db 3, 'egg' w2 dw w1 db 5, 'water' w3 dw w2 db 4, 'tree' ,,, * another layout ----------------- nip dw 0 ; 1st word has a zero link db 3, 'nip' ; strings are 'counted' egg dw nip ; link to previous dictionary entry db 3, 'egg' ; bat dw egg ; link to previous dictionary entry db 3, 'bat' ; last dw bat ; ,,, Nasm can also handle forward references to labels, so the dictionary could be written the with the reverse order. NUMBERS DISPLAYING NUMBERS .... Even the task of displaying a number in assembler is a non-trivial task. * print a 1 byte number in hexadecimal ----------- ; Tell the compiler that this is offset 0. ; It isn't offset 0, but it will be after the jump. 
 [ORG 0]
  jmp 07C0h:start     ; Goto segment 07C0
 hextable db "0123456789ABCDEF"
 start:
  mov ax, cs          ; cs is 07C0 after the far jump
  mov ds, ax          ; point data segment -> code segment
  mov ah, 0x0E
  mov bx, hextable    ; translation table
  mov dl, 0xAB        ; the 1 byte number to print
  mov cx, 2           ; a byte has 2 hex digits
 .again:
  rol dl, 4           ; bring the next digit into the low nibble
  mov al, dl
  and al, 0x0F
  xlatb               ; replace al with hex digit
  int 10H
  loop .again
 hang:
  jmp hang
  times 510-($-$$) db 0
  dw 0AA55h
,,,

* print a 2 byte number in hexadecimal
-----------
 ; Tell the compiler that this is offset 0.
 ; It isn't offset 0, but it will be after the jump.
 [ORG 0]
  jmp 07C0h:start     ; Goto segment 07C0
 hextable db "0123456789ABCDEF"
 start:
  mov ax, cs          ; cs is 07C0 after the far jump
  mov ds, ax          ; point data segment -> code segment
  mov ah, 0x0E
  mov bx, hextable    ; translation table
  mov dx, 0xABCD      ; the number to print
  mov cx, 4           ; a word has 4 hex digits
 .again:
  rol dx, 4
  mov al, dl
  and al, 0x0F
  xlatb               ; replace al with hex digit
  int 10H
  loop .again
 hang:
  jmp hang
  times 510-($-$$) db 0
  dw 0AA55h
,,,

ARITHMETIC

Simple mathematical operations require special care in assembly language
because of the need to check for 'overflow' or 'carry' conditions
(where the result is too large to fit into the target register).

DIGITS ....

* check if a number entered is an ascii digit
-------
 start:
 .again:
  sub ax, ax
  mov ah, 0       ; wait for a key press
  int 16h         ; bios interrupt service
  cmp al, '0'     ;
  jb .notdigit    ; if ascii value is less than '0' not digit
  cmp al, '9'
  ja .notdigit    ; if ascii value greater than '9' not digit
  mov ah, 0eh     ; print the digit if it is one
  int 10h         ; bios print routine
  jmp .again
 .notdigit:
  mov ah, 0eh     ; print 'N' if its not a digit
  mov al, 'N'
  int 10h
  jmp .again
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

MULTIPLICATION ....

The MUL instruction is used to multiply.
Approximate timings (these vary a lot between processors)
 8 or 16 bit: 11 clock cycles
 32 bit: 10 clock cycles

* multiply AL x BL and store result in AX
>> mul BL     (flags: CF, OF cleared if AH zero, otherwise set)

* multiply AX x DX and store result in DX:AX
>> mul DX     (flags: CF, OF cleared if DX zero, otherwise set)

* multiply EAX x ECX and store result in EDX:EAX
>> mul ECX    (flags: CF, OF cleared if EDX zero, otherwise set)

DIVISION ....

The x86 instruction set has a special 'div' instruction for performing
division. Another method is to perform repeated subtraction.

 AX/[8 bit register]       -> quotient in AL, remainder in AH
 DX AX/[16 bit register]   -> quotient in AX, remainder in DX
 EDX EAX/[32 bit register] -> quotient in EAX, remainder in EDX

* divide 23/10 and print quotient and remainder
---------------------------------------------------------
 start:
  mov ax, 07C0h   ; Set up 4K stack space after this bootloader
  add ax, 288     ; (4096 + 512) / 16 bytes per paragraph
  mov ss, ax
  mov sp, 4096
  mov ax, 07C0h   ; Set data segment to where we're loaded
  mov ds, ax
  mov ax, 23
  mov bl, 10
  div bl          ; al := 2 (quotient), ah := 3 (remainder)
  add al, '0'     ; convert digit in al to ascii
  call printc     ; print the quotient in AL
  mov al, 'r'
  call printc     ; print a separator character
  mov al, ah
  add al, '0'     ; convert digit to ascii
  call printc     ; print the remainder (from AH)
  jmp $           ; keep looping!

 ; routine to output character in AL to screen
 ; (AX is saved and restored, so AH survives the call)
 printc:
  push ax
  mov ah, 0Eh     ; int 10h 'print char' function
  cmp al, 32      ; could modify to check for ascii range
  int 10h         ; call bios function
  pop ax
  ret
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

Note that the dividend (AX) must be 16 bits for an 8 bit divisor, so
check that the correct data type is loaded. The code below only works
if the quotient and remainder are single digits.
* divide a number by another and print quotient and remainder
---------------------------------------------------------
 start:
  mov ax, 07C0h   ; Set up 4K stack space after this bootloader
  add ax, 288     ; (4096 + 512) / 16 bytes per paragraph
  mov ss, ax
  mov sp, 4096
  mov ax, 07C0h   ; Set data segment to where we're loaded
  mov ds, ax
  mov AX, [dividend]
  mov BL, [divisor]
  div BL
  push AX         ; save AX so we can get the remainder later
  add AL, '0'     ; convert to ascii, but only one digit!
  mov AH, 0Eh     ; print quotient
  int 10h
  mov AL, 'r'     ; print separator character
  mov ah, 0eh
  int 10h
  pop AX
  mov AL, AH
  add AL, '0'     ; convert to ascii
  mov ah, 0eh     ; print remainder
  int 10h
  jmp $           ; keep looping!
 dividend dw 79   ; 16 bits, loaded into AX
 divisor db 11    ; 8 bits, loaded into BL
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

DIVISION INSTRUCTIONS ....

== summary of division instructions
.. div - unsigned division
.. idiv - signed division
.. shr - integer division by powers of 2
,,,

MODULUS ....

The modulus operation can be performed by using the 'div' instruction,
and then taking the value in the AH register which is the 'remainder'
from a division operation (for an 8 bit divisor).

If the modulus is of a number which is a power of 2 (2,4,8,16 ...) we can
obtain the modulus by ANDing the number with 2^n - 1, which keeps only
the low n bits. This should be faster than using the DIV instruction

The modulus can also be tested with AND and CMP instructions

* modulus performed with 'and' and 'cmp'
-----------------
 ; 'cr' and 'lf' are assumed to be defined elsewhere (eg cr equ 13, lf equ 10)
 asc:
  db 11, 'ascii chars' ; function name in text
  mov cx, 0x00FF
 asc.again:
  mov al, cl      ; print ascii char in CL register
  mov ah, 0eH
  int 10H
  and al, 0x0F    ; using 'AND' with 'CMP' to
  cmp al, 0x0F    ; create a simple modulus test
  jne asc.ll
  mov al, cr      ; print chars 16 to a line
  int 10H
  mov al, lf
  int 10H
 asc.ll:
  loop asc.again
  ret
,,,

CONFIGURING VIDEO DISPLAY

The video appears to have several 'display modes' which need to be set or
configured.
int 10h Get current video mode
 AH=0Fh
 returns: AL = Video Mode, AH = number of character columns, BH = active page

* display the video mode number (works for modes 0 to 9 only)
-----------------------------------------------------
  jmp start
 start:
  mov ah, 0Fh
  int 10h
  add al, '0'   ; convert the mode number to an ascii digit
  mov ah, 0Eh
  int 10h
  jmp $         ; loop forever
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

The code below prints '80' which is probably the standard screen character
width in text mode.

* display the number of character columns
-----------------------------------------------------
  jmp start
 start:
  mov ax, 07C0h ; Set data segment to where we're loaded
  mov ds, ax
  mov ah, 0Fh   ; video mode info function
  int 10h       ; load ah with character width
  mov al, ah    ; printi8 prints number in al register
  mov bl, 10    ; printi8 uses bl register for base
  call printi8
  jmp $         ; loop forever
 %include 'printi8.asm'
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

VIDEO MODES ....

http://brokenthorn.com/Resources/OSDevVid2.html
 good info

 vga interface = low resolution, usually <= 16 colours (except 13h)
 vesa vbe interface = higher resolutions

Note that mode 13h is the only one in standard vga which can display
256 colours.

* set a video mode
>> INT 10h, AH=0, AL=

== Standard vga colour modes
 Mode (AL)  Resolution    Colour depth
 0h         40x25 Text    16 Color
 1h         40x25 Text    16 Color
 2h         80x25 Text    16 Color
 3h         80x25 Text    16 Color  (default text mode on boot-up)
 4h         320x200       4 Color
 5h         320x200       4 Gray
 7h         80x25 Text    2 Color
 Dh         320x200       16 Color
 Eh         640x200       16 Color
 Fh         640x350       2 Color
 10h        640x350       16 Color
 11h        640x480       2 Color
 12h        640x480       16 Color
 13h        320x200       256 Color (a common simple graphics mode)
 6Ah        800x600       16 color  (higher resolution)
,,,

Basically in text mode it seems impossible to draw pixels and vice versa.
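The table of modes can be tried out with a small graphics example. The
following is only a minimal sketch: it assumes the standard vga layout for
mode 13h (one byte per pixel, 320 bytes per row, frame buffer at segment
0A000h). The coordinates and the colour are arbitrary values chosen for
illustration.

* a sketch: plot one pixel in mode 13h by writing to video memory
-----------
 start:
  mov ax, 0013h         ; ah=0 set video mode, al=13h 320x200 256 colours
  int 10h
  mov ax, 0A000h        ; assumed segment of the mode 13h frame buffer
  mov es, ax
  mov di, 100*320+160   ; offset = y*320 + x, here x=160 and y=100
  mov byte [es:di], 4   ; colour index 4 (red in the default palette)
  jmp $                 ; loop forever
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

Writing directly to video memory avoids one bios call per pixel, which
is why this mode is popular for simple graphics.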
* set the mode to 0 for bigger text
--------------------
 start:
  mov ax, 07C0h
  mov ds, ax
 .setmode:
  mov ah, 0     ; set graphics display mode function.
  mov al, 0h    ; mode 0h = text 40x25
  int 10h       ; set it!
 .text:
  mov ah, 0eh
  mov al, 'Q'
  int 10h
 hang:
  jmp hang
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

Set the high bit of AL to not clear screen when changing video mode.

* set video mode 0 (40x25 text) without clearing screen
-------
 start:
  mov ah, 0eh   ; teletype 'P'
  mov al, 'P'
  int 10h
  mov ah, 0
  int 16h       ; wait for a key press
 .setmode:
  mov ah, 0
  mov al, 10000000b ; 0 + high bit set to not clear screen
  int 10h
 .print:
  mov ah, 0eh   ; teletype 'Q'
  mov al, 'Q'   ; the 'Q' gets printed in cursor position 0,0
  int 10h       ; and overwrites whatever was there
 hang:
  jmp hang
  times 510-($-$$) db 0
  dw 0xAA55
,,,

Garbage is displayed on the screen with the code below

* switch to a graphics mode without clearing screen, not useful
-------
 start:
  mov ah, 0eh   ; teletype 'P'
  mov al, 'P'
  int 10h
  mov ah, 0
  int 16h       ; wait for a key press
 .setmode:
  mov ah, 0
  mov al, 13h   ; graphics mode 320x200
  or al, 10000000b ; set the high bit to 1
  int 10h
 .print:
 hang:
  jmp hang
  times 510-($-$$) db 0
  dw 0xAA55
,,,

WRITING OUTPUT TO THE SCREEN

Write a character at the current cursor position
 int 10h, ah=0ah
 al=character, bh=page number, cx=number of times to print the character

== int 10h character display functions (value in register ah)
.. 0eh - teletype, the cursor is advanced after printing
.. 09h - write character and colour attribute at the cursor position
.. 0ah - write character at the cursor position (no colour, cursor not moved)
..

* print a character using the 'teletype' function (0eh)
---------------------------------------------------------
 start:
  mov al, '*'
  mov ah, 0eH
  int 10H
  jmp $         ; keep looping!
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

The carriage return (13) goes to start of current line. linefeed (10) goes
to a new line.
So 13, 10 works as expected

* print a character with 'teletype' and a newline
---------------------------------------------------------
 start:
  mov al, '*'
  mov ah, 0eH
  int 10H
  mov al, 13
  mov ah, 0eH
  int 10H
  mov al, 10
  mov ah, 0eH
  int 10H
  jmp $         ; keep looping!
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

* print 10 stars down the screen
---------------------------------------------------------
 start:
  mov cx, 10    ; number of stars
 .again:
  mov al, '*'
  mov ah, 0eH
  int 10H
  mov al, 13
  mov ah, 0eH
  int 10H
  mov al, 10
  mov ah, 0eH
  int 10H
  loop .again
  jmp $         ; keep looping!
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

* print "!!!" at the current cursor position
-----------------------------------------------------
 start:
  mov ah, 0aH   ; write character at cursor function
  mov al, '!'   ; the character
  mov bh, 0     ; display page 0
  mov cx, 3     ; number of times to print it
  int 10h       ;
  jmp $         ; Jump here - infinite loop!
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55             ; The standard PC boot signature
,,,

* notes for how to clear the screen
--------------------------
 start:
  mov ah,06h    ; ah=function number for int10 (06)
  mov al,00h    ; al=number of lines to scroll (00=clear screen)
  mov bx,700h   ; bh=color attribute for new lines
  xor cx,cx     ; ch=upper left hand line number of window (dec)
                ; cl=upper left hand column number of window (dec)
  mov dx,184fh  ; dh=low right hand line number of window (dec)
                ; dl=low right hand column number of window (dec)
  int 10h
  jmp $         ; Jump here - infinite loop!
times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, CURSOR Get Cursor position Bios function int 10H AH=03h,DL=Cursor-column,DH=Cursor-row Set Cursor position Bios function int 10H AH=02h,DL=Cursor-column,DH=Cursor-row * increment the cursor column position ------------------------------- start: ; mov bh, 00h ; assume page 0 mov ah, 03h ; bios function: get cursor position into dx int 10h ; invoke bios mov ah, 02h ; bios function: set cursor position specified in dx inc dl ; increment cursor column by 1 int 10h ; invoke bios jmp $ ; Jump here - infinite loop! times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, * increment the cursor row and line position 10 times ------------------------------- start: mov cx, 10 ; the loop counter .again: push cx mov al, '*' mov ah, 0eH int 10H ; why mov bh??? ;mov bh, 00h ; get cursor position into dx (int 10h, ah=03h) mov ah, 03h ; bios get cursor row:col into DH:DL int 10h ; invoke bios mov bh, 00h ; not sure if necessary ?? mov ah, 02h ; set cursor position specified in dx inc dl inc dh int 10h pop cx loop .again jmp $ ; Jump here - infinite loop! times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, MOVING THE CURSOR .... * a simple example ------------ mov ah, 0 ; bios read key code function int 16H ; invoke bios cmp al, 0 ; is extended char (AL != 0) ? jne .printkey cmp ah, 75 ; left arrow je .leftarrow cmp ah, 77 ; right arrow je .rightarrow ,,, * move the cursor right if right arrow pressed ------------------------------- jmp start start: .again: mov ah, 0 ; bios read key code function int 16H ; invoke bios cmp al, 0 ; is extended char (AL != 0) ? 
jne .again ; wait for next key if not -> cmp ah, 77 ; right arrow jne .again ; wait for next key if not -> arrow key mov ah, 03h ; get cursor position into dx (int 10h, ah=03h) int 10h ; invoke bios mov ah, 02h ; bios function: set cursor position specified in dx inc dl ; increment column position int 10h ; invoke bios jmp .again jmp $ ; program hangs here times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, The code below could be greatly simplified by having a mov cursor procedure ? * move the cursor if arrow keys pressed ------------------------------- jmp start start: .again: mov ah, 0 ; bios read key code function int 16H ; invoke bios cmp al, 0 ; is extended char (AL != 0) ? jne .again ; wait for next key if not extended char cmp ah, 75 ; left arrow je .moveleft ; cmp ah, 77 ; right arrow je .moveright ; cmp ah, 80 ; down arrow je .movedown cmp ah, 72 ; up arrow je .moveup jmp .again .moveleft: mov ah, 03h ; get cursor position into dx (int 10h, ah=03h) int 10h ; invoke bios mov ah, 02h ; bios function: set cursor position specified in dx dec dl ; decrement column position int 10h ; invoke bios jmp .again .moveright: mov ah, 03h ; get cursor position into dx (int 10h, ah=03h) int 10h ; invoke bios mov ah, 02h ; bios function: set cursor position specified in dx inc dl ; increment column position int 10h ; invoke bios jmp .again .movedown: mov ah, 03h ; get cursor position into dx (int 10h, ah=03h) int 10h ; invoke bios mov ah, 02h ; bios function: set cursor position specified in dx inc dh ; increment row position int 10h ; invoke bios jmp .again .moveup: mov ah, 03h ; get cursor position into dx (int 10h, ah=03h) int 10h ; invoke bios mov ah, 02h ; bios function: set cursor position specified in dx dec dh ; decrement row position int 10h ; invoke bios jmp .again jmp $ ; program hangs here times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, * echo 
chars and print ascii code at bottom of screen
---------------------------------------------------------
  jmp start
  hextable db "0123456789ABCDEF"
  key db 1
 start:
  mov ax, 07C0h      ; Set data segment to where we're loaded
  mov ds, ax
 .begin:
  mov bh, 0          ; page 0 for the int 10h calls below
  mov ah, 0          ; wait for keypress function
  int 16h
  mov ah, 0eH        ; bios teletype char function
  int 10H
  mov [key], al      ; save key code in key buffer
  cmp al, 13         ; was the key press an 'enter'
  jne .status
  mov al, 10         ; if 'enter' is pressed add a newline
  int 10h
  ; save the cursor position, print something
  ; at the bottom of screen, then restore cursor
 .status:
  mov ah, 03h        ; bios function DH:DL <- cursor y:x
  int 10h            ; invoke bios
  push dx            ; save cursor Row:Col (DH:DL) on stack
  mov dx, 0x1700     ; Row 23, Column 0
  mov ah, 02h        ; set cursor position specified in dx
  int 10h
  ;
  ; NOTE, the code below could be simplified with a loop
  ;
  mov bx, hextable   ; translation table (the offset is small, so BH stays 0)
  mov ah, 0eH        ; bios teletype function
  mov al, [key]      ; get key code from buffer
  rol al, 4          ; move the high nibble into the low nibble
  and al, 0x0F       ; higher order digit
  xlatb              ; replace al with hex digit
  int 10h            ; invoke bios
  mov al, [key]      ; get key code from buffer
  and al, 0x0F       ; lower order digit
  xlatb              ; replace al with hex digit
  int 10h            ; invoke bios
  pop dx             ; restore text cursor position
  mov ah, 02h        ; set cursor position specified in dx
  int 10h            ; invoke bios
  jmp .begin         ; keep looping!
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55          ; The standard PC boot signature
,,,

INPUT FROM THE KEYBOARD

* define ASCII code constant for the key
>> ESC equ 1bh

INT 16h with AH=00h or 10h will block waiting for a keypress (it returns the ASCII result in AL). Use AH=01h or 11h to query whether a keypress is available first if you want to avoid blocking: it returns immediately with ZF clear if a key is available, or ZF set if not. Search for "INT 16h" for more details.
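The non-blocking query (INT 16h, AH=01h) can be sketched as below. This is a minimal, untested sketch: the '.idle' label and the use of 'hlt' while nothing is waiting are assumptions of this example, not something the bios requires.

* poll the keyboard without blocking, and echo keys when they arrive
---------------------------------------------------------
 start:
  sti                ; make sure interrupts are on ('hlt' below needs them)
 .poll:
  mov ah, 01h        ; bios 'check for keystroke' function, never blocks
  int 16h            ; ZF set = no key waiting
  jz .idle
  mov ah, 00h        ; a key is waiting: fetch it and remove it from the buffer
  int 16h
  mov bh, 0          ; page 0
  mov ah, 0eH        ; echo the key with the teletype function
  int 10H
  jmp .poll
 .idle:
  hlt                ; assumption: nothing else to do, so sleep until
                     ; the next interrupt (timer tick or key press)
  jmp .poll
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55          ; The standard PC boot signature
,,,

Note that AH=01h only looks at the key; it stays in the keyboard buffer until it is read with AH=00h. Any other work the program has to do could go at '.idle' in place of the 'hlt'.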
In the code example below, space and backspace appear to work as expected on my machine, but [enter] returns the cursor to the beginning of the line. * read keys from the keyboard and print them to the screen --------------------------------------------------------- jmp start start: mov ax, 07C0h ; Set data segment to where we're loaded mov ds, ax mov ah, 0 ; wait for keypress function int 16h mov ah, 0eH int 10H cmp al, 13 ; was the key press a 'enter' jne start mov al, 10 ; if a 'enter' is pressed add a newline int 10h jmp start ; keep looping! times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, * read keys and freeze if escape is pressed --------------------------------------------------------- ESCP equ 1bh start: mov ah, 0 int 16H cmp al, ESCP je .done ; If escape pressed, freeze! mov ah, 0eH ; print the character int 10H jmp start ; keep looping! .done: jmp $ times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, If a 'special' non-printing key is pressed, then register AL is zero and register AH contains the code for the key press * print '<' and '>' if arrow keys are pressed --------------------------------------------------------- jmp start start: .again: mov ah, 0 int 16H cmp al, 0 jne .printkey cmp ah, 75 ; left arrow je .leftarrow cmp ah, 77 ; right arrow je .rightarrow cmp ah, 82 je .insertkey ; insert key? cmp ah, 83 je .deletekey ; delete key? jmp .again ; read more keys .leftarrow mov al, '<' jmp .printkey .rightarrow mov al, '>' jmp .printkey .insertkey mov al, 'I' jmp .printkey .deletekey mov al, 'D' jmp .printkey .printkey mov ah, 0eH ; print the key pressed int 10H jmp .again .done: jmp $ times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, The following clears the screen but stops typing. 
* read keys and clear screen if escape is pressed
---------------------------------------------------------
  ESC equ 1bh
 start:
  mov ah, 0          ; bios read char function
  int 16H            ; invoke bios
  cmp al, ESC        ; was key 'escape' ?
  je .clear          ; If escape pressed, cls!
  mov ah, 0eH        ; print the character
  int 10H
  jmp start          ; keep looping!
 .clear:
  mov ah, 0          ; set video mode function (this also clears the screen)
  mov al, 13h        ; 13h is a graphics mode, where teletype output takes
                     ; its colour from BL, so typed text may be invisible;
                     ; mode 03h (80x25 text) clears and keeps text working
  int 10H
  jmp start
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55          ; The standard PC boot signature
,,,

READING TEXT FROM KEYBOARD ....

* read a line of input and count the characters
---------------------------------------------------------
  jmp start
 start:
  mov ax, 07C0h      ; Initialize data segment register
  mov ds, ax         ; via AX
  sub cx, cx         ; set cx = 0
 .again:
  mov ah, 0          ; bios read character function
  int 16h            ; invoke bios interrupt
  mov ah, 0eH        ; echo char entered
  int 10H
  inc cx             ; increment the character counter
  cmp al, 13         ; was the key press an 'enter'
  jne .again         ; loop if not enter pressed
  mov al, 10         ; print a newline
  mov ah, 0eH
  int 10H
  dec cx             ; don't include the enter key in the character count
  mov ax, cx         ;
  add al, '0'        ; convert count digit to ascii (only correct for 0-9)
  mov ah, 0eH        ; x86 bios print char function
  int 10h            ; print first digit of character count
  jmp $              ; hang here instead of running into the padding
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55          ; The standard PC boot signature
,,,

Todo!!
* read a line of input, and copy counted string to buffer
---------------------------------------------------------
  jmp start
  buffer times 64 db ' ' ; count word + text (the size is a guess and
                         ; there is no overflow check yet)
 start:
  mov ax, 07C0h      ; Set data segment to where we're loaded
  mov ds, ax
  mov di, buffer+2   ; the first word of buffer is for the count
  sub cx, cx         ; set cx = 0
 .again:
  mov ah, 0          ; x86 bios get char from keyboard
  int 16h            ; invoke the bios
  mov ah, 0eH        ; print char
  int 10H            ; invoke bios function
  ; NOTE: I think the code below could be better done with
  ; stosb, and loopne
  mov [di], al       ; copy AL to buffer
  inc di
  inc cx             ; increment the character counter
  cmp al, 13         ; was the key press an 'enter'
  jne .again         ; loop if not enter pressed
  mov al, 10         ; print a newline
  mov ah, 0eH
  int 10H
  dec cx             ; don't include the enter key in the character count
  mov [buffer], cx   ; store the count in the first word of the buffer
  mov ax, cx         ;
  add al, '0'        ; convert count digit to ascii (only correct for 0-9)
  mov ah, 0eH        ; x86 bios print char function
  int 10h            ; print first digit of character count
  jmp $              ; hang here instead of running into the padding
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55          ; The standard PC boot signature
,,,

NUMBERS AS USER INPUT ....

We would like to be able to read a positive integer entered by the user. So we need to keep a running total: for each digit read, multiply the total by the base and add the new digit, and so on until the end of the input.

MOUSE INPUT

http://wiki.osdev.org/Mouse_Input
  just enough info

The good news is that a modern usb mouse usually behaves like a normal PS/2 mouse (the bios emulates one). So you dont have to actually write a usb 'stack' (software api) in order to use the mouse. Phew!

Mouse (and keyboard) data arrive on port 0x60. The status byte is read from port 0x64: bit 0 (value 0x01) set means data is available, bit 5 (value 0x20) set means the data is from the mouse, not the keyboard.
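The two ports can be watched with a short polling loop. This is only a sketch under some assumptions: it tests bit 0 (value 0x01) of port 0x64 for 'data available' and bit 5 (value 0x20) for 'from the mouse', it turns interrupts off so that the bios keyboard handler does not read the bytes first, and it does nothing to enable the mouse, so on many machines only keyboard bytes will show up.

* print 'K' for each keyboard byte and 'M' for each mouse byte, untested
---------------------------------------------------------
 start:
  cli                ; stop the bios keyboard handler taking the bytes first
 .poll:
  in al, 64h         ; read the controller status byte
  test al, 01h       ; bit 0 set = a byte is waiting on port 60h
  jz .poll
  mov bl, al         ; remember the status
  in al, 60h         ; fetch the byte (its value is discarded here)
  mov al, 'K'
  test bl, 20h       ; bit 5 set = the byte came from the mouse
  jz .print
  mov al, 'M'
 .print:
  mov bh, 0          ; page 0
  mov ah, 0eH        ; bios teletype function
  int 10H
  jmp .poll
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55          ; The standard PC boot signature
,,,

A key press produces more than one byte (at least one when the key goes down and one when it comes up), so expect several 'K's per key.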
ASCII CODE The ascii code is a way of mapping common western (latin) characters to numbers * some common useful ascii codes ----------- cr equ 13 ; carriage return lf equ 10 ; line feed bell equ 7 ; bell (sort of) spc equ 32 ; space bs equ 8 ; back space del equ 127 ; 'delete' character ,,, * print ascii in descending order -------------------- jmp start start: mov cx, 255 .again: mov ah, 0eh mov al, cl int 10H mov al, ' ' int 10H loop .again .exit: hang: jmp hang ; keep looping! times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, * print the first 128 ascii in ascending order -------------------- jmp start start: mov cx, 128 .again: mov ah, 0eh ; bios teletype function mov al, 128 sub al, cl int 10H ; call bios function mov al, ' ' ; print a space int 10H ; do it loop .again ; loop 128 times .exit: hang: jmp hang ; keep looping! times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, * print ascii characters (and some non-ascii) in a table --------------------------------------------------------- jmp start %include 'printi8.asm' start: mov ax, 07C0h ; Set up 4K stack space after this bootloader add ax, 288 ; (4096 + 512) / 16 bytes per paragraph mov ss, ax mov sp, 4096 mov ax, 07C0h ; Set data segment to where we're loaded mov ds, ax mov cx, 0 .again: mov al, cl mov bl, 10 call printi8 ; print value of ascii character mov ah, 0eH mov al, ':' ; print a separator character int 10H mov al, cl int 10H mov al, ' ' int 10H inc cx cmp cx, 0xFF je .exit mov ax, cx and ax, 0007h ; mod 8 cmp ax, 0 jne .again mov ah, 0eh mov al, 13 int 10h mov ah, 0eh mov al, 10 int 10h jmp .again .exit: hang: jmp hang ; keep looping! times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, FONTS AND TEXT http://wiki.osdev.org/VGA_Fonts good info A 'standard' x86 bios outputs text in one of its 'text video modes'. 
It uses font 'bitmaps' with 8 columns and 16 rows for each character, so each character takes up 16 bytes. Within the bitmap a 0 represents the background colour, and a 1 represents the foreground colour. The first row of the glyph (character), 8 bits or 1 byte, is contained in the first byte of the bitmap, the 2nd row in the second byte, etc.

In the various graphics video modes, the BIOS text functions are of limited use (the teletype function, for example, takes its colour from BL and gives no control over the font or the exact pixel position). The programmer usually provides this functionality, by writing a character 1 pixel at a time. The first step to writing characters in graphics mode is getting the bitmap font data.

The bios contains tables of information laying out the fonts used to display characters. They are obtained with Int 10h service AH=11h, subservice AL=30h.

* get the table of font information, untested
-------------
  mov ax, 1130h      ; (Get font information)
  mov bh, 06h        ; 8x16 font (vga/mcga)
  int 10h            ; leave font table pointer in ES:BP
,,,

Next step: access a character and display it pixel by pixel in some graphics mode.
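That next step might look something like the sketch below. It is untested, and several things in it are assumptions of the example: the character 'A', the screen position (20,20), the colour white, and the use of 'pusha'/'popa' to protect the registers across the bios call. The offset of a glyph is the ascii code times 16, added to BP.

* draw one glyph from the bios 8x16 font, pixel by pixel, in mode 13h
---------------------------------------------------------
  jmp start
  char db 'A'
 start:
  mov ax, 07C0h      ; Set up 4K stack space after this bootloader
  add ax, 288        ; (4096 + 512) / 16 bytes per paragraph
  mov ss, ax
  mov sp, 4096
  mov ax, 07C0h      ; Set data segment to where we're loaded
  mov ds, ax
  mov ax, 0013h      ; ah=00h set video mode, al=13h 320x200 graphics
  int 10h
  mov ax, 1130h      ; (Get font information)
  mov bh, 06h        ; 8x16 font (vga/mcga)
  int 10h            ; font table pointer is now in ES:BP
  mov al, [char]
  sub ah, ah
  shl ax, 4          ; ascii code * 16 bytes per glyph
  add bp, ax         ; ES:BP -> top row of the glyph
  mov dx, 20         ; y of the top row
 .row:
  mov bl, [es:bp]    ; one row of the glyph, 1 bit per pixel
  mov cx, 20         ; x of the left column
 .col:
  shl bl, 1          ; leftmost remaining pixel -> carry flag
  jnc .next          ; 0 = background, leave the pixel alone
  pusha              ; the bios call may change registers
  mov ax, 0C0Fh      ; ah=0ch write pixel, al=15 white
  mov bh, 0          ; page 0
  int 10h            ; draw pixel at (CX,DX)
  popa
 .next:
  inc cx
  cmp cx, 28         ; 8 columns per row
  jb .col
  inc bp             ; next byte = next row of the glyph
  inc dx
  cmp dx, 36         ; 16 rows per glyph
  jb .row
  jmp $
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55          ; The standard PC boot signature
,,,

Drawing through int 10h is slow but keeps the example short; the same two loops would work with direct writes to video memory.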
To access 1 character from the 4K (4096 byte) character bitmap, multiply the ascii code by 16 (bytes per character) and add the result to BP (the table starts at ES:BP).

* code to store the full 4K (256 chars x 16 bytes/char) bitmap
-------------------
  ;in: es:di=4k buffer
  ;out: buffer filled with font
  push ds
  push es
  ;ask BIOS to return VGA bitmap fonts
  mov ax, 1130h
  mov bh, 6
  int 10h            ; ES:BP now points to the font
  ;copy charmap
  push es            ; make the data segment the same
  pop ds             ; as the font segment
  pop es             ; restore the buffer segment
  mov si, bp
  mov cx, 256*16/4
  rep movsd          ; copy 4 bytes at a time (needs a 386 or later)
  pop ds
,,,

* display an ascii character by copying a bios font bitmap
-------------
  jmp start
  %include 'printi8.asm'
  char db '$'
 start:
  mov ax, 07C0h
  mov ds, ax         ; set the data segment
  mov ax, 1130h      ; (Get font information)
  mov bh, 06h        ; 8x16 font (vga/mcga)
  int 10h            ; leave font table pointer in ES:BP
  mov al, [char]
  mov bl, 10
  call printi8       ; print the ascii code (assumes printi8 preserves BP)
  ; shift the ascii code left 4 times (to multiply by 16)
  mov al, [char]
  sub ah, ah
  shl ax, 4
  ; add this to the bitmap offset
  add bp, ax
  ; now just print out the bytes or draw pixel by pixel
 hang:
  jmp hang
  times 510-($-$$) db 0
  dw 0xAA55
,,,

CUSTOM FONTS

You can modify a glyph for a character in text mode by writing the desired bitmap to the correct location in the VGA font memory (plane 2 of video memory, not the ROM table that Int 10h/AX=1130h points to). The code is the reverse of reading the 4K font bitmap: swap the source and the destination of the copy.

Another way to set the fonts used for text mode: see Ralf Brown's Interrupt List, Int 10h/AX=1110h.

* define a custom 8x16 pixel 'A' glyph in assembly language
-----------------
  OurFont db 00000000b
          db 00000000b
          db 01111111b
          db 01100011b
          db 01100011b
          db 01100011b
          db 01111111b
          db 01100011b
          db 01100011b
          db 01100011b
          db 01100011b
          db 01100011b
          db 01100011b
          db 00000000b
          db 00000000b
          db 00000000b
,,,

PIXELS AND DRAWING WITH INTERRUPTS ....

Video mode 13h has the highest number of colours (for vga). The int 10h functions for drawing pixels are considered slow.
Pixels can only be read and written in graphics modes * draw 1 white pixel at (10,10) -------------------- start: mov ax, 07C0h mov ds, ax .setmode: mov ah, 0 ; set graphics display mode function. mov al, 13h ; mode 13h = 320x200 at 8 bits/pixel. int 10h ; set it! .draw: mov cx, 10 ; x-coordinate mov dx, 10 ; y-coordinate mov al, 15 ; white mov ah, 0ch ; put pixel int 10h ; draw pixel jmp $ times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, * attempt to write text in a graphics mode, doesnt work!! -------------------- start: mov ax, 07C0h mov ds, ax .setmode: mov ah, 0 ; set graphics display mode function. mov al, 13h ; mode 13h = 320x200 at 8 bits/pixel. int 10h ; set it! .text: mov ah, 0eh mov al, 'Q' int 10h .draw: mov cx, 10 ; column mov dx, 10 ; row mov al, 15 ; white mov ah, 0ch ; put pixel int 10h ; draw pixel jmp $ times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, MEMORY MAPPED GRAPHICS .... much faster than the bios interrupt approach. Writes values directly to video memory, provided that these are in standard places. CURSOR SHAPE * set text-mode cursor shape. >> int 10h, ah=01h input: CH = cursor start line (bits 0-4) and options (bits 5-7). CL = bottom cursor line (bits 0-4). when bit 5 of CH is set to 0, the cursor is visible. when bit 5 is 1, the cursor is not visible. * hide blinking text cursor: ---------------------------- mov ch, 32 mov ah, 1 int 10h ,,, * show standard blinking text cursor: ------------------------------------- mov ch, 6 mov cl, 7 mov ah, 1 int 10h ,,, * show box-shaped blinking text cursor: --------------------------------------- mov ch, 0 mov cl, 7 mov ah, 1 int 10h jmp $ ; keep looping! 
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55          ; The standard PC boot signature
,,,

* show a box cursor while reading keys
---------------------------------------------------------
 start:
  mov ch, 0          ; set up the cursor
  mov cl, 7
  mov ah, 1
  int 10h            ; display the box cursor
 .repeat:
  mov ah, 0
  int 16h            ; read a key
  mov ah, 0eH
  int 10H            ; display the last key pressed
  jmp .repeat
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55          ; The standard PC boot signature
,,,

note: some bioses required CL to be >=7, otherwise wrong cursor shapes are displayed.

COLOURS

16 colour mode uses a 4 bit encoding

  * * * * I R G B

where I=Intensity (eg dark or light green) and RGB means red, green, blue. In the background half of a text attribute byte, the I bit (bit 7 of the byte) is by default the 'blinking' bit (it makes the text blink). So we can change the intensity of a colour (from light to dark or vice-versa) by toggling the highest bit (the I bit) of its nibble.

* turn on the blink/intensity bit for a coloured character
------------
  mov ah, 9
  mov al, 'a'
  mov bh, 0
  mov bl, colour     ; 'colour' is a constant defined elsewhere
  or bl, 10000000b   ; set bit 7
  mov cx, 1
  int 10h
,,,

COLOUR AND TEXT

The bios in text mode is able to display text in 16 different colours.

Write character and attribute at cursor position:
  int 10h, ah=09h, al=character, bh=page number, bl=color,
  cx=number of times to print character

The character colour attribute is an 8 bit value in the BL register: the low 4 bits set the foreground color, the high 4 bits set the background color.
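Since the attribute is just two nibbles, it can be built from a background and a foreground colour number with a shift and an 'or', which the assembler works out at assembly time. A small sketch, where the names BLUE and YELLOW are assumptions of this example (the values are the usual bios colour numbers):

* print "XXX" in yellow on blue, building the attribute from 2 colours
-----------------------------------------------------
  BLUE   equ 1
  YELLOW equ 14
 start:
  mov ah, 09h        ; bios function colour print
  mov al, 'X'        ; the character to print
  mov bh, 0          ; page 0
  ; (background << 4) | foreground = 0x1E
  mov bl, (BLUE << 4) | YELLOW
  mov cx, 3          ; print it 3 times
  int 10h            ; invoke bios
  jmp $              ; loop forever
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55          ; The standard PC boot signature
,,,

Writing the attribute this way avoids counting bits by hand in a binary constant.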
* print intense red (fg) on intense blue (background)
--------------
  ;       IRGBIRGB
  mov bl, 0b10011100 ; bit 7 blinks rather than brightens by default
,,,

The cursor position is not changed after writing the characters.

* print "====" in green at the current cursor position
-----------------------------------------------------
  mov ah, 09h        ; the 'function' number
  mov al, '='        ; the character to print
  mov bh, 0          ; first page
  ; color IRGBIRGB
  mov bl, 0b00000010 ; green on black
  mov cx, 4          ; do it 4 times (cursor stays where it was)
  int 10h            ; do it with a bios interrupt
  jmp $              ; loop forever
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55          ; The standard PC boot signature
,,,

* print 2 characters advancing cursor
-----------------------------------------------------
  mov bh, 0          ; first page
  mov cx, 1          ; number of characters to print
  mov ah, 09h        ; bios function colour print
  mov al, '='        ; the character to print
  ; color IRGBIRGB
  mov bl, 0b00000010 ; green on black
  int 10h            ; do it with a bios interrupt
  mov ah, 03h        ; bios function: get cursor position into dx
  int 10h            ; invoke bios (this also changes CX)
  mov ah, 02h        ; bios function: set cursor position specified in dx
  inc dl             ; increment cursor column by 1
  int 10h            ; invoke bios
  mov cx, 1          ; number of characters to print
  mov ah, 09h        ; bios function colour print
  mov al, '='        ; reload the character (AX may have been changed above)
  mov bl, 0b01101111 ; white on brown
  int 10h            ; colour print another =
  jmp $              ; loop forever
  times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
  dw 0xAA55          ; The standard PC boot signature
,,,

The code below relies on the fact that the bios function AH=09h,BL=colour,CX=char-count does not update the cursor position after printing to the screen.
So each iteration of the loop actually overwrites n-1 characters of the previous iteration * print digits 0-9 in 9 different colours ----------------------------------------- start: mov cx, 0x0009 .again mov ah, 09h ; the 'function' number mov al, cl ; the digit to print add al, '0' ; convert the digit to ascii mov bl, cl ; use the CX counter to cycle thru 9 colours int 10h ; do it with a bios interrupt loop .again jmp $ ; loop forever times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,,, * print 16 stars in 16 colours, or 15 ----------------------------------------- start: mov cx, 0x000F .again mov ah, 09h ; the 'function' number mov al, '*' ; the character to print a star mov bl, cl ; use the CX counter to cycle thru 16 colours int 10h ; do it with a bios interrupt loop .again jmp $ ; loop forever times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,,, == Basic Bios Colours .. HEX BIN COLOUR .. 0, 0000 black .. 1 0001 blue .. 2 0010 green .. 3 0011 cyan .. 4 0100 red .. 5 0101 magenta .. 6 0110 brown .. 7 0111 light gray .. 8 1000 dark gray .. 9 1001 light blue .. A 1010 light green .. B 1011 light cyan .. C 1100 light red .. D 1101 light magenta .. E 1110 yellow .. 
F 1111 white ,,, * print "##" light blue on white at the current cursor position ----------------------------------------------------- mov ah, 0x09 ; bios function colour print mov al, '#' ; character to print mov bl, 0b11110001 ; blue on white background mov cx, 2 ; how many times to print character int 10h ; invoke bios function jmp $ ; infinite loop times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,,, * print a triangle of stars in 16 different colours ----------------------------------------------------- start: mov cx, 0xF ; 16 colours .again mov ah, 09h ; the 'function' number for colour print mov al, '*' ; the character to print mov bx, cx ; colour in cx counter at first page (bh=0) int 10h ; do it with a bios interrupt mov ah, 0eH ; teletype function mov al, 10 ; a form-feed int 10h ; do it loop .again jmp $ ; loop forever times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,,, * print a triangle of characters in 16 different colours ----------------------------------------------------- start: mov cx, 0xF ; 16 colours .again mov ah, 09h ; the 'function' number for colour print mov al, cl ; the character to print add al, 'A'-1 mov bx, cx ; colour in cx counter at first page (bh=0) int 10h ; do it with a bios interrupt mov ah, 0eH ; teletype function mov al, 10 ; a form-feed int 10h ; do it loop .again jmp $ ; loop forever times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,,, * print a triangle of characters in 16 different colours ----------------------------------------------------- start: mov cx, 0xF ; 16 colours .again push cx mov ah, 09h ; the 'function' number for colour print mov al, 'Z' sub al, cl ; the character to print mov bx, cx ; colour in cx counter at first page (bh=0) mov cx, 1 int 10h ; do it with a bios interrupt mov dx, 0 ;mov bh, 00h ; assume page 0 mov ah, 03h ; get cursor 
position into dx int 10h mov ah, 02h ; set cursor position specified in dx inc dl int 10h ;mov ah, 0eH ; teletype function ;mov al, 10 ; a form-feed ;int 10h ; do it pop cx loop .again jmp $ ; loop forever times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,,, * change whole screen colours to white on blue ----------------------------------------------------- mov ah, 0Bh mov bh, 0 mov bl, 11110001b int 10h mov ah, 0eH ; print the character mov al, '#' int 10H jmp $ times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,,, * print blue then green, no !!! not working ----------------------------------------------------- mov ah, 0Bh mov bh, 0 mov bl, 00010000b ; blue on black int 10h mov ah, 0eH ; print the character mov al, '#' int 10h mov ah, 0Bh mov bh, 0 mov bl, 00100000b ; green on black int 10h mov ah, 0eH ; print the character mov al, '#' int 10h jmp $ times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,,, ARTFUL COLOURS .... 
* print 16 colours, with 16 backgrounds ----------------------------------------- start: mov cx, 0x00FF .again mov ah, 09h ; the 'function' number mov al, '*' ; the character to print a star mov bl, cl ; use the CX counter to cycle thru 16 colours int 10h ; do it with a bios interrupt loop .again jmp $ ; loop forever times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,,, * print some ascii digit in colour by advancing cursor ----------------------------------------------------- jmp start start: mov cx, 9 .again: push cx mov ah, 09h ; bios function colour print mov al, cl ; the digit to print add al, '0' ; convert digit to ascii mov bl, cl ; colour in counter CX mov cx, 1 ; number of characters to print int 10h ; invoke bios mov ah, 03h ; bios function: get cursor position into dx int 10h ; invoke bios mov ah, 02h ; bios function: set cursor position specified in dx inc dl ; increment cursor column by 1 int 10h ; invoke bios pop cx loop .again jmp $ ; loop forever times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,,, * print all ascii chars in colour by advancing cursor ----------------------------------------------------- jmp start start: mov cx, 0x00FF .again: push cx mov ah, 09h ; bios function colour print mov al, 0xFF ; print ascending order sub al, cl ; the ascii char to print mov bl, cl ; colour in counter CX and bl, 0x0F ; only print foreground colours mov cx, 1 ; number of characters to print int 10h ; invoke bios mov ah, 03h ; bios function: get cursor position into dx int 10h ; invoke bios mov ah, 02h ; bios function: set cursor position specified in dx inc dl ; increment cursor column by 1 int 10h ; invoke bios pop cx test cl, 0b00001111 ; 32 characters to a line jne .here mov al, 10 ; form feed char mov ah, 0eH ; bios 'teletype' function int 10H ; invoke bios mov al, 13 ; return char mov ah, 0eH ; bios 'teletype' function int 10H ; invoke bios 
.here: loop .again jmp $ ; loop forever times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,,, USEFULL PROCEDURES This section contains a set of hopefully useful proceedures * print a zero terminated string with address in the SI register ----------------- BITS 16 jmp start message db 'A function to print',13,10,0 start: mov ax, 07C0h ; Set up 4K stack space after this bootloader add ax, 288 ; (4096 + 512) / 16 bytes per paragraph mov ss, ax mov sp, 4096 mov ax, 07C0h ; Set data segment to where we're loaded mov ds, ax mov si, message ; Put string position into SI call prints ; Call our string-printing routine hang: jmp hang ; Jump here - infinite loop! ;# prints ; output zero terminated string in SI to screen prints: mov ah, 0Eh ; int 10h 'print char' function .again: lodsb ; Get character from string cmp al, 0 je .done ; If char is zero, end of string int 10h ; Otherwise, print it jmp .again .done: ret times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,, * a proceedure to print a character in colour and advance cursor ----------------------------------------------------- start: mov al, 'G' mov cx, 0xF .again: mov bl, cl ; some colour call putcc loop .again jmp $ ; loop forever ; proc: print a coloured character (in AL) and colours (in BL) putcc: push ax push bx push cx push dx mov bh, 0 ; assume we are working in the first page mov ah, 09h ; the 'function' number for colour print mov cx, 1 ; print the character once int 10h ; do it with a bios interrupt mov ah, 03h ; get cursor position into dx int 10h mov ah, 02h ; set cursor position function inc dl ; increment the column position int 10h pop dx pop cx pop bx pop ax ret times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,,, The code below needs to perform a modulus on BL to make the colours cycle through the 16 allowable text mode colours * a 
procedure to 'rainbow' print some text (each letter a new colour) ----------------------------------------------------- jmp start message db '8086 in realmode rainbow!@#$%^&*', 0 start: mov ax, 07C0h ; Set up 4K stack space after this bootloader add ax, 288 ; (4096 + 512) / 16 bytes per paragraph mov ss, ax mov sp, 4096 mov ax, 07C0h ; Set data segment to where we're loaded mov ds, ax mov si, message ; Put string position into SI call printcolour jmp $ ; loop forever ; proc: print a string in rainbow colour ; string address in SI printcolour: push bx .resetcolour: mov bl, 1 ; colours start from 1 because 0 is black .repeat: lodsb ; Get character from string cmp al, 0 ; is the character byte 0 je .done ; If char is zero, end of string call putcc ; Otherwise, print it in colour cmp bl, 15 ; if bl is at the last colour reset it je .resetcolour inc bl jmp .repeat .done: pop bx ret ; print a coloured character (in AL) and colours (in BL) putcc: push ax ; save registers to the stack push bx push cx push dx mov bh, 0 ; assume we are working in the first page mov ah, 09h ; the 'function' number for colour print mov cx, 1 ; print the character once int 10h ; do it with a bios interrupt mov ah, 03h ; get cursor position into dx int 10h mov ah, 02h ; set cursor position function inc dl ; increment the column position int 10h pop dx pop cx pop bx pop ax ret times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature ,,,, READING FROM DISKS The AH=02h, int 13h function allows reading from a 'floppy' (or usb emulating a floppy) or a hard drive. This is probably the most important function of a 'bootloader', that is, it must load something (code) from the disk in order to overcome the 1 sector (512byte) limit of the bootsector. Some pundits say that resetting and reading should be tried 3 times. In the case of a real floppy the first read may not work because the device takes some time to 'spin up' etc. 
These factors should not apply to a usb memory stick. * read and print some text which is in the 2nd sector ----------------- BITS 16 start: mov ax, 07C0h ; Set up 4K stack space after this bootloader add ax, 288 ; (4096 + 512) / 16 bytes per paragraph mov ss, ax mov sp, 4096 mov ax, 07C0h ; Set data segment to where we're loaded mov ds, ax reset: ; Reset the floppy drive mov ax, 0 ; mov dl, 0 ; Drive=0 (=A) int 13h ; jc reset ; ERROR => reset again read: mov ax, 1000h ; ES:BX = 1000:0000 mov es, ax ; es:bx determines where data loaded to mov bx, 0 ; mov ah, 2 ; Load disk data to ES:BX mov al, 5 ; Load 5 sectors mov ch, 0 ; Cylinder=0 mov cl, 2 ; Sector=2 mov dh, 0 ; Head=0 mov dl, 0 ; Drive=0, 'floppy' (or usb key) int 13h ; Read! jc read ; ERROR => Try again mov al, [es:bx] ; print 2 characters loaded mov ah, 0eh int 10h mov al, [es:bx+1] int 10h hang: jmp hang times 510-($-$$) db 0 dw 0AA55h data db 'some sample data',0 ,,, * read 1 sector into a variable in the data sector ----------------- BITS 16 jmp start start: mov ax, 07C0h ; Set up 4K stack space after this bootloader add ax, 288 ; (4096 + 512) / 16 bytes per paragraph mov ss, ax mov sp, 4096 mov ax, 07C0h ; Set data segment to where we're loaded mov ds, ax reset: ; Reset the floppy drive mov ax, 0 ; mov dl, 0 ; Drive=0 (=A) int 13h ; jc reset ; ERROR => reset again read: mov ax, ds ; ES:BX = this data segment, message variable mov es, ax ; es:bx determines where data loaded to mov bx, message ; load into the 'message' variable buffer mov ah, 2 ; Load disk data to ES:BX mov al, 1 ; Load 1 sector, 512 bytes mov ch, 0 ; Cylinder=0 mov cl, 2 ; Sector=2 mov dh, 0 ; Head=0 mov dl, 0 ; Drive=0, 'floppy' (or usb key) int 13h ; Read! 
jc read ; ERROR => Try again mov ah, 0eh mov al, [message] ; print 2 characters loaded int 10h mov al, [message+1] int 10h hang: jmp hang times 510-($-$$) db 0 dw 0AA55h data db 'loaded data',0 message times 512 db 0 ,,, * read 1 sector and print out the string data ----------------- BITS 16 jmp start message.reset db 'resetting the floppy',13,10,0 message.read db 'reading 1 sector',13,10,0 start: mov ax, 07C0h ; Set up 4K stack space after this bootloader add ax, 288 ; (4096 + 512) / 16 bytes per paragraph mov ss, ax mov sp, 4096 mov ax, 07C0h ; Set data segment to where we're loaded mov ds, ax reset: ; Reset the floppy drive mov si, message.reset call prints mov ax, 0 ; mov dl, 0 ; Drive=0 (=A) int 13h ; jc reset ; ERROR => reset again read: mov si, message.read call prints mov ax, ds ; ES:BX = this data segment, message variable mov es, ax ; es:bx determines where data loaded to mov bx, message ; load into the 'message' variable buffer mov ah, 2 ; Load disk data to ES:BX mov al, 1 ; Load 1 sector, 512 bytes mov ch, 0 ; Cylinder=0 mov cl, 2 ; Sector=2 mov dh, 0 ; Head=0 mov dl, 0 ; Drive=0, 'floppy' (or usb key) int 13h ; Read! jc read ; ERROR => Try again mov si, message call prints hang: jmp hang %include 'prints.asm' times 510-($-$$) db 0 dw 0AA55h data db 'loaded data',0 message times 512 db 0 ,,, WRITING TO DISKS The bios contains functions (under INT 13h) for writing to the 'floppy' disk (nowdays a usb memory stick which is emulating a floppy) or to a hard disk. We must be VERY thoughtful when writing to a hard disk, or we will end up with the computer completely unusable!!!!!. The same applies to the floppy but perhaps the consequences are less catastrophic. * write to hard disk, dont do it!!! 
you wont have a working comp
----------
  xor ax, ax
  mov es, ax         ; ES <- 0
  mov cx, 1          ; cylinder 0, sector 1
  mov dx, 0080h      ; DH = 0 (head), drive = 80h (0th hard disk)
  mov bx, 5000h      ; offset of the buffer (ES:BX = 0000:5000h)
  mov ax, 0301h      ; AH = 03 (disk write), AL = 01 (number of sectors to write)
  ;int 13h
,,,

The code below should check that we are not writing to a hard disk (eg DL=80h), because doing so will probably render the computer unusable.

* write to the boot medium (a usb stick hopefully)
----------
  ; see the read disk section for some better code
  ; to do this
  xor ax, ax
  mov es, ax         ; ES := 0
  mov cx, 1          ; cylinder 0, sector 1
  mov dh, 0          ; head 0
  mov dl, 0          ; 1st floppy, but not necessarily the usb memory stick
  mov bx, 5000h      ; offset of the buffer (ES:BX = 0000:5000h)
  mov ah, 03         ; disk write
  mov al, 01         ; write only 1 sector (512 bytes)
  int 13h
,,,

WRITE TO FLOPPY OR USB ....

The happy answer is that a simple technique allows the same boot sector code to access a floppy disk image on a USB flash drive, whether it was booted with floppy disk emulation or hard drive emulation. If dl=80h (hard drive emulation), get the drive parameters:

* get drive parameters
-------------
  int 13h, ah=8, dl=drive number
  Return:
  cl=maximum sector number in bits 0-5 (same as number of sectors per track)
  dh=maximum head number (just add 1 to get number of heads)
  (bits 6-7 of cl and all of ch hold the maximum cylinder number)
,,,

This returned information describes the geometry of the emulated device (if dl=0 then it's standard floppy disk geometry: 18 sectors per track and 2 heads). This can be used to calculate the Cylinder Head Sector information required for:
  READ SECTOR(S) int 13h, ah=2
  WRITE SECTOR(S) int 13h, ah=3

CMOS

http://wiki.osdev.org/CMOS
  good cmos and realtime clock information
http://vitaly_filatov.tripod.com/ng/asm/asm_029.3.html
  more timer info.

TIME AND TIMERS

INT 08H is a timer interrupt, generated about every 55 milliseconds (roughly 18.2 times per second), but see below for an easier way to time code.

INT 1Ah / AH = 00h - get system time.
  return: CX:DX = number of clock ticks since midnight.
 AL = midnight counter, advanced each time midnight passes.
 notes: there are approximately 18.20648 clock ticks per second,
 and 1800B0h per 24 hours. AL is not set by the emulator.

 If you don’t actually need to use the timer tick interrupt directly,
 there is a much easier alternative. You could code the program to poll
 the count of the timer ticks since midnight that the BIOS maintains in
 the DWORD at offset address 6Ch in the BIOS data area, located at
 segment address 40h. The polling loop could compare the count to the
 value saved on the previous loop, and if the count had changed,
 indicating that a timer tick had occurred, save the count (for use in
 the next loop), call Interrupt 1Ah, etc, and continue looping.

 BIOS

 In real mode, the bios provides all sorts of useful functions for
 reading and writing and displaying.

 * get information about the current bios
 ----------------------
   mov ah, 0C0h   ; a hex number must start with a digit in nasm
   int 15h        ; this returns a table of information
 ,,,

 GOTCHAS

 * watch out, dividend is only 1 byte! so AH gets whatever byte follows it
 ------------
   mov AX, [dividend]   ; loads 2 bytes: AL=54, AH=the next byte in memory
   jmp $
   dividend db 54
 ,,,

 FORTH IDEAS

 The forth language introduced some revolutionary ideas that never led
 to any kind of revolution. Namely: place code units (functions/ words/
 objects) within a data structure which includes the word's name. This
 provides what is today called 'reflection': the ability of code to
 'know' something about itself.

 basic forth functions: receive a word from keyboard and look up
 the word in a dictionary. If word found, execute the code associated
 with the word. If word not found, try to convert input to a number
 and push it on the stack.
All functions receive their parameters on the 'stack' (which is either the system stack, or else a software stack) * an example of a forth word data structure (from 'itsy-forth') ----------------- ; header dw link_to_previous_word db 3, 'nip' ; strings are 'counted' in forth (3 chars in nip) xt_nip dw docolon ; xt= execution token, forth jargon ; body dw xt_swap ; pointers to other forth 'words' dw xt_drop ; remove last item on stack dw xt_exit ; pop stack etc ,,, * example of forth dictionary entry with assembly code ----------- dw link_to_previous_word db 1, '+' xt_plus dw mc_plus mc_plus pop ax add bx,ax jmp next ,,, EXERCISES TOWARD A FORTH LIKE SYSTEM .... We can write small programs which perform forth-like functions to demonstrate different techniques for creating forth-ish systems. The following code is similar to the forth 'accept' word. It gets a certain number of characters and copies them to a buffer. In forth, the line is then parsed into counted words and executed, one word at a time. The entry code below should handle 'backspaces' to allow the user to edit the text entered. 
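 The backspace handling mentioned above is not implemented in the entry
 code that follows. One possible way to do it is sketched here. This is
 only a sketch under some assumptions: it uses a far jump to force
 CS=07C0h, it assumes the bios preserves CX and DI across int 10h and
 int 16h, and the label names ('.erase', '.done') are made up for this
 example. It counts characters upwards in CX instead of using 'loop',
 so that a backspace can simply decrement the count.

 * a sketch: read keys into a counted buffer, with backspace editing
 ------------------
   BITS 16
   jmp 07C0h:start          ; far jump: CS=07C0h, so offsets start at 0
   SIZE equ 9
   buffer: times SIZE+1 db 0 ; 1 count byte + SIZE characters
 start:
   mov ax, cs
   mov ds, ax
   mov es, ax               ; stosb stores to ES:DI
   add ax, 288              ; 4K stack after the bootloader
   mov ss, ax
   mov sp, 4096
   cld                      ; go forwards, not backwards
   mov di, buffer+1         ; DI = next free slot in the buffer
   xor cx, cx               ; CX = number of characters stored so far
 .again:
   mov ah, 0                ; wait for any key
   int 16h                  ; AL = ascii code of the key
   cmp al, 13               ; 'enter' finishes the line
   je .done
   cmp al, 8                ; backspace?
   je .erase
   cmp cx, SIZE             ; buffer full: ignore the key
   jae .again
   stosb                    ; store the char, DI moves forward
   inc cx
   mov ah, 0eh              ; echo the key pressed
   int 10h
   jmp .again
 .erase:
   jcxz .again              ; nothing to erase
   dec di                   ; forget the last stored char
   dec cx
   mov ah, 0eh              ; AL is still 8 here
   int 10h                  ; cursor one step left
   mov al, ' '
   int 10h                  ; blank out the old char
   mov al, 8
   int 10h                  ; and step left again
   jmp .again
 .done:
   mov [buffer], cl         ; store the char count in front of the text
 hang:
   jmp hang
   times 510-($-$$) db 0    ; Pad remainder of boot sector with 0s
   dw 0xAA55                ; The standard PC boot signature
 ,,,

 The teletype function (int 10h, ah=0eh) treats character 8 as 'move
 the cursor left', which is why the sequence backspace, space, backspace
 rubs out the last character on screen.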
 * get some text from the keyboard and copy to a counted buffer
 ------------------
   ;org 7c00h              ; not used: ds is set to 07C0h below, so offsets start at 0
   jmp start
   SIZE equ 9
   buffer resb SIZE+1      ; 1 byte for the count + 9 for chars
 start:
   mov ax, 07C0h           ; Set data segment to where we're loaded
   mov ds, ax
   mov es, ax              ; es is needed for stosb
   cld                     ; go forwards, not backwards
 .keys:
   mov cx, SIZE            ; maximum chars in buffer
   lea di, [buffer+1]
 .again:
   mov ah,0                ; wait for any key
   int 16h                 ; bios keyboard functions
   cmp al, 13              ; was the key press an 'enter'
   je .count
   stosb                   ; copy the char to the buffer
   mov ah, 0eh             ; echo the key pressed
   int 10h
   loop .again             ; loop while CX > 0
 .count:
   mov bx, SIZE            ; calculate and store char count in [buffer]
   sub bx, cx
   mov [buffer], bl
 .type:                    ; print count and 1st character
   call newline
   mov al, [buffer]        ; print char count (one digit)
   add al, '0'             ; convert digit to ascii
   int 10h
   mov al, [buffer+1]      ; print 1st char of buffer
   int 10h
   call newline
   jmp .keys               ; keep looping!
 newline:
   mov ah, 0eh
   mov al, 13              ; print a newline
   int 10h
   mov al, 10
   int 10h
   ret
   times 510-($-$$) db 0   ; Pad remainder of boot sector with 0s
   dw 0xAA55               ; The standard PC boot signature
 ,,,

 A difficult gotcha: with both 'org 7c00h' and 'mov ax, 07c0h / mov ds, ax'
 the code does not work, because the 7C00h base is then counted twice
 (once in every label offset and once in the segment). Use one or the
 other, not both.

 * get one letter from keyboard and look up in a dictionary
 ------------------
   ;org 7c00h
   jmp start
   buffer db ' '           ; single character buffer
   ; the dictionary, a linked list.
 aa dw 0                   ; zero link means top of dictionary
   db 1,'a'                ; count + name
   mov ah, 0eh
   mov al, 'A'
   int 10h
 bb dw aa                  ; link to previous entry in dictionary
   db 1,'b'
   mov ah, 0eh
   mov al, 'B'
   int 10h
 cc dw bb                  ; link to previous entry in dictionary
   db 1,'c'
   mov ah, 0eh
   mov al, 'C'
   int 10h
   ret
 last dw cc                ; link to last dictionary entry
 start:
   mov ax, 07C0h           ; Set up 4K stack space after this bootloader
   add ax, 288             ; (4096 + 512) / 16 bytes per paragraph
   mov ss, ax
   mov sp, 4096
   mov ax, 07C0h           ; Set data segment to where we're loaded
   mov ds, ax
   mov es, ax              ; es is needed for stosb
   cld                     ; go forwards, not backwards
 .again:
   mov ah,0                ; wait for any key
   int 16h                 ; bios keyboard functions
   mov [buffer], al        ; copy the char to the buffer
   mov ah, 0eh             ; echo the key pressed
   int 10h
 .search:                  ; print the name of the last dictionary entry
   ;call newline
   mov bx, [last]          ; bx = address of the last entry
   lea si, [bx]
   mov al, [si+3]          ; skip the link (2 bytes) and the count (1 byte)
   mov ah, 0eh             ; echo the last character in dict
   int 10h
   ;lea bx, newline
   ;call bx
   jmp .again              ; keep looping!
 newline:
   mov ah, 0eh
   mov al, 13              ; print a newline
   int 10h
   mov al, 10
   int 10h
   ret
   times 510-($-$$) db 0   ; Pad remainder of boot sector with 0s
   dw 0xAA55               ; The standard PC boot signature
 ,,,

 The assembly coder will notice that it is easier to compare 2 strings
 if those strings are 'counted', that is, the number of characters they
 contain is stored (preferably) in front of the string text.

 * look up a word in a linked-list dictionary and report if found
 ------------------
 ,,,

 * write functions which take and leave parameters from some stack
 --------
 ,,,

 FASM

 Fasm is the 'flat assembler' and, at the time of writing, appeared to be
 more actively maintained than 'nasm'.

 ASSEMBLY AND NASM

 Nasm is the 'netwide assembler' and, at the time of writing, did not
 appear to be very actively maintained.

 Assembly language programming has the reputation of being an egregious,
 wrongheaded pursuit, the domain of casino card counters and their ilk.
 But it's really not that bad.

 LABELS ....

 Labels may be local (starting with a dot) or non local. Local ones
 need a non local one before them, because nasm attaches each local
 label to the most recent non local label.
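 The point of local labels is that the same short name can be reused in
 every routine. The fragment below is a small illustration, not part of
 any program in this booklet; the routine names 'delay_short' and
 'delay_long' are invented for the example. It is not a boot sector, it
 only needs to assemble.

 * a sketch: the same local label name used in two routines
 -----
   BITS 16
 delay_short:
   mov cx, 100
 .loop:                    ; full name is 'delay_short.loop'
   dec cx
   jnz .loop
   ret
 delay_long:
   mov cx, 60000
 .loop:                    ; full name is 'delay_long.loop', so no clash
   dec cx
   jnz .loop
   ret
 ,,,

 From outside a routine, a local label can still be reached through its
 full name, for example 'jmp delay_short.loop'.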
 * error, nasm doesn't like this
 -----
 .again
   jmp .again
 ,,,

 * ok nasm is happy
 -----
 start:
 .again
   jmp .again
 ,,,

 ASSEMBLING AND ORGANISING WITH NASM ....

 The program below also seems to work when the stack set-up lines are
 left out, which is odd, since a procedure call needs a stack.
 Presumably the bios leaves a usable stack behind, but it is safer not
 to rely on that.

 * a program which includes a procedure in a separate file
 --------
   jmp start
 start:
   mov ax, 07C0h           ; Set up 4K stack space after this bootloader
   add ax, 288             ; (4096 + 512) / 16 bytes per paragraph
   mov ss, ax
   mov sp, 4096
   mov ax, 07C0h           ; Set data segment to where we're loaded
   mov ds, ax
   mov al, 0xEE
   mov bl, 2
   call printi8
 hang:
   jmp hang
   %include 'printi8.asm'
   times 510-($-$$) db 0   ; Fill the file with 0's
   dw 0AA55h               ; End the file with AA55
 ,,,

 VARIABLES ....

 Variables in assembly language are just buffers of initialised or
 reserved but uninitialised data. No more, no less. The rest is up to you.

 * move the value in the DL register into memory
 ------------
   jmp start
   drive db 0
 start:
   mov ax, 07C0h           ; first set data-segment=code-segment
   mov ds, ax              ; so that [drive] points to where it should
   mov [drive], dl
   mov ah, 0eh
   mov al, [drive]         ; prints the value as an ascii character
   int 10h                 ; not useful but better than nothing
 hang:
   jmp hang
 ,,,

 * print the first two characters of a string
 ------------
   jmp start
   message db 'hello!'
 start:
   mov ax, 07C0h           ; set the data segment, or [message]
   mov ds, ax              ; will not point at our string
   mov ah, 0eh
   mov al, [message]
   int 10h
   mov al, [message+1]
   int 10h
 hang:
   jmp hang
 ,,,

 VIM AND ASM

 Vim can be used to compile and run bootable assembly code with qemu.

 * a command to write the next assembly procedure to its own file
 >> command! -nargs=1 Asp /^ *[a-z0-9]\+:/,/^ *ret *$/w .asm

 * map the key sequence ';as' to compile the whole file with nasm
 >> map ;as :!nasm -f bin % -o %:r.bin

 In the examples below, the complete assembly program is supposed to
 be between 2 'markers' within a document. The markers are '---' on a
 line by itself and ',,,' on a line by itself. These 2 markers mark the
 beginning and end of the assembly program within the document.
 * just compile an assembly program within a document
 >> map ;cc :?^ *---?+1,/,,,/-1w ! ( cat - ) > test.asm; nasm -fbin -o test.bin test.asm;

 * compile and run a fragment of boot assembly with nasm and qemu
 >> map ;aa :?^ *---?+1,/,,,/-1w ! ( cat - ) > test.asm; nasm -fbin -o test.bin test.asm; mkdosfs -C test.flp 1440; dd status=noxfer conv=notrunc if=test.bin of=test.flp; qemu-system-i386 -noframe -fda test.flp

 * no qemu window decorations, stop with control-c
 >> qemu-system-i386 -noframe test.flp

 * qemu fullscreen, stop with control-c
 >> qemu-system-i386 -full-screen test.flp

 The ';aa' mapping above may have a problem if the test.flp file already
 exists and is no good, since mkdosfs will not overwrite it.

 The following is useful for determining how much space is left within
 a boot file (which is limited to 512 bytes)

 * see how big a compiled file is without 512 byte padding
 >> map ;bb :?^ *---?+1,/,,,/-1w ! ( sed -n '/times/\!p' ) > test.asm; nasm -fbin -o test.bin test.asm; ls -la

 DOCUMENT NOTES

 This section contains some meta information about the document.

 document-history:
   @@ 2014
     document begun inspired by mikeos
   @@ 17 march 2015
     a little bit more work, trying to amplify the forth like section

 DANIELS NASM BOOT TIPS

 xxx http://home.swipnet.se/smaffy/asm/info/nasmBoot.txt
 author: Daniel Marjamäki (daniel.marjamaki@home.se)

 The basics
 ----------
 These are the rules that you must follow:
 - The BIOS will load your bootloader at address 07C00h.
   Sadly, the segment and offset vary.
 - Bootstraps must be compiled as plain binary files.
 - The filesize for the plain binary file must be 512 bytes.
 - The file must end with AA55h.

 A minimal bootstrap
 -------------------
 This bootstrap just hangs:

   ; HANG.ASM
   ; A minimal bootstrap
   hang:                   ; Hang!
     jmp hang
     times 510-($-$$) db 0 ; Fill the file with 0's
     dw 0AA55h             ; End the file with AA55

 note: $ means the current memory offset in the assembled machine code.
$$ the beginning memory offset The last instruction puts AA55 at the end of the file. To compile the bootstrap, use this command: nasm hang.asm -o hang.bin If you want to test the bootstrap, you must first put it on the first sector on a floppy disk. You can for example use 'dd' or 'rawrite'. When the bootstrap is on the floppy, test it by restarting your computer with the floppy inserted. The computer should hang then. The memory problem ------------------ There is a memory problem. As I've written bootstraps are always loaded to address 07C00. We don't know what segment and offset the BIOS has put us in. The segment can be anything between 0000 and 07C0. This is a problem when we want to use variables. The solution is simple. Begin your bootstrap by jumping to your bootstrap, but jump to a known segment. Here is an example: ; JUMP.ASM ; Make a jump and then hang ; Tell the compiler that this is offset 0. ; It isn't offset 0, but it will be after the jump. [ORG 0] jmp 07C0h:start ; Goto segment 07C0 start: ; Update the segment registers mov ax, cs mov ds, ax mov es, ax hang: ; Hang! jmp hang times 510-($-$$) db 0 dw 0AA55h If you compile and test this bootstrap, there will be no visible difference to the minimal bootstrap presented earlier. The computer will just hang. Some exercises -------------- 1. Create a bootstrap that outputs "====" on the screen, and then hangs. Tip: modify the jump.asm program. 2. Create a bootstrap that outputs "Hello Cyberspace!" and hangs. 3. Create a bootstrap that loads a program off the floppy disk and jumps to it. Solutions to the exercises -------------------------- 1. ; 1.ASM ; Print "====" on the screen and hang ; Tell the compiler that this is offset 0. ; It isn't offset 0, but it will be after the jump. [ORG 0] jmp 07C0h:start ; Goto segment 07C0 start: ; Update the segment registers mov ax, cs mov ds, ax mov es, ax mov ah, 9 ; Print "====" mov al, '=' ; mov bx, 7 ; mov cx, 4 ; int 10h ; hang: ; Hang! 
     jmp hang
     times 510-($-$$) db 0
     dw 0AA55h

 2.
   ; 2.ASM
   ; Print "Hello Cyberspace!" on the screen and hang
   ; Tell the compiler that this is offset 0.
   ; It isn't offset 0, but it will be after the jump.
   [ORG 0]
     jmp 07C0h:start       ; Goto segment 07C0
   ; Declare the string that will be printed
   ; (the 0 at the end tells the print loop where to stop)
   msg db 'Hello Cyberspace!', 0
   start:
     ; Update the segment registers
     mov ax, cs
     mov ds, ax
     mov es, ax
     mov si, msg           ; Print msg
   print:
     lodsb                 ; AL=memory contents at DS:SI
     cmp al, 0             ; If AL=0 then hang
     je hang
     mov ah, 0Eh           ; Print AL
     mov bx, 7
     int 10h
     jmp print             ; Print next character
   hang:                   ; Hang!
     jmp hang
     times 510-($-$$) db 0
     dw 0AA55h

 3.
   ; 3.ASM
   ; Load a program off the disk and jump to it
   ; Tell the compiler that this is offset 0.
   ; It isn't offset 0, but it will be after the jump.
   [ORG 0]
     jmp 07C0h:start       ; Goto segment 07C0
   start:
     ; Update the segment registers
     mov ax, cs
     mov ds, ax
     mov es, ax
   reset:                  ; Reset the floppy drive
     mov ax, 0             ;
     mov dl, 0             ; Drive=0 (=A)
     int 13h               ;
     jc reset              ; ERROR => reset again
   read:
     mov ax, 1000h         ; ES:BX = 1000:0000
     mov es, ax            ;
     mov bx, 0             ;
     mov ah, 2             ; Load disk data to ES:BX
     mov al, 5             ; Load 5 sectors
     mov ch, 0             ; Cylinder=0
     mov cl, 2             ; Sector=2
     mov dh, 0             ; Head=0
     mov dl, 0             ; Drive=0
     int 13h               ; Read!
     jc read               ; ERROR => Try again
     jmp 1000h:0000        ; Jump to the program
     times 510-($-$$) db 0
     dw 0AA55h

 This is a small loadable program.

   ; PROG.ASM
     mov ah, 9
     mov al, '='
     mov bx, 7
     mov cx, 10
     int 10h
   hang:
     jmp hang

 This program creates a disk image file that contains both the
 bootstrap and the small loadable program.

   ; IMAGE.ASM
   ; Disk image
   %include '3.asm'
   %include 'prog.asm'

 Finally
 -------
 Thanks for reading. Email me any suggestions, comments, questions, ...
 If you don't use NASM and are having problems with the code, you
 should contact me. Together we can solve it.
 MIKEOS SIMPLE OS GUIDE

 How to write a simple operating system

 This document shows you how to write and build your first operating
 system in x86 assembly language. It explains what you need, the
 fundamentals of the PC boot process and assembly language, and how to
 take it further. The resulting OS will be very small (fitting into a
 bootloader) and have very few features, but it's a starting point for
 you to explore further. After you have read the guide, see the MikeOS
 project for a bigger x86 assembly language OS that you can explore to
 expand your skills.
 __________________________________________________________________

 Requirements

 Prior programming experience is essential. If you've done some coding
 in a high-level language like PHP or Java, that's good, but ideally
 you'll have some knowledge of a lower-level language like C, especially
 on the subject of memory and pointers.

 For this guide we're using Linux. OS development is certainly possible
 on Windows, but it's so much easier on Linux as you can get a complete
 development toolchain in a few mouse-clicks/commands. Linux is also
 really good for making floppy disk and CD-ROM images - you don't need
 to install loads of fiddly programs.

 Installing Linux is very easy these days; grab Ubuntu and install it in
 VMware or VirtualBox if you don't want to dual-boot. When you're in
 Ubuntu, get all the tools you need to follow this guide by entering
 this in a terminal window:

   sudo apt-get install build-essential qemu nasm

 This gets you the development toolchain (compiler etc.), QEMU PC
 emulator and the NASM assembler, which converts assembly language into
 raw machine code executable files.
 __________________________________________________________________

 PC primer

 If you're writing an OS for x86 PCs (the best choice, due to the huge
 amount of documentation available), you'll need to understand the
 basics of how a PC starts up. Fortunately, you don't need to dwell on
 complicated subjects such as graphics drivers and network protocols, as
 you'll be focusing on the essential parts first.

 When a PC is powered-up, it starts executing the BIOS (Basic
 Input/Output System), which is essentially a mini-OS built into the
 system. It performs a few hardware tests (eg memory checks) and
 typically spurts out a graphic (eg Dell logo) or diagnostic text to the
 screen. Then, when it's done, it starts to load your operating system
 from any media it can find. Many PCs jump to the hard drive and start
 executing code they find in the Master Boot Record (MBR), a 512-byte
 section at the start of the hard drive; some try to find executable
 code on a floppy disk (boot sector) or CD-ROM.

 This all depends on the boot order - you can normally specify it in the
 BIOS options screen. The BIOS loads 512 bytes from the chosen media
 into memory, and begins executing it. This is the bootloader, the small
 program that then loads the main OS kernel or a larger boot program (eg
 GRUB/LILO for Linux systems). This 512 byte bootloader has two special
 numbers at the end to tell the BIOS that it's a boot sector - we'll
 cover that later.

 Note that PCs have an interesting feature for booting. Historically,
 most PCs had a floppy drive, so the BIOS was configured to boot from
 that device. Today, however, many PCs don't have a floppy drive - only
 a CD-ROM - so a hack was developed to cater for this. When you're
 booting from a CD-ROM, it can emulate a floppy disk; the BIOS reads the
 CD-ROM drive, loads in a chunk of data, and executes it as if it was a
 floppy disk.
This is incredibly useful for us OS developers, as we can make floppy disk versions of our OS, but still boot it on CD-only machines. (Floppy disks are really easy to work with, whereas CD-ROM filesystems are much more complicated.) So, to recap, the boot process is: 1. Power on: the PC starts up and begins executing the BIOS code. 2. The BIOS looks for various media such as a floppy disk or hard drive. 3. The BIOS loads a 512 byte boot sector from the specified media and begins executing it. 4. Those 512 bytes then go on to load the OS itself, or a more complex bootloader. For MikeOS, we have the 512-byte bootloader, which we write to a floppy disk image file (a virtual floppy). We can then inject that floppy image into a CD, for PCs that only have CD-ROM drives. Either way, the BIOS loads it as if it was on a floppy, and starts executing it. We have control of the system! __________________________________________________________________ Assembly language primer Most modern operating systems are written in C/C++. That's very useful when portability and code-maintainability are crucial, but it adds an extra layer of complexity to the proceedings. For your very first OS, you're better off sticking with assembly language, as used in MikeOS. It's more verbose and non-portable, but you don't have to worry about compilers and linkers. Besides, you need a bit of assembly to kick-start any OS. Assembly language (or colloquially "asm") is a textual way of representing the instructions that a CPU executes. For instance, an instruction to move some memory in the CPU may be 11001001 01101110 - but that's hardly memorable! So assembly provides mnemonics to substitute for these instructions, such as mov ax, 30. They correlate directly with machine-code CPU instructions, but without the meaningless binary numbers. Like most programming languages, assembly is a list of instructions followed in order. 
You can jump around between various places and set up subroutines/functions, but it's much more minimal than C# and friends. You can't just print "Hello world" to the screen - the CPU has no concept of what a screen is! Instead, you work with memory, manipulating chunks of RAM, performing arithmetic on them and putting the results in the right place. Sounds scary? It's a bit alien at first, but it's not hard to grasp. At the assembly language level, there is no such thing as variables in the high-level language sense. What you do have, however, is a set of registers, which are on-CPU memory stores. You can put numbers into these registers and perform calculations on them. In 16-bit mode, these registers can hold numbers between 0 and 65535. Here's a list of the fundamental registers on a typical x86 CPU: AX, BX, CX, DX General-purpose registers for storing numbers that you're using. For instance, you may use AX to store the character that has been pressed on the keyboard, while using CX to act as a counter in a loop. (Note: these 16-bit registers can be split into 8-bit registers such as AH/AL, BH/BL etc.) SI, DI Source and destination data index registers. These point to places in memory for retrieving and storing data. SP The Stack Pointer (explained in a moment). IP (sometimes CP) The Instruction/Code Pointer. This contains the location in memory of the instruction being executed. When an instruction has finished, it is incremented and moves on to the next instruction. You can change the contents of this register to move around in your code. So you can use these registers to store numbers as you work - a bit like variables, but they're much more fixed in size and purpose. There are a few others, notably segment registers. Due to limitations in old PCs, memory was handled in 64K chunks called segments. This is a really messy subject, but thankfully you don't have to worry about it - for the time being, your OS will be less than a kilobyte anyway! 
In MikeOS, we limit ourselves to a single 64K segment so that we don't have to mess around with segment registers. The stack is an area of your main RAM used for storing temporary information. It's called a stack because numbers are stacked one-on-top of another. Imagine a Pringles tube: if you put in a playing card, an iPod Shuffle and a beermat, you'll pull them out in the reverse order (beermat, then iPod, and finally playing card). It's the same with numbers: if you push the numbers 5, 7 and 15 onto the stack, you will pop them out as 15 first, then 7, and lastly 5. In assembly, you can push registers onto the stack and pop them out later - it's useful when you want to store temporarily the value of a register while you use that register for something else. PC memory can be viewed as a linear line of pigeon-holes ranging from byte 0 to whatever you have installed (millions of bytes on modern machines). At byte number 53,634,246 in your RAM, for instance, you may have your web browser code to view this document. But whereas we humans count in powers of 10 (10, 100, 1000 etc. - decimal), computers are better off with powers of two (because they're based on binary). So we use hexadecimal, which is base 16, as a way of representing numbers. See this chart to understand: Decimal 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Hexadecimal 0 1 2 3 4 5 6 7 8 9 A B C D E F 10 11 12 13 14 As you can see, whereas our normal decimal system uses 0 - 9, hexadecimal uses 0 - F in counting. It's a bit weird at first, but you'll get the hang of it. In assembly programming, we identify hexadecimal (hex) numbers by tagging a 'h' onto the end - so 0Ah is hex for the number 10. (You can also denote hexadecimal in assembly by prefixing the number with 0x - for instance, 0x0A.) Let's finish off with a few common assembly instructions. These move memory around, compare them and perform calculations. 
 They're the building blocks of your OS - there are hundreds of
 instructions, but you don't have to memorise them all, because the most
 important handful are used 90% of the time.

 mov
   Copies memory from one location or register to another. For
   instance, mov ax, 30 places the number 30 into the AX register.
   Using square brackets, you can get the number at the memory location
   pointed to by the register. For instance, if BX contains 80, then
   mov ax, [bx] means "get the number in memory location 80, and put it
   into AX". You can move numbers between registers too: mov bx, cx.

 add / sub
   Adds a number to a register. add ax, 0FFh adds FF in hexadecimal
   (255 in our normal decimal) to the AX register. (NASM needs the
   leading 0 when a hex number starts with a letter.) You can use sub
   in the same way: sub dx, 50.

 cmp
   Compares a register with a number. cmp cx, 12 compares the CX
   register with the number 12. It then updates a special register on
   the CPU called FLAGS, which contains information about the last
   operation. In this case, if the number 12 is bigger than the value
   in CX, it generates a negative result, and notes that negative in
   the FLAGS register. We can use this in the following instructions...

 jmp / jg / jl...
   Jump to a different part of the code. jmp label jumps (GOTOs) to the
   part of our source code where we have label: written. But there's
   more - you can jump conditionally, based on the CPU flags set in the
   previous command. For instance, if a cmp instruction determined that
   a register held a smaller value than the one with which it was
   compared, you can act on that with jl label (jump if less-than to
   label). Similarly, jge label jumps to 'label' in the code if the
   value in the cmp was greater-than or equal to its compared number.

 int
   Interrupt the program and jump to a specified place in memory.
   Operating systems set up interrupts which are analogous to
   subroutines in high-level languages. For instance, in MS-DOS, the
   21h interrupt provides DOS services (eg opening a file).
Typically, you put a value in the AX register, then call an interrupt and wait for a result (passed back in a register too). When you're writing an OS from scratch, you can call the BIOS with int 10h, int 13h, int 14h or int 16h to perform tasks like printing strings, reading sectors from a floppy disk etc. Let's look at some of these instructions in a little more detail. Consider the following code snippet: mov bx, 1000h mov ax, [bx] cmp ax, 50 jge label ... label: mov ax, 10 In the first instruction, we move the number 1000h into the BX register. Then, in the second instruction, we store in AX whatever is in the memory location pointed to by BX. This is what the [bx] means: if we just did mov ax, bx it'd simply copy the number 1000h into the AX register. But by using square brackets, we're saying: don't just copy the contents of BX into AX, but copy the contents of the memory address to which BX points. Given that BX contains 1000h, this instruction says: find whatever is at memory location 1000h, and put it into AX. So, if the byte of memory at location 1000h contains 37, then that number 37 will be put into the AX register via our second instruction. Next up, we use cmp to compare the number in AX with the number 50 (the decimal number 50 - we didn't suffix it with 'h'). The following jge instruction acts on the cmp comparison, which has set the FLAGS register as described earlier. The jge label says: if the result from the previous comparison is greater than or equal, jump to the part of the code denoted by label:. So if the number in AX is greater than or equal to 50, execution jumps to label:. If not, execution continues at the '...' stage. One last thing: you can insert data into a program with the db (define byte) directive. 
For instance, this defines a series of bytes with the number zero at the end, representing a string: mylabel: db 'Message here', 0 In our assembly code, we know that a string of characters, terminated by a zero, can be found at the mylabel: position. We could also set up single byte to use somewhat like a variable: foo: db 0 Now foo: points at a single byte in the code, which in the case of MikeOS will be writable as the OS is copied completely to RAM. So you could have this instruction: mov byte al, [foo] This moves the byte pointed to by foo into the AL register. That's the essentials of x86 PC assembly language, and enough to get you started. When writing an OS, though, you'll need to learn much more as you progress, so see the [7]Resources section for links to more in-depth assembly tutorials. __________________________________________________________________ Your first OS Now you're ready to write your first operating system kernel! Of course, this is going to be extremely bare-bones, just a 512-byte boot sector as described earlier, but it's a starting point for you to expand further. Paste the following code into a file called myfirst.asm and save it into your home directory - this is the source code to your first OS. BITS 16 start: mov ax, 07C0h ; Set up 4K stack space after this bootloader add ax, 288 ; (4096 + 512) / 16 bytes per paragraph mov ss, ax mov sp, 4096 mov ax, 07C0h ; Set data segment to where we're loaded mov ds, ax mov si, text_string ; Put string position into SI call print_string ; Call our string-printing routine jmp $ ; Jump here - infinite loop! 
text_string db 'This is my cool new OS!', 0 print_string: ; Routine: output string in SI to screen mov ah, 0Eh ; int 10h 'print char' function .repeat: lodsb ; Get character from string cmp al, 0 je .done ; If char is zero, end of string int 10h ; Otherwise, print it jmp .repeat .done: ret times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s dw 0xAA55 ; The standard PC boot signature Let's step through this. The BITS 16 line isn't an x86 CPU instruction; it just tells the NASM assembler that we're working in 16-bit mode. NASM can then translate the following instructions into raw x86 binary. Then we have the start: label, which isn't strictly needed as execution begins right at the start of the file anyway, but it's a good marker. From here onwards, note that the semicolon (;) character is used to denote non-executable text comments - we can put anything there. The following six lines of code aren't really of interest to us - they simply set up the segment registers so that the stack pointer (SP) knows where our handy stack of temporary data is, and where the data segment (DS) is located. As mentioned, segments are a hideously messy way of handling memory from the old 16-bit days, but we just set up the segment registers and forget about them. (The references to 07C0h are the equivalent segment location at which the BIOS loads our code, so we start from there.) The next part is where the fun happens. The mov si, text_string line says: copy the location of the text string below into the SI register. Simple enough! Then we use call, which is like a GOSUB in BASIC or a function call in C. It means: jump to the specified section of code, but prepare to come back here when we're done. How does the code know how to do that? Well, when we use a call instruction, the CPU increments the position of the IP (Instruction Pointer) register and pushes it onto the stack. 
You may recall from the previous explanation of the stack that it's a last-in first-out memory storage mechanism. All that business with the stack pointer (SP) and stack segment (SS) at the start cleared a space for the stack, so that we can drop temporary numbers there without overwriting our code. So, the call print_string says: jump to the print_string routine, but push the location of the next instruction onto the stack, so we can pop it off later and resume execution here. Execution has jumped over to print_string: - this routine uses the BIOS to output text to the screen. First we put 0Eh into the AH register (the upper byte of AX). Then we have a lodsb (load string byte) instruction, which retrieves a byte of data from the location pointed to by SI, and stores it in AL (the lower byte of AX). Next we use cmp to check if that byte is zero - if so, it's the end of the string and we quit printing (jump to the .done label). If it's not zero, we call int 10h (interrupt our code and go to the BIOS), which reads the value in the AH register (0Eh) we set up before. Ah, says the BIOS - 0Eh in the AH register means "print the character in the AL register to the screen!". So the BIOS prints the first character in our string, and returns from the int call. We then jump to the .repeat label, which starts the process again - lodsb to load the next byte from SI (it increments SI each time), see if it's zero and decide what to do. The ret at the end of our string-printing routine means: "we've finished here - return back to the place where we were called by popping the code location from the stack back into the IP register". So there we have a demonstration of a loop, in a standalone routine. You can see that the text_string label is alongside a stream of characters, which we insert into our OS using db. The text is in apostrophes so that NASM knows it's not code, and at the end we have a zero to tell our print_string routine that we're at the end. 
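 The description of call and ret above can be tried out by doing the
 work of call by hand. The sketch below is not from the MikeOS guide;
 it is the same program with 'call print_string' replaced by a push and
 a jmp, and the label 'after_call' is made up for the example. It
 assumes a far jump at the start to force CS=07C0h, because the pushed
 return address is an offset counted from the start of the file and ret
 will use it together with CS. 'push' with a label needs an 80186 or
 later CPU.

 * a sketch: 'call' written out as push + jmp
 ----------
   BITS 16
   jmp 07C0h:start         ; far jump: CS=07C0h, so offsets start at 0
 start:
   mov ax, cs
   mov ds, ax              ; data segment = code segment
   add ax, 288             ; 4K stack space after this bootloader
   mov ss, ax
   mov sp, 4096
   mov si, text_string     ; Put string position into SI
   push after_call         ; step 1 of 'call': save the return address
   jmp print_string        ; step 2 of 'call': jump to the routine
 after_call:
   jmp $                   ; ret brings us back here
   text_string db 'call = push + jmp', 0
 print_string:             ; Routine: output string in SI to screen
   mov ah, 0Eh             ; int 10h 'print char' function
 .repeat:
   lodsb                   ; Get character from string
   cmp al, 0
   je .done                ; If char is zero, end of string
   int 10h                 ; Otherwise, print it
   jmp .repeat
 .done:
   ret                     ; pops 'after_call' from the stack into IP
   times 510-($-$$) db 0   ; Pad remainder of boot sector with 0s
   dw 0xAA55               ; The standard PC boot signature
 ,,,

 The routine itself is unchanged: ret does not care whether the address
 on the stack was put there by call or by a plain push.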
  Let's recap: we start off by setting up the segment registers so that
  our OS knows where the stack and our data reside. Then we point the SI
  register at a string in our OS binary, and call our string-printing
  routine. This routine scans through the characters pointed to by SI
  and displays them until it finds a zero, at which point it returns
  back into the code that called it. Then the jmp $ line says: keep
  jumping to the same line. (The '$' in NASM denotes the address of the
  current line of code.) This sets up an infinite loop, so that the
  message is displayed and our OS doesn't try to execute the following
  string!

  The final two lines are interesting. For a PC to recognise a valid
  floppy disk boot sector, it has to be exactly 512 bytes in size and
  end with the bytes 55h and AAh (the boot signature). So the first of
  these lines says: pad out our resulting binary file with zeros until
  it is 510 bytes in size. Then the second line uses dw (define a word -
  two bytes) to add the boot signature. The word 0xAA55 is stored low
  byte first, so the file ends with 55h followed by AAh. Voila: a 512
  byte boot file with the correct numbers at the end for the BIOS to
  recognise.

  Let's build our new OS. In a terminal window, in your home directory,
  enter:

  >> nasm -f bin -o myfirst.bin myfirst.asm

  Here we assemble the code from our text file into a raw binary file of
  machine-code instructions. With the -f bin flag, we tell NASM that we
  want a plain binary file (not a complicated Linux executable - we want
  it as plain as possible!). The -o myfirst.bin part tells NASM to
  generate the resulting binary in a file called myfirst.bin.

  Now we need a virtual floppy disk image to which we can write our
  bootloader-sized kernel. Copy mikeos.flp from the disk_images/
  directory of the MikeOS bundle into your home directory, and rename it
  myfirst.flp. Then enter:

  >> dd status=noxfer conv=notrunc if=myfirst.bin of=myfirst.flp

  This uses the 'dd' utility to directly copy our kernel to the first
  sector of the floppy disk image. The conv=notrunc option stops dd from
  cutting the image file short after the 512 bytes it writes.
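  The padding and signature rules can be checked on the host with a
  short Python script. This is a sketch under assumptions: the function
  names make_boot_sector and is_bootable are invented here, and the two
  code bytes EBh FEh (a short jump to itself) merely stand in for real
  assembled output.

```python
import struct

SECTOR_SIZE = 512


def make_boot_sector(code: bytes) -> bytes:
    """Pad code to 510 bytes with zeros and append the boot signature."""
    if len(code) > SECTOR_SIZE - 2:
        raise ValueError("code does not fit in a 512 byte boot sector")
    padding = bytes(SECTOR_SIZE - 2 - len(code))  # like: times 510-($-$$) db 0
    signature = struct.pack("<H", 0xAA55)         # like: dw 0xAA55 (low byte first)
    return code + padding + signature


def is_bootable(sector: bytes) -> bool:
    """True if the sector is 512 bytes long and ends with 55h AAh."""
    return len(sector) == SECTOR_SIZE and sector[510:] == b"\x55\xaa"


sector = make_boot_sector(b"\xeb\xfe")  # stand-in code: a jump-to-self loop
print(len(sector), sector[510:].hex(), is_bootable(sector))
# prints: 512 55aa True
```

  To check a real image, is_bootable can be given the first 512 bytes
  read from a file such as myfirst.flp, opened in binary mode.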
  Once the kernel has been written to the image, we can boot our new OS
  using the QEMU PC emulator as follows:

  >> qemu -fda myfirst.flp
  >> qemu-system-i386 -fda myfirst.flp   ##(where there is no plain 'qemu' command)

  And there you are! Your OS will boot up in a virtual PC. If you want
  to use it on a real PC, you can write the floppy disk image to a real
  floppy and boot from it, or generate a CD-ROM ISO image. For the
  latter, make a new directory called cdiso and move the myfirst.flp
  file into it. Then, in your home directory, enter:

  >> mkisofs -o myfirst.iso -b myfirst.flp cdiso/

  This generates a CD-ROM ISO image called myfirst.iso with bootable
  floppy disk emulation, using the virtual floppy disk image from
  before. Now you can burn that ISO to a CD-R and boot your PC from it!
  (Note that you need to burn it as a direct ISO image and not just copy
  it onto a disc.)

  Next you'll want to improve your OS - explore the MikeOS source code
  to get some inspiration. Remember that bootloaders are limited to 512
  bytes, so if you want to do a lot more, you'll need to make your
  bootloader load a separate file from the disk and begin executing it,
  in the same fashion as MikeOS.

GOING FURTHER

  So, you've now got a very simple bootloader-based operating system
  running. What next? Here are some ideas:

  * Add more routines -- You already have print_string in your kernel.
    You could add routines to get strings, move the cursor etc. Search
    the internet for BIOS calls which you can use to achieve these.

  * Load files -- The bootloader is limited to 512 bytes, so you don't
    have much room. You could make the bootloader load subsequent
    sectors on the disk into RAM, and jump to that point to continue
    execution. Or you could read up on FAT12, the filesystem used on
    floppy drives, and implement that. (See source/bootload/bootload.asm
    in the MikeOS .zip for an implementation.)

DOCUMENT-NOTES:

   # this section contains information about the document and
   # will not normally be printed.
   # A small (16x16) icon image to identify the book
   document-icon:
   # A larger image to identify or illustrate the title page
   document-image:
   # what sort of document is this
   document-type: book
   # in what kind of state (good or bad) is this document
   document-quality: just begun
   document-history:
     @@ 10 nov 2011
       started this booklet after seeing the 'mikeos' site and
       realising that writing a bootable program in realmode x86 code
       should not be that difficult. Also found a good page describing
       how to load a program from a floppy image and execute it (to
       overcome the 512 byte boot sector limit)
     @@ date
       work description
   # who wrote this
   authors: mjbishop
   # a short description of the contents, possibly used for doc lists
   short-description:
   # A computer language which is contained in the document, if any
   code-language:
   # the script which will be used to produce html (a webpage)
   make-html: ./book-html.sh
   # the script which will produce 'LaTeX' output (for printing, pdf etc)
   make-latex: ./booktolatex.cgi