

Boxes within Boxes
This sounds like Eastern mysticism, but it's just an observation from life: Within any action is a host of smaller actions. Look inside your common activities. When you brush your teeth you do the following:
Pick up your toothpaste tube.
Unscrew the cap.
Place the cap on the sink counter.
Pick up your toothbrush.
Squeeze toothpaste onto the brush from the middle of the tube.
Put your toothbrush into your mouth.
Work it back and forth vigorously.
And so on. The original list went the entire page. When you brush your teeth, you perform every one of those actions. However, when you think about the sequence, you don't run through the whole list. You bring to mind the simple concept "brushing teeth."
Furthermore, when you think about what's behind the action we call "getting up in the morning," you might assemble a list of activities like this:
Shut off the clock radio.
Climb out of bed.
Put on your robe.
Let the dogs out.
Make breakfast.
Brush your teeth.
Shave.
Shower.
Get dressed.
Brushing your teeth is on the list, but within the activity you call "brushing your teeth" is a whole list of smaller actions, as listed previously. The same can be said for most of the activities shown in the preceding list. How many individual actions, for example, does it take to put a reasonable breakfast together? And yet in one small, if sweeping, phrase, "getting up in the morning," you embrace that whole host of small and even smaller actions without having to laboriously trace through each one.
What I'm describing is the "Chinese boxes" method of fighting complexity. Getting up in the morning involves hundreds of little actions, so we divide the mass up into coherent chunks and set the chunks into little conceptual boxes. "Making breakfast" is in one box, "brushing teeth" is in another, and so on. Closer inspection of any box shows that its contents can also be divided into numerous boxes, and those smaller boxes into even smaller boxes.
This process doesn't (and can't) go on forever, but it should go on as long as it needs to in order to satisfy this criterion: The contents of any one box should be understandable with only a little scrutiny. No single box should contain anything so subtle or large and involved that it takes hours of hair-pulling to figure it out.
Procedures as Boxes for Code
The mistake I made in writing my APL text formatter is that I threw the whole collection of 600 lines of APL code into one huge box marked "text formatter."
While I was writing it, I should have been keeping my eyes open for sequences of code statements that worked together at some identifiable task. When I spotted such sequences, I should have set them off as procedures. Each sequence would then have a name that would provide a memory tag for the sequence's function. If it took 10 statements to justify a line of text, those 10 statements should have been named JustifyLine, and so on.
Xerox's legendary APL programmer Jim Dunn later told me that I shouldn't ever write an APL procedure that wouldn't fit on a single 25-line terminal screen. "More than 25 lines and you're doing too much in one procedure. Split it up," he said. Whenever I worked in APL after that, I adhered to that rather sage rule of thumb. The Martians still struck from time to time, but when they did, it was no longer a total loss.
All computer languages have procedures of one sort or another, and assembly language is no exception. Your assembly language program may have numerous procedures. There's no limit to the number of procedures, as long as the total number of bytes of code contained by all the procedures together does not exceed 65,536 (one segment). Other complications arise at that point, but there are mechanisms in assembly language to deal sensibly with those complications.
But that's a lot of code. You needn't worry for a while, and certainly not while you're just learning assembly language. (I won't be treating the creation of multiple code segments in this book.) In the meantime, let's take a look at the "Eat at Joe's" program, expanded a little to include a couple of procedures:
; Source name     : EAT2.ASM
; Executable name : EAT2.COM
; Code model      : Real Mode Flat Model
; Version         : 1.0
; Created date    : 7/31/1999
; Last update     : 9/11/1999
; Author          : Jeff Duntemann
; Description     : A simple example of a DOS .COM file programmed using
;                   NASM-IDE 1.1 and NASM 0.98 and incorporating procedures.

[BITS 16]                  ; Set 16 bit code generation
[ORG 0x0100]               ; Set code start address to 100h (COM file)

[SECTION .text]            ; Section containing code

Start:
        mov DX,EatMsg1     ; Load offset of Eat1 string into DX
        call Writeln       ;  and display it
        mov DX,EatMsg2     ; Load offset of Eat2 string into DX
        call Writeln       ;  and display it

        mov ax,04C00H      ; This function exits the program
        int 21H            ;  and returns control to DOS.

;-----------------------------|
;      PROCEDURE SECTION      |
;-----------------------------|

Write:
        mov AH,09H         ; Select DOS service 9: Print String
        int 21H            ; Call DOS
        ret                ; Return to the caller

Writeln:
        call Write         ; Display the string proper through Write
        mov DX,CRLF        ; Load offset of newline string to DX
        call Write         ; Display the newline string through Write
        ret                ; Return to the caller

;-----------------------------|
;        DATA SECTION         |
;-----------------------------|

[SECTION .data]            ; Section containing initialized data

EatMsg1 DB "Eat at Joe's . . . ",'$'
EatMsg2 DB "...ten million flies can't ALL be wrong!",'$'
CRLF    DB 0DH,0AH,'$'
Calling and Returning
EAT2.ASM does about the same thing as EAT.ASM. It prints a second line as part of the advertising slogan, and that's all in the line of functional innovation. The way the two lines of the slogan are displayed, however, bears examination:
        mov DX,EatMsg1     ; Load offset of Eat1 string into DX
        call Writeln       ;  and display it
Here's a new machine instruction: CALL. The label Writeln refers to a procedure. As you might have gathered (especially if you've programmed in an older language such as Basic or FORTRAN), CALL Writeln simply tells the CPU to go off and execute a procedure named Writeln.
The means by which CALL operates may sound familiar: CALL first pushes the address of the next instruction after itself onto the stack. Then CALL transfers execution to the address represented by the name of the procedure. The instructions contained in the procedure execute. Finally, the procedure is terminated by CALL's alter ego: RET (for RETurn). The RET instruction pops the address off the top of the stack and transfers execution to that address. Since the address pushed was the address of the first instruction after the CALL instruction, execution continues as though CALL had not changed the flow of instruction execution at all. See Figure 9.1.

Figure 9.1: Calling a procedure and returning.
This should remind you strongly of how software interrupts work. The main difference is that the caller does know the exact address of the routine it wishes to call. Apart from that, it's very close to being the same process. (Also note that RET and IRET are not interchangeable. CALL works with RET just as INT works with IRET. Don't get those return instructions confused!)
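The two steps that CALL performs can be spelled out by hand, which makes the mechanism easier to see. What follows is a minimal sketch, not a listing from the book: the label Back and the message text are made up for illustration, and in real code you would simply write call Write. It assumes the same .COM setup as EAT2.ASM.

```nasm
; A sketch only: CALL spelled out as PUSH plus JMP.
; The label Back and the message text are hypothetical.

[BITS 16]                   ; Set 16 bit code generation
[ORG 0x0100]                ; Set code start address to 100h (COM file)

[SECTION .text]             ; Section containing code

Start:
        mov  DX,Msg         ; Load offset of the message into DX

        ; The next three instructions do by hand what "call Write" does:
        mov  AX,Back        ; AX = address of the instruction after the "call"
        push AX             ; 1. Push that return address onto the stack
        jmp  Write          ; 2. Transfer execution to the procedure
Back:
        mov  AX,04C00H      ; RET in Write brought us back here; exit to DOS
        int  21H

Write:
        mov  AH,09H         ; Select DOS service 9: Print String
        int  21H            ; Call DOS
        ret                 ; Pop the return address and jump to it

[SECTION .data]             ; Section containing initialized data

Msg     DB "Hello from Write",0DH,0AH,'$'
```

RET neither knows nor cares how the return address got onto the stack. It pops whatever word is on top and jumps there, which is why the stack must be in the same state at RET as it was on entry to the procedure.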
The structure of a procedure is simple and easy to understand. Look at the Write procedure from EAT2.ASM:
Write:
        mov AH,09H         ; Select DOS service 9: Print String
        int 21H            ; Call DOS
        ret                ; Return to the caller
The important points are these: A procedure must begin with a label, which is (as you should recall) an identifier followed by a colon. Also, somewhere within the procedure, and certainly as the last instruction in the procedure, there must be at least one RET instruction. Execution has to come back from a procedure by way of a RET instruction, but there can be more than one exit door from a procedure. Using more than one RET instruction requires the use of conditional jump instructions, which I won't take up until the next chapter.
Calls within Calls
Within a procedure you can do anything that you can do within the main program. This includes calling other procedures from within a procedure. Even something as simple as EAT2.ASM does that. Look at the Writeln procedure:
Writeln:
        call Write         ; Display the string proper through Write
        mov DX,CRLF        ; Load offset of newline string to DX
        call Write         ; Display the newline string through Write
        ret                ; Return to the caller
The Writeln procedure displays a string to your screen, and then returns the cursor to the left margin of the following screen line. This action is actually two distinct activities, and Writeln very economically uses a mechanism that already exists: the Write procedure. The first thing that Writeln does is call Write to display the string itself to the screen. Remember that the caller loaded the address of the string to be displayed into DX before calling Writeln. Nothing has disturbed DX, so Writeln can immediately call Write, which will fetch the address from DX and display the string to the screen.
Returning the cursor is done by displaying the newline sequence, which is stored in a string named CRLF. (If you recall, the carriage return and line feed character pair was built right into our message string in the EAT.ASM program that we dissected in Chapter 8.) Writeln again uses Write to display CRLF. Once that is done, the work is finished, and Writeln executes a RET instruction to return execution to the caller.
Calling procedures from within procedures requires you to pay attention to one thing: stack space. Remember that each procedure call pushes a return address onto the stack. This return address is not removed from the stack until the RET instruction for that procedure executes. If you execute another CALL instruction before returning from a procedure, the second CALL instruction pushes another return address onto the stack. If you keep calling procedures from within procedures, one return address will pile up on the stack for each CALL until you start returning from all those nested procedures.
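The pile-up of return addresses can be traced in a tiny program. This is a sketch under stated assumptions: the procedure names Outer, Middle, and Inner are invented for the illustration, the procedures do nothing but call and return, and S stands for whatever value SP holds when the program starts. All the calls are near calls within one segment, so each return address is 2 bytes.

```nasm
; A sketch only: three levels of nested calls, with the stack pointer
; traced in the comments. Outer, Middle, and Inner are hypothetical names.

[BITS 16]                   ; Set 16 bit code generation
[ORG 0x0100]                ; Set code start address to 100h (COM file)

[SECTION .text]             ; Section containing code

Start:                      ; SP = S, whatever value DOS handed us
        call Outer          ; Pushes a return address: SP = S-2 inside Outer
        mov  AX,04C00H      ; All three procedures have returned: SP = S again
        int  21H            ; Return control to DOS

Outer:
        call Middle         ; A second address piles up: SP = S-4 inside Middle
        ret                 ; Pops the first address: SP goes from S-2 to S

Middle:
        call Inner          ; A third address piles up: SP = S-6 inside Inner
        ret                 ; Pops the second address: SP goes from S-4 to S-2

Inner:
        ret                 ; Deepest point. SP goes from S-6 to S-4
```

At its deepest, inside Inner, this program has 6 bytes of return addresses on the stack: 2 bytes for each level of nesting.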
If you run out of stack space, your program will crash and return to DOS, possibly taking DOS with it. This is why you should take care not to use more stack space than you have. Ironically, in small programs written in real mode flat model, this usually isn't a problem. Stack space isn't allocated in real mode flat model; instead the stack pointer points to the high end of the program's single segment, and the stack uses as much of the segment as it needs. For small programs with only a little data (such as the toy programs we're building and dissecting in this book), 95 percent of the space in the segment has nothing much to do and can be used by the stack if the stack needs it. (Which it doesn't—not in this kind of programming!)
Things are different when you move to real mode segmented model. In that model, you have to explicitly allocate a stack segment of some specific size, and that is all the space that the stack has to work with. So, ironically, in a program that can potentially make use of the full megabyte of real mode memory, it's much easier to foment a stack crash in segmented model than flat model. So, when you allocate space for the stack in real mode segmented model, it makes abundant sense to allocate considerably more stack space than you think you might ever conceivably need. EAT2.ASM at most uses 4 bytes of stack space, because it nests procedure calls two deep. (Writeln within itself calls Write.) In a program like this, stack allocation isn't an issue, even if you migrated it to the segmented model.
Nonetheless, I recommend allocating 512 bytes of stack to get you in the habit of not being stingy with stack space. Obviously, you won't always be able to keep a 128-to-1 ratio of have-to-need, but consider 512 bytes a minimum for stack space allocation in any reasonable program that uses the stack at all. (We allocated only 64 bytes of stack in EATSEG.ASM simply to show you what stack allocation was. The program does not, in fact, make any use of the stack at all.) If you need more, allocate it. Don't forget that there is only one stack in the system, and while your program is running, DOS and the BIOS and any active memory resident programs may well be using the same stack. If they fill it, you'll go down with the system, so leave room!
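For reference, here is roughly what a 512-byte stack allocation looks like in real mode segmented model. Treat it as a sketch rather than a listing from the book: it assumes NASM's obj output format (nasm -f obj) followed by a 16-bit DOS linker, and the segment names, the stacktop label, and the message text are choices made for this illustration.

```nasm
; A sketch only: real mode segmented model with a 512-byte stack.
; Assumes "nasm -f obj" plus a 16-bit DOS linker; names are illustrative.

segment code                ; Segment containing code

..start:                    ; Special label marking the program entry point
        mov  ax,data        ; Point DS at the data segment
        mov  ds,ax
        mov  ax,stack       ; Point SS at the stack segment
        mov  ss,ax
        mov  sp,stacktop    ; Start SP at the high end of the stack

        mov  dx,Msg         ; Load offset of the message into DX
        call Write          ;  and display it
        mov  ax,04C00H      ; This function exits the program
        int  21H            ;  and returns control to DOS.

Write:
        mov  ah,09H         ; Select DOS service 9: Print String
        int  21H            ; Call DOS
        ret                 ; Return to the caller

segment data                ; Segment containing initialized data

Msg     db "Plenty of stack here.",0DH,0AH,'$'

segment stack stack         ; Segment named stack, of combine type stack

        resb 512            ; Reserve 512 bytes for the stack
stacktop:                   ; The stack grows downward from this label
```

The only line that sets the size is resb 512. If a program nests calls deeply or pushes a lot of data, that is the number to raise.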
When to Make Something a Procedure
The single most important purpose of procedures is to manage complexity in your programs by replacing a sequence of machine instructions with a descriptive name. This might hardly seem to the point in the case of the Write procedure, which contains only two instructions apart from the structurally necessary RET instruction.
True. But—the Writeln procedure hides two separate calls to Write behind itself: one to display the string, and another to return the cursor to the left margin of the next line. The name Writeln is more readable and descriptive of what the underlying sequence of instructions does than the sequence of instructions itself.
Extremely simple procedures such as Write don't themselves hide a great deal of complexity. They do give certain actions descriptive names, which is valuable in itself. They also provide basic building blocks for the creation of larger and more powerful procedures, as we'll see later on. And those larger procedures will hide considerable complexity, as you'll soon see.
In general, when looking for some action to turn into a procedure, see what actions tend to happen a lot in a program. Most programs spend a lot of time displaying things to the screen. Such procedures as Write and Writeln become general-purpose tools that may be used all over your programs. Furthermore, once you've written and tested them, they may be reused in future programs as well without adding to the burden of code that you must test for bugs.
Try to look ahead to your future programming tasks and create procedures of general usefulness. I'll show you more of those by way of examples as we continue, and tool building is a very good way to hone your assembly language skills.
On the other hand, a short sequence (5 to 10 instructions) that is only called once or perhaps twice within a middling program (that is, over hundreds of machine instructions) is a poor candidate for a procedure.
You may find it useful to define large procedures that are called only once when your program becomes big enough to require breaking it down into functional chunks. A thousand-line assembly language program might split well into a sequence of 9 or 10 largish procedures. Each is only called once from the main program, but this allows your main program to be very indicative of what the program is doing:
Start:  call Initialize
        call OpenFile
Input:  call GetRec
        call VerifyRec
        call WriteRec
        loop Input
        call CloseFile
        call CleanUp
        call ReturnToDOS
This is clean and readable and provides a necessary view from a height when you begin to approach a thousand-line assembly language program. Remember that the Martians are always hiding somewhere close by, anxious to turn your program into unreadable hieroglyphics.
There's no weapon against them with half the power of procedures.
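One way to start a program organized like that outline is to write the main sequence first and give every procedure a stub that does nothing but announce itself. The sketch below is trimmed to four of the procedures and makes several assumptions that are not in the original outline: the record count of 3 loaded into CX, the message texts, and the use of a Write procedure like the one in EAT2.ASM. It also assumes that DOS service 9 leaves CX alone, since LOOP depends on CX.

```nasm
; A sketch only: the main-program outline with stub procedures.
; The record count and all message texts are hypothetical.

[BITS 16]                   ; Set 16 bit code generation
[ORG 0x0100]                ; Set code start address to 100h (COM file)

[SECTION .text]             ; Section containing code

Start:  call Initialize
        mov  CX,3           ; Assumption: three records to process
Input:  call GetRec
        call WriteRec
        loop Input          ; Decrement CX; jump back to Input until CX is 0
        call CleanUp
        mov  AX,04C00H      ; This function exits the program
        int  21H            ;  and returns control to DOS.

; Each stub only announces itself. None of them changes CX,
; which the LOOP instruction above depends on.

Initialize:
        mov  DX,InitMsg
        call Write
        ret

GetRec:
        mov  DX,GetMsg
        call Write
        ret

WriteRec:
        mov  DX,PutMsg
        call Write
        ret

CleanUp:
        mov  DX,DoneMsg
        call Write
        ret

Write:
        mov  AH,09H         ; Select DOS service 9: Print String
        int  21H            ; Call DOS
        ret                 ; Return to the caller

[SECTION .data]             ; Section containing initialized data

InitMsg DB "Initialize",0DH,0AH,'$'
GetMsg  DB "GetRec",0DH,0AH,'$'
PutMsg  DB "WriteRec",0DH,0AH,'$'
DoneMsg DB "CleanUp",0DH,0AH,'$'
```

Once the skeleton runs, each stub can be replaced with the real thing, one box at a time, while the main program stays as readable as it was on the first day.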