Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:
Carter P.A.PC assembly language.2005.pdf
Скачиваний:
15
Добавлен:
23.08.2013
Размер:
1.07 Mб
Скачать

22

CHAPTER 1. INTRODUCTION

The compiler specific example files, available from the author’s web site, have already been modified to work with the appropriate compiler.

1.4.2Compiler dependencies

The assembly code above is specific to the free GNU9-based DJGPP C/C++ compiler.10 This compiler can be freely downloaded from the Internet. It requires a 386-based PC or better and runs under DOS, Windows 95/98 or NT. This compiler uses object files in the COFF (Common Object File Format) format. To assemble to this format use the -f coff switch with nasm (as shown in the comments of the above code). The extension of the resulting object file will be o.

The Linux C compiler is a GNU compiler also. To convert the code above to run under Linux, simply remove the underscore prefixes in lines 37 and 38. Linux uses the ELF (Executable and Linkable Format) format for object files. Use the -f elf switch for Linux. It also produces an object with an o extension.

Borland C/C++ is another popular compiler. It uses the Microsoft OMF format for object files. Use the -f obj switch for Borland compilers. The extension of the object file will be obj. The OMF format uses di erent segment directives than the other object formats. The data segment (line 13) must be changed to:

segment DATA public align=4 class=DATA use32

The bss segment (line 26) must be changed to:

segment BSS public align=4 class=BSS use32

The text segment (line 36) must be changed to:

segment TEXT public align=1 class=CODE use32

In addition a new line should be added before line 36:

group DGROUP BSS DATA

The Microsoft C/C++ compiler can use either the OMF format or the Win32 format for object files. (If given a OMF format, it converts the information to Win32 format internally.) Win32 format allows segments to be defined just as for DJGPP and Linux. Use the -f win32 switch to output in this mode. The extension of the object file will be obj.

1.4.3Assembling the code

The first step is to assembly the code. From the command line, type:

nasm -f object-format first.asm

9GNU is a project of the Free Software Foundation (http://www.fsf.org)

10http://www.delorie.com/djgpp

1.4. CREATING A PROGRAM

23

where object-format is either co , elf , obj or win32 depending on what C compiler will be used. (Remember that the source file must be changed for both Linux and Borland as well.)

1.4.4Compiling the C code

Compile the driver.c file using a C compiler. For DJGPP, use:

gcc -c driver.c

The -c switch means to just compile, do not attempt to link yet. This same switch works on Linux, Borland and Microsoft compilers as well.

1.4.5Linking the object files

Linking is the process of combining the machine code and data in object files and library files together to create an executable file. As will be shown below, this process is complicated.

C code requires the standard C library and special startup code to run. It is much easiler to let the C compiler call the linker with the correct parameters, than to try to call the linker directly. For example, to link the code for the first program using DJGPP, use:

gcc -o first driver.o first.o asm io.o

This creates an executable called first.exe (or just first under Linux). With Borland, one would use:

bcc32 first.obj driver.obj asm io.obj

Borland uses the name of the first file listed to determine the executable name. So in the above case, the program would be named first.exe.

It is possible to combine the compiling and linking step. For example,

gcc -o first driver.c first.o asm io.o

Now gcc will compile driver.c and then link.

1.4.6Understanding an assembly listing file

The -l listing-file switch can be used to tell nasm to create a listing file of a given name. This file shows how the code was assembled. Here is how lines 17 and 18 (in the data segment) appear in the listing file. (The line numbers are in the listing file; however notice that the line numbers in the source file may not be the same as the line numbers in the listing file.)

 

24

 

CHAPTER 1. INTRODUCTION

48

00000000

456E7465722061206E-

prompt1 db

"Enter a number: ", 0

49

00000009

756D6265723A2000

 

 

50

00000011

456E74657220616E6F-

prompt2 db

"Enter another number: ", 0

51

0000001A

74686572206E756D62-

 

 

52

00000023

65723A2000

 

 

The first column in each line is the line number and the second is the o set

(in hex) of the data in the segment. The third column shows the raw hex values that will be stored. In this case the hex data correspond to ASCII codes. Finally, the text from the source file is displayed on the line. The o sets listed in the second column are very likely not the true o sets that the data will be placed at in the complete program. Each module may define its own labels in the data segment (and the other segments, too). In the link step (see section 1.4.5), all these data segment label definitions are combined to form one data segment. The new final o sets are then computed by the linker.

Here is a small section (lines 54 to 56 of the source file) of the text segment in the listing file:

94

0000002C

A1[00000000]

mov

eax, [input1]

95

00000031

0305[04000000]

add

eax, [input2]

96

00000037

89C3

mov

ebx, eax

The third column shows the machine code generated by the assembly. Often the complete code for an instruction can not be computed yet. For example, in line 94 the o set (or address) of input1 is not known until the code is linked. The assembler can compute the op-code for the mov instruction (which from the listing is A1), but it writes the o set in square brackets because the exact value can not be computed yet. In this case, a temporary o set of 0 is used because input1 is at the beginning of the part of the bss segment defined in this file. Remember this does not mean that it will be at the beginning of the final bss segment of the program. When the code is linked, the linker will insert the correct o set into the position. Other instructions, like line 96, do not reference any labels. Here the assembler can compute the complete machine code.

Big and Little Endian Representation

If one looks closely at line 95, something seems very strange about the o set in the square brackets of the machine code. The input2 label is at o set 4 (as defined in this file); however, the o set that appears in memory is not 00000004, but 04000000. Why? Di erent processors store multibyte integers in di erent orders in memory. There are two popular methods of

Endian is pronounced like storing integers: big endian and little endian. Big endian is the method indian.

Соседние файлы в предмете Электротехника