
Assembly Language Step by Step 1992



Most people, having learned a little assembly language, grumble about the seemingly huge number of instructions it takes to do anything useful. By and large, this is a legitimate gripe—and the major reason there are things like Turbo Pascal and Microsoft BASIC.

The 8086/8088 instruction set, on the other hand, is full of surprises, and the surprise most likely to make apprentice assembly-language programmers gasp is the instruction group we call the string instructions.

They alone of all the instructions in the 8086/8088 instruction set have the power to deal with long sequences of bytes or words at one time. (In assembly language, any contiguous sequence of bytes or words in memory may be considered a string.) More amazingly, they deal with these large sequences of bytes or words in an extraordinarily compact way: by executing an instruction loop entirely inside the CPU! A string instruction is, in effect, a complete instruction loop baked into a single instruction.

The string instructions are subtle and complicated, and I won't be able to treat them exhaustively in this book. Much of what they do qualifies as an advanced topic. Still, you can get a good start on understanding the string instructions by using them to build some simple tools to add to your video toolkit.

Besides, for my money, the string instructions are easily the single most fascinating aspect of assembly-language work.

10.1 The Notion of an Assembly-Language String

Words fail us sometimes by picking up meanings as readily as a magnet picks up iron filings. The word string is a major offender here. It means roughly the same thing in all computer programming, but there are a multitude of small variations on that single theme. If you learned about strings in Turbo Pascal, you'll find that what you know isn't totally applicable when you program in C, or BASIC, or assembly.

So here's the big view: a string is any contiguous group of bytes, of any arbitrary size up to the size of a segment. The main concept of a string is that its component bytes are right there in a row, with no interruptions.

That's pretty fundamental. Most higher-level languages build on the string concept, in several ways.

Turbo Pascal treats strings as a separate data type, limited to 255 characters in length, with a single byte at the start of the string to indicate how many bytes are in the string. In C, a string can be longer than 255 bytes, and it has no "length byte" in front of it. Instead, a C string is said to end when a byte with a binary value of 0 is encountered. In BASIC, strings are stored in something called string space, which has a lot of built-in code machinery associated with it.
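To make the difference between the two layouts concrete, here is a minimal sketch in C. The helper names (make_pascal_string, pascal_length, c_length) are my own inventions for illustration and are not part of Turbo Pascal or any C library; the sketch assumes only what the paragraph above says about where each language keeps its length information.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build a Turbo Pascal-style string in buf.
   Byte 0 holds the length and the characters follow it.
   buf must have room for strlen(text) + 1 bytes. */
static void make_pascal_string(unsigned char *buf, const char *text)
{
    size_t len = strlen(text);
    assert(len <= 255);           /* the length has to fit in one byte */
    buf[0] = (unsigned char)len;  /* the "length byte" in front */
    memcpy(buf + 1, text, len);   /* the characters; no terminator stored */
}

/* Length of a Pascal-style string: just read the first byte. */
static size_t pascal_length(const unsigned char *buf)
{
    return buf[0];
}

/* Length of a C-style string: scan until a byte with value 0 turns up. */
static size_t c_length(const char *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}
```

Calling make_pascal_string on a 256-byte buffer with the text "HELLO" leaves 5 in the first byte and 'H' in the second, while c_length("HELLO") has to walk across all five characters before it finds the 0 byte that ends the string.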

When you begin working in assembly, you have to give up all that high-level language stuff. Assembly strings are just contiguous regions of memory. They start at some specified segment:offset address, go for some number of bytes, and stop. There is no "length byte" to tell how many bytes are in the string, and there are no standard boundary characters like binary 0 to indicate where a string starts or ends.

You can certainly write assembly-language routines that allocate Turbo Pascal-style strings or C-style strings and manipulate them. To avoid confusion, however, you must think of the data operated on by those routines as Pascal or C strings rather than as assembly strings.

Turning Your "String Sense" Inside-Out

As I mentioned above, assembly strings have no boundary values or length indicators. They can contain any value at all, including binary 0. In fact, you really have to stop thinking of strings in terms of specific regions in memory. You should instead think of strings in much the same way you think of segments: in terms of the register values that define them.

It's slightly inside-out compared to how you think of strings in languages like Pascal, but it works: you've got a string when you set up a pair of registers to point to one. And once you point to a string, the length of that string is defined by the value you place in register CX.

This is key: assembly strings are wholly defined by values you place in registers. There is a set of assumptions about strings and registers baked into the silicon of the CPU. When you execute one of the string instructions (as I'll describe a little later), the CPU uses those assumptions to determine what area of memory it reads from or writes to.
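If the idea still feels slippery, a rough model in C may help. This is only a sketch of the concept, not of any real instruction: the struct string_regs and the function sum_bytes are names I made up, with start standing in for the segment:offset register pair and count standing in for CX.

```c
#include <assert.h>
#include <stddef.h>

/* A model of "a string is whatever the registers say it is".
   These names are illustrative only; they are not CPU definitions. */
struct string_regs {
    const unsigned char *start;  /* stands in for the segment:offset pair */
    size_t count;                /* stands in for CX: bytes left to process */
};

/* Add up the bytes of the string that regs describes. Nothing stored in
   memory marks the end of the string; only the count stops the loop, so
   a byte with value 0 is treated like any other byte. */
static unsigned long sum_bytes(struct string_regs regs)
{
    unsigned long total = 0;
    while (regs.count != 0) {
        total += *regs.start;  /* process the byte being pointed to */
        regs.start++;          /* step to the next byte */
        regs.count--;          /* one fewer byte to go */
    }
    return total;
}
```

Given eight bytes of memory holding 1, 2, 0, 4, 5, 6, 7, 8, a start at the first byte with a count of 8 describes one string (the sum is 33), and a start at the third byte with a count of 3 describes a different string in the very same memory (the sum is 9). Nothing in memory changed between the two; only the "register" values did.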

Source Strings and Destination Strings

There are two kinds of strings in assembly work: source strings are strings that you read from, and destination strings are strings that you write to. The difference between the two is only a matter of registers. Source strings and destination strings can overlap; in fact, the very same region of memory can be both a source string and a destination string, all at the same time.
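Overlap is easier to see than to describe, so here is a small C model of it. The function copy_forward is a hypothetical stand-in of my own for a byte-at-a-time copy that starts at the lowest address; it assumes nothing about how any particular string instruction is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Copy count bytes from src to dst, lowest address first, one byte at a
   time. Unlike the C library's memcpy, the two regions are allowed to
   overlap here; when they do, the result depends on the copy direction. */
static void copy_forward(unsigned char *dst, const unsigned char *src,
                         size_t count)
{
    while (count != 0) {
        *dst++ = *src++;  /* read from the source, write to the destination */
        count--;
    }
}
```

Suppose a six-byte buffer holds the letters ABCDEF. Calling copy_forward with the buffer's first byte as the source, its second byte as the destination, and a count of 5 leaves AAAAAA behind, because each byte read from the source was written by the step before it. Calling it with the same address for both source and destination makes one region play both roles at once, and in that case the contents come out unchanged.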