
Eilam E.Reversing.Secrets of reverse engineering.2005
.pdf
Deciphering Code Structures 511
nonzero, CF will be set to one. It appears that some compilers like to use this additional functionality provided by NEG as a clever way to check whether an operand contains a zero or nonzero value. Let’s quickly go over each step in this sequence:
■■Use NEG to check whether the source operand is zero or nonzero. The result is stored in CF.
■■Use SBB to transfer the result from CF back to a usable register. Of course, because of the nature of SBB, a nonzero value in CF will become –1 rather than 1. Whether that’s a problem or not depends on the nature of the high-level language. Some languages use 1 to denote True, while others use –1.
■■Because the code in the sample came from a C/C++ compiler, which uses 1 to denote True, an additional NEG is required, except that this time NEG is actually employed for reversing the operand’s sign. If the operand is –1, it will become 1. If it’s zero it will of course remain zero.
The following is a pseudocode that will help clarify the steps described previously:
EAX = EAX & 0x00001000; if (EAX)
CF = 1;
else
CF = 0;
EAX = EAX – (EAX + CF);
EAX = -EAX;
Essentially, what this sequence does is check for a particular bit in EAX (0x00001000), and returns 1 if it is set or zero if it isn’t. It is quite elegant in the sense that it is purely arithmetic—there are no conditional branch instructions involved. Let’s quickly translate this sequence back into a high-level C representation:
if (LocalVariable & 0x00001000)
return TRUE;
else
return FALSE;
That’s much more readable, isn’t it? Still, as reversers we’re often forced to work with such less readable, unattractive code sequences as the one just dissected. Knowing and understanding these types of low-level tricks is very helpful because they are very frequently used in compiler-generated code.
Let’s take a look at another, slightly more involved, example of how highlevel logical constructs can be implemented using pure arithmetic:

512 Appendix A
call |
SomeFunc |
sub |
eax, 4 |
neg |
eax |
sbb |
eax, eax |
and |
al, -52 |
add |
eax, 54 |
ret |
|
You’ll notice that this sequence also uses the NEG/SBB combination, except that this one has somewhat more complex functionality. The sequence starts by calling a function and subtracting 4 from its return value. It then invokes NEG and SBB in order to perform a zero test on the result, just as you saw in the previous example. If after the subtraction the return value from SomeFunc is zero, SBB will set EAX to zero. If the subtracted return value is nonzero, SBB will set EAX to –1 (or 0xffffffff in hexadecimal).
The next two instructions are the clever part of this sequence. Let’s start by looking at that AND instruction. Because SBB is going to set EAX either to zero or to 0xffffffff, we can consider the following AND instruction to be similar to a conditional assignment instruction (much like the CMOV instruction discussed later). By ANDing EAX with a constant, the code is essentially saying: “if the result from SBB is zero, do nothing. If the result is –1, set EAX to the specified constant.” After doing this, the code unconditionally adds 54 to EAX and returns to the caller.
The challenge at this point is to try and figure out what this all means. This sequence is obviously performing some kind of transformation on the return value of SomeFunc and returning that transformed value to the caller. Let’s try and analyze the bottom line of this sequence. It looks like the return value is going to be one of two values: If the outcome of SBB is zero (which means that SomeFunc’s return value was 4), EAX will be set to 54. If SBB produces 0xffffffff, EAX will be set to 2, because the AND instruction will set it to –52, and the ADD instruction will bring the value up to 2.
This is a sequence that compares a pair of integers, and produces (without the use of any branches) one value if the two integers are equal and another value if they are unequal. The following is a C version of the assembly language snippet from earlier:
if (SomeFunc() == 4) return 54;
else
return 2;

Deciphering Code Structures 513
Predicated Execution
Using arithmetic sequences to implement branchless logic is a very limited technique. For more elaborate branchless logic, compilers employ conditional instructions (provided that such instructions are available on the target CPU architecture). The idea behind conditional instructions is that instead of having to branch to two different code sections, compilers can sometimes use special instructions that are only executed if certain conditions exist. If the conditions aren’t met, the processor will simply ignore the instruction and move on. The IA-32 instruction set does not provide a generic conditional execution prefix that applies to all instructions. To conditionally perform operations, specific instructions are available that operate conditionally.
Certain CPU architectures such as Intel’s IA-64 64-bit architecture actually allow almost any instruction in the instruction set to execute conditionally. In IA-64 (also known as Itanium2) this is implemented using a set of 64 available predicate registers that each store a Boolean specifying whether a particular condition is True or False. Instructions can be prefixed with the name of one of the predicate registers, and the CPU will only execute the instruction if the register equals True. If not, the CPU will treat the instruction as a NOP.
The following sections describe the two IA-32 instruction groups that enable branchless logic implementations under IA-32 processor.
Set Byte on Condition (SETcc)
SETcc is a set of instructions that perform the same logical flag tests as the conditional jump instructions (Jcc), except that instead of performing a jump, the logic test is performed, and the result is stored in an operand. Here’s a quick example of how this is used in actual code. Suppose that a programmer writes the following line:
return (result != FALSE);
In case you’re not entirely comfortable with C language semantics, the only difference between this and the following line:
return result;
is that in the first version the function will always return a Boolean. If result equals zero it will return one. If not, it will return zero, regardless of what value result contains. In the second example, the return value will be whatever is stored in result.



516Appendix A
observes which functions are executed most frequently. The program then reorganizes the order of functions in the binary according to that information, so that the most popular functions are moved to the beginning of the module, and the less popular functions are placed near the end. This way the operating system can keep the “popular code” area in memory and only load the rest of the module as needed (and then page it out again when it’s no longer needed).
In most reversing scenarios function-level working-set tuning won’t have any impact on the reversing process, except that it provides a tiny hint regarding the program: A function’s address relative to the beginning of the module indicates how popular that function is. The closer a function is to the beginning of the module, the more popular it is. Functions that reside very near to the end of the module (those that have higher addresses) are very rarely executed and are probably responsible for some unusual cases such as error cases or rarely used functionality. Figure A.13 illustrates this concept.
Line-Level Working-Set Tuning
Line-level working-set tuning is a more advanced form of working-set tuning that usually requires explicit support in the compiler itself. The idea is that instead of shuffling functions based on their usage patterns, the working-set tuning process can actually shuffle conditional code sections within individual functions, so that the working set can be made even more efficient than with function-level tuning. The working-set tuner records usage statistics for every condition in the program and can actually relocate conditional code blocks to other areas in the binary module.
For reversers, line-level working-set tuning provides the benefit of knowing whether a particular condition is likely to execute during normal runtime. However, not being able to see the entire function in one piece is a major hassle. Because code blocks are moved around beyond the boundaries of the functions to which they belong, reversing sessions on such modules can exhibit some peculiarities. One important thing to pay attention to is that functions are broken up and scattered throughout the module, and that it’s hard to tell when you’re looking at a detached snippet of code that is a part of some unknown function at the other end of the module. The code that sits right before or after the snippet might be totally unrelated to it. One trick that sometimes works for identifying the connections between such isolated code snippets is to look for an unconditional JMP at the end of the snippet. Often this detached snippet will jump back to the main body of the function, revealing its location. In other cases the detached code chunk will simply return, and its connection to its main function body would remain unknown. Figure A.14 illustrates the effect of line-level working-set tuning on code placement.




520Appendix B
program code. Many of the flags in EFLAGS are system flags that determine the current state of the processor. Other than these system flags, there are also eight status flags, which represent the current state of the processor, usually with regards to the result of the last arithmetic operation performed. The following sections describe the most important status flags used in IA-32.
The Overflow Flags (CF and OF)
The carry flag (CF) and overflow flag (OF) are two important elements in arithmetical and logical assembly language. Their function and the differences between them aren’t immediately obvious, so here is a brief overview.
The CF and OF are both overflow indicators, meaning that they are used to notify the program of any arithmetical operation that generates a result that is too large in order to be fully represented by the destination operand. The difference between the two is related to the data types that the program is dealing with.
Unlike most high-level languages, assembly language programs don’t explicitly specify the details of the data types they deal with. Some arithmetical instructions such as ADD (Add) and SUB (Subtract) aren’t even aware of whether the operands they are working with are signed or unsigned because it just doesn’t matter—the binary result is the same. Other instructions, such as MUL (Multiply) and DIV (Divide) have different versions for signed and unsigned operands because multiplication and division actually produce different binary outputs depending on the exact data type.
One area where signed or unsigned representation always matters is overflows. Because signed integers are one bit smaller than their equivalent-sized unsigned counterparts (because of the extra bit that holds the sign), overflows are triggered differently for signed and unsigned integers. This is where the carry flag and the overflow flag come into play. Instead of having separate signed and unsigned versions of arithmetic instructions, the problem of correctly reporting overflows is addressed by simply having two overflow flags: one for signed operands and one for unsigned operands. Operations such as addition and subtraction are performed using the same instruction for either signed or unsigned operands, and such instructions set both groups of flags and leave it up to the following instructions to regard the relevant one.
For example, consider the following arithmetic sample and how it affects the overflow flags:
mov |
ax, 0x1126 |
; |
(4390 in decimal) |
|
mov |
bx, |
0x7200 |
; |
(29184 in decimal) |
add |
ax, |
bx |
|
|