
Eilam E.Reversing.Secrets of reverse engineering.2005
.pdf
Antireversing Techniques 341
0040103F  . 50              PUSH EAX
00401040  . E8 BBFFFFFF     CALL compiler.main
Olly is clearly ignoring the junk byte and using the conditional jump as a marker to the real code starting position, which is why it is providing an accurate listing. It is possible that Olly contains specific code for dealing with these kinds of tricks. Regardless, at this point it becomes clear that you can take advantage of Olly’s use of the jump’s target address to confuse it; if OllyDbg uses conditional jumps to mark the beginning of valid code sequences, you can just create a conditional jump that points to the beginning of the invalid sequence. The following code snippet demonstrates this idea:
_asm
{
    mov eax, 2
    cmp eax, 3
    je Junk             ; never taken: 2 != 3
    jne After           ; always taken
Junk:
    _emit 0xf           ; junk byte (first byte of a two-byte opcode)
After:
    mov eax, [SomeVariable]
    push eax
    call AFunction
}
This sequence is an improved implementation of the same approach. It is more likely to confuse recursive traversal disassemblers because they must choose, more or less arbitrarily, which of the two jump targets to treat as the start of valid code. The choice is not trivial because both code sequences are “valid” from the disassembler’s perspective. This is a theoretical problem: the disassembler has no idea what constitutes valid code. The only measurement it has is whether it finds invalid opcodes, in which case a clever disassembler should probably consider the current starting address as invalid and look for an alternative one.
Let’s look at the listing Olly produces from the above code.
00401031  . B8 02000000     MOV EAX,2
00401036  . 83F8 03         CMP EAX,3
00401039  . 74 02           JE SHORT compiler.0040103D
0040103B  . 75 01           JNZ SHORT compiler.0040103E
0040103D  > 0F8B 45F850E8   JPO E8910888
00401043  ? B9 FFFFFF68     MOV ECX,68FFFFFF
00401048  ? DC60 40         FSUB QWORD PTR DS:[EAX+40]
0040104B  ? 00E8            ADD AL,CH
0040104D  ? 0300            ADD EAX,DWORD PTR DS:[EAX]
0040104F  ? 0000            ADD BYTE PTR DS:[EAX],AL

342 Chapter 10
This time OllyDbg swallows the bait and uses the invalid 0040103D as the starting address from which to disassemble, which produces a meaningless assembly language listing. What’s more, IDA Pro produces an equally unreadable output—both major recursive traversers fall for this trick. Needless to say, linear sweepers such as SoftICE react in the exact same manner.
One recursive traversal disassembler that does not fall for this trick is PEBrowse Professional. Here is the listing produced by PEBrowse:
0x401031: B802000000      mov   eax,0x2
0x401036: 83F803          cmp   eax,0x3
0x401039: 7402            jz    0x40103d                  ; (*+0x4)
0x40103B: 7501            jnz   0x40103e                  ; (*+0x3)
0x40103D: 0F8B45F850E8    jpo   0xe8910888                ; <==0x00401039(*-0x4)
;***********************************************************************
0x40103E: 8B45F8          mov   eax,dword ptr [ebp-0x8]   ; VAR:0x8
0x401041: 50              push  eax
0x401042: E8B9FFFFFF      call  0x401000
;***********************************************************************
Apparently (and it’s difficult to tell whether this is caused by the presence of special heuristics designed to withstand such code sequences or just by a fluke) PEBrowse Professional is trying to disassemble the code from both 40103D and 40103E, and is showing both options. It looks like you’ll need to improve on your technique a little bit: there must not be a direct jump to the valid code address if you’re to fool every disassembler. The solution is to simply perform an indirect jump using a value loaded in a register. The following code confuses every disassembler I’ve tested, including both linear-sweep-based tools and recursive-traversal-based tools.
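_asm
{
    mov eax, 2
    cmp eax, 3
    je Junk             ; never taken: 2 != 3
    mov eax, After      ; load the address of the real code
    jmp eax             ; indirect jump, equivalent to jmp After
Junk:
    _emit 0xf           ; junk byte
After:
    mov eax, [SomeVariable]
    push eax
    call AFunction
}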
The reason this trick works is quite trivial—because the disassembler has no idea that the sequence mov eax, After, jmp eax is equivalent to jmp After, the disassembler is not even trying to begin disassembling from the After address.

The disadvantage of all of these tricks is that they count on the disassembler being relatively dumb. Luckily, most Windows disassemblers are dumb enough that you can fool them. What would happen if you ran into a clever disassembler that actually analyzes each line of code and traces the flow of data? Such a disassembler would not fall for any of these tricks, because it would detect your opaque predicate; how difficult is it to figure out that a conditional jump that is taken when 2 equals 3 is never actually going to be taken? Moreover, a simple data-flow analysis would expose the fact that the final JMP sequence is essentially equivalent to a JMP After, which would probably be enough to correct the disassembly anyhow.
Still, even a cleverer disassembler could easily be fooled by moving the real jump addresses into a central, runtime-generated data structure. It would be borderline impossible to perform a global data-flow analysis so comprehensive that it could find the real addresses without actually running the program.
Applications
Let’s see how one would use the previous techniques in a real program. I’ve created a simple macro called OBFUSCATE, which adds a little assembly language sequence to a C program (see Listing 10.1). This sequence would temporarily confuse most disassemblers until they resynchronized. The number of instructions it will take to resynchronize depends not only on the specific disassembler used, but also on the specific code that comes after the macro.
#define paste(a, b) a##b
#define pastesymbols(a, b) paste(a, b)
#define OBFUSCATE() \
    _asm { mov eax, __LINE__ * 0x635186f1          };\
    _asm { cmp eax, __LINE__ * 0x9cb16d48          };\
    _asm { je pastesymbols(Junk,__LINE__)          };\
    _asm { mov eax, pastesymbols(After, __LINE__)  };\
    _asm { jmp eax                                 };\
    _asm { pastesymbols(Junk, __LINE__):           };\
    _asm { _emit (0xd8 + __LINE__ % 8)             };\
    _asm { pastesymbols(After, __LINE__):          };
Listing 10.1 A simple code obfuscation macro that aims at confusing disassemblers.
This macro was tested on the Microsoft C/C++ compiler (version 13), and contains pseudorandom values to make it slightly more difficult to search and replace (the operands of the MOV and CMP instructions and the junk byte itself are all pseudorandom, calculated using the current code line number). Notice that the junk byte ranges from D8 to DF; these are good opcodes to use because they are all multibyte opcodes. I’m using the __LINE__ macro in order to create unique symbol names in case the macro is used repeatedly in the same function. Each occurrence of the macro will define symbols with different names. The paste and pastesymbols macros are required because otherwise the compiler just won’t properly resolve the __LINE__ constant and will use the string __LINE__ instead.
If distributed throughout the code, this macro (and you could very easily create dozens of similar variations) would make the reversing process slightly more tedious. The problem is that too many copies of this code would make the program run significantly slower (especially if the macro is placed inside key loops in the program that run many times). Overusing this technique would also make the program significantly larger in terms of both memory consumption and disk space usage.
It’s important to realize that all of these techniques are limited in their effectiveness. They most certainly won’t deter an experienced and determined reverser from reversing or cracking your application, but they might complicate the process somewhat. The manual approach for dealing with this kind of obfuscated code is to tell the disassembler where the code really starts. Advanced disassemblers such as IDA Pro or even OllyDbg’s built-in disassembler allow users to add disassembly hints, which enable the program to properly interpret the code.
The biggest problem with these macros is that they are repetitive, which makes them exceedingly vulnerable to automated tools that just search and destroy them. A dedicated attacker can usually write a program or script that would eliminate them in 20 minutes. Additionally, specific disassemblers have been created that overcome most of these obfuscation techniques (see “Static Disassembly of Obfuscated Binaries” by Christopher Kruegel, et al. [Kruegel]). Is it worth it? In some cases it might be, but if you are looking for powerful antireversing techniques, you should probably stick to the control flow and data-flow obfuscating transformations discussed next.
Code Obfuscation
You probably noticed that the antireversing techniques described so far are all platform-specific “tricks” that in my opinion do nothing more than increase the attacker’s “annoyance factor”. Real code obfuscation involves transforming the code in such a way that makes it significantly less human-readable, while still retaining its functionality. These are typically non-platform-specific transformations that modify the code to hide its original purpose and drown the reverser in a sea of irrelevant information. The level of complexity added by an obfuscating transformation is typically called potency, and can be measured using conventional software complexity metrics such as how many predicates the program contains and the depth of nesting in a particular code sequence.

OBFUSCATION TOOLS
Let’s take a quick look at the existing obfuscation tools that can be used to obfuscate programs on the fly. There are quite a few bytecode obfuscators for Java and .NET, and I will be discussing and evaluating some of them in Chapter 12. As for obfuscation of native IA-32 code, there aren’t that many generic tools that process entire executables and effectively obfuscate them. One notable product that is quite powerful is EXECryptor by StrongBit Technology (www.strongbit.com). EXECryptor processes PE executables and applies a variety of obfuscating transformations to the machine code. Code obfuscated by EXECryptor really becomes significantly more difficult to reverse compared to plain IA-32 code. Another powerful technology is the StarForce suite of copy protection products, developed by StarForce Technologies (www.star-force.com). The StarForce products are more than just powerful obfuscation products: they are full-blown copy protection products that provide either hardware-based or pure software-based copy protection functionality.
Beyond the mere additional complexity introduced by adding additional logic and arithmetic to a program, an obfuscating transformation must be resilient (meaning that it cannot be easily undone). Because many of these transformations add irrelevant instructions that don’t really produce valuable data, it is possible to create deobfuscators. A deobfuscator is a program that implements various data-flow analysis algorithms on an obfuscated program which sometimes enable it to separate the wheat from the chaff and automatically remove all irrelevant instructions and restore the code’s original structure. Creating resilient obfuscation transformations that are resistant to deobfuscation is a major challenge and is the primary goal of many obfuscators.
Finally, an obfuscating transformation will typically have an associated cost. This can be in the form of larger code, slower execution times, or increased runtime memory consumption. It is important to realize that some transformations do not incur any kind of runtime cost, because they involve a simple reorganization of the program that is transparent to the machine but makes the program less human-readable.
In the following sections, I will be going over the common obfuscating transformations. Most of these transformations were meant to be applied programmatically by running an obfuscator on an existing program, either at the source code or the binary level. Still, many of these transformations can be applied manually, while the program is being written or afterward, before it is shipped to end users. Automatic obfuscation is obviously far more effective because it can obfuscate the entire program and not just small parts of it. Additionally, automatic obfuscation is typically performed after the program is compiled, which means that the original source code is not made any less readable (as is the case when obfuscation is performed manually).

Control Flow Transformations
Control flow transformations are transformations that alter the order and flow of a program in a way that reduces its human readability. In “Manufacturing Cheap, Resilient, and Stealthy Opaque Constructs” by Christian Collberg, Clark Thomborson, and Douglas Low [Collberg1], control flow transformations are categorized as computation transformations, aggregation transformations, and ordering transformations.
Computation transformations are aimed at reducing the readability of the code by modifying the program’s original control flow structure in ways that make for a functionally equivalent program that is far more difficult to translate back into a high-level language. This can be done either by removing control flow information from the program or by adding new control flow statements that complicate the program and cannot be easily translated into a high-level language.
Aggregation transformations destroy the high-level structure of the program by breaking the high-level abstractions created by the programmer while the program was being written. The basic idea is to break such abstractions so that the high-level organization of the code becomes senseless.
Ordering transformations are somewhat less powerful transformations that randomize (as much as possible) the order of operations in a program so that its readability is reduced.
Opaque Predicates
Opaque predicates are a fundamental building block for control flow transformations. I’ve already introduced some trivial opaque predicates in the previous section on antidisassembling techniques. The idea is to create a logical statement whose outcome is constant and is known in advance. Consider, for example, the statement if (x + 1 == x). This statement will obviously never be satisfied and can be used to confuse reversers and automated decompilation tools into thinking that the statement is actually a valid part of the program.
With such a simple statement, it is going to be quite easy for both humans and machines to figure out that this is a false statement. The objective is to create opaque predicates that would be difficult to distinguish from the actual program code and whose behavior would be difficult to predict without actually stepping into the code. The interesting thing about opaque predicates (and about several other aspects of code obfuscation as well) is that confusing an automated deobfuscator is often an entirely different problem from confusing a human reverser.
Consider, for example, the concurrency-based opaque predicates suggested in [Collberg1]. The idea is to create one or more threads that are responsible for constantly generating new random values and storing them in a globally accessible data structure. The values stored in those data structures consistently adhere to simple rules (such as being lower or higher than a certain constant). The threads that contain the actual program code can access this global data structure and check that those values are within the expected range. It would make quite a challenge for an automated deobfuscator to figure this structure out and pinpoint such fake control flow statements. The concurrent access to the data would hugely complicate the matter for an automated deobfuscator (though an obfuscator would probably only be aware of such concurrency in a bytecode language such as Java). In contrast, a person would probably immediately suspect a thread that constantly generates random numbers and stores them in a global data structure. It would probably seem very fishy to a human reverser.
Now consider a far simpler arrangement where several bogus data members are added into an existing program data structure. These members are constantly accessed and modified by code that’s embedded right into the program. Those members adhere to some simple numeric rules, and the opaque predicates in the program rely on these rules. Such an implementation might be relatively easy to detect for a powerful deobfuscator (depending on the specific platform), but could be quite a challenge for a human reverser.
Generally speaking, opaque predicates are more effective when implemented in lower-level machine-code programs than in higher-level bytecode programs, because they are far more difficult to detect in low-level machine code. The process of automatically identifying individual data structures in a native machine-code program is quite difficult, which means that in most cases opaque predicates cannot be automatically detected or removed. That’s because performing global data-flow analysis on low-level machine code is not always simple or even possible. For reversers, the only way to deal with opaque predicates implemented in low-level native machine-code programs is to try to locate them manually by looking at the code. This is possible, but not very easy.
In contrast, higher-level bytecode executables typically contain far more details regarding the specific data structures used in the program. That makes it much easier to implement data-flow analysis and write automated code that detects opaque predicates.
The bottom line is that you should probably focus most of your antireversing efforts on confusing the human reversers when developing in lower-level languages and on automated decompilers/deobfuscators when working with bytecode languages such as Java.
For a detailed study of opaque constructs and various implementation ideas see [Collberg1] and General Method of Program Code Obfuscation by Gregory Wroblewski [Wroblewski].

Confusing Decompilers
Because bytecode-based languages are highly detailed, there are numerous decompilers that are highly effective for decompiling bytecode executables. One of the primary design goals of most bytecode obfuscators is to confuse decompilers, so that the code cannot be easily restored to a highly detailed source code. One trick that does wonders is to modify the program binary so that the bytecode contains statements that cannot be translated back into the original high-level language. The example given in A Taxonomy of Obfuscating Transformations by Christian Collberg, Clark Thomborson, and Douglas Low [Collberg2] is the Java programming language, where the high-level language does not have the goto statement, but the Java bytecode does. This means that it’s possible to add goto statements into the bytecode in order to completely break the program’s flow graph, so that a decompiler cannot later reconstruct it (because it contains instructions that cannot be translated back to Java).
In native processor languages such as IA-32 machine code, decompilation is such a complex and fragile process that any kind of obfuscation transformation could easily cause a decompiler to fail or to produce meaningless code. Consider, for example, what would happen if a decompiler ran into the OBFUSCATE macro from the previous section.
Table Interpretation
Converting a program or a function into a table interpretation layout is a highly powerful obfuscation approach that, if done right, can repel both deobfuscators and human reversers. The idea is to break a code sequence into multiple short chunks and have the code loop through a conditional code sequence that decides which of the chunks to jump to at any given moment. This dramatically reduces the readability of the code because it completely hides any kind of structure within it. Any code structures, such as logical statements or loops, are buried inside this unintuitive structure.
As an example, consider the simple data processing function in Listing 10.2.
00401000  push  esi
00401001  push  edi
00401002  mov   edi,dword ptr [esp+10h]
00401006  xor   eax,eax
00401008  xor   esi,esi
0040100A  cmp   edi,3
0040100D  jbe   0040103A
0040100F  mov   edx,dword ptr [esp+0Ch]
00401013  add   edi,0FFFFFFFCh
00401016  push  ebx
00401017  mov   ebx,dword ptr [esp+18h]
0040101B  shr   edi,2
0040101E  push  ebp
0040101F  add   edi,1
00401022  mov   ecx,dword ptr [edx]
00401024  mov   ebp,ecx
00401026  xor   ebp,esi
00401028  xor   ebp,ebx
0040102A  mov   dword ptr [edx],ebp
0040102C  xor   eax,ecx
0040102E  add   edx,4
00401031  sub   edi,1
00401034  mov   esi,ecx
00401036  jne   00401022
00401038  pop   ebp
00401039  pop   ebx
0040103A  pop   edi
0040103B  pop   esi
0040103C  ret
Listing 10.2 A simple data processing function that XORs a data block with a parameter passed to it and writes the result back into the data block.
Let us now take this function and transform it using a table interpretation transformation.
00401040  push  ecx
00401041  mov   edx,dword ptr [esp+8]
00401045  push  ebx
00401046  push  ebp
00401047  mov   ebp,dword ptr [esp+14h]
0040104B  push  esi
0040104C  push  edi
0040104D  mov   edi,dword ptr [esp+10h]
00401051  xor   eax,eax
00401053  xor   ebx,ebx
00401055  mov   ecx,1
0040105A  lea   ebx,[ebx]
00401060  lea   esi,[ecx-1]
00401063  cmp   esi,8
00401066  ja    00401060
00401068  jmp   dword ptr [esi*4+4010B8h]
0040106F  xor   dword ptr [edx],ebx
00401071  add   ecx,1
00401074  jmp   00401060
00401076  mov   edi,dword ptr [edx]
00401078  add   ecx,1
0040107B  jmp   00401060
0040107D  cmp   ebp,3
00401080  ja    00401071
00401082  mov   ecx,9
00401087  jmp   00401060
00401089  mov   ebx,edi
0040108B  add   ecx,1
0040108E  jmp   00401060
00401090  sub   ebp,4
00401093  jmp   00401055
00401095  mov   esi,dword ptr [esp+20h]
00401099  xor   dword ptr [edx],esi
0040109B  add   ecx,1
0040109E  jmp   00401060
004010A0  xor   eax,edi
004010A2  add   ecx,1
004010A5  jmp   00401060
004010A7  add   edx,4
004010AA  add   ecx,1
004010AD  jmp   00401060
004010AF  pop   edi
004010B0  pop   esi
004010B1  pop   ebp
004010B2  pop   ebx
004010B3  pop   ecx
004010B4  ret

The function’s jump table:

0x004010B8  0040107d 00401076 00401095 0040106f
0x004010C8  00401089 004010a0 004010a7 00401090
0x004010D8  004010af
Listing 10.3 The data-processing function from Listing 10.2 transformed using a table interpretation transformation.
The function in Listing 10.3 is functionally equivalent to the one in Listing 10.2, but it was obfuscated using a table interpretation transformation. The function was broken down into nine segments that represent the different stages in the original function. The implementation constantly loops through a junction that decides where to go next, depending on the value of ECX. Each code segment sets the value of ECX so that the correct code segment follows. The specific code address that is executed is determined using the jump table, which is included at the end of the listing. Internally, this is implemented using a simple switch statement, but when you think of it logically, this is similar to a little virtual machine that was built just for this particular function. Each “instruction” advances the “instruction pointer”, which is stored in ECX. The actual “code” is the jump table, because that’s where the sequence of operations is stored.