
Eilam E. Reversing: Secrets of Reverse Engineering. 2005


Antireversing Techniques 341

0040103F  . 50             PUSH EAX
00401040    E8 BBFFFFFF    CALL compiler.main

Olly is clearly ignoring the junk byte and using the conditional jump as a marker to the real code starting position, which is why it is providing an accurate listing. It is possible that Olly contains specific code for dealing with these kinds of tricks. Regardless, at this point it becomes clear that you can take advantage of Olly’s use of the jump’s target address to confuse it; if OllyDbg uses conditional jumps to mark the beginning of valid code sequences, you can just create a conditional jump that points to the beginning of the invalid sequence. The following code snippet demonstrates this idea:

_asm
{
    mov  eax, 2
    cmp  eax, 3
    je   Junk                 ; never taken: 2 != 3
    jne  After                ; always taken
Junk:
    _emit 0xf                 ; junk byte, never executed
After:
    mov  eax, [SomeVariable]
    push eax
    call AFunction
}

This sequence is an improved implementation of the same approach. It is more likely to confuse recursive traversal disassemblers because they will have to randomly choose which of the two jumps to use as indicators of valid code. The reason why this is not trivial is that both codes are “valid” from the disassembler’s perspective. This is a theoretical problem: the disassembler has no idea what constitutes valid code. The only measurement it has is whether it finds invalid opcodes, in which case a clever disassembler should probably consider the current starting address as invalid and look for an alternative one.

Let’s look at the listing Olly produces from the above code.

00401031  . B8 02000000      MOV EAX,2
00401036  . 83F8 03          CMP EAX,3
00401039  . 74 02            JE SHORT compiler.0040103D
0040103B  . 75 01            JNZ SHORT compiler.0040103E
0040103D  > 0F8B 45F850E8    JPO E8910888
00401043  ? B9 FFFFFF68      MOV ECX,68FFFFFF
00401048  ? DC60 40          FSUB QWORD PTR DS:[EAX+40]
0040104B  ? 00E8             ADD AL,CH
0040104D  ? 0300             ADD EAX,DWORD PTR DS:[EAX]
0040104F  ? 0000             ADD BYTE PTR DS:[EAX],AL

342 Chapter 10

This time OllyDbg swallows the bait and uses the invalid 0040103D as the starting address from which to disassemble, which produces a meaningless assembly language listing. What’s more, IDA Pro produces an equally unreadable output—both major recursive traversers fall for this trick. Needless to say, linear sweepers such as SoftICE react in the exact same manner.

One recursive traversal disassembler that does not fall for this trick is PEBrowse Professional. Here is the listing produced by PEBrowse:

0x401031: B802000000      mov   eax,0x2
0x401036: 83F803          cmp   eax,0x3
0x401039: 7402            jz    0x40103d      ; (*+0x4)
0x40103B: 7501            jnz   0x40103e      ; (*+0x3)
0x40103D: 0F8B45F850E8    jpo   0xe8910888    ; <==0x00401039(*-0x4)
;***********************************************************************
0x40103E: 8B45F8          mov   eax,dword ptr [ebp-0x8] ; VAR:0x8
0x401041: 50              push  eax
0x401042: E8B9FFFFFF      call  0x401000
;***********************************************************************

Apparently (and it’s difficult to tell whether this is caused by the presence of special heuristics designed to withstand such code sequences or just by a fluke) PEBrowse Professional is trying to disassemble the code from both 40103D and 40103E, and is showing both options. It looks like you’ll need to improve on your technique a little bit: there must not be a direct jump to the valid code address if you’re to fool every disassembler. The solution is to simply perform an indirect jump using a value loaded in a register. The following code confuses every disassembler I’ve tested, including both linear-sweep-based tools and recursive-traversal-based tools.

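_asm
{
    mov  eax, 2
    cmp  eax, 3
    je   Junk                 ; never taken: 2 != 3
    mov  eax, After           ; load the address of the real code
    jmp  eax                  ; and jump to it indirectly
Junk:
    _emit 0xf                 ; junk byte, never executed
After:
    mov  eax, [SomeVariable]
    push eax
    call AFunction
}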

The reason this trick works is quite trivial—because the disassembler has no idea that the sequence mov eax, After, jmp eax is equivalent to jmp After, the disassembler is not even trying to begin disassembling from the After address.
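To make the difference between the direct and the indirect jump concrete, here is a toy model of recursive traversal in C. Everything in it is an assumption made for illustration: the instruction set is invented (it is not IA-32), the opcode values and the names traverse, direct_prog, and indirect_prog are hypothetical, and real disassemblers are far more elaborate. The sketch only follows targets it can read from the instruction bytes, which is the limitation the trick relies on.

```c
#include <stddef.h>

/* Invented toy opcodes, for illustration only. */
enum {
    OP_NOP     = 0x01,  /* 1 byte                                    */
    OP_JCC     = 0x10,  /* 2 bytes: conditional jump, forward rel8   */
    OP_JMP     = 0x11,  /* 2 bytes: direct jump, forward rel8        */
    OP_JMP_REG = 0x12,  /* 1 byte: jump through a register           */
    OP_RET     = 0x13,  /* 1 byte                                    */
    OP_MOV_IMM = 0x20   /* 2 bytes: load an immediate into a register */
};

static size_t insn_len(unsigned char op)
{
    return (op == OP_JCC || op == OP_JMP || op == OP_MOV_IMM) ? 2 : 1;
}

/* Marks starts[i] = 1 for every offset the traversal decodes as the
   beginning of an instruction. starts must hold len zeroed entries. */
void traverse(const unsigned char *code, size_t len, size_t pc,
              unsigned char *starts)
{
    while (pc < len && !starts[pc]) {
        unsigned char op = code[pc];
        size_t n = insn_len(op);
        if (pc + n > len)
            return;                       /* truncated instruction */
        starts[pc] = 1;
        if (op == OP_JCC || op == OP_JMP) {
            /* The target is encoded in the instruction, so follow it. */
            traverse(code, len, pc + n + code[pc + 1], starts);
            if (op == OP_JMP)
                return;                   /* no fall-through */
        } else if (op == OP_JMP_REG || op == OP_RET) {
            return;                       /* target unknown: stop here */
        }
        pc += n;
    }
}

/* je Junk / jne After / Junk: junk byte / After: nop / ret */
const unsigned char direct_prog[] = {
    OP_JCC, 2, OP_JCC, 1, OP_MOV_IMM, OP_NOP, OP_RET
};

/* je Junk / mov reg, After / jmp reg / Junk: junk byte / After: nop / ret */
const unsigned char indirect_prog[] = {
    OP_JCC, 3, OP_MOV_IMM, 6, OP_JMP_REG, OP_MOV_IMM, OP_NOP, OP_RET
};
```

Running traverse from offset 0 on direct_prog marks both the junk offset (4) and the real code offset (5), which is roughly the situation PEBrowse displays. On indirect_prog the real code at offset 6 is never marked, because the only path to it goes through the register jump and the junk byte swallows it as an operand.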


The disadvantage of all of these tricks is that they count on the disassembler being relatively dumb. Luckily, most Windows disassemblers are dumb enough that you can fool them. What would happen if you ran into a clever disassembler that actually analyzes each line of code and traces the flow of data? Such a disassembler would not fall for any of these tricks, because it would detect your opaque predicate; how difficult is it to figure out that a conditional jump that is taken when 2 equals 3 is never actually going to be taken? Moreover, a simple data-flow analysis would expose the fact that the final JMP sequence is essentially equivalent to a JMP After, which would probably be enough to correct the disassembly anyhow.

Still, even a cleverer disassembler could easily be fooled by exporting the real jump addresses into a central, runtime-generated data structure. It would be borderline impossible to perform a global data-flow analysis so comprehensive that it would be able to find the real addresses without actually running the program.
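The idea of a central, runtime-generated structure can be sketched in portable C, with function pointers standing in for raw jump addresses. This is only an illustration under stated assumptions: the names g_targets, init_targets, dispatch, step_a, and step_b are hypothetical, and a real implementation would derive the table layout from something far less obvious than a caller-supplied seed.

```c
#include <stddef.h>

/* Hypothetical handlers standing in for the real jump destinations. */
static int step_a(int x) { return x + 1; }
static int step_b(int x) { return x * 2; }

typedef int (*handler_t)(int);

/* Central table of targets. It is empty in the program image and is
   only filled in while the program runs. */
static handler_t g_targets[2];

/* Fill the table at run time. Which slot receives which handler depends
   on seed, so the layout is not a constant in the executable. */
void init_targets(unsigned seed)
{
    handler_t all[2] = { step_a, step_b };
    for (size_t i = 0; i < 2; i++)
        g_targets[(i + seed) % 2] = all[i];
}

/* Every transfer of control goes through the table. */
int dispatch(size_t slot, int arg)
{
    return g_targets[slot % 2](arg);
}
```

After init_targets(0), dispatch(0, 5) calls step_a; after init_targets(1), the same call reaches step_b. A static tool that sees only dispatch has to recover the table contents before it can tell where control goes.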

Applications

Let’s see how one would use the previous techniques in a real program. I’ve created a simple macro called OBFUSCATE, which adds a little assembly language sequence to a C program (see Listing 10.1). This sequence would temporarily confuse most disassemblers until they resynchronized. The number of instructions it will take to resynchronize depends not only on the specific disassembler used, but also on the specific code that comes after the macro.

#define paste(a, b) a##b
#define pastesymbols(a, b) paste(a, b)
#define OBFUSCATE() \
    _asm { mov  eax, __LINE__ * 0x635186f1          };\
    _asm { cmp  eax, __LINE__ * 0x9cb16d48          };\
    _asm { je   pastesymbols(Junk,__LINE__)         };\
    _asm { mov  eax, pastesymbols(After, __LINE__)  };\
    _asm { jmp  eax                                 };\
    _asm { pastesymbols(Junk, __LINE__):            };\
    _asm { _emit (0xd8 + __LINE__ % 8)              };\
    _asm { pastesymbols(After, __LINE__):           };

Listing 10.1 A simple code obfuscation macro that aims at confusing disassemblers.

This macro was tested on the Microsoft C/C++ compiler (version 13), and contains pseudorandom values to make it slightly more difficult to search and replace (the operands of the MOV and CMP instructions and the junk byte itself are all pseudorandom, calculated using the current code line number). Notice that the junk byte ranges from D8 to DF; these are good opcodes to use because they are all multibyte opcodes. I’m using the __LINE__ macro in order to create unique symbol names in case the macro is used repeatedly in the same function. Each occurrence of the macro will define symbols with different names. The paste and pastesymbols macros are required because otherwise the compiler just won’t properly resolve the __LINE__ constant and will use the string __LINE__ instead.
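The need for two levels of pasting can be checked with any standard C preprocessor, without inline assembly. In the sketch below, paste and pastesymbols are taken from Listing 10.1; the STR helpers and the DIRECT_NAME, UNIQUE_NAME, and *_name functions are hypothetical names added only to make the generated identifiers visible as strings.

```c
#include <string.h>

/* The two helper macros from Listing 10.1. */
#define paste(a, b) a##b
#define pastesymbols(a, b) paste(a, b)

/* Stringizing helpers (hypothetical, used only to inspect the result). */
#define STR_(x) #x
#define STR(x)  STR_(x)

/* Pasting directly: operands of ## are not macro-expanded first,
   so the token __LINE__ is glued on literally. */
#define DIRECT_NAME()  After##__LINE__

/* Pasting through pastesymbols: __LINE__ is expanded to a number
   before paste ever sees it. */
#define UNIQUE_NAME()  pastesymbols(After, __LINE__)

const char *direct_name(void)   { return STR(DIRECT_NAME()); }
const char *unique_name_a(void) { return STR(UNIQUE_NAME()); }
const char *unique_name_b(void) { return STR(UNIQUE_NAME()); }
```

direct_name() yields "After__LINE__" no matter where it is used, so two uses in one function would define the same label twice. The two unique_name functions sit on different source lines and therefore yield different names of the form After followed by a line number.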

If distributed throughout the code, this macro (and you could very easily create dozens of similar variations) would make the reversing process slightly more tedious. The problem is that too many copies of this code would make the program run significantly slower (especially if the macro is placed inside key loops in the program that run many times). Overusing this technique would also make the program significantly larger in terms of both memory consumption and disk space usage.

It’s important to realize that all of these techniques are limited in their effectiveness. They most certainly won’t deter an experienced and determined reverser from reversing or cracking your application, but they might complicate the process somewhat. The manual approach for dealing with this kind of obfuscated code is to tell the disassembler where the code really starts. Advanced disassemblers such as IDA Pro or even OllyDbg’s built-in disassembler allow users to add disassembly hints, which enable the program to properly interpret the code.

The biggest problem with these macros is that they are repetitive, which makes them exceedingly vulnerable to automated tools that just search and destroy them. A dedicated attacker can usually write a program or script that would eliminate them in 20 minutes. Additionally, specific disassemblers have been created that overcome most of these obfuscation techniques (see “Static Disassembly of Obfuscated Binaries” by Christopher Kruegel, et al. [Kruegel]). Is it worth it? In some cases it might be, but if you are looking for powerful antireversing techniques, you should probably stick to the control flow and data-flow obfuscating transformations discussed next.

Code Obfuscation

You probably noticed that the antireversing techniques described so far are all platform-specific “tricks” that in my opinion do nothing more than increase the attacker’s “annoyance factor”. Real code obfuscation involves transforming the code in such a way that makes it significantly less human-readable, while still retaining its functionality. These are typically non-platform-specific transformations that modify the code to hide its original purpose and drown the reverser in a sea of irrelevant information. The level of complexity added by an obfuscating transformation is typically called potency, and can be measured using conventional software complexity metrics such as how many predicates the program contains and the depth of nesting in a particular code sequence.


OBFUSCATION TOOLS

Let’s take a quick look at the existing obfuscation tools that can be used to obfuscate programs on the fly. There are quite a few bytecode obfuscators for Java and .NET, and I will be discussing and evaluating some of them in Chapter 12. As for obfuscation of native IA-32 code, there aren’t that many generic tools that process entire executables and effectively obfuscate them. One notable product that is quite powerful is EXECryptor by StrongBit Technology (www.strongbit.com). EXECryptor processes PE executables and applies a variety of obfuscating transformations on the machine code. Code obfuscated by EXECryptor really becomes significantly more difficult to reverse compared to plain IA-32 code. Another powerful technology is the StarForce suite of copy protection products, developed by StarForce Technologies (www.star-force.com). The StarForce products are more than just powerful obfuscation products: they are full-blown copy protection products that provide either hardware-based or pure software-based copy protection functionality.

Beyond the mere additional complexity introduced by adding extra logic and arithmetic to a program, an obfuscating transformation must be resilient (meaning that it cannot be easily undone). Because many of these transformations add irrelevant instructions that don’t really produce valuable data, it is possible to create deobfuscators. A deobfuscator is a program that implements various data-flow analysis algorithms on an obfuscated program, which sometimes enables it to separate the wheat from the chaff, automatically remove all irrelevant instructions, and restore the code’s original structure. Creating resilient obfuscation transformations that are resistant to deobfuscation is a major challenge and is the primary goal of many obfuscators.

Finally, an obfuscating transformation will typically have an associated cost. This can be in the form of larger code, slower execution times, or increased runtime memory consumption. It is important to realize that some transformations do not incur any kind of runtime costs, because they involve a simple reorganization of the program that is transparent to the machine, but makes the program less human-readable.

In the following sections, I will be going over the common obfuscating transformations. Most of these transformations were meant to be applied programmatically by running an obfuscator on an existing program, either at the source code or the binary level. Still, many of these transformations can be applied manually, while the program is being written or afterward, before it is shipped to end users. Automatic obfuscation is obviously far more effective because it can obfuscate the entire program and not just small parts of it. Additionally, automatic obfuscation is typically performed after the program is compiled, which means that the original source code is not made any less readable (as is the case when obfuscation is performed manually).


Control Flow Transformations

Control flow transformations are transformations that alter the order and flow of a program in a way that reduces its human readability. In “Manufacturing Cheap, Resilient, and Stealthy Opaque Constructs” by Christian Collberg, Clark Thomborson, and Douglas Low [Collberg1], control flow transformations are categorized as computation transformations, aggregation transformations, and ordering transformations.

Computation transformations are aimed at reducing the readability of the code by modifying the program’s original control flow structure in ways that make for a functionally equivalent program that is far more difficult to translate back into a high-level language. This can be done either by removing control flow information from the program or by adding new control flow statements that complicate the program and cannot be easily translated into a high-level language.

Aggregation transformations destroy the high-level structure of the program by breaking the high-level abstractions created by the programmer while the program was being written. The basic idea is to break such abstractions so that the high-level organization of the code becomes senseless.

Ordering transformations are somewhat less powerful transformations that randomize (as much as possible) the order of operations in a program so that its readability is reduced.

Opaque Predicates

Opaque predicates are a fundamental building block for control flow transformations. I’ve already introduced some trivial opaque predicates in the previous section on antidisassembling techniques. The idea is to create a logical statement whose outcome is constant and is known in advance. Consider, for example, the statement if (x + 1 == x). This statement will obviously never be satisfied and can be used to confuse reversers and automated decompilation tools into thinking that the statement is actually a valid part of the program.

With such a simple statement, it is going to be quite easy for both humans and machines to figure out that this is a false statement. The objective is to create opaque predicates that would be difficult to distinguish from the actual program code and whose behavior would be difficult to predict without actually stepping into the code. The interesting thing about opaque predicates (and about several other aspects of code obfuscation as well) is that confusing an automated deobfuscator is often an entirely different problem from confusing a human reverser.
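As a small step up from if (x + 1 == x), the following C sketch uses an arithmetic identity that is slightly less obvious at a glance: the product of two consecutive integers is always even. The function names opaquely_true and scale are hypothetical, and this is only an illustration of the shape of an opaque predicate, not a construct that would resist a real deobfuscator (an optimizing compiler may even fold it away).

```c
#include <stdint.h>

/* Always true: x and x + 1 are consecutive, so one of them is even.
   Unsigned wraparound (mod 2^32) does not change the parity of the
   product, so this holds for every 32-bit input. */
static int opaquely_true(uint32_t x)
{
    return (x * (x + 1u)) % 2u == 0u;
}

/* Hypothetical routine guarded by the predicate. The first branch is
   the real code; the second is bogus and never executes. */
uint32_t scale(uint32_t value, uint32_t runtime_input)
{
    if (opaquely_true(runtime_input))
        return value * 3u;              /* real code */
    return (value << 7) ^ 0x5a5au;      /* bogus code */
}
```

Whatever runtime_input is, scale(value, runtime_input) returns value * 3, yet a reader of the disassembly sees two plausible-looking paths that both depend on a runtime value.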

Consider, for example, the concurrency-based opaque predicates suggested in [Collberg1]. The idea is to create one or more threads that are responsible for constantly generating new random values and storing them in a globally accessible data structure. The values stored in those data structures consistently adhere to simple rules (such as being lower or higher than a certain constant). The threads that contain the actual program code can access this global data structure and check that those values are within the expected range. It would make quite a challenge for an automated deobfuscator to figure this structure out and pinpoint such fake control flow statements. The concurrent access to the data would hugely complicate the matter for an automated deobfuscator (though an obfuscator would probably only be aware of such concurrency in a bytecode language such as Java). In contrast, a person would probably immediately suspect a thread that constantly generates random numbers and stores them in a global data structure. It would probably seem very fishy to a human reverser.

Now consider a far simpler arrangement where several bogus data members are added into an existing program data structure. These members are constantly accessed and modified by code that’s embedded right into the program. Those members adhere to some simple numeric rules, and the opaque predicates in the program rely on these rules. Such an implementation might be relatively easy to detect for a powerful deobfuscator (depending on the specific platform), but could be quite a challenge for a human reverser.
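A minimal C sketch of this arrangement might look as follows. The structure, its salt member, the rule "salt is always 1 modulo 4", and all function names are assumptions made up for illustration; a real program would hide the rule among many more members and updates.

```c
#include <stdint.h>

/* Hypothetical record with one genuine and one bogus member. */
struct account {
    uint32_t balance;   /* genuine data */
    uint32_t salt;      /* bogus: always kept equal to 1 modulo 4 */
};

void account_init(struct account *a)
{
    a->balance = 0;
    a->salt = 1;
}

void account_deposit(struct account *a, uint32_t amount)
{
    a->balance += amount;
    /* Looks like real bookkeeping, but only ever adds a multiple of 4,
       so salt % 4 == 1 is preserved (also across wraparound). */
    a->salt += 4u * (amount | 1u);
}

uint32_t account_fee(const struct account *a)
{
    if (a->salt % 4u == 1u)             /* opaque predicate: always true */
        return a->balance / 100u;       /* real code */
    return a->balance ^ a->salt;        /* bogus code, never executed */
}
```

The salt member changes on every deposit, so it does not look like a constant, yet the predicate in account_fee can never fail.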

Generally speaking, opaque predicates are more effective when implemented in lower-level machine-code programs than in higher-level bytecode programs, because they are far more difficult to detect in low-level machine code. The process of automatically identifying individual data structures in a native machine-code program is quite difficult, which means that in most cases opaque predicates cannot be automatically detected or removed. That’s because performing global data-flow analysis on low-level machine code is not always simple or even possible. For reversers, the only way to deal with opaque predicates implemented on low-level native machine-code programs is to try and manually locate them by looking at the code. This is possible, but not very easy.

In contrast, higher-level bytecode executables typically contain far more details regarding the specific data structures used in the program. That makes it much easier to implement data-flow analysis and write automated code that detects opaque predicates.

The bottom line is that you should probably focus most of your antireversing efforts on confusing the human reversers when developing in lower-level languages and on automated decompilers/deobfuscators when working with bytecode languages such as Java.

For a detailed study of opaque constructs and various implementation ideas see [Collberg1] and General Method of Program Code Obfuscation by Gregory Wroblewski [Wroblewski].


Confusing Decompilers

Because bytecode-based languages are highly detailed, there are numerous decompilers that are highly effective for decompiling bytecode executables. One of the primary design goals of most bytecode obfuscators is to confuse decompilers, so that the code cannot be easily restored to a highly detailed source code. One trick that does wonders is to modify the program binary so that the bytecode contains statements that cannot be translated back into the original high-level language. The example given in A Taxonomy of Obfuscating Transformations by Christian Collberg, Clark Thomborson, and Douglas Low [Collberg2] is the Java programming language, where the high-level language does not have the goto statement, but the Java bytecode does. This means that it’s possible to add goto statements into the bytecode in order to completely break the program’s flow graph, so that a decompiler cannot later reconstruct it (because it contains instructions that cannot be translated back to Java).

In native processor languages such as IA-32 machine code, decompilation is such a complex and fragile process that any kind of obfuscation transformation could easily cause a decompiler to fail or produce meaningless code. Consider, for example, what would happen if a decompiler ran into the OBFUSCATE macro from the previous section.

Table Interpretation

Converting a program or a function into a table interpretation layout is a highly powerful obfuscation approach that, if done right, can repel both deobfuscators and human reversers. The idea is to break a code sequence into multiple short chunks and have the code loop through a conditional code sequence that decides to which of the code sequences to jump at any given moment. This dramatically reduces the readability of the code because it completely hides any kind of structure within it. Any code structures, such as logical statements or loops, are buried inside this unintuitive structure.

As an example, consider the simple data processing function in Listing 10.2.

00401000  push  esi
00401001  push  edi
00401002  mov   edi,dword ptr [esp+10h]
00401006  xor   eax,eax
00401008  xor   esi,esi
0040100A  cmp   edi,3
0040100D  jbe   0040103A
0040100F  mov   edx,dword ptr [esp+0Ch]
00401013  add   edi,0FFFFFFFCh
00401016  push  ebx
00401017  mov   ebx,dword ptr [esp+18h]
0040101B  shr   edi,2
0040101E  push  ebp
0040101F  add   edi,1
00401022  mov   ecx,dword ptr [edx]
00401024  mov   ebp,ecx
00401026  xor   ebp,esi
00401028  xor   ebp,ebx
0040102A  mov   dword ptr [edx],ebp
0040102C  xor   eax,ecx
0040102E  add   edx,4
00401031  sub   edi,1
00401034  mov   esi,ecx
00401036  jne   00401022
00401038  pop   ebp
00401039  pop   ebx
0040103A  pop   edi
0040103B  pop   esi
0040103C  ret

Listing 10.2 A simple data processing function that XORs a data block with a parameter passed to it and writes the result back into the data block.

Let us now take this function and transform it using a table interpretation transformation.

00401040  push  ecx
00401041  mov   edx,dword ptr [esp+8]
00401045  push  ebx
00401046  push  ebp
00401047  mov   ebp,dword ptr [esp+14h]
0040104B  push  esi
0040104C  push  edi
0040104D  mov   edi,dword ptr [esp+10h]
00401051  xor   eax,eax
00401053  xor   ebx,ebx
00401055  mov   ecx,1
0040105A  lea   ebx,[ebx]
00401060  lea   esi,[ecx-1]
00401063  cmp   esi,8
00401066  ja    00401060
00401068  jmp   dword ptr [esi*4+4010B8h]
0040106F  xor   dword ptr [edx],ebx
00401071  add   ecx,1
00401074  jmp   00401060
00401076  mov   edi,dword ptr [edx]
00401078  add   ecx,1
0040107B  jmp   00401060
0040107D  cmp   ebp,3
00401080  ja    00401071
00401082  mov   ecx,9
00401087  jmp   00401060
00401089  mov   ebx,edi
0040108B  add   ecx,1
0040108E  jmp   00401060
00401090  sub   ebp,4
00401093  jmp   00401055
00401095  mov   esi,dword ptr [esp+20h]
00401099  xor   dword ptr [edx],esi
0040109B  add   ecx,1
0040109E  jmp   00401060
004010A0  xor   eax,edi
004010A2  add   ecx,1
004010A5  jmp   00401060
004010A7  add   edx,4
004010AA  add   ecx,1
004010AD  jmp   00401060
004010AF  pop   edi
004010B0  pop   esi
004010B1  pop   ebp
004010B2  pop   ebx
004010B3  pop   ecx
004010B4  ret

The function’s jump table:

0x004010B8  0040107d  00401076  00401095  0040106f
0x004010C8  00401089  004010a0  004010a7  00401090
0x004010D8  004010af

Listing 10.3 The data-processing function from Listing 10.2 transformed using a table interpretation transformation.

The function in Listing 10.3 is functionally equivalent to the one in Listing 10.2, but it was obfuscated using a table interpretation transformation. The function was broken down into nine segments that represent the different stages in the original function. The implementation constantly loops through a junction that decides where to go next, depending on the value of ECX. Each code segment sets the value of ECX so that the correct code segment follows. The specific code address that is executed is determined using the jump table, which is included at the end of the listing. Internally, this is implemented using a simple switch statement, but when you think of it logically, this is similar to a little virtual machine that was built just for this particular function. Each “instruction” advances the “instruction pointer”, which is stored in ECX. The actual “code” is the jump table, because that’s where the sequence of operations is stored.
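For readers who prefer source code to disassembly, here is a C sketch of both forms. It is one reading of Listings 10.2 and 10.3, not the original source: the function names, parameter names, and the assumption that the size is a byte count and the data consists of 32-bit words are all inferred from the assembly. The state variable plays the role of ECX, and each case corresponds to one entry of the jump table.

```c
#include <stdint.h>

/* Straightforward form, as read from Listing 10.2. */
uint32_t xor_block_plain(uint32_t *data, uint32_t size, uint32_t key)
{
    uint32_t checksum = 0, prev = 0;
    while (size > 3) {
        uint32_t cur = *data;
        *data = cur ^ prev ^ key;   /* write the transformed word back */
        checksum ^= cur;            /* accumulate the original words */
        prev = cur;
        data++;
        size -= 4;
    }
    return checksum;
}

/* Table interpretation form, as read from Listing 10.3. */
uint32_t xor_block_table(uint32_t *data, uint32_t size, uint32_t key)
{
    uint32_t checksum = 0, prev = 0, cur = 0;
    int state = 1;                  /* the "instruction pointer" (ECX) */
    for (;;) {
        switch (state) {
        case 1: state = (size > 3) ? 2 : 9;   break;
        case 2: cur = *data;        state++;  break;
        case 3: *data ^= key;       state++;  break;
        case 4: *data ^= prev;      state++;  break;
        case 5: prev = cur;         state++;  break;
        case 6: checksum ^= cur;    state++;  break;
        case 7: data++;             state++;  break;
        case 8: size -= 4;          state = 1; break;
        case 9: return checksum;
        default: return checksum;   /* not reachable with states 1..9 */
        }
    }
}
```

Both functions leave the buffer in the same state and return the same value; the second one simply hides the loop and its body behind the dispatch on state.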