Eilam E.Reversing.Secrets of reverse engineering.2005


Deciphering Code Structures 481

Table A.1 (continued)

LEFT OPERAND	RIGHT OPERAND	RELATION BETWEEN OPERANDS	FLAGS AFFECTED	COMMENTS
X < 0	Y < 0	X > Y	OF = 0 SF = 0 ZF = 0	This is the same as the preceding case, with both X and Y containing negative integers.
X > 0	Y > 0	X < Y	OF = 0 SF = 1 ZF = 0	An SF = 1 represents a negative result, which (with OF being unset) indicates that Y is larger than X.
X < 0	Y >= 0	X < Y	OF = 0 SF = 1 ZF = 0	This is the same as the preceding case, except that X is negative and Y is positive. Again, the combination of SF = 1 with OF = 0 represents that Y is greater than X.
X < 0	Y > 0	X < Y	OF = 1 SF = 0 ZF = 0	This is another similar case where X is negative and Y is positive, except that here an overflow is generated, and the result is positive.
X > 0	Y < 0	X > Y	OF = 1 SF = 1 ZF = 0	When X is positive and Y is a negative integer low enough to generate a positive overflow, both OF and SF are set.

482 Appendix A

In looking at Table A.1, the ground rules for identifying the results of signed integer comparisons become clear. Here’s a quick summary of the basic rules:

■ Anytime ZF is set you know that the subtraction resulted in a zero, which means that the operands are equal.

■ When all three flags are zero, you know that the first operand is greater than the second, because you have a positive result and no overflow.

■ When there is a negative result and no overflow (SF=1 and OF=0), you know that the second operand is larger than the first.

■ When there is an overflow and a positive result, the second operand must be larger than the first, because you essentially have a negative result that is too small to be represented by the destination operand (hence the overflow).

■ When you have an overflow and a negative result, the first operand must be larger than the second, because you essentially have a positive result that is too large to be represented by the destination operand (hence the overflow).

While it is not generally necessary to memorize the comparison outcome tables (Tables A.1 and A.2), it still makes sense to go over them and make sure that you understand how each flag is used in the operand comparison process. This helps while reversing code that uses flags in unconventional ways. Knowing how flags are set during comparison and subtraction makes it much easier to follow logical sequences and quickly decipher their meaning.
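The summary rules above can be checked with a small C model of a 32-bit CMP. This is only a sketch: the names `SignedFlags`, `cmp_signed`, and `relation_from_flags` are made up for illustration, and the flag formulas are the usual two's-complement definitions rather than anything taken from the tables themselves.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flags of interest after a 32-bit CMP/SUB (illustrative type). */
typedef struct {
    bool zf; /* zero flag: the result is zero */
    bool sf; /* sign flag: top bit of the result */
    bool of; /* overflow flag: the signed result did not fit in 32 bits */
} SignedFlags;

/* Hypothetical helper that models the flags of "cmp x, y" (x - y). */
static SignedFlags cmp_signed(int32_t x, int32_t y)
{
    /* Subtract as unsigned so that wraparound is well defined in C. */
    uint32_t ux = (uint32_t)x;
    uint32_t uy = (uint32_t)y;
    uint32_t r = ux - uy;
    SignedFlags f;

    f.zf = (r == 0);
    f.sf = (r >> 31) != 0;
    /* Signed overflow: the operands differ in sign and the result's
       sign differs from the sign of the left operand. */
    f.of = (((ux ^ uy) & (ux ^ r)) >> 31) != 0;
    return f;
}

/* Applies the summary rules: -1 if x < y, 0 if equal, 1 if x > y. */
static int relation_from_flags(SignedFlags f)
{
    if (f.zf)
        return 0;
    /* SF == OF covers "positive result, no overflow" and
       "negative result with overflow"; both mean x > y. */
    return (f.sf == f.of) ? 1 : -1;
}
```

For example, `cmp_signed(INT32_MIN, 1)` produces OF = 1 with SF = 0, the "overflow and a positive result" rule, and `relation_from_flags` reports that the second operand is larger.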

Unsigned Comparisons

Table A.2 demonstrates the behavior of the CMP instruction when comparing unsigned operands. Remember that, just like Table A.1, the following table also applies to the SUB instruction.

Table A.2 Unsigned Subtraction Outcome Table for CMP and SUB Instructions (X represents the left operand, while Y represents the right operand)

RELATION BETWEEN OPERANDS	FLAGS AFFECTED	COMMENTS
X = Y	CF = 0 ZF = 1	The two operands are equal, so the result is zero.
X < Y	CF = 1 ZF = 0	Y is larger than X so the result is lower than 0, which generates an overflow (CF=1).
X > Y	CF = 0 ZF = 0	X is larger than Y, so the result is above zero, and no overflow is generated (CF=0).


In looking at Table A.2, the ground rules for identifying the results of unsigned integer comparisons become clear, and it’s obvious that unsigned operands are easier to deal with. Here’s a quick summary of the basic rules:

■ Anytime ZF is set you know that the subtraction resulted in a zero, which means that the operands are equal.

■ When both flags are zero, you know that the first operand is greater than the second, because you have a positive result and no overflow.

■ When you have an overflow (CF=1), you know that the second operand is greater than the first, because the result is too low to be represented by the destination operand.
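The unsigned rules can be modeled the same way. The sketch below assumes 32-bit operands and derives CF from a 64-bit subtraction; the names `UnsignedFlags`, `cmp_unsigned`, and `unsigned_relation` are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flags of interest after an unsigned 32-bit CMP/SUB (illustrative type). */
typedef struct {
    bool zf; /* zero flag: the result is zero */
    bool cf; /* carry flag: the subtraction needed a borrow */
} UnsignedFlags;

/* Hypothetical helper that models the flags of "cmp x, y" (x - y). */
static UnsignedFlags cmp_unsigned(uint32_t x, uint32_t y)
{
    /* Widen to 64 bits: when x < y the difference wraps and bit 32
       (along with every higher bit) ends up set, which is the borrow. */
    uint64_t wide = (uint64_t)x - (uint64_t)y;
    UnsignedFlags f;

    f.zf = ((uint32_t)wide == 0);
    f.cf = ((wide >> 32) & 1u) != 0;
    return f;
}

/* Applies the summary rules: -1 if x < y, 0 if equal, 1 if x > y. */
static int unsigned_relation(UnsignedFlags f)
{
    if (f.zf)
        return 0;
    return f.cf ? -1 : 1;
}
```

Only two flags are involved here, which is why unsigned comparisons are the simpler case.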

The Conditional Codes

Conditional codes are suffixes added to certain conditional instructions in order to define the conditions governing their execution.

It is important for reversers to understand these mnemonics because virtually every conditional code sequence will include one or more of them. Sometimes their meaning will be very intuitive—take a look at the following code:

cmp	eax, 7
je	SomePlace

In this example, it is obvious that JE (which is jump if equal) will cause a jump to SomePlace if EAX equals 7. This is one of the more obvious cases where understanding the specifics of instructions such as CMP and of the conditional codes is really unnecessary. Unfortunately for us reversers, there are quite a few cases where the conditional codes are used in unintuitive ways. Understanding how the conditional codes use the flags is important for properly understanding program logic. The following sections list each condition code and explain which flags it uses and why.

The conditional codes listed in the following sections are listed as standalone codes, even though they are normally used as instruction suffixes to conditional instructions. Conditional codes are never used alone.

Signed Conditional Codes

Table A.3 presents the IA-32 conditional codes defined for signed operands. Note that in all signed conditional codes, overflows are detected using the overflow flag (OF). This is because the arithmetic instructions use OF for indicating signed overflows.

Table A.3 Signed Conditional Codes Table for CMP and SUB Instructions

MNEMONICS	FLAGS	SATISFIED WHEN	COMMENTS
If Greater (G) / If Not Less or Equal (NLE)	ZF = 0 AND ((OF = 0 AND SF = 0) OR (OF = 1 AND SF = 1))	X > Y	Use ZF to confirm that the operands are unequal. Also use SF to check for either a positive result without an overflow, indicating that the first operand is greater, or a negative result with an overflow. The latter would indicate that the second operand was a low enough negative integer to produce a result too large to be represented by the destination (hence the overflow).
If Greater or Equal (GE) / If Not Less (NL)	(OF = 0 AND SF = 0) OR (OF = 1 AND SF = 1)	X >= Y	This code is similar to the preceding code with the exception that it doesn’t check ZF for zero, so it would also be satisfied by equal operands.
If Less (L) / If Not Greater or Equal (NGE)	(OF = 1 AND SF = 0) OR (OF = 0 AND SF = 1)	X < Y	Check for OF = 1 AND SF = 0 indicating that X was lower than Y and the result was too low to be represented by the destination operand (you got an overflow and a positive result). The other case is OF = 0 AND SF = 1. This is a similar case, except that no overflow is generated, and the result is negative.
If Less or Equal (LE) / If Not Greater (NG)	ZF = 1 OR ((OF = 1 AND SF = 0) OR (OF = 0 AND SF = 1))	X <= Y	This code is the same as the preceding code with the exception that it also checks ZF and so would also be satisfied if the operands are equal.
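Each pair of OR'ed cases in the FLAGS column of Table A.3 reduces to a single test: the "greater" cases hold when SF equals OF, and the "less" cases hold when SF differs from OF. The following C sketch expresses the four signed codes that way; the `Flags` type and the `cond_*` function names are illustrative and not part of any real API.

```c
#include <stdbool.h>

/* Flag state after a CMP or SUB (illustrative type). */
typedef struct {
    bool zf; /* zero flag */
    bool sf; /* sign flag */
    bool of; /* overflow flag */
} Flags;

/* G / NLE: operands unequal, and SF matches OF. */
static bool cond_g(Flags f)  { return !f.zf && (f.sf == f.of); }

/* GE / NL: SF matches OF; ZF is not consulted. */
static bool cond_ge(Flags f) { return f.sf == f.of; }

/* L / NGE: SF differs from OF. */
static bool cond_l(Flags f)  { return f.sf != f.of; }

/* LE / NG: operands equal, or SF differs from OF. */
static bool cond_le(Flags f) { return f.zf || (f.sf != f.of); }
```

Reading the codes as "SF == OF" or "SF != OF" is often quicker than recalling the individual overflow cases.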

Unsigned Conditional Codes

Table A.4 presents the IA-32 conditional codes defined for unsigned operands. Note that in all unsigned conditional codes, overflows are detected using the carry flag (CF). This is because the arithmetic instructions use CF for indicating unsigned overflows.

Table A.4 Unsigned Conditional Codes

MNEMONICS	FLAGS	SATISFIED WHEN	COMMENTS
If Above (A) / If Not Below or Equal (NBE)	CF = 0 AND ZF = 0	X > Y	Use CF to confirm that the second operand is not larger than the first (because then CF would be set), and ZF to confirm that the operands are unequal.
If Above or Equal (AE) / If Not Below (NB) / If Not Carry (NC)	CF = 0	X >= Y	This code is similar to the above with the exception that it only checks CF, so it would also be satisfied by equal operands.
If Below (B) / If Not Above or Equal (NAE) / If Carry (C)	CF = 1	X < Y	When CF is set we know that the second operand is greater than the first because an overflow could only mean that the result was negative.
If Below or Equal (BE) / If Not Above (NA)	CF = 1 OR ZF = 1	X <= Y	This code is the same as the above with the exception that it also checks ZF, and so would also be satisfied if the operands are equal.
If Equal (E) / If Zero (Z)	ZF = 1	X = Y	ZF is set so we know that the result was zero, meaning that the operands are equal.
If Not Equal (NE) / If Not Zero (NZ)	ZF = 0	X != Y	ZF is unset so we know that the result was nonzero, which implies that the operands are unequal.
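The practical reason to tell the signed codes (Table A.3) from the unsigned ones (Table A.4) is that the same bit patterns can compare differently. The sketch below uses two hypothetical helpers to compare identical 32-bit values both ways. A compiler would typically emit a signed code such as L for the first function and an unsigned code such as B for the second, though the exact instructions depend on the compiler and its settings.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed view of the operands: corresponds to the L (less) condition. */
static bool less_as_signed(int32_t x, int32_t y)
{
    return x < y;
}

/* Unsigned view of the same bits: corresponds to the B (below) condition.
   Converting int32_t to uint32_t is well defined in C (modulo 2^32). */
static bool below_as_unsigned(int32_t x, int32_t y)
{
    return (uint32_t)x < (uint32_t)y;
}
```

With x = -1 and y = 1, the signed comparison is true, while the unsigned one is false because -1 becomes 0xFFFFFFFF. Spotting which family of codes follows a CMP is therefore a useful hint about the signedness of the variables involved.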

Control Flow & Program Layout

The vast majority of logic in the average computer program is implemented through branches. These are the most common programming constructs, regardless of the high-level language. A program tests one or more logical conditions, and branches to a different part of the program based on the result of the logical test. Identifying branches and figuring out their meaning and purpose is one of the most basic code-level reversing tasks.

The following sections introduce the most popular control flow constructs and program layout elements. I start with a discussion of procedures and how they are represented in assembly language and proceed to a discussion of the most common control flow constructs and to a comparison of their low-level representations with their high-level representations. The constructs discussed are single branch conditionals, two-way conditionals, n-way conditionals, and loops, among others.

Deciphering Functions

The most basic building block in a program is the procedure, or function. From a reversing standpoint, functions are very easy to detect because of function prologues and epilogues. These are standard initialization and cleanup sequences that compilers generate for nearly every function. The particulars of these sequences depend on the specific compiler used and on other issues such as calling convention. Calling conventions are discussed in the section on calling conventions in Appendix C.

On IA-32 processors, functions are nearly always called using the CALL instruction, which pushes the return address (the address of the instruction that follows the CALL) onto the stack and jumps to the function address. This makes it easy to distinguish function calls from other unconditional jumps.

Internal Functions

Internal functions are called from the same binary executable that contains their implementation. When compilers generate an internal function call sequence they usually just embed the function’s address into the code, which makes it very easy to detect. The following is a common internal function call.

call	CodeSectionAddress

Imported Functions

An imported function call takes place when a module is making a call into a function implemented in another binary executable. This is important because during the compilation process the compiler has no idea where the imported function can be found and is therefore unable to embed the function’s address into the code (as is usually done with internal functions).

Imported function calls are implemented using the Import Directory and Import Address Table (see Chapter 3). The import directory is used at runtime to resolve the function’s name to a matching function in the target executable, and the IAT stores the actual address of the target function. The caller then loads the function’s pointer from the IAT and calls it. The following is an example of a typical imported function call:

call	DWORD PTR [IAT_Pointer]

Notice the DWORD PTR that precedes the pointer—it is important because it tells the CPU to jump not to the address of IAT_Pointer but to the address that is pointed to by IAT_Pointer. Also keep in mind that the pointer will usually not be named (depending on the disassembler) and will simply contain an address pointing into the IAT.
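The indirection can be pictured in C as a call through a function pointer stored in a table slot. This is only an analogy and not how any real loader is written: `iat_slot`, `resolve_imports`, `imported_double`, and `call_imported` are hypothetical names, and the "imported" function lives in the same file purely to keep the sketch self-contained.

```c
#include <stddef.h>

/* Signature of the hypothetical imported function. */
typedef int (*ImportedFn)(int);

/* Stand-in for a function implemented in another module. */
static int imported_double(int value)
{
    return value * 2;
}

/* Hypothetical one-entry "IAT": empty until the imports are resolved. */
static ImportedFn iat_slot = NULL;

/* Plays the loader's role: looks up the import and stores its address. */
static void resolve_imports(void)
{
    iat_slot = imported_double;
}

/* The caller never embeds the target address. It loads the pointer
   from the slot and calls through it, comparable to
   "call DWORD PTR [IAT_Pointer]". */
static int call_imported(int value)
{
    return iat_slot(value);
}
```

The caller's code stays the same regardless of where the target function ends up in memory; only the contents of the slot change.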

Detecting imported calls is easy because except for these types of calls, functions are rarely called indirectly through a hard-coded function pointer. I would, however, recommend that you determine the location of the IAT early on in reversing sessions and use it to confirm that a function is indeed imported. Locating the IAT is quite easy and can be done with a variety of different tools that dump the module’s PE header and provide the address of the IAT. Tools for dumping PE headers are discussed in Chapter 4.

Some disassemblers and debuggers will automatically indicate an imported function call (by internally checking the IAT address), thus saving you the trouble.

Single-Branch Conditionals

The most basic form of logic in most programs consists of a condition and an ensuing conditional branch. In high-level languages, this is written as an if statement with a condition and a block of conditional code that gets executed if the condition is satisfied. Here’s a quick sample:

if (SomeVariable == 0)
	CallAFunction();

From a low-level perspective, implementing this statement requires a logical check to determine whether SomeVariable contains 0 or not, followed by code that skips the conditional block by performing a conditional jump if SomeVariable is nonzero. Figure A.1 depicts how this code snippet would typically map into assembly language.

The assembly language code in Figure A.1 uses TEST to perform a simple zero check for EAX. TEST works by performing a bitwise AND operation on its two operands (here EAX with itself) and setting flags to reflect the result (the actual result is discarded). This is an effective way to test whether EAX is zero or nonzero because the AND of a value with itself is zero only when the value is zero, and TEST sets the zero flag (ZF) accordingly. Note that the condition is reversed: in the source code, the program was checking whether SomeVariable equals zero, but the compiler reversed the condition so that the conditional instruction (in this case a jump) checks whether SomeVariable is nonzero. This stems from the fact that the compiler-generated binary code is organized in memory in the same order as it is organized in the source code. Therefore if SomeVariable is nonzero, the compiler must skip the conditional code section and go straight to the code section that follows.

The bottom line is that in single-branch conditionals you must always reverse the meaning of the conditional jump in order to obtain the true high-level logical intention.

Assembly Language Code:
	mov	eax, [SomeVariable]
	test	eax, eax
	jnz	AfterCondition
	call	CallAFunction
AfterCondition:
	...

High-Level Code:
	if (SomeVariable == 0)
		CallAFunction();
	...

Figure A.1 High-level/low-level view of a single branch conditional sequence.
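One way to get used to the reversed condition is to write the assembly-level flow back in C using goto, so that each statement lines up with one instruction. The sketch below does this for the single-branch case. `CallAFunction` is given a trivial body (it counts calls) only so that the example is self-contained, and `single_branch` and `calls_made` are names invented for the illustration.

```c
/* Counts how many times the conditional block ran (illustrative only). */
static int calls_made = 0;

/* Stand-in body for the function called from the conditional block. */
static void CallAFunction(void)
{
    calls_made++;
}

/* C rendering of the low-level flow of a single-branch conditional. */
static void single_branch(int SomeVariable)
{
    if (SomeVariable != 0)      /* test eax, eax / jnz AfterCondition */
        goto AfterCondition;    /* reversed test: skip the block */
    CallAFunction();            /* call CallAFunction */
AfterCondition:
    return;                     /* execution continues after the block */
}
```

Reading the jump as "skip the block when the variable is nonzero" and then inverting it recovers the original `if (SomeVariable == 0)`.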

Two-Way Conditionals

Another fundamental feature of high-level languages is the two-way conditional, typically implemented using the if-else keyword pair. A two-way conditional differs from a single-branch conditional in that if the condition is not satisfied, the program executes an alternative code block and only then proceeds to the code that follows the ‘if-else’ statement. These constructs are called two-way conditionals because the flow of the program is split into one of two different possible paths: the one in the ‘if’ block, or the one in the ‘else’ block.

Let’s take a quick look at how compilers implement two-way conditionals. First of all, in two-way conditionals the conditional branch points to the ‘else’ block and not to the code that follows the conditional statement. Second, the condition itself is almost always reversed (so that the jump to the ‘else’ block only takes place when the condition is not satisfied), and the primary conditional block is placed right after the conditional jump (so that the conditional code gets executed if the condition is satisfied). The conditional block always ends with an unconditional jump that essentially skips the ‘else’ block—this is a good indicator for identifying two-way conditionals. The ‘else’ block is placed at the end of the conditional block, right after that unconditional jump. Figure A.2 shows what an average if-else statement looks like in assembly language.

Assembly Language Code:
	cmp	[SomeVariable], 7
	jne	ElseBlock	; condition reversed
	call	SomeFunction
	jmp	AfterConditionalBlock
ElseBlock:
	call	SomeOtherFunction
AfterConditionalBlock:
	...

High-Level Code:
	if (SomeVariable == 7)
		SomeFunction();
	else
		SomeOtherFunction();

Figure A.2 High-level/low-level view of a two-way conditional.

Notice the unconditional JMP right after the first function call. That is where the first conditional block skips the ‘else’ block and jumps to the code that follows. The basic pattern to look for when trying to detect a simple ‘if-else’ statement in a disassembled program is a conditional jump where the code that follows it ends with an unconditional jump.
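The same layout can be written in C with goto, as a rough sketch of the two-way pattern described above. The function name `two_way` and the return values 1 and 2 are invented here to stand in for the calls to SomeFunction and SomeOtherFunction, so that the path taken can be observed.

```c
/* C rendering of the low-level flow of an if-else statement.
   Returns 1 when the 'if' block ran and 2 when the 'else' block ran. */
static int two_way(int SomeVariable)
{
    int path;

    if (SomeVariable != 7)          /* cmp ..., 7 / jne ElseBlock (reversed) */
        goto ElseBlock;
    path = 1;                       /* stands in for: call SomeFunction */
    goto AfterConditionalBlock;     /* jmp AfterConditionalBlock */
ElseBlock:
    path = 2;                       /* stands in for: call SomeOtherFunction */
AfterConditionalBlock:
    return path;
}
```

The unconditional goto at the end of the first block is the marker to look for: without it, execution would fall through into the ‘else’ block.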

Most high-level languages also support a slightly more complex version of a two-way conditional where a separate conditional statement is used for each of the two code blocks. This is usually implemented by combining the ‘if’ and ‘else-if’ keywords, each with its own condition. This way, if the first condition is not satisfied, the program jumps to the second condition, evaluates that one, and simply skips the entire conditional block if neither condition is satisfied. If one of the conditions is satisfied, the corresponding conditional block is executed, and execution just flows into the next program statement. Figure A.3 provides a high-level/low-level view of this type of control flow construct.
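As a rough illustration of that flow, the following C sketch spells out an ‘if’/‘else-if’ pair with goto. The two conditions (less than 10, equal to 345), the label names, and the return values are assumptions chosen for the example and are not taken from Figure A.3.

```c
/* C rendering of the low-level flow of an if / else-if statement.
   Returns 1 if the first block ran, 2 if the second ran, 0 if neither. */
static int if_else_if(int SomeVariable)
{
    int path = 0;

    if (SomeVariable >= 10)         /* first condition, reversed */
        goto AlternateTest;
    path = 1;                       /* first conditional block */
    goto AfterIfBlock;              /* skip the second condition entirely */
AlternateTest:
    if (SomeVariable != 345)        /* second condition, reversed */
        goto AfterIfBlock;
    path = 2;                       /* second conditional block */
AfterIfBlock:
    return path;
}
```

Both conditions are reversed, and a failed first test jumps to the second test rather than directly to an ‘else’ block.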

Multiple-Alternative Conditionals

Sometimes programmers create long statements with multiple conditions, where each condition leads to the execution of a different code block. One way to implement this in high-level languages is by using a ‘switch’ block (discussed later), but it is also possible to do this using conventional ‘if’ statements. Programmers sometimes must use ‘if’ statements because ‘switch’ blocks don’t support complex conditions, only comparisons against hard-coded constants. In contrast, a sequence of ‘else-if’ statements allows any kind of complex condition on each of the blocks.
