
Pro CSharp And The .NET 2.0 Platform (2005) [eng]


CHAPTER 15: UNDERSTANDING CIL AND THE ROLE OF DYNAMIC ASSEMBLIES

Defining Type Constructors in CIL

The CTS supports both instance-level and class-level (static) constructors. In terms of CIL, instance-level constructors are represented using the .ctor token, while a static-level constructor is expressed via .cctor (class constructor). Both of these CIL tokens must be qualified using the rtspecialname (runtime special name) and specialname attributes. Simply put, these attributes mark a member whose name is treated in a special way by the runtime and by a given .NET language. For example, in C#, constructors do not define a return type; however, in terms of CIL, the return value of a constructor is indeed void:

.class public MyBaseClass
{
  .field private string stringField
  .field private int32 intField

  .method public hidebysig specialname rtspecialname
    instance void .ctor(string s, int32 i) cil managed
  {
    // TODO: Add implementation code...
  }
}

Note that the .ctor directive has been qualified with the instance attribute (as it is not a static constructor). The cil managed attributes denote that the scope of this method contains CIL code, rather than unmanaged code, which may be used during platform invocation requests.
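If you would like to confirm these names from the C# side, reflection exposes the same metadata. The following is a minimal sketch, not part of the original listing: the static constructor, the instanceCount field, and the CtorNames helper class are assumptions added purely for illustration.

```csharp
using System;
using System.Reflection;

class MyBaseClass
{
    private static int instanceCount;   // hypothetical field, set by the static constructor
    private string stringField;
    private int intField;

    // Class-level (static) constructor: surfaces in metadata as .cctor.
    static MyBaseClass() { instanceCount = 0; }

    // Instance-level constructor: surfaces in metadata as .ctor.
    public MyBaseClass(string s, int i) { stringField = s; intField = i; }
}

static class CtorNames
{
    // Print the metadata name and the two "special name" flags of a constructor.
    public static void Describe(ConstructorInfo c)
    {
        bool rtSpecial = (c.Attributes & MethodAttributes.RTSpecialName) != 0;
        Console.WriteLine("{0}: specialname={1}, rtspecialname={2}, static={3}",
            c.Name, c.IsSpecialName, rtSpecial, c.IsStatic);
    }

    static void Main()
    {
        Type t = typeof(MyBaseClass);
        Describe(t.GetConstructor(new Type[] { typeof(string), typeof(int) }));
        Describe(t.TypeInitializer);   // the static constructor
    }
}
```

The first line of output describes .ctor and the second describes .cctor; both report the specialname and rtspecialname flags as set.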

Defining Properties in CIL

Properties and methods also have specific CIL representations. By way of an example, if MyBaseClass were updated to support a public property named TheString, you would author the following CIL (note again the use of the specialname attribute):

.class public MyBaseClass
{
  ...
  .method public hidebysig specialname
    instance string get_TheString() cil managed
  {
    // TODO: Add implementation code...
  }

  .method public hidebysig specialname
    instance void set_TheString(string 'value') cil managed
  {
    // TODO: Add implementation code...
  }

  .property instance string TheString()
  {
    .get instance string MyNamespace.MyBaseClass::get_TheString()
    .set instance void MyNamespace.MyBaseClass::set_TheString(string)
  }
}


Recall that in terms of CIL, a property maps to a pair of methods that take get_ and set_ prefixes. The .property directive makes use of the related .get and .set directives to map property syntax to the correct “specially named” methods.
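The get_/set_ mapping can also be observed from C# through reflection. This is a small sketch under the assumption that MyBaseClass exposes TheString as an ordinary C# property; the PropertyAccessors helper class is a name invented for this example.

```csharp
using System;
using System.Reflection;

class MyBaseClass
{
    private string stringField = "";

    public string TheString
    {
        get { return stringField; }
        set { stringField = value; }
    }
}

static class PropertyAccessors
{
    static void Main()
    {
        PropertyInfo p = typeof(MyBaseClass).GetProperty("TheString");
        MethodInfo getter = p.GetGetMethod();   // the method named by .get
        MethodInfo setter = p.GetSetMethod();   // the method named by .set

        Console.WriteLine("{0} (specialname={1})", getter.Name, getter.IsSpecialName);
        Console.WriteLine("{0} (specialname={1})", setter.Name, setter.IsSpecialName);

        // Property syntax and a direct call to the accessor reach the same code.
        MyBaseClass obj = new MyBaseClass();
        obj.TheString = "via property";
        Console.WriteLine(getter.Invoke(obj, null));
    }
}
```

The accessor names printed are get_TheString and set_TheString, which are the same "specially named" methods that the .property directive points to.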

Note: The previous property definitions will fail to compile, given that you have not yet implemented the mutator logic.

Defining Member Parameters

Now assume that you wish to define methods that take some number of arguments. In a nutshell, specifying arguments in CIL is (more or less) identical to doing so in C#. For example, each argument is defined by specifying its data type followed by the parameter name. Furthermore, like C#, CIL provides a way to define input, output, and pass-by-reference parameters. As well, CIL allows you to define a parameter array argument (aka the C# params keyword) as well as optional parameters (which are not supported in C#, but are used in VB .NET).

To illustrate the process of defining parameters in raw CIL, assume you wish to build a method that takes an int32 (by value), an int32 (by reference), a [mscorlib]System.Collections.ArrayList, and a single output parameter (of type int32). In terms of C#, this method would look something like the following:

public static void MyMethod(int inputInt,
  ref int refInt, ArrayList ar, out int outputInt)
{
  outputInt = 0; // Just to satisfy the C# compiler...
}

If you were to map this method into CIL terms, you would find that C# reference parameters are marked with an ampersand (&) suffixed to the parameter’s underlying data type (int32&). Output parameters also make use of the & suffix, but they are further qualified using the CIL [out] token. Also notice that if the parameter is a reference type (in this case, the [mscorlib]System.Collections.ArrayList type), the class token is prefixed to the data type (not to be confused with the .class directive!):

.method public hidebysig static void MyMethod(int32 inputInt, int32& refInt,
  class [mscorlib]System.Collections.ArrayList ar, [out] int32& outputInt) cil managed
{
  ...
}
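To see the & suffix and the [out] flag without opening ildasm.exe, you can query the parameter metadata through reflection. The sketch below reuses the MyMethod signature shown earlier; the ParameterModes host class is an assumption made for illustration.

```csharp
using System;
using System.Collections;
using System.Reflection;

static class ParameterModes
{
    public static void MyMethod(int inputInt,
        ref int refInt, ArrayList ar, out int outputInt)
    {
        outputInt = 0; // Just to satisfy the C# compiler...
    }

    static void Main()
    {
        MethodInfo m = typeof(ParameterModes).GetMethod("MyMethod");
        foreach (ParameterInfo p in m.GetParameters())
        {
            // By-reference types print with a trailing '&', mirroring int32& in CIL.
            // IsOut corresponds to the CIL [out] token.
            Console.WriteLine("{0}: type={1}, byref={2}, out={3}",
                p.Name, p.ParameterType, p.ParameterType.IsByRef, p.IsOut);
        }
    }
}
```

Both refInt and outputInt report a by-reference type, but only outputInt reports the out flag, which matches the distinction between int32& and [out] int32& in the CIL signature.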

Examining CIL Opcodes

The final aspect of CIL code you’ll examine in this chapter has to do with the role of various operational codes (opcodes). Recall that an opcode is simply a CIL token used to build the implementation logic for a given member. The complete set of CIL opcodes (which is fairly large) can be grouped into the following broad categories:

• Opcodes that control program flow
• Opcodes that evaluate expressions
• Opcodes that access values in memory (via parameters, local variables, etc.)


To provide some insight to the world of member implementation via CIL, Table 15-5 defines some of the more useful opcodes that are directly related to member implementation logic, grouped by related functionality.

Table 15-5. Various Implementation-Specific CIL Opcodes

| Opcodes | Meaning in Life |
|---|---|
| add, sub, mul, div, rem | These CIL opcodes allow you to add, subtract, multiply, and divide two values (rem returns the remainder of a division operation). |
| and, or, not, xor | These CIL opcodes allow you to perform bitwise operations on two values (not operates on a single value). |
| ceq, cgt, clt | These CIL opcodes allow you to compare two values on the stack in various manners, for example: ceq: compare for equality; cgt: compare for greater than; clt: compare for less than. |
| box, unbox | These CIL opcodes are used to convert between reference types and value types. |
| ret | This CIL opcode is used to exit a method and return a value to the caller (if necessary). |
| beq, bgt, ble, blt, switch | These CIL opcodes (in addition to many other related opcodes) are used to control branching logic within a method, for example: beq: branch to code label if equal; bgt: branch to code label if greater than; ble: branch to code label if less than or equal to; blt: branch to code label if less than. All of the branch-centric opcodes require that you specify a CIL code label to jump to if the result of the test is true. |
| call | This CIL opcode is used to call a member on a given type. |
| newarr, newobj | These CIL opcodes allow you to allocate a new array or new object type into memory (respectively). |

The next broad category of CIL opcodes (a subset of which is shown in Table 15-6) is used to load (push) values onto the virtual execution stack. Note how these load-specific opcodes take an ld (load) prefix.

Table 15-6. The Primary Stack-Centric Opcodes of CIL

| Opcode | Meaning in Life |
|---|---|
| ldarg (with numerous variations) | Loads a method’s argument onto the stack. In addition to the generic ldarg (which works in conjunction with a given index that identifies the argument), there are numerous other variations. For example, ldarg opcodes that have a numerical suffix (ldarg.0 through ldarg.3) hard-code which argument to load. |
| ldc (with numerous variations) | Loads a constant value onto the stack. Variations of the ldc opcode allow you to hard-code the data type using the CIL constant notation shown in Table 15-4 (ldc.i4, for an int32) as well as the data type and value (ldc.i4.5, to load an int32 with the value of 5). |
| ldfld (with numerous variations) | Loads the value of an instance-level field onto the stack. |
| ldloc (with numerous variations) | Loads the value of a local variable onto the stack. |
| ldobj | Copies the value found at a supplied memory address and places it on the stack. |
| ldstr | Loads a string value onto the stack. |

In addition to the set of load-specific opcodes, CIL provides numerous opcodes that explicitly pop the topmost value off the stack. As shown over the first few examples in this chapter, popping a value off the stack typically involves storing the value into temporary local storage for further use (such as a parameter for an upcoming method invocation). Given this, note how many opcodes that pop the current value off the virtual execution stack take an st (store) prefix. Table 15-7 hits the highlights.

Table 15-7. Various Pop-Centric Opcodes

| Opcode | Meaning in Life |
|---|---|
| pop | Removes the value currently on top of the evaluation stack, but does not bother to store the value |
| starg | Stores the value on top of the stack into the method argument at a specified index |
| stloc (with numerous variations) | Pops the current value from the top of the evaluation stack and stores it in a local variable list at a specified index |
| stobj | Copies a value of a specified type from the evaluation stack into a supplied memory address |
| stsfld | Replaces the value of a static field with a value from the evaluation stack |

Do be aware that various CIL opcodes will implicitly pop values off the stack to perform the task at hand. For example, if you are attempting to subtract two numbers using the sub opcode, it should be clear that sub will have to pop off the next two available values before it can perform the calculation. Once the calculation is complete, the result (surprise, surprise) is pushed onto the stack once again.
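One way to watch sub consume two values and leave one behind is to emit the opcodes at runtime with System.Reflection.Emit.DynamicMethod. Treat the following as an illustrative sketch; the BinaryOp delegate and the SubDemo class are names made up for this example.

```csharp
using System;
using System.Reflection.Emit;

public delegate int BinaryOp(int a, int b);

static class SubDemo
{
    public static BinaryOp BuildSubtract()
    {
        DynamicMethod dm = new DynamicMethod("Subtract", typeof(int),
            new Type[] { typeof(int), typeof(int) });
        ILGenerator il = dm.GetILGenerator();

        il.Emit(OpCodes.Ldarg_0);   // stack: a
        il.Emit(OpCodes.Ldarg_1);   // stack: a, b
        il.Emit(OpCodes.Sub);       // pops b and a, pushes (a - b)
        il.Emit(OpCodes.Ret);       // pops the result and returns it

        return (BinaryOp)dm.CreateDelegate(typeof(BinaryOp));
    }

    static void Main()
    {
        BinaryOp subtract = BuildSubtract();
        Console.WriteLine(subtract(10, 3)); // prints 7
    }
}
```

Note that the value pushed first (a) is the left-hand operand, so the order in which you load values matters for non-commutative opcodes such as sub and div.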

Considering the .maxstack Directive

When you write method implementations using raw CIL, you need to be mindful of a special directive named .maxstack. As its name suggests, .maxstack establishes the maximum number of values that may be pushed onto the stack at any given time during the execution of the method. The good news is that the .maxstack directive has a default value (8), which should be safe for a vast majority of methods you may be authoring. However, if you wish to be very explicit, you are able to manually calculate the maximum number of values on the stack and define this value explicitly:

.method public hidebysig instance void Speak() cil managed
{
  // During the scope of this method, exactly
  // 1 value (the string literal) is on the stack.
  .maxstack 1
  ldstr "Hello there..."
  call void [mscorlib]System.Console::WriteLine(string)
  ret
}
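If you are curious which .maxstack value a compiler recorded for an existing method, the MethodBody type in System.Reflection reports it. This is a sketch only: the Talker class is a hypothetical C# counterpart of the Speak() method above, and the number printed depends on the compiler and its settings, so none is quoted here.

```csharp
using System;
using System.Reflection;

class Talker
{
    public void Speak()
    {
        Console.WriteLine("Hello there...");
    }
}

static class MaxStackDemo
{
    // Returns the .maxstack value stored in the compiled method body.
    public static int MaxStackOf(Type type, string methodName)
    {
        MethodBody body = type.GetMethod(methodName).GetMethodBody();
        return body.MaxStackSize;
    }

    static void Main()
    {
        Console.WriteLine("Speak() .maxstack = {0}",
            MaxStackOf(typeof(Talker), "Speak"));
    }
}
```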


Declaring Local Variables in CIL

Let’s first check out how to declare a local variable. Assume you wish to build a method in CIL named MyLocalVariables() that takes no arguments and returns void. Within the method, you wish to define three local variables of type System.String, System.Int32, and System.Object. In C#, this member would appear as so (recall that locally scoped variables do not receive a default value and should be set to an initial state before further use):

public static void MyLocalVariables()
{
  string myStr = "CIL me dude...";
  int myInt = 33;
  object myObj = new object();
}

If you were to construct MyLocalVariables() directly in CIL, you could author the following:

.method public hidebysig static void MyLocalVariables() cil managed
{
  .maxstack 8
  // Define three local variables.
  .locals init ([0] string myStr, [1] int32 myInt, [2] object myObj)

  // Load a string onto the virtual execution stack.
  ldstr "CIL me dude..."
  // Pop off current value and store in local variable [0].
  stloc.0

  // Load a constant of type 'i4'
  // (shorthand for int32) set to the value 33.
  ldc.i4 33
  // Pop off current value and store in local variable [1].
  stloc.1

  // Create a new object and place on stack.
  newobj instance void [mscorlib]System.Object::.ctor()
  // Pop off current value and store in local variable [2].
  stloc.2
  ret
}

As you can see, the first step taken to allocate local variables in raw CIL is to make use of the .locals directive, which is paired with the init attribute. Within the scope of the related parentheses, your goal is to associate a given numerical index to each variable (seen here as [0], [1], and [2]). As you can see, each index is identified by its data type and an optional variable name. Once the local variables have been defined, you load a value onto the stack (using the various load-centric opcodes) and store the value within the local variable (using the various storage-centric opcodes).
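The same load-then-store rhythm can be reproduced at runtime with an ILGenerator, where DeclareLocal() plays the role of the .locals directive. The sketch below is an assumption-laden variation on MyLocalVariables(): it returns the value of myInt (the original returns void) so that the effect of stloc is observable, and the IntProducer delegate and LocalsDemo class are invented names.

```csharp
using System;
using System.Reflection.Emit;

public delegate int IntProducer();

static class LocalsDemo
{
    public static IntProducer Build()
    {
        DynamicMethod dm = new DynamicMethod("MyLocalVariables",
            typeof(int), Type.EmptyTypes);
        ILGenerator il = dm.GetILGenerator();

        // Locals receive their indices in declaration order.
        LocalBuilder myStr = il.DeclareLocal(typeof(string)); // [0]
        LocalBuilder myInt = il.DeclareLocal(typeof(int));    // [1]
        LocalBuilder myObj = il.DeclareLocal(typeof(object)); // [2]

        il.Emit(OpCodes.Ldstr, "CIL me dude...");  // push the string
        il.Emit(OpCodes.Stloc, myStr);             // pop into local [0]
        il.Emit(OpCodes.Ldc_I4, 33);               // push the int32 constant 33
        il.Emit(OpCodes.Stloc, myInt);             // pop into local [1]
        il.Emit(OpCodes.Newobj,
            typeof(object).GetConstructor(Type.EmptyTypes)); // push a new object
        il.Emit(OpCodes.Stloc, myObj);             // pop into local [2]

        // Not in the original method: reload local [1] and return it.
        il.Emit(OpCodes.Ldloc, myInt);
        il.Emit(OpCodes.Ret);

        return (IntProducer)dm.CreateDelegate(typeof(IntProducer));
    }

    static void Main()
    {
        Console.WriteLine(Build()()); // prints 33
    }
}
```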

Mapping Parameters to Local Variables in CIL

You have already seen how to declare local variables in raw CIL using the .locals init directive; however, you have yet to see exactly how to map incoming parameters to local variables. Consider the following static C# method:

public static int Add(int a, int b)
{
  return a + b;
}


This innocent-looking method has a lot to say in terms of CIL. First, the incoming arguments (a and b) must be pushed onto the virtual execution stack using the ldarg (load argument) opcode. Next, the add opcode will be used to pop the next two values off the stack and find the summation, and store the value on the stack yet again. Finally, this sum is popped off the stack and returned to the caller via the ret opcode. If you were to disassemble this C# method using ildasm.exe, you would find numerous additional tokens injected by csc.exe, but the crux of the CIL code is quite simple:

.method public hidebysig static int32 Add(int32 a, int32 b) cil managed
{
  .maxstack 2
  ldarg.0   // Load 'a' onto the stack.
  ldarg.1   // Load 'b' onto the stack.
  add       // Add both values.
  ret
}

The Hidden this Reference

Notice that the two incoming arguments (a and b) are referenced within the CIL code using their indexed position (index 0 and index 1), given that the virtual execution stack begins indexing at position 0.

One thing to be very mindful of when you are examining or authoring raw CIL code is that every nonstatic method (whether or not it declares arguments of its own) automatically receives an implicit additional parameter, which is a reference to the current object (think the C# this keyword). Given this, if the Add() method were defined as nonstatic

// No longer static!
public int Add(int a, int b)
{
  return a + b;
}

the incoming a and b arguments are loaded using ldarg.1 and ldarg.2 (rather than the expected ldarg.0 and ldarg.1 opcodes). Again, the reason is that slot 0 actually contains the implicit this reference. Consider the following pseudo-code:

// This is JUST pseudo-code!
.method public hidebysig static int32 AddTwoIntParams(
  MyClass_HiddenThisPointer this, int32 a, int32 b) cil managed
{
  ldarg.0   // Load MyClass_HiddenThisPointer onto the stack.
  ldarg.1   // Load 'a' onto the stack.
  ldarg.2   // Load 'b' onto the stack.
  ...
}
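The pseudo-code can be approximated with real, runnable code: a DynamicMethod whose first parameter is the object itself behaves like an instance method once the resulting delegate is bound to a target. Everything named here (the Calc class, its Offset field, the BinaryOp delegate, and the HiddenThisDemo class) is hypothetical; the Offset field exists only to prove that slot 0 really holds the object.

```csharp
using System;
using System.Reflection.Emit;

public delegate int BinaryOp(int a, int b);

public class Calc
{
    public int Offset;
    public Calc(int offset) { Offset = offset; }
}

static class HiddenThisDemo
{
    public static BinaryOp BuildBoundAdd(Calc target)
    {
        // Parameter 0 is the object reference, exactly like the hidden 'this'.
        DynamicMethod dm = new DynamicMethod("Add", typeof(int),
            new Type[] { typeof(Calc), typeof(int), typeof(int) }, typeof(Calc));
        ILGenerator il = dm.GetILGenerator();

        il.Emit(OpCodes.Ldarg_1);   // 'a' lives in slot 1, not slot 0
        il.Emit(OpCodes.Ldarg_2);   // 'b' lives in slot 2
        il.Emit(OpCodes.Add);       // a + b
        il.Emit(OpCodes.Ldarg_0);   // slot 0: the object reference
        il.Emit(OpCodes.Ldfld, typeof(Calc).GetField("Offset"));
        il.Emit(OpCodes.Add);       // (a + b) + this.Offset
        il.Emit(OpCodes.Ret);

        // Binding the delegate to 'target' supplies argument 0 on every call.
        return (BinaryOp)dm.CreateDelegate(typeof(BinaryOp), target);
    }

    static void Main()
    {
        BinaryOp add = BuildBoundAdd(new Calc(100));
        Console.WriteLine(add(2, 3)); // prints 105
    }
}
```

The caller passes only two integers, yet the emitted code sees three arguments, which is the same shift from ldarg.0/ldarg.1 to ldarg.1/ldarg.2 described above.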

 

Representing Iteration Constructs in CIL

Iteration constructs in the C# programming language are represented using the for, foreach, while, and do keywords, each of which has a specific representation in CIL. Consider the classic for loop:

public static void CountToTen()
{
  for(int i = 0; i < 10; i++)
    ;
}


Now, as you may recall, the br opcodes (br, blt, and so on) are used to control a break in flow when some condition has been met. In this example, you have set up a condition in which the for loop should break out of its cycle when the local variable i is equal to the value of 10. With each pass, the value of 1 is added to i, at which point the test condition is yet again evaluated.

Also recall that when you make use of any of the CIL branching opcodes, you will need to define a specific code label (or two) that marks the location to jump to when the condition is indeed true. Given these points, ponder the following (augmented) CIL code generated via ildasm.exe (including the autogenerated code labels):

.method public hidebysig static void CountToTen() cil managed
{
  .maxstack 2
  .locals init ([0] int32 i)    // Init the local integer 'i'.
  IL_0000: ldc.i4.0             // Load this value onto the stack.
  IL_0001: stloc.0              // Store this value at index '0'.
  IL_0002: br.s IL_0008         // Jump to IL_0008.
  IL_0004: ldloc.0              // Load value of variable at index 0.
  IL_0005: ldc.i4.1             // Load the value '1' on the stack.
  IL_0006: add                  // Add current value on the stack at index 0.
  IL_0007: stloc.0
  IL_0008: ldloc.0              // Load value at index '0'.
  IL_0009: ldc.i4.s 10          // Load value of '10' onto the stack.
  IL_000b: blt.s IL_0004        // Less than? If so, jump back to IL_0004
  IL_000d: ret
}

In a nutshell, this CIL code begins by defining the local int32, loading the value 0 onto the stack, and storing it in that local. At this point, you jump back and forth between code labels IL_0008 and IL_0004, each time bumping the value of i by 1 and testing to see whether i is still less than the value 10. If it is not, you exit the method.
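Code labels are easier to appreciate when you define them yourself. The following sketch rebuilds the loop with ILGenerator.DefineLabel() and MarkLabel(); unlike the void CountToTen() above, this hypothetical variant returns the final value of i so that the result can be checked, and the IntProducer and LoopDemo names are assumptions.

```csharp
using System;
using System.Reflection.Emit;

public delegate int IntProducer();

static class LoopDemo
{
    public static IntProducer BuildCountToTen()
    {
        DynamicMethod dm = new DynamicMethod("CountToTen",
            typeof(int), Type.EmptyTypes);
        ILGenerator il = dm.GetILGenerator();

        LocalBuilder i = il.DeclareLocal(typeof(int));
        Label body = il.DefineLabel();   // plays the role of IL_0004
        Label test = il.DefineLabel();   // plays the role of IL_0008

        il.Emit(OpCodes.Ldc_I4_0);       // i = 0
        il.Emit(OpCodes.Stloc, i);
        il.Emit(OpCodes.Br, test);       // jump straight to the test

        il.MarkLabel(body);              // loop body: i = i + 1
        il.Emit(OpCodes.Ldloc, i);
        il.Emit(OpCodes.Ldc_I4_1);
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Stloc, i);

        il.MarkLabel(test);              // test: is i < 10?
        il.Emit(OpCodes.Ldloc, i);
        il.Emit(OpCodes.Ldc_I4, 10);
        il.Emit(OpCodes.Blt, body);      // if so, branch back to the body

        il.Emit(OpCodes.Ldloc, i);       // otherwise fall through and return i
        il.Emit(OpCodes.Ret);

        return (IntProducer)dm.CreateDelegate(typeof(IntProducer));
    }

    static void Main()
    {
        Console.WriteLine(BuildCountToTen()()); // prints 10
    }
}
```

The sketch uses the long forms br and blt rather than br.s and blt.s; the short forms shown in the ildasm.exe output are a size optimization for branch targets that are close by.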

Building a .NET Assembly with CIL

Now that you’ve taken a tour of the syntax and semantics of raw CIL, it’s time to solidify your current understanding by building a .NET application using nothing but CIL and your text editor of choice. Specifically, your application will consist of a privately deployed, single-file *.dll that contains two class type definitions, and a console-based *.exe that interacts with these types.

Building CILCars.dll

The first order of business is to build the *.dll to be consumed by the client. Open a text editor and create a new *.il file named CILCars.il. This single-file assembly will make use of two external .NET binaries, and you can begin creating your CIL code file as so:

// Reference mscorlib.dll and
// System.Windows.Forms.dll
.assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89 )
  .ver 2:0:0:0
}
.assembly extern System.Windows.Forms
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89 )
  .ver 2:0:0:0
}

// Define the single-file assembly.
.assembly CILCars
{
  .hash algorithm 0x00008004
  .ver 1:0:0:0
}
.module CILCars.dll

As mentioned, this assembly will contain two class types. The first type, CILCar, defines two points of field data and a custom constructor. The second type, CILCarInfo, defines a single static method named Display(), which takes CILCar as a parameter and returns void. Both types are in the CILCars namespace. In terms of CIL, CILCar can be implemented as so:

// Implementation of CILCars.CILCar type.
.namespace CILCars
{
  .class public auto ansi beforefieldinit CILCar extends [mscorlib]System.Object
  {
    // The field data of the CILCar.
    .field public string petName
    .field public int32 currSpeed

    // The custom constructor simply allows the caller
    // to assign the field data.
    .method public hidebysig specialname rtspecialname
      instance void .ctor(int32 c, string p) cil managed
    {
      .maxstack 8

      // Load first arg onto the stack and call base class ctor.
      ldarg.0   // 'this' object, not the int32!
      call instance void [mscorlib]System.Object::.ctor()

      // Now load first and second args onto the stack.
      ldarg.0   // 'this' object
      ldarg.1   // int32 arg

      // Store topmost stack (int 32) member in currSpeed field.
      stfld int32 CILCars.CILCar::currSpeed

      // Load string arg and store in petName field.
      ldarg.0   // 'this' object
      ldarg.2   // string arg
      stfld string CILCars.CILCar::petName
      ret
    }
  }
}

Keeping in mind that the real first argument for any nonstatic member is the current object reference, the first block of CIL simply loads the object reference and calls the base class constructor. Next, you push the incoming constructor arguments onto the stack and store them into the type’s field data using the stfld (store in field) opcode.

Next, you need to implement the second type in this namespace: CILCarInfo. The meat of the type is found within the static Display() method. In a nutshell, the role of this method is to take the incoming CILCar parameter, extract the values of its field data, and display it in a Windows Forms message box. Here is the complete implementation of CILCarInfo, with analysis to follow:


.class public auto ansi beforefieldinit CILCarInfo extends [mscorlib]System.Object
{
  .method public hidebysig static void Display(class CILCars.CILCar c) cil managed
  {
    .maxstack 8

    // We need a local string variable.
    .locals init ([0] string caption)

    // Load string and the incoming CILCar onto the stack.
    ldstr "{0}'s speed is:"
    ldarg.0

    // Now place the value of the CILCar's petName on the
    // stack and call the static String.Format() method.
    ldfld string CILCars.CILCar::petName
    call string [mscorlib]System.String::Format(string, object)
    stloc.0

    // Now load the value of the currSpeed field and get its string
    // representation (note call to ToString() ).
    ldarg.0
    ldflda int32 CILCars.CILCar::currSpeed
    call instance string [mscorlib]System.Int32::ToString()
    ldloc.0

    // Now call the MessageBox.Show() method with loaded values.
    call valuetype [System.Windows.Forms]System.Windows.Forms.DialogResult
      [System.Windows.Forms]System.Windows.Forms.MessageBox::Show(string, string)
    pop
    ret
  }
}

Although the amount of CIL code is a bit more than you see in the implementation of CILCar, things are still rather straightforward. First, given that you are defining a static method, you don’t have to be concerned with the hidden object reference (thus, the ldarg.0 opcode really does load the incoming CILCar argument).

The method begins by loading a string ("{0}'s speed is:") onto the stack, followed by the CILCar argument. Once these two values are in place, you load the value of the petName field and call the static System.String.Format() method to substitute the curly bracket placeholder with the CILCar’s pet name.

The same general procedure takes place when processing the currSpeed field, but note that you use the ldflda opcode, which loads the address of the field onto the stack. At this point, you call System.Int32.ToString() to transform the value at said address into a string type. Finally, once both strings have been formatted as necessary, you call the MessageBox.Show() method.

At this point, you are able to compile your new *.dll using ilasm.exe with the following command:

ilasm /dll CILCars.il

and verify the contained CIL using peverify.exe:

peverify CILCars.dll


Building CILCarClient.exe

Now you can build a simple *.exe assembly that will:

• Make a CILCar type.
• Pass the type into the static CILCarInfo.Display() method.

Create a new *.il file and define external references to mscorlib.dll and CILCars.dll (don’t forget to place a copy of this .NET assembly in the client’s application directory!). Next, define a single type (Program) that manipulates the CILCars.dll assembly. Here’s the complete code:

// External assembly refs.
.assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89 )
  .ver 2:0:0:0
}
.assembly extern CILCars
{
  .ver 1:0:0:0
}

// Our executable assembly.
.assembly CILCarClient
{
  .hash algorithm 0x00008004
  .ver 0:0:0:0
}
.module CILCarClient.exe

// Implementation of Program type
.namespace CILCarClient
{
  .class private auto ansi beforefieldinit Program extends [mscorlib]System.Object
  {
    .method private hidebysig static void Main(string[] args) cil managed
    {
      // Marks the entry point of the *.exe.
      .entrypoint
      .maxstack 8

      // Declare a local CILCar type and push
      // values on the stack for ctor call.
      .locals init ([0] class [CILCars]CILCars.CILCar myCilCar)
      ldc.i4 55
      ldstr "Junior"

      // Make new CilCar; store and load reference.
      newobj instance void [CILCars]CILCars.CILCar::.ctor(int32, string)
      stloc.0
      ldloc.0