Pro CSharp 2008 And The .NET 3.5 Platform [eng]

632 CHAPTER 19 UNDERSTANDING CIL AND THE ROLE OF DYNAMIC ASSEMBLIES

Table 19-3. Various Attributes Used in Conjunction with the .class Directive

Attributes | Meaning in Life
public, private, nested assembly, nested famandassem, nested family, nested famorassem, nested public, nested private | CIL defines various attributes that are used to specify the visibility of a given type. As you can see, raw CIL offers numerous possibilities other than those offered by C#.
abstract, sealed | These two attributes may be tacked onto a .class directive to define an abstract class or sealed class, respectively.
auto, sequential, explicit | These attributes are used to instruct the CLR how to lay out field data in memory. For class types, the default layout flag (auto) is appropriate.
extends, implements | These attributes allow you to define the base class of a type (via extends) or implement an interface on a type (via implements).

Defining and Implementing Interfaces in CIL

As odd as it may seem, interface types are defined in CIL using the .class directive. However, when the .class directive is adorned with the interface attribute, the type is realized as a CTS interface type. Once an interface has been defined, it may be bound to a class or structure type using the CIL implements attribute:

.namespace MyNamespace

{

// An interface definition.

.class public interface IMyInterface {}

.class public MyBaseClass {}

// MyDerivedClass now implements IMyInterface.

.class public MyDerivedClass extends MyNamespace.MyBaseClass

implements MyNamespace.IMyInterface {}

}

As you recall from Chapter 9, interfaces can function as the base interface to other interface types in order to build interface hierarchies. However, contrary to what you might be thinking, the extends attribute cannot be used to derive interface A from interface B. The extends attribute is used only to qualify a type’s base class. When you wish to extend an interface, you will make use of the implements attribute yet again:

// Extending interfaces in terms of CIL.

.class public interface IMyInterface {}

.class public interface IMyOtherInterface implements MyNamespace.IMyInterface {}

Defining Structures in CIL

The .class directive can be used to define a CTS structure if the type extends System.ValueType. As well, the .class directive is qualified with the sealed attribute (given that structures can never be a base structure to other value types). If you attempt to do otherwise, ilasm.exe will issue a compiler error.


// A structure definition is always sealed.

.class public sealed MyStruct

extends [mscorlib]System.ValueType{}

Do be aware that CIL provides a shorthand notation to define a structure type. If you use the value attribute, the new type will derive the type from [mscorlib]System.ValueType automatically. Therefore, you could define MyStruct as follows:

// Shorthand notation for declaring a structure.

.class public sealed value MyStruct{}

Defining Enums in CIL

.NET enumerations (as you recall) derive from System.Enum, which is a System.ValueType (and therefore must also be sealed). When you wish to define an enum in terms of CIL, simply extend [mscorlib]System.Enum:

// An enum.

.class public sealed MyEnum extends [mscorlib]System.Enum{}

Like a structure definition, enumerations can be defined with a shorthand notation using the enum attribute:

// Enum shorthand.

.class public sealed enum MyEnum{}

You’ll see how to specify the name/value pairs of an enumeration in just a moment.

Note The other fundamental .NET type, the delegate, also has a specific CIL representation. See Chapter 11 for full details.

Defining Generics in CIL

Generic types also have a specific representation in the syntax of CIL. Recall from Chapter 10 that a given generic type or generic member may have one or more type parameters. For example, the List<T> type has a single type parameter, while Dictionary<TKey, TValue> has two. In terms of CIL, the number of type parameters is specified using a backward-leaning single tick, `, followed by a numerical value representing the number of type parameters. Like C#, the actual value of the type parameters is encased within angled brackets.

Note On most keyboards, the ` character can be found on the key above the Tab key (and to the left of the 1 key).

For example, assume you wish to create a List<T> type, where T is of type System.Int32. In CIL, you would author the following:

// In C#: List<int> myInts = new List<int>();
newobj instance void class [mscorlib]System.Collections.Generic.List`1<int32>::.ctor()


Notice that this generic class is defined as List`1<int32>, as List<T> has a single type parameter. However, if you needed to define a Dictionary<string, int> type, you would do so as follows:

// In C#: Dictionary<string, int> d = new Dictionary<string, int>();
newobj instance void class [mscorlib]System.Collections.Generic.Dictionary`2<string,int32>::.ctor()

As another example, if you have a generic type that uses another generic type as a type parameter, you would author CIL code such as the following:

// In C#: List<List<int>> myInts = new List<List<int>>();
newobj instance void class [mscorlib]System.Collections.Generic.List`1<class [mscorlib]System.Collections.Generic.List`1<int32>>::.ctor()

Finally, when you are authoring a class, a structure, or an interface that is itself generic, you would make use of this same syntax at the point of type declaration. For example:

// A custom generic class with 1 type parameter.

.class public MyGenericClass`1<T>{}
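To make the declaration above a little more concrete, here is a minimal sketch of a generic class that actually uses its type parameter. The member names item and GetItem are hypothetical, and the !T and !0 notation for referring to a type parameter is assumed from what ildasm.exe displays for compiled generic types; it has not been introduced in this chapter.

```cil
.assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89)
  .ver 2:0:0:0
}
.assembly CILTypes { .ver 1:0:0:0 }
.module CILTypes.dll

.namespace MyNamespace
{
  // A custom generic class with 1 type parameter.
  .class public MyGenericClass`1<T> extends [mscorlib]System.Object
  {
    // Hypothetical field whose type is the type parameter T.
    .field private !T item

    // Hypothetical accessor that returns the field.
    .method public hidebysig instance !T GetItem() cil managed
    {
      .maxstack 1
      ldarg.0   // Load "this".
      ldfld !0 class MyNamespace.MyGenericClass`1<!0>::item
      ret
    }
  }
}
```

Here !T names the type parameter inside the declaring class, while !0 refers to the same parameter by position in the field reference.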

Compiling the CILTypes.il file

Even though you have not yet added any members or implementation code to the types you have defined, you are able to compile this *.il file into a .NET DLL assembly (which you must do, as you have not specified a Main() method). Open up a command prompt and enter the following command to ilasm.exe:

ilasm /dll CilTypes.il

Once you have done so, you are able to open your binary in ildasm.exe (see Figure 19-5).

Figure 19-5. The CILTypes.dll assembly


Once you have confirmed the contents of your assembly, run peverify.exe against it. Notice that you are issued a number of errors, given that all your types are completely empty (see Figure 19-6).

Figure 19-6. Empty types yield verification errors!

To understand how to populate a type with content, you first need to examine the fundamental data types of CIL.

.NET Base Class Library, C#, and CIL Data Type Mappings

Table 19-4 illustrates how a .NET base class type maps to the corresponding C# keyword, and how each C# keyword maps into raw CIL. As well, Table 19-4 documents the shorthand constant notations used for each CIL type. As you will see in just a moment, these constants are often referenced by numerous CIL opcodes.

Table 19-4. Mapping .NET Base Class Types to C# Keywords, and C# Keywords to CIL

.NET Base Class Type | C# Keyword | CIL Representation | CIL Constant Notation
System.SByte | sbyte | int8 | I1
System.Byte | byte | unsigned int8 | U1
System.Int16 | short | int16 | I2
System.UInt16 | ushort | unsigned int16 | U2
System.Int32 | int | int32 | I4
System.UInt32 | uint | unsigned int32 | U4
System.Int64 | long | int64 | I8
System.UInt64 | ulong | unsigned int64 | U8
System.Char | char | char | CHAR
System.Single | float | float32 | R4
System.Double | double | float64 | R8
System.Boolean | bool | bool | BOOLEAN
System.String | string | string | N/A
System.Object | object | object | N/A
System.Void | void | void | VOID


Defining Type Members in CIL

As you are already aware, .NET types may support various members. Enumerations have some set of name/value pairs. Structures and classes may have constructors, fields, methods, properties, static members, and so on. Over the course of this book’s first 18 chapters, you have already seen partial CIL definitions for the items previously mentioned, but nevertheless, here is a quick recap of how various members map to CIL primitives.

Defining Field Data in CIL

Enumerations, structures, and classes can all support field data. In each case, the .field directive will be used. For example, let’s breathe some life into the skeleton MyEnum enumeration and define three name/value pairs (note the values are specified within parentheses):

.class public sealed enum MyEnum

{

.field public static literal valuetype MyNamespace.MyEnum A = int32(0)

.field public static literal valuetype MyNamespace.MyEnum B = int32(1)

.field public static literal valuetype MyNamespace.MyEnum C = int32(2)

}

Fields that reside within the scope of a .NET System.Enum-derived type are qualified using the static and literal attributes. As you would guess, these attributes set up the field data to be a fixed value accessible from the type itself (e.g., MyEnum.A).

Note The values assigned to an enum value may also be in hexadecimal with an 0x prefix.
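One detail the enum listing above leaves out: when you view an enum compiled from C# in ildasm.exe, you will also find a single instance field named value__ that stores the underlying value. The following sketch adds that field to MyEnum, assuming int32 storage, and writes one value in hexadecimal. Treat it as an illustration of what the C# compiler emits rather than as the only valid layout.

```cil
.assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89)
  .ver 2:0:0:0
}
.assembly CILTypes { .ver 1:0:0:0 }
.module CILTypes.dll

.namespace MyNamespace
{
  .class public sealed MyEnum extends [mscorlib]System.Enum
  {
    // Instance field holding the underlying int32 value
    // (as emitted by the C# compiler for every enum).
    .field public specialname rtspecialname int32 value__

    // The name/value pairs.
    .field public static literal valuetype MyNamespace.MyEnum A = int32(0)
    .field public static literal valuetype MyNamespace.MyEnum B = int32(1)
    // The same notation with a hexadecimal value.
    .field public static literal valuetype MyNamespace.MyEnum C = int32(0x02)
  }
}
```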

Of course, when you wish to define a point of field data within a class or structure, you are not limited to a point of public static literal data. For example, you could update MyBaseClass to support two points of private, instance-level field data:

.class public MyBaseClass

{

.field private string stringField

.field private int32 intField

}

As in C#, class field data will automatically be initialized to an appropriate default value. If you wish to allow the object user to supply custom values at the time of creation for each of these points of private field data, you (of course) need to create custom constructors.

Defining Type Constructors in CIL

The CTS supports both instance-level and class-level (static) constructors. In terms of CIL, instance-level constructors are represented using the .ctor token, while a static-level constructor is expressed via .cctor (class constructor). Both of these CIL tokens must be qualified using the rtspecialname (runtime special name) and specialname attributes. Simply put, these attributes are used to identify a specific CIL token that can be treated in unique ways by a given .NET language. For example, in C#, constructors do not define a return type; however, in terms of CIL, the return value of a constructor is indeed void:


.class public MyBaseClass

{

.field private string stringField

.field private int32 intField

.method public hidebysig specialname rtspecialname instance void .ctor(string s, int32 i) cil managed

{

// TODO: Add implementation code...

}

}

Note that the .ctor directive has been qualified with the instance attribute (as it is not a static constructor). The cil managed attributes denote that the scope of this method contains CIL code, rather than unmanaged code, which may be used during platform invocation requests.
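The TODO in the constructor above could be filled in along the following lines. This is only a sketch of one plausible body, assuming the constructor should call the System.Object base constructor and copy each argument into the matching private field. It also relies on the stfld opcode, which stores a value into an instance field and has not been covered yet.

```cil
.assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89)
  .ver 2:0:0:0
}
.assembly CILTypes { .ver 1:0:0:0 }
.module CILTypes.dll

.namespace MyNamespace
{
  .class public MyBaseClass extends [mscorlib]System.Object
  {
    .field private string stringField
    .field private int32 intField

    .method public hidebysig specialname rtspecialname
            instance void .ctor(string s, int32 i) cil managed
    {
      .maxstack 2
      // Call the base class constructor on "this" (argument 0).
      ldarg.0
      call instance void [mscorlib]System.Object::.ctor()
      // this.stringField = s;
      ldarg.0
      ldarg.1
      stfld string MyNamespace.MyBaseClass::stringField
      // this.intField = i;
      ldarg.0
      ldarg.2
      stfld int32 MyNamespace.MyBaseClass::intField
      ret
    }
  }
}
```

Because this is an instance method, argument 0 is the implicit "this" reference, so s and i are arguments 1 and 2.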

Defining Properties in CIL

Properties and methods also have specific CIL representations. By way of an example, if MyBaseClass were updated to support a public property named TheString, you would author the following CIL (note again the use of the specialname attribute):

.class public MyBaseClass

{

...

.method public hidebysig specialname

instance string get_TheString() cil managed

{

// TODO: Add implementation code...

}

.method public hidebysig specialname

instance void set_TheString(string 'value') cil managed

{

// TODO: Add implementation code...

}

.property instance string TheString()

{

.get instance string MyNamespace.MyBaseClass::get_TheString()

.set instance void MyNamespace.MyBaseClass::set_TheString(string)

}

}

Recall that in terms of CIL, a property maps to a pair of methods that take get_ and set_ prefixes. The .property directive makes use of the related .get and .set directives to map property syntax to the correct “specially named” methods.

Note Notice that the incoming parameter to the set method of a property is placed in single-tick quotation marks, which represents the name of the token to use on the right-hand side of the assignment operator within the method scope.
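As a sketch of how the two accessor bodies might be implemented, the following version of MyBaseClass backs TheString with the private stringField defined earlier. That pairing is an assumption made for illustration; the text above does not say which field the property wraps.

```cil
.assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89)
  .ver 2:0:0:0
}
.assembly CILTypes { .ver 1:0:0:0 }
.module CILTypes.dll

.namespace MyNamespace
{
  .class public MyBaseClass extends [mscorlib]System.Object
  {
    .field private string stringField

    .method public hidebysig specialname
            instance string get_TheString() cil managed
    {
      .maxstack 1
      // return this.stringField;
      ldarg.0
      ldfld string MyNamespace.MyBaseClass::stringField
      ret
    }

    .method public hidebysig specialname
            instance void set_TheString(string 'value') cil managed
    {
      .maxstack 2
      // this.stringField = value;
      ldarg.0
      ldarg.1
      stfld string MyNamespace.MyBaseClass::stringField
      ret
    }

    .property instance string TheString()
    {
      .get instance string MyNamespace.MyBaseClass::get_TheString()
      .set instance void MyNamespace.MyBaseClass::set_TheString(string)
    }
  }
}
```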


Defining Member Parameters

In a nutshell, specifying arguments in CIL is (more or less) identical to doing so in C#. For example, each argument is defined by specifying its data type followed by the parameter name. Furthermore, like C#, CIL provides a way to define input, output, and pass-by-reference parameters. As well, CIL allows you to define a parameter array argument (aka the C# params keyword) as well as optional parameters (which are not supported in C#, but are used in VB .NET).

To illustrate the process of defining parameters in raw CIL, assume you wish to build a method that takes an int32 (by value), an int32 (by reference), a [mscorlib]System.Collections.ArrayList, and a single output parameter (of type int32). In terms of C#, this method would look something like the following:

public static void MyMethod(int inputInt,

ref int refInt, ArrayList ar, out int outputInt)

{

outputInt = 0; // Just to satisfy the C# compiler...

}

If you were to map this method into CIL terms, you would find that C# reference parameters are marked with an ampersand (&) suffixed to the parameter’s underlying data type (int32&). Output parameters also make use of the & suffix, but they are further qualified using the CIL [out] token. Also notice that if the parameter is a reference type (in this case, the [mscorlib]System.Collections.ArrayList type), the class token is prefixed to the data type (not to be confused with the .class directive!):

.method public hidebysig static void MyMethod(int32 inputInt, int32& refInt,

class [mscorlib]System.Collections.ArrayList ar, [out] int32& outputInt) cil managed

{

...

}
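The elided body of MyMethod() could be written as shown in this sketch, which mirrors the single C# statement outputInt = 0. The enclosing class name ParameterDemo is hypothetical, and the stind.i4 opcode (store an int32 through an address) has not been introduced above.

```cil
.assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89)
  .ver 2:0:0:0
}
.assembly CILTypes { .ver 1:0:0:0 }
.module CILTypes.dll

.namespace MyNamespace
{
  // Hypothetical host class for the method.
  .class public ParameterDemo extends [mscorlib]System.Object
  {
    .method public hidebysig static void MyMethod(int32 inputInt,
            int32& refInt,
            class [mscorlib]System.Collections.ArrayList ar,
            [out] int32& outputInt) cil managed
    {
      .maxstack 2
      // outputInt = 0;
      ldarg.3     // Address held by the [out] parameter (argument 3).
      ldc.i4.0    // The constant 0.
      stind.i4    // Store the int32 at that address.
      ret
    }
  }
}
```

Because the method is static, there is no implicit "this" argument, so inputInt is argument 0 and outputInt is argument 3.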

Examining CIL Opcodes

The final aspect of CIL code you’ll examine in this chapter has to do with the role of various operational codes (opcodes). Recall that an opcode is simply a CIL token used to build the implementation logic for a given member. The complete set of CIL opcodes (which is fairly large) can be grouped into the following broad categories:

Opcodes that control program flow

Opcodes that evaluate expressions

Opcodes that access values in memory (via parameters, local variables, etc.)

To provide some insight to the world of member implementation via CIL, Table 19-5 defines some of the more useful opcodes that are directly related to member implementation logic, grouped by related functionality.


Table 19-5. Various Implementation-Specific CIL Opcodes

Opcodes | Meaning in Life
add, sub, mul, div, rem | These CIL opcodes allow you to add, subtract, multiply, and divide two values (rem returns the remainder of a division operation).
and, or, not, xor | These CIL opcodes allow you to perform bitwise operations on two values (not operates on a single value).
ceq, cgt, clt | These CIL opcodes allow you to compare two values on the stack in various manners, for example: ceq: Compare for equality; cgt: Compare for greater than; clt: Compare for less than.
box, unbox | These CIL opcodes are used to convert between reference types and value types.
ret | This CIL opcode is used to exit a method and return a value to the caller (if necessary).
beq, bgt, ble, blt, switch | These CIL opcodes (in addition to many other related opcodes) are used to control branching logic within a method, for example: beq: Branch to code label if equal; bgt: Branch to code label if greater than; ble: Branch to code label if less than or equal to; blt: Branch to code label if less than. All of the branch-centric opcodes require that you specify a CIL code label to jump to if the result of the test is true.
call | This CIL opcode is used to call a member on a given type.
newarr, newobj | These CIL opcodes allow you to allocate a new array or new object type into memory (respectively).
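To show how a branch opcode and a code label work together, here is a small sketch of a method that returns the larger of two int32 values. The class name OpcodeDemo, the method name Max, and the label ReturnA are all hypothetical. The ldarg opcodes used to push the arguments are described in Table 19-6.

```cil
.assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89)
  .ver 2:0:0:0
}
.assembly CILTypes { .ver 1:0:0:0 }
.module CILTypes.dll

.namespace MyNamespace
{
  // Hypothetical host class.
  .class public OpcodeDemo extends [mscorlib]System.Object
  {
    // In C#: return a > b ? a : b;
    .method public hidebysig static int32 Max(int32 a, int32 b) cil managed
    {
      .maxstack 2
      ldarg.0        // Push a.
      ldarg.1        // Push b.
      bgt ReturnA    // Pops both values; jumps if a > b.
      ldarg.1        // Otherwise return b.
      ret
    ReturnA:
      ldarg.0        // Return a.
      ret
    }
  }
}
```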

 

 

The next broad category of CIL opcodes (a subset of which is shown in Table 19-6) is used to load (push) arguments onto the virtual execution stack. Note how these load-specific opcodes take an ld (load) prefix.

Table 19-6. The Primary Stack-Centric Opcodes of CIL

Opcode | Meaning in Life
ldarg (with numerous variations) | Loads a method’s argument onto the stack. In addition to the general ldarg (which works in conjunction with a given index that identifies the argument), there are numerous other variations. For example, ldarg opcodes that have a numerical suffix (ldarg.0) hard-code which argument to load.
ldc (with numerous variations) | Loads a constant value onto the stack. Variations of the ldc opcode allow you to hard-code the data type using the CIL constant notation shown in Table 19-4 (ldc.i4, for an int32) as well as the data type and value (ldc.i4.5, to load an int32 with the value of 5).
ldfld (with numerous variations) | Loads the value of an instance-level field onto the stack.
ldloc (with numerous variations) | Loads the value of a local variable onto the stack.
ldobj | Copies the value stored at a given address onto the stack.
ldstr | Loads a string value onto the stack.


In addition to the set of load-specific opcodes, CIL provides numerous opcodes that explicitly pop the topmost value off the stack. As shown over the first few examples in this chapter, popping a value off the stack typically involves storing the value into temporary local storage for further use (such as a parameter for an upcoming method invocation). Given this, note how many opcodes that pop the current value off the virtual execution stack take an st (store) prefix. Table 19-7 hits the highlights.

Table 19-7. Various Pop-Centric Opcodes

Opcode | Meaning in Life
pop | Removes the value currently on top of the evaluation stack, but does not bother to store the value
starg | Stores the value on top of the stack into the method argument at a specified index
stloc (with numerous variations) | Pops the current value from the top of the evaluation stack and stores it in a local variable list at a specified index
stobj | Copies a value of a specified type from the evaluation stack into a supplied memory address
stsfld | Replaces the value of a static field with a value from the evaluation stack

Do be aware that various CIL opcodes will implicitly pop values off the stack to perform the task at hand. For example, if you are attempting to subtract two numbers using the sub opcode, it should be clear that sub will have to pop off the next two available values before it can perform the calculation. Once the calculation is complete, the result (surprise, surprise) is pushed onto the stack once again.
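The implicit pop and push performed by sub can be traced in a short sketch. The class name StackDemo and the method name Subtract are hypothetical, and the comments show the state of the evaluation stack after each instruction.

```cil
.assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89)
  .ver 2:0:0:0
}
.assembly CILTypes { .ver 1:0:0:0 }
.module CILTypes.dll

.namespace MyNamespace
{
  // Hypothetical host class.
  .class public StackDemo extends [mscorlib]System.Object
  {
    // In C#: return a - b;
    .method public hidebysig static int32 Subtract(int32 a, int32 b) cil managed
    {
      .maxstack 2
      ldarg.0   // Stack: a
      ldarg.1   // Stack: a, b
      sub       // Pops a and b, pushes (a - b). Stack: a - b
      ret       // Returns the single value left on the stack.
    }
  }
}
```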

The .maxstack Directive

When you write method implementations using raw CIL, you need to be mindful of a special directive named .maxstack. As its name suggests, .maxstack establishes the maximum number of values that may be on the stack at any given time during the execution of the method. The good news is that the .maxstack directive has a default value (8), which should be safe for a vast majority of methods you may be authoring. However, if you wish to be very explicit, you are able to manually calculate the maximum number of values on the stack and define this value explicitly:

.method public hidebysig instance void Speak() cil managed

{

//During the scope of this method, exactly

//1 value (the string literal) is on the stack.

.maxstack 1

ldstr "Hello there..."

call void [mscorlib]System.Console::WriteLine(string)
ret

}

Declaring Local Variables in CIL

Let’s first check out how to declare a local variable. Assume you wish to build a method in CIL named MyLocalVariables() that takes no arguments and returns void. Within the method, you wish to define three local variables of type System.String, System.Int32, and System.Object. In C#, this member would appear as follows (recall that locally scoped variables do not receive a default value and should be set to an initial state before further use):

public static void MyLocalVariables()

{

string myStr = "CIL code is fun!";
int myInt = 33;

object myObj = new object();

}

If you were to construct MyLocalVariables() directly in CIL, you could author the following:

.method public hidebysig static void MyLocalVariables() cil managed

{

.maxstack 8

// Define three local variables.

.locals init ([0] string myStr, [1] int32 myInt, [2] object myObj)

// Load a string onto the virtual execution stack.
ldstr "CIL code is fun!"
// Pop off current value and store in local variable [0].
stloc.0
// Load a constant of type "i4"
// (shorthand for int32) set to the value 33.
ldc.i4 33
// Pop off current value and store in local variable [1].
stloc.1
// Create a new object and place on stack.
newobj instance void [mscorlib]System.Object::.ctor()
// Pop off current value and store in local variable [2].
stloc.2
ret

}

As you can see, the first step taken to allocate local variables in raw CIL is to make use of the .locals directive, which is paired with the init attribute. Within the scope of the related parentheses, your goal is to associate a given numerical index to each variable (seen here as [0], [1], and [2]). Each index is identified by its data type and an optional variable name. Once the local variables have been defined, you load a value onto the stack (using the various load-centric opcodes) and store the value within the local variable (using the various storage-centric opcodes).

Mapping Parameters to Local Variables in CIL

You have already seen how to declare local variables in raw CIL using the .locals init directive; however, you have yet to see exactly how to map incoming parameters to local variables. Consider the following static C# method:

public static int Add(int a, int b)

{

return a + b;

}