C# Bible - Jeff Ferguson, Brian Patterson, Jason Beres

Chapter 19: Working with Unsafe Code

In This Chapter

When you use the new keyword to create a new instance of a reference type, you are asking the CLR to set aside enough memory to use for the variable. The CLR allocates enough memory for the variable and associates the memory with your variable. Under normal conditions, your code is unaware of the actual location of that memory, as far as a memory address is concerned. After the new operation succeeds, your code is free to use the allocated memory without knowing or caring where the memory is actually located on your system.

In C and C++, developers have direct access to memory. When a piece of C or C++ code requests access to a block of memory, it is given the specific address of the allocated memory, and the code directly reads from and writes to that memory location. The advantage of this approach is that direct access to memory is extremely fast and makes for efficient code. The drawback, which often outweighs the benefit, is that direct memory access is easy to misuse, and misuse of memory causes code to crash. Misbehaving C or C++ code can easily write to memory that has already been deleted, or can write to memory belonging to another variable. These types of memory access problems result in numerous hard-to-find bugs and software crashes.

The architecture of the CLR eliminates all of these problems by handling memory management for you. This means that your C# code can work with variables without needing to know details about how and where the variables are stored in memory. Because the CLR shields your C# code from these memory-related details, your C# code is free from bugs related to direct access to memory.

Occasionally, however, you need to work with a specific memory address in your C# code. Your code may need that extra ounce of performance, or your C# code may need to work with legacy code that requires that you provide the address of a specific piece of memory. The C# language supports a special mode, called unsafe mode, that enables you to work directly with memory from within your C# code.

This special C# construct is called unsafe mode because your code is no longer covered by the memory-management protection offered by the CLR. In unsafe mode, your C# code is allowed to access memory directly, and it can suffer from the same class of memory-related bugs found in C and C++ code if you're not extremely careful with the way you manage memory.

In this chapter, you take a look at the unsafe mode of the C# language and how it enables you to access memory locations directly using C and C++ style pointer constructs.

Understanding Pointer Basics

Memory is accessed directly in C# using a special data type called a pointer. A pointer is a variable whose value is a specific memory address. A pointer is declared in C# with an asterisk placed between the pointer's type and its identifier, as shown in the following declaration:

int * MyIntegerPointer;

This statement declares an integer pointer named MyIntegerPointer. The pointer's type signifies the type of variable to which the pointer can point. An integer pointer, for example, can only point to memory used by an integer variable.

Pointers must be assigned to a memory address, and C# makes it easy for you to write an expression that evaluates to the memory address of a variable. Prefixing a unary expression with the C# address-of operator, the ampersand, evaluates to a memory address, as shown in the following code:

int MyInteger = 123;
int * MyIntegerPointer = &MyInteger;

The preceding code does two things:

It declares an integer variable called MyInteger and assigns a value of 123 to it.

It declares an integer pointer called MyIntegerPointer and points it to the address of the MyInteger variable.

Figure 19-1 illustrates how this assignment is interpreted in memory.

Figure 19-1: A pointer pointing to a variable

Pointers actually have two values:

The value of the pointer's memory address

The value of the variable to which the pointer is pointing

C# enables you to write expressions that evaluate to either value. Prefixing the pointer identifier with an asterisk enables you to obtain the value of the variable to which the pointer is pointing, as demonstrated in the following code:

int MyInteger = 123;
int * MyIntegerPointer = &MyInteger;
Console.WriteLine(*MyIntegerPointer);

This code writes 123 to the console.
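The asterisk works in both directions: it can read the value a pointer refers to, and it can write a new value to that memory. The following is a minimal sketch of the write case, not one of the chapter's listings. The class name PointerValuesDemo and its methods are made up for illustration, and the code relies on the unsafe modifier and the /unsafe compiler switch described later in this chapter.

```csharp
using System;

public class PointerValuesDemo
{
    // Writes a new value through the pointer and returns the variable's value.
    public static unsafe int WriteThroughPointer()
    {
        int MyInteger = 123;
        int * MyIntegerPointer = &MyInteger;

        *MyIntegerPointer = 456;   // stores 456 in the memory used by MyInteger
        return MyInteger;
    }

    // Shows the pointer's other value: the memory address itself.
    public static unsafe void PrintAddress()
    {
        int MyInteger = 123;
        int * MyIntegerPointer = &MyInteger;

        // The address varies from run to run, so no fixed output is expected.
        Console.WriteLine((long)MyIntegerPointer);
    }

    public static void Main()
    {
        Console.WriteLine(WriteThroughPointer());   // prints 456
        PrintAddress();
    }
}
```

Because the pointer and the variable share the same memory, the assignment through the pointer is visible when MyInteger is read afterward.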

Understanding pointer types

Pointers can have one of the following types:

sbyte

byte

short

ushort

int

uint

long

ulong

char

float

double

decimal

bool

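an enumeration type

another pointer type

a structure type whose fields are all of the types in this list (such as the Point2D structure in Listing 19-2)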

void, which is used to specify a pointer to an unknown type

You cannot declare a pointer to a reference type, such as an object. The memory for objects is managed by the CLR, and the memory may be deleted whenever the garbage collector needs to free the object's memory. If the C# compiler enabled you to maintain a pointer to an object, your code would run the risk of pointing to an object whose memory may be reclaimed at some point by the CLR's garbage collector.

Suppose that the C# compiler enabled you to write code like the following:

MyClass MyObject = new MyClass();

MyClass * MyObjectPointer;

MyObjectPointer = &MyObject;

The memory used by MyObject is automatically managed by the CLR, and its memory is freed when all references to the object are released and the CLR's garbage collector executes. The problem is that your unsafe code now maintains a pointer to an object whose memory has been freed. There is no way for the CLR to know that you have a pointer to the object, and the result is that you have a pointer that points to nothing after the garbage collector frees the memory. C# gets around this problem by not allowing you to declare pointers to reference types, whose memory is managed by the CLR.

Compiling Unsafe Code

By default, the C# compiler compiles only safe C# code. To force the compiler to compile unsafe C# code, you must use the /unsafe compiler argument:

csc /unsafe file1.cs

This command compiles the file1.cs source file and allows it to contain unsafe C# code. Unsafe code accesses memory directly, bypassing the CLR facilities that manage memory in managed applications, and for that reason it can perform better in certain types of applications.

Note In C#, unsafe code enables you to declare and use pointers as you would in C++.

Specifying pointers in unsafe mode

The C# compiler doesn't enable you to use pointers in your C# code by default. If you try to work with pointers in your code, the C# compiler issues the following error message:

error CS0214: Pointers may only be used in an unsafe context

Pointers are valid only in C# unsafe mode, and you must explicitly define unsafe code to the compiler. You do so by using the C# keyword unsafe. The unsafe keyword must be applied to a code block that uses pointers.

You can specify that a block of code executes in the C# unsafe mode by applying the unsafe keyword to the declaration of the code body, as shown in Listing 19-1.

Listing 19-1: Unsafe Methods

using System;

public class MyClass
{
    public unsafe static void Main()
    {
        int MyInteger = 123;
        int * MyIntegerPointer = &MyInteger;

        Console.WriteLine(*MyIntegerPointer);
    }
}

The Main() method in Listing 19-1 uses the unsafe modifier in its declaration. This indicates to the C# compiler that all of the code in the method must be considered unsafe. After this keyword is used, the code in the method can use unsafe pointer constructs.

The unsafe keyword applies only to the method in which it appears. If the class in Listing 19-1 were to contain another method, that other method could not use unsafe pointer constructs unless it, too, were declared with the unsafe keyword. The following rules apply to the unsafe modifier:

Classes, structures, and delegates can include the unsafe modifier, which indicates that the entire body of the type is considered unsafe.

Fields, methods, properties, events, indexers, operators, constructors, destructors, and static constructors can be defined with the unsafe modifier, which indicates that the specific member declaration is unsafe.

A code block can be marked with the unsafe modifier, which indicates that the entire block should be considered unsafe.
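The rules above can be sketched in one short program. This is an illustrative example rather than one of the chapter's listings, and the names IntegerCell and UnsafeScopeDemo are hypothetical. It marks a whole structure as unsafe so that it can hold a pointer field, and it confines pointer use in an otherwise safe method to an unsafe block.

```csharp
using System;

// The unsafe modifier on the type lets every member use pointers.
public unsafe struct IntegerCell
{
    public int * Location;   // pointer field, legal because the struct is unsafe

    public int Read()
    {
        return *Location;
    }
}

public class UnsafeScopeDemo
{
    // A safe method that confines its pointer use to an unsafe block.
    public static int ReadThroughCell()
    {
        int Value = 42;
        int Result;

        unsafe
        {
            IntegerCell Cell = new IntegerCell();
            Cell.Location = &Value;
            Result = Cell.Read();
        }

        return Result;
    }

    public static void Main()
    {
        Console.WriteLine(ReadThroughCell());   // prints 42
    }
}
```

Keeping the unsafe block as small as possible makes it easier to see which statements bypass the CLR's protection.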

Accessing members' values through pointers

The unsafe mode of C# enables you to use the -> operator to access members of structures referenced by a pointer. The operator, which is keyed as a hyphen followed by a greater-than symbol, enables you to access members directly, as shown in Listing 19-2.

Listing 19-2: Accessing Structure Members with a Pointer

using System;

public struct Point2D
{
    public int X;
    public int Y;
}

public class MyClass
{
    public unsafe static void Main()
    {
        Point2D MyPoint;
        Point2D * PointerToMyPoint;

        MyPoint = new Point2D();
        PointerToMyPoint = &MyPoint;
        PointerToMyPoint->X = 100;
        PointerToMyPoint->Y = 200;

        Console.WriteLine("({0}, {1})", PointerToMyPoint->X, PointerToMyPoint->Y);
    }
}

Listing 19-2 contains a declaration for a structure called Point2D. The structure contains two public members. The listing also includes an unsafe Main() method that creates a new variable of the structure type and creates a pointer to the new structure. The method then uses the pointer member access operator to assign values to the structure, which is then written to the console.

This differs from member access in the default C# safe mode, which uses the . operator. The C# compiler issues an error if you use the wrong operator in the wrong mode. If you use the . operator with an unsafe pointer, the C# compiler issues the following error message:

error CS0023: Operator '.' cannot be applied to operand of type 'Point2D*'

If you use the -> operator on an operand that is not a pointer, such as an ordinary structure variable, the C# compiler also issues an error message:

error CS0193: The * or -> operator must be applied to a pointer
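One way to read the -> operator is as shorthand for dereferencing the pointer and then applying the ordinary . operator to the result. The sketch below uses both forms side by side. The class name MemberAccessDemo and the method SumCoordinates are invented for this example, and the Point2D structure mirrors the one in Listing 19-2.

```csharp
using System;

public struct Point2D
{
    public int X;
    public int Y;
}

public class MemberAccessDemo
{
    public static unsafe int SumCoordinates()
    {
        Point2D MyPoint = new Point2D();
        Point2D * PointerToMyPoint = &MyPoint;

        PointerToMyPoint->X = 100;      // pointer member access
        (*PointerToMyPoint).Y = 200;    // dereference first, then ordinary member access

        return MyPoint.X + MyPoint.Y;
    }

    public static void Main()
    {
        Console.WriteLine(SumCoordinates());   // prints 300
    }
}
```

Both assignments write to the same structure, so the . operator is legal in unsafe code as long as its left operand is the dereferenced structure rather than the pointer.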

Using Pointers to Fix Variables to a Specific Address

When memory for a variable is managed by the CLR, your code works with a variable, and management details about the variable's memory are handled by the CLR. During the CLR's garbage collection process, the runtime may move memory around to consolidate the memory heap available at runtime. This means that during the course of an application, the memory address for a variable may change. The CLR might take your variable's data and move it to a different address.

Under normal conditions, your C# code is oblivious to this relocation strategy. Because your code works with a variable identifier, you usually access the variable's memory through the variable identifier, and you can trust that the CLR works with the correct piece of memory as you work with the variable.

The picture is not as straightforward, however, when you work with pointers. Pointers point to a specific memory address. If you assign a pointer to a memory address used by a variable and the CLR later moves that variable's memory location, your pointer is pointing to memory that is no longer used by your variable.

The unsafe mode of C# enables you to specify a variable as exempt from the memory relocation that the CLR offers. This lets you hold a variable at a specific memory address, enabling you to use a pointer with the variable without worrying that the CLR may move the variable's memory address out from under your pointer. The C# keyword fixed is used to specify that a variable's memory address should be fixed. The fixed keyword is followed by a parenthetical expression containing a pointer declaration with an assignment to a variable. A block of code follows the fixed expression, and the fixed variable remains at the same memory address throughout the fixed code block, as shown in Listing 19-3.

Listing 19-3: Fixing Managed Data in Memory

using System;

public class MyClass
{
    public unsafe static void Main()
    {
        int ArrayIndex;
        int [] IntegerArray;

        IntegerArray = new int [5];
        fixed(int * IntegerPointer = IntegerArray)
        {
            for(ArrayIndex = 0; ArrayIndex < 5; ArrayIndex++)
                IntegerPointer[ArrayIndex] = ArrayIndex;
        }

        for(ArrayIndex = 0; ArrayIndex < 5; ArrayIndex++)
            Console.WriteLine(IntegerArray[ArrayIndex]);
    }
}

The fixed keyword in Listing 19-3 declares an integer pointer that points to an integer array. It is followed by a block of code that writes values to the array using the pointer. Within this block of code, the address of the IntegerArray array is guaranteed to be fixed, and the CLR does not move its location. This enables the code to use a pointer with the array without worrying that the CLR will move the array's physical memory location. After the fixed code block exits, the pointer can no longer be used and the CLR again considers the IntegerArray variable a candidate for relocation in memory.
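Arrays are not the only managed data that can be fixed. A field of a class instance lives inside an object on the managed heap, so its address can also be taken only inside a fixed statement. The following sketch assumes a hypothetical Counter class and FixedFieldDemo class, neither of which appears in the chapter's listings.

```csharp
using System;

public class Counter
{
    public int Count;   // stored inside a heap object, so the CLR may relocate it
}

public class FixedFieldDemo
{
    public static unsafe int IncrementThroughPointer(Counter Target)
    {
        // The object that contains Count stays put for the duration of the block.
        fixed(int * CountPointer = &Target.Count)
        {
            *CountPointer = *CountPointer + 1;
        }

        return Target.Count;
    }

    public static void Main()
    {
        Counter MyCounter = new Counter();

        MyCounter.Count = 9;
        Console.WriteLine(IncrementThroughPointer(MyCounter));   // prints 10
    }
}
```

A local variable of a value type, such as the integer in Listing 19-1, needs no fixed statement because it is not stored on the managed heap.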

Understanding pointer array element syntax

Listing 19-3 also illustrates the array element pointer syntax. The following line of code treats an unsafe mode pointer as if it were an array of integers:

IntegerPointer[ArrayIndex] = ArrayIndex;

The array element pointer syntax allows your unsafe C# code to view the memory pointed to by the pointer as an array of variables that can be read from or written to.

Comparing pointers

The unsafe mode of C# enables you to compare pointers using the following operators:

Equality (==)

Inequality (!=)

Less-than (<)

Greater-than (>)

Less-than-or-equal-to (<=)

Greater-than-or-equal-to (>=)

As with value types, these operators evaluate to the Boolean values true and false when used with pointer types.
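A common use for pointer comparison is to stop a loop when a moving pointer reaches the end of a block of memory. The sketch below is one possible illustration, with hypothetical names PointerComparisonDemo and SumWithPointerBounds. It also relies on the + and ++ operators, which are covered in the pointer arithmetic section of this chapter.

```csharp
using System;

public class PointerComparisonDemo
{
    public static unsafe int SumWithPointerBounds(int [] Values)
    {
        int Total = 0;

        fixed(int * First = Values)
        {
            int * End = First + Values.Length;   // one position past the last element

            // The < comparison ends the loop when Current reaches End.
            for(int * Current = First; Current < End; Current++)
                Total += *Current;
        }

        return Total;
    }

    public static void Main()
    {
        int [] Values = new int [] { 1, 2, 3, 4 };

        Console.WriteLine(SumWithPointerBounds(Values));   // prints 10
    }
}
```

The pointer declared in the fixed statement cannot itself be modified, which is why the loop advances a separate copy named Current.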

Understanding pointer arithmetic

Pointers can be combined with integer values in mathematical expressions to change the location to which the pointer points. The + operator adds a value to the pointer, and the - operator subtracts a value from the pointer.

The fixed statement in Listing 19-3 could have also been written as follows:

fixed(int * IntegerPointer = IntegerArray)
{
    for(ArrayIndex = 0; ArrayIndex < 5; ArrayIndex++)
        *(IntegerPointer + ArrayIndex) = ArrayIndex;
}

In this code block, the pointer is offset by a value, and the sum is used to point to a memory location. The pointer arithmetic is performed in the following statement:

*(IntegerPointer + ArrayIndex) = ArrayIndex;

This statement reads as follows: "Take the value of IntegerPointer and increment it by the number of positions specified by ArrayIndex. Place the value of ArrayIndex in that location."

Pointer arithmetic moves a pointer by a number of bytes that depends on the size of the type being pointed to. Listing 19-3 declares an integer array and an integer pointer. When pointer arithmetic is used on the integer pointer, the value used to offset the pointer specifies the number of variable sizes to move, not the number of bytes. The following expression uses pointer arithmetic to offset a pointer location by three integers:

IntegerPointer + 3

The literal value 3 in this expression specifies that the pointer should be incremented by the space taken up by three integers, not by three bytes. Because the pointer points to an integer, the 3 is interpreted as "space for three integers" and not "space for three bytes." Because an integer takes up four bytes of memory, the pointer's address is incremented by twelve bytes (three integers multiplied by four bytes for each integer), not three bytes.
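The twelve-byte figure can be checked by measuring the distance between the two addresses. In the sketch below, which uses the invented names PointerArithmeticDemo and ByteDistanceForThreeIntegers, both pointers are cast to byte pointers before they are subtracted, so that the difference is counted in bytes instead of integers.

```csharp
using System;

public class PointerArithmeticDemo
{
    // Returns how many bytes separate IntegerPointer from IntegerPointer + 3.
    public static unsafe long ByteDistanceForThreeIntegers()
    {
        int [] IntegerArray = new int [5];

        fixed(int * IntegerPointer = IntegerArray)
        {
            int * Offset = IntegerPointer + 3;

            // Subtracting byte pointers counts bytes, not integers.
            return (byte *)Offset - (byte *)IntegerPointer;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ByteDistanceForThreeIntegers());   // prints 12
    }
}
```

Subtracting the two integer pointers without the casts would yield 3, because pointer subtraction is also measured in units of the pointed-to type.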

Using the sizeof operator

You can use the sizeof operator in unsafe mode to calculate the number of bytes needed to hold a specific data type. The operator is followed by an unmanaged type name in parentheses, and the expression evaluates to an integer specifying the number of bytes needed to hold a variable of the specified type.
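The operator also accepts structure types that contain only unmanaged fields. The following sketch applies it to an integer and to a two-integer structure modeled on Point2D from Listing 19-2. The class name SizeofDemo and its methods are illustrative only, and the exact size of a structure can include padding chosen by the runtime, so the comments make no promise beyond a lower bound.

```csharp
using System;

public struct Point2D
{
    public int X;
    public int Y;
}

public class SizeofDemo
{
    public static unsafe int SizeOfInt()
    {
        return sizeof(int);
    }

    public static unsafe int SizeOfPoint2D()
    {
        // Two integer fields need at least two times the size of an integer.
        return sizeof(Point2D);
    }

    public static void Main()
    {
        Console.WriteLine(SizeOfInt());       // prints 4
        Console.WriteLine(SizeOfPoint2D());   // at least 8; padding is up to the runtime
    }
}
```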

Table 19-1 lists the supported built-in types and the values that are returned by a sizeof operation.

 

Table 19-1: Supported sizeof() Types

Expression        Result
sizeof(sbyte)     1
sizeof(byte)      1
sizeof(short)     2
sizeof(ushort)    2
sizeof(int)       4
sizeof(uint)      4
sizeof(long)      8
sizeof(ulong)     8
sizeof(char)      2
sizeof(float)     4
sizeof(double)    8
sizeof(bool)      1

Allocating Memory from the Stack

C# provides a simple memory allocation mechanism in unsafe code. You can request memory in unsafe mode using the C# stackalloc keyword, as shown in Listing 19-4.

Listing 19-4: Allocating Memory from the Stack

using System;

public class MyClass
{
    public unsafe static void Main()
    {
        int * CharacterBuffer = stackalloc int [5];
        int Index;

        for(Index = 0; Index < 5; Index++)
            CharacterBuffer[Index] = Index;
        for(Index = 0; Index < 5; Index++)
            Console.WriteLine(CharacterBuffer[Index]);
    }
}

A data type follows the stackalloc keyword, along with the number of elements to allocate, enclosed in square brackets. The expression returns a pointer to the allocated memory block, and you can use the memory just as you would use the memory allocated by the CLR.

There is no explicit operation for freeing the memory allocated by the stackalloc keyword. The memory is freed automatically when the method that allocated the memory exits.
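Because the memory belongs to the method that allocated it, a stackalloc buffer suits short-lived scratch space inside a single method. The sketch below is one such use. The names StackAllocDemo and SumOfSquares are hypothetical, and the code assumes the caller passes a small, non-negative count, because stack space is limited.

```csharp
using System;

public class StackAllocDemo
{
    public static unsafe int SumOfSquares(int Count)
    {
        // The buffer comes from this method's stack frame.
        // Assumption: Count is small and non-negative.
        int * Squares = stackalloc int [Count];
        int Total = 0;

        for(int Index = 0; Index < Count; Index++)
            Squares[Index] = Index * Index;

        for(int Index = 0; Index < Count; Index++)
            Total += Squares[Index];

        return Total;   // the buffer is released when the method returns
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares(5));   // prints 30
    }
}
```

The method returns a computed value rather than the pointer itself, since the pointer would refer to released memory once the method exits.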

Summary

The unsafe mode in C# enables your code to work directly with memory. Using it can enhance performance because your code accesses memory directly, without having to navigate through the CLR. However, unsafe mode is potentially dangerous and can cause your code to crash if you do not work with the memory properly.

In general, avoid using the C# unsafe mode. Unless you need that last bit of performance from your code, or you're working with legacy C or C++ code that requires you to specify a specific memory location, you should stick to the default safe mode and let the CLR handle memory allocation details for you. The minor performance degradation that results is far outweighed by lifting the burden of memory management from your code, and by gaining the freedom to write code that is devoid of bugs related to memory management.

Chapter 20: Understanding Advanced C# Constructs

In This Chapter

In this chapter, you examine some interesting facets of the C# language. You also look at some sample code and learn why the code works the way it does. Understanding programming problems like the ones presented in this chapter will help you understand how to tackle your own tough C# programming questions.

First, you take a look at the implicit conversion feature of C# and see how it applies to objects of derived classes being accessed as objects of the derived class' base class. Remember that you can write implicit operator methods that define how implicit conversions from one type to another are handled; but, as you'll see, things get a bit more complicated when working with compile-time types and runtime types.

Next, you dive into an issue with structure initialization. Structures, as with classes, can contain both fields and properties. Initialization of structures with fields, however, is handled a bit differently than the initialization of structures with properties. In this chapter, you find out why, and you also discover how to resolve the issue.

In the third section of this chapter, you investigate the issue of passing an object of a derived class into a method call where an object of a base class is expected. Since objects of derived classes are inherently objects of the base class, passing a derived class object where a base class object is expected would seem to be straightforward. In this section, you learn why this technique is not as straightforward as it seems.

Finally, you dive into an advanced usage of class indexers. In the vast majority of cases, the indexers that you write serve to make a class behave like an array of elements. Normally, arrays accept integers as the index element specifier. In this section, you take a look at a technique for using data types other than integers for an array index.

Understanding Implicit Operators and Illegal Casts

Recall that your classes can contain implicit operator conversion code. Implicit operators are used to convert one type to another type without any special code. Many implicit conversions are built into the C# language. For example, an integer can be implicitly converted into a long integer without any special code:

int MyInt = 123;

long MyLongInt = MyInt;

C# does not define implicit conversions for all possible data type combinations. However, as you saw in an earlier chapter, you can write implicit operator conversion code that instructs the Common Language Runtime (CLR) how to behave when a user of your class attempts to implicitly convert between your class and another type. In this section, you explore a facet of the implicit conversion operator dealing with the conversion between two different classes.

Listing 20-1 contains two classes: TestClass and MainClass. The MainClass class contains the application's Main() method. The TestClass class maintains a private variable of type MainClass. It also defines an implicit operator method that converts TestClass objects to MainClass objects. The implicit operator implementation returns a reference to the TestClass object's private MainClass object.

Listing 20-1: Invalid Cast Exceptions with Implicit Operators

public class TestClass
{
    private MainClass MyMainClassObject;

    public TestClass()
    {
        MyMainClassObject = new MainClass();
    }
