Advanced CORBA Programming with C++ - M. Henning, S. Vinoski

IT-SC book: Advanced CORBA® Programming with C++

Figure 4.2 Development process for different development environments.

Because only the stubs are used by the client, the client developer simply ignores the skeleton generated by the IDL compiler or suppresses the skeleton code generation.

4.4 Source Files

The IDL specification defines a number of rules for the naming and contents of IDL source files.

4.4.1 File Naming

The names of source files containing IDL definitions must end in .idl. For example, CCS.idl is a valid source file name. An IDL compiler is free to reject source files having other file name extensions.

For file systems that are case-insensitive (such as DOS), the case of the file name extension is ignored, so CCS.IDL is legal. For file systems that are case-sensitive (such as UNIX), the extension must be in lowercase and CCS.IDL is not legal.

4.4.2 File Format

IDL is a free-form language. This means that IDL allows free use of spaces, horizontal and vertical tab stops, form feeds, and newline characters (any of these characters serves as a token separator). Layout and indentation do not carry semantics, so you can choose any textual style you prefer. You may wish to follow the style we have used for the IDL examples throughout this book. These examples follow the OMG style guide for IDL.


4.4.3 Preprocessing

IDL source files are preprocessed. The preprocessor can be implemented as part of the compiler, or it can be an external program. Either way, its behavior is identical to that of the C++ preprocessor. This means that the usual C++ rules for lexical translation phases apply: the preprocessor maps source file characters onto the source character set, replaces trigraphs, concatenates lines ending in a backslash, replaces comments with white space, and so on. The most common use of the preprocessor is for #include directives. This permits an IDL definition to use types defined in a different source file. You may also want to use the preprocessor to guard against double inclusion of a file:

#ifndef _MYMODULE_IDL_
#define _MYMODULE_IDL_

module MyModule { /* ... */ };

#endif /* _MYMODULE_IDL_ */

Another frequent use of the preprocessor is to control the repository IDs that are generated by the compiler with #pragma directives. We look at the #pragma directives specified by CORBA in Section 4.19.

4.4.4 Definition Order

IDL constructs, such as modules, interfaces, or type definitions, can appear in any order you prefer. However, identifiers must be declared before they can be used.

4.5 Lexical Rules

IDL's lexical rules are almost identical to those of C++ except for some differences in identifiers.

4.5.1 Comments

IDL definitions permit both the C and the C++ style of writing comments:

/*
 * This is a legal IDL comment.
 */

// This IDL comment extends to the end of this line.

4.5.2 Keywords

IDL uses a number of keywords, which must be spelled in lowercase. For example, interface and struct are keywords and must be spelled as shown. There are three exceptions to this lowercase rule: Object, TRUE, and FALSE are all keywords and must be capitalized as shown.

4.5.3 Identifiers

Identifiers begin with an alphabetic character followed by any number of alphabetics, digits, or underscores. Unlike C++ identifiers, IDL identifiers cannot have a leading underscore (but see also Section 4.21.5). In addition, IDL identifiers cannot contain non-English letters, such as Å, because that would make it very difficult to map IDL to target languages that lack support for such characters.

Case Sensitivity

Identifiers are case-insensitive but must be capitalized consistently. For example, TimeOfDay and TIMEOFDAY are considered the same identifier within a naming scope. After you have introduced an identifier, you must use the same capitalization throughout; otherwise, the compiler will reject it as illegal. This rule exists to permit mappings of IDL to languages (such as Pascal) that ignore case in identifiers as well as to languages (such as C++) that treat differently capitalized identifiers as distinct.

Identifiers That Are Keywords

IDL permits you to create identifiers that happen to be keywords in one or more implementation languages. For example, while is a perfectly good IDL identifier but of course is a keyword in many implementation languages. Each language mapping defines its own rules for dealing with IDL identifiers that are keywords. The solution typically involves using a prefix to map away from the keyword. For example, the IDL identifier while is mapped to _cxx_while in C++.

This rule for dealing with keywords is workable but results in hard-to-read source code. Identifiers such as package, then, import, PERFORM, and self will clash with some implementation language or other. To make life easier for developers (possibly yourself), you should try to avoid IDL identifiers that are likely to be implementation language keywords.

4.6 Basic IDL Types

IDL provides a number of built-in basic types, and they are shown in Table 4.1.

Table 4.1. IDL basic types.

Type             Range                                  Size
short            -2^15 to 2^15 - 1                      ≥ 16 bits
long             -2^31 to 2^31 - 1                      ≥ 32 bits
unsigned short   0 to 2^16 - 1                          ≥ 16 bits
unsigned long    0 to 2^32 - 1                          ≥ 32 bits
float            IEEE single-precision                  ≥ 32 bits
double           IEEE double-precision                  ≥ 64 bits
char             ISO Latin-1                            ≥ 8 bits
string           ISO Latin-1, except ASCII NUL          Variable-length
boolean          TRUE or FALSE                          Unspecified
octet            0–255                                  ≥ 8 bits
any              Run-time identifiable arbitrary type   Variable-length

The CORBA specification requires that language mappings preserve the size of these types as shown. The value ranges shown in Table 4.1 need not be maintained by all language mappings, but CORBA requires implementations to document any deviations from the specified ranges. (The C++ mapping preserves all value ranges.)

These requirements may sound confusing. For example, when you look at the size requirements, you will find that IDL specifies only a lower bound instead of an exact size. The reason is that some CPU architectures do not have, for example, an 8-bit character type or a 16-bit integer type; on such CPUs, these types are mapped to a type larger than 8 or 16 bits. Similarly, some language mappings cannot preserve the full range of all types; for example, Java does not have unsigned integers and maps both unsigned long and long to Java int. To avoid restricting the possible target environments and languages, the CORBA specification leaves the size and range requirements for IDL basic types loose.

All the basic types (except octet) are subject to changes in representation as they are transmitted between clients and servers. For example, a long value undergoes byte swapping when sent from a big-endian to a little-endian machine. Similarly, characters undergo translation in representation if they are sent from an EBCDIC to an ASCII implementation. What happens if a character does not have a precise match in the target character set is implementation-dependent. For example, the EBCDIC character ¬ does not have an ASCII equivalent. An ORB might translate EBCDIC ¬ into ASCII ~, or it might raise a DATA_CONVERSION exception (see Section 4.10) to indicate that translation is impossible. Characters may also change in size (not all architectures use 8-bit characters). However, these changes are transparent to the programmer and do exactly what is required.

Table 4.1 does not include a pointer type. There are a number of good reasons for this. Pointer types are used much less in object-oriented programming than in non-OO languages. Some implementation languages (such as COBOL and Java) do not support pointers. Pointers would complicate the implementation of marshaling for ORB vendors and would incur additional run-time costs.

As you will see in Section 4.8.2, the lack of pointers is no great hardship. IDL uses object references to achieve what in a non-OO environment would normally be done with a pointer. In effect, object references are pointers. However, object references can denote only objects but cannot point to data. IDL supports recursive data types, such as trees, without introducing a data pointer type (see Section 4.7.8).

CORBA recently extended IDL to support additional numeric and character types. Because many ORBs do not yet provide these types, we cover them separately in Section 4.21.

4.6.1 Integer Types

IDL does not have a type int, so there are no guessing games as to its range. An IDL short is mapped to at least a 2-byte type, and IDL long is mapped to at least a 4-byte type.

Some languages (notably Java) do not support unsigned types. Because of this, unsigned short and unsigned long map to Java short and int, respectively. This means that a Java programmer must ensure that large unsigned IDL values are treated correctly when represented as Java signed values.

4.6.2 Floating-Point Types

These types follow the IEEE specification for single- and double-precision floating-point representation [7]. If an implementation cannot support IEEE format floating-point values, it must document how it deviates from the IEEE specification.

4.6.3 Characters

IDL characters support the ISO Latin-1 character set [8], which is a superset of ASCII. The bottom 128 character positions (0–127) are identical to ASCII. The top 128 character positions (128–255) are taken up by characters such as Å, ß, and Ç. This arrangement allows most European languages to be used with an 8-bit character set. Recently, IDL was extended to support wide characters and strings. This permits use of arbitrary wide character sets, such as Unicode.

4.6.4 Strings

IDL strings support the ISO Latin-1 character set with the exception of ASCII NUL (0). Disallowing NUL inside IDL strings is a concession to C and C++; the notion of NUL-terminated strings is so deeply ingrained in C and C++ that allowing embedded NUL characters would make the use of IDL strings impossibly difficult in these languages.

IDL strings can be bounded or unbounded. An unbounded string has the IDL type string and can grow to any length. A bounded string type specifies an upper limit on the length of the string. For example, string<10> is a string type that permits only strings of up to ten characters.


The bound of a string does not include any terminating NUL character, so the string "Hello" will fit into a string of type string<5>. (Many programming languages do not represent strings as NUL-terminated arrays, so the concept of NUL termination does not apply to IDL.)

Most C and C++ ORB implementations ignore bounded strings and treat them as if they were unbounded. This limitation arises because C and C++ do not support bounded strings natively, and emulating bounded string support would result in awkward language mappings. As a C++ programmer, you are made responsible for enforcing the bound at run time.

4.6.5 Booleans

Boolean values can have only the values TRUE and FALSE. IDL makes no requirement as to how these values are to be represented in particular languages nor about the size of a Boolean value.

4.6.6 Octets

The IDL type octet is an 8-bit type that is guaranteed not to undergo any changes in representation as it is transmitted between address spaces. This guarantee permits exchange of binary data so that it is not tampered with in transit. All other IDL types are subject to changes in representation during transmission.

4.6.7 Type any

Type any is a universal container type. A value of type any can hold a value of any other IDL type, such as long or string, or even another value of type any. Type any can also hold object references or user-defined complex types, such as arrays or structures.

Type any is useful when you do not know at compile time what IDL types you will eventually need to transmit between client and server. Type any is IDL's equivalent of what in C++ is typically achieved with a void * or a stdarg variable argument list. However, type any is substantially safer because it is self-describing (you can find out at run time what type of value is contained in an any). Manipulation of values of type any is type-safe; attempts to, for example, extract a float as a string return an error indication. As a result, careless misinterpretation of a value as the wrong type is much less likely than it is with the completely type-unsafe mechanism of using a void *.

We look at type any and its C++ mapping in detail in Chapter 15.
