- •Advanced CORBA® Programming with C++
- •Review
- •Dedication
- •Preface
- •Prerequisites
- •Scope of this Book
- •Acknowledgments
- •Chapter 1. Introduction
- •1.1 Introduction
- •1.2 Organization of the Book
- •1.3 CORBA Version
- •1.4 Typographical Conventions
- •1.5 Source Code Examples
- •1.6 Vendor Dependencies
- •1.7 Contacting the Authors
- •Part I: Introduction to CORBA
- •Chapter 2. An Overview of CORBA
- •2.1 Introduction
- •2.2 The Object Management Group
- •2.3 Concepts and Terminology
- •2.4 CORBA Features
- •2.5 Request Invocation
- •2.6 General CORBA Application Development
- •2.7 Summary
- •Chapter 3. A Minimal CORBA Application
- •3.1 Chapter Overview
- •3.2 Writing and Compiling an IDL Definition
- •3.3 Writing and Compiling a Server
- •3.4 Writing and Compiling a Client
- •3.5 Running Client and Server
- •3.6 Summary
- •Part II: Core CORBA
- •Chapter 4. The OMG Interface Definition Language
- •4.1 Chapter Overview
- •4.2 Introduction
- •4.3 Compilation
- •4.4 Source Files
- •4.5 Lexical Rules
- •4.6 Basic IDL Types
- •4.7 User-Defined Types
- •4.8 Interfaces and Operations
- •4.9 User Exceptions
- •4.10 System Exceptions
- •4.11 System Exceptions or User Exceptions?
- •4.12 Oneway Operations
- •4.13 Contexts
- •4.14 Attributes
- •4.15 Modules
- •4.16 Forward Declarations
- •4.17 Inheritance
- •4.18 Names and Scoping
- •4.19 Repository Identifiers and pragma Directives
- •4.20 Standard Include Files
- •4.21 Recent IDL Extensions
- •4.22 Summary
- •Chapter 5. IDL for a Climate Control System
- •5.1 Chapter Overview
- •5.2 The Climate Control System
- •5.3 IDL for the Climate Control System
- •5.4 The Complete Specification
- •Chapter 6. Basic IDL-to-C++ Mapping
- •6.1 Chapter Overview
- •6.2 Introduction
- •6.3 Mapping for Identifiers
- •6.4 Mapping for Modules
- •6.5 The CORBA Module
- •6.6 Mapping for Basic Types
- •6.7 Mapping for Constants
- •6.8 Mapping for Enumerated Types
- •6.9 Variable-Length Types and _var Types
- •6.10 The String_var Wrapper Class
- •6.11 Mapping for Wide Strings
- •6.12 Mapping for Fixed-Point Types
- •6.13 Mapping for Structures
- •6.14 Mapping for Sequences
- •6.15 Mapping for Arrays
- •6.16 Mapping for Unions
- •6.17 Mapping for Recursive Structures and Unions
- •6.18 Mapping for Type Definitions
- •6.19 User-Defined Types and _var Classes
- •6.20 Summary
- •Chapter 7. Client-Side C++ Mapping
- •7.1 Chapter Overview
- •7.2 Introduction
- •7.3 Mapping for Interfaces
- •7.4 Object Reference Types
- •7.5 Life Cycle of Object References
- •7.6 Semantics of _ptr References
- •7.7 Pseudo-Objects
- •7.8 ORB Initialization
- •7.9 Initial References
- •7.10 Stringified References
- •7.11 The Object Pseudo-Interface
- •7.12 _var References
- •7.13 Mapping for Operations and Attributes
- •7.14 Parameter Passing Rules
- •7.15 Mapping for Exceptions
- •7.16 Mapping for Contexts
- •7.17 Summary
- •Chapter 8. Developing a Client for the Climate Control System
- •8.1 Chapter Overview
- •8.2 Introduction
- •8.3 Overall Client Structure
- •8.4 Included Files
- •8.5 Helper Functions
- •8.6 The main Program
- •8.7 The Complete Client Code
- •8.8 Summary
- •Chapter 9. Server-Side C++ Mapping
- •9.1 Chapter Overview
- •9.2 Introduction
- •9.3 Mapping for Interfaces
- •9.4 Servant Classes
- •9.5 Object Incarnation
- •9.6 Server main
- •9.7 Parameter Passing Rules
- •9.8 Raising Exceptions
- •9.9 Tie Classes
- •9.10 Summary
- •Chapter 10. Developing a Server for the Climate Control System
- •10.1 Chapter Overview
- •10.2 Introduction
- •10.3 The Instrument Control Protocol API
- •10.4 Designing the Thermometer Servant Class
- •10.5 Implementing the Thermometer Servant Class
- •10.6 Designing the Thermostat Servant Class
- •10.7 Implementing the Thermostat Servant Class
- •10.8 Designing the Controller Servant Class
- •10.9 Implementing the Controller Servant Class
- •10.10 Implementing the Server main Function
- •10.11 The Complete Server Code
- •10.12 Summary
- •Chapter 11. The Portable Object Adapter
- •11.1 Chapter Overview
- •11.2 Introduction
- •11.3 POA Fundamentals
- •11.4 POA Policies
- •11.5 POA Creation
- •11.6 Servant IDL Type
- •11.7 Object Creation and Activation
- •11.8 Reference, ObjectId, and Servant
- •11.9 Object Deactivation
- •11.10 Request Flow Control
- •11.11 ORB Event Handling
- •11.12 POA Activation
- •11.13 POA Destruction
- •11.14 Applying POA Policies
- •11.15 Summary
- •Chapter 12. Object Life Cycle
- •12.1 Chapter Overview
- •12.2 Introduction
- •12.3 Object Factories
- •12.4 Destroying, Copying, and Moving Objects
- •12.5 A Critique of the Life Cycle Service
- •12.6 The Evictor Pattern
- •12.7 Garbage Collection of Servants
- •12.8 Garbage Collection of CORBA Objects
- •12.9 Summary
- •Part III: CORBA Mechanisms
- •Chapter 13. GIOP, IIOP, and IORs
- •13.1 Chapter Overview
- •13.2 An Overview of GIOP
- •13.3 Common Data Representation
- •13.4 GIOP Message Formats
- •13.5 GIOP Connection Management
- •13.6 Detecting Disorderly Shutdown
- •13.7 An Overview of IIOP
- •13.8 Structure of an IOR
- •13.9 Bidirectional IIOP
- •13.10 Summary
- •Chapter 14. Implementation Repositories
- •14.1 Chapter Overview
- •14.2 Binding Modes
- •14.3 Direct Binding
- •14.4 Indirect Binding via an Implementation Repository
- •14.5 Migration, Reliability, Performance, and Scalability
- •14.6 Activation Modes
- •14.7 Race Conditions
- •14.8 Security Considerations
- •14.9 Summary
- •Part IV: Dynamic CORBA
- •Chapter 15. C++ Mapping for Type any
- •15.1 Chapter Overview
- •15.2 Introduction
- •15.3 Type any C++ Mapping
- •15.4 Pitfalls in Type Definitions
- •15.5 Summary
- •Chapter 16. Type Codes
- •16.1 Chapter Overview
- •16.2 Introduction
- •16.3 The TypeCode Pseudo-Object
- •16.4 C++ Mapping for the TypeCode Pseudo-Object
- •16.5 Type Code Comparisons
- •16.6 Type Code Constants
- •16.7 Type Code Comparison for Type any
- •16.8 Creating Type Codes Dynamically
- •16.9 Summary
- •Chapter 17. Type DynAny
- •17.1 Chapter Overview
- •17.2 Introduction
- •17.3 The DynAny Interface
- •17.4 C++ Mapping for DynAny
- •17.5 Using DynAny for Generic Display
- •17.6 Obtaining Type Information
- •17.7 Summary
- •Part V: CORBAservices
- •Chapter 18. The OMG Naming Service
- •18.1 Chapter Overview
- •18.2 Introduction
- •18.3 Basic Concepts
- •18.4 Structure of the Naming Service IDL
- •18.5 Semantics of Names
- •18.6 Naming Context IDL
- •18.7 Iterators
- •18.8 Pitfalls in the Naming Service
- •18.9 The Names Library
- •18.10 Naming Service Tools
- •18.11 What to Advertise
- •18.12 When to Advertise
- •18.13 Federated Naming
- •18.14 Adding Naming to the Climate Control System
- •18.15 Summary
- •Chapter 19. The OMG Trading Service
- •19.1 Chapter Overview
- •19.2 Introduction
- •19.3 Trading Concepts and Terminology
- •19.4 IDL Overview
- •19.5 The Service Type Repository
- •19.6 The Trader Interfaces
- •19.7 Exporting Service Offers
- •19.8 Withdrawing Service Offers
- •19.9 Modifying Service Offers
- •19.10 The Trader Constraint Language
- •19.11 Importing Service Offers
- •19.12 Bulk Withdrawal
- •19.13 The Admin Interface
- •19.14 Inspecting Service Offers
- •19.15 Exporting Dynamic Properties
- •19.16 Trader Federation
- •19.17 Trader Tools
- •19.18 Architectural Considerations
- •19.19 What to Advertise
- •19.20 Avoiding Duplicate Service Offers
- •19.21 Adding Trading to the Climate Control System
- •19.22 Summary
- •Chapter 20. The OMG Event Service
- •20.1 Chapter Overview
- •20.2 Introduction
- •20.3 Distributed Callbacks
- •20.4 Event Service Basics
- •20.5 Event Service Interfaces
- •20.6 Implementing Consumers and Suppliers
- •20.7 Choosing an Event Model
- •20.8 Event Service Limitations
- •20.9 Summary
- •Part VI: Power CORBA
- •Chapter 21. Multithreaded Applications
- •21.1 Chapter Overview
- •21.2 Introduction
- •21.3 Motivation for Multithreaded Programs
- •21.4 Fundamentals of Multithreaded Servers
- •21.5 Multithreading Strategies
- •21.6 Implementing a Multithreaded Server
- •21.7 Servant Activators and the Evictor Pattern
- •21.8 Summary
- •Chapter 22. Performance, Scalability, and Maintainability
- •22.1 Chapter Overview
- •22.2 Introduction
- •22.3 Reducing Messaging Overhead
- •22.4 Optimizing Server Implementations
- •22.5 Federating Services
- •22.6 Improving Physical Design
- •22.7 Summary
- •Appendix A. Source Code for the ICP Simulator
- •Appendix B. CORBA Resources
- •Bibliography
IT-SC book: Advanced CORBA® Programming with C++
We use the same mutex in the preinvoke implementation to serialize the accesses and modifications to the evictor queue and the active object map.
Instead of invoking Controller_impl::remove_impl from the
Thermometer_impl::remove method to remove the target device's asset number, we change the code to perform the removal from within preinvoke. This is necessary to ensure the consistency of the evictor queue and active object map with the set of asset numbers (by manipulating them while holding a lock on m_assets_mutex) and to avoid leaking a servant.
We derive the Thermometer_impl class from RefCountServantBase to inherit a thread-safe implementation of _remove_ref that we can use instead of deleting our servants directly. In this way, we avoid deleting our servants out from under other requests simultaneously in progress in other threads.
None of the Thermometer_impl or Thermostat_impl method implementations need worry about mutex locks because they merely invoke atomic operations on the devices they represent. Even remove, which modifies the m_removed data member, does not need to guard against concurrent access because the serialization that is performed for each operation invocation in DeviceLocator_impl::preinvoke ensures that only a single thread ever invokes remove.
The most serious drawback to our multithreaded evictor implementation is that we lock the m_assets_mutex for most of the preinvoke method, effectively serializing all invocations of it. Because preinvoke is called for every request on every Thermometer and Thermostat object, this could seriously degrade our application's performance because of lock contention, especially if most requests take little time to process.
Briefly, alternatives to the servant locator evictor implementation include the following:
- Using a simple servant locator that allocates a new servant for each request, an approach that incurs costs because of excessive heap allocation
- Using servant activators, and thus having the POA keep an Active Object Map in memory
- Using a default servant, for which you pay the costs of determining the target object ID for each request as well as having to locate the object's (possibly persistent) state
For your applications, you must decide whether the locking overhead of the servant locator evictor solution outweighs the costs of the alternatives.
21.7 Servant Activators and the Evictor Pattern
In Section 11.9 we describe how invoking deactivate_object does not guarantee that the POA will actually remove the object's Active Object Map entry in a timely fashion. This is because the POA keeps the Active Object Map entry intact until no more requests are active for that object. Unfortunately, this means that an object receiving a steady stream of requests might never be deactivated, in which case its servant will never be etherealized.
In a multithreaded environment, this lack of predictable servant etherealization makes it extremely difficult, if not impossible, to correctly implement the Evictor pattern using servant activators. Because a steady stream of requests can effectively prevent object deactivation and lock a servant into memory, you can end up with many more servants in memory than your evictor queue can contain. Evicted servants wind up in a sort of limbo. They are no longer managed by your evictor code and yet are kept artificially alive by incoming requests until the POA gets the chance to tell the servant activator to etherealize them. If you manage your evictor queue properly and remove servants from it when necessary, all you can do is invoke deactivate_object and hope that the evicted servants get etherealized sooner rather than later so that they can clean up after themselves.
If you are sure that your application's size and performance will not suffer from having too many servants in memory at once, you should not worry about the costs associated with using servant activators. Otherwise, we recommend that you use servant locators or default servants.
21.8 Summary
In this chapter we provide a short overview of multithreading issues for CORBA applications. We also show an example that follows from our presentation of the Evictor pattern in Chapter 12, this time showing how to safely implement the evictor in a multithreaded application.
Multithreaded programming can be difficult, even for seasoned veterans. It pays to spend a little more time in up-front design work when you're programming for multiple threads, and it does not hurt to have a peer review your code when you think it is finished. Even simple, informal code reviews can help you track down insidious concurrency bugs that might otherwise take hours or even days to resolve.
We do not try to provide a complete tutorial on multithreaded programming in this chapter. To learn more about general multithreaded programming techniques, please refer to the appropriate resources listed in Appendix B and the Bibliography.
Chapter 22. Performance, Scalability, and Maintainability
22.1 Chapter Overview
In this chapter we bring the book to a close with a light discussion of design techniques that can help make your CORBA applications perform well and be more scalable, maintainable, and portable. Section 22.3 discusses overhead due to remote method invocation and explains how you can reduce it. In Section 22.4 we review the techniques described in earlier chapters for optimizing server applications. Following that,
Section 22.5 briefly presents federation as a solution to distributing process load. Finally, Section 22.6 describes an approach for isolating your application code from your CORBA-related code to achieve better portability and separation of concerns.
22.2 Introduction
By now you probably realize that CORBA does not present you with a cookbook approach to building distributed systems. You cannot simply throw together a handful of interfaces, implement them, and expect to have a distributed application. To be sure, CORBA makes it easy to create a client-server application in a short time. However, if you want to build something that scales to large numbers of objects and at the same time performs well, you must plan ahead.
This chapter presents a few of the design techniques you have at your disposal for building applications that scale and perform well without sacrificing maintainability of your code. This discussion is by no means complete—CORBA applies to such a variety of distributed systems applications that we cannot hope to cover the topic extensively here. Instead, we present a few design techniques that are likely to be useful for many kinds of applications. Of course, you must use your own judgment as to whether these techniques apply to your situation.
Typically, the goals of scalability, performance, and maintainability are in conflict. For example, better performance often implies coding or design techniques that result in source code that compromises maintainability. Similarly, increased scalability often implies a reduction in performance. Steering the correct path through these trade-offs is the hallmark of design excellence, and a cookbook approach is unlikely to lead to the best possible solution. However, the techniques we present here should help to get you started along the correct path and provide a source of ideas you can modify as the situation demands.
22.3 Reducing Messaging Overhead
The similarity of IDL to C++ class definitions makes it easy to approach IDL interface design the same way you would approach C++ class design. This similarity is both a boon and a bane. Because IDL is so similar to C++ in both syntax and semantics, most programmers quickly feel at home with the language and can begin to develop applications with only a small learning curve. However, pretending that IDL design is the same as C++ class design is likely to get you into trouble. Even though CORBA makes it easy to ignore details of distribution and networking, this does not mean that you can pretend that distribution and networking do not exist. It is easy to forget that sending a message to a remote object is several orders of magnitude slower than sending a message to a local object. Naive IDL design therefore can easily result in systems that work but are unacceptably slow.
22.3.1 Basic IIOP Performance Limitations
To get at least a basic idea of the fundamental performance parameters within which you must create your design, you must know the cost of sending remote messages. This cost is determined by two factors: latency and marshaling rate. Call latency is the minimum cost of sending any message at all, whereas the marshaling rate determines the cost of sending and receiving parameter and return values depending on their size.
Call Latency
The cheapest message you can send is one that has no parameters and does not return a result:
interface Cheap {
oneway void fast_operation();
};
The number of fast_operation invocations that your ORB can deliver per time interval sets a fundamental design limit: you cannot hope to meet your performance goals if your design requires more messages to be sent per time interval than your ORB can deliver.
Unfortunately, the only real way to find out what your ORB is capable of is to create benchmarks. The cost of call dispatch varies considerably among environments and depends on a large number of variables, such as the underlying network technology, the CPU speed, the operating system, the efficiency of your TCP/IP implementation, your compiler, and the efficiency of the ORB run time itself. To give you a rough idea of the state of current technology, general-purpose ORBs have call dispatch times of between 0.5 msec and 5 msec. In other words, depending on your ORB implementation, you can expect a maximum call rate of 200 to 2,000 operation invocations per second. (We obtained these figures by running a client and a server on different machines connected by an otherwise unused 10 Mb Ethernet; client and server were running on typical UNIX workstations and were implemented using a number of commercial general-purpose
ORBs. As pointed out earlier, you must run appropriate benchmarks yourself to determine the call dispatch cost for your particular environment.)
Whether your ORB is capable of sending 200 or 2,000 invocations per second, the main point is that a remote call is several orders of magnitude slower than a local call. (On an average UNIX workstation, you will be able to easily achieve more than 1,000,000 local C++ function calls per second.)
Marshaling Rate
The second factor that limits the speed of remote invocations of an ORB is its marshaling rate—that is, the speed with which an ORB can transmit and receive data over the network. Marshaling performance depends on the type of data transmitted. Simple types, such as arrays of octet, typically marshal fastest. (This is not surprising considering that an ORB can marshal arrays of octet by doing a simple block copy into a transmit buffer.) On the other hand, marshaling highly structured data, such as nested user-defined types or object references, is usually much slower because the ORB must do more work at run time to collect the data from different memory locations and copy it into a transmit buffer. Most ORBs also slow down significantly when marshaling any values containing complex data, mainly because the type codes for complex data are themselves highly structured.
Again, marshaling rates vary widely among various combinations of network, hardware, operating system, compiler, and ORB implementation. As a rough guide, you can expect marshaling rates between 200 kB/sec and 800 kB/sec between average UNIX workstations over a 10 Mb Ethernet, depending on the type of data and your ORB. You must conduct your own benchmarks to obtain reliable figures for your environment.
22.3.2 Fat Operations
As the preceding discussion shows, call latency is a major limiting factor in the performance of a distributed system. Even assuming a fast call dispatch rate of 1,000 calls per second, it becomes clear that making any remote call is expensive, at least when compared with the cost of making a local call. In addition, the overall cost of remote calls is dominated by the call latency until parameters reach several hundred bytes in size, so an invocation without parameters takes about the same time as an invocation that transmits a few parameters.
IDL for Fat Operations
One way to design a more efficient system, therefore, is to make fewer calls overall. For small parameters up to a few hundred bytes the cost of a remote invocation is essentially constant, so we might as well make calls worthwhile by sending more data with each call. This design trade-off is also known as the fat operation technique. Consider again the Thermometer interface from the climate control system:
interface Thermometer {
    readonly attribute ModelType model;
    readonly attribute AssetType asset_num;
    readonly attribute TempType  temperature;
    attribute LocType            location;
};
This interface uses a fine-grained approach to object modeling by making each piece of state a separate attribute. This design is both clean and easy to understand, but consider a client that has just obtained a thermometer reference, for example, from a list or find operation. To completely retrieve the state of the thermometer, the client must make four remote calls, one for each attribute. In addition, we would expect that the client is highly likely to make these calls (or at least some of them) because a thermometer is not very interesting unless we know something about its state. We can reduce the number of messages by changing the Thermometer interface as follows:
struct ThermometerState {
    ModelType model;
    AssetType asset_num;
    TempType  temperature;
    LocType   location;
};

interface Thermometer {
    ThermometerState get_state();
    void             set_location(in LocType location);
};
Instead of modeling each piece of state with a separate attribute, this interface provides the get_state operation to return all of a thermometer's state with a single call. The set_location operation allows us to update a thermometer's location. Because a thermometer has only a single writable attribute, set_location accepts a string parameter. However, for interfaces with several writable attributes, we could combine all the writable attributes into a structure and create a set_state operation that updates all attributes with a single call.
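The saving is easy to see on the client side. The toy model below (plain C++, no ORB involved; all names are ours, not generated stub code) counts simulated round trips for a full state read under each design:

```cpp
#include <string>

// Toy stand-in for a remote thermometer. Each member function
// bumps a counter, modeling one remote round trip per call.
struct ThermometerState {
    std::string model;
    long        asset_num;
    double      temperature;
    std::string location;
};

struct RemoteThermometer {
    ThermometerState s{"Sens-A", 5, 21.5, "Room 414"};
    mutable int round_trips = 0;

    // Fine-grained design: one remote call per attribute.
    std::string model() const    { ++round_trips; return s.model; }
    long asset_num() const       { ++round_trips; return s.asset_num; }
    double temperature() const   { ++round_trips; return s.temperature; }
    std::string location() const { ++round_trips; return s.location; }

    // Fat operation: all of the state in one remote call.
    ThermometerState get_state() const { ++round_trips; return s; }
};

// Full state read with one attribute call per field: 4 round trips.
int full_read_fine_grained() {
    RemoteThermometer t;
    t.model(); t.asset_num(); t.temperature(); t.location();
    return t.round_trips;
}

// Full state read via the fat operation: 1 round trip.
int full_read_fat() {
    RemoteThermometer t;
    t.get_state();
    return t.round_trips;
}
```

Multiplied across N thermometers obtained from a list operation, those four calls per device are the source of the 4N extra messages discussed below.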
This technique not only applies to things such as thermometers but also works effectively for collection manager operations, such as list:
interface Controller {
    typedef sequence<Thermometer> ThermometerSeq;

    ThermometerSeq list();
    // ...
};
Again, consider a client calling list. It is highly likely that the client will immediately retrieve the state information for the devices returned by list; otherwise, there would be no point in calling the operation. This requires sending as many messages as there are
devices in the system. Again, we can apply the fat operation technique to reduce messaging overhead:
struct ThermometerState {
    ModelType model;
    AssetType asset_num;
    TempType  temperature;
    LocType   location;
};

// ...

interface Controller {
    struct ListItem {
        Thermometer      ref;
        ThermometerState state;
    };
    typedef sequence<ListItem> ThermometerSeq;

    ThermometerSeq list_thermometers();
    // ...
};
With this IDL definition, a client calling list_thermometers receives not only the object references for all thermometers but also the current state information. As a result, the client does not have to make additional calls to retrieve the state for each thermometer.

The fat operation technique can result in substantial performance gains. With the fat list_thermometers operation, we achieve in a single invocation what took 4N + 1 operations in the original CCS design, where N is the number of thermometers. The price we pay is that completing a list_thermometers operation takes longer than completing an individual get_state operation because list_thermometers transmits more data. However, for large numbers of thermometers, list_thermometers is likely to be substantially faster because it saves the call dispatch overhead for N remote calls.

Let us assume a marshaling rate of 500 kB/sec for list_thermometers, a call latency of 2 msec, and a system containing 10,000 thermometers. If each individual ListItem structure contains 250 bytes of data, a call to list_thermometers takes about five seconds. In contrast, using our original CCS IDL, retrieving the state for all 10,000 thermometers requires 40,001 remote calls, and that takes about 80 seconds.
Evaluating Fat Operations
At this point, you may be jubilantly concluding that fat operations are the answer to all your performance problems. Before you rush to this conclusion, let us examine some of the trade-offs involved.
The fat operation technique is very sensitive to the number of devices and the call latency and marshaling rate of your ORB. For example, if we assume a slower marshaling rate of 250 kB/sec but a call latency of 1 msec with 1,000 devices, the original CCS design
results in an overall time of four seconds, whereas the fat list_thermometers operation requires one second to execute. In other words, the performance difference between the two approaches shrinks from a factor of 16 to a factor of 4.
Fat collection manager operations do not scale to large numbers of items. For 10,000 devices, list_thermometers returns around 2.5 MB of data. This is already at or beyond the maximum request size for many general-purpose ORBs. If instead we have 100,000 devices, list_thermometers must return 25 MB of data, which is likely to exceed the memory limitations of either client or server on many systems.
You can easily get around these problems by adding iterator interfaces to limit the size of the data returned by each call. However, iterators make your design and implementation more complex. In addition, for a robust system, you must deal with garbage collection of iterators (see Chapter 12).
The fat operation technique uses structures instead of interfaces that encapsulate state. As a result, it is more difficult to modify the system later so that it remains backward compatible. You cannot create a new version by deriving new interfaces from the old ones.
In general, the fat operation technique works best if you have large numbers of objects that hold a small amount of state. This is because the fat operation technique works best if the overall cost is dominated by call latency. As soon as individual objects hold more than a few hundred bytes of state, the overall cost is likely to be dominated by the marshaling rate, and the fat operation technique suffers from diminishing returns.
The most serious drawback of the fat operation technique is more subtle. If you look again at the IDL shown earlier, you will notice that the list_thermometers operation can handle only thermometers, whereas the list operation in the original CCS design is polymorphic—that is, it can return a list containing both thermometers and thermostats. In other words, the fat operation technique loses polymorphism because IDL provides only polymorphic interfaces and not polymorphic values.[1]
[1] The Objects-By-Value functionality added with CORBA 2.3 allows you to create polymorphic values by extending inheritance to value types. However, the OBV specification is very new and still suffers from a number of technical problems. In addition, OBV implementations are not available as of this writing, so we do not cover OBV in this book.
We can modify our interfaces so that a single list operation can return the state of both thermometers and thermostats, but we lose simplicity:
// ...

struct ThermometerState {
    ModelType model;
    AssetType asset_num;
    TempType  temperature;
    LocType   location;
};

struct ThermostatState {
    TempType nominal_temp;
};

union ThermostatStateOpt switch (boolean) {
case TRUE:
    ThermostatState state;
};

struct DeviceState {
    ThermometerState   thermo_state;
    ThermostatStateOpt tmstat_state;
};

interface Controller {
    struct ListItem {
        Thermometer ref;
        DeviceState state;
    };
    typedef sequence<ListItem> DeviceSeq;

    DeviceSeq list();
    // ...
};
This design simulates polymorphism for the list operation by adding the tmstat_state union member to the DeviceState structure. If a particular device returned by list is a thermostat, the ThermostatStateOpt union contains the additional state specific to thermostats; otherwise, for thermometers, the ThermostatStateOpt union contains no active member. (There are other choices for simulating polymorphism, such as designs using any values. The trade-offs are largely the same for the alternative designs.)
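Client code consuming such a list must branch on the union's discriminator before touching the thermostat-specific member. The sketch below models that dispatch in plain C++, with std::optional standing in for the boolean-discriminated ThermostatStateOpt (these are toy types of our own, not the generated IDL-to-C++ mapping):

```cpp
#include <optional>
#include <vector>

// Toy stand-ins for the IDL types.
struct ThermometerState { double temperature; };
struct ThermostatState  { double nominal_temp; };

struct DeviceState {
    ThermometerState               thermo_state;
    std::optional<ThermostatState> tmstat_state; // empty => plain thermometer
};

// Simulated-polymorphism dispatch: only read nominal_temp when the
// "union" has an active member.
int count_thermostats(const std::vector<DeviceState>& list) {
    int n = 0;
    for (const DeviceState& d : list) {
        if (d.tmstat_state)  // discriminator test
            ++n;             // safe: d.tmstat_state->nominal_temp exists
    }
    return n;
}

std::vector<DeviceState> sample_list() {
    return {
        {{21.0}, std::nullopt},          // thermometer only
        {{19.5}, ThermostatState{20.0}}, // thermostat
        {{22.3}, std::nullopt},          // thermometer only
    };
}
```

Every new device type would force another member into DeviceState and another branch into every such loop, which is exactly the extensibility problem this design creates.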
This approach works, but it loses a lot of elegance. In addition, simulated polymorphism is much harder to extend to new device types later because every new device type requires modification of the DeviceState structure.
If you examine the preceding IDL, its inelegance will probably put you off sufficiently to reject this design. If it does, you are correct: fat operations simply do not agree with polymorphism, so you should limit the fat operation technique to situations that do not require derived interfaces.
22.3.3 Coarse Object Models
A more extreme performance optimization than the fat operation technique is to reduce the granularity of the object model. This technique relies on replacing interfaces with data:
#pragma prefix "acme.com"

module CCS {
    // ...

    struct Thermometer {
        ModelType model;
        AssetType asset_num;
        TempType  temperature;
        LocType   location;
    };

    struct Thermostat {
        ModelType model;
        AssetType asset_num;
        TempType  temperature;
        LocType   location;
        TempType  nominal_temp;
    };

    interface Controller {
        exception BadAssetNumber {};
        exception NotThermostat  {};
        exception BadTemp        { /* ... */ };

        Thermometer get_thermometer(in AssetType anum)
                        raises(BadAssetNumber);
        void        set_loc(in AssetType anum, in LocType loc)
                        raises(BadAssetNumber);
        Thermostat  get_thermostat(in AssetType anum)
                        raises(BadAssetNumber);
        void        set_nominal(in AssetType anum, in TempType t)
                        raises(BadAssetNumber, NotThermostat, BadTemp);

        // Fat operations to get and set large numbers of
        // thermometers and thermostats here...
    };
};
In this design, we eliminate thermometer and thermostat objects and replace them with structures. This design offers a number of performance advantages.
We reduce the number of CORBA objects in the system to one by retaining only the controller object. This results in a server that requires less code and data at run time and so is likely to scale to a larger number of objects.
Turning objects into data acts as a primitive form of caching. A client holds all of a device's state locally, so repeated read accesses to a particular device do not require a remote message.
For clients that want to deal with thousands of devices simultaneously, we eliminate the need to hold object references. This substantially reduces memory requirements because the client no longer holds a proxy object for each device.
Naturally, using a coarse object model also has drawbacks.
Coarse object models lose some type safety. For example, the preceding design requires a BadAssetNumber exception on every operation in case a client supplies a non-existent asset number. In the original CCS design, this error condition could never arise because the asset number was implicit in the object reference for each device.
Coarse object models are not polymorphic. Each client must explicitly be aware of all possible types of object and requires modification if more specialized versions of objects are later added to the system. Moreover, coarse object models create error conditions that would otherwise be absent. For example, the set_nominal operation has a NotThermostat exception because a client might specify the asset number of a thermometer for the operation, but a thermometer does not have a nominal temperature attribute.
Thermometers and thermostats are no longer stand-alone entities that can be passed from address space to address space. Suppose that we have located a thermostat of interest and want to pass the thermostat to another process that adjusts the desired temperature for us. With the original CCS design, this is trivial: we simply pass the reference to the relevant thermostat. However, with a coarse object model, it is not sufficient to pass only a Thermostat structure. Instead, we must pass both the structure and a reference to the controller because the receiver of the structure may not know which particular controller is responsible for this particular thermostat. If your application has more than one collection manager for a particular type of object, the need to track the associations between the collection managers and their objects can complicate the design considerably.
In general, the coarse object model approach works well if you do not require polymorphism and if objects are simple, small collections of attributes without behavior. In this case, objects provide set/get semantics for only a small number of attributes and so might as well be represented as structures. Coarse object models are similar to the fat operation technique in that they reduce messaging overhead. However, the main value of coarse object models is that they can improve scalability because they reduce the memory overhead for clients and servers dealing with large numbers of objects.
22.3.4 Client-Side Caching
Both fat operations and coarse object models enable client-side caching of state. After a client has retrieved the state for a particular object, it can keep a local copy of that state. Future read operations on the object can be satisfied by returning state from the local copy and so do not require a remote invocation. For update operations, the client can send a remote message as usual to update the state information for the object in the server.
Read accesses typically account for more than 95% of the total number of operations in a distributed system, so client-side caching can result in a dramatic reduction of the number of remote messages that are sent. Unfortunately, client-side caching also has a number of drawbacks.
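A minimal sketch of the idea (plain C++ around a toy "remote" object; the class names are ours): reads are served from a local copy after the first fetch, and updates write through to the server while refreshing the copy.

```cpp
#include <optional>

// Toy remote thermostat: each member function models one remote message.
struct RemoteThermostat {
    double nominal = 20.0;
    mutable int messages = 0;
    double get_nominal() const   { ++messages; return nominal; }
    void   set_nominal(double t) { ++messages; nominal = t; }
};

// Client-side write-through cache.
class CachedThermostat {
public:
    explicit CachedThermostat(RemoteThermostat& r) : remote_(r) {}

    double nominal() {
        if (!cached_)                      // first read: one remote message
            cached_ = remote_.get_nominal();
        return *cached_;                   // later reads: purely local
    }

    void set_nominal(double t) {
        remote_.set_nominal(t);            // update the server as usual
        cached_ = t;                       // keep the local copy current
    }

private:
    RemoteThermostat&     remote_;
    std::optional<double> cached_;
};

// Three reads and one update cost only two remote messages.
int demo_message_count() {
    RemoteThermostat server;
    CachedThermostat cache(server);
    cache.nominal();          // remote fetch
    cache.nominal();          // cached
    cache.set_nominal(22.0);  // remote update
    cache.nominal();          // cached
    return server.messages;
}
```

The obvious catch, and the chief drawback alluded to above, is staleness: if another client updates the same thermostat, this cache happily returns outdated state until it is invalidated or refreshed.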