
Professional Java, JDK 5 Edition (Wrox)
Chapter 11: Communicating between Java Components and Components of Other Platforms
TCP Monitoring: Testing with Apache TCPMon
Testing and debugging a protocol implementation is far more difficult and tedious than testing and debugging a standalone application. To make sure the implementation you are developing is correct, it is extremely helpful to see exactly what is being sent to and received from the remote server over the wire. Utilities exist that do just that: they show what is sent and received over a TCP/IP socket connection. For HttpGetter, I used the Apache utility TCPMon to monitor my TCP/IP connection with remote Web servers. Reading my request in the utility confirmed that it followed the HTTP specification. If there was any trouble parsing the response, I could look at exactly what the server had sent back.
Parsing the input from a socket is very similar to parsing a file: the data is in a certain format, and the code must read that format. With sockets, though, there is no file to view and test against, so when there is a bug it is difficult to see what in the protocol could be causing it. This is why TCPMon is invaluable; it lets the developer look at the server’s response as if it were a file on the local machine. It is useful when implementing any protocol based on TCP/IP and when developing Web services. This chapter also discusses using TCPMon in the “Web Services” section.
Getting, Building, and Running TCPMon
TCPMon is included as part of Apache AXIS, an implementation of SOAP that is discussed in more detail in the “Web Services” section. TCPMon ships with AXIS because it is useful in Web services development, but it is just as useful for socket development, especially when implementing a protocol. The AXIS distribution can be downloaded from the following URL:
http://ws.apache.org/axis/index.html
Make sure to download a source distribution of AXIS. You will also need the Apache Ant build tool to build and run TCPMon (as well as the AXIS distribution). See Chapter 2, “Tools and Techniques for Developing Java Solutions,” for more information regarding Ant. In the AXIS source distribution /docs folder, there is documentation on building AXIS (building-axis.html at the time of this writing). You will have to download a few libraries before you can actually build AXIS. Look at the “Building without Any Optional Components” and the “Building with Servlets” sections (in building-axis.html) for the links to these libraries. After all the required jars are in /lib of the AXIS source distribution, build AXIS by running the following in the directory with build.xml (the root directory of the distribution):
ant compile
After AXIS has been successfully built, run TCPMon by again using Ant:
ant -buildfile tcpmon.xml
Using TCPMon
For TCPMon to print out your requests and the server’s responses, it must be set up as a middleman between your local machine and the remote server. To test your program, have it connect to TCPMon, which in turn connects to the remote server. TCPMon relays whatever is sent to it to the remote server, and relays whatever the remote server sends back to your application. To configure TCPMon in this manner, it must be set up as a Listener and given a port number on the local machine. Figure 11-5 shows the first screen of TCPMon, which is also its main configuration screen. The figure shows the configuration necessary for TCPMon to act as a Listener on port 8079. TCPMon will relay any connection made to port 8079 on the local machine to www.google.com, port 80 (the default HTTP port). Once the Add button is clicked, TCPMon will set up the relay.
Figure 11-5
Now that the relay is running, HttpGetter can be tested by running:
java book.HttpGetter http://localhost:8079/ tester.html
HttpGetter connects to TCPMon, which in turn connects it to www.google.com. Going to the Port 8079 tab in TCPMon yields a list of all connection attempts made to www.google.com in this session. Figure 11-6 shows each request and response in detail.
Figure 11-6
Debugging a protocol implementation is far easier with a utility such as Apache TCPMon, which allows the developer to view the data sent and received over a TCP/IP connection.
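The relaying idea described in this section can be sketched in a few lines of Java. This is only an illustration of what a logging relay does conceptually, not TCPMon's actual code; the class name LoggingRelay and the methods pump and relayOnce are hypothetical names chosen for this sketch, and it handles a single connection only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical sketch of a logging TCP relay; not TCPMon's implementation. */
class LoggingRelay {

    /** Copies every byte from in to out and writes a copy of it to log. */
    static long pump(InputStream in, OutputStream out, OutputStream log) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            out.flush();
            log.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    /** Accepts one connection on listenPort and relays it to targetHost:targetPort. */
    static void relayOnce(int listenPort, String targetHost, int targetPort) throws Exception {
        try (ServerSocket listener = new ServerSocket(listenPort);
             Socket client = listener.accept();
             Socket server = new Socket(targetHost, targetPort)) {
            // Client-to-server traffic is pumped on a second thread ...
            Thread upstream = new Thread(() -> {
                try {
                    pump(client.getInputStream(), server.getOutputStream(), System.out);
                    server.shutdownOutput();
                } catch (IOException ignored) {
                    // A closed socket simply ends this direction of the relay.
                }
            });
            upstream.start();
            // ... while server-to-client traffic is pumped on this thread.
            pump(server.getInputStream(), client.getOutputStream(), System.out);
            client.shutdownOutput(); // tell the client that the response is complete
            upstream.join();
        }
    }
}
```

Under these assumptions, calling LoggingRelay.relayOnce(8079, "www.google.com", 80) would mirror the Listener configuration described above, printing both the request and the response to the console as they pass through.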
Proprietary Protocols and Reverse Engineering
Some protocols are not open. The instant messaging protocols for AOL’s Instant Messenger and Microsoft’s Messenger clients are proprietary and are currently not shared (although the FCC is trying to force an open instant messaging standard to allow various clients to interoperate). If your software must communicate with servers such as these, whose protocol is either unknown or proprietary, your options are limited. Some groups, such as Gaim (http://gaim.sourceforge.net), an open-source instant messaging client, have attempted to reverse-engineer the instant messaging protocols. This is done by monitoring the TCP connections and the data sent between proprietary clients and servers; sometimes portions of a protocol can be identified this way. When designing a proprietary protocol, it is important to consider how easy it would be to reverse-engineer (especially if security is a high priority). For extra security, some form of encryption may be necessary to keep the protocol from being reverse-engineered.
Most of the time, protocols should be open. Open specifications are generally easier for everyone to implement, since they have the advantage of being reviewed by many different sets of eyes. HTTP, for example, gained a number of performance-improving amendments between versions 1.0 and 1.1. The most robust and stable protocol implementations generally come from free and open protocols that have been in use for a while. High-quality reference implementations have been developed for protocols such as HTTP, TCP/IP, and the X Window System precisely because those protocols are open.
Utilizing Existing Protocols and Implementations
Developers will want to avoid designing and writing their own protocol if at all possible. An existing protocol will usually fulfill the requirements of almost any application. There is no point in reinventing the wheel, and using open protocols is often a good way to ease the difficulty of interoperating with the outside world. If your application needs to interface with other applications, designing and writing a custom protocol has even higher costs: any other application that wishes to interface with yours must now implement that custom protocol too. Getting two disparate implementations of a protocol to work robustly together is no easy task in itself, let alone in addition to normal application development.
There are many protocols that already have high-quality implementations freely available to Java developers. The Jakarta Project from Apache hosts many open source projects. The Jakarta Commons Net package, for example, provides an API that implements FTP, NNTP, SMTP, POP3, Telnet, TFTP, and more. You can find more information about it at the following URL:
http://jakarta.apache.org/commons/net/
Even though in your HttpGetter example, you found that implementing one small section of HTTP was fairly simple, implementing the entire protocol with all of its optional components would be far more difficult. There are already optimized implementations of HTTP out there, and using one would be a far better design choice in any application that requires HTTP client support. The JDK provides limited support for HTTP via the java.net.URL class. It is good for simple HTTP operations, but sometimes more control over how HTTP is used is necessary. For example, to view and set HTTP headers, an HTTP client library that exposes more HTTP details than the java.net.URL class found in the JDK would be required. The HTTP Client project in the Jakarta Project provides a high-quality HTTP implementation. More information on HTTP Client can be found here:
http://jakarta.apache.org/commons/httpclient/
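For the simple case mentioned above, where java.net.URL is sufficient, a minimal sketch might look like the following. The class name SimpleUrlFetch and the method fetch are hypothetical, and the code assumes the resource is UTF-8 text; it does nothing with HTTP headers.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;

/** Hypothetical helper showing a simple read through java.net.URL. */
class SimpleUrlFetch {

    /** Reads the whole resource behind the URL as text, one line at a time. */
    static String fetch(URL url) throws IOException {
        StringBuilder body = new StringBuilder();
        // openStream() opens the connection and returns the response body stream.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(url.openStream(), "UTF-8"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line).append('\n');
            }
        }
        return body.toString();
    }
}
```

The same method works for any URL scheme the JDK supports, so it can be tried against a local file URL before pointing it at a Web server.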
You have just looked at some freely available client libraries. Freely available libraries also exist for servers. The Jakarta Project provides an HTTP server implementation with its servlet container, Tomcat, and implementations of POP3 mail servers are available as well. It almost always makes sense to use an existing protocol for communication between your Java components and components on other platforms. You also should not have to implement the protocol yourself, since high-quality, robust open source implementations are available for almost all of the major open protocols in use today.
Some great resources for finding and aggregating open source Java projects into your application are listed in the following table.
Resource | URL
The Jakarta Project | http://jakarta.apache.org
OpenSymphony Quality Components | http://www.opensymphony.com
JBoss: Professional Open Source | http://www.jboss.org
The Apache XML Project | http://xml.apache.org
The Eclipse Project | http://www.eclipse.org
Remote Method Invocation
Remote Method Invocation is the Java platform’s standard for remote procedure calls (RPC). A remote procedure call is conceptually the same as a normal procedure call within a program, except that the call can happen over a network and takes place between two separate processes. Different forms of RPC have been around for a while, but the concepts are similar. There is a client program and a server program, each running on a separate machine (or, at the very least, in two separate processes on the same machine). The client program calls a procedure (in Java terminology, a method) on the server and waits until the server returns the method result before continuing its normal execution, just like a normal local method call. Figure 11-7 illustrates a high-level view of object-to-object communication over a network in different JVMs.
Figure 11-7 (diagram: two JVMs, each containing a remote object, connected by a network)
Remote Method Invocation (RMI) is such a large topic that it has its own chapter. See Chapter 10, “Communicating between Java Components with RMI and EJB,” for detailed information on how to use RMI in your applications. This chapter will take a more abstract view of RMI and see how it fits as a technology into distributed systems.
Core RPC/RMI Principles
The Java platform makes writing client/server programs fairly simple. In Java, you can call methods on an object without necessarily knowing that the object resides on a remote machine. The code for the method call is no different from a normal local method call. In J2EE, you generally have to look object instances up from a naming service before using them. When you look the object up and receive a reference to it, it may be a local reference or a remote reference. The code does not change, though, and this is one of the reasons Java is such a powerful server language: many of the complex details of technologies such as RMI have been abstracted away.
This does not mean developers can be completely oblivious to whether an object instance is remote or local. Remote objects have certain design trade-offs that must be taken into account. Method calls happen across a network and are thus limited by the reliability and speed of that network. RMI is a powerful mechanism for writing distributed systems. The following sections look into the core principles common to almost all RPC mechanisms and show how they relate to RMI.
In RPC, all method calls must be transformed into a format that can be sent over the network and understood by a remote process. In order to call methods on a remote object, three main steps occur:
1. A reference to the remote object must be obtained. The remote object must be looked up on the remote server.
2. Marshalling and unmarshalling of parameters. When a method is invoked on the remote reference, the parameters must be marshalled into a byte stream that can be sent over the network. On the server side, these parameters must be unmarshalled from the byte stream into their original values and then passed to the appropriate method.
3. Transmission of data through a common protocol. There must be a protocol defined for the transport and delivery of these method calls and returns. A standard format for parameters is necessary, along with standards to tell the server which method on which object is to be invoked.
To make the remote call appear like a local call, a local implementation exists with the same interface (all RMI objects must be defined as Java interfaces). This local implementation is called a stub and is essentially a proxy to the real implementation. Whenever a method is called on the stub, it performs the operations necessary to send the method call to a remote implementation of the same interface on another server. The stub marshals the parameters and sends them over the network using a common RMI protocol. In turn, a server-side counterpart implementing the same interface (labeled ServerSkeleton in Figure 11-8) unmarshals the parameters and passes them on to the actual remote object in a normal method call. This process is reversed for the return value: the server side marshals and sends it, and the stub on the client unmarshals it and returns it to the original caller. Figure 11-8 displays this entire process graphically.
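The stub idea can be illustrated without any network at all. The following sketch uses java.lang.reflect.Proxy to build a client-side stub that marshals each call into a byte array and hands it to a server-side dispatcher. Everything here (Greeter, GreeterImpl, Skeleton, StubFactory, and the byte layout) is hypothetical and chosen for illustration; it is not the format or mechanism that RMI itself uses, and a byte array stands in for the network.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

/** Hypothetical shared interface, known to both client and server. */
interface Greeter {
    String greet(String name);
}

/** The actual implementation, which lives on the "server" side. */
class GreeterImpl implements Greeter {
    public String greet(String name) {
        return "Hello, " + name;
    }
}

/** Server side: unmarshals a call, invokes the real object, marshals the result. */
class Skeleton {
    private final Class<?> iface;
    private final Object target;

    Skeleton(Class<?> iface, Object target) {
        this.iface = iface;
        this.target = target;
    }

    byte[] dispatch(byte[] request) throws Exception {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(request))) {
            String methodName = in.readUTF();
            Class<?>[] paramTypes = (Class<?>[]) in.readObject();
            Object[] args = (Object[]) in.readObject();
            Method method = iface.getMethod(methodName, paramTypes);
            Object result = method.invoke(target, args);
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(result);
            }
            return bytes.toByteArray();
        }
    }
}

/** Client side: creates a proxy that marshals every interface call. */
class StubFactory {
    static <T> T create(Class<T> iface, Skeleton skeleton) {
        InvocationHandler handler = (proxy, method, args) -> {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeUTF(method.getName());
                out.writeObject(method.getParameterTypes());
                out.writeObject(args);
            }
            // In a real system these bytes would travel over a socket.
            byte[] reply = skeleton.dispatch(bytes.toByteArray());
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(reply))) {
                return in.readObject();
            }
        };
        return iface.cast(Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler));
    }
}
```

A caller would obtain the stub with StubFactory.create(Greeter.class, new Skeleton(Greeter.class, new GreeterImpl())) and then call greet on it exactly as it would on a local object, which is the point of the stub.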
Marshalling and Unmarshalling
The parameters and method call must be flattened into a byte stream before they can be sent over the network. This process is called marshalling. The reverse, decoding the byte stream into the original parameters and method call information, is called unmarshalling. After unmarshalling the parameters and method call, the server dispatches the call to the object that actually implements the remote method and then marshals the return value back to the client. By serializing the parameters and method into a byte stream, RMI protocols can work on top of any network protocol that provides a reliable byte stream, such as TCP/IP.
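As a rough illustration of flattening a call into bytes, the sketch below marshals a method name and two int parameters with DataOutputStream and reads them back with DataInputStream. The wire layout, the class name CallMarshaller, and the supported method names are assumptions made up for this example; real RMI protocols define their own formats.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Hypothetical hand-rolled marshalling of a call with two int parameters. */
class CallMarshaller {

    /** Flattens the method name and both parameters into a byte array. */
    static byte[] marshal(String method, int a, int b) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeUTF(method); // 2-byte length prefix followed by the encoded name
        out.writeInt(a);      // 4 bytes, big-endian
        out.writeInt(b);      // 4 bytes, big-endian
        out.flush();
        return bytes.toByteArray();
    }

    /** Decodes the call in the same order and dispatches it to the right operation. */
    static int unmarshalAndDispatch(byte[] call) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(call));
        String method = in.readUTF();
        int a = in.readInt();
        int b = in.readInt();
        if ("add".equals(method)) {
            return a + b;
        }
        if ("multiply".equals(method)) {
            return a * b;
        }
        throw new IOException("Unknown method: " + method);
    }
}
```

Both sides must agree on the order and encoding of the fields, which is exactly what a common protocol provides.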
Figure 11-8 (diagram labels: Client Application, ClientStub, Remote Object Interface, ServerSkeleton, Actual Implementation, Server Application, Network)
In RMI, two types of objects besides primitives can be passed as parameters: objects that implement the java.rmi.Remote interface and objects that implement the java.io.Serializable interface. Neither interface contains any methods; instead, each marks objects as having a particular property. Java’s RMI mechanism knows that Remote objects could be on another virtual machine and will have stubs. Objects that implement Serializable, on the other hand, can be transformed into a byte stream (to save to disk or, in RMI’s case, to send over a network). In RMI, objects that implement Remote are passed by reference, while objects that implement Serializable (and not Remote) are passed by value. Any object that is not Remote must therefore be Serializable to be passed through an RMI call, because it has to be marshalled into the byte stream. This means that in RMI, objects can effectively be passed by value: the receiver gets its own copy.
Passing by value helps reduce the number of network calls that must occur. If an object passed by reference had a large number of properties accessed through getXXX methods, each access would be a network call. By serializing the object, all these calls become local calls on the remote server and use far less network bandwidth. Method calls on Remote objects passed in, on the other hand, do go over the network, and that cost must be taken into consideration.
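The pass-by-value behavior of Serializable objects can be observed locally by serializing an object and reading it back, which is roughly what happens to such a parameter during a remote call. The names Address and ValueCopy are hypothetical and exist only for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical value object that would be passed by value in a remote call. */
class Address implements Serializable {
    private static final long serialVersionUID = 1L;
    String city;

    Address(String city) {
        this.city = city;
    }
}

class ValueCopy {

    /** Serializes the value to bytes and deserializes a fresh copy from them. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }
}
```

Changing the copy returned by roundTrip leaves the original untouched, just as changes made by a server to a Serializable parameter are not seen by the client.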
Suppose this is an implementation of a method on a server that is being invoked remotely by a client:
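public void myTestMethod(A a, B b) {
    a.remoteMethod();      // A is Remote: this call travels back over the network
    Data d = b.getData();  // B is Serializable: this call runs on the server's local copy
    ...
}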
In this example, A implements java.rmi.Remote, and thus a call to remoteMethod() is a remote callback to your client. B implements Serializable, and hence getData() is a local call on b, which was unmarshalled from its serialized state back into an object now running on the server.
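The source does not show how A, B, and Data are declared, so the following is only one way the types in the example could look. The field names, the TestService interface, and the TestServiceImpl class are assumptions added for illustration, and nothing here is exported or registered with RMI.

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;

/** Passed by reference: the server would receive a stub and call back to the client. */
interface A extends Remote {
    void remoteMethod() throws RemoteException;
}

/** Hypothetical payload type returned by B.getData(). */
class Data implements Serializable {
    private static final long serialVersionUID = 1L;
    final String text;

    Data(String text) {
        this.text = text;
    }
}

/** Passed by value: the server would receive its own deserialized copy. */
class B implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Data data;

    B(Data data) {
        this.data = data;
    }

    Data getData() {
        return data;
    }
}

/** Hypothetical remote interface that declares the example method. */
interface TestService extends Remote {
    void myTestMethod(A a, B b) throws RemoteException;
}

/** Hypothetical server-side implementation of the example method. */
class TestServiceImpl implements TestService {
    String lastSeen; // records what the method read, for demonstration only

    public void myTestMethod(A a, B b) throws RemoteException {
        a.remoteMethod();     // would be a network callback if a were a stub
        Data d = b.getData(); // always a local call on the server's copy
        lastSeen = d.text;
    }
}
```

Every method of a Remote interface declares RemoteException, which is how the possibility of network failure surfaces in otherwise ordinary-looking method calls.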
Note: The classes of any objects passed by value in RMI must be in the classpath of the JVM running on the remote server.
See Chapter 5, “Persisting Your Application Using Files,” for more information on java.io.Serializable and serializing objects to disk.
Protocols
In RPC, all method calls must be transformed into a standard format that can be sent over a network. In other words, two programs running in two separate processes must be able to read and write this same format. RPC mechanisms have their own protocols. Sometimes these protocols are built on top of TCP/IP; at other times they define their own transport protocol in addition to the RPC protocol, combining the transport layer and the application layer protocols for optimal performance. Operating systems sometimes provide system-level services in this manner.
RMI is implemented such that it can support more than one underlying transport protocol (though obviously only one protocol can be used between any two objects). There are two main choices as the transport protocol for RMI:
Java Remote Method Protocol (JRMP)
Internet Inter-ORB Protocol (IIOP)
Either protocol could be used in a given system, and each has its trade-offs. IIOP offers compatibility with CORBA, which is discussed later in this chapter. Because IIOP was not designed specifically for Java remote procedure calls, it does not support some of the features JRMP supports, such as security and distributed garbage collection. Using IIOP as the underlying protocol for RMI does, however, make it easy to integrate legacy objects written in other languages (discussed further in the “Common Object Request Broker Architecture” section of this chapter). JRMP is the default protocol for RMI. IIOP stubs differ from JRMP stubs and must be generated separately; see the rmic tool documentation for more details.
RMI Registry
Object instances must be made available in a registry on the server before they can be used by remote clients. Clients obtain an instance by looking up a particular name. For example, the string EmployeeData might refer to an object containing the data for the employees of a particular company.
When a server is starting up, it creates instances of the objects it wishes to be available, and registers them in a registry. Since these objects are globally available, they must be thread safe (since their methods can be called at the same time by different threads). The code to look up a particular instance of a class is not very difficult, and uses the Java Naming and Directory Interface (JNDI) API (found in javax.naming). A small snippet of code to look up an object on a remote server follows:
import javax.naming.InitialContext;
...
InitialContext ctx = new InitialContext();
// lookup() returns Object, so the result is cast to the expected type
EmployeeData data = (EmployeeData) ctx.lookup("CompanyX\\MyEmployeeDataInstance");
...
JNDI is configured by setting certain Java system properties to tell it the location and protocol of the registry. This is how objects can be transparently remote or local. If the registry is configured locally, in the same JVM, then all calls to data will be local. If data is an instance on a remote server, all calls will go through RMI, using whatever protocol was specified.
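As a sketch of the configuration described above, the two standard JNDI settings are the initial context factory and the provider URL. The example below assumes the registry is a plain RMI registry and uses the JDK's RMI registry service provider; the class name RegistryLookup, its method names, and the host and port values passed in are hypothetical.

```java
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

/** Hypothetical helper that points JNDI at an RMI registry. */
class RegistryLookup {

    /** Builds the JNDI environment for an RMI registry at host:port. */
    static Hashtable<String, Object> rmiRegistryEnvironment(String host, int port) {
        Hashtable<String, Object> env = new Hashtable<>();
        // Which JNDI provider to use (assumed here: the JDK's RMI registry provider).
        env.put(Context.INITIAL_CONTEXT_FACTORY,
                "com.sun.jndi.rmi.registry.RegistryContextFactory");
        // Where the registry lives and which protocol scheme reaches it.
        env.put(Context.PROVIDER_URL, "rmi://" + host + ":" + port);
        return env;
    }

    /** Looks up a name using an explicitly configured context. */
    static Object lookup(String host, int port, String name) throws NamingException {
        Context ctx = new InitialContext(rmiRegistryEnvironment(host, port));
        try {
            return ctx.lookup(name);
        } finally {
            ctx.close();
        }
    }
}
```

The same two values can instead be supplied as the system properties java.naming.factory.initial and java.naming.provider.url, in which case a no-argument InitialContext picks them up.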
See Chapter 10, “Communicating between Java Components with RMI and EJB,” for more detailed information on the mechanics and details of RMI.
Distributed Objects
RMI allows a developer to abstract away from the application where objects physically reside. Object-oriented applications can be transparently spread across multiple machines. Objects that do heavy processing or provide server-side functionality, such as mail services, transactional database services, or file serving services, can be located on server-class machines. Typical desktop client applications can then access these objects as if they were local and part of the same object-oriented application. Location-independent objects are powerful because they can be moved dynamically from machine to machine. If the mail service objects on a server become too bogged down, they can be spread across multiple machines, all transparently to the client applications using them. Java’s platform independence adds even more value to its location-independent objects. Server objects could reside on a Unix-based operating system, for example, and client objects on a Microsoft Windows platform. Figure 11-9 shows many objects communicating from different JVMs on different machines.
Middleware and J2EE
Most of the time, the main reason for distributing objects onto various machines is to give access to the services those machines provide. Mail services, transactional database services, and file server services can all be encapsulated by software components or, in this case, Java objects. When all these objects communicate in a standard, distributed way, server-side applications become much easier to develop. Location-independent objects also allow server applications to scale: when one server no longer provides enough horsepower for an application, you add a couple more machines and spread the objects around.
Middleware is a software layer between various data sources and their client applications. Distributed objects built on RMI are one way to implement middleware for different applications. Middleware abstracts away the details of the one or many data sources behind it. RMI is a natural building block for middleware because of its location and platform independence. Java is most prevalent in server-side applications and middleware because of the foundation it provides for building stable and reliable software systems.
Figure 11-9 (diagram: several JVMs, each containing remote objects, connected through a network)
The Java 2 Enterprise Edition (J2EE) platform uses RMI as one of its core technologies. J2EE provides reliable messaging, rock-solid transactional storage capabilities, remote management and deployment, and frameworks for producing Web-enabled server-side applications. J2EE is a standard platform for developing middleware and other server-side services. RMI enables J2EE to be location-independent and distributed. Rather than developing one’s own middleware solely with RMI, it is far better to build on the J2EE standard for writing server-side applications.
Common Object Request Broker Architecture
The Common Object Request Broker Architecture, or CORBA for short, is a set of specifications by the Object Management Group (OMG) for language-independent distributed objects. It allows objects written in a number of different programming languages to interoperate and communicate with one another. C++ classes can talk to Java classes. C# can talk to C++ or Java. Some CORBA implementations also support programs written in C, and even scripting languages such as Python. CORBA is conceptually similar to RMI but supports more languages than just Java. CORBA itself is a set of specifications, not an actual implementation. For a language to support CORBA and work with other CORBA objects, it must have an implementation in its native language (or somehow be bound to an implementation). For instance, the Java Development Kit (JDK) includes an implementation of the CORBA 2.3.1 specification. That means that, out of the box, Java supports CORBA implementations up to and including the 2.3.1 specification (the latest CORBA specification at the time of this writing is 3.0.2).