
Mastering Enterprise JavaBeans™ and the Java 2 Platform, Enterprise Edition - Roman E


624 M A S T E R I N G E N T E R P R I S E J A V A B E A N S

as properly nesting tags. An XML parser must check whether your document is well formed so that it can parse your document. For example, the following XML document is well formed:

<?xml version="1.0" standalone="yes"?>

<food>Banana</food>

This document is not well-formed:

<?xml version="1.0" standalone="yes"?>

<food>Banana

The above example is not well-formed because the <food> tag doesn’t have a matching </food> end tag.

Similarly, this document is well formed:

<?xml version="1.0" standalone="yes"?>

<food>

<name>Banana</name><color>yellow</color>

</food>

This document is not:

<?xml version="1.0" standalone="yes"?>

<food>

<name>Banana<color></name>yellow</color>

</food>

This example is not well-formed because two tags overlap and are not properly nested.

XML Parsers

An XML parser is a program that reads in an XML document and verifies whether it is well formed. Several companies make XML parsers, such as IBM, DataChannel, Oracle Corporation, and Sun Microsystems. Most XML parsers are libraries, rather than stand-alone applications, so that you can call an XML parser programmatically from another application.
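As a minimal sketch of calling a parser programmatically, the following Java class checks the banana documents shown earlier for well-formedness. It assumes the JAXP parser (javax.xml.parsers) bundled with current JDKs rather than any of the vendor parsers named above, and the class and method names are invented for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; assumes the JAXP parser that ships with the JDK.
class WellFormedCheck {

    /** Returns true if the parser accepts the text as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // missing end tags and overlapping tags end up here
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String header = "<?xml version=\"1.0\" standalone=\"yes\"?>";
        System.out.println(isWellFormed(header + "<food>Banana</food>"));
        System.out.println(isWellFormed(header + "<food>Banana"));
        System.out.println(isWellFormed(
            header + "<food><name>Banana<color></name>yellow</color></food>"));
    }
}
```

Only the first of the three documents in main is accepted; the other two are rejected for the missing end tag and the overlapping tags, respectively.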

XML DTDs

Earlier in this chapter, we mentioned that XML is a meta-markup language because it allows you to define your own markup language with your own tags. We’ve done this already in the previous section when we constructed a document with a root element of <library> that contained many <book> elements. But we could have used any tag names at all, such as <booklist> instead of <library> or <novel> instead of <book>. This means the next guy who writes an XML document might use completely different tags as well. How do we understand his XML then?

Go back to the first page for a quick link to buy this book online!

Understanding the Extensible Markup Language (XML) 625

It’s very useful to have an XML parser as a stand-alone program so that you can verify whether your documents are well formed. IBM’s XML parser for Java (called XML4J) ships with a sample application that you can use from the command line to verify that your XML files are well formed. Assuming your CLASSPATH references IBM’s XML parser and XML samples .jar files, type:

java sax.SAXWriter <filenames...>

to verify that your XML documents are well formed.

The answer to understanding someone else’s tags is to use a document type definition (DTD). A DTD specifies rules about constructing XML documents. DTDs limit the tags you can use in your XML documents, and they also impose rules about how and when you use tags. For example, we might write a DTD that specifies all books must have <author> and <title> tags, no more and no less.

DTDs limit how flexible you can make your XML documents. This is a good feature because it restricts people to a very well defined set of common structures. Common structures are necessary for two programs to be able to parse and understand exchanged information written in XML. Agreed structures are absolutely necessary when heterogeneous organizations exchange business data because each company needs to understand the other company’s information.

DTDs can be embedded within XML files, or they can ship separately. Because they are embeddable, XML is a self-describing language.

Valid Documents

An XML document is valid if it satisfies the structural rules laid out in its corresponding DTD. Do not confuse validity with the well-formedness concept we learned in the previous section. A well-formed document is syntactically correct according to the XML specification (for example, all tags are nested, not overlapped). A document is valid if it is well-formed and it satisfies the constraints imposed by the DTD, such as every book element contains title and author elements.

XML Validating Parsers

An XML validating parser is a program that checks that an XML document is valid according to its DTD. XML validating parsers are more powerful than plain vanilla XML parsers because they can check whether a document is valid. An XML parser that does not check validity is called a nonvalidating XML parser.
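To make the difference between the two kinds of parser concrete, here is a hedged Java sketch that turns on DTD validation in the JDK’s JAXP parser and collects validity errors. The small food DTD, the class name, and the method name are assumptions made up for this example; they do not come from the book’s sources.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; assumes the JAXP parser that ships with the JDK.
class ValidityCheck {

    // Illustrative embedded DTD: a food must contain a name followed by a color.
    static final String DTD =
        "<!DOCTYPE food ["
      + "<!ELEMENT food (name, color)>"
      + "<!ELEMENT name (#PCDATA)>"
      + "<!ELEMENT color (#PCDATA)>"
      + "]>";

    /** Returns the problems found; an empty list means the document is valid. */
    static List<String> validityErrors(String xml) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // ask for a validating parser
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage()); // validity violations are reported here
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            errors.add("not well formed: " + e.getMessage());
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validityErrors(
            DTD + "<food><name>Banana</name><color>yellow</color></food>"));
        System.out.println(validityErrors(DTD + "<food><name>Banana</name></food>"));
    }
}
```

The second document in main is well formed but omits the color element, so a nonvalidating parser would accept it while the validating parser reports an error.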

It’s also very useful to have an XML validating parser as a stand-alone program so that you can verify whether your documents are well formed as well as valid. IBM’s XML parser for Java (called XML4J) ships with a sample application that you can use from the command line to verify that your XML files are well formed and valid. Assuming your CLASSPATH references IBM’s XML parser and XML samples .jar files, type:

java sax.SAXWriter -p com.ibm.xml.parsers.ValidatingSAXParser <filenames...>

to verify that your XML documents are well formed and valid. Note that you must include your DTD embedded within your XML file when using this program.

Understanding DTDs

Let’s make our DTD knowledge a bit more concrete with an example, shown in Source C.2.

This simple DTD illustrates many of the core DTD concepts. The XML library file we introduced earlier is well formed and valid with respect to this DTD, so feel free to refer back to Source C.1 as we break apart this DTD and explain it.

The Document Type Declaration

Our DTD begins with some header information indicating the XML version number:

<?xml version="1.0" standalone="yes"?>

Our DTD then introduces a document type declaration as follows:

<!DOCTYPE library [

...

]>

The document type declaration is very, very different from the document type definition (DTD). The declaration does not introduce any semantic rules; rather, it declares the type of the document and points to the DTD in which its rules are declared. We use the word library in our document type declaration to identify the root element for our document.


<?xml version="1.0" standalone="yes"?>

<!DOCTYPE library [

<!ELEMENT library (book*)>

<!ELEMENT book (title, author+, pages, description?, (hardcover | softcover))>

<!ELEMENT author (#PCDATA)>

<!ELEMENT title (#PCDATA)>

<!ELEMENT pages (#PCDATA)>

<!ELEMENT description (#PCDATA)>

<!ELEMENT hardcover EMPTY>

<!ELEMENT softcover EMPTY>

<!ATTLIST book isbn CDATA #REQUIRED>

]>

Source C.2 A Sample XML DTD.

Element Type Declarations

The first line of grammar in our DTD is:

<!ELEMENT library (book*)>

This is an element type declaration because it declares rules about an element: the <library> element. This rule says that all libraries contain zero or more book elements (the * stands for zero or more). Indeed, if you look at Source C.1, you’ll see that our XML file satisfies this rule (it contains three books).

The next line of our DTD is:

<!ELEMENT book (title, author+, pages, description?, (hardcover | softcover))>

This line defines rules for the <book> element. The rules are shown in Table C.1.

As you can see from Source C.1, the rules in Table C.1 were adhered to in our XML document. We have no books with more than one title, and every book is either a hardcover or a softcover.

Table C.1 Rules for the <book> Element

RULE                       MEANING
title                      All books must have exactly one title.
author+                    All books must have at least one author.
pages                      All books must have exactly one number of pages.
description?               All books may have zero or one descriptions.
(hardcover | softcover)    All books must be either a softcover or hardcover, but not both.
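One way to get a feel for these rules is to notice that a DTD content model behaves much like a regular expression over the names of an element’s children. The Java sketch below is only an analogy under that assumption, not how a validating parser is implemented; the class name and the trailing-space encoding of the child names are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration: the <book> content model rewritten as a regular expression.
class BookContentModel {

    // (title, author+, pages, description?, (hardcover | softcover))
    // Each child name is followed by one space so that names cannot run together.
    static final Pattern BOOK = Pattern.compile(
        "title (author )+pages (description )?(hardcover|softcover) ");

    /** Returns true if the sequence of child element names satisfies the model. */
    static boolean matches(List<String> childNames) {
        StringBuilder sequence = new StringBuilder();
        for (String name : childNames) {
            sequence.append(name).append(' ');
        }
        return BOOK.matcher(sequence).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(Arrays.asList("title", "author", "pages", "softcover")));
        System.out.println(matches(Arrays.asList("title", "pages", "softcover")));
    }
}
```

The first call in main corresponds to a book like the ones in Source C.1 and satisfies the model; the second has no author and does not.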

 

 

 

 

#PCDATA

The next few lines in our DTD are:

<!ELEMENT author (#PCDATA)>

<!ELEMENT title (#PCDATA)>

<!ELEMENT pages (#PCDATA)>

<!ELEMENT description (#PCDATA)>

These rules specify that the <author>, <title>, <pages>, and <description> elements contain normal character data, such as words, sentences, or paragraphs of text. In a DTD, parsable character data is identified with the word #PCDATA. For example, one of our books contains the following text:

<author>J. D. Salinger</author>

The text J. D. Salinger is normal character text, and it is valid #PCDATA for this <author> element.

Empty Elements

The next few lines of our DTD are:

<!ELEMENT hardcover EMPTY>

<!ELEMENT softcover EMPTY>

These lines specify empty elements, such as <hardcover/> and <softcover/>. If you’ll recall, an empty element is an element with a single tag that has no text or subelements. From Source C.1, you can see that each of our books contains either a <hardcover/> or <softcover/> element.


Attributes

The final rule in our DTD is:

<!ATTLIST book isbn CDATA #REQUIRED>

This is an attribute list declaration. It declares rules for the isbn attribute within the <book> element. For example, take the following snippet from Source C.1:

<book isbn="0451524934">

<title>1984</title>

<author>George Orwell</author>

<pages>268</pages>

<softcover/>

</book>

The isbn attribute above has the value 0451524934. The attribute list declaration says two things about it. The word CDATA means that the isbn attribute’s value must be plain character data. The word #REQUIRED means that all books must have an isbn attribute. A book without an isbn attribute makes the XML document invalid (but still well formed).

There are other settings we could have used as well. The #IMPLIED keyword would make the isbn attribute optional rather than mandatory. We also could have supplied a default isbn value that all books get if they don’t have an isbn attribute.
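The effect of these settings can be observed from a program. The following Java sketch, which assumes the JDK’s JAXP DOM parser and uses an invented one-element DTD and invented class and method names, reads the isbn attribute of a <book> element under an #IMPLIED declaration and under a declaration with a default value.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class; assumes the JAXP parser that ships with the JDK.
class AttributeDefaults {

    /** Builds an embedded DTD whose isbn declaration ends with the given setting. */
    static String dtd(String isbnSetting) {
        return "<!DOCTYPE book ["
             + "<!ELEMENT book EMPTY>"
             + "<!ATTLIST book isbn CDATA " + isbnSetting + ">"
             + "]>";
    }

    /** Returns the isbn value the parser reports for the root element. */
    static String isbnOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // getAttribute returns an empty string when no value is present.
            return doc.getDocumentElement().getAttribute("isbn");
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("[" + isbnOf(dtd("#IMPLIED") + "<book/>") + "]");
        System.out.println("[" + isbnOf(dtd("\"unknown\"") + "<book/>") + "]");
        System.out.println("[" + isbnOf(dtd("\"unknown\"") + "<book isbn=\"0451524934\"/>") + "]");
    }
}
```

With #IMPLIED the attribute is simply absent, with a declared default the parser supplies the default value, and an explicit value in the document always wins over the default.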

XML Summary

This completes our quick tour of XML. We hope you’ve found XML to be a lot simpler than you may have imagined. Note that we’ve only scratched the surface of XML; if you want more information on advanced XML topics, see the book’s accompanying Web site for links to external resources.

XML and EJB

Now that you’ve seen the basic XML concepts, let’s apply our knowledge to the EJB space. We’ll see how XML is used in the EJB deployment descriptor, and we’ll also examine XML as an on-the-wire data format for transferring enterprise information.

XML Deployment Descriptors

As we’ve seen throughout the coding examples in this book, an EJB deployment descriptor provides declarative information about your enterprise beans, such as your transaction requirements, your security requirements, your persistent fields, and so on. Deployment descriptors are crucial parts of the EJB specification because they allow you to gain middleware services from the EJB container without directly programming to middleware APIs. Rather, the container inspects your deployment descriptor and applies middleware services to your beans as you’ve declared.

EJB 1.0 Deployment Descriptors

In EJB 1.0, deployment descriptors are serializable Java objects that have been saved to disk. To create a deployment descriptor in EJB 1.0, you need a program that is capable of creating a serializable Java object. For example, BEA’s WebLogic server currently ships with a DDCreator tool that can manufacture a deployment descriptor for you. To use this tool, you first must create a text file that lists your deployment descriptor settings. You then pass this text file to the DDCreator tool, and it manufactures a serializable deployment descriptor for you. Note that this is not the only current way to create a deployment descriptor; several vendors (including BEA) provide graphical tools that let you visually create deployment descriptors as well. Once you’ve created the deployment descriptor, you bundle it with your bean class in an ejb-jar file, which is a deployable component you can import into an EJB container/server.

The disadvantage of serializable deployment descriptors is that they are tough to maintain. You need a graphical front end to create your deployment descriptor because you cannot work with the serializable bit-blob directly in a text editor.

EJB 1.1 Deployment Descriptors

We’ve seen that XML is an elegant, simple language you can use to add structure to a document. XML can be used to describe your data, so that another party can query your documents to ascertain information. EJB leverages XML for exactly this purpose: to describe your enterprise beans.

EJB 1.1 completely departs from serializable deployment descriptors. As a bean developer writing to the J2EE platform using EJB 1.1, you must create your declarative bean settings in an XML document. Serialized deployment descriptors are deprecated and can no longer be used; instead, you include an XML document with your enterprise bean classes. All EJB 1.1-compliant containers must accept a deployment descriptor written in XML, and they cannot accept a serialized Java object. We cover the EJB 1.1 XML DTD in Appendix D.
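To give a rough idea of what a tool or container sees when it inspects an XML deployment descriptor, the Java sketch below parses a small descriptor held in a string and reads two declarative settings. The element names follow the EJB 1.1 descriptor layout, but the bean and its com.example class names are hypothetical, the DOCTYPE line is left out so that the parser does not try to fetch an external DTD, and real containers will read descriptors in their own way.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical reader; assumes the JAXP parser that ships with the JDK.
class DescriptorReader {

    // Illustrative descriptor for an invented stateless session bean.
    static final String DESCRIPTOR =
        "<ejb-jar><enterprise-beans><session>"
      + "<ejb-name>Pricer</ejb-name>"
      + "<home>com.example.PricerHome</home>"
      + "<remote>com.example.Pricer</remote>"
      + "<ejb-class>com.example.PricerBean</ejb-class>"
      + "<session-type>Stateless</session-type>"
      + "<transaction-type>Container</transaction-type>"
      + "</session></enterprise-beans></ejb-jar>";

    /** Returns the text of the first element with the given name (assumed present). */
    static String setting(String descriptorXml, String elementName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(descriptorXml)));
            return doc.getElementsByTagName(elementName).item(0).getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(setting(DESCRIPTOR, "ejb-name"));
        System.out.println(setting(DESCRIPTOR, "transaction-type"));
    }
}
```

Because the descriptor is plain text, the same settings can also be edited by hand in any text editor, which is the maintenance advantage over a serialized object.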

XML as an On-The-Wire Data Format

XML is also applicable to EJB as an on-the-wire data format for sending enterprise information between heterogeneous applications. As we’ve seen throughout this book, EJB defines a component model for developing robust server-side components: modules that perform tasks such as billing a customer, paying a salary, or fulfilling an order. The EJB paradigm enables corporations to assemble applications from existing prewritten components that solve most of the business problem already.

As good as this sounds, assembling applications from disparate components is not all roses. The problem with assembling heterogeneous components is getting them all to work together. For example, let’s say you purchase a bean that computes prices (as we wrote in Part IV), and you combine it with some homegrown entity beans, such as an Order bean and a Product bean. Let’s assume we also use a Billing component from a different vendor. How do you get these components to work together? None were created with the knowledge of the others.

There is no easy answer to this problem. EJB defines standard interfaces for components to be deployable in any container, but EJB cannot specify how domain-specific components interact. For example, EJB cannot specify the de facto bean to represent a Product or an Order because each corporation models these differently in its existing information systems.

Unfortunately, you’re always going to need to write some workflow component that maps to each vendor’s proprietary API and object model. The only way you can get around mapping to APIs is if a standards committee decides on an official object model for a problem domain, such as standardizing on what a purchase order looks like. Problem domains such as pricing are very open and customizable, which makes this a very large challenge to overcome.

There’s a second problem with having these components work together: data mapping. How does the billing component understand the data computed by the pricing component? Sure, you might be able to call the billing component’s API, but it won’t magically know how to deal with the data passed to it. The data was formatted by another vendor’s component. You’re going to need to write an adapter object that bridges the gap between the two formats. If you purchase components from n vendors, you’re going to be spending all your time writing adapter code. This is quite mindless and boring.

XML has the potential to help with data mapping. Rather than application components sending proprietary data, components could interoperate by passing XML documents as parameters. Because the data is formatted in XML, each component could inspect the XML document to determine what data it received.
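A very small Java sketch of this idea follows: one component formats its result as an XML document, and another component inspects that document by element name instead of depending on the first component’s classes. The <quote> format, the class name, and both methods are invented for illustration, and the string concatenation assumes the values contain no markup characters.

```java
import java.io.IOException;
import java.io.StringReader;
import java.math.BigDecimal;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical exchange between a pricing component and a billing component.
class QuoteExchange {

    /** Pricing side: formats a result as XML (values must not contain markup). */
    static String toXml(String product, BigDecimal total) {
        return "<quote><product>" + product + "</product>"
             + "<total>" + total.toPlainString() + "</total></quote>";
    }

    /** Billing side: pulls the total out of whatever quote document it receives. */
    static BigDecimal readTotal(String quoteXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(quoteXml)));
            String text = doc.getElementsByTagName("total").item(0).getTextContent();
            return new BigDecimal(text.trim());
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String wire = toXml("Banana", new BigDecimal("19.99"));
        System.out.println(wire);
        System.out.println(readTotal(wire));
    }
}
```

The billing side still has to know that the document contains a <total> element, which is exactly the agreement on structure that a shared DTD is meant to provide.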

XML must overcome several challenges before it reaches this level. One hurdle is performance. Parsing XML documents takes time, and sending XML documents over the wire takes even longer. For high-performance enterprise applications, using XML at runtime for routine operations is very costly. The performance barrier is slowly becoming a more trivial concern, however, as XML parsers become higher performing and as people begin to use text compression to send XML documents over the wire.
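As an illustration of the text compression mentioned above, this Java sketch compresses an XML document with GZIP from the standard java.util.zip package before it would be sent, and restores it on the receiving side. GZIP is only one possible choice, the class and method names are invented, and the actual savings depend on the document.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class showing GZIP applied to an XML document.
class XmlCompression {

    /** Sender side: compresses the XML text into a byte array. */
    static byte[] compress(String xml) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(xml.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray(); // complete once the GZIP stream has been closed
    }

    /** Receiver side: restores the original XML text. */
    static String decompress(byte[] data) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringBuilder library = new StringBuilder("<library>");
        for (int i = 0; i < 200; i++) {
            library.append("<book><title>1984</title><author>George Orwell</author></book>");
        }
        library.append("</library>");
        byte[] packed = compress(library.toString());
        System.out.println(library.length() + " characters, " + packed.length + " bytes compressed");
        System.out.println(decompress(packed).equals(library.toString()));
    }
}
```

XML tends to compress well because tag names repeat throughout a document, although compressing and decompressing adds processing time of its own.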

The larger issue that must be overcome before XML is extensively used as an on-the-wire document format is that every participant component must agree on a standard representation, or DTD, for exchanged data. This is a trivial problem when the components are written by a single vendor because that vendor can simply invent a DTD and include it with its components. This becomes a monstrous problem, though, when integrating heterogeneous vendors’ components. For a large number of corporations to agree on document structure for data, a standards body would need to specify a suite of standard DTDs for all enterprise applications within their industries to use (two organizations attempting to do this on a widespread basis currently are XML.org and Microsoft’s BizTalk.org). Once industry-standard DTDs are developed, everyone needs to agree to use these DTDs in enterprise applications. Unfortunately, competition works against widespread adoption, as companies attempt to bend XML for their own business needs, resulting in vendor-specific standards. Still, the need for e-business is there, and so hope remains.

Summary

In this appendix, you’ve been introduced to the Extensible Markup Language (XML). We began with a look at the business’s need for XML to conduct business-to-business e-commerce, and we saw why existing technologies (such as EDI, HTML, and SGML) are insufficient for this purpose. Next, we dove into XML programming, and we quickly got up to speed with writing XML documents, including understanding DTDs. Finally, we applied our XML knowledge to EJB, examining XML both for specifying deployment descriptor structure and for on-the-wire data format.

This appendix has only scratched the surface of XML. There is a wealth of further material to learn, such as the following:

The Extensible Linking Language (XLL). XLL allows you to link and address parts of XML documents together, similar to how HTML pages have hyperlinks to one another. XLL is divided into two parts: XLinks for creating a link from one XML document to another and XPointers for one XML document to address parts within another document.

Extensible Stylesheet Language (XSL). XSL allows you to add a GUI presentation to your XML documents. You can apply the GUI rules in an XSL document to an XML document to result in a document that has GUI formatting tags in it, such as the HTML <B> and <I>.


The Document Object Model (DOM). The DOM is a platform-neutral, language-neutral interface for programs to access, manipulate, and update content, structure, and styles of documents. The DOM defines a tree-like structure for representing an HTML or XML document in memory using objects. You can use the DOM as an API for manipulating an XML document programmatically.

The Simple API for XML (SAX). SAX is similar to the DOM, in that it allows you to work with XML documents programmatically. The big win SAX has over DOM is that SAX is an event-based interface: the parser calls your code as it encounters each element. SAX allows you to query XML documents from a program without loading that entire XML document into memory. This matters for performance, especially if the XML document is particularly huge.
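The contrast between the last two items can be sketched in Java by counting the <book> elements of a library document both ways. The example assumes the SAX and DOM implementations bundled with the JDK, and the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical comparison of the two APIs; assumes the parsers that ship with the JDK.
class BookCounter {

    /** SAX: the parser calls startElement for each tag; no tree is built. */
    static int countWithSax(String xml) {
        final int[] count = {0};
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes attributes) {
                        if ("book".equals(qName)) {
                            count[0]++;
                        }
                    }
                });
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return count[0];
    }

    /** DOM: the whole document is loaded into an in-memory tree, then queried. */
    static int countWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("book").getLength();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String library = "<library><book isbn=\"0451524934\"/><book isbn=\"1\"/></library>";
        System.out.println(countWithSax(library));
        System.out.println(countWithDom(library));
    }
}
```

Both methods return the same count; the difference is that the SAX version only ever holds a counter, while the DOM version keeps every element of the document in memory.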

I encourage you to learn as much as you can about this emerging standard, as it will play a key role in Internet applications. For links to this information and much more, please visit the book’s accompanying Web site at www.wiley.com/compbooks/roman.
