
Pro ASP.NET 2.0 In CSharp 2005 (2005) [eng]


438 C H A P T E R 1 2 X M L

                str.Append(node.Name);
                str.Append(" ");
                str.Append(node.Value);
                str.Append("</b><br />");
                break;
            case XmlNodeType.Element:
                str.Append(indent);
                str.Append("Element: <b>");
                str.Append(node.Name);
                str.Append("</b><br />");
                break;
            case XmlNodeType.Text:
                str.Append(indent);
                str.Append(" - Value: <b>");
                str.Append(node.Value);
                str.Append("</b><br />");
                break;
            case XmlNodeType.Comment:
                str.Append(indent);
                str.Append("Comment: <b>");
                str.Append(node.Value);
                str.Append("</b><br />");
                break;
        }

...

Note that not all types of nodes have a name or a value. For example, for an element such as Title, the name is Title, but the value is empty, because it’s stored in the following Text node.

Next, the code checks whether the current node has any attributes (by testing if its Attributes collection is null). If it does, the attributes are processed with a nested foreach loop:

...

        if (node.Attributes != null)
        {
            foreach (XmlAttribute attrib in node.Attributes)
            {
                str.Append(indent);
                str.Append(" - Attribute: <b>");
                str.Append(attrib.Name);
                str.Append("</b> Value: <b>");
                str.Append(attrib.Value);
                str.Append("</b><br />");
            }
        }

...

Lastly, if the node has child nodes (according to its HasChildNodes property), the code recursively calls the GetChildNodesDescr function, passing to it the current node’s ChildNodes collection and the current indent level plus 1, as shown here:

...

        if (node.HasChildNodes)
            str.Append(GetChildNodesDescr(node.ChildNodes, level+1));
    }
    return str.ToString();
}

When the whole process is finished, the outer foreach block is closed, and the function returns the content of the StringBuilder object.


Using the XPathNavigator

The XPathNavigator works similarly to the XmlDocument class. It loads all the information into memory and then allows you to move through the nodes. The key difference is that it uses a cursor-based approach that allows you to use methods such as MoveToNext() to move through the XML data. An XPathNavigator can be positioned on only one node at a time.

You can create an XPathNavigator from an XmlDocument using the XmlDocument.CreateNavigator() method. Here’s an example:

private void Page_Load(object sender, System.EventArgs e)
{
    string xmlFile = Server.MapPath("DvdList.xml");

    // Load the XML file in an XmlDocument.
    XmlDocument doc = new XmlDocument();
    doc.Load(xmlFile);

    // Create the navigator.
    XPathNavigator xnav = doc.CreateNavigator();
    XmlText.Text = GetXNavDescr(xnav, 0);
}

In this case, the returned object is passed to the GetXNavDescr() recursive method, which returns the HTML code that represents the XML structure, as in the previous example.

The code of the GetXNavDescr() method is a bit different from the GetChildNodesDescr() method in the previous example, because it takes an XPathNavigator object that is positioned on a single node, not a collection of nodes. That means you don’t need to loop through any collections. Instead, you can simply examine the information for the current node, as follows:

private string GetXNavDescr(XPathNavigator xnav, int level)
{
    string indent = "";
    for (int i=0; i<level; i++)
        indent += "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;";

    StringBuilder str = new StringBuilder("");
    switch(xnav.NodeType)
    {
        case XPathNodeType.Root:
            str.Append("<b>ROOT</b>");
            str.Append("<br />");
            break;
        case XPathNodeType.Element:
            str.Append(indent);
            str.Append("Element: <b>");
            str.Append(xnav.Name);
            str.Append("</b><br />");
            break;
        case XPathNodeType.Text:
            str.Append(indent);
            str.Append(" - Value: <b>");
            str.Append(xnav.Value);
            str.Append("</b><br />");
            break;
        case XPathNodeType.Comment:
            str.Append(indent);
            str.Append("Comment: <b>");
            str.Append(xnav.Value);
            str.Append("</b><br />");
            break;
    }

...

Note that the values for the NodeType property are almost the same, except for the enumeration name, which is XPathNodeType instead of XmlNodeType. That’s because the XPathNavigator uses a smaller, more streamlined set of nodes. One of the nodes it doesn’t support is the XmlDeclaration node type.
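As a rough illustration of this difference, the following sketch loads a tiny inline document and asks each API for the first node at the top of the tree. The sample XML and the NodeTypeComparison class are assumptions made for this example; they are not part of the DVD list application.

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

public static class NodeTypeComparison
{
    // Hypothetical sample data; the real DvdList.xml has more content.
    private const string SampleXml =
        "<?xml version=\"1.0\"?>" +
        "<DvdList><DVD ID=\"1\"><Title>Forrest Gump</Title></DVD></DvdList>";

    // Type of the first node under the document, as seen by XmlDocument.
    public static XmlNodeType FirstDomChildType()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(SampleXml);
        return doc.FirstChild.NodeType;
    }

    // Type of the first node under the root, as seen by XPathNavigator.
    public static XPathNodeType FirstNavigatorChildType()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(SampleXml);
        XPathNavigator xnav = doc.CreateNavigator();

        // The navigator has no node for the XML declaration,
        // so the first child it reaches is the <DvdList> element.
        xnav.MoveToFirstChild();
        return xnav.NodeType;
    }
}
```

With this sample, FirstDomChildType() reports XmlNodeType.XmlDeclaration, while FirstNavigatorChildType() reports XPathNodeType.Element.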

The function checks if the current node has any attributes. If so, it moves to the first one with a call to MoveToFirstAttribute() and loops through all the attributes until the MoveToNextAttribute() method returns false. At that point it returns to the parent node, which is the node originally referenced by the object. Here’s the code that carries this out:

...

    if (xnav.HasAttributes)
    {
        xnav.MoveToFirstAttribute();
        do
        {
            str.Append(indent);
            str.Append(" - Attribute: <b>");
            str.Append(xnav.Name);
            str.Append("</b> Value: <b>");
            str.Append(xnav.Value);
            str.Append("</b><br />");
        } while (xnav.MoveToNextAttribute());

        // Return to the parent.
        xnav.MoveToParent();
    }

...

The function does a similar thing with the child nodes by moving to the first one with MoveToFirstChild() and recursively calling itself until MoveToNext() returns false, at which point it moves back to the original node, as follows:

...

    if (xnav.HasChildren)
    {
        xnav.MoveToFirstChild();
        do
        {
            str.Append(GetXNavDescr(xnav, level+1));
        } while (xnav.MoveToNext());

        // Return to the parent.
        xnav.MoveToParent();
    }
    return str.ToString();
}

This code produces almost the same output as shown in Figure 12-2.

Searching an XML Document

In some situations, you don’t need to process the entire XML document. Instead, you need to extract a single piece of information. If you know the element name, you can use the XmlDocument.GetElementsByTagName() method, which searches an entire document and returns an XmlNodeList that contains all the matching XmlNode objects.


For example, the following code retrieves the title of each DVD in the document:

// Load the XML file.
string xmlFile = Server.MapPath("DvdList.xml");
XmlDocument doc = new XmlDocument();
doc.Load(xmlFile);

// Find all the <Title> elements anywhere in the document.
StringBuilder str = new StringBuilder();
XmlNodeList nodes = doc.GetElementsByTagName("Title");
foreach (XmlNode node in nodes)
{
    str.Append("Found: <b>");

    // Show the text contained in this <Title> element.
    str.Append(node.ChildNodes[0].Value);
    str.Append("</b><br />");
}
XmlText.Text = str.ToString();

Figure 12-3 shows the result of running this code in a web page.

Figure 12-3. Searching for information in an XML document

You can also search portions of an XML document by using the XmlElement.GetElementsByTagName() method on a specific element. In this case, the XmlElement searches all of its descendant nodes looking for a match. To use this method, first retrieve an XmlNode that corresponds to an element and then cast this object to an XmlElement. The following example demonstrates how to use this technique to find the stars of a specific movie:

// Load the XML file.
string xmlFile = Server.MapPath("DvdList.xml");
XmlDocument doc = new XmlDocument();
doc.Load(xmlFile);

// Find all the <Title> elements anywhere in the document.
StringBuilder str = new StringBuilder();
XmlNodeList nodes = doc.GetElementsByTagName("Title");
foreach (XmlNode node in nodes)
{
    str.Append("Found: <b>");

    // Show the text contained in this <Title> element.
    string name = node.ChildNodes[0].Value;
    str.Append(name);
    str.Append("</b><br />");

    if (name == "Forrest Gump")
    {
        // Find the stars for just this movie.
        // First you need to get the parent node
        // (which is the <DVD> element for the movie).
        XmlNode parent = node.ParentNode;

        // Then you need to search down the tree.
        XmlNodeList childNodes =
            ((XmlElement)parent).GetElementsByTagName("Star");
        foreach (XmlNode childNode in childNodes)
        {
            str.Append("&nbsp;&nbsp;&nbsp;Found Star: ");
            str.Append(childNode.ChildNodes[0].Value);
            str.Append("<br />");
        }
    }
}
XmlText.Text = str.ToString();

Figure 12-4 shows the result of this test.

Figure 12-4. Searching portions of an XML document

The code you’ve seen so far assumes that none of the elements has a namespace. More sophisticated XML documents usually include a namespace and may even have several of them. In this situation, you can use the overload of the XmlDocument.GetElementsByTagName() method that takes the namespace name as a second string argument, as shown here:

// Retrieve all <order> elements in the OrderML namespace.
XmlNodeList nodes = doc.GetElementsByTagName("order",
    "http://mycompany/OrderML");

Additionally, you can supply an asterisk (*) for the element name if you want to match all tags in the specified namespace:


// Retrieve all elements in the OrderML namespace.
XmlNodeList nodes = doc.GetElementsByTagName("*",
    "http://mycompany/OrderML");

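To make the namespace-aware overload concrete, here is a minimal sketch that runs both kinds of search against a small inline document. Only the http://mycompany/OrderML namespace comes from the text above; the sample XML, the second namespace, and the NamespaceSearch class are hypothetical.

```csharp
using System;
using System.Xml;

public static class NamespaceSearch
{
    // Hypothetical order document with a default namespace and
    // one element (<x:order>) that belongs to a different namespace.
    private const string SampleXml =
        "<orders xmlns=\"http://mycompany/OrderML\" " +
        "xmlns:x=\"http://example.com/other\">" +
        "<order id=\"1\" /><order id=\"2\" /><x:order id=\"3\" />" +
        "</orders>";

    // Counts the elements that have the given local name
    // (or any name, for "*") in the given namespace.
    public static int CountMatches(string localName, string namespaceUri)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(SampleXml);
        XmlNodeList nodes = doc.GetElementsByTagName(localName, namespaceUri);
        return nodes.Count;
    }
}
```

With this sample, CountMatches("order", "http://mycompany/OrderML") returns 2, because the <x:order> element is in another namespace. CountMatches("*", "http://mycompany/OrderML") returns 3, because the <orders> root element is also in the OrderML namespace.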
Searching an XML Document with XPath

The GetElementsByTagName() method is fairly limited. It allows you to search based on the name of an element only. You can’t filter based on other criteria, such as the value of the element or attribute content. XPath is a much more powerful standard that allows you to retrieve the portions of a document that interest you.

XPath uses a pathlike notation. For example, the path / identifies the root of an XML document, and /DvdList identifies the root <DvdList> element. The path /DvdList/DVD selects every <DVD> element inside the <DvdList>. Finally, the period (.) always selects the current node. In addition, the path // is a relative path that searches for nodes anywhere in the document.

These ingredients are enough to build many basic expressions, although the XPath standard also defines special selection criteria that select only the nodes in which you are interested. Table 12-1 provides an overview of XPath syntax.

Table 12-1. Basic XPath Syntax

Expression     Meaning

/              Starts an absolute path from the root node. /DvdList/DVD selects all <DVD> elements that are children of the root <DvdList> element.

//             Starts a relative path that selects nodes anywhere. //DVD/Title selects all the <Title> elements that are children of a <DVD> element.

@              Selects an attribute of a node. /DvdList/DVD/@ID selects the attribute named ID from the <DVD> element.

*              Selects any element in the path. /DvdList/DVD/* selects all the nodes in the <DVD> element (which include <Title>, <Director>, <Price>, and <Starring> in this example).

|              Combines multiple paths. /DvdList/DVD/Title|/DvdList/DVD/Director selects both the <Title> and <Director> elements in the <DVD> element.

.              Indicates the current (default) node.

..             Indicates the parent node. If the current node is <Title>, then .. refers to the <DVD> node.

[ ]            Defines selection criteria that can test a contained node or attribute value. /DvdList/DVD[Title='Forrest Gump'] selects the <DVD> elements that contain a <Title> element with the indicated value. /DvdList/DVD[@ID='1'] selects the <DVD> elements with the indicated attribute value. You can use the and keyword to combine criteria.

starts-with    This function retrieves elements based on what text a contained element starts with. /DvdList/DVD[starts-with(Title, 'P')] finds all <DVD> elements that have a <Title> element that contains text that starts with the letter P.

position       This function retrieves elements based on position. /DvdList/DVD[position()=2] selects the second <DVD> element.

count          This function counts the number of elements with the matching name. count(DVD) returns the number of <DVD> elements.
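Several of the expressions in Table 12-1 can be tried out with a sketch like the following, which passes an expression to XmlDocument.SelectNodes() and joins the text of the matches. The two-movie sample document, the placeholder titles and director names, and the XPathSyntaxDemo class are assumptions made for this example; the real DvdList.xml is larger.

```csharp
using System;
using System.Xml;

public static class XPathSyntaxDemo
{
    // Hypothetical sample data that mimics the shape of DvdList.xml.
    private const string SampleXml =
        "<DvdList>" +
        "<DVD ID=\"1\"><Title>Forrest Gump</Title>" +
        "<Director>Director A</Director></DVD>" +
        "<DVD ID=\"2\"><Title>Sample Movie</Title>" +
        "<Director>Director B</Director></DVD>" +
        "</DvdList>";

    // Evaluates an XPath expression that returns nodes and
    // joins the text of each match into one string.
    public static string Select(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(SampleXml);
        XmlNodeList nodes = doc.SelectNodes(xpath);

        string[] values = new string[nodes.Count];
        for (int i = 0; i < nodes.Count; i++)
        {
            // InnerText works for both elements and attributes.
            values[i] = nodes[i].InnerText;
        }
        return String.Join(", ", values);
    }
}
```

With this sample, Select("/DvdList/DVD/@ID") returns "1, 2", Select("/DvdList/DVD[Title='Forrest Gump']/Director") returns "Director A", and Select("/DvdList/DVD[position()=2]/Title") returns "Sample Movie". An expression that returns a number rather than nodes, such as count(DVD), is not suitable for SelectNodes().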

 

 


To execute an XPath expression in .NET, you can use the Select() method of the XPathNavigator or the SelectNodes() or SelectSingleNode() method of the XmlDocument class. The following code uses this technique to retrieve specific information:

// Load the XML file.
string xmlFile = Server.MapPath("DvdList.xml");
XmlDocument doc = new XmlDocument();
doc.Load(xmlFile);

// Retrieve the title of every science-fiction movie.
XmlNodeList nodes =
    doc.SelectNodes("/DvdList/DVD/Title[../@Category='Science Fiction']");

// Display the titles.
StringBuilder str = new StringBuilder();
foreach (XmlNode node in nodes)
{
    str.Append("Found: <b>");

    // Show the text contained in this <Title> element.
    str.Append(node.ChildNodes[0].Value);
    str.Append("</b><br />");
}
XmlText.Text = str.ToString();

Figure 12-5 shows the results.

Figure 12-5. Extracting information with XPath
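The XPathNavigator route mentioned earlier works in much the same way, except that Select() returns an XPathNodeIterator instead of a node list. The sketch below is one possible shape for it, using an inline sample document; the sample data and the NavigatorSelectDemo class are hypothetical, and the category value is assumed not to contain a quote character.

```csharp
using System;
using System.Text;
using System.Xml;
using System.Xml.XPath;

public static class NavigatorSelectDemo
{
    // Hypothetical sample data that mimics the shape of DvdList.xml.
    private const string SampleXml =
        "<DvdList>" +
        "<DVD ID=\"1\" Category=\"Science Fiction\">" +
        "<Title>Sample Sci-Fi Movie</Title></DVD>" +
        "<DVD ID=\"2\" Category=\"Drama\">" +
        "<Title>Forrest Gump</Title></DVD>" +
        "</DvdList>";

    private static XPathNavigator CreateNavigator()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(SampleXml);
        return doc.CreateNavigator();
    }

    // Selects the titles in one category and joins them with commas.
    public static string TitlesInCategory(string category)
    {
        XPathNavigator xnav = CreateNavigator();
        XPathNodeIterator matches = xnav.Select(
            "/DvdList/DVD[@Category='" + category + "']/Title");

        StringBuilder str = new StringBuilder();
        while (matches.MoveNext())
        {
            if (str.Length > 0) str.Append(", ");
            // Current is a navigator positioned on the matching node.
            str.Append(matches.Current.Value);
        }
        return str.ToString();
    }

    // Expressions that return a number go through Evaluate(),
    // which hands back the result as a boxed double.
    public static int CountDvds()
    {
        XPathNavigator xnav = CreateNavigator();
        return Convert.ToInt32(xnav.Evaluate("count(/DvdList/DVD)"));
    }
}
```

With this sample, TitlesInCategory("Science Fiction") returns "Sample Sci-Fi Movie" and CountDvds() returns 2.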

Using the XmlTextReader

Reading an XML file with an XmlTextReader object is the simplest approach, but it also provides the least flexibility. The file is read in sequential order, and you can’t freely move to the parent, child, and sibling nodes as you can with XmlDocument and XPathNavigator. Instead, you read a node at a time from a stream. For this reason, the code is in a single nonrecursive method, which is more straightforward. It also makes it easy to scan through an entire XML document until you find the node that interests you.

The following code starts by loading the source file in an XmlTextReader object. It then begins a loop that moves through the document one node at a time. To move from one node to the next, you call the XmlTextReader.Read() method. This method returns true until it moves past the last node, at which point it returns false. This is similar to the approach used by the DataReader class, which retrieves query results from a database.


Here’s the code you need:

private void ReadXML()
{
    string xmlFile = Server.MapPath("DvdList.xml");

    // Create the reader.
    XmlTextReader reader = new XmlTextReader(xmlFile);
    StringBuilder str = new StringBuilder();

    // Loop through all the nodes.
    while (reader.Read())
    {
        switch(reader.NodeType)
        {
            case XmlNodeType.XmlDeclaration:
                str.Append("XML Declaration: <b>");
                str.Append(reader.Name);
                str.Append(" ");
                str.Append(reader.Value);
                str.Append("</b><br />");
                break;
            case XmlNodeType.Element:
                str.Append("Element: <b>");
                str.Append(reader.Name);
                str.Append("</b><br />");
                break;
            case XmlNodeType.Text:
                str.Append(" - Value: <b>");
                str.Append(reader.Value);
                str.Append("</b><br />");
                break;
        }

...

After handling the types of nodes you’re interested in, the next step is to check if the current node has attributes. The XmlTextReader doesn’t have an Attributes collection, but an AttributeCount property returns the number of attributes. You can continue moving the cursor forward to the next attribute until MoveToNextAttribute() returns false.

...

        if (reader.AttributeCount > 0)
        {
            while(reader.MoveToNextAttribute())
            {
                str.Append(" - Attribute: <b>");
                str.Append(reader.Name);
                str.Append("</b> Value: <b>");
                str.Append(reader.Value);
                str.Append("</b><br />");
            }
        }
    }

    // Close the reader and show the text.
    reader.Close();
    XmlText.Text = str.ToString();
}


In the last two lines, the procedure concludes by closing the reader and copying the content accumulated in the StringBuilder to the page. When using the XmlTextReader, it’s imperative that you finish your task and close the reader as soon as possible, because it retains a lock on the file, unlike the XmlDocument, which loads all the information into memory when you call the Load() method.
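One way to make sure the reader is always closed, even if an exception interrupts the loop, is to wrap it in a C# using block. The following is a minimal sketch of that pattern rather than the chapter's own code; the ReaderCleanup class and its method are hypothetical, and the method takes a TextReader so it can be tried without a file on disk.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class ReaderCleanup
{
    // Lists the element names found in the XML source.
    // The using block disposes the reader when the block exits,
    // whether it ends normally or through an exception.
    public static string ListElementNames(TextReader source)
    {
        StringBuilder str = new StringBuilder();
        using (XmlTextReader reader = new XmlTextReader(source))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    if (str.Length > 0) str.Append(",");
                    str.Append(reader.Name);
                }
            }
        }
        return str.ToString();
    }
}
```

For example, ListElementNames(new StringReader("<a><b/><c/></a>")) returns "a,b,c". In a web page, the same pattern applies to a reader created from the file path returned by Server.MapPath().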

If you run this code now, you’ll see a web page that’s quite similar to the earlier examples with the XmlDocument and XPathNavigator.

The XmlTextReader provides additional methods that help make reading XML even faster and more convenient if you know what structure to expect. For example, you can use MoveToContent(), which skips over irrelevant nodes (such as comments, whitespace, and the XML declaration) and stops on the next content node, which is typically the start tag of the next element.

You can also use the ReadStartElement() method, which reads a node and performs basic validation at the same time. When you call ReadStartElement(), you specify the name of the element you expect to appear next in the document. The XmlTextReader calls MoveToContent() and then verifies that the current element has the name you’ve specified. If it doesn’t, an exception is thrown. You can also use the ReadEndElement() method to skip over whitespace and read the closing tag for the element.

Finally, if you want to read an element that contains only text data, you can move over the start tag, content, and end tag in one step by calling the ReadElementString() method and specifying the element name. The data you want is returned as a string.

Here’s the code that extracts data from the XML list using this more streamlined approach:

// Create the reader.
string xmlFile = Server.MapPath("DvdList.xml");
XmlTextReader reader = new XmlTextReader(xmlFile);

StringBuilder str = new StringBuilder();
reader.ReadStartElement("DvdList");

// Read all the <DVD> elements.
while (reader.Read())
{
    if ((reader.Name == "DVD") && (reader.NodeType == XmlNodeType.Element))
    {
        reader.ReadStartElement("DVD");
        str.Append("<ul><b>");
        str.Append(reader.ReadElementString("Title"));
        str.Append("</b><li>");
        str.Append(reader.ReadElementString("Director"));
        str.Append("</li><li>");
        str.Append(String.Format("{0:C}",
            Decimal.Parse(reader.ReadElementString("Price"))));
        str.Append("</li></ul>");
    }
}

// Close the reader and show the text.
reader.Close();
XmlText.Text = str.ToString();

Figure 12-6 shows the result.


Figure 12-6. Efficient XML reading

Validating XML Files

So far you’ve seen a number of strategies for reading and parsing XML data. If you try to read invalid XML content using any of these approaches, you’ll receive an error. In other words, all these classes require well-formed XML. However, none of the examples you’ve seen so far has validated the XML to check that it follows any application-specific rules.
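The distinction between well-formed and valid XML can be seen in a small sketch such as the following, which only checks whether a string parses at all. The WellFormedCheck class and the sample strings are assumptions made for this illustration.

```csharp
using System;
using System.Xml;

public static class WellFormedCheck
{
    // Returns true if the text parses as well-formed XML.
    // This says nothing about whether it follows the DVD list rules.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            XmlDocument doc = new XmlDocument();
            doc.LoadXml(xml);
            return true;
        }
        catch (XmlException)
        {
            // The parser rejects content such as mismatched tags.
            return false;
        }
    }
}
```

Here IsWellFormed("<DvdList><DVD></DvdList>") returns false because the <DVD> tag is never closed. IsWellFormed("<DvdList><Rental /></DvdList>") returns true even though a <Rental> element has no place in the DVD list format, which is exactly the kind of problem that schema validation is meant to catch.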

As described at the beginning of this chapter, XML formats are commonly codified with an XML schema that lays out the required structure and data types. For the DVD list document, you can create an XML schema that looks like this:

<?xml version="1.0" ?>
<xs:schema id="DvdList" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
    <xs:element name="DvdList">
        <xs:complexType>
            <xs:sequence maxOccurs="unbounded">
                <xs:element name="DVD" type="DVDType" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>