Joins and Query Use with XML - Use case TREE: queries that preserve hierarchy

Some XML document types have a very flexible structure in which text is mixed with elements and many elements are optional. These document types vary widely in structure from one document to another, and the way elements are ordered and nested is usually significant. An XML query language should therefore be able to extract elements from documents while preserving their original hierarchy. This use case illustrates the requirement with a flexible document type named Book. The DTD and XML data used by these queries follow in Examples 9-14 and 9-15.

Example 9-14. book.dtd
<!ELEMENT book (title, author+, section+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT section (title, (p | figure | section)* )>
<!ATTLIST section
  id         ID    #IMPLIED
  difficulty CDATA #IMPLIED>
<!ELEMENT p (#PCDATA)>
<!ELEMENT figure (title, image)>
<!ATTLIST figure
  width  CDATA #REQUIRED
  height CDATA #REQUIRED>
<!ELEMENT image EMPTY>
<!ATTLIST image
  source CDATA #REQUIRED>

Example 9-15. book.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book SYSTEM "book.dtd">
<book>
  <title>Data on the Web</title>
  <author>Serge Abiteboul</author>
  <author>Peter Buneman</author>
  <author>Dan Suciu</author>
  <section id="intro" difficulty="easy">
    <title>Introduction</title>
    <p>Text ... </p>
    <section>
      <title>Audience</title>
      <p>Text ... </p>
    </section>
    <section>
      <title>Web Data and the Two Cultures</title>
      <p>Text ... </p>
      <figure height="400" width="400">
        <title>Traditional client/server architecture</title>
        <image source="csarch.gif"/>
      </figure>
      <p>Text ... </p>
    </section>
  </section>
  <section id="syntax" difficulty="medium">
    <title>A Syntax For Data</title>
    <p>Text ... </p>
    <figure height="200" width="500">
      <title>Graph representations of structures</title>
      <image source="graphs.gif"/>
    </figure>
    <p>Text ... </p>
    <section>
      <title>Base Types</title>
      <p>Text ... </p>
    </section>
    <section>
      <title>Representing Relational Databases</title>
      <p>Text ... </p>
      <figure height="250" width="400">
        <title>Examples of Relations</title>
        <image source="relations.gif"/>
      </figure>
    </section>
    <section>
      <title>Representing Object Databases</title>
      <p>Text ... </p>
    </section>
  </section>
</book>

Question 1. Prepare a (nested) table of contents for Book1, listing all the sections and their titles. Preserve the original attributes of each <section> element, if any exist:

<xsl:template match="book">
  <toc>
    <xsl:apply-templates/>
  </toc>
</xsl:template>

<!-- Copy the elements that make up the toc -->
<xsl:template match="section | section/title | section/title/text()">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <xsl:apply-templates/>
  </xsl:copy>
</xsl:template>

<!-- Suppress other elements -->
<xsl:template match="* | text()"/>

Question 2. Prepare a (flat) figure list for Book1, listing all figures and their titles. Preserve the original attributes of each <figure> element, if any exist:

<xsl:template match="book">
  <figlist>
    <xsl:for-each select=".//figure">
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:copy-of select="title"/>
      </xsl:copy>
    </xsl:for-each>
  </figlist>
</xsl:template>

Question 3. How many sections are in Book1, and how many figures?

<xsl:template match="/">
  <section-count><xsl:value-of select="count(//section)"/></section-count>
  <figure-count><xsl:value-of select="count(//figure)"/></figure-count>
</xsl:template>

Question 4. How many top-level sections are in Book1?

<xsl:template match="book">
  <top_section_count>
    <xsl:value-of select="count(section)"/>
  </top_section_count>
</xsl:template>

Question 5. Make a flat list of the section elements in Book1. In place of its original attributes, each section element should have two attributes, containing the title of the section and the number of figures immediately contained in the section:

<xsl:template match="book">
  <section_list>
    <xsl:for-each select=".//section">
      <section title="{title}" figcount="{count(figure)}"/>
    </xsl:for-each>
  </section_list>
</xsl:template>

Question 6. Make a nested list of the section elements in Book1, preserving their original attributes and hierarchy. Inside each section element, include the title of the section and an element that includes the number of figures immediately contained in the section. See Example 9-16 and Example 9-17.

Example 9-16. The solution as I would interpret the English requirements
<xsl:template match="book">
  <toc>
    <xsl:apply-templates select="section"/>
  </toc>
</xsl:template>

<!-- Copy each section with its attributes, title, and direct figure count,
     then recurse into its child sections -->
<xsl:template match="section">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <xsl:copy-of select="title"/>
    <figcount><xsl:value-of select="count(figure)"/></figcount>
    <xsl:apply-templates select="section"/>
  </xsl:copy>
</xsl:template>

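For reference, here is a sketch of the result these two templates should produce for book.xml from Example 9-15. It was worked out by hand from the sample data, not captured from a processor run, so whitespace and attribute order may differ in practice. Each section appears exactly once, at its original nesting depth, and figcount counts only the figures that are direct children of that section.

```xml
<toc>
  <section id="intro" difficulty="easy">
    <title>Introduction</title>
    <figcount>0</figcount>
    <section>
      <title>Audience</title>
      <figcount>0</figcount>
    </section>
    <section>
      <title>Web Data and the Two Cultures</title>
      <figcount>1</figcount>
    </section>
  </section>
  <section id="syntax" difficulty="medium">
    <title>A Syntax For Data</title>
    <figcount>1</figcount>
    <section>
      <title>Base Types</title>
      <figcount>0</figcount>
    </section>
    <section>
      <title>Representing Relational Databases</title>
      <figcount>1</figcount>
    </section>
    <section>
      <title>Representing Object Databases</title>
      <figcount>0</figcount>
    </section>
  </section>
</toc>
```
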
Example 9-17. What the W3C use case wants based on a sample result and XQuery
<xsl:template match="book">
  <toc>
    <!-- Visit every section, shallowest first; data-type="number" keeps the
         depth sort numeric rather than textual -->
    <xsl:for-each select="//section">
      <xsl:sort select="count(ancestor::section)" data-type="number"/>
      <xsl:apply-templates select="."/>
    </xsl:for-each>
  </toc>
</xsl:template>

<xsl:template match="section">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <xsl:copy-of select="title"/>
    <figcount><xsl:value-of select="count(figure)"/></figcount>
    <xsl:apply-templates select="section"/>
  </xsl:copy>
</xsl:template>
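
The solutions in this section are template fragments, so running one means wrapping it in an xsl:stylesheet element first. The following is a minimal sketch of a complete stylesheet for Question 5, assuming XSLT 1.0. The xsl:output settings are an assumption added for readability and are not part of the original solution.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: indented XML output; the original fragments
       do not specify any xsl:output settings -->
  <xsl:output method="xml" indent="yes"/>

  <!-- Question 5: flat list of all sections, each carrying its title
       and the number of figures that are its direct children -->
  <xsl:template match="book">
    <section_list>
      <xsl:for-each select=".//section">
        <section title="{title}" figcount="{count(figure)}"/>
      </xsl:for-each>
    </section_list>
  </xsl:template>

</xsl:stylesheet>
```

No template for the root node is needed here: the built-in template rules pass processing down from the root to the book element, where the template above takes over. The other fragments can be wrapped the same way, as long as only one template matching book is included at a time.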
This article is excerpted from chapter nine of the XSLT Cookbook, Second Edition, written by Sal Mangano (O'Reilly; ISBN: 0596009747).