Web Technologies
III B.Tech II Sem (R09) CSE
UNIT-II
Q: What is XML? Explain the various features of XML. What is Document Type Definition (DTD)?
Explain how a DTD is created.
Ans:
What is XML?
The essence of XML is in its name: Extensible Markup Language.
Extensible
XML is extensible. It lets you define your own tags, the order in which they occur, and how they should be
processed or displayed.
Markup
The most recognizable feature of XML is its tags, or elements. In fact, the elements you'll create in XML will
be very similar to the elements you've already been creating in your HTML documents. However, XML allows
you to define your own set of tags.
Language
XML is a language that's very similar to HTML. It's much more flexible than HTML because it allows you to
create your own custom tags. However, it's important to realize that XML is not just a language. XML is a
meta-language: a language that allows us to create or define other languages. For example, with XML we can
create other languages, such as RSS, MathML (a mathematical markup language), and even tools like XSLT.
Consider the following HTML document:
<html>
<head>
<title>ABC Products</title>
</head>
<body>
<h1>ABC Products</h1>
<h2>Product One</h2>
<p>Product One is an exciting new widget that will simplify your
life.</p>
<p><b>Cost: $19.95</b></p>
<p><b>Shipping: $2.95</b></p>
<h3>Product Two</h3>
<p><i>Cost: $29.95</i></p>
<p>Product Two is an exciting new widget that will make you Jump up and
down</p>
<p><b>Shipping: $5.95</b></p>
</body>
</html>
A human can probably deduce that the <h2> tag in the above document has been used to
tag a product name within a product listing. Furthermore, a human might be able to guess that the first
paragraph after an <h2> holds the description, and that the next two paragraphs contain price and shipping
information, in bold.
However, even a cursory glance at the rest of the document reveals some very human errors. For
example, the last product name is encapsulated in <h3> tags, not <h2> tags. This last product listing also
displays a price before the description, and the price is italicized instead of appearing in bold.
A computer program (and even some humans) that tried to decipher this document wouldn't be able to
make the kinds of semantic leaps required to make sense of it. The computer would be able only to render the
document to a browser with the styles associated with each tag. HTML is chiefly a set of instructions for
rendering documents inside a Web browser; it's not a method of structuring documents to bring out their
meaning.
If the above document were created in XML, it might look a little like this first example:
<?xml version="1.0"?>
<productListing title="ABC Products">
<product>
<name>Product One</name>
<description>Product One is an exciting new widget that will
simplify your life.</description>
<cost>$19.95</cost>
<shipping>$2.95</shipping>
</product>
<product>
<name>Product Two</name>
<description>Product Two is an exciting new widget that will make you Jump
up and down</description>
<cost>$29.95</cost>
<shipping>$5.95</shipping>
</product>
</productListing>
When we concentrate on a document's structure, as we've done here, we are better able to ensure that
our information is correct. In theory, we should be able to look at any XML document and understand instantly
what's going on. In the example above, we know that a product listing contains products, and that each product
has a name, a description, a price, and a shipping cost. You could say, rightly, that each XML document is
self-describing, and is readable by both humans and software.
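As a minimal sketch of how software can read such a self-describing document, the Python snippet below parses the product listing with the standard library's xml.etree.ElementTree module. The summarize helper and the variable names are illustrative assumptions and are not part of the original notes.

```python
import xml.etree.ElementTree as ET

# The product listing from the example above, with every tag properly closed.
LISTING = """<?xml version="1.0"?>
<productListing title="ABC Products">
  <product>
    <name>Product One</name>
    <description>Product One is an exciting new widget that will simplify your life.</description>
    <cost>$19.95</cost>
    <shipping>$2.95</shipping>
  </product>
  <product>
    <name>Product Two</name>
    <description>Product Two is an exciting new widget that will make you Jump up and down</description>
    <cost>$29.95</cost>
    <shipping>$5.95</shipping>
  </product>
</productListing>"""


def summarize(xml_text):
    """Return (name, cost, shipping) for every <product> in the listing."""
    root = ET.fromstring(xml_text)
    return [
        (p.findtext("name"), p.findtext("cost"), p.findtext("shipping"))
        for p in root.findall("product")
    ]


products = summarize(LISTING)
for name, cost, shipping in products:
    print(name, cost, shipping)
```

Because each value is tagged by meaning rather than by appearance, the program does not have to guess which paragraph holds the price.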
Now, everyone makes mistakes, and XML programmers are no exception. Imagine that you start to
share your XML documents with another developer or company, and, somewhere along the line, someone
places a product's description after its price. Normally, this wouldn't be a big deal, but perhaps your Web
application requires that the description appears after the product name every time.
To ensure that everyone plays by the rules, you need a DTD (a document type definition), or schema.
Basically, a DTD provides instructions about the structure of your particular XML document. It's a lot like a
rule book that states which tags are legal, and where. Once you have a DTD in place, anyone who creates
product listings for your application will have to follow the rules. We'll get into DTDs a little later. For now,
though, let's continue with the basics.
XML DTD:
A "Valid" XML document is a "Well Formed" XML document which conforms to the rules of a
Document Type Definition (DTD).
The purpose of a DTD is to define the structure of an XML document. It defines the structure with a list
of legal elements. A DTD can be declared inline in your XML document, or as an external reference.
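The difference between "well formed" and "valid" can be seen in practice: any XML parser checks well-formedness, while validity needs a validating parser plus the DTD. The sketch below covers only the first half. It assumes Python's standard xml.etree.ElementTree parser, which does not validate against a DTD; the is_well_formed helper is a hypothetical name used for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


good = "<note><to>CR</to></note>"
bad = "<note><to>CR</note>"  # the <to> element is never closed

results = (is_well_formed(good), is_well_formed(bad))
print(results)
```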
For example, consider the following notice:
NOTICE
To: B.Tech III CSE
From: CR
Message: Don't forget your seminar presentations will be on 29th March (Thursday)
The same notice can be written as an XML document with an inline DTD. Note that the DOCTYPE
declaration must appear before the root element:
<?xml version="1.0"?>
<!DOCTYPE note
[
<!ELEMENT note (to,from,heading,Message)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT Message (#PCDATA)>
]>
<note>
<to> B.Tech III CSE </to>
<from> CR </from>
<heading>Reminder</heading>
<Message> Don't forget your seminar presentations will be on 29th March (Thursday)!
</Message>
</note>
External DTDs
The DTD example above appeared within the DOCTYPE declaration at the
top of the XML document. This is okay for experimentation purposes, but with many projects, you'll likely
have dozens—or even hundreds—of files that must conform to the same DTD. In these cases, it's much
smarter to put the DTD in a separate file, then reference it from your XML documents.
An external DTD is usually a file with a file extension of .dtd—for example, letter.dtd. This external
DTD contains the same notational rules set forth for an internal DTD.
To reference this external DTD, make two changes to your XML document. First, edit the XML
declaration to include the attribute standalone="no" (this is the default value, so stating it is optional, but it
makes the dependency on an external file explicit):
<?xml version="1.0" standalone="no"?>
Second, add a DOCTYPE declaration that points to the external DTD, like this:
<!DOCTYPE letter SYSTEM "letter.dtd">
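This will search for the letter.dtd file in the same directory as the XML file. If the DTD lives
on a different server, give its full URL as the system identifier:
<!DOCTYPE letter SYSTEM "http://www.example.com/xml/dtd/letter.dtd">
The PUBLIC keyword may be used instead of SYSTEM, but it takes a public identifier followed by the
system identifier (the URL), not a URL alone.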
This is the same XML document with an external DTD:
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to> B.Tech III CSE </to>
<from> CR </from>
<heading>Reminder</heading>
<Message> Don't forget your seminar presentations will be on 29th March (Thursday)!
</Message>
</note>
This is a copy of the file "note.dtd" containing the Document Type Definition:
<!ELEMENT note (to,from,heading,Message)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT Message (#PCDATA)>
An element may contain child elements, parsed character data (PCDATA), a mixture of both, or nothing at
all (EMPTY).
DTD Elements
Creating a DTD is quite straightforward. It's really just a matter of defining your elements, attributes,
and/or entities. The following sections explain how to define elements and attributes.
To define an element in your DTD, you use the <!ELEMENT> declaration. The actual contents of your
<!ELEMENT> declaration will depend on the syntax rules you need to apply to your element.
Basic Syntax
The <!ELEMENT> declaration has the following syntax:
<!ELEMENT element_name content_model>
Here, element_name is the name of the element you're defining. The content model could indicate a specific
rule, data or another element.
•If it specifies a rule, it will be set to either ANY or EMPTY.
•If it specifies data or another element, the data type or element name needs to be surrounded by
parentheses, for example (tutorial) or (#PCDATA).
Plain Text:
If an element should contain plain text, you define the element using #PCDATA. PCDATA stands for
Parsed Character Data and is the way you specify non-markup text in your DTDs.
Using this example - <name>XML Tutorial</name> - the "XML Tutorial" part is the PCDATA. The other part
consists of markup.
Syntax:<!ELEMENT element_name (#PCDATA)>
Example:<!ELEMENT name (#PCDATA)>
The above line in your DTD allows the "name" element to contain non-markup data in your XML document:
<name>XML Tutorial</name>
Unrestricted Elements:
If it doesn't matter what your element contains, you can create an element using the content_model of
ANY. Note that doing this removes all syntax checking, so you should avoid using this if possible. You're
better off defining a specific content model.
Syntax:<!ELEMENT element_name ANY>
Example:<!ELEMENT tutorials ANY>
Empty Elements:
You might remember that an empty element is one with no content, so it can be written as a single
self-closing tag. For example, in XHTML, the <br /> and <img /> tags are empty elements. Here's how you
define an empty element:
Syntax:<!ELEMENT element_name EMPTY>
Example:<!ELEMENT header EMPTY>
The above line in your DTD defines the following empty element for your XML document:
<header />
Child Elements:
You can specify that an element must contain another element, by providing the name of the element it
must contain. Here's how you do that:
Syntax:<!ELEMENT element_name (child_element_name)>
Example:<!ELEMENT tutorials (tutorial)>
The above line in your DTD allows the "tutorials" element to contain one instance of the "tutorial"
element in your XML document:
<tutorials>
<tutorial></tutorial>
</tutorials>
DTD Element Operators
One of the examples in the previous section demonstrated how to specify that an element
("tutorials") must contain one instance of another element ("tutorial").
This is fine if only one instance of "tutorial" is needed, but what if we don't want a limit?
What if the "tutorials" element should be able to contain any number of "tutorial" instances?
Fortunately, we can do that using DTD operators.
Here's a list of operators/syntax rules we can use when defining child elements:
Operator   Syntax         Description
+          a+             One or more occurrences of a
*          a*             Zero or more occurrences of a
?          a?             Either a or nothing
,          a, b           a followed by b
|          a | b          Either a or b
()         (expression)   An expression surrounded by parentheses is treated as a unit and can take
                          any one of the suffixes ?, *, or +.
Examples of usage follow.
Zero or More:
To allow zero or more of the same child element, use an asterisk (*):
Syntax:<!ELEMENT element_name (child_element_name*)>
Example:<!ELEMENT tutorials (tutorial*)>
One or More:
To allow one or more of the same child element, use a plus sign (+):
Syntax:<!ELEMENT element_name (child_element_name+)>
Example:<!ELEMENT tutorials (tutorial+)>
Zero or One:
To allow either zero or one of the same child element, use a question mark (?):
Syntax:<!ELEMENT element_name (child_element_name?)>
Example:<!ELEMENT tutorials (tutorial?)>
Choices:
You can define a choice between one or another element by using the pipe (|) operator. For example, if
the "tutorial" element requires a child called either "name", "title", or "subject" (but only one of these), you can
do the following:
Syntax:<!ELEMENT element_name (choice_1 | choice_2 | choice_3)>
Example:<!ELEMENT tutorial (name | title | subject)>
Mixed Content:
You can use the pipe (|) operator to specify that an element can contain both PCDATA and other
elements. In mixed content, #PCDATA must come first and the group must be followed by an asterisk (*):
Syntax:<!ELEMENT element_name (#PCDATA | child_element_name)*>
Example:<!ELEMENT tutorial (#PCDATA | name | title | subject)*>
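The operators above behave much like regular-expression operators applied to the sequence of child element names. The Python sketch below illustrates that idea only; it is not how a real validating parser is implemented. The helpers model_to_regex and children_match are hypothetical names, and the sketch assumes an element-only content model (it does not handle #PCDATA, ANY, or EMPTY).

```python
import re


def model_to_regex(model):
    """Translate an element-only DTD content model into a compiled regex.

    Each child name becomes the token "name " (the name plus one space), so
    a list of children can be matched as one space-terminated string.
    """
    # Remove whitespace so that only names and operators remain.
    compact = re.sub(r"\s+", "", model)
    # Wrap every element name in a non-capturing group ending with a space.
    wrapped = re.sub(
        r"[A-Za-z_][\w.-]*",
        lambda m: "(?:" + re.escape(m.group(0)) + " )",
        compact,
    )
    # A comma means "followed by", which is plain concatenation in a regex.
    # The operators ?, *, +, | and () already mean the same thing in both.
    return re.compile(wrapped.replace(",", ""))


def children_match(model, children):
    """Check whether a list of child element names satisfies the model."""
    text = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(text) is not None


print(children_match("(tutorial*)", []))                      # zero or more
print(children_match("(tutorial+)", []))                      # one or more
print(children_match("(name | title | subject)", ["title"]))  # choice
```

With this sketch, a sequence model such as (to, from, heading, body) accepts the four children only in that order.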
DTD Attributes:
Just as you need to define all elements in your DTD, you also need to define any attributes they use.
You use the <!ATTLIST> declaration to define attributes in your DTD.
Syntax:
You use a single <!ATTLIST> declaration to declare all attributes for a given element. In other words,
for each element (that contains attributes), you only need one <!ATTLIST> declaration.
The <!ATTLIST> declaration has the following syntax:
<!ATTLIST element_name
attribute_name TYPE DEFAULT_VALUE
...>
Here, element_name refers to the element that you're defining attributes for, attribute_name is the name
of the attribute that you're declaring, TYPE is the attribute type, and DEFAULT_VALUE is its default value.
Example: <!ATTLIST tutorial published CDATA "No">
Here, we are defining an attribute called "published" for the "tutorial" element. The attribute's type is
CDATA and its default value is "No".
Default Values
The attribute DEFAULT_VALUE field can be set to one of the following values:
Value           Description
value           A simple text value, enclosed in quotes.
#IMPLIED        There is no default value for this attribute, and the attribute is optional.
#REQUIRED       There is no default value for this attribute, but a value must be assigned.
#FIXED value    The #FIXED part specifies that the attribute's value must be the value provided.
                The value part represents the actual value.
Examples of these default values follow.
Value:
You can provide an actual value to be the default value by placing it in quotes.
Syntax:<!ATTLIST element_name attribute_name CDATA "default_value">
Example:<!ATTLIST tutorial published CDATA "No">
#REQUIRED:
The #REQUIRED keyword specifies that you won't be providing a default value, but that you require
that anyone using this DTD does provide one.
Syntax: <!ATTLIST element_name attribute_name CDATA #REQUIRED>
Example: <!ATTLIST tutorial published CDATA #REQUIRED>
#IMPLIED:
The #IMPLIED keyword specifies that you won't be providing a default value, and that the attribute is
optional for users of this DTD.
Syntax: <!ATTLIST element_name attribute_name CDATA #IMPLIED>
Example: <!ATTLIST tutorial rating CDATA #IMPLIED>
#FIXED:
The #FIXED keyword specifies that you will provide a value, and that it is the only value that can be
used by users of this DTD.
Syntax:<!ATTLIST element_name attribute_name CDATA #FIXED "value">
Example:<!ATTLIST tutorial language CDATA #FIXED "EN">
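The effect of these default values can be observed with a parser that reads the internal DTD subset. The sketch below assumes Python's xml.parsers.expat module with its default settings, under which attributes derived from <!ATTLIST> declarations are reported alongside those written in the document. The "tutorials" document and the collect_attributes helper are made up for illustration. Note that expat does not validate, so it will not reject a missing #REQUIRED attribute.

```python
import xml.parsers.expat

# A small document with an inline DTD that declares three attributes.
DOC = """<!DOCTYPE tutorials [
<!ELEMENT tutorials (tutorial*)>
<!ELEMENT tutorial (#PCDATA)>
<!ATTLIST tutorial
  published CDATA "No"
  language CDATA #FIXED "EN"
  rating CDATA #IMPLIED>
]>
<tutorials>
  <tutorial>XML Tutorial</tutorial>
  <tutorial published="Yes" rating="5">DTD Tutorial</tutorial>
</tutorials>"""


def collect_attributes(xml_text):
    """Return the attribute dict reported for each <tutorial> element."""
    seen = []

    def on_start(name, attrs):
        if name == "tutorial":
            seen.append(dict(attrs))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return seen


attrs = collect_attributes(DOC)
for item in attrs:
    print(sorted(item.items()))
```

The first tutorial picks up the declared defaults for "published" and "language", while "rating" is simply absent because #IMPLIED supplies no default.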
Q: Explain about XML Namespace.
Ans:
XML Namespace
In XML, a namespace is used to prevent any conflicts with element names. Because XML allows you
to create your own element names, there's always the possibility of naming an element exactly the same as one
in another XML document. This might be OK if you never use both documents together. But what if you need
to combine the content of both documents? You would have a name conflict. You would have two different
elements, with different purposes, both with the same name.
Example Name Conflict
Imagine we have an XML document containing a list of books. Something like this:
<books>
<book>
<title>The Dream Saga</title>
<author>Matthew Mason</author>
</book>
...
</books>
And imagine we want to combine it with the following HTML page:
<html>
<head>
<title>Cool Books</title>
</head>
<body>
<p>Here's a list of cool books...</p>
(XML content goes here)
</body>
</html>
We will encounter a problem if we try to combine the above documents, because they both have
an element called title. One is the title of the book, the other is the title of the HTML page. We have a name
conflict. To prevent this name conflict, we can create a namespace for the XML document.
Example Namespace:
Using the above example, we could change the XML document to look something like this:
<bk:books xmlns:bk="http://somebooksite.com/book_spec">
<bk:book>
<bk:title>The Dream Saga</bk:title>
<bk:author>Matthew Mason</bk:author>
</bk:book>
...
</bk:books>
We have added the xmlns:{prefix} attribute to the root element and assigned it a unique value,
usually in the form of a Uniform Resource Identifier (URI). This defines the namespace. Now that the
namespace has been defined, we have added the bk prefix to our element names. When we combine the two
documents, the XML processor will see two different element names: bk:title (from the XML document) and
title (from the HTML document).
In this example, we defined the namespace against the root element. This meant that the namespace
was available to the whole document, and we prefixed all child elements with the same prefix. You can also
define namespaces against a child node. This way, you could use multiple namespaces within the same
document if required.
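To see how a namespace-aware processor treats the prefixed names, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The parser replaces each prefix with the namespace URI it stands for, so what matters is the URI and not the prefix itself. The variable names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

BOOKS = """<bk:books xmlns:bk="http://somebooksite.com/book_spec">
  <bk:book>
    <bk:title>The Dream Saga</bk:title>
    <bk:author>Matthew Mason</bk:author>
  </bk:book>
</bk:books>"""

root = ET.fromstring(BOOKS)

# ElementTree expands each prefixed name to "{namespace-URI}local-name".
root_tag = root.tag

# A prefix-to-URI mapping lets us search using the familiar bk: prefix.
ns = {"bk": "http://somebooksite.com/book_spec"}
title = root.find("bk:book/bk:title", ns).text

print(root_tag)
print(title)
```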
Example Local Namespace:
Here, we apply the namespace against the title element only:
<books>
<book>
<bk:title xmlns:bk="http://somebooksite.com/book_spec">
The Dream Saga
</bk:title>
<author>Matthew Mason</author>
</book>
...
</books>
XML Default Namespace:
The namespaces we created in the previous two examples involved applying a prefix. We applied the
prefix when we defined the namespace, and we applied the prefix to each element that referred to the namespace.
You can also use what is known as a default namespace within your XML documents. The only difference
between a default namespace and the namespaces covered so far is that a default namespace is one where
you don't apply a prefix.
Example Default Namespace:
Here, we define the namespace without a prefix:
<books xmlns="http://somebooksite.com/book_spec">
<book>
<title>The Dream Saga</title>
<author>Matthew Mason</author>
</book>
...
</books>
When you define the namespace without a prefix, all descendant elements are assumed to belong to that
namespace, unless specified otherwise (i.e. with a local namespace).
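The following sketch, again assuming Python's standard xml.etree.ElementTree module, shows that unprefixed descendants of an element with a default namespace really do belong to that namespace. The "b" prefix used for searching is an arbitrary choice made in the Python code; it does not appear in the XML.

```python
import xml.etree.ElementTree as ET

URI = "http://somebooksite.com/book_spec"

BOOKS = """<books xmlns="http://somebooksite.com/book_spec">
  <book>
    <title>The Dream Saga</title>
    <author>Matthew Mason</author>
  </book>
</books>"""

root = ET.fromstring(BOOKS)

# Every element, including the unprefixed descendants, carries the URI.
tags = [elem.tag for elem in root.iter()]

# Searching without the namespace finds nothing ...
plain = root.find("book/title")

# ... while a namespace-qualified search succeeds.
qualified = root.find("b:book/b:title", {"b": URI})

print(tags)
print(plain, qualified.text)
```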
Q: Explain about XML Schema in detail.
Ans:
XML Schema
XML Schema is an XML-based alternative to DTD. An XML schema describes the structure of an
XML document. The XML Schema language is also referred to as XML Schema Definition (XSD). The
purpose of an XML Schema is to define the legal building blocks of an XML document, just like a DTD.
An XML Schema:
Defines elements that can appear in a document
Defines attributes that can appear in a document
Defines which elements are child elements
Defines the order of child elements
Defines the number of child elements
Defines whether an element is empty or can include text
Defines data types for elements and attributes
Defines default and fixed values for elements and attributes
One of the greatest strengths of XML Schemas is the support for data types.
With support for data types:
It is easier to describe allowable document content
It is easier to validate the correctness of data
It is easier to work with data from a database
It is easier to define data facets (restrictions on data)
It is easier to define data patterns (data formats)
It is easier to convert data between different data types
XML Schemas Secure Data Communication
When sending data from a sender to a receiver, it is essential that both parties have the same
"expectations" about the content. With XML Schemas, the sender can describe the data in a way that the
receiver will understand.
A date like "03-11-2004" will, in some countries, be interpreted as 3 November and in other countries as
11 March.
However, an XML element with a data type like this:
<date type="date">2004-03-11</date>
ensures a mutual understanding of the content, because the XML data type "date" requires the format
"YYYY-MM-DD".
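A receiver that knows the value is in "YYYY-MM-DD" form can parse it without guessing. The sketch below uses Python's standard datetime module; the parse_plain_date helper is a hypothetical name, and it covers only the plain year-month-day form (the full XML Schema date type also permits an optional time zone suffix, which is ignored here).

```python
from datetime import date


def parse_plain_date(text):
    """Parse a YYYY-MM-DD string, returning None if it does not fit."""
    try:
        return date.fromisoformat(text)
    except ValueError:
        return None


parsed = parse_plain_date("2004-03-11")
ambiguous = parse_plain_date("03-11-2004")

print(parsed.year, parsed.month, parsed.day)
print(ambiguous)
```

The first value is read as 11 March 2004 everywhere, while the ambiguous form is rejected rather than interpreted.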
A Simple XML Document
Look at this simple XML document called "note.xml":
<?xml version="1.0"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
A DTD File
The following example is a DTD file called "note.dtd" that defines the elements of the XML document above
("note.xml"):
<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
The first line defines the note element to have four child elements: "to, from, heading, body".
Lines 2-5 define the to, from, heading, and body elements to be of type "#PCDATA".
An XML Schema
The following example is an XML Schema file called "note.xsd" that defines the elements of the XML
document above ("note.xml"):
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
A Reference to an XML Schema
This XML document has a reference to an XML Schema:
<?xml version="1.0"?>
<note
xmlns="http://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3schools.com note.xsd">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
The <schema> element is the root element of every XML Schema.
<?xml version="1.0"?>
<xs:schema>
...
...
</xs:schema>
The <schema> element may contain some attributes. A schema declaration often looks something like this:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
...
...
</xs:schema>
In this declaration, the fragment
xmlns:xs="http://www.w3.org/2001/XMLSchema"
indicates that the elements and data types used in the schema come from the
"http://www.w3.org/2001/XMLSchema" namespace. It also specifies that the elements and data types that
come from that namespace should be prefixed with xs:.
The fragment
elementFormDefault="qualified"
indicates that any elements used by the XML instance document which were declared in this schema must be
namespace qualified.
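Since a schema is itself an XML document, it can be read with any XML parser. The sketch below assumes Python's standard xml.etree.ElementTree module and the "note.xsd" content shown earlier. It shows that the xs: prefix resolves to the XML Schema namespace and lists the element names the schema declares. It only reads the schema; the standard library does not validate documents against it.

```python
import xml.etree.ElementTree as ET

XSD = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
<xs:element name="note">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="to" type="xs:string"/>
      <xs:element name="from" type="xs:string"/>
      <xs:element name="heading" type="xs:string"/>
      <xs:element name="body" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
</xs:schema>"""

# The namespace that the xs: prefix stands for, in ElementTree's notation.
XS = "{http://www.w3.org/2001/XMLSchema}"

schema = ET.fromstring(XSD)

# Collect the name of every xs:element declaration, in document order.
declared = [e.get("name") for e in schema.iter(XS + "element")]

print(schema.tag)
print(declared)
```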
Prepared by A. Sharath Kumar (M.Tech), Asst.Prof