XML Worksheet #1

XML became a W3C Recommendation on February 10, 1998. Since then, a remarkable number of XML-based languages have been promulgated as standards by different user groups. In class we got a sense of how XML's markup rules differ from the more loosely defined markup rules of HTML. You practiced creating an XML document and validating it against a DTD.
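
As a reminder of those stricter rules, here is a minimal sketch of a well-formed document. The element names (note, to, br, body) and the file name in the comment are made up for illustration and are not part of any standard.

```xml
<?xml version="1.0"?>
<!-- HTML parsers typically tolerate markup like the following,
     which is NOT well-formed XML:
         <P>Line one<BR>Line two
         <IMG SRC=logo.gif>
-->
<note priority="high">   <!-- attribute values must be quoted -->
    <to>Class</to>       <!-- every start tag needs a matching end tag -->
    <br/>                <!-- an empty element must be explicitly closed -->
    <body>Names are case-sensitive: &lt;body&gt; is not &lt;Body&gt;.</body>
</note>
```

Note that there is exactly one root element (note), and that elements nest without overlapping.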

BEGIN:VCARD
VERSION:2.1
N:Jennifer DiLorenzo;
FN:Jennifer DiLorenzo
ORG:Urban Coast Institute;TOP Partners
TITLE:
EMAIL;type=INTERNET;type=WORK;type=pref:jdiloren@monmouth.edu
TEL;WORK;VOICE
TEL;WORK;FAX
ADR;WORK;ENCODING=QUOTED-PRINTABLE;;;;
LABEL:Monmouth University 400 Cedar Avenue West Long Branch, NJ 07764-1898;WORK;ENCODING=QUOTED-PRINTABLE
REV:20080401T145132Z
END:VCARD
In class we encoded the same vCard information using the basic syntax rules of XML, and I asked you to write in the DTD details. Here is one student's answer:
<?xml version="1.0"?>

<!DOCTYPE vcard [

<!ELEMENT vcard (n, fn, org, title?, email*, tel*, address*, revision)>
<!ATTLIST vcard version CDATA "2.1">

<!ELEMENT n (#PCDATA)>
<!ELEMENT fn (#PCDATA)>
<!ELEMENT org (#PCDATA)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT tel (#PCDATA)>
<!ELEMENT address (street1?, street2?, city?, state?, zip?)>
<!ELEMENT street1 (#PCDATA)>
<!ELEMENT street2 (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT zip (#PCDATA)>
<!ELEMENT revision (#PCDATA)>

<!ATTLIST org status CDATA "partner">
<!ATTLIST email type CDATA "internet">
<!ATTLIST email location (home|school|work) "work">
<!ATTLIST tel type (voice|cell|school|work|fax) "voice">
<!ATTLIST tel location (home|work|school) "work">
<!ATTLIST address type (label|printable) "label">
<!ATTLIST address encoding CDATA "QUOTED-PRINTABLE">
<!ATTLIST address location (home|school|work) "work">
<!ATTLIST zip type (99999-9999|99999) "99999-9999">
]>

<vcard version="2.1">
    <n>Jennifer DiLorenzo</n>
    <fn>Jennifer DiLorenzo</fn>
    <org status="partner">Urban Coast Institute</org>
    <title></title>
    <email type="internet" location="work">jdiloren@monmouth.edu</email>
    <tel type="voice" location="work"></tel>
    <tel type="fax" location="work"></tel>
    <address type="label" encoding="QUOTED-PRINTABLE" location="work">
        <street1>Monmouth University</street1>
        <street2>400 Cedar Avenue West</street2>
        <city>Long Branch</city>
        <state>NJ</state>
        <zip type="99999-9999">07764-1898</zip>
    </address>
    <address type="printable" location="work">
    </address>
    <revision>20080401T145132Z</revision>
</vcard>
This answer passes W3C validation.
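
To see what the occurrence indicators in the vcard content model allow, consider the smallest card that the same model should accept. This is a sketch only: the address declaration is simplified to character data to keep the listing short, and the attribute lists other than version are left out.

```xml
<?xml version="1.0"?>
<!DOCTYPE vcard [
<!ELEMENT vcard (n, fn, org, title?, email*, tel*, address*, revision)>
<!ATTLIST vcard version CDATA "2.1">
<!ELEMENT n (#PCDATA)>
<!ELEMENT fn (#PCDATA)>
<!ELEMENT org (#PCDATA)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT tel (#PCDATA)>
<!ELEMENT address (#PCDATA)>   <!-- simplified for this sketch -->
<!ELEMENT revision (#PCDATA)>
]>
<!-- title? may appear zero or one time; email*, tel* and address*
     may appear zero or more times, so all four can be left out.
     The version attribute falls back to its declared default, 2.1. -->
<vcard>
    <n>Jennifer DiLorenzo</n>
    <fn>Jennifer DiLorenzo</fn>
    <org>Urban Coast Institute</org>
    <revision>20080401T145132Z</revision>
</vcard>
```

The elements without an indicator (n, fn, org, revision) are required exactly once and must appear in the declared order.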

The following is a complete example of a personal vCard expressed in RDF, shown here with an internal DTD.

<?xml version="1.0" standalone="yes"?>
<!-- rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:v="http://www.w3.org/2006/vcard/ns#" -->

<!DOCTYPE v:VCard [

<!ELEMENT v:VCard (v:fn, v:nickname, v:tel, v:email, v:adr)>
<!ATTLIST v:VCard rdf:about CDATA "#nowhere">

<!ELEMENT v:fn (#PCDATA)>
<!ELEMENT v:nickname (#PCDATA)>
<!ELEMENT v:tel (rdf:Description)>
<!ELEMENT v:email (#PCDATA)>
<!ELEMENT v:adr (rdf:Description)>
<!ELEMENT rdf:Description (((v:street-address, v:locality, v:postal-code, v:country-name) | rdf:value), rdf:type+)>
<!ELEMENT v:street-address (#PCDATA)>
<!ELEMENT v:locality (#PCDATA)>
<!ELEMENT v:postal-code (#PCDATA)>
<!ELEMENT v:country-name (#PCDATA)>
<!ELEMENT rdf:value (#PCDATA)>
<!ELEMENT rdf:type (#PCDATA)>

<!ATTLIST rdf:type rdf:resource CDATA "">
<!ATTLIST v:email rdf:resource CDATA "">

]>
           
  <v:VCard rdf:about="http://example.com/me/corky">
    <v:fn>Corky Crystal</v:fn>
    <v:nickname>Corks</v:nickname>
    <v:tel>
      <rdf:Description>  
        <rdf:value>+61 7 5555 5555</rdf:value>
        <rdf:type rdf:resource="http://www.w3.org/2006/vcard/ns#Home"/>
        <rdf:type rdf:resource="http://www.w3.org/2006/vcard/ns#Voice"/>
      </rdf:Description>  
    </v:tel>
    <v:email rdf:resource="mailto:corky@example.com"/>
    <v:adr>
      <rdf:Description>  
        <v:street-address>111 Lake Drive</v:street-address>
        <v:locality>WonderCity</v:locality>
        <v:postal-code>5555</v:postal-code>
        <v:country-name>Australia</v:country-name>
        <rdf:type rdf:resource="http://www.w3.org/2006/vcard/ns#Home"/>
      </rdf:Description>  
    </v:adr>
  </v:VCard>
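
A DTD treats v:VCard as an ordinary name that happens to contain a colon, so the example validates even though its namespace declarations sit inside a comment. A namespace-aware parser, however, would reject the undeclared v: and rdf: prefixes. One possible way to satisfy both, sketched here on a cut-down card, is to declare the xmlns attributes in the DTD. The reduced content model and the choice of #FIXED, #IMPLIED and #REQUIRED are assumptions made for this illustration, not part of the original example.

```xml
<?xml version="1.0"?>
<!DOCTYPE v:VCard [
<!ELEMENT v:VCard (v:fn, v:email)>
<!-- Namespace declarations are attributes, so a DTD must declare them too -->
<!ATTLIST v:VCard
    xmlns:rdf CDATA #FIXED "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:v   CDATA #FIXED "http://www.w3.org/2006/vcard/ns#"
    rdf:about CDATA #IMPLIED>
<!ELEMENT v:fn (#PCDATA)>
<!ELEMENT v:email EMPTY>
<!ATTLIST v:email rdf:resource CDATA #REQUIRED>
]>
<v:VCard xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:v="http://www.w3.org/2006/vcard/ns#"
         rdf:about="http://example.com/me/corky">
    <v:fn>Corky Crystal</v:fn>
    <v:email rdf:resource="mailto:corky@example.com"/>
</v:VCard>
```

The DTD still ties the document to the literal prefixes v: and rdf:, which is one of the reasons DTDs and namespaces fit together awkwardly.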

What do you think about their recommendation? _______________________________________________________ _______________________________________________________

If you need the practice, write your own DTD for the vCard version above so that it validates this example as well. If it helps, run your XML and DTD through the validator to cement your understanding.


You will be creating your own DTDs (as well as an XML Schema-based validator) for the data language you design.
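
To give a feel for what XML Schema adds, here is a small sketch. A DTD can only say that zip contains character data (the student answer earlier records the intended format in a type attribute instead), whereas a schema can check the text itself. The fragment below is hypothetical and covers only a standalone zip element.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <!-- Hypothetical fragment: constrains only a <zip> element -->
    <xs:element name="zip">
        <xs:simpleType>
            <xs:restriction base="xs:string">
                <!-- five digits, optionally followed by a hyphen and four more -->
                <xs:pattern value="[0-9]{5}(-[0-9]{4})?"/>
            </xs:restriction>
        </xs:simpleType>
    </xs:element>
</xs:schema>
```

Under this sketch, both 07764 and 07764-1898 should be accepted, while 7764 should be rejected.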

Research the structures used for postal addresses in the US, Japan and Brazil. Can you modify at least one DTD that you've created so that it supports (and can validate) all of those forms of international addresses? Here's one student's answer:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<!DOCTYPE addresses[
<!ELEMENT addresses (address+)>
<!ELEMENT address (name, (street | prefecture),
    ((city, state)
    | (floor?, district, region)
    | (munincipality, location, city_district,
        ((city_block, house_number) | (land_number, land_number_extension)))),
    post_code)>

<!ATTLIST address country CDATA #REQUIRED>
<!ELEMENT name (#PCDATA)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT floor (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT post_code (#PCDATA)>
<!ELEMENT district (#PCDATA)>
<!ELEMENT region (#PCDATA)>
<!ELEMENT prefecture (#PCDATA)>
<!ATTLIST prefecture pre_type CDATA #IMPLIED>
<!ELEMENT munincipality (ward|gun)>
<!ELEMENT location (#PCDATA)>
<!ELEMENT ward (machi|cho)>
<!ELEMENT machi (#PCDATA)>
<!ELEMENT cho (#PCDATA)>

<!ELEMENT gun (town|village)>
<!ELEMENT town (cho2|oaza|aza|koaza)>
<!ELEMENT cho2 (#PCDATA)>
<!ELEMENT oaza (#PCDATA)>
<!ELEMENT aza (#PCDATA)>
<!ELEMENT koaza (#PCDATA)>
<!ELEMENT village (mura|son)>
<!ELEMENT mura (#PCDATA)>
<!ELEMENT son (#PCDATA)>
<!ELEMENT city_district (#PCDATA)>
<!ATTLIST city_district style CDATA #IMPLIED>
<!ELEMENT city_block (#PCDATA)>
<!ELEMENT house_number (#PCDATA)>
<!ELEMENT land_number (#PCDATA)>
<!ELEMENT land_number_extension (#PCDATA)>

]>
<addresses>
<address country="US">
<name>Rachel Lieberman</name>
<street>15 Hillside Park</street>
<city>Somerville</city>
<state>MA</state>
<post_code>02143</post_code>
</address>
<address country="Brazil">
<name>Rachel Lieberman</name>
<street>15 Hillside Park</street>
<floor>1</floor>
<district>Middlesex County</district>
<region>New England</region>
<post_code>02143</post_code>
</address>

<address country="Japan">
<name>Rachel Lieberman</name>
<prefecture>New England</prefecture>

<munincipality>
<ward><machi>Union Square</machi></ward>
</munincipality>
<location>Somerville</location>
<city_district style="hyoji">
</city_district>
<city_block>Hillside Park</city_block>
<house_number>15</house_number>
<post_code>02143</post_code>
</address>

</addresses>

<!-- Japanese address structure based on
http://en.wikipedia.org/wiki/Japanese_addressing_system
-->
This answer also passes W3C validation.