Using XPath in XQuery
Some examples from the eXist Sandbox
A version of the eXist sandbox is available online in case you cannot get the eXist-db software working on your local machine. Below I document some example queries that you can interact with in the sandbox. Your responsibility is to devote a set amount of time to trying out these queries and changing them to produce different (and yet meaningful) results. Whenever you have an "Aha!" moment, document the query you were running and your thoughts in our Forum on the subject. Your comments will be incorporated into this exercise for the next class to consider (thank you).
The XML documents being searched in the query examples can be downloaded here:
mondial.xml | hamlet.xml | macbeth.xml | r_and_j.xml
Use these document structures to try out building different XPath strings to grab data from the database and XQuery code to filter and sort it based on your interests.
Note that I also provide some additional XML documents (and XPath segments) from my University of Washington class after the sandbox examples here. You might find the early W3C XQuery working draft, online at http://www.w3.org/TR/2001/WD-xquery-20010215/, of interest.
Search for a specific word in all of Shakespeare's plays:
//SPEECH[ft:query(., 'love')]
The //SPEECH prefix is XPath notation that selects every SPEECH node in every document, no matter where in the document it occurs. The clause inside the square brackets is a predicate: it keeps only the SPEECH nodes that contain the word love and filters out all the others. Note that ft:query is an eXist-db full-text extension function rather than part of standard XPath.
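To experiment with the same select-then-filter pattern outside eXist-db, here is a minimal sketch that uses only standard XQuery functions. The inline $doc fragment is made-up sample data, and the tokenize test is an assumed stand-in for ft:query, not an equivalent of it.

```xquery
xquery version "1.0";

(: Hypothetical sample data standing in for one of the plays :)
let $doc :=
  <PLAY>
    <SPEECH>
      <SPEAKER>ROMEO</SPEAKER>
      <LINE>I love thee</LINE>
    </SPEECH>
    <SPEECH>
      <SPEAKER>JULIET</SPEAKER>
      <LINE>Take off thy glove</LINE>
    </SPEECH>
  </PLAY>
return
  (: keep a SPEECH only if one of its LINEs has the whole word "love";
     splitting on non-word characters means "glove" does not count :)
  $doc//SPEECH[LINE[tokenize(lower-case(.), '\W+') = 'love']]
```

With this sample data only the first SPEECH should come back. Swapping the inner test for contains(., 'love') would also return the second one, because contains matches substrings.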
Get just the lines that have the word love in them from all speeches (notes in the forum on this one):
let $line := //SPEECH/LINE
for $l in $line
where contains($l, "love")
return $l
Look for phrases in the speech text for specific speakers:
//SPEECH[ngram:contains(SPEAKER, 'witch')][ft:query(., '"fenny snake"')]
This chains two predicates, so a speech is kept only if it passes both tests: the speaker's name must include the string 'witch', and the speech text must contain the phrase "fenny snake".
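The effect of chaining predicates can be tried with standard functions alone. The sketch below is an illustration under assumptions: $doc is a small hand-typed sample, and contains with lower-case is a plain substring test standing in for the eXist-specific ngram:contains and ft:query functions.

```xquery
xquery version "1.0";

(: Hand-typed sample data; not loaded from macbeth.xml :)
let $doc :=
  <SCENE>
    <SPEECH>
      <SPEAKER>First Witch</SPEAKER>
      <LINE>Round about the cauldron go;</LINE>
    </SPEECH>
    <SPEECH>
      <SPEAKER>Second Witch</SPEAKER>
      <LINE>Fillet of a fenny snake,</LINE>
    </SPEECH>
    <SPEECH>
      <SPEAKER>MACBETH</SPEAKER>
      <LINE>How now, you secret, black, and midnight hags!</LINE>
    </SPEECH>
  </SCENE>
return
  (: first predicate: speaker name mentions "witch";
     second predicate: some line contains the phrase :)
  $doc/SPEECH[SPEAKER[contains(lower-case(.), 'witch')]]
             [LINE[contains(lower-case(.), 'fenny snake')]]
```

Only the Second Witch speech satisfies both predicates. Removing the second predicate would return both witch speeches, which is a quick way to see what each filter contributes.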
Show the context of a word match:
let $query :=
    <query>
        <bool><term occur="must">nation</term><wildcard occur="should">miser*</wildcard></bool>
    </query>
for $speech in //SPEECH[ft:query(., $query)]
let $scene := $speech/ancestor::SCENE,
    $act := $scene/ancestor::ACT,
    $play := $scene/ancestor::PLAY
return
    <hit>
        <play title="{$play/TITLE}">
            <act title="{$act/TITLE}">
                <scene title="{$scene/TITLE}">{$speech}</scene>
            </act>
        </play>
    </hit>
This query shows how we can walk back up the document hierarchy with the ancestor axis to report meaningful context for the text that comes back from the query.
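The ancestor axis can also be explored without the full-text index. The following is a minimal sketch, assuming a made-up $doc fragment and using contains in place of ft:query; the element names follow the play documents, but the titles and lines are invented.

```xquery
xquery version "1.0";

(: Invented sample data with the PLAY / ACT / SCENE / SPEECH nesting :)
let $doc :=
  <PLAY>
    <TITLE>Sample Play</TITLE>
    <ACT>
      <TITLE>ACT I</TITLE>
      <SCENE>
        <TITLE>SCENE I. A heath.</TITLE>
        <SPEECH><SPEAKER>A</SPEAKER><LINE>One nation, one hope</LINE></SPEECH>
      </SCENE>
      <SCENE>
        <TITLE>SCENE II. A camp.</TITLE>
        <SPEECH><SPEAKER>B</SPEAKER><LINE>No match here</LINE></SPEECH>
      </SCENE>
    </ACT>
  </PLAY>
for $speech in $doc//SPEECH[contains(., 'nation')]
(: climb from the matching speech to its enclosing scene, act, and play :)
let $scene := $speech/ancestor::SCENE
return
  <hit play="{$scene/ancestor::PLAY/TITLE}"
       act="{$scene/ancestor::ACT/TITLE}"
       scene="{$scene/TITLE}">{$speech/LINE/text()}</hit>
```

Only the speech in the first scene matches, so a single hit element is returned, carrying the titles of the play, act, and scene that enclose it.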
Group Word Hits by Play:
let $speech := //SPEECH[ft:query(., "passion*")]
let $plays := (for $s in $speech return root($s))
for $play in $plays/PLAY
(: restrict the hits to the matching speeches inside this play :)
let $hits := $play//SPEECH intersect $speech
return
    <play title="{$play/TITLE}" hits="{count($hits)}">
        {$hits}
    </play>
Find a city by name:
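(: ft:query stands in for the legacy eXist-only &= operator;
   it assumes a full-text index on the name element :)
for $city in /mondial//city[ft:query(name, 'tre*')]
return
    <result>
        {$city}
        <country>{$city/ancestor::country/name}</country>
        <province>{$city/ancestor::province/name}</province>
    </result>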
List countries with a shrinking population (negative population growth), ordered by name:
for $c in //country[population_growth < 0]
order by $c/name
return
    <country>
        {$c/name, $c/population_growth}
    </country>
List Spanish provinces and their cities:
let $country := /mondial/country[name = 'Spain']
for $province in $country/province
order by $province/name
return
    <province>
        {$province/name}
        {
            for $city in $country//city[@province=$province/@id]
            order by $city/name
            return $city
        }
    </province>
List each country's three most populous cities:
for $country in /mondial/country
let $cities :=
    (for $city in $country//city[population]
     order by xs:integer($city/population[1]) descending
     return $city)
order by $country/name
return
    <country name="{$country/name}">
        {
            subsequence($cities, 1, 3)
        }
    </country>
Additional Examples from my University of Washington class:
let $x:= //units
return $x
let $x:= //location
return $x
for $x in //data/location/values[oceantemp>17.0]
return $x/oceantemp
let $x := max(//data/location/values/oceantemp)
return $x
let $x := count(//data/location/values[airtemp=9999.0]/airtemp)
return $x
let $x := avg(//data/location/values[oceantemp>2.0]/oceantemp)
return $x
for $x in //data/location[values/oceantemp>17.0]
return $x/(x | y)
for $x in //data[location/values/oceantemp>17.0]
return $x/time
let $x:= //location[landnorth>0]
return $x
let $x := max(//data/location/values/oceantemp)
return
if ($x > 10) then "yes"
else "no"
for $x in //data/location/values/riverflow
where some $r in $x satisfies $r < 2000.0 and $r > 1900.0
return $x
Example Data Document:
<data>
<date>04-09-2006</date>
<time>4</time>
<location>
<x>7</x>
<y>51</y>
<values>
<precip>9.47706221914e-010</precip>
<airtemp>190.806568846</airtemp>
<wind>-0.69160763042</wind>
<winddir>9999.0</winddir>
<soiltemp>0.0</soiltemp>
<soilmoist>0.0</soilmoist>
<watertable>0.0</watertable>
<snowwater>0.0</snowwater>
<oceantemp>0.0</oceantemp>
<currentu>0.0</currentu>
<currentv>0.0</currentv>
<salinity>0.0</salinity>
<zoo>0.0</zoo>
<phyto>0.0</phyto>
<no3>0.0</no3>
<o2>0.0</o2>
<po4>0.0</po4>
<riverflow>9999.9</riverflow>
</values>
</location>
</data>
The Units Document:
<units>
<precip abbr ="mm">millimeters</precip>
<airtemp abbr ="K">degrees Kelvin</airtemp>
<humidity abbr ="%">percent</humidity>
<wind abbr ="m/s">meters per second</wind>
<winddir abbr ="NWSE">compass heading</winddir>
<shortwave abbr ="W/m^2">watts/meters-squared</shortwave>
<longwave abbr ="W/m^2">watts/meters-squared</longwave>
<soiltemp abbr ="C">degrees Celsius</soiltemp>
<soilmoist abbr ="%">percent</soilmoist>
<watertable abbr ="m">meters</watertable>
<snowwater abbr ="m^3">meters-cubed</snowwater>
<porosity abbr ="%">percent</porosity>
<grainsize abbr ="m">millimeters</grainsize>
<oceantemp abbr ="C">degrees Celsius</oceantemp>
<currentu abbr ="m/sec">x meters per second</currentu>
<currentv abbr ="m/sec">y meters per second</currentv>
<salinity abbr ="ppt">parts per thousand</salinity>
<zoo abbr ="mmo">mmo</zoo>
<phyto abbr ="mmo">mmo</phyto>
<no3 abbr ="mmo">mmo</no3>
<o2 abbr ="mmo">mmo</o2>
<po4 abbr ="mmo">mmo</po4>
<light abbr ="ph/sec*m^3">photons/second*meters-cubed</light>
<riverflow abbr ="ft^3/sec">feet-cubed per second</riverflow>
<rivervolume abbr ="m^3">meters-cubed</rivervolume>
<rivertemp abbr ="C">degrees Celsius</rivertemp>
</units>
The World Coordinates Document:
<location>
<landnorth type ="UTM">5325350</landnorth>
<landwest type ="UTM">564050</landwest>
<landlongitude type ="center">-121.5828168315</landlongitude>
<landlatitude type ="center">47.70478600794</landlatitude>
<landspacing type ="meters">1500</landspacing>
<waternorth type ="UTM">5424030</waternorth>
<waterwest type ="UTM">487140</waterwest>
<waterlongitude type ="west">-123.5000</waterlongitude>
<waterlatitude type ="north">49.0000</waterlatitude>
<waterxspacing type ="meters">360</waterxspacing>
<wateryspacing type ="meters">540</wateryspacing>
</location>