
Using XPath in XQuery

Some examples from the eXist Sandbox

A version of the eXist sandbox is available online in case you cannot get the eXist-db software working on your local machine. Below I document some example queries that you can interact with in the sandbox. Your responsibility is to devote a set amount of time to trying out these queries and changing them to produce different (and yet meaningful) results. Whenever you have an "Aha!" moment, document the query you were running and your thoughts in our Forum on the subject. Your comments will be incorporated into this exercise for the next class to consider (thank you).

The XML documents searched in the query examples can be downloaded here: mondial.xml | hamlet.xml | macbeth.xml | r_and_j.xml
Use these document structures to try building different XPath expressions to grab data from the database, and XQuery code to filter and sort it based on your interests.

Note that after the sandbox examples here I provide some additional XML documents (and XPath segments) from my University of Washington class. You might find the XQuery working draft online at http://www.w3.org/TR/2001/WD-xquery-20010215/ of interest.

Search for a specific word in all of the Shakespeare plays:
//SPEECH[ft:query(., 'love')]

The //SPEECH prefix is XPath notation that grabs all SPEECH nodes from all documents, no matter where in the documents they occur. The clause inside the square brackets is a predicate: it filters out every SPEECH node that does not contain the word love. Here ft:query() is eXist's full-text search function, which relies on a Lucene index on the stored documents.
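
To see the two parts of that expression at work without the full-text index, here is a minimal sketch that runs on a small inline fragment. The fragment is made up for illustration (it only imitates the structure of the play files), and the standard contains() function stands in for ft:query(), which needs documents that are stored and indexed in eXist.

```xquery
(: Hypothetical fragment imitating the structure of the play files :)
let $play :=
    <PLAY>
        <TITLE>Sample Play</TITLE>
        <ACT>
            <SCENE>
                <SPEECH>
                    <SPEAKER>FIRST</SPEAKER>
                    <LINE>I speak of love tonight.</LINE>
                </SPEECH>
                <SPEECH>
                    <SPEAKER>SECOND</SPEAKER>
                    <LINE>And I of war.</LINE>
                </SPEECH>
            </SCENE>
        </ACT>
    </PLAY>
return
    (: $play//SPEECH finds SPEECH nodes at any depth; the predicate keeps matches only :)
    <result all="{count($play//SPEECH)}">
        { $play//SPEECH[contains(., 'love')] }
    </result>
```

The result should report two speeches in total but contain only the first one. Keep in mind that contains() matches substrings, so it would also accept a word such as "glove", whereas ft:query() matches whole indexed words.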

Get just the lines that have the word love in them from all speeches (notes in the forum on this one):
let $line := //SPEECH/LINE
for $l in $line
where contains($l, "love")
return $l
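
One thing worth trying here: contains() is case-sensitive, so a line that begins with "Love" is skipped by the query above. A possible variation, offered as a sketch and not as one of the original sandbox examples, lowercases each line before testing it. It assumes the same play documents are loaded in the sandbox.

```xquery
(: Variation on the query above: lower-case() makes the test ignore case :)
for $l in //SPEECH/LINE
where contains(lower-case($l), "love")
return $l
```

This is still a substring test, so lines containing "beloved" or "glove" come back as well. Comparing its results with the ft:query() search is a quick way to see the difference between substring matching and word matching.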

Look for phrases in the speech text for specific speakers:
//SPEECH[ngram:contains(SPEAKER, 'witch')][ft:query(., '"fenny snake"')]

This adds another requirement to the filtering clause: speeches whose speaker's name does not include the word 'witch' are filtered out as well.

Show the context of a word match:
let $query :=
    <query>
        <bool><term occur="must">nation</term><wildcard occur="should">miser*</wildcard></bool>
    </query>
for $speech in //SPEECH[ft:query(., $query)]
let $scene := $speech/ancestor::SCENE,
    $act := $scene/ancestor::ACT,
    $play := $scene/ancestor::PLAY
return
    <hit>
        <play title="{$play/TITLE}">
            <act title="{$act/TITLE}">
                <scene title="{$scene/TITLE}">{$speech}</scene>
            </act>
        </play>
    </hit>

This query shows how we can go back up the document hierarchy to report meaningful context for the text that comes back from the query.

Group word hits by play:
let $speech := //SPEECH[ft:query(., "passion*")]
let $plays := (for $s in $speech return root($s))
for $play in $plays/PLAY
let $hits := $play//$speech
return
    <play title="{$play/TITLE}" hits="{count($hits)}">
        {$hits}
    </play>

Find a city by name:
for $city in /mondial//city[name&='tre*']
return
    <result>
        {$city}
        <country>{$city/ancestor::country/name}</country>
        <province>{$city/ancestor::province/name}</province>
    </result>

List countries with a decreasing population (negative population growth), sorted by name:
for $c in //country[population_growth < 0]
order by $c/name
return
    <country>
        {$c/name, $c/population_growth}
    </country>

List Spanish provinces and their cities:
let $country := /mondial/country[name = 'Spain']
for $province in $country/province
order by $province/name
return
    <province>
        {$province/name}
        {
            for $city in $country//city[@province=$province/@id]
            order by $city/name
            return $city
        }
    </province>

List the three most populous cities of each country:
for $country in /mondial/country
let $cities :=
    (for $city in $country//city[population]
     order by xs:integer($city/population[1]) descending
     return $city)
order by $country/name
return
    <country name="{$country/name}">
        { subsequence($cities, 1, 3) }
    </country>
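
The ancestor:: axis used in the context query above can be tried on its own with a small inline fragment. The sketch below is only an illustration: the fragment, its titles and its line are invented, and contains() replaces the index-based ngram:contains() and ft:query() functions so that no stored documents are needed.

```xquery
(: Hypothetical fragment; the titles and the line are invented for illustration :)
let $play :=
    <PLAY>
        <TITLE>Sample Play</TITLE>
        <ACT>
            <TITLE>ACT I</TITLE>
            <SCENE>
                <TITLE>SCENE I</TITLE>
                <SPEECH>
                    <SPEAKER>First Witch</SPEAKER>
                    <LINE>A made-up line.</LINE>
                </SPEECH>
            </SCENE>
        </ACT>
    </PLAY>
for $speech in $play//SPEECH[contains(SPEAKER, 'Witch')]
(: Walk back up from the matching speech to its enclosing scene and act :)
let $scene := $speech/ancestor::SCENE,
    $act := $scene/ancestor::ACT
return
    <hit act="{$act/TITLE}" scene="{$scene/TITLE}">
        { $speech/LINE }
    </hit>
```

Each hit should carry the titles of the act and scene that enclose the matching speech. Changing ancestor::SCENE to parent::SCENE gives the same answer here, because SCENE is the direct parent of SPEECH in this fragment; the ancestor axis keeps working when the nesting is deeper.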

Additional Examples from my University of Washington class:

let $x := //units
return $x

let $x := //location
return $x

for $x in //data/location/values[oceantemp>17.0]
return $x/oceantemp

let $x := max(//data/location/values/oceantemp)
return $x

let $x := count(//data/location/values[airtemp=9999.0]/airtemp)
return $x

let $x := avg(//data/location/values[oceantemp>2.0]/oceantemp)
return $x

for $x in //data/location[values/oceantemp>17.0]
return $x/(x | y)

for $x in //data[location/values/oceantemp>17.0]
return $x/time

let $x := //location[landnorth>0]
return $x

let $x := max(//data/location/values/oceantemp)
return if ($x > 10) then "yes" else "no"

for $x in //data/location/values/riverflow
where some $r in $x satisfies $r < 2000.0 and $r > 1900.0
return $x
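
The count query above tests airtemp against 9999.0, which suggests that 9999.0 marks a missing measurement. That reading is an assumption on my part. Under it, an aggregate such as avg() or max() would be distorted unless those values are excluded first. The sketch below uses invented readings in a trimmed-down data structure to show one way of doing that.

```xquery
(: Hypothetical readings; 9999.0 is ASSUMED to mean "no measurement" :)
let $data :=
    <data>
        <location><values><oceantemp>12.5</oceantemp></values></location>
        <location><values><oceantemp>17.5</oceantemp></values></location>
        <location><values><oceantemp>9999.0</oceantemp></values></location>
    </data>
(: Keep only the readings that are not the assumed sentinel value :)
let $valid := $data/location/values/oceantemp[. != 9999.0]
return
    <summary count="{count($valid)}" avg="{avg($valid)}" max="{max($valid)}"/>
```

With these three readings the summary should count two valid values. Removing the predicate shows how a single sentinel value dominates both the average and the maximum.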

Example Data Document:

<data>
    <date>04-09-2006</date>
    <time>4</time>
    <location>
        <x>7</x>
        <y>51</y>
        <values>
            <precip>9.47706221914e-010</precip>
            <airtemp>190.806568846</airtemp>
            <wind>-0.69160763042</wind>
            <winddir>9999.0</winddir>
            <soiltemp>0.0</soiltemp>
            <soilmoist>0.0</soilmoist>
            <watertable>0.0</watertable>
            <snowwater>0.0</snowwater>
            <oceantemp>0.0</oceantemp>
            <currentu>0.0</currentu>
            <currentv>0.0</currentv>
            <salinity>0.0</salinity>
            <zoo>0.0</zoo>
            <phyto>0.0</phyto>
            <no3>0.0</no3>
            <o2>0.0</o2>
            <po4>0.0</po4>
            <riverflow>9999.9</riverflow>
        </values>
    </location>
</data>

The Units Document:

<units>
    <precip abbr="mm">millimeters</precip>
    <airtemp abbr="K">degrees Kelvin</airtemp>
    <humidity abbr="%">percent</humidity>
    <wind abbr="m/s">meters per second</wind>
    <winddir abbr="NWSE">compass heading</winddir>
    <shortwave abbr="W/m^2">watts/meters-squared</shortwave>
    <longwave abbr="W/m^2">watts/meters-squared</longwave>
    <soiltemp abbr="C">degrees Celsius</soiltemp>
    <soilmoist abbr="%">percent</soilmoist>
    <watertable abbr="m">meters</watertable>
    <snowwater abbr="m^3">meters-cubed</snowwater>
    <porosity abbr="%">percent</porosity>
    <grainsize abbr="m">millimeters</grainsize>
    <oceantemp abbr="C">degrees Celsius</oceantemp>
    <currentu abbr="m/sec">x meters per second</currentu>
    <currentv abbr="m/sec">y meters per second</currentv>
    <salinity abbr="ppt">parts per thousand</salinity>
    <zoo abbr="mmo">mmo</zoo>
    <phyto abbr="mmo">mmo</phyto>
    <no3 abbr="mmo">mmo</no3>
    <o2 abbr="mmo">mmo</o2>
    <po4 abbr="mmo">mmo</po4>
    <light abbr="ph/sec*m^3">photons/second*meters-cubed</light>
    <riverflow abbr="ft^3/sec">feet-cubed per second</riverflow>
    <rivervolume abbr="m^3">meters-cubed</rivervolume>
    <rivertemp abbr="C">degrees Celsius</rivertemp>
</units>

The World Coordinates Document:

<location>
    <landnorth type="UTM">5325350</landnorth>
    <landwest type="UTM">564050</landwest>
    <landlongitude type="center">-121.5828168315</landlongitude>
    <landlatitude type="center">47.70478600794</landlatitude>
    <landspacing type="meters">1500</landspacing>
    <waternorth type="UTM">5424030</waternorth>
    <waterwest type="UTM">487140</waterwest>
    <waterlongitude type="west">-123.5000</waterlongitude>
    <waterlatitude type="north">49.0000</waterlatitude>
    <waterxspacing type="meters">360</waterxspacing>
    <wateryspacing type="meters">540</wateryspacing>
</location>
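
Because the element names in the units document match the element names under values in the data document, the two can be joined by name. The following is a minimal sketch of that idea using trimmed inline copies of the two documents shown above. The reading element and its attributes are names I made up for the output.

```xquery
(: Trimmed copies of the data and units documents shown above :)
let $data :=
    <data>
        <location>
            <values>
                <airtemp>190.806568846</airtemp>
                <wind>-0.69160763042</wind>
            </values>
        </location>
    </data>
let $units :=
    <units>
        <airtemp abbr="K">degrees Kelvin</airtemp>
        <wind abbr="m/s">meters per second</wind>
    </units>
for $v in $data/location/values/*
(: Look up the unit entry whose element name equals the name of the measurement :)
let $unit := $units/*[name() = name($v)]
return
    <reading name="{name($v)}" unit="{$unit/@abbr}">{ string($v) }</reading>
```

Against documents stored in the database, the same join could start from //data/location/values/* and //units instead of the inline variables. I have not verified that form against the class data.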
