XML and Rebol
From the Rebol Console:
do http://reb4.me/r/altxml
? load-xml
For Rebol 3:
do http://reb4.me/r3/altxml
For Red:
do https://raw.githubusercontent.com/rgchris/Scripts/master/experimental/altxml.red
For Ren-C:
do https://raw.githubusercontent.com/rgchris/Scripts/master/experimental/altxml.reb
Sample Usage
To get all the links from an xhtml page:
html: load-xml/dom http://w3.org
links: html/get-by-tag <a>
foreach link links [print ["(" link/get #href ")" link/text]]
To get entry titles from an RSS feed:
rss: load-xml/dom http://www.rebol.com/article/carl-rss.xml
entries: rss/get-by-tag <item>
foreach entry entries [probe entry/get <title>]
Today’s Weather
weather: http://weather.yahooapis.com/forecastrss?p=35205
weather: load-xml/dom weather
weather: pick weather/get-by-tag <condition> 1
probe weather: context [
temp: rejoin [weather/get #temp "°F"]
sky: weather/get #text
]
load-xml
This function parses an XML document provided by string!, url! or file! producing a representative block. Tags are represented by the tag! type, attributes by the issue! type and text prefixed by a /text refinement.
>> load-xml "<a b='c'>D</a>"
== [
<a> [
#b "c"
%.txt "D"
]
]
Tags with only text as children have a single value:
>> load-xml "<a>B</a>"
== [
<a> "B"
]
/dom refinement
Wraps the above document in an object! with accessor functions. Values from the accessor functions are similarly wrapped in objects.
get-by-tag ‘tag
Returns a block of child nodes where the tag matches the ‘tag value.
body: doc/get-by-tag <body>
body/get-by-tag <p>
get-by-id ‘id
Returns the first matching child node that has an attribute id matching the the ‘id value.
header: doc/get-by-id "header"
children
Returns a block of child nodes.
body/children
sibling /before /after
Returns an adjacent node or none!
header/sibling/after
get ‘name
Returns the value of an immediate child node of name ‘name (attribute or child tag)
doc/get #b
doc/get <a>
text
Returns the textual value of a tag node.
header/text
value
The value of a node.
header/value
flatten
Serializes the document back to an XML string.
path
(only in the experimental version for now)
Select nodes and values using a path notation:
; all titles in an RSS feed
rss/path [<rss> <channel> <item> <title> ?]
; all item nodes in an XML document
rss/path [* <item>]
; first header in an HTML document
html/path [<html> * <h1> 1]
Notation includes:
<rss> - select tags at the current depth
#version - select attributes at the current depth
* <item> - select any descendant with this tag
2 - selects the second result
? - converts the selection(s) to that node's value
Not implemented:
parent
To follow. Will return a node’s parent node.