When we talk about 'loading' a YAML stream, we mean that a YAML document is translated into native types. In Ruby, this might be a Hash, an Array or any other Ruby object. But before YAML is loaded into those types, it must be parsed. Parsing is the stage where the structure of the document becomes apparent, but not the native typing.
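The difference between the two stages can be sketched as follows. This is an illustration under an assumption, not YAML.rb's own API: it uses Psych, the YAML library bundled with current Ruby releases, where YAML.parse returns a Psych::Nodes::Document rather than a YamlNode. The sample document is made up for the example.

```ruby
require 'yaml'

source = "title: YAML.rb\nversion: 1\n"

# Loading goes all the way to native Ruby types.
loaded = YAML.load(source)
puts loaded.class              # Hash
puts loaded['version'].inspect # 1

# Parsing stops at the node tree (assumption: Psych, not YAML.rb).
# The structure is known, but every scalar is still plain text.
root = YAML.parse(source).root
puts root.class                # Psych::Nodes::Mapping
p root.children.map(&:value)   # ["title", "YAML.rb", "version", "1"]
```

Note that the parsed tree still holds "1" as a string; it only becomes an integer once the node is loaded.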
YAML.rb gives you access to a YAML document before it is transformed. At this stage, the document is represented as a tree of YAML::YamlNode objects. This structure can be quite useful for accessing the data as a raw structure, much as the XML world has its DOM API. You can also use YPath queries to retrieve data from the structure. Schemas can be applied to the YamlNode tree to validate that the structure is intact and syntactically correct.
The YAML::parse and YAML::parse_documents methods are the ways of accessing this parsed data.
The YAML::parse method has the same syntax as the YAML::load method. A single IO object or String containing a YAML document is passed in to the method. Rather than returning a native Ruby object, though, the YAML::parse method returns a YamlNode representing the document.
tree = YAML::parse( File.open( "README" ) )
puts tree.type_id
# prints:
#   map
title = tree.select( "/title" )[0]
puts title.value
# prints:
#   YAML.rb
obj_tree = tree.transform
puts obj_tree['title']
# prints:
#   YAML.rb
The YamlNode returned contains type and value information for the root-level collection or scalar. If, for example, the document contains a mapping at the root level, then the YamlNode will have a type_id of 'map', and a map of YamlNodes will be contained in the object's 'value' property.
node = YAML::parse( <<EOY )
one: 1
two: 2
EOY
puts node.type_id
# prints: 'map'
p node.value['one']
# prints key and value nodes:
#   [ #<YAML::YamlNode:0x8220278 @type_id="str", @value="one", @kind="scalar">,
#     #<YAML::YamlNode:0x821fcd8 @type_id="int", @value="1", @kind="scalar"> ]
# Mappings can also be accessed for just the value by accessing as a Hash directly
p node['one']
# prints:
#   #<YAML::YamlNode:0x821fcd8 @type_id="int", @value="1", @kind="scalar">
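For readers on a Ruby that ships Psych instead of YAML.rb, the same kind of key lookup can be sketched as below. This is an assumption-laden illustration: Psych mapping nodes store keys and values as a flat, alternating list of children, and value_node_for is a hypothetical helper written for this example, not part of either library.

```ruby
require 'yaml'

# Hypothetical helper (not part of YAML.rb or Psych): return the value node
# stored under a scalar key in a mapping node, or nil if the key is absent.
def value_node_for(mapping, key)
  # Psych keeps mapping children as [key1, value1, key2, value2, ...].
  mapping.children.each_slice(2) do |key_node, value_node|
    return value_node if key_node.is_a?(Psych::Nodes::Scalar) && key_node.value == key
  end
  nil
end

node = YAML.parse("one: 1\ntwo: 2\n").root
one = value_node_for(node, 'one')

puts one.value.inspect    # "1"  (raw text held by the node)
puts one.to_ruby.inspect  # 1    (after loading into a native type)
```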
Traversing a tree of YamlNodes can be painstaking in comparison to having the native types around. YPath statements are a much quicker means of querying for the data you need. YPath queries also give you a way to build new sets of YamlNodes for transformation.
The YamlNode#select method can be used to retrieve a sequence of matching nodes. The YamlNode#transform method can be applied to a YamlNode to complete the loading of a node into a native Ruby type.
players = YAML::parse( <<EOY )
player:
  - given: Sammy
    family: Sosa
  - given: Ken
    family: Griffey
  - given: Mark
    family: McGwire
EOY
given = players.select( "/player/*/given" )
p given.transform
# prints:
# ["Sammy", "Ken", "Mark"]
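Psych, the YAML library in current Ruby releases, has no YPath support, so a query like the one above has to be emulated there. The following is a rough sketch under that assumption. The select_path helper is hypothetical and handles only a small subset of YPath: literal key segments and the "*" wildcard.

```ruby
require 'yaml'

# Hypothetical helper: emulate "/key" and "/*" YPath segments on Psych nodes.
# Returns an array of every node matched by the path.
def select_path(root, path)
  segments = path.split('/').reject(&:empty?)
  segments.reduce([root]) do |nodes, segment|
    nodes.flat_map do |node|
      case node
      when Psych::Nodes::Mapping
        # Children alternate key, value; keep values whose key matches.
        node.children.each_slice(2)
            .select { |key, _| segment == '*' || (key.is_a?(Psych::Nodes::Scalar) && key.value == segment) }
            .map { |_, value| value }
      when Psych::Nodes::Sequence
        # Only the wildcard descends into sequence entries in this sketch.
        segment == '*' ? node.children : []
      else
        []
      end
    end
  end
end

players = YAML.parse(<<~EOY).root
  player:
    - given: Sammy
      family: Sosa
    - given: Ken
      family: Griffey
    - given: Mark
      family: McGwire
EOY

given = select_path(players, '/player/*/given')
p given.map(&:to_ruby)   # ["Sammy", "Ken", "Mark"]
```

Here to_ruby plays the role that transform plays for a YamlNode: it finishes loading the selected nodes into native Ruby values.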
The YAML::parse_documents method works like the YAML::load_documents method, except that the iterator passes a YamlNode for each document to the block instead of a native Ruby object. YPath expressions, schema validations, and transformations can all be applied to this YamlNode, as described above.
require 'yaml'
log = File.open( "/var/log/apache.yaml" )
yp = YAML::parse_documents( log ) { |tree|
  at = tree.select('/at')[0].value
  type = tree.select('/type')[0].value
  puts "#{at} #{type}"
}
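A comparable loop can be sketched with Psych, assuming a Ruby where YAML.rb is not available. There, YAML.parse_stream yields one Psych::Nodes::Document per document in the stream. The log entries below are made-up sample data standing in for the log file, and the key lookup is done by hand because Psych offers no YPath.

```ruby
require 'yaml'

# Made-up sample data: two documents in one stream.
log = <<~EOY
  ---
  at: 2001-08-12 09:25:00
  type: GET
  ---
  at: 2001-08-12 09:25:10
  type: POST
EOY

lines = []
YAML.parse_stream(log) do |document|
  # Pair up the mapping's alternating key and value nodes, keeping raw text.
  fields = document.root.children.each_slice(2).map { |k, v| [k.value, v.value] }.to_h
  lines << "#{fields['at']} #{fields['type']}"
end

puts lines
# prints:
#   2001-08-12 09:25:00 GET
#   2001-08-12 09:25:10 POST
```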