Loading YAML Documents

YAML.rb includes a stream parser, which can read YAML from strings, files, and any type of IO. You can use YAML::load to read single documents, YAML::load_stream to read several documents at once, and YAML::load_documents to iterate through documents in a stream.

Loading a single document

Often you will want to load a single document, representing a single object, into a Ruby variable. The YAML::load method is designed to do just that. It takes either a String or an IO object and returns the object represented by the first document in the stream.

readme = YAML::load( File.open( 'README' ) )
Ex. 37: YAML::load Example
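
As a small sketch of the two input types, the following loads the same text once from a String and once from an IO-like object. The helper name load_from_string_and_io and the sample text are made up for illustration, StringIO stands in for an open file, and the calls assume the yaml library bundled with Ruby.

```ruby
require 'yaml'
require 'stringio'

# Hypothetical helper: load the same text from a String and from an IO.
def load_from_string_and_io(yaml_text)
  from_string = YAML.load(yaml_text)                # String input
  from_io     = YAML.load(StringIO.new(yaml_text))  # IO-like input
  [from_string, from_io]
end

# Made-up sample document.
readme_text = "title: YAML.rb\nversion: '0.60'\n"
from_string, from_io = load_from_string_and_io(readme_text)
puts from_string['title']
puts from_string == from_io

# With several documents in the text, only the first one comes back.
puts YAML.load("--- first\n--- second\n")
```

Both calls should yield equal Hash objects, since the source of the text does not affect the loaded structure.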

YAML::load is a very convenient function, as you can manipulate the YAML structure as a Ruby type. It flexes YAML's strength as a data serialization language. While an Object's to_yaml method exports it to YAML, the YAML::load method imports the Object back.

o = [ 'array', 'of', 'items' ]
o2 = YAML::load( o.to_yaml )
# o2 and o should be equal
Ex. 29: YAML::load, the answer to Object#to_yaml

Loading many documents

A YAML stream can contain more than one document. Often, you won't want to load the entire stream into memory. Rather, you'll want to load one document at a time. In Ruby, we use the YAML::load_documents method to iterate through documents.

For example, suppose we have a web server's log file, which is made up of several YAML documents in a stream:

---
at: 2001-08-12 09:25:00.00 Z
type: GET
HTTP: '1.0'
url: '/index.html'
---
at: 2001-08-12 09:25:10.00 Z
type: GET
HTTP: '1.0'
url: '/toc.html'
Ex. 39: Stream containing a log file

If we wanted to loop through the documents in this file, printing a short summary of each request, we could use YAML::load_documents:

require 'yaml'
log = File.open( "/var/log/apache.yaml" )
yp = YAML::load_documents( log ) { |doc|
  puts "#{doc['at']} #{doc['type']} #{doc['url']}"
}
Ex. 40: Loading the log file with YAML::load_documents

Like YAML::load, YAML::load_documents is called with the IO object or String that you want to read from. You must also pass YAML::load_documents a Ruby block (proc) for handling each document. The block receives only one parameter: the current YAML document, loaded as a Ruby object. In the example above, we receive a Hash object for each document in the stream.
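
The yaml library bundled with more recent Ruby releases (Psych) does not appear to provide YAML::load_documents. There, a comparable loop can be sketched with YAML.load_stream and a block, which is a swapped-in call rather than the method described above. The helper name summarize_log is hypothetical, and the log text is a trimmed copy of the sample stream held in a StringIO instead of a file.

```ruby
require 'yaml'
require 'stringio'

# Hypothetical helper: build one summary line per document in the stream.
def summarize_log(io)
  summaries = []
  YAML.load_stream(io) do |doc|
    # Each document arrives as a Hash, one at a time.
    summaries << "#{doc['type']} #{doc['url']}"
  end
  summaries
end

log = StringIO.new(<<~LOG)
  ---
  type: GET
  HTTP: '1.0'
  url: '/index.html'
  ---
  type: GET
  HTTP: '1.0'
  url: '/toc.html'
LOG

puts summarize_log(log)
```

As with YAML::load_documents, the block sees one loaded object per document, so the caller never has to hold the whole stream as Ruby objects at once.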

YAML::load_documents is the most efficient way to load streaming data. This applies to TCP sockets as well. Client/server applications that communicate in YAML can pass the TCPSocket object directly to YAML::load_documents to parse a stream over TCP/IP.
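
A rough sketch of the socket case follows, under several assumptions: it uses YAML.load_stream with a block from the yaml library bundled with current Ruby in place of YAML::load_documents, the helper names read_yaml_documents and loopback_demo are hypothetical, and the "server" is a throwaway thread on the loopback interface that sends two documents and closes the connection. The parser may buffer its input, so documents are not necessarily yielded the moment they arrive.

```ruby
require 'socket'
require 'yaml'

# Hypothetical helper: collect every document read from an open socket.
def read_yaml_documents(socket)
  docs = []
  YAML.load_stream(socket) { |doc| docs << doc }
  docs
end

# Demo over the loopback interface: a throwaway server sends two
# documents and closes the connection, which ends the stream.
def loopback_demo
  server = TCPServer.new('127.0.0.1', 0)   # port 0: let the OS pick a free port
  port = server.addr[1]
  sender = Thread.new do
    conn = server.accept
    conn.write("---\ntype: GET\nurl: /index.html\n---\ntype: GET\nurl: /toc.html\n")
    conn.close
  end
  client = TCPSocket.new('127.0.0.1', port)
  docs = read_yaml_documents(client)
  client.close
  sender.join
  server.close
  docs
end

loopback_demo.each { |doc| puts "#{doc['type']} #{doc['url']}" }
```

The socket is handed to the loader as is; no intermediate String of the whole stream is built by the caller.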

Loading an entire stream

In some situations, you may choose to load an entire stream for modification and re-emission. The YAML::Stream object can hold many documents and provides a few functions that make it convenient to edit documents in the stream. To load an entire stream into a YAML::Stream object, use the YAML::load_stream method.

Like the other YAML load functions, YAML::load_stream requires an IO object or String as its parameter:

readme_doc = YAML::load_stream( File.open( 'README' ) )
puts readme_doc.documents[0]['title']
# prints:
#   YAML.rb
Ex. 38: YAML::load_stream Example
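
The load, modify, and re-emit cycle can be sketched as follows. This assumes the yaml library bundled with current Ruby, where YAML.load_stream seems to return a plain Array of documents rather than a YAML::Stream object, and it uses YAML.dump_stream to write the documents back out. The helper name retitle_stream and the sample stream are hypothetical.

```ruby
require 'yaml'

# Hypothetical helper: load every document, apply a change, re-emit the stream.
def retitle_stream(yaml_text, new_title)
  documents = YAML.load_stream(yaml_text)        # Array, one entry per document
  documents.each { |doc| doc['title'] = new_title if doc.is_a?(Hash) }
  YAML.dump_stream(*documents)                   # back to a multi-document String
end

stream = "---\ntitle: YAML.rb\n---\ntitle: Another\n"
puts retitle_stream(stream, 'Renamed')
```

Because every document is held in memory at once, this approach suits small streams that need editing; for large or open-ended streams, iterating one document at a time is the better fit.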