# UTILS

`FoX_utils` is a collection of general utility functions that the rest of FoX depends on, but which may be of independent use. They are documented here.

All functions are accessible from the `FoX_utils` module. 

NB Unlike the APIs of WXML, WCML, and SAX, the UTILS APIs may not remain constant between FoX versions. While some effort will be expended to ensure they don't change unnecessarily, no guarantees are made.

For any end-users interested in the code who are worried about interface changes, it is recommended that the relevant code (all found in the `utils/` directory) be lifted directly and imported into other projects, rather than accessed through the FoX interfaces.

Two sets of utility functions are provided: one concerned with [UUID](#UUID)s, and one concerned with [URI](#URI)s.

<a name="UUID"/>

## UUID

UUIDs (see [RFC 4122](http://tools.ietf.org/html/rfc4122)) are Universally Unique IDentifiers. A UUID is a 128-bit number, conventionally written as a 36-character string (32 hexadecimal digits separated by four hyphens). For example:

     f81d4fae-7dec-11d0-a765-00a0c91e6bf6

The intention of UUIDs is to enable distributed systems to uniquely identify information without significant central coordination. Thus, anyone can create a UUID and use it to identify something with reasonable confidence that the identifier will never be unintentionally used by anyone for anything else.

This property also makes them useful as Uniform Resource Names, to refer to a given document without requiring a position in a particular URI scheme. Thus the above UUID could be referred to as:

    urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6
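
The relationship between the 128-bit value, the 36-character string, and the URN form can be checked with a short sketch. It uses Python's standard `uuid` module purely as an illustration; it is not part of FoX, and the variable names are chosen for this example only.

```python
import uuid

# Parse the example UUID quoted above (Python stand-in, not FoX code).
u = uuid.UUID("f81d4fae-7dec-11d0-a765-00a0c91e6bf6")

text = str(u)            # canonical textual form
bits = len(u.bytes) * 8  # size of the underlying number in bits
urn = u.urn              # the same identifier expressed as a URN

print(len(text))  # 32 hex digits plus 4 hyphens
print(bits)
print(urn)
```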

UUIDs are used by WCML to ensure that every document generated has a unique ID. This enables users to go back later on and have confidence that they are examining the same document, regardless of where it might have ended up in file-system hierarchies or databases.

In addition, UUIDs come in several flavours, one of which stores the time of creation at 100-nanosecond resolution. This timestamp can later be extracted (see, for example, [this service](http://www.famkruithof.net/uuid/uuidgen?typeReq=-1)) to verify the creation time.
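
As a rough illustration of how such a timestamp can be recovered, the sketch below decodes a time-based UUID using Python's standard library rather than FoX. RFC 4122 counts 100-nanosecond intervals from 15 October 1582; the helper name `uuid1_creation_time` is hypothetical, and the result is truncated to microseconds because that is the finest resolution Python's `datetime` offers.

```python
import uuid
from datetime import datetime, timedelta, timezone

# RFC 4122 timestamps count 100-ns intervals since the Gregorian reform date.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)

def uuid1_creation_time(text: str) -> datetime:
    """Return the creation time embedded in a version 1 UUID (hypothetical helper)."""
    u = uuid.UUID(text)
    if u.version != 1:
        raise ValueError("not a version 1 (time-based) UUID")
    # u.time is in units of 100 ns; integer-divide by 10 to get microseconds.
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)

created = uuid1_creation_time("f81d4fae-7dec-11d0-a765-00a0c91e6bf6")
print(created.isoformat())
```

For the example UUID used throughout this section, the decoded date falls in February 1997.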

This may well be useful for other XML document types, or indeed in non-XML applications. Thus, UUIDs may be generated by the following function, with one optional argument.

* `generate_UUID`  
**version**: *integer*

This function returns a 36-character string containing the UUID.

**version** identifies the version of UUID to be used (see section 4.1.3 of the RFC). Only versions 0, 1, and 4 are supported. Version 0 generates a nil UUID; version 1 a time-based UUID, and version 4 a pseudo-randomly-generated UUID.

Version 1 is the default, and is recommended.
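
To make the three supported versions concrete, here is a minimal Python sketch that mirrors the documented interface using the standard `uuid` module. It is a stand-in, not FoX code: the name `generate_uuid` and the choice to raise an error for unsupported versions are assumptions of this example, and Python draws its random bits from the operating system rather than from the generator FoX uses.

```python
import uuid

def generate_uuid(version: int = 1) -> str:
    """Return a 36-character UUID string (Python stand-in for illustration only)."""
    if version == 0:
        return str(uuid.UUID(int=0))  # nil UUID: all 128 bits zero
    if version == 1:
        return str(uuid.uuid1())      # time-based
    if version == 4:
        return str(uuid.uuid4())      # randomly generated
    # Assumption of this sketch: unsupported versions are rejected outright.
    raise ValueError("only versions 0, 1 and 4 are supported")

nil_id = generate_uuid(0)
default_id = generate_uuid()   # version 1, matching the documented default
random_id = generate_uuid(4)
print(nil_id)
print(len(default_id), len(random_id))
```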

(Note: all pseudo-random numbers are generated using the high-quality Mersenne Twister algorithm, using the Fortran implementation by [Scott Robert Ladd](http://www.coyotegulch.com).)

<a name="URI"/>

## URI

URIs (see [RFC 2396](http://tools.ietf.org/html/rfc2396)) are Uniform Resource Identifiers. A URI is a string, made up of several components, which identifies a resource. Very often, this resource is a file, and the URI represents the local or network path to this file.

For example:

    http://www.uszla.me.uk/FoX/DoX/index.html

is a URI pointing to the FoX documentation.

Equally, however:

    FoX/configure

is a URI reference pointing to the FoX configure script (relative to the current directory, or `base URI`).

A string which is a URI reference contains several components, some of which are optional.

* `scheme` - eg, `http`
* `authority` - eg, `www.uszla.me.uk`
* `path` - eg, `/FoX/DoX/index.html`

In addition, a URI reference may contain `userinfo`, `host`, `port`, `query`, and `fragment` information (see the [RFC](http://tools.ietf.org/html/rfc2396) for full details).
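
The decomposition of the two example references above can be sketched with Python's standard `urllib.parse` module. This is only an illustration of the component model, not the FoX parser; note that Python calls the authority component `netloc`.

```python
from urllib.parse import urlsplit

# Absolute URI from the example above.
absolute = urlsplit("http://www.uszla.me.uk/FoX/DoX/index.html")
print(absolute.scheme)  # scheme
print(absolute.netloc)  # authority
print(absolute.path)    # path

# Relative URI reference: only the path component is present.
relative = urlsplit("FoX/configure")
print(repr(relative.scheme), repr(relative.netloc), relative.path)
```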

The FoX URI library provides the following features:

* `type(URI)`
This is an opaque Fortran type which is used to hold URI information. The functions described below use this type.

* `parseURI`
This takes one argument, a URI reference, and returns a pointer to a newly-allocated URI object.

If the string provided is not a valid URI reference, then a null pointer is returned; thus this function can be used to check whether a URI is valid.

* `expressURI`
This takes one argument, a URI object, and returns the (fully-escaped) string representing that URI.

* `rebaseURI`
This takes two arguments, both URI objects, and returns a pointer to a third URI object. It calculates the location of the second URI with reference to the first.

Thus, if the first URI were `/FoX/DoX`, and the second `../DoX2/index.html`, then the resulting URI would be `/FoX/DoX2/index.html`.
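
Resolving one reference against another can be tried out with Python's `urllib.parse.urljoin`, which follows the RFC 3986 resolution rules. This is a sketch of the general technique, not of `rebaseURI` itself. Under those rules the final path segment of the base is discarded unless the base ends in a slash, so the base is written here with a trailing slash and an explicit host; how FoX treats a base without the trailing slash is not verified here.

```python
from urllib.parse import urljoin

# Base written as a directory (trailing slash), so "DoX" is kept during resolution.
base = "http://www.uszla.me.uk/FoX/DoX/"
resolved = urljoin(base, "../DoX2/index.html")
print(resolved)

# Without the trailing slash, "DoX" is treated as a file name and dropped first,
# so the ".." then climbs out of "FoX" as well.
without_slash = urljoin("http://www.uszla.me.uk/FoX/DoX", "../DoX2/index.html")
print(without_slash)
```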

* `destroyURI`
This takes one argument, a pointer to a URI object, and clears up all memory associated with it.

For each component a URI might have (`scheme`, `authority`, `userinfo`, `host`, `port`, `path`, `query`, `fragment`) there are two functions for extracting the component:

* `hasXXX` will return a logical variable according to whether the component is defined (except for `path`, which is always defined but may be empty, and so has no `hasPath` function).

* `getXXX` will return a string containing the value of the component (except for `port`, which is returned as an integer).

Thus, listing these functions in full:

* `hasScheme`
Is there a scheme associated with the URI?

* `getScheme`
Return the value of the scheme

* `hasAuthority`
Is there an authority associated with the URI?

* `getAuthority`
Return the value of the authority

* `hasUserinfo`
Is there userinfo associated with the URI?

* `getUserinfo`
Return the value of the userinfo

* `hasHost`
Is there a host associated with the URI?

* `getHost`
Return the value of the host

* `hasPort`
Is there a port associated with the URI?

* `getPort`
Return the value of the port

* `getPath`
Return the value of the path

* `hasQuery`
Is there a query associated with the URI?

* `getQuery`
Return the value of the query

* `hasFragment`
Is there a fragment associated with the URI?

* `getFragment`
Return the value of the fragment
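
The has/get pairing can be imitated in a few lines of Python on top of `urllib.parse`. The helpers `get_component` and `has_component` below are hypothetical names for this sketch, not FoX routines. One simplification to be aware of: Python cannot tell an empty query or fragment from a missing one, so this sketch treats both as absent.

```python
from urllib.parse import urlsplit

def get_component(uri: str, name: str):
    """Return the named component, or None if it is absent (hypothetical helper)."""
    parts = urlsplit(uri)
    if name == "port":
        return parts.port          # integer, or None when no port is given
    if name == "host":
        return parts.hostname      # lower-cased by Python, or None
    if name == "userinfo":
        userinfo, sep, _ = parts.netloc.rpartition("@")
        return userinfo if sep else None
    if name == "authority":
        return parts.netloc or None
    if name == "path":
        return parts.path          # always defined, possibly empty
    value = getattr(parts, name)   # scheme, query or fragment
    return value or None

def has_component(uri: str, name: str) -> bool:
    """True when the named component is present (hypothetical helper)."""
    return get_component(uri, name) is not None

example = "http://user@www.uszla.me.uk:8080/FoX/DoX/index.html?lang=en#top"
for name in ("scheme", "authority", "userinfo", "host", "port", "path", "query", "fragment"):
    print(name, has_component(example, name), get_component(example, name))
```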

