What Is a URI?

For many of the metadata language elements, you can specify a metadata resource by its name or identifier. Some of the language elements accept a Uniform Resource Identifier (URI).

In the legal context, a URI-based legislative identifier system should “create APIs for the underlying data” (see http://blog.law.cornell.edu/metasausage/2012/06/11/identifiers-part-3/). See the English model (URIs in Legislation.gov.uk) below.

In the past, AltLaw provided URIs under the id.altlaw.org domain. These URIs redirected to pages on the AltLaw site in either HTML or RDF. The RDF versions declared owl:sameAs relations to URIs at dbpedia.org.

Guidelines for assigning identifiers to metadata terms

The DCMI Abstract Model [DCMI-AM] requires that all terms (elements, element refinements, encoding schemes and controlled vocabulary terms) used in metadata application profiles that are compliant with the model be assigned a URI [RFC3986] that identifies the term. An XML namespace [XML-NAMES] is a collection of names, identified by a URI, that are used in XML documents as element types and attribute names. By convention, all DCMI recommended encodings [DCMI-ENCODINGS] use a concatenation of an XML namespace URI and the term name to provide a mechanism for encoding the term URI. The use of XML namespaces and URIs to uniquely identify metadata terms allows those terms to be unambiguously used across applications, promoting the possibility of shared semantics. As indicated in the DCMI Namespace Policy [DCMI-NAMESPACE], DCMI has adopted this mechanism for the identification of all DCMI terms.

This document provides some simple guidelines for assigning URIs to metadata terms in non-DCMI namespaces. This includes non-DCMI elements, element refinements, encoding schemes and controlled vocabulary terms.

Although these guidelines are mainly intended for metadata application profiles that conform with the DCMI Abstract Model, it is hoped that they are generic enough that they may be useful in the context of other metadata applications as well.


All metadata terms must be assigned a URI. The use of fragment identifiers in the URI used to identify metadata terms is optional and is left to the discretion of the implementor.

For the purposes of encoding, the term URI may be partitioned into an XML namespace URI and the term name. Note that, for convenience, it is commonly the case that XML namespace URIs end with either a ‘#’ (hash) or ‘/’ (slash) character.
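The partitioning described above can be sketched in a few lines of Python. This is only an illustration: the function name split_term_uri is hypothetical, and the rule it applies (split after the last ‘#’, otherwise after the last ‘/’) is the common convention mentioned above rather than a requirement of any specification.

```python
def split_term_uri(term_uri: str) -> tuple[str, str]:
    """Split a term URI into (XML namespace URI, term name).

    Assumption: the namespace ends at the last '#' if there is one,
    otherwise at the last '/'. This mirrors common practice only.
    """
    index = term_uri.rfind("#")
    if index == -1:
        index = term_uri.rfind("/")
    # Reject URIs with no separator or with nothing after the separator.
    if index == -1 or index == len(term_uri) - 1:
        raise ValueError(f"cannot partition {term_uri!r}")
    return term_uri[: index + 1], term_uri[index + 1 :]


print(split_term_uri("http://myproject.org/metadata/vocabs/color#Red"))
print(split_term_uri("http://purl.org/rdn/terms/dateReviewed"))
```

Concatenating the two parts always yields the original term URI, which is the property the DCMI encodings rely on.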

Groups of related terms (for example, all the terms within a controlled vocabulary) should be assigned URIs within the same XML namespace.

All XML namespace and term URIs should resolve to human and/or machine-readable descriptions of the namespace or term.

Any valid URI [RFC3986] may be used to identify a metadata term. However, the use of a registered URI scheme is recommended [URI-SCHEMES].

All XML namespace and term URIs should be assigned with the intention of them being unique and persistent. This means that the URI must not be used to identify anything else and that it should be expected to last as long as the Internet.

Strategies for assigning URIs

Four simple strategies for assigning URIs to metadata terms are described below.

Using service or project URLs

Where a term is created within the context of a particular project, service or other initiative, the use of a project or service-specific URL may be appropriate. This is probably the simplest strategy in terms of ease of assignment and resolution. However, it is also the most prone to lack of persistence.

Example 1: http://myservice.org/terms/price
An existing service is delivered using the myservice.org DNS domain name. The service creates a new property called price for use in its metadata application profile. The service defines an XML namespace URI within its existing URL space (http://myservice.org/terms/) and therefore assigns the term the following URI: http://myservice.org/terms/price.
Example 2: http://myproject.org/metadata/vocabs/color#Red
A project Web-site is delivered using the myproject.org DNS domain name. The project team build up a new controlled vocabulary of colors for use within their metadata application profile. They define an XML namespace URI within their existing URL space (http://myproject.org/metadata/vocabs/color#). For the vocabulary term Red, the term URI is therefore http://myproject.org/metadata/vocabs/color#Red

Notice that example 1 defines a metadata property while example 2 defines a term within a controlled vocabulary. Remember that in example 2 it will probably also be necessary to define an encoding scheme name for the vocabulary itself, for example http://myproject.org/metadata/terms/Color.

Using PURLs

A similar approach, but one that is likely to offer more persistent URIs, is to use PURLs [PURL]. A PURL is a Persistent Uniform Resource Locator. Functionally, a PURL is a URL. However, instead of pointing directly to the location of an Internet resource, a PURL points to an intermediate resolution service. This provides a level of resilience against changes in project or service URLs. The use of PURLs to identify metadata terms has already been adopted by a number of metadata-related initiatives such as DCMI itself and RDF Site Summary (RSS) 1.0 [RSS10].

Example 1: http://purl.org/rss/1.0/link
RDF Site Summary is a lightweight multipurpose extensible metadata description and syndication format. The core metadata terms used by RSS are declared within an XML namespace (http://purl.org/rss/1.0/). For example, the property called link has been assigned the URI http://purl.org/rss/1.0/link. Other terms are declared within separate groupings, known in RSS as modules. Each module makes use of one or more separate XML namespaces.
Example 2: http://purl.org/rdn/terms/dateReviewed
The UK JISC-funded Resource Discovery Network has developed a small metadata application profile in order to describe the status of its catalogue records. One of the new terms in the application profile is called dateReviewed. All the new terms have been defined within an RDN XML namespace (http://purl.org/rdn/terms/). Therefore, the URI assigned to the dateReviewed property is http://purl.org/rdn/terms/dateReviewed.

Note that in example 1, the RSS implementors have chosen to embed a version number into the XML namespace URI. This allows them to use the same term name within a new XML namespace in future versions of the application profile. This has advantages in some scenarios. However, implementors should be cautious when using this technique because it may result in URIs being assigned to new terms that have the same semantics as existing terms.
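The indirection that makes PURLs resilient can be illustrated with a toy in-memory resolver. Everything in this sketch is an assumption made for illustration: the class PurlResolver, the example host names, and the choice of a 302 status are not taken from any PURL service documentation.

```python
class PurlResolver:
    """Toy stand-in for an intermediate resolution service."""

    def __init__(self) -> None:
        self._targets: dict[str, str] = {}

    def register(self, purl: str, target_url: str) -> None:
        """Create or update the current location behind a persistent URL."""
        self._targets[purl] = target_url

    def resolve(self, purl: str) -> tuple[int, str]:
        """Return an HTTP-style (status, Location) pair for a known PURL."""
        return 302, self._targets[purl]


resolver = PurlResolver()
purl = "http://purl.example.org/myproject/terms/price"  # hypothetical PURL
resolver.register(purl, "http://myproject.org/metadata/terms/price")
print(resolver.resolve(purl))

# The project moves; only the resolver entry changes, not the term URI.
resolver.register(purl, "http://newhost.example.org/terms/price")
print(resolver.resolve(purl))
```

The term URI cited in metadata records stays the same across the move, which is the persistence benefit described above.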

Using “info” URIs

The “info” URI scheme provides a “mechanism for assigning URIs to information assets that have identifiers in public namespaces” but that do not have an appropriate existing URI scheme [INFO-URI-SPEC] [INFO-REGISTRY]. The phrase ‘information assets’ includes all the metadata terms discussed here. Thus, it is appropriate to consider assigning “info” URIs to metadata terms.

Example 1: info:ddc/22/eng//004.678
The terms that make up the Dewey Decimal Classification [DEWEY] have been assigned “info” URIs such that info:ddc/22/eng// can be considered to be an XML namespace URI and “004.678” can be considered to be a Dewey term name. Thus the URI that has been assigned to that term is info:ddc/22/eng//004.678. Note that the information asset identified by this term is in the English-language Dewey Decimal Classifications (22nd Ed.) and is the classification “Internet”.

Note that, somewhat confusingly, the draft “info” URI specification uses different terminology from that used here. In the terminology of the specification, ddc is the “info URI namespace component” and 22/eng//004.678 is the “info URI identifier component”.
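The two-part reading of an “info” URI given above can be sketched as follows. The function name split_info_uri is hypothetical, and the sketch performs no validation beyond checking the scheme and the presence of the first ‘/’.

```python
def split_info_uri(uri: str) -> tuple[str, str]:
    """Split an "info" URI into (namespace component, identifier component)."""
    scheme, _, rest = uri.partition(":")
    if scheme.lower() != "info" or "/" not in rest:
        raise ValueError(f"not a recognisable info URI: {uri!r}")
    # Everything before the first '/' is the namespace component;
    # the remainder is kept verbatim as the identifier component.
    namespace, _, identifier = rest.partition("/")
    return namespace, identifier


print(split_info_uri("info:ddc/22/eng//004.678"))
```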

Note also that “info” URIs cannot be resolved using current Web browsers (i.e. by using a simple HTTP GET request). Indeed, “info” URIs are designed to be non-dereferenceable – i.e. it is not possible to dereference an “info” URI in order to retrieve a representation of the identified resource. Unfortunately, this has serious consequences for their utility in identifying metadata terms. Since it is not possible to easily obtain a representation of the identified term (typically some metadata about the term), it is not possible to obtain any information about the relationships between the identified term and other terms. This means that the “info” URI is of limited use in the context of the Semantic Web, since it is not possible for software applications to reason automatically based on knowledge about the relationships between multiple metadata terms.

At the time of writing, “info” was not a registered URI scheme.

Using xmlns.com

xmlns.com provides a network space for simple Web namespace management. “The rationale for registering xmlns.com was to secure a short, memorable domain suitable for naming concepts for use in RDF and XML vocabularies” [XMLNS]. The FOAF vocabulary [FOAF] uses xmlns.com to provide an XML namespace URI for its terms.

Example 1: http://xmlns.com/foaf/0.1/firstName
The firstName term within the FOAF vocabulary uses the http://xmlns.com/foaf/0.1/ XML namespace URI and has been assigned the URI http://xmlns.com/foaf/0.1/firstName.

Note that, at the time of writing, the status and ownership of the xmlns.com domain was slightly unclear and it is therefore not possible to be sure of the long term persistence of URIs based on this domain.

Conclusions of this part

All terms used in metadata application profiles must be assigned a URI before they can be used in the encoding syntaxes recommended by DCMI. It is recommended that implementors assign URIs to terms following the guidelines provided here. Of the four strategies for assigning URIs to terms listed in this document, the use of PURLs is recommended for the identification of all metadata terms.

Source: http://www.ukoln.ac.uk/metadata/dcmi/term-identifier-guidelines/

The use of Metadata in URIs

Web-based software uses URIs to identify resources. The authority who assigns a URI is responsible for assuring that it is associated with the intended resource, and that operations targeted to the URI manipulate or return the appropriate data. Many URI schemes offer a flexible structure that can also be used to carry additional information, called metadata, about the resource. Such metadata might include the title of a document, the creation date of the resource, the MIME media type that is likely to be returned by an HTTP GET, a digital signature usable to verify the integrity or authorship of the resource content, or hints about URI assignment policies that would allow one to guess the URIs for related resources.

This finding addresses several questions regarding such metadata in URIs:

What information about a resource can or should be embedded in its URI?

What metadata can be reliably determined from a URI, and in what circumstances is it appropriate to rely on the correctness of such information?

In what circumstances is it appropriate to use information from a URI as a hint as to the nature of a resource or its representations?

The first question is primarily of concern to URI assignment authorities, who must choose a suitable URI for each resource they control. The other questions are focused on people and software making use of URIs, whether at the resource authority or elsewhere. Of course, the questions are related insofar as one reason for an authority to encode metadata is for the benefit of resource users.

The TAG has earlier published a finding Authoritative Metadata [AUTHMETA], which explains how to determine correct metadata in cases where conflicting information has been provided. This finding is concerned with just one possible means of determining resource metadata, i.e. from the URI itself. The TAG publication [AWWW] discusses related issues under the heading of URI Opacity; this finding provides additional detail and guidance on the encoding of metadata into URIs, and on when it is or isn’t appropriate to attempt to infer metadata from a URI.

Encoding and using metadata in URIs

This section uses simple examples to illustrate some issues that arise when encoding metadata in URIs, or when relying on information gleaned from such URIs. Good Practice Notes are provided to explain how to use the Web effectively, and Constraints are given where necessary for using the Web correctly. As these examples show, encoding or not encoding metadata in a URI or deciding whether to rely on such metadata is often a tradeoff, involving some benefits and some costs. In such cases, choices should be made that best meet the needs of particular resource providers and users.

Reliability of URI metadata

Consider Martin, who is using a Web-based bug tracking system to investigate some software problems. He sees a bug report which says:
“See http://example.org/bugdata/brokenfile.xml for an example of XML that is not well-formed.”

The bug tracking system is built to show examples just as they are entered into the system, so for http://example.org/bugdata/brokenfile.xml it returns a stream of (poorly formed) XML with Content-Type text/plain. That Content-Type should cause a properly configured browser to show Martin the erroneous text just as it was recorded.


Unfortunately, Martin uses a browser that incorrectly attempts to infer the format of the returned data from the URI suffix. Keying on the “.xml” in the URI, it launches an XML renderer for what should have been plain text. When Martin attempts to view the faulty file, he sees instead a browser error saying that the erroneous XML could not be displayed.


Constraint: Web software MUST NOT depend on the correctness of metadata inferred from a URI, except when the encoding of such metadata is documented by applicable standards and specifications.

Such standards and specifications include pertinent Web and Internet RFCs and Recommendations such as [URI], as well as documentation provided by the URI assignment authority.

Martin’s browser is in error because its inference that the URI suffix provides file type metadata is not provided for by normative Web specifications or, we may assume, in documentation from the assignment authority. A correctly written browser would have shown the faulty XML as text, or might conceivably have shown a warning about the apparent mismatch between the type inferred from the URI and the returned Content-Type. (Martin’s browser is also ignoring TAG finding “Authoritative Metadata” [AUTHMETA], which mandates that the Content-Type HTTP header takes precedence even if type information had somehow been reliably encoded in the URI.)
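As a rough illustration of the behaviour expected of a correct client, the sketch below takes the media type from the Content-Type header and deliberately ignores the URI suffix. The function name effective_media_type is hypothetical; the header parsing uses Python's standard email.message module, which is one convenient option among several.

```python
from email.message import Message


def effective_media_type(uri: str, content_type_header: str) -> str:
    """Return the media type declared by the server.

    The uri argument is accepted but deliberately unused: a suffix such as
    ".xml" is not treated as type information in this sketch.
    """
    message = Message()
    message["Content-Type"] = content_type_header
    return message.get_content_type()


print(effective_media_type(
    "http://example.org/bugdata/brokenfile.xml",
    "text/plain; charset=utf-8",
))
```

With this approach, Martin's example is displayed as plain text even though the URI ends in “.xml”.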

Note that the constraint refers to conclusions drawn by software, which must be trustworthy, as opposed to guesses made by people. As discussed in 2.2 Guessing information from a URI, guessing is something that people using the Web do quite often and for good reason. Software tends to be long lived and widely distributed. Thus software dependencies on undocumented URI metadata result not only in buggy systems, but in inappropriate expectations that authorities will constrain their URI assignment policies and representation types to match dependencies in the clients. For both of these reasons, the constraint above requires that software must not have such dependencies.

There is certain metadata that Martin or his browser can reliably determine from the URI. For example, the URI conveys that the http scheme has been used, and that attempts to access the resource should be directed to the IP address returned from the DNS resolution of the string “example.org”. These conclusions are supported by normative specifications such as [URI] and [HTTP].
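The parts of a URI that can be read reliably, because their syntax is defined by the generic URI specification, can be extracted with Python's standard urllib.parse module. The helper name reliable_parts is hypothetical.

```python
from urllib.parse import urlsplit


def reliable_parts(uri: str) -> tuple:
    """Return (scheme, host) as defined by generic URI syntax."""
    parts = urlsplit(uri)
    return parts.scheme, parts.hostname


scheme, host = reliable_parts("http://example.org/bugdata/brokenfile.xml")
print(scheme, host)
```

Nothing in this sketch looks at the path, since the path carries no meaning that the URI and HTTP specifications guarantee.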

Guessing information from a URI

Bob is walking down a street, and he sees an advertisement on the side of a bus:
“For the best Chicago Weather information on the Web, visit http://example.org/weather/Chicago.”

Bob goes home and types the URI into his browser, which does indeed display for him a Chicago weather forecast. Bob then realizes that he’ll be visiting Boston, and he guesses that a Boston weather page might be available at a similar URI: http://example.org/weather/Boston. He types that into his browser and reads the response that comes back.

Bob is using the original URI for more than its intended purpose, which is to identify the Chicago weather page. Instead, he’s inferring from it information about the structure of a Web site that, he guesses, might use a uniform naming convention for the weather in lots of cities. So, when Bob tries the Boston URI, he has to be prepared for the possibility that his guess will prove wrong: Web architecture does not guarantee that the retrieved page, if there is one, has the weather for Boston, or indeed that it contains any weather report at all. Even if it does, there is no assurance that it is current weather, that it is intended for reliable use by consumers, etc. Bob has seen an advertisement listing just the Chicago URI, and that is the only one that the URI authority has warranted will be a useful weather report.

Still, the ability to explore the Web informally and experimentally is very valuable, and Web users act on such guesses about URIs all the time. Many authorities facilitate such flexible use of the Web by assigning URIs in an orderly and predictable manner. Nonetheless, in the example above, Bob is responsible for determining whether the information returned is indeed what he needs.

HTML Forms, and Documenting Metadata Assignment Policies

Bob would not have had to guess the Boston weather URI if the authority had documented its URI assignment policy. Assignment authorities have no obligation to provide such documentation, but it can be a useful way of advertising in bulk the URIs for a collection of related resources. For example, an advertisement might read:
“For the best weather information for your city, visit http://example.org/weather/your-city-name-here.”

Reading that advertisement, Bob can reasonably assume that weather reports are available by substituting specific city names into the URI pattern http://example.org/weather/your-city-name-here. Moreover, the advertisement claims that the weather information obtainable at those URIs is “the best”, so Bob can assume that the weather reports are trustworthy and current.
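Once the pattern is documented, constructing such URIs in software becomes reasonable. The sketch below substitutes a city name into the advertised pattern; the function name weather_uri is hypothetical, and percent-encoding of spaces and other reserved characters is an assumption, since the advertisement does not say how multi-word city names are spelled.

```python
from urllib.parse import quote

WEATHER_PREFIX = "http://example.org/weather/"


def weather_uri(city: str) -> str:
    """Substitute a city name into the advertised URI pattern.

    Assumption: characters that are not safe in a path segment are
    percent-encoded; the authority may in fact use another convention.
    """
    return WEATHER_PREFIX + quote(city, safe="")


print(weather_uri("Boston"))
print(weather_uri("New York"))
```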

HTML forms [HTMLForms] and now XForms [XFORMS] each provide a means by which an authority can assert its support for a class of parameterized URIs, while simultaneously programming Web clients to prompt for the necessary parameters. For example, a Web site http://example.org/weatherfinder might offer a city lookup page containing the following HTML form fragment:

For what city would you like a weather report:

A browser receiving this form, or Bob if he views the source of the form, is assured that the assigning authority is supporting an entire class of URIs of the form:


The same HTML Form is also a computer program, executable by the browser, that prompts for and retrieves representations for all such URIs, and the English text in the form assures Bob that these are indeed for weather reports. Bob is not guessing the encoding of the URI or the nature of the resources referenced — he is acting on authoritative information provided by the assignor of the URIs. He can assume not just that he will get weather reports for certain cities, but that no URIs in the class correspond to anything other than weather reports (though some may correspond to no resource at all). Bob could, with this assurance, write his own software to construct and use such URIs to retrieve weather reports. Of course, the typical Web user would neither directly inspect the URIs nor write software to build them, but would instead type in city names and push the handy “Get the weather” button on his or her browser screen.
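What a browser does when it submits such a form with the GET method can be approximated as follows. The form markup is not reproduced in this text, so the action URI http://example.org/cityweather and the field name city are hypothetical placeholders, as is the helper form_get_uri.

```python
from urllib.parse import urlencode

# Hypothetical values standing in for the form's ACTION and field name.
FORM_ACTION = "http://example.org/cityweather"


def form_get_uri(action: str, fields: dict) -> str:
    """Build the URI a GET form submission would request."""
    # urlencode applies form encoding, so spaces become '+'.
    return action + "?" + urlencode(fields)


print(form_get_uri(FORM_ACTION, {"city": "Boston"}))
print(form_get_uri(FORM_ACTION, {"city": "New York"}))
```

Because the form itself comes from the assigning authority, software that builds URIs this way is acting on documented policy rather than guessing.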

Note that the example carefully specifies that the HTML form is sourced from the same authority as the individual weather URIs that the form queries. In fact, it is also common for the ACTION attributes in HTML forms to refer to URIs from other authorities. In such cases, it is the provider of the form rather than the assigning authority for the queried URIs who is responsible for the claims made in the form. In particular, users (and software) should check the origin of HTML forms before depending on the URI assignment patterns that they appear to imply. Of course, you can always use such a form to perform a query and see what comes back; what you can’t do is blame the assignment authority if the generated URIs either don’t resolve (status code 404) or return representations that don’t match the expectations established when reading the form (you got a football score instead of a weather report).
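A client that wants to check where a form's claims come from can compare the authority of the page that served the form with the authority of the form's ACTION URI. The sketch below is a simplification: the function name and the second host name are hypothetical, and it compares only the authority component, not the scheme.

```python
from urllib.parse import urljoin, urlsplit


def action_shares_authority(form_page_uri: str, action: str) -> bool:
    """Return True if the form's ACTION resolves to the page's own authority."""
    # Relative ACTION values are resolved against the page that served the form.
    resolved = urljoin(form_page_uri, action)
    page_authority = urlsplit(form_page_uri).netloc.lower()
    return page_authority == urlsplit(resolved).netloc.lower()


print(action_shares_authority("http://example.org/weatherfinder", "/cityweather"))
print(action_shares_authority("http://example.org/weatherfinder",
                              "http://other.example.net/lookup"))
```

When the check fails, the form can still be used for a query, but the URI pattern it implies is a claim made by the form's provider only.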

Authority use of URI metadata

In the examples in 2.3 HTML Forms, and Documenting Metadata Assignment Policies above, resource metadata (i.e. the city associated with each resource) was encoded into URIs primarily for the benefit of users such as Bob, or to facilitate use of the HTML Forms or XForms acting on those users’ behalf.

Often, metadata is encoded into a URI not primarily for the benefit of users, but to facilitate management of the resources themselves. For example, assume that the administrators at example.org have established a policy of assigning URIs based on the media types of representations: all GIF images are named with URIs ending in “.gif”, and all JPEG images are named with URIs ending in “.jpeg”, and so on. Although 2.1 Reliability of URI metadata warned that users of a resource cannot rely on undocumented naming conventions to determine media types and other information about a resource, the owner of a resource controls such naming and can depend on it. Example.org may therefore rely on their policy in an Apache Web Server .htaccess file, which causes the correct media type to be served automatically for each resource:
<FilesMatch "\.gif$">
    ForceType image/gif
</FilesMatch>
<FilesMatch "\.jpeg$">
    ForceType image/jpeg
</FilesMatch>

Even if it does not document this policy publicly, example.org’s own Web servers can safely depend on it.

Good Practice

Good Practice: URI assignment authorities and the Web servers deployed for them may benefit from an orderly mapping from resource metadata into URIs.

In addition to filename-based conventions, authorities may choose to base URIs on database keys, customer identifiers, or other information that makes it easy to associate a URI with information pertinent to the corresponding resource. Such encodings are both useful and common on the Web, but there can also be drawbacks to including such information in URIs. Some of those problems are discussed in the three sections immediately below.

2.5 URIs that are convenient for people to use

URIs optimized for use by the assignment authority may sometimes be inconvenient for resource users. Consider Mary who is walking down the street, and who sees the same weather advertisement as Bob:
“For the best Chicago Weather information on the Web, visit http://example.org/weather/Chicago.”

Like Bob, Mary is pleased to learn about a valuable Web site, and she finds that the URI itself is quite easy both to remember and to type into her browser. This is because, in addition to the required scheme and authority components, the URI is based on the word weather and the city name Chicago, both of which fit her expectations for this resource.

The next day, Mary sees another advertisement reading:
“For the best Atlanta Weather information on the Web, visit http://example.org/123Hx67v4gZ5234Bq5rZ.”

Mary is annoyed, because the URI is both difficult to remember and hard to transcribe accurately. She guesses that the authority has assigned this URI for its own convenience (see 2.4 Authority use of URI metadata) rather than for hers. Although Web architecture does not require that URIs be easy to understand or suggestive of the resource named, it’s handy if those intended for direct use by people are.

Good Practice

Good Practice: URIs intended for direct use by people should be easy to understand, and should be suggestive of the resource actually named.

Note that the second URI might be based on a database key that facilitates efficient access to the weather data at the server (see 2.4 Authority use of URI metadata); such a URI might have been a good choice if it were intended only for use in HTML hyperlinks, rather than in an advertisement on the side of a bus.

2.6 Changing metadata

URIs should generally not encode metadata that will change, regardless of whether the encoding policy is established to benefit URI assignment authorities, resource users, or both. Consider a Web site that organizes document URIs according to the documents’ lead author or editor. Thus, the documents:


are named for their editor, Bob Smith. Bob retires, and Mary Jones takes over as editor for document1. If the URI is changed to encode her name, then existing links break, but if the URI is not changed, the naming policy is violated. By encoding into the URI metadata that will change, the authority has put itself in a difficult position.

Good Practice

Good Practice: Resource metadata that will change SHOULD NOT be encoded in a URI.

Indeed, RDF statements about the resource, headers returned with representations (e.g. Content-Type) or metadata embedded in the representations themselves (e.g. HTML meta tags) are all better alternatives for conveying such volatile metadata about the resource.

2.7 Hiding metadata for security reasons

A bank establishes a URI assignment policy in which account numbers are encoded directly in the URI. For example, the URI http://example.org/customeraccounts/456123 accesses information for account number 456123. A malicious worker at an Internet Service Provider notices these URIs in his traffic logs, and determines the bank account numbers for his Internet customers. Furthermore, if access controls are not properly in place, he might be able to guess the URIs for other accounts, and to attempt to access them.

Good Practice

Good Practice: URI assignment authorities SHOULD NOT put into URIs metadata that is to be kept confidential.
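One way an authority might follow this practice is to place an opaque random token in the URI and keep the mapping to the account number on the server. The sketch below is illustrative only: the class AccountUriRegistry is hypothetical, the in-memory dictionaries stand in for a real data store, and opaque URIs do not replace proper access controls.

```python
import secrets


class AccountUriRegistry:
    """Map confidential account numbers to opaque, stable URIs."""

    def __init__(self, prefix: str) -> None:
        self._prefix = prefix
        self._token_by_account: dict[str, str] = {}
        self._account_by_token: dict[str, str] = {}

    def uri_for(self, account_number: str) -> str:
        """Return the URI for an account, minting a token on first use."""
        token = self._token_by_account.get(account_number)
        if token is None:
            token = secrets.token_urlsafe(16)  # random, not derived from the account
            self._token_by_account[account_number] = token
            self._account_by_token[token] = account_number
        return self._prefix + token

    def account_for(self, uri: str) -> str:
        """Look up the account behind a URI (server side only)."""
        return self._account_by_token[uri[len(self._prefix):]]


registry = AccountUriRegistry("http://example.org/customeraccounts/")
uri = registry.uri_for("456123")
print(uri)
print(registry.account_for(uri))
```

An observer of traffic logs sees only the token, and neighbouring account numbers cannot be turned into URIs by guessing.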

Confusing or malicious metadata

Although a URI suffix such as .jpeg or .exe plays no role in establishing the media type of a Web resource, such suffixes are often significant in operating system filenames. This inconsistency can be confusing to users, and may in some cases be exploited by malicious Web sites to cause harm. Consider Ed, who browses to an HTML page that includes an image of his favorite movie star. Underneath the picture is a suggestion that Ed “Right click on the picture and select ‘Save as’ to save a copy of this picture on your local disk”. The HTML sent to the browser is:

Right click on the picture and select ‘Save as’ to save
a copy of this picture on your local disk.

Unfortunately, the Web site is attempting to trick Ed’s browser into saving the retrieved data not as an image file, but as an executable. Specifically, the site is gambling on the possibility that his browser will preserve the .exe extension when saving the file, and that such an extension will cause his operating system to treat the file as executable.

When saving information from the Web, browsers must preserve to the extent practical the authoritative typing information provided with the representation. As discussed in 2.1 Reliability of URI metadata, the Content-Type is the authoritative source of type information in this example. If the local operating system considers filename extensions to be significant, then either .jpeg or .jpg is more likely to be the appropriate choice for a resource of media type image/jpeg, regardless of what suffix may appear in the URI.

Good Practice

Good Practice: When saving to filesystems that use extensions to represent media types, user agents MUST choose an extension that is consistent with the media type of the representation.

Indeed, many modern browsers suggest a name such as moviestar.exe.jpeg when saving the example file above. Nonetheless, it is inappropriate for Web sites to intentionally mislead users. Although naming an image/jpeg file with a URI ending in .exe is not prohibited by Web architecture, doing so with the intention to deceive users or to compromise their systems is of course not acceptable.
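The naming behaviour described above can be sketched as follows. The extension table, the function name save_name and the example URI http://example.org/pictures/moviestar.exe are all assumptions made for illustration; a real user agent would consult a much fuller media type registry and would need a policy for types it does not know.

```python
import posixpath
from urllib.parse import unquote, urlsplit

# Small illustrative table; the first extension listed is used when appending.
EXTENSIONS_BY_MEDIA_TYPE = {
    "image/jpeg": (".jpeg", ".jpg"),
    "image/gif": (".gif",),
    "text/plain": (".txt",),
}


def save_name(uri: str, media_type: str) -> str:
    """Suggest a local filename whose extension matches the media type.

    Raises KeyError for media types missing from the table above.
    """
    name = posixpath.basename(unquote(urlsplit(uri).path)) or "download"
    allowed = EXTENSIONS_BY_MEDIA_TYPE[media_type.lower()]
    if name.lower().endswith(allowed):
        return name
    # Keep the original name visible but make the final extension truthful.
    return name + allowed[0]


print(save_name("http://example.org/pictures/moviestar.exe", "image/jpeg"))
```

The media type passed in is assumed to come from the Content-Type header, never from the URI suffix.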

Note that the example above is contrived in at least one respect: to achieve its malicious goals, the Web site must serve a file that displays as an image in the browser, but that also runs as an executable after being saved to the local filesystem. Whether this is possible in practice is likely to depend on the exact image format and operating system involved. A slightly more complex approach to achieving a similar deception involves sending Ed an image that serves as a link to a separate executable. For example:

Click on the picture to see a larger copy of the picture.

The executable served for malicious.exe may in fact render a larger image of the movie star, but it could also be programmed to damage Ed’s computer. For the reasons described above, the correct way for Ed’s browser to determine the type of the linked representation, which in this case may indeed be executable, is from the media-type. Of course, a well-written agent will warn users before executing any code retrieved from the Web, regardless of whether the determination of its type was made in the appropriate manner or inappropriately from the URI suffix.

3 Conclusions

The principal conclusions of this finding are:

It is legitimate for assignment authorities to encode static identifying properties of a resource, e.g. author, version, or creation date, within the URIs they assign. This may contribute to the unique assignment of URIs. It may also contribute to the use of efficient mechanisms for dereferencing resources within origin servers e.g. use of database keys within URIs.

Assignment authorities may publish specifications detailing the structure and semantics of the URIs they assign. Other users of those URIs may use such specifications to infer information about resources identified by URIs assigned by that authority.

The ability to explore and experiment is important to Web users. Users therefore benefit from the ability to infer either the nature of the named resource, or the likely URI of other resources, from inspection of a URI. Such inferences are reliable only when supported by normative specifications or by documentation from the assignment authorities. In other cases, users should be aware that their inferences may be incorrect and the effect could be malicious.

People and software using URIs assigned outside of their own authority should make as few inferences as possible about a resource based on its URI. The more dependencies a piece of software has on particular constraints and inferences, the more fragile it becomes to change and the lower its generic utility.

Source: http://www.w3.org/2001/tag/doc/metaDataInURI-31.html

Cool URIs

The Semantic Web is envisioned as a decentralised world-wide information space for sharing machine-readable data with a minimum of integration costs. Its two core challenges are the distributed modelling of the world with a shared data model, and the infrastructure where data and schemas can be published, found and used. Users benefit from getting information “raw and now” [Give] and in portable data formats [DP]. Providers often publish data embedded in a fixed user interface, in HTML. A basic question is thus how to publish information about resources in a way that allows interested users and software applications to find and interpret them.

On the Semantic Web, all information has to be expressed as statements about resources, like the members of the company Example.com are Alice and Bob or Bob’s telephone number is “+1 555 262” or this Web page was created by Alice. Resources are identified by Uniform Resource Identifiers (URIs) [RFC3986]. This modelling approach is at the heart of Resource Description Framework (RDF) [RDFPrimer]. A nice introduction is given in the N3 primer [N3Primer].
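The idea of statements about resources can be made concrete with plain subject, predicate, object tuples. This is a sketch of the data model only, not an RDF library: the namespace http://example.org/sketch/ and every identifier built from it are hypothetical placeholders, chosen without regard to the question of which URIs are appropriate for such resources.

```python
EX = "http://example.org/sketch/"  # hypothetical namespace for this sketch only

COMPANY = EX + "company"
ALICE = EX + "alice"
BOB = EX + "bob"
PAGE = EX + "page"

# Each statement is a (subject, predicate, object) triple.
statements = [
    (COMPANY, EX + "hasMember", ALICE),
    (COMPANY, EX + "hasMember", BOB),
    (BOB, EX + "telephone", "+1 555 262"),  # a literal value, not a resource
    (PAGE, EX + "createdBy", ALICE),
]


def objects_of(subject: str, predicate: str) -> list:
    """Return the objects of all statements matching subject and predicate."""
    return [o for s, p, o in statements if s == subject and p == predicate]


print(objects_of(COMPANY, EX + "hasMember"))
print(objects_of(BOB, EX + "telephone"))
```

Because subjects and predicates are identified by URIs, anyone else can publish further triples about the same resources simply by reusing those identifiers.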

Using RDF, the statements can be published on the Web site of the company. Others can read the data and publish their own information, linking to existing resources. This forms a distributed model of the world. It allows the user to pick any application to view and work with the same data, for example to see Alice’s published address in your address book.

At the same time, Web documents have always been addressed with URIs (in common parlance often referred to as Uniform Resource Locators, URLs). This is useful because it means we can easily make RDF statements about Web pages, but also dangerous because we can easily mix up Web pages and the things, or resources, described on the page.

So the question is, what URIs should we use in RDF? As an example, to identify the frontpage of the Web site of Example Inc., we may use http://www.example.com/. But what URI identifies the company as an organisation, not a Web site? Do we have to serve any content—HTML pages, RDF files—at those URIs? In this document we will answer these questions according to relevant specifications. We explain how to use URIs for things that are not Web pages, such as people, products, places, ideas and concepts such as ontology classes. We give detailed examples as to how the Semantic Web can (and should) be realised as a part of the Web.

URIs for Web Documents

Let us begin with an example. Assume that Example Inc., a fictional company producing “Extreme Guitar Amplifiers”, has a Web site at http://www.example.com/. Part of the site is a white-pages service listing the names and contact details of the employees. Alice and Bob both work at Example Inc. The structure of the Web site might thus be:

the homepage of Example Inc.
the homepage of Alice
the homepage of Bob

Like everything on the traditional Web, each of the pages mentioned above is a Web document. Every Web document has its own URI. Note that a Web document is not the same as a file: a single Web document can be available in many different formats and languages, and a single file, for example a PHP script, may be responsible for generating a large number of Web documents with different URIs. A Web document is defined as something that has a URI and can return representations (responses in a format such as HTML or JPEG or RDF) of the identified resource in response to HTTP requests. In technical literature, such as Architecture of the World Wide Web, Volume One [AWWW], the term Information Resource is used instead of Web document.

On the traditional Web, URIs were used primarily for Web documents: to link to them and to access them in a browser. The notion of resource identity was not so important on the traditional Web; a URL simply identified whatever we see when we type it into a browser.

HTTP and Content Negotiation

Web clients and servers use the HTTP protocol [RFC2616] to request representations of Web documents and send back the responses. HTTP has a powerful mechanism for offering different formats and language versions of the same Web document known as content negotiation.

When a user agent (such as a browser) makes an HTTP request, it sends along some HTTP headers to indicate what data formats and language it prefers. The server then selects the best match from its file system or generates the desired content on demand, and sends it back to the client. For example, a browser could send this HTTP request to indicate that it wants an HTML or XHTML representation of http://www.example.com/people/alice in English or German:

GET /people/alice HTTP/1.1
Host: www.example.com
Accept: text/html, application/xhtml+xml
Accept-Language: en, de

The server could answer:

HTTP/1.1 200 OK
Content-Type: text/html
Content-Language: en
Content-Location: http://www.example.com/people/alice.en.html

followed by the content of the HTML document in English.

Here we see content negotiation [TAG-Alt] in action. The server interprets the Accept-Language header in the request and decides to return the English representation of the resource in question. Note that the URI of this representation is passed back in the Content-Location header; this is not required, but it is a recommended good practice (see [CHIPS], 7.2). Clients see that this URI is connected to the specific representation (in this case English), and search engines can refer to the different representations by using the different URIs. This implies that it is possible to have multiple representations of the same resource.
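The server-side choice described above can be sketched in Python as follows. This is only a minimal illustration, not how any particular server implements negotiation: the function name negotiate_language, the VARIANTS table and the German variant URL are assumptions made for the example, and q-values are ignored for brevity (languages are simply tried in the order the client lists them).

```python
# Hypothetical table of language variants for /people/alice.
# The ".de.html" URL is an assumption made for this sketch.
VARIANTS = {
    "en": "http://www.example.com/people/alice.en.html",
    "de": "http://www.example.com/people/alice.de.html",
}


def negotiate_language(accept_language, variants=VARIANTS, default="en"):
    """Return response headers for the first acceptable language variant."""
    for item in accept_language.split(","):
        # Drop any ";q=..." parameter and surrounding whitespace.
        lang = item.split(";")[0].strip().lower()
        if lang in variants:
            chosen = lang
            break
    else:
        # Nothing matched: fall back to the default language.
        chosen = default
    return {
        "Content-Type": "text/html",
        "Content-Language": chosen,
        "Content-Location": variants[chosen],
    }


headers = negotiate_language("en, de")
print(headers["Content-Language"], headers["Content-Location"])
```

A production server would also honour q-values and language ranges such as en-GB; the sketch only shows how the request header maps to the Content-Language and Content-Location response headers.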

Content negotiation is often implemented with a twist: instead of a direct answer, the server redirects to another URL where the appropriate representation is found:

HTTP/1.1 302 Found
Location: http://www.example.com/people/alice.en.html

The redirect is indicated by a special Status Code, here 302 Found. The client would now send another HTTP request to the new URL. By having separate URLs for different representations, this approach allows Web authors to link directly to a specific representation.

RDF/XML, the standard serialisation format of RDF, has its own content type, application/rdf+xml. Content negotiation thus allows publishers to serve HTML representations of a Web document to traditional Web browsers and RDF representations to Semantic Web-enabled user agents. This also allows servers to provide alternative RDF serialisation formats like Notation3 [N3] or TriX [TriX].

URIs for Real-World Objects

On the Semantic Web, URIs identify not just Web documents, but also real-world objects like people and cars, and even abstract ideas and non-existing things like a mythical unicorn. We call these real-world objects or things.

Given such a URI, how can we find out what it identifies? We need some way to answer this question, because otherwise it will be hard to achieve interoperability between independent information systems. We could imagine a service where we can look up a description of the identified resource, similar to today’s search engines. But such a single point of failure is against the Web’s decentralised nature.

Instead, we should use the Web itself—an extremely robust and scalable information publishing system—as a lookup service for resource descriptions. Whenever a URI is mentioned, we can look it up to retrieve a description containing relevant information and links to related data. This is so important that we make it our number one requirement for cool URIs:

1. Be on the Web.
Given only a URI, machines and people should be able to retrieve a description of the resource identified by the URI from the Web. Such a look-up mechanism is important to establish shared understanding of what a URI identifies. Machines should get RDF data and humans should get a readable representation, such as HTML. The standard Web transfer protocol, HTTP, should be used.

Let’s assume Example Inc. wants to publish contact data of their employees on the Semantic Web so their business partners can import it into their address books. For example, the published data would contain these statements about Alice, written here in N3 syntax [N3]:

<URI-of-alice> a foaf:Person;
foaf:name "Alice";
foaf:mbox <mailto:alice@example.com>;
foaf:homepage <http://www.example.com/people/alice> .

What URI should we use instead of the placeholder <URI-of-alice>? Certainly not http://www.example.com/people/alice, because that would confuse a person with a Web document, leading to misunderstandings: Is the homepage of Alice also named “Alice”? Can a homepage itself have an e-mail address? And does it make sense for a homepage to have itself as its homepage? So we need another URI. (For in-depth treatments of this issue, see What HTTP URIs Identify? [HTTP-URI2] and Four Uses of a URL: Name, Concept, Web Location and Document Instance [Booth]).

Therefore our second requirement:

2. Be unambiguous.
There should be no confusion between identifiers for Web documents and identifiers for other resources. URIs are meant to identify only one of them, so one URI can’t stand for both a Web document and a real-world object.

We note that our requirements seem to conflict with each other. If we can’t use URIs of documents to identify real-world objects, then how can we retrieve a description of a real-world object based on its URI? The challenge is to find a solution that allows us to find the describing documents if we have just the resource’s URI, using standard Web technologies.

The following picture shows the desired relationships between a resource and its representing documents:

A resource and its describing documents

Distinguishing between Representations and Descriptions

It is important to understand that using URIs, it is possible to identify both a thing (which may exist outside of the Web) and a Web document describing the thing. For example, the person Alice is described on her homepage. Bob may not like the look of the homepage, but may fancy the person Alice. So two URIs are needed, one for Alice and one for the homepage or an RDF document describing Alice. The question is where to draw the line between the case where either is possible and the case where only descriptions are available.

According to W3C guidelines ([AWWW], section 2.2.), we have a Web document (there called information resource) if all its essential characteristics can be conveyed in a message. Examples are a Web page, an image or a product catalog.

In HTTP, a 200 response code should be sent when a Web document has been accessed, but a different setup is needed when publishing URIs that are meant to identify entities which are not Web documents.

In the next section, solutions are described that allow you to mint URIs for things and also allow clients to get a description of the thing using standard Web technologies.

Two Solutions

There are two solutions that meet our requirements for identifying real-world objects: 303 URIs and hash URIs. Which one to use depends on the situation; both have advantages and disadvantages.

The solutions described in the following apply to deployment scenarios in which the RDF data and the HTML data are served separately, such as a standalone RDF/XML document along with an HTML document. The metadata can also be embedded in HTML, using technologies such as RDFa [RDFa Primer], microformats and other documents to which the GRDDL [GRDDL] mechanisms can be applied. In those cases the RDF data is extracted from the returned HTML document.

Hash URIs

The first solution is to use “hash URIs” for non-document resources. URIs can contain a fragment, a special part that is separated from the rest of the URI by a hash symbol (“#”).

When a client wants to retrieve a hash URI, then the HTTP protocol requires the fragment part to be stripped off before requesting the URI from the server. This means a URI that includes a hash cannot be retrieved directly, and therefore does not necessarily identify a Web document. But we can use them to identify other, non-document resources, without creating ambiguity.

If Example Inc. adopts this solution, then they could use these URIs to represent the company, Alice, and Bob:

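Example Inc., the company: http://www.example.com/about#exampleinc
Bob, the person: http://www.example.com/about#bob
Alice, the person: http://www.example.com/about#alice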

Clients will always strip off the fragment part before requesting any of these URIs, resulting in a request to this URI:

RDF document describing Example Inc., Bob, and Alice: http://www.example.com/about

At this URI, Example Inc. could serve an RDF document that contains descriptions of all three resources, using the original hash URIs to identify the resources.
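The fragment-stripping step can be illustrated with Python's standard urllib.parse module. The sketch below uses the hash URIs of the Example Inc. scenario; the helper name request_target is an assumption made for the illustration.

```python
from urllib.parse import urldefrag

# Hash URIs from the Example Inc. scenario.
hash_uris = [
    "http://www.example.com/about#exampleinc",
    "http://www.example.com/about#bob",
    "http://www.example.com/about#alice",
]


def request_target(uri):
    """Return (URI actually requested over HTTP, fragment kept by the client)."""
    parts = urldefrag(uri)
    return parts.url, parts.fragment


# All three identifiers lead to a request for the same document.
targets = {request_target(uri)[0] for uri in hash_uris}
print(targets)
```

The set contains a single URI, which is why one request is enough to obtain the descriptions of all three resources.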

The following picture shows the hash URI approach without content negotiation:

The hash URI solution without content negotiation

Alternatively, content negotiation (see Section 2.1.) could be employed to redirect from the about URI to either an HTML or an RDF representation. The decision about which to return is based on client preferences and server configuration, as explained below in Section 4.7. The Content-Location header should be set to indicate whether the hash URI refers to a part of the HTML document or of the RDF document.

The following picture shows the hash URI approach with content negotiation:

The hash URI solution with content negotiation

303 URIs forwarding to One Generic Document

The second solution is to use a special HTTP status code, 303 See Other, to indicate that the requested resource is not a regular Web document. Web architecture tells you that for a thing resource (URI) it is inappropriate to return a 200, because there is, in fact, no suitable representation for those resources. However, it is useful to provide information about those resources. In its httpRange-14 resolution [httpRange], the W3C’s Technical Architecture Group proposes a solution: direct the client to a document that has information about the thing it asked about. By doing this we avoid ambiguity between the original, real-world object and the resource that represents it.

Since 303 is a redirect status code, the server can give the location of a document that represents the resource. If, on the other hand, a request is answered with one of the usual status codes in the 2XX range, like 200 OK, then the client knows that the URI identifies a Web document.

If Example Inc. adopts this solution, they could use these URIs to represent the company, Alice and Bob:

Example Inc., the company
Bob, the person
Alice, the person

The Web server would be configured to answer requests to all these URIs with a 303 status code and a Location HTTP header that provides the URL of a document that represents the resource. For example, a request for http://www.example.com/id/alice would be redirected to http://www.example.com/doc/alice.

Content negotiation is then used when retrieving a representation from the document URI using an HTTP request. The server decides (see Section 4.7) to return either HTML or RDF (or more alternative forms) and sets the Content-Location header to the URI where the specific representation can be retrieved.
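The redirect from an identifier URI to its generic document can be sketched as a small Python function. This is a simplified model, not a real server configuration: the function name respond and the 404 fallback are assumptions, and the negotiation step for /doc/ URIs is only indicated in a comment.

```python
def respond(path, host="http://www.example.com"):
    """Return (status line, headers) for a request path.

    Paths under /id/ name things, so they are answered with 303 See Other
    pointing at the generic document under /doc/.
    """
    if path.startswith("/id/"):
        name = path[len("/id/"):]
        return "303 See Other", {"Location": f"{host}/doc/{name}"}
    if path.startswith("/doc/"):
        # A real server would now negotiate HTML vs. RDF and set Content-Location.
        return "200 OK", {}
    return "404 Not Found", {}


status, headers = respond("/id/alice")
print(status, headers["Location"])
```

Because only /id/ paths ever produce a 303, a client can tell from the status code alone whether it asked for a thing or for a Web document.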

This setup should be used when the RDF and HTML (and possibly more alternative representations) convey the same information in different forms. When the information in the variations differs considerably, the 303 approach as described below should be used.

See the following illustration for the solution providing the generic document URI.

solution for a generic document URI

In this setup, the server forwards from the identification URI to the generic document URI. This has the advantage that clients can bookmark and further work with the generic document. A user having an RDF-capable client could bookmark the document, and mail it to another user (or device) which then dereferences it and gets the HTML or the RDF view. Also, the server can add representations in new languages in the future. Just because the client started with the URI of a thing, it doesn’t mean that the document involved is not a first class document on the WWW. The background of generic document resources is described in [GenRes].

303 URIs forwarding to Different Documents

When the RDF and HTML representations of the resource differ substantially, the previous setup should not be used. They are not two versions of the same document, but different documents altogether. Again, the Web server would be configured to answer requests with a 303 status code and a Location HTTP header that provides the URL of a document that represents the resource.

The following picture shows the redirects for the 303 URI solution without the generic document URI:

The 303 URI solution

The server could employ content negotiation (see Section 2.1.) to send either the URL of an HTML description or RDF. HTTP requests for HTML content would be redirected to the HTML URLs we gave in Section 2. Requests for RDF data would be redirected to RDF documents, such as:

RDF document describing Example Inc., the company
RDF document describing Bob, the person
RDF document describing Alice, the person

Each of the RDF documents would contain statements about the appropriate resource, using the original URI, e.g. http://www.example.com/id/alice, to identify the described resource.
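A minimal Python sketch of this variant follows, in which the 303 target itself depends on the Accept header. The function name see_other and the PREFIX_BY_TYPE table are assumptions; the /people/ and /data/ prefixes follow the document URLs used for Alice in this text. As a simplification, q-values are ignored and the first supported media type listed by the client wins.

```python
# Media type -> path prefix of the describing document (Alice example).
PREFIX_BY_TYPE = {
    "text/html": "/people/",
    "application/rdf+xml": "/data/",
}


def see_other(path, accept, host="http://www.example.com"):
    """Return (status line, headers) redirecting an /id/ URI to a description."""
    if not path.startswith("/id/"):
        return None  # not an identifier URI; handled elsewhere
    name = path[len("/id/"):]
    for item in accept.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type in PREFIX_BY_TYPE:
            location = host + PREFIX_BY_TYPE[media_type] + name
            return "303 See Other", {"Location": location}
    # No supported type listed: default to the HTML description.
    return "303 See Other", {"Location": host + "/people/" + name}


print(see_other("/id/alice", "application/rdf+xml")[1]["Location"])
print(see_other("/id/alice", "text/html")[1]["Location"])
```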

Choosing between 303 and Hash

Which approach is better? It depends. The hash URIs have the advantage of reducing the number of necessary HTTP round-trips, which in turn reduces access latency. A family of URIs can share the same non-hash part. The descriptions of http://www.example.com/about#exampleinc, http://www.example.com/about#alice, and http://www.example.com/about#bob are retrieved with a single request to http://www.example.com/about. However, this approach has a downside. A client interested only in #product123 will inadvertently load the data for all other resources as well, because they are in the same file. 303 URIs, on the other hand, are very flexible because the redirection target can be configured separately for each resource. There could be one describing document for each resource, or one large document for all of them, or any combination in between. It is also possible to change the policy later on.

When using 303 URIs for an ontology, like FOAF, network delay can reduce a client’s performance considerably. The large number of redirects may cause higher latency. A client looking up a set of terms through 303 may use many requests, even though the first request has already loaded everything there is to know.

When hosting large-scale datasets with the 303 solution, clients may be tempted to download all data using many requests. We advise additionally providing SPARQL endpoints or comparable services to answer complex queries on the server directly, rather than letting the client download a large set of data via HTTP.

Note also that 303 and hash URIs can be combined, allowing a large dataset to be separated into multiple parts while still having an identifier for a non-document resource. An example of a combination of 303 and hash is:

Bob, the person with a combined URI.

Any fragment identifier is valid; “this” in the above URI is a suggestion you may want to copy for your implementations.

Hash URIs should be preferred for rather small and stable sets of resources that evolve together. The ideal cases are RDF Schema vocabularies and OWL ontologies, where the terms are often used together, and the number of terms is unlikely to grow out of control in the future.

Hash URIs without content negotiation can be implemented by simply uploading static RDF files to a Web server, without any special server configuration. This makes them popular for quick-and-dirty RDF publication.

URIs of the bob#this form can be used for large sets of data that are, or may grow, beyond the point where it is practical to serve all related resources in a single document. 303 URIs may also be used for such data sets, making neater-looking URIs, but with an impact on run-time performance and server load.

If in doubt, follow your nose.

Cool URIs

The best resource identifiers don’t just provide descriptions for people and machines, but are designed with simplicity, stability and manageability in mind, as explained by Tim Berners-Lee in Cool URIs don’t change and by the W3C Team in Common HTTP Implementation Problems (sections 1 and 3):

Short, mnemonic URIs will not break as easily when sent in emails and are in general easier to remember, e.g. when debugging your Semantic Web server.
Once you set up a URI to identify a certain resource, it should remain this way as long as possible. Think about the next ten years. Maybe twenty. Keep implementation-specific bits and pieces such as .php and .asp out of your URIs; you may want to change technologies later.
Issue your URIs in a way that you can manage. One good practice is to include the current year in the URI path, so that you can change the URI schema each year without breaking older URIs. Keeping all 303 URIs on a dedicated subdomain, e.g. http://id.example.com/alice, eases later migration of the URI-handling subsystem.

All the URIs related to a single real-world object—resource identifier, RDF document URL, HTML document URL—should also be explicitly linked with each other to help information consumers understand their relation. For example, in the 303 URI solution for Example Inc., there are three URIs related to Alice:

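Identifier for Alice, the person: http://www.example.com/id/alice
Alice’s homepage: http://www.example.com/people/alice
RDF document with description of Alice: http://www.example.com/data/alice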

Two of them are Web document URLs. The RDF document located at http://www.example.com/data/alice might contain these statements (expressed in N3):

<http://www.example.com/id/alice>
foaf:page <http://www.example.com/people/alice>;
rdfs:isDefinedBy <http://www.example.com/data/alice>;

a foaf:Person;
foaf:name "Alice";
foaf:mbox <mailto:alice@example.com> .

The document makes statements about Alice, the person, using the resource identifier. The first two properties relate the resource identifier to the two document URIs. The foaf:page statement links it to the HTML document. This allows RDF-aware clients to find a human-readable resource, and at the same time, by linking the page to its topic, defines useful metadata about that HTML document. The rdfs:isDefinedBy statement links the person to the document containing its RDF description and allows RDF browsers to distinguish this main resource from other auxiliary resources that just happen to be mentioned in the document. We use rdfs:isDefinedBy instead of its weaker superproperty rdfs:seeAlso because the content at /data/alice is authoritative. The remaining statements are the actual white pages data.

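The HTML document at http://www.example.com/people/alice should contain in its header a <link> element that points to the corresponding RDF document:

<head>
<title>Alice’s Homepage</title>
<link rel="alternate" type="application/rdf+xml" href="http://www.example.com/data/alice" />
</head>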

This allows RDF-aware Web clients to discover the RDF information. The approach is recommended in the RDF/XML specification ([RDFXML], section 9). If the RDF data is about the Web page, rather than an expression of the information in it, then we recommend using rel="meta" instead of rel="alternate".
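How an RDF-aware client might perform this discovery can be sketched with Python's standard html.parser module. The class name RdfLinkFinder and the sample page source are assumptions made for the illustration; the sketch only looks for link elements with rel="alternate" and the RDF/XML content type.

```python
from html.parser import HTMLParser


class RdfLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate" type="application/rdf+xml">."""

    def __init__(self):
        super().__init__()
        self.rdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        is_rdf = attributes.get("type") == "application/rdf+xml"
        if "alternate" in rel_values and is_rdf and attributes.get("href"):
            self.rdf_links.append(attributes["href"])


# Hypothetical page source modelled on Alice's homepage.
page = """<html><head>
<title>Alice's Homepage</title>
<link rel="alternate" type="application/rdf+xml"
      href="http://www.example.com/data/alice" />
</head><body></body></html>"""

finder = RdfLinkFinder()
finder.feed(page)
print(finder.rdf_links)
```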

The client also can deduce similar link information directly from the HTTP headers: that a thing is described by a Web document which can be found at the end of a 303 redirect; that the Content-Location resource is a content-specific version of the generic document, and more. Ontologies for these relations are not discussed here.

The following illustration shows how the RDF and HTML documents should relate the three URIs to each other:

The RDF and HTML documents should relate the URIs to each other

Implementing Content Negotiation

The W3C’s Semantic Web Best Practices and Deployment Working Group has published a document that describes how to implement the solutions presented here on the Apache Web server. The Best Practice Recipes for Publishing RDF Vocabularies [Recipes] mostly discuss the publication of RDF vocabularies, but the ideas can also be applied to other kinds of small RDF datasets that are published from static files.

However, especially when it comes to content negotiation, the Recipes document doesn’t cover some important details. Content negotiation is a bit more difficult in practice because of mixed-mode clients that can deal with both HTML and RDF, such as Firefox with the Tabulator extension.

These browsers announce their ability to consume both RDF and HTML through Accept headers that use q (quality) values:

Accept: application/rdf+xml;q=0.7, text/html

This browser accepts RDF with a q value of 0.7 and HTML with a q value of 1.0 (the default). This means the browser has a slight preference for HTML over RDF.

Now, a client preference for HTML doesn’t necessarily mean that every server should send HTML. The server has to look at the client’s preferences, and then it must make a decision based on the quality of the different variants it could offer. For example:

If the HTML variant is a simple low-quality rendering of the RDF, like a property-value table or a list of triples, then the server should send the RDF, unless the client has a very strong preference for HTML.
If the HTML and RDF variants contain the same information, and both are of high quality, then the server should treat both variants with equal preference, and leave the choice to the client’s preferences.
If the RDF variant is only a part of the information offered in the HTML, or is scraped from the HTML, then the server should probably send the HTML, unless the client has a strong preference for RDF.

There are algorithms for choosing the best match by comparing client preferences with the quality of the server’s available variants. For example, the Apache server can be configured with server-side qs values that specify the relative quality of each variant.

A qs value of 1.0 for application/rdf+xml and 0.5 for text/html would mean that the HTML variant has only approximately half the quality of the RDF and might be appropriate in the first case from the list above. If the HTML is a news article and the RDF contains just minimal information such as title, date and author, then 1.0 for the HTML and 0.1 for the RDF would be appropriate.

To determine the best variant for a particular client, Apache multiplies the client’s q value for HTML with the configured qs value for HTML; and the same for RDF. The variant with the higher number wins. Apache’s documentation has a section with a detailed description of its content negotiation algorithm [ApCN]. HTTP’s Accept header is described in detail in section 14.1 of the HTTP specification [HTTP-SPEC].
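The multiplication of q and qs values can be sketched in Python as follows. This is a simplified model of the idea, not Apache's actual algorithm, which takes further dimensions such as language and encoding into account; the function names parse_accept and best_variant are assumptions, and wildcard media ranges such as */* are not handled.

```python
def parse_accept(header):
    """Parse an Accept header into {media_type: q}; q defaults to 1.0."""
    prefs = {}
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        prefs[media_type] = q
    return prefs


def best_variant(accept_header, qs_values):
    """Return the media type with the highest q * qs, or None if none is acceptable."""
    prefs = parse_accept(accept_header)
    scores = {t: prefs.get(t, 0.0) * qs for t, qs in qs_values.items()}
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


accept = "application/rdf+xml;q=0.7, text/html"
# Low-quality HTML rendering: RDF scores 0.7 * 1.0, HTML scores 1.0 * 0.5.
print(best_variant(accept, {"application/rdf+xml": 1.0, "text/html": 0.5}))
# News article with minimal RDF: RDF scores 0.7 * 0.1, HTML scores 1.0 * 1.0.
print(best_variant(accept, {"application/rdf+xml": 0.1, "text/html": 1.0}))
```

With the mixed-mode Accept header shown earlier, the first configuration selects the RDF variant and the second selects the HTML variant, matching the two qs examples given in the text.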

Content negotiation, with all its details, is fairly complex, but it is a powerful way of choosing the best variant for mixed-mode clients that can deal with HTML and RDF.

Examples from the Web

Not all projects that work with Semantic Web technologies make their data available on the Web. But a growing number of projects follow the practices described here. This section gives a few examples.

ECS Southampton. The School of Electronics and Computer Science at the University of Southampton has a Semantic Web site that employs the 303 solution and is a great example of Semantic Web engineering. It is documented in the ECS URI System Specification [ECS]. Separate subdomains are used for HTML documents, RDF documents, and resource identifiers. Take these examples:

URI for Wendy Hall, the person
HTML page about Wendy Hall
RDF about Wendy Hall

Entering the first URI into a normal Web browser redirects to an HTML page about Wendy Hall. It presents a Web view of all available data on her. The page also links to her URI and to her RDF document.

D2R Server is an open-source application that can be used to publish data from relational databases on the Semantic Web in accordance with these guidelines. It employs the 303 solution and content negotiation. For example, the D2R Server publishing the DBLP Bibliography Database publishes several thousand bibliographical records and information about their authors. Example URIs, again connected via 303 redirects:

URI for Chris Bizer, the person
HTML page about Chris Bizer

The RDF document for Chris Bizer is a SPARQL query result from the server’s SPARQL endpoint:


The SPARQL query encoded in this URI is:

DESCRIBE <http://www4.wiwiss.fu-berlin.de/dblp/resource/person/315759>

This shows how a SPARQL endpoint can be used as a convenient method of serving resource descriptions.
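How such a query might be encoded into a URI can be sketched with Python's standard library. The endpoint address below is a hypothetical placeholder, not the address of the D2R Server; the query is passed in the query parameter, as in the SPARQL protocol for HTTP GET requests.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint address, used only for illustration.
ENDPOINT = "http://example.org/sparql"


def describe_url(resource_uri, endpoint=ENDPOINT):
    """Build a GET URL carrying a DESCRIBE query for resource_uri."""
    query = f"DESCRIBE <{resource_uri}>"
    return endpoint + "?" + urlencode({"query": query})


url = describe_url("http://www4.wiwiss.fu-berlin.de/dblp/resource/person/315759")
print(url)
# Decoding the query string recovers the original SPARQL text.
print(parse_qs(urlsplit(url).query)["query"][0])
```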

Semantic MediaWiki is an open-source Semantic wiki engine. Authors can use special wiki syntax to put semantic attributes and relationships into wiki articles. For each article, the software generates a 303 URI that identifies the article’s topic, and serves RDF descriptions generated from the attributes and relationships. Semantic MediaWiki drives the OntoWorld wiki. It has an article about the city of Karlsruhe:

the article, an HTML document
the city of Karlsruhe
RDF description of Karlsruhe

The URI of the RDF description is less than ideal, because it exposes the implementation (php) and refers redundantly to RDF in the path and in the query. A much cooler URI would be for example http://ontoworld.org/data/Karlsruhe, as it allows content negotiation to be used to serve the data in RDF, RIF (Rule Interchange Format), or whatever else we think of next.

Other Resource Naming Proposals

Many other approaches have been suggested over the years. While most of them are appropriate in special circumstances, we feel that they do not fit the criteria from Section 3, which are to be on the Web and to be unambiguous. Therefore they are not adequate as general solutions for building a standards-based, non-fragmented, decentralized Semantic Web. We will discuss two of these approaches in some detail.

New URI Schemes

HTTP URIs already identify Web resources and Web documents, not other kinds of resources. Shouldn’t we create a new URI scheme to identify other resources? Then we could easily distinguish them from Web documents just by looking at the first characters of the URI. For example, the info scheme can be used to identify books based on an LCCN number: info:lccn/2002022641.

Here are examples of such new URI schemes. A longer list is provided by Thompson and Orchard in URNs, Namespaces and Registries [TAG-URNs].

Magnet is an open URI scheme enabling seamless integration between Web sites and locally-running utilities, such as file-management tools. It is based on hash-values, a URI looks like this:
The info: URI scheme is proposed to identify information assets that have identifiers in existing public namespaces. Examples are URIs for LCCN numbers (info:lccn/2002022641) and the Dewey decimal system (info:ddc/22/eng//004.678).
The idea of Tag URIs is to generate collision-free URIs by using a domain name and the date when the URI was allocated. Even if the domain changes ownership at a later date, the URI remains unambiguous. Example: tag:hawke.org,2001-06-05:Taiko.
XRI defines a scheme and resolution protocol for abstract identifiers. The idea is to use URIs that contain wildcards, to adapt to changes of organizations, servers, etc. Examples are @Jones.and.Company/(+phone.number) or xri://northgate.library.example.com/(urn:isbn:0-395-36341-1).

To be truly useful, a new scheme must be accompanied by a protocol defining how to access more information about the identified resource. For example, the ftp:// URI scheme identifies resources (files on an FTP server), and also comes with a protocol for accessing them (the FTP protocol).

Some of the new URI schemes provide no such protocol at all. Others provide a Web Service that allows retrieval of descriptions using the HTTP protocol. The identifier is passed to the service, which looks up the information in a central database or in a federated way. The problem here is that a failure in this service renders the system unusable.

Another drawback can be a dependence on a standardization body. To register new parts in the info: space, a standardization body has to be contacted. This, or paying a license fee before creating a new URI, slows down adoption. In such cases a standardization body is desirable to ensure that all URIs are unique (e.g. with ISBNs). But this can be achieved using HTTP URIs inside an HTTP namespace owned and managed by the standardization organization.

Independent of standardization body and retrievability, pending patents and legal issues can influence the adoption of a new URI scheme. When using patented technology, implementers should verify that a Royalty-Free license is available.

The problems with new URI schemes are discussed at length in URNs, Namespaces and Registries.

Reference by Description

“Reference by Description” radically solves the URI problem by doing away with URIs altogether: Instead of naming resources with a URI, anonymous nodes are used, and are described with information that allows us to find the right one. A person, for example, could be described by name, date of birth, and social security number. These pieces of information should be sufficient to uniquely identify a person.

A popular practice is the use of a person’s email address as a uniquely identifying piece of information. The foaf:mbox property is used in Friend of a Friend (FOAF) profiles for this purpose. In OWL, this kind of property is known as an Inverse Functional Property (IFP). When an agent encounters two resources with the same email address, it can infer that both refer to the same person and can treat them as one.
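The inference an agent makes from an inverse functional property can be sketched in Python. The data model is an assumption made for the illustration: each anonymous node is a dictionary mapping property names to sets of values, and each node is assumed to carry at most one foaf:mbox value.

```python
def merge_by_mbox(descriptions):
    """Merge descriptions that share a foaf:mbox value.

    Each description is a dict of property -> set of values. Descriptions
    without an mbox cannot be identified and are left out of the result.
    """
    merged = {}
    for description in descriptions:
        for mbox in description.get("foaf:mbox", ()):
            target = merged.setdefault(mbox, {})
            for prop, values in description.items():
                target.setdefault(prop, set()).update(values)
    return merged


# Two anonymous nodes, found in different documents, with the same mailbox.
node_a = {"foaf:mbox": {"mailto:alice@example.com"}, "foaf:name": {"Alice"}}
node_b = {
    "foaf:mbox": {"mailto:alice@example.com"},
    "foaf:homepage": {"http://www.example.com/people/alice"},
}

people = merge_by_mbox([node_a, node_b])
print(len(people))
print(sorted(people["mailto:alice@example.com"]))
```

Both nodes end up as one entry that carries the name and the homepage, which is the merging step described above.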

But how to be on the Web with this approach? How to enable agents to download more data about resources we mention? There is a best practice to achieve this goal: Provide not only the IFP of the resource (e.g. the person’s email address), but also an rdfs:seeAlso property that points to a Web address of an RDF document with further information about it. We see that HTTP URIs are still used to identify the location where more information can be downloaded.

Furthermore, we now need several pieces of information to refer to a resource: the IFP value and the RDF document location. The simple act of linking by using a URI has become a process involving several moving parts, and this increases the risk of broken links and makes implementation more cumbersome.

Regarding FOAF’s practice of avoiding URIs for people, we agree with Tim Berners-Lee’s advice: “Go ahead and give yourself a URI. You deserve it!”

Source: http://www.w3.org/TR/cooluris/

URIs in Legislation.gov.uk

This is the URI-based identifier system that John Sheridan and Dr. Jeni Tennison developed for the Legislation.gov.uk system.

This page describes the URI scheme that is used on the Legislation API.

The best way of finding out the URI for a particular piece of legislation is to search for it. A search on the title of a piece of legislation will redirect you to the proper URI for that item of legislation without you having to construct the URI yourself.

The Legislation API attempts to follow the guidance given in How to Publish Linked Data on the Web. We define three levels of URIs:

identifier URIs; for example, “The Transport Act 1985”, http://www.legislation.gov.uk/id/ukpga/1985/67

document URIs; for example, “The current version of The Transport Act 1985” (as opposed to a previous version), http://www.legislation.gov.uk/ukpga/1985/67

representation URIs; for example, “The current version of The Transport Act 1985 in XML” (as opposed to an HTML document), http://www.legislation.gov.uk/ukpga/1985/67/data.xml

When you request an identifier URI, the response will usually be a 303 See Other redirection to a document URI. When you request a document URI, you will usually get a 200 OK response and a Content-Location header that will point to an appropriate representation URI based on the Accept headers that you use in the request.
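
As a rough illustration of the three levels, the following Python sketch classifies a URI by its path and derives a document URI from an identifier URI. The rules are inferred only from the three example URIs above, and the function names uri_level and document_uri_for are hypothetical; the redirects returned by the server remain the authoritative mapping.

```python
from urllib.parse import urlparse

BASE = "http://www.legislation.gov.uk"


def uri_level(uri):
    """Guess the level of a URI from its path (an assumption based on
    the example URIs, not a rule published by the API)."""
    path = urlparse(uri).path.rstrip("/")
    last_segment = path.rsplit("/", 1)[-1]
    if path == "/id" or path.startswith("/id/"):
        return "identifier"
    if last_segment.startswith("data."):
        return "representation"
    return "document"


def document_uri_for(identifier_uri):
    """Drop the /id segment to get the corresponding document URI."""
    prefix = BASE + "/id/"
    if not identifier_uri.startswith(prefix):
        raise ValueError("not an identifier URI: " + identifier_uri)
    return BASE + "/" + identifier_uri[len(prefix):]
```

For instance, document_uri_for applied to the identifier URI for The Transport Act 1985 yields the document URI shown in the list above.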
Identifier URIs

We recommend that you link to identifier URIs.

Identifier URIs generally follow the template:

http://www.legislation.gov.uk/id/{type}/{year}/{number}[/{section}]

However, legislation is often quoted without a chapter number, which can make it hard to automatically construct these URIs. If you don’t know the chapter number for a piece of legislation, you can use a search URI of the form:

http://www.legislation.gov.uk/id?title={title}

If the title is recognised, this will result in a 301 Moved Permanently redirection to the canonical URI for the legislation. For example, requesting:

http://www.legislation.gov.uk/id?title=The%20Transport%20Act%201985

will result in a 301 Moved Permanently redirection to

http://www.legislation.gov.uk/id/ukpga/1985/67

On occasion, items of legislation have very similar titles, and the title search will result in multiple possibilities. In this case, the response will be a 300 Multiple Choices containing a simple XHTML document. For example, requesting

http://www.legislation.gov.uk/id?title=Disability%20Rights%20Commission%20Act

will result in a document containing

<li><a href="/id/uksi/2006/3189">The Disability Rights Commission Act 1999 (Commencement No.3) Order 2006</a></li>
<li><a href="/id/uksi/2000/880">The Disability Rights Commission Act 1999 (Commencement No. 2 and Transitional Provision) Order 2000</a></li>
<li><a href="/id/uksi/1999/2210">The Disability Rights Commission Act 1999 (Commencement No. 1 and Transitional Provision) Order 1999</a></li>
<li><a href="/id/uksi/1999/17">Disability Rights Commission Act 1999</a></li>
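
A minimal Python sketch of how a client might build these URIs from the two templates in this section. The helper names identifier_uri and title_search_uri are hypothetical, and the sketch only constructs strings; it does not contact the server or follow redirects.

```python
from urllib.parse import quote

BASE = "http://www.legislation.gov.uk"


def identifier_uri(type_code, year, number, section=None):
    """Fill in the template /id/{type}/{year}/{number}[/{section}]."""
    uri = f"{BASE}/id/{type_code}/{year}/{number}"
    if section:
        uri += "/" + section.strip("/")
    return uri


def title_search_uri(title):
    """Fill in the template /id?title={title}.

    quote() percent-encodes spaces as %20, matching the example above.
    """
    return f"{BASE}/id?title={quote(title)}"
```

Calling title_search_uri("The Transport Act 1985") reproduces the search URI quoted in the text; the server then decides whether to answer with a redirect or a list of choices.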
Legislation Types

The legislation type codes are the same as those used on the Statute Law Database, and within the OPSI site. The list is:

Document Main Type: URI abbreviation

Primary Legislation

UK Public General Acts: ukpga
UK Local Acts
Acts of the Parliament of Great Britain (1707-1800)
Acts of the English Parliament (1267-1706)
Acts of the Old Scottish Parliament (1424-1707)
Acts of the Scottish Parliament
Acts of the Old Irish Parliament (1495-1800)
Acts of the Northern Ireland Parliament (1921-1972)
Measures of the Northern Ireland Assembly (1974)
Acts of the Northern Ireland Assembly: nia
UK Church Measures
Measures of the Welsh Assembly
Acts of the Welsh Assembly

Secondary Legislation

UK Statutory Instruments: uksi
Scottish Statutory Instruments
Wales Statutory Instruments: wsi
Northern Ireland Statutory Rules
UK Church Instruments
Northern Ireland Orders in Council: nisi
UK Ministerial Orders

Draft Legislation

UK Draft Statutory Instruments
Scottish Draft Statutory Instruments
Northern Ireland Statutory Rules
Northern Ireland Draft Orders in Council
Welsh Draft Statutory Instruments

Wales Statutory Instruments and Northern Ireland Orders in Council follow the same numbering sequence as UK Statutory Instruments, and can therefore be legitimately referred to through a URI using either wsi/nisi or uksi. In these cases, the wsi or nisi URI is the canonical one. For example, a request to
http://www.legislation.gov.uk/id/uksi/2002/808

will result in a 301 Moved Permanently response with a Location header pointing to

http://www.legislation.gov.uk/id/wsi/2002/808

Legislation Years

The legislation year can be a calendar year or a regnal year. Calendar years can be used for legislation after 1963, but before that time legislation is unambiguously identified based on the year of reign of the monarch. For example:

http://www.legislation.gov.uk/id/ukpga/1985/67

identifies The Transport Act 1985 (c.67), and:

http://www.legislation.gov.uk/id/ukpga/Edw7/7/51

identifies the Sheriff Courts (Scotland) Act 1907 (c.51). If you use a calendar year prior to 1963 within a URI, you will be redirected to the canonical identifier, which uses the regnal year. For example, requesting:

http://www.legislation.gov.uk/id/ukpga/1907/51

will result in a 301 Moved Permanently response with a Location header pointing to

http://www.legislation.gov.uk/id/ukpga/Edw7/7/51

On a few occasions, a pre-1963 calendar year in a URI does not uniquely identify a particular piece of legislation. For example:

http://www.legislation.gov.uk/id/ukpga/1955/19

could refer to the Friendly Societies Act 1955 (c.19) or the Air Force Act 1955 (c.19). These items of legislation have different regnal years, but the same calendar year. The above request will result in a 300 Multiple Choices response, and the result will be XHTML that includes:

<li><a href="/id/ukpga/Eliz2/3-4/19">Air Force Act 1955</a></li>
<li><a href="/id/ukpga/Eliz2/4-5/19">Friendly Societies Act 1955</a></li>
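
A client that receives a Multiple Choices response has to pick one of the listed candidates. The following Python sketch extracts the candidate links from such a list using the standard library HTML parser. It assumes the attribute values are delimited by plain double quotes and that the body contains only simple link items like the two shown above; the class name ChoiceParser is hypothetical.

```python
from html.parser import HTMLParser


class ChoiceParser(HTMLParser):
    """Collect (href, title) pairs from the <a> elements of a choices list."""

    def __init__(self):
        super().__init__()
        self.choices = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that sits inside a link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.choices.append((self._href, "".join(self._text).strip()))
            self._href = None


body = (
    '<li><a href="/id/ukpga/Eliz2/3-4/19">Air Force Act 1955</a></li>'
    '<li><a href="/id/ukpga/Eliz2/4-5/19">Friendly Societies Act 1955</a></li>'
)
parser = ChoiceParser()
parser.feed(body)
parser.close()
choices = parser.choices
```

Each href is relative, so a client would resolve it against http://www.legislation.gov.uk before requesting it.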
Legislation Numbers

The legislation number is an integer that reflects the legislation’s chapter number according to the primary numbering sequence for the type. Legislation is sometimes assigned one or more secondary numbers. Secondary numbering schemes are:

Numbering Scheme: Description

Commencement and/or Appointed Day orders (C): Bring into force an Act or part of an Act.
Legal series (L): Relate to fees or procedures in Courts in England and Wales.
Scottish series (S): Instruments covering reserved matters applying to Scotland only, not to be confused with Scottish Statutory Instruments made under powers devolved under the Scotland Act 1998.
Northern Ireland series (NI): Orders in Council made under section 1(3) of the Northern Ireland (Temporary Provisions) Act 1972 or paragraph 1 of Schedule 1 to the Northern Ireland Act 1974.
National Assembly for Wales series (W/Cy): Statutory Instruments made by the National Assembly for Wales and applying to Wales only. Such instruments will generally be made in both the English and Welsh languages.


It’s possible to use a secondary number within a URI by prefixing the number with the appropriate prefix as shown in the above table. This will result in a 301 Moved Permanently redirection to the URI using the main numbering scheme. For example, requesting
http://www.legislation.gov.uk/id/wsi/2002/w89

will result in a 301 Moved Permanently redirection to

http://www.legislation.gov.uk/id/wsi/2002/808

Legislation Sections

You can refer to particular sections, articles, regulations and so on within a piece of legislation by appending /{divisionName}/{number} to the URI. For example, to refer to section 6 of the Road Traffic Regulation Act 1984, you can use

http://www.legislation.gov.uk/id/ukpga/1984/27/section/6

The name of the division that is used depends on the type of the legislation as follows:

Legislation Type: Division Name

Act, Scottish Bill
UK Bill
Order in Council, Order of Council or Order

For example, regulation 6 of the Overseas Life Insurance Companies Regulations 2004 can be referenced with:

http://www.legislation.gov.uk/id/uksi/2004/2200/regulation/6

Further subsections can be listed after the section number, using forward slashes as separators. For example:

http://www.legislation.gov.uk/id/ukpga/1975/63/section/1/1/ba

The numbering scheme used for the sections, subsections and so on is that used within the legislation itself.

Whole schedules can be referred to with /schedule/{numberOrLetter}, and paragraphs within schedules using /schedule/{numberOrLetter}/paragraph/{paraNumber}. For example:

http://www.legislation.gov.uk/id/ukpga/2005/6/schedule/1/paragraph/2

Sub-paragraphs can be referred to in the same way as the subsections described above.

In cases where a piece of legislation only has one schedule, the keyword schedule can be used on its own. For example:

http://www.legislation.gov.uk/id/ukpga/1996/6/schedule
To refer to other structures within a piece of legislation, such as parts, chapters and so on, the appropriate name for the structure should be used in lowercase, with separators between it and its number. Further substructures can be appended to this. For example, Part IV to Schedule 9 of the Road Traffic Regulation Act 1984 should be referenced using:

http://www.legislation.gov.uk/id/ukpga/1984/27/schedule/9/part/IV

The allowed keywords here are:

Note that these are URI keywords, and always in English regardless of the language used in the legislation. However, the numbers used for parts, chapters and so on reflect the numbers used within the legislation; some legislation may contain Part II while another contains Part 2, and the URIs will reflect this difference rather than normalising on decimal numbers.
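
The path-building rules for sections, schedules and other structures can be sketched as a small Python helper. The name with_division is hypothetical. The helper simply joins segments with forward slashes and deliberately leaves numbers exactly as written in the legislation (IV stays IV), in line with the note above; it does not check that the division exists.

```python
def with_division(legislation_uri, *parts):
    """Append division segments to a legislation URI.

    Example parts: ("section", 6) or ("schedule", 9, "part", "IV").
    Numbers are passed through unchanged, so "IV" is not turned into 4.
    """
    segments = [str(part) for part in parts]
    if any(not segment or "/" in segment for segment in segments):
        raise ValueError("segments must be non-empty and contain no slashes")
    return "/".join([legislation_uri.rstrip("/")] + segments)


act = "http://www.legislation.gov.uk/id/ukpga/1984/27"
section_6 = with_division(act, "section", 6)
schedule_9_part_iv = with_division(act, "schedule", 9, "part", "IV")
```

Whether a given combination of keywords is accepted is decided by the server, which answers with 404 Not Found for divisions that do not exist.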
Requesting a division that does not exist within the legislation will result in a 404 Not Found response.

Document URIs

Document URIs are used to refer to particular documents on the web: versions of the legislation. Document URIs follow the template:

http://www.legislation.gov.uk/{type}/{year}/{number}[/{section}][/{authority}][/{extent}][/{version}]

Legislation Authorities

The documents provided within the SLS API come from four possible sources, which may be reflected in the URI:

Source: URI abbreviation

Statute Law Database: sld
Revised versions of primary legislation; unrevised versions of secondary legislation. Revised and unrevised versions of Northern Ireland Acts and Orders in Council prior to 2006. See the description of limitations for more details.

King’s or Queen’s Printer of Acts of Parliament
Enacted/made versions of UK legislation since 1988.

King’s or Queen’s Printer for Scotland
Enacted/made versions of Scottish legislation since 1988.

Government Printer for Northern Ireland: gpni
Enacted/made versions of Northern Ireland legislation since 1988.

The default authority, if one isn’t given in the URI, depends on the version of the document being viewed. The revised version of legislation from the Statute Law Database will be returned for a current, dated or prospective version; if the enacted version of legislation is requested, you will get the King or Queen’s Printer version unless it’s unavailable, in which case the unrevised version from the Statute Law Database will be provided if possible.

For example:

http://www.legislation.gov.uk/nia/2000/5

will return the latest version of the Weights and Measures (Amendment) Act (Northern Ireland) 2000 from the Statute Law Database, which could also be accessed at the URI:

http://www.legislation.gov.uk/nia/2000/5/sld

Requesting:

http://www.legislation.gov.uk/nia/2000/5/enacted

will return the enacted version of the Act from the Government Printer for Northern Ireland, which could also be accessed at:

http://www.legislation.gov.uk/nia/2000/5/gpni/enacted

The unrevised version of the Act from the Statute Law Database can be accessed at:

http://www.legislation.gov.uk/nia/2000/5/sld/enacted

The text of this version will be the same as the Government Printer for Northern Ireland version, but may include annotations and links to other legislation.

Legislation Extents

To reference legislation as it extends to a particular country, append /{country}, where country is a country keyword such as england, scotland or wales.

For example, to get the Rent Act 1977 as it extends to England, you would use:

http://www.legislation.gov.uk/ukpga/1977/42/england

It is also possible to select a section based on more than one country by listing them with a + separator. For example,

http://www.legislation.gov.uk/ukpga/1985/67/section/6/england+scotland

requests the versions of Section 6 of The Transport Act 1985 that are applicable to England and Scotland.

Requesting a piece of legislation, or a subsection of legislation, while specifying a country that the legislation or subsection does not extend to will result in a 404 Not Found response.

URIs that do not specify an extent are assumed to refer to the legislation as it extends to all countries.

When a selection for an exact extent is needed, the ‘=’ operator can precede the country list. For example,

http://www.legislation.gov.uk/ukpga/1985/67/section/6/=england+wales

requests all versions of Section 6 of The Transport Act 1985 that are applicable to both England and Wales.

Legislation Versions
Legislation versions fall into three general categories: enacted/made versions, dated versions and prospective versions.
Enacted/Made Versions
The enacted or made version of legislation reflects the text of the legislation when it becomes law. Primary legislation is “enacted” while the majority of secondary legislation is “made” (United Kingdom Church Instruments and Ministerial Orders are simply “created”).

Using the keyword enacted, made or created at the end of a document URI provides the enacted or made version of the legislation, if such is available. The enacted version of legislation is not generally available for legislation prior to 1988.

For example, the enacted version of the Childcare Act 2006 can be found at:
http://www.legislation.gov.uk/ukpga/2006/21/enacted

Dated Versions
It is often helpful to know which parts of a piece of legislation are in force at a particular time. Often, particular sections of a piece of legislation do not come into force immediately (on the enactment date) but are brought into force later on, often through a commencement order (a particular kind of secondary legislation).

In addition, most legislation, particularly primary legislation, goes through multiple changes during its lifetime as other legislation inserts or repeals sections, paragraphs and phrases. Like the original, enacted, sections, inserted sections may not actually come into force until a separate order is made.

If no version is specified in a document URI, this is taken to refer to the version of the legislation that is currently in force. For example:

http://www.legislation.gov.uk/ukpga/1985/67

indicates the current version of The Transport Act 1985, and will provide the most up to date version of the legislation available through the API. (This may not indicate the current state of the legislation, due to the limitations of the content available through this site.) In this case, the result will be the legislation as it stood on 1st April 2003, which is also accessible at the URI:

http://www.legislation.gov.uk/ukpga/1985/67/2003-04-01

Any date can be used within the URI. For example:

http://www.legislation.gov.uk/ukpga/1985/67/1997-06-01

would refer to the version of The Transport Act 1985 that was in effect on 1st June 1997.
Requesting a date that is prior to the base date of 1st February 1991 will result in a redirection to the legislation as it was on the base date.

Requesting a date that was prior to the enactment of the legislation results in a 404 Not Found response. Requests for sections that did not exist within a particular version will still return that section, though the fact that it was not in force on that date will be indicated.
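
The document URI template, including the optional authority, extent and version parts, can be assembled as in the Python sketch below. The function names document_uri and effective_date are hypothetical, the keyword arguments are simply one way of modelling the template, and no validation of type codes, authorities or country names is attempted. The base date constant is taken from the text above.

```python
from datetime import date

BASE = "http://www.legislation.gov.uk"
BASE_DATE = date(1991, 2, 1)  # base date quoted in the text above


def document_uri(type_code, year, number, section=None, authority=None,
                 extent=None, exact_extent=False, version=None):
    """Assemble a document URI in template order:
    type/year/number[/section][/authority][/extent][/version]."""
    parts = [BASE, type_code, str(year), str(number)]
    if section:
        parts.append(section.strip("/"))
    if authority:
        parts.append(authority)
    if extent:
        # Countries are joined with "+"; a leading "=" asks for an exact match.
        parts.append(("=" if exact_extent else "") + "+".join(extent))
    if isinstance(version, date):
        parts.append(version.isoformat())  # for example 1997-06-01
    elif version:
        parts.append(version)  # a keyword such as "enacted"
    return "/".join(parts)


def effective_date(requested):
    """Date a request is described as being redirected to when it
    falls before the base date."""
    return max(requested, BASE_DATE)


dated = document_uri("ukpga", 1985, 67, version=date(1997, 6, 1))
```

Here dated holds the URI for The Transport Act 1985 as it stood on 1st June 1997; the redirect and 404 behaviour described above is still decided by the server.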
Prospective Versions
At any point in time, there may be prospective sections within or amendments to a piece of legislation: planned sections or changes that have not come into force. Using /prospective instead of a date within the URI refers to the legislation that would be in force were all prospective sections and amendments in effect. For example, Part II of the Road Traffic Regulation Act 1984 has a prospective amendment from the Railways and Transport Safety Act 2003 (sections 108 and 120) that adds a Section 22B. The prospective version of that Part would be:
http://www.legislation.gov.uk/ukpga/1984/27/part/II/prospective

Note that the URI

http://www.legislation.gov.uk/ukpga/1984/27/section/22B

will return the section marked as being prospective, as will specifically requesting the prospective version with

http://www.legislation.gov.uk/ukpga/1984/27/section/22B/prospective

Explanatory Notes
Explanatory Notes provide accessible information to readers who are not legally qualified and who have no specialised knowledge of the matters dealt with by the enacting legislation. They are intended to allow the reader to grasp what the Act sets out to achieve and place its effect in context. Using /notes at the end of a document URI provides the Explanatory Notes for that specific legislation. For example, the Explanatory Notes for the Communications Act 2003 would be:
http://www.legislation.gov.uk/ukpga/2003/21/notes

The Explanatory Notes for a specific section within that legislation, for example section 50 of the Communications Act 2003, would be:

http://www.legislation.gov.uk/ukpga/2003/21/section/50/notes

If there are no Explanatory Notes, or no Explanatory Note exists for a specific section, then the URI will return a 404 Not Found response.

Representation URIs
Each document is available in multiple formats. The URI for a particular format follows the template:
http://www.legislation.gov.uk/{type}/{year}/{number}[/{section}][/{authority}][/{extent}][/{version}]/data.ext

for legislation and

http://www.legislation.gov.uk/{type}/{year}/{number}[/{section}][/{notes}]/data.ext

for Explanatory Notes, where ext is the extension for the particular format.

The format provided as the result of a particular request on a document URI is determined through content negotiation based on the mime types used in the Accept header used by the client. Available formats, their mime types and their extensions are listed on the formats page.
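
There are therefore two ways to ask for a format: name the representation URI directly, or request the document URI with an Accept header. Both are sketched below in Python. The helper names are hypothetical, and the pairing of the mime type application/xml with the xml extension is an assumption for illustration; the formats page is the place to check the real list. The request object is only built here, not sent.

```python
import urllib.request


def representation_uri(document_uri, ext):
    """Append /data.{ext} to a document URI."""
    return f"{document_uri.rstrip('/')}/data.{ext}"


def negotiated_request(document_uri, mime_type):
    """Build (but do not send) a request that asks for one mime type
    through content negotiation."""
    return urllib.request.Request(document_uri, headers={"Accept": mime_type})


doc = "http://www.legislation.gov.uk/ukpga/1985/67"
xml_uri = representation_uri(doc, "xml")
# Assumed mime type; passing this request to urllib.request.urlopen
# would perform the actual HTTP exchange.
request = negotiated_request(doc, "application/xml")
```

Linking to the document URI and letting the client negotiate keeps links format-neutral, while the representation URI is useful when one specific format is required.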


DCMI Abstract Model

Namespaces in XML, W3C Recommendation, 14 January 1999

IETF (Internet Engineering Task Force) RFC 3986: Uniform Resource Identifiers (URI): Generic Syntax, T. Berners-Lee, R. Fielding, L. Masinter. January 2005.

DCMI Encoding Guidelines

Namespace Policy for the Dublin Core Metadata Initiative (DCMI), 26 October 2001

Uniform Resource Identifier (URI) SCHEMES


RDF Site Summary 1.0

The “info” URI Scheme for Information Assets with Identifiers in Public Namespaces, 9 July 2004

Dewey Decimal Classification

“info” URI registry


FOAF Vocabulary Specification

R. T. Fielding, I. Jacobs, editors. “Authoritative Metadata”; World Wide Web Consortium. TAG Finding. April 2006. (See http://www.w3.org/2001/tag/doc/mime-respect.)
I. Jacobs, N. Walsh, editors. Architecture of the World Wide Web. World Wide Web Consortium. December, 2004. (See http://www.w3.org/TR/webarch/.)
R. Fielding, J. Gettys, J. Mogul, H. Frystyk, P. Leach, L. Masinter, T. Berners-Lee. “Hypertext Transfer Protocol – HTTP/1.1”. IETF RFC 2616. June 1999. (See http://www.ietf.org/rfc/rfc2616.)
D. Raggett, A. Le Hors, I. Jacobs, editors. HTML 4.01 Specification (Forms Chapter). World Wide Web Consortium. December 1999. (See http://www.w3.org/TR/html4/interact/forms.html.)
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. IETF RFC 2119. March, 1997. (See http://www.ietf.org/rfc/rfc2119.txt.)
T. Berners-Lee, R. Fielding, L. Masinter. Uniform Resource Identifier (URI): Generic Syntax. IETF RFC 3986. January 2005. (See http://www.ietf.org/rfc/rfc3986.)
J.M. Boyer, D. Landwehr, R. Merrick, T. V. Raman, M. Dubinko, L. Klotz, editors. XForms 1.0. World Wide Web Consortium 2006 (2nd Edition). (See http://www.w3.org/TR/xforms/.)
Architecture of the World Wide Web, Volume One, Ian Jacobs, Norman Walsh, Editors. World Wide Web Consortium, 15 December 2004. This edition is http://www.w3.org/TR/2004/REC-webarch-20041215/. The latest edition is available at http://www.w3.org/TR/webarch/.
Apache HTTP Server Version 2.0 Documentation, Chapter Content Negotiation. This document is available at http://httpd.apache.org/docs/2.0/content-negotiation.html.
Four Uses of a URL: Name, Concept, Web Location and Document Instance, David Booth. 28 January 2003. This document is available at http://www.w3.org/2002/11/dbooth-names/dbooth-names_clean.htm.
Common HTTP Implementation Problems, Olivier Théreaux, Editor. World Wide Web Consortium, 28 January 2003. This edition is http://www.w3.org/TR/2003/NOTE-chips-20030128/. The latest edition is available at http://www.w3.org/TR/chips/.
Cool URIs don’t change, Tim Berners-Lee, 1998. This document is available at http://www.w3.org/Provider/Style/URI.
The DataPortability Project. http://dataportability.org/
ECS URI System Specification, Colin Williams, Nick Gibbins. ECS Southampton, 2006. This document is available at http://id.ecs.soton.ac.uk/docs/.
FOAF Vocabulary Specification 0.9, Dan Brickley, Libby Miller. 24 May 2007. This edition is http://xmlns.com/foaf/spec/20070524.html. The latest edition is available at http://xmlns.com/foaf/spec/.
Give Us the Data Raw, and Give it to Us Now. Rufus Pollock. 7th November 2007.
Generic Resources, Tim Berners-Lee. This document is available at http://www.w3.org/DesignIssues/Generic.html.
Gleaning Resource Descriptions from Dialects of Languages (GRDDL), Dan Connolly, Editor, W3C Recommendation 11 September 2007. This edition is http://www.w3.org/TR/2007/REC-grddl-20070911/. The latest edition is available at http://www.w3.org/TR/grddl/.
What HTTP URIs Identify, Tim Berners-Lee. 9 June 2005. This document is available at http://www.w3.org/DesignIssues/HTTP-URI2.html.
[httpRange-14] Resolved, Roy Fielding. 18 June 2005. This archived www-tag email message is available at http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html.
RFC2616, Hypertext Transfer Protocol — HTTP/1.1, http://www.rfc.net/rfc2616.html#s14.1
Notation 3, Tim Berners-Lee, Dan Connolly, 2008. This document is available at http://www.w3.org/TeamSubmission/n3/.
Primer: Getting into RDF & Semantic Web using N3. Tim Berners-Lee, 2005. http://www.w3.org/2000/10/swap/Primer
RDFa Primer 1.0 – Embedding Structured Data in Web Pages. (See http://www.w3.org/2006/07/SWD/RDFa/primer/.)
RDF Primer, Frank Manola, Eric Miller, Editors. World Wide Web Consortium, 10 February 2004. This edition is http://www.w3.org/TR/2004/REC-rdf-primer-20040210/. The latest edition is available at http://www.w3.org/TR/rdf-primer/.
RDF/XML Syntax Specification (Revised), Dave Beckett, Editor. World Wide Web Consortium, 10 February 2004. This edition is http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/. The latest edition is available at http://www.w3.org/TR/rdf-syntax-grammar/.
Best Practice Recipes for Publishing RDF Vocabularies, Alistair Miles, Thomas Baker, Ralph Swick, Editors. World Wide Web Consortium, 23 January 2008. This edition is http://www.w3.org/TR/2008/WD-swbp-vocab-pub-20080123/. It is a work in progress. The latest edition is available at http://www.w3.org/TR/swbp-vocab-pub/.
RFC 2616: Hypertext Transfer Protocol – HTTP/1.1, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, T. Berners-Lee. IETF, 1999. This document is available at http://www.ietf.org/rfc/rfc2616.txt.
RFC 3986: Uniform Resource Identifier (URI): Generic Syntax, T. Berners-Lee, R. Fielding, L. Masinter. IETF, 2005. This document is available at http://www.ietf.org/rfc/rfc3986.txt.
Semantic Wikipedia, Max Völkel, Markus Krötzsch, Denny Vrandecic, Heiko Haller, Rudi Studer. University of Karlsruhe, 2006. This document is available at http://www.aifb.uni-karlsruhe.de/WBS/hha/papers/SemanticWikipedia.pdf.
On Linking Alternative Representations To Enable Discovery And Publishing, T.V. Raman. World Wide Web Consortium, 1 November 2006. This edition is http://www.w3.org/2001/tag/doc/alternatives-discovery-20061101.html. The latest edition is available at http://www.w3.org/2001/tag/doc/alternatives-discovery.html.
URNs, Namespaces and Registries, Henry S. Thompson, David Orchard. World Wide Web Consortium, 17 August 2006. This edition is http://www.w3.org/2001/tag/doc/URNsAndRegistries-50-2006-08-17.html. It is a work in progress. The latest edition is available at http://www.w3.org/2001/tag/doc/URNsAndRegistries-50.html.
RDF Triples in XML, Jeremy J. Carroll, Patrick Stickler, 2004. This document is available at http://www.mulberrytech.com/Extreme/Proceedings/html/2004/Stickler01/EML2004Stickler01.html.
Hypertext Transfer Protocol, Wikipedia contributors. Wikipedia, 8 October 2007. The latest version of this document is available at http://en.wikipedia.org/wiki/HTTP.

