http://www.zorba-xquery.com/modules/http-client

Description

Before using any of the functions below please remember to import the module namespace:

import module namespace http = "http://www.zorba-xquery.com/modules/http-client";

Introduction

This module provides simple functions for performing HTTP requests (GET, POST, DELETE, etc.), as well as a more flexible general-purpose function (send-request()).

Examples of how to use this module

Simple GET Request

 import module namespace http="http://www.zorba-xquery.com/modules/http-client";
 declare namespace svg="http://www.w3.org/2000/svg";
 http:get("http://www.w3.org/Graphics/SVG/svglogo.svg")[2]/svg:svg/svg:title
 

This example downloads an XML resource from the web (in this case, an SVG file, which is an XML-based image format) and returns it as a document node. The first item returned by http:get() describes the HTTP response, so the [2] predicate selects the body. Since the XML is in a namespace, we declare that namespace; we can then apply a path expression directly to that item.

Simple GET Request (retrieving XHTML)

   import module namespace http="http://www.zorba-xquery.com/modules/http-client";
   declare namespace xhtml="http://www.w3.org/1999/xhtml";
   http:get-node( "http://www.w3.org" )[2]//xhtml:body
   

This example shows how to retrieve an XHTML resource. XHTML is XML, so the http:get-node() function will return it as a document node and you can operate on it with the full power of XQuery. As above, since this XML is in a particular namespace, the above query defines that namespace with the prefix "xhtml" so it can easily perform path expressions, etc.

Note: many web servers, including www.w3.org, return XHTML with the HTTP Content-Type "text/html". Zorba cannot assume that "text/html" is actually XHTML, and so http:get() would have returned raw text rather than a document node. That is why the example above uses http:get-node(), which overrides the server's Content-Type and tells Zorba to attempt to parse the result as XML.

Simple GET Request (retrieving HTML as text)

Note that XQuery does not understand plain HTML, and so if the URL you retrieve contains plain HTML data (not XHTML), it will be treated as plain text as shown in the next example. If you want to operate on the HTML with XQuery, you should use the HTML language module which can transform HTML to XHTML. The HTML module is supported by the Zorba team, but it is not a "core module", meaning that it is not shipped with every Zorba installation and may not be available. See the Zorba downloads page for information about obtaining this module if you do not have it.

 import module namespace http="http://www.zorba-xquery.com/modules/http-client";
 http:get("http://www.example.com")[2]
 
returns
   <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
   <html>
     <head>
       <meta http-equiv="Content-Type"
       content="text/html; charset=utf-8" />
       <title>Example Web Page</title>
     </head>
     <body>
       <p>You have reached this web page by typing "example.com",
       "example.net", or "example.org" into your web browser.</p>
       <p>These domain names are reserved for use in documentation and are
       Not available for registration. See
       <a href="http://www.rfc-editor.org/rfc/rfc2606.txt">RFC 2606</a>,
       Section 3.</p>
     </body>
   </html>
   

Note that the response data above is a simple xs:string value containing the HTML data, not actual XML data. If you executed the above query using the Zorba command-line client, you would have actually seen data like the following:

   &lt;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"&gt;
   &lt;html&gt;
      ...
 

because Zorba would attempt to serialize it as XML data, and would escape all the raw angle brackets in the original xs:string.

Simple POST Request

Here is a simple example which sends text content by making an HTTP POST request.

 import module namespace http="http://www.zorba-xquery.com/modules/http-client";
 http:post( "...", "Hello World" )
 

Return Values

Most functions in this module (all except options()) return one or more items. (head() returns exactly one.) For all of these, the first item returned will be a <http-schema:response> element, which is why the examples above select the body with the [2] predicate. This element has "status" and "message" attributes, representing the result of the HTTP call. It also has any number of <http-schema:header> child elements that encode the HTTP headers returned by the HTTP server. Finally, it will generally contain a <http-schema:body> child element with a "media-type" attribute that identifies the content-type of the result data.

The full schema of this <http-schema:response> element is part of the EXPath HTTP Client module and is published with its specification.
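
As a minimal sketch of reading that first item, the query below pulls out the status, message, headers, and body media type. It assumes that each <http-schema:header> element carries "name" and "value" attributes, as in the EXPath schema; the URL is only a placeholder.

```xquery
import module namespace http = "http://www.zorba-xquery.com/modules/http-client";
declare namespace http-schema = "http://expath.org/ns/http-client";

(: The first item of the result is the <http-schema:response> element. :)
let $response := http:get("http://www.example.com")[1]
return (
  (: status code and message of the HTTP call :)
  string($response/@status),
  string($response/@message),
  (: one "name: value" string per response header
     (attribute names assumed from the EXPath schema) :)
  for $header in $response/http-schema:header
  return concat($header/@name, ": ", $header/@value),
  (: content type reported for the body :)
  string($response/http-schema:body/@media-type)
)
```

Declaring the http-schema prefix is enough for these path expressions; importing the schema itself is only needed for validation or schema-aware typing.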

Any items in function return values after the initial <http-schema:response> element are the body/bodies of the HTTP response from the server. (MIME Multi-part responses will have more than one body.) The type of these items depends on the Content-Type for each body. Each item will be:

  • an element node, if the returned content type is one of:
    • text/xml
    • application/xml
    • text/xml-external-parsed-entity
    • application/xml-external-parsed-entity
    • or if the Content-Type ends with "+xml".
  • an xs:string, if the returned content type starts with "text/" and does not match the above XML content types, or if it is one of:
    • "application/json"
    • "application/x-javascript"
  • xs:base64Binary for all other content types.

This return value - a sequence of items comprising one <http-schema:response> element followed by zero or more response items - is referred to as the "standard http-client return type" in the function declarations below.
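
Because the type of each body depends on the server's Content-Type, a query may want to branch on it. The following is a sketch only: it skips the leading response element with subsequence() and classifies every remaining item. The node() case is deliberately broad so that it matches any XML node; the URL is a placeholder.

```xquery
import module namespace http = "http://www.zorba-xquery.com/modules/http-client";

(: Skip the leading response element; the remaining items are the bodies. :)
for $body in subsequence(http:get("http://www.example.com"), 2)
return
  typeswitch ($body)
    case node()          return "XML node"
    case xs:string       return "text"
    case xs:base64Binary return "binary data"
    default              return "unexpected item type"
```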

$href Arguments to Functions

All functions in this module accept a URL argument named $href. In all cases, the value passed to $href must be a valid xs:anyURI. However, all functions declare $href to be of type xs:string. This is for convenience, since you can pass a string literal value (that is, a URL in double-quotes spelled out explicitly in your query) to an xs:string parameter.

Important Notice Regarding get() Functions

All of the get() functions in this module - get(), get-node(), get-text(), and get-binary() - are declared to be nondeterministic, which means that Zorba will not cache their results. However, they are not declared to be sequential, which means that Zorba may re-order them as part of its query optimization. According to the HTTP RFC, GET requests should only return data, and should not have any side-effects. However, in practice it is not uncommon for GET requests to have side-effects. If your application depends on the ordering of side-effects from making GET requests, you should either use the more complex send-request() function (which is declared sequential), or alternatively wrap each call to get() in your own sequential function, to ensure that Zorba does not place the GET requests out of order.
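
The wrapping approach could look roughly like the sketch below. The function name local:get-in-order and both URLs are hypothetical, and the sketch assumes the "an" annotation namespace listed under Namespaces on this page.

```xquery
import module namespace http = "http://www.zorba-xquery.com/modules/http-client";
declare namespace an = "http://zorba.io/annotations";

(: Hypothetical wrapper: declaring it sequential asks Zorba to
   keep calls to it in the order in which they are written. :)
declare %an:sequential function local:get-in-order($href as xs:string) as item()+
{
  http:get($href)
};

(: Both URLs are placeholders for requests with side-effects. :)
local:get-in-order("http://www.example.com/first"),
local:get-in-order("http://www.example.com/second")
```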

Relation to the EXPath http-client module

EXPath defines its own http-client module, which is available separately for Zorba as a non-core module. There are two primary differences between EXPath's http-client and Zorba's core http-client (this module):
  1. EXPath defines only the send-request() function, although it does include convenient 1- and 2-argument forms in addition to the full 3-argument form. EXPath does not include the simpler get(), post(), put(), delete(), head(), and options() functions defined by this module.
  2. EXPath specifies that all HTML content returned from the HTTP server will be tidied up into valid XML, and then parsed into an element. As this would require an additional third-party library dependency, Zorba's http-client module does not perform this tidying. Instead, HTML content is returned as a string (whose special XML characters are replaced with XML entity references when the result is serialized as XML, as shown in the above examples).
See the full spec of the EXPath http-client module for more information.

Module code

Here is the actual XQuery module code.

Imported Schemas

Please note that the schemas are not automatically imported in the modules that import this module.

In order to import and use the schemas, please add:

import schema namespace http-schema =  "http://expath.org/ns/http-client";

Authors

Markus Pilman, Federico Cavalieri

Version Declaration

xquery version "3.0" encoding "utf-8";

Namespaces

an            http://zorba.io/annotations
err           http://www.w3.org/2005/xqt-errors
error         http://expath.org/ns/error
http          http://www.zorba-xquery.com/modules/http-client
http-schema   http://expath.org/ns/http-client
http-wrapper  http://zorba.io/modules/http-client-wrapper
jn            http://jsoniq.org/functions
json-http     http://zorba.io/modules/http-client
libjn         http://jsoniq.org/function-library
ser           http://www.w3.org/2010/xslt-xquery-serialization
ver           http://zorba.io/options/versioning

Function Summary

send-request($request as element(http-schema:request)?, $href as xs:string?, $bodies as item()*) as item()+

This function sends an HTTP request and returns the corresponding response.

get($href as xs:string) as item()+

This function makes a GET request to a given URL.

get-node($href as xs:string) as item()+

This function makes a GET request to a given URL.

get-text($href as xs:string) as item()+

This function makes a GET request to a given URL.

get-binary($href as xs:string) as item()+

This function makes a GET request to a given URL.

head($href as xs:string) as item()

This function makes an HTTP HEAD request to a given URL.

options($href as xs:string) as xs:string*

This function makes an HTTP OPTIONS request, which asks the server which operations it supports.

put($href as xs:string, $body as item()) as item()+

This function makes an HTTP PUT request to a given URL.

put($href as xs:string, $body as item(), $content-type as xs:string) as item()+

This function makes an HTTP PUT request to a given URL.

delete($href as xs:string) as item()+

This function makes an HTTP DELETE request to a given URL.

post($href as xs:string, $body as item()) as item()+

This function makes an HTTP POST request to a given URL.

post($href as xs:string, $body as item(), $content-type as xs:string) as item()+

This function makes an HTTP POST request to a given URL.

Functions

send-request#3

declare %an:sequential function http:send-request(
    $request as element(http-schema:request)?,
    $href as xs:string?,
    $bodies as item()*
) as item()+

This function sends an HTTP request and returns the corresponding response. Its inputs, outputs, and behavior are identical to the EXPath http-client's send-request() function (except that HTML responses are not tidied into XML - see the note above). It is provided here for use in Zorba installations that do not have the EXPath module available. If you have the option of using the EXPath module instead of this function, please do so, as it will allow your application to be more interoperable between different XQuery engines. Full documentation of the $request parameter can be found in the EXPath specification.

Parameters

  • $request

    Contains the various parameters of the request (see above).

  • $href

    The URL to which the request will be made (see note above). If this parameter is specified, it will override the "href" attribute of $request.

  • $bodies

    The request body content, for HTTP methods that can contain a body in the request (such as POST and PUT). It is an error if this parameter is not the empty sequence for methods that cannot contain a request body.

Returns

  • item()+

    standard http-client return type.

Errors

  • error:HC001

    An HTTP error occurred.

  • error:HC002

    Error parsing the response content as XML.

  • error:HC003

    With a multipart response, the override-media-type must be either a multipart media type or application/octet-stream.

  • error:HC004

    The src attribute on the body element is mutually exclusive with all other attributes (except media-type).

  • error:HC005

    The input request element is not valid.

  • error:HC006

    A timeout occurred waiting for the response.

  • error:HCV02

    Trying to follow a redirect of a POST, PUT, or DELETE request.

Examples
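
The query below is a hedged sketch of a GET request with one extra request header. The "href" attribute is the one described under $href above; the "method" attribute and the <http-schema:header> child with "name" and "value" attributes are assumptions taken from the EXPath request schema. The URL is a placeholder.

```xquery
import module namespace http = "http://www.zorba-xquery.com/modules/http-client";
declare namespace http-schema = "http://expath.org/ns/http-client";

http:send-request(
  (: request description; attribute and element names follow the EXPath schema :)
  <http-schema:request method="GET" href="http://www.example.com">
    <http-schema:header name="Accept" value="text/html"/>
  </http-schema:request>,
  (),  (: no $href override, so the href attribute above is used :)
  ()   (: a GET request carries no body :)
)
```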

get#1

declare %an:nondeterministic function http:get(
    $href as xs:string
) as item()+

This function makes a GET request to a given URL.

Parameters

  • $href

    The URL to which the request will be made (see note above).

Returns

  • item()+

    standard http-client return type.

Errors

  • error:HC001

    An HTTP error occurred.

  • error:HC002

    Error parsing the response content as XML.

  • error:HC006

    A timeout occurred waiting for the response.

Examples
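
As a sketch of handling the errors listed above, the query below wraps the call in an XQuery 3.0 try/catch. It assumes the "error" and "err" namespace bindings shown under Namespaces on this page; the messages and the URL are placeholders.

```xquery
xquery version "3.0";

import module namespace http = "http://www.zorba-xquery.com/modules/http-client";
declare namespace error = "http://expath.org/ns/error";
declare namespace err = "http://www.w3.org/2005/xqt-errors";

try {
  (: item 2 is the first response body :)
  http:get("http://www.example.com")[2]
} catch error:HC006 {
  "The request timed out."
} catch error:HC001 | error:HC002 {
  (: $err:description is bound by the catch clause :)
  concat("Request failed: ", $err:description)
}
```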

get-node#1

declare %an:nondeterministic function http:get-node(
    $href as xs:string
) as item()+

This function makes a GET request to a given URL. All returned bodies are forced to be interpreted as XML and parsed into elements.

Parameters

  • $href

    The URL to which the request will be made (see note above).

Returns

  • item()+

    standard http-client return type.

Errors

  • error:HC001

    An HTTP error occurred.

  • error:HC002

    Error parsing the response content as XML.

  • error:HC006

    A timeout occurred waiting for the response.

Examples

get-text#1

declare %an:nondeterministic function http:get-text(
    $href as xs:string
) as item()+

This function makes a GET request to a given URL. All returned bodies are forced to be interpreted as plain strings, and will be returned as xs:string items.

Parameters

  • $href

    The URL to which the request will be made (see note above).

Returns

  • item()+

    standard http-client return type.

Errors

  • error:HC001

    An HTTP error occurred.

  • error:HC002

    Error parsing the response content as XML.

  • error:HC006

    A timeout occurred waiting for the response.

Examples
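
A small sketch: fetching the SVG file from the introduction as text instead of XML, then returning only its first 80 characters. The choice of resource and the cut-off length are illustrative only.

```xquery
import module namespace http = "http://www.zorba-xquery.com/modules/http-client";

(: Item 2 is the first body; get-text() returns it as an xs:string
   even though the server labels the resource as XML. :)
let $text := http:get-text("http://www.w3.org/Graphics/SVG/svglogo.svg")[2]
return substring($text, 1, 80)
```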

get-binary#1

declare %an:nondeterministic function http:get-binary(
    $href as xs:string
) as item()+

This function makes a GET request to a given URL. All returned bodies are forced to be interpreted as binary data, and will be returned as xs:base64Binary items.

Parameters

  • $href

    The URL to which the request will be made (see note above).

Returns

  • item()+

    standard http-client return type.

Errors

  • error:HC001

    An HTTP error occurred.

  • error:HC002

    Error parsing the response content as XML.

  • error:HC006

    A timeout occurred waiting for the response.

Examples
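
A minimal sketch that fetches the same SVG file as binary data and checks the type of the body item, which according to the description above should be xs:base64Binary. The choice of resource is illustrative only.

```xquery
import module namespace http = "http://www.zorba-xquery.com/modules/http-client";

(: Item 2 is the first body of the response. :)
let $data := http:get-binary("http://www.w3.org/Graphics/SVG/svglogo.svg")[2]
return $data instance of xs:base64Binary
```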

head#1

declare %an:nondeterministic function http:head(
    $href as xs:string
) as item()

This function makes an HTTP HEAD request to a given URL.

Parameters

  • $href

    The URL to which the request will be made (see note above).

Returns

  • item()

    standard http-client return type (since HEAD never returns any body data, only the <http-schema:response> element will be returned).

Errors

  • error:HC001

    An HTTP error occurred.

  • error:HC006

    A timeout occurred waiting for the response.

Examples
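
The sketch below reads the status and the Content-Type header from the single item that head() returns. It assumes "name" and "value" attributes on <http-schema:header>, as in the EXPath schema, and compares header names case-insensitively; the URL is a placeholder.

```xquery
import module namespace http = "http://www.zorba-xquery.com/modules/http-client";
declare namespace http-schema = "http://expath.org/ns/http-client";

(: head() returns only the <http-schema:response> element. :)
let $response := http:head("http://www.example.com")
return (
  string($response/@status),
  for $header in $response/http-schema:header
  where lower-case($header/@name) = "content-type"
  return string($header/@value)
)
```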

options#1

declare %an:nondeterministic function http:options(
    $href as xs:string
) as xs:string*

This function makes an HTTP OPTIONS request, which asks the server which operations it supports.

Parameters

  • $href

    The URL to which the request will be made (see note above).

Returns

  • xs:string*

    A sequence of xs:string values of the allowed operations.

Errors

  • error:HC001

    An HTTP error occurred.

  • error:HC006

    A timeout occurred waiting for the response.

Examples
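
A hedged sketch that checks whether a server lists PUT among its allowed operations. It assumes that the returned strings are HTTP method names and upper-cases them before comparing, since their exact spelling is not specified here; the URL is a placeholder.

```xquery
import module namespace http = "http://www.zorba-xquery.com/modules/http-client";

let $allowed := http:options("http://www.example.com")
return
  (: general comparison: true if any of the returned names equals "PUT" :)
  if ((for $method in $allowed return upper-case($method)) = "PUT")
  then "PUT is allowed"
  else "PUT is not listed"
```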

put#2

declare %an:sequential function http:put(
    $href as xs:string,
    $body as item()
) as item()+

This function makes an HTTP PUT request to a given URL. If the body passed to this function is an element, it will be serialized to XML to be sent to the server, and the Content-Type sent to the server will be "text/xml". Otherwise, the body will be converted to a plain string, and the Content-Type will be "text/plain".

Parameters

  • $href

    The URL to which the request will be made (see note above).

  • $body

    The body which will be sent to the server.

Returns

  • item()+

    standard http-client return type.

Errors

  • error:HC001

    An HTTP error occurred.

  • error:HC002

    Error parsing the response content as XML.

  • error:HC006

    A timeout occurred waiting for the response.

  • error:HCV02

    Trying to follow a redirect of a PUT request.

Examples
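
A minimal sketch of the two-argument form with an element body. According to the description above, the element is serialized to XML and sent with the Content-Type "text/xml". The URL and the <greeting> element are hypothetical.

```xquery
import module namespace http = "http://www.zorba-xquery.com/modules/http-client";

http:put(
  "http://www.example.com/greeting",  (: hypothetical target URL :)
  <greeting>Hello World</greeting>    (: element body, serialized to XML :)
)
```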

put#3

declare %an:sequential function http:put(
    $href as xs:string,
    $body as item(),
    $content-type as xs:string
) as item()+

This function makes an HTTP PUT request to a given URL. If the body passed to this function is an element, it will be serialized according to the $content-type parameter as follows:

  • If $content-type is "text/xml", "application/xml", "text/xml-external-parsed-entity", or "application/xml-external-parsed-entity", or if it ends with "+xml", $body will be serialized to XML.
  • If $content-type starts with "text/html", $body will be serialized to HTML.
  • Otherwise, $body will be serialized to text.

If $body is not an element, $body will be serialized to text regardless of $content-type.

In any case, the Content-Type of the request sent to the server will be $content-type.

Parameters

  • $href

    The URL to which the request will be made (see note above).

  • $body

    The body which will be sent to the server.

  • $content-type

    The content type of $body as described above.

Returns

  • item()+

    standard http-client return type.

Errors

  • error:HC001

    An HTTP error occurred.

  • error:HC002

    Error parsing the response content as XML.

  • error:HC006

    A timeout occurred waiting for the response.

  • error:HCV02

    Trying to follow a redirect of a PUT request.

Examples

delete#1

declare %an:sequential function http:delete(
    $href as xs:string
) as item()+

This function makes an HTTP DELETE request to a given URL.

Parameters

  • $href

    The URL to which the request will be made (see note above).

Returns

  • item()+

    standard http-client return type.

Errors

  • error:HC001

    An HTTP error occurred.

  • error:HC002

    Error parsing the response content as XML.

  • error:HC006

    A timeout occurred waiting for the response.

  • error:HCV02

    Trying to follow a redirect of a DELETE request.

Examples

post#2

declare %an:sequential function http:post(
    $href as xs:string,
    $body as item()
) as item()+

This function makes an HTTP POST request to a given URL. If the body passed to this function is an element, it will be serialized to XML to be sent to the server, and the Content-Type sent to the server will be "text/xml". Otherwise, the body will be converted to a plain string, and the Content-Type will be "text/plain".

Parameters

  • $href

    The URL to which the request will be made (see note above).

  • $body

    The body which will be sent to the server.

Returns

  • item()+

    standard http-client return type.

Errors

  • error:HC001

    An HTTP error occurred.

  • error:HC002

    Error parsing the response content as XML.

  • error:HC006

    A timeout occurred waiting for the response.

  • error:HCV02

    Trying to follow a redirect of a POST request.

Examples

post#3

declare %an:sequential function http:post(
    $href as xs:string,
    $body as item(),
    $content-type as xs:string
) as item()+

This function makes an HTTP POST request to a given URL. If the body passed to this function is an element, it will be serialized according to the $content-type parameter as follows:

  • If $content-type is "text/xml", "application/xml", "text/xml-external-parsed-entity", or "application/xml-external-parsed-entity", or if it ends with "+xml", $body will be serialized to XML.
  • If $content-type starts with "text/html", $body will be serialized to HTML.
  • Otherwise, $body will be serialized to text.

If $body is not an element, $body will be serialized to text regardless of $content-type.

In any case, the Content-Type of the request sent to the server will be $content-type.

Parameters

  • $href

    The URL to which the request will be made (see note above).

  • $body

    The body which will be sent to the server.

  • $content-type

    The content type of the body as described above.

Returns

  • item()+

    standard http-client return type: the first item is the response metadata (status, headers, etc.), and the following items are the response bodies.

Errors

  • error:HC001

    An HTTP error occurred.

  • error:HC002

    Error parsing the response content as XML.

  • error:HC006

    A timeout occurred waiting for the response.

  • error:HCV02

    Trying to follow a redirect of a POST request.

Examples
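
A sketch of the three-argument form with an element body and a content type ending in "+xml", which by the rules above causes the element to be serialized to XML. The URL, the <note> element, and the media type are all hypothetical.

```xquery
import module namespace http = "http://www.zorba-xquery.com/modules/http-client";

http:post(
  "http://www.example.com/notes",                      (: hypothetical target URL :)
  <note><to>Zorba</to><body>Hello World</body></note>, (: element body :)
  "application/vnd.example.note+xml"                   (: ends with "+xml" :)
)
```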