Schaake.nu

XSLT URL decoder

Posted in /XSLT stylesheets/Software on 26 June 2013
This blog was written by Christiaan Schaake

Introduction

A URL contains the full path to a specific piece of information or to the execution of a service. In some cases it is necessary to extract the individual parts of a URL in an XSLT stylesheet. This XSLT routine takes any URL and decodes it into its individual parts.

URL syntax

A URL contains a number of parameters to address a specific endpoint. A complete URL looks like this:

<schema>://<user>:<password>@<server>:<port>/<path>?<query>#<fragment>

Not all parameters of the URL are required. Some can be left blank and need to be read as null or empty. Others have a default value; if they are not provided, they fall back to their default value.

Required parameters:

  • server – IP address or hostname of the addressed location. If a schema is specified, the server prefix or delimiter is ‘://’.

Optional parameters with default values:

  • schema – Protocol schema used, defaults to HTTP.
  • port – Specific port number to be used; if not provided, use the default port of the schema*

Optional parameters:

  • user and password – For authentication. The password may be blank to specify that no password is provided. If both user and password are blank, the whole authentication section with <user>:<password>@ must be removed.
  • path - Service path to addressed location
  • query – Set of parameters required to access or address the requested resource. The question mark may only be provided when an actual query is provided.
  • fragment – Address a specific location within the request resource, e.g. a chapter on a webpage. The hash sign may only be provided when an actual location is provided.

*) Schema HTTP uses default port 80, and schema HTTPS uses default port 443.

Determine schema

The schema is optional, so we need to detect whether it is actually provided. If not, we use the default schema HTTP.
First check whether the delimiter ‘://’ exists. If it does, the schema is the part before the delimiter. If the delimiter is not found, we must assume the default HTTP schema.

<xsl:choose>
  <xsl:when test="contains($url,'://')">
    <schema><xsl:value-of select="substring-before($url,'://')"/></schema>
  </xsl:when>
  <xsl:otherwise>
    <schema><xsl:text>HTTP</xsl:text></schema>
  </xsl:otherwise>
</xsl:choose>
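
The snippets in this post refer to a variable $url that is never declared. As a minimal sketch of how they might be hosted, assuming a named template called decode-url and a wrapper element url (both names are illustrative and not taken from the original stylesheet), the schema step could be wrapped like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical entry point: decode a sample URL for any input document -->
  <xsl:template match="/">
    <xsl:call-template name="decode-url">
      <xsl:with-param name="url" select="'https://example.com/index.html'"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical named template; $url is the string to decode -->
  <xsl:template name="decode-url">
    <xsl:param name="url"/>
    <url>
      <xsl:choose>
        <xsl:when test="contains($url,'://')">
          <schema><xsl:value-of select="substring-before($url,'://')"/></schema>
        </xsl:when>
        <xsl:otherwise>
          <schema><xsl:text>HTTP</xsl:text></schema>
        </xsl:otherwise>
      </xsl:choose>
      <!-- the remaining snippets would follow here -->
    </url>
  </xsl:template>

</xsl:stylesheet>
```

For the sample URL this should produce a url element containing a schema element with the value https. The other snippets can be placed inside the same template, but in XSLT 1.0 a variable such as serverstart may then be declared only once per template.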

Determine user and password

First we need to verify whether a user and possibly a password are provided. The delimiter for the username/password in the URL is the at-sign. If this character exists, the username and password parameters are located between the schema delimiter (or the beginning of the URL) and the at-sign.
The username is the part before the colon, the password the part after the colon. If there is no colon, the whole part is the username and the password is empty.

<xsl:if test="contains($url,'@')">
  <!-- Everything between the optional '://' and the '@' -->
  <xsl:variable name="userpass">
    <xsl:choose>
      <xsl:when test="contains($url,'://')">
        <xsl:value-of select="substring-after(substring-before($url,'@'),'://')"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="substring-before($url,'@')"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:variable>
  <user>
    <xsl:choose>
      <xsl:when test="contains($userpass,':')">
        <xsl:value-of select="substring-before($userpass,':')"/>
      </xsl:when>
      <!-- No colon: substring-before would return an empty string, so use the whole part -->
      <xsl:otherwise>
        <xsl:value-of select="$userpass"/>
      </xsl:otherwise>
    </xsl:choose>
  </user>
  <password>
    <xsl:value-of select="substring-after($userpass,':')"/>
  </password>
</xsl:if>

Determine Server

The server is located after the optional schema and the optional username/password, and before the optional port, path, query and fragment. So we need to check a few things before we can find the server.

First find the start point of the server by checking:

  1. If schema exists, delimiter is ‘://’
  2. If username and optionally password exist, delimiter is ‘@’

Because all parameters come in a fixed order, we can check in reverse: if the at-sign is found, the server starts after it and the schema delimiter no longer needs to be checked. This makes the programming a lot simpler.

Next find the endpoint of the server by checking:

  1. If the port number exists, delimiter is ‘:’
  2. If the path exists, delimiter is ‘/’
  3. If the query exists, delimiter is ‘?’
  4. If the fragment exists, delimiter is ‘#’

If one of the delimiters is found, we do not need to search further. We can simply cut off the rest of the URL to get the server parameter.

<xsl:variable name="serverstart">
  <xsl:choose>
    <xsl:when test="contains($url,'@')">
      <xsl:value-of select="substring-after($url,'@')"/>
    </xsl:when>
    <xsl:when test="contains($url,'://')">
      <xsl:value-of select="substring-after($url,'://')"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$url"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:variable>
<xsl:choose>
  <xsl:when test="contains($serverstart,':')">
    <server><xsl:value-of select="substring-before($serverstart,':')"/></server>
  </xsl:when>
  <xsl:when test="contains($serverstart,'/')">
    <server><xsl:value-of select="substring-before($serverstart,'/')"/></server>
  </xsl:when>
  <xsl:when test="contains($serverstart,'?')">
    <server><xsl:value-of select="substring-before($serverstart,'?')"/></server>
  </xsl:when>
  <xsl:when test="contains($serverstart,'#')">
    <server><xsl:value-of select="substring-before($serverstart,'#')"/></server>
  </xsl:when>
  <xsl:otherwise>
    <server><xsl:value-of select="$serverstart"/></server>
  </xsl:otherwise>
</xsl:choose>
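
The checks above test for ‘:’ first, so a colon that appears later in the path or query (for example host/a:b) would cut the server name in the wrong place. A possible alternative is sketched below. It assumes that serverstart has been computed as shown above and that the host is not a bracketed IPv6 literal. It first isolates everything before the first ‘/’, ‘?’ or ‘#’ and only then looks for the port colon. The variable name hostport and the sample value are illustrative only:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Sample value for illustration; normally computed from $url as shown earlier -->
  <xsl:param name="serverstart" select="'example.com:8080/docs/a:b?x=1#top'"/>

  <xsl:template match="/">
    <!-- Map '?' and '#' to '/', then cut at the first '/'.
         The appended '/' guarantees that substring-before finds a delimiter. -->
    <xsl:variable name="hostport"
        select="substring-before(concat(translate($serverstart,'?#','//'),'/'),'/')"/>
    <!-- The appended ':' makes substring-before return the whole string
         when no port is present. -->
    <server>
      <xsl:value-of select="substring-before(concat($hostport,':'),':')"/>
    </server>
  </xsl:template>

</xsl:stylesheet>
```

For the sample value, hostport becomes example.com:8080 and the server element contains example.com, even though the path contains a colon.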

Determine Port

The port is optional. If it is provided in the URL, it must be specified directly after the server parameter. We can reuse the code from the previous section to strip off all information before the server parameter.
If a colon still exists after that, we know the port is provided; otherwise we should use the default port.
The default port for HTTP, which is the default schema, is 80. Only schema HTTPS defaults to port 443. HTTPS can be given in capital letters or in lowercase.

<xsl:variable name="serverstart">
  <xsl:choose>
    <xsl:when test="contains($url,'@')">
      <xsl:value-of select="substring-after($url,'@')"/>
    </xsl:when>
    <xsl:when test="contains($url,'://')">
      <xsl:value-of select="substring-after($url,'://')"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$url"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:variable>
<xsl:choose>
  <!-- A colon after the start of the server means a port is provided -->
  <xsl:when test="contains($serverstart,':')">
    <xsl:choose>
      <xsl:when test="contains($serverstart,'/')">
        <port><xsl:value-of select="substring-after(substring-before($serverstart,'/'),':')"/></port>
      </xsl:when>
      <xsl:when test="contains($serverstart,'?')">
        <port><xsl:value-of select="substring-after(substring-before($serverstart,'?'),':')"/></port>
      </xsl:when>
      <xsl:when test="contains($serverstart,'#')">
        <port><xsl:value-of select="substring-after(substring-before($serverstart,'#'),':')"/></port>
      </xsl:when>
      <xsl:otherwise>
        <port><xsl:value-of select="substring-after($serverstart,':')"/></port>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:when>
  <!-- No port provided: fall back to the default port of the schema -->
  <xsl:otherwise>
    <xsl:choose>
      <!-- Test the schema itself, so that 'https' elsewhere in the URL does not match -->
      <xsl:when test="(starts-with($url,'HTTPS://') or starts-with($url,'https://'))">
        <port><xsl:text>443</xsl:text></port>
      </xsl:when>
      <xsl:otherwise>
        <port><xsl:text>80</xsl:text></port>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:otherwise>
</xsl:choose>
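
The default-port check only recognises the schema written entirely in capitals or entirely in lowercase. If mixed-case input such as Https also has to be accepted, one possible approach is to lower-case the schema first. XSLT 1.0 has no lower-case() function, so this sketch uses translate(). It covers only the case where no explicit port is given, and the variable name schema and the sample URL are illustrative assumptions:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Sample value for illustration only -->
  <xsl:param name="url" select="'Https://example.com/index.html'"/>

  <xsl:template match="/">
    <!-- Lower-case the part before '://' by mapping A-Z onto a-z -->
    <xsl:variable name="schema"
        select="translate(substring-before($url,'://'),
                'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz')"/>
    <port>
      <xsl:choose>
        <xsl:when test="$schema = 'https'">443</xsl:when>
        <!-- http, or no schema at all, falls back to 80 as described in the text -->
        <xsl:otherwise>80</xsl:otherwise>
      </xsl:choose>
    </port>
  </xsl:template>

</xsl:stylesheet>
```

For the sample URL the schema variable holds https, so the port element contains 443.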

Determine path

The path parameter always follows the server and the optional port parameter, and always starts with a slash ‘/’. Before the path, only the schema delimiter ‘://’ contains slash characters. So if we remove the schema from the URL and select everything after the next slash, we have the start of the path parameter.
The path can be followed by a query and/or fragment parameter. We need to strip these off by removing everything from the first question mark or hash sign onwards.

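<!-- Strip the optional schema, so that the first '/' left is the start of the path.
     Without this, a URL without '://' would always yield an empty path. -->
<xsl:variable name="afterschema">
  <xsl:choose>
    <xsl:when test="contains($url,'://')">
      <xsl:value-of select="substring-after($url,'://')"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$url"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:variable>
<xsl:choose>
  <xsl:when test="contains($url,'?')">
    <path><xsl:text>/</xsl:text><xsl:value-of select="substring-before(substring-after($afterschema,'/'),'?')"/></path>
  </xsl:when>
  <xsl:when test="contains($url,'#')">
    <path><xsl:text>/</xsl:text><xsl:value-of select="substring-before(substring-after($afterschema,'/'),'#')"/></path>
  </xsl:when>
  <xsl:otherwise>
    <path><xsl:text>/</xsl:text><xsl:value-of select="substring-after($afterschema,'/')"/></path>
  </xsl:otherwise>
</xsl:choose>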

Determine query

We can simply select anything after the question mark and, if a hash sign exists, before the hash sign.

<xsl:if test="contains($url,'?')">
  <xsl:choose>
    <xsl:when test="contains($url,'#')">
      <query>
        <xsl:text>?</xsl:text>
        <xsl:value-of select="substring-before(substring-after($url,'?'),'#')"/>
      </query>
    </xsl:when>
    <xsl:otherwise>
      <query>
        <xsl:text>?</xsl:text>
        <xsl:value-of select="substring-after($url,'?')"/>
      </query>
    </xsl:otherwise>
  </xsl:choose>
</xsl:if>

Determine fragment

We can simply select anything after the hash sign.

<xsl:if test="contains($url,'#')">
  <fragment>
    <xsl:text>#</xsl:text>
    <xsl:value-of select="substring-after($url,'#')"/>
  </fragment>
</xsl:if>

See also RFC 3986 (http://tools.ietf.org/html/rfc3986)

This blog is tagged as XML XSLT
