JavaScript URL encoding and decoding

JavaScript keywords, functions and properties used:

The URLEncode() function interface

function URLEncode (clearString) {

Nothing too exciting to see here, so moving right along...

Variable initialization

var output = '';
var x = 0;
clearString = clearString.toString();
var regex = /(^[a-zA-Z0-9_.]*)/;
  • output, the variable that will hold the contents of the encoded string.
  • x, a very creatively and originally named variable used to hold the current offset into the input string.
  • clearString.toString(), required when the input is a Number rather than a String (for example, an input made up entirely of digits). A Number has no length property and no substr() method, and the loop relies on both. We could also test for input that needs no encoding and exit the function early, but why bother? The extra code required to perform the test would 'cost' more than simply allowing the while loop to exit after its first iteration.
  • regex, the workhorse of this function, documented in the discussion portion of this project. It captures the leading run of characters that need no encoding: letters, digits, '_' and '.'.
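
To make the toString() point concrete, here is a minimal sketch of what happens when a Number reaches the loop condition unconverted. The variable names are made up for this illustration and are not part of URLEncode().

```javascript
// Illustration only: these names do not appear in URLEncode().
var asNumber = 2024;

// A Number has no length property, so the loop condition can never be true.
var lengthOfNumber = asNumber.length;      // undefined
var loopWouldRun = 0 < lengthOfNumber;     // false, the while loop is skipped

// After toString() the value behaves like any other input string.
var asString = asNumber.toString();
var lengthOfString = asString.length;      // 4
```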

While loop declaration

while (x < clearString.length) {
  . . . Character encoding . . .
}
return output;
  1. At the beginning of each iteration, check that the end of the input string has not been reached.
  2. When the loop ends, return the encoded string.

The guts and mind of the encoder

var match = regex.exec(clearString.substr(x));
if (match != null && match.length > 1 && match[1] != '') {
  output += match[1];
  x += match[1].length;
} else {
  if (clearString[x] == ' ')
    output += '+';
  else {
    var charCode = clearString.charCodeAt(x);
    var hexVal = charCode.toString(16);
    output += '%' + ( hexVal.length < 2 ? '0' : '' ) + hexVal.toUpperCase();
  }
  x++;
}
  1. Execute the regular expression on the part of the input string that hasn't been processed yet, starting at offset x, and store the result array in match. When a regex matches, match[0] contains the entire matched text and match[1] through match[n] contain the text captured by each of its 'n' groups. This regex has only one group, so there will be at most one capture, which may be empty.
  2. Check that the result match:
    1. Is valid and not null.
    2. Contains more than one element.
    3. Has a non-empty element of interest, match[1]. It will be empty when the first character examined by the regex is in need of encoding.
  3. If the array's element of interest is not empty, add what was matched to the output string and advance the input string's offset x by the number of characters matched.
  4. If the array's element of interest is empty, encode the character of the input string at offset x. A space simply becomes '+'; any other character is handled as follows:
    1. Get the character code (UTF-16 code unit value) of the character to be encoded.
    2. Convert the decimal character code to a hex string, padding it with a leading '0' if it is only one digit long.
    3. Use toUpperCase() to make the hex value pretty. This is not really necessary and can be left out.
    4. Prepend '%' to the hex string.
    5. Add the resulting %-hex string to the output string.
    6. Increment x by 1 (this happens for the space case as well).
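
The two branches described above can be tried in isolation. The following is a small sketch, not a replacement for the function: percentEncodeChar() is a hypothetical helper introduced only for this illustration, wrapping the same three expressions used in the else branch.

```javascript
// The same pattern URLEncode() uses: a leading run of "safe" characters.
var regex = /(^[a-zA-Z0-9_.]*)/;

// A leading safe run is captured in match[1].
var safeRun = regex.exec('abc def')[1];    // 'abc'

// When the first character needs encoding, the capture is the empty string.
var emptyRun = regex.exec(' def')[1];      // ''

// Hypothetical helper (not in the original): encode a single character as
// '%' followed by its zero-padded, upper-cased hex character code.
function percentEncodeChar(ch) {
  var hexVal = ch.charCodeAt(0).toString(16);
  return '%' + (hexVal.length < 2 ? '0' : '') + hexVal.toUpperCase();
}

var encodedSlash = percentEncodeChar('/');   // '%2F'
var encodedTab = percentEncodeChar('\t');    // '%09' (needs the padding)
```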

The complete URLEncode()

function URLEncode (clearString) {
  var output = '';
  var x = 0;
  clearString = clearString.toString();
  var regex = /(^[a-zA-Z0-9_.]*)/;
  while (x < clearString.length) {
    var match = regex.exec(clearString.substr(x));
    if (match != null && match.length > 1 && match[1] != '') {
      output += match[1];
      x += match[1].length;
    } else {
      if (clearString[x] == ' ')
        output += '+';
      else {
        var charCode = clearString.charCodeAt(x);
        var hexVal = charCode.toString(16);
        output += '%' + ( hexVal.length < 2 ? '0' : '' ) + hexVal.toUpperCase();
      }
      x++;
    }
  }
  return output;
}
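
One limitation is worth seeing in a sketch. charCodeAt() returns a value above 0xFF for many characters, so the '%' + hex step can emit more than two hex digits. The comparison below uses the built-in encodeURIComponent(), which is not part of the original code and is shown here only as a reference point. It percent-encodes the UTF-8 bytes of the character, and it writes a space as %20 rather than '+'.

```javascript
// The euro sign has character code 0x20AC, which needs four hex digits.
var euro = '\u20AC';
var hexVal = euro.charCodeAt(0).toString(16);   // '20ac'

// What the '%' + hex step in URLEncode() produces for this character.
// A decoder would read this as '%20' (a space) followed by the text 'AC'.
var naive = '%' + hexVal.toUpperCase();         // '%20AC'

// The built-in function encodes the three UTF-8 bytes instead.
var builtIn = encodeURIComponent(euro);         // '%E2%82%AC'

// The built-in also differs in how it treats spaces.
var builtInSpace = encodeURIComponent('a b');   // 'a%20b'
```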

Moving right along. . .

Here we have the naughty bits of the JavaScript URL decoding function

function URLDecode (encodedString) {

The ever popular obvious department of obviousness function interface.

Initialising the decoder

var output = encodedString;
var binVal, thisString, match;
var myregexp = /(%[^%]{2})/;
  • output - Initialized with the input string. Both the regex matching and the replacements operate on output, so it holds the partially decoded string until the loop finishes.
  • binVal - An intermediate value, for demonstrative purposes. Contains the numeric character code parsed from an encoded sequence.
  • thisString - A temporary variable declared here so as to save repetitive re-declaration.
  • match - Declared here so that the loop does not create an implicit global. Holds the result array of each regex execution.
  • myregexp - Contains the regex object, which matches a '%' followed by any two characters other than '%'.

Begin the beguine

Start the decoder's while loop to process the input string until all characters requiring decoding have been decoded.

while ((match = myregexp.exec(output)) != null
           && match.length > 1
           && match[1] != '') {
  . . . Do some decoder type stuff here . . .
}
return output;

On each pass, run the regex against output and check that the result is not null, has the expected number of elements and that the captured sequence is not empty. exec() returns null when there are no more characters in need of decoding, which ends the loop.

When the decoder's while loop completes there is nothing left to process, so return the result, output.

The Nitty Gritty Dirt Script

Perform the following steps on each encoded sequence found in need of decoding.

binVal = parseInt(match[1].substr(1),16);
thisString = String.fromCharCode(binVal);
output = output.replace(match[1], thisString);
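  1. Strip the leading '%' from the sequence found in match[1] with substr(1), then perform parseInt() on the remaining two characters using a radix of 16 (hex). The result will be the character code of the encoded character.
  2. Using the character code derived from parseInt(), use the fromCharCode() method of the JavaScript String object to convert the code into a character.
  3. Replace the first occurrence of the encoded sequence in output with the decoded character. With a string as its first argument, replace() changes only the first occurrence; any later occurrences are picked up by subsequent iterations of the loop.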
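
The three steps can be traced on a sample string. This is only an illustrative sketch: the input 'a%2Fb%2Fc' is made up, and the variable once exists only in this example.

```javascript
// One pass of the decoder's loop body, traced on a made-up input.
var input = 'a%2Fb%2Fc';
var match = /(%[^%]{2})/.exec(input);             // match[1] is '%2F'

// Steps 1 and 2: hex digits -> character code -> character.
var binVal = parseInt(match[1].substr(1), 16);    // 47
var thisString = String.fromCharCode(binVal);     // '/'

// Step 3: replace() with a string pattern touches only the first occurrence.
var once = input.replace(match[1], thisString);   // 'a/b%2Fc'
```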

It's all over except the crying

function URLDecode (encodedString) {
  var output = encodedString;
  var binVal, thisString, match;
  var myregexp = /(%[^%]{2})/;
  while ((match = myregexp.exec(output)) != null
             && match.length > 1
             && match[1] != '') {
    binVal = parseInt(match[1].substr(1),16);
    thisString = String.fromCharCode(binVal);
    output = output.replace(match[1], thisString);
  }
  return output;
}
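
Two behaviours of the loop above are easy to miss. It never turns '+' back into a space, even though URLEncode() produces '+'. It also rescans text it has already decoded, so '%2541' becomes '%41' on one pass and then 'A' on the next. The sketch below is one possible alternative, not the original author's method: urlDecodeSinglePass() is a hypothetical name, and it assumes that only '%' followed by two hex digits should be decoded.

```javascript
// Hypothetical single-pass variant, not the original URLDecode().
// Assumes only '%' followed by exactly two hex digits is an encoded sequence.
function urlDecodeSinglePass(encodedString) {
  return String(encodedString)
    // Undo the space -> '+' substitution first, so that a decoded '%2B'
    // is not mistaken for a space afterwards.
    .replace(/\+/g, ' ')
    // A global regex visits each sequence once and never rescans its output.
    .replace(/%([0-9a-fA-F]{2})/g, function (whole, hex) {
      return String.fromCharCode(parseInt(hex, 16));
    });
}

var decodedPlus = urlDecodeSinglePass('a+b%2Fc');   // 'a b/c'
var decodedOnce = urlDecodeSinglePass('%2541');     // '%41'
var literalPlus = urlDecodeSinglePass('a%2Bb');     // 'a+b'
```

Like the original, this sketch decodes each sequence to a single character code, so it does not reassemble multi-byte UTF-8 sequences.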