A few months ago, I introduced a simple algorithm that allows users to implement their own short URLs in their systems. Today I have some spare time, so I decided to write an implementation of that short URL algorithm in PHP.
First, we define a function called shorturl() that receives a URL as input and returns an array containing 4 hashed values (6 characters each).
function shorturl($input) {
    ...
    // return array of results
}
Below is the original pseudocode:
...
loop2: from 1st 4 bytes to 4th 4 bytes of md5 result
    cast the 4 bytes to an integer
    loop3: for shortCodeChar[0] to shortCodeChar[5]
        use 1st 5 bits of the integer to find the value in codeMap
        remove 5 bits from the integer
    end loop3
    save shortCodeChar as shortCode
    ...
    // Database checking for duplication
end loop2
...
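To make loop3 concrete, here is a minimal sketch that traces it on a few known integers. The helper name `encode30()` is hypothetical, and the code map is assumed to be the letters a to z followed by the digits 0 to 5.

```php
<?php
// Hypothetical helper: encodes the low 30 bits of an integer as six
// characters, taking 5 bits at a time starting from the least significant end.
function encode30(int $int): string {
    $map = 'abcdefghijklmnopqrstuvwxyz012345'; // assumed code map, 32 entries
    $int &= 0x3FFFFFFF;                        // keep only 30 bits (6 x 5)
    $out = '';
    for ($j = 0; $j < 6; $j++) {
        $out .= $map[$int & 0x1F];             // low 5 bits select a character
        $int >>= 5;                            // drop the bits just used
    }
    return $out;
}

echo encode30(0), "\n";          // aaaaaa
echo encode30(1), "\n";          // baaaaa
echo encode30(32), "\n";         // abaaaa
echo encode30(0x3FFFFFFF), "\n"; // 555555
echo encode30(0x40000000), "\n"; // aaaaaa (bit 30 is masked off)
```

Because the least significant bits are consumed first, the first character of a code changes fastest as the integer grows.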
The following code implements the algorithm above, excluding the database check for duplicates:
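function shorturl($input) {
    // Code map of 32 URL-safe characters; each character encodes 5 bits
    $base32 = array (
        'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h',
        'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p',
        'q', 'r', 's', 't', 'u', 'v', 'w', 'x',
        'y', 'z', '0', '1', '2', '3', '4', '5'
    );

    $hex = md5($input);          // 32 hex characters (128 bits)
    $hexLen = strlen($hex);
    $subHexLen = $hexLen / 8;    // 4 chunks of 8 hex characters each
    $output = array();

    for ($i = 0; $i < $subHexLen; $i++) {
        $subHex = substr($hex, $i * 8, 8);
        // Keep the low 30 bits (6 x 5 bits). hexdec() is used because
        // PHP 7+ no longer treats '0x...' strings as numeric
        // (this assumes a 64-bit PHP build).
        $int = 0x3FFFFFFF & hexdec($subHex);
        $out = '';
        for ($j = 0; $j < 6; $j++) {
            $val = 0x0000001F & $int;   // low 5 bits: index into $base32
            $out .= $base32[$val];
            $int = $int >> 5;           // discard the bits just used
        }
        $output[] = $out;
    }

    return $output;
}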
Sample code to test/use the above function:
$input = 'http://www.snippetit.com/1';
$output = shorturl($input);
echo "Input : $input\n";
echo "Output : {$output[0]}\n";
echo " {$output[1]}\n";
echo " {$output[2]}\n";
echo " {$output[3]}\n";
echo "\n";
$input = 'http://www.snippetit.com/2';
$output = shorturl($input);
echo "Input : $input\n";
echo "Output : {$output[0]}\n";
echo " {$output[1]}\n";
echo " {$output[2]}\n";
echo " {$output[3]}\n";
echo "\n";
Output:
Input : http://www.snippetit.com/1
Output : h0xg4r
bdr3tw
osk2d3
4azfqa
Input : http://www.snippetit.com/2
Output : tm5kxb
ceoj2s
yw3dvl
nrmrxl
The function returns an array of 4 elements, and you can use any one of them. The others can serve as alternative codes for the same input when you find a duplicate in your database (the same code for a different input; this is unlikely, but it will happen eventually). The chance that a new code collides with an existing one is about n/(32^6), or n/1,073,741,824, where n is the number of records in your database. With one million records, for example, that is roughly 0.09%.
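The fallback idea can be sketched as follows. This is only an illustration: `pickCode()` is a hypothetical helper, the `$taken` array stands in for a real database lookup, and the example.com URL is made up. The candidate codes are the ones shown in the output above.

```php
<?php
// Hypothetical helper: returns the first candidate that is either unused or
// already mapped to the same URL; null if every candidate belongs to another URL.
function pickCode(array $candidates, string $url, array $taken): ?string {
    foreach ($candidates as $code) {
        if (!isset($taken[$code]) || $taken[$code] === $url) {
            return $code;
        }
    }
    return null; // all four candidates collide; the caller must handle this
}

// Stand-in for the database: code => URL (made-up entry to force a collision)
$taken = array('h0xg4r' => 'http://example.com/other');

$candidates = array('h0xg4r', 'bdr3tw', 'osk2d3', '4azfqa');
echo pickCode($candidates, 'http://www.snippetit.com/1', $taken), "\n"; // bdr3tw
```

In a real system the check and the insert would need to happen atomically, for example by relying on a unique index on the code column.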
As you can see, the outputs look quite random even though the input strings differ by only one character. The output is also consistent: the same input always produces the same output.
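Both properties are easy to check in a few lines. The sketch below uses a compact restatement of the function under the hypothetical name `shorturl_sketch()`; it assumes 64-bit PHP 7 or later and converts each hex chunk with hexdec().

```php
<?php
// Compact restatement of the function, for checking its properties only.
function shorturl_sketch(string $input): array {
    $map = 'abcdefghijklmnopqrstuvwxyz012345';
    $output = array();
    foreach (str_split(md5($input), 8) as $subHex) {
        $int = 0x3FFFFFFF & hexdec($subHex); // low 30 bits of each 32-bit chunk
        $out = '';
        for ($j = 0; $j < 6; $j++) {
            $out .= $map[$int & 0x1F];
            $int >>= 5;
        }
        $output[] = $out;
    }
    return $output;
}

$a = shorturl_sketch('http://www.snippetit.com/1');
$b = shorturl_sketch('http://www.snippetit.com/1');
$c = shorturl_sketch('http://www.snippetit.com/2');

var_dump($a === $b); // same input: identical codes
var_dump($a !== $c); // one character changed: different codes
```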
To make the output less predictable to others, you can scramble the values in the $base32 array, add your own private key, and/or XOR the value of $val with a value in the range 0 to 31.
For example, to scramble the values in the $base32 array, you can change the positions of the values and/or replace a value with another one (make sure the replacement is a URL-safe character that does not already appear in the array).
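A scrambled map is easy to get wrong by hand, so a small sanity check may help. The helper `isValidCodeMap()` and the sample map below are hypothetical; "URL-safe" is assumed here to mean letters, digits, hyphen and underscore.

```php
<?php
// Hypothetical check: a code map must have exactly 32 distinct URL-safe characters.
function isValidCodeMap(string $map): bool {
    return strlen($map) === 32
        && count(array_unique(str_split($map))) === 32
        && preg_match('/^[A-Za-z0-9_-]+$/', $map) === 1;
}

// The original characters in a different (keyboard-like) order
$scrambled = 'q0wertyuiop1asdfghjkl2zxcvbnm345';

var_dump(isValidCodeMap($scrambled));             // bool(true)
var_dump(isValidCodeMap(str_repeat('a', 32)));    // bool(false): duplicates

// str_split() turns the string into the array form used by $base32
$base32 = str_split($scrambled);
```

A duplicated character would make two different 5-bit values map to the same output character, which raises the collision rate.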
For example, to add a private key, you can add extra strings when calling the md5() function, e.g.:
$hex = md5('my-secret-key'.$input.'my-another-secret-key');
For example, to XOR the value of $val with 18:
$out .= $base32[$val ^ 18];
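The XOR value must stay within 0 to 31 because only then does the result remain a valid index into $base32, with every index still used exactly once. A short check of that claim, using a hypothetical helper `xorIsPermutation()`:

```php
<?php
// Hypothetical check: does XOR with $key map 0..31 onto 0..31 one-to-one?
function xorIsPermutation(int $key): bool {
    $seen = array();
    for ($val = 0; $val < 32; $val++) {
        $seen[$val ^ $key] = true;   // record each resulting index
    }
    ksort($seen);
    return array_keys($seen) === range(0, 31);
}

var_dump(xorIsPermutation(18)); // bool(true)
var_dump(xorIsPermutation(32)); // bool(false): results fall in 32..63
```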