
PHP: Convert UTF-8 to Hex Codepoint values (Unicode Hexadecimal)

This is one of those strange things that sounds a lot easier to do than it is.

I originally handled this by splitting the string into individual characters like this:

$array = preg_split('//u', $a, -1, PREG_SPLIT_NO_EMPTY);

This worked fine and the string was split into an array of Unicode characters. The next part was converting each character into a useful hexadecimal value.
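As a quick illustration of the split step, here is a minimal sketch. The sample string is made up for the example. It also shows why the PREG_SPLIT_NO_EMPTY flag matters: without it, the empty pattern matches at both ends of the string, so the first array element is an empty string rather than the first character.

```php
<?php
$a = "h\xC3\xA9\xE2\x82\xAC"; // "hé€" written out as UTF-8 bytes

// No flags: the result starts and ends with an empty string.
$raw = preg_split('//u', $a);

// PREG_SPLIT_NO_EMPTY drops the empty pieces, leaving one element per character.
$chars = preg_split('//u', $a, -1, PREG_SPLIT_NO_EMPTY);
```

With the flag, $chars[0] is "h", $chars[1] is "é" and $chars[2] is "€".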

$character = $array[0];
$value = hexdec(bin2hex($character));

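I originally thought this was the way to do it. I was wrong; don’t do it. bin2hex() returns the character’s raw UTF-8 bytes, so for anything outside ASCII the result is the encoded byte sequence read as a number, not the Unicode code point: “é” is stored as the bytes C3 A9, which gives 0xC3A9 instead of U+00E9. It turns out there is no really simple built-in way to convert from UTF-8 to code point values. Instead, try the utf8ToUnicode function here: http://hsivonen.iki.fi/php-utf8/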
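To make the difference concrete, here is a minimal sketch of what decoding one UTF-8 character into a code point involves. The helper name utf8_char_to_codepoint is hypothetical and is not part of the linked library. It assumes PHP 7 or later and well-formed input, and it does no validation.

```php
<?php
// Hypothetical helper: decode the first UTF-8 character of $char into its
// Unicode code point. Assumes valid UTF-8; performs no error checking.
function utf8_char_to_codepoint(string $char): int
{
    $b0 = ord($char[0]);
    if ($b0 < 0x80) {                 // 1 byte:  0xxxxxxx
        return $b0;
    }
    if (($b0 & 0xE0) === 0xC0) {      // 2 bytes: 110xxxxx 10xxxxxx
        return (($b0 & 0x1F) << 6)
             | (ord($char[1]) & 0x3F);
    }
    if (($b0 & 0xF0) === 0xE0) {      // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return (($b0 & 0x0F) << 12)
             | ((ord($char[1]) & 0x3F) << 6)
             | (ord($char[2]) & 0x3F);
    }
    // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return (($b0 & 0x07) << 18)
         | ((ord($char[1]) & 0x3F) << 12)
         | ((ord($char[2]) & 0x3F) << 6)
         | (ord($char[3]) & 0x3F);
}

$char = "\xC3\xA9";                          // "é" as UTF-8 bytes
$bytes = hexdec(bin2hex($char));             // 0xC3A9: the raw bytes as a number
$codepoint = utf8_char_to_codepoint($char);  // 0xE9: the actual code point
```

On PHP 7.2 and later with the mbstring extension, mb_ord($char, 'UTF-8') should give the same code point. For real input, a tested implementation such as the library linked above is the safer choice.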

Include that library’s file and call the author’s utf8ToUnicode function. It becomes simple then:

$value = utf8ToUnicode($character);
$value = $value[0];
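The value obtained this way is an integer, so one more step is needed to get a hexadecimal string. Here is a minimal sketch of that step. The helper name format_codepoints is hypothetical, and the sample array only stands in for what utf8ToUnicode is assumed to return for “é€”.

```php
<?php
// Hypothetical helper: format integer code points as "U+XXXX" strings.
function format_codepoints(array $codepoints): array
{
    return array_map(function ($cp) {
        // %04X pads to at least four uppercase hex digits.
        return sprintf('U+%04X', $cp);
    }, $codepoints);
}

// Stand-in for the array of integers assumed to come back from utf8ToUnicode.
$codepoints = [0xE9, 0x20AC];
$formatted = format_codepoints($codepoints);

// dechex() works too if a bare lowercase hex string is enough.
$plain = dechex($codepoints[0]);
```

Here $formatted holds "U+00E9" and "U+20AC", and $plain is "e9".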

I am only posting this because of the sheer amount of time it took for me to find this information. I hope it helps you out.

  • Thanks for this!

    I’d already found Henri Sivonen’s code, but your tale of success encouraged me to reassess it, after initially dismissing it.

    I did have to convert the result to hex with dechex($value) at the end, and for some reason I also had to modify the utf8ToUnicode() function to accept the variable passed by value, but I now have everything I need.

    Thanks again for posting!

    Simon

    May 23, 2008