PHP is infamous for not being Unicode-friendly. This PHP library by Harry Fuecks should make it much easier to handle UTF-8 text in your web application.

Harry Fuecks writes at Sitepoint:

PHP UTF-8 is intended to make it possible to handle UTF-8 encoded strings in PHP, without requiring the mbstring extension (although it uses mbstring if it’s available). In short, it provides versions of PHP’s string functions (pretty much everything you’ll find on this list), prefixed with utf8_ and aware of UTF-8 encoding (that 1 character >= 1 byte). It also gives you some tools to help check UTF-8 strings for “well formedness”, strip bad sequences and some “ASCII helpers”.
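
To make the "1 character >= 1 byte" point concrete, here is a minimal sketch in plain PHP. It is not taken from the PHP UTF-8 library: the function names `count_utf8_chars` and `is_well_formed_utf8` are made up for illustration, and the sketch assumes the PCRE extension is available (it is bundled with PHP by default).

```php
<?php
// Hypothetical helpers for illustration only; not the PHP UTF-8 library's API.

// Count characters by counting every byte that is NOT a UTF-8
// continuation byte (10xxxxxx, i.e. 0x80-0xBF). Assumes valid UTF-8 input.
function count_utf8_chars($str)
{
    return (int) preg_match_all('/[^\x80-\xBF]/', $str);
}

// PCRE's /u modifier fails on malformed UTF-8, so matching an
// empty pattern doubles as a simple well-formedness check.
function is_well_formed_utf8($str)
{
    return preg_match('//u', $str) === 1;
}

$word = "na\xC3\xAFve"; // "naive" with a diaeresis; the i is two bytes (C3 AF)

echo strlen($word), "\n";           // 6: strlen() counts bytes
echo count_utf8_chars($word), "\n"; // 5: one per character

var_dump(is_well_formed_utf8($word));       // bool(true)
var_dump(is_well_formed_utf8("na\xC3ve"));  // bool(false): truncated sequence
```

If mbstring happens to be loaded, `mb_strlen($word, 'UTF-8')` should report the same character count as the hand-rolled counter above.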

It's released under the LGPL so you can use it in most open source web applications.

I've not really investigated it properly, but I may look into using this in my scripts. Investigating Unicode has been on my to-do list for quite a while, but I've never got around to it.

PHP 6 should have built-in native support for Unicode.

2 thoughts on “PHP UTF-8”

  1. We're currently using mbstring on this server, khlo; all the string functions are overloaded with the mbstring ones.

    Does PHP UTF-8 offer many extra features over mbstring?

  2. I think it abstracts much of the functionality – if mbstring is there it'll use mbstring. It probably adds a few extra features too.
