Internationalization and Localization with PHP – Zend_Locale, Zend_Translate and Zend_Date

I’ve been working on a project in my spare time. As part of the initial analysis I decided that this project would be suitable for a wider, multi-lingual, international audience, but probably only in 2 or 3 years’ time. I then had to make a difficult decision: do the work now and risk wasting time on a feature I might never use, or wait until I actually need the feature and then have to perform a major refactor of the whole project.

My real issue was that I’d never created a truly international website (multi-lingual, multiple date formats), so I had to go off and find out how to implement one! I was very surprised to find that making a web application internationalised and localised with Zend Framework is actually very straightforward.

After the research, I decided that it’s actually a small piece of work that wouldn’t add much overhead to my standard development practices going forward.  Also, as you’ll see in this blog, since the web application’s Urls need to contain locale information, it would be best to do it now, rather than cause SEO headaches later when I’d have to change the Url strategy of the web application.

If you’re aiming for a global audience for your website, you will need to consider both internationalisation and localisation. Here are my findings.

THE BASICS

Internationalization and Localization on Wikipedia:

“In computing, internationalization and localization (also spelled internationalisation and localisation, see spelling differences) are means of adapting computer software to different languages, regional differences and technical requirements of a target market. Internationalization is the process of designing a software application so that it can be adapted to various languages and regions without engineering changes. Localization is the process of adapting internationalized software for a specific region or language by adding locale-specific components and translating text.”

Locale Code (ISO)

The locale code is a string made up of a language code (ISO 639-1, or a three-letter ISO 639-2 code where no two-letter code exists) and a country code (ISO 3166-1).  We use the locale code to determine what language the visitor wishes to view the website in, along with a number of other important presentation preferences which will be outlined in this blog.

[Language code] – [Country code]
For example:

  • en-US – American English
  • en-GB – British English
  • fr-FR – French
  • de-DE – German
  • haw-US – Hawaiian (United States)

You’ll notice that on some websites, the language code and country/territory code are separated with a hyphen (eg en-GB) and others will separate with an underscore (eg en_GB).  By default, Zend_Locale uses the underscore but will happily accept a hyphen separated locale.

Note, the language code can either be 2 or 3 characters in length such as English (en) and Hawaiian (haw) so you should avoid doing the following, instead relying on Zend_Locale’s in-built methods:

<?php
// WRONG:
// $languageCode is "en" (correct, but only by luck)
$languageCode = substr('en-GB', 0, 2);
// $languageCode is "ha" (should be "haw")
$languageCode = substr('haw-US', 0, 2);

// CORRECT:
$locale = new Zend_Locale('en-GB');
// $languageCode is "en"
$languageCode = $locale->getLanguage();
?>
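
If you need the same split somewhere Zend_Locale isn’t loaded, a regular expression is safer than substr(). The following is only a sketch: splitLocaleCode() is a hypothetical helper of my own naming, not part of Zend Framework, and it only understands the simple language-region form used in this post.

```php
<?php
/**
 * Split a locale code such as "en-GB" or "haw_US" into its parts.
 * Hypothetical helper, not part of Zend Framework.
 *
 * @param  string $localeCode
 * @return array|null array('language' => ..., 'region' => ...), or null if malformed
 */
function splitLocaleCode($localeCode)
{
    // 2 or 3 letter language, a hyphen or underscore, then a 2 letter region
    if (!preg_match('/^([a-z]{2,3})[-_]([a-z]{2})$/i', $localeCode, $matches)) {
        return null;
    }

    return array(
        'language' => strtolower($matches[1]),
        'region'   => strtoupper($matches[2]),
    );
}

$parts = splitLocaleCode('haw-US');
echo $parts['language'];  // "haw", not "ha"
```

Because the language part is captured by the pattern rather than cut at a fixed offset, 2 and 3 letter language codes are both handled.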

Language

You’ll notice in the sample list above that I’ve listed ‘en-US’ and ‘en-GB’, both representing the English language.  There are subtle differences between a language written and spoken in different countries.  For example, American English and British English have many words that are spelt differently, and some words that don’t even exist in the other’s dictionary.  See Wikipedia: American and British English spelling differences.

Dates

Another very important consideration is the way dates are presented.  Different countries present dates differently, eg dd/mm/yyyy or mm/dd/yyyy, although for sorting it makes more sense to use yyyymmdd (eg 19001230).  If you see 30/12/1900 or 12/30/1900, it’s easy to work out the date because there is no 30th month.  It gets far more difficult when you see 6/12/1900: is that June 12th or 6th December?  That is a very significant difference!

Deciding whether to simply use the language (eg en) instead of the full locale (eg en-US) is really dependent on where your audience will come from.  Differences in spelling will not stop a visitor from being able to use your website but the date can make it difficult for international visitors.  If you decide on collecting just the language, a work-around could be to represent the date as 2 Jun 2012 instead of 6/2/2012.

One final note on dates: you should carefully consider the separators as well as the order.  In Germany, it’s common for dates to be separated with periods instead of forward-slashes, eg 2.6.2012 for 2 June 2012.
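
To make the date problem concrete, here is a small framework-free sketch using PHP’s own DateTime. The $dateFormats table and the formatDateForLocale() helper are assumptions of mine for illustration; the formats are examples only, so check them against your real audience.

```php
<?php
// Hypothetical lookup table: locale code => PHP date format string.
$dateFormats = array(
    'en_GB' => 'd/m/Y',
    'en_US' => 'm/d/Y',
    'de_DE' => 'd.m.Y',
);

/**
 * Format a date for a locale, falling back to an unambiguous format
 * when only a language (eg "en") or an unknown locale is supplied.
 */
function formatDateForLocale(DateTime $date, $localeCode, array $formats)
{
    // Normalise "en-GB" to "en_GB" so both separators work
    $key = str_replace('-', '_', $localeCode);

    // "j M Y" gives eg "2 Jun 2012", which cannot be misread
    $format = isset($formats[$key]) ? $formats[$key] : 'j M Y';

    return $date->format($format);
}

$date = new DateTime('2012-06-02', new DateTimeZone('UTC'));
echo formatDateForLocale($date, 'en-US', $dateFormats), "\n"; // 06/02/2012
echo formatDateForLocale($date, 'de-DE', $dateFormats), "\n"; // 02.06.2012
echo formatDateForLocale($date, 'en', $dateFormats), "\n";    // 2 Jun 2012
```

Note that the fallback month name is always English here, so this only illustrates the ordering and separator problem, not translation of month names.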

Currencies

The locale code is also used to determine a visitor’s local currency.  Note, however, that Zend_Currency requires a full locale code (eg ‘en-US’) rather than just the language code (eg ‘en’), because a language alone does not identify a country and therefore a currency.

Below is a sample list of currency codes (ISO 4217):

  • AUD – Australian Dollars
  • USD – United States Dollars
  • GBP – Great Britain Pounds (Pounds Sterling)
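
The reason a bare language isn’t enough can be shown without any framework. In this sketch the $currencySettings table and formatAmount() helper are hypothetical and the separators are illustrative; in a real application Zend_Currency would supply this information.

```php
<?php
// Hypothetical per-locale settings: ISO 4217 code, decimal mark, thousands separator.
$currencySettings = array(
    'en_US' => array('code' => 'USD', 'decimal' => '.', 'thousands' => ','),
    'en_GB' => array('code' => 'GBP', 'decimal' => '.', 'thousands' => ','),
    'en_AU' => array('code' => 'AUD', 'decimal' => '.', 'thousands' => ','),
    'de_DE' => array('code' => 'EUR', 'decimal' => ',', 'thousands' => '.'),
);

/**
 * Format an amount for a full locale code.
 * Throws if the locale has no region, since "en" alone could mean USD, GBP or AUD.
 */
function formatAmount($amount, $localeCode, array $settings)
{
    $key = str_replace('-', '_', $localeCode);
    if (!isset($settings[$key])) {
        throw new InvalidArgumentException("No currency known for locale '$localeCode'");
    }

    $s = $settings[$key];
    return $s['code'] . ' ' . number_format($amount, 2, $s['decimal'], $s['thousands']);
}

echo formatAmount(1234.5, 'en-US', $currencySettings), "\n"; // USD 1,234.50
echo formatAmount(1234.5, 'de-DE', $currencySettings), "\n"; // EUR 1.234,50
```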

Persisting Visitor’s Locale

It’s important to remember a visitor’s internationalisation and localisation preferences between requests.  Cookies (browser or session) could do the job, but they make the page nearly impossible to cache, because different content would be returned from the same Uri.  I’ll discuss caching considerations a bit later on.  The best way to persist a visitor’s locale is in the Uri itself.

British English homepage:  http://www.example.com/en-GB/
Page Title:  “30/12/2011:  We will be mesmerised by the colour of the fireworks”

United States English homepage:  http://www.example.com/en-US/
Page Title:  “12/30/2011:  We will be mesmerized by the color of the fireworks”

French homepage:  http://www.example.com/fr-FR/

However, if you’re only interested in translation and not localisation (date formats, currencies etc), then you might decide on a Uri like:
http://www.example.com/en/

SUPPORTING LOCALIZATION AND INTERNATIONALIZATION WITH ZEND FRAMEWORK

Bootstrap

The first step in implementing internationalization and localization in a Zend Framework application is to collect the visitor’s locale from the Uri.

The following route will match a Uri similar to http://www.example.com/en-GB/index/index:

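<?php
class Bootstrap extends Zend_Application_Bootstrap_Bootstrap
{
    protected function _initRoutes()
    {
        // Make sure the front controller resource exists before using it
        $this->bootstrap('frontController');
        $router = $this->getResource('frontController')->getRouter();

        $router->addRoute('LanguageControllerAction',
            new Zend_Controller_Router_Route_Regex(
                // locale code / controller / action
                '([a-z]{2,3}[\-_]{1}[a-zA-Z]{2})/(.*)/(.*)?',
                // default values
                array(
                    'action'     => 'index',
                    'controller' => 'index',
                    'module'     => 'default'
                ),
                // map the regex sub-patterns to parameter names
                array(
                    'localeCode' => 1,
                    'controller' => 2,
                    'action'     => 3
                ),
                // reverse pattern, used when assembling Urls
                '%s/%s/%s'
            )
        );
    }
}
?>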
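
It’s worth checking what that pattern actually captures before relying on it. The sketch below runs the same expression through preg_match() outside of the framework. I’m assuming here that the regex route anchors the pattern, matches case-insensitively and strips leading and trailing slashes from the path; matchLocaleRoute() is a hypothetical helper that mimics that behaviour, not ZF code.

```php
<?php
// The pattern from the route above, wrapped in delimiters and anchors
// (the anchoring and the "i" flag are assumptions about how the route applies it).
$pattern = '#^([a-z]{2,3}[\-_]{1}[a-zA-Z]{2})/(.*)/(.*)?$#i';

/**
 * Return the parameters the pattern would capture for a path,
 * or null when the path does not match.
 */
function matchLocaleRoute($pattern, $path)
{
    if (!preg_match($pattern, trim($path, '/'), $m)) {
        return null;
    }

    return array(
        'localeCode' => $m[1],
        'controller' => $m[2],
        'action'     => isset($m[3]) ? $m[3] : '',
    );
}

print_r(matchLocaleRoute($pattern, '/en-GB/index/index'));
var_dump(matchLocaleRoute($pattern, '/en-GB/'));      // NULL
var_dump(matchLocaleRoute($pattern, '/en-GB/news'));  // NULL
```

Under those assumptions, a Uri that carries only the locale (or only a locale and a controller) does not match, so you would probably want extra routes, or a more forgiving pattern, for the locale homepage.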

Within your controller, you’ll be able to access the locale code by:

public function indexAction()
{
    $localeCode = $this->_getParam('localeCode');
}

Ideally, the above bit of code would sit inside the init() of your Base Controller, which all of your controllers should inherit. The localeCode value would then be available in all controllers.

Wiring up Zend_Locale and Zend_Translate

Typically, the wiring up of these services should occur in your application’s Service Factory but for simplicity, I’m going to illustrate the wiring up inside a controller.

The following code illustrates the wiring up of Zend_Locale, using the locale code supplied on the Uri.

public function indexAction()
{
    $localeCode = $this->_getParam('localeCode');
    if (!empty($localeCode)) {
        // create Zend_Locale object from the locale code in the Uri
        $locale = new Zend_Locale($localeCode);
    } else {
        // guess the visitor's locale based on their browser settings
        $locale = new Zend_Locale(Zend_Locale::BROWSER);
    }
    // make the locale available to locale-aware ZF components
    Zend_Registry::set('Zend_Locale', $locale);
}
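
One thing the snippet above doesn’t cover is a Uri that carries a well-formed locale code the site has no content for (eg zz-ZZ). A simple option is to check the requested code against a list of supported locales before creating the Zend_Locale object. This is a sketch only: resolveLocaleCode(), the $supported list and the default are all assumptions of mine.

```php
<?php
/**
 * Pick the locale code to use for a request.
 * Hypothetical helper; returns $default when the requested code is unsupported.
 */
function resolveLocaleCode($requested, array $supported, $default)
{
    // Treat "en-gb", "en_GB" and "en-GB" as the same locale
    $normalised = str_replace('-', '_', (string) $requested);

    foreach ($supported as $code) {
        if (strcasecmp($code, $normalised) === 0) {
            return $code;
        }
    }

    return $default;
}

$supported = array('en_GB', 'en_US', 'fr_FR');

echo resolveLocaleCode('fr-fr', $supported, 'en_GB'), "\n"; // fr_FR
echo resolveLocaleCode('zz-ZZ', $supported, 'en_GB'), "\n"; // en_GB
```

The returned value can then be passed to new Zend_Locale() in place of the raw Uri parameter.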

Once we’ve wired up the Zend_Locale service, we can now supply that object into Zend_Translate to add internationalization to the web application.

The following illustrates the wiring up of Zend_Translate, using the locale object created above:

public function indexAction()
{
    /**
    * Wiring up Zend_Locale
    */
    $localeCode = $this->_getParam('localeCode');
    if (!empty($localeCode)) {
        // create Zend_Locale object from the locale code in the Uri
        $locale = new Zend_Locale($localeCode);
    } else {
        // guess the visitor's locale based on their browser settings
        $locale = new Zend_Locale(Zend_Locale::BROWSER);
    }
    Zend_Registry::set('Zend_Locale', $locale);

    /**
    * Wiring up Zend_Translate
    */
    $translate = new Zend_Translate(array(
        'adapter' => 'array', // each content file returns a PHP array
        'content' => APPLICATION_PATH . '/languages/lang.en_GB',
        'locale'  => 'en'     // Fallback for all en_* locales
    ));
    $translate->addTranslation(array(
        'content' => APPLICATION_PATH . '/languages/lang.en_US',
        'locale' => 'en-US'
    ));
    $translate->addTranslation(array(
        'content' => APPLICATION_PATH . '/languages/lang.fr_FR',
        'locale' => 'fr-FR'
    ));
    $translate->setLocale($locale);  // $locale was created above
    Zend_Registry::set('Zend_Translate', $translate);
}

I’m not a fan of global variables or registries, such as Zend_Registry, but ZF components (such as Zend_Date) become internationalisation and localisation-aware simply by reading the Zend_Registry key ‘Zend_Locale’.  By setting the key (as we’ve done above), ZF components can handle international audiences.

A Zend_Registry key for ‘Zend_Translate’ was also created as it’ll be used by the very useful in-built translation view helper.  More about that below.

There are a number of different ways that Zend_Translate can be wired up depending on the adapter type used.  In the example above, I’m using the array adapter.  For a full list of adapters, check out the documentation.
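
With the array adapter, each language file is, as far as I understand it, just a PHP file that returns an array of message => translation pairs. The sketch below imitates the lookup with plain arrays so you can see the ‘en’ fallback idea in isolation. The $messages table and translateMessage() are hypothetical, and this is not how Zend_Translate is implemented internally.

```php
<?php
// Hypothetical message tables keyed by locale. In a real application each
// inner array would live in its own language file and be returned from it.
$messages = array(
    'en'    => array('Hello World' => 'Hello World', 'colour' => 'colour'),
    'en_US' => array('colour' => 'color'),
    'fr_FR' => array('Hello World' => 'Bonjour le monde', 'colour' => 'couleur'),
);

/**
 * Look a message up for the full locale first, then for the bare language,
 * and finally return the message id itself if nothing was found.
 */
function translateMessage($messageId, $localeCode, array $messages)
{
    $locale   = str_replace('-', '_', $localeCode);
    $parts    = explode('_', $locale);
    $language = $parts[0];   // "en_US" => "en"

    foreach (array($locale, $language) as $candidate) {
        if (isset($messages[$candidate][$messageId])) {
            return $messages[$candidate][$messageId];
        }
    }

    return $messageId;
}

echo translateMessage('colour', 'en-US', $messages), "\n";      // color
echo translateMessage('Hello World', 'en-US', $messages), "\n"; // Hello World
echo translateMessage('Hello World', 'fr-FR', $messages), "\n"; // Bonjour le monde
```

The practical benefit of the fallback is that the en_US file only needs to contain the strings that differ from the base English file.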

Using Zend_Locale

Below is a list of useful Zend_Locale methods that can be used anywhere in your code base:

<?php
    $locale = new Zend_Locale('en-AU');

    // outputs "en"
    echo $locale->getLanguage();

    // outputs "AU"
    echo $locale->getRegion();

    // outputs "en_AU" (notice the underscore)
    echo $locale->toString();
?>

When Zend_Locale was created in the controller in the previous section, it was assigned to the Zend_Registry key ‘Zend_Locale’:

Zend_Registry::set('Zend_Locale', $locale);

Usage in a view script:

<?php
    $locale = Zend_Registry::get('Zend_Locale');
?>
<html lang="<?php echo $locale->getLanguage(); ?>">
<meta http-equiv="Content-Language" content="<?php echo $locale->getLanguage(); ?>" />

Using Zend_Locale with Locale-aware Components (Zend_Date)

Locale-aware components such as Zend_Date will use the locale settings determined from the visitor’s browser unless either Zend_Locale is registered in Zend_Registry or Zend_Locale is passed into Zend_Date directly.   This is why adding Zend_Locale to the registry is so important, unless of course, you’re wiring up all of your models in a Service Factory.

Zend_Date uses the locale to determine how the date should look.  For example, the following piece of code will output differently for different locales…

<?php
// will use visitor's browser to determine locale
$date = new Zend_Date('2006-12-31T00:00:00Z', Zend_Date::ISO_8601);
echo $date->get(Zend_Date::DATE_MEDIUM);  // see list below
?>

The above output, based on locale:

  • Australia (en-AU): 31/12/2006
  • United Kingdom (en-GB):  31 Dec 2006
  • United States (en-US):  Dec 31, 2006

The following examples show which locale Zend_Date picks up in different situations:

<?php
// Example #1:
// Visitor requests http://www.example.com/en-US
// from Australia
$date = new Zend_Date('2006-12-31T00:00:00Z', Zend_Date::ISO_8601);

// outputs "31/12/2006" (uses browser for the locale, not the Uri)
echo $date->get(Zend_Date::DATE_MEDIUM);

// Example #2:
// Visitor requests http://www.example.com/en-US
// from Australia
$localeCode = 'en-US';  // from Uri.
$locale = new Zend_Locale($localeCode);
$date = new Zend_Date('2006-12-31T00:00:00Z', Zend_Date::ISO_8601, $locale);

// outputs "Dec 31, 2006" (uses Uri for the locale)
echo $date->get(Zend_Date::DATE_MEDIUM);

// Example #3:
// Visitor requests http://www.example.com/en-US
// from Australia
$localeCode = 'en-US';  // from Uri.
$locale = new Zend_Locale($localeCode);
Zend_Registry::set('Zend_Locale', $locale);
//  ^-- adding to registry means that we do not have to pass locale
//      to Zend_Date every time we initialise it
$date = new Zend_Date('2006-12-31T00:00:00Z', Zend_Date::ISO_8601);  // no need to pass $locale

// outputs "Dec 31, 2006" (uses Uri for the locale)
echo $date->get(Zend_Date::DATE_MEDIUM);
?>

I’ll be talking more about dates and timezones in a future blog.

Using Zend_Translate

Zend_Translate is pre-packaged with a translation view helper which makes translation very easy to implement in view scripts.

View Script:

<h1><?php echo $this->translate('Hello World'); ?></h1>

Within your Controller and Models:

<?php
// using the Zend_Translate created and registered earlier
$translate = Zend_Registry::get('Zend_Translate');
// outputs in visitor's language
echo $translate->_('You have entered the incorrect password');
?>

PHP UTF-8 GOTCHAS

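A number of PHP string functions work on bytes rather than characters, so they can mangle UTF-8 text unless you use a multibyte-aware equivalent or tell them the encoding is UTF-8.  A list of affected functions is available here:  http://www.phpwact.org/php/i18n/utf-8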
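
A quick way to see the problem is to compare byte counts with character counts. In this sketch utf8Length() is a hypothetical helper of mine: it uses mb_strlen() when the mbstring extension is loaded and otherwise falls back to counting matches of a PCRE pattern with the u (UTF-8) modifier.

```php
<?php
/**
 * Count characters (not bytes) in a UTF-8 string.
 * Hypothetical helper: prefers mbstring, falls back to PCRE in UTF-8 mode.
 */
function utf8Length($string)
{
    if (function_exists('mb_strlen')) {
        return mb_strlen($string, 'UTF-8');
    }

    // With the "u" modifier, "." matches one whole UTF-8 character
    return preg_match_all('/./us', $string, $unused);
}

// "café" with the é written as explicit UTF-8 bytes, so the result
// does not depend on how this source file is saved
$word = "caf\xC3\xA9";

echo strlen($word), "\n";      // 5, because strlen() counts bytes
echo utf8Length($word), "\n";  // 4, the number of characters

// substr() also works on bytes, so this cuts the é in half
echo bin2hex(substr($word, 0, 4)), "\n"; // 636166c3
```

The last line is the kind of bug that only shows up once real translated content arrives: the truncated string ends in half a character and is no longer valid UTF-8.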

MySQL UTF-8

On database and table creation, it’s very important that the UTF-8 character set is applied. For example:

/*!40101 SET NAMES utf8 */;

CREATE DATABASE `mydb` /*!40100 DEFAULT CHARACTER SET utf8 */;
USE `mydb`;

DROP TABLE IF EXISTS `mytable`;
CREATE TABLE `mytable` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `title` varchar(45) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
ALTER TABLE `mytable` CHARACTER SET utf8 COLLATE utf8_general_ci;

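It’s also worth familiarising yourself with the MySQL documentation on Internationalization support.  Note that MySQL’s utf8 character set stores at most three bytes per character, so it cannot hold characters outside the Basic Multilingual Plane (emoji, for example); MySQL 5.5.3 and later provide utf8mb4 for full UTF-8.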

WHAT ARE i18n AND L10n?

Internationalization and Localization on Wikipedia:

“The terms are frequently abbreviated to the numeronyms i18n (where 18 stands for the number of letters between the first i and last n in internationalization, a usage coined at DEC in the 1970s or 80s)[1] and L10n respectively, due to the length of the words. The capital L in L10n helps to distinguish it from the lowercase i in i18n.”

SUMMARY

I thought it was going to be a lot more difficult to add support for an international audience to my web application, but as you can see it turned out to be remarkably simple.   Adding international support simply required:

  • wiring up a few extra objects (Zend_Locale, Zend_Translate)
  • creating a new route in the bootstrap (eg http://www.example.com/en-US)
  • wrapping all language specific text in either a $this->translate('Good morning') or $translate->_('Good morning') method call

It’s not an overly daunting task, so give it a go.

About the Author
Brett is the Lead Web Developer at BBC.com working on a number of products, such as the BBC International Homepage, News, Sport, Travel and the back-end work on the iPhone and iPad applications.

Posted in Internationalisation, PHP, Zend Framework, Zend_Locale, Zend_Translate
3 comments on “Internationalization and Localization with PHP – Zend_Locale, Zend_Translate and Zend_Date”
  1. ry says:

    You are in England now – it’s localisation – no zeeeees!!!!! 🙂

  2. arurmamedov says:

    Thank you so much! Very good article. Now i wont to set all my time() call to the selected time_zone if you know and have time to tell me how i can made this i appreciate 🙂
    I apologize for my bad English, thank again.

    • brettscott says:

      Hi arurmamedov,

      What you can do depends on what you’ve collected from the user.

      Firstly, you should set a default timezone in case you haven’t got a user’s timezone:
      date_default_timezone_set('Europe/Berlin');

      There are a number of ways you can collect a visitor’s locale or time zone. For example:
      1. Collect user’s locale-code via online form:
      $userLocaleCode = 'en-GB'; // input from online form
      $locale = new Zend_Locale($userLocaleCode);
      Zend_Registry::set('Zend_Locale', $locale);

      2. Collect user’s locale-code via browser:
      $userLocaleCode = new Zend_Locale(Zend_Locale::BROWSER);

      3. Collect user’s timezone via online form
      Using a JavaScript library, to populate a user’s online form – https://bitbucket.org/pellepim/jstimezonedetect
      This will give you the user’s $timezone (eg UTC, UTC+8 etc).

      Once you have either the locale-code (en-GB) or the timezone (UTC), and where you want to enter a date/time:
      1. If you’ve got Zend_Locale configured with the user’s locale-code, then use Zend_Date to present the date. Zend_Date integrates with Zend_Locale.

      2. If you just have the $timezone of the user, then you can do the following:
      $date = new DateTime('now');
      $date->setTimezone(new DateTimeZone($timezone));
      // format() reflects the timezone; the Unix timestamp itself would not change
      echo $date->format('Y-m-d H:i:s');

      3. Or, you can simply alter the default timezone with the user’s $timezone:
      date_default_timezone_set($timezone);

      Hope this helps 🙂
      Brett.
