Some time ago I needed to internationalize a WordPress plugin. I read the WordPress documentation on the matter (http://codex.wordpress.org/I18n_for_WordPress_Developers), and when I arrived at the section that explains how to generate your POT files, I did not quite like it.
The WordPress team offers a couple of PHP scripts that you need to download and run in order to generate your POT files. I wanted to use the gettext utilities that come with Linux instead, and doing so turns out to be quite simple. After I finished the translation, I meant to write a post about it, but I didn’t. A few days ago, when I had to internationalize a theme, I regretted not writing that post. This time I won’t make the same mistake.
The first thing you need to do is make sure that your strings are wrapped in one of the special functions that WordPress provides for i18n. The two most popular are __() and _e(). Once you have done that, you need to collect all your strings using the xgettext program. It is very simple to use, but you need to specify the keyword or keywords that mark the strings for translation. This is because the functions that WordPress uses are not among the ones xgettext looks for by default. You specify the keywords by using the --keyword option:
xgettext --keyword=_e --keyword=__
Do not run the command yet; you still need to specify some other options. Besides, if you run it like that, it won’t know which files to search.
Next, we need to specify the output file name:
xgettext --keyword=_e --keyword=__ -o file_name.pot
Lastly, you need to specify the file or files in which to search:
xgettext --keyword=_e --keyword=__ -o file_name.pot *.php subdirectory/*.php ...
Note that the ... stands for any additional files. Do not enter it literally.
Now you can run the command. This will generate a new file named file_name.pot. Give that file to your translators, and they will send you back a file with the translated text. That file should have a .po extension, not .pot.
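If your PHP files are spread over many nested directories, listing them by hand gets tedious. Here is a minimal sketch of one way around that, assuming xgettext and find are installed and that your sources are UTF-8. The function name make_pot is just a placeholder of mine, not something gettext or WordPress provides:

```shell
#!/bin/sh
# make_pot OUTPUT.pot [SOURCE_DIR]
# Collects every .php file under SOURCE_DIR (default: current directory)
# and extracts the strings wrapped in __() and _e() into OUTPUT.pot.
make_pot() {
    pot_file=$1
    src_dir=${2:-.}
    file_list=$(mktemp) || return 1

    # One path per line; xgettext reads this list through --files-from.
    find "$src_dir" -type f -name '*.php' > "$file_list"

    # --from-code=UTF-8 is an assumption about the source files.
    xgettext --language=PHP --from-code=UTF-8 \
        --keyword=_e --keyword=__ \
        --files-from="$file_list" -o "$pot_file"
    status=$?

    rm -f "$file_list"
    return "$status"
}
```

You would then call it as make_pot file_name.pot . from the root of the plugin or theme. Note that xgettext writes no output file at all when it finds no marked strings.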
Once you get that file, you just run the msgfmt command:
msgfmt translated_file.po -o locale.mo
This will generate a binary .mo file, which is the one you use with WordPress.
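When you have several translations, you can compile them in one go. This is only a sketch, and it assumes each .po file is already named the way you want the resulting .mo file to be named. The function name compile_po is my own invention:

```shell
#!/bin/sh
# compile_po [DIR]
# Compiles every .po file in DIR (default: current directory) into a
# .mo file with the same base name, stopping at the first failure.
compile_po() {
    dir=${1:-.}
    for po in "$dir"/*.po; do
        # Skip the unexpanded pattern when DIR holds no .po files.
        [ -e "$po" ] || continue
        msgfmt "$po" -o "${po%.po}.mo" || return 1
    done
}
```

For example, compile_po languages would turn every .po file in a languages directory (a directory name I made up for the example) into a matching .mo file.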
You may encounter encoding problems. In my case, I did. This may be because the files were sent back and forth via email as text files. Email clients sometimes change the encoding of text files. The error I was getting when I tried to msgfmt the translated file was this: “invalid multibyte sequence”
To solve it, I had to change the encoding of the file to utf-8. In order to do that, you need to use the uconv program. It may not be installed on your system, so you may have to install it. On Ubuntu you do that by running:
sudo apt-get install libicu-dev
At least, that is the package the shell suggested when I tried to run uconv the first time. On newer releases uconv may ship in a different package, such as icu-devtools.
Once you have it installed, you need to know which encoding you are converting from and which encoding you are converting to. You can leave out the “from” encoding, but then the conversion may not work. In my case it didn’t.
To find out the encoding you can do this:
encoding=$(file -i messages.po | sed "s/.*charset=\(.*\)$/\1/")
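As an alternative to the sed expression, recent versions of file can print the charset on its own. A small sketch, assuming your file command supports the --mime-encoding option. The function name po_encoding is a placeholder of mine:

```shell
#!/bin/sh
# po_encoding FILE
# Prints only the charset that file(1) guesses for FILE,
# for example utf-8 or iso-8859-1.
po_encoding() {
    # -b omits the file name; --mime-encoding prints just the charset.
    file -b --mime-encoding "$1"
}
```

You would use it as encoding=$(po_encoding messages.po). Keep in mind that file only guesses from the bytes it sees, so treat the answer as a hint rather than a certainty.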
Once you know the encoding of the file, you can simply run
uconv -f iso-8859-1 -t utf-8 -o converted.po -v messages.po
Here we convert from iso-8859-1 (-f) to utf-8 (-t), write the output to converted.po (-o), run the program in verbose mode (-v) so we can see as much info as possible, and perform the conversion on the messages.po file.
Once you do that, you can run msgfmt with no problems.
Here are some of the links I found useful this time:
http://stackoverflow.com/questions/1083518/msgfmt-invalid-multibyte-sequence-error-on-a-polish-text (not that useful, added just as a reference)