24th Oct, 2009

Pepipopum – automatically translate PO files with Google Translate

If you’ve ever worked on localizing an application or website, you may be familiar with the .po files used with GNU gettext and compatible tools.

I’ve written a script which can take a .po file and translate any untranslated strings with Google Translate. The result may not be a ‘release quality’ translation, but it does speed up the job of a real translator, who can simply proofread and correct the machine-translated entries.
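As a rough illustration of the idea (this is not the Pepipopum source), the core step can be sketched in Python. Here `translate` is a hypothetical callable standing in for whatever translation service is used, and the sketch only handles simple single-line entries:

```python
import re

# A single-line msgid followed by an empty single-line msgstr.
# The lookahead skips multi-line msgstr blocks, which also begin with msgstr "".
EMPTY_ENTRY = re.compile(r'^msgid "((?:[^"\\]|\\.)*)"\nmsgstr ""$(?!\n")', re.MULTILINE)

def unescape(s):
    # Turn PO escapes (backslash-n, backslash-quote, ...) into real characters.
    return re.sub(r'\\(.)', lambda m: {"n": "\n", "t": "\t"}.get(m.group(1), m.group(1)), s)

def escape(s):
    # Inverse of unescape: make the text safe to place inside PO double quotes.
    return s.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n").replace("\t", "\\t")

def fill_untranslated(po_text, translate):
    """Fill empty msgstr entries using translate(); existing translations are untouched."""
    def repl(m):
        source = m.group(1)
        if not source:  # an empty msgid is the PO header entry, leave it alone
            return m.group(0)
        return f'msgid "{source}"\nmsgstr "{escape(translate(unescape(source)))}"'
    return EMPTY_ENTRY.sub(repl, po_text)
```

Passing a stand-in such as `str.upper` for `translate` is enough to see which entries would be touched before wiring in a real service.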

See it in action here: http://pepipopum.dixo.net

I’ve released the source under the Affero GPL too, so you can tweak or host it yourself. The version hosted above has a one-second delay between translations, so if you want to go faster you’re encouraged to do exactly that!

Hope someone else finds it useful.

Responses

I’ve made a few fixes to correct some mangling of placeholders like %1 in translated strings, and also made the direct output appear as it is generated, rather than waiting until the entire translation is complete.
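For readers curious how this kind of placeholder mangling can be avoided, one common approach is to swap placeholders for opaque tokens before translation and restore them afterwards. The sketch below is illustrative only and is not taken from the Pepipopum source; the placeholder pattern and the `__PH0__` token format are assumptions, and there is no guarantee that a given translation service leaves such tokens intact.

```python
import re

# Assumed pattern: printf-style %s, %d, %2$s and Qt-style %1, %2, ...
PLACEHOLDER = re.compile(r"%(?:\d+\$)?[sd]|%\d+")

def protect(text):
    """Replace each placeholder with a token; return (masked_text, originals)."""
    originals = []

    def stash(match):
        originals.append(match.group(0))
        return f"__PH{len(originals) - 1}__"

    return PLACEHOLDER.sub(stash, text), originals

def restore(text, originals):
    """Put the original placeholders back in place of their tokens."""
    for i, placeholder in enumerate(originals):
        text = text.replace(f"__PH{i}__", placeholder)
    return text
```

The masked text is what would be sent for translation; `restore` is then applied to whatever comes back.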

Hi Paul. You did a great job. I tried to translate my plugin (great plugin by the way http://wordpress.org/extend/plugins/welcome-announcement). It works fine except for accented characters and ‘ . Do you have any suggestions?

I like your translator, although it is slower (but more correct) than
http://translate.umpirsky.com

One suggestion: add an option to mark all automatically translated strings as fuzzy, to remind the user to check them manually.

I’ve contacted JMbamba and have his po file, so will see what I can do to improve things.

@Samir, do you mean as a comment, or in the string itself? Also, you can make it faster by installing it yourself – the version I’m hosting has an artificial delay.

Simply in the output text: for every line the tool translates (so not for lines which originally had some content), put one
#, fuzzy
statement before the English line. This option could be turned on or off with a check box.

I know that the delay is artificial.
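The fuzzy-flag suggestion above could be sketched roughly as follows. This is a hypothetical helper, not part of Pepipopum; it assumes each entry is already available as a list of lines, and it merges with an existing flags line such as `#, c-format` rather than adding a second one.

```python
def mark_fuzzy(entry_lines):
    """Return a copy of one PO entry (a list of lines) flagged as fuzzy."""
    out = []
    flagged = False
    for line in entry_lines:
        if line.startswith("#,"):
            # Entry already has a flags line: add fuzzy to it if missing.
            flags = [f.strip() for f in line[2:].split(",") if f.strip()]
            if "fuzzy" not in flags:
                flags.insert(0, "fuzzy")
            out.append("#, " + ", ".join(flags))
            flagged = True
        elif line.startswith(("msgctxt ", "msgid ")) and not flagged:
            # No flags line seen: insert one just before the entry's first keyword.
            out.append("#, fuzzy")
            out.append(line)
            flagged = True
        else:
            out.append(line)
    return out
```

Only entries the tool filled in itself would be passed through `mark_fuzzy`, so translations that were already present stay unflagged.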

Hi!
I was wondering if maybe your script is available from anywhere else since both links seem to be down.

Thanks!

Hi! Any news about the broken links?

I am interested in the project myself!

Regards
Mike

Sorry about the service being offline – I had it running on a spare server we’ve since taken offline. It will be back up shortly!

Thanks for the source!

This is a wonderful idea you have here.

What license is the source under?

My (short!) post does mention it’s licensed under the Affero GPL :)

Hi!

This is a great program that you wrote! The only thing I am missing is a way to define the input language. My project is originally in German, so I will have to modify the source…

Is this supposed to be working? I submitted a file and it claimed it was translating it, but none of the lines were translated in the download or in the online output. Very strange. The file I submitted had been created by django-admin.py makemessages

Oops, there was a problem with curl on that server. It works now.

The hosted version is really just for demonstration, if you find it useful I encourage you to host it yourself.

thanks for your quick reply!!

Superb online tool, it saves a lot of time for our project http://userecho.com.

We recently translated the site into Dutch with this tool.

It took about 15 minutes for 5500 words.

You can also look at our service, which can be used to collect customer ideas and feedback for websites and projects. We can offer our paid plan for free for your site http://pepipopum.dixo.net/index.php

Thanks so much for this script! I’ve used it here on your site a bunch of times, but when I tried to set it up on WAMP it would not run. It thinks and thinks and then shows the Download file link, but when I click to download it says the file may have expired. I’ve enabled php_curl but still no dice. Any ideas?

It would be nice if it were possible to translate an already translated po file into another (similar) language – that would also need an option to translate from the existing translation.

#: gnome-manual-duplex.glade:381 gnome-manual-duplex.glade:446
msgid ""
"HP LJ 1005/1018/1020: reverse pages\n"
"HP LJ P1005/P1006/P1505: reverse pages\n"
"HP LJ Pro P1102, P1566: reverse pages\n"
"HP CLJ 1600/2600/CP1215: reverse pages\n"
"Minolta/QMS 2300 DL: reverse pages\n"
"Others: depends\n"
msgstr "HP LJ 1005/1018/1020: páginas de inverter HP LJ P1005/P1006/P1505: páginas de inverter HP LJ Pro P1102, P1566: páginas de inverter 1600/2600/CP1215 HP CLJ: páginas de inverter Minolta / QMS 2300 DL: páginas Outros inverso: depende"

It should be (in Portuguese) 6 lines instead of 1 line:

msgstr ""
"HP LJ 1005/1018/1020: páginas de inverter\n"
"HP LJ P1005/P1006/P1505: páginas de inverter\n"
"HP LJ Pro P1102, P1566: páginas de inverter\n"
"1600/2600/CP1215 HP CLJ: páginas de inverter\n"
"Minolta / QMS 2300 DL: páginas de inverter\n"
"Outros: depende\n"
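One way to get the behaviour asked for here is to translate each newline-separated segment on its own and then write the result back as a multi-line msgstr. The following is only a sketch under that assumption, not the tool's actual code; `translate` is a hypothetical callable, and the input is assumed to be the already unescaped msgid text. Keeping the line breaks also keeps the trailing newline, which matters because gettext expects msgid and msgstr to agree on whether they end with `\n`.

```python
def translate_multiline(msgid, translate):
    """Translate each line separately so embedded newlines survive."""
    segments = msgid.split("\n")
    return "\n".join(translate(seg) if seg.strip() else seg for seg in segments)

def format_msgstr(text):
    """Render text as a PO msgstr, one quoted line per embedded newline."""
    def esc(s):
        return s.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

    if "\n" not in text:
        return f'msgstr "{esc(text)}"'
    parts = text.split("\n")
    # Re-attach the newline to every segment except a non-empty final one.
    chunks = [p + "\n" for p in parts[:-1]] + ([parts[-1]] if parts[-1] else [])
    return "\n".join(['msgstr ""'] + [f'"{esc(c)}"' for c in chunks])
```

Translating line by line loses some context for the translation service, so this trades a little translation quality for a msgstr with the same layout as the msgid.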

I found this today, and do find it interesting. I’ve been translating for a while, and this seems like a tool that could help.

I wish to encourage you to implement the “fuzzy” feature, as suggested above. Automatic translations are not good enough to be used without being proofread and fixed first. They need to be marked as only approximate, and that is exactly what the fuzzy marker in po files is for.

Otherwise, I will not know which ones to review. And if it is half of each, I’m not that much helped by the auto-translation any more.
