Edit: Since I wrote this in 2009, Google have withdrawn free access to the translation API. I’ll leave this post up for anyone using the paid version though…
If you’ve ever worked on localizing an application or website, you may be familiar with the .po files used with GNU gettext and compatible tools.
I’ve written a script which can take a .po file and translate any untranslated strings with Google Translate. This may not be a ‘release quality’ translation, but it does speed up the job of a real translator, who can simply proofread and correct the machine-translated entries.
See it in action here: http://pepipopum.dixo.net
I’ve released the source under the Affero GPL too, so you can tweak or host it yourself. The version hosted above does have a one second delay between translations, so if you want to go faster you’re encouraged to do exactly that!
Hope someone else finds it useful.
I’ve made a few fixes to correct some mangling of placeholders like %1 in translated strings, and also made the direct output appear as it is generated, rather than waiting until the entire translation is complete.
Hi Paul. You did a great job. I tried to translate my plugin (great plugin, by the way: http://wordpress.org/extend/plugins/welcome-announcement). It works fine except for accented characters and ‘. Do you have any suggestions?
I like your translator, although it is slower (but more correct) than
http://translate.umpirsky.com
One suggestion: Add option to mark all automatically translated strings as fuzzy, to remind the user to check them manually.
I’ve contacted JMbamba and have his po file, so will see what I can do to improve things.
@Samir, do you mean as a comment, or in the string itself? Also, you can make it faster by installing it yourself – the version I’m hosting has an artificial delay.
Simply, in the output, put a
#, fuzzy
line before the English line of every machine-translated entry (so not for entries that originally had a translation). This option could be turned on or off with a check box.
I know that the delay is artificial.
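The fuzzy marking requested here could look something like the following. This is only a sketch and not part of Pepipopum: addFuzzyFlag is a hypothetical helper, and it assumes the comment lines of an entry have been collected into an array before the entry is written out. If the entry already has a flags line (such as "#, c-format"), the flag is merged into it rather than added as a second line.

```php
<?php
/**
 * Hypothetical helper (not in the original script): returns the comment
 * lines of one PO entry with the "fuzzy" flag added.
 */
function addFuzzyFlag(array $commentLines): array
{
    foreach ($commentLines as $i => $line) {
        // A PO flags line starts with "#," and holds a comma-separated list
        if (strpos($line, '#,') === 0) {
            $flags = array_map('trim', explode(',', substr($line, 2)));
            $flags = array_values(array_filter($flags, function ($f) {
                return $f !== '';
            }));
            if (!in_array('fuzzy', $flags, true)) {
                array_unshift($flags, 'fuzzy');
            }
            $commentLines[$i] = '#, ' . implode(', ', $flags);
            return $commentLines;
        }
    }
    // No flags line yet: append one after the other comments
    $commentLines[] = '#, fuzzy';
    return $commentLines;
}

// Example: an entry with a source reference and an existing c-format flag
echo implode("\n", addFuzzyFlag(['#: src/main.c:12', '#, c-format'])), "\n";
```

Note that the script posted further down writes comment lines out as soon as it reads them, so using a helper like this there would mean buffering the comments until the entry has been translated. A simpler variant would be to emit a "#, fuzzy" line just before "msgid" for every machine-translated entry.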
Hi!
I was wondering if maybe your script is available from anywhere else since both links seem to be down.
Thanks!
Hi! Any news about the broken links?
I am interested in the project myself!
Regards
Mike
Sorry about the service being offline – I had it running on a spare server we’ve since taken offline. It will be back up shortly!
Thanks for the source!
This is a wonderful idea you have here.
What license is the source under?
My (short!) post does mention it’s licensed under the Affero GPL 🙂
Hi!
This is a great program that you made! The only thing that I am missing is a way to define the input language. My project is originally in German, so I will have to modify the source…
Is this supposed to be working? I submitted a file and it claimed it was translating it, but none of the lines were translated in the download or in the online output. Very strange. The file I submitted had been created by django-admin.py makemessages
Oops, there was a problem with curl on that server. It works now.
The hosted version is really just for demonstration, if you find it useful I encourage you to host it yourself.
thanks for your quick reply!!
Superb online tool; it saves our project http://userecho.com a lot of time.
We recently translated the site into Dutch with this tool.
It took about 15 minutes for 5500 words.
You can also look at our service and use it to collect customer ideas and feedback for websites and projects. We can offer our paid plan for free for your site http://pepipopum.dixo.net/index.php
Thanks so much for this script! I’ve used it here on your site a bunch of times; however, when I tried to set it up on WAMP it would not run. It thinks and thinks and then shows the download link, but clicking the link says the file may have expired. I’ve enabled php_curl but still no dice. Any ideas?
It would be nice if it were possible to translate an already translated PO file into another (similar) language; for that, the existing translation in msgstr would need to be used as the source text as well.
#: gnome-manual-duplex.glade:381 gnome-manual-duplex.glade:446
msgid ""
"HP LJ 1005/1018/1020: reverse pages\n"
"HP LJ P1005/P1006/P1505: reverse pages\n"
"HP LJ Pro P1102, P1566: reverse pages\n"
"HP CLJ 1600/2600/CP1215: reverse pages\n"
"Minolta/QMS 2300 DL: reverse pages\n"
"Others: depends\n"
msgstr "HP LJ 1005/1018/1020: páginas de inverter HP LJ P1005/P1006/P1505: páginas de inverter HP LJ Pro P1102, P1566: páginas de inverter 1600/2600/CP1215 HP CLJ: páginas de inverter Minolta / QMS 2300 DL: páginas Outros inverso: depende"
Should be (Portuguese) 6 lines instead of 1 line:
msgstr ""
"HP LJ 1005/1018/1020: páginas de inverter\n"
"HP LJ P1005/P1006/P1505: páginas de inverter\n"
"HP LJ Pro P1102, P1566: páginas de inverter\n"
"1600/2600/CP1215 HP CLJ: páginas de inverter\n"
"Minolta / QMS 2300 DL: páginas de inverter\n"
"Outros: depende\n"
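One way to keep the line structure shown above is to translate each part of a multi-line msgid separately and put its trailing newline back afterwards. The sketch below assumes the parts arrive as an array of unescaped strings, as the parser in the script further down provides them. translateParts is a hypothetical helper, and the $translate callback stands in for whatever translation call is used.

```php
<?php
/**
 * Hypothetical helper: translates each msgid part on its own so that the
 * msgstr keeps the same number of lines and the same trailing newlines.
 */
function translateParts(array $msgidParts, callable $translate): array
{
    $msgstrParts = [];
    foreach ($msgidParts as $part) {
        // Split the part into its text and any trailing newline characters
        $body = rtrim($part, "\n");
        $newlines = substr($part, strlen($body));
        // Empty parts (such as the leading "" of a multi-line entry) pass through
        $msgstrParts[] = ($body === '') ? $part : $translate($body) . $newlines;
    }
    return $msgstrParts;
}

// strtoupper is only a stand-in for a real translation call
$parts = ['', "HP LJ 1005/1018/1020: reverse pages\n", "Others: depends\n"];
var_dump(translateParts($parts, 'strtoupper'));
```

The trade-off is that PO tools sometimes wrap a long sentence across several parts without a newline, and translating such fragments separately would probably give worse results than translating the joined sentence.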
I found this today, and do find it interesting. I’ve been translating for a while, and this seems like a tool that could help.
I wish to encourage you to implement the “fuzzy” feature, as suggested above. Automatic translations are not good enough to be used without being proofread and fixed first. They need to be marked as only approximate, and that is exactly what the fuzzy marker in po files is for.
Otherwise, I will not know which ones to review. And if it is half of each, I’m not that much helped by the auto-translation any more.
Hi!
Cool service. However, it’s not usable for us unless the Google translated strings are marked “fuzzy”, as Samir Ribic already suggested. The reason is that the strings need to be reviewed by a human to check whether they are accurate. If you already have a partially completed file, there is no way to distinguish between already verified lines and newly Google translated lines.
Greets,
Dreas
You ROCK!!!! Thankyou!
I know this is a GREAT service, but the pepipopum.dixo.net site is currently unavailable. Please fix it ASAP…
Nice script
2 requests.
Ignore placeholder strings like %s etc.
Mark as fuzzy (as mentioned above)
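A possible way to handle the first request is to hide placeholders behind tokens before the text is sent for translation and restore them afterwards. This is a sketch only: maskPlaceholders and unmaskPlaceholders are hypothetical names, the pattern covers only a few common printf-style forms, and it is an assumption that a translation service would leave tokens such as __PH0__ untouched.

```php
<?php
/**
 * Hypothetical helper: replaces placeholders such as %s, %d, %1$s or %1
 * with opaque tokens. Returns array($maskedText, $tokenToPlaceholderMap).
 */
function maskPlaceholders(string $text): array
{
    $map = [];
    $masked = preg_replace_callback(
        '/%(?:\d+\$)?[sdfu]|%\d+/',
        function (array $m) use (&$map) {
            $token = '__PH' . count($map) . '__';
            $map[$token] = $m[0];
            return $token;
        },
        $text
    );
    return [$masked, $map];
}

/**
 * Hypothetical helper: puts the original placeholders back.
 */
function unmaskPlaceholders(string $text, array $map): string
{
    return strtr($text, $map);
}

list($masked, $map) = maskPlaceholders('Hello %1$s, you have %d new messages');
echo $masked, "\n";
// ...$masked would be sent for translation here...
echo unmaskPlaceholders($masked, $map), "\n";
```

Because each token carries its own index, the placeholders would still be restored correctly if the translation reorders them.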
I downloaded the script to try it on my own machine, but cannot get it to work. Any suggestions?
Hi, great job! But is it still going to work after Google shuts down the Google Translate API on Dec 1st 2011?
Hello, any chance that the translation will be enabled again? Or is there any site that has this application up and running?
Hi,
Great script, but I can’t make it run on my server. Can you help me make it work?
Hi, it seems that Google does not provide the public API for translations anymore. Now there is a paid version 2.
Hi Paul,
Great program, but I am worried, since Google has made v2 a paid service.
Could you fix it for v2, so that those who have a key can still use it?
That way we can save a great piece of work.
Has anyone here found a solution for translating a .po file automatically, now that Google wants to charge for using their service? I have already set up an account with Google API v2 and am willing to pay for it.
What Scott suggested sounds great
Hello A1on,
Now this solution is working.
You can contact a person on xaurav at gmail com.
He has fixed it and can provide this solution.
Can you make a download link so we can use this software? Not everyone understands programming languages.
Hi,
I am going to use the TM-database tool to translate .po files. It supports translation with Google, Bing, and Yahoo. Latest version: http://yehongmei.narod.ru/TM_database.rar
If this no longer works because Google shut down their free API, could you please make a note of that on the page? I just spent half an hour working on this only to realize it’s hopeless.
It is not hopeless. I developed a free translation of PO files without the need to use Google API. Look at this http://www.po-auto-translator.tk.
if you use PO files for software or website localization, you should check out this web-based translation platform http://poeditor.com/ which has an incorporated Google Translate API.
Hi
Just want to share a new script that I made
It uses Google Translate
Here is what I have done
Hope it helps somebody :o)
<?php
/**
* Pepipopum - Automatic PO Translation via Google Translate
* Copyright (C)2009 Paul Dixon (lordelph@gmail.com)
* $Id: index.php 21080 2009-10-25 10:04:24Z paul $
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*
*
* REQUIREMENTS:
*
* Requires curl to perform the Google Translate API call but could
* easily be adapted to use something else or make the HTTP call
* natively.
*/
/**
* Define delay between Google API calls (can be fractional for sub-second delays)
*
* This reduces load on the server and plays nice with Google. If you want a faster
* experience, simply host Pepipopum on your own server and lower this value.
*/
define('PEPIPOPUM_DELAY', 1);
// don't forget to register for the Google API -> https://code.google.com/apis/console/
// and visit "BILLING" and pay 10$ or more
// and visit "API Access" and find Your google translate API key (Key for browser apps (with referers))
// and edit/add a referer to "*.currentdomain.dk/*"
// and put your key here:
$mykey="PUT_KEY_HERE";
// the name of this file (put it in the root)
$this_filename = "translateNow.php";
/**
* POProcessor provides a simple PO file parser
*
* Can parse a PO file and calls processEntry for each entry in it
* Can derive from this class to perform any transformation you
* like
*/
class POProcessor
{
public $max_entries=0; //for testing you can limit the number of entries processed
private $start=0; //timestamp when we started
protected $progressCallback=null; //optional progress callback, see setProgressCallback
public function __construct()
{
}
/**
* Set callback function which is passed the completion
* percentage and remaining time of the parsing operation. This callback
* will be called up to 100 times, depending on the
* size of the file.
*
* Callback is a function name, or an array of ($object,$methodname)
* as is common for PHP style callbacks
*/
public function setProgressCallback($callback)
{
$this->progressCallback=$callback;
}
/**
* Parses input file and calls processEntry for each recognized entry
* and output for all other lines
*
* To track progress, see setProgressCallback
*/
public function process($inFile)
{
set_time_limit(86400);
$this->start=time();
$msgid=array();
$msgstr=array();
$count=0;
$size=filesize($inFile);
$percent=-1;
$state=0;
$in=fopen($inFile, 'r');
while (!feof($in))
{
$line=trim(fgets($in));
$pos=ftell($in);
$percent_now=round(($pos*100)/$size);
if ($percent_now!=$percent)
{
$percent=$percent_now;
$remain='';
$elapsed=time()-$this->start;
if ($elapsed>=5 && $percent>0) //avoid dividing by zero while still at 0%
{
$total = $elapsed/($percent/100);
$remain=$total-$elapsed;
}
$this->showProgress($percent,$remain);
}
$match=array();
switch ($state)
{
case 0://waiting for msgid
if (preg_match('/^msgid "(.*)"$/', $line,$match))
{
$clean=stripcslashes($match[1]);
$msgid=array($clean);
$state=1;
}
break;
case 1: //reading msgid, waiting for msgstr
if (preg_match('/^msgstr "(.*)"$/', $line,$match))
{
$clean=stripcslashes($match[1]);
$msgstr=array($clean);
$state=2;
}
elseif (preg_match('/^"(.*)"$/', $line,$match))
{
$msgid[]=stripcslashes($match[1]);
}
break;
case 2: //reading msgstr, waiting for blank
if (preg_match('/^"(.*)"$/', $line,$match))
{
$msgstr[]=stripcslashes($match[1]);
}
elseif (empty($line))
{
//we have a complete entry
$this->processEntry($msgid, $msgstr);
$count++;
if ($this->max_entries && ($count>$this->max_entries))
{
break 2;
}
$state=0;
}
break;
}
//comment or blank line?
if (empty($line) || preg_match('/^#/',$line))
{
$this->output($line."\n");
}
}
fclose($in);
}
/**
* Called whenever the parser recognizes a msgid/msgstr pair in the
* po file. It is passed an array of strings for the msgid and msgstr
* which correspond to multiple lines in the input file, allowing you
* to preserve this if desired.
*
* Default implementation simply outputs the msgid and msgstr without
* any further processing
*/
protected function processEntry($msgid, $msgstr)
{
$this->output("msgid ");
foreach($msgid as $part)
{
$part=addcslashes($part,"\r\n\"");
$this->output("\"{$part}\"\n");
}
$this->output("msgstr ");
foreach($msgstr as $part)
{
$part=addcslashes($part,"\r\n\"");
$this->output("\"{$part}\"\n");
}
}
/**
* Internal method to call the progress callback if set
*/
protected function showProgress($percentComplete, $remainingTime)
{
if (is_array($this->progressCallback))
{
$obj=$this->progressCallback[0];
$method=$this->progressCallback[1];
$obj->$method($percentComplete,$remainingTime);
}
elseif (is_string($this->progressCallback))
{
$func=$this->progressCallback;
$func($percentComplete,$remainingTime);
}
}
/**
* Called to emit parsed lines of the file - override this
* to provide customised output
*/
protected function output($str)
{
global $output;
$output.=$str;
}
}
/**
* Derivation of POProcessor which passes untranslated entries through the Google Translate
* API and writes the transformed PO to another file
*
*/
class POTranslator extends POProcessor
{
/**
* Google API requires a referrer - constructor will build a suitable default
*/
public $referrer;
/**
* How many seconds should we wait between Google API calls to be nice
* to google and the server running Pepipopum? Can use a floating point
* value for sub-second delays
*/
public $delay=PEPIPOPUM_DELAY;
public function __construct()
{
parent::__construct();
//Google API needs to be passed a referrer
$this->referrer="http://{$_SERVER['HTTP_HOST']}{$_SERVER['REQUEST_URI']}";
}
/**
* Translates a PO file storing output in desired location
*/
public function translate($inFile, $outFile, $srcLanguage, $targetLanguage)
{
$ok=true;
$this->srcLanguage=$srcLanguage;
$this->targetLanguage=$targetLanguage;
$this->fOut=fopen($outFile, 'w');
if ($this->fOut)
{
$this->process($inFile);
fclose($this->fOut);
}
else
{
trigger_error("POTranslator::translate unable to open $outFile for writing", E_USER_ERROR);
$ok=false;
}
return $ok;
}
/**
* Overridden output method writes to output file
*/
protected function output($str)
{
if ($this->fOut)
{
fwrite($this->fOut, $str);
flush();
}
}
/**
* Overridden processEntry method performs the Google Translate API call
*/
protected function processEntry($msgid, $msgstr)
{
$input=implode('', $msgid);
$output=implode('', $msgstr);
if (!empty($input) && empty($output))
{
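global $mykey; //the API key is defined at file scope, so import it into this method
$q=urlencode($input);
//use the languages passed to translate() rather than a hard-coded en -> da pair
$url="https://www.googleapis.com/language/translate/v2?key=".urlencode($mykey)."&q=".$q."&source=".urlencode($this->srcLanguage)."&target=".urlencode($this->targetLanguage);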
$cmd="curl -e ".escapeshellarg($this->referrer).' '.escapeshellarg($url);
$result=`$cmd`;
$data=json_decode($result);
//echo $data->data->translations[0]->translatedText;
if (is_object($data) && is_object($data->data->translations[0]) && isset($data->data->translations[0]->translatedText))
{
$output=$data->data->translations[0]->translatedText;
//Google translate mangles placeholders, lets restore them
$output=preg_replace('/%\ss/', '%s', $output);
$output=preg_replace('/% (\d+) \$ s/', ' %$1\$s', $output);
$output=preg_replace('/^ %/', '%', $output);
//have seen %1 get flipped to 1%
if (preg_match('/%\d/', $input) && preg_match('/\d%/', $output))
{
$output=preg_replace('/(\d)%/', '%$1', $output);
}
//we also get entities for some chars
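//decode quotes too (e.g. &#39;) and treat the text as UTF-8 so accented characters survive
$output=html_entity_decode($output, ENT_QUOTES, 'UTF-8');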
$msgstr=array($output);
}
//play nice with google
//usleep($this->delay * 1000000);
}
//output entry
parent::processEntry($msgid, $msgstr);
}
}
//simple progress callback which emits some JS to update the
//page with a progress count
function showProgress($percent,$remainingTime)
{
$time='';
if (!empty($remainingTime))
{
if ($remainingTime<120)
{
$time=sprintf("(%d seconds remaining)",$remainingTime);
}
elseif ($remainingTime<60*120)
{
$time=sprintf("(%d minutes remaining)",round($remainingTime/60));
}
else
{
$time=sprintf("(%d hours remaining)",round($remainingTime/3600));
}
}
echo '<script type="text/javascript">';
echo "document.getElementById('info').innerHTML='$percent% complete $time';";
echo "</script>\n";
flush();
}
function processForm()
{
set_time_limit(86400);
$translator=new POTranslator();
if ($_POST['output']=='html')
{
//we output to a temporary file to allow later download
echo 'Processing PO file...';
echo '<div id="info"></div>'; //element updated by the showProgress callback
$translator->setProgressCallback('showProgress');
$outfile = tempnam(sys_get_temp_dir(), 'pepipopum');
}
else
{
//output directly
header("Content-Type:text/plain");
$outfile="php://output";
}
$translator->translate($_FILES['pofile']['tmp_name'], $outfile, 'en', $_POST['language']);
if ($_POST['output']=='html')
{
//show download link
$leaf=basename($outfile);
$name=$_FILES['pofile']['name'];
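//link markup was lost when this script was pasted; rebuilt from the download handler below
$href='?download='.urlencode($leaf).'&name='.urlencode($name);
echo 'Completed - <a href="'.htmlspecialchars($href).'">download your updated po file</a>';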
}
else
{
//we're done
exit;
}
}
if (isset($_GET['viewsource']))
{
highlight_file($_SERVER['SCRIPT_FILENAME']);
exit;
}
if (isset($_GET['download']) && isset($_GET['name']))
{
//check download file is valid
$file=sys_get_temp_dir().DIRECTORY_SEPARATOR.$_GET['download'];
$ok=preg_match('/^pepipopum[A-Za-z0-9]+$/', $_GET['download']);
$ok=$ok && file_exists($file);
//sanitize name
$name=preg_replace('/[^a-z0-9\._]/i', '', $_GET['name']);
if ($ok)
{
header("Content-Type:text/plain");
header("Content-Length:".filesize($file));
header("Content-Disposition: attachment; filename=\"{$name}\"");
readfile($file);
}
else
{
//fail
header("HTTP/1.0 404 Not Found");
echo "The requested pepipopum output file is not available - it may have expired. Click here to generate a new one.";
}
exit;
}
if (isset($_POST['output']) && ($_POST['output']=='pofile'))
{
processForm();
}
?>
Pepipopum - Translate PO file with Google Translate
body
{
background:#eeeeee;
margin: 0;
padding: 0;
text-align: center;
font-family:Verdana,Arial,Helvetica
}
#main
{
padding: 3em;
margin: 1em auto 1em auto;
width: 50em;
border:1px solid #dddddd;
text-align: left;
background:white;
}
#footer
{
text-align:right;
font-size:8pt;
color:#888888;
border-top:1px solid #888888;
}
h1
{
margin-top:0;
}
form
{
background:#dddddd;
padding:2em;
margin:0 2em 0 2em;
-moz-border-radius: 1em;
-webkit-border-radius: 1em;
border-radius: 1em;
font-size:0.8em;
}
fieldset
{
background:#cccccc;
border:1px solid #aaaaaa;
margin-bottom:1em;
padding:1em;
position:relative;
-moz-border-radius: 0.5em;
-webkit-border-radius: 0.5em;
border-radius: 0.5em;
}
legend
{
background:#aaaaaa;
border:0;
padding:0 1em 0 1em;
margin-left:1em;
color:#ffffff;
position: absolute;
top: -.5em;
left: .2em;
-moz-border-radius: 0.5em;
-webkit-border-radius: 0.5em;
border-radius: 0.5em;
}
Pepipopum - Translate PO files with Google Translate
PO files originate from the GNU gettext
tools and can be generated by a wide variety of other localization tools.
Pepipopum allows you to upload a PO file containing English language strings
in the msgid,
and it uses the Google Translate API
to construct a PO file containing translated equivalents in each corresponding msgstr
If the PO file already contains a translation for a given msgid, it will not
be translated. This
allows you to upload a proof-read PO and just get translations for any new elements.
<form enctype="multipart/form-data" action="" method="post">
Input
PO File
Output options
Target Language
Afrikaans
Albanian
Arabic
Belarusian
Bulgarian
Catalan
Chinese (Simplified)
Chinese (Traditional)
Croatian
Czech
Danish
Dutch
English
Estonian
Filipino
Finnish
French
Galician
German
Greek
Hebrew
Hindi
Hungarian
Icelandic
Indonesian
Irish
Italian
Japanese
Korean
Latvian
Lithuanian
Macedonian
Malay
Maltese
Norwegian
Persian
Polish
Portuguese
Romanian
Russian
Serbian
Slovak
Slovenian
Spanish
Swahili
Swedish
Thai
Turkish
Ukrainian
Vietnamese
Welsh
Yiddish
Output PO File
Output progress meter and then provide a download link
You can automate translation by using a tool like cURL to post a PO file and obtain
a translated result. For example:
curl -F pofile=@input-po-filename \
-F language=target-language-code \
-F output=pofile \
http://pepipopum.dixo.net \
--output output-po-filename
The PHP5 source code to this software is available under an
Affero GPL licence. Please
note that this installation of Pepipopum introduces a one second
delay between each Google API call to reduce load on this server and to play nice with
Google. If you want to go faster, you're encouraged to host your own installation.
Why is it called "Pepipopum"? I just invented a word which had
'po' in it and was relatively rare on Google! Pronounce it pee-pie-poe-pum.
Comments and suggestions are welcome.
(c)2009 Paul Dixon
Sorry I couldn’t post it properly, but let me know how to do it, and I will post it.
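The request that the script above sends to the paid v2 endpoint can be isolated into small functions, which may make it easier to test without shelling out to the curl binary. This is a sketch under assumptions: the endpoint, the query parameters and the response shape are taken from the script above, while buildTranslateUrl, extractTranslation and fetchTranslation are hypothetical names. fetchTranslation additionally assumes PHP's curl extension is installed.

```php
<?php
/**
 * Hypothetical helper: builds the v2 request URL used by the script above.
 */
function buildTranslateUrl(string $key, string $text, string $source, string $target): string
{
    $params = ['key' => $key, 'q' => $text, 'source' => $source, 'target' => $target];
    return 'https://www.googleapis.com/language/translate/v2?' . http_build_query($params, '', '&');
}

/**
 * Hypothetical helper: pulls the translated text out of a JSON response,
 * or returns null if the response does not have the expected shape.
 */
function extractTranslation(string $json): ?string
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['data']['translations'][0]['translatedText'])) {
        return null;
    }
    // The API returns HTML entities for some characters, so decode them
    return html_entity_decode($data['data']['translations'][0]['translatedText'], ENT_QUOTES, 'UTF-8');
}

/**
 * Hypothetical helper: performs the HTTP call with PHP's curl extension.
 * Not called below, since it needs a valid API key and network access.
 */
function fetchTranslation(string $url, string $referrer): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_REFERER, $referrer);
    $body = curl_exec($ch);
    curl_close($ch);
    return is_string($body) ? extractTranslation($body) : null;
}

echo buildTranslateUrl('PUT_KEY_HERE', 'Hello world', 'en', 'da'), "\n";
var_dump(extractTranslation('{"data":{"translations":[{"translatedText":"Hej verden"}]}}'));
```

Keeping the URL building and the response parsing separate from the network call means both can be checked with fixed strings, as in the last two lines.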