Monthly Archives: October 2009

Pepipopum – automatically translate PO files with Google Translate

Edit: Since I wrote this in 2009, Google have withdrawn free access to the translation API. I’ll leave this post up for anyone using the paid version though…

If you’ve ever worked on localizing an application or website, you may be familiar with the .po files used with GNU gettext and compatible tools.

I’ve written a script which can take a .po file and translate any untranslated strings with Google Translate. This may not be a ‘release quality’ translation, but it does speed up the job of a real translator, who can simply proofread and correct the machine-translated entries.

See it in action here: http://pepipopum.dixo.net

I’ve released the source under the Affero GPL too, so you can tweak or host it yourself. The version hosted above does have a one second delay between translations, so if you want to go faster you’re encouraged to do exactly that!
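For anyone curious about the general idea, here is a minimal sketch in Python. It is not the Pepipopum source itself, just an illustration of filling in empty msgstr entries. The `translate` argument is a hypothetical callable you would supply (for example, a wrapper around whichever translation service you have access to), and `delay` is an assumed parameter that mimics the pause between translations mentioned above. The sketch only handles simple single-line entries: multi-line strings, plural forms and the header entry are left untouched, and strings are passed through verbatim without decoding escape sequences.

```python
import re
import time

# Matches a single-line msgid such as: msgid "Hello"
MSGID_RE = re.compile(r'^msgid "(.*)"$')


def fill_untranslated(po_text, translate, delay=0.0):
    """Fill empty single-line msgstr entries using translate(msgid).

    `translate` is a caller-supplied function (hypothetical here);
    `delay` is the number of seconds to pause after each translation.
    """
    lines = po_text.splitlines()
    out = []
    current_id = None
    for i, line in enumerate(lines):
        id_match = MSGID_RE.match(line)
        if id_match:
            # Remember the source string for the msgstr that follows.
            # The header and multi-line entries start with msgid "",
            # which leaves current_id empty so they are skipped below.
            current_id = id_match.group(1)
        elif line == 'msgstr ""' and current_id:
            next_line = lines[i + 1] if i + 1 < len(lines) else ""
            # A following line starting with a quote means the
            # translation continues there, so it is not really empty.
            if not next_line.startswith('"'):
                line = 'msgstr "%s"' % translate(current_id)
                if delay:
                    time.sleep(delay)
            current_id = None
        out.append(line)
    return "\n".join(out)
```

Calling `fill_untranslated(text, my_translate, delay=1.0)` would then return the file contents with the gaps filled in, ready for a human translator to review.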

Hope someone else finds it useful.

Pastebin.com and password lists

In the past few days, pastebin.com has been cited in a wide variety of high-profile news sources regarding a “leak” of email account passwords.

This brought a huge surge in visitors, and ensuring I kept the server functioning took up all my available spare time. I wrote a short blog entry which attracted a lot of comments. Things are a little calmer now, so I’m writing this longer post to explain what happened.

This looks like a long post, just tell me my email account wasn’t compromised…

  • I do not have a copy of the list
  • and….I do not have a copy of the list
  • just to be clear….I do not have a copy of the list

Microsoft investigated and have frozen the affected accounts on their systems. If you find yourself locked out of your account, fill out their recovery form to regain control.

Aside from that, if you’ve ever entered your email login details into anything but your provider’s web page, then I recommend you change your password. It’s likely that the leaked list came from a much larger set – just seeing the published list isn’t enough to be sure your details have not been compromised.

So, even if you are just a bit concerned, just change your password. Go on. I’ll wait.

All done? Now read on for the gory details….

Sometime prior to October 3rd 2009…

…some unknown bad guys start collecting email addresses and passwords.

We can be pretty sure that they didn’t “hack into” Microsoft or any other major email provider to obtain the passwords. These companies should not actually store your password; they should store only a fingerprint of it (what developers call a cryptographic hash).

To extend this analogy to the real world: if you emailed me your fingerprint, I couldn’t tell what you looked like, i.e. I could not reconstruct you just from that fingerprint. However, I could verify your identity if I met you by taking your fingerprint and comparing it to the one I had stored.

So, when you log in and send your password, they take the fingerprint of what you entered, and compare with the fingerprint stored in their database.
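To make the fingerprint idea concrete, here is a small Python sketch. It is an illustration only and makes no claim about what Microsoft or any other provider actually runs. The function names `make_fingerprint` and `check_password` are made up for this example, and the use of a random salt with PBKDF2 and 100,000 iterations is an assumption on my part (a common way to hash passwords), not something described in this post.

```python
import hashlib
import hmac
import os


def make_fingerprint(password, salt=None):
    """Return (salt, fingerprint); the password itself is never stored."""
    if salt is None:
        # A fresh random salt means two users with the same password
        # still end up with different fingerprints.
        salt = os.urandom(16)
    fingerprint = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, 100_000
    )
    return salt, fingerprint


def check_password(attempt, salt, stored_fingerprint):
    """Fingerprint the login attempt and compare with the stored one."""
    _, fingerprint = make_fingerprint(attempt, salt)
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(fingerprint, stored_fingerprint)
```

At signup the site would keep only the salt and fingerprint returned by `make_fingerprint`. At login, `check_password` repeats the calculation on whatever was typed and compares the results, so someone who steals the database gets fingerprints rather than passwords.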

So if they didn’t hack into a provider, where did they get them?

The most likely, and perhaps surprising, answer is that they simply asked the users for them. For example, they could create an authentic, safe looking site which promises to tell you who has blocked you on MSN Chat – all you need to do is enter your MSN account details.

Some researchers have also suggested the details were harvested by infecting PCs with keylogging software.

Oct 3rd, 04:00 UTC – Bad guys post 10,000 passwords on Pastebin.com

For reasons unknown, our miscreants post a set of Hotmail addresses and passwords on the pastebin.com website.

A sharp-eyed user spotted the posting, or found it via a Google search, and it reached the attention of a tech news blog called Neowin.

Oct 3rd, 16:45 UTC – post is flagged as abuse

If users spot a post which appears not to belong on pastebin, they can flag it for attention. I check these flagged posts daily, and it’s a very rapid and streamlined process:

The software presents me the first 10 lines of the post, together with a link I can click if I think the post should be deleted. Generally it’s pretty easy to determine if something doesn’t belong, and a list of email addresses and passwords is obviously not going to make the cut.

So, someone spotted the post and flagged it. The next morning, Oct 4th, at 07:29 I saw the first 10 lines, and deleted the post in a heartbeat before realising the true scale of the list which subsequently caught media attention.

Oct 5th – Blog posts gather momentum

After Neowin posted their article on October 5th, interest in the story steadily grew.

Oct 6th – Mainstream media catches the story

I was up early that day to check on the traffic and see if any special action would be needed. Having read the growing number of news articles, I took the following action:

  • Added additional rules to the content filters on pastebin.com to ensure Hotmail addresses could not be posted
  • Began searching all existing posts to ensure no further copies remained

Traffic levels were so high that the search was running at a crawl, so I closed the site to let the cleanup complete, and left for my office.

I reopened the site in the late afternoon UK time, and continued to monitor the traffic to ensure it remained as usable as possible.

OK, so why didn’t you keep a copy?

Let me abuse Pulp Fiction for a moment:

  • Jimmie: “Now let me ask you a question, Jules. When you drove in here, did you notice a sign out in front that said, “Email password storage”?”
  • Jules: “Jimmie…”
  • Jimmie: “Answer the question! Did you see a sign out in front of my house that said “Email password storage”?”
  • Jules: “Naw man, I didn’t.”
  • Jimmie: “You know why you didn’t see that sign?”
  • Jules: “Why?”
  • Jimmie: “‘Cause storin’ email passwords ain’t my fuckin’ business!”

Now, if it happens again, I may act differently. Security professionals at some large companies have expressed interest in helping their users if such a list could be made available to them. I’m more interested in enhancing the content filters on pastebin to ensure that text that looks like a list of email addresses is simply rejected.
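As a rough idea of what such a filter could look like, here is a Python sketch. It is not the actual pastebin.com filter code. The function name `looks_like_email_list`, the simplified email pattern and the thresholds (at least 5 non-empty lines, at least half of them containing an address) are all assumptions chosen for illustration, and a real filter would need tuning to avoid rejecting legitimate posts.

```python
import re

# Deliberately simple pattern; real email syntax is more permissive.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def looks_like_email_list(text, min_lines=5, threshold=0.5):
    """Return True if most non-empty lines contain an email address."""
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) < min_lines:
        # Too short to call it a list; a lone address in a post is fine.
        return False
    hits = sum(1 for line in lines if EMAIL_RE.search(line))
    return hits / len(lines) >= threshold
```

Working on the proportion of lines, rather than rejecting any post that contains an address, means ordinary code with the odd contact address in a comment would still get through.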

Even if your email address wasn’t on the list, if you think you’re the kind of person who is prone to phishing scams, just change your password. If you didn’t understand that last sentence, just change your password.

The published list was likely part of a much larger one, since it seems it was alphabetically ordered and only got as far as ‘b’. Having possession of that list will not help you determine if your address has been compromised.

Can I ask a question?

Sure! As long as it’s not “is my address on the list?”

Pastebin.com and the Hotmail password leak

It seems that a list of 10,000 Hotmail usernames and passwords has been posted on pastebin.com in recent days.

Pastebin was created as a tool to aid software development, not to distribute this sort of material.

As a result of the interest this story is generating, pastebin.com is experiencing huge levels of activity. I took it offline to ensure all the offending material has been removed, and have adjusted the abuse filters to prevent a recurrence.

Edit: please don’t ask if your name was on the list. I have no way of knowing. Just change your password.

Edit #2: things have calmed down now, and I’ve written a longer post about the incident here.