Category Archives: Pastebin

Pastebin.com has a new owner!

Congratulations to Jeroen, who is the new owner of pastebin.com. Many thanks to everyone who expressed an interest in taking it over.

The site is now running on vastly improved hardware, and I’m sure Jeroen is going to do a fantastic job in taking the idea forward.

You can track future news and updates by following @pastebincom on Twitter.

End of an era for me, but I wish Jeroen the very best of luck!

Want to buy pastebin.com?

I have a need to shed various side projects to free up my time, so I’m looking for anyone who is interested in purchasing pastebin.com and developing it further.

I created the site way back in 2002, and it’s more popular now than ever, with usage steadily growing. Now is a great time to hand over to someone who can develop the idea further – something I’ve struggled to find the time to do.

Don’t delay though, as some good offers have already been made. Watch this space for more news.

(Edit – sold!)

Pastebin.com and password lists

In the past few days, pastebin.com has been cited in a wide variety of high-profile news sources regarding a “leak” of email account passwords.

This brought a huge surge in visitors, and ensuring I kept the server functioning took up all my available spare time. I wrote a short blog entry which attracted a lot of comments. Things are a little calmer now, so I’m writing this longer post to explain what happened.

This looks like a long post, just tell me my email account wasn’t compromised…

  • I do not have a copy of the list
  • and….I do not have a copy of the list
  • just to be clear….I do not have a copy of the list

Microsoft investigated and have frozen the affected accounts on their systems. If you find yourself locked out of your account, fill out their recovery form to regain control.

Aside from that, if you’ve ever entered your email login details into anything but your provider’s web page, then I recommend you change your password. It’s likely that the leaked list came from a much larger set – just seeing the published list isn’t enough to be sure your details have not been compromised.

So, even if you are just a bit concerned, just change your password. Go on. I’ll wait.

All done? Now read on for the gory details….

Sometime prior to October 3rd 2009…

…some unknown bad guys start collecting email addresses and passwords.

We can be pretty sure that they didn’t “hack into” Microsoft or any other major email provider to obtain the passwords. These companies should not actually store your password; they just store a fingerprint of it (what developers call a cryptographic hash).

To extend this analogy to the real world: if you emailed me your fingerprint, I couldn’t tell what you looked like, i.e. I could not reconstruct you just from that fingerprint. However, I could verify your identity if I met you by taking your fingerprint and comparing to the one I had stored.

So, when you log in and send your password, they take the fingerprint of what you entered, and compare with the fingerprint stored in their database.
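The fingerprint idea can be sketched in a few lines of Python. This is only an illustration: the function names `make_fingerprint` and `check_password`, the random salt, the iteration count and the choice of PBKDF2 are all assumptions, not a description of what Microsoft or any other provider actually runs.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed work factor, chosen for illustration only


def make_fingerprint(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, fingerprint). Only these two values would be stored."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for a new password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(attempt: str, salt: bytes, stored: bytes) -> bool:
    """Fingerprint the login attempt and compare it with the stored fingerprint."""
    _, digest = make_fingerprint(attempt, salt)
    # compare_digest avoids leaking where the two values differ via timing
    return hmac.compare_digest(digest, stored)
```

The stored pair lets the service confirm a correct password, but the password itself cannot be read back out of it.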

So if they didn’t hack into a provider, where did they get them?

The most likely, and perhaps surprising, answer is that they simply asked the users for them. For example, they could create an authentic, safe-looking site which promises to tell you who has blocked you on MSN Chat – all you need to do is enter your MSN account details.

Some researchers have also suggested the details were harvested by infecting PCs with keylogging software.

Oct 3rd, 04:00 UTC – Bad guys post 10,000 passwords on Pastebin.com

For reasons unknown, our miscreants post a set of hotmail addresses and passwords on the pastebin.com website.

A sharp-eyed user spotted the posting, or found it via a Google search, and it reached the attention of a tech news blog called Neowin.

Oct 3rd, 16:45 UTC – post is flagged as abuse

If users spot a post which appears not to belong on pastebin, they can flag it for attention. I check these flagged posts daily, and it’s a very rapid and streamlined process:

The software presents me the first 10 lines of the post, together with a link I can click if I think the post should be deleted. Generally it’s pretty easy to determine if something doesn’t belong, and a list of email addresses and passwords is obviously not going to make the cut.

So, someone spotted the post and flagged it. The next morning, Oct 4th, at 07:29 I saw the first 10 lines, and deleted the post in a heartbeat before realising the true scale of the list which subsequently caught media attention.

Oct 5th – Blog posts gather momentum

After Neowin posted their article on October 5th, interest in the story steadily grew.

Oct 6th – Mainstream media catches the story

I was up early on that day to check on the traffic and see if any special action would be needed. Having read the growing number of news articles, I took the following actions:

  • Added additional rules to the content filters on pastebin.com to ensure hotmail addresses could not be posted
  • Began searching all existing posts to ensure no further copies remained

Traffic levels were so high that the search was running at a crawl, so I closed the site so the cleanup would complete, and left for my office.

I reopened the site late afternoon UK time, and continued to monitor the traffic to ensure it remained as usable as possible.

OK, so why didn’t you keep a copy?

Let me abuse Pulp Fiction for a moment:

  • Jimmie: “Now let me ask you a question, Jules. When you drove in here, did you notice a sign out in front that said, “Email password storage”?”
  • Jules: “Jimmie…”
  • Jimmie: “Answer the question! Did you see a sign out in front of my house that said “Email password storage”?”
  • Jules: “Naw man, I didn’t.”
  • Jimmie: “You know why you didn’t see that sign?”
  • Jules: “Why?”
  • Jimmie: “‘Cause storin’ email passwords ain’t my fuckin’ business!”

Now, if it happens again, I may act differently. Security professionals at some large companies have expressed interest in helping their users if such a list could be made available to them. I’m more interested in enhancing the content filters on pastebin to ensure that text that looks like a list of email addresses is simply rejected.
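As a rough illustration of what such a filter might look like, here is a minimal Python sketch. It is not the actual pastebin filter: the pattern, the function name `looks_like_credential_dump` and both thresholds are hypothetical and would need tuning against real posts.

```python
import re

# Hypothetical pattern: an email address, one separator character,
# then a single non-empty token (the presumed password).
CREDENTIAL_LINE = re.compile(
    r"^\s*[\w.+-]+@[\w-]+(?:\.[\w-]+)+\s*[:;,|\t ]\s*\S+\s*$"
)


def looks_like_credential_dump(text: str, min_lines: int = 5, min_ratio: float = 0.5) -> bool:
    """Return True when enough non-blank lines look like 'address:password'."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return False
    hits = sum(1 for line in lines if CREDENTIAL_LINE.match(line))
    # Require both an absolute count and a proportion, so that one address
    # in an otherwise ordinary code paste does not trigger a rejection.
    return hits >= min_lines and hits / len(lines) >= min_ratio
```

Requiring both a minimum count and a minimum ratio is a design choice aimed at keeping false positives down; a stricter or looser rule is equally plausible.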

Even if your email address wasn’t on the list, if you think you’re the kind of person who is prone to phishing scams, just change your password. If you didn’t understand that last sentence, just change your password.

The full list was likely much larger than the part that was published, since it seems it was alphabetically ordered and only got as far as ‘b’. Having possession of the published list will not help you determine whether your address has been compromised.

Can I ask a question?

Sure! As long as it’s not “is my address on the list?”

Pastebin.com and the Hotmail password leak

It seems that a list of 10,000 Hotmail usernames and passwords has been posted on pastebin.com in recent days.

Pastebin was created as a tool to aid software development, not to distribute this sort of material.

The interest this story is generating has brought huge levels of activity to pastebin.com, so I took it offline to ensure all the offending material has been removed, and have adjusted the abuse filters to prevent a recurrence.

Edit: please don’t ask if your name was on the list. I have no way of knowing. Just change your password.

Edit #2: things have calmed down now, and I’ve written a longer post about the incident here.

Pastebin, the Ti-89 signing keys, and the DMCA

I’ve had a DMCA takedown request sent in relation to a pastebin post containing the signing keys for a range of Texas Instruments calculators which, if I understand correctly, allow you to digitally sign a replacement operating system so that the hardware will accept it.

If you buy a piece of hardware, I firmly believe you should be able to do whatever you like with it, and people installing their own operating systems and *improving the damn product* is something TI should be happy about.

There’s a blog over at http://brandonw.net/ which is enthusiastic about this sort of thing, and you can read wide and varied discussion about the issue on Slashdot too. (Edit: on 23rd Sep The Register weighed in with this article)

So, here is the DMCA takedown request Texas Instruments sent to me:

September 17, 2009
To Whom It May Concern:
Re: Illegal Offering of Material to Circumvent TI Copyright Protections
VIA: report abuse at pasetebin.com

It has come to our attention that the web site http://pastebin.com/f23af06b7, contains material and/or links to material that violate the anti-circumvention provisions of the Digital Millennium Copyright Act (“DMCA”). This letter is to notify you, in accordance with the provisions of the DMCA, of these unlawful activities. Pursuant to the safe harbor provisions of the DMCA, we request that you remove any whole or partial reproductions of and/or disable links to the following:

The post located on http://pastebin.com/f23af06b7

Texas Instruments Incorporated (“TI”) owns the copyright in the TI-83 Plus, TI84 Plus and TI-89 operating system software. The TI-83 Plus, TI-84 Plus and TI-89 operating systems use encryption to effectively control access to the operating system code and to protect its rights as a copyright owner in that code. Any unauthorized use of these files is strictly prohibited.

http://pastebin.com/f23af06b7 is distributing or providing links to information that bypasses TI’s anti-circumvention technology. By providing copies of or offering links to such information, http://pastebin.com/f23af06b7 has violated the anti-circumvention provisions of the DMCA at 17 U.S.C. §§ 1201(a)(2) and 1201(b)(1).

Please confirm to the undersigned in writing no later than noon on September 18, 2009 that you have complied with these demands. You may reach the undersigned by telephone at (xxx) xxx-xxxx or by email at xxxxxx@ti.com. TI reserves all further rights and remedies with respect to this matter.
I hereby confirm that I have a good faith belief that use of the Illegal Material in the manner complained of in this letter is not authorized by the copyright owner, its agent, or the law, that the information in this letter is accurate, and that, under penalty of perjury, I am authorized to act on behalf of TI, the owner of the exclusive rights in the TI-83 Plus, TI-84 Plus and TI-89 operating system software that are allegedly misappropriated using unlawful methods.
Texas Instruments Incorporated

XXX XXXXXX
Manager, Business Services
Education Technology Group

I live in the UK, and pastebin.com is hosted in the UK, so hitting me with a DMCA takedown request is rather pointless. However, I do remove copyrighted content on request, so much as it pains me to do so, I’ve deleted that post for now.

It’s no biggie, if you want the keys, just check wikileaks or do a Google search for 82EF4009ED7CAC2A5EE12B5F8E8AD9A0. That’s just a long hexadecimal number. Pretty sure I’m free to express that number in any form I like.

Can you say “Streisand Effect”?

Edit: Interesting post here on dealing with these TI DMCA notices. Personally, I’m not interested in fighting to keep the post on pastebin.com as it is widely available elsewhere. I have a copy of the keys should I ever wish to actively distribute them though…

Edit#2, Oct 14th 2009: The Electronic Frontier Foundation have written the following about this issue: EFF Warns Texas Instruments to Stop Harassing Calculator Hobbyists.

pastebin.org considered harmful

I run pastebin.com, and maintain it daily. I check for abuses of the service, block IP addresses of serial offenders and try to ensure it provides a speedy and useful service.

I make the software available for others to use and improve upon too.

pastebin.org is one such site, but I’m starting to get emails from people who’ve used that site and are now infected with the Win32/Alureon trojan. In addition, the site seems to have been compromised in other ways, with extra advertising banners and popups.

I’m not responsible for that site. I’ve tried to make contact with the registrant listed in whois records, but not had a response.

The moral of the story: if you want to stay safe, stick with pastebin.com!

Pastebin post filtering

As there have been some cases of cracked email address lists being posted on pastebin recently, this week I tweaked the spam filtering to block such posts. A few legitimate posts got caught in the crossfire, which led to a few more tweaks to the rules.

If you’re having trouble posting something because pastebin says it looks like spam, post a sample in a comment below and I’ll see what I can do to improve it!

Pastebin fights the spam!

A few people have emailed me recently disappointed by the level of spam postings on pastebin.com. I’ve never really understood why spammers bother, but as they are bothering in increasing numbers it was time to take some action.

Last night I built in some spam filtering which has caught hundreds of posts since going live. I also added a “report spam” link which has flagged over 500 posts in the past 20 hours. By iteratively tweaking the spam filter to identify the legitimately flagged posts, I’ve been able to quickly delete a lot of older spam posts.

Hopefully this will make pastebin look like a well tended garden rather than a run-down wasteland! Comments welcome…

Pastebin Reloaded!

Well, I promised it waaaay back in January, but I’ve finally released an update to pastebin.com. A few people have asked for the source over the past few months and have seen some of the updates already, but here’s what’s new…

  • MySQL storage replaced with file-based storage, making it much faster
  • Revamped the colour scheme, which has been pretty much the same for 5 years
  • Added a ‘delete post’ feature
  • Switched to Affero GPL licence

If you’ve drifted away from pastebin due to its lethargic speed, now’s the time to come back! Give it a whirl and if you have any feedback, leave a comment on this post.

Here’s some more detail on the changes…

File based storage

Pastebin used MySQL for storage since it was first launched in 2002. It has steadily grown in popularity, but that popularity began to take its toll on performance in the past 12 months.

Pastebin started out just keeping the last 1000 posts, which kept things zippy. Then I added custom domains, which increased the number of posts being retained, but what really hurt it was adding a common request – permanent posts, which meant that over time, the database grew inexorably larger.

In January I began to wonder if I needed a relational database at all. After all, pastebin is really just a single table application, and there are only two main operations:

  • Fetch post x
  • Get last 10 posts on domain foo

So I refactored the code to allow the storage mechanism to be changed. The new file based mechanism assigns a random identifier to a new post, e.g. abcdefgh and stores it in a structured directory:

posts/<d|m|f>/ab/cd/ef/abcdefgh

The top level directory ‘d’, ‘m’, or ‘f’ is chosen based on the desired lifetime of the post (1 day, 1 month or forever). Garbage collection of the 1 day posts in the ‘d’ directory can thus be carried out by performing a find for files older than a day with something like this running from cron every day:

find /path/to/pastebin/posts/d -mtime +1 -exec rm \{\} \;
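To make the layout concrete, here is a minimal Python sketch of the same scheme. It is not the pastebin source: the function names, the ID alphabet, the 8-character ID length and the way `load_post` probes each lifetime directory are all assumptions made for illustration.

```python
import os
import secrets
import string
import time
from pathlib import Path

ALPHABET = string.ascii_lowercase + string.digits  # assumed ID alphabet
LIFETIMES = {"day": "d", "month": "m", "forever": "f"}


def post_path(root: Path, post_id: str, lifetime: str) -> Path:
    """Map an ID such as 'abcdefgh' to root/<d|m|f>/ab/cd/ef/abcdefgh."""
    return root / LIFETIMES[lifetime] / post_id[0:2] / post_id[2:4] / post_id[4:6] / post_id


def save_post(root: Path, text: str, lifetime: str = "day") -> str:
    """Store a post under a fresh random 8-character ID and return the ID."""
    while True:
        post_id = "".join(secrets.choice(ALPHABET) for _ in range(8))
        path = post_path(root, post_id, lifetime)
        path.parent.mkdir(parents=True, exist_ok=True)
        try:
            # Mode "x" fails if the file exists, so a colliding ID is never overwritten.
            with open(path, "x", encoding="utf-8") as handle:
                handle.write(text)
            return post_id
        except FileExistsError:
            continue  # extremely unlikely; pick another ID


def load_post(root: Path, post_id: str):
    """Fetch a post by ID, trying each lifetime directory in turn."""
    for lifetime in LIFETIMES:
        path = post_path(root, post_id, lifetime)
        if path.is_file():
            return path.read_text(encoding="utf-8")
    return None


def collect_garbage(root: Path, max_age_seconds: float = 86400) -> int:
    """Delete files under root/d older than max_age_seconds; return how many."""
    cutoff = time.time() - max_age_seconds
    removed = 0
    for dirpath, _dirs, files in os.walk(root / "d"):
        for name in files:
            full = os.path.join(dirpath, name)
            if os.path.getmtime(full) < cutoff:
                os.remove(full)
                removed += 1
    return removed
```

Splitting the ID into two-character directory levels keeps any single directory from accumulating an unmanageable number of entries.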

To maintain the MRU lists of recent posts, the code maintains a serialized array for each domain. Whenever a post is made, this serialized file is locked, updated and unlocked. This is the only time the code can find itself competing for a shared resource, and even then it’s on a per-domain basis, rather than for the entire application as with the MySQL storage.
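The lock, update, unlock cycle might look something like the following Python sketch. It assumes a Unix host (for `fcntl`), uses JSON in place of whatever serialization the real code uses, and the function name `record_recent_post` and the default of 10 entries are hypothetical.

```python
import fcntl  # Unix-only; an assumption about the host platform
import json
import os


def record_recent_post(list_path: str, post_id: str, keep: int = 10) -> list:
    """Put post_id at the front of a per-domain recent list, under an exclusive lock."""
    # O_CREAT creates the list on first use without truncating an existing one.
    fd = os.open(list_path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+", encoding="utf-8") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # blocks until other writers finish
        try:
            raw = handle.read()
            recent = json.loads(raw) if raw else []
            # Newest first, no duplicates, capped at `keep` entries.
            recent = [post_id] + [p for p in recent if p != post_id]
            recent = recent[:keep]
            handle.seek(0)
            handle.truncate()
            json.dump(recent, handle)
            handle.flush()
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
    return recent
```

Because each domain has its own list file, two posts only contend for the lock when they land on the same domain.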

As I write, this mechanism has been running for a few hours on the live site, and performance is much improved. At peak times it could take 15-20 seconds to make a post; it’s now much, much zippier!

Revamped Colour Scheme

I thought the old CSS was looking a little tired so I’ve freshened it up a little. I want to avoid adding graphics to the design and just use pure HTML and CSS if possible, which keeps things speedy too.

Comments on it are welcome, it’s likely I’ll tinker with it some more…

Delete Post

This is quite neat I think – if you choose to hit the “remember me” button, you’ll be assigned a random token which is used to mark your posts. This token is stored in a cookie. When you later view a post, if your cookie token and the post token match, you’ll be offered the opportunity of deleting the post.

I like this as you don’t have to go entering a password or setting up an account – it just works.
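The token check is simple enough to sketch in Python. The function names and the token length here are assumptions for illustration, not the real implementation.

```python
import hmac
import secrets


def new_owner_token() -> str:
    """Random token issued with 'remember me'; the browser keeps it in a cookie."""
    return secrets.token_hex(16)  # 32 hex characters, an assumed length


def may_delete(cookie_token, post_token) -> bool:
    """Offer deletion only when both tokens are present and identical."""
    if not cookie_token or not post_token:
        return False  # anonymous visitor, or a post made without a token
    # Compare as bytes in constant time; cookie values are untrusted input.
    return hmac.compare_digest(cookie_token.encode("utf-8"), post_token.encode("utf-8"))
```

Anyone holding the cookie can delete the post, so the token is effectively a password the user never has to type.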

As always, if you’ve made a post you want removing and this feature doesn’t do it for you, just ask and I’ll take care of it.

Changed to Affero GPL

The last few releases of pastebin used the GPL licence. Trouble is, while the GPL guarantees access to the source if you receive a binary copy of the software, with a website that doesn’t happen. The Affero GPL is a modified version of the GPL which contains an extra clause guaranteeing your access to the source when you interact with the software over a network.

So if you use pastebin in your own site, or adapt it further, you must continue to offer that source to your users. Lovely!

What’s next?

Well, now that pastebin is actually usable again, I’m on a roll. The code has partially complete support for translation, and I’ve an army of volunteers ready to translate, so that’s the next goal…