<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>LordElph's Ramblings &#187; Pastebin</title>
	<atom:link href="http://blog.dixo.net/category/pastebin/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.dixo.net</link>
	<description>Stuff and nonsense about software development and whatever else I find fun...</description>
	<lastBuildDate>Thu, 11 Mar 2010 19:36:47 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Pastebin.com has a new owner!</title>
		<link>http://blog.dixo.net/2010/02/19/pastebin-com-has-a-new-owner/</link>
		<comments>http://blog.dixo.net/2010/02/19/pastebin-com-has-a-new-owner/#comments</comments>
		<pubDate>Fri, 19 Feb 2010 18:17:23 +0000</pubDate>
		<dc:creator>lordelph</dc:creator>
				<category><![CDATA[Pastebin]]></category>

		<guid isPermaLink="false">http://blog.dixo.net/?p=390</guid>
		<description><![CDATA[Congratulations to Jeroen, who is the new owner of pastebin.com. Many thanks to everyone who expressed an interest in taking it over.
The site is now running on vastly improved hardware, and I&#8217;m sure Jeroen is going to do a fantastic job in taking the idea forward.
You can track future news and updates by following @pastebincom [...]]]></description>
			<content:encoded><![CDATA[<p>Congratulations to Jeroen, who is the new owner of <a href="http://pastebin.com">pastebin.com</a>. Many thanks to everyone who expressed an interest in taking it over.</p>
<p>The site is now running on vastly improved hardware, and I&#8217;m sure Jeroen is going to do a fantastic job in taking the idea forward.</p>
<p>You can track future news and updates by following <a href="http://twitter.com/pastebincom">@pastebincom</a> on Twitter.</p>
<p>End of an era for me, but I wish Jeroen the very best of luck!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dixo.net/2010/02/19/pastebin-com-has-a-new-owner/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Want to buy pastebin.com?</title>
		<link>http://blog.dixo.net/2010/02/09/want-to-buy-pastebin-com/</link>
		<comments>http://blog.dixo.net/2010/02/09/want-to-buy-pastebin-com/#comments</comments>
		<pubDate>Tue, 09 Feb 2010 13:01:54 +0000</pubDate>
		<dc:creator>lordelph</dc:creator>
				<category><![CDATA[Pastebin]]></category>

		<guid isPermaLink="false">http://blog.dixo.net/?p=386</guid>
		<description><![CDATA[I have a need to shed various side projects to free up my time, so I&#8217;m looking for anyone who is interested in purchasing pastebin.com and developing it further.
I created the site way back in 2002, and it&#8217;s more popular now than ever with usage steadily growing. Now is a great time to hand [...]]]></description>
			<content:encoded><![CDATA[<p>I have a need to shed various side projects to free up my time, so I&#8217;m looking for anyone who is interested in purchasing pastebin.com and developing it further.</p>
<p>I created the site way back in 2002, and it&#8217;s more popular now than ever with usage steadily growing. Now is a great time to hand over to someone who can develop the idea further &#8211; something I&#8217;ve struggled to find the time to do.</p>
<p>Don&#8217;t delay though, as some good offers have already been made. Watch this space for more news.</p>
<p><strong>(Edit &#8211; <a href="http://blog.dixo.net/2010/02/19/pastebin-com-has-a-new-owner/">sold</a>!)</strong></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dixo.net/2010/02/09/want-to-buy-pastebin-com/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Pastebin.com and password lists</title>
		<link>http://blog.dixo.net/2009/10/07/pastebin-com-and-password-lists/</link>
		<comments>http://blog.dixo.net/2009/10/07/pastebin-com-and-password-lists/#comments</comments>
		<pubDate>Wed, 07 Oct 2009 23:56:47 +0000</pubDate>
		<dc:creator>lordelph</dc:creator>
				<category><![CDATA[Pastebin]]></category>

		<guid isPermaLink="false">http://blog.dixo.net/?p=340</guid>
		<description><![CDATA[In the past few days, pastebin.com has been cited in a wide variety of high-profile news sources regarding a &#8220;leak&#8221; of email account passwords. 
This brought a huge surge in visitors, and ensuring I kept the server functioning took up all my available spare time. I wrote a short blog entry which attracted a lot [...]]]></description>
			<content:encoded><![CDATA[<p>In the past few days, <a href="http://pastebin.com">pastebin.com</a> has been <a href="http://news.bbc.co.uk/1/hi/technology/8292299.stm">cited</a> in a <a href="http://www.guardian.co.uk/technology/2009/oct/06/hotmail-phishing">wide</a> <a href="http://news.zdnet.co.uk/security/0,1000000189,39790907,00.htm">variety</a> of <a href="http://money.cnn.com/news/newsfeeds/siliconalley/big-tech/gmail_yahoo_aol_accounts_exposed_2009_10.html">high-profile</a> <a href="http://www.telegraph.co.uk/technology/microsoft/6264539/Microsoft-Hotmail-leak-blamed-on-phishing-attack.html">news</a> <a href="http://gadgetwise.blogs.nytimes.com/2009/10/06/more-e-mail-account-details-leaked-online/?em">sources</a> regarding a &#8220;leak&#8221; of email account passwords. </p>
<p>This brought a huge surge in visitors, and ensuring I kept the server functioning took up all my available spare time. I wrote a <a href="http://blog.dixo.net/2009/10/06/pastebin-com-and-the-hotmail-password-leak/">short blog entry</a> which attracted a lot of comments. Things are a little calmer now, so I&#8217;m writing this longer post to explain what happened.</p>
<h2>This looks like a long post, just tell me my email account wasn&#8217;t compromised&#8230;</h2>
<ul>
<li>I do not have a copy of the list</li>
<li>and&#8230;.I do not have a copy of the list</li>
<li>just to be clear&#8230;.<strong>I do not have a copy of the list</strong></li>
</ul>
<p><a href="http://windowslivewire.spaces.live.com/blog/cns!2F7EB29B42641D59!41528.entry?wa=wsignin1.0&#038;sa=363915619">Microsoft investigated</a> and have frozen the affected accounts on their systems. If you find yourself locked out of your account, fill out their <a href="https://support.live.com/eform.aspx?productKey=wlidvalidation&#038;ct=eformcs&#038;scrx=1">recovery form</a> to regain control.</p>
<p>Aside from that, if you&#8217;ve ever entered your email login details into anything but your provider&#8217;s web page, then I recommend you change your password. <em>It&#8217;s likely that the leaked list came from a much larger set &#8211; not seeing your address in the published list isn&#8217;t enough to be sure your details have not been compromised.</em></p>
<p>So, even if you are just a bit concerned, just change your password. Go on. I&#8217;ll wait.</p>
<p>All done? Now read on for the gory details&#8230;.</p>
<h2>Sometime prior to October 3rd 2009&#8230;</h2>
<p>&#8230;some unknown bad guys start collecting email addresses and passwords.</p>
<p>We can be pretty sure that they didn&#8217;t &#8220;hack into&#8221; Microsoft or any other major email provider to obtain the passwords. These companies should not actually store your password, they just store a fingerprint of it (what developers call a <a href="http://en.wikipedia.org/wiki/Cryptographic_hash_function">cryptographic hash</a>).</p>
<p>To extend this analogy to the real world: if you emailed me your fingerprint, I couldn&#8217;t tell what you looked like, i.e. I could not reconstruct <em>you</em> just from that fingerprint. However, I could verify your identity if I met you by taking your fingerprint and comparing it to the one I had stored.</p>
<p>So, when you log in and send your password, they take the fingerprint of what you entered, and compare with the fingerprint stored in their database.</p>
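<p>The fingerprint check can be sketched in a few lines of Python. This is only an illustration: nobody outside these companies knows exactly which scheme they use, so the choice of salted PBKDF2-HMAC-SHA256 and the function names <code>make_fingerprint</code> and <code>check_password</code> are my own assumptions.</p>

```python
import hashlib
import hmac
import os


def make_fingerprint(password, salt=None):
    """Return (salt, digest) for a password.

    Assumption: salted PBKDF2-HMAC-SHA256 stands in for whatever
    scheme a real provider uses.
    """
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for a new password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest


def check_password(attempt, salt, stored_digest):
    """Fingerprint the login attempt and compare it with the stored one."""
    _, digest = make_fingerprint(attempt, salt)
    # Constant-time comparison, so timing does not leak how close a guess was.
    return hmac.compare_digest(digest, stored_digest)
```

<p>Only the salt and digest would be stored. The password itself is never written down, which is why breaking into the database would not directly yield a list like the one that was posted.</p>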
<h2>So if they didn&#8217;t hack into a provider, where did they get them?</h2>
<p>The most likely, and perhaps surprising, answer is that they simply asked the users for them. For example, they could create an authentic, safe looking site which promises to tell you who has blocked you on MSN Chat &#8211; all you need to do is enter your MSN account details.</p>
<p><a href="http://www.computerworld.com/s/article/9139098/Researcher_refutes_Microsoft_s_account_of_hijacked_Hotmail_passwords?taxonomyId=82">Some researchers</a> have also suggested the details were harvested by infecting PCs with keylogging software.</p>
<h2>Oct 3rd, 04:00 UTC &#8211; Bad guys post 10,000 passwords on Pastebin.com</h2>
<p>For reasons unknown, our miscreants post a set of Hotmail addresses and passwords on the pastebin.com website.</p>
<p>A sharp-eyed user spotted the posting, or found it via a Google search, and it reached the attention of a tech news blog called Neowin.</p>
<h2>Oct 3rd, 16:45 UTC &#8211; post is flagged as abuse</h2>
<p>If users spot a post which appears not to belong on pastebin, they can flag it for attention. I check these flagged posts daily, and it&#8217;s a very rapid and streamlined process:</p>
<p>The software presents me the first 10 lines of the post, together with a link I can click if I think the post should be deleted. Generally it&#8217;s pretty easy to determine if something doesn&#8217;t belong, and a list of email addresses and passwords is obviously not going to make the cut.</p>
<p>So, someone spotted the post and flagged it. The next morning, Oct 4th, at 07:29 I saw the first 10 lines, and deleted the post in a heartbeat before realising the true scale of the list which subsequently caught media attention.</p>
<h2>Oct 5th &#8211; Blog posts gather momentum</h2>
<p>After <a href="http://www.neowin.net/news/main/09/10/05/thousands-of-hotmail-passwords-leaked-online">Neowin</a> posted their article on October 5th, interest in the story steadily grew.</p>
<h2>Oct 6th &#8211; Mainstream media catches the story</h2>
<p>I was up early on that day to check on the traffic and see if any special action would be needed. Having read the growing number of news articles, I took the following actions:</p>
<ul>
<li>Added additional rules to the content filters on pastebin.com to ensure Hotmail addresses could not be posted</li>
<li>Began searching all existing posts to ensure no further copies remained</li>
</ul>
<p>Traffic levels were so high that the search was running at a crawl, so I closed the site so the cleanup would complete, and left for my office.</p>
<p>I reopened the site in the late afternoon UK time, and continued to monitor the traffic to ensure the site remained as usable as possible.</p>
<h2>OK, so why didn&#8217;t you keep a copy?</h2>
<p>Let me abuse <a href="http://www.imdb.com/title/tt0110912/quotes">Pulp Fiction</a> for a moment:</p>
<ul>
<li>Jimmie: &#8220;Now let me ask you a question, Jules. When you drove in here, did you notice a sign out in front that said, &#8220;Email password storage&#8221;?&#8221;</li>
<li>Jules: &#8220;Jimmie&#8230;&#8221;</li>
<li>Jimmie: &#8220;Answer the question! Did you see a sign out in front of my house that said &#8220;Email password storage&#8221;?&#8221;</li>
<li>Jules: &#8220;Naw man, I didn&#8217;t.&#8221;</li>
<li>Jimmie: &#8220;You know why you didn&#8217;t see that sign?&#8221;</li>
<li>Jules: &#8220;Why?&#8221;</li>
<li>Jimmie: &#8220;&#8216;Cause storin&#8217; email passwords ain&#8217;t my fuckin&#8217; business!&#8221;</li>
</ul>
<p>Now, if it happens again, I may act differently. Security professionals at some large companies have expressed interest in helping their users if such a list could be made available to them. I&#8217;m more interested in enhancing the content filters on pastebin to ensure that text that looks like a list of email addresses is simply rejected.</p>
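<p>To give an idea of what such a filter might look like, here is a rough Python sketch. It is not the actual pastebin filter: the regular expression, the function name <code>looks_like_credential_list</code> and the thresholds are all assumptions chosen for illustration.</p>

```python
import re

# Hypothetical pattern: a line that is just an email address, optionally
# followed by a separator (colon, semicolon, comma, pipe or whitespace)
# and one more token such as a password.
EMAIL_LINE = re.compile(r"^\s*[\w.+-]+@[\w-]+(\.[\w-]+)+(\s*[:;,|\s]\s*\S+)?\s*$")


def looks_like_credential_list(text, min_lines=5, threshold=0.8):
    """Return True if most non-blank lines look like 'address[:password]'."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if min_lines > len(lines):
        return False  # too short to judge; avoids rejecting a single address
    hits = sum(1 for ln in lines if EMAIL_LINE.match(ln))
    return hits / len(lines) >= threshold
```

<p>Requiring that most lines match, rather than any line, is what should keep legitimate code that merely mentions an email address from being rejected.</p>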
<p>Even if your email address wasn&#8217;t on the list, if you think you&#8217;re the kind of person who is prone to phishing scams, just change your password. If you didn&#8217;t understand that last sentence, just change your password. </p>
<p>The full list was likely much larger than what was published, since it seems it was alphabetically ordered and only got as far as &#8216;b&#8217;. <strong>Having possession of the published list will not help you determine whether your address has been compromised.</strong></p>
<h2>More links</h2>
<ul>
<li>The Register wrote an <a title="Article on The Register questioning how big a deal this was" href="http://www.theregister.co.uk/2009/10/08/webmail_phish/">article suggesting that this really wasn&#8217;t &#8220;news&#8221;</a> &#8211; I quite agree.</li>
<li><a title="Wired article about the analysis" href="http://www.wired.com/threatlevel/2009/10/10000-passwords/">Researchers have analysed the list</a> and found <a title="Research findings from analysing the passwords" href="http://www.acunetix.com/blog/websecuritynews/statistics-from-10000-leaked-hotmail-passwords/">weak passwords are common</a> (no surprises there) but also that a lot of Spanish names are in the top 20 passwords, suggesting that the credentials were captured from a Spanish-speaking community. </li>
<li><a href="https://www.google.com/support/accounts/bin/request.py?ara=1&#038;hl=en&#038;contact_type=ara&#038;ctx=ara">How to regain control of a Gmail account</a></li>
</ul>
<h2>Can I ask a question?</h2>
<p>Sure! As long as it&#8217;s not &#8220;is my address on the list?&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dixo.net/2009/10/07/pastebin-com-and-password-lists/feed/</wfw:commentRss>
		<slash:comments>75</slash:comments>
		</item>
		<item>
		<title>Pastebin.com and the Hotmail password leak</title>
		<link>http://blog.dixo.net/2009/10/06/pastebin-com-and-the-hotmail-password-leak/</link>
		<comments>http://blog.dixo.net/2009/10/06/pastebin-com-and-the-hotmail-password-leak/#comments</comments>
		<pubDate>Tue, 06 Oct 2009 08:40:11 +0000</pubDate>
		<dc:creator>lordelph</dc:creator>
				<category><![CDATA[Pastebin]]></category>

		<guid isPermaLink="false">http://blog.dixo.net/?p=333</guid>
		<description><![CDATA[It seems that a list of 10,000 Hotmail usernames and passwords has been posted on pastebin.com in recent days. 
Pastebin was created as a tool to aid software development, not to distribute this sort of material. 
As a result of the interest this story is generating, pastebin.com is experiencing huge levels of activity &#8211; as [...]]]></description>
			<content:encoded><![CDATA[<p>It seems that a list of <a href="http://news.bbc.co.uk/1/hi/technology/8291268.stm">10,000 Hotmail usernames and passwords</a> has been posted on pastebin.com in recent days. </p>
<p>Pastebin was created as a tool to aid software development, not to distribute this sort of material. </p>
<p>The interest this story is generating means pastebin.com is experiencing huge levels of activity &#8211; as a result I took it offline to ensure all the offending material has been removed, and have adjusted the abuse filters to prevent a recurrence.</p>
<p><strong>Edit</strong>: please don&#8217;t ask if your name was on the list. I have no way of knowing. Just change your password.</p>
<p><strong>Edit #2</strong>: things have calmed down now, and I&#8217;ve written a <a href="http://blog.dixo.net/2009/10/07/pastebin-com-and-password-lists/">longer post about the incident here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dixo.net/2009/10/06/pastebin-com-and-the-hotmail-password-leak/feed/</wfw:commentRss>
		<slash:comments>94</slash:comments>
		</item>
		<item>
		<title>Pastebin, the Ti-89 signing keys, and the DMCA</title>
		<link>http://blog.dixo.net/2009/09/18/pastebin-the-ti-89-signing-keys-and-the-dmca/</link>
		<comments>http://blog.dixo.net/2009/09/18/pastebin-the-ti-89-signing-keys-and-the-dmca/#comments</comments>
		<pubDate>Fri, 18 Sep 2009 11:27:29 +0000</pubDate>
		<dc:creator>lordelph</dc:creator>
				<category><![CDATA[Pastebin]]></category>

		<guid isPermaLink="false">http://blog.dixo.net/?p=320</guid>
		<description><![CDATA[I&#8217;ve had a DMCA takedown request sent in relation to a pastebin post containing the signing keys for a range of Texas Instruments calculators which, if I understand correctly, allow you to digitally sign a replacement operating system so that the hardware will accept it.
If you buy a piece of hardware, I firmly believe you [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve had a DMCA takedown request sent in relation to a pastebin post containing the signing keys for a range of Texas Instruments calculators which, if I understand correctly, allow you to digitally sign a replacement operating system so that the hardware will accept it.</p>
<p>If you buy a piece of hardware, I firmly believe you should be able to do whatever you like with it, and people installing their own operating systems and *improving the damn product* is something TI should be happy about.</p>
<p>There&#8217;s a blog over at <a href="http://brandonw.net/">http://brandonw.net/</a> which is enthusiastic about this sort of thing, and you can read <a href="http://yro.slashdot.org/story/09/09/21/1418256/TI-vs-Calculator-Hackers">wide and varied discussion about the issue on Slashdot</a> too. (Edit: on 23rd Sep <a href="http://www.theregister.co.uk/2009/09/23/texas_instruments_calculator_hacking/">The Register weighed in with this article</a>)</p>
<p>So, here is the DMCA takedown request Texas Instruments sent to me:</p>
<blockquote><p>September 17, 2009<br />
To Whom It May Concern:<br />
Re:     Illegal Offering of Material to Circumvent TI Copyright Protections<br />
VIA: report abuse at pasetebin.com</p>
<p>It has come to our attention that the web site http://pastebin.com/f23af06b7, contains material and/or links to material that violate the anti-circumvention provisions of the Digital Millennium Copyright Act (“DMCA”).  This letter is to notify you, in accordance with the provisions of the DMCA, of these unlawful activities. Pursuant to the safe harbor provisions of the DMCA, we request that you remove any whole or partial reproductions of and/or disable links to the following:</p>
<p>The post located on http://pastebin.com/f23af06b7</p>
<p>Texas Instruments Incorporated (“TI”) owns the copyright in the TI-83 Plus, TI84 Plus and TI-89 operating system software.  The TI-83 Plus, TI-84 Plus and TI-89 operating systems use encryption to effectively control access to the operating system code and to protect its rights as a copyright owner in that code. Any unauthorized use of these files is strictly prohibited.</p>
<p>http://pastebin.com/f23af06b7 is distributing or providing links to information that bypasses TI’s anti-circumvention technology.  By providing copies of or offering links to such information, http://pastebin.com/f23af06b7 has violated the anti-circumvention provisions of the DMCA at 17 U.S.C. §§ 1201(a)(2) and 1201(b)(1).</p>
<p>Please confirm to the undersigned in writing no later than noon on September 18, 2009 that you have complied with these demands. You may reach the undersigned by telephone at (xxx) xxx-xxxx or by email at xxxxxx@ti.com. TI reserves all further rights and remedies with respect to this matter.<br />
I hereby confirm that I have a good faith belief that use of the Illegal Material in the manner complained of in this letter is not authorized by the copyright owner, its agent, or the law, that the information in this letter is accurate, and that, under penalty of perjury, I am authorized to act on behalf of TI, the owner of the exclusive rights in the TI-83 Plus, TI-84 Plus and TI-89 operating system software that are allegedly misappropriated using unlawful methods.<br />
Texas Instruments Incorporated</p>
<p>XXX XXXXXX<br />
Manager, Business Services<br />
Education Technology Group</p></blockquote>
<p>I live in the UK, and pastebin.com is hosted in the UK, so hitting me with a DMCA takedown request is rather pointless. However, I do remove copyrighted content on request, so much as it pains me to do so, I&#8217;ve deleted that post for now. </p>
<p>It&#8217;s no biggie, if you want the keys, just check <a href="http://wikileaks.org/leak/ti-os-keys-dmca-2009.txt">wikileaks</a> or do a <a href="http://www.google.com/search?q=82EF4009ED7CAC2A5EE12B5F8E8AD9A0">Google search for 82EF4009ED7CAC2A5EE12B5F8E8AD9A0</a>. That&#8217;s just a long hexadecimal number. Pretty sure I&#8217;m free to express that number in any form I like. </p>
<p>Can you say &#8220;Streisand Effect&#8221;?</p>
<p><strong>Edit</strong>: <a href="http://erratasec.blogspot.com/2009/08/so-use-dmca-counter-claim.html">Interesting post here</a> on dealing with these TI DMCA notices. Personally, I&#8217;m not interested in fighting to keep the post on pastebin.com as it is widely available elsewhere. I have a copy of the keys should I ever wish to actively distribute them though&#8230;</p>
<p><strong>Edit#2, Oct 14th 2009</strong>: The Electronic Frontier Foundation have written the following about this issue: <a href="http://www.eff.org/press/archives/2009/10/13">EFF Warns Texas Instruments to Stop Harassing Calculator Hobbyists</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dixo.net/2009/09/18/pastebin-the-ti-89-signing-keys-and-the-dmca/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>pastebin.org considered harmful</title>
		<link>http://blog.dixo.net/2009/09/15/pastebin-org-considered-harmful/</link>
		<comments>http://blog.dixo.net/2009/09/15/pastebin-org-considered-harmful/#comments</comments>
		<pubDate>Tue, 15 Sep 2009 18:12:01 +0000</pubDate>
		<dc:creator>lordelph</dc:creator>
				<category><![CDATA[Pastebin]]></category>

		<guid isPermaLink="false">http://blog.dixo.net/?p=318</guid>
		<description><![CDATA[I run pastebin.com, and maintain it daily. I check for abuses of the service, block IP addresses of serial offenders and try to ensure it provides a speedy and useful service.
I make the software available for others to use and improve upon too.
pastebin.org is one such site, but I&#8217;m starting to get emails from people [...]]]></description>
			<content:encoded><![CDATA[<p>I run <a href="http://pastebin.com">pastebin.com</a>, and maintain it daily. I check for abuses of the service, block IP addresses of serial offenders and try to ensure it provides a speedy and useful service.</p>
<p>I make the software available for others to use and improve upon too.</p>
<p>pastebin.org is one such site, but I&#8217;m starting to get emails from people who&#8217;ve used that site and are now infected with the Win32/Alureon trojan. In addition, the site seems to have been compromised in other ways, with extra advertising banners and popups.</p>
<p>I&#8217;m not responsible for that site. I&#8217;ve tried to make contact with the registrant listed in whois records, but not had a response.</p>
<p>The moral of the story: if you want to stay safe, stick with <a href="http://pastebin.com">pastebin.com</a>!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dixo.net/2009/09/15/pastebin-org-considered-harmful/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Pastebin post filtering</title>
		<link>http://blog.dixo.net/2008/03/16/pastebin-post-filtering/</link>
		<comments>http://blog.dixo.net/2008/03/16/pastebin-post-filtering/#comments</comments>
		<pubDate>Sun, 16 Mar 2008 07:56:42 +0000</pubDate>
		<dc:creator>lordelph</dc:creator>
				<category><![CDATA[Pastebin]]></category>

		<guid isPermaLink="false">http://blog.dixo.net/2008/03/16/pastebin-post-filtering/</guid>
		<description><![CDATA[As there have been some cases of cracked email address lists being posted on pastebin recently, this week I tweaked the spam filtering to block such posts. A few legitimate posts got caught in the crossfire, causing a few more tweaks to the rules. 
If you&#8217;re having trouble posting something because pastebin says it looks like [...]]]></description>
			<content:encoded><![CDATA[<p>As there have been some cases of cracked email address lists being posted on <a href="http://pastebin.com">pastebin</a> recently, this week I tweaked the spam filtering to block such posts. A few legitimate posts got caught in the crossfire, causing a few more tweaks to the rules. </p>
<p>If you&#8217;re having trouble posting something because pastebin says it looks like spam, post a sample in a comment below and I&#8217;ll see what I can do to improve it!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dixo.net/2008/03/16/pastebin-post-filtering/feed/</wfw:commentRss>
		<slash:comments>50</slash:comments>
		</item>
		<item>
		<title>Pastebin fights the spam!</title>
		<link>http://blog.dixo.net/2007/08/21/pastebin-fights-the-spam/</link>
		<comments>http://blog.dixo.net/2007/08/21/pastebin-fights-the-spam/#comments</comments>
		<pubDate>Tue, 21 Aug 2007 19:15:21 +0000</pubDate>
		<dc:creator>lordelph</dc:creator>
				<category><![CDATA[Pastebin]]></category>

		<guid isPermaLink="false">http://blog.dixo.net/2007/08/21/pastebin-fights-the-spam/</guid>
		<description><![CDATA[A few people have emailed me recently disappointed by the level of spam postings on pastebin.com. I&#8217;ve never really understood why spammers bother, but as they are bothering in increasing numbers it was time to take some action.
Last night I built in some spam filtering which has caught hundreds of posts since going live. I [...]]]></description>
			<content:encoded><![CDATA[<p>A few people have emailed me recently disappointed by the level of spam postings on <a href="http://pastebin.com">pastebin.com</a>. I&#8217;ve never really understood why spammers bother, but as they <em>are</em> bothering in increasing numbers it was time to take some action.</p>
<p>Last night I built in some spam filtering which has caught hundreds of posts since going live. I also added a &#8220;report spam&#8221; link which has flagged over 500 posts in the past 20 hours. By iteratively tweaking the spam filter to identify the legitimately flagged posts, I&#8217;ve been able to quickly delete a lot of older spam posts.</p>
<p>Hopefully this will make pastebin look like a well tended garden rather than a run-down wasteland! Comments welcome&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dixo.net/2007/08/21/pastebin-fights-the-spam/feed/</wfw:commentRss>
		<slash:comments>56</slash:comments>
		</item>
		<item>
		<title>Pastebin &#8211; Turbo Boost Success!</title>
		<link>http://blog.dixo.net/2007/07/17/pastebin-turbo-boost-success/</link>
		<comments>http://blog.dixo.net/2007/07/17/pastebin-turbo-boost-success/#comments</comments>
		<pubDate>Tue, 17 Jul 2007 08:19:06 +0000</pubDate>
		<dc:creator>lordelph</dc:creator>
				<category><![CDATA[Pastebin]]></category>

		<guid isPermaLink="false">http://blog.dixo.net/2007/07/17/pastebin-turbo-boost-success/</guid>
		<description><![CDATA[Just been checking the stats on pastebin.com and clearly the recent changes have worked well! Usage has trebled since last week and it&#8217;s still very responsive.
It&#8217;s nice to see that people still want to use it, so I&#8217;m going to ride this wave of enthusiasm and improve it further over the coming weeks. 
Your feedback, [...]]]></description>
			<content:encoded><![CDATA[<p>Just been checking the stats on <a href="http://pastebin.com">pastebin.com</a> and clearly the <a href="http://blog.dixo.net/2007/07/10/pastebin-reloaded/">recent changes</a> have worked well! Usage has trebled since last week and it&#8217;s still very responsive.</p>
<p>It&#8217;s nice to see that people still want to use it, so I&#8217;m going to ride this wave of enthusiasm and improve it further over the coming weeks. </p>
<p>Your feedback, as ever, is welcome!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dixo.net/2007/07/17/pastebin-turbo-boost-success/feed/</wfw:commentRss>
		<slash:comments>16</slash:comments>
		</item>
		<item>
		<title>Pastebin Reloaded!</title>
		<link>http://blog.dixo.net/2007/07/10/pastebin-reloaded/</link>
		<comments>http://blog.dixo.net/2007/07/10/pastebin-reloaded/#comments</comments>
		<pubDate>Tue, 10 Jul 2007 20:09:43 +0000</pubDate>
		<dc:creator>lordelph</dc:creator>
				<category><![CDATA[Pastebin]]></category>

		<guid isPermaLink="false">http://blog.dixo.net/2007/07/10/pastebin-reloaded/</guid>
		<description><![CDATA[Well, I promised it waaaay back in January, but I&#8217;ve finally released an update to pastebin.com. A few people have asked for the source over the past few months and have seen some of the updates already, but here&#8217;s what&#8217;s new&#8230;

MySQL storage replaced with file-based storage, making it much faster
Revamped the colour scheme, which has [...]]]></description>
			<content:encoded><![CDATA[<p>Well, I promised it <a href="http://blog.dixo.net/2007/01/18/pastebin-light-at-the-end-of-the-tunnel/">waaaay back in January</a>, but I&#8217;ve finally released an update to <a href="http://pastebin.com">pastebin.com</a>. A few people have asked for the source over the past few months and have seen some of the updates already, but here&#8217;s what&#8217;s new&#8230;</p>
<ul>
<li>MySQL storage replaced with file-based storage, making it much faster</li>
<li>Revamped the colour scheme, which has been pretty much the same for 5 years</li>
<li>Added a &#8216;delete post&#8217; feature</li>
<li>Switched to Affero GPL licence</li>
</ul>
<p>If you&#8217;ve drifted away from pastebin due to its lethargic speed, now&#8217;s the time to come back! <a href="http://pastebin.com">Give it a whirl</a> and if you have any feedback, leave a comment on this post.</p>
<p>Here&#8217;s some more detail on the changes&#8230;</p>
<h2>File based storage</h2>
<p>Pastebin used MySQL for storage since it was first launched in 2002. It has steadily grown in popularity, but that popularity began to take its toll on performance in the past 12 months.</p>
<p>Pastebin started out just keeping the last 1000 posts, which kept things zippy. Then I added custom domains, which increased the number of posts being retained, but what really hurt it was adding a common request &#8211; permanent posts, which meant that over time, the database grew inexorably larger.</p>
<p>In January I began to wonder if I needed a relational database at all. After all, pastebin is really just a single table application, and there are only two main operations:</p>
<ul>
<li>Fetch post <i>x</i></li>
<li>Get last 10 posts on domain <i>foo</i></li>
</ul>
<p>So I refactored the code to allow the storage mechanism to be changed. The new file based mechanism assigns a random identifier to a new post, e.g. <i>abcdefgh</i> and stores it in a structured directory:</p>
<pre>posts/&lt;d|m|f&gt;/ab/cd/ef/abcdefgh</pre>
<p>The top level directory &#8216;d&#8217;, &#8216;m&#8217;, or &#8216;f&#8217; is chosen based on the desired lifetime of the post (1 day, 1 month or forever). Garbage collection of the 1 day posts in the &#8216;d&#8217; directory can thus be carried out by performing a find for files older than a day with something like this running from cron every day:</p>
<pre>
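# -type f skips directories; note -mtime +1 only matches files at least two full days old (use +0 for anything over 24 hours)
find /path/to/pastebin/posts/d -type f -mtime +1 -exec rm {} \;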
</pre>
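<p>To make the directory layout concrete, here is a small Python sketch of how a post identifier could map to its path. Pastebin itself is not written in Python, so treat this purely as an illustration; the names <code>new_post_id</code> and <code>post_path</code>, and the choice of lowercase letters and digits for the identifier, are assumptions.</p>

```python
import os
import secrets
import string

# Top-level directory for each desired lifetime, as described above.
LIFETIME_DIRS = {"day": "d", "month": "m", "forever": "f"}


def new_post_id(length=8):
    """Generate a random identifier (the alphabet is an assumption)."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def post_path(root, lifetime, post_id):
    """Map e.g. 'abcdefgh' to root/d/ab/cd/ef/abcdefgh."""
    return os.path.join(
        root,
        LIFETIME_DIRS[lifetime],
        post_id[0:2],
        post_id[2:4],
        post_id[4:6],
        post_id,
    )
```

<p>Splitting on the first three pairs of characters keeps any single directory from accumulating a huge number of entries, which is the usual reason for a layout like this.</p>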
<p>To maintain the MRU lists of recent posts, the code maintains a serialized array for each domain. Whenever a post is made, this serialized file is locked, updated and unlocked. This is the only time the code can find itself competing for a shared resource, and even then it&#8217;s on a per-domain basis, rather than for the entire application as with the MySQL storage.</p>
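<p>The lock, update, unlock cycle looks roughly like this in Python. It is a sketch under assumptions rather than the real code: it uses JSON instead of the serialized array format, relies on POSIX <code>flock</code> (so it will not run on Windows), and <code>record_recent_post</code> is a made-up name.</p>

```python
import fcntl
import json
import os


def record_recent_post(mru_path, post_id, keep=10):
    """Add post_id to the front of a per-domain recent-posts file."""
    # Open for read/write, creating the file if needed, without truncating it.
    fd = os.open(mru_path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # wait until we hold the exclusive lock
        try:
            raw = fh.read()
            recent = json.loads(raw) if raw else []
            # Newest first, no duplicates, capped at `keep` entries.
            recent = [post_id] + [p for p in recent if p != post_id]
            recent = recent[:keep]
            fh.seek(0)
            fh.truncate()
            json.dump(recent, fh)
            fh.flush()
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)
    return recent
```

<p>Because each domain has its own file, two posts only ever wait on each other if they are made to the same domain at the same moment.</p>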
<p>As I write, this mechanism has been running for a few hours on the live site, and performance is much improved. At peak times it could take 15-20 seconds to make a post; it&#8217;s now much, much zippier!</p>
<h2>Revamped Colour Scheme</h2>
<p>I thought the old CSS was looking a little tired so I&#8217;ve freshened it up a little. I want to avoid adding graphics to the design and just use pure HTML and CSS if possible, which keeps things speedy too.</p>
<p>Comments on it are welcome, it&#8217;s likely I&#8217;ll tinker with it some more&#8230;</p>
<h2>Delete Post</h2>
<p>This is quite neat I think &#8211; if you choose to hit the &#8220;remember me&#8221; button, you&#8217;ll be assigned a random token which is used to mark your posts. This token is stored in a cookie. When you later view a post, if your cookie token and the post token match, you&#8217;ll be offered the opportunity of deleting the post.</p>
<p>I like this as you don&#8217;t have to go entering a password or setting up an account &#8211; it just works.</p>
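<p>In outline, the token check amounts to something like the following Python sketch. The function names and the 128-bit token size are assumptions for illustration, not a description of the real code.</p>

```python
import hmac
import secrets


def new_owner_token():
    """Random token handed out when 'remember me' is chosen.

    Assumption: 16 random bytes, hex encoded; it would be stored both
    in the visitor's cookie and alongside each post they make.
    """
    return secrets.token_hex(16)


def may_delete(cookie_token, post_token):
    """Offer deletion only when the cookie token matches the post's token."""
    if not cookie_token or not post_token:
        return False  # anonymous visitor, or a post made without a token
    return hmac.compare_digest(cookie_token, post_token)
```

<p>The token has to be long and random enough that it cannot be guessed, since anyone holding it can delete the posts it marks.</p>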
<p>As always, if you&#8217;ve made a post you want removing and this feature doesn&#8217;t do it for you, just ask and I&#8217;ll take care of it.</p>
<h2>Changed to Affero GPL</h2>
<p>The last few releases of pastebin used the GPL licence. Trouble is, while the GPL guarantees access to the source if you receive a binary copy of the software, with a website that doesn&#8217;t happen. The <a href="http://www.affero.org/oagpl.html">Affero GPL</a> is a modified version of the GPL which contains an extra clause guaranteeing your access to the source when you interact with the software over a network. </p>
<p>So if you use pastebin in your own site, or adapt it further, you must continue to offer that source to your users. Lovely.</p>
<h2>What&#8217;s next?</h2>
<p>Well, now that pastebin is actually <i>usable</i> again, I&#8217;m on a roll. The code has partially complete support for translation, and I&#8217;ve an <a href="http://blog.dixo.net/2006/05/10/translate-pastebin/">army of volunteers</a> ready to translate, so that&#8217;s the next goal&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dixo.net/2007/07/10/pastebin-reloaded/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
	</channel>
</rss>
