
Archive for the ‘code’ Category

Google App Engine Quotas on > 430k Requests/Day

with 4 comments

[UPDATE: Posted this yesterday for >140k requests/day, updating now with >430k requests..]

TheRealURL had a particularly busy Thanksgiving Day/Black Friday for some reason. Just for curiosity’s sake, here’s the quota usage for a GAE application under a relatively high load. The quotas that aren’t pictured are all at 0%. Billing is now turned on (for the Incoming Bandwidth) but capped at $1/day, and so far it hasn’t been used.

Bear in mind this is a pretty minimal app, of course, but still – not too shabby.

Written by Nir

November 27, 2009 at 12:47 pm

UnShorten WordPress Plugin Uses TheRealURL

with 2 comments

I recently found that Jon Rogers, a developer from the UK, released the UnShorten WordPress plugin, which uses TheRealURL to unshorten links displayed by the Twitter Tools plugin.

That’s pretty cool, TheRealURL was designed as a web service with exactly this type of use in mind.

Between this and other users, TheRealURL now serves over 40,000 requests a day. It’s nice to see Google App Engine handling that while barely using any of the various daily quotas (except for incoming bandwidth… need to check on that one):

TheRealURL GAE Quota Usage

Written by Nir

November 10, 2009 at 3:09 pm

Posted in The Real URL, code

TheRealURL Adds Page Titles

without comments

I needed this for a project I’m working on, so I added a new feature to TheRealURL unshortening service: JSON/P requests now return the page title (scraped from the HTML <title> tag) as well as its original URL.

For example, http://therealurl.appspot.com/?format=json&url=bit.ly/a returns:

{ "url" : "http://www.apple.com/", "title" : "Apple" }

The plain text format remains as is – nothing but the unshortened URL – so I don’t think there should be any issues for existing API users. Response times don’t appear to be affected much either. If you do get any issues, please let me know in the comments or at niryariv@gmail.com.
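For readers curious what "scraped from the HTML <title> tag" might involve, here is a minimal sketch using Python's standard library. It is not TheRealURL's actual code: the names TitleParser and build_response are made up for illustration, and the sketch assumes the URL has already been resolved and its HTML fetched.

```python
import json
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> elements

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        # Collapse runs of whitespace and trim the ends.
        return " ".join("".join(self._parts).split())


def build_response(real_url, html):
    """Return a JSON body shaped like the example above."""
    parser = TitleParser()
    parser.feed(html)
    return json.dumps({"url": real_url, "title": parser.title})
```

A page without a <title> simply yields an empty title string, so the shape of the response stays the same.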

Written by Nir

July 17, 2009 at 1:29 pm

Posted in The Real URL, code, projects

The Long Poll: AJAX Push(like) Chat with Comet

with 4 comments

Recently I’ve been working on an AJAX based chat application (in development..). The obvious way to do it is to send an XMLHttpRequest every few seconds to check for new messages. Unless it’s a particularly animated conversation, most requests won’t return any new content, so I added a simple Conditional-GET-like system based on the chat’s text size. Here’s the client side implementation:

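function refresh_chat() {
	$.ajax({
		url: "/chat",
		// {{chat_id}} is a template placeholder, substituted before the page is served
		data: "format=xhr&chat_id={{chat_id}}&cur_len=" + chat_content.length,
		complete: function(xhr) {
			// 200 carries new content; 304 means nothing changed
			if (xhr.status == 200) render_chat(xhr.responseText);
			// pass the function itself rather than a string to be eval'd
			setTimeout(refresh_chat, 5000);
		}
	});
}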

And the server code that handles it:

cur_len = self.request.get("cur_len", 0)
if len(chat.content) == int(cur_len):
	self.error(304) # return 304 Not Modified
else:
	self.response.out.write(chat.content) # return new content

That’s basically the standard approach. Pretty simple, works ok (could be optimized a bit, for example return only the actual new content etc). It’s not exactly an elegant design, though. Trying to use HTTP, designed as a Pull protocol, for an application that requires Push results creates this system of frequent server requests with empty responses, kind of like the “Are we there yet?” conversations with kids on long road trips.
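The optimization mentioned above (returning only the new content) could look roughly like this. It is a sketch, not the app's code: poll_response and the offset/text payload are invented for illustration, and it assumes the chat text is append-only, which the length-based check already relies on.

```python
def poll_response(content, cur_len):
    """Return (status, payload) for a client that already holds cur_len characters.

    Hypothetical helper: a 304 carries no payload, a 200 carries only the
    part of the text the client is missing, plus the offset it starts at.
    """
    cur_len = int(cur_len)
    if cur_len == len(content):
        return 304, None  # nothing new
    # If the client's length makes no sense, fall back to resending everything.
    offset = cur_len if 0 <= cur_len < len(content) else 0
    return 200, {"offset": offset, "text": content[offset:]}
```

On the client, the text before offset is kept and the returned text appended to it, so a full resend (offset 0) and a delta are handled by the same code path.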

Jack Moffitt’s JSConf talk introduced me to the concept of Long Polling, aka Comet or (with a lot added) BOSH, as a way to simulate HTTP Push. Rather than have the client send a lot of short, frequent requests and the server respond to each as fast as possible, long polling turns it around: the server holds each request as long as it can, returning a response only when it has new data or a timeout limit is hit. So, instead of sending a request every 3 seconds, for example, you can send one every 30 seconds.

Client side code remains almost the same:

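function refresh_chat() {
	$.ajax({
		url: "/chat",
		// {{chat_id}} is a template placeholder, substituted before the page is served
		data: "format=xhr&chat_id={{chat_id}}&cur_len=" + chat_content.length,
		complete: function(xhr) {
			if (xhr.status == 200) render_chat(xhr.responseText);
			// the server now does the waiting, so reconnect almost immediately
			setTimeout(refresh_chat, 1000);
		}
	});
}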

But on the server side, there’s a bit of new logic to keep checking for new content while the server holds the response:

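import time

cur_len = int(self.request.get("cur_len", 0))
end_by = int(time.time()) + 30 # hold the request for up to 30 seconds

while int(time.time()) < end_by:
	# NOTE: chat has to be re-read from the datastore on each pass
	# (not shown here), otherwise new messages are never seen
	if len(chat.content) != cur_len:
		return self.response.out.write(chat.content) # return new content

	time.sleep(1)

self.error(304) # return 304 Not Modified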

If you have any experience building web applications, you’ve spent a lot of effort making sure servers respond quickly to requests. Delaying the response is counter-intuitive, which in itself makes Comet useful to know, if only for its new perspective. However, this also makes production use a bit complicated, since most web server stacks are optimized for maximum requests/second rather than long concurrent requests. Content-rich sites often use separate servers for big media content for this reason, and Comet also has its own server (er “HTTP-based event routing bus”) in Cometd.
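To make the hold-and-check loop easier to reason about outside a request handler, here is a framework-free sketch. The names are assumptions for illustration: long_poll is not part of any library, and get_content stands in for whatever re-reads the chat text from storage. The clock and sleep parameters exist only so the loop can be tested without actually waiting.

```python
import time


def long_poll(get_content, cur_len, timeout=30, interval=1,
              clock=time.time, sleep=time.sleep):
    """Hold until the content length changes or `timeout` seconds pass.

    get_content is a hypothetical callable that re-reads the chat text on
    every call, so changes made by other requests become visible.
    Returns (status, body): 200 with the content, or 304 on timeout.
    """
    deadline = clock() + timeout
    while True:
        content = get_content()
        if len(content) != cur_len:
            return 200, content
        if clock() >= deadline:
            return 304, ""
        sleep(interval)
```

Each held request ties up a worker for up to the full timeout, which is exactly the concurrency cost described above.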

Written by Nir

July 13, 2009 at 3:43 pm

Posted in code, design, web


FeedVolley: Messages From Iran

without comments

I just put up a quick hack I made with FeedVolley (more about FV here), that aggregates Twitter (and other media) feeds coming from inside Iran: Messages From Iran

I don’t know about news value, but it’s pretty cool to be able to refresh that page now and then and get a snapshot of the current mood and happenings, in these possibly historic times there.

It was also cool to find another use for FeedVolley, which I neglected a bit recently ;) I added some page caching on top of the existing feed caching, to allow it to handle some traffic (Slicehost’s 256MB slices seem to start sending swap alerts as soon as traffic rises above negligible). The sources are basically the ones listed here, with a few additional ones I’m trying to find. In fact, if you really want to keep a close watch on what’s going on, you may want to watch the FriendFeed stream – the FeedVolley page is really just an HTML skin to make the feed look a little nicer (hopefully).

(Favorite tweet so far: “@jonobacon IRC is blocked. Tell our regards to Ubuntu Global Jam from Iran. I’m twitting the #iranElection story from a Kubuntu machine :)”. Makes me think of starting to use Twitter again..)

Written by Nir

June 18, 2009 at 12:33 pm

Posted in code, feedvolley, rss

JSONP, Quickly

with 7 comments

I discovered JSONP just recently, following Chris’ comment. Though I initially didn’t intend to support JSON, JSONP made enough difference that I rewrote most of TheRealURL’s code (all 20 lines of it) to support it. Since it took me some time to figure out JSONP initially, perhaps a quick guide might help those who follow.

JSONP allows you to make an HTTP request outside your own domain, which enables consuming Web Services from JavaScript code. It relies on a JS quirk: while XMLHttpRequest is blocked from making external requests, there’s no such limit on <script> elements. What JSONP does is add a <script src=> element to the DOM, with the external URL as the SRC target.

To serve JSONP simply return the JSON data inside a function. e.g., this JSON:

{ "hello" : "Hi, I'm JSON. Who are you?"}

Becomes:

some_function({ "hello" : "Hi, I'm JSON. Who are you?"})

(The reason is that the latter is actually code that will run inside the <script> element created by the JSONP client, so it needs to be executable code rather than plain JSON data.)

some_function is provided by the calling client, usually in the ‘callback’ parameter. So, a query like this:

get_jsonp?callback=getthedata

Should return:

getthedata({ "hello" : "Hi, I'm JSON. Who are you?"})

On the server side, this means adding some code similar to this:

// assume $json holds the JSON response
if (!empty($_GET['callback'])) $json = $_GET['callback'] . "( $json )";
return $json;   // my PHP is rusty but you know what I mean
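Since the callback name is echoed straight back as executable code, it is worth checking before wrapping. Below is a sketch of the same wrapping step in Python with that check added. The render function and the identifier pattern are illustrative choices, not taken from TheRealURL.

```python
import json
import re

# Conservative assumption: the callback is a plain JavaScript identifier,
# optionally dotted (e.g. "getthedata" or "app.handlers.onData").
_CALLBACK_RE = re.compile(r"[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*", re.ASCII)


def render(data, callback=None):
    """Return (content_type, body): plain JSON, or JSONP if a callback is given."""
    body = json.dumps(data)
    if not callback:
        return "application/json", body
    if not _CALLBACK_RE.fullmatch(callback):
        raise ValueError("invalid callback name")
    return "application/javascript", "%s(%s)" % (callback, body)
```

Rejecting anything that is not an identifier keeps a caller from smuggling arbitrary script into the response through the callback parameter.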

On the client side, modern JS frameworks include JSONP support (or you can DIY). For example, in jQuery >= 1.2, adding &callback=? to the query string in the getJSON method’s URL sends a JSONP request (jQuery transparently replaces the ‘?’ with a unique string). Here’s how you get the unshortened URL for ‘bit.ly/q’ using therealurl:

$.getJSON("http://therealurl.appspot.com/?format=json&url=bit.ly/q&callback=?",
	function(data){ alert(data.url) }
);

That’s about it. JSONP probably won’t feature in the next Beautiful Code edition and obviously you need to watch the URLs you’re accessing so you don’t get malicious JS code executed, but, until cross site XHR is resolved, JSONP can get the job done.

Written by Nir

May 5, 2009 at 2:25 pm

Posted in code, web

The Real URL

with 13 comments

[UPDATED on April 21st, 2009 to reflect the JSON/P additions. Since it's <24 hours after the initial release, I hope it won't cause anyone problems.]

The Real URL began as a joke – after discovering, while working on another project, over 80 URL shortening services, I figured there must be room for a service that un-shortens all these URLs. (The web is overflowing with hype and blog posts/articles complaining about it just add to the noise, so it’s better to make your point by building something. My favorite example is the Twittering Office Chair).

Turns out there are already several out there (e.g., trueurl), but I built it anyway, since I had a slightly different approach in mind. The Real URL is meant to be used as a web service rather than on its own. It returns the “real” URL in either raw text, JSON or JSONP format – examples and details are on the homepage. (I added JSON mostly for JSONP, per Chris’ comment – admittedly I didn’t even know it existed ;) This enables cross site JS requests, which might actually make The Real URL useful.

While I do want The Real URL to be solid and reliable in the long term, I don’t want to spend much time/money keeping it up. It’s a sustainability issue – building a system that will work reliably over a long time while requiring minimal care and resources. I made a few design decisions to that end:

  • Keep it simple (always a good idea). Real URL does only one thing and is accessible in only one way (the homepage demo uses XHR to access the service, to keep it so). It now supports text/JSON/JSONP, but it’s just the same output formatted differently. Sometimes you give up some elegance to make the product useful. As in the following item:
  • Deploy with Google’s App Engine. Initially it was nice, super-minimal Sinatra code. Unfortunately Google App Engine doesn’t support Ruby yet and there’s no service that offers comparable cost/stability ratio, so I rewrote in slightly less minimal Python for GAE.
  • Use App Engine’s domain (therealurl.appspot.com). Buying a domain and keeping it renewed isn’t a big deal, but it still requires some attention – especially if you happen to hit a nice domain name which people try to grab or piggyback on. Sticking with the appspot.com domain minimizes this issue. (If the need arises I might add a “real” domain later on, but in any case therealurl.appspot.com will remain active.)

If you find a use for The Real URL or have an idea for one, please comment here or email me at niryariv@gmail.com. Let the street find its own uses etc ;)
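The post doesn't show how the service resolves a link, but un-shortening generally comes down to following HTTP redirects until they stop. Here is a sketch of that idea, not the service's actual code. The get_location callable is a stand-in for the real HTTP request, and max_hops is an arbitrary safety limit.

```python
from urllib.parse import urljoin


def unshorten(url, get_location, max_hops=10):
    """Follow redirects until a URL stops redirecting.

    get_location is a hypothetical callable: given a URL, it returns the
    Location header value if the server answers with a redirect, else None.
    """
    if "://" not in url:
        url = "http://" + url  # the examples pass bare hosts such as bit.ly/a
    seen = {url}
    for _ in range(max_hops):
        location = get_location(url)
        if location is None:
            return url
        url = urljoin(url, location)  # a Location header may be relative
        if url in seen:
            break  # redirect loop: give up and return what we have
        seen.add(url)
    return url
```

Keeping the network call behind get_location makes the loop easy to test with a plain dictionary lookup.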

Written by Nir

April 20, 2009 at 1:16 pm

List of URL Shortening Sites

with 20 comments

I’ve been compiling this list of URL shortening services for some time now, for use in one of my projects, and thought it might help developers who need it for their own work (or VCs who seek to place a couple $mil on one)

Anyway, here are the 82 (originally 74; thanks, commenters!) sites I got so far. If you use Ruby, just stick %w{ } around it and you’ve got an array. If you own one of these sites, put “Twitter-compatible” on your homepage, who knows ;):

adjix.com b23.ru bit.ly budurl.com canurl.com cli.gs decenturl.com dolop.com dwarfurl.com easyurl.net elfurl.com ff.im fire.to flq.us freak.to fuseurl.com g02.me go2.me idek.net is.gd ix.lt kissa.be kl.am korta.nu krunchd.com ln-s.net loopt.us memurl.com miklos.dk moourl.com myurl.in nanoref.com notlong.com ow.ly ping.fm piurl.com poprl.com qicute.com qurlyq.com reallytinyurl.com redirx.com rubyurl.com rurl.org shorl.com short.ie shorterlink.com shortlinks.co.uk shorturl.com shout.to shrinkurl.us shurl.net shw.me simurl.com smallr.com snipr.com snipurl.com snurl.com starturl.com surl.co.uk tighturl.com tinylink.com tinypic.com tinyurl.com tinyvh.com tr.im traceurl.com twurl.nl u.mavrev.com ur1.ca url-press.com url.ie url9.com urlcut.com urlhawk.com urli.ca urlpass.com urlx.ie xaddr.com xrl.us yep.it yuarel.com yweb.com zurl.ws
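The same trick works outside Ruby. As a sketch in Python, splitting the space-separated text gives a set you can check hosts against. Only a handful of entries are included here to keep it short, and is_short_url is just an illustrative name.

```python
from urllib.parse import urlparse

# A few entries from the list above; paste the full text in practice.
SHORTENERS = frozenset("bit.ly is.gd ow.ly tinyurl.com tr.im".split())


def is_short_url(url):
    """True if the URL's host is one of the known shortening services."""
    if "://" not in url:
        url = "http://" + url  # allow bare forms such as bit.ly/a
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host in SHORTENERS
```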

UPDATE: I moved the list to listable.org, per Karan’s suggestion, which allows easily exporting the data to SQL, JSON or text. Future updates will all be there: http://www.listable.org/show/url-shortening-sites

UPDATE #2: As a result of this post I ended up building a URL unshortening service, which I now think might actually have some uses. More here.

Written by Nir

April 4, 2009 at 10:58 am

Posted in code, web

Backing Up MySQL Database with Subversion

without comments

While working on a recent Rails project, I wanted to occasionally back up the database to a remote location. Since we were already using Subversion for source control, I figured I could just use it for storing the DB contents as well and came up with a short Ruby script for this, called dbbackup (ironically, stored on a Git repo – I guess it could use a --use-git option ;)

The only Rails tie is that it uses config/database.yml to get the database name and login info, so you could easily adapt it to run on non-Rails projects too. It’s built so that it can be run by a nightly cron task, and since it’s only sending the diffs it shouldn’t be too resource-heavy.
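The general shape of such a backup (dump the database to a text file that is already under version control, then commit it) can be sketched as follows. This is not the dbbackup script itself, which is written in Ruby. The function name, the dict layout and the commit message are assumptions, and the sketch only builds the command lines rather than running them.

```python
def backup_commands(db, dump_path, message="nightly db backup"):
    """Build the dump and commit commands for one database entry.

    db is assumed to be a dict shaped like one environment in
    config/database.yml: database, username, and optionally password/host.
    dump_path must already be under Subversion control (svn add, once).
    """
    dump = [
        "mysqldump",
        "--user=%s" % db["username"],
        "--host=%s" % db.get("host", "localhost"),
        # One INSERT per row keeps the text diff between dumps small.
        "--skip-extended-insert",
        # Drop the timestamp footer so an unchanged DB yields an identical file.
        "--skip-dump-date",
        "--result-file=%s" % dump_path,
    ]
    if db.get("password"):
        # Note: a password passed this way is visible in the process list.
        dump.insert(1, "--password=%s" % db["password"])
    dump.append(db["database"])
    commit = ["svn", "commit", "-m", message, dump_path]
    return dump, commit
```

Each list can be handed to subprocess.run as is. Because Subversion only transmits the difference from the previous revision, a mostly unchanged dump makes for a small commit.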

I wouldn’t use this for Facebook’s production servers, but if your needs are more moderate you might find it useful. Feel free to send over any questions or patches to niryariv@gmail.com. Here’s the repo URL again, with the script and an explanation on how to use it: http://github.com/niryariv/dbbackup/tree/master

Written by Nir

March 23, 2009 at 10:11 am

Feedvolley Design

without comments

I like 37signals’ Design Decisions posts, which explain the thinking behind seemingly small details in their apps. It makes sense: building the core functionality of most web apps is relatively straightforward; the real quality (and, ultimately, most of the effort) is in the details. So, here’s my take on Feedvolley’s design.

I’m not aware of any site that does quite the same thing as Feedvolley, so the first challenge is to get users to understand what it’s about and how to use it. My favorite way to learn to use something is to play around with it (not recommended with firearms, bikes and similar BTW), so the goal was to make Feedvolley’s interface invite users to do just that. That means making it as easy as possible to accomplish something, and then making it rewarding to keep playing with what was created.

To make starting out easy, the homepage is a minimal form with fields for feed or HTML page URL (one of the features that make RSS/Atom a good Web API is the fact it’s often auto-discoverable) and email (more about that in a moment). A default theme is pre-selected.

Another way to create a page is by clicking the “Create a page like this” link located on top of every user-created Feedvolley page. This lets users start with an existing page and modify the content and HTML to their needs. That’s one of my favorite web app buttons – it invites a viewer to become a participant, and lets users start with something similar to what they want, and just modify it as they learn the system.

You might have noticed there is no registration step here. Personally, I hate having to register on a website in order to do anything. Feedvolley (like Notifyr) uses email as its user authorization system. Each user gets an edit link with a unique token string. This token is also kept in a cookie, so users aren’t forced to go to their email in order to start editing. “Create a page like this” also serves as backup for this system. If a user loses the edit link, she can simply go to her page, create a duplicate and continue editing that.
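A token-based edit link of this kind can be sketched in a few lines. This is an illustration under assumptions rather than Feedvolley's code: the function names, the token length and the URL layout (/<page>/edit?token=...) are all made up.

```python
import hmac
import secrets


def new_edit_token():
    """Return an unguessable, URL-safe token for a page's edit link."""
    return secrets.token_urlsafe(24)


def edit_link(base_url, page_path, token):
    """Build the link that gets emailed to the page's creator (hypothetical layout)."""
    return "%s/%s/edit?token=%s" % (base_url.rstrip("/"), page_path.strip("/"), token)


def may_edit(stored_token, presented_token):
    """Check a token taken from the link or the cookie against the stored one."""
    if not stored_token or not presented_token:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(stored_token.encode("utf-8"),
                               presented_token.encode("utf-8"))
```

The same check serves both entry points: the token from the emailed link and the copy kept in the cookie.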

This may not be perfect, but seems optimal for most users. If some users ask for a more rigid authorization system, we can always add it later on as an option.

Once a user has created a page, the next step is to make further work on it possible, and worth the time. This is where the “Customize” link comes into play: users can set a page’s URL path and have complete control over the HTML, JavaScript and CSS in the theme. To make themes as easy as possible, Feedvolley’s markup closely follows that of Tumblr (Tumblr’s templates inspired the Feedvolley concept, in fact. They do great work over there). In “work with the existing environment” spirit, this lets users easily adapt existing Tumblr themes for Feedvolley and also use Tumblr’s docs, which saved us some documentation pains :)

As far as design goes, I’m with the “it’s done when there’s nothing more to remove” (as opposed to “nothing more to add”) school. So, I’m pretty happy with the result in Feedvolley. Some challenges remain: how to make it more obvious that a page can be customized once it’s created, for example. It’s not perfect yet and we’ve already incorporated some user feedback into it – if you have any comments or requests, please leave them here or email me directly: niryariv@gmail.com.

Written by Nir

July 2, 2008 at 6:38 pm