Entrez-AJAX: A RESTful API for Bioinformatics Web App Developers

Caution: Extreme bioinformatics nerdery follows. If you don't know your AJAX from your JSON, please ignore this post!

When planning the new release of xBASE, we needed to launch Entrez searches directly from the user's browser. A common pattern for making web requests from the browser is AJAX. Launching searches from the browser improves the user experience by speeding up page loads, and it lets us handle a slow or inaccessible Entrez without blocking the page.

However, Entrez does not currently support direct browser access from a third-party site à la the Google AJAX API. Right now there is no JavaScript API for AJAX access, and no support for returning results in JSONP format (JSON with padding), which would allow a third party to build one.

One way round this problem is to create a JSON proxy that funnels results from the Entrez Programming Utilities (eUtils) through another server. Although I found several JSON proxies built with Yahoo! Pipes, none seemed to work reliably, and they often had idiosyncratic or partial support for the Entrez API.

To rectify the matter, we wanted to offer a simple, supported (and supportable) RESTful web service that natively returns JSONP results proxied from eUtils. Happily, this is now ready for action on the project website.
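For anyone unsure what the "padding" in JSONP amounts to, here is a minimal sketch of the wrapping step a proxy like this has to perform. It is an illustration under assumptions, not the service's actual code: the function name `toJsonp`, the callback name `handleResults` and the payload fields are all made up for the example.

```javascript
// Accept only plain (optionally dotted) identifiers as callback names, so a
// caller cannot inject arbitrary script into the response.
const CALLBACK_PATTERN = /^[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*$/;

// Hypothetical helper: wrap a JSON-serialisable payload in a call to the
// callback named by the client. The result is a snippet of JavaScript that
// a <script> tag can load from another domain.
function toJsonp(callbackName, payload) {
  if (!CALLBACK_PATTERN.test(callbackName)) {
    throw new Error("Invalid JSONP callback name: " + callbackName);
  }
  return callbackName + "(" + JSON.stringify(payload) + ");";
}

// Illustrative payload; these field names are not the real eUtils schema.
const body = toJsonp("handleResults", { count: 2, ids: ["123", "456"] });
console.log(body); // handleResults({"count":2,"ids":["123","456"]});
```

When the browser loads that response through a script tag, it simply calls `handleResults` with the data, which is what lets the request cross domains.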

Instead of building a bells-and-whistles JavaScript API, we have opted for a simpler approach: some (hopefully) useful documentation and some basic JavaScript examples to get you started. This avoids a dependency on any particular JavaScript library. We like using jQuery for our AJAX, so the examples use it, but alternative libraries like Prototype, Dojo and YUI will work just as well.
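To give a flavour of what a jQuery call might look like, here is a rough sketch. Treat the details as assumptions: the `/esearch` path, the `https` scheme and the helper names `buildSearchUrl` and `search` are illustrative only, and the `db` and `term` parameters are borrowed from eUtils. The API documentation has the real endpoints and parameters.

```javascript
// Hypothetical endpoint: the scheme, path and parameters are assumptions
// for illustration; check the API documentation for the real ones.
const BASE_URL = "https://entrezajax.appspot.com/esearch";

// Build a query string from a plain object, URL-encoding keys and values.
function buildSearchUrl(baseUrl, params) {
  const query = Object.keys(params)
    .map((key) => encodeURIComponent(key) + "=" + encodeURIComponent(params[key]))
    .join("&");
  // A trailing "callback=?" tells jQuery to treat the request as JSONP and
  // to substitute a generated callback name for the "?".
  return baseUrl + "?" + query + "&callback=?";
}

// Browser-only: requires jQuery to be loaded on the page.
function search(term, onResult) {
  const url = buildSearchUrl(BASE_URL, { db: "pubmed", term: term });
  jQuery.getJSON(url, onResult);
}
```

On a page with jQuery loaded, something like `search("Escherichia coli", function (data) { console.log(data); })` would then hand the parsed result to the callback.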

As we were worried about what might happen if this service got very popular, we decided to deploy on Google App Engine, which gives us a scalable and fast infrastructure for free. We should be able to handle as much traffic as you can throw at us, but if a particular developer uses the service excessively, we may suggest they deploy their own Google App Engine instance from the source code and use that instead (or send us a donation to cover the cost).

Another benefit of using App Engine is the availability of a memory-backed cache and a persistent database. We thought we'd use these to cache search results (for a maximum of 24 hours), which should help searches return very quickly, potentially even when Entrez is down or inaccessible.
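The caching idea can be sketched in a few lines. This is not the App Engine implementation: it is an in-memory stand-in written in JavaScript, and the names `TtlCache`, `cachedSearch` and `fetchFromEntrez` are hypothetical. It only shows the 24-hour expiry logic described above.

```javascript
// Maximum age of a cached search result: 24 hours, in milliseconds.
const MAX_AGE_MS = 24 * 60 * 60 * 1000;

// Minimal in-memory cache with per-entry expiry. The clock is injectable so
// the expiry behaviour can be exercised without waiting a day.
class TtlCache {
  constructor(maxAgeMs, now = Date.now) {
    this.maxAgeMs = maxAgeMs;
    this.now = now;
    this.entries = new Map();
  }

  get(key) {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() - entry.storedAt > this.maxAgeMs) {
      this.entries.delete(key); // too old: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value: value, storedAt: this.now() });
  }
}

// Return a cached result when one is fresh enough; otherwise call the
// (hypothetical) upstream fetch function and remember its answer.
function cachedSearch(cache, query, fetchFromEntrez) {
  const hit = cache.get(query);
  if (hit !== undefined) return hit;
  const result = fetchFromEntrez(query);
  cache.set(query, result);
  return result;
}
```

Repeating the same query within 24 hours is answered from the cache without touching the upstream service; after that the entry expires and the next request fetches a fresh result.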

So if you build web frontends for biology or medicine, you may well find this API useful. The API documentation and examples are available at entrezajax.appspot.com. I'd be really grateful for any feedback you may have.

Finally, there's no reason this service has to be restricted to Entrez. If you know of other database resources that would benefit from the same treatment, drop me a line and let me know about them.