Steady Eddy! HMMER gets RESTful ...

The legendary Sean Eddy (hey, that scans!) gave us data-nerds a Valentine's Day present and posted the details of the new HMMER RESTful API. I am a big fan of RESTful APIs (SOAP APIs are another story - I loathe them with a passion). In my opinion, REST is the best way of doing loose coupling between distributed biological databases right now.

For those not familiar with HMMER, it is the definitive tool for finding conserved protein domains in a protein sequence. It is usually combined with a profile database such as Pfam. Since HMMER3 there is also PHMMER, an alternative to BLASTP or PSI-BLAST which can be used against protein sequence databases such as NR.

HMMER used to be on the slow side but with the release of HMMER3 it is blazingly quick - the Eddy team have managed a minor miracle. Eddy claims an incredible 1s latency for a search against NR. Not satisfied with that performance, he plans to get typical HMMER searches down into the 100-200ms range. This brings interactive applications into play, as opposed to batch approaches where you submit a job and go off to make a coffee.

Eager to play with the new API I took a couple of hours to build a Python class to access HMMSCAN and PHMMER (straightforward). I then integrated it into the EntrezAJAX project. EntrezAJAX was written to permit cross-site scripting of the Entrez API, but the infrastructure is equally applicable to other web resources.
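
To give a flavour of how small such a wrapper can be, here is a minimal sketch. It is not the class I wrote for EntrezAJAX, and several things in it are assumptions made for illustration: the host name (`hmmer.example.org` is a placeholder), the `/hmmscan` and `/phmmer` paths, the `seq`, `hmmdb` and `seqdb` form fields, and the JSON `Accept` header. Check the published API details before relying on any of them. The sketch only builds the HTTP request; it does not send it.

```python
import urllib.parse
import urllib.request

# Assumption: placeholder host, not the real HMMER service address.
BASE_URL = "http://hmmer.example.org/search"


class HmmerClient:
    """Builds POST requests for a HMMER-style REST search service."""

    def __init__(self, base_url=BASE_URL):
        self.base_url = base_url.rstrip("/")

    def build_request(self, tool, seq, **params):
        # Form-encode the sequence plus any tool-specific parameters.
        fields = dict(params, seq=seq)
        data = urllib.parse.urlencode(fields).encode("utf-8")
        return urllib.request.Request(
            f"{self.base_url}/{tool}",
            data=data,
            headers={"Accept": "application/json"},  # assumed response format
            method="POST",
        )

    def hmmscan(self, seq, hmmdb="pfam"):
        # Sequence against a profile database (field name is an assumption).
        return self.build_request("hmmscan", seq, hmmdb=hmmdb)

    def phmmer(self, seq, seqdb="nr"):
        # Sequence against a sequence database (field name is an assumption).
        return self.build_request("phmmer", seq, seqdb=seqdb)
```

Passing the returned object to `urllib.request.urlopen` would actually submit the search. Keeping request building separate from sending makes the wrapper easy to test without touching the network.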

Here's the result: interactive HMMER searches against Pfam, live in your browser! This code will kick off a HMMSCAN search. Click the button to play!



Pretty cool eh?

And because EntrezAJAX caches results by default, common searches may return results in the blink of an eye.
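
The idea behind that kind of caching can be sketched in a few lines. This is an illustration only, not how EntrezAJAX actually stores results: the `CachedSearch` class, its in-memory dictionary and the way the key is normalised are all hypothetical choices of mine.

```python
import hashlib
import json


class CachedSearch:
    """Wraps a search function and reuses results for identical queries."""

    def __init__(self, search_fn):
        self.search_fn = search_fn  # any callable: (tool, seq, **params) -> result
        self._cache = {}            # assumption: simple in-memory store

    @staticmethod
    def _key(tool, seq, params):
        # Normalise the sequence so trivially different inputs share a key.
        payload = json.dumps([tool, seq.strip().upper(), sorted(params.items())])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def search(self, tool, seq, **params):
        key = self._key(tool, seq, params)
        if key not in self._cache:
            # Cache miss: run the real search once and remember the result.
            self._cache[key] = self.search_fn(tool, seq, **params)
        return self._cache[key]
```

A second request for the same sequence and database then never reaches the remote service at all, which is where the blink-of-an-eye response comes from.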

Here's the relevant JavaScript source - if you want to use it please register for a personalised apikey at EntrezAJAX.

Bear in mind that the API is currently subject to change and cannot be relied upon for production just yet.

Next stop, domain cartoons using the canvas HTML element.