Short Read Archive Canned

This email, apparently from NCBI head honcho David Lipman, was posted in the comments section at Tree of Life:

Dear Staff Members of NCBI,

As you are aware, the federal government as well as NIH is facing a period of budgetary uncertainty that is resulting in ongoing program reviews throughout the government.   At NCBI our senior staff have been giving serious consideration to our own projects and staffing levels in order to prepare for and adjust to new fiscal constraints.

NCBI had received a significant adjustment in its appropriated funding in the proposed FY2011 President’s Budget.  The President’s Budget, however, has not been enacted and we are being required to operate at last year’s (FY2010) level under a Continuing Resolution (CR) from Congress.  Upon the CR’s expiration on March 4, 2011, there is little likelihood the budget picture will improve.  The NIH Office of the Director has provided us with stop-gap funding to alleviate some of our FY2011 and FY2012 funding needs.

Therefore, to ensure that we can provide stability and some degree of reasonable growth for core activities, we have had to identify projects for downsizing or elimination.  In order to meet budget objectives we have had to come to a very difficult decision to downsize the Conserved Domain Database and to eliminate the OSIRIS and Peptidome projects.  The Sequence Read Archive (SRA) will also be phased out over the next 12 months.  Temporary funding from NIH for SRA is expected for at least four, and possibly eight, months.  Beyond that period, staff at the NIH ICs will be examining alternative approaches for SRA-type data.

Our expectation is that we can accomplish most of the restructuring through normal attrition but unfortunately some positions will have to be eliminated.    It is regrettable that we have had to take this drastic action, but it has become unavoidable.

These are difficult times but I am confident that NCBI is positioning itself to offer a stable employment environment and that all of you will continue your outstanding contributions to its success.  I thank you for all your efforts and am grateful for your continuing dedication.

David Lipman

Wow, big news, and not just because NCBI downsizing is a worry in terms of funding priorities, given the ongoing explosion in the production of genomics data.

I, like many others, won't necessarily be sorry to see the back of the Short Read Archive as it was a bit of a pain to upload to and a massive pain to retrieve from.

But the question is now, what will become the de facto place to find short-read data, or pointers to data? Certainly the SRA was useful - particularly for getting data to test bioinformatics applications against.

I am not aware of a corresponding announcement from the European Nucleotide Archive (EBI's version of SRA), so I guess that can continue to be used.

But it seems unlikely that any single repository will be able to handle the sequencing output of the entire world for much longer. More sensible and sustainable would be a peer-to-peer arrangement where each sequencing centre is responsible for hosting its own data and making it Web-accessible. There may be a role for services like BioTorrents.

But this of course throws up plenty of questions about standards of metadata (arguably never that high in the first place), availability, resilience and so on. I know that my local network administrators won't be impressed if we start hosting a bunch of terabyte-sized files that are heavily downloaded.
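
To make the metadata and resilience worries a bit more concrete, here is a rough Python sketch of the sort of record a sequencing centre might publish alongside each read file it hosts. It is purely illustrative: the field names (sample, platform, urls) and the functions manifest_entry and verify are my own inventions, not part of any existing standard or service. The point is just that a checksum and a list of mirror URLs would let anyone confirm they have the right file, wherever they fetched it from.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash the file in 1 MiB chunks so huge read files never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_entry(path, sample, platform, urls):
    """Build one record for a hosted read file.

    The field names are hypothetical, chosen only for illustration.
    """
    path = Path(path)
    return {
        "file": path.name,
        "bytes": path.stat().st_size,
        "sha256": sha256_of(path),
        "sample": sample,
        "platform": platform,
        "urls": list(urls),  # mirrors that claim to serve the same bytes
    }


def verify(path, entry):
    """Return True if a downloaded copy matches the published record."""
    path = Path(path)
    # Compare the cheap size check first, then the full checksum.
    return (path.stat().st_size == entry["bytes"]
            and sha256_of(path) == entry["sha256"])
```

A centre would call manifest_entry once per file and publish the result (as JSON, say); a downloader would call verify after fetching from any of the listed URLs. None of this addresses who pays for the bandwidth, of course.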

More coverage over at Eagle Genomics' blog.

Thoughts welcome in the comments as always.