Map of high-throughput instruments: What can you do with the data?

I'm going to try and get myself in the habit of more frequent, smaller updates.

A few people have started using the data from Omicsmaps.com, the world map of high-throughput sequencing instruments that James Hadfield and I run, to power their own projects, which we think is great.

For example, a service called Findini is scraping the data and using it to help people find sequencing providers. They've done a nice job with it.

Art Wuster, a post-doc at the Wellcome Trust Sanger Institute, has started a nice blog called Seqonomics, and he regularly uses the map data to try to understand the sequencing market. See posts like "How is commercial sequencing getting on?" and "Who are the sequencing superpowers?"

I'm even helping the Genomics Network at the University of Lancaster use the map to help look at the social impact of genomics and sequencing.

So this is all great. But right now James and I have reached a bit of an impasse with Omicsmaps. We have had the occasional excited conversation about how the map could be extended and improved, but quite honestly real work means we don't have the time to do a whole heap with it. It ticks along quite nicely with your community submissions, but I think the explosion of benchtop instruments means we can't capture as high a proportion of installations as we used to. That's not surprising, as many new users are not necessarily in touch with our close-knit genomics community centred around Twitter and Seqanswers.

My one thought is that if I can open it up a bit more, perhaps the community will come to my rescue and give it a second lease of life.

I'm happy to put the website code up on Github (well, I will definitely do this but I just haven't got round to it yet) if anyone thinks they might make changes to it.

But a first step in opening up the map is that I have put the data up as a public Google Fusion Table. Not only does this have locations and counts, it's also got snapshots from various timepoints going back to 2010. So hopefully this is a useful resource.

The really cool thing about Google Fusion Tables is that it allows you to do quick little visualisations like the one below really easily.

[iframe src="https://www.google.com/fusiontables/embedviz?viz=GVIZ&t=LINE&containerId=gviz_canvas&isXyPlot=true&q=select+col0%2C+SUM(col8)%2C+SUM(col9)%2C+SUM(col10)%2C+SUM(col11)%2C+SUM(col12)%2C+SUM(col13)%2C+SUM(col14)+from+1tYRJ6qreHion4wWx4bd_TnL7WrmMGai63jKEHPw&qrs=+where+col0+%3E%3D+&qre=+and+col0+%3C%3D+&qe=+group+by+col0+order+by+col0+asc+limit+10&att=true&width=800&height=285" width="800" height="300"]
Figure: Growth of sequencing platforms
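
If you would rather work on a local copy of the data, the chart above boils down to "sum each platform column, grouped by snapshot date". Here is a minimal Python sketch of that idea, assuming you have exported the table as CSV. The column names (`snapshot`, `platform_a`, `platform_b`) and the sample rows are made up for illustration; substitute whatever the real export uses.

```python
import csv
import io
from collections import defaultdict


def platform_totals_by_snapshot(csv_file, date_column, platform_columns):
    """Sum each platform column per snapshot date (a GROUP BY on the date)."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in csv.DictReader(csv_file):
        for column in platform_columns:
            # Treat blank cells as zero instruments.
            totals[row[date_column]][column] += int(row[column] or 0)
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    return {date: dict(counts) for date, counts in sorted(totals.items())}


# Hypothetical sample rows, only to show the expected shape of the input.
sample = io.StringIO(
    "snapshot,centre,platform_a,platform_b\n"
    "2011-01-01,Centre A,2,0\n"
    "2011-01-01,Centre B,1,0\n"
    "2012-07-20,Centre A,3,1\n"
)

growth = platform_totals_by_snapshot(sample, "snapshot", ["platform_a", "platform_b"])
for date, counts in growth.items():
    print(date, counts)
```

For a real export you would pass an open file instead of the `io.StringIO` sample, e.g. `open("export.csv", newline="")` (the filename is a placeholder).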

Or perhaps see the number of sequencers by country:

[iframe src="https://www.google.com/fusiontables/embedviz?viz=GVIZ&t=PIE&containerId=gviz_canvas&q=select+col2%2C+SUM(col7)+from+1tYRJ6qreHion4wWx4bd_TnL7WrmMGai63jKEHPw+where+col0+%3D+'2012-07-20'&qrs=+and+col2+%3E%3D+&qre=+and+col2+%3C%3D+&qe=+group+by+col2+limit+52&att=true&width=800&height=285" width="800" height="285"]
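
The per-country chart is the same trick with a filter: keep only the rows for one snapshot date, then total the instrument count for each country. A rough Python sketch, again assuming a local CSV export; the column names (`snapshot`, `country`, `total`) and the rows are hypothetical placeholders, not the real table's headers.

```python
import csv
import io
from collections import Counter


def instruments_by_country(csv_file, snapshot, date_column="snapshot",
                           country_column="country", count_column="total"):
    """Total the instrument count per country for a single snapshot date."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        # Skip rows belonging to other snapshots.
        if row[date_column] == snapshot:
            counts[row[country_column]] += int(row[count_column] or 0)
    return counts


# Hypothetical sample rows, only to show the expected shape of the input.
sample = io.StringIO(
    "snapshot,country,total\n"
    "2012-07-20,Country A,5\n"
    "2012-07-20,Country B,2\n"
    "2012-07-20,Country A,1\n"
    "2011-01-01,Country A,4\n"
)

by_country = instruments_by_country(sample, "2012-07-20")
for country, total in by_country.most_common():
    print(country, total)
```

`most_common()` returns the countries largest first, which is handy if you only want the top few slices of the pie.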

And it has built-in geolocation support so you can even make little visualisations overlaid on maps.

As always you are only limited by your imagination with datasets like this (sorry, these examples weren't very imaginative, but as I say I'm trying to blog more regularly).

I'm putting the following license on these data, which basically lets you do what you like with them; obviously a citation would be nice. I will look into getting the map its own DOI via Figshare or similar.

Creative Commons Licence
This work is licensed under a Creative Commons Attribution 3.0 Unported License.

As always, feedback very welcome.