Ion Torrent 316 First Impressions

28 Jul 2011

Last week we had our Ion Torrent upgraded to support the 316 chips at the faster flow rate - many thanks to Life Tech for getting this update to us so quickly.

Although Life Tech supply an E. coli K-12 DH10B library for testing (yay, E. coli beats boring old PhiX! BTW is PhiX the most sequenced genome in the world now?) we have been testing our Ion Torrent on an O104:H4 isolate from the German outbreak supplied by the HPA (strain 280).

Our intention is to do a comparison of the benchtop sequencing platforms, of which more in a later post.

For those with short attention spans, I'll cut to the chase. Our first two runs of 316 chips yielded an impressive 251Mb and 209Mb respectively! Mean read length was about 110bp.

Run 1	Run 2

Pretty! Read distribution looks nice and tight - that's down to Chrystala's lovely clean library (we used Bioruptor and e-gel for this, previously we used Covaris).

So what is interesting here is that we've loaded the chips way higher than we are used to with the 314 - densities of 76-82% - which is why it's nice and red. Looks angry! The wells in the chip are arranged in a teardrop shape and the reagents flow diagonally. High loading density means more beads (sorry, IonSpheres) in wells.

This reflects a change to the protocol - when we were running 314 chips we were told to load fewer beads to get better coverage - and from our trials when we loaded at 41, 43 and 46% density on the 314 chip the 41% run did do best. The 314 chip has about 1.2m wells, so we were filling about 550k wells. About two-thirds of those wells were live spheres (meaning they have DNA on them) and out of those about two-thirds pass the quality filter - about 200k reads in all (~20Mb data).

The 316 chip has 6.3m wells and we're filling about 5m of these. A little under half are passing the quality filters, meaning we're getting about 2.25m reads.
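The back-of-envelope arithmetic for both chips can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and the fractions (two-thirds, and 0.45 for "a little under half") are my reading of the rough figures quoted above rather than exact numbers off the run reports.

```python
def estimated_reads(filled_wells, *pass_fractions):
    """Multiply the number of filled wells by each successive pass fraction."""
    reads = float(filled_wells)
    for fraction in pass_fractions:
        reads *= fraction
    return reads


# 314 chip: ~550k filled wells, ~2/3 live, ~2/3 of those pass the filter.
reads_314 = estimated_reads(550_000, 2 / 3, 2 / 3)

# 316 chip: ~5m filled wells, "a little under half" (assumed 0.45) pass.
reads_316 = estimated_reads(5_000_000, 0.45)

print(f"314: ~{reads_314 / 1e3:.0f}k reads")
print(f"316: ~{reads_316 / 1e6:.2f}m reads")
```

With these inputs the 314 estimate lands a bit above the ~200k quoted above, which is about what you'd expect given how loose "about two-thirds" is.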

I am not sure if these protocol changes reflect changes to chemistry or software improvements but they are very welcome - although it does mean we are getting to the limit for the physical loading of this chip (there are 6.3m wells on the 316 and we have managed to fill 5m of them on our best run). But there's plenty of scope to get more reads which pass the quality filter. We lose almost a third to the poor signal filter alone.

The 318 chip has ~12m wells which means either quality improvements and/or read length improvements are needed to get to 1Gb.
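To put rough numbers on that, here is a small sketch of what the 318 would need to reach 1Gb. It assumes (and this is purely my assumption, not anything Life Tech have stated) that the 318 loads and filters like our 316 runs: about 80% of wells filled, about 45% of those passing the filters, and a mean read length of 110bp.

```python
def required_read_length(target_bases, wells, loading, pass_fraction):
    """Mean read length needed to hit target_bases at a given pass rate."""
    reads = wells * loading * pass_fraction
    return target_bases / reads


def required_pass_fraction(target_bases, wells, loading, read_length):
    """Fraction of filled wells that must pass to hit target_bases."""
    filled = wells * loading
    return target_bases / (filled * read_length)


WELLS_318 = 12_000_000      # ~12m wells, as quoted above
LOADING = 0.80              # assumed: similar to our 316 runs (5m of 6.3m)
PASS_FRACTION = 0.45        # assumed: "a little under half" pass the filters
READ_LENGTH = 110           # assumed: mean read length from our 316 runs
TARGET = 1_000_000_000      # 1Gb

length_needed = required_read_length(TARGET, WELLS_318, LOADING, PASS_FRACTION)
pass_needed = required_pass_fraction(TARGET, WELLS_318, LOADING, READ_LENGTH)

print(f"Read length needed at 45% pass rate: ~{length_needed:.0f}bp")
print(f"Pass rate needed at 110bp reads: ~{pass_needed:.0%}")
```

Under those assumptions the 318 would yield a bit under half a gigabase as things stand, so 1Gb needs either reads of roughly twice the current length or nearly every loaded well passing the filters.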

Watching Chrystala getting to grips with this instrument in the lab, fair burning through 314 chips - occasionally messing up the loading, I did have a mini epiphany about Ion Torrent. It really is the first platform that permits smaller, cash-strapped centres to get to experiment with high-throughput sequencing. Messed up the loading? Doesn't matter, you only wasted a chip. Got a new library prep method to try? No problem. Want to experiment with different loading protocols? Again, you can do this without breaking the bank. 314 chips are coming down in price to $99 each. That's cheap enough to let undergrads have a fiddle!

It's not just the cost constraint either: being able to run the machine in a couple of hours (it's more like 3 than 2) means you can get that feedback, change things and re-run in the same day, making the whole process feel more reactive.

I realise these aren't original thoughts but it really hits home when you have a machine of your own - before we were terrified of even a single 454 Titanium failure, because of the costs (>$10k for a run). This perhaps isn't democratisation of sequencing but it certainly makes it feel much more hacker friendly.

And now the barrier to getting useful work done is lifted by the 316, the game feels a lot more real and I can see us using the Ion Torrent in anger for our stated aims - genomic epidemiology of bacterial pathogens. 200Mb is enough to get decent coverage of one or maybe two bacterial genomes (although MID kits are not yet available), or perhaps do a little bit of RNA-Seq (particularly with a normalised library). In contrast Life Tech did 10 x 314 chips to sequence a single German E. coli isolate and BGI did 7. And perhaps we shouldn't speak of the 1000 chips that were used to sequence Gordon Moore's genome!
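For a feel of what "decent coverage" means here, a quick sketch of the fold-coverage arithmetic. The genome size of 5.3Mb is my assumption for an E. coli O104:H4 isolate (chromosome plus plasmids), not a figure from our assembly, and the function name is just for illustration.

```python
def fold_coverage(total_bases, genome_size, n_genomes=1):
    """Average depth if total_bases are spread evenly over n_genomes."""
    return total_bases / (genome_size * n_genomes)


GENOME_SIZE = 5_300_000  # assumed approximate E. coli O104:H4 genome size

print(f"One 316 chip, one genome:  ~{fold_coverage(200e6, GENOME_SIZE):.0f}x")
print(f"One 316 chip, two genomes: ~{fold_coverage(200e6, GENOME_SIZE, 2):.0f}x")
print(f"One 314 chip (~20Mb):      ~{fold_coverage(20e6, GENOME_SIZE):.0f}x")
print(f"Ten 314 chips:             ~{fold_coverage(10 * 20e6, GENOME_SIZE):.0f}x")
```

This ignores uneven coverage and any reads lost to trimming, so treat it as an upper bound on usable depth.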

There's a small fly in the ointment however. At a first glance, quality scores are distinctly lower than we are used to with the 314 (plots generated by the marvellous FastQC from the FASTQ files off the Ion Torrent server)

[Figure: 316 chip qualities by base (Torrent Suite 1.4.0)]

[Figure: 316 chip mean qualities by read (Torrent Suite 1.4.0)]

Here we're staring at a median Q14 per sequence and a mean Q16 for the first 100 bases of the sequence (clearly the low quality end of the reads can be trimmed).
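As a reminder of what those numbers are supposed to mean, here is the standard Phred conversion from quality score to error probability, Q = -10 log10(p). Whether the Torrent Suite predicted scores are actually calibrated to this scale is a separate question, so take this as the nominal interpretation only.

```python
def phred_to_error_probability(q):
    """Nominal per-base error probability for a Phred quality score."""
    return 10 ** (-q / 10)


for q in (14, 16, 17, 20, 30):
    p = phred_to_error_probability(q)
    print(f"Q{q}: ~1 error in {1 / p:.0f} bases ({p:.2%})")
```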

Compare this to one of our 314 runs for the same bug.

[Figure: 314 chip qualities by base (Torrent Suite 1.3.0)]

Two things are striking. First, both the mean qualities and the 5' qualities are much lower for the 316 run than we are used to with the 314 run.

Another thing that is clear is that the quality distribution has changed somewhat - it starts lower but doesn't fall quite so precipitously.

Is this the new pipeline doing this? I re-ran our 314 analysis to check.

So weirdly this does change the per-base picture - it dials down the 5' ends quality, but increases the 3' ends - which actually serves to increase the mean read quality. Notably also it means we end up with some long reads in our dataset, up to 200 bases (these are only a small fraction however of the total dataset).

So now I'm wondering whether the increased loading density has any effect on quality.

Run	Pipeline	Bases	Longest	Q17+	Q20+	Q30+
314-Run3	1.3	17334598	119	63%	48%	0.12%
314-Run3	1.4	18141914	202	67.30%	53%	0.03%
316-Run7	1.4	251251423	203	48%	29.80%	0.00%
316-Run8	1.4	209384193	203	45%	26.80%	0.00%

Well, yes and no - not obviously within a chip class, but perhaps the different protocol has made a difference. The higher yield 316 run (Run7) has a higher fraction of >=Q17 and >=Q20 bases than Run8. But clearly both are worse than the 314 runs.
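If you want to compute this sort of per-base breakdown on your own runs, a minimal sketch follows. It assumes plain four-line FASTQ records (no wrapped sequence lines) with Sanger-style Phred+33 quality encoding; the function name is mine, and it counts the fraction of bases at or above each threshold, which may not be exactly how the figures in the table were derived.

```python
import io


def quality_fractions(fastq_handle, thresholds=(17, 20, 30), offset=33):
    """Fraction of bases with quality >= each threshold in a 4-line FASTQ."""
    total = 0
    counts = {t: 0 for t in thresholds}
    for i, line in enumerate(fastq_handle):
        if i % 4 != 3:  # only the fourth line of each record holds qualities
            continue
        for ch in line.rstrip("\r\n"):
            q = ord(ch) - offset
            total += 1
            for t in thresholds:
                if q >= t:
                    counts[t] += 1
    if total == 0:
        return {t: 0.0 for t in thresholds}
    return {t: counts[t] / total for t in thresholds}


# Tiny made-up record: qualities '5', ':', '2', '#' decode to Q20, Q25, Q17, Q2.
example = io.StringIO("@read1\nACGT\n+\n5:2#\n")
print(quality_fractions(example))
```

For a real run you would pass an open file handle instead of the `io.StringIO` example.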

OK, so not really sure what this means. The final question is - are these quality scores actually meaningful? The base caller works de novo and so it is theoretically possible that the base calls are actually much better than expected.

The final assessment is to look at alignment quality scores, i.e. alignment scores calibrated against a known reference (I am just using the inbuilt Ion Torrent pipeline for now, which uses Nils Homer's TMAP algorithm). Assuming the reference is similar enough this should be a better judge of quality values than de novo quality scores.

So far I've only had time to re-run the analysis for one of the 316 chips, as it takes some hours to run (longer than the sequencing takes, in fact):

Run	Bases	AQ17	AQ20	Perfect
316-Run7	210mb	109.08mb (51%)	83.57mb (39.7%)	33%

A definite improvement on the de novo calls, but also a source of bias (because bad reads won't map to the reference).

So in summary what I think is happening is:

It is possible that the 316 chip or new loading protocol results in a reduction in Q scores
New Torrent Suite 1.4.0 signal processor generates different (not necessarily higher or lower) quality scores to v1.3.0

Looking around on the Ion Torrent community forum I did find a member who was pushing loading densities way high and getting 70Mb runs - but also getting Q scores more similar to mine, so I wonder whether it is the protocol rather than the chip. And I did see a 316 run with similar quality scores.

Anyway, I'm not too worried about this for my needs. How does it actually perform in real life?

A quick and dirty assembly using Newbler 2.6 (Hint: DO NOT load Ion Torrent FASTQ files into Newbler, only SFF files - I found this out through bitter experience) gives me a respectable assembly with 414 contigs >= 500bp, mean contig size 12.5kb, N50 37.5kb and largest contig of 118kb. 98.84% of the bases in the assembly hit consensus quality of Q40.
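For anyone unfamiliar with N50, here is a minimal sketch of the usual definition: the length of the contig at which the running total, going from largest to smallest, first reaches half the assembly size. The contig lengths below are hypothetical, chosen only to show the calculation, and Newbler's own reporting may differ in detail (for instance in which contigs it counts).

```python
def n50(contig_lengths):
    """Length of the contig at which the cumulative sum reaches half the total."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0  # empty input


# Hypothetical contig lengths, not taken from the real assembly.
example_contigs = [118_000, 60_000, 37_500, 20_000, 10_000, 4_500]
print(n50(example_contigs))
```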

If you want the data files I can put them up for you.

Potential conflict of interest declaration: Mark Pallen (who I work for) won an Ion Torrent in their European PGM grant programme.

Loman Labs
