Illumina sequencing is one of the second generation sequencing techniques. These techniques differ from one another, but they all include an amplification phase of the library fragments prior to sequencing.
The Illumina sequencing technique, like the other second generation techniques, is based on three main steps:
- Construction of a next generation sequencing (NGS) library, which involves adding specific adapters to the DNA or cDNA fragments to be sequenced. Note that the adapters differ according to the technique used.
- Amplification of library fragments. This phase takes place differently depending on the second generation technique used.
- Sequencing of the fragments through cycles of biochemical reactions. During these cycles, information is acquired that allows dedicated software to reconstruct the DNA or cDNA sequence. This phase also varies according to the technique used.
As I already mentioned in the article "The sequencing", there are several second generation techniques but, from what I have seen in my brief experience, the most widely used is Illumina. In this article I will therefore try to explain how it works in the simplest way possible. First, though, I would like to introduce the advantages and disadvantages of second generation sequencing techniques, summarized below.
Advantages:
- There is no need to build a library in cloning vectors, so the transformation phase is not necessary.
- A large number of library fragments (> 96 fragments) can be sequenced in a very small space.
- You work with very small volumes.
- The costs are low, although a second generation technique is not advisable when only a few fragments have to be sequenced.
Disadvantages:
- The reads are short: those obtained with second generation techniques are usually at most 600-700 bp long.
- The accuracy of the reads is about 10 times lower than that of Sanger sequencing.
Now, as promised, let's talk about the Illumina sequencing technique. It is widely used nowadays and yields reads of about 250 bp, that is, about 500 bp per fragment when both ends are read.
The Illumina technique involves three main steps:
- Building the library.
- Amplification of selected library fragments by bridge PCR (bPCR).
- Sequencing of the amplification product using the sequencing-by-synthesis method.
We now describe the different phases in detail.
CONSTRUCTION OF THE LIBRARY
There are two steps to build the Illumina sequencing library:
- Fragmentation, by sonication or enzymatic digestion, of the extracted genomic DNA or of the cDNA obtained from the transcriptome of the sample organism. The fragments must not exceed 1000 bp in length, otherwise there will be problems during sequencing.
- Ligation of the Illumina adapters to the ends of the double-stranded fragments. There are two types of adapters, referred to as adapters A and adapters B. Only the fragments carrying an A adapter at one end and a B adapter at the other will be used in the later stages of Illumina sequencing. These correctly formed fragments can be isolated in several ways. One of the most popular methods uses paramagnetic beads, that is, beads that become magnetic when placed in a magnetic field, coated with molecules of streptavidin. The selection relies on the binding between streptavidin and biotin, which is attached to one of the two types of adapters, for example adapter B.
Let me open a parenthesis on Illumina adapters, which fall into three categories:
- Single index Illumina adapters. These adapters contain a barcode (or index), that is, a DNA sequence that identifies a specific sample (where sample means a specific genotype). By inserting these sequences in the adapters, fragments from different genotypes can be sequenced simultaneously. The simultaneous sequencing of multiple samples is called multiplex sequencing or multiplexing.
These adapters consist of:
a) a universal region that pairs with the amplification primers (this region differs between adapters A, which bind to one primer, and adapters B, which bind to another primer).
b) an Illumina barcode (or index), present on adapter A or adapter B.
c) a universal region to which the sequencing primer binds.
- Dual index Illumina adapters. In this case both adapters A and adapters B carry a barcode. Unlike single index adapters, they allow a higher level of multiplexing: each sample, i.e. each genotype sequenced simultaneously, is identified by the combination of its two barcodes, so many more samples can be distinguished with the same number of barcode sequences.
- Illumina adapters without barcode (index). These adapters are used in standard Illumina sequencing protocols in which the flow cell, i.e. the support on which the fragments are placed and on which the amplification and sequencing reactions take place, physically separates the samples from different genotypes (for example in separate lanes). Barcodes are therefore not needed to tell the DNA samples apart. These adapters consist of:
- a universal region that pairs with the amplification primers (this region differs between adapters A, which bind to one primer, and adapters B, which bind to another primer).
- a universal region to which the sequencing primer binds.
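To make the idea of multiplexing more concrete, here is a minimal sketch in Python of how reads could be assigned back to their samples using the barcode. The barcode sequences, the sample names and the function `demultiplex` are hypothetical, and for simplicity I assume that the barcode is the first six bases of each read (on real instruments the index is usually read in a separate step).

```python
from collections import defaultdict

# Hypothetical barcode-to-sample table, for illustration only.
BARCODES = {"ACGTAC": "sample_1", "TGCATG": "sample_2"}


def demultiplex(reads, barcodes, barcode_len=6):
    """Group reads by sample, assuming the barcode is the first
    `barcode_len` bases of every read (a simplification)."""
    by_sample = defaultdict(list)
    for read in reads:
        barcode = read[:barcode_len]
        insert = read[barcode_len:]  # the part that belongs to the fragment
        # Reads with an unknown barcode cannot be assigned to any sample.
        sample = barcodes.get(barcode, "undetermined")
        by_sample[sample].append(insert)
    return dict(by_sample)


reads = ["ACGTACGGGTTT", "TGCATGAAACCC", "ACGTACTTTAAA", "NNNNNNCCCGGG"]
result = demultiplex(reads, BARCODES)
for sample, inserts in result.items():
    print(sample, inserts)
```

With dual index adapters the same logic would apply, using the pair of barcodes as the key instead of a single one.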
AMPLIFICATION OF SELECTED LIBRARY FRAGMENTS BY BRIDGE PCR (bPCR)
Once the library fragments carrying adapters A and B have been selected, they are amplified using a method called bridge PCR. This method amplifies the fragments on a solid support, called flow cell, that carries the amplification primers able to recognize and bind adapters A and B. For example, the forward primer binds to adapter A and the reverse primer to adapter B.
In particular, bridge PCR takes place in the following way:
- The fragments bind to the forward and reverse primers of the flow cell.
- The DNA polymerase starts the amplification of the fragments bound to the flow cell primers (this is possible because all the reagents for the amplification reaction are present on the flow cell: a DNA polymerase, dNTPs, buffer and Mg2+).
- Following the first amplification cycle, the strands are denatured (at 94 °C in a conventional PCR) and the fragments used as templates are removed by washing. As a result, only the newly synthesized fragments, which have incorporated the primer bound to the flow cell, remain on the flow cell.
- The newly synthesized fragments left on the flow cell bend over and form a "bridge", because their free adapter binds to another complementary amplification primer on the flow cell. At this point the DNA polymerase begins a new amplification cycle and produces a new fragment. At the end of this second amplification a denaturation takes place, but this time no fragments are washed away since all of them are anchored to the flow cell.
- Several amplification cycles are then carried out as in the previous point, until clusters (groupings) of copies of each linearized fragment are obtained, each located at a different point of the flow cell. Note that to obtain homogeneous clusters, i.e. clusters containing copies of only one type of fragment, the fragments must be smaller than 1000 bp: larger fragments risk invading other clusters during the amplification process. The homogeneity of the clusters is fundamental in the subsequent sequencing phase, because the reads obtained from a cluster will then all come from copies of a single library fragment.
- Once the clusters of copies of linearized fragments have formed, a chemical cleavage followed by washing removes the reverse fragments from the flow cell, so that only the forward fragments remain.
- Finally, before proceeding with sequencing, blocking is carried out: the 3' end of the fragments remaining on the flow cell is blocked to prevent the formation of further, unwanted bridges.
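The growth of a cluster during the cycles described above can be illustrated with a little arithmetic. The following Python sketch is only an idealized model: it assumes that each anchored strand produces at most one new copy per cycle, and the `efficiency` parameter (the fraction of strands that actually get copied) is my own hypothetical addition.

```python
def cluster_size(cycles, efficiency=1.0):
    """Idealized number of anchored strands in one cluster after
    `cycles` bridge amplification cycles, starting from one strand."""
    copies = 1.0
    for _ in range(cycles):
        # Each strand that forms a bridge is copied once in this cycle.
        copies += copies * efficiency
    return copies


for n in (5, 10, 20):
    total = cluster_size(n)
    # After the reverse strands are removed, about half of the strands remain.
    print(n, "cycles:", int(total), "strands, about", int(total / 2), "forward")
```

With perfect efficiency the number of strands doubles at each cycle, which is why a few dozen cycles are enough to obtain a cluster with a detectable signal.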
SEQUENCING OF THE AMPLIFICATION PRODUCT USING THE SEQUENCING-BY-SYNTHESIS METHOD
The sequencing method is called sequencing-by-synthesis because it is based on the activity of the DNA polymerase which, starting from a sequencing primer, adds one nucleotide at a time to build a read complementary to a specific fragment on the flow cell. Each of the four types of nucleotide is labeled with a different fluorescent molecule that emits light of a different color, which is then detected by a special camera. Furthermore, these nucleotides carry a block on the 3' hydroxyl group (-OH), so that only one nucleotide can be added per sequencing cycle. After the color has been recorded, the fluorescent molecule and the block are removed so that the next cycle can begin.
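The cycle just described can be sketched in a few lines of Python. This is a toy model under my own assumptions: the color assigned to each nucleotide in `DYE` is hypothetical, the template is written in the 3' to 5' direction so that the read comes out 5' to 3', and errors are ignored.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical color assignment, for illustration only.
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}


def sequence_by_synthesis(template, cycles):
    """Simulate `cycles` sequencing cycles on a template written 3'->5'.

    Returns the read (5'->3') and the color recorded at each cycle.
    """
    read = []
    colors = []
    for template_base in template[:cycles]:
        # The polymerase adds exactly one blocked, labeled nucleotide.
        incorporated = COMPLEMENT[template_base]
        # The camera records the color, which identifies the base.
        colors.append(DYE[incorporated])
        read.append(incorporated)
        # Dye and block would now be removed before the next cycle.
    return "".join(read), colors


read, colors = sequence_by_synthesis("TACG", cycles=4)
print(read, colors)
```

The number of cycles sets the read length: running 250 cycles gives a read of 250 bases.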
In particular, there are two different sequencing-by-synthesis approaches:
- Single read sequencing by synthesis, in which only one end of each library fragment is read. This approach usually yields only the first 250 bases from one end of each fragment, so it is often written as 1 × 250 bp.
- Paired end sequencing by synthesis, in which both ends of each library fragment are read. In this case 500 bases are obtained per fragment: 250 from the read at one end and 250 from the read at the other end. This approach is written as 2 × 250 bp and is very useful, because having the reads from both ends of the fragment allows a better assembly of the sequenced DNA or cDNA. Note also that if the library fragments are 500 bp long, paired end sequencing can cover them in full.
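The difference between the two approaches can be shown with a short Python sketch. The function names are hypothetical, and I assume that the second read is taken from the opposite strand, so it corresponds to the reverse complement of the fragment's other end.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(complement[base] for base in reversed(seq))


def paired_end_reads(fragment, read_len):
    """Return the two reads of a paired end run on one fragment."""
    read1 = fragment[:read_len]                      # from one end
    read2 = reverse_complement(fragment)[:read_len]  # from the other end
    return read1, read2


def covered_fraction(fragment_len, read_len, paired=True):
    """Fraction of the fragment covered by the read(s)."""
    bases_read = read_len * (2 if paired else 1)
    return min(fragment_len, bases_read) / fragment_len


print(paired_end_reads("AACCGGTTAC", read_len=3))
print(covered_fraction(500, 250))                 # 2 x 250 bp on a 500 bp fragment
print(covered_fraction(500, 250, paired=False))   # 1 x 250 bp on the same fragment
print(covered_fraction(1000, 250))                # 2 x 250 bp on a 1000 bp fragment
```

On a 1000 bp fragment the middle part remains unread, but knowing that the two reads come from the same fragment still helps the assembly.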
I decided not to describe the individual steps of sequencing by synthesis for fear of making the reading too heavy: as always, these are concepts that are difficult to express in a few simple words. However, if you want to know the phases of single read and paired end sequencing in detail, write to me in the comments and follow me on Instagram.
But be careful, not all that glitters is gold: this sequencing technique also has its drawbacks. There are mainly two:
- Nucleotide substitution phenomena can occur. During sequencing, the DNA polymerase could in fact insert an incorrect nucleotide, that is, not complementary to the nucleotide of the fragment. This can happen because all four types of nucleotides are simultaneously provided in each sequencing cycle.
- Reduced length of the reads (maximum 250 bp), which makes the assembly of the sequenced DNA or cDNA more complex. In Illumina sequencing the reads cannot be much longer than this because, as the number of cycles grows, the incomplete removal of the fluorescent molecules causes errors in the detection of the light signal.
Now I just have to thank you for reading. As always I hope I have left you some useful information with this article and that you understand how this sequencing technique works in detail.
Bye-bye and see you soon.