Next Generation Sequencing - A Step-By-Step Guide to DNA Sequencing.

    Summary

    This article traces the evolution of DNA sequencing, from the 32 years it took to complete the first human genome sequence to Next Generation Sequencing (NGS), which can sequence a person's genome in about a day. NGS achieves this by sequencing billions of DNA strands simultaneously. The workflow involves purification, library preparation, and sequencing on instruments such as Illumina's, which use a method called sequencing by synthesis. NGS is used in cancer diagnostics, rare disease detection, and many research fields, and it can sequence DNA, RNA, cell-free DNA, and more.

      Highlights

      • From taking 32 years for the Human Genome Project to just a day with NGS, DNA sequencing has come a long way! 🌟
      • Using NGS, billions of DNA strands are sequenced at once, a huge leap from the single-strand sequencing of the past. 🎉
      • NGS relies on reference genomes, made possible by earlier projects, to map DNA and RNA efficiently. 📜
      • With applications ranging from cancer research to ecology, NGS proves its worth in various scientific fields. 🧬
      • Powerful sequencing tools and techniques make NGS a game-changer in genetic research and diagnostics. 🛠️

      Key Takeaways

      • NGS revolutionized DNA sequencing, reducing time from 32 years to just one day! 🚀
      • The secret behind NGS's speed is its ability to sequence billions of DNA strands simultaneously. 🔬
      • The Human Genome Project laid the foundation for NGS by creating a reference genome. 📚
      • NGS is versatile, sequencing DNA, RNA, and even cell-free DNA across diverse fields. 🌍
      • Illumina instruments, which use a method called sequencing by synthesis, make NGS highly efficient. 🤖

      Overview

      DNA sequencing has undergone a significant transformation with the advent of Next Generation Sequencing (NGS). Completing the first human genome sequence took 32 years, from the start of the Human Genome Project in 1990 to the filling of the final gaps in 2022. NGS has reduced the time required to sequence an entire human genome to just one day. This advance comes primarily from sequencing billions of DNA strands simultaneously, whereas Sanger sequencing, the only method available during the Human Genome Project, sequences one strand at a time.

        The introduction of NGS has opened up a plethora of possibilities in the world of genomics. It works by cutting DNA into smaller, manageable pieces which are sequenced and then assembled using a reference genome, a crucial component developed by projects like the Human Genome Project. This method not only speeds up the process but also expands the scope of what can be sequenced, including RNA and cell-free DNA, thereby revolutionizing diagnostics and research in fields such as cancer treatment, rare diseases, and environmental studies.
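The cut-sequence-assemble principle described above can be illustrated with a toy sketch. This is not how production aligners work: it assumes error-free reads and uses exact string matching, and the reference string and the helper names (`fragment`, `map_reads`, `rebuild`) are invented for illustration.

```python
def fragment(sequence, size, step):
    """Cut a sequence into overlapping fragments of a fixed size."""
    return [sequence[i:i + size] for i in range(0, len(sequence) - size + 1, step)]


def map_reads(reference, reads):
    """Return sorted (position, read) pairs for reads that match the reference exactly."""
    placements = []
    for read in reads:
        position = reference.find(read)  # -1 when the read is not found
        if position != -1:
            placements.append((position, read))
    return sorted(placements)


def rebuild(reference_length, placements):
    """Lay mapped reads over their positions; '-' marks bases no read covers."""
    consensus = ["-"] * reference_length
    for position, read in placements:
        for offset, base in enumerate(read):
            consensus[position + offset] = base
    return "".join(consensus)


reference = "ACGTTGCAAGGCTTAC"  # toy reference, invented for illustration
sample = "ACGTTGCAAGGCTTAC"     # the sample is identical to the reference here
reads = fragment(sample, size=6, step=2)
placements = map_reads(reference, reads)
rebuilt = rebuild(len(reference), placements)
print(rebuilt)
```

Real reads contain sequencing errors and genuine variants, so practical tools use approximate matching rather than `str.find`; the sketch only shows why a reference makes placing short fragments straightforward.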

          Advanced instruments, particularly from Illumina, facilitate NGS through a method known as sequencing by synthesis. This involves intricate processes like library preparation, clonal amplification, and real-time sequencing which allow for extensive and precise genetic mapping. NGS has proven invaluable across various scientific areas, demonstrating its versatility and efficiency in providing a deeper understanding of genetic materials and their applications in modern science.
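As a rough illustration of sequencing by synthesis, the sketch below simulates a single cluster: each cycle incorporates exactly one terminated, dye-labelled nucleotide, and the recorded color is translated back into a base. The color assignments are hypothetical, and strand orientation is ignored for simplicity; real instrument chemistry differs.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hypothetical dye colors, chosen only for this illustration.
COLOR = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
COLOR_TO_BASE = {color: base for base, color in COLOR.items()}


def sequence_by_synthesis(template, cycles):
    """Simulate one cluster: one base is incorporated and imaged per cycle."""
    colors = []
    for cycle in range(min(cycles, len(template))):
        # The terminator ensures only this single complementary base is added.
        incorporated = COMPLEMENT[template[cycle]]
        # The camera records the color of the cluster.
        colors.append(COLOR[incorporated])
        # Removing the terminator (not modeled) allows the next cycle to proceed.
    return colors


def call_bases(colors):
    """Translate the recorded colors back into a base sequence."""
    return "".join(COLOR_TO_BASE[color] for color in colors)


colors = sequence_by_synthesis("GATTACA", cycles=5)
read = call_bases(colors)
print(read)
```

The number of cycles plays the role of the read length set on the sequencer: the loop stops after that many bases even if the template is longer.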

            Chapters

            • 00:00 - 00:30: Introduction and Background The chapter titled "Introduction and Background" discusses the Human Genome Project. Initially, only 85 percent of the human genome was sequenced between 1990 and 2003. It took 32 years to completely sequence the human genome, which finally concluded with the completion of the remaining gaps in 2022. Now, next generation sequencing technologies have drastically reduced the time required for sequencing the human genome.
            • 00:30 - 01:00: Sanger Sequencing and Comparison with NGS The chapter discusses the advances in DNA sequencing technology, comparing the traditional Sanger sequencing method with the modern Next-Generation Sequencing (NGS). It highlights the dramatic improvement in speed, where NGS allows billions of DNA strands to be sequenced simultaneously, drastically reducing the time required to sequence a person's entire genome from 32 years to just one day. Furthermore, it explains the importance of the Human Genome Project in enabling NGS by providing a human reference DNA sequence, noting that only Sanger sequencing was available during the project's execution.
            • 01:00 - 01:30: Basic Principle and Sample Preparation for NGS The basic principle behind Next-Generation Sequencing (NGS) is that DNA can be cut into small pieces and sequenced, which are then assembled into a complete sequence based on a reference genome. NGS can be used to sequence both DNA and RNA. Initially, samples are collected, followed by the purification of DNA or RNA.
            • 01:30 - 02:00: Quality Check and Library Preparation The DNA or RNA is checked to confirm it is pure and undegraded. RNA is reverse-transcribed into DNA before it can be sequenced. A library is then prepared by cutting the DNA into short fragments of a specified size, using either high-frequency sound waves or enzymes, and adapters are attached to each end of every fragment.
            • 02:00 - 02:30: Adapters, Library Completion, and Sequencing Instruments The adapters contain the information needed for sequencing and include an index that identifies the sample. Non-bound adapters are removed to complete the library, and depending on the application a PCR step may be added to increase the library amount. A successful library has the correct fragment size and a high enough concentration for sequencing. The main NGS instruments are from Illumina and use a method called sequencing by synthesis.
            • 02:30 - 03:00: Flow Cell and Library Binding Sequencing takes place on the glass surface of a flow cell. Short pieces of DNA, called oligonucleotides, are bound to the flow cell surface and match the adapter sequences of the library. The library is first denatured into single DNA strands and then added to the flow cell, where each strand attaches to one of the two oligos.
            • 03:00 - 03:30: Clonal Amplification The strand that attaches to the oligo is the forward strand. The reverse strand is made from it and the forward strand is washed away, leaving the library bound to the flow cell. Because the fluorescent signal from a single fragment would be too low to detect, each unique library fragment is amplified into a cluster. This clonal amplification is a PCR that runs at a single temperature, with annealing, extension, and melting driven by changes in the flow cell solution.
            • 03:30 - 04:00: Bridge Amplification and Sequencing Primer Binding The strands bind to the second oligo on the flow cell to form a bridge and are copied, and the resulting double-stranded fragments are denatured. Repeating this copying and denaturing produces localized clusters. The reverse strands are then cut and washed away, leaving the forward strands ready for sequencing. A sequencing primer binds to the forward strands, and fluorescent nucleotides G, C, T, and A are added.
            • 04:00 - 04:30: Fluorescent Nucleotides and Read Cycles Fluorescent nucleotides are added to the flow cell along with DNA polymerase. Each nucleotide carries a differently colored fluorescent tag and a terminator, so only one nucleotide is incorporated per cycle. The complementary base binds to the strand, and a camera records the color of each cluster. A new solution then removes the terminators, and nucleotides and DNA polymerase flow in again for the next cycle.
            • 04:30 - 05:00: Index Reads and Paired-end Sequencing Read cycles continue for the number set on the sequencer, after which the read products are washed away. The first index is then sequenced and washed away. For single-read sequencing, the run ends here. For paired-end sequencing, the second index and the reverse strand of the library are also sequenced. The second index read has no primer; instead, a bridge is formed so that the second oligo acts as the primer.
            • 05:00 - 05:30: Dual Indices, Reverse Strand, and Filtering Bad Reads The two index reads use unique dual indices, which allow up to 384 samples to share one flow cell. The reverse strand is then made, the forward strands are cut and washed away, and the reverse strands are sequenced. After sequencing, bad reads are filtered out, including clusters that overlap, lead or lag in sequencing, or have low intensity.
            • 05:30 - 06:00: Demultiplexing and Mapping Reads to the Reference Genome On a patterned flow cell, clusters cannot overlap, but a nanowell can hold more than one library fragment; these polyclonal wells are also filtered out. Reads that pass the filter are demultiplexed, using the attached indexes to identify and sort the reads from each sample. Finally, the reads are mapped to the reference genome, where they align and overlap one another.
            • 06:00 - 06:30: Paired-end Sequencing and Read Depth Paired-end sequencing produces two reads from the same library fragment, and the alignment algorithm knows they belong together, which gives greater confidence when analyzing longer stretches of DNA or RNA. Read depth is the number of reads covering a nucleotide, and average read depth is the average across the sequenced region.
            • 06:30 - 07:00: Coverage and Applications of NGS A 30x average read depth is good for whole genome sequencing, while 1500x is suitable for detecting rare mutation events in cancer. Coverage is another essential metric: the aim is to have no missing areas across the target DNA. NGS is used for diagnosing cancer and rare disease, guiding cancer treatment, and research from ecology to botany to medical science.
            • 07:00 - 07:30: Types of Sequencing and Additional Applications Both DNA and RNA can be sequenced, whether as a whole genome or transcriptome, the coding regions only, or target genes. All types of RNA can be sequenced, including microRNAs and long non-coding RNA, as can cell-free DNA, single cells, and methylation or protein binding sites.
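The demultiplexing step summarized in the chapters above amounts to grouping reads by their index pair. A minimal sketch follows, assuming each read arrives as an `(index1, index2, sequence)` tuple and that a sample sheet maps index pairs to sample names; the index sequences, sample names, and the `demultiplex` helper are invented for illustration and do not reflect any vendor's file format.

```python
from collections import defaultdict


def demultiplex(reads, sample_sheet):
    """Group read sequences by sample, using the (index1, index2) pair as the key."""
    by_sample = defaultdict(list)
    for index1, index2, sequence in reads:
        # Index pairs missing from the sample sheet are set aside, not guessed.
        sample = sample_sheet.get((index1, index2), "undetermined")
        by_sample[sample].append(sequence)
    return dict(by_sample)


# Hypothetical unique dual indices: each sample has its own pair.
sample_sheet = {
    ("ACGT", "TTAG"): "sample_A",
    ("GGCA", "CATC"): "sample_B",
}

reads = [
    ("ACGT", "TTAG", "GATTACAGATT"),
    ("GGCA", "CATC", "CCGTAAGCTAG"),
    ("ACGT", "TTAG", "TTGACCATGCA"),
    ("ACGT", "CATC", "AAAACCCGGGT"),  # mismatched pair, not on the sample sheet
]

sorted_reads = demultiplex(reads, sample_sheet)
for sample, sequences in sorted_reads.items():
    print(sample, len(sequences))
```

Requiring both indexes to match is what makes unique dual indices useful: a read whose two indexes belong to different samples is set aside rather than assigned to either one.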

            Next Generation Sequencing - A Step-By-Step Guide to DNA Sequencing. Transcription

            • 00:00 - 00:30 ClevaLab. The Human Genome Project uncovered all 3.2 billion bases of the human genome. This project started in 1990 and took until 2003 to complete 85 percent of the first genome. But, in 2022, the gaps got filled and the sequence became complete. So in total, sequencing the human genome took 32 years. Now, with Next Generation Sequencing or NGS, it takes only
            • 00:30 - 01:00 a day to sequence a person's entire genome. One day is a dramatic speed increase compared to 32 years! The difference is due to the number of DNA strands sequenced at once. Billions of DNA strands get sequenced simultaneously using NGS. However, only Sanger sequencing was available for the Human Genome Project. With Sanger sequencing, only one strand can get sequenced at a time. However, NGS only works because the Human Genome Project created a human reference DNA sequence. The
            • 01:00 - 01:30 basic principle behind NGS is that DNA can be cut into small pieces and sequenced. The sequences of these small pieces then get assembled based on the reference genome. NGS can be used to sequence both DNA and RNA. First, samples get collected, and the DNA or RNA gets purified. Next, the DNA or RNA
            • 01:30 - 02:00 gets checked to ensure it's pure and undegraded. RNA first needs to be reverse-transcribed into DNA before it can get sequenced. A library then gets prepared from the DNA. A library is a collection of short DNA fragments from a long stretch of DNA. Libraries get made by cutting the DNA into short pieces of a specified size. This cutting gets done by using high-frequency sound waves or enzymes. Then sequences of DNA called adapters get added to each end of a DNA fragment.
            • 02:00 - 02:30 These adapters contain the information needed for sequencing. They also include an index to identify the sample. Finally, any non-bound adapters get removed, and the library is complete. Depending on the application, there can be a PCR step to increase the library amount. A successful library will be of the correct size. It will also be of a high enough concentration for sequencing. The main sequencing instruments used in NGS are from Illumina. These instruments use a method called
            • 02:30 - 03:00 sequencing by synthesis. The sequencing occurs on a glass surface of a flow cell. Short pieces of DNA, called oligonucleotides, are bound to the surface of the flow cell. These oligonucleotides match the adapter sequences of the library. First, the library gets denatured to form single DNA strands. Then this library gets added to the flow cell, where it attaches to one of the two oligos. The
            • 03:00 - 03:30 strand that attaches to the oligo is the forward strand. Next, the reverse strand gets made, and the forward strand gets washed away. The library is now bound to the flow cell. If sequencing started now, the fluorescent signal would be too low for detection. So each unique library fragment needs to get amplified to form clusters. This clonal amplification is by a PCR that happens at a single temperature. Annealing, extension and melting occur by changing the flow cell solution.
            • 03:30 - 04:00 First, the strands bind to the second oligo on the flow cell to form a bridge. The strands get copied. Then these double-stranded fragments get denatured. This copying and denaturing repeats over and over. Localized clusters get made, and finally, the reverse strands get cut. These strands get washed away, leaving the forward strand ready for sequencing. The sequencing primer binds to the forward strands. Next, fluorescent nucleotides G, C, T and A get
            • 04:00 - 04:30 added to the flow cell along with DNA polymerase. Each nucleotide has a different color fluorescent tag and a terminator. So only one nucleotide can get sequenced at a time. First, the complementary base binds to the sequence. Then the camera reads and records the color of each cluster. Next, a new solution flows in and removes the terminators. The nucleotides and DNA polymerase flow in again,
            • 04:30 - 05:00 and another nucleotide gets sequenced. These read cycles continue for the number of reads set on the sequencer. Once complete, these read sequences get washed away. Then the first index gets sequenced, and washed away. If only a single read is needed, the sequencing ends here. But, for paired-end sequencing, the second index is sequenced, as well as the reverse strand of the library. There is no primer for the second index read. Instead, a bridge gets created so that the second oligo acts
            • 05:00 - 05:30 as the primer. The second index is then sequenced. These two index reads use unique dual indices. These allow the use of up to 384 samples in the same flow cell. Next, the reverse strand gets made, and the forward strands are cut and washed away. The reverse strands are then sequenced. Once the sequencing is complete, any bad reads get filtered out. These include the clusters that overlap, lead or lag with sequencing or are of low intensity. The clusters cannot overlap on a patterned
            • 05:30 - 06:00 flow cell, but there can be more than one library fragment per nanowell. These polyclonal wells will also get filtered out. Next, the reads passing the filter get demultiplexed. Demultiplexing uses the attached indexes to identify and sort reads from each sample. Finally, the reads get mapped to the reference genome. The different reads align to the reference genome, overlapping each other.
            • 06:00 - 06:30 Paired-end sequencing creates two sequencing reads from the same library fragment. During sequence alignment, the algorithm knows that these reads belong together. Longer stretches of DNA or RNA can get analyzed with greater confidence that the alignment is correct. Read depth is an essential metric in sequencing. Read depth is the number of reads for a nucleotide. Average read depth is the average depth across the region sequenced. For whole genome sequencing, a 30x average read depth
            • 06:30 - 07:00 is good. A 1500x average read depth is suitable for detecting rare mutation events in cancer. Another essential metric is coverage. The aim is to have no missing areas across the target DNA. NGS gets used in a wide variety of applications. These include diagnosing cancer and rare disease, guiding treatment for cancers, and many research areas from ecology to botany to medical science. Both DNA and
            • 07:00 - 07:30 RNA can be sequenced. It could be the whole genome or transcriptome, just the coding regions of the DNA (called the exome), or target genes in the DNA or RNA. All types of RNA can be sequenced, including non-coding RNAs such as microRNAs and long non-coding RNA. In addition, cell-free DNA, single cells, as well as methylation or protein binding sites can also get sequenced.
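The read depth and coverage metrics described in the transcript can be worked through on a toy example. The sketch below assumes each aligned read is given as a `(start, length)` pair on the target region; the function names are illustrative, and the 150-base read length in the final estimate is an assumption that does not come from the transcript.

```python
def per_base_depth(region_length, alignments):
    """Count how many reads cover each position of the region."""
    depth = [0] * region_length
    for start, length in alignments:
        for position in range(start, min(start + length, region_length)):
            depth[position] += 1
    return depth


def average_depth(depth):
    """Average read depth across the whole region."""
    return sum(depth) / len(depth)


def coverage_fraction(depth, min_depth=1):
    """Fraction of positions covered by at least min_depth reads."""
    return sum(1 for d in depth if d >= min_depth) / len(depth)


def reads_needed(genome_size, target_depth, read_length):
    """Approximate number of reads required to reach an average depth."""
    return genome_size * target_depth / read_length


alignments = [(0, 5), (2, 5), (4, 4)]  # toy alignments on a 10-base region
depth = per_base_depth(10, alignments)
print(depth)
print(average_depth(depth), coverage_fraction(depth))

# 3.2 billion bases at 30x, assuming 150-base reads (the read length is an assumption).
print(reads_needed(3_200_000_000, 30, 150))
```

In the toy region the last two positions have depth zero, so the average depth looks reasonable while coverage is incomplete, which is why the two metrics are reported separately.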