Molecular diagnostic strategies for the COVID-19 pandemic
Version 2.0
Fahim Farzadfard, Louis Kang, Samantha L. Bates, Karthik Dinakar, Jack Kreindler, Samuel Klein, A. James Phillips, Jessica Sousa, Amelia Wattenberger, James W. Weis, and Joichi Ito
With the ongoing COVID-19 pandemic, and in response to high demand, a multitude of diagnostic tests have been developed to detect the SARS-CoV-2 virus and monitor its spread. This living document aims to survey the technical aspects of these diagnostic tests, outline the features and current limitations of their underlying technologies, formulate a modular framework for assessing existing (and upcoming) test workflows, and help depict a clearer and more comprehensive picture of this rapidly evolving space. With this manuscript as a starting point, we aim to help orchestrate a community effort to identify potential pitfalls and bottlenecks in existing testing workflows, with the ultimate goal of paving the way to more sensitive, robust, scalable, and widespread tests for maximal social impact.
This document is a work in progress. The intention is to make it available publicly to solicit feedback and to be a living document. Please check back for the latest version at https://interventions.centerofci.org/pub/covid-testing-assessment/
The document is Creative Commons Attribution licensed - share freely with attribution.
SARS-CoV-2 (or Severe Acute Respiratory Syndrome Coronavirus 2) causes COVID-19 (also known as Coronavirus Disease 2019).
SARS-CoV-2 first appeared in December 2019 in Wuhan, China and rapidly spread to over 160 countries, with over 2.4 million individuals infected and over 160,000 reported deaths as of April 20, 2020.[1][2][3]
The current evidence indicates that the majority of documented cases are the result of infection from undiagnosed individuals.[4] This is likely due, at least in part, to the high levels of viral shedding apparent in individuals without, or with minimal, symptoms.[5]
There is currently no effective vaccine or widely accepted treatment for SARS-CoV-2. This, combined with the uncertainties around clinical diagnosis, the high viral shedding levels of asymptomatic and minimally symptomatic patients, and the high rate of infection from unknown sources, makes the development of a widely available and rapid diagnostic test with good performance characteristics (e.g. sensitivity and specificity) a critical concern. However, such tests have not yet been deployed in many countries, including the United States.
The efficacy of the rapid deployment of widespread testing to the most heavily infected areas, in combination with digital surveillance, strict quarantine, and large stockpiles of personal protective equipment, has been demonstrated through the remarkable first wave containment of SARS-CoV-2 in Singapore (which is experiencing a second wave of cases), Taiwan, South Korea, and Hong Kong.
SARS-CoV-2 belongs to the broad family of viruses known as coronaviruses. Coronaviruses are positive-sense single-stranded RNA (+ssRNA) viruses.[6]
The SARS-CoV-2 RNA genome is 29,891 nucleotides long and encodes 9,860 amino acids.
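As a rough consistency check on the two figures quoted above, the sketch below estimates what share of the genome is protein-coding. It assumes three nucleotides per amino acid and ignores stop codons, overlapping reading frames, and untranslated regions, so the result is only an approximation.

```python
# Figures quoted in the text above.
GENOME_LENGTH_NT = 29_891
ENCODED_AMINO_ACIDS = 9_860

# Simplifying assumption: each amino acid corresponds to one 3-nucleotide codon.
NT_PER_CODON = 3

coding_nt = ENCODED_AMINO_ACIDS * NT_PER_CODON
remaining_nt = GENOME_LENGTH_NT - coding_nt
coding_fraction = coding_nt / GENOME_LENGTH_NT

print(f"nucleotides accounted for by codons: {coding_nt}")
print(f"remaining nucleotides: {remaining_nt}")
print(f"approximate coding fraction: {coding_fraction:.1%}")
```

Under these assumptions, nearly all of the genome is accounted for by coding sequence.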
As of 20 April 2020, over 4500 SARS-CoV-2 genomes sampled on six continents were publicly available.[7]
SARS-CoV-2 shares 96% nucleotide identity with BatCoV RaTG13,[8] 89% with bat SARS-like-CoVZXC21 and CoVZC45, and 82% with human SARS-CoV.
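Nucleotide identity figures like those above are typically computed from aligned sequences. The following is a minimal sketch of one common convention (matching positions divided by alignment length); it assumes the two sequences are already aligned, equal in length, and free of gaps. The function name and the toy sequences are illustrative only and are not taken from any real genome.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


# Toy 20-nucleotide sequences that differ only at the last position.
toy_a = "AUGGCUAGCUAGGCUAACGU"
toy_b = "AUGGCUAGCUAGGCUAACGA"
identity = percent_identity(toy_a, toy_b)
print(f"identity: {identity:.1f}%")
```

Real comparisons also have to decide how to treat alignment gaps, which this sketch does not address.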
SARS-CoV-2 consists of a lipid bilayer in which the membrane, envelope, and spike structural proteins are anchored and which encapsulates the viral RNA genome.
Like other coronaviruses, SARS-CoV-2 has four structural proteins, known as the S (spike), E (envelope), M (membrane), and N (nucleocapsid) proteins.
The N protein holds the RNA genome.
The S, E, and M proteins together with the lipid bilayer create the viral envelope.
The spike protein is the protein that enables the virus to attach to the host cell.
Currently, in the United States, most CDC-certified SARS-CoV-2 real-time reverse transcription PCR (RT-qPCR) diagnostic tests have a laboratory turnaround time of approximately 4-6 hours, and results can be delayed by more than 24 hours after sample collection due to shipping requirements. These tests are only slightly more than 50% sensitive if performed once (71% if performed twice).
Innovations in this area are being pursued, some as impactful as decreasing the readout time to 5 minutes. However, these tests are only available in CDC-designated public health laboratories certified to perform high-complexity testing.
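To put the single-test and repeat-test sensitivities quoted above in context, the sketch below computes the combined sensitivity of repeated testing under the assumption that false negatives are independent between tests. That independence is our assumption, not a claim made by the cited figures, and the function name is illustrative.

```python
def combined_sensitivity(single_test_sensitivity: float, n_tests: int) -> float:
    """Probability that at least one of n independent tests detects a true positive."""
    if not 0.0 <= single_test_sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    miss_probability = (1.0 - single_test_sensitivity) ** n_tests
    return 1.0 - miss_probability


# Assumed single-test sensitivity of 50%, close to the figure quoted above.
two_tests = combined_sensitivity(0.50, 2)
print(f"two independent tests: {two_tests:.0%}")
```

With a 50% single-test sensitivity, two fully independent tests would detect 75% of cases. The reported 71% is somewhat lower, which would be consistent with false negatives being partly correlated between repeat tests on the same patient, although the figures quoted here do not establish the cause.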
The RT-qPCR tests rely on lengthy and resource-intensive RNA purification and RT-qPCR workflows. The shortage of proprietary reagents needed for these workflows has been a bottleneck in the diagnosis capacity of laboratories in the US and around the globe in the recent outbreak.
Demand could rise to 50% of the US population, while the projected testing capacity based on current infrastructure and methods, over the next few months, is expected to be ~100K tests per day.
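The gap between projected demand and capacity can be made concrete with simple arithmetic. The sketch below assumes a US population of roughly 330 million (our approximation, not a figure from this document) and one test per person; the demand fraction and daily capacity are the values quoted above.

```python
US_POPULATION = 330_000_000   # assumption: approximate 2020 US population
DEMAND_FRACTION = 0.50        # from the text: demand could reach 50% of the population
TESTS_PER_DAY = 100_000       # from the text: projected capacity of ~100K tests per day

people_to_test = US_POPULATION * DEMAND_FRACTION
days_needed = people_to_test / TESTS_PER_DAY
years_needed = days_needed / 365

print(f"people to test: {people_to_test:,.0f}")
print(f"days at current capacity: {days_needed:,.0f}")
print(f"years at current capacity: {years_needed:.1f}")
```

Even a single round of testing at this capacity would take several years, before accounting for repeat testing.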
Molecular diagnostics for SARS-CoV-2 can be categorized into direct and indirect approaches:
Direct approaches assess the presence of viral components (RNA or proteins/antigens) in at-risk individuals and are suitable for identifying individuals who have contracted the virus.
Indirect approaches detect the presence of molecules/agents (e.g. antibodies) produced by the human body in response to this specific virus. These approaches are suitable for identifying patients who have previously contracted the virus but where the current infection is not necessarily still present. Future antibody serology assays may also demonstrate a degree of immunity to further infection.
In this technical assessment, we will mainly focus on diagnostic approaches that rely on nucleic acid (RNA) detection since they are most relevant to direct testing. Protein detection could be a complementary approach to nucleic acid detection; however, it is less robust and sensitive and more prone to errors due to the complexities of working with proteins. Given the high suspected rate of minimally symptomatic infections, indirect approaches may also be useful in providing a mechanism to quantify the prevalence of immunity in the general population.[9]
In general, the molecular diagnostics workflows can be described as the following steps: 1) sampling, 2) upstream sample processing, 3) detection and signal amplification, and 4) readout. In this section we discuss the advantages and limitations of the existing strategies for each of these steps:
Sampling is a manual step common to all diagnostic methods. Given that COVID-19 is an upper respiratory tract infection, nasal swabs have been the main form of sampling for direct (RNA or protein) detection, while blood samples have been used for indirect (antibody) tests. Tests based on alternative, less invasive sampling methods are also being developed, one of which (based on saliva samples) recently received Emergency Use Authorization.[10]
Upstream sample processing, such as RNA purification, helps to achieve robust and sensitive detection and to minimize false negatives. How much upstream processing is needed depends on the detection and signal amplification strategy and the readout used in a given workflow. For example, a typical workflow based on RT-qPCR requires RNA purification to achieve sensitive detection, as impurities in the samples could inhibit DNA polymerases. On the other hand, isothermal amplification, an alternative to PCR, involves methods that are less sensitive to sample impurities, so minimal sample treatment (one-step RNA extraction) would be sufficient to detect and amplify the signal in most clinical samples. Strategies that require no or minimal sample treatment are therefore more scalable.
RNA purification vs. RNA extraction
RNA purification is a lengthy process (taking more than 30 minutes) that involves multiple wash and purification steps. Column-based RNA purification and phenol-chloroform extraction are the two most commonly used protocols for RNA purification. However, these strategies do not easily scale to millions of samples, as they are not compatible with high-throughput workflows and require manual processing. State-of-the-art RNA purification machines (e.g., from Qiagen) are relatively expensive and can only process a few dozen samples at a time. Furthermore, many of the commercially available reagents are proprietary, which makes large-scale production of these reagents by third parties even more challenging. The shortage of RNA purification reagents is a bottleneck that limits the capacity of current detection kits that rely on RT-qPCR. While this shortage could be partially addressed by ramping up production of these reagents, RNA purification remains a slow and laborious process. The FDA has issued Emergency Use Authorizations for many COVID-19 tests,[11] but many of these EUA documents do not explicitly mention the RNA purification step.
RNA extraction is the release of the RNA genome from viral particles and can be achieved by simply suspending the swab in a lysis buffer or heating the samples for a few minutes. While the process does not separate the RNA from other agents, it can release enough RNA to be detected by a robust detection and signal amplification method (e.g., isothermal amplification) that is less sensitive to (and thus not inhibited by) impurities in the sample. RNA extraction is a simple, cheap, and time-saving alternative to RNA purification that can be easily scaled to tens of thousands of samples. When using RNA extraction, the resulting drop in detection sensitivity can be compensated for by using more sensitive signal detection and amplification methods and readouts.[12][13][14]
RNA extraction and RNA purification have been used interchangeably in various contexts. As a rule of thumb, one can consider workflows that only involve the release of RNA moieties into the medium (e.g., via heat treatment or lysis buffer) as RNA extraction and those that involve additional RNA binding and wash steps as RNA purification.
The nucleic acid detection and amplification methods that can be applied to the SARS-CoV-2 RNA genome can be categorized into three main classes: 1) RT-qPCR, 2) RNA isothermal amplification, and 3) CRISPR-based methods. The first two methods involve reverse-transcribing segment(s) of the viral RNA in samples into cDNA (using a reverse-transcriptase enzyme) and amplifying the cDNA millions of times using PCR or isothermal amplification, after which it can be detected by a readout strategy. The CRISPR-based methods can be used to detect the viral RNA directly (using the Cas13 enzyme) or after converting the RNA into cDNA through reverse transcription (using the Cas12 enzyme). These methods have different degrees of reliance on specialized equipment, trained personnel, and upstream sample processing. The pros and cons of each of these strategies are described below:
RT-qPCR is the gold standard for molecular diagnostics of RNA viruses. It is a highly sensitive and well-established PCR-based detection method that has been extensively used for the detection of various pathogens, including SARS-CoV-2. Almost all the SARS-CoV-2 detection kits that are currently commercially available use RT-qPCR. However, for sensitive and robust detection, RT-qPCR requires an RNA purification step, which makes the workflow laborious and slow and means it can only be performed by trained personnel. Without purified RNA, the assay sensitivity drops significantly because DNA polymerases are inhibited by impurities in the samples, causing false negatives; for example, human saliva contains RNA-degrading enzymes, which lead to increased noise. The RT-qPCR workflow takes at least a few hours end-to-end. The time it takes to receive test results may be longer still if patient samples must be shipped to locations that have the equipment and trained personnel to perform the test. Furthermore, the process requires relatively expensive qPCR equipment (the cheapest we could find on the market costs ~$6K), which is a significant expense for low-volume testing, in addition to the reagents needed for each test. While the process can be performed in a cheaper thermocycler, the less expensive equipment may decrease sensitivity and lead to more false negatives. Furthermore, there is evidence that the detection sensitivity for SARS-CoV-2 may be particularly poor despite the attomolar limit of detection, with some studies showing that only 47-59% of positive cases are identified via RT-qPCR. This reduction is suspected to be a consequence of loss, degradation, and/or mutation of the viral RNA, and addressing this concern has been a focus of recent research.[15][16]
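The "millions of times" amplification mentioned earlier can be related to the number of PCR cycles. The sketch below uses the standard idealized model in which each cycle multiplies the template count by (1 + efficiency), where an efficiency of 1.0 means perfect doubling. The function name and the 90% efficiency value are illustrative assumptions, not parameters of any specific assay.

```python
def cycles_for_fold_amplification(fold: float, efficiency: float = 1.0) -> int:
    """Smallest number of cycles n such that (1 + efficiency) ** n >= fold."""
    if fold < 1:
        raise ValueError("fold must be at least 1")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    copies = 1.0
    cycles = 0
    # Multiply cycle by cycle rather than using logarithms, to avoid rounding
    # surprises when the target is an exact power of the per-cycle factor.
    while copies < fold:
        copies *= 1.0 + efficiency
        cycles += 1
    return cycles


ideal_cycles = cycles_for_fold_amplification(1_000_000)
realistic_cycles = cycles_for_fold_amplification(1_000_000, efficiency=0.9)
print(f"perfect doubling: {ideal_cycles} cycles")
print(f"90% efficiency:   {realistic_cycles} cycles")
```

In this idealized model a million-fold amplification needs about 20 cycles, so thermal cycling time, together with the upstream purification, contributes to the multi-hour workflow described above.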
Isothermal amplification methods such as Loop-Mediated Isothermal Amplification (LAMP) and Recombinase Polymerase Amplification (RPA) have been successfully applied to detect various pathogens (RNA and DNA viruses and bacteria) in patient (blood, serum, saliva, etc.) and environmental (soil, water, etc.) samples, with minimal or no sample treatment. The amplification is performed at a constant temperature (even room temperature), thus heavily reducing the cost and the reliance on specialized equipment. DNA amplification is very fast, and signals can be detected in as little as 10 minutes.
CRISPR-based methods are the newest player in molecular diagnostics, and their robustness and performance still need to be tested with clinical samples. As opposed to the above-mentioned methods, which rely on RNA/DNA amplification, CRISPR-based methods rely on the capacity of CRISPR effector proteins (Cas12 and Cas13) to discriminate the desired sequence with single base-pair resolution. The collateral nuclease activity of these proteins in the presence of their DNA or RNA target is used to degrade a nucleotide probe and create a colorimetric or fluorogenic signal. Like RT-qPCR, CRISPR-based methods require an RNA purification step for robust and sensitive detection and signal amplification. A pre-amplification step using an isothermal amplification method could help to increase the sensitivity of these methods; however, the success of CRISPR-based methods would then depend on the success of the initial amplification step.
A signal amplified using one of the above-mentioned methods needs to be converted into a format that can be read out visually or by a machine.
Colorimetric and fluorometric readouts: In the case of DNA amplification methods (PCR and isothermal amplification), this can be achieved by adding chromogenic or fluorogenic DNA intercalating dyes that become visible or fluorescent upon binding to the double-stranded DNA produced during amplification, or by using fluorescent oligo probes that specifically interact with amplified DNA or are incorporated into it during amplification. Similar readouts can be achieved in CRISPR-based methods, using probes that become fluorescent or visible upon degradation by the collateral nuclease activity of activated Cas12 or Cas13. Fluorogenic substrates enable more sensitive and quantifiable readouts; however, they require optical equipment.
Lateral flow assays: Detection assays can be designed so that the nature of amplified DNA or oligo probes can be assessed by lateral flow (paper-based sample separation). This method provides a more sensitive and quantitative readout than bulk colorimetric assays; however, the sensitivity (signal-to-noise ratio) for this assay is less than for fluorometric assays. This lower sensitivity necessitates an RNA purification workflow to achieve robust detection and reduce false-negatives.
DNA sequence readout: In cases of RT-PCR and isothermal amplification methods, the amplified DNA can be directly sequenced. Adding DNA barcodes into primers enables us to uniquely mark each sample. Barcoded samples can be pooled together and analyzed on the same run by high-throughput sequencing methods, significantly reducing the cost per sample and improving the throughput of detection. However, this mode of detection requires access to Next Generation Sequencing (NGS) machines and is only applicable to centralized labs.
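The pooling idea described above can be sketched as a simple demultiplexing step: each read from the pooled run is assigned back to its sample by the barcode at its start. The barcode sequences, barcode length, sample names, and reads below are all hypothetical, and the sketch requires exact barcode matches, whereas real pipelines usually tolerate sequencing errors.

```python
from collections import Counter

# Hypothetical 4-nucleotide barcodes added via the primers, one per sample.
BARCODE_LENGTH = 4
BARCODE_TO_SAMPLE = {
    "ACGT": "sample_A",
    "TGCA": "sample_B",
    "GATC": "sample_C",
}


def demultiplex(reads):
    """Count reads per sample using the barcode at the start of each read."""
    counts = Counter()
    for read in reads:
        sample = BARCODE_TO_SAMPLE.get(read[:BARCODE_LENGTH])
        if sample is not None:  # reads with unknown barcodes are skipped
            counts[sample] += 1
    return counts


# Toy pooled run: two reads from sample_A, one from sample_B, one unassignable.
pooled_reads = ["ACGTTTGACC", "ACGTTTGACC", "TGCATTGACC", "NNNNTTGACC"]
read_counts = demultiplex(pooled_reads)
for sample in BARCODE_TO_SAMPLE.values():
    print(sample, read_counts[sample])
```

How many reads should count as a positive call for a sample is an assay-specific decision that this sketch leaves open.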
Alternative assays with shorter workflows: Even with the ramp-up of reagent production lines and the increase in the number of certified centers, certified RT-qPCR workflows which involve lengthy and multi-step RNA purification and DNA amplification processes cannot be scaled to meet increasing demand. Thus, alternative and more easily scalable detection strategies are needed. RNA purification can be replaced by quick RNA extraction protocols which can be combined with sensitive, cheap, and robust detection methods, such as isothermal amplification, that are compatible with high-throughput assays. In this case, to compensate for the drop in the sensitivity of assays due to the use of RNA extraction (as opposed to RNA purification) in the workflow and minimize false-negatives, more sensitive readouts such as fluorometric assays could be used.
Increasing the testing capacity of the centralized laboratories: It is clear that the testing demand is significantly higher than the current capacity of testing centers, and additional specialized testing centers are urgently needed. While valuable assets, services provided by centralized labs could be delayed due to shipment requirements. Furthermore, continuous monitoring of at-risk populations especially at remote locations may not be feasible as that would require routine (e.g., weekly) sampling and shipments for each at-risk individual. Therefore, employing more efficient testing methods at testing centers and developing more point-of-care diagnostic tools could be the best way to increase our testing capacity.
Point of care (POC) diagnostics: The rapid outbreak of the COVID-19 pandemic across the globe highlights the need for point-of-care (POC) diagnostic tools that enable fast, robust, sensitive, and wide-spread detection and monitoring of emerging pathogens to achieve an effective public health response. Complementary to centralized labs, low-cost, easy-to-operate POC devices can serve as effective tools to monitor and combat the spread of the disease. Furthermore, these devices can be designed to immediately transmit test results and substantial relevant metadata (location, time, environmental conditions) to secure databases for further analysis, providing healthcare officials and decision-makers with valuable information, such as the extent and dynamics of viral spread in exposed populations, and of pandemic response.
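To illustrate the kind of result and metadata a POC device might report, here is a minimal sketch of a record serialized to JSON. Every field name and value is hypothetical; the sketch covers only the data format and does not address transport, authentication, or privacy safeguards, which a real deployment would need.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class PocResult:
    """Hypothetical record for one point-of-care test result."""
    device_id: str
    result: str                    # e.g. "positive", "negative", or "invalid"
    collected_at: str              # ISO 8601 timestamp in UTC
    latitude: float
    longitude: float
    ambient_temperature_c: float   # example of an environmental condition


def to_payload(record: PocResult) -> str:
    """Serialize a record to a JSON string with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


record = PocResult(
    device_id="poc-0001",
    result="negative",
    collected_at=datetime(2020, 4, 20, 12, 0, tzinfo=timezone.utc).isoformat(),
    latitude=42.36,
    longitude=-71.09,
    ambient_temperature_c=21.5,
)
payload = to_payload(record)
print(payload)
```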
The sensitivity and robustness of the assay are a function of the methods used for each of the four steps described above. Ideally, we want to:
Maximize assay sensitivity and robustness to reduce possible false negatives
At the same time, we want to optimize the following to the extent that doing so does not compromise the assay and lead to unwanted false negatives (negative results for a patient who is in fact positive):
Reduce workflow cycle: increase throughput and save time
Minimize cost
Increase wide-spread accessibility (no reliance on supply chain and shipment infrastructure, easy-to-operate workflow without training requirement)
Considering these criteria and the details discussed for methods available for each of the four steps, a testing technology addressing specific needs can be evaluated and designed. For example, an ideal workflow for a distributable, point-of-care testing method would be something like this:
Sampling
RNA extraction (as opposed to the lengthy RNA purification used in the existing workflows): cheap and quick workflow
Isothermal amplification (LAMP): robust and sensitive, isothermal (no requirement for specialized equipment), cheap and non-proprietary reagents and technique, less sensitive to inhibitory agents in unpurified samples.
Fluorometric detection: sensitive, real-time monitoring. High signal to noise ratio, which could compensate for a drop in sensitivity as a result of using RNA extraction instead of purification in the workflow.
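The modular, four-step view of a testing workflow lends itself to a simple data structure. The sketch below represents two workflows and reports the steps at which they differ. The class and function names are our own, and the step labels, including the choice of a nasal swab for both workflows, are simplifications of the descriptions in this document rather than specifications of any particular test.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Workflow:
    """One method choice for each of the four workflow steps."""
    sampling: str
    sample_processing: str
    amplification: str
    readout: str


def differing_steps(a: Workflow, b: Workflow) -> list:
    """Names of the steps at which two workflows use different methods."""
    return [f.name for f in fields(Workflow)
            if getattr(a, f.name) != getattr(b, f.name)]


# Illustrative labels only.
rt_qpcr_workflow = Workflow(
    sampling="nasal swab",
    sample_processing="RNA purification",
    amplification="RT-qPCR",
    readout="fluorometric",
)
poc_workflow = Workflow(
    sampling="nasal swab",
    sample_processing="RNA extraction",
    amplification="isothermal amplification (LAMP)",
    readout="fluorometric",
)

changed = differing_steps(rt_qpcr_workflow, poc_workflow)
print(changed)
```

Framing workflows this way makes explicit that the proposed point-of-care workflow changes only the sample processing and amplification steps relative to this illustrative RT-qPCR workflow.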
Cheng, M. P., Papenburg, J., Desjardins, M., Kanjilal, S., Quach, C., Libman, M., Dittrich, S., & Yansouni, C. P. (2020). Diagnostic Testing for Severe Acute Respiratory Syndrome–Related Coronavirus-2: A Narrative Review. Annals of Internal Medicine. https://doi.org/10.7326/M20-1301
Coronavirus Test Tracker: Commercially Available COVID-19 Diagnostic Tests. (n.d.). 360Dx. Retrieved April 15, 2020, from https://www.360dx.com/coronavirus-test-tracker-launched-covid-19-tests
Sheridan, C. (2020). Fast, portable tests come online to curb coronavirus pandemic. Nature Biotechnology. https://doi.org/10.1038/d41587-020-00010-2
COVID testing UPDATE. (n.d.). Open Cell. Retrieved April 15, 2020, from https://www.opencell.bio/news/covid-testing-update
Hodgson, J. (2020). The pandemic pipeline. Nature Biotechnology. https://doi.org/10.1038/d41587-020-00005-z
Center for Devices and Radiological Health. (2020). Emergency Use Authorizations. FDA. https://www.fda.gov/medical-devices/emergency-situations-medical-devices/emergency-use-authorizations
Testing | Coronavirus Tech Handbook. (n.d.). Retrieved April 15, 2020, from https://coronavirustechhandbook.com/testing
Science papers you should be reading about the coronavirus. (2020, March 31). Fred Hutch. https://www.fredhutch.org/en/news/center-news/2020/03/coronavirus-latest-scientific-research.html
CORD-19 dataset | Semantic Scholar. (n.d.). Retrieved April 19, 2020, from https://pages.semanticscholar.org/coronavirus-research
COVID-19 Primer. (n.d.). Retrieved April 15, 2020, from https://covid19primer.com/dashboard