QUADAS-2 Mirella Fraquelli Gastroenterology and Endoscopy Unit
QUADAS-2 Mirella Fraquelli Gastroenterology and Endoscopy Unit Fondazione IRCCS Ca’ Granda Ospedale Maggiore Policlinico, Milan DIAGNOSIS: the pathway of a diagnostic test from bench to bedside. Basic residential course 4-8 April 2017 Palazzo Feltrinelli - Gargnano, Lake Garda, Italy
Why assess quality?
- Bias in primary studies can lead to misleading estimates of accuracy
- The studies may not be applicable to the review question
- The results of primary studies may vary
Why assess quality?
Quality assessment is useful to guide the interpretation of results:
- In terms of risk of bias and applicability to the review question
- To identify potential sources of heterogeneity
Sources of bias and variation
- BIAS: flaws in the design of the study may result in inaccurate estimation of accuracy
- APPLICABILITY: variation across studies means that the results may not be applicable to your review question
QUADAS-2 instrument: Whiting et al. - Ann Intern Med 2011
Structure of QUADAS-2
Phase 1: State the review question
Phase 2: Draw a flow diagram for the primary study
Phase 3: Risk of bias and applicability judgments
- 4 key domains: patient selection, index test, reference standard, flow & timing
- Each domain rated on risk of bias and concerns regarding applicability
- Signaling questions to help reach rating(s) for each domain
Phase 1: State the review question
Example: Anti-CCP for the diagnosis of RA
- Patients: Patients with joint symptoms 12 months duration
- Index test(s): 1. Second generation Anti-CCP test analysed by ELISA; 2. Rheumatoid factor detected by latex agglutination
- Target condition: Rheumatoid Arthritis
- Reference Standard: American College of Rheumatology (ACR) criteria
Anti-CCP: anti-cyclic citrullinated peptide antibody
Phase 2: Draw a flow diagram
467 consecutive patients
Anti-CCP2: 467
- Anti-CCP2 +ve: 95 (RA: 82, Other: 6, Unclear: 7)
- Anti-CCP2 -ve: 372 (RA: 71, Other: 200, Unclear: 100)
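The flow diagram can be checked and turned into a 2 x 2 table with a few lines of Python. This is only a sketch: the counts are those shown on the slide, while treating "Other" as "no RA" and leaving "Unclear" patients out of the table are assumptions made here for illustration. The names `flow` and `unaccounted` are hypothetical.

```python
# Counts as they appear in the flow diagram of the Anti-CCP2 example.
flow = {
    "enrolled": 467,
    "index_positive": {"total": 95, "RA": 82, "Other": 6, "Unclear": 7},
    "index_negative": {"total": 372, "RA": 71, "Other": 200, "Unclear": 100},
}


def unaccounted(branch):
    """Patients in a branch with no reference-standard outcome listed."""
    outcomes = sum(n for label, n in branch.items() if label != "total")
    return branch["total"] - outcomes


pos, neg = flow["index_positive"], flow["index_negative"]

# Every enrolled patient should appear in exactly one index-test branch.
branches_match = pos["total"] + neg["total"] == flow["enrolled"]

# Assumption: "Other" means no RA; "Unclear" patients are left out of the table.
tp, fp = pos["RA"], pos["Other"]
fn, tn = neg["RA"], neg["Other"]
excluded = flow["enrolled"] - (tp + fp + fn + tn)

print("branches add up to enrolled:", branches_match)
print("unaccounted (index +ve, index -ve):", unaccounted(pos), unaccounted(neg))
print("2 x 2 table (TP, FP, FN, TN):", tp, fp, fn, tn)
print("patients not in the 2 x 2 table:", excluded)
```

A non-zero `unaccounted` value or a large `excluded` count is the kind of finding that feeds the flow and timing domain.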
Internal vs External validity
BIAS: study parameters (PICO)
APPLICABILITY: clinical question (PICO)
Study parameters        Clinical question
Population              Population
Index test              Index test
Reference standard      Reference standard
Outcomes                Outcomes
Schmidt & Factor - Arch Pathol Lab Med 2013
Diagnostic test accuracy studies
Internal validity                      External validity
Correct study design                   Clinically relevant context
Ideal experimental conditions          No center selection
Data homogeneity                       No patient selection
Reduced heterogeneity                  Co-morbidity
Data precision and repeatability       Data applicability and transferability
Key domains: risk of bias (High / Low / Unclear) and applicability concerns (High / Low / Unclear)
Patient selection
- Risk of bias: Could the selection of patients have introduced bias?
- Applicability: Are there concerns that the included patients do not match the review question?
Index test
- Risk of bias: Could the conduct or interpretation of the index test have introduced bias?
- Applicability: Are there concerns that the index test, its conduct, or its interpretation differ from the review question?
Reference standard
- Risk of bias: Could the RS, its conduct, or its interpretation have introduced bias?
- Applicability: Are there concerns that the target condition as defined by the RS does not match the review question?
Flow and timing
- Risk of bias: Could the patient flow have introduced bias?
DOMAIN 1: Patient selection
A. Risk of Bias
Describe the methods of patient selection:
- Was a consecutive or random sample of patients enrolled? Yes / No / Unclear
- Was a case-control design avoided? Yes / No / Unclear
- Did the study avoid inappropriate exclusions? Yes / No / Unclear
Could the selection of patients have introduced bias? RISK: LOW / HIGH / UNCLEAR
B. Concerns about applicability
Describe included patients: previous testing, presentation, intended use of index test, and setting
Are there concerns that the included patients do not match the review question? CONCERN: LOW / HIGH / UNCLEAR
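One way to record a domain assessment is sketched below. The class and method names (`DomainAssessment`, `suggested_risk`) are hypothetical, and the mapping from answers to a rating is a deliberate simplification: signalling questions only help reach a rating, so a "High" suggestion flags potential bias that the reviewer still has to judge.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class Answer(Enum):
    YES = "Yes"
    NO = "No"
    UNCLEAR = "Unclear"


class Rating(Enum):
    LOW = "Low"
    HIGH = "High"
    UNCLEAR = "Unclear"


@dataclass
class DomainAssessment:
    domain: str
    signalling: Dict[str, Answer]  # signalling question -> answer

    def suggested_risk(self) -> Rating:
        """Starting point for the risk-of-bias rating (not a final judgment)."""
        answers = list(self.signalling.values())
        if all(a is Answer.YES for a in answers):
            return Rating.LOW
        if any(a is Answer.NO for a in answers):
            # Potential for bias exists; the reviewer decides if it is high.
            return Rating.HIGH
        return Rating.UNCLEAR


# Illustrative answers for a made-up primary study.
patient_selection = DomainAssessment(
    domain="Patient selection",
    signalling={
        "Was a consecutive or random sample of patients enrolled?": Answer.YES,
        "Was a case-control design avoided?": Answer.YES,
        "Did the study avoid inappropriate exclusions?": Answer.UNCLEAR,
    },
)
print(patient_selection.domain, "->", patient_selection.suggested_risk().value)
```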
Patient selection bias
Inappropriate methods of patient selection may introduce bias:
- Study design: patient-control (case-control) design
- Retrospective enrollment
- Selected enrollment (i.e. non-consecutive or non-random)
Study design
Basic design of diagnostic accuracy studies: prospective, blinded cross-classification of test and reference standard in a clinically relevant setting
[Figure: a “relevant” spectrum of patients undergoes TE and liver biopsy; results are cross-classified as TP, FP, FN and TN]
Spectrum effects: evaluation of 2 very different populations
Diagnostic patient-control study
[Figure: distribution (%) of the test parameter in healthy volunteers (the healthiest) and in very sick individuals (the sickest), with the test threshold between them]
SPECTRUM BIAS
Spectrum effects: evaluation of representative populations
Diagnostic cross-sectional study
[Figure: distribution (%) of the test parameter in patients without disease and in patients with disease, with the test threshold]
SPECTRUM VARIATION
Patient selection: applicability
Measures of accuracy may vary across patient groups:
- Setting
- Advanced vs early disease
- Symptoms
- Demographic features
- Presence of alternative conditions
- Co-morbidities
[Figure: severity of the disease and co-morbidities]
DOMAIN 2: INDEX TEST
A. Risk of Bias
Describe the index test and how it was conducted and interpreted:
- Were the index test results interpreted without knowledge of the results of the reference standard? Yes / No / Unclear
- If a threshold was used, was it prespecified? Yes / No / Unclear
Could the conduct or interpretation of the index test have introduced bias? RISK: LOW / HIGH / UNCLEAR
B. Concerns about applicability
Are there concerns that the index test, its conduct, or its interpretation differ from the review question? CONCERN: LOW / HIGH / UNCLEAR
Index Test bias
Could the conduct or interpretation of the Index Test have introduced bias?
- The execution of the Index Test should be described in sufficient detail to permit replication of results (predefined cut-off etc.)
- IT results should be interpreted without knowledge of the RS results (information bias)
Index Test applicability Are there concerns that the Index Test, its conduct, or interpretation differ from the review question? If test conduct, technology, setting or interpretation differ from your review question, the results may not be applicable
Reference Standard bias
Could the reference standard, its conduct, or its interpretation have introduced bias?
- RS should be the most accurate test to diagnose the TC (misclassification bias)
- The execution of the RS should be described in sufficient detail to allow the replication of results (predefined cut-off etc.)
- RS should be interpreted blind to IT results (information bias)
- IT should not be a part of the RS (incorporation bias)
Misclassification bias
From the GOLD STANDARD to the REFERENCE STANDARD
The reference standard should accurately reflect the true state of the patient, but it is usually imperfect: its sensitivity and specificity are rarely 100%
[Figure: patients positive on the reference standard vs patients with the target disease]
Misclassification bias
When the reference standard does not correctly classify patients with the target condition
Example: liver biopsy as the reference standard for staging hepatic fibrosis can have 30% false-negative results
The effect on accuracy estimates depends on whether both tests make the same mistakes
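The effect of an imperfect reference standard can be illustrated numerically. The sketch below assumes that index-test and reference-standard errors are independent given the true disease status, i.e. the two tests do not make the same mistakes. Only the 30% false-negative rate comes from the slide; the prevalence and the index-test accuracy are hypothetical values, and `apparent_accuracy` is a name introduced here.

```python
def apparent_accuracy(prevalence, se_index, sp_index, se_ref, sp_ref):
    """Index-test accuracy as it would appear against an imperfect reference.

    Assumes the errors of the two tests are independent given the true
    disease status.
    """
    p, q = prevalence, 1 - prevalence
    both_pos = p * se_index * se_ref + q * (1 - sp_index) * (1 - sp_ref)
    ref_pos = p * se_ref + q * (1 - sp_ref)
    both_neg = p * (1 - se_index) * (1 - se_ref) + q * sp_index * sp_ref
    ref_neg = p * (1 - se_ref) + q * sp_ref
    return both_pos / ref_pos, both_neg / ref_neg


# se_ref=0.70 mirrors the 30% false-negative rate quoted for liver biopsy.
# Prevalence 0.5 and a true index-test Se = Sp = 0.90 are hypothetical.
apparent_se, apparent_sp = apparent_accuracy(
    prevalence=0.5, se_index=0.90, sp_index=0.90, se_ref=0.70, sp_ref=1.0
)
print(f"apparent sensitivity: {apparent_se:.3f}")
print(f"apparent specificity: {apparent_sp:.3f}")
```

Under these assumptions the apparent sensitivity stays at its true value, while the apparent specificity falls well below 0.90, because diseased patients missed by the reference standard are counted as index-test false positives. If the two tests tended to make the same mistakes, the distortion would go in a different direction.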
Disagreements between Reference Standard and Index Test
                    REFERENCE STANDARD
INDEX TEST          Pos      Neg
Pos                 A        B
Neg                 C        D
- A and D: agreement
- B: cases detected only by the Index Test
- C: cases detected only by the RS
Possible explanations for disagreement: Index Test more specific; Index Test more sensitive
Glasziou - Ann Intern Med 2008;149:816
Information bias
When the reference standard is interpreted knowing the index test results
Example (OLT): US vs CT in diagnosing hepatic tumours
Tendency to "over-read" a suspicious area if other test results are known
Usually leads to an increase of FP results
Incorporation bias
When the index test is incorporated in a (composite) reference standard
Example: CT scanning of the head to determine whether children have cerebrospinal shunt malfunction
Reference standard: clinical assessment by neurosurgeons who used CT results to decide
Usually leads to overestimation of diagnostic test accuracy
Reference Standard applicability
Outcome of the RS is decisive: if the RS does not detect the target condition defined in the review question, results may not be applicable
The choice of a valid/optimal RS is crucial
Cholecystitis; HCC and OLT
Flow and timing
A. Risk of Bias
Describe any patients who did not receive the index tests or reference standard or who were excluded from the 2 x 2 table (refer to flow diagram)
Describe the interval and any interventions between index tests and the reference standard
- Was there an appropriate interval between index test and reference standard? Yes / No / Unclear
- Did all patients receive a reference standard? Yes / No / Unclear
- Did all patients receive the same reference standard? Yes / No / Unclear
- Were all patients included in the analysis? Yes / No / Unclear
Could the patient flow have introduced bias? RISK: LOW / HIGH / UNCLEAR
Flow and timing bias
- Disease progression/regression or treatment (Tx) paradox bias
- Partial or differential verification bias
- Response bias
Disease progression/regression or treatment paradox bias
When the patients' condition changes between administering the index test and the reference standard
Examples:
- Acute disease
- Antiviral Tx of patients with chronic viral hepatitis while a non-invasive test for staging liver fibrosis is under evaluation
Under- or over-estimation of diagnostic test accuracy, depending on the changes in the patients' condition
Partial verification bias
When a nonrandom set of patients does not undergo the reference standard
Example: detection of focal liver lesions (ultrasound, then biopsy)
Usually leads to overestimation of sensitivity; effects on specificity vary
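A small worked example can make the direction of partial verification bias concrete. All counts below are hypothetical, not from any study. The sketch assumes that every index-test-positive patient is verified, while only a share of index-test-negative patients is verified, chosen independently of true disease status. `observed_accuracy` is a name introduced here.

```python
from fractions import Fraction


def observed_accuracy(tp, fp, fn, tn, verified_share_of_negatives):
    """Accuracy computed only from patients who received the reference standard.

    All index-test positives (tp, fp) are verified; only the given share of
    index-test negatives (fn, tn) is verified.
    """
    fn_verified = fn * verified_share_of_negatives
    tn_verified = tn * verified_share_of_negatives
    sensitivity = tp / (tp + fn_verified)
    specificity = tn_verified / (tn_verified + fp)
    return float(sensitivity), float(specificity)


# Hypothetical full cohort: true sensitivity 0.80, true specificity 0.85.
TP, FP, FN, TN = 80, 30, 20, 170

full = observed_accuracy(TP, FP, FN, TN, Fraction(1))        # everyone verified
partial = observed_accuracy(TP, FP, FN, TN, Fraction(1, 5))  # 1 in 5 negatives

print(f"full verification:    Se={full[0]:.3f}  Sp={full[1]:.3f}")
print(f"partial verification: Se={partial[0]:.3f}  Sp={partial[1]:.3f}")
```

With these numbers sensitivity is inflated because most false negatives never reach the 2 x 2 table, and specificity drops because true negatives are removed in the same proportion. Other verification patterns can move specificity differently.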
Differential verification bias
When a set of patients is verified with a second or third reference standard, especially when this selection depends on the index test result
Example: detection of focal liver lesions by US, verified by CT / MRI or by biopsy
Usually leads to overestimation of diagnostic test accuracy
Response bias
When uninterpretable or intermediate test results and withdrawals are not included in the analysis
Examples:
- Technical faults or inferior image quality
- Any non-random missing data for study patients (e.g. disease-free cases that were not thoroughly investigated)
Usually leads to overestimation of diagnostic test accuracy
Indeterminate results
- Patients excluded from the analysis
- Indeterminate/unclassified index test results
- Intention-to-diagnose approach
Indeterminate results Different scenarios
Indeterminate results: intention-to-diagnose
Indeterminate results considered either as false-positive or false-negative
Indeterminate results: intention-to-diagnose
                    REFERENCE STANDARD
INDEX TEST          Pos      Neg
Pos                 TP       FP
Indeterminate       X        Y
Neg                 FN       TN
Indeterminate results: 1st scenario
The indeterminate results are not analyzed
Full data:
         CT+    uncertain    CT-    tot
Rx+      10     10           11     31
Rx-       2      5           20     27
tot      12     15           31     58
Indeterminate results excluded:
         CT+    CT-    tot
Rx+      10     11     21
Rx-       2     20     22
tot      12     31     43
Sensitivity: 0.476   Specificity: 0.909
Indeterminate results: 2nd scenario
The indeterminate results are divided half to positive (CT+) and half to negative (CT-)
Full data:
         CT+    uncertain    CT-    tot
Rx+      10     10           11     31
Rx-       2      5           20     27
tot      12     15           31     58
Indeterminate results divided:
         CT+    CT-    tot
Rx+      15     16     31
Rx-       4     23     27
tot      19     39     58
Sensitivity: 0.483   Specificity: 0.852
Indeterminate results: 3rd scenario
The indeterminate results are classified as FN and FP
Full data:
         CT+    uncertain    CT-    tot
Rx+      10     10           11     31
Rx-       2      5           20     27
tot      12     15           31     58
Indeterminate results as FN and FP:
         CT+    CT-    tot
Rx+      10     21     31
Rx-       7     20     27
tot      17     41     58
Sensitivity: 0.323   Specificity: 0.741
Indeterminate results: 4th scenario
The indeterminate results are classified as either all positive or all negative
Full data:
         CT+    uncertain    CT-    tot
Rx+      10     10           11     31
Rx-       2      5           20     27
tot      12     15           31     58
Indeterminate results as positive (TP and FP):
         CT+    CT-    tot
Rx+      20     11     31
Rx-       7     20     27
tot      27     31     58
Sensitivity: 0.645   Specificity: 0.741
Indeterminate results as negative (TN and FN):
         CT+    CT-    tot
Rx+      10     21     31
Rx-       2     25     27
tot      12     46     58
Sensitivity: 0.323   Specificity: 0.926
Indeterminate results: summary
Indeterminate results eliminated:           Sensitivity: 0.476   Specificity: 0.909
Indeterminate as negative (TN and FN):      Sensitivity: 0.323   Specificity: 0.926
Indeterminate as FP and FN:                 Sensitivity: 0.323   Specificity: 0.741
High risk of bias approach
Indeterminate as positive (TP and FP):      Sensitivity: 0.645   Specificity: 0.741
Indeterminate divided between + and -:      Sensitivity: 0.483   Specificity: 0.852
More conservative approach
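The five ways of handling indeterminate results can be recomputed from the 3 x 2 table with a short Python sketch. The counts are those of the CT vs Rx example; the names `Row`, `accuracy` and `scenarios` are introduced here for illustration, and sending the smaller half of an odd count to the positive side is an assumption chosen to match the table in the 2nd scenario.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """Index-test (CT) results within one reference-standard (Rx) group."""
    positive: int
    indeterminate: int
    negative: int


# Counts from the example: Rx+ and Rx- rows; CT+, uncertain, CT- columns.
RX_POS = Row(positive=10, indeterminate=10, negative=11)
RX_NEG = Row(positive=2, indeterminate=5, negative=20)


def accuracy(tp, fn, fp, tn):
    """Return (sensitivity, specificity) for a 2 x 2 table."""
    return tp / (tp + fn), tn / (tn + fp)


def scenarios(dis, non):
    """Sensitivity and specificity under each handling of indeterminates."""
    half_d = dis.indeterminate // 2  # share of Rx+ indeterminates sent to CT+
    half_n = non.indeterminate // 2  # share of Rx- indeterminates sent to CT+
    return {
        "excluded": accuracy(dis.positive, dis.negative,
                             non.positive, non.negative),
        "divided": accuracy(dis.positive + half_d,
                            dis.negative + dis.indeterminate - half_d,
                            non.positive + half_n,
                            non.negative + non.indeterminate - half_n),
        "as FN and FP": accuracy(dis.positive,
                                 dis.negative + dis.indeterminate,
                                 non.positive + non.indeterminate,
                                 non.negative),
        "as positive": accuracy(dis.positive + dis.indeterminate, dis.negative,
                                non.positive + non.indeterminate, non.negative),
        "as negative": accuracy(dis.positive, dis.negative + dis.indeterminate,
                                non.positive, non.negative + non.indeterminate),
    }


results = scenarios(RX_POS, RX_NEG)
for name, (se, sp) in results.items():
    print(f"{name:<14} sensitivity={se:.3f}  specificity={sp:.3f}")
```

The printed values agree with the slide values to within rounding in the third decimal place. Classifying indeterminates as FN and FP gives the lowest sensitivity and specificity at the same time, while excluding them gives the highest specificity estimate bar one and drops 15 of the 58 patients from the analysis.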