Appraise the evidence

Related Pages

  1. Evidence Based Practice (EBP)
  2. Step 1: Formulate an answerable question
  3. Step 2: Find the best available evidence
  4. Step 3: Appraise the evidence
  5. Step 4: Implement the evidence
  6. Step 5: Evaluate the outcome

The Appraisal

After we have searched for the evidence, we need to decide whether it is both valid and important before deciding whether we can apply it to our individual patients. Whether we consider validity or importance first is a matter of individual preference.

The third step of evidence-based practice, critical appraisal, is the systematic evaluation of clinical research papers in order to establish:

  1. Are the results of the study valid?
  2. What were the results?
  3. Will the results help me in caring for my patients?

If the answer to any of these questions is “no”, you can save yourself the trouble of reading the rest of the paper.

Cleland, Noteboom, Whitman and Allison (2008)[1][2] addressed these questions in a two-part series on selected aspects of evidence-based practice relating to questions of treatment, which is summarised below:

Are the results of the study valid?

The first step is to decide whether the study was unbiased by evaluating its methodological quality. Different validity criteria are used for different types of question: treatment, diagnosis, prognosis and economic evaluation. Depending on the validity of an article, we can classify it within a scale of levels of evidence and degrees of recommendation.

  • Hierarchy of evidence
  • Internal and external validity
  • Randomisation and baseline homogeneity of groups - was the assignment of patients to treatments randomised? Were the groups similar at the start of the trial?
  • Concealment of allocation to groups - was group assignment concealed from those enrolling patients in the study?
  • Blinding - were patients, health workers, and study personnel "blind" to treatment?
  • Completeness of follow-up (intention-to-treat principle) - was follow-up complete? Were all patients who entered the trial properly accounted for at its conclusion, and analysed in the groups to which they were randomised?
  • Equivalent experience of groups apart from treatment of interest - aside from the experimental intervention, were the groups treated equally?

Hierarchy of evidence

When evaluating evidence for effectiveness of an intervention, clinicians often find it helpful to use a system to determine the level of evidence and/or grade of recommendation for a particular study.

Internal and external validity

In addition to identifying the level of evidence on the hierarchy, therapists must also critically appraise the study's overall quality, including its internal and external validity. Internal validity relates to elements of research design intended to exert control over extraneous variables that could potentially impact the outcomes of the study, including patient assignment, competing interventions, history, maturation, and instrumentation. External validity refers to the generalizability of the study's results to actual clinical practice[1].

Randomisation and baseline homogeneity of groups

Randomization should theoretically ensure that each group of subjects is similar at baseline, so that no extraneous variables (such as known and unknown prognostic factors) compete with the intervention to explain observed outcomes. Extraneous variables that could potentially affect outcomes in studies of treatment effectiveness include patient age, race, gender, symptom duration, condition severity, comorbidities, intellectual status, motivation, and treatment expectations. Although randomization should ideally produce groups that are homogeneous at baseline, there is always a chance, particularly with small samples, that groups may be dissimilar in important known and unknown prognostic factors. For this reason, a reader performing a critical appraisal must independently judge the extent to which groups are similar in key prognostic factors.
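As a concrete illustration of judging baseline similarity, the sketch below computes standardized mean differences (SMDs) for two baseline characteristics; the data, variable names, and the rough 0.2 flag threshold are illustrative assumptions, not from the cited primer.

```python
import numpy as np

def standardized_mean_difference(a, b):
    """Difference in group means divided by the pooled standard deviation."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (a.mean() - b.mean()) / pooled_sd

# Hypothetical baseline data for two randomised groups
treatment = {"age": [54, 61, 48, 59, 63], "symptom_duration_wks": [12, 8, 20, 15, 10]}
control = {"age": [52, 58, 66, 49, 60], "symptom_duration_wks": [14, 9, 18, 22, 11]}

# An absolute SMD above roughly 0.2 is often taken as a flag for imbalance
for var in treatment:
    smd = standardized_mean_difference(treatment[var], control[var])
    print(f"{var}: SMD = {smd:+.2f}")
```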

Concealment of allocation to groups

Even when randomization procedures are followed, bias from investigators influencing subject enrollment and group composition can threaten validity if allocation to groups is not concealed from those enrolling subjects in the study.

Blinding

In an attempt to minimize the effect of rater or subject bias, studies use various blinding schemes. There are 4 categories of study participants who should ideally be blinded to group assignment: (1) patients, (2) treating clinicians, (3) data collectors, and (4) data analysts. Although it is usually feasible to blind those from all 4 categories in a pharmaceutical study, this is usually not possible in studies of physical therapy interventions [3].

Completeness of follow-up (intention-to-treat principle)

The authors should report the reasons for any patient dropouts from the study and identify any patients who were lost to follow-up. It is important for the clinician to know whether patients withdrew from the study due to full resolution of symptoms, for reasons unrelated to the study, or because they experienced a worsening in status that was directly or potentially related to the examination or treatment program provided by the study protocol. When subjects are lost to follow-up, it may still be possible to include data from all subjects in the final data set using an intention-to-treat (ITT) approach, which has been used in a variety of recently published studies.[4]
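To make the intention-to-treat idea concrete, here is a minimal sketch in which every randomised patient is analysed in their assigned group and missing final scores are imputed by carrying the baseline forward (a deliberately simple "no change" rule; real trials often use more sophisticated imputation methods). The patient data and group labels are hypothetical.

```python
# Minimal intention-to-treat sketch. Lower scores are better (e.g., pain).
patients = [
    # (assigned_group, baseline_score, final_score or None if lost to follow-up)
    ("exercise", 40, 25),
    ("exercise", 55, None),  # lost to follow-up: baseline carried forward
    ("exercise", 48, 30),
    ("control", 42, 38),
    ("control", 50, 45),
    ("control", 47, None),   # lost to follow-up: baseline carried forward
]

improvement = {"exercise": [], "control": []}
for group, baseline, final in patients:
    outcome = final if final is not None else baseline  # impute "no change"
    improvement[group].append(baseline - outcome)       # analyse as randomised

for group, scores in improvement.items():
    print(f"{group}: mean improvement = {sum(scores) / len(scores):.1f}")
```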

Equivalent experience of groups apart from treatment of interest

It is possible to introduce bias into a study of treatment if there are important between-group differences in the overall patient experience, aside from the treatment itself. For example, if one group receives more time with treating therapists or receives cointerventions in addition to the intended treatment, this disparity can present a competing explanation for any observed benefits. For this reason, investigators often try to structure study protocols to minimize any unnecessary between-group differences in overall experience during the study, other than the treatment(s) of interest.

What were the results?

If we decide that the study is valid, we can go on to look at the results. At this step we consider whether the study's results are clinically important. For example, did the experimental group show a significantly better outcome compared with the control group? We also consider how much uncertainty there is about the results, as expressed in the form of p values, confidence intervals and sensitivity analyses.

  • How large was the treatment effect?
  • How precise was the estimate of the treatment effect?

Physiotherapists should understand statistical analyses and the presentation of quantitative results when critically appraising an article.
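As a small worked example of "how large" and "how precise", this sketch computes the between-group difference in mean improvement and its 95% confidence interval; all numbers are invented for illustration, and the simple pooled degrees of freedom are a deliberate simplification of the Welch–Satterthwaite calculation.

```python
import numpy as np
from scipy import stats

# Hypothetical improvement scores (e.g., points of pain reduction) per group
treatment = np.array([22, 30, 18, 27, 25, 31, 20, 26], dtype=float)
control = np.array([15, 19, 12, 21, 17, 14, 18, 16], dtype=float)

# How large was the treatment effect? The difference in mean improvement.
effect = treatment.mean() - control.mean()

# How precise was the estimate? A 95% confidence interval for the difference.
se = np.sqrt(treatment.var(ddof=1) / len(treatment)
             + control.var(ddof=1) / len(control))
t_crit = stats.t.ppf(0.975, df=len(treatment) + len(control) - 2)
lower, upper = effect - t_crit * se, effect + t_crit * se
print(f"Effect = {effect:.1f} points, 95% CI = ({lower:.1f}, {upper:.1f})")
```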

Will the results help me in caring for my patients?

Once you have decided that your evidence is valid and important, you need to think about how it applies to your question. It is likely, for example, that your patient or population has different characteristics from those in the study. Critical appraisal skills provide a framework within which to consider these issues in an explicit, transparent way.

  • Can the results be applied to my patient care?
  • Were all clinically important outcomes considered?
  • Are the likely treatment benefits worth the potential harms and costs?

This final question in a critical appraisal of evidence involves a series of deliberate judgments about the relevance and applicability of the evidence to a specific patient in the context of a specific clinical setting. An evidence-based practitioner will need to decide whether the patient under consideration is sufficiently similar to the patients in the study or group of studies for the results to be relevant.

Checklist

Are the results valid?

  • Was a randomization procedure explicitly reported? 
  • Was group assignment concealed from those enrolling patients? 
  • Were groups reasonably homogeneous at baseline?
  • Were the patients blinded to the treatment they received?
  • Were treating clinicians blinded to group membership? 
  • Were data collectors blinded to group membership? 
  • Was the follow-up period sufficiently long? 
  • Did any patients drop out or switch group assignment? 
  • If there were dropouts or switchover patients, was an intention-to-treat analysis performed?
  • Was the overall research experience equivalent for groups, other than the treatment(s) of interest? 

What are the results?

  • Are the treatment effects statistically significant (a positive trial)?
  • In a positive trial, is the treatment effect size clinically meaningful (equal to or larger than the minimal clinically important difference (MCID))?
  • In a positive trial, does the lower limit of the 95% confidence interval around the treatment effect still exceed the MCID?
  • In a negative trial, does the upper limit of the 95% confidence interval around the treatment effect fall below the MCID? (See the sketch after this list.)
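The two confidence-interval items above can be expressed as a simple decision rule. The sketch below is one way to encode that logic; the function name, arguments, and example numbers are hypothetical, and the MCID is always outcome-specific.

```python
def interpret_trial(ci_lower, ci_upper, mcid, significant):
    """Compare the 95% CI for a treatment effect against the MCID."""
    if significant:  # a "positive" trial
        if ci_lower >= mcid:
            return "Even the smallest plausible effect is clinically meaningful."
        return "Statistically significant, but the CI includes effects below the MCID."
    # a "negative" trial
    if ci_upper < mcid:
        return "A clinically meaningful effect is plausibly ruled out."
    return "Not significant, but the CI still includes clinically meaningful effects."

# Hypothetical positive trial: effect of 12 points, 95% CI (6, 18), MCID of 10
print(interpret_trial(ci_lower=6, ci_upper=18, mcid=10, significant=True))
```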

How can I apply the results to patient care?

  • Is my patient sufficiently similar to patients in the treatment group?
  • Are the outcomes measured in the study relevant to my patient’s goals?
  • Is the treatment compatible with my patient’s values, preferences, and expectations?
  • Are the anticipated benefits worth the costs and potential for any adverse effects?
  • Do I have the clinical skills and any required equipment to provide the treatment?


Critical Appraisal Worksheets

The critical appraisal worksheets from the Centre for Evidence-Based Medicine (CEBM) are very useful.


Resources

Use the University of Alberta EBM Toolkit to guide you through a validity assessment.

The PEDro tutorial is designed to help readers of clinical trials differentiate trials that are likely to be valid from those that might not be. It also looks briefly at how therapists might use the findings of properly performed studies to make clinical decisions. The PEDro scale is a valid measure of the methodological quality of clinical trials; de Morton (2009) suggests it is valid to sum PEDro scale item scores to obtain a total score that can be treated as interval-level measurement and subjected to parametric statistical analysis[5].
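Since de Morton's analysis supports summing the items, a reader can simply record each scored PEDro item as 0 or 1 and total them. In the sketch below the item scores are invented for illustration; item 1 of the scale (eligibility criteria) relates to external validity and is conventionally excluded from the 0-10 total.

```python
# Hypothetical PEDro item scores for one trial (1 = criterion satisfied).
# These are items 2-11 of the scale; item 1 (eligibility) is not counted.
pedro_items = {
    "random_allocation": 1,
    "concealed_allocation": 1,
    "baseline_comparability": 1,
    "blinded_subjects": 0,
    "blinded_therapists": 0,
    "blinded_assessors": 1,
    "adequate_follow_up": 1,
    "intention_to_treat": 1,
    "between_group_comparison": 1,
    "point_estimates_and_variability": 1,
}

print(f"PEDro score: {sum(pedro_items.values())}/10")
```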

Trisha Greenhalgh's series of ten articles on 'How to read a paper', published in the BMJ in 1997, introduces non-experts to finding medical articles and assessing their value.

CASP provides a set of eight critical appraisal tools designed to be used when reading research.

The AGREE instrument provides a tool for assessing the quality of clinical guidelines.

How to Read a Paper: The Basics of Evidence-Based Medicine. BMJ 2008;336.

Understand Levels and Grades of evidence with our related page

Understand simple statistics with our Test Diagnostics page

References

  1. Cleland, Noteboom, Whitman and Allison. A Primer on Selected Aspects of Evidence-Based Practice Relating to Questions of Treatment, Part 1: Asking Questions, Finding Evidence, and Determining Validity. Journal of Orthopaedic & Sports Physical Therapy 2008;38(8).
  2. Cleland, Noteboom, Whitman and Allison. A Primer on Selected Aspects of Evidence-Based Practice Relating to Questions of Treatment, Part 2: Interpreting Results, Application to Clinical Practice, and Self-Evaluation. Journal of Orthopaedic & Sports Physical Therapy 2008;38(8).
  3. Monaghan TF, Agudelo CW, Rahman SN, Wein AJ, Lazar JM, Everaert K, Dmochowski RR. Blinding in Clinical Trials: Seeing the Big Picture. Medicina 2021;57(7):647.
  4. Nagel S, Haussen DC, Nogueira RG. Importance of the Intention-to-Treat Principle. JAMA Neurology 2020;77(7):905-906.
  5. de Morton NA. The PEDro scale is a valid measure of the methodological quality of clinical trials: a demographic study. Australian Journal of Physiotherapy 2009;55(2):129-133.