
Tuesday, June 6, 2023

Special Tests: Introduction

Brent Brookbush


Introduction to Special Tests

by Brent Brookbush DPT, PT, COMT, MS, PES, CES, CSCS, ACSM H/FS


  • Special Tests (a.k.a. orthopedic tests, clinical tests, etc.): A set of motions, positions and/or palpations designed to provoke symptoms associated with a particular diagnosis. Most tests are dichotomous (two possible results), resulting in someone exhibiting or not exhibiting symptoms.


  • Negative (Rule-out): Test does not provoke intended symptoms
  • Positive (Rule-in): Test provokes intended symptoms

Important Note:

  • Concordant Sign: The patient's specific pain, symptom or complaint.
  • Comparable Sign: A combination of pain, stiffness and/or spasm during examination that is comparable to the patient's symptoms (5).
    • Note: One common error made during evaluation with special tests is focus on any discomfort or pain, rather than identification of the pain/discomfort related to the patient's complaint (concordant/comparable sign). Many special tests purposefully test tissue limits and are at least mildly uncomfortable for everyone. Coming to conclusions/diagnoses based on positive tests that resulted in "concordant/comparable signs" will improve the chances of diagnosing the issue related to the patient's complaint.

FABER test for hip and sacroiliac joint pathology

Necessary Prior Knowledge:

"Performing special tests is easy, interpretation of the findings is more complicated."

Are you a licensed medical professional (PT, ATC, DC, OT, DO, MD)? If not, then special tests are not within your scope of practice.

Knowledge Base: Without a significant base of knowledge (e.g., the minimum standard for a licensed medical professional), it would be difficult, if not impossible, to weigh the relevance of a special test result. That is, it is not possible to interpret a cluster of subjective signs, special test results, and movement assessments if you have not developed an internal model of the healthy ideal, have no experience with what a positive result looks/feels like, are unaware of the potential diagnoses, and are not aware of the prevalence or likelihood of diagnoses. Even licensed medical professionals find assessment/diagnosis a significant challenge for several years after graduation, and every experienced licensed medical professional regularly encounters a cluster of findings that challenges conventional diagnosis.

Liability: Assessment and diagnosis using special tests comes with added responsibility. Many licensed professionals have learned "Red Flags" or "Red Flag Signs". These are findings that suggest a more threatening issue that mandates immediate referral to a physician, and in some cases immediate referral to emergency room care. One of the more common examples is a low back pain patient who is having bowel/bladder problems and is experiencing "saddle anesthesia". In these cases, emergency surgery is the best course of action for reducing the risk of losing bowel/bladder control, sexual function, and/or lower extremity strength/sensation. If an individual has come to you for an evaluation, and you miss "Red Flag Signs", you may be liable for the increased dysfunction or injury that results from a delay in treatment.

What we know when we don't know: Licensed medical professionals are trained to identify the issues their scope of practice is capable of addressing. Teaching an understanding of one's scope is one of the most important functions of an accredited graduate program. The majority of special tests are designed to identify orthopedic dysfunctions that may or may not be treatable with manual interventions, modalities and exercise. This implies that the default recommendation for test results that suggest an issue outside the professional's scope is a referral for further testing by a professional who is more likely to be successful. This improves the likelihood of every patient finding the type of medical professional they need to address their issue.

O'Brien's test for labral pathology

What are We Measuring?

"The most important thing to remember about special tests: a test result increases or decreases the likelihood of a diagnosis, and is not an absolute indicator."

Results are probabilistic. Unfortunately, there is no such thing as a special test with 100% accuracy. Each test has strengths and weaknesses. The practitioner should view each positive or negative test result as a piece of information that increases or decreases the odds of a particular diagnosis. There are statistical tools to aid in determining the accuracy, strength, or predictive value of each test. Initially it is unnecessary to memorize the statistics specific to each special test; however, a conceptual understanding is helpful for rating or weighing a test result against other findings during an evaluation. For example, your diagnosis may not be altered by a negative test result on a special test with low sensitivity, especially if it contradicts the findings of all other tests. A deeper understanding of these statistical tools and values may be necessary if you are responsible for choosing the optimal set of special tests for your practice. Note, in the articles that follow, the Brookbush Institute (BI) only covers special tests with the highest accuracy, and has purposefully omitted all other tests (more discussion below). The most important point to remember about special tests is that a test result increases or decreases the likelihood of a diagnosis, and is not an absolute indicator.
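
The idea that a result "increases or decreases the odds" can be illustrated with a short calculation. The following is a minimal sketch, not a clinical tool: it uses the likelihood ratios defined in the list below, and the sensitivity, specificity, and pre-test probability values are hypothetical numbers chosen only for illustration. The function names are also illustrative assumptions.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-); both inputs must be strictly between 0 and 1."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio by way of odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Hypothetical special test: low sensitivity (0.40), high specificity (0.90).
lr_pos, lr_neg = likelihood_ratios(0.40, 0.90)

# Hypothetical pre-test probability of 50%, e.g. from history and other findings.
after_negative = post_test_probability(0.50, lr_neg)
after_positive = post_test_probability(0.50, lr_pos)

print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
print(f"Probability after a negative result: {after_negative:.2f}")
print(f"Probability after a positive result: {after_positive:.2f}")
```

With these assumed numbers, a negative result only lowers the probability from 50% to about 40%, while a positive result raises it to about 80%. This is why a negative result on a low-sensitivity test may not alter a diagnosis supported by other findings.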

  • Results: Although most special tests are either positive or negative, the accuracy of a test is dependent on the number of positives and negatives that are correct. Therefore, when the accuracy of a test is determined there are 4 possible results, because a positive or negative result could be right (true) or wrong (false).
    • True Positive - the proportion of positives in a group that were assessed correctly
    • False Positive - the proportion of positives in a group that were assessed incorrectly (assessed as positive when the patient was actually negative)
    • True Negative - the proportion of negatives in a group that were assessed correctly
    • False Negative - the proportion of negatives in a group that were assessed incorrectly (assessed as negative when the patient was actually positive)
  • Sensitivity: The probability of a positive test result in someone with the pathology (the probability of a true positive). This could also be viewed as the ability of a test to be "sensitive" enough to detect all of the positive individuals in a group of likely candidates.
  • Specificity: The probability of a negative test result in someone who does not have the pathology (the probability of a true negative). The ability to "specifically" identify only those individuals with the dysfunction, by getting a negative result from everyone who does not.
  • Likelihood Ratios - Likelihood ratios are used for assessing the value of performing a diagnostic test. They use sensitivity and specificity to determine whether a test result usefully changes the practitioner's chance (probability) of reaching the correct assessment. Calculation of likelihood ratios is based on Bayes' theorem.
    • Positive Likelihood Ratio (LR+) - The ratio of the probability of a positive test result in people with the pathology to the probability of a positive test result in people without the pathology: sensitivity / (1 - specificity).
    • Negative Likelihood Ratio (LR-) - The ratio of the probability of a negative test result in people with the pathology to the probability of a negative test result in people without the pathology: (1 - sensitivity) / specificity.
  • Predictive Values - The proportions of positive and negative test results that are correct. The positive predictive value (PPV) and negative predictive value (NPV) describe the accuracy of a diagnostic test; however, unlike sensitivity and specificity, predictive values are largely dependent on the dysfunction's prevalence in the examined population (1). For example, imagine a 20-year-old female college athlete with an on-the-field injury resulting in knee pain, and a 75-year-old male patient with knee pain that has progressively become worse over the last 3 months. Consider how these two cases may affect the predictive values of an eccentric step down test or Lachman's test.
    • Positive predictive value (PPV) - describes the probability of having the dysfunction of interest in a subject with a positive test result. Therefore, PPV represents the proportion of patients with a positive test result who actually have the dysfunction in a given population.
    • Negative predictive value (NPV) - describes the probability of not having the dysfunction of interest in a subject with a negative test result. Therefore, NPV represents the proportion of patients with a negative test result who actually do not have the dysfunction in a given population.
  • Accuracy - the proportion of subjects correctly identified by the test results: (true positives + true negatives) / tested population
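
The definitions above can be sketched as a short calculation from the four possible results, entered here as counts. This is a minimal illustration under stated assumptions: the counts are from a hypothetical study (not from any cited research), and the function names are invented for this example.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute accuracy statistics from counts of the four possible results."""
    sensitivity = tp / (tp + fn)  # positives among those with the pathology
    specificity = tn / (tn + fp)  # negatives among those without it
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
        "ppv": tp / (tp + fp),  # correct positives among all positive results
        "npv": tn / (tn + fn),  # correct negatives among all negative results
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given prevalence, using Bayes' theorem."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical study: 100 subjects, 40 of whom have the pathology.
metrics = diagnostic_metrics(tp=32, fp=6, fn=8, tn=54)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")

# Same test, two hypothetical populations with different prevalence.
ppv_high, npv_high = predictive_values(0.80, 0.90, prevalence=0.50)
ppv_low, npv_low = predictive_values(0.80, 0.90, prevalence=0.05)
print(f"PPV at 50% prevalence: {ppv_high:.2f}")
print(f"PPV at 5% prevalence: {ppv_low:.2f}")
```

In this sketch the sensitivity and specificity are identical in both populations, yet the PPV falls from roughly 0.89 to roughly 0.30 when prevalence drops from 50% to 5%. This is the reason the same positive result carries different weight for the two knee pain patients described above.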

Gaenslen's Test for Sacroiliac Joint Dysfunction

The Difference Between a Good Test and a Bad Test:

"A good test is accurate, valid, reliable and relevant to the patient being assessed."

  • Accuracy - the proportion of subjects correctly identified by the test results (see "What are we measuring?" above)
  • Validity - refers to how well a test measures what it is purported to measure (Does it measure what we think it measures?).
  • Reliability - refers to the ability of a test or assessment to produce consistent results, time after time, regardless of who is performing the assessment. If an assessment cannot produce consistent results, then a professional cannot determine the accuracy of a measurement, compare that measurement to normative data, or reassess and compare measurements taken on two separate dates. Reliability falls into two broad categories:
    • Inter-tester reliability - assesses the agreement (or lack of) between two or more testers in their assessment.
    • Intra-tester reliability - assesses the agreement (or lack of) between test scores from one test administration to the next (administered by a single tester).
  • Relevance - closely connected or appropriate to the matter at hand.
    • The actual impact an assessment has on application is not given enough credit. If the results of an assessment do not impact how you will proceed, that assessment is not relevant, and it should be discarded. Further, the relevance of an assessment may affect predictive values (PPV and NPV), and the accuracy of your evaluation.
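
Agreement between testers, as described under reliability above, is often summarized with a chance-corrected statistic. The sketch below uses Cohen's kappa, which is not named in this article and is offered only as one common choice. The tester results are hypothetical, and the function name is an assumption made for this example.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two equal-length lists of results."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both lists must be non-empty and of equal length.")
    n = len(ratings_a)
    # Observed agreement: fraction of patients given the same result twice.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected agreement by chance, from how often each result was given.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("Kappa is undefined when all results are identical.")
    return (observed - expected) / (1 - expected)


# Hypothetical results from two testers performing one test on 10 patients.
tester_a = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "pos", "neg", "neg"]
tester_b = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "pos", "neg", "neg"]

kappa = cohens_kappa(tester_a, tester_b)
print(f"Inter-tester kappa: {kappa:.2f}")
```

Here the testers agree on 8 of 10 patients, but because some agreement is expected by chance, kappa is lower than the raw 80% agreement. The same function could be applied to intra-tester reliability by comparing one tester's results from two separate administrations.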

Valgus and Varus Stress Test for collateral ligament pathology of the knee

How the Brookbush Institute Chose the Special Tests in these Lessons:

There are hundreds of special tests to choose from. There is often an overwhelming number of tests for a single condition or pathology. Learning all of these tests is not realistic or practical, and if selection is refined using the concepts above, it is not necessary.

  • Accuracy: As more research on the accuracy (specificity, sensitivity, reliability, likelihood ratios, etc.) of special tests is published, findings indicate the majority of special tests are fairly poor. In fact, most are so poor that most dysfunctions require a test-item cluster to reach an acceptable level of accuracy. The Brookbush Institute has purposefully excluded all "poor" special tests that were not included in a test-item cluster.
    • Test-Item Cluster - A specific group of special tests for a particular diagnosis that has greater accuracy than a single special test, as demonstrated by research.
  • Best Possible Selection: The selection of special tests was further refined by best possible selection. When more than one cluster or special test demonstrated good accuracy, the most accurate cluster or test was chosen. A potential exception was made for more practical (easier to perform) tests/clusters when accuracy was near equal.
  • Common Diagnoses: Consideration was given to likely diagnoses. It is important that our bias toward outpatient orthopedic/sports medicine is considered. Special tests or test-item clusters for conditions that are unlikely in an outpatient or sports setting were not included. If you work in a different setting, it is very likely that the tests covered in the following lessons are still the most accurate for the given diagnoses, but you may need to add special tests for diagnoses that are common to your setting.
  • When Weak is Better than Nothing: In some cases "weak" tests were selected because they were the best option available. These "weak" options were especially important when the diagnosis would have a significant impact on intervention. A weak test that demonstrates some level of accuracy via the rigors of peer-reviewed, published research is still stronger evidence than a personally biased guess. While selecting special tests for these lessons, it was important to consider whether excluding a test relegated a diagnosis to guesswork. In these cases, the best possible test was selected, despite not meeting the standards expected in cases where multiple researched tests were available.

Acromioclavicular (AC) Joint Resisted Extension Test

How to use Special Tests:

  • "Clear" the Patient/Client for Intervention: The vast majority of special tests are only helpful for clearing the patient for intervention, and do not have an impact on intervention selection. That is, these tests "clear" the patient to continue with treatment under your care. The Brookbush Institute is adamant that choosing an intervention to treat a diagnosis is flawed logic, and interventions should be selected based on the results of movement assessment. Evidence of certain diagnoses (e.g., evidence of shoulder labrum tear) may suggest that treatment should proceed with caution, or that follow-up is necessary in the relatively near future. That is, the diagnosis implies that if little or no change results from initial intervention, the patient should be referred out for further diagnostic testing.
  • Highlight Contraindications - Some special tests may indicate damage to tissues and suggest that certain targeted interventions are inappropriate. For example, evidence of a muscle or ligament strain may imply that additional stress to those tissues should be avoided until after the acute injury phase.
  • Refine Exercise/Intervention Selection: A few special tests should be re-categorized as movement assessments. Although convention dictates their inclusion in these courses, these tests should likely be grouped with assessments like the Overhead Squat Assessment or Goniometry. The "stork test" for sacroiliac joint stiffness is a good example of a test that implies a movement impairment and not a particular diagnosis.

Additional Articles on Special Tests:


  1. Šimundić, A. M. (2009). Measures of diagnostic accuracy: basic definitions. EJIFCC, 19(4), 203.
  2. Cook, C., & Hegedus, E. J. (2008). Orthopedic physical examination tests: an evidence-based approach.
  3. Dutton, M. (2012). Dutton's Orthopaedic examination, evaluation, and intervention. McGraw-Hill Medical.
  4. Magee, D. J. (2013). Orthopedic physical assessment. Elsevier Health Sciences.
  5. Maitland Australian Physiotherapy Seminars, MT-O: Evidence Based Orthopedic Diagnostic Evaluation © Maitland-Australian Physio-Therapy Seminars 2009-2012

© 2018 Brent Brookbush

Questions, comments and critiques are welcome and encouraged.