
Sunday, September 29, 2024

Is There a Single Best Approach to Physical Rehabilitation?

Is there a single best approach to physical rehabilitation? Science suggests the answer is "yes."
Brent Brookbush

DPT, PT, MS, CPT, HMS, IMT


A Formal Proof for the Existence of a Single Best Approach in Physical Rehabilitation, with Considerations for Achieving Optimal Patient Outcomes.

by Brent Brookbush, DPT, PT, MS, CPT, HMS, IMT

Introduction

Problems with our industry:

Intervention selection is often "modality driven," based on practitioner preference, and rationalized based on any improvement in an outcome measure that was identified post hoc. Too often, this measure is a subjective assessment of current pain, which is not a reliable measure of short-term or long-term outcomes. I know this sounds harsh, but an examination of education throughout the industry and social media posts will quickly highlight an obsession with promoting or demonizing an intervention, with almost no reference to a systematic approach, assessment-driven decisions, the relative efficacy of the intervention, support from comparative research, or reference to reliable, objective outcome measures.

Intervention selection, sometimes referred to as clinical decision-making, is a topic that is not given enough attention in college and university curricula or professional continuing education courses. Often, these topics are not given any dedicated time during coursework, resulting in vague rationales and messy heuristics that fall apart when used in professional practice. Perhaps what is most disappointing is that logic, set theory, sorting and labeling, decision theory, and information science have continued to progress over the past 80 years, in large part due to technology. Our professions have ignored these sciences or failed to integrate these advancements in any significant way.

The thought experiment that inspired this article.

We could start thinking about intervention selection with a thought experiment. Imagine placing every possible physical rehabilitation technique in a pile on a table. Every modality, manual technique, exercise, etc., from every physical rehabilitation profession (PT, ATC, DC, OT, DO, etc.). Which techniques would you select from this pile? I am assuming most of us would select the best possible techniques or the best possible combination of techniques (that are within our scope of practice). But what does "best possible" mean? Unfortunately, "best" is too often based on the practitioner's preference (what the practitioner is comfortable performing) or potentially the techniques conventionally performed by a professional designation (e.g., chiropractors perform manipulations, acupuncturists perform acupuncture, etc.). Worse still, selections are often justified if the intervention had any positive effect, with little consideration for what intervention selection would have resulted in the best possible outcomes. By default, this replaces the pursuit of "optimal outcomes" with "it worked for me."

However, there is an alternative. The primary thesis of this article is that "best" is an objectively measurable quantity. It is not a debate that can or should be resolved by subjective opinions and vague references to professional experience. Expert opinion should be replaced with a "mathematical quantity." The definition of "best possible" can be derived from two objective measures: reliability/frequency (the percentage of time it results in a positive outcome) and effect size/magnitude of effect (the amount of improvement made). The product of frequency and magnitude is known as "expected value" (a term that is likely most often referenced in economics). Because sessions are limited in length of time, techniques should be prioritized by expected value, ensuring that the best interventions are performed within the session time and the best possible outcomes are achieved for that session.
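As a preview of how this prioritization might be computed, below is a minimal sketch in Python. The intervention names, reliabilities, and effect sizes are hypothetical placeholders, not values drawn from research.

```python
# A minimal sketch of prioritizing interventions by expected value.
# All intervention names, reliabilities, and effect sizes below are
# hypothetical placeholders, not values drawn from research.

interventions = {
    "joint manipulation": {"reliability": 0.80, "effect_size": 3.0},
    "dry needling": {"reliability": 0.70, "effect_size": 2.5},
    "IASTM": {"reliability": 0.60, "effect_size": 1.5},
    "therapeutic ultrasound": {"reliability": 0.50, "effect_size": 0.5},
}

def expected_value(technique: dict) -> float:
    """Expected value = frequency (reliability) x value (effect size)."""
    return technique["reliability"] * technique["effect_size"]

# Rank techniques from highest to lowest expected value.
ranked = sorted(interventions, key=lambda name: expected_value(interventions[name]), reverse=True)
for name in ranked:
    print(f"{name}: {expected_value(interventions[name]):.2f}")
```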

Caption: Neck work

Quick Summary

Practical Application

Primary Hypothesis:

  • There is an objectively measurable best possible set of interventions, developed from prioritizing techniques based on their expected values, that will result in the best possible average outcome for all patients.

This can be achieved with the following methodology:

  1. Use the outcomes (expected value) demonstrated in comparative research to build an intervention model that prioritizes intervention categories by relative efficacy and further lists the best intervention from each category. (Note that developing categories will require better labeling and sorting of intervention types).
  2. Additionally, assessments should be carefully selected to differentiate patient populations into subgroups that achieve optimal outcomes from different intervention plans, aiding in the optimal reprioritization of interventions for these subgroups.
  3. A methodology of assessment, intervention, and reassessment can then be used to test interventions in rank order of their expected values to refine intervention selection for individual patients in practice.
  4. Last, a small proportion of session time should be allocated to trying new approaches, with the aim of uncovering strategies that achieve a higher expected value than previously thought possible.

Section 1: Axioms Demonstrating There Is One Best Approach

  1. Outcomes are Probabilistic: Because an intervention's effect on outcomes is probabilistic, comparing interventions must be based on outcomes and not mechanism/intent.
  2. Choices are Relative. Assuming we are comparing interventions with an effect on outcomes greater than doing nothing, the choice of intervention must be based on relative effectiveness. The effectiveness of an intervention is relative to the effectiveness of all other interventions that could be selected.
  3. "Best" is a Measurable Quantity. Effectiveness can be calculated using the formula for expected value: frequency (reliability) x value (effect size) = expected value (effect on outcome measures). This implies the best intervention is the intervention with the highest expected value (reliability x effect size).
  4. Intervention selection is a "zero-sum game": Because the length of a session limits the total number of interventions that can be selected, a "zero-sum game" scenario is created, resulting in a need to determine relative efficacy. That is, only a limited number of interventions can fit into a session; choosing any intervention beyond that number requires the removal or dismissal of another.
  5. Prioritizing the "best intervention(s)" will result in the "best outcomes": Prioritizing interventions based on the highest expected value will result in the highest expected outcome. That is, the best possible outcome is the sum of the expected values of the interventions with the highest expected values.
  6. Assessment aids in re-prioritization of interventions for subgroups: From the perspective of decision theory, the goal of an assessment is to differentiate patient populations into subgroups that achieve optimal outcomes from different intervention plans.

Section 2: The Best Approach in Practice

  • "Best Interventions" Should Be Determined by Comparative Research: It is recommended that the "best interventions" be selected using the best data currently available. The best data for determining relative efficacy is outcomes from peer-reviewed and published comparative research.
  • Outcomes Measures Selected Based on Carry-over Effects: The best possible approach should likely consist of selecting interventions that have the largest carry-over effects on the treatable factors correlated with the best short-term and long-term patient outcomes.
    • The intervention's effect on outcomes is more important than the correlation between the affected factor and the pathology.
  • Experimentation is Necessary to Avoid Local Maxima: To investigate whether better outcomes are possible, it’s necessary to experiment with new interventions and new combinations of interventions, even if they deviate from established best practices.
  • Modeling May be Necessary: Modeling is likely necessary to test the interaction of multiple interventions and their effect on outcomes.

Formal Proof

  • Probabilistic Outcomes: Given that outcomes are probabilistic, the result of an intervention cannot be known with certainty. Let $p_i$ represent the probability of a successful outcome for intervention $I_i$. These probabilities vary by intervention, and the outcome depends on both the reliability of the intervention and its short-term and long-term effect size on objective outcome measures.
  • Relativity of Choices: The success of an intervention is relative, not evaluated in isolation. If there are $n$ possible interventions $\{I_1, I_2, \dots, I_n\}$, the effectiveness of intervention $I_i$ must be evaluated relative to the effectiveness of others $I_j$, where $j \neq i$. Comparative research should inform this evaluation.
  • Zero-Sum Nature of Intervention Selection: In any session, time constraints limit the set of interventions $S \subseteq \{I_1, I_2, \dots, I_n\}$ that can be performed. Selecting intervention $I_i$ excludes $I_j$, creating a zero-sum game. The goal is to prioritize interventions with the highest expected value, factoring in carry-over effects.
  • Measurable Effectiveness: Effectiveness $E_i$ of an intervention is defined as $E_i = F_i \times V_i$, where $F_i$ is reliability (frequency of success) and $V_i$ is effect size. Expected value should prioritize interventions shown to produce both short-term and long-term improvements, based on objective outcome measures, avoiding over-reliance on subjective reports like pain reduction.
  • Existence of a Best Intervention: Let the set of possible interventions be $\{I_1, I_2, \dots, I_n\}$. The best intervention $I_{\text{best}}$ is the one that maximizes $E_i$; new experimental interventions should periodically be introduced to challenge the potential of local maxima.
  • Prioritizing the Best Intervention: By choosing $I_{\text{best}}$, practitioners maximize the likelihood of the best possible outcome, but assessments must be used to identify patient subgroups (e.g., different responders). In cases where subgroups exist, interventions can be reprioritized for each subgroup to optimize expected value.
  • Formula for Optimal Intervention Selection (an example of the formula's use appears at the end of the article):

$$I_{\text{optimal}} = \arg\max_{i \in \{1, 2, \dots, n\}} \left( \mathbb{E}(O_i \mid S) \times W_{\text{LT}} \right) \quad \text{subject to: } \sum_{i=1}^{k} T_i \leq T_{\text{session}}$$

  • Where:
    • $I_{\text{optimal}}$ is the optimal intervention or combination of interventions selected for the patient.
    • $\arg\max$ means "the value of $i$" that maximizes the expression $\mathbb{E}(O_i \mid S) \times W_{\text{LT}}$. Essentially, it finds the intervention $I_i$ that provides the highest value of the expected outcome multiplied by the long-term weighting factor.
    • $\mathbb{E}(O_i)$ is the expected outcome of intervention $I_i$, calculated as $\mathbb{E}(O_i) = F_i \times V_i$, where:
      • $F_i$ is the reliability (frequency of success) of intervention $I_i$.
      • $V_i$ is the effect size (magnitude of the outcome) of intervention $I_i$.
    • $W_{\text{LT}}$ is a weighting factor that accounts for the long-term carry-over effect of the intervention. This ensures that interventions with long-term benefits are prioritized higher than those with only short-term effects.
    • $T_i$ is the time required to perform intervention $I_i$.
    • $T_{\text{session}}$ is the total time available in the session.
    • The summation constraint $\sum_{i=1}^{k} T_i \leq T_{\text{session}}$ ensures that the total time spent on interventions in a session does not exceed the available time.
    • $S$ is the identified patient subgroup, and $\mathbb{E}(O_i \mid S)$ is the expected outcome of intervention $I_i$ for the specific subgroup.
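To make the formula concrete, below is a minimal sketch in Python. It ranks candidate interventions by $\mathbb{E}(O_i \mid S) \times W_{\text{LT}}$ and fills the session greedily until $T_{\text{session}}$ is exhausted. All intervention names and numbers are hypothetical placeholders, and the greedy fill is a simplification: finding the true optimum under a time constraint may require an exhaustive (knapsack-style) search.

```python
# A minimal sketch of the selection formula above: rank interventions by
# E(O_i | S) x W_LT and fill the session greedily until T_session is
# exhausted. All numbers are hypothetical placeholders; a true optimum
# may require an exhaustive (knapsack-style) search rather than a greedy fill.
from dataclasses import dataclass

@dataclass
class Intervention:
    name: str
    reliability: float   # F_i
    effect_size: float   # V_i
    carry_over: float    # W_LT, weighting for long-term carry-over
    minutes: int         # T_i

    def weighted_ev(self) -> float:
        # E(O_i) x W_LT = (F_i x V_i) x W_LT
        return self.reliability * self.effect_size * self.carry_over

def select_interventions(candidates: list[Intervention], t_session: int) -> list[Intervention]:
    selected, remaining = [], t_session
    for i in sorted(candidates, key=Intervention.weighted_ev, reverse=True):
        if i.minutes <= remaining:   # constraint: sum(T_i) <= T_session
            selected.append(i)
            remaining -= i.minutes
    return selected

candidates = [
    Intervention("joint manipulation", 0.80, 3.0, 0.8, 10),
    Intervention("dry needling", 0.70, 2.5, 0.7, 15),
    Intervention("specific exercise", 0.75, 2.0, 0.9, 20),
    Intervention("therapeutic ultrasound", 0.50, 0.5, 0.2, 10),
]
for i in select_interventions(candidates, t_session=45):
    print(i.name, round(i.weighted_ev(), 2))
```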

Caption: Does exercise have a higher expected value than manual therapy? Which is the best choice?

Section 1: There is One Best Approach

Outcomes are probabilistic, not deterministic.

Practitioners do not perform an intervention and know its effect on outcomes with certainty. This uncertainty results in a relationship between interventions and outcomes that can only be expressed as a probability of an effect. Interventions are often taught deterministically, as if a causal relationship exists that results in a specific effect; however, this teaching is either an oversimplification of a more complex relationship, or it may simply be a false notion. All interventions result in an effect on outcomes that varies in both reliability and effect size. Assertions of any causal relationship are hypotheses, and these hypotheses only affect the probability if they affect practice. That is, the effect of a hypothesis on an outcome is due to its effect on practice (e.g., intervention selection) and not due to the hypothesis itself.

Although it is common to see the phrase "correlation is not causation," it is rare to see a discussion of how a causal relationship is demonstrated. This is likely because demonstrating cause is not as simple as the statement "correlation is not causation" may imply. The way this phrase is abused in the media almost implies that correlation is the "easy way out," and that a little extra effort would have revealed the causal relationship. This is not true. Although correlations are relatively easy to demonstrate, a causal relationship is accepted when all available evidence suggests that the causal hypothesis is the most accurate available and leads to the most accurate predictions. It would not be inaccurate (although perhaps a bit incomplete) to say that causation is demonstrated by congruence between multiple correlations. It is also important to note that "proving causation" can become a "slippery slope argument," in which an ever-increasing search for a more detailed explanation results in a continuous demonstration of gaps in our knowledge.

This relationship between interventions, causation, evidence, and outcomes results in an uncertainty that, again, can only be expressed as a probability. Note that it is not necessary to quantify the probability that a causal hypothesis is accurate in order to measure the probability of an intervention resulting in a change in an outcome measure; however, this uncertainty should clearly demonstrate that the relationship between interventions and outcomes is not deterministic.

The Effectiveness of Interventions is Relative

Critique is Not Binary

The efficacy of an intervention does not compete with the efficacy of not performing the intervention; it competes with the efficacy of all other interventions for available time during a session. When a technique is critically evaluated by a professional, it is not a binary choice of whether the technique should or should not be performed unless the technique has an expected value less than or equal to doing nothing. Techniques less effective than doing nothing are likely relatively rare due to "survivorship bias." "Survivorship bias" implies that the available sample is biased to include only those techniques that have "survived" (are still in use today). Techniques that make a patient worse are unlikely to be used beyond a practitioner's tolerance for experimentation and are unlikely to spread in a profession, since professionals are unlikely to tell their colleagues about ineffective techniques. Note that techniques may also fail to survive because of complexity, competitive factors other than efficacy, and failure to spread, but this will be considered further in the section on experimentation. The majority of techniques currently used in a rehabilitation setting have likely demonstrated efficacy greater than doing nothing (or placebo), even if the effect is only a short-term improvement of subjective symptoms for some patients. At the very least, more techniques have demonstrated efficacy in randomized controlled trials (RCTs) than can reasonably be performed in a single session. This implies that the critical evaluation of techniques should focus on how effective the technique is relative to other possible techniques that could be chosen.

Unsupported Default Position Fallacy

There is another false comparison that must be addressed: the unsupported default position fallacy. The "unsupported default position fallacy" is the notion that if a flaw is asserted regarding an opposing position, this strengthens the position of the person asserting the flaw. In short, it is the idea that "proving you wrong makes me more correct." The reason this is a fallacy is that proving a position wrong would only make the other position correct if there were only 2 possible solutions and 1 of the positions had to be correct. This is rarely, if ever, the case in physical medicine and is definitely not the case when considering the selection of interventions. If there are more than 2 possible solutions, and/or every position could be correct or false, then every position being critiqued must be evaluated based on the merits of the position (relative efficacy). For example, if it is demonstrated that a position is flawed, it is still necessary to demonstrate that the other position is less flawed. It is possible that both positions are flawed, but the opposing position is less flawed and should be the adopted position based on relative accuracy. Note that the unsupported default position fallacy is rampant on social media; trolls love to critique everything. However, these trolls seem oblivious to the fact that the interventions they are currently using often have the same or worse flaws than the interventions they are critiquing. A poignant example of this fallacy is supporters of Pain Neuroscience Education (PNE) asserting that an approach based on biomechanics or movement impairment/postural dysfunction is flawed, implying that PNE is the more effective intervention. However, all RCTs comparing PNE to techniques intended to address movement impairments have demonstrated that PNE is relatively ineffective (PNE Research Review) (1). Of course, complex models of movement impairment and posture need additional refinement, but it is important to remember that medicine is an evolving scientific field. Every branch of medicine still has room for refinement. Identifying flaws is essential for continued progress, but the identification of a flaw does not necessarily imply that another methodology is less flawed. In summary, even if the description of an intervention is flawed or there is an apparent issue with implementation, if the intervention has the highest expected value (discussed below), then it must be chosen to result in optimal outcomes. The goal is to find the best possible intervention based on outcomes, not to find the flawless/perfect intervention (which may not exist).

"Best" is a Measurable Quantity

Effectiveness is a measurable quantity:

The primary hypothesis asserted by this paper is that there is an attainable best treatment approach that our professions should strive to achieve. This "best" approach may be defined objectively/mathematically (as opposed to subjectively) based on reliable, objective outcome measures. Further, the best outcomes may be achieved with the optimal selection of interventions. It is important to note that "best outcomes" refers to the best average outcomes for all patients.

The largest contributors to the best average outcomes are likely the reliability and size of an intervention's effects on outcomes. These terms could also be expressed as "frequency" (reliability) and "value" (effect size), and the product of frequency and value is "expected value." The formula, expressed simply, is "Frequency (reliability) x Value (effect size) = Expected Value (effect on outcomes)." "Expected value" is a term that is more common in game theory or economics but aids in solving a problem that is created when comparing interventions based on these two variables (2). Scenarios can easily be imagined in which, without such a formula, it might be nearly impossible to objectively compare techniques based on average outcomes: for example, when one technique is very reliable but has a relatively small effect size, and another technique has a very large effect size but very low reliability. By basing intervention selection on the product of these two numbers, we can compare interventions with a wide range of values and improve average outcomes.
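A worked example with purely hypothetical numbers illustrates the comparison: suppose technique A is reliable with a modest effect, and technique B is dramatic but rarely effective.

$$E_A = F_A \times V_A = 0.9 \times 2.0 = 1.8 \qquad E_B = F_B \times V_B = 0.1 \times 10.0 = 1.0$$

Despite technique B's much larger effect size, technique A has the higher expected value and should be prioritized, at least until an assessment can identify B's rare responders in advance.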

For example, practitioners often fall prey to the "availability heuristic," giving more weight to the outcomes they remember most, e.g., remembering whether an intervention resulted in a very positive or very negative outcome but not having a sense of the frequency of that outcome. This is especially prevalent for those professionals who commit to a particular modality or method. If a technique worked incredibly well for a patient but only results in a similar effect 1/10 times, the average of these outcomes is likely to be less effective than other intervention selections with higher reliability. Note that this technique may become an excellent choice if an assessment can identify the 1/10 patients prior to implementation (this is discussed further below). A common example of this issue in physical rehabilitation is individuals who get Graston certified and try to treat everything with instrument-assisted soft tissue mobilization (IASTM), despite our systematic research review on IASTM suggesting that IASTM has less efficacy than specific manual techniques (e.g., ischemic compression, dry needling, joint mobilizations, and joint manipulations) (3). This does not imply that practitioners should not use IASTM. It only implies that IASTM should be used following specific manual techniques to increase mobility further when time allows (discussed below).

Why can't we base selections on the intended effect of an intervention?

Outcomes depend on the variables we can modify and not on our understanding of how the variables affect outcomes (mechanism of effect). The Bradford Hill criteria of causality, published in 1965, documented this interesting logic issue. The Bradford Hill criteria list several criteria that "strengthen" a hypothesis of a causal relationship (4, 5). One criterion is a supportable causal hypothesis; however, Bradford Hill also notes that it is not necessary to know how a variable affects an outcome to know that it does affect an outcome. That is, "knowing how something works is not necessary to know that something works." Some everyday examples of this logical issue include the ability to make a phone call without any knowledge of telecommunications or smartphone technology, reducing symptoms of a headache with aspirin without knowledge of how aspirin affects headache symptoms, or benefiting from resistance training without knowing exercise physiology.

Further, a patient can benefit from the effects of an intervention despite the patient or practitioner believing a causal hypothesis that is inaccurate. For example, manipulations are likely to improve cervical pain, even if the patient and practitioner believe that this is due to the reduction of a "subluxation." In fact, to this last point, research has demonstrated that patient expectations, practitioner preference, and false narratives do not influence the effectiveness of an intervention (For more on this, check out the article - False Narratives, Nocebo, and Negative Expectations do NOT affect Manual Therapy Outcomes: Research Confirmed ) (6). Note that the fact that causal hypotheses do not affect outcomes influences how we determine the "best intervention" (discussed below). That is, we must base our selection on the end result of outcomes and not get lost in intent, hypotheses, or potential causal relationships. This is not to say that the mechanism of effect is unimportant, but its importance is to inspire modification or consideration of new variables that may be modified to improve outcomes. In summary, modifying variables affects outcomes, and hypothesized mechanisms of effect only aid in inspiring new methods.

Professional Title Does Not Affect Outcomes

Professional designation is not a variable that modifies outcomes. For example, we would not expect a chiropractor and a physical therapist to achieve vastly different outcomes from performing joint mobilizations. An egocentric argument may be made that an individual's profession results in more skilled implementation. However, that is not an argument that suggests professional designations are influential; it is an observation that the level of skill in performing an intervention may affect the expected value. Even if we consider skill as a modifiable and influential variable, research suggests this is still unlikely to have a significant influence on the average outcome achieved by an intervention relative to other interventions, suggesting it will have little, if any, influence on how interventions should be prioritized or on the selection of the optimal set of interventions (i.e., "the best approach"). Further, this implies that professionals treating the same population with similar problems should be treating all patients with a similar "best approach."

This has very important implications for how we should evolve. Scope wars should not exist. Every professional treating a specific patient population should have access to the best possible interventions. Scopes of practice should likely be divided based on the best possible combinations of interventions, not the protectionist actions of representative organizations. Further, professionals treating similar populations should be treating patients similarly because all patients deserve the best approach. Divisions between professions should likely be based on patient populations that require very different combinations of interventions (e.g., outpatient orthopedic compared to neurological pathology). For example, orthopedic outpatient-focused PTs, ATCs, and DCs should not be 3 separate professions; however, PTs should likely be divided into orthopedics and neuro, or outpatient and in-hospital care.

Chance of Multiple Best Solutions

Note that there is a chance that more than one "best approach" exists; however, using some Bayesian Probability logic suggests that the probability of this occurrence is exceedingly unlikely. If the best possible approach is the sum of the expected values of a set of interventions, then having two best approaches would require that the sum of the expected values of two sets of interventions not only be statistically similar but also result in the best possible outcomes. Keep in mind that expected value is itself a product of reliability x effect size. So, for there to be 2 "best approaches" requires 2 separate "combinations of combinations," resulting in similar values, and both being the best possible. The probability of this occurrence would become more and more unlikely with the addition of each additional best approach. This probability is similar to two different but equal hands winning at a poker table with multiple players.

A real-life analogy that might help illustrate this concept is comparing it to finding the "perfect recipe" for a dish like a cake. Imagine that you’re testing different combinations of ingredients, baking times, and temperatures to find the best cake recipe. Each combination is an "intervention," and the goal is to find the combination that produces the best cake. While there may be many good cake recipes, only one recipe will likely be the "best" — perfectly balancing taste, texture, and appearance. Now, for two different recipes to be equally the best, the specific combinations of ingredients and baking processes would have to result in cakes that are not only very similar but equally perfect. As you continue adding more variables — like using different types of flour, adjusting the temperature slightly, or varying the baking time — the chance that two completely different recipes result in cakes of the same highest quality becomes increasingly unlikely. Each small change reduces the probability of two entirely distinct recipes producing the "best" outcome. Similarly, with each additional "best approach," the probability that different sets of interventions would yield equally optimal results becomes smaller and smaller, making the occurrence exceedingly rare.

A potential way that the probability of multiple best solutions could be higher than expected would be the existence of thresholds, for example, if a threshold existed for the amount of improvement (adaptation) that the body could achieve following a single session. Although we see evidence of threshold levels of adaptation in resistance training research (e.g., 5-8 sets/muscle group/session result in similar improvements), research has not demonstrated a similar phenomenon when comparing physical medicine interventions. Often, techniques in physical medicine will result in very different outcomes, from resolution in days to resolution in months. Further, even if different treatment plans resulted in similar results because of a threshold, it is likely that other variables would not be equal (e.g., session time, patient discomfort, ability to use for self-management, etc.).

The idea of a threshold in the body’s adaptation can be likened to the concept that once you’ve reached a certain level of quality in your cake, small changes to the recipe might not significantly improve or worsen the taste of the cake. For example, small changes in the number of ingredients in an already excellent recipe may not significantly impact the taste of the final cake. However, in physical medicine, the body responds more variably, like experimenting with wildly different ingredients in baking. Sometimes, you might achieve the perfect cake in one try, while other times, it takes several attempts to get anywhere close to something that looks like a cake. Different physical medicine interventions appear to behave more like experimenting with wildly different cake recipes.

Intervention Selection is a Zero-Sum Game

What is a zero-sum game?

Intervention selection is a zero-sum game because the number of interventions selected will always be limited by the amount of time in a session. A zero-sum game is a scenario in which 1 team "wins" only if the other team "loses." Similarly, in decision theory, a zero-sum game refers to scenarios in which a selection results in the rejection of other possible selections. Intervention selection is more complicated than a 1-versus-1 selection because more than one intervention can be selected for a session. However, we could refer to the number of interventions selected within a session as 1 set. A set is full when the number of interventions performed requires all of the time allotted for a session. Once the set is "full," adding additional interventions to the set must result in the rejection of one or more techniques to "make time" for the new intervention. Similarly, if one system is compared to another, those sets would replace one another in a session because there is not adequate time to perform both sets. Asserting that intervention selection is a zero-sum game, or asserting that selecting interventions requires that other interventions are not selected, may seem obvious. However, if this assertion were not true, there would be no need to compare interventions because an infinite number of techniques could be selected and combined.

Research has demonstrated that the "best intervention" has to be a set.

Research has demonstrated with near ubiquity that a combination of interventions is more effective than any intervention performed alone (e.g., the combination of manual therapy and exercise is more effective than manual therapy or exercise performed alone). This does not imply that the addition of any technique will improve outcomes. The addition of relatively ineffective techniques may not significantly improve outcomes, and some techniques may result in similar effects, which would result in no additional improvement in outcomes. Further, some combinations of techniques may not result in additive value for unknown reasons. For example, adding joint mobilizations to joint manipulations for the same segment may not result in an additional benefit because the techniques result in similar effects. The issue arising from combining relatively ineffective techniques is addressed by prioritizing technique selection based on expected value, and the issue with selecting similar techniques is addressed with better labeling and sorting, both of which are discussed below.

More on sets:

The number of interventions selected within a full set may not be fixed, but it may be optimized. Obviously, session length, session frequency (e.g., sessions/week), and total number of sessions are parameters that may be altered. However, there is an upper limit that is the result of a patient's tolerance for an increase in time spent in therapy and patient resources. Further, additional research should consider session efficiency (outcome/session time). The author has identified two interesting corollaries of the thought experiment above that may have a significant effect on how practice and our professions evolve and a significant impact on session efficiency. First, if the best possible combination of techniques results in the highest expected value, then session length should be optimized to "fit" that combination. This could be 35 minutes, 90 minutes, or a combination of session lengths for initial and follow-up sessions. Second, as mentioned above, a professional's scope of practice should be determined by the optimal full set of techniques for a patient population. However, here, we consider the impact of a patient having to see multiple professionals. If a patient must see multiple professionals in separate sessions to achieve the optimal set of interventions, because the optimal set of interventions is spread across the scopes of practice of several professions, the result is a large loss in session efficiency and potentially a significant decrease in outcomes. For example, suppose the optimal intervention set included movement assessment, dry needling, manipulation, exercise, kinesiology tape, and a home exercise program, and the combination of these techniques in a single session had additive effects resulting in the best possible outcomes. Spreading these techniques across 3 sessions with 3 different professionals may result in a 3-10-fold increase in time and money spent (when adding transportation, documentation, wait times, and the fees for each practitioner) and a significant reduction in outcomes, as additive benefits may be lost depending on the amount of time between sessions. Scopes of practice should likely be reorganized to match patient populations, and education should match the interventions that have demonstrated the highest expected value for that patient population. The author's idea for achieving this is that professions with similar scopes are combined into a doctorate of physical medicine (e.g., DCs, ATCs, PTs, OTs, and acupuncturists), and university education or certificate programs include specialization based on populations (e.g., elective tracks in the 3rd year of a clinical doctorate with complementary clinical affiliations). This is similar to how physician education is currently delivered.

Prioritizing based on "Best Interventions" will result in "Best Outcomes".

When multiple interventions are considered, the expected outcome is determined by summing the expected values of each intervention. Prioritizing interventions with the highest expected values will yield the most favorable outcomes. Since an intervention's efficacy is relative and session time is limited, selecting and prioritizing interventions based on their expected value (relative efficacy) is necessary to achieve the best possible outcomes.

Nobody Wants the 2nd Best Option.

Although some individuals may respond better to an intervention that is different from the intervention with the highest expected value (e.g., different responders), nobody would choose to start with that intervention. This idea is best illustrated with a thought experiment. Imagine you have two techniques to choose from. The first technique is the best choice for 70% of patients, and the other technique is the best choice for 30% of patients. It is unlikely anyone would choose to start with the methodology that is less likely to be the best option, even if, after trying the two interventions, the 30% technique is the best option for this patient. Without additional information, there is no reason for the practitioner or patient to believe they are a "different responder." Note that in most physical rehabilitation settings, if the 70% technique did not work or did not work well, you could try the 30% technique next, with the worst-case scenario being that the technique is attempted during the next session. The issue of not prioritizing based on the highest expected value becomes more obvious when we add additional techniques. For example, let's consider a scenario with 4 possible techniques with the probability of having the best effect on 55%, 30%, 10%, and 5% of patients. If a practitioner started with the 5% technique and then performed the 10% technique, the likelihood of those techniques resulting in the best outcome is just 15%. Alternatively, if the practitioner starts with the 55% technique and then the 30% technique, the probability of the best possible outcome is 85%. In summary, prioritizing interventions based on the highest expected value ensures that most individuals receive the best possible approach in the fewest number of attempts. The most accurate source of information on average outcomes is peer-reviewed and published comparative research (discussed below). However, a type of "trial-and-error" is necessary to refine intervention selection and ensure the "best approach" for individual patients in practice. That is, a methodology of assessment, intervention, and reassessment should be used to determine if an intervention was effective, in combination with attempting interventions in rank order of their expected values as demonstrated by research.
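The arithmetic of this thought experiment can be verified with a few lines of Python; the probabilities below are the hypothetical values from the example above.

```python
# A small sketch of the ordering argument above. Probabilities are the
# hypothetical values from the text: four techniques are the best choice
# for 55%, 30%, 10%, and 5% of patients, respectively.

best_choice_rates = {"A": 0.55, "B": 0.30, "C": 0.10, "D": 0.05}

def chance_best_found(order: list, attempts: int = 2) -> float:
    """Probability that the best technique for a patient is among the
    first `attempts` techniques tried, in the given order."""
    return sum(best_choice_rates[t] for t in order[:attempts])

print(chance_best_found(["D", "C", "B", "A"]))  # 0.15 (worst-first ordering)
print(chance_best_found(["A", "B", "C", "D"]))  # 0.85 (expected-value ordering)
```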

The Problem with Current Practice is Random Sorting and the Practitioner's Preference

If it is not standard practice to prioritize interventions based on expected value, then it is more likely that the order of techniques performed will be at least partially random. This logic is similar to expecting six dice to arrange themselves from highest to lowest simply by rolling them, or even rolling just half of them. Even if 3 dice had known values and could be ordered in advance, rolling the other 3 dice is likely to result in an order that is less than optimal. The same scenario occurs in practice, in which some techniques with known higher expected values are prioritized, but a lack of knowledge (or potentially a lack of effort) results in a random ordering of additional interventions. Note that this lack of knowledge is often not due to a lack of data. More research is available than is currently being effectively applied to practice (this is discussed further below).

It may be argued that the random sorting problem is less problematic than the practitioner preference problem. Many professionals choose interventions based on allegiance to a modality or "school of thought" and/or comfort with a specific intervention type. Consider that comparative research demonstrates that manual therapy is likely more effective than exercise in the acute phase of orthopedic pain; further, the combination of both manual therapy and exercise is more effective than either alone (Active vs. Passive) (7). Yet, some professionals refuse to perform manual therapy, claiming allegiance to "active approaches." As mentioned above, some professionals will prioritize IASTM over specific manual techniques despite research suggesting that specific manual techniques have a larger positive effect on outcomes (3). And, perhaps most obviously, therapeutic ultrasound (US) is less effective than manual therapy or exercise, interventions likely to require the majority of a session's length, and yet US is still implemented within sessions, reducing the time available for more effective interventions (1). Of course, there is a possibility that these interventions with lower expected values will be the best intervention for an individual patient. However, practitioners should not fall prey to the availability heuristic, assuming that an intervention has a higher expected value because they can remember a really great outcome once or twice. The goal is to achieve the best possible outcomes for all patients; the goal is not to match needle-in-a-haystack results to needle-in-a-haystack patients. If one of these lower expected value interventions is the best approach, and a practitioner is prioritizing based on expected value, then eventually, the intervention will be matched to the patient. However, starting with the interventions with the highest expected values ensures that most patients attain the best possible outcomes in the fewest sessions.

What about patient preference?

Although most patient preference issues may be politely addressed with patient education (e.g., "I understand that you enjoy manual therapy, but for this issue, this exercise is more effective."), some patient preferences deserve consideration. For example, some patients are very scared of manipulations, thanks to false notions that the risk of adverse events is high (for more on this topic - click here ) (12). However, whether these notions about manipulations are false or not may not matter. Research also suggests that low and moderate stress is unlikely to affect manipulation outcomes; however, high stress may significantly affect manipulation outcomes (for more information - click here ) (13). This implies that if someone is scared of manipulations, they should not receive them. However, this does not change the approach to determining the best possible intervention plan. For this patient, the best possible outcome is likely to be achieved by the next most effective intervention, which in this case is likely joint mobilization. In short, patient preference can influence intervention selection; however, it does not change the methodology for achieving the best possible intervention.

Issues with the Prioritization by Expected Value Approach

Determining the best combination of interventions should start with prioritizing techniques with the highest expected values. However, prioritizing interventions based on expected values has two potential issues.

The first is the potential of redundancy within the selection of techniques with the highest expected values. That is, it is likely that research will demonstrate that the 1st and 2nd most effective techniques are from the same category of techniques with similar effects. For example, research may demonstrate that the thoracic screw manipulation and the thoracic pistol manipulation are the most effective techniques for improving thoracic mobility. Although these techniques are likely both effective, it is unlikely that performing both techniques in sequence will result in a significant improvement in outcomes when compared to performing either technique alone, and especially when compared to following one technique with another relatively effective technique from a different category of techniques (e.g., ischemic compression or specific exercise).

The second related issue is the potential for some combinations of techniques to result in larger additive effects than the sequence of techniques dictated by a simple rank order of techniques with the highest expected values. This assertion accounts for the potential of synergistic relationships of techniques. For example, the interventions with the highest expected values for cervical pain may be joint manipulations, dry needling, IASTM, and manual release techniques (all manual therapy techniques for mobility); however, the combination resulting in the best possible outcomes may be dry needling, joint manipulation, home exercise program, and kinesiology tape (a balance of mobility, stability, and self-management techniques).

This issue may be solved with better sorting, labeling, and potentially modeling. Many techniques are variations of similar techniques. If techniques were appropriately labeled and categorized by type and effect, then the best interventions in each category could be selected, and the best set of interventions would be the best techniques from each category within a rank order of the most effective categories. With optimal labeling and categorization, the only necessary change to prioritizing based on expected value would be the selection of best categories versus the selection of individual interventions. For example, in the thoracic manipulation scenario discussed above, it is likely that choosing the best thoracic manipulation technique, followed by the best of the techniques from the 2nd best category of interventions, would result in better outcomes than two thoracic manipulations performed sequentially. Determining synergies between categories of techniques may be developed in practice; however, the relationship between assessment, categories of interventions, and outcomes likely implies a need for some modeling of an intervention plan. This issue is discussed further below.
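A minimal sketch of this category-based selection, with hypothetical categories, techniques, and expected values, might look like the following: the single best technique is kept within each category, and the categories are then ranked by the expected value of their best technique.

```python
# A minimal sketch of category-based selection: pick the single best
# technique within each category, then rank categories by that best value.
# Categories, techniques, and expected values are hypothetical placeholders.

techniques = [
    ("thoracic screw manipulation", "joint manipulation", 2.4),
    ("thoracic pistol manipulation", "joint manipulation", 2.3),
    ("ischemic compression", "soft tissue", 1.9),
    ("IASTM", "soft tissue", 1.2),
    ("chin tuck progression", "specific exercise", 1.7),
]

best_per_category = {}
for name, category, ev in techniques:
    if category not in best_per_category or ev > best_per_category[category][1]:
        best_per_category[category] = (name, ev)

# One technique per category, categories in rank order of expected value;
# this avoids sequencing two near-redundant techniques from one category.
plan = sorted(best_per_category.values(), key=lambda t: t[1], reverse=True)
for name, ev in plan:
    print(name, ev)
```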

The Goal of Assessment: Decision Theory

The Goal of Assessment

From the perspective of decision theory, the goal of an assessment is to differentiate patient populations into subgroups that achieve optimal outcomes from different intervention plans. That is, if an assessment can be used to identify a subgroup of patients who respond better to an intervention that is different from the intervention with the highest expected value based on more general criteria (e.g., different responders), then interventions may be reprioritized to optimize the expected value for the different responders. Further, this is also likely to increase the average expected outcome for the original group, whose averages may have been negatively affected by the lower-than-average results exhibited by the "different responders." The more accurately an assessment identifies a subgroup, the more assessments that can be found to identify additional subgroups, and the better the prioritization of interventions (or categories of interventions) based on expected value, the more the expected outcomes for all patients are likely to increase. The upper limit to the division of subgroups resulting in better results would be the number of assessments that result in additional changes in the reprioritization of techniques based on expected value.

Referring to some of the examples above, if it is possible to use an assessment before treatment to identify the 30% of patients that exhibit a larger effect size from an intervention that does not generally result in the best outcomes (different responders), then that intervention may be prioritized higher for those patients. This reduces the number of patients who would have to wait for the technique to be attempted second. A common example of this idea is the assessment of a range of motion, in which the identification of hypomobility results in the practitioner performing mobility techniques, and the identification of hypermobility results in the practitioner performing stabilization techniques. The identification of accurate and relevant assessments may also address some of the issues with allegiance to modalities based on a few remarkable results (e.g., availability heuristic bias). If an assessment can differentiate the subgroup of remarkable responders, it may be possible to increase the reliability of the intervention several times, increasing the expected value of the technique, which would result in reprioritization of the technique for those remarkable responders.
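A small sketch of this reprioritization, again with entirely hypothetical values, shows how a positive assessment finding could swap the rank order for a subgroup.

```python
# A small sketch of reprioritization by subgroup: if an assessment assigns
# the patient to a subgroup with its own (conditional) expected values,
# rank by those instead of the population-wide values. All numbers are
# hypothetical placeholders.

population_ev = {"mobility work": 2.0, "stabilization work": 1.2}
subgroup_ev = {
    "hypomobile": {"mobility work": 2.8, "stabilization work": 0.8},
    "hypermobile": {"mobility work": 0.6, "stabilization work": 2.5},
}

def prioritized(subgroup=None):
    # Fall back to population-wide expected values when no subgroup is known.
    ev = subgroup_ev.get(subgroup, population_ev)
    return sorted(ev, key=ev.get, reverse=True)

print(prioritized())               # population-wide ranking
print(prioritized("hypermobile"))  # stabilization now ranks first
```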

Assessment Issues

Assessments should not be selected if they fail to result in a re-prioritization of interventions that result in better outcomes. If an assessment does not change intervention selection (by reprioritizing interventions based on a more specific expected value), it should not be performed. Assessments that will not change the intervention plan have no value. Note that the identification of "red flags" for referral to another professional is a change in the intervention plan. In short, red flag assessments are sub-dividing the patient population in such a way that the intervention with the highest expected value exists, but you do not perform that intervention. It is recommended that all professionals take the time to ask the following questions before each assessment, "What will I do if I get a positive result from this assessment?" and "What will I do if I get a negative result from this assessment?". For any assessment in which the answer to these two questions is the same, the assessment should be discarded.

Additionally, short-term subjective measures of pain and symptoms (e.g., pain intensity scales, subjective evaluations, and patient history) lack specificity. Although our goal is to resolve a patient's complaints, the complaints themselves are unlikely to be the best indicators of what interventions result in the best outcomes. This fact is demonstrated by the effectiveness of a placebo on short-term subjective pain despite having little to no influence on long-term, reliable, objective outcome measures. For example, performing lower back effleurage may result in a significant reduction in current lower back pain but is unlikely to have any effect on the changes in recruitment and motion that are correlated with improvements in long-term objective outcomes. Another issue with subjective measures of pain and complaints is related to the lack of accuracy of our sensory system. Even the most educated professional can't perform a differential diagnosis based on their pain alone. For example, it may not be possible for a professional to differentiate between a sore throat from an allergy (post-nasal drip), a virus, strep throat (a bacteria), or throat cancer. However, these conditions require very different interventions. Last, symptoms alone often do not account for a patient's functional capacity. For example, a patient who leaves your office with a resolution of low back pain that was initiated by a collision on the basketball court is a positive outcome. However, if they do not return to playing basketball, it is not a full recovery.

Lastly, the effect an intervention has on outcomes is more important than the correlation between the factor the intervention affects and the pathology. This is a fact that has been missed by several "experts" in the field. The factors that correlate most with the experience of pain are not as important as the magnitude of change that can be made to a correlated factor and its effects on outcomes. This is similar to the question, "Would you prefer 70% of $50 (which is $35) or 25% of $100 (which is $25)?" In economics classes, this analogy is used to show how focusing only on percentages or totals, without considering the absolute amount, can lead to suboptimal decision-making. In physical medicine, the pursuit of evidence-based practice has led some to base intervention selection on research investigating the factors correlated with the experience of pain rather than on comparative research investigating the effects of interventions on outcomes. The fallacy resulting from treating the most correlated factors without consideration for absolute outcomes is illustrated most clearly by the "pain science" movement. Despite biopsychosocial factors, such as kinesiophobia, being highly correlated with pain, pain neuroscience education (PNE) is among the least effective interventions for treating pain (PNE Research Review) (1). It might be hypothesized that even if cognitive factors have a stronger correlation with pain than biomechanical factors, the current interventions available for treating cognitive factors are so ineffective that outcomes are far better when treating weakly or moderately correlated biomechanical issues that can be treated very effectively.

Defining Which Outcome Measures We Should Address

The best approach should likely consist of selecting interventions with the highest expected value based on the largest carry-over effects on the treatable factors correlated with the best short-term and long-term patient outcomes. Identifying reliable, accurate assessments for these treatable factors is important because it aids in the refinement of intervention selection during a session. For example, if an attempt is made to increase range of motion (ROM), and a technique results in no change in the goniometric assessment of that ROM, then the professional can select a different technique. This is the refinement of intervention selection in practice that is discussed above with the method of "assess-address-reassess" and trial-and-error progress through a list of techniques prioritized by expected value.
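A minimal sketch of this assess-address-reassess loop, with hypothetical function names and an illustrative improvement threshold, might look like the following.

```python
# A minimal sketch of the assess-address-reassess loop: try techniques in
# rank order of expected value, reassess after each, and keep the first one
# whose reassessment shows a meaningful objective change. Function names,
# the simulated assessment, and the threshold are hypothetical placeholders.
import random

def assess_rom() -> float:
    """Placeholder for a reliable, objective assessment (e.g., goniometric
    ROM). Simulated here with random values purely for illustration."""
    return random.uniform(90.0, 110.0)

def assess_address_reassess(ranked_techniques, apply_technique, minimum_change=5.0):
    baseline = assess_rom()
    for technique in ranked_techniques:   # rank order of expected value
        apply_technique(technique)        # address
        result = assess_rom()             # reassess
        if result - baseline >= minimum_change:
            return technique              # this technique "worked"
        baseline = result                 # new baseline for the next trial
    return None                           # none effective: reconsider the plan

# Example usage with a no-op stand-in for performing a technique:
chosen = assess_address_reassess(
    ["joint manipulation", "dry needling", "IASTM"], lambda t: None)
print(chosen)
```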

Ideally, the assessments chosen would be reliable, objective outcome measures of the treatable factors most correlated with the best short-term and long-term outcomes. However, there is insufficient research to define all of the most treatable factors correlated with all conditions observed in a clinical setting. For example, there is far more research to aid in the selection of assessments for the shoulder than there is for the elbow. Additionally, not all treatable correlated factors have a reliable method of assessment. For example, there is no goniometric assessment for measuring the optimal range of anterior and posterior tipping of the scapula, despite research demonstrating that excessive anterior tipping is correlated with shoulder impingement syndrome (SIS) (8). Despite these facts, and as mentioned above, there is more research available than is being effectively used for the refinement of practice. For example, research has demonstrated that the Overhead Squat Assessment can be used to reliably identify signs correlated with dysfunction (e.g., knee valgus), and research has demonstrated that reducing knee valgus reduces knee pain and/or the risk of future knee injury.

Additionally, it is possible that short-term changes are not congruent with long-term outcomes. Although this is a larger issue when using subjective measures, the potential still exists for short-term objective measures to be incongruent with the best possible interventions for long-term results. The solution to this issue may be the additional observation of whether an intervention has "carry-over" effects. That is, how much of the change exhibited at the end of a session was maintained at the start of the subsequent session. If assessments and interventions are chosen not only based on in-session effect size but also on the effect size exhibited at the beginning of the subsequent session, it seems very unlikely that there would be an additional incongruence between session-to-session carry-over and the best possible long-term outcomes. To demonstrate why carry-over is so important, we may consider the effects of ice and other cryotherapies. Ice can be expected to be very effective in reducing pain, and that reduction of pain may improve function. But these changes are likely to be very short-term and result in little to no carry-over from session to session. Compare this to the results of joint manipulation, in which there is an expectation that some of the mobility achieved will be maintained in the subsequent session. This is especially true when manipulations are combined with home exercise. Although the addition of home exercise represents an additional intervention, it would not be expected that ice would improve the carry-over effects of exercise alone. In summary, it is necessary that the best interventions be chosen based on expected value; however, it may be more accurate to base selection on the expected value exhibited at the beginning of the subsequent session.
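To make the carry-over weighting concrete, consider a hypothetical comparison using the $W_{\text{LT}}$ weighting factor from the formal proof above (all numbers are illustrative only):

$$E_{\text{ice}} = F \times V \times W_{\text{LT}} = 0.9 \times 3.0 \times 0.1 = 0.27 \qquad E_{\text{manipulation}} = 0.7 \times 2.0 \times 0.8 = 1.12$$

Ice has the larger immediate expected value ($0.9 \times 3.0 = 2.7$ vs. $0.7 \times 2.0 = 1.4$), but once each is weighted by session-to-session carry-over, manipulation is clearly the higher priority.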

Caption: What is the expected value of dry needling? Where should this be prioritized on our list of effective interventions?

Section 2: Further Refinements for Achieving the "Best Approach"

The Best Intervention Should Be Determined by Comparative Research Whenever Possible.

To prioritize the "best interventions" with the most accuracy, it is necessary to base the relative effectiveness of an intervention on the most accurate data available. Unfortunately, it is unlikely that sufficient data will be available on the reliability and average effect size of every intervention. However, it may be assumed that expected value is the product of reliability and effect size (as discussed previously), including carry-over effects. The most accurate source of information on average outcomes is peer-reviewed and published research (referred to throughout this article as "research"). Further, because the goal is prioritizing intervention selection based on relative efficacy, the research sourced must be comparative.

More on "Why Research?"

Although it is tempting to suggest that only "high-quality" research should be sourced, this assertion results in a reliance on "levels of evidence hierarchies/pyramids," which are generally flawed (Sample Pyramids) (10). Worse, reliance on these pyramids most often results in the completely ridiculous dismissal of the majority of available research, a kind of inadvertent cherry-picking that results in a loss of data that could aid in decision-making. To prove this point, ask yourself and five of your colleagues what metric is being used to rank the evidence in these hierarchies. Most individuals will say "quality," but that is not an answer. Quality is as subjective a term as the word "beautiful." Unless quality is based on an objective measure, it is just a subjective or arbitrary term that should not be used to determine conclusions in a scientific field. For example, you may feel that a Chevrolet is of higher quality than a Toyota, but without some metric to support that point, it is just your opinion. There are several objective measures that could be used, including the rate of errors, the relative accuracy of conclusions, or even the probability of the influence of bias. However, these metrics would require a study comparing a large number of studies to determine the average rates of these metrics.

Further, these hierarchies create an additional issue when attempting to compare one study to another. For example, if level-2 research is better than level-3 research, then by how much? Does one level-2 study supersede five level-3 studies? How about ten level-3 studies? These relatively simple questions demonstrate the flawed logic of attempting to rank research based on vague notions of quality that lack the accuracy to be useful. It is unlikely that any useful comparison of research types can be derived from anything other than large studies of the relative accuracy of currently published research. Note that methodology and study design are not even logical ways to rank research: it is just as likely to have a good or bad randomized controlled trial as it is to have a good or bad prospective observational study. And that still does not address the separate issue that different study designs result in different types of information, and the type of information needed depends on the research question being asked. Note that an article is in the works to delve into these issues of interpreting research in more detail.

The truth is that levels of evidence hierarchies are based on vague notions of rank without a correlated measure. However, if these hierarchies were based on the number of controls for biases and errors inherent in a type of data, this could be used as an approximation of the relative accuracy of very different types of data. Further, if these hierarchies were simplified to ensure that the ranking is only as accurate as can be derived from this method, they would result in more honest interpretations of data. For example, peer review, careful design of an experiment or observation, and analysis with statistics are all controls that aid in reducing biases and errors, including selection bias, personal bias, the availability heuristic (discussed above), etc. In short, peer-reviewed published research includes a variety of control measures that should result in a relative accuracy superior to the assertions a single individual can make. However, since all research includes these traits, research types cannot be differentiated based on these criteria. Further, we have to be careful about what we choose as a control to differentiate research, because it may or may not affect the data we are looking for. For example, all peer-reviewed published research must attain IRB approval and pass peer review, and for our purposes, we need comparative research. However, whether or not a study has a control group does not result in inherently better data if the study compares two interventions whose effect sizes are known to be greater than doing nothing. What can be asserted is that confirmation by repetition across multiple studies is likely more accurate than a single study. Further, it can likely be asserted with confidence that a research study has more controls than expert opinion based on clinical experience without formal analysis, and that clinical experience results in a more controlled gathering of data than non-expert opinion. Using this simple but supportable logic, the following simplified but more accurate hierarchy can be developed:

  • A Better Level of Evidence Hierarchy
    1. More research studies are generally better than one study.
    2. Research is better than a single case.
    3. Objective outcome measures from practice are better than expert opinion.
    4. Expert opinion is better than non-expert opinion.
    5. Non-expert opinion is not reliable.

Research Must Be Comparative

This relatively simple hierarchy highlights that research, the disciplined application of the scientific method with analysis via statistics, reviewed by a group of experts, is the best tool we have for reducing the influence of bias and error and ensuring "optimal accuracy." For the purposes of choosing the "best possible approach," we need the most accurate information. Further, the research we use must be comparative because the intent is to determine relative efficacy. Comparative studies may include both experimental and observational research and may or may not include a control group. Although experimental research and randomized controlled trials add additional information to our analysis that may be helpful, the only necessary aspect of the research design for our purposes is that at least two interventions were compared. Note that non-comparative studies (including controlled studies) should not be compared to one another due to the indeterminable number of confounding variables. This includes the use of meta-analysis (MA) in our field. An MA is little more than an average of averages and actually increases the potential for many biases and errors. Serious interpretation errors are occurring en masse in our industries with regard to the strength of conclusions that can be derived from an MA. In our fields, it is at least as likely that a failure to refute the null is due to regression to the mean (e.g., an underpowered study) as it is that a variable failed to have an effect (e.g., a true null). For example, if you performed an MA including nine studies, seven of which demonstrate that one intervention results in significantly better outcomes than the other, two of which show that both interventions result in statistically similar results, and one of those two demonstrated a clear trend similar to the other seven but failed to reach statistical significance, then there was no reason to do the MA. The research already demonstrated a clear trend. The only new information that an MA can add to this scenario is "a failure to refute the null," and the fact that this is the only additional information that can be added represents a serious bias. Further, if the MA results in "a failure to refute the null," then it could be said with near certainty that there is an issue with the MA; the problem is obviously not the intervention. Again, an article is in the works to address these issues with research interpretation in more detail.

Simpler, But More Accurate, Interpretation

The issues discussed above regarding subjective ratings of quality, levels of evidence hierarchies, and comparative research methods imply that a simpler analysis may be more accurate. In summary, the highest "quality" information available (based on the most controls for reducing bias and error) is peer-reviewed research, and to develop a prioritization of interventions based on relative efficacy, we need comparative research. Further, a simple rubric is needed to aid in interpreting the findings of multiple comparative studies, ensuring that interpretation remains objective and systematic. The technique we recommend for interpretation is known as "vote counting." Note that in our field, "votes" are usually not close: rarely does a study comparing two interventions fail to demonstrate either a clear trend toward one intervention or little difference between the two. The following is the "vote-counting" interpretation rubric used by the Brookbush Institute for all of our systematic reviews and courses, followed by a short sketch of the rubric in code.

  • Comparison Rubric (The goal is to use the research available to determine the most likely trend).
    • A is better than B in all studies = Choose A
    • A is better than B in most studies, and additional studies demonstrate that A and B result in similar results = Choose A
    • A is better than B in some studies, but most studies demonstrate that A and B result in statistically similar results = Choose A (but with reservations)
    • A is better than B in some studies, some studies show similar results, and some studies show B results in better outcomes = Results are likely similar (unless a reason can be identified that explains the difference in study results, e.g., participant age, sex, injury status, etc.)
    • In some studies, A is better than B, and other studies demonstrate B is better than A. Unless the number of studies overwhelmingly supports one result, results are likely similar.
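The rubric above can be expressed as a small Python function. This is a minimal sketch, assuming each study's finding has already been reduced to a single label ("A", "B", or "similar"); the function name and the 3-to-1 "overwhelming majority" threshold are illustrative assumptions, not a published specification.

```python
# Minimal sketch of the "vote-counting" rubric above. Each study's finding
# is recorded as "A", "B", or "similar" before counting.
from collections import Counter

def vote_count(findings: list[str]) -> str:
    votes = Counter(findings)
    a, b, similar = votes["A"], votes["B"], votes["similar"]
    if a and b:
        # Mixed wins: require an overwhelming majority before choosing
        # (the 3-to-1 threshold is an illustrative assumption).
        if a >= 3 * b:
            return "Choose A (overwhelming majority)"
        if b >= 3 * a:
            return "Choose B (overwhelming majority)"
        return "Results likely similar (look for subgroup explanations)"
    winner = "A" if a else "B"
    wins = max(a, b)
    if wins == 0:
        return "Results likely similar (all studies similar)"
    if similar == 0:
        return f"Choose {winner} (better in all studies)"
    if wins >= similar:
        return f"Choose {winner} (better in most studies, rest similar)"
    return f"Choose {winner}, with reservations (most studies similar)"

print(vote_count(["A", "A", "A", "similar"]))  # Choose A (better in most studies, rest similar)
print(vote_count(["A", "similar", "similar"]))  # Choose A, with reservations (most studies similar)
print(vote_count(["A", "B", "similar"]))        # Results likely similar (look for subgroup explanations)
```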

When Additional Information is Needed

Although research should be the primary source for determining the best intervention, not all interventions have been investigated in peer-reviewed and published studies and/or compared to other interventions in studies. Research may permit some indirect comparisons; however, these types of indirect comparisons can rarely be made with confidence. For example, research might imply that joint manipulations result in better outcomes than IASTM, and IASTM results in better outcomes than ultrasound, so we can assume that manipulations result in better outcomes than ultrasound. Although this logic is likely reasonable for this example, the potential for each technique to have different effects that may or may not be useful in different population groups suggests that these indirect comparisons should be used sparingly. Further, "an absence of evidence is not evidence of absence." A lack of research cannot be used to imply that an intervention is ineffective. When the necessary research is unavailable, there is simply no evidence in the research of whether something is more or less effective. When this occurs, other methods must be used to determine the relative efficacy of interventions.

When research is unavailable, comparison in practice is essential. However, these comparisons should not be based on a professional's gut estimates of trends. As mentioned above, we are prone to a variety of well-researched biases (confirmation bias, the availability heuristic, anchoring, a failure to recognize exponential growth as different than linear, loss aversion, etc.). If comparisons must be made in practice due to a lack of research, then interventions should be compared based on reliable, objective outcome measures. Further, when possible, the effects of two interventions should be compared with the same patient, and that comparison should be repeated with multiple patients before a conclusion is made. When outcomes can be confidently predicted (e.g., A performs better than B with few exceptions), then this result should be documented, and interventions should be reprioritized based on this new information. In short, when research is unavailable, the professional should attempt to approximate an experiment in practice.
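The paragraph above can also be approximated in code. Below is a minimal sketch of a within-patient comparison tally, assuming the same objective outcome measure (e.g., change in goniometric ROM) is recorded after each of two interventions for each patient; the minimum number of repetitions and the 80% win-rate threshold are hypothetical choices, not established standards.

```python
# Minimal sketch: approximating an experiment in practice. Each patient is
# measured on the same objective outcome after intervention A and after
# intervention B (on comparable visits); a conclusion is only drawn after
# repeated within-patient comparisons.

def compare_in_practice(paired_outcomes: list[tuple[float, float]],
                        min_patients: int = 8,
                        win_rate_needed: float = 0.8) -> str:
    if len(paired_outcomes) < min_patients:
        return "Insufficient repetitions: keep collecting paired outcomes."
    wins_a = sum(1 for a, b in paired_outcomes if a > b)
    win_rate = wins_a / len(paired_outcomes)
    if win_rate >= win_rate_needed:
        return f"A outperformed B in {win_rate:.0%} of patients: reprioritize A."
    if win_rate <= 1 - win_rate_needed:
        return f"B outperformed A in {1 - win_rate:.0%} of patients: reprioritize B."
    return "No confident trend: treat A and B as similar for now."

# Example: change in goniometric ROM (degrees) after A vs. B, per patient.
pairs = [(8, 5), (7, 6), (9, 4), (6, 6), (10, 7), (8, 3), (7, 5), (9, 6)]
print(compare_in_practice(pairs))
# -> A outperformed B in 88% of patients: reprioritize A.
```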

There is more research available than is currently utilized.

One critique of this article during initial reviews was that if a single "best" approach exists, the current body of research is insufficient to establish it. While I acknowledge that there are significant gaps in the current body of research, I can confidently assert that far more research exists than is presently being applied in practice. The Brookbush Institute aims to build the first comprehensively evidence-based education company, where every course begins with a comprehensive, systematic review of all relevant research on a given topic. Our goal is to achieve an unparalleled level of accuracy, ensuring every conclusion is meticulously derived from all available peer-reviewed data.

What has been particularly surprising is that, without exception, each topic we’ve reviewed has revealed dozens, if not hundreds, of studies overlooked in previously published reviews. These studies frequently provide critical insights, answering nuanced questions and adding important details. The issue with the "not enough research" critique is that it overlooks the significant progress that could be made by leveraging all of the currently published research. Even if this progress falls short of identifying a single "best" approach, it still represents significant movement toward one.

Experimentation is Necessary to Overcome Local Maxima

In the context of physical rehabilitation interventions, a local maximum can be defined as a treatment approach that yields a high level of effectiveness based on current knowledge and evidence. This approach may be considered optimal within the scope of the strategies that have been researched and tested up to this point in time. However, it may not represent the global maximum, which could be defined as the absolute best possible intervention strategy. To investigate whether better outcomes are possible, it is necessary to experiment with new interventions and new combinations of interventions. This experimental approach may reveal outcomes that are better than the local maxima, uncovering strategies that achieve a higher expected value than previously thought possible. However, this can only be accomplished by exploring approaches different from those with known expected values, including the approach with the highest known expected value.

A note regarding the barrage of recommendations on social media: although someone without expertise (e.g., a non-professional) or with little experience (e.g., a new graduate) may stumble upon a new best approach, this is very unlikely in a mature scientific field. It is far more likely that these individuals will repeat interventions that have been tried previously and/or attempt approaches that would have been easily dismissed with a better understanding and more experience. In contrast, an educated, experienced professional is more likely to understand the limitations of existing interventions, identify subtle nuances and variations in interventions, and be aware of unexplored areas of treatment. Although the process of experimentation is inherently uncertain, it is far more likely to be successful with a systematic approach and knowledge of previous successes and failures.

As mentioned above, conclusions derived from experimentation should not be based on a professional's gut estimates of trends. If comparisons must be made in practice due to a lack of research, then interventions should be compared based on reliable, objective outcome measures, and the professional should attempt to approximate an experiment in practice. Our initial recommendation to address this issue is what we call "giving yourself 5 minutes to suck." If we assume sessions are 60 minutes (adjust accordingly if your sessions are longer or shorter), then 5 minutes are spent on an experiment with a new technique, and 55 minutes are spent on interventions based on known expected outcomes. It is unlikely that 5 minutes of a well-thought-out experimental intervention can have a negative effect greater than the 55 minutes of best-approach interventions; however, the cumulative learning of 5 minutes per session may result in significant improvements in expected outcomes over time. The name of this recommendation is inspired by the idea that it is okay to be "not great" for 5 minutes with the intent to continue learning.

This recommendation parallels Nassim Nicholas Taleb's antifragile "barbell" strategy. His application is to financial investments, where a portfolio is split between two extremes: low-risk, conservative investments (perhaps 95% of a portfolio) on one side and high-risk, high-reward opportunities (perhaps 5% of the portfolio) on the other (11). In physical rehabilitation, the "barbell" approach dedicates most of the session (55 minutes) to proven interventions (the low-risk side) while reserving a small portion (5 minutes) for experimental techniques (the high-risk, high-reward side). This ensures the majority of the session delivers reliable, effective treatment while minimizing negative outcomes. Meanwhile, the controlled use of novel techniques allows for incremental learning and potential breakthroughs without jeopardizing the session’s results. Over time, these "5-minute experiments" could lead to innovative strategies that enhance long-term patient outcomes, much like how small, calculated risks in the barbell strategy can yield outsized gains in a financial portfolio.
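A minimal sketch of this 55/5 split in Python appears below. The 60-minute session and 5-minute experimental slot follow the article; the greedy fill of proven interventions in rank order of expected value, and all intervention names and durations, are illustrative assumptions.

```python
# Minimal sketch of the "5 minutes to suck" / barbell session split.

def plan_session(session_minutes, ranked_interventions, experiment_minutes=5):
    """Fill the session with proven interventions in rank order of expected
    value, reserving a small fixed slot for one experimental technique."""
    proven_budget = session_minutes - experiment_minutes
    plan, used = [], 0
    for name, minutes in ranked_interventions:
        if used + minutes <= proven_budget:
            plan.append((name, minutes))
            used += minutes
    return {"proven": plan,
            "experimental_slot_minutes": experiment_minutes,
            "unused_minutes": proven_budget - used}

ranked = [("Joint Mobilization", 30),        # highest expected value first
          ("Static Manual Release", 20),
          ("Core Stability Exercises", 40)]
print(plan_session(60, ranked))
# {'proven': [('Joint Mobilization', 30), ('Static Manual Release', 20)],
#  'experimental_slot_minutes': 5, 'unused_minutes': 5}
```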

Modeling is Likely Necessary

In this context, "modeling" refers to a simplified representation or simulation of a dynamic system used to predict outcomes. Problems with multiple variables that interact and change over time typically require modeling. Additionally, models may aid in tracking multiple variables, allowing for various configurations and strategies to be tested.

The assertions discussed above suggest that modeling is not only recommended but may be necessary to achieve the best approach. First, research has demonstrated that a combination of interventions improves outcomes more than any single intervention. Second, better labeling and categorization can reduce the chances of selecting interventions with redundant effects, which implies that better sorting is a necessary step. Third, prioritization of interventions based on relative efficacy can effectively develop combinations of interventions based on the relative efficacy of categories of interventions. However, as mentioned above, there are likely interventions with synergistic effects, and some interventions may need to be added based on the results of specific assessments. Assessments themselves should be carefully selected to differentiate patient populations into subgroups that achieve optimal outcomes from different intervention plans, aiding in the optimal reprioritization of interventions. And some time needs to be allotted for continued experimentation to ensure additional progress is made toward the absolute best approach (as opposed to a local maximum). Addressing these intricate, multivariate relationships is when modeling is most appropriate. Consider the following statement from our courses, developed following years of refinement of our systematic review process, concerns regarding labeling and sorting, and the development of a replicable system intended to optimize outcomes. Note the explicit mention of modeling that aids in describing the relationship between assessments, interventions, and outcomes.

  • The future is likely the modeling of dysfunction by identifying impairments correlated with a symptom or symptom cluster, identifying reliable and accurate assessments to aid in identifying and differentiating those clusters from other conditions (that would ideally be treated differently), and determining the combination of interventions that result in the best possible objective outcome measures (reliability, effect size, etc.), assuming that those objective outcome measures have also been correlated with short-term and long-term patient outcomes (e.g., short-term and long-term perception of pain, return to function, etc.).

Although this statement was developed before this article, it clearly attempted to describe our continued pursuit of an objectively measurable "best possible approach" through better decision-making. This article refines the development of a model with the following assertions.

  1. Use the outcomes (expected value) demonstrated in comparative research to build an intervention model that prioritizes intervention categories by relative efficacy and further lists the best intervention from each category. (Note that developing categories will require better labeling and sorting of intervention types).
  2. Additionally, assessments should be carefully selected to differentiate patient populations into subgroups that achieve optimal outcomes from different intervention plans, aiding in the optimal reprioritization of interventions for these subgroups.
  3. A methodology of assessment, intervention, and reassessment can then be used to test interventions in rank order of their expected values, refining intervention selection for individual patients in practice.
  4. Last, a small proportion of session time should be allocated to trying new approaches, with the aim of uncovering strategies that achieve a higher expected value than previously thought possible.
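The four assertions above can be sketched as a simple data structure plus a selection loop. This is a minimal illustration in Python, assuming hypothetical subgroup names, rankings, and expected values; a real model would be populated from comparative research.

```python
# Minimal sketch of the intervention model described in the list above:
# categories are ranked by expected value from comparative research,
# assessments route a patient to a subgroup, and assess-address-reassess
# walks the ranked list. All names and numbers are hypothetical placeholders.
from typing import Callable, Optional

MODEL = {
    "muscle_overactivity": [           # subgroup identified by assessment
        ("Static Manual Release", 8.0),
        ("Core Stability Exercises", 4.32),
        ("Joint Mobilization", 2.2),
    ],
    "joint_mobility_restriction": [
        ("Joint Mobilization", 7.92),
        ("Core Stability Exercises", 5.88),
        ("Static Manual Release", 3.0),
    ],
}

def assess_address_reassess(subgroup: str,
                            reassess: Callable[[str], bool]) -> Optional[str]:
    """Try interventions in rank order of expected value until the objective
    reassessment (e.g., goniometric ROM) shows a measurable change."""
    for intervention, _expected_value in MODEL[subgroup]:
        if reassess(intervention):
            return intervention
    return None

# Example: the top-ranked technique fails the reassessment, so the loop
# falls through to the next-highest expected value.
responded = {"Joint Mobilization": False, "Core Stability Exercises": True}
print(assess_address_reassess("joint_mobility_restriction",
                              lambda name: responded.get(name, False)))
# -> Core Stability Exercises
```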

When Modeling is not Helpful

It should be noted that if the goal of a model is to improve outcomes, then the model must be intentionally developed to predict the relationship between interventions and improvements in outcomes. The goal of the model cannot be set at a variable merely correlated with outcomes, nor developed with the sole intent of explaining the causes of an outcome. In this way, there is no such thing as a “reps model” for resistance training unless the goal is performing more reps. Further, various certifications and courses are either using the word "model" incorrectly or have developed models with the wrong goal. Examples include courses that use the word "model" to describe a process of dismissing interventions they don't like, rationalizing interventions they do like, and/or labeling a random assortment of approaches as a model without any consideration of relative efficacy. An example of a model of explanation that is not a model for predicting outcomes is the biopsychosocial (BPS) model. The BPS model is not outcome-driven. It will not improve intervention selection in and of itself. It was developed with the intent of explaining the correlated factors of the pain experience. Attempting to base intervention selection on the BPS model ignores the relative efficacy of interventions, the prioritization of interventions based on efficacy, and the intent to optimize reliable, objective outcome measures. This is not the fault of the BPS model; it is the fault of individuals who have attempted to use a model of causal explanation to refine outcomes. In summary, any group of things is not a model, and any use of the word "model" that lacks the intent to predict outcomes is either not a model or is a model that was not intended to aid in optimizing outcomes.

A Sample of the Developing Models from the Brookbush Institute.

The following are pieces of the model used by the Brookbush Institute for Lower Extremity Dysfunction. Initially, symptoms correlated with common impairments (based on a review of all relevant research) were grouped with consideration of how much of the body should be addressed during a single treatment session (e.g., treating the entire lower extremity versus just the ankle, or the lower extremity and trunk together). A list of the impairments correlated with symptoms was then organized by tissue, with consideration of the intent of various techniques and their efficacy. (If you noticed that there are some feedback loops, you would be correct; it took several attempts and refinement over time to find a model that appropriately fit the data.) Assessment is then used to identify which impairments from the list are exhibited by an individual. The list of correlated impairments is then used to select which structures should be addressed based on the assessment. Those structures are then treated based on a treatment model comprising synergistic categories of interventions selected for relative efficacy.

Symptoms and Diagnoses Clusters That Result in Similar Dysfunction

  • Ankle/Foot
    • Medial tibial stress syndrome
    • Pronation
    • Ankle sprain
    • Ankle instability
    • Achilles tendinopathy
    • Tibialis posterior tendinopathy
    • Plantar fasciitis
  • Knee
    • Anterior cruciate ligament injury
    • Functional valgus
    • Knee pain (patellofemoral pain syndrome, jumper's knee)
    • Lateral knee pain (iliotibial band syndrome, runner's knee)
    • Knee osteoarthritis
    • Knee effusion
    • Proximal tibiofibular joint pathology
    • Tibiofibular joint subluxation/dislocation
    • Medial and lateral heel whip
  • Hip
    • Trigger points
    • Abductor tendon tear
    • Adductor Groin Strain
    • Femoral acetabular impingement
    • Hip Osteoarthritis
    • Lumbosacral dysfunction
    • Low Back Pain
    • Sacroiliac joint pain

The Impairments Correlated with Lower Extremity Dysfunction

Correlated Changes in Overhead Squat Assessment (Sit-to-Stand Mechanics)

  • Loss of dorsiflexion (inadequate forward translation of the knee, i.e. tibia on foot dorsiflexion)
  • Feet Flatten (a.k.a. functional pes planus, pronation, eversion, calcaneus valgus, positive navicular drop test, etc.)
  • Feet turn out (a.k.a. turn-out, heel flare, heel whip, etc.)
  • Knees Bow In (a.k.a. functional knee valgus, medial knee displacement, hip adduction, etc.)
  • Excessive Forward Lean (excessive hip flexion, forward trunk position, tibia/torso angle)

Correlated Changes in Muscle Activity and Length

Correlated Changes in Joint Mobility (Increased stiffness)

  • Ankle: Inadequate posterior glide of the talus on the tibia
  • Ankle: Inadequate posterior glide of the lateral malleolus on the tibia
  • Ankle/Knee: Inadequate anterior glide of the fibular head on the tibia
  • Knee: Inadequate anterior glide of the tibia on the femur (the lateral compartment may be more restricted)
  • Hip: Inadequate posterior/inferior glide of the femur in the acetabulum

Correlated Changes in Fascia Mobility (Loss of Extensibility)

  • Sacrotuberous ligament
  • Iliotibial Band
  • Crural Fascia
  • Achilles Tendon
  • Plantar Fascia

Altered Subsystem Recruitment Patterns

Intervention Model (Prioritization of Categories Based on Relative Efficacy, Modifiable via Assessment Findings):

  • Mobilize
    1. Release
    2. Mobilize
    3. Lengthen
    4. IASTM (optional)
  • Activate
    1. Isolated activation
    2. Core integration
    3. Reactive activation
    4. Subsystem integration

Note that this case study demonstrates the use of objective, reliable assessments of factors correlated with lower extremity symptoms (symptom clusters) to select a group of interventions that have demonstrated the highest relative efficacy for addressing those factors. All of this was developed from the relevant peer-reviewed research available at the time of each systematic review and was tested in practice.

Case Study:

Patient: 47-year-old female complaining of shin pain. An avid long-distance runner since high school, with a history of knee- and foot-related complaints (e.g., PFPS, Achilles tendinopathy, plantar fasciitis).

Sample Intervention (Ankle Dorsiflexion Restriction)

Note that this integrated intervention is based on the following intervention model (set of intervention categories with the highest relative efficacy).

Formal Proof

  • Probabilistic Outcomes: Given that outcomes are probabilistic, the result of an intervention cannot be known with certainty. Let $p_i$ represent the probability of a successful outcome for intervention $I_i$. These probabilities vary by intervention, and the outcome depends on both the reliability of the intervention and its short-term and long-term effect size on objective outcome measures.
  • Relativity of Choices: The success of an intervention is relative, not evaluated in isolation. If there are $n$ possible interventions $\{I_1, I_2, \dots, I_n\}$, the effectiveness of intervention $I_i$ must be evaluated relative to the effectiveness of the others $I_j$, where $j \neq i$. Comparative research should inform this evaluation.
  • Zero-Sum Nature of Intervention Selection: In any session, time constraints limit the subset of interventions $S \subseteq \{I_1, I_2, \dots, I_n\}$ that can be performed. Selecting intervention $I_i$ excludes $I_j$, creating a zero-sum game. The goal is to prioritize interventions with the highest expected value, factoring in carry-over effects.
  • Measurable Effectiveness: The effectiveness $E_i$ of an intervention is defined as $E_i = F_i \times V_i$, where $F_i$ is reliability (frequency of success) and $V_i$ is effect size. Expected value should prioritize interventions shown to produce both short-term and long-term improvements, based on objective outcome measures, avoiding over-reliance on subjective reports like pain reduction.
  • Existence of a Best Intervention: Let the set of possible interventions be $\{I_1, I_2, \dots, I_n\}$. The best intervention $I_{\text{best}}$ is the one that maximizes $E_i$. However, new experimental interventions should periodically be introduced to challenge the potential of local maxima.
  • Prioritizing the Best Intervention: By choosing $I_{\text{best}}$, practitioners maximize the likelihood of the best possible outcome, but assessments must be used to identify patient subgroups (e.g., different responders). In cases where subgroups exist, interventions can be reprioritized for each subgroup to optimize expected value.
  • Formula for Optimal Intervention Selection (an example of the formula's use appears at the end of the article):

$$I_{\text{optimal}} = \arg\max_{i \in \{1, 2, \dots, n\}} \left( \mathbb{E}(O_i \mid S) \times W_{\text{LT}} \right) \quad \text{subject to: } \sum_{i=1}^{k} T_i \leq T_{\text{session}}$$

  • Where:
    • $I_{\text{optimal}}$ is the optimal intervention or combination of interventions selected for the patient.
    • $\arg\max$ means "the value of $i$" that maximizes the expression $\mathbb{E}(O_i \mid S) \times W_{\text{LT}}$. Essentially, it finds the intervention $I_i$ that provides the highest value of the expected outcome multiplied by the long-term weighting factor.
    • $\mathbb{E}(O_i)$ is the expected outcome of intervention $I_i$, calculated as $\mathbb{E}(O_i) = F_i \times V_i$, where:
      • $F_i$ is the reliability (frequency of success) of intervention $I_i$.
      • $V_i$ is the effect size (magnitude of the outcome) of intervention $I_i$.
    • $W_{\text{LT}}$ is a weighting factor that accounts for the long-term carry-over effect of the intervention. This ensures that interventions with long-term benefits are prioritized above those with only short-term effects.
    • $T_i$ is the time required to perform intervention $I_i$.
    • $T_{\text{session}}$ is the total time available in the session.
    • The summation constraint $\sum_{i=1}^{k} T_i \leq T_{\text{session}}$ ensures that the total time spent on interventions in a session does not exceed the available time.
    • $S$ is the identified patient subgroup, and $\mathbb{E}(O_i \mid S)$ is the expected outcome of intervention $I_i$ for that specific subgroup.
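Because the candidate list in a session is small, the constrained argmax above can be computed by brute force. The following is a minimal Python sketch, assuming hypothetical expected outcomes, weights, and durations; the subset search is an illustrative implementation choice, not a prescribed algorithm.

```python
# Minimal sketch of the selection formula above: maximize the total
# carry-over-weighted expected value E(O_i | S) x W_LT, subject to the
# session time budget sum(T_i) <= T_session. Brute force over subsets is
# fine at clinical scale (a handful of candidates); numbers are hypothetical.
from itertools import combinations

def optimal_interventions(candidates, t_session):
    """candidates: list of (name, expected_outcome_for_subgroup, w_lt, minutes)."""
    best_set, best_value = (), 0.0
    for r in range(1, len(candidates) + 1):
        for subset in combinations(candidates, r):
            time_used = sum(c[3] for c in subset)
            if time_used > t_session:
                continue  # violates the constraint sum(T_i) <= T_session
            value = sum(c[1] * c[2] for c in subset)
            if value > best_value:
                best_set, best_value = subset, value
    return [c[0] for c in best_set], best_value

candidates = [
    ("Static Manual Release",    3.0, 1.0, 20),  # E(O_i|S), W_LT, T_i
    ("Joint Mobilization",       7.2, 1.1, 30),
    ("Core Stability Exercises", 4.9, 1.2, 40),
]
print(optimal_interventions(candidates, t_session=60))
# (['Static Manual Release', 'Joint Mobilization'], 10.92)
```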

Caption: Explanation of hand position during a sacroiliac joint mobilization using a pisiform (saddle) grip hand position.

Step-by-Step Breakdown Using the Formula for Optimal Intervention Selection:

Example Scenario:

  • Lower back pain patient: Based on an assessment, you categorize patients into two subgroups:
    • Subgroup A: Patients with muscle over-activity as the primary contributor to pain.
    • Subgroup B: Patients with joint mobility restrictions as the primary contributor to pain.
  • You have three possible interventions to choose from (for simplicity, we are only choosing one):
    • Intervention 1: Static Manual Release
    • Intervention 2: Joint Mobilization
    • Intervention 3: Core Stability exercises
  • The goal is to select the best intervention for a given patient based on the assessment.

1. Define Expected Outcomes for Each Subgroup:

  • $F_i$: the reliability (frequency of success) of each intervention.
  • $V_i$: the effect size (magnitude of the outcome) for each intervention.

Subgroup A (Muscle Over-activity):

  • $\mathbb{E}(O_1 \mid A) = F_1 \times V_1 = 0.8 \times 10 = 8$ (Static Manual Release)
  • $\mathbb{E}(O_2 \mid A) = F_2 \times V_2 = 0.4 \times 5 = 2$ (Joint Mobilization)
  • $\mathbb{E}(O_3 \mid A) = F_3 \times V_3 = 0.6 \times 6 = 3.6$ (Core Stability Exercises)

Subgroup B (Joint Mobility Restriction):

  • $\mathbb{E}(O_1 \mid B) = F_1 \times V_1 = 0.5 \times 6 = 3$ (Static Manual Release)
  • $\mathbb{E}(O_2 \mid B) = F_2 \times V_2 = 0.9 \times 8 = 7.2$ (Joint Mobilization)
  • $\mathbb{E}(O_3 \mid B) = F_3 \times V_3 = 0.7 \times 7 = 4.9$ (Core Stability Exercises)

2. Adjust for the Long-Term Weighting Factor $W_{\text{LT}}$:

Assume the long-term carry-over effect is higher for exercise therapy in both subgroups because exercise can be self-administered, so assign weights:

  • $W_{\text{LT}} = 1.2$ for Core Stability Exercises (Intervention 3).
  • $W_{\text{LT}} = 1.0$ for Static Manual Release (Intervention 1).
  • $W_{\text{LT}} = 1.1$ for Joint Mobilization (Intervention 2).

Now, let's recalculate the adjusted expected outcomes:

Subgroup A (Muscle Over-activity):

  • $\mathbb{E}(O_1 \mid A) \times W_{\text{LT}} = 8 \times 1.0 = 8$ (Static Manual Release)
  • $\mathbb{E}(O_2 \mid A) \times W_{\text{LT}} = 2 \times 1.1 = 2.2$ (Joint Mobilization)
  • $\mathbb{E}(O_3 \mid A) \times W_{\text{LT}} = 3.6 \times 1.2 = 4.32$ (Core Stability Exercises)

Subgroup B (Joint Mobility Restriction):

  • $\mathbb{E}(O_1 \mid B) \times W_{\text{LT}} = 3 \times 1.0 = 3$ (Static Manual Release)
  • $\mathbb{E}(O_2 \mid B) \times W_{\text{LT}} = 7.2 \times 1.1 = 7.92$ (Joint Mobilization)
  • $\mathbb{E}(O_3 \mid B) \times W_{\text{LT}} = 4.9 \times 1.2 = 5.88$ (Core Stability Exercises)

3. Find the Optimal Intervention (Using argmax):

  • For Subgroup A (Muscle overactivity):
    • $\arg\max(8, 2.2, 4.32)$ → Static Manual Release is the best intervention because it has the highest adjusted expected value of 8.
  • For Subgroup B (Joint Mobility Restriction):
    • $\arg\max(3, 7.92, 5.88)$ → Joint Mobilization is the best intervention because it has the highest adjusted expected value of 7.92.

4. Time Constraints:

  • Assume the available session time is 60 minutes and the time required for each intervention is:
    • $T_1 = 20$ minutes (Static Manual Release)
    • $T_2 = 30$ minutes (Joint Mobilization)
    • $T_3 = 40$ minutes (Core Stability Exercises)

In both cases, these interventions fit within the session's time constraints, so we don’t need to adjust for time limitations here.

Conclusion:

  • For Subgroup A (Muscle Over-activity), the optimal intervention is Static Manual Release ($I_{\text{optimal}} = I_1$) because it has the highest expected outcome adjusted for long-term effects.
  • For Subgroup B (Joint Mobility Restriction), the optimal intervention is Joint Mobilization ($I_{\text{optimal}} = I_2$) because it has the highest expected outcome adjusted for long-term effects.
    • Note that the same formula can be applied to multiple techniques and can account for time constraints.
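For readers who prefer to verify the arithmetic, here is the worked example reproduced as a short Python script, using the same hypothetical $F_i$, $V_i$, and $W_{\text{LT}}$ values as above.

```python
# The worked example above, verified in code (numbers are the article's
# hypothetical values).

W_LT = {"Static Manual Release": 1.0,
        "Joint Mobilization": 1.1,
        "Core Stability Exercises": 1.2}

# (F_i, V_i) per intervention, per subgroup.
EV_INPUTS = {
    "A (muscle over-activity)": {
        "Static Manual Release": (0.8, 10),
        "Joint Mobilization": (0.4, 5),
        "Core Stability Exercises": (0.6, 6),
    },
    "B (joint mobility restriction)": {
        "Static Manual Release": (0.5, 6),
        "Joint Mobilization": (0.9, 8),
        "Core Stability Exercises": (0.7, 7),
    },
}

for subgroup, interventions in EV_INPUTS.items():
    adjusted = {name: f * v * W_LT[name] for name, (f, v) in interventions.items()}
    best = max(adjusted, key=adjusted.get)
    print(f"Subgroup {subgroup}: {best} ({adjusted[best]:.2f})")
# Subgroup A (muscle over-activity): Static Manual Release (8.00)
# Subgroup B (joint mobility restriction): Joint Mobilization (7.92)
```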

Bibliography

Please forgive me for referencing myself. Note that the only references to my previously published materials are references to comprehensive research reviews of a topic, and these references include annotated bibliographies. Citing these articles was only intended to improve the readability of this article by reducing dozens or hundreds of citations to a few. We genuinely hope that you will critically review these references, and we humbly ask for feedback if you believe our conclusions are anything less than objective and conservative. Additionally, some may question our references to Wikipedia; however, the pages referenced give a better summary of those topics than any other single reference (e.g., the Bradford Hill criteria).

  1. Brookbush, B. (2024) Pain neuroscience education (PNE) is relatively ineffective: Research confirmed. Brookbush Institute. https://brookbushinstitute.com/articles/pain-neuroscience-education-pne-is-relatively-ineffective-research-confirmed
  2. Wikipedia contributors. (n.d.). Expected value. Wikipedia. Retrieved September 24, 2024, from https://en.wikipedia.org/wiki/Expected_value
  3. Brookbush, B., Campione, J. (2024) Instrument-assisted soft tissue mobilization (IASTM): Comprehensive systematic research review [Online course]. Brookbush Institute. https://brookbushinstitute.com/courses/instrument-assisted-soft-tissue-mobilization-iastm-comprehensive-systematic-research-review
  4. Hill, A. B. (2015). The environment and disease: association or causation? Journal of the Royal Society of Medicine, 108(1), 32-37.
  5. Wikipedia contributors. (n.d.). Bradford Hill criteria. Wikipedia. Retrieved September 24, 2024, from https://en.wikipedia.org/wiki/Bradford_Hill_criteria
  6. Brookbush, B. (2024). False narratives: Nocebo and negative expectations do not affect manual therapy outcomes. Brookbush Institute. https://brookbushinstitute.com/articles/false-narratives-nocebo-and-negative-expectations-do-not-affect-manual-therapy-outcomes
  7. Brookbush, B. (2024). Active versus passive: Is exercise more effective than manual therapy? Brookbush Institute. Retrieved September 24, 2024, from https://brookbushinstitute.com/articles/active-versus-passive-is-exercise-more-effective-than-manual-therapy
  8. Lawrence, R. L., Braman, J. P., LaPrade, R. F., & Ludewig, P. M. (2014). Comparison of 3-dimensional shoulder complex kinematics in individuals with and without shoulder pain, part 1: sternoclavicular, acromioclavicular, and scapulothoracic joints. Journal of Orthopaedic & Sports Physical Therapy, 44(9), 636-645.
  9. Brookbush, B. (2015). Lower extremity dysfunction. Brookbush Institute. Retrieved September 26, 2024, from https://brookbushinstitute.com/courses/lower-extremity-dysfunction
  10. Burns, P. B., Rohrich, R. J., & Chung, K. C. (2011). The levels of evidence and their role in evidence-based medicine. Plastic and Reconstructive Surgery, 128(1), 305-310.
  11. Taleb, N. N. (2012). Antifragile: Things that gain from disorder. Random House.
  12. Brookbush, B. (2020). Joint mobilizations and manipulations: Risk of adverse events. Brookbush Institute. https://brookbushinstitute.com/courses/joint-mobilization-and-manipulation-risk-of-adverse-events
  13. Brookbush, B. (2024). False narratives, nocebo, and negative expectations do not affect manual therapy outcomes: Research confirmed. Brookbush Institute. https://brookbushinstitute.com/articles/false-narratives-nocebo-and-negative-expectations-do-not-affect-manual-therapy-outcomes
