Outcome Measures for Patients with Lower Limb Amputations: Difference between revisions

Revision as of 12:49, 17 February 2015

Welcome to WCPT Network for Amputee Rehabilitation Project. This page is being developed by participants of a project to populate the Amputees section of Physiopedia. 

Tips for writing this page:

Aim:

  1. To enable the reader to select appropriate outcome measures to demonstrate effective intervention. (See CSP Outcome Measures Toolbox)


Top Contributors - Sheik Abdul Khadir, Lucy Aird, Admin, Tarina van der Stockt, Kim Jackson, 127.0.0.1, Simisola Ajeyalemi, Lauren Lopez and Rachael Lowe  

General introduction

Outcome measures can be used for many different purposes. A predictive measure should be able to classify individuals according to a set of pre-defined categories, either concurrently or prospectively, e.g. whether an amputee will use a prosthesis successfully [1] [2]. Detecting differences between people or groups demonstrates the discriminative value of an outcome measure, e.g. being able to determine the different abilities of a trans-tibial or trans-femoral amputee, or differences between prosthetic components, from the scores or times recorded [3]. An evaluative measure, by contrast, should be able to detect changes in an individual or group, usually over a period of time. An evaluative outcome measure may also detect changes following some kind of intervention, e.g. a therapy programme [4] or provision of a prosthetic component. Some outcome measures are designed to do only one of the above, while others may do a combination, though some of the requirements of these different types of outcome measure compete with one another [5]. Whichever purpose it is designed for, the psychometric properties of the outcome measure need to be reported to satisfy users that it is fit for purpose in the population with which they wish to use it [6]. The psychometric properties of an outcome measure are the characteristics that express its adequacy in terms of reliability, validity and responsiveness. Another term often used is clinimetric properties. Although it developed from similar origins to psychometrics, clinimetrics has been described as the practice of assessing or describing symptoms, signs and laboratory findings by means of scales, indices and other quantitative instruments, all of which should have adequate psychometric properties [7] [8].

Considerations before choosing an outcome measure

If you are considering using an outcome measure with an amputee, it is worth asking yourself the questions posed on the Outcome Measures page here in Physiopedia (Guide_to_Selecting_Outcome_Measures). At the very least you should consider these questions with your amputee patient or group in mind.

Why am I using an outcome measure?

  • Am I trying to establish a baseline measure from which I can monitor changes over time for an individual patient?
  • Am I trying to predict how my patient is going to perform?
  • Am I trying to evaluate the impact of a treatment programme or prosthetic component on an individual or a group?
  • Am I trying to evaluate the needs of the amputees attending my service?
  • Am I trying to evaluate how my service is responding to the needs of amputees?

What am I aiming to measure?

  • Impairments of body structure and function?
  • Activity limitations?
  • Participation restrictions?
  • Quality of life?
  • Something else?

When you think you may have an outcome measure in mind you should also consider these questions.

Have the clinimetric properties of the outcome measure I am considering been measured in a population similar to mine?

  • Is the outcome measure reliable?
  1.  Do I know the rate of error detected with scores?
  2.  Do I know the minimum detectable change?
  • Is the outcome measure valid?
  1.  Does it measure what I want it to measure?
  • Is the outcome measure responsive to change?
  1.  Is there a known minimum clinically important difference?

Here are some examples of studies where the clinimetric, sometimes called psychometric, properties have been reported in an amputee population and what the results may tell you.

Reliability

Reliability is usually measured by intra-class correlation coefficients (ICC) and is presented as a number between 0 (no consistency) and 1 (complete consistency) [9].
Intra-rater reliability: This indicates how consistently a rater administers and scores an outcome measure.
Inter-rater reliability: This indicates how well two raters agree in the way they administer and score an outcome measure.
Test-retest reliability: If an individual completes a self-report survey and then repeats the survey on a second occasion when no change is expected, the results should be similar.

  • Brooks et al (2002) examined the reliability of the 2MWT [10]. Participants completed 2 successive timed walks measured by 2 different raters on 2 consecutive days. Intra-class correlations (ICC) were >0.98, showing excellent intra- and inter-rater reliability.
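
As a rough illustration of how an ICC for inter-rater reliability can be computed, the sketch below implements one common form, ICC(2,1) (two-way random effects, absolute agreement, single rating), as defined by Shrout and Fleiss [9]. The choice of this form is an assumption made for illustration; the text above does not say which ICC form the cited study used, and the distances are invented.

```python
def icc_2_1(scores):
    """ICC(2,1) of Shrout and Fleiss.

    scores: list of rows, one row per participant, one column per rater.
    """
    n = len(scores)        # number of participants
    k = len(scores[0])     # number of raters
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                 # between-participants mean square
    jms = ss_cols / (k - 1)                 # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))    # residual mean square

    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Hypothetical 2MWT distances in metres (not data from any cited study):
# one row per participant, one column per rater.
distances = [
    [102.0, 104.0],
    [131.0, 129.0],
    [88.0, 90.0],
    [150.0, 151.0],
    [117.0, 115.0],
]
print(round(icc_2_1(distances), 3))
```

Values close to 1 mean that most of the variation in scores comes from real differences between participants rather than from disagreement between raters.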

Measurement error: This is the degree to which scores or ratings are identical irrespective of who performs or scores the test, and can be reported using the standard error of measurement (SEM) or the minimal detectable change (MDC), which is the same as the smallest detectable change (SDC) [11].

  • Deathe & Miller (2005) reported the SEM in absolute values, which was 3 s for the L Test [12].
  • Resnik & Borgia (2011) also reported the MDC in absolute values for all the measures they studied: 2MWT (34.3 m), 6MWT (45 m), TUG (3.6 s) and AMP (3.4 pts) [13].
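
The relationship between reliability, SEM and MDC can be sketched with two widely used formulas: SEM = SD × √(1 − ICC) and MDC = z × √2 × SEM, with z = 1.96 for a 95% confidence level. Whether the studies cited above used exactly these formulas or this confidence level is not stated here, so treat the sketch as an assumption; the input values are invented.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from the sample SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sem, z=1.96):
    """MDC at the confidence level implied by z (1.96 corresponds to 95%).

    The sqrt(2) term reflects that a change score combines the error
    of two measurements (baseline and follow-up).
    """
    return z * math.sqrt(2.0) * sem


# Hypothetical values for a timed walking test (seconds), not taken from any cited study.
sem = sem_from_icc(sd=10.0, icc=0.96)
mdc95 = minimal_detectable_change(sem)
print(round(sem, 2), round(mdc95, 2))
```

A change in an individual's score smaller than the MDC cannot be distinguished from measurement error with the stated level of confidence.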

Internal consistency: This reliability property is reserved for outcome measures that are designed to test only one concept. Internal consistency assesses the extent to which all items or questions in an outcome measure address the same underlying concept, e.g. in a mobility scale, all the items should deal with mobility [5].
There are two main methods used to report internal consistency. Classical test theory uses Cronbach's alpha (α) to indicate the reliability of an outcome measure as a whole, whereas item response theory uses Rasch analysis to assess internal consistency by looking at each item within the outcome measure [14].

  • The internal consistency of the ABC scale was considered excellent, as measured by Cronbach's alpha (0.93), in a study by Miller et al (2003) [15].
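
For readers who want to see how Cronbach's alpha is obtained, the following minimal sketch uses the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The item scores are invented for illustration and are not ABC scale data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha.

    items: list of lists, one inner list per item, each holding the scores
    of the same respondents in the same order.
    """
    k = len(items)
    sum_item_variances = sum(variance(item) for item in items)
    # Total score per respondent across all items
    totals = [sum(scores) for scores in zip(*items)]
    return (k / (k - 1)) * (1.0 - sum_item_variances / variance(totals))


# Hypothetical scores of five respondents on three items of a mobility scale.
item_scores = [
    [3, 4, 2, 5, 4],
    [3, 5, 2, 4, 4],
    [2, 4, 1, 5, 3],
]
print(round(cronbach_alpha(item_scores), 2))
```

Alpha rises when items vary together across respondents, which is why it is read as evidence that the items address the same underlying concept.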

Rasch analysis was used to examine all the items in the Berg Balance Scale, and confirmed that the scale was able to test a range of difficulty and identify four levels of ability [16].


Validity

Content / face validity: This is the degree to which the content of an outcome measure is an adequate reflection of the construct or concept to be measured [5]. It is usually considered and agreed by consensus of an expert group of clinicians, and can and should include patient representatives. For example, an instrument measuring activity limitation in young athletic individuals should include not only walking but also running, jumping and climbing.
Structural validity: This refers to the degree to which the scores of an outcome measure are an adequate reflection of the dimension or factor of the construct being measured [5]. It can be assessed by factor analysis: if more than 50% of the variance in the data is explained by one factor, this supports the view that the outcome measure is measuring a single factor / dimension. Anything less indicates that more than one factor is being assessed. Rasch analysis may also be used to investigate the unidimensionality of the outcome measure, i.e. whether it is measuring one or more factors or dimensions.

  • Wong et al (2013) reported results from factor analysis performed on the Berg Balance Scale (BBS). The results showed that 70% of the variance in the data was explained by one dimension, i.e. balance capability [16].
  • Franchignoni et al (2007) used Rasch modelling on a modified Locomotor Capabilities Index to confirm good structural validity when level 1 and 2 category responses were combined and 4 items were deleted because of either over- or under-fitting [17]. The resultant modified index is known as the LCI-5, which many clinicians now use.

Construct validity: This is the degree to which the scores of an outcome measure are consistent with pre-defined (a priori) hypotheses that outline relationships to the scores of other instruments, or differences between groups. If more than 75% of the hypotheses are confirmed, this is an indication of good validity [18].
It can also be referred to as: concurrent validity, showing the ability to distinguish between groups (e.g. older and younger LLAs), which is often measured by testing hypotheses; or convergent validity, showing that measures that should be related are related, which can also be measured using ICC, with high values indicating good validity.

  • Major et al (2013) hypothesised positive relationships between the BBS scores and the Activities-specific Balance Confidence (ABC) scale, the mobility scale of the Prosthetic Evaluation Questionnaire (PEQ (ms)), the Frenchay Activities Index and the 2MWT, and a negative relationship with the L Test score [19]. All of these hypotheses were confirmed.
  • To assess concurrent and convergent validity of the L Test, Deathe & Miller (2005) asked subjects to complete walk tests (Timed "Up & Go" Test (TUG), 10-Meter Walk Test and 2-Minute Walk Test), followed by self-reported measures (ABC scale, Frenchay Activities Index (FAI) and PEQ (ms)). Concurrent validity was high (ICC = 0.86-0.97) between the L Test data and the other walk tests, and fair to moderate (ICC = 0.22-0.54) for the self-report measures. Higher mean times were observed for those subjects who:
  1. were older
  2. used a walking aid
  3. had to concentrate on each step they took
  4. had a vascular amputation and
  5. had a TF amputation.

Therefore it was demonstrated that the L Test was able to discriminate between all groups as hypothesized [12].

Criterion validity: This is the degree to which the scores of an outcome measure are an adequate reflection of a ‘gold standard’. However, there are very few situations in rehabilitation where such a gold standard test exists. If no gold standard is available, it may be appropriate to test hypothetical relationships with comparator measures.
The estimation of criterion validity depends on the type of data. Intra-class correlations are used if both instruments (outcome instrument and comparator) have continuous scores (e.g. time, distance etc.), and the results should preferably be above 0.70. If the outcome instrument has a continuous score but the comparator has a dichotomous score (e.g. Yes / No), then the area under the receiver operating characteristic (ROC) curve is the preferred method. Again, a criterion of 0.70 is suggested [18].

  • Gremeaux et al (2012) presented ROC curves for the 2MWT. The modified Houghton Scale was used to stratify the patients into two groups: those with no mobility problems (scored 20/20) and those who scored less than 20, indicating a functional limitation. According to the ROC analysis, cut-off values of 130 m or 150 m were highly associated with the existence of functional limitations [20].
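
The area under the ROC curve can be read as the probability that a randomly chosen person from one group scores higher than a randomly chosen person from the other group. The sketch below computes it directly from that definition, counting ties as half. The distances and the group split are invented and are not data from the study above.

```python
def roc_auc(scores_higher_group, scores_lower_group):
    """Area under the ROC curve from pairwise comparisons.

    Returns the proportion of (higher-group, lower-group) pairs in which the
    member of the group expected to score higher does so; ties count as half.
    """
    wins = 0.0
    for a in scores_higher_group:
        for b in scores_lower_group:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(scores_higher_group) * len(scores_lower_group))


# Hypothetical 2MWT distances in metres, split by a dichotomous comparator.
no_limitation = [150.0, 160.0, 170.0]
functional_limitation = [100.0, 120.0, 155.0]
auc = roc_auc(no_limitation, functional_limitation)
print(round(auc, 2))
```

An area of 0.5 means the outcome measure separates the two groups no better than chance, while 1.0 means perfect separation; the 0.70 criterion mentioned above sits between these.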


Responsiveness

Internal responsiveness is the ability of a measure to change over a specified time frame. It will depend on the particular population being studied, the treatment or intervention that occurs during the time frame and the outcome measure used to determine any changes [5].
Standard effect size is the difference between the mean baseline scores and the mean follow-up scores, divided by the baseline standard deviation (SD). If there is high variability in baseline scores relative to the mean change in scores, the effect size will be small and the ability of the outcome measure (OM) to detect meaningful changes is also small. An effect size of 0.2 is considered small, representing a change of approximately 1/5 of the baseline SD; 0.5 is considered moderate; and anything over 0.8, a change of at least 4/5 of the baseline SD, is considered large (21).
The paired t-test is a statistical test that can be used to detect the change in the average scores at two time points, but it is dependent on the sample size and on the variability / reliability of the outcome measure used [14].
In a study by Devlin et al (2004), the effect size calculated for the change in mean scores for the Houghton Scale from discharge to follow-up was 0.60, indicating a moderate difference (22).

  • Findings by Brooks et al (2001) indicated that the 2MWT was “responsive to change during rehabilitation”. Significant improvements were seen in the means and SDs of the distances walked between baseline, discharge and follow-up (23). However, effect sizes were not calculated.
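
The standard effect size described in this section can be sketched as follows. The scores are invented, and the "negligible" label for values below 0.2 is an assumption added for illustration; the thresholds of 0.2, 0.5 and 0.8 are those given in the text.

```python
from statistics import mean, stdev


def standard_effect_size(baseline, follow_up):
    """(mean follow-up - mean baseline) divided by the baseline SD."""
    return (mean(follow_up) - mean(baseline)) / stdev(baseline)


def describe_effect(effect_size):
    """Label an effect size using the 0.2 / 0.5 / 0.8 thresholds."""
    size = abs(effect_size)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "small"
    return "negligible"  # label assumed here; not defined in the text


# Hypothetical scale scores for the same three patients at two time points.
baseline_scores = [4, 6, 8]
follow_up_scores = [5, 7, 9]
es = standard_effect_size(baseline_scores, follow_up_scores)
print(es, describe_effect(es))
```

Because the baseline SD is the denominator, the same mean change gives a smaller effect size in a sample with widely spread baseline scores.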


Other Considerations

Other considerations (link to main Physiopedia outcome measures page) may come into play when deciding which outcome measure to use:
Financial Considerations

  • What is the cost of this test?
  • Is a licence required? 

Therapist Implementation

  • Is the measure easy for a clinician to conduct?
  • Is special training required/available?
  • Are there clear standardised instructions on how to carry out and score the measure?
  • How long does it take to carry out the measure?
  • How long does it take to record results?

Resources

  • Is special equipment or are special forms required?
  • Is space sufficient for this measure to be carried out?

Client

  • How much time does it take for the person to complete?
  • Is the task difficult?
  • Is privacy required?

Patient-Reported Outcome Measures (PROMs)

  • Is face-to-face contact required or can this measure be completed in the waiting room?
  • Does the questionnaire cover sensitive personal issues?
  • Is there a specific reading level required?
  • Is the measure available in other languages?

References

  1. Condie ME, McFadyen AK, Treweek S, Whitehead L. The trans-femoral fitting predictor: a functional measure to predict prosthetic fitting in transfemoral amputees--validity and reliability. Arch Phys Med Rehabil 2011 08;92(8):1293-1297.
  2. Raya MA, Gailey RS, Gaunaurd IA, Ganyard H, Knapp-Wood J, McDonough K, et al. Amputee Mobility Predictor-Bilateral: A performance-based measure of mobility for people with bilateral lower-limb loss. J Rehabil Res Dev 2013 11;50(7):961-968.
  3. Hafner BJ, Willingham LL, Buell NC, Allyn KJ, Smith DG. Evaluation of function, performance, and preference as transfemoral amputees transition from mechanical to microprocessor control of the prosthetic knee. Arch Phys Med Rehabil 2007 02;88(2):207-217.
  4. Rau B, Bonvin F, de Bie R. Short-term effect of physiotherapy rehabilitation on functional performance of lower limb amputees. Prosthet Orthot Int 2007;31(3):258-270.
  5. Mokkink LB, Terwee CB, Patrick DL, Alonso J, Stratford PW, Knol DL, et al. The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J Clin Epidemiol 2010;63(7):737-745.
  6. Kirshner B, Guyatt G. A methodological framework for assessing health indices. J Chronic Dis 1985;38(1):27-36.
  7. Streiner DL. Clinimetrics vs. psychometrics: an unnecessary distinction. J Clin Epidemiol 2003 12;56(12):1142-1145.
  8. Galea M. Introducing Clinimetrics. Australian Journal of Physiotherapy 2005;51(3):139-140.
  9. Shrout PE, Fleiss JL. Intraclass correlations: Uses in assessing rater reliability. Psychol Bull 1979 03;86(2):420-428.
  10. Brooks D, Hunter JP, Parsons J, Livsey E, Quirt J, Devlin M. Reliability of the two-minute walk test in individuals with transtibial amputation. Arch Phys Med Rehabil 2002 11;83(11):1562-1565.
  11. Stratford PW, Riddle DL. When Minimal Detectable Change Exceeds a Diagnostic Test-Based Threshold Change Value for an Outcome Measure: Resolving the Conflict. Phys Ther 2012 10;92(10):1338-1347.
  12. Deathe AB, Miller WC. The L test of functional mobility: measurement properties of a modified version of the timed "up & go" test designed for people with lower-limb amputations. Phys Ther 2005 07;85(7):626-635.
  13. Resnik L, Borgia M. Reliability of outcome measures for people with lower-limb amputations: distinguishing true change from statistical error. Phys Ther 2011 04;91(4):555-565.
  14. Streiner DL, Norman GR. Health measurement scales: a practical guide to their development and use. 3rd ed. Oxford: Oxford University Press; 2003.
  15. Miller WC, Deathe AB, Speechley M. Psychometric properties of the Activities-Specific Balance Confidence Scale among individuals with a lower-limb amputation. Arch Phys Med Rehabil 2003 05;84(5):656-661.
  16. Wong CK, Chen CC, Welsh J. Preliminary Assessment of Balance With the Berg Balance Scale in Adults Who Have a Leg Amputation and Dwell in the Community: Rasch Rating Scale Analysis. Phys Ther 2013 11;93(11):1520-1529.
  17. Franchignoni F, Giordano A, Ferriero G, Muñoz S, Orlandini D, Amoresano A. Rasch analysis of the Locomotor Capabilities Index-5 in people with lower limb amputation. Prosthet Orthot Int 2007 12;31(4):394-404.
  18. Terwee CB, Bot SDM, de Boer MR, van der Windt DAWM, Knol DL, Dekker J, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol 2007 01;60(1):34-42.
  19. Major MJ, Fatone S, Roth EJ. Validity and reliability of the berg balance scale for community-dwelling persons with lower-limb amputation. Arch Phys Med Rehabil 2013 11;94(11):2194-2202.
  20. Gremeaux V, Damak S, Troisgros O, Feki A, Laroche D, Perennou D, et al. Selecting a test for the clinical assessment of balance and walking capacity at the definitive fitting state after unilateral amputation: a comparative study. Prosthet Orthot Int 2012 12;36(4):415-422.