Calibration and Evaluation

Reflecting on and applying the results of the research

August 2014

Presented at The Developing Group 2 Aug 2014

1. Background
2. Research methodology
3. Findings
4. Conclusions

*** We do not have access to Parts 3 and 4. If you have a copy of this article, we would be extremely pleased to hear from you. ***

1. Background

Evaluation is a kind of ‘behind the scenes’ process we all do regularly. We are constantly informally evaluating people’s behaviour against our own internal (often subconscious) standards. What makes a formal assessment ‘formal’ is that the standards and process of assessing are known, and hopefully well-defined.

A different kind of evaluation happens when we make an assessment of another person’s evaluation. To do so we need to take into account their means of evaluating, which may not be the same as ours. How accurate – or not – are we at calibrating another person’s evaluation?

We devoted the December 2011 Developing Group to learning how to use Clean Language as a research interview method when the topic being researched was how people evaluate an experience:

Clean evaluative interviewing

Dr. Susie Linder-Pelz and James have recently concluded an academic research project in which six coaching sessions were evaluated from three perspectives: by the coach, the client and an expert-assessor.*

An example of an interview from the research (conducted March 2012) is available. The annotation was added in 2023:

Evaluating Coaching

At the 2nd August 2014 Developing Group James updated the group on the findings of the research, and we explored how we could individually and collectively make use of the conclusions. In particular we experientially investigated:

As a coach, how aware are you of how your client and an expert would evaluate a coaching session?

Does knowing your client’s and an expert’s opinions affect your own evaluation?

Over the years we have approached the topic of calibrating in different ways.

We have long noticed that when people on a training course are asked to evaluate a practice coaching session they often give an answer which varies wildly from the opinion of the client and/or of us as expert observers.

For example, one coach said a session was “catastrophic”, while the client said “I got some useful insights and lots to think about”. James, who was observing, said to the coach, “You did what the activity called for. The client got what they asked for with their desired outcome. A more direct approach might have got to the meat earlier, and even so, you and they now have a lot more of a landscape to work with and a good basis for the next session.”

When the coach was asked what their evaluation of the session was now, having heard the opinions of the client and expert, they said, “Well I’m pleased the client got something out of it and I still think it was catastrophic”. We wonder what scale the coach was using to evaluate their effectiveness, and what they would have labelled a much worse session! (See The Importance of Scale)

Our modelling of excellent facilitators (not only those who use Clean Language) showed that a key skill was the ability to calibrate the experience of the client and to notice when it changed and in what direction. (See Systemic Outcome Orientation)

There are lots of ways to calibrate, and what seems more important than the method of calibrating is that (a) the facilitator is actively calibrating moment-by-moment; (b) there is a correspondence between the facilitator’s calibration and the client’s experience; and (c) the facilitator can quickly change in response to the results of their calibration. This led us to make the “First Principle of Symbolic Modelling” (See REPROCess and Modelling Attention):

Know what kind of experience the client is having (i.e. what you are modelling).

While calibrating is a matter of efficacy, we have pointed out that it is also an ethical matter. If you do not calibrate the kind of experience the client is having, how do you know whether what you are doing is, or is not, working for the client? (See Calibrating Whether What You Are Doing is Working – Or Not)

James and Susie’s research of coaching sessions shows that even experienced coaches and experts can give widely differing ratings compared to those of the client and to each other. While this may be surprising at first, once it is appreciated that each tends to use different criteria in coming to their evaluations, the variation makes more sense.

In our opinion, a bigger issue is the difficulty there appears to be in managing multiple perspectives when they diverge. Many certification and evaluation processes use one perspective: Experts decide if a coach is competent to be certified or suitable for a job, or clients decide if they are satisfied with the service. Rarely are both taken into account. Even more rarely does the coach’s ability to calibrate both the client and the expert perspective become part of the assessment.

One reason for this may be the difficulty in comparing apples, oranges and bananas. This is compounded if the aim is to find a single composite score. The result is likely to be an arbitrary weighting of the contribution of each perspective. Rather than trying to reduce the perspectives to a single rating, an alternative is to live with the complexity of three perspectives and set acceptable levels in all three.**
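The contrast between a single composite score and separate acceptable levels can be sketched in a few lines of Python. This is only an illustration: the ratings, the equal weights and the threshold of 6 are hypothetical values chosen for the example, not figures from the research, and the function names are our own.

```python
def composite_score(ratings, weights):
    """Weighted average of the perspectives (the weights are an arbitrary choice)."""
    total_weight = sum(weights[name] for name in ratings)
    return sum(ratings[name] * weights[name] for name in ratings) / total_weight


def meets_all_levels(ratings, acceptable_levels):
    """True only if every perspective reaches its own acceptable level."""
    return all(ratings[name] >= level for name, level in acceptable_levels.items())


# Hypothetical out-of-ten ratings for one session (not data from the study)
ratings = {"client": 9, "coach": 8, "expert": 4}
weights = {"client": 1, "coach": 1, "expert": 1}            # assumed equal weighting
acceptable_levels = {"client": 6, "coach": 6, "expert": 6}  # assumed thresholds

print(composite_score(ratings, weights))             # 7.0
print(meets_all_levels(ratings, acceptable_levels))  # False
```

In this made-up case the composite of 7.0 looks acceptable even though the expert perspective falls well short, which is the kind of divergence a single rating can hide.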

By bringing our own evaluations out from ‘behind the scenes’ and making them ‘centre stage’ we can play with our own patterns of assuming, and get a ‘reality check’ on how and what we are unconsciously calibrating.

* The first part of the study was published as:

Linder-Pelz, S. & Lawley, J. (2015). Using Clean Language to explore the subjectivity of coachees’ experience and outcomes. International Coaching Psychology Review, 10(2):161-174.
Download a free preprint version: Linder-Pelz_Lawley-ICPR_preprint_15_Jun_2015.pdf

The second part of the study was published as:

Lawley, J. & Linder-Pelz, S. (2016). Evidence of competency: exploring coach, coachee and expert evaluations of coaching, Coaching: An International Journal of Theory, Research and Practice.
Download a free preprint version: Lawley&Linder-Pelz_CIJTRP_preprint_03_May_2016.pdf

** We are grateful to Michelle Duval who helped us to get clear on this point.

2. Research methodology

1. Participants were sent the Goal-focused Coaching Skills Questionnaire (GCSQ) in advance with a request to complete it and bring it on the day. The instruction given was:

Circle the number that most reflects your assessment of your current clean coaching competency.

2. Twelve questionnaires were completed. The average scores for each person were converted to the equivalent scores out of ten.

The average scores ranged from 6 to 9 with an average of 7.3.

3. Ten of the participants were paired up and allocated an expert-observer (an established assessor of Clean Facilitator competencies). The participants in each dyad took turns to be the coach and the client for an observed 40-minute session.

4. At the end of each session the client, coach and observer completed in private a sheet designed specifically for that role. The sheets were collected without the other participants seeing them. The sheets contained both requests for numerical evaluations out-of-ten from various perspectives, and a textual list of the key criteria used in the evaluation of the session (see below).

5. After both coaching sessions had finished and the figures entered on a computer, the sheets were handed back to each dyad and observer group for reflection and discussion within the triad.

6. An anonymised summary of the results was then shown to the whole group for more reflection and discussion.

7. Lastly, the group was split in half and two of the expert-observers conducted a 30-minute coaching session observed by the other four participants and an expert-observer. Afterwards, the sheets were completed and compared within the group.
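The conversion in step 2 can be sketched as follows. The source does not state the questionnaire's rating scale or the exact conversion used, so this minimal Python example assumes a simple proportional rescaling and takes the scale maximum as a parameter; the 7-point scale and the item scores are hypothetical.

```python
def to_out_of_ten(item_scores, scale_max):
    """Average one person's item scores and rescale proportionally to out-of-ten."""
    if not item_scores:
        raise ValueError("no item scores given")
    mean_score = sum(item_scores) / len(item_scores)
    return round(mean_score / scale_max * 10, 1)


# Hypothetical responses on an assumed 7-point scale (not data from the study)
print(to_out_of_ten([5, 6, 5, 6], scale_max=7))  # 7.9
```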

Sheet completed by CLIENT:

On a scale of 1 to 10, the value of the session to me was …(a)

Please list the key criteria you used to assess the value of the session to you:

Sheet completed by COACH:

On a scale of 1 to 10, I evaluate the quality of the session as …(b)

Please list the key criteria you used to assess the quality of the session:

I estimate the CLIENT rated the value of the session to him/her as …(c)

I estimate the OBSERVER rated my clean coaching skills as …(d)

Sheet completed by EXPERT-OBSERVER:

On a scale of 1 to 10, I evaluate the coach’s clean coaching skills as …(e)

Please list the key criteria you used to assess the clean coaching you observed:

I estimate the CLIENT rated the value of the session to him/her as …(f)
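One way to compare the three sheets is to subtract each actual rating from the corresponding estimate, using the letters (a) to (f) above. The source does not describe how the figures were analysed, so the following Python sketch is an assumption on our part; the class, the field names and the example numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SessionRatings:
    client_value: float         # (a) client's rating of the value of the session
    coach_quality: float        # (b) coach's rating of the quality of the session
    coach_est_client: float     # (c) coach's estimate of (a)
    coach_est_observer: float   # (d) coach's estimate of (e)
    observer_skills: float      # (e) observer's rating of the coach's skills
    observer_est_client: float  # (f) observer's estimate of (a)


def calibration_gaps(r):
    """Estimate minus actual: positive means over-estimated, negative under-estimated."""
    return {
        "coach_vs_client": r.coach_est_client - r.client_value,         # (c) - (a)
        "coach_vs_observer": r.coach_est_observer - r.observer_skills,  # (d) - (e)
        "observer_vs_client": r.observer_est_client - r.client_value,   # (f) - (a)
    }


# Hypothetical ratings for one session (not data from the study)
session = SessionRatings(8, 6, 6, 5, 7, 7)
print(calibration_gaps(session))
# {'coach_vs_client': -2, 'coach_vs_observer': -2, 'observer_vs_client': -1}
```

A gap near zero would suggest the estimate was well calibrated. In this made-up session the coach under-estimates both the client's and the observer's ratings by two points.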

^1. Goal-focused Coaching Skills Questionnaire (GCSQ), Anthony M. Grant & Michael J. Cavanagh (used with permission), Social Behaviour and Personality, 2007, 35(6), 751-760.

^2. The GCSQ sent out was slightly modified from the original. One word in each of questions f, g, h and j was modified to make the questions compatible with clean coaching.

3. Findings

Not currently available. If you have a copy of this article, we would be extremely pleased to hear from you.

4. Conclusions

Not currently available. If you have a copy of this article, we would be extremely pleased to hear from you.