SPED 8013 | Chapter 5: Improving and Assessing the Quality of Behavioral Measurement

Indicators of Trustworthy Measurement

  • Validity
    • Directly measures a socially significant target behavior
    • Measures a dimension of the behavior relevant to the question or concern about the behavior
    • Ensures the data are representative of the behavior’s occurrence under relevant conditions/times
  • Accuracy
    • The extent to which observed values match the true values of an event
  • Reliability
    • The extent to which a measurement procedure yields the same values when brought into repeated contact with the same state of nature

Think of target shooting. Accuracy is how close the shots land to the bullseye (the true value); reliability is how consistently they land in the same place. In targets 1 and 2 below, the shots all mass in a single location. Target 3, by contrast, illustrates unreliability: the shooter cannot aim consistently, so the shots scatter and do not hit any one spot with precision.
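
The same distinction can be put in numbers. The sketch below is only an illustration under assumed data: the true count, the three observers, and their scores are all hypothetical. Mean error stands in for accuracy (closeness to the true value) and the standard deviation stands in for reliability (consistency across repeated measurement of the same event).

```python
from statistics import mean, pstdev


def summarize(observed, true_value):
    """Return (mean signed error, spread) for repeated measures of one event."""
    errors = [value - true_value for value in observed]
    return mean(errors), pstdev(observed)


TRUE_COUNT = 10  # hypothetical true number of responses in one recorded session

# Hypothetical: each observer scores the same recording five times.
observer_a = [10, 10, 10, 10, 10]  # accurate and reliable
observer_b = [13, 13, 13, 13, 13]  # reliable but inaccurate (consistent overcount)
observer_c = [6, 14, 9, 12, 8]     # unreliable: values scatter widely

for name, scores in [("A", observer_a), ("B", observer_b), ("C", observer_c)]:
    bias, spread = summarize(scores, TRUE_COUNT)
    print(f"Observer {name}: mean error = {bias:+.1f}, spread (SD) = {spread:.2f}")
```

Observer B shows why reliability is not the same as accuracy: the values repeat perfectly, yet every one of them is wrong by the same amount.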

Threats to Measurement Validity

  • Indirect measurement
    • Measuring a behavior other than the behavior of interest
    • Requires inferences about the relationship between those behaviors
    • Must provide evidence that the behavior measured is directly related to the behavior of interest
  • Measuring a dimension that is irrelevant or ill suited to the reason for measuring the behavior
    • e.g., using a ruler to measure temperature
  • Measurement artifacts
    • Discontinuous measurement
    • Poorly scheduled observations
    • Insensitive or limiting measurement scales

Threats to Measurement Accuracy and Reliability

  • Human error is the biggest threat; several factors contribute to it
  • Poorly designed measurement systems
    • Cumbersome
    • Difficult to use
    • Complex
  • Inadequate observer training
    • Training should be explicit and systematic
    • Select observers carefully
    • Train to an objective competency standard
    • Provide ongoing training to minimize observer drift
  • Unintended influences on observers
    • Observer expectations of what the data should look like
      • Measurement bias
      • Feedback to observers about how their data relate to the goals of the intervention
    • Observer reactivity when the observer is aware that others are evaluating the data
      • Intrusive/obtrusive observation

Assessing the Accuracy and Reliability of Behavioral Measurement

  • First, design a good measurement system
  • Second, train observers carefully
  • Third, evaluate the extent to which the data are accurate and reliable
    • Measure the measurement system

Assessing the Accuracy of Measurement

  • Accuracy means the observed values match the true values of an event
  • No one wants to base research conclusions or treatment decisions on faulty data

Four purposes of accuracy assessment:

  1. Determine whether the data are good enough to make treatment decisions
  2. Discover and correct measurement errors
  3. Reveal consistent patterns of measurement error so that calibration can be improved
  4. Assure consumers that the data are accurate

Accuracy Assessment Procedures

  • Measurement is accurate when observed values match true values
    • Accuracy is determined by calculating the correspondence of each data point with its true value
    • The process for determining the true value must differ from the procedure used to collect the data
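
As a rough sketch of that calculation, the snippet below compares each observed data point with its true value and reports the percentage that match, along with the signed errors that can reveal a consistent pattern. The function names and the data are hypothetical, and the true values are simply assumed to come from some independent source.

```python
def accuracy_percentage(observed, true_values):
    """Percentage of data points whose observed value equals its true value."""
    if len(observed) != len(true_values):
        raise ValueError("observed and true_values must be the same length")
    if not observed:
        raise ValueError("at least one data point is required")
    matches = sum(o == t for o, t in zip(observed, true_values))
    return 100 * matches / len(observed)


def signed_errors(observed, true_values):
    """Observed minus true for each data point; a constant sign suggests bias."""
    return [o - t for o, t in zip(observed, true_values)]


# Hypothetical session counts: observer's data vs. independently derived true values.
observed_counts = [12, 9, 15, 7, 11]
true_counts = [12, 10, 15, 8, 11]

print(accuracy_percentage(observed_counts, true_counts))
print(signed_errors(observed_counts, true_counts))
```

In this made-up data every error is an undercount, which is the kind of consistent pattern that points to a calibration problem rather than random mistakes.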

This kind of accuracy assessment is not particularly well suited to ABA: ABA is highly individualized, and the true value is often subjective. Attention therefore turns to reliability.

Assessing the Reliability of Measurement

  • Measurement is reliable when it yields the same values across repeated measures of the same event
    • Not the same as accuracy
    • Reliable application of the measurement system is important
    • Requires permanent products for re-measurement
    • Low reliability signals suspect data

Using Interobserver Agreement to Assess Behavioral Measurement

  • The degree to which two or more independent observers report the same values for the same events

Benefits of Interobserver Agreement (IOA)

  • Determine competence of new observers
  • Detect observer drift
  • Judge clarity of definitions and system
  • Increase believability of data

Requirements for IOA

  • Observers must:
    • Use the same observation code and measurement system
    • Observe and measure the same events
    • Observe and record independently of one another

Methods for Calculating IOA

  • Percentage of agreement is most common
  • Event recording methods are:
    • Total count recorded by each observer
    • Mean count-per-interval
    • Exact count-per-interval (the percentage of intervals in which the observers agree 100%)
    • Trial-by-trial
  • Timing recording methods are:
    • Total duration IOA
    • Mean duration-per-occurrence IOA
      • Latency-per-response
      • Mean IRT-per-response
  • Interval recording and Time sampling:
    • Interval-by-interval IOA (point by point)
    • Scored-interval IOA (low rates)
    • Unscored-interval IOA (high rates)
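
Several of the methods listed above reduce to short percentage calculations. The sketch below follows the usual textbook formulas (smaller value divided by larger value, or agreements divided by agreements plus disagreements); the function names and sample data are hypothetical, and both observers' records are assumed to be the same length and aligned interval by interval.

```python
def ratio_pct(a, b):
    """Smaller value over larger value as a percentage (100 when they are equal)."""
    if a == b:
        return 100.0  # also covers the case where both observers recorded 0
    return 100 * min(a, b) / max(a, b)


def total_count_ioa(counts_1, counts_2):
    """Compare only the session totals recorded by each observer."""
    return ratio_pct(sum(counts_1), sum(counts_2))


def mean_count_per_interval_ioa(counts_1, counts_2):
    """Average the smaller/larger agreement computed for each interval."""
    per_interval = [ratio_pct(a, b) for a, b in zip(counts_1, counts_2)]
    return sum(per_interval) / len(per_interval)


def exact_count_per_interval_ioa(counts_1, counts_2):
    """Percentage of intervals in which both observers recorded the same count."""
    exact = sum(a == b for a, b in zip(counts_1, counts_2))
    return 100 * exact / len(counts_1)


def interval_by_interval_ioa(rec_1, rec_2):
    """Percentage of intervals (or trials) in which the two records agree."""
    agreements = sum(a == b for a, b in zip(rec_1, rec_2))
    return 100 * agreements / len(rec_1)


def scored_interval_ioa(rec_1, rec_2):
    """Use only intervals in which at least one observer scored the behavior."""
    pairs = [(a, b) for a, b in zip(rec_1, rec_2) if a or b]
    if not pairs:
        raise ValueError("no interval was scored by either observer")
    return 100 * sum(a and b for a, b in pairs) / len(pairs)


def unscored_interval_ioa(rec_1, rec_2):
    """Use only intervals in which at least one observer scored nonoccurrence."""
    pairs = [(a, b) for a, b in zip(rec_1, rec_2) if not (a and b)]
    if not pairs:
        raise ValueError("no interval was left unscored by either observer")
    return 100 * sum(not a and not b for a, b in pairs) / len(pairs)


# Hypothetical event-recording data: counts per interval for two observers.
counts_obs1 = [2, 0, 3, 1, 4]
counts_obs2 = [2, 1, 2, 1, 4]

# Hypothetical interval-recording data: True means the behavior was scored.
rec_obs1 = [True, False, False, False, True, False, False, False, False, False]
rec_obs2 = [True, False, False, False, False, False, False, False, False, True]

print(f"Total count IOA:              {total_count_ioa(counts_obs1, counts_obs2):.1f}%")
print(f"Mean count-per-interval IOA:  {mean_count_per_interval_ioa(counts_obs1, counts_obs2):.1f}%")
print(f"Exact count-per-interval IOA: {exact_count_per_interval_ioa(counts_obs1, counts_obs2):.1f}%")
print(f"Interval-by-interval IOA:     {interval_by_interval_ioa(rec_obs1, rec_obs2):.1f}%")
print(f"Scored-interval IOA:          {scored_interval_ioa(rec_obs1, rec_obs2):.1f}%")
print(f"Unscored-interval IOA:        {unscored_interval_ioa(rec_obs1, rec_obs2):.1f}%")
```

With this made-up data, total count IOA is 100% because both totals are 10, yet the observers recorded the same count in only 3 of 5 intervals (60% exact count-per-interval IOA). The remaining methods reuse the same pieces: trial-by-trial IOA is interval_by_interval_ioa applied to trial records, total duration IOA is ratio_pct applied to the two total durations, and mean duration-per-occurrence IOA applies the mean count-per-interval calculation to per-occurrence durations.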

Considerations in IOA

  • How often and when should IOA be obtained?
    • During each condition and phase of a study
    • Distributed across days of the week, time of day, settings, observers
    • Minimum of 20% of sessions, preferably 25-33%
    • More frequent with complex systems
  • For what variables should IOA be obtained and reported?
    • Obtain and report IOA at the same levels at which researchers will report and discuss the study results
      • For each behavior
      • For each participant
      • In each phase of intervention or baseline
  • Which method of calculating IOA should be used?
    • More conservative methods should be used
    • Methods that will overestimate actual agreement should be avoided
  • Believability of data increases as agreement approaches 100%
  • Historically, 80% agreement has been used as the acceptable benchmark
    • Depends on the complexity of the measurement system
  • How should IOA be reported?
    • Narrative form
    • Table
    • Graphs
    • In all formats, report how, when, and how often IOA was assessed
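
One way to meet the "each condition and phase" and "minimum of 20% of sessions" guidelines is to draw IOA sessions at random within each phase. The sketch below is a hypothetical helper, not a standard procedure: the function name, the phase labels, and the 25% default are assumptions, and it does not by itself balance days of the week, times of day, settings, or observers.

```python
import math
import random


def pick_ioa_sessions(sessions_by_phase, proportion=0.25, seed=None):
    """Randomly choose sessions for IOA within each phase.

    sessions_by_phase maps a phase label to a list of session identifiers.
    Each non-empty phase gets at least one IOA session and at least
    ceil(proportion * n) of its n sessions.
    """
    rng = random.Random(seed)  # a seed makes the schedule reproducible
    picked = {}
    for phase, sessions in sessions_by_phase.items():
        if not sessions:
            picked[phase] = []
            continue
        k = max(1, math.ceil(proportion * len(sessions)))
        picked[phase] = sorted(rng.sample(sessions, k))
    return picked


# Hypothetical study: 8 baseline sessions followed by 12 intervention sessions.
study = {
    "baseline": list(range(1, 9)),
    "intervention": list(range(9, 21)),
}

schedule = pick_ioa_sessions(study, proportion=0.25, seed=1)
for phase, sessions in schedule.items():
    print(phase, sessions)
```

Sampling within each phase, rather than across the whole study, keeps every phase represented, so IOA can be reported at the same level as the results.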