Measurement biases involve systematic error that can occur in collecting relevant data. Common measurement biases include:
- Instrument bias. Instrument bias occurs when calibration errors lead to inaccurate measurements being recorded, e.g., an unbalanced weight scale. (questionnaires, company records)
- Insensitive measure bias. Insensitive measure bias occurs when the measurement tool(s) used are not sensitive enough to detect what might be important differences in the variable of interest. (questionnaires)
- Expectation bias. Expectation bias occurs in the absence of masking or blinding, when observers may err in measuring data toward the expected outcome. This bias usually favors the treatment group. (behavioral observations)
- Recall or memory bias. Recall or memory bias can be a problem if outcomes being measured require that subjects recall past events. Often a person recalls positive events more than negative ones. Alternatively, certain subjects may be questioned more vigorously than others, thereby improving their recollections. (questionnaires)
- Attention bias. Attention bias occurs because people who are part of a study are usually aware of their involvement, and as a result of the attention received may give more favorable responses or perform better than people who are unaware of the study’s intent. (behavioral observations, questionnaires)
- Verification or work-up bias. Verification or work-up bias is associated mainly with test validation studies. In these cases, if the sample used to assess a measurement tool (e.g., diagnostic test) is restricted only to those who have the condition or factor being measured, the sensitivity of the measure can be overestimated.
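The effect of verification bias on sensitivity can be illustrated with a small worked calculation. This is only a sketch under assumed numbers: the cohort counts and the rule that every test-positive but only one in ten test-negatives is verified against the reference standard are hypothetical, chosen to show one way a restricted verification sample can inflate sensitivity.

```python
def sensitivity(tp, fn):
    """Proportion of subjects with the condition that the test flags."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of subjects without the condition that the test clears."""
    return tn / (tn + fp)


# Hypothetical full cohort (assumed counts, not real data):
# 100 subjects with the condition, 900 without.
tp, fn = 80, 20    # test result among those with the condition
tn, fp = 810, 90   # test result among those without the condition

# Assumed verification rule: all test-positives are verified,
# but only 1 in 10 test-negatives is sent for verification.
fn_verified = fn // 10
tn_verified = tn // 10

true_sens = sensitivity(tp, fn)                # uses the full cohort
apparent_sens = sensitivity(tp, fn_verified)   # uses verified subjects only
apparent_spec = specificity(tn_verified, fp)   # uses verified subjects only

print(f"true sensitivity:     {true_sens:.3f}")
print(f"apparent sensitivity: {apparent_sens:.3f}")
print(f"true specificity:     {specificity(tn, fp):.3f}")
print(f"apparent specificity: {apparent_spec:.3f}")
```

Because most false negatives never reach verification in this scenario, the apparent sensitivity comes out higher than the true value, while the apparent specificity comes out lower.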
To develop a new measure in a field, the following process is suggested:
- Specify the domain of the construct
- Conduct an extensive literature review to define the exact construct to be measured or evaluated.
- Measuring the construct involves developing a scale that captures the degree to which the construct, or the items making it up, is present or absent.
- For survey questions, the scale may use only positive numbers (unipolar), representing different degrees of the same attribute, or both positive and negative numbers (bipolar), representing a range between two opposite attributes.
- Empirically determine the extent to which items measure that domain
- The levels of the scale need to be chosen so that differences between measurements can be interpreted as quantitative differences in the property measured.
- Questions would be closed-response to reduce variability and ambiguity in the responses. After an initial sample of respondents is collected, the respondents would be classified into categories and a factor analysis employed to verify construct validity.
- Each item on the questionnaire needs to address and measure a single issue.
- The questions would then be checked for correlation with one another so that redundant questions measuring the same thing can be dropped, simplifying the measure.
- Construct validity would then be verified by showing that the measure correlates strongly with other measures of the same construct (convergent validity) and only weakly with measures of different constructs (discriminant validity).
- Examine the extent to which the measure produces results that are predictable from the theoretical hypotheses.
- Half of this initial sample would be used for exploratory factor analysis. After the measures and groupings were defined, the other half of the sample would be analyzed with confirmatory factor analysis to verify that the constructs being measured fit the data appropriately. Internal consistency can be checked separately with a reliability coefficient such as Cronbach’s alpha.
- Repeating the analysis on the second half of the sample helps assess the consistency, reliability, and validity of the measure.
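Two of the steps above, splitting the sample in half and checking reliability and inter-item correlation, can be sketched in a few lines of standard-library Python. This is an illustration only: the response data, the function names, and the fixed shuffle seed are all assumptions made for the example, and the factor analyses themselves are not shown because they need a dedicated statistics package.

```python
import random
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def split_half(rows, seed=0):
    """Randomly split respondents into two halves (seed fixed for repeatability)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


# Hypothetical responses: 8 respondents x 3 items on a 1-5 closed-response scale.
responses = [
    [1, 2, 1], [2, 2, 3], [3, 3, 3], [4, 4, 5],
    [5, 4, 5], [2, 1, 2], [4, 5, 4], [3, 3, 2],
]

# One half would go to exploratory analysis, the other to confirmation.
explore, confirm = split_half(responses)

# Internal consistency of the full item set.
alpha_all = cronbach_alpha(responses)

# Correlation between the first two items; a value very close to 1
# may suggest the items are redundant.
r_items_1_2 = pearson([r[0] for r in responses], [r[1] for r in responses])

print(f"alpha = {alpha_all:.2f}, r(item1, item2) = {r_items_1_2:.2f}")
```

The same `pearson` helper could be applied to total scores from two different instruments to look at convergent validity (an expected strong correlation) or discriminant validity (an expected weak one).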
(Adapted from group and course notes)