How far does stereotype threat reach? The potential detriment of face validity in cognitive ability testing
This study examined the role of the face validity of tests in moderating stereotype threat. Face validity refers to the degree to which a test instrument appears to relate validly to the construct(s) it measures. High face validity is assumed to have a positive impact on performance, but reducing face validity (e.g., telling participants that a test is non-diagnostic of ability) is one tool that has been used to improve performance by reducing stereotype threat. Therefore, this study focussed on whether improving face validity could in some circumstances simultaneously highlight negative stereotypes in the testing environment and trigger stereotype threat. To test this possibility, undergraduates (N = 358), who were tested in groups of 20-50, were told they would play the role of a job applicant for a position of manager in charge of supervising maintenance and repair workers. It was emphasized that the position required mechanical and mathematical ability. Face validity was manipulated by presenting identical items phrased either to closely match the job in question (high face validity) or to seem unrelated to the position (low face validity). Stereotype threat was manipulated by either informing participants of gender differences in mathematical and mechanical ability (stereotype threat) or making no mention of gender (control). After these manipulations, participants completed multiple measures assessing, among other things, mathematical and mechanical ability. Results showed that improving face validity did not produce negative effects on women's performance. In fact, face validity enhancements tended to have some beneficial effects: women who took the high face validity version of the mechanical ability test tended to perform better than those who took the low face validity version.
Responses to the items intended to check the effectiveness of the stereotype threat manipulation were not reported, making it difficult to assess whether stereotype threat was successfully varied in this study. Nonetheless, these data show that increasing the face validity of testing instruments need not invariably harm test performance.