Monday, April 16, 2012

Crooks, Kane & Cohen (2008) - Threats to the valid use of assessments

Crooks, T. J., Kane, M. T., & Cohen, A. S. (2008). Threats to the valid use of assessments. In H. Wynne (Ed.) Student assessment and testing: Vol. 2 (Chapter 21, pp. 151-171). Thousand Oaks, CA: Sage.

Main points:
1) Validity is an integrated evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment (Messick, 1989, p. 13) [p. 150]
2) Assessment can be depicted as a chain of eight linked stages (administration, scoring, aggregation, generalization, extrapolation, evaluation, decision and impact) which form an assessment system
3) Problems in any of the links can undermine confidence in the assessment system and in any inferences and claims drawn from it
4) Validity estimation relies heavily on human judgment and is therefore harder to carry out, report and defend. [p. 150]
5) High reliability is necessary but not sufficient for high validity. Some degree of reliability is essential for validity. Reliability establishes an upper limit for validity.
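
The upper-limit claim in point 5 can be made concrete with a result from classical test theory. That result is not stated in these notes and is brought in here as an assumption: the correlation between test scores and a criterion cannot exceed the square root of the product of the two reliabilities. A minimal sketch, with the function name and the example values chosen only for illustration:

```python
import math


def max_validity(test_reliability, criterion_reliability=1.0):
    """Upper bound on a validity coefficient under classical test theory.

    Assumes the attenuation inequality r_xy <= sqrt(r_xx * r_yy).
    With a perfectly reliable criterion (r_yy = 1.0) this reduces to
    sqrt(r_xx).
    """
    for value in (test_reliability, criterion_reliability):
        if not 0.0 <= value <= 1.0:
            raise ValueError("reliabilities must lie in [0, 1]")
    return math.sqrt(test_reliability * criterion_reliability)


# Hypothetical reliabilities, for illustration only.
for r_xx in (0.49, 0.81, 1.0):
    print(f"reliability {r_xx:.2f} -> validity at most {max_validity(r_xx):.2f}")
```

The bound is a ceiling, not a guarantee: a perfectly reliable test can still measure the wrong thing, which is why reliability is necessary but not sufficient.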

Theoretical Framework
Concept of the validity argument: interpreting assessments of performance involves a linked series of inferences and assumptions (Kane, 1992; Shepard, 1993; Cronbach, 1988). If the inferences and assumptions can be identified, their plausibility can be examined by logical and empirical means, and the importance of each can be debated. [p. 151]

Assessment validation model
Crooks, Kane and Cohen’s (2008) assessment validation model depicts assessment as involving eight linked stages:
(1) Administration of assessment tasks to the student.
(2) Scoring of the student’s performances on the tasks.
(3) Aggregation of the scores on individual tasks to produce one or more combined scores.
(4) Generalization from the particular tasks included in a combined score to the whole domain of similar tasks (the assessed domain).
(5) Extrapolation from the assessed domain to a target domain containing all tasks relevant to the proposed interpretation.
(6) Evaluation of the student’s performance, forming judgments.
(7) Decision on actions to be taken in light of the judgments.
(8) Impact on the student and other participants arising from the assessment processes, interpretations and decisions. [p. 153]
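
The eight stages above can be sketched as an ordered chain in which a problem at any link limits confidence in the whole. This is only an illustration. The 0-to-1 ratings, the names `STAGES` and `weakest_link`, and the choice to summarise the chain by its lowest-rated stage are assumptions made here, not a procedure given in the chapter.

```python
# The eight linked stages, in order, as listed in the notes.
STAGES = (
    "administration",
    "scoring",
    "aggregation",
    "generalization",
    "extrapolation",
    "evaluation",
    "decision",
    "impact",
)


def weakest_link(ratings):
    """Return (stage, rating) for the lowest-rated stage.

    `ratings` maps every stage name to a judged strength between 0 and 1
    (a hypothetical scale). Ties go to the earliest stage in the chain.
    """
    missing = [s for s in STAGES if s not in ratings]
    if missing:
        raise ValueError(f"no rating for: {', '.join(missing)}")
    stage = min(STAGES, key=lambda s: ratings[s])
    return stage, ratings[stage]


# Hypothetical judgments for one assessment system.
example = dict.fromkeys(STAGES, 0.9)
example["extrapolation"] = 0.4  # e.g. tasks poorly matched to the target domain

stage, rating = weakest_link(example)
print(f"weakest link: {stage} ({rating})")
```

In practice the ratings would be qualitative judgments backed by evidence for each link. The numbers here only make the "any link can undermine the system" idea easy to see.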

Gipps (2008) - Socio-cultural aspects of assessment

Gipps, C. (2008). Socio-cultural aspects of assessment. In H. Wynne (Ed.) Student assessment and testing: Vol. 1 (Chapter 8, pp. 252-291). Thousand Oaks, CA: Sage.

1.     At all levels, assessment is a social activity, and we can understand it only by taking account of the social, cultural, economic, and political contexts in which it operates. (p. 252)
2.     Assessment plays an important role in cultural and social reproduction, in allocating educational and economic opportunities, and, more recently, in controlling curriculum and teaching. (p. 264)
3.     Changes in assessment practice and design reflect changes in world view, a resulting change in epistemology, and new understandings of learning. (p. 273)
4.     There are complex interactions among students, teachers, and assessment. (p. 284)
5.     When designing assessment, there are trade-offs between reliability, validity, and assessment of higher order thinking skills. (p. 283)
6.     Theories about intelligence and learning have implications for assessment design. (p. 272)
7.     Although new approaches to assessment have the promise of being more equitable, performance assessments on their own will not enhance equity. (p. 283)

1.     CLAIM: Although IQ testing, objective testing, and external examinations were seen originally as equitable tools for selection and certification purposes, a sociological critique calls this into question. (p. 260)
a.     Performance at school may be affected by social and cultural background factors. Among these factors are poverty, poor resources at home and/or at school, absenteeism owing to work or domestic duties, mismatch between the language and culture of the home and the school, gender bias, and ethnic discrimination. As a result, examinations may be biased, and furthermore, because of their role in certification, they may institutionalize and legitimate social stratification. (p.361)
b.    Cultural capital argument: children from lower social groups are not less intelligent or less academically capable, but children from middle-class homes are better able to do well at school because of the correspondence of cultural factors between home and school. As a result, examinations have a legitimating role in that they allow the ruling classes to legitimate the power and prestige they already have. (p.361) [see Bourdieu & Passeron (1976)]
2.     CLAIM: It is possible to run a national assessment program that includes high-quality examinations and some performance assessment, and it is possible to design an assessment program with different features and purposes at different levels of the school system. (p.283)
EVIDENCE & REASONING: experience in England in the 1990s (Stobart & Gipps, 1997; James & Gipps, 1998) provides existence proofs that it is possible to implement new assessment systems at scale.
3.     CLAIM: Although performance assessment and evaluation of culturally sensitive classroom-based learning have the potential to foster multicultural inclusion and facilitate enhanced learning, performance assessment on its own will not enhance equity. (p. 283)
EVIDENCE & REASONING: Consideration must still be given to students' opportunity to learn (Linn, 1993), the knowledge and language demands of the task (Baker & O'Neil, 1995), and the criteria used for scoring (Linn, Baker, & Dunbar, 1991). Clearly, as with traditional forms of assessment, questions of fairness arise in the selection of tasks and in the scoring of responses. Furthermore, the more informal and open-ended such assessment becomes, the greater the reliance on the judgment of the teacher/assessor. Here we come again to the issue of power and control, a theme of this chapter. Alternative forms of assessment do not, of themselves, alter power relationships and cultural dominance in the classroom. (p.283)

Psychometric theory: Psychometric theory developed originally from work on intelligence and intelligence testing. The underlying notion was that intelligence was innate and fixed in the way that other inherited characteristics are, such as skin color. Intelligence could therefore be measured (since, like other characteristics, it was observable), and, on the basis of the outcome, individuals could be assigned to streams, groups, or schools that were appropriate to their intelligence (or "ability," as it came to be seen). … With the psychometric model comes an assumption of the primacy of technical issues, notably standardization and reliability (Goldstein, 1996). (p.263)

Constructivist learning theory: students learn by actively making sense of new knowledge, making meaning from it (Iran-Nejad, 1995), and mapping it into their existing knowledge map or schema. Shepard (1991) notes that "contemporary cognitive psychology has built on the very old idea that things are easier to learn if they make sense." (p.271)

Sociocultural learning theory: Socioculturalists assume human agency in the process of coming to know, but socioculturalists further argue that meaning derived from interactions is not exclusively a product of the person acting. They view the individual as engaged in relational activities with others. Building on Vygotsky's arguments about the importance of interaction with more knowledgeable others and the role of society in providing a framework for the child's learning, sociocultural theorists thus describe learning in terms of apprenticeship (e.g., Brown et al., 1993; Glaser, 1990; Rogoff, 1990), legitimate peripheral participation (Lave & Wenger, 1991), or negotiation of meaning in the construction zone (Newman, Griffin, & Cole, 1989). (p. 271)

Tittle’s (1994) framework for an educational psychology of assessment has three dimensions: the epistemology and theories involved (both general and in relation to subject matter); the interpreter and user, whose presence, characteristics, needs, and values must be brought into the frame; and the characteristics of the assessment itself.

  1. Testing is now being used to control curriculum and teaching. (p. 283)
  2. Developments in cognition and learning are telling us to assess more broadly, in context, and in depth. This requires methods of assessment that do not lend themselves readily to traditional reliability, highlighting the tension between types and purposes of assessment. (p. 283)
  3. From an interpretivist viewpoint, it is important to acknowledge the complexity of interactions among students, teachers, and assessment. Factors such as students' perceptions of how testing affects them (Herman et al., 1997), student and teacher confidence in the veracity of test results, and differences in student and teacher perceptions of the goals of assessment all need to be considered. (p. 284)
  4. We need to bring out into the open the nature of the power relationship in teaching and assessment and point out the possibility of reconstructing this relationship. Perhaps most important, we need to encourage teachers to bring pupils into the process of assessment, in order to recognize their social and cultural background, and into self-assessment, in order to develop their evaluative and metacognitive skills. (p.286)
  5. A key direction for the future lies in the development of teachers' classroom assessment skills. It is evident from this chapter that some teachers are operating in collaborative, constructivist ways supported by portfolio work, for example, or as evidenced by their feedback to learners. Such practice is not common but clearly can become part of the teacher's repertoire. This implies the continued development of new assessment strategies for use by teachers, involving group and interactive assessment and interview and portfolio approaches. It will involve extending teachers' skills in observation and questioning while making them aware of social and cultural influences on the assessment process. (p. 286)