Performance Assessment and Authentic Assessment: A Conceptual Analysis of the Literature
Performance assessment and authentic assessment are recurrent terms in the literature on education and educational research. They have both been given a number of different meanings and unclear definitions and are in some publications not defined at all. Such uncertainty of meaning causes difficulties in interpretation and communication and can cause clouded or misleading research conclusions. This paper reviews the meanings attached to these concepts in the literature and describes the similarities and wide range of differences between the meanings of each concept.
There are a number of ill-defined concepts and terms used in the literature on education and educational research. This is a problem for many reasons, one of which is the difficulty of interpreting research results. There are several examples in the literature of loosely defined constructs that have been used differently in different studies, which has produced divergent results and, in turn, clouded or misleading conclusions. The diversity of meanings also makes communication and efficient library searches more difficult. Performance assessment and authentic assessment are two concepts that have been given a multitude of different meanings in the literature and are used with different meanings by different researchers. In addition, they are sometimes only vaguely defined and sometimes used without being defined at all. This multitude of meanings, especially in light of the lack of clear definitions in some publications, makes it difficult for teachers and newcomers to the assessment research field to become acquainted with the research in this area. It also causes misunderstandings and communication problems among experienced researchers, as is evident from a debate in the Educational Researcher. Furthermore, because of different histories of assessment practices, the difficulties caused by the confusion about the meanings of these concepts may arise even more easily in situations involving international participation. The introduction of the term authentic assessment and the increased use of the term performance assessment in theoretical school subjects seem to have come as a response to the extensive use of multiple-choice testing in the United States. But since many countries neither have nor have had such extensive use of multiple-choice testing, many non-U.S. researchers and practitioners do not share the experiences that led to these different meanings, which gives them very different bases for interpreting the many different (and sometimes vague) meanings. Indeed, a concept corresponding to performance assessment does not even exist in many countries.
The aim of this article is neither to present additional definitions nor to make judgments on existing ones. The intention of the article is to analyze the meanings given to the two concepts performance assessment and authentic assessment in the literature, in an attempt to clarify the diversity as well as the similarities of the existing meanings. Such a survey may be helpful for communication about important assessment issues and also for further efforts to arrive at definitions that can be agreed upon, which, for the reasons mentioned above, would indeed be desirable.
For these aims, it is important to acquire a full picture of the variety of meanings these concepts possess.
Most definitions of performance assessment seem to be subject-independent, and therefore the section about this concept mostly deals with definitions not specific to a particular subject. Since performance assessment is sometimes described by its typical characteristics and sometimes by a clearer definition, the section about performance assessment includes one subsection describing the characteristics that have been argued in the literature to be typical of performance assessments, and a subsequent subsection describing the different definitions. The latter subsection begins with an overview of the different types of definitions that have been put forth and concludes with examples of definitions that illustrate the similarities and differences in their meanings. Authentic assessment is treated in the following section. Definitions of authentic assessment are also often subject-independent, but not to the same extent as those of performance assessment. Therefore, both subject-independent and subject-specific definitions will be included. The subject mathematics will be used to exemplify the subject-specific definitions. The first subsection on authentic assessment provides a classification of different meanings, and is followed by two subsections with examples of definitions intended to clarify the classification.
Brief history
In the middle of the twentieth century the term performance test was in most cases connected to the meaning of practical tests not requiring written abilities. In education the idea was to measure individuals' proficiency in certain task situations of interest. It was acknowledged that facts and knowledge, on the one hand, and performance based on these facts and knowledge, on the other, were not always highly correlated. Judgement of the performance in the actual situation of interest was therefore desirable. The usefulness of such tests was regarded as obvious in vocational curricula and they seem to have been mostly applied in practical areas such as engineering, typewriting and music. Out of school, such practical performance tests were used, for example, for considering job applications and in the training of soldiers during the Second World War. In psychology, performance tests were mostly associated with non-verbal tests measuring the aptitude of people with language deficiencies. This historical heritage is still fundamental to the concept of performance assessment but now, at the turn of the century, the situation has grown considerably more complex.
From the 1980s onwards there has been an upsurge in the number of articles on performance assessment. The term assessment now coexists with the term test. But now theoretical school subjects, such as mathematics, have also become a matter of interest. It is appropriate, at this point, to acknowledge the difference between vocational school subjects and theoretical school subjects, such as mathematics as an independent subject, in terms of performance. In vocational subjects there are well-defined performances tied to the profession, which can be observed relatively directly. This is not the case for mathematics. Both a professional mathematician and a student may apply problem-solving techniques, but they solve very different problems and hence their performances are different. Students may occasionally be placed in task situations in real life beyond school, so performance in such situations may be assessed relatively directly, but there is no well-defined performance tied to the understanding of mathematical concepts and ideas, so inferences about such understanding can only be drawn from indicators.
The growing interest in performance assessment and the new focus on more theoretical subjects seem to emanate from dissatisfaction with the extensive use of multiple-choice tests in the United States. The validity of these tests as indicators of complex performance was perceived to be too low, and the tests were perceived to have negative effects on teaching and learning. When arguing for other forms of assessment that would better meet these requirements, the term performance assessment was recognized as a suitable choice. But desires for change open up a number of possible perspectives, so new views on the meaning of the attribute 'performance' have been added, and consensus on the meaning of performance assessment has not been reached.
The dissatisfaction with the emphasis on multiple-choice testing in the United States was also a fundamental factor in the development of the concept of authentic assessment. This much more recent term in education arose from the urge to meet needs that were perceived not to be met by the use of multiple-choice tests. Norm-referenced standardized multiple-choice tests of intellectual achievement were said not to measure important competence needed in life beyond school. Interpretations of results from such tests were claimed to be invalid indicators of genuine intellectual achievement and, since assessments influence teaching and learning, the tests were also said to be directly harmful. However, from the original idea of assessing the important achievement defined by Archbald and Newmann, a number of more or less related meanings have been attached to this concept.