CERC's Electronic Book

Doing Comparative Education: Three Decades of Collaboration


Part III: Achievement, Assessment, and Evaluating Learning


Source: Max A. Eckstein, "A Comparative Assessment of Assessment," Assessment in Education 3.2 (1996): 233-240.

A COMPARATIVE ASSESSMENT OF ASSESSMENT


The impact of examination systems on curriculum development: an international study. Christine De Luca, 1994. Paris: UNESCO.

National examinations: design, procedures and reporting. John P. Keeves, 1994. Paris: UNESCO - International Institute for Educational Planning (Fundamentals of Educational Planning No. 50)

A Comparative Study of Current Theories and Practices in assessing students' achievements at primary and secondary level. Henry G. Macintosh, 1994. Geneva: International Bureau of Education (IBE Documents Series, Number 4).

Why have some countries become so exercised about national assessment and examinations over the past decade or so? In England and Wales, the GCE has been substantially restructured, examinations for a new vocational qualification were introduced, and pressure is great to reform A-levels; in the United States, local administrations and the several States have been introducing new high school graduation requirements, some for the first time, and the idea of some kind of national school-leaving standard has again been raised; Sweden, which had virtually replaced its traditional university entrance examination with a more elastic set of tests and reports, is having second thoughts; and Israel last year cancelled a part of its stringent academic requirements for the traditional Abitur-like Bagrut. And why has so much interest been sparked in what is going on in foreign parts? Only a few years ago, it would have seemed most unlikely that national examinations "in other countries" would generate such intense interest and scrutiny.

Over half a century ago, Paul Monroe and I. L. Kandel of Teachers College, Columbia University conducted a major international inquiry on the subject in collaboration with scholars from several other nations. But it was not until the 1950s, with the groundbreaking comparative achievement studies of the IEA (International Association for the Evaluation of Educational Achievement) that comparative interest in examinations revived. It was clear that these were important administrative devices used for selection within national school systems, but they might also explain significant differences in student achievement among those countries. Then, in the late 1980s, World Bank efforts to reform teaching methods and curriculum in developing nations pointed to the instrumental value of national examinations for reforming curriculum and teaching. Since that time, study and discussion within individual countries as well as comparative investigations have repeatedly pointed to the critical importance of examination systems not only for student selection but also for setting and even raising educational standards.

Simply stated, assessment is important in a variety of ways to a variety of interested parties. It serves at one time as incentive and as yardstick for schools and school systems; it is a means of accounting as well as an instrument for individual advancement; and, increasingly, it provides a measure for nations to compare themselves with others. Thus, to examine examinations is no simple academic enterprise: it reaches into many corners of a country's total social context.

The three studies under review testify to the increasing attention given to the various roles and forms of national examinations: how they work, what they stand for, what is changing in some places, and what can be learned through comparative, international scrutiny of their practices. In particular, attention is increasingly directed at the question of how examination systems affect and are affected by school curriculum and achievement standards, and the vexing questions of how best to assess students and monitor school systems. Each of these works informs us on social and educational policy and practice in contemporary societies and makes an enlightening and useful contribution to the subject.

The three reports differ, however, in purpose, scope, methods and provenance. Their very differences suggest a number of criteria one might employ to assess them, including the number and range of countries described and discussed and the levels of schooling considered; the methodology employed; the sponsors of the respective works; the purposes of the studies; and the special academic interests they address.

The comparative study by de Luca, Research Officer of the Scottish Examination Board, was sponsored by Unesco, and conducted on behalf of the International Association for Educational Assessment. It seeks to assess the influence of examination systems upon curriculum development in seven countries, four of them in the developed world: Colombia, Egypt, France, Japan, Scotland, the U.S.A. and Zimbabwe. Data and conclusions are derived from an extensive literature search, including reports from professional and other publications, and from responses by informed persons to a questionnaire. De Luca concludes that examination systems do indeed have considerable effects on curriculum development. But whether these are positive or negative depends on a number of factors, such as the articulation between examinations and curriculum, the relationship between the agencies responsible for each of these activities, the degree to which teachers are involved in policy and the actual practice of curriculum and examination formulation, the availability of in-service training, and the degree of importance attached to the results. The author concludes that exams can have positive effects upon curriculum reform but that the effects on pupils are generally negative. Nevertheless, she observes, measurement of attainment and maintaining standards remain the major functions of examination systems and provide powerful justification for their continued existence.

Macintosh conducted his investigation for the IAEA on behalf of the International Bureau of Education in Geneva. The goal of this study was somewhat different from de Luca's: to compare how current thinking on assessment relates to practices in ten countries, half of them in the developed world: Australia, Bahrain, England and Wales, Guatemala, Israel, Malaysia, Namibia, Poland, Scotland and Slovenia. It points to recent changes in assessment in each country and overall, gives examples of gaps that appear between theory and practice and identifies some factors explaining these, and suggests ways of linking theory and practice more closely. The title is somewhat misleading, however, since the study is more about the uses of assessment, and about the relation of assessment to educational change, than about assessment theories per se. Macintosh concludes from his survey that, across the countries reviewed, selection continues to be the major purpose of assessment, rather than monitoring or diagnosis. He concludes that there is little pressure for change, and that current practices on the whole do little to help prepare students for adult life or to address the many social and educational problems that beset societies. The author agrees that assessment may serve as a vehicle for curriculum change and that this requires increasing involvement of teachers, but he regards most current assessment practices as obstacles to reform. Macintosh is all in favor of educational change. But the higher the examination stakes, the greater the pressure on examination-makers to maintain reliability and neutrality, and, hence, standards. The result, he asserts, is that examinations are less and less likely to encourage innovative and progressive curricula, which are called for as the school populations involved grow larger and more diverse. He ends with a call for more systematic and comprehensive training for teachers in the uses of assessment.

Keeves's contribution differs from the previous two in several important respects. It is one of a Unesco series explicitly directed towards educational planners and administrators as well as those generally interested in how educational planning may relate to national development. It, too, provides information on how countries select and certify their students, and discusses the various roles of central examinations, the influence they exert in their nation's school system, and their purported benefits and disadvantages. Keeves's data on national examinations derive from the 24 countries which participated in the IEA 2nd Science Study in the mid-1980s. This is the largest sample represented among the three studies, and therefore allows the author to make general, more global statements with greater authority. From the outset, this work is more truly comparative and more than a set of country studies. Keeves assumes that his readers have much of the necessary background knowledge about the countries he refers to, or at least that they have access to it and acknowledge the importance of context. He can therefore more easily move from a few selected themes to particular instances, rather than from a congeries of information to points of similarity or difference, as de Luca and Macintosh tend to do. Thus, for example, Table 1 puts together in succinct but comprehensible form such variables as school attendance at different stages of national education systems, grades at which the several stages end, points at which national examinations take place, and participation rates in tertiary and terminal secondary schooling.

Keeves's contribution differs from the other two reports in another important respect. Like the other two authors, he distinguishes between the different roles of national examinations. But he also analyses a number of practical and technical aspects of comparative achievement testing in considerable depth, raising the question: what are the consequences for national examination systems of advances in measurement and statistical analysis? For these exercises, the author is eminently well-qualified, given his considerable involvement with educational research in Australia and with the gargantuan international work of the IEA projects. In conclusion, Keeves suggests that the functions of national examination systems develop from a stage of selection and certification, to monitoring and deliberate policy influence upon curriculum and teaching -- an intriguing historical model that deserves attention.

The national case study approach taken by both de Luca and Macintosh presents considerable background information on each country, somewhat more generous in the de Luca work. Generalizations about trends and their causes often appear in the national profiles, but only in Keeves's contribution do we gain a sense of what these signify historically and from a more global perspective.

In addition to the factual information, the questionnaires distributed by Macintosh and de Luca respectively invited comments and informed opinion from respondents. Macintosh directs attention to current assessment theories and practices and to changes that specialists consider necessary. He reports views on how far existing arrangements were adequate to "meet the needs of contemporary society" and to prepare young people for adult life in it. In respect of her seven countries, de Luca's study of the influence of examination systems on curriculum development is based on a literature search, including press cuttings from "reputable education authorities," as well as responses by experts to a questionnaire. However, though only de Luca uses the term in reference to her respondents, each study makes use of an 'opportunity sample' of countries, an oxymoronic term if ever there was one!

Each work pays attention to the different forms of assessment and the characteristic practices in each of the countries they surveyed. Across the countries surveyed, essay, shorter answers and multiple choice are the main forms, with some oral examinations, depending on the national traditions. But it is not easy for the reader to track how prevalent a given form of practice is across countries, nor are the answers very evident as to why one style is more common in some places than others. For example, why do Japan and the U.S.A. depend so heavily on multiple choice, England and others upon essay answers, while some European countries, Poland and Slovenia, trust their teachers enough to rely on oral examinations? Still, internationally viewed, educators and administrators appear as very traditional and conservative. Only de Luca points to the relatively innovative practice of profiles of achievement as envisaged in the State of Vermont, U.S.A. An unfortunate example, as it turns out: the State Education Dept. recently cancelled its plans for such a system for budget reasons.

However, as Macintosh observes (p.26):
...there are in reality only a limited number of modes of assessments which can be used. How they are used is what matters. An individual can be asked to do something either in writing or orally or pictorially or through first hand or simulated demonstration. He or she can respond in writing, orally, pictorially or by demonstration -- all common modes of assessment in wide general use and referred to frequently in these responses. There is, therefore, no new mode of assessment awaiting around the corner to be discovered... Furthermore, this shared narrow assessment framework operates in an equally narrow curriculum framework...

All agree that teacher involvement in assessment is a significant variable, whether in setting the tasks, grading the students' responses, or evaluating and amending the examinations themselves. Considerations differ according to the stage or level (primary, secondary, college entrance), and whether one is looking at these specific jobs or at participation in overall national or regional policy discussions. Profiles of achievement obviously require considerable teacher input, as do teacher-graded, independent student examination assignments ("course work"). These may call for a teacher role that differs from a country's precedents, and for new skills and attitudes that require training and development. While attaching or even including students' scores from classroom assignments and periodic tests to marks gained in external examinations is common, preparing teachers explicitly to be involved, knowledgeable, and skilled in overall assessment practices is not. And it seems that when innovations are introduced, adequate investment in in-service teacher training is often lacking. To involve teachers more extensively and effectively in assessment is expensive and often calls for classroom teachers to adopt roles previously reserved for inspectors or administrators, a challenge to the teaching corps and many school administrators as well as to conventional thinking.

Data about the effectiveness of school practices used to be an internal matter for schools and school systems. But in many countries nowadays, such information has special significance beyond the realms of schools and educators as a way of demonstrating public accountability. 'League tables', currently in fashion in England as they have been elsewhere, serve as a means of such monitoring. The importance of external examination results on such balance sheets cannot be overstated. Like national attempts at data collection for the purposes of school assessment (e.g. the former efforts of the APU in England and Wales and of NAEP in the U.S.A.), they may have broad ramifications for school funding and administrative autonomy, or for parents seeking the best education for their children in an open market. The results of assessment, accompanied by other data, are also critical for comparisons across regions, and for the kind of monitoring that takes place when countries decentralize aspects of their educational administration. In short, the results as well as the forms of assessment are important not merely to students and teachers, but also in an increasingly consumer-oriented society, and have themselves become a market force in education.

Together, the three studies confirm that there is more to the study of examinations than statistics and psychometrics, and far more than merely comparisons of their academic content, to which, incidentally, none of the authors pay much attention. All three works demonstrate the broad instrumental uses of large-scale assessment practices and indicate the national policy implications. As a group, however, they inform the reader less than they might have done on a number of pertinent dimensions. For example, as already indicated, despite the considerable detail provided on a country-by-country basis in the de Luca and Macintosh studies, the authors do not generally venture into generalizations about the direction of national trends, let alone international ones. But more fundamentally, they do not point to many of the political aspects of examination practices. Political here refers to the participation of interest groups in the struggle to institute or to prevent changes. Depending on the country and its changing circumstances, the interplay of various interested parties may be critical in affecting the outcome of debates over many kinds of assessment issues. The pressures emanate from local and national government, school administrators, teachers, parents, students, university professors, employers and unions. In certain countries on occasion these may also include political parties. The issues affected include examination format, requirements, and management. Should the examinations be harder or easier? Unitary and uniform for all or multiple assessments, variable by some criteria or other? To what extent should they be internal to the school, or external, centrally or regionally determined?

Given the various uses of examinations identified by the three authors -- student, school and educational system evaluation, selection, monitoring, instituting and maintaining standards, improving curriculum and teaching -- it is difficult to see a world without some systematic use of assessment devices in schools. Despite rising and strident criticism, indications are that assessment, examinations, and credentialling are becoming more rather than less pervasive and more important on all fronts. Two of the three studies are rather negative toward national examinations, because of their impact upon curriculum development and the lack of articulation between examination theory and practice. Their authors are sympathetic towards the ideas of reform and change and regard national assessment as, on the whole, an obstacle to them. Keeves takes a more neutral line and is less judgemental on the subject.

To return to the question posed at the beginning of this essay, why are nations now so interested in the instruments of assessment and in the management of examinations? Because, apart from pedagogical considerations such as student learning and school management, examinations are a means of exerting power over individuals and groups. The owners of examinations, whether Ministries of Education, Boards of Examiners, private agencies, or whatever, possess power over important resources and decisions. Moreover, the results have value, economic as well as social. While they are by no means the only factor, poor examination results deny students access to certain levels and forms of advanced schooling and thus may close the doors to social, political, and economic advancement. In some countries, too, poor examination results are a reason for parents to choose another school for their children, or even cause an administrative authority to exert stricter control over a school. It cannot be sufficiently stressed that this is not just an individual, personal matter, or a domestic issue. Nations are more aware than ever before that their fortunes in the world depend on the educational levels of their populations. Average national school achievement is no longer merely a status symbol but, though crude, stands as one outstanding indicator of a nation's economic and social health.

Which brings us finally to the many dilemmas confronting those who look beyond the merely technical aspects of assessment. As Keeves observes (citing Eckstein and Noah on the subject [1]), policy options abound. This reviewer would have welcomed more attention to some systematic, comparative analysis of these dimensions, for example, the relation of policies and practices to such indicators as the level of development of educational systems, and the diversity of students that they service. The mix of more and less developed countries in each of these works should suggest some observations on the first of these topics, if only tentative. The overall expansion of schooling to include larger proportions of youth and thus a vastly more varied population than in the past, is an obvious and dramatic development but one treated only in passing by these authors. One of the purposes of international study is to gather relevant information, which all of these works do. But a second purpose is comparison, and only Keeves demonstrates much more than simple juxtaposition of the timely and interesting data gathered in these works.

In theory, the options are many, as we think of what may be the most desirable and efficient devices for assessing progress toward the educational and national priorities that nations determine are important for them. But it is hardly likely that even the most enlightened and energetic comparative investigation will come upon the one best system, even in theory. As some comparativists used to say, each country gets the system it wants (or deserves). It cannot divest itself of its historical context or of the effects of its own changing conditions. But is it too deterministic to conclude, then, that the 'best system' is the one best suited to the particular context?

The three works in their several ways demonstrate the great value of such comparative scrutiny to shed light on a set of problems that are common to all educational systems. Each is informative and thought-provoking to anyone interested in the forms and processes of assessment and in education in general. None of these works could have been accomplished without large-scale collaboration among people and institutions, and each demonstrates the value, indeed the necessity, of such international cooperation. The effort and energy that went into each of these three works were considerable. Parenthetically, could the work not have been more efficiently done if the authors and their sources of information had communicated more closely?

We now have a body of literature on the subject of educational assessment that extends well beyond histories of examinations in Imperial China and 19th century Europe. Just as there are fashions in consumer tastes, so there are fashions in educational research. One hopes that the topic these works deal with is not a passing fad. We still have much to learn on the subject and it is far too important to be neglected. Comparative study of examinations has made us increasingly aware of the complex and multiple effects and causes of particular assessment practices. Educational planners and examination reformers ignore these at their peril and at the risk of having their plans fail or produce unanticipated outcomes.

As we have noted, policy dilemmas abound. They will not disappear. But these three works remind us that when considering examinations, or for that matter any educational issue, we must from time to time return to first principles: certainly, what and how do we wish to assess, but more fundamentally, what are the goals and functions of education in modern society that are to be the targets of our examinations?

NOTES

  1. Max A. Eckstein and Harold J. Noah (1993). Secondary School Examinations: International Perspectives on Policies and Practice. New Haven and London: Yale University Press. Pp. 243-45.

