Originally prepared by Professor Harold Silver.
Component now led by Dr. Nick Pratt.
© H Silver, Faculty of Education, University of Plymouth, 2004
(links reinstated August 2006)
2 What is evaluation?
5 Internal evaluation
6 The evaluator
8 References and further reading
The questions to be addressed are interrelated and can be summarised as:
What is evaluation?
Is it research?
How and by whom?
Evaluation has become a widespread activity, internationally, under that name since the 1960s, in a variety of contexts. Although we are focusing on education here, it is important to remember that evaluation models have been developed elsewhere, notably in the social sciences. It has been used to test the effectiveness of, for example, national and international programmes in agriculture or crime prevention, health improvement or transport policy. A postgraduate course in applied social science at Manchester University was introduced as follows:
Increasingly, social service providers, programme administrators and legislators use evaluation research in order to consider the effectiveness of new and existing programmes, procedures and/or interventions at producing some form of outcome or change. The findings from evaluations focus on the strengths and weaknesses of various aspects of innovations as well as of their overall outcome. This information is, in turn, used to consider how such interventions might be modified, enhanced or even eliminated in the effort to provide a better service, fulfil a particular need or meet a specific challenge.
In education evaluation has served a somewhat similar purpose, and has been applied to major programmes of whole-school reform or specific curriculum changes, and more limited projects to try out innovations. There is a vast literature on different types and purposes of evaluation, and we shall sample some of it here as we address the priority questions. Some of the discussion overlaps with issues discussed in other RESINED components such as action research, qualitative research and interviewing, and the links will be highlighted as they occur.
At the lowest level evaluation is a regular social activity, such as that conducted by Which? magazine and other publications, and by ourselves. It makes comparisons amongst products or services, with a view to making a selection: a kitchen utensil or an investment, a car or a Chardonnay. At this level evaluation is comparative on the basis of relatively straightforward criteria and available information, and is a preliminary to decision-making. The criteria, of course, are not the same for everyone evaluating: comfort may or may not override the cost or style of a car, and labelling may or may not influence choice of a wine. In education the purposes and the criteria are inevitably more complex and evaluation is a process of acquiring information. Evaluation of an innovation or an activity, a curriculum or organisational change, raises a series of sometimes difficult or contentious issues. Who is sponsoring the evaluation, what do they want to know, and why do they want to know it? What depends on the outcomes: more or less finance, promotion or redundancy? What is the salient issue for the evaluation: change in student learning, staff development, value for money, position in a league table? Whose opinion counts most: students' feedback in the university, the teachers' perceptions in the school, project managers, administrators?
Evaluation in education therefore encompasses competing criteria and purposes, and is situated in potentially sensitive political and ethical contexts.
If you will be undertaking a 'task' at the end of this
component you may find it helpful to make some notes as you
go along. At this point you could make a preliminary list of
problems you think might be encountered in evaluating
a new initiative in your own institution.
It is important to note that evaluation research (a concept discussed below) is basically what is commonly called programme or project evaluation. Such evaluation (in its various forms) may have the same or similar features at all levels of education; it concerns innovations, initiatives and developments of many kinds, and is mostly conducted by individual evaluators or small teams. There are, however, other forms of evaluation that are not included in the discussion here. These include, commonly in higher education, the evaluation of teaching quality or of research or the evaluation of institutions, as part of a system approach to quality assurance conducted by national agencies. Teaching quality and institutional evaluation may also be conducted internally as a form of self-evaluation (eg Ellington and Ross, Evaluating teaching quality throughout a university [Robert Gordon University], and Adelman and Alexander, The Self-Evaluation Institution).
Definitions of evaluation can indicate the intentions involved, but are elusive as complete explanations. The kind of definition that was often used in the 1950s and 1960s, notably in the United States, was:
Evaluation is the systematic assessment of the worth or merit of some object.
The judgmental tenor of that definition in fact reflects the evaluation of cars or Chardonnays, assessing their worth or merit in order to choose, though it does not reflect the casual nature of personal judgments that are often unsystematic. Subsequent attempts to define evaluation have adapted this formulation. Trochim, in the United States, for example, suggests:
Evaluation is the systematic acquisition and assessment of information to provide useful feedback about some object.
He explains the older and the revised versions, which both agree that evaluation is systematic and use 'object' to refer to a programme, policy, technology, person, need, activity and so on. The revised definition, however, emphasizes acquiring and assessing information rather than assessing worth or merit because all evaluation work involves collecting and sifting through data, making judgements about the validity of the information and of inferences we derive from it, whether or not an assessment of worth or merit results (Trochim, website). Whether evaluation makes judgments or is a preliminary to other people making judgments is a contentious issue in the field (and is discussed further below). The former definition, assessing worth or merit, inescapably involves acquiring and assessing information, but the revised version does focus on the information. It suggests that assessing worth depends on an analytical approach to information, that is, on an understanding of the object about which feedback is required.
Another, this time British, attempt at revising the first definition was in connection with the evaluation of educational institutions. It defined such evaluation as involving:
the making of judgements about the worth and effectiveness of educational intentions, processes and outcomes; about the relationships between these; and about the resource, planning and implementation frameworks for such ventures. (Adelman and Alexander 1982, p. 5)
While retaining the notion of making judgments about worth, there are two important extensions in this version. First, the object of study has acquired intentions, processes and outcomes; it is a complex sequence in which the parts have relationships, and it is therefore clear that evaluation is concerned in some way with that sequence. Second, this sequence is not isolated. It is in a framework which has to do with resources, planning and implementation. Evaluation therefore understands the sequence only by also taking account of the framework in which the sequence takes place. The curriculum is in a classroom with its relationships, in a school, and in a complex and interactive context involving families and communities, authorities and the various levels of policy making - all of which affect what is taught and learned. Further education colleges and universities have their own departmental, disciplinary, institutional and other contexts that may have to be taken into account when a project or initiative is evaluated.
In considering evaluation in your institution are there possible
major issues concerning relationships in the context of management,
the whole institution, outside constituencies and agencies...?
If you were to conduct an external evaluation in an institution other
than your own, how different might the issues be from the ones
you have considered above?
Types of evaluation
With these preliminary considerations in mind, it would be helpful to look carefully at the following and make some tentative choices regarding the role or roles that may seem most appropriate in your evaluation of the project, programme, innovation or other initiative (for simplicity's sake we will encompass all of these from now on in the term 'project'). The evaluator's role is:
to be as objective as possible (interviewing, questioning, reporting on findings, not being too close to the participants) and to report to the person or body for whom the evaluation is conducted;
to collect data rigorously and scientifically;
to feed back impressions to participants (so that they can take note of your findings and improve their activities);
to understand and describe the project and make judgments;
to be involved with the project from the outset, working with the project participants to plan their programme and the evaluation together;
to monitor the process, that is, the implementation of the initial terms of reference or objectives of the project;
to focus on the life of the project in its relevant wider contexts;
to investigate the outcomes, successful or unsuccessful, of the project;
to judge whether the project has been (or is likely to be) value for money;
to conduct an external evaluation and nothing more;
to help participants to conduct an internal evaluation, in addition to the formal external one, or as a substitute for it.
It will be clear from the choices available that evaluation is far from being a simple or standard activity. The choices are neither right nor wrong, but may be more appropriate to particular programmes, conditions and requirements, and to the self-image of the evaluator. Evaluators and evaluation theorists have extensively explored the alternatives and these have been the focus of various kinds of controversy. To compare your own preferences or issues with some of those in the literature, see the linked material on types of evaluation. We cannot here consider all of these alternative approaches, but it is important to emphasise two that are frequently met in the evaluation literature.
Process and impact
The purposes of evaluation can be encapsulated in these two terms, the former to highlight what is and has been happening, the latter to attempt to indicate what has happened as a result. Both encounter difficulties.
An impact evaluation assesses the changes in individuals’ well-being that can be attributed to a particular program or policy. It is aimed at providing feedback and helping improve the effectiveness of programs and policies. Impact evaluations are decision-making tools for policymakers and make it possible for programs to be accountable to the public. (World Bank, website)
Such a role for the evaluator raises questions, discussed below, of the kind of contract agreed at the beginning of the evaluation, and the possible influence of the audiences for the reporting procedure at the end. There are issues about the tentative or reliable nature of impact data, which may differ considerably by type of project. Since a funding agency may require impact data and an evaluator may find such data unattainable, there is room for misunderstanding and conflict.
Formative and summative
These may be, but are not necessarily, related to the above.
Hopkins (as we saw above in terms of types of evaluation) made the simple suggestion that formative evaluation was when the cook tasted the soup, and summative when the guest tasted it. He also suggested that the difference was not so much when as why. What is the information for: for further preparation and correction, or for savouring and consumption? Both lead to decision making, but toward different decisions (Hopkins 1989, p. 16). This latter distinction establishes the difference between these concepts and those relating to process and impact. Formative evaluation is designed to help the project, to confirm its directions, to influence or help to change them. It is more than monitoring or scrutinising; it serves a positive feedback function (which process evaluation does not necessarily do). Summative evaluation is not just something that happens at the end of the project; it summarises the whole process, describes its destination, and though it may have insights into impact, it is not concerned solely with impact.
Summative evaluation has often been associated with the identification of the preset objectives and judgments as to their achievement (again, not necessarily in terms of impact). The assumption in this case is that, unlike in formative modes, evaluation is not (should not be) involved in changing the project in midstream; otherwise the relationship between objectives and their achievement cannot be evaluated:
…every new curriculum, research project, or evaluation program starts with the specifications to be met in terms of content and objectives and then develops instruments, sampling procedures, a research design, and data analysis in terms of these specifications. (Bloom 1978, p. 69)
Starting specifications that are expected or required to be met therefore dictate the nature of the summative evaluation. The instruments or sampling procedures cannot produce pure data if the process is corrupted by the intervention of evaluator feedback or other alterations to the original specifications. It is possible to conceive of evaluation as both formative and summative, but in this case summative comes closer to meaning final, and cannot present data and make judgments as purely as is suggested in Bloom's definition.
Other approaches to evaluation emerged in the last quarter of the 20th century, and some will be mentioned further below in relation to methodology. These have included illuminative, democratic (as opposed to bureaucratic evaluation), participative and responsive evaluation. These all have implications for the role of the evaluator in relation to the project, for example, sharing with the project participants, responding to the activity not to specifications and intentions, identifying and reporting differences of perspective and values, emphasising the importance of understanding or recording competing perceptions. Much of this work relates to discussion in other RESINED components, notably action research and case studies.
You could at this point consult the paper by Parlett and Hamilton on 'Evaluation
as illumination' in Hamilton et al., Beyond the Numbers Game
(quoted in types of evaluation),
and other contributions to this influential book.
See also the chapter on 'Program evaluation: particularly responsive
evaluation' by Robert Stake, in Dockrell and Hamilton, Rethinking Educational
Research, and Helen Simons, Getting to Know Schools in a Democracy:
the politics and process of evaluation.
We have so far by-passed discussion of the terms 'evaluation' and 'evaluation research' and some difficulties inherent in this vocabulary, and also in its use alongside other terms sometimes used in relation to evaluation, including 'applied research' and 'academic research'. Jamieson suggests that there are basic differences between the last of these and evaluation research, in the degree of constraint on their purpose and operation, the funding and its implications, publishing and reporting:
Evaluation reports and research reports not only have different audiences but their main objectives are different. The goal of the research report is the enhancement of understanding and knowledge via publication to the scientific community. The main goal of the evaluation report is to inform and/or influence decision makers… the relative emphasis of the two activities must be different. (Jamieson 1984, pp. 72-3)
This seeks to establish one kind of distinction, but Jamieson also indiscriminately uses evaluation and evaluation research in the argument. So is evaluation a form of research? The question ultimately raises issues about the nature and definition of research as well as of evaluation, and to approach these issues let us take some examples of discussion of the relationship.
Different or the same?
For some commentators the distinction is between evaluation and research, ignoring any such concept as evaluation research. The distinction drawn is generally between the methodology of research based in the social sciences, and often directed towards answering questions relating to policy, even to improving it. Evaluation, particularly the more recent approaches to evaluation, is seen as serving a very different purpose. Parsons argues that if evaluation is seen as serving the interests of decision makers then it has no right to claim the title of evaluation: it is then a form of research and should obey the rigorous rules of research, and it is then the decision makers who are the real evaluators. He particularly excludes formative evaluation from the definition of research:
Formative evaluators work alongside development or action research teams with the task of feeding such teams with information that might help them modify their work, counter weaknesses, anticipate problems and so on. The formative evaluator is an internal critic and provides an information feedback service… [Formative evaluation] serves a narrow audience, the developers, and to be effective needs to be closely allied to or an integral part of the team. The commitment thereby generated would make the formative evaluator suspect as the provider of objective summative information of significance to a wider audience. (Parsons 1981, pp. 40-2)
This is a critique of claims for evaluation as research. Others, however, see the distinction as a necessary and positive one. A crucial point in this argument is identified by MacDonald and Walker:
The methodological difficulties faced by curriculum evaluators who want to offer a comprehensive range of information about new programmes have drawn them to the case-study as a technique. Many of the quite legitimate questions that are put to evaluators, especially by teachers, cannot be answered by the experimental methods and numerical analyses that constitute the instrumental repertoire of conventional educational research. (MacDonald and Walker 1977, p. 181)
This argument treats experimental methods and numerical analyses as conventional research, itself under attack from case study and other approaches to research (including action research). There is, of course, no one way to conduct case study research or action research, but broadly speaking distinguishing evaluation from research involves also drawing a distinction between them both and conventional forms of research.
Evaluation organisations themselves sometimes distance themselves from such social policy-based versions of research. In American examples chosen earlier it may be difficult to judge in what ways they are research. The Action Evaluation Research Institute defines its central evaluation activity as
a new method of evaluation, one that focuses on defining, monitoring, and assessing success. Rather than waiting until a project concludes, Action Evaluation supports project leaders, funders, and participants as they collaboratively define and redefine success until it is achieved. Because it is integrated into each step of a program and becomes part of an organization, Action Evaluation can significantly enhance program design, effectiveness and outcome. (AERI [2000?], website)
Explicitly, the approach is differentiated from traditional evaluation, and implicitly its purposes and methodologies differentiate it from traditional research. The strategy may be based on extensive research, but the strategy itself is difficult to define as research.
Return to types of evaluation and judge whether you think the examples
can or cannot be described as research.
It can also be suggested that evaluation and research are the same or out of the same stable of activities, not least by using the concept and title of evaluation research. An early American attempt to consider the relationships between research and evaluation studies thought it evident that many of the activities undertaken in evaluation and in research in education were the same. In research itself it points out that a distinction is often drawn between applied research and basic research on the basis of utility or simply new knowledge. Since evaluation studies are made to provide a basis for making decisions about alternatives, questions of utility are also addressed. This account sets out the range of possible differences between ideal research and evaluation studies, and though the differences exist it concludes that they share many characteristics of method and approach, they both add to new knowledge, stimulate and benefit from the development of theory and contribute to a science of education. The essential differences are not those of the evaluator and the researcher, but those of different kinds of subsequent decision-makers:
The consequence of the differences between the proper function of evaluation studies and research studies is not to be found in differences in the subject interest or in the methods of inquiry of the researcher and of the evaluator. It is to be found in the manner in which the outcomes of the two types of studies are used and regarded. (Hemphill 1969, pp. 189-92)
'Studies' is here simply a substitute for 'research'. The defence of evaluation as either a form of research or as part of the same family has continued to emphasise that the confusion has related to stereotypes of both activities. Both have encountered debates about methodology, including a case study approach; both have erected and torn down barriers round their respective professional communities; both have faced problems about their relationship to patrons, funders and audiences.
These debates about evaluation and/or research can be pursued
in chapters by Parsons (A policy for educational evaluation) and
Simons (Process evaluation in schools) in Lacey and Lawton
Issues in Evaluation and Accountability; in Stenhouses chapter
on The evaluation of curriculum in An Introduction to Curriculum
Research and Development, or in other items in the
Is it simply a case of 'it all depends on what you mean by…'?
Whatever the distinctions between academic research and evaluation research, the research methods used are broadly similar though any given activity in either may use only a segment of the methods available, and these will be overwhelmingly in the methods of qualitative research. Across the range of evaluation approaches the methods will include interviews and questionnaires, focus discussion groups and observation, case studies and diaries or logs. Some of these are discussed in RESINED components on interviews, observation techniques and questionnaires in education research. Some methods are used for particular evaluation strategies and purposes.
In objectives-based and some other kinds of evaluation, for example, pre-test and post-test strategies are likely to be used, in order to provide a baseline on which to make judgments about how much has changed as a result of the project. An American approach to evaluating whole school reform explains the strategy as follows:
This model makes the assumption that without the intervention, things will go on as they did before. Other things being equal, teachers will continue to teach as they did before, and students will continue to show the same pattern of achievement as they did before. With the intervention, things will change over time, it is hoped in a positive way…The model can include repeated measures… The pattern of change at different points in time can then be interpreted as a result of the intervention. (North West Region Education Laboratory, website)
A health-related example goes into greater detail:
In order to determine how well the program is working to change those factors that cause social problems, an evaluation needs to address these specifically. Often, this means focusing upon how much behaviors or behavioral determinants (knowledge, attitudes, beliefs, skills, or values) have changed from prior to the program intervention until sometime after it. The questions answered in this type of evaluation refer to the program's goals and how well they are being reached. For example:
How much did participants' knowledge of tobacco as an addictive drug change due to the program?
Have youth feelings of community empowerment increased between the start and finish of the program?
Are High School students less likely to engage in alcohol use because of the program?
The most common way to answer these questions is through the use of pre and post surveys of participants. This is not the only way to gather data on changes in behavior or behavioral determinants… However, the pre and post survey method can provide a good way to compare participants before and after the program intervention. (Nebraska Council to Prevent Alcohol and Drug Abuse, website)
The model makes assumptions about the possibility of outcomes occurring directly or uniquely as a result of the intervention, and being susceptible to accurate measurement. The components of such an approach include objectives, data collected by test surveys and rigorous adherence to an impact model. Though it is possible to collect change data during as well as at the end of the project, relaying these data back to the project formatively would distort the ability to measure pre- and post- situations. As with Bloom's description of summative evaluation, the problem is that of ensuring the achievement of undistorted, uncontaminated data.
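As an illustration only, the arithmetic behind a simple pre-test and post-test comparison can be sketched in a few lines of Python. The scores, the function name paired_change and the choice of a paired t statistic are assumptions made for this sketch and are not taken from any of the evaluations quoted above. The calculation describes how much matched participants changed; it does not by itself show that the change was caused by the intervention, which is precisely the assumption the model has to make.

```python
from math import sqrt
from statistics import mean, stdev


def paired_change(pre, post):
    """Summarise change between matched pre- and post-test scores.

    Hypothetical helper for illustration: each position in `pre` and
    `post` is assumed to belong to the same participant.
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must hold one score per participant")
    if len(pre) < 2:
        raise ValueError("at least two participants are needed")

    # Change score for each participant (post minus pre).
    diffs = [after - before for before, after in zip(pre, post)]
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample standard deviation of the changes

    # Paired t statistic: mean change divided by its standard error.
    # Left as None when every participant changed by the same amount.
    t_stat = mean_diff / (sd_diff / sqrt(len(diffs))) if sd_diff > 0 else None

    return {
        "n": len(diffs),
        "mean_change": mean_diff,
        "sd_change": sd_diff,
        "t": t_stat,
    }


# Invented knowledge scores (0-10) for six participants.
pre_scores = [4, 5, 3, 6, 5, 4]
post_scores = [6, 7, 4, 6, 8, 5]

summary = paired_change(pre_scores, post_scores)
print(summary)
```

With repeated measures, the same function could be applied to each pair of measurement points in turn. Whether the resulting pattern can be interpreted as a result of the intervention remains a question of design, not of arithmetic.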
An evaluation method used primarily in higher education is that of student feedback, on a new course or form of delivery, or regularly on the student experience. Of the formal methods of obtaining feedback, questionnaires are the most common. Students may be asked to give their views on the curriculum and the teaching, the course management and assessment, and the analysis of the questionnaire may be used as part of a broader process, as a basis for interviews or group discussion. An alternative is some form of structured (or pyramid) group feedback, in which the group is split into small and then larger groups, with agreed points being reached at each stage, for presentation at the end in plenary discussion. The aim is to obtain feedback without any person or group dominating the response. Nominal Group Technique has the same purpose, but normally involves no discussion (except for item clarification), being based on participants' own written recording of their views, including nominating points for inclusion in the report, and then a presentation by the session leader of the ideas expressed, with no attempt to evaluate the suggestions. The NGT procedure aims at maximum objectivity of feedback, and shares with other forms of structured feedback techniques an attempt to make feedback representative. The purpose of regular feedback of these kinds is to inform the teaching staff of the state of a course or the success or otherwise of an intervention and is therefore a different kind of formative evaluation. The procedures can therefore also support the evaluation of an initiative, and stand alongside other forms of evaluation of teaching or curriculum change (for fuller details see O'Neil and Pennington, Evaluating Teaching and Courses from an Active Learning Perspective, pp. 21-34).
Although structures of the kinds outlined above are not typical of illuminative or similar approaches it is important that some kind of structure is involved. Parlett and Hamilton, for example, introduce illuminative evaluation as being in three characteristic phases: investigators observe; inquire further; and then seek to explain. They give an example of how these three stages took place and overlapped, and with this three-stage framework an information profile is assembled using data collected from four areas: observation, interviews, questionnaires and tests, documentary and background sources (Parlett and Hamilton 1976, pp. 14-15). For course development, training or other initiatives the range of methods available is wide. For evaluation generally, focus groups and questionnaires, interviews and implementation logs, feedback and testing methods are part of a menu of approaches. One American list of evaluation tools for projects involving interactive media contains 39 items, without including anything relating to an ethnographic approach.
What all of these methods do is attempt to penetrate the complexities of social situations within which evaluation of an initiative takes place, whether or not the evaluation is descriptive or judgmental. Policy, project, documentation, process and outcome are not givens and the evaluator relates to them in a variety of ways, using a variety of approaches. There is no appropriate evaluation method, only the selection of an approach or approaches for a particular situation, depending on the predominant assumptions of the evaluator, a project team and the evaluation sponsors in some kind of understanding or negotiation. Determining the strengths and weaknesses of a particular method is therefore itself an elusive process, and a great deal of emphasis has to be laid on all the preliminary encounters and insights obtained either at the beginning of the evaluation or, preferably, before the project and the evaluation are launched. The evaluator's statement of intent, terms of reference or contract therefore needs to be clear about a number of agreed principles, though not necessarily, of course, in any of the following vocabularies:
Prior clarification of the assumptions underpinning the project and the assumptions of the evaluator. This could lead to identifying the style of evaluation that would be appropriate: whether based on objectives and measurement or a kind of joint exploration between project and evaluation, whether autocratic and bureaucratic or democratic. An important element in this clarification is identifying the source of power: for whom is the evaluation primarily intended, and who can influence the operation of the evaluation?
What is most wanted from an evaluation? Final calculation of whether the project has been value-for-money? Ongoing feedback on implementation (is it working?), and feedback to whom? What are the proposed outcomes? Are there already (eg in a funding contract) defined objectives and expectations, possibly against existing baseline data? Answers to such questions may determine whether to define the evaluation as description and enlightenment, analysis and judgment.
The intended end product of all interventions is change. What kind of change is anticipated, and how can the effectiveness of the intervention in producing it be either measured or portrayed? Discussions of evaluation often focus on what does not produce change. For example, there are views that self-evaluation by an institution is unlikely to result in change; that testing instruments do not reveal what actually happens and what produces the change; observation may reveal little of changes in teaching. The strengths and weaknesses of a particular method may therefore be a function of its ability to respond to underlying questions of any project: Who wants to know what, and why? - and whether there is a method or methods that will offer something worthwhile.
The focus of the discussion here has been on external evaluation. The assumption has been that evaluation with a research connotation is conducted by someone (or a team or group) external to the project evaluated or to the institution. In the absence of, or in collaboration with, an external evaluator, some of the approaches and methodologies discussed above may also be applicable to the internal (or 'self-') evaluation conducted by the project leader or team. One task of an external evaluator may be to advise on internal evaluation methodology (interviews, questionnaires…) on an initial or ongoing basis. It is common for small projects or initiatives, whether funded from within or outside the institution, to require an evaluation without making provision for an external evaluation.
Some internal evaluation may consist solely of the collection of limited data (perhaps similar to that undertaken to obtain student feedback in higher education, discussed above). Where a single person is responsible for the project and has an audience or partnership (for example, of students, colleagues, other professionals, patients…) there may be a tendency to rely on informal feedback or opinions from those with whom the project has occasional or regular contact. This in no way meets the requirements of any of the versions of evaluation research that we have considered. To answer the questions 'What do you know?' and 'How do you know it?' something more systematic has to take place.
As with all evaluation and research there is a strong temptation to include, and in internal evaluation to rely on, questionnaires as the source of data. Planning, conducting and analysing a questionnaire are subject to difficulties and pitfalls (see RESINED component on questionnaires). Interviews, particularly within the limited community that may be covered by a small project, may be difficult to conduct. Structured discussion in small groups in what are described above as focus groups or using a pyramid approach may be useful tools in these kinds of situations, combining structure and focus for small group discussions. The use of logs or diaries by participants in the project may be a valuable alternative or supplement to any of these approaches.
For a small, internal initiative (though possibly one of a number of such initiatives) the institution is unlikely to enter into a contract or agreement on evaluation in the same way as for an external evaluator. What is needed, however, is agreement at the appropriate level for at least initial contact by the project leader(s) or team(s) with a consultant who can discuss and advise on the options available for internal project evaluation. The onus rests with the committee or senior staff to ensure not just that evaluation is needed but also that such advice is available to those conducting the project.
The conduct of an evaluation may be by a full-time professional evaluator, part-time by a member of the same or another institution, by a team of two or three people or a much larger team. The evaluation may be for a short period or for a number of years. The appointment of the team may be by the institution with a great deal of input or very little input by those conducting the project. The evaluator may be required to report to a steering group or committee responsible for the project, to the institution, or to the funding body or to some or all of these. The ground rules for the evaluation may be decided by the evaluator, or may pre-exist the appointment.
On the last of these points, for example, the government's Department of Employment (as it was in 1994) issued a document entitled Evaluating Development in Higher Education: a guide for steering committees, contractors and project staff. The Employment Department, like all government departments and others, did not just fund projects: it contracted with an institution or organisation for a piece of project work, and the contract stipulated a series of requirements. On the question of evaluation the Department indicated:
All this work requires evaluation. All the partners (individual staff, departments, project steering groups, institutions and the Department among others) need to know whether it has been successful and is worth imitating, what lessons there are for the future, and what further development or research may be needed. Without this, resources, including scarce staff time and energy, will be wasted in repeating mistakes and rediscovering what is already known.
For guidance the document set out 11 key questions, on such matters as the customers for the work and the evaluation, the balance between formative and summative evaluation, the data and the outcomes, baseline information. It suggested that steering groups might add their own questions, concerning how the planned work was carried out, whether each objective was achieved, future development and value for money. Although this was a guide, there were issues that should always be addressed:
assessing contract compliance and value for money
contributing formatively to development
informing future agenda building and gathering intelligence
informing the review of development and evaluation methodology (Department of Employment 1994)
In this kind of case with such requirements built in at the contractual stage before the appointment of an evaluator, the latter will enter a pre-determined situation, since the institution or steering group will have committed themselves to a project and an evaluation within this framework. The evaluator will have room for manoeuvre at the margins, mainly in the selection of a methodology that will provide answers to the questions that have already been formulated.
In other situations, of course, the steering group or project team will have only the broadest (if any) guidance, and the evaluator, possibly in order to secure the appointment, will be asked or will volunteer to supply an evaluation brief setting out in appropriate detail what the evaluator intends to do (style of evaluation, time commitment, ownership of the data and the evaluator's report(s), means of negotiation of any changes in evaluation procedure…). Some situations may require only an informal relationship, generally for modest interventions without external funding. In all cases, even in the most informal, a contract between the project management or the institution and the evaluator is essential, even if only to specify time, payment and any requirements, for example the date by which a final evaluation report has to be provided. These preliminaries are necessary but only partially protect the evaluator. As one commentator put it:
… people who accept positions as evaluators place themselves in a vulnerable position: to put it neatly the evaluator sets himself up for evaluation…In embarking on an evaluation the evaluator makes a commitment to deliver some goods… failure to deliver the goods, or to deliver superior goods, will be an embarrassment at least, if not a serious threat to his academic status or career prospects. (Gomm 1981, p. 127)
This threat can be particularly acute if there are multiple audiences for the report, and it is not impossible for evaluators to be tempted to minimise it by muting critical content in the report. Stake, in the United States, describes the position to make it more than just a hypothetical one:
It is recognized, particularly by Mike Scriven and Ernie House, that cooption is a problem, that the rewards to an evaluator for producing a favorable evaluation report often greatly outweigh the rewards for producing an unfavourable report. I do not know of any evaluators who falsify their reports, but I do know many consciously or unconsciously choose to emphasize the objectives of the program staff and to concentrate on the issues and variables most likely to show where the program is successful. I often do this myself… (Stake 1980, p. 74)
A form of reporting that entails judgments and possibly recommendations (a common but not a universal element of reports) therefore raises particular issues of this kind. Cooption is a danger of case study, illuminative or similar forms of evaluation, since the evaluator by definition in these cases works closely with the team and may feel tempted or even obliged, as Stake suggests, to highlight their view of the process and outcomes, and given the trust that has been involved, to highlight the positive ones. The danger is not inevitable, and can be overcome by adherence to the initial principles and strategies agreed for the project. This makes the initial agreement, and the forms of consent of the parties concerned, all the more important. Agreement at the outset needs to be clear about the process, the outcomes, the audiences and the nature and purpose of the report.
Given normal undertakings of confidentiality, the literature of evaluation contains few examples of actual reports. Those that are in the public domain are normally those submitted on major national or international initiatives, may be very substantial, and in some cases are on internet or intranet websites. A 100-200 page final report, probably highly statistical, on a multimillion-pound or multimillion-dollar project in agriculture or literacy will not help to illustrate the issues discussed here at more modest levels.
However, it may help finally to summarise some advice to have in mind when undertaking an evaluation:
Ensure at the outset that you have a full discussion of what you are going to be doing, resulting in an agreed written statement. This may cover time scales, finance, reporting (frequency, to whom, ownership of reports…).
Be sure whether it is a process or impact, formative or summative, evaluation, though this is not necessarily the language of what is agreed.
Be clear about the intended methodology (observation, interviews, questionnaires, focus groups, diaries…) and the relationship with the project team, other participants and project management (senior staff, steering committee…).
Be sure about confidentiality (e.g. if formatively reporting to the project team, what information it is legitimate, or not, to reveal; whether interviewees will be identified or identifiable in reports…). The project team and others involved need to understand the confidentiality position, and it may be advisable to explain this and other matters in writing for everyone concerned (commonly referred to as an ethics protocol).
If there is also to be an internal evaluation, consider what help you can give on its purpose and methodology.
When submitting reports (interim, final), will they go first as drafts (to whom?) to be checked for accuracy, not to challenge or confirm your judgments (it is your report)?
Consider, throughout the evaluation process, your own and shared purposes, the effectiveness of your methodology, the appropriateness of your relationships.
Take account of what literature may be helpful.
Given the hypothetical evaluation or evaluations that you have considered, including some of the problems or difficulties, and given your own position, if invited to conduct such an evaluation, would you do it?
If so, why? If not, why not?
My own final reflection is that I hope you would!
Some of the items below are accessible on the Internet as indicated. There are books that it would be worth reading, but where possible chapters or papers in books are suggested. Items in bold are the most recommended.
Action Evaluation Research Institute (2000?), Helping groups define, promote and assess success, http://www.aepro.org/ [including overview, methodology, recent essays and conceptual frameworks].
Adelman, Clem and Alexander, Robin J. (1982), The Self-Evaluating Institution: practice and principles in the management of educational change, Methuen, London.
Albee, Alana (1999) Assessing impact: some current and key issues, Caledonia Centre for Social Development, http://www.caledonia.org.uk/pia.htm [a very useful paper].
Bloom, Benjamin S. (1969) Some theoretical issues relating to educational evaluation, in Tyler, Ralph W. (ed.), Educational Evaluation: new roles, new means, National Society for the Study of Education, Chicago [perceptive study of objectives, specifications and outcomes].
Bloom, Benjamin S. (1978), Changes in evaluation methods, in Glaser, Robert (ed.), Research and Development and School Change, Lawrence Erlbaum, New York [useful insights into early assumptions about evaluation].
Burgess, Robert G. (ed.), Educational Research and Evaluation: for policy and practice?, Falmer Press, London [chs include local and national evaluation, and relationship (if any) of evaluation to policy].
Department of Employment, Further and Higher Education Branch (1994), Evaluating Development in Higher Education (duplicated).
Ellington, Henry and Ross, Gavin (1994), Evaluating teaching quality throughout a university: a practical scheme based on self-assessment, Quality Assurance in Education, vol. 2, no. 2, pp. 4-9 plus annexes.
Gomm, Roger (1981), Salvage evaluation, in Smetherham, David (ed.), Practising Evaluation, Nafferton Books, Driffield.
Hamilton, David et al. (1977), Beyond the Numbers Game: a reader in educational evaluation, Macmillan, Basingstoke [Invaluable, including key writers, MacDonald and Walker on case study, and the influential Parlett and Hamilton study of Evaluation as illumination. Can be read selectively.]
Hemphill, John K. (1969) The relationship between research and evaluation studies, in Tyler, Ralph W. (ed.) Educational Evaluation: new roles, new means, National Society for the Study of Education, Chicago [useful discussion of the relationship].
Hopkins, David (1989), Evaluation for School Development, Open University Press, Milton Keynes [First 2 chapters are a good introduction to types of evaluation and an argument for evaluation in the service of development].
Jamieson, Ian (1983), The role of evaluation in action-research projects: the case of the Schools Council Industry Project, Cambridge Journal of Education, vol. 13, no. 2, pp. 37-45 [brief account, raising many of the issues discussed here].
Jamieson, Ian (1984), Evaluation: a case of research in chains?, in Adelman, Clem (ed.), The Politics and Ethics of Evaluation, Croom Helm, London [the publishers failed to have this book proof read, so read this chapter with care!].
Kogan, Maurice (1986) Education Accountability: an analytic overview, Hutchinson, London [particularly ch. 6, Epistemologies and evaluation].
Lawton, Denis (1978) Curriculum evaluation: new approaches, in Denis Lawton et al., Theory and Practice of Curriculum Studies [short, but covers most of the issues raised here].
MacDonald, Barry and Walker, Rob: see Hamilton et al. above.
Manchester University Department of Applied Social Science (n.d.) Evaluating policy and practice, [brief account of postgraduate course approach, objectives, course content].
Nebraska Council to Prevent Alcohol and Drug Abuse (2000), The Least You Need to Know About… http://www.nde.state.ne.us/SDFS/ATOD/evaluation.html [types of evaluation].
Northwest Regional Educational Laboratory (2000), Evaluating Whole-School Reform Efforts: a guide for district and school staff, http://www.nwrac.org/whole-school/index.html [including good sections on impact evaluation].
O'Neil, Mike and Pennington, Gus (1992), Evaluating Teaching and Courses from an Active Learning Perspective, CVCP Universities Staff Development and Training Unit, Sheffield [mainly on evaluating teaching in higher education, especially methods of collecting evidence].
Parlett, Malcolm and Hamilton, David: see Hamilton et al. above.
Parsons, Carl (1981) A policy for educational evaluation, in Lacey, Colin and Lawton, Denis, Issues in Evaluation and Accountability, Methuen, London.
Simons, Helen (1981), Process evaluation in schools, in Lacey, Colin and Lawton, Denis, Issues in Evaluation and Accountability, Methuen, London.
Simons, Helen (1987), Getting to Know Schools in a Democracy: the politics and process of evaluation, Falmer Press, Lewes.
Stake, Robert E. (1980), Program evaluation, particularly responsive evaluation, in Dockrell, W.B. and Hamilton, David (eds), Rethinking Educational Research, Hodder and Stoughton, London.
Stenhouse, Lawrence (1975) An Introduction to Curriculum Research and Development, Heinemann, London [Ch. 8 on The evaluation of curriculum is a key text].
Trochim, William M.K. (2002), Introduction to Evaluation, http://www.socialresearchmethods.net/kb/intreval.htm [definitions, strategies, types, questions and methods; link to The Planning-Evaluation Cycle and An Evaluation Culture. Based on book, Research Methods Knowledge Base].
Wayne State University Center for Urban Studies (n.d.) account of approach to evaluation research, go to http://www.cus.wayne.edu/capabilities/intro.asp and click on 'evaluation' in bullet point list.
Weiss, C.H. (1998, 2nd edn), Evaluation: methods for studying programs and policies, Prentice Hall, New York [massive compendium, suitable for consulting; very expensive].
World Bank Group (2001) Poverty Net, http://worldbank.org/poverty/impact/ [substantial account of an approach to large-scale project evaluation, including understanding impact evaluation, methods and techniques, many examples and readings; valuable insights].
Tasks, once completed, should be sent to email@example.com, making clear:
It will then be passed on to the component leader (and copied to your supervisor). The component leader will get back to you with comments and advice which we hope will be educative and which will help you in preparing your dissertation proposal once you are ready. (Remember that these tasks are formative and that it is the proposal which forms the summative assessment for the MERS501 (resined) module.) This email address is checked daily so please use it for all correspondence about RESINED other than that directed to particular individuals for specific reasons.
TASK (B) DATA COLLECTION TASK
Scenario and involvement
NOTE THAT THERE IS NO TASK C FOR THIS COMPONENT