Wednesday, 7 December 2011

Administering Assessment

‘Standardisation implies uniformity of procedure in administering and scoring the test. If the scores obtained by different persons are to be comparable, testing conditions must obviously be the same for all’.
Anastasi and Urbina (1997, p. 6)

Aiken (1997) lists a number of considerations to address before administering a test, such as scheduling the test, obtaining informed consent, becoming familiar with the test, ensuring satisfactory testing conditions and minimising cheating. He lists the examiner’s duties during the test as following test directions, remaining alert, establishing rapport, preparing for special problems and remaining flexible. On this last point Aiken is referring primarily to flexibility in so far as it is possible without jeopardising the standardised nature of the test.

Practical considerations in administering a test include finding an appropriate room. Students completing a test in a room next door to a building site may perform differently from students working in total quiet. It is important to ensure that the students will not be interrupted by another class-group waiting to come in. Correct temperature, lighting and seating must all be arranged before the students arrive.

As students will vary in how anxious they are about the test, it is important to help them relax without unintentionally raising their anxiety levels. A calm approach, acknowledging that some of them may be worried, makes sense. In my experience, too much focus on ensuring that the students are not worried may be counter-productive, as excessive reassurance can increase someone’s anxiety.

Tests can also be administered individually. Problem checklists such as the Porteous Problem Checklist (Porteous, 1997) may be much more beneficial given individually than in a group as the guidance counsellor can observe carefully the student’s manner in completing the items. A slight hesitancy at a particular item might be an invaluable clue as to what is really going on for a student. While good administration of tests obviously requires organisational skills, it also requires an empathic relationship, astute observation and sensitivity. In my view the “counselling” part of the guidance counsellor’s role is integral to testing at all stages, but particularly in the administration of tests and the feedback of the results.

Wednesday, 19 October 2011

The Use of Rubrics in the EFL Classroom


By CONARE-MEP-UCR


Learning Assessment Course

Group 1


UCR-Limón, I, II, III Cycle
As professionals in ELT, we should all be aware of the importance of ensuring students’ success with the aid of assessment rubrics. Rubrics ought to be used as a way to guide our students throughout their academic development. A rubric is used to check students’ academic performance as well as their daily production, including their behavior and any other criteria set by the professor or the educational curriculum. Moreover, it should serve as a guideline or tool for teachers to objectively evaluate their students’ work and measure their observable performance.


It is important to note that rubrics must be customized to suit various purposes, levels and skills. Once developed, they should also be modified as required to suit the target audience.


We must bear in mind that rubrics must be presented and handed out to students prior to assessment. As a matter of fact, if students are provided with the scoring criteria as well as the rubrics in advance, they will be better prepared, able to improve the quality of their own work, and more likely to deepen their knowledge.
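As a rough illustration of the kind of tool described above, a rubric can be modelled as a set of criteria, each with a weight and level descriptors. The criteria, weights and descriptors below are invented for this sketch; a real rubric would come from the curriculum or the professor’s own goals.

```python
# A minimal sketch of a scoring rubric as a data structure.
# Criterion names, weights and level descriptors are illustrative only.
rubric = {
    "task_completion": {"weight": 0.4, "levels": {3: "fully addresses the task",
                                                  2: "partially addresses the task",
                                                  1: "does not address the task"}},
    "accuracy":        {"weight": 0.3, "levels": {3: "few errors",
                                                  2: "frequent errors",
                                                  1: "errors impede communication"}},
    "fluency":         {"weight": 0.3, "levels": {3: "smooth delivery",
                                                  2: "some hesitation",
                                                  1: "constant hesitation"}},
}

def weighted_score(ratings):
    """Combine per-criterion ratings (1-3) into a weighted total."""
    return sum(rubric[c]["weight"] * level for c, level in ratings.items())

score = weighted_score({"task_completion": 3, "accuracy": 2, "fluency": 2})
```

Sharing the criterion names and level descriptors with students before the assessment is exactly the "handed out prior to assessment" step the paragraph above calls for.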

Sunday, 25 September 2011

Chapter 6. Assessing Speaking



Teachers are often asked to evaluate learner progress during courses, perhaps by preparing progress tests. It can seem straightforward enough to test grammar or vocabulary with pen-and-paper tests, but if our students’ work includes speaking, then it also seems necessary to assess their speaking skills. Teachers often feel unsure as to how to do this. Here are some ideas.

What’s the aim of a progress test? Often it’s to give encouragement that something is being done well, or to point out areas where a learner is not achieving as much as they could. With this kind of aim, giving 'marks' may not be the most effective way to assess. An interesting alternative for progress tests is to assess whether learners are successful when compared against 'can do' criteria statements (i.e. statements listing things “I can do”), such as “I can describe what’s happening in a picture of town streets.” or “I can take part in a discussion and explain my point of view clearly and politely.” To prepare a criteria list, think of about ten kinds of speaking that students have worked on over the course and turn them into criteria.

A frequent problem for teachers is that having many learners in one class seems to make it unrealistic to assess speaking. With a list of criteria (such as those above) it becomes considerably more straightforward to assess even a large group. Explain to your class what you will be doing; then, the next three or four times you set speaking tasks (i.e. where learners work in pairs or groups), walk around the class with a list of names, listening in to various groups and noting successes, keeping track of individual 'can do’s'. Extend your assessment over a few lessons; keep listening and adjusting your evaluation over a variety of tasks.
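One hypothetical way to keep track of those observations over several lessons is a simple record of which criteria each learner has demonstrated. The student name and the criteria below are illustrative; the criteria echo the 'can do' statements mentioned earlier.

```python
# Hypothetical tracker for 'can do' observations gathered over several lessons.
from collections import defaultdict

CRITERIA = [
    "describe a picture of town streets",
    "explain a point of view in a discussion",
]

achieved = defaultdict(set)  # student name -> criteria observed as successful

def note_success(student, criterion):
    """Record one observed success for one student."""
    achieved[student].add(criterion)

# During speaking tasks the teacher notes what she observes:
note_success("Ana", "describe a picture of town streets")
note_success("Ana", "explain a point of view in a discussion")

def progress(student):
    """Fraction of the criteria list the student has demonstrated so far."""
    return len(achieved[student]) / len(CRITERIA)
```

Because the record accumulates across lessons, it naturally supports the "keep listening and adjusting" approach rather than a single high-pressure test.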

What are possible speaking tasks for assessment? Well, almost anything you do in normal class work – e.g. narrating a picture story; role-plays; pair work information gap exchanges; discussions etc. If you have a smaller class and enough time then a “three learners with one teacher” activity is a very good way to assess, i.e. setting a task that gets the three learners to interact together while you watch and evaluate.

Although fear of bad marks can sometimes be motivating, it’s surprising to find the amount of power that students feel when assessing themselves. It can be a real awareness-raising activity. Distribute a list of criteria and ask students to first write a short line comparing themselves against each criterion (in English or in their own language) – a reflective view rather than just a 'yes' or 'no'. Encourage 'guilt-free' honest reflection. After the writing stage, learners can meet up in small groups and talk through their thoughts, explaining why they wrote what they did.

Chapter 5. Assessing Listening

You can use post-listening activities to check comprehension, evaluate listening skills and use of listening strategies, and extend the knowledge gained to other contexts. A post-listening activity may relate to a pre-listening activity, such as predicting; may expand on the topic or the language of the listening text; or may transfer what has been learned to reading, speaking, or writing activities.

In order to provide authentic assessment of students' listening proficiency, a post-listening activity must reflect the real-life uses to which students might put information they have gained through listening.

  • It must have a purpose other than assessment.
  • It must require students to demonstrate their level of listening comprehension by completing some task.

To develop authentic assessment activities, consider the type of response that listening to a particular selection would elicit in a non-classroom situation. For example, after listening to a weather report one might decide what to wear the next day; after listening to a set of instructions, one might repeat them to someone else; after watching and listening to a play or video, one might discuss the story line with friends.

Use this response type as a base for selecting appropriate post-listening tasks. You can then develop a checklist or rubric that will allow you to evaluate each student's comprehension of specific parts of the aural text.



For example, for listening practice you have students listen to a weather report. Their purpose for listening is to be able to advise a friend what to wear the next day. As a post-listening activity, you ask students to select appropriate items of clothing from a collection you have assembled, or write a note telling the friend what to wear, or provide oral advice to another student (who has not heard the weather report). To evaluate listening comprehension, you use a checklist containing specific features of the forecast, marking those that are reflected in the student's clothing recommendations.
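The checklist step in the weather-report example can be sketched as a simple comparison between the forecast’s features and the features reflected in the student’s advice. The specific forecast features below are invented for the illustration.

```python
# Illustrative checklist for the weather-report task described above.
# The forecast features are invented for the example.
forecast_features = {"rain in the morning", "cold wind", "sunny afternoon"}

def comprehension_score(recommendation_features):
    """Count how many forecast features the student's clothing advice reflects."""
    reflected = forecast_features & recommendation_features
    return len(reflected), len(forecast_features)

# A student's note mentions a raincoat and a warm scarf but not sunglasses:
hits, total = comprehension_score({"rain in the morning", "cold wind"})
```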

Chapter 4. Assessing Writing

Writing assessment can be used for a variety of appropriate purposes, both inside the classroom and outside: providing assistance to students, awarding a grade, placing students in appropriate courses, allowing them to exit a course or sequence of courses, certifying proficiency, and evaluating programs, to name some of the more obvious. Given the high-stakes nature of many of these assessment purposes, it is crucial that assessment practices be guided by sound principles to ensure that they are valid, fair, and appropriate to the context and purposes for which they are designed. This position statement aims to provide that guidance.


In spite of the diverse uses to which writing assessment is put, the general principles undergirding it are similar:

Assessments of written literacy should be designed and evaluated by well-informed current or future teachers of the students being assessed, for purposes clearly understood by all the participants; should elicit from student writers a variety of pieces, preferably over a substantial period of time; should encourage and reinforce good teaching practices; and should be solidly grounded in the latest research on language learning as well as accepted best assessment practices.

In a course context, writing assessment should be part of the highly social activity within the community of faculty and students in the class. This social activity includes:

  • a period of ungraded work (prior to the completion of graded work) that receives response from multiple readers, including peer reviewers,
  • assessment of texts—from initial through to final drafts—by human readers, and
  • more than one opportunity to demonstrate outcomes.

Self-assessment should also be encouraged. Assessment practices and criteria should match the particular kind of text being created and its purpose. These criteria should be clearly communicated to students in advance so that the students can be guided by the criteria while writing.

Students should have the right to weigh in on their assessment. Self-placement without direction may become merely a right to fail, whereas directed self-placement, either alone or in combination with other methods, provides not only useful information but also involves and invests the student in making effective life decisions.

Proficiency or exit assessment involves high stakes for students. In this context, assessments that make use of substantial and sustained writing processes are especially important.

Judgments of proficiency must also be made on the basis of performances in multiple and varied writing situations (for example, a variety of topics, audiences, purposes, genres).

The assessment criteria should be clearly connected to desired outcomes. When proficiency is being determined, the assessment should be informed by such things as the core abilities adopted by the institution, the course outcomes established for a program, and/or the stated outcomes of a single course or class. Assessments that do not address such outcomes lack validity in determining proficiency.

The higher the stakes, the more important it is that assessment be direct rather than indirect, based on actual writing rather than on answers on multiple-choice tests, and evaluated by people involved in the instruction of the student rather than via machine scoring. To evaluate a writer’s proficiency on anything other than performance across multiple writing tasks and situations is essentially disrespectful of the writer.

Chapter 3. Assessing Reading


Reading ability is very difficult to assess accurately. In the communicative competence model, a student's reading level is the level at which that student is able to use reading to accomplish communication goals. This means that assessment of reading ability needs to be correlated with purposes for reading.

Reading Aloud


A student's performance when reading aloud is not a reliable indicator of that student's reading ability. A student who is perfectly capable of understanding a given text when reading it silently may stumble when asked to combine comprehension with word recognition and speaking ability in the way that reading aloud requires.

In addition, reading aloud is a task that students will rarely, if ever, need to do outside of the classroom. As a method of assessment, therefore, it is not authentic: It does not test a student's ability to use reading to accomplish a purpose or goal.

However, reading aloud can help a teacher assess whether a student is "seeing" word endings and other grammatical features when reading. To use reading aloud for this purpose, adopt the "read and look up" approach: Ask the student to read a sentence silently one or more times, until comfortable with the content, then look up and tell you what it says. This procedure allows the student to process the text, and lets you see the results of that processing and know what elements, if any, the student is missing.

Comprehension Questions


Instructors often use comprehension questions to test whether students have understood what they have read. In order to test comprehension appropriately, these questions need to be coordinated with the purpose for reading. If the purpose is to find specific information, comprehension questions should focus on that information. If the purpose is to understand an opinion and the arguments that support it, comprehension questions should ask about those points.

In everyday reading situations, readers have a purpose for reading before they start. That is, they know what comprehension questions they are going to need to answer before they begin reading. To make reading assessment in the language classroom more like reading outside of the classroom, therefore, allow students to review the comprehension questions before they begin to read the test passage.

Finally, when the purpose for reading is enjoyment, comprehension questions are beside the point. As a more authentic form of assessment, have students talk or write about why they found the text enjoyable and interesting (or not).

Authentic Assessment


In order to provide authentic assessment of students' reading proficiency, a post-reading activity must reflect the real-life uses to which students might put information they have gained through reading.

  • It must have a purpose other than assessment.
  • It must require students to demonstrate their level of reading comprehension by completing some task.

To develop authentic assessment activities, consider the type of response that reading a particular selection would elicit in a non-classroom situation. For example, after reading a weather report, one might decide what to wear the next day; after reading a set of instructions, one might repeat them to someone else; after reading a short story, one might discuss the story line with friends.

Use this response type as a base for selecting appropriate post-reading tasks. You can then develop a checklist or rubric that will allow you to evaluate each student's comprehension of specific parts of the text.

Chapter 2. Techniques for testing

Process: Objective and subjective evaluation


If we view objectivity and subjectivity of evaluation along a continuum, we can represent various assessment and scoring methods along its length.


Test items that can be evaluated objectively have one right answer (or one correct response pattern, in the case of more complex item formats). Scorers do not need to exercise judgment in marking responses correct or incorrect. They generally mark a test by following an answer key. In some cases, objective tests are scored by scanning machines and computers. Objective tests are often constructed with selected-response item formats, such as multiple-choice, matching, and true-false. An advantage to including selected-response items in objectively scored tests is that the range of possible answers is limited to the options provided by the test writer—the test taker cannot supply alternative, acceptable responses.
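To make the "no judgment needed" point concrete, objective scoring can be sketched as a mechanical comparison against an answer key. The item numbers and keyed options below are invented for the sketch.

```python
# Minimal sketch of objective scoring against an answer key
# (item numbers and keyed options are invented).
answer_key = {1: "b", 2: "d", 3: "a", 4: "c"}

def score_objective(responses):
    """Mark each response right or wrong against the key; no judgment needed."""
    return sum(1 for item, choice in responses.items()
               if answer_key.get(item) == choice)

# A test taker who misses item 2:
raw_score = score_objective({1: "b", 2: "a", 3: "a", 4: "c"})
```

This is exactly what a scanning machine does at scale: any scorer (human or machine) following the key produces the same result.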

Because much of what we assess in reading and listening comprehension measures is first interpreted by the test writer, some degree of subjectivity is present in objectively scored items. For that reason, assessments of the Interpretive mode, even those composed of "one-right-answer" items, might not be placed all the way at the objective end of the continuum.

Evaluating responses objectively can be more difficult with even the simplest of constructed-response item formats. An answer key may specify the correct answer for a one-word, gap-filling item, but there may in fact be multiple acceptable alternative responses to that item that the teacher or test developer did not anticipate. In classroom testing situations, teachers may perceive some responses as equally or partially correct, and apply some subjective judgment in refining their scoring criteria as they mark tests. Informal scoring criteria for short-answer items probably work well for classroom testing as long as they are applied consistently and are defensible.

Just as there may be few truly objective measures of second language knowledge and skill, so too is it rare to find purely subjective evaluations of performance. Allowing the subjective impressions of scorers to determine learners' grades would not be acceptable to most students, their parents, or other stakeholders. We do not usually have to justify our opinion that a work of art is good or bad—we simply like it or we don't. Since our judgment has no significant consequences for the artist (unless we are art critics), a subjective evaluation is acceptable. It is also not a matter of concern that the many viewers of the artwork do not agree about its quality.

In assessment, we strive to ensure two types of reliability: inter-rater (raters agree with each other) and intra-rater (a rater gives the same score to a performance rated on separate occasions). The higher the stakes, the more reliable (consistent) judgments must be. Scoring criteria, in the form of rubrics, are generally used to guide raters to arrive at the same, or nearly the same, evaluation of a product. Thus, although it is common to refer to scoring which requires human judgment as subjective evaluation, in most cases we might place it near the midpoint on our objective-subjective continuum.
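As a minimal illustration of inter-rater reliability, one can compute simple percent agreement between two raters' scores for the same set of performances. The scores below are made up; formal studies would typically use a chance-corrected statistic such as Cohen's kappa.

```python
# Simple percent-agreement check between two raters scoring the same set
# of performances with a shared rubric (scores are illustrative).
rater_a = [4, 3, 5, 2, 4, 3]
rater_b = [4, 3, 4, 2, 4, 3]

def percent_agreement(a, b):
    """Proportion of performances on which the two raters gave the same score."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)

agreement = percent_agreement(rater_a, rater_b)
```

The higher the stakes of the assessment, the higher this agreement needs to be, which is why rater training against shared rubrics matters.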

In rated assessments, the scoring criteria form an integral part of the evaluation. Specialists in language testing often identify three key components in performance assessment. These components are:

  • Tasks that are effective in eliciting the performance to be assessed.
  • Rating criteria to evaluate the quality of the performance. The criteria reflect the relative importance of various aspects of the performance, and are appropriate for the population being assessed.
  • Raters who are trained to apply the criteria and can do so consistently.

RULES FOR WRITING MULTIPLE-CHOICE QUESTIONS

Multiple choice is a form of assessment in which respondents are asked to select the best possible answer (or answers) from the choices in a list.

1. Use Plausible Distractors (wrong-response options)

• Only list plausible distractors, even if the number of options per question changes
• Write the options so they are homogeneous in content
• Use answers given in previous open-ended exams to provide realistic distractors



2. Use a Question Format

• Experts encourage multiple-choice items to be prepared as questions (rather than incomplete statements)

Incomplete Statement Format:
The capital of California is in

Direct Question Format:
In which of the following cities is the capital of California?



3. Emphasize Higher-Level Thinking

• Use memory-plus application questions. These questions require students to recall principles, rules or facts in a real-life context.
• The key to preparing memory-plus application questions is to place the concept in a life situation or context that requires the student to first recall the facts and then apply or transfer those facts to the situation.
• Seek support from others who have experience writing higher-level thinking multiple-choice questions.



4. Keep Option Lengths Similar

• Avoid making the correct answer noticeably longer or shorter than the distractors



5. Balance the Placement of the Correct Answer

• Test writers tend to place correct answers in the second and third positions; distribute correct answers evenly across all positions
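A mechanical way to keep key placement balanced is to shuffle each item’s options and read off where the key lands. The item content below is invented, reusing the California example from the question-format rule.

```python
# One way to avoid positional patterns: shuffle each item's options and
# record where the key ends up (item content is invented).
import random

def shuffle_options(correct, distractors, rng):
    """Shuffle one item's options; return the list and the key's position."""
    options = [correct] + list(distractors)
    rng.shuffle(options)
    return options, options.index(correct)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
options, key_position = shuffle_options(
    "Sacramento", ["Los Angeles", "San Diego", "San Francisco"], rng)
```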



6. Be Grammatically Correct

• Use simple, precise and unambiguous wording
• Make sure every option fits the stem grammatically; otherwise students can select the correct answer simply by finding the grammatically consistent option



7. Avoid Clues to the Correct Answer

• Avoid answering one question in the test by giving the answer somewhere else in the test
• Have the test reviewed by someone who can find mistakes, clues, and grammar and punctuation problems before you administer the exam to students
• Avoid extremes – never, always, only
• Avoid nonsense words and unreasonable statements



8. Avoid Negative Questions

• 31 of 35 testing experts recommend avoiding negative questions
• Students may be able to find an incorrect answer without knowing the correct answer



9. Use Only One Correct Option (or be sure the best option is clearly the best option)

• The item should include one and only one correct or clearly best answer
• With one correct answer, alternatives should be mutually exclusive and not overlapping
• Using multiple-choice questions containing more than one right answer lowers discrimination between students



10. Give Clear Instructions

Such as:

Questions 1-10 are multiple-choice questions designed to assess your ability to remember or recall basic and foundational pieces of knowledge related to this course. Please read each question carefully before reading the answer options. When you have a clear idea of the question, find your answer and mark your selection on the answer sheet. Please do not make any marks on this exam.

Questions 11-20 are multiple-choice questions designed to assess your ability to think critically about the subject. Please read each question carefully before reading the answer options. Be aware that some questions may seem to have more than one right answer, but you are to look for the one that makes the most sense and is the most correct. When you have a clear idea of the question, find your answer and mark your selection on the answer sheet. You may justify any answer you choose by writing your justification on the blank paper provided.



11. Use Only a Single, Clearly Defined Problem and Include the Main Idea in the Question

• Students must know what the problem is without having to read the response options



12. Avoid the “All of the Above” Option

• Students merely need to recognize two correct options to get the answer correct



13. Avoid the “None of the Above” Option

• When “none of the above” is the keyed answer, you never learn whether students actually know the correct answer



14. Don’t Use MC Questions When Other Item Types Are More Appropriate

• For example, when plausible distractors are limited, or when assessing problem-solving and creativity

The Matching Format


The matching test item format provides a way for learners to connect a word, sentence or phrase in one column to a corresponding word, sentence or phrase in a second column. The items in the first column are called premises and the answers in the second column are the responses. The convention is for learners to match the premise on the left with a given response on the right. By convention, the items in Column A are numbered and the items in Column B are labeled with capital letters.
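The column structure described here can be sketched as data. The premises, responses and key below are invented for the illustration; note that the responses are deliberately out of order, as they would be on a real matching item.

```python
# Sketch of the matching format: numbered premises in Column A,
# lettered responses in Column B (content invented).
premises = ["capital of Canada", "capital of France", "capital of Japan"]
responses = ["Paris", "Tokyo", "Ottawa"]  # deliberately not in premise order

letters = [chr(ord("A") + i) for i in range(len(responses))]  # A, B, C
column_b = dict(zip(letters, responses))

# The answer key maps each premise number to a response letter:
answer_key = {1: "C", 2: "A", 3: "B"}
```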
 

DIFFERENCE BETWEEN COMPLETION AND SHORT ANSWER



EXAMPLE OF A COMPLETION QUESTION
  1. The first Prime Minister of Canada was _________________.

EXAMPLE OF A SHORT ANSWER QUESTION
1.      Who was the first Prime Minister of Canada?
_____________________________
KEY: Sir John A. Macdonald

  • Completion = fill in the blank
  • Short Answer = answer the question
    • “Short answer” here means a sentence or less
    • Responses of more than one sentence, or a paragraph, count as short written responses; they are no longer objective items and require a more complex scoring scheme
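That added scoring complexity appears even at the short-answer level: once alternative phrasings of the keyed answer are allowed, the scorer needs a normalisation step to stay consistent. The accepted variants below are illustrative.

```python
# Light normalisation so acceptable variants of the keyed answer all match
# (the accepted variants are illustrative).
def normalise(text):
    """Lowercase, drop periods, and collapse whitespace."""
    return " ".join(text.lower().replace(".", "").split())

ACCEPTED = {normalise(v) for v in
            ["Sir John A. Macdonald", "John A. Macdonald", "John Macdonald"]}

def mark(response):
    """True if the response matches any accepted variant after normalisation."""
    return normalise(response) in ACCEPTED
```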

ESSAY QUESTION

Writing an effective essay examination requires two important abilities: recalling information and organizing the information in order to draw relevant conclusions from it. While this process sounds simple, writing an effective essay examination under pressure in limited time can be a daunting task.

Common strategy terms for Essay writing are as follows:

  1. Analyze: Divide an event, idea, or theory into its component elements, and examine each one in turn: Analyze Milton Friedman's theory of permanent income.
  2. Compare and/or Contrast: Demonstrate similarities or dissimilarities between two or more events or topics: Compare the portrayal of women in Beloved with that in Their Eyes Were Watching God.
  3. Define: Identify and state the essential traits or characteristics of something, differentiating it clearly from other things: Define Hegelian dialectic.
  4. Describe: Tell about an event, person, or process in detail, creating a clear and vivid image of it: Describe the dress of a knight.
  5. Evaluate: Assess the value or significance of the topic: Evaluate the contribution of black musicians to the development of an American musical tradition.
  6. Explain: Make a topic as clear and understandable as possible by offering reasons, examples, and so on: Explain the functioning of the circulatory system.
  7. Summarize: State the major points concisely and comprehensively: Summarize the major arguments against using animals in laboratory research.