Quick Guide to Better Tests
General Suggestions for Developing Tests
- List the important concepts, principles, and skills you want students to master, then write items that measure these.
- Don't try to write an exam in one sitting. One idea: after every class, write two or three items that relate to that class session.
- Write items that students can't answer just by memorizing information; exam questions should measure ability to apply content to new settings, to analyze, evaluate, etc.
- Use homework or in-class activities to give students practice at responding to items like those you will use on the exam.
- Ask a colleague (or BEST) to review an exam for clarity before you finalize it.
- Recognize that essay tests are relatively easy to write, but take a long time to grade. Good multiple-choice tests, on the other hand, can be scored quickly, but take much longer to write.
Multiple-choice items
- Present items in a new context, rather than using exactly the language from the text or class.
- Use item "templates" to help vary the format and cognitive level of items. For example:
  - Which of the following is the best definition of concept X?
  - Which of the following is the best label for this description?
  - Which of the following possible examples best exemplifies this concept or principle?
  - Which of the following features best distinguishes this concept from related concepts?
  - Given this scenario, which of the following is the best course of action?
  - Given this scenario, which of the following is the most likely consequence?
- After administering the exam, use BEST's item analysis program to help identify items that didn't work as you intended.
- Guidelines for writing good items:
  - The item "stem" (the part that appears before the answer options) should present a problem, should generally contain a verb, and should include any words that would otherwise be repeated in each answer option.
  - Avoid stems that reveal the answer to another item.
  - Avoid negatives ("not", "never") in the stem. If a negative is necessary, call attention to it by underlining or bolding it.
  - "Distracters" (the incorrect answer options) should be wrong, but plausible.
  - Use common student errors as distracters.
  - Avoid using "all of the above" and "none of the above" as answer options.
  - Make the correct answer option about the same length as the distracters.
  - Avoid unintended verbal clues to the right answer, such as words in the stem that are repeated in the correct answer but not in the distracters, or grammatical clues, where only the correct answer fits the stem grammatically.
  - The correct answer option should occur in each "position" (i.e., A, B, C, or D) about the same number of times, but avoid a repeating pattern.
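
This guide does not describe how BEST's item analysis program works, so what follows is only a minimal Python sketch of two checks mentioned in the list above: a post-exam item analysis and a count of how often each answer position is keyed as correct. The function names (`item_analysis`, `key_position_counts`), the top-third/bottom-third comparison, and the 0.2 flagging cutoff are all illustrative assumptions, not BEST's method. "Difficulty" here means the proportion of students who answered the item correctly, and "discrimination" is the difference in that proportion between the highest- and lowest-scoring groups.

```python
from collections import Counter


def proportion(values):
    """Return the share of 1s in a sequence of 0/1 scores."""
    values = list(values)
    return sum(values) / len(values)


def item_analysis(responses, key, flag_below=0.2):
    """Compute difficulty and discrimination for each item.

    responses: one list of chosen options per student, e.g. ["A", "C", "B"].
    key: the correct option for each item, in the same order.
    flag_below: illustrative cutoff (an assumption) for flagging weak items.
    """
    if not responses:
        raise ValueError("need at least one student's responses")
    n_students = len(responses)
    # 1 if the student chose the keyed option, otherwise 0.
    scores = [[int(ans == k) for ans, k in zip(row, key)] for row in responses]
    totals = [sum(row) for row in scores]
    # Rank students by total score; ties keep their input order.
    order = sorted(range(n_students), key=lambda i: totals[i], reverse=True)
    group = max(1, n_students // 3)  # assumed group size: top and bottom third
    upper, lower = order[:group], order[-group:]
    report = []
    for j in range(len(key)):
        difficulty = proportion(scores[i][j] for i in range(n_students))
        discrimination = (proportion(scores[i][j] for i in upper)
                          - proportion(scores[i][j] for i in lower))
        report.append({
            "item": j + 1,
            "difficulty": difficulty,
            "discrimination": discrimination,
            "flag": discrimination < flag_below,
        })
    return report


def key_position_counts(key):
    """Count how often each option letter is the correct answer."""
    return Counter(key)


if __name__ == "__main__":
    key = ["A", "B", "C", "D"]
    responses = [
        ["A", "B", "C", "D"],
        ["A", "B", "C", "A"],
        ["A", "B", "D", "A"],
        ["A", "C", "D", "D"],
        ["A", "C", "D", "A"],
        ["B", "C", "D", "A"],
    ]
    for row in item_analysis(responses, key):
        print(row)
    print(key_position_counts(key))
```

A flagged item is not necessarily a bad one; the flag only marks items worth rereading, for example to check for an ambiguous stem or a miskeyed answer.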
Essay items
- Avoid using essay items to test for only factual knowledge, since multiple-choice items can do this more reliably and efficiently.
- Structure and focus questions clearly, so students know what you expect. Present a specific problem, such as:
  - Compare and contrast X and Y in regard to Z
  - Present arguments for or against some issue
  - Describe an application of a rule or principle
  - Evaluate a scenario in light of given criteria
  - Predict an outcome or draw inferences from given data
- Several shorter questions are usually better than fewer longer questions.
- Guidelines for grading essays more reliably:
  - Before grading, list the main points you expect a good answer to cover.
  - Decide in advance how you will handle factors such as spelling and grammar, and apply the rules consistently.
  - Before grading, read through a few sample student answers to get a general idea of the quality level.
  - To counteract the "halo" effect, try to grade answers without knowing the student's identity.
  - Grade one question for all students before going on to the next question.
  - If possible, read each answer twice, shuffling the order the second time through.
  - Reshuffle the papers after completing each item.
  - Sort papers into "high," "medium," and "low" stacks before assigning final grades.
  - Write comments so that students understand why answers were good or poor.
  - If multiple graders are used, have a "norming" session.
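
For instructors who track papers electronically, three of the grading guidelines above (grading without knowing identities, grading one question for all students at a time, and reshuffling between questions) could be combined into a simple grading plan. The Python sketch below is one hypothetical way to do that; the function name `grading_plan`, the `P001`-style paper codes, and the `seed` parameter are assumptions made for illustration.

```python
import random


def grading_plan(student_ids, n_questions, seed=None):
    """Build an anonymous, question-by-question grading order.

    Returns (code_for, plan):
      code_for maps each student ID to an anonymous paper code.
      plan is a list of (question_number, paper_codes_in_grading_order).
    """
    rng = random.Random(seed)
    codes = [f"P{i:03d}" for i in range(1, len(student_ids) + 1)]
    # Shuffle students before assigning codes so a code does not
    # reveal a student's place in the roster.
    shuffled_ids = list(student_ids)
    rng.shuffle(shuffled_ids)
    code_for = dict(zip(shuffled_ids, codes))
    plan = []
    for question in range(1, n_questions + 1):
        order = list(codes)
        rng.shuffle(order)  # a fresh paper order for every question
        plan.append((question, order))
    return code_for, plan


if __name__ == "__main__":
    code_for, plan = grading_plan(["ann", "bo", "cy", "dee"], n_questions=3, seed=7)
    for question, order in plan:
        print(f"Question {question}: grade papers in order {order}")
```

Keeping `code_for` out of sight until all questions are graded is what preserves the anonymity; the plan itself contains only paper codes.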
Sources
Haladyna, T. M. (1994) Developing and Validating Multiple-Choice Test Items. Hillsdale, NJ: Erlbaum.
Hopkins, K. D. (1998) Educational and Psychological Measurement and Evaluation. Boston: Allyn and Bacon.
Jacobs, L. C. and Chase, C. I. (1992) Developing and Using Tests Effectively. San Francisco: Jossey-Bass.
