# Digitek and Test Scoring Questions

1. What is Digitek?

Digitek is a program that processes test answer sheets read by BEST's high-speed optical scanner. For more information, see the Digitek User's Manual.

2. What program can I use for online testing?

QuizSite can be used to create and administer course exams and tests via the Web. For more information, see the QuizSite Instructor Documentation.

3. What does the difficulty level say about the questions? What level of difficulty is best? Should there be a range of difficulties? How large?

The difficulty level is the proportion of students who got an item right. According to measurement theory, the "ideal" difficulty level for an item is about 0.50. However, in practical terms, the items on a test will usually have a range of difficulty levels from, say, 0.25 to 0.8. Items that are too difficult (at or near 0.0) or too easy (at or near 1.0) aren't really doing their job of discriminating between students who know the material and those who don't. (I should add that, in part, this is a matter of teaching philosophy. Some instructors believe that most or all students should be able to master the material; thus, having items that all students got right wouldn't trouble them.)
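To make the definition concrete, here is a minimal Python sketch of how a difficulty level could be computed by hand. The answer key, the student responses, and the function name `item_difficulty` are made up for illustration; this is not how Digitek itself is implemented. The 0.25 to 0.8 band used for flagging is simply the practical range mentioned above.

```python
def item_difficulty(responses, key):
    """Return, for each item, the proportion of students who answered it correctly."""
    n_students = len(responses)
    return [
        sum(1 for sheet in responses if sheet[i] == correct) / n_students
        for i, correct in enumerate(key)
    ]


# Hypothetical data: a 3-item test taken by 4 students.
key = ["A", "C", "B"]
responses = [
    ["A", "C", "B"],
    ["A", "C", "D"],
    ["A", "B", "A"],
    ["A", "D", "C"],
]

difficulty = item_difficulty(responses, key)
print(difficulty)  # [1.0, 0.5, 0.25]

# Flag items (numbered from 1) that fall outside the practical range.
flagged = [i + 1 for i, p in enumerate(difficulty) if not 0.25 <= p <= 0.8]
print(flagged)  # [1]
```

Here item 1 is flagged because everyone got it right, so it cannot separate students who know the material from those who don't.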

4. What do the correlation and discrimination say about the questions? Are the "best" questions the ones with a high correlation? What correlation is sufficient to warrant including a question on the exam?

The discrimination index indicates the correlation between each item response option and the total score. In other words, it measures the degree to which scores on a particular item tend to reflect the overall score on the test. Thus, if students who did well overall tended to get an item right, and students who did poorly overall tended to get it wrong, that item will have a high discrimination index. Since this is a correlation, it can range from -1.0 (a perfect inverse relationship) to 1.0 (a perfect positive relationship). On multiple-choice exams, we generally want the correct answer option to have a positive correlation with total score. The higher the better, but anything above approximately 0.4 is probably okay. We want the distractors (the incorrect answer options) to have negative or near-zero discrimination indices.

So, if you have any items in which the discrimination index for the correct answer option is near zero or negative, you may want to examine these test items to see if they are ambiguous, unintentionally misleading, or perhaps even keyed incorrectly.

Another thing you can look for is any distractor options that no one chose. You might want to replace these with distractors that are more "enticing".
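The checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are made up, the function names `pearson` and `option_discrimination` are hypothetical, and the total score here includes the item being analyzed (Digitek's exact calculation may differ). Each option's index is the ordinary Pearson correlation between a 0/1 "chose this option" indicator and the total score; an option that nobody chose has no defined correlation and is reported as `None`.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; None if either is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def option_discrimination(responses, key, item, options="ABCD"):
    """Correlate 'chose this option' (0/1) with total score, for each option of one item."""
    totals = [sum(a == k for a, k in zip(sheet, key)) for sheet in responses]
    result = {}
    for opt in options:
        chose = [1 if sheet[item] == opt else 0 for sheet in responses]
        result[opt] = pearson(chose, totals)
    return result


# Hypothetical data: a 3-item test taken by 6 students.
key = ["A", "C", "B"]
responses = [
    ["A", "C", "B"],
    ["A", "C", "B"],
    ["A", "C", "D"],
    ["B", "C", "A"],
    ["B", "D", "A"],
    ["C", "D", "B"],
]

disc = option_discrimination(responses, key, item=0)
for opt, r in disc.items():
    print(opt, "not chosen" if r is None else round(r, 2))

# Distractors that no one chose are candidates for replacement.
unchosen = [opt for opt, r in disc.items() if r is None]
```

In this made-up example the correct option A has a strongly positive index, distractors B and C are negative, and D was never chosen, so D is the one you might replace with something more enticing.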

5. What about the indices of reliability? Are the numbers I got good or bad? How high a reliability should I be achieving?

The overall test reliability indices are measures of the degree to which the scores on individual items correlate with one another. These statistics are probably less important than the individual item statistics. The possible range is 0 to 1.0. The higher the better. Probably anything at 0.5 or higher is okay. Low reliability indices can be increased by identifying and improving individual items on which the discrimination indices are low.
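For readers who want to see what such an index looks like, here is a small Python sketch of one common internal-consistency statistic for right/wrong items, the Kuder-Richardson formula 20 (KR-20). Which reliability indices Digitek actually reports is not stated here, so treat the choice of KR-20, the use of the population variance of total scores, and the function name `kr20` as assumptions. The data are made up.

```python
def kr20(scored):
    """KR-20 for a matrix of 0/1 item scores (one row per student).

    Returns None when all students have the same total score,
    because the statistic is undefined in that case.
    """
    n_students = len(scored)
    n_items = len(scored[0])
    totals = [sum(row) for row in scored]
    mean_total = sum(totals) / n_students
    # Population variance of total scores (an assumption; some texts use n - 1).
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students
    if var_total == 0:
        return None
    # Sum over items of p * (1 - p), where p is the item's difficulty level.
    sum_pq = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in scored) / n_students
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


# Hypothetical data: 6 students, 3 items, 1 = right, 0 = wrong.
scored = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 1],
]

reliability = kr20(scored)
print(round(reliability, 3))
```

With these made-up scores the result is a little above 0.6, which would count as okay by the rule of thumb above.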