Education | Educational Assessment and Psychological Measurement
Y527 | 5984 | Dr. Ginette Delandshere

Course Description

For a good part of the 20th century, measurement was the primary
method used in the conduct of educational research and educational
assessment.  The field of educational research is changing and
educational assessment and psychological measurement appear to be in
transition.  Changes in research methodology, in our theoretical
understanding of learning and being or behaving, in our conception of
knowledge, and the ever increasing demands made on assessment are
creating tensions that affect the role and use of assessment, the
nature of the constructs and observations used in research and the
assessment process.

The purpose of this course is to introduce the foundations and major
concepts of measurement including issues of design, function and
consequences of testing and assessment and the important related
issues of validity.  As measurement data are used both in research and
assessment, issues of data quality, their meaning, the
appropriateness, credibility and consequences of their uses and the
inferences we make based on such data will also be explored.  Students
will also learn how to analyze and interpret assessment/measurement

A variety of instructional formats will be used in this course
including lecture, seminar, and small group discussion.  Participation
in the discussions is an important part of learning and therefore
attendance is required.  Course material will also be available on
SiteScape.  Articles for this course have been placed on reserve.

Course Requirements and Assignments

There will be two examinations (each 35% of final grade) and one
written project (30% of final grade), which will require the
conceptualization, construction and interpretation of a measurement
instrument or assessment scheme.  This written project will run
through a good part of the semester and will proceed in several steps,
on each of which you will receive feedback.

Students are also responsible for the assigned readings and for
in-class and homework assignments.  Reading response assignments (not
to exceed one page per assignment), consisting of written
reflections on assigned readings, will be used to stimulate
class discussion and participation and will be taken into account in
evaluating your performance in the course.  The reading
response assignments and other homework will not be graded; they will
be checked for completion and adequacy and feedback will be provided.
To receive full credit, all homework assignments have to be completed
and turned in on time.  If homework is not turned in or is
systematically late, incomplete or inadequate, your final grade will be
decreased by one or two grade levels (e.g., A will turn into A- or B+)
depending on the number of late and incomplete assignments.  A course
grade of "Incomplete" will not be assigned except in the case of
illness or other emergencies.  Intended or unintended cheating and/or
plagiarism (see academic handbook) will yield a grade of F in the course.


Thorndike, R. M. (1997).  Measurement and Evaluation in Psychology and
Education. (6th Ed.).  NJ: Prentice Hall.

Other readings are assigned as judged appropriate

Other References

Testing and Measurement

Allen, M. J., and Yen, W. M. (1979).  Introduction to measurement
theory.  Monterey, Calif. Brooks-Cole.

Anastasi, A. (1988).  Psychological testing (6th ed.).  New York:
Macmillan.

Brennan, R. L. (1983).  Elements of generalizability theory.  Iowa
City:  ACT Publications.

Guilford, J. P., and Fruchter, B. (1978).  Fundamental statistics in
psychology and education. New York:  McGraw-Hill.

Hanson, F. A. (1993).  Testing, testing: Social consequences of the
examined life.  Berkeley, Calif.: University of California Press.

Hopkins, K.C., Stanley, J.C. & Hopkins, B.R. (1990).  Educational and
Psychological Measurement and Evaluation.  NY: Prentice Hall.

Linn, R. L. (1989).  Educational Measurement (3rd edition).  National
Council on Measurement in Education, American Council on Education.
Macmillan Publishing Company, New York, NY.

Linn, R. L., and Gronlund, N. E. (1995).  Measurement and assessment in
teaching.  Prentice Hall.

Messick, S. (1995).  Validity of psychological assessment.  Validation
of inferences from persons' responses and performances as scientific
inquiry into score meaning.  American Psychologist, 50, 9, 741-749.

Messick, S. (1989).  Meaning and values in test validation:  The
science and ethics of assessment.  Educational Researcher, 18, 2, pp.

Messick, S. (1988).  The once and future issues of validity:  Assessing
the meaning and consequences of measurement (Chap. 3).  In H. Wainer
and H. Braun (Eds.), Test validity.  New Jersey: Lawrence Erlbaum
Associates.

Nitko, A.J. (1996).  Educational assessment of students (2nd. ed.).
Englewood Cliffs, NJ.

Pedhazur, E., & Schmelkin, L. (1991).  Chapter 2: Measurement and
scientific inquiry.  In Measurement, design and analysis: An integrated
approach.  New Jersey: Lawrence Erlbaum Associates.

Sax G. (1989).  Principles of educational and psychological
measurement and evaluation (3rd. ed.).  Wadsworth Pub., Belmont, CA.

Thorndike, R. L. (1971).  Educational measurement (2nd ed.).
Washington, D. C.: American Council on Education.

Other Readings on Assessment

Black, P. J. (1998).  Testing, friend and foe?: The theory and
practice of assessment and testing.  London: Falmer Press.

Bennett, R.E. & Ward, W.C. (Eds.) (1993).  Construction Versus Choice
in Cognitive Measurement:  Issues in Constructed Response, Performance
Testing, and Portfolio Assessment.  Hillsdale, NJ:  Lawrence Erlbaum

Curren, R.R. (1995).  Coercion and the ethics of grading and testing.
Educational Theory, 45, 4, 425-441.

Finch, F. (1991).  Educational performance assessment.  Chicago:
Riverside Publishing Company.

Gifford, B. R. & O'Connor, M.C. (Eds.), (1992).  Changing assessments:
Alternative views of aptitude, achievement and instruction.  Boston:
Kluwer Academic Publishers.

Gipps, C. V. (1999).  Socio-cultural aspects of assessment.  Review of
Research in Education, 24.

Gipps, C. V. (1994).  Beyond Testing.  London:  Falmer Press.

Kane, M., Crooks, T., & Cohen, A. (1999).  Validating measures of
performance.  Educational Measurement: Issues and Practice, 18(2), 5-17.

Madaus, G. F. (1994).  Testing's place in society:  An essay review of
Testing, testing:  Social consequences of the examined life.  American
Journal of Education, 102, 222-234.

Madaus, G. F. & O'Dwyer, L. M. (1999).  A short history of performance
assessment.  Phi Delta Kappan (May). 688-695.

Meier, D.  (2000).  Will standards save public education?  Boston, MA:
Beacon Press.

Milofsky, C. (1989).  The sociology of school psychology.  New
Brunswick, NJ:  Rutgers University Press.

Resnick, L. B., & Resnick, D. P. (1992).  Assessing the thinking
curriculum:  New tools for educational reform.  In B. Gifford & M.
O'Connor (Eds.), Changing assessments:  Alternative views of aptitude,
achievement and instruction (pp. 37-76), London: Kluwer Academic

Pellegrino, J. W., Baxter, G. P., and Glaser, R. (1999).  Addressing
the "Two Disciplines" problem:  Linking theories of cognition and
learning with assessment and instructional practices.  Review of
Research in Education, 24.

Shepard, L. A. (2000).  The role of assessment in a learning culture.
Educational Researcher, 29(7), 4-14.

Shepard, L. A. (1991).  Psychometricians' beliefs about learning.
Educational Researcher, 20(6), 2-16.

Shepard, L. A. (1989).  Why we need better assessment.  Educational
Leadership, April.

Sternberg, R. J. (1998).  Abilities are forms of developing expertise.
Educational Researcher, 27(3), 11-20.

Sternberg, R. J. (1996).  Myths, Countermyths, and Truths about
Intelligence.  Educational Researcher, 25(2), 11-16.

Wiggins, G. P. (1998).  Educative assessment. San Francisco:
Jossey-Bass Publishers.

Wiggins, G. P. (1993).  Assessing student performance:  Exploring the
purpose and limits of testing.  San Francisco:  Jossey-Bass

Wolf, D. (1993). Assessment as an episode of learning. In Bennett,
R.E. & Ward, W.C. (Eds.) Construction Versus Choice in Cognitive
Measurement:  Issues in Constructed Response, Performance Testing, and
Portfolio Assessment (pp. 213-240).  Hillsdale, NJ:  Lawrence Erlbaum

Tentative Course Outline and Class Schedule (subject to change)

9/3 Introduction and Overview
Experience with assessment and measurement
Issues of Inference

9/10 History, Definitions and Perspective
[Thorndike - Chap. 1]
[Gipps - Chap. 1]
[Pedhazur &  Schmelkin Chap. 2]
(reading response assignment)

9/17 Validity - History, Sources of Evidence
[Thorndike - Chap. 5]

9/24 Validity - Issues of Meaning and Values, Appraisal and Inferences
[Messick, 1989 - Educational Researcher]
(reading response assignment)

10/1 Test scores: Statistical Concepts and Norms
[Thorndike - Chap. 2 & 3]

10/8 Reliability - The Consistency of Scores and Judgments
[Thorndike - Chap. 4]


10/22-10/29 Assessment Inventory
Educational Assessment

- Achievement measures - Objective tests

- Classroom assessment

- Performance Assessment
[Madaus & O'Dwyer, 1999] - required reading

Psychological Measurement

- Aptitude measures

- Interest, personality and attitude measures

- Other psychological measurements/observations
[Sternberg, 1998] - required reading

For these two weeks, students will also select readings depending on
their interests and focus for written project - select among the
following for descriptions of these measures and any relevant article
in the attached Resource List:
[Thorndike - Chap. 8, 9, 11, 12, 15]
[Linn & Gronlund, Chap. 10]

11/5-12 Assessment, Learning and Teaching
[D. Wolf - Chap. 10 in Bennett & Ward]
[Shepard, 1989, 2000]
(reading response assignment)

11/19-26 Assessment and Measurement Conceptualization, Development and
Issues in Educational Assessment - see Resource List
Issues in Psychological Measurement - see Resource List
[including discussion of reading focused on substantive
learning and psychological theories]

12/3-10 Bias, Equity and Ethics
[Thorndike, Chap. 14]
[Curren, R.R. (1995).  Coercion and the ethics of grading and testing.
Educational Theory, 45, 4, 425-441.]


Resource List
(For selected reading and written assignment)

Educational Assessment

For Large-Scale (e.g., NAEP), High-Stakes Assessment (e.g., ISTEP).

Haertel, E. H. (1999).  Validity arguments for high-stakes testing: In
search of the evidence. Educational Measurement: Issues and Practice.
18(4), 5-9.

Popham, W. J. (1999).  Where large scale assessment is heading and why
it shouldn't. Educational Measurement: Issues and Practice. 18(3),
13-17.

Kane, M. (2002).  Validating high stakes testing programs. Educational
Measurement: Issues and Practice. 21(1), 31-39.

Linn, R. L., Baker, E. L., & Betebenner, D. W. (2002).  Accountability
systems: Implications of requirements of the No Child Left Behind Act
of 2001.  Educational Researcher, 31(6), 3-16.

The No Child Left Behind Act of 2001 - legislation.

For Performance Assessment (e.g., portfolios, writing, essays,
open-ended questions, scientific experiment, exhibits, oral
language/music/theater performance) and Assessment in General.

Delandshere, G. & Arens, S.  (2003).  Examining the quality of the
evidence in pre-service teacher portfolios.  Journal of Teacher
Education (forthcoming Jan/Feb issue).

Delandshere, G. (2002).  Assessment as Inquiry.  Teachers College
Record. 104(7), 1461-1484.

Delandshere, G. & Arens, S.  (2001).  Representations of teaching in
standards-based reform: Are we closing the debate about teacher
education?  Teaching and Teacher Education, 17, 547-566.

Stiggins, R. J. (2001). The unfulfilled promise of classroom
assessment.  Educational Measurement: Issues and Practice. 20(3),

Delandshere, G. & Jones, J. H.  (1999).  Elementary teachers' beliefs
about assessment in mathematics:  A case of assessment paralysis.
Journal of Curriculum and Supervision, 14(3), 216-240.

Delandshere, G. & Petrosky, A.  (1998).  Assessment of complex
performances:  Limitations of key measurement assumptions. Educational
Researcher, 27(2), 14-24.

Mabry, L. (1999). Writing to the Rubric: Lingering Effects of
Traditional Standardized Testing on Direct Writing Assessment.  Phi
Delta Kappan. 80(9), 673-679.

Psychological Measures

The following articles are all available online:

Reliability of Scores from the Eysenck Personality Questionnaire: A
Reliability Generalization Study. Caruso J.C.; Witkiewitz K.;
Belcourt-Dittloff A.; Gottlieb J.D. Educational and Psychological
Measurement, August 2001, vol. 61, no. 4, pp. 675-689(15) Sage
Publications Inc.

Differential Item Functioning in the WISC-III: Item Parameters for
Boys and Girls in the National Standardization Sample.  Maller S.J.
Educational and Psychological Measurement, October 2001, vol. 61, no.
5, pp. 793-817(25) Sage Publications Inc.

Expectancies for Success as a Multidimensional Construct Among
Employed Adults. Ward E.A.  Educational and Psychological Measurement,
October 2001, vol. 61, no. 5, pp. 818-826(9) Sage Publications Inc.

The Kaufman Ability Battery for Children Mental Processing Scale: A
Valid Measure of "Pure" Intelligence?  Cahan S.; Noyman A. Educational
and Psychological Measurement, October 2001, vol. 61, no. 5, pp.
827-840(14). Sage Publications Inc.

An Investigation of the Validity of Scores on Locally Developed
Performance Measures in a School Assessment Program.  Crehan K.D.
Educational and Psychological Measurement, October 2001, vol. 61, no.
5, pp. 841-848(8). Sage Publications Inc. [maybe good reading for
exercise on validity and reliability - short article]

A Study Strategies Self-Efficacy Instrument for Use With Community
College Students.  Silver B.B.; Smith E.V. Jr.; Greene B.A.
Educational and Psychological Measurement, October 2001, vol. 61, no.
5, pp. 849-865(17). Sage Publications Inc. [items included - factor

A General Measure of Work Stress: The Stress in General Scale.
Stanton J.M.; Balzer W.K.; Smith P.C.; Parra L.F.; Ironson G.
Educational and Psychological Measurement, October 2001, vol. 61, no.
5, pp. 866-888(23) Sage Publications Inc. [items included + corr. with
other measures]

A Structural and Discriminant Analysis of the Work Addiction
Risk Test.  C.P. Flowers; B. Robinson. Educational and Psychological
Measurement. 2002. Volume: 62 no. 3, pp. 517-526.  Sage Publications.
[Items included]

A Theoretical and Empirical Analysis of the Measurement of Collective
Efficacy: The Development of a Short Form.  R. Goddard.  Educational
and Psychological Measurement. 2002, Volume: 62 no. 1 pp. 97-110. Sage
Publications. [Items included]

Further Validity and Reliability Evidence for Beck Hopelessness Scale
Scores in a Nonclinical Sample. L. Steed. Educational and
Psychological Measurement. 2001, Volume: 61 no. 2, pp. 303-316. Sage
Publications. [items not included but probably could be obtained -

An Examination of Measurement Characteristics and Factorial Validity
of Scores on the Revised Conflict Tactics (used for intimate partner
violence).  R.R. Newton; C.D. Connelly; J.A. Landsverk.  Educational
and Psychological Measurement. 2001. Volume: 61 no. 2, pp. 317-335. Sage
Publications. [items included]

A Reliability Generalization Study of the Teacher Efficacy Scale and
Related Instruments.  R.K. Henson; L.R. Kogan; T. Vacha-Haase.
Educational and Psychological Measurement. 2001. Volume: 61 no. 3, pp.
404-420.  Sage Publications. [items not included but could probably
be obtained].

The Factorial Validity of Scores on the Teacher Interpersonal
Self-Efficacy Scale.  A. Brouwers; W. Tomic. Educational and
Psychological Measurement.  2001. Volume: 61 no. 3, pp. 433-445. Sage
Publications.  [items included - used FA].

Initial Development and Score Validation of the Adolescent Anger
Rating Scale.  D.M. Burney; J. Kromrey. Educational and Psychological
Measurement. 2001. Volume: 61 no. 3, pp. 446-460. Sage Publications.
[items included  - used FA]

Computerized and Paper-and-Pencil Versions of the Rosenberg
Self-Esteem Scale: A Comparison of Psychometric Features and
Respondent Preferences.  W.P. Vispoel; J. Boo; T. Bleiler.
Educational and Psychological Measurement. 2001, Volume: 61, no. 3,
pp. 461-474.  Sage Publications. [items not included but could
probably be obtained - CFA].

Exploratory Analysis of the Structure of Scores From the
Multidimensional Scales of Perceived Self-Efficacy.  N. Choi; D.R.
Fuqua; B.W. Griffin.  Educational and Psychological Measurement. 2001.
Volume: 61 no. 3, pp. 475-489. Sage Publications. [Bandura's
Multidimensional Scales of Perceived Self-Efficacy (MSPSE) included]