Gregory J. Cizek
The right of Gregory J. Cizek to be identified as author of this work has been asserted by him in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.
To Stephen Francis Gregory Cizek (July 7, 1987 – October 19, 2019). You are my treasured son.
COPYRIGHT ACKNOWLEDGMENT
Just Can't Get Enough
Words and Music by Will Adams, Allan Pineda, Jaime Gomez, Stacy Ferguson, Jabbar Stevens, Julie Frost, Thomas Brown, Joshua Alvarez, Rodney Jerkins and Stephen Shadowen
Copyright 2010 BMG Sapphire Songs, i.am.composing, llc, BMG Platinum Songs US, apl.de.ap.publishing llc, BMG Gold Songs, Headphone Junkie Publishing LLC, EMI April Music Inc., Kid Ego, Darkchild Songs, Totally Famous Music, TBHits, Tuneclique Music, Native Boy Music, The Publishing Designee Of Stephen Shadowen and Rodney Jerkins Productions, Inc.
All Rights for BMG Sapphire Songs, i.am.composing, llc, BMG Platinum Songs US, apl.de.ap.publishing llc, BMG Gold Songs and Headphone Junkie Publishing LLC Administered by BMG Rights Management (US) LLC
All Rights for EMI April Music Inc., Kid Ego, Darkchild Songs, Totally Famous Music and TBHits Administered by Sony/ATV Music Publishing LLC, 424 Church Street, Suite 1200, Nashville, TN 37219
All Rights Reserved. Used by Permission.
Reprinted by Permission of Hal Leonard LLC
My validity journey has been a long one. I first encountered the notion of validity, and the high esteem it was afforded, nearly 40 years ago. I was an undergraduate student taking my last elective course in my final semester of preparation to become an elementary school teacher. The instructor was Professor Robert Ebel, whose stature in the field of measurement I did not apprehend at the time.
After spending several years as a fourth-grade teacher, my interests focused more clearly on assessment, and I pursued a graduate degree in that area. Eventually, I came to see psychometrics, at its core, as a field concerned with data quality control. That is, the fundamental aim of psychometricians was to help ensure that the data collected via any measurement procedure could be trusted as dependable, and that the scores produced would have the meaning they were intended (or assumed) to convey to users of information yielded by tests. I have found that aim to be consistent across the diverse contexts I experienced throughout my career: elementary school classroom assessments, local school board academic program decision making, high-stakes licensure and certification examinations, guidance tools used by school counselors, and statewide student achievement testing programs.
Having now spent nearly 30 years in a university setting, I often have the privilege of directing students' dissertation research projects. In those contexts and in the courses I teach, I try to pass along the high value and importance of validity, encouraging my colleagues to always interrogate whether the data they obtain so cleverly and analyze so complexly are any good in the first place (Cone & Foster, 1991, p. 653).
Over the course of those decades, however, what seemed to be so clear in importance seemed to be so muddied in understanding, even among specialists in the field of measurement. In my own scholarly work in the area, I first became concerned by what I perceived to be a flaw in validity theory regarding the place of consequences of testing. I was not the first to notice that error; indeed, I found that many others (e.g., Mehrens, 1997; Popham, 1997) had pointed out the error much earlier. The first steps in my validity journey focused somewhat narrowly on the error of including consequences of testing as a source of evidence supporting the validity of test scores.
That initial focus on consequences, however, proved to be fortuitous because it began to illuminate broader concerns. I came to realize that the problem of how to deal with the (appropriate) concern about consequences of testing was a comparatively minor aspect of two much more serious problems.
The first problem was that incompatible aspects of testing (test score meaning and test score use) had been incorporated into a single concept, validity, where they had a predictably unsettled coexistence. The second concern was the lack of a comprehensive framework for defensible testing that clearly differentiated the two most important questions in educational and psychological measurement: (1) What is the evidence that this test score has the meaning it is intended to have? and (2) What is the evidence that this test score should be used as it is intended to be used? As regards this second concern, it is worth noting that whereas professionally accepted evidentiary sources, procedures, and best practices for answering the first question (i.e., evaluating the intended meaning of test scores) have existed for more than a half-century, no similarly mature traditions or proffered guidelines existed for answering the second question.
This state of affairs (i.e., the fundamental incompatibility of combining evidence regarding test score meaning and test score use into a single concept, and the lack of a comprehensive framework for dealing with those concerns) has, I believe, frequently, though understandably, contributed to anemic validation efforts, and to often weakly supported or indefensible testing policy initiatives.