Saturday, March 15, 2008

Chapter 3 – Validity

Educational Assessment - Review By Brenda Roof
Classroom Assessment – What Teachers Need to Know - W. James Popham

Chapter 3 discusses validity, the most significant concept in assessment. Tests themselves are neither valid nor invalid; validity applies to the score-based inferences we draw from test results. Popham describes three types of evidence that can be gathered to support the validity of those inferences.
The first type is content-related evidence of validity. This refers to the adequacy with which the content of a test represents the content of the curricular standard about which inferences are to be made. A curricular standard encompasses knowledge, skills, or attitudes. The more critical an assessment is, the more developmental care should be given to it. For important tests, a series of developmental activities is used to establish content-related evidence. First, a panel of national experts recommends the knowledge and skills that should be measured. The proposed content is then systematically contrasted with topics from five leading textbooks. Next, a group of nationally recognized teachers in the assessment's subject provides suggestions regarding key topics, and several internationally recognized college professors in the subject area offer recommendations for additions, deletions, and modifications. Finally, state and national associations in the subject area review the proposed content to be measured.
A less formal version of this process occurs when a teacher creates an end-of-unit assessment: the teacher should outline the important skills and knowledge, identify the content of the curricular standards covering the instructional period, and then build the assessment from that identified content. A second form of content-related evidence of validity involves gathering judges to rate the content appropriateness of a given assessment relative to the curricular standard it is meant to represent. For high-stakes assessments this process is very systematic, since the results are used to evaluate student performance on a large scale. For everyday classroom assessments, a fellow teacher can be asked to review the assessment.
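The judges'-rating approach above can be sketched as a simple tally: for each item, what fraction of the judges agree that it matches the curricular standard. The items, the five judges' ratings, and the 80% agreement threshold below are all hypothetical, chosen only to illustrate the idea.

```python
# A minimal sketch of summarizing judges' content-appropriateness ratings.
# Each judge marks an item 1 (matches the curricular standard) or 0.
# All item names, ratings, and the threshold are invented for illustration.
ratings = {
    "item_1": [1, 1, 1, 1, 1],
    "item_2": [1, 0, 1, 1, 1],
    "item_3": [0, 0, 1, 0, 1],
}

for item, votes in ratings.items():
    agreement = sum(votes) / len(votes)          # fraction of judges agreeing
    flag = "" if agreement >= 0.8 else "  <- review this item"
    print(f"{item}: {agreement:.0%} judged content-appropriate{flag}")
```

Items falling below the threshold are candidates for revision or removal before the assessment is used.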
The second form of validity evidence is criterion-related evidence of validity. This evidence helps educators decide how much confidence can be placed in a score-based inference about a student's status with regard to one or more curricular standards. It is used when trying to predict how well students will perform on a subsequent criterion variable; the evidence is typically collected by comparing scores on an aptitude test with the grades students subsequently earn. If a predictor assessment works well, its results can be used to make educational decisions about students. Such predictors are rarely perfect, however, so this form of evidence should be used with caution.
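The predictor-criterion relationship described above is usually summarized with a correlation coefficient. Below is a minimal sketch that computes a Pearson correlation between hypothetical aptitude-test scores and the grade-point averages the same students later earned; the data, and the strength of the relationship they show, are invented for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical data: aptitude-test scores (the predictor) and the
# grade-point averages those same students later earned (the criterion).
aptitude = [52, 61, 70, 75, 83, 90]
gpa = [2.1, 2.4, 2.9, 3.0, 3.3, 3.8]

r = pearson(aptitude, gpa)
print(f"predictor-criterion correlation: {r:.2f}")
```

A strong positive correlation supports placing confidence in the predictive inference; a weak one is a warning that the predictor should not drive decisions about students.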
The last type is construct-related evidence of validity, the most comprehensive of the three. Construct-related evidence of validity is the extent to which empirical evidence confirms both that an inferred construct exists and that a given assessment procedure is measuring that construct accurately. First, one or more hypotheses are formed based on an understanding of how the hypothetical construct being measured should behave. Data are then gathered to show whether those hypotheses are confirmed. If the evidence confirms that the test is measuring what it is intended to measure, we can draw valid score-based inferences once students take the test and scores are assigned.
Three types of strategies are most commonly used in construct-related evidence studies. The first is an intervention study, in which it is hypothesized that after some type of intervention students will respond differently to an assessment they were previously given. The second is a differential-population study, in which, based on knowledge of the construct being measured, it is hypothesized that students from distinctly different populations will score differently on the assessment procedure under consideration. The third is a related-measures study, in which it is hypothesized that a given kind of relationship will be present between students' scores on the assessment device being scrutinized and their scores on another, related or even unrelated, assessment device. When the other device measures a related construct, a strong positive relationship is expected; this is known as convergent evidence. When the other device measures an unrelated construct, only a weak relationship is expected; this is known as discriminant evidence, and finding the expected weak relationship also supports the intended inference.
The most important thing to remember about validity is that it does not reside in the test; it is a property of score-based inferences, which are either accurate or inaccurate. It is important for teachers to have a working understanding of assessment validity. Content-related evidence is probably the most important of the three types for a teacher to have a good handle on, especially for high-stakes tests. A best practice for teachers is to have their tests reviewed by a colleague who understands the curricular standards or key topics being taught, to ensure that is what is actually being assessed.
