Utilizing Assessments in the Classroom - Educational Assessments: April 2008

Sunday, April 27, 2008

Chapter 10 – Affective Assessments

Educational Assessment - Review By Brenda Roof
Classroom Assessment – What Teachers Need to Know - By W. James Popham

Chapter 10 discusses affect and its relevance, as well as the importance of assessing it. Affective assessments could allow teachers to write goals and focus whole-group instruction on students’ affect, especially when individual students are not shifting their affective status, and to better monitor and predict future behaviors. Likert inventories and multifocus affective inventories are described, along with a five-step process for creating multifocus inventories. Suggestions for when and how to assess affect are also presented.
The first question asked is why affect should be assessed at all. Many teachers feel that they only need to address students’ skills and knowledge. Assessing affect is often believed to be unimportant, and affect itself is thought to be beyond the classroom’s influence. The author stresses, however, that affective variables are at least as significant as cognitive variables, and often more so. The belief that affect is not significant needs to change. Teachers are often completely unaware of a student’s attitudes, interests and values. If teachers became more aware of these things, especially in the early years of education, they would have opportunities to shape a student’s affect toward instruction. With continual monitoring from year to year, students would be less likely to form negative attitudes and behaviors toward school, especially if those attitudes can be detected early and influenced in a positive direction.
The other side of this argument comes from vocal groups who want public schools to offer only a traditional cognitive education. Their argument is that affect should be the job of family and church. The real issue here is one of focus. There needs to be broad agreement on the specific areas of affect that schools will address, such as “the promotion of students’ positive attitudes towards learning as an affective aspiration”. One would hope that everyone could agree to support an affective aim of that kind. It is therefore very important to keep a clear focus on affect and its relevance to learning.
There are some specific variables that could be assessed to promote affect with broad support: attitudes, interests and values. Students who feel good about learning are more likely to keep learning if learning continues to be promoted. Potential attitude targets include positive attitudes toward learning, toward self, toward self as a learner, and toward those who differ from us. A second target is students’ interests, such as subject-related interests, interest in reading, and interest in emerging technologies. The third target is values. While there are values some feel schools should have no part in, others are generally non-controversial, such as honesty, integrity, justice and freedom. The goal is not to assess too many variables but to target a few important ones that promote positive future affect in students.
There are a few different ways affect can be assessed in a classroom. The easier an assessment is to carry out, the more likely a teacher is to use it, and to use it successfully. The best way to assess affect may be to ask students to complete a self-report inventory. One example is the Likert inventory, a series of statements to which students respond by indicating agreement or disagreement. There are eight steps in building a good Likert inventory. The first step is to choose the affective variable you want to assess: an attitude, an interest or a value. The second step is to generate a series of favorable and unfavorable statements regarding the affective variable, trying to use equal numbers of positive and negative statements. The third step is to get several people to classify each statement as positive or negative, and to throw out any statements they do not agree on. The fourth step is to decide on the number and phrasing of the response options for each statement. Typically Likert scales use five options: SD = strongly disagree, D = disagree, NS = not sure, A = agree, SA = strongly agree. Younger students may benefit from fewer choices. The fifth step is to prepare the self-report inventory, giving students directions on how to respond and stipulating that the inventory must be completed anonymously. Clear directions and a sample statement are important to producing a good Likert inventory. The sixth step is to administer the inventory, either to your own students or, if possible, to a class that is not yours. Based on the responses, you can make improvements before you administer it to your own students or to your next group of students. The seventh step is to score the inventories. Scoring should be clearly addressed in the directions and should conform to the number of response options. There should also be an equal distribution of positive and negative statements.
As an example of a scoring scenario with a five-choice response sequence, the SD and SA responses could be worth 5 points, the D and A responses 3 points, and NS 0 points. The eighth and final step is to identify and eliminate statements that fail to function in accord with the other statements. This can be done by computing a correlation coefficient between students’ responses to each statement and their total scores. Remove the statements whose responses are not consistent with the others, and re-score the inventory without them. This process is referred to as Likert’s criterion of internal consistency. Because the number of steps may be discouraging, a teacher could eliminate some of them.
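The scoring and internal-consistency steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1-to-5 weights with reverse scoring for negative statements are a common convention rather than the point values in the scenario above, the 0.2 cutoff is arbitrary, and every function name here (`score_inventory`, `weak_items`, and so on) is hypothetical.

```python
from math import sqrt

# Assumed weights: 5 = most favorable response, 1 = least favorable.
POSITIVE_WEIGHTS = {"SA": 5, "A": 4, "NS": 3, "D": 2, "SD": 1}
NEGATIVE_WEIGHTS = {"SA": 1, "A": 2, "NS": 3, "D": 4, "SD": 5}


def score_inventory(responses, polarity):
    """Return per-item points for one student, reverse-scoring negative statements."""
    return [
        (POSITIVE_WEIGHTS if is_positive else NEGATIVE_WEIGHTS)[response]
        for response, is_positive in zip(responses, polarity)
    ]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either list has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def weak_items(all_scores, threshold=0.2):
    """Return indexes of items that correlate poorly with the total of the other items."""
    flagged = []
    for i in range(len(all_scores[0])):
        item = [scores[i] for scores in all_scores]
        rest = [sum(scores) - scores[i] for scores in all_scores]
        if pearson(item, rest) < threshold:  # assumed cutoff, not from the chapter
            flagged.append(i)
    return flagged


# Four statements: positive, negative, positive, negative (made-up data).
polarity = [True, False, True, False]
class_responses = [
    ["SA", "SD", "A", "D"],
    ["A", "D", "NS", "SA"],
    ["D", "A", "D", "SD"],
    ["SD", "SA", "SD", "A"],
]
all_scores = [score_inventory(r, polarity) for r in class_responses]
print(all_scores[0])           # [5, 5, 4, 4]
print(weak_items(all_scores))  # [3]
```

With this made-up class, the fourth statement is flagged because its scores do not move with the rest of the inventory, so it would be dropped and the inventories re-scored without it.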
A second type of inventory, the multifocus affective inventory, collects information about a number of students’ affective dispositions at once. Because a Likert inventory devotes 10 to 20 items to a single affective variable, a multifocus inventory, which uses fewer items per variable, can cover more areas. There are five steps to creating a multifocus affective inventory. The first step is to select the affective variables to measure. Here again you will need to identify the educationally significant variables. The second step is to determine how many items to allocate to each affective variable. It is important to include equal numbers of positive and negative items. It is recommended that each variable have at least two items, one positive and one negative, and that any additional items be added in equal increments. The third step is to create a series of positive and negative statements related to each affective variable. Statements need to be designed to elicit differential responses from students, and statements about the same variable should be scattered through the inventory rather than grouped together. The fourth step is to determine the number and phrasing of students’ response options. Traditional Likert response options can be used for multifocus inventories as well. The fifth and final step is to create clear directions for the inventory and an appropriate presentation format. It is important to include lucid directions about how to respond, at least one sample item, a request for anonymous, honest responses, and a reminder that there are no right or wrong answers. These inventories are scored just like Likert inventories. The purpose of a multifocus inventory is to gather inferential data on student affect with fewer statements per variable.
Affect can be assessed in systematic ways that allow a teacher to make instructional decisions about students’ current and future affect. Group-focused inferences are the best use of affective assessments; individual inferences should be avoided. Attitudes, interests and values are variables that can be measured with broad support. Self-report assessments such as Likert inventories or multifocus inventories can be created and used to assess affect.

Sunday, April 20, 2008

Chapter 9 – Portfolio Assessments

Educational Assessment - Review By Brenda Roof
Classroom Assessment – What Teachers Need to Know - W. James Popham

Chapter 9 discusses portfolio assessments. The chapter shows how portfolio assessments are more effective in the classroom than as a standardized testing tool. Also discussed is the importance of self-assessment when using portfolios. The chapter presents seven steps for the classroom teacher to implement, and three functions of portfolio assessment that a teacher should consider before using portfolios. As with most assessment tools, there are positives and negatives to weigh as well.
A portfolio is defined as a “systematic collection of one’s work”. In the educational setting, we can think of a portfolio as a collection of a student’s work. While portfolios have been used in many professions over the years, they are somewhat new to education. They have been embraced by educators who are not keen on standardized tests. However, efforts to employ portfolios in large-scale accountability applications have not been very encouraging. One of the biggest reasons is the cost of trained scorers and centralized scoring. In some states where classroom teachers do the scoring, reliability has become an issue because of inadequate or missing training and concerns about bias toward students. In general, accountability testing may not be the best use of portfolio assessment.
Using portfolio assessments in the classroom, however, may be more realistic. The author suggests a seven-step sequence for making portfolio assessment successful in the classroom. The first step is to make sure your students “own” their portfolios. Students need to understand that the portfolio is a collection of their work, not just a project to earn a grade. The second step is to decide what kinds of work samples to collect. The teacher and student should decide this collaboratively, and a wide variety of samples is recommended. The third step is to collect and store work samples. How and where the work will be collected and stored should be planned with students. The fourth step is to select criteria by which to evaluate portfolio work samples. Again, the teacher and student need to plan the evaluative criteria carefully, and students need to clearly understand the rubric’s evaluative criteria. The fifth step is to require students to continually evaluate their own portfolio products. Using index cards, students should provide written self-evaluations of their work on an ongoing basis. The sixth step is to schedule and conduct portfolio conferences. This takes time but is very important for both teacher and student, and it helps the student make good self-evaluative comments. The seventh step is to involve parents in the portfolio assessment process. Parents should receive information about expectations and should be encouraged to review their children’s portfolios from time to time. They can even be involved in the self-assessment and reviews. These activities show the value of portfolio assessment in the classroom and promote its importance.
Portfolio specialists identify three main purposes of portfolios. The first purpose is documentation of student progress. The major function here is to provide the teacher, student and parents with evidence of growth. This is typically known as a working portfolio, and student self-evaluations are useful tools for this purpose. Student achievement levels should also influence instructional decisions; to make that possible, work should be collected or assessed as close to the marking terms as possible. The second purpose of a portfolio is to provide an opportunity for showcasing student accomplishments. The author Richard Stiggins refers to these as celebration portfolios and encourages them, especially in the early grades. A showcase portfolio should be a selection of the student’s best work, and a thoughtful reflection on its quality, provided by the student, is essential. The third purpose of a portfolio is evaluation of student status. Here the portfolio is used to determine whether students have met previously established evaluative criteria. Standardizing how portfolios are appraised is important for this purpose, for example with a pre-established rubric and clear examples for the student to follow. These three purposes show why it is important for a classroom teacher to decide on the primary purpose of the portfolios first, and then to determine how they should look and how students should prepare them. A portfolio assessment should have one priority or purpose; a single portfolio cannot serve all three functions well.
There are pros and cons to portfolio assessment, as there are with all forms of assessment. Its greatest strength is that it can be tailored to a student’s needs, interests and abilities. Portfolios also show students’ growth and learning, providing a way to document and evaluate classroom learning that standardized or written tests cannot. Self-evaluation is fostered as well, which guides student learning over time. Students also experience personal ownership of their work and of the progress they make. There are some cons too. Time constraints can make it difficult to evaluate consistently and to create appropriate scoring guides, and a great deal of time is needed to create and review portfolios properly. The biggest problem can be getting proper training in carrying out portfolio assessments.
Classroom teachers need to understand that portfolio assessment is not a one-time measurement approach for addressing short-term objectives. Portfolio assessments should be used for a big goal addressed throughout the year. Self-evaluation should be nurtured along the way. Teachers should pick one core area in which to use portfolios and not try to implement them the same way for every subject.
There are many good uses for portfolio assessments. While they should not be used in place of standardized testing or in conjunction with large-scale accountability assessments, portfolio assessments do have a place in the classroom. They can be used to address student progress over time. The chapter also discussed the seven key steps for using portfolio assessments and highlighted three different functions: documentation of progress, showcasing accomplishments, and evaluation of status. The pros and cons of this form of assessment were discussed as well.

Saturday, April 12, 2008

Chapter 8 – Performance Assessment

Educational Assessment - Review By Brenda Roof
Classroom Assessment – What Teachers Need to Know - W. James Popham

Chapter eight takes a look at performance assessments. Performance assessments try to create real-life situations and draw assessment inferences from the tasks being performed. Appropriate tasks are essential for performance assessments to be valid, and there are seven evaluative criteria for judging performance assessment tasks. The skills to be assessed must also be significant. Evaluative criteria are one of the most important components of the rubric used to evaluate responses on performance assessments. Distinctions are drawn among three rubric types, with attention to which of them enhance instruction.
A performance assessment is defined as an approach to measuring a student’s status based on the way the student completes a specified task. Opinions vary about what a true performance assessment is. Some educators feel that short-answer and essay assessments constitute performance assessments. Other educators feel that a true performance assessment must possess three features. The first is multiple evaluative criteria: performance must be judged using more than one criterion. The second is pre-specified quality standards: each evaluative criterion being judged is clearly explained in advance of any judgment of the quality of performance. The third is judgmental appraisal: human judgments are used to determine how acceptable a student’s performance is. Still others feel that performance assessments must be demanding and aligned with Bloom’s taxonomy. Regardless of the definition, performance assessments are very different from selected-response or constructed-response assessments.
Suitable tasks should be identified for performance assessments. Teachers will need to generate their own performance test tasks or select tasks from other educators. Teachers will also need to make inferences about students and decisions based on those inferences. All of this should be based on the curricular aims established early on. One of the biggest drawbacks of performance assessments is that, because students respond to fewer tasks than on typical paper-and-pencil tests, it is more difficult to generalize accurately about the skills and knowledge a student has gained. There are seven evaluative criteria to consider when evaluating performance-test tasks. The first is generalizability: is there a high likelihood that the student’s performance on the task will generalize to comparable tasks? The second is authenticity: is the task true to life, as opposed to school only? The third is multiple foci: does the task measure multiple instructional outcomes, not just one? The fourth is teachability: is the task one at which students can become more proficient as a consequence of the teacher’s instructional efforts? The fifth is fairness: is the task fair to all students? This criterion guards against test bias. The sixth is feasibility: is the task realistically implementable? The seventh and final criterion is scorability: is the task likely to elicit responses that can be reliably and accurately evaluated? The best-case scenario would be to satisfy all of these criteria, but satisfying as many as possible will work as well. One last important factor to consider is the significance of the skill you are evaluating. Performance assessments should be reserved for the most significant skills because of the amount of time needed to develop and score them.
A scoring rubric is typically used to score student responses on performance assessments. There are three important features of a scoring rubric. The first is evaluative criteria, the factors used to determine the quality of a response. No more than three or four evaluative criteria should be used. The second is descriptions of qualitative differences for the evaluative criteria. A description must be supplied for each criterion so that qualitative distinctions among responses can be made; describe in words what a perfect response should be. The third is an indication of whether a holistic or analytic scoring approach is to be used. The rubric must indicate whether the evaluative criteria are to be applied collectively, as in holistic scoring, or on a criterion-by-criterion basis, as in analytic scoring. A well-planned rubric will benefit instruction greatly.
There are a variety of rubrics in use today. Two types that the author describes as sordid are task-specific and hyper-general rubrics. One that he describes as super is the skill-focused rubric. The first sordid rubric is the task-specific rubric. In this rubric, the evaluative criteria are linked only to the specific task embodied in a specific performance test. This rubric does not provide the teacher with insight into instruction, because students should be taught to perform well on a variety of tasks, not a single task. The second sordid rubric is the hyper-general rubric. In this rubric, the evaluative criteria are stated in very general terms, such as describing an essay’s organization as simply inadequate. These rubrics might as well be replaced by letter grades of A through F, as they provide no instructional guidance on student performance. The third rubric is one of real value and one that should be used: the skill-focused rubric. These rubrics are built around the skill being measured by the constructed-response assessment, which is also the skill the teacher is pursuing instructionally. The key here is to develop the scoring rubric before instructional planning begins. For example, a skill-focused rubric that appraises organization might address two aspects: overall structure and sequence.
There are five rules to follow in creating a skill-focused rubric, which you will generate before you plan your instruction. The first rule is to make sure the skill to be assessed is significant. Skills assessed with a skill-focused rubric should be demanding accomplishments; if they are not, other forms of assessment are more appropriate. The second rule is to make certain that all of the rubric’s evaluative criteria can be addressed instructionally. Scrutinize every criterion to ensure you can teach students to master it. The third rule is to employ as few evaluative criteria as possible. Always try to focus on three or four evaluative criteria; with more than that, it will be difficult to use performance assessments properly. The fourth rule is to provide a succinct label for each evaluative criterion. One-word labels help students stay focused on what is expected for mastery. The fifth rule is to match the length of the rubric to your own tolerance for detail. If rubrics longer than one page seem overwhelming to you, then keep them short. Rubrics should be built to match the detail preference of the teacher.
Performance assessments provide an alternative to traditional paper-and-pencil assessments. They are also sometimes seen as more true to life and closer to what one would be expected to do in the real world. The tasks in performance assessments align more closely with high-level cognitive skills, allowing more accurate inferences to be derived about students and a more positive influence on instruction. These assessments do, however, require much more time and energy from students as well as teachers. Development and scoring must be done correctly for the inferences to be valid and effective.

Tuesday, April 8, 2008

Chapter 7 – Constructed Response Tests

Educational Assessment - Review By Brenda Roof
Classroom Assessment – What Teachers Need to Know - W. James Popham

Constructed-response tests are the focus of chapter seven. Constructed-response tests are well suited to assessing whether a student has achieved mastery. Creating and scoring constructed-response items takes more effort than creating and scoring selected-response items. They should be used when teachers want to know what has truly been mastered. Scoring criteria should be established ahead of time to preserve valid inferences. Two kinds of constructed-response items are discussed: short answer and essay.
The first type of constructed-response item discussed is the short-answer item. Short-answer items require a word, phrase or sentence in response to a direct question or incomplete statement. Typically, short-answer items are suited to relatively simple learning outcomes, such as those focused on recalling knowledge. The advantage of short-answer items is that students must produce a correct response, not pick out a familiar choice. The major disadvantage is that responses are difficult to score. Inaccurate scoring reduces reliability, which in turn reduces the validity of the assessment-based inferences made about students.
There are five item-writing guidelines for short-answer items. The first guideline is to employ direct questions rather than incomplete statements, particularly for younger students, who will be less confused by direct questions. The direct-question format also helps the item writer phrase the item with less ambiguity. The second guideline is to structure the item so that it calls for a concise response. Test items should be written to elicit brief, clear responses. The third guideline is to place blanks in the margin for direct questions or near the end of incomplete statements. Aligning the blanks in the margin aids scoring, and placing blanks toward the end of a statement means students read the whole statement before responding, which leads to more accurate responses. The fourth guideline applies to incomplete statements: use only one or, at most, two blanks. More than two blanks leave the item with holes galore, and its meaning is lost. The fifth guideline is to make sure the blanks for all items are equal in length. This practice eliminates unintended clues to the responses.
The second kind of constructed-response item is the essay item, the most commonly used form of constructed response. Essay items are used to gauge a student’s ability to synthesize, evaluate and compose. One form of essay item is the writing sample, which is used heavily in performance assessments. A strength of essay items is that they can assess complex learning outcomes. Their weaknesses are that they are difficult to write properly and that scoring responses reliably can be a challenge.
There are five item-writing guidelines for essay items. The first guideline is to convey to students a clear idea of the extensiveness of the response desired. Two forms are used for this: a restricted-response item, which limits the form and content of the response, and an extended-response item, which gives more latitude. With either form you can provide a set amount of space or a word limit. The second guideline is to construct items so that the student’s task is explicitly described. The nature of the assessment task has to be set forth clearly, so the student knows exactly how to respond. The third guideline is to provide students with the approximate time to be spent on each item, as well as each item’s value. Directions should state clearly the time allowed and the point value for each essay item. The fourth guideline is not to employ optional items. Offering a menu of options in effect creates different exams altogether, making it impossible to score responses on a common scale. The fifth guideline is to judge an item’s quality in advance by composing, mentally or in writing, a possible response to the item, which also sets your expectations for students’ responses.
As mentioned earlier, the most difficult problem with constructed-response items is scoring them. There are five guidelines for scoring responses to essay items. The first is to score responses holistically and/or analytically. Holistic scoring judges the essay response as a whole using the evaluative criteria, while analytic scoring is a specific point-allocation approach. The second guideline is to prepare a tentative scoring key in advance of judging responses. Decide ahead of time how you will score, to avoid being influenced by a student’s behavior in class or by the quality of the first few responses. The third guideline is to decide how important the mechanics of writing are before scoring begins. If content matters more to the inferences you want to make, establish that before you begin scoring. The fourth guideline is to score all responses to one item before scoring responses to the next item. Score every student’s response to the first item, then move on to every student’s response to the second item, and so on. Scoring one item at a time increases the reliability of your scoring and avoids constantly shifting your focus. It seems like it would take longer, but it really won’t. The fifth guideline is to evaluate responses anonymously as much as possible. Try having students sign their papers on the back, and do not look at the names while scoring. This helps you avoid making judgments that fall outside your scoring rubric.
Assessing students with constructed-response items allows for more in-depth awareness of learned skills. Two forms are used for this: short-answer items and essay items. There are guidelines for writing these items, as well as scoring guidelines that ensure greater reliability and validity of score-based inferences.

Sunday, April 6, 2008

Chapter 6 – Selected Response Tests

Educational Assessment - Review By Brenda Roof
Classroom Assessment – What Teachers Need to Know - W. James Popham

Chapter six focuses on creating and evaluating appropriate selected-response assessments. There are five general item-writing commandments that apply to both selected-response and constructed-response items. The author also discusses the strengths and weaknesses of four types of selected-response items. Selected-response items, if written properly, can assess higher-level thinking skills. A best practice is to have a colleague with content knowledge review your assessments, and to use the available formulas for evaluating them.
The five general item-writing commandments are essential to remember when creating selected-response and constructed-response tests. The first commandment is not to provide opaque directions to students about how to respond to your assessments. Teachers typically don’t put serious thought into their directions. When students meet an unfamiliar testing format, unclear or wordy directions can be a distraction and cause unintended incorrect responses, which impair the inferences drawn from the assessment. The second commandment is not to employ ambiguous statements in your assessment items. If your questions are unclear and students are unsure of what you mean, the questions can be misinterpreted. Again, this can produce a wrong response when the student knows the answer but does not know what you are asking. The third commandment is not to provide unintentional clues regarding the correct response. In this case students arrive at a correct response because they were led to it by the wording of the item. The student may not really know the correct response, so the item does not assess the skill properly, and follow-up instruction that would have produced actual learning may never occur. An unintentional clue can be as simple as the use of a word like “an” and how it completes the sentence or answer. The fourth commandment is not to employ complex syntax in assessment items. The goal is to use very simple sentences, because too many clauses disrupt the flow of the test item and obscure what it is asking. The fifth and final commandment is not to use vocabulary that is more advanced than required or than the student understands. To get a fix on a student’s status, you need to assess what was taught and learned, not introduce new material.
The first type of selected-response item is the binary-choice item, most commonly seen in the form of true-false. It is probably one of the oldest forms as well. There are five guidelines for writing binary-choice items. The first is to phrase items so that superficial analysis will lead to wrong answers. Doing this gets students to think about the test item and gives you a way to assess how much good thinking they can do. The second guideline is to rarely use negative statements and never use double negatives. It can be tempting to use the word “not” in a true statement, but this will only confuse the question and should be avoided. The third guideline is to include only one concept in each statement. If a test item contains one concept that is true and a second that is false, it is difficult for the student to respond correctly, and this leads to false inferences about the student’s true learning. The fourth guideline is to keep the item categories balanced. Keeping roughly equal numbers of true and false items is important and should be easy to do. The fifth guideline is to keep item length similar for both categories being assessed. Like the fourth guideline, this one removes patterns that reward guessing. If students see that statements in one category are consistently longer, they will begin responding to that pattern, and again you will not get a true assessment of learning.
The second type of selected-response assessment is the multiple binary-choice item. A multiple binary-choice item presents a cluster of statements and requires a binary response to each statement in the cluster. These clusters look similar to multiple-choice items; however, each statement in the cluster requires its own response rather than one response for the whole cluster. Two important guidelines should be used for multiple binary-choice items. The first is to separate item clusters clearly from one another. Because students are more familiar with multiple-choice items, clearly identify which statements are clustered together. Use stars or bold type to set off each new cluster. The second guideline is to make certain that each item fits well with the cluster’s stem. The stem is the part preceding the responses, and all items should be linked to it in a meaningful way. One large benefit of multiple binary-choice items is that if the stem contains new material and the binary choices depend on that material, the student must go beyond recalled knowledge to answer the question. More intellectually demanding thought is therefore required than the mere ability to memorize.
The third style of selected-response item is the multiple-choice item. This form of testing has been widely used for achievement testing. It is typically used to measure students’ possession of knowledge as well as their ability to engage in higher levels of thinking. There are five guidelines that should be employed for multiple-choice items. The first is that the stem should consist of a self-contained question or problem. It is therefore important to put as much content in the stem as is necessary to understand what the item is getting at. The second guideline is to avoid negatively stated stems. Again, using “not” may only confuse the item. The word is sometimes overlooked entirely, so if it must be used, put “not” in italics or bold. The third guideline is not to let the length of the alternative responses supply unintended clues. Try to keep all responses the same length, or at least use two short and two long. Distracters should be comparable to the correct response in length and form. The fourth guideline is to scatter your correct responses. If students notice a pattern in your answers, they may respond to the pattern instead of to your assessment. A good rule of thumb is that each answer position should hold the correct answer about equally often: roughly 25% each for A, B, C, and D, or an even split across however many answers you have. The fifth guideline is never to use “all-of-the-above”; however, you can use “none-of-the-above”. “All-of-the-above” is not a good idea because the assessment taker may look only at the first response, see that it is correct, and choose it without looking further. “None-of-the-above” can be used to increase item difficulty, and it works well for test-based inferences such as math problem solving. When the correct answer is not listed, the student must actually solve the problem, and a student who guesses something close shows you that the problem was not worked correctly.
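The rule of thumb about scattering correct responses can be checked mechanically once an answer key is drafted. The following is only a minimal sketch in Python, not something taken from the book: the function names, the sample key, and the 10-percentage-point tolerance are all assumptions made for illustration.

```python
from collections import Counter

def key_position_shares(answer_key, options="ABCD"):
    """Return the share of items whose correct answer sits at each position."""
    counts = Counter(answer_key)
    total = len(answer_key)
    return {opt: counts.get(opt, 0) / total for opt in options}

def flag_unbalanced(shares, tolerance=0.10):
    """List positions whose share differs from an even split by more than
    `tolerance` (an assumed threshold, not a published standard)."""
    expected = 1 / len(shares)
    return [opt for opt, share in shares.items()
            if abs(share - expected) > tolerance]

# Hypothetical 12-item answer key with four options per item.
answer_key = list("BDACADCBCABD")
shares = key_position_shares(answer_key)
unbalanced = flag_unbalanced(shares)
print(shares)
print(unbalanced)
```

A check like this only counts how often each position is correct. It does not detect a repeating order such as A, B, C, D, A, B, C, D, so the key still needs a visual scan for patterns.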
The fourth selected-response type is the matching item. Matching items consist of two parallel lists of words or phrases and require students to match entries on the first list with the appropriate entries on the second. One list should hold the premises and the other the responses. There are six guidelines to follow for well-constructed matching assessments. The first is to employ homogeneous lists. The entries within each list should be as alike in kind as possible; otherwise matching should not be used. The second guideline is to use relatively brief lists and place the shorter words or phrases on the right. Use about ten or fewer premises to cut down on the distracters that stand between the student and the correct response. The third guideline is to use more responses than premises. Having more responses reduces the ability to answer by process of elimination. The fourth guideline is to order the responses logically, for example alphabetically, so that the ordering gives no unintended clues. The fifth guideline is to describe the basis for matching and the number of times a response can be used. Students need to understand clearly how they should respond. The more accurately they respond, the more valid your test is for making score-based inferences. The sixth and final guideline is to place all premises and responses for an item on one page. Page flipping only creates confusion and leads to wrong responses, and it distracts other assessment takers.
This chapter addressed important rules for constructing selected-response and constructed-response assessments. It also addressed the four commonly used types of selected-response tests and the guidelines for using each of them. By practicing these concepts, teachers can increase the validity and reliability of their selected-response assessments and make appropriate assessment-based inferences.

Wednesday, April 2, 2008

Chapter 5 – Deciding What to Assess and How to Assess It

Educational Assessment - Review By Brenda Roof
Classroom Assessment – What Teachers Need to Know - W. James Popham

Chapter 5 focuses on what a classroom teacher should be assessing as well as the procedures for properly assessing students. Both questions should be guided by what information a teacher hopes to gather on students. Curricular standards should also play a role in choosing assessment targets. Bloom’s Taxonomy is a helpful framework for deciding on the cognitive outcomes of assessments and instruction. Deciding how to assess centers on norm-referenced and criterion-referenced approaches. Teachers should also consider whether to use selected-response or constructed-response assessments.
The chapter’s first focus is on what to assess. Decision-driven assessments help teachers gain information about their students. Before an assessment is created, it is important to clarify what decision or decisions will be influenced by a student’s performance on it. Often the student’s knowledge and skills are not the only things the results are expected to show. Attitudes toward what is being taught, the effectiveness of instruction, and the need for further instruction are also essential outcomes. By determining these things before instruction and assessment begin, a teacher can better inform both.
Curricular objectives also play a role in what to assess. Considering what your instructional objectives are can help you get a fix on what you should assess. In the past there was a demand for behavioral objectives. These objectives were sometimes too numerous and too narrow in scope, which overwhelmed teachers. Today the goal is to conceptualize curricular aims that are framed broadly yet are measurable, so that instruction can be organized around them. Measurability is the key to good objectives. Even broad objectives can be managed within your instructional aims if they are measurable.
There are three potential assessment targets. The first is cognitive assessment, which deals with students’ intellectual operations, such as displaying acquired knowledge or demonstrating thinking skills. The second is affective assessment, which deals with attitudes, interests, and values, such as self-esteem, risk taking, or attitude toward learning. The third is psychomotor assessment, which deals with a student’s large-muscle or small-muscle skills. These would be demonstrated in keyboarding or in shooting a basketball in physical education. These ideas were presented by Benjamin Bloom through a classification system for educational objectives known as the Taxonomy of Educational Objectives, published in 1956.
The next area of focus is how to assess. Two strategies are widely accepted. The first is norm-referenced measurement. In this strategy educators interpret a student’s performance in relation to the performance of students who have previously taken the same assessment. That previous group is known as the norm group. The second strategy is criterion-referenced measurement, or criterion-referenced interpretation. A criterion-referenced interpretation is absolute, because it hinges on the extent to which the curricular aims represented by the test are actually mastered by the student. The biggest difference between the two approaches is how the results are interpreted. The norm-referenced strategy should really be used only when a group of students needs to be chosen for a specific educational experience. Otherwise, criterion-referenced interpretations provide a much better idea of what students can and cannot do, which allows teachers to make good instructional decisions.
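The difference between the two interpretations can be illustrated with the same raw score read two ways. This is only a sketch in Python under stated assumptions: the norm-group scores, the 20-point test, and the 80% mastery cutoff are invented for illustration and do not come from the book.

```python
def percentile_rank(score, norm_scores):
    """Norm-referenced reading: percent of the norm group scoring below `score`."""
    below = sum(1 for s in norm_scores if s < score)
    return 100 * below / len(norm_scores)

def mastery(score, max_score, cutoff=0.8):
    """Criterion-referenced reading: share of the test mastered, and whether
    it meets `cutoff` (an assumed threshold the teacher would set)."""
    share = score / max_score
    return share, share >= cutoff

# Hypothetical scores of ten earlier test takers on a 20-point test.
norm_group = [12, 14, 15, 15, 16, 17, 18, 18, 19, 20]
student_score = 17

rank = percentile_rank(student_score, norm_group)
share, has_mastered = mastery(student_score, max_score=20)
print(rank)                 # relative standing in the norm group
print(share, has_mastered)  # absolute standing against the curricular aim
```

In this made-up case the student lands in the middle of the norm group yet clears the assumed mastery cutoff. The norm-referenced reading says how the student compares with others, while the criterion-referenced reading says what the student can do.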
Once teachers decide what to assess and how to assess it, the next thought should be how students will respond. There are really only two ways a student is able to respond: selected response and constructed response. Selected-response items can be multiple-choice or true-false selections. Constructed responses can be essays, oral speeches, or products the student creates. In deciding which type of response works best, scoring ease should not be a consideration. The assessment procedure should instead be chosen for how well it reveals the student’s status with regard to the unobservable variable the teacher hopes to determine.
The more up-front thought a teacher gives to what to assess and how to assess it, the more likely the teacher is to assess appropriately. Teachers need to be flexible and willing to change instruction based on assessment results and on the strategies used to assess. When instructional objectives are understood and made measurable, assessments can be written to answer these questions for teachers and students.
