  • Why Is Assessment Important?

Asking students to demonstrate their understanding of the subject matter is critical to the learning process; it is essential to evaluate whether the educational goals and standards of the lessons are being met.

Assessment is an integral part of instruction, as it determines whether or not the goals of education are being met. Assessment affects decisions about grades, placement, advancement, instructional needs, curriculum, and, in some cases, funding. Assessment inspires us to ask these hard questions: "Are we teaching what we think we are teaching?" "Are students learning what they are supposed to be learning?" "Is there a way to teach the subject better, thereby promoting better learning?"

Today's students need not only basic reading and arithmetic skills, but also skills that will allow them to face a world that is continually changing. They must be able to think critically, to analyze, and to make inferences. Changes in the skills and knowledge our students need require new learning goals, and these new learning goals change the relationship between assessment and instruction. Teachers need to take an active role in making decisions about the purpose of assessment and the content that is being assessed.


Grant Wiggins, a nationally recognized assessment expert, shared his thoughts on performance assessments, standardized tests, and more in an Edutopia.org interview. Read his answers to the following questions from the interview and reflect on his ideas:

  • What distinction do you make between 'testing' and 'assessment'?
  • Why is it important that teachers consider assessment before they begin planning lessons or projects?
  • Standardized tests, such as the SAT, are used by schools as a predictor of a student's future success. Is this a valid use of these tests?

Do you agree with his statements? Why or why not? Discuss your opinions with your peers.

When assessment works best, it helps teachers and students answer questions such as the following:

  • What is the student's knowledge base?
  • What is the student's performance base?
  • What are the student's needs?
  • What has to be taught?
  • What performance demonstrates understanding?
  • What performance demonstrates knowledge?
  • What performance demonstrates mastery?
  • How is the student doing?
  • What teaching methods or approaches are most effective?
  • What changes or modifications to a lesson are needed to help the student?
  • What has the student learned?
  • Can the student talk about the new knowledge?
  • Can the student demonstrate and use the new skills in other projects?
  • Now that I'm in charge of my learning, how am I doing?
  • Now that I know how I'm doing, how can I do better?
  • What else would I like to learn?
  • What is working for the students?
  • What can I do to help the students more?
  • In what direction should we go next?

Continue to the next section of the guide, Types of Assessment.

This guide is organized into six sections:

  • Introduction
  • Why Is Assessment Important?
  • Types of Assessment
  • How Do Rubrics Help?
  • Workshop Activities
  • Resources for Assessment


Education assessment in the 21st century: Moving beyond traditional methods

By Esther Care, former nonresident senior fellow, Global Economy and Development, Center for Universal Education (@care_esther), and Alvin Vista, former Brookings expert (@alvin_vista)

February 23, 2017

This blog is part of a four-part series on shifting educational measurement to match 21st century skills, covering traditional assessments, new technologies, new skillsets, and pathways to the future. These topics were discussed at the Center for Universal Education’s Annual Research and Policy Symposium on April 5, 2017. You can watch video from the event or listen to audio here.

The United Nations’ Sustainable Development Goals (SDGs) describe the target of achieving inclusive and quality education for all by 2030. As we work to accomplish this goal, we must also face the bigger challenge of identifying not only where children can access education, but how they can benefit from that access—an imprecise target. From the perspective of educational measurement, to what extent are we ready and able to assess progress in terms of quality of education?

Traditional educational measurement

When we think about tests in schools, we often picture students shuffling papers at their desks. They fill in short answers to questions, respond to multiple-choice style options, or write brief essays. The majority of their cognitive effort is focused on searching their memory to find appropriate responses to the test items, or applying formulae to familiar problems. This style of educational assessment targets the types of skills that were seen as important throughout the 20th century—the skills of storing relevant information and retrieving it upon demand, often as these processes related to literacy and numeracy.

However, from a measurement perspective, the issues are more complex. Meaningful measurement requires defining what one intends to measure, as well as a consistent system to define the magnitude of what is being measured. This is straightforward for physical measurements, such as weight in pounds and height in inches, but not for cognitive measurements. Although we have been assessing numeracy and literacy skills for over a hundred years, measuring these skills is not as simple as it seems.

Measuring human attributes

Numeracy and literacy are “made-up” concepts. These concepts (known as “constructs” in academic literature) are not tangible objects that can easily be measured by their weight or height. These constructs lack inherent measurement properties independent of human definition. This presents educators with a dilemma. We need to assess student-learning outcomes in order to know what students are ready to learn next. Historically we have relied upon numbers to communicate learning outcomes; however, numbers that are easily applied to properties that exist independently of humans, such as mass and length, do not translate so easily with regard to human characteristics.

When we think about learning or skills, we assume underlying competencies are responsible for particular behaviors. But we cannot see these competencies; we can only see their outcomes. So if we are to measure those competencies, we must examine the outcomes in order to estimate their amount, degree, or quality. This is the challenge: with a huge variety of ways in which competencies might manifest, how do we define a scale to measure outcomes in a way that has consistent meaning? An inch is always an inch, but what is viewed as a correct answer to a question may vary. So what we look for in measurement of these educational constructs are proxies—something that stands for what we are really interested in.

Using proxy measurements

We use proxy measures for many things, physical as well as conceptual. For example, in forensic science, when skeletons are incomplete, height can be estimated using the length of the arm or leg. These proxies work well, as opposed to, say, teeth, because they are reasonably accurate and correlate closely with height. The quality of our measurements is therefore very much dependent on the quality of the proxies we choose.
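The proxy logic above can be made concrete with a short sketch. The coefficients below follow the classic Trotter and Gleser (1952) stature formula for one population group; treat them as an illustrative assumption rather than a forensic standard, and the function name as hypothetical:

```python
# Proxy measurement sketch: estimate a hard-to-observe quantity (full stature)
# from an observable proxy (femur length). Coefficients follow the classic
# Trotter & Gleser (1952) formula for one population group and are used here
# purely as an illustration, not as a forensic standard.

def estimate_stature_cm(femur_length_cm: float) -> float:
    """Estimate stature (cm) from femur length (cm) via a linear proxy model."""
    return 2.38 * femur_length_cm + 61.41

# A good proxy varies predictably with the target; writing the relationship
# as an explicit model makes the assumption auditable.
for femur in (42.0, 45.0, 48.0):
    print(f"femur {femur:.1f} cm -> estimated stature {estimate_stature_cm(femur):.1f} cm")
```

The same structure applies to educational proxies: a test response stands in for a competency, and the mapping from one to the other is a modeling choice that can be examined and improved.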

Student responses on educational tests are proxies for their competencies and learning, and different types of proxies will be better or worse at revealing the quality of competencies. Here is the crunch: What sorts of proxies are most useful for each skill or competency, and how do we collect these?

The future of educational assessment

Through the last few decades, pen and paper tests have been the main method used to assess educational outcomes. For literacy and numeracy, this makes reasonable sense, since the learning outcome can be demonstrated in much the same way as the applied skill itself is typically demonstrated. However, for other skills of increasing interest in the education world—such as problem solving, critical thinking, collaboration, and creativity—this is less the case.

The challenge is how to proceed from the status quo, where system-level assessment using traditional tests is still seen as using good-enough proxies of academic skill, and where testing processes are implemented using traditional methods that everyone finds convenient, systematic, and cost-effective. In addition, increasing interest in education systems’ implementation of 21st century skills raises new hurdles. If we are interested in supporting students’ acquisition of these skills, we need assessment methods that make the skills themselves explicit—in other words, we need to look for new proxies.


Created by the Great Schools Partnership, the GLOSSARY OF EDUCATION REFORM is a comprehensive online resource that describes widely used school-improvement terms, concepts, and strategies for journalists, parents, and community members.


In education, the term assessment refers to the wide variety of methods or tools that educators use to evaluate, measure, and document the academic readiness, learning progress, skill acquisition, or educational needs of students.

While assessments are often equated with traditional tests—especially the standardized tests developed by testing companies and administered to large populations of students—educators use a diverse array of assessment tools and methods to measure everything from a four-year-old’s readiness for kindergarten to a twelfth-grade student’s comprehension of advanced physics. Just as academic lessons have different functions, assessments are typically designed to measure specific elements of learning—e.g., the level of knowledge a student already has about the concept or skill the teacher is planning to teach or the ability to comprehend and analyze different types of texts and readings. Assessments also are used to identify individual student weaknesses and strengths so that educators can provide specialized academic support, educational programming, or social services. In addition, assessments are developed by a wide array of groups and individuals, including teachers, district administrators, universities, private companies, state departments of education, and groups that include a combination of these individuals and institutions.

While assessment can take a wide variety of forms in education, the following descriptions provide a representative overview of a few major forms of educational assessment.

Assessments are used for a wide variety of purposes in schools and education systems:

  • High-stakes assessments are typically standardized tests used for the purposes of accountability—i.e., any attempt by federal, state, or local government agencies to ensure that students are enrolled in effective schools and being taught by effective teachers. In general, “high stakes” means that important decisions about students, teachers, schools, or districts are based on the scores students achieve on a high-stakes test, and either punishments (sanctions, penalties, reduced funding, negative publicity, not being promoted to the next grade, not being allowed to graduate) or accolades (awards, public celebration, positive publicity, bonuses, grade promotion, diplomas) result from those scores. For a more detailed discussion, see high-stakes test.
  • Pre-assessments are administered before students begin a lesson, unit, course, or academic program. Students are not necessarily expected to know most, or even any, of the material evaluated by pre-assessments—they are generally used to (1) establish a baseline against which educators measure learning progress over the duration of a program, course, or instructional period, or (2) determine general academic readiness for a course, program, or grade level that a student may be transferring into.
  • Formative assessments are in-process evaluations of student learning that are typically administered multiple times during a unit, course, or academic program. The general purpose of formative assessment is to give educators in-process feedback about what students are learning or not learning so that instructional approaches, teaching materials, and academic support can be modified accordingly. Formative assessments are usually not scored or graded, and they may take a variety of forms, from more formal quizzes and assignments to informal questioning techniques and in-class discussions with students.
Formative assessments are commonly said to be for learning because educators use the results to modify and improve teaching techniques during an instructional period, while summative assessments are said to be of learning because they evaluate academic achievement at the conclusion of an instructional period. Or as assessment expert Paul Black put it, “When the cook tastes the soup, that’s formative assessment. When the customer tastes the soup, that’s summative assessment.”
  • Interim assessments are used to evaluate where students are in their learning progress and determine whether they are on track to performing well on future assessments, such as standardized tests, end-of-course exams, and other forms of “summative” assessment. Interim assessments are usually administered periodically during a course or school year (for example, every six or eight weeks) and separately from the process of instructing students (i.e., unlike formative assessments, which are integrated into the instructional process).
  • Placement assessments are used to “place” students into a course, course level, or academic program. For example, an assessment may be used to determine whether a student is ready for Algebra I or a higher-level algebra course, such as an honors-level course. For this reason, placement assessments are administered before a course or program begins, and the basic intent is to match students with appropriate learning experiences that address their distinct learning needs.
  • Screening assessments are used to determine whether students may need specialized assistance or services, or whether they are ready to begin a course, grade level, or academic program. Screening assessments may take a wide variety of forms in educational settings, and they may be developmental, physical, cognitive, or academic. A preschool screening test, for example, may be used to determine whether a young child is physically, emotionally, socially, and intellectually ready to begin preschool, while other screening tests may be used to evaluate health, potential learning disabilities, and other student attributes.

Assessments are also designed in a variety of ways for different purposes:

  • Standardized assessments are designed, administered, and scored in a standard, or consistent, manner. They often use a multiple-choice format, though some include open-ended, short-answer questions. Historically, standardized tests featured rows of ovals that students filled in with a number-two pencil, but increasingly the tests are computer-based. Standardized tests can be administered to large student populations of the same age or grade level in a state, region, or country, and results can be compared across individuals and groups of students. For a more detailed discussion, see standardized test.
  • Standards-referenced or standards-based assessments are designed to measure how well students have mastered the specific knowledge and skills described in local, state, or national learning standards. Standardized tests and high-stakes tests may or may not be based on specific learning standards, and individual schools and teachers may develop their own standards-referenced or standards-based assessments. For a more detailed discussion, see proficiency-based learning.
  • Common assessments are used in a school or district to ensure that all teachers are evaluating student performance in a more consistent, reliable, and effective manner. Common assessments are used to encourage greater consistency in teaching and assessment among teachers who are responsible for teaching the same content, e.g., within a grade level, department, or content area. They allow educators to compare performance results across multiple classrooms, courses, schools, and/or learning experiences (which is not possible when educators teach different material and individually develop their own distinct assessments). Common assessments share the same format and are administered in consistent ways—e.g., teachers give students the same instructions and the same amount of time to complete the assessment, or they use the same scoring guides to interpret results. Common assessments may be “formative” or “summative.” For more detailed discussions, see coherent curriculum and rubric.
  • Performance assessments typically require students to complete a complex task, such as a writing assignment, science experiment, speech, presentation, performance, or long-term project. Educators will often use collaboratively developed common assessments, scoring guides, rubrics, and other methods to evaluate whether the work produced by students shows that they have learned what they were expected to learn. Performance assessments may also be called “authentic assessments,” since they are considered by some educators to be more accurate and meaningful evaluations of learning achievement than traditional tests. For more detailed discussions, see authentic learning, demonstration of learning, and exhibition.
  • Portfolio-based assessments are collections of academic work—for example, assignments, lab results, writing samples, speeches, student-created films, or art projects—that are compiled by students and assessed by teachers in consistent ways. Portfolio-based assessments are often used to evaluate a “body of knowledge”—i.e., the acquisition of diverse knowledge and skills over a period of time. Portfolio materials can be collected in physical or digital formats, and they are often evaluated to determine whether students have met required learning standards. For a more detailed discussion, see portfolio.

The purpose of an assessment generally drives the way it is designed, and there are many ways in which assessments can be used. A standardized assessment can be a high-stakes assessment, for example, but so can other forms of assessment that are not standardized tests. A portfolio of student work can be used as both a “formative” and “summative” form of assessment. Teacher-created assessments, which may also be created by teams of teachers, are commonly used in a single course or grade level in a school, and these assessments are almost never “high-stakes.” Screening assessments may be produced by universities that have conducted research on a specific area of child development, such as the skills and attributes that a student should have when entering kindergarten to increase the likelihood that he or she will be successful, or the pattern of behaviors, strengths, and challenges that suggest a child has a particular learning disability. In short, assessments are usually created for highly specialized purposes.

While educational assessments and tests have been around since the days of the one-room schoolhouse, they have increasingly assumed a central role in efforts to improve the effectiveness of public schools and teaching. Standardized-test scores, for example, are arguably the dominant measure of educational achievement in the United States, and they are also the most commonly reported indicator of school, teacher, and school-system performance.

As schools become increasingly equipped with computers, tablets, and wireless internet access, a growing proportion of the assessments now administered in schools are either computer-based or online assessments—though paper-based tests and assessments are still common and widely used in schools. New technologies and software applications are also changing the nature and use of assessments in innumerable ways, given that digital-assessment systems typically offer an array of features that traditional paper-based tests and assignments cannot. For example, online-assessment systems may allow students to log in and take assessments during out-of-class time, or they may make performance results available to students and teachers immediately after an assessment has been completed (historically, it might have taken hours, days, or weeks for teachers to review, score, and grade all assessments for a class). In addition, digital and online assessments typically include features, or “analytics,” that give educators more detailed information about student performance. For example, teachers may be able to see how long it took students to answer particular questions or how many times a student failed to answer a question correctly before getting the right answer. Many advocates of digital and online assessments tend to argue that such systems, if used properly, could help teachers “personalize” instruction—because many digital and online systems can provide far more detailed information about the academic performance of students, educators can use this information to modify educational programs, learning experiences, instructional approaches, and academic-support strategies in ways that address the distinct learning needs, interests, aspirations, or cultural backgrounds of individual students.
In addition, many large-scale standardized tests are now administered online, though states typically allow students to take paper-based tests if computers are unavailable, if students prefer the paper-based option, or if students don’t have the technological skills and literacy required to perform well on an online assessment.

Given that assessments come in so many forms and serve so many diverse functions, a thorough discussion of the purpose and use of assessments could fill a lengthy book. The following descriptions, however, provide a brief, illustrative overview of a few of the major ways in which assessments—especially assessment results—are used in an attempt to improve schools and teaching:

  • System and school accountability: Assessments, particularly standardized tests, have played an increasingly central role in efforts to hold schools, districts, and state public-school systems “accountable” for improving the academic achievement of students. The most widely discussed and far-reaching example, the 2001 federal law commonly known as the No Child Left Behind Act, strengthened federal expectations from the 1990s and required that each state develop learning standards to govern what teachers should teach and students should learn. Under No Child Left Behind, standards are required in every grade level and content area from kindergarten through high school. The law also requires that students be tested annually in grades 3-8 and at least once in grades 10-12 in reading and mathematics. Since the law’s passage, standardized tests have been developed and implemented to measure how well students were meeting the standards, and scores have been reported publicly by state departments of education. The law also required that test results be tracked and reported separately for different “subgroups” of students, such as minority students, students from low-income households, students with special needs, and students with limited proficiency in English. By publicly reporting the test scores achieved by different schools and student groups, and by tying those scores to penalties and funding, the law has aimed to close achievement gaps and improve schools that were deemed to be underperforming. While the No Child Left Behind Act is one of the most controversial and contentious educational policies in recent history, and the technicalities of the legislation are highly complex, it is one example of how assessment results are being used as an accountability measure.
  • Teacher evaluation and compensation: In recent years, a growing number of elected officials, policy makers, and education reformers have argued that the best way to improve educational results is to ensure that students have effective teachers, and that one way to ensure effective teaching is to evaluate and compensate educators, at least in part, based on the test scores their students achieve. By basing a teacher’s income and job security on assessment results, the reasoning goes, administrators can identify and reward high-performing teachers or take steps to either help low-performing teachers improve or remove them from schools. Growing political pressure, coupled with the promise of federal grants, prompted many states to begin using student test results in teacher evaluations. This controversial and highly contentious reform strategy generally requires fairly complicated statistical techniques—known as value-added measures or growth measures—to determine how much of a positive or negative effect individual teachers have on the academic achievement of their students, based primarily on student assessment results.
  • Instructional improvement : Assessment results are often used as a mechanism for improving instructional quality and student achievement. Because assessments are designed to measure the acquisition of specific knowledge or skills, the design of an assessment can determine or influence what gets taught in the classroom (“teaching to the test” is a common, and often derogatory, phrase used to describe this general phenomenon). Formative assessments, for example, give teachers in-process feedback on student learning, which can help them make instructional adjustments during the teaching process, instead of having to wait until the end of a unit or course to find out how well students are learning the material. Other forms of assessment, such as standards-based assessments or common assessments, encourage educators to teach similar material and evaluate student performance in more consistent, reliable, or comparable ways.
  • Learning-needs identification: Educators use a wide range of assessments and assessment methods to identify specific student learning needs, diagnose learning disabilities (such as autism, dyslexia, or nonverbal learning disabilities), evaluate language ability, or determine eligibility for specialized educational services. In recent years, the early identification of specialized learning needs and disabilities, and the proactive provision of educational support services to students, has been a major focus of numerous educational reform strategies. For a related discussion, see academic support.
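To make the value-added idea mentioned above concrete, here is a minimal sketch. The tiny two-teacher dataset, the single prior-score regression, and the residual averaging are all illustrative assumptions; real value-added models used by states are far more elaborate, with many covariates and shrinkage adjustments:

```python
# Minimal sketch (not any state's actual model) of a "value-added" growth
# measure: predict each student's current score from their prior score with
# ordinary least squares, then average each teacher's residuals. All data
# and names below are made up for illustration.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def value_added(records):
    """records: list of (teacher, prior_score, current_score) tuples."""
    a, b = fit_line([r[1] for r in records], [r[2] for r in records])
    residuals = {}
    for teacher, prior, current in records:
        residuals.setdefault(teacher, []).append(current - (a + b * prior))
    # A positive average residual means a teacher's students scored above
    # what their prior scores alone would predict.
    return {t: sum(rs) / len(rs) for t, rs in residuals.items()}

data = [
    ("Teacher A", 60, 70), ("Teacher A", 70, 78), ("Teacher A", 80, 88),
    ("Teacher B", 60, 64), ("Teacher B", 70, 72), ("Teacher B", 80, 82),
]
print(value_added(data))
```

Even this toy version shows why the approach is contested: the estimate for each teacher depends entirely on how well the prediction model captures everything else that affects a student's score.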

In education, there is widespread agreement that assessment is an integral part of any effective educational system or program. Educators, parents, elected officials, policy makers, employers, and the public all want to know whether students are learning successfully and progressing academically in school. The debates—many of which are complex, wide-ranging, and frequently contentious—typically center on how assessments are used, including how frequently they are being administered and whether assessments are beneficial or harmful to students and the teaching process. While a comprehensive discussion of these debates is beyond the scope of this resource, the following is a representative selection of a few major issues being debated:

  • Is high-stakes testing, as an accountability measure, the best way to improve schools, teaching quality, and student achievement? Or do the potential consequences—such as teachers focusing mainly on test preparation and a narrow range of knowledge at the expense of other important skills, or increased incentives to cheat and manipulate test results—undermine the benefits of using test scores as a way to hold schools and educators more accountable and improve educational results?
  • Are standardized assessments truly objective measures of academic achievement? Or do they reflect intrinsic biases—in their design or content—that favor some students over others, such as wealthier white students from more-educated households over minority and low-income students from less-educated households? For more detailed discussions, see measurement error and test bias.
  • Are “one-size-fits-all” standardized tests a fair way to evaluate the learning achievement of all students, given that some students may be better test-takers than others? Or should students be given a variety of assessment options and multiple opportunities to demonstrate what they have learned?
  • Will more challenging and rigorous assessments lead to higher educational achievement for all students? Or will they end up penalizing certain students who come from disadvantaged backgrounds? And, conversely, will less-advantaged students be at an even greater disadvantage if they are not held to the same high educational standards as other students (because lowering educational standards for certain students, such as students of color, will only further disadvantage them and perpetuate the same cycle of low expectations that historically contributed to racial and socioeconomic achievement gaps)?
  • Do the costs—in money, time, and human resources—outweigh the benefits of widespread, large-scale testing? Would the funding and resources invested in testing and accountability be better spent on higher-quality educational materials, more training and support for teachers, and other resources that might improve schools and teaching more effectively? And is the pervasive use of tests providing valuable information that educators can use to improve instructional quality and student learning? Or are the tests actually taking up time that might be better spent on teaching students more knowledge and skills?
  • Are technological learning applications, including digital and online assessments, improving learning experiences for students, teaching them technological skills and literacy, or generally making learning experiences more interesting and engaging? Or are digital learning applications adding to the cost of education, introducing unwanted distractions in schools, or undermining the value of teachers and the teaching process?



Center for Teaching

Student Assessment in Teaching and Learning


Much scholarship has focused on the importance of student assessment in teaching and learning in higher education. Student assessment is a critical aspect of the teaching and learning process. Whether teaching at the undergraduate or graduate level, it is important for instructors to strategically evaluate the effectiveness of their teaching by measuring the extent to which students in the classroom are learning the course material.

This teaching guide addresses the following: 1) defines student assessment and why it is important, 2) identifies the forms and purposes of student assessment in the teaching and learning process, 3) discusses methods in student assessment, and 4) makes an important distinction between assessment and grading.

What Is Student Assessment and Why Is It Important?

In their handbook for course-based review and assessment, Martha L. A. Stassen et al. define assessment as “the systematic collection and analysis of information to improve student learning.” (Stassen et al., 2001, pg. 5) This definition captures the essential task of student assessment in the teaching and learning process. Student assessment enables instructors to measure the effectiveness of their teaching by linking student performance to specific learning objectives. As a result, teachers are able to institutionalize effective teaching choices and revise ineffective ones in their pedagogy.

The measurement of student learning through assessment is important because it provides useful feedback to both instructors and students about the extent to which students are successfully meeting course learning objectives. In their book Understanding by Design, Grant Wiggins and Jay McTighe offer a framework for classroom instruction—what they call “Backward Design”—that emphasizes the critical role of assessment. For Wiggins and McTighe, assessment enables instructors to determine the metrics of measurement for student understanding of and proficiency in course learning objectives. They argue that assessment provides the evidence needed to document and validate that meaningful learning has occurred in the classroom. Assessment is so vital in their pedagogical design that their approach “encourages teachers and curriculum planners to first ‘think like an assessor’ before designing specific units and lessons, and thus to consider up front how they will determine if students have attained the desired understandings.” (Wiggins and McTighe, 2005, pg. 18)

For more on Wiggins and McTighe’s “Backward Design” model, see our Understanding by Design teaching guide.

Student assessment also buttresses critical reflective teaching. Stephen Brookfield, in Becoming a Critically Reflective Teacher, contends that critical reflection on one’s teaching is an essential part of developing as an educator and enhancing the learning experience of students. Critical reflection on one’s teaching has a multitude of benefits for instructors, including the development of rationale for teaching practices. According to Brookfield, “A critically reflective teacher is much better placed to communicate to colleagues and students (as well as to herself) the rationale behind her practice. She works from a position of informed commitment.” (Brookfield, 1995, pg. 17) Student assessment, then, not only enables teachers to measure the effectiveness of their teaching, but is also useful in developing the rationale for pedagogical choices in the classroom.

Forms and Purposes of Student Assessment

There are generally two forms of student assessment that are most frequently discussed in the scholarship of teaching and learning. The first, summative assessment, is implemented at the end of the course of study. Its primary purpose is to produce a measure that “sums up” student learning. Summative assessment is comprehensive in nature and is fundamentally concerned with learning outcomes. While summative assessment is often useful for providing information about patterns of student achievement, it does so without giving students the opportunity to reflect on and demonstrate growth in identified areas for improvement, and it does not provide an avenue for the instructor to modify teaching strategy during the teaching and learning process. (Maki, 2002) Examples of summative assessment include comprehensive final exams or papers.

The second form, formative assessment, involves the evaluation of student learning over the course of time. Its fundamental purpose is to estimate students’ level of achievement in order to enhance student learning during the learning process. By interpreting students’ performance through formative assessment and sharing the results with them, instructors help students to “understand their strengths and weaknesses and to reflect on how they need to improve over the course of their remaining studies.” (Maki, 2002, pg. 11) Pat Hutchings refers to this form of assessment as assessment behind outcomes. She states, “the promise of assessment—mandated or otherwise—is improved student learning, and improvement requires attention not only to final results but also to how results occur. Assessment behind outcomes means looking more carefully at the process and conditions that lead to the learning we care about…” (Hutchings, 1992, pg. 6, original emphasis). Formative assessment includes course work—where students receive feedback that identifies strengths, weaknesses, and other things to keep in mind for future assignments—discussions between instructors and students, and end-of-unit examinations that provide an opportunity for students to identify important areas for necessary growth and development for themselves. (Brown and Knight, 1994)

It is important to recognize that both summative and formative assessment indicate the purpose of assessment, not the method. Different methods of assessment (discussed in the next section) can be either summative or formative in orientation depending on how the instructor implements them. Sally Brown and Peter Knight, in their book Assessing Learners in Higher Education, caution against conflating the purpose of assessment with its method: “Often the mistake is made of assuming that it is the method which is summative or formative, and not the purpose. This, we suggest, is a serious mistake because it turns the assessor’s attention away from the crucial issue of feedback.” (Brown and Knight, 1994, pg. 17) If an instructor believes that a particular method is formative, he or she may fall into the trap of using the method without taking the requisite time to review the implications of the feedback with students. In such cases, the method in question effectively functions as a form of summative assessment despite the instructor’s intentions. (Brown and Knight, 1994) Indeed, feedback and discussion are the critical factors that distinguish formative from summative assessment.

Methods in Student Assessment

Below are a few common methods of assessment identified by Brown and Knight that can be implemented in the classroom. [1] It should be noted that these methods work best when learning objectives have been identified, shared, and clearly articulated to students.

Self-Assessment

The goal of implementing self-assessment in a course is to enable students to develop their own judgement. In self-assessment students are expected to assess both process and product of their learning. While the assessment of the product is often the task of the instructor, implementing student assessment in the classroom encourages students to evaluate their own work as well as the process that led them to the final outcome. Moreover, self-assessment facilitates a sense of ownership of one’s learning and can lead to greater investment by the student. It enables students to develop transferable skills in other areas of learning that involve group projects and teamwork, critical thinking and problem-solving, as well as leadership roles in the teaching and learning process.

Things to Keep in Mind about Self-Assessment

  • Self-assessment is different from self-grading. According to Brown and Knight, “Self-assessment involves the use of evaluative processes in which judgement is involved, where self-grading is the marking of one’s own work against a set of criteria and potential outcomes provided by a third person, usually the [instructor].” (Pg. 52)
  • Students may initially resist attempts to involve them in the assessment process. This is usually due to insecurities or lack of confidence in their ability to objectively evaluate their own work. Brown and Knight note, however, that when students are asked to evaluate their work, frequently student-determined outcomes are very similar to those of instructors, particularly when the criteria and expectations have been made explicit in advance.
  • Methods of self-assessment vary widely and can be as eclectic as the instructor. Common forms of self-assessment include the portfolio, reflection logs, instructor-student interviews, learner diaries and dialog journals, and the like.

Peer Assessment

Peer assessment is a type of collaborative learning technique in which students evaluate the work of their peers and have their own work evaluated in turn. This dimension of assessment is significantly grounded in theoretical approaches to active learning and adult learning. Like self-assessment, peer assessment gives learners ownership of learning and focuses on the process of learning as students are able to “share with one another the experiences that they have undertaken.” (Brown and Knight, 1994, pg. 52)

Things to Keep in Mind about Peer Assessment

  • Students can use peer assessment as a tactic of antagonism or conflict with other students by giving unmerited low evaluations. Conversely, students can also provide overly favorable evaluations of their friends.
  • Students can occasionally apply unsophisticated judgements to their peers. For example, students who are boisterous and loquacious may receive higher grades than those who are quieter, reserved, and shy.
  • Instructors should implement systems of evaluation in order to ensure that peer assessment is valid, evidence-based, and grounded in identifiable criteria.

Essays

According to Euan S. Henderson, essays make two important contributions to learning and assessment: the development of skills and the cultivation of a learning style. (Henderson, 1980) Essays are a common form of writing assignment in courses and can be either a summative or formative form of assessment depending on how the instructor utilizes them in the classroom.

Things to Keep in Mind about Essays

  • A common challenge of the essay is that students can use them simply to regurgitate rather than analyze and synthesize information to make arguments.
  • Instructors commonly assume that students know how to write essays and can encounter disappointment or frustration when they discover that this is not the case for some students. For this reason, it is important for instructors to make their expectations clear and be prepared to assist or expose students to resources that will enhance their writing skills.

Exams and time-constrained, individual assessment

Examinations have traditionally been viewed as a gold standard of assessment in education, particularly in university settings. Like essays, they can be summative or formative forms of assessment.

Things to Keep in Mind about Exams

  • Exams can make significant demands on students’ factual knowledge and can have the side-effect of encouraging cramming and surface learning. On the other hand, they can also facilitate student demonstration of deep learning if essay questions or topics are appropriately selected. Different formats include in-class tests, open-book, take-home exams and the like.
  • In the process of designing an exam, instructors should consider the following questions. What are the learning objectives that the exam seeks to evaluate? Have students been adequately prepared to meet exam expectations? What are the skills and abilities that students need to do well? How will this exam be utilized to enhance the student learning process?

As Brown and Knight assert, utilizing multiple methods of assessment, including more than one assessor, improves the reliability of data. However, a primary challenge to the multiple-methods approach is how to weigh the scores produced by multiple methods of assessment. When particular methods produce a higher range of marks than others, instructors can potentially misinterpret their assessment of overall student performance. When multiple methods produce different messages about the same student, instructors should be mindful that the methods are likely assessing different forms of achievement. (Brown and Knight, 1994)
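The weighting problem above has a common arithmetic remedy that Brown and Knight do not prescribe, so treat this as a hypothetical sketch: the marks and the idea of standardizing each method to z-scores before averaging are illustrative assumptions, not something from the source. Standardizing keeps a wide-ranging exam from drowning out a tightly clustered set of essay marks.

```python
from statistics import mean, pstdev

def standardize(scores):
    """Convert raw marks to z-scores so methods with wide and
    narrow mark ranges contribute comparably to an average."""
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s for x in scores]

# Hypothetical marks for three students from two methods:
# essays cluster tightly, exams spread widely.
essay_marks = [62, 65, 68]
exam_marks = [40, 70, 100]

# Averaging raw marks lets the exam dominate; averaging z-scores
# weights each method's *relative* standing equally.
combined = [
    (e + x) / 2
    for e, x in zip(standardize(essay_marks), standardize(exam_marks))
]
```

The same caveat from the text applies: if the two methods disagree about a student, they may simply be measuring different forms of achievement, and no weighting scheme resolves that.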

For additional methods of assessment not listed here, see “Assessment on the Page” and “Assessment Off the Page” in Assessing Learners in Higher Education .

In addition to the various methods of assessment listed above, classroom assessment techniques also provide a useful way to evaluate student understanding of course material in the teaching and learning process. For more on these, see our Classroom Assessment Techniques teaching guide.

Assessment is More than Grading

Instructors often conflate assessment with grading. This is a mistake. Student assessment is more than just grading. Remember that assessment links student performance to specific learning objectives in order to provide useful information to instructors and students about student achievement. Traditional grading, on the other hand, according to Stassen et al., does not provide the level of detailed and specific information essential to link student performance with improvement. “Because grades don’t tell you about student performance on individual (or specific) learning goals or outcomes, they provide little information on the overall success of your course in helping students to attain the specific and distinct learning objectives of interest.” (Stassen et al., 2001, pg. 6) Instructors, therefore, must always remember that grading is an aspect of student assessment but does not constitute its totality.

Teaching Guides Related to Student Assessment

Below is a list of other CFT teaching guides that supplement this one. They include:

  • Active Learning
  • An Introduction to Lecturing
  • Beyond the Essay: Making Student Thinking Visible in the Humanities
  • Bloom’s Taxonomy
  • How People Learn
  • Syllabus Construction

References and Additional Resources

This teaching guide draws upon a number of resources listed below. These sources should prove useful for instructors seeking to enhance their pedagogy and effectiveness as teachers.

Angelo, Thomas A., and K. Patricia Cross. Classroom Assessment Techniques: A Handbook for College Teachers. 2nd edition. San Francisco: Jossey-Bass, 1993. Print.

Brookfield, Stephen D. Becoming a Critically Reflective Teacher . San Francisco: Jossey-Bass, 1995. Print.

Brown, Sally, and Peter Knight. Assessing Learners in Higher Education. 1st edition. London; Philadelphia: Routledge, 1998. Print.

Cameron, Jeanne et al. “Assessment as Critical Praxis: A Community College Experience.” Teaching Sociology 30.4 (2002): 414–429. JSTOR . Web.

Gibbs, Graham, and Claire Simpson. “Conditions under which Assessment Supports Student Learning.” Learning and Teaching in Higher Education 1 (2004): 3-31.

Henderson, Euan S. “The Essay in Continuous Assessment.” Studies in Higher Education 5.2 (1980): 197–203. Taylor and Francis+NEJM . Web.

Maki, Peggy L. “Developing an Assessment Plan to Learn about Student Learning.” The Journal of Academic Librarianship 28.1 (2002): 8–13. ScienceDirect. Web.

Sharkey, Stephen, and William S. Johnson. Assessing Undergraduate Learning in Sociology . ASA Teaching Resource Center, 1992. Print.

Wiggins, Grant, and Jay McTighe. Understanding By Design . 2nd Expanded edition. Alexandria, VA: Assn. for Supervision & Curriculum Development, 2005. Print.

[1] Brown and Knight discuss the first two in their chapter entitled “Dimensions of Assessment.” However, because this chapter begins the second part of the book that outlines assessment methods, I have collapsed the two under the category of methods for the purposes of continuity.


6. Assessment

6.1 Assessment and Evaluation

Assessment, as defined by www.edglossary.org, “refers to the wide variety of methods or tools that educators use to evaluate, measure, and document the academic readiness, learning progress, skill acquisition, or educational needs of students.” It is analogous to evaluation, judgment, rating, appraisal, and analysis. (Great Schools Partnership, 2015)

Although the terms assessment and evaluation are often used synonymously, they are in fact distinct. The intent of assessment is to measure effectiveness; evaluation adds a value component to the process. A teacher may assess a student to ascertain how well the individual met the learning target. If, however, the measurement is used to determine program placement, for example with a special education program, honors club, or Individual Educational Program documentation, the assessment constitutes an evaluation.

Assessment is ongoing, is positive, is individualized, and provides feedback. Evaluation provides closure, is judgmental, is applied against standards, and shows shortfalls. Both require criteria, use measures, and are evidence-driven.

Goals of Assessment  

Assessment is two-fold in nature. It enables the teacher to gather information and to then determine what the learner knows or does not know and concurrently drives the planning phase. In order to meet the needs of all learners, the teacher may need to differentiate the instruction.

The teacher is then responsible for providing positive, timely feedback to the student. This feedback should specify whether the student met the learning target, what needs to be improved, and how and by whom these goals will be met.

The intent of assessment has traditionally been to determine what the learner has learned. Today, the emphasis is on authentic assessment. While the former typically employed recall methods, the latter encourages learners to demonstrate greater comprehension.  (Wiggins, 1990)

7 Keys to Effective Feedback

1) Goal-referenced: The learner knows whether they are on track toward a goal or need to change course.
2) Tangible and transparent: Learners can understand exactly how the feedback relates to the task at hand.
3) Actionable: Learners know specifically what actions to take to move toward their goal.
4) User-friendly: The learner finds the feedback appropriate to his or her cognitive level.
5) Timely: The learner receives feedback while the attempt and its effect are still fresh in mind.
6) Ongoing: The learner has multiple opportunities to learn and improve toward the ultimate goal.
7) Consistent: The learner can adjust his or her performance based on stable, accurate, and trustworthy feedback.

Methods to Assess  

Within an academic setting, assessment may include “the process of observing learning; describing, collecting, recording, scoring, and interpreting information about a student’s or one’s own learning” (http://www.k12.hi.us/atr/evaluation/glossary.htm).

It can occur by observations, interviews, tests, projects or any other information gathering method. Within the early childhood and early primary elementary grades, observations are used frequently to assess learners. Teachers may use a checklist to note areas of proficiency or readiness and may opt to use checkmarks or some other consistent means for record-keeping.

The affective domain comprises five levels: receiving, responding, valuing, organization, and characterization by a value set.

It is helpful for a teacher to include the date, day, and time. This record-keeping may reveal emerging patterns. Does the learner exhibit certain behaviors or respond to learning activities because of proximity to lunchtime, or to the morning or afternoon? How an individual learns can be noted within the affective domain. (Kirk, N/D) This may influence how a student learns and behaves within a classroom setting. Seating, natural and artificial lighting, noise, and temperature all influence how a student feels and interacts within the environment and can affect cognitive behaviors.

Interviews can be used on the elementary or secondary levels as an assessment tool. Like any other well-planned assessment tool, they necessitate careful planning and development of questions, positive rapport with the student, and an environment that is free from distractions, outside noise, and time constraints. Interviews may or may not be audiotaped or videotaped, and scoring rubrics may be used to assess them. (Southerland, ND)

Tests offer yet another venue for assessment purposes. They may take the form of essay or short response, fill-in-the-blank, matching, or true or false formats. Like any of the other methods, they should be valid and reliable. Carefully thought out test questions need to be tied to learning standards and a clear and fair scoring measure needs to be in place.

Typically, assessment has been viewed as the result: the letter grade or point value assigned at the end of an assignment. However, assessment can and should come at the beginning, the end, and throughout the teaching and learning process. While assessment should drive instruction, it often falls short of informing instructional decisions.

The five domains of learning and development are approaches to learning, cognitive development, language development and communication, health and physical development, and emotional-social development.

Danielle Stein eagerly anticipated the upcoming parent-teacher conferences of the day. She had studied hard as a Childhood Education major and had worked diligently in her first year as a third-grade teacher at Maplewood Elementary School.  Danielle had planned interdisciplinary lessons, employed inquiry-based learning centers, and met regularly with individual students to ensure that they had mastered the skills as determined by the state standards.

Each student had a portfolio filled with dated representations of their work. Ms. Stein understood the importance of specific and timely feedback and had painstakingly provided detailed written feedback on each work sample. She meticulously arranged the portfolios along with anecdotal notes and looked forward to sharing the accomplishments of the students with their family members.

As last-minute jitters began to set in, Danielle realized that she had no grades for any of the students. Despite doing all the right things, she had no way to assign a grade to any of the work the students had done. How would she respond when guardians asked what grade their child would earn on the first report card? How would she accurately tell them how they compared with their peers in reading? In math? In social studies and science?

Danielle quickly realized she was not as prepared as she had anticipated.

Discussion Questions

How do teachers assess student work? Is there a certain number of assignments that should be graded within a 9-week session? Are there alternatives to letter grades? Reflect on how you were graded as a student.

  • Foundations of Education. Authored by: SUNY Oneonta Education Department. License: CC BY: Attribution


Formative, Summative, and More Types of Assessments in Education

All the best ways to evaluate learning before, during, and after it happens.

When you hear the word assessment, do you automatically think “tests”? While it’s true that tests are one kind of assessment, they’re not the only way teachers evaluate student progress. Learn more about the types of assessments used in education, and find out how and when to use them.

  • Diagnostic Assessments
  • Formative Assessments
  • Summative Assessments
  • Criterion-Referenced, Ipsative, and Normative Assessments

What is assessment?

In simplest terms, assessment means gathering data to help understand progress and effectiveness. In education, we gather data about student learning in a variety of ways, then use it to assess both students’ progress and the effectiveness of our teaching programs. This helps educators know what’s working well and where they need to make changes.

There are three broad types of assessments: diagnostic, formative, and summative. These take place throughout the learning process, helping students and teachers gauge learning. Within those three broad categories, you’ll find other types of assessment, such as ipsative, norm-referenced, and criterion-referenced.

What’s the purpose of assessment in education?

In education, we can group assessments under three main purposes:

  • Of learning
  • For learning
  • As learning

Assessment of learning is student-based and one of the most familiar, encompassing tests, reports, essays, and other ways of determining what students have learned. These are usually summative assessments, and they are used to gauge progress for individuals and groups so educators can determine who has mastered the material and who needs more assistance.

When we talk about assessment for learning, we’re referring to the constant evaluations teachers perform as they teach. These quick assessments—such as in-class discussions or quick pop quizzes—give educators the chance to see if their teaching strategies are working. This allows them to make adjustments in action, tailoring their lessons and activities to student needs. Assessment for learning usually includes the formative and diagnostic types.

Assessment can also be a part of the learning process itself. When students use self-evaluations, flash cards, or rubrics, they’re using assessments to help them learn.

Let’s take a closer look at the various types of assessments used in education.

Diagnostic Assessments

Diagnostic assessments are used before learning to determine what students already do and do not know. This often refers to pre-tests and other activities students attempt at the beginning of a unit.

How To Use Diagnostic Assessments

When giving diagnostic assessments, it’s important to remind students these won’t affect their overall grade. Instead, it’s a way for them to find out what they’ll be learning in an upcoming lesson or unit. It can also help them understand their own strengths and weaknesses, so they can ask for help when they need it.

Teachers can use results to understand what students already know and adapt their lesson plans accordingly. There’s no point in over-teaching a concept students have already mastered. On the other hand, a diagnostic assessment can also help highlight expected pre-knowledge that may be missing.

For instance, a teacher might assume students already know certain vocabulary words that are important for an upcoming lesson. If the diagnostic assessment indicates differently, the teacher knows they’ll need to take a step back and do a little pre-teaching before getting to their actual lesson plans.

Examples of Diagnostic Assessments

  • Pre-test: This includes the same questions (or types of questions) that will appear on a final test, and it’s an excellent way to compare results.
  • Blind Kahoot: Teachers and kids already love using Kahoot for test review, but it’s also the perfect way to introduce a new topic. Learn how Blind Kahoots work here.
  • Survey or questionnaire: Ask students to rate their knowledge on a topic with a series of low-stakes questions.
  • Checklist: Create a list of skills and knowledge students will build throughout a unit, and have them start by checking off any they already feel they’ve mastered. Revisit the list frequently as part of formative assessment.

Formative Assessments

Formative assessments take place during instruction. They’re used throughout the learning process and help teachers make on-the-go adjustments to instruction and activities as needed. These assessments aren’t used in calculating student grades, but they are planned as part of a lesson or activity. Learn much more about formative assessments here.

How To Use Formative Assessments

As you’re building a lesson plan, be sure to include formative assessments at logical points. These types of assessments might be used at the end of a class period, after finishing a hands-on activity, or once you’re through with a unit section or learning objective.

Once you have the results, use that feedback to determine student progress, both overall and as individuals. If the majority of a class is struggling with a specific concept, you might need to find different ways to teach it. Or you might discover that one student is especially falling behind and arrange to offer extra assistance to help them out.

While kids may grumble, standard homework review assignments can actually be a pretty valuable type of formative assessment. They give kids a chance to practice, while teachers can evaluate their progress by checking the answers. Just remember that homework review assignments are only one type of formative assessment, and not all kids have access to a safe and dedicated learning space outside of school.

Examples of Formative Assessments

  • Exit tickets: At the end of a lesson or class, pose a question for students to answer before they leave. They can answer using a sticky note, online form, or digital tool.
  • Kahoot quizzes: Kids enjoy the gamified fun, while teachers appreciate the ability to analyze the data later to see which topics students understand well and which need more time.
  • Flip (formerly Flipgrid): We love Flip for helping teachers connect with students who hate speaking up in class. This innovative (and free!) tech tool lets students post selfie videos in response to teacher prompts. Kids can view each other’s videos, commenting and continuing the conversation in a low-key way.
  • Self-evaluation: Encourage students to use formative assessments to gauge their own progress too. If they struggle with review questions or example problems, they know they’ll need to spend more time studying. This way, they’re not surprised when they don’t do well on a more formal test.

Find a big list of 25 creative and effective formative assessment options here.

Summative Assessments

Summative assessments are used at the end of a unit or lesson to determine what students have learned. By comparing diagnostic and summative assessments, teachers and learners can get a clearer picture of how much progress they’ve made. Summative assessments are often tests or exams but also include options like essays, projects, and presentations.

How To Use Summative Assessments

The goal of a summative assessment is to find out what students have learned and if their learning matches the goals for a unit or activity. Ensure you match your test questions or assessment activities with specific learning objectives to make the best use of summative assessments.

When possible, use an array of summative assessment options to give all types of learners a chance to demonstrate their knowledge. For instance, some students suffer from severe test anxiety but may still have mastered the skills and concepts and just need another way to show their achievement. Consider ditching the test paper and having a conversation with the student about the topic instead, covering the same basic objectives but without the high-pressure test environment.

Summative assessments are often used for grades, but they’re really about so much more. Encourage students to revisit their tests and exams, finding the right answers to any they originally missed. Think about allowing retakes for those who show dedication to improving on their learning. Drive home the idea that learning is about more than just a grade on a report card.

Examples of Summative Assessments

  • Traditional tests: These might include multiple-choice, matching, and short-answer questions.
  • Essays and research papers: This is another traditional form of summative assessment, typically involving drafts (which are really formative assessments in disguise) and edits before a final copy.
  • Presentations: From oral book reports to persuasive speeches and beyond, presentations are another time-honored form of summative assessment.

Find 25 of our favorite alternative assessments here.

More Types of Assessments

Now that you know the three basic types of assessments, let’s take a look at some of the more specific and advanced terms you’re likely to hear in professional development books and sessions. These assessments may fit into some or all of the broader categories, depending on how they’re used. Here’s what teachers need to know.

Criterion-Referenced Assessments

In this common type of assessment, a student’s knowledge is compared to a standard learning objective. Most summative assessments are designed to measure student mastery of specific learning objectives. The important thing to remember about this type of assessment is that it only compares a student to the expected learning objectives themselves, not to other students.

Chart comparing norm-referenced and criterion-referenced types of assessment

Many standardized tests are criterion-referenced assessments. A governing board determines the learning objectives for a specific group of students. Then, all students take a standardized test to see if they’ve achieved those objectives.
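To make the comparison concrete, here is a minimal sketch of criterion-referenced scoring. The objective names and the 80% mastery cutoff are invented for illustration; the point is that each student is judged only against the learning objectives, never against other students.

```python
# Hypothetical criterion-referenced scoring sketch. The objectives and
# the mastery cutoff are invented; real cutoffs come from a governing board.

MASTERY_CUTOFF = 0.8  # e.g., 80% of items per objective answered correctly

def criterion_report(scores_by_objective):
    """Map each objective to 'mastered' or 'not yet', independent of peers."""
    return {
        objective: "mastered" if fraction_correct >= MASTERY_CUTOFF else "not yet"
        for objective, fraction_correct in scores_by_objective.items()
    }

# One student's report says nothing about any other student's report.
report = criterion_report({"add fractions": 0.9, "compare decimals": 0.5})
```

Note that the cutoff is fixed in advance: every student in the class could, in principle, earn "mastered" on every objective.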

Find out more about criterion-referenced assessments here.

Norm-Referenced Assessments

These types of assessments do compare student achievement with that of their peers. Students receive a ranking based on their score and potentially on other factors as well. Norm-referenced assessments usually rank on a bell curve, establishing an “average” as well as high performers and low performers.

These assessments can be used as screening for those at risk for poor performance (such as those with learning disabilities) or to identify high-level learners who would thrive on additional challenges. They may also help rank students for college entrance or scholarships, or determine whether a student is ready for a new experience like preschool.
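A tiny sketch makes the contrast with criterion-referenced scoring visible: here a student's standing is a percentile rank relative to peers, so the same raw score means something different in a different group. The scores below are invented.

```python
# Hypothetical norm-referenced ranking sketch: standing is defined by
# comparison with peers, not by a fixed criterion. Scores are invented.

def percentile_rank(score, peer_scores):
    """Percent of peers scoring strictly below this score."""
    below = sum(1 for s in peer_scores if s < score)
    return 100.0 * below / len(peer_scores)

peers = [52, 60, 61, 68, 70, 75, 75, 80, 88, 95]
rank = percentile_rank(75, peers)  # the meaning of 75 depends on the group
```

Unlike a mastery cutoff, a percentile rank guarantees that some students land below average whenever scores differ, which is exactly what makes this design useful for screening and ranking.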

Learn more about norm-referenced assessments here.

Ipsative Assessments

In education, ipsative assessments compare a learner’s present performance to their own past performance, to chart achievement over time. Many educators consider ipsative assessment to be the most important of all, since it helps students and parents truly understand what they’ve accomplished—and sometimes, what they haven’t. It’s all about measuring personal growth.

Comparing the results of pre-tests with final exams is one type of ipsative assessment. Some schools use curriculum-based measurement to track ipsative performance. Kids take regular quick assessments (often weekly) to show their current skill/knowledge level in reading, writing, math, and other basics. Their results are charted, showing their progress over time.
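The curriculum-based measurement idea can be sketched in a few lines: each new probe is compared only with the same student's earlier results. The weekly scores below are invented.

```python
# Hypothetical ipsative tracking sketch: one student's weekly probe
# scores compared against that student's own history. Numbers are invented.

def growth(history):
    """Change between consecutive measurements for one student."""
    return [later - earlier for earlier, later in zip(history, history[1:])]

weekly_reading_scores = [12, 15, 15, 19, 24]  # e.g., words read correctly per minute
gains = growth(weekly_reading_scores)
total_growth = weekly_reading_scores[-1] - weekly_reading_scores[0]
```

Charted over a semester, a series like this shows progress (or a plateau) without ever mentioning another student.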

Learn more about ipsative assessment in education here.

Have more questions about the best types of assessments to use with your students? Come ask for advice in the We Are Teachers HELPLINE group on Facebook.

Plus, check out creative ways to check for understanding.


National Academies Press: OpenBook

Knowing What Students Know: The Science and Design of Educational Assessment (2001)

Chapter 6: Assessment in Practice

Although assessments are currently used for many purposes in the educational system, a premise of this report is that their effectiveness and utility must ultimately be judged by the extent to which they promote student learning. The aim of assessment should be “to educate and improve student performance, not merely to audit it” (Wiggins, 1998, p. 7). To this end, people should gain important and useful information from every assessment situation. In education, as in other professions, good decision making depends on access to relevant, accurate, and timely information. Furthermore, the information gained should be put to good use by informing decisions about curriculum and instruction and ultimately improving student learning (Falk, 2000; National Council of Teachers of Mathematics, 1995).

Assessments do not function in isolation; an assessment’s effectiveness in improving learning depends on its relationships to curriculum and instruction. Ideally, instruction is faithful and effective in relation to curriculum, and assessment reflects curriculum in such a way that it reinforces the best practices in instruction. In actuality, however, the relationships among assessment, curriculum, and instruction are not always ideal. Often assessment taps only a subset of the curriculum, without regard to instruction, and can narrow and distort instruction in unintended ways (Klein, Hamilton, McCaffrey, and Stecher, 2000; Koretz and Barron, 1998; Linn, 2000; National Research Council [NRC], 1999b). In this chapter we expand on the idea, introduced in Chapter 2, that synergy can best be achieved if the three parts of the system are bound by or grow out of a shared knowledge base about cognition and learning in the domain.

PURPOSES AND CONTEXTS OF USE

Educational assessment occurs in two major contexts. The first is the classroom. Here assessment is used by teachers and students mainly to assist learning, but also to gauge students’ summative achievement over the longer term. Second is large-scale assessment, used by policy makers and educational leaders to evaluate programs and/or obtain information about whether individual students have met learning goals.

The sharp contrast that typically exists between classroom and large-scale assessment practices arises because assessment designers have not been able to fulfill the purposes of different assessment users with the same data and analyses. To guide instruction and monitor its effects, teachers need information that is intimately connected with the work their students are doing, and they interpret this evidence in light of everything else they know about their students and the conditions of instruction. Part of the power of classroom assessment resides in these connections. Yet precisely because they are individualized and highly contextualized, neither the rationale nor the results of typical classroom assessments are easily communicated beyond the classroom. Large-scale, standardized tests do communicate efficiently across time and place, but by so constraining the content and timeliness of the message that they often have little utility in the classroom. This contrast illustrates the more general point that one size of assessment does not fit all. The purpose of an assessment determines priorities, and the context of use imposes constraints on the design, thereby affecting the kinds of information a particular assessment can provide about student achievement.

Inevitability of Trade-Offs in Design

To say that an assessment is a good assessment or that a task is a good task is like saying that a medical test is a good test; each can provide useful information only under certain circumstances. An MRI of a knee, for example, has unquestioned value for diagnosing cartilage damage, but is not helpful for diagnosing the overall quality of a person’s health. It is natural for people to understand medical tests in this way, but not educational tests. The same argument applies nonetheless, but in ways that are less familiar and perhaps more subtle.

In their classic text Psychological Tests and Personnel Decisions, Cronbach and Gleser (1965) devote an entire chapter to the trade-off between fidelity and bandwidth when testing for employment selection. A high-fidelity, narrow-bandwidth test provides accurate information about a small number of focused questions, whereas a low-fidelity, broad-bandwidth test provides noisier information for a larger number of less-focused questions. For a fixed level of resources—the same amount of money, testing time, or tasks—the designer can choose where an assessment will fall along this spectrum. Following are two examples related to the fidelity-bandwidth (or depth versus breadth) trade-offs that inevitably arise in the design of educational assessments. They illustrate the point that the more purposes one attempts to serve with a single assessment, the less well that assessment can serve any given purpose.

Trade-Offs in Assessment Design: Examples

Accountability versus instructional guidance for individual students.

The first example expands on the contrast between classroom and large-scale assessments described above. A starting point is the desire for statewide accountability tests to be more helpful to teachers or the question of why assessment designers cannot incorporate in the tests items that are closely tied to the instructional activities in which students are engaged (i.e., assessment tasks such as those effective teachers use in their classrooms). To understand why this has not been done, one must look at the distinct purposes served by standardized achievement tests and classroom quizzes: who the users are, what they already know, and what they want to learn.

In this example, the chief state school officer wants to know whether students have been studying the topics identified in the state standards. (Actually, by assessing these topics, the officer wants to increase the likelihood that students will be studying them.) But there are many curriculum standards, and she or he certainly cannot ascertain whether each has been studied by every student. A broad sample from each student is better for his or her purposes—not enough information to determine the depth or the nature of any student’s knowledge across the statewide curriculum, but enough to see trends across schools and districts about broad patterns of performance. This information can be used to plan funding and policy decisions for the coming year.

The classroom teacher wants to know how well an individual student, or class of students, is learning the things they have been studying and what they ought to be working on next. What is important is the match among what the teacher already knows about the things students have been working on, what the teacher needs to learn about their current understanding, and how that knowledge will help shape what the students should do now to learn further.

For the chief state school officer, the ultimate question is whether larger aggregates of students (such as schools, districts, or states) have had “the opportunity to learn.” The state assessment is constructed to gather information to support essentially the same inference about all students, so the information can most easily be combined to meet the chief officer’s purpose. For the teacher, the starting point is knowing what each student as an individual has had the opportunity to learn. The classroom quiz is designed to reveal patterns of individual knowledge (compared with the state grade-level standards) within the small content domain in which students have been working so the teacher can make tailored decisions about next steps for individual students or the class. For the teacher, combining information across classes that are studying and testing different content is not important or possible. Ironically, the questions that are of most use to the state officer are of the least use to the teacher.

National Assessment of Educational Progress (NAEP): Estimates for Groups Versus Individual Students

The current public debate over whether to provide student-level reports from NAEP highlights a trade-off that goes to the very heart of the assessment and has shaped its sometimes frustratingly complex design from its inception (see Forsyth, Hambleton, Linn, Mislevy, and Yen, 1996 for a history of NAEP design trade-offs). NAEP was designed to survey the knowledge of students across the nation with respect to a broad range of content and skills, and to report the relationships between that knowledge and a large number of educational and demographic background variables. The design selected by the founders of NAEP (including Ralph Tyler and John Tukey) to achieve this purpose was multiple-matrix sampling. Not all students in the country are sampled. A strategically selected sample can support the targeted inferences about groups of students with virtually the same precision as the very familiar approach of testing every student, but for a fraction of the cost. Moreover, not all students are administered all items. NAEP can use hundreds of tasks of many kinds to gather information about competencies in student populations without requiring any student to spend more than a class period performing those tasks; it does so by assembling the items into many overlapping short forms and giving each sampled student a single form.
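The multiple-matrix idea can be sketched in a few lines. The pool size, form length, and the round-robin assignment below are invented for illustration; actual NAEP designs use carefully balanced overlapping blocks, but the essential move is the same: split a large item pool into short overlapping forms and give each sampled student only one of them.

```python
# Hypothetical multiple-matrix sampling sketch. Pool size, form length,
# overlap, and the round-robin assignment are invented for illustration.

def build_forms(items, form_length, step):
    """Overlapping blocks of items; adjacent forms share form_length - step items."""
    return [items[i:i + form_length]
            for i in range(0, len(items) - form_length + 1, step)]

item_pool = [f"item{i}" for i in range(12)]
forms = build_forms(item_pool, form_length=4, step=2)  # short overlapping forms

def assign_form(student_id, forms):
    """Each sampled student takes a single short form, not the whole pool."""
    return forms[student_id % len(forms)]
```

No student sees all twelve items, yet across the sample every item is administered, which is what lets the design estimate population-level performance cheaply while making individual-level scores weak.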

Schools can obtain useful feedback on the quality of their curriculum, but NAEP’s benefits are traded off against several limitations. Measurement at the level of individual students is poor, and individuals cannot be ranked, compared, or diagnosed. Further analyses of the data are problematic. But a design that served any of these purposes well (for instance, by testing every student, by testing each student intensively, or by administering every student parallel sets of items to achieve better comparability) would degrade the estimates and increase the costs of the inferences NAEP was created to address.

Reflections on the Multiple Purposes for Assessment

As noted, the more purposes a single assessment aims to serve, the more each purpose will be compromised. Serving multiple purposes is not necessarily wrong, of course, and in truth few assessments can be said to serve a single purpose only. But it is incumbent on assessment designers and users to recognize the compromises and trade-offs such use entails. We return to notions of constraints and trade-offs later in this chapter.

Multiple assessments are thus needed to provide the various types of information required at different levels of the educational system. This does not mean, however, that the assessments need to be disconnected or working at cross-purposes. If multiple assessments grow out of a shared knowledge base about cognition and learning in the domain, they can provide valuable multiple perspectives on student achievement while supporting a core set of learning goals. Stakeholders should not be unduly concerned if differing assessments yield different information about student achievement; in fact, in many circumstances this is exactly what should be expected. However, if multiple assessments are to support learning effectively and provide clear and meaningful results for various audiences, it is important that the purposes served by each assessment and the aspects of achievement sampled by any given assessment be made explicit to users.

Later in the chapter we address how multiple assessments, including those used across both classroom and large-scale contexts, could work together to form more complete assessment systems. First, however, we discuss classroom and large-scale assessments in turn and how each can best be used to serve the goals of learning.

CLASSROOM ASSESSMENT

The first thing that comes to mind for many people when they think of “classroom assessment” is a midterm or end-of-course exam, used by the teacher for summative grading purposes. But such practices represent only a fraction of the kinds of assessment that occur on an ongoing basis in an effective classroom. The focus in this section is on assessments used by teachers to support instruction and learning, also referred to as formative assessment. Such assessment offers considerable potential for improving student learning when informed by research and theory on how students develop subject matter competence.

As instruction is occurring, teachers need information to evaluate whether their teaching strategies are working. They also need information about the current understanding of individual students and groups of students so they can identify the most appropriate next steps for instruction. Moreover, students need feedback to monitor their own success in learning and to know how to improve. Teachers make observations of student understanding and performance in a variety of ways: from classroom dialogue, questioning, seatwork and homework assignments, formal tests, less formal quizzes, projects, portfolios, and so on.

Black and Wiliam (1998) provide an extensive review of more than 250 books and articles presenting research evidence on the effects of classroom assessment. They conclude that ongoing assessment by teachers, combined with appropriate feedback to students, can have powerful and positive effects on achievement. They also report, however, that the characteristics of high-quality formative assessment are not well understood by teachers and that formative assessment is weak in practice. High-quality classroom assessment is a complex process, as illustrated by the research described in Box 6–1, which encapsulates many of the points made in the following discussion. In brief, the development of good formative assessment requires radical changes in the ways students are encouraged to express their ideas and in the ways teachers give feedback to students so they can develop the ability to manage and guide their own learning. Where such innovations have been instituted, teachers have become acutely aware of the need to think more clearly about their own assumptions regarding how students learn.

A project at King’s College London (Black and Wiliam, 2000) illustrates some of the issues encountered when an effort is made to incorporate principles of cognition and reasoning from evidence into classroom practice. The project involved working closely with 24 science and mathematics teachers to develop their formative assessment practices in everyday classroom work. During the course of the project, several aspects of the teaching and learning process were radically changed.

One such aspect was the teachers’ practices in asking questions in the classroom. In particular, the focus was on the notion of wait time (the length of the silence a teacher would allow after asking a question before speaking again if nobody responded), with emphasis on how short this time usually is. The teachers altered their practice to give students extended time to think about any question posed, often asking them to discuss their ideas in pairs before calling for responses. The practice of students putting up their hands to volunteer answers was forbidden; anyone could be asked to respond. The teachers did not label answers as right or wrong, but instead asked a student to explain his or her reasons for the answer offered. Others were then asked to say whether they agreed and why. Thus questions opened up discussion that helped expose and explore students’ assumptions and reasoning. At the same time, wrong answers became useful input, and the students realized that the teacher was interested in knowing what they thought, not in evaluating whether they were right or wrong. As a consequence, teachers asked fewer questions, spending more time on each.

In addition, teachers realized that their lesson planning had to include careful thought about the selection of informative questions. They discovered that they had to consider very carefully the aspects of student thinking that any given question might serve to explore. This discovery led them to work further on developing criteria for the quality of their questions. Thus the teachers confronted the importance of the cognitive foundations for designing assessment situations that can evoke important aspects of student thinking and learning. (See Bonniol [1991] and Perrenoud [1998] for further discussion of the importance of high-quality teacher questions for illuminating student thinking.)

In response to research evidence that simply giving grades on written work can be counterproductive for learning (Butler, 1988), teachers began instead to concentrate on providing comments without grades—feedback designed to guide students’ further learning. Students also took part in self-assessment and peer-assessment activities, which required that they understand the goals for learning and the criteria for quality that applied to their work. These kinds of activities called for patient training and support from teachers, but fostered students’ abilities to focus on targets for learning and to identify learning goals for which they lacked confidence and needed help (metacognitive skills described earlier in this report). In these ways, assessment situations became opportunities for learning, rather than activities divorced from learning.

There is a rich literature on how classroom assessment can be designed and used to improve instruction and learning (e.g., Falk, 2000; Niyogi, 1995; Shepard, 2000; Stiggins, 1997; Wiggins, 1998). This literature presents powerful ideas and practical advice to assist teachers across the K-16 spectrum in improving their classroom assessment practices. We do not attempt to summarize all of the insights and implications for practice presented in this literature. Rather, our emphasis is on what could be gained by thinking about classroom assessment in light of the principles of cognition and reasoning from evidence emphasized throughout this report.

Formative Assessment, Curriculum, and Instruction

At the 2000 annual meeting of the American Educational Research Association, Shepard (2000) began her presidential address by quoting Graue’s (1993, p. 291) observation that “assessment and instruction are often conceived as curiously separate in both time and purpose.” Shepard asked:

How might the culture of classrooms be shifted so that students no longer feign competence or work to perform well on the test as an end separate from real learning? Could we create a learning culture where students and teachers would have a shared expectation that finding out what makes sense and what doesn’t is a joint and worthwhile project, essential to taking the next steps in learning?… How should what we do in classrooms be changed so that students and teachers look to assessment as a source of insight and help instead of its being the occasion for meting out rewards and punishments? To accomplish this kind of transformation, we have to make assessment more useful, more helpful in learning, and at the same time change the social meaning of evaluation. (pp. 12–15)

Shepard proceeded to discuss ways in which classroom assessment practices need to change: the content and character of assessments need to be significantly improved to reflect contemporary understanding of learning; the gathering and use of assessment information and insights must become a part of the ongoing learning process; and assessment must become a central concern in methods courses in teacher preparation programs. Shepard’s messages were reflective of a growing belief among many educational assessment experts that if assessment, curriculum, and instruction were more integrally connected, student learning would improve (e.g., Gipps, 1999; Pellegrino, Baxter, and Glaser, 1999; Snow and Mandinach, 1991; Stiggins, 1997).

Sadler (1989) provides a conceptual framework that places classroom assessment in the context of curriculum and instruction. According to this framework, three elements are required for formative assessment to promote learning:

A clear view of the learning goals.

Information about the present state of the learner.

Action to close the gap.

These three elements relate directly to assessment, curriculum, and instruction. The learning goals are derived from the curriculum. The present state of the learner is derived from assessment, so that the gap between it and the learning goals can be appraised. Action is then taken through instruction to close the gap. An important point is that assessment information by itself simply reveals student competence at a point in time; the process is considered formative assessment only when teachers use the information to make decisions about how to adapt instruction to meet students’ needs.
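Sadler's three elements can be caricatured as a single feedback step. The numeric "levels" and the action labels below are invented; the point is only that assessment supplies the gap, and instruction is whatever action is taken to close it.

```python
# Hypothetical sketch of Sadler's framework: a goal (from curriculum),
# evidence of the learner's present state (from assessment), and an
# instructional action chosen to close the gap. Levels/labels are invented.

def formative_step(goal_level, current_level):
    """Assessment reveals the gap; instruction acts on it."""
    gap = goal_level - current_level
    if gap <= 0:
        return "extend: pose a richer problem beyond the current goal"
    return f"reteach/practice: target the remaining gap of {gap} level(s)"
```

Note that measuring `current_level` alone is not yet formative assessment; in Sadler's terms the process becomes formative only when the returned action actually shapes instruction.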

Furthermore, there are ongoing, dynamic relationships among formative assessment, curriculum, and instruction. That is, there are important bidirectional interactions among the three elements, such that each informs the other. For instance, formulating assessment procedures for classroom use can spur a teacher to think more specifically about learning goals, thus leading to modification of curriculum and instruction. These modifications can, in turn, lead to refined assessment procedures, and so on.

The mere existence of classroom assessment along the lines discussed here will not ensure effective learning. The clarity and appropriateness of the curriculum goals, the validity of the assessments in relationship to these goals, the interpretation of the assessment evidence, and the relevance and quality of the instruction that ensues are all critical determinants of the outcome. Starting with a model of cognition and learning in the domain can enhance each of these determinants.

Importance of a Model of Cognition and Learning

For most teachers, the ultimate goals for learning are established by the curriculum, which is usually mandated externally (e.g., by state curriculum standards). However, teachers and others responsible for designing curriculum, instruction, and assessment must fashion intermediate goals that can serve as an effective route to achieving the ultimate goals, and to do so they must have an understanding of how people represent knowledge and develop competence in the domain.

National and state standards documents set forth learning goals, but often not at a level of detail that is useful for operationalizing those goals in instruction and assessment (American Federation of Teachers, 1999; Finn, Petrilli, and Vanourek, 1998). By dividing goal descriptions into sets appropriate for different age and grade ranges, current curriculum standards provide broad guidance about the nature of the progression to be expected in various subject domains. Whereas this kind of epistemological and conceptual analysis of the subject domain is an essential basis for guiding assessment, deeper cognitive analysis of how people learn the subject matter is also needed. Formative assessment should be based on cognitive theories about how people learn particular subject matter to ensure that instruction centers on what is most important for the next stage of learning, given a learner’s current state of understanding. As described in Chapter 3, cognitive research has produced a rich set of descriptions of how people develop problem-solving and reasoning competencies in various content areas, particularly for the domains of mathematics and science. These models of learning provide a fertile ground for designing formative assessments.

It follows that teachers need training to develop their understanding of cognition and learning in the domains they teach. Preservice and professional development are needed to uncover teachers’ existing understandings of how students learn (Strauss, 1998), and to help them formulate models of learning so they can identify students’ naive or initial sense-making strategies and build on those strategies to move students toward more sophisticated understandings. The aim is to increase teachers’ diagnostic expertise so they can make informed decisions about next steps for student learning. This has been a primary goal of cognitively based approaches to instruction and assessment that have been shown to have a positive impact on student learning, including the Cognitively Guided Instruction program (Carpenter, Fennema, and Franke, 1996) and others (Cobb et al., 1991; Griffin and Case, 1997), some of which are described below. As these examples point out, however, such approaches rest on a bedrock of informed professional practice.

Cognitively Based Approaches to Classroom Assessment: Examples

Cognitively guided instruction and assessment.

Carpenter, Fennema, and colleagues have demonstrated that teachers who are informed regarding children’s thinking about arithmetic will be in a better position to craft more effective mathematics instruction (Carpenter et al., 1996; Carpenter, Fennema, Peterson, and Carey, 1988). Their approach, called Cognitively Guided Instruction (CGI), borrows much from cognitive science, yet recasts that work at a higher level of abstraction, a midlevel model designed explicitly to be easily understood and used by teachers. As noted earlier, such a model permits teachers to “read and react” to ongoing events in real time as they unfold during the course of instruction. In a sense, the researchers suggest that teachers use this midlevel model to support a process of continuous formative assessment so that instruction can be modified frequently as needed.

The cornerstone of CGI is a coarse-grained model of student thinking that borrows from work done in cognitive science to characterize the semantic structure of word problems, along with typical strategies children use for their solution. For instance, teachers are informed that problems apparently involving different operations, such as 3 + 7 = 10 and 10 – 7 = 3, are regarded by children as similar because both involve the action of combining sets. The model that summarizes children’s thinking about arithmetic word problems involving addition or subtraction is summarized by a three-dimensional matrix, in which the rows define major classes of semantic relations, such as combining, separating, or comparing sets; the columns refer to the unknown set (e.g., 7 + 3 = ? vs. 7 + ? = 10); and the depth is a compilation of typical strategies children employ to solve problems such as these. Cognitive-developmental studies (Baroody, 1984; Carpenter and Moser, 1984; Siegler and Jenkins, 1989) suggest that children’s trajectories in this space are highly consistent. For example, direct modeling strategies are acquired before counting strategies; similarly, counting on from the first addend (e.g., 2 + 4 = ?, 2, 3(1), 4(2), 5(3), 6(4)) is acquired before counting on from the larger addend (e.g., 4, 5(1), 6(2)).

Because development of these strategies tends to be robust, teachers can quickly locate student thinking within the problem space defined by CGI. Moreover, the model helps teachers locate likely antecedent understandings and helps them anticipate appropriate next steps. Given a student’s solution to a problem, a classroom teacher can modify instruction in a number of ways: (1) by posing a developmentally more difficult or easier problem; (2) by altering the size of the numbers in the set; or (3) by comparing and contrasting students’ solution strategies, so that students can come to appreciate the utility and elegance of a strategy they might not yet be able to generate on their own. For example, a student directly modeling a joining of sets with counters (e.g., 2 + 3 solved by combining 2 chips with 3 chips and then counting all the chips) might profit by observing how a classmate uses a counting strategy (such as 2, 3(1), etc.) to solve the same problem. In a program such as CGI, formative assessment is woven seamlessly into the fabric of instruction (Carpenter et al., 1996).
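The two counting-on strategies contrasted above can be sketched directly, using the text's own "2, 3(1), 4(2), ..." trace notation. The function below is a made-up illustration, not part of CGI itself.

```python
# Hypothetical sketch of the counting-on strategies described in the text:
# counting on from the first addend versus from the larger addend. The
# trace mirrors the notation "2, 3(1), 4(2), 5(3), 6(4)".

def count_on(start, other):
    """Count 'other' more, starting from 'start'; return (spoken trace, sum)."""
    trace = [str(start)]
    total = start
    for step in range(1, other + 1):
        total += 1
        trace.append(f"{total}({step})")
    return trace, total

first_trace, _ = count_on(2, 4)   # counting on from the first addend of 2 + 4
larger_trace, _ = count_on(4, 2)  # counting on from the larger addend: fewer counts
```

Both traces reach the same sum, but starting from the larger addend takes half as many counts here, which is why it is the developmentally later, more efficient strategy.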

Intelligent Tutors

As described in previous chapters, intelligent tutoring systems are powerful examples of the use of cognitively based classroom assessment tools blended with instruction. Studies indicate that when students work alone with these computer-based tutors, the relationship between formative assessment and the model of student thinking derived from research is comparatively direct. Students make mistakes, and the system offers effective remediation. As a result, students on average learn more with the system than with other, traditional instruction (see Box 6–2).

A large-scale experiment evaluated the benefits of intelligent tutoring in an urban high school (Koedinger, Anderson, Hadley, and Mark, 1997). Researchers compared achievement levels of ninth-grade students who received the PUMP curriculum, which is supported by an intelligent tutor, the PUMP Algebra Tutor (PAT) (experimental group), with those of students who received more traditional algebra instruction (control group). The results demonstrate strong learning benefits from using the curriculum that included the intelligent tutoring program.

The researchers did not collect baseline data to ensure similar starting achievement levels across experimental and control groups. However, they report that the groups were similar in terms of demographics. In addition, they looked at students’ mathematics grades in the previous school year to check for differences in students’ prior knowledge that would put the experimental group at an advantage. In fact, the average prior grades for the experimental group were lower than those for the control group.*

* The researchers note that their research strategy is first to establish the success of the whole package and then to examine the effects of the curriculum and intelligent tutoring components independently; this work is still to be finished.

On the other hand, some research suggests that the relationship between formative assessment and cognitive theory can be more complex. In a study of Anderson’s geometry tutor with high school students and their teachers, Schofield and colleagues found that teachers provided more articulate and better-tuned feedback than did the intelligent tutor (Schofield, Eurich-Fulcer, and Britt, 1994). Nevertheless, students preferred tutor-based to traditional instruction, not for the reasons one might expect, but because the tutor helped teachers tune their assistance to problems signaled by a student’s interaction with the tutor. Thus, student interactions with the tutor (and sometimes their problems with it) served to elicit and inform more knowledgeable teacher assistance, an outcome that students apparently appreciated. Moreover, the assistance provided by teachers to students was less public. Hence, formative assessment and subsequent modification of instruction—both highly valued by these high school students—were mediated by a triadic relationship among teacher, student, and intelligent tutor. Interestingly, these interactions were not the ones originally intended by the designers of the tutor. Not surprisingly, rather than involving direct correspondence between model-based assessments and student learning, these relationships are more complex in actual practice. And the Schofield et al. study suggests that some portion of the effect may be due to stimulating positive teacher practices.

Reflections on the Teacher’s Role

Intelligent tutors and instructional programs such as Facets (described in Chapter 5 ) and CGI share an emphasis on providing clearer benchmarks of student thinking so that teachers can understand precursors and successors to the performances they are observing in real time. Thus these programs provide a “space” of student development in which teachers can work, a space that emphasizes ongoing formative assessment as an integral part of teaching practice. Yet these approaches remain underspecified in important senses. Having good formative benchmarks in mind directs attention to important components and landmarks of thinking, yet teachers’ flexible and sensitive repertoires of assistance are still essential to achieving these goals. In general, these programs leave to teachers the task of generating and testing these repertoires. Thus, as noted earlier, the effectiveness of formative assessment rests on a bedrock of informed professional practice. Models of learning flesh out components and systems of reasoning, but they derive their purpose and character from the practices within which they are embedded. Similarly, descriptions of typical practices make little sense in the absence of careful consideration of the forms of knowledge representation and reasoning they entail (Cobb, 1998).

Complex cognitively based measurement models can be embedded in intelligent tutoring systems and diagnostic assessment programs and put to good use without the teacher’s having to participate in their construction. Many of the examples of assessments described in this report, such as Facets, intelligent tutoring systems, and BEAR (see Chapter 4 ), use statistical models and analysis techniques to handle some of the operational challenges. Providing teachers with carefully designed tools for classroom assessment can increase the utility of the information obtained. A goal for the future is to develop tools that make high-quality assessment more feasible for teachers. The topic of technology’s impact on the implementation of classroom assessment is one to which we return in Chapter 7 .

The Quality of Feedback

As described in Chapter 3 , learning is a process of continuously modifying knowledge and skills. Sometimes new inputs call for additions and extensions to existing knowledge structures; at other times they call for radical reconstruction. In all cases, feedback is essential to guide, test, challenge, or redirect the learner’s thinking.

Simply giving students frequent feedback in the classroom may or may not be helpful. For example, highly atomized drill-and-practice software can provide frequent feedback, but in so doing can foster rote learning and context dependency in students. A further concern is whether such software is being used appropriately given a student’s level of skill development. For instance, a drill-and-practice program may be appropriate for developing fluency and automatizing a skill, but is usually not as appropriate during the early phase of skill acquisition (Goldman, Mertz, and Pellegrino, 1989). It is also noteworthy that in an environment where the teacher dominates all transactions, the frequent evocation and use of feedback can make that dominance all the more oppressive (Broadfoot, 1986).

There is ample evidence, however, that formative assessment can enhance learning when designed to provide students with feedback about particular qualities of their work and guidance on what they can do to improve. This conclusion is supported by several reviews of the research literature, including those by Natriello (1987), Crooks (1988), Fuchs and Fuchs (1986), Hattie (1987, 1990), and Black and Wiliam (1998). Many studies that have examined gains between pre- and post-tests, comparing programs in which formative assessment was the focus of the innovation and matched control groups were used, have shown effect sizes in the range of 0.4 to 0.7 1 (Black and Wiliam, 1998).

When different types of feedback have been compared in experimental studies, certain types have proven to be more beneficial to learning than others. Many studies in this area have shown that learning is enhanced by feedback that focuses on the mastery of learning goals (e.g., Butler, 1988; Hattie, 1987, 1990; Kluger and DeNisi, 1996). This research suggests that other types of feedback, such as when a teacher focuses on giving grades, on granting or withholding special rewards, or on fostering self-esteem (trying to make the student feel better, irrespective of the quality of his or her work), may be ineffective or even harmful.

The culture of focusing on grades and rewards and of seeing classroom learning as a competition appears to be deeply entrenched and difficult to change. This situation is more apparent in the United States than in some other countries (Hattie, Biggs, and Purdie, 1996). The competitive culture of many classrooms and schools can be an obstacle to learning, especially when linked to beliefs in the fixed nature of ability (Vispoel and Austin, 1995; Wolf, Bixby, Glenn, and Gardner, 1991). Such beliefs on the part of educators can lead both to the labeling—overtly or covertly—of students as “bright” or “dull” and to the confirmation and enhancement of such labels through tracking practices.

International comparative studies—notably case studies and video studies conducted for the Third International Mathematics and Science Study that compare mathematics classrooms in Germany, Japan, and the United States—highlight the effects of these cultural beliefs. The studies underscore the difference between the culture of belief in Japan that the whole class can and should succeed through collaborative effort and the culture of belief endemic to many western countries, particularly the United States, that emphasizes the value of competition and differentiation (Chen and Stevenson, 1995; Holloway, 1988).

1 To give a sense of the magnitude of such effect sizes, an effect size of 0.4 would mean that the average student who received the treatment would achieve at the same level as a student in the top 35 percent of those who did not receive the treatment. An effect size of 0.7, if realized in the Third International Mathematics and Science Study, would raise the United States from the middle of the 41 countries participating to one of the top 5.
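The percentile interpretation of these effect sizes follows from basic properties of the normal distribution. The short Python sketch below makes the arithmetic concrete; it assumes normally distributed scores with equal variance in the treatment and control groups, which is the usual simplifying assumption behind such statements.

```python
from statistics import NormalDist

def treated_mean_percentile(effect_size: float) -> float:
    """Percentile of the control distribution at which the *average*
    treated student performs, assuming normal scores with equal
    variance in both groups (standardized mean difference d)."""
    return NormalDist().cdf(effect_size) * 100

# d = 0.4 places the average treated student at about the 65th
# percentile of the control group, i.e., in the top ~35 percent.
for d in (0.4, 0.7):
    print(f"d = {d}: {treated_mean_percentile(d):.1f}th percentile")
```

The same calculation with d = 0.7 yields roughly the 76th percentile, consistent with the substantial gains described in the footnote.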

The issues involved in students’ views of themselves as learners may be understood at a more profound level by regarding the classroom as a community of practice in which the relationships formed and roles adopted between teacher and students and among students help to form and interact with each member’s sense of personal identity (Cobb et al., 1991; Greeno and The Middle-School Mathematics Through Applications Group, 1997). Feedback can either promote or undermine the student’s sense of identity as a potentially effective learner. For example, a student might generate a conjecture that was later falsified. One possible form of feedback would emphasize that the conjecture was wrong. A teacher might, instead, emphasize the disciplinary value of formulating conjectures and the fruitful mathematics that often follows from generating evidence about a claim, even (and sometimes especially) a false one.

A voluminous research literature addresses characteristics of learners that relate to issues of feedback. Important topics of study have included students’ attributions for success and failure (e.g., Weiner, 1986), intrinsic versus extrinsic motivation (e.g., Deci and Ryan, 1985), and self-efficacy (e.g., Bandura and Schunk, 1981). We have not attempted to synthesize this large body of literature (for reviews see Graham and Weiner, 1996; Stipek, 1996). The important point to be made here is that teachers should be aware that different types of feedback have motivational implications that affect how students respond. Black and Wiliam (1998) sum up the evidence on feedback as follows:

…the way in which formative information is conveyed to a student, and the context of classroom culture and beliefs about ability and effort within which feedback is interpreted by the individual recipient, can affect these personal features for good or ill. The hopeful message is that innovations which have paid careful attention to these features have produced significant gains when compared with the existing norms of classroom practice. (p. 25)

The Role of the Learner

Students have a crucial role to play in making classroom assessment effective. It is their responsibility to use the assessment information to guide their progress toward learning goals. Consider the following assessment example, which illustrates the benefits of having students engage actively in peer and self-assessment.

Researchers White and Frederiksen (2000) worked with teachers to develop the ThinkerTools Inquiry Project, a computer-enhanced middle school science curriculum that enables students to learn about the processes of scientific inquiry and modeling as they construct a theory of force and motion. 2 The class functions as a research community, and students propose competing theories. They then test their theories by working in groups to design and carry out experiments using both computer models and real-world materials. Finally, students come together to compare their findings and to try to reach consensus about the physical laws and causal models that best account for their results. This process is repeated as the students tackle new research questions that foster the evolution of their theories of force and motion.

The ThinkerTools program focuses on facilitating the development of metacognitive skills as students learn the inquiry processes needed to create and revise their theories. The approach incorporates a reflective process in which students evaluate their own and each other’s research using a set of criteria that characterize good inquiry, such as reasoning carefully and collaborating well. Studies in urban classrooms revealed that when this reflective process is included, the approach is highly effective in enabling all students to improve their performance on various inquiry and physics measures and helps reduce the performance gap between low- and high-achieving students (see Box 6–3 ).

As demonstrated by the ThinkerTools example, peer and self-assessment are useful techniques for having learners share and grasp the criteria of quality work—a crucial step if formative assessment is to be effective. Just as teachers should adopt models of cognition and learning to guide instruction, they should also convey a model of learning (perhaps a simplified version) to their students so the students can monitor their own learning. This can be done through techniques such as the development of scoring rubrics or criteria for evaluating student work. As emphasized in Chapter 3 , metacognitive awareness and control of one’s learning are crucial aspects of developing competence.
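One concrete way to share criteria of quality work with students is to encode them as a simple rubric that students apply to their own work. The sketch below is a hypothetical illustration in the spirit of the ThinkerTools criteria; the criterion names, levels, and scoring scheme are invented for this example, not drawn from the actual program.

```python
# A minimal, hypothetical inquiry rubric: each criterion has ordered
# performance levels, from lowest to highest.
RUBRIC = {
    "reasoning_carefully": ["not yet", "partially", "consistently"],
    "collaborating_well": ["not yet", "partially", "consistently"],
    "relating_results_to_hypotheses": ["not yet", "partially", "consistently"],
}

def score_self_assessment(ratings: dict[str, str]) -> float:
    """Average a student's self-ratings into a 0-1 score, where each
    criterion contributes its level index divided by the top level."""
    total = 0.0
    for criterion, levels in RUBRIC.items():
        total += levels.index(ratings[criterion]) / (len(levels) - 1)
    return total / len(RUBRIC)

print(score_self_assessment({
    "reasoning_carefully": "consistently",
    "collaborating_well": "partially",
    "relating_results_to_hypotheses": "partially",
}))
```

The pedagogical point is not the number itself but that making the levels explicit gives students a shared vocabulary for monitoring their own progress.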

Students should be taught to ask questions about their own work and revise their learning as a result of reflection—in effect, to conduct their own formative assessment. When students who are motivated to improve have opportunities to assess their own and others’ learning, they become more capable of managing their own educational progress, and there is a transfer of power from teacher to learner. On the other hand, when formative feedback is “owned” entirely by the teacher, the power of the learner in the classroom is diminished, and the development of active and independent learning is inhibited (Deci and Ryan, 1994; Fernandes and Fontana, 1996; Grolnick and Ryan, 1987).

2 Website: <garnet.berkeley.edu:7019/mchap.html>. [September 5, 2000].

White and Frederiksen (2000) carried out a controlled study comparing ThinkerTools classes in which students engaged in the reflective-assessment process with matched control classes in which they did not. Each teacher’s classes were evenly divided between the two treatments. In the reflective-assessment classes, the students continually engaged in monitoring and evaluating their own and each other’s research. In the control classes, the students were not given an explicit framework for reflecting on their research; instead, they engaged in alternative activities in which they commented on what they did and did not like about the curriculum. In all other respects, the classes participated in the same ThinkerTools inquiry-based science curriculum. There were no significant differences in students’ initial average standardized test scores (the Comprehensive Test of Basic Skills [CTBS] was used as a measure of prior achievement) between the classes assigned (randomly) to the different treatments.

One of the outcome measures was a written inquiry assessment that was given both before and after the ThinkerTools Inquiry Curriculum was administered. Presented below are the gain scores on this assessment for both low- and high-achieving students and for students in the reflective-assessment and control classes. Note first that students in the reflective-assessment classes gained more on this inquiry assessment. Note also that this was particularly true for the low-achieving students. This is evidence that the metacognitive reflective-assessment process is beneficial, particularly for academically disadvantaged students.

This finding was further explored by examining the gain scores for each component of the inquiry test. As shown in the figure below, the effect of reflective assessment is greatest for the more difficult aspects of the test: making up results, analyzing those results, and relating them back to the original hypotheses. In fact, the largest difference in the gain scores is that for a measure termed “coherence,” which reflects the extent to which the experiments the students designed addressed their hypotheses, their made-up results related to their experiments, their conclusions followed from their results, and their conclusions were related back to their original hypotheses. The researchers note that this kind of overall coherence is a particularly important indication of sophistication in inquiry.

Because the assessor, in this context typically the classroom teacher, has interactive contact with the learner, many of the construct-irrelevant barriers associated with external standardized assessments (e.g., language barriers, unfamiliar contexts) can potentially be detected and overcome in the context of classroom assessment. However, issues of fairness can still arise in classroom assessment. Sensitive attention by the teacher is paramount to avoid potential sources of bias. In particular, differences between the cultural backgrounds of the teacher and the students can lead to severe difficulties. For example, the kinds of questions a middle-class teacher asks may be quite unlike, in form and function, questions students from a different socioeconomic or cultural group would experience at home, placing those students at a disadvantage (Heath, 1981, 1983).

Apart from the danger of a teacher’s personal bias, possibly unconscious, against any particular individual or group, there is also the danger of a teacher’s subscribing to the belief that learning ability or intelligence is fixed. Teachers holding such a belief may make self-confirming assumptions that certain children will never be able to learn, and may misinterpret or ignore assessment evidence to the contrary. However, as emphasized in the above discussion, there is great potential for formative assessment to assist and improve learning, and some studies, such as the ThinkerTools study described in Box 6–3 , have shown that students initially classified as less able show the largest learning gains. There is some indication from other studies that the finding of greater gains for less able students may be generalizable, and this is certainly an area to be further explored. 3 For now, these initial findings suggest that effective formative assessment practices may help overcome disadvantages endured at earlier stages in education.

Another possible source of bias may arise when students do not understand or accept learning goals. In such a case, responses that should provide the basis for formative assessment may not be meaningful or forthcoming.

3 The literature reviews on mastery learning by Block and Burns (1976), Guskey and Gates (1986), and Kulik, Kulik, and Bangert-Drowns (1990) confirm evidence of extra learning gains for the less able, gains that have been associated with the feedback enhancement in such regimes. However, Livingston and Gentile (1996) have cast doubt on this attribution. Fuchs and Fuchs (1986) report that studies with children with learning handicaps showed mean gain effect sizes of 0.73, compared with a mean of 0.63 for nonhandicapped children.

This potential consequence argues for helping learners understand and share learning goals.

LARGE-SCALE ASSESSMENT

We have described ways in which classroom assessment can be used to improve instruction and learning. We now turn to a discussion of assessments that are used in large-scale contexts, primarily for policy purposes. They include state, national, and international assessments. At the policy level, large-scale assessments are often used to evaluate programs and/or to set expectations for individual student learning (e.g., for establishing the minimum requirements individual students must meet to move on to the next grade or graduate from high school). At the district level, such assessments may be used for those same purposes, as well as for matching students to appropriate instructional programs. At the classroom level, large-scale assessments tend to be less relevant but still provide information a teacher can use to evaluate his or her own instruction and to identify or confirm areas of instructional need for individual students. Though further removed from day-to-day instruction than classroom assessments, large-scale assessments have the potential to support instruction and learning if well designed and appropriately used. For parents, large-scale assessments can provide information about their own child’s achievement and some information about the effectiveness of the instruction their child is receiving.

Implications of Advances in Cognition and Measurement

Substantially more valid and useful information could be gained from large-scale assessments if the principles set forth in Chapter 5 were applied during the design process. However, fully capitalizing on the new foundations described in this report will require more substantial changes in the way large-scale assessment is approached, as well as relaxation of some of the constraints that currently drive large-scale assessment practices.

As described in Chapter 5 , large-scale summative assessments should focus on the most critical and central aspects of learning in a domain as identified by curriculum standards and informed by cognitive research and theory. Large-scale assessments typically will reflect aspects of the model of learning at a less detailed level than classroom assessments, which can go into more depth because they focus on a smaller slice of curriculum and instruction. For instance, one might need to know for summative purposes whether a student has mastered the more complex aspects of multicolumn subtraction, including borrowing from and across zero, rather than exactly which subtraction bugs lead to mistakes. At the same time, while policy makers and parents may not need all the diagnostic detail that would be useful to a teacher and student during the course of instruction, large-scale summative assessments should be based on a model of learning that is compatible with and derived from the same set of knowledge and beliefs about learning as classroom assessment.

Research on cognition and learning suggests a broad range of competencies that should be assessed when measuring student achievement, many of which are essentially untapped by current assessments. Examples are knowledge organization, problem representation, strategy use, metacognition, and kinds of participation in activity (e.g., formulating questions, constructing and evaluating arguments, contributing to group problem solving). Furthermore, large-scale assessments should provide information about the nature of student understanding, rather than simply ranking students according to general proficiency estimates.

A major problem is that only limited improvements in large-scale assessments are possible under current constraints and typical standardized testing scenarios. Returning to issues of constraints and trade-offs discussed earlier in this chapter, large-scale assessments are designed to serve certain purposes under constraints that often include providing reliable and comparable scores for individuals as well as groups; sampling a broad set of curriculum standards within a limited testing time per student; and offering cost-efficiency in terms of development, scoring, and administration. To meet these kinds of demands, designers typically create assessments that are given at a specified time, with all students taking the same (or parallel) tests under strictly standardized conditions (often referred to as “on-demand” assessment). Tasks are generally of the kind that can be presented in paper-and-pencil format, that students can respond to quickly, and that can be scored reliably and efficiently. In general, competencies that lend themselves to being assessed in these ways are tapped, while aspects of learning that cannot be observed under such constrained conditions are not addressed. To design new kinds of situations for capturing the complexity of cognition and learning will require examining the assumptions and values that currently drive assessment design choices and breaking out of the current paradigm to explore alternative approaches to large-scale assessment.

Alternative Approaches

To derive real benefits from the merger of cognitive and measurement theory in large-scale assessment requires finding ways to cover a broad range of competencies and to capture rich information about the nature of student understanding. This is true even if the information produced is at a coarse-grained as opposed to a highly detailed level. To address these challenges it is useful to think about the constraints and trade-offs associated with issues of sampling—sampling of the content domain and of the student population.

The tasks on any particular assessment are supposed to be a representative sample of the knowledge and skills encompassed by the larger content domain. If the domain to be sampled is very broad, which is usually the case with large-scale assessments designed to cover a large period of instruction, representing the domain may require a large number and variety of assessment tasks. Most large-scale test developers opt for having many tasks that can be responded to quickly and that sample broadly. This approach limits the sorts of competencies that can be assessed, and such measures tend to cover only superficially the kinds of knowledge and skills students are supposed to be learning. Thus there is a need for testing situations that enable the collection of more extensive evidence of student performance.

If the primary purpose of the assessment is program evaluation, the constraint of having to produce reliable individual student scores can be relaxed, and population sampling can be useful. Instead of having all students take the same test (also referred to as “census testing”), a population sampling approach can be used whereby different students take different portions of a much larger assessment, and the results are combined to obtain an aggregate picture of student achievement.
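The population-sampling idea can be sketched in a few lines of code: each student takes only one slice of a larger item pool, yet aggregating across students yields an estimate for every item in the pool. The item pool, form assignment, and simulated responses below are all hypothetical, chosen only to show the mechanics.

```python
import random

random.seed(1)

# Hypothetical item pool: 30 items covering a broad content domain.
items = [f"item_{i:02d}" for i in range(30)]

# Split the pool into 3 shorter forms; each student takes only one form.
forms = [items[i::3] for i in range(3)]

# Each of 300 students answers only the 10 items on his or her form
# (answers are simulated here as correct with probability 0.6).
students = 300
responses = {item: [] for item in items}
for s in range(students):
    for item in forms[s % 3]:
        responses[item].append(random.random() < 0.6)

# Aggregation recovers a group-level proportion-correct estimate for
# every item in the pool, although no student saw more than a third of it.
p_correct = {item: sum(r) / len(r) for item, r in responses.items()}
print(f"Items with group-level estimates: {len(p_correct)} of {len(items)}")
```

The trade-off is visible in the data structure: `responses` supports sound inferences about the group for every item, but any one student's record is too sparse to yield a reliable individual score.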

If individual student scores are needed, broader sampling of the domain can be achieved by extracting evidence of student performance from classroom work produced during the course of instruction (often referred to as “curriculum-embedded” assessment). Student work or scores on classroom assessments can be used to supplement the information collected from an on-demand assessment to obtain a more comprehensive sampling of student performance. Although rarely used today for large-scale assessment purposes, curriculum-embedded tasks can serve policy and other external purposes of assessment if the tasks are centrally determined to some degree, with some flexibility built in for schools, teachers, and students to decide which tasks to use and when to have students respond to them.

Curriculum-embedded assessment approaches afford additional benefits. In on-demand testing situations, students are administered tasks that are targeted to their grade levels but not otherwise connected to their personal educational experiences. It is this relatively low degree of contextualization that renders these data good for some inferences, but not as good for others (Mislevy, 2000). If the purpose of assessment is to draw inferences about whether students can solve problems using knowledge and experiences they have learned in class, an on-demand testing situation in which every student receives a test with no consideration of his or her personal instruction history can be unfair. In this case, to provide valuable evidence of learning, the assessment must tap what the student has had the opportunity to learn (NRC, 1999b). In contrast to on-demand assessment, embedded assessment approaches use techniques that link assessment tasks to concepts and materials of instruction. Curriculum-embedded assessment offers an alternative to on-demand testing for cases in which there is a need for correspondence among the curriculum, assessment, and actual instruction (see the related discussion of conditional versus unconditional inferences at the end of Chapter 5 ).

The following examples illustrate some cases in which these kinds of alternative approaches are being used successfully to evaluate individuals and programs in large-scale contexts. Except for DIAGNOSER, these examples are not strictly cognitively based and do not necessarily illustrate the features of design presented in Chapter 5 . Instead they were selected to illustrate some alternative ways of approaching large-scale assessment and the trade-offs entailed. The first two examples show how population sampling has been used for program evaluation at the national and state levels to enable coverage of a broader range of learning goals than would be possible if each student were to take the same form of a test. The third and fourth examples involve approaches to measuring individual attainment that draw evidence of student performance from the course of instruction.

Alternative Approaches to Large-Scale Assessment: Examples

National Assessment of Educational Progress

As described earlier in this chapter, NAEP is a national survey intended to provide policy makers and the public with information about the academic achievement of students across the nation. It serves as one source of information for policy makers, school administrators, and the public for evaluating the quality of their curriculum and instructional programs. NAEP is a unique case of program evaluation in that it is not tied to any specific curriculum. It is based on a set of assessment frameworks that describe the knowledge and skills to be assessed in each subject area. The performances assessed are intended to represent the leading edge of what all students should be learning. Thus the frameworks are broader than any particular curriculum (NRC, 1999a). The challenge for NAEP is to assess the breadth of learning goals that are valued across the nation. The program approaches this challenge through the complex matrix sampling design described earlier.

NAEP’s design is beginning to be influenced by the call for more cognitively informed assessments of educational programs. Recent evaluations of NAEP (National Academy of Education, 1997; NRC, 1999a) emphasize that the current survey does not adequately capitalize on advances in our understanding of how people learn particular subject matter. These study committees have strongly recommended that NAEP incorporate a broader conceptualization of school achievement to include aspects of learning that are not well specified in the existing NAEP frameworks or well measured by the current survey methods. The National Academy of Education panel recommended that particular attention be given to such aspects of student cognition as problem representation, the use of strategies and self-regulatory skills, and the formulation of explanations and interpretations, contending that consideration of these aspects of student achievement is necessary for NAEP to provide a complete and accurate assessment of achievement in a subject area. The subsequent review of NAEP by the NRC reiterated those recommendations and added that large-scale survey instruments alone cannot reflect the scope of these more comprehensive goals for schooling. The NRC proposed that, in addition to the current assessment blocks, which are limited to 50-minute sessions and paper-and-pencil responses, NAEP should include carefully designed, targeted assessments administered to smaller samples of students that could provide in-depth descriptive information about more complex activities that occur over longer periods of time. For instance, smaller data collections could involve observations of students solving problems in groups or performing extended science projects, as well as analysis of writing portfolios compiled by students over a year of instruction.

Thus NAEP illustrates how relaxing the constraint of having to provide individual student scores opens up possibilities for population sampling and coverage of a much broader domain of cognitive performances. The next example is another illustration of what can be gained by such a sampling approach.

Maryland State Performance Assessment Program

The Maryland State Performance Assessment Program (MSPAP) is designed to evaluate how well schools are teaching the basic and complex skills outlined in state standards called Maryland Learner Outcomes. Maryland is one of the few states in the country that has decided to optimize the use of assessment for program evaluation, forgoing individual student scores. A population sampling design is used, as opposed to the census testing design used by most states.

MSPAP consists of criterion-referenced performance tests in reading, mathematics, writing, language usage, science, and social studies for students in grades 3, 5, and 8. The assessment is designed to measure a broad range of competencies. Tasks require students to respond to questions or directions that lead to a solution for a problem, a recommendation or decision, or an explanation or rationale for their responses. Some tasks assess one content area; others assess multiple content areas. The tasks may encompass group or individual activities; hands-on, observation, or reading activities; and activities that require extended written responses, limited written responses, lists, charts, graphs, diagrams, webs, and/or drawings. A few MSPAP items are released each year to educators and the public to provide a picture of what the assessment looks like and how it is scored.

To cover this broad range of learning outcomes, Maryland uses a sampling approach whereby each student takes only one-third of the entire assessment. This means an individual student’s results do not give a complete picture of how that child is performing (although parents can obtain a copy of their child’s results from the local school system). What is gained is a program evaluation instrument that covers a much more comprehensive range of learning goals than that addressed by a traditional standardized test.
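The arithmetic behind this kind of matrix sampling can be sketched in a few lines. The sketch below is hypothetical, not MSPAP's actual design: the form assignments, outcome names, and scores are all invented for illustration. The point it demonstrates is the trade-off described above: each student sees only one form covering a third of the outcomes, yet pooling across students yields a population-level estimate for every outcome.

```python
import random

# Hypothetical matrix-sampling sketch: three forms, each covering a
# different third of the assessed outcomes. No single student sees
# every outcome, but pooling across students covers the whole domain.
OUTCOMES = ["reading", "writing", "math", "science", "social_studies", "language"]
FORMS = {
    "A": ["reading", "writing"],
    "B": ["math", "science"],
    "C": ["social_studies", "language"],
}

def administer(num_students, rng):
    """Assign each student one randomly chosen form and record scores.

    Scores here are placeholders drawn uniformly from 0-100; a real
    program would use scored student responses.
    """
    totals = {outcome: [] for outcome in OUTCOMES}
    for _ in range(num_students):
        form = rng.choice(sorted(FORMS))
        for outcome in FORMS[form]:
            totals[outcome].append(rng.uniform(0, 100))
    return totals

def population_estimates(totals):
    """Average score per outcome, estimated from the subsample that saw it."""
    return {o: sum(scores) / len(scores) for o, scores in totals.items() if scores}

estimates = population_estimates(administer(900, random.Random(42)))
# Every outcome gets an estimate even though each student took only one form.
assert set(estimates) == set(OUTCOMES)
```

Note the limitation the text identifies: because each student contributes scores on only a third of the outcomes, no individual's record supports a complete profile of that student; only the aggregate is well covered.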

AP Studio Art

The above two examples do not provide individual student scores. The AP Studio Art portfolio assessment is an example of an assessment that is designed to certify individual student attainment over a broad range of competencies and to be closely linked to the actual instruction students have experienced (College Board, 1994). Student work products are collected during the course of instruction and then evaluated to provide a summative measure of student attainment.

AP Studio Art is just one of many Advanced Placement (AP) programs designed to give highly motivated high school students the opportunity to take college-level courses in areas such as biology, history, calculus, and English while still in high school. AP programs provide course descriptions and teaching materials, but do not require that specific textbooks, teaching techniques, or curricula be followed. Each program culminates in an exam intended to certify whether individual students have mastered material equivalent to that of an introductory college course. AP Studio Art is unique in that at the end of the year, instead of taking a written summative exam, students present a portfolio of materials selected from the work they have produced during the AP course for evaluation by a group of artists and teachers. Preparation of the portfolio requires forethought; work submitted for the various sections must meet the publicly shared criteria set forth by the AP program.

The materials presented for evaluation may have been produced in art classes or on the student’s own time and may cover a period of time longer than a single school year. Instructional goals and the criteria by which students’ performance will be evaluated are made clear and explicit. Portfolio requirements are carefully spelled out in a poster distributed to students and teachers; scoring rubrics are also widely distributed. Formative assessment is a critical part of the program as well. Students engage in evaluation of their own work and that of their peers, then use that feedback to inform next steps in building their portfolios. Thus while the AP Studio Art program is not directly based on cognitive research, it does reflect general cognitive principles, such as setting clear learning goals and providing students with opportunities for formative feedback, including evaluation of their own work.

Portfolios are scored quickly but fairly by trained raters. It is possible to assign reliable holistic scores to portfolios in a short amount of time. Numerous readings go into the scoring of each portfolio, enhancing the fairness of the assessment process (Mislevy, 1996). In this way, technically sound judgments are made, based on information collected through the learning process, that fulfill certification purposes. Thus by using a curriculum-embedded approach, the AP Studio Art program is able to collect rich and varied samples of student work that are tied to students’ instructional experiences over the course of the year, but can also be evaluated in a standardized way for the purposes of summative assessment.

It should be noted that some states attempting to implement large-scale portfolio assessment programs have encountered difficulties (Koretz and Barron, 1998). Therefore, while this is a good example of an alternative approach to on-demand testing, it should be recognized that there are many implementation challenges to be addressed.

Facets DIAGNOSER

We return to Minstrell and Hunt’s facets-based DIAGNOSER (Minstrell, 2000), described in some detail in Chapter 5 , to illustrate another way of thinking about assessment of individuals’ summative achievement. The DIAGNOSER, developed for use at the classroom level to assist learning, does not fit the mold of traditional large-scale assessment. Various modules (each of which takes 15 to 20 minutes) cover small amounts of material fairly intensively. However, the DIAGNOSER could be used to certify individual attainment by noting the most advanced module a student had completed at a successful level of understanding in the course of instruction. For instance, the resulting assessment record would distinguish between students who had completed only Newtonian mechanics and those who had completed modules on the more advanced topics of waves or direct-circuit electricity. Because the assessment is part of instruction, there would be less concern about instructional time lost to testing.

Minstrell (2000) also speculates about how a facets approach could be applied to the development of external assessments designed to inform decisions at the program and policy levels. Expectations for learning, currently conveyed by state and national curriculum standards, would be enhanced by facets-type research on learning. Current standards based on what we want our students to know and be able to do could be improved by incorporating findings from research on what students know and are able to do along the way to competence. By using a matrix sampling design, facet clusters could be covered extensively, providing summary information for decision makers about specific areas of difficulty for learners—information that would be useful for curriculum revision.

Use of Large-Scale Assessment to Signal Worthy Goals

Large-scale assessments can serve the purposes of learning by signaling worthwhile goals for educators and students to pursue. The challenge is to use the assessment program to signal goals at a level that is clear enough to provide some direction, but not so prescriptive that it results in a narrowing of instruction. Educators and researchers have debated the potential benefits of “teaching to a test.” Proponents of performance-based assessment have suggested that assessment can have a positive impact on learning if authentic tasks are used that replicate important performances in the discipline. The idea is that high-quality tasks can clarify and set standards of academic excellence, in which case teaching to the test becomes a good thing (Wiggins, 1989). Others (Miller and Seraphine, 1993) have argued that teaching to a test will always result in narrowing of the curriculum, given that any test can only sample the much broader domain of learning goals.

These views can perhaps be reconciled if the assessment is based on a well-developed model of learning that is shared with educators and learners. To make appropriate instructional decisions, teachers should teach to the model of learning—as conveyed, for example, by progress maps and rubrics for judging the quality of student work—rather than focusing on the particular items on a test. Test users must understand that any particular set of assessment tasks represents only a sample of the domain and that tasks will change from year to year. Given this understanding, assessment items and sample student responses can provide valuable exemplars to help teachers and students understand the underlying learning goals. Whereas teaching directly to the items on a test is not desirable, teaching to the set of beliefs about learning that underlie an assessment—which should be the same set of beliefs that underlies the curriculum—can provide positive direction for instruction.

High-quality summative assessment tasks are ones for which students can prepare only through active learning, as opposed to rote drill and practice or memorization of solutions. The United Kingdom’s Secondary School Certification Exam in physics (described in more detail later in this chapter) produces a wide variety of evidence that can be used to evaluate students’ summative achievement. The exam includes some transfer tasks that have been observed to be highly motivating for students (Morland, 1994). For instance, there is a task that assesses whether students can read articles dealing with applications of physics that lie outside the confines of the syllabus. Students know they will be presented with an article they have not seen before on a topic not specified in the syllabus, but that it will be at a level they should be able to understand on the basis of the core work of the syllabus. This task assesses students’ competency in applying their understanding in a new context in the process of learning new material. The only way for students to prepare for this activity is to read a large variety of articles and work systematically to understand them.

Another goal of the U.K. physics curriculum is to develop students’ capacity to carry out experimental investigations on novel problems. Students are presented with a scientific problem that is not included in the routine curriculum materials and must design an experiment, select and appropriately use equipment and procedures to implement the design, collect and analyze data, and interpret the data. Again, the only way students can prepare for this task is by engaging in a variety of such investigations and learning how to take responsibility for their design, implementation, and interpretation. In the United Kingdom, these portions of the physics exam are administered by the student’s own teacher, with national, standardized procedures in place for ensuring and checking fairness and rigor. When this examination was first introduced in the early 1970s, it was uncommon in classrooms to have students read on topics outside the syllabus and design and conduct their own investigations. The physics exam has supported the message, also conveyed by the curriculum, that these activities are essential, and as a result students taking the exam have had the opportunity to engage in such activities in the course of their study (Tebbutt, 1981).

Feedback and Expectations for Learning

In Chapters 4 and 5 , we illustrated some of the kinds of information that could be obtained by reporting large-scale assessment results in relation to developmental progress maps or other types of learning models. Assessment results should describe student performance in terms of different states and levels of competence in the domain. Typical learning pathways should be displayed and made as recognizable as possible to educators, students, and the public.

Large-scale assessments of individual achievement could be improved by focusing on the potential for providing feedback that not only measures but also enhances future learning. Assessments can be designed to say not only that a person is unqualified to move on, but also that the person’s difficulty lies in these particular areas, which is what must be improved, the other components being at the desired level.

Likewise, assessments designed to evaluate programs should provide the kinds of information decision makers can use to improve those programs. People tend to think of school administrators and policy makers as removed from concerns about the details of instruction. Thus large-scale assessment information aimed at those users tends to be general and comparative, rather than descriptive of the nature of learning that is taking place in their schools. Practices in some school districts, however, are challenging these assumptions (Resnick and Harwell, 1998).

Telling an administrator that mathematics is a problem is too vague. Knowing how a school is performing in mathematics relative to past years, how it is performing relative to other schools, and what proportions of students fall in various broadly defined achievement categories also provides little guidance for program improvement. Saying that students do not understand probability is more useful, particularly to a curriculum planner. And knowing that students tend to confuse conditional and compound probability can be even more useful for the modification of curriculum and instruction. Of course, the sort of feedback needed to improve instruction depends on the program administrator’s level of control.

Not only do large-scale assessments provide means for reporting on student achievement, but they also convey powerful messages about the kinds of learning valued by society. Large-scale assessments should be used by policy makers and educators to operationalize and communicate among themselves, and to the public, the kinds of thinking and learning society wishes to encourage in students. In this way, assessments can foster valuable dialogue about learning and its assessment within and beyond the education system. Models of learning should be shared and communicated in accessible ways to show what competency in a domain looks like. For example, Developmental Assessment based on progress maps is being used in the Commonwealth of Victoria to assess literacy. An evaluation of the program revealed that users were “overwhelmingly positive about the value and potential of Developmental Assessment as a means for developing shared understandings and a common language for literacy development” (Meiers and Culican, 2000, p. 44).

Example: The New Standards Project

The New Standards Project, as originally conceived (New Standards™, 1997a, 1997b, 1997c), illustrates ways to approach many of the issues of large-scale assessment discussed above. The program was designed to provide clear goals for learning and assessments that are closely tied to those goals. A combination of on-demand and embedded assessment was to be used to tap a broad range of learning outcomes, and priority was given to communicating the performance standards to various user communities. Development of the program was a collaboration between the Learning Research and Development Center of the University of Pittsburgh and the National Center on Education and the Economy, in partnership with states and urban school districts. Together they developed challenging standards for student performance at grades 4, 8, and 10, along with large-scale assessments designed to measure attainment of those standards.6

The New Standards Project includes three interrelated components: performance standards, a portfolio assessment system, 7 and an on-demand exam. The performance standards describe what students should know and the ways they should demonstrate the knowledge and skills they have acquired. The performance standards include samples of student work that illustrate high-quality performances, accompanied by commentary that shows how the work sample reflects the performance standards. They go beyond most content standards by describing how good is good enough, thus providing clear targets to pursue.

The Reference Exam is a summative assessment of the national standards in the areas of English Language Arts and Mathematics at grades 4, 8, and 10. The developers state explicitly that the Reference Exam is intended to address those aspects of the performance standards that can be assessed in a limited time frame under standardized conditions. The portfolio assessment system was designed to complement the Reference Exam by providing evidence of achievement of those performance standards that depend on extended work and the accumulation of evidence over time.

The developers recognized the importance of making the standards clear and presenting them in differing formats for different audiences. One version of the standards is targeted to teachers. It includes relatively detailed language about the subject matter of the standards and terms educators use to describe differences in the quality of work produced by students. The standards are also included in the portfolio material provided for student use. In these materials, the standards are set forth in the form of guidelines to help students select work for inclusion in their portfolios. In addition, there were plans to produce a less technical version for parents and the community in general.

  

6 Aspects of the program have since changed, and the Reference Exam is now administered by Harcourt Educational Measurement.

7 The portfolio component was field tested but has not been administered on a large scale.

ASSESSMENT SYSTEMS

In the preceding discussion we have addressed issues of practice related to classroom and large-scale assessment separately. We now return to the matter of how such assessments can work together conceptually and operationally.

As argued throughout this chapter, one form of assessment does not serve all purposes. Given that reality, it is inevitable that multiple assessments (or assessments consisting of multiple components) are required to serve the varying educational assessment needs of different audiences. A multitude of different assessments are already being conducted in schools. It is not surprising that users are often frustrated when such assessments have conflicting achievement goals and results. Sometimes such discrepancies can be meaningful and useful, such as when assessments are explicitly aimed at measuring different school outcomes. More often, however, conflicting assessment goals and feedback cause much confusion for educators, students, and parents. In this section we describe a vision for coordinated systems of multiple assessments that work together, along with curriculum and instruction, to promote learning. Before describing specific properties of such systems, we consider issues of balance and allocation of resources across classroom and large-scale assessment.

Balance Between Classroom and Large-Scale Assessment

The current educational assessment environment in the United States clearly reflects the considerable value and credibility accorded external, large-scale assessments of individuals and programs relative to classroom assessments designed to assist learning. The resources invested in producing and using large-scale testing in terms of money, instructional time, research, and development far outweigh the investment in the design and use of effective classroom assessments. It is the committee’s position that to better serve the goals of learning, the research, development, and training investment must be shifted toward the classroom, where teaching and learning occur.

Not only does large-scale assessment dominate classroom assessment, but there is also ample evidence that accountability measures negatively impact classroom instruction and assessment. For instance, as discussed earlier, teachers feel pressure to teach to the test, which results in a narrowing of instruction. They also model their own classroom tests after less-than-ideal standardized tests (Gifford and O’Connor, 1992; Linn, 2000; Shepard, 2000). These kinds of problems suggest that beyond striking a better balance between classroom and large-scale assessment, what is needed are coordinated assessment systems that collectively support a common set of learning goals, rather than working at cross-purposes.

Ideally in a balanced assessment environment, a single assessment does not function in isolation, but rather within a nested assessment system involving states, local school districts, schools, and classrooms. Assessment systems should be designed to optimize the credibility and utility of the resulting information for both educational decision making and general monitoring. To this end, an assessment system should exhibit three properties: comprehensiveness, coherence, and continuity. These three characteristics describe an assessment system that is aligned along three dimensions: vertically, across levels of the education system; horizontally, across assessment, curriculum, and instruction; and temporally, across the course of a student’s studies. These notions of alignment are consistent with those set forth by the National Institute for Science Education (Webb, 1997) and the National Council of Teachers of Mathematics (1995).

Features of a Balanced Assessment System

Comprehensiveness.

By comprehensiveness, we mean that a range of measurement approaches should be used to provide a variety of evidence to support educational decision making. Educational decisions often require more information than a single measure can provide. As emphasized in the NRC report High Stakes: Testing for Tracking, Promotion, and Graduation, multiple measures take on particular importance when important, life-altering decisions (such as high school graduation) are being made about individuals. No single test score can be considered a definitive measure of a student’s competence. Multiple measures enhance the validity and fairness of the inferences drawn by giving students various ways and opportunities to demonstrate their competence. The measures could also address the quality of instruction, providing evidence that improvements in tested achievement represent real gains in learning (NRC, 1999c).

One form of comprehensive assessment system is illustrated in Table 6–1, which shows the components of a U.K. examination for certification of top secondary school students who have studied physics as one of three chosen subjects for 2 years between ages 16 and 18. The results of such examinations are the main criterion for entrance to university courses. Components A, B, C, and D are all taken within a few days, but E and F involve activities that extend over several weeks preceding the formal examination.

This system combines external testing on paper (components A, B, and C) with external performance tasks done using equipment (D) and teachers’ assessment of work done during the course of instruction (E and F).

TABLE 6–1 Six Components of an A-Level Physics Examination

Component | Title              | No. of Questions or Tasks | Time          | Weight in Marks | Description
----------|--------------------|---------------------------|---------------|-----------------|------------
A         | Coded Answer       | 40                        | 75 min.       | 20%             | Multiple-choice questions, all to be attempted.
B         | Short Answer       | 7 or 8                    | 90 min.       | 20%             | Short questions with structured subcomponents, fixed space for each answer, all to be attempted.
C         | Comprehension      | 3                         | 150 min.      | 24%             | (a) Answer questions on a new passage; (b) analyze and draw conclusions from a set of presented data; (c) explain phenomena described in short paragraphs, selecting 3 from 5.
D         | Practical Problems | 8                         | 90 min.       | 16%             | Short problems with equipment set up in a laboratory, all to be attempted.
E         | Investigation      | 1                         | About 2 weeks | 10%             | In normal school laboratory time, investigate a problem of the student’s own choice.
F         | Project Essay      | 1                         | About 2 weeks | 10%             | In normal school time, research and write about a topic chosen by the student.

While this particular physics examination is now subject to change,8 combining the results of external tests with classroom assessments of particular aspects of achievement for which a short formal test is not appropriate is an established feature of achievement testing systems in the United Kingdom and several other countries. This feature is also part of the examination system for the International Baccalaureate degree program. In such systems, work is needed to develop procedures for ensuring the comparability of standards across all teachers and schools.

8 Because the whole structure of the 16–18 examinations is being changed, this examination and the curriculum on which it is based, which have been in place for 30 years, will no longer be in use after 2001. They will be replaced by a new curriculum and examination, based on the same principles.

Overall, the purpose is to reflect the variety of the aims of a course, including the range of knowledge and simple understanding explored in A, the practical skills explored in D, and the broader capacities for individual investigation explored in E and F. Validity and comprehensiveness are enhanced, albeit through an expensive and complex assessment process.
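How the six components combine into a single result can be illustrated with a short sketch. The weights follow Table 6–1; the candidate's component marks below are invented for illustration, and the simple weighted sum is an assumption about how such marks might be aggregated, not a description of the actual U.K. marking procedure.

```python
# Component weights from Table 6-1; they sum to 100%.
WEIGHTS = {"A": 0.20, "B": 0.20, "C": 0.24, "D": 0.16, "E": 0.10, "F": 0.10}

def final_mark(component_marks):
    """Combine per-component percentage marks (0-100) into one weighted total.

    Raises ValueError if any of the six components is missing, since
    every component contributes to the certification.
    """
    missing = set(WEIGHTS) - set(component_marks)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[c] * component_marks[c] for c in WEIGHTS)

# Invented marks for one candidate: stronger on the timed papers (A-D)
# than on the extended coursework components (E and F).
marks = {"A": 80, "B": 75, "C": 70, "D": 65, "E": 55, "F": 60}
total = final_mark(marks)
```

The sketch makes the design trade-off concrete: the two coursework components (E and F) together carry only 20% of the marks, yet they are the only components that reward extended, self-directed work.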

There are other possible ways to design comprehensive assessment systems. Portfolios are intended to record “authentic” assessments over a period of time and a range of classroom contexts. A system may assess and give certification in stages, so that the final outcome is an accumulation of results achieved and credited separately over, say, 1 or 2 years of a learning course; results of this type may be built up by combining on-demand externally controlled assessments with work samples drawn from coursework. Such a system may include assessments administered at fixed times or at times of the candidate’s choice using banks of tasks from which tests can be selected to match the candidate’s particular opportunities to learn. Thus designers must always look to the possibility of using the broader approaches discussed here, combining types of tasks and the timing of assessments and of certifications in the optimum way.

Further, in a comprehensive assessment system, the information derived should be technically sound and timely for given decisions. One must be able to trust the accuracy of the information and be assured that the inferences drawn from the results can be substantiated by evidence of various types. The technical quality of assessment is a concern primarily for external, large-scale testing; but if classroom assessment information is to feed into the larger assessment system, the reliability, validity, and fairness of these assessments must be addressed as well. Researchers are just beginning to explore issues of technical quality in the realm of classroom assessment (e.g., Wilson and Sloane, 2000).

For the system to support learning, it must also have a quality the committee refers to as coherence. One dimension of coherence is that the conceptual base or models of student learning underlying the various external and classroom assessments within a system should be compatible. While a large-scale assessment might be based on a model of learning that is coarser than that underlying the assessments used in classrooms, the conceptual base for the large-scale assessment should be a broader version of one that makes sense at the finer-grained level (Mislevy, 1996). In this way, the external assessment results will be consistent with the more detailed understanding of learning underlying classroom instruction and assessment. As one moves up and down the levels of the system, from the classroom through the school, district, and state, assessments along this vertical dimension should align. As long as the underlying models of learning are consistent, the assessments will complement each other rather than present conflicting goals for learning.

To keep learning at the center of the educational enterprise, assessment information must be strongly linked to curriculum and instruction. Thus another aspect of coherence, emphasized earlier, is that alignment is needed among curriculum, instruction, and assessment so that all three parts of the education system are working toward a common set of learning goals. Ideally, assessment will not simply be aligned with instruction, but integrated seamlessly into instruction so that teachers and students are receiving frequent but unobtrusive feedback about their progress. If assessment, curriculum, and instruction are aligned with common models of learning, it follows that they will be aligned with each other. This can be thought of as alignment along the horizontal dimension of the system.

To achieve both the vertical and horizontal dimensions of coherence or alignment, models of learning are needed that are shared by educators at different levels of the system, from teachers to policy makers. This need might be met through a process that involves gathering together the necessary expertise, not unlike the approach used to develop state and national curriculum standards that define the content to be learned. But current definitions of content must be significantly enhanced based on research from the cognitive sciences. Needed are user-friendly descriptions of how students learn the content, identifying important targets for instruction and assessment (see, e.g., American Association for the Advancement of Science, 2001). Research centers could be charged with convening the appropriate experts to produce a synthesis of the best available scientific understanding of how students learn in particular domains of the curriculum. These models of learning would then guide assessment design at all levels, as well as curriculum and instruction, effecting alignment in the system. Some might argue that what we have described are the goals of current curriculum standards. But while the existing standards emphasize what students should learn, they do not describe how students learn in ways that are maximally useful for guiding instruction and assessment.

In addition to comprehensiveness and coherence, an ideal assessment system would be designed to be continuous. That is, assessments should measure student progress over time, akin more to a videotape record than to the snapshots provided by the current system of on-demand tests. To provide such pictures of progress, multiple sets of observations over time must be linked conceptually so that change can be observed and interpreted. Models of student progression in learning should underlie the assessment system, and tests should be designed to provide information that maps back to the progression. With such a system, we would move from “one-shot” testing situations and cross-sectional approaches for defining student performance toward an approach that focuses on the processes of learning and an individual’s progress through that process (Wilson and Sloane, 2000). Thus, continuity calls for alignment along the third dimension of time.

Approximations of a Balanced System

No existing assessment systems meet all three criteria of comprehensiveness, coherence, and continuity, but many of the examples described in this report represent steps toward these goals. For instance, the Developmental Assessment program shows how progress maps can be used to achieve coherence between formative and summative assessments, as well as among curriculum, instruction, and assessment. Progress maps also enable the measurement of growth (continuity). The Australian Council for Educational Research has produced an excellent set of resource materials for teachers to support their use of a wide range of assessment strategies—from written tests to portfolios to projects at the classroom level—that can all be designed to link back to the progress maps (comprehensiveness) (see, e.g., Forster and Masters, 1996a, 1996b; Masters and Forster, 1996). The BEAR assessment shares many similar features; however, the underlying models of learning are not as strongly tied to cognitive research as they could be. On the other hand, intelligent tutoring systems have a strong cognitive research base and offer opportunities for integrating formative and summative assessments, as well as measuring growth, yet their use for large-scale assessment purposes has not yet been explored. Thus, examples in this report offer a rich set of opportunities for further development toward the goal of designing assessment systems that are maximally useful for both informing and improving learning.

CONCLUSIONS

Guiding the committee’s work were the premises that (1) something important should be learned from every assessment situation, and (2) the information gained should ultimately help improve learning. The power of classroom assessment resides in its close connections to instruction and teachers’ knowledge of their students’ instructional histories. Large-scale, standardized assessments can communicate across time and place, but by so constraining the content and timeliness of the message, they often have limited utility in the classroom. Thus the contrast between classroom and large-scale assessments arises from the different purposes they serve and the contexts in which they are used. Certain trade-offs are an inescapable aspect of assessment design.

Students will learn more if instruction and assessment are integrally related. In the classroom, providing students with information about particular qualities of their work and about what they can do to improve is crucial for maximizing learning. It is in the context of classroom assessment that theories of cognition and learning can be particularly helpful by providing a picture of intermediary states of student understanding on the pathway from novice to competent performer in a subject domain.

Findings from cognitive research cannot always be translated directly or easily into classroom practice. Most effective are programs that interpret the findings from cognitive research in ways that are useful for teachers. Teachers need theoretical training, as well as practical training and assessment tools, to be able to implement formative assessment effectively in their classrooms.

Large-scale assessments are further removed from instruction, but can still benefit learning if well designed and properly used. Substantially more valid and useful inferences could be drawn from such assessments if the principles set forth in this report were applied during the design process.

Large-scale assessments not only serve as a means for reporting on student achievement, but also reflect aspects of academic competence societies consider worthy of recognition and reward. Thus large-scale assessments can provide worthwhile targets for educators and students to pursue. Whereas teaching directly to the items on a test is not desirable, teaching to the theory of cognition and learning that underlies an assessment can provide positive direction for instruction.

To derive real benefits from the merger of cognitive and measurement theory in large-scale assessment, it will be necessary to devise ways of covering a broad range of competencies and capturing rich information about the nature of student understanding. Indeed, to fully capitalize on the new foundations described in this report will require substantial changes in the way large-scale assessment is approached and relaxation of some of the constraints that currently drive large-scale assessment practices. Alternatives to on-demand, census testing are available. If individual student scores are needed, broader sampling of the domain can be achieved by extracting evidence of student performance from classroom work produced during the course of instruction. If the primary purpose of the assessment is program evaluation, the constraint of having to produce reliable individual student scores can be relaxed, and population sampling can be useful.

For classroom or large-scale assessment to be effective, students must understand and share the goals for learning. Students learn more when they understand (and even participate in developing) the criteria by which their work will be evaluated, and when they engage in peer and self-assessment during which they apply those criteria. These practices develop students’ metacognitive abilities, which, as emphasized above, are necessary for effective learning.

The current educational assessment environment in the United States assigns much greater value and credibility to external, large-scale assessments of individuals and programs than to classroom assessment designed to assist learning. The investment of money, instructional time, research, and development for large-scale testing far outweighs that for effective classroom assessment. More of the research, development, and training investment must be shifted toward the classroom, where teaching and learning occur.

A vision for the future is that assessments at all levels—from classroom to state—will work together in a system that is comprehensive, coherent, and continuous. In such a system, assessments would provide a variety of evidence to support educational decision making. Assessment at all levels would be linked back to the same underlying model of student learning and would provide indications of student growth over time.

Three themes underlie this chapter’s exploration of how information technologies can advance the design of assessments, based on a merging of the cognitive and measurement advances reviewed in the preceding chapters.

For instance, technology offers opportunities to strengthen the cognition-observation linkage by enabling the design of situations that assess a broader range of cognitive processes than was previously possible, including knowledge-organization and problem-solving processes that are difficult to assess using traditional, paper-and-pencil assessment methods.

Education is a hot topic. From the stage of presidential debates to tonight's dinner table, it is an issue that most Americans are deeply concerned about. While there are many strategies for improving the educational process, we need a way to find out what works and what doesn't work as well. Educational assessment seeks to determine just how well students are learning and is an integral part of our quest for improved education.

The nation is pinning greater expectations on educational assessment than ever before. We look to these assessment tools when documenting whether students and institutions are truly meeting education goals. But we must stop and ask a crucial question: What kind of assessment is most effective?

At a time when traditional testing is subject to increasing criticism, research suggests that new, exciting approaches to assessment may be on the horizon. Advances in the sciences of how people learn and how to measure such learning offer the hope of developing new kinds of assessments: assessments that help students succeed in school by making as clear as possible the nature of their accomplishments and the progress of their learning.

Knowing What Students Know explains how expanding knowledge in the scientific fields of human learning and educational measurement can form the foundations of an improved approach to assessment. These advances suggest ways that the targets of assessment (what students know and how well they know it), as well as the methods used to make inferences about student learning, can be made more valid and instructionally useful. Principles for designing and using these new kinds of assessments are presented, and examples are used to illustrate the principles. Implications for policy, practice, and research are also explored.

With the promise of a productive research-based approach to assessment of student learning, Knowing What Students Know will be important to education administrators, assessment designers, teachers and teacher educators, and education advocates.



(2016) The Role of Assessment in Teaching and Learning

Danielle Wood-Wallace

Deeply embedded in the current education system is assessment. Within education, assessment is used to track and predict pupil achievement and can be defined as a means by which pupil learning is measured (Ronan, 2015). The delivery of teaching and learning within schools is often predetermined by what is assessed, with pupils actively being taught how to achieve the success criteria (appendix 7a). Recognised as a key professional competency of teachers (GTCNI, 2011) and the 6th quality in the Teachers’ Standards (DfE, 2011), assessment can be outlined as ‘the systematic collection, interpretation and use of information to give a deeper appreciation of what pupils know and understand, their skills and personal capabilities, and what their learning experiences enable them to do’ (CCEA, 2013: 4). The aims of the current essay are to venture further into the role of assessment in teaching and learning, paying particular attention to how formative and summative forms of assessment contribute to the discipline; and what impact these have at the classroom and the school level for both teachers and learners. The paper will examine my own experiences of using formative and summative assessment in the classroom, looking specifically at the summative processes I am aware of, before evaluating the purpose of Independent Thinking Time (ITT) and Talk Partners (TP); and how formative assessment can take place within these. In addition to this, the essay will also explore the role of Closing the Gaps (CTGs) in marking, and how questioning can assess conceptual understanding. These will be evaluated against the Teachers’ Standards. The essay will endeavour to foreground some potential challenges with formative and summative assessment (including what I have learned about assessment), before identifying some areas for future development and the strategies to facilitate these.



Educational Assessment 101: Definition, Types, Examples & Importance

Vipul Bhagia

Author & Editor at ProProfs

Vipul is a seasoned e-learning expert, specializing in crafting impactful learning experiences and designing employee training assessments. His passion lies in writing about tools that enhance online learning and training outcomes.


Whenever there is talk about educational assessments, the first image that floats into many people’s minds is students writing exams.

Well, the real picture is quite different.

Assessments are much more than just tests. They are also about exploring and measuring learning in different ways. 

Assessments help identify what students have learned, what they need to learn next, and how they can learn more effectively. They also help teachers plan, improve, and evaluate their teaching and educational programs better. 

But what is educational assessment, and why is it important for teachers and students?

In this blog post, I’ll answer these questions and more by taking you on a journey through the fascinating world of educational assessments.

Definition and Purpose of Educational Assessment 


Educational assessment is the process of collecting and analyzing evidence of learning in different settings and contexts. It helps teachers, students, parents, and other stakeholders understand what learners know and can do, as well as identify their strengths and areas for improvement.

Almost every educational environment today uses assessments, from end-of-chapter quizzes and final exams to large-scale standardized tests, such as Common Core-aligned state tests, the SAT, and the GRE.

Educational assessments can take many forms:

  • They may involve formal tests or performance-based activities.
  • They may be administered online or be paper-based.
  • They may be objective (requiring a single correct answer) or subjective (requiring an open-ended answer, such as in an essay).
  • They may be formative (carried out at various points during an educational course) or summative (carried out at the end of a course).


The purpose of educational assessment is to enhance learning and improve educational quality by enabling educators to make informed decisions that support learners’ growth and development.

Using education assessments, educators can:

  • Align instruction with learning goals and standards
  • Provide feedback and guidance to learners
  • Track learners’ progress and achievement over time
  • Adjust teaching strategies based on learners’ needs
  • Motivate and engage learners in their learning
  • Evaluate the effectiveness of educational programs
  • Communicate learners’ progress and achievement to stakeholders
  • Foster a culture of growth and accountability

Now that we know the definition and meaning of educational assessment and its aims, let us dive into the various types of assessments that educators can use to measure and enhance learning outcomes.

Different Types of Educational Assessments

Teachers and schools use different types of educational assessments to check how well students learn. Some of these types are mandated by the school system, while others are left to the teacher’s discretion.

1. Diagnostic Assessment

What do you do before you start teaching a new topic to your students? Do you just dive into the lesson and hope for the best? Or do you try to find out what students already know and can do and what they need to learn? 

If you choose the latter option, you’re using diagnostic assessment.


Diagnostic assessment checks your students’ prior knowledge and skills before you begin a new lesson or unit. It helps you identify their strengths and weaknesses and any misconceptions or gaps in their understanding. 

This way, you can plan your instruction to suit their needs and abilities and avoid wasting time on things they already know or don’t need to know.

There are many tools and techniques you can use for diagnostic assessment, such as:

  • Pre-tests: These are short quizzes or tests covering the main concepts or skills you will teach in the lesson or unit. They give you and your students a clear picture of how much they already know and what they still need to learn.
  • Surveys or questionnaires: These are questions you ask your students to rate their confidence or interest in the topic or to share their opinions or experiences. They give you and your students an insight into their attitudes and motivations toward learning.
  • Checklists: These are lists of skills or concepts your students are expected to master by the end of the lesson or unit. They give you and your students a way of tracking their progress and setting goals.


Diagnostic assessment is useful not only for teachers but also for students. It helps them become aware of their learning and set realistic expectations. It also helps them start the learning process on the same page as you and work towards achieving the desired outcomes.

2. Formative Assessment


Formative assessments are used throughout the educational process to identify problem areas and improve teaching and learning. They are not meant to grade students but to provide feedback and guidance for both students and teachers.

Some formative educational assessment examples are:

  • Quizzes: These are short tests that check students’ understanding of key concepts or skills. They can be given at the beginning, middle, or end of a lesson, graded or ungraded.
  • Exit tickets: These are short questions or tasks that students complete at the end of a lesson to show what they have learned or still have questions about.
  • Think-pair-share: This cooperative learning strategy involves students thinking individually about a question or problem, then discussing it with a partner, and finally sharing their ideas with the whole class.
  • Self-assessment: This process involves students reflecting on their learning and progress using rubrics, checklists, or portfolios.
  • Peer feedback: This process involves students giving and receiving constructive comments and suggestions from their classmates.
  • Flashcards: These are cards with questions on one side and answers on the other, useful for quick recall practice.

Formative assessments offer many benefits for students and teachers. For students, formative assessments can:

  • Increase their motivation and engagement in learning
  • Help them identify their strengths and weaknesses
  • Help them monitor their learning and set goals
  • Help them develop metacognitive and self-regulation skills

For teachers, formative assessments can:

  • Provide valuable information about students’ learning needs and progress
  • Help them adjust their instruction and provide differentiated support
  • Help them communicate effectively with students and parents
  • Help them evaluate the effectiveness of their teaching methods and materials

3. Summative Assessment

Summative assessments are used at the end of a learning block, such as a unit, a semester, or a year, to evaluate students’ achievement of the learning goals. They are designed to measure the outcomes that students are expected to demonstrate at the end of the instruction. 

They also provide feedback on the effectiveness of the learning process, the quality of the instruction, and the long-term impact of the learning. 

Some summative educational assessment examples are:

  • Final exams: These comprehensive tests cover the entire content and skills taught in a course or a grade level. They can be written or oral and include multiple-choice, short-answer, or essay questions.
  • Projects: These are complex tasks that require students to apply their knowledge and skills to create a product or a solution. They can be individual or group-based and involve research, design, or presentation.
  • Portfolios: These are collections of students’ work and achievements over time. They can include samples of their best work, reflections, feedback, and self-evaluations.
  • Standardized tests: These are tests that are administered and scored consistently across schools or districts. They can be used to measure students’ proficiency in specific subjects or skills or to compare their performance with other students or groups.


4. Standardized Assessments

Standardized assessments, or standardized tests, are given and scored consistently across large groups of students. They are often used to measure students’ proficiency in particular subjects or skills or to compare their performance with that of other students.

They also help teachers and students discover why a student might be struggling, succeeding, or accelerating on their grade-level standards and plan the next step in their assessment for learning.

Some standardized educational assessment examples are grade-level tests and the SAT.

These tests are usually objective, with question types like multiple-choice and true-or-false. However, some tests also include subjective items, like short-answer and essay questions. They can be given in person or online.


5. Performance-Based Assessments

Performance-based assessments require students to demonstrate their skills and knowledge in a specific domain, such as writing an essay or delivering a speech. They evaluate students’ ability to use what they have learned in a meaningful way rather than just memorizing facts or information. 

This type of assessment is becoming more popular as competency-based education gains traction. Competency-based education emphasizes students’ mastery of specific outcomes rather than following a fixed curriculum.

Performance-based assessments are usually conducted face-to-face, but they can also be done online in some cases. For example, students enrolled in a web development or graphic design course may showcase their learning by creating a digital project using online tools.

6. Norm & Criterion-Referenced Assessments

Referenced assessments compare students’ results to a particular standard. There are two types: norm-referenced and criterion-referenced.

Norm-referenced assessments compare a student’s performance with that of a large group of similar students, whose collective performance serves as the norm.

For example, imagine all students in Grade 9 take the same norm-referenced test. If a particular student scores in the 91st percentile, that means they did as well as or better than 91% of the sample serving as the norm.

Norm-referenced tests are often used to rank or classify students according to their relative abilities or achievements.
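To make the percentile idea concrete, here is a minimal sketch of how a percentile rank might be computed. The scores are hypothetical; real norm-referenced tests derive their norms from large, statistically representative samples and handle tied scores more carefully.

```python
# Percentile rank: the share of the norm group scoring below a given student.
# All scores here are hypothetical, for illustration only.

def percentile_rank(score, norm_group):
    below = sum(1 for s in norm_group if s < score)
    return round(100 * below / len(norm_group))

norm_group = [52, 61, 64, 68, 70, 73, 75, 79, 83, 91]  # ten Grade 9 scores
print(percentile_rank(83, norm_group))  # beats 8 of 10 peers -> prints 80
```

Note that a student’s percentile rank says nothing about mastery in absolute terms; it only locates the student within the comparison group.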

Criterion-referenced assessments are tests that measure students’ performance against a set of standards or criteria, which are based on the curriculum and learning objectives. 

For example, students in a class may be required to score 80% or higher on a particular test before moving on to the next concept. 

Criterion-referenced tests are often used to determine whether students have achieved the expected learning outcomes or demonstrated proficiency in a certain skill or subject.
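The contrast with norm-referencing can be sketched in code: a criterion-referenced judgment depends only on a fixed cutoff, not on how classmates perform. The 80% cutoff comes from the example above; the names and scores are hypothetical.

```python
# Criterion-referenced check: mastery is judged against a fixed cutoff,
# independent of other students' results.
MASTERY_CUTOFF = 80  # percent, per the example above

def has_mastered(score, cutoff=MASTERY_CUTOFF):
    return score >= cutoff

scores = {"Ana": 92, "Ben": 78, "Cleo": 80}
ready = [name for name, s in scores.items() if has_mastered(s)]
print(ready)  # -> ['Ana', 'Cleo']
```

Under this scheme, every student in a class could pass, or every student could fail; the cutoff, not the group, determines the outcome.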

7. Ipsative Assessments

The term “ipsative” comes from the Latin word “ipse,” which means “of the self.” Ipsative assessment compares a student’s performance with their own previous performance rather than with the performance of others or with a set of criteria.

Ipsative assessment measures a student’s personal improvement and progress over time rather than their absolute level of achievement. It can also help students identify their strengths and weaknesses and set realistic goals.

8. Alternative Assessments

Alternative assessments are different from traditional tests, such as multiple-choice or true-or-false questions. They can include various tasks and activities that allow students to demonstrate their learning in different ways. They are often more flexible and student-centered than traditional tests.

Some alternative educational assessment examples are:

  • Observations: Teachers watch and record students’ behaviors, actions, and interactions in the classroom or other settings.
  • Essays: Students write responses expressing their thoughts, opinions, or arguments on a topic or question.
  • Performance tasks: Students perform a task that requires them to apply their knowledge and skills to create a product or a solution.
  • Exhibitions and demonstrations: Students display or present their work or performance to the class or an audience.
  • Portfolios: Students collect and showcase their work and achievements over time.
  • Journals: Students record their learning experiences and thoughts.
  • Project work: Students work individually or collaboratively on a task that involves planning, researching, designing, and producing a product or a solution related to a real-world problem or issue.
  • Interviews: Teachers ask, and students answer questions about a topic or problem orally. They can do this in-person or via video interview mediums, such as video conferencing tools and online quizzes.


9. Authentic Assessments 

Authentic education assessments are alternative assessments that measure students’ ability to apply their knowledge and skills to real-world situations and problems. 

They require students to perform complex and action-oriented tasks that reflect the standards and expectations of the discipline or profession. They also require students to use higher-order critical thinking skills.

Some examples of authentic educational assessment tasks are:

  • Projects: Students plan, research, design, and produce a product or a solution related to a real-world problem or issue.
  • Portfolios: Students collect and showcase their work and achievements over time and reflect on their learning process and outcomes.
  • Real-world applications: Students apply their knowledge and skills to authentic contexts or situations they may encounter in their personal or professional lives.

Authentic assessments have many benefits for deeper learning and skill development. For example, they can:

  • Help students connect their learning to real-life situations and problems by simulating authentic contexts or scenarios
  • Help students develop transferable skills for their future careers or studies by requiring them to use higher-order, critical thinking skills
  • Help teachers evaluate the effectiveness of their teaching methods and materials by measuring the impact of their instruction on students’ learning outcomes

Now that we have discussed the various types of educational assessments and their purposes, let’s understand their importance in the education system and their role in improving teaching and learning.

Importance of Assessments in the Education System

Educational assessments are powerful tools that can enhance learning and teaching when done well. Some of the benefits of a good educational assessment are:

  • Progress monitoring: Assessments help educators track students’ progress and identify their strengths and weaknesses so that they can provide appropriate support and intervention.
  • Feedback: They provide feedback to students about their performance, which they can use to monitor their learning and set goals for improvement.
  • Motivation: They motivate students, as they know they will be assessed on what they have learned and how well they have learned it.
  • Alignment: They help educators align their instruction with the learning objectives and outcomes and determine the most effective strategies and methods to help students achieve them.
  • Curriculum improvement: They can improve the curriculum by identifying gaps, redundancies, or inconsistencies in the content and skills taught.
  • Evaluation: They can be used to evaluate teachers’ and school systems’ performance and the impact of different teaching practices on student learning.
  • Differentiation: They help educators differentiate their instruction and assessment according to their students’ diverse needs and abilities and provide them with multiple ways to demonstrate their learning.

These are the benefits of “good” educational assessment, but what makes assessments good?

What Makes an Educational Assessment “Good”?

A good educational assessment should follow three basic principles of quality:

1. Assessments should be aligned with defined objectives and outcomes 

Clarity about the desired knowledge and skills students are expected to learn and demonstrate is essential for designing and implementing effective assessments. The assessments should match the learning objectives and outcomes and measure the extent to which students have achieved them.

2. Assessments should be valid 

Validity refers to how well an assessment measures what it claims to measure. Different types of assessments are suitable for different types of learning, and they should be chosen accordingly.

For example, a multiple-choice test may be a valid way to assess students’ knowledge of historical facts but not their research skills.

3. Assessments should be reliable 

Reliability refers to how consistent an assessment is. A reliable assessment should produce similar results when administered under the same conditions and minimize errors and biases that may affect the scores.

For example, a reliable assessment should have clear instructions, fair scoring criteria, and adequate time limits.

We’ve gone through the basics of educational assessment and how quizzes are often used for it. But did you know you can create and share quizzes online easily? Before finishing this post, let’s check out the steps for making an online educational quiz.

How to Create an Educational Assessment Quiz Online

Here are the steps for creating an online educational assessment quiz. We’ve taken ProProfs Quiz Maker as an example here.

Step 1: Choose a quiz type

You can create a scored or personality quiz depending on your needs.


Step 2: Choose a template or make a quiz from scratch

You can pick from thousands of educational quiz templates or start with a blank quiz.


Step 3: Create or import your quiz questions

You can import questions from a bank of over a million ready-to-use questions or create your own. You can also add images and videos to make your quiz more engaging.


You can choose from 15+ question types, including multiple-choice, fill-in-the-blanks, matching, hotspot, video response, and more. 

Watch: How to Create a Quiz Using Question Bank & Templates

Step 4: Customize the look and settings of your quiz

You can change your quiz’s theme, colors, fonts, background image, and more to customize its look and feel and enhance its visual appeal.


You can also adjust the settings for scoring, feedback, security, cheating prevention, and more.

Watch: How to Configure Online Quiz Settings & Theme

Step 5: Share your quiz with your students

You can embed your quiz on your website or blog, share it on social media or email, or generate a link or QR code for your quiz.

Watch: How to Share Online Quizzes With Learners

Check our comprehensive online assessment guide to learn more about creating online assessment quizzes for education.

Boost Teaching Outcomes With Educational Assessments

Educational assessment is a vital process that helps teachers, students, and other stakeholders measure and improve learning outcomes. It can take various forms, such as formative, summative, diagnostic, and more. Educators can gain a comprehensive picture of students’ strengths, weaknesses, progress, and needs using various assessment methods and tools.  

One of the most effective tools for creating online assessments is ProProfs Quiz Maker. It lets you create quizzes and exams in minutes, ask questions in 15+ different ways, and customize your quizzes with multimedia, themes, custom settings, and feedback options, making learning fun and interactive for your students.

Frequently Asked Questions:

What do you mean by educational assessment?

Educational assessment is the systematic process of documenting and using empirical data on the knowledge, skills, attitudes, aptitudes, and beliefs of students to refine programs and improve student learning.

What is an example of an educational assessment?

Some examples of educational assessments are:

  • Standardized tests: Tests that measure students’ performance against pre-defined goals or outcomes
  • Pop quizzes: Short, informal tests that check students’ understanding of a topic or lesson
  • Portfolios: Collections of students’ work that demonstrate their skills, knowledge, and progress over time
  • Performance assessments: Tasks or activities that require students to demonstrate their understanding and application of concepts or skills in a real-world context
  • Rubrics: Scoring guides that describe the criteria and levels of quality for students’ work
  • Self-assessments: Processes that involve students evaluating their own learning and identifying their strengths and areas for improvement
  • Peer-assessments: Processes that involve students giving and receiving feedback from their classmates on their work

What is the need for educational assessment?

Educational assessment helps monitor learning progress, provide feedback to students on their strengths and weaknesses, guide instruction and curriculum development, evaluate the effectiveness and quality of educational programs, and influence educational policies and decisions.

About the author

Vipul Bhagia

Vipul Bhagia is an e-learning expert and content creator, specializing in instructional design. He excels in crafting compelling e-learning modules and designing effective employee training assessments. He is passionate about leveraging digital solutions to transform work culture and boost productivity. Vipul enjoys exploring emerging tech innovations and sharing his insights with fellow industry professionals.


Assessment and Evaluation Compare & Contrast Essay


Assessment is an interactive process that provides teachers, parents or guardians, and the students themselves with valid information about progress toward and attainment of the expected curriculum outcomes. It focuses on teaching, learning, and outcomes. The main goal of assessment is to improve student learning in the subject under study.

Assessments are based on achievement goals and standards developed for a particular curriculum grade. Assessment is done to collect information on individual student performance within a given time frame. Evidence of learning may include tests and portfolios, as well as other learning tasks such as journals and written work.

The outcome information can be shared with the students to make improvements. Student learning can be improved by changing the learning environment or study habits. The subject of assessment can be of any type: a happening or event, an individual, a place, or a condition. The subject matter is learner-centered, course-based, and often anonymous and ungraded. Assessment seeks to record all data, whether subjective or objective (Jere, 2010).

On the other hand, an evaluation is a set of activities or statements that seek to determine whether objectives were realized. It focuses largely on grades and may reflect components of the classroom other than mastery level and course content, such as discussion, attendance, verbal ability, and cooperation. It is the last stage of an inquiry.

Evaluations tell whether a set goal or a solution has been met. Evaluation takes place after completion of a learning activity, at the end of the inquiry, and can result in three things: a positive change, a negative change, or no change at all. Evaluation looks into whether improvements or changes have occurred in the data. Assessment and evaluation need each other and support one another (Gavi, 2011).

In summary, there are three differences between assessment and evaluation. Assessment is formative in the sense that it is ongoing and meant to improve learning, while evaluation is summative: it is final and meant to gauge quality. Assessment focuses on how learning is going (process-oriented), while evaluation focuses on what has been learned (product-oriented). Assessment identifies areas for improvement (it is diagnostic), while evaluation arrives at an overall grade (it is judgmental) (Patty, 2004).

Formal assessments have common sets of expectations for all students. These tests help teachers understand how well students have grasped the skills and concepts taught in class, allowing teachers to evaluate students systematically through real writing and reading experiences. They come with prescribed criteria for interpretation: data is computed and summarized mathematically, there are criteria for scoring, and scores are commonly reported as standard scores, percentiles, or stanines.

Teachers have statistics that can support certain conclusions, such as “a student is reading below average,” because these tests have been tried on students before. Flexibility in assessment outcomes gives teachers another chance to closely monitor students and modify assessment as required. These benchmarks thus help teachers and guardians evaluate student progress over the entire year (Gavi, 2011).

On the other hand, informal assessment refers to techniques that are incorporated into learning activities or classroom routines. Informal assessments are also called performance-based measures or criterion-referenced measures. They should be used as part of instruction, and the type of assessment used should be in line with the purpose of assessment. They can be employed at any time without interfering with instructional time.

The results obtained indicate the performance of the student on that particular subject or skill of interest. Activities associated with informal assessment include demonstrations, oral presentations, individual projects, and experiments, among others. This type of assessment does not compare students against a broader group beyond those in that particular local project. Unlike formal assessments, which are data driven, informal assessments are content and performance driven (Patty, 2004).

Social studies are integrated studies meant to improve the civic responsibility of students. I have chosen history as the social studies subject for my informal assessment below. The informal assessment test is designed for students in the elementary grades. A fourth-grade teacher has just finished a three-hour lesson on a topic in the state’s history.

He intends to check the instructional effectiveness and students’ understanding of this topic by employing a type of discussion that takes the form of written checks. The data obtained will in turn assist the teacher in planning and using data-based instruction during the next teaching period.

Gavi, R. M. (2011). Dyslexia: Special Educational Needs Series. New York, NY: International Publishing Group.

Jere, E. B. (2010). Effective Assessment and Evaluation. Chicago: Taylor & Francis.

Patty, S. A. (2004). Making Sense of Online Learning: A Guide for Beginners and the Truly Skeptical. New York, NY: John Wiley & Sons.



Assessing Student Writing

  • What Does It Mean to Assess Writing?
  • Suggestions for Assessing Writing
  • Means of Responding
  • Rubrics: Tools for Response and Assessment
  • Constructing a Rubric

Assessment is the gathering of information about student learning. It can be used for formative purposes (to adjust instruction) or summative purposes (to render a judgment about the quality of student work). It is a key instructional activity, and teachers engage in it every day in a variety of informal and formal ways.

Assessment of student writing is a process. Assessment of student writing and performance in the class should occur at many different stages throughout the course and could come in many different forms. At various points in the assessment process, teachers usually take on different roles such as motivator, collaborator, critic, evaluator, etc., (see Brooke Horvath for more on these roles) and give different types of response.

One of the major purposes of writing assessment is to provide feedback to students. We know that feedback is crucial to writing development. The 2004 Harvard Study of Writing concluded, "Feedback emerged as the hero and the anti-hero of our study - powerful enough to convince students that they could or couldn't do the work in a given field, to push them toward or away from selecting their majors, and contributed, more than any other single factor, to students' sense of academic belonging or alienation" (http://www.fas.harvard.edu/~expos/index.cgi?section=study).

Source: Horvath, Brooke K. "The Components of Written Response: A Practical Synthesis of Current Views." Rhetoric Review 2 (January 1985): 136-56. Rpt. in Corbett, Edward P. J., Nancy Myers, and Gary Tate, eds. The Writing Teacher's Sourcebook. 4th ed. New York: Oxford Univ. Press, 2000.

Suggestions for Assessing Student Writing

Be sure to know what you want students to be able to do and why. Good assessment practices start with a pedagogically sound assignment description and learning goals for the writing task at hand. The type of feedback given on any task should depend on the learning goals you have for students and the purpose of the assignment. Think early on about why you want students to complete a given writing project (see the guide to writing strong assignments). What do you want them to know? What do you want students to be able to do? Why? How will you know when they have reached these goals? What methods of assessment will allow you to see that students have accomplished these goals (portfolio assessment, multiple drafts, rubrics, etc.)? What will distinguish the strongest projects from the weakest?

Begin designing writing assignments with your learning goals and methods of assessment in mind.

Plan and implement activities that support students in meeting the learning goals. How will you support students in meeting these goals? What writing activities will you allow time for? How can you help students meet these learning goals?

Begin giving feedback early in the writing process. Give multiple types of feedback early in the writing process. For example, talk with students about their ideas, write responses on drafts, and have students respond to their peers' drafts in process. These are all ways for students to receive feedback while they are still in the process of revising.

Structure opportunities for feedback at various points in the writing process. Students should also have opportunities to receive feedback on their writing at various stages in the writing process. This does not mean that teachers need to respond to every draft of a writing project. Structuring time for peer response and group workshops can be a very effective way for students to receive feedback from other writers in the class and for them to begin to learn to revise and edit their own writing.

Be open with students about your expectations and the purposes of the assignments. Students respond better to writing projects when they understand why the project is important and what they can learn through the process of completing it. Be explicit about your goals for them as writers and why those goals are important to their learning. Additionally, talk with students about methods of assessment. Some teachers have students help collaboratively design rubrics for the grading of writing. Whatever methods of assessment you choose, be sure to let students in on how they will be evaluated.

Do not burden students with excessive feedback. Our instinct as teachers, especially when we are really interested in students' writing, is to offer as many comments and suggestions as we can. However, providing too much feedback can leave students feeling daunted and uncertain where to start in terms of revision. Try to choose one or two things to focus on when responding to a draft. Offer students concrete possibilities or strategies for revision.

Allow students to maintain control over their paper. Instead of acting as an editor, suggest options or open-ended alternatives the student can choose for their revision path. Help students learn to assess their own writing and the advice they get about it.

Purposes of Responding

We provide different kinds of response at different moments. But we might also fall into a kind of "default" mode, working to get through the papers without making a conscious choice about how and why we want to respond to a given assignment. So it might be helpful to identify the two major kinds of response we provide:

  • Formative Response: response that aims primarily to help students develop their writing. Might focus on confidence-building, or on engaging the student in a conversation about her ideas or writing choices so as to help her see herself as a successful and promising writer. Might focus on helping the student develop a particular writing project from one draft to the next. Or might suggest some general skills she could focus on developing over the course of a semester.
  • Evaluative Response: response that focuses on evaluating how well a student has done. Might be related to a grade. Might be used primarily on a final product or portfolio. Tends to emphasize whether or not the student has met the criteria operative for the specific assignment and to explain that judgment.

We respond to many kinds of writing and at different stages in the process, from reading responses, to exercises, to generation or brainstorming, to drafts, to source critiques, to final drafts. It is also helpful to think of the various forms that response can take.

  • Conferencing: verbal, interactive response. This might happen in class or during scheduled sessions in offices. Conferencing can be more dynamic: we can ask students questions about their work, modeling a process of reflecting on and revising a piece of writing. Students can also ask us questions and receive immediate feedback. Conferencing is typically a formative response mechanism, but might also serve usefully to convey evaluative response.
  • Written Comments on Drafts
  • Local: when we focus on "local" moments in a piece of writing, we are calling attention to specifics in the paper. Perhaps certain patterns of grammar or moments where the essay takes a sudden, unexpected turn. We might also use local comments to emphasize a powerful turn of phrase, or a compelling and well-developed moment in a piece. Local commenting tends to happen in the margins, to call attention to specific moments in the piece by highlighting them and explaining their significance. We tend to use local commenting more often on drafts and when doing formative response.
  • Global: when we focus more on the overall piece of writing and less on the specific moments in and of themselves. Global comments tend to come at the end of a piece, in narrative-form response. We might use these to step back and tell the writer what we learned overall, or to comment on a piece's general organizational structure or focus. We tend to use these for evaluative response and often, deliberately or not, as a means of justifying the grade we assigned.
  • Rubrics: charts or grids on which we identify the central requirements or goals of a specific project. Then, we evaluate whether or not, and how effectively, students met those criteria. These can be written with students as a means of helping them see and articulate the goals of a given project.

Rubrics are tools teachers and students use to evaluate and classify writing, whether individual pieces or portfolios. They identify and articulate what is being evaluated in the writing, and offer "descriptors" to classify writing into certain categories (1-5, for instance, or A-F). Narrative rubrics and chart rubrics are the two most common forms. Here is an example of each, using the same classification descriptors:

Example: Narrative Rubric for Inquiring into Family & Community History

An "A" project clearly and compellingly demonstrates how the public event influenced the family/community. It shows strong audience awareness, engaging readers throughout. The form and structure are appropriate for the purpose(s) and audience(s) of the piece. The final product is virtually error-free. The piece seamlessly weaves in several other voices, drawn from appropriate archival, secondary, and primary research. Drafts - at least two beyond the initial draft - show extensive, effective revision. Writer's notes and final learning letter demonstrate thoughtful reflection and growing awareness of writer's strengths and challenges.

A "B" project clearly and compellingly demonstrates how the public event influenced the family/community. It shows strong audience awareness, and usually engages readers. The form and structure are appropriate for the audience(s) and purpose(s) of the piece, though the organization may not be tight in a couple places. The final product includes a few errors, but these do no interfere with readers' comprehension. The piece effectively, if not always seamlessly, weaves several other voices, drawn from appropriate archival, secondary, and primary research. One area of research may not be as strong as the other two. Drafts - at least two beyond the initial drafts - show extensive, effective revision. Writer's notes and final learning letter demonstrate thoughtful reflection and growing awareness of writer's strengths and challenges.

A "C" project demonstrates how the public event influenced the family/community. It shows audience awareness, sometimes engaging readers. The form and structure are appropriate for the audience(s) and purpose(s), but the organization breaks down at times. The piece includes several, apparent errors, which at times compromises the clarity of the piece. The piece incorporates other voices, drawn from at least two kinds of research, but in a generally forced or awkward way. There is unevenness in the quality and appropriateness of the research. Drafts - at least one beyond the initial draft - show some evidence of revision. Writer's notes and final learning letter show some reflection and growth in awareness of writer's strengths and challenges.

A "D" project discusses a public event and a family/community, but the connections may not be clear. It shows little audience awareness. The form and structure is poorly chosen or poorly executed. The piece includes many errors, which regularly compromise the comprehensibility of the piece. There is an attempt to incorporate other voices, but this is done awkwardly or is drawn from incomplete or inappropriate research. There is little evidence of revision. Writer's notes and learning letter are missing or show little reflection or growth.

An "F" project is not responsive to the prompt. It shows little or no audience awareness. The purpose is unclear and the form and structure are poorly chosen and poorly executed. The piece includes many errors, compromising the clarity of the piece throughout. There is little or no evidence of research. There is little or no evidence of revision. Writer's notes and learning letter are missing or show no reflection or growth.

Chart Rubric for Community/Family History Inquiry Project

Criterion | A | B | C | D | F
Influence of event | Clearly and compellingly demonstrates influence of event | Clearly and compellingly demonstrates influence of event | Demonstrates influence of event | Discusses event; connections unclear | Not responsive to prompt
Audience awareness | Strong audience awareness; engages throughout | Strong audience awareness; usually engages | Audience awareness; sometimes engages | Little audience awareness | Little or no audience awareness
Form and structure | Appropriate for audience(s), purpose(s) | Appropriate for audience(s), purpose(s); organization occasionally not tight | Appropriate for audience(s), purpose(s); organization breaks down at times | Poorly chosen or poorly executed | Poorly chosen and executed
Errors | Virtually error-free | Few, unobtrusive errors | Several apparent, sometimes obtrusive errors | Many, obtrusive errors | Many obtrusive errors
Research and voices | Seamlessly weaves voices; 3 kinds of research | Effectively weaves voices; 3 kinds of research; 1 may not be as strong | Incorporates other voices, but awkwardly; at least 2 kinds of research | Attempts to incorporate voices, but awkwardly; poor research | Little or no evidence of research
Revision | Extensive, effective (at least 2 drafts beyond 1st) | Extensive, effective (at least 2 drafts beyond 1st) | Some evidence of revision | Little evidence of revision | No evidence of revision
Reflection | Thoughtful reflection; growing self-awareness | Thoughtful reflection; growing self-awareness | Some evidence of reflection, growth | Little evidence of reflection | Little or no evidence of reflection

All good rubrics begin (and end) with solid criteria. We always start working on rubrics by generating a list - by ourselves or with students - of what we value for a particular project or portfolio. We generally list far more items than we could use in a single rubric. Then, we narrow this list down to the most important items - between 5 and 7, ideally. We do not usually rank these items in importance, but it is certainly possible to create a hierarchy of criteria on a rubric (usually by listing the most important criteria at the top of the chart or at the beginning of the narrative description).

Once we have our final list of criteria, we begin to imagine how writing would fit into a certain classification category (1-5, A-F, etc.). How would an "A" essay differ from a "B" essay in Organization? How would a "B" story differ from a "C" story in Character Development? The key here is to identify useful descriptors - drawing the line at appropriate places. Sometimes, these gradations will be precise: the difference between handing in 80% and 90% of weekly writing, for instance. Other times, they will be vague: the difference between "effective revisions" and "mostly effective revisions", for instance. While it is important to be as precise as possible, it is also important to remember that rubric writing (especially in writing classrooms) is more art than science and will never (nor should it) stand in for algorithms. When we find ourselves getting caught up in minute gradations, we tend to be overlegislating students' writing and losing sight of the purpose of the exercise: to support students' development as writers. At the moment when rubric-writing thwarts rather than supports students' writing, we should discontinue the practice. Until then, many students will find rubrics helpful - and sometimes even motivating.
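For teachers who keep their rubrics digitally, a chart rubric maps naturally onto a small lookup structure, which makes it easy to reuse the same descriptors consistently across feedback. The sketch below is a minimal, hypothetical example: the criteria, grade labels, and descriptor wording are illustrative, drawn loosely from the example rubric above, not from any specific tool.

```python
# A chart rubric as a nested mapping: criterion -> grade level -> descriptor.
# Criteria and wording are illustrative; adapt them to your own project goals.
RUBRIC = {
    "Audience awareness": {
        "A": "Strong audience awareness; engages throughout",
        "B": "Strong audience awareness; usually engages",
        "C": "Audience awareness; sometimes engages",
        "D": "Little audience awareness",
        "F": "Little or no audience awareness",
    },
    "Revision": {
        "A": "Extensive, effective revision",
        "B": "Extensive, effective revision",
        "C": "Some evidence of revision",
        "D": "Little evidence of revision",
        "F": "No evidence of revision",
    },
}

def describe(criterion: str, grade: str) -> str:
    """Look up the descriptor for one criterion at one grade level."""
    return RUBRIC[criterion][grade]

print(describe("Revision", "C"))  # prints: Some evidence of revision
```

Keeping the descriptors in one place like this also makes it straightforward to print the full chart for students or to paste the same language into comments on drafts.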

Center for Educational Innovation

Essay Exams

Essay exams provide opportunities to evaluate students’ reasoning skills, such as the ability to compare and contrast concepts, justify a position on a topic, interpret cases from the perspective of different theories or models, evaluate a claim or assertion with evidence, design an experiment, and other higher-level cognitive skills. They can reveal whether students understand the theory behind course material or how different concepts and theories relate to each other.

+ Advantages and challenges of essay exams

Advantages:

  • Can be used to measure higher-order cognitive skills
  • Questions take relatively little time to write
  • Difficult for respondents to get correct answers by guessing

Challenges:

  • Can be time consuming to administer and to score
  • Can be challenging to identify measurable, reliable criteria for assessing student responses
  • Limited range of content can be sampled during any one testing period
  • Timed exams in general add stress unrelated to a student's mastery of the material

+ Creating an essay exam

  • Limit the use of essay questions to learning aims that require learners to share their thinking processes, connect and analyze information, and communicate their understanding for a specific purpose. 
  • Write each item so that students clearly understand the specific task and what deliverables are required for a complete answer (e.g. diagram, amount of evidence, number of examples).
  • Indicate the relative amount of time and effort students should spend on each essay item, for example “2 – 3 sentences should suffice for this question”.
  • Consider using several narrowly focused items rather than one broad item.
  • Consider offering students choice among essay questions, while ensuring that all learning aims are assessed.

When designing essay exams, consider the reasoning skills you want to assess in your students. The following skills can each be targeted with well-chosen question stems; Piontek (2008) provides example stems for each skill.

Skills to assess (from the table in Piontek, 2008, which pairs each skill with possible question stems):

  • Comparing
  • Relating Cause and Effect
  • Justifying
  • Summarizing
  • Generalizing
  • Inferring
  • Classifying
  • Creating
  • Applying
  • Analyzing
  • Synthesizing

+ Preparing students for an essay exam

Adapted from Piontek, 2008

Prior to the essay exam

  • Administer a formative assessment that asks students to write a brief response to a question similar to one you will use on the exam, and provide them with feedback on their responses.
  • Provide students with examples of essay responses that do and do not meet your criteria and standards. 
  • Provide students with the learning aims they will be responsible for mastering to help them focus their preparation appropriately.
  • Have students apply the scoring rubric to sample essay responses and provide them with feedback on their work.

Resource video: a 2-minute description of a formative assessment that helps prepare students for an essay exam.

+ Administering an essay exam

  • Provide adequate time for students to take the assessment. A strategy some instructors use is to time themselves answering the exam questions completely and then multiply that time by 3-4.
  • Endeavor to create a distraction-free environment.
  • Review the suggestions for informal accommodations for multilingual learners, which may be helpful in setting up an essay exam for all learners.
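The timing rule of thumb above (the instructor's own completion time multiplied by 3-4) is easy to express as a quick calculation. A small sketch; the function name and default multiplier are ours, only the 3-4x rule comes from the text:

```python
def suggested_exam_minutes(instructor_minutes: float,
                           multiplier: float = 3.5) -> float:
    """Estimate how much time to allow students, based on the
    instructor's own completion time and the 3-4x rule of thumb."""
    if not 3.0 <= multiplier <= 4.0:
        raise ValueError("the rule of thumb uses a 3-4x multiplier")
    return instructor_minutes * multiplier

# If the instructor needs 20 minutes, allow roughly 60-80 minutes.
print(suggested_exam_minutes(20, 3.0))  # 60.0
print(suggested_exam_minutes(20, 4.0))  # 80.0
```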

+ Grading an essay exam

To ensure essays are graded fairly and without bias:

  • Outline what constitutes an acceptable answer (criteria for knowledge and skills).
  • Select an appropriate scoring method based on the criteria.
  • Clarify the role of writing mechanics and other factors independent of the learning aims being measured.
  • Share the criteria and scoring method with students ahead of time.
  • Use a systematic process for scoring each essay item. For instance, score all responses to a single question in one sitting.
  • Anonymize student work where possible to ensure fairer, more objective feedback. For example, students could use their student ID number in place of their name.
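Two of the practices above, scoring all responses to one question in a single sitting and anonymizing work by student ID, combine naturally into a regrouping step before grading begins. A minimal sketch; the data layout is invented for illustration:

```python
# Hypothetical submissions: anonymous student ID -> {question: response}.
submissions = {
    "1234": {"Q1": "response A1", "Q2": "response A2"},
    "5678": {"Q1": "response B1", "Q2": "response B2"},
}

def responses_by_question(subs):
    """Regroup per-student submissions by question, so that all
    responses to one question can be scored together in one sitting,
    identified only by anonymous student ID."""
    grouped = {}
    for student_id, answers in subs.items():
        for question, text in answers.items():
            grouped.setdefault(question, []).append((student_id, text))
    return grouped

grouped = responses_by_question(submissions)
print(sorted(grouped))     # ['Q1', 'Q2']
print(len(grouped["Q1"]))  # 2
```

Grading question by question, rather than student by student, also makes it easier to apply one criterion consistently before moving to the next item.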

+ References & Resources

  • For more information on setting criteria, preparing students, and grading essay exams, read: Boye, A. (2019). Writing Better Essay Exams. IDEA Paper #76.
  • For more detailed descriptions of how to develop and score essay exams, read: Piontek, M.E. (2008). Best Practices for Designing and Grading Exams. CRLT Occasional Paper #24.

Web resources

  • Designing Effective Writing Assignments  (Teaching with Writing Program - UMNTC ) 
  • Writing Assignment Checklist (Teaching with Writing Program - UMNTC)
  • Designing and Using Rubrics (Center for Writing - UMTC)

How capable is a synopsis chatbot as a sparring partner for doctoral students? An explorative case study. Some design-based, theoretical and pedagogical implications from an early phase of the innovation

29 Pages Posted: 27 Jun 2024

Rune Johan Krumsvik

University of Bergen - Faculty of Psychology

Date Written: June 22, 2024

The advent of AI technologies, particularly in higher education, can offer new opportunities for feedback and formative assessment, but the knowledge base is limited. In this case study I examine this in light of doctoral education where, despite favorable employment conditions, only two-thirds of PhD candidates in Norway complete their doctorate. There are many reasons for this, but it is partly due to the challenges posed by the introduction of article-based PhD dissertations, which consist of scientific articles and a comprehensive extended summary (called a "synopsis", 60-90 pages). Our research from 2016 to 2024 highlights ambiguities and unintended double standards across disciplines concerning this dissertation format, despite general national guidelines for doctoral dissertations. There seems to be a blind spot in the current state of knowledge regarding whether, and if so how, rubrics and domain-specific chatbots can contribute to this area and to formative assessment at the PhD level. This position paper focuses on design-based research with an exploratory case study in which I trained a synopsis chatbot (on top of GPT-4) on Norwegian doctoral rubrics and literature about the article-based dissertation. The goal was to identify if, and how, the chatbot could function as a sparring partner for PhD candidates working on the synopsis of their article-based dissertations in the Norwegian doctoral context. This paper describes the early phase of developing the synopsis chatbot and explores its theoretical and pedagogical implications. Developing such a chatbot can be crucial since GPT-4 is generic, primarily trained on English-language sources and Anglo-American PhD conditions, and lacks anchoring in Norwegian PhD regulations, the Norwegian article-based genre, Norwegian culture, and the Norwegian language.
Therefore, the synopsis chatbot was trained to make GPT-4 more domain-specific (academic writing at the PhD level), context-specific (the Norwegian article-based dissertation), and multilingual (sensitive to both Norwegian and English). The case study results indicate varying levels of AI acceptance among PhD supervisors, with 6 out of 10 expressing skepticism about using AI for academic writing. The PhD candidates also have varied experience with ChatGPT and GPT-4. The study further shows that theoretical underpinnings and rubrics are essential for the chatbot's configuration. Pre-testing showed that the synopsis chatbot performs well in providing formative assessment and handling multimodal illustrations, proving to be a valuable sparring partner for PhD candidates. An implication of the paper is to ask whether conventional theories are capable of encapsulating this new educational AI terrain. Therefore, this paper suggests a new, updated definition of formative assessment that encapsulates AI and chatbots as complementary sparring partners and "digital supervisors" within doctoral education. The synopsis chatbot can potentially help mitigate some issues related to unwritten rules and vague genre requirements, complementing traditional PhD supervision in academic writing and formative assessment. However, several limitations exist, since this is an exploratory case study in an early phase of the innovation, and several ethical considerations must be addressed.

Keywords: AI, GPT-4, Synopsis-chatbot, doctoral students, formative assessment, rubrics

