Glossary of Student Evaluation Terms

Reprinted from The Student Evaluation Standards (Corwin, 2003) with permission from the publisher and The Joint Committee on Standards for Educational Evaluation, Arlen R. Gullickson, Chair.

"The terms in this glossary are defined as they are used in this volume, in the context of student evaluation. In other settings, a number of them may have different or less specialized definitions." (p. 225)

Table of Contents

[A-C] [D-F] [G-I] [J-L] [M-O] [P-R] [S-U] [V-Z] [Table Of Contents]

A-C

Accuracy The extent to which an evaluation conveys technically adequate information about the performance and qualifications of a student.

Achievement What a student has learned as a result of formal instruction, usually in school.

Achievement test Assessment method, usually in paper-and-pencil format, designed to measure student competency or acquired knowledge, skills, attitude, or behavior in relation to specified learner expectations.

Analytic scoring The use of a scoring key containing an ideal response to judge the competence or proficiency of student responses on an assessment.

Anecdotal record A short, written report of an individual's behavior in a specific situation or circumstance.

Anonymity (provision for) A situation in which it is not possible to identify an individual.

Appropriate user A person who has a legitimate right or the consent of a student and, if necessary, the student's parents/guardians to see the results and findings of the evaluation of the student.

Aptitude A student's capability or potential for performing a particular task or skill.

Assessment The process of collecting information about a student to aid in decision making about the progress and development of the student.

Assessment method A strategy or technique evaluators may use to acquire evaluation information. These include but are not limited to observations, text- and curriculum-embedded questions and tests, paper-and-pencil tests, oral questioning, benchmarks or reference sets, interviews, peer- and self-assessments, standardized criterion-referenced and norm-referenced tests, performance assessments, writing samples, exhibits, portfolio assessment, and project and product assessments.

Audiences Those persons to be guided by the results of student evaluations in making decisions about the development and progress of students and all others with an interest in the evaluation results and findings.

Audit (of an evaluation) An independent examination and verification of the quality of an evaluation plan, the adequacy of its implementation, the accuracy of results, and the validity of conclusions.

Authentic assessment Method of assessment in which the student is expected to demonstrate his or her competence and proficiency through the completion of a task that mimics a job or a higher educational or life skill.

Behavior Specific, observable actions of a student in response to internal and external stimuli.

Benefit An advantageous consequence of a program or action.

Bias A constant error; any systematic influence (on measures or on statistical results) that is irrelevant to the purpose of the evaluation.

Checklist A list of performance criteria for a particular activity or product on which an observer marks the student's performance on each criterion using a scale that has only two points (e.g., present or absent, adequate or inadequate).

Competency A skill, knowledge, or experience that is suitable or sufficient for a specified purpose.

Conclusions (of an evaluation) The final judgments and recommendations resulting from the assessment information collected about a student.

Confidentiality (provision for) Situation in which the identity of students will not be released to other individuals or institutions beyond the teacher or others who evaluate students.

Conflict of interest A situation in which an evaluator's private interests affect her or his evaluative actions, or in which the evaluative actions might affect private interests.

Construct A characteristic or trait of individuals inferred from empirical evidence (e.g., numerical ability).

Construct irrelevance Occurs when the assessment used to measure an educational or psychological construct includes items or measures that are not relevant (extraneous) to the construct and cause scores to be different from what they should be.

Construct underrepresentation Occurs when some of the aspects that represent the construct to be addressed are not included in the assessment used to measure it.

Context The set of circumstances or acts that surround and may affect a particular student, learning situation, classroom, or school.

Contextual variables Indicators or dimensions that are useful in describing the facts or circumstances that surround a particular learning situation and influence a student's performance in that situation.

Correlation The degree to which two or more sets of measurements vary together; e.g., a positive correlation exists when high values on one scale are associated with high values on another; a negative correlation exists when high values on one scale are associated with low values on another.
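
As an illustration of the positive and negative cases described above, the following minimal sketch computes the Pearson correlation coefficient, one common index of correlation for two sets of scores. The glossary does not name a specific index, so the choice of Pearson's r, the function name `pearson_r`, and the quiz scores are all assumptions made for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores for five students (illustrative only).
quiz_a = [1, 2, 3, 4, 5]
quiz_b = [2, 4, 6, 8, 10]   # high values go with high values on quiz_a
quiz_c = [10, 8, 6, 4, 2]   # high values go with low values on quiz_a

r_positive = pearson_r(quiz_a, quiz_b)
r_negative = pearson_r(quiz_a, quiz_c)
```

Here `r_positive` is at the positive extreme and `r_negative` at the negative extreme because the made-up scores are perfectly linear; real classroom data would normally fall somewhere in between.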

Credibility Believability or confidence by virtue of being trustworthy and possessing pertinent knowledge, skills, and experience.

Criterion-referenced Performance interpreted in relation to prespecified standards.

Critical score A specified point in a predictor distribution of scores below which candidates are rejected or considered not to have reached a minimum standard of performance; also called a cut score.

Cross validation The application of a scoring system or set of weights empirically derived in one sample to a different sample drawn from the same population to investigate the stability of relationships based on the original weights.

Curriculum The knowledge, skills, attitudes, behaviors, and values students are expected to learn from schooling; includes statement of expected student outcomes, descriptions of material and activities, and the planned sequence that will be used to help students acquire the expected outcomes.



D-F

Data Evidence, in either numerical or narrative form, gathered during the course of an evaluation and that serves as the basis for information, discussion, and inference.

Data access Conditions under which access to information is provided, including who has the access.

Data analysis The process of organizing, summarizing, and interpreting numerical, narrative, or artifact data, so that the results can be validly interpreted and used to guide future development of students.

Data collection procedures The set of steps used to obtain quantitative or qualitative information about the knowledge, skills, attitudes, or behaviors possessed by a student.

Decision consistency coefficient Like the reliability coefficient, this is a calculated value that tells the extent to which the decision results (e.g., classifications) would be the same if the process were repeated. It is often used in either-or decision situations (e.g., mastery-nonmastery). Like the reliability coefficient, a coefficient of zero (0) means no consistency and a coefficient of one (1.0) means fully consistent.
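
A rough sketch of how such a value might be calculated for mastery-nonmastery decisions from two administrations follows. The glossary does not specify a formula; the two indices used here (the plain proportion of matching decisions and Cohen's kappa, which adjusts for agreement expected by chance) are assumptions chosen for illustration, as are the function name and the data.

```python
def decision_consistency(first, second):
    """Return (proportion of matching decisions, Cohen's kappa).

    `first` and `second` are equal-length lists of True/False mastery
    decisions for the same students on two occasions.
    """
    n = len(first)
    if n == 0 or n != len(second):
        raise ValueError("need two non-empty lists of equal length")
    agreement = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from each occasion's mastery rate.
    p1 = sum(first) / n
    p2 = sum(second) / n
    chance = p1 * p2 + (1 - p1) * (1 - p2)
    if chance == 1:
        raise ValueError("kappa is undefined when every decision is identical")
    kappa = (agreement - chance) / (1 - chance)
    return agreement, kappa

# Hypothetical mastery decisions for eight students (illustrative only).
first_try = [True, True, True, False, False, False, True, False]
second_try = [True, True, False, False, False, True, True, False]

agreement, kappa = decision_consistency(first_try, second_try)
```

In this made-up example six of the eight decisions match. Kappa is lower than the raw proportion because some matches would be expected by chance alone.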

Dependability A measure of how consistent the results obtained in an assessment are in a criterion-referenced evaluation; consistency of decisions in relation to prespecified standards (see Reliability).

Design (evaluation) A representation of the set of decisions that determine how a student evaluation is to be conducted; e.g., identifying purposes and use of the information, developing or selecting assessment methods, collecting assessment information, judging and scoring student performance, summarizing and interpreting results, reporting evaluation findings, and following up on evaluation results.

Diagnosis Identification of specific strengths and weaknesses in a student's learning.

Discrimination index An index that indicates how well an item distinguishes between the students who understand the content being assessed and those who do not. Positive discrimination indicates that the item or task is discriminating in the same way as the assessment method of which it is a part.
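
One way such an index is often computed is the upper-lower group method: the proportion answering the item correctly in the highest-scoring group minus the proportion in the lowest-scoring group. The sketch below assumes that method; the glossary does not prescribe it, and the 27 percent group size (a conventional choice), the function name, and the data are all illustrative assumptions.

```python
def discrimination_index(item_correct, total_scores, fraction=0.27):
    """Upper-lower discrimination index for one item.

    item_correct: 1 if the student answered the item correctly, else 0.
    total_scores: each student's total score on the whole assessment.
    fraction: share of students placed in each extreme group (assumed 27%).
    """
    n = len(total_scores)
    if n < 2 or n != len(item_correct):
        raise ValueError("need matching lists for at least two students")
    k = max(1, round(n * fraction))
    # Student indices ordered from lowest to highest total score.
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_correct[i] for i in upper) / k
    p_lower = sum(item_correct[i] for i in lower) / k
    return p_upper - p_lower

# Hypothetical data for ten students (illustrative only).
totals = [95, 90, 85, 80, 70, 65, 60, 50, 45, 40]
item = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]

d_index = discrimination_index(item, totals)
```

A positive `d_index` means students with high total scores answered the item correctly more often than students with low total scores, which matches the glossary's description of positive discrimination.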

Educational objective A statement describing the knowledge, skill, attitude, or behavior a student is expected to learn or perform and the content on which it will be performed as a result of instruction.

Evaluation Systematic investigation of the worth or merit of a student's performance in relation to a set of learner expectations or standards of performance.

Evaluator Anyone who accepts and executes responsibility for planning, conducting, and reporting student evaluations.

External evaluation An evaluation conducted by an evaluator from outside the classroom.

Feasibility The extent to which an evaluation is appropriate and practical for implementation.

Follow-up Actions taken to maintain the strengths and address the weaknesses that were identified in the evaluation of the student.

Formative evaluation Evaluation conducted while a creative process is under way, designed and used to promote growth and improvement in a student's performance or in a program's development.



G-I

Grading system The process by which a teacher arrives at the symbol, number, or narrative presentation that is used to represent a student's achievement in a content or learning area.

High-stakes evaluations Evaluations that lead to decisions that, if incorrect, harm students and are detrimental to their future progress and development. For example, misinterpretation of the level of performance on an end-of-unit test may result in incorrectly holding a student back from progressing to the next instructional unit in a continuous progress situation. Every effort should be made in high-stakes evaluations to ensure that the assessment method will yield reliable and valid results.

Holistic scoring Method of scoring essays, products, and performances in which a single score is given to represent the overall quality of the essay, product, or performance without reference to particular dimensions (see Analytic scoring).

Informed consent An agreement by students (and, if the students are minors, their parents/guardians), given prior to the collection of information and/or its release in evaluation reports, that their names and/or the confidential information supplied by them may be used in specified ways, for stated purposes, and in light of possible consequences.

Instruction The methods and processes used by teachers to change what students know and can do, their attitudes, or their behavior.

Instrument An assessment device adopted, adapted, or constructed for the purposes of the evaluation.

Instructional objectives More detailed expressions of educational objectives (see Educational objective).

Inter-rater coefficients A special type of reliability coefficient used to determine the extent to which two or more raters are consistent in their scoring of students. It is often used to determine whether two judges grade in the same way (e.g., whether students would receive the same grade if their responses were graded by two different teachers).

Item A single question, problem, or task used to assess a student (see Task).

Item analysis A technique employed to analyze student responses to objective test items. The technique is used both to improve the quality of items and enhance interpretation of results. This technique shows the difficulty of the items and the extent to which each item properly discriminates between high achieving and low achieving students.
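
The difficulty part of an item analysis can be sketched as the proportion of students who answered each item correctly. This is a minimal illustration under that common definition; the function name `item_difficulty` and the response data are hypothetical.

```python
def item_difficulty(responses):
    """Proportion of students answering each item correctly.

    responses: one row per student, each row a list of 0/1 item scores.
    """
    if not responses:
        raise ValueError("need at least one student's responses")
    n_items = len(responses[0])
    if any(len(row) != n_items for row in responses):
        raise ValueError("every student needs a score for every item")
    n_students = len(responses)
    return [sum(row[j] for row in responses) / n_students
            for j in range(n_items)]

# Hypothetical scores: four students, three objective items.
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]

difficulty = item_difficulty(responses)
```

Under this definition a value near 1 marks an item almost everyone answered correctly, and a value near 0 marks an item almost no one did, so the third item is the hardest in this made-up data.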



J-L

Learner expectations or outcomes See Instructional objectives.

Letter grade A summary evaluation of a student's proficiency or competency expressed on an alphanumeric or numeric scale.

Low-stakes evaluations Evaluations that lead to decisions that, if incorrect, are less harmful to students and likely will not interfere with their progress and development. For example, incorrectly completing an in-class assignment is less harmful than failing an end-of-unit test in a continuous progress situation.



M-O

Mandated assessments Assessments teachers are required to conduct to fulfill the duties associated with their terms of employment, such as assessments for grading and promoting students and assessments required by district or state policy (district and state assessments).

Mean The arithmetic average of a set of numbers.

Measurement The process of assigning numbers or categories to performance according to specified rules.

Metaevaluation An evaluation of an evaluation.

Norms A set of scores that describes the performance of a specific population of students at a particular grade level on a selection or constructed response set of tasks. The population may be a local, state, or national population. These sets are used to interpret scores of students on the same selection or constructed response set of tasks and belonging to the same population.

Objective scoring Different scorers or raters will independently arrive at the same score or ratings for a student's performance; most often associated with assessment methods composed of selection items (see Subjective scoring).

Options Alternatives available to students to select from in multiple-choice items.



P-R

Parallel forms Two or more forms of a test constructed to be as comparable and interchangeable as possible in their content, difficulty, length, and administration procedures and in the scores and test properties (e.g., means, variance, and standard error of measurement).

Peer assessment An assessment method in which students within a similar educational setting make and report judgments about other students' performances.

Performance assessment A formal assessment method in which a student's skill in carrying out an activity and producing a product is observed and judged (e.g., construction of a woodworking project; completion of an essay in English, research report in history, or lab in science).

Performance criteria The observable aspects of a performance or product that are observed and judged in a performance assessment.

Performance standards The levels of achievement students must reach to receive particular grades or to be allowed to move to the next unit in a criterion-referenced assessment system (e.g., 90 percent and higher receive an A, between 80 percent and 89 percent receive a B, and so on; a student who receives a score of 80 percent or more moves on to the next unit, while retained students need to review the material tested and retake the test or a parallel form of it).
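
The example cut-offs in this entry can be expressed as a small lookup. The sketch below encodes only the two grade cut-offs and the move-on rule stated above; the entry does not give cut-offs below a B, so the function returns None there rather than guessing. The function names are hypothetical.

```python
# Cut-offs taken from the entry above: 90 percent and higher is an A,
# 80 to 89 percent is a B. Lower grades are not specified in the entry.
GRADE_CUTOFFS = [(90, "A"), (80, "B")]

def assign_grade(percent, cutoffs=GRADE_CUTOFFS):
    """Return the grade for a percent score, or None below every cut-off."""
    for minimum, label in sorted(cutoffs, reverse=True):
        if percent >= minimum:
            return label
    return None

def moves_on(percent, cut=80):
    """True if the student may move to the next unit (80 percent or more)."""
    return percent >= cut

grade_high = assign_grade(93)
grade_mid = assign_grade(84)
grade_unspecified = assign_grade(75)
```

A student scoring 75 percent gets no grade from this table and `moves_on(75)` is False, which corresponds to reviewing the material and retaking the test or a parallel form of it.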

Pilot test A brief, simplified preliminary trial study designed to learn whether a proposed evaluation seems likely to yield valuable results.

Portfolio assessment Method of assessment that relies on a collection of student- and/or teacher-selected samples of student work or performance in order to evaluate individual student achievement.

Propriety The extent to which an evaluation will be conducted legally, ethically, and with due regard for the welfare of those involved in the evaluation as well as those affected by its results.

Qualitative information Information presented and/or summarized in narrative form, for example, written expressions descriptive of a behavior or product.

Quantitative information Information presented and/or summarized in numerical form; for example, scores on a paper-and-pencil test or on a five-point analytical scale.

Random sampling Drawing a number of individuals from a larger group or population, so that all individuals in the population have the same chance of being selected.
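
A minimal sketch of drawing such a sample with Python's standard `random` module is shown below. The class roster and the helper name `draw_sample` are hypothetical; the optional seed is included only so that a draw can be repeated.

```python
import random

def draw_sample(population, size, seed=None):
    """Draw `size` distinct individuals, each equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, size)

# Hypothetical roster of twelve students (illustrative only).
roster = [f"student_{i:02d}" for i in range(1, 13)]

chosen = draw_sample(roster, 4, seed=2003)
```

Because `random.Random.sample` selects without replacement, no student appears twice in `chosen`.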

Reliability A measure of how consistent the results obtained in an assessment are in a norm-referenced evaluation situation; consistency of a student's ranking within the group of students against which the student is being compared (see Dependability).

Reliability coefficient A calculated number whose value must be between 0 and 1. The number describes the consistency of the assessment results. The larger the number's magnitude, the more consistent the assessment. For example, if the coefficient value were 1, all students' scores would be expected to rank exactly the same way on retesting.

Report card Summary of student achievement, either formative or summative, describing student progress and development with respect to learner expectations, cognitive ability, and expected behavior.

Rubric A description of a specific level of performance within a performance scale.



S-U

Sample A part of a population.

School district A legally constituted collection of institutions, within defined geographic and/or philosophical boundaries, that collaborate in teaching students of less than college age.

Score A specific value in a range of possible values describing the performance of a student.

Scoring key A list of correct answers for selection items or the scoring guide to be followed when scoring or judging responses to constructed response items.

Selection item Test item or task to which the students respond by selecting their answers from choices given: true-false, matching, multiple-choice.

Self-assessment An assessment method in which students make and report judgments about their own performance.

Stakeholder Any person legitimately involved in or affected by the evaluation, for example, students, their parents/guardians, teachers, guidance counselors, school psychologists, and others who make decisions that affect the education of the student.

Standard A description of the expected level of performance that describes minimum competence in relation to a critical score or other measure of student performance.

Standard deviation The standard deviation is a calculated number that describes the extent to which scores are dispersed (spread out) from the mean. Nearly all scores are typically within 3 standard deviations of the mean.
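
The calculation can be sketched with Python's standard `statistics` module. The sketch assumes the population form of the standard deviation (`pstdev`); a sample form (`stdev`) also exists, and the glossary does not say which is intended. The scores and the helper name `share_within` are made up for illustration.

```python
from statistics import mean, pstdev

def share_within(scores, k=3):
    """Fraction of scores lying within k standard deviations of the mean."""
    m = mean(scores)
    sd = pstdev(scores)  # population standard deviation (an assumption)
    inside = [s for s in scores if abs(s - m) <= k * sd]
    return len(inside) / len(scores)

# Hypothetical scores for eight students (illustrative only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

score_mean = mean(scores)
score_sd = pstdev(scores)
within_one = share_within(scores, k=1)
within_three = share_within(scores, k=3)
```

For these scores the mean is 5 and the standard deviation is 2, so every score lies within 3 standard deviations of the mean, while only three quarters lie within 1.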

Standardized tests Assessment methods, either criterion- or norm-referenced, designed to be administered, scored, and interpreted in the same way regardless of when and where they are administered.

Statistic A summary number typically used to describe a characteristic of a sample and from which inferences about the population represented by the sample are made.

Student evaluation The process of systematically collecting and interpreting information that can be used (1) to inform students, and their parents/guardians where applicable, about the progress they are making toward attaining the knowledge, skills, attitudes, and behaviors to be learned or acquired; and (2) to inform the various personnel who make educational decisions (instructional, diagnostic, placement, promotion, graduation) about students.

Student evaluation system All the procedures (including developing and choosing methods for assessment, collecting assessment information, judging and scoring student performance, summarizing and interpreting results, and reporting evaluation findings) and policies that evaluators use to evaluate their students.

Subjective scoring Different scorers and raters may differ on a student's score or rating; most often associated with constructed response assessments (see Objective scoring).

Summative evaluation An evaluation designed to present conclusions about the merit or worth of a student's performance (see Formative evaluation).

Task A single question, problem, or task used to assess a student (see Item).

Utility The extent to which an evaluation will serve the relevant information needs of students, their parents, and other appropriate users.



V-Z

Validity Related to the purposes of the evaluation, the degree to which inferences drawn about a student's knowledge, skills, attitudes, and behaviors from the results of assessment methods used are correct, trustworthy, and appropriate for making decisions about students.

Variable A characteristic of students that can take on different values: for example, achievement, skill, attitudes, and behavior.

Weighting The amount of emphasis given to a particular set of information. For grading purposes weighting usually entails multiplying all scores for one component (e.g., a test) by a numerical value (e.g., 2) to increase the emphasis it receives over other data (e.g., student homework).
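
The multiplication described in this entry can be sketched as follows, using the entry's own example of a weight of 2 for tests over homework. The scores, the component names, and the function name `weighted_total` are hypothetical.

```python
def weighted_total(components, weights):
    """Sum every score after multiplying it by its component's weight."""
    return sum(weights[name] * score
               for name, scores in components.items()
               for score in scores)

# Hypothetical scores for one student (illustrative only).
components = {
    "test": [40, 45],
    "homework": [8, 9, 10],
}
weights = {"test": 2, "homework": 1}  # tests count double, as in the entry

total = weighted_total(components, weights)
unweighted = weighted_total(components, {"test": 1, "homework": 1})
```

With the weight of 2, the test scores contribute 170 of the 197 total points instead of 85 of 112, which is the added emphasis the entry describes.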



The Evaluation Center 4405 Ellsworth Hall Western Michigan University
Kalamazoo, MI 49008-5237
Phone: (269) 387-5895 Fax: (269) 387-5923
Page last updated on April 4, 2008