Department of Medical Education

Test Items

Test items are the best instrument to measure how well students learned what you intended (learning objectives), as well as how effective instruction was.


Test Score Findings

The test scores can be translated into two major instructional findings:

  • Which students need remediation and
  • Which parts of instruction need to be improved

In other words, if only a few students fail a certain question, those students need remediation; but if a great number of students perform poorly on it, the instruction itself and/or the test item needs to be improved.
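
The distinction can be sketched in Python. This is only a minimal illustration, not a procedure described on this page: the function name `triage_results`, the 50% cutoff (`item_fail_threshold`), and the sample data are all assumptions chosen for the example.

```python
def triage_results(responses, item_fail_threshold=0.5):
    """Split failures into items to review and students to remediate.

    responses: dict mapping student name -> dict of item id -> True/False (correct?).
    item_fail_threshold: hypothetical cutoff; an item failed by at least this
    fraction of students is treated as an instruction or item problem.
    """
    students = list(responses)
    items = sorted({item for answers in responses.values() for item in answers})

    items_to_review = []
    students_to_remediate = {}
    for item in items:
        # A missing response is counted as a failure in this sketch.
        failed = [s for s in students if not responses[s].get(item, False)]
        fail_rate = len(failed) / len(students)
        if fail_rate >= item_fail_threshold:
            # Many students missed it: look at the instruction and/or the item.
            items_to_review.append(item)
        else:
            # Only a few missed it: those students need remediation.
            for s in failed:
                students_to_remediate.setdefault(s, []).append(item)
    return items_to_review, students_to_remediate


# Made-up scores for four students on three questions.
scores = {
    "student_a": {"Q1": True, "Q2": False, "Q3": True},
    "student_b": {"Q1": True, "Q2": False, "Q3": True},
    "student_c": {"Q1": False, "Q2": False, "Q3": True},
    "student_d": {"Q1": True, "Q2": True, "Q3": True},
}
review, remediate = triage_results(scores)
print(review)     # items whose instruction or wording should be reviewed
print(remediate)  # students mapped to the items they need help with
```

In this made-up data, Q2 is missed by three of four students and is flagged for review, while Q1 is missed by one student, who is flagged for remediation. The right cutoff is a judgment call and would depend on the course and the test.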


Appropriate Test Items

The test items that you write must assess the exact performances called for in the learning objectives. The conditions, or givens, in an objective must also be included in the assessment.

A good way to determine whether the desired relationship exists between the objective and test items is to answer the following questions:

  • Does the test item require the same performance of the student as that specified in the learning objective?
  • Does the test item provide the same conditions or givens as those specified in the learning objective?

Example

  • Learning Objective: Given a case of an infant with bronchiolitis, the students will correctly assess the child's physiology and select the appropriate tests to support or confirm the assessment.
  • Inappropriate Assessment Item: List the most common symptoms of bronchiolitis in an infant.
  • Appropriate Assessment Item: Answer the following questions based on the given scenario (infant with bronchiolitis): What do you already know about this child's physiology (air movement and gas exchange)? What laboratory tests should you consider in order to support or confirm your assessment?

Well-Written Test Items

Your test items should clearly indicate the nature of the response that is expected from the student. When your students are unsure about what is expected, their responses may not accurately reflect their knowledge related to an item.

  • Focus on Important Concept - The question should focus on an important concept, typically a common or potentially catastrophic clinical problem.
  • Clarity - The stem must pose a clear question that has a definite answer. The students should be able to read the question and answer the question without reference to the alternatives.
  • Assess Application of Knowledge - The question should assess application of knowledge, not recall of an isolated fact (i.e., it should require "two-step" thinking).
  • Homogeneous - All distractors (incorrect options) should be homogeneous. The use of options that are plausible, even close, but not the best, forces discrimination.
  • Prompt Free - Questions should not have technical flaws that benefit test-wise examinees, such as negatively phrased questions and grammatical clues.
  • Interpretation of Data - When possible, questions should require interpretation of data or information from an image, graph or chart.
  • Multiple-Choice With Five Choices - Questions should be multiple-choice with at least five choices. If the choices are a list of drugs, structures, enzymes, or diseases, the list can have up to ten choices.
  • Patient Vignette - Whenever possible, a question should use a patient vignette in the stem of the question. Take a look at some templates from the National Board of Medical Examiners (NBME)'s "Constructing Written Test Questions for the Basic and Clinical Sciences" Manual (PDF).
  • Single Correct Answer - A single correct answer should be indicated for every question.

Poorly-Written Test Items

The following are some common practices that result in poorly-written test items:

  • Unnecessarily Complex Questions and/or Directions - Unnecessary complexity may be due to the amount of information included in the item, to poor grammar, or both.
  • Not Specifying Basis for Sequencing/Ordering - The basis for ordering things should be specified in the assessment item. For example, putting planets in a certain order can be done in several ways: by distance from the sun, by size, or by alphabetical order.
  • Not Indicating the Nature of the Desired Description - Most things can be described in at least two ways: by their physical features or by their functions. Test items should indicate both what is to be described and what is to be included in the description.
  • Using Absolutes - The use of absolutes (e.g., always, never, only, no) should be avoided. Not only are absolute statements usually incorrect, but some students will know an exception to the keyed answer and become confused or panic.
  • Using Implausible Alternatives - When multiple-choice items contain alternatives that are obviously incorrect, students have a greater chance of selecting the correct choice.
  • Using Equal Numbers of Items for Matching Items - When the two lists in a matching item contain the same number of entries, students have a better chance of guessing correctly on items they have not learned.
  • Including Grammatical Clues - Articles such as "a" and "an", plural word forms, and gender forms may provide clues to correctly answering without having learned the content.
  • Having to Refer to Alternatives - The students should be able to read the question and answer it without reference to the alternatives, and their answer should be among the alternatives when they get there. In other words, the stem should ask a question that has a definite answer.
  • Not Emphasizing Key Words - Anytime you have a key word that would change the answer if the student misses it, boldface it, CAPITALIZE it, underline it or all of the PREVIOUS.
  • Not Specifying What the Best Answer Is - When best-answer items are used, the qualifier in the stem needs to be emphasized, and you must also specify in what way an answer is the "best".
  • Negatively Worded Questions - Negatively worded stems should be avoided.
  • Lengthy Information in Alternatives - Include as much information in the stem and as little in the options as possible. For example, if the point of an item is to associate a term with its definition, the preferred format would be to present the definition in the stem and several terms as options rather than to present the term in the stem and several definitions as options.
  • Using "All of The Above" - Don't use "all of the above." Recognition of one wrong option eliminates "all of the above," and recognition of two right options identifies it as the answer, even if the other options are completely unknown to the student. "All of the above" can also be unfair: because it means that answer (A) is correct, a student who anticipates (A) may look down, see (A), choose it, and move on. That student is marked wrong despite knowing that (A) was correct; the answer was right but the question was scored wrong.
  • Using Trickery - The use of trickery must be avoided. There is no point in finding out if you can trick the students -- you can.
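
A few of the flaws listed above are mechanical enough to screen for automatically. The Python sketch below is only an illustration under stated assumptions: the function name `check_item` is hypothetical, the absolute terms are the examples named in the list, the negative markers ("not", "except") are an assumed and incomplete list, and absolutes are checked in the options only. Simple word matching like this will produce false positives, so it can flag items for a human to review but cannot judge them.

```python
import re

ABSOLUTES = {"always", "never", "only", "no"}  # examples named in the list above
NEGATIVES = {"not", "except"}                  # assumed markers of a negative stem


def words(text):
    """Return the set of lowercase words in a piece of text."""
    return set(re.findall(r"[a-z']+", text.lower()))


def check_item(stem, options):
    """Return a list of warnings for one multiple-choice item."""
    warnings = []

    if words(stem) & NEGATIVES:
        warnings.append("negatively worded stem")

    if any(opt.strip().lower().rstrip(".") == "all of the above" for opt in options):
        warnings.append('uses "all of the above"')

    for opt in options:
        if words(opt) & ABSOLUTES:
            warnings.append(f"absolute term in option: {opt!r}")

    # A stem that ends in "a" or "an" reveals whether the answer starts with a vowel.
    if re.search(r"\b(a|an)\s*[:_.?]*\s*$", stem.lower()):
        warnings.append("stem ends with an article (grammatical clue)")

    return warnings


# A deliberately flawed, made-up item.
flawed = check_item(
    "Which of the following is NOT a feature of bronchiolitis?",
    ["Wheezing", "It always causes fever", "Cough", "All of the above"],
)
for warning in flawed:
    print(warning)

# The sample anatomy item from this page raises no warnings.
clean = check_item(
    "A 65-year-old man has difficulty rising from a seated position and "
    "straightening his trunk, but he has no difficulty flexing his leg. "
    "Which of the following muscles is most likely to have been injured?",
    ["Gluteus minimus", "Hamstrings", "Iliopsoas", "Obturator internus"],
)
print(clean)
```

Flaws such as trickery, implausible alternatives, or an unspecified basis for ordering depend on content and still need a reviewer.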

Templates for Patient Vignettes

Templates can be very helpful when creating questions. The following information about question templates is an excerpt from the National Board of Medical Examiners (NBME)'s "Constructing Written Test Questions for the Basic and Clinical Sciences" Manual (PDF).

Item Templates

The overall structure of an item can be depicted by an item template. You can typically generate many items using the same template.

For example, the following template could be used to generate a series of questions related to gross anatomy:
A (patient description) is unable to (functional disability). Which of the following is most likely to have been injured?

This is a question that could be written using this template:

A 65-year-old man has difficulty rising from a seated position and straightening his trunk, but he has no difficulty flexing his leg. Which of the following muscles is most likely to have been injured?

  • Gluteus minimus
  • Hamstrings
  • Iliopsoas
  • Obturator internus

Patient Vignette Components

Many basic science questions can be presented within the context of a patient vignette. The patient vignettes may include some or all of the following components:

  • Age, gender (e.g. a 45-year-old man)
  • Site of care (e.g. comes to the emergency department)
  • Presenting complaint (e.g. because of a headache)
  • Duration (e.g. that has continued for 2 days)
  • Patient history (with family history?)
  • Physical findings
  • +/- Results of diagnostic studies
  • +/- Initial treatment, subsequent findings, etc.

Additional Templates

  • A (patient description) has a (type of injury and location). Which of the following structures is most likely to be affected?
  • A (patient description) has (history findings) and is taking (medications). Which of the following medications is the most likely cause of his (one history, PE or lab finding)?
  • A (patient description) has (abnormal findings). Which [additional] finding would suggest/suggests a diagnosis of (disease 1) rather than (disease 2)?
  • A (patient description) has (symptoms and signs). These observations suggest that the disease is a result of the (absence or presence) of which of the following (enzymes, mechanisms)?
  • A (patient description) follows a (specific dietary regime). Which of the following conditions is most likely to occur?
  • A (patient description) has (symptoms, signs, or specific disease) and is being treated with (drug or drug class). The drug acts by inhibiting which of the following (functions, processes)?
  • A (patient description) has (abnormal findings). Which of the following (positive laboratory results) would be expected?
  • (Time period) after a (event such as trip or meal with certain foods), a (patient or group description) became ill with (symptoms and signs). Which of the following (organisms, agents) is most likely to be found on analysis of (food)?
  • Following (procedure), a (patient description) develops (symptoms and signs). Laboratory findings show (findings). Which of the following is the most likely cause?
  • A (patient description) dies of (disease). Which of the following is the most likely finding on autopsy?
  • A patient has (symptoms and signs). Which of the following is the most likely explanation for the (findings)?
  • A (patient description) has (symptoms and signs). Exposure to which of the following (toxic agents) is the most likely cause?
  • Which of the following is the most likely mechanism of the therapeutic effect of this (drug class) in patients with (disease)?
  • A patient has (abnormal findings), but (normal findings). Which of the following is the most likely diagnosis?

Types of Questions

  • Guess my drug
  • Guess my toxic exposure
  • Guess my diet
  • Guess my mood
  • Predict physical findings
  • Predict lab findings
  • Predict sequelae
  • Identify underlying cause/diagnosis
  • Identify cause of drug responses
  • Identify drug to administer

Sample Lead-ins and Option Lists

Which of the following is abnormal?

  • sites of lesions;
  • list of nerves;
  • list of muscles;
  • list of enzymes;
  • list of hormones;
  • types of cells;
  • list of neurotransmitters;
  • list of toxins, molecules, vessels, or spinal segments.

Which of the following findings is most likely?

  • laboratory results;
  • list of additional physical signs;
  • autopsy results;
  • results of microscopic examination of fluids, muscle or joint tissue;
  • DNA analysis results;
  • serum levels.

Which of the following is the most likely cause?

  • underlying mechanisms of the disease;
  • medications that might cause side effects;
  • drugs or drug classes;
  • toxic agents;
  • hemodynamic mechanisms, viruses, or metabolic defects.

Which of the following should be administered?

  • drugs;
  • vitamins;
  • amino acids;
  • enzymes;
  • hormones.

Which of the following is defective/deficient/nonfunctioning?

  • enzymes;
  • feedback mechanisms;
  • endocrine structures;
  • dietary elements;
  • vitamins.

Copyright © 1996, 1998 National Board of Medical Examiners® (NBME®)
Copyright © 2001, 2002 National Board of Medical Examiners® (NBME®)

Last edited on 05/25/2016.