Designing Tests and Paper Questions

September 7 and 9, 2005
Presenter: Ken Jones

Think about the course you brought the sample test question for.  Jot down a couple of sentences on what you want your students to gain from taking that course.  If you don’t have a sample question, just pick a course you are teaching now – or one you hope to teach in the future – think about your goals, and a question you would use on a test.

Now, turn to the person next to you.  Explain your goals, your question, and discuss how clear the linkage is.

I started with this little exercise today just to encourage you to resist something that I find myself doing far too often: when I sit down to write a test, I immediately look at the content we have covered.  

Thinking about what our students should know is obviously a critical step, but I’d like to encourage us to – at least every once in a while – back up a little and ask the question “what do I want them to get from this course?” and design the assessment instrument accordingly.

For some courses, the answer might be “mastery of a certain body of basic information.”   If content acquisition is the central goal of your course, then multiple choice or other kinds of objective questions work well. If, on the other hand, your goal is to have students display an ability to integrate, to think critically, or develop their own interpretations, then you might be better off with essay questions.

               HANDOUT -- advantages/disadvantages of objective and essay tests 

As you can see, this handout summarizes the advantages and disadvantages of objective and essay tests.

Multiple choice tests are particularly good at assessing content knowledge, they are easy to score, and they help you provide quick feedback.  While I’m at it, let me explain one approach I came across that seems like a great way to provide immediate feedback.  The professor gives students only part of the period to take the test.  S/he asks them to mark their answers on the test and on an answer sheet, collects the answer sheets, and then spends the rest of class going through the answers, explaining why and correcting misunderstandings.

Essay tests, on the other hand, more closely mirror how knowledge will be used outside the classroom, and can more easily assess more complex thinking skills.  They are, however, more time consuming to grade, and the scores can be less reliable – meaning different people are going to give different marks.

As you can see, factors such as ease of grading and degree of subjectivity may affect the format you choose, but from my perspective, the key should be fitting the type of test to the learning goals.
Before I get into more nuts and bolts details, let me push this idea of shaping evaluations to your goals a little further with three examples from Barbara Walvoord’s Effective Grading.

First, she describes a Math professor who wanted students to understand certain math functions.  He realized, however, that his tests focused on getting the right answer.  So he told students to fold their paper in half.  On one side they were to do the problem.  On the other, they were to explain in complete sentences what they did and why.

The second example is a Sociologist who responded to a question about what he wanted his students to learn and his testing format this way.    (OV from Walvoord 33)  (Wanted students to apply sociological analysis to everyday events, but evaluation was based on two exams and a paper.)   He realized students were cramming for the two tests, rather than applying sociological analysis to what they saw around them, and that the term paper might very likely be based on library work alone.  That realization led him to change so that students were asked to write a journal in which they applied sociological analysis to things they observed.  The problem was the students read the word “journal” as just personal thoughts, so he renamed the task “sociological analysis” and emphasized that he wanted them to apply the sociological concepts they were learning to what they saw in everyday events.  He also provided explicit criteria: students had to summarize the sociological perspective correctly, include the kind of detailed observations sociologists do, and then link the theories and observations in a reasonable and thoughtful way.

The third example is a Biology professor who asked for twelve lab reports over the semester.  They never seemed to move beyond mediocre and he was buried in work.  After some thought, he decided that he wanted students to be able to write a good lab report, rather than turning in twelve mediocre ones that covered content.  To achieve that, he began teaching how to write the lab report more thoroughly, and set up stages to build skills.  For the first two labs, students wrote only the Introduction and he focused his critique there.  Then he added additional sections as they went on so that by the end of the semester they were writing complete – and better – reports.

And just to remind you that there are lots of alternatives to standard in-class exams or regular papers, here’s another handout.

            HANDOUT -- Suskie Examples

What I love about this list is that it suggests all sorts of ways that we can make the learning and assessment of their ability seem less like a traditional test and more like something that they will be asked to do as soon as they graduate.  If we can do that, it will be more real, more exciting, plus they will learn more and produce better work.

Ok, back to what I advertised, so let me give you the outline of today’s session.

            HANDOUT -- session outline

If you can’t figure it out, I’ve done the first two points and am moving to the third.

Let’s say you want to create a multiple choice test.  What do the experts say you should do? 
Rather than starting with straight content – your lecture notes or the text – design a test blueprint.   Ask yourself, what are the learning goals I want my students to demonstrate on this exam?  They could be anything, but if you are focused on specific content, these are probably going to be key concepts that you have covered in the unit.

Once you have the learning goals laid out, then decide on the relative importance of each area, and allocate the number of questions accordingly. The advantage of starting this way is that it encourages a very conscious attention to what matters – rather than asking questions that are easy to write or overloading in one area.   If you tie questions to learning outcomes, you can also use this for assessment, but that’s another topic.
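The allocation step just described can be sketched in a few lines of Python. This is only an illustration: the goal names, the weights, and the function name allocate_questions are hypothetical, and the rounding rule (give leftover questions to the goals with the largest fractional share) is one reasonable choice, not something the experts prescribe.

```python
def allocate_questions(weights, total_questions):
    """Split total_questions across learning goals in proportion to weights.

    weights maps each learning goal to its relative importance (positive numbers).
    Leftover questions after rounding down go to the goals with the largest
    fractional share, so the counts always add up to total_questions.
    """
    total_weight = sum(weights.values())
    exact = {goal: total_questions * w / total_weight for goal, w in weights.items()}
    counts = {goal: int(share) for goal, share in exact.items()}
    leftover = total_questions - sum(counts.values())
    by_remainder = sorted(exact, key=lambda g: exact[g] - counts[g], reverse=True)
    for goal in by_remainder[:leftover]:
        counts[goal] += 1
    return counts


# Hypothetical blueprint: three learning goals with relative importance 3 : 2 : 1.
blueprint = {"key concept A": 3, "key concept B": 2, "key concept C": 1}
print(allocate_questions(blueprint, 20))
```

With these made-up weights, a 20-question test gives 10, 7, and 3 questions to the three goals, which makes it easy to see at a glance whether one area is overloaded.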

OK, once you have your blueprint – your outline of what you are going to cover in the test – you need to write the questions.

Big drum-roll here for the two key precepts --

More specifically

Writing Good Stems

Writing Options

Writing Distracters

From my perspective, this is really important because it allows us to use exams as a teaching/learning tool rather than just an assessment/grading device.  If we write the distracters carefully so that they illuminate the problem areas, we can zero in on where our students need help.   
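One way to act on this after the test is to tally which wrong options students actually chose. The sketch below is a minimal Python illustration; the responses, the answer key, and the function name distracter_counts are all invented for the example.

```python
from collections import Counter


def distracter_counts(responses, correct):
    """Count how often each wrong option was chosen for one question."""
    return Counter(r for r in responses if r != correct)


# Hypothetical answers from eight students to one question whose key is "b".
responses = ["b", "c", "b", "a", "c", "c", "b", "c"]
wrong = distracter_counts(responses, correct="b")
print(wrong.most_common())
```

In this made-up data, option "c" draws half the class, so the misunderstanding that "c" was written to capture is the one to revisit first.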

Then Finish the Task  --

Of course, multiple choice tests are not the only kind of objective test.  Here’s a handout that hits the highlights on other varieties.

               HANDOUT ON OTHER OBJECTIVE TEST FORMATS                    

We most often use multiple choice questions to test simple recall of information, but it is possible to ask for application or analysis as well.   You can see some of the approaches in this handout. 


           HANDOUT ON EXAMPLES OF HIGHER ORDER M-C

At the top you have a couple of examples of points I have already made.  I’d like to draw your attention to the bottom of the page.  Here you have two questions addressing the same basic idea, but as you can see, the second version demands understanding rather than just recall.

The back side of the handout has two longer, more complicated examples of how you can use the multiple choice format to test higher order thinking.  Another approach I have seen that appeals to me is to lay out a scenario, and then give the students 3 or 4 options that are all generally right, but ask them to choose the best solution to the issue posed in the scenario.  If you do it right, you can really push them to think through the issues.

You can also achieve some of the same ends by combining the classic multiple choice question with a written explanation.  Have them choose their answer, and then – in writing – explain why they chose that one.

Let me, however, pass on a couple of warnings about trying to assess higher order thinking through objective exams.  First, I’ve seen estimates that it can take a really experienced objective test writer hours to come up with one good higher order thinking question.  Second, studies have shown that while many faculty would say that they are writing objective questions that require higher order thinking, in reality, 90 to 95% of multiple choice questions require recall only.

There’s one way of catching ourselves on this that I really like.  The suggestion is that if you want to make sure you are asking for more than recall, then make your objective tests open book.   If the students can find the answer directly in the text, then you aren’t asking for deeper comprehension.

Ok, let’s move on and talk about creating essay assignments.

Once again, I would encourage you to think about what it is that you want your students to gain from taking this course.  Content, skills, whatever.  After you have done that, ask yourself if the testing format you are planning fits.  For example, if you decide, on reflection, that content is the key, then why use an essay exam?

Or, if what you really want to see is whether they can write a coherent, well supported argument, is an in-class exam the right way to do it?  The classic approach when I was younger was to ask students to walk in, get the questions, and then write like mad for 30 minutes on each of two questions.  That really asked for a brain dump, not reflection or clear writing.

(By the way, I’ve run across a couple of sources that suggest that we can expect students to write a little more than a paragraph for each ten minutes we give them.  So if you are looking for a classic five paragraph exam, then they need about 45 minutes.  In my experience, our students can do more than that, but I haven’t done in-class exams in a long time.)
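The arithmetic behind that rule of thumb can be written out as a tiny Python sketch. The rate of 1.1 paragraphs per ten minutes is an assumption chosen to stand in for "a little more than a paragraph"; the sources mentioned above do not give an exact figure, and the function name is illustrative.

```python
def minutes_needed(paragraphs, paragraphs_per_ten_minutes=1.1):
    """Estimate writing time in minutes for an essay of the given length.

    The default rate is an assumed stand-in for "a little more than a
    paragraph for each ten minutes".
    """
    return 10 * paragraphs / paragraphs_per_ten_minutes


# Classic five-paragraph essay under the assumed rate.
print(round(minutes_needed(5)))
```

Under that assumed rate, five paragraphs come out at roughly 45 minutes, in line with the estimate above; plugging in your own sense of how fast your students write will shift the number.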

If you want both the in-class exam and a more reflective, coherent result, let me suggest a couple of alternatives.  One common approach is to give out several questions in advance, and then choose one for them to write on in the exam.  Presumably, they have prepared all of them and will do better.

Another approach takes this a little further.  When they come to class they are asked to write the essay, but that isn’t the end.  The professor then grades and provides feedback, and the student then revises – out of class – what they started in class.  The student then gets a second grade.  One source I have seen suggests making the revision mandatory on the first exam, optional on the second, and not possible on the third.

Once you have settled on the format, the next task is to create questions.  First, make sure they reflect your objectives.  Second, if possible, I would encourage you to write them in a way that asks students to make use of course information in ways that connect to their lives.  In my experience, this really enhances student engagement.   For example, I used to use a question at the end of my diplomatic history course that basically asked them to sum up some key themes.  A number of years ago I changed it to a scenario where they are called upon to advise the incoming president on what he needs to do to be a great leader in foreign policy.  The improvement in quality was striking.

Once we have written the questions, I suggest that we reflect on them a little longer.  (I know this isn’t going to happen if you are doing it the night before, but….)

Ask yourself:  Do these questions really reflect my goals?  If I really want to encourage critical thinking, why does this question focus primarily on recall?  How can I change things around so that they have to apply what they know in a new setting?

Or, if my goal is to help them to develop their ability to propose and test their own ideas, is this question that asks them to describe the application of a principle really what I want?

I also find that I really need to worry about the level of difficulty.  It is really easy to design questions that demand a greater ability to make cognitive leaps than is reasonable to expect.  Remember you aren’t writing a question you would like to talk about with the other four people in the world who care, but rather are designing for novices.  I find it helpful to write the question, then physically sketch out what I hope to see in a good answer, and then ask myself if it is reasonable to expect someone who isn’t an expert in the area to move from the prompt to the creation of similar connections.  Did we really talk in depth about those things?  Should they be able to make the leap?  The experts suggest that you also think about how much freedom you want to allow. 

Restricted questions – or prompts, as they are sometimes called -- ask everyone to provide pretty much the same response.  Extended response prompts, on the other hand, offer more latitude.  For example, a restricted prompt would be “explain how Kennedy’s assassination affected passage of the 1964 Civil Rights Act,” while an extended one would be something like “what two factors were most significant in the passage of the 1964 Civil Rights Act?”

A lot of the experts say relatively narrower questions are better because they are more manageable in terms of time and because they are easier to evaluate.  For me, it depends on my goals.  If my goal is to promote their ability to draw on multiple points of view and reach their own well supported conclusion, then I’m going to go for a much more extended or open ended question.

If it fits your goals, you can provide lots of what I call “scaffolding” in order to direct students toward the desired outcome.  This can come in the question itself.  For example, compare these two: “Why does an internal combustion engine work?” versus “Explain the functions of fuel, carburetor, distributor, and the operation of the cylinder’s components in making an internal combustion engine run.”    Scaffolding can also be a little more external – for example, you can remind them of critical readings/discussions they should include.

Another key issue in the “freedom” category is how much choice to allow on an essay exam.

Barbara Davis, whose work I really like, says emphatically that you should never allow students any choice on an essay exam.  Her argument, and that of many others, is that if you allow students to answer different questions, you can’t really compare their responses and give a fair grade.  Furthermore, since it is impossible to design equally difficult assignments, some students may hurt themselves by taking on the harder one.  And finally, some people argue that multiple options are bad in a timed exam because they penalize slow readers.

Those who argue for providing choices point out that doing so allows students to choose according to interest and learning styles, and that that freedom may draw out better work.

I understand the concern about reliability when you offer choice, but again, I think the answer has to go back to your learning goals.  If I want to ensure that no student leaves my course without knowing the reasons for the US involvement in Vietnam, then I had better test on that and not provide any choice.  If, on the other hand, my goal is to promote their ability to think critically about historical arguments, then I don’t think it matters very much whether they write on Vietnam or something else.

The literature suggests a few other things about writing essay questions that I should pass on.

First, make the length appropriate to the amount of learning.  Why ask for a 10 page paper if your learning goals can be accomplished with a 5 page assignment?  Second, if students are writing on several questions in a timed environment, make sure you indicate the point value of each essay so they can allocate their time and energy properly.  Third, try to find ways to make your expectations clear.  Sharing your scoring rubric is one way.  Having the class discuss/develop a rubric is another.  My favorite is to have some examples of old papers available.

              HANDOUT – Examples of essay exams

Ok, my hope is that I’ve done two things today.  First, I hope that I have encouraged you or reminded you to think about your overall course goals when you design your evaluation instruments.  Second, I hope that I have distilled the wisdom of the “experts” into some specific advice on how to construct various instruments that will be useful to you.

Thank you.

Multiple Choice Tests

Advantages –

Disadvantages

Essay Tests (or papers)

Advantages

Disadvantages

Examples of what essays can ask students to do:

Other Objective Test Formats

True-False - Multiple choice questions with two options

Advantages 

Disadvantages (versus multiple choice)

Matching Items - Another form of multiple choice

Advantages

Disadvantages

Advice on Writing Matching Questions

Interpretive Exercises - Having students read a passage or chart that they haven’t seen and then answer a set of objective questions

Advantages

Disadvantages

Writing Interpretive Exercises

Completion/fill-in-the-blank - To be a true objective question, it must have only one right answer (i.e., anyone with an answer key can score correctly).  Short answer questions usually are more subjective.

Advantages

Disadvantages

Writing Multiple Choice Test Items

Good stems allow a knowledgeable student to answer quickly

The mean:
a) is the most frequently occurring score in a distribution
b) ….

The mean of a distribution of test scores is:
a) the most frequently occurring score
b) …

Be Concise

Which of the following is the best definition of sociobiology?
a) The scientific study of humans and their relationships with the environment
b) The scientific study of animal societies and communication
c) …

Sociobiology is the scientific study of
a) humans and their relationships within the environment
b) animal societies and communications
c) …

Moving From Recall to Understanding/Interpretation

A percentile score is
a) percentage of items a student answers correctly
b) percentage of students that answer an item correctly
c) percentage of a group getting a lower score
d) average score for a group on a test

John scored at the 80th percentile on the 100 item final exam given to his class of fifty students.  This means that John
a) answered eighty items correctly
b) scored higher than forty students in the class
c) scored lower than forty students in the class
d) scored 80 percent higher than the average student
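
The second question can only be answered by applying the definition, and the arithmetic involved is small enough to sketch in Python. The sketch takes a percentile score to mean the percentage of the group getting a lower score, as in the first question's options; the function name is illustrative and not part of the handout.

```python
def students_scoring_lower(percentile, class_size):
    """Number of classmates who scored lower than a student at this percentile.

    Assumes a percentile score is the percentage of the group with a lower score.
    """
    return percentile * class_size // 100


# John: 80th percentile in a class of fifty students.
print(students_scoring_lower(80, 50))
```

Note that the number of items on the exam (100) plays no part in the calculation, which is exactly what the recall-only version of the question cannot test.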

Essay Test Examples

Cause and Effect
Explain the possible effects a recession would have on today’s society.

Application of a Principle
Using Newton’s third law of motion, explain why a rubber ball bounces higher when dropped from fifteen feet than when it is dropped from five feet.

What is meant by the statement "all physical and chemical changes are accompanied by changes in energy"?

Compare and Contrast
Compare and contrast the physical traits and the personal character of Grendel in Beowulf and Caliban in The Tempest.

Describe three ways that the process of meiosis differs from the process of mitosis.  Then, explain how each of the differences in process makes the daughter cells resulting from meiosis different from the daughter cells resulting from mitosis.

Present Arguments For and Against
Two methods for nominating candidates for offices such as U.S. Senator are the party caucus system and the popular primary.  Provide three reasons for supporting each system and three reasons for not supporting each.

Create and Defend a Hypothesis
Some popular news media have reported that trees themselves contribute to air pollution.  Explain this argument and agree or disagree, citing supporting evidence.

Analyze, Synthesize, Support  (take home)
Critically evaluate the following statement, focusing on events in the period 1947-1961.  "From the Truman era on, the need to respond quickly to communist threats was used to eliminate Congress' role and give the president complete control over the use of military power to achieve his foreign policy objectives." 

Scaffolding  (take home)
The actions and rhetoric of the Truman, Eisenhower, and Kennedy Administrations so committed the United States to South Vietnam that Lyndon Johnson saw the introduction of American ground combat troops in 1965 as a logical and necessary step.  Evaluate the role of the Truman, Eisenhower, and Kennedy Administrations in creating this situation.  Which played the most important role?  Why?  [Make certain that you address both rhetoric and actions.]             

Scenario   (take home)
It is January 2001.  Gore has conceded, so George Walker Bush knows that he is going to be called on to lead the nation in a world where the direction of American policy remains unsettled.  As this realization sinks in, he experiences a mild panic attack because he remembers his dad saying that people like Carter and Clinton were handicapped as foreign policy leaders by their lack of a sense of history.  Since what he learned in his History classes at Yale in the 1960s remains lost in a fog, and since it would be too embarrassing to tell his dad he hadn’t been listening to all those fatherly tips, the new incumbent turns to you for help.   Impressed by your knowledge of American foreign policy in the 20th century, he asks you to prepare a brief (no more than eight pages) set of foreign policy guidelines (goals and tactics) for him, along with a rationale based on your understanding of the lessons of the past.  (You will be evaluated on creativity, realism, thoroughness, and the appropriateness of your evidence.    Remember that your credibility is based on your understanding of the past; if your program isn’t rooted in historical lessons, you will soon be digging post holes for a new fence at the President’s Texas ranch as the thermometer tops 100 degrees.)        

Designing Tests and Paper Questions
September 7 and 8, 2005

What Do I Want Them to Get from This Course

Creating correspondence between goals and means of assessment

Designing Multiple Choice Exams

Writing Good Stems

Writing Good Options

Writing Distracters

Finish the Task

Review

Examples

Designing Essay Assignments

Some Good Sources:

Cashin, William, “Improving Essay Tests,” Idea Paper #17 (January 1987)
Clegg, Victoria and William Cashin, “Improving Multiple Choice Tests,” Idea Paper #16 (September 1986)
Davis, Barbara, Tools for Teaching (1993)
Devine, Marjorie and Nevart Yaghlian, “Construction of Objective Tests,” 
Jacobs, Lucy and Clinton Chase, Developing and Using Tests Effectively (1992)
Suskie, Linda, Assessing Student Learning (2004)
Walvoord, Barbara and Virginia Anderson, Effective Grading (1998)
Cornell University Office of Instructional Support, http://www.clt.cornell.edu/campus/teach/faculty/Materials/TestConstructionManual.pd