
Education Department Writing Assessment

2003 - 2004

Summary. As in years past, we invited students seeking acceptance as elementary education majors or secondary education minors to demonstrate their writing proficiency by completing persuasive essays on current topics related to educational practice. Registered students received reading packets during the week prior to their examination that offered guidance on writing persuasive essays, readings on the topic selected for their essays, and information about our testing process. The colleges’ Writing Center offered workshops for interested essayists during the same period. About one month later, faculty and staff readers gathered to score these essays using widely accepted holistic procedures. Following review of the scoring process and its results, all students were notified of their performance by letter. Information included with their score reports described developmental opportunities and additional assessments they might pursue to confirm their performance. Students who wrote failing essays were encouraged to meet with the chief reader to review the basis for their ratings and to discuss their options.

A total of 49 young men and women gathered in a large classroom on the morning of Saturday, 27 September 2003, to take and advance their positions on Minnesota’s use of K-12 student test scores in estimating a school’s effectiveness. A team of six trained faculty and staff readers met on 25 October to read and rate their efforts. Using the department’s scoring guide, these readers agreed that eight students (16% of 49) wrote Level One essays that failed to reach the standard of writing performance expected of those selected to prepare for licensure. This team of readers also judged essays prepared by the remaining 41 students to reflect acceptable Level Two writing (84%). Readers found no essays in this sample that reached Level Three performance.

We offered a second opportunity for those seeking acceptance to verify their writing performance on the morning of 14 February 2004. Thirty-five prospective teachers wrote persuasive essays on this occasion, taking a position on Minnesota’s “zero tolerance” weapons policy. When a team of four trained faculty and staff readers scored these 35 essays on 13 March, they found 25 (71%) to be consistent with Level Two performance on our scoring guide. They rated ten essays at Level One (29%), below our standard for acceptance. Readers did not rate any essays in this sample at Level Three.

The following table summarizes writing assessment results for this academic year compared with aggregated results from the previous 18 essay examinations.

Writing Performance     Spring 1996 to Spring 2003     Fall 2003     Spring 2004

Level 3                 45 (5%)
Level 2                 585 (59%)                      41 (84%)      25 (71%)
Level 1                 293 (30%)                      8 (16%)       10 (29%)
Below 1                 61 (6%)
Totals                  984                            49            35

Essays scored at Level One typically reveal significant weakness in the structure of the author’s argument, reasoning, and evidence. Essays scored as Level Two provide a stronger argument, at least two logically distinct reasons in support of that argument, and stronger forms of evidence. Although they are writing under some pressure, most students observe the conventions of grammar, punctuation, spelling, and capitalization.

Our continuing assessment of students’ writing performance, which began in April of 1996, could not take place without the willing support of our faculty and staff. Readers for the September 2003 assessment included Dr. Bruce Dickau, Sister Lois Wedl, Dr. Edmund Sass, Ms. Melisa Dick, and Ms. Maryjean Opitz; Ms. Cynthia Forsman-Earl served as our outside reader. Dr. Art Spring, Sister Lois Wedl, Sister Ann Marie Biermaier, and Ms. Melisa Dick read and scored students’ essays written for the February 2004 examination. Ms. Connie Schiff developed and managed the registration of examinees. Dr. David Leitzman served as chief reader for both assessments.

Fall 2003 Education Department Writing Assessment

Overview. Forty-nine students seeking acceptance by the colleges’ Department of Education as secondary minors or elementary majors wrote essays during a proctored examination on 27 September 2003. Using a reading packet provided by the department to each examinee one week prior to the examination, along with other sources of information of their choosing, these students wrote persuasive essays concerned with using K-12 student test scores to assess the effectiveness of Minnesota’s public schools. Six faculty and staff readers met on 25 October to read and holistically score students’ essays using a six-step scoring guide devised for this assessment. These readers found that 41 essays met the department’s standard for acceptance (84% of 49), while eight did not (16%).

Procedure. The College of Saint Benedict and Saint John’s University Department of Education first used a formal screening examination in April of 1996 to verify the writing performance of prospective candidates for teacher licensure. Following practices that have evolved from that first experience with formal writing assessment, students registering for the September 2003 essay were provided with a reading packet one week prior to their examination. This packet offered guidance on writing persuasive essays, background readings exploring many facets of the selected topic, and the essay “prompt” along with the scoring guide that would be used to judge the essays it might inspire.

Writers preparing essays were encouraged to search for any additional information that might help them take and defend a position on the question for this session: “Should Minnesota use students’ test scores to judge a school’s effectiveness?” The colleges’ Writing Center offered workshops during the week preceding the examination for students who chose to review techniques of persuasive writing. These procedures were developed over several essay administrations to encourage as “authentic” a writing experience as possible.

The forty-nine students registered for this experience gathered on the morning of 27 September 2003 with their collected texts, readings, notes, outlines, and preliminary drafts to prepare the final versions of their essays during a two-hour proctored examination. Following the proctor’s brief instructions, all present began writing.

As the proctor looked on, writers used their preferred prewriting and drafting techniques to prepare handwritten, double-spaced final drafts of five to fourteen pages. After the first hour of writing, students were free to leave the testing room at any time before the close of the two-hour period. As writers prepared to leave the room, the proctor collected their essays along with all supportive materials they might have brought for their use.

Scoring followed the generally accepted procedures for the holistic scoring of writing as developed and described by Edward White in his now classic Teaching and Assessing Writing (1998). Five Education Department faculty and staff members, joined by one staff member employed in another college department, invested a Saturday morning in scoring these 49 essays on 25 October 2003.

The scoring session began with a review of this approach to assessing writing. Readers examined the reading packet and essay prompt used by the students whose work they would soon review.

Readers also verified their understanding of the department’s six-step scoring guide by evaluating and discussing several transcribed essays. A “chief reader” monitored this training process, selecting from among these typed “test essays” those which might best illustrate features evident in this sample of essays. This training continued for 75 minutes, until readers were able to use the scoring guide to judge sampled essays consistently.

With the conclusion of the training session, essay scoring began in earnest. As each reader scored an essay, he or she lightly wrote a number from one to six on its final page, reflecting the value assigned to that essay. This number, corresponding to the scoring guide, was then covered with an opaque colored paper “dot.” Each reader used a different color to ease the selection of subsequent readers for second readings. Essay scores were recorded by the chief reader, who then sent essays to additional readers for a second score. If the second reader’s score agreed with the first (for example, scores of 4 and 4), or differed by no more than one point (4, 5), that essay’s review would end. Should scores for an essay differ by more than one point (3, 5), the essay would be read again by a third reader. Essays whose two scores totaled “7” (3, 4) were always read and scored a third time to pull the essay into the passing (4, 4) or failing (3, 3) range.
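
These decision rules can be summarized compactly. The sketch below (Python, with illustrative names; the department’s process was of course carried out on paper, not in software) models how a pair of reader scores is either accepted as an essay’s raw score or sent to an additional reader.

```python
# A minimal sketch of the score-resolution rule described above, assuming each
# essay is represented by the ordered list of scores (1-6) its readers assigned.
# Names and sample data are illustrative only.

def resolve_raw_score(scores):
    """Return the final raw score (2-12) once the resolution rule is
    satisfied, or None if the essay still needs another reading."""
    first, second = scores[0], scores[1]
    if abs(first - second) <= 1 and first + second != 7:
        return first + second          # readers agree or differ by one point
    if len(scores) < 3:
        return None                    # send the essay to a third reader
    third = scores[2]
    for earlier in (first, second):    # pair the third score with an earlier one
        if abs(third - earlier) <= 1 and third + earlier != 7:
            return third + earlier
    return None                        # still unresolved; a fourth reading follows

print(resolve_raw_score([4, 4]))       # 8 -> review ends after two readings
print(resolve_raw_score([3, 4]))       # None -> a "score 7" essay needing a third reading
print(resolve_raw_score([3, 4, 3]))    # 6 -> resolved as a failing (3, 3) essay
```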

Following the scoring session, the chief reader reviewed all failing essays as well as a sample of passing essays to verify compliance with the criteria described in the scoring guide. All “problem” essays with unusual scoring patterns (3, 4, 1, 3) provoked further review. The chief reader also examined each reader’s scores to uncover patterns that might have influenced the outcome of the scoring session.

Students received written notification of their essay scores from the Education Department following score review. Those who wrote failing essays were encouraged to visit with the chief reader to review their scores. They might then seek reassessment to verify their score or plan to begin a program of developmental assistance should they persist in their desire to prepare for teacher licensure.

Results. The following table reveals the distribution of scores awarded to students’ essays written during the 27 September examination. The scoring guide on which levels and scores are based is appended to this report.

Level       Score     Frequency

3           12
3           11
2           10        9 (18%)      *********
2           09        8 (16%)      ********
2 Pass      08        24 (49%)     ************************
            07
1 Fail      06        4 (8%)       ****
1           05        2 (4%)       **
1           04        2 (4%)       **
1           03
1           02
Total                 49

Essays scored at Level One often reveal weaknesses in the structure of their arguments, confusing or uncertain logic, and the use of little or no evidence to support their claims. While the “mechanics” of composition are not a major factor in the scoring process, most essays in this group were free of serious mechanical flaws that could hinder a reader’s understanding of the writer’s ideas.

Traditionally accepted statistical measures of reliability have not been widely used by proponents of holistic scoring. Following the advice offered by Stemler (2004), we might look to the frequency with which readers assign the same score as an indicator of scoring consensus, although the complexity of writing and reading might argue against reliance on such an approach. Nonetheless, this team of readers was more likely to agree in its assessment of the essays in this sample than has been the case in past writing assessments. The following tabulation reveals that nearly three-fourths of the 49 essays in this sample were scored with identical or similar (within one point) scores after two readings (36, 73%). Twelve essays, 24% of the total sample, were read three times; nine of these (18% of 49) were “score 7” (3, 4) essays that required a third reader’s score to place them in Level One or Level Two. Only one essay required four readings to reach agreement among readers: a 3, 4, 1, 3 pattern that was resolved and sustained at Level One with a raw score of “6” (3, 3).

  • Essays scored after two readings: 36 of 49, 73%
  • Essays scored after three readings: 12 of 49, 24%
    • “Score 7” essays: 9 of 12
    • “More than 2” essays: 3 of 12
  • Essays scored after four readings: 1 of 49, 2%
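
Were each essay’s reading record captured as an ordered list of scores, a tabulation like the one above could be produced mechanically. The short sketch below is a hypothetical illustration only; the essay data it contains are invented and are not the Fall 2003 reading records.

```python
# Hypothetical tally of scoring consensus; reading records are invented.
from collections import Counter

essays = {
    "essay_01": [4, 4],        # agreement after two readings
    "essay_02": [3, 4, 4],     # a "score 7" pair resolved by a third reader
    "essay_03": [3, 5, 4],     # scores split by two points, third reading needed
    "essay_04": [3, 4, 1, 3],  # the rare four-reading pattern noted above
}

total = len(essays)
readings_needed = Counter(len(scores) for scores in essays.values())
score_seven_pairs = sum(1 for s in essays.values() if s[0] + s[1] == 7)

for n in sorted(readings_needed):
    count = readings_needed[n]
    print(f"Essays scored after {n} readings: {count} of {total} ({count / total:.0%})")
print(f'Essays whose first two scores totaled "7": {score_seven_pairs} of {total}')
```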

It is sometimes helpful to look for patterns among those who read and scored essays that might help affirm the accuracy of the scoring process. While two readers might not score the same essay in the same way on every occasion, their understanding and use of the scoring guide should encourage reasonably similar scores across a sample of readings. Six readers were recruited for this scoring in light of the number of essays to be assessed. Once they were trained to score these essays, readers worked at their own pace. The following table reveals descriptive information for each of the six readers.

                  Reader 1   Reader 2   Reader 3   Reader 4   Reader 5   Reader 6   All Readers

Essays Scored:    24         12         20         11         15         28         49
Mean Scores:      4.0        3.8        4.1        4.1        3.7        3.6        3.8

The range of mean scores for these six readers on this occasion, .5 (from 3.6 to 4.1), is somewhat narrower than for previous sessions. The number of essays read, from a low of 11 for Reader 4 to a high of 28 for Reader 6, is characteristic of past readings.

Appendix I includes the scoring guide on which these essay scores were based (I.A) as well as two essays written for this examination that illustrate qualities readers associated with Level One and Level Two essays.

Spring 2004 Assessment

Overview. Thirty-five students, all seeking acceptance by the Education Department as candidates for teacher licensure, wrote essays on 14 February 2004 during a proctored two-hour examination period. Acceptable performance on such an essay has been required of all who would be prepared for licensure as teachers since the Department began the assessment of writing in April of 1996. Students were encouraged to make use of provided reading packets or to search for and use any other information that would help them prepare their essays. Four department members, trained in the holistic scoring of essays on this topic, gathered on 13 March 2004 to read and rate the essays written by these students. Twenty-five students wrote essays rated as equal to or better than the Department’s expected level of performance for entering students. Ten students, representing 29% of this group, wrote essays that readers rated as below this standard.

Procedure. As these prospective secondary or elementary teachers registered for the writing examination and paid a small testing fee, Education Department staff invited them to accept a reading packet designed to help them prepare their essays. Packets were available eight days prior to the examination, although some students did not collect their packets until two days before the day on which all would write their essays. Each packet included the prompt on which they would write and the scoring guide that faculty readers would use to rate their performance.

The packet also offered advice on writing a persuasive essay and several readings exploring facets of the topic selected for this assessment, schools’ use of “zero tolerance” weapons policies. As they collected their packets, students were invited to attend one of two workshops focused on writing persuasive essays for this assessment. These workshops were prepared and offered by the colleges’ Writing Center staff.

Students were encouraged to use the readings in their packets and to seek additional information that could help them discover and refine a position on the “zero tolerance” weapons policies employed by all of Minnesota’s public schools in response to a federal mandate. While they could bring other sources, notes, outlines, or preliminary drafts to the examination, students were required to write a final draft during the two-hour test session. The test proctor collected students’ final essays, reading packets, and all other reference materials in their possession at the close of the test session.

Four Education Department faculty and staff members gathered on 13 March to read and rate the 35 essays written a month earlier. In doing so they followed a protocol for the “holistic scoring” of writing samples devised by Edward White (1998). After group training in the use of a six-step scoring guide or “rubric” devised for this assessment, readers began individually reading and rating prospective students’ essays. A fifth staff member, serving as the “chief reader” for this assessment, conducted the training session, recorded essay scores, and oversaw the scoring process.

Each essay was read and scored at least twice. If the two readers’ scores differed by no more than one point, the essay was awarded their total “raw” score and set aside. Should the second reader’s score for an essay differ from the first by more than one point, that essay would be read by a third reader. Essays with a total raw score of “7” (a “3” [fail] plus “4” [pass] pattern) were always read by a third reader to resolve the essay’s mid-point position on the rubric. The third score, should it fall within one point of one of the previous scores, would break the impasse by assigning the essay to either “Level Two” (passing) or “Level One” (failing). Following the scoring session, all failing essays and a sample of passing essays were read again by the chief reader to verify that their original scores fit the demands of the Education Department Writing Assessment Scoring Guide.

Results. The following table reveals the distribution of scores awarded to students’ essays written during the 14 February examination. The scoring guide on which levels and scores are based is appended to this report.

Level       Score     Frequency

3           12
3           11
2           10        4 (11%)      ****
2           09        6 (17%)      ******
2 Pass      08        15 (43%)     ***************
            07
1 Fail      06        6 (17%)      ******
1           05        3 (9%)       ***
1           04        1 (3%)       *
1           03
1           02
Total                 35

As in recent administrations of the department’s writing assessment, readers did not score any essay above “Level Two.” The proportion of passing (71% of 35) to failing essays (29%) is generally consistent with aggregated performance on this assessment. Students whose writing falls into the “lower one-half” of the rating scale were awarded scores of one, two, or three by each of two readers. Because the department believes that those who would teach others to write must themselves be competent in that skill, students with “Level One” essay scores were encouraged to strengthen their performance in this basic academic skill before seeking acceptance to the department’s licensure programs. Those who felt their scores were not an accurate reflection of their writing performance were afforded additional opportunities to verify their performance by retesting or through review of samples of their writing.

Looking more closely at the ten essays that did not receive passing ratings, trends emerging in past assessments seem evident once again. Those who wrote five of the ten failing essays did not include all required elements as specified in the prompt: a thesis, reasons, evidence, and a call to action. Eight of the ten did not develop an organizational pattern that could advance these elements within the context of a viable argument. Seven of the ten failing essays revealed “limited or faulty reasoning” (a score of 3). Writers of failing essays included weak forms of evidence in support of their positions on the prompt (9 of 10). Most (7 of 10), however, wrote essays that were free of significant “mechanical” errors that could hinder readers’ understanding. While this topic has encouraged several “Level Three” performances when used on past occasions, such was not the case on 14 February. Conversations with students awarded failing scores suggest that some may have invested little time in preparing for their assessment in the belief that they could retest or take “Option Two” (review of five recently written papers) should they be unsuccessful. Reading packets for these students revealed few signs of use, in contrast to the packets of most writers.

While we might hope that an essay would attract the same score from two or more trained readers using the same scoring guide to estimate the quality of that essay, the complex nature of judgments associated with the holistic assessment of writing suggests that such an expectation may be unrealistic. Yet we should expect to find a reasonable estimate of the consensus among readers trained in the use of the scoring guide and familiar with the nature of holistic scoring.

The following table reveals the scoring consensus for the essays prepared by the thirty-five hopeful students who gathered for the examination on 14 February.

  • Essays scored after two readings: 17 of 35, 49%
  • Essays scored after three readings: 17 of 35, 49%
    • “Score 7” essays: 14 of 17, 82%
    • “More than 2” essays: 3 of 17, 18%
  • Essays scored after four readings: 1 of 35, 3%

About one-half of the essays were given the same raw score by each of two readers (4, 4), or two scores which did not differ by more than one point (4, 5) and which did not total seven. As many essays required three readings (17) as on other scoring occasions, although more in this session were “7” essays (3, 4). Our custom for several years has been to read “7” essays a third time to pull the “3, 4” essay into either the passing (4, 4 or 4, 5) or failing (3, 3 or 3, 2) group. Fourteen of these 17 “three reader” essays were first scored as “7’s,” comprising about 40% of the 35 essays in this sample and 82% of the 17 essays that were read by three readers.

The scoring process used for this group of essays followed the general outline of the 19 previous scoring sessions, although only four readers were invited to participate on 13 March in light of the smaller number of essays to be assessed. Once they were trained to score these essays, readers worked at their own pace. The following table reveals descriptive information for each of the four readers.

                  Reader 1   Reader 2   Reader 3   Reader 4

Essays Scored:    16         19         27         27
Mean Scores:      3.1        3.9        3.8        3.2

The range of mean scores for these four readers on this occasion, .8 (from 3.1 to 3.9), is consistent with previous sessions. The number of essays read, from a low of 16 for Reader 1 to a high of 27 for Readers 3 and 4, is also consistent with previous sessions.

Those who consult Appendix II will find a copy of the scoring guide used by readers to rate essays in this sample as well as examples of essays rated at Levels Two and One.

References

Stemler, S.E. (2004). A comparison of consensus, consistency, and measurement approaches to estimating interrater reliability. Practical Assessment, Research, and Evaluation, 9(4). Available online: http://PAREonline.net/getvn.asp?v=9&n=4.

White, E.M. (1998). Teaching and Assessing Writing. Portland, ME: Calendar Islands Publishers.

Appendix I

A. Scoring Guide: Essay Form G.2: "Test Scores and School Effectiveness"

Lower One-Half Essays: Scores of 1, 2, or 3

Score 1. The "1" paper suggests incompetent writing hindered by serious errors. The paper reveals poor comprehension of the persuasive or argumentative task, usually offering no useful structure or evidence. The writer is often unable to arrange thoughts into minimally acceptable prose. Essays scored at this level may be seriously “off the topic,” as if written in response to a different question or “prompt.”

Score 2. The "2" essay reveals a significantly incomplete response, usually neglecting some of the assigned topic's required elements. It will also lack coherence, direction, or focus. The author may or may not clearly take a stand on the topic. A paper at this level may wander off the topic or portray circular reasoning. While a "2" paper may include reasons in support of the author's position, some or all of those reasons may not be relevant for the author's position. Restatement may appear in place of a reason or development of an argument. The reader may be distracted by serious errors in word choice, sentence structure, punctuation, or other facets of the "mechanics" of composition.

Score 3. While better than a "2" essay, a paper scored as a "3" remains in the lower one-half of the score range because it slights or ignores one part of the assignment (no thesis, reasons, or evidence), offers personal assertions or other weak forms of supporting evidence, or reveals limited or faulty reasoning. Such essays often fail to lay a persuasive foundation of logic and evidence. Organizational or mechanical difficulties, while less serious than in a "2" essay, may seriously weaken the essay.

Upper One-Half Essays: Scores of 4, 5, or 6

Score 4. A "4" paper takes a clear stand or position on the topic, but may lack a unified argument, an arguable proposition, or at least three distinct reasons supporting that position. The writer may offer a statement of personal preference on the topic, yet will go beyond an appeal to personal opinion as a reason for that preference. A "4" paper will reveal two or three logically developed reasons to support the writer’s position on the topic, although one or two may reveal flawed thinking. Stronger forms of evidence will be used to support most of those reasons, if not all of them. Mechanical or structural errors will be evident, but less frequent than in a paper that falls within the lower one-half of the range of scores. Essays scored at this level often reveal a writer working toward control of the persuasive task.

Score 5. The "5" essay responds to all requirements of the prompt, offering both a clearly stated and well-argued position. Each supporting reason is distinct, going beyond a simple rephrasing of the writer's position. Each reason is clearly supported by relevant evidence which is more often in the form of expert opinion or empirically founded generalizations. All parts of the paper will be connected in some direct manner. Papers at this level affirm the writer’s ability to compose clear sentences that form coherent, fluent paragraphs. A "5" paper, while not error free, does not distract the reader with poorly chosen words, grammatical errors, or spelling mistakes. It is in most respects a "very good" paper.

Score 6. An essay awarded this highest score adds a unity of tone and point of view as well as a good sense of audience to the features of a "5" paper. The writer selects and uses sufficient evidence to encourage the reader to respect the significance of his or her point of view on the topic. Essays at this level may reveal a creative approach to the topic, perhaps evident in the use of quotations, images, or humor. Sentences in a "6" paper will show more variety in form, with more sophisticated use of vocabulary, transitions, and connecting words than will be evident in a "5" essay. Papers at this level, as Dr. Sass notes, are "exquisitely written."

B. Two Essays Transcribed as Written

Should Minnesota use students’ test scores to judge a school’s effectiveness?

Essay A

Across the country the effectiveness of our schools has been declining. Students are not learning the basic educational skills such as reading, writing, and mathematics that are needed to be successful in their future. The Government is aware of this decline in education among schools. Their way of solving the school’s problems is by using standardized tests. Many teachers and principals are against such tests. They feel that there is “no way they can make the required ‘adequate yearly progress toward such daunting goals, given the deadlines” (Mathews, p. 1). However the only way to achieve a higher level of effectiveness in our schools is with the help of standardized testing. That is why standardized testing should be used to evaluate schools and the scores should be used to establish a system of reward and punishment, develop morale and deside when Government assistance is needed.

To make standardized testing effective, a system of reward and punishment needs to be in play. This system will motivate teachers to work hard at teaching their students the basic skills they will need to pass standardized tests. Such rewards may be money split among the teachers for being successful in their teachings. Their also needs to be punishments set up for those teachers that fail to improve their student’s test scores. Punishments could be as drastic as losing ones job. As Representative Joe Opatz, Del-St. Cloud said, “People need to understand that, at some point, you have to try something fairly dramatic to turn a school in crisis around!”

With a reward and punishment system in play, teachers and students will develop a positive morale. A school that becomes a high-performing school will develop positive mental and emotional conditions which often lead to better organized learning environments. Students might not understand the real value of their education. I, even as a nineteen year old college student, sometimes do not understand why I pay over $20,000 dollars in tuition for my education. It is up to the teachers to show students the importance of an education and this requires a positive moral.

If teachers fail to higher the level of morale, the Government must step in as a last resort. There are many options the Government has to try and help low-performing schools. One option is improvement planning which “may require a failing school to design and implement a formal improvement plan, identifying deficiencies and outlining strategies to bolster achievement” (Education Week p. 1-2). Another option is to bring in teams of experts into failing schools that will monitor and offer recommendations and on improving teaching efforts and also help train school staff. As a last resort Governments may use reconstitution, when a state or district replaces one or all of a school’s staff members and starts over, or government takeovers which is basically the same thing as reconstitution, but usually an entire district is taken over. These steps are needed if a school is not improving their test scores.

Standardized testing should be used to evaluate schools and the scores should be used to establish a system of reward and punishment for the teachers and schools. Giving teachers an extra insentive to do their best to teach their students the basic skills will improve test scores and develop positive moral. As teachers develop positive mental and emotional conditions in their classroom, more students will be inclined to learn and do their best at their school work. If for some reason teachers fail to higher the level of morale, the government must step in as a last resort. They must assist schools to become a better learning environment and as a last resort takeover the school and basically start over with a new staff if schools are chronically underperforming. With standardized testing to evaluate school’s effectiveness the government will be able to help low-performing schools improve and reward schools that are succeeding in their teaching skills.

Scores: First Reading 2 Second Reading 2 (a failing, Level One essay)

Essay B

Do students’ test scores determine the effectiveness of a school? This is a highly debated issue in the world of education today. The No Child Left Behind Act passed recently states that by the year 2014, 100% of all students should be proficient in reading and math. What happens to schools that cannot meet this goal? Can these scores alone be the determining factor of whether or not a school is “low performing” or “failing?” Minnesota should not use students’ test scores to judge a school’s effectiveness. The No Child Left Behind Act, coupled with the states requirements is both unfair and unrealistic for schools in certain areas. Teachers and faculty will have less freedom and added pressure. With added emphasis on test scores, more time is taken from other aspects of education. With so many other outlying factors, test scores alone cannot judge a school’s overall effectiveness.

If a school’s effectiveness is solely based on its student’s test scores, it would be unjust for overcrowded schools and schools in poverty stricken areas. Schools labeled as failing or low performing are often located in areas with higher poverty rates. When a student doesn’t have access to necessary materials, proper facilities, and quality teaching, how can they be expected to pass the states standardized tests? Students in these areas may be worrying more about their home life than the test they are about to take. With the NCLBA being passed, many educators feel that it is unrealistic to make the necessary annual progress. The new federal law would require that all students pass state testing in twelve years. This includes all handicapped students as well as those who speak English as a second language. These are the students what have the hardest and most challenging time preparing for testing. Take Midwood High School in Brooklyn, New York, for example. Midwood High has high graduation rates and was viewed as one of the more successful schools in the city until recently. A few weeks ago Midwood High was placed on the state’s failing schools list, despite being so successful. The high school was placed on the list because 33 disabled students failed to pass the state’s math and reading tests (New York Times, “Define Paradox”). It is simply unfair for the entire school to be labeled low performing because such a small percentage of the student body has insufficient test scores.

Placing a higher emphasis on a school’s test scores also causes problems for the staff and faculty at the school. When a school is labeled as “failing“ or “low performing,” it obviously upsets the school’s teachers, leaders, and parents. When a school receives a negative review, there now becomes the tremendous pressure to improve students’ test scores. In a study done at 11 “failing” schools in Kentucky and Maryland, educators were demoralized and became dissatisfied with their jobs after receiving the negative review. This lead to high turnover rates among the teachers at these schools. The study goes on to mention, “The negative label often drove teachers out of their schools, because of what researchers called “intolerable pressures” to improve student performance” (Education Week, State Intervention). Picture yourself as an educator in a Minnesota school. Every student in your class passes the state’s standardized tests in reading and math. Your school as a whole however, is labeled low performing. Isn’t this demoralizing, especially since you had absolutely no control over the situation? Now that your school is “low performing,” does that make it less effective?

If test scores will determine the effectiveness and success of Minnesota schools, every school will strive to have its students master the basic skills required by the state and NCLB. If a school has not been performing well on tests in recent years, there is an immediate need for improvement. Where is the extra time needed to achieve these goals? If teachers need to focus more on reading, math, and writing to pass state testing, students loose out on opportunities in art, physical education, music, and science. Something will obviously have to give. Now we will have debate over what is put into the curriculum and ultimately what is left out. Who’s to say that mathematics and reading standards are more important than art or music classes? One article describes the situation perfectly. “Everything will be fine as long as social studies, language arts, science, mathematics, geography, and the arts all receive 30 percent of the curriculum” (Focusing on Outcomes).

Can a simple test score really determine the overall success and academic achievement of a school? Tests in and of themselves do not improve the quality of education; teachers do. We should be more focused on improving the schools and students that need the most help, not test scores. There are too many variables that can determine the success or failure of standardized testing. Testing alone should not be the determining factor in judging a Minnesota school’s effectiveness.

Scores: First Reading 5 Second Reading 5 (a passing Level Two essay)

Appendix II: Spring 2004

A. Scoring Guide: Essay Form H.2: "Zero Tolerance"

Lower One-Half Essays: Scores of 1, 2, or 3

Score 1. The "1" paper suggests incompetent writing hindered by serious errors. The paper reveals poor comprehension of the persuasive or argumentative task, usually offering no useful structure or evidence. The writer is often unable to arrange thoughts into minimally acceptable prose. Essays scored at this level may be seriously “off the topic,” as if written in response to a different question or “prompt.”

Score 2. The "2" essay reveals a significantly incomplete response, usually neglecting some of the assigned topic's required elements. It will also lack coherence, direction, or focus. The author may or may not clearly take a stand on the topic. A paper at this level may wander off the topic or portray circular reasoning. While a "2" paper may include reasons in support of the author's position, some or all of those reasons may not be relevant for the author's position. Restatement may appear in place of a reason or development of an argument. The reader may be distracted by serious errors in word choice, sentence structure, punctuation, or other facets of the "mechanics" of composition.

Score 3. While better than a "2" essay, a paper scored as a "3" remains in the lower one-half of the score range because it slights or ignores one part of the assignment (no thesis, reasons, or evidence), offers personal assertions or other weak forms of supporting evidence, or reveals limited or faulty reasoning. Such essays often fail to lay a persuasive foundation of logic and evidence. Organizational or mechanical difficulties, while less serious than in a "2" essay, may seriously weaken the essay.

Upper One-Half Essays: Scores of 4, 5, or 6

Score 4. A "4" paper takes a clear stand or position on the topic, but may lack a unified argument, an arguable proposition, or at least three distinct reasons supporting that position. The writer may offer a statement of personal preference on the topic, yet will go beyond an appeal to personal opinion as a reason for that preference. A "4" paper will reveal two or three logically developed reasons to support the writer’s position on the topic, although one or two may reveal flawed thinking. Stronger forms of evidence will be used to support most of those reasons, if not all of them. Mechanical or structural errors will be evident, but less frequent than in a paper that falls within the lower one-half of the range of scores. Essays scored at this level often reveal a writer working toward control of the persuasive task.

Score 5. The "5" essay responds to all requirements of the prompt, offering both a clearly stated and well-argued position. Each supporting reason is distinct, going beyond a simple rephrasing of the writer's position. Each reason is clearly supported by relevant evidence which is more often in the form of expert opinion or empirically founded generalizations. All parts of the paper will be connected in some direct manner. Papers at this level affirm the writer’s ability to compose clear sentences that form coherent, fluent paragraphs. A "5" paper, while not error free, does not distract the reader with poorly chosen words, grammatical errors, or spelling mistakes. It is in most respects a "very good" paper.

Score 6. An essay awarded this highest score adds a unity of tone and point of view as well as a good sense of audience to the features of a "5" paper. The writer selects and uses sufficient evidence to encourage the reader to respect the significance of his or her point of view on the topic. Essays at this level may reveal a creative approach to the topic, perhaps evident in the use of quotations, images, or humor. Sentences in a "6" paper will show more variety in form, with more sophisticated use of vocabulary, transitions, and connecting words than will be evident in a "5" essay. Papers at this level, as Dr. Sass notes, are "exquisitely written."

B. Sampled Essays Transcribed as Written.

Should the Big City School District maintain a safe environment for learning by continuing to enforce its “zero tolerance” policy for weapons?

Essay A

Zero tolerance.  Doesn’t it sound nice? It sounds like the perfect way to get rid of drugs, violence and weapons in schools. Zero tolerance gives parents, teachers, and to a certain extent students the comforted feeling that all the “bad guys” will be rooted out. Let’s look into it further; what happens to the students that are trying to protect their loved ones from committing suicide by bringing the weapon to school? What about the student who perhaps lent a car to someone else and was ignorant to the fact that there was a weapon in the car? There are two sides to everything and is it fair to treat every situation in the same manner as a student who deliberately brings guns to school? I should think not, but they are.

Every one acknowledges the fact that violence in American schools is growing. Zero tolerance is actually a clever way to send the message across to delinquent students that weapons will not be tolerated. However, there should be ways in which students who made “innocent mistakes” shouldn’t be subject to such harsh correction. For example, a parent by the name of Melany (last name not given for security) shares her story about how her son’s friend brought a BB gun into his car without him being aware of it. During a confrontation with other students in a parking lot, the friend took the BB gun out of the car. Being the kind of friend that her son was, he took if from his friend and stuck it between the car seat. He didn’t want his friend to get into a trouble. They then drove away. It so happened that the security guard had seen what happened and contacted the principal. Zero tolerance policies were applied and her son was expelled with a record to his name as well. In some states schools provide alternative placements for the child who has been expelled. It’s called “Teleschool,” but unfortunately Melany’s son could not be placed in Teleschool because he was in advanced placement classes (Melany’s son had a 3.7 GPA and was in his last year of high school). The Teleschool told Melany that there was no class that her son could take and obtain credits for because he had already taken them all. No other school would take her son because of his record. Should kids who are above average in intelligence not be allowed to go to alternative placements? Should their lives be abruptly stopped because of one mistake? The principal in that school should have given punishment, yes, but not so severe as expulsion. Punishment should be given because no one wants to send the message across that you are going to get off “Scott free” but in that case you must agree that expulsion was too severe.

Zero tolerance does help and it has and is doing it’s part. There are some students, without a doubt in my mind, that is looking forward to the opportunity to release their pent up rage. Then there are the majority of students who just want to lead a normal life. Most kids just want to go to the mall, hangout with friends, date and then graduate. They don’t want to shoot school’s and kill their fellow school mates. Principals should be aware of this and see that not in all cases is it appropriate to apply zero tolerance. Society is partly to blame for boys wanting to bring guns and knives to school in the first place. How? Lets look at it. From the time that I was growing up, it was not normal and wasn’t acceptable for boys to play with dolls, hair clips and head bands or anything other “girlie” stuff. Boys were encouraged to play with truck, cars, helicopters, ect. There is evidence that supports this because if we go to any store and look at the boys aisle do we see shirts that have pink flowers? Pink hearts? Lace? No. We see shirts that has some kind of outdoor sport logoed, or a picture of a car or of skaters. Boys are raised to be competitive, to be tough. Boys always want to show off their new found “toy”. What better place than to bring their “toys” to school? All of their friends are there and it is the ultimate hang out spot. Yes, they are aware of the dangers, and yes the rules state clearly the consequences but how many of us have not broken rules? How many of us can truly say we have not taken risks?  Society has taught these little boys how to become men but then cut them short when they want to play by the rules. This might seem to you the reader as gender biased but how many school shootings have you heard of being done by a girl?

Let’s look at one more factor related to zero tolerance. If a student builds up enough rage to consider bringing a weapon in intent to actually cause harm to other students would he really be concerned with being expelled? Perhaps he would be thinking more on the line of not wanting to go to jail. School principals should reduce the zero tolerance policy to, instead of expulsion for any weapons brought to school. They should sit and consider the case in which the incident happened and issue a severe but lower form of punishment. For cases in which clearly the student’s intent was not to harm, perhaps suspension. To ensure that the child is not lying they should investigate further into the situation. The following case is a perfect example. A young high school student after bringing a gun to school was expelled. He should be, after all; what is a gun doing in school? But consider this; that morning in a drunken rage his father stuck a gun down the youngster’s throat and before passing out threatened to kill him and his younger brother. The student brought the gun to school to save their lives. Before he could give it to his principal, the gun was discovered. No amount of explaining helped because of zero tolerance. I am not saying that yes guns should be brought to school, but instead of expelling the student they should investigate further into the situation and perhaps issue a punishment (as not to come across as permitting guns) but less severe. This would even be a social work case. An old adage goes “There is more than one way to skin a cat” meaning there is more than one way to deal with a problem. In this case expulsion is not always the answer.

In conclusion, we should all think about what would happen when there is a growing amount of kids becoming expelled and being uneducated. Wouldn’t the crime rate rise? There is more deeper issues to this than there appears. In order to really evaluate the solution and find it effective we must first try it. Why shouldn’t we fight for the rights of students who are innocently expelled? Why don’t we take actions on our beliefs? The innocent should not suffer for the guilty and every one deserves the right to an education. Let’s take action and stop the madness of zero tolerance and bring back the old days of “second chances.”

Scores: First Reading: 2 Second Reading: 2 (Scored as a failing Level One essay.)

Essay B

Mary is an 11 year-old student at a small private elementary school. She is a bright student who has always followed the rules. One morning her mother packed her lunch and included an apple peeler because she knew her daughter did not like the skin of the apple. When, during lunch, a supervisor noticed Mary peeling the apple with this tool, Mary was immediately taken to the principal’s office. According to the school’s definition of a weapon, the apple peeler was a potentially dangerous object. The school’s zero tolerance policy for weapons requires any student caught with such an object to receive immediate expulsion for at least one year. Mary was expelled.

This incident never actually occurred, but similar situations have arose from the application of the zero tolerance policy. With the increase in school violence during the 1990s, it is clear that our schools are in need of some program that will combat this problem. However, zero tolerance for weapons contains many flaws which have left me as well as many others with this question: Does zero tolerance really address the serious problem of school violence? If have concluded that because the zero tolerance policy is verifiable unfair, inconsistently applied and interpreted, and negatively influential in the lives of our children, the Big City School District should no longer enforce the zero tolerance policy for weapons in its schools.

When critically examined, one can find evidence to support the unfairness of the zero tolerance policy. First and most important, this law does not take intention into account. There is a large difference between whether a student has brought an instrument to school to peel an apple, as in Mary’s case, or brought it with the intention of harming another person. The zero tolerance policy treats dissimilar problems with similar behavior outcomes. As a result minor incidents are treated with the same severity as major incidents that actually lead to school violence. This in turn leads to the expulsion of innocent children. In addition, the zero tolerance policy is unfair because it infringes upon the rights of all citizens to remain “innocent until proven guilty.” Those for this policy will argue that statistics seem to support the effectiveness of this policy. According to a study conducted by the Department of Children, Families and Learning (CFL) there were 920 weapons incidents in Minnesota schools in 2000-2001. Zero tolerance was applied and students were expelled. What makes this practice unfair is the fact that only 1.3 percent of these incidents involved firearms with potential lethality. As a result, innocent children with no intention of harming anyone were expelled.

Another problem with the zero tolerance policy is its inconsistent application and interpretation. This policy does not clearly define what is allowed and what is not allowed. According to the American Heritage Dictionary, zero tolerance is defined as “the policy or practice of not tolerating undesirable behavior, such as violence or illegal drug use, especially in the automatic imposition of severe penalties for first offenses.” This definition sounds plausible but it does not clearly state what actions are undesirable and what specific consequences will result from violation of the rules. Because Weapons Assessment Committees have not been able to develop a clear policy, zero tolerance has been left to interpretation by separate school districts. A major problem with this is that often those enforcing the policy do not clearly understand it. Some schools will adopt this policy simply because they want to send a message to potential violators. Unfortunately, when approached from this viewpoint, zero tolerance does absolutely nothing to encourage students in our schools to make the right decisions or to help those who are having difficulties in school. Without counseling, conflict-resolution programs, peer mediation, anger management programs, and other alternative punishments, the problem of school violence will continue in our schools.

My final argument, and most important, concerns the impression and influence that such a policy creates on our children. By refusing explanation by examination of intention, we are teaching our children to be intolerant of others. Is it not true that from the first day of kindergarten we work to instill in our students the importance of acceptance of others especially those who come from other cultures and backgrounds? The current policy is contradicting these values. It teaches children that mistakes are bad and deserve punishment; that communication with others is not necessary; that understanding is not valued in our schools. If our children are intolerant of others this in itself will lead to violence. Unable to deal with their anger, our children will make poor choices and the zero tolerance policy will be to blame. Such a policy that leads to larger problems than the one it attempts to address is inefficient.

In addition children who are punished by this policy are more likely to drop out of school, even extremely intelligent students.  By removing children from the school environment, we are taking away their education and opening up new opportunities for these students to become involved in gangs and other poor circumstances.

In conclusion, the zero tolerance policy needs to be amended or thrown out all together. There are several alternative methods for combating the problem of school violence. Our district would benefit from adopting the solution proposed by Dr. Richard L. Curwin and Allen N. Mendler. They call their solution “as tough as necessary.” This approach works to create a balance between being strong and being fair. It allows for consequences that fit the offense. It encourages our students to exhibit tolerance while recognizing those actions that are not acceptable behavior in our schools. If this policy had been used in the case of Mary and the apple peeler, she would not have been expelled. She would have been allowed to continue her education and her classmates would have been exposed to a fair system of punishment that encourages every person to be tolerant of one another by allowing an explanation of intent.

My hope is that the Big City School District will receive this essay with an open, tolerant mind as well; seriously reflecting upon what we value in our schools.

Scores: First Reading: 4 Second Reading: 5 (Scored as a passing Level Two essay.)

D. Leitzman, May 2005
