Why Average? Alternatives to Averaging Grades

(Part 3 of the “Why Average?” trilogy from the week of Aug 7-14. Here’s Part 1. Here’s Part 2.)

Over the past week, the topic of averaging grades has risen to the forefront of the twitter-verse.  Posts abound around the issues that professional educators have with lumping several disparate values together in the hopes of describing a student’s level of competence or understanding.  (For a reminder of these posts, see Why Average?, xkcd’s TornadoGuard, David Wees’ A Problem with Averages, and Frank Noschese’s Grading and xkcd.)



After seeing so many (including myself) highlight the inadequacy of averaged grades, the words of our county’s assistant superintendent come to mind: “If you offer a problem, you’d better be ready to suggest a solution.”  That being said, here are a few alternatives to sole reliance on averaging student data to describe their competence, organized by the issues described in Part 2 of this “Why Average?” trilogy.

Issue 1: Averages of data that do not match intended outcomes do not suddenly describe outcome achievement.

The xkcd comic (along with the correlation to education on Frank’s blog) ties in most closely to this issue.  So often, we as educators assign points (and therefore value) to things that do not necessarily relate to outcome achievement.  Assigning grades for homework completion, timeliness, and even extra credit for class supplies, and then combining them with outcome achievement data, introduces a high level of “grade fog”, where anyone looking at the final grade would have difficulty parsing out the components that led to it.

In his article, “Zero Alternatives”, Thomas Guskey lays out the six overall purposes that most educators have for assigning grades:

  1. To communicate the achievement status of students to parents and others.
  2. To provide information students can use for self-evaluation.
  3. To select, identify, or group students for specific educational paths or programs.
  4. To provide incentives for students to learn.
  5. To evaluate the effectiveness of instructional programs.
  6. To provide evidence of a student’s lack of effort or inability to accept responsibility for inappropriate behavior.

Frank Noschese’s blog post highlights these cross-purposes: in the image paired with the xkcd comic, the student’s grade of B seems to come from averaging grades that are meant to provide motivation (“I do my homework”, “I participate in class”), responsibility (“I organize my binder”) and information on achievement (“I still don’t know anything”).

The simple answer to this issue would be to stop averaging grades for things like homework completion, class participation, and responsibility together with values for student achievement.  Instead, make grades specifically tied to meeting standards and course objectives.  Of course, if it were that easy, we would all be doing it, right?  I guess the bigger question is, How do we provide the desired motivation and accountability without tying it to a student’s grade?  Guskey’s article suggests several ideas for how one might differentiate these cross-purposes (e.g. a grade of “Incomplete” with explicit requirements for completion, separate reports for behaviors, etc).  Other alternatives from my own practice:

  • Report non-academic factors separately from a student’s grade. Character education is an important part of a student’s profile, though it does not necessarily need to be tied to the student’s academic success.  I had a category in my gradebook specifically for these kinds of data, though the category itself had no weight relative to the overall grade.  Providing specific feedback to students (and their parents) on topics of organization and timeliness, separate from achievement grades, can go a long way toward getting behaviors to change.
  • Set “class goals” for homework and class participation.  Sometimes, there is no better motivator than positive “peer pressure”.  One of my bulletin boards in my classroom had a huge graph set up, labeled, “Homework completion as a function of time”.  Each day, we would take our class’ average homework completion, and put a sticker on the graph that corresponded to that day’s completion rate for the class.  We set the class goal as 85% completion every day, and drew that level as the “standard” to be met.  As a class, if we consistently met that standard over the nine-week term, there was a class reward.  One unintended consequence: each class not only held themselves to the standard, but also “competed” with other class periods for homework supremacy!  (Of course, there was that one class that made it their mission to be the worst at completing homework…goes to show that not every carrot works for every mule.)
  • Make homework completion an ‘entry ticket’ for mastery-style retests. If homework’s general purpose is to promote understanding, one would assume a correlation between homework completion and achievement.  While I ‘checked’ for homework completion on a daily basis and recorded student scores under a “Homework” category, that category had no weight in the student’s overall grade.  Instead, once the summative assessment came up, those students who did not reach the sufficient level of mastery needed to show adequate attempts on their previously assigned work before we could set a plan for their re-assessment.  You may think that students would “blow off” their homework assignments in this situation- and some did, initially.  However, once they engaged in the process, students did what was expected of them.  Over time, there was no issue with students being unmotivated to do their homework as necessary.
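The zero-weight gradebook categories described in the bullets above can be sketched in a few lines of Python. This is only an illustration under assumed names: the function `overall_grade`, the category labels, and the sample scores are all hypothetical, not taken from any particular gradebook product. The point it shows is that a category can be recorded and reported without moving the overall grade.

```python
def overall_grade(category_scores, weights):
    """Weighted mean of category scores.

    Categories with a weight of 0 are still stored and can be reported,
    but they contribute nothing to the overall grade.
    """
    total_weight = sum(weights.get(name, 0) for name in category_scores)
    if total_weight == 0:
        raise ValueError("at least one category needs a nonzero weight")
    weighted_sum = sum(
        score * weights.get(name, 0) for name, score in category_scores.items()
    )
    return weighted_sum / total_weight


# Hypothetical student: strong on organization, weak on homework completion.
scores = {"Assessments": 82.0, "Homework": 40.0, "Organization": 95.0}

# Only achievement data carries weight; the rest is reported separately.
weights = {"Assessments": 1.0, "Homework": 0.0, "Organization": 0.0}

grade = overall_grade(scores, weights)
```

Here `grade` reflects the assessment score alone, while the homework and organization scores remain available for a separate report or for the retest “entry ticket” conversation.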

Issue 2: Averages of long-term data over time do not suddenly describe current state understanding.

This issue is a little trickier to manage.  On his blog Point of Inflection, Riley Lark summed up his thinking on the subject of how to best describe current state understanding with a combination of long-term data in a post entitled, Letting Go of the Past.  In the post, he compares straight averages to several other alternatives, including using maximums and the “Power Rule” (or decaying average).  I strongly suggest all those interested in this topic read Riley’s post.  Riley has since created ActivGrade, a standards-based gradebook on the web that “[makes] feedback the start of the conversation- instead of the end.”
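To make the comparison concrete, here is a rough Python sketch of three ways to summarize the same sequence of scores: a straight average, the maximum, and one common form of a decaying average. The 0.65 weight on the newest score, the function name `decaying_average`, and the sample scores are illustrative assumptions, not values taken from Riley’s post.

```python
from statistics import mean


def decaying_average(scores, new_weight=0.65):
    """Running average in which each newer score counts more than older ones.

    new_weight is the share given to the newest score at each step; the
    0.65 default is an assumed, illustrative value.
    """
    if not scores:
        raise ValueError("need at least one score")
    running = scores[0]
    for score in scores[1:]:
        running = new_weight * score + (1 - new_weight) * running
    return running


# Hypothetical student who improves steadily on a 4-point scale.
scores = [1, 2, 3, 4]

summary = {
    "mean": mean(scores),                  # treats old and new evidence alike
    "max": max(scores),                    # keeps only the best showing
    "decaying": decaying_average(scores),  # leans toward recent evidence
}
```

For this improving student, the straight average sits at the midpoint of the scale, while the decaying average lands much closer to the most recent score. Reversing the list (a student whose scores decline) pulls the decaying average below the straight average instead.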

For some other resources for ideas:

– – – – – – – – – –

At the heart of the question “Why Average?” is a push to purpose.  While none of the ideas described in this trilogy of posts is inherently right, at the very least, I hope that they have given readers some “jumping-off points” on how to ensure that their methods match their intended purpose.  We owe at least that much to our students.  If you have other resources, ideas, or questions that would extend the conversation further, please share them by all means.

Why Average? on the Minds of Many

(Part 2 of the “Why Average?” trilogy from the week of Aug 7-14. See Part 1 here. See Part 3 here.)

So on Sunday, I posted a comic about how goofy it can be to average long-term data to describe current state measurements.  Imagine my surprise this afternoon upon checking the RSS feed to see this new comic on xkcd:



Earlier today, physics teacher and #sbar advocate Frank Noschese paired the xkcd image with an educational correlate on his Action-Reaction blog:



While this comic tackles a different problem with averaging than does my own post, it seems like concerns with averaging as a description of data are on the minds of many.  (To get an idea of the scope of the discussion, check out the conversations happening in the comment boxes on posts by Frank Noschese and David Wees, respectively.)

Our comics highlight two different but very real issues with trying to describe such a complex thing as learning with such a simple thing as one averaged value:

  • When we take values that do not match intended outcomes (a student’s knowledge, understanding, and skills acquisition) and average them together, the new number does not somehow suddenly describe outcome achievement.
  • Even if we do happen to measure the outcomes described above, but those measures are taken over time and then averaged together, the new number does not somehow suddenly describe current state.

Have you seen any other visuals that help to describe these problems with averaging data?

Lesson Design using Wordle: A Pre/Post Class Assessment for Learning

I have run across many posts in the recent past explaining varied uses for Wordle in the classroom.  (See this post from the Tech Savvy Educator, and this one from Clif’s Notes for some examples that come to mind.)  While I appreciate the springboards that these many examples provide, I did notice that most posts collect many ideas together as opposed to describing the use within the context of a specific lesson design.  Below, I describe the process my students used as an assessment for whole-class learning in my physics classes, where Wordle played an integral part.  I hope that making my practice public can inspire each of you to improve on what I’ve tried- every time one of you shares how the lesson design works in your own classroom, we get a new opportunity to grow and learn from each other!

Pre-Assessment (The ‘Before’)

Before beginning our studies of magnetism, we had a quick class discussion around one question: “When you think of ‘magnetism,’ what comes to mind?”  Using a little “write-pair-share” strategy, we made a list- as they shared aloud, I collected their responses in a Word doc projected on the board.  After all three of my common preps completed this activity, we had three different classes’ “pre-assessed” knowledge around magnetism.  Copying all of that text into a Wordle, we could now find the commonalities in our ideas:


This cloud gives the class a picture of what ‘we’ think in relation to magnetism. As the last conversation was a “class-ending” conversation the day before, the word cloud became a “class-starting” conversation the next day.  We began class by examining this word cloud, questioning what it was that we would likely want to learn next about magnetism.

Learning Time (The ‘During’)

During this 2nd class period, several of the students who had experience in chemistry had a sneaking recollection that there was some relationship between electrons and magnetism, and became the leaders in a short class discussion around the concept of magnetic fields and magnetic domains.  At that point in the lesson design, we had our “do some stuff with magnetic fields” time.  Around the room were several demo stations related to the relationship between electricity and magnetism, where students had two central questions to consider: “What Happens When I Do This?” and “Why Do I Think It Happens?”

Following these experiences- which led students in all sorts of WHWYDT kinds of directions (both expected and unexpected)- we came together as a class to discuss what we had seen at these stations, and what questions had developed from the experiences.  As a closing activity to the day, each student responded to a 1-question Google Form that asked the same question as their pre-assessment: “When you think of ‘magnetism,’ what comes to mind?”

Post-Assessment (The ‘After’)

The next class period, students entered class with this picture in front of them:


By taking the student responses and pasting them into a Wordle, we were able to see what “we” now thought about magnetism.  As a class, we compared this new word cloud to the first Wordle: by analyzing the similarities & differences between these two Wordles, the class was examining what we had learned, and how our thinking had changed.
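For anyone who wants to put numbers next to the two word clouds, the same before-and-after comparison can be approximated with a simple word count. The Python sketch below is not how Wordle itself works internally; the function names, the tiny stop-word list, and the sample responses are all made up for illustration.

```python
import re
from collections import Counter

# A deliberately tiny, assumed stop-word list; a real one would be longer.
STOP_WORDS = {"a", "and", "the", "to"}


def word_counts(responses):
    """Count how often each non-stop word appears across all responses."""
    words = []
    for response in responses:
        words.extend(re.findall(r"[a-z']+", response.lower()))
    return Counter(word for word in words if word not in STOP_WORDS)


def biggest_shifts(before, after, top=3):
    """Words whose counts changed the most between the two sets of responses."""
    changes = {word: after[word] - before[word] for word in set(before) | set(after)}
    # Largest absolute change first; ties broken alphabetically.
    return sorted(changes.items(), key=lambda item: (-abs(item[1]), item[0]))[:top]


# Invented responses standing in for the pre- and post-assessment text.
pre_responses = [
    "magnets attract metal",
    "north and south poles",
    "magnets stick to the fridge",
]
post_responses = [
    "current makes a magnetic field",
    "moving electrons make a field",
    "magnets and current",
]

before = word_counts(pre_responses)
after = word_counts(post_responses)
shifts = biggest_shifts(before, after)
```

Words with large positive changes are the ones that would grow in the second cloud, and words with negative changes are the ones that would shrink or disappear.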

The unintended consequence: many students noted that our new responses went farther down the path of “induced” magnetism (that is, magnetism brought on by electric current), and farther away from the more typical concept of naturally magnetic materials.  They wondered how we would connect these two ideas, as they still seemed disconnected in our thinking.  This connection just happened to be the planned topic of study for the day, not only because it was part of our original pacing guide, but specifically because we had now noticed this trend in the “data” that the Wordle had presented.  The students noticed that the dots were not connected, and the students wanted to connect them, which made the day’s learning much more authentic.  It was not just something I was supposed to teach them: it had become something that they wanted to learn.

Generalizing for Lesson Design:

While not a flawless design, these six steps seemed paramount in increasing students’ desire to learn:

  • Students pre-assessing their own knowledge and understanding – “What does _insert topic here_ mean to me?”
  • Students using Wordle to analyze the pre-assessment responses
  • Students “doing stuff” to experience _insert topic here_ in real life – “What happens when I do this?”
  • Students responding to what they now know and understand – “What does _insert topic here_ mean to me today?”
  • Students comparing the Wordle of their current thinking to that of their pre-assessment responses
  • Students asking the question, “Given what I first thought, and what I now think, what do I think of next?”

Without the use of Wordle, we lose out on a central piece of this lesson design puzzle.

Have you used Wordle as a class assessment for learning with your students?  Please share ideas, questions, and suggestions in the comments.  If you decide to try out this lesson design with a topic in your class with your students, please consider sharing how it goes in the comments- learning from your experiences helps us all grow!

A Response to Data-Informed Decisions

Earlier this evening, I read a colleague’s blog post discussing the concept of making data-informed decisions as opposed to data-driven decisions.  It’s a thoughtful post, one I hope you will read in depth.

The post brought out a response in me that unearthed some Sherlock Holmes quotes I thought I had forgotten. (Read the comment, if you’re interested in the quotes themselves.)  There are a couple of images that seem to sync up well with the idea from the response, so I figured I’d put them up here for posterity’s sake:


These images are meant to be viewed in succession, almost as an evolution.  The 1st image depicts the concept of a data-driven decision as Steven describes it in his blog: data leads to our decision to act in a certain way, and those actions lead to new data.  What this idea is missing- and what Steven asserts- is the process of thoughtful reflection that occurs when you consider not just the data but also the perceived reasons for the data.  In the 2nd image, the data has informed those reasons, and those reasons then drive the decision on how to act.

The 3rd image adds a level of balance into the system as drawn from similar diagrams in Senge’s Fifth Discipline.  In this cycle, our decisions are still driven by the reasons for the data, but here the data is the perceived gap between the actual results and those we expected.  In other words, we’re not necessarily asking ourselves the question, “Why do we see the data we see?” but rather, “What is the reason for the difference between what we see and what we thought we’d see?”

Given time, there would probably be several more iterations of this image- I hope your thoughts will help to continue to shape it into something better than it is today.  Thanks again to Steven & Rich @ Teaching Underground for inspiring the response.