To Auf, or Not to Auf: A Lesson in Communication

Watching the latest episode of Project Runway (hey, don’t knock it ’til you’ve tried it) made me think about project-based learning and the importance of communicating grading practices.

For those who don’t know, Project Runway is Heidi Klum’s challenge-based reality show where one day you’re “in”, and the next day you’re “out” (or “Auf’d!”, as we say, after the German-born host’s good-bye catchphrase). Pitting upstart fashion designers against each other, the 18-week run features one elimination per week, creating a sewing fight to the finish where only one designer can end up on top.


Normally, I end up watching the show through the lens of project-based learning and performance assessment. It’s not really much of a stretch:

  • Given parameters for a task, designers make a plan (for an outfit), gather resources to execute that plan, and develop the final product of a runway-ready outfit – usually within two days or so from soup to nuts.
  • The final product gets assessed by a panel of fashion professionals, who judge the outfit according to some specified criteria along with their own professional opinion.
  • These panelists provide each “Top 3” and “Bottom 3” designer with both commendations and critical feedback. Through deliberation, they then choose one designer as the winner, and one to be Auf’d.
  • Every designer who has participated – whether they continue on or not – has ideally learned more about the skills and understanding it takes to make it in this business.

This panelist interaction highlights for me the distinctions between tests and assessments, between feedback and grades. These terms get thrown around and conflated in education all the time – I thought applying them to this show might be a helpful way to distinguish them. Here’s how the analogy works for me:

  • Test = The task given to the designers
  • Assessment = Observation of the products that resulted from this task, and judgement of the relative quality of that product
  • Feedback = The process of sharing these observations and judgements, while potentially suggesting some changes for future endeavors
  • Grade = In this case, a norm-referenced rating: Winner of the challenge, Top 3, “Safe” in the middle, Bottom 3, “Auf’d”.

Fast-forward to this past week. (SPOILER ALERT, for those who care about that kind of thing.) In episode 7, the task was to make a design that fit into an existing collection developed by Lord & Taylor, a reputable fashion company. After completing the challenge, the runway walk revealed the nine designers’ dresses – all of which were pretty good. I didn’t think there was a “bad” one in the bunch.

Apparently, the judges felt the same way: they decided that everyone “met par” on the challenge, and no one was Auf’d. The decision made perfect sense to me, but then, I’m a standards-based grading addict. The response from Christopher (a contestant on the show):


Because these contestants are used to a certain norm-referenced grading scale – Winner, Top 3, “Safe”, Bottom 3, “Auf’d” – they will react negatively if the scale gets changed on them somehow. They will consider any decision based on that change to be “unfair”, even if it is the fairer thing to do.

Something to keep in mind for all of you starting up the school year with new ways of assessing and grading (standards-based or otherwise): make sure you’ve communicated the change and its rationale before acting on the change. Otherwise, your kids will likely respond just like Christopher did.

Why Average? Alternatives to Averaging Grades

(Part 3 of the “Why Average?” trilogy from the week of Aug 7-14. Here’s Part 1. Here’s Part 2.)

Over the past week, the topic of averaging grades has risen to the forefront of the twitter-verse.  Posts abound around the issues that professional educators have with lumping several disparate values together in the hopes of describing a student’s level of competence or understanding.  (For a reminder of these posts, see Why Average?, xkcd’s TornadoGuard, David Wees’ A Problem with Averages, and Frank Noschese’s Grading and xkcd.)


After seeing so many (including myself) highlight the inadequacy of averaged grades, the words of our county’s assistant superintendent come to mind: “If you offer a problem, you’d better be ready to suggest a solution.”  With that in mind, here are a few alternatives to sole reliance on averaging student data to describe student competence, organized by the issues described in Part 2 of this “Why Average?” trilogy.

Issue 1: Averages of data that do not match intended outcomes do not suddenly describe outcome achievement.

The xkcd comic (along with the correlation to education on Frank’s blog) ties in most closely to this issue.  So often, we as educators assign points (and therefore value) to things that do not necessarily relate to outcome achievement.  Assigning grades for homework completion or timeliness (or even extra credit for class supplies) and combining them with outcome achievement data introduces a high level of “grade fog”: anyone looking at the final grade would have a hard time parsing out the components that led to it.

In his article, “Zero Alternatives”, Thomas Guskey lays out the six overall purposes that most educators have for assigning grades:

  1. To communicate the achievement status of students to parents and others.
  2. To provide information students can use for self-evaluation.
  3. To select, identify, or group students for specific educational paths or programs.
  4. To provide incentives for students to learn.
  5. To evaluate the effectiveness of instructional programs.
  6. To provide evidence of a student’s lack of effort or inability to accept responsibility for inappropriate behavior.

Frank Noschese’s blog post highlights these cross-purposes: in the image paired with the xkcd comic, the student’s grade of B seems to come from averaging grades meant to provide motivation (“I do my homework”, “I participate in class”), to encourage responsibility (“I organize my binder”), and to report achievement (“I still don’t know anything”).
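To make the “grade fog” concrete, here is a minimal sketch of that averaging. The category names, scores, and equal weighting are my own invention for illustration (not taken from Frank’s post), but they show how behavior scores can buoy a respectable letter grade for a student who has demonstrated very little achievement:

```python
# Hypothetical gradebook entries: three behavior categories plus one
# achievement measure, all (questionably) averaged with equal weight.
scores = {
    "homework_completion": 100,  # "I do my homework"
    "class_participation": 100,  # "I participate in class"
    "binder_organization": 100,  # "I organize my binder"
    "unit_test":            40,  # "I still don't know anything"
}

average = sum(scores.values()) / len(scores)

def letter(pct):
    """A typical 10-point letter-grade scale."""
    if pct >= 90: return "A"
    if pct >= 80: return "B"
    if pct >= 70: return "C"
    if pct >= 60: return "D"
    return "F"

print(f"Average: {average:.1f} -> {letter(average)}")  # Average: 85.0 -> B
```

The fog: a parent reading “B” has no way to tell whether it means solid understanding with spotty homework, or perfect compliance masking a 40 on the test.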

The simple answer to this issue would be to stop averaging grades for things like homework completion, class participation, and responsibility together with values for student achievement.  Instead, make grades specifically tied to meeting standards and course objectives.  Of course, if it were that easy, we would all be doing it, right?  I guess the bigger question is, How do we provide the desired motivation and accountability without tying it to a student’s grade?  Guskey’s article suggests several ideas for how one might differentiate these cross-purposes (e.g. a grade of “Incomplete” with explicit requirements for completion, separate reports for behaviors, etc).  Other alternatives from my own practice:

  • Report non-academic factors separate from a student’s grade. Character education is an important part of a student’s profile, though it does not necessarily need to be tied to the student’s academic success.  One way of separating the two would be to report the two separately.  I had a category in my gradebook specifically for these kinds of data, though the category itself had no weight relative to the overall grade.  Providing specific feedback to students (and their parents) on topics of organization and timeliness separate from achievement grades can go a long way toward getting behaviors to change.
  • Set “class goals” for homework and class participation.  Sometimes, there is no better motivator than positive “peer pressure”.  One of the bulletin boards in my classroom had a huge graph set up, labeled, “Homework completion as a function of time”.  Each day, we would take our class’s homework completion rate and put a sticker on the graph at that level.  We set the class goal as 85% completion every day, and drew that level as the “standard” to be met.  As a class, if we consistently met that standard over the nine-week term, there was a class reward.  One unintended consequence: each class not only held themselves to the standard, but also “competed” with other class periods for homework supremacy!  (Of course, there was that one class that made it their mission to be the worst at completing homework…goes to show that not every carrot works for every mule.)
  • Make homework completion an ‘entry ticket’ for mastery-style retests. If homework’s general purpose is to promote understanding, one would assume a correlation between homework completion and achievement.  While I ‘checked’ for homework completion on a daily basis and recorded student scores under a “Homework” category, that category had no weight in the student’s overall grade.  Instead, once the summative assessment came up, those students who did not reach a sufficient level of mastery needed to show adequate attempts on their previously assigned work before we could set a plan for their re-assessment.  You may think that students would “blow off” their homework assignments in this situation (and some did, initially).  However, once they engaged in the process, students did what was expected of them.  Over time, there was no issue with students being unmotivated to do their homework as necessary.

Issue 2: Averages of long-term data over time do not suddenly describe current state understanding.

This issue is a little trickier to manage.  On his blog Point of Inflection, Riley Lark summed up his thinking on how best to describe current understanding from a combination of long-term data in a post entitled Letting Go of the Past.  In the post, he compares straight averages to several other alternatives, including using maximums and the “Power Rule” (or decaying average).  I strongly suggest that anyone interested in this topic read Riley’s post.  Riley has since created ActivGrade, a standards-based gradebook on the web that “[makes] feedback the start of the conversation, instead of the end.”
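To see how these alternatives differ in practice, here is a minimal sketch comparing a straight average, a maximum, and a decaying average on a run of improving scores. The 0.65 weight is an assumption I chose for illustration; Riley’s post and various standards-based gradebooks use different weights:

```python
# One student's scores on a single standard, earliest to latest.
scores = [40, 55, 70, 90, 95]

def straight_average(xs):
    """Every score counts equally, no matter how old."""
    return sum(xs) / len(xs)

def maximum(xs):
    """Best score ever earned stands, even if performance later slips."""
    return max(xs)

def decaying_average(xs, weight=0.65):
    """'Power Rule'-style decaying average: each new score counts for
    `weight`, and the running average counts for the rest (assumed weight)."""
    running = xs[0]
    for x in xs[1:]:
        running = weight * x + (1 - weight) * running
    return running

print(straight_average(scores))           # 70.0
print(maximum(scores))                    # 95
print(round(decaying_average(scores), 1)) # 89.9
```

Notice how the straight average (70) lags far behind what the student can do now, while the decaying average (~90) weights recent evidence most heavily without ignoring the past entirely.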

For some other resources for ideas:

– – – – – – – – – –

At the heart of the question “Why Average?” is a push to purpose.  While none of the ideas described in this trilogy of posts is inherently right, at the very least I hope the trilogy has given readers some “jumping-off points” for ensuring that their methods match their intended purpose.  We owe at least that much to our students.  If you have other resources, ideas, or questions that would extend the conversation further, please, by all means, share them.

Why Average? on the Minds of Many

(Part 2 of the “Why Average?” trilogy from the week of Aug 7-14. See Part 1 here. See Part 3 here.)

So on Sunday, I posted a comic about how goofy it can be to average long-term data to describe current state measurements.  Imagine my surprise this afternoon upon checking the RSS feed to see this new comic on xkcd:


Earlier today, physics teacher and #sbar advocate Frank Noschese paired the xkcd image with an educational correlate on his Action-Reaction blog:


While this comic tackles a different problem with averaging than does my own post, it seems like concerns with averaging as a description of data are on the minds of many.  (To get an idea of the scope of the discussion, check out the conversations happening in the comment boxes on posts by Frank Noschese and David Wees, respectively.)

Our comics highlight two different but very real issues with trying to describe such a complex thing as learning with such a simple thing as one averaged value:

  • When we take values that do not match intended outcomes (a student’s knowledge, understanding, and skills acquisition) and average them together, the new number does not somehow suddenly describe outcome achievement.
  • Even if we do measure the outcomes described above, when those measures are taken over time and then averaged together, the new number does not somehow suddenly describe current state.
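A quick numeric sketch of that second issue, using made-up scores: two students with opposite trajectories can end up with the exact same average, even though their current states of understanding are worlds apart:

```python
# Two hypothetical students' scores on the same standard over a term.
improving = [50, 60, 80, 100]   # clearly gets it now
declining = [100, 80, 60, 50]   # clearly struggling now

def avg(xs):
    return sum(xs) / len(xs)

print(avg(improving))  # 72.5
print(avg(declining))  # 72.5 -- identical averages, opposite stories
```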

Have you seen any other visuals that help to describe these problems with averaging data?