Monday, January 7, 2013

Zero, One, Many Rule of Thumb

In Udacity CS101, Unit 2 introduces functions, parameter passing and return values. Lesson 2-17 has us define a function bigger(a,b) that returns the bigger of the 2 values passed in. As in most of the lessons in the course, they pose a question, have the student work the problem and then explain an answer.

One student noted that her answer was slightly different code than the answer that the instructor gave and asked on the forum if there was any reason one was better than the other. To see the original question and answer thread, see: http://forums.udacity.com/questions/2009058/unit-2-lesson-17-alternative-solution-not-longer#cs101

Now if there's one thing I can't resist, it's expressing opinions on a couple of little scraps of code, and defending my choice. Here's what I said:

I liked his "if a>b: r=a; else: r=b; return r" version better than your "if b>a: a=b; return a" version because even though the procedure doesn't change the passed in value of a, and isn't meant to change a out in the calling procedure, the code as you wrote it can easily be mis-read as doing something to the first parameter. Better to introduce local variable r so there's no need to waste a neuron on wondering if the procedure is changing anything in the caller's variables. Better, in my opinion, to leave the parameters passed in as read-only unless there's real reason to alter them. But my habits were formed working with languages that have different parameter passing rules than Python has. Only clear reason to stick with my habits that I can see is what if you had occasion later to add more code to the procedure and wanted access to the passed in value of a? If the code treated a as "read only", there's no chance of a bad surprise.

I actually preferred the "return a; else return b" version because it dodged the need for a short lived local variable. It's one drawback that I can see is that it isn't so obvious how to extend the code to handle more values such as a, b, c, and d. But at some point you probably should think of having a list of values and not a whole bunch of variable names. I was reading recently that code should be written to handle 0 cases, exactly one case or "many cases". Comparing a & b is one case, comparing a, b, c (and d) is clearly more than one case. Instead of coding for 2 cases and then for 3 cases, look for a way to generalize it to handle the "many" case, such as a loop to pick the biggest element out of a list of arbitrary length. But only do that if you have to handle more than one case. Generalizing the code before there's an actual requirement to do so is likely not time well spent. Zero, one, infinity rule is a decent summary, though where I saw this recently was Tim Ottinger's "Agile in a Flash" blog (Simplify Design with Zero, One, Many Rules). I can still find the article in Google reader, but as for rustling up a URL to it to share here, I'm having no success at that whatsoever. It's like the blog has lost track of that article.

Addendum

Well, better late then never. Google searching again today for that blog article found it. I also stumbled into a Wikipedia article that indicates that design rule predates that blog post by years. Alas, it's one of those "[citation needed]" Wikipedia articles that doesn't back up its information with a reference.