Sunday, July 28, 2013

Pythonic Python - Writing Python code that fits the language's idioms.

A recent posting by +Luke Plant to the Google+ Python community about the importance of PEP-8 reminded me that for quite a while now I've been thinking I should compose a blog posting about idiomatic Python code - and that's what led to this article.

Now I know this is a very small candle trying to light a very large dark space, but so it goes. My own Python experience has so far been limited to small projects and class-work, no larger scale production quality team efforts. The good news is that that limitation in my own experience means there's plenty of room for you to chime in with opinions of your own on this matter. I write posts to my blog in hopes of sparking up some conversations. Do feel free to speak up!

Some Orientation if you're New Around Python

If you are new around Python and aren't entirely clear on what it means when we talk about "the Python community", I offer you this orientation section lest you be confused by technical details in my words here. If any of it remains unclear, do speak up about that and I'll try to expand the material to cover more.

A "PEP" is a "Python Enhancement Proposal". There's a web site where all the PEP's are published and shared. PEP 0 is an index to the PEP's. PEP 1 is a guide to how to write and submit a PEP.

Most PEP's are proposals to enhance the Python programming language in some way.
For example, PEP 435 proposes adding an enum data type to the Python standard library. It's a long-discussed topic: PEP 354 proposed something similar but was rejected in 2005. PEP 435 has been accepted and is scheduled to be implemented and released as a feature of Python 3.4.

Other "PEP's" are merely informational, not really enhancement proposals at all.
For example, PEP 429 is the nominal schedule and plans for the Python 3.4 release. So, if PEP's aren't always "Enhancement Proposals", what are they? PEP's are public records of consensus opinions of the Python community. Not every voice counts the same. Benevolent Dictator for Life (BDFL) and creator of the Python language in the first place, Guido van Rossum, has extraordinary influence on the fate of a PEP. Fortunately, he generally shows good judgement in steering the language.

Among the informational PEP's, PEP-8 is a style guide for Python code. If you don't want your code to look strange to the eyes of experienced Python programmers, you should try to comply with the guidance of PEP-8. To help you do that, there's a PEP-8 checking program. Actually, if you do a Google search, you'll find that there's more than one such PEP-8 checking program available. You should have at least one of these programs in your tool box and make a habit of running it against your code and spending the time to tidy things up to make the checker program happy.

It's not that you can't ever bend the rules, but you darn well better have an excellent reason why deviating from the style guidelines was worthwhile in the exceptional case that you decided not to fix.

Another informational PEP that you should read is PEP-20, the Zen of Python. Unlike PEP-8, the guidance of PEP-20 isn't something so simple that a straight-forward program can look at your code and say whether or not you were thinking like a Python programmer when you wrote your code. PEP-20 does try to get your head into the right way of looking at the things that you wish to program using Python.

Beyond Style

There's more to writing "Pythonic" Python code than following PEP-8's style guidelines and letting PEP-20 shape your thinking. For example, given a collection of things (a "list" being the most typical way of setting that up in a Python program), if your reflex to process that list is to think "DO I=1 TO N"... then you are probably still under the influence of Fortran or C or some other such old programming language. A more Pythonic way is "FOR ITEM IN LIST:"... Now there are lots of special situations that might drive you to explicitly step an index through a list, even in Python, but don't do it as a simple automatic habit. Ned Batchelder has written an excellent tutorial on how to "loop like a native".
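To make the contrast concrete, here is a minimal sketch. The list of colors and the variable names are made up for illustration; they aren't from any particular program.

```python
# Hypothetical sample data, used only for illustration.
colors = ["red", "green", "blue"]

# The index-stepping habit carried over from older languages.
shouted_by_index = []
for i in range(len(colors)):
    shouted_by_index.append(colors[i].upper())

# The more Pythonic form: iterate directly over the items.
shouted = []
for color in colors:
    shouted.append(color.upper())

# When the position really is needed, enumerate() supplies it
# without manual index bookkeeping.
numbered = []
for position, color in enumerate(colors, start=1):
    numbered.append("%d: %s" % (position, color))
```

Both loops build the same list; the second simply has fewer places for an off-by-one mistake to hide.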

One other area where I've found I have to fight against habits formed in working in other older programming languages is in conceiving of the types of return values from a function. In some old languages, the return values were limited to something simple, and to exactly one thing. But Python is entirely dynamic, so you can feel free to inexpensively return elaborate data structures or even multiple values (tuples) at a time. It isn't inordinately expensive because the implementation doesn't copy the values around; it just passes references to them. The language's runtime garbage collector takes care of reclaiming the storage space when you no longer need the fancy value that you constructed.
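As a small sketch of what that looks like in practice (the function names here are invented for the example), a function can hand back several values as a tuple, and the caller can unpack them in one line. The second function shows that returning a large object hands back a reference to the same object, not a copy.

```python
def min_max_mean(values):
    """Return the smallest value, the largest value, and the mean."""
    values = list(values)
    mean = sum(values) / float(len(values))
    # The comma-separated values are packed into a single tuple.
    return min(values), max(values), mean


# Tuple unpacking gives each returned value its own name.
lowest, highest, average = min_max_mean([3, 1, 4, 1, 5])


def passthrough(data):
    """Return the argument unchanged."""
    return data


big_list = list(range(1000))
# 'is' compares identity: the returned object is the very same list.
same_object = passthrough(big_list) is big_list
```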

Additional Tools

It is important to understand that Python does incredibly little error checking at "compile" time. Your code will pass without complaint from the Python language processor, even if your program has grossly mis-spelled variable names or calls non-existent functions. You won't hear about the errors in a given line of code until you actually try to execute that line. This makes testing your code incredibly important. You should definitely look into Python unit-test tools so you can embed test cases with your code and be well prepared to routinely re-test after you make revisions to the code. Related to testing, you might also benefit from coverage tools that report which portions of your code remain unexercised. This may guide you into beefing up your unit test cases. Sadly, even 100% test coverage of all the lines of code still can't guarantee that your code has no undiscovered bugs lurking in it. But if you haven't even exercised all the lines of code it is easy to anticipate that there may be easy-to-find errors lying in wait to spring out at you at some inopportune time.
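Here is a rough sketch of the problem and of how a unit test smokes it out, using the standard library's unittest module. The greet function and its deliberate misspelling are invented for the illustration.

```python
import unittest


def greet(name, formal=False):
    """Return a greeting; the formal branch contains a deliberate typo."""
    if formal:
        # 'nmae' is misspelled on purpose. Python accepts this function
        # definition without complaint and raises NameError only when
        # this particular line is executed.
        return "Good day, " + nmae
    return "Hi, " + name


class GreetTests(unittest.TestCase):
    def test_informal_branch_works(self):
        self.assertEqual(greet("Ada"), "Hi, Ada")

    def test_formal_branch_hides_a_bug(self):
        # Exercising the other branch is what finally exposes the typo.
        with self.assertRaises(NameError):
            greet("Ada", formal=True)


# Run the tests explicitly so the example also works when pasted
# into a larger script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(GreetTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If only the first test existed, the misspelled name would sit there unnoticed. That unexercised branch is exactly what a coverage report would point at.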

There's no real substitute for good judgement and, as the old adage explains, good judgement is something you learn from experience, and experience is often something you gain from applying bad judgement. Another tool that may draw your attention to places in need of better judgement is sloccount. sloccount tells you how many lines of non-blank, non-comment source code you've written. As far as I can tell, sloccount itself reports line counts and effort estimates rather than per-function complexity, so pair it with a tool that computes complexity metrics for your functions. A function that is oversized or that scores as exceptionally complex deserves to be re-considered. More often than not, those are the kinds of functions where your undiscovered bugs are lurking.

There are multiple static checking programs that try to find the more obvious kinds of problems for you. pylint, pyflakes, and pychecker are 3 names to start your Google searching with. Some of the suggestions from these tools can be very annoying. For instance, if I need a short-lived integer variable for local use, why shouldn't I name that variable "i"? But generally the checkers are tunable so you can tailor the rules to your taste. Don't just get annoyed with what the checker program tells you. Look at what it thinks it sees and see if you could do better and make it happy while you are at it.

Data Types

One other aspect of the Python language that you should pay attention to is its richness in data types. Strings, lists, dictionaries, sets, tuples... And those are just "collections" of values. If you find yourself frequently searching through a list, stop and think whether a list is really the right choice. Maybe you really ought to use a dictionary (or a set, if you only need the values themselves) to make it simpler to check if a given value is in the collection.
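A short sketch of the same membership question asked of three different collections. The names and departments are made-up sample data.

```python
# Hypothetical sample data, used only for illustration.
employees = [("ada", "Engineering"), ("grace", "Research"), ("linus", "Kernel")]

# A list answers "is it in there?" by scanning element by element.
names_list = [name for name, _department in employees]
found_in_list = "grace" in names_list

# A set answers the same question by hashing, which stays fast
# as the collection grows.
names_set = set(names_list)
found_in_set = "grace" in names_set

# A dictionary does the membership check and also hands back
# the value associated with the key.
department_of = dict(employees)
grace_department = department_of.get("grace")
missing_department = department_of.get("guido", "unknown")
```

For three names the difference is invisible; it's when the list grows to thousands of entries and gets searched inside a loop that the choice starts to matter.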

Modules and Name Spaces

If your Python programs each tend to live in exactly one file, then odds are you aren't making use of Python's modules and name space capabilities to separate your code into manageable sized pieces. Ultimately, that will limit your ability to tackle larger projects, such as those that need a multiple-person programming team. Keep an eye out for possible reusable modules that are worth separating from the specific problem at hand so you can use the same code elsewhere in the future. Python is quite liberal in its handling of type ("duck typing"). You can exploit that to make your code quite flexible about what data it is willing to deal with. Take the time to carefully document what the requirements are for the data that your module can handle. For example, maybe you had in mind that it would handle "employees", but perhaps it could be equally happy with any kind of object that has a mailing address as part of it ("customers" for instance).
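Here is one way the employees-and-customers idea might look. The class names, attribute names, and the address_label function are all hypothetical, chosen just to illustrate duck typing with a documented requirement.

```python
class Employee(object):
    def __init__(self, name, mailing_address, badge_number):
        self.name = name
        self.mailing_address = mailing_address
        self.badge_number = badge_number


class Customer(object):
    def __init__(self, name, mailing_address):
        self.name = name
        self.mailing_address = mailing_address


def address_label(recipient):
    """Format a mailing label.

    Requirement: 'recipient' may be any object that has string
    attributes 'name' and 'mailing_address'. Its class is never checked.
    """
    return "%s\n%s" % (recipient.name, recipient.mailing_address)


employee_label = address_label(Employee("Ada", "1 Analytical Way", 42))
customer_label = address_label(Customer("Grace", "9 Compiler Court"))
```

The docstring is where the "carefully document the requirements" advice lands: it tells a future caller what the function actually needs, which is less than "an Employee".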

Multiple Paradigms

Although Python is not a gigantic language with the sort of sprawl that PL/I was notorious for, Python does allow for more than one programming style. It certainly has support for object oriented programming as well as structured programming. Happily, it doesn't insist that you make use of all of its possibilities, but if you have been shying away from some aspect of the language because it supports a style that you are unaccustomed to, do push yourself toward learning how to use that aspect of the language appropriately. "generators" are a kind of co-routine and you may not have run into such a control structure in other languages, but they are worth the time to learn. Object oriented programming is still a weak area for me, but I've been working on trying to understand how to put that to good use. Test driven development is new to me too, but again I've been trying to regroove my mind to pay attention to doing things that way.
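For anyone who hasn't met generators yet, here is a minimal sketch. The function names are invented for the example. Each yield hands one value to the caller and pauses the function until the next value is requested.

```python
def countdown(start):
    """Yield start, start - 1, ..., 1, one value per request."""
    while start > 0:
        yield start      # pause here until the caller asks again
        start -= 1


def running_total(numbers):
    """Yield the cumulative sum after each incoming number."""
    total = 0
    for number in numbers:
        total += number
        yield total


# Pulling values one at a time shows the pause-and-resume behavior.
ticker = countdown(2)
first = next(ticker)
second = next(ticker)

# Generators chain together; nothing is computed until list() asks.
ticks = list(countdown(3))
totals = list(running_total(countdown(3)))
```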

Libraries and Frameworks

One of the mixed blessings of working in Python is that there are many rich libraries of existing code available for your use. Do plan to spend some time searching to find what is available that may be helpful to you. The code is generally free for you to download, but you may need to invest some time to understand it and bend it to your will. Forking it to make a specific-to-you version is probably a bad idea. But reinventing the wheel on your own is probably an even worse use of your time. If the library module isn't going to comfortably fit into your program, that may be nature's way of telling you to return to your favorite search engine to find another alternative implementation to use instead of your initial "find".

Learning the Python language is just a start. Learning to put it to good use is a much taller order. As Peter Norvig promises in his essay, "Yes, you can learn to program in only 10 years". But don't get discouraged. Learning new stuff every day can be great fun.

11/10/2013 - Corrected a typo. "are are". Being your own editor has its hazards.