Saturday, November 30, 2013

Linus Torvalds on Teaching Children to Program


In this video clip of less than 3 minutes, Linus Torvalds gives a great answer to a question about teaching children how to write computer code. He specifically mentions the Raspberry Pi as having the attractive property of being cheap enough to throw away. I've said before that if you persuade kids to try a computer programming course, some of them will hate it and will vow never to wrestle with computer software again. But there should be some who will decide this is fascinating and will pursue it further, perhaps making a career out of it.


http://www.youtube.com/watch?v=2KfJiWR1FPw


My conclusion is that an introductory programming course should therefore be short. I like Udacity CS101. It is nominally only 8 weeks at 5 hours of class time per week. It is self-paced and not on a fixed schedule. You may want to see my blog post: Is the Udacity CS101 Course Watered Down?


There are many possible ways to introduce computer programming. For example, you could pick a commercially important programming language, e.g. C or Java, and organize a course around learning that language. Or you could pick or create a student language just for getting started with the concepts (Scratch, Logo, Basic, CUPL, ...) and teach that. I think Udacity CS101 chose wisely in picking a subset of Python. Python is a clean, powerful, multi-paradigm programming language that sees some real commercial use. Udacity CS101 doesn't cover every aspect of the language, but it does teach enough of it to give a good start at understanding the construction of computer programs. I believe it is much easier to get started in Python than in C or Java, and that Python can carry you a lot further than Scratch, Basic or CUPL.
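
To give a flavor of what I mean by "easier to get started", here's roughly the kind of thing a beginner writes within the first week or two: no compiler, no class boilerplate, just a few lines that run as typed. (This particular function is a made-up example of my own, not something lifted from the Udacity materials.)

    def count_words(text):
        # split on whitespace and count the pieces
        words = text.split()
        return len(words)

    print(count_words("Python makes it easy to get started"))   # prints 7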


Another approach to introducing computer programming is to pick projects that are particularly attractive to students. Robots and computer games are two oft-mentioned examples. The parts and tools needed for a robot-centered course call for a substantial investment ($10-$25 thousand per team is a working estimate I've seen on the web) and also require a comparatively large secure space to store the project, parts and tools. Real-time software that interacts with sensors and motors strikes me as more advanced material than a short introductory course should aim to cover. Games keep the work in the virtual world, so there's less need for parts and workshop space, but interactive software with graphics is again, in my opinion, more advanced material than a short introductory course should aim to cover. Fair enough to stir into an introductory course a bit of foreshadowing to hint at how the material can be extended to work in games or robots.


But I haven't any real experience teaching computer programming to students. I've been trying to launch such a course at the local community center here in the New Cassel section of North Hempstead, NY for well more than a year now, but it hasn't gotten off the ground yet. A shameless plug for yet another of my blog posts: Marketing the Importance of Programming Education. The comment thread on that article is, in my opinion, particularly worth reading.


A reminder: Blog sites such as this one are intended to be 2-way communication, not just reading material. Whether you are a student, teacher, programmer, experienced or inexperienced, you are invited to post comments about this article down below. Suggestions, counter-examples, criticism, pointers to other sites that said it better, and, of course, praise are all welcomed. Anything but spam.

Saturday, September 14, 2013

Bicycles for the mind... A Steve Jobs talk from long ago.

1980

In the Introduction talk to Udacity CS101, Professor Evans mentions Steve Jobs having compared computers to "bicycles for the mind".
I think I've now stumbled across the talk where Jobs more or less said that. Run time for the Jobs talk is about 20 minutes. The bicycle reference is around the 6 minute mark in the Jobs video. Evans attributed the quote to circa 1990, but the talk is said to be from 1980, and given Steve's youthful appearance in the video, I believe the 1980 date.


Speaking of small errors, one of the annoying things in Jobs' talk is that he speaks of meeting with some 4th and 5th graders, but then calls them 4-5 year-olds. I think 9-11 year-olds would be a lot closer to correct. 20 kids and 6 Apple computers? What a depressing student/machine ratio that would be these days!


Jobs mentions Visicalc in his talk as if it were something everybody in the audience knows of. But here we are some 30+ years later and I'm no longer sure that you all know what Visicalc was. Happily, Wikipedia remembers. Even if you remember Visicalc, I recommend that you visit the Wikipedia page. It has some wonderfully interesting links, including one that lets you download a free copy of Visicalc for your x86 Windows PC. That version was written for DOS 1.0, so it only works within the current directory. DOS 1.0 didn't have directories, just disk drives. Most folks back then didn't have disks with any more capacity than a 5.25" floppy disk. In 1980, that would have been maybe 140KB of storage space.


One other link to particularly take note of is the one that asks "What if Visicalc had been patented?". If you haven't been paying attention to the arguments about software patents and why they are not good for the economy, you really should Google up some background reading for yourself, maybe sit through a Richard Stallman talk or 2 about "intellectual property". Be forewarned that Stallman's talk runs about 2 hours, so take a bathroom break and get yourself a fresh mug of coffee, tea, or whatever before you fire it up.


If I'm going to mention Steve Jobs and Richard Stallman in the same blog post, it is probably appropriate for me to also point you to this short video where Stallman contrasts his own accomplishments with those of Jobs and of Bill Gates.

Time marches on, but progress?

Listening to Jobs' 1980 predictions for what the heck we'd do with even more computing power, I can't help but be disappointed with how little real progress we've made on that front. The computing power has, of course, materialized as predicted, and I suppose the graphical user interfaces of Windows, MacOS and web browsers are something of a usability improvement compared to DOS 1.0, but I was sending e-mail and posting netnews items aplenty back in 1980 and it isn't like that process is hugely different today. To keep things in perspective, the Macintosh computer was introduced in 1984. Here's an early 1984 video of Steve Jobs giving a timeline leading up to the Macintosh. It's only about 7 minutes long and includes the famous "1984" teaser ad for the Macintosh; an ad still worth watching, in my opinion. Here's a 10 minute video of Steve Jobs actually introducing and demonstrating the Macintosh.


Still, if you have a problem that you want to solve with a computer, are the barriers to solving it significantly lower today, or about the same despite the powerful GUI computers now available? If there's an existing product that fits your needs ("there's an app for that"), your path is easy, but if you really need custom software, perhaps a custom database, I expect you still have a rough road ahead. "Cobol is to Visicalc as Y is to Z", but what are Y and Z?


Comparing the Python language to the programming languages of 1980 (C, PL/I, Cobol, Fortran), I guess there's some evidence of our having learned to apply plentiful compute power to making the programming job a little easier, but there's still a steep hill to climb to bring computers to bear on your problem, whatever your problem might be. The Internet, the World Wide Web and search engines seem to be the most evident signs of progress in the computing world since 1980. I do wish the world had more progress to show on reducing the barriers to applying computers to solve problems, given the passage of 30+ years since that Jobs talk. Are there specific improvements in the computing world that I'm overlooking here and not giving proper credit to? Should smartphones get a mention, or are they just small battery-powered computers with scaled-down screens?


If I were better at HTML, maybe I could rig this article to provide background music as you read the previous paragraph. Or am I being too sentimental about the lack of technological progress?


If you are completely unfamiliar with Stallman's contributions to the notions of "free software", you might give a quick read of a past blog post of my own as a way to get started at understanding software licensing and Stallman's GPL in particular: See "Copied Code or Unfortunate Coincidence".

06/11/2014 - Updated: The Steve Jobs talk link went bad! Why didn't anyone tell me with a comment so I'd know? Anyhow, I found a link that works (today).

Tuesday, September 10, 2013

Control Engineering?

My Master's degree

My MSE is from U. of Michigan, Ann Arbor. As I previously explained in a STEM post, I was there in a program sponsored by Bell Labs called "One Year On Campus". Completing a Master of Science in Engineering degree within a maximum of 12 months was a requirement for my continued employment. The Labs imposed a few additional requirements on the courses I was to take: an information theory and communications course (modulation techniques, etc., etc.), a digital logic course, and I'm not sure what all else. My main interest was in studying computer science, but the University had a requirement that X% of your courses had to be from within the department of your major. For broad studies of computing, that was a problem, as the computing courses were spread throughout the University. The digital logic, machine architecture and assembly language courses were over in the Electrical Engineering department. The computer graphics courses were in the Industrial Engineering department. Operating systems, higher level languages and compilers were in the Computer Science department. Database courses were way over in the business school. So, the Engineering school created the "CICE - Computer, Information and Control Engineering" department to cross-list all of the above courses. By majoring in CICE, I was able to take a broad variety of computing courses and still meet that X%-within-my-major-department requirement.


So much to do and only a year to do it. I admit that the corner of the CICE curriculum where I didn't fit in many courses was control engineering. I had some exposure to it, but just enough to hum along if someone decided to sing about the subject. I couldn't confidently promise you a competent explanation of the topic if you asked me about it.

TED talk/demo of control engineering

So, I was excited today to find a most excellent TED talk demonstrating control engineering with a set of quadcopters. The talk has some very impressive demonstrations of software control of the copters. I could wish for a more technical explanation than they offered. I'm not at all sure how many processors of what kind they used to track and control the copters. I'm pretty sure there's more real-time computation going on there than would comfortably fit into one processor, but that's just my gut feeling and not based on any real experience.


And so, for your entertainment and to convey to you what the heck "control engineering" is about, I share with you Raffaello D'Andrea: The astounding athletic power of quadcopters from June 2013. Run time of the video is about 16 minutes.
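
And if you'd like a taste of what the control software is doing many times a second, here's a toy proportional-derivative (PD) altitude controller in Python. The gains, the time step and the one-dimensional physics are all invented for illustration; the real controllers on those quadcopters are vastly more sophisticated.

    def simulate(target=1.0, steps=500, dt=0.02, kp=8.0, kd=4.0):
        # Hold a toy quadcopter at a target altitude with a PD feedback law.
        z, vz = 0.0, 0.0                        # altitude (m), vertical speed (m/s)
        g = 9.81
        for _ in range(steps):
            error = target - z
            thrust = g + kp * error - kd * vz   # the controller's decision
            vz += (thrust - g) * dt             # toy physics: acceleration -> speed
            z += vz * dt                        # speed -> position
        return z

    print(simulate())                           # settles close to the 1.0 m target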

Want to know more?

I did some mild Google searching to try to find the source code for the control programs, which I assume are likely to be "open source". I didn't find exactly what I was looking for, but I did find some useful info. There's a Wikipedia article about quadcopters, and from there, there are plenty of links to additional information. I found this article about the lab where they test their stuff to be interesting. I was heartened to learn that they do have a safety net to catch falling copters while the software is still under development. The article does give me the distinct impression that they've developed custom hardware to run the control algorithms fast enough. That is a potential major obstacle for a casual hobbyist who wants to play. If the Wikipedia article doesn't give you enough leads to follow, there's an existing Pearltree specifically about quadcopters. I did find some open source (Arduino controller) quadcopter projects, e.g. this one and another one.


If you try your own Google searches, be careful about people in other fields of research with names similar to Raffaello D'Andrea. For example, I found a biology PhD candidate at U. of Michigan named Rafael D'Andrea - not the same guy. Oh, and I did find some Cornell U. work in the field of quadcopters too: Autonomous Quadcopter Docking System. Oh my, on page 11 of that report, the guy strapped his Android smartphone to the quadcopter to add a camera to the configuration. That must have taken guts! (I wonder if he has the "Send me to Heaven" game on his Android phone? Not me!)


If you want to get into the math of control of a quadcopter, this looks like a reasonable place to start reading: Estimation and Control for an Open-Source Quadcopter. Or just sit back and enjoy the demo video that I pointed to in the previous section. Not everyone has to be a DIY engineer with amazing toys.

Monday, September 9, 2013

The public does remember scandals, even in NYC?

I'm writing this the night before the NY primary elections. A couple of the current races have been troubling me. One is the Democrats' field competing to be on the ballot for mayor this November. The other is the Democrats' field competing to be the NYC Comptroller on the ballot this November. How could Anthony Weiner and Eliot Spitzer be regarded as serious candidates after they'd been thrown out of previous high offices for plenty good scandalous cause?

Background

In case you have been sleeping under a rock or perhaps simply aren't from the NY area, here's a quickie background on these candidates.


Spitzer, a married man, was a well regarded governor of the state of NY 2007-2008. But then it was discovered that he was paying huge sums for the services of high priced prostitutes. It was widely reported, his wife stood by her man, and then he resigned from office under fire, rather than risk getting dragged through impeachment proceedings.


Far as I can tell, Spitzer was never prosecuted nor convicted on any of the charges, but I've seen nothing to so much as imply that he is anything less than 100% guilty of exactly what he was accused of.


Weiner, also a married man, was a U.S. congressman representing New York's 9th congressional district from January 1999 to June 2011. But then it was discovered that he was sending photos of his private parts to various women. Amid all the hoopla, he resigned from office.


I'm not upset with these guys resigning from office, though I do wonder about how no one bothered to prosecute them for their misconduct. But the huge shock to me is that they would then have the nerve to throw their hat into the ring to run for public office again. My one comfort is that recent polling numbers show both Spitzer and Weiner trailing in their races, so maybe their comeback attempts are ill-fated after all. Goodness knows the late-night talk shows have been having great fun with those candidates, especially with Weiner.


So, not much longer to wait to see if these comeback attempts fail as clearly as they deserve to fail. From my point of view, neither of these guys should have embarrassed us by running for office again. Or am I just being an old fuddy-duddy?

Election Night Update 09/10/2013

Well, the polls for today's primary election have closed and results indicate that Weiner came in a distant 5th place in a field of 5 candidates, with about 5% of the vote. Bill de Blasio got 40%, just enough to win the Democratic ballot slot for NYC Mayor without needing to campaign for a run-off election in 3 weeks. (40% was the magic number to avoid a run-off between the top candidates in the primary voting.)


Spitzer also lost, with Stringer winning the race for the Democratic ballot slot for NYC Comptroller, but just barely. I was happy to see Spitzer lose, but would have been happier to see him handed a more decisive defeat. "Name recognition" is valuable when trying to get votes, though Weiner shows there are limits.


Perhaps it mattered that Weiner asserted his checkered past was behind him, but then the whole "Carlos Danger" brouhaha hit the fan in the middle of his comeback campaign. Maybe the difference was that there was no fresh scandal from Spitzer, just the old scandal that at least some folks have not yet forgotten.

Wednesday, September 4, 2013

SOLID software design...

My hopeless backlog

One of the bad habits I have is accumulating lists of things to read. Google Reader used to be a handy place to categorize and keep my "subscriptions" to blogs and other things to read. Unfortunately, Google pulled the plug on that service in July. Fortunately, another web service, theoldreader.com, provided a very similar free service. In fact, their service was designed to look like an earlier version of Google Reader. Some time ago, Google removed some "social" aspects of Google Reader and the change irked some of their users; theoldreader.com sprang up to undo the perceived damage of Google's changes. When Google announced the planned demise of the Google Reader service, it at least was kind enough to provide a mechanism for retrieving my list of subscriptions, so I opened a free account with theoldreader.com and submitted my subscription list. But so did a zillion other people, so it took literally weeks for the site to get around to processing my subscription list. In the fullness of time, they did get the subscriptions into their system and it seemed usable enough, but then their servers crumbled under the newly stepped-up load. (Well, I think the story is more like they made changes to beef up their servers and in making the changes someone tripped over a power cord or something.) More days and days of noticeable downtime followed before things got back on the air and seemingly stable again.

My point in mentioning Blog subscriptions is that the universe conspires to generate "interesting" blog articles faster than I manage to read them. "Another day older and deeper in debt...". My backlog of unread blog articles is quite hopeless, but whenever I have nothing much to do, instead of turning on the boob-tube in the living room, I fire up theoldreader.com in my browser and try to read up on what I've been missing.

If you've got your own solution to tracking new blog articles, I hope my blog here is on your subscription list. If not, please take a moment to add my blog to your list. Go ahead and do it now. I'll wait for you to get back. (Full disclosure: No one has ever told me of positive or negative experiences from subscribing to my blog with their favorite RSS tool. I'm assuming that blogspot.com does the right thing, but if you run into a snag, I'd sure like to hear about it.)

Emily's "Coding is Like Cooking" blog

One of the blogs that I do try to keep up with is from an expert on test-driven development (TDD) named Emily Bache, an Englishwoman who now lives and works in Sweden. Her blog is called "Coding is Like Cooking". I like it because it tends to be quite well written and covers relatively recent software development topics that I might otherwise miss out on, i.e. stuff that I didn't learn in school back in the days when Structured Programming was still somewhat controversial, and that I didn't pick up by osmosis in the later years of my employment at Bell Labs. I freely confess that test-driven development wasn't part of the quite informal methodology that "we" in Math Research were following. The mathematicians tended to find the underlying math of the problems far more interesting than the structure of the code. I also never had a great mentor for object-oriented programming. Java got some use in our projects, but I knew just enough to recognize bad use of the language when I saw it. So, now I'm retired and still have much to learn. The web has no shortage of material for me to learn from, and by reading Emily's articles and saying "Huh?" when something comes up that isn't at all familiar to me, I find lots of great stuff to learn.

One of my "Huh?" moments came from her mention of the "London School". I followed her link and picked up a book for my Amazon book Wish list: Growing Object-Oriented Software, Guided by Tests

SOLID Principles and TDD

So today I was reading an article of hers from September 2012, SOLID principles and TDD. I didn't get very far into it when my brain complained "Huh? What does she mean by SOLID?". So I opened up another window, thanks to a link she provided, and read the Wikipedia article on SOLID. It isn't a particularly excellent Wikipedia article. It is dense with off-putting terminology (which might be the fault of the SOLID acronym and not really the fault of the article), but I've got to forgive the Wikipedia article because it has some excellent links to reference material.
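
To make at least one letter of the acronym concrete, here's a tiny Python sketch of the "D", the Dependency Inversion Principle: the high-level report code depends on an abstract writer rather than on any one concrete output destination. The example is my own, not something from Emily's article or from the Wikipedia page.

    class Writer(object):
        def write(self, text):
            raise NotImplementedError

    class ConsoleWriter(Writer):
        def write(self, text):
            print(text)

    class FileWriter(Writer):
        def __init__(self, path):
            self.path = path
        def write(self, text):
            with open(self.path, 'a') as f:
                f.write(text + '\n')

    def produce_report(lines, writer):
        # High-level policy: knows only the Writer abstraction, so adding a new
        # output destination never requires changing this function.
        for line in lines:
            writer.write(line)

    produce_report(["total: 42"], ConsoleWriter())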

Uncle Bob's principles of Object Oriented Design

One of the links from Wikipedia that I followed was http://butunclebob.com/ArticleS.UncleBob.PrinciplesOfOod. Still more links to clarify the mysterious-to-me parts of that SOLID acronym. This summary article looks especially good: http://www.objectmentor.com/resources/articles/Principles_and_Patterns.pdf.

And just to make sure I don't run out of things to read, there's a book mentioned and even recommended by another reader - Agile Software Development, Principles, Patterns, and Practices - that I've added to my Amazon "Agile" book wish list to remind me I really need to take a look at it.

In closing...

So, my situation is quite hopeless if the goal is to finish reading the stuff on my list. If I'm reading anything interesting, it tends to add more things to my list. And that doesn't even count the time to actually try the tools and techniques that I'm reading about. I hope you found some of this material interesting enough to add to your own read-and-try list. I guess I have a bit of a sadistic streak to want to inflict my backlog on other folks.

Learned anything interesting lately?

Saturday, August 31, 2013

Out of the Box Thinking - 3D Printers in Space

It is fairly standard advice that if you want to come up with a new solution to a problem, you have to look at it differently. You need to think outside of the box of conventional solutions.


Background

For a while now, I'd been following the growing interest in 3D printers. If you're unfamiliar with them, see my Pearltree on the topic - a shared collection of links to web pages where you can read more about the topic. Short version: A 3D printer is a device that typically extrudes some material (e.g. melted plastic) under computerized control of the position of the nozzle to create a specified object on a platform in a box.


Prototype of a planned 3D-printing spiderbot for construction of buildings on earth. Image borrowed without formal permission from this site.


3D-Printers in Space

From time to time I've read mention of NASA being interested in using 3D printing in space to create replacement parts instead of having to wait for delivery. The FEDEX delivery truck doesn't come by the International Space Station very often. So, I thought, that's interesting. I wonder to what unintended extent commercial 3D printers depend on gravity. I haven't thought it through, but won't be surprised if it turns out some engineering revisions are needed before a 3D printer designed to work on earth will work properly in a zero-gravity environment. If nothing else, the absence of natural air convection, with hot air rising and cold air descending, will perhaps require the addition of forced air cooling to a 3D printer to be used in space (assuming the printer will reside in a pressurized area where there is air). Cooling times for materials extruded in the vacuum of the great outdoors of space are a whole other thing that needs to be thought through.


And then today I came across this article from the Verge web site, reporting on some plans by NASA to "3D print" large structures in space. Plenty more questions than answers at this point, but it is clear that I have been guilty of literally confining my thinking about 3D printing in space to inside the box. The job of the box of a 3D printer is mostly to support the print mechanism and to provide a frame of reference within which the printer head will move. But in space, with zero gravity, you don't need anything to hold up the printing mechanism. Perhaps a finely adjusted GPS mechanism and laser range-finder can be used to guide positioning of the extruder, and small thrusters can move the extruder from point to point. Once you've literally gotten out of the box, you can think about 3D printing really large objects, such as rigid beams to hold together gigantic solar arrays - objects far larger than you'd likely be able to consider launching into space preassembled from the ground.


Seems to me that the hard part is still going to be getting the raw materials (and thruster fuel) into space. Perhaps really large structures fabricated in space will depend upon the hypothetical space elevator to lift materials to orbit at relatively low cost. Of course, if you can lift relatively large things to orbit, an engineering alternative would be to lift prefabricated sections and then have construction robots assemble the pieces into the desired large rigid beam structures. On the other hand, if you can figure out how to 3D print the desired structures without needing material from earth, that might tip things in favor of space-based 3D printers, but I'm not at all sure how you'd go about fabricating anything from moon dust or the materials of asteroids.

Acknowledgement

Thanks and a tip of the hat to Nuno Cravino for sharing the link to the Verge article on the Google+ STEM Community. It is always good to see reminders from time to time of the importance of thinking outside the box.


While contemplating the illustration on The Verge article, I remembered there was an old episode of Star Trek where the Enterprise was trapped in a web-like structure. I'm pretty sure the web was made of energy beams, not solid extruded materials. Thanks to the Internet's excellent reference sites about Star Trek, all I needed was a Google search for:


    star trek enterprise caught in a web


to find that was Season 3 Episode 9, 1968's episode "The Tholian Web".


Image borrowed from Memory Alpha.org, the Star Trek Wiki.


imdb.com says there was an actual on-screen mention of the Tholian web in an episode of Futurama.

Wednesday, August 28, 2013

Learn How to Program Computers!

Say What?

News photo of an experimental self-driving Volvo from Engadget report.


Computer processors show up almost everywhere these days: desktops, laptops, cell phones, processors embedded in cars, in printers, in cameras, in dishwashers.


One thing that they all have in common is that someone (or some team of folks) had to develop a set of instructions to guide the computer to do whatever it is that it does. Those instructions are called programs, or software, and the stuff that goes into a program is called "code", not to be confused with "secret codes", though there is a connection if cryptography is what interests you.


So, no matter what, you are pretty much fated to be a user of computer programs, but with some effort on your part, you can learn to write your own computer programs too. Creating a computer program is called "software development". You can learn to do software development on your own. I'll caution up front that seriously big projects generally are tackled by teams, not individuals working solo. But the longest journey starts with but a single step.


My intent here is to convince you that you should embark on the long journey of learning to program computers, to do software development.

Motive

There are plenty of reasons why you should learn how to program computers. Before you begin on that path, you might want to think about what your motive is. Are you hoping to write your own computer games? Want to understand Cyberwarfare before you are noted as a threat by Skynet? Interested in autonomous robots? Self-driving cars? Want to develop a fancy web site of your own? Thinking it might be a useful skill when seeking a serious job? Want to understand how the NSA accidentally intercepted calls from Washington, DC, when they meant to intercept calls from Egypt?


I can't tell you what motivates you. I will caution that this is no short trip, so you should probably be sure you are well motivated before you dive in.


I'll also mention up front that the early days of getting started with developing your own software can be frustrating. As you master the basics, a self-driving car may seem awfully far away. Part of my hope here is that I can convince you that if you keep your motive in mind, you can work through the steep initial learning curve.

Age restrictions?

In my opinion, there's no upper bound on the age when you can learn to program computers. If you are past retirement age, that might alter your own list of motives for learning, but it is no reason to refuse to give it a try.


Can you be too young to learn to program? Well, programming does generally involve reading and writing, so if you haven't gotten proficient at those skills yet, you may find it hard to get into software development. But then, how is it that you are reading this article? The US apparently has regulations that strongly discourage web sites from registering information for children under age 13, so if you are under age 13, it is important that you discuss your plans with your parent or guardian and have permission from them to get involved in online courses.

Cost?

If you have access to a computer with a good Internet connection, you probably have all that you need to get started. If you don't, then how is it that you are reading this article?


Of course, if you have money to invest in this project of yours, there are things you might want to look into. For instance, books. You can get plenty of materials for free here on the Internet, but sometimes it can be useful to have a paper document that you can bookmark, dog-ear, highlight and annotate. Don't feel you have to invest in books up front, but if you can find a good local book store, you may find that there are useful things to be found by browsing. Your local library may also be useful, though in my experience, the local library tends to be woefully short of current technical books. The good news is that if you can find titles worth looking at, perhaps from web searches of places like amazon.com, then most likely your local library can arrange an inter-library loan so you can examine the book without having to buy it first.


Of course, library books, including books borrowed on inter-library loans, need to be treated politely, not dog-eared, highlighted and given marginal annotations. And they do have to be returned after a relatively short time. If you find a title that looks really well matched to your needs, that's where you just might blow your allowance on an order from an online book-seller, so you'll have a copy of your own. Amazon.com does have provisions for wish lists, sort of like bridal registries, so as you publicly grow your list of titles you hunger for, you'll at least make it easy for folks thinking about getting you a birthday or Christmas present.


Libraries can also be a great place for getting free public access to Internet-connected computers. You might find that there are administrative obstacles to installing software development tools on the library's PCs, or even filters that "protect" you from the educational materials. Don't let those barriers stop you. Talk to the librarian to find out who is in charge of those kinds of filters. Most likely, arrangements can be made for a good reason like yours.

Courses?

There are many computer programming courses available on the Internet for free. "Massive Open Online Courses", MOOC's, featuring different programming languages and different levels of material. Some aim to serve particularly younger students. You might look, for example, for courses that introduce the "Scratch" programming language.


I haven't taken a "Scratch" course myself, so I'm not going to single out a specific suggestion here. Just try a simple Google search for:


    scratch programming course


and let us know in a comment which course you picked and why, and how that went for you.


But if you feel you are ready to learn a somewhat more conventional programming language, one that will take you further than I believe Scratch will, my suggestion is CS101 from Udacity.com, where you will learn to program using the Python 2 programming language.


The Python programming language continues to evolve. There are Python 3 versions available today. The world is still catching up to that. You'll be fine starting with Python 2, and learning the differences later on to get over to the newer versions. It is important that you know that there are multiple versions and that when you are shopping for books or tools that you get a version that matches what the course is expecting you to have.
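
Just so the version numbers don't seem mysterious, the two differences you are most likely to bump into early are that print became a function and that integer division changed. Here's a tiny runnable illustration (the __future__ line makes Python 2 behave like Python 3 for these two features):

    from __future__ import print_function, division  # in Python 2, opts in to the Python 3 behavior

    print("hello")    # print is a function in Python 3 (parentheses required)
    print(7 / 2)      # 3.5 -- "true division"; plain Python 2 would print 3
    print(7 // 2)     # 3   -- floor division, which both versions agree on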


There are other courses available, though I haven't tried the others myself. Certainly there are courses that teach Python specifically for game development. And there are plenty of courses that teach other languages, Java and C, for example. But in my opinion, CS101 from Udacity.com is a good place to start. It is free and self-paced; you work on it on your own schedule. Nominally, it is an 8-week course. It has no prerequisite courses. I believe it is an excellent place to get started with MOOC's. There's a final exam at the end, and when you pass it, they will e-mail you a certificate to commemorate your accomplishment. There are reports on the web of real colleges that even give credit if you are a registered student and pass the Udacity.com CS101 course, but being a registered student at a physical college is outside the realm of stuff you can try for no charge.


If nothing else, trying a MOOC to get started will show you whether this is a field that holds your interest. I know software development has been a long-standing interest of my own, but I also know that some fraction of the students who try it find that they absolutely hate programming. If you find that's the case for you, my advice is that you tough it out to completion of the introductory course and then look for other fields that do hold your interest. It's only a couple of months to work through Udacity CS101, and it isn't anything like a full-time course load. Getting the certificate isn't a good motive to start the course, but perhaps it is a good motive to stick it out to the end.

Where to find more?

code.org is a web site that advocates that everyone should learn how to code. They offer 3 editions of a promotional video to promote interest in the field. One is a 1-minute teaser. Another is a 5-minute edition featured on their web site's front page. And if you have 10-minutes to spare, there is a full edition available.


There are links on the code.org site to various local places to learn to program. For example, there's a brief plug there for the "Yes We Can Community Center" here in Westbury, NY.


Details of the schedule are not yet nailed down for Fall 2013, but if you are local to here, one way to take on CS101 is to sign up with the Community Center. The Center has the computers and Internet access, and it will have other students, so you won't feel too much that you are on your own. And, for what it's worth, I'll be available to answer questions and help keep you motivated while you are working through the course online.


Not quite free, as there is a membership fee to sign up with the Community Center. But use of the basketball courts, game room and locker room and access to quiet study space for your homework time all come with that membership, so it's probably worth joining if this is your community. (Use of the fitness center is not included in basic membership. Sorry).


The schedule I've proposed is that Monday evenings we'd meet as a class to share discussion of progress and problems. Other school nights I'd be available to answer questions 7-9PM or by appointment.


In any case, you can try udacity.com CS101 on your own before we even get started and then continue at the Community center once we get our act together there. Please, do let them know at the front desk that you're interested in taking CS101 as seating is limited.

Free for Senior Citizens

If you are a resident of North Hempstead and are age 60 or more, you are eligible for free membership in Project Independence and get a free community center membership too when you join Project Independence. Such a deal!

Further reading

Benefits of Teaching Kids To Code That No One Is Talking About - This blog post by an online acquaintance of mine has an example of a Scratch program, and a link to a video of a talk by the creator of Scratch.


Is Udacity CS101 Watered Down - This is a blog post from me in December 2012, describing what you should expect to get out of the online Udacity CS101 course.


Where to Get Python - This is another blog post from me. This one describes how you can install Python on your own PC. Note that back when I wrote that, Udacity was still using Python 2.6, but the course has since updated its software to Python 2.7. From an end-user point of view, that's an almost imperceptible change.


There are numerous Youtube videos available about learning to program games. Here is Episode 1 of a series that is dozens of videos long. Part 1 shows off a couple of games the guy has written in Python and describes what prerequisite knowledge he expects you to have to get started with his tutorials.


This isn't the first time that I've written about plans for CS101 at the Community Center. For more details of my intended format for the weekly meetings, see my blog article: Marketing the Importance of Programming Education

Thursday, August 22, 2013

What's the fuss about parallel programming?


A young friend of mine, now a 2nd year computer engineering student, asked me:

What is parallel programming? Why is parallel programming regarded as important in the future?

I don't have any idea about parallel programming and try to learn  by Googling. Yet,  it is difficult to understand. Why mutable data has been considered as inefficient in programming recently? How it creates problem and in what way functional programming avoids this mess? Will functional programming increase the performance of multicore systems?
Also, to which OS books should I refer? As I am starting my study on my own, and I want to get good at OS and basically able to understand the difference between linux and windows, which book should I follow? Earlier, you said that you are interested in operating systems and also best at it. Please, just suggest me some books which would able to justify the differences between linux and windows technically.
In which language is OS programming done?

Image of multiple processors taken somewhat out of context with a thank-you to Tom's Hardware, a web site where people try to keep up with this stuff


This is my reply to that e-mail...
You ask "What is parallel programming?"   That's a very similar topic to another topic you recently asked about:   concurrent programming.   Both concern how to write programs that do more than one thing at once so that overall performance is improved.   e.g. if the time to run the program depends on "n" (perhaps n is the amount of input data to be processed), then what a parallel program wants to do is apply more than one processor to the problem so the job can be completed sooner than one processor would be able to.

For example, if the job is to sort n items, you might divide the list up into a separate list per processor so each processor need only sort a shorter list of items. Of course, before the job is finished, the multiple shorter lists need to be merged together to make the final result.

Distributing the items across the processors is work, and merging the lists back together again is more work. Whether the overhead of those extra steps is worth it or not depends on things like how much memory each of the processors has good access to. If the divided lists are small enough to fit in the RAM of each processor, then things are probably going to go very fast. But if the sub-problems are still big enough that you need to spill things out to intermediate work files, and if the extra processors don't have good access to the disk space used to store the spill files, then the dividing up of things might turn out to be a net loss in performance.
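
Here's a minimal sketch of that divide/sort/merge idea using Python's multiprocessing module. The number of processes, the chunk size and the use of heapq.merge are my own choices for illustration; whether this actually beats a single call to sorted() on your machine depends on exactly the overheads described above.

    import heapq
    import random
    from multiprocessing import Pool

    def sort_chunk(chunk):
        return sorted(chunk)          # each worker sorts its own shorter list

    if __name__ == '__main__':
        items = [random.randint(0, 1000000) for _ in range(1000000)]

        nproc = 4
        size = (len(items) + nproc - 1) // nproc
        chunks = [items[i:i + size] for i in range(0, len(items), size)]

        pool = Pool(nproc)
        sorted_chunks = pool.map(sort_chunk, chunks)   # distribute the work
        pool.close()
        pool.join()

        result = list(heapq.merge(*sorted_chunks))     # merge the sorted pieces
        assert result == sorted(items)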

http://en.wikipedia.org/wiki/Parallel_programming_model

Moore's Law

You also ask "Why is parallel programming regarded as important for the future?".   Well, if you go way back to the early days of integrated circuits, Gordon Moore predicted in 1965 that the number of transistors on an integrated circuit would double every 2 years.   He thought that observation would hold true for 10 more years or so.   We actually have gotten through a lot more doublings than that and aren't done yet (though folks are starting to fret that they can see ultimate limits ahead - so it won't go on forever).
His prediction was more and more transistors, and it isn't entirely obvious that that translates into faster computers. But, in fact, what folks have done with those transistors is figure out ways to apply them to make faster computers. If you look back to the earliest IBM PC's, the processor chip didn't even do floating point arithmetic. If you needed faster floating point, you had to add a math co-processor onto the motherboard (there was a socket for that additional chip).

I confess to liking that idea of having separate useful pieces that you can custom integrate to create a tailored computer with exactly the strengths that you want. Alas, the expense of having multiple chips connected together at the circuit board level argues powerfully against that piece-part model of the chip business. The trend instead has been to absorb more and more functionality into a single chip - whole systems on a chip - just to be rid of the sockets and pins and propagation delays of getting off-chip and on-chip and back again.

So where did all the transistors get spent to speed things up? Some of it is obvious. Computers today have amounts of memory that were unthinkable just a few years ago. Along with more memory, you certainly have more cache and more layers of cache to speed up access to that memory. There's much to be learned in contemplating why there are more layers of cache instead of just a bigger cache. But that's a more hardware-centric topic than I'm comfortable explaining here as a software guy.

Besides more memory and more registers, the paths and registers have gotten wider. Where there were 8 bits in the beginning, there are often 64 bits today. You can try cranking that up in the future to 128 bits, but at some point you get into diminishing returns. Slinging around 128-bit pointers in a program that could be happy dealing with only 32-bit pointers may not be optimal. Maybe the problem is just that we need a little more time for programs to comfortably exploit gigantic memory spaces. My PC today only has 2GB of real RAM. 32 bits is more than enough to directly address that much memory; 2^32 in fact is enough to directly address 4GB of RAM, so the line of needing more than 32 bits isn't super far away. But 64 bits is enough to directly address 16 exabytes of RAM. I can't even afford a terabyte of RAM yet, so needing more than 64 bits is surely a long way away. (1 terabyte = 1024 gigabytes, 1 petabyte = 1024 terabytes, and 1 exabyte = 1024 petabytes.)
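
The arithmetic is easy to check, since Python's integers never overflow:

    print(2 ** 32)              # 4294967296 bytes, i.e. 4 GB directly addressable
    print(2 ** 64)              # 18446744073709551616 bytes
    print(2 ** 64 // 2 ** 60)   # 16, i.e. 64-bit addresses cover 16 exabytes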

http://highscalability.com/blog/2012/9/11/how-big-is-a-petabyte-exabyte-zettabyte-or-a-yottabyte.html

Those are really big numbers.   Bigger than even Doc Brown is likely ready to contemplate:

http://www.youtube.com/watch?v=I5cYgRnfFDA

But it isn't always obvious how best to spend the many transistors that the progress predicted by Moore has provided to us. I see a certain amount of oscillation in design approaches as things get wide and then get back to serial again. Look at ATA vs. SATA, for example.

http://en.wikipedia.org/wiki/Serial_ATA

One way to spend transistors is on more complex circuitry to make the time for each instruction shorter - do faster multiplication or division - but there's only so far that you can push things in that direction. Current consensus seems to be that making faster and faster processors is getting to be very difficult. As clock speeds go up, the chip's thirst for electrical power goes up too, and with that the amount of heat that has to be taken away from the chip to avoid reducing it to a puddle or a puff of smoke. So, the industry's current direction is toward spending the transistors on having more processors with moderate speed per processor. The aggregate instruction rate of such an array of processors multiplies out to nice high numbers of instructions per second, but the challenge is how to effectively apply all those processors to solve a problem faster than an older uniprocessor computer would be able to. Hence the anticipated growing importance of parallel computing in the future.

I think so far I've answered the questions in your subject line. I hope you have the patience for me to try answering the questions in the body of your mail too.

A Day at the Races

I see your next question is "Why the fuss about mutable data?" Well, as I understand it, the concern is that if your data is mutable, you need to worry about inter-processor synchronization and locking so that when a processor updates a stored value, it doesn't interfere with some other processor.
The processing of read-only (immutable) data doesn't have to worry about locking and synchronization. But consider something as simple as A=A+1, where A is a mutable value. Underneath it all, your processor needs to figure out where the value of A is stored, fetch the value into some arithmetic register, add 1 to the value and store the value back into the location for A. If A is accessible only to your one processor, there's little to sweat about, but if A is accessible to multiple processors there's a potential for a race. What if both processors have fetched the value of A and both have incremented their copy? Only one of them has the right answer. If they both store their new values for A back to the shared location, the final result is one less than it ought to be.
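
If you'd like to see that lost-update race with your own eyes, here's a small demonstration in Python using threads in place of separate processors. Splitting the fetch and the store across two statements widens the window for the race; the thread and iteration counts are arbitrary, and exactly how many updates get lost will vary from run to run. The lock in the safe version plays the role that hardware or OS support plays on a real multiprocessor.

    import threading

    counter = 0
    lock = threading.Lock()

    def unsafe_add(n):
        global counter
        for _ in range(n):
            value = counter          # fetch...
            counter = value + 1      # ...store; another thread may have run in between

    def safe_add(n):
        global counter
        for _ in range(n):
            with lock:               # the lock makes fetch-add-store indivisible
                counter += 1

    def run(worker, n=100000, nthreads=4):
        global counter
        counter = 0
        threads = [threading.Thread(target=worker, args=(n,)) for _ in range(nthreads)]
        for t in threads: t.start()
        for t in threads: t.join()
        return counter

    print(run(unsafe_add))   # usually comes up short of 400000; how short varies per run
    print(run(safe_add))     # always exactly 400000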

One solution is to have specialized hardware that makes the A=A+1 operation be atomic, indivisible, so there's no chance of one processor seeing the old value when it should be using a new value.

There's the challenge of figuring out exactly which atomic instructions are the most useful additions to your instruction set design. IBM mainframes had an interesting, though complicated, instruction called compare-and-swap. As I remember it, the instruction took 2 registers and a memory location. If the first register matched the value in the memory location, then the 2nd register would be stored into the memory location. If they didn't match, then the memory location would be loaded into the 1st register. And the whole operation was indivisible, so a processor could do it without having to worry about whether some other processor was operating on the same memory location. So, you could use compare-and-swap to do our A=A+1 operation safely. You fetch the value of A into a register. You copy that register to a 2nd register. Add 1 to the 2nd register. Now do a compare-and-swap to store the result back to memory. If the compare-and-swap sets the condition code that says the 1st register didn't match, then sorry, but you have to repeat your computation: copy the newer value from the first register to the 2nd register, add 1 to it, and try the compare-and-swap again. Of course, if there are many processors in hot contention for the value of A, then you might have to spin for a while in that loop trying to compute the right value and get it back before it becomes stale.
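
Here's a toy model of that retry loop in Python. Real compare-and-swap is a single indivisible hardware instruction; below, a threading.Lock merely stands in for the hardware's atomicity so the shape of the fetch / attempt / retry pattern is visible. All the names are mine.

    import threading

    _guard = threading.Lock()
    memory = {'A': 0}

    def compare_and_swap(location, expected, new):
        # Pretend-atomic: store `new` only if the current value still equals `expected`.
        with _guard:
            if memory[location] == expected:
                memory[location] = new
                return True
            return False

    def add_one(location):
        while True:
            old = memory[location]                     # fetch the current value
            if compare_and_swap(location, old, old + 1):
                return                                 # our update won the race
            # somebody else changed it first; loop around and try again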

The compare-and-swap instruction can be used for more than A=A+1 kinds of computations. For instance, consider a linked list of items, perhaps the runnable thread list in your operating system kernel. You want to be able to remove an item from that list. That involves fetching the link to the item you want to remove (B), fetching B's link to the item after it (C), and then storing the link to C into the location that used to hold the link to B:

    A  ----> B ----> C becomes A ----> C

As with the A=A+1 case, there's the potential for a race if there are multiple processors contending to pick B off the list. Compare-and-swap can at least make it safe from races, but again, if there is hot contention among many processors, there can be much wasted spinning before a processor succeeds in grabbing B off the list.

So, if you have careful control at the machine instruction level, the problem is practically solved. But that sort of implies that you drop down into assembler language from time to time, or that you have a compiler that generates incredibly clever object code that knows where to use these specialized multi-processing instructions. What if you are using a garbage-collected language like Java or Python? Maybe your problem is worse than the value of A simply becoming stale between your fetch and your store back to memory. Maybe the location of A has changed entirely and your store operation is smashing something entirely different from the variable A. Big trouble ahead... In fact, if you think in terms of Python, maybe by the time you are trying to store the new value, A isn't even an integer any more. "Gee, it was an integer value when I fetched it. Who the heck changed it to be a floating point number in the meanwhile?" It could be subtler: Python will happily and silently promote an int to a long if the value gets too big to fit into an int, so you need to be very careful that the value you fetched still makes sense before you store the result back to memory.
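
That int-to-long promotion is easy to see in a Python 2 interpreter session (Python 3 later unified the two into a single int type):

    >>> import sys
    >>> type(sys.maxint)
    <type 'int'>
    >>> type(sys.maxint + 1)
    <type 'long'>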

The article I pointed you to the other day, "Downfall of Imperative Programming", asserts that "Imperative programs will always be vulnerable to race conditions because they have mutable variables". So functional programming languages, by avoiding mutable variables, dodge a major bullet in the multiprocessing world. The thing that I don't know is whether I could be productive enough in a functional programming language for Haskell to be worth the trouble to learn. The Downfall article predicts that race conditions are an insoluble problem for imperative programming language implementations. I'll happily accept that there's trouble ahead to watch out for, but I do have a bit of difficulty accepting that the races absolutely can't be resolved.

Python's Global Interpreter Lock

Python worries about the possibility of races among threads interpreting Python code. The interpreter has a "Global Interpreter Lock" (GIL) to assure that one interpreter thread won't change a value in use by another interpreter thread. Folks worry that this coarse level of locking will keep Python programs from being able to scale up with increasing numbers of processors.
I've seen some clever dodges of the GIL in Python programs, mainly by spreading the program across separate address spaces (multiple Python interpreters, each with their own GIL) and limiting interprocess interaction to some carefully controlled set of places in the code with appropriate locking protections. On the one hand, this doesn't give transparent scaling from a uniprocessor up to M processors all running in parallel, but on the other hand, it does get the job done.
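
Here's a bare-bones sketch of that dodge: each worker is a separate Python process with its own interpreter and its own GIL, and the only shared mutable thing is a single multiprocessing.Value that gets updated under its own lock. The work being done is trivial on purpose; it's the structure I'm pointing at.

    from multiprocessing import Process, Value

    def work(total, n):
        subtotal = 0
        for i in range(n):
            subtotal += i            # private to this process: no locking needed
        with total.get_lock():       # the one carefully controlled shared update
            total.value += subtotal

    if __name__ == '__main__':
        total = Value('l', 0)        # an integer shared across the processes
        workers = [Process(target=work, args=(total, 10000)) for _ in range(4)]
        for w in workers: w.start()
        for w in workers: w.join()
        print(total.value)           # 4 * sum(range(10000)) = 199980000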

My (weak) excuse for not having more first hand experience with this...

My home PC doesn't bring multiprocessors to the party. Some day I hope to replace it with an i5-ish based computer with 64-bit addressing and >4GB of memory. As a retiree with a rather modest pension, that's a discretionary expense that I've been postponing into the future. Maybe in the meanwhile my target will shift to something with way more processors than an i5. What I have in mind is something with enough oomph to be able to run Linux and Windows both in virtual machines (based on Xen, VMWare, something else? I don't know...). Heck, Microsoft isn't even making it easy to buy such a configuration without paying for a Windows license twice (once bundled into the PC's base price and then again for an installable copy that can be installed into a VM). I'm assuming that a re-install CD that wants to reload Windows onto a bare PC isn't going to be able to install into a VM environment. I'm expecting that multi-processor race conditions and their associated problems will come along naturally to my world once I have a rich enough configuration, and that encountering those problems on more than just paper will motivate me into doing something about them.
Maybe I'm just old-fashioned in thinking that what I need is a richer computing environment here at home. Maybe the right thing to do is to venture out into things like Amazon's cloud computing service and see what kind of trouble I can get into using other people's multi-processors via the Internet. One of my worries about that is that the underlying MP nature of their cloud services may be too deeply wrapped for me to really see the problems I'd be up against from MP. And, "look, dear, here's the marvelous new computer I just bought" is a much easier conversation to anticipate having with my wife than "Just let me pay this bill for cloud services. It isn't so much money and I did really learn from having tried their services."

Comparative Operating Systems

You ask me to recommend an OS book to better understand Windows vs. Linux. I don't know which book is the right choice. Certainly an Amazon or Google search will turn up a large number of candidate titles. Perhaps your school's library has some of those titles so you can look them over, or perhaps they can arrange inter-library loans for you to be able to examine some of the candidates. "Which of these is best" is always a tricky question because the answer depends so much on your particular criteria for "best".
So let me turn this around and ask you for a summary of your findings from digging into the too-long list of candidate titles, and for your recommendation. You might want to ask your question of your school's professor for the OS classes too. Maybe he's got a better-formed opinion on this topic than I have.

Linux Weekly News

Meanwhile, I stand by my suggestion that you should make an effort to keep up with lwn.net (free of charge at the price of having to lag a week behind the most current articles) to see what is going on in the Linux world. Don't feel obligated to have the newest and most experimental kernel on your home PC. But if you spend some time watching the evolution and planning of kernels, you'll have a better idea of Linux's strengths and weaknesses and what "they" are doing about the weaknesses. Unlike Windows, if you are sufficiently motivated to want Linux to be different than it is today, you can make that happen.

Kernel programming languages?

What programming languages show up in OS programming? Well, at this time, I expect the correct answer to that is C. Other languages (e.g. Java and Python) do show up in supporting roles, but generally don't make it into kernel code. Even C++ tends to need too demanding an environment to be a good candidate for kernel code. Maybe as time goes on the kernel will sprout suitable layers of capability to make higher level languages more attractive for implementing functionality within the kernel, but right now, if someone tells you a kernel is written in C++, ask them more questions to confirm that. It wasn't all that long ago that the likely choice for programming an OS kernel was surely assembler language. Unix introduced the C language and the then radical idea of using a higher level language in the kernel, and even having kernel code that is somewhat portable across computing system architectures. (To calm the historians in the audience, I'll concede here that I may be under-crediting the Multics operating system, portions of which were written in PL/I. And the Multics site gives credit to Burroughs for having done a kernel in Algol, but that's way before even my time.)
Stack overflow article on the languages of the Android OS:

http://stackoverflow.com/questions/12544360/on-what-programming-language-is-android-os-and-its-kernel-written

Stack overflow article on the languages of MacOS, Windows and Linux:

http://stackoverflow.com/questions/580292/what-languages-are-windows-mac-os-x-and-linux-written-in

Not every answer on Stack Overflow can be trusted to be correct....

One sub-link of that article that I followed and that does look interesting and credible:

http://www.lextrait.com/vincent/implementations.html

lwn.net article on what's new in the Linux 3.11 kernel expected to become available in September 2013...
http://lwn.net/Articles/558940/
This is a particularly interesting link from one of the many comments on that lwn.net article about 3.11:
http://www.softpanorama.org/People/Torvalds/Finland_period/xenix_microsoft_shortlived_love_affair_with_unix.shtml

In Closing...

You quote me as saying that I'm best at operating systems. I tried rummaging through old mail to you to put that statement in context, but didn't succeed in tracking down what I said. I will concede that I'm especially interested in operating systems, and that, given a list of computer science topics, I'm probably more interested in operating systems than in most of the others, but claiming I'm "best" at operating systems surely needs some context.


I confess that except for command line pipelines, I've never actually written a multi-threaded program of my own. So don't assume more expertise here than I actually have.

Thursday, August 8, 2013

Software Development isn't a Field for Loners

A young friend of mine, a first-year student studying to become a computer engineer, recently sent me a copy of his school's 3rd-year curriculum, asking for comments. Now, on the one hand, I'm no curriculum expert, not having really looked at a college catalog for computer science in many a year. But, on the other hand, it's a rare topic that leaves me without an opinion when I'm asked to look at something.

On the plus side, the curriculum was packed full of technical courses. That is quite different from my own experience at Cornell, where every semester in the engineering school had room for at least one elective. My recollection is that Cornell's catalog requirements actually demanded that your courses not all be in "your" school. That requirement was sometimes annoying to comply with, but in the long run I do think the University succeeded in exposing me to more diverse backgrounds by forcing me to get out and about beyond the engineering school. As far as I can tell, at my friend's school everyone in his major is expected to work through the same heavy technical course load in their 3rd year.

In my opinion, the weakest aspect of the curriculum I was looking at today was the lack of team or group projects. Why do I think that is so important? Well, software development isn't really a field for loners. Programmers may get mocked at social events for not fitting in, but software development is most definitely a team activity. Where do we go wrong at social events? I think it is just a tendency to be so darn focused on the challenging problems at hand at work that we don't adequately notice that the techy stuff is of no interest at all to the Muggles at the party.

Contemporary programming methodologies (e.g. Agile, Scrum, Extreme Programming) pretty much mandate that you be able to work with and communicate with other people. Now I concede that it is remarkably difficult to build "collaboration" into an engineering curriculum. First the programming language has to be taught, then algorithms, then the rudiments of collaboration tools (e.g. git, code review tools, ...) before you can even think about springing a team effort onto some subset of a class. Getting students to actually work together, especially at a non-residential college, can be difficult. To be interesting, the project has to be big enough that splitting the work across the team makes sense, but it still has to fit into a semester of mortal effort. Figuring out how to grade such a course fairly, when some of the team may have done more than the rest, is doubtless a hard problem, one that I have no real solution for here.

I do remember some group projects from my undergraduate days. In our OS class, the assignment was to create an OS. My recollection is that we had some reasonably bright people on our team, but the semester ended without the OS really gelling into a working whole. We had managed to run into some really interesting problems in our software and we passed the course, perhaps on the strength of our war stories that showed we'd indeed really tried to think it through.

And not quite a group project, but scarily close to the real world: I remember a class in file processing. We'd been given an assignment and were told to bring in the runnable decks of punch cards as part of what we had to turn in. The professor collected the assignments and then gave them back out again, but you didn't get back your own program - you got someone else's. The next assignment was to modify that someone else's program to add a new output report to it. I'm not sure which was more painful: modifying the crappy program I'd been handed without taking the time to rewrite it entirely, or looking at how someone else had modified my program to add on the new report routine. Most definitely one of the more educational software assignments of the many I worked through as an undergrad.

So that's the problem as I see it - lack of collaborative software development in engineering schools. Has anyone got real world examples of a curriculum that in fact teaches collaborative software development at an undergraduate level? If you've got an example or counter-example, please add a comment to this article to tell us about it.

I tried to design a collaborative software development course to suggest for a local community college here. In the end, I decided that wasn't going to work. A junior college program nominally has to fit into a mere 4 semesters, and they are barely able to introduce C in their curriculum. I believe you'd want a much higher-level language, like Python, as the basis for any large-scale group project. Python libraries and modules, object-oriented programming patterns, ... Too much to cover before you could even start to talk about working as a team on a larger project. I'd have no objection to swapping out their C course for one that introduces Python instead, but I think that'd be a tough political battle, with the opposing forces arguing that C is a much more commercially important programming language than Python.

So I dropped back and punted: the 4-semester problem was fitting in programming, software development techniques and collaboration. In my opinion, the crucial new material is the collaboration, so I've been sketching out a course on how to be a member of a work team. I believe it wouldn't even be specific to the computer science department, but could be offered within the school's communication department. My organized collection of links to web pages I dug into while working on the idea of such a course is available on Pearltrees.com: rdrewd collaboration pearltree. Ironically, I couldn't find a collaborator to work on the project with me, so it has somewhat fallen by the wayside. If you'd be interested in a possibly long-distance collaboration on the design of such a course, please contact me. My email: Drew's mailbox (no spam, please).

Sunday, July 28, 2013

Pythonic Python - Writing Python code that fits the language's idioms.

A recent posting by +Luke Plant to the Google+ Python community about the importance of PEP-8 reminded me that for quite a while now I've been thinking I should compose a blog posting about idiomatic Python code - and that's what led to this article.

Now I know this is a very small candle trying to light a very large dark space, but so it goes. My own Python experience has so far been limited to small projects and class-work, no larger scale production quality team efforts. The good news is that that limitation in my own experience means there's plenty of room for you to chime in with opinions of your own on this matter. I write posts to my blog in hopes of sparking up some conversations. Do feel free to speak up!

Some Orientation if you're New Around Python

If you are new around Python and aren't entirely clear on what it means when we talk about "the Python community", I offer you this orientation section lest you be confused by technical details in my words here. If any of it remains unclear, do speak up about that and I'll try to expand the material to cover more.

A "PEP" is a "Python Enhancement proposal". There's a web site where all the PEP's are published and shared. PEP 0 is an index to the PEP's. PEP 1 is a guide to how to write and submit a PEP.

Most PEPs are proposals to enhance the Python programming language in some way.
For example, PEP 435 proposes adding an enum data type to the Python standard library. It's a long-discussed topic; PEP 354, for example, proposed something similar but was rejected in 2005. PEP 435 has been accepted and is scheduled to be implemented and released as a feature of Python 3.4.
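To make that concrete, here's a minimal sketch of the enum type that PEP 435 adds (it lands in the standard library's enum module in Python 3.4; as I understand it, a backport is also available on PyPI for older interpreters):

    from enum import Enum   # standard library as of Python 3.4 (PEP 435)

    class Color(Enum):
        RED = 1
        GREEN = 2
        BLUE = 3

    print(Color.RED)        # prints: Color.RED
    print(Color.RED.name)   # prints: RED
    print(Color.RED.value)  # prints: 1
    print(Color(2))         # look a member up by value: Color.GREEN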

Other "PEP's" are merely informational, not really enhancement proposals at all.
For example, PEP 429 is the nominal schedule and plans for the Python 3.4 release. So, if PEP's aren't always "Enhancement proposals", what are they? PEP's are public records of consensus opinions of the Python community. Not every voice counts the same. Benevolent Dictator for Life (BDFL) and Creator of the Python language in the first place, Guido Van Rossum, has extraordinary influence on the fate of a PEP. Fortunately, he generally shows good judgement in steering the language.

Among the informational PEPs, PEP-8 is a style guide for Python code. If you don't want your code to look strange to the eyes of experienced Python programmers, you should try to comply with the guidance of PEP-8. To help you do that, there's a PEP-8 checking program. Actually, if you do a Google search, you'll find there's more than one such PEP-8 checker available. You should have at least one of these programs in your toolbox, make a habit of running it against your code, and spend the time to tidy things up to make the checker happy.

It's not that you can't ever bend the rules, but you'd darn well better have an excellent reason for any deviation from the style guidelines that you decide not to fix.
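To make that concrete, here's a small made-up before-and-after. The spacing problems in the first version are the sort of thing a checker program flags automatically; the lower_case_with_underscores names in the second come straight from PEP-8's naming guidance:

    # Before: odd spacing and camelCase names that PEP-8 frowns on.
    def CalcTotal( prices,taxRate ):
        return sum( prices )*(1+taxRate)

    # After: descriptive lower_case names, spaces after commas,
    # spaces around operators, no padding just inside the parentheses.
    def calc_total(prices, tax_rate):
        return sum(prices) * (1 + tax_rate)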

Another informational PEP that you should read is PEP-20, the Zen of Python. Unlike PEP-8, the guidance of PEP-20 isn't something so simple that a straightforward program can look at your code and say whether or not you were thinking like a Python programmer when you wrote it. PEP-20 does try to get your head into the right way of looking at the things that you wish to program in Python.
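By the way, the interpreter will recite PEP-20 for you; it is built in as a little easter egg:

    # Importing the "this" module prints the Zen of Python to the console.
    import this
    # "Beautiful is better than ugly.
    #  Explicit is better than implicit.
    #  Simple is better than complex. ..." and so on through the aphorisms.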

Beyond Style

There's more to writing "Pythonic" Python code than following PEP-8's style guidelines and letting PEP-20 shape your thinking. For example, given a collection of things (a "list" being the most typical way of setting that up in a Python program), if your reflex for processing that list is to think "DO I=1 TO N"... then you are probably still under the influence of Fortran or C or some other such older programming language. A more Pythonic way is "FOR ITEM IN LIST:"... Now there are lots of special situations that might drive you to explicitly step an index through a list, even in Python, but don't do it as a simple automatic habit. Ned Batchelder has written an excellent tutorial on how to "loop like a native".
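Here's a small sketch of the contrast, with made-up data:

    names = ["Alice", "Bob", "Carol"]

    # The Fortran/C reflex: march an index variable over the list.
    for i in range(len(names)):
        print(names[i])

    # The Pythonic version: iterate over the items directly.
    for name in names:
        print(name)

    # If you genuinely need the index too, enumerate() hands you both.
    for i, name in enumerate(names):
        print(i, name)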

One other area where I've found I have to fight against habits formed in older programming languages is in conceiving of the return values of a function. In some old languages, a return value was limited to something simple, and to exactly one thing. But Python is entirely dynamic, so you can feel free to inexpensively return elaborate data structures, or even multiple values at a time (as a tuple). It isn't inordinately expensive, because the implementation doesn't copy the values around; it just passes references to them. The language's runtime garbage collector takes care of reclaiming the storage when you no longer need the fancy value that you constructed.
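For instance, a function can hand back several results at once as a tuple, and the caller can unpack them directly. This little example is made up for illustration:

    def min_max_mean(numbers):
        """Return three results at once, packaged as a tuple."""
        mean = float(sum(numbers)) / len(numbers)
        return min(numbers), max(numbers), mean

    low, high, mean = min_max_mean([3, 1, 4, 1, 5, 9])
    print(low, high, mean)    # prints: 1 9 3.8333...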

Additional Tools

It is important to understand that Python does very little error checking at "compile" time. Your code will pass without complaint from the Python language processor even if your program has grossly misspelled variable names or calls non-existent functions. You won't hear about the errors in a given line of code until you actually try to execute that line. This makes testing your code incredibly important. You should definitely look into Python's unit-test tools so you can keep test cases alongside your code and be well prepared to routinely re-test after you make revisions. Related to testing, you might also benefit from coverage tools that report which portions of your code remain unexercised; that may guide you into beefing up your unit test cases. Sadly, even 100% test coverage of all the lines of code still can't guarantee that your code has no undiscovered bugs lurking in it. But if you haven't even exercised all the lines of code, it is easy to anticipate that there may be easy-to-find errors lying in wait to spring out at you at some inopportune time.
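The standard library ships a unittest module for exactly this. Here's a minimal sketch; the add function is just a stand-in for whatever you're really testing:

    import unittest

    def add(a, b):
        return a + b

    class TestAdd(unittest.TestCase):
        def test_small_numbers(self):
            self.assertEqual(add(2, 3), 5)

        def test_negative_numbers(self):
            self.assertEqual(add(-2, -3), -5)

    if __name__ == "__main__":
        unittest.main()   # running this file executes both tests and reports results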

There's no real substitute for good judgement, and as the old adage explains, good judgement is something you learn from experience, and experience is often something you gain by applying bad judgement. Another tool that may draw your attention to places in need of better judgement is sloccount, which tells you how many lines of non-blank, non-comment source code you've written (and, as a bonus, estimates how much effort that much code would nominally take to develop). For per-function complexity metrics, look at tools such as radon or the mccabe checker. A function that is oversized, or that scores as exceptionally complex, deserves to be reconsidered. More often than not, those are the kinds of functions where your undiscovered bugs are lurking.

There are multiple static checking programs that try to find the more obvious kinds of problems for you: pylint, pyflakes, and pychecker are three names to start a Google search with. Some of the suggestions from these tools can be very annoying - if I need a short-lived integer variable for local use, why shouldn't I name it "i"? But the checkers are generally tunable, so you can tailor the rules to your taste. Don't just get annoyed with what the checker tells you. Look at what it thinks it sees, and consider whether you could do better and make it happy while you are at it.
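Here's a deliberately flawed little snippet (the names are made up) to show the sort of thing the checkers report without ever running the code. Python happily accepts this file as-is; the mistake only surfaces when greet() is actually called, but a static checker flags it immediately:

    import os   # a checker such as pyflakes will flag this as an unused import

    def greet(name):
        message = "Hello, " + name
        print(mesage)   # misspelled variable: flagged as an undefined name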

Data Types.

One other aspect of the Python language that you should pay attention to is its richness in data types: strings, lists, dictionaries, sets, tuples... and those are just the "collections" of values. If you find yourself frequently searching through a list, stop and think about whether a list is really the right choice. Maybe you really ought to use a dictionary or a set to make it simpler (and faster) to check whether a given value is in the collection.
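Here's a small sketch of the kind of switch I have in mind (the data is made up). A membership test against a list scans the elements one by one, while sets and dictionaries are built around hashing, so "in" is a fast lookup:

    allowed_users = ["alice", "bob", "carol"]     # a list: 'in' scans every element
    if "carol" in allowed_users:
        print("welcome")

    allowed_users = {"alice", "bob", "carol"}     # a set: 'in' is a hash lookup
    if "carol" in allowed_users:
        print("welcome")

    # A dictionary works the same way when you also need a value per key.
    user_emails = {"alice": "alice@example.com", "bob": "bob@example.com"}
    if "bob" in user_emails:
        print(user_emails["bob"])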

Modules and Name Spaces

If your Python programs all tend to live in exactly one file each, then odds are you aren't making use of Python's modules and namespace capabilities to separate your code into manageable-sized pieces. Ultimately, that will limit your ability to tackle larger projects, such as ones built by multi-person programming teams. Keep an eye out for possible reusable modules that are worth separating from the specific problem at hand, so you can use the same code elsewhere in the future. Python is quite liberal in its handling of types ("duck typing"). You can exploit that to make your code quite flexible about what data it is willing to deal with. Take the time to carefully document the requirements for the data that your module can handle. E.g., maybe you had in mind that it would handle "employees", but perhaps it could be equally happy with any kind of object that has a mailing address as part of it ("customers", for instance).
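Here's a hypothetical sketch of that idea; the module name, function, and attribute names are all made up for illustration. The function only assumes its argument has a name and a mailing_address, so thanks to duck typing it is equally happy with employees, customers, or anything else that quacks the right way:

    # mailing.py - a small reusable module

    def address_label(person):
        """Build a mailing label for any object with .name and .mailing_address."""
        return "%s\n%s" % (person.name, person.mailing_address)

    # Elsewhere, both of these calls work, because of duck typing:
    #
    #   from mailing import address_label
    #   print(address_label(employee))
    #   print(address_label(customer))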

Multiple paradigms.

Although Python is not a gigantic language with the sort of sprawl that PL/I was notorious for, Python does allow for more than one programming style. It certainly has support for object-oriented programming as well as structured programming. Happily, it doesn't insist that you make use of all of its possibilities, but if you have been shying away from some aspect of the language because it supports a style you are unaccustomed to, do push yourself toward learning how to use that aspect of the language appropriately. "Generators" are a kind of co-routine, and you may not have run into such a control structure in other languages, but they are worth the time to learn. Object-oriented programming is still a weak area for me, but I've been working on understanding how to put it to good use. Test-driven development is new to me too, but again I've been trying to re-groove my mind to pay attention to doing things that way.
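For instance, a minimal generator looks like the sketch below. Each yield hands back one value and suspends the function until the caller asks for the next one, rather than building the whole sequence up front:

    def countdown(n):
        """Yield n, n-1, ..., 1, one value per request."""
        while n > 0:
            yield n
            n -= 1

    for value in countdown(3):
        print(value)    # prints 3, then 2, then 1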

Libraries and Frameworks

One of the mixed blessings of working in Python is that there are many rich libraries of existing code available for your use. Do plan to spend some time searching for what is available that may be helpful to you. The code is generally free for you to download, but you may need to invest some time to understand it and bend it to your will. Forking it to make a specific-to-you version is probably a bad idea, but reinventing the wheel on your own is probably an even worse use of your time. If a library module isn't going to fit comfortably into your program, that may be nature's way of telling you to return to your favorite search engine and find an alternative implementation to use instead of your initial "find".

Learning the Python language is just a start. Learning to put it to good use is a much taller order. As Peter Norvig promises in his essay, "Yes, you can learn to program in only 10 years". But don't get discouraged. Learning new stuff every day can be great fun.

11/10/2013 - Corrected a typo. "are are". Being your own editor has its hazards.

Saturday, July 27, 2013

MOOCs: Be careful what you wish for.

There's an old warning that you should be careful what you wish for, because you just might get it. Here's an article from the Slate "web magazine" cautioning that MOOCs are going to doom us all:

http://www.slate.com/articles/technology/future_tense/2013/07/moocs_could_be_disastrous_for_students_and_professors.html

Countering that point of view, the article does manage to provide a link to a TED talk by Coursera founder Daphne Koller, speaking at Edinburgh in June 2012:

http://www.ted.com/talks/daphne_koller_what_we_re_learning_from_online_education.html (21 minutes)

Now my own experience is limited. I took Udacity's CS101 course last year. It has no fixed schedule, so I worked on the course when time was available and did eventually complete it, with 100% on the final exam. I've signed up for another couple of Udacity courses, but haven't managed to finish either of them. Home Internet outages and a few weeks in the hospital distracted me from sticking with it. Of course, with no fixed schedule, I can always return to those courses and get down to work on them.

Koller and Coursera seem to put much more emphasis on a fixed schedule than Udacity does. For example, I've been taking Coursera's Systematic Program Design course and have to concede that I have not kept up with its mandatory schedule of weekly homework and quizzes. Realistically, I'll not be completing that course this summer and will have to try again the next time it is offered. It's not a bad course, but it does get tedious in its attention to microscopic details. I also got distracted with learning a new programming language (Racket) along the way. Too bad, as the course was to have had 2 peer-reviewed assignments, and they were to have been my first experience with peer-reviewed work.

I think the gloom-and-doom suggestions of the Slate article are a bit too Luddite a point of view for my taste. I think MOOCs will lead to a lot of change in education in the future. The MOOC courses will establish baselines which future improvements will have to beat. Koller emphasizes that the MOOC offerings collect a lot of data, which provides the prospect of guiding future improvements. The one thing that I can see screwing up a bright future of continuous course improvement is if copyrights are used to restrict building a better course on an existing course. U.S. copyright law seems to get repeatedly stretched with additional years to make sure Mickey Mouse, et al., never pass into the public domain. I can only hope that MOOC courses stick to the tradition of Creative Commons licensing so that it remains possible to found an improved course on an existing good one.

On a related topic: although Bill Gates is not a person I often have praise for, in this 10-minute TED talk he does a nice job of arguing that teachers need more feedback. A modest bit of technology, such as a video camera on a tripod, can give teachers a basis for self-assessment and the possibility of peer review.

http://www.ted.com/talks/bill_gates_teachers_need_real_feedback.html

Tuesday, July 16, 2013

Field Trip and an Unusual-Looking Flag

Flag image by Blas Delgado Ortiz, 27 June 2001

A few weeks ago, while we were visiting my brother-in-law in a nursing home in Queens, we had occasion to stop by the NYPD 110th Precinct station house to file a report about an incident. While my wife was chatting with the officer at the desk, I wandered around the lobby looking at the various posters and artifacts on display. I noticed that several of the pictures of "events" showed a distinctive flag - green and white stripes and 24 stars in a curious constellation arrangement, not simple rows or anything simply symmetric. After my wife had finished her business, I asked the officer what that flag was, and she waved me off, not knowing the answer. But the question stewed in the back of my mind, unsettled.

Today, I decided to try a Google search for:

flag green and white 24 stars

to see what it would find. The search easily turned up exactly the information I was looking for. It's the official flag of the NYC Police Department: 5 stripes for the 5 boroughs of the city (Manhattan, Brooklyn, Queens, the Bronx and Staten Island), and 24 stars representing the 3 cities, 9 towns and 12 incorporated villages that were consolidated to form NYC in 1898. A detailed explanation is given at http://www.crwflags.com/fotw/flags/us-nycp.html. So now I know. Can't say that I'm impressed with the depth of knowledge displayed by the officer at the station house desk.

I can't share this kind of information without mentioning the "Fun with Flags" episode of "The Big Bang Theory". Enjoy!

Sunday, July 7, 2013

Recursion

One of the interesting computer science topics that Udacity CS101 introduces is "recursion". Recursion is where you define a function in terms that, in some cases, require the function to call itself. This is the main topic of Unit 6 of the course. There are important design considerations to take into account if you are going to use recursion in your implementation of a function (a short sketch follows the list below):

  1. Base case(s) - It is crucial that your function have at least one combination of inputs that does not trigger yet another recursion. Such a non-recursive case is called a base case of your function. If you don't have at least one base case, then you are fairly certain to have an unending loop that never produces a final result.
  2. Progress - It is similarly crucial that your function make progress toward a base case as it recurses. If there are situations where the function re-invokes itself with exactly the same inputs and state as before, the failure may be more subtle, but you are almost surely stuck in an unending loop that never produces a final result.
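Here's a minimal sketch showing both considerations, using the classic factorial example:

    def factorial(n):
        if n <= 1:                        # base case: no further recursion
            return 1
        return n * factorial(n - 1)       # progress: each call works on a smaller n

    print(factorial(5))    # prints: 120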

Not every programming language supports recursion. Cobol and Fortran, for example, traditionally do not. Some languages (e.g. PL/I) support it, but only if you declare that a specific procedure may be invoked recursively ("PROC OPTIONS(RECURSIVE)"). Python supports recursion without any need to declare your intent to use the capability, but Python's support does not include "tail recursion optimization". Tail recursion optimization is where a language processor recognizes the special case in which a procedure calls itself recursively and, when that recursive call returns, the caller has nothing left to do except return in turn. A clever compiler can transform the code for such a program into a plain loop instead of a recursive call. Alas, in Python, much cannot be known for certain about the code until run time. Guido van Rossum, the creator of Python and its "Benevolent Dictator for Life" (BDFL), has blogged about why Python doesn't bother to try harder for this particular case. See: Tail Recursion Elimination.

A key fact to note is that if you've got code with a tail recursion in it, it is reasonably straightforward to restructure that code to explicitly use a loop in place of the recursion. It is apparently from this fact that Guido draws enough comfort not to bother optimizing the handling of this kind of code.
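Here's a sketch of that transformation. The first version is tail recursive (the recursive call is the very last thing the function does), and the second does the same job with a plain loop and no growing call stack:

    def count_up_recursive(i, n):
        """Print i..n using tail recursion."""
        if i > n:
            return
        print(i)
        count_up_recursive(i + 1, n)   # tail call: nothing left to do afterwards

    def count_up_loop(i, n):
        """The same function restructured as a plain loop."""
        while i <= n:
            print(i)
            i += 1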

Some folks look at the limitations of Python's support for recursion and wrongly conclude that recursion is a feature of the Python programming language that you should avoid. Ned Batchelder did a great job of debunking that assertion in his essay: Recursive Dogma.

There are lots of interesting discussions of recursion in the Udacity CS101 forum. Much of the debate is over whether recursion is easy or hard to wrap your brain around. Some folks find recursion an elegant way to express a function, while others opine that iteration is a more natural way to conceive of a function's processing. My opinion in this debate: even if recursion provides a straightforward, clean design for a function, do think through how to transform that design into an iteration. Compare the readability, performance and limitations of the two alternative designs and pick the one that makes the most sense for your needs.