Saturday, August 31, 2013

Out of the Box Thinking - 3D Printers in Space

It is fairly standard advice that if you want to come up with a new solution to a problem, you have to look at it differently. You need to think outside of the box of conventional solutions.


Background

For a while now, I'd been following the growing interest in 3D printers. If you're unfamiliar with them, see my Pearltree on the topic - a shared collection of links to web pages where you can read more. Short version: A 3D printer is a device that typically extrudes some material (e.g. melted plastic) through a nozzle whose position is under computer control, building up a specified object on a platform inside a box.


Prototype of a planned 3D-printing spiderbot for construction of buildings on earth. Image borrowed without formal permission from this site.


3D-Printers in Space

From time to time I've read mention of NASA being interested in using 3D printing in space to create replacement parts instead of having to wait for delivery. The FedEx delivery truck doesn't come by the International Space Station very often. So, I thought, that's interesting: I wonder to what unintended extent commercial 3D printers depend on gravity. I haven't thought it through, but I won't be surprised if some engineering revisions are needed before a 3D printer designed to work on earth will work properly in a zero-gravity environment. If nothing else, the absence of natural convection (hot air rising, cold air descending) may require adding forced-air cooling to a 3D printer used in space (assuming the printer resides in a pressurized area where there is air). Cooling times for materials extruded in the vacuum of the great outdoors of space are a whole other thing that needs to be thought through.


And then today I came across this article from the Verge web site, reporting on some plans by NASA to "3D Print" large structures in space. Plenty more questions than answers at this point, but it is clear that I have been guilty of confining my thinking about 3D printing in space to, quite literally, the inside of the box. The job of a 3D printer's box is mostly to support the print mechanism and to provide a frame of reference within which the print head moves. But in space, with zero gravity, you don't need anything to hold up the printing mechanism. Perhaps a finely adjusted GPS mechanism and a laser range-finder could guide the positioning of the extruder, and small thrusters could move it from point to point. Once you've literally gotten out of the box, you can think about 3D printing really large objects, such as rigid beams to hold together gigantic solar arrays: objects far larger than you'd likely be able to consider launching into space preassembled from the ground.


Seems to me that the hard part is still going to be getting the raw materials (and thruster fuel) into space. Perhaps really large structures fabricated in space will depend upon the hypothetical space elevator to lift materials to orbit at relatively low cost. Of course, if you can lift relatively large things to orbit, an engineering alternative would be to lift prefabricated sections and then have construction robots assemble the pieces into the desired large rigid beam structures. Then again, if you can figure out how to 3D-print the desired structures without needing material from earth, that might tip things in favor of space-based 3D printers, but I'm not at all sure how you'd go about fabricating anything from moon dust or the materials of asteroids.

Acknowledgement

Thanks and a tip of the hat to Nuno Cravino for sharing the link to the Verge article on the Google+ STEM Community. It is always good to see reminders from time to time of the importance of thinking outside the box.


While contemplating the illustration on The Verge article, I remembered there was an old episode of Star Trek where the Enterprise was trapped in a web-like structure. I'm pretty sure the web was made of energy beams, not solid extruded materials. Thanks to the Internet's excellent Star Trek reference sites, all I needed was a Google search for:


    star trek enterprise caught in a web


to find that it was Season 3, Episode 9: 1968's "The Tholian Web".


Image borrowed from Memory Alpha.org, the Star Trek Wiki.


imdb.com says there was an actual on-screen mention of the Tholian web in an episode of Futurama.

Wednesday, August 28, 2013

Learn How to Program Computers!

Say What?

News photo of an experimental self-driving Volvo from Engadget report.


Computer processors show up almost everywhere these days: desktops, laptops, cell phones, and processors embedded in cars, printers, cameras, even dishwashers.


One thing that they all have in common is that someone (or some team of folks) had to develop a set of instructions to guide the computer to do whatever it is that it does. Those instructions are called programs, or software, and the stuff that goes into a program is called "code" (not to be confused with "secret codes", though there is a connection if cryptography is what interests you).


So, no matter what, you are pretty much fated to be a user of computer programs, but with some effort on your part, you can learn to write your own computer programs too. Creating a computer program is called "software development". You can learn to do software development on your own. I'll caution up front that seriously big projects generally are tackled by teams, not individuals working solo. But the longest journey starts with but a single step.


My intent here is to convince you that you should embark on the long journey of learning to program computers, to do software development.

Motive

There are plenty of reasons why you should learn how to program computers. Before you begin on that path, you might want to think about what your motive is. Are you hoping to write your own computer games? Want to understand Cyberwarfare before you are noted as a threat by Skynet? Interested in autonomous robots? Self-driving cars? Want to develop a fancy web site of your own? Thinking it might be a useful skill when seeking a serious job? Want to understand how the NSA accidentally intercepted calls from Washington, DC, when they meant to intercept calls from Egypt?


I can't tell you what motivates you. I will caution that this is no short trip, so you should probably be sure you are well motivated before you dive in.


I'll also mention up front that the early days of getting started with developing your own software can be frustrating. As you master the basics, a self-driving car may seem awfully far away. Part of my hope here is that I can convince you that if you keep your motive in mind, you can work through the steep initial learning curve.

Age restrictions?

In my opinion, there's no upper bound on the age when you can learn to program computers. If you are past retirement age, that might alter your own list of motives for learning, but it is no reason to refuse to give it a try.


Can you be too young to learn to program? Well, programming does generally involve reading and writing, so if you haven't gotten proficient at those skills yet, you may find it hard to get into software development. But then, how is it that you are reading this article? The US apparently has regulations that strongly discourage web sites from registering information for children under age 13, so if you are under age 13, it is important that you discuss your plans with your parent or guardian and have permission from them to get involved in online courses.

Cost?

If you have access to a computer with a good Internet connection, you probably have all that you need to get started. If you don't, then how is it that you are reading this article?


Of course, if you have money to invest in this project of yours, there are things you might want to look into. For instance, books. You can get plenty of materials for free here on the Internet, but sometimes it can be useful to have a paper document that you can bookmark, dog-ear, highlight and annotate. Don't feel you have to invest in books up front, but if you can find a good local book store, you may find that there are useful things to be found by browsing. Your local library may also be useful, though in my experience, the local library tends to be woefully short of current technical books. The good news is that if you can find titles worth looking at, perhaps from web searches of places like amazon.com, then most likely your local library can arrange an inter-library loan so you can examine the book without having to buy it first.


Of course, library books, including books borrowed on inter-library loans, need to be treated politely: not dog-eared, highlighted, or given marginal annotations. And they do have to be returned after a relatively short time. If you find a title that looks really well matched to your needs, that's where you just might blow your allowance on an order from an online book-seller, so you'll have a copy of your own. Amazon.com does have provisions for wish lists, sort of like bridal registries, so as you publicly grow the list of titles you hunger for, you'll at least make it easy for folks thinking about getting you a birthday or Christmas present.


Libraries can also be a great place for getting free public access to Internet-connected computers. You might find that there are administrative obstacles to your installing software development tools on the library's PC's, or even filters to protect you from the educational materials. Don't let those barriers stop you. Talk to the librarian to find out who is in charge of those kinds of filters. Most likely, arrangements can be made for good reasons like you have.

Courses?

There are many computer programming courses available on the Internet for free. "Massive Open Online Courses", MOOC's, featuring different programming languages and different levels of material. Some aim to serve particularly younger students. You might look, for example, for courses that introduce the "Scratch" programming language.


I haven't taken a "Scratch" course myself, so I'm not going to single out a specific suggestion here. Just try a simple Google search for:


    scratch programming course


and let us know in a comment which course you picked and why, and how that went for you.


But if you feel you are ready to learn a somewhat more conventional programming language, one that will take you further than I believe Scratch will, my suggestion is CS101 from Udacity.com, where you will learn to program using the Python 2 programming language.


The Python programming language continues to evolve. There are Python 3 versions available today. The world is still catching up to that. You'll be fine starting with Python 2, and learning the differences later on to get over to the newer versions. It is important that you know that there are multiple versions and that when you are shopping for books or tools that you get a version that matches what the course is expecting you to have.
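To make the version difference concrete, here is a tiny sketch of two of the visible differences you'll eventually need to learn. It is written to run under Python 3, with comments noting what Python 2 did:

```python
# Two visible differences between Python 2 and Python 3.
# This snippet runs under Python 3; comments show Python 2 behavior.

# Division: in Python 2, 5 / 2 gave 2 (integers stayed integers);
# in Python 3 it gives 2.5.
quotient = 5 / 2          # 2.5 in Python 3
floor_quotient = 5 // 2   # 2 in both versions (explicit floor division)

# print: a statement in Python 2 ("print x"), a function in Python 3.
print("quotient is", quotient)
```

Nothing deep there; the point is simply that code and books written for one version won't always run unchanged under the other.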


There are other courses available, though I haven't tried the others myself. Certainly there are courses that teach Python specifically for game development. And there are plenty of courses that teach other languages, Java and C, for example. But in my opinion, CS101 from Udacity.com is a good place to start. It is free and self-paced; you work on it on your own schedule. Nominally, it is an 8-week course, and it has no prerequisite courses. I believe it is an excellent place to get started with MOOC's. There's a final exam at the end, and when you pass it, they will e-mail you a certificate to commemorate your accomplishment. There are reports on the web of real colleges that even give credit if you are a registered student and pass the Udacity.com CS101 course, but being a registered student at a physical college is outside the realm of stuff you can try for no charge.


If nothing else, trying a MOOC to get started will show you if this is a field that holds your interest. I know software development has been a long-standing interest of my own, but I also know that some fraction of the students who try it find that they absolutely hate programming. If you find that's the case for you, my advice is that you tough it out to completion of the introductory course and then look for other fields that do hold your interest. It's only a couple of months to work through Udacity CS101, and the introductory course isn't anything like a full-time course load. Getting the certificate isn't a good motive to start the course, but perhaps it is a good motive to stick it out to the end.

Where to find more?

code.org is a web site that advocates that everyone should learn how to code. They offer 3 editions of a promotional video to promote interest in the field. One is a 1-minute teaser. Another is a 5-minute edition featured on their web site's front page. And if you have 10-minutes to spare, there is a full edition available.


There are links on the code.org site to various local places to learn to program. For example, there's a brief plug there for the "Yes We Can Community Center" here in Westbury, NY.


Details of the schedule are not yet nailed down for Fall 2013, but if you are local to here, one way to take on CS101 is to sign up with the Community Center. The Center has the computers and Internet access, and it will have other students, so you won't feel too much that you are on your own. And, for what it's worth, I'll be available to answer questions and help keep you motivated while you are working through the course online.


Not quite free, as there is a membership fee to sign up with the Community Center. But use of the basketball courts, game room and locker room and access to quiet study space for your homework time all come with that membership, so it's probably worth joining if this is your community. (Use of the fitness center is not included in basic membership. Sorry).


The schedule I've proposed is that Monday evenings we'd meet as a class to share discussion of progress and problems. Other school nights I'd be available to answer questions 7-9PM or by appointment.


In any case, you can try udacity.com CS101 on your own before we even get started and then continue at the Community center once we get our act together there. Please, do let them know at the front desk that you're interested in taking CS101 as seating is limited.

Free for Senior Citizens

If you are a resident of North Hempstead and are age 60 or more, you are eligible for free membership in Project Independence and get a free community center membership too when you join Project Independence. Such a deal!

Further reading

Benefits of Teaching Kids To Code That No One Is Talking About - This blog post by an online acquaintance of mine has an example of a Scratch program, and a link to a video of a talk by the creator of Scratch.


Is Udacity CS101 Watered Down - This is a blog post from me in December 2012, describing what you should expect to get out of the online Udacity CS101 course.


Where to Get Python - This is another blog post from me. This one describes how you can install Python on your own PC. Note that back when I wrote that, Udacity was still using Python 2.6, but the course has since updated its software to Python 2.7. From an end-user point of view, that's an almost imperceptible change.


There are numerous Youtube videos available about learning to program games. Here is Episode 1 of a series that is dozens of videos long. Part 1 shows off a couple of games the guy has written in Python and describes what prerequisite knowledge he expects you to have to get started with his tutorials.


This isn't the first time that I've written about plans for CS101 at the Community Center. For more details of my intended format for the weekly meetings, see my blog article: Marketing the Importance of Programming Education

Thursday, August 22, 2013

What's the fuss about parallel programming?


A young friend of mine, now a 2nd year computer engineering student, asked me:

What is parallel programming? Why is parallel programming regarded as important in the future?

I don't have any idea about parallel programming and try to learn  by Googling. Yet,  it is difficult to understand. Why mutable data has been considered as inefficient in programming recently? How it creates problem and in what way functional programming avoids this mess? Will functional programming increase the performance of multicore systems?
Also, to which OS books should I refer? As I am starting my study on my own, and I want to get good at OS and basically able to understand the difference between linux and windows, which book should I follow? Earlier, you said that you are interested in operating systems and also best at it. Please, just suggest me some books which would able to justify the differences between linux and windows technically.
In which language is OS programming done?

Image of multiple processors taken somewhat out of context with a thank-you to Tom's Hardware, a web site where people try to keep up with this stuff


This is my reply to that e-mail...
You ask "What is parallel programming?" That's very similar to another topic you recently asked about: concurrent programming. Both concern how to write programs that do more than one thing at once so that overall performance is improved. E.g., if the time to run the program depends on "n" (perhaps n is the amount of input data to be processed), then what a parallel program wants to do is apply more than one processor to the problem so the job can be completed sooner than one processor could manage.

For example, if the job is to sort n items, you might divide the list up into a separate list per processor so each processor needs only sort a shorter list of items.   Of course, before the job is finished, the multiple shorter lists need to be merged together to make the final result.

Distributing the items across the processors is work, merging the lists back together again is work.   Whether the overhead of those extra steps is worth it or not depends on things like how much memory each of the processors has good access to.   If the items divided make a small enough list to fit in the RAM of each processor, then things are probably going to go very fast.    But if the sub-problems are still big enough that you need to spill things out to intermediate work files, and if the extra processors don't have good access to the disk space used to store the spill files, then the dividing up of things might turn out to be a net loss in performance.
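Here's a minimal sketch of that divide/sort/merge idea in Python, using the standard multiprocessing module. The function names (split_chunks, merge_sorted, parallel_sort) are just illustrative, not from any particular library:

```python
# A sketch of parallel sorting: split the input into chunks, sort
# each chunk in a separate worker process, then merge the results.
import heapq
from multiprocessing import Pool

def split_chunks(items, parts):
    """Divide `items` into `parts` roughly equal chunks."""
    size = max(1, -(-len(items) // parts))   # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]

def merge_sorted(chunks):
    """Merge already-sorted lists back into one sorted list."""
    return list(heapq.merge(*chunks))

def parallel_sort(items, workers=4):
    chunks = split_chunks(items, workers)
    with Pool(workers) as pool:              # one OS process per worker
        sorted_chunks = pool.map(sorted, chunks)
    return merge_sorted(sorted_chunks)

if __name__ == "__main__":
    print(parallel_sort([5, 3, 9, 1, 8, 2, 7]))
```

Whether this beats a plain sorted() call depends on exactly the overheads discussed above; for small lists, the process start-up and merging costs will swamp any gain.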

http://en.wikipedia.org/wiki/Parallel_programming_model

Moore's Law

You also ask "Why is parallel programming regarded as important for the future?".   Well, if you go way back to the early days of integrated circuits, Gordon Moore predicted in 1965 that the number of transistors on an integrated circuit would double every 2 years.   He thought that observation would hold true for 10 more years or so.   We actually have gotten through a lot more doublings than that and aren't done yet (though folks are starting to fret that they can see ultimate limits ahead - so it won't go on forever).
His prediction was more and more transistors and it isn't entirely obvious that that translates to mean faster computers.   But, in fact, what folks have done with those transistors is figure out ways to apply them to make faster computers.    If you look back to the earliest IBM PC's, the processor chip didn't even do floating point arithmetic.   If you needed faster floating point, you'd have to add a math co-processor onto the motherboard (there was a socket for that additional chip).

I confess to liking that idea of having separate useful pieces that you can custom integrate to create a tailored computer with exactly the strengths that you want.   Alas, the expense of having multiple chips connected together at the circuit board level argues powerfully against that piece-part model of the chip business.   The trend instead has been to absorb more and more functionality into a single chip - whole systems on a chip - just to be rid of the sockets and pins and propagation delays of getting off-chip and on-chip and back again.

So where did all the transistors get spent to speed things up?   Some of it is obvious.   Computers today have amounts of memory that were unthinkable just a few years ago.   Along with more memory, you certainly have more cache and more layers of cache to speed up access to that memory.   There's much to be learned in contemplating why there are more layers of cache instead of just bigger cache.   But that's a more hardware-centric topic than I'm comfortable explaining here as a software guy.

Besides more memory and more registers, the paths and registers have gotten wider.   Where there were 8 bits in the beginning, there are often 64 bits today.    You can try cranking that up in the future to 128 bits, but at some point you get into diminishing returns.   Slinging around 128-bit pointers in a program that could be happy dealing with only 32-bit pointers may not be optimal.    Maybe the problem is just that we need a little more time for programs to comfortably exploit gigantic memory spaces.   My PC today only has 2GB of real RAM.    32 bits is more than enough to directly address that much memory.  2^32 in fact is enough to directly address 4GB of RAM.   So the line of needing more than 32 bits isn't super far away. But 64 bits is enough to directly address 16 exabytes of RAM.   I can't even afford a Terabyte of RAM yet, so needing more than 64-bits is surely a long way away. (1 Terabyte=1024 Gigabytes. 1  Petabyte=1024 Terabytes.   And 1 Exabyte=1024 Petabytes).
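The address-range arithmetic above is easy to double-check in a few lines of Python:

```python
# Double-checking the address-range arithmetic from the paragraph above.
GiB = 2 ** 30              # one (binary) gigabyte
EiB = 2 ** 60              # one (binary) exabyte

addressable_32 = 2 ** 32   # bytes reachable with a 32-bit pointer
addressable_64 = 2 ** 64   # bytes reachable with a 64-bit pointer

print(addressable_32 // GiB)   # 4  -- i.e. 4 GB of directly addressable RAM
print(addressable_64 // EiB)   # 16 -- i.e. 16 exabytes
```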

http://highscalability.com/blog/2012/9/11/how-big-is-a-petabyte-exabyte-zettabyte-or-a-yottabyte.html

Those are really big numbers.   Bigger than even Doc Brown is likely ready to contemplate:

http://www.youtube.com/watch?v=I5cYgRnfFDA

But it isn't always obvious how best to spend the many transistors that the progress predicted by Moore has provided to us.   I see a certain amount of oscillation in design approaches as things get wide and then get back to serial again.   Look at ATA vs. SATA, for example.

http://en.wikipedia.org/wiki/Serial_ATA

One way to spend transistors is to make more complex circuitry to make the time for each instruction be shorter - do faster multiplication or division, but there's only so far that you can push things in that direction. Current consensus seems to be that making faster and faster processors is getting to be very difficult.   As clock speeds go up, the chip's thirst for electrical power goes up too and with that the amount of heat that has to be taken away from the chip to avoid reducing it to a puddle or a puff of smoke.   So, the industry's current direction is toward spending the transistors on having more processors with moderate speed per processor.   The aggregate instruction rate of such an array of processors multiplies out to nice high numbers of instructions per second, but the challenge is how to effectively apply all those processors to solve a problem faster than an older uniprocessor computer would be able to. Hence the anticipated growing importance of parallel computing in the future.

I think so far I've answered the questions in your subject line.   I hope you have the patience for me to try answering the questions in the body of your mail too.

A Day at the Races

I see your next question is "Why the fuss about mutable data?" Well, as I understand it, the concern is that if your data is mutable, you need to worry about inter-processor synchronization and locking so that when one processor updates a stored value, it doesn't interfere with another processor's use of that value.
The processing of read-only (immutable) data doesn't have to worry about locking and synchronization. But consider something as simple as A=A+1, where A is a mutable value. Underneath it all, your processor needs to figure out where the value of A is stored, fetch the value into an arithmetic register, add 1 to the value, and store the result back into the location for A. If A is accessible only to your one processor, there's little to sweat about, but if A is accessible to multiple processors, there's a potential for a race. What if both processors fetch the value of A and both increment their copies? Only one of them has the right answer. If they both store their new values for A back to the shared location, the final result is one less than it ought to be.
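The shape of the problem can be seen in a small sketch using Python threads. The load, add, and store of counter = counter + 1 are separate steps, so without the lock, two threads can both read the old value and one increment gets lost; the lock makes the read-modify-write indivisible. (This is a sketch of the general idea, not a claim about any particular hardware.)

```python
# The A = A + 1 race and its classic fix, sketched with threads.
import threading

counter = 0
lock = threading.Lock()

def safe_increment(times):
    global counter
    for _ in range(times):
        with lock:                   # without this lock, updates can be lost
            counter = counter + 1    # load, add 1, store back

threads = [threading.Thread(target=safe_increment, args=(10_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000 with the lock; often less without it
```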

One solution is to have specialized hardware that makes the A=A+1 operation be atomic, indivisible, so there's no chance of one processor seeing the old value when it should be using a new value.

There's the challenge of figuring out exactly which atomic instructions are the most useful additions to your instruction set design. IBM mainframes had an interesting, though complicated, instruction called compare-and-swap. As I remember it, the instruction took 2 registers and a memory location. If the first register matched the value in the memory location, then the 2nd register would be stored into the memory location. If they didn't match, then the memory location would be loaded into the 1st register. And the whole operation was indivisible, so a processor could do it without having to worry about whether some other processor was operating on the same memory location. You could use compare-and-swap to do our A=A+1 operation safely: fetch the value of A into a register, copy that register to a 2nd register, add 1 to the 2nd register, then do a compare-and-swap to store the result back to memory. If the compare-and-swap sets the condition code that says the 1st register didn't match, then sorry, but you have to repeat your computation: copy the newer value from the first register to the 2nd register, add 1 to get a newer result, and try the compare-and-swap again. Of course, if many processors are in hot contention for the value of A, you might have to spin for a while in that loop, trying to compute the right value and get it stored before it becomes stale.
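That retry loop can be sketched in Python. There is no real compare-and-swap instruction exposed here, so this toy compare_and_swap function uses a lock merely to model the indivisibility the hardware instruction provides:

```python
# A toy model of the compare-and-swap retry loop.
import threading

_guard = threading.Lock()   # stands in for hardware indivisibility

class Cell:
    def __init__(self, value):
        self.value = value

def compare_and_swap(cell, expected, new):
    """Toy CAS: if cell holds `expected`, store `new` and report
    success; otherwise report failure so the caller can retry."""
    with _guard:
        if cell.value == expected:
            cell.value = new
            return True
        return False

def atomic_increment(cell):
    # Fetch, compute, try to store; start over if another
    # processor won the race in the meantime.
    while True:
        old = cell.value
        if compare_and_swap(cell, old, old + 1):
            return

a = Cell(0)
for _ in range(5):
    atomic_increment(a)
print(a.value)  # 5
```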

The compare-and-swap instruction can be used for more than A=A+1 kinds of computations. For instance, consider a linked list of items, perhaps the runnable thread list in your operating system kernel. You want to be able to remove an item from that list. That involves fetching the link to the next item, fetching the link to the item after that, and then storing the link to that following item into the location that previously held the link to the item you are removing.

    A  ----> B ----> C becomes A ----> C

As with the A=A+1 case, there's the potential for a race if multiple processors are contending to pick B off the list. Compare-and-swap can at least make it safe from races, but again, if there is hot contention among many processors, there can be much wasted spinning before a processor succeeds in grabbing B off the list.
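The same toy modeling works for the list case: treat a node's next field as the location the compare-and-swap targets, and retry the unlink if another processor got there first. As before, the lock inside compare_and_swap only stands in for the hardware's indivisibility:

```python
# Unlinking B from A ----> B ----> C with a toy compare-and-swap.
import threading

_guard = threading.Lock()   # models hardware indivisibility

class Node:
    def __init__(self, name, next_node=None):
        self.name = name
        self.next = next_node

def compare_and_swap(node, expected, new):
    """Toy CAS on a node's `next` field: swap only if the link
    still points where we expect it to."""
    with _guard:
        if node.next is expected:
            node.next = new
            return True
        return False

def remove_next(node):
    """Unlink and return the item after `node`, retrying if another
    processor changed the link between our fetch and our store."""
    while True:
        b = node.next              # the item we want to remove (B)
        if b is None:
            return None
        if compare_and_swap(node, b, b.next):
            return b               # A ----> B ----> C became A ----> C

c = Node("C")
b = Node("B", c)
a = Node("A", b)
removed = remove_next(a)
print(removed.name, "->", a.next.name)  # B -> C
```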

So, if you have careful control at the machine instruction level, the problem is practically solved. But that sort of implies that you drop down into assembler language from time to time, or that you have a compiler that generates incredibly clever object code that knows where to use these specialized multi-processing instructions. What if you are using a garbage-collected language like Java or Python? Maybe your problem is worse than the value of A becoming stale between your fetch and your store back to memory. Maybe the location of A has changed entirely, and your store operation is smashing something entirely different from the variable A. Big trouble ahead... In fact, if you think in terms of Python, maybe by the time you are trying to store the new value, A isn't even an integer any more. "Gee, it was an integer value when I fetched it. Who the heck changed it to be a floating point number in the meanwhile?" It could be subtler still: Python will happily and silently promote an int to a long if the value gets too big to fit into an int, so you need to be very careful that the value you fetched still makes sense before you store the result back to memory.

The article I pointed you to the other day, "Downfall of Imperative Programming", asserts that "Imperative programs will always be vulnerable to race conditions because they have mutable variables". So functional programming languages, by avoiding mutable variables, dodge a major bullet in the multiprocessing world. The thing that I don't know is how to be sufficiently productive in a functional language like Haskell for it to be worth the trouble to learn. The Downfall article predicts that race conditions are an insoluble problem for imperative programming language implementations. I'll happily accept that there's trouble ahead to watch out for, but I do have a bit of difficulty accepting that the races absolutely can't be resolved.

Python's Global Interpreter Lock

Python worries about the possibility of races among threads interpreting the instructions of Python code. It has a "Global Interpreter Lock" (GIL) to assure that one interpreter thread won't change a value in use by another interpreter thread. Folks worry that this coarse level of locking will keep Python programs from being able to scale up with increasing numbers of processors.
I've seen some clever dodges of the GIL in Python programs, mainly by spreading the program across separate address spaces (multiple Python interpreters, each with its own GIL) and limiting interprocess interaction to some carefully controlled set of places in the code, with appropriate locking protections. On the one hand, this doesn't give transparent scaling from a uniprocessor to M processors all running in parallel, but on the other hand, it does get the job done.
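A minimal sketch of that "separate address spaces" dodge, using the standard multiprocessing module: each worker is its own Python interpreter with its own GIL, and interaction is confined to the Pool handing chunks of work out and collecting results back. The count_primes function here is just a stand-in for any CPU-bound work:

```python
# Dodging the GIL: separate processes, each with its own interpreter.
from multiprocessing import Pool

def count_primes(bounds):
    """CPU-bound work: count primes in the half-open range [lo, hi)."""
    lo, hi = bounds
    count = 0
    for n in range(max(2, lo), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    ranges = [(0, 2500), (2500, 5000), (5000, 7500), (7500, 10000)]
    with Pool(4) as pool:          # 4 interpreters, 4 GILs
        total = sum(pool.map(count_primes, ranges))
    print(total)                   # 1229, the number of primes below 10,000
```

Threads inside a single interpreter would serialize on the GIL for this kind of CPU-bound loop; separate processes can genuinely run on separate cores.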

My (weak) excuse for not having more first hand experience with this...

My home PC doesn't bring multiprocessors to the party.   Some day I hope to replace it with an i5-ish based computer with 64-bit addressing and >4GB of memory.   As a retiree with a rather modest pension, that's a discretionary expense that I've been postponing into the future.  Maybe in the meanwhile my target will shift to something with way more processors than an i5.   What I have in mind is something with enough oomph to be able to run Linux and Windows both in virtual machines (Based on Xen, VMWare, something else?  I don't know...). Heck, Microsoft isn't even making it easy to buy such a configuration without paying for a Windows license twice (once bundled into the PC's base price and then again for an installable copy that can be installed into a VM).  I'm assuming that a re-install CD that wants to reload Windows onto a bare PC isn't going to be able to install into a VM environment.   I'm expecting that multi-processor race conditions and their associated problems will come along naturally to my world once I have a rich enough configuration and that encountering those problems on more than just paper will motivate me into doing something about them.
Maybe I'm just old-fashioned in thinking that what I need is a richer computing environment here at home. Maybe the right thing to do is to venture out into things like Amazon's cloud computing service and see what kind of trouble I can get into using other people's multi-processors via the Internet. One of my worries about that is that the underlying MP nature of their cloud services may be too deeply wrapped for me to really see the problems I'd be up against from MP. And "look, dear, here's the marvelous new computer I just bought" is a much easier conversation to anticipate having with my wife than "Just let me pay this bill for cloud services. It isn't so much money and I did really learn from having tried their services."

Comparative Operating Systems

You ask me to recommend an OS book to better understand Windows vs. Linux. I don't know which book is the right choice. Certainly an Amazon or Google search will turn up a large number of candidate titles. Perhaps your school's library has some of those titles so you can look them over, or perhaps the library can arrange inter-library loans so you can examine some of the candidates. "Which of these is best?" is always a tricky question because the answer depends so much on your particular criteria for "best".
So let me turn this around and ask you for a summary of your findings from digging into the too-long list of candidate titles, along with your recommendation. You might want to put your question to the professor of your school's OS classes too. Maybe he's got a better-formed opinion on this topic than I have.

Linux Weekly News

Meanwhile, I stand by my suggestion that you make an effort to keep up with lwn.net (free of charge at the price of lagging a week behind the most current articles) to see what is going on in the Linux world. Don't feel obligated to run the newest and most experimental kernel on your home PC, but if you spend some time watching the evolution and planning of kernels, you'll have a better idea of Linux's strengths and weaknesses and what "they" are doing about the weaknesses. Unlike Windows, if you are sufficiently motivated to want Linux to be different than it is today, you can make that happen.

Kernel programming languages?

What programming languages show up in OS programming? Well, at this time, I expect the correct answer is C. Other languages (e.g. Java and Python) do show up in supporting roles, but generally don't make it into kernel code. Even C++ tends to demand too rich a runtime environment to be a good candidate for kernel code. Maybe as time goes on the kernel will sprout suitable layers of capability to make higher-level languages more attractive for implementing functionality within the kernel, but right now, if someone tells you a kernel is written in C++, ask them more questions to confirm that. It wasn't all that long ago that the likely choice for programming an OS kernel was surely assembler language. Unix introduced the C language and the then-radical idea of using a higher-level language in the kernel, and even having kernel code that is somewhat portable across computing system architectures. (To calm the historians in the audience, I'll concede here that I may be under-crediting the Multics operating system, portions of which were written in PL/I. And the Multics site gives credit to Burroughs for having done a kernel in Algol, but that's way before even my time.)
Stack Overflow question on the languages of the Android OS:

http://stackoverflow.com/questions/12544360/on-what-programming-language-is-android-os-and-its-kernel-written

Stack Overflow question on the languages of Mac OS X, Windows, and Linux:

http://stackoverflow.com/questions/580292/what-languages-are-windows-mac-os-x-and-linux-written-in

Not every answer on Stack Overflow is to be trusted to be correct...

One sub-link of that article that I followed and that does look interesting and credible:

http://www.lextrait.com/vincent/implementations.html

lwn.net article on what's new in the Linux 3.11 kernel expected to become available in September 2013...
http://lwn.net/Articles/558940/
This is a particularly interesting link from one of the many comments on that lwn.net article about 3.11:
http://www.softpanorama.org/People/Torvalds/Finland_period/xenix_microsoft_shortlived_love_affair_with_unix.shtml

In Closing...

You quote me as saying that I'm best at operating systems. I tried rummaging through old mail to you to put that statement in context, but didn't succeed in tracking down what I said. I will concede that I'm especially interested in operating systems, and given a list of computer science topics, I'm probably more interested in operating systems than in most of the others, but claiming I'm "best" at operating systems surely needs some context.


I confess that except for command-line pipelines, I've never actually written a multi-threaded program of my own. So don't assume more expertise here than I actually have.

Thursday, August 8, 2013

Software Development isn't a Field for Loners

A young friend of mine, a first-year student studying to become a computer engineer, recently sent me a copy of his school's 3rd-year curriculum, asking for comments. Now, on the one hand, I'm no curriculum expert, not having really looked at a college catalog for computer science in many a year. But, on the other hand, it's a rare topic that leaves me without an opinion when I'm asked to look at something.

On the plus side, the curriculum was packed full of technical courses. It was quite different from my own experience at Cornell U., where every semester in Cornell Engineering had room for at least one elective. My recollection is that the catalog's requirements at Cornell actually demanded that your courses not all be in "your" school. That requirement was sometimes annoying to comply with, but in the long run, I do think the University succeeded in exposing me to more diverse backgrounds by forcing me to get out and about beyond the engineering school. As far as I can tell, at my friend's school everyone in his major is expected to work through the same heavy technical course load in their 3rd year.

In my opinion, the weakest aspect of the curriculum I was looking at today was the lack of team or group projects. Why do I think that is so important? Well, software development isn't really a field for loners. Programmers may get mocked at social events for not fitting in, but software development is most definitely a team activity. Where do we go wrong at social events? I think we're simply so darn focused on the challenging problems at hand at work that we fail to notice the techy stuff is of no interest at all to the Muggles at the party.

Contemporary programming methodologies (e.g. Agile, Scrum, Extreme Programming) pretty much mandate that you be able to work and communicate with other people. Now, I concede that it is remarkably difficult to build "collaboration" into an engineering curriculum. First the programming language has to be taught, then algorithms, then the rudiments of collaboration tools (e.g. git, code review tools, ...) before you can even think about springing a team effort onto some subset of a class. Getting students to actually work together, especially at a non-residential college, can be difficult. To be interesting, the project has to be big enough that splitting the work up across the team makes sense, but it still has to fit into a semester of mortal effort. Figuring out how to fairly assign grades in such a course, where maybe some of the team has done more than the rest, is doubtless a hard problem, one that I have no real solution for here.

I do remember some group projects from my undergraduate days. In our OS class, the assignment was to create an OS. My recollection is that we had some reasonably bright people on our team, but the semester ended without the OS really gelling into a working whole. We had managed to run into some really interesting problems in our software and we passed the course, perhaps on the strength of our war stories that showed we'd indeed really tried to think it through.

And not quite a group project, but scarily close to the real world: I remember a class in file processing. We'd been given an assignment and were told to bring in the runnable decks of punch cards as part of what we had to turn in. The professor collected the assignments and then gave them back out again, but you didn't get back your own program - You got someone else's program. The next assignment was to modify that someone else's program to add a new output report to it. I'm not sure which was more painful: modifying the crappy program I'd been handed without taking the time to rewrite it entirely, or looking at how someone else modified my program to add on the new report routine. Most definitely one of the more educational software assignments in the many software assignments I worked through as an undergrad.

So that's the problem as I see it: the lack of collaborative software development in engineering schools. Has anyone got real-world examples of a curriculum that in fact teaches collaborative software development at the undergraduate level? If you've got an example or counter-example, please add a comment to this article to tell us about it.

I tried to design a collaborative software development course to suggest to a local community college here. In the end, I decided that wasn't going to work. A junior college program nominally has to fit into a mere 4 semesters. They are barely able to introduce C in their curriculum. I believe you'd want a much higher-level language, like Python, as the basis for any large-scale group project. Python libraries and modules, object-oriented programming patterns, ... Too much to cover before you could even start to talk about working as a team on a larger project. I'd have no objection to swapping out their C course for one that instead introduces Python, but I think that'd be a tough political battle, with the opposing forces arguing that C is a much more commercially important programming language than Python.

So I dropped back and punted: the 4-semester problem was fitting in programming, software development techniques, and collaboration. In my opinion, the crucial new material is the collaboration, so I've been sketching out a course on how to be a member of a work team. I believe it wouldn't even need to be specific to the computer science department, but could be offered within the school's communication department. My organized collection of links to web pages I dug into while working on the idea of such a course is available on Pearltrees.com: rdrewd collaboration pearltree. Ironically, I couldn't find a collaborator to work on the project with me, so it has somewhat fallen by the wayside. If you'd be interested in a possibly long-distance collaboration on the design of such a course, please contact me. My email: Drew's mailbox (no spam, please).