Tuesday, February 9, 2016

Project Euler Problem 1 - In Python

Project Euler

Project Euler is a web site with hundreds of problems for you to tackle however you want to solve them. Some of them, if you are sufficiently adroit mathematically, can be solved on the back of an envelope. Most of them clearly need a piece of software to grind through the calculations. You can use whatever programming language you prefer. the site asks only for the answer. Each problem yields a numeric answer. If you provide the correct answer, the site credits you for solving the problem and grants you access to a discussion page. The discussions are mostly people bragging how they solved the problem. The bragging isn't anything important, but it can be useful to look at other people's approaches to see other ways of looking at the problem that perhaps hadn't occurred to you.

The one "guideline" is that a good solution should need no more than a minute of time on your computer. If your solution needs way more time than that, then you should look for a better solution.

Tell Ya What I'm Gonna Do...

I'm going to show you how to solve some of the project Euler problems using the Python 3 programming language. Instead of having you install Python 3 on your computer, I'll be using the "Python in the cloud" facilities of pythonanywhere.com. You can sign up for a free account of your own there. For small programs like ours, they offer their service for free. If you use their site to build something that becomes enormously popular (say, the next "Farmville" game), you'll be needing to pay for their services, but we're far, far from crossing that line between trivial computing load and significant computing burden.

It'd be useful if you figure out a good way to prepare your Python programs in a file, but you can start with notepad or whatever simple editor you are most comfortable with. If you tell us in the comments on this blogpost what editor or IDE ("Integrated Development Environment") you prefer, it may influence whether I write future articles addressing that editor or IDE.

Project Euler Problem 1

The first problem on project Euler is one that calls for very little code. I'll tell you that I approached the problem as a computer programmer, but in the discussion page for the problem there were people who knew how to compute sums of a series using a simple formula and they were able to readily solve problem 1 using only pencil and paper. That's just proof that although I've worked with mathematicians, I'm a software guy, not a mathematician.

The crux of problem 1 is: Find the sum of all the multiples of 3 or 5 below 1000.

The example given makes clear that they are only asking us to consider positive integers. So no need to worry about negative 3 and negative 5 and so on. ("If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.").

Spoiler Alert.

I'm going to talk here about how I solved the problem. If you want to solve the problem on your own, stop reading here and go solve that problem. Feel free to come back later and tell us in comments how your solution is better than mine. Better in what sense?

Solving Problem 1 in Python

At this point, I am tempted to show you the small program that I used to solve the problem, but that is so small a program that you'd probably glance at it and walk away grumbling "nothing to learn here", so I've decided to creep up on the solution a bit more slowly. I hope you pick up some comfort with Python programming along the way.

The General Shape of the Program.

Generally, the way to write a program is to start out with a rough idea of what the program should look like. You can write this down in "pseudo-code" - a mixture of your native language and whatever bits of programming language structure you are comfortable to stir in. One of the delights of Python is that Python code looks a lot like pseudo-code, so if you nudge your pseudo-code into being real Python code, you can test out your pseudo-code by running the program. When it works, perhaps you are done, but if your actual target language isn't supposed to be Python, then you have a working prototype that you need to re-code into your intended target language (Perhaps C or C++ or even some assembler language). It is worth noting that in the real world, often a manager faced with a working prototype that works well enough, will suddenly decide that a Python implementation is plenty good enough to declare the problem solved, even if its the first instance of Python code admitted into "production" use in that shop.

This program is going to need an iteration (loop) to consider all the natural numbers less than 1000. And inside that loop it is going to need a conditional statement (if statement) to select the numbers that are multiples of 3 or of 5 (which for whatever reason are the numbers that this problem considers to be interesting). And for the numbers that are multiples of 3 or 5, we'll need a little arithmetic to accumulate the sum of the selected numbers.

There's more than one way to do it. For example, we could loop through all the non-negative integers <1000 and build up a list of the numbers that meet the criteria of being "interesting". Then we could take the sum of that list to get the desired answer. But we have no further use for the list in this problem, so I assert it is simpler to just accumulate the sum as we go.

Another approach would be to generate the multiples of 3 and to generate the multiples of 5 that are less then 1000, and then tally up the generated lists, but you'd need to be careful not to include any numbers twice. Some numbers (e.g. 15) are multiples of both 3 and of 5, but only should get added into the sum for this problem once, so I assert it is simpler to just consider each of the candidate numbers and accumulate the ones that meet the criteria for being interesting.

Got another plausible shape for the solution this problem?

So, I'm going to stick with my initial proposal which in Python-like pseudo-code looks like this:

sum=0
for num in numbers 1 thru 999:
    if num is "interesting":
       sum += num
print(sum)

Note that in Python, indentation is used to delineate the blocks of code. My Python-like pseudo-code follows that same convention. Thus the body of the loop statement is indented under the "for" that introduces the loop. The body of the "then" clause that is made conditional by the "if" statement is indented inside the "if". Since the "if" is inside the "for" loop, the accumulation of the sum is doubly indented. The initialization of sum to zero is outside of the "for" loop so it isn't indented at all. The "print" statement is also not indented. We don't want to print the partial sum on each iteration of the loop, so it is important not to indent that final "print". But maybe when you are debugging, a print statement inside the loop would be a helpful addition. That would be done by adding another print statement and indenting it so it is inside the loop. You might want to include a comment on your debugging code so you can trim the debugging code out when your program is in good working order.

The "sum += num" statement is Python short-hand for "sum = sum + num", since accumulating totals is such a frequently needed operation.

I hope you've noticed that the pseudo-code is not quite working python, so we aren't done yet. The most glaring magic is how do we really decide if a given number, which we've named "num", is "interesting"?

How to test if a number is a multiple of some value

So we need to consider how to test a number to see if it is a multiple of 3 (or of whatever number we want to test to see if it is a multiple of). Python has a modulo operator (%) that is documented here. x % y is how you ask Python to compute the remainder of dividing the number stored in x by the number stored in y. So all you need to realize is that if x is a multiple of y, that the remainder will be 0, and if x is not a multiple of y, then the remainder will not be zero.

So if we have a number to test named num, then we can test if num is a multiple of 3 by using this Python code:

num % 3 == 0

The result of that test expression is a True or False value (True if the remainder is 0. False if the remainder is not 0). But we want to know if the number is divisible by 3 or is divisible by 5. Happily, Python has an "or" operator that will let us combine 2 True/False values in exactly the way we need.

Here's a copy/paste of my PC anywhere session where I tried this out:

>>> num=27
>>> num % 3
0
>>> num % 3 == 0
True
>>> num % 5
2
>>> num % 5 == 0
False
>>> num % 3 == 0 or num % 5 == 0
True

As you can see, the Python rules for operator precedence did exactly the right thing for us here, but I find that long expression a bit hard to read, so I added some un-necessary parentheses to make it easier for a human to parse.

>>> (num % 3 == 0) or (num % 5 == 0)
True

How to conjure up a list of numbers?

In many programming languages, Fortran for instance, the way you conjure up a sequence of numbers is you have a variable to serve as a counter. You initialize the counter to a starting value, then you increment the counter to get the next value and you need a test to decide when you've gone as far as you want to go. If you've programmed in C, that's what a "for" loop in C does. But there's a more Pythonic way to do this in Python, so please take care not to write Fortran (or C) code in Python syntax. A Python "for" loop looks like
for x in list:
    do something with x
Where x is an arbitrary variable name that takes on each of the values in the list and the body of the loop (which I've represented as "do something with x") processes each of the values as x is stepped through the list.

If you want to read more about looping in Python, especially if you are comfortable with looping in other languages, I strongly recommend Ned Batchelder's blog post "Loop like a native".

Not only does Python have a statement to consider each of the values in a list of values. It also has a built-in generator of a list of values. In Python 2, "range" is a built-in function to return a list of integers, but such a list takes up space. So "xrange" was introduced to return a generator that'll provide the desired sequence of integer values on demand without ever actually creating the list as a whole in memory. xrange worked well enough and explaining the distinction between range and xrange was ugly enough that in Python 3, xrange became range and you only need to know its a generator if you're interested in the details of how stuff works. So just say range(1000) to conceptually whip a list of numbers 0 through 999.

So now our pseudo-code has morphed into runnable Python code:

sum=0
for num in range(1000):
    if (num % 3)==0 or (num % 5) == 0:
       sum += num

print(sum)

And we're done, except for running the code to get the answer. I'll not reveal the numeric answer here, so please learn how to run this yourself.

Mystified? Please tell me if I've confused you so I can polish things up before my next blog post.