Saturday, September 27, 2014

Actual code - C vs. Python for a small problem

If you aren't much interested in writing software, this post is probably not for you. If you read the post without following the links to the example implementations, then I'm not sure what you are doing here, though you are certainly welcome to ask questions.


Recently on Quora.com someone asked to see C code to find the first 20 prime numbers. Seemed like an easy enough question, but the first guy who answered it, gave a C++-based pseudocode and some tips on how to convert that to C. That answer didn't make me happy. Heck, I figure if you're going to write in some other language, why not Python as a sort of testable pseudocode, but the question really needs that answer to then be re-stated in C.


So I sat down and banged out a Python program to find and print the first 20 prime numbers. That code is here: find20primes.py


I used a list to hold the list of primes. When the list grows to 20 items, I'm done. I start the list by telling it 2 is a prime number. I then only consider odd numbers as additional candidates. If the candidate is divisible by any of the prime numbers we've found so far, it is not a prime number. If the candidate isn't divisible by any of the prime numbers we've found so far, then the candidate is a prime number so we add it to our list of primes and, in ether case, add 2 to the candidate number to get the next candidate number to be tested. In C, we'll need a little more book keeping for the list, but since the max size of the list is bounded (20 items), things should translate from Python to C pretty easily. One trick that isn't entirely obvious is the use I made of Python's for...else control structure to catch the case of the for loop not exiting via break. We can paper that over with a goto in C, or you can add some flags


I was thinking that C has labeled break statements to let me do sort of the same thing as that for...else. But much to my chagrin, that's a feature of Java, not C. Oops. So, goto it is in my C translation of the program.


So I sat down with that Python listing open in one window and typed up find20primes.c That code is here: find20primes.c


I believe it is a straight-forward translation of the Python program into C. Of course, the C program is bigger in the sense of it has more lines of code. There are lines added to delimit the blocks and loops in the program, and lines added for variable declarations and lines added for the annoying book-keeping that C needs me to do where Python just handles those chores. I did run into some trouble where I didn't get the book keeping entirely correct the first time through. The program outputted 20 primes, but it started with 2, followed by 73, followed by 3, and it left out 71 at the end of the list. Huh? Turned out I was mistakenly incrementing primecnt before I'd used it to fill in the next slot in the primes array, so I skipped the primes[1] slot and just got 73 there by bad luck. Python would have told me about an uninitialized variable if there'd been a spot in the Python program for me to have committed that same kind of error.


Having finally gotten both programs working, conventional wisdom is that the C code should be much faster than the Python code. So I used the "time" command to see how the timings compare.


Using cygwin Python 2.7.8 on Windows 7 on a PC with an i5 processor and 8GB of RAM,

$ time python find20primes.py

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]

real    0m0.136s
user    0m0.000s
sys     0m0.062s

Using cygwin cc 4.8.3 on that same Windows 7 PC:

$ time cc find20primes.c

real    0m0.238s
user    0m0.045s
sys     0m0.185s

$ time ./a.exe
2
3
5
7
11
13
17
19
23
29
31
37
41
43
47
53
59
61
67
71

real    0m0.064s
user    0m0.015s
sys     0m0.015s

The execution time for the compiled C program was way less than the total time for the Python run, but if you add in the compile time for the C program, Python comes out ahead. Your mileage may vary.


By the way, if I throw in a -O option ("optimize the generated code") on the cc command, it further slows the compile down, while the run time is about the same.

$ time cc -O find20primes.c

real    0m0.336s
user    0m0.031s
sys     0m0.185s

$ time ./a.exe
2
3
5
7
11
13
17
19
23
29
31
37
41
43
47
53
59
61
67
71

real    0m0.086s
user    0m0.000s
sys     0m0.015s


(the "real" elapsed time varies a lot from run to run. These run times are so short the variability makes it hard to make fair comparisons in races between the programs). I suspect that the real time is dominated by the console I/O to print the results, so optimizing the details of the loops isn't going to do much improving of things.


Now to make things more interesting, suppose I wanted to pass in as a command line argument the number of primes to be found. So long as the primes are still in the range of ints that only needs a malloc of the array in C but if the program is ever asked to get up into really big primes, the Python code will keep on chugging while the C code will need substantial rework to pull in an arbitrary precision integer library module. This blog article is already long enough, so I'll leave that generalization to "findNprimes" up to you. How big does N have to be to break the C code's use of ints? I suppose that with relatively minor surgery, you could convert over to use of long ints (or if your compiler supports them, to long long ints) in the C program. Looking into it lightly, it appears that the cc included with Cygwin supports long long ints. The format strings will need adjusting. Use %lld instead of %d.


If you've feeling anxious to try further experiments, see how pypy does in handling the find20primes.py program. That'll probably be more interesting with findNprimes.py with a really big value for N.


The moral to the story is that C may not be the best choice of programming language for this kind of problem.


Did you enjoy this article or was it too technical? Please add a comment down below so I know what you think about this sort of writing.