Wayne's World of C Code (The Sieve of Eratosthenes)
I had to develop an instructor solution to a simple C problem, the calculation of primes using the Sieve of Eratosthenes.
The first challenge in teaching the Sieve of Eratosthenes Beta of Cyrene (who was, according to Wikipedia, called Beta because he produced second-prize winning solutions on several occasions) is twofold:
(1) Getting everybody to spell sieve correctly by teaching the i before e rule, the full form of which is "i before e, except after c, and except for the word weird which in being a legacy of old English and from Anglo-Saxon *wyrd* (a man's soul, his destiny, his *wyrd*) has a legacy spelling." The first part up to the weird part is my late Mom's rule: the weird part, the part about weird and wyrd, is my own discovery.
(2) Getting everybody to spell and pronounce Eratosthenes.
It is IMPORTANT to get everybody to spell sieve and Eratosthenes correctly. This is because spelling errors cause HAVOC in program maintenance and at build time, where the maintainers and the build team often have an aural understanding of keywords.
If some clown has named a variable, and object, or a file with his favorite mis-spelling of a common word, the change breaks or the build crashes.
Returning to C was interesting, for I'd forgot that its bugs are those of assembly. In the following solution I initially declared the sieve array as intSieve[SIEVESIZE - 1]...thinking in Visual Basic and not C.
Because I knew that C programming is consistently more bug-ridden than programming in a truly high-level language, I coded an "inspect" routine to check the sieve upon completion of the requested table of primes for all and only prime numbers.
Imagine my chagrin when in testing certain ranges the inspect routine reported failure. It displayed a message saying that the sieve, which is supposed to contain prime numbers (since to determine primehood you need ONLY to divide by all primes up to one-half the candidate) contained the nonprime number 999!
Looking at the code I realized I was back in the thrilling days of C and yesteryear, for the code did not appear accidentally to assign an index value to a table entry. I realized that I'd been bitten by the fact that in C, it is always possible to assign garbage to a variable in an unpredictable way and that I'd somehow gotten 999 into the LAST entry of the table!
I felt like Woody Harrelson when in Terence Malick's noble film THE THIN RED LINE, Harrelson's character accidentally pulls the pin on his grenade and just before he dies, says, "What a f**kin' recruit error."
But I went immediately to the declaration of the sieve and then to the C standard on the Web. Oops, ok, right, size of the table, not upper bound, got it, I'm ON the mothaf*r boss.
My current praxis in OO languages including VB.Net and C# is to develop objects with an inspect routine that checks a number of assertions. The genesis of this practice is my days with C and before that with assembler, where I realized the need for a level of caution one level above that which is needed with languages that like VB and C# prevent the vision of the memory that C "allows", a "vision" that makes all too clear the von Neumann adjacency of variables that in a TRUE high level language are in a noncontiguous space.
Wow, that hurt. What I mean is that it is an historical accident, the accident of the von Neumann machine, that "variables" are neighbors in a sequential neighborhood less like Mister Rogers and more like that of Eddie Murphy:
This is how we ansah da door in MAH neighborhood, boys and girls:
"WHO DAT!!?"
Which means that inspect and use of the assert() function are critically important in C, and that its supposed "power" is a vice: someday in the far future a fabulist will tell of mad men in a city who called their chains, heaps of pearl.
Now, you should be aware that certain personalities, among them the editor of C Unleashed, Richard Heathfield, don't think I "know" (in the sense of grok, in the sense of being with the program, in the sense of coding right, in the sense of being a regular guy) C. Richard acidulously explained my assisting John Nash with C as an instance of the ape with the typewriter, in fact.
It is true that I have not used C in recent years, and my style tracks old fashioned C book C, Wayne and Garth era Wayne's World C. Nor am I like to return to C full time except as a part-time teacher of C. I do indeed need in this role to get up to speed on recent developments, and truly good practice.
[Richard Heathfield and his team have some good advice in his book, but don't believe everything you read.]
But as a teacher of C I plan to stress its flaws and its pitfalls from my POV (point of view).
A critical theory of computer science demands teachers who are informed less by loyalty to artifacts and more by hatred of the flaws of artifacts: David Gerlenter of Yale is the exception for he has said he dislikes most computers and their software.
The rule is some part-time prof and full time corporate slave who is too overworked to research alternative paradigms and thus filled with a passionate loyalty to a hobby-horse. The class divide in American education visits such wads upon students who can't afford a better school, where the ACM guidelines are more strictly followed, and academic freedom trumps corporate demands.
Anyway, here is the code. Comments and corrections are welcome.
Note: the program is designed to run from the command line using this syntax:
"sieveOfEratosthenes"
where and are the lower and upper bounds of the desired table of primes.
Yes, I named the EXE, sieveOfEratosthenes. May as well practise our spelling of the two words!
Editor's Note: Updated version of this code, and accompanying commentary, posted here.
// ***********************************************************************
// * *
// * sieveOfEratosthenes *
// * *
// * *
// * This program shows an example of work that may be considered *
// * excellent in your computer science program. It calculates a range *
// * of prime numbers from a starting point to an ending point, using *
// * the standard "tricks" to save time: *
// * *
// * *
// * * Each prime found can be recorded in the famous "sieve" of *
// * Eratosthenes, and to test primality, we can just divide *
// * candidates by sieve members...until the numbers in the *
// * sieve run out. *
// * *
// * * If the sieve does not contain all prime numbers up to n-2 *
// * where n is the (odd) candidate then starting after m+2, *
// * where m is the last element of the sieve (or 3 when the *
// * sieve is empty), we can divide the candidate by odd numbers *
// * up to and including floor(n/2) (the integer part of the *
// * candidate n divided by 2). *
// * *
// * *
// ***********************************************************************
#include <stdio.h>
#include <string.h>
#include <time.h>
// ***** Constants and macros *******************************************
#define SIEVESIZE (1000)
#define SYNTAX_REMINDER \
("sieveOfEratosthenes <start> <finish> [<sieveLimit>]")
#define IS_FACTOR( n, m ) ( (n) % (m) == 0 )
#define MAX( x, y ) ( (x) > (y) ? (x) : (y) )
// ***** Tools ***********************************************************
// -----------------------------------------------------------------------
// Convert character to its numeric form
//
//
int char2Number( char chrInChar )
{
int intInCharValue = ( int )chrInChar;
int intZeroValue = ( int )'0';
if ( intInCharValue < intZeroValue
||
intInCharValue > ( int )'9' ) return -1;
return intInCharValue - intZeroValue;
}
// -----------------------------------------------------------------------
// Convert a string containing an unsigned number to its value: return
// true (-1) on success or false (0) on failure
//
//
// This function returns false when the number's value overflows, but in
// this case, its reference parameter intValue is still returned. In
// this case the value will be negative.
//
//
int string2Number( char *strInstring, int *intValue )
{
int intIndex1;
int intCharValue;
for ( intIndex1 = 0, *intValue = 0;
intIndex1 < strlen( strInstring );
intIndex1++ )
{ intCharValue = char2Number(strInstring[intIndex1]);
if ( intCharValue < 0 ) return 0;
*intValue = *intValue * 10 + intCharValue;
if ( *intValue < 0 ) return 0; }
return -1;
}
// ***** Main ************************************************************
int main( int intArgCount,
char *strArgValues[] )
{
if ( intArgCount == 2 && strcmp( strArgValues[1], "/?" ) == 0 )
{
printf( strcat(strcat("The %s program calculates primes ",
"in a user-specifiable range. "),
"Use the following syntax: %s"),
strArgValues[0],
SYNTAX_REMINDER );
return 0;
}
int intStart;
int intFinish;
if ( intArgCount < 3
||
! string2Number(strArgValues[1], &intStart)
||
! string2Number(strArgValues[2], &intFinish) )
{
printf( "Syntax: %s\n", SYNTAX_REMINDER );
return -1;
}
int intSieveLimit = SIEVESIZE;
if ( intArgCount > 3
&&
( ! string2Number(strArgValues[3], &intSieveLimit)
||
intSieveLimit > SIEVESIZE ) )
{
printf( "Sieve limit is not valid\n" ); return 0;
}
printf( strcat(strcat("Sieve of Eratosthenes primes from least prime >= %d ",
"to greatest prime <= %d\n"),
"Sieve contains %d entries\n"),
MAX(intStart, 3),
intFinish,
intSieveLimit );
int intSieve[SIEVESIZE];
int intPrimes = 0;
int intSievePrimes = 0;
int intDisplayedPrimes = 0;
int intTest;
int intHalfway;
int intIndex1;
int intFound;
time_t ttTimeStart; time( &ttTimeStart );
int intFoundInSieve = 0;
int intFoundUsingBruteForce = 0;
for ( intTest = 3; intTest <= intFinish; intTest = intTest + 2 )
{
intHalfway = (int)(intTest / 2);
intFound = 0;
for ( intIndex1 = 0;
intIndex1 < intSievePrimes
&&
intSieve[intIndex1] <= intHalfway;
intIndex1++ )
if ( IS_FACTOR( intTest, intSieve[intIndex1] ) )
{ intFound = 1; break; }
if ( ! intFound )
for ( intIndex1 =
intSievePrimes > 0 ? intSieve[intSievePrimes - 1] + 2 : 3;
intIndex1 <= intHalfway;
intIndex1 = intIndex1 + 2 )
if ( IS_FACTOR( intTest, intIndex1 ) )
{ intFound = 2; break; }
switch ( intFound )
{
case (0):
{
if ( intSievePrimes < intSieveLimit )
{
intSieve[intSievePrimes] = intTest;
intSievePrimes++;
}
intPrimes++;
if (intTest >= intStart)
{
intDisplayedPrimes++;
printf( "%d%s\n",
intTest,
intPrimes == intSieveLimit ?
" End of the sieve" : "" );
}
break;
}
case (1):
{
intFoundInSieve++; break;
}
case (2):
{
intFoundUsingBruteForce++; break;
}
default:
{
printf( "Programming error: unexpected case\n" );
return 0;
}
}
}
time_t ttTimeEnd; time( &ttTimeEnd );
int intIndex2;
for ( intIndex1 = 0; intIndex1 < intSievePrimes; intIndex1++ )
{
for ( intIndex2 = intIndex1 - 1;
intIndex2 >= 0;
intIndex2-- )
if ( IS_FACTOR(intSieve[intIndex1], intSieve[intIndex2]) )
{
printf( strcat( "Programming error: nonprimes occur in the ",
"sieve at %d and %d: %d and %d\n" ),
intIndex1, intIndex2,
intSieve[intIndex1], intSieve[intIndex2] );
break;
}
if ( intIndex2 >= 0 ) break;
}
if ( intIndex1 != intSievePrimes )
{
return 0;
}
printf( "Number of primes found: %d\n", intPrimes );
printf( "Number of nonprimes found using the sieve: %d\n", intFoundInSieve );
printf( "Number of nonprimes found using brute force: %d\n", intFoundUsingBruteForce );
printf( "Number of primes in the sieve: %d\n", intSievePrimes );
printf( "Number of primes displayed: %d\n", intDisplayedPrimes );
printf( "Time in approximate seconds: %d\n",
(long)ttTimeEnd - (long)ttTimeStart );
return 0;
}
Editor's Note: Updated version of this code, and accompanying commentary, posted here.
C should be given a Viking funeral, IMO
...*qua* C, it has to provide access to aliased variables and as such should no longer be used.
Oh, I suppose it could have a "safe mode" with aliasing turned off...but this is a mistake (witness the way in which different semantics from different Option statements causes problems).
Furthermore a safe and OO C already exists. It is called Java.
The reason I'm using C apart from job requirements at this time is that I managed to smoke my installation of Visual Studio 2003 and while waiting for a replacement I wanna code. So I downloaded a free C compiler from Jacques Navia and David Hanson of Princeton (with whom I studied briefly, before I bailed out of his compiler course owing to work demands).
I am working on some representative solutions and one idea I have is to make them suitable for use in the developing world by restricting them to a standard command line interface. Ultimately, I want to create a library of tools that could be used even on hand-cranked laptops or cell phones in the developing world.
In a paradox, one is told that one's desire to use low level facilities to recreate high level facilities is self-indulgent but I MAY see an opportunity here to "reinvent the wheel".
That's my image, chains as heaps of pearl.
Names are wrong: lcc was
Names are wrong:
lcc was written by Chistopher Fraser, David Hanson.
lcc-win32 was implemented by Jacob Navia based on the above work.
That is correct
omundo.
Note also that this compiler isn't "free". My Bonehead Hobbyist use is OK but even a professor in a university cannot use it for "free".
I never got back up to speed in C although I was thinkin' about it last winter when my Vaio was on its last legs, and I was outraged that I'd spent all that money to get a system that only lasted three years. I am now working with Microsoft VB and C# express which are free for noncommercial use and I will get Visual Studio next month.
My error illustrates that even if you return to C coding after years away, and especially if you are new, you need to get input on your code (but not from lawn trolls who use their knowledge of C as a marker of personal adequacy, of course).
But even when I used C regularly, I used it in a nonstandard way, with long identifiers and heavy use of the preprocessor. I was seeking OO in C because this is what a programmer with intelligence does. Bjarne Stroustrup was supposed at Bell Labs to be working on something other than a massive preprocessor for C which became C++.
There's no point in reinventing suffering or the wheel, or casting your pearls before swine.
Do I treat assertion failure differently in debug v production?
No.
The great Bjarne Stroustrup, in his magisterial Introduction to C++ (a book not only on the mechanics of C++ but also on the normative ethics of programming, if written from a rather dour, Scandinavian perspective, that eschews my sort of malarkey) asks a very simple question.
Perhaps because as a Dane, Stroustrup loves ships and boats, he asks the reader if in a boat, you would keep the life rafts and the life presevers on a shakedown cruise in the harbor...and jettison them when you are Outward Bound to the open sea.
Of course, Bruce Ismay of the White Star line did just this, overriding his chief engineer. We all know, courtesy of The Greatest Chick Flick of All Time, what happened to the Titanic.
No, in my view, objects should self-inspect and enter a Safe mode (in which they respond by raising a standard error and either doing nothing, and/or returning a default value).
The mention of Ismay, who was a coward who stole a seat meant for women and children, is apposite.
Because to handle runtime errors, including the most serious runtime error, a bug in the program that has made the check, the programmer necessarily has to code in excess of requirements.
Which gives managers the willies.
Ismay thought it not economic to carry a sufficient number of lifeboats. The engineer could not make a "business case" in 1912 because marine safety requirements did not call for them at the time (because of the Titanic, they were changed).
Runtime checks should be pretty much the same for production and for debugging unless, of course, they are excessive in terms of time given the nature of the problem.
If the run time checks are indeed excessive, the source code should keep the checks, using the preprocessor to suppress their generation for the run build as opposed to the debug build.
"Build it safe and build it stout, out of things you know about" - Bell Northern Research engineer, circa 1981
segfault => clueless author
The fact that you must understand a computer in order to use C properly is not an indictment of the language. If anything, it is a strong indicator that learning C will help a person to understand the computer. I fail to see how that is a bad thing.
The fact that your tutorial segfaults when I execute it with the "/?" option to return the usage statement is a strong indicator that you understand neither the C language, the conventions of modern programming, nor your computer. Why on earth should I examine your code more thoroughly? You've attempted to write your usage message into a random section of memory prior to printing it. That's not a "trivial beginner error"--it's a glaring proof that the author of this code has no clue what he's doing.
Comments on my error
...will be treated with courtesy and respect for indeed I FORGOT that strcat needs memory to concatenate into, IF those comments are not so absurdly global and personal as above.
A "flame" war erupted this evening in comp.lang.c. But I've resigned from this flame war since for me to so heavily participate in a newsgroup in which I claim no special expertise in C (as opposed to software) would violate nettiquette.
You are welcome to courteously post a replacement for my erroneous use of strcat. That would be most constructive.
See blog
"Fix to my evil seive"


The Sieve of Nilges
Hi Edward,
There's a few gems hidden in this innocent-looking post about a C-language prime number exercise. I'm going to start experimenting with the inspect() method idea, which you've written about before. Do you treat the failure of a given assertion differently in debug vs. production mode?
Reading your post also reminded me of Andrew Binstock's recent column in SD Times, titled "The Slow Demise of C?". He points out an interesting fact: "The foundation on which the open source community is built (Linux - Apache Web server - MySQL - Python/Perl, aka LAMP) is entirely written in C."
He also points out that "a fair part of Java" and "of course, C++" are written in C.
These facts are interesting in and of themselves, but I think Binstock's real point is, "So, with C being the key to so much important software, why does the language get so little respect these days?"
I wish I could quote the whole column (or link to it--it's not available online), because he makes a number of good points, including "the trouble is, of course, that despite the great systems software in C, most developers need more productivity out of their language--certainly more than they need the extra performance kick of C." But he also feels that with language improvements and tool support C productivity could be improved.
He also laments the fact that the ANSI C99 standard "sits around unread by compiler writers today," and finally concludes that, "by this process of a thousand cuts, the language that continues to serve faithfully atrophies for lack of tending by vendors."
I've quoted prodigiously here, but there is much more in the column than the highlights I've selected, and he brings up several issues that could all in themselves be discussions. Binstock's call for improvements to a languishing C language reminded me, Edward, of your assertion against a "loyalty to artifacts." This from your post is a beauty, by the way:
Is that a reference, or is that yours?
You both come at the subject of the past, present, and future of C from different directions, but your post and Binstock's column criss-crossed in my brain today.
Best,
Dan