I'm the Daddy, that's why
I have found in San Francisco that I simply need to visit the public library every day to get full wireless access.
As a subproject to my late but promised effort to improve the compiler for Build Your Own I am writing "yet another" XML parser. Most available parsers annoy me by exposing only a limited set of functionality.
In a corporate environment, I feel guilty about this feeling of annoyance and try to reuse, overall with less success than many .Net developers because in fact it often takes me less time to reinvent.
The parser consists of two objects, xmlParser and xmlTagInfo. xmlParser parses all the tags.
What's interesting is that as I design and code xmlTagInfo, I am having my usual experience with the nodes of trees: the more functionality you put in the node, the less you need in the tree.
Consider clone() and compareTo(). I decided to make xmlTagInfo ICloneable and IComparable. But this confronts me with the decision as to how deep the clone should be...and correspondingly, how deep the compare.
In cloning any object that has reference objects in state, the basic decision is whether you want to clone those objects recursively, OR set a reference in the clone.
My instinct is to do a deep clone by default, but support two alternatives: shallow cloning (such that the clone is in fact "stuck to" the source insofar as it references the same objects) or NO cloning (such that the references in the clone are validly Nothing).
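A minimal sketch of the three choices, written in Python rather than VB for brevity. The names CloneMode and Holder are invented for illustration and are not part of xmlTagInfo; the sketch assumes an object with exactly one reference in its state.

```python
import copy
from enum import Enum


class CloneMode(Enum):
    DEEP = "deep"        # referenced objects are cloned recursively
    SHALLOW = "shallow"  # the clone shares the source's referenced objects
    NONE = "none"        # the clone's references are left empty


class Holder:
    """Hypothetical object whose state includes one reference."""

    def __init__(self, name, ref=None):
        self.name = name
        self.ref = ref

    def clone(self, mode=CloneMode.DEEP):
        if mode is CloneMode.DEEP:
            new_ref = copy.deepcopy(self.ref)
        elif mode is CloneMode.SHALLOW:
            new_ref = self.ref   # "stuck to" the source
        else:
            new_ref = None       # the analogue of Nothing
        return Holder(self.name, new_ref)


source = Holder("tag", ref=["attr"])
deep = source.clone()
shallow = source.clone(CloneMode.SHALLOW)
bare = source.clone(CloneMode.NONE)
```

Changing source.ref afterwards would show up in shallow but not in deep, which is the practical difference between the two.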
The problem is that among the references in the state of an XML tag object are ONE Parent and 0..n Children.
Needless to say, while deep cloning, you will deep clone the Parent. In so doing, you need to avoid a recursive clone of one of its children and THAT is the object you started with.
In cloning the Children of any object, you also need to avoid any attempt to clone their parent.
Therefore, the Private implementation of the clone method has to support an objDontCloneMe parameter set to the object that must be skipped. When calling the cloner recursively for Parent, this parameter is set to You (well, actually, Me).
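The skip-parameter idea can be sketched as follows, again in Python for brevity. TagNode is a hypothetical stand-in for xmlTagInfo, and the second parameter (skip_copy, the already-made clone that should be spliced in where the skipped object sits) is an assumption of this sketch, not something the design above specifies.

```python
class TagNode:
    """Hypothetical stand-in for xmlTagInfo: one parent, 0..n children."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def clone(self):
        """Public entry point: deep clone this node and its whole tree."""
        return self._clone()

    def _clone(self, skip=None, skip_copy=None):
        # skip is the neighbour that called us and must not be cloned
        # again; skip_copy is its clone, linked in where skip would go.
        twin = TagNode(self.name)
        if self.parent is not None:
            if self.parent is skip:
                twin.parent = skip_copy
            else:
                twin.parent = self.parent._clone(skip=self, skip_copy=twin)
        for child in self.children:
            if child is skip:
                twin.children.append(skip_copy)
            else:
                twin.children.append(child._clone(skip=self, skip_copy=twin))
        return twin


root = TagNode("root")
a = root.add_child(TagNode("a"))
b = root.add_child(TagNode("b"))
a1 = a.add_child(TagNode("a1"))

a_copy = a.clone()  # clones root, b and a1 along the way
```

Because a tree has no cycles other than the parent/child back-links, skipping only the immediate caller is enough here; a general object graph would need a table of already-cloned objects instead.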
This sounds complex and there may be further gotchas but at the end of the day, what you have is an xmlTagInfo that can clone not just itself but also the tree of which it is a part.
For the GUI, the large amount of recursion means I need events to fire when a recursive clone (and compare) call is made.
All this is quite a lot of Work, but I have long needed a simple XML parser (able to operate purely on the basis of syntax) to be able to provide the inverse function to my object2XML functions found in the software for the book.
Why? Because I'm the Daddy, that's why.
However, the whole Nilgesian philosophy of core procedures means to me that ultimately, I need my own OO language which would start, not with a minimal Object, but one able to clone itself and support a sensible core set of stateful properties, unless it is not an Object (something with state) but a pure class.
Shouldn't EVERY object be able to clone itself and compare itself to others? Shouldn't EVERY object be able to convert its state to tags?
Hmm, this question reminds me of a discussion with my cab driver on the way to the Chinese consulate to apply for my new visa. He was one of those heroes from the "third" world who's worked all his life and feels that people should get along without welfare or health insurance, mostly because they can.
Minimalism appeals. In software, "maximalism" annoys as in the case where users of a system feel it is forcing them into an unnecessary framework...just as citizens of the GDR felt that beyond being corrupt, the GDR's infrastructure of subsidized sports, free but lousy (in being ideologically rigid) educational opportunities, and the primacy of Marxism-Leninism was a major drag: a choice made for them.
But I honestly feel that IF an object has state, THEN it should be able to clone, and there is but one right way, providing options to vary the way as in the choice between deep, shallow, and no cloning of reference delegates.
This makes me part of an older generation of geeks in search of some paradigm for everybody's use. However, and in fact, I make this search purely for my own amusement, like Gramsci in prison.
Reinventing da wheel
Paradoxically, as programmers, we're encouraged to reuse closed solutions when true reuse, in my view, is the reuse of algorithms in the public domain.
Take a look, if you have a copy of my book Build Your Own .Net Compiler and Language, at the source code for the stateless utilities library that underlies the compiler. I will include a list of all its methods at the end of this post.
The genesis of this library is at Standard Oil in 1981 where I was programming in a completely outdated shell language, EXEC-2, for the CMS mainframe system. I noticed that I was always parsing blank-separated words where I needed to forgive the user for entering variable numbers of blanks, therefore developed a very simple parser for blank-delimited words, later extended to a set of tasks unrelated except insofar as they could be accomplished as black boxes without state.
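As a rough illustration of that first parser (in Python, not EXEC-2 or VB), here is a sketch of blank-delimited word extraction that forgives runs of blanks. The function names echo the library's word method, but the signatures and the 1-based indexing are assumptions.

```python
def split_words(text):
    """Split on blanks only, ignoring empty pieces left by repeated blanks."""
    return [piece for piece in text.split(" ") if piece]


def words(text):
    """Count the blank-delimited words in text."""
    return len(split_words(text))


def word(text, n):
    """Return the nth (1-based) blank-delimited word, or "" if there is none."""
    parts = split_words(text)
    return parts[n - 1] if 1 <= n <= len(parts) else ""


command = "  COPY   fileA    fileB "
second = word(command, 2)
count = words(command)
```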
This library evolved from EXEC-2 into Rexx and thence to VB. It is also influenced by PL/I (as was Rexx) in that it calls its answer to C's strspn and strcspn, "verify".
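For readers who have not met it, a sketch of a verify-style scan, in Python. It is modeled on the Rexx VERIFY built-in (1-based positions, 0 when nothing is found) rather than on the library's actual signature, which is not shown in this post; the parameter names are assumptions.

```python
def verify(string, reference, match=False, start=1):
    """Return the 1-based position of the first character of string that
    is NOT in reference (or, with match=True, the first that IS in it).
    Return 0 when no such character exists."""
    for pos in range(max(start, 1), len(string) + 1):
        in_reference = string[pos - 1] in reference
        if in_reference == match:
            return pos
    return 0


DIGITS = "0123456789"
first_non_digit = verify("123a5", DIGITS)          # the "a" at position 4
first_digit = verify("abc123", DIGITS, match=True)  # the "1" at position 4
all_digits = verify("12345", DIGITS)                # nothing to report
```

The default mode covers roughly what strspn is used for (how far does the run of allowed characters go), and match=True covers the strcspn case.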
The problem is that the library creates genuine blinders. One that has recently come to my attention is pointed out by Kathleen Dollard in her book on .Net Code Generation.
The VB.Net programmer needs to quietly abandon the old VB library in favor of string and other methods because the old VB library encapsulates the old, Basic, way of doing things.
One of its biggest flaws is in the simple CInt, which (as the Wikipedia entry for Visual Basic points out) uses banker's rounding of Single and of Double values. Most of the time you should use Math.Floor to get the strict integer part (bearing in mind that for negative values Floor rounds down, away from zero), and prior to .Net, my utilities library extracted the "floor" by parsing the string representation of the Single or Double value.
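The difference is easy to see in a few lines. Python's built-in round happens to use the same round-half-to-even ("banker's") rule, so it stands in for CInt here; this is an illustration of the rounding rule, not of VB itself.

```python
import math

# Each row: value, banker's rounding, floor, truncation toward zero.
rows = [(x, round(x), math.floor(x), math.trunc(x))
        for x in (0.5, 1.5, 2.5, 3.5, -2.5)]

for value, bankers, floor, trunc in rows:
    print(f"{value:5} bankers={bankers:2} floor={floor:2} trunc={trunc:2}")
```

Note that 0.5 and 2.5 round down to the even neighbours 0 and 2 while 1.5 and 3.5 round up, and that for -2.5 the floor is -3 while the truncated integer part is -2.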
Therefore, my utility library may have gotchas because the utilities are by now almost ten years old, in some cases.
At the same time, I quietly refuse the appellation old fart because this is primarily an American renarration of valuable EXPERIENCE.
Here is the promised list of functionality. You SHOULD get the utilities library, if you want it, by buying my book and then downloading the software. You CAN get the library without buying the book, but please...don't be cruel, to a heart that's true.
abbrev: return True when string1 is an abbreviation of string2
append: append a string with or without a separator
appendPath: append to a Windows path
align: align and fill a string
asciiCharsetEnum2String: obtains ASCII character sets
baseN2Long: convert nondecimal base values to Long values
bg2fgColor: convert a background to a foreground (font) color
breakLongWords: ensure words are of manageable size in text
canonicalTypeCast: convert an object to a standard type
canonicalTypeEnum2Name: convert the type enum to name
canonicalTypeName2Enum: convert the type name to enum
changeRecord: return change record in EGNSF format
char2Name: convert the special character to its name
charset: return international character sets
commonRegularExpressions: return useful regular expressions
copies: make many copies of a string
datatype: test a string for a data type
datatypeEnum2Name: data type enumerator to name
datatypeName2Enum: data type name to enumerator
dequote: remove quotes from a string
determineNewline: determines the newline character
deweyParent: returns the parent Dewey decimal number
directoryExists: check for Windows directory
display2String: convert displayable string to value
ellipsis: abbreviates string to ellipsis...
enquote: intelligently enquote a string
errorHandler: trivial error handler
extendTextBox: extend a text box (Windows)
file2String: read a file, return its contents as a string
fileExists: return True when file exists
fileidParse: lightweight parser of a broad range of file ids
findAbbrev: find potentially abbreviated string
findBalParenthesis: find balancing parenthesis
findItem: locate item in string
formatOutline: format an outline for a monospace font
hasReferenceType: determines if object is a reference object
histogram: models data on a value range
incrementDewey: add one to a Dewey decimal number
inspectionAppend: append inspection and test reports
int2Digits: convert positive integer to width
isQuoted: return True when string is quoted
isXMLcomment: return True when a string is an XML comment
isXMLname: return True when a string is an XML name
item: return the nth delimited item from a string
itemPhrase: return the nth through mth adjacent items
items: return the count of items
itemTest: test the item method
joinLines: join two multiple-line lists
line: return the newline-delimited item
lines: return the count of newline-delimited items
listBox2Registry: save the list box in the Registry (Windows)
listItem: return the comma-delimited list item
listItems: return the count of comma-delimited list items
long2BaseN: convert Long integers to nondecimal bases
mkXMLcomment: make an XML comment
mkXMLelement: make an XML element
mkXMLtag: make an XML tag
Name: merely returns the name of this class
name2Char: converts the character name back to the character
numbers2Variables: convert each number in a string to variable
object2Scalar: convert an object to a scalar
object2String: convert an object to a string
objectInfo2XML: convert object about info and state to XML tag
parseXMLtag: parse an XML tag
phrase: return the nth through mth adjacent words
properCase: capitalizes first letter, lowercases rest of string
randomSentence: create a random sentence
range2String: create a single string from a start/end range
replaceXMLmetaChars: replace XML special chars and &
soft2HardParagraph: format paragraphs
smallBusinessList: comma-delimited list of small business names
spinlock: wait for a locked resource, then lock it
string2Box: place a string in a box of asterisks
string2Display: convert a string to a displayable form
string2File: write a string to a file
string2Object: convert many strings to an object
string2Percent: convert string to percent
string2Range: convert a string to a range of strings
string2Sentinel: find string, that doesn't occur in input str
string2ValueObject: convert a string to a value object
tempFileid: return an available temporary file identifier
test: run stress and smoke tests on the utilities library
testAvailable: returns True (stress tests are doable) or False
translate: translate source to target characters
trimContainer: adjust form/container width and height (Windows)
utility: runs one of these methods from the quickBasic engine
verify: scan a string
word: return the nth blank-delimited word from a string
xmlMeta2Name: convert XML meta-characters to their names
xy2Point: creates the Point object from coordinates
xy2Size: creates the Size object from height and width
Now, this library forms, in the manner of a coral reef, an accretion of a world-view and a Complex Instruction Set. As such, it is very much a two-edged sword.
Just as coral insists on its own existence by being sharp and spiny to the touch, a library can never avoid offending somebody.
Furthermore, the utter triviality of some of its methods can be lampooned.
enquote(string) places a string in quotes. Actually spending time on such a method can be a Termination Offence in some companies I've worked at.
I mean, Ed, what part of """" & strInstring & """" doncha understand?
Until we ask what if strInstring containeth embedded double quotes.
OK, what part of """" & Replace(strInstring, """", """""") & """" donchoo understand? Or in a more modern style, how about """" & strInstring.Replace("""", """""") & """"??? Better yet, use the Concat method and not the operator!
But notice something. The modern style doesn't at all avoid the main problem...which is the appearance, time after time, of the ugly (in the sense of slightly counterintuitive) string """".
The key phrase is time after time. A function point appears in the form of the Idea of Enquote: "place my strings in VB standard quotes", but despite the simplicity and elegance of the idea, it doesn't correspond with a simple enquote(string) function or a simple string.enquote method. And, note that the two notations are comparable in simplicity and brevity: using the black box function notation is comparable to using the method.
The technical problem in VB is the lack of inlining, in which the simple function can be compiled as a macro without the call mechanism: but long ago, I decided that even the older call mechanism was efficient enough to warrant the stylistic improvement.
Furthermore, for a specific client, the enquote function can be extended and overloaded to accept a style parameter allowing different enquote standards such as ordinary text and C.
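A sketch of what such a style parameter might look like, in Python for brevity. The style names "vb" and "c" and the escaping rules chosen for the C style (backslash and double quote only) are assumptions of this sketch, not the library's actual behavior.

```python
def enquote(text, style="vb"):
    """Place text in double quotes, protecting embedded quotes
    according to the chosen (hypothetical) style."""
    if style == "vb":
        # VB convention: an embedded quote is doubled.
        return '"' + text.replace('"', '""') + '"'
    if style == "c":
        # C convention: escape backslashes first, then quotes.
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    raise ValueError("unknown enquote style: " + style)


vb_form = enquote('say "hi"')
c_form = enquote('a\\b "c"', style="c")
```

The point of the wrapper is that the doubled-quote clutter is written once, inside the function, instead of at every call site.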
In conclusion, I don't regard utilities as an Evil Framework but indeed as in need of continual rethink.
To Adorno, in his almost unreadable Aesthetics, any creative act is necessarily critical, and this is why programmers deliberately mask their creativity using typically "male" discourse.
We say to each other around de campfire, "don't reinvent the wheel", "don't be creative, sweetheart", and "don't do me any favors, do your job".
This is because utilities started life as my PROTEST against the lack of anything like a decent shell system for IBM's CMS...a lack that was not rectified by IBM but instead by a guy named Mike Cowlishaw, who worked evenings and weekends to create the Rexx programming language...so he'd have something to do his "real" job that did not look like an explosion in an ampersand factory. Mike became an IBM Fellow, but I've seen guys terminated for the same stunt (I've also seen guys make millions by imposing their PhD projects, as an OS, on an entire company, a stunt which in my experience destroyed the company).
In 1995, utilities.CLS was under Adorno's logic a PROTEST against having to use VB merely because I was too bone lazy to develop C++ GUIs, and an attempt to make VB coding more like Rexx.
Worked for me. The point is that each such decision is in an entire context and can never be prejudged by sayings, folklore and saws. Today, I need to listen to Katie Dollard and morph into more use of methods and less of functions, and deeper Inheritance.
At the same time, I have to ask, why are so many VB (.Net and COM) projects so bloated, whether by hand code or generated code? It seems to me that the intelligent use of abstract classes MIGHT debloat, but this is only a tentative opinion.
In this connection, under the XML parser, I've decided to implement the core functionality (About, class2XML, dispose, inspect, Name, mkUnusable, test and Usable) as an abstract class.
I've decided to implement this abstract class using (eep) preprocessor symbols to control whether a fully threadable class with state is generated (in which each stateful procedure has to go through a central dispatch_() method in order to ensure a locked state) or a serially threadable class with state is generated...in which Public procedures can perform their work without calling a locking dispatcher.
The full threading will be crude, since it won't let you partition the state into multiple objects for more fine grained threading.
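The two variants can be sketched roughly as follows. Python has no preprocessor, so a module-level constant (FULLY_THREADABLE, a name invented here) stands in for the compile-time symbol, and CoreBase with its counter is a hypothetical stand-in for the real abstract class; only the single-lock dispatcher idea is taken from the description above.

```python
import threading

FULLY_THREADABLE = True  # stand-in for a compile-time preprocessor symbol


class CoreBase:
    """Hypothetical base class with one piece of state (a counter)."""

    def __init__(self):
        self._lock = threading.RLock()  # one lock guards the whole state
        self._count = 0

    def _dispatch(self, work, *args):
        # Central dispatcher: every stateful procedure runs under the lock.
        with self._lock:
            return work(*args)

    def _increment(self):
        self._count += 1
        return self._count

    def _get_count(self):
        return self._count

    if FULLY_THREADABLE:
        # "Fully threadable" variant: public procedures go through dispatch.
        def increment(self):
            return self._dispatch(self._increment)

        def count(self):
            return self._dispatch(self._get_count)
    else:
        # "Serially threadable" variant: public procedures work directly.
        def increment(self):
            return self._increment()

        def count(self):
            return self._get_count()


counter = CoreBase()
workers = [threading.Thread(target=lambda: [counter.increment() for _ in range(1000)])
           for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
total = counter.count()
```

The single lock is exactly the crudeness mentioned above: two threads touching unrelated parts of the state still wait on each other.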
As to the use of preprocessor symbols, they are considered Evil by many younger programmers, but I believe they still have a limited role in VB. Perhaps not at all in C++: I am reading the new special edition of Stroustrup to find out.


Reinventing the Wheel
That really reminds me of myself. I often find myself getting annoyed when I'm in the process of trying to figure out all of the assumptions built into some framework or 3rd-party tool, and have quite often given up and simply built my own.
Am I the only developer who was so put off by the learning curve associated with CVS that I thought it would be easier to just build my own system?
The flip side of this is that I know sometimes I'm being a crotchety middle-aged man who simply would rather do things his own way instead of jumping on the standards bandwagon. This is an impulse I am trying more and more to resist. For example, over the past four or five years I developed my own web application framework at the consulting company where I work. As new functionality was needed, it was added to the framework. As better ideas and approaches came along, we (mostly I) refactored.
I love my little webapp framework and want to use it. But I realize that there are now standards out there (Struts, for one) that duplicate a lot of our functionality and add a lot more. It would be better in the long run to start using a standard framework because new developers will have an easier time ramping up and we'll have an easier time with our clients when we tell them we're embracing standards instead of using our own approach.
But it grates nonetheless.