Date: Mon, 23 Oct 2000 13:53:41 -0400
Organization: Kurtz-Fernhout Software
Eric Armstrong wrote:
> But one of the mechanisms that needs to be developed next is the notion of category-validation. Not all categories need to be validated, of course. But if the category IBIS:Alternative is added to a node, then the node needs to ensure that at least one of its parents is in the category IBIS:Question.
> That mechanism is essentially identical to the one I understand you to be suggesting -- a higher order function that does node typing and validation dynamically, rather than forcing the nodes to have static types. (Is that reasonably accurate?)
More or less. But the enforcement of rules of relation is something I prefer to leave up to the client code rather than handle in the server somehow.
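To make the client-side version of Eric's category-validation rule concrete, here is a minimal sketch in Python. The names (CATEGORY_RULES, Node, add_category) are all invented for illustration -- the point is only that the rule "an IBIS:Alternative needs a parent in IBIS:Question" is checked by the client at the moment the category is added, not by the server:

```python
# Hypothetical client-side category validation. A rule table maps a
# category to the category that at least one parent node must carry.
CATEGORY_RULES = {
    "IBIS:Alternative": "IBIS:Question",  # an Alternative must answer a Question
}

class Node:
    def __init__(self, categories=None, parents=None):
        self.categories = set(categories or [])
        self.parents = list(parents or [])

def add_category(node, category):
    required = CATEGORY_RULES.get(category)
    if required is not None:
        # Validate dynamically instead of giving nodes static types.
        if not any(required in parent.categories for parent in node.parents):
            raise ValueError(
                "adding %r requires a parent in %r" % (category, required))
    node.categories.add(category)

question = Node(categories=["IBIS:Question"])
alternative = Node(parents=[question])
add_category(alternative, "IBIS:Alternative")  # passes the check
```

Adding IBIS:Alternative to a node with no IBIS:Question parent would raise a ValueError, which the client code can catch and report however it likes.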
William Kent's work from 20+ years ago proposed the notion of an "Executable", which he meant in the sense of a server-side piece of code always running in the repository (or triggered as appropriate). Executables would be responsible in his ROSE model (Relations, Objects, Strings, Executables) for enforcing these sorts of validation operations. For example, they would ensure that certain sets of relations would be deleted as units if any relation in the set was deleted.
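As a rough illustration of that server-side alternative (the Repository class and its API are my own invention, not Kent's), an "Executable" can be modeled as a callback the repository runs on every delete, which then removes the rest of any unit the deleted relation belonged to:

```python
# Sketch of a Kent-style "Executable": a trigger registered with the
# repository that enforces unit-deletion of grouped relations.
class Repository:
    def __init__(self):
        self.relations = set()
        self.units = []          # each unit: a set of relations deleted as one
        self.executables = []    # server-side trigger code

    def register_executable(self, fn):
        self.executables.append(fn)

    def delete(self, relation):
        if relation in self.relations:
            self.relations.discard(relation)
            for fn in self.executables:
                fn(self, relation)   # let each Executable react

def cascade_delete_unit(repo, deleted):
    # If any relation in a unit is deleted, delete the rest of the unit.
    for unit in repo.units:
        if deleted in unit:
            for rel in unit:
                repo.relations.discard(rel)

repo = Repository()
repo.register_executable(cascade_delete_unit)
repo.relations = {"a-owns-b", "b-part-of-a"}
repo.units = [{"a-owns-b", "b-part-of-a"}]
repo.delete("a-owns-b")   # the trigger removes "b-part-of-a" as well
```

Here the client never has to remember the grouping rule; the repository enforces it on every delete, which is exactly the complexity-relocation discussed next.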
Often the issue is that there is a certain amount of irreducible complexity -- it can only be moved around, not eliminated. Whether enforcement of certain consistency rules is best done by client code or by server-side trigger code is perhaps one of these issues. Another question is how perfect the database has to be and what happens when an inconsistency is introduced. Modern client-side coding styles with transactions and exception handling should be able to avoid most inconsistencies and usually recover from those that do slip in.
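The transaction-plus-exception-handling style can be sketched briefly. This is a minimal, illustrative rollback scheme (the Store class and snapshot approach are assumptions for the example, not how any particular repository does it): related updates happen inside a transaction, and if validation fails partway through, the store reverts to its pre-transaction state rather than being left half-updated:

```python
# Minimal client-side transaction sketch: snapshot on begin,
# restore on exception, so partial updates never persist.
from contextlib import contextmanager

class Store:
    def __init__(self):
        self.data = {}

    @contextmanager
    def transaction(self):
        snapshot = dict(self.data)      # cheap copy-on-begin rollback
        try:
            yield self.data
        except Exception:
            self.data = snapshot        # restore pre-transaction state
            raise

store = Store()
try:
    with store.transaction() as data:
        data["question"] = "Q1"
        raise RuntimeError("validation failed mid-update")
except RuntimeError:
    pass
# store.data is unchanged: the partial write was rolled back
```

Whole-dictionary snapshots are of course too crude for a real repository, but the shape -- group the writes, validate, roll back on failure -- is the point.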
Although -- there is still no good substitute for having a human database analyst around to patch over the occasional really unexpected glitch.
I lean more towards designing flexible systems (i.e. the Pointrel Data Repository System) with the expectation that occasional inconsistencies and failures [short of massive loss of data] will require human intervention, rather than designing inflexible (but otherwise elegant) systems with the expectation of consistent input and perfect operation in a well-defined domain. In practice, the inflexible systems fail anyway due to mechanical failure (e.g. bad RAM) or design flaws, and then they need to be fixed anyway, often with inadequate tools and strategies because the failures were not planned for. In today's computing environment, it is usually only cost-effective to automate 95-99% of a task anyway. A flexible system then makes the remaining non-routine 1-5% easy for a sophisticated user to handle.
For unsophisticated users of a complex end-user product like a sophisticated email / workflow / KM / OHS system, that non-routine 1-5% (setup, error recovery) is usually handled more cost-effectively by friends and consultants. I wouldn't quite go this far, but as applied to computer systems doing complex tasks, someone once wrote: "Design a system that even a fool can use, and only a fool will want to use it." That is in line with some of Doug's comments on "ease of use".
Another way to think of this is as "classes of validation" or "classes of error recovery". So, some errors one expects the system to handle routinely and other errors one expects a human to have to detect or deal with (ranging from a minor automated error recovery confirmation to a day long debugging session).
This notion of "classes of error recovery or validation" is similar to the three-classes-of-users argument for a system with a scripting language (I forget who proposed the original version). More or less, there are three classes of users of scripting systems. 90% of users won't learn a scripting language much beyond running a script, but will occasionally use scripts written by their friends/associates in the 9% who are willing to learn the basics, and these 9% in turn depend in non-routine cases on their friends/associates in the 1% who really know the scripting language and push its bounds. Yet it would be easy to dismiss an extensible scripting language in a product by saying only 1% of the users will bother to really understand it (and so it isn't important and should be dropped), when the reality is that almost everyone might be using the scripting language in one way or another, even if they don't write scripts.

So, I would extend this idea to suggest that validation and error recovery should be implemented in something flexible, so that 1% of the users can do it really well by direct intervention using sophisticated custom scripts, 9% of users can do routine things by customizing simple scripts, and 90% can call in their knowledgeable buddies or run canned recovery routines from a menu. It would also be a mistake to dismiss the 90% who won't learn the scripting language or recovery system, because occasionally out of their large ranks comes the tiny fraction of people who eventually move into the more knowledgeable and active 10% of developers. The same argument applies to open source developer populations as well -- interested users move into ever more challenging developer roles.
Anyway, my point is not that validation isn't necessary. It is just an interesting question of who should be doing it and when -- server code, client code, or human DBA, and the implications of this as applied to system architecture.
Developers of custom software and educational simulations
Creators of the Garden with Insight(TM) garden simulator