|
From: John O'G. <ogo...@ho...> - 2002-01-03 20:36:53
|
You are both right. The problem is to know where to stop. 1. re excellence of SQLledger I agree that SQLledger is superb. It is much much simpler than commercial offerings that I have had to support in the past. In general this simplicity is not at the expense of loss of function. Having a database that you can fully understand because it has only a dozen tables is precious. Roland's point about increasing database complexity as modules and features are added is very apparent in commercial examples where tables are numbered in the many hundreds. I ernestly hope that SQL-ledger can avoid falling into this trap. I think Roland's suggestions might help this - but they need to be carefully thought through first. 2. re 3NF This is a sound principle which can easily be pushed past the point of absurdity. For example, why stop at a separate table for addresses? I could (but won't) argue that we should properly normalise addresses into 4 subtables: locations -< streets -< towns -< countries on the grounds that this would eliminate enormous redundancy (especially in countries with large population and many towns/cities). 3. re OOP All OOP programming has at its heart the difficulty of deciding how to structure your nodel of the problem space. This is almost never a single correct or obviuos design. John O'Gorman Roland Stoker wrote: > > Hi Thomas... > > This is the last answer to you over the list, for a further discussion I > suggest we do that privatly.... > > I am aware of 3NF, and why you don't always want to follow those rules, I've > made a lot exceptions on them myself because it can be usefull (mostly for > performance as you say). > But helas, thats not my point here. I'm not talking about performance here, > but about using the possibilities that are offered by open-source software, > and about keeping possibilyties open for future expansions. > Most companies would be helped if a software package could be build that > incorporates al packages they need, using only one database. Take for exemple > an addresbook. You need the addresses to make invoices, but not everyone has > acces (or should have) to the accounting, so you'll need a second one for the > other employees. But if there's a second program that uses the same tables, > they could use that. > For that idea to work, you would need a standerd database model, and > something like addresses would have to be in one table. > There's also something like loginnames, but I'll react to that later... > For sql-ledger to really improof from what it is now (it's the best > bookkeeping program I've seen) is to make it work together with other > programs and to let the idea go that everything should work inside of it. > You've now made an order system inside of sql-ledger, but I havn't seen an > accounting doing the ordering of hardware yet or placing the orders of > clients. Usually that's someone else, who has no access to the accounting. > Wouldn't it be great if sql-ledger and phpgroupware could use the same > database (and i don't mean just the adresbook, but the entire package). With > enough willpower, it could be done within a month, because it really isn't > that much work. > > But even if you don't do something like that, the continuing expanding of > sql-ledger as it is happening now, will mean that in the future there are > going to be more functions for people and businesses. And creating tables > every time is going to mean that you're database gets more complex and less > usefull for companies who want to use all the functionality because they have > to copy persons all over the database and have diffuculties keeping address > correct. > > > This is called 3NF (Codd's "Third Normal Form".) As to it being redundant > > and the source of 'minor problems' is really a matter of opinion. Ensuring > > relational integrity by having each entity describe only ONE subject > > (whether customer or vendor or foobar) is the practical application of the > > theory. The entity described by customer is the person's ROLE...this is a > > distinct role, different from the role of vendor, hence, not really > > "redundant". Can the same person have two roles? Yes...does it happen > > often enough to justify adding complexity to the schema? Not terribly > > often. > > I agree it doesn't really happen that often, but that in intself means that i > DOES happen...??? > > > > Other exemple. Say I would like everybody (vendors and customars) to have > > > a login on my website. With what query do I check the login-name?? I > > > cannot create a view because then I cannot garantee that a login name > > > appears only once! > > > > This has been discussed previously and is a valid point. Relational > > databases use check constraints to prevent corruption. See the postgres > > man page on ADD CONSTRAINT or CREATE INDEX...you could also add a stanza or > > two to the existing code to trap the error before the parser prints stderr > > to the user's screen. This is prettier to view than a failed query dumped > > to the screen (and possibly more useful for the end user.) > > As far as I know that isn't possible. You cannot create an index on a view, > only on individual tables, which means that a loginname could excist twice, > once in vendor, and once in customar. > I think we agree that coding something that checks for that problem isn't the > ideal solution. > > > > Solution: The solution would be to not take people or companies as being > > > there functions. I mean, a vendor is not really a vendor because he > > > himself also buys goods. He is actually a "company" that has a "vendor" > > > relationship with you. > > > To recreate that you make a th folowing tables: > > > > > > -companies (with the fields like customar or vendor) > > > -function (a table with company_nr and function_nr, they are primary key) > > > -function_define (function_nr = primary key, and some explanation fields) > > > -people (probably something like employees like adres and phone,... but > > > without salary) > > > -peoplecompany (a bit like function to tie people to companies, like > > > contacts) -employee (people_nr, salary,...) > > > ( You could also start with departments, but is not necessary) > > > > And this reduces redundancy? I don't see it. From this vantage point it > > increases complexity without any real gain as to access the data you would > > need too many joins. Views would contain subqueries (correlated > > perhaps)... All of this for the rare exception? I don't see the gain. > > I'm going from 5 to 6 tables, so not that much compexity extra. If you're > counting fields it will actually become less. But when there's a new kind of > company or person, that's when the advantages of this system come into action. > > > It would be easier to simply place every contact in one table and then > > define their role in the context of a transaction. But is it necessary? > > Or even useful? It makes cursor selects (without joins) difficult when > > all we want back is a list of vendors...keep it simple is a very good > > motto. > > There's something called views. If you create a view calle vendor consisting > of all companies that sell goods to you, you wouldn't even need to change the > query's. Personly, I don't mind the joins. I have more problems with tables > that shouldn't be there, making it more diffucult for people trying to make > add-ons for you're program. > > > > Employee could have a company_nr, but what's the point, I am only > > > ionterested in my employees so I assume there all mine.... > > > Even adresses can be separated from people and companies if you're really > > > ambitius. > > > > vendor/customer relations may have drill down tables as well...without > > abandoning the concept of confining each relation to the description of > > a singular entity (in this case the ROLE of the person recorded in the > > relation.) Although here again, it is not a matter of ambition but the > > usual tradeoff between performance (denormalisation of schema) versus > > reduction of redundancy (in an effort to cut overhead and eliminate data > > errors). In real world situations people often denormalise because the > > performance gain is worth it. > > I agree that you can have a bad model which gives significant performance > loss. That's definitly not the case here. > If you're using an index (which was the idea), the db will search > logaritmicly. Meaning that that if the tables become twice as long (combining > vendor's and customars) it will need twice as long. > Exemple: You have 1 million customars and 1 million vendor's in you're db. > making the query ( to search a certain vendor using indexnumber) in the db we > use now using 20 compare statments. The new one will use 42. 42 Compare > statements is still using no time. > The performance towards reduction of redundancy you're talking about is > actually called caching. This is where you deliberatly put data in a table, > to be able to find it faster later. That's not what were discussing now. > > If you see this sort of thing as too > > redundant for your taste you might consider the hierarchical model (mumps) > > as an alternative. It is very fast, less 'redundant' and has a leaner > > programming language. Of course, it has given way to the relational model > > for some very good reasons (not just Oracle's marketing skills.) > > I'm very fond of the relational system, and for good reason. Even dough I > don't like using 1 to 3NF > > > Simpler IS better... > > Yes, but what is simpler???? > > > > Query's > > > Queries will not change much, other then always having to define who you > > > want. > > > > [ ... snip ... ] > > > > No offence Roland, but what you're talking about is reinventing the wheel. > > In my view, the current one (being round - and sound) suffices. Another > > point is that I've seen alot of people come along and offer suggestions for > > sweeping changes to a product that has already racked up hours of > > development time and has a base of satisified users. Usually not much > > comes of this. The reason? The current product does the job...otherwise it > > would not be in wide use. > > I'm not suggesting anything that would take a lot of time coding. In its > sompelest form, not even the query's would need work, only the db. If it > needed a totaly new interface, I would be in the wrong place, because this > interface is practicly finished and really good. > > Of course, should you wish to reinvent the wheel > > as an academic exercise, why not? But I see the existing schema and code > > as already being rather trim and possessing an object oriented bent. The > > combo of a sound relational structure coupled with an OO design is a clear > > winner. > > Finaly!!!! Of course " a sound relational structure coupled with an OO design > is a clear winnar". That's what I'm suggesting here. You have an object > called companies, with certain functions in an interaction with you. That's > what this is all about. There is no object "vendor" because thats a proparty > of an object, and not the object itself. > > Roland |