Rule enforcement through DB schema

2007-02-15
2013-05-28
  • Brian Nalewajek - 2007-02-15

    Hi,

      Would anyone care to offer a view on the usefulness of enforcing rules through constraints in a physical implementation of a DB?  A conceptual ORM model does a superb job of representing a rule (a business rule, or anything worth embodying in one or more connected FactTypes).  The ORM model provides a means of representing and verifying the rule with domain experts.  My question is: to what extent is it useful to fully implement the constraints of the model in the physical schema of a resulting DB?  It seems obvious that the tables and columns required to store the data associated with the rule should be created, and that data entered that has a bearing on the rule should be preserved (as part of data integrity).  Is this as far as it should go, or should enforcement of the rule be accomplished by adding constraints that limit inserts and updates to entries that comply with the rule?

      This isn't a NORMA question, but since part of the NORMA project involves DDL script generation, I think it's still relevant here.  Thanks.  BRN..

     
    • Anonymous - 2007-02-16

      Brian,

      In my experience the business rules (that logically belong
      in the middle tier of a 3-tier app) always migrate to both
      the data layer and the view layer. It's a duplication of
      function, but for example you want your view layer to stop
      the user from creating errors as early as possible, so you
      enforce rules there. I always prefer to enforce everything
      I can at the data layer also. Sooner or later, someone will
      open up a general query tool or a table viewer and try to
      correct things or add new rows... or they'll write a funky
      new script to populate a table from an external data source,
      and your rules get broken. Data layer enforcement can stop
      that - as long as your performance goals are reachable. The
      middle (logic) layer should still enforce the rules also, as
      someone will build a new user interface one day, and they
      will omit some checking that your UI contained.
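
      As a minimal sketch of that last point (assuming a SQLite
      store and made-up table and column names), a rule declared
      once in the schema is enforced no matter which client
      submits the data:

        import sqlite3

        # Hypothetical rule: an employee's age must be between
        # 16 and 70.  Declared in the DDL, so every client -
        # UI, script or table editor - hits the same check.
        conn = sqlite3.connect(":memory:")
        conn.execute("""
            CREATE TABLE Employee (
                employee_nr INTEGER PRIMARY KEY,
                name        TEXT NOT NULL,
                age         INTEGER NOT NULL
                            CHECK (age BETWEEN 16 AND 70)
            )
        """)

        conn.execute("INSERT INTO Employee VALUES (1, 'Alice', 34)")

        try:
            # a "funky new script" trying to slip in bad data
            conn.execute("INSERT INTO Employee VALUES (2, 'Bob', 12)")
        except sqlite3.IntegrityError as e:
            print("rejected by the data layer:", e)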

      Part of the goal of my work is to declare all such rules
      once, in the information model, and automatically generate
      them into all layers of the application, so although there's
      duplication of function, there's only a single place where
      these rules are declared - and the whole application is always
      self-consistent.
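
      (A toy sketch of that idea, with invented names and nothing
      like the real generator: one declared range rule, emitted
      into both a DDL fragment and a view-layer check.)

        from dataclasses import dataclass

        @dataclass
        class RangeRule:
            """One rule declaration, rendered into several layers."""
            column: str
            lo: int
            hi: int

            def to_check_clause(self) -> str:
                # data layer: a CHECK constraint for the DDL script
                return f"CHECK ({self.column} BETWEEN {self.lo} AND {self.hi})"

            def to_validator(self):
                # view/logic layer: a callable the UI runs before submit
                return lambda value: self.lo <= value <= self.hi

        age_rule = RangeRule("age", 16, 70)
        print(age_rule.to_check_clause())    # CHECK (age BETWEEN 16 AND 70)
        print(age_rule.to_validator()(12))   # False - the UI refuses early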

      It's because of the migration of rules into the view layer
      that I don't want my ORM models to have surrogate keys. I
      want the relational mapper to infer keys where necessary to
      meet performance goals. Then the inferred rules about these
      keys, and the keys themselves, don't find their way into the
      view layer where they don't belong.

      There are additional rules that arise in the process model
      that also find a realization in the data layer, and the same
      design principle should apply - declare the rule once, then
      automate the enforcement everywhere it's feasible. Such rules
      describe the legal state *transitions* of the data store, as
      opposed to just the states allowed by the information model.
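
      For instance (again a toy sketch with invented names): a
      state rule might say an order's status is 'pending' or
      'shipped', while a transition rule says a shipped order may
      never revert to pending - something a trigger can enforce
      in the data layer:

        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.executescript("""
            -- state rule: allowed values of status
            CREATE TABLE CustomerOrder (
                order_nr INTEGER PRIMARY KEY,
                status   TEXT NOT NULL
                         CHECK (status IN ('pending', 'shipped'))
            );
            -- transition rule: shipped may not revert to pending
            CREATE TRIGGER no_unship
            BEFORE UPDATE OF status ON CustomerOrder
            WHEN OLD.status = 'shipped' AND NEW.status = 'pending'
            BEGIN
                SELECT RAISE(ABORT, 'shipped order cannot revert');
            END;
        """)

        conn.execute("INSERT INTO CustomerOrder VALUES (1, 'pending')")
        conn.execute("UPDATE CustomerOrder SET status = 'shipped'")
        try:
            conn.execute("UPDATE CustomerOrder SET status = 'pending'")
        except sqlite3.IntegrityError as e:
            print("blocked:", e)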

       
    • Brian Nalewajek - 2007-02-16

      Hi Clifford,

        Thanks for the informative reply.  I think we're working along some parallel lines.  The single authoritative source for rules is important; I'm looking into just how much that can, and should, be the model itself, and how much enforcement can be done at the data source level.  Your point about limiting surrogate keys is well taken.  I wonder if the multi-abstraction layering that's been mentioned for NORMA models might help.  It would be best to model using the terms of the UofD, yet have the expediency of surrogate keys available for DDL generation.  Perhaps an option to provide a surrogate for a composite key (one that would not show in the most conceptual view of the model, but would be used for the logical and physical implementation) would provide a workable solution.  Whenever there was a composite-to-surrogate link and the model was changed, the surrogate scheme would be updated as well - in fact, some validation check would ensure that the surrogate scheme was verified against the new model version.
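
        (Roughly, as a sketch with invented names: the composite UofD identifier stays enforced as a uniqueness constraint, while the generated surrogate exists only at the logical/physical level for joins.)

          import sqlite3

          # Invented example: in the UofD an Enrollment is identified
          # by (course_code, semester, student_nr); the surrogate key
          # is a physical-level convenience only.
          conn = sqlite3.connect(":memory:")
          conn.execute("""
              CREATE TABLE Enrollment (
                  enrollment_id INTEGER PRIMARY KEY,  -- surrogate, physical only
                  course_code   TEXT    NOT NULL,
                  semester      TEXT    NOT NULL,
                  student_nr    INTEGER NOT NULL,
                  UNIQUE (course_code, semester, student_nr)  -- conceptual identifier
              )
          """)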

        I also have questions about the usefulness of trying to enforce some types of rules at the physical data level.  There are cases where the model's constraints only remain valid if all the instances entered into the data source are valid - and if they are all valid, the constraint is redundant; yet a single error in data entry can cause the constraints from the model to "validate" invalid instances going forward, giving a false impression of rule compliance.

        I look forward to seeing what your project looks like, at some point.  Good luck.  BRN..

       
    • Tyler - 2007-02-16

      Just from what limited work experience I have, I'd say enforcement of all constraints at the DB level is fairly important. It's inevitable that a customer will have some issue with the software, and some "tech-savvy" customer support rep is going to try editing the database by hand. Now, that's obviously a bad case that should be prevented by processes in the development company, but I'm guessing we're not the only place where this happens.

      Also, it discourages "tech-savvy" developers from doing a lot of dirty hacks to make their job "easier". In a deadline crunch, code-level enforcement of constraints is one of the first things to go.

      In all of these scenarios, proper understanding of the domain model can help prevent people from entering bad data or using the database improperly. Spreading that domain knowledge throughout the organization is difficult to impossible for any but the smallest of teams.

      So to sum up... You (the "royal" you) might think it's not important to enforce all these database constraints because you're building the system and you know what not to do. However, once the project is out of your hands (since you'll be promoted for doing such a good job), the next developer will probably not understand the domain as well. The impetus to get something developed quickly will lead to shortcuts, and it's in these shortcuts that your domain model will be muddied and violated.

      -- Tyler Young

       
    • Brian Nalewajek - 2007-02-16

      Hi Tyler,

        Thanks for the perspective.  I think we're all, pretty much, thinking along the same lines.  The DB (OLTP, line-of-business, repository of truth, or however you want to refer to the data source from which other, purpose-built DBs can be derived) is the last line of defense, and so the hardest to circumvent by well-intentioned (let's hope, even if we have reason to doubt that at times) but ill-advised tinkering.

        Part of the question is whether the model can cover the rules (Terry raised an example of a class that ORM doesn't handle at this point), and how best to establish the enforcement.  Ideally, this would be done automatically, and any adjustments needed for performance reasons (the surrogate keys that C.H. mentioned, for instance) would only be allowed if they were effectively equivalent to the UofD representation.

        I'm pretty sure that no one using ORM is comfortable with modifications to the schema of a foundation DB that do not reference the model.  Part of the problem with modifications to schema comes from the established notion of "the" database.  If one instance of a data source is required to serve every role and function, it's bound to be a battleground of competing interests.  If, however, there are any number of databases derived from a fully normalized, RM-compliant DB, the pressure to compromise between data integrity and performance is lessened.  As RM-compliant DBs are efficient in "write optimized" OLTP scenarios, that works out well.  For BI or other "read optimized" DB requirements, a DB derived from the OLTP DB, with the schema configured to suit the purpose (like a star or snowflake for data warehousing), need not compromise the performance or integrity of the foundation DB.  [BTW, the "write optimized" and "read optimized" idea I got from Larry Barnes, former MS tech evangelist out of the New England office.  Other than his choice in UI color schemes, he provided a lot of good info on SQL Server and n-tiered application development.]  BRN..
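
        (A rough, invented illustration of that split: the normalized foundation stays write-optimized, and a read-optimized structure is derived from it rather than carved out of its schema.)

          import sqlite3

          conn = sqlite3.connect(":memory:")
          conn.executescript("""
              -- normalized, write-optimized OLTP foundation
              CREATE TABLE Customer (
                  customer_nr INTEGER PRIMARY KEY,
                  name        TEXT NOT NULL
              );
              CREATE TABLE Sale (
                  sale_nr     INTEGER PRIMARY KEY,
                  customer_nr INTEGER NOT NULL
                              REFERENCES Customer(customer_nr),
                  sale_date   TEXT    NOT NULL,
                  amount      NUMERIC NOT NULL
              );

              -- derived, read-optimized structure for reporting; it could
              -- equally be a warehouse table refreshed from the foundation
              CREATE VIEW SalesByCustomer AS
                  SELECT c.name, COUNT(*) AS sales, SUM(s.amount) AS total
                  FROM Sale s JOIN Customer c ON c.customer_nr = s.customer_nr
                  GROUP BY c.name;
          """)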

           

       
