Menu

Constraints on derived predicates?

jmag
2006-11-19
2013-05-28
  • jmag

    jmag - 2006-11-19

    If…

    SCHOOL(name) enrolls STUDENT(id)
    STUDENT(id) plays on TEAM(id)

    …then what kind of constraint should I create to ensure that a team only includes students from a single school?

    One option is to create a derived fact “SCHOOL(name) enrolls some student who plays on TEAM(id)”, and then put a uniqueness constraint on TEAM. Is it valid ORM to put a constraint on a derived fact? One problem with this approach is that it requires the complexity of derivation expressions.

    Another option is to create an additional base fact “SCHOOL(name) manages TEAM(id)”, put a unique constraint over TEAM, and then set up a subset constraint that says “If a SCHOOL enrolls STUDENT who plays on TEAM, then that SCHOOL manages that TEAM.” The problem with this approach is that the fact I have added is somewhat redundant (can be derived) from other facts; if you know what students play on a team than you can assume their school manages that team.

    Below is a similar example, using a modification to the ORM metamodel. Suppose I want to create a constraint that says that an INTERNALUNIQUECONSTRAINT only spans roles from a single PREDICATE?

    PREDICATE(id) involves ROLE(id)
    ROLE(id) is in INTERNALUNIQUECONSTRAINT(id)

    In these cases I had hoped there would be some simple way of expressing the constraint in ORM. Can it be done?

     
    • Brian Nalewajek

      Brian Nalewajek - 2006-11-20

      Hi,
        Sometimes it a bit hard to understand what a person is trying to model, when they start in with methodology specific assertions, rather than the basic facts that they are trying to model - we'll I find it hard, anyway.
        The basic facts for your model might be stated:
          A school enrolls students
          A given student is enrolled in at most one school
          A school sponsors/manages teams
          A Student can only join a team sponsored/managed by the school in which they are enrolled.

        If these statements convey the facts you wish to model, I can think of at least one way to do so in ORM.  I don't think it violates the conditions of normalization.  It does use a subset constraint, but not quite the way you suggested as an option.
        See if this looks Ok to you:

          School(ID) enrolls Student(ID) - with IUC over the role played by Student
          LocalTeam(Name) is of School - with IUC over the role played by LocalTeam
          Student joins LocalTeam - with IUC over both roles (to allow the student to join various teams at the same school
          A subset constraint, so that Student joins is a subset of Student .../is enrolled by

        I should model this with sample population to make sure, but I think it covers the facts. If I messed up, someone else here will spot it, and correct me, and inform you. I don't know about adding constraints to derived predicates. Good luck with your project.
        BRN..

       
    • jmag

      jmag - 2006-11-20

      Thanks, I think you understand what I am trying to model. Your proposal works and I believe it is the same as the second option I proposed. I still am just a little bit troubled by it, more from a theoretical than from a practical point of view. The fact "LocalTeam(Name) is of School" (or "SCHOOL(name) manages TEAM(id)”) doesn't add new data because it can be derived. This fact's sole purpose is to provide a place for the uniqueness constraint. That is why the constraint on the derived predicate seems to be a little cleaner to me, but I still don't know if it is legal ORM to put constraints on derived predicates.

       
      • Tyler

        Tyler - 2006-11-20

        Just my limited experience, but I've never heard of constraints on a derrived fact type.  I would expect that the derrivation rule itself would provide all the constraints necessary.  If you're running into problems with derrivation rules, it probably means there are facts about the UoD that are not being stored.

        Let me try an example to argue that "School sponsors Team" is not redundant....

        Row | School   | Student | Team
        1     Highland   Tyler     Rams Rugby
        2     Highland   Scott     Rams Football
        3     Skyline    Cassandra Eagles Swim
        4     West       Lenny     Rams Rugby -- invalid, explained below

        There's no fact type "School sponsors Team" in the first model.  I think that instead of a constraint over a derrived fact type (since it might not be possible in ORM, and is definitely not in NORMA), it would be better in this case to put an extra textual constraint on the model.  Maybe something like, "If Student A plays on Team B and Student A goes to School C, then for each Student D that plays on Team B, Student D goes to School A".

        In that constraint, it looks like it's limiting Team members to Students from the same School.  It is, but not in the way one would expect.  What's actually happening is that Students that play for the Team must attend the same School as the first Student that plays on the Team.  If Lenny were in Row 2 instead of Scott, Scott would be prevented from playing for Rams Football.  When you create a new Team, there's no way to know how to limit which School the Students would come from.  It's simply a matter of "luck" who signs up first.

        Even though you could enforce the constraint by immediately signing up a student from the appropriate school, if the team membership ever went to 0, say, at the end of a rough season, it would be up for grabs again.

        So there's my argument for "School sponsors Team".  Please poke holes in it as you see appropriate-- I'm still learning too.

        -- Tyler Young

         
        • Tyler

          Tyler - 2006-11-20

          Oops!  Lenny was supposed to sign up for Rams Football.  Mistake #1.  For internationalization, I should have specified "American Football".  Mistake #2. And to throw a wrench into your model, Highland Rugby isn't officially sponsored by Highland High-- it accepts students from any school.

          Kia Kaha! (highlandrugby.net)

          -- Tyler Young

           
    • Brian Nalewajek

      Brian Nalewajek - 2006-11-20

      Hi,
        Even though the verbalization I propossed looks Ok to me, I'm going to wait until I can use NORMA to model it, and make sure it's doing what I think it's doing.  In a way it's a good thing - as it demonstrates that the tool (and at least for me), is neccessary to create and verify a data model that will be functionally accurate to the domain requirements and result in a RM comlient schema.
        BTW, what data in my proposal do you see as being derivable from the other data elements in that model? I don't think that it's School having ID, and School sponsors Team(ID), as it's only COMMON that schools sponsor only thier own teams - not a given.  As I believe that all the facts are elementary, there should be be no derived data.
        Still I won't be sure that what I gave is a solution until I model it, and use some sample data to confirm the model.  It's too easy for me to leave something out, or add an unneccessary item when I do it without using an ORM development tool.
        Thanks for posing the question, though.  The general pattern you asked about is common enough, and I ought to feel fully confident in a solution for it.  As I'm not, it means I still need to do more ORM!  BRN..

       
    • jmag

      jmag - 2006-11-20

      The example I gave is a little weird in the real world because the determination of which school manages a team is dependent on the first student signed up. it is probably better to add the fact "School manages Team" as proposed above.

       
    • Brian Nalewajek

      Brian Nalewajek - 2006-11-20

      Hi,
        Yeh, having the first student signed determining the team sponsorship ins't exactly intuitive.  If you have a reason for doing it that way, that's fine - the model should fit the reality, rather than making the reality fit the model; having an application do the first signed bit, might be better that to use it as a DB schema requirement, though.
        BTW, I started to work the model I suggested through NORMA with sample data.  I ran into one problem right away - not with the ORM diagram, but with mapping it to SQL Server as a DDL script.  The model maps two autocounter columns in one table, as School and Student used ID for PRM.  I fixed this by changing Student(ID) to Student(Name).  This isn't an ORM issue , and I think it only occured because the model is so simple - with the Objects not playing enough roles to map to seperate tables. An anlternate fix would be to add more roles.
        Created the DB and used the script from NORMA (modified for proper SQL Server object identifiers, and the tables and SP installed without error.
        I started to open the tables and add data, and there are major problems in the logic.  I must have missapplied the subset constraint, or the model need other fixes, or there's a problem with the NORMA mapping to for SQL Server script generation.  Haven't had a chance to get back and sort it out, but I will.  I need to know what I or NORMA are doing wrong.
        I'll try to post the model, files, and notes on my site sometime later.  BRN..

       
      • JO3Y

        JO3Y - 2007-08-31

        "... might be better that to use it as a DB schema requirement"

        Where could I learn more about setting DB schema requirments?

        Thanks

        Joe

         
    • Brian Nalewajek

      Brian Nalewajek - 2006-11-24

      Hi,
        The core problem of implementing (using NORMA/ORM2), the Business rule: A student can only join a team, that is sponsored by the school in which that student is enrolled; is thornier than I thought (at least for my skill set).  I have tried several modeling techniques, and haven't found a good solution that doesn't introduce derived (and very contrived), elements.    As this rule seems to be a type for an important class of problems that a modeler might face, finding a solution - or reason why there can't be a solution is important.  The possiblities I see are:
          There is a simple direct solution with NORMA/ORM2 that hasn't occured to me.
          The only solution with NORMA/ORM2 is complicated and obtuse.
          The problem as stated is not one that should be modeled using NORMA/ORM2 as doing so would violate the intened result of a normalized Relational Model compliant schema.

        I think I laid out the bones of a realworld model that could be used to track any number of schools, enroling any number of persons as students; allowing each student to join any one of a number of activites that any given school might offer.  The model also includes a realistic workable composite reference sheme for School and Activity.  What I can't see is the last piece of the puzzle that would lock an instance of 'Student joins Activity' to THE School in which THAT Student is enrolled.  As I can't include the model or the image of the model in this post, I'll need to get a URL for these later.

        Your thoughts on this problem/solution are very welcome.  BRN..

       
      • Anonymous

        Anonymous - 2006-11-25

        You need to use a subset constraint. The population
        of StudentIsEnrolledInSchool is a superset of the
        Student/School part of StudentParticipatesInSchoolActivity.
        I have a ternary "Student partipates in Activity sanctioned
        by School", and the subset constraint reads: "If some
        Student partipates in some Activity which is sanctioned by
        some School then that Student is enrolled in that School."

        Clifford Heath.

         
    • Brian Nalewajek

      Brian Nalewajek - 2006-11-25

      Hi CJ,
        I took a look at your post about using a subset constraint to achieve the desired results.  I had tried several subset configurations and ran into problems.  Not seeing an obvious flaw in your senario, I Modeled it in NORMA.  I kept this model as simple as possible (Using Name for PRM for each of the Entity Objects - Student, School, Acivity).  I put an IUC to show that a student can enroll in at most one School.  I used the ternary facttype you suggested, with an IUC to have A Student Particpates in an Activity sanctioned by at most one school.  Got the Ext. Subset constraint in place to read as you mentioned.
        There were no error/warning messages.  I populated the Objects and then the facttypes with suitable data.  I tried to add some population of the facttypes that would be in violation of the Bus. rule we're trying to model.  That this didn't bring up any warnings or errors, may be due to current limitations of the sample pouplation features of the tool, not an indication that the rules were not violated.
        I did a custom tool generation of the model for SQL Server.  I corrected the SQL script only to avoid the object identifier errors that always occur when running these scripts against SQL Server 2005.  I ran the script USING a pre-created database. (no errors).
        The script ran without errors creating two tables; one for Student-School, one for Student-Activity-School, plus the SPs.
        Openning the tables I inserted data, first in the Student-School table.  The constraints worked as expected, rejecting a Student from enrolling in more than one school.
        The other table showed problems:
          While a student was able to participate in a given activity only once for a given school, that same student could participate in an activity at any school (not just the one designated by Table 1 data).  If I choose a different activity (say Baseball, instead of Football), the same student could participate in activites at more than one school.
          There were other anomolies along these lines, but hard to show just using text.
        My guess is that the Subset constraint reading doesn't convey the restrictions that you might reasonably assume.  The sticky part is the word 'some.'  For instance,  the Student-Activity-School table will reject a student not listed in the Student-School table, but as long as the Student(Name) is enrolled at 'some' school, that student is allowed to participate in activities in ANY school.
        Try to generate a DB from the model yourself, and see if you run into the same problems with the constraints.  If you model is bullet proof, I'll have to see how I might have not  gotten your model implemented correctly.
        I have to say, that while I was a bit peeved that I might have missed a simple solution to the problem, I was glad to see one was available, because I expect that class of problem to come up from time to time.  Looks like the search is still on, though.  Do let me know if you get one that proves correct.   BRN..

       
      • Anonymous

        Anonymous - 2006-11-25

        > I put an IUC to show that a student can enroll in at most one School.

        This constraint true, but isn't necessary once you have
        the subset constraint in place. It says that a student may
        not participate in an activity more than once (sanctioned
        by different schools). But since they can only be enrolled
        in one school, and can only participate in activities
        sanctioned by that school, the IUC is redundant/derivable.

        > My guess is that the Subset constraint reading doesn't convey
        > the restrictions that you might reasonably assume.

        My guess is that NORMA doesn't enforce the subset
        constraint correctly. I believe it's a bug, which
        I'll submit. The constraint is definitely the one
        you want. To check this in your own mind, use the
        un-compacted relational mapping:

        School(Name), unique on name
        Student(Name), unique on name
        Activity(Name), unique on name
        StudentSchool(School_Name, Student_Name),
          unique on Student_Name,
        StudentActivity(School_Name, Student_Name, Activity_Name),
          unique on (Student_Name, Activity_Name).

        The subset constraint, if enforced correctly, ensures
        the SQL predicate:

        NOT EXISTS(SELECT * FROM StudentActivity AS sa
            WHERE NOT EXISTS(SELECT * from StudentSchool AS ss
                 WHERE (sa.Student_Name = ss.Student_Name
                     AND sa.School_Name = ss.School_Name)
        )

        BTW, it's worthwhile learning, if you haven't already,
        to visualize the direct relational mapping of your ORM
        models. Each entity is a table containing *only* its PRM.
        Each relationship is another table consisting only of
        foreign keys into those tables. The role boxes are just
        column headings in the relationship table. This "direct
        mapping" to RM is what gets compacted by removing
        unnecessary tables. The nice thing about it is that
        every fact, constraint, etc, can easily be verbalized
        as an SQL query or predicate.

         
    • Brian Nalewajek

      Brian Nalewajek - 2006-11-26

      Hi,
        I'll look through the model again.  You may ve right in that the fault is with the way the tool implements the subset constraint, but the bottom line is that the methodology-domain logic-tool system doesn't produce a solution for the test problem we're looking at.
        Yes,the mapping algorthims used are essencial parts of the solution.  I hadn't done a 'by hand' mapping to see how that compares to the mapping done by the tool.  Have you compared the two mappings (your own vs the one generated), to see where the generated mapping is at fault?  There is also the possiblity that the fault is in the mapping from the logical model to the physical implementation (rather than from the conceptual to the logical).  I only have MS SQL Server here, so I couldn't test the physical mapping for another RDBMS.
        I'm still not convimced that the verbalization for the subset constraint (the one you mentioned in your post, and the same one I saw for the constraint I used in the test model), implies the requirement of the business rule being modeled - as the linkage between THAT student/activity/school and THAT student/school is not expressed by the symantics of the verbalization.  Looks to me that the verbalization used WOULD imply the results that I saw in the DB implementation.  It could be that I'm not seeing what's there, or missinterpreting what I'm seeing, but until a fault is established for the constraint mapping, I'd be inclined to look futher up the food chain to find where the model and the implementation diverge.
        Thanks for all you are doing to help me get through this sticky problem.  I first assumed that there was an obvious solution to the oroginal poster's question, and offered an off-the-cuff responce.  I knew I wouldn't feel comfortable until I modeled it using NORMA, to see if what I said would work.  A lot of approaches I tried resulted in implementations with 'some'  where I was looking for 'that.'
        You could be right, but I'll have to keep working on how, and why the results don't match the requirements.  Let me know how the physical implementation based on your mapping  works out.  If the business rule is implemented correctly, that should go a long way to pointing out exactly where the error with the tool lies.   BRN..   

       
      • Anonymous

        Anonymous - 2006-11-26

        On looking at it again, it's pretty simple; the subset constraint is not reflected in the SQL Server code at all. It's just a "feature not implemented", and I've submitted an feature request for it (including my ORM file; you could download that and take a look). A proper implementation requires the use of SQL triggers.

        Based on your comments, it looks like the subset constraint isn't handled correctly with sample population data either. Sample data is a new-ish feature, and I wouldn't be surprised if it just isn't finished yet.

        Interestingly, when you read the OIAL file (precursor to the DCIL and DDIL files which preceed the SQL generation), it contains an XML comment:
        <!--WARNING: There once was an external constraint here. There isn't any more. It was of type "SubsetConstraint", and named "StudentSchoolLoyalty".-->
        That to me looks pretty convincing that this subset constraint isn't going to be populated to any of the relational models.

        On "my" mapping versus the generated one, mine is not a mapping. The ORM model is *intrinsically* relational, and I used that to point out that you can use the intrinsic relations to understand the constraints. This is absolutely fundamental to ORM; yet it took me a *long* time to properly get it. ORM is not less relational than SQL, it's *more* relational! SQL is the odd-ball that makes it difficult to express many necessary constraints (hence the need for triggers - code to express things that can't be expressed declaratively).

        All the Rmap algorithm does is combine tables where possible, removing elementarity.

        BTW, this subset issue occurs in several places in the orienteering model also, and I haven't yet added them. So it's an issue for me at the moment also.

         
    • Brian Nalewajek

      Brian Nalewajek - 2006-11-26

      Hi,
        I had just looked at the associated .xml files and seen the warning about the 'SubsetConstraint1' having gone missing in action, and was going to point that out to you!  You beat me to the punch.  Though I'm not pleased that a feature as important as this one isn't implemented yet, (it would be helpful if any feature not yet implemented was noted - somewhere), at least I now know why, what should have worked correctly doesn't (as far as the tool in its present state is concerned).
        The current state of the sample population feature of the tool doesn't help matters.  Don't know if you saw my post, but if you edit the properties of an object that has sample data, the edited object shows on the diagram, but the original version is retained - but you need to go to the object explorer to find it - or the tool will find it for you when it generates an error for a sample population incompatiblity, for an object you didn't know existed.
        Did you take a look at the negative verbalization for the subset constraint you came up with?  While it may be logically correct, it hardly helps illuminate the rule being implemented.  It's my recollection that InfoModeler had a feature that allowed you to cycle through various logically correct verbalizations for a given fact.  This was useful, as it allowed the modeler to see the fact from a number of perspectives.  If one wording was ambigous (to that modeler), another might make the statement clearer.
        SQL has a number of issues; but as it's a product of commity, with the principals more concerned with showcasing or preserving the features unique to thier implementation, maybe that shouldn't come as a surprise.  I've always considered triggers as a kind of a cludge for something that should be implemented in the structure - or left out and handled otherwise.  On the other hand, real world solutions are what we need to deal with, and we  have to work with what we've got - until something better comes along.
        Thanks for the notes on relational algorithm, non-mapping, etc....  Please do post the issues that came to light as a result of this problem.  I know that things will improve with each release, but may be most important is a notification system concerning features that have known issues, so we can work around them - and not get side tracked by unexpected consequences.  BRN..

       

Log in to post a comment.