Reference Mode or Reference Scheme?

  • Hi,

    Just noticed that the error message displayed, when an entity object type does not have an assigned reference mode, states "Error    2    Entity type 'EOT' in model 'Base' requires a preferred reference scheme."

    In the properties window for this entity object type, there is a "RefMode" combo box.

    I've heard that the subject of reference modes and value types is part of internal discussions at Neumont.  Whatever the outcome, it would be best to settle on one term.  Personally, I think "scheme" makes more sense than "mode" in this context, but consistency within the tool and throughout the documentation is probably more important.  By any name, the concept is important and took me a while to understand.  You don't want to frustrate people new to ORM.

    Not sure this qualifies as a bug, but I'll enter a copy of this in that section.  BRN..

    • Sten Sundblad

      Whatever it's called, it shouldn't be marked as an error if missing. It should be possible to enter an entity type without assigning a reference mode/scheme to it, and to show that entity type to business domain experts without having the lack of a reference mode flagged as an error. It's perfectly normal to assign ref modes late in the process, and if the lack of a ref mode is flagged as an error, business people tend to feel uncomfortable and even to lose confidence in the model and/or modeller.

      A better strategy would be to have a model flag indicate whether lack of ref mode should be flagged as an error or not. This flag could be switched off early in the process and then switched on before the model is finalized.

      • Hi,

        I haven't tried to modify error reporting behavior.  I saw something about this option listed either in the release notes included with the June 2007 CTP drop, or in the notes Matt C. included in a recent reply to a question about this CTP in this forum.  It could well be that the options you are asking for are available already.

        While the impact of error notices on clients' attitudes is a concern, there are several other points about the use of a CTP tool release that would raise eyebrows. Until the tool is further along, you may well want to be selective about the degree to which you expose the mechanics of the tool to stakeholders.

        My concern with this issue was the inconsistency of terminology.  Mixing terms makes it much harder for those using the tool to find answers to important questions about the methodology and the tool.

        More relevant to the issue you raised, are some important considerations for the structure of the methodology.  I'm working on putting together some notes to offer to the team on one way to approach this.  The gist of it is to recognize the value in intermediate results provided by segments of the methodology, and to include capabilities within the tool that will accommodate several end result goals (not just varied RDBMS, but more categories of modeling methods).  Validation within the methodology process can then be tuned to the end points desired.  Requirements and limitations for one target are not necessarily the same for another.  If the tool can be made to work along these lines, only errors along the path to the selected target results will be flagged.  Of course, any segment of the complete methodology must not break the requirements of the underlying logical framework.  I hope to get this to them soon, so that they can assess to what degree this idea is reasonable.  My guess is that there is a very short list of those with a complete enough understanding of the theory and methodology (as well as the concerns for translating these into a software application), to make informed judgments on such proposals.  I'm not on that list, but I'll try to run it by some that are.

        Thanks for your comments.  In a practical sense, the work needed to bring the tool up to the level of being ready for use in a business environment can't come fast enough.  If you look through Matt's notes, you'll see that important steps toward this are planned for the next several releases, and some that were lacking are now incorporated in the NORMA tool.  BRN..

      • Hi,

        If you've followed the rest of this thread, you'll notice it's a complicated issue.  I agree with your point, if I can state it as: Somehow, some way, the ORM modeling features should be available for conceptual modeling, where identifying unique instances is not an immediate concern.

        I can imagine agendas where unique identification would be better put off until later in the process, and some agendas where unique instance identification is not required to accomplish the level, or type, of modeling that the practitioner has in mind.

        While it may be a while before the core issues are resolved, or the tool is modified to accommodate this, there may be a way to "fool the tool" into accepting a unique-less form for Entity Object Types.  There is a Reference Mode Editor in the current CTP that allows you to modify the provided RMs, and to create new ones.  I started to play with adding a new RM type, using the editor, called "Dummy."  My thought is that Dummy could be used wherever the tool requires a RM/RS/PI in places you'd like to postpone or avoid assigning these.  I'm not suggesting this as an answer, but as a way of exploring the bounds of what the tool and the underlying ORM theory have imposed.  If you have the time, try something like this.  I'll play with it some myself, but one person is always apt to take a misstep.  At some point, even if it seems to work, the validity of the resulting models (for a given agenda) will have to be looked at carefully.  Any time you circumvent rules, you run the risk of fundamental errors that never crop up when the rules are followed. BRN..

    • The error display option is available, and currently affects error display on the diagrams, menus, and model browser (we don't currently filter the verbalization and report, see PS below). However, we still consider the lack of a reference scheme an internal error. As with all of our validation errors, live tracking is on whether it is displayed or not.
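
      A minimal sketch of the distinction drawn above between tracking and display (not NORMA code; the `ModelError` class, the category names, and `displayed_errors` are hypothetical): every error stays in the tracked list, and the display option only filters what is shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelError:
    category: str  # hypothetical category label, not a NORMA identifier
    message: str


def displayed_errors(tracked, hidden_categories):
    """Return the errors to show; the tracked list itself is never modified."""
    return [e for e in tracked if e.category not in hidden_categories]
```

      Switching the filter off again simply means passing an empty set of hidden categories; nothing has to be revalidated because tracking never stopped.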

      'Reference Mode' and 'Reference Scheme' are related, but are not the same concept. The internal uniqueness constraint that is the preferred identifier for an EntityType provides the reference scheme. Identifiers can be single-role uniqueness constraints attached to a role with either a ValueType or an EntityType role player, or multi-role external uniqueness constraints. If the current reference scheme matches the most common pattern (a single-role internal uniqueness constraint on a role with a ValueType role player), then NORMA now considers the EntityType to have a 'Reference Mode'. If the EntityType name and ValueType name match known patterns, then we can further identify the kind of reference mode (Popular or Unit). For example, x(id) means x has x_id(), and id is the recognized reference mode because {EntityTypeName}_{ReferenceModeName} is the pattern associated with the 'id' reference mode.
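
      The name matching described above can be sketched roughly as follows. This is an illustration under assumptions, not NORMA's implementation: only the format string {EntityTypeName}_{ReferenceModeName} comes from the tool; the function names and the list of known modes are hypothetical.

```python
# Hypothetical sketch of reference mode recognition by name pattern.
POPULAR_FORMAT = "{EntityTypeName}_{ReferenceModeName}"  # format string quoted in the post


def value_type_name(entity_type, reference_mode, fmt=POPULAR_FORMAT):
    """Build the ValueType name implied by a reference mode, e.g. x(id) -> x_id."""
    return fmt.format(EntityTypeName=entity_type, ReferenceModeName=reference_mode)


def recognize_reference_mode(entity_type, value_type, known_modes, fmt=POPULAR_FORMAT):
    """Return the known mode whose pattern produces value_type, or None if none does."""
    for mode in known_modes:
        if value_type_name(entity_type, mode, fmt) == value_type:
            return mode
    return None
```

      A ValueType whose name fits no known pattern still provides a reference scheme; it just is not recognized as a reference mode.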

      The current implementation (and my preference, although there is still an ongoing internal debate) is to keep the notion of 'reference mode' as a secondary notion derived by examining the preferred identifier, with additional metarules if the pattern is recognized (for example, a single value type should not match the popular reference mode pattern for more than one entity type). This way, it is irrelevant if you enter a reference scheme pattern that matches a reference mode pattern by hand or by setting the 'RefMode' property.
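
      The example metarule (one value type should not serve as the popular reference mode for more than one entity type) could be checked along these lines. This is a sketch only; the function name and the input shape, a list of (entity type, value type) pairs already recognized as popular reference modes, are assumptions.

```python
from collections import defaultdict


def find_shared_popular_value_types(bindings):
    """Map each value type bound to several entity types to those entity types."""
    users = defaultdict(set)
    for entity_type, value_type in bindings:
        users[value_type].add(entity_type)
    # Keep only the value types that violate the metarule.
    return {vt: sorted(ets) for vt, ets in users.items() if len(ets) > 1}
```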

      We know that the way Reference Mode is presented is different than in other ORM tools. Concerns have been noted, and changes are in the pipeline (explicit notation for unit vs popular reference modes, comparable units, better documentation on the underlying 'Reference Mode Pattern', explicit control over the set of default reference modes, capitalization options, etc). In the meantime, the easiest way to view reference modes is as a recognized pattern of reference scheme.


      PS Just talked to Terry and he'd like the verbalization/report filters as well. I added filtering for secondary errors (primary errors block verbalization, we can't ignore them outright) to the verbalization engine in change 1052.

      • Hi Matt,

        Good material to mull over.  I think I get the differentiation between the terms Reference Scheme, Reference Mode and Primary Identification - as you set them out.  Keeping the perspectives of the theory, techniques and requirements for a working tool seems central to the issue.  I want to look closer at what you've put forward, and other references to these terms, before speculating further.  K.E. reminded me of a paper from an OF presentation on Whole/Part representation in ORM, and the points put forward there may have a bearing on these referencing issues as well.  Thanks for the notes.  BRN..

      • Clifford Heath

        > If the current reference scheme matches the most common pattern...

        I have to admit I had to read that paragraph several times :-).

        I can't help feeling that you're making an error by predicating
        the recognition of this "further identification" by matching
        lexical patterns on the names. Or perhaps to put it another
        way, if the lexical pattern is all that distinguishes a reference
        mode from a reference scheme, then perhaps the value of
        the notion of "mode" is over-rated?

        All it seems really to be saying is that this reference scheme
        is named in a way that allows us to display it rolled-up inside
        the entity rectangle. However to me,  the rolling-up is merely
        a lexical convenience, and I want to be able to do it on the
        diagram even if I use a different pattern for constructing the
        associated names. I realise that would make it less pure as a
        formal notation, but the rolling up exists as a convenience
        anyway, not as a formal thing. By building this as a formal
        thing, you're hiding the list of name pattern recognition rules
        up inside the formal notation.

        There's very little else in the ORM2 notation which has such
        a mass of special ad-hoc rules hidden inside it, and I'd prefer
        it if these weren't either.

        > The current implementation (and my preference, although there
        > is still an ongoing internal debate) is to keep the notion of
        > 'reference mode' as a secondary notion

        That's my preference also - treat it as a kind of shorthand,
        with the details neither implicit, nor displayed. It sounds like
        in your current implementation, they're implicit. The implicit
        path only has value if the meta-rules are standardized. If
        they're customizable, they lose their formal value as a
        short-hand, because the meaning isn't known without
        consulting the customized meta-rules.

        > We know that the way Reference Mode is presented is different
        > than in other ORM tools. Concerns have been noted, and changes
        > are in the pipeline (explicit notation for unit vs popular
        > reference modes, comparable units,...

        By "comparable" I'm sure you don't mean equality comparison.
        Do you mean sortable? Or do you mean convertible (inches to
        meters, for example)? Consider that exact equality may be lost
        during conversion, and in fact some units are only approximately
        convertible, for example months to days. Others are time-variant,
        for example dollars to pounds. In any case, the "units" program I
        recommended you look at has something to offer here. I believe
        you'll need to add an indication of exactitude and time-variance
        to the conversions though.
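
        One way to picture a conversion that carries those two
        indications (a sketch only; the Conversion class and its
        field names are hypothetical and are not taken from the
        "units" program or from NORMA):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conversion:
    source: str
    target: str
    factor: float
    exact: bool         # False when only approximately convertible (months to days)
    time_variant: bool  # True when the factor changes over time (dollars to pounds)

    def convert(self, value):
        return value * self.factor


def needs_caveat(conversion):
    """True if equality may not survive this conversion."""
    return (not conversion.exact) or conversion.time_variant


# 1 inch is defined as exactly 0.0254 m; an average month is only roughly 30.44 days.
INCH_TO_METER = Conversion("inch", "meter", 0.0254, exact=True, time_variant=False)
MONTH_TO_DAY = Conversion("month", "day", 30.44, exact=False, time_variant=False)
```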

    • Maybe a better name for 'Reference Mode' is 'Reference Mode Pattern'. If you look at the 'ORM Reference Mode Editor' window, you're really looking at a set of patterns. These are not hidden and the patterns can be modified. The format strings are editable for the reference mode kinds and any custom reference modes you add. When you choose a RefMode from the available list, you are instructing the tool to apply the pattern.

      The most important pattern is definitely the reference scheme pattern, which is what is stored with the model. The FactType/Constraints/ValueType are the primary representation of the concept, regardless of how it is entered. The underlying 'Reference Mode' pattern is secondary. However, that does not mean that there are no additional metarules that apply to it. Clearly, the matching pattern for applying any additional rules will be specified.

      To see where we're leaning in changes (gory detail):
      1) All reference mode patterns used by the model will be stored with the model, including any of the ones now considered intrinsic.

      2) We will install a reference modes file which provides an editable list of the reference mode patterns you want to be available for new (and not so new) models on a given machine. As soon as you begin using a reference mode pattern from the known list, it will be copied into your model file, at which point it will hide any of the defaults on the given machine. This allows you to customize your 'intrinsic' reference modes without compromising the machine.

      3) Reference mode patterns can be divided into two groups, Popular and Unit. The old notion of General does not match a specific pattern (the name of the opposite value type is shown as the collapsed name).

      4) The name of the ValueType associated with a Popular reference mode pattern must be functionally dependent on the name of the EntityType. In most cases, the value type name will also be functionally dependent on the name of the reference mode pattern (you can see this as {EntityTypeName}_{ReferenceModeName} in the reference mode editor). The popular pattern is where additional metarules become useful.

      5) A Unit-based reference mode may not be dependent on the name of the associated entity type.

      6) Each of the Reference Mode Kinds (Popular, Unit) will continue to have default naming patterns. However, all of the formats will be overridable on a per-Reference Mode Pattern basis. So your kg reference mode can correspond to a ValueType of kgValue, and your cm reference mode could correspond to a ValueType of cm.

      6) Popular reference modes will be displayed in short form with a preceding dot (.). So, Person(id) becomes Person(.id). Note that in the fact editor this would let you type in Person(.id) and automatically create an 'id' RMP (Reference Mode Pattern) with default formatting information even if you didn't have one yet.

      7) UnitBased reference modes would appear with a trailing :, so Height(cm:) would indicate a height measured in cm. You could optionally display the type of the unit, so this could also display as Height(cm:length).

      8) If you choose to expand the reference mode and show the underlying facttype/valuetype, then we would modify the FactType display by appending a : to the name of a value type associated with a Unit RMP (note that 5 means that there is a 1-1 relationship between a Unit RMP and a ValueType), and highlighting the name of the EntityType in the ValueType.Name for a Popular (see 4). Highlighting could be done with an underline (single, double, dotted underlines have all been discussed), so Person_id would have Person underlined, indicating that this ValueType is the Popular reference scheme for Person, even if Person is not on the current diagram.
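
      To make the proposed short forms concrete, here is a rough parsing sketch. It is not the fact editor's parser; `parse_reference` and the dictionary keys are hypothetical, and treating an undecorated body as the old General form is an assumption based on point 3.

```python
import re

# EntityType name followed by a parenthesized reference mode body.
_REF = re.compile(r"^(?P<entity>\w+)\((?P<body>[^()]*)\)$")


def parse_reference(text):
    """Parse 'Person(.id)', 'Height(cm:)' or 'Height(cm:length)' into a dict."""
    m = _REF.match(text.strip())
    if m is None:
        raise ValueError(f"not an EntityType(refmode) expression: {text!r}")
    entity, body = m.group("entity"), m.group("body")
    if body.startswith("."):
        # Leading dot: Popular reference mode, e.g. Person(.id)
        return {"entity": entity, "kind": "Popular", "mode": body[1:], "unit_type": None}
    if ":" in body:
        # Trailing colon, optionally followed by the type of the unit: Height(cm:length)
        mode, _, unit_type = body.partition(":")
        return {"entity": entity, "kind": "Unit", "mode": mode, "unit_type": unit_type or None}
    # No decoration: assumed to be the General form (value type name shown as-is).
    return {"entity": entity, "kind": "General", "mode": body, "unit_type": None}
```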

      Anyway, I think this addresses most of the issues. The RMP notion stays secondary (the underlying FactType/Constraints/ValueType are the primary notion, not the pattern), ease of entry is improved (x(.foo) in the fact editor will automatically create a recognized and reusable foo reference mode pattern), the list of 'intrinsic' reference modes is relaxed so not all forms are imposed on users who don't want them, and the RefMode behavior is very similar to the Visio-era tool if you never set ExpandRefMode to true. If we need to display/report the reference modes in some form we can do that as well.
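
      The copy-on-first-use behavior from point 2 might look like this in outline. It is a sketch under assumptions: plain dictionaries stand in for the model file and the machine-level reference modes file, and `resolve_pattern` is a hypothetical name.

```python
def resolve_pattern(name, model_patterns, machine_defaults):
    """Return the format for `name`, copying the machine default into the model on first use."""
    if name not in model_patterns:
        # First use: the machine default is copied into the model (KeyError if unknown).
        model_patterns[name] = machine_defaults[name]
    # From here on the model's own copy hides whatever the machine file says.
    return model_patterns[name]
```

      Once the copy is made, later edits to the machine defaults do not change the meaning of the model, which is the stability property this design aims for.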

      On units, I'm being fast, loose, and non-technical with English when I say comparable (cm and inches are comparable, but cm and kg are not). I'll concede that comparison and conversion have different technical meanings. However, I think it was obvious from the context here that we're talking about comparing LENGTH, regardless of the unit. No arguments with the statements on conversion exactitude and time-variance. Of course, it can be even worse depending on the datatype used to store the data (inches and centimeters stored as an integral type convert less accurately than those stored as floating point).
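
      A small worked example of the datatype point, as a sketch (the function names are illustrative): converting 12 cm to whole inches and back does not return 12, while the floating point round trip does, to within rounding error.

```python
CM_PER_INCH = 2.54  # exact by definition of the inch


def cm_to_inches_integral(cm):
    """Convert to inches and store the result as an integer."""
    return round(cm / CM_PER_INCH)


def inches_to_cm_integral(inches):
    """Convert to centimeters and store the result as an integer."""
    return round(inches * CM_PER_INCH)


inches = cm_to_inches_integral(12)          # 4.72... in is stored as 5
back = inches_to_cm_integral(inches)        # 12.7 cm is stored as 13
float_back = (12 / CM_PER_INCH) * CM_PER_INCH
```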


      • Clifford Heath

        > 2) We will install a reference modes file which provides an
        > editable list of the reference mode patterns you want to be
        > available for new (and not so new) models on a given machine

        Machine... I foresee problems. A model has a different meaning
        depending on what machine I peruse it on??? How baroque. I know
        you want to handle this by copying the definitions into the
        model, but that's fraught also... it means that a given set of
        (possibly recorded and/or automated) actions to build a model
        will produce a different outcome depending on how the machine
        was configured! It also means that if the corporate standard
        patterns get updated, the model is frozen in time. IMO, these
        kinds of dependencies are utterly unacceptable in a development
        tool, especially one that purports to set a new standard in
        information management.

        NORMA needs a way to import definitions (models, value types,
        patterns, and whole models) into a model, and that's the way
        this should be handled. You can include an "included model
        version" in the model for each include, to detect include

        If you import a whole model, the entity types included aren't
        mapped unless they're connected to (play roles in) the model
        they're imported to. At present, all features of a model are
        mapped even if there are multiple disconnected meshes.

        > 6) Each of the Reference Mode Kinds (Popular, Unit) will
        > continue to have default naming patterns. However, all of
        > the formats will be overridable on a per-Reference Mode
        > Pattern basis.

        Again, this means that ORM2 as a visual language creates models
        whose interpretation is dependent on invisible or environmental
        settings. Is that a good thing? Or perhaps that happens anyhow...

        > 7) ... Height(cm:length)

        Why can't "cm" just know that it's a length? I mean, it is
        always a length isn't it?

        I question the need to separate the : and . notations. Surely
        Foo(:id) could be considered as a unitless mode, and matched
        against the popular patterns? Then you wouldn't need the
        syntax Foo(.id), and you could apply patterns to unit-based
        modes as well, so Area(10000*m*m:hectare) would be a unit-
        based definition that applies a "hectare" pattern, if any?

        I assume you had a plan to support compound units, yes?

        On "comparable", I was questioning whether IDs are considered
        comparable. They support equality comparison, but not
        (meaningfully) sorting, for example.

        I hate this forum. Surely you could see your way to migrating
        to an alternative, such as the Yahoo information_modeling list
        I set up? It would be ideal, and provides everything this one
        does and more.

    • Reference modes provide a shorthand notation for an extremely common pattern. With a shorthand notation not all of the information is displayed on the diagram. This is no different from the Visio solution and other solutions where some of the reference mode details (separator, etc) are not clearly displayed on the diagram. Similarly, ORM diagrams don't display datatypes. In any case, you will not see the readings for the underlying existential FactType when you collapse reference modes. There are also voices asking for customization of the default set of reference modes (generally to reduce that set, although extension also applies), which is at odds with your desire to have the shorthand notation match exactly the same pattern on all models. Your restrictive request followed to its logical conclusion also means that there are no custom reference modes because these would be non-standard, and hence the shorthand could not convey the full meaning. Certainly, we could allow a synchronization command between the current machine settings and those in a given model, but I believe that attempting to do this automatically and dynamically changing meanings in a stable model based on environmental changes is fundamentally a bad idea. Stability of individual models clearly trumps the need for updates.

      Cross-model references are a different topic, and an area we will address after the single-model mappings are handled cleanly. Cross-model synchronization is also a major issue (what happens when your external model changes?). Certainly, if you want to have a set of RMPs defined in an external model you can do this and have no defaults. However, I think having a populated (yet configurable) set of common RMPs available for a new model is also important. I don't think one world view (fixed set of reference modes only) needs to lock out the other possibilities, and you can certainly configure the tool any way you like by removing all the default RMPs, adding them to a starting template, etc.

      Unit and Popular are fundamentally different notions. With a Unit RMP, multiple entity types may bind to the same ValueType and match the same RMP. The same is not true for a Popular reference mode.

      hectare is an area (length*length), not the other way around, so hectare is definable in terms of two length units and a conversion factor. This will give you a unit, but not a ValueType. The ValueType itself has additional datatype and conceptual information not available directly from a unit. We haven't formally modeled this yet and locked down underlying names, but I would consider {length,area,mass,etc} to be fundamental units, and {m,cm,mm,in,acre,hectare,kg,lb,etc} to be standard units that correspond to fundamental units. Formalizing the notion of unit provides a clean conversion mechanism. So, for example, conversion factors can be applied automatically, and a derived BMI value (unit is kg/(m*m)) can be calculated seamlessly even if weight is recorded in pounds and height in inches. In any case, unless I'm badly misreading your syntax, Area(10000*m*m:hectare) looks like it's backwards (a fundamental unit defined in terms of a standard unit).
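
      As an illustration of automatic conversion for a derived value (a sketch, not a design for NORMA; the table and function names are hypothetical), BMI, which is mass divided by height squared, can be computed in kg and m whatever units the inputs were recorded in:

```python
# Factors to SI units; the pound and inch values are exact by international definition.
TO_KG = {"kg": 1.0, "lb": 0.45359237}
TO_M = {"m": 1.0, "cm": 0.01, "in": 0.0254}


def bmi(weight, weight_unit, height, height_unit):
    """Body mass index in kg per square meter from any supported input units."""
    kg = weight * TO_KG[weight_unit]
    m = height * TO_M[height_unit]
    return kg / (m * m)
```

      Passing a unit from the wrong table (for example "kg" as a height unit) raises a KeyError, which is a crude stand-in for the check that cm and kg are not comparable.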

      Whether an Id is sortable or not is more a function of the data type, which is independent of the matched RMP. I'm not sure it's a question we need to consider, though. If Id matches a popular RMP, then the lexical values have no meaning outside the set of identifiers for the population of a given EntityType (and its subtypes). Clearly, for unit-based values, comparison/conversion is conceptually meaningful, but not for popular reference modes.


      PS I'm following this thread where it was started. I don't object to following the yahoo list as well, which is housing most of the recent traffic.

    • Hi,

      The RM/PI/RS issue is anything but clear cut.  How about taking one piece, the idea of a Reference Mode, and delineating that?  From a section in T.H.'s book, I see a Reference Mode as the manner in which the Value refers to the Entity.  The difference between interpreting a value such as 80 degrees C and 80 degrees F is profound.  The reason for introducing a RM then, seems to be to include a designated interpretation for some value type's values within a conceptual model of a domain.  In this, there is no intrinsic link to the concepts of unique instances of types, or to identity.  Yet the ORM tools I know of (and perhaps the ORM technique or methodology as well), bind the concept of a RM to Uniqueness Constraints, Primary Identifiers and Reference Schemes, and further to explicit designation of data types.

      Taken on its own, a RM designation has obvious merit.  In a conceptual model, there should be a way to state that Inches are what is being discussed, not Centimeters. In a practical sense, it might be good to have the tool recognize that these are units of length, not currency, for example; and so suggest a suitable data type as an expedient - but that's not a conceptual issue.  Even at the conceptual level, I'm not sure that a RM (in this strictly limited sense), should be anything more than a note that is bound to the value object in the model.  Binding one RM to length, Inches say, can impose unnecessary restrictions of interpretation, not called for in modeling the domain.  An RM (again strictly on its own), should be optional.

      Is it possible to untangle the basic notion of an RM from the rest of these related issues?  If so, I think it will make sorting out the rest much easier.

      Finally, I'd like Matt's opinion on how far this thread should be carried in this forum.  After all, this forum is for NORMA project development issues.  If he thinks it best to move these discussions to a new venue, or table them for a later time, I can see the point in that.  BRN..