The current MultiFormat BizComp allows you to match input records against a literal string in one of two ways:
- matching the initial characters of a record (xa:match_begin)
- matching one or more fixed field positions in the record (xa:field_match)
In addition, the BizComp requires you to match every record in the input and map every record to the resultset; otherwise an error is generated and no results are returned.
This works for a large number of the common use cases we have encountered so far, but there are more complex file formats that would be easier (or only possible) to process with a few enhancements to the MultiFormat BizComp. For example, matching any alphanumeric character in the first position would require 62 xa:record_definitions (10 digits plus 26 uppercase and 26 lowercase letters)! Here are some features that should be added:
Custom Mapping Rule
Provide a method to match records with any custom mapping rule. One option is to provide a regular expression matching construct (xa:match_regex?), using some standard regex syntax such as the Java syntax. This would allow the user to match against complex specifications (not just literal strings). Another option - which I would prefer - is to provide the capability to use a custom functoid to provide a custom mapping rule. If the functoid returns "true", then the record is matched. Although a functoid can currently be used on an xa:field_match, its result must be a string literal that is then used to match the input record. What is needed instead is the ability to pass the current input record as a parameter to the functoid (perhaps a new $xavar:currec$, or the ability to specify a functoid name which requires a single String param as input). This is a more flexible option, allowing the user to create a simple functoid, or use Java regex if desired.
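To make the two options a bit more concrete, here is a rough Java sketch of what each kind of rule might look like from the user's side. It is only an illustration: the class and method names (CustomMatchRuleSketch, regexRule, isDetailRecord) are hypothetical and not part of the BizComp API, and the "DT" record code is made up. The only real APIs used are java.util.regex and java.util.function.Predicate.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical sketch: none of these names exist in the MultiFormat BizComp.
final class CustomMatchRuleSketch {

    // Option 1: a regex rule (what an xa:match_regex might do internally).
    // The pattern is compiled once and must match at the start of the record.
    static Predicate<String> regexRule(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return rec -> pattern.matcher(rec).lookingAt();
    }

    // Option 2: a functoid-style rule that receives the current input record
    // as its single String parameter and returns true when the record matches.
    // Here: any first character, followed by the made-up record code "DT".
    static boolean isDetailRecord(String currentRecord) {
        return currentRecord.length() >= 3 && currentRecord.startsWith("DT", 1);
    }

    public static void main(String[] args) {
        // One regex covers every digit and letter in the first position.
        Predicate<String> anyAlnumFirst = regexRule("[0-9A-Za-z]");
        System.out.println(anyAlnumFirst.test("A100 header")); // prints "true"
        System.out.println(anyAlnumFirst.test("#comment"));    // prints "false"

        Predicate<String> functoid = CustomMatchRuleSketch::isDetailRecord;
        System.out.println(functoid.test("7DT payload"));      // prints "true"
    }
}
```

Either form reduces to "record in, boolean out", which is why the functoid option also covers the regex case.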
Optional Matching of Every Record
Provide the capability to ignore errors if an input record is not matched. For example, a user may want to match 3 out of 100 record types in a file, and only write 3 xa:record_definitions. This could be handled as a default rule if the first capability is implemented, but that requires the user to write/maintain a potentially complex regex or rule that matches ALL but the valid cases. A better solution would be to provide another attribute (xa:match_error?) that turns the current error detection on/off. This allows users who want to match every case the ability to use the current behavior, and other users the ability to match only the desired records.
Provide a Default Match
Provide the capability to map a single record layout for all cases not matched (assuming the second capability above is implemented). This allows the user to catch all other records without having to provide a match record definition. This could be useful for mapping the 97/100 other records to some result for logging or even for just debugging.
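The interaction between the error toggle and the default match can be sketched roughly as follows. This is a simplified model, not the BizComp implementation: record definitions are reduced to literal prefixes (in the spirit of xa:match_begin), the boolean stands in for the proposed xa:match_error attribute, and the names (UnmatchedRecordSketch, process, defaultLayout) are hypothetical. It also assumes that a default layout, when present, takes precedence over the error.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of how unmatched records could be handled.
final class UnmatchedRecordSketch {

    // prefixes:         stand-ins for xa:record_definitions matched on initial characters
    // errorOnUnmatched: stand-in for the proposed xa:match_error switch
    // defaultLayout:    name of a catch-all layout, or null if none is defined
    static List<String> process(List<String> lines, List<String> prefixes,
                                boolean errorOnUnmatched, String defaultLayout) {
        List<String> results = new ArrayList<>();
        for (String line : lines) {
            Optional<String> matched =
                    prefixes.stream().filter(line::startsWith).findFirst();
            if (matched.isPresent()) {
                results.add(matched.get() + ":" + line);
            } else if (defaultLayout != null) {
                // Default match: every otherwise unmatched record uses one layout.
                results.add(defaultLayout + ":" + line);
            } else if (errorOnUnmatched) {
                // Current behavior: an unmatched record fails the whole run.
                throw new IllegalStateException("No record definition matches: " + line);
            }
            // Otherwise the unmatched record is skipped silently.
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("HDR 1", "XYZ 2", "TRL 3");
        List<String> prefixes = Arrays.asList("HDR", "TRL");
        System.out.println(process(lines, prefixes, false, null));
        // prints "[HDR:HDR 1, TRL:TRL 3]"
        System.out.println(process(lines, prefixes, true, "OTHER"));
        // prints "[HDR:HDR 1, OTHER:XYZ 2, TRL:TRL 3]"
    }
}
```

With the switch on and no default layout, the same input would throw on "XYZ 2", which corresponds to the behavior described at the top of this request.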
Optional xa:map
The user should not have to provide a record layout and map the results for records of no interest (for example in the case of mapping 3/100 record types). This may not be an issue if the second capability above is provided: no match/no map. I can't currently think of a use case where a record definition would be desired without a corresponding map.
Specify Match Ordering
The MultiFormat BizComp does define the order of matching for multiple matches in the case of xa:match_begin (longest key first). However, the behavior appears to be undefined (and unpredictable) if multiple record definitions are matched using xa:field_match. I'd suggest that the behavior be defined and implemented so that, when record matches overlap, the first matching xa:record_definition in the BizComp wins, for either xa:field_match or custom mapping rules as requested above.
I believe all of this except Specify Match Ordering has been implemented in 5.2. Any disagreement? I would move to close this feature request and open a new one for specifying match ordering if that is the only thing left to implement. Any objections?
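The "first definition wins" rule suggested here could look something like the following Java sketch. It assumes each record definition can be modeled as a named predicate and that definitions are kept in declaration order; the names (MatchOrderSketch, firstMatch) and the sample rules are hypothetical, not BizComp code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch of deterministic match ordering.
final class MatchOrderSketch {

    // Returns the name of the first definition, in declaration order, whose
    // rule accepts the line. The map must preserve insertion order
    // (for example a LinkedHashMap) for the result to be predictable.
    static Optional<String> firstMatch(Map<String, Predicate<String>> definitions,
                                       String line) {
        for (Map.Entry<String, Predicate<String>> entry : definitions.entrySet()) {
            if (entry.getValue().test(line)) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, Predicate<String>> definitions = new LinkedHashMap<>();
        // Field-position rule: character at index 4 is 'I'.
        definitions.put("invoice", line -> line.length() > 4 && line.charAt(4) == 'I');
        // Overlapping rule declared later: record starts with "A".
        definitions.put("typeA", line -> line.startsWith("A"));

        // Both rules accept this record; the earlier declaration wins.
        System.out.println(firstMatch(definitions, "A001I data").get()); // prints "invoice"
    }
}
```

Declaration order is easy for a user to reason about and to change, which is the main argument for it over an implementation-dependent order.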