
Pattern_matching_filter

silex6

This filter should extract every record that contains a specified pattern or keyword from a database table or view. It has no input connector and is used to implement the LIKE and CONTAINS statements.

Algorithms

This filter should have two algorithms: indexed and table scan.

The indexed algorithm should use the pattern index to find all records that may contain the pattern, then load those records from the table, scan each one for the full pattern, and send every record that actually contains it to the output connector.

In this algorithm, the pattern index serves only as a hint to limit the operation cost: indexed patterns are limited to 8 bytes, whereas the predicates used in LIKE and CONTAINS have no such length limit.

The pattern index should also be used to estimate cost. This algorithm can be used only if a pattern index exists for the field being filtered.
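The indexed algorithm described above can be sketched as follows. This is only an illustration under stated assumptions, not the engine's actual code: `lookup_bitmap` and `load_record` are hypothetical callbacks, and the index is assumed to return one bit per record, set to 1 when the record may contain the (truncated) pattern.

```python
from typing import Callable, Iterator, List

# The 8-byte limit on indexed patterns comes from the text above.
INDEXED_PATTERN_BYTES = 8


def indexed_pattern_filter(
    pattern: bytes,
    lookup_bitmap: Callable[[bytes], List[int]],  # hypothetical index lookup
    load_record: Callable[[int], bytes],          # hypothetical table access
) -> Iterator[bytes]:
    """Yield every record that really contains `pattern`."""
    # Only the first 8 bytes can be looked up, so the index is just a hint.
    key = pattern[:INDEXED_PATTERN_BYTES]
    bitmap = lookup_bitmap(key)
    for record_id, bit in enumerate(bitmap):
        if not bit:
            continue  # the index rules this record out
        record = load_record(record_id)
        # Re-check the full pattern: the index may report false positives.
        if pattern in record:
            yield record


# Toy table and toy index, for illustration only.
TABLE = [b"alpha-beta-gamma", b"alpha-beta-delta", b"omega"]


def toy_lookup_bitmap(key: bytes) -> List[int]:
    return [1 if key in record else 0 for record in TABLE]


def toy_load_record(record_id: int) -> bytes:
    return TABLE[record_id]


matches = list(
    indexed_pattern_filter(b"alpha-beta-g", toy_lookup_bitmap, toy_load_record)
)
```

In this toy run the 12-byte pattern is truncated to `alpha-be` for the lookup, so the index selects the first two records, and the final scan keeps only the first one.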

The non-indexed (table scan) algorithm behaves the same as the non-indexed SELECT table algorithm.

Cost calculation

Cost calculation for the indexed algorithm uses the pattern index. Cost = number of bits in the pattern index. Expected number of records = number of 1 bits in the pattern index.
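As a rough sketch of these two formulas, assuming the pattern index lookup yields a bitmap represented as a list of 0/1 values (the representation and the function name `estimate_indexed_cost` are assumptions made for illustration):

```python
from typing import List, Tuple


def estimate_indexed_cost(bitmap: List[int]) -> Tuple[int, int]:
    """Return (cost, expected_records) for the indexed algorithm."""
    cost = len(bitmap)                               # number of bits in the index
    expected_records = sum(1 for bit in bitmap if bit)  # number of 1 bits
    return cost, expected_records


cost, expected_records = estimate_indexed_cost([1, 0, 1, 1, 0])
```

For the five-bit bitmap above, the cost is 5 and the expected number of records is 3. Because the index is only a hint, the number of records actually sent to the output connector can be lower than this estimate.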

Cost calculation for the non-indexed algorithm is the same as for the non-indexed SELECT table algorithm.

See also

[Set_engine]


Related

Wiki: Filters
Wiki: Set_engine
