Recent changes to Set_engine

Set_engine modified by silex6

silex6 — Wed, 17 Dec 2025 13:32:21 -0000

--- v12
+++ v13
@@ -22,9 +22,9 @@

 The pipeline builder should link filters together on a tree-like structure. Producer filters will be placed on top, and consumer filter will be placed on bottom, while pass filters will be placed in the middle. Join and Union filters are binary operators; they have two input connectors and one output. Filters and filtering predicate are compound objects constructed by the SQL runtime. 

-A general sequence of the filters in tree like - WHERE - JOIN - UNION - GROUP BY - HAVING - INSERT INTO – could be used by the set engine as default calculation strategy. 
+A general sequence of the filters in tree like - WHERE - JOIN - UNION - GROUP BY - HAVING - INSERT INTO – could be used by the pipeline as default calculation strategy. 

-Filters should use internal ids of the tables and columns. Matching between objects name of table / view / field and their internal ids should be done by the SQL parser. 
+Filters should use internal ids of the tables and columns. Matching between objects name of table / view / field and their internal ids should be done using the catalog component, during the preparation phase of runtime classes. 

 The pipeline should use expression related classes of the runtime framework in order to calculate result of expressions used in queries. The expression compound objects should first be built by the SQL parser according to SQL source text.
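
The tree layout described in this revision can be pictured with a minimal Python sketch. Everything named here (`Filter`, `TableScan`, `Where`, `Union`, `collect`, the pull-style `next_record()` call, and records keyed by internal column id) is an assumption made for illustration. The specification describes message exchange between filters and does not prescribe this interface.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical record layout: values keyed by internal column id, not by name.
Record = Dict[int, object]


class Filter:
    """A filter hands out one record per call, or None when exhausted."""

    def next_record(self) -> Optional[Record]:
        raise NotImplementedError


class TableScan(Filter):
    """Producer filter: placed on top of the tree, it has no input connector."""

    def __init__(self, rows: List[Record]) -> None:
        self._rows = rows
        self._pos = 0  # private state kept between calls

    def next_record(self) -> Optional[Record]:
        if self._pos >= len(self._rows):
            return None
        row = self._rows[self._pos]
        self._pos += 1
        return row


class Where(Filter):
    """Pass filter: one input connector, one output."""

    def __init__(self, source: Filter, predicate: Callable[[Record], bool]) -> None:
        self._source = source
        self._predicate = predicate

    def next_record(self) -> Optional[Record]:
        while True:
            rec = self._source.next_record()
            if rec is None or self._predicate(rec):
                return rec


class Union(Filter):
    """Binary filter: two input connectors, one output (UNION ALL semantics)."""

    def __init__(self, left: Filter, right: Filter) -> None:
        self._inputs = [left, right]

    def next_record(self) -> Optional[Record]:
        while self._inputs:
            rec = self._inputs[0].next_record()
            if rec is not None:
                return rec
            self._inputs.pop(0)  # current input is exhausted, move to the next
        return None


def collect(root: Filter) -> List[Record]:
    """Consumer at the bottom of the tree: drains the filter above it."""
    out = []
    while (rec := root.next_record()) is not None:
        out.append(rec)
    return out


COL_ID, COL_QTY = 1, 2  # internal column ids (hypothetical values)
a = TableScan([{COL_ID: 1, COL_QTY: 5}, {COL_ID: 2, COL_QTY: 50}])
b = TableScan([{COL_ID: 3, COL_QTY: 70}])
tree = Union(Where(a, lambda r: r[COL_QTY] > 10),
             Where(b, lambda r: r[COL_QTY] > 10))
result = collect(tree)
```

In this sketch the predicates refer to columns only by internal id, which mirrors the requirement that name resolution happens before the filters run.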

Set_engine modified by silex6

silex6 — Wed, 17 Dec 2025 11:06:47 -0000

--- v11
+++ v12
@@ -1,36 +1,32 @@
-The **Set engine** is the one of the runtime components of the database engine. Goal of the set engine is to execute set-related operations (queries). The set engine should automatically detect the fastest algorithms to execute a given query, and execute the query accordingly. Selected algorithm depends on physical structures added by the user and overall database content; scenario detection is a full automatic process. The set engine should implement pipelining though multithreading and message-passing of the results, this mean any record that has been calculated should be sent to the user as soon as possible even if query is still running. 
+The **Pipeline engine** is the one of the runtime components of the database engine. Goal of the pipeline engine is to execute DML operations expressed as SQL queries. The engine should automatically detect the fastest algorithms to execute a given query, and execute the query accordingly. Selected algorithm depends on physical structures added by the user and overall database content; scenario detection is a fully automated process. The pipeline engine implement a tree of state machines that exchange messages. Any record that has been calculated should be sent to the user as soon as possible even if query is still running. 

 [TOC]

 ## Background

-Set engine component goal is to prepare and execute set calculations related to query expressions. This component should use physical engine functions for low-level access to data, including reading and writing records, index seeks, and transaction management. 
+Pipeline engine component goal is to prepare and execute DML operations expressed as SQL queries. This component should use physical engine functions for low-level access to data, including reading and writing records, index seeks, and transaction management. The queries are first translated from SQL code to an _internal_ representation made of a set of objects from the runtime framework. 

-When the database engine is used as an embedded component every thread of the host application can run its own copy of the set engine component. When running as a server, the server runtime should create a worker thread attached to every client connection and every worker thread runs its own copy of the set engine component. 
+When the database engine is used as an embedded component every thread of the host application can run its own copy of the pipeline engine. When running as a server, the server runtime should create a worker thread attached to every client connection and every worker thread runs its own copy of the pipeline engine. 

-The set engine should have three sub-components: the [Micro-thread_scheduler], the [Filters] set and the [Query_factory]. 
+The pipeline engine should have two sub-components: the [filters](Filters) set and the [pipeline builder](Query factory).

 Set calculations should be done by a set of filters linked together into a pipe-and-filter architecture. Every filter is a specific set operator (filter, project, union, Cartesian product). 

-The filters can be implemented as a set of independent agents, that loop waiting and sending messages (records and errors), as the _generators_ in continuation programming style. The filters can also be implemented as _state machines_ that do an operation, and store their current state on a private structure in order to continue operation on next call. Algorithms in this specification are based on _continuation programming style_, and more complicated algorithms have to be applied if filters are implemented as state machines. 
+The filters can be implemented as _state machines_ that do an operation, and store their current state on a private structure in order to continue operation on next call. Algorithms in this specification are based on _continuation programming style_, and more complicated algorithms have to be applied if filters are implemented as state machines. 

 If possible (in most case), calculation should be done as a _streaming process_: every single record that has been calculated should be sent to the client while filters continue to process subsequent records. 

-### Micro-thread scheduler
+### Pipeline builder and filters

-The **micro-thread scheduler** is a [System_Abstraction_Layer] that should allow round-robin execution of the Filters. The scheduler will be used to schedule and run filters, and will be used by filters to exchange messages and wait on events. The scheduler should hide technical details of the implementation, as coroutine-style execution, context switches, message passing... 
+The **pipeline builder** is the component that should create filters instances and link them together. The pipeline builder should try several ways to connect components together, and estimate cost of the calculation in terms of I/O calls. Cost calculation should be done using the filters. The builder should then choose the most cost-effective way, run set calculation that matches current SQL statement and collect results. 

-### Query factory and filters
-
-The **query factory** is the component that should create filters instances and link them together. The query factory should try several ways to connect components together, and estimate cost of the calculation in terms of I/O calls. Cost calculation should be done using the filters. The query factory should then choose the most cost-effective way, run set calculation that matches current SQL statement and collect results. 
-
-The query factory should link filters together on a tree-like structure. Producer filters will be placed on top, and consumer filter will be placed on bottom, while pass filters will be placed in the middle. Join and Union filters are binary operators; they have two input connectors. Join and filtering predicate are compound objects constructed by the SQL runtime. 
+The pipeline builder should link filters together on a tree-like structure. Producer filters will be placed on top, and consumer filter will be placed on bottom, while pass filters will be placed in the middle. Join and Union filters are binary operators; they have two input connectors and one output. Filters and filtering predicate are compound objects constructed by the SQL runtime. 

 A general sequence of the filters in tree like - WHERE - JOIN - UNION - GROUP BY - HAVING - INSERT INTO – could be used by the set engine as default calculation strategy. 

 Filters should use internal ids of the tables and columns. Matching between objects name of table / view / field and their internal ids should be done by the SQL parser. 

-The set engine should use expression related classes of the SQL framework in order to calculate result of expressions used in predicates. The expression compound objects should first be built by the SQL parser according to SQL source text. 
+The pipeline should use expression related classes of the runtime framework in order to calculate result of expressions used in queries. The expression compound objects should first be built by the SQL parser according to SQL source text. 

 ## Relational algebra and SQL

Set_engine modified by silex6

silex6 — Tue, 27 May 2014 17:59:54 -0000

--- v10
+++ v11
@@ -18,7 +18,7 @@

 ### Micro-thread scheduler

-The **micro-thread scheduler** is a [System_Abstraction_Layer] that should allow round-robin execution of the Filters. The scheduler will be used to start, stop or kill filters, and will be used by filters to exchange messages and wait on events. The scheduler should hide technical details of the implementation, as coroutine-style execution, context switches, message passing... 
+The **micro-thread scheduler** is a [System_Abstraction_Layer] that should allow round-robin execution of the Filters. The scheduler will be used to schedule and run filters, and will be used by filters to exchange messages and wait on events. The scheduler should hide technical details of the implementation, as coroutine-style execution, context switches, message passing... 

 ### Query factory and filters

Set_engine modified by silex6

silex6 — Tue, 27 May 2014 17:59:54 -0000

--- v9
+++ v10
@@ -18,7 +18,7 @@

 ### Micro-thread scheduler

-The **micro-thread scheduler** is a [System_Abstraction_Layer] that should allow concurrent (coroutine-like) execution of the filters. The scheduler will be used to start, stop or kill filters, and will be used by filters to exchange messages and wait on events. 
+The **micro-thread scheduler** is a [System_Abstraction_Layer] that should allow round-robin execution of the Filters. The scheduler will be used to start, stop or kill filters, and will be used by filters to exchange messages and wait on events. The scheduler should hide technical details of the implementation, as coroutine-style execution, context switches, message passing... 

 ### Query factory and filters

Set_engine modified by silex6

silex6 — Tue, 27 May 2014 17:59:54 -0000

--- v8
+++ v9
@@ -12,7 +12,7 @@

 Set calculations should be done by a set of filters linked together into a pipe-and-filter architecture. Every filter is a specific set operator (filter, project, union, Cartesian product). 

-The filters can be implemented as threads or fibers that loop waiting on messages (records and errors). The filters can also be implemented as an array of methods that process pending messages, in this case an event loop should be used to start processing any filter that have pending messages on it’s queue. Algorithms in this specification are based on coroutine, and more complicated algorithms have to be applied if filters are implemented as methods based on an event loop. 
+The filters can be implemented as a set of independent agents, that loop waiting and sending messages (records and errors), as the _generators_ in continuation programming style. The filters can also be implemented as _state machines_ that do an operation, and store their current state on a private structure in order to continue operation on next call. Algorithms in this specification are based on _continuation programming style_, and more complicated algorithms have to be applied if filters are implemented as state machines. 

 If possible (in most case), calculation should be done as a _streaming process_: every single record that has been calculated should be sent to the client while filters continue to process subsequent records.
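
The two implementation styles contrasted in this revision can be compared on a small duplicate-removing filter. This is a sketch only: `distinct_generator`, `DistinctStateMachine`, and its `step()` method are hypothetical names, and the records are plain strings to keep the example short.

```python
from typing import Hashable, Iterable, Iterator, Optional, Set


def distinct_generator(source: Iterable[Hashable]) -> Iterator[Hashable]:
    """Generator style: the loop position and `seen` live in the suspended frame."""
    seen: Set[Hashable] = set()
    for record in source:
        if record not in seen:
            seen.add(record)
            yield record  # hand one record downstream, resume here on next request


class DistinctStateMachine:
    """State-machine style: the same state is stored explicitly between calls."""

    def __init__(self) -> None:
        self._seen: Set[Hashable] = set()  # private structure that survives each call

    def step(self, record: Hashable) -> Optional[Hashable]:
        """Process one incoming record; return it if it is new, else None."""
        if record in self._seen:
            return None
        self._seen.add(record)
        return record


records = ["a", "b", "a", "c", "b"]

from_generator = list(distinct_generator(records))

machine = DistinctStateMachine()
from_machine = [out for out in map(machine.step, records) if out is not None]
```

Both variants emit each record as soon as it is known to be new, so either one supports the streaming behaviour described above. The state-machine variant has to name every piece of state itself, which is why algorithms written in continuation style need reworking before they fit that form.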

Set_engine modified by silex6

silex6 — Tue, 27 May 2014 17:59:54 -0000

--- v7
+++ v8
@@ -94,3 +94,7 @@
   * [Update_filter] 
   * [Delete_filter] 
   * [Cursor_filter] 
+
+## See also
+
+[General_architecture]

Set_engine modified by silex6

silex6 — Tue, 27 May 2014 17:59:54 -0000

--- v6
+++ v7
@@ -18,7 +18,7 @@

 ### Micro-thread scheduler

-The **micro-thread scheduler** is a [System_abstraction_layer] that should allow concurrent (coroutine-like) execution of the filters. The scheduler will be used to start, stop or kill filters, and will be used by filters to exchange messages and wait on events. 
+The **micro-thread scheduler** is a [System_Abstraction_Layer] that should allow concurrent (coroutine-like) execution of the filters. The scheduler will be used to start, stop or kill filters, and will be used by filters to exchange messages and wait on events. 

 ### Query factory and filters

Set_engine modified by silex6

silex6 — Tue, 27 May 2014 17:59:54 -0000

--- v5
+++ v6
@@ -72,8 +72,11 @@
 The general algorithms explained are a shrink-wrapped form of the algorithms that should be implemented. For purpose of readability of the algorithms the following implementation details have been omitted:

   * Error handling and rules related to request messages. 
-  * There could be several predicate operators, as '=', '>', '=>‘ ... only one predicate is taken in account in the sample algorithm. 
+  * There could be several predicate operators, as '=', '>', '=>' ... only one predicate is taken in account in the sample algorithm. 
   * On filters that have 2 inputs, the A and B input stream can be swapped, and swapping could impact calculation cost. 
+
+### The filters
+
   * [Select_filter] 
   * [Filtering_filter] 
   * [Pattern_matching_filter]

Set_engine modified by silex6

silex6 — Tue, 27 May 2014 17:59:54 -0000

--- v4
+++ v5
@@ -83,7 +83,7 @@
   * [Semi-join_filter] 
   * [Sub-query_filter] 
   * [Sorting_filter] 
-  * [Group_filter] 
+  * [Aggregator_filter] 
   * [Distinct_filter] 
   * [Union_filter] 
   * [Except_filter]

Set_engine modified by silex6

silex6 — Tue, 27 May 2014 17:59:54 -0000

--- v3
+++ v4
@@ -16,9 +16,13 @@

 If possible (in most case), calculation should be done as a _streaming process_: every single record that has been calculated should be sent to the client while filters continue to process subsequent records. 

-The micro-thread scheduler is a [System_abstraction_layer] that should allow concurrent (coroutine-like) execution of the filters. The scheduler will be used to start, stop or kill filters, and will be used by filters to exchange messages and wait on events. 
+### Micro-thread scheduler

-The query factory is the component that should create filters instances and link them together. The query factory should try several ways to connect components together, and estimate cost of the calculation in terms of I/O calls. Cost calculation should be done using the filters. The query factory should then choose the most cost-effective way, run set calculation that matches current SQL statement and collect results. 
+The **micro-thread scheduler** is a [System_abstraction_layer] that should allow concurrent (coroutine-like) execution of the filters. The scheduler will be used to start, stop or kill filters, and will be used by filters to exchange messages and wait on events. 
+
+### Query factory and filters
+
+The **query factory** is the component that should create filters instances and link them together. The query factory should try several ways to connect components together, and estimate cost of the calculation in terms of I/O calls. Cost calculation should be done using the filters. The query factory should then choose the most cost-effective way, run set calculation that matches current SQL statement and collect results. 

 The query factory should link filters together on a tree-like structure. Producer filters will be placed on top, and consumer filter will be placed on bottom, while pass filters will be placed in the middle. Join and Union filters are binary operators; they have two input connectors. Join and filtering predicate are compound objects constructed by the SQL runtime.
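
The cost-based choice made by the query factory (renamed pipeline builder in later revisions) can be sketched as follows. The classes `Scan` and `NestedLoopJoin`, the `io_cost()` method, and the nested-loop cost formula are assumptions chosen for illustration. The specification only states that cost is estimated in I/O calls and that the filters perform the calculation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scan:
    """Producer filter over a table (hypothetical statistics)."""
    name: str
    pages: int  # assumed number of I/O calls needed to read the table once
    rows: int

    def io_cost(self) -> int:
        return self.pages


@dataclass(frozen=True)
class NestedLoopJoin:
    """Binary filter; its cost depends on which input is the outer one."""
    outer: Scan
    inner: Scan

    def io_cost(self) -> int:
        # Assumed model: read the outer input once, rescan the inner per outer row.
        return self.outer.io_cost() + self.outer.rows * self.inner.io_cost()


def cheapest_plan(a: Scan, b: Scan) -> NestedLoopJoin:
    """Try both ways of connecting the inputs and keep the cheaper one."""
    candidates = [NestedLoopJoin(a, b), NestedLoopJoin(b, a)]
    return min(candidates, key=lambda plan: plan.io_cost())


orders = Scan("orders", pages=1000, rows=100_000)
countries = Scan("countries", pages=2, rows=200)
plan = cheapest_plan(orders, countries)
```

With these made-up statistics, putting `orders` on the outer side costs 201000 I/O calls and putting `countries` there costs 200002, so the sketch selects the second arrangement. A real builder would compare many more candidate trees than the two shown here.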