Menu

Home

Adrian Panasiuk

SQLToAlgebra

The following describes SQLToAlgebra 7.1.8.

This tool allows translating SQL queries into equivalent relational algebra formulas. It implements the algorithm described in "Translating SQL into the Relational Algebra" by Jan Van den Bussche, Stijn Vansummeren. You may redistribute and/or modify SQLToAlgebra under the terms of the GNU GPLv3 license.

SQLToAlgebra depends on AlgebraToSpreadsheet which defines the API for manipulating relational algebra terms and allows to further convert the relational algebra formulas into spreadsheet files. Those spreadsheet files contain columns where the user enters her data and columns which evaluate to the results of the chosen relational algebra queries.

Project Members:

Usage

To use the tool, you need Java 7.

Demo app

After unpacking the binary distribution, you can try out the demo app. You can do it by running the translate-ng-7.1.8.jar program with no arguments. It will produce a graphical user interface window, where the user inputs the database schema and queries to create relational algebra and spreadsheet representations.

Command line demo app

The distribution also includes a command line demo app, that shows the generated spreadsheet formulas on the console. The app uses a hardcoded database schema which can not be configured. It is p(a,b,p), q(a), xy(x,y), r4(ra,rb,rc,rd).

First, generate a directory with a set of examples in .sql files. For example, to create the examples in the directory ./examples, invoke:

java -jar translate-ng-7.1.8.jar -examples ./examples/

Now, you may edit the existing or add new examples in the directory. After you're done, translate them into relational algebra:

java -jar translate-ng-7.1.8.jar -dir ./examples/

The results will be stored in ./examples/.txt and .xlsx files. The generated .txt files include the generated spreadsheet formulas. The first couple columns are filled with =NA() and represent columns, where the user will enter her data.

API

Setup

You can use the SQLToAlgebra library in your own application.
First, you need to choose a database schema, for example:

Schema schema = Schemas.fromDDL(
        "CREATE TABLE p(a INT, b INT, p INT);"+
        "CREATE TABLE q(a INT);"+
        "CREATE TABLE xy(x INT, y INT);"+
        "CREATE TABLE r4(ra INT, rb INT, rc INT, rd INT);"
);

Translation

To translate a query, use the static Queries#getRaOf:

String sqlQuery = "SELECT a,b FROM p";
Term translated = Queries.getRaOf(schema, sqlQuery);
System.out.println(translated);  // Project(p=Relation(A,B,C),[1, 2])

getRaOf may throw ParseException or TranslationException.
Additionally, you may see a pretty-printed string representation of the formula with

System.out.println(Terms.indent(translated));

To translate multiple queries at the same time, Queries#getMultipleRasOf may be used.

Integration with AlgebraToSpreadsheet

The SQLToAlgebra distribution comes together with AlgebraToSpreadsheet, which provides the relational algebra API. It includes a tool for converting relational algebra into a spreadsheet representation.

You can translate the formula with:

Result r = translated.execute();

And access the result:

System.out.println("Implemented in excel:");
System.out.println("Result relations: " + r.getResultRelation().toString(Utils.getCaf()));
System.out.println("Formulas: " + r.getFormulas().toString(Utils.getCaf()));

Spreadsheet file generation

To create a spreadsheet file in .xlsx:

import static output.ExcelFileFormulas.QUIET_LAYOUT;
import output.ExcelFileFormulas;


String xlsxOutputPath = ...;

try (
    FileOutputStream out = new FileOutputStream(xlsxOutputPath);
    ExcelFileFormulas excel = new ExcelFileFormulas(out)
)
{
    excel.addResult(r, QUIET_LAYOUT);
} catch (FormulaParseException | IOException e) {
    System.err.format("Error: could not write file '%s'\n", xlsxOutputPath);
}

The different spreadsheet columns in the generated file have different meaning. The leftmost columns represent the tables declared with the CREATE TABLE statements. This is where the user inputs data. The next couple couple columns, hidden by default, are intermediate results of computing the query results. And finally, the columns that are on the right of the user input data but are not hidden represent the resultant relations of the queries chosen by the user.

The generated spreadsheet document has only two rows filled in. They are presented in the spreadsheet program with a "N/A!" text in gray. Those are workspace rows, where the user inputs data. To input more than two rows, the user must extend the workspace area. To do so, the user must select the second row (and only the second row, with care not to select the first row) and autofill it down. This can be done with key bindings by selecting an area spanning the second row and subsequent rows that are to be autofilled and executing the autofill with the Ctrl-D key binding.

Examples

This is an incomplete list of sql queries and equivalent relational algebra formulas. It references the database schema defined above.

Queries.getRaOf(schema, "SELECT * FROM p");
// p=Relation(A,B,C)

Queries.getRaOf(schema, "SELECT p.p FROM p WHERE a>b");
// Project(Filter(p=Relation(A,B,C),"#1>#2"),[3])

[...] "SELECT x FROM xy xy0 WHERE EXISTS (SELECT * FROM xy xy1 WHERE xy1.x = xy0.y)");
// Project(
//         Filter(
//                CartProd(
//                         xy=Relation(A,B), 
//                         Reference(xy)
//                        ),
//                "#3=#2"
//               ),
//         [1]
//        )


Queries.getRaOf(schema, "SELECT min(ra),sum(rd),rc,rb FROM r4 GROUP BY rb,rc");
// Project(
//         GroupBy(
//                 r4=Relation(A,B,C,D),
//                 [2, 3]
//                ).Min[1].Sum[4],
//         [3, 4, 2, 1]
//        )

Queries.getRaOf(schema, "SELECT p0.a, p1.b FROM p p0 JOIN p p1 ON p0.b = p1.a");
// Project(
//         EqJoin(
//                 p=Relation(A,B,C),
//                 Reference(p),
//                 2,1
//                ),
//         [1, 4]
//        )

You may find additional examples embedded in the test suite.

Further reading

J. Tyszkiewicz. Spreadsheet As a Relational Database Engine. [in:] Proc. ACM SIGMOD International Conference on Management of Data, Indianapolis, 2010.

J. Sroka, K.Stencel, J. Tyszkiewicz. Translating Relational Queries into Spreadsheets.

Jan Van den Bussche, Stijn Vansummeren. Translating SQL into the Relational Algebra.

A. Panasiuk. SQLToAlgebra - a tool for translating SQL queries into relational algebra expressions and its applications.


Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.