Download Latest Version java-cup-11a-runtime.jar (13.3 kB)
Email in envelope

Get an email when there's a new version of Relational Algebra for MySQL

Home / queries
Name Modified Size InfoDownloads / Week
Parent folder
company.config 2016-02-03 129 Bytes
computers.config 2016-02-03 125 Bytes
computers.queries 2016-02-03 578 Bytes
cq1.ra 2016-02-03 160 Bytes
dq1.ra 2016-02-03 177 Bytes
dq2.ra 2016-02-03 214 Bytes
drinks.config 2016-02-03 128 Bytes
Totals: 7 Items   1.5 kB 0

RELATIONAL ALGEBRA INTERPRETER

This open source project provides an implementation of a relational algebra interpreter. This is a useful tool in introductory database courses where querying through relational algebra is covered. The interpreter assumes a MySQL database. The relational algebra queries are executed over the tables of the MySQL database.

We provide a jar file called ra-mysql.jar in the code section; make sure to include the jar file in your CLASSPATH.

Grammar for Relational Algebra

Query ::= Expr SEMI
Expr ::= ProjExpr|RenameExpr|UnionExpr|MinusExpr|IntersectExpr|JoinExpr|TimesExpr|SelectExpr|NAME|LPAREN Expr RPAREN
ProjExpr ::= PROJECT LBRACK AttrList RBRACK LPAREN Expr RPAREN
RenameExpr ::= RENAME LBRACK AttrList RBRACK LPAREN Expr RPAREN
AttrList ::= NAME|AttrList COMMA NAME
UnionExpr ::= Expr UNION Expr
MinusExpr ::= Expr MINUS Expr
IntersectExpr ::= Expr INTERSECT Expr
JoinExpr ::= Expr JOIN Expr
TimesExpr ::= Expr TIMES Expr
SelectExpr ::= SELECT LBRACK Condition RBRACK LPAREN Expr RPAREN
Condition ::= SimpleCondition|SimpleCondition AND Condition
SimpleCondition ::= Operand COMPARISON Operand
Operand ::=  NAME|STRING|NUMBER

Precedence rules:

precedence right JOIN, TIMES, UNION, MINUS, INTERSECT;

I. USER MANUAL

MySQL Database

To use this Relational Algebra interpreter, you must have a MySQL database available. This database will contain the actual tables and data on which you will perform the relational algebra queries.

Configuration File

To connect to the MySQL database from the RA interpreter, you need to use JDBC (the drivers are already included in the jar file; make sure you include the jar file in you CLASSPATH) which requires a config file like below:

    DRIVER=com.mysql.jdbc.Driver
    CONNECTION_STRING=jdbc:mysql://localhost/
    DATABASE_NAME=dbname
    USER_NAME=username
    PASSWORD=password

You can customize the configuration file by providing your database name, username, and password. This will connect to the database 'dbname' in MySQL.

How to Invoke Interpreter

To invoke the interpreter you must run the command:

    java RA configFile

where configFile contains the information in the above section. For example for the company database the command would look like this

    java RA company

Another feature that can be added in the command line prompt is to view the SQL version of your RA query. To do so, you can add '-q' to the end of the command like so:

    java RA configFile -q

Once one of these commands is entered, a prompt that says RA> will show up where you will be able to enter your queries.

Interpreter Commands

There are 4 different commands that can be used in this interpreter, each is followed by a semicolon to terminate the command.

  1. schema

    this command will list all of the table names and their column names and data types

  2. A query

    to run a query you can simply type in the query on the prompt

  3. source filename

    there is also an option to put the query in a file and run it by calling that filename

    the file may contain comments preceded by '//'

    the end of the query will be determined by the placement of ';'

  4. exit

    exit will terminate and leave the interpreter

Sample Interaction

  1. Display database schema

    RA> schema;
    
    *********************************************
    DEPARTMENT(DNAME:STRING,DNUMBER:NUMBER,MGRSSN:STRING,MGRSTARTDATE:STRING)
    DEPENDENT(ESSN:STRING,DEPENDENT_NAME:STRING,SEX:STRING,BDATE:STRING,RELATIONSHIP:STRING)
    DEPT_LOCATIONS(DNUMBER:NUMBER,DLOCATION:STRING)
    EMPLOYEE(FNAME:STRING,MINIT:STRING,LNAME:STRING,SSN:STRING,BDATE:STRING,ADDRESS:STRING,SEX:STRING,SALARY:NUMBER,SUPERSSN:STRING,DNO:NUMBER)
    PROJECTS(PNAME:STRING,PNUMBER:NUMBER,PLOCATION:STRING,DNUM:NUMBER)
    T1(A:NUMBER,B:NUMBER)
    T2(A:NUMBER,B:NUMBER)
    T3(C:NUMBER,D:NUMBER)
    WORKS_ON(ESSN:STRING,PNO:NUMBER,HOURS:NUMBER)
    
    *********************************************
    RA>
    
  2. Query from keyboard

    RA> project[fname,minit,lname](employee join rename[ssn,a,b,c,d](dependent));
    
    ANSWER(FNAME:STRING,MINIT:STRING,LNAME:STRING)
    
    Number of tuples = 6
    John:B:Smith:
    Franklin:T:Wong:
    Alex:D:Freed:
    Bonnie:S:Bays:
    Alec:C:Best:
    Jennifer:S:Wallace:
    
    RA>
    
  3. Execute query from file. Assume queries/cq1.ra contains a relational algebra query:

    Naveens-MacBook-Pro:compiler naveen$ more queries/cq1.ra
    //comment
    project[fname,lname,address](
            (rename[dname,dno,mgrssn,mgrstartdate](
            select[dname='Research'](department)) 
            join 
            employee
            )
    );
    
    Naveens-MacBook-Pro:compiler naveen$ java RA company
    RA> source queries/cq1.ra;
    
    ANSWER(FNAME:STRING,LNAME:STRING,ADDRESS:STRING)
    
    Number of tuples = 4
    John:Smith:731 Fondren, Houston, TX:
    Franklin:Wong:638 Voss, Houston, TX:
    Joyce:English:5631 Rice, Houston, TX:
    Ramesh:Narayan:971 Fire Oak, Humble, TX:
    
    RA>
    
  4. Exit the interpreter

    RA> exit;
    Naveens-MacBook-Pro:compiler naveen$
    

II. IMPLEMENTATION

Phase 1 (Lexical analysis and parsing)

Lexical Analysis

We used jflex for lexical analysis. The lexical specifiation file is shown below:

    LineTerminator = \r|\n|\r\n
    WhiteSpace = {LineTerminator} | [ \t\f]
    Project = [pP][rR][oO][jJ][eE][cC][tT]
    Rename = [rR][eE][nN][aA][mM][eE]
    Union = [uU][nN][iI][oO][nN]
    Minus = [mM][iI][nN][uU][sS]
    Intersect = [iI][nN][tT][eE][rR][sS][eE][cC][tT]
    Join = [jJ][oO][iI][nN]
    Times = [tT][iI][mM][eE][sS]
    Select = [sS][eE][lL][eE][cC][tT]
    And = [aA][nN][dD]
    Name = [a-zA-Z][a-zA-Z0-9_]*
    Numeric_constant = [0-9]+ | [0-9]+"."[0-9]* | "."[0-9]*
    String_constant = ['][^'\n]*[']
    Comparison = "=" | "<>" | "<" | ">" | "<=" | ">="
    %%
    /* ------------------------Lexical Rules Section---------------------- */
    <YYINITIAL> {
        ";"                 { return symbol(sym.SEMI); }
        {Comparison}        { return symbol(sym.COMPARISON, new String(yytext())); }
        {Project}           { return symbol(sym.PROJECT); }
        {Union}             { return symbol(sym.UNION); }
        {Minus}             { return symbol(sym.MINUS); }
        {Intersect}         { return symbol(sym.INTERSECT); }
        {Join}              { return symbol(sym.JOIN); }
        {Times}             { return symbol(sym.TIMES); }
        {Rename}            { return symbol(sym.RENAME); }
        {Select}            { return symbol(sym.SELECT); }
        {And}               { return symbol(sym.AND); }
        "]"                 { return symbol(sym.RBRACK); }
        "["                 { return symbol(sym.LBRACK); }
        ")"                 { return symbol(sym.RPAREN); }
        "("                 { return symbol(sym.LPAREN); }
        ","                 { return symbol(sym.COMMA); }
        {Name}              { return symbol(sym.NAME, new String(yytext())); }
        {Numeric_constant}  { return symbol(sym.NUMBER,new String(yytext())); }
        {String_constant}   { return symbol(sym.STRING,new String(yytext())); }
        {WhiteSpace}        { /* just skip what was found, do nothing */ }   
    }

The above specification is relatively straight forward to understand (that is if you know regular expressions!) Notice that all of the key words in the relational algebra are case insensitive. For example, the key word union is specified by the regular expression

    Union = [uU][nN][iI][oO][nN]

The numeric constants allow for both integers and real numbers. Notice that negative numbers are not allowed, if you are interested into looking into the source go ahead and update the numeric constant regular expression to allow negative numbers.

Parsing

We used jcup for parsing. Besides parsing we also construct a tree data structure to store the expression tree for the relational algebra query. The java code embedded within the grammar rules is relatively straight forward. One rule is illustrated below.

    JoinExpr ::= Expr:e1 JOIN Expr:e2
      {:
        RANode n = new RANode();
        n.setLchild(e1);
        n.setRchild(e2);
        n.setRnodetype("join");
        RESULT = n;
      :}
    ;

Typically we introduce java code at the end of the rule in which we construct the node in the expression tree using pointers for sub expressions (e1,e2).

Data Structure

We illustrate the data structure tree with the following query.

    project[B](
      rename[B](bar)
      join
      rename[B,R](serves)
      join
      project[R](select[D='Clark'](rename[D,R](likes)))
    );

The expression tree for the above query is shown below (The schema and relation name will be explained later):

q1.jpg

MySQL Adaptor

We built a MySQL adaptor class, MySQL.java, that provides the following methods:

    public MySQL()
    public static void open(String configFile)
    public static void close()
    public boolean relationExists(String rname)
    public ArrayList<String> getAttributes(String rname)
    public ArrayList<String> getDataTypes(String rname)
    public void displayDatabaseSchema()
    public void displayQueryResults(String query, RANode root)
    //query is the MySQL query generated for the expression 
    //root is top node of expression tree

A typical usage of the adaptor is as follows:

    MySQL m = new MySQL();
    m.open("configFile");
    ... call other methods
    ...
    m.close();

Phase 2 (Semantic checks and preparation for phase 3)

Semantic Checks

At this point the input given by the user has passed all syntax checks, now semantic checks will be performed.

Base case: the input is a table

The table must be present in the database.

Minus, Union, or Intersect

The number of columns and datatypes of the two tables must be equal.

Times and Join

No semantic check

Project

All of the attributes listed must be in the schema of the subquery.

Rename

The size of the attribute list must be the same as the size of the schema of the subquery.

Select

The data types of the left operand and the right operand must be equal, and any table used must be in the database. All attributes references in the select condition must be present in the schema of the subquery

Schema and Datatype Computation

Base case: the input is a table

Set the schema and datatypes to the schema and datatypes of the given table.

Minus, Union, or Intersect

Set the schema and datatypes to that of the left expression (chosen arbitrarily; both expressions have the same schemas and datatypes)

Times

Set the schemas and datatypes to that of the left and right expressions, adding the table name in front of the columns that may repeat in the schema.

Join

Set the schemas and datatypes to that of the left and right expressions, but in the right expression, only set the columns that dont appear in the left expression.

Project

Set the schema and datatypes to that of the attribute list given.

Rename

Set the schema to that of the attribute list, and set the datatypes to that of the subquery.

Select

Set the schema and datatypes to that of the subquery.

Temporary Table Name Generation

Temporary table names are generated due to syntax rules of MySQL. They are generated recursively giving a table name tempN to each of the subqueries, where N starts at 0 and increments for each node that is not a basic relation(not a leaf node) based on the post order recursive path.

Phase 3 (Query translation and execution)

Query Translation

For each expression we put a left and right parenthesis around it to ensure it executes in the proper order.

Base case: the input is a table

(select * from table)

Inductive cases: Let lquery and rquery denote the left and right sql sub expressions. Also let tempL and tempR denote the relation names of the left and right child nodes and let tempN be the relation name for the current node.

Union

(lquery union rquery)

Minus; assume c1,...,ck are the columns of lquery

(select * from lquery tempL where (c1,...,ck) not in (select * from rquery tempR))

Intersect; assume c1,...,ck are the columns of lquery

(select * from lquery tempL where (c1,...,ck) in (select * from rquery tempR))

Times

(select * from lquery tempL, rquery tempR)

Join; assume c1,...,ck are the columns of lquery and rquery with duplicates removed and assume jc1,...,jcn are the join columns

(select distinct (c1,...,ck) from lquery tempL, rquery tempR where (tempL.jc1=tempR.jc1 and ... and tempL.jcn=tempR.jcn))

Project; assume a1,...,an are the projection attributes

(select distinct (a1,...,an) from lquery tempL)

Rename; assume a1,...,an are the rename attributes and c1,...,cn are the columns of lquery

(select distinct (c1 a1, ...,cn an) from lquery tempL)

Select; assume lc1 and rc1, ..., lcn and rcn are the left and right operands for each conditions and there are n conditions. Also op is any operator.

(select * from lquery tempL where (lc1 op rc1 and ... and lcn op rcn))

Query Execution

We call the function displayQueryResults(query,tree) that is defined in the MySQL java class. This function uses JDBC to connect to the MySQL database and perform the query. The results of that query are then printed to the screen.

Source: README.md, updated 2016-02-03