Name | Modified | Size | Downloads / Week |
---|---|---|---|
Parent folder | |||
company.config | 2016-02-03 | 129 Bytes | |
computers.config | 2016-02-03 | 125 Bytes | |
computers.queries | 2016-02-03 | 578 Bytes | |
cq1.ra | 2016-02-03 | 160 Bytes | |
dq1.ra | 2016-02-03 | 177 Bytes | |
dq2.ra | 2016-02-03 | 214 Bytes | |
drinks.config | 2016-02-03 | 128 Bytes | |
Totals: 7 Items | 1.5 kB | 0 |
RELATIONAL ALGEBRA INTERPRETER
This open source project provides an implementation of a relational algebra interpreter. This is a useful tool in introductory database courses where querying through relational algebra is covered. The interpreter assumes a MySQL database. The relational algebra queries are executed over the tables of the MySQL database.
We provide a jar file called ra-mysql.jar
in the code section; make sure to include the jar file in your CLASSPATH.
Grammar for Relational Algebra
Query ::= Expr SEMI
Expr ::= ProjExpr|RenameExpr|UnionExpr|MinusExpr|IntersectExpr|JoinExpr|TimesExpr|SelectExpr|NAME|LPAREN Expr RPAREN
ProjExpr ::= PROJECT LBRACK AttrList RBRACK LPAREN Expr RPAREN
RenameExpr ::= RENAME LBRACK AttrList RBRACK LPAREN Expr RPAREN
AttrList ::= NAME|AttrList COMMA NAME
UnionExpr ::= Expr UNION Expr
MinusExpr ::= Expr MINUS Expr
IntersectExpr ::= Expr INTERSECT Expr
JoinExpr ::= Expr JOIN Expr
TimesExpr ::= Expr TIMES Expr
SelectExpr ::= SELECT LBRACK Condition RBRACK LPAREN Expr RPAREN
Condition ::= SimpleCondition|SimpleCondition AND Condition
SimpleCondition ::= Operand COMPARISON Operand
Operand ::= NAME|STRING|NUMBER
Precedence rules:
precedence right JOIN, TIMES, UNION, MINUS, INTERSECT;
I. USER MANUAL
MySQL Database
To use this Relational Algebra interpreter, you must have a MySQL database available. This database will contain the actual tables and data on which you will perform the relational algebra queries.
Configuration File
To connect to the MySQL database from the RA interpreter, you need to use JDBC (the drivers are already included in the jar file; make sure you include the jar file in you CLASSPATH) which requires a config file like below:
DRIVER=com.mysql.jdbc.Driver
CONNECTION_STRING=jdbc:mysql://localhost/
DATABASE_NAME=dbname
USER_NAME=username
PASSWORD=password
You can customize the configuration file by providing your database name, username, and password. This will connect to the database 'dbname' in MySQL.
How to Invoke Interpreter
To invoke the interpreter you must run the command:
java RA configFile
where configFile
contains the information in the above section.
For example for the company database the command would look like this
java RA company
Another feature that can be added in the command line prompt is to view the SQL version of your RA query. To do so, you can add '-q' to the end of the command like so:
java RA configFile -q
Once one of these commands is entered, a prompt that says RA> will show up where you will be able to enter your queries.
Interpreter Commands
There are 4 different commands that can be used in this interpreter, each is followed by a semicolon to terminate the command.
-
schema
this command will list all of the table names and their column names and data types
-
A query
to run a query you can simply type in the query on the prompt
-
source filename
there is also an option to put the query in a file and run it by calling that filename
the file may contain comments preceded by '//'
the end of the query will be determined by the placement of ';'
-
exit
exit will terminate and leave the interpreter
Sample Interaction
-
Display database schema
RA> schema; ********************************************* DEPARTMENT(DNAME:STRING,DNUMBER:NUMBER,MGRSSN:STRING,MGRSTARTDATE:STRING) DEPENDENT(ESSN:STRING,DEPENDENT_NAME:STRING,SEX:STRING,BDATE:STRING,RELATIONSHIP:STRING) DEPT_LOCATIONS(DNUMBER:NUMBER,DLOCATION:STRING) EMPLOYEE(FNAME:STRING,MINIT:STRING,LNAME:STRING,SSN:STRING,BDATE:STRING,ADDRESS:STRING,SEX:STRING,SALARY:NUMBER,SUPERSSN:STRING,DNO:NUMBER) PROJECTS(PNAME:STRING,PNUMBER:NUMBER,PLOCATION:STRING,DNUM:NUMBER) T1(A:NUMBER,B:NUMBER) T2(A:NUMBER,B:NUMBER) T3(C:NUMBER,D:NUMBER) WORKS_ON(ESSN:STRING,PNO:NUMBER,HOURS:NUMBER) ********************************************* RA>
-
Query from keyboard
RA> project[fname,minit,lname](employee join rename[ssn,a,b,c,d](dependent)); ANSWER(FNAME:STRING,MINIT:STRING,LNAME:STRING) Number of tuples = 6 John:B:Smith: Franklin:T:Wong: Alex:D:Freed: Bonnie:S:Bays: Alec:C:Best: Jennifer:S:Wallace: RA>
-
Execute query from file. Assume
queries/cq1.ra
contains a relational algebra query:Naveens-MacBook-Pro:compiler naveen$ more queries/cq1.ra //comment project[fname,lname,address]( (rename[dname,dno,mgrssn,mgrstartdate]( select[dname='Research'](department)) join employee ) ); Naveens-MacBook-Pro:compiler naveen$ java RA company RA> source queries/cq1.ra; ANSWER(FNAME:STRING,LNAME:STRING,ADDRESS:STRING) Number of tuples = 4 John:Smith:731 Fondren, Houston, TX: Franklin:Wong:638 Voss, Houston, TX: Joyce:English:5631 Rice, Houston, TX: Ramesh:Narayan:971 Fire Oak, Humble, TX: RA>
-
Exit the interpreter
RA> exit; Naveens-MacBook-Pro:compiler naveen$
II. IMPLEMENTATION
Phase 1 (Lexical analysis and parsing)
Lexical Analysis
We used jflex for lexical analysis. The lexical specifiation file is shown below:
LineTerminator = \r|\n|\r\n
WhiteSpace = {LineTerminator} | [ \t\f]
Project = [pP][rR][oO][jJ][eE][cC][tT]
Rename = [rR][eE][nN][aA][mM][eE]
Union = [uU][nN][iI][oO][nN]
Minus = [mM][iI][nN][uU][sS]
Intersect = [iI][nN][tT][eE][rR][sS][eE][cC][tT]
Join = [jJ][oO][iI][nN]
Times = [tT][iI][mM][eE][sS]
Select = [sS][eE][lL][eE][cC][tT]
And = [aA][nN][dD]
Name = [a-zA-Z][a-zA-Z0-9_]*
Numeric_constant = [0-9]+ | [0-9]+"."[0-9]* | "."[0-9]*
String_constant = ['][^'\n]*[']
Comparison = "=" | "<>" | "<" | ">" | "<=" | ">="
%%
/* ------------------------Lexical Rules Section---------------------- */
<YYINITIAL> {
";" { return symbol(sym.SEMI); }
{Comparison} { return symbol(sym.COMPARISON, new String(yytext())); }
{Project} { return symbol(sym.PROJECT); }
{Union} { return symbol(sym.UNION); }
{Minus} { return symbol(sym.MINUS); }
{Intersect} { return symbol(sym.INTERSECT); }
{Join} { return symbol(sym.JOIN); }
{Times} { return symbol(sym.TIMES); }
{Rename} { return symbol(sym.RENAME); }
{Select} { return symbol(sym.SELECT); }
{And} { return symbol(sym.AND); }
"]" { return symbol(sym.RBRACK); }
"[" { return symbol(sym.LBRACK); }
")" { return symbol(sym.RPAREN); }
"(" { return symbol(sym.LPAREN); }
"," { return symbol(sym.COMMA); }
{Name} { return symbol(sym.NAME, new String(yytext())); }
{Numeric_constant} { return symbol(sym.NUMBER,new String(yytext())); }
{String_constant} { return symbol(sym.STRING,new String(yytext())); }
{WhiteSpace} { /* just skip what was found, do nothing */ }
}
The above specification is relatively straight forward to understand (that is if you know regular expressions!)
Notice that all of the key words in the relational algebra are case insensitive. For example, the key word union
is specified by the regular expression
Union = [uU][nN][iI][oO][nN]
The numeric constants allow for both integers and real numbers. Notice that negative numbers are not allowed, if you are interested into looking into the source go ahead and update the numeric constant regular expression to allow negative numbers.
Parsing
We used jcup for parsing. Besides parsing we also construct a tree data structure to store the expression tree for the relational algebra query. The java code embedded within the grammar rules is relatively straight forward. One rule is illustrated below.
JoinExpr ::= Expr:e1 JOIN Expr:e2
{:
RANode n = new RANode();
n.setLchild(e1);
n.setRchild(e2);
n.setRnodetype("join");
RESULT = n;
:}
;
Typically we introduce java code at the end of the rule in which we construct the node in the expression tree using pointers for sub expressions (e1,e2).
Data Structure
We illustrate the data structure tree with the following query.
project[B](
rename[B](bar)
join
rename[B,R](serves)
join
project[R](select[D='Clark'](rename[D,R](likes)))
);
The expression tree for the above query is shown below (The schema and relation name will be explained later):
MySQL Adaptor
We built a MySQL adaptor class, MySQL.java
, that provides the following methods:
public MySQL()
public static void open(String configFile)
public static void close()
public boolean relationExists(String rname)
public ArrayList<String> getAttributes(String rname)
public ArrayList<String> getDataTypes(String rname)
public void displayDatabaseSchema()
public void displayQueryResults(String query, RANode root)
//query is the MySQL query generated for the expression
//root is top node of expression tree
A typical usage of the adaptor is as follows:
MySQL m = new MySQL();
m.open("configFile");
... call other methods
...
m.close();
Phase 2 (Semantic checks and preparation for phase 3)
Semantic Checks
At this point the input given by the user has passed all syntax checks, now semantic checks will be performed.
Base case: the input is a table
The table must be present in the database.
Minus, Union, or Intersect
The number of columns and datatypes of the two tables must be equal.
Times and Join
No semantic check
Project
All of the attributes listed must be in the schema of the subquery.
Rename
The size of the attribute list must be the same as the size of the schema of the subquery.
Select
The data types of the left operand and the right operand must be equal, and any table used must be in the database. All attributes references in the select condition must be present in the schema of the subquery
Schema and Datatype Computation
Base case: the input is a table
Set the schema and datatypes to the schema and datatypes of the given table.
Minus, Union, or Intersect
Set the schema and datatypes to that of the left expression (chosen arbitrarily; both expressions have the same schemas and datatypes)
Times
Set the schemas and datatypes to that of the left and right expressions, adding the table name in front of the columns that may repeat in the schema.
Join
Set the schemas and datatypes to that of the left and right expressions, but in the right expression, only set the columns that dont appear in the left expression.
Project
Set the schema and datatypes to that of the attribute list given.
Rename
Set the schema to that of the attribute list, and set the datatypes to that of the subquery.
Select
Set the schema and datatypes to that of the subquery.
Temporary Table Name Generation
Temporary table names are generated due to syntax rules of MySQL. They are generated recursively giving a table name tempN to each of the subqueries, where N starts at 0 and increments for each node that is not a basic relation(not a leaf node) based on the post order recursive path.
Phase 3 (Query translation and execution)
Query Translation
For each expression we put a left and right parenthesis around it to ensure it executes in the proper order.
Base case: the input is a table
(select * from table)
Inductive cases: Let lquery
and rquery
denote the left and right sql sub expressions. Also let tempL
and tempR
denote the relation names of the left and right child nodes and let tempN
be the relation name for the current node.
Union
(lquery union rquery)
Minus; assume c1,...,ck are the columns of lquery
(select * from lquery tempL where (c1,...,ck) not in (select * from rquery tempR))
Intersect; assume c1,...,ck are the columns of lquery
(select * from lquery tempL where (c1,...,ck) in (select * from rquery tempR))
Times
(select * from lquery tempL, rquery tempR)
Join; assume c1,...,ck are the columns of lquery and rquery with duplicates removed and assume jc1,...,jcn are the join columns
(select distinct (c1,...,ck) from lquery tempL, rquery tempR where (tempL.jc1=tempR.jc1 and ... and tempL.jcn=tempR.jcn))
Project; assume a1,...,an are the projection attributes
(select distinct (a1,...,an) from lquery tempL)
Rename; assume a1,...,an are the rename attributes and c1,...,cn are the columns of lquery
(select distinct (c1 a1, ...,cn an) from lquery tempL)
Select; assume lc1 and rc1, ..., lcn and rcn are the left and right operands for each conditions and there are n conditions. Also op is any operator.
(select * from lquery tempL where (lc1 op rc1 and ... and lcn op rcn))
Query Execution
We call the function displayQueryResults(query,tree) that is defined in the MySQL java class. This function uses JDBC to connect to the MySQL database and perform the query. The results of that query are then printed to the screen.