Cuebee
Cuebee is a flexible, extensible application for querying the semantic web. It provides a friendly interface that guides users through the process of formulating complex queries. No technical knowledge of query languages or the semantic web is required. The key enabler of the query builder is the ontology schema, which provides the types and possible interconnections of data used to guide the user in creating a query.
The query formulation process starts with a standard information retrieval task, in which the user looks for a term to start building his or her query. A term can be a variable, instance or class in the ontology. Then, by reading the ontology schema, the system retrieves all possible properties that apply to the selected initial term and presents them in a list for the user to choose from. When a property is chosen, another background query to the schema retrieves the possible classes in the range of that property. The process continues iteratively until the intended query is achieved.
The system then encodes the user query into SPARQL and submits it to a server. Cuebee aims at SPARQL Protocol compliance, and is thus able to plug-and-play with any SPARQL Protocol compliant server such as Joseki, Virtuoso, D2R-Server, etc.
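The schema-guided steps described above can be sketched roughly as follows. This is a minimal illustration in Python, not Cuebee's actual code: the function names and the `example.org` URIs in the usage note are hypothetical. It assumes the schema declares properties with plain `rdfs:domain` and `rdfs:range` statements, that the starting term is a class, and that the query is a simple linear path.

```python
from urllib.parse import urlencode

RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def properties_query(class_uri):
    """Schema query: properties whose rdfs:domain is the selected class."""
    return (
        f"PREFIX rdfs: <{RDFS}>\n"
        f"SELECT DISTINCT ?property WHERE {{ ?property rdfs:domain <{class_uri}> . }}"
    )


def range_query(property_uri):
    """Schema query: classes in the rdfs:range of the chosen property."""
    return (
        f"PREFIX rdfs: <{RDFS}>\n"
        f"SELECT DISTINCT ?class WHERE {{ <{property_uri}> rdfs:range ?class . }}"
    )


def encode_path(start_class, steps):
    """Encode a linear path (start class, then (property, class) steps) as SPARQL."""
    patterns = [f"?v0 a <{start_class}> ."]
    for i, (prop, cls) in enumerate(steps, start=1):
        # Link the previous variable to a new one, then constrain its type.
        patterns.append(f"?v{i - 1} <{prop}> ?v{i} .")
        patterns.append(f"?v{i} a <{cls}> .")
    variables = " ".join(f"?v{i}" for i in range(len(steps) + 1))
    body = "\n  ".join(patterns)
    return f"SELECT {variables} WHERE {{\n  {body}\n}}"


def protocol_url(endpoint, query):
    """SPARQL Protocol query via HTTP GET: the text goes in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})
```

For example, `encode_path("http://example.org/Paper", [("http://example.org/hasAuthor", "http://example.org/Person")])` yields a query over `?v0` and `?v1`, and `protocol_url` turns it into a request URL for a protocol-compliant endpoint. A real deployment would likely also need to account for subclass relations and instance or variable starting terms, which this sketch ignores.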
Demonstration
See:
- DBpedia Live Demo, 2010: Querying an encyclopedia of facts with Cuebee (coming soon)
- Semantic Web Dogfood, 2010: Querying Semantic Web conferences data. (Thanks to Jie Bao et al. for the data)
- Twarql Demo Video, 2010: Querying streams of annotated tweets using Cuebee and Twarql.
- Cuebee Live Demo, 2008: Query a dataset of semantic web conferences, papers and authors.
- TcruziKB Demo Video, 2007: Querying and analyzing a Genome Knowledge Base (uses an old version of Cuebee)
Publications
- P.N.Mendes. "Cuebee: Knowledge Driven Query Formulation". Project Report, Kno.e.sis Center, Wright State University. September 17, 2008. download
- P.N.Mendes, B.McKnight, A.P.Sheth, and J.C.Kissinger, “TcruziKB: Enabling Complex Queries for Genome Data Exploration,” in Proceedings of the 2nd IEEE International Conference on Semantic Computing, Santa Clara, CA, August 4–7, 2008. download
- P.N.Mendes, A.P.Sheth "Complex Queries for Hypothesis Validation on the Web". Poster at Expanding Secondary Use of Health Data: An NSF Biomedical Informatics Workshop (BioMedWeb2007). December 4th & 5th, 2007. download
Architecture
Cuebee's architecture is MVC-inspired and has the following main modules:
- View (Input and Output Widgets)
- Query Builder: interface that guides the user on specifying a query based on ontology schemas
- Results Explorer Widget: we provide a reference implementation of a widget that can subscribe to Cuebee to be notified of results to be displayed. Other developers can easily extend Cuebee to display results in any way they want. We have done that for many projects using Cuebee, resulting in the development of TGGraphExplorer, DataGridExplorer, TweetExplorer, etc.
- Model (Data representation within Cuebee)
- Every web resource in Cuebee is an Item that encapsulates URI, label, type, etc. Every component within the architecture handles Items, facilitating the exchange of information between parts. Transformation of triples into Items results in a subject-centric view of the resources.
- Data can be read from SPARQL XML or SPARQL JSON formats, amongst others. It is parsed and transformed into Cuebee's internal format, which is then recognized by any component in the architecture.
- Controller: core component that takes care of the workflow, executing queries and notifying observer widgets
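As a rough illustration of the Model and Controller roles listed above, the sketch below groups SPARQL JSON results into subject-centric Items and notifies subscribed widgets. It is an assumption-laden sketch in Python rather than Cuebee's own implementation: the `Item` fields beyond URI, label and type, the `?s ?p ?o` variable names, and the `Controller` methods are all hypothetical.

```python
from dataclasses import dataclass, field

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


@dataclass
class Item:
    """Hypothetical Item: one web resource with its URI, label, types, etc."""
    uri: str
    label: str = ""
    types: list = field(default_factory=list)
    properties: dict = field(default_factory=dict)


def items_from_sparql_json(results):
    """Group ?s ?p ?o bindings (SPARQL JSON results) into one Item per subject."""
    items = {}
    for row in results["results"]["bindings"]:
        s, p, o = row["s"]["value"], row["p"]["value"], row["o"]["value"]
        item = items.setdefault(s, Item(uri=s))
        if p == RDF_TYPE:
            item.types.append(o)
        elif p == RDFS_LABEL:
            item.label = o
        else:
            item.properties.setdefault(p, []).append(o)
    return list(items.values())


class Controller:
    """Hypothetical controller: widgets subscribe and are notified of results."""

    def __init__(self):
        self._observers = []

    def subscribe(self, callback):
        self._observers.append(callback)

    def publish(self, items):
        # Notify every subscribed widget with the same list of Items.
        for callback in self._observers:
            callback(items)
```

A results widget would call `subscribe` with a function that renders a list of Items. Because every widget receives the same Item objects, a graph view and a table view could be swapped without touching the parsing step.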
See the flow of control in our architecture in these slides: Cuebee Architecture
Please contact Pablo if you'd like to use this project or join the development team.
Powered By
Cuebee is (or was) in use within the scope of many projects. Any changes we make to the trunk should not be project-specific. Here is a list of the known deployed Cuebee versions.
- Cuadro: photo sharing community (2005)
- TcruziKB: querying genomic databases (2006,2007)
- ProtozoaDB: querying genomic databases (2007)
- TcruziPSE: querying experimental data in collaboration with UGA (2008)
- Twitris: querying the D2R (Database to RDF) endpoint exposing the Twitris data as triples (2009)
- Twarql: subscribing to Twitter feeds defined by SPARQL queries (2010)
We invite you to commit the changes made to your specific deployment to a branch with your project name under our SVN. If you have fixes that are useful in general, post a message to the list and we will consider incorporating them into the trunk.
Please note that the content in this project has been shared under the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License. We will move to a more permissive license once we make the first release, but for now we need to protect our authorship. However, if you commit your changes back to our repository, you won't be sharing a derivative, but rather a branch of this very software, so you are welcome to use and adapt Cuebee to your needs, as long as you are contributing back to the main repository and sharing credit with the authors of the modules you use.
Known Limitations
Cuebee was created as a research prototype, and as such has incorporated new features faster than it has incorporated comprehensive tests, bug fixes or language support. One of the main reasons for bringing it to Open Source is to draw on the power of the community to iron out such issues. I list below a few known limitations and bugs. The estimated effort to implement each one is given in parentheses.
- Full SPARQL support
- As of now, Cuebee only supports a subset of SPARQL.
- I and V-shape queries are supported.
- T-shape queries are not supported. (medium)
- There are a few small bugs with literals. (tiny)
- Filters are not yet supported. (small)
- Cross browser compatibility (unknown)
Advanced Features
- Complex Queries
- Web Services: executing processes as part of query execution
- Rules: using saved queries to simplify query formulation for complex queries
- Stream Querying (Real-time Feed Updates)
- QueryingStreams: support for querying streaming data (e.g. Twitter) and updating result explorers in real time.
- Asynchronous Querying
- Logging: listing, canceling, restarting queries under execution
- PuSH: Real Time Delivery
- Customizing interface
- Input widgets: allowing users to add files, text, collections, etc. as part of a query formulation.
- Output widgets: presenting results in different formats: charts, graphs, tables, etc.