Tag Archives: database

Interview with Michael Krieger, Subject Matter Expert – Databases, SQL, Big Data, Storage and Industry Trends

SF: What topics are getting the most traffic when it comes to databases?

MK: Hot topics include hybrid structured/unstructured databases, especially the growth of NoSQL for big data (and other unstructured) usage, and the emergence of the cloud as a viable repository for databases, thanks to virtually ubiquitous bandwidth.

SF: SQL 2016… Modernization and upgrade to SQL 2016, PolyBase, JSON (a new capability within SQL 2016), open-source languages – how do they relate to databases, and then to SQL 2016?

MK: The emergence of JSON reduces the burden of data interchange, and it is rapidly gaining popularity. This alone may drive some modernization, as will the larger shift toward open-source solutions for many parts of enterprise private or hybrid clouds (or whatever we call datacenters these days).

SF: Data warehouse + big data: Storage of big data, unstructured + structured data (hybrid model) – what is MongoDB doing?

MK: Mongo’s NoSQL/JSON/Open-Source approach has led to rapid adoption; some sources cite MongoDB as the most popular database for new data stores.

SF: How are people searching for SQL Server? Is it just “SQL” or is it “SQL Server”? How are people referring to Microsoft products in the industry?

MK: We believe SQL still refers to the technology, while SQL Server refers to the MS product, particularly among enterprise IT, who view Oracle and SQL Server both as SQL products.

SF: How is the industry talking about data? “Big Data” has been a trend – is it still? What about the storage of data?

MK: Big Data is getting bigger, thanks to the emerging IoT market, which will bring yet more ways for us to collect information. The plummeting price of storage – including SSD storage – is also leading organizations to archive some data indefinitely, with the hope of utilizing not-yet-created analytics tools to mine existing data for new insights. Witness the cloud providers (like MS and Google) who offer unlimited storage to their enterprise cloud (O365, Google Apps) customers.

SF: What about enterprises specifically? Are there new trends in the industry we should be aware of?

MK: Security ALWAYS is number one, top of mind. Internet of Things (IoT) is bubbling up as are wearable devices. New cloud platforms for PaaS and agile development also are of interest. From a database perspective, the combination of JSON and NoSQL components driven by big, unstructured data will continue to put pressure on traditional database products.

SF: Thanks Michael.

MK: My pleasure.

About Michael Krieger: Michael Krieger lends his industry perspective, product marketing and management expertise to a broad variety of high-tech clients including SlashDot Media. Michael has been on the leading edge of technology for over 40 years, starting as a software and hardware developer. He has since held executive roles in marketing, product management, business development and Technology Market Research. A pioneer in many fields, he led and participated in teams that developed one of the first mainframe to PC connections, one of the first Multiprocessor PC Servers, the first Blade Server, and one of the first Cloud Computing SaaS services while employed by tech leaders such as AST Computer, Hitachi, and FutureLink. Michael also served over a decade as VP of the Ziff Davis Market Experts where he developed integrated marketing programs for a who’s who of technology and consumer electronics companies. Michael has been a speaker at numerous trade shows including Interop, CloudSlam, Blade Server Summit and the late Comdex.

Synchronize Databases with SymmetricDS

SymmetricDS is open source software that synchronizes data between databases on servers, desktops, and mobile devices. With support for most major database platforms and operating systems, it can integrate data in a heterogeneous environment that spans databases and networks. Because it operates asynchronously, data is synced in the background at a user-specified interval, which can be set to meet either batch or near-immediate integration requirements.

Data Synchronization Solutions

Flexible configuration allows for one-way or bi-directional replication, filtered synchronization, and transformation of data. SymmetricDS is used in four common solutions for the flow of data: bridging, consolidation, distribution, and multi-master.

Bridging or Active/Passive

Data is replicated in one direction from a primary database to a secondary one. The primary database is active while the secondary is passive or read-only. A typical pairing is an operational database that supports an application and a reporting database that offloads the work of analysis and reporting. Another use for the secondary database is as a backup that becomes active in the event of a failure or during maintenance of the primary. Sometimes the secondary database has a different structure or contains a subset of the data from the primary, and syncing data becomes a bridge that transforms the data between the systems. For example, an ecommerce site puts orders in its database and the data is bridged to fulfilment and call center application databases.
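The bridging idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how SymmetricDS is implemented or configured: the `orders` and `shipments` tables, the `transform` and `bridge` functions, and the id-based watermark are all hypothetical, and two in-memory SQLite databases stand in for the primary and secondary systems.

```python
import sqlite3

# Hypothetical schemas: the primary stores order prices in cents, while the
# secondary (say, a fulfilment system) wants a flatter table with totals.
primary = sqlite3.connect(":memory:")
primary.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, sku TEXT, qty INTEGER, price_cents INTEGER)"
)
primary.executemany(
    "INSERT INTO orders (id, sku, qty, price_cents) VALUES (?, ?, ?, ?)",
    [(1, "A-100", 2, 1250), (2, "B-200", 1, 999)],
)

secondary = sqlite3.connect(":memory:")
secondary.execute(
    "CREATE TABLE shipments (order_id INTEGER PRIMARY KEY, sku TEXT, qty INTEGER, total_dollars REAL)"
)


def transform(row):
    """Map a primary-side order row to the secondary-side shape."""
    order_id, sku, qty, price_cents = row
    return (order_id, sku, qty, qty * price_cents / 100)


def bridge(src, dst, last_synced_id):
    """Copy orders newer than last_synced_id one way, from src to dst.

    Returns the highest id copied so the next run can resume from there.
    """
    rows = src.execute(
        "SELECT id, sku, qty, price_cents FROM orders WHERE id > ? ORDER BY id",
        (last_synced_id,),
    ).fetchall()
    dst.executemany(
        "INSERT OR REPLACE INTO shipments VALUES (?, ?, ?, ?)",
        [transform(r) for r in rows],
    )
    dst.commit()
    return rows[-1][0] if rows else last_synced_id


watermark = bridge(primary, secondary, last_synced_id=0)
synced = secondary.execute("SELECT * FROM shipments ORDER BY order_id").fetchall()
```

Data only ever flows from `primary` to `secondary`, and the transformation step is where the two structures are allowed to differ.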

Data Consolidation

Data is combined from multiple databases into a single database. There may be two tiers or multiple tiers of databases that consolidate up the chain. For example, a retail company records sales on a database at the point of sale, which syncs to the store server, then syncs to a regional server, and finally syncs to the central office. Another example is medical devices that collect information and sync data to a hospital server.

Data Distribution

Data is managed centrally and subsets of rows are distributed to remote nodes. The data flows in the opposite direction of consolidation, from a large database to many smaller databases. For example, a retail company sets item prices by company, zone, or store in the central office database, and the data is automatically distributed to the corresponding retail store databases.
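As a rough sketch of the distribution pattern, the snippet below groups centrally managed rows by the node that should receive them. The `central_prices` data, the `store` column, and the `route_rows` function are assumptions made up for this example; SymmetricDS has its own configuration for this, which is not shown here.

```python
from collections import defaultdict

# Hypothetical central price table: each row is scoped to one store.
central_prices = [
    {"item": "milk", "store": "S1", "price": 1.99},
    {"item": "milk", "store": "S2", "price": 2.09},
    {"item": "bread", "store": "S1", "price": 2.49},
]


def route_rows(rows, key="store"):
    """Group central rows by the remote node that should receive them."""
    outbound = defaultdict(list)
    for row in rows:
        outbound[row[key]].append(row)
    return dict(outbound)


# Each store ends up with only its own subset of the central data.
batches = route_rows(central_prices)
```

Each remote node receives a subset of rows rather than a full copy, which is what keeps the many smaller databases small.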

Multi-Master or Active/Active

Data is replicated across all nodes, with all nodes actively participating. This configuration is used to improve access to data through high availability. For example, a parts manufacturer uses multiple websites located in the US and EMEA to provide fast access for its customers by using database replication.

Advantages of SymmetricDS

Compared to other synchronization or replication solutions, SymmetricDS has several advantages:

  • Database Independent – With 12 different databases supported as of version 3.1, SymmetricDS interoperates between multiple databases and operating systems. Adding support for more databases is easy because of a dialect abstraction layer.
  • Easy Configuration – Configuration is both easy and flexible with lots of options for advanced configuration and a plug-in architecture for customization.
  • Scaling and Performance – Large networks of thousands of nodes are possible because SymmetricDS is optimized to handle simultaneous requests and load data quickly. With settings for management of memory, connections, and clustering, a deployment can be tuned for both big and small environments.
  • Adverse Networks – Sync data across networks with low bandwidth usage and withstand periods of network outage. SymmetricDS guarantees replication by tracking data and automatically retrying errors.
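The "track and retry" idea behind the Adverse Networks point can be illustrated with a small Python sketch. It is a simplification under assumptions: `deliver_batches`, `flaky_send`, and the batch ids are hypothetical names, the simulated link fails once per batch, and a real system would also wait between passes and persist its tracking state.

```python
def deliver_batches(batches, send, max_attempts=5):
    """Try to send each tracked batch, retrying failures on later passes.

    batches: dict of batch id -> payload, the outgoing work still pending.
    send: callable that raises ConnectionError on a network failure.
    Returns the ids still undelivered after max_attempts passes.
    """
    pending = dict(batches)
    for _ in range(max_attempts):
        for batch_id in list(pending):
            try:
                send(batch_id, pending[batch_id])
            except ConnectionError:
                continue  # leave it pending; the next pass retries it
            del pending[batch_id]  # acknowledged, so stop tracking it
        if not pending:
            break
    return sorted(pending)


# A fake, flaky link for demonstration: every batch fails on its first attempt.
attempts = {}
delivered = []


def flaky_send(batch_id, payload):
    attempts[batch_id] = attempts.get(batch_id, 0) + 1
    if attempts[batch_id] == 1:
        raise ConnectionError("simulated outage")
    delivered.append(batch_id)


undelivered = deliver_batches({1: "rows 1-50", 2: "rows 51-80"}, flaky_send)
```

Because a batch is only dropped from `pending` after a successful send, a temporary outage delays delivery instead of losing data.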

With open source development, SymmetricDS enjoys the benefits of collaboration across companies, geographies, and markets. The feedback and assistance of users and developers worldwide have improved its features and quality. We’re excited by the growth of the project and we look forward to seeing more community involvement in the future.

The Anvil Podcast: Adminer

Rich: I’m speaking with Jakub Vrana, and he’s a member of the Adminer project.

If the embedded audio player below doesn’t work for you, you can download the audio in mp3 or ogg format.

You can subscribe to this, and future podcasts, in iTunes or elsewhere, at http://feeds.feedburner.com/sourceforge/podcasts, and it’s also listed in the iTunes store.

Rich: Thanks for speaking with us. I was wondering if you could tell us some more about the project – tell us what it does and how you got started working on it.

Jakub: The project is a web-based tool for database administration. I started with MySQL, but the project currently supports several other database engines, like Oracle, PostgreSQL, SQLite, and MSSQL. I started the project because I needed a web-based tool. There are great desktop tools, but they are all limited in their ability to connect to a remote server, because remote servers often forbid connections from desktop clients. I needed something web-based. I used primarily MySQL, and there’s the phpMyAdmin project, which lots of people use, but I don’t like it much. There’s also another problem – phpMyAdmin is very big, and if you want to use it on a server you must upload 700 files, several megabytes in total, which takes 10 minutes. Then you do something, and if it’s not your own hosting, you’re supposed to delete it afterwards. So I started with a small tool just for performing basic operations on a server with a MySQL database. But it grew, and currently it has all the features you need to manage a MySQL server or other databases.

Rich: What do you have in mind for future versions of the product? Are you thinking of supporting other databases, or do you have other features that you’re working on?

Jakub: Yes. Regarding other databases, I’m mostly done, because the project supports all the databases I need and use. So I don’t plan on supporting other databases, because I don’t like working on features which I don’t personally use. But the project is Open Source, of course, so anybody can add support for another database, and it’s quite simple. For example, MSSQL support was added primarily by someone else. I don’t plan on adding another engine.

I’m also not sure about more features, because currently the project is feature-complete. It supports all of the features of MySQL, and most of the features of other databases. My current focus is probably on allowing more extendibility of the project. Currently you can very easily change the design, and you can quite easily change how the program behaves. For example, if you wanted to add a custom button to send an email from the application to an address stored in the database, it’s quite simple. My primary focus will be on this – improving the extendibility, which is currently, I think, quite good. But there are not many extensions yet, so I think this needs some improvement.

Rich: I notice that you’re also involved in several other projects. Would you like to tell us something about those as well?

Jakub: Yes, sure. I created a small library called NotORM, which means that it looks like an ORM, but it’s not an ORM. It’s a library that allows you to access the database from PHP code, regardless of the type of database. MySQL, SQLite, Oracle and all other databases are supported. It allows you to very easily access the data in the database. With this very simple and easy-to-learn API, you also get very good performance, and the results are also extremely cacheable. It almost doesn’t allow you to perform inefficient queries against the database. All queries produced by NotORM tend to be performant. That’s another one of my projects.
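The interview does not say how NotORM keeps queries efficient, so the following is only an illustration of one common way a data access layer can avoid inefficient query patterns: fetching related rows in a single batched query instead of one query per parent row. It is written in Python with SQLite, not with NotORM's PHP API, and the `author` and `book` tables and the `books_by_author_batched` function are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book VALUES (1, 1, 'First'), (2, 1, 'Second'), (3, 2, 'Third');
    """
)

queries_run = 0


def query(sql, params=()):
    """Run a query and count it, so the number of round trips is visible."""
    global queries_run
    queries_run += 1
    return db.execute(sql, params).fetchall()


def books_by_author_batched():
    """Fetch all authors and their books using two queries in total."""
    authors = query("SELECT id, name FROM author ORDER BY id")
    ids = [author_id for author_id, _ in authors]
    placeholders = ",".join("?" * len(ids))
    books = query(
        f"SELECT author_id, title FROM book WHERE author_id IN ({placeholders}) ORDER BY id",
        ids,
    )
    titles = {author_id: [] for author_id in ids}
    for author_id, title in books:
        titles[author_id].append(title)
    return {name: titles[author_id] for author_id, name in authors}


result = books_by_author_batched()
```

The number of queries stays at two no matter how many authors there are, whereas a naive loop would issue one extra query per author.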

Rich: Is there anything else you’d like to tell us about your project?

Jakub: Just try it! If you haven’t tried it, just try it. If you use any other clients for communicating with the database, just try Adminer, and you’ll find that it’s very small. The whole application is just 300 kilobytes embedded in a single file that you can copy to the server. It looks like a very primitive tool, but it actually has more features than most competitors, so give it a try, and let me know how you like it.

Rich: Thank you very much for your time.

Jakub: Thank you for calling me.

Project Of The Month, January 2012 – HyperSQL Database Engine

We’re delighted to announce the first Project Of The Month for 2012, HyperSQL Database Engine.

(See previous Project Of The Month winners)

The Project Of The Month is selected from the projects that grew the fastest in the previous month, based on the activity of the project community, on mailing lists and ticket trackers, and the commit and release activity of the project. In coming months (starting with February) we’re going to involve you, the SourceForge community, more in this process, by having a Twitter-based vote on the POTM for February. Details coming soon.

HyperSQL DB is a database engine written in Java that can be embedded in Java applications, or it can be run as a standalone server and accessed over JDBC. It also contains some tools for making JDBC connections to other databases.

Rich: Today I’m speaking with Fred Toussi and Blaine Simpson. Hello?


Blaine: Hi Rich.

Fred: Hello Rich.

R: So, without further introduction, here’s my conversation with the two lead developers of this project.

Well, let’s jump right in with the first question. How long have you been doing this project?

F: Eleven years.

R: And have the two of you been involved that whole time?

F: I’ve been involved since the beginning. Blaine joined after about a year.

B: Long timers.

R: How big is the overall developer community that you see patches and commits from?

F: At the moment there are three of us. In the past, there have been others. People join in, do some work, finish that work …

R: How does this database compare to other databases that people might be more familiar with, like PostgreSQL?

F: In general, the two differences: Our database can use memory very well …

R: So it’s really fast …

F: It can run completely in memory without any files.

B: And on portable devices too.

F: Exactly. Another aspect – our database compares very well with those well-known ones for small and medium-sized data, in terms of speed and so on. Now, the definition of small and medium is just going up as hardware gets faster, and more disk space, etc. So at the moment, several hundred megabytes I would say, it’s extremely fast, and compares extremely well to Postgres and MySQL.

B: And there are huge advantages for all of the Java databases when integrating with Java applications. Not just HyperSQL, but JavaDB and Derby – those are a lot more efficient, and easier to integrate with Java applications.

R: So you sort of embed this in with your Java application? Is that how this works?

B: You can, and it’s very easy to either embed it in an application, or use a JDBC driver. All the popular databases have a JDBC driver. Even if it’s a completely C database, if it’s a popular database, it probably has a JDBC driver. It’s just a tiny jar file. So it’s very easy for any user, either a developer or someone just running a client, like running a spreadsheet, for them to get the JDBC driver. So we have a JDBC driver. They just get a jar, and they can use it with their Java app.

R: If somebody was looking to get involved in your project, what sort of openings do you have for someone who has the right skill set?

F: Every area is open to development. More compatibility with other databases – commercial databases, so that people can port applications more easily. There are lots and lots of possibilities. But it takes a lot of knowledge of databases and SQL to participate in some of the areas of the database. There are some other areas where they probably need knowledge of some particular API. And a good knowledge of Java, of course. But there are openings, and in the past we have had a few other core developers who have contributed to the project.

R: If there’s something you’re particularly passionate about on this project, what would that be?

F: Quality. Resilience. Basically quality explains it all. Resilience means it’s on 24/7. Nothing goes down, and it’s reliable.

B: Fred’s a tireless worker. If you look through the forum history, we’re on top of every problem. And as Fred says, he works very hard to keep everything reliable. He works on this full time, constantly. It’s a very solid product. We put a lot of effort in to make it very standalone. For most Java products, integrating or embedding a database is a lot of work, and a hassle dealing with all the interdependencies with other products. For example, if you just want to use a database for your Java product, so you pull down the jar for that database, usually you have to deal with all these interdependencies. It depends on these 12 libraries. They depend on another 100 libraries. And you have to … it’s a constant chore to make sure you’re at the right version. You don’t have that issue with hsqldb. It’s a single stand-alone jar. We don’t depend on anything else.

It means that some things are extra work. For example, for a tool, I have to generate some HTML. We have to do that from scratch, because we don’t want to depend on anything else. We don’t want to make that an extra chore for people who use the product.

F: I suggest Blaine describe the tool he’s working on, because it is a significant tool in its own right. It can be used with all other databases. So, Blaine, could you go ahead and describe that.

B: My main area of concentration with HyperSQL is the tool SQLTool, and it’s a generic JDBC client. JDBC is just accessing SQL databases over the Java protocol, JDBC. This tool can connect to any database that has a JDBC driver. Like I said a few minutes ago, all popular databases have a JDBC driver, so you can use this tool, SQLTool, to connect to and work with pretty much any relational database. It’s a command-line tool, so that means that if you’re looking for something graphical with buttons and pictures and graphical tables, this isn’t what you want. It’s for people who want to interactively work on the command line – type in SQL. And also for automation.

In that respect there are other tools that do the same thing, like Derby’s ij and Oracle’s SQL*Plus. It started as comparable to those tools, but it’s a lot more powerful for two reasons.

For one, those tools can’t work with just any JDBC database. They work with the database that they come with. SQLTool can work with any. You can write scripts and you can use the tool, the same exact commands, to work with any database.

The second advantage is that a lot of work has gone in to make it very stable and suitable for automation purposes. You can go to the command line, whether you’re on MacOS or Windows, or Unix, and right on the command line you can give a single command line with SQL in it. You don’t need to enter the program. You just give it on the command line. People who work with automation and scripting can recognize right away that that’s extremely useful.

A lot of work has gone in so that it gives a meaningful error status. It always ensures that it will return a success status if everything works. The developer has the control to use the default, which is if an error occurs just exit right away, depending on whether it’s interactive or not. But the error handling is intuitive. If you run it interactively and an error occurs, it tells you on the screen but it won’t exit. If you execute an SQL script with it, it will exit by default, and you’ll get an exit status. So it’s very easy to integrate that, put it in scripts or cron jobs, or autosys, or Windows scheduling systems, and send error notification through email or logging systems. Very suitable for automation tasks.
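To show why a reliable exit status matters for automation, here is a minimal Python sketch of a wrapper that runs a command non-interactively and reacts to its exit code. The job names and the `run_job` function are hypothetical, and the commands are stand-ins that simply exit with a chosen status; a real job would invoke the SQL client with its script instead, and that invocation is deliberately not shown.

```python
import subprocess
import sys


def run_job(command):
    """Run a command non-interactively and report (succeeded, exit_code)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode == 0, completed.returncode


# Stand-in commands: a real job would call the SQL client here instead.
ok_command = [sys.executable, "-c", "raise SystemExit(0)"]
failing_command = [sys.executable, "-c", "raise SystemExit(3)"]

for name, command in [("nightly-load", ok_command), ("nightly-report", failing_command)]:
    succeeded, code = run_job(command)
    if not succeeded:
        # A real wrapper might email or log here; printing keeps the sketch simple.
        print(f"{name} failed with exit status {code}", file=sys.stderr)
```

A scheduler such as cron only sees the exit status, so a client that exits non-zero on a failed script is what makes notifications like this possible.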

R: This sounds like it’s almost a full-time job. How much time do you all put into this project?

F: I’m working full time. Blaine has worked full time when he’s between assignments. He works part-time. There’s a lot of work that he has to do: maintaining the code repository (version control), handling releases (he has set up releases on Maven, on SourceForge, etc.), and the documentation, including basic generation of documentation, because we use DocBook markup. There is a lot of logistics involved in releasing new versions of the product, and Blaine has taken care of that as well as doing his SQLTool work, and in the past he has also contributed code to the engine.

R: So you have commercial clients.

F: Yes, we have.

B: My time investment has been very sporadic, so even though I have been working steadily at regular daytime jobs, there have been several weeks when I have worked 40 hours on Hsqldb. I just finished up one of those marathon runs about a week ago, so there were about three weeks there where I would get back from my day job every day and crunch at hsqldb.

R: I’m curious about something here – this is more of a philosophy question. You have a full-time job here, and you have commercial clients. What is the rationale for making this Open Source as well?

F: Because it was Open Source to start with. The commercial clients like the fact that it’s Open Source. These people contribute financially to the project to keep it going, so they want it to be as it is.

I do also have another product which is not Open Source, but it’s only used by a small percentage of the commercial users. Most of them want the Open Source product, which is good enough for their use.

B: And for the SQLTool portion, that gets a lot more use because it is Open Source. Millions of people can benefit by connecting to commercial databases like Oracle, and Open Source databases like MySQL, and everything in between. I put a lot of work into the product, so I get a lot of satisfaction from reaching a wider audience, being Open Source.

F: We’ve been in computing a long time – all three of us. I developed one of the early WYSIWYG word processors in the late 80s and early 90s. And then the platform it was on disappeared. I thought that the next time I did something, it would be completely cross-platform, and Open, and have a very long shelf life. That was probably the main motivation.

B: We appreciate the social mechanisms – the forums and mailing lists – on SourceForge, because for all these years it’s been nice to have that system in the background there working for us. Adding up all the time, I’ve spent only minutes administering those communications systems over the years. It’s nice to have that working so that I can concentrate on the code, and when people bring up problems, I can work on those problems and your communication and help systems take care of themselves.

F: We use the SourceForge platform for bugs and for communication – the forums. We’ve used them from the beginning, and they are adequate for our needs.

R: Thank you all so much for your time. I really appreciate it.

F: You’re welcome.

B: Thanks very much Rich.

The Anvil Podcast: Mardao

I recently spoke with Ola Sandström from the Mardao project and the interview is below.

Rich: I’m speaking with Ola Sandström, and we’re going to talk about the Mardao project.

Could you tell us what this project is, how it works, how it fits together, and how people use this in the real world.

Ola: Mardao is a tool that helps the database developer get the data out of the database into the application or the website, depending on what the app is. Mardao generates the data access objects, so that the developer doesn’t have to worry much about SQL statements, or relations, and so on. And it also saves the developer a lot of time and effort writing boilerplate code.
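For readers unfamiliar with the term, a data access object (DAO) wraps the SQL for one table behind ordinary methods. The sketch below is a hand-written Python example of that pattern, not output from Mardao (which targets Java): the `Customer` entity, the `CustomerDao` class, and the table layout are all hypothetical.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    id: Optional[int]
    name: str
    email: str


class CustomerDao:
    """Hand-written here; a generator would emit one such class per table."""

    def __init__(self, connection):
        self.connection = connection
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS customer "
            "(id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
        )

    def insert(self, customer):
        """Store the customer and fill in the id assigned by the database."""
        cursor = self.connection.execute(
            "INSERT INTO customer (name, email) VALUES (?, ?)",
            (customer.name, customer.email),
        )
        customer.id = cursor.lastrowid
        return customer

    def find_by_id(self, customer_id):
        """Return the matching Customer, or None if there is no such row."""
        row = self.connection.execute(
            "SELECT id, name, email FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return Customer(*row) if row else None


dao = CustomerDao(sqlite3.connect(":memory:"))
saved = dao.insert(Customer(id=None, name="Ada", email="ada@example.com"))
loaded = dao.find_by_id(saved.id)
```

Application code calls `insert` and `find_by_id` and never touches SQL directly. Classes like this are repetitive from table to table, which is the kind of boilerplate a generator can take over.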

R: How did you get started with this project? What kind of a problem were you trying to solve?

O: In the first place, we wanted something more effective than similar techniques such as Hibernate. I also had a colleague who had generated similar stuff, but taking a different approach. It was for a specific project that I created the more general tool.

R: Do you have a feel for how large your user community is?

O: I know how many downloads there have been, and I see how many downloads there are when we make each release. That varies between 50 and 100 downloads. Maybe the recurring usage is about ten or fifteen users. Hopefully the number of production systems is about the same.

It’s not a big community, but it is my first Open Source project, so I’m ok with that.

R: The developer community is just you? You’re the only person who works on this? Is that correct?

O: No, there is one more developer – a former colleague – who has focused on one of the implementing techniques. You can use Mardao either for Spring, or you can use it on top of JPA, or on Google App Engine. This former colleague of mine implemented the JPA port.

R: What do you have planned for future versions of the project?

O: The biggest thing right now is to support Android applications. There is a nice SQLite database on each Android device. It’s a very good fit to generate code for those databases. I think we’ll have the next version early next year.

I’ve been quite happy hosting at SourceForge, because I think you get the necessary tools, such as the Wiki and the issue tracker and so on. I certainly would consider starting another project there.

R: If someone wanted to get involved in an Open Source project, and they have some Java skills and database skills, what sort of an opening might there be on your project for such a developer? Is there a need that you have that you might welcome another developer for?

O: They certainly would be very welcome to join and commit. I think that if the user base grows a little bit – if I get more feedback – there would certainly be more areas where we would need to improve and so on. I’m not sure right now what the next big thing to focus on is, but I’m sure any developer would come up with ideas if they start using it.

R: Thank you very much.

O: Thank you. Bye.