Re: [Cppcms-users] high performance database

> So at this point my own applications - blog, wikipp run on top
> of MySQL that is quite fast.

Yes, it is fast; I'm impressed. A few days ago I was still unsure
whether C++ or Erlang is faster (I had found that it is possible to
develop web applications with Erlang), and I found something
interesting: in the benchmarks game, C++ wins, running 19 times faster
and using less memory. I read about it at
http://www.scribd.com/doc/29113347/Numerical-Comparison-Between-Erlang-and-C
and http://shootout.alioth.debian.org/u32q/which-programming-languages-are-fastest.php?calc=calculate&gpp=on&gcc=on&java=on&javaxint=on&jruby=on

What do you think about Erlang?
...
...
>
> Sounds interesting. But from what I see:
>
> - It works in memory... How does it scales when DB size grows
>  to hundreds of GB?

VoltDB runs in a 64-bit environment, so we can use more memory, and
VoltDB partitions our data and distributes it across the cluster. How
VoltDB distributes the data depends on how many replicas per partition
we configure for our DB. This setting is called k-safety.

So if our data grows, we just add a new node to the cluster (which
adds memory).
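
As a rough sketch of the sizing arithmetic behind "just add a node":
assuming each partition is stored k + 1 times and ignoring indexes and
runtime overhead, the memory left for unique data is about the cluster
total divided by k + 1. This is my own estimate, not a VoltDB formula,
and the function name and the 4 x 32 GB cluster are hypothetical.

```python
def usable_memory_gb(nodes: int, memory_per_node_gb: float, k: int) -> float:
    """Rough upper bound on the unique data a cluster can hold in memory.

    Assumption: k-safety keeps k + 1 copies of every partition, and
    indexes, runtime overhead, and headroom are ignored.
    """
    if nodes < k + 1:
        raise ValueError("need at least k + 1 nodes to place k + 1 copies")
    return nodes * memory_per_node_gb / (k + 1)


if __name__ == "__main__":
    # Hypothetical cluster: 4 nodes with 32 GB each.
    for k in (0, 1, 2):
        print(f"k={k}: about {usable_memory_gb(4, 32, k):.1f} GB of unique data")
```

Adding a fifth node at k=1 raises the estimate by half a node's memory,
because the other half goes to the extra copies.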

>
> - Its D requires running in hot cluster so each transaction should
>  be committed to other cluster to provide "D". So you always need
>  at least 2 nodes.

Yes, you are right. VoltDB also provides periodic snapshots to disk to
ensure data persistence, but they are not required for ACID transactions.

>
>  What happens if one crashes? How it provides D then?

With k-safety, we can choose the availability level we want. k=1 means
VoltDB keeps a copy of each partition on 2 nodes, k=2 on 3 nodes, etc.
So if we have at least 2 replicas, the data will be durable.
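
To make the "k=1 means 2 nodes" rule concrete, here is a minimal sketch
that places k + 1 copies of each partition on distinct nodes and checks
whether every partition still has a live copy after some nodes fail.
The round-robin placement and both function names are my own
illustration, not how VoltDB actually assigns replicas.

```python
def place_replicas(num_partitions, nodes, k):
    """Put k + 1 copies of each partition on distinct nodes (round-robin).

    Hypothetical placement scheme, used only to illustrate k-safety.
    """
    if len(nodes) < k + 1:
        raise ValueError("need at least k + 1 nodes")
    return {
        p: [nodes[(p + i) % len(nodes)] for i in range(k + 1)]
        for p in range(num_partitions)
    }


def survives(placement, failed):
    """True if every partition keeps at least one copy on a live node."""
    failed = set(failed)
    return all(
        any(node not in failed for node in copies)
        for copies in placement.values()
    )


if __name__ == "__main__":
    nodes = ["n1", "n2", "n3"]
    k1 = place_replicas(3, nodes, k=1)
    print(survives(k1, {"n1"}))        # one node down
    print(survives(k1, {"n1", "n2"}))  # two nodes down
```

With k=1 the cluster tolerates any single node failure; losing two nodes
that hold both copies of the same partition loses that partition.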

>
>  Does it really faster then committing data to high quality
>  data-storage?
>

It will be faster because, besides not suffering from disk logging,
VoltDB removes locking, buffer management, and latching. They remove
these because they are needed only in disk-based databases. According
to VoltDB, this technique saves 97% of the CPU work for "true data
processing".

They replace DB locking with simple parallel transaction queues (one
queue per partition) to benefit from multi-core CPUs. If we have a
4-core CPU, we can split our data into 4 partitions on a node.
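
The queue-per-partition idea can be sketched as follows: each partition
owns its data and one worker thread, and transactions for a key are
routed to that key's partition and run one at a time, so no lock is
needed on the data itself. All names here are hypothetical, this is not
VoltDB code, and Python threads only show the structure (they do not
give real multi-core speedup).

```python
import queue
import threading
import zlib


class Partition:
    """One slice of the data, owned by exactly one worker thread."""

    def __init__(self):
        self.counts = {}
        self.inbox = queue.Queue()
        self.worker = threading.Thread(target=self._run)
        self.worker.start()

    def _run(self):
        # Transactions run strictly one after another, so self.counts
        # is never touched by two threads at once and needs no lock.
        while True:
            txn = self.inbox.get()
            if txn is None:  # sentinel: shut the worker down
                break
            txn(self.counts)


def partition_for(key, partitions):
    # crc32 is stable across runs, unlike the built-in hash() on str.
    return partitions[zlib.crc32(key.encode()) % len(partitions)]


def vote(candidate):
    """Build a single-partition transaction that counts one vote."""
    def txn(counts):
        counts[candidate] = counts.get(candidate, 0) + 1
    return txn


def run_votes(votes, num_partitions=4):
    partitions = [Partition() for _ in range(num_partitions)]
    for candidate in votes:
        partition_for(candidate, partitions).inbox.put(vote(candidate))
    for p in partitions:
        p.inbox.put(None)
        p.worker.join()
    totals = {}
    for p in partitions:
        totals.update(p.counts)  # each key lives in exactly one partition
    return totals


if __name__ == "__main__":
    print(run_votes(["alice"] * 100 + ["bob"] * 50))
```

The only shared structure is the queue in front of each partition; the
data behind it is single-threaded by construction.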

You can try it with the voter sample application they provide, and
even a slow run will give you 1000 transactions per second :)

I copy their benchmark below for you:

> Hardware/Software used for the benchmark:
> Servers: (6) Dell R610 (each with 2 x Xeon 5530 CPU, 48GB DDR3 @ 1333Mhz, single 1Gb NIC, CentOS 5.4 64-bit)
> Client: (1) Dell R610: same specification as Servers
> Switch: (1) Dell PowerConnect 6248, 48 1Gb ports
> VoltDB: (6) partitions per server

> Benchmark Plan:
> 1. Determine how much capacity this configuration can support without k-safety (run the client without rate limiting), measure throughput and latency.
> 2. Rate limit the client ~10% below the throughput of #1, measure throughput and latency.
> 3. Finally, change the cluster to be k-safe (k=1) and rerun the client at 50% of the throughput from #2, measure throughput and latency.

> Results:
> 1. k=0, no rate limit on client, measured 531,685 transactions per second (TPS) @ 350.11ms latency
> 2. k=0, client rate limited to 500,000 TPS, measured 490,674 TPS @ 9.43ms latency
> 3. k=1, client rate limited to 250,000 TPS, measured 249,162 TPS @ 7.63ms latency
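
To put these numbers in per-partition terms, here is a small sketch of
the arithmetic. It assumes that with k=1 the 36 partition slots
(6 servers x 6 partitions) hold 18 unique partitions, and that the
reported latency is an average so Little's law (in flight = throughput
x latency) applies. Both are my assumptions, not part of the benchmark
report, and the function names are hypothetical.

```python
SERVERS = 6
PARTITIONS_PER_SERVER = 6

# (k, measured TPS, latency in ms), copied from the results above
RESULTS = [
    (0, 531_685, 350.11),
    (0, 490_674, 9.43),
    (1, 249_162, 7.63),
]


def unique_partitions(k):
    # Assumption: every partition occupies k + 1 of the partition slots.
    return SERVERS * PARTITIONS_PER_SERVER // (k + 1)


def tps_per_unique_partition(k, measured_tps):
    return measured_tps / unique_partitions(k)


def in_flight(measured_tps, latency_ms):
    # Little's law: average transactions in flight = throughput x latency.
    return measured_tps * latency_ms / 1000.0


if __name__ == "__main__":
    for k, tps, latency_ms in RESULTS:
        print(
            f"k={k}: {tps_per_unique_partition(k, tps):,.0f} TPS per unique "
            f"partition, about {in_flight(tps, latency_ms):,.0f} in flight"
        )
```

Under these assumptions each unique partition handles roughly 14,000
TPS in all three runs, and the jump to 350 ms in the unthrottled run
comes from the much larger number of transactions queued at once.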