Thursday, September 16, 2010

Started the development of OrientDB clustering

Today is a new day, the day after the official release of OrientDB version 0.9.22. I apologize to everyone who requested issues that were planned for this release but were not finished and have been postponed to 0.9.24. What is the reason?

---> Replication, Clustering, Fault-Tolerance <---

It seems that the most missed feature in OrientDB is support for clustering, and with it the high scalability, high availability and high transaction volumes that a single node can't handle. Over the last few months I studied the different clustering architectures of other NoSQL solutions.

Today is a new day because I'll start the development of clustering for OrientDB with the following features:
  • Master-Slaves topology, with only one Master and N Slaves. If the Master crashes, a Slave is elected to be the new Master
  • IP multicast to discover cluster nodes
  • Configuration of nodes using TCP/IP, useful for clouds that don't allow IP multicast
  • Two sync modes: full, where the whole database is compressed and sent over the network, and partial, where only the changes made since the last sync are sent
  • A new database handled by the Master OrientDB Server instance to store all the pending records up to a configurable threshold. Above this threshold the logs are deleted and the node needs a full sync on startup
  • New console commands to display nodes, listen to clustering messages and elect the Master manually
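
To make the threshold idea from the list above more concrete, here is a minimal Java sketch of how a Master could decide between a partial and a full sync for a node. It is only an illustration under my own assumptions: the class PendingSyncLog, its methods and the use of plain strings as change records are all hypothetical names, not OrientDB code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not an OrientDB class: tracks the changes each node
// has not received yet and drops the log once it grows past a threshold.
class PendingSyncLog {
    enum SyncMode { FULL, PARTIAL }

    private final int threshold; // assumed configurable limit of pending changes per node
    private final Map<String, List<String>> pending = new HashMap<>();
    private final Set<String> needsFullSync = new HashSet<>();

    PendingSyncLog(int threshold) {
        this.threshold = threshold;
    }

    // Records a change that the given node has not received yet.
    void record(String nodeId, String change) {
        if (needsFullSync.contains(nodeId)) {
            return; // log already dropped, further changes arrive with the full sync
        }
        List<String> log = pending.computeIfAbsent(nodeId, k -> new ArrayList<>());
        log.add(change);
        if (log.size() > threshold) {
            // Above the threshold: delete the log and require a full sync.
            pending.remove(nodeId);
            needsFullSync.add(nodeId);
        }
    }

    // Sync mode the node should use when it starts up again.
    SyncMode modeFor(String nodeId) {
        return needsFullSync.contains(nodeId) ? SyncMode.FULL : SyncMode.PARTIAL;
    }

    // Changes to send in a partial sync (empty if none, or if a full sync is needed).
    List<String> changesFor(String nodeId) {
        List<String> log = pending.get(nodeId);
        return log == null ? Collections.<String>emptyList() : log;
    }
}
```

With a threshold of 2, a node that missed two changes would still get a partial sync with just those two records, while a third missed change would drop its log and switch it to a full sync.
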
Release 0.9.23 is planned for October 15th, 2010. Stay tuned and contribute with comments, ideas or critiques.


GertThiel said...

Did you choose the "master and multiple slaves" model for implementation simplicity?

I'm afraid that could result in something administration-wise complicated.

Considering Hazelcast, maybe a multi-master model might be possible?

I'm asking because I'm looking for a fast and flexible database that is as easy for the admin as for the developer ;-)


Gert Thiel

Luca Garulli said...

After some analysis of the (very different) architectures of other DBMSs, I realized that the single-master, multi-slave model is the best fit for OrientDB.

But I'm always open to better ideas ;-)

I know Hazelcast quite well because OrientKV (Key/Value) is based on it. But Hazelcast is not suitable for OrientDB because the requirements are different.

OrientDB is easy to configure and to develop with. The current SVN version already has auto-discovery of nodes using IP multicast.

Cloud said...

Does that mean we benefit from all of Hazelcast's added value, like the Hazelcast cluster monitoring tool and its other goodies, in OrientKV?