OrientDB the fastest GraphDB available today?

Thursday, September 09, 2010

Two days ago I finished integrating the last part of Blueprints: the Index. OrientDB can now be used, just like Neo4J, with the entire TinkerPop stack. This includes the Gremlin language as well.

The first tests show that OrientDB outperforms Neo4J, the GraphDB market leader, in all the tests except iteration (and therefore counting). This is because the Blueprints implementation needs to create a new wrapper object around each of OrientDB's OGraphVertex and OGraphEdge objects. I have some ideas for improving it, but I need more time. Maybe in the coming weeks, or earlier if some users need it. With the native OrientDB Graph APIs, however, this overhead is removed.
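To make the wrapper overhead concrete, here is a minimal sketch in Python. It is not the OrientDB or Blueprints code: `NativeVertex` and `WrappedVertex` are hypothetical stand-ins for a native vertex object and a Blueprints-style adapter. It only shows that the adapter path allocates one extra object per element visited.

```python
class NativeVertex:
    """Hypothetical stand-in for a native vertex object."""

    def __init__(self, vertex_id):
        self.vertex_id = vertex_id


class WrappedVertex:
    """Hypothetical stand-in for a Blueprints-style adapter around a native vertex."""

    created = 0  # how many wrappers have been allocated so far

    def __init__(self, raw):
        self.raw = raw
        WrappedVertex.created += 1

    def get_id(self):
        return self.raw.vertex_id


def iterate_native(vertices):
    # Native path: hand back the stored objects directly.
    for v in vertices:
        yield v


def iterate_wrapped(vertices):
    # Adapter path: one extra allocation for every element visited.
    for v in vertices:
        yield WrappedVertex(v)


store = [NativeVertex(i) for i in range(5000)]
native_count = sum(1 for _ in iterate_native(store))
wrapped_count = sum(1 for _ in iterate_wrapped(store))
print(native_count, wrapped_count, WrappedVertex.created)
```

Both loops visit the same 5000 vertices, but only the wrapped one creates 5000 additional objects. That per-element cost is the kind of overhead that a full scan or a count pays in full.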

These are the results:

Test name	Description	OrientDB 0.9.22 (ms, less is better)	Neo4J (ms, less is better)	OrientDB vs Neo4J (+ faster, - slower)
testVertexEquality	1 vertex added and retrieved in	4.74	2.33	-203,4%
testRemoveVertexNullId	1000 vertices added in	216.73	2,070.74	+955,4%
	1000 vertices deleted in	1,093.71	1,910.02	+174,6%
testVertexIterator	5000 vertices added in	476.62	8,314.04	+1.744,4%
	5000 vertices counted in	86.94	1.60	-5.433,8% *
testAddManyVertexProperties	750 vertex properties added (with vertices being added too) in	72.34	43,437.29	+60.046,0%
testAddEdges	6 elements added and checked in	0.79	30.70	+3.886,1%
testAddManyEdges	3000 elements added in	2,314.44	8,031.12	+347,0%
	1000 edges counted in	8.45	12.54	+148,4%
	2000 vertices counted in	31.62	0.51	-6.200,0% *
	2000 vertices checked in	98.27	14.05	-699,4% *
testGetEdges	3 edges retrieved in	0.45	0.12	-375,0%
testRemoveManyEdges	200 vertices counted in	167.28	0.14	-119.485,7% *
	100 edges counted in	34.92	0.79	-4.420,3% *
	100 edges removed and graph checked in	20,555.44	332.54	-6.181,3% *
testStringRepresentation	1 graph string representation generated in	0.01	0.01	100,0%
testClear	75 elements added in	45.89	152.87	+333,1%
	75 elements deleted in	30.76	422.05	+1.372,1%
testRemovingEdges	500 vertices added in	133.62	974.37	+729,2%
	1000 edges added in	1,130.64	4,521.90	+399,9%
	1000 edges deleted (with size check on each delete) in	36,773.66	4,411.69	-833,6% *
testRemovingVertices	500 vertices added in	10.48	1,110.03	+10.591,9%
	250 edges added in	132.32	1,089.45	+823,3%
	500 vertices deleted (with size check on each delete) in	70,675.74	2,140.07	-3.302,5% *
testTreeConnectivity	1464 vertices added in a tree structure in	1,506.82	5,832.02	+387,0%
	1464 vertices iterated in	427.67	0.75	-57.022,7% *
	1463 edges iterated in	9.89	5.39	-183,5% *
testTinkerGraphEdges	graph-example-1 loaded in	43.75	541.15	+1.236,9%
testTinkerGraphVertices	graph-example-1 loaded in	6.81	520.29	+7.640,1%
testTinkerGraphSoftwareVertices	graph-example-1 loaded in	5.31	543.02	+10.226,4%
testTinkerGraphVertexAndEdges	graph-example-1 loaded in	5.01	544.17	+10.861,7%

* marks the tests that involve iteration
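Checking a few rows suggests that the last column is the slower time divided by the faster time, expressed as a percentage, with the sign saying whether OrientDB was the faster (+) or the slower (-) of the two. The sketch below reproduces that reading. The function name `ratio_label` is made up for illustration, and it prints a dot as the decimal separator where the table uses a comma.

```python
def ratio_label(orientdb_ms, neo4j_ms):
    """Return slower/faster as a signed percentage (assumed reading of the table)."""
    if orientdb_ms == neo4j_ms:
        return "100.0%"
    if orientdb_ms < neo4j_ms:
        # OrientDB took less time, so it is the faster one.
        return f"+{neo4j_ms / orientdb_ms * 100:.1f}%"
    # OrientDB took more time, so it is the slower one.
    return f"-{orientdb_ms / neo4j_ms * 100:.1f}%"


print(ratio_label(4.74, 2.33))    # testVertexEquality -> -203.4%
print(ratio_label(8.45, 12.54))   # 1000 edges counted -> +148.4%
print(ratio_label(0.01, 0.01))    # testStringRepresentation -> 100.0%
```

Read this way, "+955,4%" means Neo4J took about 9.5 times as long as OrientDB, not that OrientDB was 955 times faster.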

6 comments:

TinkerPop said...: I think that these tests are misleading because Neo4j is in AUTOMATIC transaction mode and thus, for every update to the graph, a new transaction is created. Blueprints is not a benchmark suite, but an operational semantics suite. You are comparing apples and oranges.; 6/10/10 19:49
Luca Garulli said...: Hi Marko,
this is a micro-benchmark and, like all micro-benchmarks, it does not measure the absolute performance of a product. A micro-benchmark aims to measure specific use cases.

Since we do not yet have a benchmark suite, the closest thing I know of for comparing two GraphDBs is the Blueprints unit tests, which work on thousands of vertices, edges, properties, indexes, etc.

Even if we had a benchmark suite, you would always find someone who says the comparison is not fair for N reasons... This is the magic of benchmarks.; 7/10/10 00:26
chubbsondubs said...: From the looks of the test it's hard to say that OrientDB is faster than Neo4J outright. For one, there are a lot of swings in the data, with Neo4J being 100,000x or 50,000x faster. However, the greatest swing in favor of OrientDB is 60,000x faster. The fact that there are such wild swings makes me think there might be a problem in the way the test was conducted. Assuming all is fine, I think overall performance would be about the same, if not a wash, because the gains you get in one would be offset by the magnitude of the loss during the reads.; 1/11/10 16:47
Luca Garulli said...: Hi,
if you look at the test types, Neo4J is faster than OrientDB only on counting and iteration. On the others (add, update and delete) OrientDB is far faster.

Iteration doesn't mean loading, just browsing all the vertices or edges in the database (by the way, OrientDB has its own methods for counting, but they are part of a custom API outside the Blueprints spec).

Browsing the whole database is not a very common use case. That is why I'm less interested in improving it than in improving, every day, all the other operations that are much more common: read, add, update and delete of vertices and nodes.; 1/11/10 18:02
Unknown said...: Luca,
Thanks for a great graph database. I am currently trying to read through all the information on your site and wiki.
I am primarily interested in search timings. Do you have any plans of doing any comparison regarding how long it takes to search for a node/property in the database?

Thanks; 8/4/11 06:29
Luca Garulli said...: Hi,
these results are quite old; OrientDB now performs much better.

Search time? I'd like to see 3rd party benchmarks, to avoid flame wars on my blog ;-); 8/4/11 09:54

Luca Garulli - The Zion City (OrientDB)
