Oracle NoSQL Database Vs. Cassandra


Oracle Competitive, May 2017

Introduction and Overview

Oracle NoSQL Database is a scalable, distributed key-value data store that uses Berkeley DB for storage. It provides highly reliable, flexible and available data management across a configurable set of storage nodes. Oracle NoSQL Database is a Java-based key-value store implementation that supports Table, JSON schema and key-value data models. The implementation uniquely supports a full range of transaction semantics, from ACID to relaxed eventual consistency, and has built-in support for online elasticity. Developers can choose from multiple programming language APIs, including REST, to store and retrieve data. It is designed to scale to large clusters and to geographically distributed data centers for disaster recovery. This is complemented by secondary index support and full text search. Oracle NoSQL DB offers comprehensive SQL-based declarative query capability and enterprise-class security. In addition, it is integrated with Oracle RDBMS, several other Oracle products and related open source technologies such as Hadoop/MapReduce. It can be deployed on the Oracle Big Data Appliance, which is an engineered system, as well as on commodity servers. The Oracle NoSQL Database Enterprise Edition server is available under a commercial license, and the Community Edition server is available under an open source license. Oracle NoSQL Database is available under the Apache 2.0 license.

Cassandra is a highly available, distributed database for managing large amounts of structured data across many commodity servers. Cassandra is a key-value store that supports a single value abstraction known as table-structure. It arranges nodes in a ring-based architecture where every node in the system can handle any read or write request, so a node acts as the coordinator of a request when it does not itself hold the data involved in the requested operation. Cassandra supports drivers for popular programming languages. It supports a SQL-like query language called CQL, and its support for secondary indexes and other indexing patterns enables users to retrieve data effectively. Cassandra integrates well with the Hadoop/MapReduce environment. The Cassandra Community edition is available under the Apache license, and Cassandra is also available under a commercial license as Datastax Enterprise Edition.
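
The ring-based request handling described for Cassandra above can be illustrated with a minimal Python sketch. This is a toy model under stated assumptions, not Cassandra's implementation: the `Ring` class, the `token` function, the node names and the MD5-based token scheme are all hypothetical and chosen only to show how any node can accept a request and act as coordinator when another node owns the key.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a ring position (illustrative hash, not Cassandra's partitioner)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class Ring:
    """Toy token ring: each node owns the arc that ends at its own token."""

    def __init__(self, node_names):
        # Assumption: one token per node, derived from the node name.
        self._entries = sorted((token(name), name) for name in node_names)
        self._tokens = [t for t, _ in self._entries]

    def owner(self, key: str) -> str:
        """Return the first node whose token is >= the key's token, wrapping around."""
        index = bisect.bisect_left(self._tokens, token(key)) % len(self._entries)
        return self._entries[index][1]

    def handle(self, contacted_node: str, key: str) -> str:
        """Any node accepts the request; it coordinates when it is not the owner."""
        owner = self.owner(key)
        if contacted_node == owner:
            return f"{contacted_node} serves '{key}' locally"
        return f"{contacted_node} coordinates '{key}' and forwards to {owner}"


ring = Ring(["node-a", "node-b", "node-c"])
for node in ["node-a", "node-b", "node-c"]:
    print(ring.handle(node, "user:42"))
```

Whichever node the client contacts, the key resolves to the same owner; only the role of the contacted node changes.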

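The overview mentions transaction semantics ranging from ACID to relaxed eventual consistency. A common way to reason about that range in replicated stores in general is the quorum overlap rule: with N replicas, a read that contacts R replicas is guaranteed to see a write acknowledged by W replicas when R + W > N. The sketch below only encodes that arithmetic; the function names are hypothetical and it is not the API of either product.

```python
def reads_see_latest_write(replication_factor: int, write_acks: int, read_replicas: int) -> bool:
    """Return True when every read set must overlap every write set (R + W > N)."""
    for count in (write_acks, read_replicas):
        if not 1 <= count <= replication_factor:
            raise ValueError("counts must be between 1 and the replication factor")
    return read_replicas + write_acks > replication_factor


def quorum(replication_factor: int) -> int:
    """Smallest majority of the replicas."""
    return replication_factor // 2 + 1


n = 3
print(reads_see_latest_write(n, write_acks=1, read_replicas=1))   # False
print(reads_see_latest_write(n, quorum(n), quorum(n)))            # True
print(reads_see_latest_write(n, write_acks=n, read_replicas=1))   # True
```

The first setting favours latency and availability, while the other two trade some of that for reads that always reflect the latest acknowledged write.
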
Comparison

Data Model

Oracle NoSQL Database: Oracle NoSQL Database has a flexible key-value data model. It also supports Table and JSON data models. The Table and JSON data models also support secondary indices and schema evolution. See also: Data Models.

Cassandra: Cassandra provides a table-based data model where rows are partitioned. The first component of a table's primary key is the partition key; within a partition, rows are clustered by the remaining columns of the key. Other columns can be indexed separately from the primary key.

Storage Model

Oracle NoSQL Database: Oracle NoSQL Database uses BerkeleyDB as the storage engine for each node. BerkeleyDB is a log-structured implementation proven in millions of deployments. It is an append-only implementation that enables efficient and extremely high write throughput.

Cassandra: Cassandra uses a log-structured engine that updates data with sequential I/O. During writes, Cassandra stores the data in memory and appends the same data to the commit log on disk. The data stored in memory is eventually flushed to disk.

Data Access and APIs

Oracle NoSQL Database: Oracle NoSQL Database has client library APIs for Java, Python, Node.js, C/C++ and REST. Oracle NoSQL DB also supports parallelized bulk insert and bulk retrieval APIs for high performance data access with very large data sets. See also: Client APIs.

Cassandra: Data in Cassandra can be accessed using CQL through the query shell, and through client drivers for Java, Python, C#, Ruby, Node.js, PHP, C++ and Apache Spark. The Cassandra bulk loader (sstableloader) provides the ability to bulk load external data into the cluster.

Query

Oracle NoSQL Database: Oracle NoSQL Database provides key-based access methods (put, get, delete), including multi-key variations with support for streaming large result sets. Oracle NoSQL DB supports SQL for read access via the shell as well as a programmatic API. The data can also be retrieved using secondary indices. Oracle NoSQL DB data can also be accessed from Oracle relational database or Hive using SQL queries. This access is further optimized and parallelized across the cluster to achieve improved query speed and better overall performance. See also: SQL Support, Query Key/Value API, Oracle RDBMS integration.

Cassandra: The Cassandra Query Language (CQL) is the primary language for communicating with the Cassandra database. Users can interact with Cassandra using the CQL shell, cqlsh. With cqlsh, users can create keyspaces and tables, insert into and query tables, and work with nested user-defined types and indexes. Syntactically, the query language is very similar to standard RDBMS SQL. Cassandra supports hash-based secondary indexes.

Transactions

Oracle NoSQL Database: Oracle NoSQL DB supports ACID transactions over multiple records that share the same shard key. It provides a choice of consistency and durability on a per-operation basis. This allows developers to make an appropriate tradeoff between performance and durability/consistency. See also: Consistency Explained.

Cassandra: Cassandra does not support ACID transactions but offers atomic, isolated, and durable transactions with eventual/tunable consistency. Cassandra trades transactional isolation and atomicity for high availability and fast write performance. Writes in Cassandra are durable, but consistent reads can be expensive, since a read must contact several replicas to find the "latest" version of the data.

Concurrency

Oracle NoSQL Database: Oracle NoSQL DB achieves concurrency with fine-grained locking. The locking is performed at the database record level. See also: Concurrent Processing.

Cassandra: Cassandra does not implement any locking mechanism. It relies on coordinator nodes to serve concurrent requests.

Replication

Oracle NoSQL Database: Oracle NoSQL DB supports replication for both availability and scalability, with automatic failover. The client driver is topology aware: it hashes insert operations to the elected master in the replica group, and the data is then replicated to the replica nodes. For every write operation, the user can choose the number of replicas to be updated before responding to the client, and whether the data should be written to memory or to disk on each node. Reads can be serviced from any node in a replication group. To maintain write availability when the replication factor is two, Oracle NoSQL DB supports an arbiter node. The arbiter does not hold any data; it holds only state information and participates in the primary election process. See also: Replication Groups.

Cassandra: Cassandra is designed as a peer-to-peer system that makes copies of the data and distributes the copies among nodes in a group. Replication uses the ring to determine the nodes that hold copies of data. The number of replicas is configurable, and each keyspace can have an independent replication factor. Cassandra supports two replication strategies. SimpleStrategy, the default, blindly writes the data to subsequent nodes along the ring. NetworkTopologyStrategy is primarily useful when deploying to multiple data centers. It ensures that data is replicated across data centers.

Sharding/Scalability

Oracle NoSQL Database: Oracle NoSQL DB is a sharded (shared-nothing) system that distributes data uniformly and automatically across the cluster based on the hashed value of the primary key. It uses the MD5 hashing algorithm over a fixed, highly granular partition definition. Oracle NoSQL DB supports online elasticity by redistributing data to newly added hardware. See also: Topology Changes.

Cassandra: In Cassandra, data distribution and replication go together. Virtual nodes assign data ownership to physical machines by allowing each node to own a large number of small data ranges. Cassandra offers three partitioners: Murmur3 (default), Random and ByteOrdered. Cassandra provides online cluster expansion to achieve scalability and elasticity.

Security

Oracle NoSQL Database: Oracle NoSQL DB supports the following security features: authentication and authorization, audit logging, role-based access control with support for custom defined roles, SSL encryption of network data, external password storage with Oracle Wallet and password store, and Kerberos support. Data at rest can be encrypted using file-system-based encryption. Oracle NoSQL DB provides a security configuration utility to administer security for the cluster. See also: Security Guide.

Cassandra: Apache Cassandra provides the following security features:
1. SSL Encryption from Client to Database
2. Authentication and Authorization
DataStax Enterprise Advanced Security provides additional extensions on top of the above:
1. Inter-node Encryption
2. Encryption at Rest
3. Data Auditing

Administration

Oracle NoSQL Database: Oracle NoSQL DB is simple and very easy to use. It can be set up to load and query data within just 5 minutes. Oracle NoSQL DB provides tools for capacity planning. Multi-node configuration is made easy with a single administration interface. It supports node upgrades and migrations as well as backup and recovery using file-system-based snapshots. NoSQL DB also supports high speed data export and import. Oracle NoSQL DB simplifies maintenance with comprehensive support for rolling upgrades. See also: Rolling Upgrades, Administration.

Cassandra: Cassandra administration requires a good understanding of its concepts. The cluster can be set up via a configuration file, so subsequent configuration changes and upgrades could become error prone in the absence of an admin tool. Datastax Enterprise Edition offers the Ops Center tool, which provides a web-based UI for monitoring and administering a Cassandra cluster. Apache Cassandra does not have native support for online rolling upgrades.

Multi DataCenter Awareness

Oracle NoSQL Database: Oracle NoSQL DB supports multiple data centers through a non-electable replica group strategy. Read requests use local nodes to satisfy latency demands. Write availability is achieved with a local quorum in the primary data center, and data is replicated to non-electable nodes in secondary data centers.

Cassandra: Cassandra allows multiple workloads to be run across multiple data centers using a snitch. Data can be replicated across the data centers automatically.

Monitoring and Administration

Oracle NoSQL Database: Oracle NoSQL DB provides a variety of protocols for monitoring the cluster. The proprietary protocols are supported in both browser-based and CLI interfaces. JMX facilitates integration with monitoring tools such as Enterprise Manager and BMC tools. See also: Standardized Monitoring.

Cassandra: Cassandra provides JMX support for monitoring operations. Cassandra offers command line tools for administration and monitoring. Datastax Enterprise Edition provides Ops Center, a graphical user interface for monitoring and administering the cluster.

Hadoop Integration

Oracle NoSQL Database: Oracle NoSQL DB is integrated with the Hadoop environment and can participate in MapReduce operations from it. Oracle NoSQL DB also supports running Hive queries. See also: Hadoop Integration, Hive Integration.

Cassandra: Cassandra is integrated with the Hadoop environment such that MapReduce jobs can retrieve data from and output to Cassandra. Cassandra includes native support for Apache Pig and Hive.

Large Object Support

Oracle NoSQL Database: Oracle NoSQL DB provides stream API support to read and write Large Objects (LOBs). See also: LOB Support.

Cassandra: Cassandra is not optimized for large file or BLOB storage, and a single blob value is always read and sent to the client in its entirety. It is advised to manually split large blobs (16MB) into smaller chunks.

Text Search

Oracle NoSQL Database: Oracle NoSQL DB is integrated with ElasticSearch. See also: Full Text Search.

Cassandra: There is no native support for full text search in Apache Cassandra. Datastax Enterprise Edition offers search support to query data using complex, sub-string, fuzzy and full text search queries.

Spatial and Graph Data

Oracle NoSQL Database: Oracle NoSQL DB supports graph data management by integrating with Oracle Big Data Spatial and Graph. Together, they can manage networks of linked data as vertices, edges, and properties of vertices that can be used to model, store and analyze relationships found in social networks, cyber security and knowledge networks.

Cassandra: There is no native support for managing spatial and graph data in Apache Cassandra. Datastax offers a graph database based on Datastax Enterprise Edition that is inspired by the open source Titan graph database.

Oracle Integration

Oracle NoSQL Database: Oracle NoSQL DB is well integrated with Oracle products including Oracle Relational Database, Oracle Coherence, Oracle Database Mobile Server, Oracle Enterprise Manager, Oracle Golden Gate, Oracle SQL Developer and Oracle Stream Explorer, among others.

Cassandra: Cassandra is not integrated with any Oracle products.

Time To Live

Oracle NoSQL Database: The Time To Live feature in Oracle NoSQL DB limits the lifetime of data in the store. TTL can be specified in hours and days using the Table API as part of DDL queries. Oracle NoSQL DB supports TTL at the row level.

Cassandra: Cassandra supports an optional expiration period for data, i.e. Time to Live. It can have a precision of a second.
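
The Time To Live behaviour described above (row-level expiry, with granularity ranging from seconds to hours and days) can be sketched generically in Python. This is an illustrative in-memory model only: `TtlStore`, its methods and the injected clock are hypothetical and do not correspond to the Table API of Oracle NoSQL DB or to CQL.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

HOUR = 3600
DAY = 24 * HOUR


class TtlStore:
    """Toy key-value store with optional per-row expiry."""

    def __init__(self, clock: Callable[[], float] = time.time):
        self._clock = clock
        self._rows: Dict[str, Tuple[Any, Optional[float]]] = {}

    def put(self, key: str, value: Any, ttl_seconds: Optional[int] = None) -> None:
        """Store a row; a row without a TTL never expires."""
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._rows[key] = (value, expires_at)

    def get(self, key: str) -> Optional[Any]:
        """Return the value, or None when the row is missing or has expired."""
        row = self._rows.get(key)
        if row is None:
            return None
        value, expires_at = row
        if expires_at is not None and self._clock() >= expires_at:
            del self._rows[key]  # lazily drop the expired row on access
            return None
        return value


now = [0.0]  # fake clock so the example is deterministic
store = TtlStore(clock=lambda: now[0])
store.put("session", "abc", ttl_seconds=2 * HOUR)  # hour granularity
store.put("cache", "xyz", ttl_seconds=30)          # second granularity

now[0] = 31
print(store.get("cache"))    # None
print(store.get("session"))  # abc

now[0] = 2 * HOUR
print(store.get("session"))  # None
```

Expiry is checked lazily on read here for brevity; a real store would also need background cleanup, which this sketch leaves out.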

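The Sharding/Scalability comparison describes distributing data by hashing the primary key with MD5 over a fixed, highly granular set of partitions. A minimal Python sketch of that idea follows. The partition count, the round-robin partition-to-shard assignment and all function names are assumptions made for illustration; the source does not specify how Oracle NoSQL DB maps partitions to shards.

```python
import hashlib
from collections import Counter
from typing import Dict

NUM_PARTITIONS = 1000  # hypothetical value; fixed for the lifetime of the store


def partition_for(primary_key: str) -> int:
    """Hash the key with MD5 and reduce it onto the fixed partition space."""
    digest = hashlib.md5(primary_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % NUM_PARTITIONS


def assign_partitions(num_shards: int) -> Dict[int, int]:
    """Map every partition to a shard (simple round-robin, for illustration only)."""
    return {partition: partition % num_shards for partition in range(NUM_PARTITIONS)}


def shard_for(primary_key: str, assignment: Dict[int, int]) -> int:
    """Look up the shard that currently holds the key's partition."""
    return assignment[partition_for(primary_key)]


keys = [f"user:{i}" for i in range(10_000)]
three_shards = assign_partitions(3)
four_shards = assign_partitions(4)  # cluster after adding hardware

print(Counter(shard_for(k, three_shards) for k in keys))
print(Counter(shard_for(k, four_shards) for k in keys))
```

The point of the fixed partition space is that a key's partition never changes; growing the cluster only changes which shard holds each partition. The round-robin reassignment used here moves more partitions than necessary, and a real system would aim to move as few as possible.
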
Figure 1 - Oracle NoSQL Database Performance Compared to Cassandra and MongoDB

Oracle Corporation, World Headquarters
500 Oracle Parkway
Redwood Shores, CA 94065, USA

Worldwide Inquiries
Phone: 1.650.506.7000
Fax: 1.650.506.7200

oracle.com

Copyright 2017, Oracle and/or its affiliates. All rights reserved. This document is provided for information purposes only, and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document, and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission.

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group. 0116

Oracle NoSQL Database vs Cassandra
May 2017
Author: Schulman/Nirhali
