AWS DynamoDB is a fully managed, proprietary NoSQL key-value and document database service from AWS. Amazon uses it on its own e-commerce website, so its performance and scalability are proven.
I have used it on a high-volume data project that needs more than 7,000 writes per second and generates around 50 GB of data daily. Designing applications for it takes effort, but it scales really well.
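As a back-of-the-envelope check on those numbers (my own arithmetic, assuming roughly 1 KB average items and standard provisioned capacity, where one write capacity unit covers one write per second of an item up to 1 KB):

```python
# Back-of-the-envelope sizing for the workload above. The 1 KB average item
# size is an assumption; with provisioned capacity, one write capacity unit
# (WCU) covers one write per second of an item up to 1 KB.
peak_writes_per_sec = 7000
daily_data_gb = 50
item_size_kb = 1  # assumed average item size

peak_wcu = peak_writes_per_sec * item_size_kb   # WCUs needed at peak
avg_writes_per_sec = daily_data_gb * 1024**2 / item_size_kb / 86400

print(peak_wcu)                   # 7000 WCUs at peak
print(round(avg_writes_per_sec))  # ~607 writes/s on average
```

The large gap between the average (~600 writes/s) and peak (7,000 writes/s) rates is exactly the kind of spiky workload where the throughput-management features discussed below pay off.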
In this article, you will see a few good reasons to evaluate AWS DynamoDB when you are considering MongoDB, Cassandra, or similar databases.
1. AWS DynamoDB is a managed service
Keeping a database running is no small job. If your data runs into terabytes and keeps growing, you need a team of infrastructure engineers to carry out the following tasks:
- Architecture & design for a multi-region, multi-partition, and redundant high-performance database.
- 24x7 monitoring of database node health.
- Database engine upgrade.
- OS upgrade.
- Regular disk and memory space planning, monitoring, and implementation.
- Computational power planning, monitoring, and implementation.
- Security audit & trail.
- Occasional database node maintenance and replacement.
If you use MongoDB or Cassandra and want to run the database with terabytes of data, you have to make sure all the above tasks are handled by an infrastructure team.
AWS DynamoDB, being a managed service, frees you from all of these tasks. You just create tables and start pouring in data.
It reduces database infrastructure management costs to near zero, which is one of its biggest selling points.
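To illustrate how little setup is involved, here is a hedged sketch of the parameters you might pass to DynamoDB's CreateTable API. The table and attribute names are made up; with the AWS SDK for Python you would pass this dict to `boto3.client("dynamodb").create_table(**params)`. Here we only build and inspect the request:

```python
# Hypothetical CreateTable request parameters (table/attribute names are
# illustrative, not from the article). This is the entire "infrastructure
# work" needed to get a table: a key schema and a billing mode.
params = {
    "TableName": "Orders",
    "AttributeDefinitions": [
        {"AttributeName": "order_id", "AttributeType": "S"},  # S = string
    ],
    "KeySchema": [
        {"AttributeName": "order_id", "KeyType": "HASH"},  # partition key
    ],
    "BillingMode": "PAY_PER_REQUEST",  # no capacity planning up front
}

print(params["TableName"])
```

Everything else on the task list above (replication, partitioning, upgrades, capacity) is AWS's problem, not yours.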
2. Even Petabytes of data are fine
AWS DynamoDB has no limit on table size, so even petabytes of data are handled with the same performance. All data is stored on solid-state drives (SSDs).
3. Easy read and write throughput management
AWS DynamoDB is a true cloud database. It provides the following options to manage read and write throughput elastically:
- Auto scaling - You define how a table's read and write capacity should increase and decrease; when consumption crosses a target percentage of the provisioned throughput, AWS automatically adds or removes the partitions required to handle the new throughput. This keeps the partition count optimal for demand and reduces cost.
- Run a cron job that triggers the change in read and write throughput for the table, using AWS CLI commands in a script.
- Manually change throughput from the management console.
A change in table throughput can result in the creation or deletion of partitions. AWS makes sure all of this happens without any downtime.
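For the cron or manual options, the underlying call is UpdateTable. A hedged sketch of the request (table name and capacity values are illustrative; with boto3 you would pass this to `client.update_table(**params)` from your scheduled script):

```python
# Hypothetical UpdateTable request to raise provisioned throughput ahead of
# an expected traffic spike (values are illustrative assumptions).
params = {
    "TableName": "Orders",
    "ProvisionedThroughput": {
        "ReadCapacityUnits": 500,    # assumed new read capacity
        "WriteCapacityUnits": 2000,  # assumed new write capacity
    },
}

print(params["ProvisionedThroughput"]["WriteCapacityUnits"])
```

A second cron job with lower values would scale the table back down after the spike, keeping costs proportional to demand.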
4. Automatic data and traffic management
DynamoDB automatically manages replication and partitioning of a table based on its data size. It continuously monitors the table's size and, when required, spreads the data across a sufficient number of servers replicated to multiple Availability Zones in a region. All of this happens without downtime and without our involvement.
5. On-demand backup & recovery for the table
DynamoDB is very unlikely to lose your data, because it replicates it across multiple fault-tolerant Availability Zones.
Keeping periodic backups of a table can save our face when an application corrupts data, and in some corporations there is a compliance requirement for it. DynamoDB provides a simple admin-console and API-based backup and recovery mechanism. Backup and recovery are very fast and complete in seconds regardless of the size of the table.
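The API side is a single CreateBackup call. A hedged sketch of its parameters (both names are illustrative; with boto3 you would call `client.create_backup(**params)`):

```python
# Hypothetical CreateBackup request parameters (names are illustrative).
# The backup completes in seconds regardless of table size and can later be
# restored to a new table via the restore APIs.
params = {
    "TableName": "Orders",
    "BackupName": "Orders-before-migration",  # assumed naming convention
}

print(params["BackupName"])
```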
6. Point in time recovery
DynamoDB provides a point-in-time recovery feature that lets you restore a table to any point within the last 35 days (5 weeks). It works over and above the on-demand backup and recovery feature.
7. Multi-region global tables
AWS DynamoDB automatically syncs data between multiple regions for global tables. You just specify the regions in which you want the table to be available. Without global tables, you would be doing this on your own by writing code to copy data across regions.
It is really helpful if the application needs multi-region replication for performance reasons.
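Specifying the regions boils down to one CreateGlobalTable request. A hedged sketch (table name and region list are my own examples; with boto3 you would call `client.create_global_table(**params)`, and matching tables are expected to already exist in each listed region):

```python
# Hypothetical CreateGlobalTable request (names/regions are illustrative).
# After this, DynamoDB keeps the replicas in sync automatically.
params = {
    "GlobalTableName": "Orders",
    "ReplicationGroup": [
        {"RegionName": "us-east-1"},
        {"RegionName": "eu-west-1"},
    ],
}

print([r["RegionName"] for r in params["ReplicationGroup"]])
```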
8. Inbuilt in-memory caching service DAX (DynamoDB Accelerator)
Caching improves the performance dramatically and cuts the load on the database engine for reading queries.
DynamoDB Accelerator (DAX) is an optional caching layer that you set up with a few clicks. DAX is a cache purpose-built to work with DynamoDB, and you may prefer it over ElastiCache or self-hosted Redis when working with DynamoDB because of its performance and tight integration.
DynamoDB typically answers read queries in under 100 milliseconds; with DAX this improves further, and queries return in under 10 milliseconds.
9. Encryption at rest
DynamoDB's request/response protocol is HTTPS-based, just like many other NoSQL databases. Encryption at rest adds an extra layer of security by protecting data from unauthorized access to the underlying storage; sometimes it is required for compliance. It uses 256-bit AES encryption and encrypts table data as well as indexes. It works seamlessly with AWS Key Management Service (KMS) for the encryption keys.
10. Document and key-value item storage
DynamoDB can store JSON documents or key-value items in the table.
Like other NoSQL document databases, DynamoDB is schema-less. The key attribute is the only mandatory attribute of an item.
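As a sketch, a single item can mix scalar values, nested documents, and lists. In DynamoDB's low-level JSON wire format, every value is tagged with its type; the attribute names below are made up for illustration:

```python
import json

# A hypothetical item in DynamoDB's low-level wire format, where each value
# is wrapped in a type descriptor: S = string, N = number, M = map, L = list.
item = {
    "user_id": {"S": "u-123"},   # key attribute: the only mandatory one
    "age": {"N": "34"},          # numbers travel over the wire as strings
    "address": {"M": {           # a nested JSON document
        "city": {"S": "Pune"},
        "zip": {"S": "411001"},
    }},
    "tags": {"L": [{"S": "premium"}, {"S": "beta"}]},
}

print(json.dumps(item["address"]["M"]["city"]))
```

Beyond the key, any item can carry any attributes it likes, which is what "schema-less" means in practice.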
11. Eventual and strong consistency
DynamoDB offers two consistency modes for reads:
- Eventually consistent reads - the cheaper option; a query may or may not return the latest version of an item.
- Strongly consistent reads - use these if your application needs query results that always reflect the latest writes.
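Consistency is selected per read via the `ConsistentRead` flag. A hedged sketch of two GetItem requests (table and key names are made up; with boto3 you would pass these to `client.get_item(**params)`); strongly consistent reads consume more read capacity than eventually consistent ones:

```python
# Hypothetical GetItem requests showing the per-read consistency switch
# (table name and key are illustrative assumptions).
eventual = {
    "TableName": "Orders",
    "Key": {"order_id": {"S": "o-42"}},
    "ConsistentRead": False,  # default: eventually consistent, cheaper
}

# Same request, but guaranteed to reflect all prior successful writes.
strong = dict(eventual, ConsistentRead=True)

print(strong["ConsistentRead"])
```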
12. Time-to-live (TTL) items
This is one of DynamoDB's most powerful features, enabling use cases that are otherwise impossible without custom application code. You can have items deleted automatically after a certain amount of time by a background sweeper.
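TTL works by storing an expiry timestamp (epoch seconds) in an attribute you designate when enabling the feature; the sweeper removes items whose timestamp has passed. A sketch, where the attribute name `expires_at` is my assumption:

```python
import time

# Compute a TTL value: the epoch second at which an item becomes eligible
# for automatic deletion. The attribute name "expires_at" is illustrative;
# you choose the TTL attribute when enabling TTL on the table.
def ttl_after(days, now=None):
    now = time.time() if now is None else now
    return int(now) + days * 86400

session_item = {
    "session_id": {"S": "s-1"},
    "expires_at": {"N": str(ttl_after(7))},  # auto-delete ~7 days from now
}

print(session_item["expires_at"]["N"])
```

This is handy for session records, temporary tokens, or log items that should age out without any cleanup jobs.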
13. DynamoDB Streams
DynamoDB Streams are another powerful feature; they enable the execution of an AWS Lambda function when an item is created, updated, or deleted. Streams are similar to AWS Kinesis streams and support many use cases, e.g. building your own data pipeline that maintains aggregated records such as averages and sums, or sending an email when a new user record is inserted.
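A minimal sketch of a Lambda handler consuming a stream event for the "email on new user" use case. The record shape follows DynamoDB's stream format (each record carries an `eventName` and typed item images), while the attribute name `email` and the notification itself are assumptions left as a stub:

```python
# Minimal sketch of an AWS Lambda handler for a DynamoDB stream. Each record
# has an eventName (INSERT / MODIFY / REMOVE) and item images in DynamoDB's
# typed wire format. The "email" attribute is an illustrative assumption.
def handler(event, context=None):
    new_users = []
    for record in event.get("Records", []):
        if record["eventName"] == "INSERT":
            image = record["dynamodb"]["NewImage"]
            new_users.append(image["email"]["S"])
            # ...send a welcome email here (stub)...
    return new_users

# A synthetic stream event with one insert and one update.
sample_event = {
    "Records": [
        {"eventName": "INSERT",
         "dynamodb": {"NewImage": {"email": {"S": "a@example.com"}}}},
        {"eventName": "MODIFY",
         "dynamodb": {"NewImage": {"email": {"S": "b@example.com"}}}},
    ],
}

print(handler(sample_event))  # only the inserted user's email is collected
```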
14. Local DynamoDB setup
For ease of development and integration testing, you can use the DynamoDB Local distribution. It is a Java application and runs anywhere a Java Runtime Environment is installed.
One last thing I have not highlighted, but which is important: being part of the AWS cloud offering, DynamoDB can easily integrate with AWS Athena for big data computation needs. You can also integrate it with Apache Spark or other big data computation engines.
I suggest you try DynamoDB for your NoSQL needs and see if it fits. AWS provides a generous free tier to start with.