DynamoDB AWS Intro
Start from the CAP theorem.

  • Consistency
    • Consistency refers to the choice between eventually consistent and strongly consistent data.
    • In a replicated, distributed system, a request for data can be answered in two ways.
    • In the first, the system checks a single replication point and returns the result stored at only that point. If the data was updated before your request but the update has not fully replicated across all points, you may get stale data. This kind of retrieval is eventually consistent. There is a risk of retrieving stale data, but the requests are often much faster.
    • In the second, a strongly consistent request verifies that the data is the same across all replication points before returning a result. These requests provide the most up-to-date information, but they are often slower and more computationally expensive. Keep both in mind, as we will talk about them later.
  • Availability
  • Partition Tolerance
    • Partition tolerance is a requirement for our systems regardless of the use case.
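The two read paths described under Consistency can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: `ToyReplicatedStore` and its methods are hypothetical names, and the model does not describe how DynamoDB implements replication internally.

```python
class ToyReplicatedStore:
    """Toy model of a value copied to several replicas (not DynamoDB internals)."""

    def __init__(self, replica_count=3):
        # Each replica maps key -> (version, value).
        self.replicas = [{} for _ in range(replica_count)]
        self.version = 0

    def write(self, key, value, replicate_to=1):
        """Apply a write to only the first `replicate_to` replicas, simulating lag."""
        self.version += 1
        for replica in self.replicas[:replicate_to]:
            replica[key] = (self.version, value)

    def finish_replication(self, key):
        """Copy the newest version of `key` to every replica."""
        newest = max((r[key] for r in self.replicas if key in r),
                     key=lambda pair: pair[0])
        for replica in self.replicas:
            replica[key] = newest

    def read_eventual(self, key, replica_index):
        """Ask a single replica: cheap, but the answer may be stale."""
        return self.replicas[replica_index].get(key, (0, None))[1]

    def read_strong(self, key):
        """Consult every replica and return the newest value: more work, never stale."""
        candidates = [r[key] for r in self.replicas if key in r]
        if not candidates:
            return None
        return max(candidates, key=lambda pair: pair[0])[1]


store = ToyReplicatedStore()
store.write("song", "v1", replicate_to=3)   # fully replicated
store.write("song", "v2", replicate_to=1)   # update has reached replica 0 only
stale = store.read_eventual("song", replica_index=2)
fresh = store.read_strong("song")
```

Here the eventually consistent read hits a replica that has not yet received the update, while the strongly consistent read pays the cost of consulting every replica.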

There are some areas where NoSQL systems do not always have the advantage.

  • First, when working with a system that is self-managed, or DIY, provisioning still needs to be done for peak load. Depending on the system and how it is used, scaling horizontally just in time may not always be an option. In those cases, the system needs to be provisioned and ready ahead of the expected peaks in the workload.
  • Second, NoSQL databases are made up of denormalized data. Querying normalized, highly structured data can become very complex, but that structure also provides very strong analytical advantages, which denormalized data gives up.
  • Lastly, and this is a bit of a continuation of the second point, NoSQL systems do not have cross-table relationships. Where traditional databases support data access across tables with a single query, NoSQL options require additional tools or functions to make calls to multiple tables at once.
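To make the last point concrete, here is a minimal sketch of a join done in application code, since there is no single query that spans both tables. The two dictionaries stand in for tables, and the sample data and the function name `orders_with_customer_names` are made up for illustration.

```python
# Two toy "tables" keyed by their partition key (hypothetical sample data).
customers = {"c1": {"name": "Ada"}, "c2": {"name": "Lin"}}
orders = {
    "o1": {"customer_id": "c1", "total": 30},
    "o2": {"customer_id": "c1", "total": 12},
    "o3": {"customer_id": "c2", "total": 7},
}


def orders_with_customer_names(orders, customers):
    """Join each order to its customer in application code, one lookup per order."""
    joined = []
    for order_id, order in sorted(orders.items()):
        customer = customers.get(order["customer_id"], {})
        joined.append({
            "order_id": order_id,
            "customer": customer.get("name"),
            "total": order["total"],
        })
    return joined


report = orders_with_customer_names(orders, customers)
```

In a relational database the same result would come from one `JOIN`. Here every order costs an extra lookup, which is why NoSQL designs often denormalize the customer name into the order item instead.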

DynamoDB is a great solution for many use cases when looking for a NoSQL option.

  • First, because it’s a managed system, there is less focus on building and architecting the underlying system, and more ability to focus on performance. DynamoDB gives you the ability to directly manage the level of performance you want to achieve through adjustment of your read and write capacity units.
  • DynamoDB also provides a really strong feature set. With DynamoDB Streams, transactions, DynamoDB Accelerator (DAX), support for global tables, and more, you have several options to help you get the level and type of functionality you’re looking for.
  • With DynamoDB’s low latency, redundant storage, and managed durability and fault tolerance, you start to see why we have a course dedicated to the utilization and optimization of this service.
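As a rough illustration of the read and write capacity units mentioned above, the arithmetic can be sketched as follows. It assumes the standard sizing rules: one read capacity unit covers one strongly consistent read per second of an item up to 4 KB (or two eventually consistent reads), and one write capacity unit covers one write per second of an item up to 1 KB. The function names are hypothetical, and transactional requests are not covered.

```python
import math


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Estimate RCUs: item size is rounded up to the next 4 KB block per read."""
    units_per_read = max(1, math.ceil(item_size_bytes / 4096))
    if not strongly_consistent:
        # An eventually consistent read costs half as much.
        units_per_read = units_per_read / 2
    return math.ceil(units_per_read * reads_per_second)


def write_capacity_units(item_size_bytes, writes_per_second):
    """Estimate WCUs: item size is rounded up to the next 1 KB block per write."""
    return max(1, math.ceil(item_size_bytes / 1024)) * writes_per_second


# Example workload: 6 KB items, 10 reads per second, 5 writes per second.
strong_rcu = read_capacity_units(6 * 1024, 10, strongly_consistent=True)
eventual_rcu = read_capacity_units(6 * 1024, 10, strongly_consistent=False)
wcu = write_capacity_units(6 * 1024, 5)
```

For this workload the sketch gives 20 RCUs for strongly consistent reads, 10 RCUs for eventually consistent reads, and 30 WCUs.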

A table is, in its most basic form, a collection of zero or more items. While I am showing this table as a grid, the table is, in fact, just items that are organized and can be queried through their keys.
The items themselves are made up of their attributes.
The attributes are the values associated with the keys that are set up for the table.
The only required attribute for all items is the primary key. In the simplest case, the primary key is a single attribute called the partition key. This key is there to provide the organizational structure for the table. In a table using only partition keys, no two items can have the same partition key value.

A table can also add a sort key to the primary key. In that case the partition key value or the sort key value can each be repeated across items, but no two items in the table can share the same combination of partition key and sort key.
To create an item, you supply a partition key value for it, as well as any other fields and attributes you want to add to that item.
You also have the ability to update your items by modifying any of the attributes or adding additional fields or attributes.
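Creating and updating an item can be sketched as the request parameters you would hand to the low-level boto3 DynamoDB client. The table name `Users`, the partition key `UserId`, and the helper names are assumptions for illustration, and every attribute is stored as a string to keep the sketch short.

```python
def build_put_item(table_name, user_id, attributes):
    """Parameters for PutItem: the partition key plus any other attributes."""
    item = {"UserId": {"S": user_id}}  # "UserId" is a hypothetical partition key
    for name, value in attributes.items():
        item[name] = {"S": str(value)}
    return {"TableName": table_name, "Item": item}


def build_update_item(table_name, user_id, attribute, new_value):
    """Parameters for UpdateItem: set one attribute, adding it if it is missing."""
    return {
        "TableName": table_name,
        "Key": {"UserId": {"S": user_id}},
        "UpdateExpression": "SET #attr = :val",
        "ExpressionAttributeNames": {"#attr": attribute},
        "ExpressionAttributeValues": {":val": {"S": str(new_value)}},
    }


put_params = build_put_item("Users", "u1", {"Name": "Ada"})
update_params = build_update_item("Users", "u1", "City", "London")

# With boto3 installed and AWS credentials configured, these could be sent as:
#   client = boto3.client("dynamodb")
#   client.put_item(**put_params)
#   client.update_item(**update_params)
```

Building the parameters separately from the network call keeps the sketch runnable without an AWS account.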

Whenever data is put into or modified within your table, the changes are replicated across multiple storage facilities within the AWS Region.
This is not replication that you have to manage, but this does provide you with some built-in durability for the data.
When reading the item, you have the option to do a strongly or an eventually consistent read.
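In the DynamoDB API this choice is a single flag on the read request. A minimal sketch of the GetItem parameters follows, again assuming a hypothetical `Users` table with a `UserId` partition key. Reads are eventually consistent unless `ConsistentRead` is set to true.

```python
def build_get_item(table_name, user_id, strongly_consistent=False):
    """Parameters for GetItem; ConsistentRead=True requests a strongly consistent read."""
    return {
        "TableName": table_name,
        "Key": {"UserId": {"S": user_id}},  # hypothetical partition key name
        "ConsistentRead": strongly_consistent,
    }


eventual_read = build_get_item("Users", "u1")
strong_read = build_get_item("Users", "u1", strongly_consistent=True)

# With boto3 and credentials available:
#   boto3.client("dynamodb").get_item(**strong_read)
```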

Limits

  • While the table size does not have a stated limit, individual items do. The max item size for DynamoDB items is 400 kilobytes. I bring this one up because an important consideration for DynamoDB usage is that it isn’t going to be used for your large data blobs in, say, the way that Amazon S3 would be. Instead, as an example, large objects could be stored in S3, while the metadata is being stored in DynamoDB.
  • Keys have minimum and maximum lengths. The minimum length for partition and sort keys is 1 byte. The maximum length for a partition key is 2048 bytes, and the maximum sort key length is half of this, at 1024 bytes.
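The limits above, and the pattern of keeping large objects in S3 with only metadata in DynamoDB, can be sketched as a pre-flight check. This is an approximation under stated assumptions: 400 KB is taken as 400 * 1024 bytes, `rough_item_size` only counts UTF-8 bytes of attribute names and values rather than reproducing DynamoDB's exact size accounting, and the attribute names `Payload`, `S3Bucket`, and `S3Key` are hypothetical.

```python
MAX_ITEM_BYTES = 400 * 1024        # assumes 1 KB = 1024 bytes
MAX_PARTITION_KEY_BYTES = 2048
MAX_SORT_KEY_BYTES = 1024


def key_length_ok(value, max_bytes):
    """Check that a string key is between 1 byte and `max_bytes` when UTF-8 encoded."""
    return 1 <= len(value.encode("utf-8")) <= max_bytes


def rough_item_size(item):
    """Approximate item size: bytes of attribute names plus bytes of their values."""
    total = 0
    for name, value in item.items():
        total += len(name.encode("utf-8"))
        if isinstance(value, bytes):
            total += len(value)
        else:
            total += len(str(value).encode("utf-8"))
    return total


def plan_storage(item, blob, bucket, object_key):
    """Keep small blobs inline; otherwise keep only an S3 pointer in the item."""
    inline = dict(item, Payload=blob)
    if rough_item_size(inline) <= MAX_ITEM_BYTES:
        return {"dynamodb_item": inline, "s3_object": None}
    pointer = dict(item, S3Bucket=bucket, S3Key=object_key)
    return {"dynamodb_item": pointer, "s3_object": blob}


small = plan_storage({"Id": "a1"}, b"x" * 10, "my-bucket", "a1.bin")
large = plan_storage({"Id": "a2"}, b"x" * 500_000, "my-bucket", "a2.bin")
```

The small payload stays inside the item, while the large one is returned separately for upload to S3 and the item keeps only the bucket and object key.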

If we just define the partition key to be Artist, it would not always be unique, as artists will have many songs. Because we must be able to uniquely identify items, we will also fill out the optional sort key to create what’s called a composite key.
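A composite key like this could be declared with CreateTable parameters along the following lines. The table name `Music` and the sort key name `SongTitle` are assumptions, since the text only names `Artist`, and on-demand billing is chosen just to keep the sketch short.

```python
def build_create_table(table_name="Music", partition_key="Artist", sort_key="SongTitle"):
    """Parameters for CreateTable with a composite (partition + sort) primary key."""
    return {
        "TableName": table_name,
        # Only key attributes are declared up front; other attributes are schemaless.
        "AttributeDefinitions": [
            {"AttributeName": partition_key, "AttributeType": "S"},
            {"AttributeName": sort_key, "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": partition_key, "KeyType": "HASH"},   # partition key
            {"AttributeName": sort_key, "KeyType": "RANGE"},       # sort key
        ],
        "BillingMode": "PAY_PER_REQUEST",
    }


create_params = build_create_table()

# With boto3 and credentials available:
#   boto3.client("dynamodb").create_table(**create_params)
```

With this schema, two items can share the same `Artist` as long as their `SongTitle` values differ.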

  • A secondary index essentially allows you to create a copy of your table with an alternate key schema, making other attributes suddenly queryable. You can do this easily by calling an API to create the secondary index for you, and this index is kept in sync with the base table asynchronously.
  • We can then run queries against that index using the newly defined partition and sort keys. There are two types of indexes we will go over in depth.
    • The first is called a local secondary index. This allows you to pick an alternate sort key.
    • The second is called a global secondary index, and this allows you to create an alternate partition and sort key.
  • We can query non-key attributes without a scan, but it requires that we design our table to allow us to do so through the use of secondary indexes.
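As a sketch of the secondary index workflow described above, the parameters below add a global secondary index to an existing table and then query it. The table name `Music`, the index name `GenreAlbumIndex`, and the attributes `Genre` and `AlbumTitle` are all hypothetical. The sketch also assumes an on-demand table; a provisioned table would additionally need `ProvisionedThroughput` inside `Create`.

```python
def build_add_gsi(table_name, index_name, partition_key, sort_key):
    """Parameters for UpdateTable that create a global secondary index."""
    key_schema = [
        {"AttributeName": partition_key, "KeyType": "HASH"},
        {"AttributeName": sort_key, "KeyType": "RANGE"},
    ]
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": partition_key, "AttributeType": "S"},
            {"AttributeName": sort_key, "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [{
            "Create": {
                "IndexName": index_name,
                "KeySchema": key_schema,
                # Copy every attribute of the base item into the index.
                "Projection": {"ProjectionType": "ALL"},
            }
        }],
    }


def build_index_query(table_name, index_name, partition_key, value):
    """Parameters for Query against the index, using its own partition key."""
    return {
        "TableName": table_name,
        "IndexName": index_name,
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": partition_key},
        "ExpressionAttributeValues": {":pk": {"S": value}},
    }


gsi_params = build_add_gsi("Music", "GenreAlbumIndex", "Genre", "AlbumTitle")
query_params = build_index_query("Music", "GenreAlbumIndex", "Genre", "Jazz")

# With boto3 and credentials available:
#   client = boto3.client("dynamodb")
#   client.update_table(**gsi_params)
#   client.query(**query_params)
```

The query names the index explicitly, so DynamoDB can look up items by `Genre` without scanning the base table.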
Author: Yuzu
Link: https://kamisu66.com/2022/05/29/Dynamo-DB-AWS-Intro/
Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless stating additionally.