DynamoDB is a fully managed, highly available, easily scalable, low-latency
NoSQL database. Based on the read and write capacity set by the user, it
provisions the needed infrastructure and simplifies database and cluster
management by partitioning the data for fast reads and writes.
This topic is extremely important
for passing the AWS Developer Associate exam. Many questions also appear on the
Solutions Architect Associate and Professional exams, especially ones that
present different architecting scenarios.
Concepts for the Exam
Certain DynamoDB topics are more likely to appear in one exam than in
others. To indicate which exam a topic is important to study for, I am
including that exam's abbreviation next to the topic. The abbreviations are as
follows: SAA – Solutions Architect Associate Exam, DA – Developer Associate
Exam, SOA – SysOps Administrator Exam, SAP – Solutions Architect
Professional Exam, and DOP – DevOps Professional Exam.
1. Partition (Hash) and Sort (Range) Keys
The partition key is the attribute of the table on
which DynamoDB builds a hash index; the hash value determines the
partition in which the item is stored.
The sort key is the attribute that orders the items sharing the same
partition key, so that a query can return them in sorted order.
Given an example of a table containing certain
attributes, the exam question may ask you to identify what should be the Partition
and Sort Keys. See such an example at the end.
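As a rough illustration of how the two keys work together, here is a minimal Python sketch. DynamoDB's real hash function and partition layout are internal, so the MD5 hash, the fixed count of 4 partitions, and the names `partition_for` and `put_item` are assumptions made only for this example.

```python
import hashlib
from bisect import insort

NUM_PARTITIONS = 4  # assumption: real partition counts are managed by DynamoDB


def partition_for(partition_key: str) -> int:
    """Map a partition key to a partition number with a stable hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS


def put_item(partitions, partition_key, sort_key, value):
    """Store the item in its partition, kept ordered by sort key."""
    bucket = partitions[partition_for(partition_key)].setdefault(partition_key, [])
    insort(bucket, (sort_key, value))


partitions = [dict() for _ in range(NUM_PARTITIONS)]
put_item(partitions, "game-1", 300, "alice")
put_item(partitions, "game-1", 120, "bob")
put_item(partitions, "game-2", 250, "carol")

# Items that share a partition key sit together, ordered by sort key.
game_1_items = partitions[partition_for("game-1")]["game-1"]
print(game_1_items)  # [(120, 'bob'), (300, 'alice')]
```

The point of the sketch is only that the partition key chooses the location, while the sort key fixes the order inside that location.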
2. Secondary Indexes DA SAA SAP
Apart from the primary key, one or more
secondary indexes on the table allow you to search the table efficiently,
avoid the scan operation, and partition and sort the data in an alternate way
without using the primary key. The two kinds of index are the Global Secondary
Index and the Local Secondary Index.
For the exam, understand the difference between the two:
Global Secondary Index – Both the partition and sort keys can be different from
those of the primary key. This index can be created and deleted at any time.
Local Secondary Index – Has the same partition key as that of the primary key, but
has a different sort key. This index can be created only during the creation
of the table.
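To make the difference concrete, here is a sketch of a table definition that carries one index of each kind. The parameter names follow the shape accepted by the boto3 `create_table` call; the table name `GameScores`, the attribute names, the index names, and the helper `partition_key` are hypothetical. The block only builds the request as a Python dict and does not call AWS.

```python
# Hypothetical table: primary key = GameId (partition) + UserId (sort).
table_definition = {
    "TableName": "GameScores",
    "AttributeDefinitions": [
        {"AttributeName": "GameId", "AttributeType": "S"},
        {"AttributeName": "UserId", "AttributeType": "S"},
        {"AttributeName": "Score", "AttributeType": "N"},
    ],
    "KeySchema": [
        {"AttributeName": "GameId", "KeyType": "HASH"},
        {"AttributeName": "UserId", "KeyType": "RANGE"},
    ],
    # Local Secondary Index: same partition key, different sort key.
    "LocalSecondaryIndexes": [
        {
            "IndexName": "ScoreByGame",
            "KeySchema": [
                {"AttributeName": "GameId", "KeyType": "HASH"},
                {"AttributeName": "Score", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    # Global Secondary Index: its own partition key (and optional sort key).
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "GamesByUser",
            "KeySchema": [
                {"AttributeName": "UserId", "KeyType": "HASH"},
                {"AttributeName": "Score", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
            "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
        }
    ],
    "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
}


def partition_key(key_schema):
    """Return the HASH attribute name from a KeySchema list."""
    return next(k["AttributeName"] for k in key_schema if k["KeyType"] == "HASH")


table_pk = partition_key(table_definition["KeySchema"])
lsi_pk = partition_key(table_definition["LocalSecondaryIndexes"][0]["KeySchema"])
gsi_pk = partition_key(table_definition["GlobalSecondaryIndexes"][0]["KeySchema"])
print(table_pk, lsi_pk, gsi_pk)  # GameId GameId UserId
```

With boto3 installed and credentials configured, a dict like this could be passed as `boto3.client("dynamodb").create_table(**table_definition)`.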
3. Provisioned Throughput Calculations DA
By far, this is the most important topic,
especially for the Developer Associate exam. You can expect a couple of
questions to calculate the read and/or write throughput for a given scenario.
In the DynamoDB configuration settings, based on the
workload of the application, the user provisions a certain amount of read and
write capacity. This capacity is measured in read and write capacity units.
The amount of capacity consumed for read
operations depends on the desired read consistency – Eventual or Strong. See #4
below for further explanation.
The throughput capacity in terms of read
capacity units and write capacity units is measured as follows:
One read capacity unit represents one
strongly consistent read per second, or two eventually consistent reads
per second, for an item up to 4 KB in size. The total number of read
capacity units required depends on the item size, and whether you want an
eventually consistent or strongly consistent read.
One write capacity unit represents one
write per second for an item up to 1 KB in size. The total number of
write capacity units required depends on the item size. (Writes have no
eventual or strong consistency option.)
For example, suppose that you create a table
with 5 read capacity units and 5 write capacity units. With these settings,
your application could:
Perform strongly consistent reads of up to 20 KB per second (4 KB × 5 read capacity units).
Perform eventually consistent reads of up to 40 KB per second (twice as much read throughput).
Write up to 5 KB per second (1 KB × 5 write capacity units).
See the sample exam question at the end.
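The arithmetic in the 5 read / 5 write capacity unit example above can be checked with a few lines of Python. This is only a sketch of the unit definitions given in this section; the function names are made up for illustration.

```python
READ_UNIT_KB = 4   # one read capacity unit covers an item up to 4 KB
WRITE_UNIT_KB = 1  # one write capacity unit covers an item up to 1 KB


def max_read_kb_per_second(read_units: int, strongly_consistent: bool = True) -> int:
    """Upper bound on read throughput for a given number of read units."""
    # Eventually consistent reads get twice the throughput per unit.
    per_unit = READ_UNIT_KB if strongly_consistent else READ_UNIT_KB * 2
    return read_units * per_unit


def max_write_kb_per_second(write_units: int) -> int:
    """Upper bound on write throughput for a given number of write units."""
    return write_units * WRITE_UNIT_KB


print(max_read_kb_per_second(5, strongly_consistent=True))   # 20
print(max_read_kb_per_second(5, strongly_consistent=False))  # 40
print(max_write_kb_per_second(5))                            # 5
```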
4. Difference between Eventually Consistent Read and Strongly Consistent Read DA SAA
The items stored
in DynamoDB are replicated across multiple Availability Zones within an AWS
Region for high availability. When an item is updated, the change is
replicated across servers in those zones, which takes some time to complete.
Eventual consistency means that if an item is written or updated, an immediate
read operation may not show its latest value and may return stale data
instead. The latest value can usually be read within a second.
If your application requires that the latest value must always be
returned, then strong consistency should be used, in which case DynamoDB
returns the latest value on the immediately following read operation.
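The difference can be pictured with a toy model. The sketch below is not how DynamoDB is implemented: it assumes a single leader copy and a single lagging replica, and the class `ToyReplicatedItem` is invented for this example. The `consistent_read` flag mirrors the `ConsistentRead` option that read operations such as GetItem accept.

```python
class ToyReplicatedItem:
    """One item with a leader copy and a single lagging replica."""

    def __init__(self, value):
        self.leader = value
        self.replica = value

    def write(self, value):
        # Assumption: the write lands on the leader first;
        # the replica catches up later.
        self.leader = value

    def replicate(self):
        # Stand-in for background replication completing.
        self.replica = self.leader

    def read(self, consistent_read=False):
        # A strongly consistent read always sees the leader's value;
        # an eventually consistent read may be served by the replica.
        return self.leader if consistent_read else self.replica


item = ToyReplicatedItem("v1")
item.write("v2")
stale = item.read()                      # replica has not caught up yet
fresh = item.read(consistent_read=True)  # latest value
item.replicate()
after_replication = item.read()
print(stale, fresh, after_replication)   # v1 v2 v2
```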
5. Difference Between Scan and Query DA
A Query operation returns only the items that match a given partition key
(and, optionally, a condition on the sort key), whereas a Scan reads all the
items in the table. Hence, Scan is an expensive option and should be avoided as
much as possible.
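A small in-memory sketch shows why the cost differs. The table contents and the `query` and `scan` helpers below are hypothetical stand-ins, not the DynamoDB API; they only count how many items each approach has to examine.

```python
# Toy table: partition key -> list of (sort_key, attributes).
table = {
    "game-1": [(120, {"user": "bob"}), (300, {"user": "alice"})],
    "game-2": [(250, {"user": "carol"})],
    "game-3": [(90, {"user": "dave"}), (410, {"user": "erin"})],
}


def query(table, partition_key):
    """Read only the items stored under one partition key."""
    items = table.get(partition_key, [])
    return items, len(items)  # (results, items examined)


def scan(table, predicate):
    """Read every item, then keep those that match the filter."""
    examined = 0
    results = []
    for items in table.values():
        for sort_key, attrs in items:
            examined += 1
            if predicate(sort_key, attrs):
                results.append((sort_key, attrs))
    return results, examined


query_results, query_cost = query(table, "game-1")
scan_results, scan_cost = scan(table, lambda score, attrs: attrs["user"] == "alice")
print(query_cost, scan_cost)  # 2 5
```

Even though the scan returns a single match here, it still had to read all five items, which is what makes it expensive on a large table.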
6. DynamoDB Streams DA
Keeps track of the recent changes made to the
items in a DynamoDB table.
Can be used to retrieve the items modified in the
last 24-hour period. SAA
Stream records are organized into groups called shards.
7. Atomic Counters and Conditional Writes SAA
Used for concurrency control.
If multiple users try to modify the same item
simultaneously, it is important not to lose the value of that item, and the
next read operation should always return the correct value. Using the Atomic
Counter, DynamoDB handles the concurrent updates in a serial manner, without
losing any updates.
With Conditional Writes, you can check that
certain conditions are met before the item is written, updated, or deleted.
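Both ideas can be sketched against a plain dictionary. This is a toy model, not the DynamoDB API: the `version` attribute, the helper names, and the `ConditionFailed` exception are assumptions chosen for the example.

```python
class ConditionFailed(Exception):
    """Hypothetical stand-in for a failed condition check."""


def conditional_update(store, key, new_value, expected_version):
    """Write only if the stored version still matches what the caller read."""
    current = store[key]
    if current["version"] != expected_version:
        raise ConditionFailed(
            f"expected v{expected_version}, found v{current['version']}"
        )
    store[key] = {"value": new_value, "version": expected_version + 1}


def atomic_increment(store, key, amount=1):
    """Add to a counter in one step, without a prior read by the caller."""
    store[key]["value"] += amount
    return store[key]["value"]


store = {"price": {"value": 10, "version": 1}, "page_views": {"value": 0}}

# Two clients both read version 1; only the first update succeeds.
conditional_update(store, "price", 12, expected_version=1)
try:
    conditional_update(store, "price", 15, expected_version=1)
    second_write_applied = True
except ConditionFailed:
    second_write_applied = False

atomic_increment(store, "page_views")
atomic_increment(store, "page_views")
print(store["price"], second_write_applied, store["page_views"]["value"])
```

The second client's write is rejected instead of silently overwriting the first, while the counter keeps every increment.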
Pass the Exam
Things to remember!
For the SAA and SAP exams, DynamoDB mostly appears as one of the options when it
comes to architecting a database for a distributed web application. Always
remember that for an application where scalability
and data read/write speed (low latency) are
the most important design considerations, DynamoDB is usually the best option. Just by increasing
the read and write capacity with one click or a single UpdateTable API call, DynamoDB can scale to
virtually limitless capacity.
For a DynamoDB table, an attribute that has many distinct values and is accessed
evenly is the prime candidate for the partition key. The attribute that needs
ordering, has a range (e.g. smallest to largest), or from which a specific
value is required (e.g. highest, lowest) is the prime candidate for the sort
key. See an example at the end.
For a provisioned throughput calculation, follow these steps:
If the items read or written are given per minute, always get the
items per second by dividing that number by 60.
Note whether read capacity or write capacity is asked for.
If it is read,
Divide the size (in KB) of the item by 4 and
round up to the next whole number. Then multiply that by the items per second.
If eventual consistency is given, divide the
result by 2. If strong consistency is given, keep the result as is.
If it is write, round the size (in KB) up to the next whole number and
multiply it by the items per second.
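The steps above can be written as two small functions. This is a sketch of the calculation as described in this section, with made-up function names, not an AWS API.

```python
import math


def required_read_units(items_per_second: float, item_size_kb: float,
                        strongly_consistent: bool = True) -> int:
    """Read capacity units needed for a steady read rate."""
    units_per_item = math.ceil(item_size_kb / 4)  # 4 KB per read unit, rounded up
    total = units_per_item * items_per_second
    if not strongly_consistent:
        total = total / 2                         # eventual consistency costs half
    return math.ceil(total)


def required_write_units(items_per_second: float, item_size_kb: float) -> int:
    """Write capacity units needed for a steady write rate."""
    units_per_item = math.ceil(item_size_kb)      # 1 KB per write unit, rounded up
    return math.ceil(units_per_item * items_per_second)


# 120 items per minute -> 2 items per second
items_per_second = 120 / 60
print(required_read_units(items_per_second, 6))                             # 4
print(required_read_units(items_per_second, 6, strongly_consistent=False))  # 2
print(required_write_units(items_per_second, 1.5))                          # 4
```

For instance, a 6 KB item needs 2 read units per strongly consistent read, so 2 reads per second need 4 units, or 2 units with eventual consistency.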
If your application exceeds the maximum allowed provisioned throughput for a
table or index, a ProvisionedThroughputExceededException
is thrown, which is a 400 HTTP Status Code. SAA
To read or write (put or delete) multiple items in a DynamoDB table in a single
batch operation, use the BatchGetItem and BatchWriteItem APIs respectively. You
need to remember these APIs especially for DA and SAA.
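Batch calls are capped in size: a single BatchWriteItem call accepts up to 25 put or delete requests, and a single BatchGetItem call returns up to 100 items. A larger workload therefore has to be split into chunks first. The helper below is a minimal sketch of that splitting step only; `chunked` and the sample data are hypothetical, and no AWS call is made.

```python
def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


BATCH_WRITE_LIMIT = 25   # max put/delete requests per BatchWriteItem call
BATCH_GET_LIMIT = 100    # max items per BatchGetItem call

samples = [{"GaugeId": f"gauge-{i}", "Temp": 20 + i % 5} for i in range(60)]
write_batches = list(chunked(samples, BATCH_WRITE_LIMIT))
print([len(batch) for batch in write_batches])  # [25, 25, 10]
```

Each chunk would then be sent as one batch request, with any unprocessed items retried.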
Question: A meteorological system monitors 600 temperature gauges, obtaining
temperature samples every minute and saving each sample to a DynamoDB table.
Each sample involves writing 1 KB of data and the writes are evenly distributed
over time. How much write throughput is required for the target table?
A. 1 write capacity unit
B. 10 write capacity units
C. 60 write capacity units
D. 600 write capacity units
E. 3600 write capacity units
Answer: 10 write capacity units. First, get the items per second, which is
600/60 = 10. Then, since the write throughput is requested, just multiply this
number by the size in KB, which is 1. Hence, the answer is 10 x 1 = 10.
Question: You are building a game high score table in DynamoDB. You will store
each user’s highest score for each game, with many games, all of which have
relatively similar usage levels and numbers of players. You need to be able to
look up the highest score for any game. What’s the best DynamoDB key structure?
A. HighestScore as the hash / only key.
B. GameID as the hash key, HighestScore as the range key.
C. GameID as the hash / only key.
D. GameID as the range / only key.
Since there are many games with similar usage levels and numbers of
players, GameID should be the partition key. Since the highest score is
desired, sorting each game's items by score is beneficial, so
HighestScore should be the sort key. Hence, B (GameID as the hash key,
HighestScore as the range key) is the answer.
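The reasoning can be sketched in memory: with GameID as the partition key and HighestScore as the sort key, the top score for a game is simply the last item in that game's sorted list. The dictionary and the helpers `put_score` and `top_score` are hypothetical; against a real table, a Query with `ScanIndexForward` set to false and `Limit` set to 1 would play the same role.

```python
from bisect import insort

scores_by_game = {}  # GameID (partition key) -> scores kept in ascending order


def put_score(game_id, highest_score, user_id):
    """Insert a score so that each game's list stays sorted by score."""
    insort(scores_by_game.setdefault(game_id, []), (highest_score, user_id))


def top_score(game_id):
    """Return the (score, user) pair with the largest sort key for one game."""
    return scores_by_game[game_id][-1]


put_score("chess", 1800, "alice")
put_score("chess", 2100, "bob")
put_score("go", 950, "carol")
print(top_score("chess"))  # (2100, 'bob')
```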