DynamoDB: A Highly Available, Easily Scalable, Low-Latency NoSQL Database

DynamoDB

Introduction

DynamoDB is a fully managed, highly available, easily scalable, low-latency
NoSQL database. Based on the read and write capacity set by the user, it
provisions the needed infrastructure and simplifies database and cluster
management by partitioning the data automatically for fast reads and writes.

This topic is extremely important for passing the AWS Developer Associate
exam. Many questions also appear on the Solutions Architect Associate and
Professional exams, especially those that present different architecture
scenarios.

Key Concepts for the Exam

Some DynamoDB topics are more likely to appear on one exam than on the
others. To show which exams a topic matters for, I include the exam
abbreviations next to that topic. The abbreviations are:

SAA – Solutions Architect Associate Exam, DA – Developer Associate
Exam, SOA – SysOps Administrator Exam, SAP – Solutions Architect
Professional Exam, and DOP – DevOps Professional Exam.

1.      Partition (Hash) and Sort (Range) Keys DA, SAP

· The partition key is the table attribute on which DynamoDB builds a hash
index. The hash of its value determines the partition in which the item is
stored.

· The sort key is the attribute that orders the items sharing the same
partition key, so a query can return them in sorted order or select a range
of them.

· Given a table containing certain attributes, an exam question may ask you
to identify which should be the partition key and which the sort key. See
such an example at the end.
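To make the two roles concrete, here is a minimal in-memory sketch. It is a toy model and not how DynamoDB is implemented internally: the partition count, the use of MD5 as the hash, and the GameID / HighestScore attribute names (borrowed from the sample question at the end) are all assumptions for illustration.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical; DynamoDB manages the real partition count itself


def partition_for(partition_key: str) -> int:
    """Map a partition key value to a partition number (MD5 is only a stand-in hash)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def store(items):
    """Place items in partitions by hashed GameID, then order each key's items by sort key."""
    partitions = defaultdict(lambda: defaultdict(list))
    for item in items:
        partitions[partition_for(item["GameID"])][item["GameID"]].append(item)
    for by_key in partitions.values():
        for group in by_key.values():
            group.sort(key=lambda it: it["HighestScore"])
    return partitions


items = [
    {"GameID": "chess", "HighestScore": 1200},
    {"GameID": "chess", "HighestScore": 900},
    {"GameID": "pong", "HighestScore": 45},
]
table = store(items)
chess = table[partition_for("chess")]["chess"]
print([it["HighestScore"] for it in chess])  # [900, 1200]
```

Items with the same partition key always land in the same partition, and within that key they are kept in sort-key order.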

2.      Secondary Indexes DA, SAA, SAP

· Apart from the primary key, one or more secondary indexes on the table let
you search the table efficiently, avoid the scan operation, and partition and
sort the data in an alternate way without using the primary key. The two
kinds are the Global Secondary Index and the Local Secondary Index.

· For the exam, understand the difference between the two.

o Global Secondary Index – The partition key and the optional sort key can
both differ from those of the table's primary key. This index can be created
and deleted at any time.

o Local Secondary Index – Has the same partition key as the primary key but a
different sort key. This index can be created only when the table is created.
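The difference shows up clearly in a table definition. The sketch below builds a parameter dictionary shaped like a CreateTable request (for example, what you could pass to boto3's `client.create_table(**params)`); nothing is sent to AWS. The table name, index names, and attributes are hypothetical.

```python
# Parameters shaped like a DynamoDB CreateTable request.
# All table, index, and attribute names here are hypothetical.
params = {
    "TableName": "GameScores",
    "AttributeDefinitions": [
        {"AttributeName": "GameID", "AttributeType": "S"},
        {"AttributeName": "HighestScore", "AttributeType": "N"},
        {"AttributeName": "PlayedAt", "AttributeType": "S"},
        {"AttributeName": "UserID", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "GameID", "KeyType": "HASH"},         # partition key
        {"AttributeName": "HighestScore", "KeyType": "RANGE"},  # sort key
    ],
    # Local index: same partition key as the table, different sort key.
    "LocalSecondaryIndexes": [{
        "IndexName": "ByPlayedAt",
        "KeySchema": [
            {"AttributeName": "GameID", "KeyType": "HASH"},
            {"AttributeName": "PlayedAt", "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
    }],
    # Global index: its own partition key, and its own provisioned throughput.
    "GlobalSecondaryIndexes": [{
        "IndexName": "ByUser",
        "KeySchema": [
            {"AttributeName": "UserID", "KeyType": "HASH"},
            {"AttributeName": "HighestScore", "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
    }],
    "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
}


def hash_key(key_schema):
    """Return the partition (HASH) key attribute name from a key schema."""
    return next(k["AttributeName"] for k in key_schema if k["KeyType"] == "HASH")


table_pk = hash_key(params["KeySchema"])
lsi_pk = hash_key(params["LocalSecondaryIndexes"][0]["KeySchema"])
gsi_pk = hash_key(params["GlobalSecondaryIndexes"][0]["KeySchema"])
print(table_pk, lsi_pk, gsi_pk)  # GameID GameID UserID
```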

3.      Provisioned Throughput Calculations DA

· By far, this is the most important topic, especially for the Developer
Associate exam. You can expect a couple of questions that ask you to
calculate the read and/or write throughput for a given scenario.

· In the DynamoDB configuration, the user provisions a certain amount of read
and write capacity based on the application's workload. This capacity is
measured in read capacity units and write capacity units.

· The amount of capacity consumed by read operations depends on the desired
read consistency: eventual or strong. See #4 below for further explanation.

· The throughput capacity in terms of read capacity units and write capacity
units is measured as follows:

o One read capacity unit represents one strongly consistent read per second,
or two eventually consistent reads per second, for an item up to 4 KB in
size. The total number of read capacity units required depends on the item
size and on whether you want eventually consistent or strongly consistent
reads.

o One write capacity unit represents one write per second for an item up to
1 KB in size. The total number of write capacity units required depends on
the item size. (Writes have no eventual or strong consistency option.)

· For example, suppose that you create a table with 5 read capacity units and
5 write capacity units. With these settings, your application could:

o Perform strongly consistent reads of up to 20 KB per second (4 KB × 5 read
capacity units).

o Perform eventually consistent reads of up to 40 KB per second (twice as
much read throughput).

o Write up to 5 KB per second (1 KB × 5 write capacity units).

· See the sample exam question at the end.
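The arithmetic in the example above can be checked with a few lines of Python. This is only a sketch of the unit definitions; the function names are my own.

```python
READ_UNIT_KB = 4   # one read capacity unit covers an item of up to 4 KB
WRITE_UNIT_KB = 1  # one write capacity unit covers an item of up to 1 KB


def max_read_kb_per_second(read_units, strongly_consistent=True):
    """Maximum read volume for a given number of read capacity units."""
    kb = read_units * READ_UNIT_KB
    # Eventually consistent reads get twice the throughput per unit.
    return kb if strongly_consistent else kb * 2


def max_write_kb_per_second(write_units):
    """Maximum write volume for a given number of write capacity units."""
    return write_units * WRITE_UNIT_KB


print(max_read_kb_per_second(5))                             # 20
print(max_read_kb_per_second(5, strongly_consistent=False))  # 40
print(max_write_kb_per_second(5))                            # 5
```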

4.      Difference between Eventually Consistent Read
and Strongly Consistent Read DA, SAA

The items stored in DynamoDB are replicated across multiple Availability
Zones within an AWS Region for high availability. When an item is updated,
the change is replicated to the servers in those zones, which takes some time
to complete. Eventual consistency means that a read issued immediately after
a write or update may return stale data instead of the latest value; the
latest value can usually be read within a second. If your application
requires that the latest value always be returned, use strong consistency,
where DynamoDB returns the latest value on the immediately following read
operation.
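In the API, the choice is made per read request through the `ConsistentRead` flag, which defaults to an eventually consistent read. The sketch below only builds GetItem-style parameters (for example, for boto3's `client.get_item(**params)`) and makes no call; the helper name, table name, and key are hypothetical.

```python
def get_item_params(table_name, key, strongly_consistent=False):
    """Build GetItem-style parameters; ConsistentRead=True requests a strongly consistent read."""
    return {
        "TableName": table_name,
        "Key": key,
        "ConsistentRead": strongly_consistent,
    }


key = {"GameID": {"S": "chess"}, "HighestScore": {"N": "1200"}}  # hypothetical item key
eventual = get_item_params("GameScores", key)
strong = get_item_params("GameScores", key, strongly_consistent=True)
print(eventual["ConsistentRead"], strong["ConsistentRead"])  # False True
```

Remember the cost side from #3: the strongly consistent read consumes twice the read capacity of the eventually consistent one.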

5.      Difference Between Scan and Query DA

· A Query operation returns only the items that match the requested key
condition, whereas a Scan reads every item in the table. Hence, Scan is an
expensive option and should be avoided as much as possible.
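A toy comparison shows why the difference matters. The sketch below models a table as a Python list and a key lookup as a dictionary; it illustrates how many items each approach has to examine and is not a model of DynamoDB internals. The table contents are made up.

```python
from collections import defaultdict

# Hypothetical table: 100 games with 10 score items each.
table = [
    {"GameID": f"game-{g}", "HighestScore": s}
    for g in range(100)
    for s in range(10)
]

# Stand-in for the partition-key index a Query can use.
by_partition_key = defaultdict(list)
for item in table:
    by_partition_key[item["GameID"]].append(item)


def scan(items, predicate):
    """Examine every item and keep the ones that pass the filter."""
    examined, matches = 0, []
    for item in items:
        examined += 1
        if predicate(item):
            matches.append(item)
    return matches, examined


def query(index, partition_key):
    """Examine only the items stored under one partition key."""
    matches = index.get(partition_key, [])
    return matches, len(matches)


scan_matches, scan_examined = scan(table, lambda it: it["GameID"] == "game-7")
query_matches, query_examined = query(by_partition_key, "game-7")
print(len(scan_matches), scan_examined)    # 10 1000
print(len(query_matches), query_examined)  # 10 10
```

Both return the same 10 items, but the scan touched all 1000 to find them.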

6.      DynamoDB Streams DA

· Keeps track of the recent changes made to the items in a DynamoDB table.

· Stream records are kept for 24 hours, so a stream can be used to return the
list of items modified in the last 24-hour period. SAA

· Stream records are organized into groups called shards.

7.      Atomic Counters and Conditional Writes SAA, DA

· Used for concurrency control.

· If multiple users try to modify the same item simultaneously, it is
important not to lose any change to that item, and the next read operation
should always return the correct value. With an atomic counter, DynamoDB
applies the concurrent updates serially, without losing any of them.

· With conditional writes, the write (put, update, or delete) succeeds only
if the conditions you specify are met at the time of the write.
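Both features are expressed through request parameters. The sketch below builds UpdateItem-style and PutItem-style parameter dictionaries without calling AWS; the helper names, table name, and attributes are hypothetical, while `ADD` and `attribute_not_exists` are standard DynamoDB expression syntax.

```python
def atomic_counter_params(table_name, key, counter_attr, step=1):
    """UpdateItem-style parameters that increment a numeric attribute in place."""
    return {
        "TableName": table_name,
        "Key": key,
        "UpdateExpression": "ADD #c :step",
        "ExpressionAttributeNames": {"#c": counter_attr},
        "ExpressionAttributeValues": {":step": {"N": str(step)}},
    }


def put_if_absent_params(table_name, item, key_attr):
    """PutItem-style parameters that succeed only if no item with this key exists yet."""
    return {
        "TableName": table_name,
        "Item": item,
        "ConditionExpression": "attribute_not_exists(#k)",
        "ExpressionAttributeNames": {"#k": key_attr},
    }


key = {"GameID": {"S": "chess"}, "HighestScore": {"N": "1200"}}  # hypothetical key
counter = atomic_counter_params("GameScores", key, "TimesViewed")
create = put_if_absent_params("GameScores", dict(key), "GameID")
print(counter["UpdateExpression"])    # ADD #c :step
print(create["ConditionExpression"])  # attribute_not_exists(#k)
```

The counter update needs no prior read, which is why concurrent increments are not lost; the conditional put is rejected instead of overwriting an existing item.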

Tips to Pass the Exam

Things to remember!

1.      For the SAA and SAP exams, DynamoDB mostly appears as one of the
options when architecting a database for a distributed web application.
Always remember that when scalability and read/write speed (low latency) are
the most important design considerations, DynamoDB is usually the best
option. Just by increasing the read and write capacity with one click or a
single API call named UpdateTable, DynamoDB can scale to virtually limitless
capacity.

2.      In a DynamoDB table, the attribute that takes many distinct values
and is accessed evenly across those values is the prime candidate for the
partition key. The attribute that needs ordering, has a range (e.g. smallest
to largest), or is looked up for a specific extreme value (e.g. highest,
lowest) is the prime candidate for the sort key. See an example at the end.

3.      In provisioned throughput calculations, follow these steps:

a. If the number of items is given per minute, always convert it to items per
second by dividing that number by 60.

b. Note whether read capacity or write capacity is asked for.

i. If it is read:

1. Divide the size (in KB) of the item by 4 and round up to the next whole
number. Then multiply that by the items per second.

2. If eventual consistency is specified, divide the result by 2. If strong
consistency is specified, keep the result as is.

ii. If it is write, round the size (in KB) of the item up to the next whole
number and multiply it by the items per second.
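The steps above can be sketched as two small functions. The function names and the 6 KB read example are my own; the 600-samples-per-minute write example is the first sample question at the end. Rounding a fractional final result up to a whole unit is my assumption, since capacity is provisioned in whole units.

```python
import math


def items_per_second(count, per="second"):
    """Step a: convert a per-minute rate to a per-second rate."""
    return count / 60 if per == "minute" else count


def read_capacity_units(rate_per_second, item_size_kb, strongly_consistent=True):
    """Step b.i: 4 KB blocks per item, times the rate, halved for eventual consistency."""
    units = math.ceil(item_size_kb / 4) * rate_per_second
    if not strongly_consistent:
        units = units / 2
    return math.ceil(units)  # assumption: round a fractional result up to a whole unit


def write_capacity_units(rate_per_second, item_size_kb):
    """Step b.ii: 1 KB blocks per item, times the rate."""
    return math.ceil(math.ceil(item_size_kb) * rate_per_second)


rate = items_per_second(600, per="minute")  # 600 writes per minute
print(write_capacity_units(rate, 1))        # 10

# Hypothetical read scenario: 10 reads per second of 6 KB items.
print(read_capacity_units(10, 6))                             # 20
print(read_capacity_units(10, 6, strongly_consistent=False))  # 10
```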

4.      When your application exceeds the provisioned throughput of a table,
DynamoDB throws ProvisionedThroughputExceededException, which is returned
with HTTP status code 400. SAA

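5.      To read multiple items or to write (put or delete) multiple items in
a single batch operation, use the BatchGetItem and BatchWriteItem APIs
respectively. BatchWriteItem cannot update an existing item in place; it can
only put or delete whole items. You need to remember these APIs especially
for DA and SAA.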
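One practical detail when using these APIs is that each call has a size cap: a BatchWriteItem call accepts up to 25 put or delete requests, and a BatchGetItem call up to 100 items. The sketch below splits a list of put requests into BatchWriteItem-shaped payloads; the table name and items are hypothetical and nothing is sent to AWS.

```python
MAX_BATCH_WRITE = 25  # put/delete requests per BatchWriteItem call
MAX_BATCH_GET = 100   # items per BatchGetItem call


def chunks(seq, size):
    """Yield consecutive slices of seq with at most `size` elements each."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


# 60 hypothetical put requests in the low-level request format.
put_requests = [
    {"PutRequest": {"Item": {"GameID": {"S": f"game-{i}"}, "HighestScore": {"N": "0"}}}}
    for i in range(60)
]

# One BatchWriteItem-shaped payload per chunk of at most 25 requests.
batches = [
    {"RequestItems": {"GameScores": batch}}
    for batch in chunks(put_requests, MAX_BATCH_WRITE)
]
print([len(b["RequestItems"]["GameScores"]) for b in batches])  # [25, 25, 10]
```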

Sample Questions

Question: A meteorological system monitors 600 temperature gauges, obtaining
temperature samples every minute and saving each sample to a DynamoDB table.
Each sample involves writing 1 KB of data and the writes are evenly
distributed over time. How much write throughput is required for the target
table?

A. 1 write capacity unit

B. 10 write capacity units

C. 60 write capacity units

D. 600 write capacity units

E. 3600 write capacity units

Answer:

First, get the items per second, which is 600/60 = 10. Then, since write
throughput is requested, multiply this number by the item size in KB, which
is 1. Hence, the answer is 10 × 1 = 10 write capacity units, which is B.

Question: You are building a game high score table in DynamoDB. You will
store each user’s highest score for each game, with many games, all of which
have relatively similar usage levels and numbers of players. You need to be
able to look up the highest score for any game. What’s the best DynamoDB key
structure?

A. HighestScore as the hash / only key.

B. GameID as the hash key, HighestScore as the range key.

C. GameID as the hash / only key.

D. GameID as the range / only key.

Answer:

Since there are many games with similar usage levels and numbers of players,
GameID spreads the load evenly and should be the partition key. Since the
highest score is what you need to look up, the items for each game should be
ordered by score, so HighestScore should be the sort key. Hence, B is the
answer.
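With that key structure, the lookup becomes a single Query: fix the partition key, read the sort key in descending order, and take the first item. The sketch below builds Query-style parameters (for example, for boto3's `client.query(**params)`) without calling AWS; the helper name and the table name are hypothetical.

```python
def top_score_query(table_name, game_id):
    """Query-style parameters that fetch the item with the largest sort key for one game."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "GameID = :g",
        "ExpressionAttributeValues": {":g": {"S": game_id}},
        "ScanIndexForward": False,  # descending sort-key order: highest score first
        "Limit": 1,                 # only the top item is needed
    }


params = top_score_query("GameScores", "chess")
print(params["ScanIndexForward"], params["Limit"])  # False 1
```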