John's Dev Blog: Spring Data JPA with a Hash & Range Key DynamoDB Table

Friday 25 November 2016

Spring Data JPA with a Hash & Range Key DynamoDB Table

DynamoDB is an attractive option when you need a quick and simple NoSQL database for storing non-relational items consistently and safely, without having to set up hardware, software and cluster configuration yourself. It scales to cope with large volumes of data, and it is quick and easy to set up a table and to read and write items from other AWS components.

Spring Data for DynamoDB

From an application point of view the obvious choice for access is the AWS SDK. For CRUD operations from a front-end gateway such as a REST API, we would want to abstract away the boilerplate code of mapping a POJO to a table, which is the perfect use case for JPA. I was unaware of any JPA framework with built-in extensions for DynamoDB until I looked at Spring's implementation, Spring Data. There's a nice framework on GitHub, forked from Michael Lavelle's original work, which provides a DynamoDB extension with all the annotations I'd expect for mapping Java domain classes.

Mapping Items to the Java Domain Model

As with all database design, the first step is to define a key that uniquely identifies each item in the table. In DynamoDB the simplest form of this is the Partition or Hash Key. Often, the items we are dealing with have no single unique natural identifier and we need to specify a second identifier, the Sort or Range Key, which, when coupled with the Partition Key, is unique within the table.
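To make the uniqueness rule concrete, here is a minimal plain-Java sketch with no DynamoDB involved. The `ItemKey` class and the sample values are purely illustrative; a `HashMap` stands in for the table so that the (partition key, sort key) pair behaves as the unique identifier.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class CompoundKeyDemo {

    /** Hypothetical compound key: partition (hash) key plus sort (range) key. */
    public static final class ItemKey {
        private final long partitionKey;
        private final String sortKey;

        public ItemKey(long partitionKey, String sortKey) {
            this.partitionKey = partitionKey;
            this.sortKey = Objects.requireNonNull(sortKey);
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof ItemKey)) return false;
            ItemKey other = (ItemKey) o;
            return partitionKey == other.partitionKey && sortKey.equals(other.sortKey);
        }

        @Override
        public int hashCode() {
            return Objects.hash(partitionKey, sortKey);
        }
    }

    /** Stands in for a table: one item per distinct (partition, sort) pair. */
    public static Map<ItemKey, String> sampleTable() {
        Map<ItemKey, String> table = new HashMap<>();
        table.put(new ItemKey(1L, "alpha"), "first");
        table.put(new ItemKey(1L, "beta"), "second");    // same partition key, different sort key
        table.put(new ItemKey(1L, "alpha"), "replaced"); // same pair again: overwrites "first"
        return table;
    }

    public static void main(String[] args) {
        Map<ItemKey, String> table = sampleTable();
        System.out.println(table.size());                         // two distinct compound keys
        System.out.println(table.get(new ItemKey(1L, "alpha")));  // the overwritten value
    }
}
```

Two items share partition key `1`, but only the full pair identifies an item, which is why writing the same pair a second time replaces the existing item rather than adding a new one.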

When annotating a domain type we are expected to mark a field as an ID. With Spring Data the only option is the org.springframework.data.annotation.Id annotation. This must be at field level and only one field can be annotated. Therefore, with our compound key model, we are forced to create an identifier class which is embedded into the domain type. This is a common pattern with JPA and relational data models. However, it has knock-on effects on the DynamoDB persistence model and on how the domain object is serialized to, and deserialized from, a JSON document.

In this example I will map a Widget domain type with both Partition and Sort keys. In order to ensure uniqueness by compound key I will create a WidgetId class which is embedded into Widget. However, I don't want this persisted into the store as an object. The keys should be defined as simple number and String types within the JSON schema. The JSON schema I will map is shown below.

Rather than map the other domain model attributes, I simply want the Widget class to contain a String of JSON data which is dumped directly into a single DynamoDB attribute. In reality I would probably map each attribute and tightly constrain the Java model to the JSON schema. This approach, though, gives increased flexibility to store whatever data I want. The trade-off is that the lack of constraint increases the risk of persisting bad or invalid data. When persisted into DynamoDB a Widget item looks like this

I need to tell the underlying Jackson mapper to treat the data String as raw JSON rather than deserialize it as a String. Here are the Widget and WidgetId classes annotated with everything Spring Data and Jackson need to serialize and persist single instances.

The annotations on the get/set Hash and Range key values essentially act as proxies to the encapsulated WidgetId instance which is ignored by both DynamoDB and the Jackson serializer. The @Id annotation ensures JPA still uses this class as the embedded identifier allowing us to utilize the compound key.
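As a rough sketch of the mapping described above (not the original listing), the two classes might look something like this. The property names `groupId`, `name` and `data` are assumptions made for illustration, as is the choice of a `JsonNode` setter to accept raw JSON on the way in. The code needs the AWS SDK v1 DynamoDB mapper, Spring Data Commons and Jackson on the classpath, and each class would live in its own file.

```java
// --- WidgetId.java ---
import java.io.Serializable;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBHashKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBRangeKey;

public class WidgetId implements Serializable {
    private static final long serialVersionUID = 1L;

    private Long groupId;  // hypothetical partition (hash) key, stored as a number
    private String name;   // hypothetical sort (range) key, stored as a String

    @DynamoDBHashKey
    public Long getGroupId() { return groupId; }
    public void setGroupId(Long groupId) { this.groupId = groupId; }

    @DynamoDBRangeKey
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

// --- Widget.java ---
import org.springframework.data.annotation.Id;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBAttribute;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBHashKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBIgnore;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBRangeKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBTable;
import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.annotation.JsonRawValue;
import com.fasterxml.jackson.databind.JsonNode;

@DynamoDBTable(tableName = "Widgets")
public class Widget {

    @Id
    private WidgetId widgetId;  // embedded compound identifier used by Spring Data
    private String data;        // free-form JSON held as a String

    // Hidden from both the DynamoDB mapper and Jackson; only the proxies below are exposed.
    @DynamoDBIgnore
    @JsonIgnore
    public WidgetId getWidgetId() { return widgetId; }
    public void setWidgetId(WidgetId widgetId) { this.widgetId = widgetId; }

    // Proxy accessors: the key values are stored in the WidgetId instance.
    @DynamoDBHashKey(attributeName = "groupId")
    public Long getGroupId() { return widgetId != null ? widgetId.getGroupId() : null; }
    public void setGroupId(Long groupId) {
        if (widgetId == null) { widgetId = new WidgetId(); }
        widgetId.setGroupId(groupId);
    }

    @DynamoDBRangeKey(attributeName = "name")
    public String getName() { return widgetId != null ? widgetId.getName() : null; }
    public void setName(String name) {
        if (widgetId == null) { widgetId = new WidgetId(); }
        widgetId.setName(name);
    }

    // Written to the JSON response as-is rather than as an escaped String.
    @DynamoDBAttribute(attributeName = "data")
    @JsonRawValue
    public String getData() { return data; }
    public void setData(String data) { this.data = data; }

    // Assumed approach for incoming JSON: take the parsed tree and keep its text form.
    @JsonProperty("data")
    public void setData(JsonNode json) { this.data = (json == null) ? null : json.toString(); }
}
```

With this shape the stored item and the JSON document both carry flat `groupId` and `name` attributes, while Spring Data still sees a single `@Id` field of type `WidgetId`.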

CRUD Repository

I need to set up the configuration to access the DynamoDB Widgets table. Access control is via IAM, so I can either add an access key pair from my IAM account to the application or grant the required role to the EC2 instance I'm running this app from. In this example I'm going to give it the access key and secret key. Since this is a Spring Boot application, I add them as properties to the application.properties file on my classpath, along with the endpoint URL of the DynamoDB service.
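A configuration class along the following lines could wire those properties into a client. This is a sketch only: the property names `amazon.dynamodb.endpoint`, `amazon.aws.accesskey` and `amazon.aws.secretkey` follow the convention used in the spring-data-dynamodb examples, and the package name `com.example.widgets` is hypothetical. It targets the AWS SDK v1 client classes that were current at the time of writing.

```java
import org.socialsignin.spring.data.dynamodb.repository.config.EnableDynamoDBRepositories;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;

@Configuration
@EnableDynamoDBRepositories(basePackages = "com.example.widgets") // hypothetical package
public class DynamoDBConfig {

    // Assumed property names, read from application.properties
    @Value("${amazon.dynamodb.endpoint}")
    private String endpoint;

    @Value("${amazon.aws.accesskey}")
    private String accessKey;

    @Value("${amazon.aws.secretkey}")
    private String secretKey;

    @Bean
    public AWSCredentials amazonAWSCredentials() {
        return new BasicAWSCredentials(accessKey, secretKey);
    }

    @Bean
    public AmazonDynamoDB amazonDynamoDB() {
        AmazonDynamoDB client = new AmazonDynamoDBClient(amazonAWSCredentials());
        // Only override the endpoint when one has been configured
        if (endpoint != null && !endpoint.isEmpty()) {
            client.setEndpoint(endpoint);
        }
        return client;
    }
}
```

If the app runs on an EC2 instance with the required role instead, the explicit key pair can be dropped in favour of the SDK's default credential lookup.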

Now to define a repository to handle the CRUD operations. The Spring Data framework provides the standard CRUD functionality out of the box, so no implementation code is required. In this example I want to read all the Widgets with a given Partition key value. There may be multiple items, because that key alone isn't unique. The interface, which extends the Spring Data org.springframework.data.repository.CrudRepository, simply needs one more method declared. Spring Data will interpret this from its name, parameters and return type and formulate the query at runtime.
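Such a repository might be declared as below. This is a sketch that assumes the Widget type exposes its partition key as a property called `groupId` (a name chosen here for illustration) and uses WidgetId as its identifier type.

```java
import java.util.List;

import org.springframework.data.repository.CrudRepository;

public interface WidgetRepository extends CrudRepository<Widget, WidgetId> {

    // Derived query: Spring Data reads "findBy" + the property name and
    // builds a query on the partition key at runtime. No implementation is written.
    List<Widget> findByGroupId(Long groupId);
}
```

The return type is a List because several items can share one partition key value and differ only in their sort key.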

Use MVC to Create a REST API

It's pretty straightforward to add the dependency for Spring MVC and create a simple REST controller that implements GET and PUT methods for reading and writing Widgets.
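A controller of that kind could be sketched roughly as follows. The `/widgets` path, the `WidgetRepository` interface and its `findByGroupId` method are assumptions for illustration, not names taken from the original code.

```java
import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/widgets") // hypothetical base path
public class WidgetController {

    private final WidgetRepository repository;

    @Autowired
    public WidgetController(WidgetRepository repository) {
        this.repository = repository;
    }

    // GET /widgets/{groupId}: read every Widget sharing the given partition key
    @RequestMapping(value = "/{groupId}", method = RequestMethod.GET)
    public List<Widget> read(@PathVariable("groupId") Long groupId) {
        return repository.findByGroupId(groupId);
    }

    // PUT /widgets: write a single Widget, creating or replacing the item
    @RequestMapping(method = RequestMethod.PUT)
    public Widget write(@RequestBody Widget widget) {
        return repository.save(widget);
    }
}
```

Jackson handles the conversion between the request or response body and the Widget type, so the controller itself contains no mapping code.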

Securing the Endpoints

As with any web application I need to secure the REST endpoints. I'll do this with Basic authentication, which sends a Base64-encoded username and password in the Authorization header of the request. Base64 is an encoding rather than encryption, so this should only be used over HTTPS. This is again very simple to achieve with Spring Security and the convention over configuration of Spring Boot. Simply extend the WebSecurityConfigurerAdapter class, override the configure method and set the required authority for the specific paths. In my case I just want to force any request to either method exposed by the controller to be authenticated. Spring Security will create a default user and password and print the password to the system log on boot up. I can override this by adding the credentials to the application.properties file. In a production system I'd add my own UserDetailsService and load those credentials from a store of some kind.
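A minimal version of that security configuration might look like the sketch below. The `/widgets/**` path pattern is an assumption, and disabling CSRF is a choice made here so that a non-browser client can issue PUT requests; neither comes from the original code.

```java
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
import org.springframework.security.config.annotation.web.configuration.WebSecurityConfigurerAdapter;

@Configuration
@EnableWebSecurity
public class SecurityConfig extends WebSecurityConfigurerAdapter {

    @Override
    protected void configure(HttpSecurity http) throws Exception {
        http
            .authorizeRequests()
                // hypothetical path: every request to the controller must be authenticated
                .antMatchers("/widgets/**").authenticated()
                .and()
            // credentials arrive Base64-encoded in the Authorization header
            .httpBasic()
                .and()
            // assumption: a stateless API client, so CSRF tokens are not used
            .csrf().disable();
    }
}
```

In the Spring Boot 1.x releases current at the time of writing, the default credentials can be overridden with the `security.user.name` and `security.user.password` properties.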