Cloud Native Serverless Java with Quarkus and GraalVM on AWS Lambda

A few gray hairs, but...
20
Sep

If you haven't shouted "Bingo" yet, you have only yourself to blame. How can it be possible to use almost all of the bleeding-edge technologies, frameworks and platforms listed above successfully together in a real-world project away from the greenfield and Hello World demos? A field report.

With this article I deliberately don’t want to write a guide or description of a framework, explaining how to use it and for what purpose. Since I like to work with new technologies, I gave myself the following challenge over the last few months: Is it possible to combine many of the new and current frameworks in such a way that the application can ultimately be used while keeping the programming and technical effort within manageable limits? Based on this question, I would like to report my findings here. Details and instructions regarding the components used can be found on the respective websites.

As a real-world application that is in production, I have chosen the login/registration app of the Java User Group Darmstadt, an organization of which I am an active member. This application was created three years ago as part of my serverless book project and has been running since without the help of a web framework like Spring, Java EE or similar. It is based on Java 8 in AWS Lambda. In the meantime, the application has evolved, like so many projects, and the code has, as they say, grown historically.

The application can only be run locally to a limited extent, changes become more complex and sometimes lead to quite fragile code. So it’s high time to do a complete review of the code base and update it.

From a technical point of view, the application is quite clearly structured: For any JUG DA event, interested participants should be able to register easily and without obligation via a web form. This allows us to get a good overview of the number of participants in advance of the event and, if necessary, make adjustments with the room and catering organization. If a participant is successfully registered, we send a confirmation email. This area is publicly available, so there is no need for the user to log in. We have deliberately dispensed with user accounts for various reasons. Organizers access completed registrations via a secured area of the application. For this, the login of an Orga member is necessary.

From where we are to where we want to be

The application consists of a browser frontend, authentication/authorization, data processing, data storage, and email sending. To render the frontend, I resorted to a server-side solution using the Handlebars templating engine. I deliberately chose not to use a JavaScript-based single-page application in order to have as few individual components as possible; I ultimately wanted to manage only one or at most two deployment components. All HTTP(S) handling is done by the Amazon API Gateway, which also configures which paths of the application are protected by a so-called ‘authorizer’ that makes them accessible only to authorized users. Within the application, I created and managed my own service classes using singletons. Singletons sound old-fashioned and fragile at first, but in Java-based AWS Lambda functions they are quite justified, as there can be no competing threads at runtime: concurrent requests are each processed in their own Lambda instance, and an instance handles only one request at a time. Data is stored in an Amazon DynamoDB NoSQL database and emails are sent via the Amazon Simple Email Service; AWS provides a Java API for both services. In the end, this left me with four Lambda functions (registration, deletion, admin, and authorizer), an API Gateway mapping template, and a DB table, all of which I deployed using the Serverless Framework: several individual components indeed, but very easy to manage as a single unit in one project. The legacy code can be found in the GitHub repository under the tag legacy.
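
The singleton approach described above can be sketched in plain Java. This is a minimal illustration and not the legacy code itself; the class name RegistrationService and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical service class; the name and methods are illustrative only.
final class RegistrationService {

    // Initialization-on-demand holder: the JVM creates INSTANCE lazily and
    // exactly once, on the first call to getInstance().
    private static final class Holder {
        static final RegistrationService INSTANCE = new RegistrationService();
    }

    private final List<String> registrations = new ArrayList<>();

    private RegistrationService() {
        // Expensive setup (for example creating AWS clients) would happen here,
        // once per Lambda instance, and be reused by later invocations.
    }

    static RegistrationService getInstance() {
        return Holder.INSTANCE;
    }

    void register(String email) {
        registrations.add(email);
    }

    List<String> registrations() {
        return Collections.unmodifiableList(registrations);
    }
}
```

Because a Lambda instance serves one request at a time, the unsynchronized list in this sketch is not accessed concurrently; in a classic multi-threaded server the same class would need extra care.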

However, using the Java runtime in AWS Lambda has the disadvantage that the startup times of Java-based functions are relatively high. This is not a problem in asynchronous, event-driven data processing pipelines, but in the context of user interaction (a website, for example), it can quickly lead to undesirably high latencies and thus to dissatisfied users. For a long time, the only workaround was to increase the provisioned memory for a Lambda function so that it is also allocated more CPU power and network bandwidth. Even if you don’t need the additional memory, this can reduce costs because the execution time decreases. Additionally, you can keep an instance of the Lambda function “warm” for some time by calling it periodically with cron events from AWS CloudWatch. It might sound weird, but for a long time these were really the only options. Meanwhile, AWS offers “provisioned concurrency”, which allows you to specify how many instances should be kept pre-warmed (initialized) and thus deliver faster response times. I decided to keep using AWS Lambda, firstly because I am a serverless fanboy, and secondly because the cost for our use case can be kept at a very manageable level of zero euros, as the free AWS usage quota is not fully utilized.
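
Why more memory can mean lower cost becomes clearer with a little arithmetic: Lambda bills compute as memory size multiplied by execution time (GB-seconds). The following is a rough sketch; the class name and the two measurements are hypothetical and not taken from the JUG DA application.

```java
// Billing sketch: Lambda compute is billed by memory size times duration.
final class LambdaCostSketch {

    private LambdaCostSketch() {
    }

    // Returns the billed compute in GB-seconds for a single invocation.
    static double gbSeconds(int memoryMb, long durationMs) {
        return (memoryMb / 1024.0) * (durationMs / 1000.0);
    }

    public static void main(String[] args) {
        // Hypothetical measurements, not taken from the article.
        double small = gbSeconds(512, 4000);  // 0.5 GB for 4.0 s
        double large = gbSeconds(2048, 800);  // 2.0 GB for 0.8 s
        System.out.printf("512 MB: %.2f GB-s, 2048 MB: %.2f GB-s%n", small, large);
    }
}
```

With these assumed numbers, the function with four times the memory is both faster and slightly cheaper, because the duration shrinks by more than the memory grows. Whether that holds for a given function has to be measured.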

This is where various new technologies, frameworks, and platform options come into play. With GraalVM, it is possible to compile a Java application into OS-native code and thus execute it more efficiently and also conserve memory, but the effort and barriers to entry are not insignificant if one has not worked with it before. AWS Lambda also does not offer a preconfigured runtime environment to run native binaries. This only became possible with the Custom Runtime API, with which one can create, upload, and use any custom runtimes.

Then, in early 2019, Red Hat introduced the Quarkus framework, which aims to shine with fast start-up times, convenient hot-reload options during development, and the option of native compilation using GraalVM via simple command line parameters. Designed for the development of microservices that are later executed in containers, it plays in the same league as Micronaut and Helidon, for example. However, Quarkus also supports the development of AWS Lambda functions. These features initially made Quarkus interesting to me and led me to investigate the framework and use it for the revision of the JUG DA registration app.

Having an opinion

Quarkus is considered an “opinionated” framework, so it does things in its own way and according to its own opinion. And that is exactly what we should keep in mind. What Quarkus can do and does, it does well, but in its own way and on its own terms. Thus, the framework may be excellent for certain use cases, and not at all for others.

I tried to go in with an open mind, and started by looking for a plugin for my IntelliJ development environment. There is already a plugin, but it is in a very early stage (version 0.0.3) and does not offer many features yet: only a wizard to create new projects or modules based on the generator of https://code.quarkus.io, and autocomplete based on the language server for Quarkus properties in the application.properties file. A debug facility is not (yet) included in the plug-in. Instead, when a Quarkus app is started in dev mode, a debug port is automatically opened so that you can connect to it with a remote debugger from the IDE. So that’s something. But that’s just the way it is when working with young frameworks for which the ecosystem and tooling are still in their nascent stages.

As a first step, I added the required Quarkus libraries as dependencies to the Maven pom.xml. I chose a solution using RESTEasy as the JAX-RS implementation and the AWS Lambda HTTP extension for Quarkus. The AWS Lambda extensions are still in “preview” status, so API and properties may still change during development. After all, when you’re “living on the (tech) edge” you’re used to that.

After that, I set about rewriting my Lambda function classes that covered HTTP event handling into JAX-RS annotated classes, and changing my own service class management from static singletons to CDI. This was basically straightforward; I’m familiar with the APIs of JAX-RS and CDI, and I also know the existing code well enough to not run into any problems here for now.

Quarkus uses the Highlander Principle – there can only be one!

After getting to a point where the codebase was compiling again after making the first changes, I was curious to see how it behaved when starting the application. But “Bang!” – the application doesn’t start. The Quarkus Maven plugin tells me that several handler classes were found, and that there was another custom handler in addition to the QuarkusStreamHandler from the quarkus-lambda-http extension. This is actually correct, this is my authorizer function, which will be called separately later by the API gateway and thus needs to be deployed separately as well. Until now, it was possible to manage multiple handler classes in one project and deploy them to different API gateway paths without any problems. With Quarkus, this is no longer the case. So here it is, Quarkus’s first strict opinion: There can be no other (handler class) except me! The Highlander Principle with a twist.

With Quarkus, there must be only one possible entry point into the code in a project (or module). This makes sense, but isn’t always helpful in the context of a serverless application with multiple related Lambda functions. At this point, however, I didn’t want to deal with this, and the topic of security would have to wait until later. So I deleted the Authorizer class and restarted the application. This worked surprisingly quickly and produced no errors without the handler class. A first HelloWorld request was also successful immediately. Was it that simple after all? I would have been amazed.

Another templating engine?

Now, as is well known, you shouldn’t count your chickens before they hatch: I tried to call the registration form. Unfortunately, this didn’t work, because the Handlebars template engine in Quarkus has its problems. I couldn’t locate what the problem was exactly. I guess it could have been class loader problems, because a call from Quarkus to the Handlebars engine resulted in a FileNotFoundException, but the files were exactly where the code wanted to find them. I tried several things to no avail. In the end, a normal templating engine would not easily work in a native image either, because most engines rely heavily and extensively on reflection. Native compilation means that the code is statically checked for all possible branches at compile time and only these resources are packed into the native artifact. Code that is used dynamically via reflection at runtime cannot be analyzed at compile time and is ultimately not included in the generated artifact. If you want code used via reflection to be included in a native GraalVM build, it must be specified in a config file at compile time so that it can be included. Templating for many frameworks means resolving a lot of code via reflection, which would end up in an almost unmanageable GraalVM configuration.
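
The difference between code that a static analysis can see and code that is only resolved via reflection can be illustrated with a few lines of plain Java. This is a minimal sketch with a hypothetical class name; it does not reproduce the Handlebars problem itself.

```java
import java.util.ArrayList;
import java.util.List;

final class ReflectionSketch {

    private ReflectionSketch() {
    }

    // Direct reference: a static analysis can see that ArrayList is used.
    static List<String> createDirectly() {
        return new ArrayList<>();
    }

    // Reflective lookup: the class name is only known at run time, so a
    // build-time analysis cannot tell which class must be kept.
    @SuppressWarnings("unchecked")
    static List<String> createReflectively(String className) {
        try {
            Class<?> type = Class.forName(className);
            return (List<String>) type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }
}
```

On a regular JVM both methods work. In a native image, the second one only works if the requested class was registered for reflection at build time, which is what the configuration file mentioned above is for.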

No templating mechanism was apparent in the Quarkus ecosystem at this point. There were only GitHub issues expressing a desire for them, but with no timeline yet. For this reason, I continued my search for a suitable templating framework and initially found what I was looking for with Rocker. Rocker is a templating engine that does completely without reflection and generates pure Java code at compile time, which ultimately generates the templates with near-zero copy rendering. A very interesting concept that I wanted to further evaluate and use later. In my eyes, Rocker would have been a good framework to integrate with Quarkus. After all, we don’t need another templating engine on the market, there are already (too) many good ones, and continuing to use a good solution makes sense after all.

As developer life goes, I couldn’t develop the application for a couple of days due to time constraints and left it lying around. At the same time, I read the first tweets on Twitter about Qute – the templating engine for Quarkus. Honestly, my first thought was: Wow, besides Quarkus they must have had a very strange name left that they didn’t know what to do with. Yes, naming things is hard. But back to the topic.

A new templating engine has appeared, still bearing the “experimental” tag, which means “we’re just trying it out and maybe we’ll throw it away”. Anyway, living on the edge! The main thing is that it works with Quarkus, and my templates are not that highly sophisticated.

The usage, or rather the integration into the code and the syntax of the template functions and placeholders, is of course different from other templating solutions, but I already expected that. Rewriting the templates and Java code was just a matter of working through the changes, nothing complicated. Since Qute is still quite young, it naturally does not include many functions yet: little more than simple property substitutions, if conditions, and loops work. To compensate, there is already a @TemplateExtension annotation, with which custom template extensions can be implemented if Qute does not yet provide the desired behavior.

For example, at one point in a template I wanted to react to the presence of a key in a map:

<div class="{#if myMap.containsKey('name')}has-error{/if}">…</div>

Unfortunately, containsKey() is not yet supported on maps. Eventually, however, I was able to create exactly this behavior with my own extension (Listing 1). The name of the @TemplateExtension method determines which call in the template it serves (here containsKey()), and its first parameter determines the type it applies to (here a map). Whenever the template calls containsKey() on a map, this method is executed; further parameters, such as the key here, are passed in from the template. It can be assumed that later versions of Qute will provide such functionality out of the box.

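import io.quarkus.qute.TemplateExtension;
import java.util.Map;

public class QuteExtension {
  // Invoked when a template calls containsKey(key) on a Map.
  @TemplateExtension
  static boolean containsKey(Map<?, ?> map, Object key) {
    return map.containsKey(key);
  }
}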

Security

After the application was up and running with its frontend and successfully rendered templates, it was time to address the issue of security in terms of authentication and authorization. As a reminder, in the context of AWS Lambda and API gateways, a Lambda function does not know an HTTP stack, but only an event with attributes from an HTTP request. The API gateway handles the actual HTTP communication with the requesting clients and then forwards the data to the Lambda function in the form of an APIGatewayProxyRequestEvent. If requests for a certain path, e.g. /admin, should only be forwarded for authorized requests, this must be configured in the API gateway and this path must be provided with an authorizer (IAM, Cognito or a separate Lambda function). This gives you at least two different entry points in the API gateway; for example /registration for publicly accessible registration to an event and /admin for administration. However, both paths can point to the same lambda function if it is implemented correctly and also evaluates the paths. This is the case when using Quarkus, RESTEasy and the Lambda HTTP extension.
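
What "evaluating the paths" inside a single Lambda function means can be sketched without any framework. In the real application this job is done by RESTEasy and the Lambda HTTP extension; the router below is only a hypothetical stand-in to show the idea.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for what a JAX-RS layer does inside one Lambda function.
final class PathRouterSketch {

    private final Map<String, Function<String, String>> routes = new LinkedHashMap<>();

    // Registers a handler for every request path starting with the prefix.
    PathRouterSketch route(String prefix, Function<String, String> handler) {
        routes.put(prefix, handler);
        return this;
    }

    // Dispatches to the first matching prefix, in registration order.
    String handle(String path) {
        for (Map.Entry<String, Function<String, String>> entry : routes.entrySet()) {
            if (path.startsWith(entry.getKey())) {
                return entry.getValue().apply(path);
            }
        }
        return "404 Not Found";
    }
}
```

A router built with route("/admin", ...) followed by route("/", ...) serves both API gateway entry points from the same deployed code, while the gateway alone decides which of the two paths requires an authorizer.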

Since the previously mentioned Quarkus Highlander Principle does not allow other Lambda functions in the same project or in the same runtime classpath context, I first tried to handle the authentication and authorization issue directly in the Quarkus application. The framework also already provides some extensions for this. I use a simple basic authentication in the application, and so I implemented the Properties File Based Authentication for a first test.

Basically, this worked, even though the extension pulled the dependencies of what felt like half an Undertow HTTP server into the project, which I don’t need at all thanks to the API gateway (and don’t want to have in my project). I say “basically” because the Quarkus application responds to a request without an Authorization header with a WWW-Authenticate: Basic header, which should inform the client that the corresponding credentials are missing. However, the API gateway rewrites this header into an x-amzn-Remapped-WWW-Authenticate header, so the client no longer recognizes that authorization information is missing. This mapping cannot be changed, and that is absolutely correct: for the actual place that does the HTTP handling, the API gateway, no authorization of the request is configured to begin with. The gateway must not return the header unmapped, because a request with a correct header would then also have to be checked and authorized by the API gateway. In this setup, the API gateway is not just a proxy that forwards HTTP requests to other HTTP servers; it also performs other tasks in the overall context, including the authorization of requests. And the Lambda function is not an HTTP server either; I’m just using Quarkus, a framework that happens to be able to speak HTTP.

So we have no choice but to authorize requests in the API gateway and use an authorizer function there for all /admin requests. A Java class in the same project will not work for the above reasons. However, I also wanted to avoid this turning into a Maven multi-module project just because of a single class, if possible. Thanks to polyglot programming and the possible Node.js runtime environment in AWS Lambda, I was able to implement the authorizer function in JavaScript in the end. It doesn’t interfere with the Java classpath of the Quarkus application, and I can manage the function in the same project. When deploying, of course, the function has to be packaged and uploaded separately, but this is easily done with the serverless framework. The configuration file for deployment with the serverless framework can be seen in a simplified form in Listing 2.

service: jugda-registration
 
provider:
  name: aws
  runtime: java8
  stage: prod
  region: eu-central-1
  memorySize: 2048
 
package:
  individually: true
 
functions:
  admin:
    handler: io.quarkus.amazon.lambda.runtime.QuarkusStreamHandler
    events:
      - http:
          path: /admin/{proxy+}
          method: any
          authorizer:
            name: basicAuthorizr
            type: token
            identitySource: method.request.header.Authorization
    package:
      artifact: target/jugda-registration-runner.jar
  public:
    handler: io.quarkus.amazon.lambda.runtime.QuarkusStreamHandler
    events:
      - http:
          path: /{proxy+}
          method: any
    package:
      artifact: target/jugda-registration-runner.jar
  basicAuthorizr:
    handler: js/basicAuthorizr.handler
    runtime: nodejs12.x
    memorySize: 128
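
The actual authorizer is written in JavaScript and is not shown in this article. Purely to illustrate what a token authorizer has to decide for the Authorization header named under identitySource in the listing above, here is a Java sketch of the Basic credential check. The class name and the way credentials are passed in are hypothetical, and the IAM policy document that a real authorizer returns is omitted.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class BasicAuthSketch {

    private BasicAuthSketch() {
    }

    // Returns true if the header carries the expected Basic credentials.
    static boolean isAuthorized(String authorizationHeader, String user, String password) {
        if (authorizationHeader == null || !authorizationHeader.startsWith("Basic ")) {
            return false;
        }
        String encoded = authorizationHeader.substring("Basic ".length()).trim();
        final String decoded;
        try {
            decoded = new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            return false; // not valid Base64
        }
        // Credentials have the form "user:password"; split on the first colon only.
        int colon = decoded.indexOf(':');
        if (colon < 0) {
            return false;
        }
        return decoded.substring(0, colon).equals(user)
                && decoded.substring(colon + 1).equals(password);
    }
}
```

A production authorizer would not compare against plain-text credentials like this; the sketch only shows the decoding and comparison step.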

Going Native

The application now runs in AWS Lambda under the Java 8 runtime, and I can also start it locally and thus comfortably develop, debug, and test it. The only thing missing is the ability to compile, deploy and run the Quarkus application natively via GraalVM. After all, I want to benefit from better (startup) performance while using less memory. According to the documentation, this should work quite easily with $ mvn package -Pnative. Provided the appropriate Maven profile is in the pom.xml, it references all necessary settings from the Quarkus pom.xml, which are already pre-configured for the developer. This is very comfortable and, for starters, saves me a long and deep familiarization with the world of GraalVM configuration.

However, a first attempt to compile unfortunately produced a whole handful of error messages that sound very strange and incomprehensible for a Java developer. I’m no longer in the Java world with the usual exceptions and the like, but in the native world where the error messages look different.

After some stack trace investigation and research, I was able to trace all error messages back to the AWS libraries used, much to my amazement. The rest of the code was apparently free of compile errors (which would still not rule out runtime errors). In the application, I use AWS APIs for DynamoDB and SES. Fortunately, there is already a Quarkus extension [17] for DynamoDB, so hopefully I can just use that and do not have to dive into the depths of implementing and customizing on top of Quarkus. That’s the plan, anyway. Customizing third-party libraries can be necessary for efficient use in Quarkus, but can also be time-consuming at first if you’ve never done it before. The Quarkus DynamoDB extension uses version 2 of the AWS Java API, a rewrite of the V1 API that decouples the code better and is more modular to use. In my application, I was still using the V1 API, which in itself is not a big deal because it is still supported and further developed by AWS. However, since the V2 API is still quite young, it cannot yet keep up with the functionality of the V1 API. For DynamoDB, for example, the Document API and the Object Mapper API are not yet implemented in V2, and the Object Mapper API in particular (annotations for POJOs, similar to JPA, and other high-level functions) is what I used in the application. So the first thing I had to do was rewrite the entire DynamoDB implementation in the application. Again, nothing complicated, but something that mainly takes time, especially when the new API is not yet familiar.
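
Without an Object Mapper, the conversion between POJOs and DynamoDB items has to be written by hand. The following sketch shows the shape of such code. So that it runs without the AWS SDK, it uses a Map<String, String> as a stand-in for the SDK's attribute map (with the V2 API the values would be AttributeValue objects); the class and attribute names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical POJO; field and attribute names are illustrative only.
final class RegistrationItem {
    final String eventId;
    final String email;
    final String name;

    RegistrationItem(String eventId, String email, String name) {
        this.eventId = eventId;
        this.email = email;
        this.name = name;
    }

    // Hand-written replacement for annotation-driven mapping: POJO to item.
    // With the real SDK the values would be AttributeValue objects.
    Map<String, String> toItem() {
        Map<String, String> item = new HashMap<>();
        item.put("eventId", eventId);
        item.put("email", email);
        item.put("name", name);
        return item;
    }

    // The reverse direction: item (for example from a query result) back to a POJO.
    static RegistrationItem fromItem(Map<String, String> item) {
        return new RegistrationItem(item.get("eventId"), item.get("email"), item.get("name"));
    }
}
```

This kind of mapping code is simple but repetitive, which is why rewriting it for every entity mainly costs time. It also involves no reflection, which suits a native image.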

In addition to DynamoDB, the application also depends on the Simple Email Service, for which I had likewise used V1 of the AWS Java API. However, there is no Quarkus extension for SES, at least so far. Using the two AWS Java API versions side by side in one project is feasible, but it increases the number of transitive dependencies. One reason is that the V1 API has a hard dependency on the Apache HttpClient, whereas the V2 API solves this in a modular way: the HTTP client can be replaced, for example, by one based on the java.net.URLConnection class included in the JDK. The advantage of the URLConnection-based client is that it starts faster than the Apache HttpClient, although it offers less throughput. In my use case, I don’t have to deal with high throughput rates, so I could take the faster startup and reduce the dependencies further at the same time. As such, I rewrote the SES implementation to the V2 API as well. A new native compile worked after that and the application was ready for deployment.

Well, nearly. I compiled the native image on my MacBook, which means it runs on macOS, but not on Linux, and thus not in AWS Lambda. Native really does mean native. Fortunately, Quarkus already offers support for this, which is purely configurational, provided you have Docker available in your environment, for example. With appropriate properties, Quarkus starts a Docker container during the native build, in which the native image is then built. Thus, you get a Linux-based native binary that can then also be run in AWS Lambda:

quarkus.native.container-build=true

quarkus.native.container-runtime=docker

While AWS Lambda does not have a pure Linux runtime environment, it does provide a way to create and use custom runtimes with the Custom Runtime API. One of the requirements is that the entry point to this custom runtime has the filename bootstrap. The generated artifacts of a Quarkus Lambda project created with the Quarkus archetypes kindly provide a Maven assembly configuration so that the generated native binary is renamed to bootstrap and immediately packaged into a zip file that can be uploaded to AWS Lambda. Unlike a predefined runtime, however, there is no need to specify a defined handler for a function during deployment; the bootstrap artifact takes care of that.
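
To make the role of the bootstrap executable more tangible: a custom runtime is essentially a loop that asks the Lambda Runtime API for the next event, processes it, and posts the result back. Quarkus ships its own implementation of this loop in the native binary; the following is only a simplified sketch with a hypothetical class name and without error handling.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.function.UnaryOperator;

final class CustomRuntimeSketch {

    private static final String API_VERSION = "2018-06-01";

    private CustomRuntimeSketch() {
    }

    static String nextInvocationUrl(String runtimeApi) {
        return "http://" + runtimeApi + "/" + API_VERSION + "/runtime/invocation/next";
    }

    static String responseUrl(String runtimeApi, String requestId) {
        return "http://" + runtimeApi + "/" + API_VERSION + "/runtime/invocation/" + requestId + "/response";
    }

    // Processes events forever: fetch the next event, run the handler, post the result.
    static void run(String runtimeApi, UnaryOperator<String> handler) throws IOException {
        while (true) {
            HttpURLConnection next = (HttpURLConnection) new URL(nextInvocationUrl(runtimeApi)).openConnection();
            String requestId = next.getHeaderField("Lambda-Runtime-Aws-Request-Id");
            String event = readAll(next.getInputStream());

            HttpURLConnection reply = (HttpURLConnection) new URL(responseUrl(runtimeApi, requestId)).openConnection();
            reply.setRequestMethod("POST");
            reply.setDoOutput(true);
            try (OutputStream out = reply.getOutputStream()) {
                out.write(handler.apply(event).getBytes(StandardCharsets.UTF_8));
            }
            reply.getResponseCode(); // forces the request to be sent
        }
    }

    private static String readAll(InputStream in) throws IOException {
        try (InputStream input = in; ByteArrayOutputStream buffer = new ByteArrayOutputStream()) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = input.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // AWS_LAMBDA_RUNTIME_API (host:port) is set by Lambda for custom runtimes.
        String runtimeApi = System.getenv("AWS_LAMBDA_RUNTIME_API");
        if (runtimeApi == null) {
            System.out.println("Not running inside AWS Lambda, nothing to do.");
            return;
        }
        run(runtimeApi, event -> event); // echo handler, for illustration only
    }
}
```

This also explains why no handler class has to be configured for the provided runtime: which code processes the event is decided inside the executable, not by Lambda.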

Deploy the native binary and call the application. It runs! Almost. The application cannot access the AWS APIs via HTTPS/TLS yet because the native image does not know how to handle HTTPS and does not know about certificates. For this, the runtime needs the Sun-EC library from a Linux JDK distribution (libsunec.so) so that the Sun-EC provider can be loaded correctly, and the cacerts file with the certificates from a JDK, possibly supplemented with your own self-signed certificates. These two files cannot be compiled into the native image; they must be deployed alongside it and made known to the runtime as external files through configuration. The Quarkus documentation describes how this can be done for a Docker environment, but doesn’t say a word about what is necessary in an AWS Lambda environment.

Lambda Layers provide a remedy for this. A layer is a zip file that is uploaded separately to Lambda, provides static (binary) files, and can be used by one or more Lambda functions. For example, the runtime libraries for a Groovy application could be deployed as a layer. A Lambda function references this layer and can use the libraries without having to deploy them itself. So in my case, I deploy the two files mentioned above as a layer named graalvm and reference them when starting the native function. The native binary itself can then no longer be named bootstrap; instead, I have to write my own bootstrap wrapper that makes the layer files known to the native binary via system properties and then calls it (Listing 3).

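#!/usr/bin/env bash
# Locate the native Quarkus binary that was packaged next to this script.
RUNNER=$( find . -maxdepth 1 -name '*-runner' )
export DISABLE_SIGNAL_HANDLERS=true
# Start it and point it at the files provided by the Lambda layer under /opt.
"$RUNNER" \
  -Djavax.net.ssl.trustStore=/opt/graalvm/jre/lib/security/cacerts \
  -Djavax.net.ssl.trustAnchors=/opt/graalvm/jre/lib/security/cacerts \
  -Djava.library.path=/opt/graalvm/jre/lib/amd64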

The bootstrap wrapper and the native image are packed together in the zip file to be deployed. The deployment configuration for the serverless framework for the native image including the GraalVMSecurity layer can be seen in Listing 4. Under the path specified for the layer, the files are located in the required directories and are zipped as they are and uploaded as a layer to Lambda.

service: jugda-registration
 
provider:
  name: aws
  runtime: provided
  stage: prod
  region: eu-central-1
  memorySize: 256
 
package:
  individually: true
 
functions:
  admin:
    handler: not.used.in.provided.runtime
    events:
      - http:
          path: /admin/{proxy+}
          method: any
          authorizer:
            name: basicAuthorizr
            type: token
            identitySource: method.request.header.Authorization
    package:
      artifact: target/function-admin.zip
    layers:
      - { Ref: GraalvmSecurityLambdaLayer }
  public:
    handler: not.used.in.provided.runtime
    events:
      - http:
          path: /{proxy+}
          method: any
    package:
      artifact: target/function-public.zip
    layers:
      - { Ref: GraalvmSecurityLambdaLayer }
  basicAuthorizr:
    handler: js/basicAuthorizr.handler
    runtime: nodejs12.x
    memorySize: 128
 
layers:
  GraalvmSecurity:
    path: lambda-layer

Is the application running now? Better, but not quite there yet.

The registration form appears and I can also submit it. The email is sent, but then the application runs into an error. A log analysis quickly shows that a class cannot be called via reflection when sending the email. Reflection? Yes, the error occurs when trying to map the API response to an instance of the XMLInputFactoryImpl class. This class is not known to the native image because it could not be detected by static code analysis. Several tips and hints are available on what to do in such and similar cases. In my case it was sufficient to declare this single class in a file named reflection-config.json

[{
  "name" : "com.sun.xml.internal.stream.XMLInputFactoryImpl",
  …
}]

and specify it in the application.properties as an additional build parameter:

quarkus.native.additional-build-args=-H:ReflectionConfigurationFiles=reflection-config.json
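
Why the static analysis misses such a class can be seen on a regular JVM: code that parses XML typically asks the XMLInputFactory API for a factory and never names the implementation class itself. The following small sketch (hypothetical class name) prints which implementation the lookup selects on the JVM it runs on; the result depends on the JDK and is therefore not stated here.

```java
import javax.xml.stream.XMLInputFactory;

final class XmlFactoryLookupSketch {

    private XmlFactoryLookupSketch() {
    }

    // The concrete implementation is chosen at run time by the factory lookup,
    // so the calling code contains no direct reference to it.
    static String implementationClassName() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(implementationClassName());
    }
}
```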

That’s it. The application runs fast and stable, with about one-eighth of the originally provisioned memory (256 MB instead of 2,048 MB). The current state of the application can be found in the public GitHub repository.

Conclusion

So, back to the question posed at the beginning: Is it possible to connect and operate various frameworks, platforms and technologies – some of which are still very young – to create an executable application in the real world? The answer is “yes”. However, I don’t want to hide the fact that some of the “little problems” I came up against cost me a few gray hairs, and my neighbors must have heard me swear more often than usual. More than once during the migration I was tempted to just give up on the project.

Admittedly, Quarkus is still a very young project – at the time of writing it is not even a year old. On the one hand it can do a lot, but on the other hand, it is still missing some important things. It is certainly not the all-singing and all-dancing solution to all of life’s problems. Quarkus is a framework with its own approach for certain scenarios. It scores points with regards to a containerized world and is thus made for the present. However, since this state will not last too long, in my eyes, and the infrastructure will move more and more in the direction of serverless workloads, Quarkus cannot keep up yet. The “Quarkus Highlander Principle” is simply too restrictive and not suitable for managing multiple, contextually related functions. Quarkus is very “opinionated” in my eyes – and that should not be forgotten. Migrating an existing application to Quarkus simply because you can, and because Quarkus seems so “hip” makes no sense. There should be good reasons to do so. Existing alternatives in the ecosystem should always be taken into consideration and compared against the real business requirements of the project. These are what is important, not the sensitivities and desires of developers/architects or the opinions of evangelists.

The actual integration with GraalVM and the associated ability to create native images works well as long as one deals with manageable services. However, if one uses third-party libraries that are not available as Quarkus extensions, this can be a challenge, especially if these libraries use reflection to a large extent and you want to compile the application natively with GraalVM. The configuration files that may become necessary for this can quickly make the project confusing again.

For the existing Quarkus extensions there is already extensive documentation, sometimes more detailed, sometimes less. However, the framework is growing so rapidly that some of the published guides still contain outdated information, such as properties and APIs that have changed in the meantime. Trust is good here, but verification is better.

There are also still many companies for which containerization and startup speed are not top-priority issues. For these, established solutions such as Spring (Boot) also do the job. Here, too, some performance can be gained through optimization. An important additional factor that must not be forgotten is the know-how of the available developers. Not every team can or wants to jump on a new technology bandwagon every two years. Effective and efficient use of existing knowledge and technologies can sometimes be more profitable than starting 0.2 seconds faster. Also, not every company is Netflix or Google, even if many think they have the same needs.

Still, the Quarkus approach is great, and it’s good to see something moving in terms of Java deployment in a container-based world. As I said, the framework is still young, and time will tell how it evolves. I only have one wish: Please come up with some better names!
