Develop Lambda with Node.js

This article is a short introduction to Node.js from a Java developer’s perspective. The goal is to create an AWS Lambda function that solves the problem described here: How I started with Serverless.

The merits of AWS Lambda UI are beyond the scope of this tutorial.

What is Node.js?

The formal definition of Node.js, as given on the official website, is:

Node.js® is a JavaScript runtime built on Chrome’s V8 JavaScript engine.

The V8 JavaScript engine was first designed to improve the performance of JavaScript execution in web browsers, specifically Chrome. Later, V8 was embedded in a standalone runtime, which resulted in Node.js. An analogy to Java follows:

Every Java application starts as a single-threaded application: the main thread is the one created by invoking public static void main(String[]). This thread can then spawn any number of additional threads, as specified by the developer. In the HotSpot JVM there is a direct mapping between a Java thread and a native operating system thread.

On the other hand, Node.js uses an event-driven, non-blocking I/O model. Event-based applications divide work by using callbacks, an event loop and a queue. The unit of work, or task, is simply a callback. The event loop will listen for an event saying that an asynchronous operation (IO) has completed and then execute the callback that was registered when the operation started.
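The ordering that the event loop imposes can be seen in a minimal sketch. The timer below is only a stand-in for an asynchronous I/O operation, and the order array is a hypothetical helper used to record what runs when.

```javascript
// Records the order in which the pieces of this script run.
const order = [];

order.push('1: start');

// Register a callback. The timer stands in for an asynchronous I/O operation.
setTimeout(() => {
  // Runs later, once the event loop picks the finished task from the queue.
  order.push('3: callback executed');
  console.log(order.join(' -> '));
}, 10);

// The main script is not blocked while the "operation" is pending.
order.push('2: end of main script');
```

The callback runs only after the main script has finished, even though it was registered earlier in the file.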

Package manager

The default package manager for the Node.js environment is npm. npm is also a software registry that contains public and private modules/packages that applications can depend on. So npm is roughly the equivalent of mvn or gradle for Java.

The command npm init is similar to mvn archetype:generate or gradle init, with the difference that it does not generate any directories, only the package.json file. The package.json file follows concepts similar to those of the Java build tools: you define the application name, the dependencies and more. More details are in the section Packaging & Bundling.

The other popular package manager for JavaScript is yarn, which is built on the npm registry but has some characteristics that differentiate it from npm. Yarn tried to address some problems with npm, such as inconsistent dependency versions. Since npm 5 introduced package-lock.json, the differences are in the details.

The Lambda function

At the time I faced the problem described in the introduction, AWS Lambda supported only Node.js version 6. The source code of the Node.js lambda can be found in this repository: aws-lambda-to-cognito

On line 1 we can see that aws-sdk is required. require() is the closest equivalent of import in Java.

Java’s import statement is pure syntactic sugar. An import is evaluated only at compile time, to tell the compiler where to find the class names used in the code.

In Node.js, source code is visible to other source code only if both are in the same file, so we can say that everything is private by default. To use code from another file, which corresponds to a module, one should call require() to load whatever that module exposes. Usually the return value is assigned to a variable. It is worth mentioning that require() in Node.js runs synchronously, so it is good practice to call it at the top of the file rather than inside a function, where it could block the execution.

On line 3 the service object for Cognito is initialized, together with the userRequest & response structure that will be used inside the actual lambda.

On line 18 the handler function is defined, which is the function that AWS Lambda calls to start the execution. AWS Lambda passes:

  1. the event that triggered the execution, which in our case is an event from Amazon API Gateway,
  2. the context object, which provides useful runtime information, e.g. how long before the execution terminates,
  3. the callback that the function will invoke when the execution is done.
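The three parameters can be sketched in a bare handler skeleton. This is not the handler from the repository: the event shape (a JSON string in event.body) and the response shape (statusCode and body) are assumptions based on the API Gateway proxy integration, and the email field is a hypothetical input.

```javascript
// Hypothetical handler skeleton in the Node.js callback style.
const handler = (event, context, callback) => {
  // 1. The event: assumed here to carry a JSON string in event.body.
  const body = event.body ? JSON.parse(event.body) : {};

  // 2. The context: runtime information, e.g. the time left before a timeout.
  const remainingMs = context.getRemainingTimeInMillis();

  // 3. The callback: first argument is an error (or null), second is the result.
  callback(null, {
    statusCode: 200,
    body: JSON.stringify({ email: body.email || null, remainingMs }),
  });
};

exports.handler = handler;
```

Locally, the handler can be exercised by passing a hand-made event, a fake context object that implements getRemainingTimeInMillis, and a callback that logs the result.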

Lines 28 to 56 contain the actual logic of what we want to achieve. Given a User Pool Id and a user’s email, we want to retrieve the user and, if the UserStatus is UNCONFIRMED, remove the user from the pool. The AWS SDK provides the adminGetUser and adminDeleteUser API operations, which are both asynchronous. This means that they use callbacks: instead of immediately returning a result, each call takes some time to produce one and then hands it to a callback. A callback function is therefore the code that you want executed when the asynchronous call completes.
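The flow described above can be sketched with nested callbacks. This is an illustration, not the code from the repository: the cognito object is a stub that stands in for the SDK's Cognito service object so the sketch runs without AWS credentials, the function name removeIfUnconfirmed is hypothetical, and using the email as the Username is an assumption about how the pool is configured.

```javascript
// In-memory stand-in for the user pool (hypothetical data).
const users = {
  'jane@example.com': { Username: 'jane@example.com', UserStatus: 'UNCONFIRMED' },
};

// Stub mimicking the callback style of adminGetUser and adminDeleteUser.
const cognito = {
  adminGetUser(params, cb) {
    setImmediate(() => {
      const user = users[params.Username];
      if (user) cb(null, user);
      else cb(new Error('UserNotFoundException'));
    });
  },
  adminDeleteUser(params, cb) {
    setImmediate(() => {
      delete users[params.Username];
      cb(null, {});
    });
  },
};

// Retrieve the user and delete it only when it is still UNCONFIRMED.
function removeIfUnconfirmed(userPoolId, email, done) {
  const params = { UserPoolId: userPoolId, Username: email };
  cognito.adminGetUser(params, (getErr, user) => {
    if (getErr) return done(getErr);
    if (user.UserStatus !== 'UNCONFIRMED') return done(null, false);
    // The second call can only start inside the first call's callback.
    cognito.adminDeleteUser(params, (deleteErr) => {
      if (deleteErr) return done(deleteErr);
      done(null, true);
    });
  });
}
```

With only two calls the nesting is still readable, but every additional dependent call adds another level of indentation.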

Coming from a Java background, where code is executed sequentially, meaning that a method call on line 1 finishes and returns a result before the code on line 2 starts executing, my first attempt at developing with Node.js resulted in what is called callback hell. I simply tried to synchronize the two asynchronous method calls via nested callbacks. JavaScript provides two approaches to escape this hell: Promises and, in later versions, async/await. Since I like to iterate on my solutions while learning new technologies, these approaches will be used in a subsequent post. For now my solution works.

On line 58 the callback parameter is used to return information back to the caller. It always returns a successful response for security reasons.
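For comparison, here is a rough sketch of how the same two calls could read with async/await. It is not the solution used in this article, and it needs a newer runtime than Node.js version 6. The cognito object is again a stub; it imitates the .promise() method that AWS SDK v2 request objects offer, and removeIfUnconfirmed is a hypothetical name.

```javascript
// In-memory stand-in for the user pool (hypothetical data).
const users = {
  'jane@example.com': { Username: 'jane@example.com', UserStatus: 'UNCONFIRMED' },
};

// Stub imitating the request.promise() style of AWS SDK v2.
const cognito = {
  adminGetUser: (params) => ({
    promise: () =>
      users[params.Username]
        ? Promise.resolve(users[params.Username])
        : Promise.reject(new Error('UserNotFoundException')),
  }),
  adminDeleteUser: (params) => ({
    promise: () => {
      delete users[params.Username];
      return Promise.resolve({});
    },
  }),
};

// Reads top to bottom like sequential Java code, yet stays non-blocking.
async function removeIfUnconfirmed(userPoolId, email) {
  const params = { UserPoolId: userPoolId, Username: email };
  const user = await cognito.adminGetUser(params).promise();
  if (user.UserStatus !== 'UNCONFIRMED') return false;
  await cognito.adminDeleteUser(params).promise();
  return true;
}
```

Errors from either call surface as a rejected promise, so the caller can handle them with a single try/catch or .catch().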

Packaging & Bundling

The package.json is a JSON file that can be created with the npm init command. Its primary function is to manage the dependencies of the JavaScript project, as well as to define scripts for generating builds and running tests. There are no fixed requirements on what a package.json file must contain. Below is a description of the properties used in this project.

  • name sets the package name
  • version indicates the current version
  • scripts defines a set of node scripts you can run
  • dependencies sets a list of npm packages installed as dependencies
  • devDependencies sets a list of npm packages installed as development dependencies

All the properties are similar to maven or gradle properties, except for scripts. In the scripts section we define a set of node scripts that we want to run. These scripts are command line applications and can be executed by simply typing npm run <name>.

So, running npm run build will remove any folder named dist in the base path of the project and then, with the help of cross-env (https://www.npmjs.com/package/cross-env), it will set the environment variable NODE_ENV to production and run webpack.

Similarly, running npm run package will zip everything under dist/ into a dist.zip, which is basically the way the lambda function is packaged and distributed.
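A package.json matching this description might look roughly like the fragment below. It is a sketch, not the file from the repository: the name and version are placeholders, the "*" version ranges are placeholders, and the exact commands behind build and package (rm -rf and zip here) are assumptions that only reproduce the behaviour described above. Recent webpack versions would also need webpack-cli among the development dependencies.

```json
{
  "name": "aws-lambda-to-cognito",
  "version": "1.0.0",
  "scripts": {
    "build": "rm -rf dist && cross-env NODE_ENV=production webpack",
    "package": "cd dist && zip -r ../dist.zip ."
  },
  "dependencies": {
    "aws-sdk": "*"
  },
  "devDependencies": {
    "cross-env": "*",
    "webpack": "*"
  }
}
```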

While working with AWS Lambda, one of the challenges is how to bundle everything needed into a .zip file. The simplest solution is to zip the JavaScript function file together with the entire node_modules directory. This would work in the given example, but it is not maintainable and will not scale. When the function evolves, for example by gaining more shared scripts, creating a zip bundle would mean following all the require() calls and imports to make sure that we include exactly the node_modules the function needs.

Instead of this manual approach, we use webpack, a module bundler, to bundle the Lambda function. It takes all files (JavaScript files with modules and dependencies, etc.) that are connected in some way and bundles them into standard JavaScript files. webpack.config.js is the main configuration file. In this file we define:

  • the entry point of our application
  • the file extensions and modules that will be resolved
  • what should be generated from this build process

Editor

For developing everything described in this series of articles I used IntelliJ IDEA together with the NodeJS plugin.