Never too late for some OpenCV and Clojure on AWS Lambda

In our previous post, we managed to run a YOLO-based deep neural network directly on a Raspberry Pi, and got object detection in near real time on pictures and video streams. The processing was done entirely locally, which is optimal for a local video stream but can be a little too power-hungry if you run a farm of these boards. Here are some not-so-easy-to-get power consumption values for the Raspberry Pi; you can easily see that heavy CPU usage doubles the energy consumption. In that case, a possible solution is to offload the processing from the Raspberry Pi onto servers, using easy-to-set-up lambdas.

You’ve probably tried AWS Lambda before, and some of you might even be using it in production. Lambdas are stateless handler functions, deployed and hosted remotely on AWS infrastructure.

In this post, we are going to create and deploy (obviously) Clojure-based lambdas, and call them from our beloved Raspberry Pi to analyse some pictures using neural networks via Origami.

The lambda will be developed and deployed from a local development machine, and will finally be called from the Raspberry Pi, as shown in the diagram below:

/lambda0.png

Setting up AWS CLI

To interact with and set up AWS services, one usually goes through the AWS command line interface.

To install awscli you can use apt:

sudo apt install awscli

but depending on your luck and timing, you may end up with a slightly old version and may want to try the python based install instead:

sudo pip3 install awscli --upgrade

Once you have the aws command ready, make sure to configure it with one of your IAM users.

Configure the AWS IAM user

In the AWS console, we simply set up a new IAM user, here named nico.

We also create a role, lando, to which we give full lambda access:

/lambda1.png

Then retrieve the Access Key ID and the Secret Access Key of that new user. We can now run the aws configuration on both the Raspberry Pi and our development machine.

aws configure

You then have to answer the usual questions on:

  • AWS Access Key ID
  • AWS Secret Access Key
  • Default region name
  • Default output format

That’s it for the fun in the AWS console, for now.

First Lambda in Clojure

This is a really brief summary of the existing documentation on the AWS website about writing and deploying a lambda in Clojure.

Here we will create a new Clojure project using lein, write a function that can be called by the lambda framework, and deploy it. We’ll call our little project lando.

“How you doin', ya old pirate? ...  

You’re used to it, so let’s make creating a new Clojure project brief:

lein new lando

Which gives us the project structure below:

├── project.clj
├── resources
├── src
│   └── lando
│       ├── core.clj

The project.clj file doesn’t need anything special, and so in short:

(defproject lando "0.1.0-SNAPSHOT"
  ... 
  :dependencies [
    [org.clojure/clojure "1.10.0"]
  ]
  :main lando.core
  :profiles {
    :uberjar {:aot :all}
  }
  :repl-options {:init-ns lando.core})

Note that there is no extra AWS specific library needed … for now. Also note that we are forcing Ahead-Of-Time compilation so that the Java classes are generated at compile time.

Now on to the core.clj file itself. Again this is pretty much a copy of the AWS documentation so nothing new. We use gen-class to create the Java function that will be called by the lambda framework.

Here the function takes a String and returns a String; both are Java objects. The core of the function is self-explanatory: it returns “Hello ”, followed by the string passed as a parameter and an exclamation mark.

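(ns lando.core
  (:gen-class
    ;; Expose a static Java method: String handler(String)
    :methods [^:static [handler [String] String]]))

;; The "-" prefix ties this function to the handler method declared above.
(defn -handler [s]
  (str "Hello " s "!"))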
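Before packaging anything, the handler logic can be tried at the REPL. A minimal sketch: the function body is copied from the namespace above, and gen-class is left out because it only matters at AOT compile time.

```clojure
;; Same body as lando.core/-handler, reproduced so the check is standalone.
(defn -handler [s]
  (str "Hello " s "!"))

;; Call it the way the lambda framework eventually will, with a plain String.
(println (-handler "Lando"))
```

If the greeting looks right here, any later failure is more likely to come from packaging or deployment than from the function itself.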

We’re done. Let’s create a jar file out of this, that can be deployed onto AWS Lambda.

lein uberjar

We now have a nice jar file in the target folder:

ls -al target/lando-0.1.0-SNAPSHOT-standalone.jar

To deploy our function we use the aws CLI and the create-function subcommand:

aws lambda create-function  --function-name core  \
 --handler lando.core::handler   \
 --runtime java8   \
 --memory-size 512   \
 --timeout 20   \
 --zip-file fileb://./target/lando-0.1.0-SNAPSHOT-standalone.jar \
 --role arn:aws:iam::817572757560:role/lando

To check that all went fine, you can run:

aws lambda list-functions

And check the output contains the newly deployed function:

{
    "Functions": [
        {
            "FunctionName": "core",
            "FunctionArn": "arn:aws:lambda:ap-northeast-1:817572757560:function:core",
            "Runtime": "java8",
            "Role": "arn:aws:iam::817572757560:role/lando",
            "Handler": "lando.core::handler",
            "CodeSize": 22140085,
            "Description": "",
            "Timeout": 20,
            "MemorySize": 512,
            "LastModified": "2019-09-20T07:56:11.644+0000",
            "CodeSha256": "KvnPmrSwEjWcUvDbhjy2dE2+VxhmjnAHqa2ghzhatMg=",
            "Version": "$LATEST",
            "TracingConfig": {
                "Mode": "PassThrough"
            },
            "RevisionId": "d699f624-6c6a-4e24-a8aa-15f4ad8da5cc"
        }
    ]
}

Usually, if something goes wrong at this stage, it’s because of missing AWS permissions, or a typo in the name of the handler function.

When using multiple profiles, make sure the one in use is the one you want, by setting the AWS_PROFILE environment variable.

export AWS_PROFILE="default"

The most complicated part is done; now we can call the lambda.

Recalling our deployment diagram from earlier on, we can now switch from the development machine to our Raspberry Pi. Calling the function is done via the invoke subcommand, using the function name chosen at creation time (core) and a JSON string as payload, since the handler takes a String:

aws lambda invoke --function-name core --payload '"Lando"' lando.log

And the status of the call is shown on completion:

{
    "StatusCode": 200,
    "ExecutedVersion": "$LATEST"
}

The result of the call itself is in the lando.log file:

$ cat lando.log 
"Hello Lando!"

“How you doin', ya old pirate? ...

Spicing things up by calling OpenCV/Origami.

Let’s see how it goes by using our favorite Clojure imaging library, origami.

Our example will be dead easy, and just outputs the OpenCV version in use, which also means that all the required native libraries have been found and can be loaded from within the lambda context.

In the dependencies section of project.clj, let’s add the new dependency:

:dependencies [
    [org.clojure/clojure "1.10.0"]
    [origami "4.1.1-6"]
  ]

And let’s create a new namespace origami, within the same lando project, with some code…

(ns lando.origami
    (:require [opencv4.core :refer :all])
    (:gen-class
      :methods [^:static [handler [String] String]]))


(defn -handler [s]
   (str "Using OpenCV Version: " VERSION ".."))

Nothing annoying yet in the code, we can run lein uberjar to create the jar file, and then run a create-function from the CLI, using the same role.

aws lambda create-function  --function-name origami  \
 --handler lando.origami::handler   \
 --runtime java8   \
 --memory-size 512   \
 --timeout 20   \
 --zip-file fileb://./target/lando-0.1.0-SNAPSHOT-standalone.jar \
 --role arn:aws:iam::817572757560:role/lando 

This time, things did not go too well at first glance… and we’re stuck with:

An error occurred (RequestEntityTooLargeException) when calling the CreateFunction operation: Request must be smaller than 69905067 bytes for the CreateFunction operation
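The odd-looking number in the error message can be explained with a bit of arithmetic. AWS documents a 50 MB limit for zipped packages uploaded directly, and the figure in the message happens to match 50 MiB scaled by 4/3, the expansion ratio of base64 encoding. This is an inference from the numbers, not something the error message states:

```clojure
;; 50 MiB: the documented cap for a zipped package uploaded directly.
(def zip-limit-bytes (* 50 1024 1024))

;; Base64 produces 4 output bytes for every 3 input bytes.
(def request-limit-bytes
  (long (Math/ceil (/ (* zip-limit-bytes 4) 3.0))))

(println "zip limit:    " zip-limit-bytes "bytes")
(println "request limit:" request-limit-bytes "bytes")
```

In other words, the jar itself has to stay under roughly 50 MB for a direct upload to go through.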

Let’s check the size of the generated jar file:

$ ls -alh target/lando-0.1.0-SNAPSHOT-standalone.jar 
-rw-r--r--  1 niko  staff    95M Sep 20 17:05 target/lando-0.1.0-SNAPSHOT-standalone.jar

That would be a little bit over the roughly 70M request limit enforced by AWS. Something you may not know is that origami bundles a version of OpenCV for each platform it is supposed to run on. This way everyone can get started very quickly, but it makes the resulting standalone jar file really big.

$ unzip -l target/lando-0.1.0-SNAPSHOT-standalone.jar | grep natives
        0  12-17-2018 10:27   natives/linux_32/
        0  08-02-2019 11:18   natives/linux_64/
 39529136  08-02-2019 11:18   natives/linux_64/libopencv_java411.so
        0  08-14-2019 10:11   natives/linux_arm/
 25907540  08-14-2019 10:12   natives/linux_arm/libopencv_java411.so
        0  08-02-2019 13:54   natives/linux_arm64/
 32350952  08-02-2019 13:55   natives/linux_arm64/libopencv_java411.so
        0  08-01-2019 14:47   natives/osx_64/
 82392636  08-01-2019 14:47   natives/osx_64/libopencv_java411.dylib
        0  12-17-2018 10:27   natives/windows_32/
        0  08-01-2019 16:36   natives/windows_64/
 55030784  08-01-2019 16:37   natives/windows_64/opencv_java411.dll

Since lambdas run on Linux, we can keep just the compiled library that is needed, i.e. linux_64/libopencv_java411.so.
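To get a rough idea of what can be saved, the sizes from the listing above can be added up. Note that these are uncompressed sizes, so the gain on the jar file itself will be smaller:

```clojure
;; Uncompressed sizes in bytes, copied from the unzip listing above.
(def natives
  {"linux_64"    39529136
   "linux_arm"   25907540
   "linux_arm64" 32350952
   "osx_64"      82392636
   "windows_64"  55030784})

;; Everything bundled, and everything that is not needed on AWS Lambda.
(def total (reduce + (vals natives)))
(def removable (reduce + (vals (dissoc natives "linux_64"))))

(defn mib [n] (quot n (* 1024 1024)))

(println "all natives:" (mib total) "MiB")
(println "removable:  " (mib removable) "MiB")
```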

Let’s slim things a bit:

 zip -d target/lando-0.1.0-SNAPSHOT-standalone.jar  \
 natives/windows_64/opencv_java411.dll  \
 natives/osx_64/libopencv_java411.dylib \
 natives/linux_arm64/libopencv_java411.so \
 natives/linux_arm/libopencv_java411.so 

And check the size of the standalone jar again:

$ ls -alh target/lando-0.1.0-SNAPSHOT-standalone.jar 
-rw-r--r--  1 niko  staff    21M Sep 20 17:14 target/lando-0.1.0-SNAPSHOT-standalone.jar

Way better. Now we can deploy the origami function by running the same create-function command again, and then invoke it:

aws lambda invoke --function-name origami --payload '""' origami.log
$ cat origami.log 
"Using OpenCV Version: 4.1.1.."

Running our Deep Neural Network on Lambda

You’re probably used to it by now if you have followed this series of posts on running a DNN with Clojure: we’ll pretty much just wrap the previously written code and call it from our lambda handler. This time the function will create the network and run it on the lambda’s input, the URL of a picture whose content we want to identify. We’ll put this code in a new namespace named lando.dnn.

(ns lando.dnn
  (:require
    [origami-dnn.net.yolo :as yolo]
    [opencv4.core :refer :all]
    [opencv4.utils :as u]
    [origami-dnn.core :as origami-dnn])
  (:gen-class
    :methods [^:static [handler [String] String]]))

;; Keep only the label name and the confidence of each detection;
;; the image and the bounding boxes are ignored here.
(defn result! [result labels]
  (let [detected (second result)]
    (map (fn [{:keys [confidence label]}]
           {:label (nth labels label) :confidence confidence})
         detected)))

;; Load the network, fetch the image at the given URL and run detection on it.
(defn run-yolo [input]
  (let [[net opts labels] (origami-dnn/read-net-from-repo "networks.yolo:yolov2-tiny:1.0.0")]
    (println "Running yolo on image:" input)
    (-> input
        (u/mat-from-url)
        (yolo/find-objects net)
        (result! labels))))

;; Lambda entry point: s is the URL of the picture.
(defn -handler [s]
  (apply str (run-yolo s)))

In the code snippet above, note that in this example we are only interested in the found objects: their label and the confidence associated with each one. The result! function formats the result returned by find-objects, and replaces the usual blue boxes that are drawn directly on the picture.
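To see what result! produces without loading a network, it can be fed made-up data. This is only a sketch: the labels and detections below are invented, and the shape of the input (an image followed by a list of detections) is inferred from the code above rather than from the origami-dnn documentation.

```clojure
;; Condensed version of result!, reproduced so the sketch is standalone.
(defn result! [result labels]
  (let [detected (second result)]
    (map (fn [{:keys [confidence label]}]
           {:label (nth labels label) :confidence confidence})
         detected)))

;; Made-up labels and detections, for illustration only.
(def labels ["person" "dog" "cat"])
(def fake-result
  [:annotated-image
   [{:label 2 :confidence 0.75 :box [10 10 100 100]}
    {:label 1 :confidence 0.5 :box [50 50 80 80]}]])

(def formatted (result! fake-result labels))

(println formatted)
;; What the handler returns: the maps printed back to back.
(println (apply str formatted))
```

With more than one detection, apply str glues the maps together without any separator, which is fine for a quick look but awkward to parse on the calling side.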

The project now requires a new dependency on origami-dnn, that can simply be added with:

:dependencies [
    [org.clojure/clojure "1.10.0"]
    [origami-dnn "0.1.5"]
  ]

This time when deploying, we do need to specify a Java system property, which origami-dnn uses to decide where to expand the network files. Here we point it at /tmp, which is kept between invocations that reuse the same lambda container, thus avoiding retrieving the files over and over again.

This is done by setting JAVA_TOOL_OPTIONS through the --environment switch.

aws lambda create-function  --function-name dnn  \
 --handler lando.dnn::handler   \
 --runtime java8   \
 --memory-size 512   \
 --timeout 20   \
 --zip-file fileb://./target/lando-0.1.0-SNAPSHOT-standalone.jar \
 --role arn:aws:iam::817572757560:role/lando \
 --environment Variables="{JAVA_TOOL_OPTIONS=-Dnetworks.local=/tmp}" 

The function needs a picture, so obviously we’ll find a picture of a cat …

/lambda2.jpg

And feed it to the function call.

aws lambda invoke --function-name dnn --payload '"https://images.unsplash.com/photo-1518791841217-8f162f1e1131"' dnn.log

And we’ll be happy to know that this is a cat:

$ cat dnn.log 
"{:label \"cat\", :confidence 0.7528027}"

Great. You can try with a few different pictures. Also notice that even though the first run was quite slow, because it needed to fetch and expand the compiled network files, subsequent runs can reuse those files, and the network loads almost instantly.

Running the DNN on a list of images and returning some JSON

“You might want to buckle up, baby.”

The last example builds on the previous image detection lambda, but takes an array of images as input and returns a well-formatted JSON answer containing the detection results for each input.

Clojure’s Cheshire library is added to project.clj, and is used to generate the resulting JSON.

    [cheshire "5.9.0"]

And we’ll create a lando.dnn2 namespace, which starts as a copy paste of lando.dnn, and brings a few updates.

First, the generated class has a slightly different method signature: we tell the framework that the input will be a list, not a string.

(:gen-class
      :methods [^:static [handler [java.util.List] String]])
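Nothing special is needed on the Clojure side to consume that list, because a java.util.List is seqable and can be passed straight to map. A small sketch, assuming the runtime hands over an ordinary java.util.List of strings; file-name is a hypothetical stand-in for run-yolo and the URLs are placeholders:

```clojure
(require '[clojure.string :as str])

;; Stand-in for the list built from a JSON array payload.
(def payload
  (java.util.ArrayList. ["https://example.com/a.jpg" "https://example.com/b.jpg"]))

;; Hypothetical per-item function, used here in place of run-yolo.
(defn file-name [url]
  (last (str/split url #"/")))

;; map walks the Java list like any Clojure sequence.
(def names (doall (map file-name payload)))

(println names)
```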

Second, we move the preparation of the network outside the declaration of the sub functions, to make things slightly faster.

(let [[net opts labels] (origami-dnn/read-net-from-repo "networks.yolo:yolov2-tiny:1.0.0")]
   (defn result! [result labels]
     ...)
    
   (defn run-yolo [ input ]
     ...)

)
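The same load-once effect can also be obtained with a delay, which postpones the loading until the first call instead of doing it when the namespace is loaded. This is an alternative sketch, not the approach used in this post; load-network is a hypothetical stand-in for read-net-from-repo that only counts how often it runs.

```clojure
;; Counts how many times the (fake) network gets loaded.
(def load-count (atom 0))

;; Hypothetical stand-in for origami-dnn/read-net-from-repo.
(defn load-network [spec]
  (swap! load-count inc)
  {:spec spec :labels ["dog" "cat"]})

;; The body of a delay runs at most once, on the first deref.
(def network (delay (load-network "networks.yolo:yolov2-tiny:1.0.0")))

(defn run-on [input]
  (let [{:keys [labels]} @network]
    {:input input :known-labels (count labels)}))

(run-on "a.jpg")
(run-on "b.jpg")
(println "network loaded" @load-count "time(s)")
```

One reason to consider this variant is that top-level forms are also evaluated during AOT compilation, so a top-level let that reads the network would run at build time as well.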

Finally, we use cheshire’s generate-string to return properly formatted json.

(generate-string (doall (map run-yolo s)))

Putting it all together, the full namespace gives:

(ns lando.dnn2
  (:require
    [origami-dnn.net.yolo :as yolo]
    [opencv4.core :refer :all]
    [opencv4.utils :as u]
    [origami-dnn.core :as origami-dnn]
    [cheshire.core :refer [generate-string]])
  (:gen-class
    :methods [^:static [handler [java.util.List] String]]))

;; The network is read once, when the namespace is loaded, and shared by the functions below.
(let [[net opts labels] (origami-dnn/read-net-from-repo "networks.yolo:yolov2-tiny:1.0.0")]

  (defn result! [result labels]
    (let [detected (second result)]
      (doall
        (map (fn [{:keys [confidence label]}]
               {:label (nth labels label) :confidence confidence})
             detected))))

  (defn run-yolo [input]
    (println "Running yolo on image:" input)
    (-> input
        (u/mat-from-url)
        (yolo/find-objects net)
        (result! labels))))

;; Lambda entry point: s is a java.util.List of image URLs.
(defn -handler [s]
  (println (first s))
  (generate-string (doall (map run-yolo s))))

Deployment is exactly the same as before, except for the updated function name, now dnn2, and the handler, now lando.dnn2::handler.

The payload to send to the function is also super long, but is basically an array with different images we want to run object detection on.

aws lambda invoke \
--function-name dnn2 \
--payload '["https://image.cnbcfm.com/api/v1/image/105992231-1561667465295gettyimages-521697453.jpeg?v=1561667497&w=630&h=354","https://a57.foxnews.com/static.foxnews.com/foxnews.com/content/uploads/2019/07/931/524/creepy-cat.jpg","https://a57.foxnews.com/static.foxnews.com/foxnews.com/content/uploads/2019/07/931/524/creepy-cat.jpg","https://images.unsplash.com/photo-1518791841217-8f162f1e1131?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=900&q=60"]' \
dnn2.log

As before, the first call is quite slow, because it needs to fetch and expand the compiled network files. The result ends up in dnn2.log:

$ cat dnn2.log 

"[[{\"label\":\"dog\",\"confidence\":0.7746848}],[{\"label\":\"cat\",\"confidence\":0.6056155}],[{\"label\":\"cat\",\"confidence\":0.6056155}],[{\"label\":\"cat\",\"confidence\":0.78931814}]]"

And it’s all about cats and dogs in the end …

/lambda2.jpg /lambda3.jpg /lambda4.jpg