Mark Needham

Thoughts on Software Development

Archive for the ‘Software Development’ Category

Asciidoctor: Creating a macro

without comments

I’ve been writing the TWIN4j blog for almost a year now and during that time I’ve written a few different asciidoc macros to avoid repetition.

The most recent one I wrote does the formatting around the Featured Community Member of the Week. I call it like this from the asciidoc, passing in the name of the person and a link to an image:

featured::https://s3.amazonaws.com/dev.assets.neo4j.com/wp-content/uploads/20180202004247/this-week-in-neo4j-3-february-2018.jpg[name="Suellen Stringer-Hye"]

The code for the macro has two parts. The first is some wiring code that registers the macro with Asciidoctor:

lib/featured-macro.rb

RUBY_ENGINE == 'opal' ? (require 'featured-macro/extension') : (require_relative 'featured-macro/extension')
 
Asciidoctor::Extensions.register do
  if (@document.basebackend? 'html') && (@document.safe < SafeMode::SECURE)
    block_macro FeaturedBlockMacro
  end
end

And this is the code for the macro itself:

lib/featured-macro/extension.rb

require 'asciidoctor/extensions' unless RUBY_ENGINE == 'opal'
 
include ::Asciidoctor
 
class FeaturedBlockMacro < Extensions::BlockMacroProcessor
  use_dsl
 
  named :featured
 
  def process parent, target, attrs
    name = attrs["name"]
 
    html = %(<div class="imageblock image-heading">
                <div class="content">
                    <img src="#{target}" alt="#{name} - This Week’s Featured Community Member" width="800" height="400">
                </div>
            </div>
            <p style="font-size: .8em; line-height: 1.5em;" align="center">
              <strong>#{name} - This Week's Featured Community Member</strong>
            </p>)
 
    create_pass_block parent, html, attrs, subs: nil
  end
end

When we convert the asciidoc into HTML we need to tell asciidoctor about the macro, which we can do like this:

asciidoctor template.adoc \
  -r ./lib/featured-macro.rb \
  -o -

And that’s it!

Written by Mark Needham

February 19th, 2018 at 8:51 pm

Posted in Software Development

Tagged with ,

Asciidoc to Asciidoc: Exploding includes

without comments

One of my favourite features in AsciiDoc is the ability to include other files, but when using lots of includes is that it becomes difficult to read the whole document unless you convert it to one of the supported backends.

$ asciidoctor --help
Usage: asciidoctor [OPTION]... FILE...
Translate the AsciiDoc source FILE or FILE(s) into the backend output format (e.g., HTML 5, DocBook 4.5, etc.)
By default, the output is written to a file with the basename of the source file and the appropriate extension.
Example: asciidoctor -b html5 source.asciidoc
 
    -b, --backend BACKEND            set output format backend: [html5, xhtml5, docbook5, docbook45, manpage] (default: html5)
                                     additional backends are supported via extensions (e.g., pdf, latex)

I don’t want to have to convert my code to one of these formats each time – I want to convert asciidoc to asciidoc!

For example, given the following files:

mydoc.adoc

= My Blog example
 
== Heading 1
 
Some awesome text
 
== Heading 2
 
include::blog_include.adoc[]

blog_include.adoc

Some included text

I want to generate another asciidoc file where the contents of the include file are exploded and displayed inline.

After a lot of searching I came across an excellent script written by Dan Allen and put it in a file called adoc.rb. We can then call it like this:

$ ruby adoc.rb mydoc.adoc
= My Blog example
 
== Heading 1
 
Some awesome text
 
== Heading 2
 
Some included text

Problem solved!

In my case I actually wanted to explode HTTP includes so I needed to pass the -a allow-uri-read flag to the script:

$ ruby adoc.rb mydoc.adoc -a allow-uri-read

And now I can generate asciidoc files until my heart’s content.

Written by Mark Needham

January 23rd, 2018 at 9:11 pm

Posted in Software Development

Tagged with ,

Strava: Calculating the similarity of two runs

without comments

I go running several times a week and wanted to compare my runs against each other to see how similar they are.

I record my runs with the Strava app and it has an API that returns lat/long coordinates for each run in the Google encoded polyline algorithm format.

We can use the polyline library to decode these values into a list of lat/long tuples. For example:

import polyline
polyline.decode('u{~vFvyys@fS]')
[(40.63179, -8.65708), (40.62855, -8.65693)]

Once we’ve got the route defined as a set of coordinates we need to compare them. My Googling led me to an algorithm called Dynamic Time Warping

DTW is a method that calculates an optimal match between two given sequences (e.g. time series) with certain restrictions.

The sequences are “warped” non-linearly in the time dimension to determine a measure of their similarity independent of certain non-linear variations in the time dimension.

The fastdtw library implements an approximation of this library and returns a value indicating the distance between sets of points.

We can see how to apply fastdtw and polyline against Strava data in the following example:

import os
import polyline
import requests
from fastdtw import fastdtw
 
token = os.environ["TOKEN"]
headers = {'Authorization': "Bearer {0}".format(token)}
 
def find_points(activity_id):
    r = requests.get("https://www.strava.com/api/v3/activities/{0}".format(activity_id), headers=headers)
    response = r.json()
    line = response["map"]["polyline"]
    return polyline.decode(line)

Now let’s try it out on two runs, 1361109741 and 1346460542:

from scipy.spatial.distance import euclidean
 
activity1_id = 1361109741
activity2_id = 1346460542
 
distance, path = fastdtw(find_points(activity1_id), find_points(activity2_id), dist=euclidean)
 
>>> print(distance)
2.91985018100644

These two runs are both near my house so the value is small. Let’s change the second route to be from my trip to New York:

activity1_id = 1361109741
activity2_id = 1246017379
 
distance, path = fastdtw(find_points(activity1_id), find_points(activity2_id), dist=euclidean)
 
>>> print(distance)
29383.492965394034

Much bigger!

I’m not really interested in the actual value returned but I am interested in the relative values. I’m building a little application to generate routes that I should run and I want it to come up with a routes that are different to recent ones that I’ve run. This score can now form part of the criteria.

Written by Mark Needham

January 18th, 2018 at 11:35 pm

Morning Pages: What should I write about?

without comments

I’ve been journalling for almost 2 years now but some days I get stuck and can’t think of anything to write about.

I did a bit of searching to see if anybody had advice on solving this problem and found a few different articles:

The articles talk about different approaches to journalling and since I’m not following one particular approach I thought I’d summarise all their ideas and put them in a document that I can use. I’ve also added some of my own ones.

Here’s the list:

  • Something you’re grateful for
  • An unusual event
  • Goals or hopes
  • A past event
  • Things that are stopping you achieving your goals
  • Values that are important to you
  • Ideas or nagging thoughts
  • What did you do yesterday?

    • What did you work on?
    • Who did you talk to?
    • What book did you read?
    • What podcasts did you listen to?
    • What TV shows/movies did you watch?
    • Where did you go?
  • What are you doing today?
  • What scares you?
  • Decisions you need to make/made – small or big

If you have any other ideas of what I can write about let me know in the comments. While researching for this post I noticed that Julia Cameron, the inventor of Morning Pages, has a new book out – The Right to Write: An Invitation and Initiation into the Writing Life – so I’m hoping to get some ideas from there as well.

Written by Mark Needham

December 27th, 2017 at 11:28 pm

Posted in Software Development

Tagged with ,

Serverless: Building a mini producer/consumer data pipeline with AWS SNS

with one comment

I wanted to create a little data pipeline with Serverless whose main use would be to run once a day, call an API, and load that data into a database.

It’s mostly used to pull in recent data from that API, but I also wanted to be able to invoke it manually and specify a date range.

I created the following pair of lambdas that communicate with each other via an SNS topic.

The code

serverless.yml

service: marks-blog

frameworkVersion: ">=1.2.0 <2.0.0"

provider:
  name: aws
  runtime: python3.6
  timeout: 180
  iamRoleStatements:
    - Effect: 'Allow'
      Action:
        - "sns:Publish"
      Resource:
        - ${self:custom.BlogTopic}

custom:
  BlogTopic:
    Fn::Join:
      - ":"
      - - arn
        - aws
        - sns
        - Ref: AWS::Region
        - Ref: AWS::AccountId
        - marks-blog-topic

functions:
  message-consumer:
    name: MessageConsumer
    handler: handler.consumer
    events:
      - sns:
          topicName: marks-blog-topic
          displayName: Topic to process events
  message-producer:
    name: MessageProducer
    handler: handler.producer
    events:
      - schedule: rate(1 day)

handler.py

import boto3
import json
import datetime
from datetime import timezone
 
def producer(event, context):
    sns = boto3.client('sns')
 
    context_parts = context.invoked_function_arn.split(':')
    topic_name = "marks-blog-topic"
    topic_arn = "arn:aws:sns:{region}:{account_id}:{topic}".format(
        region=context_parts[3], account_id=context_parts[4], topic=topic_name)
 
    now = datetime.datetime.now(timezone.utc)
    start_date = (now - datetime.timedelta(days=1)).strftime("%Y-%m-%d")
    end_date = now.strftime("%Y-%m-%d")
 
    params = {"startDate": start_date, "endDate": end_date, "tags": ["neo4j"]}
 
    sns.publish(TopicArn= topic_arn, Message= json.dumps(params))
 
 
def consumer(event, context):
    for record in event["Records"]:
        message = json.loads(record["Sns"]["Message"])
 
        start_date = message["startDate"]
        end_date = message["endDate"]
        tags = message["tags"]
 
        print("start_date: " + start_date)
        print("end_date: " + end_date)
        print("tags: " + str(tags))

Trying it out

We can simulate a message being received locally by executing the following command:

$ serverless invoke local \
    --function message-consumer \
    --data '{"Records":[{"Sns": {"Message":"{\"tags\": [\"neo4j\"], \"startDate\": \"2017-09-25\", \"endDate\": \"2017-09-29\"  }"}}]}'
 
start_date: 2017-09-25
end_date: 2017-09-29
tags: ['neo4j']
null

That seems to work fine. What about if we invoke the message-producer on AWS?

$ serverless invoke --function message-producer
 
null

Did the consumer received the message?

$ serverless logs --function message-consumer
 
START RequestId: 0ef5be87-a5b1-11e7-a905-f1387e68c65f Version: $LATEST
start_date: 2017-09-29
end_date: 2017-09-30
tags: ['neo4j']
END RequestId: 0ef5be87-a5b1-11e7-a905-f1387e68c65f
REPORT RequestId: 0ef5be87-a5b1-11e7-a905-f1387e68c65f	Duration: 0.46 ms	Billed Duration: 100 ms 	Memory Size: 1024 MB	Max Memory Used: 32 MB

Looks like it! We can also invoke the consumer directly on AWS:

$ serverless invoke \
    --function message-consumer \
    --data '{"Records":[{"Sns": {"Message":"{\"tags\": [\"neo4j\"], \"startDate\": \"2017-09-25\", \"endDate\": \"2017-09-26\"  }"}}]}'
 
null

And now if we check the consumer’s logs we’ll see both messages:

$ serverless logs --function message-consumer
 
START RequestId: 0ef5be87-a5b1-11e7-a905-f1387e68c65f Version: $LATEST
start_date: 2017-09-29
end_date: 2017-09-30
tags: ['neo4j']
END RequestId: 0ef5be87-a5b1-11e7-a905-f1387e68c65f
REPORT RequestId: 0ef5be87-a5b1-11e7-a905-f1387e68c65f	Duration: 0.46 ms	Billed Duration: 100 ms 	Memory Size: 1024 MB	Max Memory Used: 32 MB	
 
START RequestId: 4cb42bc9-a5b1-11e7-affb-99fa6b4dc3ed Version: $LATEST
start_date: 2017-09-25
end_date: 2017-09-26
tags: ['neo4j']
END RequestId: 4cb42bc9-a5b1-11e7-affb-99fa6b4dc3ed
REPORT RequestId: 4cb42bc9-a5b1-11e7-affb-99fa6b4dc3ed	Duration: 16.46 ms	Billed Duration: 100 ms 	Memory Size: 1024 MB	Max Memory Used: 32 MB

Success!

Written by Mark Needham

September 30th, 2017 at 7:51 am

Serverless: S3 – S3BucketPermissions – Action does not apply to any resource(s) in statement

without comments

I’ve been playing around with S3 buckets with Serverless, and recently wrote the following code to create an S3 bucket and put a file into that bucket:

const AWS = require("aws-sdk");
 
let regionParams = { 'region': 'us-east-1' }
let s3 = new AWS.S3(regionParams);
 
let s3BucketName = "marks-blog-bucket";
 
console.log("Creating bucket: " + s3BucketName);
let bucketParams = { Bucket: s3BucketName, ACL: "public-read" };
 
s3.createBucket(bucketParams).promise()
  .then(console.log)
  .catch(console.error);
 
var putObjectParams = {
    Body: "<html><body><h1>Hello blog!</h1></body></html>",
    Bucket: s3BucketName,
    Key: "blog.html"
   };
 
s3.putObject(putObjectParams).promise()
  .then(console.log)
  .catch(console.error);

When I tried to cURL the file I got a permission denied exception:

$ curl --head --silent https://s3.amazonaws.com/marks-blog-bucket/blog.html
HTTP/1.1 403 Forbidden
x-amz-request-id: 512FE36798C0BE4D
x-amz-id-2: O1ELGMJ0jjs11WCrNiVNF2z2ssRgtO4+M4H2QQB5025HjIpC54VId0eKZryYeBYN8Pvb8GsolTQ=
Content-Type: application/xml
Transfer-Encoding: chunked
Date: Fri, 29 Sep 2017 05:42:03 GMT
Server: AmazonS3

I wrote the following code to try and have Serverless create a bucket policy that would make all files in my bucket publicly accessible:

serverless.yml

service: marks-blog

frameworkVersion: ">=1.2.0 <2.0.0"

provider:
  name: aws
  runtime: python3.6
  timeout: 180

resources:
  Resources:
    S3BucketPermissions:
      Type: AWS::S3::BucketPolicy
      Properties:
        Bucket: marks-blog-bucket
        PolicyDocument:
          Statement:
            - Principal: "*"
              Action:
                - s3:GetObject
              Effect: Allow
              Sid: "AddPerm"
              Resource: arn:aws:s3:::marks-blog-bucket
 
...

Let’s try to deploy it:

./node_modules/serverless/bin/serverless deploy
Serverless: Packaging service...
Serverless: Excluding development dependencies...
Serverless: Uploading CloudFormation file to S3...
Serverless: Uploading artifacts...
Serverless: Uploading service .zip file to S3 (1.3 KB)...
Serverless: Validating template...
Serverless: Updating Stack...
Serverless: Checking Stack update progress...
........
Serverless: Operation failed!
 
  Serverless Error ---------------------------------------
 
  An error occurred: S3BucketPermissions - Action does not apply to any resource(s) in statement.

D’oh! That didn’t do what I expected.

I learnt that this message means:

Some services do not let you specify actions for individual resources; instead, any actions that you list in the Action or NotAction element apply to all resources in that service. In these cases, you use the wildcard * in the Resource element.

To fix it we need to use the wildcard * to indicate that the s3:GetObject permission should apply to all values in the bucket rather than to the bucket itself.

Take 2!

serverless.yml

service: marks-blog

frameworkVersion: ">=1.2.0 <2.0.0"

provider:
  name: aws
  runtime: python3.6
  timeout: 180

resources:
  Resources:
    S3BucketPermissions:
      Type: AWS::S3::BucketPolicy
      Properties:
        Bucket: marks-blog-bucket
        PolicyDocument:
          Statement:
            - Principal: "*"
              Action:
                - s3:GetObject
              Effect: Allow
              Sid: "AddPerm"
              Resource: arn:aws:s3:::marks-blog-bucket/*
 
...

Let’s deploy again and try to access the file:

$ curl --head --silent https://s3.amazonaws.com/marks-blog-bucket/blog.html
HTTP/1.1 200 OK
x-amz-id-2: uGwsLLoFHf+slXADGYkqW0bLfQ7EPG/kqzV3l2k7SMex4NlMEpNsNN/cIC9INLPohDtVFwUAa90=
x-amz-request-id: 7869E21760CD50F1
Date: Fri, 29 Sep 2017 06:05:11 GMT
Last-Modified: Fri, 29 Sep 2017 06:01:33 GMT
ETag: "57bac87219812c2f9a581943da34cfde"
Accept-Ranges: bytes
Content-Type: application/octet-stream
Content-Length: 46
Server: AmazonS3

Success! And if we check in the AWS console we can see that the bucket policy has been applied to our bucket:

2017 09 29 07 06 13

Written by Mark Needham

September 29th, 2017 at 6:09 am

Posted in Software Development

Tagged with , ,

Serverless: AWS HTTP Gateway – 502 Bad Gateway

without comments

In my continued work with Serverless and AWS Lambda I ran into a problem when trying to call a HTTP gateway.

My project looked like this:

serverless.yaml

service: http-gateway
 
frameworkVersion: ">=1.2.0 <2.0.0"
 
provider:
  name: aws
  runtime: python3.6
  timeout: 180
 
functions:
  no-op:
      name: NoOp
      handler: handler.noop
      events:
        - http: POST noOp

handler.py

def noop(event, context):
    return "hello"

Let’s deploy to AWS:

$ serverless  deploy
Serverless: Packaging service...
Serverless: Excluding development dependencies...
Serverless: Uploading CloudFormation file to S3...
Serverless: Uploading artifacts...
Serverless: Uploading service .zip file to S3 (179 B)...
Serverless: Validating template...
Serverless: Updating Stack...
Serverless: Checking Stack update progress...
..............
Serverless: Stack update finished...
Service Information
service: http-gateway
stage: dev
region: us-east-1
api keys:
  None
endpoints:
  POST - https://29nb5rmmd0.execute-api.us-east-1.amazonaws.com/dev/noOp
functions:
  no-op: http-gateway-dev-no-op

And now we’ll try and call it using cURL:

$ curl -X POST https://29nb5rmmd0.execute-api.us-east-1.amazonaws.com/dev/noOp
{"message": "Internal server error"}

That didn’t work so well, what do the logs have to say?

$ serverless  logs --function no-op	
START RequestId: 64ab69b0-7d8f-11e7-9db5-13b228cd4cb6 Version: $LATEST
END RequestId: 64ab69b0-7d8f-11e7-9db5-13b228cd4cb6
REPORT RequestId: 64ab69b0-7d8f-11e7-9db5-13b228cd4cb6	Duration: 0.27 ms	Billed Duration: 100 ms 	Memory Size: 1024 MB	Max Memory Used: 21 MB

So the function is completely fine. It turns out I’m not very good at reading the manual and should have been returning a map instead of a string:

API Gateway expects to see a json map with keys “body”, “headers”, and “statusCode”.

Let’s update our handler function and re-deploy.

def noop(event, context):
    return {
        "body": "hello",
        "headers": {},
        "statusCode": 200
        }

Now we’re ready to try the endpoint again:

$ curl -X POST https://29nb5rmmd0.execute-api.us-east-1.amazonaws.com/dev/noOp
hello

Much better!

Written by Mark Needham

August 11th, 2017 at 4:01 pm

Serverless: Python – virtualenv – { “errorMessage”: “Unable to import module ‘handler'” }

without comments

I’ve been using the Serverless library to deploy and run some Python functions on AWS lambda recently and was initially confused about how to handle my dependencies.

I tend to create a new virtualenv for each of my project so let’s get that setup first:

Prerequisites

$ npm install serverless
$ virtualenv -p python3 a
$ . a/bin/activate

Now let’s create our Serverless project. I’m going to install the requests library so that I can use it in my function.

My Serverless project

serverless.yaml

service: python-starter-template

frameworkVersion: ">=1.2.0 <2.0.0"

provider:
  name: aws
  runtime: python3.6
  timeout: 180

functions:
  starter-function:
      name: Starter
      handler: handler.starter

handler.py

import requests
 
def starter(event, context):
    print("event:", event, "context:", context)
    r = requests.get("http://www.google.com")
    print(r.status_code)
$ pip install requests

Ok, we’re now ready to try out the function. A nice feature of Serverless is that it lets us try out functions locally before we deploy them onto one of the Cloud providers:

$ ./node_modules/serverless/bin/serverless invoke local --function starter-function
event: {} context: <__main__.FakeLambdaContext object at 0x10bea9a20>
200
null

So far so good. Next we’ll deploy our function to AWS. I’m assuming you’ve already got your credentials setup but if not you can follow the tutorial on the Serverless page.

$ ./node_modules/serverless/bin/serverless deploy
Serverless: Packaging service...
Serverless: Excluding development dependencies...
Serverless: Uploading CloudFormation file to S3...
Serverless: Uploading artifacts...
Serverless: Uploading service .zip file to S3 (26.48 MB)...
Serverless: Validating template...
Serverless: Updating Stack...
Serverless: Checking Stack update progress...
.........
Serverless: Stack update finished...
Service Information
service: python-starter-template
stage: dev
region: us-east-1
api keys:
  None
endpoints:
  None
functions:
  starter-function: python-starter-template-dev-starter-function

Now let’s invoke our function:

$ ./node_modules/serverless/bin/serverless invoke --function starter-function
{
    "errorMessage": "Unable to import module 'handler'"
}
 
  Error --------------------------------------------------
 
  Invoked function failed
 
     For debugging logs, run again after setting the "SLS_DEBUG=*" environment variable.
 
  Get Support --------------------------------------------
     Docs:          docs.serverless.com
     Bugs:          github.com/serverless/serverless/issues
     Forums:        forum.serverless.com
     Chat:          gitter.im/serverless/serverless
 
  Your Environment Information -----------------------------
     OS:                     darwin
     Node Version:           6.7.0
     Serverless Version:     1.19.0

Hmmm, that’s odd – I wonder why it can’t import our handler module? We can call the logs function to check. The logs are usually a few seconds behind so we’ll have to be a bit patient if we don’t see them immediately.

$ ./node_modules/serverless/bin/serverless logs  --function starter-function
START RequestId: 735efa84-7ad0-11e7-a4ef-d5baf0b46552 Version: $LATEST
Unable to import module 'handler': No module named 'requests'
 
END RequestId: 735efa84-7ad0-11e7-a4ef-d5baf0b46552
REPORT RequestId: 735efa84-7ad0-11e7-a4ef-d5baf0b46552	Duration: 0.42 ms	Billed Duration: 100 ms 	Memory Size: 1024 MB	Max Memory Used: 22 MB

That explains it – the requests module wasn’t imported.

If we look in .serverless/python-starter-template.zip

we can see that the requests module is hidden inside the a directory and the instance of Python that runs on Lambda doesn’t know where to find it.

I’m sure there are other ways of solving this but the easiest one I found is a Serverless plugin called serverless-python-requirements.

So how does this plugin work?

A Serverless v1.x plugin to automatically bundle dependencies from requirements.txt and make them available in your PYTHONPATH.

Doesn’t sound too tricky – we can use pip freeze to get our list of requirements and write them into a file. Let’s rework serverless.yaml to make use of the plugin:

My Serverless project using serverless-python-requirements

$ npm install --save serverless-python-requirements
$ pip freeze > requirements.txt
$ cat requirements.txt 
certifi==2017.7.27.1
chardet==3.0.4
idna==2.5
requests==2.18.3
urllib3==1.22

serverless.yaml

service: python-starter-template

frameworkVersion: ">=1.2.0 <2.0.0"

provider:
  name: aws
  runtime: python3.6
  timeout: 180

plugins:
  - serverless-python-requirements

functions:
  starter-function:
      name: Starter
      handler: handler.starter

package:
  exclude:
    - a/** # virtualenv

We have two changes from before:

  • We added the serverless-python-requirements plugin
  • We excluded the a directory since we don’t need it

Let’s deploy again and run the function:

$ ./node_modules/serverless/bin/serverless deploy
Serverless: Parsing Python requirements.txt
Serverless: Installing required Python packages for runtime python3.6...
Serverless: Linking required Python packages...
Serverless: Packaging service...
Serverless: Excluding development dependencies...
Serverless: Unlinking required Python packages...
Serverless: Uploading CloudFormation file to S3...
Serverless: Uploading artifacts...
Serverless: Uploading service .zip file to S3 (14.39 MB)...
Serverless: Validating template...
Serverless: Updating Stack...
Serverless: Checking Stack update progress...
.........
Serverless: Stack update finished...
Service Information
service: python-starter-template
stage: dev
region: us-east-1
api keys:
  None
endpoints:
  None
functions:
  starter-function: python-starter-template-dev-starter-function
$ ./node_modules/serverless/bin/serverless invoke --function starter-function
null

Looks good. Let’s check the logs:

$ ./node_modules/serverless/bin/serverless logs --function starter-function
START RequestId: 61e8eda7-7ad4-11e7-8914-03b8a7793a24 Version: $LATEST
event: {} context: <__main__.LambdaContext object at 0x7f568b105f28>
200
END RequestId: 61e8eda7-7ad4-11e7-8914-03b8a7793a24
REPORT RequestId: 61e8eda7-7ad4-11e7-8914-03b8a7793a24	Duration: 55.55 ms	Billed Duration: 100 ms 	Memory Size: 1024 MB	Max Memory Used: 29 M

All good here as well so we’re done!

Written by Mark Needham

August 6th, 2017 at 7:03 pm

Posted in Software Development

Tagged with ,

AWS Lambda: /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory’

without comments

I’ve been working on an AWS lambda job to convert a HTML page to PDF using a Python wrapper around the wkhtmltopdf library but ended up with the following error when I tried to execute it:

b'/bin/sh: ./binary/wkhtmltopdf: /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory\n': Exception
Traceback (most recent call last):
File "/var/task/handler.py", line 33, in generate_certificate
wkhtmltopdf(local_html_file_name, local_pdf_file_name)
File "/var/task/lib/wkhtmltopdf.py", line 64, in wkhtmltopdf
wkhp.render()
File "/var/task/lib/wkhtmltopdf.py", line 56, in render
raise Exception(stderr)
Exception: b'/bin/sh: ./binary/wkhtmltopdf: /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory\n'

It turns out this is the error you get if you run a 32 bit binary on a 64 bit operating system which is what AWS uses.

If you are using any native binaries in your code, make sure they are compiled in this environment. Note that only 64-bit binaries are supported on AWS Lambda.

I changed to the 64 bit binary and am now happily converting HTML pages to PDF.

Written by Mark Needham

August 3rd, 2017 at 5:24 pm

Posted in Software Development

Tagged with ,

AWS Lambda: Programmatically scheduling a CloudWatchEvent

with 2 comments

I recently wrote a blog post showing how to create a Python ‘Hello World’ AWS lambda function and manually invoke it, but what I really wanted to do was have it run automatically every hour.

To achieve that in AWS Lambda land we need to create a CloudWatch Event. The documentation describes them as follows:

Using simple rules that you can quickly set up, you can match events and route them to one or more target functions or streams.

2017 04 05 23 06 36

This is actually really easy from the Amazon web console as you just need to click the ‘Triggers’ tab and then ‘Add trigger’. It’s not obvious that there are actually three steps are involved as they’re abstracted from you.

So what are the steps?

  1. Create rule
  2. Give permission for that rule to execute
  3. Map the rule to the function

I forgot to do step 2) initially and then you just end up with a rule that never triggers, which isn’t particularly useful.

The following code creates a ‘Hello World’ lambda function and runs it once an hour:

import boto3
 
lambda_client = boto3.client('lambda')
events_client = boto3.client('events')
 
fn_name = "HelloWorld"
fn_role = 'arn:aws:iam::[your-aws-id]:role/lambda_basic_execution'
 
fn_response = lambda_client.create_function(
    FunctionName=fn_name,
    Runtime='python2.7',
    Role=fn_role,
    Handler="{0}.lambda_handler".format(fn_name),
    Code={'ZipFile': open("{0}.zip".format(fn_name), 'rb').read(), },
)
 
fn_arn = fn_response['FunctionArn']
frequency = "rate(1 hour)"
name = "{0}-Trigger".format(fn_name)
 
rule_response = events_client.put_rule(
    Name=name,
    ScheduleExpression=frequency,
    State='ENABLED',
)
 
lambda_client.add_permission(
    FunctionName=fn_name,
    StatementId="{0}-Event".format(name),
    Action='lambda:InvokeFunction',
    Principal='events.amazonaws.com',
    SourceArn=rule_response['RuleArn'],
)
 
events_client.put_targets(
    Rule=name,
    Targets=[
        {
            'Id': "1",
            'Arn': fn_arn,
        },
    ]
)

We can now check if our trigger has been configured correctly:

$ aws events list-rules --query "Rules[?Name=='HelloWorld-Trigger']"
[
    {
        "State": "ENABLED", 
        "ScheduleExpression": "rate(1 hour)", 
        "Name": "HelloWorld-Trigger", 
        "Arn": "arn:aws:events:us-east-1:[your-aws-id]:rule/HelloWorld-Trigger"
    }
]
 
$ aws events list-targets-by-rule --rule HelloWorld-Trigger
{
    "Targets": [
        {
            "Id": "1", 
            "Arn": "arn:aws:lambda:us-east-1:[your-aws-id]:function:HelloWorld"
        }
    ]
}
 
$ aws lambda get-policy --function-name HelloWorld
{
    "Policy": "{\"Version\":\"2012-10-17\",\"Id\":\"default\",\"Statement\":[{\"Sid\":\"HelloWorld-Trigger-Event\",\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"events.amazonaws.com\"},\"Action\":\"lambda:InvokeFunction\",\"Resource\":\"arn:aws:lambda:us-east-1:[your-aws-id]:function:HelloWorld\",\"Condition\":{\"ArnLike\":{\"AWS:SourceArn\":\"arn:aws:events:us-east-1:[your-aws-id]:rule/HelloWorld-Trigger\"}}}]}"
}

All looks good so we’re done!

Written by Mark Needham

April 5th, 2017 at 11:49 pm

Posted in Software Development

Tagged with , ,