Functions for platforms with Kubernetes and Fission.io

Learning about Kubernetes and Fission.io by creating our own workers-for-platforms clone

Back in May, Cloudflare released a blog post announcing their new product: workers for platforms. It is a natural evolution of their existing edge workers product and opens the door to a new kind of developer experience for third-party integrations. We only have to look at Slack and their Slack applications to see how valuable integrations have become. Entire suites of productivity tools are being sold as Slack apps. Yet, one limiting factor of integrations is the need for users to set up their own infrastructure and maintain it. If you want to build a Slack application, Slack doesn't give you a small part of their infrastructure to use, you have to build your own. With this announcement, Cloudflare tries to solve this problem by giving its users the ability to integrate with Cloudflare, through Cloudflare.

Why does this matter?

With products like Slack, it is clear that the investment to build and maintain a Slack application on your own cloud infrastructure is worth it, because their user base makes sure you’ll see the growth needed to justify the costs. Products like GitHub or Discord are in the same category; their integration platforms have been successful regardless of how many resources are needed to get one going. It's exactly why Savoir is a GitHub application and why we are considering creating a Slack application as well.

But what happens if you're a smaller product without that ability to ensure a return on investment? Using Savoir again as an example, we're a new company with a yet-to-be-successful product. For us, it is clear that integrations would be very valuable. What if we could give users ways to react to webhooks that trigger changes in content, and even update that content programmatically? What if you could sync content tracked with Savoir on platforms like GitBook, Readme, or GraphCMS? We know we do not have the resources to compete with these platforms, and it makes a lot more sense to focus on what makes us unique: code-level tracking of your documentation. Clearly, integrations are the way to go.

To build integrations prior to the Cloudflare post, we'd have two options: build the integrations individually ourselves (and hope we build them the way users want), or ask our users to bear the cost of hosting their own custom integration without being able to promise them the growth they need to make their money back. Workers for platforms creates a third option: we give our users a way to create integrations, and we execute them. We can then focus on giving our users the best DX possible and they can focus on building an integration that matches their needs. It also still leaves the door open for our own integrations - we have the perfect opportunity to dogfood our own integration platform.

Functions for platforms

Long story short, we're very excited about the potential of workers for platforms. So excited, in fact, that I decided to try building a prototype clone based only on the information contained in the announcement blog post. With this long-winded introduction behind us, let's now go through the process of building that prototype and learning about Fission.io and Kubernetes in the process.

Workers for platforms are described as isolated and secure environments where users can upload and execute JavaScript functions in a V8 engine. There is no Node.js - the code runs in a pure JavaScript environment, as if it was running in a browser, but without the DOM. These are not serverless functions in the traditional sense, where we answer triggers through something like an Express server. Rather, Cloudflare gives us a set of functions we can use to listen to events, and it executes our function whenever an event we listen to happens. Let's try building that.
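That event-listener model is easy to sketch with a small simulation. To be clear, this is not Cloudflare's actual runtime; the `addEventListener`/`dispatch` names and the event shape below are assumptions made purely for illustration:

```javascript
// A minimal simulation of the event-listener model described above: the
// platform exposes a registration function, user code registers handlers,
// and the platform dispatches events to those handlers as they happen.
const handlers = {};

// The function the platform would expose to user scripts.
function addEventListener(eventName, handler) {
  (handlers[eventName] = handlers[eventName] || []).push(handler);
}

// What the platform does internally when an event occurs.
function dispatch(eventName, event) {
  (handlers[eventName] || []).forEach(handler => handler(event));
}

// A user-uploaded "worker" reacting to a hypothetical fetch event.
addEventListener('fetch', event => {
  event.respondWith(`you requested ${event.url}`);
});

// Simulate the platform receiving a request and dispatching the event.
let response;
dispatch('fetch', { url: '/hello', respondWith: body => { response = body; } });
console.log(response); // prints "you requested /hello"
```

The key inversion is that user code never runs a server of its own; it only registers callbacks, and the platform decides when (and with how many resources) they run.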

I uploaded a working version of this project on GitHub, feel free to follow along with the code there if anything doesn't work as described in this post: github.com/Minivera/functions-for-platforms.

Prerequisites

After doing a lot of research, I ended up settling on the Fission.io framework to support this project. Fission is an open-source serverless framework running on Kubernetes. Think AWS Lambda, but we are in control of every part of the infrastructure. Kubernetes gives us the power to define the environments the containers will be executed in, and any other resources they need. This gives us the control we need to create our very own environment for executing arbitrary JavaScript through the V8 engine. Each function can be isolated as much as we need, and Fission is really great at letting us quickly create multiple environments.

Since Fission is built on Kubernetes, it will take care of a lot of the heavy lifting and allow us to focus on what we want. I'll make sure to explain everything I'm doing, but this post won't go into too much detail about Kubernetes. You will need Node.js installed on your machine. I recommend going with the most recent LTS version (version 16 at the time of writing this article).

To be able to use Fission, we first need to set up a Kubernetes cluster. A cluster is the "cloud" environment where all the resources are created and managed. It's like your very own specialized GCP or AWS running only containers. I'll be using Minikube to manage a cluster locally in this post, as I've found it to be the most compatible with Fission. It is a great tool with lots of utilities, and it runs the entire cluster inside another Docker container, which makes it very easy to clean up. Let's get started with setting up Minikube on our machine.

  1. First, install Docker for your OS. As mentioned previously, Minikube runs the cluster inside a Docker container. Docker and Kubernetes are very complementary tools, so Docker will likely be useful even if you're not using Minikube.
  2. Install kubectl and helm to be able to manipulate a Kubernetes cluster. kubectl is the official Kubernetes CLI tool and helm is a deployment utility for packaging and deploying Kubernetes applications. We will not be using helm directly in this post; it is a dependency of Fission.
  3. Install the fission CLI; it will use kubectl and helm to set up Fission automatically for us.
  4. Finally, install Minikube.

Once everything is installed, start the Minikube cluster using the command minikube start in any terminal. Minikube will download a few Docker images and start the cluster. Once completed, run eval $(minikube -p minikube docker-env) in the same terminal. This tells that terminal session to run any docker command inside the Minikube cluster, allowing us to do things like pushing or pulling images inside the cluster. Without this command, we wouldn't be able to use our custom V8 image locally, as the cluster cannot access our local Docker registry (it runs inside a container). Note that this command only works for the current terminal session; if you close that terminal, you'll have to run it again.

The final step is to install Fission itself on our new cluster. There are a few ways to install Fission, we'll be using helm and installing it on Minikube. Run the command below -- copied from the official docs -- to get Fission installed and ready to start.

export FISSION_NAMESPACE="fission"
kubectl create namespace $FISSION_NAMESPACE
kubectl create -k "github.com/fission/fission/crds/v1?ref=v1.16.0"
helm repo add fission-charts https://fission.github.io/fission-charts/
helm repo update
helm install --version v1.16.0 --namespace $FISSION_NAMESPACE fission \
  --set serviceType=NodePort,routerServiceType=NodePort \
  fission-charts/fission-all

We're now ready to get started!

Uploading functions

One important thing I noted from the Cloudflare blog post is how important speed is to their implementation. It's clear they wanted their integrations to be as fast as any other worker function running on their platform. We won't get into running this project at the edge or avoiding performance loss from Fission, but we do want to do as much as possible to improve performance.

For this reason, we'll be using a network disk rather than something like a CDN for storing the uploaded JavaScript files. Executing these files then only requires direct file system access, which should be much faster than a round trip to a CDN server. We'll be using YAML specification files to manage our infrastructure and applying them with kubectl. While we could use CLI commands alone, I find that specification files are much more expressive and configurable. Looking at the official Kubernetes docs, we find a special kind of resource called a "persistent volume". A persistent volume is like a Docker volume, but the files created in it are persistent rather than ephemeral. With Fission, our containers will be started and stopped constantly, so a persistent volume is a great way to share files between the containers.

Since the only thing we need from Kubernetes is to manage this volume, we'll keep the specification files simple. Create a new directory called kubernetes and then create a file named code-volume.yaml in that directory. Copy this YAML into that file.

# kubernetes/code-volume.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: code-volume
  namespace: fission-function
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 5Gi
  hostPath:
    path: /data/code-volume/
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: code-volume-claim
  namespace: fission-function
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi

This YAML specification file defines the volume itself - called code-volume - directly in the fission-function namespace, where Fission runs the function containers. In short, namespaces allow us to isolate parts of the cluster for easier management, and Fission uses them a lot. We want our volume to be as close as possible to where the functions will be executed (since we'll have to connect this disk to the containers used by the functions), which is why we create it directly in that namespace. It's a very small disk at 5 gigabytes, but that's enough for testing things out.

The second element created is a persistent volume claim named code-volume-claim. Individual containers use this claim to request access to the persistent volume, and it allows us to define the base permissions and access. A persistent volume, in Kubernetes, is a resource in the cluster. A persistent volume claim consumes that resource and defines its access. In our case, we're telling Kubernetes to give us access to 3 gigabytes out of the 5 available in read-write mode, and that only one node can read or write at a time through the ReadWriteOnce access mode. In a real-world situation, these constraints would likely lead to access locks and prevent concurrent access. This is fine for a prototype, but we'd have to manage access properly if we were to deploy this in production.

Let's create these resources now. Run the command kubectl apply -f ./kubernetes/code-volume.yaml in your terminal. This will tell kubectl to take the specification file we just created and apply its content on our cluster. If we ever change this file, running the same command will update the cluster by applying any changed properties without recreating everything. Pretty useful.

Specification files like these are very useful and make the commands much easier to run, since we don't have to hope for the best with command arguments. Fission also supports specification files; any Fission CLI command can be appended with --spec to create a specification file in the specs directory. We can then run fission spec apply --wait to apply the specification files on the cluster like we would with kubectl.

For the rest of this blog post, we'll be using Fission specification files over command lines, as it will make things a lot easier for us. Let's start by creating the spec folder itself. Run the command fission spec init to initialize that folder; Fission will add a few files in there. We can now start creating the environment and the function for uploading scripts. Create an env-nodejs.yaml file in this directory and copy this YAML into that new file.

# specs/env-nodejs.yaml
apiVersion: fission.io/v1
kind: Environment
metadata:
  creationTimestamp: null
  name: nodejs
  namespace: default
spec:
  builder:
    command: build
    container:
      name: ""
      resources: {}
    image: fission/node-builder
  imagepullsecret: ""
  keeparchive: false
  poolsize: 3
  resources: {}
  runtime:
    image: fission/node-env
    podspec:
      containers:
        - name: nodejs
          image: fission/node-env:latest
          volumeMounts:
            - name: code-volume
              mountPath: /etc/code
      volumes:
        - name: code-volume
          persistentVolumeClaim:
            claimName: code-volume-claim
  version: 2

This YAML defines a Node.js environment. In Fission, environments are the definition for the containers where each function will be executed. The containers run in a unit called a pod; Fission will create an arbitrary number of pods and scale them based on demand. You can think of pods as docker-compose files that run a few containers with shared resources. All pods run on a node, which is like a virtual machine with a bunch of docker-compose files (the pods) running on it. Fission takes care of creating the pods and balances requests across all the pods automatically.

This file was created by running the command fission environment create --name nodejs --image fission/node-env --spec, with a few modifications. We give the name nodejs to our environment in the metadata, then tell Fission to use the official fission/node-builder image to build the function and the official fission/node-env runtime image to execute it. Finally, we define a custom podspec object where we mount our persistent volume as a volume on the pod, meaning each container will have a directory named /etc/code where it can access the content of the persistent volume. podspec is a very powerful tool that gives us the ability to configure the containers in the pod. We could, for example, add a second container running redis if we ever needed an ephemeral redis database.

We'll be using Node.js to upload code to our persistent volume. Fission supports complete NPM projects with modules, but our code only needs the base Node.js modules, so we'll keep things simple by creating a single script. Create a src directory and add a function-upload.js file in that directory, copy the following code in there.

// src/function-upload.js
const { promises } = require('fs');
const path = require('path');

// Path to the persistent volume
const diskPath = `/etc/code`;

// Function to hash a string into a short string.
const hashContent = (content) => {
    let hash = 0;
    if (content.length === 0) {
        return hash;
    }

    let chr;
    for (let i = 0; i < content.length; i++) {
        chr = content.charCodeAt(i);
        hash = (hash << 5) - hash + chr;
        hash |= 0;
    }

    return hash;
};

// Fission will execute the function we export in the Node.js environment. 
// Content contains things like the body and headers
module.exports = async function (context) {
    // This function expects a raw body (no JSON) for uploading a JavaScript script
    const fileContent = context.request.body;

    console.log(`Received function "${fileContent}", hashing.`);
    // Create a file name based on the file content. We hash that content so 
    // the same content will always have the same name.
    const fileHash = hashContent(fileContent);

    console.log(`Writing file ${fileHash}.js to the persistent volume`);
    // Write the file content onto the persistent volume.
    await promises.writeFile(path.join(diskPath, `${fileHash}.js`), fileContent);

    return {
        status: 200,
        body: {
            message: 'function successfully uploaded',
            // We return the filename so we can execute it later
            id: fileHash,
        },
    };
}

This function defines the base API code for uploading a script. We receive that script as the raw string body of a POST call, then hash that content to generate a unique file name. Finally, we create a file on the persistent volume to host that file and return the hashed name for executing that function in our V8 function.
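Because the file name is derived from a hash of the content, the same script always maps to the same id; re-uploading identical code simply overwrites the same file on the volume. A quick sketch of that property, reusing the same hashing logic as function-upload.js:

```javascript
// The same hashing logic as in function-upload.js: a simple 32-bit string hash.
const hashContent = (content) => {
  let hash = 0;
  for (let i = 0; i < content.length; i++) {
    hash = (hash << 5) - hash + content.charCodeAt(i);
    hash |= 0; // keep the value within 32-bit integer range
  }
  return hash;
};

const script = 'console.log("Hello, World!")';

// Identical content always produces the same id, so uploading the same
// script twice writes the same file rather than creating a duplicate.
console.log(hashContent(script) === hashContent(script)); // prints "true"

// Different content will (almost always) produce a different id.
console.log(hashContent(script), hashContent('console.log(42)'));
```

Note that this is not a cryptographic hash: collisions are unlikely but possible, so a production version would want something like SHA-256 instead.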

Run fission function create --name function-upload --env nodejs --code src/function-upload.js --spec to create the function spec followed by fission httptrigger create --url /upload --method POST --name upload-js --function function-upload --spec to create the HTTP trigger spec. Run fission spec apply --wait to apply the newly created spec files onto the cluster.

In Fission, a function is the definition for executing code in an environment. In this case, we tell Fission to run our code for uploading scripts inside the Node.js environment. A trigger is what causes a function to execute. There are multiple trigger types in Fission, but since this is an API, we'll be using HTTP triggers. This tells Fission to run the function whenever an HTTP call is sent to the URL specified, /upload in our case.
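For reference, the HTTP trigger spec generated by the command above looks roughly like the following. The exact fields are produced by Fission; treat this as an approximation and check the file actually generated in the specs directory:

```yaml
# specs/route-upload-js.yaml (approximate; Fission generates the real file)
apiVersion: fission.io/v1
kind: HTTPTrigger
metadata:
  name: upload-js
  namespace: default
spec:
  functionref:
    name: function-upload
    type: name
  method: POST
  relativeurl: /upload
```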

Feel free to test the function execution with curl. To do so, export the Fission router URL (the entry point for calling HTTP triggers) as an environment variable in your terminal with this command:

export FISSION_ROUTER=$(minikube ip):$(kubectl -n fission get svc router -o jsonpath='{...nodePort}')

In the same terminal, try running this command to send a POST request to our upload function. This should execute and return a JSON payload with the file id.

curl -XPOST "http://$FISSION_ROUTER/upload" -H "Content-Type: text/plain" -d 'console.log("Hello, World!")'

On macOS or Windows, you may need a load balancer. Check the official Minikube docs to set it up. The repository for the project has the Kubernetes specification file ready in the kubernetes folder if needed.

Executing functions

Now that we have everything ready to upload code, we need a way to execute that code in a secure V8 environment. The Cloudflare blog post is clear that V8 was the key to unlocking a secure and isolated environment, so we'll be following in their footsteps. This means we can't use the default Node.js environment from Fission, we'll have to build our own. Thankfully, Fission has a binary environment we can partially reuse for this.

To build our own environment, we need to create a container image to use as the runtime. Fission also has a concept of builders, images that build environments based on the code given (For example, the Node.js environment will install dependencies defined in a package.json file). We only need to define our own runtime, we can reuse the binary builder since we'll be using a bash file as our function script. Let's start from the base binary image and change the Dockerfile to add V8 in that image.

Download the content of this folder in the official Fission environment repository: github.com/fission/environments/tree/master.. and copy the two .go files and the Dockerfile into a newly created image directory. Open the Dockerfile and replace its content with this code:

# image/Dockerfile
# First stage, to copy v8 into the cache. V8 is built for Debian
FROM andreburgaud/d8 as v8

RUN ls /v8

# Second stage, build the Fission server for Debian
FROM golang:buster as build

WORKDIR /binary
COPY *.go /binary/

RUN go mod init github.com/fission/environments/binary
RUN go mod tidy

RUN go build -o server .

# Third stage, copy everything into a slim Debian image
FROM debian:stable-slim

RUN mkdir /v8

COPY --from=v8 /v8/* /v8

WORKDIR /app

RUN apt-get update -y && \
    apt-get install coreutils binutils findutils grep -y && \
    apt-get clean

COPY --from=build /binary/server /app/server

EXPOSE 8888
ENTRYPOINT ["./server"]

This Dockerfile has three stages. First, we download and test an image from the Docker registry called andreburgaud/d8. This image has V8 prebuilt for Debian (you might need to build V8 yourself on macOS), saving us the few hours it takes to build. In the second stage, we copy the official .go files from the Fission binary environment and build them for Debian. These Go files create a server that takes in commands from the triggers and executes a binary file in response; that's how Fission supports any binary. Finally, the third stage puts everything together by setting up the container as defined in the original binary Dockerfile and copying V8 to a specific directory so it's available.

Run the command docker build --no-cache --tag=functions/v8-env . in the same terminal where you previously ran eval $(minikube -p minikube docker-env). This will build the image and tag it under functions/v8-env inside the Minikube cluster, so Fission can access it.

Time to create our environment! Go to the specs directory again and create an env-v8.yaml file. Copy this YAML into it.

# specs/env-v8.yaml
apiVersion: fission.io/v1
kind: Environment
metadata:
  creationTimestamp: null
  name: v8
  namespace: default
spec:
  builder:
    command: build
    container:
      name: ""
      resources: {}
    image: fission/binary-builder:latest
  imagepullsecret: ""
  keeparchive: false
  poolsize: 3
  resources: {}
  runtime:
    image: functions/v8-env:latest
    podspec:
      containers:
        - name: v8
          image: functions/v8-env:latest
          volumeMounts:
            - name: code-volume
              mountPath: /etc/code
              readOnly: true
      volumes:
        - name: code-volume
          persistentVolumeClaim:
            claimName: code-volume-claim
  version: 2

This environment is very similar to the Node.js environment, except we use the fission/binary-builder official image from Fission to build the container and our custom functions/v8-env for the container runtime. Like with the Node.js environment, we also connect the persistent volume, but in readOnly mode this time. We don't want our users to be able to write things to the volume from their own scripts.

Next, go to the src directory and create a function.sh file. Copy this code into that new file.

# src/function.sh
#!/bin/sh

file_id="$(/bin/cat -)"

printf "executing /etc/code/%s.js with /v8/d8\n\n" "$file_id"
printf "output is: \n"

 # This will not print errors but instead causes the process to crash, for now
/v8/d8 "/etc/code/$file_id.js"

When the Go server from the official Fission binary environment runs a script or binary, it provides the request body on the standard input stream (readable with /bin/cat -). To avoid having to parse JSON or headers, we'll take the file name returned by the upload function as the raw body and execute the corresponding JS file from the persistent volume directly in V8.
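That stdin handoff is easy to try outside the cluster. This sketch mimics what happens when the Go server pipes a request body into the script, with an echo standing in for the real d8 invocation:

```shell
#!/bin/sh
# Simulate the binary environment: the request body arrives on stdin and
# the script captures it with `cat -`, exactly like function.sh does.
run_function() {
  file_id="$(/bin/cat -)"
  echo "would execute /etc/code/${file_id}.js"
}

# The Go server pipes the raw request body into the script.
printf '12345' | run_function
```

Running it prints `would execute /etc/code/12345.js`, confirming that whatever we POST as the body becomes the file id the script resolves on the volume.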

Let's create the function and trigger now. Run fission function create --name run-js --env v8 --code src/function.sh --spec followed by fission httptrigger create --url /execute --method POST --name run-isolated --function run-js --spec to create the two spec files. Run fission spec apply --wait to apply the newly created spec files onto the cluster.

We now have a function that can load a JS script uploaded through our Node.js function and execute it in an isolated and controllable V8 environment. We can control how many resources the function has through the environment specification file, but also how much time it is allowed to run. We have total control over how much power we give our users thanks to Fission and Kubernetes.
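As a concrete example, resource caps can be set through the resources field that currently sits empty in the environment spec, since it accepts the standard Kubernetes resource requirements. The values below are illustrative assumptions, not recommendations:

```yaml
# In specs/env-v8.yaml, replace `resources: {}` with explicit limits, e.g.:
spec:
  resources:
    requests:
      cpu: 100m
      memory: 64Mi
    limits:
      cpu: 500m
      memory: 128Mi
```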

Testing the functions

The final stage is to test what we just built! If you haven't done so already, export the Fission router URL as an environment variable in your terminal with this command.

export FISSION_ROUTER=$(minikube ip):$(kubectl -n fission get svc router -o jsonpath='{...nodePort}')

In the same terminal where you exported the router URL, run this command to send a POST request to the upload function. It should return a JSON payload with the file id under the id property.

curl -XPOST "http://$FISSION_ROUTER/upload" -H "Content-Type: text/plain" -d 'console.log("Hello, World!")'

Copy the id and run curl -XPOST -k -d "<ID>" "http://$FISSION_ROUTER/execute", replacing <ID> with the value you copied. You should see the words Hello, World! appear in your terminal. That means it worked!

What happened here, exactly? When we sent the first request, the Fission router routed it to our upload function running in a Node.js container, which created the file in our persistent volume. The second request was then routed to our execution function running in our custom V8 containers, which loaded that same file from the volume based on its id and ran it, printing the result to the standard output. The Fission binary environment is set up in such a way that any output is sent back as the result of the HTTP call.

Feel free to test this with more complex code, the code should print whatever you ask it to log. The next step for this prototype would be to provide environment variables and functions to our users so they can react to events triggered by our system, but that will be for another day!

Where to go from here?

In this post, we built a prototype for cloning the workers for platforms product from Cloudflare. Our implementation shows some promise, but it is also very limited and flawed. Due to the selection of frameworks and technologies, and also to the nature of this project which is based entirely on a single blog post, this prototype has a few flaws worth talking about.

First, anyone can technically access anyone else's scripts. A user could write a script that loops through all the files in the persistent volume and prints the code of each one, potentially leaking any secrets saved directly in there (even without Node.js' fs module). We'll have to make sure each function can only see its own script and nothing else.

The next issue is access. At the moment, if we were to deploy this to the cloud, anyone could reach the two endpoints and do pretty much anything they want. The first step towards deploying this is to make sure these two endpoints are secured and only accessible to other internal services. We'd have to create a separate service to route requests to our Fission cluster, or something similar, to abstract the implementation.

Finally, there is the issue of performance. This prototype isn't configured to run at the edge, but it also has some performance issues that would need to be improved to satisfy the requirements outlined in the Cloudflare blog post. The serverless nature of this project means we'll have to deal with cold starts and limited resources in our kubernetes clusters.

These optimizations are far beyond the scope of this first post, but maybe we can continue exploring in a future post! In any case, I hope you enjoyed this long post and I'm very much looking forward to seeing where the community takes workers-for-platforms.

Please check out the repository where I uploaded a working version of the prototype here: github.com/Minivera/functions-for-platforms. Contributions are welcome.


I'd love to hear your thoughts - please comment, share and follow.

We are building up Savoir, so keep an eye out for features and updates on our website at savoir.dev. If you'd like to subscribe for updates or beta testing, send me a message at info@savoir.dev!

Savoir is the French word for knowledge, pronounced sɑvwɑɹ.