Functions for platforms with Kubernetes and Fission.io
Learning about Kubernetes and Fission.io by creating our own workers-for-platforms clone
Back in May, Cloudflare released a blog post announcing their new product: workers for platforms. It is a natural evolution of their existing edge workers product and opens the door to a new kind of developer experience for third-party integrations. We only have to look at Slack and their Slack applications to see how valuable integrations have become. Entire suites of productivity tools are being sold as Slack apps. Yet one limiting factor of integrations is the need for users to set up and maintain their own infrastructure. If you want to build a Slack application, Slack doesn't give you a small part of their infrastructure to use; you have to build your own. With this announcement, Cloudflare tries to solve this problem by giving its users the ability to integrate with Cloudflare, through Cloudflare.
Why does this matter?
With products like Slack, it is clear that the investment to build and maintain a Slack application on your own cloud infrastructure is worth it, because their user base makes sure you’ll see the growth needed to justify the costs. Products like GitHub or Discord are in the same category; their integration platforms have been successful regardless of how many resources are needed to get one going. It's exactly why Savoir is a GitHub application and why we are considering creating a Slack application as well.
But what happens if you're a smaller product without that ability to ensure a return on investment? Using Savoir again as an example, we're a new company with a yet-to-be-successful product. For us, it is clear that integrations would be very valuable. What if we could give users ways to react to webhooks triggered by changes in content, and even update that content programmatically? What if you could sync content tracked with Savoir on platforms like GitBook, Readme, or GraphCMS? We know we do not have the resources to compete with these platforms, and it makes a lot more sense to focus on what makes us unique: code-level tracking of your documentation. Clearly, integrations are the way to go.
To build integrations prior to the Cloudflare post, we'd have two options: build the integrations individually ourselves (and hope we build them the way users want), or ask our users to bear the cost of hosting their own custom integration without being able to promise them the growth they need to make their money back. Workers for platforms creates a third choice: we give our users a way to create integrations, and we execute them. We can then focus on giving our users the best DX possible and they can focus on building an integration that matches their needs. It also still leaves the door open for our own integrations - we have the perfect opportunity to dogfood our own integration platform.
Functions for platforms
Long story short, we're very excited about the potential of workers for platforms. So excited, in fact, that I decided to try building a prototype clone based only on the information contained in the announcement blog post. With this long-winded introduction behind us, let's now go through the process of building that prototype and learn about Fission.io and Kubernetes along the way.
Workers for platforms are described as isolated and secure JavaScript environments where users can upload and execute JavaScript functions in a V8 environment. There is no Node.js - it runs in a pure JavaScript environment as if it were running in a browser, but without the DOM. This is not a serverless function in the sense of a more traditional serverless environment (commonly implemented as express servers), where we have to answer triggers. Rather, Cloudflare gives us a set of functions we can use to listen to events, and they'll execute the function whenever an event we listen to happens. Let's try building that.
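To make that listener model a bit more concrete, here is a minimal sketch of an event registry in plain JavaScript. The names on and emit, as well as the content.updated event, are hypothetical; this is not Cloudflare's API, only an illustration of a platform calling user code when an event fires.

```javascript
// Sketch only: a tiny event registry. `on` and `emit` are hypothetical names.
const handlers = new Map();

// Called by user code to subscribe to an event.
const on = (eventName, handler) => {
  const list = handlers.get(eventName) || [];
  list.push(handler);
  handlers.set(eventName, list);
};

// Called by the platform when an event occurs; returns each handler's result.
const emit = (eventName, payload) =>
  (handlers.get(eventName) || []).map((handler) => handler(payload));

// User code: react to a hypothetical "content.updated" event.
on('content.updated', (payload) => `updated ${payload.path}`);

console.log(emit('content.updated', { path: 'docs/intro.md' }));
```

The user only declares what they want to react to; deciding when to call emit stays entirely on the platform side.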
I uploaded a working version of this project on GitHub; feel free to follow along with the code there if anything doesn't work as described in this post: github.com/Minivera/functions-for-platforms.
Prerequisites
After doing a lot of research, I ended up settling on the Fission.io framework to support this project. Fission is an open-source serverless framework running on Kubernetes. Think AWS Lambda, but we are in control of every part of the infrastructure. Kubernetes gives us the power to define the environments the containers will be executed in, and any other resources they need. This gives us the control we need to be able to create our very own environment for executing arbitrary JavaScript through the V8 engine. Each function can be isolated as much as we need, and Fission is really great at giving us the ability to quickly create multiple environments.
Since Fission is built on Kubernetes, it will take care of a lot of the heavy lifting and allow us to focus on what we want. I'll make sure to explain everything I'm doing, but this post won't go into too much detail about Kubernetes. You will need Node.js installed on your machine. I recommend going with the most recent LTS version (version 16 at the time of writing this article).
To be able to use Fission, we first need to set up a Kubernetes cluster. A cluster is the "cloud" environment where all the resources are created and managed. It's like your very own specialized GCP or AWS running only containers. I'll be using Minikube to manage a cluster locally in this post, as I've found it to be the most compatible with Fission. It is a great tool with lots of utilities, and it runs the entire cluster inside of a Docker container, which makes it very easy to clean up. Let's get started with setting up Minikube on our machine.
- First, install Docker, based on your OS. As said previously, Minikube runs the cluster inside of a Docker container. Docker and Kubernetes are very complementary tools, so Docker will likely be useful even if you're not using Minikube.
- Install kubectl and helm to be able to manipulate a Kubernetes cluster. kubectl is the official Kubernetes CLI tool and helm is a utility deployment tool for creating and deploying Kubernetes applications. We will only use helm in this post to install Fission, since it is a dependency of Fission.
- Install the fission CLI; it will use kubectl and helm to set up Fission automatically for us.
- Finally, install Minikube.
Once everything is installed, start the Minikube cluster using the command minikube start in any terminal. Minikube will download a few Docker images and start the cluster. Once completed, run eval $(minikube -p minikube docker-env) in the same terminal. This tells that terminal session to run any docker command inside the Minikube cluster, allowing us to do things like pushing or pulling images inside of the cluster. Without this command, we wouldn't be able to use our custom V8 image locally, as the cluster cannot access our local Docker registry (it runs inside a container). Note that this command only works for the current terminal session; if you close that terminal, you'll have to run it again.
The final step is to install Fission itself on our new cluster. There are a few ways to install Fission; we'll be using helm and installing it on Minikube. Run the command below, copied from the official docs, to get Fission installed and ready to start.
export FISSION_NAMESPACE="fission"
kubectl create namespace $FISSION_NAMESPACE
kubectl create -k "github.com/fission/fission/crds/v1?ref=v1.16.0"
helm repo add fission-charts https://fission.github.io/fission-charts/
helm repo update
helm install --version v1.16.0 --namespace $FISSION_NAMESPACE fission \
--set serviceType=NodePort,routerServiceType=NodePort \
fission-charts/fission-all
We're now ready to get started!
Uploading functions
One important thing I noted from the Cloudflare blog post is how important speed is to their implementation. It's clear they wanted their integrations to be as fast as any other worker function running on their platform. We won't get into running this project at the edge or avoiding performance loss from Fission, but we do want to do as much as possible to improve performance.
For this reason, we'll be using a network disk over something like a CDN for uploading the JavaScript files. Executing these files will only require direct file system access, which should be much faster than having to do the round trip to some CDN server. We'll be using YAML specification files to manage our infrastructure and applying them with kubectl. While we could use only CLI commands, I find that specification files are much more expressive and configurable. Looking at the official Kubernetes docs, we find a special kind of resource called a "persistent volume". A persistent volume is like a Docker volume, but the files created in that volume are persistent rather than ephemeral. With Fission, our containers will be started and stopped constantly, so this persistent volume is a great way to share files between the containers.
Since the only thing we need from Kubernetes is to manage this volume, we'll keep the specification files simple. Create a new directory called kubernetes and then create a file named code-volume.yaml in that directory. Copy this YAML into that file.
# kubernetes/code-volume.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: code-volume
  namespace: fission-function
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 5Gi
  hostPath:
    path: /data/code-volume/
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: code-volume-claim
  namespace: fission-function
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
This YAML specification file defines the volume itself, called code-volume, and targets the Kubernetes namespace Fission uses for its function containers. In short, namespaces allow us to isolate parts of the cluster for easier management, and Fission uses them a lot. We want our volume to be as close as possible to where the functions will be executed (since we'll have to connect this disk to the containers used by the function), which is why we reference that namespace. Note that persistent volumes themselves are cluster-scoped in Kubernetes; it is the claim that actually lives in the namespace. It's a very small disk at 5 gigabytes, but that's enough for testing things out.
The second element created is a persistent volume claim named code-volume-claim. Individual containers use this claim to request access to the persistent volume, and it allows us to define the base permissions and access. A persistent volume, in Kubernetes, is a resource in the cluster. A persistent volume claim consumes that resource and defines its access. In our case, we're telling Kubernetes to give us access to 3 gigabytes out of the 5 available in read-write mode, and that only one node can read or write at a time through the ReadWriteOnce access mode. In a real-world situation, these constraints would likely lead to access locks and prevent concurrent access. This is fine for a prototype, but we'd have to manage access properly if we were to deploy this in production.
Let's create these resources now. Run the command kubectl apply -f ./kubernetes/code-volume.yaml in your terminal. This will tell kubectl to take the specification file we just created and apply its content on our cluster. If we ever change this file, running the same command will update the cluster by applying any changed properties without recreating everything. Pretty useful.
Specification files like these are very useful and make the commands much easier to run, since we don't have to hope for the best with command arguments. Fission also supports specification files; any Fission CLI command can be appended with --spec to create a specification file in the specs directory. We can then run fission spec apply --wait to apply the specification files on the cluster like we would with kubectl.
For the rest of this blog post, we'll be using Fission specification files over command lines, as it will make things a lot easier for us. Let's start by creating the specs folder itself. Run the command fission spec init to initialize that folder; Fission will add a few files in there. We can now start creating the environment and the function for uploading scripts. Create an env-nodejs.yaml file in this directory and copy this YAML into that new file.
# specs/env-nodejs.yaml
apiVersion: fission.io/v1
kind: Environment
metadata:
  creationTimestamp: null
  name: nodejs
  namespace: default
spec:
  builder:
    command: build
    container:
      name: ""
      resources: {}
    image: fission/node-builder
  imagepullsecret: ""
  keeparchive: false
  poolsize: 3
  resources: {}
  runtime:
    image: fission/node-env
    podspec:
      containers:
        - name: nodejs
          image: fission/node-env:latest
          volumeMounts:
            - name: code-volume
              mountPath: /etc/code
      volumes:
        - name: code-volume
          persistentVolumeClaim:
            claimName: code-volume-claim
  version: 2
This YAML defines a Node.js environment. In Fission, environments are the definition for the containers where each function will be executed. The containers run in a unit called a pod; Fission will create an arbitrary number of pods and scale them based on demand. You can think of pods as docker-compose files that run a few containers with shared resources. All pods run on a node, which is like a virtual machine with a bunch of docker-compose files (the pods) running on it. Fission will take care of creating the pods and will balance requests across all the pods automatically.
This file was created by running the command fission environment create --name nodejs --image fission/node-env --spec, with a few modifications. We're giving the name nodejs to our environment in line 6, then we tell Fission to use the official Node.js builder image in line 14 to build the container and the official Node.js runtime image for the function in line 20. Finally, we define a custom podspec object in line 21 where we mount our persistent volume as a Docker volume, meaning each container will now have a directory named /etc/code where it can access the content of the persistent volume. podspec is a very powerful tool that gives us the ability to configure the containers in the pod. We could, for example, add a second container running redis if we ever needed an ephemeral redis database.
We'll be using Node.js to upload code to our persistent volume. Fission supports complete NPM projects with modules, but our code only needs the base Node.js modules, so we'll keep things simple by creating a single script. Create a src directory and add a function-upload.js file in that directory, then copy the following code in there.
// src/function-upload.js
const { promises } = require('fs');
const path = require('path');

// Path to the persistent volume
const diskPath = `/etc/code`;

// Function to hash a string into a short (possibly negative) 32-bit integer.
const hashContent = (content) => {
  let hash = 0;
  if (content.length === 0) {
    return hash;
  }
  let chr;
  for (let i = 0; i < content.length; i++) {
    chr = content.charCodeAt(i);
    hash = (hash << 5) - hash + chr;
    // Force the value back into a 32-bit integer
    hash |= 0;
  }
  return hash;
};

// Fission will execute the function we export in the Node.js environment.
// Context contains things like the request body and headers
module.exports = async function (context) {
  // This function expects a raw body (no JSON) for uploading a JavaScript script
  const fileContent = context.request.body;
  console.log(`Received function "${fileContent}", hashing.`);

  // Create a file name based on the file content. We hash that content so
  // the same content will always have the same name.
  const fileHash = hashContent(fileContent);
  console.log(`Writing file ${fileHash}.js to the persistent volume`);

  // Write the file content onto the persistent volume.
  await promises.writeFile(path.join(diskPath, `${fileHash}.js`), fileContent);

  return {
    status: 200,
    body: {
      message: 'function successfully uploaded',
      // We return the filename so we can execute it later
      id: fileHash,
    },
  };
}
This function defines the base API code for uploading a script. We receive that script as the raw string body of a POST call, then hash that content to generate a file name that is stable for a given script. Finally, we create a file on the persistent volume to hold the script and return the hashed name so it can be executed later by our V8 function.
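The 32-bit hash above is short, but two different scripts can end up with the same id, and the id can be negative. As a hedged alternative (not what the prototype in this post uses), here is a minimal sketch relying on Node's built-in crypto module. The name hashContentSha256 is hypothetical.

```javascript
// Sketch only: an alternative to hashContent using Node's built-in crypto module.
const { createHash } = require('crypto');

// Hypothetical helper: returns a 64-character hexadecimal SHA-256 digest.
const hashContentSha256 = (content) =>
  createHash('sha256').update(content, 'utf8').digest('hex');

// The same content always produces the same file name.
const id = hashContentSha256('console.log("Hello, World!")');
console.log(`${id}.js`);
```

The trade-off is a much longer file id to copy around, in exchange for ids where accidental collisions are no longer a practical concern.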
Run fission function create --name function-upload --env nodejs --code src/function-upload.js --spec to create the function spec, followed by fission httptrigger create --url /upload --method POST --name upload-js --function function-upload --spec to create the HTTP trigger spec. Run fission spec apply --wait to apply the newly created spec files onto the cluster.
In Fission, a function is the definition for executing code in an environment. In this case, we tell Fission to run our code for uploading scripts inside the Node.js environment. A trigger is what causes a function to execute. There are multiple trigger types in Fission, but since this is an API, we'll be using HTTP triggers. This tells Fission to run the function whenever an HTTP call is sent to the URL specified, /upload in our case.
Feel free to test the function execution with curl. To do so, export the Fission router URL (the entrypoint to call HTTP triggers) as an environment variable in your terminal with this command:
export FISSION_ROUTER=$(minikube ip):$(kubectl -n fission get svc router -o jsonpath='{...nodePort}')
In the same terminal, try running this command to send a POST request to our upload function. This should execute and return a JSON payload with the file id.
curl -XPOST "http://$FISSION_ROUTER/upload" -H "Content-Type: text/plain" -d 'console.log("Hello, World!")'
On macOS or Windows, you may need a load balancer. Check the official docs from Minikube to set it up. The repository for the project has the Kubernetes specification file ready in the kubernetes folder if needed.
Executing functions
Now that we have everything ready to upload code, we need a way to execute that code in a secure V8 environment. The Cloudflare blog post is clear that V8 was the key to unlocking a secure and isolated environment, so we'll be following in their footsteps. This means we can't use the default Node.js environment from Fission; we'll have to build our own. Thankfully, Fission has a binary environment we can partially reuse for this.
To build our own environment, we need to create a container image to use as the runtime. Fission also has a concept of builders, images that build environments based on the code given (for example, the Node.js environment will install dependencies defined in a package.json file). We only need to define our own runtime; we can reuse the binary builder since we'll be using a shell script as our function. Let's start from the base binary image and change the Dockerfile to add V8 in that image.
Download the content of this folder in the official Fission environment repository: github.com/fission/environments/tree/master.. and copy the two .go files and the Dockerfile into a newly created image directory. Open the Dockerfile and replace its content with this code:
# image/Dockerfile
# First stage, to copy v8 into the cache. V8 is built for Debian
FROM andreburgaud/d8 as v8
RUN ls /v8

# Second stage, build the Fission server for Debian
FROM golang:buster as build
WORKDIR /binary
COPY *.go /binary/
RUN go mod init github.com/fission/environments/binary
RUN go mod tidy
RUN go build -o server .

# Third stage, copy everything into a slim Debian image
FROM debian:stable-slim
RUN mkdir /v8
# The trailing slash marks /v8/ as a directory, which COPY expects with multiple sources
COPY --from=v8 /v8/* /v8/
WORKDIR /app
RUN apt-get update -y && \
    apt-get install coreutils binutils findutils grep -y && \
    apt-get clean
COPY --from=build /binary/server /app/server
EXPOSE 8888
ENTRYPOINT ["./server"]
This Dockerfile has three stages. First, we download and test an image found on the Docker registry called andreburgaud/d8. This image has V8 prebuilt for Debian (you might need to build V8 on macOS), so we can save the few hours it takes to build it. In the second stage, we copy the official .go files from the Fission binary environment and build them for Debian. These Go files create a server that takes in commands from the triggers and executes a binary file in response; that's how Fission supports any binary. Finally, the third stage puts everything together by setting up the container as defined in the original binary Dockerfile and copying V8 to a specific directory so it's available.
From the image directory, run the command docker build --no-cache --tag=functions/v8-env . in the same terminal where you previously ran eval $(minikube -p minikube docker-env). This will build the image and tag it under functions/v8-env inside the Minikube cluster, so Fission can access it.
Time to create our environment! Go to the specs directory again and create an env-v8.yaml file. Copy this YAML into it.
# specs/env-v8.yaml
apiVersion: fission.io/v1
kind: Environment
metadata:
  creationTimestamp: null
  name: v8
  namespace: default
spec:
  builder:
    command: build
    container:
      name: ""
      resources: {}
    image: fission/binary-builder:latest
  imagepullsecret: ""
  keeparchive: false
  poolsize: 3
  resources: {}
  runtime:
    image: functions/v8-env:latest
    podspec:
      containers:
        - name: v8
          image: functions/v8-env:latest
          volumeMounts:
            - name: code-volume
              mountPath: /etc/code
              readOnly: true
      volumes:
        - name: code-volume
          persistentVolumeClaim:
            claimName: code-volume-claim
  version: 2
This environment is very similar to the Node.js environment, except we use the official fission/binary-builder image from Fission to build the container and our custom functions/v8-env image for the container runtime. Like with the Node.js environment, we also connect the persistent volume, but in readOnly mode this time. We don't want our users to be able to write things to the volume from their own scripts.
Next, go to the src directory and create a function.sh file. Copy this code into that new file.
#!/bin/sh
# src/function.sh
# The request body (the file id) arrives on standard input
file_id="$(/bin/cat -)"
printf "executing /etc/code/%s.js with /v8/d8\n\n" "$file_id"
printf "output is: \n"
# This will not print errors but instead causes the process to crash, for now
/v8/d8 "/etc/code/$file_id.js"
When the Go server from the official Fission binary environment runs a script or binary, it provides the request body on the standard input stream (which we read with /bin/cat -). To avoid having to parse JSON or headers, we'll take the file name from the previous upload function as the raw body and execute the JS file from the persistent volume in V8 directly.
Let's create the function and trigger now. Run fission function create --name run-js --env v8 --code src/function.sh --spec followed by fission httptrigger create --url /execute --method POST --name run-isolated --function run-js --spec to create the two spec files. Run fission spec apply --wait to apply the newly created spec files onto the cluster.
We now have a function that can load a JS script uploaded through our Node.js function and execute it in an isolated and controllable V8 environment. We can control how many resources the function has through the environment specification file, but also how much time it is allowed to run. We have total control over how much power we give our users thanks to Fission and Kubernetes.
Testing the functions
The final stage is to test what we just built! If you haven't done so already, export the Fission router URL as an environment variable in your terminal with this command.
export FISSION_ROUTER=$(minikube ip):$(kubectl -n fission get svc router -o jsonpath='{...nodePort}')
In the same terminal where you exported the router URL, run this command to send a POST request to the upload function. It should return a JSON payload with the file id under the id property.
curl -XPOST "http://$FISSION_ROUTER/upload" -H "Content-Type: text/plain" -d 'console.log("Hello, World!")'
Copy the ID and run curl -XPOST -k -d "<ID>" "http://$FISSION_ROUTER/execute", replacing <ID> with it. You should see the words Hello, World! appear in your terminal. That means it worked!
What happened here exactly? When you sent the first request, the Fission router forwarded it to our upload function running in a Node.js container, which then created that file in our persistent volume. The second request was routed to our execution function running in our custom V8 container, which loaded the same file from the volume based on its file id and ran it, printing the result to the standard output. The Fission binary env is set up in such a way that any output is sent back as the result of the HTTP call.
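The same two-step flow can also be scripted instead of typed by hand. Below is a minimal sketch using only Node's built-in http module. The helper names post and uploadAndRun are hypothetical, and it assumes the upload endpoint answers with a JSON body containing an id property, as described above.

```javascript
// Sketch only: upload a script, then execute it, using Node's built-in http module.
const http = require('http');

// Hypothetical helper: POST a plain-text body and resolve with the response body.
const post = (baseUrl, route, body) =>
  new Promise((resolve, reject) => {
    const req = http.request(
      new URL(route, baseUrl),
      {
        method: 'POST',
        headers: {
          'Content-Type': 'text/plain',
          'Content-Length': Buffer.byteLength(body),
        },
      },
      (res) => {
        let data = '';
        res.setEncoding('utf8');
        res.on('data', (chunk) => { data += chunk; });
        res.on('end', () => resolve(data));
      }
    );
    req.on('error', reject);
    req.end(body);
  });

// Hypothetical helper chaining the /upload and /execute endpoints from this post.
const uploadAndRun = async (baseUrl, source) => {
  const { id } = JSON.parse(await post(baseUrl, '/upload', source));
  return post(baseUrl, '/execute', String(id));
};

// Usage, assuming FISSION_ROUTER was exported in the terminal beforehand.
if (process.env.FISSION_ROUTER) {
  uploadAndRun(`http://${process.env.FISSION_ROUTER}`, 'console.log("Hello, World!")')
    .then(console.log)
    .catch(console.error);
}
```

Nothing here is specific to Fission beyond the two routes, so the same helper could sit behind whatever service ends up fronting the cluster.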
Feel free to test this with more complex code; it should print whatever you ask it to log. The next step for this prototype would be to provide environment variables and functions to our users so they can react to events triggered by our system, but that will be for another day!
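As one example of a slightly larger script to try, here is a sketch that sticks to plain JavaScript, since Node.js modules such as fs or http are not available in the V8 environment. The file name script.js mentioned below is hypothetical.

```javascript
// Sketch only: a slightly larger script to upload, using plain JavaScript
// (no Node.js modules, since they are not part of V8 itself).
const fibonacci = (n) => {
  let [previous, current] = [0, 1];
  for (let i = 0; i < n; i++) {
    [previous, current] = [current, previous + current];
  }
  return previous;
};

const results = [];
for (let i = 0; i < 10; i++) {
  results.push(fibonacci(i));
}

console.log(`First ten Fibonacci numbers: ${results.join(', ')}`);
```

If you save this as script.js, uploading it with curl's --data-binary @script.js option instead of -d should keep the line breaks intact, since -d strips newlines when reading from a file.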
Where to go from here?
In this post, we built a prototype for cloning the workers for platforms product from Cloudflare. Our implementation shows some promise, but it is also very limited and flawed. Due to the selection of frameworks and technologies, and also to the nature of this project which is based entirely on a single blog post, this prototype has a few flaws worth talking about.
First, anyone can technically access anyone else's scripts. A user could potentially write a script that loops through all the files in the persistent volume and prints the code of each file, potentially leaking any secrets saved directly in there (even without Node.js' fs module). We'll have to make sure each function can only see its script and nothing else.
The next issue is access. At the moment, anyone can access the two endpoints and do pretty much anything they want if we were to deploy it to the cloud. The first step towards deploying this is to make sure these two endpoints are only accessible to other internal services and secured. We'd have to create a separate service to route requests to our Fission cluster or something similar to abstract the implementation.
Finally, there is the issue of performance. This prototype isn't configured to run at the edge, but it also has some performance issues that would need to be improved to satisfy the requirements outlined in the Cloudflare blog post. The serverless nature of this project means we'll have to deal with cold starts and limited resources in our Kubernetes clusters.
These optimizations are far beyond the scope of this first post, but maybe we can continue exploring in a future post! In any case, I hope you enjoyed this long post and I'm very much looking forward to seeing where the community takes workers-for-platforms.
Please check out the repository where I uploaded a working version of the prototype here: github.com/Minivera/functions-for-platforms. Contributions are welcome.
I'd love to hear your thoughts - please comment, share and follow.
We are building up Savoir, so keep an eye out for features and updates on our website at savoir.dev. If you'd like to subscribe for updates or beta testing, send me a message at info@savoir.dev!
Savoir is the French word for Knowledge, pronounced sɑvwɑɹ.