Building OCI images with Go. No run command yet
In the previous post we learned what OCI images are and how they are made. In this post we will see how to automate the image creation process with Go.
Code is located here: https://github.com/pkorzh/container-build-tool/tree/v0
Code stability: it works on my machine.
Side note: at this point I’ve read Effective Go and Learning Go, and skimmed through & borrowed ideas from the containers/buildah and containers/image repos.
The tool I’m building is called cbt, short for Container Build Tool. It has three top level commands:
- from to initialize a new working directory.
- config to configure an image.
- build to build an image.
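Before diving into each command, here is a minimal sketch of how three such subcommands could be wired up with spf13/cobra. This assumes a cobra-style CLI; the actual repo may structure its command handling differently, and the RunE bodies here are just placeholders.

// Minimal sketch of a cobra-based CLI with from/config/build subcommands.
package main

import (
	"fmt"
	"os"

	"github.com/spf13/cobra"
)

func main() {
	root := &cobra.Command{Use: "cbt", Short: "Container Build Tool"}

	root.AddCommand(&cobra.Command{
		Use:   "from <image-reference>",
		Short: "initialize a new working directory from a base image",
		Args:  cobra.ExactArgs(1),
		RunE: func(cmd *cobra.Command, args []string) error {
			fmt.Println("from:", args[0]) // real implementation unpacks the base image
			return nil
		},
	})

	root.AddCommand(&cobra.Command{
		Use:   "config <workdir>",
		Short: "configure the image",
		Args:  cobra.ExactArgs(1),
		RunE: func(cmd *cobra.Command, args []string) error {
			fmt.Println("config:", args[0]) // real implementation updates builder.json
			return nil
		},
	})

	root.AddCommand(&cobra.Command{
		Use:   "build <workdir> <image-reference>",
		Short: "build an OCI image from the working directory",
		Args:  cobra.ExactArgs(2),
		RunE: func(cmd *cobra.Command, args []string) error {
			fmt.Println("build:", args[0], "->", args[1]) // real implementation writes the image
			return nil
		},
	})

	if err := root.Execute(); err != nil {
		os.Exit(1)
	}
}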
cbt from
The from command initializes a new working directory inside ~/.cbt. It expects a base image reference as its only argument. An image reference is a string in the following format: <source>:<file_or_repo>. At the moment of writing, source can be either oci-archive or oci-layout, so you pass either a tar archive or a path to a folder.
# image reference examples
oci-archive:nginx.tar
oci-archive:alpine.tar:latest
oci-layout:/Users/bob/nginx/:nginx:latest
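To make the format concrete, here is a hypothetical helper that splits such a reference into its source and path parts. The imageRef type and parseImageRef function are illustrative names, not the actual cbt code.

// A hypothetical parser for the <source>:<file_or_repo> reference format.
package main

import (
	"fmt"
	"strings"
)

type imageRef struct {
	Source string // "oci-archive" or "oci-layout"
	Path   string // tar file or layout directory, optionally with a tag suffix
}

func parseImageRef(ref string) (imageRef, error) {
	source, rest, ok := strings.Cut(ref, ":")
	if !ok || rest == "" {
		return imageRef{}, fmt.Errorf("invalid image reference %q", ref)
	}
	switch source {
	case "oci-archive", "oci-layout":
		return imageRef{Source: source, Path: rest}, nil
	default:
		return imageRef{}, fmt.Errorf("unsupported source %q", source)
	}
}

func main() {
	ref, err := parseImageRef("oci-layout:/var/tmp/nginx")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", ref)
}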
To get an OCI image we can use the skopeo tool and run its copy command:
# this will copy the nginx:latest image into /var/tmp/nginx dir in OCI format
skopeo copy docker://nginx oci:/var/tmp/nginx
Now we can run cbt from oci-layout:/var/tmp/nginx to create a working directory for our future image. The working dir is named after the base image.
tree ~/.cbt/nginx
.
├── builder.json
└── layers
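Conceptually, from has to create exactly this structure. Below is a rough sketch of that setup, assuming the ~/.cbt location described above; the initWorkDir helper is illustrative, and the real implementation also unpacks and inspects the base image before writing builder.json.

// Sketch: create ~/.cbt/<name>/layers and an empty builder.json placeholder.
package main

import (
	"os"
	"path/filepath"
)

func initWorkDir(name string) (string, error) {
	home, err := os.UserHomeDir()
	if err != nil {
		return "", err
	}
	workDir := filepath.Join(home, ".cbt", name)
	if err := os.MkdirAll(filepath.Join(workDir, "layers"), 0o755); err != nil {
		return "", err
	}
	// Placeholder config; `cbt from` fills this with data from the base image.
	if err := os.WriteFile(filepath.Join(workDir, "builder.json"), []byte("{}\n"), 0o644); err != nil {
		return "", err
	}
	return workDir, nil
}

func main() {
	if _, err := initWorkDir("nginx"); err != nil {
		panic(err)
	}
}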
A working directory contains user-created layers (inside the layers/ dir) and the image configuration (inside the builder.json file). To create a new layer, create an arbitrarily named folder inside layers/ and put content in it.
For instance, to create a layer with dependencies, create a directory layers/deps/ and copy over the deps. Alternatively, you could install packages directly into the layer’s folder:
pip install -r requirements.txt -t layers/deps/
# or
npm install --prefix layers/deps/
Of course, this only works if the host platform matches the platform the image will run on. This is a limitation of the current version, which does not support the run command yet.
Let’s see how to create an app layer in action:
# this will copy the nginx:latest image into /var/tmp/nginx dir in OCI format
skopeo copy docker://nginx oci:/var/tmp/nginx
# create a new layer named app
APP_LAYER_DIR=~/.cbt/nginx/layers/app
mkdir -p $APP_LAYER_DIR
# add content to app layer
mkdir -p $APP_LAYER_DIR/usr/share/nginx/html
echo '<h1>hello world</h1>' >> $APP_LAYER_DIR/usr/share/nginx/html/index.html
cbt config
The config command is used to configure an image. The only required argument is the working directory name. To change the configuration, pass any of the following flags:
- --arch string - Architecture (default runtime.GOARCH)
- --cmd string - Command
- --entrypoint string - Entrypoint
- --os string - OS (default runtime.GOOS)
- --ports strings - Ports
- --user string - User
- --workingdir string - Working directory
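These flags map directly onto fields of the OCI image config. The snippet below is an illustrative sketch of that mapping using the image-spec Go types (v1.Image and v1.ImageConfig); the values are example inputs, not what cbt sets by default.

// Sketch: how the config flags correspond to OCI image config fields.
package main

import (
	"encoding/json"
	"os"
	"runtime"

	v1 "github.com/opencontainers/image-spec/specs-go/v1"
)

func main() {
	img := v1.Image{}
	img.Architecture = runtime.GOARCH                           // --arch (default runtime.GOARCH)
	img.OS = runtime.GOOS                                       // --os   (default runtime.GOOS)
	img.Config.User = "nginx"                                   // --user
	img.Config.WorkingDir = "/app"                              // --workingdir
	img.Config.Entrypoint = []string{"/docker-entrypoint.sh"}   // --entrypoint
	img.Config.Cmd = []string{"nginx", "-g", "daemon off;"}     // --cmd
	img.Config.ExposedPorts = map[string]struct{}{"80/tcp": {}} // --ports

	// cbt persists this (together with the manifest) in builder.json.
	_ = json.NewEncoder(os.Stdout).Encode(img)
}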
After running cbt from oci-layout:/var/tmp/nginx, a builder.json file is created inside ~/.cbt/nginx/. This file contains all the configuration parameters. The default configuration is copied from the base image.
# ~/.cbt/nginx/builder.json
{
"fromImage": "oci-layout:/var/tmp/nginx",
"workDirId": "nginx",
"ociImage": {
"created": "2024-01-06T13:49:02.322067Z",
"architecture": "arm64",
"os": "darwin",
"config": {
"ExposedPorts": { "80/tcp": {} },
"Env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"NGINX_VERSION=1.25.3",
"NJS_VERSION=0.8.2",
"PKG_RELEASE=1~bookworm"
],
"Entrypoint": ["/docker-entrypoint.sh"],
"Cmd": ["nginx", "-g", "daemon off;"],
"Labels": {
"maintainer": "NGINX Docker Maintainers \u003cdocker-maint@nginx.com\u003e"
},
"StopSignal": "SIGQUIT"
},
"rootfs": { "type": "layers", "diff_ids": null }
},
"ociManifest": {
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"config": { "digest": "", "size": 0 },
"layers": null
}
}
You might notice that the content of ociImage and ociManifest looks familiar. That’s because we’ve already seen it in the How-to build OCI Image by hands post. The ociImage and ociManifest properties inside builder.json are serialized representations of the Image and Manifest structs defined in the OCI image spec repo, which, besides the specification itself, also contains Go types, intra-blob validation tooling, and a JSON Schema.
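This means the builder state can be modeled with those types directly. The struct below is a reconstruction of the builder.json shape shown above using the image-spec Go types; the field tags follow the JSON, but it is not copied from the repo.

// Reconstruction of the builder.json shape with image-spec types.
package main

import (
	"encoding/json"
	"os"

	v1 "github.com/opencontainers/image-spec/specs-go/v1"
)

type builder struct {
	FromImage   string      `json:"fromImage"`
	WorkDirID   string      `json:"workDirId"`
	OCIImage    v1.Image    `json:"ociImage"`
	OCIManifest v1.Manifest `json:"ociManifest"`
}

func main() {
	// Load the builder state the way cbt might read it back from disk.
	data, err := os.ReadFile(os.ExpandEnv("$HOME/.cbt/nginx/builder.json"))
	if err != nil {
		panic(err)
	}
	var b builder
	if err := json.Unmarshal(data, &b); err != nil {
		panic(err)
	}
	_ = json.NewEncoder(os.Stdout).Encode(b.OCIImage.Config)
}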
cbt build
The build command turns the layers we’ve created and the configuration we’ve specified into an OCI image.
cbt build nginx oci-archive:nginx.tar.gz --layers app
It expects two arguments: the work dir name and the target image reference.
After running the above command, cbt will store the resulting image in the nginx.tar.gz file in the current directory. This image will include:
- all the layers from the base image, and
- the user-created layers, in the order specified in the --layers flag.
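For each user layer, build has to do a bit of bookkeeping: the digest and size of the compressed blob go into the manifest’s layer list, and the digest of the uncompressed tar (the DiffID) goes into the config’s rootfs. A sketch of that step, with placeholder digest values standing in for the ones computed from the actual tarballs:

// Sketch: per-layer bookkeeping when appending a layer to an image.
package main

import (
	"fmt"

	"github.com/opencontainers/go-digest"
	v1 "github.com/opencontainers/image-spec/specs-go/v1"
)

func appendLayer(img *v1.Image, m *v1.Manifest, blobDigest, diffID digest.Digest, size int64) {
	m.Layers = append(m.Layers, v1.Descriptor{
		MediaType: v1.MediaTypeImageLayerGzip, // "application/vnd.oci.image.layer.v1.tar+gzip"
		Digest:    blobDigest,
		Size:      size,
	})
	img.RootFS.DiffIDs = append(img.RootFS.DiffIDs, diffID)
}

func main() {
	var (
		img v1.Image
		m   v1.Manifest
	)
	img.RootFS.Type = "layers"

	// Placeholder digest values; the real ones are computed from the layer tarball.
	appendLayer(&img, &m,
		"sha256:661ff4d9561e3fd050929ee5097067c34bafc523ee60f5294a37fd08056a73ca",
		"sha256:0deab21fa3beed803ec512c91e2a49182a7d76d91495591d73e41f479a426756",
		3408480)

	fmt.Println(len(m.Layers), "layer(s),", len(img.RootFS.DiffIDs), "diff id(s)")
}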
The full sequence of commands looks like this:
cbt from oci-layout:/var/tmp/nginx
# configure image
cbt config nginx --os linux --arch amd64
# create app layer
APP_LAYER_DIR=~/.cbt/nginx/layers/app
mkdir -p $APP_LAYER_DIR
# add content to app layer
mkdir -p $APP_LAYER_DIR/usr/share/nginx/html
echo '<h1>hello world</h1>' >> $APP_LAYER_DIR/usr/share/nginx/html/index.html
cbt build nginx oci-archive:nginx.tar.gz --layers app
Additionally, we can upload our image to Docker Hub:
mkdir nginx-oci-layout
tar -xvzf nginx.tar.gz -C ./nginx-oci-layout
skopeo copy oci:nginx-oci-layout docker://platonkorzh/nginx:test
To test it, run docker run -it -p 8080:80 platonkorzh/nginx:test and open http://localhost:8080.
Implementation notes
Since cbt does not support the run command yet, it’s more of an archival tool than a “build tool”: most of our effort is spent reading and writing tar archives.
Tar archives
All the images are packed into tar archives. Tar is an archival, or “container”, format, meaning it does not provide compression. People use additional tools like bzip2 or gzip to compress it.
Logically speaking, a tar file is a linear sequence of entries. An entry consists of a header and a file body. The header is a structure that contains metadata about the file:
- file name,
- permission and mode bits,
- size,
- owner’s user id and group id,
- checksum, file type, etc.
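In Go this maps onto the standard library’s archive/tar package almost one to one. Here is a small example that writes a single entry (header plus body) through compress/gzip, since tar itself does not compress:

// Write one tar entry (header + body) into a gzip-compressed archive.
package main

import (
	"archive/tar"
	"compress/gzip"
	"os"
)

func main() {
	out, err := os.Create("layer.tar.gz")
	if err != nil {
		panic(err)
	}
	defer out.Close()

	gz := gzip.NewWriter(out)
	tw := tar.NewWriter(gz)

	body := []byte("<h1>hello world</h1>\n")
	hdr := &tar.Header{
		Name: "usr/share/nginx/html/index.html", // file name
		Mode: 0o644,                             // permission and mode bits
		Size: int64(len(body)),                  // size
		Uid:  0,                                 // owner's user id
		Gid:  0,                                 // owner's group id
	}
	if err := tw.WriteHeader(hdr); err != nil {
		panic(err)
	}
	if _, err := tw.Write(body); err != nil {
		panic(err)
	}

	// Close in order: tar footer first, then the gzip stream.
	if err := tw.Close(); err != nil {
		panic(err)
	}
	if err := gz.Close(); err != nil {
		panic(err)
	}
}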
We already know that the OCI manifest contains a list of layers. Each layer has a digest over the compressed tar archive. In addition to that, the OCI image config has a DiffID property, which is another digest over the uncompressed tar archive. To help us compute digests there is the common go-digest package.
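Here is how both digests can be computed with go-digest, assuming a layer.tar.gz like the one produced in the previous sketch: the layer digest is taken over the compressed bytes, the DiffID over the decompressed stream.

// Compute the layer digest (compressed) and DiffID (uncompressed) of a blob.
package main

import (
	"compress/gzip"
	"fmt"
	"io"
	"os"

	"github.com/opencontainers/go-digest"
)

func main() {
	f, err := os.Open("layer.tar.gz")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	// Digest over the compressed bytes (goes into the manifest's layers list).
	layerDigest, err := digest.Canonical.FromReader(f)
	if err != nil {
		panic(err)
	}

	// Rewind and digest the uncompressed stream (goes into rootfs.diff_ids).
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		panic(err)
	}
	gz, err := gzip.NewReader(f)
	if err != nil {
		panic(err)
	}
	diffID, err := digest.Canonical.FromReader(gz)
	if err != nil {
		panic(err)
	}

	fmt.Println("layer digest:", layerDigest)
	fmt.Println("diff id:     ", diffID)
}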
Reading and writing
cbt works with unpacked images; tar archives are unpacked into temporary directories. As a result, we always end up with a directory containing the OCI image, for example:
├── blobs
│ └── sha256
│ ├── 0deab21fa3beed803ec512c91e2a49182a7d76d91495591d73e41f479a426756
│ ├── 661ff4d9561e3fd050929ee5097067c34bafc523ee60f5294a37fd08056a73ca
│ └── 7ffe15c5a686c84c102aa6079430511a2d68c63c512f8b1de9ca09dbe8c39da5
├── index.json
└── oci-layout
From here on it’s mostly a lot of I/O operations. To get the list of manifests we read index.json.
# index.json
{
"schemaVersion": 2,
"manifests": [
{
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"digest": "sha256:0deab21fa3beed803ec512c91e2a49182a7d76d91495591d73e41f479a426756",
"size": 405,
"annotations": {
"org.opencontainers.image.ref.name": "alpine:latest"
}
}
]
}
To get a specific manifest we take its digest, e.g. sha256:0deab21fa3beed803ec512c91e2a49182a7d76d91495591d73e41f479a426756, and read it from the blobs dir. Remember that blobs are named after their digest, so technically the digest is a file name.
# "blobs/sha256/0deab21fa3beed803ec512c91e2a49182a7d76d91495591d73e41f479a426756"
{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"config": {
"mediaType": "application/vnd.oci.image.config.v1+json",
"digest": "sha256:7ffe15c5a686c84c102aa6079430511a2d68c63c512f8b1de9ca09dbe8c39da5",
"size": 585
},
"layers": [
{
"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
"digest": "sha256:661ff4d9561e3fd050929ee5097067c34bafc523ee60f5294a37fd08056a73ca",
"size": 3408480
}
]
}
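Putting the two steps together, here is a sketch of walking an unpacked layout with the image-spec types: read index.json, pick a manifest descriptor, and load the manifest blob from blobs/<algorithm>/<hex>. The /var/tmp/nginx path is the layout produced by the skopeo copy command earlier.

// Walk an unpacked OCI layout: index.json -> manifest descriptor -> manifest blob.
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"

	v1 "github.com/opencontainers/image-spec/specs-go/v1"
)

func readJSON(path string, out any) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	return json.Unmarshal(data, out)
}

func main() {
	layout := "/var/tmp/nginx" // produced by `skopeo copy ... oci:/var/tmp/nginx`

	var index v1.Index
	if err := readJSON(filepath.Join(layout, "index.json"), &index); err != nil {
		panic(err)
	}

	desc := index.Manifests[0]
	// Blobs are named after their digest: blobs/<algorithm>/<hex>.
	blobPath := filepath.Join(layout, "blobs", desc.Digest.Algorithm().String(), desc.Digest.Encoded())

	var manifest v1.Manifest
	if err := readJSON(blobPath, &manifest); err != nil {
		panic(err)
	}

	fmt.Println("config blob:", manifest.Config.Digest)
	for _, l := range manifest.Layers {
		fmt.Println("layer:", l.Digest, l.Size)
	}
}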
internal/ vs pkg/
I decided to organize all my packages into an internal directory. Firstly, I’m not sure if anyone would like to use my packages. Secondly, I don’t want to make any promises regarding API stability.
io.TeeReader
Another pleasant discovery I made in Go is io.TeeReader.
// TeeReader returns a Reader that writes to w what it reads from r.
// All reads from r performed through it are matched with
// corresponding writes to w. There is no internal buffering -
// the write must complete before the read completes.
// Any error encountered while writing is reported as a read error.
func TeeReader(r Reader, w Writer) Reader {
return &teeReader{r, w}
}
I use it to calculate a layer’s digest while the layer gets io.Copy‘ed to the filesystem.
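Something along these lines, combining io.TeeReader with a go-digest Digester so the hash is fed while the copy happens (a minimal sketch, not the exact cbt code):

// Copy a blob to disk while computing its digest in a single pass.
package main

import (
	"fmt"
	"io"
	"os"
	"strings"

	"github.com/opencontainers/go-digest"
)

func copyWithDigest(dst io.Writer, src io.Reader) (digest.Digest, error) {
	digester := digest.Canonical.Digester()
	// Everything io.Copy reads from src is also written into the digester's hash.
	if _, err := io.Copy(dst, io.TeeReader(src, digester.Hash())); err != nil {
		return "", err
	}
	return digester.Digest(), nil
}

func main() {
	out, err := os.Create("blob")
	if err != nil {
		panic(err)
	}
	defer out.Close()

	d, err := copyWithDigest(out, strings.NewReader("example layer content"))
	if err != nil {
		panic(err)
	}
	fmt.Println(d)
}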
This post is part of a series.
- Part 1: Container build tool
- Part 2: How-to build OCI Image by hands
- Part 3: Building OCI images with Go. No run command yet
- Part 4: How to Tar/Untar container layers in Go
- Part 5: Linux kernel namespaces
- Part 6: Mini container runtime in Go
- Part 7: Union mount