Building OCI images with Go. No run command yet
In the previous post we learned what OCI images are and how they are made. In this post we will see how to automate the image creation process with Go.
Code is located here: https://github.com/pkorzh/container-build-tool/tree/v0
Code stability: it works on my machine.
Side note: at this point I’ve read Effective Go and Learning Go, and skimmed through & borrowed ideas from the containers/buildah and containers/image repos.
The tool I’m building is called cbt, short for Container Build Tool. It has three top level commands:
- from to initialize a new working directory.
- config to configure an image.
- build to build an image.
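Before diving into each command, here is a minimal sketch of how three such subcommands could be wired up with spf13/cobra. This assumes a cobra-style CLI; the actual repo may structure its command handling differently, and the RunE bodies here are just placeholders.

// Minimal sketch of a cobra-based CLI with from/config/build subcommands.
package main

import (
	"fmt"
	"os"

	"github.com/spf13/cobra"
)

func main() {
	root := &cobra.Command{Use: "cbt", Short: "Container Build Tool"}

	root.AddCommand(&cobra.Command{
		Use:   "from <image-reference>",
		Short: "initialize a new working directory from a base image",
		Args:  cobra.ExactArgs(1),
		RunE: func(cmd *cobra.Command, args []string) error {
			fmt.Println("from:", args[0]) // real implementation unpacks the base image
			return nil
		},
	})

	root.AddCommand(&cobra.Command{
		Use:   "config <workdir>",
		Short: "configure the image",
		Args:  cobra.ExactArgs(1),
		RunE: func(cmd *cobra.Command, args []string) error {
			fmt.Println("config:", args[0]) // real implementation updates builder.json
			return nil
		},
	})

	root.AddCommand(&cobra.Command{
		Use:   "build <workdir> <image-reference>",
		Short: "build an OCI image from the working directory",
		Args:  cobra.ExactArgs(2),
		RunE: func(cmd *cobra.Command, args []string) error {
			fmt.Println("build:", args[0], "->", args[1]) // real implementation writes the image
			return nil
		},
	})

	if err := root.Execute(); err != nil {
		os.Exit(1)
	}
}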
cbt from
The from command initializes a new working directory inside ~/.cbt. It expects a base image reference as its only argument. An image reference is a string in the following format: <source>:<file_or_repo>. At the moment of writing, source can be either oci-archive or oci-layout, so you pass either a tar archive or a path to a folder.
# image reference examples
oci-archive:nginx.tar
oci-archive:alpine.tar:latest
oci-layout:/Users/bob/nginx/:nginx:latest
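To make the format concrete, here is a hypothetical helper that splits such a reference into its source and path parts. The imageRef type and parseImageRef function are illustrative names, not the actual cbt code.

// A hypothetical parser for the <source>:<file_or_repo> reference format.
package main

import (
	"fmt"
	"strings"
)

type imageRef struct {
	Source string // "oci-archive" or "oci-layout"
	Path   string // tar file or layout directory, optionally with a tag suffix
}

func parseImageRef(ref string) (imageRef, error) {
	source, rest, ok := strings.Cut(ref, ":")
	if !ok || rest == "" {
		return imageRef{}, fmt.Errorf("invalid image reference %q", ref)
	}
	switch source {
	case "oci-archive", "oci-layout":
		return imageRef{Source: source, Path: rest}, nil
	default:
		return imageRef{}, fmt.Errorf("unsupported source %q", source)
	}
}

func main() {
	ref, err := parseImageRef("oci-layout:/var/tmp/nginx")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", ref)
}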
To get an OCI image we can use the skopeo tool and run its copy command:
# this will copy the nginx:latest image into /var/tmp/nginx dir in OCI format
skopeo copy docker://nginx oci:/var/tmp/nginx
Now we can run cbt from oci-layout:/var/tmp/nginx to create a working directory for our future image. The working dir is named after the base image.
tree ~/.cbt/nginx
.
├── builder.json
└── layers
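Conceptually, from has to create exactly this structure. Below is a rough sketch of that setup, assuming the ~/.cbt location described above; the initWorkDir helper is illustrative, and the real implementation also unpacks and inspects the base image before writing builder.json.

// Sketch: create ~/.cbt/<name>/layers and an empty builder.json placeholder.
package main

import (
	"os"
	"path/filepath"
)

func initWorkDir(name string) (string, error) {
	home, err := os.UserHomeDir()
	if err != nil {
		return "", err
	}
	workDir := filepath.Join(home, ".cbt", name)
	if err := os.MkdirAll(filepath.Join(workDir, "layers"), 0o755); err != nil {
		return "", err
	}
	// Placeholder config; `cbt from` fills this with data from the base image.
	if err := os.WriteFile(filepath.Join(workDir, "builder.json"), []byte("{}\n"), 0o644); err != nil {
		return "", err
	}
	return workDir, nil
}

func main() {
	if _, err := initWorkDir("nginx"); err != nil {
		panic(err)
	}
}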
A working directory contains user-created layers (inside the layers/ dir) and the image configuration (inside the builder.json file). To create a new layer, create an arbitrarily named folder inside layers/ and put content in it.
For instance, to create a layer with dependencies, create a directory layers/deps/ and copy over the deps. Alternatively, you could install packages directly into the layer’s folder:
pip install -r requirements.txt -t layers/deps/
# or
npm install --prefix layers/deps/
Of course, this only works if the host platform matches the platform the image will run on. This is a limitation of the current version, which does not support the run command yet.
Let’s see how to create an app layer in action:
# this will copy the nginx:latest image into /var/tmp/nginx dir in OCI format
skopeo copy docker://nginx oci:/var/tmp/nginx
# create a new layer named app
APP_LAYER_DIR=~/.cbt/nginx/layers/app
mkdir -p $APP_LAYER_DIR
# add content to app layer
mkdir -p $APP_LAYER_DIR/usr/share/nginx/html
echo '<h1>hello world</h1>' >> $APP_LAYER_DIR/usr/share/nginx/html/index.html
cbt config
The config command is used to configure an image. The only required argument is the working directory name. To change the configuration, pass any of the following flags:
- --arch string - Architecture (default runtime.GOARCH)
- --cmd string - Command
- --entrypoint string - Entrypoint
- --os string - OS (default runtime.GOOS)
- --ports strings - Ports
- --user string - User
- --workingdir string - Working directory
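These flags map directly onto fields of the OCI image config. The snippet below is an illustrative sketch of that mapping using the image-spec Go types (v1.Image and v1.ImageConfig); the values are example inputs, not what cbt sets by default.

// Sketch: how the config flags correspond to OCI image config fields.
package main

import (
	"encoding/json"
	"os"
	"runtime"

	v1 "github.com/opencontainers/image-spec/specs-go/v1"
)

func main() {
	img := v1.Image{}
	img.Architecture = runtime.GOARCH                           // --arch (default runtime.GOARCH)
	img.OS = runtime.GOOS                                       // --os   (default runtime.GOOS)
	img.Config.User = "nginx"                                   // --user
	img.Config.WorkingDir = "/app"                              // --workingdir
	img.Config.Entrypoint = []string{"/docker-entrypoint.sh"}   // --entrypoint
	img.Config.Cmd = []string{"nginx", "-g", "daemon off;"}     // --cmd
	img.Config.ExposedPorts = map[string]struct{}{"80/tcp": {}} // --ports

	// cbt persists this (together with the manifest) in builder.json.
	_ = json.NewEncoder(os.Stdout).Encode(img)
}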
After running cbt from oci-layout:/var/tmp/nginx, a builder.json file is created inside ~/.cbt/nginx/. This file contains all the configuration parameters. The default configuration is copied from the base image.
# ~/.cbt/nginx/builder.json
{
"fromImage": "oci-layout:/var/tmp/nginx",
"workDirId": "nginx",
"ociImage": {
"created": "2024-01-06T13:49:02.322067Z",
"architecture": "arm64",
"os": "darwin",
"config": {
"ExposedPorts": { "80/tcp": {} },
"Env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"NGINX_VERSION=1.25.3",
"NJS_VERSION=0.8.2",
"PKG_RELEASE=1~bookworm"
],
"Entrypoint": ["/docker-entrypoint.sh"],
"Cmd": ["nginx", "-g", "daemon off;"],
"Labels": {
"maintainer": "NGINX Docker Maintainers \u003cdocker-maint@nginx.com\u003e"
},
"StopSignal": "SIGQUIT"
},
"rootfs": { "type": "layers", "diff_ids": null }
},
"ociManifest": {
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"config": { "digest": "", "size": 0 },
"layers": null
}
}
You might notice that the content of ociImage and ociManifest looks familiar. That’s because we’ve already seen it in the How-to build OCI Image by hands post. The ociImage and ociManifest properties inside builder.json are serialized representations of the Image and Manifest structs defined in the OCI image spec repo, which, besides the specification itself, also contains Go types, intra-blob validation tooling, and a JSON Schema.
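This means the builder state can be modeled with those types directly. The struct below is a reconstruction of the builder.json shape shown above using the image-spec Go types; the field tags follow the JSON, but it is not copied from the repo.

// Reconstruction of the builder.json shape with image-spec types.
package main

import (
	"encoding/json"
	"os"

	v1 "github.com/opencontainers/image-spec/specs-go/v1"
)

type builder struct {
	FromImage   string      `json:"fromImage"`
	WorkDirID   string      `json:"workDirId"`
	OCIImage    v1.Image    `json:"ociImage"`
	OCIManifest v1.Manifest `json:"ociManifest"`
}

func main() {
	// Load the builder state the way cbt might read it back from disk.
	data, err := os.ReadFile(os.ExpandEnv("$HOME/.cbt/nginx/builder.json"))
	if err != nil {
		panic(err)
	}
	var b builder
	if err := json.Unmarshal(data, &b); err != nil {
		panic(err)
	}
	_ = json.NewEncoder(os.Stdout).Encode(b.OCIImage.Config)
}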
cbt build
The build command turns the layers we’ve created and the configuration we’ve specified into an OCI image.
cbt build nginx oci-archive:nginx.tar.gz --layers app
It expects two arguments: the work dir name and the target image reference.
After running the above command, cbt will store the resulting image in the nginx.tar.gz file in the current directory. This image will include:
- all the layers from the base image, and
- the user-created layers, in the order specified in the --layers flag.
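For each user layer, build has to do a bit of bookkeeping: the digest and size of the compressed blob go into the manifest’s layer list, and the digest of the uncompressed tar (the DiffID) goes into the config’s rootfs. A sketch of that step, with placeholder digest values standing in for the ones computed from the actual tarballs:

// Sketch: per-layer bookkeeping when appending a layer to an image.
package main

import (
	"fmt"

	"github.com/opencontainers/go-digest"
	v1 "github.com/opencontainers/image-spec/specs-go/v1"
)

func appendLayer(img *v1.Image, m *v1.Manifest, blobDigest, diffID digest.Digest, size int64) {
	m.Layers = append(m.Layers, v1.Descriptor{
		MediaType: v1.MediaTypeImageLayerGzip, // "application/vnd.oci.image.layer.v1.tar+gzip"
		Digest:    blobDigest,
		Size:      size,
	})
	img.RootFS.DiffIDs = append(img.RootFS.DiffIDs, diffID)
}

func main() {
	var (
		img v1.Image
		m   v1.Manifest
	)
	img.RootFS.Type = "layers"

	// Placeholder digest values; the real ones are computed from the layer tarball.
	appendLayer(&img, &m,
		"sha256:661ff4d9561e3fd050929ee5097067c34bafc523ee60f5294a37fd08056a73ca",
		"sha256:0deab21fa3beed803ec512c91e2a49182a7d76d91495591d73e41f479a426756",
		3408480)

	fmt.Println(len(m.Layers), "layer(s),", len(img.RootFS.DiffIDs), "diff id(s)")
}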
The full sequence of commands looks like this:
cbt from oci-layout:/var/tmp/nginx
# configure image
cbt config nginx --os linux --arch amd64
# create app layer
APP_LAYER_DIR=~/.cbt/nginx/layers/app
mkdir -p $APP_LAYER_DIR
# add content to app layer
mkdir -p $APP_LAYER_DIR/usr/share/nginx/html
echo '<h1>hello world</h1>' >> $APP_LAYER_DIR/usr/share/nginx/html/index.html
cbt build nginx oci-archive:nginx.tar.gz --layers app
Additionally, we can upload our image to Docker Hub:
mkdir nginx-oci-layout
tar -xvzf nginx.tar.gz -C ./nginx-oci-layout
skopeo copy oci:nginx-oci-layout docker://platonkorzh/nginx:test
To test it, run docker run -it -p 8080:80 platonkorzh/nginx:test and open http://localhost:8080.
Implementation notes
Since cbt does not support the run command yet, it’s more of an archival tool than a “build tool”: most of our effort is spent reading and writing tar archives.
Tar archives
All the images are packed into tar archives. Tar is an archival, or “container”, format, meaning it does not provide compression. People use additional tools like bzip2 or gzip to compress it.
Logically speaking, a tar file is a linear sequence of entries. An entry consists of a header and a file body. The header is a structure that contains metadata about the file:
- file name,
- permission and mode bits,
- size,
- owner’s user id and group id,
- checksum, file type, etc.
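In Go this maps onto the standard library’s archive/tar package almost one to one. Here is a small example that writes a single entry (header plus body) through compress/gzip, since tar itself does not compress:

// Write one tar entry (header + body) into a gzip-compressed archive.
package main

import (
	"archive/tar"
	"compress/gzip"
	"os"
)

func main() {
	out, err := os.Create("layer.tar.gz")
	if err != nil {
		panic(err)
	}
	defer out.Close()

	gz := gzip.NewWriter(out)
	tw := tar.NewWriter(gz)

	body := []byte("<h1>hello world</h1>\n")
	hdr := &tar.Header{
		Name: "usr/share/nginx/html/index.html", // file name
		Mode: 0o644,                             // permission and mode bits
		Size: int64(len(body)),                  // size
		Uid:  0,                                 // owner's user id
		Gid:  0,                                 // owner's group id
	}
	if err := tw.WriteHeader(hdr); err != nil {
		panic(err)
	}
	if _, err := tw.Write(body); err != nil {
		panic(err)
	}

	// Close in order: tar footer first, then the gzip stream.
	if err := tw.Close(); err != nil {
		panic(err)
	}
	if err := gz.Close(); err != nil {
		panic(err)
	}
}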
We already know that the OCI manifest contains a list of layers. Each layer has a digest over the compressed tar archive. In addition to that, the OCI image config has a DiffID property, which is another digest over the uncompressed tar archive. To help us compute digests there is the common go-digest package.
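Here is how both digests can be computed with go-digest, assuming a layer.tar.gz like the one produced in the previous sketch: the layer digest is taken over the compressed bytes, the DiffID over the decompressed stream.

// Compute the layer digest (compressed) and DiffID (uncompressed) of a blob.
package main

import (
	"compress/gzip"
	"fmt"
	"io"
	"os"

	"github.com/opencontainers/go-digest"
)

func main() {
	f, err := os.Open("layer.tar.gz")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	// Digest over the compressed bytes (goes into the manifest's layers list).
	layerDigest, err := digest.Canonical.FromReader(f)
	if err != nil {
		panic(err)
	}

	// Rewind and digest the uncompressed stream (goes into rootfs.diff_ids).
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		panic(err)
	}
	gz, err := gzip.NewReader(f)
	if err != nil {
		panic(err)
	}
	diffID, err := digest.Canonical.FromReader(gz)
	if err != nil {
		panic(err)
	}

	fmt.Println("layer digest:", layerDigest)
	fmt.Println("diff id:     ", diffID)
}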
Reading and writing
cbt works with unpacked images; tar archives are unpacked into temporary directories. As a result, we always end up with a directory containing the OCI image, for example:
├── blobs
│ └── sha256
│ ├── 0deab21fa3beed803ec512c91e2a49182a7d76d91495591d73e41f479a426756
│ ├── 661ff4d9561e3fd050929ee5097067c34bafc523ee60f5294a37fd08056a73ca
│ └── 7ffe15c5a686c84c102aa6079430511a2d68c63c512f8b1de9ca09dbe8c39da5
├── index.json
└── oci-layout
From here on it’s mostly a lot of I/O operations. To get the list of manifests we read index.json.
# index.json
{
"schemaVersion": 2,
"manifests": [
{
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"digest": "sha256:0deab21fa3beed803ec512c91e2a49182a7d76d91495591d73e41f479a426756",
"size": 405,
"annotations": {
"org.opencontainers.image.ref.name": "alpine:latest"
}
}
]
}
To get a specific manifest we take its digest, e.g. sha256:0deab21fa3beed803ec512c91e2a49182a7d76d91495591d73e41f479a426756, and read it from the blobs dir. Remember that blobs are named after their digest, so technically the digest is a file name.
# "blobs/sha256/0deab21fa3beed803ec512c91e2a49182a7d76d91495591d73e41f479a426756"
{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"config": {
"mediaType": "application/vnd.oci.image.config.v1+json",
"digest": "sha256:7ffe15c5a686c84c102aa6079430511a2d68c63c512f8b1de9ca09dbe8c39da5",
"size": 585
},
"layers": [
{
"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
"digest": "sha256:661ff4d9561e3fd050929ee5097067c34bafc523ee60f5294a37fd08056a73ca",
"size": 3408480
}
]
}
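Putting the two steps together, here is a sketch of walking an unpacked layout with the image-spec types: read index.json, pick a manifest descriptor, and load the manifest blob from blobs/<algorithm>/<hex>. The /var/tmp/nginx path is the layout produced by the skopeo copy command earlier.

// Walk an unpacked OCI layout: index.json -> manifest descriptor -> manifest blob.
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"

	v1 "github.com/opencontainers/image-spec/specs-go/v1"
)

func readJSON(path string, out any) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	return json.Unmarshal(data, out)
}

func main() {
	layout := "/var/tmp/nginx" // produced by `skopeo copy ... oci:/var/tmp/nginx`

	var index v1.Index
	if err := readJSON(filepath.Join(layout, "index.json"), &index); err != nil {
		panic(err)
	}

	desc := index.Manifests[0]
	// Blobs are named after their digest: blobs/<algorithm>/<hex>.
	blobPath := filepath.Join(layout, "blobs", desc.Digest.Algorithm().String(), desc.Digest.Encoded())

	var manifest v1.Manifest
	if err := readJSON(blobPath, &manifest); err != nil {
		panic(err)
	}

	fmt.Println("config blob:", manifest.Config.Digest)
	for _, l := range manifest.Layers {
		fmt.Println("layer:", l.Digest, l.Size)
	}
}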
internal/ vs pkg/
I decided to organize all my packages into an internal directory. Firstly, I’m not sure if anyone would like to use my packages. Secondly, I don’t want to make any promises regarding API stability.
io.TeeReader
Another pleasant discovery I made in Go is io.TeeReader.
// TeeReader returns a Reader that writes to w what it reads from r.
// All reads from r performed through it are matched with
// corresponding writes to w. There is no internal buffering -
// the write must complete before the read completes.
// Any error encountered while writing is reported as a read error.
func TeeReader(r Reader, w Writer) Reader {
return &teeReader{r, w}
}
I use it to calculate a layer’s digest while the layer gets io.Copy‘ed to the filesystem.
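Something along these lines, combining io.TeeReader with a go-digest Digester so the hash is fed while the copy happens (a minimal sketch, not the exact cbt code):

// Copy a blob to disk while computing its digest in a single pass.
package main

import (
	"fmt"
	"io"
	"os"
	"strings"

	"github.com/opencontainers/go-digest"
)

func copyWithDigest(dst io.Writer, src io.Reader) (digest.Digest, error) {
	digester := digest.Canonical.Digester()
	// Everything io.Copy reads from src is also written into the digester's hash.
	if _, err := io.Copy(dst, io.TeeReader(src, digester.Hash())); err != nil {
		return "", err
	}
	return digester.Digest(), nil
}

func main() {
	out, err := os.Create("blob")
	if err != nil {
		panic(err)
	}
	defer out.Close()

	d, err := copyWithDigest(out, strings.NewReader("example layer content"))
	if err != nil {
		panic(err)
	}
	fmt.Println(d)
}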
This post is part of a series.
- Part 1: Container build tool
- Part 2: How-to build OCI Image by hands
- Part 3: Building OCI images with Go. No run command yet
- Part 4: How to Tar/Untar container layers in Go
- Part 5: Linux kernel namespaces
- Part 6: Mini container runtime in Go
- Part 7: Union mount