How to build an OCI image by hand
The Open Container Initiative (OCI) defines standards, or specs, for containers. Back in the early days Docker had its own container and runtime formats.
rkt, a container runtime created by CoreOS as an alternative to Docker, also had its own container format. To make everything interoperable, people created the OCI, which defines three specifications:
- the Runtime Specification (runtime-spec),
- the Image Specification (image-spec),
- and the Distribution Specification (distribution-spec).
In this article we will explore the image spec with the goal of building our own OCI image without build tools.
Container image layers
Each container image consists of layers, or blobs, and a configuration. If we take the contents of a file system and archive it, we get a layer. Let’s see it in action by running docker save alpine > alpine.tar and then unpacking the archive:
docker pull alpine
docker save alpine > alpine.tar
#unpack it
tar xvf alpine.tar
tree .
.
├── 3cc20332140056b331ad58185ab589c085f4e7d79d8c9769533d6a9b95d4b1b0.json
├── e863aefeb0c938ac2eb625d83bb2f5094568ba00a1ca521496a7a98f0e57ba27
│   ├── VERSION
│   ├── json
│   └── layer.tar
├── manifest.json
└── repositories
Inside the e86...a27 folder there is layer.tar. Let’s unpack it as well.
tree .
.
├── bin
├── dev
├── etc
├── home
├── lib
├── media
├── mnt
├── opt
├── proc
├── root
├── run
├── sbin
├── srv
├── sys
├── tmp
├── usr
└── var
It looks like a pretty standard Linux file system directory tree.
Each layer is nothing more than a tar archive.
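To make the "a layer is just a tar archive" idea concrete, here is a minimal sketch that packs a directory into a layer and lists what ended up inside. The function names make_layer and list_layer are made up for this illustration, and the demo runs in a throwaway directory.

```shell
# Pack the contents of a directory into a layer archive.
# $1: directory holding the layer's files, $2: output tar path.
make_layer() {
    # -C switches into the directory first, so the paths stored in the
    # archive are relative to it (./bin/hello, not rootfs/bin/hello).
    tar -cf "$2" -C "$1" .
}

# List the paths stored in a layer archive.
list_layer() {
    tar -tf "$1"
}

# Demo in a throwaway directory.
demo=$(mktemp -d)
mkdir -p "$demo/rootfs/bin"
echo 'placeholder' > "$demo/rootfs/bin/hello"
make_layer "$demo/rootfs" "$demo/layer.tar"
list_layer "$demo/layer.tar"
rm -rf "$demo"
```

The listing should contain the bin directory and the hello file, which is the same kind of content we found inside layer.tar above.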
Content Addressable Storage
Content Addressable Storage (CAS) is a way of storing information so that it can be retrieved based on its content rather than its name or location. In CAS each file is assigned a hash that represents its content. Because identical content always produces the same hash, the same data is stored only once.
Usually we use a file’s path to get its content. For example, we can run cat /etc/abc/xyz.conf to get the content of that file. In the container world we use content-addressable identities to reference files.
#create regular file
echo 'hello' > content-addressable-identity
#get its hash
sha256sum < content-addressable-identity
5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03 -
#rename the file
mv content-addressable-identity 5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03
cat 5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03
hello
Now we use the file’s hash, rather than its path, to get its content.
Layers, or blobs, are named after their SHA-256 sum.
Let’s look at the contents of the alpine.tar archive again and compute the SHA-256 sum of the first file.
tree .
.
├── 3cc20332140056b331ad58185ab589c085f4e7d79d8c9769533d6a9b95d4b1b0.json
├── e863aefeb0c938ac2eb625d83bb2f5094568ba00a1ca521496a7a98f0e57ba27
# truncated
sha256sum < 3cc20332140056b331ad58185ab589c085f4e7d79d8c9769533d6a9b95d4b1b0.json
3cc20332140056b331ad58185ab589c085f4e7d79d8c9769533d6a9b95d4b1b0 -
This approach is also used to verify integrity. If someone tampers with a file, its hash no longer matches its name, and the file won’t be used.
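The store-by-hash and verify-by-hash steps can be sketched as a tiny content-addressable store. This is only an illustration: digest_of, cas_put and cas_verify are hypothetical helper names, not part of any container tool.

```shell
# Print the SHA-256 hex digest of a file.
digest_of() {
    sha256sum < "$1" | cut -d' ' -f1
}

# Copy a file into the store under its own digest and print that digest.
cas_put() {   # $1: store directory, $2: file to store
    d=$(digest_of "$2")
    mkdir -p "$1"
    cp "$2" "$1/$d"
    echo "$d"
}

# Succeed only if the stored content still hashes to its name.
cas_verify() {   # $1: store directory, $2: digest
    [ "$(digest_of "$1/$2")" = "$2" ]
}

# Demo: store a file, verify it, tamper with it, verify again.
tmp=$(mktemp -d)
echo 'hello' > "$tmp/greeting"
id=$(cas_put "$tmp/store" "$tmp/greeting")
cas_verify "$tmp/store" "$id" && echo "intact: $id"
echo 'tampered' >> "$tmp/store/$id"
cas_verify "$tmp/store" "$id" || echo "tampering detected"
rm -rf "$tmp"
```

Because the name is derived from the content, the check needs nothing but the file itself: no separate checksum list has to be kept in sync.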
Union File System
We know that images consist of multiple layers. Those layers need to be merged together to get a unified view of the directory structure. Consider this Dockerfile:
# dockerfile
FROM alpine
ADD hello /bin/hello
Now if we build this with docker build . -t multilayers we will get an image with 2 layers:
- the base layer provided by the alpine image,
- and a layer containing only the /bin/hello file.
docker image inspect multilayers | jq '.[0].RootFS'
{
  "Type": "layers",
  "Layers": [
    "sha256:5f4d9fc4d98de91820d2a9c81e501c8cc6429bc8758b43fcb2cd50f4cab9a324",
    "sha256:86922ebcfe54c8c254f94a8624bbc7233ca6b3d644d0c97c384458508956c6c1"
  ]
}
Container engines rely on storage drivers to merge layers. The default storage driver in Docker is overlay2.
Overlay2 allows us to create a multi-layered virtual file system in which layers are overlaid on top of each other. The virtual file system is built from lower layers and an upper layer, and each layer is basically a directory. Lower layers are read-only. The upper layer is read-write.
Let’s inspect the image and look at the content of the GraphDriver property.
docker image inspect multilayers | jq '.[0].GraphDriver'
{
  "Data": {
    "LowerDir": "/var/lib/docker/overlay2/b54e0ad78336df458acd8bedb753790352df983a61f24915ae5059cfbdaa9a88/diff",
    "MergedDir": "/var/lib/docker/overlay2/kokkj58zxlqgfte7rbw5mcn4g/merged",
    "UpperDir": "/var/lib/docker/overlay2/kokkj58zxlqgfte7rbw5mcn4g/diff",
    "WorkDir": "/var/lib/docker/overlay2/kokkj58zxlqgfte7rbw5mcn4g/work"
  },
  "Name": "overlay2"
}
# look up content of UpperDir
tree . /var/lib/docker/overlay2/kokkj58zxlqgfte7rbw5mcn4g/diff
.
/var/lib/docker/overlay2/kokkj58zxlqgfte7rbw5mcn4g/diff
└── bin
    └── hello
We can see that it contains only the data we added in our Dockerfile.
This is why each layer is called a "diff": it contains only the changes made on top of the lower layers.
All in all, layers are nothing special. They are just blobs that are expanded into directories and then mounted together to get a unified view.
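To get a feel for what the unified view contains without needing root or a real overlay mount, we can approximate it by unpacking layers in order into one directory. This is a copy-based simplification, not what overlay2 does: nothing is mounted, and deletions (which OCI layers express with whiteout files) are not handled. The name flatten_layers is made up for this sketch.

```shell
# Apply layer archives in order into one directory.
# Files from later layers overwrite files from earlier ones.
# $1: target directory, remaining arguments: layer tars, lowest layer first.
flatten_layers() {
    target=$1
    shift
    mkdir -p "$target"
    for layer in "$@"; do
        tar -xf "$layer" -C "$target"
    done
}

# Demo: a base layer and an upper layer that changes one file and adds another.
demo=$(mktemp -d)
mkdir -p "$demo/base/etc" "$demo/upper/etc" "$demo/upper/bin"
echo 'from base'  > "$demo/base/etc/motd"
echo 'from upper' > "$demo/upper/etc/motd"
echo 'binary'     > "$demo/upper/bin/hello"
tar -cf "$demo/base.tar"  -C "$demo/base"  .
tar -cf "$demo/upper.tar" -C "$demo/upper" .
flatten_layers "$demo/merged" "$demo/base.tar" "$demo/upper.tar"
cat "$demo/merged/etc/motd"
rm -rf "$demo"
```

The merged directory ends up with the upper layer's version of etc/motd plus bin/hello, which mirrors what the MergedDir of an overlay mount would show.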
OCI Image Layout
The image we’ve seen before was not OCI compliant. What we saw was a Docker image. Let’s fix that and see what an OCI image looks like.
#simple dockerfile
cat dockerfile
FROM alpine
CMD echo 'hello world!'
#build it using buildx
docker buildx build -o type=oci,dest=oci.tar .
#unpack it (the archive is a plain, uncompressed tar)
tar xf oci.tar
tree .
.
├── blobs
│   └── sha256
│       ├── 2ab6241fbe26fe4ce86dba1231bcd2dc73101e75b3f6627e0ca1d6ddfe206632
│       ├── 579b34f0a95bb83b3acd6b3249ddc52c3d80f5c84b13c944e9e324feb86dd329
│       └── ba692c6a83ea13339c26d640da9aaae69b635efb854be1a4a13e563e965c31a4
├── dockerfile
├── index.json
├── oci-layout
└── oci.tar
#You can ignore dockerfile and oci.tar as we created those.
The directory tree you see is called the OCI Image Layout. Here’s what the OCI image spec says about it:
- The OCI Image Layout is the directory structure for OCI content-addressable blobs and location-addressable references (refs).
In this layout:
- the blobs directory contains content-addressable blobs,
- the oci-layout file contains a JSON object with an imageLayoutVersion field,
- index.json is the image index, which lists the image’s manifests.
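The three entries listed above are easy to check for. The sketch below only tests that they are present; it does not validate the JSON or the blobs, and is_oci_layout is a name invented for this example.

```shell
# Succeed if a directory has the three entries of an OCI Image Layout:
# a blobs directory, index.json, and an oci-layout file that mentions
# the imageLayoutVersion field.
is_oci_layout() {   # $1: directory to check
    [ -d "$1/blobs" ] &&
    [ -f "$1/index.json" ] &&
    [ -f "$1/oci-layout" ] &&
    grep -q '"imageLayoutVersion"' "$1/oci-layout"
}
```

For example, is_oci_layout . && echo "looks like an OCI layout" can be run in the directory where oci.tar was unpacked.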
OCI Image format specification
The specification defines an OCI image as consisting of:
- an image index, stored in index.json,
- an image manifest, referenced from index.json and stored as a blob,
- a set of filesystem layers (blobs),
- and a configuration, also stored as a blob.
The image index is simply a list of manifests.
# index.json
cat index.json | jq
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:ba692c6a83ea13339c26d640da9aaae69b635efb854be1a4a13e563e965c31a4",
      "size": 480,
      "annotations": {
        "org.opencontainers.image.created": "2023-10-26T15:55:20Z"
      },
      "platform": {
        "architecture": "arm64",
        "os": "linux"
      }
    }
  ]
}
The digest property is the hash of a manifest. And since we are using Content Addressable Storage, we can get the content of a manifest by its hash.
# manifest
cat blobs/sha256/ba692c6a83ea13339c26d640da9aaae69b635efb854be1a4a13e563e965c31a4 | jq
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:2ab6241fbe26fe4ce86dba1231bcd2dc73101e75b3f6627e0ca1d6ddfe206632",
    "size": 809
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:579b34f0a95bb83b3acd6b3249ddc52c3d80f5c84b13c944e9e324feb86dd329",
      "size": 3331831
    }
  ]
}
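The lookup we just did by hand follows a fixed rule: a digest of the form algorithm:hex lives at blobs/algorithm/hex inside the layout. A small helper makes that rule explicit; blob_path is a hypothetical name used only here.

```shell
# Translate a digest such as "sha256:ba69..." into the blob's path
# inside an OCI Image Layout.
blob_path() {   # $1: layout directory, $2: digest in "algorithm:hex" form
    algo=${2%%:*}   # text before the first colon, e.g. sha256
    hex=${2#*:}     # text after the first colon
    echo "$1/blobs/$algo/$hex"
}
```

With it, the manifest above could be printed with cat "$(blob_path . sha256:ba692c6a83ea13339c26d640da9aaae69b635efb854be1a4a13e563e965c31a4)".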
A manifest lists the layers our image consists of and points to the image configuration. At this point it should be really easy to get the content of the configuration.
# configuration
cat blobs/sha256/2ab6241fbe26fe4ce86dba1231bcd2dc73101e75b3f6627e0ca1d6ddfe206632 | jq
{
  "architecture": "arm64",
  "config": {
    "Env": [
      "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
    ],
    "Cmd": [
      "/bin/sh",
      "-c",
      "echo 'hello world!'"
    ],
    "ArgsEscaped": true,
    "OnBuild": null
  },
  "created": "2023-09-28T20:39:34.079909813Z",
  "history": [ ... ],
  "os": "linux",
  "rootfs": {
    "type": "layers",
    "diff_ids": [
      "sha256:5f4d9fc4d98de91820d2a9c81e501c8cc6429bc8758b43fcb2cd50f4cab9a324"
    ]
  }
}
As you can see, the image configuration defines, among other things, environment variables, the command and the entrypoint. It also defines a rootfs object that lists the image layers under the diff_ids property. We already know that each layer is a “diff” that contains only the changes made to an image, so the naming should make sense. We also know that we use CAS (Content Addressable Storage), so diff_ids is a list of hashes of our layers, with one distinction: each hash is computed over the uncompressed layer tar archive. This is used to verify data integrity. It differs from the digest of a layer inside a manifest, which is the hash of the compressed archive (the blob).
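The difference between the two hashes is easier to see side by side. In the sketch below the same tar archive is compressed at two gzip levels: the compressed blobs differ, so their layer digests would normally differ too, while the diff_id stays the same because it only depends on the uncompressed bytes. The helper names layer_digest and diff_id are made up for this illustration.

```shell
# Digest of the compressed blob: what the manifest's layers[].digest holds.
layer_digest() {
    sha256sum < "$1" | cut -d' ' -f1
}

# Digest of the uncompressed tar: what the config's rootfs.diff_ids holds.
diff_id() {
    gunzip < "$1" | sha256sum | cut -d' ' -f1
}

# Demo: compress one tar at two levels and compare the results.
work=$(mktemp -d)
mkdir "$work/fs"
echo 'hello' > "$work/fs/file"
tar -cf "$work/layer.tar" -C "$work/fs" .
gzip -n -1 < "$work/layer.tar" > "$work/fast.tar.gz"
gzip -n -9 < "$work/layer.tar" > "$work/best.tar.gz"
echo "layer digest (fast): $(layer_digest "$work/fast.tar.gz")"
echo "layer digest (best): $(layer_digest "$work/best.tar.gz")"
echo "diff_id      (fast): $(diff_id "$work/fast.tar.gz")"
echo "diff_id      (best): $(diff_id "$work/best.tar.gz")"
rm -rf "$work"
```

This is why a registry can recompress a layer without changing what the image configuration says about its content.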
Whenever you hear someone say “multi-platform image”, it means that the image index contains manifests for multiple platforms. Container repository tagging works in a similar way: each tag basically points to a manifest, which in turn points to a set of layers and a configuration. For a multi-platform image, the image index is then used to look up the manifest for the requested platform.
Getting to the point
We finally have all the knowledge we need to build our very own OCI image. To keep things simple I will include a single binary in the layer. You can use any binary you want; I’ll use Go to create mine. The reason I’m using Go is that it produces a statically linked binary with no external dependencies.
package main

import "fmt"

func main() {
	fmt.Println("hello world")
}
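To get the binary we need to build it. The binary must match the platform of the image (linux/arm64 in this article), so if you are building on another OS or architecture, set GOOS and GOARCH accordingly. Setting CGO_ENABLED=0 keeps the binary statically linked.
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build hello-world.go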
Now we need to create our layer’s directory.
mkdir -p layer/bin
cp hello-world ./layer/bin
tree layer
layer
└── bin
    └── hello-world
Now let’s create the OCI image layout.
mkdir -p image/blobs/sha256
touch image/index.json
touch image/oci-layout
tree .
.
└── image
    ├── blobs
    │   └── sha256
    ├── index.json
    └── oci-layout
Let’s start by creating a layer out of the layer directory we created above.
cd layer
tar -czvf ../layer.tar.gz *
a bin
a bin/hello-world
cd ..
Now we should have our layer archived into layer.tar.gz. To use that layer inside our config file we need the hash of its uncompressed content:
gunzip < layer.tar.gz | sha256sum
918299505cdc628639a0c0ebc767aeab9419b680aaa3a738d2e5976b0bb6c4e0 -
This gives us the uncompressed layer’s hash, which we will use for diff_ids. We need to reference it inside the config.
{
  "architecture": "arm64",
  "os": "linux",
  "config": {
    "Env": [
      "PATH=/bin"
    ],
    "Entrypoint": [
      "hello-world"
    ]
  },
  "rootfs": {
    "type": "layers",
    "diff_ids": [
      "sha256:918299505cdc628639a0c0ebc767aeab9419b680aaa3a738d2e5976b0bb6c4e0"
    ]
  },
  "history": [
    {
      "created_by": "Platon Korzh"
    }
  ]
}
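Put this content inside a config.json file. The first thing we need to do is get its hash. Your hashes will differ from the ones shown here, since they depend on the exact bytes of your files.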
sha256sum < config.json
c81fe38d5f0c3c91a7d0506244b413e9a76ed66b81af882575ada912ee7e9e1b -
mv config.json image/blobs/sha256/c81fe38d5f0c3c91a7d0506244b413e9a76ed66b81af882575ada912ee7e9e1b
Now let’s get the hash of our layer’s archive.
sha256sum < layer.tar.gz
6cdd002eef9eebc1c8d2f58dc67789b660ed4b3426c1492e65f9d4078cf76838 -
mv layer.tar.gz image/blobs/sha256/6cdd002eef9eebc1c8d2f58dc67789b660ed4b3426c1492e65f9d4078cf76838
The next step is to create the image manifest.
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "size": 460,
    "digest": "sha256:c81fe38d5f0c3c91a7d0506244b413e9a76ed66b81af882575ada912ee7e9e1b"
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "size": 1125888,
      "digest": "sha256:6cdd002eef9eebc1c8d2f58dc67789b660ed4b3426c1492e65f9d4078cf76838"
    }
  ]
}
In config.digest we pass the hash of the config file we created earlier. In layers[0].digest we pass the hash of our layer’s archive. Each size field must equal the size in bytes of the blob it refers to, which wc -c can report. Now we save that file as manifest.json and get its hash.
sha256sum < manifest.json
7ecc5d1e321c8143040f8fc57d24d7544408dd4276b5cc1b1ae2bc9652a7dd12 -
mv manifest.json image/blobs/sha256/7ecc5d1e321c8143040f8fc57d24d7544408dd4276b5cc1b1ae2bc9652a7dd12
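Every reference we have written so far (config, layer, and soon the manifest itself) has the same three fields: mediaType, digest and size. Rather than typing them by hand, they can be computed from the file. The following is a sketch under that assumption; descriptor is a hypothetical helper name, and the output is printed on a single line rather than pretty-printed.

```shell
# Print an OCI descriptor (mediaType, digest, size) for a file.
descriptor() {   # $1: media type, $2: file
    digest=$(sha256sum < "$2" | cut -d' ' -f1)
    # wc -c counts bytes; tr strips the padding some wc versions add.
    size=$(wc -c < "$2" | tr -d ' ')
    printf '{ "mediaType": "%s", "digest": "sha256:%s", "size": %s }\n' "$1" "$digest" "$size"
}
```

Run it on a file before moving that file into blobs/sha256, for example descriptor application/vnd.oci.image.config.v1+json config.json, and paste the result into the manifest.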
At the end, the image directory should have the following structure.
tree .
.
├── blobs
│   └── sha256
│       ├── 6cdd002eef9eebc1c8d2f58dc67789b660ed4b3426c1492e65f9d4078cf76838
│       ├── 7ecc5d1e321c8143040f8fc57d24d7544408dd4276b5cc1b1ae2bc9652a7dd12
│       └── c81fe38d5f0c3c91a7d0506244b413e9a76ed66b81af882575ada912ee7e9e1b
├── index.json
└── oci-layout
At this point all we have to do is put { "imageLayoutVersion": "1.0.0" } inside the oci-layout file and write the index file, index.json, which references the manifest file by its hash.
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:7ecc5d1e321c8143040f8fc57d24d7544408dd4276b5cc1b1ae2bc9652a7dd12",
      "size": 250,
      "platform": { "architecture": "arm64", "os": "linux" }
    }
  ]
}
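Since every file was hashed and moved by hand, a quick sanity check before uploading can save some head scratching. The sketch below follows the references from index.json to the manifest and from the manifest to its blobs, and checks that each blob exists and hashes to its own name. It is a rough check under stated assumptions: digests are scraped with grep instead of a JSON parser, the index is assumed to list only image manifests, and any sha256 string in those files is treated as a blob reference. The names digests_in and check_layout are made up for this example.

```shell
# List every "sha256:<hex>" digest mentioned in a file, one per line.
digests_in() {
    grep -o 'sha256:[0-9a-f]\{64\}' "$1"
}

# Check that each digest referenced by index.json, and by the manifests it
# points to, exists under blobs/sha256 and hashes to its own name.
check_layout() {   # $1: layout directory
    for ref in $(digests_in "$1/index.json"); do
        manifest="$1/blobs/sha256/${ref#sha256:}"
        [ -f "$manifest" ] || { echo "missing manifest $ref"; return 1; }
        for d in "$ref" $(digests_in "$manifest"); do
            blob="$1/blobs/sha256/${d#sha256:}"
            [ -f "$blob" ] || { echo "missing blob $d"; return 1; }
            actual=$(sha256sum < "$blob" | cut -d' ' -f1)
            [ "sha256:$actual" = "$d" ] || { echo "digest mismatch for $d"; return 1; }
        done
    done
    echo "layout ok"
}
```

Running check_layout . inside the image directory prints "layout ok" when every reference resolves, or names the first digest that does not.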
And that’s it! Our OCI image is ready! To test it, let’s upload it to Docker Hub using the skopeo tool, run from inside the image directory.
skopeo copy oci:./ docker://platonkorzh/oci-image
Getting image source signatures
Copying blob 6cdd002eef9e done
Copying config c81fe38d5f done
Writing manifest to image destination
And the moment of truth:
docker run platonkorzh/oci-image
Unable to find image 'platonkorzh/oci-image:latest' locally
latest: Pulling from platonkorzh/oci-image
Digest: sha256:7ecc5d1e321c8143040f8fc57d24d7544408dd4276b5cc1b1ae2bc9652a7dd12
Status: Downloaded newer image for platonkorzh/oci-image:latest
hello world
This post is part of a series.
- Part 1: Container build tool
- Part 2: How to build an OCI image by hand
- Part 3: Building OCI images with Go. No run command yet
- Part 4: How to Tar/Untar container layers in Go
- Part 5: Linux kernel namespaces
- Part 6: Mini container runtime in Go
- Part 7: Union mount