Mini container runtime in Go
This is a direct continuation of the post on Linux kernel namespaces. Since most of the theory has already been covered, this part focuses on the code.
We start by getting a busybox root filesystem:
mkdir busybox
cd busybox
wget https://github.com/jpetazzo/docker-busybox/raw/master/rootfs.tar
tar xvf rootfs.tar
To run the mini runtime, execute sudo go run ./cmd/minicr/ /home/user/busybox. Make sure to replace /home/user/busybox with your path. The runtime will execute /bin/bash inside busybox’s filesystem.
Code is located here: https://github.com/pkorzh/container-build-tool/tree/v0
Code stability: it works on my machine.
The C code is posted at the end of the page.
We re-execute ourselves, effectively running a bootstrap process. For an alternative implementation, see Docker’s reexec package.
This is the bootstrap process. The Command is hidden so that end users don’t see it.
Since parent’s mount list can be
Mount
This causes the program that is currently being run by the calling process to be replaced with a new program, with newly initialized stack, heap, and (initialized and uninitialized) data segments.
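In Go, this kind of replacement is available through `syscall.Exec`, which wraps execve(2). The snippet below is a minimal sketch rather than the runtime's actual code. The program name `execdemo` and the helper `execArgv` are made up for illustration.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// execArgv builds the argument vector for the new program.
// By convention argv[0] is the program's own path or name.
func execArgv(path string, args ...string) []string {
	return append([]string{path}, args...)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: execdemo <absolute-path> [args...]")
		return
	}
	path := os.Args[1]
	fmt.Println("before exec, pid", os.Getpid())

	// On success syscall.Exec never returns: the process image is replaced
	// and the new program keeps this process's PID.
	err := syscall.Exec(path, execArgv(path, os.Args[2:]...), os.Environ())

	// Reached only if the exec itself failed.
	fmt.Fprintln(os.Stderr, "exec failed:", err)
	os.Exit(1)
}
```

Because the PID is preserved, a shell started this way from PID 1 of a new PID namespace reports `1` for `echo $$`.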
We need this to satisfy restriction of
If we execute the code, we can poke around inside the container:
$ sudo go run ./cmd/minicr/ /home/platon/p/busybox
/ # echo $$
1
/ # hostname
inside-container
/ # ps aux
PID USER COMMAND
1 root /bin/bash
8 root ps aux
/ # ls
bin etc lib linuxrc mnt proc run sys usr
dev home lib64 media opt root sbin tmp var
/ #
C code
The code below is a modified version of container_top_linux.c.
//go:build !remote

#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mount.h>
#include <sys/wait.h>
#include <unistd.h>

/* keep special_exit_code in sync with container_top_linux.go */
int special_exit_code = 255;
char **argv = NULL;

void
create_argv (int len)
{
  /* allocate one extra element because we need a final NULL in c */
  argv = malloc (sizeof (char *) * (len + 1));
  if (argv == NULL)
    {
      fprintf (stderr, "failed to allocate ps argv");
      exit (special_exit_code);
    }
  /* add final NULL */
  argv[len] = NULL;
}

void
set_argv (int pos, char *arg)
{
  argv[pos] = arg;
}

void
exec_ps ()
{
  if (argv == NULL)
    {
      fprintf (stderr, "argv not initialized");
      exit (special_exit_code);
    }
  execve (argv[0], argv, NULL);
  fprintf (stderr, "execve: %m");
  exit (special_exit_code);
}
This post is part of a series.
- Part 1: Container build tool
- Part 2: How-to build OCI Image by hands
- Part 3: Building OCI images with Go. No run command yet
- Part 4: How to Tar/Untar container layers in Go
- Part 5: Linux kernel namespaces
- Part 6: Mini container runtime in Go
- Part 7: Union mount