"image"

To learn more about "image", see the overview page.

Discussion during EuroBSDCon devsummit 2006

1. Panellists (unordered)

RobertWatson, PoulHenningKamp, BjoernZeeb, BrooksDavis, ShteryanaShopova, MaxLaier, MikeSilbersack, PaoloPisati

2. Notes mailed out by PoulHenningKamp (slightly edited for public documentation on the wiki)

(...) I'm going to dump this load of vague ideas and ill-conceived concepts (...) via email:

2.1. Optionality

For the embedded people, it is worth considering making it a compile-time option, although I'm not sure how much would be saved. To make it compile-time optional we basically need to pick the image out of the ucred with a macro:

#ifdef IMAGES
#  define       cimg(cred)      ((cred)->c_image)
#else
#  define       cimg(cred)      (img0)
#endif

That would compile almost all the indirect pointers out, but not mangle the source code throughout the kernel any further.
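As a rough sketch of how a caller would look under this scheme: the struct layouts, the i_name field and the cred_image_name() helper below are hypothetical stand-ins; only cimg(), c_image, img0 and the IMAGES option come from the notes.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct image {
	const char *i_name;		/* illustrative field only */
};

struct ucred {
	struct image *c_image;		/* dereferenced only when IMAGES is set */
};

/* The single global image used when IMAGES is compiled out. */
static struct image image0 = { "image0" };
static struct image *img0 = &image0;

#ifdef IMAGES
#  define	cimg(cred)	((cred)->c_image)
#else
#  define	cimg(cred)	(img0)
#endif

/* Caller code is written once and is identical in both configurations. */
static const char *
cred_image_name(const struct ucred *cred)
{
	return (cimg(cred)->i_name);
}
```

Built without IMAGES, cred_image_name() always reports image0 and the pointer chase through the cred disappears; built with -DIMAGES it follows c_image.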

2.2. Management

It is obviously true that the virtualized virtualization interface is very prone to management issues.

The least awful way to deal with it, and probably the only way that has any resistance against ifconfigization, is to go the nmount/g_ctl "just use strings" route. (In fact, it might be the right time to collapse these into a generic "complex & extensible OaM API" facility, but leave that for now).

So, imagine for now that we define the mgt API as taking pairs of strings, just like nmount and g_ctl. In theory this would allow us to write an imgctl program that would not need to be changed as we add virtualization components, since all it does is wrap up arguments.

Example:

The corresponding syscall would have a char ** argument of:

        {
                "name",         "newimg",
                "vir",          "net",
                    "if",       "wi0,eth0",
                "vir",          "proc",
                    "oneway",   "true",
                "vir",          "priv",
                    ...
        }

The soft spot is that things like hostname->IP translations would have to be special-cased, but the good news is that the ABI would be stable, no matter what we do.
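To illustrate how the receiving side might pick such an argument vector apart, here is a minimal sketch. It assumes the array is NULL-terminated and that each "vir" pair opens a section that runs until the next "vir"; neither is stated in the notes, and img_arg_lookup() is a made-up name.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the value stored under `key` inside the "vir" section named
 * `comp`, or NULL if there is none.  `args` is assumed to be a
 * NULL-terminated array of name/value string pairs.
 */
static const char *
img_arg_lookup(char *const *args, const char *comp, const char *key)
{
	int in_comp = 0;
	size_t i;

	for (i = 0; args[i] != NULL && args[i + 1] != NULL; i += 2) {
		if (strcmp(args[i], "vir") == 0) {
			/* A "vir" pair starts a new component section. */
			in_comp = (strcmp(args[i + 1], comp) == 0);
			continue;
		}
		if (in_comp && strcmp(args[i], key) == 0)
			return (args[i + 1]);
	}
	return (NULL);
}
```

With the example vector above, looking up "if" in the "net" section would yield "wi0,eth0". A component that does not know a key simply never asks for it, which is what keeps the ABI unchanged as components are added.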

imgctl should probably read a .rc file with a spec also.

2.3. Implementation strategy

I would suggest:

2.4. Permeability

With respect to permeability, setuid() may be a usable model.

Imagine if a process had an "effective image" which the kernel code could optionally choose to use, so that ioctls and sysctls in ifconfig and elsewhere would use this.

That would allow us to say

Without actually putting the process into the image "as such" these would allow unmodified ifconfig and sysctl binaries, ABIs and APIs to act on the image.

Avoiding a duplication of system calls seems to be the really big gain here, at the cost of having to pay attention in the admin code in the kernel whether we act on cimg(cred) or eimg(cred).

Obviously we get tricky cases like

Where we don't want the process to show up in a ps(1) run in the image while at the same time it is manipulating the proc table of the image.
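A minimal sketch of the cimg()/eimg() split, by analogy with real and effective uids. The c_eimage field, the fallback to c_image when no effective image is set, and both helper functions are assumptions made for illustration; the notes only name the two macros.

```c
#include <stddef.h>

struct image {
	int i_id;			/* illustrative identifier */
};

struct ucred {
	struct image *c_image;		/* image the process actually lives in */
	struct image *c_eimage;		/* hypothetical: effective image, or NULL */
};

#define	cimg(cred)	((cred)->c_image)
/* Assumed behaviour: fall back to the real image when none is set. */
#define	eimg(cred)	\
	((cred)->c_eimage != NULL ? (cred)->c_eimage : (cred)->c_image)

/* Admin paths (ioctl and sysctl handlers) act on the effective image. */
static int
admin_target_id(const struct ucred *cred)
{
	return (eimg(cred)->i_id);
}

/* Visibility checks, such as for ps(1), stay on the real image. */
static int
proc_visible_in(const struct ucred *cred, const struct image *viewer)
{
	return (cimg(cred) == viewer);
}
```

Under these assumptions a host process whose effective image points at a guest would administer the guest through admin_target_id() while proc_visible_in() keeps it out of the guest's process listing.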

2.5. Efficiency

While we address the image via the cred, we may also want to cache the pointer in struct thread, but the explicit notion is that it is a cached copy (without a separate refcount) of the cred's (refcounted) reference to the image.
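The caching rule could be sketched roughly as follows. The structures are heavily simplified stand-ins, td_image and td_set_cred() are hypothetical names, and the refcounting is plain integers rather than anything the kernel would really use.

```c
#include <stddef.h>

struct image {
	int i_refs;			/* references held by creds only */
};

struct ucred {
	int cr_ref;			/* references held on this cred */
	struct image *c_image;		/* counted reference to the image */
};

struct thread {
	struct ucred *td_ucred;		/* counted reference to the cred */
	struct image *td_image;		/* hypothetical uncounted cache */
};

/*
 * Switch a thread to a new cred.  The cache is refreshed here and
 * nowhere else, and the image refcount is never touched: the cached
 * pointer stays valid only because td_ucred pins the cred, which in
 * turn pins the image.
 */
static void
td_set_cred(struct thread *td, struct ucred *newcred)
{
	newcred->cr_ref++;
	if (td->td_ucred != NULL)
		td->td_ucred->cr_ref--;
	td->td_ucred = newcred;
	td->td_image = newcred->c_image;
}
```

The point of the sketch is that td_image has no lifetime of its own, so every place that replaces td_ucred must also refresh the cache.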

I also think that using the "netmask" idea to figure out subsetting should be adopted because it forces a max depth at the same time.
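The notes do not spell out the "netmask" idea. One possible reading, offered purely as an assumption, is that each image gets an address and prefix length in a fixed-width identifier, IP-subnet style, so that a subset test is a mask compare and the nesting depth is bounded by the number of bits. All names below are made up.

```c
#include <stdint.h>

/* Hypothetical image identifier: an address plus a prefix length. */
struct img_id {
	uint32_t addr;
	unsigned plen;			/* 0..32 significant leading bits */
};

static uint32_t
plen_to_mask(unsigned plen)
{
	/* Shifting by 32 is undefined, so handle plen == 0 separately. */
	return (plen == 0 ? 0 : (uint32_t)(UINT32_MAX << (32 - plen)));
}

/* True if `child` lies within (or equals) the range owned by `parent`. */
static int
img_is_subset(struct img_id child, struct img_id parent)
{
	uint32_t mask = plen_to_mask(parent.plen);

	return (child.plen >= parent.plen &&
	    (child.addr & mask) == (parent.addr & mask));
}
```

In this reading a 32-bit identifier caps nesting at 32 levels, since every child must use a strictly longer prefix than its parent to be distinct from it.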

3. Notes from the blackboard

RobertWatson took a photo of the notes on the blackboard. You can find it here.

The ASCII version derived from it was done by BjoernZeeb:

            P1                                 +-------+      +---------+
          image                                |p_ucred|----->|cr_uid   |
      +------------+                           +-------+  ,___|cr_gid   |
      |p_net       |<==============>+----+               /    |cr_prison|<--+
      |p_fs   [ | ]|     P2      ,=>|vnet|               |    +---------+   |
      |p_user [ | ]|   +-----+  //  +----+      P3       v                  |
P0 -> |p_jail [ | ]|-> |p_net|<='            +-----+    +----+  +--------+  |
      |p_...       |   |..   |-------------->|p_net|<==>|vnet|--|PCB list|  |
      |p_children  |   +-----+               |..   |    +----+  +--------+  |
      +------------+\                        +-----+               |        |
                 |   \   +-----+   +----+                        +---+      |
                 |    `->|p_net|<=>|vnet|                        |PCB|------+
                 |       |..   |   +----+                        +---+
                 |       +-----+
                 |
                 \       +-----+
                  `----->|p_net|
                         |..   |
                         +-----+

Image/NotesEuroBSDConDevsummit2006 (last edited 2008-06-17 21:37:39 by localhost)