Linux packages for pkg(8)

Project description

The idea is to make pkg(8) able to install foreign packages (i.e. .deb/.rpm) from foreign repositories, for use with linux(4). This is done by first extending libpkg so that it can deal with foreign package sources, and then feeding the SAT solver of pkg(8) to resolve the dependencies and fetch/install whatever is necessary. I will start this project with Debian packages. If time allows, I will start to implement .rpm support as well. The biggest challenge is to use the SAT solver of pkg(8) for foreign dependencies.

The recommended way to verify packages (used by most major Linux distros, including Debian) is gpgv. I think verifying Debian packages with OpenSSL can easily become very cumbersome and error-prone. The simplest approach seems to be calling gpgv through an exec function. Unfortunately, this means depending on GPLv3 code. In my eyes, a reasonable compromise is to ask the user to install gpgv manually if it is not already installed.

Deliverables

a forked git repository including:

You can find the repo here: https://github.com/manufactory/pkg

Project Status

In a nutshell: I did not complete everything as planned. See Official Final Status for details. However, I will keep hacking and send pull requests on GitHub.

Here is the list of suggestions that I will work on. This list will be updated:

Get the code compatible with the latest trunk of pkg: in progress

Apply suggestions of bapt@ (mainly just exchanging function calls): in progress

Consistently use pkgdb_* instead of sqlite: in progress

Make pkg_parse_manifest() repo specific: to be done

Add the package type to pkg query: to be done

Make pkgdb_query() repo specific: to be done

Make pkg audit repo specific: to be done

Make pkgdb_* functions more generic, to avoid code duplication: to be done

Make functions in pkgdb_iterator.c less restrictive for non-binary packages: to be done

Make pkg_is_valid() repo specific: to be done

Write a pkg_is_valid(): to be done

Make pkg_add() repo specific: to be done

Integrate my pkg_add(): to be done

Write a pkg_search(): to be done

Changes in the architecture in detail:

First of all, when typing 'pkg something', pkg has to know somehow that
it's not working with a 'normal' repo. For example, files should be
installed to '/compat/linux'. This could be done by simply providing the
repository configuration file when calling pkg.

libpkg/pkg_add.c
--------------------------
pkg_add_from_remote()
pkg_add_upgrade()
minimal changes, just call the right pkg_add()

add.c:
-------------
exec_add():
--
.) needs to open the right database according to the repo provided on
the command line.
.) Needs to call the right pkg_add function. I suggest that a pkg_add()
function is added to struct pkg_repo_op.
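
The suggestion above could be sketched roughly as follows. This is only an
illustration: struct repo_ops_sketch, its members, the stub backends and
exec_add_sketch() are hypothetical names, not the existing libpkg
definitions, and the stubs do no real work.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, simplified stand-in for the repo operations table.
 * Only the members needed for this illustration are shown.
 */
struct repo_ops_sketch {
	const char *type;	/* repo type name, e.g. "binary" */
	const char *root;	/* where files of this repo type get installed */
	int (*add)(const char *pkg_path, const char *root);
};

/* Stub backends: a real one would extract and register the package. */
static int
binary_add(const char *pkg_path, const char *root)
{
	(void)pkg_path;
	(void)root;
	return (0);
}

static int
linux_deb_add(const char *pkg_path, const char *root)
{
	(void)pkg_path;
	(void)root;
	return (0);
}

static const struct repo_ops_sketch repo_ops_table[] = {
	{ "binary", "/", binary_add },
	{ "linux_deb", "/compat/linux", linux_deb_add },
};

/* Find the operations for a repo type; NULL if the type is unknown. */
const struct repo_ops_sketch *
repo_ops_lookup(const char *type)
{
	size_t i;

	assert(type != NULL);
	for (i = 0; i < sizeof(repo_ops_table) / sizeof(repo_ops_table[0]); i++) {
		if (strcmp(repo_ops_table[i].type, type) == 0)
			return (&repo_ops_table[i]);
	}
	return (NULL);
}

/* What exec_add() could do: dispatch to the backend of the given repo. */
int
exec_add_sketch(const char *repo_type, const char *pkg_path)
{
	const struct repo_ops_sketch *ops = repo_ops_lookup(repo_type);

	if (ops == NULL)
		return (-1);
	return (ops->add(pkg_path, ops->root));
}
```

With such a table, the command line code never needs to know which package
format it is dealing with; it only passes the repo type on.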


pkg_manifest.c
--------------------------
The manifest_keys are not repo specific. I'd move
pkg_parse_manifest() to the repo-specific directory.

Functions that need to be adapted (that is all non-static functions):
pkg_emit_filelist()
pkg_string()
pkgdb_register_pkg()
pkg_parse_manifest_fileat()
pkg_parse_manifest_file()
pkg_emit_object()
pkg_emit_manifest_file()
pkg_emit_manifest_sbuf()
pkg_emit_manifest()

src/query.c
--------------------------
print_query(): I'd love to see the pkg-type here. It can be derived
directly from the repo-type.

pkg_query.c:
--------------------------
I think pkgdb_query should belong to struct pkg_repo_op. All occurrences
need to be adapted accordingly. I could list them all here, but I think
it would just look like I'm trying to make this mail look longer by
pasting the output of a grep -r here.

src/rquery.c and rquery.c :
--------------------------
Should work once query works nicely. Same adaptations as for query.c.

src/annotate.c and annotate.c:
--------------------------
Just need to call the right pkg_query.
Probably it makes sense here to exclude non-binary repos, unless one is
given explicitly. I cannot think of a sensible use case where an
annotation is set for binary and .deb packages.

libpkg/pkg_audit.c
--------------------------
struct pkg_audit needs an entry for the audit type (e.g. DEB_AUDIT).

pkg_audit_process() then needs to call the right function.

Probably struct pkg_repo_op is a fine place for a specific parsing function.

audit.c:
--------------------------
Adapt warnings to allow other filenames than vuln.xml
Should work once pkg_audit.c is adapted.

fetch.c:
--------------------------
Should work without any change, once repo_ops.get_cached_name() and
repo_ops.fetch_pkg() work.


libpkg/pkg_jobs.c:
--------------------------
jobs_solve_install_upgrade() should work nicely, but depending on the
repo type not all flags may be available. Since not all packages have
e.g. shared libs anyway, this should work without change.

pkg_delete.c:
-------------
pkg_start_stop_rc_scripts()
needs to be moved to repo_ops, if this is desired. It may be 'not so
easy' to start e.g. Debian rc-scripts or, even worse, systemd-stuff.

libpkg/pkg_version.c:
--------------------------
pkg_version_cmp() imo needs to be moved to repo_ops
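
For illustration, a repo-specific comparison for Debian could look roughly
like the sketch below. It follows the ordering rules described in the Debian
Policy Manual (epoch, then upstream version, then revision, with '~' sorting
before everything and letters before other characters). The function names
are hypothetical, input validation is omitted, and this is not the mocked
code from the repository.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Character at p, or 0 once the end of the fragment is reached. */
static int
peek(const char *p, const char *end)
{
	return (p < end ? (unsigned char)*p : 0);
}

/* Sort weight inside a non-digit run: '~' first, then end of run
 * (digit or end of fragment), then letters, then everything else. */
static int
deb_char_order(int c)
{
	if (c == '~')
		return (-1);
	if (c == 0 || isdigit(c))
		return (0);
	if (isalpha(c))
		return (c);
	return (c + 256);
}

/* Compare two fragments (upstream version or revision): -1, 0 or 1. */
static int
deb_fragment_cmp(const char *a, const char *aend, const char *b,
    const char *bend)
{
	while (a < aend || b < bend) {
		int first_diff = 0;

		/* Non-digit run: compare character weights pairwise. */
		while ((a < aend && !isdigit(peek(a, aend))) ||
		    (b < bend && !isdigit(peek(b, bend)))) {
			int ao = deb_char_order(peek(a, aend));
			int bo = deb_char_order(peek(b, bend));

			if (ao != bo)
				return (ao < bo ? -1 : 1);
			a++;
			b++;
		}
		/* Digit run: compare numerically, ignoring leading zeros. */
		while (peek(a, aend) == '0')
			a++;
		while (peek(b, bend) == '0')
			b++;
		while (isdigit(peek(a, aend)) && isdigit(peek(b, bend))) {
			if (first_diff == 0)
				first_diff = peek(a, aend) - peek(b, bend);
			a++;
			b++;
		}
		if (isdigit(peek(a, aend)))
			return (1);	/* a has more digits */
		if (isdigit(peek(b, bend)))
			return (-1);	/* b has more digits */
		if (first_diff != 0)
			return (first_diff < 0 ? -1 : 1);
	}
	return (0);
}

/* Compare "[epoch:]upstream[-revision]" strings: -1, 0 or 1. */
int
deb_version_cmp(const char *a, const char *b)
{
	const char *aend, *bend, *adash, *bdash, *colon;
	unsigned long aepoch = 0, bepoch = 0;
	int r;

	assert(a != NULL && b != NULL);
	aend = a + strlen(a);
	bend = b + strlen(b);
	if ((colon = strchr(a, ':')) != NULL) {
		aepoch = strtoul(a, NULL, 10);
		a = colon + 1;
	}
	if ((colon = strchr(b, ':')) != NULL) {
		bepoch = strtoul(b, NULL, 10);
		b = colon + 1;
	}
	if (aepoch != bepoch)
		return (aepoch < bepoch ? -1 : 1);

	/* The revision starts after the last hyphen, if there is one. */
	if ((adash = strrchr(a, '-')) == NULL)
		adash = aend;
	if ((bdash = strrchr(b, '-')) == NULL)
		bdash = bend;
	if ((r = deb_fragment_cmp(a, adash, b, bdash)) != 0)
		return (r);
	return (deb_fragment_cmp(adash < aend ? adash + 1 : aend, aend,
	    bdash < bend ? bdash + 1 : bend, bend));
}
```

A function with this shape could be plugged into repo_ops as the version
comparison callback of the Debian backend, while the binary backend keeps
using the existing pkg_version_cmp().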

pkgdb.c:
-------------
The sql-statements are database specific and imo it would make sense to
move them to a header file in repo/_sometype_/

The same goes for:
pkgdb_init()
pkgdb_register_pkg()
pkgdb_unregister_pkg()
pkgdb_vset()

pkgdb_open():
 needs a small adaptation to open the given repository.
 Alternatively, a separate function can be created that opens the
database specified for the given repo. This is necessary for e.g.
pkg_add() to work.

pkgdb_obtain_lock():
can be reused. Of course all 'special'-databases have to provide a
lock-table then.

other functions (including pkgdb_release_lock(), pkgdb_close())
can safely be reused.

pkgdb_iterator.c
--------------------------
All functions in load_on_flag[] must not throw an error if something
that is not mandatory does not exist. E.g. the Debian manifest
provides no licenses.

For Debian the following values do not exist, or make little sense:

pkgdb_load_scripts (can be changed easily)
pkgdb_load_options
pkgdb_load_category (in the Debian manifest too, but has nothing to do
with FreeBSD categories )
pkgdb_load_license
pkgdb_load_user
pkgdb_load_group
pkgdb_load_shlib_required
pkgdb_load_shlib_provided
pkgdb_load_provides
pkgdb_load_requires
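
One way to make the loaders tolerant is to mark each entry as mandatory or
optional and to treat "not there" differently from a real failure. The
sketch below only illustrates that idea; the return codes, flags, struct and
stub loaders are hypothetical and much simpler than the real
load_on_flag[] table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes and load flags, for illustration only. */
enum { SK_OK = 0, SK_FATAL = 1, SK_MISSING = 2 };
#define SK_LOAD_DEPS		(1U << 0)
#define SK_LOAD_LICENSES	(1U << 1)

struct loader_sketch {
	unsigned flag;			/* which data this loader provides */
	int (*load)(void *pkg);		/* returns one of the SK_* codes */
	int mandatory;			/* non-zero: missing data is an error */
};

/* Stub loaders: dependencies exist, licenses do not (as for Debian). */
static int load_deps_stub(void *pkg) { (void)pkg; return (SK_OK); }
static int load_licenses_stub(void *pkg) { (void)pkg; return (SK_MISSING); }

static const struct loader_sketch deb_loaders[] = {
	{ SK_LOAD_DEPS, load_deps_stub, 1 },
	{ SK_LOAD_LICENSES, load_licenses_stub, 0 },
};

/* Load everything in 'wanted' that is not yet recorded in '*loaded'. */
int
ensure_loaded_sketch(const struct loader_sketch *tab, size_t n, void *pkg,
    unsigned wanted, unsigned *loaded)
{
	size_t i;
	int ret;

	assert(tab != NULL && loaded != NULL);
	for (i = 0; i < n; i++) {
		if ((wanted & tab[i].flag) == 0 || (*loaded & tab[i].flag) != 0)
			continue;
		ret = tab[i].load(pkg);
		if (ret == SK_OK)
			*loaded |= tab[i].flag;
		else if (ret == SK_MISSING && !tab[i].mandatory)
			continue;	/* optional data is absent: not an error */
		else
			return (SK_FATAL);
	}
	return (SK_OK);
}
```

Each repo type would then provide its own table and decide which of the
entries are mandatory for its package format.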

libpkg/pkg.c
--------------------------
pkg_is_valid() needs to be generic

src/install.c:
--------------------------
pkg_flags      f = PKG_FLAG_NONE | PKG_FLAG_PKG_VERSION_TEST;

There is no sensible package "pkg" in non-binary repos. The test for
pkg is only needed if it is a binary repo.
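
A minimal sketch of that check, assuming the repo type is available as a
string. The flag values and the function name are made up for this example;
the real pkg_flags are defined by libpkg.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag values; the real pkg_flags are defined by libpkg. */
enum {
	SK_FLAG_NONE = 0,
	SK_FLAG_PKG_VERSION_TEST = 1 << 0
};

/* Only binary repos ship a package called "pkg", so only they need the test. */
unsigned
install_flags_sketch(const char *repo_type)
{
	unsigned f = SK_FLAG_NONE;

	assert(repo_type != NULL);
	if (strcmp(repo_type, "binary") == 0)
		f |= SK_FLAG_PKG_VERSION_TEST;
	return (f);
}
```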


How to avoid code copy pasting in my code:
----------------------------------------------------
There are many functions that can be used generically for different
types, but that do not necessarily fit all future repo types.

Thus I suggest providing generic functions, which can be called by the
repo-specific functions. This applies especially to the database
functions. Maybe it is sensible to make them generic, and give them just
pointers to sql_prstmt plus an index.

These are:
.init() - with pointer to sql_stmts
.access()
.open()
.close()
.stat()
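
The "pointer to the statements plus an index" idea could look roughly like
this. The sketch is deliberately independent of SQLite: the statement
struct, the callback type, the example schema and all function names are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced form of a prepared-statement entry. */
struct sql_prstmt_sketch {
	const char *sql;	/* statement text, owned by the repo type */
};

/* Callback that executes one statement; stands in for the SQLite layer. */
typedef int (*sql_exec_fn)(const char *sql, void *ctx);

/* Generic helper: run statement number idx from a repo-provided table. */
int
generic_run_prstmt(const struct sql_prstmt_sketch *stmts, size_t nstmts,
    size_t idx, sql_exec_fn exec, void *ctx)
{
	assert(stmts != NULL && exec != NULL);
	if (idx >= nstmts)
		return (-1);
	return (exec(stmts[idx].sql, ctx));
}

/* Generic init(): run every schema statement of the given repo type. */
int
generic_init(const struct sql_prstmt_sketch *stmts, size_t nstmts,
    sql_exec_fn exec, void *ctx)
{
	size_t i;

	for (i = 0; i < nstmts; i++) {
		if (generic_run_prstmt(stmts, nstmts, i, exec, ctx) != 0)
			return (-1);
	}
	return (0);
}

/* Repo-specific part: the Debian backend only supplies its statements. */
static const struct sql_prstmt_sketch deb_schema[] = {
	{ "CREATE TABLE packages (name TEXT, version TEXT);" },
	{ "CREATE TABLE deps (origin TEXT, name TEXT);" },
};

/* Example callback that just counts the executed statements. */
static int
count_exec(const char *sql, void *ctx)
{
	(void)sql;
	(*(int *)ctx)++;
	return (0);
}
```

A repo-specific .init() would then shrink to a single call such as
generic_init(deb_schema, 2, count_exec, &counter), with the callback
replaced by the real database layer.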

.mirror_pkg() only makes sense if we want to mirror foreign repos; there
are better tools for that imo.

--------------------------------------------------------------------
--------------------------------------------------------------------
TODO:
-----------------
pkg_search()

Small stuff:
-----------------
add.c:
pkgdb instead of sqlite

usual small stuff mentioned before:
-----------------
kick out mktemp()
NELEM instead of STRLEN

autotools magic todo:
-----------------
get repo/linux_deb/utils to repo

I nuked code for a generic way to put a pkg into the database. I simply
cannot know what foreign manifests provide, and prepared statements are
already there and repo-dependent.

--------------------------------------------------------------------
brought in by bapt@ but not yet done:
--------------------------------------------------------------------
pkg_repo_util_check_gpg()
 use posix_spawn()
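
A minimal sketch of what the posix_spawn() variant might look like, assuming
gpgv is found via PATH and is called with a keyring, a detached signature
and the signed file. spawn_and_wait() and check_gpg_sketch() are
hypothetical names, not the functions in the repository.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <errno.h>
#include <spawn.h>
#include <stddef.h>

extern char **environ;

/* Run argv[0] (looked up in PATH); return its exit status, or -1. */
int
spawn_and_wait(char *const argv[])
{
	pid_t pid;
	int status;

	assert(argv != NULL && argv[0] != NULL);
	if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
		return (-1);
	while (waitpid(pid, &status, 0) == -1) {
		if (errno != EINTR)
			return (-1);
	}
	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}

/* Return 0 if gpgv accepts the detached signature, -1 otherwise. */
int
check_gpg_sketch(const char *keyring, const char *sig, const char *data)
{
	char *const argv[] = {
		"gpgv", "--keyring", (char *)keyring,
		(char *)sig, (char *)data, NULL
	};

	return (spawn_and_wait(argv) == 0 ? 0 : -1);
}
```

If gpgv is not installed, the call simply fails, which fits the idea of
asking the user to install it manually.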

pkg_repo_linux_deb_fetch_check_extract_packages()
 should say: only amd64 and i686 supported

several times:
get rid of fgets() in favour of getline()
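
As an illustration of the getline() style, the sketch below looks up one
field in a Debian-style "Key: value" stanza without any fixed line buffer.
control_field() is a hypothetical helper; it matches field names
case-sensitively and ignores continuation lines, which is a simplification.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copy the value of the first "key: value" line into out; 0 if found. */
int
control_field(FILE *fp, const char *key, char *out, size_t outlen)
{
	char *line = NULL;	/* buffer is allocated and grown by getline() */
	size_t cap = 0, klen;
	ssize_t len;
	int found = -1;

	assert(fp != NULL && key != NULL && out != NULL && outlen > 0);
	klen = strlen(key);
	while (found != 0 && (len = getline(&line, &cap, fp)) != -1) {
		if (len > 0 && line[len - 1] == '\n')
			line[--len] = '\0';
		if ((size_t)len > klen && strncmp(line, key, klen) == 0 &&
		    line[klen] == ':') {
			const char *val = line + klen + 1;

			while (*val == ' ' || *val == '\t')
				val++;
			snprintf(out, outlen, "%s", val);
			found = 0;
		}
	}
	free(line);
	return (found);
}
```

Unlike fgets(), there is no risk of splitting an overly long line (for
example a long Depends: field) into two reads.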

I hope I did not forget something important. I expect that there is
stuff missing in this mail, which will show up during testing.

Official Final Status

Here is my final status mail:

Fully integrating my code into pkg(8) was unfortunately not entirely
possible within the scope of this year's Summer of Code. The
architecture to install packages is very specific and tailored for
binary packages (i.e. common FreeBSD packages).

Changing huge parts of an existing software architecture requires lots
of discussion. That was not possible in time. I think it would have made
little sense to spend much time getting it working in a hackish way that
is ugly and thus useless in the end. For this reason I mocked many
functions in agreement with bapt@ and created a very detailed list of
suggested changes to abstract functions so that they are repo specific.
This list was mailed to bapt@ for evaluation.

Of course, it is important to me to fully integrate my code into pkg(8)
and make it a part of FreeBSD. I will work through my list of
suggestions stepwise and open pull requests on GitHub. This way
discussion will take place regardless of the availability of upstream
developers.
I hope in the end this will highly simplify the development for other
package types, such as RPMs (and conditionally even ruby-gems, etc.)

There is fully working and fully integrated code for updating
repositories, verifying the manifests using gpg and for
creating/deleting databases. There is mocked code for comparing package
versions according to Debian's specification and to analyse, register
and extract Debian-packages. This includes dependency and conflict
parsing. This is basically all that is needed to use an official Debian
repository.

I spent a lot of time learning the ropes, and I want to use the
knowledge I gained to continue contributing to pkg(8). If possible I'll
attend EuroBSDcon in Stockholm and present what I have so far.

Milestones

May, 25 (official start of coding)

Getting the code flexible enough to allow several repository types. Fix possible bugs. It's not fun, but it makes life a lot easier: write tests to avoid regressions. Four weeks seem long, but being too fast is better than being too slow.

June, 26 (begin of mid-term evaluations)

Writing a (basic) backend for Debian packages. Start by defining test cases.

July, 10

Resolving dependencies using the pkg-SAT-solver

July, 24

Extra week to extend and improve the .deb backend, and deal with possible problems

July, 31

pkg-audit-interface

August, 7

extra time

August, 17

Scrubbing the code, writing documentation and additional tests.

August, 21

extra time until firm end

August, 28 (official end)

Test Plan

The Code

Debian repositories:

Debian packages:

SummerOfCode2015/LinuxPackagesForpkg (last edited 2022-10-03T02:29:19+0000 by KubilayKocak)