Non-BSM to BSM Conversion Tools





I've started to develop a plugin for Linux audispd:


It's been a few months since the end of the GSoC. Recently, on 02.12.2016 I spoke with KonradWitaszczyk about the future of this project. We've exchanged thoughts and ideas and here's a short wrap-up:


Notes & Questions

Project Description

Let's imagine a FreeBSD server which collects audit records from machines that are not necessarily using BSM as the format of their audit records. The idea is to create a tool which would be able to read audit records in a non-BSM format and output those audit records in the BSM format. The aim is to lose as little information as possible, although some loss is very likely unavoidable due to the differences between the standards.

The deliverables would ease the handling of different audit log files collected from your servers and let you examine them using the default FreeBSD administration tools, which support the BSM format.

Approach to Solving the Problem

I will focus on the Linux Audit and Windows formats mainly.

Linux Audit

When it comes to Linux Audit, I will study the full list of the data fields and map the Linux audit fields to the BSM fields. Subsequently, I will be able to design a suitable parser from Linux Audit to the BSM format.

Windows Audit Format

Windows to BSM conversion is in demand according to this. The documentation of Windows fields can be found here and here.

Microsoft offers a 180-day long trial period for Windows Server 2012 Essentials (see here (link)).







Start of Coding



May 23

Start of coding.


Linux Audit to BSM conversion



May 23 - May 29

Learn the details of Linux Audit and BSM.



May 30 - June 5


June 6 - June 12


June 13 - June 19

Design the structure and the interface of the library.



June 20 - June 26

Mid-term Evaluations



June 27

Mid-term evaluation.


Implement the Conversion from Linux Audit to BSM / Shell Conversion Tool



June 27 - July 3

Implement the Linux Audit parser.



July 4 - July 10

Implement the Linux Audit conversion.



July 11 - July 17

Improve the API of the conversion of the Linux Audit format.



July 18 - July 24

Improve the conversion/mapping.



July 25 - July 31


August 1 - August 7


August 8 - August 14

Convert Linux syscalls.


Convert Linux execs.


Bring au_to_attr(5) to the userland.


Extend auditdistd(8) with the Ability to Receive Linux Audit Logs



August 15 - August 21

Configure CentOS to send logs to FreeBSD.


End of Coding



August 22

Soft deadline.



August 23

Hard deadline: submit the code by 19:00 UTC.



August 30 

Successful student projects announced.


Deferred & Uncompleted Milestones

Weekly Reports

  1. Week 1 / Non-BSM to BSM Conversion Tools

  2. Week 2 / Non-BSM to BSM Conversion Tools / Problems with mapping and NFS

  3. Week 3 / Non-BSM to BSM Conversion Tools

  4. Week 4 / Non-BSM to BSM Conversion Tools

  5. Week 5 / Non-BSM to BSM Conversion Tools

  6. Week 6 / Non-BSM to BSM Conversion Tools

  7. Week 7 / Non-BSM to BSM Conversion Tools

  8. Week 8 / Non-BSM to BSM Conversion Tools

  9. Week 9 / Non-BSM to BSM Conversion Tools

  10. Week 10 / Non-BSM to BSM Conversion Tools

  11. Week 11 / Non-BSM to BSM Conversion Tools

  12. Week 12 / Non-BSM to BSM Conversion Tools


This is the summary for the final evaluation.

The code is available in this PR:

The directory with the library and the shell tool is in contrib/openbsm/bin/bsmconv.

I kept some of my notes on the Wiki on GitHub:


The interface of the library is available in linau.h.

A significant part of the library is now completed. The library is capable of parsing Linux Audit records and converting them to the BSM format. The conversion is not perfect but it handles the most common types of Linux Audit records and fields.
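To give a feel for what parsing a record involves, here is a minimal sketch that splits a record into its header and its key=value fields. It is not the library's parser: demo_header, demo_parse_header() and demo_next_field() are hypothetical names, the header is assumed to have the usual audit(seconds.milliseconds:serial) form, and nested payloads such as msg='...' are not handled.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the record header; not the library's type. */
struct demo_header {
	char type[32];	/* record type, e.g. "NEWTYPE" */
	long sec;	/* seconds since the epoch */
	int msec;	/* milliseconds */
	long serial;	/* event serial number */
};

/*
 * Parse "type=T msg=audit(SEC.MSEC:SERIAL): " at the start of rec.
 * Returns the offset of the first field after the header, or -1 if
 * the header does not match.
 */
static int
demo_parse_header(const char *rec, struct demo_header *hdr)
{
	int consumed = 0;

	assert(rec != NULL && hdr != NULL);
	if (sscanf(rec, "type=%31s msg=audit(%ld.%d:%ld): %n", hdr->type,
	    &hdr->sec, &hdr->msec, &hdr->serial, &consumed) != 4 ||
	    consumed == 0)
		return (-1);
	return (consumed);
}

/*
 * Copy the next "key=value" pair from s into key and val.
 * Returns the number of characters consumed, or 0 if no pair is left.
 */
static int
demo_next_field(const char *s, char key[32], char val[64])
{
	int consumed = 0;

	if (sscanf(s, " %31[^= ]=%63s%n", key, val, &consumed) != 2)
		return (0);
	return (consumed);
}
```

A caller would run demo_parse_header() once and then call demo_next_field() in a loop, advancing by the returned count until it returns 0.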

The library converts Linux Audit logs only; I did not have enough time to cover both the Linux and Windows standards.

Perhaps the most interesting part of the library is the one responsible for the conversion. Here is a quick overview of the conversion framework I created.

Improve Conversion

Say that you want to improve the conversion. For example, suppose there is a new record type NEWTYPE and a typical record of this type looks like this:

type=NEWTYPE msg=audit(1464612294.816:1234): pid=400 newfield="text"

This is a list of steps required to introduce this record type to the library:

  1. Add LINAU_TYPE_NEWTYPE and LINAU_TYPE_NEWTYPE_STR to linau_conv_impl.h.

  2. Update linau_conv_get_type_number() in linau_conv.c.

  3. Update linau_conv_to_au() in linau_conv.c.

  4. Add an lcrectype structure:

       const static struct linau_conv_record_type lcrectype_newtype = {
               LINAU_TYPE_NEWTYPE,
               LINAU_TYPE_NEWTYPE_STR,
               {
                       &lctoken_process32,
                       &lctoken_newtype,
                       NULL
               }
       };

    to linau_conv.c.

    &lctoken_process32 is added because of the pid field in this record. Because the Linux Audit framework is not yet standardized, I had to decide how to convert records. The policy is to create BSM tokens even if there is only one field we can put into the token (pid is that lonely field in the example).

    For the sake of this tutorial I will introduce a new BSM token which stores information from newfield.

  5. Create all the missing lctokens from our example. Add

       const static struct linau_conv_token lctoken_newtype = {
               write_token_newtype,
               {
                       &lcfield_newfield,
                       NULL
               }
       };

    to linau_conv.c.


  7. Add an lcfield:

       const static struct linau_conv_field lcfield_newfield = {
               .lcf_validate = linau_conv_is_encoded
       };

    to linau_conv.c.

    • linau_conv_is_encoded() is used here because we assume that the value of the newfield field is always stored inside a pair of quotation marks ("...").

    • .lcf_validate = is used because the name of the field is predefined (its name is obviously newfield); we would use .lcf_match = if the names of a group of fields were defined by a regex (see lcfield_a_execve_syscall; more details are available in this thread on the Linux Audit mailing list).

  8. Add a function which writes the token to the audit record descriptor (aurd). The function should be static void and should not attempt to write a token to the descriptor if there are no valid fields to create the token from. This means that such functions have to check both that at least one field required by the relevant function from au_token(3) exists and that the value of the field is reasonable (most of the time, this simply means that lcf_validate returns 1 for it).

  9. That's it. NEWTYPE has been introduced to the system together with the newfield field.
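The validation idea from step 7 and the "write only if there is something valid to write" rule from step 8 can be sketched as follows. This is an illustration under stated assumptions, not the code from linau_conv.c: demo_kv, demo_is_encoded() and demo_write_token_newtype() are hypothetical names, and the "descriptor" is just a character buffer so that the example stays self-contained. The real function would presumably build the token with a call from au_token(3) instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a parsed field; not the library's type. */
struct demo_kv {
	const char *name;
	const char *value;
};

/* Return 1 if the value is wrapped in double quotes, 0 otherwise. */
static int
demo_is_encoded(const char *value)
{
	size_t len;

	assert(value != NULL);
	len = strlen(value);
	return (len >= 2 && value[0] == '"' && value[len - 1] == '"');
}

/*
 * "Write" a token built from newfield into out.  Nothing is written
 * (out stays empty) if the field is missing or fails validation.
 */
static void
demo_write_token_newtype(char *out, size_t outlen,
    const struct demo_kv *fields, size_t nfields)
{
	size_t i;

	assert(out != NULL && outlen > 0);
	out[0] = '\0';
	for (i = 0; i < nfields; i++) {
		if (strcmp(fields[i].name, "newfield") != 0)
			continue;
		/* Skip the token if the value fails validation. */
		if (!demo_is_encoded(fields[i].value))
			return;
		/* Drop the surrounding quotes and emit the token. */
		snprintf(out, outlen, "text,%.*s",
		    (int)(strlen(fields[i].value) - 2), fields[i].value + 1);
		return;
	}
}
```

The early returns are the important part: a record without newfield, or with a malformed value, leaves the output untouched, so the unused field can later fall through to a plain text token.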

Architecture of the Conversion Framework

The whole conversion is based on three structures: linau_conv_field, linau_conv_token and linau_conv_record_type. lcrectypes know the lctokens that can possibly be generated from the fields they have, and lctokens know the lcfields they might require to generate and write a BSM token.

The flow of the conversion procedure looks like this in general (a parsed Linux Audit record is an input here):

  1. Get the lcrectype related to the type of the Linux Audit record.

  2. Iterate over every lctoken of the lcrectype and try to generate a BSM token with the rules defined in the lctoken.

  3. Find out which fields were not included in any BSM token and write them as BSM text tokens.
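The three steps above can be sketched with simplified stand-ins for the three structures. Everything prefixed with demo_ is hypothetical, the "BSM tokens" are just labels appended to a string, and the policy for unknown record types (every field becomes a text token) is an assumption of this sketch rather than documented library behaviour.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX 4

/* Simplified, hypothetical stand-ins for the three library structures. */
struct demo_lcfield { const char *name; };
struct demo_lctoken {
	const char *label;
	const struct demo_lcfield *fields[DEMO_MAX];	/* NULL-terminated */
};
struct demo_lcrectype {
	const char *type;
	const struct demo_lctoken *tokens[DEMO_MAX];	/* NULL-terminated */
};
/* One parsed field of the input record. */
struct demo_field { const char *name; const char *value; int used; };

static const struct demo_lcfield f_pid = { "pid" };
static const struct demo_lcfield f_newfield = { "newfield" };
static const struct demo_lctoken t_process32 = { "process32", { &f_pid, NULL } };
static const struct demo_lctoken t_newtype = { "newtype", { &f_newfield, NULL } };
static const struct demo_lcrectype rectypes[] = {
	{ "NEWTYPE", { &t_process32, &t_newtype, NULL } }
};

/* Append s to out without overflowing it. */
static void
demo_append(char *out, size_t outlen, const char *s)
{
	size_t len = strlen(out);

	if (len + 1 < outlen)
		snprintf(out + len, outlen - len, "%s", s);
}

/* Step 1: find the record type description by its name. */
static const struct demo_lcrectype *
demo_find_rectype(const char *type)
{
	size_t i;

	for (i = 0; i < sizeof(rectypes) / sizeof(rectypes[0]); i++)
		if (strcmp(rectypes[i].type, type) == 0)
			return (&rectypes[i]);
	return (NULL);
}

/* Convert a parsed record; returns the number of tokens emitted. */
static int
demo_convert(const char *type, struct demo_field *rec, size_t n,
    char *out, size_t outlen)
{
	const struct demo_lcrectype *rt = demo_find_rectype(type);
	size_t i, k, t;
	int ntokens = 0;

	assert(out != NULL && outlen > 0);
	out[0] = '\0';
	/* Step 2: try every token the record type knows about. */
	for (t = 0; rt != NULL && rt->tokens[t] != NULL; t++) {
		int hit = 0;

		for (k = 0; rt->tokens[t]->fields[k] != NULL; k++)
			for (i = 0; i < n; i++)
				if (strcmp(rec[i].name,
				    rt->tokens[t]->fields[k]->name) == 0)
					rec[i].used = hit = 1;
		if (hit) {
			demo_append(out, outlen, rt->tokens[t]->label);
			demo_append(out, outlen, ";");
			ntokens++;
		}
	}
	/* Step 3: leftover fields become text tokens. */
	for (i = 0; i < n; i++) {
		if (rec[i].used)
			continue;
		demo_append(out, outlen, "text(");
		demo_append(out, outlen, rec[i].name);
		demo_append(out, outlen, "=");
		demo_append(out, outlen, rec[i].value);
		demo_append(out, outlen, ");");
		ntokens++;
	}
	return (ntokens);
}
```

Keeping a used flag per field is what makes step 3 cheap: after the token pass, anything still unmarked is known not to be represented in any BSM token.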

Todo List

There are quite a few TODO, STYLE and XXX tags scattered around the source files. Apart from that there is a TODO file in the project's directory with a list of tasks.

This is a list of the most vital parts of the library that are missing:

  1. The fields are not well validated at the moment. The linau_conv_is_* functions are mostly not implemented yet.

  2. The function au_event_type_from_linau_event is not implemented.

    As system calls in FreeBSD and Linux differ significantly we should not use the FreeBSD system call numbers from /etc/security/audit_event as mapping values for Linux Audit events. Instead, we should add new identifiers.

    Another idea is to ignore the /etc/security/audit_event file entirely and just map every Linux Audit event to 0 (AUE_NULL). The event's type would be passed as an extra text token instead. This approach is less aggressive towards FreeBSD.

    Actually, /usr/include/bsm/audit_kevents.h and /usr/include/bsm/audit_uevents.h do contain some mappings specifically for Linux. It looks like it is just a matter of mapping the Linux Audit record types to the numbers found in those files. Additionally, those files might need refreshing since there have been some changes in the Linux Audit standard.

  3. The library does not support the ENRICHED format.

  4. Currently, the conversion scheme assumes that a Linux Audit record maps to one or more BSM tokens forming one BSM record. This is not entirely accurate, however. There are Linux Audit records (like those of type PATH) which should be joined with the Linux Audit record preceding them. This could be achieved by introducing another layer of conversion: run a second pass over the already converted BSM records and merge some of them into single records.

  5. There is still a lot of information stored inside the subj and msg fields. In fact, the msg field stores the payload of the audit record. The problem is that the msg field itself contains further fields, which does not fit the current architecture of the library directly.

  6. When the logs are sent by audisp-remote from a Linux machine, every record has a 16-byte prefix. This is because audisp-remote adds a header to every log it sends over pure TCP (I don't know if it holds true when Kerberos is used). The first 4 bytes are a magic number fe0000ff. Then the version is added (which is always 0). Then again 0 for mver. Then 6 bytes for the type, 10 bytes for the length and 12 bytes for the sequence number (see audit-userspace/lib/private.h:AUDIT_RMW_PACK_HEADER). Additionally, audisp-remote(8) prefixes the actual record with fields like node indicating the machine from which the audit record came.

    The library assumes that a valid record starts with type=.
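One possible way to cope with such prefixes before handing data to the library is to skip ahead to the first type= marker. The sketch below is only an illustration of that idea and not something the library does: demo_find_record_start() is a hypothetical helper, and it would be fooled by a prefix that itself contains the bytes "type=" (for example inside a node name).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the first "type=" in buf[0..len), or NULL if
 * there is none.  The search is done on raw bytes because a binary
 * header may contain NUL bytes, which would stop strstr(3) early.
 */
static const char *
demo_find_record_start(const char *buf, size_t len)
{
	static const char marker[] = "type=";
	const size_t mlen = sizeof(marker) - 1;
	size_t i;

	assert(buf != NULL);
	for (i = 0; i + mlen <= len; i++)
		if (memcmp(buf + i, marker, mlen) == 0)
			return (buf + i);
	return (NULL);
}
```

The returned pointer can then be passed on as the start of the record, with the remaining length computed as len - (start - buf).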

Linux Audit Framework

The Linux Audit framework is a little bit hard to understand because the users of this framework do not follow the standard. Additionally, there are no documents describing the standard in depth; in fact, it is the source code and a few short documents which define the standard. This situation is going to change in the near future as a test suite is being developed to help developers unify their log messages.

If you want to understand how the Linux Audit format works you have to:

There is no easy way. Sorry.

BTW, it is worth updating the kernel

   rpm --import
   rpm -Uvh
   yum install yum-plugin-fastestmirror
   yum --enablerepo=elrepo-kernel install kernel-ml

and the audit framework

   VERSION=2.6.6
   curl -O${VERSION}.tar.gz
   gzip -d audit-${VERSION}.tar.gz
   tar xf audit-${VERSION}.tar
   cd audit-${VERSION}
   sudo yum update
   sudo yum install \
       libcap-ng-devel \
       libtool \
       openldap-devel \
       python-devel \
       swig \
       tcp_wrappers-devel
   ./configure --sbindir=/sbin --with-python=yes --with-libwrap --with-libcap-ng=yes \
       --enable-gssapi-krb5=yes
   make
   sudo make install

to the latest versions if you plan to generate logs on a Linux machine.

Further Reading & References

auditdistd Extension

The main purpose of this part of the project is to give auditdistd(8) the ability to receive audit trails from Linux Audit auditd.

The extension has neither been implemented nor designed.

Nevertheless, I had enough time to configure CentOS 7 to send logs directly to a netcat listener on FreeBSD. It is certainly not a solution, but configuring CentOS was crucial; it was not possible to design the extension of auditdistd without knowledge of the Linux Audit tools: auditd(8), audispd(8) and audisp-remote(8).

The configuration files are available here (link).

Further Reading & References

Shell Tool

A simple shell tool has been implemented. It is possible to convert Linux Audit logs to the BSM format from the command line.

It is possible to compile the tool (at the moment it is called 'bsmconv'):

   # Go to the root directory of the repository.
   cd freebsd-source-tree-root
   # Install modified libbsm.
   cd lib/libbsm
   make
   make install
   # Install the tool.
   cd ../../usr.bin/bsmconv/
   make

The usage is fairly simple. Just run the tool and stream a Linux Audit trail into it:

   ./bsmconv < audit.log

If no command line options are used, the tool prints the logs in the BSM format. It might be handy to pipe the output straight to praudit. The only command line option available is -v, which increases the verbosity level of debug messages. If it is specified, the tool does not print the output in the BSM format; instead, it prints a dump of the constructed structures.

Alternatively, you can use the script I used to automate my workflow during GSoC. This will compile the tool:

   ./fu m

Report about the Drawbacks of the BSM Format

It turned out that I spent much more time on understanding the Linux Audit format than on the BSM format. As a result, my knowledge of the BSM format is not deep enough to prepare such a report.

Nevertheless, I had a discussion with RobertWatson about bringing au_to_attr functions to the userland (see this discussion (link)). At the moment a temporary solution is present in the code, although a much better solution should be implemented (see the discussion linked above).

Additionally, I found that there are a couple of outdated man pages which need refreshing, for example au_token(3).

When it comes to the modification of libbsm, I added a function au_close_buffer_tm() to the interface of libbsm. The reason is that there was no other way to create a BSM token with an arbitrary timestamp in it.

Test Suite

I did not manage to create a test suite. Throughout the project I used my own shell script (it is called fu) to test the shell tool with the Linux Audit records I've gathered. There are both real life examples and some edge case tests written by me.

The Linux Audit logs are stored inside tests/. It is possible to run the tests like this:

   # Test one file.
   ./fu rp tests/positive/basic.input
   # Run the whole set of tests.
   ./fu tsv -- tests/positive/

Manual Pages

I did not create any manual pages during Google Summer of Code.

Further Reading & References

SummerOfCode2016/NonBSMtoBSMConversionTools (last edited 2018-06-29T13:00:10+0000 by MateuszPiotrowski)