20. Storing the full content of a file (aka: WHAT has changed?)

[Note] Consider using a revision control system

One of the most frequently requested features is the ability to determine what has changed in a file. This is not really within the scope of a file integrity checker; it is rather the task of a revision control system such as Git, Subversion (SVN), or CVS.

While samhain, as of version 2.4.4, supports storing the full content of files in the baseline database, this feature is limited to small files (smaller than 9200 bytes after zlib compression). If you think you need this feature, first evaluate whether a revision control system would fit your needs better.

As of version 2.4.4, samhain can optionally store the full literal content of regular files in the database, which makes it possible to determine what has changed in a file. This feature is only compiled in if the required zlib development environment is available on the host where samhain is compiled (e.g. on Debian Linux, the package zlib1g-dev). It is subject to the following restrictions:

  • Only small files can be stored, where 'small' means less than 9200 bytes after zlib compression and less than 92000 bytes before compression (files 10 or more times larger than the limit are assumed not to compress below it).

  • Only regular files can be stored; in particular, symlinks are not stored, since the content of a symlink inode is actually the target path (which is stored literally). It is safe to enable the feature for a directory: it is silently ignored for file types to which it does not apply.

  • The feature must be explicitly enabled in the runtime configuration file by adding '+TXT' to the monitoring policy of a file or directory.
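The size restriction above can be checked ahead of time with a short script. The following Python sketch is illustrative only and not part of samhain. The function name `fits_size_limits` is hypothetical, and the sketch assumes zlib's default compression level, which may differ from what samhain uses internally, so results close to the limit are only indicative.

```python
import zlib

# Limits quoted in the restrictions above.
MAX_COMPRESSED = 9200   # bytes after zlib compression
MAX_RAW = 92000         # bytes before compression


def fits_size_limits(data: bytes) -> bool:
    """Return True if data stays below both size limits.

    Assumption: zlib's default compression level. samhain's own
    setting may differ, so treat borderline results with caution.
    """
    if len(data) >= MAX_RAW:
        # Too large to be considered, however well it would compress.
        return False
    return len(zlib.compress(data)) < MAX_COMPRESSED
```

For example, `fits_size_limits(open(path, "rb").read())` gives a rough indication of whether a given regular file is a candidate for this feature.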

To enable the feature, redefine a policy to include '+TXT' and place the desired files under that policy (see the example configuration below).

In order to show the stored content of a file, use the following command:

        sh$ samhain --list-file path -d database_path
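Once the stored content is available, it can be compared with the current file to see what has changed. The following Python sketch is one possible approach, not a samhain feature. The helper names `list_file_command` and `diff_against_baseline` are hypothetical, and the sketch assumes that the command above writes only the stored file content to standard output, which should be verified on your installation.

```python
import difflib


def list_file_command(path: str, database_path: str) -> list:
    """Build the argument vector for the command shown above."""
    return ["samhain", "--list-file", path, "-d", database_path]


def diff_against_baseline(stored: str, current: str, path: str) -> str:
    """Return a unified diff from the stored content to the current content."""
    lines = difflib.unified_diff(
        stored.splitlines(keepends=True),
        current.splitlines(keepends=True),
        fromfile=path + " (baseline)",
        tofile=path + " (current)",
    )
    # An empty string means the two versions are identical.
    return "".join(lines)
```

The stored text could be obtained with, for example, `subprocess.run(list_file_command(path, db), capture_output=True, text=True).stdout` and then passed to `diff_against_baseline` together with the current content of the file.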

20.1. Example configuration

	  # UserN policies default to ReadOnly + ATM (access time). This
	  # makes the default (intentionally ;-) more or less useless.
	  # Redefine to ReadOnly + TXT (store file content)
	  RedefUser0 = -ATM, +TXT
	  # Files for which we want to store the full content in the
	  # baseline database.

20.2. Implementation details

File contents are zlib compressed (RFC 1950), and the compressed data are base64 encoded. To avoid internal conflicts, samhain uses the characters '(', ')' and '?' instead of the characters '+', '/', and '=' used in standard base64 encoding. For example, the following PHP code decodes the data:

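	  // Map samhain's substitute characters back to standard base64
	  $tmp1 = strtr($data, "()?", "+/=");
	  $tmp2 = base64_decode($tmp1);
	  // gzuncompress() expects zlib (RFC 1950) data
	  $tmp3 = gzuncompress($tmp2);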
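
The same decoding steps can be sketched in Python. This is a minimal illustration under the assumption that `data` holds only the encoded string as described above, without any additional framing. The function names are hypothetical, and `encode_for_test` exists only to allow a round-trip check; it is not claimed to reproduce samhain's output byte for byte.

```python
import base64
import zlib

# samhain's substitutes for '+', '/' and '=' as described above.
_TO_STANDARD = bytes.maketrans(b"()?", b"+/=")
_TO_SAMHAIN = bytes.maketrans(b"+/=", b"()?")


def decode_stored_content(data: bytes) -> bytes:
    """Undo the modified base64 encoding, then the zlib compression."""
    standard = data.translate(_TO_STANDARD)
    return zlib.decompress(base64.b64decode(standard))


def encode_for_test(content: bytes) -> bytes:
    """Inverse of decode_stored_content, used only for round-trip checks."""
    return base64.b64encode(zlib.compress(content)).translate(_TO_SAMHAIN)
```

A round trip such as `decode_stored_content(encode_for_test(b"some text\n"))` should return the original bytes.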