Mapping SinceDB Files to Logstash File Input

Sometimes you need to know which SinceDB files map to which Logstash file inputs. You might be chasing a bug in the file input plugin, or you might want to force Logstash to reparse a specific file. The contents of a SinceDB file look like this:

479 0 64515 31175

Not very intuitive, is it?

A little googling will show you that the first field in this file is an inode number. A little more searching will show you how to map from an inode number back to a file path. The rest of this post shows how to put together a little two-liner that will just print the map of all SinceDB files to the monitored files.
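To make the layout concrete, here is a minimal sketch that labels the columns of the sample line above. It assumes the four-column format, which the Logstash file input documentation describes as inode, major device number, minor device number, and byte offset; newer plugin versions may append more columns. The function name `describe_sincedb_line` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: label the columns of one four-column SinceDB line.
# Assumed layout: inode, major device number, minor device number, byte offset.
describe_sincedb_line() {
  # Split the line on whitespace into the positional parameters.
  set -- $1
  printf 'inode=%s major=%s minor=%s offset=%s\n' "$1" "$2" "$3" "$4"
}

describe_sincedb_line "479 0 64515 31175"
```

Only the first column matters for the rest of this post; the offset column is what Logstash uses to resume reading.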

The basic idea is to iterate over all SinceDB files, grab the inodes, and use debugfs to map each inode to an actual file path. Note that debugfs only works on ext2/ext3/ext4 filesystems. Because debugfs needs to know which filesystem (device) each inode lives on, we first build a list of all filesystems housing files monitored by Logstash, which we can then iterate over.

filesystems=$(grep path /etc/logstash/conf.d/*.conf | awk -F'=>' '{ print $2 }' | xargs -I {} df -P {} 2>/dev/null | grep -v Filesystem | sort | uniq | cut -d' ' -f 1)

Now we can walk over all sincedb files and call `debugfs` on their inodes. Here’s the second line expanded for readability. Note that you’ll have to update LS_HOME to match your environment (logstash user home directory where the .sincedb files live).

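LS_HOME=/var/lib/logstash
for fs in $filesystems; do
  # Glob directly instead of parsing ls output.
  for f in "$LS_HOME"/.sincedb_*; do
    echo "$f"
    # The first column of each SinceDB line is the inode number.
    inodes=$(cut -d' ' -f 1 "$f")
    for inode in $inodes; do
      # ncheck prints "Inode<TAB>Pathname"; drop the header and keep the path.
      sudo debugfs -R "ncheck $inode" "$fs" 2>/dev/null | grep -v Inode | cut -f 2
    done
    echo
  done
done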
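
debugfs only understands ext2/ext3/ext4 filesystems and typically needs root to read the block device. As a hedged alternative (not the method used in this post), the sketch below resolves an inode with `find -inum`, available in GNU and BSD find. It works on any filesystem, at the cost of walking the directory tree. The function name `inode_to_path` and the temporary demo file are illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical helper: print every path under a directory whose inode number matches.
# Usage: inode_to_path <search-root> <inode>
inode_to_path() {
  # -xdev keeps find on one filesystem, since inode numbers are only unique per filesystem.
  find "$1" -xdev -inum "$2" 2>/dev/null
}

# Demo on a throwaway file: look up its inode, then map the inode back to the path.
tmp=$(mktemp -d)
touch "$tmp/demo.log"
inode=$(ls -i "$tmp/demo.log" | awk '{ print $1 }')
inode_to_path "$tmp" "$inode"
rm -rf "$tmp"
```

For a real lookup you would pass a monitored log directory and an inode from a SinceDB file, for example `inode_to_path /var/log 479`.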

This will give you all files corresponding to a particular sincedb file. For example, here’s part of the mapping for my StackStorm host.

.sincedb_09f49cf38dc2a7ae4c6d120dfcf44a3d
/var/log/st2/st2auth.log
 
.sincedb_48a2f0085e322e3df288d1cce93fd057
/var/log/st2/st2api.log
 
.sincedb_51651a515433f9e76b1c1a36238d60cf
/var/log/st2/st2rulesengine.log
 
.sincedb_3cfd63884179d2c89e6ada571ea0b505.13688.24832.1882
 
.sincedb_3cfd63884179d2c89e6ada571ea0b505.13688.24832.274753
 
.sincedb_89eba400c341be80785965b36cbeac4c
/var/log/st2actionrunner.32638.log
/var/log/st2actionrunner.31714.log
/var/log/st2actionrunner.32647.log
/var/log/st2actionrunner.31720.log

Notice that some sincedb files track a single inode, some track multiple inodes, and some track none at all (empty files).
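
To see at a glance which SinceDB files are empty, a small sketch like the following counts the tracked inodes per file. It assumes one record per line, as in the sample at the top of the post; the function name `count_sincedb_entries` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print "<count> <file>" for every .sincedb_* file in a directory.
count_sincedb_entries() {
  for f in "$1"/.sincedb_*; do
    [ -f "$f" ] || continue   # skip the unexpanded glob when nothing matches
    # Count non-blank lines; each one is assumed to be one tracked inode record.
    printf '%s %s\n' "$(grep -c . "$f")" "$f"
  done
}

count_sincedb_entries "${LS_HOME:-/var/lib/logstash}"
```

A count of 0 marks an empty SinceDB file, which is why some entries in the map above list no paths.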

Putting this together, here’s the full two-liner to print the SinceDB to logstash file input map! Don’t forget to update LS_HOME.

filesystems=$(grep path /etc/logstash/conf.d/*.conf | awk -F'=>' '{ print $2 }' | xargs -I {} df -P {} 2>/dev/null | grep -v Filesystem | sort | uniq | cut -d' ' -f 1)
LS_HOME=/var/lib/logstash; for fs in $filesystems; do for f in "$LS_HOME"/.sincedb_*; do echo "$f"; inodes=$(cut -d' ' -f 1 "$f"); for inode in $inodes; do sudo debugfs -R "ncheck $inode" "$fs" 2>/dev/null | grep -v Inode | cut -f 2; done; echo; done; done

Now you can fix logstash and get back to “real” work. :)

Posted in Tutorials.


2 Responses


  1. jho says

    Nice input … thank you.

    I think there’s one problem with sincedb and inode-reuse on Linux.

    In my case I download big log files every day and have them parsed into Elasticsearch. After 3 days, the oldest log file is deleted. Of course the inode reference remains in the sincedb file. So if a newly copied log file gets a “recycled” inode that is already referenced in the sincedb file, what will happen?

    Of course it may not be likely that this situation will happen, but it’s possible.

    So one should probably think about cleaning the sincedb file after file removal. It would be nice if Logstash had some kind of feature for this.

  2. jho says

    A possible solution, instead of deleting files, could be to truncate the file to zero size and move it to an archive folder. This will work as long as the archive folder is on the same filesystem, since that “reserves” the inode. The downside would be that you could run out of free inodes … :/


