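This time I want to look at how to handle the output of Log2timeline (Plaso) rather than at its parser modules. Plaso ships with psort.py, which can sort and otherwise post-process a Plaso storage file. I have already tried that out, so the rest of this post is about l2t_process.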
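So far I have specified CSV as the output format for Log2timeline. One option is to load that CSV straight into Excel or a similar tool and filter it there, but the l2t_process tool can remove duplicate entries and narrow the timeline down by date.
Date filtering can limit the scope either to everything from a given date onward or to a range between two dates.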
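To make the two forms of date filter concrete, here is a minimal Python sketch of how a DATE_RANGE argument could be interpreted. It is not the tool's own code: the function names `parse_date_range` and `in_range` are hypothetical, and treating the end date as inclusive is an assumption, since the help only says events must fall "within the boundaries".

```python
from datetime import date, datetime


def parse_date_range(spec, ymd=False):
    """Parse 'MM-DD-YYYY' or 'MM-DD-YYYY..MM-DD-YYYY' into (start, end).

    end is None for the open-ended form. ymd=True mimics the -y option,
    which switches the expected format to yyyy-mm-dd.
    """
    fmt = "%Y-%m-%d" if ymd else "%m-%d-%Y"
    start_text, sep, end_text = spec.partition("..")
    start = datetime.strptime(start_text, fmt).date()
    end = datetime.strptime(end_text, fmt).date() if sep else None
    return start, end


def in_range(day, start, end):
    """True if day is on or after start and, when end is set, on or before end.

    Inclusive end is an assumption made for this sketch.
    """
    return day >= start and (end is None or day <= end)


start, end = parse_date_range("01-15-2014..02-01-2014")
print(in_range(date(2014, 1, 20), start, end))
```

A single date such as `01-15-2014` yields an open-ended range, which corresponds to "everything from this date onward".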
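The help for the l2t_process command also shows several other interesting features:
- Read dates in yyyy-mm-dd format instead of the default MM-DD-YYYY (-y)
- Output only the lines that match strings listed in a keyword file (-k)
- Exclude lines that match strings listed in a whitelist file from the output (-w)
- Detect possible timestomping (a millisecond value of zero) and choose whether suspicious entries that fall outside the date filter are included or excluded (-i, -e)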
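The keyword and whitelist behaviour can be sketched in a few lines of Python. This is only an illustration of the logic the help describes (case-insensitive keywords, one per line, keywords applied before the whitelist). The names `load_terms` and `filter_lines` are hypothetical, and plain substring matching is an assumption; the real tool may match differently.

```python
def load_terms(lines):
    """Return the non-empty terms from an iterable of lines, lowercased."""
    return [line.strip().lower() for line in lines if line.strip()]


def filter_lines(lines, keywords=None, whitelist=None):
    """Yield lines that match a keyword (if any are given) and no whitelist term.

    Keywords are checked first and the whitelist second, following the
    order described in the help. Matching is a case-insensitive substring
    test, which is an assumption of this sketch.
    """
    keywords = keywords or []
    whitelist = whitelist or []
    for line in lines:
        lowered = line.lower()
        if keywords and not any(term in lowered for term in keywords):
            continue  # no keyword matched: drop the line
        if any(term in lowered for term in whitelist):
            continue  # "known good" entry: drop the line
        yield line


rows = [
    "C:/Windows/System32/evil.exe created",
    "C:/Windows/System32/svchost.exe created",
    "C:/Users/a/notes.txt modified",
]
keywords = load_terms(["System32\n"])
whitelist = load_terms(["SVCHOST.EXE\n"])
for row in filter_lines(rows, keywords, whitelist):
    print(row)
```

Combining both lists narrows the timeline in two passes: the keyword list selects candidate lines, and the whitelist then removes the ones already known to be uninteresting.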
Below is the help for l2t_process as included in SIFT 3.0. Unfortunately, unlike psort.py, it does not seem to be able to change the time zone at the output stage (for example, UTC to JST).
root@siftworkstation:/home/sansforensics# l2t_process -h
Usage:
l2t_process [OPTIONS] -b CSV_FILE [DATE_RANGE]
Where DATE_RANGE is MM-DD-YYYY or MM-DD-YYYY..MM-DD-YYYY
Options:
-b|-body CSVFILE
The name of the file that contains the CSV output produced by
log2timeline.
-t|-tab The default input to the tool is a file that was created using
the CSV output module. However, the TAB module can also be used,
however you will need to tell the tool that the file is TAB
delimited instead of comma separated, using this option.
-i|-include
The tool detects possible timestomping activity against changes
made to MFT records (millisecond is of zero value). This option
makes the tool add lines that contain suspicious entries even
though they fall outside the supplied date filter.
-e|-exclude
The tool detects possible timestomping activity against changes
made to MFT records (millisecond is of zero value). If this
option is supplied the tool will not ask the user to add the
lines that are suspicous yet are outside the supplied date
range.
-v|-verbose
Making the script produce mode debug information (be more
verbose)
-y The default format for the date variable is mm-dd-yyyy, however
this default behavior can be changed with this option so the
format read is yyyy-mm-dd.
-V|-Version
Print the tools version number and exit.
-k|-keyword FILE
Include a keyword file that contains one keyword per line. The
tool will read the keyword file line-by-line, and then compare
each line in the CSV file against each of those keywords. The
tool will only print out those lines that match the keywords.
The words inside the keyword list are case insensitive.
-w|-whitelist FILE
Include a keyword file that contains one keyword per line. The
file has the same format as the keyword file, and does the same
thing, except that this file lists up keywords of words that
should not be contained in the timeline. That is to say, this
file defines the "known good" or whitelisted lines that should
be kept out of the timeline.
The tool starts by comparing the known keywords before
processing the whitelist, meaning that keywords are first
filtered out before the whitelist is processed. So the whitelist
can be used in conjunction to the blacklist to narrow down the
scope even more.
It can also be used to remove known "good entries" or entries
that are not relevant to the current investigation out of the
timeline.
-s|-scatter FILE
This only makes sense when the timeline contains records from
the MFT parser (NTFS filesystem). Then the tool will take the
creation time of each file that resides in the WINDOWS/System32
directory and scatter plot it against the MFT number of that
file. The tool will both plot the $FN and $SI creation time of
the file.
This can be useful during malware investigations, to quickly
find files that might have been added to the system32 folder.
When the operating system in installed, and during patching
there are usually several files written to the system32 folder
at once and since MFT's are associated sequentially there should
be clear association between MFT numbers and creation time.
However a typical malware does not create several files in the
system32 directory, a typical malware tries to hide and does so
by creating as few files as possible. That makes it possible to
view a scatter plot, showing the relationship between creation
time and MFT numbers to quickly spot those outliers or
anomalies. This technique can therefore be used for data
reduction.
This option creates a simple gnuplot data file and a gnuplot
script that can be used to create a simple scatter plot to see
those outliers. It will also make an attempt at identifying
those outliers with a simple algorithm. By default the tool
treats the entire dataset as a single slice and tries to find
the obvious outliers, however that behaviour can be changed
using the -m or --multi option to tell the tool to try to split
the dataset into slices.
The FILE portion should be the name of the output file the tool
writes to, it should only contain ASCII letters: a-z, A-Z,
underscore (_) and numbers 0-9, no dot. The files created will
be: FILE.dat and FILE.cmd
Then the tool gnuplot has to be run, like:
gnuplot FILE.cmd
Which will produce a file called FILE.png, containing the
scatter plot.
If the tool detects any outliers in the dataset then the file
FILE_outliers.txt will be created. That file will contain a list
of all those files that the tool detected as outliers.
-m|--multi
This option is only available when used with the -s FILE, to
create scatter plot of the creation time vs. $MFT entry numbers.
By default the tool treats the entire dataset as a single slice
and tries to detect outliers in it. Since the relationship
between $MFT entry numbers and creation time isn't a simple
line, in reality it consists of several straight lines, there
will be many false negatives when treating the dataset as a
single slice. Therefore the option of trying to split the
dataset into multiple smaller slices, and calculating the
outliers for each one of those has been provided.
This is a simple approach to this problem, and by no means
solves the issue at hand. This method does produce lots of false
positives (and it could also miss some, or produce false
negatives). However it will catch many of the items that get
missed by the first attempt.
Perhaps the best approach is to start with the default behaviour
of the tool, examine the graph manually. And if there are some
outliers in the dataset that are perhaps aligned with another
line, yet are obvious outliers, then to re-run the tool using
this option to try to see if it gets detected.
-h|-help
Print this help message
[DATE_RANGE]
The date range is formulated as one of the following:
MM-DD-YYYY All dates from the date supplied date and
forward from them. That is to say, the date
defines the starting date and all dates after
that date will be part of the selection.
MM-DD-YYYY..MM-DD-YYYY
This is a range, so all events that fall within
the boundaries set by these two dates will be
part of the selection.
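Since l2t_process does not appear to offer a time zone conversion at output time, one workaround is to post-process the CSV separately. The following Python sketch shifts UTC rows to JST. It assumes the CSV has `date`, `time` and `timezone` columns, with dates written as MM/DD/YYYY and times as HH:MM:SS; check these assumptions against your own file before relying on it. The function name `utc_rows_to_jst` is hypothetical.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

# JST is a fixed UTC+9 offset with no daylight saving time.
JST = timezone(timedelta(hours=9), "JST")


def utc_rows_to_jst(csv_text):
    """Return csv_text with the date/time/timezone of UTC rows rewritten to JST.

    Assumed columns: 'date' (MM/DD/YYYY), 'time' (HH:MM:SS), 'timezone'.
    Rows whose timezone is not 'UTC' are passed through unchanged.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row["timezone"] == "UTC":
            stamp = datetime.strptime(
                row["date"] + " " + row["time"], "%m/%d/%Y %H:%M:%S"
            )
            local = stamp.replace(tzinfo=timezone.utc).astimezone(JST)
            row["date"] = local.strftime("%m/%d/%Y")
            row["time"] = local.strftime("%H:%M:%S")
            row["timezone"] = "JST"
        writer.writerow(row)
    return out.getvalue()


sample = "date,time,timezone,short\n01/31/2014,20:30:00,UTC,example\n"
print(utc_rows_to_jst(sample), end="")
```

Because the conversion can move an event across midnight, any date filtering should be done with the converted values in mind.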