Hi,

I'm trying to get the maximum possible performance out of YARA, and to that 
end I've been studying the code and algorithms to make sure I'm not 
overlooking anything:

1) My understanding is that YARA extracts short "atoms" from each string and 
builds an Aho-Corasick automaton from them, so a string only undergoes full 
verification when one of its atoms appears in the scanned data. This is a 
great start because rules whose atoms never match do very little work per 
file (see the atom-quality sketch after this list).
2) I also believe conditions are evaluated with short-circuit logic, so once 
a sub-expression settles the outcome, the remaining ones are never evaluated 
(see the condition-ordering sketch below).
3) The -f option (in the command-line tool) enables fast matching mode, 
which stops searching for further occurrences of a string once one has been 
found, avoiding wasted work when a single match is enough (see the 
fast-mode/precompilation sketch below).
4) Precompiling rules is good practice as it saves time, since the scanner 
won't need to compile them before starting each scan (loading a precompiled 
ruleset is shown in the same sketch below).
5) Writing the rules in smart ways yields better performance, including: 
using hex sequences with distinctive, non-trivial byte patterns, replacing 
some text strings with hex representations, rewriting regexes to be more 
efficient, (sorting the conditions cheapest-first?), etc.
6) You can run YARA with multiple threads. There is a drastic difference 
between running with 1 thread vs running with 16 threads, most likely 
because CPU-bound matching and file I/O can overlap (a threaded sketch 
follows this list).
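
To make the atom point concrete, here is the kind of contrast I have in 
mind (rule names and byte patterns are made up; my understanding of atom 
extraction comes from the YARA performance guidelines, so please correct me 
if this is off):

    import yara

    # Poor atoms: mostly wildcards and 0x00 bytes, so the best atom YARA
    # can extract is short and common, and full verification triggers on
    # almost every scanned file.
    slow = yara.compile(source=r'''
    rule poor_atoms {
        strings:
            $a = { 00 00 ?? ?? 00 00 01 }
        condition:
            $a
    }
    ''')

    # Better atoms: a distinctive 4-byte literal gives the Aho-Corasick
    # automaton something selective to key on, so verification rarely runs.
    fast = yara.compile(source=r'''
    rule good_atoms {
        strings:
            $a = { DE AD BE EF ?? ?? 00 00 01 }
        condition:
            $a
    }
    ''')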
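
For the condition-ordering point, this is a toy example of what I mean: 
filesize and uint16() are nearly free, while the loop over match offsets is 
not, so with left-to-right short-circuiting the expensive part only runs on 
small MZ files that already contain the string (the rule is invented):

    import yara

    rules = yara.compile(source=r'''
    rule ordered_condition {
        strings:
            $payload = "this_is_a_made_up_marker"
        condition:
            filesize < 2MB and          // cheap: already known
            uint16(0) == 0x5A4D and     // cheap: reads two bytes
            $payload and                // cheap: match info already exists
            for any i in (1..#payload) : ( @payload[i] < 0x10000 )  // costly
    }
    ''')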
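
For fast mode and precompilation, this is how I'm exercising both from 
yara-python (paths are placeholders, and I'm assuming fast=True maps to the 
same flag as the command-line -f; corrections welcome):

    import yara

    rules = yara.compile(filepath='rules.yar')
    rules.save('rules.compiled')         # serialize once...
    rules = yara.load('rules.compiled')  # ...skip compilation on later runs

    # timeout is in seconds and guards against pathological files.
    matches = rules.match('sample.bin', fast=True, timeout=60)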
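
And the threaded setup looks roughly like this (as far as I can tell, a 
compiled Rules object can be shared across threads because the matching 
happens in libyara outside the GIL, but I'd appreciate confirmation; the 
paths are placeholders):

    import yara
    from concurrent.futures import ThreadPoolExecutor
    from pathlib import Path

    rules = yara.load('rules.compiled')

    def scan(path):
        # Each call is an independent scan of one file.
        return path, rules.match(str(path), fast=True, timeout=60)

    files = [p for p in Path('samples').rglob('*') if p.is_file()]
    with ThreadPoolExecutor(max_workers=16) as pool:
        for path, matches in pool.map(scan, files):
            if matches:
                print(path, [m.rule for m in matches])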

With these in mind, I tried to measure YARA's performance scanning a given 
directory (e.g. containing 10k assorted files) using artificial sets of 5k, 
10k, 20k and even 40k rules. To my surprise, YARA is quite fast up to 5k 
rules, but beyond that performance degrades drastically (roughly linearly 
with the rule count). Note: I ran the benchmark multiple times to eliminate 
the effect of hard-disk I/O, so everything was served from the page cache.
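
For reference, a stripped-down version of the harness (the rule-set file 
names and the corpus path are illustrative):

    import time
    import yara
    from pathlib import Path

    files = [p for p in Path('corpus').rglob('*') if p.is_file()]  # ~10k

    for size in (5_000, 10_000, 20_000, 40_000):
        rules = yara.load(f'rules_{size}.compiled')  # pre-built rule sets
        for p in files:        # warm-up pass to populate the page cache
            rules.match(str(p), timeout=60)
        start = time.perf_counter()
        for p in files:        # timed pass
            rules.match(str(p), timeout=60)
        print(f'{size} rules: {time.perf_counter() - start:.1f}s')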

- Am I missing any optimization trick or best-known method? 
- Does YARA suffer from some performance limitation related to the number of 
rules or the number of files?
- From my basic reading of the source code, modules such as "pe" and 
"dotnet" parse the entire file (in the module's load function) as soon as 
they are imported, regardless of which module fields the conditions actually 
use. If a rule only needs to check pe.is_pe, do we really need to parse the 
whole file for that? Isn't parsing the imports/exports or the certificates 
slowing the scan down unnecessarily? (I'm not even sure this is the cause of 
the degradation, just a thought; a small experiment I have in mind is 
sketched below.)
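
The experiment would be the same logical check expressed two ways, timed 
over the same corpus; the uint16(0) == 0x5A4D trick comes from the 
performance guidelines:

    import time
    import yara
    from pathlib import Path

    # Importing "pe" makes the module parse the whole PE on every candidate
    # file, while uint16(0) reads only the first two bytes.
    with_module = yara.compile(source='''
    import "pe"
    rule a { condition: pe.is_pe }
    ''')
    raw_header = yara.compile(source='rule b { condition: uint16(0) == 0x5A4D }')

    files = [p for p in Path('corpus').rglob('*') if p.is_file()]
    for name, rules in (('pe.is_pe', with_module), ('uint16(0)', raw_header)):
        start = time.perf_counter()
        for p in files:
            rules.match(str(p), timeout=60)
        print(f'{name}: {time.perf_counter() - start:.1f}s')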

Any tip or suggestion is much appreciated, and I'm happy to contribute back 
if there is an opportunity to do so.

Regards,
