Hello!

I don't think it is possible to control how much of the file libyara *reads*.
You could try fast matching mode, but I believe libyara would still load
the whole file into memory before it starts matching your rules, regardless
of how the rules are written.

I believe nothing is faster than handing libyara a smaller buffer, but then
the buffer size is set by your script, not by the rules themselves. See:

$ dd if=/dev/zero bs=1GB count=1 of=1gb
1+0 records in
1+0 records out
1000000000 bytes (1.0 GB, 954 MiB) copied, 0.95126 s, 1.1 GB/s

$ cat /bin/ls 1gb > bigfile # just to have a match

$ cat normal.py
import yara
import sys
rules = yara.compile(source='rule test_elf { strings: $a = "ELF" condition: $a in (0..99) }')
matches = rules.match(filepath=sys.argv[1])

$ time python normal.py bigfile
real    0m1.532s
user    0m1.512s
sys     0m0.020s

$ cat fast.py
import yara
import sys
rules = yara.compile(source='rule test_elf { strings: $a = "ELF" condition: $a in (0..99) }')
matches = rules.match(filepath=sys.argv[1], fast=True)

$ time python fast.py bigfile
real    0m1.052s
user    0m1.032s
sys     0m0.020s

$ cat read100.py
import yara
import sys
rules = yara.compile(source='rule test_elf { strings: $a = "ELF" condition: $a in (0..99) }')
with open(sys.argv[1], 'rb') as f:
    matches = rules.match(data=f.read(100))

$ time python read100.py bigfile
real    0m0.012s
user    0m0.012s
sys     0m0.000s
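
If you want the same trick for an arbitrary size (say the 200 KB mentioned
earlier in the thread), here is a minimal sketch. The names read_prefix,
scan_prefix and DEFAULT_LIMIT are mine, not part of yara-python; the only
yara-python call it relies on is rules.match(data=...), as in read100.py
above. Treat it as an illustration, not a tested tool.

```python
DEFAULT_LIMIT = 200 * 1024  # 200 KB; hypothetical default, pick your own


def read_prefix(path, limit=DEFAULT_LIMIT):
    """Return at most `limit` bytes from the start of the file at `path`."""
    with open(path, 'rb') as f:
        return f.read(limit)


def scan_prefix(rules, path, limit=DEFAULT_LIMIT):
    """Match compiled rules against only the first `limit` bytes of `path`.

    `rules` is expected to be a compiled yara.Rules object; all this helper
    needs from it is a match(data=...) method.
    """
    return rules.match(data=read_prefix(path, limit))


# Usage (assumes yara-python is installed and my_rules.yar exists):
#   import yara
#   rules = yara.compile(filepath='my_rules.yar')
#   matches = scan_prefix(rules, 'bigfile')
```

Keep in mind that the rules only see the buffer you pass in, so "filesize"
evaluates to the length of the prefix rather than the size of the file on
disk, and nothing past the limit can ever match.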

I'm not a YARA developer, but I think this happens because reading/mapping
a file into memory and matching it against rules are two separate steps.
Think programmatically: to implement what you want, the devs would have to
first examine the rules to see whether one or more conditions limit the
number of bytes that need to be scanned. So, a condition such as "$a in
(0..99)" should cause libyara to read only 100 bytes from the file.
However, if the condition is "$a in (0..99) or $b", then libyara has to
read the whole file, because $b can be anywhere. It'd be a complex process.
I don't know if you can do this without patching libyara, sorry. Maybe a
dev could help here.
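
To make that reasoning concrete, here is a toy sketch of the kind of
analysis I mean. It is NOT how libyara works and the tuple-based condition
format is something I made up for illustration. It also ignores string
lengths (a match starting at offset 99 extends past byte 100), "filesize",
modules and everything else a real implementation would have to handle.

```python
WHOLE_FILE = None  # sentinel: the condition may depend on any byte of the file


def bytes_needed(cond):
    """Return how many leading bytes a toy condition needs, or WHOLE_FILE.

    Toy condition forms (hypothetical, not YARA's real syntax tree):
      ('in', name, lo, hi)  string must start at an offset in lo..hi
      ('any', name)         string may appear anywhere in the file
      ('and', left, right)  both sub-conditions
      ('or', left, right)   either sub-condition
    """
    kind = cond[0]
    if kind == 'in':
        return cond[3] + 1  # offsets 0..hi cover hi + 1 bytes
    if kind == 'any':
        return WHOLE_FILE
    if kind in ('and', 'or'):
        left = bytes_needed(cond[1])
        right = bytes_needed(cond[2])
        # One unbounded operand makes the whole expression unbounded.
        if left is WHOLE_FILE or right is WHOLE_FILE:
            return WHOLE_FILE
        return max(left, right)
    raise ValueError('unknown condition: %r' % (cond,))


bounded = ('in', '$a', 0, 99)                           # $a in (0..99)
unbounded = ('or', ('in', '$a', 0, 99), ('any', '$b'))  # $a in (0..99) or $b

print(bytes_needed(bounded))    # 100
print(bytes_needed(unbounded))  # None, i.e. the whole file
```

Even in this toy version a single unbounded string forces a full read, and
the real condition language is far richer than four node types.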

Thanks,
Fernando

On Wed, Aug 23, 2023 at 3:18 AM neslihan hanecioglu <
[email protected]> wrote:

> Hello,
>
> Thank you, Sir, for your help. But I want to give the file to yara in Python
> for speed, because yara extracts the content of the file and examines it very
> fast. I searched for this problem in Python but unfortunately could not find
> anything. For example, I used the following rule, but yara still reads the
> full file.
>
> rule SearchRegexdInPartOfAFile {
>     strings:
>         $a = /([1-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])/
>
>     condition:
>         $a in (0..100)
> }
>
> As I explained, I want to search for "a" in the first 100 bytes of the file.
> If "a" is found, return the match result. Otherwise, stop examining the file.
> Speed is what matters most to me. I guess I cannot do this with a Python
> script in any way.
> On Tuesday, 22 August 2023 at 22:52:48 UTC+3,
> [email protected] wrote:
>
>> Hello, have a look at the -z switch in yara command manual (*man yara*
>> or here <https://yara.readthedocs.io/en/stable/commandline.html>).
>>
>> If you want to do this programmatically, you can just read the first
>> 200KB of the file before passing it to libyara. ;)
>>
>> Best,
>>
>>
>> On Tue, Aug 22, 2023 at 9:34 AM neslihan hanecioglu <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> During file scanning, I do not want to examine anything beyond a certain
>>> size. For example, for a 100 MB file, I want to scan the first 200 KB and
>>> get its match result, without scanning past 200 KB. How can I achieve this
>>> with a yara rule or a Python script? I want to give the full file to Yara and
>>> have Yara not read the full content, as I explained above. It is important
>>> for speed.
>>>
>>> Thank you for response.
>>> Sincerely.
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "YARA" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/yara-project/c016a513-da34-4b25-88b6-f8b3367395e5n%40googlegroups.com
>>> <https://groups.google.com/d/msgid/yara-project/c016a513-da34-4b25-88b6-f8b3367395e5n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
