Improve limits documentation (#151)
msm-code authored May 15, 2020
1 parent 6ca6359 commit 02d0c5d
Showing 1 changed file with 11 additions and 2 deletions.
13 changes: 11 additions & 2 deletions docs/limits.md
@@ -11,10 +11,19 @@ These are all absurdly large. But there are other factors: RAM, disk space and C

## Soft limits

- Disk space - about 70% of collection size, or 140% if the `hash4` index is used.
- RAM (querying) - `max(8 * dataset.number_of_files for each dataset) bytes`.
- RAM (compacting) - `4GiB` per operation by default (can run multiple at once).
- RAM (indexing) - assume `4GiB` when indexing in small (<1000 files) batches.
- CPU and time - a lot, but still faster than running Yara directly.
- Others - may open many files at once, but usually under 100.
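
The rules of thumb above can be turned into a quick back-of-the-envelope estimate. The following is a minimal sketch in Python, assuming the percentages and constants listed in the bullets; the function names are hypothetical helpers made up for illustration and are not part of the project's API.

```python
from typing import Iterable

GIB = 1024 ** 3  # bytes in one GiB


def estimate_disk_bytes(collection_bytes: int, use_hash4: bool = False) -> int:
    """Rough index size: ~70% of the collection, ~140% if hash4 is also used."""
    percent = 140 if use_hash4 else 70
    return collection_bytes * percent // 100


def estimate_query_ram_bytes(files_per_dataset: Iterable[int]) -> int:
    """max(8 * number_of_files) over all datasets, in bytes (0 if no datasets)."""
    return max((8 * n for n in files_per_dataset), default=0)


def estimate_compacting_ram_bytes(concurrent_operations: int = 1) -> int:
    """4 GiB per compacting operation by default; operations may run in parallel."""
    return 4 * GIB * concurrent_operations


if __name__ == "__main__":
    collection = 630 * GIB
    print(estimate_disk_bytes(collection) / GIB, "GiB without hash4")
    print(estimate_disk_bytes(collection, use_hash4=True) / GIB, "GiB with hash4")
    print(estimate_query_ram_bytes([1_000_000, 250_000]), "bytes of RAM for querying")
    print(estimate_compacting_ram_bytes(2) / GIB, "GiB of RAM for two compactions")
```

For a 630 GiB collection this gives roughly 441 GiB of indexes without `hash4` and 882 GiB with it. These are estimates only; actual index sizes depend on the data.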

For more detailed calculations, read below.

**Disk space**

See the documentation on [index types](./indextypes.md). In our collections,
the combined size of all indexes except `hash4` is about 70% of the size of the
indexed files, and the `hash4` index takes another 70% if used. For example,
630 GiB of indexed malware resulted in the following indexes:
