How to use entropy analysis alongside antivirus: why you need both
- Step 1: Drop the sample into the analyzer first. Before or alongside an AV scan, drop the file (one at a time, because the entropy case reads only the first file). The 256-byte chunker computes Shannon entropy and plots the distribution with the amber 7.5 reference line.
- Step 2: Read the plateau, not just the peak. A uniform plateau hugging or crossing the amber line across most of the file is the packing signature. Distinct zones (low / mid / high in sequence) indicate an unpacked binary. The readout percentage and the amber threatDetected colour quantify how widespread the high entropy is.
- Step 3: Run your AV scan in parallel. Submit the same sample to your signature engine. Note the verdict and, if available, whether the engine reports a packer name (some AV unpacks before scanning).
- Step 4: Cross-reference the four outcomes. AV-dirty + high-entropy = known packed threat. AV-clean + high-entropy = the dangerous quadrant: possible unknown/custom packer, so escalate. AV-dirty + low-entropy = unpacked known malware. AV-clean + low-entropy = most likely benign (still verify type).
- Step 5: Disambiguate high entropy with header context. High entropy is not malware by itself. Check the header with magic-byte-validator: a signed NSIS/MSI installer or a media container explains the entropy benignly. A bare PE with a UPX marker or no recognizable structure does not.
- Step 6: Escalate the AV-clean / high-entropy quadrant. For the samples that bypass signatures but show uniform packing entropy, hash them with multi-hash-fingerprinter, record the fingerprint, and send to a behavioural sandbox for detonation. Entropy got you to the right file faster than waiting for a signature update.
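The fingerprinting in Step 6 can be approximated locally with Python's standard hashlib module. This is a minimal sketch, not the multi-hash-fingerprinter implementation: the function name `fingerprint` and the choice of MD5, SHA-1, and SHA-256 are assumptions made for illustration.

```python
import hashlib


def fingerprint(data: bytes) -> dict[str, str]:
    """Return hex digests of a sample under several common hash algorithms.

    The algorithm list is an assumption; use whatever your sandbox or
    ticketing system expects.
    """
    algorithms = ("md5", "sha1", "sha256")
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}


if __name__ == "__main__":
    # Hypothetical in-memory sample; in practice read the file in binary mode.
    sample = b"MZ" + bytes(64)
    for name, digest in fingerprint(sample).items():
        print(f"{name}: {digest}")
```

Recording the SHA-256 alongside the entropy readout lets you match a later sandbox report or signature update back to the same file.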
Entropy analysis vs signature antivirus
Two methods, two questions. They are strongest in combination because each covers the other's blind spot.
| Dimension | Signature AV | Shannon entropy analysis |
|---|---|---|
| Question answered | Does this match a known-bad pattern? | How random are these bytes? |
| Catches novel/zero-day packers | No — no signature yet | Yes — packing raises entropy regardless of family |
| False positives on benign files | Low (precise patterns) | High on compressed media/archives/documents |
| Defeated by | Packing, crypting, polymorphism | Deliberate low-entropy padding can suppress the flag; otherwise robust, but it can only measure, not classify |
| Speed | Slower (DB lookups, unpacking) | Fast (single pass of byte math) |
| Verdict vs signal | Verdict (named family) | Signal (triage hint, no name) |
The four-quadrant triage matrix
Combine the AV verdict with the analyzer's threatDetected (>50% of chunks at >=7.5). The AV-clean / high-entropy cell is where entropy earns its keep.
| AV result | Entropy | Most likely | Action |
|---|---|---|---|
| Dirty | High (flag fires) | Known packed threat | Contain — already identified |
| Dirty | Low | Unpacked known malware | Contain — straightforward |
| Clean | High (flag fires) | Unknown/custom packer OR benign compressed file | Check header/signature; sandbox if no benign explanation |
| Clean | Low | Probably benign | Verify file type, then move on |
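The matrix above is small enough to express as a lookup table. The sketch below is one way to encode it in Python; the names `TRIAGE_MATRIX` and `triage` are hypothetical, and the two booleans are assumed to come from your AV verdict and the analyzer's threatDetected flag.

```python
# Keys are (av_dirty, threat_detected); values are (most_likely, action),
# copied from the four-quadrant table.
TRIAGE_MATRIX = {
    (True, True): ("Known packed threat", "Contain"),
    (True, False): ("Unpacked known malware", "Contain"),
    (False, True): (
        "Unknown/custom packer or benign compressed file",
        "Check header/signature; sandbox if no benign explanation",
    ),
    (False, False): ("Probably benign", "Verify file type, then move on"),
}


def triage(av_dirty: bool, threat_detected: bool) -> tuple[str, str]:
    """Map an AV verdict and the entropy flag to a likely cause and next action."""
    return TRIAGE_MATRIX[(bool(av_dirty), bool(threat_detected))]


if __name__ == "__main__":
    likely, action = triage(av_dirty=False, threat_detected=True)
    print(f"{likely} -> {action}")
```

Keeping the table as data rather than nested conditionals makes it easy to review against the written policy.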
Typical entropy profile of a Windows PE
Approximate per-section entropy for an unpacked vs packed PE. The loss of distinct zones is the packing tell.
| Region | Unpacked | Packed | Meaning |
|---|---|---|---|
| .text (code) | 5.5 - 6.5 | 7.7 - 8.0 | Code becomes noise after packing |
| .rdata / .data | 2.0 - 4.5 | 7.7 - 8.0 | String/data tables vanish into the plateau |
| .rsrc (resources) | 7.0 - 7.5 | 7.7 - 8.0 | Already high; now indistinguishable |
| Whole-file flag | Usually false (<50%) | Usually true (>50%) | threatDetected mirrors the plateau |
Cookbook
How entropy and AV combine on real triage. Values are representative; the analyzer counts chunks at or above 7.5 and flags the file above 50%.
AV-clean, entropy says packed
A fresh sample no engine flags. The entropy curve is a uniform plateau at ~7.9 across 93% of the file. This is the AV-clean / high-entropy quadrant — most likely an unknown packer ahead of the signature DB. Escalate to sandbox.
AV: 0/72 engines detect
Entropy readout: 7100 high-entropy chunks (93%) <- amber
threatDetected: true
Verdict: signatures haven't caught up. Entropy flagged what AV could not. Hash it, sandbox it, submit the sample.
AV-clean, entropy high — but benign installer
Same quadrant on paper, opposite reality. A signed Inno Setup installer is DEFLATE-compressed, so entropy is uniformly high and the flag fires. The header and signature explain it. This is why entropy must be paired with context.
AV: clean
Entropy: 88% high -> amber, threatDetected: true
magic-byte-validator: PE / Inno Setup installer
signature: valid, known software vendor
Verdict: high entropy = compression, not packing. Benign. The header context is what saved you from a false alarm.
AV-dirty and entropy-high agree
An engine names the family AND the entropy plateau is obvious. Nothing ambiguous — contain it. Useful as a sanity check that your entropy reading lines up with a known verdict.
AV: Trojan.Generic (3 engines)
Entropy: 95% high -> amber
threatDetected: true
Both methods agree. Straightforward containment.
AV-dirty but entropy low — unpacked malware
A known threat shipped without packing. AV catches it on signature; entropy is unremarkable (~5.8 code zones). Demonstrates that entropy is not a malware detector — it is a packing/encryption detector.
AV: Backdoor.Win32 (named)
Entropy: 11% high -> not flagged, threatDetected: false
Lesson: low entropy does NOT mean safe. Entropy only sees randomness; AV saw the signature. Use both.
Comparing packer effect on the same file
Run a benign EXE, then its UPX-packed copy, one at a time. The before/after entropy shift is the cleanest illustration of why packing defeats signatures: the bytes — and therefore any signature over them — are gone.
Original EXE: 12% high-entropy, threatDetected false
UPX-packed: 91% high-entropy, threatDetected true
AV signature over the original .text no longer matches the packed body. Entropy, however, jumped. That gap is exactly the AV blind spot entropy covers.
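The same before/after shift can be reproduced without a real packer. The sketch below uses zlib compression as a stand-in for UPX (an assumption: a real packer also adds an unpacking stub and rewrites the PE layout) and a synthetic 16-symbol byte stream as a stand-in for code-like content. It measures whole-buffer entropy rather than per-chunk entropy, and checks whether a byte "signature" taken from the original still appears after packing.

```python
import math
import random
import zlib
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


# Synthetic "unpacked" content: 64 KiB drawn from a 16-symbol alphabet,
# so its entropy cannot exceed log2(16) = 4 bits per byte.
rng = random.Random(0)
alphabet = b"0123456789abcdef"
original = bytes(rng.choice(alphabet) for _ in range(65536))

# zlib stands in for the packer here; it is not UPX.
packed = zlib.compress(original)

# A naive byte signature taken from the original body.
signature = original[:32]

before = shannon_entropy(original)
after = shannon_entropy(packed)
signature_survives = signature in packed

if __name__ == "__main__":
    print(f"entropy before: {before:.2f} bits/byte")
    print(f"entropy after:  {after:.2f} bits/byte")
    print(f"signature still present after packing: {signature_survives}")
```

The transformation is lossless (decompressing restores the original bytes), yet the on-disk bytes a static signature would match are no longer there, while the entropy measurement rises sharply.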
Edge cases and what actually happens
Encrypted archive looks identical to crypted malware
Expected: An AES-encrypted ZIP and an encrypted malware payload both sit at ~8.0. Entropy cannot separate them, and neither can AV (it sees random bytes too). Only the password/source/context distinguishes them. Entropy tells you something is encrypted, not whether it is hostile.
Normal entropy for an unpacked Windows EXE
Expected: Unpacked PEs show 4.5-6.5 in code, 2-4 in data/strings, and up to ~7.5 in .rsrc. A whole-file uniform value above 7.5 is the anomaly. Because resources are already high, a small high-entropy fraction is normal and will not trip the 50% flag.
Entropy cannot replace AV
By design: Entropy catches packed/crypted novelty but produces no family name and false-alarms on compressed benign files. AV provides precise named verdicts but misses unknown packers. Dropping either method leaves a gap; the value is the combination.
AV unpacks before scanning, you don't
Investigate: Some AV engines unpack common packers in memory before signature matching, so they may flag a packed sample that your raw entropy reading shows only as a plateau. If AV is dirty but all you see is the plateau, that is expected: the engine observed the unpacked body; you observed the packed file on disk.
Benign compressed file trips the flag
False alarm risk: Media files, ZIP archives, and Office documents are compressed and routinely cross 50% high-entropy chunks, firing threatDetected. Without header/signature context this reads as a positive. Always confirm the file type before treating the flag as a finding.
Custom packer with deliberate low-entropy padding
Evasion attempt: Sophisticated packers insert low-entropy junk (null runs, repeated stubs) to drag the file's high-entropy percentage under 50% and suppress the flag. The curve still shows high plateaus between valleys, so read the shape, not only the boolean.
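"Reading the shape" can also be done programmatically. The following sketch assumes you already have a list of per-chunk entropy values and reports contiguous high-entropy runs alongside the boolean flag. The function names, the 7.5 threshold default, and the synthetic padded profile are illustrative assumptions, not the analyzer's own output format.

```python
def high_entropy_runs(chunk_entropies, threshold=7.5):
    """Return (start_index, length) for each run of consecutive chunks >= threshold."""
    runs = []
    start = None
    for i, value in enumerate(chunk_entropies):
        if value >= threshold:
            if start is None:
                start = i  # a new plateau begins here
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:  # plateau runs to the end of the file
        runs.append((start, len(chunk_entropies) - start))
    return runs


def summarize(chunk_entropies, threshold=7.5, flag_fraction=0.5):
    """Combine the whole-file flag with the plateau structure."""
    total = len(chunk_entropies)
    high = sum(1 for value in chunk_entropies if value >= threshold)
    runs = high_entropy_runs(chunk_entropies, threshold)
    return {
        "threat_detected": total > 0 and high / total > flag_fraction,
        "high_fraction": high / total if total else 0.0,
        "longest_run": max((length for _, length in runs), default=0),
        "runs": runs,
    }


# Synthetic padded sample: two plateaus separated by null-like valleys.
padded = [7.9] * 30 + [0.0] * 45 + [7.9] * 15 + [0.0] * 10

if __name__ == "__main__":
    print(summarize(padded))
```

In this synthetic profile only 45% of chunks are high, so the flag stays false, but a 30-chunk plateau is still reported. A long run relative to file size is a reason to look at the curve even when the boolean is quiet.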
File too large for your tier
Rejected: The reader throws an "exceeds the limit for your plan" error before computing entropy. Free 10 MB, Pro 100 MB, Pro-media 500 MB, Developer 2 GB. A full installer may need Pro; a memory dump may need Developer.
Polymorphic malware with low entropy
Investigate: Polymorphism rewrites code to dodge signatures without necessarily raising entropy: the bytes change but stay code-like (~5.5-6.5). Such a sample can be AV-clean AND entropy-low yet still malicious. Neither method is sufficient alone; behavioural analysis is the backstop.
Only the first dropped file is analyzed
By design: If you drop a batch hoping to triage several at once, the entropy case still reads files[0] only. Run each suspect separately so every sample gets its own curve and flag.
Frequently asked questions
Do encrypted archives also trigger high entropy?
Yes. An AES-encrypted ZIP looks identical to an encrypted malware blob from an entropy standpoint — both approach 8.0 bits/byte. Context (file name, source, the magic-byte header) is what distinguishes them, not the entropy value. The analyzer's amber flag will fire on either.
What is a normal entropy profile for a Windows EXE?
Unpacked PEs typically show 4.5-6.5 in code (.text), 2-4 in data and string sections, and up to ~7.5 in compressed resources (.rsrc). Uniform values above 7.5 across most of the file suggest packing — that is when the analyzer's >50% threatDetected rule fires.
Can entropy analysis replace antivirus?
No — they are complementary. AV catches known threats with precise named verdicts but misses unknown packers and crypters. Entropy catches the high-randomness signature of packed/encrypted novelty but gives no family name and false-alarms on benign compressed files. Run both.
Why is an AV-clean file with high entropy suspicious?
Because legitimate code is not uniformly random. If signatures find nothing but the whole file is a high-entropy plateau, you may be looking at a packer the signature DB has not catalogued yet. That AV-clean / high-entropy quadrant is exactly where entropy earns its place in the pipeline — provided you first rule out benign compression via the header.
What threshold does the analyzer use to flag a file?
Two levels. Per-chunk: any 256-byte window at or above 7.5 bits/byte is counted as high-entropy. Per-file: threatDetected is true when high-entropy chunks exceed 50% of the total, which also turns the readout percentage amber. Both numbers come straight from the engine.
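The two-level rule can be sketched in a few lines of Python. This is an illustration of the rule as described, not the engine's code: the constant and function names are hypothetical, the output keys borrow the field names highEntropyChunks, total, and threatDetected, and treating a short trailing chunk like any other chunk is an assumption.

```python
import math
from collections import Counter

CHUNK_SIZE = 256      # bytes per window
HIGH_ENTROPY = 7.5    # per-chunk threshold, bits per byte
FLAG_FRACTION = 0.5   # file is flagged when MORE than this share is high


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte."""
    if not chunk:
        return 0.0
    total = len(chunk)
    return -sum((n / total) * math.log2(n / total) for n in Counter(chunk).values())


def analyze(data: bytes, chunk_size: int = CHUNK_SIZE) -> dict:
    """Count high-entropy chunks and apply the >50% file-level rule."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    high = sum(1 for chunk in chunks if shannon_entropy(chunk) >= HIGH_ENTROPY)
    total = len(chunks)
    return {
        "highEntropyChunks": high,
        "total": total,
        "threatDetected": total > 0 and high > total * FLAG_FRACTION,
    }


if __name__ == "__main__":
    # Three chunks holding every byte value once (8.0 bits) and one null chunk.
    sample = bytes(range(256)) * 3 + bytes(256)
    print(analyze(sample))
```

One caveat for this sketch: a 256-byte window holds few samples relative to 256 possible byte values, so uniformly random input typically scores around 7.2 bits per byte per window rather than 8.0. The window size is left as a parameter for that reason; how the analyzer's engine handles this is not stated here.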
Will entropy flag every compressed file as a threat?
It will flag them as high-entropy, yes — and threatDetected will often be true for media, archives, and Office documents because they are compressed. That is a known false-alarm class. Pair the curve with magic-byte-validator so a recognized, signed compressed format is read as benign.
How do I confirm a packer after entropy flags it?
Inspect the header. hex-header-inspector shows the first bytes — a UPX! marker, an unusual section name, or a tiny .text with a huge high-entropy overlay all point to packing. Then fingerprint with multi-hash-fingerprinter and submit to a sandbox.
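A crude version of that header check can be scripted as a byte search. The sketch below is not how hex-header-inspector works and is not a PE parser; the function name `packer_hints` and the 4096-byte header window are assumptions. It looks for the `MZ` magic, the `UPX!` marker, and the `UPX0`/`UPX1` section names that stock UPX builds use. These markers can be stripped or forged, so treat a hit as a hint and a miss as inconclusive.

```python
def packer_hints(data: bytes, header_window: int = 4096) -> list[str]:
    """Return human-readable hints that a file may be UPX-packed."""
    hints = []
    if data[:2] == b"MZ":
        hints.append("MZ magic: DOS/PE executable")
    header = data[:header_window]
    # Stock UPX names its sections UPX0 and UPX1.
    for section_name in (b"UPX0", b"UPX1"):
        if section_name in header:
            hints.append(f"section name {section_name.decode()} in header region")
    if b"UPX!" in data:
        hints.append("UPX! marker present")
    return hints


if __name__ == "__main__":
    # Synthetic bytes for illustration only; not a valid executable.
    fake = b"MZ" + bytes(62) + b"UPX0" + bytes(36) + b"UPX1" + bytes(400) + b"UPX!"
    for hint in packer_hints(fake):
        print(hint)
```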
Is the sample uploaded to a scanning service?
Not by the entropy analyzer. It runs entirely in your browser tab — the bytes never leave your machine. That matters when handling live samples you do not want shared with a public multi-engine service. Your separate AV scan is its own pipeline.
Can a packer evade the entropy flag?
Yes, with effort. Inserting low-entropy padding (null runs, repeated stubs) can pull the high-entropy fraction below 50% so threatDetected stays false. The high plateaus still appear on the curve between the valleys — read the chart shape, not only the boolean, on sophisticated samples.
Does low entropy mean a file is safe?
No. Unpacked known malware and polymorphic samples can sit at normal code entropy (5.5-6.5) and still be hostile — that is what AV signatures and behavioural analysis are for. Entropy only measures randomness; it is silent on intent.
How large a sample can I check?
Free tier handles up to 10 MB / 1 file. Pro is 100 MB / 5, Pro-media 500 MB / 50, Developer 2 GB / unlimited. Oversized files are rejected with a clear plan-limit error before any analysis runs.
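If you pre-sort samples before dropping them, a small helper can tell you which tier a file needs. This sketch encodes only the per-file size limits quoted above; the names are hypothetical, and interpreting MB and GB as decimal units (10^6 and 10^9 bytes) is an assumption, since the exact byte limits are not given.

```python
from typing import Optional

MB = 1_000_000  # assumption: decimal megabytes

# Per-file size limits, smallest tier first.
PLAN_LIMITS = {
    "Free": 10 * MB,
    "Pro": 100 * MB,
    "Pro-media": 500 * MB,
    "Developer": 2000 * MB,
}


def smallest_plan(size_bytes: int) -> Optional[str]:
    """Return the first plan whose limit fits the file, or None if none does."""
    for plan, limit in PLAN_LIMITS.items():
        if size_bytes <= limit:
            return plan
    return None


if __name__ == "__main__":
    for size in (5 * MB, 150 * MB, 3000 * MB):
        print(size, smallest_plan(size))
```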
Can I script the entropy pre-filter ahead of my AV pipeline?
Yes. GET /api/v1/tools/entropy-analyzer returns the schema, and the paired @jadapps/runner runs the same computation locally, returning highEntropyChunks, total, and threatDetected. Gate your AV deep-scan queue on threatDetected to prioritise the packed-looking samples — all without the file touching JAD servers.
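The gating step itself is independent of how the results are produced. The sketch below assumes you have already collected one result per sample as a plain dictionary with the highEntropyChunks, total, and threatDetected fields; it does not call the runner or the API, whose interfaces are not shown here. The function name `prioritize` and the sample names are hypothetical.

```python
def prioritize(results: dict[str, dict]) -> list[str]:
    """Order sample names for the AV deep-scan queue.

    Flagged samples come first; within each group, a higher share of
    high-entropy chunks sorts earlier. Names break ties for determinism.
    """
    def sort_key(name: str):
        result = results[name]
        total = result["total"]
        fraction = result["highEntropyChunks"] / total if total else 0.0
        return (not result["threatDetected"], -fraction, name)

    return sorted(results, key=sort_key)


if __name__ == "__main__":
    # Hypothetical results keyed by sample name.
    collected = {
        "a.exe": {"highEntropyChunks": 12, "total": 100, "threatDetected": False},
        "b.exe": {"highEntropyChunks": 91, "total": 100, "threatDetected": True},
        "c.exe": {"highEntropyChunks": 60, "total": 100, "threatDetected": True},
    }
    print(prioritize(collected))
```

Sorting rather than filtering keeps the unflagged samples in the queue, which matters because low entropy does not mean a file is safe.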
Privacy first
Every JAD Security operation runs entirely in your browser. Files, passwords, and PGP private keys never leave your device — verified by zero outbound network requests during processing.