How to use the Comment Extractor for security & compliance
- Step 1: Stage the artifact locally. Pull the ZIP from your evidence store, artifact registry, or pipeline output to the analyst machine. The tool reads from disk via the File API; nothing transits a network.
- Step 2: Hash before you touch it. For chain of custody, generate a baseline hash first with the Checksum Generator. Reading the comment does not modify the file, but a before-hash documents the artifact's exact state.
- Step 3: Open the Comment Extractor. Visit /archive-tools/comment-extractor and drop the ZIP. Confirm it is actually a ZIP: a 7z/RAR sample returns no comment because it has no EOCD record.
- Step 4: Record the comment and metrics. Capture the comment text plus the Comment length and Has comment metrics. Save the <archive>-comment.txt output into the evidence package.
- Step 5: Corroborate, do not trust. The comment is unauthenticated text. Cross-check any claimed build/commit/signer against your real provenance: use Archive Signing Info for signatures and Archive Diff to compare a suspect artifact against a known-good one.
- Step 6: Document the finding. Note in the ticket that the comment is an indicator, not proof, alongside the checksums and signature results. The read was done locally with no upload; record that for the data-handling log.
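The read-and-record steps above can be cross-checked locally. The following is a minimal sketch, assuming Python's standard `zipfile` module as a stand-in for the browser tool. The function names, the dict keys, and the choice to measure length in bytes are illustrative assumptions, not the tool's actual implementation.

```python
import io
import zipfile
from pathlib import Path


def read_zip_comment(source) -> dict:
    """Read the global (EOCD) comment from a ZIP path or file-like object."""
    with zipfile.ZipFile(source, mode="r") as zf:  # opened read-only
        raw = zf.comment  # bytes; b"" when no comment is set
    return {
        "comment": raw.decode("utf-8", errors="replace"),
        "comment_length": len(raw),  # assumption: length counted in bytes
        "has_comment": len(raw) > 0,
    }


def evidence_filename(archive_path: str) -> str:
    """Mirror the <archive-stem>-comment.txt naming described in the text."""
    return f"{Path(archive_path).stem}-comment.txt"


# Demo on an in-memory archive so the sketch runs without a real evidence file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, mode="w") as zf:
    zf.writestr("README.txt", "demo")
    zf.comment = b"build=ci-7741 commit=a91f0c2"
buf.seek(0)

result = read_zip_comment(buf)
print(result)
print(evidence_filename("customer-release-2.3.0.zip"))
```

Comparing this output with the tool's text file gives a second, independent reading of the same bytes for the evidence package.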
What the comment can and cannot tell you
Treat the EOCD comment as an unauthenticated indicator. Corroborate every claim it makes.
| Signal in comment | Audit value | How to corroborate |
|---|---|---|
| Build ID / commit hash | Links artifact to a CI run | Match against CI logs; diff contents with Archive Diff |
| Version / release tag | Confirms intended release | Compare to release manifest / changelog |
| Signer / key note | Hints at who built it | Verify the actual signature with Archive Signing Info |
| Timestamp | When it was built | Compare to entry timestamps; normalise with Timestamp Normalizer |
| Empty / no comment | Neutral — common for hand-zipped files | Not suspicious by itself; rely on hashes and signatures |
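To work through the table systematically, the comment can be split into individual claims first. This is a sketch only: it assumes the comment uses space-separated `key=value` tokens, as in the examples on this page, and the `CORROBORATE_WITH` mapping and its key names are hypothetical.

```python
def parse_comment_claims(comment: str) -> dict:
    """Split a 'key=value key=value' comment into a dict of unverified claims."""
    claims = {}
    for token in comment.split():
        key, sep, value = token.partition("=")
        if sep and key:  # ignore tokens that are not key=value pairs
            claims[key] = value
    return claims


# Hypothetical mapping from claim key to the corroboration step in the table.
CORROBORATE_WITH = {
    "build": "match against CI logs",
    "commit": "match against CI logs",
    "version": "compare to release manifest / changelog",
    "signer": "verify the actual signature",
    "built": "compare to entry timestamps",
}

claims = parse_comment_claims("build=ci-7741 commit=a91f0c2 built=2026-06-08T09:14Z")
for key, value in claims.items():
    step = CORROBORATE_WITH.get(key, "no rule; review manually")
    print(f"{key}={value!r}: {step}")
```

Every value in `claims` stays an unverified claim until the matching corroboration step has been done.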
Evidence-friendly output
The tool produces a single plain-text artifact suitable for an evidence package.
| Property | Value |
|---|---|
| Processing location | Browser tab — no upload |
| Output type | txt (plain text) |
| Filename | <archive-stem>-comment.txt |
| Metrics captured | Comment length, Has comment |
| Modifies the source? | No — read-only |
| Authenticated? | No — comment is unsigned text |
Tier limits for analyst use
The Free tier handles most artifacts. The caps apply to the whole archive rather than the tiny comment record, so they only bite on very large bundles.
| Tier | Max file size | Files per run |
|---|---|---|
| Free | 50 MB | 1 |
| Pro | 500 MB | 20 |
| Pro-Media | 2 GB | 100 |
| Developer | 2 GB | Unlimited |
Cookbook
Audit-shaped reads. The comment is an indicator; each example shows the corroboration step that turns it into evidence.
Confirming a release artifact's build provenance
An auditor receives a release ZIP and needs to tie it back to a CI build. The comment carries the build ID and commit — read it, then match against the CI log.
Input: customer-release-2.3.0.zip
Output file: customer-release-2.3.0-comment.txt
build=ci-7741 commit=a91f0c2 built=2026-06-08T09:14Z
Audit step:
→ confirm ci-7741 exists in pipeline history
→ confirm a91f0c2 is the tagged release commit
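The audit step can be sketched as a lookup against an export of pipeline history. `PIPELINE_HISTORY` and `corroborate` are hypothetical names, and the single-entry history is made up for illustration; a real check would query your CI system.

```python
# Hypothetical export of pipeline history: build ID -> commit it was built from.
PIPELINE_HISTORY = {"ci-7741": "a91f0c2"}


def corroborate(claims: dict, history: dict) -> list:
    """Return audit findings for the build and commit claimed in a comment."""
    build = claims.get("build")
    if build is None:
        return ["no build claim in comment; rely on hashes and signatures"]
    if build not in history:
        return [f"{build} NOT in pipeline history; escalate"]
    findings = [f"{build} exists in pipeline history"]
    commit = claims.get("commit")
    if commit is not None:
        if history[build] == commit:
            findings.append(f"{commit} matches the commit recorded for {build}")
        else:
            findings.append(f"{commit} does NOT match the commit for {build}; escalate")
    return findings


print(corroborate({"build": "ci-7741", "commit": "a91f0c2"}, PIPELINE_HISTORY))
print(corroborate({"build": "ci-9999", "signer": "release-bot"}, PIPELINE_HISTORY))
```

A match here only shows the claim is consistent with CI records. It does not authenticate the artifact, so hashes and signatures still need checking.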
Tamper check: comment claims a build that does not exist
A suspect artifact claims a build ID, but no such build is in the CI history. Because the comment is unauthenticated, treat this as a red flag to escalate, not as proof on its own.
Output file: suspect-package-comment.txt
build=ci-9999 signer=release-bot
Audit step:
→ ci-9999 NOT in pipeline history
→ escalate; compare contents with a known-good build via /archive-tools/archive-diff
→ comment is forgeable text, not evidence of legitimacy
Reading an incident sample without uploading it
An incident-response analyst must inspect a quarantined ZIP without sending it to any service. Browser-local reading keeps the sample on the air-gapped workstation.
Input: quarantine/sample-4412.zip (handled locally only)
Output file: sample-4412-comment.txt
(this archive has no global comment)
Note for IR log: comment absent, not informative; rely on hashes, YARA, and content review. No upload performed.
Non-ZIP evidence sample
A 7z evidence file has no EOCD, so the tool reports no comment. This is expected and should be recorded, not treated as a failure.
Input: evidence-bundle.7z
Output file: evidence-bundle-comment.txt
(this archive has no global comment)
Note: 7z has no ZIP EOCD comment field. Use /archive-tools/archive-metadata-extractor for 7z metadata.
Chain-of-custody read with before/after hashes
To prove the artifact was unchanged by the analysis, hash before and after. Reading the comment is read-only, so the hashes match.
1) Checksum Generator → baseline SHA-256: 3f9a...c21
2) Comment Extractor → release-comment.txt: build=4821 ...
3) Checksum Generator → after SHA-256: 3f9a...c21
→ identical hashes confirm read-only handling for the log
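The same before/after pattern can be reproduced on the analyst machine. This is a minimal sketch using Python's standard `hashlib` and `zipfile` modules in place of the Checksum Generator and Comment Extractor; the function names and the throwaway demo archive are assumptions for illustration.

```python
import hashlib
import os
import tempfile
import zipfile


def sha256_of(path: str) -> str:
    """Stream the file through SHA-256 so large artifacts are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_comment_with_custody(path: str) -> dict:
    """Hash, read the comment, hash again, and report whether the file changed."""
    before = sha256_of(path)
    with zipfile.ZipFile(path, mode="r") as zf:  # read-only open
        comment = zf.comment.decode("utf-8", errors="replace")
    after = sha256_of(path)
    return {"before": before, "after": after,
            "unchanged": before == after, "comment": comment}


# Demo on a throwaway archive; substitute the staged artifact path in practice.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "release.zip")
    with zipfile.ZipFile(demo, mode="w") as zf:
        zf.writestr("app.txt", "payload")
        zf.comment = b"build=4821"
    record = read_comment_with_custody(demo)
    print(record)
```

Logging both digests, rather than only the `unchanged` flag, lets a reviewer re-verify the claim later against the stored artifact.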
Edge cases and what actually happens
Comment is unauthenticated text
Not proof: Anyone can write any string into a ZIP comment with a one-liner. A build ID or signer name in the comment is an indicator only. Never treat it as evidence of authenticity; corroborate with signatures (Archive Signing Info) and content diffs (Archive Diff).
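To make the forgeability concrete, here is a sketch using Python's standard `zipfile` module, chosen only as one example of a tool that can write the field. The build ID and signer name are the made-up values from the tamper-check example, not real identifiers.

```python
import io
import zipfile

# Build a small archive in memory, then stamp it with arbitrary claims.
buf = io.BytesIO()
with zipfile.ZipFile(buf, mode="w") as zf:
    zf.writestr("payload.txt", "anything")
    # The "one-liner": no signing key or real CI build is needed to write this.
    zf.comment = b"build=ci-9999 signer=release-bot"

# Any reader will faithfully report the forged text.
with zipfile.ZipFile(buf, mode="r") as zf:
    forged = zf.comment.decode("utf-8")

print(forged)
```

Nothing in the archive format ties the comment to the entries, which is why the claim has to be checked against signatures and content diffs.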
Non-ZIP evidence sample (7z, rar, tar)
Expected: The global comment is a ZIP EOCD feature. A non-ZIP sample returns (this archive has no global comment). Record this as 'not applicable', not as a finding. Use the Archive Metadata Extractor for those formats.
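Because the notice is the same for "ZIP without a comment" and "not a ZIP at all", it can help to tell the two apart before writing the report. This is a hedged sketch using `zipfile.is_zipfile` from Python's standard library; `describe_sample` and the fake 7z bytes are illustrative assumptions and say nothing about how the tool itself decides.

```python
import io
import zipfile

NO_COMMENT_NOTICE = "(this archive has no global comment)"


def describe_sample(sample) -> str:
    """Classify a sample so the report can say 'not applicable' for non-ZIP input."""
    if not zipfile.is_zipfile(sample):
        return "not applicable: no ZIP EOCD record found"
    with zipfile.ZipFile(sample, mode="r") as zf:
        raw = zf.comment
    return raw.decode("utf-8", errors="replace") if raw else NO_COMMENT_NOTICE


# Stand-in for a 7z file: the 7z magic bytes followed by padding.
fake_7z = io.BytesIO(b"7z\xbc\xaf\x27\x1c" + b"\x00" * 64)

# A genuine ZIP that simply has no comment.
plain_zip = io.BytesIO()
with zipfile.ZipFile(plain_zip, mode="w") as zf:
    zf.writestr("a.txt", "x")

print(describe_sample(fake_7z))
print(describe_sample(plain_zip))
```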
No comment present
Neutral: Most hand-zipped or Explorer-created archives have no comment. Absence of a comment is not suspicious by itself; a legitimate artifact may simply not stamp one. Base conclusions on hashes and signatures, not the comment field.
Comment claims a signature
Verify separately: A comment like signed-by=... is just text; it is not the signature. Verify the actual signature with the Archive Signing Info tool. A comment that claims a signature when no real signature exists is a red flag.
Reading must not alter the artifact
Read-only: The Comment Extractor only reads the EOCD; it never writes back to the file. To document this for chain of custody, hash with the Checksum Generator before and after; the hashes will match.
Mojibake in the comment
Encoding note: If the comment shows replacement characters (U+FFFD), it was probably written in a non-UTF-8 encoding. Note the raw-byte caveat in your report; the displayed text is the tool's UTF-8 interpretation of the original bytes.
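For the report, it helps to keep the raw bytes next to the decoded text. This sketch assumes the tool decodes as UTF-8 with replacement, as described above; `describe_comment_bytes` and its keys are hypothetical names.

```python
def describe_comment_bytes(raw: bytes) -> dict:
    """Return the UTF-8 view of a comment plus the raw bytes as hex for the report."""
    try:
        raw.decode("utf-8")  # strict decode: fails on any invalid byte sequence
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {
        "text": raw.decode("utf-8", errors="replace"),  # invalid bytes become U+FFFD
        "valid_utf8": valid_utf8,
        "raw_hex": raw.hex(),  # lossless record of the original bytes
    }


# A comment written as Latin-1: the byte 0xFC is not valid UTF-8 here.
info = describe_comment_bytes("für".encode("latin-1"))
print(info)
```

The hex string preserves the evidence exactly, so a later reviewer can retry the decode with a different encoding.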
Artifact larger than the tier cap
Rejected: The Free tier rejects files over 50 MB before reading. Large release bundles may need Pro (500 MB) or a higher tier. The comment itself is tiny; the cap exists because the whole file is loaded into browser memory.
Self-extracting installer wrapping a ZIP
Often readable: SFX bundles prepend a stub to the ZIP payload but keep the EOCD near the end, so the comment usually reads. Treat any embedded provenance string the same way: indicator, corroborate separately.
Corrupt or tampered tail
May read empty: If the EOCD region is damaged or overwritten, the scan may find no signature and report no comment. For a suspected-tampered artifact, that itself is worth noting; consider Corrupted ZIP Repair to assess the damage.
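A backward EOCD scan explains both the SFX case and the damaged-tail case. The sketch below follows the ZIP format's record layout (4-byte signature `PK\x05\x06`, 22 fixed bytes, comment length at offset 20) but is not the tool's actual code. Requiring the record to end exactly at end of file is a strictness assumption made here; a real reader may be more lenient.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_FIXED_SIZE = 22        # bytes before the variable-length comment
MAX_COMMENT_SIZE = 0xFFFF   # the comment length field is 16 bits


def find_eocd_comment(data: bytes):
    """Scan backwards for the EOCD record; return the comment bytes or None."""
    window_start = max(len(data) - (EOCD_FIXED_SIZE + MAX_COMMENT_SIZE), 0)
    pos = data.rfind(EOCD_SIGNATURE, window_start)
    while pos != -1:
        if pos + EOCD_FIXED_SIZE <= len(data):
            (comment_len,) = struct.unpack_from("<H", data, pos + 20)
            end = pos + EOCD_FIXED_SIZE + comment_len
            if end == len(data):  # assumption: record runs exactly to end of file
                return data[pos + EOCD_FIXED_SIZE:end]
        pos = data.rfind(EOCD_SIGNATURE, window_start, pos)  # try an earlier match
    return None


buf = io.BytesIO()
with zipfile.ZipFile(buf, mode="w") as zf:
    zf.writestr("a.txt", "x")
    zf.comment = b"build=ci-7741"
archive = buf.getvalue()

print(find_eocd_comment(archive))                     # intact archive
print(find_eocd_comment(b"MZ-stub-bytes" + archive))  # stub prepended, SFX-style
print(find_eocd_comment(archive[:-40]))               # tail cut off
```

Because the scan starts from the end of the file, bytes prepended by an SFX stub do not affect it, while losing the tail removes the only record that holds the comment.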
Frequently asked questions
Is the comment a trustworthy provenance signal?
Only as an indicator. The EOCD comment is unauthenticated text that any tool can write. Use it to point your investigation, then corroborate every claim with checksums (Checksum Generator), signatures (Archive Signing Info), and content diffs (Archive Diff).
Does analysing the artifact upload it anywhere?
No. The file is read in your browser tab and never uploaded. The trust boundary is the same as running a local CLI — useful for sensitive or regulated samples.
Can I use this on a 7z or RAR evidence file?
It will report no comment, because 7z and RAR have no ZIP EOCD. That is expected. Use the Archive Metadata Extractor for metadata from those formats.
Does reading the comment modify the artifact?
No. The tool is read-only — it reads the EOCD record and never writes back. Hash before and after with the Checksum Generator to prove it for chain of custody.
Is this suitable for HIPAA / PCI / FedRAMP environments?
Because nothing transits a network, the regulated boundary does not move when you read a comment locally. Confirm with your compliance team that local browser processing matches your data-handling policy — most treat it like a local CLI.
What if the comment names a signer?
Treat the named signer as a claim, not a fact. Verify the real signature with Archive Signing Info. A comment naming a signer with no matching signature is suspicious.
Can multiple analysts use it at once?
Yes — each browser tab is an independent instance with no shared state. Free-tier limits apply per session; higher tiers raise size and batch limits.
What output do I attach to a ticket?
A plain-text file, <archive>-comment.txt, containing the comment string (or the no-comment notice), plus the Comment length and Has comment metrics you can quote in the report.
How big an archive can I read?
Free: 50 MB. Pro: 500 MB. Pro-Media and Developer: 2 GB. The comment record is tiny; the cap is purely about loading the file into browser memory.
What does an empty comment mean?
Usually nothing — most archives have no global comment. Absence is neutral. Do not read it as evidence of tampering; rely on hashes and signatures for that.
Can the comment be longer than I see?
No. The EOCD comment-length field is 16 bits wide, which caps the comment at 65,535 bytes. If you expected more provenance data, it is stored elsewhere, not in the comment.
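The cap is in bytes, not characters, so multi-byte text reaches it sooner. Here is a small sketch of the arithmetic, assuming the comment is UTF-8 encoded; `fits_in_comment` is a hypothetical helper.

```python
MAX_COMMENT_BYTES = 2**16 - 1  # largest value a 16-bit length field can hold


def fits_in_comment(text: str) -> bool:
    """Check whether text, encoded as UTF-8, fits within the EOCD comment cap."""
    return len(text.encode("utf-8")) <= MAX_COMMENT_BYTES


print(MAX_COMMENT_BYTES)
print(fits_in_comment("a" * 65535))  # exactly at the cap
print(fits_in_comment("a" * 65536))  # one byte over
print(fits_in_comment("é" * 40000))  # 40,000 characters but 80,000 bytes
```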
How do I compare a suspect artifact to a known-good one?
Use the Archive Diff tool to compare entry-by-entry. Combine that with matching checksums for a solid tamper assessment — far stronger than the comment alone.
Privacy first
Every JAD Archive tool runs entirely in your browser using fflate, @zip.js/zip.js, and the libarchive WASM bridge. Your archives never leave your device — verified by zero outbound network requests during processing.