GitHub: droidsaw/droidsaw-apk
# droidsaw-apk
APK and AAB parser plus security analysis for the [droidsaw](https://github.com/droidsaw/droidsaw) workspace. Decodes the container and walks every layer inside it — `AndroidManifest.xml` (binary XML), `resources.arsc`, ELF shared libraries, APK signing blocks (v1 PKCS#7 + v2 / v3 / v4), v4 sidecars, AAB protobuf, baseline profiles, network security config, raw assets — and emits typed `Finding` records. Pure Rust, BSD-3-Clause.
## Signing analysis
`signing_info()` returns a `SigningInfo` carrying every scheme present:
| Scheme | Parser | Verifier |
|---|---|---|
| v1 (JAR / PKCS#7) | `signing::SigningInfo::from_pkcs7` | full CMS `SignedData` validation: cert chain, SHA-1 / SHA-256 / SHA-384 / SHA-512 digests, SignedAttributes canonical sort per RFC 5652 §5.4 |
| v2 | `signing::detect_signing_block` | block ID `0x7109871a`, digest verification per AOSP, signer cert chain |
| v3 | same path, block ID `0xf05368c0` | rotation lineage; v3 → v2 → v1 cert pinning |
| v4 | `signing::SigningInfo::attach_v4_sidecar` | `.apk.idsig` Merkle-tree root + signer cert validation |
| source stamp | source-stamp parser | pubkey binding cross-checked against `META-INF/com.android.stamp.{type,source}` |
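As a rough illustration of the block-ID lookup in the v2 / v3 rows, the sketch below walks the ID-value pairs of an APK Signing Block that ends at the central-directory offset, following the layout published in the AOSP signature scheme documentation. The function names (`signing_block_ids`, `scheme_name`) are hypothetical and are not the crate's `signing::detect_signing_block` API; the sketch only collects IDs and verifies nothing.

```rust
use std::convert::TryFrom;

/// Trailing magic of the APK Signing Block.
const MAGIC: &[u8; 16] = b"APK Sig Block 42";
/// Block IDs quoted in the table above.
const V2_BLOCK_ID: u32 = 0x7109_871a;
const V3_BLOCK_ID: u32 = 0xf053_68c0;

fn read_u64_le(buf: &[u8], off: usize) -> Option<u64> {
    let end = off.checked_add(8)?;
    let bytes = <[u8; 8]>::try_from(buf.get(off..end)?).ok()?;
    Some(u64::from_le_bytes(bytes))
}

fn read_u32_le(buf: &[u8], off: usize) -> Option<u32> {
    let end = off.checked_add(4)?;
    let bytes = <[u8; 4]>::try_from(buf.get(off..end)?).ok()?;
    Some(u32::from_le_bytes(bytes))
}

/// Hypothetical helper: returns the IDs of every ID-value pair in the signing
/// block that sits immediately before `cd_offset`, or `None` if no well-formed
/// block is found there.
fn signing_block_ids(apk: &[u8], cd_offset: usize) -> Option<Vec<u32>> {
    // Footer: [size: u64][magic: 16 bytes], ending at the central directory.
    let magic_start = cd_offset.checked_sub(16)?;
    if apk.get(magic_start..cd_offset)? != &MAGIC[..] {
        return None;
    }
    let footer_size_off = magic_start.checked_sub(8)?;
    let size = usize::try_from(read_u64_le(apk, footer_size_off)?).ok()?;
    // `size` covers the pairs plus the footer (8 + 16), not the leading size field.
    if size < 24 {
        return None;
    }
    let block_start = cd_offset.checked_sub(size)?.checked_sub(8)?;
    if usize::try_from(read_u64_le(apk, block_start)?).ok()? != size {
        return None; // leading and trailing size fields must agree
    }
    let mut pos = block_start.checked_add(8)?;
    let mut ids = Vec::new();
    while pos < footer_size_off {
        // Each pair: [len: u64][id: u32][value: len - 4 bytes].
        let len = usize::try_from(read_u64_le(apk, pos)?).ok()?;
        if len < 4 {
            return None;
        }
        let id_off = pos.checked_add(8)?;
        let next = id_off.checked_add(len)?;
        if next > footer_size_off {
            return None;
        }
        ids.push(read_u32_le(apk, id_off)?);
        pos = next;
    }
    Some(ids)
}

/// Maps a block ID to the scheme it announces, for the two IDs in the table.
fn scheme_name(id: u32) -> Option<&'static str> {
    match id {
        V2_BLOCK_ID => Some("v2"),
        V3_BLOCK_ID => Some("v3"),
        _ => None,
    }
}
```

Every offset computation goes through `checked_*` and `.get(..)`, so a truncated or hostile block yields `None` rather than a panic.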
`signing/verify.rs` carries the algorithm dispatcher: RSA (PKCS#1 v1.5 + PSS) over SHA-2 family, ECDSA on P-256 / P-384, DSA-SHA2-256 (block algo `0x0301`). RustCrypto verifiers only — keygen and signing paths are not present.
`signing/subject.rs` parses subject DNs per RFC 4514 with full `\X` escape handling (`,`, `=`, `\`, `+`, `<`, `>`, `#`, `;`, leading-`#`, trailing space).
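To make the escape forms concrete, here is a minimal sketch of decoding one RFC 4514 attribute value: a backslash is followed either by a special character or by a two-digit hex pair. `unescape_dn_value` is a hypothetical name, not the function in `signing/subject.rs`, and the sketch does not reject unescaped special characters the way a full parser would.

```rust
/// Characters that RFC 4514 allows after a backslash as a single-character escape.
fn is_special(b: u8) -> bool {
    matches!(
        b,
        b'\\' | b'"' | b'+' | b',' | b';' | b'<' | b'>' | b' ' | b'#' | b'='
    )
}

/// Value of one ASCII hex digit, or `None` if `b` is not a hex digit.
fn hex_val(b: u8) -> Option<u8> {
    match b {
        b'0'..=b'9' => Some(b - b'0'),
        b'a'..=b'f' => Some(b - b'a' + 10),
        b'A'..=b'F' => Some(b - b'A' + 10),
        _ => None,
    }
}

/// Hypothetical helper: decodes the backslash escapes in one attribute value
/// and returns the raw bytes, or `None` on a dangling or malformed escape.
fn unescape_dn_value(input: &str) -> Option<Vec<u8>> {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0usize;
    while let Some(&b) = bytes.get(i) {
        if b != b'\\' {
            out.push(b);
            i = i.checked_add(1)?;
            continue;
        }
        let first = *bytes.get(i.checked_add(1)?)?;
        if is_special(first) {
            // `\,` `\+` `\#` and friends: keep the character itself.
            out.push(first);
            i = i.checked_add(2)?;
        } else {
            // `\XX`: two hex digits encode one raw byte.
            let second = *bytes.get(i.checked_add(2)?)?;
            let hi = hex_val(first)?;
            let lo = hex_val(second)?;
            out.push(hi * 16 + lo); // hi, lo <= 15, so this fits in a u8
            i = i.checked_add(3)?;
        }
    }
    Some(out)
}
```

For example, `unescape_dn_value(r"Acme\, Inc.")` yields the bytes of `Acme, Inc.`, while a trailing lone backslash yields `None`.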
`signer_allowlist.rs` carries the known-signer table that gates `NO_LAUNCHER_INTENT_FILTER` reclassification — a deep-link tool signed by a known vendor that ships without a launcher activity does not generate the same finding as an unknown package missing one.
### Cryptographic factoring attacks
`signing/factoring.rs` runs four independent attacks against RSA moduli extracted from the cert chain via `signing::extract_key_material`:
- **ROCA fingerprint** (CVE-2017-15361) — Infineon RSALib weak-key detector via the Howgrave-Graham generator-set check; prime table capped at 271.
- **Fermat factoring** — close-prime modulus detection, configurable iteration bound.
- **Wiener's attack** — small-`d` RSA private-exponent vulnerability (acts on `(N, e)`).
- **Batch-GCD** — pairwise (`batch_gcd`) for small key corpora, product-and-remainder tree (`batch_gcd_quasilinear`) for large corpora.
The factoring stack is independent of the parse path: the caller collects `KeyMaterial` from any number of APKs, then runs the batch attack across the whole set.
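A minimal sketch of the pairwise variant on toy `u64` moduli follows. It assumes nothing about the crate's `batch_gcd` signature or its `KeyMaterial` type; `pairwise_shared_factors` is a hypothetical name, and real RSA moduli need a big-integer type.

```rust
/// Euclid's algorithm.
fn gcd(mut a: u64, mut b: u64) -> u64 {
    while b != 0 {
        let r = a % b;
        a = b;
        b = r;
    }
    a
}

/// Hypothetical helper: returns `(i, j, g)` for every pair of moduli whose
/// gcd `g` exceeds 1. Two keys that share a prime are both factored by `g`;
/// two identical moduli report the modulus itself.
fn pairwise_shared_factors(moduli: &[u64]) -> Vec<(usize, usize, u64)> {
    let mut hits = Vec::new();
    for (i, &a) in moduli.iter().enumerate() {
        for (j, &b) in moduli.iter().enumerate().skip(i + 1) {
            let g = gcd(a, b);
            if g > 1 {
                hits.push((i, j, g));
            }
        }
    }
    hits
}
```

The pairwise loop is quadratic in the number of keys, which is the reason a product-and-remainder tree is the usual choice for large corpora.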
## Container analysis
| Surface | Module | What lands |
|---|---|---|
| binary XML decoder | `binary_xml.rs` | string pool with lossy-flag preservation, element tree with depth + breadth caps |
| `AndroidManifest.xml` | `manifest.rs` | permissions, components, intent filters, deep links, exported-component risk profile |
| `resources.arsc` | `resources.rs` | resource table, `ResTable_config` parsing, resource shadowing detection, type-id zero rejection |
| AAB protobuf | `aab.rs` | base + feature modules, lossy-attribute preservation, XML node recursion-depth cap |
| baseline profiles | `baseline_profile.rs` | `.prof` / `.profm` parsing with zlib-bomb cap |
| v4 sidecar | `v4_sidecar.rs` | `.apk.idsig` Merkle root |
| ELF native libs | `elf.rs` | hardening flags, JNI exports, `.init_array`, RWX segments, imported symbols, section-header bounds |
| `META-INF/MANIFEST.MF` | `manifest_mf.rs` | base64 attribute decoding with positional-`=` validation |
| network security config | `network_security.rs` | cleartext, pinning, trust anchors |
| accessibility XML | `accessibility.rs` | declared accessibility service config |
| asset entropy | `asset_format.rs` + `apk/asset_entropy_audit.rs` | per-asset Shannon entropy + format classification, encrypted-asset fingerprints |
| obfuscation metrics | `obfuscation.rs` | R8 / ProGuard / commercial-shielding signal |
| SBOM / SDK inventory | `sdk_inventory.rs`, `firebase.rs`, `backend_vendor.rs` | bundled-SDK + backend-vendor inventory |
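For the asset-entropy row, the quantity involved is the standard Shannon entropy of the byte histogram, in bits per byte. The sketch below is a plain version of that formula; `entropy_bits_per_byte` is a hypothetical name and is not the `shannon_entropy` re-exported from `droidsaw-common`.

```rust
/// Shannon entropy of `data` in bits per byte: 0.0 for constant input,
/// 8.0 when all 256 byte values are equally frequent.
fn entropy_bits_per_byte(data: &[u8]) -> f64 {
    if data.is_empty() {
        return 0.0;
    }
    // Histogram of byte values.
    let mut counts = [0u64; 256];
    for &b in data {
        counts[usize::from(b)] += 1;
    }
    let total = data.len() as f64;
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / total;
            -p * p.log2()
        })
        .sum()
}
```

Values close to 8.0 are typical of compressed or encrypted payloads, so entropy alone cannot tell the two apart; that is presumably why the table pairs it with format classification.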
### ZIP-structural anomalies
`apk/anomaly.rs` carries the EOCD-class detectors. APPNOTE.TXT §4.4.21–§4.4.24 are the reference sections:
- `DoubleEocd { count, offsets }` — more than one End-of-Central-Directory record present.
- `SuspiciousEocdComment` — EOCD comment field carries unexpected bytes (packer smuggle vector).
- `NonZeroDiskNumber` — disk-number / disk-with-CD / disk-entries fields non-zero on a single-disk archive.
The fixtures `packer_double_eocd`, `packer_eocd_comment_false_positive`, `packer_eocd_disk_number_nonzero`, and `packer_eocd_magic_flood` pin the detector behavior.
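The detectors above can be approximated with a signature scan plus a fixed-offset field read. The sketch below uses the EOCD layout from APPNOTE.TXT (22-byte minimum record, little-endian fields). The names and the exact single-disk criteria are assumptions, not the logic in `apk/anomaly.rs`. A raw signature scan also matches the magic bytes inside entry data or comments, so a real detector has to validate each candidate.

```rust
const EOCD_MAGIC: [u8; 4] = [0x50, 0x4b, 0x05, 0x06];
const EOCD_MIN_LEN: usize = 22;

/// Every offset at which the EOCD signature bytes occur.
fn eocd_candidate_offsets(buf: &[u8]) -> Vec<usize> {
    let mut offsets = Vec::new();
    for (i, window) in buf.windows(4).enumerate() {
        if window == &EOCD_MAGIC[..] {
            offsets.push(i);
        }
    }
    offsets
}

fn read_u16_le(buf: &[u8], off: usize) -> Option<u16> {
    let b = buf.get(off..off.checked_add(2)?)?;
    Some(u16::from_le_bytes([*b.first()?, *b.get(1)?]))
}

#[derive(Debug, PartialEq)]
struct EocdDiskFields {
    disk_number: u16,
    disk_with_cd: u16,
    entries_on_disk: u16,
    entries_total: u16,
}

impl EocdDiskFields {
    /// Assumed single-disk criteria: both disk numbers are zero and the
    /// per-disk entry count equals the total entry count.
    fn looks_single_disk(&self) -> bool {
        self.disk_number == 0
            && self.disk_with_cd == 0
            && self.entries_on_disk == self.entries_total
    }
}

/// Reads the four disk-related fields of the EOCD record at `eocd_off`.
fn parse_disk_fields(buf: &[u8], eocd_off: usize) -> Option<EocdDiskFields> {
    let end = eocd_off.checked_add(EOCD_MIN_LEN)?;
    let record = buf.get(eocd_off..end)?;
    if record.get(..4)? != &EOCD_MAGIC[..] {
        return None;
    }
    Some(EocdDiskFields {
        disk_number: read_u16_le(record, 4)?,
        disk_with_cd: read_u16_le(record, 6)?,
        entries_on_disk: read_u16_le(record, 8)?,
        entries_total: read_u16_le(record, 10)?,
    })
}
```

More than one offset from `eocd_candidate_offsets` is only a hint toward the `DoubleEocd` case, and a `false` from `looks_single_disk` is a hint toward `NonZeroDiskNumber`.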
## YARA scanning
`yara_scan.rs` is the YARA-X (Rust port) integration. Four bundled rule packs under `rules/`:
- `antianalysis.yar`
- `credential.yar`
- `crypto.yar`
- `packer.yar`
Surface:
- `bundled_rules()` / `bundled_credential_rules()` — embedded rule sets.
- `compile_rules_str_restricted()` — directive-policy enforcer for caller-supplied rules.
- `load_rules_from_dir()` — load and compile a custom rule directory.
- `scan_bytes_with_scanner_budgeted()` — scan with a `YaraMatchBudget` cap; emits an overflow `Finding` on truncation.
`yara_verify.rs` carries provenance-aware false-positive suppression — a match inside a known-vendor SDK or a known-clean library namespace does not generate a finding. `detect_library_namespaces` uses Aho-Corasick (single pass over the APK buffer).
## Architecture
```
src/
├── apk/            orchestrates per-entry audits (security, anomaly, classifier, JNI, …)
├── signing/        der_walker · factoring · keyinfo · oids · subject · types · verify
├── manifest.rs     AndroidManifest model
├── binary_xml.rs   AXML decoder
├── resources.rs    ARSC parser
├── elf.rs          ELF parser
├── aab.rs          AAB protobuf
├── yara_scan.rs · yara_verify.rs
└── …flat src/ layout, ~29 files
```
## Correctness
### No panics on adversarial input
The crate root denies the panic lint floor on every non-test module:
```rust
#![forbid(unsafe_code)]
#![cfg_attr(not(test), deny(
    clippy::unwrap_used, clippy::expect_used, clippy::panic,
    clippy::unreachable, clippy::todo, clippy::arithmetic_side_effects,
    clippy::indexing_slicing, clippy::string_slice, clippy::as_conversions,
    /* …17 more lints… */
))]
```
See `src/lib.rs` lines 72–117. Parser paths return typed `Err(ApkError)` (see `src/error.rs`); slice access goes through `.get(i).ok_or(…)`, arithmetic through `checked_*`. Enforced by `cargo clippy` and libFuzzer runs.
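The pattern described above (`.get(..).ok_or(…)` for slice access, `checked_*` for arithmetic) looks roughly like the sketch below. `ParseError` and both function names are stand-ins invented for the example; the crate's real error type is `ApkError`, whose variants are not shown here.

```rust
use std::convert::TryFrom;

/// Stand-in error type for this sketch only.
#[derive(Debug, PartialEq)]
enum ParseError {
    Truncated { offset: usize },
    OffsetOverflow,
}

/// Reads a little-endian u32 without indexing, slicing, or unchecked arithmetic.
fn read_u32_le(buf: &[u8], offset: usize) -> Result<u32, ParseError> {
    let end = offset.checked_add(4).ok_or(ParseError::OffsetOverflow)?;
    let bytes = buf
        .get(offset..end)
        .ok_or(ParseError::Truncated { offset })?;
    let arr = <[u8; 4]>::try_from(bytes).map_err(|_| ParseError::Truncated { offset })?;
    Ok(u32::from_le_bytes(arr))
}

/// Reads a u32 length prefix at `offset`, then returns that many payload bytes.
fn read_len_prefixed(buf: &[u8], offset: usize) -> Result<&[u8], ParseError> {
    let len = usize::try_from(read_u32_le(buf, offset)?)
        .map_err(|_| ParseError::OffsetOverflow)?;
    let start = offset.checked_add(4).ok_or(ParseError::OffsetOverflow)?;
    let end = start.checked_add(len).ok_or(ParseError::OffsetOverflow)?;
    buf.get(start..end)
        .ok_or(ParseError::Truncated { offset: start })
}
```

A length prefix that points past the end of the buffer surfaces as `Err(ParseError::Truncated { .. })` instead of an out-of-bounds panic, which is the behavior the lint set is meant to force.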
### Fuzz coverage
25 libFuzzer targets under `fuzz/fuzz_targets/` cover the parser surface:
- **Container & format parsers:** `fuzz_apk_parse`, `fuzz_xapk_parse`, `fuzz_aab_decode_xml_node`, `fuzz_binary_xml`, `fuzz_arsc`, `fuzz_baseline_profile`, `fuzz_v4_sidecar`
- **Signing & cryptography:** `fuzz_signing_block`, `fuzz_detect_signing_block`, `fuzz_v1_pkcs7`, `fuzz_subject_dn`
- **Native & DEX:** `fuzz_elf_parse`, `fuzz_apk_dex_extractor`, `fuzz_dex_sdk_versions_fallback`
- **Analysis & scanning:** `fuzz_obfuscation_analyze`, `fuzz_yara_prefilter`, `fuzz_yara_scan_with_skip`, `scan_tail`
- **Manifest & auxiliary:** `fuzz_manifest_heuristics`, `fuzz_parse_multiple`
- **Anomaly detection:** `fuzz_v1_orphan_zip_entry`
- **Differential & auxiliary:** `parser_differential_arsc`, `parser_differential_axml`, `parser_differential_axml_permissive_recovery_combinatorial`
- **Additional:** `fuzz_aab_decode_xml_node` (AAB XML, also listed above), `fuzz_v1_signing_garbage`, `fuzz_v4_signed_data_layout` via embedded harnesses
A new parser path lands with its fuzz target; corpus seeds tracked under `tests/fixtures/adversarial/` are regenerated by `tests/regenerate_fuzz_seeds.rs`.
### Fixture ratchet
`tests/fixtures/apk_matrix/` carries 28 named fixtures, including `aab_base_module`, `aab_feature_module`, `debuggable_true`, `exported_component_heavy`, `locale_fallback_manifest`, `manifest_not_first`, `many_permissions`, `minimal_valid`, `multi_native_libs`, `multidex`, `packer_aab_slash_bomb`, `packer_double_eocd`, `packer_eocd_comment_false_positive`, `packer_eocd_disk_number_nonzero`, `packer_eocd_magic_flood`, `packer_zip_comment_smuggle`, `unknown_chunk_arsc`, `v1_signed`, `v1_signed_signedattrs_unsorted`, `v1_signed_tampered`, `v1_signing_garbage`, `v4_signed`, `with_baseline_profile`, `with_dex`, `with_native_lib`, `with_raw_manifest`. The ratchet is `tests/fixture_ratchet.rs`; a regressed fixture is a build break.
Verify locally:
```
cargo test -p droidsaw-apk fixture_ratchet
cargo test -p droidsaw-apk typed_errors
```
### Kani harnesses
Seven proof files under `proofs/` with 21 total harnesses (gated `#[cfg(kani)]`):
- `aab_xml_depth.rs` — AAB protobuf XML node recursion depth cap.
- `der_set_member_cmp.rs` — DER SET member comparison correctness.
- `manifest_mf_base64_positional.rs` — `MANIFEST.MF` base64 trailing-`=` position invariant.
- `resources_type_id_nonzero.rs` — `resources.arsc` type-id zero rejection.
- `restableconfig_parse_bound.rs` — `ResTable_config` parse bytes-consumed bound.
- `subject_dn_escape.rs` — RFC 4514 §2.4 escape totality on bounded DN inputs.
- `v4_signed_data_layout.rs` — v4 signing block layout and data bounds verification.
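As an example of the kind of property such a harness can pin down, the sketch below states one plausible reading of the base64 positional-`=` invariant: padding appears only at the very end, there are at most two padding characters, and the padded length is a multiple of four. This reading and the name `padding_is_positional` are assumptions; the actual invariant proved in `manifest_mf_base64_positional.rs` may differ.

```rust
/// Returns true when every `=` in `encoded` sits in a legal trailing position
/// for padded base64. Alphabet validity is deliberately not checked here.
fn padding_is_positional(encoded: &[u8]) -> bool {
    if encoded.len() % 4 != 0 {
        return false;
    }
    // Count the run of '=' at the end of the input.
    let pad = encoded.iter().rev().take_while(|&&b| b == b'=').count();
    if pad > 2 {
        return false;
    }
    // No '=' may appear before that trailing run.
    let body_len = encoded.len().saturating_sub(pad);
    encoded.iter().take(body_len).all(|&b| b != b'=')
}
```

The function is total over all byte slices and uses no indexing, which keeps a bounded proof over it small.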
## Inputs
| Form | Entry point |
|---|---|
| `.apk` | `Apk::parse(path)` |
| `.xapk` | `Apk::parse_xapk(path)` |
| `.aab` (Play bundle) | `Apk::parse_aab(path)` |
| split APK set (base + config splits) | `Apk::parse_multiple(&paths)` |
DEX bytecode and Hermes bundles found inside APKs are exposed as `DexEntry` / `AabStringRecord`; decompilation of those layers lives in `droidsaw-dex` and `droidsaw-hermes`.
## Output
`Apk::parse` returns a typed [`Apk`] struct with parsed manifest, resource table, signing block, ELF metadata, DEX stats, asset entropy, and AAB string pool. `Apk::audit_security()` emits [`Finding`] records carrying severity, kind, layer, and detail. No `println!`, no panics on adversarial input.
## Performance
`Vec::with_capacity` across the APK parse and audit hot paths replaces `Vec::new()` on collections that grow to a known upper bound (entry count from the central directory, signer count from the signing block). Pre-allocation is gated behind `guard::bound_count` from `droidsaw-common`, which cross-checks the count against the slice's byte length before use.
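The idea behind gating pre-allocation can be sketched as follows: a declared element count is trusted only if the remaining input is long enough to hold that many minimum-size entries. `bounded_capacity` and `read_u16_entries` are hypothetical names with an assumed signature; they do not reproduce `guard::bound_count` from `droidsaw-common`.

```rust
/// Hypothetical guard: accepts `declared` only if that many entries of at
/// least `min_entry_size` bytes could fit in `remaining_bytes`.
fn bounded_capacity(
    declared: usize,
    remaining_bytes: usize,
    min_entry_size: usize,
) -> Option<usize> {
    if min_entry_size == 0 {
        return None;
    }
    let needed = declared.checked_mul(min_entry_size)?;
    if needed <= remaining_bytes {
        Some(declared)
    } else {
        None
    }
}

/// Reads `declared` little-endian u16 entries, pre-allocating only after the
/// count has been cross-checked against the buffer length.
fn read_u16_entries(buf: &[u8], declared: usize) -> Option<Vec<u16>> {
    let cap = bounded_capacity(declared, buf.len(), 2)?;
    let mut out = Vec::with_capacity(cap);
    for chunk in buf.chunks_exact(2).take(cap) {
        out.push(u16::from_le_bytes([*chunk.first()?, *chunk.get(1)?]));
    }
    Some(out)
}
```

Without the guard, a four-byte header claiming billions of entries would trigger a multi-gigabyte allocation before a single entry is read.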
## Workspace
This crate is a member of the `droidsaw` workspace. See the top-level droidsaw crate for the cross-layer pipeline (DEX → Java, Hermes → JavaScript, taint across the React Native bridge).
Public re-exports from `droidsaw-common`: `Finding`, `Layer`, `Severity`, `EntropyPattern`, `EntropyProfile`, `entropy_profile`, `shannon_entropy`.
## License
BSD-3-Clause.