Arizonal8/Keyword-Search-Regular-Expressions

# Week 5: Keyword Search and Regular Expressions

## Overview

This lab explores keyword searching across a forensic disk image using EnCase. Both RAW (direct disk) and Indexed searching strategies are applied and compared. The investigation searches for the string `55-501165 Digital Forensics` across multiple encoding formats, deleted data, system files, and embedded image metadata.

## Why This Is Necessary

Keyword searching is one of the fastest ways to locate relevant evidence on a large drive. Investigators use it to find incriminating communications, financial data, credentials, and other targeted content. Understanding the difference between RAW and Indexed search methods, and knowing where each fails, is essential for a thorough investigation.

## Tools Used

- **EnCase Forensic**: keyword and index search engine
- **Regular expression engine**: for pattern-based searching
- **ANSI and Unicode search support**: to cover multiple text encodings

## Key Concepts Covered

- Difference between an **item** (file containing a hit) and a **hit** (each individual occurrence of the keyword)
- Searching across multiple encodings: ANSI, UTF-8, UTF-16 LE/BE
- Finding keywords in deleted files, system files ($MFT), and image EXIF data
- RAW search: scans disk directly; must be repeated for every new keyword
- Indexed search: builds a searchable database; instant subsequent searches; does not search raw binary content of media files directly

## Investigation Findings

### Keyword Searched

`55-501165 Digital Forensics`

- Encodings enabled: ANSI Latin-1, Unicode
- Case sensitive: No

### Results

- **Items found:** 14
- **Hits found:** 21

### File Types Containing the Keyword

| File | Location | Type |
|------|----------|------|
| df-ansi.txt | D:\Windows\Keywords\ | ANSI text |
| df-utf-8.txt | D:\Windows\Keywords\ | UTF-8 text |
| df-utf8-bom.txt | D:\Windows\Keywords\ | UTF-8 with BOM |
| df-utf-16-le.txt | D:\Windows\Keywords\ | UTF-16 Little Endian |
| df-utf-16-be.txt | D:\Windows\Keywords\ | UTF-16 Big Endian |
| df-deleted.txt | D:\$Recycle.Bin\ | Deleted file (recovered) |
| df-erased.txt | Unallocated space | Erased but recoverable |
| CatwithEXIF.jpg | D:\Pictures\ | EXIF metadata |
| WinterHex.jpg | D:\Pictures\ | Embedded hex content |
| $MFT (multiple) | System | Master File Table entries |

### RAW vs Indexed Search Comparison

| Capability | RAW Search | Indexed Search |
|------------|-----------|----------------|
| New keyword speed | Slow (full disk re-scan) | Instant |
| Searches deleted data | Yes | Limited |
| Searches $MFT | Yes | No |
| Searches image EXIF | Yes | Yes (metadata) |
| Searches inside PDFs/DOCX | No (encoded content) | Yes |
| Searches raw image bytes | Yes | No |

## Screenshots

### Search Configuration

![Search Options](https://static.pigsec.cn/wp-content/uploads/repos/2026/05/9a25708df2191136.png)

*EnCase keyword search configuration showing ANSI and Unicode encodings selected for the search term '55-501165 Digital Forensics'.*

### Search Results Summary

![Items and Hits](https://static.pigsec.cn/wp-content/uploads/repos/2026/05/c88d25f2ea191140.png)

*Keyword Hits panel showing 14 items and 21 hits, demonstrating the distinction between files containing hits and individual hit occurrences.*

### Results Analysis

![Results Panel](https://static.pigsec.cn/wp-content/uploads/repos/2026/05/a88eac0263191143.png)

*Full results panel showing keyword hits across text files, image EXIF metadata, deleted files, and system structures including the $MFT.*

### RAW vs Indexed Comparison

![RAW vs Indexed](https://static.pigsec.cn/wp-content/uploads/repos/2026/05/82ac3da33f191147.png)

*Comparison of RAW and Indexed search results highlighting the differences in coverage and performance between the two search strategies.*

## Learning Outcomes

- Configure and execute a keyword search across multiple text encodings
- Differentiate between items and hits in search results
- Understand why both RAW and Indexed searches are needed in thorough investigations
- Locate keywords in deleted files, EXIF metadata, and system structures
- Know the limitations of each search method to avoid missing evidence
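
## Illustrative Sketch: Encodings, Items, and Hits

The item/hit distinction and the need to search several encodings can be illustrated outside EnCase with a short Python sketch. This is not how EnCase works internally; it is a minimal, assumed model of a raw byte-level search. The function names (`keyword_patterns`, `find_hits`, `summarize`) and the sample file names are hypothetical and exist only for this example. The keyword is encoded once per encoding, each byte sequence is searched case-insensitively, and files with at least one match are counted as items.

```python
import re

KEYWORD = "55-501165 Digital Forensics"

# Encodings from the lab, mapped to Python codec names.
CODECS = {
    "ANSI Latin-1": "latin-1",
    "UTF-8": "utf-8",
    "UTF-16 LE": "utf-16-le",
    "UTF-16 BE": "utf-16-be",
}


def keyword_patterns(keyword=KEYWORD):
    """Build one case-insensitive bytes regex per distinct byte sequence.

    For an ASCII-only keyword, Latin-1 and UTF-8 produce identical bytes,
    so they share a single pattern instead of being counted twice.
    """
    by_bytes = {}
    for label, codec in CODECS.items():
        by_bytes.setdefault(keyword.encode(codec), []).append(label)
    return {
        " / ".join(labels): re.compile(re.escape(raw), re.IGNORECASE)
        for raw, labels in by_bytes.items()
    }


def find_hits(data, keyword=KEYWORD):
    """Return a list of (offset, encoding label) hits in a bytes buffer.

    UTF-16 LE and BE patterns can both match the same text one byte apart
    when it is surrounded by zero bytes, so overlapping matches are
    collapsed into a single hit (the earlier offset wins).
    """
    candidates = []
    for label, pattern in keyword_patterns(keyword).items():
        for match in pattern.finditer(data):
            candidates.append((match.start(), match.end(), label))
    candidates.sort()

    hits, last_end = [], -1
    for start, end, label in candidates:
        if start >= last_end:
            hits.append((start, label))
            last_end = end
    return hits


def summarize(files, keyword=KEYWORD):
    """Return (items, hits): files with any hit, and total occurrences."""
    per_file = {name: find_hits(data, keyword) for name, data in files.items()}
    items = sum(1 for found in per_file.values() if found)
    hits = sum(len(found) for found in per_file.values())
    return items, hits


# Hypothetical in-memory "files" standing in for data on a disk image.
sample = {
    "ansi.txt": KEYWORD.encode("latin-1"),
    "utf16le.txt": b"\xff\xfe" + KEYWORD.encode("utf-16-le"),
    "twice.bin": b"\x01" + KEYWORD.lower().encode("utf-16-be")
                 + b"\xff" + KEYWORD.encode("utf-8"),
    "no-match.txt": b"nothing relevant here",
}

items, hits = summarize(sample)
print(f"{items} items, {hits} hits")  # 3 items, 4 hits
```

In this toy data set, `twice.bin` contributes one item but two hits, which mirrors why the lab reports more hits (21) than items (14). A search restricted to a single encoding would miss the UTF-16 files entirely, which is the reason both ANSI and Unicode were enabled in the search configuration.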