Phishing Triage Workflow

Threat Analysis Tool, Mar 2026

Python, BeautifulSoup, tldextract, dnspython, EML Parsing, Regex

Overview

Phishing Triage Workflow is an automated tool that streamlines the analysis of suspicious emails. It parses .eml files, extracts Indicators of Compromise (IOCs), scores risk with weighted heuristics (authentication failures, header mismatches, deceptive links, URL shorteners, and dangerous attachments), and generates structured reports for security analysts.

The primary goal of this project is to reduce the manual effort of initial phishing triage, so that responders can quickly identify high-risk emails and hand them off to automated containment actions.
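
The parsing step described above can be sketched with Python's standard `email` package. This is a minimal illustration, not the project's actual code: the function name `parse_eml` and the shape of the returned dictionary are assumptions made for the example.

```python
import hashlib
from email import message_from_bytes, policy


def parse_eml(raw: bytes) -> dict:
    """Parse raw .eml bytes into headers, bodies, and attachment metadata.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    msg = message_from_bytes(raw, policy=policy.default)
    result = {
        "subject": str(msg["Subject"] or ""),
        "from": str(msg["From"] or ""),
        "return_path": str(msg["Return-Path"] or ""),
        "text": "",
        "html": "",
        "attachments": [],
    }
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        if part.is_attachment():
            payload = part.get_payload(decode=True) or b""
            result["attachments"].append({
                "filename": part.get_filename(),
                "type": part.get_content_type(),
                "size": len(payload),
                "sha256": hashlib.sha256(payload).hexdigest(),
            })
        elif part.get_content_type() == "text/plain":
            result["text"] += part.get_content()
        elif part.get_content_type() == "text/html":
            result["html"] += part.get_content()
    return result
```

Calling `parse_eml(Path("suspicious_email.eml").read_bytes())` would then yield the headers, both body variants, and one metadata entry per attachment, which later stages can inspect.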

Features

- Email Parsing: Extract headers, body content (HTML and plain text), and attachment metadata from .eml files.
- IOC Extraction: Automatically identify URLs, domains, IP addresses, email addresses, and file hashes within the email.
- Risk Scoring: Weighted scoring system analyzing:
  - SPF/DKIM/DMARC authentication failures.
  - Header mismatches (e.g., From vs. Return-Path).
  - Mismatched anchor links (visible text vs. actual destination).
  - Presence of URL shorteners.
  - Dangerous attachment types (e.g., .exe, .vbs, .js).
- Multiple Input Support: Analyze single files, multiple files, or entire directories of samples.
- Actionable Reporting: Generate reports in Markdown or JSON format for integration with SOAR platforms or manual review.
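
The IOC extraction feature listed above can be approximated with a handful of regular expressions. The sketch below is an assumption about how such extraction might look, not the tool's real patterns. The names `extract_iocs`, `URL_RE`, and the other pattern constants are hypothetical, and domains are taken from `urllib.parse` rather than `tldextract` to keep the example dependency-free.

```python
import re
from urllib.parse import urlsplit

# Illustrative patterns only; real-world IOC regexes need more edge cases.
URL_RE = re.compile(r"""https?://[^\s<>"')\]]+""", re.IGNORECASE)
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
IPV4_RE = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
)
SHA256_RE = re.compile(r"\b[a-fA-F0-9]{64}\b")


def extract_iocs(text: str) -> dict:
    """Return de-duplicated, sorted IOCs found in a block of text."""
    # Strip punctuation that commonly trails a URL at the end of a sentence.
    urls = sorted({u.rstrip(".,;:!?") for u in URL_RE.findall(text)})
    hosts = {urlsplit(u).hostname for u in urls}
    return {
        "urls": urls,
        "domains": sorted(h for h in hosts if h),
        "emails": sorted(set(EMAIL_RE.findall(text))),
        "ipv4": sorted(set(IPV4_RE.findall(text))),
        "sha256": sorted(set(SHA256_RE.findall(text))),
    }
```

Sorting and de-duplicating the matches keeps reports stable between runs, which matters when the output is diffed or fed to another system.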

Usage

Analyze a Single Email

python main.py suspicious_email.eml

Analyze a Directory of Samples

python main.py samples/ -s --format markdown

Export JSON Report

python main.py phishing.eml --format json --output report.json
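
For readers curious how a command line like the ones above could be wired up, here is a hedged `argparse` sketch. It models only the positional inputs, `--format`, and `--output` shown in the examples. The `-s` flag is left out because its meaning is not stated here, and the default format, the recursive directory search, and the helper names `build_parser` and `expand_inputs` are all assumptions.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI definition mirroring the usage examples."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("inputs", nargs="+", type=Path,
                        help=".eml files or directories of samples")
    parser.add_argument("--format", choices=["markdown", "json"],
                        default="markdown")  # default is an assumption
    parser.add_argument("--output", type=Path, default=None,
                        help="write the report here instead of stdout")
    return parser


def expand_inputs(inputs: list[Path]) -> list[Path]:
    """Expand directories into the .eml files they contain."""
    paths: list[Path] = []
    for item in inputs:
        if item.is_dir():
            paths.extend(sorted(item.rglob("*.eml")))
        else:
            paths.append(item)
    return paths
```

With this layout, single files, several files, and whole directories all flow through the same `expand_inputs` call before analysis.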

Sample Output

The tool provides a clear verdict and risk score, highlighting the reasons for the classification:

# Phishing Triage Report

## Verdict: 🔴 Likely Phishing (Risk Score: 95/100)

### Key Reasons:
- SPF check failed (+30)
- DMARC check failed (+30)
- Found 2 mismatched anchor links (+40)
- Dangerous attachment type detected: invoice.exe (+50)
- URL shortener detected: bit.ly/fake-link (+20)

## Email Metadata
- **Subject**: Invoice #48291 Due Today
- **From**: billing@secure-invoice.net
- **Date**: Mon, 23 Mar 2026 10:30:00 +0000

## Authentication Results
- **SPF**: fail
- **DKIM**: unknown
- **DMARC**: fail

## Extracted IOCs
### URLs
- https://bit.ly/fake-link
- http://malicious-site.com/login

### ⚠️ Mismatched Links Found:
- Visible Text: `https://amazon.com/orders` -> Destination: `http://malicious-site.com/login`

### Attachments
- **Filename**: invoice.exe
- **Type**: application/x-msdownload
- **Size**: 156000 bytes
- **SHA256**: `deadbeef...`
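
The mismatched-link finding in the report above compares the URL a reader sees with the URL the anchor actually points to. The following sketch shows one way to do that. It uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra packages, and the names `AnchorCollector` and `find_mismatched_links` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class AnchorCollector(HTMLParser):
    """Collect (visible_text, href) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors.append(("".join(self._text).strip(), self._href))
            self._href = None


def _host(url: str) -> str:
    return (urlsplit(url).hostname or "").lower()


def find_mismatched_links(html: str) -> list[tuple[str, str]]:
    """Return anchors whose visible URL names a different host than the href."""
    collector = AnchorCollector()
    collector.feed(html)
    return [
        (text, href)
        for text, href in collector.anchors
        # Only anchors whose text itself looks like a URL can be "mismatched".
        if text.lower().startswith(("http://", "https://"))
        and _host(text) != _host(href)
    ]
```

Anchors with ordinary text such as "Click here" are ignored in this sketch, since there is no visible URL to compare against.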

Learning Outcomes

This project demonstrates a practical implementation of an automated threat analysis workflow.

Key skills developed:

- Advanced EML/MIME parsing in Python.
- Regular expression patterns for IOC extraction.
- Designing weighted risk-scoring algorithms.
- Automating security analyst workflows.
- Structured data reporting for security outcomes.
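
As an illustration of the weighted risk-scoring idea, the sketch below sums per-finding weights and clamps the total to 100. The weights are copied from the sample report, but the finding keys, the `score` function, and the clamping rule are assumptions. The sample report shows 95/100 even though its listed weights add up to more than 100, so the tool evidently combines them in some other way that is not documented here.

```python
# Weights taken from the sample report; the key names are hypothetical.
WEIGHTS = {
    "spf_fail": 30,
    "dmarc_fail": 30,
    "mismatched_links": 40,
    "dangerous_attachment": 50,
    "url_shortener": 20,
}


def score(findings: list[str]) -> tuple[int, list[tuple[str, int]]]:
    """Return (risk score, contributing reasons) for a list of finding keys.

    Unknown findings are ignored, and the total is clamped to 100
    (an assumption, not the tool's confirmed behavior).
    """
    reasons = [(name, WEIGHTS[name]) for name in findings if name in WEIGHTS]
    total = min(100, sum(weight for _, weight in reasons))
    return total, reasons
```

Returning the reasons alongside the score is what makes a "Key Reasons" section possible: each verdict can be traced back to the individual checks that produced it.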

Security Research Value

Automating phishing triage is a core component of modern Security Operations (SecOps). This tool helps in:

- Rapidly identifying large-scale phishing campaigns.
- Extracting IOCs for proactive blocking at the perimeter.
- Identifying common phishing techniques through consistent analysis.
- Training newer analysts by providing structured reasoning for phishing verdicts.