📊 Execcomp-AI Dashboard

AI-extracted executive compensation from SEC DEF 14A proxy statements (2005–2022)

GitHub 🤗 Dataset ⚡ Qwen3-VL-32B · MinerU
⚙️ How the pipeline works
SEC EDGAR → PDF Download → MinerU Extraction → Qwen3-VL-32B Classification & Parsing → Qwen3-VL-4B Verification → HF Dataset
1 · Vision Extraction: MinerU converts PDFs to structured images, preserving table layouts.
2 · Classification + Parsing: Qwen3-VL-32B identifies the Summary Compensation Table and parses it into typed JSON.
3 · Quality Filtering: A fine-tuned Qwen3-VL-4B assigns a confidence score (0–1) to each extracted table.
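As a rough illustration of steps 2 and 3, the sketch below shows what consuming the typed JSON and filtering on the confidence score might look like. The field names (`filing_id`, `confidence`, `rows`, `salary`, and so on), the helper names, and the 0.5 threshold are all assumptions made for this example; they are not the dataset's actual schema or the pipeline's real cutoff.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CompRow:
    """One executive-year row of a Summary Compensation Table (hypothetical schema)."""
    name: str
    year: int
    salary: Optional[float]
    bonus: Optional[float]
    stock_awards: Optional[float]
    total: Optional[float]


@dataclass
class ExtractedTable:
    filing_id: str      # hypothetical identifier for the source DEF 14A filing
    confidence: float   # verifier score, expected to lie in [0, 1]
    rows: List[CompRow]


def _num(value) -> Optional[float]:
    """Convert a JSON number to float, keeping missing cells as None."""
    return None if value is None else float(value)


def parse_table(raw_json: str) -> ExtractedTable:
    """Parse one extracted table from JSON into typed records."""
    data = json.loads(raw_json)
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    rows = [
        CompRow(
            name=r["name"],
            year=int(r["year"]),
            salary=_num(r.get("salary")),
            bonus=_num(r.get("bonus")),
            stock_awards=_num(r.get("stock_awards")),
            total=_num(r.get("total")),
        )
        for r in data["rows"]
    ]
    return ExtractedTable(data["filing_id"], confidence, rows)


def filter_confident(tables: List[ExtractedTable],
                     threshold: float = 0.5) -> List[ExtractedTable]:
    """Keep tables whose confidence meets the (assumed) threshold."""
    return [t for t in tables if t.confidence >= threshold]
```

Missing cells are kept as `None` rather than coerced to zero, so that a blank bonus column stays distinguishable from a reported bonus of 0.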
Compensation Breakdown
📋 Detailed Breakdown