thiagofsdata-collab/wyllo-fraud-pipeline
# Wyllo Return Fraud Pipeline





## Intro
Return fraud costs e-commerce roughly $100B per year, and **2% of customers
cause 20% of fraudulent returns** (NRF). The hard part isn't blocking
returns; it's **segmenting behaviour** so that legitimate occasional
returners stay unblocked. This pipeline produces the feature store that
makes that segmentation possible.

**As a Data Engineer:** we don't train the model. We deliver the
table where the decision becomes obvious.

                                             ┌───────────────────────────────────┐
    Olist CSVs → S3 → Bronze → Silver → Gold │ fct_customer_return_risk_features │
                                             │ PK: (customer, snapshot_date)     │
                                             └─────────┬─────────────────────────┘
                                                       │
                                  ┌────────────────────┼────────────────────┐
                                  ▼                    ▼                    ▼
                           Data Scientist        Fraud Analyst       Pipeline Health
                           (trains models)       (writes rules)    (Streamlit + Plotly)
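
The diagram gives the Gold table a primary key of `(customer, snapshot_date)`, which implies one row per customer per snapshot. As a minimal sketch of what that grain means in practice, the check below looks for duplicate keys. It is not taken from the pipeline itself: the column names `customer_id` and `returns_90d`, the helper `find_duplicate_keys`, and the sample rows are all hypothetical and only illustrate the idea.

```python
from collections import Counter
from datetime import date

# Illustrative rows only; the real table's columns are not shown in this README.
# `customer_id` and `returns_90d` are assumed names, not the pipeline's schema.
rows = [
    {"customer_id": "c1", "snapshot_date": date(2018, 1, 31), "returns_90d": 0},
    {"customer_id": "c1", "snapshot_date": date(2018, 2, 28), "returns_90d": 1},
    {"customer_id": "c2", "snapshot_date": date(2018, 1, 31), "returns_90d": 3},
]


def find_duplicate_keys(rows, key_cols=("customer_id", "snapshot_date")):
    """Return the key tuples that appear more than once in `rows`."""
    counts = Counter(tuple(row[col] for col in key_cols) for row in rows)
    return [key for key, n in counts.items() if n > 1]


duplicates = find_duplicate_keys(rows)
if duplicates:
    print(f"Primary-key violation for: {duplicates}")
else:
    print("Grain is unique per (customer, snapshot_date)")
```

A check like this could run after the Gold step, so that every downstream consumer can rely on the same grain.
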
## Architecture