NapoliData

The trajectory.

Seven employers and one independent practice, in reverse chronological order. Every entry maps to a LinkedIn role and a stack you can interrogate on a call.

NOV 2025 — PRESENT

NapoliData LLC · Independent

Senior Data & AI Agent Engineer. Independent contractor providing data engineering and AI agent development services to clients in regulated industries. Work involves LLM-based observability, multi-agent orchestration, retrieval-augmented systems, and VPC-private model inference for environments with data-residency or compliance requirements. Stack categories: AWS · Azure · Databricks · Snowflake · Anthropic ecosystem (including MCP) · Airflow on Kubernetes · Terraform · dbt. Engagements covered under NDA — architecture and outcomes discussable on a call.

JUN 2024 — NOV 2025

Proactiviti

Data Engineer. Distributed pipelines on AWS (Glue, Lambda, Step Functions, Athena) and Airflow on Kubernetes. Hybrid flows across Azure Data Factory, Synapse, and Databricks. −20% unplanned incidents through LLM-based monitoring agents. Tuned Redshift and PostgreSQL workloads.

JUL 2023 — JUN 2024

Fivvy

AWS Data Engineer. Modular ETL pipelines using Lambda, EMR, Glue, Step Functions, orchestrated with Airflow on Kubernetes. −30% S3 storage costs via lifecycle and partitioning. Tuned Redshift for analytical workloads.

MAR 2022 — JUL 2023

Aprende Institute

AWS BI Data Engineer. Cloud-native ETL with Glue, Lambda, Step Functions, Redshift. −35% processing costs · −50% manual reporting · +20% marketing ROI via predictive models, QuickSight automation, and centralized data lake.

JUL 2021 — MAR 2022

Johnson & Johnson · US Remote

Senior Data Engineer. Productionized ML models from the data science team: refactored Jupyter notebooks into production Python, deployed inference on AWS Lambda and Glue, and established version control (Bitbucket), unit testing (Flask), and monitoring (CloudWatch, EventBridge) for ML deployment artifacts. Built and maintained large-scale ETL pipelines on AWS (Lambda, Glue, EMR, Athena, S3, Redshift, Kinesis).

NOV 2020 — JUN 2021

Prisma Medios de Pago

Data Scientist Project Leader. Led Big Data & Analytics projects for Argentina's Visa acquirer, owning roadmap and delivery across fraud, risk, and merchant analytics use cases. Stack: S3 · Athena · PySpark · Docker.

FEB 2016 — OCT 2020

Banco Galicia

Data Analyst → Senior Data Scientist (Marketing & Credit Risk). Built propensity, cross-selling, and segmentation models across insurance and lending products; designed a lending qualification engine and a real-time recommendation system on Oracle; built a ReMarketing chatbot with NLP for intent detection. Measurable lift in sales and material reduction in call-center costs.

Case study overview →

NOV 2009 — FEB 2016

Banco Patagonia

Data Analyst (Credit Risk Management). Developed and maintained credit scoring models for retail and corporate portfolios over more than six years, supporting underwriting decisions on consumer loans, credit cards, and SME credit lines. Stack: Python · SPSS · SQL.

How I work.

Three recurring shapes of project. Each one: where the data comes from, what gets built, what ships.

01 · OBSERVABILITY · AI AGENTS

From log noise to actionable incidents.

−20% unplanned incidents

Proactiviti — verifiable on CV

Python · LangGraph · Anthropic ecosystem · Airflow on K8s · Cloud observability

▸ Input

Airflow task logs · cloud alarms · dbt test failures · Slack noise from on-call channels.

▸ Build

Multi-agent triage over normalized events. Correlation across sources, severity scoring, runbook retrieval via RAG. Built for environments with data-residency or compliance requirements where model inference must stay within client infrastructure.

▸ Output

Triaged incident notification with root-cause hint, linked runbook and suggested next action. Fewer pings, faster MTTR.
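For illustration only, the correlation and severity-scoring step named under Build could be sketched as below. This is a minimal sketch, not the production system: the `Event` fields, the five-minute window, and the level weights are all assumptions made up for the example, and no LLM or RAG call is shown.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical normalized event; field names are illustrative only.
@dataclass(frozen=True)
class Event:
    source: str    # e.g. "airflow", "cloud_alarm", "dbt", "slack"
    pipeline: str  # correlation key shared across sources
    ts: int        # epoch seconds
    level: str     # "info" | "warning" | "error"

# Assumed weights, not taken from any real deployment.
LEVEL_WEIGHT = {"info": 0, "warning": 1, "error": 3}

def correlate(events, window_s=300):
    """Group events per pipeline; a gap longer than window_s starts a new incident."""
    by_pipeline = defaultdict(list)
    for e in sorted(events, key=lambda e: e.ts):
        by_pipeline[e.pipeline].append(e)
    incidents = []
    for evs in by_pipeline.values():
        current = [evs[0]]
        for e in evs[1:]:
            if e.ts - current[-1].ts <= window_s:
                current.append(e)
            else:
                incidents.append(current)
                current = [e]
        incidents.append(current)
    return incidents

def severity(incident):
    """Sum of level weights, plus one point per additional corroborating source."""
    base = sum(LEVEL_WEIGHT[e.level] for e in incident)
    return base + len({e.source for e in incident}) - 1

events = [
    Event("airflow", "orders", 100, "error"),
    Event("dbt", "orders", 160, "error"),
    Event("slack", "orders", 5000, "info"),
    Event("cloud_alarm", "billing", 120, "warning"),
]
incidents = correlate(events)
ranked = sorted(incidents, key=severity, reverse=True)
```

Ranking by a score like this is one way to decide which incident gets a notification first; the Airflow and dbt errors on the same pipeline collapse into a single incident instead of two separate pings.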

02 · DATA ENGINEERING · CLOUD

Heterogeneous sources into a queryable warehouse.

−30% S3 · −35% processing

Fivvy & Aprende — verifiable on CV

AWS Glue · Step Functions · Databricks · Snowflake · dbt · Kafka · Azure ADF / Synapse

▸ Input

Postgres · S3 dumps · SaaS APIs · Kafka streams. Mixed schemas, mixed cadences, no shared dictionary.

▸ Build

Modular ETL/ELT on AWS (Glue, Lambda, Step Functions) or Azure (ADF, Synapse). Spark on EMR/Databricks for heavy load. dbt for modeling and tests. Cost & partitioning tuned from day one.

▸ Output

Curated tables in Redshift / Snowflake. BI-ready, SLA-tracked, documented. Storage costs cut by 30% and processing costs by 35% in the engagements cited above.
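To make "cost & partitioning" concrete, here is a small sketch of two common building blocks: a Hive-style date partition prefix for S3 objects, and a lifecycle rule that tiers and then expires old data. The function names, the `orders` table, and the 30/365-day thresholds are assumptions for illustration; the dictionary follows the general shape of an S3 lifecycle rule but is only built here, not applied to any bucket.

```python
from datetime import datetime, timezone

def partition_prefix(table: str, ts: datetime) -> str:
    """Hive-style date partition prefix, computed in UTC."""
    ts = ts.astimezone(timezone.utc)
    return f"{table}/year={ts:%Y}/month={ts:%m}/day={ts:%d}/"

def lifecycle_rule(prefix: str, ia_after_days: int = 30,
                   expire_after_days: int = 365) -> dict:
    """One rule: move objects under prefix to infrequent access, then expire them.

    The day thresholds are assumed defaults, not recommendations.
    """
    return {
        "ID": "tier-and-expire-" + prefix.strip("/").replace("/", "-"),
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": ia_after_days, "StorageClass": "STANDARD_IA"}],
        "Expiration": {"Days": expire_after_days},
    }

prefix = partition_prefix("orders", datetime(2024, 6, 3, 12, 0, tzinfo=timezone.utc))
rule = lifecycle_rule("raw/orders/")
```

Date partitions let query engines such as Athena prune whole prefixes instead of scanning everything, and lifecycle rules keep rarely read raw data from sitting in the most expensive storage class.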

03 · PREDICTIVE MODELING · MLOps

From raw transactions to scored populations.

+20% marketing ROI

Aprende — verifiable on CV · 10 yrs banking & insurance

Python · scikit-learn · MLflow · Embeddings + RAG · QuickSight / Power BI

▸ Input

Transactional warehouse · CRM attributes · behavioral signals · campaign history.

▸ Build

Feature engineering + classical models (logistic, GBM, RFM, unsupervised) or embeddings & RAG when text dominates. MLflow tracking, batch inference in production.

▸ Output

Scored customers piped to CRM, campaigns, credit decisions, churn watchlists. Decisions move from gut to evidence.
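As one example of the feature-engineering step, RFM (recency, frequency, monetary) values can be derived from a raw transaction list in a few lines. This is a sketch under stated assumptions: the `(customer_id, date, amount)` tuple layout and the sample rows are invented for the example, and real pipelines would typically compute this in SQL or Spark before binning the values into scores.

```python
from collections import defaultdict
from datetime import date

def rfm(transactions, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer_id, date, amount) tuples (assumed layout).
    as_of: reference date used to measure recency.
    """
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, day, amount in transactions:
        # Keep the most recent purchase date per customer.
        last_seen[customer_id] = max(last_seen.get(customer_id, day), day)
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        cid: ((as_of - last_seen[cid]).days, frequency[cid], monetary[cid])
        for cid in last_seen
    }

# Invented sample data.
tx = [
    ("a", date(2024, 1, 1), 10.0),
    ("a", date(2024, 1, 20), 5.0),
    ("b", date(2023, 12, 1), 100.0),
]
table = rfm(tx, as_of=date(2024, 1, 31))
```

The resulting triples can feed a segmentation directly or serve as inputs to a logistic or gradient-boosted model alongside CRM attributes.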

Stack I connect.

Not a laundry list. Tools I've shipped to production, grouped by where they fit in the data path.

Cloud · Pipelines

AWS Glue · Step Functions · Lambda · EMR · Athena · Redshift · Azure ADF · Synapse · Airflow on K8s · Kafka

Processing · Modeling

Python · PySpark · Databricks · Snowflake · dbt · Unity Catalog · Terraform · Delta Lake · Docker

AI · LLMs · Agents

Anthropic Claude · Amazon Bedrock · LangChain · LangGraph · Anthropic ecosystem (incl. MCP) · RAG · Ollama · Hugging Face · scikit-learn · TensorFlow / Keras · MLflow

BI · Delivery

QuickSight · Power BI · Tableau · Metabase

Highlighted = primary, used in production within the last 2 years. Rest = working knowledge, deployed in earlier roles.

17+ years in data. 5 industries, banking to healthcare.

What I write, I keep public.

Credentials.

Cloud · MLOps

AI · Anthropic ecosystem

Let's talk.