DATA ENGINEERING · AI ENGINEERING · CROSS-INDUSTRY

NapoliData

17+ years in data. 5 industries, banking to healthcare.

Senior Data & AI Agent Engineer.

// trajectory
Nov 2025 → NapoliData · Independent
2024 — 2025 · Proactiviti (Data & AI)
2023 — 2024 · Fivvy (Data analytics)
2022 — 2023 · Aprende (BI)
2021 — 2022 · Johnson & Johnson
2020 — 2021 · Prisma Medios de Pago
2016 — 2020 · Banco Galicia
2009 — 2016 · Banco Patagonia
17+ years · 5 industries

The trajectory.

Seven employers plus the current independent practice, in reverse chronological order. Every entry maps to a LinkedIn role and a stack you can interrogate on a call.
NOV 2025 — PRESENT
NapoliData LLC · Independent
Senior Data & AI Agent Engineer. Independent contractor providing data engineering and AI agent development services to clients in regulated industries. Work involves LLM-based observability, multi-agent orchestration, retrieval-augmented systems, and VPC-private model inference for environments with data-residency or compliance requirements. Stack categories: AWS · Azure · Databricks · Snowflake · Anthropic ecosystem (including MCP) · Airflow on Kubernetes · Terraform · dbt. Engagements covered under NDA — architecture and outcomes discussable on a call.
JUN 2024 — NOV 2025
Proactiviti
Data Engineer. Distributed pipelines on AWS (Glue, Lambda, Step Functions, Athena) and Airflow on Kubernetes. Hybrid flows across Azure Data Factory, Synapse, and Databricks. −20% unplanned incidents through LLM-based monitoring agents. Tuned Redshift and PostgreSQL workloads.
JUL 2023 — JUN 2024
Fivvy
AWS Data Engineer. Modular ETL pipelines using Lambda, EMR, Glue, Step Functions, orchestrated with Airflow on Kubernetes. −30% S3 storage costs via lifecycle and partitioning. Tuned Redshift for analytical workloads.
MAR 2022 — JUL 2023
Aprende Institute
AWS BI Data Engineer. Cloud-native ETL with Glue, Lambda, Step Functions, Redshift. −35% processing costs · −50% manual reporting · +20% marketing ROI via predictive models, QuickSight automation, and centralized data lake.
JUL 2021 — MAR 2022
Johnson & Johnson · US Remote
Senior Data Engineer. Productionized the data science team's ML models: refactored Jupyter notebooks into production Python, deployed inference on AWS Lambda and Glue, and established Bitbucket version control, Flask unit testing, and CloudWatch/EventBridge monitoring for ML deployment artifacts. Built and maintained large-scale ETL pipelines on AWS (Lambda, Glue, EMR, Athena, S3, Redshift, Kinesis).
NOV 2020 — JUN 2021
Prisma Medios de Pago
Data Scientist Project Leader. Led Big Data & Analytics projects for Argentina's Visa acquirer, owning roadmap and delivery across fraud, risk, and merchant analytics use cases. Stack: S3 · Athena · PySpark · Docker.
FEB 2016 — OCT 2020
Banco Galicia
Data Analyst → Senior Data Scientist (Marketing & Credit Risk). Built propensity, cross-selling, and segmentation models across insurance and lending products; designed a lending qualification engine and a real-time recommendation system on Oracle; built a ReMarketing chatbot with NLP for intent detection. Measurable lift in sales and material reduction in call-center costs.
NOV 2009 — FEB 2016
Banco Patagonia
Data Analyst (Credit Risk Management). Developed and maintained credit scoring models for retail and corporate portfolios over more than six years, supporting underwriting decisions on consumer loans, credit cards, and SME credit lines. Stack: Python · SPSS · SQL.

How I work.

Three recurring shapes of project. Each one: where the data comes from, what gets built, what ships.
01 · OBSERVABILITY · AI AGENTS
From log noise to actionable incidents.
−20% unplanned incidents
Proactiviti — verifiable on CV
Python · LangGraph · Anthropic ecosystem · Airflow on K8s · Cloud observability
▸ Input
Airflow task logs · cloud alarms · dbt test failures · Slack noise from on-call channels.
▸ Build
Multi-agent triage over normalized events. Correlation across sources, severity scoring, runbook retrieval via RAG. Built for environments with data-residency or compliance requirements where model inference must stay within client infrastructure.
▸ Output
Triaged incident notification with root-cause hint, linked runbook and suggested next action. Fewer pings, faster MTTR.
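A minimal sketch of the correlation and severity-scoring step described above, for illustration only and not the production code. The `Event` fields, the `resource` correlation key, the 10-minute window, and the `SOURCE_WEIGHT` values are all assumptions; the LLM agents and runbook retrieval are out of scope here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical severity weights per source; real values would be tuned per client.
SOURCE_WEIGHT = {"airflow": 3, "cloud_alarm": 4, "dbt_test": 2, "slack": 1}


@dataclass
class Event:
    source: str          # e.g. "airflow", "cloud_alarm", "dbt_test", "slack"
    resource: str        # assumed shared key, e.g. a pipeline or table name
    timestamp: datetime
    message: str


def correlate(events, window=timedelta(minutes=10)):
    """Group events on the same resource that arrive within `window` of the previous one."""
    incidents = []
    open_by_resource = {}
    for ev in sorted(events, key=lambda e: e.timestamp):
        group = open_by_resource.get(ev.resource)
        if group and ev.timestamp - group[-1].timestamp <= window:
            group.append(ev)
        else:
            group = [ev]
            open_by_resource[ev.resource] = group
            incidents.append(group)
    return incidents


def severity(incident):
    """Sum the weight of each distinct source, so cross-source agreement ranks higher."""
    return sum(SOURCE_WEIGHT.get(s, 1) for s in {ev.source for ev in incident})


def triage(events):
    """Return correlated incidents ordered from most to least severe."""
    return sorted(correlate(events), key=severity, reverse=True)
```

In this sketch an Airflow failure and a dbt test failure on the same resource a few minutes apart collapse into one incident that outranks an isolated Slack message, which is the "fewer pings" idea in miniature.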
02 · DATA ENGINEERING · CLOUD
Heterogeneous sources into a queryable warehouse.
−30% S3 · −35% processing
Fivvy & Aprende — verifiable on CV
AWS Glue · Step Functions · Databricks · Snowflake · dbt · Kafka · Azure ADF / Synapse
▸ Input
Postgres · S3 dumps · SaaS APIs · Kafka streams. Mixed schemas, mixed cadences, no shared dictionary.
▸ Build
Modular ETL/ELT on AWS (Glue, Lambda, Step Functions) or Azure (ADF, Synapse). Spark on EMR/Databricks for heavy load. dbt for modeling and tests. Cost & partitioning tuned from day one.
▸ Output
Curated tables in Redshift / Snowflake. BI-ready, SLA-tracked, documented. S3 storage costs down 30% (Fivvy) and processing costs down 35% (Aprende).
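A rough sketch of what "lifecycle and partitioning" can look like in code, under stated assumptions: Hive-style date partitions so query engines can prune by date, and a tiering rule for older objects. The function names, the prefix layout, and the day thresholds are hypothetical; the returned dict follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts for its `LifecycleConfiguration` argument.

```python
from datetime import date


def partition_key(dataset: str, event_date: date, filename: str) -> str:
    """Build a Hive-style object key (year=/month=/day=) for date-based partition pruning."""
    return (
        f"{dataset}/year={event_date.year}/month={event_date.month:02d}/"
        f"day={event_date.day:02d}/{filename}"
    )


def lifecycle_rules(prefix: str, warm_after_days: int = 30,
                    cold_after_days: int = 90, expire_after_days: int = 730) -> dict:
    """Return one tiering rule for `prefix`; thresholds are illustrative defaults."""
    return {
        "Rules": [
            {
                "ID": f"tier-{prefix.strip('/')}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Move objects to cheaper storage classes as they age.
                "Transitions": [
                    {"Days": warm_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": cold_after_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }
```

Applying it would be a single call along the lines of `s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=lifecycle_rules("orders/"))`, with the bucket name and prefix supplied by the project.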
03 · PREDICTIVE MODELING · MLOps
From raw transactions to scored populations.
+20% marketing ROI
Aprende — verifiable on CV · 10 yrs banking & insurance
Python · scikit-learn · MLflow · Embeddings + RAG · QuickSight / Power BI
▸ Input
Transactional warehouse · CRM attributes · behavioral signals · campaign history.
▸ Build
Feature engineering + classical models (logistic, GBM, RFM, unsupervised) or embeddings & RAG when text dominates. MLflow tracking, batch inference in production.
▸ Output
Scored customers piped to CRM, campaigns, credit decisions, churn watchlists. Decisions move from gut to evidence.
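To make the RFM part of the feature-engineering step concrete, here is a small standard-library sketch. It assumes transactions arrive as `(customer_id, date, amount)` tuples and uses a simple rank-based binning in place of true quantiles; both choices, and all function names, are illustrative rather than the approach used in any specific engagement.

```python
from collections import defaultdict
from datetime import date


def rfm_features(transactions, as_of):
    """Aggregate (customer_id, date, amount) rows into recency, frequency, monetary."""
    last_seen, frequency, monetary = {}, defaultdict(int), defaultdict(float)
    for customer_id, tx_date, amount in transactions:
        last_seen[customer_id] = max(last_seen.get(customer_id, tx_date), tx_date)
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        c: {
            "recency_days": (as_of - last_seen[c]).days,
            "frequency": frequency[c],
            "monetary": monetary[c],
        }
        for c in last_seen
    }


def rank_score(values, higher_is_better=True, bins=3):
    """Map each customer to a 1..bins score by rank (a simple stand-in for quantiles)."""
    ordered = sorted(values, key=values.get, reverse=not higher_is_better)
    return {c: 1 + (i * bins) // len(ordered) for i, c in enumerate(ordered)}


def rfm_scores(features, bins=3):
    """Sum the three per-dimension scores; a recent purchase (low recency) scores high."""
    r = rank_score({c: f["recency_days"] for c, f in features.items()},
                   higher_is_better=False, bins=bins)
    f = rank_score({c: v["frequency"] for c, v in features.items()}, bins=bins)
    m = rank_score({c: v["monetary"] for c, v in features.items()}, bins=bins)
    return {c: r[c] + f[c] + m[c] for c in features}
```

The resulting scores could feed a campaign list directly or serve as input features for a logistic or GBM model.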

Stack I connect.

Not a laundry list. Tools I've shipped to production, grouped by where they fit in the data path.

Cloud · Pipelines

AWS Glue · Step Functions · Lambda · EMR · Athena · Redshift · Azure ADF · Synapse · Airflow on K8s · Kafka

Processing · Modeling

Python · PySpark · Databricks · Snowflake · dbt · Unity Catalog · Terraform · Delta Lake · Docker

AI · LLMs · Agents

Anthropic Claude · Amazon Bedrock · LangChain · LangGraph · Anthropic ecosystem (incl. MCP) · RAG · pgvector · Ollama · Hugging Face · scikit-learn · TensorFlow / Keras · MLflow

BI · Delivery

QuickSight · Power BI · Tableau · Metabase

Highlighted = primary, used in production within the last 2 years. Rest = working knowledge, deployed in earlier roles.

What I write, I keep public.

Production-grade code in public repositories. Review it before you hire me.

Credentials.

Certifications I hold or am currently preparing for. Verifiable on issuer platforms.

Cloud · MLOps

AWS Certified Cloud Practitioner (CLF-C02)
Google Cloud: Machine Learning Operations (MLOps)

AI · Anthropic ecosystem

Claude Certified Architect Foundations (CCAF): in preparation, Q3 2026
Anthropic Academy (in progress): Claude in Amazon Bedrock · Claude with Google Vertex · MCP Advanced

Let's talk.

Three modes of working together:

PROJECT-BASED
Defined scope and timeline. Typically 4–12 weeks.
FRACTIONAL
Senior part-time capacity, embedded in your team.
ADVISORY
Architecture, code review, technical decisions.

Also available for adjacent verticals — payments, e-commerce data, logistics, and other data-heavy industries.

A 30-minute call to see if there's a fit. No pitch.