Hi, I'm

Luca

I build AI Solutions for Humans

Most AI engineers learn the tech first and pick up commercial instinct later. I did it the other way around — five years in performance marketing and analytics (Publicis, Decathlon, $10M+ ad spend) before going deep on ML/AI systems. Now I build and deploy production AI: from OCR pipelines and LLM agents to anomaly detection systems. Full applications, not just notebooks.

My Projects

LLM, OCR, PyQt, Hugging Face
MegaCat — AI Pipeline for a Museum Vinyl Collection
Shipped a full-stack AI pipeline to the Ethnographic Museum of Geneva (MEG). Tesseract OCR extracts text from LP cover scans, Llama 3 8B performs structured JSON inference via Hugging Face, and a PySide6 desktop UI gives curators a review-and-edit interface with Discogs lookup. Reduced cataloging from ~8 hours per handful of records to 50+ records per day.
Local LLM, Ollama, Pydantic, Streamlit
Agathe Agent — Local AI for French Speech Therapists
Local-only assistant that converts clinical session notes into structured French orthophonie reports (bilan initial and bilan de renouvellement). Runs Mistral Nemo 12B via Ollama — no cloud API, no telemetry. Pipeline includes Pydantic validation, DOCX export, and an optional self-check pass. Streamlit UI for non-terminal users.
Python, GCP, BigQuery, GitHub Actions
SITG Geodata ETL — ArcGIS REST to BigQuery
Production ETL pipeline ingesting the canton of Geneva's spatial data infrastructure (SITG) from ArcGIS REST FeatureServer endpoints into Google Cloud Storage and BigQuery. Materializes a wide analytical table used for building-level reporting. Automated via GitHub Actions, with per-layer fault isolation and psutil-based observability decorators. Still running for a client.
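A minimal sketch of what per-layer fault isolation with an observability decorator can look like. This is an illustration, not the production code: the layer names and the `ingest_layer`, `observed`, and `run_pipeline` helpers are hypothetical, and the decorator logs wall-clock time from the standard library as a stand-in for psutil metrics.

```python
import functools
import time


def observed(func):
    """Log how long each call takes (stdlib stand-in for psutil metrics)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            print(f"{func.__name__}{args} took {elapsed:.3f}s")
    return wrapper


@observed
def ingest_layer(layer_name):
    """Hypothetical placeholder for fetching and loading one layer."""
    if layer_name == "broken_layer":
        raise RuntimeError("endpoint returned an error")
    return f"{layer_name}: loaded"


def run_pipeline(layer_names):
    """Ingest every layer; a failure in one layer does not stop the others."""
    succeeded, failed = [], {}
    for name in layer_names:
        try:
            ingest_layer(name)
            succeeded.append(name)
        except Exception as exc:  # isolate the fault to this layer
            failed[name] = str(exc)
    return succeeded, failed


succeeded, failed = run_pipeline(["buildings", "broken_layer", "parcels"])
print("succeeded:", succeeded)
print("failed:", failed)
```

Collecting failures instead of raising on the first one lets a scheduled run load every healthy layer and report the broken ones at the end.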
RAG, LlamaIndex, Streamlit
MessyAI — Text-to-SQL RAG for SMB Analytics
MSc thesis project built for the IE Venture Lab. A RAG pipeline that translates natural language questions into SQL over messy multi-table CSVs. LlamaIndex QueryPipeline orchestrates table summarization, vector retrieval, GPT-4 SQL generation, and answer synthesis. Evaluated on five difficulty tiers with ~91% accuracy on the first three.
Scikit-learn, PCA, Anomaly Detection, Airflow
PCA Anomaly Detection — Merchandising Alerts
Production alerting system built at Tenerity. Rolling PCA monitors weekly sales, orders, and commissions for high-volume merchants and affiliates. Uses a T²-led multivariate score with persistence and promotion filters to cut false alarms. Low-volume partners get IQR-based rules. Output is a simple ALERT / OK / IN_PROMO label per week.
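A rough sketch of how an IQR-based rule for a low-volume partner could produce the weekly label. The Tukey fence multiplier of 1.5, the sample history, and the `iqr_bounds` and `label_week` helpers are assumptions for illustration; the production rules and thresholds are not shown here, and the rolling PCA path for high-volume partners is out of scope.

```python
from statistics import quantiles


def iqr_bounds(history, k=1.5):
    """Return (lower, upper) Tukey fences computed from past weekly values."""
    q1, _, q3 = quantiles(history, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def label_week(history, value, in_promo=False, k=1.5):
    """Label one week as IN_PROMO, ALERT, or OK."""
    if in_promo:
        # Assumed behaviour: promotion weeks are flagged, never alerted on.
        return "IN_PROMO"
    lower, upper = iqr_bounds(history, k)
    return "ALERT" if value < lower or value > upper else "OK"


# Hypothetical weekly sales for one low-volume partner.
history = [100, 104, 98, 101, 97, 103, 99, 102]

print(label_week(history, 100))                 # inside the fences
print(label_week(history, 160))                 # far above the upper fence
print(label_week(history, 160, in_promo=True))  # same spike during a promotion
```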
XGBoost, SciPy, Pandas, Quant Finance
Orderbook Market Simulation — Algo Trading
Synthetic limit order book generator for algorithmic trading research. XGBoost predicts next event type from lag features; Normal distributions model price returns per side; Gamma distributions model order volumes. MarketSim and SimulationRunner orchestrate repeatable synthetic sessions logged to CSV.
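A simplified sketch of the sampling idea, using only the standard library. It is not the MarketSim implementation: the event types, distribution parameters, and the `simulate_events` helper are hypothetical, and a fixed-weight random draw stands in for the XGBoost event-type model.

```python
import random

# Hypothetical parameters; in practice these would be fitted from real data.
RETURN_PARAMS = {"bid": (0.0, 0.0005), "ask": (0.0, 0.0005)}  # Normal (mean, std)
VOLUME_PARAMS = (2.0, 50.0)                                   # Gamma (shape, scale)
EVENT_TYPES = ["limit", "cancel", "market"]
EVENT_WEIGHTS = [0.6, 0.3, 0.1]  # stand-in for a trained classifier


def simulate_events(n_events, start_price=100.0, seed=0):
    """Generate a repeatable list of synthetic order book events."""
    rng = random.Random(seed)  # a fixed seed makes the session reproducible
    price = start_price
    events = []
    for step in range(n_events):
        side = rng.choice(["bid", "ask"])
        event_type = rng.choices(EVENT_TYPES, weights=EVENT_WEIGHTS)[0]
        mean, std = RETURN_PARAMS[side]
        price *= 1.0 + rng.gauss(mean, std)       # Normal price return per side
        volume = rng.gammavariate(*VOLUME_PARAMS)  # Gamma order volume
        events.append(
            {"step": step, "side": side, "type": event_type,
             "price": price, "volume": volume}
        )
    return events


session = simulate_events(5, seed=42)
for event in session:
    print(event)
```

Seeding a dedicated `random.Random` instance is one way to keep sessions repeatable without touching global random state.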

About Me

Applied Data Scientist and builder with 5+ years of experience translating ambiguous business problems into production ML/AI systems. I’ve managed $10M+ in annual ad spend, built profitability and Marketing Mix Models at Decathlon across 40+ countries, and completed an MSc in Data Science at IE University (3.9 GPA, ranked 6th of 200).

I focus on shipping full applications — from OCR pipelines and LLM agents to anomaly detection systems — not just notebooks. Currently at Tenerity, building production AI infrastructure for a global loyalty platform.

Here is my current tech stack:
Languages
Python, SQL, Bash
Engineering & Infra
FastAPI, Pydantic, Docker, Git, CI/CD, Linux, AWS, GCP (BigQuery), Databricks, MLflow, PyQt
AI/ML
PyTorch, Hugging Face, LangChain, LangGraph, LlamaIndex, RAG, Agentic Systems, Local LLMs (Ollama), OCR, Vector DBs (Milvus, Qdrant, Pinecone)
Data Science
Pandas, Scikit-learn, NumPy, wandb
Hobbies & Origins
🎸 Jazz & classical guitarist
🎧 Listening to music
🌍 Traveling the world
🧗🏼 Rock climbing
🍻 Drinks with friends
📚 Reading
Origins: 🇨🇭 🇮🇹 🇹🇹

Experience

Applied Data Scientist - Tenerity
Aug 2025 – present

Building production data science and AI infrastructure from the ground up.

  • Marketing Alerting System: PCA-based anomaly detection with email and dashboard delivery via Airflow DAG.
  • A/B Testing Automation: Python framework automating experiment setup, statistical testing, and reporting.
  • Data Analytics Agents: LLM-powered agents leveraging Skills/MCP for autonomous database and semantic layer queries.
  • Revenue Forecasting Model: Fused churn cohort signals with operational KPIs for comprehensive revenue planning.
Data Consultant - Self-Employed
Sept 2020 – Aug 2025
  • Real Estate ETL Pipeline: Automated client reporting with redeployment capability.
  • MegaCat (City of Geneva Museums): OCR + local LLM pipeline with PyQt desktop UI for vinyl record cataloging.
  • Speech-Therapist Reporting Agent: Agentic RAG system with local LLMs for privacy-compliant patient report generation.
Data Analyst (E-commerce Acquisition) - Decathlon
Jan 2022 – Sept 2023

Contributed to global data-driven marketing strategies, optimizing performance and ensuring data consistency across international teams.

  • RFM customer segmentation: +0.8% absolute conversion rate lift.
  • Predictive bidding models: 120% ad profitability increase in 3 months.
  • Marketing Mix Models: budget allocation and channel attribution guidance.
  • Global standardization of marketing data across 40+ countries.
Data Analyst (Digital Paid Media) - Publicis Media
Jan 2020 – Dec 2022

Led data-driven media strategies for Nestlé and L’Oréal, focusing on performance optimization at scale.

  • Technical advisor on $10M+ annual digital ad spend across social, search, and programmatic platforms.
  • Delivered actionable insights that improved campaign efficiency and ROI.

Education

Sept 2023 – July 2024
MSc in Data Science & Business Analytics
IE University – School of Science and Technology
GPA: 3.9 out of 4.0 (Dean’s List, ranked 6th)
  • Developed a Text-to-SQL Retrieval-Augmented Generation (RAG) system as my Master's thesis.
  • Earned distinction in courses such as Python for Data Analysis, Big Data Strategy, ML, and AI in Operations (time-series).
Sept 2017 – Jan 2020
BSc in Digital Marketing & Business Analytics
OMNES Education
  • Built a strong foundation in business analytics, digital marketing strategy, and performance metrics.
  • Gained practical experience in using data to inform marketing decisions across digital channels.

Let's get in touch!

My inbox is always open. Whether you have a question about my work, a new business idea, or a job opportunity, feel free to reach out! I'm happy to discuss new projects and collaborations.