Educated in several acronyms across the globe (UNISR, SFI, MIT), I am the co-founder of Bauplan, a data infrastructure company based in NYC and SF.
I was co-founder and CTO of Tooso, an AI startup providing search and recommendations to millions of users, before it was acquired by TSX:CVO. I led Coveo’s AI from scale-up to IPO, and built out Coveo Labs, an applied R&D practice rooted in open science: our libraries, models and datasets have collected thousands of stars and garnered millions of downloads.
Throughout my career, I have been fortunate enough to collaborate with incredible folks in industry (e.g. Netflix, NVIDIA) and academia (Stanford, Univ. of Wisconsin-Madison, Univ. of Chicago), and work on products spanning multiple fields: Information Retrieval, Data Science, Artificial Intelligence, Data Management, Computer Systems. My research contributions are often product-focused, and are memorable mostly for their titles (e.g. “Not all those who browse are lost”, “You don’t need a bigger boat”, “Mo’ models, mo’ problems”, “FaaS and Furious”).
While building my new startup, I moonlight as Adj. Professor of ML at NYU, which is only notable because it is the only job I ever had that my parents understand.
I occasionally share code, ideas and teaching materials. Selected projects, talks, papers and datasets are highlighted below.
I recently started investing in startups, both directly and as LP in AI funds: I’m always happy to chat with founders about DataOps, MLOps and AI.
I have done product-minded research in a (perhaps surprisingly) heterogeneous set of topics: Information Retrieval (e.g. RecSys, SIGIR), Machine Learning and model evaluation (WWW, NeurIPS), NLP (NAACL, ACL), data science (Nat. Sci. Rep., KDD), AI and Large Language Models (ICML), data management (SIGMOD, VLDB), human-machine computation (HCOMP), computer systems (WoSC 10). Our paper on cognitively-inspired query embeddings won the Best Paper Award at NAACL 21, and our talk on reproducible data science on data lakes won the Best Presentation Award at DEEM 24.
I have been co-organizer of SIGIR eCom (2022, 2023) and EvalRS (2022, 2023), Industry Sponsorship Chair for CIKM 2022, Industry Chair at UMAP 2025, and I have been involved in various organizational capacities in several top-tier research events (COLING, EMNLP, ACL, SIRIP, ECONLP, ECNLP).
As a true Santa Fe Institute alumnus, I am an old-fashioned generalist, and I have made tiny contributions to other fields mostly as an excuse to spend time with old friends: logic and computation, cellular automata, computational social sciences, networks, philosophy of mind, political science, digital ethics.
Finally, some of my projects have been patented, but to this day nobody seems to really know why.
In previous lives, I managed to get a Ph.D., simulate a pre-Columbian civilization, document biases in national elections and give an academic talk on videogames. Some of my improbable “achievements” received ample press coverage.
Having built end-to-end data pipelines at garage, growth and IPO scale, I happily shared all my mistakes in a series of articles that introduced the concept of Reasonable Scale.
Some time before Brad Pitt’s movie, I led one of the first attempts at running sophisticated analytics for a professional basketball team, and spearheaded the first data science effort on Milan’s bike-sharing service (no bikers or bureaucrats were harmed during the project).
The content of jacopotagliabue.it is released under the BY-NC-ND license; my chibi has been designed by the incredibly talented wisesnail.
Last update: June 2025.
I often get invited to talk about things I (sort of) know by friends in industry (e.g. Home Depot, Farfetch, eBay, Pinterest, Tubi) and academia (e.g. keynotes at KDD, SIGIR, RecSys, CiE, IEEE Cloud Summit).
My publication list is available on Google Scholar: links to selected projects, talks, papers and datasets are collected here for convenience.
Aside from research and tutorials, our datasets have been successfully used by dozens of master's students to defend their theses at Tilburg University and Politecnico in Milan.