Lessons from a long year of consulting in data and AI (2024 edition)
A short report from experiences in management boards, workshops and the trenches
Several weeks ago, I saw the field notes by
, in which he shared his experience in data this year. I decided to do the same. These observations come from many projects, workshops, interviews, conferences, and formal and informal conversations with people in all kinds of roles and at organizations of varying maturity. It’s not a representative sample, but there are valuable patterns. Since this post can seem quite negative (it’s just how I think, credit to Nassim Taleb), I will use the sandwich method: the bad things are sandwiched between the good.
The good: tools are nice
When I started in the early 2010s, we were still working with Hadoop, and deploying a model required a lot of custom work. Now we have nice things: MLflow, Databricks, AWS SageMaker. Documentation is (mostly) great, and it has never been easier to do “full-stack data science,” so much so that many have started to question whether this field, at least in traditional businesses, has turned into software engineering.
The bad: we keep reinventing the wheel
For people with over 10 years of experience, working in data feels like Groundhog Day. We have seen the same topics over and over again, regurgitated under a shiny new name. As a CTO, I was fortunate to be involved in many software projects beyond data, and I saw many patterns we could borrow. In the world of data, we try so hard to reinvent the wheel instead of taking what works and adjusting it for our context.
Worse: We keep ignoring the fundamentals
The most important thing is having good-quality data, but we do everything possible to avoid tackling this issue. Building a modern data platform or having a super-easy-to-use sandbox environment for your data scientists will not move the needle enough. People with domain knowledge are also still too far away from the implementation. The list goes on.
Even more bad: it’s harder to get nuggets of value from data and AI content
We all try to stay on top of what is going on and which new frameworks and tools are emerging. Social media used to have higher-quality content, but in the last year, it has gone south. My go-to places to get useful, relevant, and honest information are Substack, O’Reilly, and the various newsletters that were active before this hype cycle.
Really bad: we still don’t explain what exactly we DO as data people
As a CTO, I was rarely asked to explain what my frontend, backend, or BI teams were doing. With data and AI, it was a different story. We are still discussing how to measure the success of such projects and what value our expensive teams are providing. We have to do a better job at this.
But still, sparks of value: there are real, valuable, generative AI use cases
Now, let’s end on a positive note! This year, I saw many real-world generative AI use cases with clear value being developed and deployed. My initial skepticism has melted away, but it has made me even more focused on getting the fundamentals right.
With that, I wish you and your loved ones a happy holiday season! Let’s learn the lessons of this exciting year and make 2025 even better!
-Boyan
P.S. Finally, I have something very cool in store that can perhaps help with many of these issues. See you next year!