Data meets business: Lessons learned the hard way

Marko Aalto

October 20, 2023


Extracting value from data is a challenging task. The long and intricate journey from raw data to features serving users is riddled with surprises and obstacles. I've navigated this path multiple times and acquired my share of scar tissue.

After twenty years in software development, I've shifted my focus over the past ten years to data and AI projects. During this period, I've worn many hats, ranging from data engineer and data scientist to more advisory and consultative roles. I have tales to tell.

Identifying blockers helps you navigate past them

Everyone has heard and read about digital giants collecting and leveraging data in their operations. They draw insights from data, enable data-driven decision-making, and feed machine learning models to optimize their processes and guide end users. Appearances can be misleading. You might only see the tip of the iceberg and miss the vast databerg beneath the surface and the challenges of moving and molding it.

One of the most common blockers, often overlooked, is the discoverability of data. Many times, you’re not even aware that specific data exists, let alone how you could access it and quickly check whether it would help your project. If you’re missing or unaware of some key data assets, it's hard to run effective advanced analytics on your customers’ behavior. For example, product recommendations on e-commerce websites might benefit from customers' purchase history in brick-and-mortar stores.

After finding the data, the next blocker is access to it. There can be myriad reasons why the bits are not moving, ranging from networking to governance issues. A firewall might block your access, or you might lack the necessary permissions. In the worst-case scenario, it can take many tickets and tears to solve the issue, only to find the next blocker on your path: death by a thousand paper cuts.

Finally, you have the data, but you can't make sense out of it. Semantic understanding of data is crucial. Data is always specific to its domain context, and interpreting it requires domain knowledge. A database table's technical schema won't be much help if you lack the semantics regarding the column ch_cnt_next, which appears to have values of 1, 2, and 5. Business processes leave data trails. Understanding the business process in detail helps you identify these trails, interpret the story they tell, and see where the path leads.

Fallacies can derail your data and AI projects

The intersection of data and business is fraught with fallacies. People on both sides of the data-business fence know too little about the other side and often extrapolate too far based on their own previous experiences. Based on my involvement as a mediator and translator between data and business, I've repeatedly encountered certain misbeliefs. 

One common fallacy is the belief that business people can easily specify needs and features for a data team to implement. Business people seldom understand what data is available and how it could be utilized. Conversely, data teams often lack an understanding of the business domain. The business doesn’t know what is possible, while the data team doesn’t know what is actually needed.

Another misconception is that the failure of a data project is due to failed execution. In reality, most new data projects and ideas don't succeed. There are no silver bullets. Gold nuggets are rare, and you must process tons of dirt to find just a few. Your focus should be on scaling up the processing of dirt, not on selling gold. It’s baseball, not golf: you are a hero if you bat .300.

One prevailing myth is that data quality is paramount. While data quality is undeniably important, it isn't always critical. Sometimes, the allure of a "Golden schema" in data warehouses or master data management can significantly slow things down or even lead to roadblocks. Maintaining a perfect shared data schema for everything becomes ever more time-consuming and effortful, especially in a constantly changing business landscape populated by evolving distributed systems. (A caveat: yes, there is a case for master data management, but only for a small subset of core data, for example, customer master data.)

Things that actually matter in data projects

What are my tools of choice? How do I prefer to navigate the ever-increasing complexity in fast-paced businesses that are trying to leverage data, advanced analytics, and AI?

  • A self-serve data platform with access to all data. This empowers data wizards to shoot lightning from their fingertips. Things get done. Many more impromptu “what if”, “can we” and “do we have” questions get asked and answered. Hey, look, Ma, no tickets!
  • A data catalog for data discoverability. It’s big data for real, and you truly need both a map and a search box. Beyond just a technical schema, a data catalog is an excellent place to gather semantic knowledge about your data.
  • Fast and cheap ideation and prototyping. Delivering the first prototypes in a matter of days instead of months. Iteration speed is paramount, both for finding business cases with traction and for evolving them into successful products. Instead of seeking a single silver bullet, you equip yourself with a full box of ammo.
  • Meeting of minds: a business person with some understanding of data and a data professional with insights into business collaborate on a shared whiteboard and get creative. The innovative fusion of “what is possible” and “what is needed” is powerful, and that’s where magic happens.
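
To make the data catalog idea a bit more concrete, here is a minimal Python sketch of a catalog that stores semantic notes next to the technical schema and offers a simple search box. It is an illustration only, not a description of any particular catalog product. Every name in it (DatasetEntry, ColumnEntry, search, undocumented, the dataset names and the owners) is hypothetical; only the ch_cnt_next column and the in-store purchase history come from the examples earlier in this article.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnEntry:
    """One column: technical type plus a human-written meaning."""
    name: str
    dtype: str
    meaning: str = ""  # semantic note; empty until someone documents it


@dataclass
class DatasetEntry:
    """One dataset in the catalog, with an owner to ask about access."""
    name: str
    owner: str
    description: str
    columns: list[ColumnEntry] = field(default_factory=list)


def search(catalog, term):
    """Return names of datasets whose name, description, or column
    names/meanings contain the term (case-insensitive)."""
    needle = term.lower()
    hits = []
    for ds in catalog:
        texts = [ds.name, ds.description]
        for col in ds.columns:
            texts += [col.name, col.meaning]
        if any(needle in text.lower() for text in texts):
            hits.append(ds.name)
    return hits


def undocumented(catalog):
    """List 'dataset.column' for every column that has no semantic note."""
    return [
        f"{ds.name}.{col.name}"
        for ds in catalog
        for col in ds.columns
        if not col.meaning.strip()
    ]


# Hypothetical catalog content, made up for illustration.
catalog = [
    DatasetEntry(
        name="store_purchases",
        owner="retail-data-team",
        description="Purchase history from brick-and-mortar stores",
        columns=[
            ColumnEntry("customer_id", "string", "Loyalty programme customer identifier"),
            ColumnEntry("ch_cnt_next", "int"),  # meaning unknown: ask a domain expert
        ],
    ),
    DatasetEntry(
        name="web_orders",
        owner="ecommerce-team",
        description="Orders placed on the e-commerce website",
        columns=[
            ColumnEntry("customer_id", "string", "Loyalty programme customer identifier"),
        ],
    ),
]

print(search(catalog, "purchase"))  # ['store_purchases']
print(undocumented(catalog))        # ['store_purchases.ch_cnt_next']
```

The search function plays the role of the search box, and the undocumented helper points at the columns where semantic knowledge is still missing, which is a natural agenda for the next whiteboard session between business and data people.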

Even with quality data, sharp tools and a great team, success is not guaranteed. However, with every iteration, there's an opportunity to learn and improve. The more iterations and repetitions, the better — as they say, practice makes perfect.

I want to hear about your experiences. Let’s talk! marko.aalto@reaktor.com

Marko is a seasoned data & AI consultant who has helped teams and businesses tackle data challenges and harness emerging AI capabilities in real-life use cases, across various industries.
