We’re excited to kick off a new series, 5-Minute Fireside Chat with Lightup, where we engage with data leaders and experts in the data management field to discuss data trends, technology evolution, and insider insights about the world of Data Quality.
In our first quick and cozy 5-minute fireside chat, CEO and Co-founder of Lightup Manu Bansal and Olga Maydanchik, a data management strategist, discuss the importance of fine-grained Data Quality Checks.
Olga Maydanchik recently shared a LinkedIn post that caught our attention, where she posed a simple but important question, “One DQ Rule or Many?” We had to unpack this with her, so we invited her to be our first guest to share her experience and expand on her LinkedIn content.
Watch this 5-Minute Fireside Chat as Manu and Olga explore:
- The need for multiple checks at a fine resolution for accurate data quality
- The challenges of implementing multiple checks, including the effort required to create and maintain each rule
- The opportunity to write fine-grained checks that provide the resolution of many checks with the effort of just one, especially in large enterprise settings
Transcript
Manu: Hi there. I’m Manu Bansal, one of the co-founders and CEO of Lightup Data, a data quality platform. And I have a fantastic guest here today, Olga Maydanchik, who has been doing data quality and data management for a long time. I’m super excited to talk to her about fine-grained data quality checks. She recently wrote about this topic, which caught our attention.
And the problem was kind of posed as imagine you are a grocery chain, and you’re trying to detect issues with sales transactions or the count of those transactions.
Now, should you just write one check, or should you cut it finer and start to write checks maybe by store or maybe by product categories or different product lines or customer types and whatnot? And there’s a good argument to be made for both scenarios, but it sounds like in many scenarios, we just need many checks for it to be useful.
And we have seen a version of that problem multiple times, where you just need many checks at a fine resolution for data quality issues to be detected accurately.
Olga, welcome, I’m super excited to be discussing this scenario with you. You have been doing this a long time at leading organizations like Citibank and AIG and whatnot, and we have seen this in multiple different verticals.
I’m curious. How do you see this problem statement, and why did you choose to talk about it?
Olga: This is actually a very important statement because it’s not taught at school, and sometimes people simply don’t know. They just get the direction from the boss to write down the rules, and they are stuck because they don’t know what to do. So, there is no single correct answer. In the situation of the grocery store with different aisles, you definitely want to create separate rules.
The reason is you don’t want to miss a signal, and that’s number one. And second, you want to go to the right SME when you want to resolve the issues. Let me address the “not missing a signal” part. So for example, let’s say that there are miscellaneous items that are kind of not selling very well, and let’s say that there is a milk item that’s basically your bread and butter in terms of the monetary gain.
So you don’t want to create one rule and check those things together, because one of them will have way too few transactions compared to the other one. That’s why you are not going to catch the signal if something happens to the items that are not really selling fast and often.
So this is definitely a situation for many rules. The situation for one rule is when it’s very simple. Let’s say a customer must have a birthday. In this case, you probably wanna create one rule and check the results of this rule. You might discover that some of your customers are organizations, and they are not supposed to have a birthday. And some of your customers are people, and those are the true errors. So in this case, you wanna create one rule, examine the errors, then literally exclude the organizations, and still probably stick to one rule because you never know what other situation you might get. But in a situation where you’re looking for the signal that something is wrong, you definitely want several rules broken down by channels, by brands, by any other things that you could find.
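To make the milk versus miscellaneous example concrete, here is a minimal Python sketch. The category names, the daily counts, the 20% threshold, and the helper `flag_drops` are all illustrative assumptions, not figures or code from the conversation. It shows how a single check over the combined count can stay quiet while a per-category check catches the drop.

```python
def flag_drops(baseline, today, max_drop=0.2):
    """Return the keys whose count fell by more than max_drop versus baseline."""
    flagged = []
    for key, expected in baseline.items():
        observed = today.get(key, 0)
        if expected > 0 and (expected - observed) / expected > max_drop:
            flagged.append(key)
    return sorted(flagged)


# Hypothetical daily transaction counts per product category.
baseline = {"milk": 1000, "miscellaneous": 20}
today = {"milk": 1000, "miscellaneous": 2}  # miscellaneous sales collapsed

# One rule over everything: the small category is drowned out by milk.
aggregate = flag_drops(
    {"all": sum(baseline.values())},
    {"all": sum(today.values())},
)

# One rule per category: the collapse is visible.
per_category = flag_drops(baseline, today)

print("aggregate check flagged:", aggregate)
print("per-category check flagged:", per_category)
```

In this sketch the combined count moves from 1020 to 1002, a drop of under 2%, so the aggregate check flags nothing, while the miscellaneous category alone drops by 90% and is flagged.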
Manu: Excellent. That makes a lot of sense. And in those scenarios where the right answer is pretty naturally many rules at the right resolution, I mean, you said something like sometimes people just don’t know that that’s what they’re supposed to do. Right? What’s preventing them from building out many checks? Is it effort? Is it something else?
Olga: So, frankly, yes. It is effort, because each rule needs to be created, then it needs to be maintained. Somebody needs to, you know, analyze the results on a regular basis, and somebody needs to do something with those results.
The other problem is that sometimes there are hierarchies of things, and there is just too much, because in a store you might have just an item, then the subcategory of items, then the category of items, then some other things. So sometimes it’s literally hard to find on which level you need to create the rules. So you just wanna start with something rather than creating for everything. But with unlimited capabilities, of course, it’s better to create a rule for everything. This way, you guarantee that you are not missing anything.
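The hierarchy Olga describes (item, subcategory, category) can be explored with one rule definition that is applied at each level. The following Python sketch is only an illustration under assumed data: the product names, row counts, the 50% threshold, and the helpers `counts_at` and `check_level` are hypothetical and do not describe how any particular tool works.

```python
from collections import Counter

# Position of each hierarchy level inside a (item, subcategory, category) row.
LEVELS = {"item": 0, "subcategory": 1, "category": 2}


def counts_at(rows, level):
    """Count transactions per group at the given hierarchy level."""
    idx = LEVELS[level]
    return Counter(row[idx] for row in rows)


def check_level(baseline_rows, today_rows, level, max_drop=0.5):
    """Flag groups at `level` whose count fell by more than max_drop."""
    expected = counts_at(baseline_rows, level)
    observed = counts_at(today_rows, level)
    return sorted(
        key
        for key, n in expected.items()
        if (n - observed.get(key, 0)) / n > max_drop
    )


# Hypothetical transactions for a baseline day and for today.
baseline_rows = (
    [("whole milk", "milk", "dairy")] * 50
    + [("oat milk", "milk", "dairy")] * 10
    + [("cheddar", "cheese", "dairy")] * 40
)
today_rows = (
    [("whole milk", "milk", "dairy")] * 50
    + [("oat milk", "milk", "dairy")] * 1  # oat milk sales almost vanished
    + [("cheddar", "cheese", "dairy")] * 40
)

# The same rule definition, run once per level of the hierarchy.
results = {level: check_level(baseline_rows, today_rows, level) for level in LEVELS}
for level, flagged in results.items():
    print(level, "->", flagged)
```

With these assumed numbers, only the item level flags anything (oat milk); the subcategory and category totals barely move, which is why the choice of level matters.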
Manu: Awesome. And, I mean, it’s almost like, you know, you wrote recently about this in a LinkedIn post, and it seems like it’s kind of answering its own question, which is, do you need one or many? Well, if your scenario demands many, then that’s the right answer. Yes. And the question really comes back to, is it affordable? Right? Can you do this in low enough effort that it’s practical to do this and maintain it going forward?
And, that’s been our experience too. Right? I mean, the situation where you just need many rules, that’s the only right answer. The question is, can you do this in the effort of just one?
And there we have been able to combine the best of both in the field, especially in large enterprise settings, where you can write fine-grained checks that give you the resolution of many, at the effort of one. That’s when we have seen magic happen. So a hundred percent with you. I think this is an extremely important topic, which sometimes doesn’t get recognized the way it should.
It’s a relatively nuanced and insightful point. So thanks for teasing this out. And there’s so much more to unpack here. So why don’t we maybe leave that for our next session and part two of this conversation?
Thanks a lot for being here.
Olga: Thank you.
Resources
Want to be our next featured guest on 5-Minute Fireside Chat with Lightup or know someone who should? Email us at info@lightup.ai.
Questions about Lightup? Book a free strategy consultation and demo.
Build your own fine-grained Data Quality Checks with Lightup. Start a free 30-day trial today.