A Conversation on Collaborative Computing with Stijn Christiaens, CTO of Collibra
Part of our series: Collaborative Computing — How Web3 can be rocket fuel for enterprises’ data strategy
When Lawrence Lundy-Bryan, Research Partner at Lunar Ventures and a Nevermined collaborator, wrote ‘The Age of Collaborative Computing’, we were thrilled. We totally subscribe to the idea that we are at the beginning of a fundamental reshaping of how data is used in the economy. And we certainly won’t argue with his prediction that Collaborative Computing is the next trillion-dollar market.
So, to help us understand the challenge of establishing a new product category, we worked with Lawrence to create a series of interviews with experts in the data collaboration space who are buying, selling or investing in this vision.
We previously published three of Lawrence’s conversations: with Rick Hao, Dr Hyoduk Shin and Jordan Brandt. For this episode, he got the chance to exchange ideas with Stijn Christiaens, CTO of Collibra, the market-leading data governance platform.
Conversation with Stijn Christiaens, CTO of Collibra
By Lawrence Lundy-Bryan, Research Partner at Lunar Ventures
Collaborative computing is the next trillion-dollar market. We are at the beginning of a fundamental reshaping of how data is used in the economy. When data can be shared internally and externally without barriers, the value of all data assets can be maximized for private and public value.
To explore this vision in more depth, I spoke with Stijn Christiaens, CTO of Collibra, the market-leading data governance platform. Highlights include:
- How competing against non-consumption changes how you sell;
- Why Data Asset Systems of Record are like CRM 20 years ago; and
- Why the Chief Data Officer is becoming one of the most important C-suite roles.
Let’s start with selling. When you pitch customers on making better use of their data, what benefits do you lead with?
There aren’t any general rules, as you’d expect when we are selling into businesses across different markets and sectors. Each customer has a different internal data landscape and a different set of external constraints, like regulation. But the reality is that we are almost always selling into companies that just aren’t solving the problem today. Customers either buy us or they don’t solve the problem at all. So we’re really competing against inertia.
GDPR has helped generate budgets and answer the “why now” question for many firms. We expect this to continue to drive business as we see ever more privacy regulation around the world, as well as AI regulation requiring better data governance processes. So we can easily say: you need to be compliant. Then we can move on to other propositions, like making analysts more efficient by giving them quick access to data. That’s not to mention the great migration to the cloud, which generates demand for our products.
What is the most common barrier you encounter when selling your products and/or services?
We are selling a new product. Organizations haven’t bought it before, so there isn’t a process, or even a buyer in some cases. A system of record for data assets also touches almost every department in the organization, so there is a lot of market education to be done with lots of stakeholders. The value proposition is strong, though, so once you get everyone coordinated it’s a no-brainer. But the coordination is the challenge.
It’s getting easier with Chief Data Officers now, though. We had maybe one CDO in 2002, around 400 in 2014, and maybe 10,000 now, with Gartner saying every large organization will have a data office by this year and so, by extension, a CDO. CDOs are responsible for managing data assets, so it’s the CDO’s job to coordinate internally and we don’t have to do it. The sales process is much simpler. That said, not all CDOs are the same or have the same mandate. CDO version 1 was defensive and about managing risk. CDO version 2 is offensive, thinking about how to get data flowing internally to maximize value. The new CDOs, and there aren’t many of them, are thinking about data as products: how to package up data assets and monetize them. That’s the future for leading organizations.
What do you think about the opportunities between internal and external data sharing?
You need to think about this through the lens of risk. Even if the technology is 5x better, the risk is much higher too. In some cases, even if performance, and therefore revenue, increases by X amount, the decision might still be not to do it because the risks and costs are too high. When it comes to data sharing those risks are huge. Not just the risk of being fined, but the risk of not understanding what you can and can’t do with data assets. The data creator might be giving away value in a dataset and not even know it. It’s complicated. Maybe what you can do in one country you can’t do in another. You don’t want the analysts to have to go to legal all the time.
Most of the time it’s not worth the bother. I suppose, with this in mind, internal data sharing has a lower risk profile and is more manageable. You can have a highly defined PoC and strong measures to protect data. So I expect internal data sharing to grow first, and only once that process is well embedded will external data sharing be considered.
When thinking about helping companies utilize their data, a sensible framework is: governance, sharing and monetization. It feels like 95% of companies investing in their data infrastructure are still on data governance, maybe 5% are finding ways to share internally, and <1% are even thinking about monetization yet. Does this sound right to you?
We need to put a dollar amount on data. We need something like accounting principles for data. This is a huge piece of work that would need global buy-in, but it is starting to happen. It’s being led by the big tech firms, which are already eschewing traditional accounting principles because they feel those aren’t quite fit for purpose. Different firms don’t like different bits of it. Investors already rely on pro forma results instead of GAAP accounting because GAAP just can’t handle intangible assets. And tech firms are built mainly on intangible assets, and no one wants those counted as a cost instead of an investment. So we are in a place where our accounting principles don’t accurately reflect what they are supposed to reflect.
What about data markets as a way to price data?
Yes, this is one route. If we get to a place where we can price data, then I think the shift from governance to sharing and monetization will happen rapidly. There are some price discovery engines in AWS, Databricks, Snowflake, etc. that are trying to do this, and there are also some experiments on pricing in the crypto industry. This is the other route into pricing: instead of new GAAP with data, let markets price data and then find some way to get that onto the balance sheet. Data markets are a very interesting space.
What cultural, technical or social change would be required for demand in data collaboration to increase 10/100x?
Antitrust. It seems like governments around the world are looking for ways to address the monopolization that has occurred in the data space. These firms might not be exerting monopoly control by raising prices, but their ubiquity and size have forced data into a few large silos. Breaking up these silos somehow seems to be the target. There are lots of different ways that might happen. Some sort of standard interconnect would be one way to do it. Or, once we have decent internal data sharing based on a system of record for data assets, you might legislate that companies must expose those systems to external developers, like PSD-2 forced banks to open up. The EU’s GAIA-X is an interesting attempt to drive standards in Europe. They are already getting customers saying they have to run on GAIA-X. Solid is also gaining traction in the EU, so we might see this unblocking of silos happen faster than expected.
There are technical challenges too. Fully homomorphic encryption, which offers mathematical guarantees that no one can read the data being processed, would be a technological enabler. But it’s hard to see a pathway, technically, for FHE to reach cost parity with plaintext processing. If it gets there, and a strong software ecosystem grows around it, you can imagine use cases around sharing data internally and then opening that up to partners. It’s like the compliance tick box is completed at source, so you can speed up deployment and access. But these sorts of visions are decades away. Practically, it’s going to be regulation that is the catalyst for change.
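To make the “compute on data without reading it” idea concrete, here is a minimal sketch using the open-source TenSEAL library. A caveat: TenSEAL’s CKKS scheme is leveled homomorphic encryption rather than fully bootstrapped FHE, and the vectors and parameters below are illustrative assumptions, not anything from the conversation.

```python
import tenseal as ts

# Set up a CKKS context: approximate arithmetic over encrypted real numbers.
# These parameters are common tutorial defaults, chosen here for illustration.
context = ts.context(
    ts.SCHEME_TYPE.CKKS,
    poly_modulus_degree=8192,
    coeff_mod_bit_sizes=[60, 40, 40, 60],
)
context.global_scale = 2**40
context.generate_galois_keys()  # needed for rotations, e.g. in dot products

# The data owner encrypts its vectors before sharing them.
enc_a = ts.ckks_vector(context, [1.0, 2.0, 3.0])
enc_b = ts.ckks_vector(context, [4.0, 5.0, 6.0])

# A partner can compute on the ciphertexts without ever seeing plaintext.
enc_sum = enc_a + enc_b     # element-wise addition on encrypted data
enc_dot = enc_a.dot(enc_b)  # encrypted dot product

# Only the holder of the secret key can decrypt the results.
print(enc_sum.decrypt())  # approximately [5.0, 7.0, 9.0]
print(enc_dot.decrypt())  # approximately [32.0]
```

The cost gap Stijn mentions is visible even in a toy example like this: each encrypted operation is orders of magnitude slower than its plaintext equivalent, which is exactly why cost parity remains the open question.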
This interview has been edited for clarity. In the next and final part of this series, we’ll publish Lawrence’s conversation with Flavio Bergamaschi, Director, Private AI and Analytics at Intel.
Sign up for updates if you want to read about:
- Why integrity is just as important as confidentiality;
- Why the importance of crypto-agility is overlooked; and
- The 5 groups you need to convince to sell data collaboration software.
And we’d love to hear from you. Got any questions, suggestions or want to chat about your project? Contact us via the website or join our Discord.
Originally posted on Medium on 2023-01-07.