Big Data: Learn from your mistakes

By Robert Mellor, general manager of WhereScape Mainland Europe

It’s been 15 years since the data management (DM) industry botched the decision support revolution by delivering inflexible data warehouse systems and developing unusable business intelligence (BI) tools. We forced organisations to change their business processes to suit our own product agendas. Now, in the nascent age of Big Data, we’re gearing up to do it all over again.

What we need is a conceptual shift from a product-centric to a process-centric orientation.

Stick to the process

An ideal process flow diagram would have as few interposing boxes as possible. This almost never happens in practice, for many reasons, but one of the most important is that software vendors target the product, not the process. They pursue a strategy that attempts to insert or implant a product as one of several interposing boxes in a process flow. In effect, they design themselves into a process.

The DM industry’s response to big data has been more of the same. In most cases, this means a bouillabaisse of proprietary, stack-centric big data “solutions”, self-serving technological or architectural prescriptions, and not-yet-ready-for-prime-time front-end tools.

But big data is different because it’s inescapably multi-disciplinary: it presupposes interconnectedness, interoperability and exchange between and among domains. It is holistic in scope in precisely the way that data management is not.

From a product perspective, a big data-aware tool must operate in a context in which problems, practices and processes are multi-disciplinary. No product will be completely self-sufficient or operate in isolation. But this doesn’t mean you can’t have big data-oriented products that target very specific use cases, or more generalised big data-oriented products that address specific process, domain or function practices. And it doesn’t automatically mean that an entire class of existing products will suddenly become “pre-Big Data”.

More of the same is more of the wrong approach

But most of the vendors are developing and marketing “Big Data-in-a-Platform” products. The one thing each of these “solutions” has in common is a product-centric model: each aims to insert or implant itself, as an interposing box, into a process. But each interposing box introduces latency and increases complexity and fragility.
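To put rough numbers on that last point, here is a minimal sketch, assuming the boxes run strictly one after another and fail independently of each other. The function names and the figures in the usage note are illustrative assumptions, not measurements from any real product.

```python
def chain_latency_ms(latencies_ms):
    """Total delay when data must pass through every box in sequence."""
    return sum(latencies_ms)


def chain_availability(availabilities):
    """Chance that every box is up at once, assuming independent failures."""
    result = 1.0
    for availability in availabilities:
        result *= availability
    return result
```

Under these assumptions, three boxes adding 20, 35 and 50 ms give `chain_latency_ms([20, 35, 50])` of 105 ms, and three boxes that are each up 99% of the time give `chain_availability([0.99, 0.99, 0.99])` of roughly 0.97. Every extra box can only push the first number up and the second one down.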

Worse still, each interposing box has its own infrastructure. This includes its own vendor-specific support staff with its own esoteric knowledge base. At best, this means recruiting armies of Java or Pig Latin programmers, or training up DBAs and SQL programmers in the intricacies of HQL. At worst, it means investing significant amounts of time and money to develop platform-specific knowledge bases.

Automation is the answer

The way to address this dysfunction is to focus on automating the practices and processes that support and enable a data warehouse environment, such as scoping, warehouse creation, ongoing management, and periodic refactoring. You could even automate the creation and management of warehouse documentation, diagrams, and lineage information by completely eliminating hand-coding in SQL or in esoteric, tool-specific languages.
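As a rough illustration of what eliminating hand-coding can look like, the sketch below describes a warehouse table once, as metadata, and derives both the SQL and the lineage documentation from that single description. It is a minimal sketch under stated assumptions, not a description of any vendor's product: the `Column` and `Table` classes, the two render functions, and the `dim_customer` and `crm.customers` names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    sql_type: str
    source: str  # hypothetical upstream field this column is loaded from


@dataclass(frozen=True)
class Table:
    name: str
    columns: tuple


def render_ddl(table):
    """Generate the CREATE TABLE statement instead of writing it by hand."""
    cols = ",\n".join(f"    {c.name} {c.sql_type}" for c in table.columns)
    return f"CREATE TABLE {table.name} (\n{cols}\n);"


def render_lineage(table):
    """Generate one lineage line per column from the same metadata."""
    return [f"{c.source} -> {table.name}.{c.name}" for c in table.columns]


# Hypothetical example table, defined once as metadata.
dim_customer = Table(
    name="dim_customer",
    columns=(
        Column("customer_id", "INTEGER", "crm.customers.id"),
        Column("customer_name", "VARCHAR(100)", "crm.customers.full_name"),
    ),
)
```

Because `render_ddl(dim_customer)` and `render_lineage(dim_customer)` read the same definition, the documentation cannot drift away from the table it describes, and a refactoring means changing the metadata rather than rewriting SQL.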

Big data products do not need their own infrastructure. They should speak the languages and accommodate the idiosyncrasies of OLTP systems, warehouse platforms, analytic databases, NoSQL or big data repositories, BI tools, and all of the other “boxes” that collectively comprise an information ecosystem.

Products should target the disconnects between isolated systems in a process, the points at which a process flow breaks down. This type of breakdown is the inevitable consequence of a product-focused development and marketing strategy. By the looks of it, we’re going to see lots of breakdown in the big data-scape.

Think of the big data-scape as a kind of free-trade zone in which “trade” is analogous to process: i.e., data moves from box to box, with minimal restriction or interference and without platform-specific embargoes from inessential interposing boxes.

Automation is the answer. Not automation for its own sake, but automation as integral to process flow to eliminate breakdown, increase responsiveness, lower costs and empower IT to focus on value creation.

Let’s all try not to botch this one up!

 

