What Big Data dreams may come


Instead of asking, “What’s the big deal with Big Data?” try asking, “What’s the big dream?”

David Hodgson, March 26, 2015

The best ventures start with a big dream, and big dreams come in different sizes. In the world of data analytics this might be cracking a top-secret code, growing your business, detecting fraud in real time or successfully making purchasing recommendations based on profiles and patterns. However big your dream appears to other people, it is important to you and to the goals you want to achieve, so it is worth planning properly and equipping yourself with the tools you need.

In my last few blog posts, I’ve been talking about the challenge of gaining a competitive edge for your business with all the data that is available to you, structured and unstructured. This is the Big Dream.

The importance of creative experimentation and diversity

The growth and transformation of data analytics in recent years has generally been fueled by the interest from lines-of-business (LOBs) within a company – not by the central IT department. The LOBs have wanted to experiment with tools they thought might give them a competitive edge, an extra business insight. This experimentation is an essential ingredient of success and LOBs must be allowed to empower themselves by choosing the tools and services they need.

This revolution has been enabled by the availability of commodity compute power (on-premise and cloud) and the emergence of open source tools, driven mainly by the Apache Software Foundation. The diversity of tools that has emerged is incredible. Hadoop is not one thing, but a stack of technology surrounded by an ecosystem of supporting tools that allow you to build the solution you need. I wrote about this explosion of creativity in an earlier blog post. There are multiple distributions of the stack with proprietary code added for different advantages. And then there are numerous NoSQL databases (Cassandra, MongoDB and probably 30 others) that are suited to different sorts of analytic operations.

This creative experimentation and diversity is still incredibly important. We have not yet seen the end of the tool evolutions and are just beginning to realize the potential of these new technologies and the ways businesses can transform themselves in the application economy.

How can IT help?

IT’s big dream is to be more than a cost center: to align with the business and become an essential ingredient in the company’s recipe for success. In the fast-proliferating ecosystem of analytical tools and components, LOBs must have the freedom to experiment with the latest technologies. We are past the point when IT can control and limit software usage for its own convenience. To remain relevant, the IT department must support the new big data playbook and be the experts who learn the latest technology.

But how can IT do this and control costs? Even as a driver of business success, the IT department cannot run amok past the boundaries of its budget. So how can it both encourage creative experimentation and contain costs? How can it find all the skills it would need to use all the different tools?

A Big Data management strategy

CA Big Data Infrastructure Management (BDIM) is now being demonstrated and may be an answer for IT departments as they seek to help LOBs with analytics. CA BDIM provides a normalized management paradigm for different Hadoop distributions and NoSQL databases. It is a tool that will scale with your growing analytics implementations and will allow the LOBs to easily switch tools or platforms.

This single unified view approach will help reduce operational costs as experiments scale to production operations. With automation, productivity is increased and the differences between Hadoop distributions become no problem. For IT, the potential is a single management tool to manage all clusters, nodes and jobs. For LOBs, the potential is an IT partner that can take the burden of production operations off their shoulders while leaving them the freedom to easily change direction.

CA BDIM will be on display at the Gartner Business Intelligence & Analytics Summit, March 30 through April 1, 2015, in booth 525 in Las Vegas. This first release is targeted at early adopters. We aim to deliver new functions and support every 60 days, evolving the product to deliver the maximum value for those who want to use it.

Is the Big Dream just big hype or your next key move?

While some analysts might say the industry is climbing up one side of the “hype cycle”, I don’t believe that one curve fits all phenomena. Real results are being achieved by innovators in this area. Many others still don’t understand the full potential, might be confused and can sound like detractors, but on this occasion that noise does not indicate hype. We are in the middle of a paradigm shift in which old technologies and practices will be left behind and new ones adopted.

Companies with aspirations of winning in the application economy will take advantage of the new analytics and partner with their IT department to move ahead and succeed. A few weeks ago, in North America at least, we adjusted our clocks to spring ahead into brighter, longer days. Check out CA BDIM and see for yourself whether the product could help your company spring ahead of the pack to realize your Big Dream of success.


The answers to your Big Data questions are everywhere


I follow up my previous post with questions that you should be asking yourself when it comes to getting more out of the data your organization probably isn’t using – unstructured data.

David Hodgson, March 3, 2015

Where is unstructured data? As they said about 60s radio series character Chickenman, “He’s everywhere!”

It’s inside your organization, right under your nose. It’s outside your organization, ripe for the picking like low-hanging fruit. And it’s in strange places, where it needs a degree of pre-processing and parsing.

In my last post I talked about the power of combining structured and unstructured data to unlock the business value realized by recent revolutions in data analytic technologies. But what is unstructured data? If 80 to 90 percent of the data that you have today is unstructured, what is it? Where is it? How can it be used? And how can you get more?

Yet 80 percent of the valuable business data that most companies use today is structured data, which means they’re not getting business value from the majority of the data available to them right now.

You must accept this challenge: the winners in the application economy will be those that find business value in unstructured data and use it in combination with the structured data that undergirds their existing mission-critical business systems.

Being the ‘biggest loser’ is not a positive thing in the world of Big Data!

The big picture: what is it and where is it?

What is it? Any source of information that doesn’t have a defined format or structure intended for generalized data processing as rows and columns is probably unstructured or semi-structured data, and it could be valuable to you. That includes a report, a log, an image, a form or any sort of document or file.

Even Excel files that are visually organized in rows and columns are considered semi-structured data for the purposes of this discussion, because only the Excel application knows how to do anything with them. In the big picture, this isn’t very useful.

Inside your organization think of where the most valuable, prescient data really is: could it be in notes people take, emails that people exchange, Excel spreadsheets they create, logs of their activity, CRM records or social media interactions with customers?

Sure, the reference data in your business systems is critical, but the data that is driving daily business decisions and longer-term strategic decisions may be elsewhere. Could you access that? Would it be valuable if you could?

Outside your organization, where is the data that describes your adjacent markets, or the next innovation that you will (or should) either create or capitalize on? Where is the low-hanging fruit? Only you really know, but could it be on news websites, in discussions on social media, in stock price reports or in SEC filings on company websites? Could it be on a government website that lists foreclosures or competitive bids, in weather reports or in news reports? It could be anywhere accessible via the Internet.

Sure, your employees could read all this stuff and process it mentally to your advantage, but can they really, and do they? How could you get hold of it in an automated way if it were valuable? This is the low-hanging fruit, usually available through published APIs, for purchase, or collectable using simple and free open source tools.

The big secret: how do I get hold of it and where do I keep it?

Unstructured data usually requires new tools and processes to extract intelligence and deliver business value. In the absence of structure, you need ways to extract or create context and metadata about the data: what it is about, when it was created and by whom.

For the purpose we are discussing, this metadata cannot be created manually. To be useful, these processes need to be scalable and to run in real time. They also need to be relevant to the business, and they can’t cost more than the value derived from them.
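
As a rough illustration of automated metadata extraction, the Python sketch below pulls the author, the creation date and a few frequent terms out of a plain-text email. Everything in it is an assumption made for illustration: the extract_metadata function, the sample message, the tiny stop-word list and the simple word-count approach to guessing what a document is about. A production pipeline would use far more capable tools.

```python
import re
from collections import Counter
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

# Tiny illustrative stop-word list (an assumption; real pipelines use larger ones).
STOP_WORDS = {"the", "and", "for", "that", "this", "are", "with", "our", "too"}


def extract_metadata(raw_email: str, top_n: int = 3) -> dict:
    """Derive simple metadata from a non-multipart, plain-text email."""
    msg = message_from_string(raw_email)

    # "By whom": the address part of the From header.
    _, author = parseaddr(msg.get("From", ""))

    # "When": the Date header parsed into a datetime, if present.
    created = parsedate_to_datetime(msg["Date"]) if msg["Date"] else None

    # "What is it about": the most frequent non-trivial words in the body.
    # get_payload() returns a string here because the message is not multipart.
    words = re.findall(r"[a-z]+", msg.get_payload().lower())
    counts = Counter(w for w in words if len(w) > 2 and w not in STOP_WORDS)

    return {
        "author": author,
        "created": created.isoformat() if created else None,
        "subject": msg.get("Subject", ""),
        "keywords": [word for word, _ in counts.most_common(top_n)],
    }


# Hypothetical sample message, invented for this example.
SAMPLE = """From: Pat Example <pat@example.com>
Date: Tue, 03 Mar 2015 09:30:00 +0000
Subject: Renewal risk

Customer says pricing is too high. Pricing came up twice; renewal is at risk.
"""

if __name__ == "__main__":
    print(extract_metadata(SAMPLE))
```

The point of the sketch is only that the who, when and what can be derived by a program rather than by hand; scaling that to real-time volumes is where the tools discussed next come in.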

Enter the magic of open source tools and commodity processing power, either on-premise or in the cloud. Without these ingredients you would not be able to get hold of Big Data or store it in a cost-effective way.

Forget your conventional data warehouses: while they’re not going away, they’re also not your tools for going forward. In the age of the Internet of Things (IoT), you will be looking at one or several of the new file systems and so-called NoSQL databases that are available today.

The table below gives some idea of the popular offerings available and what you might use them for. What is it you want to do?


How can I be the ‘biggest winner’?

You can’t be the ‘biggest winner’ without asking the right questions and finding the answers. All the questions above, and the specific questions for your business, will help you uncover your own secret sauce. Will you mine data others have collected, or create a new collection for yourself?

Take a look at what other companies have done:

Twitter has got millions of people to enter their thoughts on every subject under the sun. LinkedIn has got people to enter their career summaries and their contacts. Nike found personal health data. Some clever electronic medical records vendors have found drug usage data. Every web-commerce site is a potential source of profiling data and geospatial data.

To ensure you don’t get left behind, ask the questions, get engaged with the potential for your business and carve out your winning, differentiated position in the application economy.

After all, the answers are everywhere.

Image credit: Sergei Golyshev