What Big Data dreams may come


Instead of asking, “What’s the big deal with Big Data?” try asking, “What’s the big dream?”

David Hodgson, March 26, 2015

The best ventures start with a big dream, and big dreams come in different sizes. In the world of data analytics this might be cracking a top-secret code, growing your business, detecting fraud in real time or successfully making purchasing recommendations based on profiles and patterns. However big your dream appears to other people, it is important to you and the goals you want to achieve, so it is worth planning properly and equipping yourself with the tools you need.

In my last few blog posts, I’ve been talking about the challenge of gaining a competitive edge for your business with all the data that is available to you, structured and unstructured. This is the Big Dream.

The importance of creative experimentation and diversity

The growth and transformation of data analytics in recent years has generally been fueled by interest from the lines of business (LOBs) within a company, not by the central IT department. The LOBs have wanted to experiment with tools they thought might give them a competitive edge or an extra business insight. This experimentation is an essential ingredient of success, and LOBs must be allowed to empower themselves by choosing the tools and services they need.

This revolution has been enabled by the availability of commodity compute power (on-premises and cloud) and the emergence of open source tools, driven mainly by the Apache Software Foundation. The diversity of tools that has emerged is incredible. Hadoop is not one thing, but a stack of technology surrounded by an ecosystem of supporting tools that allow you to build the solution you need. I wrote about this explosion of creativity in an earlier blog post. There are multiple distributions of the stack, each with proprietary code added for different advantages. And then there are numerous NoSQL databases, such as Cassandra, MongoDB and probably 30 others, that are suited to different sorts of analytic operations.

This creative experimentation and diversity are still incredibly important. We have not yet seen the end of the evolution of these tools and are just beginning to realize the potential of the new technologies and the ways businesses can transform themselves in the application economy.

How can IT help?

IT’s big dream is to be more than a cost center: to align with the business and become an essential ingredient in the recipe for achieving the company’s goals. In the fast-proliferating ecosystem of analytical tools and components, LOBs must have the freedom to experiment with the latest technologies. We are past the point when IT can control and limit software usage for its own convenience. To remain relevant, the IT department must support the new big data playbook and be the experts who learn the latest technology.

But how can IT do this and control costs? Even as a driver of business success, the IT department cannot run amok past the boundaries of its budget. So how can it both encourage creative experimentation and contain costs? How can it find all the skills needed to use all the different tools?

A Big Data management strategy

CA Big Data Infrastructure Management (BDIM) is now being demonstrated and may be an answer for IT departments as they seek to help LOBs with analytics. CA BDIM provides a normalized management paradigm for different Hadoop distributions and NoSQL databases. It is a tool that will scale with your growing analytics implementations and will allow the LOBs to easily switch tools or platforms.

This single, unified view will help reduce operational costs as experiments scale to production operations. With automation, productivity increases and the differences between Hadoop distributions cease to be a problem. For IT, the potential is a single management tool to manage all clusters, nodes and jobs. For LOBs, the potential is an IT partner that can take on the burden of production operations while leaving them the freedom to easily change direction.

CA BDIM will be on display at the Gartner Business Intelligence & Analytics Summit, March 30 through April 1, 2015, in booth 525 in Las Vegas. This first release is targeted at early adopters. We aim to deliver new functions and support every 60 days and to evolve the product to deliver the maximum value for those who want to use it.

Is the Big Dream just big hype or your next key move?

While some analysts might say the industry is climbing up one side of the “hype cycle,” I don’t believe one curve fits all phenomena. Real results are being achieved by innovators in this area. Many others still don’t understand the full potential, might be confused and can sound like detractors, but on this occasion that noise does not indicate hype. We are in the middle of a paradigm shift in which old technologies and practices will be left behind and new ones adopted.

Companies with aspirations of winning in the application economy will take advantage of the new analytics and partner with their IT department to move ahead and succeed. A few weeks ago, in North America at least, we adjusted our clocks to spring ahead into brighter, longer days. Check out CA BDIM and see for yourself whether the product could help your company spring ahead of the pack to realize your Big Dream of success.



The whole Big Data is greater than the sum of its parts


How to win in the application economy by finding business value in information and potential data.

David Hodgson, February 19, 2015

“The Imitation Game,” the current Oscar-nominated film about codebreaker Alan Turing, reminds us that from the early days of computers, we have thought of “data” as something organized into rows and columns; something that could be structured. A newspaper or book clearly contained information, but it wasn’t really data because to a computer it appeared to have a random organization; it was unstructured and couldn’t be processed usefully by computer programs.

The revolution underlying the application economy is the emergence of new tools, and enough processing power, to glean value from unstructured information, thus turning it into data. We call it “Big Data” because this revolution gives access to much more data than we had before. The code has changed forever.

Myth busting

There is a myth that “Big Data analytics” is all about NoSQL databases and unstructured data. In fact, a lot of the clever analysis that companies are doing with high-volume transactional data is achieved with structured data alone. This is Big Data too. The two archetypal cases most used today are:

  • Recommendation engines: used in real time or post-sale to suggest additional purchase options based on what other similar buyers have bought.
  • Fraud detection: usually used in real time to alert on unusual patterns in behavior data (access point, transaction time, purchase type, etc.) that might be indicative of fraud.
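As a rough illustration of the first case, here is a minimal Python sketch of a recommendation based purely on structured purchase records. The order data, the function names and the simple co-purchase counting approach are all assumptions made for illustration; production engines use far richer models and far more data.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical order history: each inner list is one customer's basket.
orders = [
    ["laptop", "mouse", "bag"],
    ["laptop", "mouse"],
    ["laptop", "bag"],
    ["phone", "case"],
    ["phone", "case", "charger"],
]

def build_co_purchase_counts(baskets):
    """Count how often each ordered pair of items appears in the same basket."""
    counts = defaultdict(Counter)
    for basket in baskets:
        # set() ignores duplicate items within a single basket.
        for item, other in permutations(set(basket), 2):
            counts[item][other] += 1
    return counts

def recommend(item, counts, top_n=2):
    """Return up to top_n items most often bought together with `item`."""
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(counts[item].items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:top_n]]

co_counts = build_co_purchase_counts(orders)
print(recommend("laptop", co_counts))
```

The point of the sketch is that nothing here is unstructured: it is plain transactional data, and the value comes from having enough compute power to run this kind of counting over very large order histories quickly.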

The revolution that has happened here is based just upon the availability of lots of compute power. Complex queries that used to take hours can now be processed in seconds, or at least quickly enough to deliver real-time value.

In the mainframe world, IBM achieved this by offloading complex DB2 SQL queries to its DB2 Accelerator, the ex-Netezza device that attaches directly as an extension to the mainframe to receive data and processing instructions at lightning speed in a way that is transparent to the applications spawning the requests. The downside is that this technology is expensive and only for the well-heeled elite. Hadoop democratized large-scale compute power by making it available through massively parallel use of commodity servers. The cloud providers democratized access to it with their IaaS and SaaS offerings, sold at low prices with a pay-for-use business model.

The sum of the parts

Many innovative and valuable analyses are being done purely using unstructured data. For instance, by analyzing text data like Tweets, Facebook posts and emails sent to customer service, companies might discover whole new emerging problems that they could build products to solve. At a minimum, they might be able to validate or eliminate new ideas and “fail fast,” as the axiom of lean innovation teaches us. Doing so earlier than the competition is all that’s needed; he who has the best data scientist wins!

However, the most immediate ways to augment or create new business processes are probably achieved by combining the two types of data, structured and unstructured. The use of social data, or person-based feeds, described above is called “sentiment analysis.” By combining data from the product catalog with sales data and a sentiment analysis, companies can quickly get an early grasp on the shape and size of customer dissatisfaction. This allows product managers to make changes and then use the same data sources as a feedback loop to see if the investment fixed the customer issues.
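To make the combination concrete, here is a minimal Python sketch that joins a product catalog and sales figures (structured) with sentiment scores derived from customer text (unstructured). All product IDs, figures, scores, the threshold and the function name are hypothetical, and the scoring of the text itself is assumed to have been done by an earlier analysis step.

```python
from collections import defaultdict

# Structured data: hypothetical product catalog and unit sales.
catalog = {"P1": "Router", "P2": "Headset", "P3": "Webcam"}
units_sold = {"P1": 500, "P2": 1200, "P3": 300}

# Unstructured data after scoring: (product_id, sentiment in [-1, 1]).
# The scores are assumed to come from a text-analysis step over posts or emails.
scored_mentions = [
    ("P1", -0.8), ("P1", -0.6), ("P1", 0.1),
    ("P2", 0.7), ("P2", 0.4),
    ("P3", -0.2), ("P3", 0.3),
]

def dissatisfaction_report(catalog, units_sold, mentions, threshold=-0.5):
    """Join sales and sentiment per product and list products with
    strongly negative mentions, highest negative share first."""
    negatives = defaultdict(int)
    totals = defaultdict(int)
    for product_id, score in mentions:
        totals[product_id] += 1
        if score <= threshold:
            negatives[product_id] += 1
    rows = []
    for product_id, name in catalog.items():
        if negatives[product_id] == 0:
            continue  # no strongly negative mentions for this product
        rows.append({
            "product": name,
            "units_sold": units_sold.get(product_id, 0),
            "negative_share": negatives[product_id] / totals[product_id],
        })
    return sorted(rows, key=lambda r: r["negative_share"], reverse=True)

report = dissatisfaction_report(catalog, units_sold, scored_mentions)
for row in report:
    print(row)
```

Rerunning the same report after a product change is one simple way to picture the feedback loop: if the negative share falls while sales hold up, the investment probably addressed the issue.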

And the increasingly important use case of fraud detection can be hugely enriched by the use of unstructured data. Logs of movement through the Internet, or other infrastructure, can reveal deeper patterns than transaction origin alone. Activity on social media might indicate buying (or other) behavior patterns preceding a fraud, or help illuminate correlations with post-fraud selling activity. These are just two simple examples.

The imitation game

Some analysts tell us that 80 percent of the data a company has today is unstructured data. As the IoT becomes a reality, unstructured data will become more like 99.9 percent of the data a company has. The winners in the application economy will be those that can find business value in all this information and potential data.

A survey from last December found that 67 percent of large companies are in production with Big Data analytics.

Although this is on the high side for survey results, chances are that if you are not doing it yet, your competition is.

You had better start playing the imitation game quickly and start busting the new code of Big Data for yourself.