CONF2016: Leveraging the DNA of Digital Transformation


This was my first time attending a Splunk .Conf, so I was eager to get a feel for the event: to gauge the excitement about the products and to get a sense of how Splunk might succeed with its ambitious plans for growth in an ever more competitive market.

David Hodgson, September 30, 2016

The family goes to Orlando

Boasting 3 days of in-depth training, 185 technical sessions, inspiring keynotes each day, and the booths of 70 technical partners, .Conf2016 did not disappoint its nearly 5,000 attendees in terms of the intensity of the event and the velocity of interactions between us all. Obviously CEO Doug Merritt had primed his troops, because in the kickoff keynote he quoted what they say internally at Splunk: “If you ever want to be inspired, go out and talk to a customer.”

The fervor that the Splunk user base feels for the product brings back memories of VMware and SAP when they were cool and promised change and progress. Perhaps because it’s a weird election cycle, Millennials are looking to technology rather than politics to shape their future. I don’t know for sure, but this conference definitely felt like the best sort of family gathering, where people actually liked each other, wanted to collaborate on building solutions and wanted to bring new members into the fold.

The conference was held at the Dolphin & Swan at Disney World, Orlando. By the time we got to the Tuesday event night, a roaming party around the Hollywood Studios park, the atmosphere was very much that of one big family having fun together.

Learning how to get machines learning IT

Splunk is the clear market leader in providing a pragmatic platform for machine learning. The results have been real and beneficial, whether it’s detecting intrusions from unusual data access patterns or predicting trends that can be addressed to optimize IT service delivery. A big theme at .Conf2016 was the power of Machine Learning and how it is shaping Splunk’s products.

In practice, Machine Learning is very different from what we usually think of as Artificial Intelligence. AI seeks to build computer models that can emulate the functions of human brains. We expect that an AI would perceive its environment and exhibit goal-seeking, purposeful behavior that is understood by humans. Ideally it would interact with humans both to receive input and to augment our decision-making abilities. By contrast, Machine Learning is a sub-area of AI focused on pattern recognition, which allows the system to “learn” and predict based on history, but without there being a rational explanation for that response that a human could understand. Machine Learning relies on the consumption of masses of granular data that can be processed with statistical analysis to make predictions and uncover “hidden insights” about relationships and trends. These “insights” are not necessarily causalities with an explanation that humans could understand and replicate.
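To make that contrast concrete, here is a minimal sketch of the kind of statistical “learning from history” described above: flag new observations that fall far outside what past data predicts. The event counts and threshold are hypothetical, and this is an illustration of the general idea, not any particular Splunk capability.

```python
# Minimal sketch of statistical "learning from history": flag values that
# deviate sharply from what past observations predict. Illustrative only;
# the data and threshold below are hypothetical.
import statistics

def find_anomalies(history, new_values, z_threshold=3.0):
    """Flag new observations far outside the historical norm."""
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history) or 1.0  # avoid division by zero
    return [v for v in new_values if abs(v - mean) / stdev > z_threshold]

# Hourly login counts observed over the past week (hypothetical data).
past_logins = [102, 98, 110, 95, 105, 99, 101, 97, 103, 100]
print(find_anomalies(past_logins, [104, 250, 96]))  # -> [250]
```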

As a solution, Splunk differentiates itself from similar platforms like the ELK stack (Elasticsearch, Logstash, Kibana) and Hadoop mainly through its functional completeness and ease of use. But it is proprietary and somewhat expensive to use, with costs scaling based on the amount of data ingested daily. To accommodate customers’ concerns about growing costs and their desire to embrace open source technologies, Merritt announced at .Conf2016 that Splunk Labs was enabling integration with Elasticsearch, Spark, and Kafka, showing Splunk’s openness to adapting to what customers are asking for in the field. The announcement was well received and is probably the answer both to customer needs and to how Splunk can ensure continued popularity.

From a Syncsort perspective, our Ironstream product has been focused on getting data to Splunk directly, but customers have increasingly asked us to support a Kafka pipe to split data between Splunk and Hadoop. With Splunk’s new open architecture announced at .Conf2016, we now plan to follow suit.
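For readers unfamiliar with the pattern, the sketch below shows roughly what such a Kafka pipe could look like: each mainframe event is published once to a topic, and independent consumer groups (a Splunk forwarder, a Hadoop ingest job) read the same topic on their own schedules. The topic name, fields and library choice (kafka-python) are assumptions for illustration only, not Ironstream’s actual interface.

```python
# Rough sketch of a Kafka "pipe": the producer publishes each mainframe log
# event once, and separate consumer groups (a Splunk forwarder, a Hadoop
# ingest job) read the same topic independently. Topic and field names are
# hypothetical; this is not Ironstream's actual interface.
import json
from kafka import KafkaProducer  # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda event: json.dumps(event).encode("utf-8"),
)

event = {"source": "SYSLOG", "lpar": "PRODA", "message": "IEF403I JOB STARTED"}
producer.send("mainframe-logs", event)  # Splunk and Hadoop each consume this topic
producer.flush()
```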

Splunking IT Operations

One of the significant areas of success for Splunk has been monitoring tools for IT infrastructure. The typical users are enterprise IT teams that need to monitor a broad array of platforms. They need to contextualize events by gathering data from connected platforms and using Splunk to do basic time-based correlation and advanced pattern recognition. The rate of environmental change in hardware, software and connected devices makes traditional tools almost impossible to integrate, and Splunk Enterprise offers a much simpler and more effective approach.

For the last two years Syncsort has partnered with Splunk to add the mainframe platform to those monitored, and this has proven to be an essential ingredient for some of the world’s biggest IT organizations that have mainframes.

On the first day Merritt introduced the concept of Data as the DNA of IT, driving evolution and change. On Wednesday Andi Mann carried the theme further in his keynote “Re-Imagining IT” saying

“Digital transformation needs to be in your DNA; not passionately pursuing it is an existential challenge and threat to your individual and organization’s future success”.

Mann focused his discussion on the new 2.4 release of IT Service Intelligence (ITSI) that was unveiled at the conference. The main new capabilities of value are:

  • Anomaly detection using machine learning
  • Adaptive thresholds that learn what the norms and thresholds should be for any time of the day, week, etc. (see the sketch after this list)
  • Intelligent events with contextualized data wrapped in them
  • End-to-end visibility of business services richly visualized for LOBs in the new “glass tables”
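
As a rough illustration of the adaptive-thresholds idea mentioned in the list above, the sketch below learns a separate normal band for each hour of the week from historical samples and flags readings outside that band. It is a toy built on assumed data, not ITSI’s actual algorithm.

```python
# Simplified sketch of adaptive thresholding: learn a separate baseline for
# each hour of the week from history, then flag readings outside a band
# around that baseline. Illustrative only; not ITSI's actual algorithm,
# and the sample data below is hypothetical.
from collections import defaultdict
import statistics

def learn_thresholds(samples, band=3.0):
    """samples: list of (hour_of_week, value). Returns {hour: (low, high)}."""
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)
    thresholds = {}
    for hour, values in by_hour.items():
        mean = statistics.mean(values)
        spread = statistics.pstdev(values) or 1.0
        thresholds[hour] = (mean - band * spread, mean + band * spread)
    return thresholds

# Response times (ms) observed at two different hours of the week (hypothetical).
history = [(9, 120), (9, 130), (9, 125), (51, 40), (51, 45), (51, 42)]
print(learn_thresholds(history))
```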

At .Conf2016, Andi Mann discussed Syncsort’s role in making Big Iron data available to Splunk for Big Data analytics.

Syncsort also unveiled our latest work, the integration of mainframe data for ITSI 2.4. We demonstrated this with glass tables visualizing an online banking system from a mobile device to a mainframe running CICS and DB2. The Syncsort ITSI module is available for download from Splunkbase at no cost.

Splunking Security

One of the most widely adopted use cases for Splunk is security and compliance. As usual, you can roll your own very effectively using the Splunk Enterprise platform, or you can add pre-built power features with Splunk’s premium app, Enterprise Security (ES).

In her keynote, Haiyan Song, SVP of Security Markets, described how alert-based security is no longer adequate and stated that Machine Learning is now required to address internal and external threats. Splunk’s answer is User Behavioral Analytics, or UBA.

At the conference Splunk announced new features in ES 4.5 and UBA 3.0 that were aimed at providing CISOs and their teams with operational intelligence. The highlights were:

  • The Adaptive Response initiative allowing partners to openly integrate SIEM technology
  • Glass tables available for advanced visualizations of the underlying data
  • Enterprise hardening for the Caspida acquisition to create UBA as a product

Song described how UBA has the ability to understand and correlate user sessions across platforms and devices. She also brought on Richard Stone from the UK Ministry of Defence, who explained how they are leveraging Splunk ES and UBA to create a DaaP (Defence as a Platform) ecosystem. To Stone this is a single information environment in which anyone with the appropriate credentials can connect from any point, enter a familiar environment, and access any information. He challenged us to “Dare to Imagine”, saying that the biggest constraint in security is our imagination.

Syncsort again extends these solutions to the mainframe, offering data integration to ES for RACF via the Ironstream product.

Splunking DevOps

A new concept unveiled at .Conf2016 is a solution for DevOps. This is perhaps not surprising given Andi Mann’s background, and he will be the champion for this new product. The solution uses the underlying capabilities of Splunk Enterprise to take a data-integration approach to deliver three areas of value:

  • End-to-end visibility across every component in the DevOps tool chain
  • Metrics in glass tables to show LOBs that code meets quality SLAs
  • Correlation of business metrics with code changes to drive continual improvement

Splunking the Mainframe

One of the greatest things for me about the show was the number of people interested in the Syncsort booth. Even people who were not familiar with mainframes were interested to learn how we are Splunking the Mainframe!

Our CEO Josh Rogers delivered a phenomenal Cube interview that explained our strategy of moving data from Big Iron to Big Data (BIBD) platforms. Our deliverables and direction resonate with customers and prospects alike, who are as excited about what we are doing as they are about Splunk!

During his appearance on the CUBE at .Conf2016, Syncsort CEO Josh Rogers defined the Big Iron to Big Data (BIBD) challenge, where customers need to take core data assets created through transactional workloads on the mainframe and move them to next-generation environments for analytics.

With the pace at which things are moving across this market, I am looking forward to returning to .Conf in 2017, when it will be held in Washington DC, my home town. I know that both Splunk and Syncsort will have learned more and developed more, inspired by our customers. I can’t wait to see what we will have co-created and what evolves next from the data-DNA of IT.


A Dream of Great Big Data Riches – Harvesting Mainframe Log Data


In today’s new world of big data analytics, traditional enterprise companies have jewels hidden within their walls, embedded in legacy systems. Among the most precious stones, but perhaps some of the best hidden, are the various forms of mainframe log data.

David Hodgson, June 20, 2016

z/OS system components, subsystems, applications and management tools continually issue messages, alerts, status and completion data, and write them to log files or make streams available via APIs. We are talking hundreds of thousands of data items every day, and much more from big systems. This “log data” generally comes under the heading of unstructured or semi-structured data and has not traditionally been seen as a resource of great value. In some cases it is archived for later manual research if required; in many cases it just disappears! In the case of SMF records, it has traditionally been consumed by expensive mainframe-based reporting products that unlock the value, but at great cost, and you still need special expertise to do anything with it.

What if all this potentially valuable data could be collected painlessly in real-time, made usable by a simple query language and presented in easy to read visualizations for use by operational teams? This sounds like a fantasy dream, but it is what Syncsort and Splunk have achieved through their partnership and products.

Nuggets and gemstones

Of all the data sources we are talking about, SMF (System Management Facility) records are the richest trove, with over 150 different record types that can be collected. SMF provides valuable security and compliance data that can be used for intrusion detection, tracking of account usage, data movement tracking and data access pattern analysis. SMF also provides an abundance of availability and performance data for the z/OS operating system, applications, web servers, DB2, CICS, WebSphere and the MQ subsystem.

But there is much additional information in feeds like SYSLOG, RMF (Resource Measurement Facility) and Log4j. And there are the more open-ended sources that could be considered log data, like the SYSOUT reports from batch jobs.

The gem collector and now Lapidarist too

Syncsort’s solution for the collection of mainframe log data is called Ironstream, and it is a super-efficient pipeline for getting data into Splunk Enterprise or Splunk Cloud. Designed from the start to be lightweight with minimum CPU overhead, Ironstream is a data forwarder that converts log data into JSON field/value pairs for easy ingestion. We built it in direct response to Splunk customers who wanted to complete their enterprise IT picture with critical mainframe data for an end-to-end, 360° view.
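
To give a feel for what “JSON field/value pairs” means in practice, here is an illustrative sketch of a raw console message being turned into a JSON event. The field names and message layout are hypothetical and do not represent Ironstream’s actual output format.

```python
# Illustrative only: one way a raw mainframe console message could be turned
# into JSON field/value pairs for indexing. Field names and message layout
# here are hypothetical, not Ironstream's actual output format.
import json

raw = "IEF403I MYJOB STARTED - TIME=07.15.32"

event = {
    "sourcetype": "syslog",       # hypothetical source type
    "msg_id": raw.split()[0],     # e.g. IEF403I
    "job_name": raw.split()[1],   # e.g. MYJOB
    "message": raw,               # keep the original text for searching
}
print(json.dumps(event))
```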



In addition to all the data sources listed above, Ironstream offers access to any sequential file and USS files. This gives very comprehensive coverage of any source of log data that an organization might be producing from an application. But in addition we offer an Ironstream API that can be used by any application to send data directly to Splunk if it’s not already writing it out somewhere.

Of course, something has to be too good to be true here, doesn’t it? Well yes, one potential issue is the sheer volume of data that is available and the cost of storing it. While all of it could be valuable, most companies are going to want to focus selectively on the items that are most valuable to them now. To address this requirement, our Ironstream engineers became digital Lapidarists. In the non-digital world, Lapidarists are expert artisans who refine precious gemstones into wearable works of art. With the latest release of Ironstream, we now offer a filtering facility that allows you to refine the large volumes of mainframe data by selecting individual fields from records and discarding the rest. By customer request, we have on our roadmap an even more powerful “WHERE” select clause that will allow you to select data elements across records based upon subject or content.
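
The sketch below expresses those two filtering ideas in plain Python: keeping only selected fields from each record, and (the roadmap “WHERE” item) keeping only records whose content matches a predicate. Field names are hypothetical, and this is not Ironstream’s actual configuration syntax.

```python
# Sketch of the two filtering ideas described above: keep only selected
# fields from each record, and keep only records matching a WHERE-style
# predicate. Field names are hypothetical; not Ironstream's actual syntax.
KEEP_FIELDS = {"job_name", "cpu_time", "return_code"}

def select_fields(record):
    """Discard everything except the fields we care about."""
    return {k: v for k, v in record.items() if k in KEEP_FIELDS}

def where(record):
    """WHERE-style content filter: only forward failing jobs."""
    return record.get("return_code", 0) != 0

records = [
    {"job_name": "PAYROLL", "cpu_time": 12.4, "return_code": 0, "step": "S1"},
    {"job_name": "BILLING", "cpu_time": 3.1, "return_code": 8, "step": "S2"},
]
filtered = [select_fields(r) for r in records if where(r)]
print(filtered)  # only the BILLING record, trimmed to the selected fields
```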

Why didn’t I know this?

There is a fast moving disruption happening in the world of IT management and not everyone wants you to know it. Open source solutions and new analytical tools are changing everything.

For the last 40 years, complex point-management tools have been used by highly skilled mainframe personnel to keep mainframes running efficiently. Critical status messages are intercepted on their way to SYSLOG and trigger automation to assist the operational staff. All this infrastructure has made most of this log data unnecessary for operations and mainly of archival interest, if of any interest at all. The most valuable SMF data, usable for capacity planning, chargeback and other use cases, has been kept in expensive mainframe databases and processed by expensive reporting tools.

In parallel to the disruption being driven by emerging technologies, there is a particular skills crisis in the mainframe world: the experts who have been managing these systems for 40-50 years are retiring, and not enough people are being trained to replace them.

Fortunately, in the confluence of these two trends a solution is born. By leveraging this new ability to process mainframe log data in platforms like Splunk and Hadoop, a new generation of IT workers can assist “Mainframe IT” by proactively seeing problems emerge and assisting in their resolution. In the first wave of adoption this will help offset the reduced availability of mainframe skills, but it won’t obviate the need for them completely and it won’t replace the old point-management tools. Yet.

As this technology matures, and machine learning solutions become proven and trusted, we will see a new generation of tools emerge. Based on deep learning, these will replace both the old mainframe tools and the personnel who used them but now want to be left in peace by the lake. My prediction is that as this becomes a reality, we will also see a move of analytics technology back onto the mainframe platform. The old dream of “autonomic computing” will become a reality and a new mainframe will in effect evolve; one that tunes and heals itself.

Find the Treasure!

Syncsort plans to be there, in fact we are leading the way there. We offer the keys to the treasure chest for anyone who wants to follow our map to find the dream of great riches!


New Beginnings on Old Bedrock: Linking Mainframe to Big Data


Following 14 years at CA Technologies, where I held various senior management positions, I joined Syncsort in April of this year. I wanted to become a part of the leading company that is linking Big Iron to Big Data. What will that union yield for the industry, and for me?

David Hodgson, May 23, 2016

I really enjoyed working at CA and learned a lot over the years there. CA is a vibrant, energetic place to work. The employees are smart, the products are good, the installed customer base is amazing and a lot of innovation is occurring. Yes, on the mainframe side of the house too. Last year alone saw three entirely new mainframe products launched, and I am proud to have been a part of the team that did that.

Syncsort is an incredibly interesting company that I had been watching for a while: a forty-year-old mainframe company that is doing some of the most valuable innovation in the big data space for large enterprises. A few years ago, the company re-invented itself as the company to move mainframe data to analytics environments. Strategic partnerships with Hortonworks, Cloudera, MapR, Dell and Splunk, along with some great innovation by the development teams, have transformed Syncsort into a player in the Big Data ecosystem. In fact, Syncsort announced record 2015 results, including the promotion of Josh Rogers to CEO to lead the company forward to fully realize the vision and potential that we have for the next few years.

In my last few years at CA I was very focused on the Big Data space and was interested in the problems that CA could solve there. When Syncsort founder and previous Mainframe GM Harvey Tessler decided he wanted to retire, I talked to Josh and the rest of the Syncsort management team, and we all agreed that I would be a great fit to take over the reins.

A few weeks into the role here I am thrilled with the decision to join. I love being part of a smaller company again where everything is more agile, just because of the small teams, shared mission and sense of urgency. We can do so much at Syncsort from our position of strength on the mainframe and our expertise in data management.

Having now met with several customers, I have confirmed the pattern of needs that we can address. Big Data platform ITOA solutions and business analytics are now the norm. Although the market is evolving quickly and requirements are changing, everyone is doing it; those who think it’s still just talk are missing out big time. Most of these initiatives are not started by Mainframe IT, but in companies with mainframes, the enterprise teams are now at the point of implementation where they realize they need the mainframe data for an effective or complete solution.

The broad use cases that we see include things like real-time monitoring of infrastructure or business services, and real-time awareness of access activity to help spot breaches in security or compliance. What these cases, and others, have in common is a deeper contextual analysis that is impossible with traditional, point monitoring tools. Done right, these solutions can be more effective than current practices and reduce cost by saving on labor, penalties and software.

These same customers currently indicate that they are unlikely to dump the traditional management tools, but I actually wonder about that myself. As practices in data gathering and machine learning mature I think we will quickly see the start of next-gen automation that may make the old tools redundant. In the case of the mainframe this may become a necessity when, as an industry, we lose the skills of the baby boom generation and fail to replace the depth of knowledge they have.

By joining Syncsort I have brought myself to the coal-face, where we are mining the black-stuff out of the Big Iron legacy systems. As one of those whose career has been based on the strength of the mainframe, and its continual re-invention, I hope that I can be a part of the next round of evolutionary changes. Changes that will enable the mainframe to serve the industry for a renewed lease of life. New beginnings on old bedrock. The decade of ITOA and the dawning of AI applied to business systems.


Lessons from autonomous cars could drive your business


Among the more dramatic aspects of the Digital Transformation must surely be the prospect of driverless cars. It truly feels like science fiction becoming reality. It is useful to examine this phenomenon as it emerges, to see the lessons about AI that we can learn and apply to business transformations elsewhere.

David Hodgson, April 10, 2016

As the prospect of the mass deployment of driverless cars comes careening towards us, are there lessons about AI that you could learn to get a competitive edge for your business? Like many other applications of AI, autonomous vehicles have taken longer to arrive than expected, and longer than predicted even a few years ago. But make no mistake, the robots are coming, and all that we have imagined about AI will probably pale in comparison with the reality we will experience when it is widely adopted.

It seems like every car manufacturer now has plans to introduce driverless vehicles, and there is continued growth in the use of software to transform the experience. Making the in-car experience a connected one is a natural extension of our lives now, and a recent McKinsey report showed that consumers are increasingly willing to pay more for this.

What could be

The development and imminent delivery of highly connected, self-driving cars is very exciting to those who love technology, and I am really looking forward to buying one! However, what is perhaps more interesting are the wide-ranging follow-on developments and ramifications that go way broader than our immediate riding experience.

Traffic lights will eventually disappear, as will driving licenses, both being redundant. Denser parking will mean more available space, as computers become adept at squeezing vehicles in and organizing them for access. There will be no more tickets or traffic courts, and a reduced police presence on roads. Traffic policies for special events can be piped to cars in the area programmatically and dynamically adjusted, reducing or eliminating frustrating backups.

Signposts will no longer be needed, although of course map data will be more important, and the cars could be the cartographers. Similarly, the cars could monitor the state of repair of roads and dispatch an autonomous truck with a mending crew of swarm bots and supplies to fix potholes. Damage from road usage might be reduced through coordination to vary the precise paths by slight offsets for more even surface wear.

Certainly there will be improved flow and throughput through reduced “noise” and disruptive movement. Connectedness could create convoys of cars going to similar destinations with individuals peeling off and joining as needed.

With reduced accidents, hospital and ER space will be freed up and there will be massive disruption to the insurance industry meaning it will probably be at the forefront of resisting adoption!  For sure there will be massive disruption to jobs wherever it is discovered that AI based robotic systems replace humans. One of the first might be the trucking industry where driverless transport convoys are already being tested in Europe.

The general point here is that the AI required to drive cars drives a much broader impact on business and society than the specific solution area itself. The same thing will be true for changes that AI makes to your business.

The bigger picture and what it means to you

Seen in this bigger context, autonomous cars become an instructive use case of the disruptive influence of AI on the business processes of an industry and connected markets. Whether or not your business will be impacted by driverless cars, you should get ahead of how AI can be leveraged in your industry. It might be a wave that you can ride to survive and surf over your dying competitors that ignore it.

The term “cognitive business” describes a company instrumented with systems that understand data and can realize new insights on their own. This is not as far-fetched as it sounds, and we can see the dawning of the possibilities with IBM’s offerings that open up Watson as a cloud service through APIs.

In this scenario computers do the more significant things faster, better and more reliably than people. Not just math and report creation, as they have done traditionally, but “thinking”, predicting and decision making. Eventually it leads to software that maintains and modifies its own algorithms to better solve existing problems and to solve new ones.

Imagine a cognitive supply chain that can quickly adapt to real-time changes in demand, differentiate between local and national trends, and accurately predict the impact of upcoming events: both known ones, like social, sporting and weather events, and hidden-pattern events perhaps created by competitor activity or changes in consumer preferences. It could balance activity between online and bricks-and-mortar store fronts. It could optimize manufacturing, distribution and stock levels. And of course, given our theme, it could interact with fleets of autonomous distribution and delivery vehicles.

To achieve this and other scenarios, the AI will be integrated with huge amounts of “Big Data” but will also leverage human knowledge. Some of the most powerful solutions will be the interactions of experts with AI systems.  We have seen this already in the advanced weaponry of fighter planes and drone systems.  The medical world holds great promise for new solutions that combine expertise in this way too. There is no reason to think that advanced business systems will not be implemented in the same way.

You control your future

All of this is in the future right now, but the sooner you get started in preparing yourself, the more likely you are to be a winner. This means experimenting with advanced analytics now, finding new uses for existing data and discovering new sources of data. And while you do that, simultaneously start to grasp the security and compliance aspects of gathering and processing all this existing and new data in new ways.

The best time to plant a tree was 20 years ago, but assuming that your strategy planning has not been that prescient, then there is no time like the present to start planning for the future.




Can AlphaGo Help You Stay Alpha Dog?


The recent triumph of the AI program AlphaGo playing against a human signals just how far advanced analytics has come. What lessons can you learn to get a competitive edge for your business?

David Hodgson, March 15, 2016

Almost two decades ago, in 1997, IBM’s Deep Blue chess-playing computer beat the reigning world champion, Garry Kasparov, in a six-game match under tournament conditions. The world realized, perhaps for the first time, that HAL of “2001: A Space Odyssey” fame was going to arrive at some point, though a few years later than cast by Kubrick.

Then, in 2011, IBM’s Watson computer stunned us by winning at Jeopardy. If you haven’t actually seen Watson playing Jeopardy click on that YouTube link; it’s truly awesome. The feeling of invasion is greater seeing Watson, perhaps because we can all imagine playing Jeopardy, and the question and answer approach is so “human”.

Which brings us to current events. Google’s DeepMind research team has developed AlphaGo, which beat Fan Hui 5-0 last October. Hui is the current European Go champion and a 2 dan master. This was impressive enough, but today saw AlphaGo win 4-1 playing Lee Sedol, the current World Champion, a South Korean 9 dan Grandmaster. Send Lee Sedol a message of support somehow, because being at the coalface of human defeat by computers must be tough.

What is happening here?

Closed-system games like chess and Go are complicated, but they have simple rules and a known, although massive, number of variables. There are more possible Go board move sequences than the estimated 10^80 atoms in the visible universe. This is a formidable problem, but of a different sort to the open-ended question-and-answer format of a game like Jeopardy.

AlphaGo’s algorithms use a combination of value-weighted Monte Carlo tree search techniques and a neural network implementation. The DeepMind team’s approach to machine learning involved extensive training from both human and computer play; AlphaGo played itself to rapidly learn the outcomes of numerous different options.
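For the curious, here is a highly simplified illustration of the “tree search guided by value estimates” idea, using the classic UCB1 selection rule to balance a move’s observed win rate against how rarely it has been explored. It is a toy sketch of the general technique, not AlphaGo’s actual algorithm or network architecture.

```python
# Toy illustration of the selection step in Monte Carlo tree search:
# pick the child move that best balances its observed win rate against
# how rarely it has been explored (the UCB1 rule). Not AlphaGo's algorithm.
import math

def ucb1_choice(children, exploration=1.4):
    """children: list of dicts with 'wins' and 'visits'. Returns the index to explore next."""
    total_visits = sum(c["visits"] for c in children)
    def score(c):
        if c["visits"] == 0:
            return float("inf")  # always try unvisited moves first
        win_rate = c["wins"] / c["visits"]
        return win_rate + exploration * math.sqrt(math.log(total_visits) / c["visits"])
    return max(range(len(children)), key=lambda i: score(children[i]))

moves = [{"wins": 6, "visits": 10}, {"wins": 3, "visits": 4}, {"wins": 0, "visits": 0}]
print(ucb1_choice(moves))  # -> 2 (the unexplored move)
```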

Watson used Hadoop to store masses of unstructured information, including the entire text of Wikipedia, which it could search with analytical techniques in real time. Equally significant in Watson’s case is that it was responding to natural language questions that it first had to understand using similar search techniques.

A third powerhouse for change, Facebook, is also experimenting with AI systems and has its own Go-playing system, Darkforest, also based on combining machine learning and tree search techniques.

Between them and the numerous other AI projects underway in different domains, we have the building blocks for HAL’s arrival.

So what?

I hear some saying, “So what, David? This is interesting to learn about, and with the election I had missed it in the news, but of what importance is it to me?”

DeepMind is targeting smartphone assistants, healthcare, and robotics as the practical outcome for their experimental work with AlphaGo. From their website:

“The algorithms we build are capable of learning for themselves directly from raw experience or data, and are general in that they can perform well across a wide variety of tasks straight out of the box.”

IBM has already applied versions of Watson to practical problems, offers it as a service for anyone to buy, and has built a developer community to encourage experimentation. An example of a practical application is the partnership with Sloan Kettering to fine-tune cancer treatment. Similarly, DeepMind is partnering with the UK’s National Health Service to improve its services.

Although much secret sauce is often kept for specific solutions, the framework of these systems is usually open-source software. An important component of Watson is the Apache Unstructured Information Management Architecture (UIMA) software. These same tools and techniques will be what disrupt your business soon, and you will want to be an early adopter.

Fed with the right data, a Watson-type system could answer new questions that nobody yet knows the answers to. Or, applied to real-world problems, an AlphaGo-type system could decide on the best course of action given many variables and alternatives. Leading the field in practical solutions, IBM calls this “cognitive business”, and it is definitely a part of our future.

You Control your Future

In the panorama of the Digital Transformation, AI is out there as a wildcard with seemingly limitless possibilities. We are both familiar with, and scared of, these futures because of numerous science fiction dramas. HAL is not here yet, but it’s coming. For you it’s really a case of whether your company or the competition deploys machine learning systems first. You don’t need an AI system to answer that question.






Annual Analytics Assessment


At this time last year I celebrated the Chinese New Year by publishing my predictions about the world of analytics and Big Data. It’s always fun to look back and hold yourself accountable, so let’s go!

David Hodgson, January 14, 2016

In 2015 the Chinese year of the Ram started on February 19. For 2016, the year of the Monkey, the new year has moved forward to start on February 8. It is determined by a zodiac cycle and so bounces around a bit compared to our Gregorian calendar, which approximates the solar year.

So how did I do on my five predictions?

To keep me honest you can read my blog entry from last year here.

#1 I correctly predicted that people would still be using the term “Big Data”. No-one likes this buzzword, but those who predicted its demise were too optimistic! It’s here to stay because culturally we like reductionist simplicities.

#2 I may have been a bit optimistic in saying that every major enterprise would have a real, funded Big Data strategy by the end of 2015. I bet it’s close though. Analysts reported continued adoption driven by LoB departments, with Marketing front and center.

#3 I said that “data agility” would be the aspirational driver of big data strategies. We did indeed continue to see people moving data from traditional proprietary data warehouses into more portable forms. However, the erosion slowed as the incumbents found ways to embrace Hadoop, and the new became more additive to the old than a replacement for it. ’Twas ever thus with IT.

#4 2016 will get us to 10 years of Hadoop being used as the primary tool of Big Data analytics. I got this right, but several forecasters were saying that we would have moved on from Hadoop to other technologies. The elephant still stands largest, squarely in the middle of the room. True, we saw Cloudera and many enterprise adopters embrace Apache Spark over MapReduce, but still in the context of the Hadoop stack and ecosystem.

#5 The IoT did not yet become the shaping force that I thought it might. Of course it waits in the wings, growing slowly (who got Nest thermostats last year?), just needing enterprises to really embrace the new technologies involved.

So what trends did emerge to continue as shapers for 2016? I’ll just call out two areas this year.

#1 Analytics and the IoT

It is still early days for enterprises finding business advantage with analytics on IoT-generated data, but I think we will see significant progress in the year ahead. Most speculators would bet on the manufacturing industry for results here. But given the proliferation of wearable technology in the last 18 months, my bet is that we will see serious headway in healthcare-related analytics. For this prediction to come true, it probably depends on the next area.

#2 Security and Compliance

Security and compliance issues remain a barrier to larger-scale production implementations, particularly where PII may be involved. I predict that we will start to see better-defined processes and procedures around the handling and merging of structured and unstructured data.

Otto Berkes of CA Technologies has suggested that Bitcoin’s blockchain protocol could be re-used as a secure and validated way for IoT devices to communicate and exchange data. Otto is a lot smarter than me, so I will just say that in 2016 we will see stronger solutions emerge to make the IoT secure and less vulnerable to corruption by hackers.

The Monkey Wrench

OK, so that’s three predictions really, not two; I’ll review them next January. What will the Year of the Monkey really bring? With the economy picking up steam, analytics will be central to IT investment and hiring. We will see a lot of companies copying each other (as monkeys do), but let’s look out for the “alpha ape” trend setters; those who will take us into new territory. Who are you watching? Let me know by commenting at the top of this blog.



What Big Data dreams may come


Instead of asking, “What’s the big deal with Big Data?” try asking, “What’s the big dream?”

David Hodgson, March 26, 2015

The best ventures start with a big dream and big dreams come in different sizes. In the world of data analytics this might be cracking a top secret code, growing your business, detecting fraud in real time or successfully making purchasing recommendations based on profiles and patterns. However big your dream appears to other people, it is important to you and the goals you want to achieve, so it is worth planning properly and equipping yourself with the tools you need.

In my last few blog posts, I’ve been talking about the challenge of gaining a competitive edge for your business with all the data that is available to you, structured and unstructured. This is the Big Dream.

The importance of creative experimentation and diversity

The growth and transformation of data analytics in recent years has generally been fueled by the interest from lines-of-business (LOBs) within a company – not by the central IT department. The LOBs have wanted to experiment with tools they thought might give them a competitive edge, an extra business insight. This experimentation is an essential ingredient of success and LOBs must be allowed to empower themselves by choosing the tools and services they need.

This revolution has been enabled by the availability of commodity compute power (on-premise and cloud) and the emergence of open source tools driven mainly by the Apache Software Foundation. The diversity of tools that has emerged is incredible. Hadoop is not one thing, but a stack of technology surrounded by an ecosystem of supporting tools that allow you to build the solution you need. I wrote about this explosion of creativity in an earlier blog post. There are multiple distributions of the stack with proprietary code added for different advantages. And then there are numerous NoSQL databases (Cassandra, MongoDB and probably 30 others) that are suited to different sorts of analytic operations.

This creative experimentation and diversity is still incredibly important. We have not yet seen the end of the tool evolutions and are just beginning to realize the potential of these new technologies and the ways businesses can transform themselves in the application economy.

How can IT help?

IT’s big dream is to be more than a cost center and to align with the business as an essential ingredient in the recipe for achieving a company’s goals. In the fast-proliferating ecosystem of analytical tools and components, LOBs must have the freedom to experiment with the latest technologies. We are past the point when IT can control and limit software usage for its own convenience. To remain relevant, the IT department must support the new big data playbook and be the experts who learn the latest technology.

But how can IT do this and control costs? Even as a driver of business success, the IT department cannot run amok past the boundaries of their budget. So how can they both encourage creative experimentation and contain costs? How can they find all the skills they would need to use all the different tools?

A Big Data management strategy

CA Big Data Infrastructure Management (BDIM) is now being demonstrated and may be an answer for IT departments as they seek to help LOBs with analytics. CA BDIM provides a normalized management paradigm for different Hadoop distributions and NoSQL databases. It is a tool that will scale with your growing analytics implementations and will allow the LOBs to easily switch tools or platforms.

This single, unified-view approach will help reduce operational costs as experiments scale to production operations. With automation, productivity is increased and the differences between Hadoop distributions become much less of a problem. For IT, the potential is a single management tool to manage all clusters, nodes and jobs. For LOBs, the potential is an IT partner that can take the burden of production operations off of them while leaving them the freedom to easily change direction.

CA BDIM will be on display at the Gartner Business Intelligence & Analytics Summit, March 30 through April 1, 2015, in booth 525 in Las Vegas. This first release is targeted at early adopters; we aim to deliver new functions and support every 60 days and to evolve the product to deliver maximum value for those who want to use it.

Is the Big Dream just big hype or your next key move?

While some analysts might say the industry is climbing up one side of the “hype cycle”, I don’t believe one curve fits all phenomena. Real results are being achieved by innovators in this area. While many others still don’t understand the full potential, might be confused and can sound like detractors, this noise does not indicate hype on this occasion. We are in the middle of a paradigm shift where old technologies and practices will be left behind and the new adopted.

Companies with aspirations of winning in the application economy will take advantage of the new analytics and partner with their IT department to move ahead and succeed. A few weeks ago, in North America at least, we adjusted our clocks to spring ahead into brighter, longer days. Check out CA BDIM and try it for yourself to see if the product could help your company spring ahead of the pack and realize your Big Dream of success.