Enterprise Big Data Predictions, 2016

Tuesday, 8 December 2015 00:01

By Neil Mendelson – Vice President, Product Management

Companies big and small are finding new ways to capture and use more data. The push to make big data more mainstream will get stronger in 2016. Here are Oracle’s top 10 predictions:

Data civilians operate more and more like data scientists. While complex statistics may still be limited to data scientists, data-driven decision-making shouldn’t be. In the coming year, simpler big data discovery tools will let business analysts shop for datasets in enterprise Hadoop clusters, reshape them into new mashup combinations, and even analyse them with exploratory machine learning techniques. Extending this kind of exploration to a broader audience will both improve self-service access to big data and provide richer hypotheses and experiments that drive the next level of innovation.
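To make the idea concrete, here is a minimal sketch of the kind of exploratory analysis such a discovery tool might surface, assuming a dataset has already been pulled out of a Hadoop cluster into a flat table. The column names and synthetic data are hypothetical stand-ins, not any product's workflow.

```python
# Minimal sketch: exploratory clustering over a dataset an analyst
# "shopped for" in a Hadoop cluster. The data here is synthetic; in a
# discovery tool the table would come from HDFS/Hive.
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Stand-in for the retrieved dataset (hypothetical columns).
features, _ = make_blobs(n_samples=500, centers=4, n_features=3, random_state=42)
df = pd.DataFrame(features, columns=["spend", "visits", "tenure"])

# Exploratory step: scale, cluster, and attach segment labels.
scaled = StandardScaler().fit_transform(df)
df["segment"] = KMeans(n_clusters=4, n_init=10, random_state=42).fit_predict(scaled)

# Quick look at each discovered segment's profile.
print(df.groupby("segment").mean())
```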

Experimental data labs take off. With more hypotheses to investigate, professional data scientists will see increasing demand for their skills from established companies. For example, banks, insurers, and credit-rating firms will turn to algorithms to price risk and guard against fraud more effectively. But many such decisions are hard to migrate from clever judgments to clear rules. Expect a proliferation of experiments in default risk, policy underwriting, and fraud detection as firms try to identify hotspots for algorithmic advantage faster than the competition.
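As a rough illustration of one such experiment, the sketch below scores synthetic transactions for fraud risk with a simple classifier. Every feature, label rule, and dataset here is invented for illustration, not a real risk model.

```python
# Sketch of a fraud-detection experiment on synthetic transactions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([
    rng.lognormal(3, 1, n),   # transaction amount (synthetic)
    rng.integers(0, 24, n),   # hour of day
    rng.random(n),            # distance from home, normalised
])
# Invented ground truth: fraud skews toward large amounts far from home.
y = ((X[:, 0] > 60) & (X[:, 2] > 0.7)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]
print("AUC:", round(roc_auc_score(y_test, scores), 3))
```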

DIY gives way to solutions. Early big data adopters had no choice but to build their own big data clusters and environments. But building, managing and maintaining these unique systems built on Hadoop, Spark, and other emerging technologies is costly and time-consuming. In fact, the average build time is six months. Who can wait that long? In 2016, we’ll see these technologies mature and become more mainstream thanks to cloud services and appliances with pre-configured automation and standardisation.

Data virtualisation becomes a reality. Companies not only capture a greater variety of data, they use it in a greater variety of algorithms, analytics, and apps. But developers and analysts shouldn’t have to know which data is where or get stuck with just the access methods that repository supports. Look for a shift in focus from using any single technology, such as NoSQL, Hadoop, relational, spatial or graph, to increasing reliance on data virtualisation. Users and applications connect to virtualised data via SQL, REST and scripting languages. Successful data virtualisation technology will offer performance equal to that of native methods, complete backward compatibility and security.
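The idea can be pictured as a thin routing layer: callers issue one logical query and never learn which store answers it. The sketch below is a toy illustration of that pattern with in-memory stand-ins for the backends, not any vendor’s API.

```python
# Toy data-virtualisation layer: one access method, many backends.
class VirtualCatalog:
    def __init__(self):
        self._routes = {}  # logical table name -> backend callable

    def register(self, table, backend):
        self._routes[table] = backend

    def query(self, table, **filters):
        # The caller sees one interface regardless of the backing store.
        rows = self._routes[table]()
        return [r for r in rows if all(r.get(k) == v for k, v in filters.items())]

# Hypothetical backends: one "relational", one "NoSQL" document store.
def relational_orders():
    return [{"id": 1, "region": "EU"}, {"id": 2, "region": "US"}]

def document_events():
    return [{"id": "a", "kind": "click"}]

catalog = VirtualCatalog()
catalog.register("orders", relational_orders)
catalog.register("events", document_events)

print(catalog.query("orders", region="EU"))  # caller never learns the source
```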

Dataflow programming opens the floodgates. Initial waves of big data adoption focused on hand-coded data processing. New management tools will decouple and insulate the big data foundation technologies from higher-level data processing needs. We’ll also see the emergence of dataflow programming, which takes advantage of extreme parallelism, provides simpler reusability of functional operators, and gives pluggable support for statistical and machine learning functions.
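A small pure-Python sketch shows the shape of the dataflow style: reusable functional operators composed into a pipeline, with a statistical step plugged in as just another operator. Engines such as Spark apply the same pattern with real parallelism at cluster scale; the `pipe` helper below is illustrative only.

```python
# Dataflow sketch: compose small reusable operators into one pipeline.
from functools import reduce

def pipe(*stages):
    """Compose stages left-to-right into a single dataflow."""
    return lambda data: reduce(lambda d, stage: stage(d), stages, data)

# Reusable functional operators.
fmap = lambda f: lambda data: map(f, data)
ffilter = lambda p: lambda data: filter(p, data)

# A pluggable statistical function is just another operator.
def mean(data):
    items = list(data)
    return sum(items) / len(items)

flow = pipe(
    fmap(lambda c: c * 9 / 5 + 32),   # Celsius -> Fahrenheit
    ffilter(lambda f: f > 50),        # keep warm readings
    mean,                             # terminal statistic
)
print(flow([5, 12, 21, 30]))  # -> 69.8
```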

Big data gives AI something to think about. 2016 will be the year when Artificial Intelligence (AI) technologies such as Machine Learning (ML), Natural Language Processing (NLP) and Property Graphs (PG) are applied to ordinary data processing challenges. While ML, NLP and PG have already been accessible as API libraries in big data, the new shift will include widespread applications of these technologies in IT tools that support applications, real-time analytics and data science.
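For the property-graph piece, here is a minimal example using networkx to attach key/value properties to nodes and edges and filter on them. The schema is invented for illustration; dedicated graph engines expose richer query languages over the same model.

```python
# Minimal property graph: nodes and edges carry key/value properties.
import networkx as nx

g = nx.DiGraph()
g.add_node("alice", kind="customer", segment="premium")
g.add_node("acct42", kind="account", balance=1200)
g.add_edge("alice", "acct42", rel="owns", since=2014)

# "Find accounts owned by premium customers" as a property lookup.
for src, dst, props in g.edges(data=True):
    if props["rel"] == "owns" and g.nodes[src].get("segment") == "premium":
        print(src, "->", dst, g.nodes[dst])
```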

Data swamps try provenance to clear things up. Data lineage used to be a nice-to-have capability because so much of the data feeding corporate dashboards came from trusted data warehouses. But in the big data era, data lineage is a must-have because customers are mashing up company data with third-party data sets. Some of these new combinations will incorporate high-quality, vendor-verified data. But others will use data that isn’t officially perfect but is good enough for prototyping. When surprisingly valuable findings come from these opportunistic explorations, managers will look to the lineage to know how much work is required to raise the underlying data to production quality.
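A sketch of what “looking to the lineage” could mean in practice: metadata travels with a mashed-up dataset so a reviewer can find its least-trusted source. The quality tiers and field names here are hypothetical, not any product’s schema.

```python
# Lineage metadata attached to a mashed-up dataset (illustrative).
from dataclasses import dataclass, field

@dataclass
class Lineage:
    sources: list = field(default_factory=list)

    def add(self, name, quality):
        self.sources.append({"source": name, "quality": quality})

    def weakest_link(self):
        # Hypothetical trust ordering, lowest first.
        order = {"prototype": 0, "vendor-verified": 1, "warehouse": 2}
        return min(self.sources, key=lambda s: order[s["quality"]])

lineage = Lineage()
lineage.add("internal_sales_warehouse", "warehouse")
lineage.add("thirdparty_demographics", "vendor-verified")
lineage.add("scraped_competitor_prices", "prototype")

# "How much work to productionise?" starts with the weakest source.
print(lineage.weakest_link())
```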

IoT + Cloud = Big Data Killer App. Big data cloud services are the behind-the-scenes magic of the internet of things (IoT). Expanding cloud services will not only capture sensor data but also feed it into big data analytics and algorithms that put it to use. Highly secure IoT cloud services will also help manufacturers create new products that safely take action on the analysed data without human intervention.
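The loop this paragraph describes, sensor data in, analytics in the middle, automated action out, can be sketched in a few lines. The device name, threshold, and `actuate` call below are hypothetical stand-ins for a secure IoT cloud service.

```python
# Sketch of the IoT loop: readings -> rolling analytic -> automated action.
from collections import deque
from statistics import mean

WINDOW = deque(maxlen=5)    # rolling window of recent readings
SHUTDOWN_TEMP_C = 90.0      # hypothetical safety threshold

def actuate(device, command):
    # Stand-in for a secure IoT cloud call back to the device.
    print(f"sending {command!r} to {device}")

def on_reading(device, temp_c):
    WINDOW.append(temp_c)
    if mean(WINDOW) > SHUTDOWN_TEMP_C:
        actuate(device, "shutdown")  # no human in the loop

for reading in [70, 85, 92, 95, 99, 101]:
    on_reading("pump-7", reading)
```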

Data politics drives hybrid cloud. Knowing where data comes from – not just what sensor or system, but from within which nation’s borders – will make it easier for governments to enforce national data policies. Multinational corporations moving to the cloud will be caught between competing interests. Increasingly, global companies will move to hybrid cloud deployments with machines in regional data centres that act like a local wisp of a larger cloud service, honouring both the drive for cost reduction and regulatory compliance.
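One way to picture the compliance side of such a deployment is a residency router that sends each record to the regional data centre matching its country of origin. The region names and endpoint map below are entirely hypothetical.

```python
# Sketch of data-residency routing in a hybrid cloud (hypothetical map).
REGIONAL_ENDPOINTS = {
    "DE": "eu-frankfurt.example.internal",
    "FR": "eu-frankfurt.example.internal",
    "US": "us-east.example.internal",
}

def route(record):
    endpoint = REGIONAL_ENDPOINTS.get(record["country"])
    if endpoint is None:
        raise ValueError(f"no compliant region for {record['country']}")
    return endpoint  # the write stays in the local wisp of the cloud

print(route({"country": "DE", "payload": "..."}))
```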

New security classification systems balance protection with access. Increasing consumer awareness of the ways data can be collected, shared, stored – and stolen – will amplify calls for regulatory protections of personal information. Expect to see politicians, academics and columnists grappling with boundaries and ethics. Companies will increase their use of classification systems that categorise documents and data into groups with pre-defined policies for access, redaction and masking. The continuous threat of ever more sophisticated hackers will prompt companies both to tighten security and to audit access and use of data.
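A classification system of this kind boils down to a policy table keyed by sensitivity class, with masking and auditing applied on read. The class names and rules below are illustrative, not a standard.

```python
# Classification-driven policies: each class maps to handling rules.
POLICIES = {
    "public":       {"mask": False, "audit": False},
    "internal":     {"mask": False, "audit": True},
    "confidential": {"mask": True,  "audit": True},
}

def read_field(value, classification):
    policy = POLICIES[classification]
    if policy["audit"]:
        print(f"audit: field of class {classification!r} accessed")
    # Masking keeps only the last two characters visible.
    return "****" + value[-2:] if policy["mask"] else value

print(read_field("4111-1111-1111-1234", "confidential"))  # -> ****34
```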
