
Big Data and the Challenge of Unstructured Data

August 29th 2017
There’s a lot of buzz lately about Big Data and the privacy issues inherent in collecting and storing so much personal information.

While that’s a legitimate and real concern for both consumers and those who work in data security, industry insiders are facing another challenge: how to handle unstructured data?


What is unstructured data?

Unstructured data doesn’t fit neatly into databases organised by fixed categories like name, address, social security number, etc. Unstructured data is the freeform information that is mined from things like social media posts, notes made by a call centre agent, email, or Twitter conversations with customers. Unstructured data can be an extremely rich source of relevant information, but it doesn’t easily lend itself to older models of data storage and analysis.


What are some of the challenges?

The challenges of unstructured data run the gamut, from gathering it, to storing it, to using it to make decisions:


One way in which relevance comes into play is lack of insight into the “backstory” of certain pieces of data. For instance, a student might do a search on a particular topic or product to gather information for a school paper, and then never search for those keywords again. If so, that search would be irrelevant to any subsequent consumer behaviour, but the computers doing Big Data analysis wouldn’t know that. The system assumes a relationship that simply isn’t there.

Another big challenge in working with unstructured data comes into play with machine learning and highlights the importance of knowing which factors actually drive consumer behaviour. It’s the classic “correlation or causation” dilemma on steroids. An analytic model could give too much weight to factors that are merely correlated, and, thanks to machine learning, the more the correlation is noted, the more weight it’s given. But, since there is no actual causation, the conclusions are inaccurate, and they become more so as time goes on.
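To make the correlation-versus-causation point concrete, here is a small Python sketch with made-up numbers. Two hypothetical sales series are both driven by a third factor (temperature), so they correlate almost perfectly even though neither causes the other. The data, the variable names, and the hand-written pearson helper are assumptions for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented figures: temperature is the hidden factor behind both series.
temperature = [12, 15, 19, 23, 27, 31]
ice_cream_sales = [2 * t + 5 for t in temperature]
sunscreen_sales = [3 * t - 10 for t in temperature]

# Neither product causes sales of the other, yet the correlation is near 1.
r = pearson(ice_cream_sales, sunscreen_sales)
print(round(r, 3))
```

A model that only sees the two sales series would happily treat one as a predictor of the other. If the hidden driver changes, that learned relationship stops holding, which is the kind of error the paragraph above describes.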



For many businesses, the sheer volume of unstructured data is more than they can keep up with, and they may be collecting information they’re not even aware of. That presents challenges for both using and securing the data. The lack of awareness makes it more likely that enterprises will run afoul of the increasing number of regulations addressing data privacy. Such a large volume of data also requires infrastructure that many businesses don’t currently have, or haven’t budgeted for.


By nature, a large volume of unstructured data is unverified. There are plenty of jokes about “Facebook lives,” in which a person’s Facebook updates are more fantasy than reality. One effect of growing privacy concerns is the tendency for people to make up details for their profiles, in which even the hard “facts” – like marital status and hometown – can be completely false. This presents serious challenges for consumers and enterprises. On a consumer level, people could be negatively impacted by companies that make decisions based on flimsy unstructured data, like using a person’s social media posts to help determine insurance rates. On an enterprise level, making business decisions based on inaccurate data could be extremely costly.


For unstructured data to be usable, businesses will have to come up with a way to locate, extract, organise, and store the data. This means coming up with an entirely new type of database to store information that doesn’t fit the mould.
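As a rough illustration of the locate, extract, organise, and store steps, the Python sketch below pulls a few fields out of a freeform call-centre note. The note, the field names, and the regular expressions are all invented for illustration; real unstructured data is far messier, and this is not a recommendation for any particular storage design.

```python
import json
import re

# Hypothetical free-text note, invented for illustration only.
NOTE = (
    "Customer jane.doe@example.com called on 2017-08-14 "
    "about order #48213; wants a refund."
)

# Illustrative patterns, not a production-grade parser.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
DATE = re.compile(r"\d{4}-\d{2}-\d{2}")
ORDER = re.compile(r"#(\d+)")


def extract(note):
    """Locate a few known fields in free text and organise them into a record."""
    email = EMAIL.search(note)
    date = DATE.search(note)
    order = ORDER.search(note)
    return {
        "email": email.group(0) if email else None,
        "date": date.group(0) if date else None,
        "order_id": order.group(1) if order else None,
        "raw": note,  # keep the original text, since the patterns will miss things
    }


record = extract(NOTE)
# Serialising to JSON stands in for the "store" step.
print(json.dumps(record, indent=2))
```

Keeping the raw note next to the extracted fields is a deliberate choice in this sketch: whatever the patterns fail to capture is still available for later analysis.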

Unstructured Big Data isn’t going away. And that’s a good thing, because it holds the opportunity for greatly enhanced planning and decision-making. Contact us to develop and execute a plan for using Big Data, rather than “drinking from a fire hose” as so much data arrives so fast that it becomes useless.

Editor’s Note: This post was originally published in October 2015 and has been updated for accuracy and comprehensiveness.
