Saturday, March 2, 2019

The future of data visualization

Data visualization today ranges from basic infographics with little data and almost no functionality to impenetrable designs with convoluted graphics, created by computer scientists to display the behavior of complex systems, MS-DOS-style. In either case, it takes longer to understand what is on the screen than the data is worth.

Recently, the explosive growth of data collection and storage has created the need for better management and understanding of data, and consequently for improved data visualization tools. But to understand the challenge we face, it is necessary to go back in time and look at the history of data visualization.

When we think of early data visualization, we usually picture old maps or Charles Minard's famous chart of Napoleon's Grande Armée march of 1812, one of the earliest successful infographics ever created.
But the real origin of the field is much older. It dates back to the 4th millennium BC, when the Mesopotamians, looking for a more effective way to "sign" contracts, first made clay balls containing rods and spheres representing what was being traded, and soon replaced these tokens with markings on clay tablets, inventing writing in the process.

From clay tablet to papyrus, to paper, and finally to the computer screen, technological progress has been relentless. And likewise, from writing to accounting to Excel spreadsheets, the conceptual innovations have been too numerous to count... until now.

But as we stand on the edge of a new revolution instigated by big data, the need for a new leap has become more obvious.

As the technology stands today, computers allow us to build complex multi-dimensional Excel tables linking hundreds of variables to each other, or maps with multiple variables represented as layers stacked on top of each other. But in almost every case, the results are too complex to visualize or make sense of. Simplification has become necessary, but how?

To get a better grasp of the challenge, a practical example helps, so let's look at aviation, where in a little over 100 years we have learned, mostly the hard way, how to visualize data effectively enough to keep our planes in the air.

Early on, there was no cockpit and therefore no data. The only feedback pilots received was through their senses: listening to the engine, keeping their balance and direction, and, if possible, judging altitude and speed. But as the machines became bigger and the engines more complex, dials and gauges made their appearance, multiplying endlessly until the early 1970s, when the cockpit of the new 747-100 ended up completely covered from floor to ceiling.

Since two layers of instruments were not technically possible, a new approach became necessary, and monitor screens made their appearance. All the data was still accessible, but as long as no urgent action was needed it was not displayed, although the pilot could "dive" as deep as necessary on demand. We now had a hierarchy of data, with only essential and urgent information displayed at any given time. As the cockpit became digital, all the earlier one-dimensional feedbacks disappeared, replaced by a few complex monitor screens combining data as needed for practical use.

It is there, in the airplane cockpit, that I believe we will find the future of data visualization. The reason aviation progressed so far so fast is that it had direct and unforgiving feedback: poorly designed planes invariably crashed, and this severe natural selection quickly produced efficient and effective cockpits, so much so that today flying has become one of the safest ways to travel.

Compared to this, data visualization in almost every other field is still relatively backward: sometimes because it is difficult to get feedback from a spreadsheet, even when there is a real company behind it, but also because fields such as mobile phone interaction and intelligent home management are still at a very early stage of development.

If this is correct, the technologies just now emerging (3D, holographic, haptic) will only be the most recent embodiment of the Sumerian clay tablets: the medium we use to access invisible data. But the actual data visualization will depend on advanced software that builds a hierarchy of the data, deciding what must be displayed at any moment, in conjunction with other data and background information, to ensure maximum attention and relevance of the interface.
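To make the idea concrete, here is a minimal sketch of such a display hierarchy in Python. All the signal names, values, and thresholds are invented for illustration, not taken from any real cockpit or monitoring system; the point is only the mechanism: show what is urgent, keep everything else one request away.

```python
from dataclasses import dataclass

@dataclass
class Signal:
    name: str       # hypothetical metric name, e.g. "engine_temp"
    value: float
    urgency: float  # 0.0 (background) to 1.0 (act now)

class HierarchicalDisplay:
    """Cockpit-style display: surface only urgent signals, keep the rest on demand."""

    def __init__(self, signals: list[Signal], threshold: float = 0.7):
        self.signals = signals
        self.threshold = threshold

    def visible(self) -> list[Signal]:
        # Only essential and urgent data is displayed at any given time.
        return [s for s in self.signals if s.urgency >= self.threshold]

    def drill_down(self, name: str) -> list[Signal]:
        # The operator can still "dive" as deep as necessary on demand.
        return [s for s in self.signals if s.name == name]

signals = [
    Signal("engine_temp", 642.0, 0.9),    # urgent: displayed
    Signal("fuel_flow", 2.3, 0.4),        # background: hidden until requested
    Signal("cabin_pressure", 11.2, 0.2),  # background
]
display = HierarchicalDisplay(signals)
print([s.name for s in display.visible()])  # ['engine_temp']
print(display.drill_down("fuel_flow"))      # full detail, on demand
```

In a real system the urgency score would itself be computed by software from context and background information, which is exactly the hierarchy-building role described above.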

These systems will benefit from data mining to surface relevant historical events, deep learning to improve the interface, and other add-ons that will completely transform the visualization process while preserving the possibility of direct, proactive access to the data.

Let's take the example of a big company with a million online clients, holding offline data (addresses, names, age, marital status, etc.) and online data such as geolocation, propensity to buy, tastes, and interactions with the brand.
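A minimal sketch of what one such combined record might look like, with every field name hypothetical and standing in for whatever the company actually tracks:

```python
from dataclasses import dataclass, field

@dataclass
class Customer:
    # Offline data, typically from the CRM (hypothetical fields)
    name: str
    age: int
    address: str
    marital_status: str
    # Online data, collected as the client interacts with the brand
    geolocation: tuple[float, float] = (0.0, 0.0)  # (lat, lon)
    propensity_to_buy: float = 0.0                 # model score in [0, 1]
    tastes: list[str] = field(default_factory=list)
    brand_interactions: int = 0                    # visits, clicks, purchases...
```

Multiply this by a million clients, each with dozens more attributes changing in real time, and the scale of the problem becomes clear.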

Today, it is still extraordinarily difficult to make sense of all that data. There are simply too many data points, trends, and events to analyze in real time with any predictable, immediate effect. So the efforts remain haphazard, based on online behavior, geo-marketing, cluster analysis, or A/B testing. The effects are real but remarkably similar to gold digging in the old West: wherever you look, there is a good chance you will find something, but with so many possibilities it is almost impossible to optimize the system.

With self-learning software automatically looking for events, trends, outliers, correlations, and all the other statistical characteristics of the data, it will become far easier to analyze huge multi-dimensional databases without ever visualizing the whole, which in any case will become harder and harder to do. The only data managers will request will be decisional data: "Tableau" on steroids, where the data is auto-organized for optimum ease of understanding and action, including the equivalent of what the financial industry has developed for its own lucrative use, commonly referred to as "algos", available for all types of non-financial applications!
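As a hint of what such automatic triage could look like, here is a small sketch using scikit-learn's IsolationForest on synthetic data. The feature columns are invented stand-ins for real customer attributes; the point is that the software surfaces the handful of rows worth a manager's attention, instead of asking anyone to visualize the whole table.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic stand-in for a multi-dimensional customer table:
# the 4 columns might be age, spend, visits, propensity score (hypothetical).
rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(1000, 4))
weird = rng.normal(loc=6.0, scale=1.0, size=(10, 4))  # a small anomalous cluster
X = np.vstack([normal, weird])

# The model learns what "typical" rows look like and flags the rest.
model = IsolationForest(contamination=0.01, random_state=0)
labels = model.fit_predict(X)  # -1 = outlier, 1 = inlier

outliers = np.where(labels == -1)[0]
print(f"{len(outliers)} rows surfaced for review out of {len(X)}")
```

Outlier detection is only one of the statistical characteristics mentioned above; trend detection, correlation mining, and event detection would each add their own automated layer on top of the same data.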

Amazingly, this will probably take place over the next few years, giving a critical advantage to early adopters and generating a new arms race in software efficiency, greatly accelerating the development of artificial intelligence in the process.

It may still be early, but it is not absurd to envision an environment where the marketing department disappears as all types of analysis are generated on the fly and adapt automatically to the company's policy, enabling a smooth integration of management and sales.

