Our brains have an amazing ability to find relationships and discover meaning in diverse data sets. I was reminded of this as I looked more closely at the agenda for the upcoming “Know, Innovate, Grow” conference. Data mining, extraction of behavior patterns, and more are all related to an area that fascinates me, Inference.
Inference is about finding the surprising connections between data and making them explicit. In particular, it is focused on discovering and bringing together weak signals that may point to an emerging pattern or a pattern that is intentionally being hidden.
For example, the comments, complaints and purchases of customers may reveal a new market segment or suggest an unfulfilled need that may be a new opportunity. The customers themselves may not be able to articulate their association or need, but the data, presented to an expert, may make it clear. Or a terrorist cell may be plotting an attack while taking great care not to reveal itself. While its members are well aware that their actions and communications come together to form a pattern that builds their capabilities, these data points may be intentionally disguised and dispersed over time. It may be extremely difficult for someone on the outside, such as an intelligence operation, to “connect the dots” and see the picture.
In many ways, Compstat — the NYPD’s statistics-based initiative to reduce crime — is one of the roots of Inference. While it included many traditional approaches to statistical analysis, three elements have become important to firms that are involved in Inference:
First, Compstat pulled together data from diverse sources, which allowed the data to be analyzed in new ways. This was no small accomplishment. The inability to bring together information from different sources was cited as a key factor in the failure of US security agencies to anticipate 9/11. Simply bringing the information together creates new options. One of the strengths of Palantir, a company that provides services to the US government and financial firms, is providing tools for looking across databases with a sensitivity to secrecy and privacy concerns. Their tagging methods break data up into units that can be recombined while continually excluding data from secrecy levels that should not be available to a specific analyst. This makes available more data that otherwise could not be separated from material held at high secrecy levels.
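The idea of tagging data units by secrecy level and filtering them per analyst can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Palantir's actual method: the `DataUnit` class, the integer `level` scale, and the `visible_units` and `recombine` helpers are all hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataUnit:
    """One small, separately tagged piece of a record (hypothetical structure)."""
    source: str   # which database the unit came from
    field: str    # what kind of fact it holds
    value: str
    level: int    # assumed secrecy level; higher means more restricted


def visible_units(units, analyst_clearance):
    """Keep only the units at or below the analyst's clearance."""
    return [u for u in units if u.level <= analyst_clearance]


def recombine(units):
    """Group the permitted units by field, across all source databases."""
    merged = {}
    for u in units:
        merged.setdefault(u.field, []).append((u.source, u.value))
    return merged


# Made-up sample data, purely for illustration.
units = [
    DataUnit("crm", "complaint", "late delivery", 0),
    DataUnit("intel", "contact", "restricted name", 3),
    DataUnit("sales", "complaint", "refund request", 1),
]

# An analyst with clearance 1 sees both complaints but not the contact.
view = recombine(visible_units(units, analyst_clearance=1))
```

Because each unit carries its own tag, low-level facts from a mostly restricted record can still reach an analyst, which is the benefit described above.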
Second, Compstat put people into the mix, including people with knowledge and perspectives that weren’t typically available. Often, those who had special knowledge of, say, community activities (such as demonstrations) or environmental factors (such as empty buildings or dark corners) could provide a context that could make more sense of the data. Compstat made the data consumable as well as accessible. Indeed, one of the most important outputs was putting complaints and arrests onto maps.
This visualization is a key strength of Perception Partners, a firm aimed at providing competitive advantage in innovation and intellectual property. Using data that is largely from open sources (such as patent filings and articles), Perception Partners creates inference landscapes that reveal connections between innovation investments for firms and industries. These can be detailed enough to show not just what intellectual property is being developed, but who the key inventors are.
Third, Compstat facilitated experimentation. Much of this experimentation was traditional, in terms of putting “feet on the street” in areas where criminal activity was starting to build. But there were outlier activities. For instance, one precinct saw possibilities in reducing guns and narcotics by enforcing bicycle regulations. Their success became obvious and was quickly replicated across NYC.
David Snowden of Cognitive Edge is a leading thinker who has been working to detect weak signals and connect the dots for the government of Singapore, among others. While not disparaging traditional approaches, he has taken on more difficult and complex systems.
He has said that sense-making has four key elements:
1. Scanning: Did we pick up the signals at some level?
2. Perception: Did those signals register in some way?
3. Attention: Did we see something as worthy of our notice?
4. Response: Can we act (or get other people to act) on those signals?
Putting the data together and getting it in front of people was part of what Compstat did. But Snowden has pointed out that, even when we have the data in front of us, “we don’t tend to notice what we are not looking for.” This perception problem, he says, can be helped somewhat by developing a diversity of scanning capability. If you have analysts who have different perspectives, there is a better chance that otherwise missed patterns will be detected. To be of use, those patterns also need to be seen as valuable. When people are told to look for something specific, that is where their focus will be. Too much direction on what is important can be counterproductive. Finally, action is dependent on either effective communication to those who have power or empowering those who gain the insights directly.
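The claim that diverse analysts improve the odds of catching a missed pattern can be illustrated with a little arithmetic. The sketch below is my own simplification, not something Snowden has stated: it assumes each analyst notices a given pattern independently with some probability, and the function name and the 0.2 figure are made up for the example. Diversity of perspective is what makes the independence assumption plausible; analysts who all look for the same things tend to miss the same things.

```python
def chance_pattern_is_noticed(detection_probabilities):
    """Probability that at least one analyst notices the pattern.

    Assumes analysts notice (or miss) the pattern independently,
    which is a simplifying assumption made for this illustration.
    """
    missed_by_all = 1.0
    for p in detection_probabilities:
        missed_by_all *= (1.0 - p)
    return 1.0 - missed_by_all


# One analyst with a 20% chance of noticing, versus three such
# analysts whose perspectives differ enough to be independent.
single = chance_pattern_is_noticed([0.2])
team = chance_pattern_is_noticed([0.2, 0.2, 0.2])
print(f"single analyst: {single:.3f}, diverse team: {team:.3f}")
```

Under these assumptions the team's chance is noticeably higher than the single analyst's, even though no individual became any better at spotting the pattern.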
Cognitive Edge has created a number of tools and approaches to assist sense-making in complex environments, with narrative forms being central to many. The firm also makes use of summaries and visualizations to assist decision makers.
I believe we are just at the beginning of the use of Inference in our culture. Major projects, such as the Semantic Web, will allow us to stitch together data in new ways. Algorithms will emerge as we observe and codify the best practices of those who are experts at deriving meaning, without necessarily knowing how they do it. Along the way, we will need to have a serious conversation about the privacy, data rights and social issues that are raised by this powerful new capability. However, there is real promise for a deeper understanding of our world, social trends and emerging needs.