Analytics and Belief: The Struggle for Truth

Increasingly sophisticated analytics tools and methods are available to derive business insight from data.  However, because analytics is a discipline which derives insight from data, the crucial ‘last step’ in the analytics process is about organizational decision making.  A sophisticated, intensive analysis may all be for naught if that last step, framing and committing to a decision, misses the mark.

No matter how fancy or sophisticated the tools and methods, there needs to be a commitment to action in the end.  This is complicated if there is doubt concerning the validity of analytics-derived insight.  The organization has to embrace analytics culture: the notion that scientific, evidence-based techniques are the best method to discover and validate hypotheses.  While intuition can be powerful, it should be validated via data-driven experimental approaches, particularly in dynamic, complex environments.

Adopting analytics culture brings an organization a step closer to connecting analytics insight to high-quality decisions.  For the union to be feasible, an analytics culture must understand how ‘belief’ concerning the insight gained from analytics is secured in the organization.

Belief as the Foundation for Knowledge

It is helpful to consider that organizational belief, the belief in the results of insights derived from analytics, can be positioned on a continuum:

  1. Truth-in-of-itself:  establishing belief via incontrovertible evidence, such as via structured and replicable experimental proof in a laboratory.  Also, the validation of a mathematical proof which applies the rules of logic and which clarifies the boundary conditions for the proof (in the case of purely rational or epistemological assertions).
  2. Useful to be true:  establishing belief based on the notion that it is useful for an assertion to be true.  Often we adopt this framework of belief when it is difficult to establish incontrovertible proof, but there is observable practical concordance.  Indeed many practical theories in the ‘useful to be true’ category lead to engineering advances without necessarily being fundamentally ‘proven’ scientifically.  This category often encompasses the highest state of ‘belief’ achievable in business, as it is impractical, and often impossible, to scientifically validate many complex business assertions.  Usefulness implies pursuing ‘pragmatic’ truths – to adopt an engineering mindset which diffuses politics and makes decision making an objective, professional process.  This state reflects the observation of statistician George Box that “all models are wrong, but some are useful”.
  3. Possibly true:  establishing belief based on the notion that it ‘might be true’.  This method appeals to the instincts and sentiments.  As such, it risks being driven by a willful notion of ‘wanting something to be true’. This results in a susceptibility to being misled by cognitive bias traps – for instance, being waylaid by Kahneman’s ‘System 1’, our ‘lizard brain’ which leads us to make quick decisions.
  4. It will continue to be true:  this system of belief is based on the notion that things will be true based on a continuity of the status quo.  This is known as the status quo cognitive bias trap.
  5. It should be true:  This is belief based on implicit and willful trust in one’s own viewpoints and bounded information.  Intuition-driven conclusions may fall into this category.  Although intuition may turn out to be correct in the case of expert intuition, it can also mislead us (even when we are ‘experts’).  Kahneman and Klein’s article ‘Conditions for Intuitive Expertise’ is an excellent guide to distinguishing where intuition is useful and where it can mislead.  When believing in ourselves implicitly, we are susceptible to the overconfidence bias, whereby we place greater confidence in our own judgment than is objectively warranted.
  6. It is the current trend:  slipping down a notch on the scale of ‘truth belief’, this is belief based on the notion of a loose tendency or trend.  This is susceptible to many biases, including ambiguity, anchoring, attention, availability, and selective perception.
  7. It is an aspect of the truth:  this form of belief is based on believing something to be true because the people in the surrounding community believe in something.  This is susceptible to groupthink and the bandwagon effect.

How Do We ‘Know’ Anything At All?

The notion that State 1, truth in-of-itself, is not only elusive but in most cases formally unattainable is troubling.  While we can say, for instance, that this is a stone before me and that it does not yield, is solid, and is hard, this is in effect a purely semantic and contextual categorization.  That the stone exists and has properties begets a series of more and more fundamental questions which begin crumbling into uncertainty (lack of substantiation or indisputable ‘proof’).  Do all stones have these properties?  Under which conditions does the stone persist and exist in tension with other bodies?  The Ancient Greek philosophers (e.g. in Plato’s Meno dialogue) were masters of asking such ‘childlike’ but profound questions, displaying how we, in the end, know little except what we believe in terms of State 2, practical assertions.  Plato’s assertion, reflected by Popper, was that ‘knowledge’, in the end, is justified belief, thus State 2.

For a realist, we can also frame this more specifically, via Popper, by noting that there are two ways we ‘know’ or assert truth via theory: inductive and deductive proofs.  Popper pointed out that inductive logic and experimentation cannot ‘prove’; they can only substantiate, or ‘falsify’, that certain things are certain ways, right now, under certain conditions.  We simply do not know if they are forever and universally true.

The famous example is the black swan:  prior to the discovery of Australian black swans, there was a theory that all swans were white by nature because all swans observed were white and no black swans were ever encountered.  The discovery of black swans destroyed the inductive assertion of ‘swan whiteness’.  Similarly, with all inductive experimentation we struggle to make truth assertions, but must always admit that they are, at best, falsifications, ‘working theories’ that only hold under continual inquiry.

The second form of truth assertion is deductive: theory built from working, basic theory and substantiated via experimentation or empirical observation (or data collection, as the case may be).  Popper’s assertion, which is still debated, is that deductive logic is the only path to pursuing truth assertions.  Thus, theorizing that birds have colors which reflect their evolutionary adaptation to particular ecosystems begets the notion that swans are commonly white, but that it is possible we might encounter swans in different environments that are black, blue, red, or purple, as the case may be.

Deductive theory thus is always a struggle, a process of continually ‘falsifying’ and, when new observations emerge, revising primary theory.  This is the fundamental state of science.  It also reveals that in almost all cases, we are struggling to establish useful, working ‘truths’, that is to say, working theories.  How we establish primary deductive theory to test inductively is a topic of very lively debate, and likely will always be so.

Some argue we come ‘pre-built’ with an empirical understanding of the world that we extrapolate from.  Others argue reality has a fundamental ‘nature’ and that we ‘discover’ or gain intuitive insights based on our uniquely human ways of knowing.  Both approaches admit that there is a certain anthropomorphic element to knowledge, that knowledge is highly contextual to being human and seeing and understanding in human ways.

There is a certain ‘uncertainty principle’ (not unlike Heisenberg’s uncertainty principle) at work in epistemology:  the more we know about knowing, the less it is possible to make fundamental assertions about a fundamental reality.  Likewise, to the degree we work towards establishing fundamental truth assertions concerning the basis for reality (e.g. establishing the Higgs boson as the fundamental connector between matter and energy), the stranger and more detached this revealed external reality seems to become from any type of relevant human meaning.  At a quantum level, time and space appear to fundamentally break down, which is quite beyond the world we live within (unless you are a Gnostic or Hermetic occultist).

Underneath it all, we still struggle with unified field theory and the reality that the cosmological constant is not constant, but variable.  We miss much of the matter which supposedly binds the universe together.  There is not yet a proper, validated explanation concerning how time, matter, gravitation, the weak, strong, and electromagnetic forces, and quantum phenomena ‘hang together’.  There are wonderful complex theories (e.g. string theory), as well as lovely simplifications (e.g. A. G. Lisi’s E8 Theory), but regarding the fundamental nature of reality vis-a-vis theoretical physics, we still are in the realm of ‘falsified’ State 3 struggling to be State 2.

Similarly, economic and social research at the bleeding edge of modern computationally driven analysis seems to propose some very disturbing truths and assertions concerning the nature of consciousness and free will (or lack thereof).  Social theory, at its extreme edges, potentially sees us as ants in a massive complex of humanity, tossed by tide and randomness in most circumstances despite our internal feeling of meaning and relevance.

In the debate concerning where deductive theory comes from, perhaps it is simply best to say that it exists somewhere in the intersection between being human and existing in a ‘reality’ that is, at base, ‘beyond-humanness’, so to say.  In this sense, we can consider deductive theory generation as a type of practical bootstrapping, with a hint of self-centered egocentrism about it!

Getting Back to ‘Reality’…

Enough!  To ground the discussion after this detour into abstract epistemology, we return to the challenge of establishing ‘truth’ in dynamic organizational settings, particularly under practical time pressures. The practical difficulty for organizations is that moving up the scale of belief to scientific proof is costly in terms of time and resources, principally the attention of experts and management.  This means there is a constant practical pressure to speed up the process of establishing belief.  Indeed, case studies concerning notable decision failures often have some element of groupthink, the lowest rung on the ladder of belief establishment (e.g. the Global Financial Crisis, the space shuttle disasters, friendly-fire military errors, and notable company failures such as Enron and Lehman Brothers).  It can be asserted that being ‘lazy’ about the energy invested in establishing organizational or systemic belief can have disastrous, even deadly, consequences.

On the top end of the scale, truth-in-of-itself, solid scientific proof is quite often impossible in complex business settings.  There is an assertion that even science itself falls short of achieving this goal (e.g. although the existence and effects of gravity are well understood, science still has not established a unified theory).  Philosophy of science luminary Karl Popper, in his work The Logic of Scientific Discovery, asserted that no scientific theories can finally and resolutely be ‘proven’.  Instead, when conducting scientific inquiry, he asserts that we are constantly attempting to ‘falsify’ theory: to establish that it is again and again ‘not NOT true’ in a variety of conditions and settings.  When a theory is disproved, it evolves.

This is valuable wisdom for the conduct of business and data analytics.  The great value of data-driven analytics is the emergent fusion of powerful methods and tools (software for data processing, transformation, analysis, and visualization combined with data storage and rapid transmission mechanisms).  Basically, an assertion can be made that the great ‘revolution’ of so-called Big Data approaches and business analytics generally lies in the growing cost-efficiencies associated with ‘experimentation’ (experimenting with data in this case).  Whereas during World War II, calculating projectile trajectories would occupy many hours of time on a computer the size of a large room, we can now process calculations many factors more complex on our hand-held smartphone.

However, we must be eternally vigilant concerning the ‘crucial last step’: establishing solid ground for belief in a truth assertion.  In framing business problems, gathering data samples, composing an experimental analytics model, and interpreting results, we are all susceptible to the same biases and tendencies to take short-cuts as covered above – to ‘believe’ without rigor.  As with Popper’s concept of ‘falsification’, we must be content that we are never ‘proving’ anything finally, but are always struggling to ‘falsify’ our assertions, to continue to test and retest hypotheses.  The distinction is the amount of effort we put into overcoming our own biases by questioning and re-testing models to establish ‘practical causation’ rather than belief based on shortcuts.

In the Context of Computer-Driven Analytics: The Dark Side of Analytics…

A particularly pernicious trend is the temptation to accept or defer to computer-generated experimental data models.  For instance, using stepwise regression, one can ask the computer to suggest significant correlating factors in a large dataset.  Such methods can lead to rather odd results, some of which have been documented, such as the observation that people who do not drink orange juice are more likely to die in an auto crash, that vegetarians are less likely to be late for work, or that horses with a single 8 – 10 letter name win more races.  These may be aberrations, phantoms in a dataset, which can lead to ‘overfitted’ models and thus become dangerous when pushed to become predictive or prescriptive models.
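To make the ‘phantom in a dataset’ concrete, here is a minimal sketch in Python.  It is not a stepwise regression, only the simplest version of the same idea: screen a large number of candidate factors and keep the one that correlates best.  Everything in it is an assumption made for illustration (the function names `pearson` and `screen_noise`, the sample sizes, and the seed), and both the outcome and the candidate factors are pure random noise, so any ‘finding’ is a phantom by construction.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def screen_noise(n_rows=60, n_candidates=500, seed=7):
    """Screen pure-noise factors against a pure-noise outcome.

    The first half of the rows is used to pick the 'best' factor; the
    second half is held back to re-test it.  Returns a tuple of
    (index of best factor, correlation on first half, correlation on holdout).
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_rows)]
    candidates = [[rng.gauss(0, 1) for _ in range(n_rows)]
                  for _ in range(n_candidates)]
    half = n_rows // 2

    def train_r(column):
        return pearson(column[:half], outcome[:half])

    # Keep whichever noise factor happens to correlate most strongly.
    best = max(range(n_candidates), key=lambda i: abs(train_r(candidates[i])))
    holdout_r = pearson(candidates[best][half:], outcome[half:])
    return best, train_r(candidates[best]), holdout_r


best, r_train, r_holdout = screen_noise()
print(f"best factor #{best}: first-half r = {r_train:+.2f}, holdout r = {r_holdout:+.2f}")
```

With hundreds of candidates and only a few dozen rows, the winning factor tends to show a sizeable correlation on the rows used to select it, even though nothing real is there.  Re-testing on the held-back rows is the cheap, practical form of ‘falsification’: a phantom will usually shrink toward zero, while a genuine relationship should persist.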

A scientist friend recently quoted to me a study which correlated ‘lack of orange juice drinking’ to ‘propensity to die in a road accident’.  Apparently the correlation was sound, and had to do with motorcyclists being statistically averse to OJ.  This is a good example of correlation, not causation.  Clearly some deeper analysis needs to be done (e.g. is there an inverse correlation between healthy eating habits and risky sports?).
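The orange juice anecdote can be sketched as a small simulation.  This is an illustration only: the probabilities below are invented, the names `simulate` and `crash_rate` are hypothetical, and the only thing taken from the anecdote is the structure, namely that riding a motorcycle influences both orange juice drinking and crash risk, while orange juice itself has no effect on crashes at all.

```python
import random


def simulate(n=200_000, seed=11):
    """Simulate (rider, drinks_oj, crash) rows with a hidden common cause.

    All probabilities are invented for illustration.  Orange juice never
    enters the crash probability; only the rider flag does.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        rider = rng.random() < 0.20
        drinks_oj = rng.random() < (0.30 if rider else 0.70)
        crash = rng.random() < (0.10 if rider else 0.01)
        rows.append((rider, drinks_oj, crash))
    return rows


def crash_rate(rows, drinks_oj, rider=None):
    """Crash rate among rows matching the OJ flag and, optionally, the rider flag."""
    group = [r for r in rows
             if r[1] == drinks_oj and (rider is None or r[0] == rider)]
    return sum(r[2] for r in group) / len(group)


rows = simulate()

# First-order view: compare OJ drinkers with non-drinkers across everyone.
overall_no_oj = crash_rate(rows, drinks_oj=False)
overall_oj = crash_rate(rows, drinks_oj=True)
print(f"overall:    no OJ {overall_no_oj:.3f}  vs  OJ {overall_oj:.3f}")

# Second look: make the same comparison separately for riders and non-riders.
for is_rider, label in ((True, "riders"), (False, "non-riders")):
    no_oj = crash_rate(rows, drinks_oj=False, rider=is_rider)
    oj = crash_rate(rows, drinks_oj=True, rider=is_rider)
    print(f"{label:<11} no OJ {no_oj:.3f}  vs  OJ {oj:.3f}")
```

In the pooled comparison, people who skip orange juice look markedly more crash-prone.  Once riders and non-riders are compared separately, the gap all but disappears, because the apparent effect was carried entirely by the hidden factor.  Spotting which factor to split on is exactly the step that requires human context.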

The great danger is that corporations and governments simply start enacting policy based on first-order correlations and not even telling us (e.g. you don’t drink orange juice, therefore higher auto insurance rates).  There is a ‘dark side’ to analytics when lazy bureaucracies and computer-generated correlations masquerading as causation collide…

We must always be vigilant in applying our own human context and understanding in second-guessing such approaches.  Computer-derived models may indeed show significance, but they may be pointing out something which has underlying, deeper causes which beg expert human interpretation.  The recent article on the problem of conflating correlation with causation covers this topic in more detail.

Essentially, the conduct of science depends upon applying human understanding to the process of building and validating models.  While computer-driven techniques can indeed lead to valid inference and autonomous observations of significant correlation, the result may only be part of the story.  Computers are just as susceptible to ‘falsifying’ theories which later turn out to have been false, whether as a phantom in a particular dataset or as a surface observation of some deeper causal phenomenon.


About SARK7

Scott Allen Mongeau (SARK7) is an INFORMS Certified Analytics Professional (CAP) and a Data Scientist in the Cybersecurity business unit at SAS Institute. Scott has over 20 years of experience in project-focused analytics functions in a range of industries, including IT, biotech, pharma, materials, insurance, law enforcement, financial services, and start-ups. Scott is a part-time PhD (ABD) researcher at Nyenrode Business University. He holds a Global Executive MBA (OneMBA) and Masters in Financial Management from Erasmus Rotterdam School of Management (RSM). He has a Certificate in Finance from University of California at Berkeley Extension, a MA in Communication from the University of Texas at Austin, and a Graduate Degree (GD) in Applied Information Systems Management from the Royal Melbourne Institute of Technology (RMIT). He holds a BPhil from Miami University of Ohio. Having lived and worked in a number of countries, Scott is a dual American (native) and Dutch citizen. He may be contacted at: webmaster@sark7.com All posts are copyright © 2015 SARK7 All external materials utilized imply no ownership rights and are presented purely for educational purposes.
