We Will Never Fully Understand How AI Works — But That Shouldn’t Stop You From Using It

Illustration by II

To unlock the true potential of artificial intelligence, investors need to accept the black box.

The search for alpha can take us to some unusual places, perhaps none more so than the 13th-century works of Thomas Aquinas. His philosophical maxim, “Finitum non capax infiniti” (“The finite cannot comprehend the infinite”), makes a compelling argument for an entirely undifferentiated source of alpha: artificial intelligence.

To apply this postulate to AI: While there are some types of AI that humans can comprehend, there are others that, because of their complexity and high dimensionality, are beyond the ken of human intelligence.

There are clear signs that we have reached a tipping point where certain types of AI have surpassed the human mind: The finite (humans) cannot comprehend the infinite (advanced AI).

Yet because of a deep-seated industry bias that investment results must be explainable, investors have been slow to accept the superhuman capabilities of advanced AI and, as a result, are failing to consider unique sources of alpha that could provide better investment outcomes.

An archetype of such an “infinite” AI is OpenAI’s ChatGPT, a chatbot that, as New York Times technology reporter Kevin Roose notes, “can write jokes (some of which are actually funny), working computer code, and college-level essays. It can also guess at medical diagnoses, create text-based Harry Potter games, and explain scientific concepts at multiple levels of difficulty.”

ChatGPT is a large language model (LLM), and a key to its success is its generative AI: It uses deep learning and reinforcement learning to analyze and understand the meaning of language and then generate a relevant response.

While ChatGPT might seem like a parlor game, its underlying AI could prove transformative. For example, in a peer-reviewed paper recently published in Nature Biotechnology, researchers used the same type of LLM that underpins ChatGPT to “generate protein sequences with a predictable function across large protein families, akin to generating grammatically and semantically correct natural language sentences on diverse topics.”

Not only could the amazing success of this LLM, called ProGen, radically transform medicine and healthcare, but, according to the authors, such deep-learning-based language models could be used “to solve problems in biology, medicine, and the environment.”

Yet such solutions come with a caveat: While researchers are able to explain their models and inputs and evaluate the quality of the results, a deep-learning model’s complex architecture and recursive nature make it impossible to explain how multiple neurons work together to produce a specific prediction or decision. ProGen, for example, “is a 1.2-billion-parameter neural network trained using a publicly available dataset of 280 million protein sequences,” making it impossible to map specific inputs to specific outputs, the authors write.

Such unexplainability is an endogenous feature of deep learning. In the vernacular, it’s a black box. As Yoshua Bengio, a pioneer of deep-learning research, notes, “As soon as you have a complicated enough machine, it becomes almost impossible to completely explain what it does.”

In their commitment to uncovering ways that AI can contribute to fundamental scientific discoveries that improve human life, computational biologists and other researchers accept the results of ProGen and other black boxes without the need for explanations of the specific decisions made by the AI (and thereby accede to our philosophical maxim).

Yet investors balk at such acceptance. Instead, they cleave dogmatically to a belief stated by the Investment Association: “Financial institutions must be able to provide clear explanations of how decisions involving AI are made, at every stage of the process. These explanations must be set out in a transparent and accessible way, and made available to employees, customers, regulators, and other relevant stakeholders.”

(Regulators are considering imposing similar explainability requirements on applied AI. For example, the European Union’s General Data Protection Regulation proposes that individuals have a right to an explanation of an algorithmic decision, although precisely what form such explanations might take or how they might be evaluated is unclear.)

This transparency requirement holds investment processes using or based on deep neural networks to a standard that even traditional investment managers and humans generally cannot meet. According to Roman Yampolskiy: “Even to ourselves, we rationalize our decision after the fact and don’t become aware of our decisions or how we made them until after they [have] been made unconsciously. People are notoriously bad at explaining certain decisions, such as how they recognize faces or what makes them attracted to a particular person. These reported limitations in biological agents support [the] idea that unexplainability is a universal impossibility result impacting all sufficiently complex intelligences.”

More importantly, this demand for “clear explanations of how decisions involving AI are made, at every stage of the process” displays a profound ignorance of advanced AI because such individual explanations simply are not possible.

This demand is bolstered by the false claim, again expressed by the Investment Association, that “several techniques are now available to help with interpretability.”

This claim rests on a false hope. A review of the academic literature indicates that while it is possible to provide broad descriptions of how an advanced AI system works, contemporary techniques used to explain individual decisions are “unreliable or, in some instances, only offer superficial levels of explanation” and are “rarely informative with respect to individual decisions.” (It’s worth noting that academic literature sometimes differentiates between interpretability and explainability, but here we are using the terms interchangeably.)

The literature typically divides human-comprehensible explanations of advanced machine-learning decisions into two categories: inherent explainability and post-hoc explainability.

In their paper, “The False Hope of Current Approaches to Explainable Artificial Intelligence in Health Care,” Dr. Marzyeh Ghassemi and her co-authors explain that while many basic machine-learning models are inherently explainable because the relationship between their relatively simple inputs and model output can be quantified, advanced AI models like those using deep neural networks are “too complex and high-dimensional to be easily understood; they cannot be explained by a simple relationship between inputs and outputs.”

As an alternative, some researchers attempt to use post-hoc explanations — but these are, by their nature, problematic. First, current techniques used to produce post-hoc explanations do not reliably capture the relationship between inputs and outputs and may be misleading.

More generally, as Ghassemi et al. point out, post-hoc explanations “are only approximations to the model’s decision procedure and therefore do not fully capture how the underlying model will behave. As such, using post-hoc explanations to assess the quality of model decisions adds an additional source of error — not only can the model be right or wrong, but so can the explanation.”
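
The point about approximation can be made concrete with a toy sketch. Nothing here comes from Ghassemi et al.: the `black_box` function is a hypothetical stand-in for an opaque model, and the "explanation" is assumed to be a straight line fitted to the model's own outputs, which is only one simple form a post-hoc explanation might take.

```python
import math

def black_box(x: float) -> float:
    """Hypothetical opaque model, chosen only because it is nonlinear."""
    return math.sin(3.0 * x) + 0.5 * x

def fit_linear_surrogate(xs, ys):
    """Closed-form least squares for y ~ intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fidelity(xs, ys, intercept, slope):
    """R^2 of the surrogate measured against the black box's outputs."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def local_slope(x: float, h: float = 1e-5) -> float:
    """Numerical derivative of the black box at a single input."""
    return (black_box(x + h) - black_box(x - h)) / (2.0 * h)

# Query the black box on a grid and fit the "explanation" to its answers.
xs = [i / 100.0 for i in range(-200, 201)]
ys = [black_box(x) for x in xs]
intercept, slope = fit_linear_surrogate(xs, ys)
r2 = fidelity(xs, ys, intercept, slope)

print(f"surrogate slope: {slope:.3f}, fidelity (R^2): {r2:.3f}")
print(f"black-box slope at x = 1.0: {local_slope(1.0):.3f}")
```

In this toy setting the surrogate says the output rises with the input overall, while at x = 1.0 the black box's output is actually falling. The explanation is wrong for that individual decision even though it was fitted to the model itself, which is the additional source of error the authors describe.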

Professor Cynthia Rudin takes the argument a step further, writing that post-hoc explanations “must be wrong,” that by definition they are not completely faithful to the original model and must be less accurate with respect to the primary task.

In the end, Rudin concludes, “You could have many explanations for what a complex model is doing. Do you just pick the one you ‘want’ to be correct?”

The demand for explanations of individual decisions combined with the inadequacy of existing explainability techniques leads to this flawed solution, with humans left to their own devices to determine which post-hoc explanation is correct. And as Ghassemi et al. conclude, “Unfortunately, the human tendency is to ascribe a positive interpretation: We assume that the feature we would find important is the one that was used.”

This human-centric disposition represents what tech writer Ben Dickson calls “the fallacy of the anthropocentric view of minds,” which, in the case of advanced AI, means we assume that AI makes decisions in the same way as human intelligence. This fallacy robs us of the ability to see that advanced AI is not merely a more powerful type of mind but is, as Wired’s Tom Simonite so eloquently writes, an “alien intelligence, perceiving and processing the world in ways fundamentally different from the way we do.”

By anthropomorphizing AI, we not only trivialize its power but place conditions on it that cannot be met — like requiring clear explanations of particular results — thereby limiting its utility and application.

Yet, as we’ve seen, there are transformative use cases like ChatGPT and ProGen where the demand for explainability is at odds with broader scientific or commercial objectives. In these cases, a black-box model is not only a far better choice than an explainable model; it is the only choice.

Investing presents another well-documented case where the capabilities of traditional investment processes are not good enough to consistently solve the problem — specifically, how to deliver alpha — and would benefit from the use of deep neural network-based models.

However, the deeply entrenched belief that individual investment decisions need to be explainable circumscribes the manager’s choice to human-based methods and basic forms of AI that augment human processes. It indicates that we are choosing explanations over accuracy and predictive power, disqualifying AI that could produce better investment outcomes and effectively dooming clients to a cycle of underperformance.

It’s true that cogent explanations give us the confidence that an investment process works, establishing the trust that is key to any investor-manager relationship. Yet there is a way to trust advanced AI that does not require explanations of individual decisions.

First, managers can clearly explain why a chosen model is appropriate for the use case. Second, they can broadly describe the model’s design and how it works. Third, they can make clear the choice of inputs. Fourth, they can delineate how the particular model is built, validated, and retrained, and empirically demonstrate that it generalizes — in other words, it adapts to new, previously unseen data. Fifth, managers can demonstrate that the model’s live performance is in line with the test period’s performance.
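
As a rough illustration of the fifth step, the sketch below compares live returns with test-period returns using a Welch t-statistic. The method, the threshold of 2, and every return figure are assumptions made for illustration, not a procedure described in this article, and real return series are often autocorrelated and non-normal, so a check like this is at best a first pass.

```python
import math
import statistics

def welch_t(test_returns, live_returns):
    """Welch t-statistic for the difference in mean return (live minus test)."""
    mean_test = statistics.mean(test_returns)
    mean_live = statistics.mean(live_returns)
    var_test = statistics.variance(test_returns)
    var_live = statistics.variance(live_returns)
    std_err = math.sqrt(var_test / len(test_returns) + var_live / len(live_returns))
    return (mean_live - mean_test) / std_err

def live_in_line_with_test(test_returns, live_returns, threshold=2.0):
    """True when the live mean is within `threshold` standard errors of the test mean.

    The threshold is an illustrative assumption, not an industry standard.
    """
    return abs(welch_t(test_returns, live_returns)) <= threshold

# Made-up monthly returns, for illustration only.
test_returns = [0.012, -0.004, 0.009, 0.015, -0.007, 0.011, 0.003, 0.008]
live_ok = [0.010, -0.002, 0.007, 0.013, -0.005, 0.006]
live_bad = [-0.020, -0.031, -0.015, -0.026, -0.018, -0.024]

print(live_in_line_with_test(test_returns, live_ok))
print(live_in_line_with_test(test_returns, live_bad))
```

The check asks nothing about why the model made any particular decision. It tests only whether behavior on unseen data matches what validation promised, which is the kind of empirical evidence the five steps rely on.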

Scientific disciplines have long used such an empirical method to create trust in the black box. Medicine, for example, offers a contemporary application of this method for evaluating and validating black boxes. Consider the drug acetaminophen, “which, despite having been used for more than a century, has a mechanism of action that remains only partially understood,” Ghassemi et al. write. “Despite competing explanations for how acetaminophen works, we know that it is a safe and effective pain medication because it has been extensively validated in numerous randomized controlled trials. RCTs have historically been the gold-standard way to evaluate medical interventions, and it should be no different for AI systems” (emphasis added).

To provide clients with the investment outcomes they seek, we must look beyond our conventional investment approaches, which means renouncing our fetish for explanations, overcoming what professor Zachary Lipton describes as our “concession to institutional biases against new methods,” and accepting that “a sufficiently accurate model should be demonstrably trustworthy, and interpretability would serve no purpose.”

But first, we need to accept that the finite cannot comprehend the infinite.

Angelo Calvello, Ph.D., is co-founder of Rosetta Analytics, an investment manager that uses deep reinforcement learning to build and manage investment strategies for institutional investors.