In the past year, a new model for portfolio construction has emerged as the framework du jour.
Positioned as a superior alternative to Strategic Asset Allocation, the Total Portfolio Approach promises a more flexible, holistic way to manage institutional portfolios. Proponents describe it as “a range of approaches that can be tailored to the unique needs of different asset owners, regardless of their size or complexity.” Where SAA requires asset owners to create a policy benchmark aligned with long-term fund objectives and allocate to appropriately sized asset-class portfolios, TPA offers something different.
Its core tenets (summarized from Gene Podkaminer's articulation) include evaluating each investment based on its contribution to overall portfolio objectives rather than against narrow benchmarks; using risk factors as a common language to compare assets across all public and private categories; and budgeting risk across multiple dimensions and objectives, enabling explicit trade-offs about where, how, and to what magnitude risk is taken.
The approach also calls for aligning governance, accountability, and organizational culture to support collaborative decision-making and effective stakeholder communication. (Proponents of Strategic Asset Allocation would likely argue these objectives are implicit in SAA.)
Rather than simply dismiss SAA, TPA proponents genuflect to Harry Markowitz while arguing the world has changed: “It's not that SAA is wrong—it's that it was designed for a different world. A world of simpler portfolios, more stable regimes, and lower cross-asset complexity.” TPA, naturally, is presented as the antidote — "a direct response to the evolving realities facing institutional investors." What are these evolving realities? According to proponents, "macroeconomic and geopolitical regimes have become less stable and more difficult to predict," while portfolios have become "more diversified, structurally constrained, and exposed to nonlinear risks."
Both claims deserve scrutiny. When exactly were macroeconomic and geopolitical regimes stable and predictable? The Bretton Woods collapse? The oil shocks? The fall of the Soviet Union? And when were investment risks linear? Regime shifts, fat tails, and nonlinear dynamics have always characterized markets. The claims feel less like an accurate description of financial history and more like a rhetorical device to position TPA as the successor to SAA.
The Actual Challenge: Dynamic Optimization
Strip away the marketing and both SAA and TPA are addressing the same fundamental question: How should capital be allocated to maximize the probability of meeting liabilities over time? Traditional approaches mistakenly treat this strategic decision as a static optimization problem, solved once and rebalanced periodically. But the reality is more complex. This is a dynamic optimization problem that requires sequential decision-making — determining the best action at each point in time to maximize the relative value between competing future payoffs.
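For readers who prefer notation, the contrast can be written down in textbook form. This is a sketch under stated assumptions, not a formulation taken from any SAA or TPA source: w denotes portfolio weights, μ expected returns, Σ the covariance matrix, and λ a risk-aversion parameter; s_t is whatever the allocator observes at time t (funded status, holdings, market conditions), a_t is the allocation decision, r is the period payoff, and γ is a discount factor.

```latex
% Static, one-period view (mean-variance form; symbols are illustrative)
\max_{w}\; w^{\top}\mu - \frac{\lambda}{2}\, w^{\top}\Sigma\, w
\quad \text{subject to} \quad \mathbf{1}^{\top} w = 1

% Dynamic, sequential view (Bellman recursion)
V_t(s_t) = \max_{a_t}\; \mathbb{E}\!\left[\, r(s_t, a_t) + \gamma\, V_{t+1}(s_{t+1}) \,\middle|\, s_t, a_t \right]
```

The first problem is solved once for a single set of weights. The second asks for a decision rule: the best action today depends on the value of every state the portfolio might land in tomorrow.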
Since Markowitz in the 1950s, asset owners have relied on Strategic Asset Allocation — mean-variance optimization and its variants — to address this challenge. SAA’s limitations as a dynamic optimization solution are well established in academic and practitioner literature. TPA represents an evolution — acknowledging complexity, expanding the optimization surface, and attempting to respond more dynamically to changing conditions.
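To make the dependence on forecasts concrete, here is a minimal sketch of the unconstrained mean-variance calculation for two assets, in plain Python. The function name and every number are hypothetical, chosen only for illustration; real SAA implementations add constraints, more assets, and far more careful input estimation.

```python
def mean_variance_weights(mu, cov):
    """Fully invested two-asset weights proportional to inverse(cov) times mu.

    mu  -- expected excess returns, one per asset (a human forecast)
    cov -- 2x2 covariance matrix as nested lists (also a human estimate)
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form 2x2 inverse applied to mu.
    raw = [(d * mu[0] - b * mu[1]) / det, (a * mu[1] - c * mu[0]) / det]
    total = sum(raw)
    if total == 0:
        raise ValueError("weights cannot be normalized")
    # Rescale so the weights sum to one.
    return [x / total for x in raw]


# Hypothetical inputs: asset 0 is equity-like, asset 1 is bond-like.
cov = [[0.04, 0.0], [0.0, 0.01]]
base = mean_variance_weights([0.05, 0.02], cov)
# Same covariance, equity forecast raised by one percentage point.
bumped = mean_variance_weights([0.06, 0.02], cov)
print(base, bumped)
```

With these made-up inputs, raising the equity return forecast by a single percentage point moves the equity weight from roughly 38 percent to roughly 43 percent. The optimizer is only as good as the forecast it is handed.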
But TPA faces formidable implementation barriers: a lack of standard factor analysis tools, inconsistent data, limited ability to swap illiquid legacy holdings for new opportunities, difficulty integrating the approach into existing workflows, and teams that lack the skills to use it effectively. These challenges limit adoption to only the most determined boards and CIOs.
And even if asset owners can overcome these barriers, there is the question of whether TPA can produce better outcomes.
A 2025 survey by Willis Towers Watson found TPA adopters outperformed SAA organizations by 1.3 percent annually over ten years, suggesting TPA's "potential to add value." However, a Financial Times analysis of the WTW data warns, "we need to be really very careful drawing conclusions from data from two dozen funds' excess return experience. And especially careful when the targets chosen by high TPA funds turn out to have been easily eclipsed just by putting money into risky assets."
Apologies to Everybody, but the Fundamental Constraint Is Humans
Beyond these practical barriers and ambiguous results, TPA shares a fundamental constraint with SAA: Both are human-designed frameworks that depend heavily on professional judgment at every stage. This reliance on human inputs — forecasting returns, selecting relevant factors, setting portfolio constraints, dynamically tilting portfolios, and determining rebalancing triggers — fundamentally caps their potential effectiveness. Both approaches trap asset owners in a cycle of predicting optimal allocations and executing periodic adjustments in a highly probabilistic, dynamic world across multiple dimensions, when the portfolio optimization problem itself demands continuous adaptation that exceeds human cognitive capacity and response time.
The constraint runs even deeper. SAA and TPA both require humans to decide upfront on the model inputs — risk factors, return premia, cross-asset correlations — and then build predictive models around those pre-selected inputs. These decisions encode human biases and assumptions about market structure into frameworks that systematically miss emerging patterns not anticipated in the original design. When regimes shift and correlations evolve, the framework can respond only as quickly as humans can recognize the changes, reformulate their assumptions, and implement new rules. By that point, the opportunity or risk may have already passed.
The Self-Learning Alternative
If asset owners and consultants want robust optimization solutions, they need to move beyond human-designed frameworks entirely. The answer lies in AI-driven decision-making systems, not in small-capacity machine learning (ML) models such as random forests, support vector machines, and k-nearest neighbors that augment human judgment. Such models are technically limited to predicting returns or volatility and cannot solve the full portfolio construction problem, which requires balancing multiple objectives, constraints, and costs over time.
Deep Reinforcement Learning (DRL) has emerged as the leading candidate for solving the dynamic optimization problem. In a DRL system, an agent learns directly from raw data through iterative trial-and-error (much like a child learns how to walk). Over thousands or millions of simulations across varying market conditions, the agent learns which decisions produce the optimal portfolio weights for a given objective (e.g., meeting liabilities, maximizing risk-adjusted returns, managing drawdown constraints, or outperforming peers). Critically, the learned policy adapts continuously as new data arrives, responding to structural changes that human-engineered models might miss entirely or recognize only in retrospect.
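The trial-and-error loop can be illustrated with a deliberately tiny example. What follows is tabular Q-learning, the simplest ancestor of DRL, not a deep network and not any production system: a lookup table stands in for the neural network, the market has two regimes the agent can observe directly (a strong simplifying assumption), and all return, cost, and probability figures are hypothetical. The names `step`, `train`, and `greedy_policy` are invented for this sketch.

```python
import random

REGIMES = ("calm", "stressed")
ASSETS = ("bonds", "equities")

# Hypothetical mean per-period returns by (regime, asset); illustrative only.
MEAN_RETURN = {
    ("calm", "bonds"): 0.003,
    ("calm", "equities"): 0.020,
    ("stressed", "bonds"): 0.003,
    ("stressed", "equities"): -0.040,
}
RETURN_NOISE = 0.005  # standard deviation of return noise (assumed)
SWITCH_COST = 0.002   # transaction cost when the holding changes (assumed)
STAY_PROB = 0.9       # chance the regime persists each period (assumed)


def step(rng, regime, holding, target):
    """Earn the target asset's noisy return, pay any switching cost, move on."""
    reward = rng.gauss(MEAN_RETURN[(regime, target)], RETURN_NOISE)
    if target != holding:
        reward -= SWITCH_COST
    if rng.random() >= STAY_PROB:
        regime = REGIMES[1 - REGIMES.index(regime)]
    return reward, regime, target


def train(steps=200_000, alpha=0.02, gamma=0.8, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    # One value per (regime, current holding, action); nothing is known at first.
    q = {(r, h, a): 0.0 for r in REGIMES for h in ASSETS for a in ASSETS}
    regime, holding = "calm", "bonds"
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ASSETS)  # explore: try something at random
        else:
            action = max(ASSETS, key=lambda a: q[(regime, holding, a)])
        reward, next_regime, next_holding = step(rng, regime, holding, action)
        best_next = max(q[(next_regime, next_holding, a)] for a in ASSETS)
        key = (regime, holding, action)
        # Nudge the estimate toward reward plus discounted future value.
        q[key] += alpha * (reward + gamma * best_next - q[key])
        regime, holding = next_regime, next_holding
    return q


def greedy_policy(q):
    """Best learned action for each (regime, current holding) state."""
    return {
        (r, h): max(ASSETS, key=lambda a: q[(r, h, a)])
        for r in REGIMES
        for h in ASSETS
    }
```

Calling `greedy_policy(train())` returns the learned decision rule. In this toy world the agent is never told which asset suits which regime or what switching costs; it infers a policy from rewards alone. Real DRL systems replace the table with a neural network and must cope with regimes that are not directly observable.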
DRL directly addresses the "volatility, uncertainty, complexity, and ambiguity" that TPA proponents cite as "new-world" challenges—without requiring human specification of how to handle them. DRL represents the genuine “shift in mindset” that TPA advocates call for: from humans predicting optimal allocations to systems learning optimal decision-making policies. Rather than forecasting the best portfolio mix and rebalancing periodically, DRL continuously learns from experience and sequentially decides adaptive strategies that respond dynamically to transaction costs, market impact, liquidity constraints, and evolving market conditions under uncertainty.
Both academic research and my practical experience as co-founder of Rosetta Analytics demonstrate that DRL systems outperform human-mediated optimization frameworks. As one study concludes: "Reinforcement learning proves capable of optimizing highly complex financial models, including the effects of income taxes, mean-reverting asset classes, and time-varying bond yield curves, all of which other approaches cannot handle. Reinforcement learning appears to be the first fundamentally new approach to the portfolio problem in over 50 years."
Solvable and Insurmountable Barriers
DRL faces real adoption barriers, but they're fundamentally different from the constraints limiting SAA and TPA. The human cognitive bottleneck is insurmountable—no amount of training or better tools will enable human brains to reliably predict multivariate, dynamic systems. DRL's barriers are technical and organizational challenges that can be solved.
The most significant barrier is ‘interpretability.’ Boards and regulators require transparent, human-understandable rationales for investment decisions to satisfy fiduciary duties, but DRL operates as a "black box" discovering patterns through trial and error. This requirement creates tension: DRL's strength lies in finding relationships humans miss, yet governance frameworks demand explicit investment logic. However, researchers and practitioners are actively improving interpretability techniques—attention mechanisms, feature attribution methods, and post-hoc interpretation tools—to make deep learning models more transparent and trustworthy for institutional decision-making.
Other governance challenges include setting appropriate risk limits for self-learning systems that continuously adapt their behavior, concerns about algorithmic failures or unexpected portfolio decisions, and regulatory ambiguity regarding whether DRL satisfies prudent-person standards. From a practical standpoint, high upfront costs, uncertain ROI timelines, severe talent shortages, and significant career risk for CIOs championing unproven approaches further discourage adoption.
But these challenges are being solved.
AI and DRL are not unknown to investment professionals. The CFA Institute includes a chapter on reinforcement learning in AI in Asset Management: Tools, Applications, and Frontiers. However, a well-reasoned exposition on reinforcement learning may not be enough to create a shift in mindset, especially when a leading proponent of TPA, WTW, continues to take a limited view of AI, claiming it currently is "a complement — rather than a rival — to human intelligence," and that "a reasoning AI model will soon be able to research a market, build an investment thesis, and take action on it, perhaps changing asset allocation, albeit with a human's guidance and approval." Once again, AI is reduced to the role of a handmaid rather than recognized as the solution to problems human intelligence cannot solve.
The Trajectory of Portfolio Optimization
The asset allocation challenge facing institutional investors is fundamentally a problem of dynamic optimization in high-dimensional space—a problem that exceeds human cognitive capacity, no matter how sophisticated the framework. Strategic Asset Allocation failed because it treated a dynamic problem as static. Total Portfolio Approach represents incremental progress by acknowledging complexity and expanding the optimization surface, but it remains a more sophisticated hack, not a solution, as it is constrained by the same bottleneck: humans designing rules, selecting factors, and predicting correlations in environments too complex for their ex ante modeling.
The parallel to reinforcement-learning-based autonomous driving is instructive. Human drivers cause over 39,000 traffic deaths annually in the U.S. because they cannot simultaneously process hundreds of variables, predict other drivers' behavior, and optimize decisions in real time—no matter how skilled or experienced they are. Waymo's autonomous robotaxis have experienced only one fatality to date (2025), caused by a human driver colliding with a Waymo vehicle. The broader safety data are equally striking: Waymo's self-learning systems achieve approximately 0.05 crashes per million miles compared to human rates of one to five crashes per million miles, depending on severity and location.
This performance gap exists not because humans are incompetent drivers, but because the cognitive task—processing hundreds of real-time inputs, predicting dynamic behavior, and optimizing across multiple objectives simultaneously—exceeds human processing capacity.
Waymo's advantage lies in its autonomous systems, which continuously adapt without cognitive fatigue.
Portfolio optimization faces the same constraint. Humans cannot reliably predict multivariate dynamic systems, continuously adapt to regime changes, and optimize across multiple objectives simultaneously. We can do it periodically with great effort. Self-learning systems do it continuously.
The path forward isn't better human frameworks — it's eliminating human prediction from the optimization process entirely. Deep Reinforcement Learning systems learn optimal policies through interaction with market environments, discovering relationships and adapting to regime changes without requiring humans to specify what to look for or how assets will behave. The barriers to DRL adoption — interpretability, governance structures, implementation costs, talent scarcity, and career risk—are real but solvable. They're technical and organizational challenges, not fundamental limits. Just as we're moving past early skepticism about autonomous driving because the performance advantage is undeniable, portfolio optimization will follow the same trajectory.
For asset owners, the practical implication is clear. If your organization can implement TPA effectively—meaning it has the governance structure, data infrastructure, and analytical capacity—then it's worth pursuing as an incremental improvement over SAA. But understand you're optimizing within boundaries set by human cognition, and your performance ceiling is accordingly capped. The institutions that will solve the optimization problem over the next decade are those that invest in DRL: building interpretability frameworks, training or acquiring talent, running pilot programs in constrained strategy spaces, and educating boards on why self-learning systems represent evolution, not risk.
The question isn't whether algorithmic portfolio optimization will replace human-designed frameworks. The question is whether your institution will lead that transition or be disrupted by it.
Angelo Calvello, PhD, is the founder of C/79 Consulting LLC and writes extensively on the impact of AI on institutional investing. All views expressed herein are solely those of the authors and not those of any entity with which the authors are affiliated.