There is no question that AI can provide useful research, if utilized correctly and subject to validation. As Rak Garg, Partner at Bain Capital Ventures and one of the contributors to this recent Forbes article on the rise of AI in the boardroom, indicates, AI certainly helps to solve the “blank slate” problem of getting a project off the ground and can do much of the heavy lifting of research and base composition.
Indeed, the market now demands AI, and the competitive pressure is only increasing. The stakes are being raised. The strategic imperative to reach for and meet evolving capabilities is now written into the DNA of what it means to be a modern business. Generative and Predictive AI do not need to be any more advanced than they currently are (and certainly not conscious) to have a significant impact on the bottom line when deployed properly. Moreover, their application need not - and should not - be limited to cost reduction. There is clearly low-hanging fruit waiting to be picked, and inefficient operations to be streamlined, but AI can also help your most valuable resource - your people - work smarter. AI can deliver a true productivity lift by assisting in smarter decision-making for the long term.
But here’s the rub. To extract these benefits, implementing this new and improved AI requires both a considered use-case approach for your organization and an effective governance structure at the user level. AI cannot act as a reliable substitute for informed decision-making without verification and human input. Notwithstanding its capacity to create efficiencies and to generate novel, multi-disciplinary insights for timely action, it should not be relied upon exclusively.
Beyond its upside potential, AI will continue to present risks to business operations and market standing, arising from inaccuracies, biases and other exposures associated with its use. Human-first AI demands embedded review processes, and it also assumes that internal data has already been suitably sanitized and formatted for appropriate use. Further, the reality is that preserving critical thinking and effective collaboration within an organization both require active participation by your people. Reliance on AI output without human review “close to the bone”, particularly for strategic decision-making, is a recipe for disaster.
The recent KPMG Generative AI Survey of 225 senior business leaders found that 54% expect AI regulation to increase organizational costs. Much of the focus is, understandably, on improving cybersecurity and data quality. More than half consider risk-mitigation to be critical to AI in enterprise functions, with many citing data privacy concerns, regulatory uncertainty, and general lack of readiness for the threats posed by the rapidly evolving cyber environment.
There is also the emerging phenomenon of AI risk being declared on the financial statements of the largest organizations in the world. According to a new study published by Arize AI, a platform tracking public disclosures by large enterprises, 56% of Fortune 500 companies cited AI as a financial risk in the past year. That's up 460% since 2022.
The regulatory landscape, like all things, does not exist in a vacuum. It stems from the nature of AI systems themselves. Andy Byrne, CEO of Clari, is quoted in the Forbes article as saying, “Ensuring robust AI governance, security measures, data privacy, and legal compliance is crucial for bridging the trust gap and enabling the scalable deployment of AI within enterprises... AI can’t be a black box.”
This is a laudable statement. Yet there is more to this than meets the eye. The problem remains that even with increased visibility into Large Language Model (LLM) operations, a credibility gap will persist. LLMs are probabilistic systems of output generation and are fundamentally unconcerned with factual correctness; they work on weighted patterns of language formation. There is also a recognized inverse relationship between the reliability and the capacity of LLMs: the more capacity a model has in terms of training data and parameters, the less reliable its output can become, given the probabilistic nature of these systems. As such, even knowing the precise calibration of any LLM (nigh on impossible), some degree of trust deficit will always remain with AI output, since this technology is a black box.
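To make that point concrete, here is a deliberately toy sketch in Python. It is not how any production LLM actually works, and the vocabulary and probabilities are invented for illustration; it simply shows a system that picks its next word by weighted sampling, with nothing in the mechanism checking whether the result is true.

```python
# Toy illustration of probabilistic text generation. The "model" below just
# samples the next word from a weighted distribution; truth never enters into it.
import random

# Hypothetical next-word distribution after the prompt "Our Q3 revenue was".
# These words and weights are invented purely for this example.
next_word_probs = {
    "up": 0.40,
    "down": 0.30,     # just as available to the sampler, whatever the facts are
    "flat": 0.20,
    "$12M,": 0.10,    # a confident-sounding specific the sampler may simply produce
}

def sample_next_word(distribution: dict[str, float]) -> str:
    """Pick a word weighted by probability; nothing here checks factual correctness."""
    words, weights = zip(*distribution.items())
    return random.choices(words, weights=weights, k=1)[0]

if __name__ == "__main__":
    # The same prompt can produce different continuations on different runs,
    # which is one reason downstream human verification matters.
    for _ in range(5):
        print("Our Q3 revenue was", sample_next_word(next_word_probs))
```

Real LLMs do this over tens of thousands of tokens and billions of learned weights, but the underlying move is the same: a weighted choice, not a checked fact.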
In fact, the solution to this problem is not intra-model at all but is an external one - validation and verification by humans. Good ole checking. It is becoming increasingly clear from a variety of sources, including new legislation, regulatory frameworks and interested stakeholders, that transparency and explainability are now being treated as paramount. This is a good development. But humans are the key. The new mandate in this direction will not be satisfied by LLMs alone, nor by LLMs validating other LLMs. Mindful business leaders will be wary of the infinite regress here - it’s turtles all the way down.
At face value, human review may not sound like the most efficient solution to the validation problem. But is that really the case if human review is seamlessly integrated into the workflow? AI output is already re-writing the rules on multi-disciplinary work and, therefore, on the teams required to be involved. And what institutional learning and sense of ownership do you wish to see preserved, maintained and developed within your organization? How else will you achieve it? What will your insurers and other stakeholders think about the undocumented provenance of AI decision-making without verification?
I agree with Rak Garg, when he says, “It's important for companies to understand not just the technology, but also the potential risks and tradeoffs associated with LLMs, and to define a mitigation plan for those risks." My own assessment is that the devil is in the details, not just in model development, evaluation and AI monitoring, but also in full-circle engagement with humans at the user operational level.
In fact, Byrne and Garg both stress the importance of continued reliance on human judgment, experience, and management direction in the evaluation and application of AI-generated insights. As part of a broader architecture of policy development, Garg also cites key objectives and action steps: Risk Assessment, Robust Evaluation of LLMs, Explainability, and Communication with internal and external stakeholders (including training). Who could argue with the proposition that these steps are not only critical but also ongoing priorities? There is no space to be static.
I would add that much of the action must occur lower down the supply chain of information. Achieving effective engagement with this technology, and allowing its value to percolate up to confident Board decision-making, requires a platform designed for users up and across the organizational hierarchy. A centralized gateway to LLMs and other models allows secure, permission-based interaction, ensuring they are used consistently, collaboratively and efficiently, with validation and audit-trail visibility.
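As a rough illustration of what such a gateway might look like - a sketch built on my own assumptions, not a description of any particular product - the following Python fragment combines a permission check, a stubbed model call, a human sign-off step and an append-only audit log.

```python
# Minimal sketch of a centralized AI gateway: permission checks before any model
# call, a human sign-off step, and an audit trail. All names are illustrative.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AuditEntry:
    user: str
    prompt: str
    model_output: str
    reviewed_by: str | None = None
    approved: bool = False
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

class AIGateway:
    def __init__(self, permitted_users: set[str]):
        self.permitted_users = permitted_users
        self.audit_log: list[AuditEntry] = []

    def ask(self, user: str, prompt: str) -> AuditEntry:
        """Route a request to the model only for permitted users, and record it."""
        if user not in self.permitted_users:
            raise PermissionError(f"{user} is not authorized to query the model")
        output = self._call_model(prompt)            # placeholder for the real LLM call
        entry = AuditEntry(user=user, prompt=prompt, model_output=output)
        self.audit_log.append(entry)
        return entry

    def sign_off(self, entry: AuditEntry, reviewer: str, approved: bool) -> None:
        """Human verification step: nothing should move downstream until approved."""
        entry.reviewed_by = reviewer
        entry.approved = approved

    def _call_model(self, prompt: str) -> str:
        return f"[model response to: {prompt}]"      # stubbed out in this sketch

# Intended flow: query through the gateway, then explicit human review before use.
gateway = AIGateway(permitted_users={"analyst_1"})
result = gateway.ask("analyst_1", "Summarize Q3 pipeline risks")
gateway.sign_off(result, reviewer="vp_sales", approved=True)
```

In a real deployment the permissioning, review and logging would hook into existing identity, workflow and records systems; the point is simply that validation and audit-trail visibility are properties of the platform, not afterthoughts.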
It’s fair to say that the market is still immature in this regard. The enterprise architecture for secure engagement is not yet widely available. Some large enterprises are figuring out their own solutions. Since Generative AI marched onto the scene at the tail end of 2022, there has been an increasing realization of the difference between promise and product. Beyond the hype, how does this product fit into my business? In fact, this issue encompasses both the consumer market and the enterprise market, with some signs of change on the consumer end. The central takeaway at present, however, is that an LLM may have the engine of an airplane, but without the other parts to make it fit for purpose, it’s never gonna fly.
Unless these steps are taken - unless the plane is built, if you like - in my view the operational deficits will persist. This is especially important, of course, if AI is being used to generate recommendations, e.g. on new product development or strategic market direction. Garg, in the Forbes piece, emphasizes the importance of data. Understanding the quality of company data, ensuring that it is cleaned up and structured, and knowing how to deploy it effectively across the organization, have taken on hugely increased significance with the acceleration of AI. This makes a lot of sense. After all, without good data, little is possible.
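For readers who want something concrete, a minimal sketch of the kind of data-quality gate this implies might look like the following; the column names, sample values and pandas-based approach are my own assumptions for illustration, not anything prescribed in the Forbes piece.

```python
# Hedged illustration: simple data-quality signals to inspect before company data
# is handed to any AI system. Columns and sample values are invented.
import pandas as pd

def basic_quality_report(df: pd.DataFrame, required_columns: list[str]) -> dict:
    """Return simple signals about whether a dataset is clean enough to use."""
    return {
        "missing_required_columns": [c for c in required_columns if c not in df.columns],
        "duplicate_rows": int(df.duplicated().sum()),
        "null_fraction_by_column": df.isna().mean().round(3).to_dict(),
    }

if __name__ == "__main__":
    sample = pd.DataFrame({
        "customer_id": [1, 2, 2, 4],
        "region": ["EMEA", None, "EMEA", "APAC"],
        "revenue": [120.0, 95.5, 95.5, None],
    })
    print(basic_quality_report(sample, required_columns=["customer_id", "region", "revenue"]))
```

Checks like these are the unglamorous groundwork; the strategic insights discussed above are only as reliable as the data that feeds them.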
Operationally, however, the stakes have also been raised: workflows built around AI often require differently configured teams. Going forward, Generative AI demands a different approach to end-to-end ideation and implementation. Active engagement and process change at the coalface will not only enable smart decisions by the Board but will be critical to success. Seamless integration of secure human review will become table stakes just to play at all.
About R. Scott Jones
I am a Partner in Generative Consulting, an attorney and CEO of Veritai. I am a frequent writer on matters relating to Generative AI and its successful deployment, both from a user perspective and that of the wider community.