Highlights from the Bessemer Venture Partners report and our take on what enterprises can expect from the future of purpose-built tools for leveraging AI.
In the rapidly evolving world of Artificial Intelligence (AI), staying ahead of the curve isn't just an advantage – it's a necessity. A recent report from Bessemer Venture Partners sheds light on the emerging AI infrastructure paradigm, providing insights into the future of purpose-built tools for enterprises to leverage AI. Here are the highlights from the report, along with reflections from Shahin Atai, Head of AI at HiQ Sweden.
The report highlights a critical shift in the tech world: the emergence of software and infrastructure companies outside the major hyperscalers that are building the next generation of AI services and solutions. This isn't just an incremental change; it's a fundamental rethinking of how we approach AI development and deployment. And it’s exciting to watch.
The key innovations are happening across multiple layers of the AI stack:
The model layer is becoming increasingly dynamic and competitive. We're seeing rapid advances in scaling techniques, novel language model architectures, and specialized foundation models for different domains. Such diversity is opening new opportunities for AI applications across industries.
New, advanced models, especially large language models (LLMs), are performing exceptionally well on a range of benchmark tasks using publicly available data, which increasingly makes them effective for more complex and domain-specific tasks. A good example is healthcare foundation models such as Med-PaLM, a model developed and trained by Google and designed to provide high-quality, accurate answers to medical questions.
Innovations in the compute layer, spanning GPUs and TPUs, custom chips, operating systems, and AI cloud operating models, are addressing critical bottlenecks in AI model training, deployment, and inference. Techniques such as self-attention mechanisms and KV cache optimizations are dramatically improving efficiency and reducing memory footprint across the compute layer.
Evan Morikawa, who led the Applied engineering team at OpenAI (the team behind ChatGPT), discussed the challenges of scaling LLMs and the compute layer earlier this year. He specifically mentioned self-attention mechanisms and KV cache optimizations as techniques used to improve ChatGPT’s efficiency.
Self-attention techniques help language models analyze sentences to determine which words are crucial and how they relate to each other, enhancing overall generation capabilities. Meanwhile, KV cache optimizations act like a quick-reference notebook for the language model: previously computed attention keys and values are stored in memory, so the model doesn't have to repeat those calculations for every new token it generates.
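To make this concrete, here is a minimal sketch of scaled dot-product self-attention with a simple key-value cache, written in plain Python/NumPy. It is our own illustration of the general technique, not OpenAI's implementation; the class, weights, and shapes are assumptions made for the example.

```python
# Minimal sketch of self-attention with a KV cache (illustrative only).
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class CachedSelfAttention:
    def __init__(self, d_model, seed=0):
        rng = np.random.default_rng(seed)
        # Random projection weights stand in for trained parameters.
        self.Wq = rng.normal(size=(d_model, d_model))
        self.Wk = rng.normal(size=(d_model, d_model))
        self.Wv = rng.normal(size=(d_model, d_model))
        self.k_cache = []  # keys for all previously seen tokens
        self.v_cache = []  # values for all previously seen tokens

    def step(self, x):
        """Process one new token embedding x of shape (d_model,).

        Keys/values for earlier tokens come from the cache, so each
        decoding step only computes projections for the new token.
        """
        q = x @ self.Wq
        self.k_cache.append(x @ self.Wk)
        self.v_cache.append(x @ self.Wv)
        K = np.stack(self.k_cache)           # (t, d_model)
        V = np.stack(self.v_cache)           # (t, d_model)
        scores = K @ q / np.sqrt(len(q))     # new token's attention to all tokens
        return softmax(scores) @ V           # weighted mix of cached values

d = 8
attn = CachedSelfAttention(d)
for token_embedding in np.random.default_rng(1).normal(size=(5, d)):
    out = attn.step(token_embedding)
print(out.shape)  # (8,)
```

Without the cache, every generated token would force the model to recompute keys and values for the entire preceding sequence, which is exactly the repeated work the optimization avoids.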
Given that data is the lifeblood of AI and Generative AI (GAI), the AI revolution is catalyzing significant changes across the data stack. Traditional data infrastructure is being reimagined to support AI workloads, with unstructured data volumes projected to explode to ~612 zettabytes by 2030. This surge is expected to necessitate new data and storage toolchains, and players like Weaviate and Databricks are moving fast to address these needs.
A new wave of startups is emerging, building systems with AI language models at their core or enhancing existing capabilities with AI technologies. This trend is exposing the limitations of current data infrastructure and tooling, which are not yet fully customized for AI use cases, and is driving demand for AI-specific tooling to unlock future data management strategies.
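To illustrate the kind of capability this new tooling centers on, here is a minimal sketch of the core operation behind vector databases such as Weaviate: storing embeddings and retrieving the nearest neighbors by cosine similarity. The store and its methods are our own illustration, not any vendor's API; production systems add approximate-nearest-neighbor indexing (e.g. HNSW) to scale.

```python
# Minimal sketch of the core idea behind vector databases:
# store embeddings, retrieve nearest neighbors by cosine similarity.
import numpy as np

class TinyVectorStore:
    def __init__(self):
        self.ids, self.vectors = [], []

    def add(self, doc_id, embedding):
        self.ids.append(doc_id)
        self.vectors.append(np.asarray(embedding, dtype=float))

    def search(self, query, top_k=3):
        q = np.asarray(query, dtype=float)
        sims = [
            float(v @ q / (np.linalg.norm(v) * np.linalg.norm(q)))
            for v in self.vectors
        ]
        ranked = sorted(zip(self.ids, sims), key=lambda p: p[1], reverse=True)
        return ranked[:top_k]

store = TinyVectorStore()
store.add("doc-a", [0.9, 0.1, 0.0])
store.add("doc-b", [0.1, 0.9, 0.0])
store.add("doc-c", [0.8, 0.2, 0.1])
print(store.search([1.0, 0.0, 0.0], top_k=2))  # doc-a and doc-c rank highest
```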
Companies in the orchestration layer, such as LangChain and LlamaIndex, are becoming critical players in AI delivery and application development. They're providing frameworks that simplify the development of AI-embedded software applications by abstracting away much of the underlying complexity.
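LangChain and LlamaIndex each have their own APIs, so rather than reproduce them here, the sketch below shows the underlying orchestration pattern such frameworks abstract away: fill a prompt template, call a model, parse the output, and chain the steps. The call_llm function is a hypothetical stand-in for a real model API client.

```python
# A sketch of the orchestration pattern that frameworks such as LangChain
# abstract away: prompt templating -> model call -> output parsing, chained.
from typing import Callable

def call_llm(prompt: str) -> str:
    # Placeholder: a real implementation would call a hosted model API.
    return f"[model response to: {prompt!r}]"

def make_chain(template: str, parse: Callable[[str], str]):
    """Build a step that fills a prompt template, calls the model, parses."""
    def step(**variables) -> str:
        return parse(call_llm(template.format(**variables)))
    return step

summarize = make_chain(
    "Summarize the following text in one sentence:\n{text}",
    parse=str.strip,
)
translate = make_chain(
    "Translate to Swedish:\n{text}",
    parse=str.strip,
)

# Chain the two steps: summarize first, then translate the summary.
summary = summarize(text="AI infrastructure is evolving quickly...")
print(translate(text=summary))
```

In a real application each step would also handle retries, streaming, and structured outputs, which is precisely the complexity these frameworks take off developers' plates.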
The Bessemer Venture Partners report is one of the most comprehensive AI infrastructure roadmaps to date. While it's difficult to predict exactly where AI infrastructure will be in three to four years, the report provides valuable insights into where the field is headed. Here's our take on what enterprises can expect:
We expect foundation models that leverage state-of-the-art GPU clusters to continue to grow in performance and capability. For most organizations, these models will likely be accessed through cloud infrastructure provided by the large hyperscalers, since consuming hosted models for inference remains far more cost-effective than training them in-house, at least for the time being.
Organizations of all sizes will increasingly double down on data science and engineering capabilities. The shift will involve finding new skills, investing in data-driven capabilities, and adopting new AI tools and infrastructure. Organizations that are slow to make this transition risk falling behind in an increasingly AI-driven business landscape.
We expect to see a growing emphasis on the collection and use of domain-specific, high-quality data. Human-in-the-loop (HITL) platforms and frameworks will play a critical role in consolidating and refining data for broader AI use cases.
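As a rough illustration of the HITL pattern (our own sketch, not any specific platform's workflow): confident model predictions are accepted automatically, uncertain ones are escalated to a human reviewer, and the verified labels are collected as refined training data.

```python
# A rough sketch of a human-in-the-loop (HITL) labeling workflow.
# Names, labels, and thresholds are illustrative assumptions.
def model_predict(item: str) -> tuple[str, float]:
    # Placeholder for a real classifier returning (label, confidence).
    return ("invoice", 0.62) if "total" in item else ("email", 0.97)

def human_review(item: str, suggested: str) -> str:
    # Placeholder for a review UI; here we simply accept the suggestion.
    return suggested

CONFIDENCE_THRESHOLD = 0.90
training_examples = []

for item in ["total: 940 SEK", "hi team, quick update"]:
    label, confidence = model_predict(item)
    if confidence < CONFIDENCE_THRESHOLD:
        label = human_review(item, suggested=label)  # escalate to a human
    training_examples.append((item, label))          # refined data for reuse

print(training_examples)
```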
As AI becomes more integrated into business operations, we expect to see an increased focus on AI operations (AIOps) tools and frameworks. These will be critical for monitoring and evaluating AI model performance, including metrics such as bias, reasoning, training efficiency, latency, and compute cost.
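A simple sketch of what such monitoring might look like in practice, with metric names, pricing, and thresholds as purely illustrative assumptions: wrap each model call, record latency and estimated cost, and flag regressions.

```python
# Minimal sketch of AIOps-style monitoring: wrap model calls to record
# latency and estimated compute cost, and alert on regressions.
import time

COST_PER_1K_TOKENS = 0.002   # assumed price, for illustration only
LATENCY_BUDGET_S = 2.0       # assumed service-level budget

metrics_log = []

def monitored_call(model_fn, prompt: str) -> str:
    start = time.perf_counter()
    response = model_fn(prompt)
    latency = time.perf_counter() - start
    tokens = len(prompt.split()) + len(response.split())  # crude token proxy
    cost = tokens / 1000 * COST_PER_1K_TOKENS
    metrics_log.append({"latency_s": latency, "est_cost_usd": cost})
    if latency > LATENCY_BUDGET_S:
        print(f"ALERT: latency {latency:.2f}s over budget")
    return response

def fake_model(prompt: str) -> str:
    return "stub answer"  # stand-in for a real model call

monitored_call(fake_model, "What is our churn risk this quarter?")
print(metrics_log)
```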
Beyond chatbots, we expect to see new AI-embedded application archetypes built on emerging AI infrastructure platforms. Players like DataRobot, Databricks, and Airbyte, as well as larger companies like Amazon (with Trainium and Inferentia), Microsoft, and Meta, are likely to drive this innovation forward and unlock AI capabilities for multi-agent predictive maintenance, intelligent control, or AI-at-the-edge systems. Watch this space!
As noted in the report, there's a significant opportunity for emerging AI infrastructure players and services to provide the "picks and shovels" to enterprises. These picks and shovels will eventually unlock new approaches to data management (DataOps) and to integrating machine learning (ML) and AI operations (MLOps/AIOps) across enterprises.
“As we approach an AI-driven future, organizations must prepare for rapid evolution in the AI infrastructure landscape. Success in this new era will hinge on several critical factors. First, agility and adaptability will be paramount—companies must be ready to swiftly adopt and integrate emerging AI technologies. Second, strategic data management will be crucial, focusing not just on data collection, but on ensuring its quality and relevance for AI-embedded applications. Third, investing in talent development, particularly in AI literacy and specialized skills, will be essential across all levels of the organization,” says Shahin Atai, Head of AI at HiQ Sweden.
The AI infrastructure revolution is not just on the horizon—it's already here, reshaping the landscape of enterprise technology. As we've explored, this shift presents both challenges and opportunities for businesses across all sectors.
At HiQ, we're committed to helping our clients navigate this complex and rapidly evolving terrain. Whether you're just beginning your AI journey or looking to scale your existing capabilities, our team of experts is ready to provide the guidance, tools, and support you need to thrive in this AI-driven future. We can not only help design and build AI-embedded solutions, but also help you form a “point of view” and think through the first steps of your AI journey.