From pipelines to federated search: how cost, complexity, and scale force a full-stack rethink of the SIEM.

The stakes in security are higher than ever. The attack surface is expanding rapidly, threats are becoming more sophisticated, and security teams are under pressure to respond with greater speed and agility. 

We’ve previously written about why the SOC Stack is evolving, but increasingly, security teams are zeroing in on the engine powering their stack: the SIEM (Security Information & Event Management).

And that engine is starting to break down. 

Traditional, centralized SIEMs are struggling to keep up, not just due to cost and complexity, but because the security landscape itself is evolving in five key ways:

  1. Telemetry data is exploding, overwhelming legacy systems that weren’t built—or priced—for this scale. (Splunk’s volume-based pricing model quickly becomes unsustainable.)
  2. Cost pressures are intensifying, pushing teams toward modular, pipeline-first architectures that offer more economic flexibility.
  3. The security toolchain is fragmenting, increasing the need for dynamic routing, normalization, and transformation across diverse data sources.
  4. Analysts are stretched thin and need better signal-to-noise ratios plus faster access to clean, actionable insights.
  5. Customers want vendor neutrality, pushing the adoption of new standards like OCSF (Open Cybersecurity Schema Framework) and interoperable capabilities across solutions. 

These shifts are fueling sustained demand for unbundled SIEMs. This isn’t just a workaround to Splunk’s pricing model—it is a fundamental reset for how security infrastructure is being rebuilt.

Breaking Down Splunk: A Legacy Giant with Cracks in the Foundation 

Splunk generated $900M in ARR before its Cisco acquisition and almost $2B today, making it the case study in both the promise and the limitations of centralized SIEMs.

At its core, a SIEM ingests log data from many sources (EDRs, API logs, firewalls, etc.), indexes it to make it queryable, and then delivers an analytics layer that allows SecOps to detect, investigate, and respond to threats. Splunk’s architecture revolves around three core components: Forwarders, Indexers, and Search Heads. All three now show signs of strain in today’s security environment.
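To make the flow concrete, here is a toy sketch of those three stages in Python. The function names and data shapes are invented for illustration; they bear no resemblance to Splunk’s actual internals.

```python
# Toy model of the ingest -> index -> search flow described above.
# All names and structures here are hypothetical, for illustration only.
from collections import defaultdict

def forward(sources):
    """Forwarder: collect raw events from many sources and pass them on as-is."""
    for source, events in sources.items():
        for event in events:
            yield {"source": source, "raw": event}

def index(event_stream):
    """Indexer: store events and build an inverted index so they're queryable."""
    store, inverted = [], defaultdict(set)
    for event in event_stream:
        doc_id = len(store)
        store.append(event)
        for token in event["raw"].lower().split():
            inverted[token].add(doc_id)
    return store, inverted

def search(store, inverted, term):
    """Search Head: resolve a query against the index and return matching events."""
    return [store[i] for i in sorted(inverted.get(term.lower(), []))]

sources = {
    "firewall": ["deny tcp 10.0.0.5 -> 203.0.113.9"],
    "edr": ["process powershell.exe spawned by winword.exe"],
}
store, inverted = index(forward(sources))
print(search(store, inverted, "powershell.exe"))
```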

The Forwarder: Where Noise and Cost Begin

The Forwarder collects log data from across an organization—security, application, endpoint, system logs, etc.—and streams it to the Indexer. It was designed to be data-agnostic, transmitting everything as-is with no filtering. 

In the early days, this was a strength. But with today’s volumes and Splunk’s volume-based pricing, it’s become a cost liability.

Some Fortune 100 financial institutions report spending $10M+ annually on Splunk, much of it to store low-value data.

Splunk introduced the Heavy Forwarder (HF) to address this, offering filtering and pre-processing at the edge. But HFs are resource-hungry (high CPU, memory, and disk I/O) and don’t meaningfully solve the core challenge: smarter routing.

This is where Cribl found its wedge, capitalizing on one element of the value chain: filtering, pre-processing, and routing raw log data to cut Splunk storage costs dramatically.

Its value prop was simple—strip out low-value metadata (e.g., nanosecond-precision timestamps, MAC addresses, verbose network headers, thermal readouts) and route only essential logs to Splunk, pushing the rest to cheap cold storage like S3. It worked.
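In pipeline terms, the pattern is simple enough to sketch in a few lines of Python. Everything below is hypothetical (field names, source types, the “low-value” list); it is an illustration of the idea, not Cribl’s actual API.

```python
# A minimal filter-and-route sketch (illustrative only, not Cribl's API).
import json

LOW_VALUE_FIELDS = {"mac_address", "network_headers", "thermal_readout"}
ESSENTIAL_SOURCES = {"auth", "edr", "firewall"}

def slim(event: dict) -> dict:
    """Strip low-value metadata before it reaches the (expensive) indexer."""
    return {k: v for k, v in event.items() if k not in LOW_VALUE_FIELDS}

def route(event: dict) -> str:
    """Send essential logs to the SIEM; push everything else to cold storage."""
    return "siem" if event.get("sourcetype") in ESSENTIAL_SOURCES else "s3"

event = {"sourcetype": "netflow", "mac_address": "aa:bb:cc:dd:ee:ff", "bytes": 512}
destination = route(event)         # -> "s3": not essential, skip the SIEM
payload = json.dumps(slim(event))  # low-value fields dropped either way
```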

Cribl rocketed to $100M ARR in under four years, then doubled by early 2025. In the process, it broke Splunk’s lock-in and opened the door to a new category of challengers: Security Data Pipeline Platforms (SDPPs).

But that’s just the first chapter.  

The next generation of pipelines won’t just collect, filter, and route—they’ll embed intelligence en route. We have already seen vendors shift detection earlier in the stream by enabling Sigma rule execution within the pipeline layer, catching threats in real time without relying on a centralized SIEM. These pipelines are starting to behave more like active participants in detection, not passive transports.
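To make that concrete, here is a heavily simplified sketch of Sigma-style matching running in-stream. Real Sigma rules are YAML documents with far richer condition logic and compilation backends; the rule and matcher below are toys.

```python
# Toy in-pipeline evaluation of a Sigma-style rule (heavily simplified;
# real Sigma is YAML with richer conditions, modifiers, and backends).
rule = {
    "title": "Suspicious PowerShell EncodedCommand",
    "selection": {
        "Image|endswith": "powershell.exe",
        "CommandLine|contains": "-EncodedCommand",
    },
}

def matches(event: dict, selection: dict) -> bool:
    """Check every selection field (with its modifier) against the event."""
    for key, expected in selection.items():
        field, _, modifier = key.partition("|")
        value = str(event.get(field, ""))
        if modifier == "endswith" and not value.endswith(expected):
            return False
        if modifier == "contains" and expected not in value:
            return False
        if modifier == "" and value != expected:
            return False
    return True

event = {
    "Image": r"C:\Windows\System32\powershell.exe",
    "CommandLine": "powershell.exe -EncodedCommand SQBFAFgA...",
}
if matches(event, rule["selection"]):
    print(f"ALERT: {rule['title']}")  # fires in-stream, before any SIEM
```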

Another exciting frontier is the emergence of memory within the pipeline. By maintaining a long-term understanding of how telemetry evolves—and how it’s acted upon—pipelines can power adaptive enrichment, retrospective analysis, and smarter AI-driven recommendations.

Rather than starting from zero with each event, the pipeline can remember, making it sharper and more relevant over time. And because these systems are being built AI-native from day one, they’re well-positioned to evolve into fully agentic architectures: automating rule creation, enrichment, and even response. 

The near-term unlock is clear: lower costs and faster detection. But the long-term opportunity is much bigger—a new operating layer for security that’s fast, smart, and continuously learning.

The Indexer: A Bottleneck of Cost and Visibility 

The Indexer ingests data from the Forwarder and builds searchable indexes so data can be queried. Splunk’s model charges for data ingestion (GB/day) and querying (compute/workload)—a formula that made sense in 2012 but buckles under 2025-scale telemetry.
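A back-of-envelope calculation shows why. The per-GB price below is purely illustrative (actual Splunk pricing varies by tier and contract), but the linearity is the point:

```python
# Illustrative only: the price is hypothetical, not a Splunk list price.
PRICE_PER_GB_DAY = 150  # assumed $/yr per GB of daily ingest capacity

for gb_per_day in (500, 2_000, 10_000):  # telemetry growth over a few years
    annual_license = gb_per_day * PRICE_PER_GB_DAY
    print(f"{gb_per_day:>6} GB/day -> ${annual_license:>9,.0f}/yr before compute")

# Ingest-linear pricing means a 20x telemetry increase is a 20x bill,
# even when most of the new data is low-value and rarely queried.
```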

The economics alone are challenging, but the impact goes beyond just cost. Because storing data in platforms like Splunk is so expensive, many teams have been forced to pre-filter logs, choosing not to collect, store, or retain specific data. 

This means critical visibility is often lost before an analyst ever runs a query. As a result, teams are not just dealing with operational inefficiency; they’re accepting blind spots in detection and investigation — a direct risk to security efficacy.

Cribl initially addressed this by managing data routing, but this introduced its own challenge: a proliferation of storage destinations. Cribl’s recent report highlights this trend, with 90% of customers using two or more destinations and the number growing 15% year-over-year. The (intentional or accidental) move to a multi-SIEM world is already well underway.

This fragmentation has fueled the adoption of horizontal storage and query technologies like ClickHouse and Apache Iceberg, which decouple storage from compute. Meanwhile, log-native platforms like Hydrolix are gaining traction for their performance and cost advantages.

Enterprises increasingly store logs across a mix of cold storage (S3, Azure Blob Storage) and queryable systems like ClickHouse, especially when navigating multi-year cloud commitments.

The result is the rise of “Security Data Lakes,” a more versatile, if sprawling, means of storing disconnected security data. While some organizations are building this infrastructure in-house, the shift has created a clear opportunity at the analytics and querying layer — the third core pillar of Splunk — to deliver not just cheaper log analysis, but more complete and effective visibility across security operations.

The Search Head: Rethinking the Analyst Experience

The Search Head is the interface between users and log data. It’s where querying, dashboarding, alerting, and automation all come together. Once data is ingested, automating alerts (e.g., breaches, threat detection) and building dashboards for monitoring and root cause analysis become critical.

Splunk handles querying through SPL, a proprietary language with a steep learning curve that increasingly strains at scale. Historically, this part of the stack has been the most neglected, both by incumbents and startups.

Cribl, for example, built its early traction on log routing but is now expanding into analytics and search with products like Cribl Search. Still, the core problems remain: SPL is clunky and slow, centralizing data is expensive, and SOC analysts are stuck jumping between a dozen tools to conduct a single investigation.

Meanwhile, SOC teams now juggle ~4,500 alerts per day. Reducing MTTR (mean time to respond) is non-negotiable, making time-to-insight absolutely critical. That’s why the introduction of OCSF in 2022 was a game-changer: it standardized disparate security logs, unlocking faster detection, response, and analysis.
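As a rough sketch, normalizing a vendor-specific login event into OCSF shape might look like the following. The mapping is simplified and covers only a sliver of the schema; schema.ocsf.io is the authority on class and field definitions.

```python
# Simplified OCSF-style normalization (partial; see schema.ocsf.io for
# the authoritative Authentication class definition).
def to_ocsf_auth(raw: dict) -> dict:
    return {
        "class_uid": 3002,    # OCSF Authentication class
        "activity_id": 1,     # 1 = Logon
        "time": raw["ts"],
        "status": "Failure" if raw["result"] == "denied" else "Success",
        "user": {"name": raw["user"]},
        "src_endpoint": {"ip": raw["src_ip"]},
        "metadata": {"product": {"name": raw["vendor"]}},
    }

# Different vendors' login events normalize to the same shape,
# so one detection rule can cover all of them.
idp_event = {"ts": 1717000000, "result": "denied", "user": "alice",
             "src_ip": "203.0.113.7", "vendor": "ExampleIdP"}
print(to_ocsf_auth(idp_event))
```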

OCSF laid the foundation for a new wave of solutions that don’t just normalize data but rethink how analysis is done. We’re now seeing companies push the envelope with federated search, which lets users query across systems without moving data. That’s a significant shift, not just technically but in how teams interact with security data.
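A minimal sketch of the pattern: fan the query out to wherever the data already lives, then merge. The backends below are stand-ins for real connectors.

```python
# Minimal federated-search sketch; both backends are hypothetical stand-ins.
from concurrent.futures import ThreadPoolExecutor

def search_s3_archive(query):   # stand-in for a cold-storage scanner
    return [{"backend": "s3", "event": f"archived hit for {query!r}"}]

def search_clickhouse(query):   # stand-in for a hot analytical store
    return [{"backend": "clickhouse", "event": f"recent hit for {query!r}"}]

BACKENDS = [search_s3_archive, search_clickhouse]

def federated_search(query):
    """Query every backend in parallel; the data never moves or re-indexes."""
    with ThreadPoolExecutor() as pool:
        per_backend = pool.map(lambda fn: fn(query), BACKENDS)
    return [hit for hits in per_backend for hit in hits]

print(federated_search("user=alice action=login"))
```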

The combination of fragmented sources, growing log volumes, and OCSF adoption has created a greenfield opportunity to rethink the Search Head. 

The future Search Head isn’t just about speed. It’s about composability, working across data sources, formats, and tools to give analysts the shortest path from signal to response. Whether through federated search or other innovations, we’re watching this space closely.

Wrap

The SIEM is being reimagined from the ground up. And while we’ve chosen to use Splunk as our foil, the limitations we explored, from cost to complexity, from lock-in to inefficiency, are present across all legacy architectures.

What began with data pipeline innovation has become a full-stack rethink of how security teams collect, store, analyze, and act on telemetry. 

This wave is about unbundling the legacy SIEM into modular, cloud-native primitives, whether through smarter routing, more flexible storage, or modern analytical layers that enable faster, broader, and more contextual investigations. 

This is more than cost reduction. It’s about giving analysts leverage: the ability to detect faster, investigate deeper, and respond more intelligently.

From federated query engines to AI-native investigation assistants to full-blown SIEM replacements—the surface area for innovation is only getting bigger. We’re in the golden era for security infrastructure innovation.

This is the kind of architectural shift that breeds generational companies. We’re excited to back the team building its foundation.

 


This article is for informational purposes only and does not constitute investment advice. Views expressed represent the opinions of the author and Jump Capital. Jump Capital may have investments in or pursue investments in the security technology sectors discussed. References to specific approaches or technologies do not constitute investment recommendations.