The Fax Machine’s Last Stand: Opportunities for Retrieval and Summarization of Medical Records
Much of our investment in legal has focused on the corporate side, targeting the evolving CLO organization and growing compliance demands for the enterprise. We’ve sought companies that promise to unlock revenue or avoid material costs, not just increase efficiency.
Wherever we hunt, we look for similar dynamics – regulatory pressure, error-prone processes, and emerging technology that could drive revenue. But opportunity like that is hardly limited to enterprise legal tech; in fact, improving medical records could be a multi-billion-dollar opportunity for law firms, insurance companies, and many others.
We don’t invest directly in healthcare, so our interest is less in helping providers improve outcomes through better records; we are curious about opportunities for all the other players drawing on these files for insurance, legal claims, or even clinical research. There are challenges at every turn – accessing files, processing data in the records, updating files, responding to state and national requirements around privacy and response times – it’s a fascinating mess. Let’s discuss.
It’s hard to get them.
We are not the first to recognize the challenges with healthcare data. Many major tech players have touted a vision of a comprehensive personal health record [what happened, Google?]. However, the challenge is especially acute in the legal context. Think workers’ compensation, medical malpractice, family law involving medical records, social security/disability, employment discrimination, and more. We’re talking about tens of billions of dollars in awards across cases that require some access to medical records. These records need to be processed by law firms, insurance companies, TPAs, pharmaceutical companies, and others with stringent regulatory requirements regarding their acquisition, storage, the time it takes to process them, etc.
Most of you can guess that any PHI (Protected Health Information) that travels must be managed by HIPAA-compliant and likely SOC 2 Type 2 entities. This presents a significant hurdle and a moat for anyone building tech in the space. But the challenges go well beyond that. For example, anybody who has ever tried to access their own medical records knows how difficult it is. Now imagine how hard it is for a third party to access them.
Clients must provide lawyers with a complete list of their prior providers and generally come with a comprehensive medical history. Lawyers take that roadmap and need to fill in the gaps. For missing records, clients can individually submit release forms to each practice in favor of their counsel. That in itself can be a big headache – every provider can have a different consent protocol and a different mode of conveying the docs (fax is still cool), and providers are allowed to charge varying rates for physical or digital copies. Alternatively, lawyers can collect the records under a power of attorney or use a medical retrieval partner – one of the entities that have built provider networks over time to source documents.
Generally, lawyers use a mix of those. And nothing happens fast. Medical providers must respond within roughly 30 days; they can extend by another 30, and there are no consequences or enforcement if they miss the deadline.
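To make the timing concrete, here is a minimal sketch of how a firm might track outstanding record requests against the roughly 30-day window and the optional 30-day extension described above. Everything in it is an assumption for illustration: the `RecordRequest` class, its field names, and the exact day counts are hypothetical, and actual deadlines vary by jurisdiction and request type.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

# Assumed values taken from the rough figures in the text, not from any statute.
RESPONSE_WINDOW_DAYS = 30
EXTENSION_DAYS = 30


@dataclass
class RecordRequest:
    """One records request sent to one provider (hypothetical structure)."""
    provider: str
    sent_on: date
    extended: bool = False               # provider invoked the extra 30 days
    received_on: Optional[date] = None   # None while records are outstanding

    def due_date(self) -> date:
        days = RESPONSE_WINDOW_DAYS + (EXTENSION_DAYS if self.extended else 0)
        return self.sent_on + timedelta(days=days)

    def is_overdue(self, today: date) -> bool:
        # Only outstanding requests can be overdue.
        return self.received_on is None and today > self.due_date()


def overdue_requests(requests: List[RecordRequest], today: date) -> List[RecordRequest]:
    """Return outstanding requests past their due date, oldest deadline first."""
    late = [r for r in requests if r.is_overdue(today)]
    return sorted(late, key=lambda r: r.due_date())


if __name__ == "__main__":
    requests = [
        RecordRequest("Clinic A", date(2024, 1, 2)),
        RecordRequest("Hospital B", date(2024, 1, 2), extended=True),
        RecordRequest("Practice C", date(2024, 1, 2), received_on=date(2024, 1, 20)),
    ]
    for r in overdue_requests(requests, today=date(2024, 2, 15)):
        print(f"{r.provider}: was due {r.due_date().isoformat()}")
```

Even a tracker this simple shows the project-management burden: with no enforcement behind the deadline, the only lever is knowing which providers to chase and when.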
It’s hard to read them. Or understand them. But you have to do that fast.
Assuming all relevant documents are retrieved, law firms can end up with thousands of pages of unstructured data. While going provider by provider to pick up manila folders of handwritten notes is archaic, hiring reviewers to sit and read all these documents is even more daunting. As you might imagine, churn in the ranks of folks doing this is high, and the risk of error is high as well. The cost of outsourcing the reading, which is generally priced per page, is meaningful.
Time matters, too. Adjusters facing thousands of pages of notes, for example, generally work under a regulatory requirement for how fast an insurance claim must be addressed, i.e., the adjuster needs to respond within some state-mandated window. Lawyers have statutory deadlines to contend with as well. The resulting pressure can, of course, lead to errors and omissions that cost real money.
On the legal side, complete medical records are required even for the decision to take a case or to move it forward – many states require an expert opinion of negligence in medical malpractice cases before a case is filed. Even if there is a longer grace period to process the data, end clients are waiting – if they get frustrated and push for a quick result, they may drive suboptimal outcomes.
Different use cases require different approaches to records. Identifying a problematic causal relationship between a drug and adverse patient outcomes involves aggregating and analyzing thousands of medical records for specific timelines. Depending on which side of the issue you’re on, you may focus on highlighting correlative patterns or isolating conflated concerns. Triaging and adjusting insurance claims requires analyzing health data from the first notice of loss and continually incoming data as the claimant seeks care or their condition changes.
Where’s the market?
From an investment perspective, while we would love to see medical document retrieval solved, it’s fraught with entrenched challenges that are unlikely to be solved soon. EMRs have varying degrees of interoperability, and providers are not eager to make it easy for any third party to request and pull sensitive information. It would be a meaningful shift if a player could find a way to play intermediary or even automate some of the work of requesting docs and project-managing the acquisition of materials.
The challenge is that revenue across players in medical record retrieval is likely under $500M in the US. The problem is deeply rooted in systems that aren’t eager to change, and the juice may not be worth the squeeze.
However, medical summarization – support for the claims adjuster, attorney, or clinical trial researcher – represents a massive opportunity that AI makes much more accessible. Based on per-page rates, medical claims review outsourcing likely runs into the billions. Medical case management, the broader adjacent space, is around $5B.
Many solutions could exist here – everything from platforms that generate medical summaries faster and better than a paralegal to solutions for claims adjusters or independent medical reviewers to maintain a current and accurate picture of a claimant’s position and their own triage posture on the case. There are also tangential solutions that feed these data products, e.g., cleaning up relevant data to enable other internal processing or even doing a better job sourcing medical histories and statuses from patients themselves. Of course, a winning player might do a little bit of column A and a little bit of column B to gain steam – taking some of the headache out of doc retrieval with practices to feed their pipeline for summarization and ongoing analysis.
There is real lunacy to the way medical information is collected and managed – we fill out badly xeroxed forms wherever we go, a doctor appends a variety of their own scribbles, and those pieces of paper languish in dusty stacks until someone brave has a problem big enough to require digging them up. There are many ways in which AI underperforms humans, but our guess is that, over thousands of pages, the error rate of the AI is gonna be a lot better than the error rate of a tired paralegal. There is a real opportunity here and a real need – we are excited to see what model gets it right.
All that said, if you’re building in this space or know someone who is, I want to hear from you. Also, a big thank you to our founder friends for ensuring this article made sense.