Three decades ago, the discovery process was fairly straightforward. An attorney conducting document discovery could simply confer with her client, determine the documents at issue, collect them from filing cabinets, and begin reviewing. The relative dearth of paper documents meant that even inefficient approaches to document review, such as linear, eyes-on review of each and every document, typically weren’t too burdensome. And they were, at least, reasonably intuitive.
Then the personal computer ushered in the digital age—and the massive proliferation in electronically stored information which, even as early as the 1970s, was making its way into litigation. (The Federal Rules of Civil Procedure were amended to reference electronic data in 1970, for example.)
Suddenly, discovery was much more complex and much more expensive. In the early days of eDiscovery, the legal industry responded to the growth of ESI by treating it just as it had treated paper documents before: printing out emails, redacting them with giant markers, applying Bates stamps with actual stamps, and employing armies of doc review attorneys.
Soon, the first eDiscovery software allowed for the review of electronic documents electronically. It was a revelation. But though paper was gone, many inefficiencies remained. That early software was slow, hard to use, and often extremely expensive.
A whole industry of eDiscovery vendors emerged at the same time, initially as glorified copy shops and then as eDiscovery specialists, handling everything from data extraction to OCRing to keyword searches, and charging hefty line-item fees for each step along the way. The cost of discovery had grown so much that, by 2015, one celebrated federal judge, U.S. Magistrate Judge John M. Facciola of the D.D.C. (Ret.), lamented that “The costs of discovery may, in the long run, drive an entire economic class out of the federal court for lack of means to engage.”
In recent years, technology has promised to help reduce those burdens, to cut through the mountains of data so that legal professionals can quickly find the information that matters most.
Predictive coding and technology-assisted review (TAR), for example, promise to surface the most relevant documents first through the application of machine learning. And though TAR adoption is limited, in part by the complexity and inflexibility of some TAR applications, it has long been hailed as a way to reduce the burdens of document review.
Except technology alone isn’t enough.
When the Cost of Technology Negates Its Benefits
When that technology is prohibitively expensive, the benefits it brings can be significantly undermined.
Take, for example, the recent decision in County of Cook v. Bank of America Corp., No. 14 Cv. 2280 (N.D. Ill. Oct. 22, 2019), involving litigation between the Illinois county and several of the nation’s largest lenders over allegedly discriminatory mortgage practices. The county sought to compel discovery from 24 additional custodians, after having initially proposed extending discovery to 785 custodians.
The bank defendants argued, as is to be expected, that additional discovery would be disproportional. Under Federal Rule of Civil Procedure 26(b)(1), proportionality in the scope of discovery is determined by reference to six factors:
- The importance of the issues at stake
- The amount in controversy
- The parties’ relative access to relevant information
- The parties’ resources
- The importance of discovery in resolving the issues, and
- Whether the burden or expense of the proposed discovery outweighs its likely benefit.
The county argued that the banks’ use of TAR during their review reduced the burden of extending discovery to additional custodians and handling the resultant increase in data volume.
The court was unpersuaded.
The technology available to the producing party, despite that technology’s promises, was simply too expensive and burdensome to render additional discovery proportional.
Indeed, according to the defendants, vendor charges for their TAR-powered review were projected to exceed $1.3 million, and the review had required 36 doc review attorneys working full time for three months just to get through their initial 400,000 documents.
From the court:
The Court disagrees that TAR eliminates Judge Rowland's expressed concerns about the burden of ESI discovery in this case. Despite using TAR to target likely responsive documents, Defendants have reviewed 400,000 ESI documents to date for the 38 Court-ordered custodians. According to Defendants, "[si]nce mid-July 2019, some 36 attorneys have been reviewing documents collected from the Court-ordered custodians full time." Moreover, the charges by Defendants' ESI vendor for document processing, review, and production are projected to exceed $1,300,000 and this figure does not include Defendants' own personnel costs in collecting the documents or outside counsel's costs and privilege review work.
For those keeping track, 400,000 documents split among 36 reviewers is approximately 11,000 documents per attorney. At a linear review rate of 50 documents an hour, that would require about 222 hours of review time per lawyer, or roughly five and a half 40-hour weeks. That is less than half of the three months it took the banks to go through the documents here, though we can assume other factors were also at play, such as the complexity of their TAR approach, the complexity of the data, and requirements for second- and third-pass review.
These numbers undermine any suggestion that Defendants' use of TAR to aid in their ESI production affects Judge Rowland's proportionality basis for denying the County's request for ESI from the custodians at issue here.
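The back-of-the-envelope review-time estimate above can be reproduced in a few lines. This is only a sketch: the 400,000 documents and 36 attorneys come from the court's order, while the 50-documents-an-hour review rate and the 40-hour week are assumptions used in this article, not figures from the case. The function name `review_estimate` is illustrative.

```python
# Rough review-time estimate. Document and reviewer counts are from the
# court's order; the review rate and work week are assumptions.
TOTAL_DOCS = 400_000     # documents reviewed to date
REVIEWERS = 36           # attorneys reviewing full time
DOCS_PER_HOUR = 50       # assumed linear review rate
HOURS_PER_WEEK = 40      # assumed full-time work week


def review_estimate(total_docs, reviewers, docs_per_hour, hours_per_week):
    """Return (documents, hours, weeks) of review per attorney."""
    docs_each = total_docs / reviewers
    hours_each = docs_each / docs_per_hour
    weeks_each = hours_each / hours_per_week
    return docs_each, hours_each, weeks_each


docs_each, hours_each, weeks_each = review_estimate(
    TOTAL_DOCS, REVIEWERS, DOCS_PER_HOUR, HOURS_PER_WEEK
)
print(f"{docs_each:,.0f} documents per attorney")  # 11,111
print(f"{hours_each:.0f} hours per attorney")      # 222
print(f"{weeks_each:.1f} weeks per attorney")      # 5.6
```

Changing the assumed review rate shifts the result quickly: at 25 documents an hour, the same workload would take roughly eleven weeks per attorney, which is much closer to the three months reported.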
There are, of course, alternative outcomes to technology-focused proportionality arguments. A few months ago, for example, Logikcull published a case study on Slack data in litigation. In that case study, New York-based litigator David Slarskey was able to overcome objections to the production of Slack data by showing that, in fact, using Slack data in eDiscovery was not burdensome when the right technology was available.
But the County of Cook decision is an important reminder that technology alone isn’t enough to solve the problem of eDiscovery, particularly when that technology costs more than a million dollars to use.
eDiscovery Pricing Models Trail Technological Improvements—Significantly
While technology has improved vastly over recent years, eDiscovery costs have not. We still see vendors charging for discovery services like it was 2005: $29 per GB to ingest and filter, $135 per GB to process data and “promote” it to a review database, $40 per GB to “prepare native files”—all tacked on to the same project and applied to the same data, driving the costs of even modestly sized cases into the five-figures-a-month range.
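To see how these line items stack, here is a minimal sketch that sums the three per-GB fees quoted above and applies them to a data volume. The 100 GB volume is a hypothetical example, not a figure from any particular matter, and the names `LINE_ITEMS` and `line_item_total` are illustrative.

```python
# Per-GB line-item fees quoted in the article, in US dollars.
LINE_ITEMS = {
    "ingest and filter": 29,
    "process and promote to review": 135,
    "prepare native files": 40,
}


def line_item_total(gigabytes, line_items=LINE_ITEMS):
    """Total fees when every line item is charged against the same data."""
    return gigabytes * sum(line_items.values())


per_gb = sum(LINE_ITEMS.values())
print(f"Stacked fees: ${per_gb} per GB")                    # $204 per GB
print(f"100 GB (hypothetical): ${line_item_total(100):,}")  # $20,400
```

Because each fee is applied to the same data, a vendor quoting three modest-looking line items is effectively charging $204 per GB before any hosting or review costs.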
The growth of cloud computing promises to reduce the cost associated with discovery dramatically. The commoditization of cloud resources, and the digital revolution it’s powering, are allowing for greater automation, the elimination of major cost centers, and growing control over data that will be transformative for the legal industry.
But many cloud-based discovery solutions haven’t caught up. For cloud-based discovery providers, monthly, per-GB hosting fees are still the norm. Such pricing is, first, unsustainable given the vast growth in data and, second, counter to the benefits afforded by technology in the first place—negating the value of being able to sift through vast amounts of data by charging excessive prices just for hosting that data.
The absurdity of the situation is thrown into stark relief when you compare $35 to $40 per-GB monthly hosting fees to the pennies that cloud providers like AWS charge for a gigabyte. A GB of storage on AWS, at its most expensive, is just 2.3 cents a month.
Compare that to the cost of most discovery solutions. Hosting just 100 GB of data, about the storage capacity of your average smartphone, for six months on a discovery platform charging $35 per GB per month would cost $21,000. For a terabyte matter lasting the same six months, costs rise to $210,000. Stretch that case out to a full year and you’re nearing half a million dollars, expended on software alone.
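The hosting arithmetic above is simple enough to check directly. The sketch below assumes a flat per-GB monthly rate with no volume discounts, treats a terabyte as 1,000 GB (which is what the figures above imply), and uses the 2.3-cent figure as raw cloud storage cost only, ignoring everything a discovery platform adds on top. The function name `hosting_cost` is illustrative.

```python
PLATFORM_RATE = 35.00   # USD per GB per month, discovery platform (from the article)
CLOUD_RATE = 0.023      # USD per GB per month, raw cloud storage (from the article)


def hosting_cost(gigabytes, rate_per_gb_month, months):
    """Flat per-GB hosting cost; assumes no tiering or volume discounts."""
    return gigabytes * rate_per_gb_month * months


# (label, gigabytes, months); 1 TB is treated as 1,000 GB.
scenarios = [
    ("100 GB, 6 months", 100, 6),
    ("1 TB, 6 months", 1_000, 6),
    ("1 TB, 12 months", 1_000, 12),
]

for label, gb, months in scenarios:
    platform = hosting_cost(gb, PLATFORM_RATE, months)
    cloud = hosting_cost(gb, CLOUD_RATE, months)
    print(f"{label}: platform ${platform:,.0f} vs. raw storage ${cloud:,.2f}")
```

On these assumptions, the 100 GB matter that costs $21,000 to host on the platform corresponds to under $14 of raw storage over the same six months. The comparison is not apples to apples, since a platform provides far more than storage, but it shows the scale of the markup on the hosting line item itself.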
That’s hardly sustainable, even for the most well-heeled litigants.
It’s that contradiction—between ever-improving technology, accelerating data growth, and pricing structures that don’t reflect the massive sea change the cloud has brought about—that led Logikcull to eliminate hosting fees altogether.
But until the industry as a whole reworks its approach to eDiscovery pricing, or buyers wise up to the absurdity of expensive line items and unjustified hosting fees, the cost of discovery will remain disproportionate to the technological advances that could reshape the legal industry, and those benefits will go largely unrealized for the vast majority of litigants.