
Search Accuracy Is the Most Important eDiscovery Feature: How to Ensure You're Getting It

March 15, 2018  |  4 min read


Ask any lawyer how important the accuracy of searches within an eDiscovery platform is to them. Nine times out of ten, the answer will be “It’s #1. It’s the most important.”

Of course it is! If your search is faulty, then your entire case could be at risk. Indeed, unreliable searches could hurt your entire career, exposing you to potential sanctions, spoliation, and shame. No sane person wants this.

When choosing eDiscovery software, accurate, reliable search should be your top priority. Here are three characteristics to look for in your platform.


1. Built-in Quality Control, Including Exception Reports

You can’t fix a problem if you don’t know it’s there. Yet with many discovery platforms, processing errors are swept “under the rug.” To see whether your information was processed improperly, and why, you have to seek out a special report. That report itself is often difficult to interpret, filled with computer jargon like “ERROR#1219 could not parse embedded object for extraction.” Worse, the files listed in these exception logs often can’t be found through search at all, because they are quite literally excluded from the search index.

If you aren’t aware that an exception report exists and you start performing searches, then you are bound to miss critical information. But even if you do know the exception report exists, what the hell are you supposed to do with it? Read every single row? These reports can have thousands of rows. No sane person would do that.
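To make the scale problem concrete, here is a minimal sketch of the triage you would otherwise do by hand: grouping an exception report by error code and listing the files that never reached the search index. The CSV layout, the column names, and the second error code are assumptions invented for illustration; only the “could not parse embedded object” message comes from the example above, and real platforms export their reports in their own formats.

```python
import csv
import io
from collections import Counter

# Hypothetical export: column names and rows are invented for illustration.
REPORT_CSV = """file,error_code,message
deck.pptx,1219,could not parse embedded object for extraction
budget.xlsx,1219,could not parse embedded object for extraction
scan.pdf,2040,no text layer found
"""

def summarize(report_text):
    """Count exceptions per error code and list the affected files."""
    rows = list(csv.DictReader(io.StringIO(report_text)))
    counts = Counter(row["error_code"] for row in rows)
    files = sorted(row["file"] for row in rows)
    return counts, files

counts, unsearchable = summarize(REPORT_CSV)
print(counts.most_common())  # error codes, most frequent first
print(unsearchable)          # files a keyword search would silently miss
```

Even with a summary like this, someone still has to open each failed file and decide what to do with it, which is why exceptions belong inside the review workflow rather than in a separate report.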

Reliable eDiscovery software should make identifying exceptions easy and transparent. In Logikcull, this is accomplished through QC tags, or “quality control tags.” QC tags alert you to things in your data that you might not otherwise notice. There are QC tags for documents that may be privileged, documents that have speaker’s notes or embedded files, and documents that may have errors. That means your exception report is in the platform itself, featured right in Logikcull’s QC filter. This gives users direct insight into their data, so they can work with the clearest possible understanding of their information.


2. Stop Words & Noise Words

Your discovery software shouldn’t ignore characters or even entire words. But in many platforms, that’s the default. These platforms skip over words like “and,” “or,” and “not” because they use these words for their Boolean searches. That means that if you’re looking for a phrase such as “peanut butter or jelly” you will not be able to find it without difficult customization work. You could find peanut butter. You could find jelly. But you won’t find the phrase you want, because the search engine ignores a term by default.

Some products also skip over what are called “noise words.” These are words that the product has decided just don’t matter much, words like then, about, each, never, etc. One of the leading eDiscovery technologies excludes single letters, punctuation marks, and 112 default words. If a character is not indexed, it simply doesn’t exist in the search engine’s mind. You cannot find it. This leads to absurd situations where discovery software can’t even find the phrase “e-discovery.”
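The effect of dropping words at index time can be sketched in a few lines. This is a toy index, not any vendor's engine: the stop-word list, the sample memos, and the function names are all hypothetical, and the sketch assumes the query itself is not filtered, so a phrase containing a dropped word can never match.

```python
import re

# Hypothetical stop-word list for illustration; real products ship their own defaults.
STOP_WORDS = frozenset({"and", "or", "not", "then", "about", "each", "never"})

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs, stop_words=frozenset()):
    """Map each document id to its token list, dropping any stop words."""
    return {
        doc_id: [t for t in tokenize(text) if t not in stop_words]
        for doc_id, text in docs.items()
    }

def phrase_search(index, phrase):
    """Return ids of documents whose indexed tokens contain the exact phrase."""
    wanted = tokenize(phrase)
    n = len(wanted)
    return [
        doc_id
        for doc_id, tokens in index.items()
        if any(tokens[i:i + n] == wanted for i in range(len(tokens) - n + 1))
    ]

docs = {
    "memo-1": "Order peanut butter or jelly for the break room.",
    "memo-2": "Order peanut butter. Skip the jelly.",
}

full_index = build_index(docs)
filtered_index = build_index(docs, STOP_WORDS)

print(phrase_search(full_index, "peanut butter or jelly"))
print(phrase_search(filtered_index, "peanut butter or jelly"))
```

With every word indexed, the phrase search returns memo-1. With stop words removed, “or” no longer exists in the index, so the same search returns nothing even though the memo contains the phrase verbatim.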

A reliable discovery platform should index all characters and allow you to search for whatever information you need, including all those ifs, ands, and buts. In Logikcull, that’s how it’s done. Logikcull’s Flex Search allows you to search for any word or character, so you can be confident that you’re finding the information you’re looking for. And because discovery shouldn’t require a PhD in information sciences, Logikcull’s search adapts to your search behaviors, recognizing if you’re using a specific search style and applying that behavior to provide you the best results.


3. Hidden Files

Yes, documents can hide from you. For example, a Microsoft Word document can contain embedded files, such as images, spreadsheets or even entire PowerPoint presentations. Even PDFs can contain hidden attachments. To ensure that your search is accurate, you need to be able to access these files. Not every platform allows for this, however. Some, for example, do not extract embedded files. Others handle them inconsistently. You may have to dig through a nasty exception report just to know which documents failed extraction and which were extracted successfully. And if the platform doesn’t extract and index the hidden files, you have no idea that they exist, and you put yourself at risk every time you run a search.

A reliable discovery platform should eliminate such risks, not create them. Logikcull extracts all of these documents and tags them with a QC Tag of “has embedded files” (see item 1, above) so that users can quickly identify the documents that may be hiding something.
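To see where files hide, note that a modern .docx file is a ZIP package, and embedded objects are typically stored as separate parts under word/embeddings/. The sketch below lists those parts using only Python's standard library. It is not how Logikcull or any other platform performs extraction; the helper names and the in-memory sample are hypothetical, and legacy .doc files and PDF attachments use different containers that this does not cover.

```python
import io
import zipfile

def list_embedded_files(docx_bytes):
    """Return the names of parts stored under word/embeddings/ in a .docx package."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        return sorted(
            name
            for name in package.namelist()
            if name.startswith("word/embeddings/") and not name.endswith("/")
        )

def make_sample_docx():
    """Build a tiny stand-in package in memory (not a valid Word file) for the demo."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", "<w:document/>")
        package.writestr("word/embeddings/Microsoft_Excel_Worksheet1.xlsx", b"placeholder")
    return buffer.getvalue()

sample = make_sample_docx()
embedded = list_embedded_files(sample)
print(embedded)
```

Text inside an embedded spreadsheet like this one only becomes searchable if the platform extracts the part and indexes it as its own document.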

 

If your discovery platform checks all three of these boxes, you can be confident in the accuracy and reliability of its search. If it doesn’t, you might be rolling the dice every time you go to review information.