Shouldering the Burden of Data Dumping

October 15, 2010
In an ongoing employment dispute, the plaintiffs asked for discovery of selected data from two laptops, but instead received millions of digital documents, which the defense described as "everything" possibly related to the matter. The plaintiffs' counsel, Charles Stillman, founder of Stillman, Friedman & Shechtman in New York, did what he could to manage this sudden influx of data, including soliciting estimates from discovery experts regarding what it would cost to comb through the data.
Stillman says that the estimates he reviewed were prohibitively high for an employment matter worth less than $10 million. But he was very lucky. His clients happened to be computer experts and, using their own software, concluded that opposing counsel had not actually produced the requested files within the mountain of documents delivered. The court agreed, and sanctioned the defendant SanDisk $150,000 in Harkabi v. SanDisk Corp. "We simply could not have afforded to pay for the kind of investigation my clients were able to do for themselves," says Stillman. "They were able to build their own tool to look at the data, something that would have cost too much if we'd had to pay for it."
Stillman doesn't believe the SanDisk sanctions were the result of a deliberate strategy by the defense to obscure facts. But he says that even in a multimillion-dollar case such as his, high volumes of data can quickly overwhelm any legal team. And dumping high volumes of data in discovery is a common problem in litigation, say experienced lawyers. "This is a real and growing problem, and only a few big corporate clients even have the resources to deal with this issue," says George Paul, partner at Lewis and Roca in Phoenix and co-chair of the American Bar Association's Electronic Discovery and Digital Evidence Committee. "Unfortunately, in a lot of cases, it's very easy to be overrun by data."
In discovery of paper records, document dumps occur as well, but dealing with them is a straightforward matter -- lots of people are needed to review lots of documents. But large volumes of electronic records can be especially problematic because electronic search technology is often not up to the task of accurately searching large, complex compilations of information (see, e.g., "TREC 2008 Stresses Human Element in EDD"). "Most people think of data dumps as a product of evidence large in size, but I think that's mislabeling the problem," says Bill Speros, an e-discovery consultant in Cleveland, Ohio. "The problem isn't simply large quantities but digital evidence that is impossible to search and return relevant documents."
 
KNOWING A DATA DUMP WHEN YOU SEE IT
After the courtroom confession, the document dump is one of Hollywood's favorite conventions for legal thrillers. Picture a team of young lawyers doggedly reading page after page of documents from a mountain of cardboard banker's boxes until they find an incriminating record. Unfortunately, that kind of problem might be welcome relief from the complications involved in e-discovery. "I long for the days of paper in boxes," says Paul. "Even if the other side tried to snow you in with paper, you could usually find a way to wade through it all. In the digital world that isn't so easy."
In litigation, overwhelming volumes of data can be the result of a deliberate legal strategy or simply a fact of modern life, where almost every action creates a digital record of some kind. Because almost every type of record is discoverable in litigation, even a simple request for records can balloon out of control. But there are some signs that data has been deliberately produced for litigation with the intent to overwhelm and confuse.
For example, data may be impossible to screen, let alone search, if it is disorganized: produced in a confusing array of formats or in obsolete formats, delivered without proper indexing, or containing file types different from those requested. Structured data, like that from databases, can be dumped in an unstructured, unusable state without the program it was created with or detailed information on how it was created and stored. And of course, high volumes of data relative to the issues in a case are a good indication data has been dumped unnecessarily or unfairly. "The larger the haystack, the harder it is to find the needle," says Sharon Nelson, president of Sensei Enterprises, an e-discovery firm in Fairfax, Va. "This is the danger of data dumps. And very few are sophisticated enough to find the needle in a data dump."
As with most problems in e-discovery, the best solution is to meet and confer early with opposing counsel and narrow the scope of preservation on both sides as much as possible. For example, teams need to identify key players and the sources of data -- like desktop PCs, laptops, smartphones, voice mail, and flash drives. However, if legal teams don't have a handle on the electronic evidence in a case before meeting with opposing counsel, they may lose any discovery battle. "The meet and confer happens way too late. That's why everyone is turning to 'early case assessment' which has become a buzz phrase," says Nelson. "Once litigation is even imminent, you'd better be thinking about these issues."
In order to head off disputes about data being produced for discovery, it is important to negotiate the mode of production. One important point many lawyers ignore is the format in which documents will be produced. Many prefer what is known as the native format, or the format in which a record or document was originally created, because it is an accurate representation of data. However, other teams will prefer searchable PDF or other formats that can be easier to manage.
If sides do not negotiate this point, contentious battles often arise when the requesting party belatedly discovers that it cannot open, read, or properly search the millions of digital documents produced. Sadly, many legal teams still print digital records to paper or convert files to TIFF, both of which can destroy or distort important document information. "It doesn't make sense to lose data or pay to scan paper back to digital formats, but that's what a lot of people wind up doing," says Speros.
Of course, cooperation may be difficult if some parties in litigation prefer an adversarial approach, including deliberately delivering excessive amounts of data. "More often than not, one side or both are on the warpath and have arrows drawn on a regular basis," says Nelson. "A regular problem is that EDD companies and lawyers both make more money if the volume of responsive data remains large. So when we see sloppy work or advice, is it due to incompetence or greed?"
Fortunately, more judges and magistrates have become tech-savvy enough to head off such tactics. Stillman says that the district judge in SanDisk demonstrated an understanding of e-discovery issues that is increasingly common within the judiciary. "He's thought about these issues before, and when there were some spirited arguments about e-discovery, he understood what was going on and resolved it in a thoughtful manner."


A PERNICIOUS PROBLEM
However, other lawyers are less optimistic that the problems of data production will be easily resolved in many matters. "It's almost impossible to control all variables when dealing with high volumes of data," says Paul. "If you complain to the court once you get dumped on, the lag in scheduling means you can waste three to four months and not get relief."
In addition, search technology has proven inadequate to the task. Nelson says that attorneys try to keep costs down by conducting searches themselves, but the results are generally bad. Even searches conducted by experts can be incomplete, returning only a fraction of the potentially relevant documents, or overbroad, returning documents that still demand human review to determine relevance and privilege, adds Nelson. "Attorney review is always the most expensive part in e-discovery. In the old days we were rarely dealing with anything more than gigabytes [of data]. Now we deal in terabytes on a regular basis."
However, unmanageable amounts of data may not always be blamed on opposing counsel. Unfortunately, it is often impossible to know what information may be contained in any given data set, and lawyers understandably don't want to limit their requests, missing potentially relevant information. Conversely, producing parties don't want to be accused of hiding evidence and, as a result, too much evidence is being requested and produced. "If you receive a large quantity it could be that the dump is large because your requests are big," says Speros. "It's always advisable to remember to be wary of what you ask for, because the gods may grant it."
 
John Krause is a freelance writer based in Wisconsin.
Special to Law.com October 11, 2010


