The Dark Side of the Use of Big Data by Lawyers

Most articles about the use of Big Data by lawyers describe its possibilities and how it might improve legal practice (here, here, here, here or also data & it law). That's good, since lawyers might really benefit from its use. However, the use of Big Data might also have many unexpected consequences.

This article gives an overview of the issues associated with the use of big data techniques by lawyers.


1.      Misinterpretations – How butter production influences the stock market

Finding a pattern in data does not always mean that you have found a breathtaking, previously unknown relationship that will make you rich the next day. The conclusions might be wrong. For example, it is possible to confuse signal and noise in the data set (see Nate Silver) or to be caught out by “black swan events”: unexpected personal or public events that influence the judge's decision (Nassim Taleb).

One of the greatest risks follows from the basic idea of big data analysis: trying to find undiscovered patterns in data. If you try hard enough to find something, you will very likely find it. What matters is the quality of the finding. A popular example, noted by Boyd and Crawford, proves the point: data mining techniques could show a strong but spurious correlation between changes in the S&P 500 stock index and butter production in Bangladesh. Accordingly, lawyers must be careful when interpreting data.
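The effect is easy to reproduce. The following is a minimal sketch, not taken from Boyd and Crawford: it compares one series of pure random noise against 1,000 other series of pure random noise and reports the strongest correlation it finds. The function names, the number of series, the series length and the seed are all assumptions chosen for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_spurious_correlation(n_candidates=1000, n_points=20, seed=0):
    """Search many unrelated noise series for the best match to a target.

    Every series is independent random noise, so any correlation
    found here is spurious by construction (hypothetical setup).
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        r = pearson(target, candidate)
        if abs(r) > abs(best):
            best = r
    return best


best_r = strongest_spurious_correlation()
print(f"strongest correlation among 1000 noise series: {best_r:+.2f}")
```

With series this short and this many candidates, the best match will typically look impressively strong even though no series has anything to do with any other. The more variables you search, the more convincing the best accidental match becomes.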


2.      Discrimination by data

A brand-new kind of discrimination in the “modern age” is discrimination by data. Let's say a firm must choose between two applicants with the same level of education and experience. Eventually, one of them is preferred because, according to a statistical analysis of the company's historical data, people with similar characteristics performed better. The applicants had no influence on the data that drove the decision.

A company certainly has the right to look for the best employees and to use its resources to do so. On the other hand, the unsuccessful applicant did nothing wrong, except that a certain percentage of his predecessors with similar characteristics performed worse in similar situations. How can a balance be found between these positions?


3.      Availability of data

Another limitation results from unequal access to data. What are the sources of suitable data for analysis?

The biggest law firms might use their own historical data on litigation or expenses. Various kinds of tools are also available (see their overview). They are powerful and bring useful insights into legal practice, as well as legal predictions. However, most of them are not free, so a lawyer needs significant resources.

Finally, even if a law firm has access to data, hiring a person capable of analyzing it significantly increases expenses.

Accordingly, big data analysis is a powerful and handy tool, but it might be available only to larger law firms or companies with greater resources. Some authors claim that these firms have an unfair advantage (see Schiltz).


4.      Self-fulfilling or self-defeating predictions

In their brilliant study Lawyering in the Shadow of Data, Drury Stevenson and Nicholas Wagoner give many examples of such predictions. When a court with a record of favoring plaintiffs thereby attracts plaintiffs with increasingly meritless cases, the numbers eventually revert to the mean. That's why the data are no longer capable of predicting future results.

Another example: when police use statistics to identify high-crime neighborhoods and shift resources to those neighborhoods, they eventually find that the arrest numbers, and hence the “crime rate”, increase or get worse.

That's why it is necessary to determine which predictions are self-fulfilling and which are self-defeating. According to the authors, the answer lies in the combination of two basic variables: information asymmetries (availability of data) and binary strategic incentives (the motivations of participants). Lower information asymmetry coupled with an incentive to beat the crowds will normally yield self-defeating predictions. On the other hand, widely available information combined with bandwagon benefits yields self-fulfilling prophecies. Accordingly, the right strategy depends on the specific nature of the situation.
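The policing example can be sketched as a small feedback loop. This is a toy model under loud assumptions, not the authors' method: two neighborhoods have exactly the same number of actual crimes, recorded arrests are simply proportional to the share of patrols present, and after each round some patrols move to whichever neighborhood recorded more arrests. The function name and all numbers are hypothetical.

```python
def simulate_patrol_feedback(rounds=10, true_crimes=100, shift=0.05):
    """Return recorded arrests (A, B) per round for two identical areas.

    Assumption: arrests = true_crimes * patrol share, so the statistics
    measure where the police are, not where the crime is.
    """
    share_a = 0.55  # neighborhood A starts with slightly more patrols
    history = []
    for _ in range(rounds):
        arrests_a = true_crimes * share_a
        arrests_b = true_crimes * (1 - share_a)
        history.append((round(arrests_a, 1), round(arrests_b, 1)))
        # Shift patrols toward the area with more recorded arrests.
        if arrests_a > arrests_b:
            share_a = min(1.0, share_a + shift)
        elif arrests_b > arrests_a:
            share_a = max(0.0, share_a - shift)
    return history


history = simulate_patrol_feedback()
for round_no, (a, b) in enumerate(history, start=1):
    print(f"round {round_no}: arrests A={a}, arrests B={b}")
```

In this model a small initial difference in patrols grows every round, and the recorded figures for neighborhood A keep rising although the underlying crime in both neighborhoods never changes. The prediction "A is the high-crime area" confirms itself.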


5.      Risks for legal tactics

Predictions based on data might present hazards for lawyers (or anyone) in several ways: people become overconfident when relying on forecasts, undervalue low-probability risks, or rely on the information and build whole systems around it. Accordingly, it is crucial to keep the probabilistic nature of any prediction in mind.
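A short calculation shows how quickly low-probability risks add up. The figures are hypothetical: assume a prediction tool is right in 90% of cases and that a lawyer relies on it in ten independent cases.

```python
def chance_of_at_least_one_miss(accuracy, n_cases):
    """Probability that at least one of n independent predictions is wrong."""
    return 1 - accuracy ** n_cases


# Hypothetical figures: 90% accuracy, ten independent cases.
p = chance_of_at_least_one_miss(0.9, 10)
print(f"{p:.0%}")  # prints "65%"
```

Even a forecast that is right nine times out of ten is therefore more likely than not to fail at least once across ten cases, assuming the cases are independent.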



This article gave a brief overview of the limitations and potential hazards of using big data in law firms. They are: the risk of misinterpretation, discrimination by data, availability of data, self-fulfilling or self-defeating prophecies, and risks for legal tactics.

The main idea behind these issues is the importance of being critical and analyzing the real meaning behind any data used. Moreover, a lawyer must anticipate whether the other party has access to the same or another type of data. Finally, a lawyer should accept that big data is about probabilities and should always be ready for low-probability scenarios.

What is your experience with Big Data and law? Have you run into any limitations yet? Feel free to contact us with any comments or critique.


Note: This article is intended as a summary of issues. Its purpose is not to provide legal advice or to create an attorney-client relationship between you and the author of this article (see Terms and Conditions).
