Protecting Customer Information in Data Analysis Contests
Data mining and analytics are big business in the 21st-century economy, so much so that companies are offering large cash prizes in contests meant to improve their data-driven recommendations. In 2009, movie rental provider Netflix paid $1 million to a team of data experts who won a contest to improve its online movie recommendations. However, it scrapped a similar contest in 2010 to settle a lawsuit alleging violations of customer privacy laws.
Earlier this year, two companies announced plans to sponsor similar contests. According to a New York Times report, online retailer Overstock.com announced that it would award $1 million to the person or team that could best improve its product recommendation system. The California medical group Heritage Medical Network announced a $3 million award for the team that could best predict which patients would be admitted to hospitals in the next year.
To win, contestants must create predictive algorithms using anonymized personal data. However, using personal data, even when it has been presented as anonymous, carries risks.
Data consultants are employing various strategies to minimize both privacy violations and scrutiny from the Federal Trade Commission. The Overstock.com contest will be run by RichRelevance, a company that provides recommendation technology to online retailers. Speaking to the New York Times, chief scientist Darren Vengroff explained that limiting the number of contestants who receive customer data will help minimize risk. Most contestants will receive hypothetical data sets. Only in the final rounds of competition will real customer data (with identifying information removed) be used.
Heritage Medical has consulted Arvind Narayanan, a postdoctoral researcher at Stanford University and a scholar at the Center for Internet and Society. Mr. Narayanan recommended that companies conducting data research be honest about the information they seek and how it will be used. He explained, "Handling personal data on the Web, even when stripped of personally identifying information like names and credit card numbers, is a risk management game."
Both companies want to avoid the kind of lawsuit Netflix faced after inadvertently using information that could be traced back to consumers.
If you have questions about the implications of improperly using customers' personal data, an experienced attorney can advise you.