Sharing 16,000 scientists' folder data with researchers — without telling them
July 2018
Dropbox gave Northwestern University researchers project-folder metadata covering some 16,000 scientists to study collaboration patterns. Users were never told their activity would be used for research, and academics warned that the 'anonymized' data could be used to re-identify individuals.
What happened
In July 2018 a Harvard Business Review article described how researchers at Northwestern University's Kellogg School had analyzed Dropbox usage to study how high-performing scientific teams collaborate. To enable the study, Dropbox had shared project-folder data — folder structures, sharing activity, and time spent on projects — covering research teams that used the service between May 2015 and May 2017. The analysis spanned roughly 1,000 university departments; reporting initially referenced data tied to around 400,000 users and 500,000 projects, with the published work focusing on about 16,000 scientists.
The controversy was about consent, not hacking. Users were never specifically asked whether their activity data could be used for academic research; the only 'consent' Dropbox could point to was users' agreement to its general privacy policy and terms of service. Information-ethics researcher Casey Fiesler and others argued that this fell short of the standards expected for research involving human subjects, and warned that even with names stripped, folder titles and file structures could be distinctive enough to re-identify specific people or projects — undercutting Dropbox's claim that the data was fully anonymized.
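The re-identification concern can be illustrated with a minimal sketch. Everything in it is hypothetical: the records are invented, and the field names (folders, collaborators, depth) are assumptions chosen for illustration, not a description of the data Dropbox shared or of the researchers' method. It only shows the general idea that once a combination of structural attributes is unique in a dataset, removing names does not stop that record from being singled out.

```python
from collections import Counter

# Hypothetical, made-up records. Names are already stripped; only
# structural metadata remains. The "id" values are arbitrary pseudonyms.
records = [
    {"id": "u1", "folders": 3, "collaborators": 2, "depth": 2},
    {"id": "u2", "folders": 3, "collaborators": 2, "depth": 2},
    {"id": "u3", "folders": 41, "collaborators": 17, "depth": 6},
    {"id": "u4", "folders": 5, "collaborators": 1, "depth": 3},
    {"id": "u5", "folders": 5, "collaborators": 1, "depth": 3},
]

# Attributes that are not names but may still single someone out in combination.
QUASI_IDENTIFIERS = ("folders", "collaborators", "depth")


def fingerprint(record):
    """Return the record's combination of quasi-identifier values."""
    return tuple(record[key] for key in QUASI_IDENTIFIERS)


def unique_records(rows):
    """Return the ids of records whose fingerprint no other record shares."""
    counts = Counter(fingerprint(r) for r in rows)
    return [r["id"] for r in rows if counts[fingerprint(r)] == 1]


def k_anonymity(rows):
    """Return the size of the smallest group sharing a fingerprint."""
    counts = Counter(fingerprint(r) for r in rows)
    return min(counts.values())


print(unique_records(records))  # ['u3']
print(k_anonymity(records))     # 1
```

In this toy data, the record with an unusually large and deep folder tree is the only one with its fingerprint, so anyone who already knows those facts about a particular lab could pick it out despite the missing name.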
Amid the criticism, the HBR piece was edited to emphasize that the data had been aggregated and anonymized before researchers received it, and Dropbox updated its privacy policy to clarify how customer data may be shared and anonymized for research.
Impact
The episode showed that Dropbox's visibility into users' folders extends to repurposing their behavioral data for outside research without meaningful, specific consent, which is exactly the kind of use the keys-held model makes possible. It became a frequently cited case study in research-ethics and data-anonymization debates and fueled doubts about whether 'anonymized' cloud metadata is ever truly anonymous. It also added a fresh, post-GDPR-era example to the long-running argument that Dropbox treats user files as data it is free to mine.