Big data study provides first insights into behavior of users of peer-to-peer file sharing

OCT 8, 2014 // Megan Fellman

Peer-to-peer file sharing of movies, television shows, music, books and other files over the Internet has grown rapidly worldwide as an alternative way for people to get the digital content they want, often illicitly. But while the users of Amazon, Netflix and other commercial providers are well studied, little is known about users of peer-to-peer (P2P) systems because data is lacking.

Now, armed with an unprecedented amount of data on users of BitTorrent, a popular file-sharing system, a Northwestern University research team has discovered two interesting behavior patterns. First, most BitTorrent users are content specialists, sharing music but not movies, for example. Second, users in countries with similar economies tend to download similar types of content: those living in poorer countries such as Lithuania and Spain, for example, download primarily large files, such as movies.

“Looking into this world of Internet traffic, we see a close interaction between computing systems and our everyday lives,” said Luís A. Nunes Amaral, a senior author of the study. “People in a given country display preferences for certain content—content that might not be readily available because of an authoritarian government or inferior communication infrastructure. This study can provide a great deal of insight into how things are working in a country.”

Amaral, a professor of chemical and biological engineering in the McCormick School of Engineering and Applied Science, and Fabián E. Bustamante, professor of electrical engineering and computer science, also at McCormick, co-led the interdisciplinary research team with colleagues from Universitat Rovira i Virgili in Spain.

Their study, published this week by the Proceedings of the National Academy of Sciences (PNAS), reports that BitTorrent users in countries with a small gross domestic product (GDP) per capita were more likely to share large files, such as high-definition movies, than users in countries with a large GDP per capita, where small files such as music were shared.

Also, more than 50 percent of users’ downloaded content fell into their top two downloaded content types, putting them in the content specialist, not generalist, category.
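
To make the specialist criterion concrete, here is a minimal Python sketch of one way to read it: a user counts as a specialist when the two most frequently downloaded content types account for more than half of that user's downloads. The function names, the per-user list of content-type labels and the sample data are hypothetical illustrations, not the researchers' actual code or data.

```python
from collections import Counter


def top_two_share(downloads):
    """Return the fraction of a user's downloads that fall into
    that user's two most frequent content types."""
    if not downloads:
        return 0.0
    counts = Counter(downloads)
    # most_common(2) returns up to two (type, count) pairs, highest counts first
    top_two = sum(count for _, count in counts.most_common(2))
    return top_two / len(downloads)


def is_specialist(downloads, threshold=0.5):
    """Assumed reading of the article's criterion: a specialist has more
    than `threshold` of their downloads in their top two content types."""
    return top_two_share(downloads) > threshold


# Hypothetical users, each described by the content type of every download
music_fan = ["music", "music", "music", "movies", "books", "tv"]
generalist = ["music", "movies", "books", "tv", "small"]

print(is_specialist(music_fan))   # top two types cover 4 of 6 downloads
print(is_specialist(generalist))  # top two types cover 2 of 5 downloads
```

Under this reading, the first hypothetical user is a specialist and the second is a generalist. The 0.5 threshold mirrors the "more than 50 percent" figure above and is left as a parameter.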

“Our study serves as a window on society as a whole,” Bustamante said. “It was very interesting to see the separations between users based purely on content. Individuals tend to interact only with others who are interested in the same content.”

One goal of decentralized peer-to-peer file sharing is to make communication on the Internet more efficient. (In certain parts of the world, BitTorrent users are responsible for up to one-third of the total Internet traffic.) The BitTorrent protocol enables users to share large data files even when they don’t have access to broadband connections, which often is the case in rural areas or less developed countries. BitTorrent breaks files into smaller pieces that can be shared quickly and easily from home computers over networks with lower bandwidth.

The researchers analyzed 10,000 anonymous BitTorrent users from around the world during a typical month using data reported by users of the BitTorrent plugin Ono. File content types shared by users included small files, music, TV shows, movies and books. (The type of content was easily determined based on file size.)

The Ono app, developed by Bustamante and his lab, allows users to improve the performance of BitTorrent while reducing the impact of their traffic on Internet network providers. Ono users can give informed consent for research use of their activity, providing a rich source of data on which new studies and projects can be built.

The National Science Foundation (grants CNS 0644062 and CNS 0917233) supported the research.

The title of the paper is “Impact of heterogeneity and socioeconomic factors on individual behavior in decentralized sharing ecosystems.”

See also the news pieces from Science and the McCormick School of Engineering and Applied Science.