Data from our ICWSM 2010 paper is available from the link below. Our datasets have been anonymized to protect the privacy of the users. We are releasing only information about the Twitter link structure, and we are unable to release any non-anonymized data.
This file contains a list of all user-to-user links that we crawled from Twitter, based on the snapshot of the Twitter network in September 2009. The file contains 1,963,263,821 directed social links.
Format: Gzipped ASCII. Each line contains two user identifiers, indicating that a link was observed from the first user to the second user (the first user follows the second user).
Data: Twitter follow links (10.73GB)
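Since the file holds close to two billion links, it is usually easier to stream it than to load it whole. The following is a minimal sketch of reading the format described above, assuming the two identifiers on each line are separated by whitespace (the delimiter is not stated here). The function names `iter_links` and `count_followers` are illustrative and not part of the release, and identifiers are kept as strings so that nothing is assumed about their form.

```python
import gzip
from collections import Counter
from typing import Iterator, Optional, Tuple


def iter_links(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (follower, followee) pairs from the gzipped link file.

    Assumption: the two identifiers on a line are whitespace-separated.
    """
    with gzip.open(path, "rt", encoding="ascii") as handle:
        for line in handle:
            fields = line.split()
            if len(fields) != 2:
                continue  # skip blank or malformed lines
            yield fields[0], fields[1]


def count_followers(path: str, limit: Optional[int] = None) -> Counter:
    """Count followers (in-degree) per user, optionally over the first `limit` links."""
    followers: Counter = Counter()
    for index, (_follower, followee) in enumerate(iter_links(path)):
        if limit is not None and index >= limit:
            break
        followers[followee] += 1
    return followers
```

Passing a small `limit` is a quick way to check the parsing on the first few links before committing to a pass over the full file.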
This file contains the number of new adopters per day for 7 different retweeting variations (RT, via, Retweeting, Retweet, HT, R/T, and the recycling symbol).
Format: xlsx. Each row corresponds to a retweeting variation: the first column gives the name of the variation, and the subsequent columns give the number of new adopters per day, starting from 2007-03-16 (the day when the first retweeting variation was used).
Data: Retweeting conventions time series (< 1MB)
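The column layout described above can be turned into dated series with a few lines of code. This is a sketch under stated assumptions: the sheet has no header row, each column after the first corresponds to one consecutive calendar day, and empty cells are skipped. The function name `rows_to_series` is illustrative and not part of the release.

```python
from datetime import date, timedelta
from typing import Dict, Iterable, Sequence

# First day covered by the time series, as stated in the format description.
FIRST_DAY = date(2007, 3, 16)


def rows_to_series(rows: Iterable[Sequence]) -> Dict[str, Dict[date, int]]:
    """Map each retweeting variation to a {day: new adopters} dictionary.

    Assumptions: no header row, and column k (counting from 1 after the
    name column) holds the count for FIRST_DAY + (k - 1) days.
    """
    series: Dict[str, Dict[date, int]] = {}
    for row in rows:
        if not row or row[0] is None:
            continue  # skip empty rows
        counts: Dict[date, int] = {}
        for offset, value in enumerate(row[1:]):
            if value is None:
                continue  # skip empty cells
            counts[FIRST_DAY + timedelta(days=offset)] = int(value)
        series[str(row[0])] = counts
    return series
```

The rows themselves can come from any spreadsheet reader. For instance, with the third-party openpyxl package, `load_workbook(path, read_only=True).active.iter_rows(values_only=True)` yields tuples of cell values that can be passed in directly.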
We are also sharing a set of spammer nodes within this graph, from a related project on link farming in Twitter.
Please use the following BibTeX entries if you would like to cite our work.
For Twitter topology:
@inproceedings{icwsm10cha,
author = {Meeyoung Cha and Hamed Haddadi and Fabricio Benevenuto and Krishna P. Gummadi},
title = {{Measuring User Influence in Twitter: The Million Follower Fallacy}},
booktitle = {Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM)},
month = {May},
year = {2010},
address = {Washington DC, USA}
}
For retweeting conventions:
@inproceedings{icwsm12kooti,
author = {Farshad Kooti and Haeryun Yang and Meeyoung Cha and Krishna P. Gummadi and Winter A. Mason},
title = {{The Emergence of Conventions in Online Social Networks}},
booktitle = {Proceedings of the 6th International AAAI Conference on Weblogs and Social Media (ICWSM)},
month = {June},
year = {2012},
address = {Dublin, Ireland}
}