Welcome to our Twitter project page. The data used in our ICWSM 2010 paper is available for use by the wider community. At Twitter's explicit request, we are sharing only
the anonymized topology of the Twitter social network. Please understand that we are not allowed to share any tweet information.
If you have any questions, please send us an email at twitter-contact (at) mpi-sws.org.
Data from our ICWSM 2010 paper is available from the link below. Our datasets have been anonymized to protect the privacy of the users.
We are releasing only information about the Twitter link structure, and we are unable to release any non-anonymized data.
This file contains a list of all user-to-user links that we crawled from Twitter, based on a snapshot of the Twitter network taken in September 2009. The file contains 1,963,263,821 directed social links.
Format: Gzipped ASCII. Each line contains two user identifiers, meaning that a link was observed from the first user to the second (the first user follows the second).
Data: Twitter follow links (10.73GB)
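The line format described above can be read as a stream without decompressing the whole file to disk. The following is a minimal sketch, not an official loader: the function names iter_links and follower_counts are hypothetical, the two identifiers on each line are assumed to be separated by whitespace, and identifiers are kept as strings because the page does not state their type.

```python
import gzip
from collections import Counter


def iter_links(path):
    """Yield (follower, followee) pairs from a gzipped ASCII link file.

    Assumption: the two identifiers on each line are whitespace-separated.
    """
    with gzip.open(path, "rt", encoding="ascii") as handle:
        for line in handle:
            fields = line.split()
            if len(fields) != 2:
                continue  # skip blank or malformed lines
            # The first user follows the second user.
            yield fields[0], fields[1]


def follower_counts(links):
    """Count, for each user, how many observed links point to that user."""
    counts = Counter()
    for _follower, followee in links:
        counts[followee] += 1
    return counts
```

Streaming matters here because the file holds close to two billion links. The counter still keeps one entry per followed user in memory, so a full pass over the real file may need a machine with ample RAM or an on-disk alternative.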
This file contains the number of new adopters per day for each of 7 different retweeting variations (RT, via, Retweeting, Retweet, HT, R/T, and the recycling symbol).
Format: xlsx. Each row corresponds to a retweeting variation, where the first column gives the name of the variation and the subsequent columns give the number of new adopters per day.
Data: Retweeting conventions time series (< 1MB)
We are also sharing a set of spammer nodes within this graph from a related project about link farming in Twitter.
If you are using our dataset, please use the following BibTeX entry to cite our work.
@inproceedings{icwsm10cha,
author = {Meeyoung Cha and Hamed Haddadi and Fabricio Benevenuto and Krishna P. Gummadi},
title = {{Measuring User Influence in Twitter: The Million Follower Fallacy}},
booktitle = {In Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM)},
month = {May},
year = {2010},
address = {Washington DC, USA}
}