The Twitter Project Page at MPI-SWS
Welcome to our Twitter project page. The data used in our ICWSM 2010 paper is available to the wider research community. At Twitter's explicit request, we share only
the anonymized topology of the Twitter social network. Please understand that we are not allowed to share any tweet information. If you are
interested in using the topology data, please email us at twitter-contact (at) mpi-sws.org to receive the download link.
Papers
- Jisun An, Meeyoung Cha, Krishna P. Gummadi, Jon Crowcroft,
Media landscape in Twitter: A World of New Conventions and Political Diversity,
Proc. International AAAI Conference on Weblogs and Social Media (ICWSM), July 2011
[Download Paper (436KB)]
- Meeyoung Cha, Hamed Haddadi, Fabricio Benevenuto, Krishna P. Gummadi,
Measuring User Influence in Twitter: The Million Follower Fallacy,
Proc. International AAAI Conference on Weblogs and Social Media (ICWSM), May 2010
[Download Paper (205KB)]
Data characteristics
- 54,981,152 user accounts
These accounts were in use in August 2009. We obtained the list of user IDs by checking every possible ID from 0 to 80 million. We scanned this ID range twice, two weeks apart. We did not look beyond 80 million because no user in the collected data had a link to a user whose ID was greater than that value.
- 1,963,263,821 social (follow) links
The 54 million users were connected to each other by nearly 2 billion follow links. This count is based on a snapshot of the Twitter network topology taken in August 2009. The follow link data does not record when each link was formed.
- 1,755,925,520 tweets
For each of the 54 million users, we gathered information about all tweets ever posted by the user since the launch of the Twitter service. The tweet data contains information about the time each tweet was posted.
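To put the three counts above in perspective, the following minimal Python sketch derives a few summary ratios from them and shows one way follower counts could be tallied from a topology file. The on-disk format of the shared data is not described on this page, so the `"<follower_id> <followee_id>"` line format and the function names `summary_ratios` and `follower_counts` are assumptions made purely for illustration.

```python
from collections import Counter
from typing import Dict, Iterable

# Figures quoted in the data characteristics list above.
NUM_USERS = 54_981_152
NUM_FOLLOW_LINKS = 1_963_263_821
NUM_TWEETS = 1_755_925_520
ID_SPACE_SCANNED = 80_000_000  # user IDs from 0 to 80 million were checked


def summary_ratios() -> Dict[str, float]:
    """Derive simple averages from the published counts."""
    return {
        # Average number of follow links per user account.
        "links_per_user": NUM_FOLLOW_LINKS / NUM_USERS,
        # Average number of tweets per user account.
        "tweets_per_user": NUM_TWEETS / NUM_USERS,
        # Fraction of the scanned ID range that held an account in use.
        "id_space_occupancy": NUM_USERS / ID_SPACE_SCANNED,
    }


def follower_counts(lines: Iterable[str]) -> Counter:
    """Count followers per user from an edge list.

    ASSUMPTION: each line reads "<follower_id> <followee_id>", separated
    by whitespace. The real file format may differ; adjust the parsing
    to match the data you receive.
    """
    counts: Counter = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        _follower, followee = parts
        counts[int(followee)] += 1
    return counts


if __name__ == "__main__":
    for name, value in summary_ratios().items():
        print(f"{name}: {value:.3f}")

    # Tiny made-up edge list, not taken from the dataset.
    sample = ["1 2", "3 2", "2 1"]
    print(follower_counts(sample).most_common())
```

Because the function accepts any iterable of lines, an open file handle can be passed directly, so a file with billions of edges is streamed rather than loaded into memory; the counter itself still needs memory proportional to the number of distinct users.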
Media Coverage
- ReadWriteWeb blog article by Sarah Perez on March 19th, 2010 (link).
The same article was picked up by the New York Times (link).
- Interview article by Brent Lang at TheWrap on April 12th, 2010 (link)
- A research blog article at Harvard Business Review by editor Scott Berinato on May 7th, 2010 (link)
Applications
Members
If you are using our dataset, please use the following BibTeX entry to cite our work.
@inproceedings{icwsm10cha,
author = {Meeyoung Cha and Hamed Haddadi and Fabricio Benevenuto and Krishna P. Gummadi},
title = {{Measuring User Influence in Twitter: The Million Follower Fallacy}},
booktitle = {In Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM)},
month = {May},
year = {2010},
address = {Washington DC, USA}
}