Here is a tab-delimited file of all of Comet Ping Pong's tweets (choose the raw option -- pastebin doesn't show the tabs otherwise)
http://pastebin.com/GFKjDuf7
Here are all the links I could extract from Comet Ping Pong's tweets: http://pastebin.com/TXXLvSWz
Here's a list of top mentions by the account (listed with frequency of occurrence): http://pastebin.com/gUmd5aud
Here's a list of the top tags by the account (listed with frequency of occurrence): http://pastebin.com/cZNywSGQ
Here's a list of "names"/phrases mentioned in the tweets (simply extracted by a regular expression looking for a sequence of capitalized words): http://pastebin.com/0cYnfNEc
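For anyone who wants to rebuild the mention, tag, and phrase lists from the tab-delimited file, here is a minimal sketch of one way to do it. The regular expressions and the `summarize` helper are assumptions for illustration, not the exact ones used to produce the pastebin lists. In particular, it assumes a "sequence of capitalized words" means two or more in a row.

```python
import re
from collections import Counter

# Assumed patterns; the originals used for the pastebin lists may differ.
MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")
# Two or more consecutive capitalized words, e.g. "New York Pizza".
PHRASE_RE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")

def summarize(tweets):
    """Count mentions, hashtags, and capitalized phrases across tweet texts."""
    mentions, tags, phrases = Counter(), Counter(), Counter()
    for text in tweets:
        # Lowercase handles and tags so "@Alice" and "@alice" count together.
        mentions.update(m.lower() for m in MENTION_RE.findall(text))
        tags.update(t.lower() for t in HASHTAG_RE.findall(text))
        phrases.update(PHRASE_RE.findall(text))
    return mentions, tags, phrases

if __name__ == "__main__":
    sample = [
        "Thanks @Alice for the show #live",
        "Great night with @alice and @bob #Live #pizza",
        "New York Pizza is here",
    ]
    mentions, tags, phrases = summarize(sample)
    print(mentions.most_common(), tags.most_common(), phrases.most_common())
```

Calling `most_common()` on each counter gives the "listed with frequency of occurrence" ordering. A pattern this simple will also pick up capitalized sentence openers that happen to be followed by another capitalized word, so the phrase list needs a manual pass.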
I had to do some post-processing on the text because I extracted the tweets manually (by scrolling until all of them were in view, then copying the text). I removed all emojis and non-standard characters since I was focused on the text.
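A rough sketch of that cleanup step, assuming "non-standard" means anything outside plain ASCII. The `strip_non_ascii` name is made up for this example and the real processing may have been different.

```python
import re

def strip_non_ascii(text):
    """Drop emojis and any other non-ASCII characters from one tweet."""
    # Encoding with errors="ignore" silently discards characters ASCII cannot represent.
    cleaned = text.encode("ascii", errors="ignore").decode("ascii")
    # Removing a character can leave doubled spaces behind; collapse them.
    # Only spaces are collapsed so tab delimiters are left alone.
    return re.sub(r" {2,}", " ", cleaned).strip()

if __name__ == "__main__":
    print(strip_non_ascii("Pizza \U0001F355 tonight!"))
```

Note that this also drops accented letters, so "café" becomes "caf". Anyone who wants to keep the emojis for analysis should skip this step entirely.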
Hopefully this data will be of help to others!
archons ago
Emojis are symbols they use to communicate certain things. They should definitely be recorded as well. Otherwise good job.
Jeremy20_9 ago
Yeah, I used the quick and dirty method of copying and processing the text instead of pulling the data in from the API, but hopefully this will offer enough data for new insights.
A deeper dive on the data is always possible.