Streaming Twitter Data to MongoDB in Real Time
Many developers and analytics applications pull data from social media sources such as Facebook, Twitter, blogs, and news channels like ReadIt. Most social media data is not free, but it is usually available to developers for proof-of-concept (POC) and development purposes. In this article I will explain how you can stream data from Twitter in real time. Note that Twitter offers only 1% of its actual activity stream free of charge.
To stream Twitter data in real time, you need a Twitter developer account, which can be provisioned from https://dev.twitter.com/. You can read more about the Twitter Streaming APIs there.
Once you have your Twitter developer account, you can use it to stream data to MongoDB or to text files. The Twitter Streaming API returns JSON documents, and MongoDB is a natural candidate for storing them because it is a document store that keeps data in a JSON-like format (BSON) internally.
There are various ways to stream this data in real time; the easiest way I’ve found is to use cURL. You can download it from http://curl.haxx.se/download.html; pick the build that matches your operating system. Once you have downloaded cURL, extract it to a directory and create a parameters file called “twitter_params.txt” there. The parameters file acts as a filter that selects the data of your choice from the Twitter feed; you can filter by #hashtags and keywords. The file is simply your search terms separated by commas. Here is a sample twitter_params.txt for your reference.
track=#usa, #gold, #stock, usa, America, stock market, gold, bank, #SYWCyberMonday, @ShopYourWay, economics
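curl's -d option sends the file content as-is, so characters such as # and spaces go out unencoded; percent-encoding them is the safer choice. Below is a minimal Python sketch that builds the same kind of track= line from a list of terms. The function name build_track_params and the shortened term list are illustrative assumptions, not part of any Twitter tooling.

```python
from urllib.parse import urlencode


def build_track_params(terms):
    """Return a form-encoded 'track=...' string for the given search terms."""
    # Strip stray whitespace around each term and drop empty entries.
    cleaned = [t.strip() for t in terms if t.strip()]
    # urlencode percent-encodes '#', '@' and ',' and turns spaces into '+'.
    return urlencode({"track": ",".join(cleaned)})


if __name__ == "__main__":
    # Hypothetical subset of the sample terms above.
    print(build_track_params(["#usa", "#gold", "stock market", "@ShopYourWay"]))
```

Redirecting the printed line into twitter_params.txt gives a parameters file that can be passed to curl in the same way as the hand-written one.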
Now create a batch file named “Twitter Stream Data Text.cmd” to start the stream and store it in a text file. Replace <twitterusername> and <twitterpassword> with your Twitter credentials.
curl -d @twitter_params.txt -k https://stream.twitter.com/1/statuses/filter.json -u <twitterusername>:<twitterpassword> >>twitter_stream_file.txt
You can verify the stream by running the batch file (“Twitter Stream Data Text.cmd”) from a command prompt. It reads your parameter phrases, filters the Twitter stream accordingly, and appends the data returned by Twitter to twitter_stream_file.txt.
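To check what was captured, each non-empty line of twitter_stream_file.txt can be parsed as one JSON document. The Python sketch below assumes one document per line, with blank lines in between (the stream sends them as keep-alives) and possibly a cut-off last line if the file is read while curl is still writing. The function names are hypothetical.

```python
import json


def parse_stream_lines(lines):
    """Yield one dict for every line that holds a complete JSON object."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank keep-alive line
        try:
            doc = json.loads(line)
        except json.JSONDecodeError:
            continue  # incomplete line, e.g. the file was cut off mid-write
        if isinstance(doc, dict):
            yield doc


def count_stream_documents(path):
    """Count the JSON documents stored in a captured stream file."""
    with open(path, encoding="utf-8") as fh:
        return sum(1 for _ in parse_stream_lines(fh))
```

Calling count_stream_documents("twitter_stream_file.txt") after the batch file has run for a while gives a quick sanity check that documents are actually arriving.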
Once you have verified this, you can stream the data into MongoDB with the “mongoimport” utility that ships with MongoDB. Make sure your MongoDB instance is up and running before using “mongoimport”. To stream data to MongoDB, create another batch file named “Twitter Stream Data MongoDB.cmd” as shown below, providing the database name and collection name as additional information. In the example below, “StreamDB” is the database name and “tweets” is the collection (a collection is MongoDB's rough equivalent of a table).
curl -d @twitter_params.txt -k https://stream.twitter.com/1/statuses/filter.json -u <twitterusername>:<twitterpassword> | E:\MongoDB2_2\mongoimport -d StreamDB -c tweets
Now run “Twitter Stream Data MongoDB.cmd” from a command prompt. It fetches data from Twitter based on your filter criteria (twitter_params.txt) and stores it in the “tweets” collection of “StreamDB”.
You can also stream this data to a cloud-hosted MongoDB instance by passing additional parameters (host name, port, user name and password) along with the database and collection, as shown below.
curl -d @twitter_params.txt -k https://stream.twitter.com/1/statuses/filter.json -u <twitterusername>:<twitterpassword> | mongoimport -h <host>:<port> -d StreamDB -c tweets -u <user> -p <password>
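The local and cloud variants differ only in the options passed to mongoimport. As a rough Python sketch, the argument list can be assembled from the same pieces; it assumes the short options used above (-h, -d, -c, -u, -p), and the function name and example host are hypothetical.

```python
def mongoimport_args(db, collection, host=None, port=None,
                     user=None, password=None, executable="mongoimport"):
    """Build the mongoimport argument list for a local or remote instance."""
    args = [executable]
    if host:
        # -h takes "host" or "host:port"
        args += ["-h", f"{host}:{port}" if port else host]
    args += ["-d", db, "-c", collection]
    if user:
        args += ["-u", user]
        if password:
            args += ["-p", password]
    return args


if __name__ == "__main__":
    # Local instance: only database and collection are needed.
    print(" ".join(mongoimport_args("StreamDB", "tweets")))
    # Remote instance: hypothetical host, port and credentials.
    print(" ".join(mongoimport_args("StreamDB", "tweets", host="example.host",
                                    port=27017, user="me", password="secret")))
```

The resulting list can be passed to a process launcher or joined into the command line that follows the pipe in the batch file.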
Please note that Twitter exposes only 1% of its actual stream data for free developer streaming. If you want 100% of the data, you have to go through Twitter's recommended data resellers.
Hope this helps.
In the next article I’ll explain the Twitter data structure and the MongoDB commands and queries for analysing this data with the Aggregation Framework. Stay tuned.
Happy Holidays…!!!
Sandip Shinde