Staying abreast of technology is simultaneously a challenging and rewarding part of my career. Now and then I like to dive deep into an area and get my hands dirty. Recently I’ve had the itch to explore the latest offerings from Microsoft: SQL 2016, PowerBI and .Net. I’ve also wanted to get a little more hands-on experience with public APIs. These are all topics I’m familiar with, but sitting down and writing code, designing a database, calling APIs and building reports is a little different than simply understanding how it all works.
With EMC World right around the corner, I figured I’d have a little bit of fun with the project, and track and report on the Twitter usage of my fellow EMC Elect during the event.
Down the road I’ll try to blog more about the details, but here is the gist of what’s behind the report. On Twitter, I created a list of all the EMC Elect Twitter accounts; you can subscribe to it here. Then, using .Net and Twitter’s public API, I programmed a routine that continually monitors that list, collecting Tweet information and storing it in SQL 2016. With PowerBI, I built a report that shows interesting tidbits from the Twitter usage collected. Microsoft recently released a new PowerBI feature that allows sharing reports on the internet without requiring authentication, which has enabled me to share the report. To keep the report up to date, I’m using the Personal Gateway for PowerBI, which allows me to connect my on-prem SQL 2016 database with the cloud-based reporting tool.
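The collection routine itself is written in .Net and isn’t shown here. Purely as an illustration of the storage step, the following is a minimal Python sketch, assuming the polling loop hands over tweets as dictionaries. It uses SQLite as a stand-in for SQL 2016, and the table layout, column names, and the `store_tweets` helper are all hypothetical. Because a routine that keeps polling the same list will see the same tweet more than once, rows are keyed by tweet id so that storing a tweet again refreshes its counts instead of duplicating it.

```python
import sqlite3


def store_tweets(conn, tweets):
    """Save tweets keyed by id so that repeated polling never duplicates rows.

    `tweets` is an iterable of dicts with the (hypothetical) keys used below.
    """
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tweets (
               id             INTEGER PRIMARY KEY,
               screen_name    TEXT    NOT NULL,
               created_at     TEXT    NOT NULL,
               text           TEXT    NOT NULL,
               retweet_count  INTEGER NOT NULL,
               favorite_count INTEGER NOT NULL
           )"""
    )
    # INSERT OR REPLACE overwrites an existing row that has the same primary
    # key, which keeps the retweet and favorite counts current on each pass.
    conn.executemany(
        """INSERT OR REPLACE INTO tweets
               (id, screen_name, created_at, text, retweet_count, favorite_count)
           VALUES
               (:id, :screen_name, :created_at, :text, :retweet_count, :favorite_count)""",
        tweets,
    )
    conn.commit()
```

Calling `store_tweets` twice with the same tweet id leaves a single row holding the most recent counts, which is the behavior a continually running collector needs.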
I chose this stack in part because all of its components are now available free of charge, a shift Microsoft has been making much like EMC’s Free and Frictionless movement. PowerBI offers a personal account (with limited data and options). Microsoft recently made SQL Developer Edition free, which is essentially all of the SQL Enterprise features, just for you as a single user. .Net development has free Visual Studio options, and with NuGet I can quickly pull free libraries from the web into my code. And of course, Twitter makes accessing the API free with an account.
I also hooked this up to Azure’s Machine Learning cloud, which also has a free tier, to perform sentiment analysis on the keywords. Given the volume of tweets, though, I’m not sure I’ll stay within the free tier, so I’m still working on that aspect.
So, here are the EMC Elect Twitter statistics for the week of EMC World, May 1st-5th. I’ve embedded the report on my blog below; scroll past it for some information on what the charts mean, as well as links to get directly to the report and to the data for the previous week to compare. If you are already a PowerBI user, have the mobile application, and would like to watch on your phone, drop me a note. Hopefully, Microsoft will allow sharing the mobile reports publicly down the road.
I’d also love comments on your own reading of what this means; as always, data needs a human to turn it into information. There are also countless ways to slice this data now that it’s all in a database, so if you have some burning questions or a different way you’d like to see the data, let me know, and I’ll try to build it (or at least run the query to see). I’m personally interested in what words will show up. Will we see the names of new releases in the word clouds? Will we see more tweets given the event, or fewer? Will the many European EMC Elect coming stateside for the event shift the time of day we see tweeting? Or will the fact that we’re all out late at night counterbalance that?
If the report above is empty, it’s hopefully because you’re reading this post before Sunday the 1st; otherwise, I broke something. If it’s not the week of EMC World yet, the data won’t have started populating, but you can look at the previous week’s report to see an example, as well as compare the two weeks.
As I mentioned above, this is available through the PowerBI mobile app, but only for PowerBI users, not for general use. Because the embedded PowerBI report isn’t a responsive design, the reports above are designed for desktop (or tablet) viewing and don’t work well on your phone.
Due to the current preview mode of the public web publishing, and the free Personal Gateway, the update frequency of PowerBI is limited to daily scheduled refreshes, with up to 8 refreshes per day. You can do live queries from on-prem or SQL Azure; it just isn’t free. So while the data collection from Twitter is live, the reports might be an hour or so behind.
I hope the report is fairly self-evident. A good dashboard shouldn’t require much explanation. But if I didn’t make it intuitive enough, here are some details on the elements.
- In the upper right is the timeframe of the report. All data in the report is within that timeframe, with the exception of the timelines that have a legend for “Last Week” and “This Week”: “This Week” is inside the timeframe, and “Last Week” is the previous week, shown for comparison.
- Also on the timeframe, everything is in Central time. PowerBI needs to enhance its time localization functions (by enhance I mean create, since there are none that I could find).
- Total EMC Elect
  - How many of the EMC Elect have valid Twitter accounts. I’m missing a couple at the time of publishing.
- Total Tweets
  - The sum total of original Tweets created by the EMC Elect (meaning I’m not counting when an EMC Elect retweets someone else’s original tweet)
- Total Retweets
  - How many times original Tweets from the EMC Elect were retweeted
- Total Favorites
  - How many times the original tweets from the EMC Elect have been ‘liked’
- EMC Elect Active
  - Of the total EMC Elect members, how many have tweeted at least once during the week
- EMC World Mentioned
  - Of the original tweets, how many mentioned EMC World (in any facet, hashtag or words)
- EMC Mentioned
  - Of the original tweets, how many mentioned EMC in any way
- Tweets by Weekday
  - Of all those original tweets, what day they occurred on, compared to the same day last week
- Tweets by Hour of Day
  - When all those tweets are coming out: all tweets to date summed by hour of day, then compared to last week
- Who Tweeted The Most
  - Ordered descending with a running sum, who has created the most original tweets
- Who was Retweeted the Most
  - Ordered descending with a running sum, whose tweets have been retweeted the most
- Whose Tweets Received the Most Likes
  - Ordered descending with a running sum, who has received the most ‘likes’
- What Words were Tweeted
  - This is your standard word cloud of all the words used in the original tweets from the list. The bigger the word, the more it’s used. I’ve removed common words but didn’t do any other filtering, so if it’s profane, it came from Twitter.
- What #Hashtags were Tweeted
  - Same as words, just the hashtags
- Most Mentioned
  - Ordered descending with a running sum, who the EMC Elect are mentioning in their tweets
- Everyone Mentioned
  - Again a word cloud, but of the other users mentioned in the tweets
- Where are EMC Elect Tweeting from
  - This is a little light on data because few people tag their location when tweeting. But Twitter does store it when you do, and I wanted to play with the geospatial features in SQL and PowerBI.
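For anyone curious how a few of these measures could be computed, here is a minimal Python sketch. It is not the actual report logic, which lives in SQL and PowerBI. It assumes each tweet is a dictionary with hypothetical keys (`screen_name`, `created_at`, `text`, `is_retweet`, `retweet_count`, `favorite_count`), that `created_at` is a timezone-aware datetime, and that “active” means having posted at least one original tweet. The fixed Central offset and the short stop-word list are also simplifications.

```python
import re
from collections import Counter
from datetime import datetime, timedelta, timezone
from itertools import accumulate

# Fixed UTC-5 offset (Central Daylight Time). A simplification that only holds
# while daylight saving time is in effect, as it is in early May.
CENTRAL_DAYLIGHT = timezone(timedelta(hours=-5), "CDT")

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "at", "is", "for", "on"}


def originals(tweets):
    """Keep only original tweets, dropping retweets of someone else's tweet."""
    return [t for t in tweets if not t["is_retweet"]]


def summary(tweets):
    """Headline numbers: tweets, retweets received, likes, active members."""
    own = originals(tweets)
    return {
        "total_tweets": len(own),
        "total_retweets": sum(t["retweet_count"] for t in own),
        "total_favorites": sum(t["favorite_count"] for t in own),
        # Assumption: a member is "active" after one original tweet.
        "active_members": len({t["screen_name"] for t in own}),
    }


def tweets_by_hour(tweets):
    """Count original tweets per hour of day (0-23) in Central time."""
    return Counter(
        t["created_at"].astimezone(CENTRAL_DAYLIGHT).hour for t in originals(tweets)
    )


def leaderboard(tweets):
    """Rank members by original tweet count, descending, with a running sum."""
    ranked = Counter(t["screen_name"] for t in originals(tweets)).most_common()
    running = accumulate(count for _, count in ranked)
    return [(name, count, total) for (name, count), total in zip(ranked, running)]


def word_counts(tweets):
    """Count words in original tweets, skipping stop words, hashtags, mentions and links."""
    counts = Counter()
    for t in originals(tweets):
        for token in t["text"].lower().split():
            if token.startswith(("#", "@", "http")):
                continue
            word = re.sub(r"[^\w']", "", token)  # strip surrounding punctuation
            if word and word not in STOP_WORDS:
                counts[word] += 1
    return counts
```

The hashtag and mention clouds would follow the same pattern as `word_counts`, keeping the tokens that start with `#` or `@` instead of skipping them.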