Code

Tracking EMC Elect Tweets @ EMC World

Staying abreast of technology is simultaneously a challenging and rewarding part of my career. Now and then I like to dive deep into an area to get my hands dirty. Recently I’ve had the itch to explore the latest offerings from Microsoft: SQL 2016, PowerBI and .Net. I’ve also wanted to get a little more hands-on experience with public APIs. These are all topics I’m familiar with, but sitting down and writing code, designing a database, calling APIs and building reports is a little different than simply understanding how it works.

With EMC World right around the corner, I figured I’d have a little bit of fun with the project, and track and report on the Twitter usage of my fellow EMC Elect during the event.

Down the road I’ll try to blog more about the details, but here is the gist of what’s behind the report. On Twitter, I created a list with all the EMC Elect Twitter accounts; you can subscribe to it here. Then, leveraging .Net and Twitter’s public API, I programmed a routine that continually monitors that list, collecting Tweet information and storing it in SQL 2016. With PowerBI, I built a report that shows interesting tidbits on the Twitter usage collected. Microsoft recently released a new feature in PowerBI that allows sharing reports with the internet without requiring authentication, which has enabled me to share the report. To keep the report up to date, I’m using the Personal Gateway for PowerBI, which allows me to connect my on-prem SQL 2016 database with the cloud-based reporting tool.

I chose this stack and its components in part because all of them are now available free of charge, a shift Microsoft has been making much like EMC’s Free and Frictionless movement. PowerBI allows a personal account (with limited data and options). Microsoft recently made SQL Developer Edition free, which is essentially all the SQL Enterprise features, just for you as a single user. For .Net there are free Visual Studio options, and with Nuget I can quickly pull free libraries into my code from the web. And of course, Twitter makes accessing the API free with an account.

I also hooked this up to Azure’s Machine Learning cloud, which also has a free tier, to perform sentiment analysis on the keywords. Given the volume of tweets, though, I’m not sure I’ll stay within the free tier, so I’m still working on that aspect.

So, here are the EMC Elect Twitter Statistics for the week of EMC World, May 1st-5th. I’ve embedded the report on my blog below; scroll past it for some information on what the charts mean, as well as links to get directly to the report and to the data for the previous week for comparison. If you are already a PowerBI user, have the mobile application and would like to watch on your phone, drop me a note… hopefully, Microsoft will allow sharing the mobile reports publicly down the road.

I’d also love comments on your personal deciphering of what this means; as always, data often needs a human to turn it into information. There are also countless ways to slice this data now that it’s all in a database. If you have some burning questions or a different way you’d like to see the data, let me know, and I’ll try to build it (or at least run the query to see). I’m personally interested in what words will show up. Will we see the names of new releases in the word clouds? Will we see more tweets given the event, or fewer? Will the many European EMC Elect coming stateside for the event shift the time of day we see tweeting? Or will the fact that we’re all out late at night counterbalance that?

 

Follow this link for the full page report.

If the report above is empty, it’s hopefully because you’re reading this post before Sunday the 1st; otherwise I broke something. If it’s not the week of EMC World yet, the data won’t have started populating, but you can look at the previous week’s report to see an example, as well as compare the two weeks.

Follow this link for the full page report. 

Like I mentioned above, this is available through the PowerBI mobile app, but only for PowerBI users, not general use. Because PowerBI is a responsive design, the reports above are designed for desktop (or tablet) viewing and don’t work well on your phone.

Due to the current preview mode of the public web publishing and the free Personal Gateway, the update frequency of PowerBI is limited to a daily schedule with up to 8 refreshes per day. You can do live queries from on-prem or SQL Azure; it just isn’t free. So while the data collection from Twitter is live, the reports might be an hour or so behind.

I hope the report is fairly self-evident. A good dashboard shouldn’t require much explanation. But if I didn’t make it intuitive enough, here are some details on the elements.

  • Timeframe
    • In the upper right is the timeframe of the report. All data in the report is within that timeframe, with the exception of the timelines that have a legend for “Last Week” and “This Week”. There, “This Week” is inside the timeframe, and “Last Week” is the previous week, shown for comparison.
    • Also on timeframe, everything is in Central time. PowerBI needs to enhance their time localization functions (by enhance I mean create, since there are none I could find).
  • Total EMC Elect
    • How many of the EMC Elect have valid Twitter accounts; I’m missing a couple at the time of publishing.
  • Total Tweets
    • The sum total of original Tweets created by the EMC Elect (meaning I’m not counting when an EMC Elect retweets someone else’s original tweet)
  • Total Retweets
    • How many times original Tweets from the EMC Elect were retweeted
  • Total Favorites
    • How many times the original tweets from the EMC Elect have been ‘liked’
  • EMC Elect Active
    • Of the total EMC Elect members, during the week how many have tweeted at least once
  • EMC World Mentioned
    • From the original tweets, how many mentioned EMC World (in any facet, hashtag or words)
  • EMC Mentioned
    • From the original tweets, how many tweets mentioned EMC in any way
  • Tweets by Weekday
    • Of all those original tweets, what day did they occur on, compared to the same day last week.
  • Tweets by Hour of Day
    • When are all those tweets coming out: all tweets to date summed by hour of day, then compared to last week.
  • Who Tweeted The Most
    • Ordered descending and a running sum, who has created the most original tweets
  • Who was Retweeted the Most
    • Ordered descending and a running sum, whose tweets have been retweeted the most
  • Who’s Tweets Received the Most Likes
    • Ordered descending and a running sum, who’s received the most ‘likes’
  • What Words were Tweeted
    • This is your standard word cloud of all the words used in the original tweets from the list. The bigger the word, the more it’s used. I’ve removed common words, but didn’t do any other filtering so if it’s profane; it came from Twitter.
  • What #Hashtags were Tweeted
    • Same as words, just the hashtags
  • Most Mentioned
    • Ordered descending and a running sum, who the EMC Elect are mentioning in their tweets
  • Everyone Mentioned
    • Again a word cloud, but of the other users mentioned in the tweets.
  • Where are EMC Elect Tweeting from
    • This is a little light on data because few people tag their location when tweeting. But Twitter does store it when you do, and I wanted to play with the geospatial features in SQL and PowerBI.
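
To make a few of the definitions above concrete, here is a minimal sketch in Python (not the .Net and SQL code behind the report) of how totals such as Total Tweets, Total Retweets, Tweets by Hour of Day and Who Tweeted The Most could be derived from stored tweets. The record layout (`user`, `created`, `is_retweet`, `retweets`, `likes`) and the sample rows are hypothetical, and counting only original tweets toward “active” members is an assumption.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample rows; the field names are illustrative assumptions,
# not the schema used by the actual report.
SAMPLE_TWEETS = [
    {"user": "alice", "created": datetime(2016, 5, 2, 9, 15), "is_retweet": False, "retweets": 3, "likes": 5},
    {"user": "alice", "created": datetime(2016, 5, 2, 14, 40), "is_retweet": False, "retweets": 0, "likes": 2},
    {"user": "bob", "created": datetime(2016, 5, 3, 9, 5), "is_retweet": False, "retweets": 1, "likes": 1},
    {"user": "bob", "created": datetime(2016, 5, 3, 10, 0), "is_retweet": True, "retweets": 0, "likes": 0},
    {"user": "carol", "created": datetime(2016, 5, 4, 22, 30), "is_retweet": True, "retweets": 0, "likes": 0},
]


def summarize(tweets):
    """Compute report-style totals from a list of tweet records."""
    # Only original tweets count; a member retweeting someone else is excluded.
    originals = [t for t in tweets if not t["is_retweet"]]
    return {
        "total_tweets": len(originals),
        "total_retweets": sum(t["retweets"] for t in originals),
        "total_favorites": sum(t["likes"] for t in originals),
        # Assumption: "active" means at least one original tweet in the period.
        "active_members": len({t["user"] for t in originals}),
        "by_hour": Counter(t["created"].hour for t in originals),
        # Ordered descending by number of original tweets.
        "top_tweeters": Counter(t["user"] for t in originals).most_common(),
    }


if __name__ == "__main__":
    for name, value in summarize(SAMPLE_TWEETS).items():
        print(name, value)
```

In the real report these aggregations would more naturally live in SQL queries or PowerBI measures; the sketch only shows the filtering and grouping logic.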

 

By | April 28th, 2016|Code, Home Lab, MyDW|2 Comments

ProcessControl

I was looking through some old code and found a little gem I built almost ten years ago called ProcessControl. I’m sure you’re familiar with the ability to adjust the affinity and priority of processes in Windows. If you’re not, give it a try in Task Manager.

I’ve used these controls quite a bit over the years for numerous purposes, such as troubleshooting performance by adjusting process priority, or getting older applications or games to work better by constraining them to a single core (this was common early in the multi-core days before thread management was widespread). I wrote the code in my spare time to test an idea I had for improving the performance of a COTS product called SolarWinds Orion, which is an excellent monitoring tool for networking (among other things).

Orion runs multiple components on a single server (web server, business logic, SNMP trap server, NetFlow collector and SNMP poller). At the time, however, the SNMP poller was single-threaded, which caused a conflict over resources. The other portions of the application, which spread their workload across all cores, would land on the same core that Windows had automatically assigned to the single-threaded daemon. This contention would slow down the SNMP poller.

After testing different affinity settings across all processes, I found that when the SNMP poller received a dedicated core, it showed a vast improvement in polls per second. However, manually adjusting these settings in Task Manager after every reboot or process change is not a viable solution. So in came a very simple piece of code that takes an XML configuration file with the desired affinity and priority of each process and adjusts them programmatically. Running this as a Windows service, set to start automatically after boot and to recheck the process settings periodically, made it ready for operations.

With this tool in hand, I found myself using it frequently to solve odd performance situations. I found additional benefits, such as decreasing the occurrence of processes moving between cores/sockets and the low-level cache rebuilding that comes with it. And in processor-constrained systems where upgrading hardware wasn’t possible, removing the offending process from core 0 and reducing its priority would solve stability issues (hardware-level interrupts and core operating system processes use core 0, so OS instability is often due to user processes contending with kernel processes).

My most common use of this code was on my personal development workstations. When SQL, IIS, MySql, PostGres, Apache and more are all installed on the same instance where I do my coding, the resource contention between all these applications slows down the GUI. ProcessControl can reduce the cores and priority of all those server daemons, which do not need robust resources just to test code or look at configuration details. This, in turn, leaves more resources for the applications with a human interface, speeding up the experience.

I’ve posted the code and working binaries on my GitHub page. If you’re tired of manually changing your process priority and affinity, or have been looking for a way to tune applications that are having conflicts due to thread management, feel free to use the tool and contribute to the code. It’s not highly complex code, but it gets the job done.

ProcessControl Quick Start

Installation

  1. Create a new folder called “ProcessControl” under “C:\Program Files (x86)\”
  2. Download the entire zipped repository from GitHub (or clone it if you want)
  3. Open the zip file and copy all the contents from “ProcessControl\bin\Release\” into “C:\Program Files (x86)\ProcessControl\”
  4. Navigate to “C:\Program Files (x86)\ProcessControl\” and double-click “ProcessControl.exe”
    1. This will not actually launch the service, but trying to execute the application will ensure you have the needed .Net framework
  5. Use the provided tool “srvinstw” to register ProcessControl.exe as a Windows service
    1. Run as default, so it has access to the processes.
    2. Choose Auto or Manual based on your needs.
  6. Adjust the XML config file “ProcessControlParams.xml” as desired, refer to configuration details below
  7. Start the service

Configuration

There are two configuration files:

ProcessControl.xml is the main application configuration. This has a setting for the location of ProcessControlParams.xml (if you want to host the service and associated files in another location outside of Program Files), as well as a setting for the interval at which to recheck process attributes; by default this is every 15 minutes.

The second XML file, ProcessControlParams.xml, is the meat of the application. The first configuration line, with the process name ‘Default’, will adjust the affinity of ALL processes. This is a baseline reset which allows you to clear off a core. The priority control does NOT work for Default. The next line, copied and pasted as many times as you need, adjusts the process of your choosing. You can adjust the priority and/or affinity of (almost) any process (there are a few system processes you cannot control). Here is a quick look at the XML:

<Process Name="Default" Priority="" Affinity=""/>
<Process Name="MyProcess" Priority="" Affinity=""/>

The options for Priority and Affinity?

Priority – these are the standard options:

  • RealTime
  • High
  • AboveNormal
  • Normal
  • BelowNormal
  • Idle
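
As a rough illustration of how entries like these could be read and checked against the priority names above, here is a minimal sketch in Python. It is not the service’s actual .Net code: the wrapping `<Processes>` root element, the sample values and the `load_rules` helper are assumptions made for the example, so check the real ProcessControlParams.xml for its true layout.

```python
import xml.etree.ElementTree as ET

# The priority names listed above.
VALID_PRIORITIES = {"RealTime", "High", "AboveNormal", "Normal", "BelowNormal", "Idle"}

# Hypothetical sample: the <Processes> root element and the values are
# assumptions, not copied from the real ProcessControlParams.xml.
SAMPLE_CONFIG = """
<Processes>
  <Process Name="Default" Priority="" Affinity="254"/>
  <Process Name="MyProcess" Priority="BelowNormal" Affinity="12"/>
</Processes>
"""


def load_rules(xml_text):
    """Return a list of (name, priority, affinity) tuples from the config text."""
    rules = []
    for node in ET.fromstring(xml_text.strip()).iter("Process"):
        name = node.get("Name", "")
        priority = node.get("Priority", "") or None
        affinity_text = node.get("Affinity", "")
        if priority is not None and priority not in VALID_PRIORITIES:
            raise ValueError(f"Unknown priority {priority!r} for process {name!r}")
        if name == "Default":
            # Priority control does not apply to the Default entry.
            priority = None
        # An empty Affinity attribute means "leave affinity unchanged".
        affinity = int(affinity_text) if affinity_text else None
        rules.append((name, priority, affinity))
    return rules


if __name__ == "__main__":
    for rule in load_rules(SAMPLE_CONFIG):
        print(rule)
```

Treating an empty attribute as “leave this setting alone” is a design choice of the sketch; it matches the blank values in the quick look above but is not confirmed behavior of the service.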

Affinity – this is a little trickier and is controlled by a number that represents all the different configuration options. Documented below are the common options I’ve used in environments of up to 8 cores. The options are also documented in the ProcessControlParams.xml file when you download it. If you want a combination that is not documented and don’t want to do the math, simply set a process to the desired state manually before starting the service; step one in launching is to log the status of all existing processes. On the off chance you discover more, please update the file on GitHub.

All 8 = 255
4-7 = 240
2,3 = 12
0,1 = 3
4,5 = 48
6,7 = 192
1,2 = 6
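
For anyone who does want to do the math: an affinity value of this kind is a bitmask in which core N (counting from 0) contributes 2 to the power N. The short Python sketch below shows the arithmetic in both directions; the helper names `affinity_mask` and `cores_in_mask` are hypothetical and are not part of ProcessControl.

```python
def affinity_mask(cores):
    """Combine zero-based core numbers into a single affinity bitmask."""
    mask = 0
    for core in cores:
        mask |= 1 << core  # core N sets bit N, worth 2**N
    return mask


def cores_in_mask(mask):
    """List the zero-based core numbers that are set in an affinity bitmask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    print(affinity_mask([4, 5, 6, 7]))  # 16 + 32 + 64 + 128 = 240
    print(affinity_mask(range(8)))      # all 8 cores = 255
    print(cores_in_mask(12))            # [2, 3]
```

The same arithmetic extends past 8 cores; for example, a mask covering cores 0 through 15 would be 65535.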

Logging

ProcessControl will log into the Windows Application event log. Successful process changes, as well as errors and full exception catch output will all be put into event entries. If you’re having problems with the service, it’s a good bet the information will be in the application log.

By | January 13th, 2016|Code, Tuning|0 Comments