
IsilonSD – Part 5: Monitoring Activity

For my deployment of IsilonSD Edge, I want to keep this running in my lab; installing systems is often far easier than operating them (especially when troubleshooting issues). However, an idle system isn't really a good way to get exposure, so I need to put a little activity on this cluster, plus monitor it.

This post is part of a series covering the EMC Free and Frictionless software products.
Go to the first post for a table of contents.

This is just my lab, so here is my approach to doing more with IsilonSD than simply deploying it:

  • Deploy InsightIQ (EMC’s dedicated Isilon monitoring suite)
  • Move InsightIQ Data to IsilonSD Edge Cluster
  • Synchronize Software Repository
  • Mount Isilon01 as vSphere Datastore
  • Load Test

Deploy InsightIQ

InsightIQ is EMC's custom-built monitoring application for Isilon. Personally, this was one of the top reasons I selected Isilon years ago when evaluating NAS solutions. I'm a firm believer that the ability to monitor a solution should be a key deciding factor in product selection.

Without going too deep into InsightIQ itself (that's another blog), it provides the ability to monitor the performance of the Isilon, including the client perspective of that performance. You can drill into the latency of operations by IP address, which mattered when I first purchased an Isilon array because the incumbent solution was having numerous performance problems, and the lack of visibility into why was causing a severe customer satisfaction issue.

InsightIQ monitors the nodes, cluster communication, and even does file analysis to help administrators understand where their space is consumed and by what type of files.

Deploying InsightIQ is a typical OVA process. We've collected the information necessary in previous posts, so I'll be brief; in fact, you can probably wing it on this one if you want.

*Note: the video has no sound; it's provided so you can follow along with the steps.
  1. In the vSphere Web Client, deploy an OVA
  2. Provide the networking information and datastore for the InsightIQ appliance
  3. After the OVA deploy is complete, open the console to the VM, where you’ll need to enter the root password
  4. Navigate your browser to the IP address you entered, logging in as root, with the password you created in the console
  5. Add the Isilon cluster to InsightIQ and wait while it discovers all the nodes.


Move InsightIQ Data to IsilonSD Edge Cluster

You can imagine that collecting performance data and file statistics will consume quite a bit of storage. By default, InsightIQ stores all this data on the virtual machine, so I moved the InsightIQ datastore onto the Isilon cluster itself. While this is a little circular, InsightIQ will generate some load writing the monitoring data, which in turn gives it something to monitor; for our lab purposes this provides some activity.

Simply log into InsightIQ and, under Settings -> Datastore, change the location to NFS Mounted Datastore. By default Isilon exports /IFS; in production this should ALWAYS be changed, but for a lab we'll leverage the default export path.

[Screenshot: IsilonSD_InsightIQDSMove]

If you do this immediately after deploying InsightIQ, it will be very quick. If, however, you've been collecting data, you'll be presented with information about the progress of the migration; refreshing the browser will provide updates.

[Screenshot: IsilonSD_InsightIQDSMoveProgress]

Synchronize Software Repository

I have all my ISO files, keys, OVAs, and software installations on a physical NAS; this makes it very easy to mount via NFS to all my hosts, physical and nested, as a datastore for quickly installing software in my lab. Because of this, I use this repository daily. So, to ensure I'm actually utilizing IsilonSD and continuing to learn about it post-setup, I'm going to use IsilonSD to keep a copy of this software repository and mount all my nested ESXi hosts to it.

I still need my physical NAS for my physical hosts; if I lose the IsilonSD cluster, I don't want to lose all my software and be unable to reinstall. I want the physical NAS and IsilonSD to stay in sync too. My simple solution is to leverage robocopy to sync the two file systems; the added benefit is that I also get regular load on IsilonSD.

Delving into robocopy is a whole different post, but here is my incredibly simple batch routine. It mirrors my primary NAS software repository to the Isilon. This runs nightly now.

:: Mirror the primary NAS software repository to the Isilon export, multi-threaded,
:: with no retries or waits on locked files; the loop keeps the sync running
:START
robocopy \\nas\software\ \\isilon01\ifs\software\ /MIR /MT:64 /R:0 /W:0 /ZB
GOTO START
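
The GOTO loop keeps the mirror running continuously; if you'd rather run a single pass nightly via Task Scheduler, drop the loop and schedule the script instead. A minimal sketch (the script path is hypothetical):

:: Run the sync script once a day at 2 AM
schtasks /Create /SC DAILY /ST 02:00 /TN "IsilonSync" /TR "C:\Scripts\isilon_sync.bat"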

Upon first execution, I can see the traffic hitting IsilonSD in InsightIQ. Even though this is nested ESXi, with the virtual Isilon nodes sharing compute, network, memory, and disk, I see a fairly healthy external throughput rate, peaking around 100Mb/s.

[Screenshot: IsilonSD_InsightIQRobocopyThroughput]


When the copy process is complete, looking in the OneFS administrator console will show the data has been spread across the nodes (under HDD Used).

[Screenshot: IsilonSD_HDDLoaded]

Mount Isilon01 as vSphere Datastore

Generally speaking, I would not recommend Isilon for VMware storage. Isilon is built for file services, and its specialty is sequential-access workloads. For small workloads, if you already have an Isilon for file services, an Isilon datastore will work; but there are better solutions for vSphere datastores in my opinion.

For my uses in the lab though, with my software repository being replicated onto Isilon, mounting an Isilon NFS export as a datastore will not only allow me to access those ISO files but also open multiple concurrent connections to monitor.

*Note: the video has no sound; it's provided so you can follow along with the steps.
Mounting an NFS datastore from Isilon is exactly the same as from any other NFS NAS.

You MUST use the FQDN to allow SmartConnect to balance the connections.
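
For reference, the same mount from an ESXi shell looks something like this sketch, using this series' lab names (the datastore label is my own choice):

# Mount the Isilon /ifs export as an NFS datastore, using the SmartConnect FQDN
# so each host's connection gets balanced across the Isilon nodes
esxcli storage nfs add --host=Isilon01.lab.vernex.io --share=/ifs --volume-name=Isilon01_DS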


With the datastore mounted, if you go back into the OneFS administrator console, you can see the connections were spread across the nodes.

[Screenshot: IsilonSD_ConnectionSpread]

Now I have a purpose to regularly use my IsilonSD Edge cluster, keeping it around for upgrade testing, referencing while talking to others, etc. Again, with the EMC Free and Frictionless license, I'm not going to run out of time; I can keep using this.

Load Test

Even though I have an ongoing use for IsilonSD, I want it to do a little more than just serve as a software share, to ensure it's really working well. So I'll use IOMeter to put a little load on it.
I'm running IsilonSD Edge on 4 nested ESXi virtual machines, which in turn are all running on one physical host. So IsilonSD is sharing compute, memory, disk, and network across the 4 IsilonSD nodes (plus I have dozens of other servers running on this host). Needless to say, this is not going to handle a high amount of load, nor provide the lowest latency. So while I'm going to use IOMeter to put some load on my new IsilonSD Edge cluster, and typically I would record all the details of a performance test, this time I'm not; especially given I'm generating load from virtual machines on the same host.

Given Isilon runs on x86 servers, it would be incredibly interesting to see a scientific comparison between a physical Isilon and IsilonSD Edge on like-for-like hardware. In my personal experience with virtualization, the overhead is negligible, but I have to wonder what difference InfiniBand makes.

In this case, my point of load testing is not to ascertain latency or IOPS, but merely to put the storage device under sustained stress to ensure it's stable. So I created a little load, peaking around 80Mbps and 150 IOPS, but running for about 17 hours (overnight).

Below are some excerpts from InsightIQ; happily, the next morning the cluster was running fine, even given the load. During the test, the latency fluctuated widely (as you'd expect, due to the level of contention my nested environment creates). From an end-user perspective, it was still usable.

[Screenshots: IsilonSD_LoadTest1 through IsilonSD_LoadTest4]

In my next post I’m going to wrap this up and share my thoughts on IsilonSD Edge.

By | April 1st, 2016|EMC, Home Lab, VMWare|1 Comment

IsilonSD – Part 4: Adding & Removing Nodes

One of the core functions of the Isilon product is scaling. In its physical incarnation you can scale up to 144 nodes with over 50 petabytes in a single namespace. Those limits are because of hardware; as drives and switches get bigger, so can Isilon scale bigger. Even so, you can still start with only three nodes. When nodes are added to the cluster, storage and performance increase; existing data is re-balanced across all the nodes after the addition. Likewise, you can remove a node, proactively migrating the data from the departing node without sacrificing data protection; an excellent way to lifecycle-replace your hardware. This tenet of Isilon, coupled with non-disruptive software upgrades, means there is no pre-set lifespan to an Isilon cluster. With SmartPools' ability to tier storage by node type, you can leverage older nodes for less frequently accessed data, maximizing your investment.

IsilonSD Edge has that same ability, though slightly muted given you’re limited to six nodes and 36TB (for now hopefully). I wanted to walk through the exercise, to see how node changes are accomplished in the virtual version, which is different from the physical version.

This post is part of a series covering the EMC Free and Frictionless software products.
Go to the first post for a table of contents.


Adding a Node

Adding a node to the IsilonSD Edge cluster is very easy, as long as you have an ESX host ready that meets all the criteria. If you recall from building our test platform, we built a fourth node for this very purpose.

*Note: the video has no sound; it's provided so you can follow along with the steps.
  1. In the vSphere Web Client, return to the Manage IsilonSD Cluster tab
  2. Select the cluster (in our case, Isilon01)
  3. Switch to the Nodes tab
  4. Click the + button
  5. The Management Server will again search for candidates; if any are found, it will allow you to select them.
  6. Again, select the disks and their roles, and then proceed; all the cluster and networking information already exists.


Just like creating the cluster, the OVA will be deployed, datastores created (if you used raw disks), and the IsilonSD node brought online. This time the node will be added into the cluster, which starts a rebalance effort to re-stripe the data across all the nodes, including the new one.

Keep in mind, IsilonSD Edge can scale up to six nodes, so if you start with three you can double your space.

Removing a Node

Removing a node is just as straightforward as adding one, as long as you have four or more nodes. This action can take a very long time, depending on how much data you have, because all the data must be re-striped before the node can be removed from the cluster.

*Note: the video has no sound; it's provided so you can follow along with the steps.
  1. In the vSphere Web Client, return to the Manage IsilonSD Cluster tab
  2. Select the cluster (in our case, Isilon01)
  3. Switch to the Nodes tab
  4. Select the node to evict (in our case, node 4)
  5. Click the – (minus) button.
  6. Double check the node and select Yes
  7. Wait until the node Status turns to StopFail


During the smartfail operation, should you log onto the IsilonSD Edge administrator GUI, you'll notice in the Cluster Overview that the node you are removing has a warning light next to it. This is also a good summary screen to gauge the progress of the smartfail, by comparing the % column of the node you're evicting to the other nodes. In the picture below, the node we chose to remove is now <1% used, while the other 3 nodes are at 4% or 5%, meaning we're almost there.

[Screenshot: IsilonSD_RemoveNodeClusterOverview]


Drilling into that node is the best way to understand why it has a warning; there you will see the message that the node is being smartfailed.

[Screenshot: IsilonSD_RemoveNodeSmartFailMessage]

When the smartfail is complete, you still have some cleanup activities.

*Note: the video has no sound; it's provided so you can follow along with the steps.
  1. In the vSphere Web Client, return to the Manage IsilonSD Cluster tab
  2. Select the cluster (in our case, Isilon01)
  3. Switch to the Nodes tab
  4. The node you previously set to evict should show a red Status
  5. Select the node, then click on the trash icon.
  6. This will delete the virtual machine and its associated VMDKs


If you provided IsilonSD unformatted disks, the datastores the wizard created will still exist, and you might want to clean them up. If you want to re-add the node, you'll need to wait a while, or restart the vCenter Inventory Service, as it takes a bit to update.

By | March 31st, 2016|EMC, Home Lab, Storage, VMWare|1 Comment

IsilonSD – Part 3: Deploy a Cluster (Successfully)

With proper planning and setup of the prerequisites (see Part 2), the actual deployment of the IsilonSD Edge cluster is fairly straightforward. If you experience issues during this section, it's very likely because you don't have the proper configuration (see Part 1), so revisit the previous steps. That said, let's dive in and make a cluster.

This post is part of a series covering the EMC Free and Frictionless software products.
Go to the first post for a table of contents.

At a high level, you're going to do a few things:

  1. Deploy the IsilonSD Edge Management Server
  2. Setup IsilonSD Edge Management Server Password
  3. Complete IsilonSD Edge Management Server Boot-up
  4. Configure Management Server vSphere Link & Upload Isilon Node Template
  5. Open the IsilonSD Edge vSphere Web Client Plug-in
  6. Deploy the IsilonSD Edge Cluster

Here’s the detail.

Deploy the IsilonSD Edge Management Server

*Note: the video has no sound; it's provided so you can follow along with the steps.
This is your standard OVA deployment; as long as you're using the "EMC_IsilonSD_Edge_MS_x.x.x.ova" file from the download and providing an IP address accessible to vCenter, you can deploy this just about anywhere.

Follow along in the video on the left if you’re not familiar with the OVA process.

Once the OVA deployment launches, find the deployment task in the vSphere task pane and keep an eye on the progress.

Setup IsilonSD Edge Management Server password

[Screenshot: IsilonSD_ManagementBootPasswordChange]

Once the OVA deployment is complete and the virtual machine is booting up, you'll need to open the console and watch the initialization process. Generally, I recommend this with any OVA deployment, as you'll see if any errors occur during the first-boot configuration. For the IsilonSD Edge Management Appliance it's required, as you'll need to set the administrator password here.

Complete IsilonSD Edge Management Server Boot-up

[Screenshot: IsilonSD_ManagementConsoleBlueScreen]

After entering your password, the server will continue its first-boot process and configuration. When you reach this screen (what I call the blue screen of start), you're ready to proceed. Open a browser and navigate to the URL provided on the blue screen next to IsilonSD Management Server.


Configure Management Server vSphere Link & Upload Isilon Node Template

When you navigate to the URL provided by the blue screen, after accepting the unauthorized certificate, you'll be prompted for logon credentials. This is NOT the password you provided during boot-up. I failed to read the documentation and assumed it was, resulting in much frustration.

Logon using:
username: admin
password: sunshine

After a successful logon, and accepting the EULA, you have just a couple of steps, which you can follow along in the video on the right:

  1. Adjust the admin password
  2. Register your vCenter
  3. Upload the Isilon Virtual Node template
    1. This is "EMC_IsilonSD_Edge_x.x.x.ova" in your download

*Note: the video has no sound; it's provided so you can follow along with the steps.


Open the IsilonSD Edge vSphere Web Client Plug-in

Wait for the OVA template to upload; this may take up to ten minutes depending on your environment. Once complete, you'll be ready to move on to actually creating the IsilonSD cluster through the vSphere Web Client plug-in that the Management Server installed when you registered vCenter. Ensure you close out all the browser windows and open a new session to your vSphere Web Client.

[Screenshot: IsilonSD_vCenterDatacenter]

Select the datacenter where you deployed the management server (not the cluster; again, where I lost some time).


[Screenshot: IsilonSD_ManageTab]

In the right-hand pane of vSphere, under the Manage tab, you should now see two new sub-tabs: Create IsilonSD Cluster and Manage IsilonSD Cluster.

[Screenshot: IsilonSD_vCenterIsilonTabs]


Deploy the IsilonSD Edge Cluster

*Note: the video has no sound; it's provided so you can follow along with the steps.

Follow along in the video above:

  1. Check the box next to your license
  2. Adjust your node resources
    1. For my deployment, I started with 3 nodes, adjusting the Cluster Capacity from the default 2 TB down to the minimum 1152 GB (64 GB per data drive * 6 data drives per node * 3 nodes)
  3. Clicking Next on the Requirements tab will search the virtual datacenter in your vCenter for ESX hosts that can satisfy the requirements you provided, including having independent drives that meet the capacity requirement
    1. Should the process fail to find the necessary hosts, you'll see a message like this. Don't get discouraged; look over the requirements again to ensure everything is in order, and try restarting the Inventory Service too.
    2. [Screenshot: IsilonSD_NoQualifiedHosts]
  4. When the search for hosts is successful, you'll see a list of hosts available to select, such as:
    1. [Screenshot: IsilonSD_HostSelection]
  5. Next, select all the hosts you wish to add to the cluster (if you prepared more than 3, consider selecting 3 now; in the next post we'll walk through adding an additional node).
  6. For each host, you’ll need to select the disks and their associated role (Data Disk, Journal, Boot Disk or Journal & Boot Disk).
    1. Remember, you need at LEAST 6 data disks; you won't get this far if you don't have them, but you won't get farther if you don't select them.
    2. In our scenario, we select 6x 68GB data disks, and a final 28GB disk for Journal & Boot Disk
    3. You’ll also need to select the External Network Port Group and Internal Network Port Group
    4. [Screenshot: IsilonSD_HostDriveSelection]
  7. After setting up all hosts with the exact same configuration, you'll move into the Cluster Identity
    1. [Screenshot: IsilonSD_ClusterIdentity]
    2. Cluster Name (this is used in the management interface to name the cluster)
    3. Root Password
    4. Admin Password
    5. Encoding (I’d leave this alone)
    6. Timezone
    7. ESRS Information (only populate this if you have a production license)
  8. Next will be your network settings.
    1. [Screenshot: IsilonSD_ClusterNetworking]
    2. External Network
    3. Internal Network
    4. SmartConnect
  9. You have a final screen to verify all your settings; look them over (the full deployment will take a while) and click Next.

At this point, patience is key; do not interrupt the process. An OVA will be deployed for every node, then all of those unformatted disks will be turned into datastores, then VMDK files will be put on each datastore; finally, all the nodes will boot and configure themselves. If everything goes as planned, your reward will look like this:

[Screenshot: IsilonSD_ClusterCreationSuccess]

To verify everything, point your browser at your SmartConnect address, in our case https://Isilon01.lab.vernex.io:8080. If you get a OneFS logon prompt, you should be in business!

[Screenshot: IsilonSD_OneFSLogonPrompt]


You should also be able to navigate in Windows to your SmartConnect address (recall ours is \\Isilon01.lab.vernex.io\) and see the IFS share. This is the initial administrator share that you'd disable in a production environment. Likewise, in *nix you can mount //Isilon01.lab.vernex.io:/IFS over NFS.
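
For example, a quick sanity check from a Linux client might look like this sketch (the mount point is arbitrary):

# Mount the Isilon root export over NFS via the SmartConnect name
mkdir -p /mnt/isilon
mount -t nfs Isilon01.lab.vernex.io:/ifs /mnt/isilon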


By | March 30th, 2016|EMC, Home Lab, Storage, VMWare|1 Comment

IsilonSD – Part 2: Test Platform

This post is part of a series covering the EMC Free and Frictionless software products.
Go to the first post for a table of contents.

So… last post I figured out IsilonSD Edge needs (at least) three separate ESX hosts, each with (at least) 7 independent, local disks. So I cannot deploy this on the nested ESXi VSAN I made, but I really want to get IsilonSD up and running.

Since my previous attempt to wing it didn't go so well, I'm also going to do a little more planning. I need to create ESX hosts that meet the criteria, plus plan out things like the cluster name, network settings, IP addresses, and DNS zones.

For the ESX hosts, my solution is to run nested ESXi (a virtual machine running ESX on top of a physical machine running ESX). This allows me to provide the independent disks, as well as multiple ESX hosts, without all the hardware. It will also help facilitate the networking needs, through virtual switches providing the internal and external networks.

To build this test platform, we’ll cover 4 main areas:

  • ESX Hosts
  • External Network
  • Internal Network
  • SmartConnect

ESX Hosts

For testing IsilonSD Edge, I'm going to make four virtual machines and configure them as ESX hosts. Each of these will need four vNICs (two for ESX purposes, two for IsilonSD) and nine hard drives (2 for ESX again, and 7 for IsilonSD). I'm hosting all the hard drives in a single datastore; it happens to be SSD. For physical networking, my host only has a single network card connected, so I've leveraged virtual switches without a network card to simulate a private management network.

A snapshot of my single VM configuration is below:

[Screenshot: IsilonSD_NestedESXDiagram1]

With the first virtual machine created, I simply clone it three times, so I have four exact replicas. Why four? It will allow a three-node cluster to start; then I can test adding (and removing) a node without data loss, the same as we would with a physical Isilon.

Note the 'guest OS' for the virtual machines is VMware ESXi 6.x. This is a nice feature of vSphere to help you keep track of your nested ESXi VMs. Keep in mind, though, that nesting vSphere is NOT supported by VMware; you cannot call and ask for help. That's not a concern here, given I can't call EMC for Isilon either since I'm using the Free and Frictionless downloads. This is not a production-grade configuration by any stretch.

[Screenshot: IsilonSD_NestedESXDiagramx4]

Once all four virtual machines existed on my physical ESX host, installing ESX was just an ISO attach away.

After installing ESX on all my virtual hosts, I then add them to my existing vCenter as hosts. vCenter doesn’t know these are virtual machines and treats them the same as a physical ESX host.

I've placed these virtual hosts into a vCenter cluster. However, this is only for aesthetic purposes, to keep them organized. I won't enable normal cluster features such as HA and DRS, given Isilon cannot leverage them, nor does it need them. Plus, given there is no shared storage between these hosts, you cannot do standard vMotion (enhanced vMotion is always possible, but that's another blog).

Here you can see those virtual machines with ESX installed masquerading as vSphere hosts:

[Screenshot: IsilonSD_NestedESXCluster]

I'll leverage Cluster A, which you see in the screenshots, for the Management Server and InsightIQ server. Cluster A is another cluster of nested ESXi VMs I used for testing VSAN; it also has the Virtual Infrastructure port group available, so all the IsilonSD Edge VMs can be on the same logical segment.

External Network

The Isilon external network is what faces the NAS clients. In my environment, I have a vSphere Distributed Virtual Switch port group called 'Virtual Infrastructure' where I place my core systems. This is also where vCenter and the ESX hosts sit, and it's what I'll use for Isilon, as there is no firewall/router between the Isilon and what I'll connect to it.

The Virtual Infrastructure network space is 10.0.0.0/24; I've set aside a range of IP addresses for Isilon in this network:
.50 is the management server
.151-158 for nodes
.159 for SmartConnect
.149 for InsightIQ.
You MUST have contiguous ranges for your nodes, but all other IP addresses are personal preference.

For use in the deployment steps:
-Netmask: 255.255.255.0
-Low IP Range: 10.0.0.151
-High IP Range: 10.0.0.158
-MTU: 1500
-Gateway: 10.0.0.1
-DNS Servers: 10.0.0.11
-Search Domains: lab.vernex.io

Internal Network

The physical Isilon appliances use dedicated InfiniBand switches to interconnect the nodes. This non-blocking, low-latency, high-bandwidth network allows the nodes to communicate with each other to stripe data across nodes, providing hardware resiliency. For IsilonSD Edge, Ethernet over a virtual network is used for this same purpose. If you were deploying this on physical nodes, you could bind the Isilon internal network to anything that all hosts have access to: the same network as vMotion, or a dedicated network if you prefer. Obviously, 10Gb is preferable, and I would recommend diversifying your physical connections using failover or LACP at the vSwitch/VDS level.

For my lab, I have a vSphere DVS for private management traffic; this is bound to the virtual switch on my host that has no actual NIC associated with it. It's a private network on the physical host underneath my nested ESXi instances. I use this DVS for VSAN traffic already, so I merely created an additional port group for Isilon, named PG_Isilon.

Because this is essentially a dedicated, non-routable network, the IP addresses do not matter. But to keep things clean, I use a range set aside for private traffic (10.0.101.0/24) and use the same last octet as on my external network.

For use in the deployment:
-Netmask: 255.255.255.0
-Low IP Range: 10.0.101.151
-High IP Range: 10.0.101.158

SmartConnect

For those not familiar with Isilon, SmartConnect is the technique used for load balancing clients across the multiple nodes in the cluster. Isilon protects the data across nodes using custom code, but to interoperate with the vast variety of clients, standard protocols such as SMB, CIFS, and NFS are used. For these, there still is no industry-standard method for spreading load across multiple servers (NFS does have the ability for transparent failover, which Isilon supports). The approach here is a beautiful blend of powerful and simplistic: by delegating a zone in your enterprise DNS for the Isilon cluster to manage, SmartConnect will hand out IP addresses to clients based on different load-balancing options appropriate for your workloads, such as round robin (the default) or others like least connection.

To prepare for deploying an IsilonSD Edge cluster, we're going to modify the DNS to extend this delegation to Isilon. Configuring the DNS ahead of time makes the deployment simple. If you're running Windows DNS, here are the quick steps (if you're using BIND or something similar, this is a delegation and should be very similar in your config files).
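
For the BIND crowd, the delegation amounts to an NS record for the subzone plus a glue A record pointing at the SmartConnect service IP. A sketch using this lab's names (the sc nameserver label is my own choice):

; In the lab.vernex.io zone file: delegate Isilon01 to the cluster,
; whose SmartConnect service IP answers DNS for the subzone
Isilon01     IN NS  sc.Isilon01.lab.vernex.io.
sc.Isilon01  IN A   10.0.0.159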

Launch the Windows/Active Directory DNS Administration Tool

[Screenshot: IsilonSD_NewDNSDelgation]

Locate the parent zone you wish to extend; here I use lab.vernex.io.

Right click on the parent zone and select New Delegation


Enter the name of the delegated zone; this ideally will be your cluster name, for my deployment Isilon01.

[Screenshot: IsilonSD_Delgation1]


Enter the IP address you intend to assign to Isilon SmartConnect

[Screenshot: IsilonSD_Delgation2]


That's it. When the Isilon cluster is deployed and SmartConnect is running, you'll be able to navigate to a CIFS share like \\Isilon01.lab.vernex.io; your DNS will pass the request to the Isilon DNS server, which will reply with the IP address of a node that can accept your workload. This same DNS name works for managing the Isilon cluster as well.
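
You can verify the round robin behavior from any client; repeat the lookup a few times and the answers should rotate through the node range (10.0.0.151-158):

nslookup Isilon01.lab.vernex.io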

Quick tip: you can CNAME anything else to Isilon01.lab.vernex.io, so I could make File.lab.vernex.io a CNAME pointed at SmartConnect. This is an excellent way to replace multiple file servers with a single Isilon.
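
In zone-file terms, the tip is just a one-line alias (File is a hypothetical name):

File.lab.vernex.io.  IN CNAME  Isilon01.lab.vernex.io.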

For use in the deployment:
-Zone Name: Isilon01.lab.vernex.io
-SmartConnect Service IP: 10.0.0.159


By | March 28th, 2016|EMC, Home Lab, Storage, VMWare|1 Comment

IsilonSD – Part 1: Quick Deploy (or how I failed to RTM)

This post is part of a series covering the EMC Free and Frictionless software products.
Go to the first post for a table of contents.

As I mentioned in my previous post, EMC recently released the 'software-defined' version of Isilon, their leading enterprise scale-out NAS solution. If you're familiar with Isilon, you might already know there has been a virtual Isilon (called the Isilon Simulator) for years now. The virtual Isilon would run on a laptop with VMware Workstation/Fusion, or on vSphere in your datacenter. I purchased and installed Isilon in multiple organizations, for several different use-cases; the Isilon Simulator was a great solution for testing changes pre-production as well as familiarizing engineers with the interface. The Isilon Simulator is not supported, and up until recently you had to know the right people to even get ahold of it.

With the introduction of IsilonSD Edge, we now have a virtualized Isilon that is fully supported, available for download and purchase through your favorite EMC sales rep. It runs the same codebase as the physical appliances, with some adjustments for the virtual world. As we discussed, there is a 'free' version for use in non-production as part of EMC's Free and Frictionless movement. I've run the Isilon Simulator personally for years, so I want to leverage this latest release of IsilonSD Edge as the new test Isilon in my home lab.

A quick stop at the IsilonSD Edge download page and I'm pulling down the bits. While waiting a few moments for the 2GB download, I review some of the links; there is a YouTube video on Aggregating Unused Storage, another that covers FAQs on IsilonSD Edge, and one more that talks about Expanding the Data Lake to the Edge. These all cover what I assumed: you'll want at least three ESX hosts to provide physical diversity, you can run other workloads on these hosts, and the ultimate goal of this software is to extend Isilon into edge scenarios, such as branch offices.

Opening the downloaded ZIP file, I find a couple of OVA files, plus the installation instructions. I reviewed a couple of the FAQs linked from the product page, though I didn't spend much time on the installation guide; nor did I watch the YouTube Demo: Install and Configure IsilonSD Edge. I like to figure some things out on my own; that's half the fun of a lab, right? I did see, under the system requirements, a mention of support for VSAN-compatible hardware, referencing the VMware HCL for VSAN. I just recently set up VSAN in my home lab, so that, coupled with the fact I've run the Isilon Simulator, means I'm good to go.

Fast forward through a couple of failed installations, re-reading the FAQs, more failed installations, then reading the actual manual… here's the catch.

You have to have local disks on each ESX node.

More specifically, you need to have directly attached storage… 
        without hardware RAID protection or VSAN.

Plus, you need at least 7 of these directly attached unprotected disks, per node.

While this wasn't incredibly clear to me in the documentation, once you know it, you will see it's stated; but given IsilonSD is running on VMDK files, I glossed over the parts of the documentation that (vaguely) spelled this out. If you've deployed the Isilon Simulator, or any OVA for that matter, you're used to selecting where the virtual machines are deployed; I assumed this would be the same for IsilonSD and that I could choose the storage location.

However, IsilonSD comes with a vCenter plug-in that deploys the cluster; as part of that deployment, it scans for hosts and disks that meet this specific requirement. Moreover, during the deployment IsilonSD leverages a little-used feature in vSphere to create virtual serial port connections over the network for the Isilon nodes to communicate with the Management Server; this is how the cluster is configured, so deploying IsilonSD nodes by hand isn't an option (you can still use the Isilon Simulator, which you can deploy more manually).

I'm going to stop here and touch on some thoughts; I'll elaborate more in a later post, once I actually have IsilonSD Edge working.

I do not know any IT shop that has ESX hosts with locally attached, independent disks (again, not in a RAID group or under any type of data protection). We've worked hard as VM engineers to always build shared storage so we can use things like vMotion.

The marketing talks about capturing unused storage, about running IsilonSD on the same hosts as other workloads, in fact on the same storage as other VMs; but I'm not sure who has unused capacity that also sits on independent disks.

I certainly wouldn't recommend running virtual machines on storage without any type of RAID-like protection. Maybe some folks have a server with some disks they never put into play, but 7 disks? And at least three servers with 7 disks each?

I know there are organizations with lean budgets where this might be the best they can afford, but are shops like that licensing and running vCenter ($)? Are they looking at a virtual Isilon ($)?

Call me perplexed, but I'm going to put off thinking about this, as I still want to get this running in my lab. Since I don't have three servers and 21 disks lying around at home, I'll need to figure out a way to create a test platform for IsilonSD to run on.

Be back soon…


By | March 25th, 2016|EMC, Home Lab, Storage, Uncategorized|3 Comments

Free and Frictionless – A Series

One of the most common statements I've made to vendors over the years is "why should I pay to test your software?". To this day I still don't understand this; if I'm going to purchase software to run in my production environment, why should I have to pay to run it for our development and testing needs? It seems counter-intuitive; in my mind, having easy access to software which IT can test and develop against increases the probability of choosing it for a project. Having software be free in non-production allows developers to ensure it is properly leveraged; it also encourages accurate testing and helps operations ensure it's ready to be run in production. In my experience, not only does this result in more use of the software in production (which means more sales for the vendor), but also more operational knowledge (which means less support needed from the vendor).

Companies offer different solutions to attempt to solve this. Microsoft does it well with TechNet and MSDN subscriptions, where for a small yearly fee you license your IT staff rather than the servers; you get some limited support and, recently, even cloud credits. Many companies provide time-bombed versions of their software; this helps in the evaluation phase to test installation, but falls short for ongoing development needs, not to mention operations teams gain no experience. Some vendors will steeply discount non-production, though most only do this during the purchasing process, and I've seen a wide range of how well that gets negotiated (if at all).

There is no doubt in my mind that this challenge is a significant factor in the growth of open-source software. With the ability to easily download software, without time limits and without a sales discussion, the time to start being productive in developing a solution is dramatically reduced. I've made this very choice: downloading free software and beginning the project while things like budget are still not finalized. The software can be kept running in non-production, and when it moves into production, support contracts can begin. You don't need to pay upfront, before prototyping, before a decision is made, and before any business value is being derived.

This is why I've been ecstatic that EMC is making a movement toward a model that allows the free use of software in non-production, even for products where they are not using an open-source license. They refer to this approach as 'Free and Frictionless'. It doesn't apply to all their software, but the list is growing; it currently includes products like ScaleIO, VNX, ECS and, recently added, Isilon. The free and frictionless products are available for download, without support, but without time-bombs either. In most cases there are restrictions, such as the total amount of manageable storage. These limitations are easy to understand and work with, and fully deliver on my age-old question: "why should I pay to test your software?"

I'm going to spend a little time with these offerings; many of them I've run in production, at scale, so I'm interested in how well they stack up in their virtual forms. I'll also explore some products I haven't run before.

By | March 24th, 2016|EMC, Home Lab, Storage, VMWare|10 Comments

Leap Day 2016 – VMAX & All Flash

Happy Leap Day; the day every four years when I'm reminded how technologists across the globe manage to work together to ensure our systems all stay on (relatively) the same time, even with odd things like leap day thrown into the mix!

If you haven't been on social media in the past few weeks, you might have missed that today EMC is making quite a few announcements. One of them, the 'All Flash VMAX' storage array, is something I want to share some thoughts on. I've answered quite a few questions about VMAX over the years, from running it, from technical panels, and from reference calls; so I'm going to put this post in a Q&A format. It's also a little long; if you'd prefer the cliff notes, head over to GotDedupe.com for the podcast on this subject, or hook up your favorite podcast subscription tool to Remain Silent.

Before we start, a few things I want to make clear, as I think it’s germane to the conversation:

  • I do not work for EMC; though I am EMC Elect, and have been on and off their Enterprise Technical Advisory Panel for years.
  • I have been a Symmetrix/VMAX customer for a little over 15 years in some form or fashion; yes I still call it Symmetrix.
  • I’ve been a customer of countless other storage technologies as well.
  • I'm going to refer to 'other' storage technologies; when doing so, don't read between the lines. I'm talking about all products, both EMC's and their competitors'. This blog isn't about trash-talking the non-VMAX arrays; this is about what makes VMAX great.
  • I’m going to be candid. In many cases, the questions are a little obtuse simply because it’s what I hear.
  • My responses here are similar to what I'd say in person, akin to some very real conversations I've had. Please read them in that vein and with the humor I intend.
  • These are my opinions; I’d love to debate them if you don’t share them.

Q: Isn’t EMC late to the game with All Flash VMAX?
A: First, no, they are not. I'm not sure people can argue specific dates, but the Symmetrix line has had flash drives for years, since before even the VMAX product existed (think DMX days). Yes, it was predominantly a caching layer for FAST VP tiering, or full-LUN tiering, or small increments carved up for application use like temp tables; but flash has been in the product for longer than any 'all-flash array' has existed, to my knowledge. I spoke with people running all-flash DMX arrays years ago for very critical workloads in the finance space; it was very expensive, but so was the workload it supported. So you could buy an all-flash VMAX before; it just warranted some special sizing and attention.

Q: But there are other ‘all-flash-arrays’ out there that are selling like hot-cakes, so isn’t this EMC just playing catch-up?
A: Still no. Beyond the previous point, and the fact they already have an analyst-leading all-flash array in their portfolio, quite simply, in almost all scenarios a VMAX properly configured with FAST VP tiering could produce the same performance as those all-flash arrays; so there wasn't a large need for an all-flash version of VMAX before. Those other arrays offset the price of flash with deduplication, which in some scenarios worked incredibly well, but not in all scenarios. In fact, if you didn't get a good dedupe rate, you not only had to buy more storage, but buy it at a price higher than you originally thought based on the dedupe ratio you were quoted at pre-sales. VMAX offset the price with tiering, which doesn't have the unknown-dedupe problem, though it does have a parallel problem if the IO density skew is not correct. Today, with 3D NAND flash bringing consistent performance at ever-lowering prices, and flash drives set to outpace spinning disk for capacity per square inch, the economics of flash can now be the price offset.

Q: Matt, you’re avoiding the point, “why even play if you show up late to the game.”
A: Well, you didn't really read my previous points, but fine; the introduction of an 'All Flash VMAX' is behind other products that market an 'all-flash array.' But answer my question now:

If you had a binary choice between bleeding edge, or reliability, which would you choose?

If it were a binary choice, I would choose reliability. To the point: VMAX arrays are sitting behind some of the most mission-critical workloads in the world. VMAX arrays handle transactions for large credit providers that cannot deal with any transaction getting lost or slowing down. VMAX sits in large hospitals that have people's lives riding on stable infrastructure. Banks have multiple VMAX arrays with replication in multiple countries. Not just two sites, but three or four; chained replication, star replication. In some cases, I know banks that have a 'bunker VMAX' array simply receiving a copy of all their data in a super-secure facility in another country to protect against major disasters. By the way, when talking about major disasters, we're not just discussing hurricanes or ice storms, but Depression-era economic crashes, nuclear events, pandemics, or military invasions. These are the scenarios dealt with by some of our IT brethren that we only see in blockbuster movies. When I talk to those engineers, they run VMAX arrays.

At one point in my career, I worked for a software vendor in the healthcare space; I know first-hand that when making decisions about our product, we asked ourselves if we would put our loved ones in a hospital running our product. Personally, I was responsible for reliability and performance testing, so I often asked that of myself when evaluating hardware as well. These are the questions the VMAX engineers ask themselves too, because they know all the mission-critical workloads that are running on the product they develop. I've bought VMAX over and over again because of that very fact. Because they are worrying about reliability, not just on the latest buzzword technology, but on interoperability across Mainframe, AS/400, AIX/Power, Solaris/SPARC, Integrity/HP-UX, Linux, VMware, Hyper-V, and more.

So, I still don't think they are late to the game, given the economics of flash are just reaching the point where all flash makes financial sense without highly space-efficient workloads. But even if you do, keep in mind… doing something right takes time. Doing it well enough that lives, countries, and world economies ride on it is worth taking the extra time. I don't want a surgeon telling me he's trying a new POC scalpel a vendor gave him for free right before I go into surgery.

“When VMAX comes out with a feature, I’d say, without a doubt, it’s going to be the most reliable option.”

Q: I get your point, but I don't run a bank and no one's life depends on my storage; why does this apply to me?
A: That level of reliability engineering and testing trickles downhill; even if you aren’t that bank or hospital, do you want your systems going down?

Q: No, of course not, I don't want my systems going down; but it's not like other solutions fail all the time. When is VMAX the right solution?
A: Now we get past the marketing into the real conversation: when is a VMAX All-Flash right versus an all-flash array with dedupe, or virtual storage, or classic active/passive scale-up storage? Of course, it depends on your requirements and technology stack, and I HIGHLY recommend having that conversation with a VAR or manufacturer that has all those options. Only when your toolbox is full can you pick the right tool for the job. If you'd like, I'd be happy to have the conversation too; feel free to reach out.

That said, this is a blog, so let’s explore the questions… here is how I look at VMAX:

"VMAX is the RIGHT solution for every storage requirement; it is not, however, always the BEST solution for every storage requirement."

Because of that rock-solid operation, and because of all the options you have with VMAX (scale up, scale out, FICON, Fibre Channel, eNAS, iSCSI, large cache, consistency groups, asynchronous replication, synchronous replication, metro-clustered storage, snapshots, cache partitioning, GUI management, CLI management, on and on and on), I have not encountered a situation where VMAX would not provide the necessary solution. You might need all flash, you might need several of them, you might need 1 engine or 8, you might need 3.5-inch drives with high-density SATA; but I bet you can use VMAX to solve it.

But…

If you have a VDI environment you want dedicated disk for, an all-flash array with deduplication will probably be simpler and more cost-effective. For that matter, hyper-converged infrastructure would be even more appropriate, as it's more than just storage; it includes compute, software, and management, and is probably supported together.

If you have an enormous amount of file data that's growing at a high rate, you might want a scale-out NAS solution that allows adding storage at the node level and can allow you to access those objects across a single namespace or through a central API. HCIA is starting to get interesting here, as we see software-defined NAS solutions that scale out, as well as object-specific arrays and cloud object stores.

If you have a medium-sized environment, need host interoperability, and don't have high-scale or super-high-reliability requirements, active/passive scale-up arrays might be more cost-effective. Yet again, HCIA is a good option here too if you are x86-based and comfortable with virtualization.

If you're doing genomic sequencing on a Linux compute farm, you might need a PCI Express-based, low-latency storage array for blisteringly fast IO.

But if you have a mainframe running a mission critical workload in a CICS-complex across long distance; you’re probably going to choose VMAX.

If you are running a large ERP system on two ‘big-iron’ RISC-based Unix boxes and need synchronous replication to your secondary site across town, and from there asynchronous replication to your third site across the country; you’ll probably choose VMAX.

If you have a large VMware environment under high load and need multi-tenancy options to subdivide the workloads and prevent groups of VMs from adversely affecting each other, down to having dedicated storage array channels, plus control each group differently at the replication layer and fail them over with rich Site Recovery Manager integration, you'll probably choose VMAX.

If you have a heterogeneous environment with lots of technologies and need something that supports them all, you'll probably choose VMAX, because you have a single pool of storage to leverage across those technologies, with one management interface and substantial vendor support for all those platforms.

Or, if you're like me and walk into IT organizations that need help across the technology landscape, I pick VMAX because I know it will solve all the storage needs. I know it will be almost transparent to maintain after setting it up, and that with everything I have to deal with, I won't be getting a call at 3 am because of the VMAX. I did this even with an all-VMware environment, because even though VMAX still supports FICON on mainframe, it also supports the full VAAI command set, was one of the first to support VVOLs, and has some of the best VMware plug-ins out there.

Q: Mainframes, big-iron, Unix? Those are all dinosaurs, just like VMAX.
A: So if those are dinosaurs, they are the alligators that survived not only extinction-level events but the epochs since, to sit at the top of their food chain today. Seriously, stop using that phrase about technology just because you are only exposed to (or sell) the current technology heralded as the second coming, which will only become yet another 'dinosaur' when the next recycled idea gets a new spin and a big marketing campaign. Do not misread this; I work with the newer technologies daily, and I believe in many of them, but there is still a place for some of these more battle-tested technologies, especially when they are still under massive development and used by giants of industry.

If you can’t integrate the bleeding edge with existing technologies, you’re on an island connected to nothing.

Q: Wow, ok Matt, I get it, you like VMAX…
A: Yes, I do, though I like a lot of other storage as well. As I said at the beginning, we're talking about what makes VMAX great. We're not talking about anyone else, because there is a lot out there I like and have used.

“Put it this way, if I had to choose one array, it would be VMAX. If I got to choose two, one of them would be VMAX.”

I just think in today's age, with all the new technology getting attention, we often lose sight of all the needs of IT, big and small, across industries. A good technologist will look at their requirements and pick the BEST solution, which again may not be VMAX; though more use-cases just came into play with the announcement of the All-Flash VMAX.

Q: Speaking of the All Flash VMAX, I thought that was the point of this post?
A: Somewhat. Today EMC is releasing the 'VMAX All-Flash'; the reason the previous points were important was not only to address some of the questions I get when discussing VMAX, but to acknowledge the history of functionality and reliability that the new announcement puts in play.

“This is… a VMAX… with, all flash. That is a big deal.”

There are some enhancements to the technology, today and coming, that take advantage of all flash. There is also a new bundling model to simplify purchasing, and a sizing model to ensure it's configured correctly. But you're still getting SRDF and TimeFinder, and HyperMax (Enginuity). Those things are important if you understand this product.

Q: What do I get with a VMAX All Flash?
A: I'm a little past the TL;DR point, and there are going to be some great blogs on the speeds and feeds, which I'll link to when they come out. To wrap this up, here is how I look at the product when taking knowledge of the VMAX into account.

This is a VMAX configuration optimized for all flash. Because of that, it will offer larger drives (think 3.8TB today, and I believe that will grow quickly). These are larger flash drives than were available in the VMAX3, because there is less worry about IO density. In a tiered system, it is important that the flash tier be designed for the necessary IO, meaning drive count was as important as tier size; even flash has IO limitations. With all flash, you balance the IO density in one tier, so larger drives will not become a bottleneck (put more simply: people skimp on drive count in their flash tier when the drives are so big, causing that IO skew problem we talked about earlier; with all flash everything is tier 0, so the larger drives are generally safe, though the sizing configurator for this model will ensure the right drive is selected for IO vs. capacity).

One of the distinguishing factors with VMAX in general is the large DRAM-based cache and its associated algorithms. This comes into play with all flash. While other arrays are starting to tier between SLC and MLC to provide fast writes while avoiding cell-level wear, the VMAX can leverage its large cache to coalesce writes, speeding up write activity, reducing wear, and still leveraging more cost-efficient flash technology. Those VMAX engineers we talked about, who think mission-critical, put about a million lines of code into HyperMax to ensure the best leverage of these drives, not only putting IO to them but monitoring them.

You get large scale: think 8 engines with 16 directors, 16TB of DRAM cache, 256 host ports, and 1,920 flash drives pushing 4.3PB of usable flash capacity.

SPC-2 numbers like 55+ GBps, and small-block IOPS in the millions, with that entire 4.3PB of space being Diamond class.

Simple purchasing:

  • Two hardware models
    • 450 = 4 engines (200K)
    • 850 = 8 engines (400K)
  • Two software packages
    • F = Base + Snapshots (add-ons a la carte)
    • FX = Probably everything you need, like SRDF, Cloud Array, Data at Rest Encryption, eNAS, ViPR (other options a la carte)
  • A base configuration, then you grow in 'V-Brick' units: packs of disks that give you a growth model you understand up front.

You can also run SRDF to a hybrid VMAX, opening all sorts of possibilities for inserting the new All Flash VMAX into multi-array/multi-site implementations. The only thing I've seen to be aware of is the lack of FAST.X. This does make sense when you think about it; if you're going to leverage FAST.X to tier DOWN from the VMAX, then use a VMAX3. That product is still the RIGHT solution; it's just that maybe the new All Flash VMAX is the BEST solution for your requirements.

"This is what makes VMAX great; the flexibility of choice to make the storage fit your environment, rather than make your environment fit the storage."

[Image: VMAX-AF]

By | February 29th, 2016|Storage|4 Comments

Soapbox Topic=”FUD”

[Image: liedetectorgraph]

<soapbox>

FUD: Fear, Uncertainty, and Doubt. More specifically for this blog, a vendor who attempts to discredit a competitor rather than speak to their own value, be it with straw-man arguments, comparing apples to oranges, or simply outright lies. I find the practice personally disgusting. If you cannot speak well enough about your product that you have to speak poorly of others, I'll assume you have a bad product. It's like when my kids tattle on each other trying to get out of trouble; then I know for sure they did the deed I'm asking about. It happens every day, but I can recall a few cases in the storage world that raised my ire.

Years ago, a large technology manufacturer (of whom I had many products, as well as their competitors') was pitching me on net new growth. Their product line had been suffering in my operations, so it was not the front-runner. Rather than speaking about how to improve the situation, or how the newer generation would resolve the issues, they tried to make their competition appear overly expensive. In doing so, they pulled prices from eBay (yes, really, eBay). Now, the argument had merit on the surface: since sales reps are not privy to competitors' pricing, eBay has to be cheaper than list price, or even the discounted price, right? It's eBay! I can see their line of thought: "certainly if we're cheaper than eBay, they'll use us." Sadly, the eBay prices were incredibly inflated, over current list prices, not to mention the discount I'd receive. I recall losing my cool in that conversation, dressing down the sales rep about FUD practices and their failure to address our concerns in the new sale, not to mention our operational issues. To boot, they believed we as the customer couldn't do the basic math in a cost comparison. He was walked off our campus, never to return (seriously, did I mention I don't like FUD?).

In another case, a vendor was telling me how their product was superior to a competitor's because it tiered at the sub-LUN level, telling me my product of choice would only tier at the whole-LUN level. I was in management at the time and storage was one of many of my departments, so I have to imagine the sales team believed I simply wasn't aware of the details. The detail being: they were comparing their current product to their competitor's product of 2 years ago. Not only did I correct them on their misinformation, but since they sold other products I liked and wanted, I had the account team replaced because of that breach of trust (again, I really don't like FUD).

Today, with social media, my witnessing of this practice is no longer limited to personal interactions. Almost daily I see a tweet about one product replacing another; and when the replaced product is 3-5 years old, and likely 2+ generations behind, again my hackles rise. Especially because in many of these cases, I believe the product has technical merit. The bitter use of logical fallacy in comparing different generations, in the world of Moore's Law, causes me to assume they are trying to cover something up. The approach erodes my trust in the people, and the company itself, that spread the misinformation.

If you are reading this and are involved in the sales channel, please, compete with integrity. Stand on your own merits. If your pitch is rooted in bashing your competitor, educate yourself and focus on your product’s positive aspects, leaving it up to the customer to weigh them against the competition. If the product you are competing against truly has issues you want to inform your customer of, leverage a reference customer for an unbiased call.

You might just win the deal based on your integrity.

</soapbox>

By | February 4th, 2016|Soapbox|0 Comments

Bulk Upload ISOs to vCloud Air

I’ve been leveraging vCloud Air recently for development and testing. The ability to run nested ESXi in the cloud, coupled with the promotions for free credits makes it a convenient playground for running a plethora of applications.

Uploading to vCloud Director has always been cumbersome. Due to the required browser plug-ins, you needed the right combination of versions, and your settings had to be just right. But recently Chrome and Firefox closed the door on the old Netscape Plugin API (NPAPI), which fully broke the ISO upload feature.

VMware provides a command-line tool called OVFTool that allows uploading ISO files (and OVFs as well). William Lam over on vGhetto wrote an excellent shell script wrapping OVFTool to make it a little easier. However, I’m still too lazy for that. What I wanted was to upload ISO files in bulk to my catalog, so I could quickly build environments from scratch.

So enter vCloudAir_BulkUploadISO.sh; this bash script for Mac will do just that. Set a couple of variables up front: your vCloud Air information and the folder you want to upload. The script will prompt for your credentials and loop through all the ISO files in the folder, uploading them to your catalog. Clone it from my GitHub page.

Open in your favorite editor, adjust the variables and upload en masse.
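
The heart of the script is just a loop around OVFTool’s vcloud locator. Here’s a minimal sketch of the idea; the variable values are illustrative placeholders for your own environment, and it assumes OVFTool’s vcloud locator syntax with the media parameter for ISO uploads:

#!/bin/bash
# Minimal sketch: bulk-upload ISOs to a vCloud Air catalog with OVFTool.
# All values below are illustrative placeholders; adjust for your environment.
VCLOUD_HOST="us-california-1-3.vchs.vmware.com"   # your vCloud Air endpoint
VCLOUD_ORG="M123456789-12345"                     # your org identifier
VCLOUD_CATALOG="ISO-Library"                      # target catalog name
ISO_DIR="$HOME/ISOs"                              # local folder containing the ISOs

read -p "vCloud Air username: " VCLOUD_USER

# Loop through every ISO in the folder and push it into the catalog.
# OVFTool prompts for the password on each upload; depending on your org
# layout you may also need a vdc= parameter in the locator.
for ISO in "$ISO_DIR"/*.iso; do
  NAME=$(basename "$ISO" .iso)
  echo "Uploading $NAME ..."
  ovftool --sourceType=ISO "$ISO" \
    "vcloud://$VCLOUD_USER@$VCLOUD_HOST:443?org=$VCLOUD_ORG&catalog=$VCLOUD_CATALOG&media=$NAME"
done

Treat this as the shape of the approach rather than the script itself; grab the real one from GitHub.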

By | January 21st, 2016|VMWare|0 Comments

ProcessControl

I was looking through some old code and found a little gem I built almost ten years ago called ProcessControl. I’m sure you’re familiar with the ability to adjust the affinity and priority of processes in Windows. If you’re not, give it a try in Task Manager.

I’ve used these controls quite a bit over the years for numerous purposes, such as troubleshooting performance by adjusting process priority, or getting older applications and games to work better by constraining them to a single core (common early in the multi-core days, before thread management was widespread). I wrote this code in my spare time to test an idea I had for improving the performance of a COTS product called SolarWinds Orion, an excellent monitoring tool for networking (among other things).

Orion runs multiple components on a single server (web server, business logic, SNMP trap server, NetFlow collector and SNMP poller); however, at the time the SNMP poller was single-threaded, creating a conflict over resources. The other components, which spread their workload across all cores, would land on the same core Windows had automatically assigned to the single-threaded daemon. That contention slowed down the SNMP poller.

After testing and tuning the affinity of all processes, giving the SNMP poller a dedicated core showed a vast improvement in polls per second. However, manually adjusting these settings in Task Manager after every reboot or process change is not a viable solution. So entered a very simple piece of code that takes an XML configuration file describing the desired affinity and priority of each process, then adjusts them programmatically. Running it as a Windows service, set to start automatically after boot and recheck the process settings periodically, made it ready for operations.

With the existence of this tool, I found myself using it frequently to solve odd performance situations. Additional benefits appeared, such as reducing how often processes moved between cores/sockets and triggered low-level cache rebuilds. Or, in processor-constrained systems where upgrading hardware wasn’t possible, removing the offending process from core 0 and reducing its priority would solve stability issues (hardware-level interrupts and core operating system processes run on core 0, so OS instability is often due to user processes contending with kernel processes).

My most common use of this code was on my personal development workstations. With SQL Server, IIS, MySQL, PostgreSQL, Apache and more all installed on the same instance where I do my coding, the resource contention between all these applications slows down the GUI. ProcessControl can reduce the cores and priority of all those server daemons, which do not need robust resources simply to test code or inspect configuration details. This, in turn, leaves more resources for the application with a human interface, speeding up the experience.

I’ve posted the code and working binaries on my GitHub page. If you’re tired of manually changing your process priority and affinity, or have been looking for a way to tune applications that conflict due to thread management, feel free to use the tool and contribute to the code. It’s not highly complex code, but it gets the job done.

ProcessControl Quick Start

Installation

  1. Create a new folder called “ProcessControl” under “C:\Program Files (x86)\”
  2. Download the entire zipped repository from GitHub (or clone it if you want)
  3. Open the zip file and copy all the contents from “ProcessControl\bin\Release\” into “C:\Program Files (x86)\ProcessControl\”
  4. Navigate to “C:\Program Files (x86)\ProcessControl\” and double-click “ProcessControl.exe”
    1. This will not actually launch the service, but executing the application once will confirm you have the needed .NET Framework
  5. Use the provided tool “srvinstw” to register ProcessControl.exe as a Windows service
    1. Run it as the default account, so it has access to the processes.
    2. Choose Auto or Manual based on your needs.
  6. Adjust the XML config file “ProcessControlParams.xml” as desired; refer to the configuration details below
  7. Start the service

Configuration

There are two configuration files:

ProcessControl.xml is the main application configuration. It has a setting for the location of ProcessControlParams.xml (in case you want to host the service and associated files somewhere other than Program Files), as well as a setting for the interval at which process attributes are rechecked; by default this is every 15 minutes.
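
To illustrate, here is a sketch of what that main configuration holds. The element names below are purely hypothetical (check the actual ProcessControl.xml in the repository for the real ones); the two settings it carries are the params file path and the recheck interval:

<!-- Hypothetical element names for illustration; see the real ProcessControl.xml on GitHub -->
<Settings>
  <ParamsFilePath>C:\Program Files (x86)\ProcessControl\ProcessControlParams.xml</ParamsFilePath>
  <RecheckIntervalMinutes>15</RecheckIntervalMinutes>
</Settings>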

The second XML file, ProcessControlParams.xml, is the meat of the application. The first configuration line, with the process name ‘Default’, adjusts the affinity of ALL processes; this is a baseline reset that allows you to clear off a core. The priority control does NOT work for Default. The next line, copied and pasted as many times as you need, adjusts the process of your choosing. You can adjust the priority and/or affinity of (almost) any process (there are a few system processes you cannot control). Here is a quick look at the XML:

<Process Name="Default" Priority="" Affinity=""/>
<Process Name="MyProcess" Priority="" Affinity=""/>
So what are the options for Priority and Affinity?

Priority – these are the standard options:

  • RealTime
  • High
  • AboveNormal
  • Normal
  • BelowNormal
  • Idle

Affinity – this is a little trickier; it is controlled by a single number, a bitmask in which each core N contributes 2^N (for example, cores 4 and 5 give 16 + 32 = 48). Below are the common options I’ve used in up to an 8-core environment; they are also documented in the ProcessControlParams.xml file you download. If you want a combination that is not documented and don’t want to do the math, simply set a process to the desired state manually before starting the service; the first thing the service does on launch is log the status of all existing processes, so you can read the value back out. On the off chance you discover more, please update the file on GitHub.

All 8 = 255
0,1 = 3
1,2 = 6
2,3 = 12
4,5 = 48
6,7 = 192
4-7 = 240
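
Putting the pieces together, here is a sketch of a ProcessControlParams.xml that pushes every process onto cores 4-7, then gives a single-threaded poller dedicated cores 2,3 at a raised priority. The process name “MyPoller” is an illustrative placeholder, not something from the repository:

<!-- Illustrative example; "MyPoller" is a placeholder process name -->
<Process Name="Default" Priority="" Affinity="240"/>    <!-- baseline reset: everything onto cores 4-7; priority is ignored for Default -->
<Process Name="MyPoller" Priority="High" Affinity="12"/> <!-- the poller alone gets cores 2,3 -->

This mirrors the Orion scenario above: the baseline clears cores 0-3, the poller gets cores 2 and 3 to itself, and cores 0 and 1 are left to the kernel and interrupts.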

Logging

ProcessControl logs to the Windows Application event log. Successful process changes, as well as errors and full exception output, are all written as event entries. If you’re having problems with the service, it’s a good bet the information will be in the Application log.

By | January 13th, 2016|Code, Tuning|0 Comments