Windows Server Multicast Networking Guide

This article is a compilation of several articles on the topic of networking (particularly multicast messaging) for Windows Server that I had previously written. Many of the hits on this blog recently have been for these articles, so I figured it was time to compile all my learnings from several years of working with high-throughput applications on Windows in one convenient place.

Sub-par Performance because of Power-Saving Features

Both Windows and network interface cards (NICs) come with various power-saving features. The default settings are usually a compromise between energy efficiency and performance. For optimal performance, they need to be changed.

This article shows how to do that at the operating system level: Power-Saving in Windows Server 2008 R2 and later
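As a minimal sketch of the operating-system side, assuming the goal is simply to activate the built-in High performance power plan (the linked article may recommend further settings), powercfg can be called from an elevated PowerShell prompt:

```powershell
# List the available power plans; the active one is marked with an asterisk
powercfg /list

# Activate the built-in "High performance" plan
# (SCHEME_MIN is its alias and stands for minimum power savings)
powercfg /setactive SCHEME_MIN

# Confirm which plan is active now
powercfg /getactivescheme
```

NIC-level power-saving options are driver-specific and are not covered by this sketch.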

Not Receiving Multicast Messages on a Microsoft Failover Cluster

It can happen that multicast messages are not received on a machine that is part of a Failover Cluster because the physical interface has a higher metric than the virtual failover adapter that is installed as part of Microsoft Failover Cluster.

This article explains how to diagnose the issue and adjust that metric of the physical interface so applications listening for incoming multicast messages are joined on the correct interface: The Case of Multicast Messages not being Received on a Windows Server 2008 R2 Microsoft Failover Cluster
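A rough sketch of what reviewing and adjusting the metric can look like with netsh, which is available on Windows Server 2008 R2 as well as later versions. The interface name and the metric value below are hypothetical placeholders, not values taken from the linked article:

```powershell
# Show all IPv4 interfaces with their metrics (column "Met"); lower values are preferred
netsh interface ipv4 show interfaces

# Hypothetical interface name and metric value: adjust both to your environment
$physicalInterface = 'Local Area Connection'
$newMetric = 1

# Assign a fixed, low metric to the physical interface (requires an elevated prompt)
netsh interface ipv4 set interface "$physicalInterface" metric=$newMetric
```

On Windows Server 2012 and later, the Get-NetIPInterface and Set-NetIPInterface cmdlets (with the -InterfaceMetric parameter) offer the same functionality in native PowerShell.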

Multicast Messages dropped because of a small Socket Receive Buffer

As messages travel from the wire through the NIC and the operating system network stack to your application, there are various buffers along the way. Each of them needs to be large enough to hold data for a little while in case one of the components involved is busy.

As an application developer, it is your job to make sure the receive buffer sizes used by the sockets listening for incoming multicast messages are large enough to buffer data e.g. when a garbage collection prevents your code from processing it for a moment.
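To illustrate the idea, here is a minimal sketch using the .NET UdpClient class from PowerShell. The multicast group, port, and buffer size are made-up example values; the right buffer size depends on your message rate and on how long your application may pause.

```powershell
# Hypothetical multicast group, port, and buffer size: adjust to your application
$groupAddress = [System.Net.IPAddress]::Parse('239.1.2.3')
$port = 30001
$requestedSize = 8MB

$udp = New-Object System.Net.Sockets.UdpClient
try {
    # Request a larger receive buffer before binding the socket
    $udp.Client.ReceiveBufferSize = $requestedSize

    $endPoint = New-Object System.Net.IPEndPoint -ArgumentList ([System.Net.IPAddress]::Any), $port
    $udp.Client.Bind($endPoint)
    $udp.JoinMulticastGroup($groupAddress)

    # Read the value back: the operating system may grant a different size than requested
    $grantedSize = $udp.Client.ReceiveBufferSize
    "Requested $requestedSize bytes, granted $grantedSize bytes"
}
finally {
    $udp.Close()
}
```

The same ReceiveBufferSize property is available on System.Net.Sockets.Socket, so the approach carries over to any .NET application that creates its sockets directly.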

Multicast Messages dropped because of the Base Filtering Engine

If there is a lot of multicast traffic on a machine, Windows’ Base Filtering Engine (BFE) might not be able to keep up with it, resulting in dramatic message loss. BFE is part of Windows Firewall, but it might be active even when a different firewall solution is used.

This article links to a hotfix available for Windows Server 2012 and also explains how to fix the issue in Windows Server 2012 R2 through the UdpExemptPortRange registry parameter: The Case of Multicast Message Loss on Windows Server 2012 R2

Multicast Messages dropped because of a small NIC Receive Buffer

Another reason for multicast message loss I experienced was a receive buffer on the NIC that was too small. Everything went smoothly until traffic became a bit spiky. So even though the application socket receive buffer (see above) was large enough, data didn’t reach it because it was dropped in the network stack.

This article shows how to review NIC parameters with PowerShell (which unlike the NIC driver GUI in Windows does not require administrator access) and diagnose this kind of message drop: The Case of Multicast Message Loss on Windows Server 2012 R2 (again). It also contains some general tips on optimal NIC settings.
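As a rough sketch of such a review, assuming Windows Server 2012 or later (where the NetAdapter cmdlets are available) and a hypothetical adapter name:

```powershell
# Hypothetical adapter name: use Get-NetAdapter to list the adapters on your machine
$adapterName = 'Ethernet'

# Review all advanced NIC parameters, including the receive buffer setting
# (display names and value ranges vary by NIC model and driver)
Get-NetAdapterAdvancedProperty -Name $adapterName |
    Sort-Object DisplayName |
    Format-Table DisplayName, DisplayValue, RegistryKeyword, RegistryValue -AutoSize

# Check the adapter statistics for discarded or erroneous incoming packets
Get-NetAdapterStatistics -Name $adapterName |
    Format-List Name, ReceivedDiscardedPackets, ReceivedPacketErrors
```

Whether drops caused by a full NIC buffer show up in these counters depends on the driver, so treat a value of zero with some caution.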

Multicast Messages dropped because of the wrong Receive Side Scaling Load Balancing Profile

This article takes a look at the RSS Load Balancing Profile and how a driver update resetting it caused multicast message loss: The Case of Multicast Message Loss because of the wrong Receive Side Scaling Load Balancing Profile

The Case of Multicast Message Loss because of the wrong Receive Side Scaling Load Balancing Profile

[2018-11-11 Update] I have compiled information from this and all my other articles on the topic into my Windows Server Multicast Networking Guide.

The Problem

Even after I had applied all the optimizations outlined in my Windows Server Networking Guide, I was still experiencing multicast message loss and even noticeable delays in TCP traffic on one Windows Server 2012 R2 machine in a failover cluster.

Interestingly enough, this was an issue on only one of the cluster nodes but not on the others. My analysis showed that the affected machine hosted applications that essentially use one, and only one, CPU continuously for mathematical calculations.

The Cause

During installation of the NIC and a subsequent driver update, one of the NIC parameters was apparently reset: the Receive Side Scaling (RSS) Load Balancing Profile.

While Microsoft’s documentation lists the default as NUMAScalingStatic, in my case it was set to ClosestProcessor (the exact parameter and value names may vary based on your NIC model).

If I understand the documentation and Microsoft’s Introduction to Receive Side Scaling correctly, ClosestProcessor will essentially use just one processor for handling all network traffic. This would line up neatly with my observation that only the one machine running applications with heavy single-CPU use was affected.

The Solution

Adding a line to the machine setup PowerShell script to force the parameter back to NUMAScalingStatic has completely resolved the issue.
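A sketch of what such a line can look like with the NetAdapter cmdlets of Windows Server 2012 and later. The adapter name is a hypothetical placeholder, and note that the cmdlet calls the profile NUMAStatic, while the driver's property page may use a different name such as NUMAScalingStatic:

```powershell
# Hypothetical adapter name: use Get-NetAdapter to list the adapters on your machine
$adapterName = 'Ethernet'

# Review the current RSS configuration, including the load balancing profile
Get-NetAdapterRss -Name $adapterName | Format-List Name, Enabled, Profile

# Force the profile back to NUMAStatic (requires an elevated prompt;
# the adapter may be restarted briefly for the change to take effect)
Set-NetAdapterRss -Name $adapterName -Profile NUMAStatic
```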

Hiking at Baldeneysee

On Saturday I decided to get up early and take the train to Baldeneysee on the river Ruhr to see the sun rise over the lake from Hügel and then hike on the Baldeneysteig trail in the mountains above.

Despite having to rise early on a day off work, this was the most beautiful experience. The weather was perfect for the occasion resulting in the gorgeous pictures you see below.

Sunrise

Hike




Arrival in Kupferdreh


The Case of Low Disk Space because of another User’s Recycle Bin

The Problem

The other day, I got an alert that one machine was running low on disk space. I used WinDirStat to find out that one of the largest folders was a sub-folder of the recycle bin. These folders are named with the Security Identifier (SID) of the user they belong to.

Now I only needed to find out which user this SID belonged to.

The Solution

This article on TechNet provides PowerShell snippets showing which .NET classes to use to translate between Security Identifiers and user names. I evolved these snippets into the following helper functions which, as always, can be found in my PowerShell Utilities on GitHub.

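# Translates a SID string (e.g. the name of a recycle bin sub-folder) into an account name.
Function Convert-SidToUsername($SidString)
{
    $sid = New-Object System.Security.Principal.SecurityIdentifier($SidString)

    try
    {
        # Translate throws if the SID cannot be resolved, e.g. for a deleted account
        $user = $sid.Translate([System.Security.Principal.NTAccount])
        $result = $user.Value
    }
    catch
    {
        $result = "Unknown user: $SidString"
    }
    return $result
}


# Translates an account name (optionally qualified with a domain) into a SID string.
Function Convert-UsernameToSid($Domain = "", $Username)
{
    if ($Domain)
    {
        $user = New-Object System.Security.Principal.NTAccount($Domain, $Username)
    }
    else
    {
        # Without a domain, the account name is resolved by the local machine
        $user = New-Object System.Security.Principal.NTAccount($Username)
    }
    try
    {
        # Translate throws if the account name cannot be resolved
        $sid = $user.Translate([System.Security.Principal.SecurityIdentifier])
        $result = $sid.Value
    }
    catch
    {
        $result = "Unknown user: $Domain\$Username"
    }
    return $result
}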
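
To apply the same translation to every SID-named folder at once, something along the following lines should work. This is only a sketch: the drive letter is an assumption, and reading the recycle bin folder may require an elevated prompt.

```powershell
# Assumption: the recycle bin on drive C: (every drive has its own $Recycle.Bin folder).
# Single quotes keep PowerShell from treating $Recycle as a variable.
$recycleBin = 'C:\$Recycle.Bin'

$owners = @(
    Get-ChildItem -LiteralPath $recycleBin -Force -ErrorAction SilentlyContinue |
        Where-Object { $_.PSIsContainer } |
        ForEach-Object {
            $sidString = $_.Name
            try {
                $sid = New-Object System.Security.Principal.SecurityIdentifier -ArgumentList $sidString
                $owner = $sid.Translate([System.Security.Principal.NTAccount]).Value
            }
            catch {
                # Malformed folder names and unresolvable SIDs end up here
                $owner = 'Unknown user'
            }
            New-Object PSObject -Property @{ Sid = $sidString; Owner = $owner }
        }
)

$owners | Format-Table Sid, Owner -AutoSize
```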

Knowing their name, I could now ask the user to log on to the machine in question and empty their recycle bin.

It might be that with administrative privileges I could delete the SID-named folder myself, but I felt it best not to mess with the internal folder structure of the recycle bin.

The Case of Lightroom Placing all Photos at the End of the Track: Sorting a GPX File

The Problem

Using the Map module of Lightroom and a GPX file containing a track log, photos were supposed to be placed on the track based on the time they were taken. However, Lightroom placed all photos at the end of the track, even though they were taken at various times at various places along the track. A look into the GPX file confirmed that it contained more suitable track points for Lightroom to match the photos against.

The Cause

It seems that Lightroom expects the track points in the GPX file to be in chronological order (i.e. oldest track point first) and that it will associate a photo with the first track point that has a time stamp newer than the photo. In this case, the track points were in descending order (i.e. the end of the track was at the top of the file), so all photos matched this first track point and were hence placed at the end of the track.

The Solution

GPX files are simply XML files, so I wrote the following PowerShell script to sort the track points by time and save the result to a new GPX file.

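# Load the GPX file as an XML document
[xml]$xml = Get-Content 'MyTrack.gpx'
$segment = $xml.gpx.trk.trkseg
# Sort newest first: inserting each point right after the oldest one then yields ascending order
$points = @($segment.trkpt | Sort-Object -Property time -Descending)
$firstPoint = $points[-1]
$points | ForEach-Object { $segment.InsertAfter($_, $firstPoint) } | Out-Null
# Save resolves relative paths against the process working directory, so pass a full path
$xml.Save((Join-Path $PWD 'MyTrackSorted.gpx'))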

Since time stamps are in ISO 8601 format, they can be sorted as strings, provided they all use the same time zone and precision.

The nodes returned by InsertAfter are piped to Out-Null since there are a lot of track points that would otherwise take a long time to print and needlessly slow down the script.

Making this work with multiple track segments in a single file is left as an exercise to the reader.
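
For readers who would rather not do the exercise, here is one possible sketch. It sorts the points within each segment and leaves the order of the segments themselves untouched. The inline sample data is made up and stands in for a real file, and the approach (re-appending the points in ascending order) differs from the InsertAfter technique used above.

```powershell
# Hypothetical sample data standing in for a real GPX file with two track segments
[xml]$xml = @'
<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1" creator="example">
  <trk>
    <trkseg>
      <trkpt lat="51.40" lon="7.00"><time>2018-05-05T06:20:00Z</time></trkpt>
      <trkpt lat="51.41" lon="7.01"><time>2018-05-05T06:10:00Z</time></trkpt>
    </trkseg>
    <trkseg>
      <trkpt lat="51.42" lon="7.02"><time>2018-05-05T07:30:00Z</time></trkpt>
      <trkpt lat="51.43" lon="7.03"><time>2018-05-05T07:00:00Z</time></trkpt>
    </trkseg>
  </trk>
</gpx>
'@

# local-name() sidesteps the GPX default namespace; @() takes a snapshot of the node list
foreach ($segment in @($xml.SelectNodes("//*[local-name()='trkseg']"))) {
    # Copy the points into an array first, because the child list changes while nodes are moved
    $points = @($segment.SelectNodes("*[local-name()='trkpt']")) | Sort-Object -Property time

    # AppendChild moves an existing node to the end of its parent,
    # so appending the points in ascending order sorts the segment
    foreach ($point in $points) {
        $segment.AppendChild($point) | Out-Null
    }
}

# Collect the time stamps in document order to check the result
$sortedTimes = @($xml.SelectNodes("//*[local-name()='time']") | ForEach-Object { $_.InnerText })
$sortedTimes
```

To use this on a real track, replace the here-string with Get-Content as in the script above and save the document afterwards.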