The Sad History of the Microsoft POSIX Subsystem

When Windows NT was first being developed, one of the goals was to keep the kernel separate from the programming interface. NT was originally intended to be the successor to OS/2, but Microsoft also wanted it to run Windows 3.x applications and to meet 1980s-era DoD Orange Book and FIPS specifications so it could be sold to the defense market. As a result, Windows NT was developed as a multiuser platform with sophisticated discretionary access control capabilities, and it was implemented as a hybrid microkernel with three userland environments:

  • Windows (Win32)
  • OS/2
  • POSIX

Microsoft had a falling out with IBM over Win32 and the NT project split from OS/2. The team’s focus shifted to Win32 so much that the Client-Server Runtime Subsystem (CSRSS) that hosts the Win32 API became mandatory. The OS/2 and POSIX subsystems were never really completed, but they shipped with the first five versions of Windows NT, through Windows 2000. The OS/2 subsystem could only run OS/2 1.0 command-line programs and had no Presentation Manager support. The POSIX subsystem supported the POSIX.1 spec but provided no shells or UNIX-like environment of any kind. With the success of Win32 in the form of Windows 95, development of the OS/2 and POSIX subsystems ceased. They were entirely dead and gone from Windows XP and Windows Server 2003.

Meanwhile, around 1996, Softway Systems developed a UNIX-to-Windows NT porting system called OpenNT. OpenNT was built on the NT POSIX subsystem but fleshed it out into a usable UNIX environment. This was at a time when UNIX systems were hugely expensive. Softway used OpenNT to re-target a number of UNIX applications for US Federal agencies onto Windows NT. In 1998, OpenNT was renamed Interix. Softway Systems also eventually built a full replacement for the NT POSIX subsystem in order to implement system calls that the Microsoft POSIX subsystem didn’t support and to develop a richer libc, a single-rooted view of the file system and a functional gcc.

Microsoft acquired Softway and the Interix platform in 1999. Initially Interix 2.2 was made available as a fairly expensive paid add-on to Windows NT 4 and Windows 2000. Later it was incorporated as a component of Services for UNIX 3.0 and 3.5 (SFU) and SFU was made free of charge. When Interix became free, Microsoft removed the X11 server component that was previously bundled with Interix. In the wake of U.S. vs Microsoft, they did not want to defend lawsuits from the entrenched and expensive PC X Server industry. The X11 client libraries remained.

SFU/Interix 3.0 was released in early 2002, followed by SFU 3.5 less than two years later. Cool stuff got implemented, like fast pthreads, fork(), setuid, ptys and daemons with RC scripts, including inetd and sendmail among other things. InteropSystems ported OpenSSH, developed a high-performance port of Apache using pthreads, and produced proof-of-concept ports like GTK and GIMP among many other things. Hotmail even ran on Interix. And enterprising people did cool things like a Linux ELF binary loader on top of Interix.

I got into this stuff and built and donated ports to the SFU/SUA community, including cadaver, ClamAV, GnuMP, libtool, NcFTP, neon, rxvt and gnu whois. My company sponsored the port of OpenSSH to Interix 6.0 for Vista SUA (because it broke backwards compatibility with Interix 3.5 binaries). We ran Interix on all of our workstations and servers. We used it for management, remote access and to interop with clients who used Solaris, Linux and OS X on various projects.

Slowly Going Off the Rails

With Windows Server 2003 R2 (and only R2), Interix became a core operating system component, rebranded as “Subsystem for UNIX Applications” (SUA). Around this time, the core development team was re-formed in India rather than Redmond and some of the key Softway developers moved on to other projects like Monad (PowerShell) or left Microsoft. Interix for Windows Server 2003 R2 (aka Interix 5.2) was broken. It shipped with corrupt libraries and a number of new but flawed APIs, and it broke some previously stable APIs like select(). Also, related to the inclusion of Interix as an OS component, SP2 for Windows Server 2003 clobbers Interix 3.5 installations.

Things have been downhill from there. It’s not just that obvious things didn’t get implemented like a fully-functional poll() or updating binutils and gcc to something reasonably modern. The software suffered from severe bitrot.

One of the consequences of including SUA as an OS component has been a bifurcation of the “subsystem” from the “tools”. The subsystem consists of just a few files: psxss.exe, psxss.dll, posix.exe and psxrun.exe. This implements the runtime and a terminal environment but nothing else, not even libc. In order to get shells, PTYs and usable programs, you have to install the “Utilities and SDK for UNIX-based Applications” (aka tools), which is a sizable download. Apparently Microsoft has concerns about bundling GPL code onto the actual Windows media.

OK. This is a little weird but not a big deal, except that the development timeline of the tools is now completely out of whack with Windows releases. The tools for Vista were only available in beta when Vista went gold, and the version for Windows Server 2008 and Vista SP1 was not available until about a month after Vista SP1/Win2k8 was released. When Windows 7 was released in July 2009, no tools were available at all. They didn’t become available until 8 months later, in March 2010, and they contain no new features.

To top things off, while SFU 3.5 ran on all versions of NT 5.x, SUA only runs on Windows Server and the Enterprise and Ultimate client editions. SUA is not available on the Vista Business or Home editions or on the Windows 7 Professional and Home editions.

Is Interix Dead?

For some reason Microsoft seems to be ambivalent about this technology. On the one hand they bring it into the core of the OS and make it a “premium” feature that only Enterprise and Ultimate customers get to use and on the other they pare back development to almost nothing.

Interix has been supported with support forums and a ports tree maintained by InteropSystems, collectively known as the SUA Community, which operates with supplemental funding from Microsoft. The /Tools ports tree is the source for key packages not provided by Microsoft such as Bash, OpenSSH, BIND, cpio and a ton of libraries that Microsoft does not bundle. Microsoft has been increasingly reluctant to fund the SUA Community and has surveyed users on a number of occasions. The latest survey was very pointed and culminated with Microsoft cutting off funding and shuttering the SUA Community site on July 6th, 2010, but a few days later it was back online. I’m not sure how or why.

I have no inside knowledge but my gut says that Interix has lost internal support at Microsoft. It is being kept on life support because of loud complaints from important customers but it is going nowhere. I will be surprised if there is a Subsystem for UNIX-based Applications in Windows 8. I think the ambivalence is ultimately about an API war. At some level, the strategerizers have decided it is better to not dignify the UNIX API with support. I think the calculus is that people will still use Windows, but it chokes off oxygen for UNIX-like systems if it takes a lot of extra work to write cross-platform code for Windows and UNIX, the premise being that you write for Windows first because that’s where the market is. Furthermore, in a lot of business cases what is needed is Linux support, or Red Hat Linux version X support, in order to run something. I think Microsoft realizes that it is hard for Interix to beat Linux, which is why SUSE and Red Hat Linux can be virtualized under Hyper-V.

I also believe that Microsoft sees C/C++ APIs as “legacy”. I think they want to build an OS that is verifiably secure and more reliable by being based on fully managed code. The enormous library of software built for the Windows API is a huge legacy problem to manage in migrating to such a system. Layering POSIX/UNIX on top of that makes it worse.

Whatever the reason, it seems pretty clear that Interix is dying.

WONTFIX: select(2) in SUA 5.2 ignores timeout

With Windows Server 2003 R2, Microsoft incorporated Services for UNIX as a set of operating system components. The POSIX subsystem, Interix, is called the Subsystem for UNIX Applications (SUA) in Windows Server 2003 R2 and later.

Interix is the internal name of the Windows POSIX Subsystem (PSXSS), which is based on OpenBSD and operates as an independent sister subsystem to the Windows subsystem (aka CSRSS or Client/Server Runtime Subsystem).

With the first version of SUA, aka Interix 5.2, Microsoft added a bunch of new UNIX APIs. Unfortunately they broke some things that were working in the previous edition, which was called Interix 3.5 (aka Services for UNIX 3.5).

For example, select(2) is broken in SUA 5.2. It completely ignores the timeouts provided as arguments and returns immediately.

From the POSIX specification:

If the timeout parameter is not a null pointer, it specifies a maximum interval to wait for the selection to complete. If the specified time interval expires without any requested operation becoming ready, the function shall return. If the timeout parameter is a null pointer, then the call to pselect() or select() shall block indefinitely until at least one descriptor meets the specified criteria. To effect a poll, the timeout parameter should not be a null pointer, and should point to a zero-valued timespec structure.
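The three cases in that passage (a null pointer, a zero-valued timeout and a non-zero timeout) can be sketched in one small helper. This is only an illustration under my own assumptions: wait_readable is a made-up name, the negative-means-forever convention is mine, and the code assumes a system that provides <sys/select.h>. Note that select() takes a struct timeval, while the timespec mentioned in the quote applies to pselect().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Hypothetical helper: wait until fd is readable.
 *   timeout_ms <  0  -> null timeout pointer, block indefinitely
 *   timeout_ms == 0  -> zero-valued timeout, poll and return at once
 *   timeout_ms >  0  -> wait at most that many milliseconds
 * Returns select()'s result: 1 if readable, 0 on timeout, -1 on error.
 */
static int wait_readable(int fd, long timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    struct timeval *tvp = NULL;   /* null pointer means "no time limit" */

    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    if (timeout_ms >= 0) {
        tv.tv_sec  = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        tvp = &tv;
    }

    /* The first argument is the highest descriptor number plus one. */
    return select(fd + 1, &readfds, NULL, NULL, tvp);
}
```

Calling wait_readable(fd, 0) is the "effect a poll" case from the spec, and wait_readable(fd, -1) is the blocking case. The bug described here concerns the third case, where the interval is supposed to be honored.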

Here is a little test program. What should happen is that select() should block for 10 seconds every time through the loop.

#include <stdio.h>
#include <sys/types.h>
#include <sys/time.h>
#include <unistd.h>

int main(void)
{
	struct timeval time, pause;
	int i;

	printf("Testing select(2). Each pass through the loop should pause 10 seconds.\n\n");

	for( i=0; i<10; i++ )
	{
		/* Reset the timeout on every pass: POSIX allows select() to modify it. */
		pause.tv_sec  = 10;
		pause.tv_usec = 0;

		/* Insert a 10 second pause on every loop to test select(). */
		gettimeofday(&time, 0);
		printf("Current time: %ld\n", (long) time.tv_sec);
		time.tv_sec += 10L;
		printf("Add 10 seconds: %ld... And pause by calling select(2).\n", (long) time.tv_sec);
		(void) select(0, 0, 0, 0, &pause);
	}
	return 0;
}

What actually happens is select() returns immediately.

% uname -svrm
Interix 5.2 SP-9.0.3790.4125 EM64T
% gcc selecttest.c -o selecttest 
% ./selecttest 
Testing select(2). Each pass through the loop should pause 10 seconds. 

Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2). 
Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2). 
Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2). 
Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2). 
Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2). 
Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2). 
Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2). 
Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2). 
Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2). 
Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2).
% 

The test run above should have taken 100 seconds but it actually completes in less than 1 second. This is a problem because many UNIX applications will use select() as a timing mechanism. Some will use select() as a timer even if they aren’t doing IO.
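A quicker way to check for this, without waiting out the full 100 seconds, is to time a single short select() call. The following is a minimal sketch and not part of my original test: select_honors_timeout and elapsed_ms are names made up for illustration, the "at least half the interval" threshold is an arbitrary tolerance, and the code assumes <sys/select.h> is available. Since gettimeofday() reads the wall clock, a clock adjustment during the call could skew the result.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: whole milliseconds elapsed between two timestamps. */
static long elapsed_ms(const struct timeval *start, const struct timeval *end)
{
    return (long)(end->tv_sec - start->tv_sec) * 1000L
         + (long)(end->tv_usec - start->tv_usec) / 1000L;
}

/*
 * Hypothetical check: call select() with no descriptors and a timeout of
 * wait_ms, then see how long it really blocked.
 * Returns 1 if it blocked for at least half of wait_ms, 0 if it came back
 * early, and -1 if select() or gettimeofday() failed.
 */
static int select_honors_timeout(long wait_ms)
{
    struct timeval start, end, tv;

    assert(wait_ms >= 0);

    tv.tv_sec  = wait_ms / 1000;
    tv.tv_usec = (wait_ms % 1000) * 1000;

    if (gettimeofday(&start, NULL) != 0)
        return -1;
    if (select(0, NULL, NULL, NULL, &tv) < 0)
        return -1;
    if (gettimeofday(&end, NULL) != 0)
        return -1;

    return elapsed_ms(&start, &end) >= wait_ms / 2;
}
```

On a system where select() behaves, select_honors_timeout(200) should return 1 after about a fifth of a second. Given the behavior shown above, I would expect it to return 0 on SUA 5.2.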

There is good news and bad news.

The bad news is that MSFT told me that they won’t fix this issue. Their official guidance is to use sleep(2) and usleep(2) to control timeouts in Interix 5.2.
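For code that only uses select() as a timer, following that guidance could look something like the sketch below. This is my own illustration, not anything Microsoft supplied: pause_ms is a made-up name, and the _XOPEN_SOURCE define is an assumption meant to keep usleep() declared on strict compilers, since usleep() was dropped from later POSIX revisions. I have not verified how the Interix 5.2 headers react to that define.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 500   /* assumption: request the older XSI interfaces, including usleep() */
#endif
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical replacement for select(0, 0, 0, 0, &timeout) when it is
 * used purely as a timer: pause for ms milliseconds using only sleep()
 * and usleep(). Returns 0 on success, or -1 if usleep() fails.
 */
static int pause_ms(long ms)
{
    unsigned int secs;

    assert(ms >= 0);
    secs = (unsigned int)(ms / 1000);

    /* sleep() returns the seconds left over if a signal interrupts it. */
    while (secs > 0)
        secs = sleep(secs);

    /* The remainder is always below 1,000,000 microseconds. */
    return usleep((ms % 1000) * 1000);
}
```

In the test program above, the select() call would become pause_ms(10000). This only covers the pure-timer case. Code that really waits on descriptors with a timeout has no such easy substitute.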

The good news is that select(2) works properly on Interix 6.0 with Windows Vista and Windows Server 2008.