FIX: VirtualBox Host-Only Network Adapter Creates a Virtual “Public Network” Connection That Causes Windows to Disable Services


VirtualBox creates a “VirtualBox Host-Only Network” device which is essentially a loopback adapter for creating network connections between virtual machines and between the host and virtual machines. Unfortunately, it shows up to Windows as an unidentified public network. Connecting to a “public network” ratchets up your firewall and disables network discovery and SMB/CIFS network shares. This is kind of a big side effect of installing some VM software. Fortunately, Windows does have a way to mark a network device as virtual by creating a registry value.

The MSDN documentation for the *NdisDeviceType registry value explains:

The type of the device. The default value is zero, which indicates a standard networking device that connects to a network. Set *NdisDeviceType to NDIS_DEVICE_TYPE_ENDPOINT (1) if this device is an endpoint device and is not a true network interface that connects to a network. For example, you must specify NDIS_DEVICE_TYPE_ENDPOINT for devices such as smart phones that use a networking infrastructure to communicate to the local computer system but do not provide connectivity to an external network.

This PowerShell script finds the “VirtualBox Host-Only Ethernet Adapter” device entry in the registry and adds the *NdisDeviceType value of 1. After a reboot, the “VirtualBox Host-Only Ethernet Adapter” will no longer be monitored by the Network and Sharing Center.

# tell windows that VirtualBox Host-Only Network Adapter
# is not a true network interface that connects to a network
# see http://msdn.microsoft.com/en-us/library/ff557037(VS.85).aspx
pushd
echo 'Marking VirtualBox Host-Only Network Adapter as a virtual device.'
cd 'HKLM:\system\CurrentControlSet\control\class\{4D36E972-E325-11CE-BFC1-08002BE10318}'
ls ???? | where { ($_ | get-itemproperty -name driverdesc).driverdesc `
-eq 'VirtualBox Host-Only Ethernet Adapter' } |`
new-itemproperty -name '*NdisDeviceType' -PropertyType dword -value 1
echo 'After you reboot the VirtualBox Host-Only Network unidentified public network should be gone.'
popd
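
To confirm the change took, read the value back with the same filter the script uses:

PS> cd 'HKLM:\system\CurrentControlSet\control\class\{4D36E972-E325-11CE-BFC1-08002BE10318}'
PS> ls ???? | where { ($_ | get-itemproperty -name driverdesc).driverdesc `
    -eq 'VirtualBox Host-Only Ethernet Adapter' } | get-itemproperty -name '*NdisDeviceType'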



ASP.NET MVC 2.0 Undocumented Model String Property Binding Breaking Change

The default behavior of Model binding in ASP.NET MVC 1.0 was to initialize strings to string.Empty. However, MVC 2.0 defaults to initializing strings to null. Unfortunately, this is not listed as one of the breaking changes from MVC 1.0.

This can be a big problem if you have a substantial site built on MVC 1.0, import it into Visual Studio 2010, and upgrade to MVC 2.0. You may find that you are suddenly throwing NullReferenceException during model binding from code that was working in production.
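
As a hypothetical illustration (the model and controller are mine, not from any real app), code like this binds Author to string.Empty under MVC 1.0 but to null under MVC 2.0 when the form field is submitted empty:

public class CommentModel
{
    public string Author { get; set; }
}

public class CommentsController : Controller
{
    [HttpPost]
    public ActionResult Create(CommentModel comment)
    {
        // MVC 1.0: comment.Author is "" when the field is empty.
        // MVC 2.0: comment.Author is null, so Trim() throws NullReferenceException.
        string author = comment.Author.Trim();
        return Content(author);
    }
}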

Locating the Issue

ASP.NET MVC source code is available under the Ms-PL open source license, which means that it is possible to diff the two versions of MVC and figure out what changed without having to guess or disassemble anything.

The relevant change is in DefaultModelBinder.GetPropertyValue:

protected virtual object GetPropertyValue(ControllerContext controllerContext, ModelBindingContext bindingContext, PropertyDescriptor propertyDescriptor, IModelBinder propertyBinder) {
    object value = propertyBinder.BindModel(controllerContext, bindingContext);

    if (bindingContext.ModelMetadata.ConvertEmptyStringToNull && Object.Equals(value, String.Empty)) {
        return null;
    }

    return value;
}

MVC 2.0 adds some indirection to the binding behavior of DefaultModelBinder via a new ModelMetadata class. Of particular interest here is a property named ConvertEmptyStringToNull which, when true, causes a bound empty string to be returned as null instead.

It also turns out that when ModelMetadata is constructed, the value of ConvertEmptyStringToNull is set to true, and there is no code in MVC which changes that default value.


Restoring MVC 1.0 Model String Property Binding Behavior

I think the quickest solution is to set that ConvertEmptyStringToNull property to false by creating a new, custom default implementation of IModelBinder: inherit from DefaultModelBinder and set ConvertEmptyStringToNull to false on the ModelMetadata object before binding.

public sealed class EmptyStringModelBinder : DefaultModelBinder 
{
    public override object BindModel(ControllerContext controllerContext, ModelBindingContext bindingContext)
    {
        bindingContext.ModelMetadata.ConvertEmptyStringToNull = false;
        return base.BindModel(controllerContext, bindingContext);
    }
}

Then, in the Application_Start() method of Global.asax.cs, we set the DefaultBinder property of the ModelBinderDictionary, which is exposed through the Binders property of the static ModelBinders class, to our custom type so that our IModelBinder becomes the default instead of DefaultModelBinder.

protected void Application_Start()
{
	ModelBinders.Binders.DefaultBinder = new EmptyStringModelBinder();
	RegisterRoutes( RouteTable.Routes );
}

VirtualBox Seamless Mode Glitch with Ubuntu

I recently installed Ubuntu 10.04 Lucid Lynx in VirtualBox. I applied all updates and installed the VM additions. When I tried switching into seamless mode, I ran into a problem where all of the window chrome disappeared: the title bars were gone, as well as the bit of the status bar that includes the rounded corners. The windows were still there, but I was unable to move or resize them in seamless mode, which is not ideal.


Seamless Mode Only Works Correctly with All Effects Disabled

After fiddling for a while, I realized that Ubuntu had turned on some window effects after rebooting with the VM additions installed. Disabling all window effects solves the problem.


Seamless mode is cool, but it isn’t multi-monitor aware. I can’t drag guest OS windows off of the monitor where seamless mode was initiated. If I define the VM as having 2 monitors in VirtualBox, Ubuntu doesn’t see the second one. Also, the screen size is set to the resolution of my smaller monitor in seamless mode, which causes the Ubuntu desktop toolbars to float in the middle of my screen. Basically, the entire guest OS desktop has to be on one of my monitors, but I can move everything: press {right-Ctrl+L} to exit seamless mode, move the guest OS window to the desired monitor, and press {right-Ctrl+L} again to enter seamless mode on the new monitor. This is workable, but hopefully some day VirtualBox seamless mode will automagically present multiple monitors to the guest OS and let me drag windows around arbitrarily.

VirtualBox Triggers a Windows 7 Bug Causing It to Believe the File System Is Broken

Booting for 4 hours and another hour to go


VirtualBox 3.2.8 (and possibly other versions) running on Windows 7 or Windows Server 2008 R2 as the host operating system triggers a bug that causes Windows to believe that NTFS is corrupt. Chkdsk then runs on every boot. In fact, chkdsk is chasing a ghost: there is no corruption, but on my system chkdsk takes over 5 hours to run to completion.

I don’t believe that VirtualBox is doing anything hinky. It is just unlucky enough to trigger a regression in NTFS.

This is a known regression in Windows 7 in the NTFS file system.  It occurs when doing a superceding [sic] rename over a file that has an atomic oplock on it (atomic oplocks are a new feature in Windows 7).  The indexer uses atomic oplocks which is why it helped when you disabled the indexer.  Explorer also uses atomic oplocks which is why you are still seeing the issue.  When this occurs STATUS_FILE_CORRUPT is incorrectly returned and the volume is marked "dirty" which is a signal to the system that chkdsk needs to be run.  No actual corruption has occured [sic].

Neal Christiansen
NTFS Development Lead

Fortunately, a hotfix already exists. Applying KB982927 appears to fix the issue for me.

Mozilla Compatible Silverlight 4 Plugin Requires Loading DLLs from CWD

I visited a site yesterday in Chrome that tried to load Silverlight to provide a video player. I have KB2264107 installed and have globally disabled loading of DLLs from the current working directory in order to mitigate luring attacks against apps that use the default insecure DLL loading behavior of LoadLibrary(). Just like with the Java plugin for Mozilla, Chrome generated a big fat bonk dialog trying to load the DLLs that the Silverlight plugin uses. The specific missing file is agcore.dll, which is found in “C:\Program Files (x86)\Microsoft Silverlight\4.0.50524.0” on my system.

I tried creating a symlink to agcore.dll so that it is in the same directory as chrome.exe, which fixes the bonk, but Silverlight doesn’t work: I just end up with a black box where the movie player should be. I also tried adding the Silverlight directory to $env:path, which removed the bonk, but instead I got the “Install Microsoft Silverlight” button. I tried various combinations of symlinking DLLs and messing with $env:path, but I didn’t arrive at a combination that actually works.

The only solution that I found is to dial the CWDIllegalInDllSearch value for Chrome and Firefox back to 2 (DLLs are not allowed to load from the CWD if the CWD is a remote network location) instead of 0xffffffff (it also works to change this globally). I then have to hope that Firefox and Chrome are careful about how they use the CWD. I hope they set the CWD just for loading the installed plugins in “Program Files” but cannot be lured into loading some evil DLL from a spurious location when doing something like opening an HTML document on a USB stick.
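
For reference, here is roughly how to set that value per executable from PowerShell; a sketch, assuming the standard Image File Execution Options location:

PS> $ifeo = 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options'
PS> foreach( $exe in 'chrome.exe', 'firefox.exe' ) {
>>     if( -not (Test-Path "$ifeo\$exe") ) { New-Item -Path $ifeo -Name $exe | Out-Null }
>>     New-ItemProperty -Path "$ifeo\$exe" -Name CWDIllegalInDllSearch -PropertyType DWord -Value 2 -Force | Out-Null
>> }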


PS> cd 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options'
PS> Get-ItemProperty chrome.exe, firefox.exe | select pspath, cwdillegalindllsearch | fl


PSPath                : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVer
                        sion\Image File Execution Options\chrome.exe
CWDIllegalInDllSearch : 2

PSPath                : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVer
                        sion\Image File Execution Options\firefox.exe
CWDIllegalInDllSearch : 2




Java Built with Unsupported Old Compilers

When I turned off DLL loading from the current working directory to defeat DLL pre-loading luring attacks, one of the things I discovered was that the Java plug-in was broken in Firefox and Chrome. This problem of Java finding its C library is not new at all. The tubes are choked with posts and bug reports about getting various things that are dependent on Java to work when msvcr71.dll can’t be found. The new CWDIllegalInDllSearch = 0xFFFFFFFF option just exacerbates an existing deployment problem.

PS> cd 'C:\Program Files (x86)\Java\jre6\bin\new_plugin'
PS> dumpbin /dependents .\npjp2.dll
Microsoft (R) COFF/PE Dumper Version 10.00.30319.01
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file .\npjp2.dll

File Type: DLL

  Image has the following dependencies:

    USER32.dll
    GDI32.dll
    ADVAPI32.dll
    MSVCR71.dll
    KERNEL32.dll
    ole32.dll

  Summary

        6000 .data
        3000 .rdata
        1000 .reloc
        7000 .rsrc
        4000 .text

The root of the Mozilla-compatible browser Java plugin problem is npjp2.dll, which is dynamically linked to msvcr71.dll. Because it is a plugin, the DLL loading is done by Windows on behalf of the executable (Chrome or Firefox) rather than by the plugin. The new_plugin directory includes msvcr71.dll, which probably helped if a browser changed its CWD to the new_plugin directory when loading npjp2.dll, but with searching the CWD out of the picture, it doesn’t help.

Java simply doesn’t do a good job of loading its C runtime correctly on Windows. It’s not a new problem. I don’t understand why they don’t just change a switch on the compiler from /MD to /MT and statically link the C runtime into npjp2.dll. That would make the whole problem go away.
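
For reference, that is just the C runtime flag on the Microsoft compiler. A sketch of the idea (npjp2.c is a stand-in; this is not Sun’s actual build command):

REM dynamic CRT: the DLL imports msvcr71.dll and must locate it at load time
cl /LD /MD npjp2.c

REM static CRT: the runtime is linked into the DLL, so there is nothing to locate
cl /LD /MT npjp2.c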

There are also some other oddities here. There are two different C runtimes used in the 32-bit version of Java 6 for Windows. For some reason they are redistributing msvcr71.dll (the C runtime from Visual Studio .NET 2003) and also msvcrt.dll, which is supposed to be the name of the private C runtime used by Windows components. However, the msvcrt.dll in the Java directory is actually the C runtime from Visual Studio 6 or possibly a very old Platform SDK.

By implication, Oracle/Sun is using the C/C++ optimizing compilers from Visual Studio 6 (1998) and Visual Studio .NET 2003 to build the 32-bit version of Java. Holy cow, those are 5 and 3 versions back from the current compilers and between 12 and 7 years old. I’m pretty sure that Visual Studio 6 is no longer supported, and unfortunately both predate the side-by-side C runtime distribution system that started with Visual Studio 2005.

The x64 version of Java is even stranger. It links against a Microsoft x64 C runtime library called msvcrt.dll. Again, this is the name reserved for the private Windows platform C runtime, but this msvcrt.dll has file version 6.10.2207.0, either from a very old version of the Windows Platform SDK that provided x64 compilation support prior to Visual Studio 2005 or from a tool chain that was available by request (and is no longer available) for Visual Studio 2003.

It seems like the Java team has made a bit of a fetish of using really old compilers for the Microsoft platforms. I can understand that there is a risk of breaking stuff when upgrading a tool chain, but the Java build team has taken this to an extreme. There is a cost to testing a huge platform like Java when building with a new tool chain, and Sun was cash-constrained; although popular, Java SE didn’t really make them any money directly.

Good News for Java 7 (Probably)

It looks like the problem is being addressed. According to the OpenJDK 7 build instructions, Oracle is upgrading to the C/C++ compiler from Visual Studio 2010.

BEGIN WARNING: At this time (Spring/Summer 2010) JDK 7 is starting a transition to use the newest VS2010 Microsoft compilers. These build instructions are updated to show where we are going. We have a QA process to go through before official builds actually use VS2010. So for now, official builds are still using VS2003. No other compilers are known to build the entire JDK, including non-open portions. So for now you should be able to build with either VS2003 or VS2010. We do not guarantee that VS2008 will work, although there is sufficient makefile support to make at least basic JDK builds plausible. Visual Studio 2010 Express compilers are now able to build all the open source repositories, but this is 32 bit only. To build 64 bit Windows binaries use the 7.1 Windows SDK. END WARNING.

The 32-bit OpenJDK Windows build requires Microsoft Visual Studio C++ 2010 (VS2010) Professional Edition or Express compiler. The compiler and other tools are expected to reside in the location defined by the variable VS100COMNTOOLS which is set by the Microsoft Visual Studio installer.

So maybe in 2011 Java will have its C runtime library sorted out by virtue of having a supported global mechanism to register the Visual Studio 2010 C runtime.

Replace Task Manager with Process Explorer x64

Process Explorer has a “Replace Task Manager” option. On x64 Windows, this doesn’t work right. Instead of replacing Task Manager, it ensures that Task Manager can never run.

This feature works through an image hijack: Process Explorer registers itself as the debugger for taskmgr.exe under the Image File Execution Options registry key. It doesn’t actually act as a debugger; instead, it just launches itself whenever Task Manager is invoked.

Here is the garbage that gets written by default.


The Debugger value should be the fully qualified path to where procexp.exe lives. Unfortunately, procexp wrote garbage in there. Setting the value by hand fixes it:

Set-ItemProperty 'HKLM:\software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\taskmgr.exe' -name Debugger -value "C:\Program Files\Sysinternals\procexp.exe"
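
Reading the value back confirms the fix:

PS> Get-ItemProperty 'HKLM:\software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\taskmgr.exe' | select Debugger

Debugger
--------
C:\Program Files\Sysinternals\procexp.exe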


Now Task Manager is magically Process Explorer.

 


PowerShell’s Object Pipeline Corrupts Piped Binary Data

Yesterday I used curl to download a huge database backup from a remote server. Curl is UNIX-ey: by default, it streams its output to stdout, and you redirect that stream to a pipe or file like this:

$ curl sftp://server.somewhere.com/somepath/file > file

The above is essentially what I did from inside of a PowerShell session. After a couple of hours, I had my huge download and discovered that the database backup was corrupt. Then I realized that the file I ended up with was a little over 2x the size of the original file.

What happened?

Long story short: this is a consequence of the object pipeline in PowerShell, and you should never pipe raw binary data in PowerShell because it will be corrupted.

The Gory Details

You don’t have to work with a giant file. A small binary file will also be corrupted. I took a deeper look at this using a small PNG image.

PS> curl sftp://ftp.noserver.priv/img.png > img1.png
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 46769  100 46769    0     0  19986      0  0:00:02  0:00:02 --:--:-- 74950

(FYI: curl prints the progress of the download to stderr, so you see something on the console even though stdout is redirected to the file.)

This is essentially what I did with my big download, and it yields a file that is more than 2x the size of the original. My theory at this point was that since String objects in .NET are always Unicode, the bytes were being doubled as a consequence of an implicit conversion to UTF-16.

Using the > operator in PowerShell is the same thing as piping to the Out-File cmdlet. Out-File has some encoding options. The interesting one is OEM:

"OEM" uses the current original equipment manufacturer code page identifier for the operating system.

That is about as close as Out-File gets to writing raw bytes.

PS> curl sftp://ftp.noserver.priv/img.png | out-file -encoding oem img2.png
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 46769  100 46769    0     0  26304      0  0:00:01  0:00:01 --:--:-- 74950

I was clearly on to something because this almost works: the file is just slightly larger than the original.

Just to prove that my build of curl isn’t broken, I also used the -o (--output) option.

PS> curl sftp://ftp.noserver.priv/img.png -o img3.png
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 46769  100 46769    0     0  25839      0  0:00:01  0:00:01 --:--:-- 76796

Here’s the result. You can see by the file sizes and MD5 hashes that img1.png and img2.png are corrupt but img3.png is identical to the original img.png.

PS> ls img*.png | select name, length | ft -AutoSize

Name     Length
----     ------
img.png   46769
img1.png  94168
img2.png  47083
img3.png  46769


PS> md5 img*.png
MD5 (img.png) = 21d5d61e15a0e86c61f5ab1910d1c0bf
MD5 (img1.png) = eb5a1421bcc4e3bea1063610b26e60f9
MD5 (img2.png) = 03b9b691f86404e9538a9c9c668c50ed
MD5 (img3.png) = 21d5d61e15a0e86c61f5ab1910d1c0bf

Hrm. What’s going on here?

Let’s look at a diff of img.png and img1.png, which was the result of using the > operator to redirect the stdout of curl to file.


The big thing to see here is that there are a lot of extra bytes: the original bytes have been widened to two-byte UTF-16 code units, and FF FE has been added as the first two bytes. FF FE is the little-endian encoding of the byte order mark (U+FEFF) for UTF-16. That confirms my theory that internally PowerShell converted the data to Unicode.
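
You can check those first two bytes from PowerShell itself by reading the file as raw bytes:

PS> Get-Content img1.png -Encoding Byte -TotalCount 2 | % { '0x{0:X2}' -f $_ }
0xFF
0xFE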

I can also create the same behavior by using the Get-Content cmdlet (aliased as cat) to redirect binary data to a file.

PS> cat img1.png > img4.png
PS> ls img1.png, img4.png | select name, length | ft -AutoSize

Name     Length
----     ------
img1.png  94168
img4.png  94168

So what is going on inside that pipeline?

PS> (cat .\img.png).GetType() | select name

Name
----
Object[]


PS> cat .\img.png | %{ $_.GetType() } | group name | select count, name | ft -AutoSize

Count Name
----- ----
  315 String

The file is being converted into an Object array of 315 elements. Each element of the array contains a String object. Since the internal data type of String is Unicode, sometimes referred to loosely as “double-byte” characters, the total size of that data is roughly doubled.

Using the OEM text encoder almost converts the data back, but not quite. What is going wrong? Time to look at a diff of img.png and img2.png, which was written with the OEM text encoder.


What you see here is that a lot of 0x0D bytes have been inserted in front of all of the 0x0A bytes.

PS> (ls .\img2.png ).Length - (ls .\img.png ).Length
314

There are actually 314 of these 0x0D bytes added. What the heck is 0x0D? It is Carriage Return (CR). 0x0A is Line Feed (LF). In a Windows text file, each line ends with the sequence CRLF. 314 is exactly the number of CRLF sequences you need to join a 315-element array of strings into a text file with Windows line endings.

Here’s what is happening. PowerShell is making some assumptions:

  1. Anything streaming in as raw bytes is assumed to be text.
  2. The text is converted into an array by splitting on bytes that would indicate an end of line in a text file.
  3. The text is reconstituted by Out-File using the standard Windows end-of-line sequence.

While this works just fine with any kind of text, it is virtually guaranteed to corrupt any binary data. With the default text encoding you get a doubling of the original bytes and a bunch of new 0x0D bytes, too. The corruption fundamentally happens when the data is split into a string array. Using a binary encoder at the end of the pipeline doesn’t put the data back correctly because it puts CRLF at the end of every array element. Since the original line-ending bytes were thrown away when the data was split, there is no way to know whether a given line originally ended in LF or CRLF, so a Windows-to-Unix line-ending conversion will not fix the file either. There is no way to put Humpty Dumpty back together again.

To Sum Up, Just Don’t Do It

The moral is that it is never safe to pipe raw binary data in PowerShell. Pipes in PowerShell are for objects and for text that can safely be automagically converted to a string array. You need to be cognizant of this and use Stream objects to manipulate binary files.

When using curl with PowerShell, never, never redirect to file with >. Always use the -o (--output) <file> switch. If you need to stream the output of curl to another utility (say gpg), then you need to sub-shell into cmd for the binary streaming or use temporary files.
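
If you do need to move binary data around from within PowerShell, drop down to .NET Stream objects, which never convert anything to strings. A minimal sketch (the paths are hypothetical; note that .NET resolves relative paths against the process working directory, so absolute paths are safer):

PS> $in  = [System.IO.File]::OpenRead('C:\temp\img.png')
PS> $out = [System.IO.File]::Create('C:\temp\img-copy.png')
PS> $buf = New-Object byte[] (65536)
PS> while( ($n = $in.Read($buf, 0, $buf.Length)) -gt 0 ) { $out.Write($buf, 0, $n) }
PS> $in.Close(); $out.Close()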

WONTFIX: select(2) in SUA 5.2 ignores timeout

With Windows Server 2003 R2, Microsoft incorporated Services for UNIX as a set of operating system components. The POSIX subsystem, Interix, is called the Subsystem for UNIX Applications (SUA) in Windows Server 2003 R2 and later.

Interix is the internal name of the Windows POSIX subsystem (PSXSS), which is based on OpenBSD and operates as an independent sister subsystem alongside the Windows subsystem (aka CSRSS, the Client/Server Runtime Subsystem).

With the first version of SUA, aka Interix 5.2, Microsoft added a bunch of new UNIX APIs. Unfortunately, they broke some things that worked in the previous edition, Interix 3.5 (aka Services for UNIX 3.5).

For example, select(2) is broken in SUA 5.2. It completely ignores the timeout provided as an argument and returns immediately.

From the POSIX specification:

If the timeout parameter is not a null pointer, it specifies a maximum interval to wait for the selection to complete. If the specified time interval expires without any requested operation becoming ready, the function shall return. If the timeout parameter is a null pointer, then the call to pselect() or select() shall block indefinitely until at least one descriptor meets the specified criteria. To effect a poll, the timeout parameter should not be a null pointer, and should point to a zero-valued timespec structure.

Here is a little test program. What should happen is that select() blocks for 10 seconds on every pass through the loop.

#include <stdio.h>
#include <sys/time.h>

int main()
{
	printf("Testing select(2). Each pass through the loop should pause 10 seconds.\n\n");

	struct timeval time, pause;
	pause.tv_sec  = 10;
	pause.tv_usec = 0;
	int i;

	for( i = 0; i < 10; i++ )
	{
		// insert a 10 second pause on every loop to test select()
		gettimeofday( &time, 0 );
		printf( "Current time: %d\n", time.tv_sec );
		time.tv_sec += 10L;
		printf( "Add 10 seconds: %d... And pause by calling select(2).\n", time.tv_sec );
		(void) select( 0, 0, 0, 0, &pause );
	}
	return 0;
}

What actually happens is select() returns immediately.

% uname -svrm
Interix 5.2 SP-9.0.3790.4125 EM64T
% gcc selecttest.c -o selecttest 
% ./selecttest 
Testing select(2). Each pass through the loop should pause 10 seconds. 

Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2). 
Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2). 
Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2). 
Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2). 
Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2). 
Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2). 
Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2). 
Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2). 
Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2). 
Current time: 1142434664 
Add 10 seconds: 1142434674... And pause by calling select(2).
% 

The test run above should have taken 100 seconds, but it actually completes in less than 1 second. This is a problem because many UNIX applications use select() as a timing mechanism; some use it as a timer even when they aren’t doing I/O.

There is good news and bad news.

The bad news is that MSFT told me that they won’t fix this issue. Their official guidance is to use sleep(2) and usleep(2) to control timeouts on Interix 5.2.
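
So a select()-based pause has to be rewritten along these lines; a sketch (pause_for is my name, not an Interix API):

#include <unistd.h>

// sleep without select(2), per the official guidance for SUA 5.2
static void pause_for( unsigned int seconds, unsigned int microseconds )
{
	sleep( seconds );        // whole seconds
	usleep( microseconds );  // remainder; must be less than 1000000
}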

The good news is that select(2) works properly on Interix 6.0 with Windows Vista and Windows Server 2008.

My First Chrome Extension, part 2

I was fixated on how to get Chrome to pass a feed URL to my registered feed reader, Outlook 2010. After I figured out how to pass the message, it was nearly a one-liner to modify an existing extension. How many things could be wrong with one line of code that seems to be working?

url = url.replace( "%f", feedUrl.replace( "http:", "feed:" ) );

There are three bugs that I see in here.

1: URLs can legitimately include %f

From RFC 1738:

In addition, octets may be encoded by a character triplet consisting of the character "%" followed by the two hexadecimal digits (from "0123456789ABCDEF") which forming the hexadecimal value of the octet. (The characters "abcdef" may also be used in hexadecimal encodings.)

In practice this would affect URLs that include these characters:

  • ð –> %F0
  • ñ –> %F1
  • ò –> %F2
  • ó –> %F3
  • ô –> %F4
  • õ –> %F5
  • ö –> %F6
  • ÷ –> %F7
  • ø –> %F8
  • ù –> %F9
  • ú –> %FA
  • û –> %FB
  • ü –> %FC
  • ý –> %FD
  • þ –> %FE
  • ÿ –> %FF

The simple solution is to avoid using a macro character sequence that includes valid hexadecimal digits. Sticking with a single ASCII character, it could be anything from ‘g’ through ‘z’ except ‘s’, which is already being used.

2: A feed URL could have “http:” anywhere

The scheme for a URI is always at the beginning, but it is common for one URI to encode another one inside it. That is exactly how you pass a feed URL to Google Reader.

Imagine a feed URL like “http://feedify.example/q=http://mysite.somewhere/page”. In order to pass this to the feed scheme handler, I need to replace the http: scheme with feed:, but a plain string pattern like “http:” is not tied to the scheme at all; it can just as easily match an http: embedded later in the URL, which would break the URL.

The solution is to use a regular expression so that I only replace the http: scheme rather than any other occurrence of the pattern “http:”. In regular expression syntax, the ^ character anchors the pattern to the beginning of the string.

^http:

In JavaScript, the shorthand for a Regular Expression object is a pair of / characters:

/^http:/

3: URI schemes are not case sensitive

The replace() method of the JavaScript String object is case sensitive. My code would not work on a URL like “HTTP://mysite.example/stuff”.

Fortunately, the Regular Expression object in JavaScript has a case-insensitive match mode. You set this with the ‘i’ option:

/^http:/i

3 Bugs in 1 Line: Fixed

url = url.replace( "%g", feedUrl.replace( /^http:/i, "feed:" ) );
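
A quick sanity check of the fixed line; the reader URL template here is hypothetical, not the extension’s real one:

// hypothetical template using the new %g macro
var url = "myreader://subscribe?u=%g";
var feedUrl = "HTTP://feedify.example/q=http://mysite.somewhere/page";
url = url.replace( "%g", feedUrl.replace( /^http:/i, "feed:" ) );
// url is now "myreader://subscribe?u=feed://feedify.example/q=http://mysite.somewhere/page"
// the uppercase scheme was matched and the embedded http: URL was left alone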
