Custom JsonResult Class for ASP.Net MVC to Avoid MaxJsonLength Exceeded Exception

Shortly before Christmas, I was working on an application that sent a large data set to jqGrid using an Ajax JSON stream in ASP.Net MVC. We were trying out a “get everything once” model where all of the I/O happens when jqGrid is initialized or reset and then all of the data is available on the client for fast sort and filter. It was working well with test data of ~5-6K rows until a colleague checked in a change that added a new column. Suddenly my jqGrid was blank while hers (with a different, somewhat smaller, set of test data) was fine. Usually this sort of behavior from jqGrid indicates that the JSON was broken. Sure enough, when I fired up my Fiddler HTTP debugger, I saw an error 500 for the JSON Ajax query.

Server Error in ‘/’ Application.


Error during serialization or deserialization using the JSON JavaScriptSerializer. The length of the string exceeds the value set on the maxJsonLength property.

Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code. 

Exception Details: System.InvalidOperationException: Error during serialization or deserialization using the JSON JavaScriptSerializer. The length of the string exceeds the value set on the maxJsonLength property.

Source Error: 

An unhandled exception was generated during the execution of the current web request. Information regarding the origin and location of the exception can be identified using the exception stack trace below.

Stack Trace: 

[InvalidOperationException: Error during serialization or deserialization using the JSON JavaScriptSerializer. The length of the string exceeds the value set on the maxJsonLength property.]
   System.Web.Script.Serialization.JavaScriptSerializer.Serialize(Object obj, StringBuilder output, SerializationFormat serializationFormat) +551497
   System.Web.Script.Serialization.JavaScriptSerializer.Serialize(Object obj, SerializationFormat serializationFormat) +74
   System.Web.Script.Serialization.JavaScriptSerializer.Serialize(Object obj) +6
   System.Web.Mvc.JsonResult.ExecuteResult(ControllerContext context) +341
   System.Web.Mvc.ControllerActionInvoker.InvokeActionResult(ControllerContext controllerContext, ActionResult actionResult) +10
   System.Web.Mvc.<>c__DisplayClass14.<InvokeActionResultWithFilters>b__11() +20
   System.Web.Mvc.ControllerActionInvoker.InvokeActionResultFilter(IResultFilter filter, ResultExecutingContext preContext, Func`1 continuation) +251
   System.Web.Mvc.<>c__DisplayClass16.<InvokeActionResultWithFilters>b__13() +19
   System.Web.Mvc.ControllerActionInvoker.InvokeActionResultWithFilters(ControllerContext controllerContext, IList`1 filters, ActionResult actionResult) +178
   System.Web.Mvc.ControllerActionInvoker.InvokeAction(ControllerContext controllerContext, String actionName) +314
   System.Web.Mvc.Controller.ExecuteCore() +105
   System.Web.Mvc.ControllerBase.Execute(RequestContext requestContext) +39
   System.Web.Mvc.ControllerBase.System.Web.Mvc.IController.Execute(RequestContext requestContext) +7
   System.Web.Mvc.<>c__DisplayClass8.<BeginProcessRequest>b__4() +34
   System.Web.Mvc.Async.<>c__DisplayClass1.<MakeVoidDelegate>b__0() +21
   System.Web.Mvc.Async.<>c__DisplayClass8`1.<BeginSynchronous>b__7(IAsyncResult _) +12
   System.Web.Mvc.Async.WrappedAsyncResult`1.End() +59
   System.Web.Mvc.MvcHandler.EndProcessRequest(IAsyncResult asyncResult) +44
   System.Web.Mvc.MvcHandler.System.Web.IHttpAsyncHandler.EndProcessRequest(IAsyncResult result) +7
   System.Web.CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute() +8682542
   System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously) +155


Version Information: Microsoft .NET Framework Version:2.0.50727.4952; ASP.NET Version:2.0.50727.4955

 

The error is inside of a call from MVC to BCL

It turns out that the exception happens inside of JsonResult.ExecuteResult(ControllerContext context). What is going on in there? Well, fortunately, ASP.Net MVC code is open source (MS-PL) and the .NET Framework class library source code is available for reference as well. Let’s take a look.

The meat of JsonResult.ExecuteResult(ControllerContext)

if (Data != null) {
    JavaScriptSerializer serializer = new JavaScriptSerializer();
    response.Write(serializer.Serialize(Data));
}

The meat of JavaScriptSerializer.Serialize(object obj)

public string Serialize(object obj) {
    return Serialize(obj, SerializationFormat.JSON);
}

private string Serialize(object obj, SerializationFormat serializationFormat) {
    StringBuilder sb = new StringBuilder(); 
    Serialize(obj, sb, serializationFormat); 
    return sb.ToString();
}

internal void Serialize(object obj, StringBuilder output, SerializationFormat serializationFormat) {
    SerializeValue(obj, output, 0, null, serializationFormat); 
    // DevDiv Bugs 96574: Max JSON length does not apply when serializing to Javascript for ScriptDescriptors
    if (serializationFormat == SerializationFormat.JSON && output.Length > MaxJsonLength) {
        throw new InvalidOperationException(AtlasWeb.JSON_MaxJsonLengthExceeded);
    } 
}

JavaScriptSerializer completely serializes object obj into StringBuilder output and then, having allocated that memory, checks the size of the StringBuilder and if it is larger than the MaxJsonLength property it throws an InvalidOperationException. JsonResult just creates a new JavaScriptSerializer and uses it so there is no way to change the default MaxJsonLength when using JsonResult in MVC. Since the memory is allocated before the InvalidOperationException is thrown, I’m not really clear what the point of MaxJsonLength is this deep in the framework. Surely whatever is going to use the JSON string would be in a better position to decide if the string returned by JavaScriptSerializer.Serialize() was too long to use?
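
To see the limit in isolation, here is a minimal console sketch that serializes a string longer than the serializer's default MaxJsonLength (2,097,152 characters) and then retries with the limit raised. It assumes a .NET Framework project that references System.Web.Extensions. The class name MaxJsonLengthDemo, the helper ExceedsLimit and the 3,000,000-character payload are made up for illustration.

```csharp
using System;
using System.Web.Script.Serialization; // needs a reference to System.Web.Extensions

public static class MaxJsonLengthDemo
{
    // Returns true if Serialize() throws InvalidOperationException, which is
    // the exception type raised by the length check shown above.
    public static bool ExceedsLimit(object data, int maxJsonLength)
    {
        var serializer = new JavaScriptSerializer { MaxJsonLength = maxJsonLength };
        try
        {
            serializer.Serialize(data);
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        // Hypothetical payload, chosen only to be larger than the default limit.
        string payload = new string('x', 3000000);
        int defaultLimit = new JavaScriptSerializer().MaxJsonLength;

        Console.WriteLine("Default MaxJsonLength: {0}", defaultLimit);
        Console.WriteLine("Fails at default limit: {0}", ExceedsLimit(payload, defaultLimit));
        Console.WriteLine("Fails at int.MaxValue:  {0}", ExceedsLimit(payload, int.MaxValue));
    }
}
```

The point of the sketch is that the limit is a property of the serializer instance, so whoever constructs the JavaScriptSerializer decides the limit. JsonResult constructs its own, which is why the caller has no say.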

Anyway, now that the problem is isolated, we can work on a solution. We need to implement our own ActionResult that will generate JSON while allowing the caller to twiddle the knobs on JavaScriptSerializer.

LargeJsonResult ActionResult class

using System;
using System.Web.Script.Serialization;

namespace System.Web.Mvc
{
    public class LargeJsonResult : JsonResult
    {
        const string JsonRequest_GetNotAllowed = "This request has been blocked because sensitive information could be disclosed to third party web sites when this is used in a GET request. To allow GET requests, set JsonRequestBehavior to AllowGet.";
        public LargeJsonResult()
        {
            MaxJsonLength = 1024000;
            RecursionLimit = 100;
        }

        public int MaxJsonLength { get; set; }
        public int RecursionLimit { get; set; }

        public override void ExecuteResult( ControllerContext context )
        {
            if( context == null )
            {
                throw new ArgumentNullException( "context" );
            }
            if( JsonRequestBehavior == JsonRequestBehavior.DenyGet &&
                String.Equals( context.HttpContext.Request.HttpMethod, "GET", StringComparison.OrdinalIgnoreCase ) )
            {
                throw new InvalidOperationException( JsonRequest_GetNotAllowed );
            }

            HttpResponseBase response = context.HttpContext.Response;

            if( !String.IsNullOrEmpty( ContentType ) )
            {
                response.ContentType = ContentType;
            }
            else
            {
                response.ContentType = "application/json";
            }
            if( ContentEncoding != null )
            {
                response.ContentEncoding = ContentEncoding;
            }
            if( Data != null )
            {
                JavaScriptSerializer serializer = new JavaScriptSerializer() { MaxJsonLength = MaxJsonLength, RecursionLimit = RecursionLimit };
                response.Write( serializer.Serialize( Data ) );
            }
        }
    }
}

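You can use return new LargeJsonResult(){ Data = data } from any Action method where you would have used return Json(data). You also have direct control over the MaxJsonLength and RecursionLimit properties of JavaScriptSerializer. Note that the constructor default of 1,024,000 characters is lower than JavaScriptSerializer's own default of 2,097,152, so set MaxJsonLength explicitly for large payloads.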

return new LargeJsonResult() { Data = output, MaxJsonLength = int.MaxValue };

Is F# Math Really Faster than C#?

I ran across an article claiming that F# was about an order of magnitude faster at calculating a basic math algorithm. Can this be true?

Sigmoid in C#

public static float Sigmoid(float value) 
{
    return 1.0f / (1.0f + (float) Math.Exp(-value));
}
//csc -o -debug-
.method public static float32 Sigmoid(float32 'value') cil managed
{
    .maxstack 8
    L_0000: ldc.r8 1
    L_0009: ldc.r8 1
    L_0012: ldarg.0 
    L_0013: neg 
    L_0014: conv.r8 
    L_0015: call float64 [mscorlib]System.Math::Exp(float64)
    L_001a: add 
    L_001b: div 
    L_001c: conv.r4 
    L_001d: ret 
}

Sigmoid in F#

let Sigmoid value = 1.0f/(1.0f + exp(-value));
//fsc --debug- --optimize+
.method public static float32 Sigmoid(float32 'value') cil managed
{
    .maxstack 5
    .locals init (
        [0] float32 num)
    L_0000: ldc.r4 1
    L_0005: ldc.r4 1
    L_000a: ldarg.0 
    L_000b: neg 
    L_000c: stloc.0 
    L_000d: ldloc.0 
    L_000e: conv.r8 
    L_000f: call float64 [mscorlib]System.Math::Exp(float64)
    L_0014: conv.r4 
    L_0015: add 
    L_0016: div 
    L_0017: ret 
}

The IL generated is nearly the same. The F# version allocates a little less memory using float32 variables on the stack where C# uses float64 and generally manages with a smaller stack, but the basic math operations are the same. Can these differences make an order of magnitude performance difference? Short answer is no.

Benchmarks are Tricky

C# sigmoid algorithm benchmark

// (c) Rinat Abdullin
// http://abdullin.com/journal/2009/1/5/caching-activation-function-is-not-worth-it.html
// sigmoidcs.cs

using System;
using System.Diagnostics;

static class App
{
	const float Scale = 320.0f;
	const int Resolution = 2047;
	const float Min = -Resolution/Scale;
	const float Max = Resolution/Scale;

	static readonly float[] _lut = InitLut();

	static float[] InitLut()
	{
	  var lut =  new float[Resolution + 1];
	  for (int i = 0; i < Resolution + 1; i++)
	  {
	    lut[i] = (float) (1.0/(1.0 + Math.Exp(-i/Scale)));
	  }
	  return lut;
	}

	static float Sigmoid1(double value)
	{
	  return (float) (1.0/(1.0 + Math.Exp(-value)));
	}

	static float Sigmoid2(float value)
	{
	  if (value <= Min) return 0.0f;
	  if (value >= Max) return 1.0f;
	  var f = value*Scale;
	  if (value >= 0) return _lut[(int) (f + 0.5f)];
	  return 1.0f - _lut[(int) (0.5f - f)];
	}

	static float TestError()
	{
	  var emax = 0.0f;
	  for (var x = -10.0f; x < 10.0f; x += 0.000001f)
	  {
	    var v0 = Sigmoid1(x);
	    var v1 = Sigmoid2(x);

	    var e = Math.Abs(v1 - v0);
	    if (e > emax) emax = e;
	  }
	  return emax;
	}

	static double TestPerformancePlain()
	{
	  var sw = new Stopwatch();
	  sw.Start();
	  for (int i = 0; i < 10; i++)
	  {
	    for (float x = -5.0f; x < 5.0f; x += 0.000001f)
	    {
	      Sigmoid1(x);
	    }
	  }

	  return sw.Elapsed.TotalMilliseconds;
	}

	static double TestPerformanceOfLut()
	{
	  var sw = new Stopwatch();
	  sw.Start();
	  for (int i = 0; i < 10; i++)
	  {
	    for (float x = -5.0f; x < 5.0f; x += 0.000001f)
	    {
	      Sigmoid2(x);
	    }
	  }
	  return sw.Elapsed.TotalMilliseconds;
	}

	static void Main()
	{
	  var emax = TestError();
	  var t0 = TestPerformancePlain();
	  var t1 = TestPerformanceOfLut();

	  Console.WriteLine("Max detected deviation using LUT: {0}", emax);
	  Console.WriteLine("10^7 iterations using Sigmoid1() took {0} ms", t0);
	  Console.WriteLine("10^7 iterations using Sigmoid2() took {0} ms", t1);
	}
}

F# sigmoid algorithm benchmark

// (c) Rinat Abdullin
// http://abdullin.com/journal/2009/1/6/f-has-better-performance-than-c-in-math.html
// sigmoidfs.fs

#light

let Scale = 320.0f;
let Resolution = 2047;

let Min = -single(Resolution)/Scale;
let Max = single(Resolution)/Scale;

let range step a b =
  let count = int((b-a)/step);
  seq { for i in 0 .. count -> single(i)*step + a };

let lut = [| 
  for x in 0 .. Resolution ->
    single(1.0/(1.0 +  exp(-double(x)/double(Scale))))
  |]

let sigmoid1 value = 1.0f/(1.0f + exp(-value));

let sigmoid2 v = 
  if (v <= Min) then 0.0f;
  elif (v>= Max) then 1.0f;
  else
    let f = v * Scale;
    if (v>0.0f) then lut.[int (f + 0.5f)]
    else 1.0f - lut.[int(0.5f - f)];

let getError f = 
  let test = range 0.00001f -10.0f 10.0f;
  let errors = seq { 
    for v in test -> 
      abs(sigmoid1(single(v)) - f(single(v)))
  }
  Seq.max errors;

open System.Diagnostics;

let test f = 
  let sw = Stopwatch.StartNew(); 
  let mutable m = 0.0f;
  let result = 
    for t in 1 .. 10 do
      for x in 1 .. 1000000 do
        m <- f(single(x)/100000.0f-5.0f);
  sw.Elapsed.TotalMilliseconds;

printf "Max deviation is %f\n" (getError sigmoid2)
printf "10^7 iterations using sigmoid1: %f ms\n" (test sigmoid1)
printf "10^7 iterations using sigmoid2: %f ms\n" (test sigmoid2)
PS> csc -o -debug- .\sigmoidcs.cs
Microsoft (R) Visual C# 2010 Compiler version 4.0.30319.1
Copyright (C) Microsoft Corporation. All rights reserved.

PS> fsc --debug- --optimize+  .\sigmoidfs.fs
Microsoft (R) F# 2.0 Compiler build 4.0.30319.1
Copyright (c) Microsoft Corporation. All Rights Reserved.
PS> .\sigmoidcs.exe
Max detected deviation using LUT: 0.001663984
10^7 iterations using Sigmoid1() took 2644.5935 ms
10^7 iterations using Sigmoid2() took 510.9379 ms
PS> .\sigmoidfs.exe
Max deviation is 0.001664
10^7 iterations using sigmoid1: 403.974300 ms
10^7 iterations using sigmoid2: 124.520100 ms

Wow. That’s 404ms for F# and 2,645ms for C#. That’s a huge difference. Why?

We already established that the Sigmoid1() algorithm compiles to very similar IL in both languages. What else could be different?

[Image: fsharp-test-func]

Wait a minute.

let result = 
    for t in 1 .. 10 do
      for x in 1 .. 1000000 do
        m <- f(single(x)/100000.0f-5.0f);

The F# loop is not equivalent to the C# version. In the F# version the counter is an int and in the C# version the counter is a float. (Also, the F# version is passing a delegate to the test function.)

for (int i = 0; i < 10; i++)
{
	for (float x = -5.0f; x < 5.0f; x += 0.000001f)
	{
		Sigmoid1(x);
	}
}
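
It is also worth checking how many iterations each loop actually executes, since the two benchmarks step differently: the C# loop advances x by 0.000001 across a range of 10, while the F# loop counts from 1 to 1,000,000. Here is a minimal sketch that counts the passes of each inner loop. The class and method names are made up for illustration, and the exact float count depends on single-precision rounding, so no specific output is claimed.

```csharp
using System;

public static class IterationCount
{
    // Counts passes of the float-stepped inner loop used in the C# benchmark.
    public static long CountFloatLoop()
    {
        long n = 0;
        for (float x = -5.0f; x < 5.0f; x += 0.000001f)
        {
            n++;
        }
        return n;
    }

    // Counts passes of the int-stepped inner loop used in the F# benchmark.
    public static long CountIntLoop()
    {
        long n = 0;
        for (int j = 1; j <= 1000000; j++)
        {
            n++;
        }
        return n;
    }

    public static void Main()
    {
        Console.WriteLine("float-stepped inner loop: {0} passes", CountFloatLoop());
        Console.WriteLine("int-stepped inner loop:   {0} passes", CountIntLoop());
    }
}
```

A step of 0.000001 over a range of 10 works out to roughly ten million passes, against exactly one million for the int loop. If the counts confirm that, the original C# benchmark did about ten times the work of the F# one, and some or most of the gap may come from iteration count rather than from the counter type.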

Does that int counter thing make a big difference?

using System;
using System.Diagnostics;

static class App
{
	static float Sigmoid1(double value)
	{
	  return (float) (1.0/(1.0 + Math.Exp(-value)));
	}

	static double TestPerformancePlain()
	{
		var sw = new Stopwatch();
		sw.Start();
		float num = 0f;
		for (int i = 0; i < 10; i++)
		{
			for (int j = 1; j < 1000001; j++)
			{
				num = Sigmoid1((((float) j) / 100000f) - 5f);
			}
		}  
		return sw.Elapsed.TotalMilliseconds;
	}

	static void Main()
	{
	  var t0 = TestPerformancePlain();
	  Console.WriteLine("10^7 iterations using Sigmoid1() took {0} ms", t0);
	}
}
PS> csc -o -debug- .\sigmoid-counter.cs
Microsoft (R) Visual C# 2010 Compiler version 4.0.30319.1
Copyright (C) Microsoft Corporation. All rights reserved.

PS> ./sigmoid-counter
10^7 iterations using Sigmoid1() took 431.6634 ms

Yup C# just executed that Sigmoid1() test in 431ms. Recall that F# was 404ms.

The other difference was that the F# code was passing a delegate of FSharpFunc<float,float> rather than making a method call. Let’s try the same thing in C# using a lambda expression to create an anonymous Func<float,float> delegate.

using System;
using System.Diagnostics;

static class App
{
	static double Test(Func<float,float> f)
	{
		var sw = new Stopwatch();
		sw.Start();
		float num = 0f;
		for (int i = 0; i < 10; i++)
		{
			for (int j = 1; j < 1000001; j++)
			{
		   		num = f((((float) j) / 100000f) - 5f);
			}
		}  
		return sw.Elapsed.TotalMilliseconds;
	}

	static void Main()
	{
	  var t0 = Test( x => { return (float) (1.0/(1.0 + Math.Exp(-x))); } );
	  Console.WriteLine("10^7 iterations using Sigmoid1() took {0} ms", t0);
	}
}
PS> ./sigmoid-counter
10^7 iterations using Sigmoid1() took 431.6634 ms
PS> csc -o -debug- .\sigmoid-lambda.cs
Microsoft (R) Visual C# 2010 Compiler version 4.0.30319.1
Copyright (C) Microsoft Corporation. All rights reserved.

PS> .\sigmoid-lambda.exe
10^7 iterations using Sigmoid1() took 275.1087 ms

Now C# is 130ms faster than F#.

For what it’s worth, the C version of the Sigmoid1() benchmark executes in 166ms on my machine using CL 16 x64.

  • C: 166ms
  • C# (delegate): 275ms
  • C# (method): 431ms
  • C# (method, float counter): 2,645ms
  • F#: 404ms

So, what have we learned?

  • Never, ever use a float as a counter in a for loop in C#.
  • In this test, invoking a delegate in the CLR was somewhat faster than a normal method invocation (275ms vs. 431ms).
  • It’s very easy to measure something other than what you intended when benchmarking.
  • Inasmuch as this little test means anything, C# and F# have similar performance. C is still faster.

Embedding Arbitrary Language Glyphs in PDF with iTextSharp

One of my clients has an application which generates a PDF using iTextSharp. The document largely contains English text in the Latin character set but a portion of the PDF is supposed to contain contact information in a foreign language. In the first version of the software, the requirement was to support Latin, Cyrillic, Georgian and Armenian character sets.

We quickly discovered during testing that the Adobe Type 1 fonts embedded in itextsharp.dll only support Latin characters. Code points from the Cyrillic, Georgian and Armenian character sets showed up as white space in the document. Fortunately, iTextSharp supports TrueType font embedding with the correct incantation which enabled us to use Sylfaen to provide the necessary glyphs.

string sylfaenpath = Environment.GetEnvironmentVariable( "SystemRoot" ) + "\\fonts\\sylfaen.ttf";
BaseFont sylfaen = BaseFont.CreateFont( sylfaenpath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED );
Font easternEuroTextFont = new Font( sylfaen, 9f, Font.NORMAL );

With the second version of the application, we needed to support a bunch of new character sets in addition to the ones we previously supported, including Hebrew, Arabic, Devanagari, Sinhala, Lao, Thai and more South Asian and Southeast Asian scripts.

One option is to pick a supporting font for each character set, but we elected to pick something universal: Arial Unicode, which includes glyphs for every code point defined in Unicode 2.1. Arial Unicode is from the Agfa Monotype foundry and is bundled with Office 2007 and later, but it can be purchased separately if Office isn’t installed. (The main side effect of this font choice is moving from a serif font to a sans serif one.)

Universal Glyph Support Can Still Yield Gibberish

The remaining wrinkle is that Hebrew and Arabic are right-to-left languages which means that the characters in a Hebrew or Arabic string are supposed to be rendered from right-to-left instead of left-to-right. Just rendering an Arabic string with Arial Unicode in iTextSharp will yield reflected output which is gibberish.

Here is some reference Arabic text rendered in Arial. It says “al-ingliziya”, the Arabic word for English.

[Image: al-ingliziyah]

Here’s what you get by default using Arial Unicode in iTextSharp.

[Image: default-broken-al-ingliziyah]

Clearly, this is different from the reference rendering. Arabic is complicated because of the way ligatures work: the shape of a letter is heavily influenced by the letters next to it. But basically, it’s backwards. We no longer get white space, but Hebrew and Arabic come out as gibberish instead. We need to alter the rendering for Hebrew and Arabic and make them right-to-left.

A simple algorithm is to detect the presence of Hebrew or Arabic code points in a string and turn on right-to-left rendering. Regular Expressions define \p{Hebrew} and \p{Arabic} character classes which would be useful but unfortunately those aren’t supported in System.Text.RegularExpressions at this point. We need to roll our own.

//note: no comma between the two ranges; inside [] a comma would be matched literally
const string regex_match_arabic_hebrew = @"[\u0600-\u06FF\u0590-\u05FF]+";
if( Regex.IsMatch( text, regex_match_arabic_hebrew, RegexOptions.IgnoreCase ) )
{
    //arabic or hebrew characters exist, fix rendering
}
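
One alternative worth noting: although Unicode script names are not available, .NET regular expressions do accept Unicode block names of the form \p{IsArabic} and \p{IsHebrew}, which cover the same two ranges as the hand-rolled character class. Here is a minimal sketch of that variant. The class name RtlDetector and the method ContainsRtl are invented for illustration. Like the original, it only tests for the presence of a character from either block.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RtlDetector
{
    // \p{IsArabic} is the block U+0600-U+06FF and \p{IsHebrew} is U+0590-U+05FF.
    static readonly Regex RtlBlocks = new Regex(@"[\p{IsArabic}\p{IsHebrew}]");

    // True if the text contains at least one character from either block.
    public static bool ContainsRtl(string text)
    {
        return !string.IsNullOrEmpty(text) && RtlBlocks.IsMatch(text);
    }

    public static void Main()
    {
        Console.WriteLine(ContainsRtl("English"));      // False
        // "al-ingliziyah" written in Arabic script
        Console.WriteLine(ContainsRtl("\u0627\u0644\u0625\u0646\u062C\u0644\u064A\u0632\u064A\u0629")); // True
        // "ivrit" written in Hebrew script
        Console.WriteLine(ContainsRtl("\u05E2\u05D1\u05E8\u05D9\u05EA")); // True
        Console.WriteLine(ContainsRtl("Smith, John"));  // False
    }
}
```

Named blocks read a little more clearly than raw code point ranges, but the two forms should behave the same for these scripts.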

There’s no obvious RTL option for a text element in iTextSharp, so I tried reversing the strings, which is a slight improvement but it’s still broken. What we have is brain-dead rendering. The ligatures are not connecting the letters correctly.

[Image: string-reverse-broken-al-ingliziyah]

On closer examination, there is RTL support in iTextSharp. It is exposed through object graph elements that implement IPdfRunDirection. (This is one of the places where it really shows that iTextSharp is a Java port. The use of static integer constants rather than enums is very Java 1.4. Enums are much more discoverable and the correct usage is more intuitively obvious.)

element.RunDirection = PdfWriter.RUN_DIRECTION_RTL;

Now the output from iTextSharp looks like the reference rendering.

[Image: correct-rtl-al-ingliziyah]

Transliterate to Java and the same concepts apply to iText.

Example Snippet Code in C#

using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;
using iTextSharp.text;
using iTextSharp.text.pdf;

//... assume a class that does stuff exists

public byte[] CreatePdfStreamPdfWithRandomLanguageSupport( IEnumerable<string> textList )
{
	//C# does not support \p{Arabic} and \p{Hebrew} character classes. We have to roll our own.
	//We are assuming any string that contains an Arabic or Hebrew character is meant to be RTL.
	//Better would be to break strings into word tokens and test each word.
	//Note: no comma between the ranges; inside [] a comma would be matched literally.
	const string regex_match_arabic_hebrew = @"[\u0600-\u06FF\u0590-\u05FF]+";
	//not const: the path is computed at run time
	string arialunicodepath = Environment.GetEnvironmentVariable( "SystemRoot" ) + "\\fonts\\ARIALUNI.TTF";

	Document document = new Document( PageSize.LETTER );
	using( MemoryStream stream = new MemoryStream() )
	{
		PdfWriter writer = PdfWriter.GetInstance( document, stream );
		try
		{
			//bunch of document setup here.
			document.Open();
			//arbitrarily, creating a 5 column table.
			PdfPTable table = new PdfPTable( 5 );

			//embed a Unicode font with broad glyph support for any code point we might need.
			//only the glyphs for code points actually used will be embedded in the document
			BaseFont nationalBase;
			if( File.Exists( arialunicodepath ) )
				nationalBase = BaseFont.CreateFont( arialunicodepath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED );
			else
				throw new FileNotFoundException( "Could not find \"Arial Unicode MS\" font installed on this system." );

			Font nationalTextFont = new Font( nationalBase, 9f, Font.NORMAL );

			foreach( string text in textList )
			{
				//PdfPCell implements IPdfRunDirection
				PdfPCell cell = new PdfPCell();
				//Arabic and Hebrew strings need to be reversed for right-to-left rendering
				//which is done by setting IPdfRunDirection.RunDirection. Otherwise, your RTL language text
				//comes out as backwards gibberish.
				if( text != null && Regex.IsMatch( text, regex_match_arabic_hebrew, RegexOptions.IgnoreCase ) )
					cell.RunDirection = PdfWriter.RUN_DIRECTION_RTL;
				//apply unicode font
				Phrase phrase = new Phrase( text, nationalTextFont );
				cell.AddElement( phrase );
				table.AddCell( cell );
			}
			document.Add( table );
		}
		finally
		{
			document.Close();
			writer.Close();
		}
		//ToArray() returns only the bytes written; GetBuffer() would include unused buffer space
		return stream.ToArray();
	}
}

Can Google Sidestep Oracle Patent Payouts with Mono/C#?

Oracle has sued Google over patent and copyright violations related to Google’s use of Java technologies in Android. Oracle acquired the Java IP as a part of its acquisition of Sun Microsystems. The details are somewhat different but this has the same general flavor as when Sun sued Microsoft over its non-conforming Java runtime and J++ language compiler. That lawsuit was based in contract law because Microsoft did license Java from Sun and violated the terms of the license. In this case, Google has attempted to sidestep the licensing requirements of Java with their Dalvik VM. One could reasonably argue that the technical basis was similar. Both Microsoft and Google wanted to achieve significant performance improvement and platform integration over a vanilla JVM at the cost of incompatibility with the Java standard. It’s not entirely clear that Dalvik actually achieves superior performance, though. I have to wonder if the register-based VM concept was incidental to the goal of making an end-run around J2SE runtime licensing requirements.

One intriguing (if a bit self-serving and improbable) proposal has been floated by Miguel de Icaza: Why not just replace Dalvik with Mono, the free and open source implementation of Microsoft’s .NET? The Mono runtime is LGPLv2 and the class libraries are MIT licensed. Additionally the .NET Micro edition has been placed entirely under the Microsoft Public License which is a BSD-style license with an explicit patent grant. The Microsoft Community Promise explicitly indemnifies patent claims against anyone wishing to implement C# and the CLI and, unlike Sun’s patent grant for Java, embrace-and-extend is OK: you can implement a superset of the C# and CLI features and you are still covered.

Google definitely has the wherewithal to migrate Android from Dalvik to Mono if they want to. They could make it a seamless transition and even migrate the bytecode of existing Dalvik (or Java) apps to IL. They could also provide a tool to migrate projects from Java language to C# as Microsoft did. Other implementations exist.

I think it would be pleasantly symmetrical if history repeated itself. Sun’s lawsuit put the kibosh on Microsoft using Java the way it wanted and essentially gave birth to C#, the CLI and the CLR. It would be ironic if history repeated and Android just adopted Mono as its runtime. The road is much easier to tread this time around because a fully open source implementation already exists and it has already been ported onto Android and bytecode-to-IL and Java-to-C# tools exist and are mature.

The Stack is an Implementation Detail

I have a confession. When .NET and C# beta came out in 1999, I was confused by the definition of struct as a “value type” allocated on the stack and class as a “reference type” allocated on the heap. We were told structs are lean and fast while classes are heavy and slow. I distinctly recall searching for every imaginable opportunity to use the struct keyword. I further recall being confused by the statement that everything in C# is passed by value and by the ref and out keywords. I used ref whenever I wanted to modify values in a formal parameter regardless of whether they were structs or classes. What I didn’t realize at the time was that ref and out are really just an explicit use of a pointer. Ref and out provide a mechanism to manipulate a value type by pointer instead of manipulating a local copy. For reference types, though, using ref and out is the moral equivalent of using a pointer-to-pointer in C and it is rarely necessary or correct.

Eric Lippert has a couple of blog posts about how the stack is an implementation detail and not really the point of value types at all. He flat-out states that the guidance about stacks and heaps is useless and confusing. The point of value types is that they are always copied by value rather than just having another variable point (refer) to the same memory as with a reference type. They just happen to be placed on the stack because they can.
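
The copy-versus-refer distinction can be sketched in a few lines. The type and method names below (PointValue, PointRef, Bump, BumpByRef) are invented for illustration and do not come from Eric Lippert's posts.

```csharp
using System;

public struct PointValue { public int X; }  // value type: assignment copies the data
public class PointRef { public int X; }     // reference type: assignment copies the reference

public static class CopySemantics
{
    public static void Bump(PointValue p) { p.X++; }           // changes a local copy only
    public static void BumpByRef(ref PointValue p) { p.X++; }  // changes the caller's variable
    public static void Bump(PointRef p) { p.X++; }             // changes the shared object

    public static void Main()
    {
        var a = new PointValue { X = 1 };
        var b = a;                // b is an independent copy of a
        b.X = 2;
        Console.WriteLine(a.X);   // 1

        var c = new PointRef { X = 1 };
        var d = c;                // d refers to the same object as c
        d.X = 2;
        Console.WriteLine(c.X);   // 2

        Bump(a);
        Console.WriteLine(a.X);   // 1, the struct was copied into the parameter
        BumpByRef(ref a);
        Console.WriteLine(a.X);   // 2, ref passed the variable itself
        Bump(c);
        Console.WriteLine(c.X);   // 3, no ref needed to mutate a class instance
    }
}
```

Nothing in the sketch depends on where the memory lives. The observable difference is only whether assignment and argument passing copy the data or copy a reference to it.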

Along the way he explains why you can’t capture a ref value type in a closure.

C# as Universal Smart Phone Programming Language?

We started thinking about building a smart phone app to interface to PeopleMatrix. The obvious devices to support are BlackBerry, iPhone, Android and Windows Mobile. There is also Symbian but those devices are unusual in our primary market. Each one of these platforms has a totally different programming model:

  • BlackBerry –> Java ME + RIM libraries
  • iPhone –> Objective-C
  • Android –> Subset of Java 5 + Apache commons and Android libraries
  • Windows Mobile –> C/C++ and .NET CF
  • Symbian –> Weird non-standard Symbian C++ variant and Qt

I just can’t envision anyone using their smartphone to interact with a sophisticated app on a screen the size of a postage stamp. That eliminates BlackBerry and (many) Windows Mobile devices. Also, you have to prioritize developing for the device platforms that are growing. That means iPhone and Android. iPhone is very popular and Android has shown amazing growth.

The problem is that the development environment is totally different so that porting applications between Android and iPhone is a complete re-write. One ray of hope for leveraging code across these platforms is the Mono project. Novell is currently shipping a product called MonoTouch which compiles C# code into native binaries for the iPhone. The Mono guys also have Mono working on Android with proxy classes that call into the Android libraries. (In early testing Mono appears to out-perform Dalvik, too.)

If Mono on Android gets polished up like MonoTouch, that would make C# a first class programming language for a huge swath of the most exciting devices. The largest challenge for managing the codebase of an app is that it is very likely that each platform would require care to abstract access to platform-native APIs which would certainly include the GUI and other hardware interfaces.

Even so, I am watching Mono closely. Interesting times.

Interesting Post about Microsoft’s C# Compiler

One of the guys from the C# compiler team at Microsoft posted this article about how their C# compiler goes about parsing source code and emitting an assembly. Whereas C and C++ compilers are single pass, the C# compiler does dozens of passes. Somehow csc seems to do this much faster than cl or gcc would compile a comparable body of C/C++. I suspect the multiple passes also help the compiler to emit really good error messages.

Fascinating stuff.
