## Is F# Math Really Faster than C#?

I ran across an article claiming that F# was about an order of magnitude faster at calculating a basic math algorithm. Can this be true?

### Sigmoid in C#

```
public static float Sigmoid(float value)
{
    return 1.0f / (1.0f + (float) Math.Exp(-value));
}
```
```
// csc -o -debug-
.method public static float32 Sigmoid(float32 'value') cil managed
{
    .maxstack 8
    L_0000: ldc.r8 1
    L_0009: ldc.r8 1
    L_0012: ldarg.0
    L_0013: neg
    L_0014: conv.r8
    L_0015: call float64 [mscorlib]System.Math::Exp(float64)
    L_001a: add
    L_001b: div
    L_001c: conv.r4
    L_001d: ret
}
```

### Sigmoid in F#

```
let Sigmoid value = 1.0f/(1.0f + exp(-value));
```
```
// fsc --debug- --optimize+
.method public static float32 Sigmoid(float32 'value') cil managed
{
    .maxstack 5
    .locals init (
        [0] float32 num)
    L_0000: ldc.r4 1
    L_0005: ldc.r4 1
    L_000a: ldarg.0
    L_000b: neg
    L_000c: stloc.0
    L_000d: ldloc.0
    L_000e: conv.r8
    L_000f: call float64 [mscorlib]System.Math::Exp(float64)
    L_0014: conv.r4
    L_0015: add
    L_0016: div
    L_0017: ret
}
```

The generated IL is nearly identical. The F# version loads its constants as float32 (ldc.r4) where C# loads float64 (ldc.r8) and narrows only at the end, so F# gets by with a smaller evaluation stack (at the cost of one extra local), but the actual work is the same: negate, widen to double, call Math.Exp, add, divide. Can these differences make an order-of-magnitude performance difference? Short answer: no.

## Benchmarks are Tricky

### C# sigmoid algorithm benchmark

```
// (c) Rinat Abdullin
// http://abdullin.com/journal/2009/1/5/caching-activation-function-is-not-worth-it.html
// sigmoidcs.cs

using System;
using System.Diagnostics;

static class App
{
    const float Scale = 320.0f;
    const int Resolution = 2047;
    const float Min = -Resolution/Scale;
    const float Max = Resolution/Scale;

    static readonly float[] _lut = InitLut();

    static float[] InitLut()
    {
        var lut = new float[Resolution + 1];
        for (int i = 0; i < Resolution + 1; i++)
        {
            lut[i] = (float) (1.0/(1.0 + Math.Exp(-i/Scale)));
        }
        return lut;
    }

    static float Sigmoid1(double value)
    {
        return (float) (1.0/(1.0 + Math.Exp(-value)));
    }

    static float Sigmoid2(float value)
    {
        if (value <= Min) return 0.0f;
        if (value >= Max) return 1.0f;
        var f = value*Scale;
        if (value >= 0) return _lut[(int) (f + 0.5f)];
        return 1.0f - _lut[(int) (0.5f - f)];
    }

    static float TestError()
    {
        var emax = 0.0f;
        for (var x = -10.0f; x < 10.0f; x += 0.000001f)
        {
            var v0 = Sigmoid1(x);
            var v1 = Sigmoid2(x);

            var e = Math.Abs(v1 - v0);
            if (e > emax) emax = e;
        }
        return emax;
    }

    static double TestPerformancePlain()
    {
        var sw = new Stopwatch();
        sw.Start();
        for (int i = 0; i < 10; i++)
        {
            for (float x = -5.0f; x < 5.0f; x += 0.000001f)
            {
                Sigmoid1(x);
            }
        }
        return sw.Elapsed.TotalMilliseconds;
    }

    static double TestPerformanceOfLut()
    {
        var sw = new Stopwatch();
        sw.Start();
        for (int i = 0; i < 10; i++)
        {
            for (float x = -5.0f; x < 5.0f; x += 0.000001f)
            {
                Sigmoid2(x);
            }
        }
        return sw.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        var emax = TestError();
        var t0 = TestPerformancePlain();
        var t1 = TestPerformanceOfLut();

        Console.WriteLine("Max detected deviation using LUT: {0}", emax);
        Console.WriteLine("10^7 iterations using Sigmoid1() took {0} ms", t0);
        Console.WriteLine("10^7 iterations using Sigmoid2() took {0} ms", t1);
    }
}
```

### F# sigmoid algorithm benchmark

```
// (c) Rinat Abdullin
// http://abdullin.com/journal/2009/1/6/f-has-better-performance-than-c-in-math.html
// sigmoidfs.fs

#light

let Scale = 320.0f;
let Resolution = 2047;

let Min = -single(Resolution)/Scale;
let Max = single(Resolution)/Scale;

let range step a b =
    let count = int((b-a)/step);
    seq { for i in 0 .. count -> single(i)*step + a };

let lut = [|
    for x in 0 .. Resolution ->
        single(1.0/(1.0 + exp(-double(x)/double(Scale))))
|]

let sigmoid1 value = 1.0f/(1.0f + exp(-value));

let sigmoid2 v =
    if (v <= Min) then 0.0f;
    elif (v >= Max) then 1.0f;
    else
        let f = v * Scale;
        if (v > 0.0f) then lut.[int (f + 0.5f)]
        else 1.0f - lut.[int(0.5f - f)];

let getError f =
    let test = range 0.00001f -10.0f 10.0f;
    let errors = seq {
        for v in test ->
            abs(sigmoid1(single(v)) - f(single(v)))
    }
    Seq.max errors;

open System.Diagnostics;

let test f =
    let sw = Stopwatch.StartNew();
    let mutable m = 0.0f;
    let result =
        for t in 1 .. 10 do
            for x in 1 .. 1000000 do
                m <- f(single(x)/100000.0f-5.0f);
    sw.Elapsed.TotalMilliseconds;

printf "Max deviation is %f\n" (getError sigmoid2)
printf "10^7 iterations using sigmoid1: %f ms\n" (test sigmoid1)
printf "10^7 iterations using sigmoid2: %f ms\n" (test sigmoid2)
```
```
PS> csc -o -debug- .\sigmoidcs.cs
Microsoft (R) Visual C# 2010 Compiler version 4.0.30319.1

PS> fsc --debug- --optimize+ .\sigmoidfs.fs
Microsoft (R) F# 2.0 Compiler build 4.0.30319.1

PS> .\sigmoidcs.exe
Max detected deviation using LUT: 0.001663984
10^7 iterations using Sigmoid1() took 2644.5935 ms
10^7 iterations using Sigmoid2() took 510.9379 ms

PS> .\sigmoidfs.exe
Max deviation is 0.001664
10^7 iterations using sigmoid1: 403.974300 ms
10^7 iterations using sigmoid2: 124.520100 ms
```

Wow. That’s 404ms for F# and 2,645ms for C#. That’s a huge difference. Why?

We already established that the Sigmoid1() algorithm above compiles to very similar IL in both languages. What else could be different?

Wait a minute.

```
let result =
    for t in 1 .. 10 do
        for x in 1 .. 1000000 do
            m <- f(single(x)/100000.0f-5.0f);
```

The F# loop is not equivalent to the C# version. In the F# version the counter is an int; in the C# version it is a float. Worse, the loop bounds differ: stepping from -5.0 to 5.0 by 0.000001 is 10^7 iterations per pass, so the C# benchmark makes 10^8 sigmoid calls in total, while the F# inner loop runs 10^6 iterations per pass, for 10^7 calls in total. (The F# version is also passing a delegate to the test function.)

```
for (int i = 0; i < 10; i++)
{
    for (float x = -5.0f; x < 5.0f; x += 0.000001f)
    {
        Sigmoid1(x);
    }
}
```

Does the counter really matter that much? Let’s rerun the C# benchmark with the same loop structure as the F# version.

```
using System;
using System.Diagnostics;

static class App
{
    static float Sigmoid1(double value)
    {
        return (float) (1.0/(1.0 + Math.Exp(-value)));
    }

    static double TestPerformancePlain()
    {
        var sw = new Stopwatch();
        sw.Start();
        float num = 0f;
        for (int i = 0; i < 10; i++)
        {
            for (int j = 1; j < 1000001; j++)
            {
                num = Sigmoid1((((float) j) / 100000f) - 5f);
            }
        }
        return sw.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        var t0 = TestPerformancePlain();
        Console.WriteLine("10^7 iterations using Sigmoid1() took {0} ms", t0);
    }
}
```
```
PS> csc -o -debug- .\sigmoid-counter.cs
Microsoft (R) Visual C# 2010 Compiler version 4.0.30319.1

PS> ./sigmoid-counter
10^7 iterations using Sigmoid1() took 431.6634 ms
```

Yup: C# just executed that Sigmoid1() test in 431ms. Recall that F# took 404ms.

The other difference was that the F# code was invoking the sigmoid through an FSharpFunc&lt;float32, float32&gt; delegate rather than making a direct method call. Let’s try the same thing in C#, using a lambda expression to create an anonymous Func&lt;float, float&gt; delegate.

```
using System;
using System.Diagnostics;

static class App
{
    static double Test(Func<float,float> f)
    {
        var sw = new Stopwatch();
        sw.Start();
        float num = 0f;
        for (int i = 0; i < 10; i++)
        {
            for (int j = 1; j < 1000001; j++)
            {
                num = f((((float) j) / 100000f) - 5f);
            }
        }
        return sw.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        var t0 = Test( x => { return (float) (1.0/(1.0 + Math.Exp(-x))); } );
        Console.WriteLine("10^7 iterations using Sigmoid1() took {0} ms", t0);
    }
}
```
```
PS> ./sigmoid-counter
10^7 iterations using Sigmoid1() took 431.6634 ms

PS> csc -o -debug- .\sigmoid-lambda.cs
Microsoft (R) Visual C# 2010 Compiler version 4.0.30319.1

PS> .\sigmoid-lambda.exe
10^7 iterations using Sigmoid1() took 275.1087 ms
```

Now C# is 130ms faster than F#.

For what it’s worth, a C version of the Sigmoid1() benchmark executes in 166ms on my machine, compiled with CL 16 x64.

• C: 166ms
• C# (delegate): 275ms
• C# (method): 431ms
• C# (method, float counter): 2,645ms
• F#: 404ms

## So, what have we learned?

• Never, ever use a float as the counter in a C# for loop: it is slower per iteration, and floating-point stepping makes it easy to run a different number of iterations than you intended.
• In this microbenchmark, invoking through a delegate was actually slightly faster than a normal method call.
• It’s very easy to measure something other than what you intended when benchmarking.
• Inasmuch as this little test means anything, C# and F# have similar performance. C is still faster.