Monday, December 8, 2008

Variable types and sizes

Just a random thought on variable sizes:

There's an old design decision for programming languages: whether the size of the built-in numeric types should be static or dynamic. For example, the .NET framework has static sizes: every type in the System namespace (which are all the built-in types) has a size associated with it, e.g. Int32. Conversely, in C/C++, int is of dynamic size, dependent on the compiler and the target environment.

There are arguments for both. On the static side, you have a deterministic size, so you can predict exactly what values will and won't fit. On the dynamic side, you automatically get the size which is appropriate for the architecture, which lets code adapt to architectures with new intrinsic data sizes (e.g. 32-bit -> 64-bit) without the speed penalty of extra operations to handle non-native variable sizes.

With those trade-offs in mind, I'd propose a (new?) thought for variable declarations: a concept of "at least" x bits. This would give you the best of both worlds: you could say with certainty that values within a target range would fit in your variable, while allowing the compiler to allocate a larger type if that was more optimal for the target architecture. You would sacrifice predictability of variable size, but you could still use fixed-size constructs as a fallback if you needed them.

With that in mind, variable declarations might look like:
int32p i32bitOrLargerValue;
int64p i64bitOrLargerValue;

... where the 'p' is for 'plus'.
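
For comparison, something close to this can already be expressed with the "fast" and "least" typedefs from <cstdint> (standardized in C99 and C++11). The sketch below is only an approximation of the idea: the int32p and int64p aliases are hypothetical names borrowed from the declarations above, and bit_width_of is a helper made up for this illustration.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical aliases mirroring the int32p / int64p names above.
// std::int_fast32_t is the fastest signed type with at least 32 bits;
// the implementation is free to make it wider on a given target.
using int32p = std::int_fast32_t;
using int64p = std::int_fast64_t;

// Fixed-size fallback for when the exact width matters.
using exact32 = std::int32_t;

// Helper (made up for this sketch): width of a type in bits on this target.
template <typename T>
constexpr int bit_width_of() {
    return static_cast<int>(sizeof(T) * CHAR_BIT);
}

// The "at least" guarantee, checked at compile time.
static_assert(bit_width_of<int32p>() >= 32, "int32p must hold any 32-bit value");
static_assert(bit_width_of<int64p>() >= 64, "int64p must hold any 64-bit value");
static_assert(bit_width_of<exact32>() == 32, "exact32 is exactly 32 bits");
```

The actual width of int32p differs between platforms, which is exactly the loss of predictability mentioned above.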

Just a thought.

Thursday, November 13, 2008

The enormous problem with highly dynamic languages

So you're moving from an "old-school" language like C/C++ into a "new hotness" language like C#, and life is so much easier. Memory management is automatic, type information is part of the runtime, everything is dynamic. You can create these very simple expressions to perform very complex operations, all auto-magically, and rapidly prototype applications like never before. Life is great, right?

Well, there's a small problem which is the 800lb gorilla in the ideology. See, there are two parts to a language enabling you to write code which does what you want: letting you express what you want to do, and helping you not express what you didn't want to do. The former is aided by higher level abstractions, patterns, powerful expression syntax, API libraries, etc. The latter is aided by strong compile-time checking, API/structure transparency, clear and predictable behavior, etc. A good language balances both of these.

The problem with low-level languages is that they have a lot of the latter without much of the former. The problem with highly dynamic languages is that they have a lot of the former at the expense of the latter. The two gaps are not equally costly: lacking the former just slows you down, whereas lacking the latter makes your applications fundamentally less reliable and more prone to subtle systemic problems. Worse, there's no way around it: no matter how clean your structure and methodologies, you're always forced to perform runtime verification, and it's nearly impossible to eliminate systemic errors because the runtime implementation is so convoluted and opaque.
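
To make the compile-time versus runtime distinction concrete, here is a small sketch in C++17. It is only an illustration: std::any stands in for a dynamically typed value, and both function names are made up for the example.

```cpp
#include <any>
#include <stdexcept>
#include <string>

// Statically checked: calling double_it("21") is rejected by the compiler,
// so the mistake never reaches a running program.
int double_it(int value) {
    return value * 2;
}

// Dynamically typed stand-in: any value is accepted at the call site,
// so the type has to be verified at run time on every call.
int double_it_dynamic(const std::any& value) {
    if (const int* p = std::any_cast<int>(&value)) {
        return *p * 2;
    }
    throw std::invalid_argument("double_it_dynamic: expected an int");
}
```

With the second form, passing a std::string compiles cleanly and only fails when that particular call is actually executed.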

I would not be surprised if we see a resurgence of "native" code development, because these issues are so fundamentally intractable in dynamic languages. I know I shudder to think of trying to build a reliable .NET application of any meaningful complexity. We shall see.

Wednesday, October 22, 2008

Fun with COM interop

So I have a C# object which exposes a COM interface through interop, and it was working. Then I did something, and when I went to reload it, it said the component was not registered. I confirmed that the ProgID was in the registry, and everything appeared to be good.

... It turns out that if the constructor of an object being instantiated through COM throws an exception, the CoCreateInstance call reports that the component is not registered. This is not normally a problem with native C++ COM objects, since they run the constructor code as part of the registration process, so you'd catch the error earlier. C# COM objects apparently do not, and the error coming from COM is very misleading.
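
One way to sidestep this kind of failure is to keep the constructor trivial and move anything that can fail into a separate initialization call, so errors surface with their own message. The sketch below shows the pattern in plain C++ rather than real COM code; the Component class, its Initialize method, and the config path parameter are all hypothetical.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical component; stands in for a class exposed through COM.
class Component {
public:
    // The constructor does nothing that can fail, so object creation
    // cannot be derailed by an exception thrown here.
    Component() = default;

    // Fallible setup lives here. Returns false and fills `error`
    // instead of letting an exception escape to the caller.
    bool Initialize(const std::string& configPath, std::string& error) {
        try {
            if (configPath.empty()) {
                throw std::runtime_error("config path is empty");
            }
            configPath_ = configPath;
            initialized_ = true;
            return true;
        } catch (const std::exception& e) {
            error = e.what();
            return false;
        }
    }

    bool initialized() const noexcept { return initialized_; }

private:
    std::string configPath_;
    bool initialized_ = false;
};
```

The caller creates the object first and then calls Initialize, so a setup failure is reported as a setup failure rather than as a missing registration.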

Just an interesting tidbit for COM interop.

Monday, October 6, 2008

Bizarre error of the day

So I'm playing with compiling something which is C++/CLI using /clr, and ran across an error while trying to run:
'Could not load file or assembly' of my exe itself!

So to make a long story short, after some research, it turns out that:
- The .NET framework cannot load assemblies which have more than 65k global symbols defined
- Every static string in the code apparently is assigned its own global symbol when compiling with /clr

The solution, equally bizarre, is to enable string pooling (the /GF compiler option) for the Debug build of the exe. This reduces the number of static strings dramatically, which allows the assembly to load, and the program to run. Talk about a random issue.

Oh, and obligatory "yeah, C++/CLI is ready for real world apps...".