Objects creation in C# vs C++

Introduction

Sometimes the way two languages implement a feature can differ in surprising ways, and even the keywords can be misleading.
This is the case for the instantiation of C++ classes and C# structs, a.k.a. .NET value types.

In this article, we will see the differences between the two languages in the way they handle the creation of objects: their memory allocation and their initialization.

.NET and C#

First, a quick introduction to the main difference between C++ and the pair .NET/C#.

With C++, source code is compiled directly to native machine code, which is run by the OS.

.NET is an object-oriented (OO) managed platform with a runtime, the Common Language Runtime (CLR), much like the Java Virtual Machine (JVM). The CLR runs its own binary code, a bytecode called the Common Intermediate Language (CIL, a.k.a. MSIL), which it compiles to native code that is run by the OS.
The CLR is also responsible for managing the memory by allocating it, tracking it, and freeing it.

.NET supports many languages, C# being the main one. Their source code is compiled to this bytecode, which is stored in binary files called assemblies that have the same kinds and roles as native ones: .exe executables and .dll libraries.

Note that there is a version of C++ that compiles to .NET CIL: C++/CLI. A concrete use case is shown in this other article: Using C# from native C++ with the help of C++/CLI.

Classes typology

C++

In C++ there is only one kind of class. The only difference between declaring it with the keyword class or struct is the default accessibility of members and base classes: private for class, public for struct.

Moreover, in C++ the way an object’s memory is allocated does not depend on the object’s type.
In particular, any class instance can be stored on the stack by simply declaring it, or on the heap via the new operator.

.NET and C#

In .NET it is a different story: there is still a single concept of class (declared with the .class directive in CIL), but there are two distinct hierarchies:

  • Reference types: classes inheriting from System.Object without going through System.ValueType
  • Value types: classes inheriting from System.ValueType (which itself inherits from System.Object)

Their semantics differ:

  • Reference types are intended to represent full-blown OO classes with most of the features you can expect from an OO platform (except multiple inheritance),
  • Value types are intended to represent data structures, a bit like C structs. They support many more OO features, such as methods and polymorphic use through interfaces, but they have some limitations, such as no support for inheritance.

Their memory is managed differently by the CLR:

  • Reference type instances are always allocated on the heap and are accessed through typed references (themselves allocated “in place”),
  • Value type instances are allocated “in place”, so they can never be allocated on their own on the heap, only as part of a reference type instance (an array, an object).

In this article, we only use value types as their instances can live both on the stack and the heap, so we’ll be able to reproduce the same scenarios as C++.

At the language level, C# defines a reference type with the class keyword and a value type with the struct keyword. This is highly misleading for C++ developers, since in C++ the two keywords are technically almost synonymous.

Objects initialization

C++

C++ is quite permissive with regard to memory initialization, so you can end up with objects containing “random” data left there by previous allocations. The resulting bugs are harder to track down, because their effects can appear far from the faulty code.
It’s not an issue by itself, just a design choice.

Here is an illustration:

#include <iostream>

struct A {
	int N;	
};

struct B {
	int N;
	
	B() {
	}
};

void f() {
	std::cout << "Write data to the stack" << std::endl;
	
	A a;
	a.N = 123;
	
	B b;
	b.N = 456;
	
	std::cout << "a.N: " << a.N << std::endl;
	std::cout << "b.N: " << b.N << std::endl;
}

void g() {
	std::cout << "Without explicit constructor call" << std::endl;
	
	A a;
	B b;
	
	std::cout << "a.N: " << a.N << std::endl;
	std::cout << "b.N: " << b.N << std::endl;
}

void h() {
	std::cout << "With explicit constructor call" << std::endl;
	
	A a = A();
	B b = B();
	
	std::cout << "a.N: " << a.N << std::endl;
	std::cout << "b.N: " << b.N << std::endl;
}

int main() {
	f();
	g();
	h();

	return 0;
}

Compilation:

>cl /EHsc test.cpp
 Microsoft (R) C/C++ Optimizing Compiler Version 19.34.31933 for x64
 Copyright (C) Microsoft Corporation.  All rights reserved.
 test.cpp
 test.cpp(28) : warning C4700: uninitialized local variable 'a' used
 Microsoft (R) Incremental Linker Version 14.34.31933.0
 Copyright (C) Microsoft Corporation.  All rights reserved.
 /out:test.exe
 test.obj

Note that the compiler detects the potential issue for a and emits a warning, but reports nothing for b.

Run:

>test.exe
 Write data to the stack
 a.N: 123
 b.N: 456
 Without explicit constructor call
 a.N: 123
 b.N: 456
 With explicit constructor call
 a.N: 0
 b.N: 123

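So by default the memory is not zeroed, and you end up with objects holding the state of the objects previously allocated at the same place on the stack. Strictly speaking, reading these uninitialized values is undefined behaviour, so the output above is only what this particular build happened to produce.

You can also see that an explicit call to the default constructor does not have the same effect depending on whether that constructor is user-defined: A() zero-initializes the object because A has no user-provided constructor, whereas B() only runs the empty user-provided constructor, which leaves N untouched.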

C#

C# takes the opposite tack by forcing the developer to ensure the memory is fully initialized before using it, either by:

  • Initializing all the fields after full object allocation
  • Calling a constructor that will do that (by default, the compiler provides one that initializes all the fields to their default values)


Here is the illustration with the equivalent code:

using static System.Console;

void f() {
	WriteLine("Write data to the stack");
	
	A a;
	a.N = 123;
	
	B b;
	b.N = 456;
	
	WriteLine($"a.N: {a.N}");
	WriteLine($"b.N: {b.N}");
}

void g() {
	WriteLine("Without explicit constructor call");
	
	A a;
	B b;
	
	WriteLine($"a.N: {a.N}");
	WriteLine($"b.N: {b.N}");
}

void h() {
	WriteLine("With explicit constructor call");
	
	A a = new A();
	B b = new B();
	
	WriteLine($"a.N: {a.N}");
	WriteLine($"b.N: {b.N}");
}

f();
g();
h();

struct A {
	public int N;	
}

struct B {
	public int N;
	
	public B() {
	}
}

This is fine for method f, as the fields are assigned right after the instances are declared.
But method g does not compile:

>csc Test.cs
 Microsoft (R) Visual C# Compiler version 4.4.0-3.22518.13 (7856a68c)
 Copyright (C) Microsoft Corporation. All rights reserved.
 Test.cs(17,20): error CS0170:  Use of possibly unassigned field 'N'
 Test.cs(18,20): error CS0170:  Use of possibly unassigned field 'N'

And without the g method:

>csc Test.cs
 Microsoft (R) Visual C# Compiler version 4.4.0-3.22518.13 (7856a68c)
 Copyright (C) Microsoft Corporation. All rights reserved.
>Test.exe
 Write data to the stack
 a.N: 123
 b.N: 456
 With explicit constructor call
 a.N: 0
 b.N: 0

All the memory has been zeroed out; nothing falls through the cracks.

Default constructor invocation

C++

In C++, the default constructor is called once the memory for the object has been allocated, whether the object stands alone, is an element of an array (the constructor runs for each element), or is a member of another object:

#include <iostream>

struct A {
	A()	{
		std::cout << "\tA()" << std::endl;
	}
};

struct B {
	A a;
};

int main() {
	std::cout << "1. Allocation on the stack without call to constructor" << std::endl;
	A a;
	
	std::cout << "2. Allocation in an array allocated on the stack" << std::endl;
	A as[3];
	
	std::cout << "3. Allocation in an array allocated on the heap" << std::endl;
	A* ash = new A[3];
	
	std::cout << "4. Allocation in a class instance allocated on the stack" << std::endl;
	B bs;
	
	std::cout << "5. Allocation in a class instance allocated on the heap" << std::endl;
	B* bh = new B();
	
	std::cout << "6. Allocation in an array allocated on the heap with default instanciation"  << std::endl;
	A ase[] = { {}, {}, {} };
	
	std::cout << "7. Allocation on the stack with call to constructor" << std::endl;
	A ac = A();

	std::cout << "8. Allocation in an array allocated on the heap with call to constructor" << std::endl;
	A asc[] = { A(), A(), A() };
	
	// Release the heap allocations made in cases 3 and 5
	delete[] ash;
	delete bh;

	return 0;
}

Compilation:

>cl /EHsc test.cpp

Run:

>test.exe
1. Allocation on the stack without call to constructor
     A()
2. Allocation in an array allocated on the stack
     A()
     A()
     A()
3. Allocation in an array allocated on the heap
     A()
     A()
     A()
4. Allocation in a class instance allocated on the stack
     A()
5. Allocation in a class instance allocated on the heap
     A()
6. Allocation in an array allocated on the heap with default instanciation
     A()
     A()
     A()
7. Allocation on the stack with call to constructor
     A()
8. Allocation in an array allocated on the heap with call to constructor
     A()
     A()
     A()

The default constructor is called every time.

As for case #6, I am not sure it is a perfect equivalent of the C# default operator, which returns the default value for any type, e.g. 0 for int and a zeroed instance for structs.

C#

In C#, things are different again: the default constructor is never called implicitly, whether the instance is a local variable, an array element, or a field of another object. It must be called explicitly:

using System;
using static System.Console;

WriteLine("1. Allocation on the stack without call to constructor");
A a;
	
WriteLine("2. Allocation in an array allocated on the stack");
Span<A> @as = stackalloc A[3];
	
WriteLine("3. Allocation in an array allocated on the heap");
A[] ash = new A[3];
	
WriteLine("4. XXX");
	
WriteLine("5. Allocation in a class instance allocated on the heap");
B bh = new B();

WriteLine("6. Allocation in an array allocated on the heap with default instanciation");
A[] ase = { default(A), default(A), default(A) };

WriteLine("7. Allocation on the stack with call to constructor");
A ac = new A();

WriteLine("8. Allocation in an array allocated on the heap with call to constructor");
A[] asc = { new A(), new A(), new A() };

struct A
{
	public A() => WriteLine("\tA()");
}

class B
{
	A a;
}

Some remarks:

  • On .NET Framework (this should be transparent on .NET Core), using the Span<> type requires the NuGet packages System.Memory (I’ve used version 4.5.5) and System.Runtime.CompilerServices.Unsafe (version 4.5.3, which is the one expected by that version of System.Memory)
  • Case #4 is not possible as B is a reference type so its instances can’t be allocated on the stack

Compilation:

>csc /reference:System.Memory.dll Test.cs
 Microsoft (R) Visual C# Compiler version 4.4.0-3.22518.13 (7856a68c)
 Copyright (C) Microsoft Corporation. All rights reserved.
 Test.cs(9,3): warning CS0168: The variable 'a' is declared, but never used
 Test.cs(39,4): warning CS0169: Field 'B.a' is never used

Run:

>Test.exe
1. Allocation on the stack without call to constructor
2. Allocation in an array allocated on the stack
3. Allocation in an array allocated on the heap
4. XXX
5. Allocation in a class instance allocated on the heap
6. Allocation in an array allocated on the heap with default instanciation
7. Allocation on the stack with call to constructor
     A()
8. Allocation in an array allocated on the heap with call to constructor
     A()
     A()
     A() 

The default constructor only runs when it is invoked explicitly.

Before C# 10.0, defining a default constructor for a struct was not even allowed:

>csc /reference:System.Memory.dll -langversion:9.0 Test.cs
 Microsoft (R) Visual C# Compiler version 4.4.0-3.22518.13 (7856a68c)
 Copyright (C) Microsoft Corporation. All rights reserved.
 Test.cs(34,9): error CS8773: Feature 'parameterless struct constructors' is not available in C# 9.0. Please use language version 10.0 or greater. 

This was by design, to avoid wrong expectations such as the constructor being called when allocating an array of instances.

Conclusion

This article has illustrated the different design choices made for C++ and .NET/C# regarding object creation.

There is far more to say about the differences between C++ and .NET/C#, but object creation is a tricky topic that can get in the way if it is not well understood.

Comparing the languages puts some behaviours into perspective: they are not due to technical limitations but to deliberate choices made at the inception of each language.
These behaviours can also change as a language matures, as with C# now supporting default constructors for structs.
