VS extension development introduction talk slides

At work we have self-organized talks about just about everything people come up with. This week it was my turn and I gave this little talk about Visual Studio extension development.
I’ll just leave this here in case anyone else is interested 🙂
vsextension talk slides – 1.4mb pptx

C++11 compile time checked printf format

A few notes on how I wrote a 100% compile-time checked printf-style format type checker. Source code is online on GitHub.


I recently worked a bit on ezEngine again, a beautiful open source game engine developed continuously by a few friends of mine in their spare time. ezEngine uses a custom formatting function that follows the style of (s)printf. Like printf, it is based on C varargs and therefore not type safe. Example:

printf("%i", "not an int");
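To give an idea of how such a check can work in principle, here is a minimal C++11 sketch. It is not the code from ezEngine or from the GitHub repository: `specifierMatches` and `FormatChecker` are hypothetical names, only the conversions `%i`, `%d`, `%f` and `%s` are handled, and flags, widths and precisions are ignored.

```cpp
#include <type_traits>

// Hypothetical helper: does conversion character `c` accept an argument of type T?
// Only a tiny subset of printf's conversions is covered.
template<typename T>
constexpr bool specifierMatches(char c)
{
    return (c == 'i' || c == 'd') ? std::is_integral<T>::value
         : (c == 'f')             ? std::is_floating_point<T>::value
         : (c == 's')             ? std::is_convertible<T, const char*>::value
         : false;
}

template<typename... Args>
struct FormatChecker;

// No arguments left: the rest of the string must not contain another conversion.
template<>
struct FormatChecker<>
{
    // C++11 constexpr functions consist of a single return statement, hence the recursion.
    static constexpr bool check(const char* f)
    {
        return f[0] == '\0' ? true
             : f[0] != '%'  ? check(f + 1)
             : f[1] == '%'  ? check(f + 2)   // "%%" is an escaped percent sign
             : false;                        // conversion without an argument
    }
};

// At least one argument left: the next conversion must match its type.
template<typename T, typename... Rest>
struct FormatChecker<T, Rest...>
{
    static constexpr bool check(const char* f)
    {
        return f[0] == '\0' ? false          // arguments left but no conversion
             : f[0] != '%'  ? check(f + 1)
             : f[1] == '%'  ? check(f + 2)
             : specifierMatches<T>(f[1]) && FormatChecker<Rest...>::check(f + 2);
    }
};

static_assert(FormatChecker<int, const char*>::check("%i items in %s"), "format mismatch");
static_assert(!FormatChecker<const char*>::check("%i"), "a string must not pass as %i");
```

The second `static_assert` corresponds to the broken printf call above: the checker rejects a string argument for `%i` before the program ever runs.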


Microproject: OptionalScopedObject template class

(Published in similar form on GitHub)

Recently I ran into a weird situation where a threading lock was either passed, or not (since locking might or might not be appropriate in a given situation). Since I am always very concerned about other people adding early out return statements and forgetting to do the cleanup (here: unlocking the lock), I wanted to use a scoped version of the lock. But having the scope optional forced me into using an allocation:

void somethingProbablyUnsafe(ThreadingLock* lock)
{
    // unrelated code
    // ...

    // Only create the scoped lock if a lock was actually passed.
    std::unique_ptr<ScopedLock> scopedLock;
    if (lock)
        scopedLock.reset(new ScopedLock(*lock));

    // Complicated code with early outs.
    // ...
}

This certainly solves the problem – the lock is still scoped and nobody needs to worry about having the lock unlocked. Way better than checking the lock variable at each return point. But it requires an allocation and no longer keeps the ScopedLock on the stack as it is intended to be, how awful! 😦

Template programming to the rescue! Using variadic templates and placement new we can solve this problem in no time, once and for all! 🙂

#include <new>      // placement new
#include <utility>  // std::forward

// Class wrapper to provide an optional scoped object on the stack without any dynamic allocations.
template<typename ScopedObject>
class OptionalScopedObject
{
public:
	// Creates a new inactive object. The underlying ScopedObject type is not initialized!
	OptionalScopedObject() : m_active(false) {}

	// If the underlying ScopedObject was created via construct, it will be destructed.
	~OptionalScopedObject() { destruct(); }

	OptionalScopedObject(const OptionalScopedObject&) = delete;
	void operator = (const OptionalScopedObject&) = delete;

	// Constructs the object. Will destruct the object first if there was already one.
	// Arguments are forwarded so that references are not turned into copies on the way.
	template<typename... Args>
	void construct(Args&&... args)
	{
		if (m_active)
			destruct();
		new (m_memory) ScopedObject(std::forward<Args>(args)...);
		m_active = true;
	}

	// Destructs the underlying object if one has been previously constructed.
	void destruct()
	{
		if (m_active)
		{
			reinterpret_cast<ScopedObject*>(m_memory)->~ScopedObject();
			m_active = false;
		}
	}

	// Access to underlying object. Only valid while an object is constructed.
	const ScopedObject* operator -> () const
	{
		return reinterpret_cast<const ScopedObject*>(m_memory);
	}
	const ScopedObject& operator * () const
	{
		return *reinterpret_cast<const ScopedObject*>(m_memory);
	}
	ScopedObject* operator -> ()
	{
		return reinterpret_cast<ScopedObject*>(m_memory);
	}
	ScopedObject& operator * ()
	{
		return *reinterpret_cast<ScopedObject*>(m_memory);
	}

	// Whether the underlying object is constructed or not.
	operator bool() const
	{
		return m_active;
	}

private:
	// Raw storage for the object, aligned like ScopedObject itself.
	alignas(ScopedObject) char m_memory[sizeof(ScopedObject)];
	bool m_active;
};


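void somethingProbablyUnsafe(ThreadingLock* lock)
{
    // unrelated code
    // ...

    // Lives on the stack; only constructed if a lock was actually passed.
    OptionalScopedObject<ScopedLock> scopedLock;
    if (lock)
        scopedLock.construct(*lock);

    // Complicated code with early outs.
    // ...
}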

Scoped thread locks are of course only an example. The template can be used with any kind of scoped object that may or may not be required.

All this almost lets you forget about the simple solution: your scoped object should take a pointer and handle null. (Yeah, I forgot about that myself when I had this whole idea. Thank you for pointing that out to me, Christopher!)
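For completeness, a minimal sketch of that simple solution. `ThreadingLock` here is a hypothetical stand-in with a lock counter so that the effect is observable, and `NullableScopedLock` is a made-up name, not an existing class.

```cpp
#include <cassert>

// Hypothetical stand-in for the ThreadingLock used above; it only counts lock calls.
class ThreadingLock
{
public:
    void lock()   { ++m_lockCount; }
    void unlock() { --m_lockCount; }
    int lockCount() const { return m_lockCount; }

private:
    int m_lockCount = 0;
};

// Scoped lock that accepts a null pointer and then simply does nothing.
class NullableScopedLock
{
public:
    explicit NullableScopedLock(ThreadingLock* lock) : m_lock(lock)
    {
        if (m_lock)
            m_lock->lock();
    }

    ~NullableScopedLock()
    {
        if (m_lock)
            m_lock->unlock();
    }

    NullableScopedLock(const NullableScopedLock&) = delete;
    NullableScopedLock& operator=(const NullableScopedLock&) = delete;

private:
    ThreadingLock* m_lock;
};

// Returns the lock count as seen inside the scope (0 if no lock was passed).
int somethingProbablyUnsafe(ThreadingLock* lock)
{
    NullableScopedLock scopedLock(lock);

    // Complicated code with early outs would go here.
    return lock ? lock->lockCount() : 0;
}

// Usage: the lock is held inside the function and released on every return path.
void exampleUsage()
{
    ThreadingLock lock;
    assert(somethingProbablyUnsafe(&lock) == 1);
    assert(lock.lockCount() == 0);
    assert(somethingProbablyUnsafe(nullptr) == 0);
}
```

No allocation, no placement new, and the call site does not need a branch at all.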

Abstracting (DX12/NewAPI) Resource Transitions

The new generation of low-level rendering APIs introduced the concept of “Resource Transitions”: The API user is responsible for keeping track of how a memory resource is used and for making explicit transitions in the command list. Most common example: switching a resource between use as a texture and use as a render target.
(This is definitely true for DirectX12 and Mantle, but I do not know for sure about the upcoming Vulkan-API… although it would surprise me if they handle it differently)

How to handle this concept in a render abstraction?

  • No abstraction, pass burden of transition to user
    • Pro:
      • Highlevel renderer should know best, most resources don’t need any transition anyways
      • Zero extra overhead
      • If something goes wrong, the debug layer should be able to do the hard validation work
    • Con:
      • Tedious to keep track
      • May want to implement automatism anyway
      • Error prone: Old APIs don’t care, debug layer of new API might miss something
  • Go single-threaded and record resource usage (always only one)
    • Pro
      • Trivial way to remove all resource transitions from userland
    • Con
      • Defeats an important point of the new APIs!
  • Create extra command lists at submission to fill in unknown transitions
    • Excellent explanation here
    • Pro
      • Fully automatic and still multi-threaded, no locks
    • Con
      • Requires two levels of resource state tracking
      • Need to create command lists on the fly (or pre-cached) – how costly is that?
  • Record command lists multi-threaded in custom data structure without sending any commands to the API. When all lists are ready, missing resource transition infos can be propagated. Only then the API-side command lists are generated
    • Has anybody tried that?
    • Pro
      • Manual recording may be necessary anyway to emulate multi-threading with old APIs
    • Con
      • Requires multiple threading barriers
      • Two step recording introduces a lot of overhead
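To make the third option (extra command lists at submission) a bit more concrete, here is a rough sketch of the two levels of state tracking it needs. Everything in it is hypothetical and API independent: `ResourceState`, `RecordedCommandList` and `SubmissionQueue` are made-up names, and a real implementation would emit actual barrier commands instead of returning `Transition` records.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

enum class ResourceState { Common, RenderTarget, ShaderResource };
using ResourceId = std::uint32_t;

struct Transition
{
    ResourceId resource;
    ResourceState before;
    ResourceState after;
};

// First level: per command list, recorded on any thread without locks.
class RecordedCommandList
{
public:
    // Declares that the following commands need `resource` in `state`.
    void require(ResourceId resource, ResourceState state)
    {
        auto it = m_lastState.find(resource);
        if (it == m_lastState.end())
        {
            // The state before this list is unknown during recording.
            m_firstState[resource] = state;
            m_lastState[resource] = state;
        }
        else if (it->second != state)
        {
            m_transitions.push_back({resource, it->second, state});
            it->second = state;
        }
    }

    const std::map<ResourceId, ResourceState>& firstStates() const { return m_firstState; }
    const std::map<ResourceId, ResourceState>& lastStates() const { return m_lastState; }
    const std::vector<Transition>& transitions() const { return m_transitions; }

private:
    std::map<ResourceId, ResourceState> m_firstState;
    std::map<ResourceId, ResourceState> m_lastState;
    std::vector<Transition> m_transitions;
};

// Second level: global state, only touched at submission, in submission order.
class SubmissionQueue
{
public:
    // Returns the transitions for the extra command list that runs right before `list`.
    std::vector<Transition> submit(const RecordedCommandList& list)
    {
        std::vector<Transition> fillIn;
        for (const auto& entry : list.firstStates())
        {
            // Resources never seen before start out as ResourceState::Common.
            const ResourceState known = m_globalState[entry.first];
            if (known != entry.second)
                fillIn.push_back({entry.first, known, entry.second});
        }
        for (const auto& entry : list.lastStates())
            m_globalState[entry.first] = entry.second;
        return fillIn;
    }

private:
    std::map<ResourceId, ResourceState> m_globalState;
};

void exampleSubmission()
{
    RecordedCommandList shadowPass, lightingPass;
    shadowPass.require(1, ResourceState::RenderTarget);
    lightingPass.require(1, ResourceState::ShaderResource);

    SubmissionQueue queue;
    assert(queue.submit(shadowPass).size() == 1);   // Common -> RenderTarget
    assert(queue.submit(lightingPass).size() == 1); // RenderTarget -> ShaderResource
}
```

The open question from the list remains: the fill-in transitions still have to be put into a real command list at submission time, and the sketch says nothing about how expensive that is.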

In DirectX12 resource transitions are a special form of Resource Barriers.
Resource barriers in general were already in place in the old APIs, e.g. the infamous glMemoryBarrier. A common usage example is unordered access views, where the question of whether caches need to be flushed or not depends entirely on the shader operating on them (usage example here). None of the discussed approaches is able to abstract such UAV Barriers away entirely without generating too many of them. More or less the same goes for Aliasing Barriers (different resources on the same memory).

For the ClearSight project I am torn between the “no abstraction” approach and the “command list fill-in” approach, which sounds reasonable to me (and hey, a guy on the official DX12 edu channel told us about it!). Automating the process of resource transitions sounds very attractive and is probably a must-have for every serious render abstraction. However, I don’t think it is trivial to implement in a robust and fast fashion. And as I mentioned, it cannot abstract all resource barriers.

So for now I will just pass the burden to the user. I am curious how much extra work this actually means to the implementer of a high level rendering pipeline. I suspect that it should be fairly simple since most transitions are made deliberately. Assuming the API debug layer catches all missing transitions, it should also make it easier to see where unnecessary overhead was introduced:
An automatic system may fill in all transitions, but does not report where the graphics programmer did not intend to perform transitions.

About Resource Handles

For the sake of this article I define a resource as an abstract object that cannot or should not be created, destroyed or manipulated directly like you would access a variable. Usual examples are objects that are associated with “external memory” like files or GPU-textures.

Many game engines do not allow access to resources by (raw/ref-counted/etc.) pointers to class interfaces, but through special resource handles that don’t have any special member functions at all. A resource handle is usually an identifier which is used together with a resource manager to either:

  • perform operations on or with it
  • temporarily retrieve an interface to the resource (pointer to a class instance) that represents or even holds the actual data

I think these two possibilities are important to distinguish. I call them indirect and direct resource access (beware! I just made these names up). Note that you can easily map both access types to each other, so in the end it is more about programming interfaces!
Indirect resource access is fairly common, especially in non-object-oriented languages like C, where it is not possible to hand out pointers to resource interfaces. A good example of such indirect access handles are C’s file handles.

Either way, why should one use resource-handles in object oriented languages at all?

  • The resource system can change how a handle is resolved. This is useful for:
    • resource fallbacks
      • not found
      • still loading
    • streaming
    • uncertain memory location
    • might even be used for detail levels
  • They act similar to weak pointers
    • Resource can be destroyed explicitly:
      Subsequent queries (direct access) or function calls (indirect access) at the resource manager notify the user that the resource does not exist
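A minimal sketch of how such a handle can behave like a weak pointer and support both access types. `TextureHandle`, `Texture` and `TextureManager` are hypothetical names and not taken from any particular engine; the generation counter is one common way to detect stale handles, not the only one.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// The handle is just an identifier without any behaviour of its own.
struct TextureHandle
{
    std::uint32_t index;
    std::uint32_t generation;
};

struct Texture
{
    std::string name;
    int width;
};

class TextureManager
{
public:
    TextureHandle create(const std::string& name, int width)
    {
        // Reuse a free slot if there is one; its generation was bumped on destroy.
        for (std::uint32_t i = 0; i < m_slots.size(); ++i)
        {
            if (!m_slots[i].alive)
            {
                m_slots[i].texture = Texture{name, width};
                m_slots[i].alive = true;
                return TextureHandle{i, m_slots[i].generation};
            }
        }
        m_slots.push_back(Slot{Texture{name, width}, 0, true});
        return TextureHandle{static_cast<std::uint32_t>(m_slots.size() - 1), 0};
    }

    // Explicit destruction: all handles to this resource become stale.
    void destroy(TextureHandle handle)
    {
        if (Slot* slot = resolve(handle))
        {
            slot->alive = false;
            ++slot->generation;
        }
    }

    // Direct access: temporarily hands out the resource, or nullptr if it is gone.
    Texture* tryGet(TextureHandle handle)
    {
        Slot* slot = resolve(handle);
        return slot ? &slot->texture : nullptr;
    }

    // Indirect access: the manager performs the operation itself.
    bool getWidth(TextureHandle handle, int& outWidth)
    {
        Slot* slot = resolve(handle);
        if (!slot)
            return false;
        outWidth = slot->texture.width;
        return true;
    }

private:
    struct Slot
    {
        Texture texture;
        std::uint32_t generation;
        bool alive;
    };

    // Central place where the manager decides how a handle is resolved.
    Slot* resolve(TextureHandle handle)
    {
        if (handle.index >= m_slots.size())
            return nullptr;
        Slot& slot = m_slots[handle.index];
        return (slot.alive && slot.generation == handle.generation) ? &slot : nullptr;
    }

    std::vector<Slot> m_slots;
};

void exampleHandles()
{
    TextureManager manager;
    TextureHandle bricks = manager.create("bricks", 256);
    manager.destroy(bricks);
    assert(manager.tryGet(bricks) == nullptr); // stale handle is detected
}
```

Because every lookup goes through `resolve`, this is also the place where a fallback resource could be returned for "not found" or "still loading".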

It is all about moving power away from user handled “resource instances” to a resource system which has global knowledge.
If you cannot make use of the mentioned advantages, just go with the normal object oriented approach where you create/destroy/manipulate at any time without special supervision. The disadvantages of a handle system are rather obvious:

  • Additional indirection
  • [only direct] Resource system can be circumvented by asking for the resource only once and then keeping it

Oh so there is a disadvantage for the direct resource system but none for the indirect? Not so fast: Giving a resource object/interface to the user allows much nicer usage and can leverage polymorphism. In contrast, the indirect system needs to have all functionality in the manager which can be very unnatural and inflexible.

Of course all this is very simplified and general. In some situations a hybrid might be useful as well!
More information about “indirect” handle-based resource managers can be found for example here. An example of a “direct” handle-based resource manager can be found in the ezEngine.

Now to the practical usage in ClearSight, my current C# rendering framework (as a precaution I stopped calling it “Engine” 😉). So far I have identified two places that need some kind of resource system, each with different properties.

  • (low-level-ish) Renderer
    • ensure that resources that are still in use by the GPU (!) are not deleted immediately
    • few different resource operations
    • resource provides very few operations itself
    • creation can be assumed to be very fast; only states are thus either existing or not existing
  • higher level asset system
    • load by lookup (do not reload loaded resources)
    • fallbacks
    • many and complex resource operations
    • possible multi-threaded loading processes, different loading states

Since in C# any object that is still referenced somewhere cannot be deallocated, the advantage of a weak-pointer-like behaviour is irrelevant. Needless to say, all resources should be deallocated explicitly and thus implement IDisposable. The Dispose method can therefore decide whether a resource is actually allowed to be destroyed.
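The renderer requirement from the list above, not deleting resources the GPU is still using, could be sketched as a deferred release queue. The sketch is written in C++ like the other code on this page rather than in C#, `DeferredDeleter` is a hypothetical name, and it assumes that the renderer can ask (for example via a fence) which frame the GPU has completed most recently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

class DeferredDeleter
{
public:
    // Called from the resource's dispose path: the actual destruction is postponed
    // until the GPU has finished the last frame that used the resource.
    void release(std::uint64_t lastUsedFrame, std::function<void()> destroy)
    {
        m_pending.push_back(Pending{lastUsedFrame, std::move(destroy)});
    }

    // Called once per frame with the newest frame index the GPU has completed.
    // Returns the number of resources that were destroyed.
    std::size_t collect(std::uint64_t completedFrame)
    {
        std::size_t destroyed = 0;
        for (auto it = m_pending.begin(); it != m_pending.end();)
        {
            if (it->lastUsedFrame <= completedFrame)
            {
                it->destroy();
                it = m_pending.erase(it);
                ++destroyed;
            }
            else
            {
                ++it;
            }
        }
        return destroyed;
    }

private:
    struct Pending
    {
        std::uint64_t lastUsedFrame;
        std::function<void()> destroy;
    };

    std::vector<Pending> m_pending;
};

void exampleDeferredRelease()
{
    DeferredDeleter deleter;
    bool destroyed = false;
    deleter.release(3, [&destroyed] { destroyed = true; });

    deleter.collect(2);  // GPU is still working on frame 3
    assert(!destroyed);
    deleter.collect(3);  // frame 3 is done, now it is safe
    assert(destroyed);
}
```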

This ultimately means that, while I need a thin resource management layer in the renderer, I do not need resource handles in that place. However, I will likely use a direct handle-based resource system on top later, where a single resource might comprise one or more renderer resources and additional data.

New Sideproject, C# Engine?


Since I am now slowly getting settled down again after moving to Dublin, I have recently been thinking about starting a new side project. Doing side projects is a very natural hobby for me that allows me to play freely with ideas and gives me that cosy feeling of steady progress.
First, I thought about continuing with global illumination research (which I did in my master’s thesis) or helping out my friends at the amazing ezEngine. However, I realized that it might be fun to experiment with general infrastructure myself – something that I haven’t done for a long time. While the past showed me that I tend not to reuse code directly, there are many pieces of code in my archive that get recycled and improved in other projects – so it could be also useful for spin-offs etc.
My job involves mainly C++ coding atm, so I would rather use something else: I was tempted to use the Rust language but I am just too much in love with IDEs to try a large project with a fancy new language (note to myself: Blog about experiences with Rust). Instead I decided to go with an old friend of mine: C#

C# Pro/Cons

Personally, the biggest pros are the great IDE support (ReSharper makes it even better) and the lush standard library. Practically all engines I have seen so far have a huge core of basic functionality for containers, threads/tasks, SIMD math, file access and so on. In .NET there is already a lot of this stuff; 4.6 even has basic SIMD support and a work-stealing task scheduler, and the async functionalities are just amazing! Don’t get me wrong, writing all this by hand can be a joy on its own, but I am just not in the mood for it ;).
Also, I have to admit that I just love some of the syntactical sugar of C#, e.g. properties, lambdas, events, …

On the other hand, the garbage collector can strike at any time and can make it difficult to manage resources & lifetimes by hand where necessary (textures, buffers, but also high-level game resource management). Another con from my point of view is that I already know quite a bit about C#/.NET, which makes the whole matter a bit less interesting. Then again, recent versions introduced some cool stuff. Cross-platform support is getting really good with Microsoft’s more recent open source initiatives, but I am still sceptical.

Things to try

A loose list of a few interesting things that I especially look forward to investigating and experimenting with during this project:

  • Modern multi-threaded renderer abstraction
    • Support for DX12, Vulkan but also OpenGL 4.5 (to prove the concept and make it work on my laptop)
    • might be necessary to keep the lowest abstraction level relatively high
    • Molecule Matters has a great series about this topic with many super interesting ideas. Can’t wait to try all this in C# 🙂
  • Resource system/manager
  • Component Entity systems
    • Especially interesting in the context of .net reflection stuff
  • Unit-testing in .net
  • Modern Scenegraph
  • Run on Linux without Mono
  • Make a WinRT application

Of course there will also be a lot of seemingly less exciting things that still might be fun, but to which I haven’t given much thought yet. For example logging, messaging, special data structures, etc.

Next, I will have a look at existing engines and then start experimenting with & planning the general project structure. While .NET already offers a lot of stuff, I will probably still need some kind of base library. As always, I will make everything public on GitHub.

To be continued! 🙂

First steps with Direct3D12

Background & Motivation

The last few months I was pretty busy with my master’s thesis, which was an OpenGL 4.5 based project. Now that I’m free again, it is time to try something new. I have some experience in D3D11 but it’s been a while.

Since there are already a lot of useful resources, I won’t write a full “First Triangle” tutorial but rather link to others. This post just sums up my first steps & insights. Maybe it’s useful for someone else too.

Preparation & Prerequisites:

  • Windows 10
  • Visual Studio 2015
    • There is a free Community Edition
  • Windows 10 SDK (not included in Visual Studio!)
  • A driver that supports DX12
    • Nvidia: Fermi or newer
    • AMD: GCN 1.0 or newer
    • Intel: Haswell or newer
    • if you have a Fermi+IvyBridge “Optimus” combination like my laptop, you are out of luck. Looks like there is no way to get it running >.<

What’s New, Overview

Gathered from here and some other places where I found things that were especially interesting to me.

  • The new lowlevel API itself.
    • There is no Map/Lock-Discard and SubResource anymore! It is like using D3D11’s NO_OVERWRITE or OpenGL4.5 persistent mapping all the time. That means you need to pipeline updates yourself.
    • No reference counting for actual memory, only for interfaces.
    • There is only one large state object that contains everything: The Pipeline State Object (PSO)
    • Command Lists that are submitted to the GPU instead of (one or more) immediate or deferred contexts
    • “Bundles” allow recording certain commands, a bit like Display Lists in old OpenGL
    • …………..
  • It is finally possible to perform asynchronous readbacks
  • A few new hardware features, exposed as well in DX11.3
  • There is an open source header with C++ helper structs. It’s called d3dx12.h but does not have much in common with the good old D3DX from the DX9 days. You can find such functionality in GitHub repos like DirectXTex
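What “pipelining updates yourself” can look like in its simplest form: one permanently mapped buffer is split into one region per frame in flight, and the CPU only writes a region once the GPU is done with the frame that last used it. This is an API independent sketch; `FrameRing` is a hypothetical name, and the number of completed frames is assumed to come from something like a fence.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Splits one persistently mapped buffer into `framesInFlight` equally sized regions.
class FrameRing
{
public:
    FrameRing(std::size_t bytesPerFrame, std::uint64_t framesInFlight)
        : m_bytesPerFrame(bytesPerFrame), m_framesInFlight(framesInFlight)
    {
    }

    // Byte offset of the region that frame `frameIndex` (0-based) writes to.
    std::size_t offsetFor(std::uint64_t frameIndex) const
    {
        return static_cast<std::size_t>(frameIndex % m_framesInFlight) * m_bytesPerFrame;
    }

    // The region of frame N was last used by frame N - framesInFlight.
    // It may be overwritten once the GPU has completed that older frame,
    // i.e. once more than N - framesInFlight frames have been completed.
    bool canWrite(std::uint64_t frameIndex, std::uint64_t completedFrameCount) const
    {
        return frameIndex < completedFrameCount + m_framesInFlight;
    }

private:
    std::size_t m_bytesPerFrame;
    std::uint64_t m_framesInFlight;
};

void exampleFrameRing()
{
    FrameRing ring(256, 3);
    assert(ring.offsetFor(4) == 256); // frame 4 shares its region with frame 1
    assert(!ring.canWrite(4, 1));     // frame 1 has not been completed yet
    assert(ring.canWrite(4, 2));      // frames 0 and 1 are done
}
```

If `canWrite` returns false, the CPU has to wait for the GPU (or use more regions); nothing discards or renames the memory behind your back anymore.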


Porting to Direct3D12 might be hard and does not guarantee that your application will run faster. Aras recently gave a talk about the issues you might encounter (siggraph15). According to him, it’s like porting from loose constants to constant buffers – you need a lot more “global” information about your rendering process.

External Resources

Here are some useful links to get started:

I have an overview diagram of D3D12 in the making and will publish it as soon as I have a bit more experience. Stay tuned!