VS extension development introduction talk slides

At work we have self-organized talks about pretty much anything people come up with. This week it was my turn, and I gave this little talk about Visual Studio extension development.
I'll just leave this here in case anyone else is interested 🙂
vsextension talk slides – 1.4mb pptx

C++11 compile time checked printf format

A few notes on how I wrote a 100% compile-time checked printf-style format type checker. The source code is online on GitHub.


I recently worked a bit on ezEngine again, a beautiful open-source game engine that a few friends of mine continuously develop in their spare time. ezEngine uses a custom formatting function that follows the style of (s)printf. Like printf, it is based on C varargs and therefore not type safe. Example:

printf("%i", "not an int");
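To give a rough idea of what a compile-time check can look like, here is a minimal C++11 sketch. It is not the implementation from the GitHub repository: it only counts conversion specifiers in a string literal (treating "%%" as a literal percent sign) and does not check argument types. The name countSpecifiers is made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Counts printf-style conversion specifiers in a format string.
// "%%" is a literal percent sign and is not counted.
// C++11 constexpr functions allow only a single return statement, hence the recursion.
constexpr std::size_t countSpecifiers(const char* s)
{
    return *s == '\0' ? 0
         : (*s == '%' && s[1] == '%') ? countSpecifiers(s + 2)
         : (*s == '%') ? 1 + countSpecifiers(s + 1)
         : countSpecifiers(s + 1);
}

// Evaluated entirely by the compiler: a mismatch becomes a compile error.
static_assert(countSpecifiers("%i of %i (%s), 100%%") == 3,
              "unexpected number of format specifiers");
```

A real checker would additionally compare each specifier against the type of the corresponding argument, which is where variadic templates come in.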


Microproject: OptionalScopedObject template class

(Published in similar form on GitHub)

Recently I ran into a weird situation where a threading lock was either passed, or not (since locking might or might not be appropriate in a given situation). Since I am always very concerned about other people adding early out return statements and forgetting to do the cleanup (here: unlocking the lock), I wanted to use a scoped version of the lock. But having the scope optional forced me into using an allocation:

void somethingProbablyUnsafe(ThreadingLock* lock)
{
    // unrelated code
    // ...

    // Only lock if a lock was passed at all.
    std::unique_ptr<ScopedLock> scopedLock;
    if (lock)
        scopedLock.reset(new ScopedLock(*lock));

    // Complicated code with early outs.
    // ...
}

This certainly solves the problem – the lock is still scoped and nobody needs to worry about having the lock unlocked. Way better than checking the lock variable at each return point. But it requires an allocation and no longer keeps the ScopedLock on the stack as it is intended to be, how awful! 😦

Template programming to the rescue! Using variadic templates and placement new we can solve this problem in no time, once and for all! 🙂

#include <new>     // placement new
#include <utility> // std::forward

// Class wrapper to provide an optional scoped object on the stack without any dynamic allocations.
template<typename ScopedObject>
class OptionalScopedObject
{
public:
	// Creates a new inactive object. The underlying ScopedObject type is not initialized!
	OptionalScopedObject() : m_active(false) {}

	// If the underlying ScopedObject was created via construct, it will be destructed.
	~OptionalScopedObject() { destruct(); }

	OptionalScopedObject(const OptionalScopedObject&) = delete;
	void operator = (const OptionalScopedObject&) = delete;

	// Constructs the object. Will destruct the object first if there was already one.
	// Arguments are forwarded so that references (e.g. to a lock) are not copied.
	template<typename... Args>
	void construct(Args&&... args)
	{
		if (m_active)
			destruct();
		new (m_memory) ScopedObject(std::forward<Args>(args)...);
		m_active = true;
	}

	// Destructs the underlying object if one has been previously constructed.
	void destruct()
	{
		if (m_active)
		{
			reinterpret_cast<ScopedObject*>(m_memory)->~ScopedObject();
			m_active = false;
		}
	}

	// Access to underlying object. Only valid while the object is constructed.
	const ScopedObject* operator -> () const { return reinterpret_cast<const ScopedObject*>(m_memory); }
	const ScopedObject& operator * () const { return *reinterpret_cast<const ScopedObject*>(m_memory); }
	ScopedObject* operator -> () { return reinterpret_cast<ScopedObject*>(m_memory); }
	ScopedObject& operator * () { return *reinterpret_cast<ScopedObject*>(m_memory); }

	// Whether the underlying object is constructed or not.
	operator bool() const { return m_active; }

private:
	// Raw storage with the alignment the ScopedObject type requires.
	alignas(ScopedObject) char m_memory[sizeof(ScopedObject)];
	bool m_active;
};


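void somethingProbablyUnsafe(ThreadingLock* lock)
{
    // unrelated code
    // ...

    // Lives on the stack; the ScopedLock is only constructed if a lock was passed.
    OptionalScopedObject<ScopedLock> scopedLock;
    if (lock)
        scopedLock.construct(*lock);

    // Complicated code with early outs.
    // ...
}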

Scoped thread locks are of course only an example. The template can be used with any kind of scoped object that may or may not be required.

All this almost makes you forget about the simple solution: Your scoped object should take a pointer and it should handle null. (Yeah, I forgot about that myself when I had this whole idea. Thank you for pointing that out to me, @ Christopher)
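For completeness, a minimal sketch of that simple solution. ThreadingLock here is a hypothetical stand-in with a lock depth counter so the effect is observable, and NullableScopedLock and depthInsideScope are names invented for this example.

```cpp
#include <cassert>

// Hypothetical stand-in for the ThreadingLock type; counts the lock depth for demonstration.
struct ThreadingLock
{
    int depth = 0;
    void lock()   { ++depth; }
    void unlock() { --depth; }
};

// Scoped lock that takes a pointer and treats nullptr as "nothing to lock".
class NullableScopedLock
{
public:
    explicit NullableScopedLock(ThreadingLock* lock) : m_lock(lock)
    {
        if (m_lock)
            m_lock->lock();
    }

    ~NullableScopedLock()
    {
        if (m_lock)
            m_lock->unlock();
    }

    NullableScopedLock(const NullableScopedLock&) = delete;
    NullableScopedLock& operator=(const NullableScopedLock&) = delete;

private:
    ThreadingLock* m_lock;
};

// Returns the lock depth seen while the scoped lock is alive, or -1 if no lock was passed.
int depthInsideScope(ThreadingLock* lock)
{
    NullableScopedLock scoped(lock);
    return lock ? lock->depth : -1;
}
```

No allocation, no template, and every early out still unlocks. The price is that the scoped type itself has to know about the optional case.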

Abstracting (DX12/NewAPI) Resource Transitions

The new generation of low-level rendering APIs introduced the concept of “Resource Transitions”: The API user is responsible for keeping track of how a memory resource is used and for making explicit transitions in the command list. The most common example: a texture that was used as a render target is subsequently read in a shader, or vice versa.
(This is definitely true for DirectX12 and Mantle, but I do not know for sure about the upcoming Vulkan API… although it would surprise me if they handled it differently)

How to handle this concept in a render abstraction?

  • No abstraction, pass burden of transition to user
    • Pro:
      • The high-level renderer should know best; most resources don’t need any transitions anyway
      • Zero extra overhead
      • If something goes wrong, the debug layer should be able to do the hard validation work
    • Con:
      • Tedious to keep track
      • May want to implement automation anyway
      • Error prone: Old APIs don’t care, debug layer of new API might miss something
  • Go single-threaded and record resource usage (always only one)
    • Pro
      • Trivial way to remove all resource transitions from userland
    • Con
      • Defeats an important point of the new APIs!
  • Create extra command lists at submission to fill in unknown transitions
    • Excellent explanation here
    • Pro
      • Fully automatic and still multi-threaded, no locks
    • Con
      • Requires two levels of resource state tracking
      • Need to create command lists on the fly (or pre-cached) – how costly is that?
  • Record command lists multi-threaded in a custom data structure without sending any commands to the API. When all lists are ready, missing resource transition info can be propagated. Only then are the API-side command lists generated
    • Has anybody tried that?
    • Pro
      • Manual recording may be necessary anyway to emulate multi-threading with old APIs
    • Con
      • Requires multiple threading barriers
      • Two step recording introduces a lot of overhead
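To make the second option ("go single-threaded and record resource usage") more concrete, here is a minimal sketch. All types are hypothetical and API-independent; ResourceState is not the actual DirectX12 enumeration, and an unknown resource is simply assumed to start in a Common state.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical, API-independent resource states.
enum class ResourceState { Common, RenderTarget, ShaderResource };

using ResourceId = int;

struct Transition
{
    ResourceId resource;
    ResourceState before;
    ResourceState after;
};

// Keeps exactly one state per resource. This only works if all commands
// are recorded on one thread in the order in which they are executed.
class SingleThreadedStateTracker
{
public:
    // Declares that the next command needs `resource` in state `needed`.
    // A transition is recorded only if the state actually changes.
    void require(ResourceId resource, ResourceState needed)
    {
        const auto it = m_states.find(resource);
        const ResourceState current = (it == m_states.end()) ? ResourceState::Common : it->second;
        if (current != needed)
        {
            m_transitions.push_back({resource, current, needed});
            m_states[resource] = needed;
        }
    }

    const std::vector<Transition>& transitions() const { return m_transitions; }

private:
    std::unordered_map<ResourceId, ResourceState> m_states;
    std::vector<Transition> m_transitions;
};
```

As soon as two threads record commands at the same time, "the current state" of a resource is no longer well defined during recording, which is exactly why the remaining options exist.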

In DirectX12 resource transitions are a special form of Resource Barriers.
Resource barriers in general were already in place in the old APIs, e.g. the infamous glMemoryBarrier. A common usage example is unordered access views, where the question of whether caches need to be flushed depends entirely on the shader operating on them (usage example here). None of the discussed approaches can abstract such UAV Barriers away entirely without generating too many of them. More or less the same goes for Aliasing Barriers (different resources on the same memory).

For the ClearSight project I am torn between “no abstraction” and the “command list fill-in”, which sounds reasonable to me (and hey, a guy on the official DX12 edu channel told us about it!). Automating the process of resource transitions sounds very attractive and is probably a must-have for every serious render abstraction. However, I don’t think it is trivial to implement in a robust and fast fashion. And as I mentioned, it cannot abstract all resource barriers.
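My understanding of the “command list fill-in” idea, as a rough sketch: each command list tracks on its own which state it first needs a resource in and which state it leaves it in. Only at submission, where the order of lists is known, are the missing transitions computed and put into an extra command list. The types and the function resolveAtSubmission are hypothetical and not part of any real API.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical, API-independent resource states.
enum class ResourceState { Common, RenderTarget, ShaderResource };

using ResourceId = int;

struct Transition
{
    ResourceId resource;
    ResourceState before;
    ResourceState after;
};

// First tracking level: local to one command list, needs no locks while recording.
struct RecordedList
{
    std::map<ResourceId, ResourceState> firstRequired; // state needed on first use
    std::map<ResourceId, ResourceState> lastState;     // state left behind at the end

    void require(ResourceId resource, ResourceState needed)
    {
        // emplace keeps an existing entry, so only the first requirement is stored.
        firstRequired.emplace(resource, needed);
        // Transitions between known states inside the list would be recorded directly (omitted).
        lastState[resource] = needed;
    }
};

// Second tracking level: global, runs at submission in execution order.
// Returns the transitions for the extra command list that has to run right before `list`.
std::vector<Transition> resolveAtSubmission(std::map<ResourceId, ResourceState>& globalStates,
                                            const RecordedList& list)
{
    std::vector<Transition> fixups;
    for (const auto& entry : list.firstRequired)
    {
        const auto it = globalStates.find(entry.first);
        const ResourceState current = (it == globalStates.end()) ? ResourceState::Common : it->second;
        if (current != entry.second)
            fixups.push_back({entry.first, current, entry.second});
    }
    for (const auto& entry : list.lastState)
        globalStates[entry.first] = entry.second;
    return fixups;
}
```

This shows both the appeal and the cost: recording stays lock-free, but there are two levels of state tracking and potentially one small extra command list per submitted list.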

So for now I will just pass the burden to the user. I am curious how much extra work this actually means to the implementer of a high level rendering pipeline. I suspect that it should be fairly simple since most transitions are made deliberately. Assuming the API debug layer catches all missing transitions, it should also make it easier to see where unnecessary overhead was introduced:
An automatic system may fill in all transitions, but does not report where the graphics programmer did not intend to perform transitions.

About Resource Handles

For the sake of this article I define a resource as an abstract object that cannot or should not be created, destroyed or manipulated directly like you would access a variable. Usual examples are objects that are associated with “external memory” like files or GPU textures.

Many game engines do not allow access to resources by (raw/ref-counted/etc.) pointers to class interfaces, but through special resource handles that don’t have any special member functions at all. A resource handle is usually an identifier which is used together with a resource manager to either:

  • perform operations on or with it
  • temporarily retrieve an interface to the resource (pointer to a class instance) that represents or even holds the actual data

I think these two possibilities are important to distinguish. I call them indirect and direct resource access (beware! I just made these names up). Note that you can easily map both access types to each other, so in the end it is more about programming interfaces!
Indirect resource access is fairly common, especially in non-object-oriented languages like C, where it is not possible to give out pointers to resource interfaces. A good example of such indirect access handles are C’s file handles.
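The difference is easiest to see side by side. The following sketch uses a made-up Texture type and TextureManager; the handle is just an index, and nothing here is taken from a particular engine.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical resource type, for illustration only.
struct Texture
{
    std::string name;
    int width = 0;
};

// The handle is a plain identifier without any member functions.
struct TextureHandle
{
    std::size_t index;
};

class TextureManager
{
public:
    TextureHandle create(const std::string& name, int width)
    {
        Texture texture;
        texture.name = name;
        texture.width = width;
        m_textures.push_back(texture);
        return TextureHandle{m_textures.size() - 1};
    }

    // Indirect access: every operation goes through the manager.
    int getWidth(TextureHandle handle) const { return m_textures[handle.index].width; }
    void resize(TextureHandle handle, int width) { m_textures[handle.index].width = width; }

    // Direct access: the manager temporarily hands out the resource interface.
    // The pointer must not be kept; creating another texture may invalidate it.
    Texture* resolve(TextureHandle handle) { return &m_textures[handle.index]; }

private:
    std::vector<Texture> m_textures;
};
```

Both getWidth and resolve end up at the same object, which is what I mean by the two access types mapping onto each other.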

Either way, why should one use resource-handles in object oriented languages at all?

  • The resource system can change how a handle is resolved. This is useful for:
    • resource fallbacks
      • not found
      • still loading
    • streaming
    • uncertain memory location
    • might even be used for detail levels
  • They act similarly to weak pointers
    • Resource can be destroyed explicitly:
      Subsequent queries (direct access) or function calls (indirect access) at the resource manager notify the user that the resource does not exist
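A minimal sketch of how a resource system can provide both the fallback and the weak-pointer-like behaviour. Pairing a slot index with a generation counter is one common way to do it, not the only one; Mesh, MeshHandle and MeshManager are hypothetical names, and slot reuse is left out to keep the example short.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical resource type, for illustration only.
struct Mesh
{
    std::string name;
};

// Handle = slot index + generation. The generation makes stale handles detectable.
struct MeshHandle
{
    std::uint32_t index;
    std::uint32_t generation;
};

class MeshManager
{
public:
    MeshManager() { m_fallback.name = "fallback"; }

    MeshHandle create(const std::string& name)
    {
        Slot slot;
        slot.mesh.name = name;
        slot.generation = 1;
        slot.alive = true;
        m_slots.push_back(slot);
        return MeshHandle{static_cast<std::uint32_t>(m_slots.size() - 1), 1};
    }

    // Explicit destruction: all handles to this slot become invalid.
    // Bumping the generation keeps them invalid even if the slot were reused later.
    void destroy(MeshHandle handle)
    {
        if (isValid(handle))
        {
            m_slots[handle.index].alive = false;
            ++m_slots[handle.index].generation;
        }
    }

    // Lets the user find out that the resource no longer exists.
    bool isValid(MeshHandle handle) const
    {
        return handle.index < m_slots.size()
            && m_slots[handle.index].alive
            && m_slots[handle.index].generation == handle.generation;
    }

    // Direct access that resolves to a fallback resource instead of failing.
    const Mesh& resolve(MeshHandle handle) const
    {
        return isValid(handle) ? m_slots[handle.index].mesh : m_fallback;
    }

private:
    struct Slot
    {
        Mesh mesh;
        std::uint32_t generation;
        bool alive;
    };

    std::vector<Slot> m_slots;
    Mesh m_fallback;
};
```

The same resolve function is the natural place for the other points in the list: returning a placeholder while a resource is still loading, or picking a different detail level.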

It is all about moving power away from user-handled “resource instances” to a resource system which has global knowledge.
If you cannot make use of the mentioned advantages, just go with the normal object oriented approach where you create/destroy/manipulate at any time without special supervision. The disadvantages of a handle system are rather obvious:

  • Additional indirection
  • [only direct] Resource system can be circumvented by asking for the resource only once and then keeping it

Oh so there is a disadvantage for the direct resource system but none for the indirect? Not so fast: Giving a resource object/interface to the user allows much nicer usage and can leverage polymorphism. In contrast, the indirect system needs to have all functionality in the manager which can be very unnatural and inflexible.

Of course all this is very simplified and general. In some situations a hybrid might be useful as well!
More info about “indirect” handle-based resource managers can be found for example here. An example of a “direct” handle-based resource manager can be found in the ezEngine.

Now to the practical usage in ClearSight, my current C# rendering framework (as a precaution I stopped calling it “Engine” 😉 ). So far I have identified two places which are in need of some kind of resource system with different properties.

  • (low-level-ish) Renderer
    • ensure that resources that are still in use by the GPU (!) are not deleted immediately
    • few different resource operations
    • resource provides very few operations itself
    • creation can be assumed to be very fast; only states are thus either existing or not existing
  • higher level asset system
    • load by lookup (do not reload loaded resources)
    • fallbacks
    • many and complex resource operations
    • possible multi-threaded loading processes, different loading states
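The first renderer requirement, not deleting resources the GPU may still be using, can be sketched as a deferred deletion queue. ClearSight itself is written in C#; the sketch is in C++ only to stay consistent with the other listings, and the frame counter stands in for whatever fence mechanism the API provides. DeferredDeletionQueue is a hypothetical name, and entries are assumed to be enqueued in non-decreasing frame order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Defers destruction until the GPU has finished the frame in which a resource was last used.
class DeferredDeletionQueue
{
public:
    // `lastUsedFrame` is the last frame whose GPU work may still reference the resource.
    // `destroy` performs the actual release.
    void enqueue(std::uint64_t lastUsedFrame, std::function<void()> destroy)
    {
        m_pending.push_back(Entry{lastUsedFrame, std::move(destroy)});
    }

    // `completedFrame` is the newest frame the GPU is known to have finished.
    // Destroys everything that is safe to destroy and returns how many resources that was.
    std::size_t collect(std::uint64_t completedFrame)
    {
        std::size_t destroyed = 0;
        while (!m_pending.empty() && m_pending.front().lastUsedFrame <= completedFrame)
        {
            m_pending.front().destroy();
            m_pending.pop_front();
            ++destroyed;
        }
        return destroyed;
    }

private:
    struct Entry
    {
        std::uint64_t lastUsedFrame;
        std::function<void()> destroy;
    };

    std::deque<Entry> m_pending;
};
```

Calling collect once per frame is enough; the user-facing "delete" then only enqueues.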

Since in C# any object that is still referenced somewhere cannot be deallocated, the advantage of a weak-pointer-like behaviour is irrelevant. Needless to say, all resources should be deallocated explicitly and thus implement IDisposable. The Dispose method can therefore decide whether a resource is actually allowed to be destroyed.

This ultimately means that, while I need thin resource management in the renderer, I do not need resource handles in that place. However, I will likely use a direct handle-based resource system on top later, where a single resource might comprise one or more renderer resources and additional data.