PtBi update and source release

I just released a new version of PtBi (5.1729). It’s a minor update that adds a few small features people were asking for:

  • A nearest neighbour scaling mode.
  • The ability to bind keys to switch directly to a given AA or scaling mode (instead of going through the available modes step by step). See keys.ini for details and some examples.

More importantly, I also uploaded an initial commit of the PtBi source to GitHub. Getting it to build will probably be a bit tricky at first because of the dependencies, but I hope it is useful for someone.

 

C++11 chrono timers

I’m a pretty big proponent of C++ as a language, and particularly enthused about C++11 and how it makes the language even better. Sadly, reality still lags a bit behind the specification in many areas.

One thing that was always troublesome in C++, particularly in high-performance or realtime programming, was the lack of a standard, platform-independent way of getting a high-resolution timer. If you wanted cross-platform compatibility and a small timing period, you had to use an external library, use OpenMP, or roll your own on each supported platform.

C++11 introduced the std::chrono namespace. At least in theory, it provides everything you always wanted in terms of timing, right there in the standard library. Three clocks are offered for different use cases: system_clock, steady_clock and high_resolution_clock.
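For reference, basic use of these clocks looks roughly like the following. This is a minimal sketch: the helper name elapsed_ns is made up for illustration, and steady_clock is picked here only because it is guaranteed never to run backwards.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: runs `work` once and returns the elapsed wall time
// in nanoseconds, as measured by std::chrono::steady_clock.
template <typename F>
std::int64_t elapsed_ns(F&& work) {
    using clock = std::chrono::steady_clock;
    const clock::time_point start = clock::now();
    work();
    const clock::time_point stop = clock::now();
    // steady_clock is monotonic, so this cannot fire.
    assert(stop >= start);
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling elapsed_ns with a lambda wrapping the code of interest gives a single measurement; how meaningful that number is depends on the actual resolution and overhead of the clock on the platform in question.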

Yesterday I wrote a small program to query and test these clocks in practice on different platforms. Here are the results:

So, sadly, not everything is as great as it could be yet. For each platform, the first three blocks are the values reported by each clock, and the last block contains values determined by repeated measurements:

  • “period” is the tick period reported by each clock, in nanoseconds.
  • “unit” is the unit used by clock values, also in nanoseconds.
  • “steady” indicates whether the time between ticks is always constant for the given clock.
  • “time/iter, no clock” is the time per loop iteration for the measurement loop without the actual measurement. It’s just a reference value to better judge the overhead of the clock measurements.
  • “time/iter, clock” is the average time per iteration, with clock measurement.
  • “min time delta” is the minimum difference between two consecutive, non-identical time measurements.
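The reported values in the list above can be read directly from the clock types. The sketch below assumes that “period” maps to Clock::period and “unit” to Clock::duration::period, which is a guess at the mapping rather than something stated above; the names ClockInfo, ratio_to_ns and describe_clock are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <ratio>

// Hypothetical container for what a clock reports about itself.
struct ClockInfo {
    double period_ns;  // Clock::period, converted to nanoseconds
    double unit_ns;    // Clock::duration::period, converted to nanoseconds
    bool steady;       // Clock::is_steady
};

// Converts a std::ratio (a fraction of a second) to nanoseconds.
template <typename Ratio>
double ratio_to_ns() {
    return 1e9 * static_cast<double>(Ratio::num) / static_cast<double>(Ratio::den);
}

template <typename Clock>
ClockInfo describe_clock() {
    const double period = ratio_to_ns<typename Clock::period>();
    const double unit = ratio_to_ns<typename Clock::duration::period>();
    assert(period > 0.0 && unit > 0.0);
    return ClockInfo{period, unit, Clock::is_steady};
}
```

Note that these are only the values the implementation claims. They say nothing about how often the clock value really changes, which is why the measured columns exist.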

On Linux with GCC 4.8.1, all clocks report a tick period of 1 nanosecond. There isn’t really a reason to doubt that, and it’s obviously a great granularity. However, the drawback is that it takes around 120 nanoseconds on average to get a clock measurement. This would be understandable for the system clock, but seems excessive in the other cases, and could cause significant perturbation when trying to measure/instrument small code areas.

On Windows with VS12, a clock period of 100 nanoseconds is reported, but the actual measured tick period is a whopping 1,000,000 ns (1 millisecond). That is obviously unusable for many of the use cases that would call for a “high resolution clock”. Windows is perfectly capable of supplying a true high-resolution clock measurement, so this poor resolution is quite surprising. On the bright side, a measurement takes just 9 nanoseconds on average.

Clearly, both implementations tested here still have a way to go. If you want to test your own platform(s), here is the very simple program:
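A minimal sketch of the measured part of such a test might look like this. It is not necessarily identical to the program used for the numbers above: the names ClockMeasurement and measure_clock are hypothetical, the “no clock” reference loop is left out, and the per-call cost includes the small overhead of computing the deltas.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>

// Hypothetical result type for the measured values.
struct ClockMeasurement {
    double ns_per_call;         // average cost of one loop iteration with Clock::now()
    std::int64_t min_delta_ns;  // smallest positive step between consecutive readings;
                                // stays at the int64 maximum if no step was observed
};

template <typename Clock>
ClockMeasurement measure_clock(int iterations) {
    assert(iterations > 0);
    using ns = std::chrono::nanoseconds;
    std::int64_t min_delta = std::numeric_limits<std::int64_t>::max();

    // The loop as a whole is timed with steady_clock, independent of Clock.
    const auto begin = std::chrono::steady_clock::now();
    auto previous = Clock::now();
    for (int i = 0; i < iterations; ++i) {
        const auto current = Clock::now();
        const std::int64_t delta =
            std::chrono::duration_cast<ns>(current - previous).count();
        // Identical consecutive readings (delta == 0) are skipped.
        if (delta > 0 && delta < min_delta) min_delta = delta;
        previous = current;
    }
    const auto end = std::chrono::steady_clock::now();

    const double total_ns =
        static_cast<double>(std::chrono::duration_cast<ns>(end - begin).count());
    return ClockMeasurement{total_ns / iterations, min_delta};
}
```

Calling measure_clock<std::chrono::high_resolution_clock>(1000000), and likewise for the other two clocks, yields values comparable to “time/iter, clock” and “min time delta”. With a coarse clock, the iteration count has to be large enough for the loop to span at least one tick.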

 

PtBi version 5

I just released a new major version of PtBi, with two new features.

Dolby Digital 5.1 decoding

PtBi can now decode audio streams transmitted in Dolby Digital 5.1 format. Together with the existing DTS 5.1 decoding, this should now allow for true surround sound from almost any source. I believe that PtBi is the only Blackmagic Intensity capture program with this type of audio support.
At first this was easier than I expected, because the decoding library works very similarly to the one I used for DTS, but then I was stuck for hours without any progress. It turns out that someone thought it would be a good idea to standardize a bitstream format such that it can be either big-endian or little-endian. Ugh.
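To illustrate the byte-order issue, here is a minimal sketch of how the two variants can be told apart. It is not PtBi’s actual code. It relies on the fact that a Dolby Digital (AC-3) frame begins with the 16-bit sync word 0x0B77, and it assumes the buffer starts exactly at a frame boundary; the names ByteOrder, detect_order and swap_16bit_words are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

enum class ByteOrder { BigEndian, LittleEndian, Unknown };

// Looks at the first two bytes of a buffer assumed to start at a frame boundary.
ByteOrder detect_order(const std::vector<std::uint8_t>& frame) {
    if (frame.size() < 2) return ByteOrder::Unknown;
    if (frame[0] == 0x0B && frame[1] == 0x77) return ByteOrder::BigEndian;
    if (frame[0] == 0x77 && frame[1] == 0x0B) return ByteOrder::LittleEndian;
    return ByteOrder::Unknown;
}

// Swaps the two bytes of every 16-bit word in place.
void swap_16bit_words(std::vector<std::uint8_t>& data) {
    assert(data.size() % 2 == 0);  // the stream is treated as whole 16-bit words
    for (std::size_t i = 0; i + 1 < data.size(); i += 2) {
        std::swap(data[i], data[i + 1]);
    }
}
```

If detect_order reports LittleEndian, running swap_16bit_words over the frame brings it into the big-endian form, assuming that is the layout the decoder expects.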

SMAA integration

In addition to the existing FXAA, PXAA and TPXAA post-processing AA modes, PtBi now also supports SMAA1x. SMAA1x has slightly better edge quality and motion stability than FXAA. I’ll look into integrating SMAA with my predication filters at some point in the future.

 

Also, I plan to release the source code for PtBi soon-ish. I was always reluctant to do this, since some of it is based on pretty terrible code I wrote almost a decade ago, but I have cleaned it up slightly now. Some parts of it, like how to integrate the AA modes in OpenGL or how to use the various libraries for audio decoding/playback, might be useful to someone. It could also help people identify and solve problems with AMD cards, which are always very hard for me to test/debug without access to the hardware.