Improving performance using hardware layers Debugging QML applications

C

Layer and VRAM optimization guide for Infineon TRAVEO T2G boards

Overview

The Infineon TRAVEO T2G boards have a rather unique layer and graphics driver architecture. This guide explains how to optimize graphics performance and VRAM memory usage on these boards.

As background, you might first want to familiarize yourself with the basic hardware layer concepts explained on the Improving performance using hardware layers page.

Infineon TRAVEO T2G hardware layers

On TRAVEO T2G, there are the following hardware layer types:

Image layer

This layer type allows pixel data to be read directly from addressable internal or external flash memory, and is suitable for static background or foreground image data. It corresponds to the ImageLayer QML type.

IBO layer

IBO layers consist of double buffered framebuffers in VRAM. The TRAVEO T2G graphics engines draw directly into the back-buffer while the front-buffer is being composited onto the final display image.

The rendering hint NoRenderingHint is used to set the IBO layer type:

ItemLayer {
    renderingHints: ItemLayer.NoRenderingHint
}

Because of double buffering, the memory usage of a 800x480 IBO layer with two bytes per pixel would be as follows:

WIDTH x HEIGHT x BUFFERS x BPP = 800 x 480 x 2 x 2 bytes = 1.54 MB

LBO to memory layer

Like IBO layers, LBO to memory layers also consist of double buffered framebuffers in VRAM. The difference is that the TRAVEO T2G blit engine buffers up drawing commands, and replays the drawing commands into the back-buffer on a line-by-line basis. This enables doing some optimizations and can reduce overdraw, although there's a certain overhead that might be unnecessary for simple UIs.

The rendering hint OptimizeForSpeed is used to set the LBO layer type:

ItemLayer {
    renderingHints: ItemLayer.OptimizeForSpeed
}

The memory usage of an LBO to memory layer is the same as for IBO layers.

OTF layer

OTF (on-the-fly) layer, also known as LBO to display, is unique because it does not require double buffered framebuffers of pixel data to represent dynamically rendered contents. Instead, a command buffer is kept for each frame, which gets replayed at the screen refresh rate as long as the frame is active. The command buffer entries specify which blend operations to perform, with which source image data.

The rendering of the command buffer during each screen refresh happens via a circular line buffer. The height of the line buffer has to be 32, 64, 128, or some other power of two. The TRAVEO T2G blit engine writes to the front of the line buffer, whereas the display data is read from the back.

Because the command buffer entries retain references to source image data, the Qt Quick Ultralite Core library has to ensure that image data isn't freed as long as the frame referring to it is still active.

The rendering hint OptimizeForSize is used to set the OTF layer type. It is the default rendering hint, so it doesn't have to be explicitly set.

Memory usage of a 800x480 OTF layer with 24 bits per pixel (3 bytes per pixel) color depth would be as follows, assuming a 64 pixel line buffer height and a 64 KB command buffer:

WIDTH x LINE_BUFFER_HEIGHT x BUFFERS x BPP + COMMAND_BUFFER_SIZE = 800 x 64 x 3 bytes + 64 KB = 219 KB

Line buffer overview

This illustration shows the normal operation mode of the line buffer. The red line is the line currently being read by the layer composition engine, to be blended together with the other layers and sent to the display. The blue line is the line that is currently being prepared by the TRAVEO T2G blit engine. The green lines are the lines that have already been prepared and contain contents ready to be read by the layer composition engine. The white lines do not yet have any contents, and are waiting to be prepared by the blit engine.

This demonstrates the "circular" nature of the line buffer. The layer composition engine has already advanced by a number of scanlines, leaving the top of the circular buffer free for the blit engine to prepare more contents.

In this case, the blit engine has had the time to fill the entire line buffer, except for the line that's currently being read by the layer composition engine. This is a desirable mode of operation, since it means the blit engine is producing contents faster than the layer composition engine is able to consume them.

On the other hand, if the blit engine is slow at rendering contents into the line buffer, we might end up with the following scenario. There are no ready lines for the layer composition engine to consume, so if it needs to read the next line before the blit engine has finished rendering it, a line buffer underrun happens.

Line buffer underrun

If a line buffer underrun happens in an OTF layer, the display will flash with a black or red color, instead of showing the correct visual output. This might happen if there's too much content in the OTF layer, or if the memory bandwidth is insufficient. The memory bandwidth might become a bottleneck if images are kept in slow external flash memory.

Here are some possible solutions to the line buffer underrun problem:

Increase the line buffer height by using the Tvii::Configuration::setConfigForOTFLayer API.
Move image resources into faster memory, such as internal flash or VRAM, using ImageFiles.MCU.resourceCachePolicy or ImageFiles.MCU.resourceStorageSection.
Move some UI elements into IBO layers, whose contents are cached instead of being drawn just-in-time.

Optimizing VRAM usage

Here are some ways in which to optimize the VRAM usage of the Qt for MCUs application on Infineon TRAVEO T2G boards.

Layer color depth

The ItemLayer::depth property is used to control the color depth of the layer. If SpriteLayer is used, the SpriteLayer::depth has to match the color depth of the item layers and image layers contained inside.

If an ItemLayer is used as the bottom-most layer, prefer using the Bpp24 color depth, as it saves 25 % of VRAM compared to the default Bpp32. For this reason, it might in some cases be preferable not to use an ImageLayer as the bottom-most layer, but instead put the image content as a regular Image item in an ItemLayer with 24 bpp color depth.

Bpp16 and Bpp16Alpha can be used to further reduce VRAM usage, if the UI elements contained do not require the higher color precision.

Disable caching of image resources in VRAM

By default, image resources will have the resource cache policy set to OnStartup. To reduce VRAM usage, consider setting it to NoCaching by default, and only use OnStartup if it's deemed necessary for improved performance. Images that are rotated or scaled could benefit from being explicitly placed in VRAM.

MCU.Config {
    resourceCachePolicy: "NoCaching"
}

Disable caching of glyph alpha maps in VRAM

When using the static font engine, or for StaticText when using the Monotype Spark font engine, glyph alpha maps get copied, by default, to VRAM during resource initialization. To disable this, set glyphsCachePolicy to NoCaching:

MCU.Config {
    glyphsCachePolicy: "NoCaching"
}

Monotype Spark heap and cache

The font heap and cache are allocated in VRAM when using the Monotype Spark font engine.

MCU.Config.fontHeapSize is set to -1 by default, but a fixed value is preferable to prevent memory fragmentation. Reasonable values might be between 24 KB and 64 KB, depending on the application, the fonts, and the font sizes used.

MCU.Config.fontCacheSize is set to 200 KB by default. This is more than is needed on Traveo T2G boards, since the text cache is enabled by default and offers better performance improvements than the font cache. MCU.Config.fontCacheSize can be set to a smaller value such as 32 KB.

MCU.Config {
    fontEngine: "Spark"
    fontCacheSize: 32000
    fontHeapSize: 40000
}

Text cache

The text cache is an 8 bits per pixel cache that is kept per Text item, to prevent having to redraw the text glyph by glyph each frame. On Traveo T2G it offers significant performance improvements, due to the high draw call overhead in the graphics driver.

On Traveo T2G boards the text cache is kept in VRAM. It is enabled by default, and its size is set to 192 KB. For some applications, it might be possible to reduce this value somewhat, without reducing performance.

The application can customize the size of the text cache when creating the Qul::Application object:

Qul::ApplicationConfiguration appConfig;
appConfig.setTextCacheEnabled(true);
appConfig.setTextCacheSize(128 * 1024);
Qul::Application app(appConfig);

See Text caching for more information.

Buffers for vector graphics

Various buffers are required for hardware-accelerated vector graphics blending. They are indirectly used by the Shape, ArcItem, and Qul::PlatformInterface::DrawingEngine::blendPath APIs.

Global alpha buffer

The global alpha buffer is an intermediate mask buffer used by the TRAVEO T2G drawing engine when drawing vector path data.

Its size is configured in platform_config.h.in by setting QUL_PLATFORM_TVII_DRAW_CONTEXT_ALPHA_BUFFER_WIDTH and QUL_PLATFORM_TVII_DRAW_CONTEXT_ALPHA_BUFFER_HEIGHT. By default it's 320x320 pixels.

The alpha buffer has 4 bits per pixels by default, offering a reasonable trade-off between memory usage and antialiasing quality. If necessary, this can be changed by modifying the call to OpenDrawCtx in platform_display.cpp.

Note: Changing the alpha buffer size or bit depth requires rebuilding the platform library.

With the default settings, 51 KB of VRAM get consumed by the alpha buffer.

Global path buffer

The global path buffer is used by the TRAVEO T2G drawing engine to cache vector path data before it gets drawn into the mask buffer.

Its size is configured in platform_config.h.in by setting QUL_PLATFORM_TVII_DRAW_CONTEXT_PATH_BUFFER_SIZE. By default it's 32 KB.

Note: Changing the path buffer size requires rebuilding the platform library.

Per-path mask buffers cached by the `CyGfxDrawingEngine`

CyGfxDrawingEngine is the platform implementation of the Qul::PlatformInterface::DrawingEngine API for the TRAVEO T2G boards. When paths are drawn, the drawing engine has to cache mask buffers on a per-path basis.

The exception is if the target layer is an IBO layer and the ShapePath::fillGradient property is not set. In that case, the path can be drawn directly into the target framebuffer without using the intermediate mask buffer (although the global alpha and path buffers mentioned above are still required).

If the drawing engine fails to allocate the required mask buffer, an QulError_DrawingEngine_SurfaceAllocationFailed error is triggered. If the application uses vector graphics it has to ensure there's enough free VRAM available for these mask buffers.

Improving performance using hardware layers Debugging QML applications

Available under certain Qt licenses.
Find out more.

C

Layer and VRAM optimization guide for Infineon TRAVEO T2G boards

Overview

Infineon TRAVEO T2G hardware layers

Image layer

IBO layer

LBO to memory layer

OTF layer

Line buffer overview

Line buffer underrun

Optimizing VRAM usage

Layer color depth

Disable caching of image resources in VRAM

Disable caching of glyph alpha maps in VRAM

Monotype Spark heap and cache

Text cache

Buffers for vector graphics

Global alpha buffer

Global path buffer

Per-path mask buffers cached by the CyGfxDrawingEngine

Per-path mask buffers cached by the `CyGfxDrawingEngine`