Qt Quick Direct3D 12 Adaptation¶
The Direct3D 12 adaptation for Windows 10, both in Win32 (windows platform plugin) and in UWP (winrt platform plugin), is shipped as a dynamically loaded plugin. This adaptation doesn’t work on earlier Windows versions. Building this plugin is enabled automatically whenever the necessary D3D and DXGI development files are present. In practice, this currently means Visual Studio 2015 and newer.
The adaptation is available both in normal, OpenGL-enabled Qt builds, and also when Qt is configured with -no-opengl. However, it’s never the default, meaning that the user or the application has to explicitly request it by setting the QT_QUICK_BACKEND environment variable to d3d12 or by calling setSceneGraphBackend().
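As a minimal sketch of the environment-variable route, assuming the application is started from a Python script: the helper name request_d3d12_backend is hypothetical, and only the variable name and its value come from the text above. The sketch also assumes the variable has to be in place before the first Qt Quick window is created.

```python
import os


def request_d3d12_backend(env=None):
    """Request the Direct3D 12 adaptation (hypothetical helper, not a Qt API)."""
    # Work on the real process environment unless a mapping is passed in.
    env = os.environ if env is None else env
    # Variable name and value as documented above.
    env["QT_QUICK_BACKEND"] = "d3d12"
    return env
```

Calling request_d3d12_backend() at the very top of the launcher script, before any Qt Quick object exists, is the intended use. Passing a plain dict instead makes it possible to build the environment for a child process.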
Motivation¶
This experimental adaptation is the first Qt Quick backend that focuses on a modern, lower-level graphics API in combination with a windowing system interface that’s different from the traditional approaches used in combination with OpenGL.
This adaptation also allows better integration with Windows, as Direct3D is the primary vendor-supported solution. Consequently, there are fewer problems anticipated with drivers, operations like window resizes, and special events like graphics device loss caused by device resets or graphics driver updates.
Performance-wise, the general expectation is a somewhat lower CPU usage compared to OpenGL, due to lower driver overhead, and a higher GPU utilization with less idle time wastage. The backend doesn’t heavily utilize threads yet, which means there are opportunities for further improvements in the future, for example to further optimize image loading.
The D3D12 backend also introduces support for pre-compiled shaders. All the backend’s own shaders (used by the built-in materials on which the Rectangle, Image, Text, and other QML types are built) are compiled to D3D shader bytecode when you compile Qt. Applications using ShaderEffect items can ship bytecode either in regular files or via the Qt resource system, or use High Level Shading Language for DirectX (HLSL) source strings. Unlike with OpenGL, the compilation of HLSL is properly threaded, meaning shader compilation won’t block the application and its user interface.
Graphics Adapters¶
The plugin does not necessarily require hardware acceleration. You can also use WARP, the Direct3D software rasterizer. By default, the first adapter providing hardware acceleration is chosen. To override this and use another graphics adapter or to force the use of the software rasterizer, set the QT_D3D_ADAPTER_INDEX environment variable to the index of the adapter. The adapters discovered are printed at startup when QSG_INFO or the qt.scenegraph.general logging category is enabled.
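The adapter override can be sketched as follows. The helper select_adapter is hypothetical, and rejecting negative indices is an assumption of this sketch, not documented behavior. Which index belongs to which adapter (including the software rasterizer) depends on the machine, so the helper can also switch on QSG_INFO so that the discovered adapters are printed at startup.

```python
import os


def select_adapter(index, env=None, log_adapters=True):
    """Pick a graphics adapter by index (hypothetical helper, not a Qt API)."""
    if index < 0:
        # Assumption of this sketch: adapter indices are non-negative.
        raise ValueError("adapter index must be non-negative")
    env = os.environ if env is None else env
    env["QT_D3D_ADAPTER_INDEX"] = str(index)
    if log_adapters:
        # Print the discovered adapters at startup so the index can be checked.
        env["QSG_INFO"] = "1"
    return env
```

A typical workflow would be to run once with logging enabled, read the adapter list from the output, and then pass the wanted index.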
Troubleshooting¶
If you encounter issues, always set the QSG_INFO and QT_D3D_DEBUG environment variables to 1, to get debug and warning messages printed on the debug output. QT_D3D_DEBUG enables the Direct3D debug layer.
Note
The debug layer shouldn’t be enabled in production use, since it can significantly impact performance (CPU load) due to increased API overhead.
Render Loops¶
By default, the D3D12 adaptation uses a single-threaded render loop similar to OpenGL’s windows render loop. A threaded variant is also available, which you can request by setting the QSG_RENDER_LOOP environment variable to threaded. However, due to conceptual limitations in DXGI, the windowing system interface, the threaded loop is prone to deadlocks when multiple QQuickWindow or QQuickView instances are shown. Consequently, for the time being, the default is the single-threaded loop. This means that with the D3D12 backend, applications are expected to move their work from the main (GUI) thread out to worker threads, instead of expecting Qt to keep the GUI thread responsive and suitable for heavy, blocking operations.
For details on render loops, see Qt Quick Scene Graph; for the issues with multithreading, see Multithreading and DXGI.
Renderer¶
The scene graph renderer in the D3D12 adaptation currently doesn’t perform any batching. Unlike with OpenGL, this is less of an issue, because state changes don’t present any problems in the first place. The simpler renderer logic can also lead to lower CPU overhead in some cases. The trade-offs between the various approaches are currently under research.
Shader Effects¶
The ShaderEffect QML type is fully functional with the D3D12 adaptation as well. However, the interpretation of the fragmentShader and vertexShader properties is different than with OpenGL.
With D3D12, these strings can be a URL for a local file, a URL for a file in the resource system, or an HLSL source string. Using a URL for a local file or a file in the resource system indicates that the file in question contains pre-compiled D3D shader bytecode generated by the fxc tool, or, alternatively, HLSL source code. The type of file is detected automatically. This means that the D3D12 backend supports all options from GraphicsInfo.shaderCompilationType and GraphicsInfo.shaderSourceType.
Unlike with OpenGL, files are always opened through a QFileSelector with the extra hlsl selector. This makes it easy to create ShaderEffect items that are functional across both backends, for example by placing the GLSL source code into shaders/effect.frag, the HLSL source code or, preferably, pre-compiled bytecode into shaders/+hlsl/effect.frag, while simply writing fragmentShader: "qrc:shaders/effect.frag" in QML. For more details, see ShaderEffect.
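The selector lookup described above can be modeled in a few lines. This is a simplified sketch for files on the local file system only, not QFileSelector itself: select_shader_file is a hypothetical name, and the model only checks for a "+hlsl" directory next to the requested file.

```python
import os


def select_shader_file(path, selectors=("hlsl",)):
    """Return the '+selector' variant of path if it exists, else path itself."""
    directory, name = os.path.split(path)
    for selector in selectors:
        # shaders/effect.frag with selector "hlsl" -> shaders/+hlsl/effect.frag
        candidate = os.path.join(directory, "+" + selector, name)
        if os.path.isfile(candidate):
            return candidate
    # No variant found: fall back to the file that was asked for.
    return path
```

With this layout, the QML code keeps referring to the base name only, and the variant is picked per backend without any conditional logic in the application.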
Multisample Render Targets¶
The Direct3D 12 adaptation ignores the QSurfaceFormat set on the QQuickWindow or QQuickView, or set via setDefaultFormat(), with two exceptions: samples() and alphaBufferSize() are still taken into account. When the sample value is greater than 1, multisample offscreen render targets will be created with the specified sample count at the maximum supported quality level. The backend automatically performs resolving into the non-multisample swapchain buffers after each frame.
Semi-transparent Windows¶
When the alpha channel is enabled either via setDefaultAlphaBuffer() or by setting alphaBufferSize to a non-zero value in the window’s QSurfaceFormat or in the global format managed by setDefaultFormat(), the D3D12 backend will create a swapchain for composition and go through DirectComposition. This is necessary, because the mandatory flip model swapchain wouldn’t support transparency otherwise.
Therefore, it’s important not to unnecessarily request an alpha channel. When the alphaBufferSize is 0 or the default -1, all these extra steps can be avoided and the traditional window-based swapchain is sufficient.
On WinRT, this isn’t relevant because the backend there always uses a composition swapchain which is associated with the ISwapChainPanel that backs QWindow on that platform.
Mipmaps¶
Mipmap generation is supported and handled transparently to the applications via a built-in compute shader. However, at the moment, this feature is experimental and only supports power-of-two images. Textures of other sizes will work too, but this involves a QImage-based scaling on the CPU first. Therefore, avoid enabling mipmapping for Non-Power-Of-Two (NPOT) images whenever possible.
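An application can check in advance whether an image would take the slower path. The sketch below assumes that "power-of-two image" means both the width and the height are powers of two; the function names are hypothetical.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... (a positive integer with exactly one bit set)."""
    return n > 0 and (n & (n - 1)) == 0


def mipmap_needs_cpu_scaling(width, height):
    """True when either dimension is NPOT, i.e. the CPU scaling step is expected."""
    return not (is_power_of_two(width) and is_power_of_two(height))
```

For example, a 256x128 image passes the check, while a 300x128 image does not and would be a candidate for leaving mipmapping disabled.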
Image Formats¶
When creating textures via C++ scene graph APIs like createTextureFromImage(), 32-bit formats won’t involve any conversion; they map directly to the corresponding R8G8B8A8_UNORM or B8G8R8A8_UNORM format. Everything else will trigger a QImage-based format conversion on the CPU first.
Unsupported Features¶
Particles and some other OpenGL-dependent utilities, like QQuickFramebufferObject, are currently not supported.
Like with the Software adaptation, text is always rendered using the native method. Distance field-based text rendering is currently not implemented.
The shader sources in the Qt Graphical Effects module have not been ported to any format other than the OpenGL 2.0 compatible one, meaning that the QML types provided by that module are currently not functional with the D3D12 backend.
Texture atlases are currently not in use.
The renderer may lack support for certain minor features, such as drawing points and lines with a width other than 1.
Custom Qt Quick items using custom scene graph nodes can be problematic because materials are inherently tied to the graphics API. Therefore, only items that use the utility rectangle and image nodes are functional across all adaptations.
QQuickWidget and its underlying OpenGL-based compositing architecture are not supported. If you need to mix with QWidget-based user interfaces, use createWindowContainer() to embed the native window of the QQuickWindow or QQuickView.
Finally, rendering via QSGEngine and QSGAbstractRenderer is not feasible with the D3D12 adaptation at the moment.
Advanced Configuration¶
The D3D12 adaptation can keep multiple frames in flight, similar to modern game engines. This is somewhat different from the traditional “render - swap - wait for vsync” model and allows for better GPU utilization at the expense of higher resource use. This means that the renderer will be a number of frames ahead of what is displayed on the screen.
For a discussion of flip model swap chains and the typical configuration parameters, refer to Sample Application for Direct3D 12 Flip Model Swap Chains.
Vertical synchronization is always enabled, meaning Present() is invoked with an interval of 1.
The configuration can be changed by setting the following environment variables:
| Environment variable | Description |
| --- | --- |
| QT_D3D_BUFFER_COUNT | The number of swap chain buffers, in the range 2 - 4. The default value is 3. |
| QT_D3D_FRAME_COUNT | The number of frames prepared without blocking, in the range 1 - 4. The default value is 2. Present() starts blocking after queuing 3 frames (regardless of QT_D3D_BUFFER_COUNT), unless the waitable object is in use. Every additional frame increases GPU resource usage, since geometry and constant buffer data need to be duplicated, and involves more bookkeeping on the CPU side. |
| QT_D3D_WAITABLE_SWAP_CHAIN_MAX_LATENCY | The frame latency, in the range 1 - 16. The default value is 0 (disabled). Changes the limit for Present() and triggers a wait for an available swap chain buffer when beginning each frame. For a detailed discussion, see the article linked above. Note: Currently, this behavior is experimental. |
| QT_D3D_BLOCKING_PRESENT | When set to a non-zero value, the CPU waits for the GPU to finish its work after each call to Present(). The default value is 0 (disabled). This effectively kills all parallelism but makes the behavior resemble the traditional swap-blocks-for-vsync model, which can be useful in some special cases. However, this is not the same as setting the frame count to 1, because that still avoids blocking after Present() and may only block when starting to prepare the next frame (or may not block at all, depending on the time gap between the frames). |
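The ranges and defaults listed above can be checked before the variables are exported. The sketch below is a hypothetical launcher-side helper; raising an error for out-of-range values is a design choice of the sketch and says nothing about how the plugin itself treats such values.

```python
# (minimum, maximum, default) as listed in the table above.
D3D12_SETTINGS = {
    "QT_D3D_BUFFER_COUNT": (2, 4, 3),
    "QT_D3D_FRAME_COUNT": (1, 4, 2),
    "QT_D3D_WAITABLE_SWAP_CHAIN_MAX_LATENCY": (1, 16, 0),
}


def checked_setting(name, value):
    """Return {name: str(value)} if value is the default or within the range."""
    low, high, default = D3D12_SETTINGS[name]
    # The default is accepted explicitly: for the waitable swap chain latency
    # it is 0 (disabled), which lies outside the 1 - 16 range.
    if value == default or low <= value <= high:
        return {name: str(value)}
    raise ValueError(f"{name}={value} is outside {low} - {high}")
```

The returned mapping can be merged into the process environment, for example with os.environ.update(), before the application starts.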
© 2022 The Qt Company Ltd. Documentation contributions included herein are the copyrights of their respective owners. The documentation provided herein is licensed under the terms of the GNU Free Documentation License version 1.3 as published by the Free Software Foundation. Qt and respective logos are trademarks of The Qt Company Ltd. in Finland and/or other countries worldwide. All other trademarks are property of their respective owners.