Getting graphics on the screen

This chapter explains how to implement the platform interface to render pixels onto the screen. It describes how to implement basic support for a single screen with no hardware layers. See the next chapter for implementing support for hardware with layers.

Overview

Qt Quick Ultralite core converts the QML scene into a series of rendering commands in order to blend them into a framebuffer. It's the platform library's responsibility to provide this framebuffer, which is then presented to the screen. If the target hardware comes with hardware-accelerated graphics support, the platform library also provides an implementation of the rendering commands for it.

Implementing basic graphics

Qt Quick Ultralite core communicates with the platform hardware through a Qul::Platform::PlatformContext instance. To show content on the screen, the platform has to implement the graphics-related parts of the context. The first method to implement is PlatformContext::availableScreens.

Note: The example implementation assumes that there is only one screen without additional layers.

Qul::PlatformInterface::Screen *ExamplePlatform::availableScreens(size_t *screenCount) const
{
    *screenCount = sizeof(platformScreens) / sizeof(PlatformInterface::Screen);
    return platformScreens;
}

PlatformContext::availableScreens returns an array of the available screens on the device. The returned Qul::PlatformInterface::Screen instances provide the screen identifier, its size in pixels, and its color format. It also tells Qt Quick Ultralite core whether the screen supports resizing, which in practice means whether the platform can support multiple resolutions or framebuffer dimensions. In the example we rely on the default behavior of the Screen constructor, which is no resizing support.

If your device supports multiple screens, provide a name as an identifier for each screen in its constructor.
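
For reference, here's a minimal sketch of how the platformScreens array used above might be defined, assuming ScreenWidth and ScreenHeight constants are provided elsewhere in the platform port:

// Single fixed-size screen; a name string can be passed as the first
// constructor argument if the device exposes multiple screens
static Qul::PlatformInterface::Screen platformScreens[] = {
    Qul::PlatformInterface::Screen(Qul::PlatformInterface::Size(ScreenWidth, ScreenHeight))};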

Next, implement the PlatformContext::initializeDisplay method that is called once per screen that the application wants to use. If the screen supports resizing, it will resize itself according to the root QML item's size. Based on the screen size, the platform library can now initialize the display hardware.

void ExamplePlatform::initializeDisplay(const PlatformInterface::Screen *screen)
{
    // const int screenIndex = screen - platformScreens;
    // initLcd(screenIndex, screen->size().width(), screen->size().height());
}

You also need to provide framebuffers to render into. If the screen has a fixed size, you can statically allocate them like this:

// Assuming we use 32bpp framebuffers
static const int BytesPerPixel = 4;
unsigned char framebuffer[2][BytesPerPixel * ScreenWidth * ScreenHeight]
    __attribute__((section(".framebufferSection")));
static int backBufferIndex = 0;

Two framebuffers are usually used to achieve better performance when the display hardware repeatedly refreshes the display directly from memory. One is the back buffer, into which Qt Quick Ultralite core renders the next frame, and the other is the front buffer, which is displayed on the screen. To report that this buffering type is used, return Qul::Platform::FlippedDoubleBuffering from PlatformContext::frameBufferingType. The variable backBufferIndex tracks which of the two framebuffers is currently used as the back buffer.

FrameBufferingType ExamplePlatform::frameBufferingType(const PlatformInterface::LayerEngine::ItemLayer *) const
{
    return FlippedDoubleBuffering;
}

Qt Quick Ultralite core uses the value returned from PlatformContext::frameBufferingType to do partial rendering updates of only the parts of the screen that changed, instead of redrawing the full framebuffer each frame.

Note: Some custom architectures might not use framebuffers at all, for example using command buffers to update the display just in time. In such a case, Qul::Platform::OtherBuffering must be returned to inform Qt Quick Ultralite core that full repaints have to be done each frame.

Some platforms might not have enough RAM to fit even a single full framebuffer. Using a partial framebuffer can significantly lower the memory requirements of an application, but it can lead to reduced performance and potential visual tearing artifacts for complex applications. In such a case, Qul::Platform::PartialBuffering must be returned to inform Qt Quick Ultralite core that partial updates should be split to fit the partial framebuffer size.
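
As an illustrative sketch (the strip size is an arbitrary assumption), a partial framebuffer setup could allocate a smaller buffer and report PartialBuffering instead:

// Hypothetical partial framebuffer covering one eighth of the screen
static unsigned char partialFramebuffer[BytesPerPixel * ScreenWidth * (ScreenHeight / 8)];

FrameBufferingType ExamplePlatform::frameBufferingType(const PlatformInterface::LayerEngine::ItemLayer *) const
{
    return PartialBuffering;
}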

Then, you are ready to put the pieces in place to enable Qt Quick Ultralite core to render to your framebuffer.

If full frame buffering is chosen, Qt Quick Ultralite core calls PlatformContext::beginFrame and PlatformContext::endFrame once for each frame, whenever a screen needs a visual update.

If partial buffering is chosen, Qt Quick Ultralite core calls beginFrame and endFrame once for each area that needs a visual update. The updated area is given by the rect parameter of PlatformContext::beginFrame. See Implementing partial framebuffering support for more information.

beginFrame returns a pointer to a Qul::PlatformInterface::DrawingDevice containing:

  • the pixel format and screen or layer size,
  • a pointer to the framebuffer to render into,
  • the number of bytes per line,
  • and the drawing engine to be used to render into the buffer.

If you are relying fully on the default software rendering included in Qt Quick Ultralite core, a plain Qul::PlatformInterface::DrawingEngine can be used instead of ExampleDrawingEngine here. Subclassing the drawing engine is the way for the platform to enable hardware-accelerated drawing into the buffer. How to do this is explained in detail in a later section.
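
For a port that relies on software rendering only, the drawing engine used in beginFrame below could simply be a default instance (a sketch; the plain DrawingEngine falls back to the core's CPU implementations):

// Default software drawing engine provided by Qt Quick Ultralite core
static Qul::PlatformInterface::DrawingEngine drawingEngine;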

PlatformInterface::DrawingDevice *ExamplePlatform::beginFrame(const PlatformInterface::LayerEngine::ItemLayer *layer,
                                                              const PlatformInterface::Rect &rect,
                                                              int refreshInterval)
{
    if (enableLayerEngine)
        return ExampleLayerEngine::beginFrame(layer, rect, refreshInterval);

    static ExampleDrawingEngine drawingEngine;

    requestedRefreshInterval = refreshInterval;

    // Wait until the back buffer is free, i.e. no longer held by the display
    waitForBufferFlip();

    // A pointer to the back buffer
    uchar *bits = framebuffer[backBufferIndex];
    static PlatformInterface::DrawingDevice buffer(Qul::PixelFormat_RGB32,
                                                   PlatformInterface::Size(ScreenWidth, ScreenHeight),
                                                   bits,
                                                   ScreenWidth * BytesPerPixel,
                                                   &drawingEngine);

    buffer.setBits(bits);
    return &buffer;
}

waitForBufferFlip is called to make sure the back buffer (that was the front buffer of the previous frame) has been released by the display controller and is ready to be rendered into. How to implement it is explained later in the context of presentFrame.

PlatformContext::endFrame is called after rendering of a frame for a given screen is completed. In the case of multiple screens or layers, some platforms might want to flush the hardware-accelerated blending unit's command buffers here, but in most cases it can be left empty.

void ExamplePlatform::endFrame(const PlatformInterface::LayerEngine::ItemLayer *layer)
{
    if (enableLayerEngine)
        ExampleLayerEngine::endFrame(layer);
}

After rendering for a screen, Qt Quick Ultralite core calls the PlatformContext::presentFrame function. This is where the platform should make the buffer that has just been rendered visible on the display. The example uses double buffering, so it passes the pointer to the back buffer to the display, and swaps the front and back buffers so that the next frame gets rendered to what was the previous frame's front buffer.

FrameStatistics ExamplePlatform::presentFrame(const PlatformInterface::Screen *screen,
                                              const PlatformInterface::Rect &rect)
{
    if (enableLayerEngine)
        return ExampleLayerEngine::presentFrame(screen, rect);

    // HW_SyncFramebufferForCpuAccess();

    synchronizeAfterCpuAccess(rect);
    framebufferAccessedByCpu = false;

    FrameStatistics stats;
    stats.refreshDelta = refreshCount - requestedRefreshInterval;
    waitForRefreshInterval();
    static const int RefreshRate = 60; // screen refresh rate in Hz
    stats.remainingBudget = idleTimeWaitingForDisplay + stats.refreshDelta * int(1000.0f / RefreshRate);

    waitingForBufferFlip = true;
    idleTimeWaitingForDisplay = 0;

    // Now we can update the framebuffer address
    // LCD_SetBufferAddr(framebuffer[backBufferIndex]);

    // Now the front and back buffers are swapped
    if (backBufferIndex == 0)
        backBufferIndex = 1;
    else
        backBufferIndex = 0;

    return stats;
}

In the case of hardware-accelerated blending, presentFrame starts by synchronizing with the hardware-accelerated blending unit, to ensure all pending blending commands are fully committed to the back buffer.

The synchronizeAfterCpuAccess function will also be useful on some platforms. If any CPU rendering fallback was used, it cleans and invalidates the data caches before asynchronous reads from the display controller access the memory:

static bool framebufferAccessedByCpu = false;

static void synchronizeAfterCpuAccess(const PlatformInterface::Rect &rect)
{
    if (framebufferAccessedByCpu) {
        unsigned char *backBuffer = framebuffer[backBufferIndex];
        for (int i = 0; i < rect.height(); ++i) {
            unsigned char *pixels = backBuffer + (ScreenWidth * (rect.y() + i) + rect.x()) * BytesPerPixel;
            // CleanInvalidateDCache_by_Addr(pixels, rect.width() * BytesPerPixel);
        }
    }
}

The presentFrame function continues by calling waitForRefreshInterval, which makes it possible to reduce the refresh rate when Qt Quick Ultralite core requests it. This requires keeping track of how many refreshes have happened since the last call to presentFrame, and waiting until this count reaches the requested refresh interval. The changes to the refreshDelta and remainingBudget values in the FrameStatistics struct are required to enable frame skip compensation, which is explained later.

static void waitForRefreshInterval()
{
    if (refreshCount < requestedRefreshInterval) {
        uint64_t startTime = getPlatformInstance()->currentTimestamp();
        while (refreshCount < requestedRefreshInterval) {
            // The device may yield or go into sleep mode
        }
        idleTimeWaitingForDisplay += getPlatformInstance()->currentTimestamp() - startTime;
    }

    refreshCount = 0;
}

This assumes there's also an interrupt handler that tracks the screen refresh count:

static volatile int refreshCount = 1;
volatile unsigned int currentFrame = 0;

// Note: This line incrementing the refreshCount will need to be moved to the
// actual interrupt handler for the display available on the target platform. It
// needs to be called once per vertical refresh of the display, to keep track of
// how many refreshes have happened between calls to presentFrame in order to
// support custom refresh intervals. On some implementations this can be done
// using a so called "line event".
void LCD_RefreshInterruptHandler()
{
    ++refreshCount;

    // currentFrame is only needed if the layer backend is used
    ++currentFrame;
}

Finally, presentFrame can tell the display controller to start refreshing from the back buffer instead of the currently held front buffer. This happens asynchronously, as the display controller might still be in the process of scanning the front buffer for display. It's therefore necessary to set waitingForBufferFlip to true, and use an interrupt handler to get notified by the display controller when the old front buffer is released and can be used as the new back buffer for drawing into:

static volatile bool waitingForBufferFlip = false;
static uint32_t idleTimeWaitingForDisplay = 0;

static void waitForBufferFlip()
{
    // Has there already been a buffer flip?
    if (!waitingForBufferFlip)
        return;
    const uint64_t startTime = getPlatformInstance()->currentTimestamp();
    while (waitingForBufferFlip) {
        // The device may yield or go into sleep mode
    }

    idleTimeWaitingForDisplay = getPlatformInstance()->currentTimestamp() - startTime;
}

// Note: This line clearing waitingForBufferFlip will need to be moved to the
// actual interrupt handler for the display available on the target platform.
// It's needed to inform about when the buffer address used to scan out pixels
// to the display has been updated, making the buffer free in order to start
// drawing the next frame.
void LCD_BufferFlipInterruptHandler()
{
    waitingForBufferFlip = false;
}

idleTimeWaitingForDisplay tracks how much time was spent waiting for the display, and is thus time that could have been used for rendering instead. It's used to implement frame skip compensation, which is explained in the following section.

Frame skip compensation

The return value of PlatformContext::presentFrame is a Qul::Platform::FrameStatistics value, which is used by Qt Quick Ultralite core to implement frame skip compensation to achieve smoother animations. FrameStatistics contains Qul::Platform::FrameStatistics::refreshDelta, indicating how much the previous frame was delayed relative to its target frame. So if the last frame's rendering took too long, and it ended up being displayed a frame later than the requested refresh interval (meaning there was a frame skip), the refreshDelta must be set to 1.

On the other hand, if the requested refresh interval is 2, and the rendering was fast enough that a refresh interval of 1 could have been used without causing a frame skip, the refreshDelta must be set to -1.

Additionally, FrameStatistics has the value Qul::Platform::FrameStatistics::remainingBudget. It indicates how much more time (in milliseconds) could have been spent on rendering without skipping a frame, assuming refreshDelta is already added to the swap interval. If this value is very low, Qt Quick Ultralite core might preemptively increase the swap interval to reduce the risk of skipping any frames during animations, by temporarily dropping to a lower refresh rate and making sure animation frames end up being displayed at the correct target time.

If no frame skip compensation is desired, a default constructed FrameStatistics value can be returned.
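
For example, a platform that doesn't track refresh timing could simply end presentFrame with the following (a minimal sketch):

// Opt out of frame skip compensation
return FrameStatistics();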

In the example, the refreshDelta is computed as the difference between the requested refresh interval and how many refreshes actually happened since the last frame was shown. The remaining budget is time that could have been spent on rendering, assuming the refreshDelta is added to the refresh interval. This can be computed by adding the time spent idling waiting for the display to the refreshDelta times the screen refresh interval in milliseconds:

FrameStatistics stats;
stats.refreshDelta = refreshCount - requestedRefreshInterval;
waitForRefreshInterval();
static const int RefreshRate = 60; // screen refresh rate in Hz
stats.remainingBudget = idleTimeWaitingForDisplay + stats.refreshDelta * int(1000.0f / RefreshRate);

Hardware acceleration

To implement hardware acceleration, subclass Qul::PlatformInterface::DrawingEngine and override the functions that the platform is able to accelerate.

class ExampleDrawingEngine : public PlatformInterface::DrawingEngine
{
public:
    void blendRect(PlatformInterface::DrawingDevice *drawingDevice,
                   const PlatformInterface::Rect &rect,
                   PlatformInterface::Rgba32 color,
                   BlendMode blendMode) QUL_DECL_OVERRIDE;

    void blendRoundedRect(PlatformInterface::DrawingDevice *drawingDevice,
                          const PlatformInterface::Rect &rect,
                          const PlatformInterface::Rect &clipRect,
                          PlatformInterface::Rgba32 color,
                          int radius,
                          BlendMode blendMode) QUL_DECL_OVERRIDE;

    void blendImage(PlatformInterface::DrawingDevice *drawingDevice,
                    const PlatformInterface::Point &pos,
                    const PlatformInterface::Texture &source,
                    const PlatformInterface::Rect &sourceRect,
                    int sourceOpacity,
                    BlendMode blendMode) QUL_DECL_OVERRIDE;

    void blendAlphaMap(PlatformInterface::DrawingDevice *drawingDevice,
                       const PlatformInterface::Point &pos,
                       const PlatformInterface::Texture &source,
                       const PlatformInterface::Rect &sourceRect,
                       PlatformInterface::Rgba32 color,
                       BlendMode blendMode) QUL_DECL_OVERRIDE;

    void blendTransformedImage(PlatformInterface::DrawingDevice *drawingDevice,
                               const PlatformInterface::Transform &transform,
                               const PlatformInterface::RectF &destinationRect,
                               const PlatformInterface::Texture &source,
                               const PlatformInterface::RectF &sourceRect,
                               const PlatformInterface::Rect &clipRect,
                               int sourceOpacity,
                               BlendMode blendMode) QUL_DECL_OVERRIDE;
    void blendTransformedAlphaMap(PlatformInterface::DrawingDevice *drawingDevice,
                                  const PlatformInterface::Transform &transform,
                                  const PlatformInterface::RectF &destinationRect,
                                  const PlatformInterface::Texture &source,
                                  const PlatformInterface::RectF &sourceRect,
                                  const PlatformInterface::Rect &clipRect,
                                  PlatformInterface::Rgba32 color,
                                  BlendMode blendMode) QUL_DECL_OVERRIDE;

    void synchronizeForCpuAccess(PlatformInterface::DrawingDevice *drawingDevice,
                                 const PlatformInterface::Rect &rect) QUL_DECL_OVERRIDE;

    PlatformInterface::DrawingEngine::Path *allocatePath(const PlatformInterface::PathData *pathData,
                                                         PlatformInterface::PathFillRule fillRule) QUL_DECL_OVERRIDE;

    void setStrokeProperties(PlatformInterface::DrawingEngine::Path *path,
                             const PlatformInterface::StrokeProperties &strokeProperties) QUL_DECL_OVERRIDE;

    void blendPath(PlatformInterface::DrawingDevice *drawingDevice,
                   PlatformInterface::DrawingEngine::Path *path,
                   const PlatformInterface::Transform &transform,
                   const PlatformInterface::Rect &clipRect,
                   const PlatformInterface::Brush *fillBrush,
                   const PlatformInterface::Brush *strokeBrush,
                   int sourceOpacity,
                   PlatformInterface::DrawingEngine::BlendMode blendMode) QUL_DECL_OVERRIDE;
};

If any of the Qul::PlatformInterface::DrawingEngine::blendRect, Qul::PlatformInterface::DrawingEngine::blendRoundedRect, Qul::PlatformInterface::DrawingEngine::blendImage, Qul::PlatformInterface::DrawingEngine::blendAlphaMap, Qul::PlatformInterface::DrawingEngine::blendTransformedImage, or Qul::PlatformInterface::DrawingEngine::blendTransformedAlphaMap functions is not overridden, the default implementation is used. It calls Qul::PlatformInterface::DrawingEngine::synchronizeForCpuAccess before calling the corresponding fallback implementation on Qul::PlatformInterface::DrawingDevice::fallbackDrawingEngine. If the platform can only partially accelerate some blending function, for example only for certain blend modes or opacity parameters, it can itself use the fallbackDrawingEngine for the remaining set of parameters.

Warning: If you experience a crash when using the fallbackDrawingEngine, it might be because you haven't called Qul::PlatformInterface::init32bppRendering() or similar in initializePlatform().
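
A minimal sketch of such an initialization, assuming the 32bpp framebuffer format used in this example:

void ExamplePlatform::initializePlatform()
{
    // Register the 32bpp software rendering fallback so that
    // fallbackDrawingEngine() is available to the drawing engine
    Qul::PlatformInterface::init32bppRendering();
}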

Note: If the blendImage function uses software rendering, either through the default implementation or through fallbackDrawingEngine, the source images are assumed to be in ARGB32_Premultiplied format. To enable blending of other formats with blendImage, set QUL_PLATFORM_DEFAULT_RESOURCE_ALPHA_OPTIONS to "Always".

The actual implementation of the blend functions might vary significantly depending on the hardware-acceleration API. To make your task of implementing these functions easier, a dummy hardware-acceleration API is used for demonstration purposes. For example, here's how the implementation of Qul::PlatformInterface::DrawingEngine::blendRect might look:

void ExampleDrawingEngine::blendRect(PlatformInterface::DrawingDevice *drawingDevice,
                                     const PlatformInterface::Rect &rect,
                                     PlatformInterface::Rgba32 color,
                                     BlendMode blendMode)
{
    // Implement rectangle blending here

    // If only blitting is supported by the hardware, this is how to use the
    // fallback drawing engine for the blending path.
    if (color.alpha() != 255 && blendMode == BlendMode_SourceOver) {
        synchronizeForCpuAccess(drawingDevice, rect);
        drawingDevice->fallbackDrawingEngine()->blendRect(drawingDevice, rect, color, blendMode);
        return;
    }

    // const float inv = 1 / 255.0f;
    // HW_SetColor(color.red() * inv, color.green() * inv, color.blue() * inv, color.alpha() * inv);

    // HW_BlitRect(toHwRect(rect));
}

This example code assumes that the hardware-acceleration API doesn't offer rectangle blending support, for the sake of demonstrating how the fallback drawing engine could be used to fulfill this task instead.

Otherwise, the color is set up and a call to blit the rectangle is issued.

Qul::PlatformInterface::DrawingEngine::blendRoundedRect is called when a rectangle with rounded corners is blended. The platform implementation also has to clip against the provided clip rectangle if it's smaller than the rectangle to be blended. An example implementation might look like this:

void ExampleDrawingEngine::blendRoundedRect(PlatformInterface::DrawingDevice *drawingDevice,
                                            const PlatformInterface::Rect &rect,
                                            const PlatformInterface::Rect &clipRect,
                                            PlatformInterface::Rgba32 color,
                                            int radius,
                                            BlendMode blendMode)
{
    // Implement rounded rectangle blending here

    // const float inv = 1 / 255.0f;
    // HW_SetColor(color.red() * inv, color.green() * inv, color.blue() * inv, color.alpha() * inv);
    // HW_SetClip(clipRect.x(), clipRect.y(), clipRect.width(), clipRect.height());

    // HW_BlitRoundRect(toHwRect(rect), radius);

    // HW_SetClip(0, 0, screen->width(), screen->height());
}

Similarly, when blending images and alpha maps you also need to inform the hardware-acceleration API about the texture layout and data somehow. The example assumes that there's a bindTexture function that handles it. This function must be implemented by the platform port according to the needs of the particular hardware-acceleration API.
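
As a sketch, a hypothetical bindTexture could look like the following; HW_SetTexture and toHwPixelFormat are stand-ins for whatever the real hardware-acceleration API provides:

static void bindTexture(const PlatformInterface::Texture &source)
{
    // Hand the texture memory, pixel format, and layout to the dummy API
    // HW_SetTexture(source.data(),
    //               toHwPixelFormat(source.format()),
    //               source.width(),
    //               source.height(),
    //               source.bytesPerLine());
}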

For blendImage there's a source rectangle, a destination point, and the source opacity in the range 0 to 256 inclusive. Here's how its implementation might look:

void ExampleDrawingEngine::blendImage(PlatformInterface::DrawingDevice *drawingDevice,
                                      const PlatformInterface::Point &pos,
                                      const PlatformInterface::Texture &source,
                                      const PlatformInterface::Rect &sourceRect,
                                      int sourceOpacity,
                                      BlendMode blendMode)
{
    // Fall back to default CPU drawing engine for pixel formats that can't be
    // blended with hardware acceleration
    if (source.format() == Qul::PixelFormat_RGB332 || source.format() == PixelFormat_RLE_ARGB32
        || source.format() == PixelFormat_RLE_ARGB32_Premultiplied || source.format() == PixelFormat_RLE_RGB32
        || source.format() == PixelFormat_RLE_RGB888) {
        DrawingEngine::blendImage(drawingDevice, pos, source, sourceRect, sourceOpacity, blendMode);
        return;
    }

    // Implement image blending here

    // HW_SetBlendMode(toHwBlendMode(blendMode));

    // bindTexture(source);

    // HW_SetColor(1.0f, 1.0f, 1.0f, sourceOpacity * (1 / 256.0f));

    // const Rect destinationRect(pos, sourceRect.size());
    // HW_BlendTexture(toHwRect(sourceRect), toHwRect(destinationRect));
}

Next up, blendAlphaMap is very similar, the main difference being that the texture will have the pixel format PixelFormat_Alpha1 or PixelFormat_Alpha8, meaning one bit or one byte per pixel respectively indicating opacity. For each pixel, its opacity value must be multiplied by the given color before being blended or blitted depending on the blendMode.

void ExampleDrawingEngine::blendAlphaMap(PlatformInterface::DrawingDevice *drawingDevice,
                                         const PlatformInterface::Point &pos,
                                         const PlatformInterface::Texture &source,
                                         const PlatformInterface::Rect &sourceRect,
                                         PlatformInterface::Rgba32 color,
                                         BlendMode blendMode)
{
    // Implement alpha map blending here

    // HW_SetBlendMode(toHwBlendMode(blendMode));

    // bindTexture(source);

    // const float inv = 1 / 255.0f;
    // HW_SetColor(color.red() * inv, color.green() * inv, color.blue() * inv, color.alpha() * inv);

    // const PlatformInterface::Rect destinationRect(pos, sourceRect.size());
    // HW_BlendTexture(toHwRect(sourceRect), toHwRect(destinationRect));
}

Next, there's transformed blending of images and alpha maps. If the platform supports accelerating this, the implementation of blendTransformedImage might look something like this:

void ExampleDrawingEngine::blendTransformedImage(PlatformInterface::DrawingDevice *drawingDevice,
                                                 const PlatformInterface::Transform &transform,
                                                 const PlatformInterface::RectF &destinationRect,
                                                 const PlatformInterface::Texture &source,
                                                 const PlatformInterface::RectF &sourceRect,
                                                 const PlatformInterface::Rect &clipRect,
                                                 int sourceOpacity,
                                                 BlendMode blendMode)
{
    // Fall back to default CPU drawing engine for pixel formats that can't be
    // blended with hardware acceleration
    if (source.format() == Qul::PixelFormat_RGB332 || source.format() == PixelFormat_RLE_ARGB32
        || source.format() == PixelFormat_RLE_ARGB32_Premultiplied || source.format() == PixelFormat_RLE_RGB32
        || source.format() == PixelFormat_RLE_RGB888) {
        DrawingEngine::blendTransformedImage(drawingDevice,
                                             transform,
                                             destinationRect,
                                             source,
                                             sourceRect,
                                             clipRect,
                                             sourceOpacity,
                                             blendMode);
        return;
    }

    // Implement transformed image blending here

    // float matrix[16];
    // toHwMatrix(transform, &matrix);

    // HW_SetTransformMatrix(matrix);
    // HW_SetClip(clipRect.x(), clipRect.y(), clipRect.width(), clipRect.height());

    // HW_SetBlendMode(toHwBlendMode(blendMode));

    // bindTexture(source);

    // HW_SetColor(1.0f, 1.0f, 1.0f, sourceOpacity * (1 / 256.0f));

    // HW_BlendTexture(toHwRect(sourceRect), toHwRect(destinationRect));

    // HW_SetClip(0, 0, screen->width(), screen->height());
    // HW_SetTransformIdentity();
}

In addition to setting a clipping rectangle, you also need to convert the given transform to some matrix representation that the hardware-acceleration API accepts. If a 4x4 matrix is used, the conversion might look like the example toHwMatrix:

static void toHwMatrix(const PlatformInterface::Transform &transform, float *matrix)
{
    matrix[0] = transform.m11();
    matrix[1] = transform.m12();
    matrix[2] = 0;
    matrix[3] = 0;
    matrix[4] = transform.m21();
    matrix[5] = transform.m22();
    matrix[6] = 0;
    matrix[7] = 0;
    matrix[8] = 0;
    matrix[9] = 0;
    matrix[10] = 1;
    matrix[11] = 0;
    matrix[12] = transform.dx();
    matrix[13] = transform.dy();
    matrix[14] = 0;
    matrix[15] = 1;
}

The blendTransformedAlphaMap sample code looks very similar to blendTransformedImage, apart from taking the color into account as already demonstrated in the blendAlphaMap example code above. Refer to the platform/boards/qt/example-baremetal/platform_context.cpp file for the blendTransformedAlphaMap sample code.

As hardware blending often happens asynchronously, Qt Quick Ultralite core calls Qul::PlatformInterface::DrawingEngine::synchronizeForCpuAccess before any CPU-based reading from or writing to the framebuffer happens. This function needs to sync with the hardware-accelerated blending unit to ensure that every pending blending command has been fully committed to the framebuffer. Here's how that might look with our dummy hardware-acceleration API:

void ExampleDrawingEngine::synchronizeForCpuAccess(PlatformInterface::DrawingDevice *drawingDevice,
                                                   const PlatformInterface::Rect &rect)
{
    // HW_SyncFramebufferForCpuAccess();

    framebufferAccessedByCpu = true;

    unsigned char *backBuffer = framebuffer[backBufferIndex];
    for (int i = 0; i < rect.height(); ++i) {
        unsigned char *pixels = backBuffer + (ScreenWidth * (rect.y() + i) + rect.x()) * BytesPerPixel;
        // CleanInvalidateDCache_by_Addr(pixels, rect.width() * BytesPerPixel);
    }
}

On some hardware, it might also be necessary to invalidate the data cache for the area involved. This ensures that the asynchronous writes done by the blending unit are fully seen by the CPU.

In addition, two more functions need to be implemented to ensure Qt Quick Ultralite core can read and write texture data dynamically. They are PlatformContext::waitUntilAsyncReadFinished and PlatformContext::flushCachesForAsyncRead.

Here's how their implementation might look:

void ExamplePlatform::waitUntilAsyncReadFinished(const void * /*begin*/, const void * /*end*/)
{
    // HW_SyncFramebufferForCpuAccess();
}

void ExamplePlatform::flushCachesForAsyncRead(const void * /*addr*/, size_t /*length*/)
{
    // CleanInvalidateDCache_by_Addr(const_cast<void *>(addr), length);
}

waitUntilAsyncReadFinished should ensure that any asynchronous operation reading from the given memory area has finished before returning. Typically this memory area refers to some texture data, which is going to get overwritten. The simplest option here is to sync with the hardware-accelerated blending units to ensure there are no more texture blending operations pending.

Next, flushCachesForAsyncRead ensures that any changes written by the CPU are flushed so that subsequent asynchronous reads get the correct, up-to-date memory data. If this is necessary on the given hardware, an appropriate call to clean (flush) the data caches must be made here.

That covers all the pieces necessary to implement hardware-accelerated blending, making Qt Quick Ultralite UIs run smoothly and efficiently on the platform being ported to.

See also Framebuffer Requirements and Partial Framebuffer.
