PyQt5 signal/slot connection performance

schollii

Sep 4, 2016

CPOL

An investigation of the effect of using pyqtSlot in PyQt5

Introduction

The PyQt5 website indicates that using @pyqtSlot(...) decreases the amount of memory required and increases speed, although the site is not clear in what way. I wrote pyqt5_connections_mem_speed.py to get specifics on this statement.

This script generates the following output on my Windows 7 x64 virtual machine, running on my laptop:

Comparing speed for 1000 samples of 1000000 emits
(Raw: expect approx 1460 sec more to complete)
(Pyqt Slot: expect approx 1437 sec more to complete)

Raw slot mean, stddev:  1.45 0.019
Pyqt slot mean, stddev: 1.446 0.015
Percent gain with pyqtSlot: 0 %

Raw times:       [1.447, 1.478, 1.493, 1.482, 1.476, 1.45, 1.446, 1.433, 1.468, 1.451, 1.441, 1.458, 1.456, 1.467, 1.453, 1.444, 1.445, 1.446, 1.482, 1.446, 1.458, 1.458, 1.516, ...]
Pyqt slot times: [1.437, 1.445, 1.437, 1.432, 1.444, 1.434, 1.434, 1.434, 1.438, 1.431, 1.443, 1.44, 1.436, 1.434, 1.434, 1.444, 1.432, 1.443, 1.45, 1.469, 1.447, 1.437, 1.44, 1.438, ...]


Comparing mem and time required to create N connections, N from 100 to 10000000

Measuring for 100 connections
              # connects     mem (bytes)   time (sec)
Raw         :        100               0      0.00148
Pyqt Slot   :        100               0     0.000311
Ratios      :                        nan            5

Measuring for 1000 connections
              # connects     mem (bytes)   time (sec)
Raw         :       1000          618496       0.0125
Pyqt Slot   :       1000           65536      0.00211
Ratios      :                          9            6

Measuring for 10000 connections
              # connects     mem (bytes)   time (sec)
Raw         :      10000         8507392        0.127
Pyqt Slot   :      10000          327680       0.0201
Ratios      :                         26            6

Measuring for 100000 connections
              # connects     mem (bytes)   time (sec)
Raw         :     100000        82685952         1.26
Pyqt Slot   :     100000         1048576        0.198
Ratios      :                         79            6

Measuring for 1000000 connections
              # connects     mem (bytes)   time (sec)
Raw         :    1000000       841474048         12.5
Pyqt Slot   :    1000000        16945152         1.97
Ratios      :                         50            6

The output is analysed in the following subsections.

Signaling speed

The first test compares the speed of signaling with and without the pyqtSlot decorator. That is, it checks whether there is a difference in speed between using

    class Handler:
        def slot(self):
            pass

and using

    from PyQt5.QtCore import QObject, pyqtSlot

    class Handler(QObject):
        @pyqtSlot()
        def slot(self):
            pass

when a signal connected to handler.slot is emitted. The script times how long it takes to emit a million signals, repeats this 1000 times, and averages the results.

The difference is not significant: the gain is 0% in the above capture, and a couple of separate runs showed around 2-4%. Either way, in a typical application this speed difference would be completely unnoticeable.
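
The timing procedure described above can be sketched in plain Python. This is only an illustration of the approach, not the code from pyqt5_connections_mem_speed.py: time_samples, percent_gain and fake_emit are hypothetical names, fake_emit stands in for a real signal.emit, and the percent-gain formula is an assumption about how the summary line of the script is computed.

```python
import statistics
import time


def time_samples(emit, num_samples, emits_per_sample):
    """Return one elapsed time (in seconds) per sample of repeated emit() calls."""
    times = []
    for _ in range(num_samples):
        start = time.perf_counter()
        for _ in range(emits_per_sample):
            emit()
        times.append(time.perf_counter() - start)
    return times


def percent_gain(raw_times, slot_times):
    """Percent reduction of the mean time, relative to the raw mean (assumed formula)."""
    raw_mean = statistics.mean(raw_times)
    slot_mean = statistics.mean(slot_times)
    return 100.0 * (raw_mean - slot_mean) / raw_mean


def fake_emit():
    """Hypothetical stand-in for signal.emit so the sketch runs without PyQt5."""
    pass


samples = time_samples(fake_emit, num_samples=5, emits_per_sample=10_000)
print("mean, stddev:", statistics.mean(samples), statistics.stdev(samples))

# Applying the assumed formula to the two means reported in the capture:
print("Percent gain:", round(percent_gain([1.45], [1.446])), "%")
```

In a real measurement, emit would be the bound emit method of a signal connected to handler.slot, and the sample counts would be much larger, as in the capture.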

Connection establishment speed

The next test in the output compares the memory used by connections and the time required to establish them; no signal emission is involved. The time results show that connecting to pyqtSlot'd methods is about 6 times faster than connecting to "raw" methods. This is significant, but would only matter in an application where establishing raw connections was a significant portion of the total CPU time. In my opinion this is not a common occurrence.

For example, if half the CPU time of an app is spent establishing raw connections (presumably because a huge number of objects are being created and destroyed that either emit or receive), then switching to pyqtSlot'd methods could cut the total CPU time by roughly 40%, since the connection half would shrink to about a sixth of its size. But in most GUI applications, establishing connections is likely to be sporadic: when dialog windows are opened, threads started, graphics scene objects created, lists populated. These operations involve, in my experience, much more than just establishing a few connections; various functions must be called, classes instantiated, etc. I doubt the speed effect would be noticeable, but it is good to know what could affect connect speed, and this information could certainly help focus profiling: if you see the GUI take a long time to react to a click, take a quick look at the number of raw connections established as a result of the click, compared to the rest of the code involved in reacting to the click.
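
The arithmetic behind that estimate can be checked with a short calculation. It assumes that only the connection time changes and that the factor of 6 measured above applies; total_time_saving is a name made up for this sketch.

```python
def total_time_saving(connect_fraction, speedup):
    """Fraction of total CPU time saved when only connecting gets faster.

    connect_fraction: share of total CPU time spent establishing connections.
    speedup: how many times faster connecting becomes (about 6 in the capture).
    """
    return connect_fraction * (1.0 - 1.0 / speedup)


for fraction in (0.5, 0.1, 0.01):
    saving = total_time_saving(fraction, speedup=6.0)
    print(f"connect share {fraction:.0%} -> total saving {saving:.1%}")
```

For the 50% case this gives a saving of about 42% of total CPU time; when connections account for only 1% of the CPU time, the saving is under 1%.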

Connection memory

In terms of memory, the results show that establishing connections to raw methods takes 10 to 100 times more memory than to pyqtSlot'd methods. The accuracy is rather low, presumably due to limitations of the memory size computation functions used, although the numbers are consistent across runs. It would be nice to have a more accurate measurement, but if we take the numbers in the capture above at face value, 1000 pyqtSlot'd connections use about 64 KB, vs about 600 KB for the same number of raw connections. Since 1000 connections at any given time is again not very likely, and 600 KB is really not worth worrying about on a desktop, most applications probably don't need to care.

It would certainly be important where memory is at a premium, as on (current) mobile devices and embedded systems, or in applications that establish 1000+ connections (I could see this in a GUI table view where each row of the table model is listening for changes from a data object).
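
The memory figures can be recomputed from the capture. The sketch below derives bytes per connection and the raw/pyqtSlot ratio for each row. It also checks whether every reading is a multiple of 4096 bytes, which would be consistent with (but does not prove) the measurement working at memory-page granularity. The 4096-byte page size is an assumption, and MEASUREMENTS and summarize are names introduced only for this example.

```python
PAGE_SIZE = 4096  # assumed page size in bytes; not stated in the article

# (connections, raw bytes, pyqtSlot bytes), transcribed from the capture above
MEASUREMENTS = [
    (1_000, 618_496, 65_536),
    (10_000, 8_507_392, 327_680),
    (100_000, 82_685_952, 1_048_576),
    (1_000_000, 841_474_048, 16_945_152),
]


def summarize(n, raw_bytes, slot_bytes):
    """Per-connection cost, memory ratio and page alignment for one table row."""
    return {
        "raw_per_conn": raw_bytes / n,
        "slot_per_conn": slot_bytes / n,
        "ratio": round(raw_bytes / slot_bytes),
        "page_aligned": raw_bytes % PAGE_SIZE == 0 and slot_bytes % PAGE_SIZE == 0,
    }


for row in MEASUREMENTS:
    s = summarize(*row)
    print(
        f"{row[0]:>9}: raw {s['raw_per_conn']:.0f} B/conn, "
        f"pyqtSlot {s['slot_per_conn']:.0f} B/conn, "
        f"ratio {s['ratio']}, page aligned: {s['page_aligned']}"
    )
```

Every reading in the table is a whole multiple of 4096, so small differences (such as the zeros reported for 100 connections) are likely below what this measurement can resolve.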

Feedback on the above would be most appreciated; it is very easy to get performance analyses wrong!

Other Points of Interest

Wrapping slots

Sometimes it is necessary to wrap a slot in a function that does extra stuff before and/or after the slot is called. For example, say you want to wrap all your slots with a function that will catch any exception raised while the slot is called, log the error somewhere and continue gracefully (useful during development!). You could achieve this by replacing

    class Handler(QObject):
        @pyqtSlot()
        def slot(self):
            pass

by this:

    class Handler(QObject):
        @slot_wrapper
        def slot(self):
            pass

with slot_wrapper defined as

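    from PyQt5.QtCore import pyqtSlot

    def slot_wrapper(func):
        # The wrapper replaces a method, so it must accept and forward self
        def wrapped_slot(self):
            # ... do extra stuff ...
            func(self)

        # pyqtSlot() hands back the very function it decorated
        pyqt_slot = pyqtSlot()(wrapped_slot)
        assert pyqt_slot is wrapped_slot
        return pyqt_slot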

Given that the return value of pyqtSlot()(wrapped_slot) is wrapped_slot, and the latter is not a method on a QObject but just a bare function (albeit a closure in the Python sense of the term), I wondered whether the wrapped version would have the same memory and time performance as a raw slot. According to the PyQt5 author Phil Thompson, the handler's meta-object is what provides the memory and speed improvements: pyqtSlot()(func) records information about func in the meta-object of its QObject. But does this work on a wrapper of a method?

The script pyqt5_connections_mem_speed.py can test this by setting USE_WRAPPED_SLOT to True. Interestingly, the results don't change. Indeed, inspecting handler.metaObject() using the Python debugger^** reveals that wrapped_slot is in it (somehow!). So wrapping a pyqtSlot'd method still maintains the memory and speed advantages of unwrapped pyqtSlot'd methods, which is great news!
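
The exception-catching use case mentioned at the start of this section could be sketched as follows. This is a plain-Python illustration, not code from the script: log_exceptions and the logger name "slots" are hypothetical, and Handler here is an ordinary class. In a PyQt5 application the returned function would additionally be passed through pyqtSlot()(...) as in slot_wrapper above; that step is left out so the sketch runs without PyQt5.

```python
import functools
import logging

logger = logging.getLogger("slots")  # hypothetical logger name


def log_exceptions(func):
    """Wrap a method so that exceptions are logged instead of propagating."""

    @functools.wraps(func)
    def wrapped(self, *args, **kwargs):
        try:
            return func(self, *args, **kwargs)
        except Exception:
            # logger.exception records the message together with the traceback
            logger.exception("Unhandled error in slot %s", func.__name__)

    return wrapped


class Handler:
    def __init__(self):
        self.calls = 0

    @log_exceptions
    def slot(self):
        self.calls += 1
        raise ValueError("boom")


handler = Handler()
result = handler.slot()  # the error is logged; the call returns None
print(handler.calls, result)
```

Swallowing every exception is convenient during development, as noted above, but it can hide real bugs, so a production wrapper might re-raise after logging.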

QObject creation

When populating a list widget with list items, it can be necessary to connect each item to some other object that may change as the user interacts with an application. An example would be an IDE search results list that shows the lines of code that match a certain pattern -- when the user edits the code in the editor, the search panel should update the individual lines affected by the change without re-running the complete search. For each list item created, a handler must be created and connected to the other object. In such a case, the time and memory comparison of connections must include the time to create each list item: in the case of a raw slot, the handler need not derive from QObject; in the case of a pyqtSlot, the handler *must* derive from QObject, and this base portion of the handler requires some time to create.

To test this, I extended the script to have an additional test that includes the creation time of the handler. This shows the following output:

Comparing mem and time required to create N objects each with 1 connection, N from 100 to 10000000

Measuring for 100 connections
              # connects     mem (bytes)   time (sec)
Raw         :        100               0       0.0015
Pyqt Slot   :        100               0     0.000495
Ratios      :                        nan            3

Measuring for 1000 connections
              # connects     mem (bytes)   time (sec)
Raw         :       1000               0       0.0138
Pyqt Slot   :       1000               0       0.0035
Ratios      :                        nan            4

Measuring for 10000 connections
              # connects     mem (bytes)   time (sec)
Raw         :      10000          458752        0.133
Pyqt Slot   :      10000          901120       0.0386
Ratios      :                          1            3

Measuring for 100000 connections
              # connects     mem (bytes)   time (sec)
Raw         :     100000        13320192         1.31
Pyqt Slot   :     100000         8753152        0.372
Ratios      :                          2            4

Measuring for 1000000 connections
              # connects     mem (bytes)   time (sec)
Raw         :    1000000       762757120         13.9
Pyqt Slot   :    1000000       178745344         4.12
Ratios      :                          4            3

The time advantage of connecting to pyqtSlots over connecting to raw slots is now about half of what it was, all other things being equal. A delay of 1/10th of a second is not likely to be perceived by the user, so for the advantage of pyqtSlots to be noticeable, the number of QObjects created must be on the order of 10,000 or more (in the capture above, the saving at 10,000 objects is about 0.09 s). It is not likely that an application that could have so many items in a list widget would populate the complete list, since this in itself would be a slow operation; the developer would likely add only about 100 items to the list at a time, and the difference as the user scrolls would likely not be noticeable.
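
A rough way to estimate where the time saving reaches a given threshold is to take the per-object saving from each row of the capture and scale it linearly. The linear scaling is an assumption, the 0.1 s threshold is the one used in the text, and TIMINGS and objects_for_saving are names introduced only for this sketch.

```python
# (objects, raw seconds, pyqtSlot seconds), transcribed from the capture above
TIMINGS = [
    (100, 0.0015, 0.000495),
    (1_000, 0.0138, 0.0035),
    (10_000, 0.133, 0.0386),
    (100_000, 1.31, 0.372),
    (1_000_000, 13.9, 4.12),
]


def objects_for_saving(n, raw_sec, slot_sec, target_sec=0.1):
    """Estimate how many objects are needed for pyqtSlot to save target_sec."""
    saving_per_object = (raw_sec - slot_sec) / n
    return target_sec / saving_per_object


for n, raw_sec, slot_sec in TIMINGS:
    estimate = objects_for_saving(n, raw_sec, slot_sec)
    print(f"from the {n}-object row: about {estimate:,.0f} objects")
```

The per-object saving is nearly constant across rows (roughly 9 to 10 microseconds), so each row yields an estimate close to 10,000 objects.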

The memory difference, however, is massively reduced: it is now either too small to discern, or only a factor of 4 at a very high number of objects. Still, a factor of 4 less memory is nothing to sneeze at, so when an application generates large numbers of QObjects with even just one connection each, pyqtSlot is definitely still worth using.

^** Footnote: checking meta-object for slots:

    meta = handler.metaObject()
    meta_methods = [str(meta.method(i).methodSignature())
                    for i in range(meta.methodOffset(), meta.methodCount())]
    print(meta_methods)

History

First version submitted September 3, 2016