OOo my Threading (2)

I’ve already talked a bit
about status quo of threading in OOo, and listed some largely
useful stuff that others have been doing to remedy the problems
shared-state multi-threading poses.

Why not going further? If there’s a way to know that a function is pure, or that an
object is thread-safe, then it would be nice to have
automatic parallelization employed. The underlying issue that
needs to be controlled here are races –
prohibiting unserialized access to shared state. So, either a function
is pure, i.e. does not depend on shared state at all, or the called
subsystem takes care of itself against concurrent modification (this
is most easily achieved by the environment concept of the UNO
threading framework: the UNO runtime implicitely serializes
external access to thread-unsafe appartements).

Although C++ provides no portable ways to express those concepts on a
language level, for pure functions, there’s a way of using a limited
subset of lambda
expressions, that inhibit access to shared state on the syntax
level. And it’s perfectly possible to mark objects (or even subsets of
a class’ methods) to be thread-safe. One straight-forward way to do
this are specializations of UNO interface references, i.e. ones that
denote thread-safe
components.

Given all of this, we can form statements that contain:

thread-safe method calls

pure function calls

impure function calls

So, in fact, a runtime engine could reason about which subexpressions
can be executed concurrently, and which must be serialized. If you
treat method calls as what they are, e.g. implicitely carrying a
this pointer argument, a possible data flow graph might
look like this:

new object1                      new object2
 |                                |
 +->object1::methodA              +->object2::methodA
            |                               |
            +------------------------------>+->object2::methodB(object1)
                                                         |
                                                         v
                                                 object1::methodC

new object3
 |
 +->object3::methodA

That is, the this pointer is carried along as a target
for modifications, and as soon as two methods have access to the same
object, they need to be called sequentially. This does not apply for
UNO interface calls or objects that are tagged as thread-safe, of course. To be specific, a forest of data flow trees can be generated, which
defines a partial ordering over the subexpressions. If neither
exp1<exp2 nor exp1>exp2 can be deduced
from this ordering, those two subexpressions can be executed in
parallel. Really basic stuff, that compiler optimizers do, as well –
only that plain C/C++ doesn’t provide that many clues to safely
parallelize. From the example above, it is obvious that
object3::methodA can be executed concurrently to all other
methods, that object1::methodC must be execute strictly
after object2::methodA, and that
object1::methodA and object2::methodA can
also be executed concurrently.

Okay, this is largely crack-smoking. But there is something to be made
of it. Stay tuned.

Thorsten's Weblog