
While updating the OOo coding standards in late 2006, the IncGuards item in particular sparked some discussion about why we don't just remove all those existing external guards. Kendy quickly came up with a removal script, but a few tests showed that builds using the old wntmsci10 (MSVC 2003) compiler became significantly slower (external header guards had originally been put there for performance reasons). Since Hamburg RelEng, as the main victim of increased build times, has now switched to MSVC 2005, it's about time to bin these warts – I'm currently committing the bulk of the changes to CWS incguards01.

Having changed my employer, I found it appropriate to move away from the blogs.sun.com company facility. I liked the WordPress engine the most, so I'm here now – and with the help of this little XSLT script (fed with this feed), even my old postings followed me. Without the comments, but WTH.

I hereby happily announce that starting this Friday, I'll be part of the great OpenOffice.org team at Novell. Being fully home-based and saving the ~2 hours of commuting every day, I can now hopefully push my hidden gsl agenda even more.

While hunting down broken images in the pdf import, I came across a very weird behaviour that was seemingly caused by plain C stream IO: every once in a while, streams that were explicitly opened in binary mode converted 0x0A into 0x0D – a one-to-one substitution, notably, not the usual 0x0A to 0x0D 0x0A line-end conversion. I wasted about a day trying to track that down, suspecting everything from crazy filesystem flags to symbol hijacking at link time (i.e. somebody sneaking their own fwrite/putc/whatever into the binary), laboriously ruling out one suspect after the other.

Finally, I did what I should have done in the first place: I used a different hex editor. And lo and behold, the trusty midnight commander showed all the files with correct 0x0A. Caught in the act and proven guilty: emacs’ hexl-mode.

Well, kind of. It turned out to be a rather unlucky combination of things going wrong: emacs, with its buffer encoding scheme, tries to be smart and detect the actual encoding – if there's a 0x0A in the file before any 0x0D, emacs assumes raw-text-unix and leaves line ends alone; if there's a 0x0D in the file before any 0x0A, emacs assumes raw-text-mac and converts all assumed line ends to 0x0D. And since hexl-mode seems to work on top of an existing buffer (and never loads the file directly from disk), this resulted in the described mess.

So, to save yourself some trouble: don't trust your tools. And make sure that files you want to work on with hexl-mode match an entry in file-coding-system-alist that enforces binary reads (elc, png, jpeg, ar, gz etc. in my setup).

Being honoured to mentor Shane Matthews on his GSoC 2007 project, Impress OpenGL rendered transitions, today I received the mentor gift – thank you Google for the t-shirt!

In related news, the resulting CWS recently got merged to HEAD, so the adventurous among you can now build this as an optional feature: just issue "build ENABLE_OPENGL=TRUE" in the slideshow module to be blessed with an ogltrans shared lib. And thanks to Rene, once CWS configure22 is integrated, one can even give "--enable-opengl" on the configure line. After that, register the ogltrans lib into OOo via unopkg, and use this presentation file to see what others are missing.

This could even be shipped as a separate extension after 2.4 is out…

After an impromptu dead parrot sketch at OOoCon ("the beamer works" – "no, it's dead" – "bah, it's not dead, it just needs a reboot"… you might look forward to the still-missing video), I had a rather rough ride with the proverbial last 10 percent of the browser functionality – I'm still not 100% happy with the representation in OOo's Calc, and the xcs schema parser leaves something to be desired. But the thing is surely ready for config hackers to play with now – grab your copy here:

09e22fc6d8d9f9bfd3dc7cf838ce6590 ooconfig.oxt

(For those of you who don't know what I'm talking about: OoConfig is a Python extension providing a tool quite similar in spirit to Mozilla's about:config page – a way of tweaking the complete (even hidden) set of configuration items. See the wiki for further information.)

Just found this. Pretty cool way of packaging an application to be completely relocatable (though without desktop integration, barring further hacks). Something for OOo Portable?

Via Russell Coker of Debian fame, the Boston Globe reports that 55 percent of the world's installed photovoltaic power is in: Germany. Yay! With my solar roof contributing, if only the tiniest amount…

Seems that Intel has set their TBB lib free – which solves a few of the basic issues outlined here (I've commented a bit on that in the context of C++).

Okay, what does all of this really buy us, right now? Let's assume we want to speed up Calc's way of parsing and calculating cell values. Given the following dependency tree (cell 1 refers to the result of cell 4, cell 6 to the result of cell 2, and so forth):

    parse_cell_4                                parse_cell_5
         |                                           |
    parse_cell_1        parse_cell_2            parse_cell_3
         |                   |                       |
         -------------- parse_cell_6 -----------------

The partial ordering of cell evaluation equivalent to this tree is as follows:

parse_cell_4 < parse_cell_1
parse_cell_5 < parse_cell_3
parse_cell_1 < parse_cell_6
parse_cell_2 < parse_cell_6
parse_cell_3 < parse_cell_6

To employ what we've talked about earlier for parallelizing this calculation, one would need a static expression (i.e. one fixed at compile time):

as_func(
    parse,
    as_func(
        parse,
        as_func(
            parse,
            4),
        1),
    as_func(
        parse,
        2),
    as_func(
        parse,
        as_func(
            parse,
            5),
        3),
    6)

(with as_func denoting that its arguments should be evaluated lazily – dependency subexpressions first, then the trailing cell number – and parse being a unary Calc cell parsing function, expecting the cell number as its sole parameter).

Now, having the formula dependencies fixed in the code isn't of much value for a spreadsheet, so we need to handle this a bit more dynamically. Futures and Actions in principle provide the functionality we need here – but there's one slight catch: the pool of threads processing the Actions or Futures might actually be too small to resolve all the cells that are in turn referenced from other ones (with circular cell references being the extreme example – but see below). This is because forcing a Future or Action to evaluate blocks the thread requesting that value, which will eventually lead to starvation, unless there's at least one more thread active to process the cell values other Actions are waiting for.
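To see the catch in isolation, here's a minimal sketch in today's C++ terms (std::future standing in for the Futures above – a hypothetical setup, not OOo code): a "pool" of just one worker thread runs the cell 6 job, which blocks forcing cell 2's value; the computation only finishes because another thread (here: main) eventually processes cell 2.

#include <future>
#include <iostream>
#include <thread>

int main()
{
    std::packaged_task<double()> cell2_job( []{ return 42.0; } );
    std::future<double> cell2( cell2_job.get_future() );

    // cell 6 references cell 2, so its job forces cell 2's future:
    std::packaged_task<double()> cell6_job(
        [&]{ return cell2.get() + 1.0; } );
    std::future<double> cell6( cell6_job.get_future() );

    // the single "pool" thread runs cell 6 - and blocks inside
    // cell2.get(); with no second worker, cell 2 would never get run
    std::thread worker( [&]{ cell6_job(); } );

    // starvation only ends because a second thread (main) steps in:
    cell2_job();
    worker.join();
    std::cout << "cell 6 = " << cell6.get() << '\n';   // prints 43
}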

Of course, one could remedy this by using N threads when dealing with N cells, but that would get out of hand pretty quickly. Alternatively, the cell parsing function can be split into two parts: the first generates a parse tree, thus extracting the referenced cells (depending on the heap allocation overhead, one could also throw away the parser outcome, except for the cell references – but OTOH, with a decent thread-local allocator, the full parsing could happen concurrently; YMMV). Given the list of references for each cell, one again gets the partial ordering over the value calculation:

vector< vector<int> > intermediate( cells.size() );
parallel_transform( cells.begin(), cells.end(),
                    intermediate.begin(),
                    &parse_step_1 );
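parse_step_1 is never spelled out here; purely for illustration, a toy version (hypothetical – it merely scans for "cellN" references instead of really parsing) could look like this:

#include <cctype>
#include <cstdlib>
#include <string>
#include <vector>
using namespace std;

vector<int> parse_step_1( string const& content )
{
    vector<int> refs;
    // scan for "cellN" occurrences; a real implementation would build
    // (and possibly keep) the full parse tree here
    for( string::size_type p = content.find("cell");
         p != string::npos;
         p = content.find("cell", p+4) )
    {
        if( p+4 < content.size() &&
            isdigit( static_cast<unsigned char>(content[p+4]) ) )
        {
            refs.push_back( atoi( content.c_str()+p+4 ) );
        }
    }
    return refs;
}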

For each cell, this step yields a vector of preconditions (other cells that need to be calculated first). Pushing the actual cell value calculation functions into a job queue, and handing it the dependency graph generated above (represented by the individual cells' references), we arrive at a fully dynamic version of the concurrent cell parsing example:

job_queue      queue;
vector<double> results( intermediate.size(), 0.0 );
for( int i = 0; i < int(intermediate.size()); ++i )
    queue.add_job( bind( &parse_step_2,   // the actual job, deferred
                         ref(results),
                         ref(cells),
                         i ),
                   intermediate[i] );     // this cell's prerequisites
queue.run();

If the bind expression looks weird, peeking at the prototypes of the involved functions might help clear things up:

/** adds a job functor

    @param functor
    Functor to call when the job is to be executed

    @param prerequisites
    Vector of indices into the job queue, denoting jobs that must be
    processed strictly before this job. Permitted to contain
    not-yet-existing indices.
 */
template< typename Func >
void job_queue::add_job( Func               functor,
                         vector<int> const& prerequisites );

and

/** calculates cell value

    @param io_results
    In/out result vector. Receives the resulting cell value, and is
    used to read the referenced cells' input values.

    @param cells
    Cell contents

    @param cell_index
    Index of the cell to process
 */
void parse_step_2( vector<double>&       io_results,
                   vector<string> const& cells,
                   int                   cell_index );
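A possible body would then be trivial, since the queue guarantees that all referenced cells are already calculated when the job runs (hypothetical – expression and parse stand in for whatever Calc actually uses):

void parse_step_2( vector<double>&       io_results,
                   vector<string> const& cells,
                   int                   cell_index )
{
    // all prerequisites are done by now, so evaluation never blocks
    expression e( parse( cells[cell_index] ) );   // full parse tree
    io_results[cell_index] = e.evaluate( io_results );
}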

This job queue can then decide, either globally or via policies, what
to do in various situations:

    Circular dependencies: either refuse to work on such a job, or
    use a default value for an arbitrary member of the cycle to break
    it (see the sketch after this list for where that policy would
    kick in).

    Whether to execute jobs in parallel or not: depending on the
    number of cores available (both physically and load-wise), the
    queue could decide to stay single-threaded if the number of jobs
    is low, or go multi-threaded for a larger number of jobs. Note
    that this decision might be highly influenced by the amount of
    work a single job involves, so external hints to the queue might
    be necessary. Kudos to mhu for the hint that it's wise to
    parallelize ten jobs that take one hour each, but not so for
    twenty jobs that only take a few microseconds to complete.
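To make those policy hooks a bit more tangible, here's a minimal single-threaded sketch of run() (hypothetical – m_jobs denotes an assumed member holding each job's functor and prerequisites; a real implementation would hand ready jobs to worker threads instead of calling them inline):

void job_queue::run()
{
    vector<bool> done( m_jobs.size(), false );
    bool progress = true;
    while( progress )
    {
        progress = false;
        for( size_t i = 0; i < m_jobs.size(); ++i )
        {
            if( done[i] )
                continue;

            // runnable only if every prerequisite is already done
            bool ready = true;
            for( size_t j = 0; j < m_jobs[i].prerequisites.size(); ++j )
                if( !done[ m_jobs[i].prerequisites[j] ] )
                {
                    ready = false;
                    break;
                }

            if( ready )
            {
                m_jobs[i].functor();   // e.g. the bound parse_step_2 call
                done[i] = progress = true;
            }
        }
    }
    // any job still pending now sits on a dependency cycle - this is
    // exactly where the circular-dependency policy above kicks in
}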

At any rate, fine-tuning this to various hardware, operating systems and deployment settings is much easier than with manual thread creation. Plus, given a few (falsifiable) attributes of the functions called, it's also much safer.