Category Archives: OpenOffice.org

Introduction

Beside the problem of general polygon clipping, which has been thoroughly researched in the past twenty or so years, it is sometimes the case that only rectangular areas have to be clipped against each other. Two very prominent examples are the calculation of redraw areas in a GUI application – widget and application content areas are, because of performance and simplicity, often represented as rectangles, and quick, approximative spatial indexing, e.g. for GIS data.

It turns out that constraining clip calculations to axis-aligned bounding boxes (aka AABB) allows for noticeable simplifications in the algorithm – regarding code and time complexity, as well as numerical stability.

The algorithm

For clipping a set of AABBs against each other, the common sweep line algorithm is employed, in this case sweeping a vertical line from the leftmost box over to the rightmost. Therefore, each box B_i contributes two sweep line events E_l_i and E_r_i to the algorithm, one for its left and one for its right edge. After sorting those events in ascending order, a line is swept horizontally across all boxes:

Each time a box’s left edge is hit by the sweep line, two horizontal edges H_u_i and H_l_i are inserted into a list of currently-active horizontal segments. This list is kept sorted with ascending y values, and every vertical edge event is checked against intersection with all active horizontal edge segments.

At the core of every polygon clipping algorithm, mutual edge intersections need to be computed. Fortunately, as rectangles are convex polygons and free of self-intersections, a lot of the more complicated preprocessing steps involved for generic polygon clipping can be avoided. One of the most notable aspects is the fact that this algorithm is numericable stable also when performing intersection calculations with finite-precision floating-point math, because no precision-reducing operations need to be performed (this is in contrast to general polygon clipping, which gets notoriously instable under floating-point math, since the (sometimes repeated) calculation of intersection points introduces round-off errors – obvious for oblique edges).

For intersection calculations of AABBs, no calculations whatsoever are necessary on the coordinate elements, the resulting intersection vertices are just element-wise merged input coordinates.

Basically, four cases of edge intersections can be distinguished:

Note that, since the sweep line is vertical, it is always a right or a left edge of a rectangle, that needs to be intersected with either an upper or a lower edge. Degenerate cases, such as rectangles with zero height or width, or two exactly identical rectangles, are handled in a defined, but consistent way without affecting the general algorithm.

For each sweep line event (being either a left or a right rectangle edge), all currently active horizontal edge segments are processed, starting with the one with the smallest y value. Each left edge sweep line event creates a polygon, into which intersecting horizontal edges are merged. Therefore, when processing the horizontal edge segments, the sweep line will carry a current polygon P_c. The sweep line’s current polygon may change, naturally, as it intersects horizontal edges, taking up the associated polygon of the intersecting edge and in turn passing the current polygon to that horizontal edge:

For the sake of clarity, I have omitted symmetric cases here, which you get when polygon orientation is reversed (denoting reversed inside and outside areas). Those follow rather straight-forward. A version of the algorithm that handles those cases, and is thus able to perform the usual boolean operations on AABB-defined polygons, is implemented in c++, and used in OpenOffice.org’s graphic subsystem. A standalone version can be found here.

I'm going to FOSDEM, the Free and Open Source Software Developers' European Meeting

I’m actually even presenting!

Just uploaded ooo-build OOo 3.2 rc2 mac builds here.

Just a brief notice, took caff+tls, pulled changes up to latest debian version from svn, and hacked it into working state: script is here, suitable .caffrc is there.

Spent an intense last week in Orvieto, Italy. First two days had the 2nd odf plugfest; glad to see so many enthusiastic people from the odf universe again, or for the first time in person – and of course witnessing big corporation representatives like Doug and Rob sitting on the table, striving for better odf interop.

img_1630

Following that had three days of OpenOffice.org conference, with many interesting talks (including those from my excellent colleagues Petr, Noel, Cedric, Fridrich, Kendy, Rodo and Kohei).
I found the outlined ideas around an "odfkit" (the similarity in name to webkit is not by accident) quite remarkable – if that ends up in more code reuse across the place (and ideally also reusing existing code in their implementation), this is Good ™.

img_1716-up

Also great to meet the ever-energetic Chris Noack again; it was mostly due to him that I attended a few UX-related talks – and had at least the occasional feeling that approaches there were a bit by-the-book, maybe leaving a few peculiarities and potential synergies available in software generally and FLOSS specifically out of the equation. Reportedly, there are also students working on UX topics, so it would be really awesome to see them join the education project – in an attempt to tear down the wall between coders and interaction designers, that at least I perceive is existing in OOo.

Been at the mentor summit for the first time, and it was even exceeding my high expectations. Awe-some. If there’s another event that pulls together so many friendly high-profile FLOSS people, I’d like to know immediately.

Group Photo - everyone including photographers

Many kudos to Google for the event, and warthog9 for the nice group pic!

With the mentor summit coming up, it’s high time to wrap up this year’s Google Summer of Code for ooo-build. We started the term with 6 students, of which one was sadly missing in action right from the beginning (to set the record straight, he returned all funds).

All participants initially had to fight with OOo’s inherent complexity; I can only stress the importance of getting used to build system, installation & debugging peculiarities before the actual coding starts.

Progress until midterm was good to excellent, except for one case, where we had to make the very hard decision whether to continue with a student showing insufficient results – in the end, the agreement was to drop him, pretty much according to all best practices and lessons learned from other organisations. What I find most encouraging, and open-minded, is the fact that he’s now returning to OOo, and even bringing a few other students from his university for a joint project. Welcome back, Jon!

Which leaves us with a total of four completed projects:

  • Cross-building OOo for win32 (Jesus Corrius, the code (basically all crosswin32-prefixed patches))
  • Impress slide layout rework (Dona Hertel, the code)
  • Writer document comparison (Tzvetelina Tzeneva, the code)
  • Writer document navigation buttons (Maja Djordjevic, the code)

For the last two, let me just reference Tzvetelina’s and Maja’s respective blog postings; I couldn’t possibly explain and advertise it better.

The idea for the cross-building stems from the fact that compiling OOo on win32 is dog slow. With OOo consisting of some 90% c++, compilation is highly i/o-intensive, with an access pattern that apparently makes the Linux buffer cache shine, and the Windows one fall flat on its face. Additionally, of course you’d free people from buying and relying on proprietary software. Additionally, having something cross-buildable usually makes a project’s build system cleaner and more orthogonal.

With the Impress slide layout project, the aim was to get rid of the many hard-coded places where slide layout kinds were handled, following the paradigm of having pure data – the general layout – in a data structure, and not implicitely in code. Of course, an extra benefit is that changing or adding layouts now no longer needs re-compilation of OOo. Specifically, Impress now loads an autolayout configuration file on startup, and exposes the necessary data structures via OOo’s component model to the lower-level subsystems, for import and export.

This was my second GSoC term as a mentor, and my first one as an organisation admin. I’m more than happy with the results; our students did really well, and I hope we’ll see continued contribution from them over time. I also wish to thank my fellow mentors for their time and enthusiasm, and of course Google for making all of this possible in the first place!

2009-summer-of-code-logo-final-r3-01

Finally, there’ll be a panel presentation of the GSoC results at the annual OpenOffice.org conference; we’ll hopefully manage to get most of the students and mentors there. Will keep you posted.

The motivations and actual demographic groups the result’s responders recruited themselves from set aside: 53% pro UI-overhaul seems to be pretty low; at least this would make keeping the old UI optionally available a priority for me (and not an afterthought).

As mentioned by several, last week saw Novell’s HackWeek IV. I had two important vectors along which I wanted to advance, namely bringing slideshow a bit closer to using the new drawinglayer primitives, and improving on the built-in testing facilities in ooo-build.

I started with doing an inventory of things to do for the slideshow upgrade, a pleasant surprise was the fact that Armin already provides a factory for retrieving XPrimitive2D for a given Impress XShape – so technically, the slideshow can retrieve all that’s necessary via pure UNO. Looking at the renderer implementation (that outputs XPrimitive2D on the XCanvas render substrate used in slideshow), I found a bunch of loose ends, and got side-tracked fixing those.

The first thing I actually tackled was the way canvas implements gradients (I seem to develop a knack for those recently). The initial interface for those was rather rigid, and to add insult to injury, extremely tied to ODF’s weird gradient definitions. So I changed the XParametricPolyPolygon2DFactory into a generic XMultiServiceFactory instead, that takes the following parameters:

   Gradients - all gradients need to support two construction
   parameters, "Colors" being a sequence of a sequence of doubles
   and "Stops" being a sequence of doubles. Both must
   have the same length, and at least two elements. See
   http://www.w3.org/TR/SVG11/pservers.html#GradientStops for
   the semantics of gradient stops and colors.
   Required gradient services:

   * "LinearGradient" - the gradient varies linearly between
     the given colors. without coordinate system
     transformation, the color interpolation happens in
     increasing x direction, and is constant in y
     direction. Equivalent to svg linear gradient

http://www.w3.org/TR/SVG11/pservers.html#LinearGradients

   * "EllipticalGradient" - this gradient has zeroth color
     index in the middle, and varies linearly between center
     and final color. The services takes an additional
     parameter named "AspectRatio" of double
     (width divided by height), if this aspect ratio is 1, the
     gradient is circular. If it's not 1, the gradient is
     elliptical, with the special twist that the aspect ratio
     is maintained also for the center color: the gradient will
     not collapse into a single point, but become a line of
     center color. If "AspectRatio" is missing, or equal to 1,
     this gradient yields similar results as the svg radial
     gradient

http://www.w3.org/TR/SVG11/pservers.html#RadialGradients

   * "RectangularGradient" - this gradient has zeroth color
     index in the middle, and varies linearly between center
     and final color via rectangular boxes
     around the center point. The services takes an additional
     parameter named "AspectRatio" of double
     (width divided by height), if this aspect ratio is 1, the
     gradient is quadratic. If it's not 1, the gradient is
     rectangular, with the special twist that the aspect ratio
     is maintained also for the center color: the gradient will
     not collapse into a single point, but become a line of
     center color.

Further massaging the code to share the same gradient texture transformation setup across drawinglayer, canvas, and cppcanvas module, this not only finally gave a nicely uniform rendering of gradients (especially the canvas had some glitches before, with one or the other canvas implementation even missing a gradient type), but also much cleaner code (in terms of comprehensibility, and in terms of implicit duplication of functionality). Final patch has 1237 insertions(+), 1016 deletions(-), containing new functionality (see below) plus substantial commentary (amounting, in total, to about 150 lines). Which gets me quite close to one of my unstated goals of getting less code with more features, when doing refactoring. ;-)

Kudos to Armin btw. for abstracting away the legacy gradients pixel-by-pixel into the usual matrix plus texture function concept!

If you’ve read closely the parameter description for the new gradient factory, you’ve noticed that svg gradients are already fully covered by the canvas now (the little twist is to omit the AspectRatio parameter, and instead scale the gradient anisotrophically via the texture transform, to get the normal svg way of isobare gradient rasterization). Possibly one of the next useful additions would be an optional color transformation function, like the sigmoidal gradient that’s often used in MSO docs – with that, we should get fairly close to implementing the superset of all relevant gradient concepts in office land (well, still missing gdiplus-like path gradient, but that’s rather arcane and seldomly used))

‘Nuff said, in the end, I used most of the week on this gradient rework, but it was time well spent. I used Friday to hack up slideshow to use the primitive renderer for XShapes, which actually works, but of course needs much more changes to e.g. properly support attribute animations (like changing fill color, font size etc.). Full patch is here and here. The more relevant achievement of Friday was the addition of ~convenient convwatch support in ooo-build:

On a known-good version, do this (ooconvwatch is here):

   "cd ooo-build/build/install/program; source ooenv; ooconvwatch -c \
    -d path_to_testdocs"

Wait.

When ooconvwatch finally returns, rebuild with your dubious patches, then:

   "cd ooo-build/build/install/program; source ooenv; ooconvwatch \
    -d path_to_testdocs"

Check /tmp/allfiles.txt for list of docs processed, check /tmp/_filename_.prn.diff.jpg for the graphical diffs.

Todo: collect sensible set of test docs. Throwing huge amounts of random docs at this is a waste. I’m pretty sure I can put everything that will possibly break on a handful of Impress pages. Times the number of odf/sxi/ppt/pptx format versions, and one still has a smallish number. Explore the database usage of convwatch – someone willing to share a description of that feature?

Update: in case it’s not painstakingly obvious, the set of test docs of course will not be closed; instead, after each regression not found by convwatch, it needs to be updated to henceforth catch the problem (referring to the “I have a good idea about what will possibly break in Impress”). ;-)

Part of ooo-build since early June, and now on the way to 3.2 for vanilla. Just a few pictures that should tell the story, starting with the state of affairs up to and including OOo 3.1:

blog-sample-0-ooold

The competition:

blog-sample-0

After the fix:

blog-sample-0-ooo

The differences are a bit subtle when looked at on this scale, but quite annoying when encountered by someone who designs her presentations with an eye for details. Especially the way sub-shapes were gradient-filled before, like the top end of the can or the eyes of the smiley. Kudos to haui btw. for the very nice hack to make OOo believe a sub-shape actually has the same bounding box as the main shape. Here’s the patch for those interested in the gory details.