While working on the rulers fix, Christian pointed out that, with my fix, using rulers on a numbered list of paragraphs totally kills the numbering – because of the way I have to change the outline-defining entities (I pointed out that their atomicity is a real problem). So, this is what you get when changing the outline control item in the middle of a numbered list of paragraphs:

blog-sample-1-ooold-indented

That is, changing any attribute, even the color of the number, restarts numbering in Impress (notice the double occurrence of the colored "a)"). While this might be acceptable when changing indentation, the behaviour with color is clearly non-intuitive, and also likely a fall-out of the list level fixes. Let’s look at the competition:

blog-sample-1-ooo-indented

Somewhat different – changing outline attributes doesn’t affect numbering. I initially thought a bit about the indentation issue, as you can find arguments for both ways; ultimately though, deviating from MSO for no extremely good reason is something we usually avoid in OOo-land; and incidentally there’s a good reason for the way PowerPoint is doing it: indentation, font, coloring, numbering scheme, etc. are just formatting. The only thing I would consider content is the outline level, i.e. the depth of the nesting (of the numbered paragraphs), on the document level.

So with the golden rule that content is king and formatting is nothing, I went and boldly changed the way numbering is done in ooo-build, namely that only the outline level determines the counting (of course, unnumbered and bulleted paragraphs in between still restart the numbering):

blog-sample-1-ooold-indented
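To make the rule concrete, here is a minimal Python sketch of "only the outline level determines the counting". It is not the ooo-build code: the `Para` class and `number_paragraphs` function are hypothetical names, and I am assuming that an unnumbered or bulleted paragraph resets the counters on all levels, which is only one possible reading of "restart".

```python
from dataclasses import dataclass


@dataclass
class Para:
    level: int              # outline depth, 0 = top level (the only "content")
    numbered: bool = True   # False for bulleted or unnumbered paragraphs
    color: str = "black"    # pure formatting, deliberately ignored below


def number_paragraphs(paras):
    """Return the list number of each paragraph, or None if it has none."""
    counters = {}  # outline level -> last number handed out on that level
    result = []
    for p in paras:
        if not p.numbered:
            # Assumption: a paragraph without a number restarts every level.
            counters.clear()
            result.append(None)
            continue
        # Returning to a shallower level forgets the deeper counters,
        # so a later sub-list starts at 1 again.
        for deeper in [lvl for lvl in counters if lvl > p.level]:
            del counters[deeper]
        counters[p.level] = counters.get(p.level, 0) + 1
        result.append(counters[p.level])
    return result
```

Note that `color` never enters the computation, so recoloring the second paragraph of a list no longer produces a second "a)".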

Dear LazyWeb, feedback about which behaviour you find more usable (and less surprising) much appreciated.

With the substantial changes in OOo 3.0 regarding list levels (that have caused a bit of trouble elsewhere), Impress seemed to be going along quite nicely, except for the issues when converting between ODF1.1 and tentative-ODF-1.2 documents.

Anyway.

But then along came someone who was using the rulers in Impress (which are disabled by default – you get them inside the “View” menu) and complained about extensive brokenness, so I started to have a deeper look:

blog-sample-1-ooold-indented blog-sample-1-ooo-indented

The left image shows the status quo (the red line shows the position of the ruler controls); note that the cursor is inside the last, most indented paragraph. The right image shows the behaviour as it was before 3.0 (the strong black lines were added to better visualize the ruler control position). Moving those ruler controls then gives this:

blog-sample-1-ooold-indented blog-sample-1-ooo-indented

Again, left side new, right side old behaviour. Apparently now, the ruler controls only add an offset to the already existing indentation of the outline, whereas before, they directly controlled bullet position and left margin of the text. What’s more, the lower control does not change text left margin, but first line indent – something just utterly useless in a presentation outline. Not exactly an improvement in usability, in my not so humble opinion. This is now fixed in ooo-build master.

That said, the fix has a tiny little fly in the ointment: changing the bullet/text distance needs to modify the entity that determines the whole bullet/numbering appearance (the SvxNumBulletItem, sadly therefore only atomically modifiable – a decision that might need re-assessment), which in turn leads to an outline format that will no longer adhere to e.g. changes in the master page styles. This is no different when changing said distance in an unfixed vanilla OOo, via the Bullets & Numbering dialog; and clearly a lot less annoying than the ruler behaviour there – but it still warrants mentioning, I guess.

Following the original announcement, more than 70 students applied to the Go OpenOffice project for the Google Summer of Code. I was truly impressed, and want to say “thank you!” to all who took the sometimes considerable effort to write a good application.

2009-summer-of-code-logo-final-r3-01

Choosing six projects among those many was not easy; but I think the collective mentors did an outstanding job selecting both excellent students and relevant tasks – so I’m happy to announce this year’s Go OpenOffice GSoC participants (in order of their last name):

Andrés Correa Casablanca will work on performance improvements

Jesús Corrius will have Win32 OOo cross-compile under Linux

Maja Djordjevic will add Hyperlink/Reference navigation buttons

Dona Hertel will extend the functionality of the templates in Impress

Tzvetelina Tzeneva will improve OOo Writer’s document comparison

Jonathan Winandy will add an Ocropus OCR integration to OOo

Yay to the successful applicants!

To all those who applied, but have not been selected: the competition was fierce, I can assure you, so don’t be put off, try again next year – and maybe give yourself this extra bit of a head start and continue working on OOo! We’ll always and happily mentor you, if you’re enthusiastic and willing to learn – just come and ask us, you already know where!

That only leaves me to thank Google for sponsoring us; thanks to all who applied, thanks to the mentors – without you folks, all of this just wouldn’t happen! Looking forward to a wonderful summer 2009!

With the recent survey about the community’s favorite distributed software configuration management system completed, the Engineering Steering Committee went into debating the candidate systems’ merits. Since there was a draw between mercurial and git (which of course counts double because of extra coolness), the ESC came to this very Solomonic decision: proportional to vote turn-out and coolness coherency, the OOo source tree will be split into four different repositories, each hosted by the respective DSCM prospect, in the following way:

dscm hg (with 49%) gets 48% of code:
 helpcontent2 officecfg binfilter sd ooo_custom_images sysui chart2
 wizards instsetoo_native framework odk icu writerfilter dmake psprint_config
 redland toolkit scripting xpdf bitstream_vera_fonts libxml2 basctl openssl
 lotuswordpro libxslt stoc fpicker cppu comphelper accessibility uui basegfx
 package rsc external rhino xmlhelp ucbhelper cppcanvas crashrep registry soltools
 lingucomponent hwpfilter jvmfwk lpsolve io basebmp sax hyphen embedserv
 pyuno writerperfect i18nutil xml2cmp libwps remotebridges rdbmaker jut
 postprocess sccomp twain offuh fileaccess MathMLDTD

dscm none (with 25%) gets 24% of code:
 svx sw dictionaries offapi dbaccess vcl sal xmerge vigra desktop xmloff scaddins
 cairo scp2 jfreereport solenv readlicense_oo autodoc ucb bridges tools stlport
 hsqldb canvas setup_native psprint shell saxon reportbuilder jurt cppuhelper dtrans
 swext codemaker curl testtools scsolver xmlscript javaunohelper sot ridljar
 external_images icc idlc testshl2 hunspell beanshell idl jpeg UnoControls unoil
 mdbtools epm regexp salhelper libwpg jvmaccess ure animations expat fondu
 unixODBC x11_extensions eventattacher sane

dscm git (with 23%) gets 22% of code:
 sc qadevOOo default_images svtools boost sfx2 extensions filter i18npool
 connectivity oovbaapi sdext basic python starmath reportdesign configmgr
 oox forms xmlsecurity lucene goodies tomcat slideshow apache-commons
 udkapi padmin berkeleydb libxmlsec javainstaller2 agg transex3 cli_ure
 config_office automation unotools embeddedobj avmedia libwpd moz linguistic
 store libtextcat unoxml neon bean cppunit cosv Mesa sj2 smoketestoo_native
 sandbox vos unodevtools udm afms cpputools np_sdk zlib o3tl msfontextract
 stax libegg packimages agfa_monotype_fonts

dscm bzr (with 3%) gets 4% of code:
 extras

Where the people that did not prefer any DSCM will get what they asked for, namely no DSCM at all: those modules will stay with subversion (the proposal to not even host them in any SCM, but simply hold one version on a file server, did not get a majority vote).

We’ve been selected to be part of this year’s awesome GSoC, so all of you OOo-affine students out there, start thinking about the cool projects you want to do during summer!

2009-summer-of-code-logo-final-r3-01

Should you be new to OOo and not have any immediate idea what to do, fear not, here’s a list of ideas to pick from (just be aware that many of those are initial vectors to point you in the right direction, and fleshing them out would be part of your task; we’ll of course help with that).

Similarly, if you’re an established OOo hacker, willing to mentor and/or missing your favourite idea, help the project and add your name and idea to the proposals page. GSoC is a wonderful opportunity to get people involved with OOo, so don’t miss the chance to get smart students working on your code!

Student application period starts on March 23rd, we’re looking forward to your applications!

I have been working a bit on improving the ooxml import for Impress recently, focusing on the SmartArt stuff from PowerPoint 2k7. A really nifty feature indeed. Cunningly enough, MS does not store any fallback shapes for SmartArt, thus leaving me with the sole option of implementing a SmartArt layout engine myself. Here’s what it can do already:

quickdiagram0a

And this is what the original looks like:

ppt03

But there’s more to it. Since I needed a SmartArt layouting engine anyways, it was quite natural to (re)use that for actually editing and relayouting the content in Impress! For that to work, of course either the ooxml input fragments or some derived data structure have to be available at the shape; again the most straight-forward way was to use ooxml directly (in the form of an in-memory representation of the xml tree, aka DOM). Having a group shape with four extra custom attributes then gives something like this:

smartart01

So “editing” this shape means tweaking the data xml fragment, i.e. adding or removing text or changing attributes, and then re-triggering the import/layouting engine. I just love it when code reuse is that easy. ;-)
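The edit-then-relayout loop can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual Impress code: the XML below is a made-up, heavily simplified stand-in for the real ooxml data fragment, and `relayout` and `set_point_text` are hypothetical names.

```python
from xml.dom import minidom

# Hypothetical, simplified stand-in for the data fragment kept at the shape.
DATA_XML = "<data><pt id='1'><t>First</t></pt><pt id='2'><t>Second</t></pt></data>"


def relayout(dom):
    """Stand-in for the layout engine: one (text, x, y) box per data point."""
    boxes = []
    for index, pt in enumerate(dom.getElementsByTagName("pt")):
        text = pt.getElementsByTagName("t")[0].firstChild.data
        boxes.append((text, index * 100, 0))  # trivial left-to-right layout
    return boxes


def set_point_text(dom, pt_id, new_text):
    """'Edit' the shape: tweak the in-memory DOM, then re-run the layout."""
    for pt in dom.getElementsByTagName("pt"):
        if pt.getAttribute("id") == pt_id:
            pt.getElementsByTagName("t")[0].firstChild.data = new_text
            return relayout(dom)
    raise KeyError(pt_id)


dom = minidom.parseString(DATA_XML)
print(set_point_text(dom, "2", "Changed"))
```

The point of the design is that import and editing share one code path: both end in the same layout call on the same DOM.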

What I have now is something working end-to-end, with basic import, basic layouting and basic editing working. What comes next is improving all the details, i.e. supporting all layout types, editing all aspects, not only text etc. Stay tuned!

As usual, FOSDEM was a blast. Sadly I missed the Friday beer event this year: due to my current over-loadedness, I only went to Brussels (very early) on Saturday morning. Gave a little talk about what I considered cool (and felt competent enough to talk about) in OOo’s gsl/graphics area together with Janneke, who really deserves (and actually got) the thunder about this awesomely cool dialog layouting implementation.

fosdem_small

For the curious, the slides are here (but we really talked & demoed a lot in between).

Following the announcements of Eike and Eric, I’m going to FOSDEM ‘09 as well. If you’re around: I’ll be talking a bit about OpenOffice.org’s graphics/gui core, and what’s hot there currently. Tentative schedule is Saturday at 17:15 in AW1.126.

While having to work on binary PowerPoint import/export recently, I found the support for “debugging” the actual file format a bit lacking (to say the least). Of course, for those with MSDN access there’s the magic FileViewer.exe, but that’s of limited use on those other platforms I fancy working on, plus one cannot easily extend it.

I was therefore enviously looking at Daniel’s biffdumper and even more at Kohei’s xls-dump.py, and thusly set off ripping the guts out of the latter & hacking up a ppt-dump.py – which was a fun project actually!

Basically, what this gives you is a human-readable (and diffable!) dump of binary ppt files, like this:

====================================================================
[DFF_msofbtClientTextbox]
(type: F00Dh inst: 0000h, vers: 000Fh, start: 2760, size: 127)
====================================================================

 ====================================================================
 [DFF_PST_TextHeaderAtom]
 (type: 0F9Fh inst: 0000h, vers: 0000h, start: 0, size: 4)
 ====================================================================

 0F9Fh: -------------------------------------------------------------
 0F9Fh: 01 00 00 00
 0F9Fh: -------------------------------------------------------------

 ====================================================================
 [DFF_PST_TextBytesAtom]
 (type: 0FA8h inst: 0000h, vers: 0000h, start: 12, size: 45)
 ====================================================================

 0FA8h: -------------------------------------------------------------
 0FA8h: text: 'Text^MText^MText^MText^MText^MText^MText^MText^MText'
 0FA8h: -------------------------------------------------------------
 0FA8h: 54 65 78 74 0D 54 65 78 74 0D 54 65 78 74 0D 54
 0FA8h: 65 78 74 0D 54 65 78 74 0D 54 65 78 74 0D 54 65
 0FA8h: 78 74 0D 54 65 78 74 0D 54 65 78 74 0D
 0FA8h: -------------------------------------------------------------

 ====================================================================
 [DFF_PST_StyleTextPropAtom]
 (type: 0FA1h inst: 0000h, vers: 0000h, start: 65, size: 22)
 ====================================================================

 0FA1h: -------------------------------------------------------------
 0FA1h: para props for 46 chars, indent: 0
 0FA1h: para prop given: para linespacing 80
 0FA1h: -------------------------------------------------------------
 0FA1h: char props for 46 chars
 0FA1h: char prop given: char font size 30
 0FA1h: -------------------------------------------------------------
 0FA1h: -------------------------------------------------------------
 0FA1h: 2E 00 00 00 00 00 00 10 00 00 50 00 2E 00 00 00
 0FA1h: 00 00 02 00 1E 00
 0FA1h: -------------------------------------------------------------

 ====================================================================
 [DFF_PST_TextSpecInfoAtom]
 (type: 0FAAh inst: 0000h, vers: 0000h, start: 95, size: 24)
 ====================================================================

 0FAAh: -------------------------------------------------------------
 0FAAh: 2D 00 00 00 01 00 00 00 00 00 01 00 00 00 07 00
 0FAAh: 00 00 00 00 09 08 00 00
 0FAAh: -------------------------------------------------------------

[...]

Kudos to Kohei for his great work on xls_dump, of which I reused the structure and most importantly the biff record parsing. I’m not aware of anything like ppt-dump.py, but would of course be interested if there is.
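For the curious, the type/inst/vers/size values in the dump above come from the 8-byte header that precedes every record. Here is a minimal Python sketch of decoding it and walking the children of a container. It is not the actual ppt-dump.py code, and `parse_record_header` and `iter_records` are hypothetical names; the layout assumed is the little-endian one from the published PowerPoint binary format (a 16-bit word with the version in the low 4 bits and the instance in the high 12, a 16-bit type, a 32-bit payload length).

```python
import struct

HEADER_SIZE = 8  # 2 bytes version/instance, 2 bytes type, 4 bytes length


def parse_record_header(data, offset=0):
    """Decode one record header starting at the given byte offset."""
    ver_inst, rec_type, size = struct.unpack_from("<HHI", data, offset)
    return {
        "type": rec_type,
        "vers": ver_inst & 0x000F,  # low 4 bits; 0xF marks a container
        "inst": ver_inst >> 4,      # high 12 bits
        "size": size,               # payload length, header not included
    }


def iter_records(payload):
    """Yield (header, body) for each record stored back to back in payload."""
    pos = 0
    while pos + HEADER_SIZE <= len(payload):
        header = parse_record_header(payload, pos)
        body_start = pos + HEADER_SIZE
        yield header, payload[body_start:body_start + header["size"]]
        pos = body_start + header["size"]


# Header bytes matching the DFF_msofbtClientTextbox record shown above.
print(parse_record_header(bytes.fromhex("0f000df07f000000")))
```

Since a container's payload is just more records, calling `iter_records` recursively whenever `vers` is 0xF gives the nested structure seen in the dump.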

Other than that, here’s a brief list of other FLOSS tools for MSO binary document handling I’m aware of (besides OOo, of course):

Update: merge with xls_dump done, adapted viewvc links

So, as seemingly the Mac users currently miss all the nice Go-Oo features, I put up 3.0 RC4 Intel builds here. A word of warning, though: in comparison to Linux & Windows, the Mac version of Go-Oo has not yet received broad testing, so please consider this version unstable. Any feedback greatly appreciated – and if you want localized UI, there are language packs alongside the RC4 package (e.g. for the fr or de locale).

Just in case you want to try a build yourself: the wiki has a howto.