O the Irony...

Posted by Simon
Jan 4, 2021 at 10:03 AM


I’ve been using Microsoft Word since the early 90s. As time wore on and people became very anti Microsoft and proprietary formats, I had to decide what to do with 2,000 MS Word files when I switched to macOS. Everyone said to use RTF, which can still be read in 100 years. Well, although that might be true, it’s clear in 2021 that docx or doc is better than RTF. Try converting RTF to Markdown. Not possible unless you’re a command line aficionado. My old early-90s doc files do not have this problem. Open them in Word and convert, then open in iA Writer or convert with Pandoc. Job done! Of course, the real laugh-out-loud was the many websites that said that to convert RTF to Markdown, you should first save it as a doc/docx file and then convert! Why? Because many tools can do that. Even the great Pandoc cannot convert an RTF, but it can convert a Word document! O the irony.
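
For anyone who wants to script the docx-to-Markdown step, here is a minimal sketch, assuming Pandoc is installed and on the PATH. The helper names and the file name are made up for illustration; only the `-f`, `-t` and `-o` options are Pandoc's own.

```python
import subprocess
from pathlib import Path


def pandoc_command(docx_path):
    """Build the Pandoc command line for one .docx file (hypothetical helper)."""
    source = Path(docx_path)
    target = source.with_suffix(".md")  # e.g. letter.docx -> letter.md
    return ["pandoc", "-f", "docx", "-t", "markdown", "-o", str(target), str(source)]


def convert(docx_path):
    """Run Pandoc on the file; raises CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(docx_path), check=True)
```

Calling `convert` on each result of `Path(folder).glob("*.docx")` would work through a whole archive in one go.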


Posted by tightbeam
Jan 4, 2021 at 12:09 PM


> As time wore on and people became very anti Microsoft and proprietary format…

Thanks for a good lesson in why it’s usually smart to ignore the “anti” crowd when they have no real reason for being “anti” other than it feels right (or worse, “cool”).


Posted by Christoph
Jan 4, 2021 at 04:25 PM


Simon, yes, I see your point, but let me put some things into perspective.

The controversy over whether to use DOC or RTF dates from before 2008, when Microsoft came up with the new, open DOCX format, which is also supported by LibreOffice. Nobody really is or was opposed to DOCX. People were opposed to DOC because it was a closed format (the specification was not published), while at the same time RTF existed, which had a published specification (btw, *all* of these formats were developed by Microsoft).

Was it reasonable to favor an open, documented standard (RTF) over a closed format (DOC) that would also change with every version of Word? (Btw, are you sure you can still read all incarnations of the DOC format, e.g. the DOC format of Word 4 for Mac OS, which was different from the DOC format of Word 6 for Windows?)

Also, while *you* can easily convert doc formats, you must consider how much unnecessary blood, sweat and tears went into such tools, because the authors had to reverse engineer the closed DOC format and its quirks. I recommend reading the section on the DOC format on Wikipedia: https://en.wikipedia.org/wiki/Doc_(computing)#Specification

Btw, all your arguments for DOC also hold for RTF. You can still open it in Word (or any other word processor). And as far as I can see, Pandoc does not support DOC, but only DOCX. To be fair, you need to compare RTF with DOC, not with DOCX. DOCX can be considered a successor of both of these.

The other reason why people favored RTF over DOC was that they were not happy with Microsoft building a monopoly around MS Office and stifling innovation and competition in the market for office software. That’s why they wanted people to use a file format that was interoperable and could be read by other word processors as well. I still remember very well how in the 90s people sent me DOCs via mail that I couldn’t open because they used a different version of Word, or because I was sitting at a computer that did not have Windows/Office installed. And I remember the main backlash was not against people *storing* files in DOC format, but *sending* them in DOC format and expecting everyone to be able to open them. Also remember: the DOC format could contain macros and was actively used to spread viruses and trojans. RTF did not support macros. Security was the other big reason why admins urged everyone to use RTF instead of DOC.

In the end, Microsoft succeeded anyway, because people didn’t mind using closed software, and so Office has become a “quasi-standard”. But is it really desirable to have such a strong monopoly?

Compare this with the browser war. If Microsoft had won it, we would probably all still use Internet Explorer and closed technologies like ActiveX. Opposing the proliferation of closed software and proprietary formats and technologies and insisting on open formats makes sense; it’s not just done because it is “cool.”


Posted by Listerene
Jan 4, 2021 at 10:49 PM


Of course, if you ever need to convert .rtf, you can always convert it to docx and then into any form you want. Your observation is ... kinda nonsensical.


Posted by Amontillado
Jan 5, 2021 at 02:12 AM


Proprietary file formats are not a good selling point, but there is a spectrum. Proprietary and cryptic is something to avoid.

However, if a file format is obvious, just not documented, that’s not so bad.

Mellel, for example, uses a proprietary format, if you can call XML proprietary. I wanted a mail merge, so I studied a few Mellel documents. About three hours and 80 lines of Python later, I had my mail merge running.

Extracting plain text from Mellel files turns out to be trivial, so I would classify Mellel documents as a safe file format.
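
As an illustration of how little code this kind of extraction can take, here is a minimal sketch using only Python's standard library. The sample XML and the tag names are invented for the example and are not Mellel's actual schema, so the tag name would need adjusting after inspecting a real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real document's tags will differ.
SAMPLE = """<document>
  <paragraph>Dear <field>Name</field>,</paragraph>
  <paragraph>Thank you for your letter.</paragraph>
</document>"""


def extract_paragraphs(xml_source, paragraph_tag="paragraph"):
    """Return the plain text of each paragraph element, with nested markup dropped."""
    root = ET.fromstring(xml_source)
    # itertext() walks an element and its children, yielding only the text parts.
    return ["".join(el.itertext()).strip() for el in root.iter(paragraph_tag)]


if __name__ == "__main__":
    for line in extract_paragraphs(SAMPLE):
        print(line)
```

The same walk over the tree is a natural starting point for a mail merge: instead of dropping the nested elements, a script could replace the text of each placeholder element before writing the document back out.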


