
Too bad SQLite didn't exist at the time. It would be a pretty good candidate for something like that without eating gobs of memory for large documents.

To be fair to the Office team of the day, when your company also develops the compiler and can guarantee the safety of writing and loading raw structures under specific constraints (even in ways that violate programming language standards), it's not too bad of an idea. Not that it's the greatest, as even then there were certainly better ways of doing it, but the landscape of serialization wasn't as nice as it is now.



SQLite would not be a good choice.

The DOC file format, combined with OLE containers, was designed at least partly to deal with the common practice of keeping documents on floppies, and to make saving documents as speedy as possible.

This meant a combination of a pseudo filesystem (OLE containers can be summarised as a variation of the FAT filesystem) and blittable data structures. This way, when you saved an updated version of the document, Word would only extend the file with the new changes, minimizing the amount of I/O necessary.
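To make the "only extend the file" idea concrete, here is a minimal sketch in Python. It is not the real DOC or OLE layout: the record format (`RECORD`) and the helpers `append_change` and `load` are made up for illustration. The point is just that each save writes one small fixed-layout header plus the new bytes at the end of the file, and the reader replays the records in order.

```python
import struct

# Hypothetical fixed-layout ("blittable") change header, NOT the real DOC format:
# byte offset of the edit, number of bytes deleted, number of bytes inserted.
RECORD = struct.Struct("<III")

def append_change(path, offset, deleted, inserted_text):
    """Save one edit by appending to the file; earlier bytes are never rewritten."""
    data = inserted_text.encode("utf-8")
    with open(path, "ab") as f:
        f.write(RECORD.pack(offset, deleted, len(data)))
        f.write(data)
    # Bytes of I/O for this save: header plus the inserted text only.
    return RECORD.size + len(data)

def load(path):
    """Rebuild the document by replaying every appended change in order."""
    text = b""
    with open(path, "rb") as f:
        while True:
            header = f.read(RECORD.size)
            if len(header) < RECORD.size:
                break  # end of file
            offset, deleted, length = RECORD.unpack(header)
            inserted = f.read(length)
            # Offsets are byte offsets into the document as it stood before this edit.
            text = text[:offset] + inserted + text[offset + deleted:]
    return text.decode("utf-8")
```

The trade-off is the same one described above: saving is cheap because the write is proportional to the size of the change, while loading gets more expensive and the file keeps growing until something rewrites it from scratch.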

Another aspect to this was that DOC was not supposed to be an interchange file format - that's a role that was supposed to belong to RTF, which was much easier to parse and had a formal, published specification instead of memory dumps of internal data structures. However, it took more resources to update such a file, so RTF and DOC were kept in sync when it came to capabilities all the way to Office 2003 - everything you could do in DOC, you could do in RTF, and vice-versa.

Of course, practice quickly went in a different direction, and most people only think of RTF as the simpler sibling of DOC, because WordPad on Windows used it.



