Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

File formats based on ZIP files only work for small files. For big files you have huge overheads; opening and saving a moderately sized documents takes seconds (vs. milliseconds for writing changes to an SQLite database). There's a reason why Excel files are limited to a million rows, while Access databases aren't.

The complexity of including SQLite is trivial for practical purposes; it's already available on many systems, and if not you can include it by adding a single C file to your project.

Setting up a workflow for Google Protocol Buffers (another popular alternative for document file formats) is a lot more complex than building or linking with SQLite, and it doesn't stop people from using them.

One thing that speaks for SQLite is the quality of the project; it's one of the best maintained Open Source projects with fantastic quality assurance and support for almost every OS. This means that you are unlikely to run into issues compiling or working with SQLite, like you might have with alternative libraries like libxml2 or jsonc (which are still great libraries!!).

EDIT: The big downside of SQLite is that it's unsuitable for documents that are exposed to the user because of the temporary files (like the WAL). If you have a ZIP based file format that you atomically rewrite from scratch on every save, it's almost impossible to corrupt. Your users can just take the file and email it and nothing bad will happen. I'm not sure what happens if you email an SQLite database file that is currently being used. I've done that in the past and have been surprised that some data seemed to be missing, but I don't recall the details. Hence SQLite is often used for application data files that are not directly exposed to the user.



You could get SQLite to work as document files exposed to the user so long as you use sessions[1]. When a file is opened, copy the DB to a temporary file or to use memory and write all changes during operation to this new DB, recording them all in a session. When the user explicitly saves a document, apply the session to the real DB.

[1]: https://www.sqlite.org/sessionintro.html#:~:text=1%20Introdu...


That doesn't sounds like it would just combine the drawbacks of both approaches: You end up with slow open/save operations, and the writes to the db still aren't necessarily atomic (eg. if a user copies the file while changes are being applied)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: