Outliner Software Forum

How much data do you store?

Posted by MadaboutDana
Feb 14, 2019 at 09:32 AM

 

This is a very interesting question.

I am more and more wary of putting vast amounts of documentary data into any single database. That’s why I use Curiota for general information gathering.

My Curiota collection is about 7GB and steadily growing, but all files are stored separately. I believe Curiota uses Spotlight as its search engine; it’s very fast and efficient, so I’m not complaining.

I have about 3GB in Notebooks, which again stores multiple files rather than creating databases, and which I believe also uses Spotlight for searching. This works well, but isn’t very refined: there are no highlighted hits in the general search, so you have to run a separate search inside each document to isolate specific search terms.

My largest DEVONthink database is about 5GB, and while the search function is excellent, moving to the first “hit” in a large PDF can take a little time. Once DEVONthink has sorted itself out, it moves from hit to hit within a document extremely fast. But the initial loading takes a few seconds. However, nothing else has DEVONthink’s precision search facility…

... apart from FoxTrot, which is excellent, and in itself a very good argument for preserving data in separate files. I use the Pro version, which has an excellent Preview tool and moves from hit to hit like greased lightning. I can thoroughly recommend FoxTrot, even though they’re not doing a great job of marketing themselves. FoxTrot also has an iOS companion, rather ingeniously using just the text indexes created by the desktop version rather than syncing the entire mass of files. I can’t say I use it, but it’s a neat solution to a tricky problem.

The advantage of storing multiple separate files rather than relying on databases is the much greater ease of sharing across networks. Huge databases are notoriously difficult to transfer or sync, and awkward to back up. This isn’t such an issue when your corpus consists of lots of individual files and the search index is held separately (either in Spotlight or in some proprietary format; it doesn’t really matter which).
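
For anyone who wants to see the "separate files, separate index" idea in miniature, here is a rough Python sketch. It is purely illustrative: it is not how Spotlight, FoxTrot or DEVONthink actually work, and the function names (build_index, save_index, search), the JSON index format and the restriction to .txt files are all assumptions made for the example.

```python
import json
import re
import tempfile
from collections import defaultdict
from pathlib import Path


def build_index(root: Path) -> dict[str, list[str]]:
    """Map each lowercase word to the files (relative paths) that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for path in sorted(root.rglob("*.txt")):  # assumption: plain-text files only
        text = path.read_text(encoding="utf-8", errors="ignore")
        for word in set(re.findall(r"[a-z0-9]+", text.lower())):
            index[word].add(path.relative_to(root).as_posix())
    return {word: sorted(paths) for word, paths in index.items()}


def save_index(index: dict[str, list[str]], index_file: Path) -> None:
    """Write the index to its own small file, kept apart from the documents."""
    index_file.write_text(json.dumps(index), encoding="utf-8")


def search(index_file: Path, term: str) -> list[str]:
    """Look a single word up in the saved index; the documents are never opened."""
    index = json.loads(index_file.read_text(encoding="utf-8"))
    return index.get(term.lower(), [])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        corpus = Path(tmp) / "corpus"
        corpus.mkdir()
        (corpus / "notes.txt").write_text("Outliner notes and ideas", encoding="utf-8")
        (corpus / "todo.txt").write_text("Back up the notes", encoding="utf-8")

        index_file = Path(tmp) / "index.json"  # lives outside the corpus folder
        save_index(build_index(corpus), index_file)
        print(search(index_file, "notes"))
```

The point of the sketch is that the index file is small and disposable: the individual files can be copied, synced or backed up one at a time, and the index can simply be rebuilt from them if it is ever lost.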

Cheers,
Bill