Outliner Software Forum

CRIMP Alert: A Compiled List of PDF Managing and Search Tools

Posted by Derek Cornish
Feb 8, 2008 at 01:46 AM

 

Alexander Deliyannis wrote:

>I have little
>difficulty in deciding myself. I have always opted for leaving them in (a few)
>permanent windows folders and indexing/linking to them from database programs such
>as UltraRecall or whatever. The sheer size of the files is such that it would make no
>sense to include them _within_ a file.

The difficulty in deciding doesn’t arise in connection with the notes/ideas manager I use. For example, I use Zoot as my ideas database and as a capture tool for text snippets extracted from the web and from files. Like you, I have tended to store pdf, doc, htm files in organized Windows folders and link them to Zoot items. (This is in any case not a matter of choice as, unlike UltraRecall, Zoot cannot store these types of files as it currently only accepts plain text. OTOH, using Zoot avoids the temptation of loading one’s notes/idea manager with wodges of irrelevant information.)

Storing the ‘real’ files in the Windows folder system also has the advantage of keeping them available for indexing and searching using any competent desktop search engine. Zoot databases can themselves be indexed and searched by the same software, although they first have to be converted into (large) htm files - unless one is using Archivarius, apparently. (Does Archivarius index and search UR files yet?)

For me, the difficulty in deciding arises at the point of downloading the pdf, htm, etc. files from the web, and this is where the question of whether to store these types of files in the Windows folder system or in dedicated web capture database software comes in. For a long time I used Net Snippets, which uses the Windows folder system to store its files and so allows desktop search engines to index and search them. But if you want to organize your downloaded files in more complex ways - for example, by using keywords or multiple categories - then you may have to look to dedicated web capturing tools like Surfulater or Web Research - even though their content may not be easily accessed by desktop search engines, nor easily linked to one’s chosen information manager.

Web Research (WR) is currently my main web capture tool for certain purposes - e.g., for pdf, htm, doc, and image files connected with particular projects, and for files I am keeping for semi-permanent reference purposes (e.g., software specifications, manuals, and so on). I can hyperlink from WR to Zoot and vice versa, so from that point of view it works well. The downside is that my search engine of choice, dtSearch, can only index and search the htm files stored in WR. Maybe I could persuade the Archivarius developers to take a look at WR’s database file format…

Given their potential drawbacks, why would one ever want to use dedicated web capture tools in preference to simply downloading files into Windows folders and linking to Zoot? I think there are a number of reasons: (1) Quicker real-time saving and categorizing - or dumping first and classifying later; (2) Easy re-organization of imported files via categories/keywords when necessary; (3) Highlighting, metadata; (4) An intermediate store for files en route to the Windows folder system; (5) A useful place in which to browse through files.

>Apart from the size issue, I
>believe that files as such will be accessible for quite some time, whereas database
>programs come and go. Think of the time involved in importing such files to a database
>and then exporting them to one’s next information manager.

I think this is a good argument for not downloading files into one’s notes/ideas manager - where, in any case, they may just clog things up - but less valid in the case of web capture tools as these have some value as both temporary and permanent storage sites for particular projects or purposes - and usually offer quite good bulk exporting features these days.

>That said, information
>is not knowledge. A library of references makes little sense unless one invests in
>slowly building their comprehension of the ideas within that material, i.e. their
>knowledge, whether visually (with mind maps etc) or textually (with a classic
>outliner). For this I find many of the tools we discuss in this forum absolutely
>invaluable.

Absolutely agree on this.

>
>An indexing program complements the building of such an ‘idea
>structure’ by helping reference and support themes, once one knows what they are
>after. Personally, I was attracted to Archivarius by its support for an amazing
>multitude of file formats, as well as for my own working language which is often
>unsupported by anglosaxon made/oriented software.

Can’t argue with that :-).

Derek