Some words about UltraRecall, etc vs Search tools (meow!)

Started by 22111 on 3/8/2023

22111 3/8/2023 10:40 am

Yesterday, some agitator (i.e. somebody of the "competition", perhaps?) tried to provoke UR prospects on bits into not buying, and since he obviously wasn't interested in our kind answers, he got his due today over there.

(He got two answers from me, one from UR's developer, and you will see proof of my allegation that he's an agitator, by searching their "previous comments" (or whatever they call'em) for "hj" only... Well, that Hans-Jürgen, hehehe!, or is it Her(r)mann-Josef? my god, Herman!)

Obviously, some info I then gave there, might be new to some "contributors" or other readers of this forum, so read below if you're interested, and if not, don't - and look up our previous answers over there (ditto).

https://www.bitsdujour.com/software/ultra-recall-2#comments165998

"Since HJB asked his current questions here in 2015 already, just worded otherwise, the question arises how he would have maintained / searched his obviously very extensive document corpus the 7.5 years in-between (and the question if a Windows program can = is entitled to access the Apple cloud, well...), and others may have some misconceptions, too, and since UR will not have been on bits today for the very last time, it could certainly help to explain the different paradigms, once and for all.

A) Search engines for specific file formats: X1 is rental-only, 79$ p.a. (their "Buy" button is for "buying" a subscription..., and people said it indexed = made available for search just the first 10 MB of every Pdf, but the problem may have been duly addressed finally (?)), Copernic was rental-only even before X1 joined them in this, but has now come up with a rent model from 20$ p.a. on which may be sufficient (!) for many users, or then they will perhaps need the 32.5$ p.a. model (for MS "365"), but obviously, Copernic have decided to make their rent models much more "accessible" now, so people who are "into" that paradigm, should first look into that program (they have a comparison table, in order to enable you to decide what you will need, and 20 / 32.50 p.a. seem to be very acceptable pricing). For individual use, dtSearch always is 249$ (incl. some minor updates I suppose), for the time being.

There are other competitors in that field; the aforementioned tools have in common to build a (= one) global index for all the folders / drives you want them to, BUT just for the file formats which are in their list, and if you have files in some other formats, they are off, i.e. left out, and you can do nothing about that - dtSearch don't even answer your kind enquiry for advice on that.
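To make the "no index, no search" point tangible, here is a minimal sketch of how such a global index behaves. It is NOT how X1, Copernic or dtSearch work internally; the whitelist, the file names and the function names are all made up for illustration.

```python
import re
from collections import defaultdict
from pathlib import PurePath

# Hypothetical whitelist; real tools ship their own (much longer) lists.
SUPPORTED = {".txt", ".md"}

def build_index(files):
    """files: mapping of path -> text. Returns (index, skipped)."""
    index = defaultdict(set)
    skipped = []
    for path, text in files.items():
        if PurePath(path).suffix.lower() not in SUPPORTED:
            skipped.append(path)  # format not in the list: never indexed
            continue
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(path)
    return index, skipped

def search(index, word):
    """Look the word up in the index only; skipped files cannot match."""
    return sorted(index.get(word.lower(), set()))

# Made-up sample corpus.
files = {
    "notes/a.txt": "Invoice for the printer",
    "notes/b.md": "printer manual",
    "data/c.xyz": "printer settings in an unlisted format",
}
index, skipped = build_index(files)
```

Searching for "printer" here only ever returns the two whitelisted files, although the third one contains the word, too.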

You will then have a search result list, and according to the tool you will have bought or rented, you either "get" to that position of the document in question, within its "native" app, or then not, and your click on the result will just open that document at the beginning, and you will then have to look up the "find" in there again, which may become very, very cumbersome over the months, so you should check the programs for that, before buying/renting, and they could even behave differently in that, for different file "formats". (For example, I have read in some forum that Lookeen, which had started as a dedicated Outlook search tool, 50/100/140$ when I last looked their prices up, did NOT go to where you would have wanted, in OL, by clicking on the "hit" in Lookeen's search results, but that observation might not be applicable anymore (?)...) - So you have to really trial, within your real work situations, before making the "investment".

Again: Any non-listed file "format" will NOT be indexed / searchable from within those programs, and that applies for example for database contents - I currently don't know ANY such program (but might have overlooked one that does it) which would be able to integrate even standard SQL databases into its index, and: no index, no search!

B) Non-indexing search tools, of which FileLocator (69$, incl. 1 year of updates) seems best (they also have a free version, with the same name, it's their trial after 14 (?) days; there also is FileSeek, much cheaper and regularly on bits, but much less robust from my experience (I use both programs)). Those tools just crawl any file / folder you tell them to crawl, for your current search, and that can take time, but if you know about the file format (by having opened such files with a binary editor, and having taken notes), and even enter the correct codes, you'll be able to find many things indeed you are unable to find with A) above; you also can buy such / quite similar tools for tenfold the price, with the term "forensic(s)" somewhere in their name...
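For the curious, the crawling idea can be sketched in a few lines. This is just my own rough illustration, not FileLocator's or FileSeek's actual code; the function name and the chunk size are assumptions. It reads every file under a folder in chunks and reports where a given byte sequence first occurs, whatever the file format.

```python
import os

def crawl_for_bytes(root, pattern, chunk_size=1 << 20):
    """Yield (path, offset) for the first occurrence of `pattern` in each file under root."""
    overlap = len(pattern) - 1
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    tail = b""  # end of the previous chunk, so hits across chunk borders are found
                    pos = 0     # absolute offset of the start of the current chunk
                    while True:
                        chunk = fh.read(chunk_size)
                        if not chunk:
                            break
                        buf = tail + chunk
                        hit = buf.find(pattern)
                        if hit != -1:
                            yield path, pos - len(tail) + hit
                            break
                        tail = buf[-overlap:] if overlap > 0 else b""
                        pos += len(chunk)
            except OSError:
                continue  # unreadable file: skip it, keep crawling
```

You would call it like `list(crawl_for_bytes(r"D:\data", b"\x50\x4b\x03\x04"))`, the folder being a made-up example; no index is built, so every search re-reads the files, which is exactly why it takes time.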

FileLocator Pro belongs to B) from its origin, but has added some indexing functionality; obviously though (judging from the respective entries in the help file), that (and especially the maintenance / renewals of those indices, several ones here) is not the strongest part of that program; here again, you have the problem of having to look up twice in case: first in FileLocator, to identify the file, then again a search, within the respective application, in order to finally "get" to the "hit".

With the concept of A) and B), you work the traditional way, mostly in MS Word, Excel, etc, and you search by search tool, to identify the document which holds the information in question. But you produce multiple documents, so this is typical office / administration use, and hopefully, you name, and/or file, your (own) documents in a way as to distinguish them from those documents which contain "external" content (mostly PDFs, in most cases).

In this paradigm (or work setup if you prefer), your search tool(s) will become your main work tool, aside from your text processor, your spreadsheet tool, and perhaps MS PowerPoint, and that's it, for most; if you work in a corporate environment, there will also be some database, for tabular data, and with its own search / filter routines.

My advice here: look into Copernic's 20/32.50$ rentals (being aware though that they could multiply those prices anytime), combine with FileLocator free - I do NOT think though that you could replace that setup with just buying FileLocator instead: technically, you could indeed, but read about FL's index management, and you will understand my reservation...

C) Text / content database, plus some files for export / for communication purposes with customers, clients, publishers...

Here, you do your main "text production" within = into your text database, of which the three "big" ones, Windows-wise, are more or less regularly on bits: UR, RightNote, MyInfo, MyBase... (check the respective robustness, and the respective cloning functionality... and you will probably agree you should have bought UR today instead...)

(And there also is TheBrain, at 219$, but any upgrades for the same price then (!, or then by subscription: obviously, they want to dissuade prospects from buying, instead of renting), with its weird "graphical" concept, which just needlessly complicates things, and which is pleasant to the eye in trial only = as long as you just have to maintain some elements (their so-called "free" version is utterly worthless: you will discover that the moment your trial reverts to "free", so you would be very badly advised to invest much time in that trial...)... and some dishonest "reviews" claim you "need" it (and implicitly: you'd need to pay its price) for transclusion = cloning, see my post above which proves the contrary...)

As said, your main text / output production will be within the text database, and at the same time, you will want to import / link as much as possible external information into that text database, in order to take advantage of your database's full text search; in UR, you either import, which means the file will be replicated within UR, and its full text will be indexed, or you just "link", where again the text will be indexed and thus becomes available for its search results table, but for looking up those "hits", you will then need to open the documents in question, within their "native" application.

Then, though, most users will not have HJB's 8 TBs (if I remember well) of pdf (or other external) data, since that would be some quite incredible amount of text (!) then; bear in mind that NO (dedicated or not) search tool can also index illustrations, photos, etc., except for their respective meta data in case, so in his case - if he's serious then -, LINKING instead of importing, and then indexing (the respective texts) could be a perfectly viable use case in the end.

And remember, if UR can't index the (pdf or other MS Word or similar) text (because it may be "secured" by its respective author), those dedicated search tools can't either, and OCR isn't that reliable after all but might be a "solution" for very important documents.

I personally think it's insane to misuse a text database as a cheap search tool, when you continue to do your "writing", i.e. your text production, in multiple text processor files; rent Copernic at 20$ p.a., pray they won't multiply your rent too soon, and continue to work as you obviously like to work.

On the other hand, I'm POSITIVE that 95 or more out of 100 prospective text database users will NOT run into HJB's (pretended?) problems; as said, he asked in 2015, and it's almost 8 years now, so how did he organize his a) writing / output, b) searching / external input in-between?

And yes, UR's full text search CAN compete, within reason, with dedicated search tools, so almost NO UR user will have to set up some hybrid system, maintaining indices within UR and within Copernic, but even that is perfectly possible in those rare cases where the need for that might arise.

For people who do their "output" within their text database (UR here), export is important, and as said, UR's multiple export possibilities are excellent, be it simple, quick rtf - or then, you could even use MS Word (and with its new format) "internally", i.e. as your default (!) UR text processor; and as also mentioned above, there are (multiple, and even user-adjustable varieties of) html and XML export formats, covering any possible "output" needs, incl. export to HAT (help authoring tools) or DTP / preprint (FrameMaker, Madcap Flare, Quark, InDesign): with a little help from real experts then, for the necessary adjustments, and in quite other pricing regions as far as the rentals of that software are concerned.

HJB's questions might just have been meant as a provocation, in view of his obvious need to have found at least SOME solution to his monstrous text collections for the last almost 8 years... but the answers he's got (and wasn't probably even interested in) should reassure even power users with lots of pdfs, etc, to link, and I personally think that most power users even could safely import (instead of just "linking") "everything".

Btw, I don't know how "easy" it would be to then, if really needed, replace the "import" by a "link", afterwards. That should be doable, since even with "import", the original files stay unharmed, of course, and if currently, that's not "as easy as it comes", the developer will certainly check, and amend that functionality, if a legit, honest, real UR user then asks for it.

And finally, yes, some people claim that "file system PLUS text database" constitutes a hybrid system anyway and should thus be avoided to begin with, but when you have, as I have, almost 400,000 records in that text database (= in UR in my case), perfectly ordered within, and by that perfect ordering such a text database offers - how could you imagine such perfect order being replicated into another 400,000 (mostly formatted text) files within my file system? Right: That would be unmanageable, all the more so since any ordering within (sic!) groups can't be done but in a very bad, weird, convoluted way in the file system: by an extra, sidecar file in every one of those sub folders of any indentation level:...

And then, would your (even very expensive, in case, and indeed, you can spend "thousands" every year, for such an app...) search tool respect, follow the order of your files, from those sidecar files?
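Just to show how convoluted that gets, here is a rough sketch of what a search tool (or your own script) would have to do in order to respect such sidecar files. Everything here is hypothetical: the sidecar name ".order", its one-name-per-line format, and the function names are my assumptions, not any product's convention.

```python
from pathlib import Path

SIDECAR = ".order"  # hypothetical sidecar name: one entry name per line, in the wanted order

def load_order(folder):
    """Return {name: position} from the folder's sidecar file, or {} if there is none."""
    sidecar = Path(folder) / SIDECAR
    if not sidecar.is_file():
        return {}
    names = sidecar.read_text(encoding="utf-8").splitlines()
    return {name: i for i, name in enumerate(names) if name}

def sort_key(root, path):
    """Position of every path component according to its parent folder's sidecar."""
    key = []
    folder = Path(root)
    for part in Path(path).relative_to(root).parts:
        order = load_order(folder)
        # Entries missing from the sidecar sort after the listed ones, by name.
        key.append((order.get(part, len(order)), part))
        folder = folder / part
    return key

def sort_hits(root, hits):
    """Sort search hits (paths below root) by the manual order stored in the sidecars."""
    return sorted(hits, key=lambda p: sort_key(root, p))
```

Note that every single folder level needs its own sidecar, that each one must be kept in sync by hand whenever you rename or move something, and that the sketch re-reads them for every hit.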

In a sophisticated (mainly) text data base (just to differentiate them from tabular data bases here) such as UR, all this is no problem whatsoever; the terms to know are just "lineage" (as first sort criterion) and then "tree order" (as second sort criterion, and just for example; multiple other ways are easily available and storable).
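For readers who want to see what such a two-level sort amounts to, here is a minimal sketch. It reflects my reading of the two terms (lineage = the chain of parent titles, tree order = the position an item has when you walk the tree top-down), and is certainly NOT UR's actual implementation; the Item fields and the sample data are invented.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Item:
    id: int
    title: str
    parent: Optional[int]  # None for a top-level item
    position: int          # manual position among its siblings

def lineage(items, item_id):
    """Titles of all ancestors of the item, topmost first."""
    titles = []
    parent = items[item_id].parent
    while parent is not None:
        titles.append(items[parent].title)
        parent = items[parent].parent
    return tuple(reversed(titles))

def tree_order(items):
    """Map item id -> position in a top-down (pre-order) walk of the tree."""
    children = {}
    for it in items.values():
        children.setdefault(it.parent, []).append(it)
    order = {}
    def walk(parent):
        for child in sorted(children.get(parent, []), key=lambda c: c.position):
            order[child.id] = len(order)
            walk(child.id)
    walk(None)
    return order

def sort_hits(items, hit_ids):
    """First criterion: lineage; second criterion: tree order."""
    order = tree_order(items)
    return sorted(hit_ids, key=lambda i: (lineage(items, i), order[i]))

# Invented sample tree: "Beta" was manually placed before "Alpha".
items = {it.id: it for it in [
    Item(1, "Projects", None, 0),
    Item(2, "Archive", None, 1),
    Item(3, "Beta", 1, 0),
    Item(4, "Alpha", 1, 1),
    Item(5, "Old", 2, 0),
]}
hits = sort_hits(items, [5, 4, 3])
```

With this sample data, the hit below "Archive" comes first (lineage), and within "Projects", "Beta" stays before "Alpha" because the manual tree order wins over the alphabet.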

And that's another, non-negligible advantage of UR (and other such data bases if they are as good, which remains to be proven first): You'll get all your search results in the order you need them... whilst your dedicated search tool, of almost any price, will very probably not be able to do so.

And, remember, dtSearch people don't even bother to answer some kind enquiry from some prospect willing to pay 249$ plus VAT if shown a way to overcome the above-mentioned limitation, whilst on the other hand, they do lots of blabla re "forensics" - how would they then do "forensics" if their tool isn't even able to search any non-standard file format? are they serious? -, so they will almost certainly not give a heck about the fine-grained sort order any really good data base (as UR here) is able to deliver...

And that's not even speaking of the "real access time" then: in UR, you will be "within" the "item" = record in a fraction of a second, the search term(s) being highlighted optionally, and thus, you can quickly browse multiple "hits"; "getting into it", from many a search tool, might take quite some seconds then...

There is no perfect software (or anything else perfect then) currently, but by at least knowing where your main requirements lie, you can make much better choices than by provocatively asking for "the impossible", while not having changed anything, in almost 8 years in-between, in your "current" "workflow", which obviously should have been revised almost 8 years ago.

According to your individual situation, some people might be correctly advised to stay with their current file-system-only based "workflow", others will be perfectly served by a hybrid system: text data base plus file system, plus, in some cases, even an additional search tool, and FileLocator free should indeed be part of any Windows user's toolbox for example, whatever they deploy otherwise. As should be (Voidtools') "Everything" (also free), but that would be another subject of discussion..."

22111 3/8/2023 5:28 pm

There's thread hijacking - you all know that -, and there's also deliberate "thread doubling" as you might call it: both schemes serve to invalidate arguments you otherwise (i.e. by counter-arguments) couldn't attack;

a blatant example of the second plot is to be found here, https://www.outlinersoftware.com/topics/viewt/10050/0/ultrarecall - so read there for further info on the matter, and for lots of new advertising...