RFC - New Software Project: Infosqueezer

Posted by Lothar Scholz
Sep 11, 2019 at 11:15 AM

Hello,

I want to announce that in October I will start working full-time for two years (now fully financed) on my Information/Knowledge Application. I have spent my spare time over the last years implementing some prototypes, and I have made enough mistakes to now be wise enough to create something great.

This Request For Comments is for you to provide me with your ideas, critiques and suggestions, as I’m going to implement a few totally new concepts and combine them with well-known stuff.

It will be a 1-, 2- and even 3-pane outliner for information, a PDF+HTML reader with qualitative text analysis capability, a web clipper to provide this HTML, and some kind of file management tool.

Development will happen on macOS and Linux first. I will keep an eye on making it portable to Android and iOS, but this will come later. The same goes for Windows, because Microsoft GUI development currently totally sucks, and until they have decided on their direction I am postponing all work on that platform.

*** Cards and Panes

Information is stored on little index cards which contain markdown text and optionally one represented file (its purpose is described later).
A card can link to other files but can represent at most one file. Wiki-style linking among cards is possible.

There are two kinds of cards: free-floating „Knowledge“ cards and cards that are the nodes of an outliner. Free-floating cards can be displayed in multiple outliners and in multiple locations in one outliner, or they can live outside of any outliner organization and only be revealed by search and browsing features.

The left pane is a full-featured outliner with columns, like OmniOutliner. It allows clones and mark+gather operations and all the features discussed and found useful on this board in the past. Each node in the outliner can also reference a free-floating card, which is shown in the 2nd pane.

Compared to other 2-pane outliners, this concept makes the outline a full data document and not just the table of contents of a collection of cards. This is a unique feature. The reason is easy to understand. For example, if you write a book, you can use the outline pane to organize your chapters and add comments about progress and todos, while the cards contain the content of your book. This decoupling is important because an organized set of content items is much more than the sum of its content, and often information only makes sense because of the combination of the items. No outliner so far could model this.

I just want to mention that there will be no difference between folders and items in the outline. That distinction was a concept taken from the technical implementation of file systems, and I don’t know why so many outliners just copied it.
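To make the card/outline split a bit more concrete, here is a rough Python sketch. `Card` and `OutlineNode` are names made up for illustration only, and nothing here is decided for the real implementation. It shows a node that carries its own text, may reference one free-floating card, and can always have children (so there is no separate folder type).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Card:
    """A free-floating knowledge card (hypothetical model)."""
    title: str
    markdown: str = ""


@dataclass
class OutlineNode:
    """An outline node: has its own text and may reference one card."""
    text: str                                   # the node's own notes and todos
    card: Optional[Card] = None                 # shown in the 2nd pane when set
    children: List["OutlineNode"] = field(default_factory=list)

    def add(self, child: "OutlineNode") -> "OutlineNode":
        # Every node can get children, so "folder" and "item" are the same thing.
        self.children.append(child)
        return child


chapter_card = Card("Chapter 1", "It was a dark and stormy night...")
book = OutlineNode("My book")
ch1 = book.add(OutlineNode("Chapter 1, needs a better opening", card=chapter_card))
# The same free-floating card can show up at a second place in the outline.
epilogue = book.add(OutlineNode("Epilogue, reuse the opening?", card=chapter_card))
```

The outline text ("needs a better opening") lives on the node, while the book content lives on the card, which is the decoupling described above.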

The third pane shows cards that are related to the current card shown in the 2nd pane.
These can be the destination/source cards of links on the current card, or cards that some AI algorithm considers similar.

Also, each outline node can contain an explicit list of related cards. In the book writing example you would use this list to attach cards to a node with knowledge that you want to mention in the text you write in the 2nd pane. This is a feature I’ve seen in the IBM DOORS requirements engineering tool, where you keep and reference the source material, like legal requirements or technical specs. Sure, you could just add them as intrinsic links on the outline node, but having a separate list of documents is IMHO cleaner.

The 2nd and 3rd pane will show multiple cards at once.

*** Data Fields and Sections

As a huge fan of the abandoned Asksam and its ability to add data fields directly inside the document wherever you want, I have to add this feature.
In the current design it has turned out to be the key element.

The design rule differs from a normal database in that the user and the act of entering information have priority over the correctness of the database. If you have a year field in an art database, you don’t need to enter a 4-digit year; you can also note it as „painted during his orange phase living in Washington“. This will be stored, but underlined with an error line. In the end it’s better to write it down than to forget it.
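A minimal sketch of this "store first, flag later" rule in Python. `FieldValue` and `parse_year` are hypothetical names, and the 4-digit check is just the example from above, not a finished validation design.

```python
import re
from dataclasses import dataclass


@dataclass
class FieldValue:
    raw: str      # always stored exactly as the user typed it
    valid: bool   # False means "underline it", never "reject it"


def parse_year(raw: str) -> FieldValue:
    """Accept any text; only flag it when it is not a 4-digit year."""
    is_year = re.fullmatch(r"\d{4}", raw.strip()) is not None
    return FieldValue(raw=raw, valid=is_year)


ok = parse_year("1977")
fuzzy = parse_year("painted during his orange phase living in Washington")
```

Both values end up in the database; only `fuzzy` would get the error underline.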

Fields can be added once or multiple times. The content of a field can contain multiple data values, such as a comma-separated list or a ZIP code/city/country combination that is automatically broken into data pieces by a pattern matcher.

For example, if you write the following on a card in your movie database, it will create a „Movie“ record (declared by the @@ line) with the following fields:

@@Movie:
@Title: Star Wars: Episode IV - A New Hope
@Director | Writer: George Lucas
@Year: 1977
@Rating: 10
@Genre: SciFi
@Actor: Mark Hamill
@Actor: Harrison Ford, Carrie Fisher, Alec Guinness
@Synopsis:
*Luke Skywalker* joins forces with a [[Jedi Knight|Jedi]], a cocky pilot,
a Wookiee and two droids to save the galaxy from the Empire’s world-destroying battle station,
while also attempting to rescue Princess Leia from the mysterious Darth Vader.

As you can see, it’s possible to write in a condensed form, using the @Fieldname | Fieldname: syntax or a comma-separated list.
There are tons more details and fine tunings, but this explanation should be enough to understand the idea. A field can contain any markdown text, including images and links. In fact, fields just split the text into different sections.
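Here is a rough Python sketch of how such a record could be read. It is an illustration only: `parse_record` and `LIST_FIELDS` are made-up names, and the assumption that only some declared fields are split at commas is mine (otherwise the commas in a synopsis would be split too).

```python
from typing import Dict, List, Tuple

# Assumption: which fields hold comma-separated lists is declared somewhere.
# This set is purely illustrative.
LIST_FIELDS = {"Actor", "Genre"}


def parse_record(text: str) -> Tuple[str, Dict[str, List[str]]]:
    """Parse one @@Record block into (record_type, {field: [values]})."""
    record_type = ""
    fields: Dict[str, List[str]] = {}
    current: List[str] = []   # names of the field(s) currently being filled

    for line in text.splitlines():
        if line.startswith("@@"):
            record_type = line[2:].rstrip(":").strip()
            current = []
        elif line.startswith("@"):
            # Split at the first colon only, so "Star Wars: Episode IV" survives.
            names, _, value = line[1:].partition(":")
            current = [n.strip() for n in names.split("|")]
            value = value.strip()
            for name in current:
                values = fields.setdefault(name, [])
                if not value:
                    continue
                if name in LIST_FIELDS:
                    values.extend(v.strip() for v in value.split(","))
                else:
                    values.append(value)
        elif current:
            # Continuation line: markdown text belonging to the latest field.
            for name in current:
                values = fields[name]
                if values:
                    values[-1] += "\n" + line
                else:
                    values.append(line)
    return record_type, fields


SAMPLE = """\
@@Movie:
@Title: Star Wars: Episode IV - A New Hope
@Director | Writer: George Lucas
@Actor: Mark Hamill
@Actor: Harrison Ford, Carrie Fisher
@Synopsis:
*Luke Skywalker* joins forces with a [[Jedi Knight|Jedi]], a cocky pilot,
a Wookiee and two droids to save the galaxy.
"""

record_type, movie = parse_record(SAMPLE)
```

Repeated `@Actor` lines and the comma list end up in the same list of values, and `@Director | Writer` fills both fields with the same value.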

Now back to the outline. You can add children to a node automatically. For example, if you have your curated outline of movies and a node „Best SciFi“, then you can add „@@Movies@Rating >= 9 AND @@Movies@Genre == SciFi SORT BY @Rating“ and the outline will automatically fill the child nodes with the top movies.
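In plain Python terms, the „Best SciFi“ query does roughly the following. This is only a sketch over a list of dicts, not the planned query engine; the placeholder titles and ratings are invented test data, and sorting best-first is my assumption since the query does not state a direction.

```python
from typing import Dict, List

movies: List[Dict] = [
    {"Title": "Star Wars: Episode IV", "Genre": "SciFi", "Year": 1977, "Rating": 10},
    {"Title": "Placeholder SciFi B", "Genre": "SciFi", "Year": 1982, "Rating": 9},
    {"Title": "Placeholder SciFi C", "Genre": "SciFi", "Year": 1990, "Rating": 6},
    {"Title": "Placeholder Drama D", "Genre": "Drama", "Year": 1994, "Rating": 10},
]


def best_scifi(records: List[Dict]) -> List[str]:
    """Rating >= 9 AND Genre == SciFi, sorted by Rating (highest first)."""
    hits = [r for r in records if r["Rating"] >= 9 and r["Genre"] == "SciFi"]
    hits.sort(key=lambda r: r["Rating"], reverse=True)
    return [r["Title"] for r in hits]


children = best_scifi(movies)   # titles that would become child nodes
```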

„@@Movies@Rating GROUP BY @Genre, @Year“ would fill the outline three levels deep and create items like
Movies
- SciFi
  - 1977
    - Star Wars: Episode IV

If you know databases, this is how the SQL GROUP BY clause works for selecting data. The children will be automatically inserted into the outline when you expand the item containing this query (and then stay this way until collapsed or explicitly refreshed).
I have so far never seen or heard of any outliner that automatically generates the data shown as children.
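The grouping into nested outline levels can be sketched in a few lines of Python. `group_outline` is a hypothetical helper and the placeholder movie is invented test data; the point is only to show one outline level per grouping key.

```python
from typing import Dict, List, Union

Outline = Union[Dict, List[str]]


def group_outline(records: List[Dict], keys: List[str]) -> Outline:
    """Nest records under one outline level per grouping key."""
    if not keys:
        # Innermost level: the actual items.
        return [r["Title"] for r in records]
    key, rest = keys[0], keys[1:]
    groups: Dict = {}
    for r in records:
        groups.setdefault(r[key], []).append(r)
    return {value: group_outline(members, rest) for value, members in groups.items()}


movies = [
    {"Title": "Star Wars: Episode IV", "Genre": "SciFi", "Year": 1977},
    {"Title": "Placeholder SciFi B", "Genre": "SciFi", "Year": 1982},
]

tree = group_outline(movies, ["Genre", "Year"])
```

Here `tree["SciFi"][1977]` holds the Star Wars item, which corresponds to the Genre, Year and item levels under the „Movies“ node.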

If you add nodes automatically, you can’t add any individual text or attach cards to them (there might be ways, but I’m not thinking about this at the moment). But you could add a description of the database search to the top-level „Movies“ node, so you know what happens when you expand the node.

*** Outliner and Columns

The data fields can be used to fill columns in the outline. These can be fields taken from the referenced free-floating knowledge cards or from the outline node card. In the book example you could add a Progress column and add a „@Progress: mostly done“ line to your outline node, and it will show the field value as the column/row value.
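As a small illustration, this is roughly how a row of column values could be pulled out of a node’s text. `column_values` and the „Due“ column are made up for the example; the sketch only handles simple one-line fields.

```python
import re
from typing import List

# One "@Name: value" field per line; the value stays on the same line.
FIELD_LINE = re.compile(r"^@(\w+):[ \t]*(.*)$", re.MULTILINE)


def column_values(node_text: str, columns: List[str]) -> List[str]:
    """Return one cell per column; empty string when the field is missing."""
    found = dict(FIELD_LINE.findall(node_text))
    return [found.get(name, "") for name in columns]


row = column_values("Chapter 3\n@Progress: mostly done", ["Progress", "Due"])
```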

Each pane will have a slider at the bottom to control how many lines of the markup are shown. So you can compact the outline to one line of text per node and then use the columns to get an easy status overview.

*** Tags

Outlines and Tags are complementary ways to organize notes.

Infosqueezer will support tags in an autogenerated tag tree view (I saw it in a PhD thesis some time ago but can’t find the reference anymore; AFAIK it has never been implemented in a real product so far). This is not the almost useless idea that Bear is doing, but something really smart that makes it possible to provide very good browsing through a document collection. Studies have shown that browsing and scrolling are still by far more popular for finding items than direct search.

Say you add multiple hashtags to a card, for example #Politics #Trump #BrExit. This will generate a tag tree/outline with the following nodes:

- Politics
  - Trump
    - BrExit
  - BrExit
    - Trump
- Trump
  - Politics
    - BrExit
  - BrExit
    - Politics
- BrExit
  - Trump
    - Politics
  - Politics
    - Trump

This means all permutations of tags are created in the tag tree. Each level means that cards must have at least the tags specified by the current node and all nodes above.
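A rough Python sketch of how such a tree could be generated. `build_tag_tree` and `cards_at` are hypothetical names, and building the whole tree eagerly is just the simplest way to show the idea (a card with n tags contributes n! full paths, so a real implementation would probably build levels on demand).

```python
from itertools import permutations
from typing import Dict, List, Set


def build_tag_tree(cards: Dict[str, Set[str]]) -> Dict:
    """cards maps card name -> set of tags. Returns nested dicts: tag -> subtree."""
    tree: Dict = {}
    for tags in cards.values():
        for perm in permutations(sorted(tags)):
            node = tree
            for tag in perm:
                node = node.setdefault(tag, {})
    return tree


def cards_at(cards: Dict[str, Set[str]], path: List[str]) -> List[str]:
    """Cards that carry at least every tag on the path from the root."""
    wanted = set(path)
    return sorted(name for name, tags in cards.items() if wanted <= tags)


cards = {
    "card1": {"Politics", "Trump", "BrExit"},
    "card2": {"Politics", "BrExit"},
}
tree = build_tag_tree(cards)
hits = cards_at(cards, ["Politics", "BrExit"])   # both cards match
```

Selecting the node Politics > BrExit lists every card with at least those two tags, in whatever order you happened to browse down to it.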

You can filter the tags used to build the tree based on fields and search queries. For example, you can base the tag tree on every tag found inside the @@Movies@Synopsis field. This opens endless opportunities to fine-tune your database.

*** Qualitative Text Analysis

This is another unique aspect I have never seen anywhere, but it will fit nicely and easily into the overall structure of Infosqueezer.

You can add PDF files or captured HTML webpages to the database. Each file gets its own free-floating card and is represented by this card.

This allows you to easily add metadata like bibliography fields to the files. If you use the normal PDF annotation feature, the marked text and your own added text are added automatically to the card as fields inside an @@Annotation record. This will use the block quote and cite source syntax from the MultiMarkdown specification.

You can add hashtags to each annotation and use the tag tree and all the other powerful methods above.

*** Multiple Pages for Card

Each card has multiple pages. Currently I think about: Foreground, Background, References and Annotations.

The use case for a foreground and background page can be easily seen in Wikipedia where you have the knowledge page and the discussion page for each topic. Or in IMDB where the main page contains an overview (with a selection of the cast members) and the background page can contain the full very long data list.

It makes sense to add another References page for footnotes (which become endnotes), and for PDF-based cards the annotations go on the Annotations page.

In Markdown it’s easy. We have the horizontal rule syntax already (three dashes). A page break will be specified by three tilde characters, optionally followed by the page name, like this:

~ ~ ~
This is main page

~ ~ ~ Discussion
Let’s talk about it

~ ~ ~ References
[^bible]: Isaiah 66:11 That ye may suck and be satisfied with the breasts of her consolations
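Splitting a card into its pages could look roughly like this in Python. `split_pages` is a made-up name, calling the unnamed first page „Main“ is my assumption, and the pattern accepts the tildes with or without spaces only for illustration (note that `~~~` without spaces is also a code-fence marker in CommonMark, so the spaced form is less ambiguous).

```python
import re
from typing import Dict, List

# "~ ~ ~" optionally followed by a page name.
PAGE_BREAK = re.compile(r"^~\s?~\s?~\s*(.*)$")


def split_pages(markdown: str, default: str = "Main") -> Dict[str, str]:
    """Split card text into {page_name: page_text}."""
    pages: Dict[str, List[str]] = {}
    current = default
    for line in markdown.splitlines():
        match = PAGE_BREAK.match(line)
        if match:
            # A break without a name starts (or continues) the default page.
            current = match.group(1).strip() or default
            pages.setdefault(current, [])
        else:
            pages.setdefault(current, []).append(line)
    return {name: "\n".join(lines).strip() for name, lines in pages.items()}


text = "~ ~ ~\nThis is main page\n\n~ ~ ~ Discussion\nLet's talk about it\n"
pages = split_pages(text)
```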

I think this is enough for the first presentation pitch.