AI-infused
Posted by Amontillado
Oct 20, 2025 at 01:26 PM
Just curious: how would one concatenate all the Markdown files in Windows?
Linux and Mac are easy - in a terminal window, cd to the Obsidian vault and run find . -name \*.md -exec cat {} >> bulk.txt \;
I really need to learn more things like PowerShell. I’m stuck in a bash world.
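For Windows (or any OS), a minimal cross-platform sketch in Python does the same job as the bash one-liner above; the `bulk.txt` name just mirrors it, and the per-file path headers are an extra convenience, not part of the original command:

```python
from pathlib import Path

def concat_markdown(vault_dir, out_file="bulk.txt"):
    """Concatenate every .md file under vault_dir into one text file,
    prefixing each with its relative path so sources stay traceable."""
    vault = Path(vault_dir)
    parts = []
    for md in sorted(vault.rglob("*.md")):
        # Header line marking which file the following text came from
        parts.append(f"\n\n--- {md.relative_to(vault)} ---\n\n")
        parts.append(md.read_text(encoding="utf-8"))
    Path(out_file).write_text("".join(parts), encoding="utf-8")
```

In PowerShell, a rough equivalent of the bash one-liner is `Get-ChildItem -Recurse -Filter *.md | Get-Content | Set-Content bulk.txt`.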
Posted by satis
Oct 20, 2025 at 02:36 PM
Andy Brice wrote:
>Just this morning I searched “Capri” on Google. The AI overview told me
>it is a fictional island. But it’s very much a factual island.
Before I finished typing the 5-letter word, the Google suggestions bar offered “an island located in the Tyrrhenian…” with a tiny thumbnail from Wikipedia. When I finished, I got the Wikipedia page as the first link (with thumbnail), a list of “People also ask” questions like “Is Capri very expensive?”, a “Things to know” section with dropdowns for population, culture, pronunciation, history and location, and a sidebar on the right with a photo carousel, current weather, a travel pricing link, and a maps link.
Posted by satis
Oct 20, 2025 at 03:23 PM
Dr Andus wrote:
satis wrote:
>>Perplexity lets you select up to 10 files at once (up to 40MB each) to
>>perform cross-document queries and comparisons. It analyzes short files
>>entirely, while “longer” files are processed to extract “the most
>>relevant parts for your specific query,” meaning it sometimes loses
>>context or misses meaning in larger documents.
>
>
>I came across an article today which talked about the need to combine
>hundreds of Obsidian files into a single file, so that NotebookLM can
>read them:
>
>You mention that Perplexity also limits you to 10 files.
>
>It makes me wonder whether it’s better to combine thousands of files
>into a single file when importing them into Obsidian, in case I want to
>use AI at some point to analyse them.
Even with concatenation I do not think you will get the full analysis you want on those files. Definitely not for free, and probably not in 2025 as a consumer.
LLMs, used directly or through an Obsidian plugin, are constrained by context length (how much text the LLM can “see” at once). For big vaults, the plugin or LLM fetches and summarizes what it judges to be the most relevant segments for the model to use, and won’t fully evaluate all your text.
NotebookLM lets you upload up to 50 sources (300 on paid plans), but each file has a 500,000-word and 200MB cap. If you combine thousands of documents into one massive file, you risk formatting and organization problems. Its processing depends on its context window, so it can’t fluidly analyze everything in a giant aggregation: it sometimes covers only the chunk currently loaded, missing details outside its window and breaking up citations and logical divisions. Merging files can lead to wrong citations and unreliable answers for large-scale projects, large numbers of files, or large files.
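As a crude illustration of staying under that per-source word cap, a merged file could be split back into pieces that fit. This hypothetical helper splits on whitespace only, ignoring file boundaries and formatting, so splitting at file boundaries instead would better preserve citations:

```python
def split_by_words(text, max_words=500_000):
    """Split text into chunks of at most max_words words each,
    e.g. to keep each chunk under a per-source word cap."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]
```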
Perplexity Pro, which costs $200/year and has access to several premium AI models such as GPT-5 and Claude Sonnet 4.5, specifically states that short files are “usually” analyzed in their entirety, so Perplexity can directly check the whole content to generate answers. But, Perplexity tells me, “Long files or large documents are processed differently—Perplexity will extract and analyze the most important sections or highlights, focusing on the parts most relevant to your query. This means it does not always use the entire content for its response, especially if the file is extensive.”
One option for hobbyists, SOHO users, and academics has been installing (lesser) open-source, locally-run models like Llama or Gemma on a PC, but you’ll need a pretty beefy computer with gobs of RAM and GPU for that, and performance can vary radically with model and model size. Off-the-shelf prebuilt PCs ready for local AI processing run around $4,000, with RTX 4090 or 5090 GPUs and 64GB of RAM.
So, absent your own high-end PC running a locally managed LLM, concatenating files probably won’t give you a complete analysis of your files.
It might yield decent results, but they will be incomplete, and you won’t really be sure where the gaps are.
Posted by Andy Brice
Oct 21, 2025 at 09:24 AM
satis wrote:
>Before I finished typing the 5-letter word the Google suggestions bar
>offered “an island located in the Tyrrhenian…” with a tiny thumbnail
>from Wikipedia. When I finished I got the Wikipedia page as the first
>link (with thumbnail), a list of “People also ask” questions like “Is
>Capri very expensive?”, a “Things to know” section with dropdowns
>for population, culture, pronunciation, history and location, and a
>sidebar on the right with a photo carousel, current weather, travel
>pricing link, and a maps link.
I just typed “Capri” into the same app on the same device. This time it gave me a completely different answer (correct this time).
Posted by Dr Andus
Oct 21, 2025 at 05:32 PM
@satis, thank you very much for the thorough answer, it clarified a lot.
Well, I got myself an education discount and bought one of those high-end PCs (they are cheaper in the UK than the US, strangely, I guess because of the whole tariff situation)...
Maybe I’m a complete fool for still spending a lot of money on it, but my previous HP mobile workstation lasted me 10 years, so I’m hoping this one will also last for a while.
It comes with the Phi 3.5 LLM installed, but I’m only just beginning to play with it.
So you can put me down in that category of “hobbyists and SOHO and academics” :-)
satis wrote:
>One option for hobbyists, SOHO users, and academics has been installing
>(lesser) open-source, locally-run models like Llama or Gemma on a PC,
>but you’ll need a pretty beefy computer with gobs of RAM and GPU for
>that, and performance can vary radically with model and model size.
>Off-the-shelf prebuilt PCs ready for local AI processing run around
>$4,000, with RTX 4090 or 5090 GPUs and 64GB of RAM.
>
>So, absent your own high-end PC running a locally managed LLM,
>concatenating files probably won’t give you a complete analysis of your
>files.
>
>It might yield decent results, but they will be incomplete, and you
>won’t really be sure where the gaps are.