Postgres runs well in a container in my experience and is nice to work with, def support that. I know sqlite works well, but nice to have a system sometimes.
I’m always personally wary of storing blobs in a database, if for no other reason than that it’s going to be more expensive to keep them on a database server than in some sort of blob storage.
I compressed some of the 4K rips I did, plus all my DVDs and 1080p Blu-rays. HDR is the only thing that’s stopped me on the rest: I found I lost it with the settings I was using, so I put it on the “list of things I’ll come back to later” shelf.
I recall some banding on a few of the DVD rips, so I was probably a little too aggressive with the settings I used, but they’re still definitely watchable.
I have the LOTR director’s cut on my server and haven’t bothered reencoding it, because I’m not super experienced with keeping HDR10 intact when going to H.265 or equivalent. Return of the King alone is around 130 gigs across two files, and Jellyfin says its bitrate is about 70 Mbps.
Titanic is only about 74 gigs
That’s the boat I’m in. I swapped my laptop from Kubuntu to Debian, which is solid for me. The server has a lot set up on it that I could move, but for now Ubuntu Server works and I’m not really feeling the push to change.
I used a Le Potato on my last project in place of a Pi 3, but Libre Computer has Rockchip boards available as well. Price-wise it seemed decent, the documentation was decent enough for me and, more importantly, I could actually get one.
Could use Polars. AFAIK it supports streaming from CSVs too, and frankly, coming from Spark land, the syntax is so much nicer than pandas.
Do you need to persist? What are you doing with them? A really common pattern for analytics is landing those in something like Parquet or Delta (less frequently Avro or ORC) and then working right off that. If they don’t change, it’s an option. 100 gigs of CSVs will take some time to write to a database, depending on resources, tools and DB flavour. TBF, writing into a compressed format takes time too, but it saves you managing databases (unless you want to, just presenting some alternatives).
Could look at a document DB. Again, it will take time to ingest and index, but it’s definitely another tool. I’ve touched Elastic and stood up Mongo before, and Solr is around too. Solr is built on top of Lucene, which I knew Elastic was. Mongo itself isn’t, but apparently its Atlas Search feature is.
Edit: searchable? I’d look into a document DB, it’s quite literally what they’re meant for. All of those I mentioned are used for enterprise search.