• 0 Posts
  • 21 Comments
Joined 1 year ago
Cake day: July 28th, 2023

  • Sorry, I misinterpreted what you meant. You said “any AI models” so I thought you meant the model itself should somehow know where the data came from. Obviously the companies training the models can catalog their data sources.

    But besides that, if you work on AI you should know better than anyone that removing training data is counter to the goal of fixing overfitting. You need more data to make the model more generalized. All you’d be doing is making it more likely to reproduce existing material because it has less to work off of. That’s worse for everyone.


  • What you’re asking for is literally impossible.

    A neural network is basically nothing more than a set of weights. If one word makes a weight go up by 0.0001 and then another word makes it go down by 0.0001, and you do that billions of times for billions of weights, how do you determine what in the data created those weights? Every single thing that’s in the training data had some kind of effect on everything else.

    It’s like combining billions of buckets of water together in a pool and then taking out 1 cup from that and trying to figure out which buckets contributed to that cup. It doesn’t make any sense.
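
To make the weight argument concrete, here is a deliberately oversimplified sketch with a single weight. The function name `final_weight` and both toy datasets are made up for illustration, and real training is far messier than adding up nudges, but it shows why the end result alone can't tell you what went in:

```python
def final_weight(nudges, step=0.0001):
    """Apply each signed nudge in turn and return the resulting weight.

    Each entry in `nudges` stands for one "word" pushing the weight
    up (+1) or down (-1) by one step. Hypothetical toy model only.
    """
    total = 0  # count whole steps so float rounding doesn't blur the comparison
    for n in nudges:
        total += n
    return total * step


# Two completely different "training sets"...
dataset_a = [+1, -1, +1, +1, -1]  # five words
dataset_b = [+1]                  # one word

# ...end up at exactly the same weight.
same = final_weight(dataset_a) == final_weight(dataset_b)
print(same)  # True
```

Given only the final weight, there is no way to say whether it came from `dataset_a` or `dataset_b`. Scale that up to billions of weights and billions of nudges and you get the pool of water.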






  • Just buy them on eBay. Why does it matter where they come from? Again, four of them have to die before it’s no longer worth it. It’s extremely unlikely you’d be that unlucky.

    Personally, I have 15 drives in my NAS. All of them were bought used, and they’ve been running 24/7 for 4+ years without issue. Originally I expected to lose at least one per year, but they just keep chugging along. All of them have at least 40k power-on hours, with the oldest 3TB ones having over 80k (9+ years).

    I use unRAID so if/when one does die it’s as simple as pulling out the dead one, popping in a new one, and letting it rebuild itself.
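
For anyone who wants to sanity-check the power-on hours against years, a quick conversion looks like this. It assumes the drive runs 24/7 and ignores leap days; the helper name `power_on_years` is just for illustration:

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap days


def power_on_years(hours):
    """Convert SMART power-on hours to years of continuous running."""
    return hours / HOURS_PER_YEAR


print(round(power_on_years(40_000), 1))  # 4.6
print(round(power_on_years(80_000), 1))  # 9.1
```

So 40k hours is a bit over four and a half years of nonstop use, and 80k is a bit over nine.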










  • I wanted to try Immich, but I quickly found out you can’t simply point it at an existing folder structure the way you can with, say, Plex or Jellyfin. You have to “import” all your files via a client, and if you’re like me and already have thousands of images in Nextcloud, then even with their bulk upload CLI tool it is too much of a hassle.

    Plus, I don’t want to be locked into their format. I want to be able to switch if the project goes under or I find something better later on. Nextcloud’s photo management is not great, but I am willing to sacrifice some speed and usability to keep using raw folders rather than a database.
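
To show what I mean by “raw folders”, here is a rough sketch of indexing photos in place, without importing or moving anything. This is not how Immich, Nextcloud, Plex, or Jellyfin actually work internally; the function `index_photos` and the extension list are my own assumptions for illustration:

```python
from pathlib import Path

# Assumed set of extensions; adjust for whatever your library contains.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".heic", ".gif"}


def index_photos(root):
    """Return sorted relative paths of image files under root.

    Files are only read from where they already live; nothing is
    copied, renamed, or written to a database.
    """
    root = Path(root)
    found = []
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            found.append(path.relative_to(root).as_posix())
    return sorted(found)
```

Because the folder tree stays the source of truth, any other tool can be pointed at the same directory later and build its own index from scratch.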