Generative AI models aren’t trained while they’re in use. The explosion in public AI offerings comes down to three motives:
- Saving the company labor by replacing support staff
- Enticing users with features competitors lack (or catching up after competitors added them for the same reason)
- Exciting investors, because AI is the current hot thing
To make a good model you need two things:
- Clean data that is tagged in a way that allows you to grade model performance
- Lots of it
User data might meet the second need (volume), but it fails the first (clean, tagged data). Running random data through neural networks to make it more exploitable (more accurate interest extraction, etc.) makes sense, but training on that data doesn’t.
This is clearly demonstrated by Google’s search AI, which learned lots of useful info from Reddit but also learned absurd lies and gave them the same weight. Not just overtuned-for-confidence lies, but straight-up glue-the-cheese-on-your-pizza lies.
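To make the tagged-data point concrete, here is a minimal sketch of why grading needs labels. The `accuracy` function, the example labels, and the example texts are all made up for illustration. They are not anyone's actual pipeline.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the known-correct labels."""
    if not labels:
        raise ValueError("no labels: nothing to grade against")
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must be the same length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical tagged data: each example comes with a known-correct answer.
tagged_labels = ["spam", "ham", "spam", "ham"]
model_output = ["spam", "ham", "ham", "ham"]
score = accuracy(model_output, tagged_labels)  # 3 of 4 match

# Hypothetical raw user data: plenty of text, but no labels at all,
# so there is no way to tell a good prediction from a confident lie.
untagged_labels = []
try:
    accuracy(["spam"], untagged_labels)
    gradable = True
except ValueError:
    gradable = False
```

With tags you get a number you can improve against. Without them, the model's guess and the truth are indistinguishable, which is the problem with training on raw user data.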
They have two avenues to make money: