In life, people will frequently tell you things that aren’t the whole truth, but you can figure out what’s actually going on by looking at the context of the situation. Telling people less than the whole truth is commonly referred to as “being deceptive” or sometimes just “lying”. Corporate PR and salespeople, the ones who put out this press release, do it regularly.
You don’t need to record content categories of searches to make a good tool for displaying websites; you need that in order to predict what users will search for. They’ve already said they want to focus on AI and linked to an example of the system they want to improve: their site recommender, complete with sponsored recommendations that could be sold for a higher price if the Mozilla AI could predict that “people in country X will soon be looking for vacations”.
Did the image get copied onto their servers without their having a legal right to copy it? Then they violated copyright. Whatever they do after that isn’t the copyright violation.
And this is obvious because they could easily assemble a dataset with no copyright issues. They could also attempt to get permission from the copyright holders for many other images, but that would be hard and/or costly, and some would refuse. They want to use the extra images but don’t want to get permission, so they just take them, just like anyone else who wants an image but doesn’t want to pay for it.