Okay, I read the articles. It can actually read text, meaning text that has been placed on a picture.
I am not sure this Rossetta.ai is what he is using, but I can see what can be done with it.
So I got to thinking: the text has to be a part of the picture. It cannot be added text.
In Photoshop you can create clipping masks. Take a picture, clip it to a text layer, and the wording gets cut out of the picture itself. Flatten the layers after that. It really doesn't take long to do, a minute tops, but you have to have Photoshop to do it.
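
For what it's worth, the same clip-and-flatten idea can probably be sketched outside Photoshop too. Below is a rough Python version using the Pillow library. To be clear, this is my own assumption of an equivalent, not something from the articles: the function name `clip_text_from_picture`, the sizes, and the solid-color stand-in images are all made up for illustration, and I have no idea whether it would fool the text reader.

```python
from PIL import Image, ImageDraw, ImageFont


def clip_text_from_picture(background, texture, text, position=(10, 10), font=None):
    """Hypothetical helper: draw `text` on `background`, filled with pixels from `texture`."""
    font = font or ImageFont.load_default()

    # The mask stands in for the text layer: bright where the letters are, black elsewhere.
    mask = Image.new("L", background.size, 0)
    ImageDraw.Draw(mask).text(position, text, fill=255, font=font)

    # Stretch the fill picture so it covers the whole canvas.
    fill = texture.convert("RGB").resize(background.size)

    # composite() takes the fill picture where the mask is bright and the
    # background elsewhere. The result is one flat image with no separate text layer.
    return Image.composite(fill, background.convert("RGB"), mask)


# Stand-in images; in practice these would be real photos opened with Image.open().
background = Image.new("RGB", (200, 60), "white")
texture = Image.new("RGB", (50, 50), "red")
result = clip_text_from_picture(background, texture, "Hello")
```

Calling `result.save("clipped_text.png")` would write out the flattened picture. The lettering is made of pixels from the second picture, which is the part that matches the clipping mask trick.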
Coupling that with the metadata, I think we have a workaround.
This is like the CAPTCHAs that are used.