Here's an idea... Develop a bot to trawl NSFW sites and hash each image (combine...

Fomite · on Oct 16, 2013

"This multi-TB disk array labeled 'Porn' has a legitimate business use!"

pa5tabear · on Oct 16, 2013

How do you program that sort of thing?

Do you have to tell it what shapes/colors to look for? Or do you combine overall image similarity with localized image similarity and a portion-by-portion image comparison?

viraptor · on Oct 16, 2013

Maybe recognising the furniture in the background would work too ;) I remember there was a website/catalog of IKEA furniture somewhere made using NSFW photos.

_mulder_ · on Oct 16, 2013

Well, image hashing is distinct from normal MD5 hashing: the hash takes similarity of colour, etc. into account, so a match isn't a purely binary yes/no. A Google search turned up a library called pHash (pHash.org) that might do something similar.
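To make the MD5 contrast concrete, here is a minimal sketch of an "average hash", one of the simplest perceptual hashes. This is not how pHash itself works; the function names and the tiny 4x4 grayscale grid are made up for illustration, and a real implementation would first shrink and grayscale an actual image.

```python
import hashlib

def average_hash(pixels):
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming(a, b):
    """Number of bit positions where two hashes differ."""
    return sum(x != y for x, y in zip(a, b))

def md5_of(pixels):
    """Ordinary cryptographic hash of the raw pixel bytes."""
    return hashlib.md5(bytes(p for row in pixels for p in row)).hexdigest()

# Hypothetical 4x4 grayscale image: dark left half, bright right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]
# The same image with every pixel made slightly brighter.
brighter = [[min(255, p + 10) for p in row] for row in original]

ahash_distance = hamming(average_hash(original), average_hash(brighter))
md5_same = md5_of(original) == md5_of(brighter)

print(ahash_distance)  # 0: the perceptual hashes agree
print(md5_same)        # False: the MD5 digests do not
```

The perceptual hash gives you a distance you can threshold, whereas MD5 only tells you whether the bytes are identical.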

lifeformed · on Oct 16, 2013

Is it possible to hash an image so that you can partially match it with subsets of that image (like cropped regions or resizes)? Or with a slight modification of that image (colors shifted, image flipped, etc.)?

russellsprouts · on Oct 16, 2013

I believe so. TinEye uses something similar, because it detects those matches.

gwern · on Oct 17, 2013

Yep. Google "perceptual hash functions".
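As a rough illustration of the "image flipped" case only, one option is to hash the mirrored query as well and keep the smaller distance. This is a sketch using a simple difference hash on a made-up grayscale grid, with hypothetical function names; it is not a claim about what TinEye does, and it does not handle crops.

```python
def dhash(pixels):
    """One bit per horizontal neighbour pair: 1 if left is brighter than right."""
    return [1 if row[i] > row[i + 1] else 0
            for row in pixels for i in range(len(row) - 1)]

def hamming(a, b):
    """Number of bit positions where two hashes differ."""
    return sum(x != y for x, y in zip(a, b))

def mirror(pixels):
    """Flip the image left to right."""
    return [row[::-1] for row in pixels]

def flip_tolerant_distance(a, b):
    """Compare b against both a and the mirror image of a; keep the best match."""
    hb = dhash(b)
    return min(hamming(dhash(a), hb), hamming(dhash(mirror(a)), hb))

# Hypothetical 3x4 grayscale image with a left-to-right gradient.
image = [
    [10, 50, 90, 130],
    [20, 60, 100, 140],
    [30, 70, 110, 150],
]
flipped = mirror(image)

plain = hamming(dhash(image), dhash(flipped))
tolerant = flip_tolerant_distance(image, flipped)

print(plain)     # 9: every bit differs when the flip is ignored
print(tolerant)  # 0: the mirrored variant matches exactly
```

The cost is one extra hash per variant you want to tolerate, so this approach gets expensive if you also try rotations or many crop windows.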

krapp · on Oct 17, 2013

Shameless self-promotion: I wrote a perceptual hasher for PHP: https://github.com/kennethrapp/phasher

FWIW the biggest problem with this is false positives, though admittedly I may just not be clever enough to do it with enough finesse.
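The false-positive problem usually shows up in the choice of distance threshold. Here is a small sketch of that trade-off; the 16-bit hash values, the candidate names, and the thresholds are all invented for illustration and have nothing to do with phasher's actual API.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")

def matches(query, candidates, threshold):
    """Names of candidates whose hash is within `threshold` bits of the query."""
    return [name for name, h in candidates.items()
            if hamming(query, h) <= threshold]

# Made-up 16-bit hashes; real perceptual hashes are typically 64 bits or more.
query = 0b1111000011110000
candidates = {
    "resized_copy": 0b1111000011110001,  # 1 bit away from the query
    "unrelated":    0b1111001111000000,  # 4 bits away from the query
}

strict = matches(query, candidates, threshold=2)
loose = matches(query, candidates, threshold=5)

print(strict)  # ['resized_copy']
print(loose)   # ['resized_copy', 'unrelated']
```

A tight threshold misses heavier edits, while a loose one starts pulling in unrelated images, so the threshold generally has to be tuned against a sample of known matches and non-matches.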