Develop a bot to trawl NSFW sites and hash each image (combined with the 'skin detecting' algorithms detailed previously). Then compare the user uploaded image hash with those in the NSFW database.
This technique relies on the assumption that NSFW images spammed onto social media sites will be images that already exist on NSFW sites (or are very similar to them). Then it simply becomes a case of pattern recognition, much like SoundHound for audio, or Google Image search.
It wouldn't reliably detect 'original' NSFW material, but given enough cock shots as source material, it could probably find a common pattern over time.
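To make the lookup step concrete, here is a minimal sketch of comparing an upload's hash against the crawled database. It assumes each image has already been reduced to a 64-bit integer hash where similar images differ in only a few bits; the names `hamming_distance`, `find_nearest`, the sample hash values and the `max_distance` threshold are all made up for illustration.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two integer hashes differ."""
    return bin(a ^ b).count("1")


def find_nearest(upload_hash, known_hashes, max_distance=10):
    """Return (hash, distance) for the closest known hash, or None if none is close enough."""
    best = None
    for known in known_hashes:
        d = hamming_distance(upload_hash, known)
        if d <= max_distance and (best is None or d < best[1]):
            best = (known, d)
    return best


# Hypothetical 64-bit hashes collected by the crawler.
known_hashes = [0xFFFF0000FFFF0000, 0x0F0F0F0F0F0F0F0F, 0x123456789ABCDEF0]

# An upload whose hash differs from the first entry in two bits.
upload = 0xFFFF0000FFFF0003
match = find_nearest(upload, known_hashes)
```

A linear scan like this is fine for a small set; a large database would probably want some kind of index, and the threshold would need tuning against real data.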
edit: I've just noticed rfusca in the OP suggests a similar method
Do you have to tell it what shapes/colors to look for? Or do you use a combination of overall image similarity, localized image similarity, and portion-by-portion image comparison?
Maybe recognising the furniture in the background would work too ;) I remember there was a website/catalog of IKEA furniture somewhere made using NSFW photos.
Well, image hashing is distinct from normal MD5 hashing in that the hash does consider similarity of colour, etc., so the comparison isn't purely binary.
A Google search produced a library called pHash (pHash.org) that might do something similar.
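As a rough illustration of that difference, the sketch below uses a simple "average hash" (one bit per pixel, set when the pixel is brighter than the mean). This is just one well-known perceptual hashing scheme picked for the example, not necessarily what any particular library does, and the toy 8x8 grids stand in for real images that would first be converted to grayscale and shrunk.

```python
import hashlib


def average_hash(pixels):
    """Average hash of a small grayscale grid (list of rows of 0-255 ints).

    Each bit is 1 where the pixel is brighter than the grid's mean.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def md5_of(pixels):
    """MD5 of the raw pixel bytes, for contrast."""
    return hashlib.md5(bytes(p for row in pixels for p in row)).hexdigest()


# A toy 8x8 "image": left half dark, right half bright.
original = [[40] * 4 + [200] * 4 for _ in range(8)]

# The same image, slightly brightened everywhere (e.g. a small colour shift).
tweaked = [[min(255, p + 5) for p in row] for row in original]

same_perceptual = average_hash(original) == average_hash(tweaked)  # hashes agree
same_md5 = md5_of(original) == md5_of(tweaked)                     # digests differ
```

The MD5 digest changes completely when any byte changes, while the average hash only changes when a pixel crosses the mean brightness.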
Is it possible to hash an image so that you can partially match it with subsets of that image (like cropped regions or resizes)? Or with a slight modification of that image (colors shifted, image flipped, etc.)?
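For resizes and flips at least, a sketch like the following suggests it can work: shrink every image to the same small grid before hashing, and also compare against the hash of the mirrored image. The helper names (`shrink`, `image_hash`, `mirror`, `matches`) are hypothetical, and the block averaging assumes the image dimensions are multiples of 8. Cropping is not handled here; that would likely need hashes of sub-regions rather than one hash for the whole image.

```python
def shrink(pixels, size=8):
    """Downscale a grayscale grid to size x size by averaging equal blocks.

    Assumes width and height are multiples of `size` (a simplification).
    """
    h, w = len(pixels), len(pixels[0])
    bh, bw = h // size, w // size
    return [
        [
            sum(pixels[y][x]
                for y in range(r * bh, (r + 1) * bh)
                for x in range(c * bw, (c + 1) * bw)) / (bh * bw)
            for c in range(size)
        ]
        for r in range(size)
    ]


def image_hash(pixels):
    """Shrink to 8x8, then set one bit per cell that is brighter than the mean."""
    flat = [p for row in shrink(pixels) for p in row]
    mean = sum(flat) / len(flat)
    return sum(1 << i for i, p in enumerate(flat) if p > mean)


def mirror(pixels):
    """Horizontally flipped copy of the image."""
    return [row[::-1] for row in pixels]


def matches(a, b):
    """True if b has the same hash as a or as a's mirror image."""
    hb = image_hash(b)
    return hb == image_hash(a) or hb == image_hash(mirror(a))


# Toy 16x16 image: bright square in the top-left corner on a dark background.
small = [[220 if x < 8 and y < 8 else 30 for x in range(16)] for y in range(16)]

# The same picture at twice the resolution (each pixel duplicated 2x2).
large = [[small[y // 2][x // 2] for x in range(32)] for y in range(32)]

resized_ok = image_hash(small) == image_hash(large)
flipped_ok = matches(small, mirror(large))
```

Real photos won't line up this exactly after resizing, so in practice you'd compare hashes by how many bits differ rather than by strict equality.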