
Here's an idea...

Develop a bot to trawl NSFW sites and hash each image (combined with the 'skin detecting' algorithms detailed previously). Then compare the hash of each user-uploaded image with those in the NSFW database.

This technique relies on the assumption that NSFW images spammed onto social media sites already exist on NSFW sites (or are very similar to ones that do). Then it simply becomes a case of pattern recognition, much like SoundHound for audio, or Google Image search.
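A rough sketch of the lookup step, assuming each image has already been reduced to a 64-bit perceptual hash stored as an integer. The names `hamming` and `find_match`, the sample hash values and the threshold of 5 bits are all made up for illustration:

```python
def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two integer hashes."""
    return bin(a ^ b).count("1")


def find_match(upload_hash, known_hashes, max_distance=5):
    """Return (known_hash, distance) for the closest known hash that is
    within max_distance bits of upload_hash, or None if nothing is close."""
    best = None
    for known in known_hashes:
        d = hamming(upload_hash, known)
        if d <= max_distance and (best is None or d < best[1]):
            best = (known, d)
    return best


# Hypothetical database of hashes collected by the crawler.
known = {0xF0F0F0F0F0F0F0F0, 0x00FF00FF00FF00FF}

near = find_match(0xF0F0F0F0F0F0F0F1, known)  # differs from a known hash by one bit
far = find_match(0x123456789ABCDEF0, known)   # nowhere near either known hash
print(near, far)
```

A linear scan like this is only workable for a small database; a large one would presumably need some kind of index over the hashes.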

It wouldn't reliably detect 'original' NSFW material, but given enough cock shots as source material, it could probably find a common pattern over time.

edit: I've just noticed rfusca in the OP suggests a similar method



"This multi-TB disk array labeled 'Porn' has a legitimate business use!"


How do you program that sort of thing?

Do you have to tell it what shapes/colors to look for? Or do a combination of overall image similarity, localized image similarity and portion-by-portion image comparison?


Maybe recognising the furniture in the background would work too ;) I remember there was a website/catalog of IKEA furniture somewhere made using NSFW photos.


Well, image hashing is distinct from normal MD5 hashing, as the hash takes similarity of colour etc. into account, so matching isn't purely binary. A Google search turned up a library called pHash (pHash.org) that might do something similar.
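To make the contrast with MD5 concrete, here is a minimal sketch using a simple "average hash" on an 8x8 grayscale grid. The average hash is my stand-in for a perceptual hash and is not necessarily what pHash does; the test image and the helper names are assumptions for illustration:

```python
import hashlib


def average_hash(pixels):
    """pixels: 8x8 grid of grayscale values (0-255).
    Each bit is 1 where the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a, b):
    """Count the bits that differ between two integer hashes."""
    return bin(a ^ b).count("1")


def md5_of(pixels):
    """Ordinary cryptographic digest of the raw pixel bytes."""
    return hashlib.md5(bytes(p for row in pixels for p in row)).hexdigest()


# Made-up test image: a diagonal gradient, plus a copy with one pixel nudged.
original = [[(x + y) * 16 for x in range(8)] for y in range(8)]
tweaked = [row[:] for row in original]
tweaked[0][0] += 3

md5_changed = md5_of(original) != md5_of(tweaked)
distance = hamming(average_hash(original), average_hash(tweaked))
print("MD5 digests differ:", md5_changed)
print("average-hash bits that differ:", distance)
```

The MD5 digest changes entirely for a one-pixel edit, while the average hash barely moves, which is what makes "how many bits differ" usable as a similarity score.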


Is it possible to hash an image so that you can partially match it with subsets of that image (like cropped regions or resizes)? Or a slight modification of that image (colors shifted, image flipped, etc.)?


I believe so. TinEye uses something similar, because it detects those matches.


Yep. Google "perceptual hash functions".
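For a feel of which modifications such a hash survives, here is a small sketch of a "difference hash" (dHash), one of the simpler perceptual hashes. The nearest-neighbour resizer, the synthetic test image and all function names are assumptions for illustration; real resizers interpolate, so real-world distances would likely be small rather than exactly zero:

```python
def resize_nearest(pixels, width, height):
    """Naive nearest-neighbour resize of a 2D grayscale grid."""
    src_h, src_w = len(pixels), len(pixels[0])
    return [[pixels[y * src_h // height][x * src_w // width]
             for x in range(width)]
            for y in range(height)]


def dhash(pixels, size=8):
    """Shrink to (size+1) x size, then emit one bit per horizontally
    adjacent pair: 1 if the left pixel is brighter than the right one."""
    small = resize_nearest(pixels, size + 1, size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Count the bits that differ between two integer hashes."""
    return bin(a ^ b).count("1")


# Made-up 36x32 image: a dark vertical band at x=12, brightening to either side.
img = [[abs(x - 12) * 4 + y for x in range(36)] for y in range(32)]

upscaled = resize_nearest(img, 72, 64)            # same picture at twice the size
brighter = [[p + 40 for p in row] for row in img]  # uniform brightness shift
flipped = [row[::-1] for row in img]               # horizontal mirror

base = dhash(img)
print("upscaled:", hamming(base, dhash(upscaled)))
print("brighter:", hamming(base, dhash(brighter)))
print("flipped: ", hamming(base, dhash(flipped)))
```

In this sketch the resize and the brightness shift leave the hash untouched, but the mirrored copy lands far away. A flip could be caught by also hashing the mirrored upload. Heavy crops change the layout that a whole-image hash encodes, so matching those probably needs per-region hashes or a different technique.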


Shameless self-promotion: I wrote a perceptual hasher for PHP: https://github.com/kennethrapp/phasher

FWIW the biggest problem with this is false positives, though admittedly I may just not be clever enough to do it with enough finesse.
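The false-positive problem mostly comes down to where the match threshold sits. A toy sketch, where every distance below is an invented number rather than a measurement and `count_errors` is a hypothetical helper:

```python
def count_errors(pairs, threshold):
    """pairs: list of (hamming_distance, is_same_image).
    A pair is reported as a match when distance <= threshold.
    Returns (false_positives, false_negatives)."""
    false_pos = sum(1 for d, same in pairs if d <= threshold and not same)
    false_neg = sum(1 for d, same in pairs if d > threshold and same)
    return false_pos, false_neg


# Invented sample: distances between image pairs, labelled by hand.
sample = [
    (0, True), (3, True), (9, True), (14, True),       # edited copies of one image
    (7, False), (12, False), (25, False), (31, False),  # unrelated images
]

results = {t: count_errors(sample, t) for t in (2, 8, 16)}
for t, (fp, fn) in results.items():
    print(f"threshold={t:2d}  false positives={fp}  false negatives={fn}")
```

With these made-up numbers a tight threshold misses edited copies and a loose one flags unrelated images, so the cutoff has to be tuned against labelled examples from the actual image set.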



