71
biskus
5y

Turns out my 4chan image scraper has been running for 6 months without interruptions. I now have 106k pictures and WebMs of highly questionable content on my hard drive. This is how Oppenheimer must have felt.

Comments
  • 8
    Do you have it sorted?
    Scanned for buzzwords in the post, like YLYL?
    And did you compare hashes and link duplicate posts?
    Sounds like a cool project to me. Many things to try out and optimize come to mind.
  • 5
    That's pretty cool. Look for a partnership with a data specialist and a digital artist.
  • 1
    Good luck viewing them all... I've had a scraper too for years, though it's run on demand...
  • 11
    Now just re-upload everything you have back to 4chan and see if your scraper picks it up and downloads it :D
  • 14
    NSA/CIA: what is all that stuff on your hard drive?
  • 3
  • 0
    Can we see it?
    :P
  • 5
    Knowing their content, I wouldn't even want to glance at that folder. There is at least a gig of some form of illegal or gruesome content. I would like to keep my sanity, thanks.
  • 3
    You are correct, sir: YLYL was the first keyword I added. All media is saved with Django; every post is connected to a thread, which has attributes like title, date, etc. I'm currently building a web interface to sort through the data. Comparing hashes to identify duplicates is a great idea, thanks. @Kimmax
  • 4
    @redstonetehnik if you can find a proper name for this project with an available domain name, I will give you admin access :)
  • 4
    4chancache.com
  • 1
    boring! 😆

    dirtiestplace.online
  • 0
    Intriguing 🤔
  • 6
    Every time I hear about 4chan, it's some cool stuff. But when I go onto it... very disturbing content.
  • 3
    @dom3mo you just have to have the ability to filter what you see :D
    But there are other boards too, like /wsg/, that disallow the crazy stuff
  • 0
    @dom3mo welcome to devRant!
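
The duplicate check suggested in the comments (comparing hashes of the saved files) could look roughly like the sketch below. It is only an illustration, not the scraper's actual code: the function names `file_digest` and `find_duplicates` are made up here, SHA-256 is an arbitrary choice of hash, and it assumes the media sits in a plain folder on disk rather than going through the Django models mentioned above.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read 1 MiB at a time so large WebMs are not loaded into memory whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under root by content hash; keep groups with 2+ files."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Calling `find_duplicates(Path("media"))` (the folder name is hypothetical) returns a mapping from each digest to the files sharing it. A cryptographic hash only catches byte-identical files; re-encoded or resized reposts would need some form of perceptual hashing instead.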