596
dfox
7y

Based on popular demand, we're proud to introduce a basic image repost detector on devRant!

Right now it uses very simple hashing to see if an exact copy of an image was posted recently. If it was, then we display an error and we don't allow the image to be posted.

This is experimental so if you experience any issues with it please let me know.
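
For readers curious what "very simple hashing" might look like, here is a minimal sketch. It is not devRant's actual code: the post does not say which hash is used, so SHA-256, the class name `ExactRepostDetector`, and the in-memory dictionary are all assumptions made for illustration. The 3-day default mirrors the window mentioned later in the comments.

```python
import hashlib
from datetime import datetime, timedelta


class ExactRepostDetector:
    """Hypothetical helper: flags byte-identical uploads seen within a time window."""

    def __init__(self, window=timedelta(days=3)):
        self.window = window
        self.seen = {}  # hex digest -> time the image was last accepted

    def check_and_record(self, image_bytes, now):
        """Return True if identical bytes were accepted within the window.

        Otherwise remember the upload and return False.
        """
        digest = hashlib.sha256(image_bytes).hexdigest()  # assumed hash choice
        last = self.seen.get(digest)
        if last is not None and now - last <= self.window:
            return True  # repost: the caller would show the error
        self.seen[digest] = now
        return False
```

Because the digest covers the raw bytes, any change to the file (a resize, recompression, or an added watermark bar) produces a different digest and goes undetected.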

Comments
  • 12
    Thanks. Also think it is a nice feature.
  • 10
    If I resized the image by, e.g., 1px, would it still detect it?
  • 48
    @Skayo nope, it will only stop identical images. I think that will account for a lot of them though because I don't think many people bother to change the image they find at all and people are likely finding the same images.

    Also, this isn't to prevent abuse from people who want to post reposts. It's to warn/stop people who honestly don't know the image they are posting was recently posted, which I'm guessing is the case like 99% of the time.

    But we will definitely be monitoring how well it works as it's kind of hard to tell without testing live.
  • 6
    @dfox How recent are we talking about here ?
  • 18
    @salvator we're starting with 3 days (so no identical images within 3 day window).

    There's a number of reasons behind that, but it will probably be adjusted as we test more. But it should account for the main, annoying use-case when someone uploads an image and then an hour later someone uploads the same exact image.
  • 13
    Love the idea and direction, but I'm worried about 2 things:
    1. There's usually a few variations of the same image on Google (and I mean the same image like resized for that website or of different compression or something)
    2. When you download an image from devRant it adds a nice bar beneath it to let people know where it's from, but this means the image is not the same anymore. Granted, there's still that bar so people can see it's from DevRant, but still I feel :/ about that

    Not saying they're crucial things nor easy to fix things, just saying my current thoughts... Have some implementation of my own to do so... :)
  • 17
    @Bikonja welp, I'm an idiot... The point is to prevent accidental reposts... If you downloaded the image from devRant you know it's a repost... *Sigh* need some sleep....
  • 3
    @dfox you could add an expected pixel match: count how many pixels match the earlier image and check whether enough of them do. For this, you should compare zones of similar color shades rather than individual pixels. (That was my bachelor's thesis.)
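
One possible reading of the "zones of shades" suggestion, sketched under assumptions: pixels are grouped into coarse color buckets and two images are compared by how much their bucket counts overlap (histogram intersection). The function names, the `levels` parameter, and the pixel format (an iterable of `(r, g, b)` tuples with 0-255 channels) are hypothetical, and this is not necessarily the method the commenter had in mind.

```python
def zone_histogram(pixels, levels=4):
    """Count pixels per coarse color bucket (levels buckets per channel)."""
    step = 256 // levels
    counts = {}
    for r, g, b in pixels:
        bucket = (r // step, g // step, b // step)
        counts[bucket] = counts.get(bucket, 0) + 1
    return counts


def zone_match(pixels_a, pixels_b, levels=4):
    """Histogram intersection in [0, 1]: share of pixels landing in matching buckets."""
    hist_a = zone_histogram(pixels_a, levels)
    hist_b = zone_histogram(pixels_b, levels)
    total = max(sum(hist_a.values()), sum(hist_b.values()))
    if total == 0:
        return 0.0
    shared = sum(min(count, hist_b.get(bucket, 0)) for bucket, count in hist_a.items())
    return shared / total
```

A score near 1.0 means the two images use almost the same mix of color zones. Position is ignored, so two different images with similar palettes can also score high.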
  • 2
    Thank you, constantly seeing the same images over and over again was leaving me somewhat bored and reluctant to have to wade through memes to find the real stories.

    If there was also an algorithm to prevent comments from being copied and pasted as their own rants that'd be cool too...
  • 3
    All hail @dfox and team for developing devRant!
  • 4
    Nice! BTW @dfox maybe make it also search for devRant bar at the bottom.
  • 6
    Could you add a link to the "original" Rant in the message?
  • 6
    What if you also provided a link to the post? I think I will definitely want to see who has already posted that image.
  • 5
    Thank you dfox.
    You can improve it by providing the URL of the rant recently posted, so if the user wanted to talk about that image, he/she has a place to do it :)
  • 0
    Ah... It's not identical, since it has the devrant watermark at the bottom... Never mind... unless someone else saves the original image too
  • 1
    @NeatNerdPrime just FYI, that 'team' is just one other person besides David.

    Nice feature btw :)
  • 5
    @dfox maybe you could add a link to the rant where the last picture (hash) was found so the user who uploaded the repost can comment on that post.
  • 1
    @g-m-f Fuck you, I can't stop laughing 😄
  • 1
    @dfox This is beautiful
  • 4
    You know, I don't think a lot of the reposts are on purpose. I just don't think people see all of the memes beforehand. What could be helpful is a devRant "gallery" that can be filtered easily by common tags and shows only the photos posted, with no text. That would make it easier for people to go through the memes or whatever else was posted. And it would be fun for people to go through the memes.
  • 1
    So was the repost option for a -- not working?
  • 1
    What is the algo behind this feature? Are you using some CV AI or...?
  • 1
    Or Maybe just comparing hash?
  • 1
    @slinavipuz He explained it's comparing hashes
  • 3
    Thanks for the feedback everyone!

    We're also going to start saving a phash, a perceptual hash of each image, which can be used for real image similarity: it can detect images that aren't exactly the same but are close.
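
To illustrate the general idea of a perceptual hash, here is a minimal sketch. Note the swap: the classic pHash algorithm is based on a discrete cosine transform, while this example uses a simpler difference hash (dHash) from the same family. The input format (a grayscale image as a list of rows of integers) and the function names are assumptions for illustration, not what devRant stores.

```python
def dhash(pixels, hash_size=8):
    """Difference hash of a grayscale image given as a list of rows of ints.

    The image is shrunk to (hash_size + 1) x hash_size by nearest-neighbour
    sampling; each bit records whether a pixel is brighter than its
    right-hand neighbour.
    """
    height, width = len(pixels), len(pixels[0])
    cols, rows = hash_size + 1, hash_size
    small = [
        [pixels[r * height // rows][c * width // cols] for c in range(cols)]
        for r in range(rows)
    ]
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")
```

Two hashes with a small Hamming distance suggest visually similar images. Resizing tends to leave the hash unchanged or nearly so, which is exactly the case the exact-match check misses.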
  • 9
    Also thanks for the great idea about providing the link to the original rant. We can add it in the modal to start and it won't be clickable but will at least allow you to find the rant.
  • 4
    Oh and one more thing - this is not a replacement for downvoting as repost since this is pretty much for a few specific cases, so please don't stop marking reposts!

    As it was asked before - voting repost does help and it helps the rest of the community who has hide reposts turned on.

    The goal with this new feature is to make it so at least in some cases those reposts will never get posted.
  • 1
    Seems like a neural network could do wonders in identifying duplicate images even after modification or screenshots. Any thoughts @dfox?
  • 1
    @dfox you should use machine learning to capture all the memes

    Hey there's a good Collab idea
  • 3
    To start out simple and not have to develop a lot of it on our own, I'm looking at this PHP library which seems to do a nice job of figuring out the distance between two images: https://github.com/jenssegers/...

    Machine learning would be cool and I know there's a lot more complex stuff, but unfortunately I don't have a lot of time to pursue any of that since we have a lot of stuff to do :/
  • 2
    @dfox there are many memes out there and meme generators with only a text difference... would this algorithm mark them as duplicates as well? I mean, they're close
  • 1
    @dfox That is honestly the coolest thing using hamming distance!
  • 1
    A couple things to note about this idea.
    1) What if the rant text itself is different from another rant of the same image?
    2) Perhaps you can save space in the database by having similar pictures point to each other, instead of uploading a completely new image?
  • 3
    @amjo @tisaconundrum good questions.

    The library I linked to seems to be able to identify the text being different. What we would need to do is set the distance threshold small enough where the image could be a little different (ex. compressed, different size, etc.) but where different text would be different enough that it doesn't get caught. Another thing we could do for the fuzzy search is make it so it is a warning rather than not being able to post it. Ex. here's rants that look similar, please check these images to make sure it's not an exact repost.

    @nocturnal as we discussed in the comments here, the image needs to be identical so downloading mine from this rant won't work for a number of reasons (the bar at the bottom, compression). It only checks against original source images because the use case is really the posters got an identical image from the same outside source (ex. Reddit).
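
The distance-threshold idea with a warning instead of a hard block could be sketched roughly as follows. Everything here is hypothetical: the function name `similar_rants`, the mapping of rant IDs to integer hashes, and the default threshold of 5 bits are assumptions, and a real threshold would need tuning so that the same meme template with different text falls outside it.

```python
def hamming_distance(hash_a, hash_b):
    """Number of differing bits between two integer hashes."""
    return bin(hash_a ^ hash_b).count("1")


def similar_rants(new_hash, recent_hashes, max_distance=5):
    """Return (rant_id, distance) pairs within max_distance, closest first.

    recent_hashes maps a rant id to the perceptual hash of its image.
    An empty result means nothing looked similar enough to warn about.
    """
    matches = []
    for rant_id, old_hash in recent_hashes.items():
        distance = hamming_distance(new_hash, old_hash)
        if distance <= max_distance:
            matches.append((rant_id, distance))
    return sorted(matches, key=lambda pair: (pair[1], pair[0]))
```

A caller could show the returned rants in the warning ("these look similar, please check") and still let the user post, keeping the hard block for exact matches only.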
  • 3
    @dfox ouh, nice, like Stack Overflow does before letting you post a question :)
    (Off topic-kinda) Btw, not sure if anyone mentioned this before, but can devrant scroll to the comment I was tagged in? It gets awkward when more than 20 comments are found :p
  • 2
    @JAnken123 it won't. it will just destroy reposts that occur repeatedly.
  • 3
    @dfox your post is just 25 ++'s shy of you getting a free stress ball! Isn't that exciting?
  • 6
    @amjo yup that's on our list, will probably come soon.

    @Yankeesrule I'm looking forward. Will be another one to trip over while I'm walking around my apartment!
  • 4
    Would it be possible/easy to implement: on error, take me to the rant? Because if I upload my wife and I get an error, then shyt just got real 😂
  • 2
    @dfox if it's so I could take that ball 😂😂
  • 0
    What if it's a screenshot. You could use a pretty basic neural network to detect image similarities.
  • 1
    @jAsE dude this made me almost cry
  • 2
    It would be great to show the matched image, so that the post creator has the option to get their message through by commenting on the existing post.
  • 4
    If imgur had this, they would die in a matter of minutes
  • 0
    I don't like reposts, but new users or someone who missed the old post might enjoy it... Maybe you should only mark it in some way as a repost?
  • 1
    That's a good idea! ↑
    Maybe warn the user and mark post as repost, you can hide it from settings anyways.
  • 1
    @dfox dedication! I see you took a screenshot at 2:14am :D thanks for the feature :)
  • 4
    @J4s0n that's why the system only looks for images in the past 3 days ;)
  • 0
    This is going to be legen... Wait for it... Dary...
  • 3
    @dfox This is indeed a great feature.. And, I am sure that it will help make DevRant even more awesome...

    It's good to know that the people behind this app are directly having a conversation with their users...
  • 1
    Recommendation: Use clarifai
  • 4
    @dfox, I was wondering. Did u mail yourself a stress ball?
  • 6
    @Letmecode you're getting too close and I don't have any more announcements to post right now. Welp, guess I have to stay up all night and write witty comments on every rant. This sucks because in general I'm bad at coming up with witty comments so it'll be hard work.
  • 4
    @dfox well, don't most of us suck at that? I know I do
  • 4
    @jpichardo haha yeah. Every once in a while we all come up with a good one though :)
  • 3
    @dfox I have to agree on that.
  • 0
    @dfox can you have a feedback system too?
    Like, if a ranter claims that it's not identical, there is a way to submit feedback which includes the original image and the image they are trying to post.

    This way you can also know where your system is failing and if you need to tweak the similarity threshold.

    It can even be useful to train better machine learning models in future once you have collected such data.

    To save you time: currently the most effective and advanced image diff tools use a convolutional neural network (CNN).

    I will try to make a sample implementation of it which takes two images as input and returns a similarity metric. Once I do, I will include its link. I will also try to see how easy it is to integrate with PHP.