Comments
-
retoor 464 1d
At this moment the machine has identified 380 spam accounts.
My application sees a spammer in a rant -> goes to the profile -> browses its rants -> goes to those rants -> meets other spam bots, and it continues recursively, making the whole spamming network visible. Using this method I can determine how many networks there are and how many years they've been active.
-
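The crawl described in the comment above (spammer -> profile -> rants -> other spam bots, repeated) can be sketched as a breadth-first graph walk. This is only an illustration on toy in-memory data, not Ragnar's actual code: `crawl_spam_network`, `rants_by_user`, `users_in_rant` and `is_spammer` are hypothetical names, and a real crawler would fetch profiles and rants over HTTP instead.

```python
from collections import deque


def crawl_spam_network(seed_user, rants_by_user, users_in_rant, is_spammer):
    """Walk spammer -> rants on their profile -> other accounts seen in those rants."""
    seen_users = {seed_user}   # accounts already identified as part of the network
    seen_rants = set()         # rants already visited, so nothing is crawled twice
    queue = deque([seed_user])
    while queue:
        user = queue.popleft()
        for rant_id in rants_by_user.get(user, []):
            if rant_id in seen_rants:
                continue
            seen_rants.add(rant_id)
            for other in users_in_rant.get(rant_id, []):
                # Only follow accounts that the (hypothetical) detector flags.
                if other not in seen_users and is_spammer(other):
                    seen_users.add(other)
                    queue.append(other)
    return seen_users


# Toy data, invented for this example only.
rants_by_user = {"bot_a": [1, 2], "bot_b": [2, 3], "bot_c": [3], "alice": [4]}
users_in_rant = {
    1: ["bot_a"],
    2: ["bot_a", "bot_b", "alice"],
    3: ["bot_b", "bot_c"],
    4: ["alice"],
}
spam_accounts = {"bot_a", "bot_b", "bot_c"}

network = crawl_spam_network(
    "bot_a", rants_by_user, users_in_rant, lambda name: name in spam_accounts
)
print(sorted(network))
```

Starting a fresh crawl from every flagged account that is not yet in a known network would give one set per network, which is one way to count how many networks exist.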
retoor 464 1d
The Ragnar anti-spam system somehow has a memory leak even though it's written in Python. I have no idea how that's possible, but it can run out of memory and crash the server, which then needs a reboot. I don't get it: that process is not processing nearly enough data to use any real memory, and afaik it only remembers the ids of rants that it has checked. If someone thinks they can find it, look at the Ragnar project on my site.
Ironically, it's made to stop every thread that has an exception, so it's also not a never-ending loop processing smth.
-
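One way to hunt for a leak like the one described above is the standard library's `tracemalloc`, which can compare two memory snapshots and report the source lines whose allocations grew the most. The sketch below is an assumption-laden illustration, not a diagnosis of Ragnar: `top_growth`, `leaky_cache` and `simulated_leak` are made-up names, and the leak is simulated.

```python
import tracemalloc


def top_growth(work, limit=5):
    """Run work() and return the source lines whose allocations grew the most."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    work()
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # compare_to() returns StatisticDiff entries, biggest absolute change first.
    return after.compare_to(before, "lineno")[:limit]


leaky_cache = []  # stands in for any structure that only ever grows


def simulated_leak():
    # Hypothetical leak: every processed rant body is kept forever.
    for i in range(10_000):
        leaky_cache.append("rant body %d" % i)


growth = top_growth(simulated_leak)
for stat in growth:
    print(stat)
```

In a long-running process the same idea applies by taking a snapshot every few minutes and comparing it with the previous one; a line that keeps showing positive growth is a candidate for the leak.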
Nanonoko 12 1d
@retoor I found out that the minuses I got were minuses from your bots. They downvoted me because I helped you detect spam. Traitors)
-
Hazarth 9506 1d
You cache a whole lot of stuff. I'd recommend adding logs to the cache function you have and watching whether it recaches stuff. I'll have to check more closely once I wake up.
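The suggestion above (log what the cache does and watch for re-caching) could look roughly like this. It is a minimal sketch assuming a plain dict-backed cache keyed on positional arguments; `logged_cache` and `fetch_rant` are hypothetical names and do not come from the Ragnar source.

```python
import functools
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("cache")


def logged_cache(func):
    """Cache results per positional-argument tuple and log every hit and miss."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args in store:
            log.info("hit  %s%r entries=%d", func.__name__, args, len(store))
            return store[args]
        log.info("miss %s%r entries=%d", func.__name__, args, len(store))
        store[args] = func(*args)
        return store[args]

    wrapper.store = store  # exposed so the cache size can be inspected
    return wrapper


calls = []


@logged_cache
def fetch_rant(rant_id):
    # Hypothetical stand-in for an HTTP request.
    calls.append(rant_id)
    return {"id": rant_id}


fetch_rant(1)
fetch_rant(1)  # logged as a hit, the function body does not run again
fetch_rant(2)
```

If the log shows repeated misses for the same key, or an `entries` count that never stops growing, the cache itself is a likely suspect.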
-
retoor 464 22h
@Nanonoko I explained to you somewhere else that it downvoted because you had a low reputation AND were sharing links. Now it will be OK :)
-
Wisecrack 9470 22h
@retoor First, this is very cool.
Second, and this is so stupid, I used to do it regularly, but apparently var = [] is not the same thing as var = list()
No idea if that has any application to your memory leak, but I recall having a couple of memory leaks because of that specific issue.
You're building in Python, what libraries are you using?
-
retoor 464 19h
@Wisecrack In what world is that not the same? I'm sure it is tbh.
Look at this, five down: https://devrant.com/rants/11595787/...
I only use the requests library. You can see the project source here: https://molodetz.nl/retoor/ragnar. Would be nice if you could take a look. I just removed a bug.
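On the `var = []` versus `var = list()` question above: in standard Python both create a new, empty list. A related pitfall that does make lists grow unexpectedly is a mutable default argument, which is created once and shared between calls. Whether either applies to Ragnar is not known; the function names below are hypothetical and the sketch only demonstrates the language behaviour.

```python
# Both spellings produce a fresh, empty list of the same type.
a = []
b = list()
same_value = a == b and type(a) is type(b)
distinct_objects = a is not b


def remember_shared(rant_id, seen=[]):
    # The default list is created once, at definition time, and reused on every call.
    seen.append(rant_id)
    return seen


def remember_fresh(rant_id, seen=None):
    # Using None as the default gives each call its own new list.
    if seen is None:
        seen = []
    seen.append(rant_id)
    return seen


remember_shared(1)
shared = remember_shared(2)  # still contains the id from the previous call

remember_fresh(1)
fresh = remember_fresh(2)    # contains only the id from this call

print(same_value, distinct_objects, shared, fresh)
```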
Related Rants
Black box. It does seem to put messages with a URL in a certain category, but that's not always correct either. It's trained on 3000 normal dR messages and 3000 spam dR messages, 6000 dR messages in total. Many epochs, but not good enough for use yet. The idea that the system could classify without discriminating against new users is off the table. That discrimination is needed as a safety margin. The original spam system is a bit simple, but it doesn't produce false positives and works great. Still, I want to make smth advanced out of it for the sake of education. Tomorrow I'll have my neural networks book. Probably in two weeks I'll have some good insights into how to improve all this. New hobby :)
(Pretrained 3B models are fine for recognizing spam btw. But it costs resources: 8 CPUs at 100%. A self-trained model trained purely on spam doesn't and is fast. With a pretrained model you can't do mass classification.)
Tags: joke/meme, text, dr, devrant, spam, messages, classifier, ml, category, machine learning