Scrape the Twitter frontend API without any authentication or restrictions.

Project Type
Existing open source project
Summary

Scrape the Twitter frontend API without any authentication or restrictions.

Description
Twitter's API is annoying to work with and has lots of limitations. Luckily, their frontend (JavaScript) has its own API, which I reverse-engineered. No API rate limits. No restrictions. Extremely fast. You can use this library to get the text of any user's Tweets trivially.
  • Extracts user tweets with all metadata
  • Extracts external links, hashtags and mentions from a tweet
  • Extracts reply, favorite and retweet counts of a tweet
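To give a rough idea of what extracting hashtags, mentions and links from a tweet involves, here is a minimal sketch that works on plain tweet text with regular expressions. It is not the library's code (the library parses the frontend response with lxml and XPath), and the name `extract_entities` and the patterns are made up for illustration only.

```python
import re

# Hypothetical patterns for illustration; not taken from the library.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")
MENTION_RE = re.compile(r"(?<!\w)@(\w+)")


def extract_entities(text):
    """Return the hashtags, mentions and links found in a tweet's text."""
    links = URL_RE.findall(text)
    # Blank out links first so a '#fragment' inside a URL is not
    # mistaken for a hashtag.
    stripped = URL_RE.sub(" ", text)
    return {
        "hashtags": HASHTAG_RE.findall(stripped),
        "mentions": MENTION_RE.findall(stripped),
        "links": links,
    }


if __name__ == "__main__":
    sample = "Thanks @jack for the #python tips: https://example.com/post#intro"
    print(extract_entities(sample))
```

The lookbehind `(?<!\w)` keeps things like e-mail addresses from being counted as mentions. Real tweets have more edge cases than this handles.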
Tech Stack
python3, lxml, xpath
Current Team Size
1
URL
Comments
  • 0
    This is my first PyPI release, so any suggestions or feedback are welcome.
    You are more than welcome to contribute.
  • 1
    Was looking for something exactly like this, sweet!
  • 0
    @bolobz thinking of adding topic modelling in the next release...
  • 2
    Pure curiosity: is this legal?

    I mean, in theory their front-end is "closed" source (even if visible to everyone) and copyrighted.
    And this also allows you to, as you said, use their service without restrictions, for example to create an army of bots, or just to steal credentials with your custom authentication process.

    Awesome project btw. 😜
  • 1
    @JS96 yeah, I am kind of new to the whole scraping world, so I don't know how much of it is legal and all... I hope I don't get into trouble... Although, to be respectful, I have added a max of 25 pages of scraping in one instance... But I know that won't stop someone out there from abusing it...
  • 0
    Being respectful is not the concern here. There are a lot of terms and conditions for developers using the Twitter API and you will be circumventing all of them. This will get you into a HUGE amount of trouble.

    Particularly when GDPR comes into effect in Europe. This has a lot of implications for those who would use such a tool to gather and collect data. The developer T&Cs give you a set of guidelines; if you follow them, you can use them to protect yourself. By circumventing them, you will have no defense.

    This is probably considered breaching someone’s privacy as you haven’t agreed to the rules Twitter have put in place.

    ... on a completely unrelated note, what are the issues with Twitter’s APIs? Apart from rate limiting, which you know is a “sorry, we need to make money from this at some point” kind of thing, I’ve not had any issues. I’ve built many tools using the streaming API without any hassle.