14
nitnip
15d

Boss"So, we need to get some data about the users using the APIs from this list of sites."
Me"Alright, sounds feasible enough"

Navigating to first site.

M"Hold on, where's the API?"
B"What do you mean? You're looking at it."
M"This is a website with a search bar, not an API"
B"Same thing. Get to scraping that data."
M"I-It's written in a JS framework to be reactive in a half-assed way."
B"We need that data"
M"The data is not even consistent!"
B"That's why we need to join it with all these different sources."

The API was a lie. None of the sites had anything remotely similar to an API.

Having to use bloody Selenium with ChromeDriver to scrape all the information because, of course, it has to be done programmatically every week from now on.

I just hope no captcha of any kind is installed before I finish this project.

Comments
  • 6
    I had someone request something pretty similar. (No mention of an API though.)

    Wrote it in Selenium and let it run for half a week. Site was slow as frozen tamales.

    The same script worked flawlessly for all but like 15 entries (out of 500+?). For some reason Selenium just refused to work with them, saying it couldn’t find elements that were obviously there, the whatever expired, Mercury was in retrograde, someone insulted its mother (it was me), etc. idfk. No sense.

    Never got paid for it, so I never gave them the data. 🤷🏻‍♀️
  • 1
    Selenium not finding shit is the worst. It's either elements not being there because the site's JS framework of choice hasn't put them in the DOM yet, or elements not being clickable.

    I can somewhat get around the first issue with an explicit wait until the element is present, but the explicit wait until the element is clickable? It doesn't fucking work. I have no idea what it does, but certainly not what it's supposed to.

    This is supposed to run in the background so it won't really be affecting the site itself but... I predict it's gonna fail half of the time just because of how unreliable the sites I have to scrape are.
  • 2
    @sipu261988 Cram it, botface.
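
On the "wait until clickable" complaint above: in Selenium's Python bindings, that condition only checks that the element is displayed and enabled. The click can still be intercepted by an overlay, or land on a node the JS framework has just re-rendered. Below is a minimal sketch of one workaround, assuming Selenium 4 for Python. `retry` and `click_when_ready` are hypothetical helper names, not part of Selenium, and the attempt count and delays are guesses to tune per site.

```python
import time


def retry(action, attempts=3, delay=0.5, retry_on=(Exception,)):
    """Call action() until it succeeds or attempts run out, then re-raise."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == attempts:
                raise
            # Linear backoff: wait a little longer after each failure.
            time.sleep(delay * attempt)


def click_when_ready(driver, locator, timeout=10, attempts=3):
    """Wait for the element, click it, and retry if the click itself fails."""
    # Imported here so retry() stays usable without Selenium installed.
    from selenium.common.exceptions import (
        ElementClickInterceptedException,
        StaleElementReferenceException,
    )
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    def attempt_click():
        # Re-locate on every attempt so a re-rendered node is never reused.
        element = WebDriverWait(driver, timeout).until(
            EC.element_to_be_clickable(locator)
        )
        element.click()

    retry(
        attempt_click,
        attempts=attempts,
        retry_on=(ElementClickInterceptedException, StaleElementReferenceException),
    )
```

Usage would look something like `click_when_ready(driver, (By.CSS_SELECTOR, "button.search"))`, where the selector is made up for illustration. It won't fix a site that is genuinely broken, but it turns "fails half of the time" into "fails after three honest tries".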