14

That feeling when you’re scraping a website to build an API, your script downloads 4049/6170 pages before failing, and you have to rewrite it so that Puppeteer hits the next button 4049 times before it resumes scraping. 😅

This database is so frustrating.

I hate this website (the one I’m scraping).

It’s going to be so satisfying when this is finished.

Comments
  • 1
    What. The. Fuck.
  • 3
    I cache pages on the hard drive and download only those that are not there.
    It’s faster and disk space is cheap.
  • 0
    *Imperial Guard voice* Stop right there, criminal scum!
  • 0
    @skylord It's the Grey Fox!
  • 0
    Or just notice that there's a pattern in the requests and try requesting page 4049 directly 😣
  • 2
    @vane the problem is there's no direct pagination with a URL. There's a shitty sort of single-page application made with ASP.
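
For anyone curious, the "cache pages on disk" idea from the comments could look roughly like the sketch below. It is only an illustration, not what anyone in this thread actually runs: `cached_fetch`, `fetch_page`, and the `page_cache` directory are made-up names, and the fetcher is passed in as a plain callable because this particular site has no per-page URL and would need the browser automation to produce each page.

```python
from pathlib import Path
from typing import Callable


def cached_fetch(
    page: int,
    fetch_page: Callable[[int], str],  # hypothetical: returns the HTML of one page
    cache_dir: Path = Path("page_cache"),  # hypothetical cache location
) -> str:
    """Return the HTML for `page`, downloading it only if it is not on disk."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{page}.html"

    # Cache hit: skip the download entirely.
    if path.exists():
        return path.read_text(encoding="utf-8")

    html = fetch_page(page)

    # Write to a temp file first, then rename, so a crash mid-write
    # never leaves a truncated file that looks like a valid cache entry.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(html, encoding="utf-8")
    tmp.replace(path)
    return html
```

With something like this, a rerun after a failure at page 4049 would read the earlier pages from disk instead of downloading them again. On a site without direct page URLs the browser would presumably still have to click its way forward to the first uncached page, so this saves the downloads, not the clicks.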