4
tueor
6y

I have to download 500 images from bookreads to help a friend out. I thought I'd use this as an opportunity to learn about web scraping rather than downloading the images by hand, which would be a long, tedious waste of time. I've got a list of book and author names. The process I wanna automate is putting the book name and author name into the search bar, clicking search, and downloading the first image that appears on the new webpage. I'm planning to use Selenium, BeautifulSoup and requests for this project. Is that the right way to go?
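
For reference, here is a rough sketch of that flow (build the search URL, fetch the results page, grab the first image). It is written against the Python standard library only, as a stand-in for requests and BeautifulSoup, so it runs without installing anything. The search URL, the `q` parameter, and all function names are assumptions made up for illustration, not the real site's API.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

# Hypothetical search endpoint; the real site's URL and query parameter may differ.
SEARCH_URL = "https://example.com/search"


def build_search_url(title, author, base=SEARCH_URL):
    """Build the URL that typing title + author into the search bar would produce."""
    return base + "?" + urlencode({"q": f"{title} {author}"})


class FirstImageParser(HTMLParser):
    """Remember the src of the first <img> tag that has one."""

    def __init__(self):
        super().__init__()
        self.src = None

    def handle_starttag(self, tag, attrs):
        if tag == "img" and self.src is None:
            self.src = dict(attrs).get("src")


def first_image_url(html, page_url):
    """Return the absolute URL of the first image in the page, or None."""
    parser = FirstImageParser()
    parser.feed(html)
    return urljoin(page_url, parser.src) if parser.src else None


def download_cover(title, author, dest_path):
    """Fetch the search page and save its first image; return False if none is found."""
    page_url = build_search_url(title, author)
    with urlopen(page_url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    img_url = first_image_url(html, page_url)
    if img_url is None:
        return False
    with urlopen(img_url) as resp, open(dest_path, "wb") as out:
        out.write(resp.read())
    return True
```

Looping `download_cover` over the list of (book, author) pairs would cover all 500. On a real page the first `<img>` is often a logo rather than the cover, so the parser would probably need to match a more specific tag or attribute, and that depends on the site's actual markup.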

Comments
  • 1
    https://goo.gl/dNrLaA
    Goodreads API | ProgrammableWeb
  • 1
    @heyheni book covers are 3rd party data so it isn't allowed
  • 1
    Selenium will get you there but is slow.

    You should try Scrapy first.
  • 2
    @antorqs This is the first time I'm doing a web scraping project tho, are you sure learning Scrapy won't make me want to quit programming?
  • 1
    Try Katalon, it's a nice, easy-to-use Selenium environment. With its Google Chrome extension it's a breeze to record the steps.

    https://www.katalon.com/
  • 1
    @heyheni thanks fam I'll look into it
  • 1
    For me, requests + BeautifulSoup and a parser like lxml or html.parser works.
  • 1
    @iineo not at all. Scrapy is very simple. And it has a good-enough tutorial
  • 1
    @antorqs ya man the tutorials do seem more detailed and comprehensive, thanks