tueor

7y

I have to download 500 images from bookreads to help a friend out. Thought I'd use this opportunity to learn about web scraping rather than downloading the images by hand, which'd be a plain and long waste of time. I've got a list of books and author names; the process I wanna automate is putting the book name and author name into the search bar, clicking search, and downloading the first image that appears on the new webpage. I'm planning to use Selenium, BeautifulSoup and requests for this project. Is that the right way to go?

question

python

web scraping

Ranter

Comments

1

heyheni

20806

7y

https://goo.gl/dNrLaA
Goodreads API | ProgrammableWeb
1

tueor

73

7y

@heyheni book covers are 3rd party data so it isn't allowed
1

grayfox

3551

7y

Selenium will get you there but is slow.

You should try first with Scrapy.
2

tueor

73

7y

@antorqs This is the first time I'm doing a web scraping project tho, are you sure learning Scrapy won't make me want to quit programming?
1

heyheni

20806

7y

Try Katalon, it's a nice, easy-to-use Selenium environment. With its Google Chrome extension it's a breeze to record the steps.

https://www.katalon.com/
1

tueor

73

7y

@heyheni thanks fam I'll look into it
1

silverstar

1163

7y

For me, requests + BeautifulSoup with a parser like lxml or html.parser works.
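
A rough sketch of the "search, then grab the first image" flow, in case it helps. To keep it runnable without installing anything it uses the standard library's html.parser directly instead of BeautifulSoup, and it only covers building the search URL and pulling out the image link. The search endpoint, the `q` parameter and all the function names here are made up for illustration; the real site will differ.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin

# Hypothetical search endpoint: the real URL and query parameter are assumptions.
SEARCH_URL = "https://example.com/search"


class FirstImageParser(HTMLParser):
    """Remember the src of the first <img> tag that has one."""

    def __init__(self):
        super().__init__()
        self.first_src = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; keep only the first src found.
        if tag == "img" and self.first_src is None:
            self.first_src = dict(attrs).get("src")


def build_search_url(title, author):
    """Put the book title and author into the (assumed) search query string."""
    return SEARCH_URL + "?" + urlencode({"q": f"{title} {author}"})


def first_image_url(html, page_url):
    """Return the absolute URL of the first image in the page, or None."""
    parser = FirstImageParser()
    parser.feed(html)
    if parser.first_src is None:
        return None
    # Image links are often relative, so resolve them against the page URL.
    return urljoin(page_url, parser.first_src)
```

With requests and BeautifulSoup installed, the fetch would be `requests.get(url).text` and the extraction would shrink to `BeautifulSoup(html, "html.parser").find("img")`. On a real results page the very first `<img>` may well be a logo rather than a cover, so a more specific selector is probably needed, and a short pause between the 500 requests is a good idea.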
1

grayfox

3551

7y

@iineo not at all. Scrapy is very simple, and it has a good-enough tutorial.
1

tueor

73

7y

@antorqs ya man the tutorials do seem more detailed and comprehensive, thanks


devRant © 2021 Hexical Labs LLC