3
donuts
5y

Is there an HTML scraper library for JS/npm, like Selenium or HtmlAgilityPack, where I can find elements by ID, XPath, element type, and attributes?

Comments
  • 1
    CasperJS?
  • 0
    @bkwilliams I'm not really sure it fits my use case, and from the sample code it's hard to understand how it works in general...

    What I'd like is to pass in an HTML string and get back some sort of HtmlDocument, so that I can do var nodes = doc.findByElementType(".//div[@class='row']/a");

    for (var node of nodes) { ... }
  • 1
    If you are already using JS, just use JS.
    [stolen from SO]
    https://stackoverflow.com/questions...

    var parser = new DOMParser();
    var htmlDoc = parser.parseFromString(txt, 'text/html');
    // do whatever you want with htmlDoc.getElementsByTagName('a');

    But I use Selenium/ChromeDriver with HAP in C# so I’m kinda guessing.
  • 1
    @bkwilliams I need to create an Android app but would have to relearn Android, so I'm going to use React Native, which supports some npm libraries.
  • 0
    PhantomJS?
  • 0
    @dakkarant isn't that basically a browser? I meant something more like a DOM library that reads in HTML and then returns the elements I want via selectors.
  • 0
    Google nokogiri for JS, something has to show up.
  • 1
    JSDom, or Cheerio if you want jQuery. Both are very good options and let you work with the document as if you were in the browser.

    I prefer JSDom myself for the vanilla js experience. Can't stand jQuery 😀
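
    For what it's worth, here's a rough sketch of the "pass in an HTML string, get elements back" flow with JSDom. It assumes the jsdom package is installed from npm; the sample HTML and the variable names are made up for illustration. It uses a CSS selector as the counterpart of the XPath from the question, and I haven't checked whether jsdom runs under React Native.

    ```javascript
    // Assumes `npm install jsdom`; the HTML below is invented sample data.
    const { JSDOM } = require('jsdom');

    const html = `
      <div id="main">
        <div class="row"><a href="/one">One</a></div>
        <div class="row"><a href="/two">Two</a></div>
        <div class="other"><a href="/three">Three</a></div>
      </div>`;

    // Parse the string into a document; no browser involved.
    const doc = new JSDOM(html).window.document;

    // Find an element by ID.
    const main = doc.getElementById('main');

    // CSS counterpart of the XPath .//div[@class='row']/a
    const links = Array.from(main.querySelectorAll("div[class='row'] > a"));

    // for...of walks the elements themselves (for...in would give indices).
    const hrefs = [];
    for (const a of links) {
      hrefs.push(a.getAttribute('href'));
    }
    console.log(hrefs);
    ```

    The same getElementById / querySelectorAll calls work on the htmlDoc you get back from DOMParser in a browser, so the lookup code shouldn't need to change between the two.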