simple-html-scraper
Last updated a month ago by aminekamal.
$ cnpm install simple-html-scraper 

Simple Html Scraper

Simple Html Scraper is a small Puppeteer-based scraper that fetches the HTML content and image URLs of a web page.

Installation

Use the package manager npm to install Simple Html Scraper.

npm i simple-html-scraper

Usage

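import { Scraper } from 'simple-html-scraper';

// The options object is optional; its fields are listed below.
const scraper = new Scraper(/* { options } */);

(async () => {
  // Replace 'url' with the address of the page to scrape.
  const result = await scraper.get('url');
  console.log(result.content); // HTML content of the page
  console.log(result.images); // Array of image URLs
})();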

/* options
{
  scroll?: boolean; // enable scrolling
  maxScroll?: number | 'MAX'; // number of scroll iterations
  scrollWait?: number; // time to wait after each scroll
  resources?: string[]; // resources to accept during the page load
  puppeteer?: LaunchOptions; // options passed to Puppeteer
}
*/
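
To illustrate the option list above, here is a minimal sketch of building an options object. The interface is copied from that list. Everything else is an assumption made for the example: the helper `checkOptions` is hypothetical and not part of the package, the concrete values are not documented defaults, the unit of `scrollWait` is assumed to be milliseconds, the resource names are guesses, and `LaunchOptions` is replaced by a loose record type so the snippet compiles without Puppeteer installed.

```typescript
// Shape copied from the option list above. In real code `puppeteer` would be
// typed as Puppeteer's LaunchOptions; a loose record is used here (assumption)
// so the snippet stands on its own.
interface ScraperOptions {
  scroll?: boolean;
  maxScroll?: number | 'MAX';
  scrollWait?: number;
  resources?: string[];
  puppeteer?: Record<string, unknown>;
}

// Hypothetical helper (not provided by the package): rejects values that
// cannot be meaningful before they reach the scraper.
function checkOptions(options: ScraperOptions): ScraperOptions {
  const { maxScroll, scrollWait } = options;
  if (
    typeof maxScroll === 'number' &&
    (!Number.isInteger(maxScroll) || maxScroll < 0)
  ) {
    throw new RangeError('maxScroll must be a non-negative integer or "MAX"');
  }
  if (scrollWait !== undefined && scrollWait < 0) {
    throw new RangeError('scrollWait must not be negative');
  }
  return options;
}

const options = checkOptions({
  scroll: true,
  maxScroll: 5, // five scroll iterations
  scrollWait: 500, // unit assumed to be milliseconds
  resources: ['document', 'image'], // assumed resource names
  puppeteer: { headless: true }, // forwarded to Puppeteer's launch call
});
```

The checked object would then be passed to the constructor, as in `new Scraper(options)`.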

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT

Current Tags

  • latest: 1.0.2 (a month ago)

3 Versions

  • 1.0.2 (a month ago)
  • 1.0.1 (a month ago)
  • 1.0.0 (a month ago)