article-parser
Extract clean article data from given URL.
Last updated 8 months ago by dongnd .
$ cnpm install article-parser 

article-parser

Extract the main article, main image, and metadata from a URL.

Usage

npm install article-parser

Then:

const {
  extract
} = require('article-parser');

const url = 'https://goo.gl/MV8Tkh';

// extract() returns a Promise that resolves to the article object
extract(url).then((article) => {
  console.log(article);
}).catch((err) => {
  console.error(err);
});

APIs

configure(Object conf)

{
  fetchOptions: Object,
  wordsPerMinute: Number,
  htmlRules: Object,
  SoundCloudKey: String,
  YouTubeKey: String,
  EmbedlyKey: String
}
  • fetchOptions: Object, a simplified version of node-fetch options. Only headers, timeout and agent are supported.
  • wordsPerMinute: Number, default 300, used to estimate reading time.
  • htmlRules: Object, options for cleaning HTML with sanitize-html.
  • SoundCloudKey: String, used to get audio duration.
  • YouTubeKey: String, used to get video duration.
  • EmbedlyKey: String, used to extract with the Embedly API.
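
As a rough illustration of what a words-per-minute setting controls, the sketch below estimates reading time from plain text. The function estimateReadingTime is a hypothetical helper written for this example, not part of article-parser, and the library's own calculation may differ.

```javascript
// Hypothetical helper, not part of article-parser's API.
// Estimates reading time in seconds for a plain-text string.
function estimateReadingTime(text, wordsPerMinute = 300) {
  // Split on whitespace and drop empty tokens so '' counts as 0 words
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  // Multiply before dividing to avoid floating-point drift before rounding up
  return Math.ceil((words * 60) / wordsPerMinute);
}

// 600 words at the default 300 words per minute
console.log(estimateReadingTime('word '.repeat(600))); // 120
```

A lower wordsPerMinute value yields a longer estimate for the same text.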

The default configuration should work for most cases.
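
When the defaults are not enough, a configuration call might look like the sketch below. It uses only the option keys listed above; the values, the user-agent string and the applyConfig helper are assumptions made for illustration. The module object is passed in as an argument so the snippet does not depend on the package being installed.

```javascript
// Only documented option keys are used; every value here is illustrative.
const conf = {
  fetchOptions: {
    headers: { 'user-agent': 'my-crawler/1.0' }, // hypothetical UA string
    timeout: 10000 // assumed to be milliseconds, as in node-fetch
  },
  wordsPerMinute: 250
};

// Hypothetical helper: `parser` is the module object,
// e.g. require('article-parser').
function applyConfig(parser, options) {
  parser.configure(options);
  // getConfig() is documented to return the current configuration
  return parser.getConfig();
}

// Usage (requires the package to be installed):
// const current = applyConfig(require('article-parser'), conf);
// console.log(current);
```

Keys left out of the object, such as the API keys, are simply not passed; how the library merges them with its defaults is not described here.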

extract(String url)

Extracts article data from the specified URL.

const {
  extract
} = require('article-parser');

const url = 'https://www.youtube.com/watch?v=tRGJj59G1x4';

// The Promise resolves to the extracted article object
extract(url).then((article) => {
  console.log(article);
}).catch((err) => {
  console.error(err);
});

The resolved article object would look something like this:

{
  title: 'Zato ESB - Test demo hosted on company server',
  alias: 'zato-esb-test-demo-hosted-on-company-server-1500021746537-PAQXw8IYcU',
  url: 'https://www.youtube.com/watch?v=tRGJj59G1x4',
  canonicals:
   [ 'https://www.youtube.com/watch?v=tRGJj59G1x4',
     'https://youtu.be/tRGJj59G1x4',
     'https://www.youtube.com/v/tRGJj59G1x4',
     'https://www.youtube.com/embed/tRGJj59G1x4' ],
  description: 'Our sample: https://github.com/greenglobal/zato-demo Zato homepage: https://zato.io Tutorial: "Zato — a powerful Python-based ESB solution for your SOA" http...',
  content: '<iframe src="https://www.youtube.com/embed/tRGJj59G1x4?feature=oembed" frameborder="0" allowfullscreen></iframe>',
  image: 'https://i.ytimg.com/vi/tRGJj59G1x4/hqdefault.jpg',
  author: 'Dong Nguyen',
  source: 'YouTube',
  domain: 'youtube.com',
  publishedTime: '',
  duration: 292
}
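
A result object like the one above can be turned into a short display line. The sketch below assumes that duration is expressed in seconds, which the sample suggests but does not state; formatDuration and describeArticle are hypothetical helpers, not part of article-parser.

```javascript
// Hypothetical helpers for displaying an extracted article.
// Assumption: `duration` is a whole number of seconds.
function formatDuration(totalSeconds) {
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  // Pad seconds to two digits, e.g. 65 -> "1:05"
  return `${minutes}:${String(seconds).padStart(2, '0')}`;
}

function describeArticle(article) {
  return `${article.title} (${article.source}, ${formatDuration(article.duration)})`;
}

// A subset of the sample fields shown above
const sample = {
  title: 'Zato ESB - Test demo hosted on company server',
  source: 'YouTube',
  duration: 292
};

console.log(describeArticle(sample));
// Zato ESB - Test demo hosted on company server (YouTube, 4:52)
```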

extractWithEmbedly(String url [, String EmbedlyKey])

Extracts article data from the specified URL using the Embedly Extract API.

The second parameter is optional. If you have already set your Embedly key via the configure() method, you can omit it here.

const {
  extractWithEmbedly
} = require('article-parser');

const url = 'https://goo.gl/MV8Tkh';

// Pass the Embedly key as a second argument if it was not set via configure()
extractWithEmbedly(url).then((article) => {
  console.log(article);
}).catch((err) => {
  console.error(err);
});

getConfig()

Returns the current configuration.

Test

git clone https://github.com/ndaidong/article-parser.git
cd article-parser
npm install
npm test

License

The MIT License (MIT)

Current Tags

  • 2.4.0                                ...           latest (8 months ago)

74 Versions

  • 2.4.0                                ...           8 months ago
  • 2.3.7                                ...           a year ago
  • 2.3.6                                ...           a year ago
  • 2.3.5                                ...           a year ago
  • 2.3.4                                ...           2 years ago
  • 2.3.2                                ...           2 years ago
  • 2.3.1                                ...           2 years ago
  • 2.3.0                                ...           2 years ago
  • 2.2.1                                ...           2 years ago
  • 2.2.0                                ...           2 years ago
  • 2.1.1                                ...           2 years ago
  • 2.0.5                                ...           2 years ago
  • 2.0.4                                ...           2 years ago
  • 2.0.3                                ...           2 years ago
  • 2.0.2                                ...           2 years ago
  • 2.0.1                                ...           2 years ago
  • 2.0.0                                ...           2 years ago
  • 2.0.0-rc1                                ...           2 years ago
  • 2.0.0-rc0                                ...           2 years ago
  • 1.6.4                                ...           2 years ago
  • 1.6.21                                ...           2 years ago
  • 1.6.2                                ...           2 years ago
  • 1.6.15                                ...           2 years ago
  • 1.6.14                                ...           2 years ago
  • 1.6.13                                ...           2 years ago
  • 1.6.12                                ...           2 years ago
  • 1.6.11                                ...           2 years ago
  • 1.6.1                                ...           2 years ago
  • 1.6.0                                ...           3 years ago
  • 1.5.3                                ...           3 years ago
  • 1.5.27                                ...           3 years ago
  • 1.5.26                                ...           3 years ago
  • 1.5.25                                ...           3 years ago
  • 1.5.24                                ...           3 years ago
  • 1.5.23                                ...           3 years ago
  • 1.5.22                                ...           3 years ago
  • 1.5.21                                ...           3 years ago
  • 1.5.2                                ...           3 years ago
  • 1.5.1                                ...           3 years ago
  • 1.5.0                                ...           3 years ago
  • 1.1.0                                ...           3 years ago
  • 1.0.0                                ...           3 years ago
  • 0.5.12                                ...           3 years ago
  • 0.5.11                                ...           3 years ago
  • 0.5.10                                ...           3 years ago
  • 0.4.3                                ...           3 years ago
  • 0.4.2                                ...           3 years ago
  • 0.4.1                                ...           3 years ago
  • 0.4.0                                ...           3 years ago
  • 0.3.8                                ...           3 years ago
  • 0.3.7                                ...           3 years ago
  • 0.3.6                                ...           3 years ago
  • 0.3.51                                ...           3 years ago
  • 0.3.5                                ...           3 years ago
  • 0.3.42                                ...           3 years ago
  • 0.3.41                                ...           3 years ago
  • 0.3.4                                ...           3 years ago
  • 0.3.3                                ...           3 years ago
  • 0.3.2                                ...           3 years ago
  • 0.3.12                                ...           4 years ago
  • 0.3.11                                ...           4 years ago
  • 0.2.5                                ...           4 years ago
  • 0.2.42                                ...           4 years ago
  • 0.2.41                                ...           4 years ago
  • 0.2.2                                ...           4 years ago
  • 0.2.1                                ...           4 years ago
  • 0.1.9                                ...           4 years ago
  • 0.1.8                                ...           4 years ago
  • 0.1.6                                ...           4 years ago
  • 0.1.4                                ...           4 years ago
  • 0.1.3                                ...           4 years ago
  • 0.1.2                                ...           4 years ago
  • 0.1.1                                ...           4 years ago
  • 0.1.0                                ...           4 years ago