GitHub - Netgator/scrapedin: LinkedIn Scraper (working for the new website 2019)

Scraper for LinkedIn full profile data.
Unlike other scrapers, it works with LinkedIn's new website as of 2019.

Install via npm package manager: npm i scrapedin

Check your version!

We need to update the package every time LinkedIn changes its website. Please check that you have the latest version.

Latest release: v1.0.8 (16 Jul 2019)
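
One way to check the version from code is to compare the installed version string against the release listed above. The sketch below is only an illustration: `isOlderThan` is a hypothetical helper, not part of scrapedin, and it handles plain `major.minor.patch` strings only.

```javascript
// Hypothetical helper, not part of scrapedin: compares two plain
// "major.minor.patch" version strings (no pre-release tags).
function isOlderThan (installed, latest) {
  const a = installed.replace(/^v/, '').split('.').map(Number)
  const b = latest.replace(/^v/, '').split('.').map(Number)
  for (let i = 0; i < 3; i++) {
    const x = a[i] || 0
    const y = b[i] || 0
    if (x !== y) return x < y
  }
  return false // both versions are equal
}

console.log(isOlderThan('1.0.7', 'v1.0.8')) // true
console.log(isOlderThan('1.0.8', 'v1.0.8')) // false
```

From the command line, `npm ls scrapedin` shows the installed version and `npm view scrapedin version` shows the latest published one.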

Usage Example:

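const scrapedin = require('scrapedin')

// await must run inside an async function in a CommonJS module
async function main () {
  const profileScraper = await scrapedin({ email: 'login@mail.com', password: 'pass' })
  return profileScraper('https://www.linkedin.com/in/some-profile/')
}
main().then(profile => console.log(profile))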
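
The same profileScraper function can be reused for several profile URLs. The following is a minimal sketch under that assumption: `scrapeAll` is a hypothetical helper, not part of scrapedin, and it takes the scraper function as an argument so it can be tried with a stand-in function.

```javascript
// Hypothetical helper: scrapes several profile URLs one at a time,
// reusing a single profileScraper function (as returned by scrapedin).
// Failures are recorded per URL instead of aborting the whole run.
async function scrapeAll (profileScraper, urls, waitTimeMs = 500) {
  const results = []
  for (const url of urls) {
    try {
      results.push({ url, profile: await profileScraper(url, waitTimeMs) })
    } catch (err) {
      results.push({ url, error: err.message })
    }
  }
  return results
}
```

Running the requests sequentially keeps a single browser page busy at a time. Whether parallel requests would work with one login session is not covered here.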

Documentation:

scrapedin(options)
- options Object:
  - email: LinkedIn login e-mail (required)
  - password: LinkedIn login password (required)
  - isHeadless: run the browser in headless mode, without a visible window (default false)
  - hasToLog: print logs on stdout (default false)
  - puppeteerArgs: Puppeteer launch options Object. Chromium command-line flags can be passed through its args property, for example { args: ['--no-sandbox'] } (default undefined)
- returns: Promise of profileScraper function

profileScraper(url, waitTimeMs = 500)
- url string: a LinkedIn profile URL
- waitTimeMs integer: milliseconds to wait for the page to load before scraping
- returns: Promise of profile Object
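
The options above could be assembled as in the sketch below. `buildOptions` and `createScraper` are hypothetical helpers, and the environment variable names `SCRAPEDIN_EMAIL` and `SCRAPEDIN_PASSWORD` are assumptions chosen for illustration. Only the option keys come from the documentation above.

```javascript
// Hypothetical helper: builds the options object documented above.
// SCRAPEDIN_EMAIL and SCRAPEDIN_PASSWORD are assumed names, not defined by scrapedin.
function buildOptions (env) {
  if (!env.SCRAPEDIN_EMAIL || !env.SCRAPEDIN_PASSWORD) {
    throw new Error('SCRAPEDIN_EMAIL and SCRAPEDIN_PASSWORD must be set')
  }
  return {
    email: env.SCRAPEDIN_EMAIL, // required
    password: env.SCRAPEDIN_PASSWORD, // required
    isHeadless: true, // no visible browser window
    hasToLog: false, // no logs on stdout
    puppeteerArgs: { args: ['--no-sandbox'] } // Chromium flag from the example above
  }
}

// The scrapedin function is passed in as an argument, so this sketch
// can be exercised without the package installed.
async function createScraper (scrapedin, env) {
  return scrapedin(buildOptions(env))
}
```

With the package installed, this could be called as `createScraper(require('scrapedin'), process.env)`, which keeps credentials out of the source file.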

profile Object:

{
  profile: {
    name, headline, location, summary, connections, followers
  },
  positions: [
    { title, company, description, date1, date2,
      roles: [{ title, description, date1, date2 }]
    }
  ],
  educations: [
    { title, degree, date1, date2 }
  ],
  skills: [
    { title, count }
  ],
  recommendations: [
    { user, text }
  ],
  recommendationsCount: {
    received, given
  },
  recommendationsReceived: [
    { user, text }
  ],
  recommendationsGiven: [
    { user, text }
  ],
  accomplishments: [
    { count, title, items }
  ],
  volunteerExperience: {
    title, experience, location, description, date1, date2
  },
  peopleAlsoViewed: [
    { user, text }
  ]
}
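
As an illustration of working with this shape, the sketch below reduces a profile Object to a short summary. `summarize` is a hypothetical helper and the sample values are made up. It also assumes that the skill `count` can be converted to a number, which the documentation above does not state.

```javascript
// Sample data in the documented shape; all values are made up for illustration.
const sample = {
  profile: { name: 'Jane Doe', headline: 'Engineer', location: 'Lisbon' },
  positions: [{ title: 'Developer', company: 'Acme', roles: [] }],
  skills: [{ title: 'SQL', count: '3' }, { title: 'JavaScript', count: '12' }]
}

// Hypothetical helper: picks a few fields and the three most endorsed skills.
// Missing sections fall back to empty values so partial profiles do not throw.
function summarize (result) {
  const { profile = {}, positions = [], skills = [] } = result
  const topSkills = [...skills]
    .sort((a, b) => (Number(b.count) || 0) - (Number(a.count) || 0))
    .slice(0, 3)
    .map((skill) => skill.title)
  return {
    name: profile.name,
    headline: profile.headline,
    companies: positions.map((position) => position.company),
    topSkills
  }
}

console.log(summarize(sample))
```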

Tips

We already built a crawler to automatically collect multiple profiles, so check it out: scrapedin-linkedin-crawler
On the first run, LinkedIn usually asks for a manual check. To get past it:
- set isHeadless to false on scrapedin so you can solve the manual check in the browser.
- set waitTimeMs to a large number (such as 10000) so you have time to solve the manual check.
After completing the manual check once, you can restore the previous isHeadless and waitTimeMs values and start scraping.
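
The manual-check steps above can be sketched as a switch between two sets of values. `settingsFor` and `run` are hypothetical helpers. The "normal" values (isHeadless true, waitTimeMs 500) are assumptions, with 500 taken from the documented default.

```javascript
// Hypothetical helper: picks settings for the manual-check workflow.
function settingsFor (isFirstRun) {
  return isFirstRun
    ? { isHeadless: false, waitTimeMs: 10000 } // visible browser, time to solve the check
    : { isHeadless: true, waitTimeMs: 500 } // assumed values for regular runs
}

// The scrapedin function is passed in as an argument, so this sketch
// can be exercised without the package installed.
async function run (scrapedin, credentials, url, isFirstRun) {
  const { isHeadless, waitTimeMs } = settingsFor(isFirstRun)
  const profileScraper = await scrapedin({ ...credentials, isHeadless })
  return profileScraper(url, waitTimeMs)
}
```

A first run might then look like `run(require('scrapedin'), { email, password }, url, true)`, followed by later runs with `false` as the last argument.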

We still don't have a solution for this on remote servers without a GUI. If you have any ideas, please tell us!

Contribution

Feel free to contribute. Just open an issue to discuss something before creating a PR.

License

Apache 2.0

