Scraper for LinkedIn full profile data.
Unlike other scrapers, it works in 2019 with LinkedIn's new website.
Install via npm package manager: npm i scrapedin
This package needs an update every time LinkedIn changes its website, so please check that you have the latest version.
- Latest release: v1.0.8 (released 16 Jul 2019)
const scrapedin = require('scrapedin')
const profileScraper = await scrapedin({ email: 'login@mail.com', password: 'pass' })
const profile = await profileScraper('https://www.linkedin.com/in/some-profile/')
scrapedin(options)
- options Object:
  - email: LinkedIn login e-mail (required)
  - password: LinkedIn login password (required)
  - isHeadless: run the browser in headless mode; when false, the browser window is displayed (default false)
  - hasToLog: print logs on stdout (default false)
  - puppeteerArgs: puppeteer launch options Object. It's very useful: you can also pass Chromium parameters in its args property, for example { args: ['--no-sandbox'] } (default undefined)
- returns: Promise of profileScraper function

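A minimal sketch of how the options above might be assembled, assuming the credentials come from environment variables. The `buildOptions` helper and the variable names `LINKEDIN_EMAIL` and `LINKEDIN_PASSWORD` are hypothetical and not part of scrapedin; the option keys themselves are the ones listed above.

```javascript
// Hypothetical helper (not part of scrapedin): builds the options object
// from environment variables so credentials stay out of the source code.
// LINKEDIN_EMAIL and LINKEDIN_PASSWORD are assumed names.
function buildOptions (env) {
  if (!env.LINKEDIN_EMAIL || !env.LINKEDIN_PASSWORD) {
    throw new Error('LINKEDIN_EMAIL and LINKEDIN_PASSWORD must be set')
  }
  return {
    email: env.LINKEDIN_EMAIL,
    password: env.LINKEDIN_PASSWORD,
    isHeadless: true, // hide the browser window
    hasToLog: true, // print logs on stdout
    puppeteerArgs: { args: ['--no-sandbox'] } // Chromium flag from the example above
  }
}

async function scrapeWithOptions () {
  const scrapedin = require('scrapedin') // loaded only when actually scraping
  const profileScraper = await scrapedin(buildOptions(process.env))
  return profileScraper('https://www.linkedin.com/in/some-profile/')
}

// Run only when credentials are present in the environment.
if (process.env.LINKEDIN_EMAIL && process.env.LINKEDIN_PASSWORD) {
  scrapeWithOptions()
    .then(profile => console.log(profile))
    .catch(err => {
      console.error(err)
      process.exitCode = 1
    })
}
```

Keeping the options in one function makes it easy to switch isHeadless or puppeteerArgs per environment without touching the scraping code.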
profileScraper(url, waitTimeMs = 500)
- url string: A LinkedIn profile URL
- waitTimeMs integer: milliseconds to wait for the page to load before scraping
- returns: Promise of profile Object

profile Object:
{
  profile: { name, headline, location, summary, connections, followers },
  positions: [ { title, company, description, date1, date2, roles: [ { title, description, date1, date2 } ] } ],
  educations: [ { title, degree, date1, date2 } ],
  skills: [ { title, count } ],
  recommendations: [ { user, text } ],
  recommendationsCount: { received, given },
  recommendationsReceived: [ { user, text } ],
  recommendationsGiven: [ { user, text } ],
  accomplishments: [ { count, title, items } ],
  volunteerExperience: { title, experience, location, description, date1, date2 },
  peopleAlsoViewed: [ { user, text } ]
}

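As an illustration of working with the profile Object, here is a small sketch that reduces it to a short summary. The `summarizeProfile` function and the sample data are hypothetical; the sketch also assumes that `count` may arrive as either a number or a numeric string, which the source does not specify.

```javascript
// Hypothetical helper (not part of scrapedin): picks a few fields out of
// a profile Object shaped like the one documented above.
function summarizeProfile (result) {
  const { profile = {}, positions = [], skills = [] } = result
  // Sort a copy of the skills by count, highest first, and keep the top three titles.
  const topSkills = [...skills]
    .sort((a, b) => (Number(b.count) || 0) - (Number(a.count) || 0))
    .slice(0, 3)
    .map(skill => skill.title)
  return {
    name: profile.name,
    headline: profile.headline,
    companies: positions.map(position => position.company),
    topSkills
  }
}

// Made-up sample data, only to show the expected input shape.
const sampleProfile = {
  profile: { name: 'Jane Doe', headline: 'Software Engineer' },
  positions: [{ title: 'Developer', company: 'Acme' }],
  skills: [
    { title: 'JavaScript', count: '12' },
    { title: 'SQL', count: '3' }
  ]
}

console.log(summarizeProfile(sampleProfile))
```

Missing sections default to empty arrays, so the function also works on profiles where LinkedIn returned only part of the data.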
We already built a crawler to automatically collect multiple profiles, so check it out: scrapedin-linkedin-crawler
Usually on the first run LinkedIn asks for a manual check. To solve that you should:
- set isHeadless to false on scrapedin so you can solve the manual check in the browser.
- set waitTimeMs to a large number (such as 10000) so that you have time to solve the manual check.

After doing the manual check once, you can restore the previous isHeadless and waitTimeMs values and start scraping.

We still don't have a solution for this on remote servers without a GUI. If you have any ideas, please tell us!
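The first-run procedure above could be sketched as follows. The `runSettings` helper and the `FIRST_RUN`, `LINKEDIN_EMAIL` and `LINKEDIN_PASSWORD` environment variables are hypothetical; the "normal" values (isHeadless true, waitTimeMs 500) are only an assumption about what your previous values might be.

```javascript
// Hypothetical helper (not part of scrapedin): chooses settings for the
// first run (visible browser, long wait) or for later runs.
function runSettings (isFirstRun) {
  return isFirstRun
    ? { isHeadless: false, waitTimeMs: 10000 } // time to solve the manual check
    : { isHeadless: true, waitTimeMs: 500 } // assumed normal values
}

async function scrapeProfile (url, isFirstRun) {
  const scrapedin = require('scrapedin') // loaded only when actually scraping
  const { isHeadless, waitTimeMs } = runSettings(isFirstRun)
  const profileScraper = await scrapedin({
    email: process.env.LINKEDIN_EMAIL,
    password: process.env.LINKEDIN_PASSWORD,
    isHeadless
  })
  return profileScraper(url, waitTimeMs)
}

// Run only when credentials are present; set FIRST_RUN=1 for the first run.
if (process.env.LINKEDIN_EMAIL && process.env.LINKEDIN_PASSWORD) {
  scrapeProfile('https://www.linkedin.com/in/some-profile/', process.env.FIRST_RUN === '1')
    .then(profile => console.log(profile))
    .catch(err => {
      console.error(err)
      process.exitCode = 1
    })
}
```

After the manual check has been solved once, running the same script without FIRST_RUN switches back to the faster settings.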
Feel free to contribute. Please open an issue to discuss your change before creating a PR.
