A crawler system that parses a specific website based on an XML config file

aboharrawi/Crawler

This is the beta version of our crawler system. It can crawl any website specified in an XML file, collect all hyperlinks and data, and store them in a MySQL database.

The crawler is a multithreaded Java application that uses the jsoup API to parse websites.
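To illustrate what a multithreaded crawl loop can look like, here is a minimal sketch using only the Java standard library. It is not the project's actual implementation: the class name CrawlSketch, the crawl method, and the linkExtractor parameter are all hypothetical. Link extraction is passed in as a function so the sketch stays self-contained; in a real crawler that function could be backed by jsoup, for example by fetching a page with Jsoup.connect(url).get() and reading the absolute URLs of its a[href] elements.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;
import java.util.function.Function;

// Hypothetical sketch: not the class layout used by the actual Crawler project.
final class CrawlSketch {

    // Visits every URL reachable from startUrl exactly once, using a fixed thread pool.
    // linkExtractor is an assumed callback that returns the links found on a page.
    static Set<String> crawl(String startUrl,
                             Function<String, List<String>> linkExtractor,
                             int threads) {
        Set<String> visited = ConcurrentHashMap.newKeySet(); // thread-safe "seen" set
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Phaser pending = new Phaser(1); // one party for the calling thread
        submit(startUrl, linkExtractor, visited, pool, pending);
        pending.arriveAndAwaitAdvance(); // blocks until every submitted task has finished
        pool.shutdown();
        return visited;
    }

    private static void submit(String url,
                               Function<String, List<String>> linkExtractor,
                               Set<String> visited,
                               ExecutorService pool,
                               Phaser pending) {
        if (!visited.add(url)) {
            return; // already seen, so no second task is scheduled
        }
        pending.register(); // one party per in-flight page
        pool.execute(() -> {
            try {
                for (String link : linkExtractor.apply(url)) {
                    submit(link, linkExtractor, visited, pool, pending);
                }
            } finally {
                pending.arriveAndDeregister(); // mark this page as done, even on failure
            }
        });
    }
}
```

The concurrent set guarantees that each URL is processed once even when several threads discover it at the same time, and the Phaser lets the caller wait without knowing in advance how many pages will be found. A production crawler would also need a depth or domain limit, which this sketch leaves out.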

The jar file is available at out/artifacts.Crawler_jar/Crawler.jar. Run it as:

java -jar Crawler.jar DBurl username password xmlConfigFile
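The four positional arguments could be read along the following lines. This is a sketch under assumptions, not the project's own code: the CrawlerArgs class and its field names are hypothetical, and the actual application may validate its input differently.

```java
// Hypothetical helper: the real Crawler.jar may handle its arguments differently.
final class CrawlerArgs {
    final String dbUrl;          // 1st argument: DBurl
    final String username;       // 2nd argument: database user
    final String password;       // 3rd argument: database password
    final String xmlConfigFile;  // 4th argument: path to the XML config file

    private CrawlerArgs(String dbUrl, String username, String password, String xmlConfigFile) {
        this.dbUrl = dbUrl;
        this.username = username;
        this.password = password;
        this.xmlConfigFile = xmlConfigFile;
    }

    // Maps the positional arguments to named fields, failing fast on a wrong count.
    static CrawlerArgs parse(String[] args) {
        if (args.length != 4) {
            throw new IllegalArgumentException(
                "Usage: java -jar Crawler.jar DBurl username password xmlConfigFile");
        }
        return new CrawlerArgs(args[0], args[1], args[2], args[3]);
    }
}
```

Assuming DBurl is a JDBC connection URL (the README does not say), an invocation might look like java -jar Crawler.jar jdbc:mysql://localhost:3306/crawler myuser mypassword config.xml, where every value is a placeholder.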
