GitHub - aboharrawi/Crawler: Crawler system that parses specific websites based on an XML config file

This is the beta version of our crawler system. It can crawl any website specified in an XML config file, collect all hyperlinks and data, and store them in a MySQL database.

The crawler is a multithreaded Java application that uses the jsoup API to parse websites.
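To illustrate the multithreaded crawl described above, here is a minimal sketch. It is not the repository's actual code: the class `CrawlSketch`, the method `crawl`, and the `fetchLinks` parameter are hypothetical names chosen for this example. The link extraction is passed in as a function so the sketch runs without network access; in a real crawler that function would presumably wrap a jsoup call.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;
import java.util.function.Function;

/** Hypothetical sketch of a multithreaded link crawl; not the project's own classes. */
final class CrawlSketch {

    /**
     * Visits every page reachable from startUrl using a fixed-size thread pool.
     * fetchLinks maps a URL to the links found on that page. With jsoup it could be
     * built around something like:
     *   Jsoup.connect(url).get().select("a[href]").eachAttr("abs:href")
     * (plus IOException handling), which is an assumption about how the project works.
     */
    static Set<String> crawl(String startUrl,
                             Function<String, List<String>> fetchLinks,
                             int threads) {
        Set<String> visited = ConcurrentHashMap.newKeySet(); // thread-safe "seen" set
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Phaser pending = new Phaser(1); // one party for the calling thread
        submit(startUrl, fetchLinks, visited, pool, pending);
        pending.arriveAndAwaitAdvance(); // blocks until every page task has finished
        pool.shutdown();
        return visited;
    }

    private static void submit(String url,
                               Function<String, List<String>> fetchLinks,
                               Set<String> visited,
                               ExecutorService pool,
                               Phaser pending) {
        if (!visited.add(url)) {
            return; // another thread already claimed this URL
        }
        pending.register(); // one party per outstanding page task
        pool.execute(() -> {
            try {
                for (String link : fetchLinks.apply(url)) {
                    submit(link, fetchLinks, visited, pool, pending);
                }
            } finally {
                pending.arriveAndDeregister(); // this page is done
            }
        });
    }
}
```

The concurrent set ensures each URL is fetched only once even when several threads discover it at the same time, and the `Phaser` lets the caller wait for completion without knowing in advance how many pages will be found.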

The jar file is available at out/artifacts.Crawler_jar/Crawler.jar. Run it as: java -jar Crawler.jar DBurl username password xmlConfigFile
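The four positional arguments in the command above could be handled along these lines. This is a sketch under assumptions, not the project's actual entry point: the class `CrawlerArgs` and its field names are hypothetical, and it assumes `DBurl` is a JDBC URL that can be passed to `java.sql.DriverManager`.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical holder for the four positional command-line arguments. */
final class CrawlerArgs {
    final String dbUrl;
    final String username;
    final String password;
    final String xmlConfigFile;

    private CrawlerArgs(String dbUrl, String username, String password, String xmlConfigFile) {
        this.dbUrl = dbUrl;
        this.username = username;
        this.password = password;
        this.xmlConfigFile = xmlConfigFile;
    }

    /** Parses exactly four arguments, in the order shown in the run command. */
    static CrawlerArgs parse(String[] args) {
        if (args.length != 4) {
            throw new IllegalArgumentException(
                "Usage: java -jar Crawler.jar DBurl username password xmlConfigFile");
        }
        return new CrawlerArgs(args[0], args[1], args[2], args[3]);
    }

    /** Opens a database connection; assumes dbUrl is a JDBC URL and a MySQL driver is on the classpath. */
    Connection connect() throws SQLException {
        return DriverManager.getConnection(dbUrl, username, password);
    }
}
```

Under those assumptions, an invocation might look like java -jar Crawler.jar jdbc:mysql://localhost:3306/crawler user secret config.xml, where the host, database name, credentials, and config file name are all placeholders.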

Folders and files (7 commits):
.idea
jarFiles
out
src
Crawler.iml
README.md
form-beans.xsd
