
crawl a subset #5

Description

@gregwebs

Avoid creating excess traffic and storing excess files. These features would help:

Reducing crawling

  • crawl a subset of pages
  • limit link following
  • interactive mode that reports on all the links it will follow and asks for approval
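A rough sketch of how the crawl-limiting ideas above might fit together, in Python for illustration only. Nothing here comes from the project's code: `should_follow`, `select_links`, `allowed_prefixes`, `max_depth`, and `approve` are all hypothetical names, and the sketch assumes a "subset" means a set of URL prefixes and "limit link following" means a depth cap.

```python
from urllib.parse import urljoin, urlparse


def should_follow(link, base_url, allowed_prefixes, depth, max_depth):
    """Return True if a link is inside the requested subset (hypothetical helper).

    Assumptions: a subset is a list of URL prefixes, and link following
    is limited by how many hops we are from the start page.
    """
    if depth >= max_depth:
        return False  # too many hops from the start page
    absolute = urljoin(base_url, link)  # resolve relative links
    if urlparse(absolute).scheme not in ("http", "https"):
        return False  # skip mailto:, javascript:, and similar
    return any(absolute.startswith(prefix) for prefix in allowed_prefixes)


def select_links(candidates, approve):
    """Interactive-mode sketch: keep only the links the callback approves."""
    return [url for url in candidates if approve(url)]
```

In an interactive run, `approve` could be something like `lambda url: input(f"Follow {url}? [y/N] ").strip().lower() == "y"`; passing it as a callback keeps the filtering logic usable without a terminal.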

Reducing high bandwidth/storage

  • don't download images (possibly past a threshold)
  • don't download videos (I don't actually know whether they are downloaded now)
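One possible way to express the bandwidth/storage rules above, again as a Python sketch rather than anything taken from the project. `should_download` and `max_image_bytes` are hypothetical names. It assumes the decision is made from the response's `Content-Type` and `Content-Length` headers, and that the "threshold" for images is a size in bytes.

```python
def should_download(content_type, content_length=None, max_image_bytes=None):
    """Decide whether to fetch a resource body (hypothetical helper).

    Assumptions: videos are always skipped; images are skipped unless a
    size threshold is set and the reported size is at or below it.
    """
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type.startswith("video/"):
        return False
    if media_type.startswith("image/"):
        if max_image_bytes is None:
            return False  # no threshold configured: skip all images
        # An unknown size is treated as too large, to stay conservative.
        return content_length is not None and content_length <= max_image_bytes
    return True
```

Treating a missing `Content-Length` as "too large" is a design choice that favors saving bandwidth; a crawler that prefers completeness could flip that default.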
