Crawl within a set of sites; once all links of each site in the set have been found, see whether the sites interlink.
Basically, this is a snowball WITHIN a set of sites that runs until no more links can be found WITHIN the sites. Then see whether the sites in the set interlink and draw a cluster map from the result.
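The snowball described above could be sketched roughly as follows. This is a minimal Python illustration, not the Issue Crawler code: the function names (`host`, `snowball_within_sites`), the `fetch_links` callback, and the in-memory `FAKE_WEB` are all hypothetical, and a "site" is assumed to mean everything served from the same host as a seed URL.

```python
from collections import deque
from urllib.parse import urlparse


def host(url):
    """Return the lowercased host part of a URL."""
    return urlparse(url).netloc.lower()


def snowball_within_sites(seeds, fetch_links):
    """Crawl only pages hosted on the seed sites until no new pages turn up.

    Returns a dict mapping each seed host to the set of other seed hosts
    it links to. `fetch_links(url)` is an assumed callback that returns
    the list of links found on a page.
    """
    site_hosts = {host(s) for s in seeds}
    interlinks = {h: set() for h in site_hosts}
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        src = host(page)
        for link in fetch_links(page):
            dst = host(link)
            if dst not in site_hosts:
                continue  # outside the set: never followed, never mapped
            if dst != src:
                interlinks[src].add(dst)  # one site in the set links to another
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return interlinks


# Hypothetical in-memory "web" standing in for real HTTP fetches.
FAKE_WEB = {
    "http://a.example/": ["http://a.example/about", "http://c.example/"],
    "http://a.example/about": ["http://b.example/"],
    "http://b.example/": ["http://b.example/news"],
    "http://b.example/news": ["http://a.example/"],
}

result = snowball_within_sites(
    ["http://a.example/", "http://b.example/"],
    lambda url: FAKE_WEB.get(url, []),
)
```

The returned mapping is the adjacency data a cluster map would be drawn from; the link to `c.example` is dropped because that host is not in the set.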
Or: input URLs, find the URLs' outlinks (3 deep), and map interlinkings between the inputted actors only.
Specification
Input URLs, find URLs' outlinks (3 deep), map interlinkings between inputted actors only.
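One possible reading of this specification, as a hedged Python sketch. The names `map_interlinks` and `fetch_links` and the toy `FAKE_WEB` are hypothetical. Two assumptions are made that the specification does not settle: "3 deep" is taken to mean three rounds of outlink fetching starting from the input URLs, and an edge A → B is recorded only when a page hosted on actor A links to a page hosted on actor B.

```python
from urllib.parse import urlparse


def host(url):
    """Return the lowercased host part of a URL."""
    return urlparse(url).netloc.lower()


def map_interlinks(input_urls, fetch_links, max_depth=3):
    """Follow outlinks for `max_depth` rounds; keep actor-to-actor edges only."""
    actors = {host(u) for u in input_urls}
    edges = set()
    seen = set(input_urls)
    frontier = list(input_urls)
    for _ in range(max_depth):
        next_frontier = []
        for page in frontier:
            src = host(page)
            for link in fetch_links(page):
                dst = host(link)
                # Assumption: only links from one inputted actor to another count.
                if src in actors and dst in actors and src != dst:
                    edges.add((src, dst))
                # Outlinks to any host are followed, but each page only once.
                if link not in seen:
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return edges


# Hypothetical in-memory "web" standing in for real HTTP fetches.
FAKE_WEB = {
    "http://a.example/": ["http://x.example/"],
    "http://x.example/": ["http://a.example/links"],
    "http://a.example/links": ["http://b.example/"],
    "http://b.example/": ["http://y.example/"],
}
INPUTS = ["http://a.example/", "http://b.example/"]

edges = map_interlinks(INPUTS, lambda url: FAKE_WEB.get(url, []))
shallow_edges = map_interlinks(INPUTS, lambda url: FAKE_WEB.get(url, []), max_depth=2)
```

In this toy data the link from `a.example` to `b.example` sits on a page that is only reached in the third round, so it appears with the default depth and is missed with `max_depth=2`.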
Notes (gmc)
Basically, create IssueCrawler.shouldVisit() and have it reject any links that are outside the starting points (sp's).
Do one iteration (= 0 iterations in frontend language) and check what the final iteration does.
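The shouldVisit() filter from these notes might look roughly like the sketch below. It is written in Python for illustration only and is not the crawler's actual class or API: `ScopedCrawlFilter` and `should_visit` are hypothetical names, the `http://` scheme on the test hosts is assumed, and "outside the sp's" is assumed to mean "on a host that is not the host of any starting point".

```python
from urllib.parse import urlparse


class ScopedCrawlFilter:
    """Hypothetical stand-in for the crawler class that owns shouldVisit()."""

    def __init__(self, starting_points):
        # The filter is handed the starting points explicitly, so it does
        # not depend on the crawler exposing them some other way.
        self.allowed_hosts = {
            urlparse(url).netloc.lower() for url in starting_points
        }

    def should_visit(self, url):
        """Accept a link only if its host matches one of the starting points."""
        return urlparse(url).netloc.lower() in self.allowed_hosts


crawl_filter = ScopedCrawlFilter([
    "http://test1.issuecrawler.net/",
    "http://test2.issuecrawler.net/",
])

inside = crawl_filter.should_visit("http://test1.issuecrawler.net/page.html")
outside = crawl_filter.should_visit("http://www.example.org/")
```

Matching on the host alone is a design choice of this sketch; a stricter variant could also compare path prefixes if a starting point is a subdirectory rather than a whole site.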
The Plan:
- Run a test on the devel crawler with test1.issuecrawler.net and test2.issuecrawler.net as input
- Add the shouldVisit() method
- Does class IssueCrawler know the starting points?
- Run another test on the devel crawler
Topic revision: r4 - 30 Sep 2009 - 09:54:09 - Erik Borra