The treemap generator output displays blocks according to size, all in burgundy red. Make an option for other colours and an option for creating a tree map/heat map.
The launch button of treemap generator and raw text 2 treemap reads 'clouds to svg/pdf' but only produces svg (no pdf). Also, the tool does not recognize 'polar bear (10)' as one term. It sizes 'bear' and leaves out 'polar'.
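A possible way to keep multi-word terms together, as a minimal sketch. It assumes the input is one "term (count)" entry per line; the function name `parse_weighted_terms` is hypothetical and not part of the existing tool.

```python
import re

# Assumed input format: one "term (count)" entry per line.
# The lazy group (.+?) lets the term contain spaces, so "polar bear" stays one term.
TERM_RE = re.compile(r"^\s*(.+?)\s*\((\d+)\)\s*$")


def parse_weighted_terms(text):
    """Return (term, count) pairs; lines that do not match the format are skipped."""
    pairs = []
    for line in text.splitlines():
        match = TERM_RE.match(line)
        if match:
            pairs.append((match.group(1), int(match.group(2))))
    return pairs
```

With this, 'polar bear (10)' would be sized as the single term 'polar bear' with weight 10.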
Actor Resonance Tool - Queries Technorati to measure the resonance of an actor for an issue in the Blogosphere.
overall log in system
faceted search for open calais
split 'bla (2)' into 'bla bla', and the reverse. Per line or just as a bag of words
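Both directions of this conversion could look roughly like the sketch below. The names `expand` and `collapse` are hypothetical, and the reverse step is shown for the bag-of-words case only.

```python
import re
from collections import Counter

# Assumed entry format: "term (count)".
ENTRY_RE = re.compile(r"^\s*(.+?)\s*\((\d+)\)\s*$")


def expand(line):
    """'bla (2)' -> 'bla bla'. Lines without a count are returned unchanged."""
    match = ENTRY_RE.match(line)
    if not match:
        return line
    return " ".join([match.group(1)] * int(match.group(2)))


def collapse(text):
    """Bag-of-words reverse: 'bla bla foo' -> ['bla (2)', 'foo (1)']."""
    counts = Counter(text.split())
    # most_common() sorts by count, highest first.
    return [f"{word} ({n})" for word, n in counts.most_common()]
```

For the per-line variant, `collapse` could be applied to each input line separately instead of to the whole text.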
count lines in common in triangulator
Wikipedia Bots, just show edits, make bots optional (checkbox)
install local version of wikipedia (useful for synonyms, wikipedia networks, etc)
scaleWorld interface (reminder: switch tag)
google scholar scraper
yandex and baidu scrapers
issuedramaturg volatility
significance measures on engines: e.g. chi-square
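As an illustration of what such a measure computes, here is a sketch of the Pearson chi-square statistic for a table of counts. The table layout (for example rows = engines, columns = result categories) is an assumption; the item does not specify what would be compared.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows.

    Assumes every row and column has a non-zero total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat
```

The statistic still has to be compared against a chi-square distribution with (rows - 1) * (columns - 1) degrees of freedom to obtain a significance level.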
inversed tag cloud
screenshot generator
Labels in issuegeographer do not display anymore
Issuefeed: issues through time
Option to set the colourscheme for a scheduled issuecrawl. (Now every crawl gets a new colour scheme, it would be very useful if a scheduled crawl could use the same colours for .org, .com, etc, every time it crawls the same source set. This would make comparison over time a lot easier. -- sabine)
the circle map option in issuecrawler is not working: Circle Map - Due to persisting problems, the circle map has been disabled. The circle map returns in 2008.
Webpage History Generator - Uses the Internet Archive's Wayback Machine to make screenshots of all different versions of a site and output a webpage history scroll.
cross device/cross spherical tag cloud generator. This tool proposal is a combination of the Analyze tool and the Tag Cloud generator. In Analyze it is now only possible to compare 2 lists. To make a cross spherical tag cloud the number of possible input lists should be expanded to at least 3 (up to 5). For visualization purposes the output from Analyze ("Sites that are common to list1 and list2"; "Sites that only appear in list2"; "Sites that only appear in list1") can be combined with the Tag Cloud Generator by adding the frequency of appearance of a keyword or site between brackets. When comparing 2 lists, "Sites that are common to list1 and list2" should have (2) added behind every site, while "Sites that only appear in list2" and "Sites that only appear in list1" should have (1) added. The results file would be an .svg that can be further designed in Illustrator. If possible, other design features could be added, such as a different color for each list and for overlapping results (list1=blue, list2=red, overlap=purple), and an Illustrator svg filter that organizes the layout of the results automatically. Cross spherical source comparison / analysis as brought up in SourceDistanceExerciseGroup1
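The counting step of this proposal, generalized to any number of lists, could be sketched as follows. The function names are hypothetical; the output is the "site (n)" format that the Tag Cloud Generator takes as input.

```python
from collections import Counter


def cross_list_counts(*lists):
    """Count in how many input lists each item appears.

    Duplicates within a single list count once.
    """
    counts = Counter()
    for items in lists:
        counts.update(set(items))
    return counts


def as_cloud_input(counts):
    """Format as 'item (n)' lines: most widely shared first, then alphabetical."""
    ordered = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [f"{item} ({n})" for item, n in ordered]
```

For two lists this reproduces the (2) and (1) labels described above; colouring per list would additionally need to keep track of which lists each item came from.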
image scraper that does not include headers & footers in results
tagcloud over time movie maker (1. take all results from archive.org 2. make tag cloud 3. view changes over time: see "growth" and "decline" of words used, thus pointing to a shift in importance)
Tool proposal for scraping Hyves networks in Google Maps.
anchor text scraper for technorati and yahoo! results.
Nytimes.com Archive scraper. from http://nytimes.com/search (Erik, I think this will be useful in combination with your discovery tools - michael)
MeScraper. a scraper that works like coffee but is more healthy.
Done:
NetworkCloud asks for a network_id. In order to achieve better consistency this should be "Insert an Issuecrawler XML file".
New Link Ripper/Harvester tool. When harvesting URLs from Google results for instance, not all URLs are fetched since some don't start with "http" or "www" (such as en.wikipedia... ). Therefore the list is incomplete as input for other tools. Link Ripper fetches all URLs from a page, but this is at the same time the problem. In Google, results are returned twice, once in the title and once at the bottom of each result (green), and in Google Blog Search the title URL and the green URL are different (specific page and host). Cleaning up the URL list can be done partly automatically (using Analyze for double Google results) or manually (Google Blog Search results), but it would be better if this was fully automated. Ripping only title URLs or only green URLs, best if this is optional, would be a useful addition to the DMI toolbox. Or, if possible (and probably best), harvesting not only by "http" and "www" but also by a "." without spaces around it (text.text) or by ".com" ".nl" ".org" etc -> see Tool Harvester, Tool Triangulation, and Tool Compare Lists
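Harvesting by "text.text" rather than by "http" or "www" could be sketched with a regular expression like the one below. The pattern and the name `harvest_urls` are illustrative assumptions, not the Harvester's actual behaviour. The pattern will also pick up look-alikes such as file names (report.pdf) and may keep trailing punctuation at the end of a path.

```python
import re

# Hypothetical pattern: optional scheme, dot-separated labels ending in a
# suffix of two or more letters, then an optional path.
URL_RE = re.compile(
    r"(?:https?://)?(?:[a-z0-9-]+\.)+[a-z]{2,}(?:/[^\s\"'<>]*)?",
    re.IGNORECASE,
)


def harvest_urls(text):
    """Return URL-like strings in order of first appearance, without exact duplicates."""
    seen = set()
    urls = []
    for match in URL_RE.finditer(text):
        url = match.group(0)
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

Dropping exact duplicates covers the doubled Google results only when title URL and green URL are identical; the Google Blog Search case (page versus host) would still need a separate option.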
URL cleaner - return hosts like in Analyse, but without deleting duplicates (and still keeping the same order as input). Completed by Erik, added to analyse tool
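For reference, the behaviour described (hosts only, duplicates kept, input order kept) can be sketched as below. This is not the implementation added to the Analyse tool, only an illustration using Python's standard `urllib.parse`; the name `hosts_in_order` is hypothetical.

```python
from urllib.parse import urlsplit


def hosts_in_order(urls):
    """Return the host of each URL, keeping duplicates and the input order."""
    hosts = []
    for url in urls:
        url = url.strip()
        if not url:
            continue
        # urlsplit only recognizes the host when the URL has a scheme or a
        # leading "//", so bare hosts such as "en.wikipedia.org/..." get "//".
        if "://" not in url:
            url = "//" + url
        hosts.append(urlsplit(url).hostname or "")
    return hosts
```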