Extend the Tracker class to encompass the master copy, local content, and other metadata servers
|Reported by:||jcnelson||Owned by:||jcnelson|
Description (last modified by jcnelson)
Right now, mdcrawlerd gathers the metadata it processes sequentially, with code specific to each method, and slowly (detecting write conflicts is O(n^2)). The solution is to extend the (sitemap-based) Tracker class to track batches of metadata, whether they come from sitemaps, the master copy, local content, or other metadata servers. We need to gather the metadata in parallel, sort it by URL, and only then find write conflicts (O(n log n)) and validate them (by crawling).
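A rough sketch of the gather/sort/scan idea, in Python for illustration only (it is not mdcrawlerd code). It assumes each metadata source can be modelled as a callable returning (url, origin) pairs, and that a "write conflict" means the same URL is claimed by more than one origin. The names gather and find_write_conflicts and both of those assumptions are hypothetical and do not come from the Tracker class.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby
from operator import itemgetter


def gather(sources):
    """Call every source in parallel and flatten the results.

    Each source is a zero-argument callable returning an iterable of
    (url, origin) pairs. This shape is an assumption for the sketch.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        batches = list(pool.map(lambda fetch: list(fetch()), sources))
    return [entry for batch in batches for entry in batch]


def find_write_conflicts(entries):
    """Sort by URL, then scan runs of equal URLs: O(n log n) overall.

    Assumed definition: a URL is in conflict when more than one distinct
    origin claims it.
    """
    ordered = sorted(entries, key=itemgetter(0))
    conflicts = {}
    for url, group in groupby(ordered, key=itemgetter(0)):
        origins = sorted({origin for _, origin in group})
        if len(origins) > 1:
            conflicts[url] = origins
    return conflicts


if __name__ == "__main__":
    # Stand-ins for a sitemap, the master copy, and local content.
    demo_sources = [
        lambda: [("http://a/x", "sitemap"), ("http://a/y", "sitemap")],
        lambda: [("http://a/x", "master")],
        lambda: [("http://a/z", "local")],
    ]
    print(find_write_conflicts(gather(demo_sources)))
```

Because equal URLs are adjacent after sorting, one linear pass replaces the pairwise comparison. The conflicts found this way would still need to be validated by crawling, which the sketch leaves out.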
Additional config syntax:
MDSERVER = "www.md-url.com/cgi-bin/mdcgi" "/path/to/entries"
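One possible way to read such a line, sketched in Python under the assumption that the directive is a key, an equals sign, and exactly two double-quoted fields. The ticket does not spell out what each field means, so the hypothetical helper parse_mdserver returns them as a plain pair without interpreting them.

```python
import shlex


def parse_mdserver(line):
    """Split an MDSERVER line into its two quoted fields.

    Assumes the form: MDSERVER = "<first field>" "<second field>".
    """
    key, sep, value = line.partition("=")
    if not sep or key.strip() != "MDSERVER":
        raise ValueError("not an MDSERVER directive: %r" % line)
    fields = shlex.split(value)  # honours the double quotes
    if len(fields) != 2:
        raise ValueError("expected two quoted fields, got %d" % len(fields))
    return fields[0], fields[1]


if __name__ == "__main__":
    print(parse_mdserver('MDSERVER = "www.md-url.com/cgi-bin/mdcgi" "/path/to/entries"'))
```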
Change History (4)
comment:3 Changed 6 years ago by jcnelson
- Description modified (diff)
- Priority changed from major to critical
- Summary changed from "Make it possible for a metadata server to crawl other metadata servers" to "Extend the Tracker class to encompass the master copy, local content, and other metadata servers"