Ticket #21 (closed task: fixed)

Opened 3 years ago

Last modified 3 years ago

Extend the Tracker class to encompass the master copy, local content, and other metadata servers

Reported by: jcnelson
Owned by: jcnelson
Priority: critical
Milestone:
Component: component1
Version:
Keywords:
Cc:

Description (last modified by jcnelson)

Right now, mdcrawlerd gathers the metadata it processes sequentially, with code specific to each method, and slowly (detecting write conflicts is O(n^2)). The solution is to extend the (sitemap-based) Tracker class to track batches of metadata--be they sitemaps, the master copy, local content, or other metadata servers. We need to gather the metadata in parallel, sort it by URL, and THEN find write conflicts (O(n log n)), and then validate them (by crawling).
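A minimal Python sketch of the gather/sort/detect steps described above. It is not the mdcrawlerd code: the function names, the shape of a source (a name mapped to a callable that returns URLs), and the definition of a write conflict as "the same URL reported by more than one source" are all assumptions made for illustration. The validation-by-crawling step is left out.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby
from operator import itemgetter


def gather_parallel(sources):
    """Fetch from every source concurrently; return (url, source_name) pairs.

    `sources` is a hypothetical mapping of source name -> zero-argument
    callable returning an iterable of URLs.
    """
    def run(item):
        name, fetch = item
        return [(url, name) for url in fetch()]

    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        batches = list(pool.map(run, sources.items()))
    return [entry for batch in batches for entry in batch]


def find_write_conflicts(entries):
    """Sort by URL, then scan adjacent entries: O(n log n) overall.

    Assumption: a conflict is a URL reported by more than one source.
    """
    ordered = sorted(entries, key=itemgetter(0))
    conflicts = {}
    for url, group in groupby(ordered, key=itemgetter(0)):
        names = sorted({name for _, name in group})
        if len(names) > 1:
            conflicts[url] = names
    return conflicts


if __name__ == "__main__":
    demo_sources = {
        "sitemap": lambda: ["http://a/x", "http://a/y"],
        "master": lambda: ["http://a/y", "http://a/z"],
    }
    print(find_write_conflicts(gather_parallel(demo_sources)))
```

Sorting first means that entries for the same URL end up adjacent, so one linear pass finds every conflict instead of comparing each entry against every other.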

Additional config syntax:
MDSERVER = "www.md-url.com/cgi-bin/mdcgi" "/path/to/entries"
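One way such a line could be read, sketched in Python. The ticket does not say what the two quoted fields mean, so the key names `url` and `entries_path` and the function name are hypothetical, and the real mdcrawlerd config parser may work differently.

```python
import shlex


def parse_mdserver_line(line):
    """Split an MDSERVER line into its two quoted fields.

    The field names in the returned dict are assumptions, not taken
    from the ticket.
    """
    key, sep, rest = line.partition("=")
    if not sep or key.strip() != "MDSERVER":
        raise ValueError("not an MDSERVER line: %r" % line)
    fields = shlex.split(rest)  # honours the double quotes
    if len(fields) != 2:
        raise ValueError("expected two quoted fields, got %d" % len(fields))
    return {"url": fields[0], "entries_path": fields[1]}


if __name__ == "__main__":
    print(parse_mdserver_line(
        'MDSERVER = "www.md-url.com/cgi-bin/mdcgi" "/path/to/entries"'
    ))
```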

Change History

comment:1 Changed 3 years ago by jcnelson

Crawling other metadata servers is not recommended in practice--if A subscribes to B, A will need to store both A's and B's metadata. Use sparingly.

comment:2 Changed 3 years ago by jcnelson

  • Description modified (diff)

comment:3 Changed 3 years ago by jcnelson

  • Priority changed from major to critical
  • Description modified (diff)
  • Summary changed from "Make it possible for a metadata server to crawl other metadata servers" to "Extend the Tracker class to encompass the master copy, local content, and other metadata servers"

comment:4 Changed 3 years ago by jcnelson

  • Status changed from new to closed
  • Resolution set to fixed