Opened 6 years ago

Closed 6 years ago

#21 closed task (fixed)

Extend the Tracker class to encompass the master copy, local content, and other metadata servers

Reported by: jcnelson
Owned by: jcnelson
Priority: critical
Milestone:
Component: component1
Version:
Keywords:
Cc:

Description (last modified by jcnelson)

Right now, mdcrawlerd gathers the metadata it processes sequentially, with code specific to each gathering method, and slowly (detecting write conflicts is O(n^2)). The solution is to extend the (sitemap-based) Tracker class to track batches of metadata, whether they come from sitemaps, the master copy, local content, or other metadata servers. We need to gather the metadata in parallel, sort it on URL, and only THEN find write conflicts (O(n log n) overall), and finally validate the conflicting entries by crawling.
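
The gather, sort, and scan steps described above could be sketched in Python roughly as follows. This is only an illustration, not mdcrawlerd's actual code: the Entry record, the gather_all and find_write_conflicts names, and the working definition of a write conflict (the same URL reported by more than one source) are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby
from operator import attrgetter
from typing import Callable, Iterable, NamedTuple


class Entry(NamedTuple):
    """Hypothetical metadata record; the real Tracker's fields may differ."""
    url: str
    source: str  # e.g. "sitemap", "master", "local", "mdserver"
    mtime: int


def gather_all(sources: Iterable[Callable[[], list[Entry]]]) -> list[Entry]:
    """Run every source's gather function concurrently and merge the results."""
    sources = list(sources)
    if not sources:
        return []
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        batches = list(pool.map(lambda fetch: fetch(), sources))
    return [entry for batch in batches for entry in batch]


def find_write_conflicts(entries: list[Entry]) -> dict[str, list[Entry]]:
    """Sort on URL (O(n log n)), then scan each run of equal URLs once (O(n))."""
    by_url = sorted(entries, key=attrgetter("url"))
    conflicts: dict[str, list[Entry]] = {}
    for url, run in groupby(by_url, key=attrgetter("url")):
        group = list(run)
        # Assumed rule: a URL claimed by more than one source is a conflict.
        if len({entry.source for entry in group}) > 1:
            conflicts[url] = group
    return conflicts
```

Sorting first means entries for the same URL end up adjacent, so a single linear pass finds every candidate conflict instead of comparing every pair. The entries returned by find_write_conflicts would then be the ones to validate by crawling.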

Additional config syntax:
MDSERVER = "www.md-url.com/cgi-bin/mdcgi" "/path/to/entries"
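
As a rough sketch of how such a line might be read, the Python below splits an MDSERVER directive into its two quoted values. The ticket does not spell out what each value means or how mdcrawlerd parses its config, so the parse_mdserver name and the use of shell-style quoting rules are assumptions for illustration only.

```python
import shlex


def parse_mdserver(line: str) -> tuple[str, str]:
    """Split an MDSERVER line into its two quoted values (hypothetical helper)."""
    key, sep, value = line.partition("=")
    if not sep or key.strip() != "MDSERVER":
        raise ValueError(f"not an MDSERVER directive: {line!r}")
    # shlex.split honours the double quotes around each value.
    fields = shlex.split(value)
    if len(fields) != 2:
        raise ValueError(f"expected two quoted values, got {len(fields)}")
    return fields[0], fields[1]
```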

Change History (4)

comment:1 Changed 6 years ago by jcnelson

Crawling other metadata servers is not recommended in practice: if A subscribes to B, A will need to store both A's and B's metadata. Use sparingly.

comment:2 Changed 6 years ago by jcnelson

  • Description modified (diff)

comment:3 Changed 6 years ago by jcnelson

  • Description modified (diff)
  • Priority changed from major to critical
  • Summary changed from "Make it possible for a metadata server to crawl other metadata servers" to "Extend the Tracker class to encompass the master copy, local content, and other metadata servers"

comment:4 Changed 6 years ago by jcnelson

  • Resolution set to fixed
  • Status changed from new to closed