...
Once you have completed this setup with a publisher, let me know so that the GoldenFish backend team can start ingesting the collected URLs.
Rss Detector
We have created an RssDetector that accepts a whitelist of domain names provided by a client and checks which of those domains have an RSS feed. Once a feed is confirmed, its validated URL is forwarded to the RssFetcherService so that any URLs published through the feed in the future will be profiled.
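The text above does not say how RssDetector decides that a domain has a feed. As an illustration only, here is a minimal sketch of one common approach, RSS autodiscovery, which looks for a `<link rel="alternate" type="application/rss+xml">` tag in a page's HTML. The names `FeedLinkParser` and `find_feed_urls` are hypothetical and are not part of RssDetector; the sketch works on HTML that has already been downloaded and does no network access itself.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME type conventionally used to advertise an RSS feed in a page's <head>.
RSS_MIME_TYPE = "application/rss+xml"


class FeedLinkParser(HTMLParser):
    """Collects RSS feed URLs advertised through <link rel="alternate"> tags.

    Hypothetical helper for illustration; not the actual RssDetector logic.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None (e.g. <link rel>), so normalise to "".
        attributes = {name: (value or "") for name, value in attrs}
        rel_values = attributes.get("rel", "").lower().split()
        mime_type = attributes.get("type", "").lower()
        href = attributes.get("href", "")
        if "alternate" in rel_values and mime_type == RSS_MIME_TYPE and href:
            # Resolve relative hrefs such as "/feed.xml" against the page URL.
            self.feeds.append(urljoin(self.base_url, href))


def find_feed_urls(html, base_url):
    """Return the RSS feed URLs advertised in `html`, in document order."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds
```

A domain whose home page yields an empty list here may still have a feed at a conventional path, so a real detector would likely need additional checks beyond this one.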
Input: a CSV file in S3
{bucket}/{messageId}/input/
Output: three files in S3: a statistics file, a configuration file (the configuration to add to RssFetcherService), and a file listing the hosts to remove from the blacklist
{bucket}/{messageId}/output/stats.txt
{bucket}/{messageId}/output/configuration.yml
{bucket}/{messageId}/output/blacklisted-hosts.txt
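The path layout above can be captured in a small helper so that callers do not assemble the strings by hand. This is a sketch under assumptions: the function name `detector_paths` and the dictionary keys are hypothetical, and only the path templates themselves come from the description above.

```python
def detector_paths(bucket, message_id):
    """Build the S3 locations used by one RssDetector run.

    Hypothetical helper: the templates mirror the documented layout,
    but the function name and the dictionary keys are illustrative.
    """
    base = f"{bucket}/{message_id}"
    return {
        # Location where the input CSV file is expected.
        "input": f"{base}/input/",
        # The three output files produced by a run.
        "stats": f"{base}/output/stats.txt",
        "configuration": f"{base}/output/configuration.yml",
        "blacklisted_hosts": f"{base}/output/blacklisted-hosts.txt",
    }
```

For example, `detector_paths("my-bucket", "msg-001")["stats"]` gives `my-bucket/msg-001/output/stats.txt`, where both argument values are placeholders.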