domysee9m

I developed a small tool to find RSS feeds for websites. You can try it out here: https://lighthouseapp.io/tools/feed-finder

In >90% of cases the standard way of checking meta tags is enough to find the feeds. But my goal for this tool is that it finds feeds regardless of whether they're linked anywhere. In other words, if this feed finder doesn't find a feed, no feed exists.

It's a big goal and admittedly not there yet, but it does a few things that are a step in that direction.

* Checks meta tags of parent pages (sometimes the article itself doesn't have the meta tag, but the main blog page does)

* Checks common suffixes like /rss, /index.xml and many others (sometimes the feed exists but isn't linked)

* Checks the sitemap

* Checks all links on the page

* Checks 3rd party feeds (OpenRSS for now, when I find more such repositories I'll add them too)
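
As a rough illustration of the parent-page and common-suffix checks listed above (not the tool's actual code), the candidate URLs could be generated along these lines. The suffix list and the function names `parent_pages` and `candidate_feed_urls` are assumptions made for this sketch; the post only names /rss and /index.xml.

```python
from urllib.parse import urlsplit, urlunsplit

# Illustrative suffixes only; the tool's real list is not given in the post.
COMMON_SUFFIXES = ["/rss", "/index.xml", "/feed", "/atom.xml", "/rss.xml"]


def parent_pages(url: str) -> list[str]:
    """Return the URL itself followed by each parent path up to the site root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    pages = []
    # Walk from the full path down to the root: /a/b/c, /a/b, /a, /
    for depth in range(len(segments), -1, -1):
        path = "/" + "/".join(segments[:depth])
        pages.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return pages


def candidate_feed_urls(url: str) -> list[str]:
    """Combine every parent page with every suffix, skipping duplicates."""
    seen = set()
    candidates = []
    for page in parent_pages(url):
        base = page.rstrip("/")
        for suffix in COMMON_SUFFIXES:
            candidate = base + suffix
            if candidate not in seen:
                seen.add(candidate)
                candidates.append(candidate)
    return candidates
```

Each candidate would still need to be fetched and checked for actual feed content, which is where the request count grows quickly.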

There are a couple of additional ideas I have, like checking search engines and crawling the entire domain (highly inefficient, but possible).

Would love if you could try it, and even more if you post sites where it doesn't work.

53 comments
  • rollcat9m

    Quick rant about websites that go to all the trouble of having an RSS feed but don't link to it in the <head>... I don't want to go hunting for the cute orange button, I want to copy and paste "https://example.com" into my feed reader and let the computer handle the work.

    If you maintain any website with a news feed, go right now and check that you have this in your <head>:

        <link rel="alternate" type="application/rss+xml" href="/rss.xml" title="News feed" />
                                                               ^^^^^^^^ change! ^^^^^^^^^
    
    (Also note whether and where you need to use application/rss+xml, application/atom+xml, or application/json.)
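
For anyone who wants to check their own pages, here is a minimal sketch of how a reader might pick up that tag, using only Python's standard `html.parser`. The names `FeedLinkParser` and `find_feed_links` are made up for this example, and the accepted types are just the three mentioned above; real feed readers may accept more.

```python
from html.parser import HTMLParser

# The three content types mentioned in the comment above.
FEED_TYPES = {"application/rss+xml", "application/atom+xml", "application/json"}


class FeedLinkParser(HTMLParser):
    """Collects (href, type) pairs from <link rel="alternate"> feed tags."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags also end up here by default.
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rel = attr.get("rel", "").lower().split()
        link_type = attr.get("type", "").lower()
        if "alternate" in rel and link_type in FEED_TYPES:
            self.feeds.append((attr.get("href", ""), link_type))


def find_feed_links(html: str):
    """Return every advertised feed link found in the given HTML."""
    parser = FeedLinkParser()
    parser.feed(html)
    return parser.feeds
```

Note that `application/json` is a generic type, so a match on it is only a hint that the link might be a feed.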
    • awanderingmind9m

      Thanks for this comment, it encouraged me to go and add this to the <head> of my blog.

  • jcul9m

    This is great, it's hard to believe sites can have RSS feeds but make it so difficult to find.

    I suspect some sites are just running some framework that enables it and don't even realize they have one.

    I have used this site in the past to find feeds: https://www.rsssearchhub.com/

    In the past I was looking for a feed for https://ra.co, but could not find it, though I had seen old posts referencing an RSS feed.

    I ended up emailing them and, to my delight, they let me know they still have an unsupported RSS feed here:

    https://ra.co/xml/rss_news.xml

    Just for feedback, this tool doesn't find the feed, though it doesn't look like a standard URL to me.

    • domysee9m

      Definitely not a standard path, but good to know for testing, thank you!

  • LorenDB9m

    If I can't find an RSS link directly, I generally copy the root URL into archive.org and search for all URLs matching "xml", which includes content type, not just URL names.
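
The comment describes a manual search in the archive.org web interface. A scripted equivalent might look roughly like the sketch below, which only builds a query URL for the Wayback Machine CDX API. Using that API here is an assumption on top of the comment, the function name `wayback_xml_query` is made up, and the filter covers the content type only, not URL names.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def wayback_xml_query(domain: str, limit: int = 50) -> str:
    """Build a CDX query URL listing archived captures whose MIME type mentions xml."""
    params = [
        ("url", domain),
        ("matchType", "domain"),         # include subdomains and all paths
        ("filter", "mimetype:.*xml.*"),  # regex applied to the mimetype field
        ("collapse", "urlkey"),          # one row per distinct URL
        ("fl", "original,mimetype"),     # only return these two fields
        ("output", "json"),
        ("limit", str(limit)),
    ]
    return CDX_ENDPOINT + "?" + urlencode(params)
```

Fetching the returned URL and scanning the `original` column for feed-like paths is left out, since it needs network access.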

  • superkuh9m

    This is 100% a feature that should be in the browser, not a third party tool. I still use a very old version of Firefox for this. Too bad Mozilla decided auto-discovery wasn't necessary in 2016 and removed it. Then two years later they claimed no one was aware of RSS/Atom feeds or used them (I wonder why?!?). All so they could try to replace it with their profit/adware that is Pocket, and we all know how that went.

    >Mozilla is working on alternatives such as Pocket or Reader Mode, and on improving WebExtensions which could provide features related to RSS/Atom feeds without the toll on maintenance. (ref: https://www.ghacks.net/2018/07/25/mozilla-plans-to-remove-rs...)

  • AiAi9m

    Interesting. These days I was trying to subscribe to some blogs, and they didn't have an RSS button on their page, so I had to inspect the page to find the feed URL. Not sure why you'd keep an RSS feed but hide it from visitors. It could be that they expected the feed reader to identify it, but since I was using Thunderbird it did not.

    • domysee9m

      Most feed readers at least find feeds that are linked with a link tag in the <head>, as in <link rel="alternate" type="application/rss+xml" ... />

      Probably they're expecting people to just paste the website URL into the feed reader and have the reader identify the feed. But it would be nice to see the RSS URL linked somewhere.

    • Klonoar9m

      Some of these cases are sites that are built on a CMS that exposes RSS by default, but people don’t consider showing a link/button/whatever in their design.

  • account429m

    > Application error: a client-side exception has occurred (see the browser console for more information).

    Ok then.

    Also, this would make more sense as a browser extension. Especially if it brought back the RSS icon in the address bar to indicate when a feed is available (although maybe you don't want it to do all of the checks until prompted).

    • domysee9m

      Which URL did you try?

      Yeah the checks are quite expansive, depending on the URL it might be more than a hundred requests.

      A browser extension would make sense. Guess I have another project :D

      • djbusby9m

        100!? I have a tool to find feeds from sites - checks like 4 things.

        • mdp20219m

          Well, it must miss many then. My list alone already contains these (and omits a few variations, e.g. with 'atom'):

            .../rss , .../rss.xml , .../.rss , .../rss_full.xml , .../feed , .../rss-feed , .../feed/all/ , .../MySection.xml , .../MySection.atom , feedserver.example.com/section/index
  • sodality29m

    Great idea. I tried it with my personal site (https://matthew.science) and it didn't find any feeds. The site admittedly doesn't have any meta tags, but the feed is linked in the footer at https://matthew.science/atom.xml. It was the default feed URL for my SSG. I'd recommend adding this to the common suffix list.

    • domysee9m

      This I must check, it looks standard enough that the tool should've found it. Thanks for the feedback!

  • Cieric9m

    Tried the hacker news front page (https://news.ycombinator.com/news) and when clicking on OpenRSS I get this error:

    TypeError: URL constructor: is not a valid URL. [NextJS] (5603-cb6f1c5a9761f9d0.js:14:5466)

    Browser is Firefox 130.0 on Windows.

    Would be really nice to see this working really well since I search for RSS feeds a lot for a bunch of different things. Whether the RSS feed is good is always another question.

    • domysee9m

      I don't get the error on my machine, but there probably is a timing issue somewhere. Thanks for letting me know!

  • DamonHD9m

    FYI it's only finding one (Atom) feed at earth.org.uk, even though there are several feeds, Atom and RSS.

    Your method described above should have found at least two feeds I think.

    • domysee9m

      Interesting, I'll check that, thanks for letting me know!

  • freetonik9m

    I've been using an NPM package called rss-url-finder [1] in my blog search engine project to find the RSS link. It works relatively well, but still fails sometimes. For now I end up manually searching the source code of the HTML page for .xml or a similar link.

    [1] https://www.npmjs.com/package/rss-url-finder

  • Circlecrypto29m

    I am very grateful for this actually. I still read RSS and when I find a good news site I tend to spend 15 minutes or more looking for their feed.

  • jayemar9m

    Are you opposed to this being used programmatically? I've been working on a site [0] that replays feeds, but the initial step is to first find the feed given a website, and it's not always able to find it. I'd be interested in using your service to try to find the feed when I'm unable to do so.

    [0] https://refeed.to

    • pogue9m

      Can you explain what the purpose of replaying a feed is?

      • jayemar9m

        My initial use case was for reading content from blogs that had been published before I'd subscribed to their feed. I could visit their site and read their previous posts, but I much prefer the slow drip of an RSS feed. So I created refeed.to to be able to add 1 post per day from the blog to my feed starting from their first post.

        Since creating it I also use it to inject a few extra cartoons into my feed (xkcd every day!) and have also had fun with tech flashbacks from trustedreviews.com. So it's just a way to add a little variation to my feed.
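
The "one post per day, starting from the first post" idea can be sketched as a simple scheduling function. This is only an illustration of the concept, not how refeed.to is implemented; the name `replay_schedule` and the tuple layout are assumptions for the example.

```python
from datetime import date, timedelta


def replay_schedule(posts, start, per_day=1):
    """Assign each old post a new release date, oldest post first.

    posts: list of (published_date, title) tuples in any order.
    start: the date on which the replay begins.
    """
    ordered = sorted(posts, key=lambda post: post[0])
    return [
        (start + timedelta(days=index // per_day), title)
        for index, (_, title) in enumerate(ordered)
    ]
```

With `per_day=1` the first post is released on `start`, the second on the following day, and so on.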

    • domysee9m

      Sure, email me at dominik at lighthouseapp.io

    • domysee9m

      This is great, thank you!

  • nanna9m

    Great work! I've stopped using Twitter but I managed to taper from it by following things using RSS feeds drawn from Nitter. Don't know if that still works but could be an idea?

    • domysee9m

      Twitter feeds would definitely be great to have, will check Nitter to see how I can get them. Thanks for the suggestion!

  • validatori9m

    Also add .feed to the common suffixes. Example: https://wiadomosci.onet.pl/.feed

  • chuanliang9m

    Great tool.

    I always use RSSHub Radar, and your tool supports more websites than RSSHub Radar does.

    Detection of /feed could be added; most WordPress-powered sites have this suffix.

  • cranberryturkey9m

    Cool. I wrote a script to search Google and find sites with RSS feeds so I can create a collection on a particular topic.

    • domysee9m

      That's awesome. Is there any specific search text you used to find the feeds? I know Bing has a command to do that but don't know about Google.

      • djbusby9m

        Don't forget DDG and Kagi - they might have some tools too

  • richardbui959m

    I tried it on my website, ebookany.com, but it didn't find anything. So sad :(( But your idea is quite interesting.

    • domysee9m

      That's good to know, thank you, it helps me debug

  • stuaxo9m

    I bet this finds some feeds that sites don't know or have forgotten they even have.

  • oidar9m

    The tool misses reddit rss feeds.

    • domysee9m

      Thanks for the hint, will fix that!

  • AIPodNav-Team9m

    Can't find the Lex Fridman podcast's feed: https://lexfridman.com/

  • asddubs9m

    my suggestion is a way to have users of the extension suggest a feed URL if it doesn't find one

  • GavCo9m

    Cool. I'm a big fan of RSS feeds.

    Wondering if it's necessary to continue with the other checks if you find a feed in the meta tags?

    • domysee9m

      Probably not, but I'm trying to find all feeds.

      I guess the best option is to show results as soon as they are found, without waiting for everything to complete.
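
One way to show results as soon as they are found is to run the checks concurrently and yield each result when its check finishes. The sketch below uses Python's `concurrent.futures`; the check functions are hypothetical stand-ins that sleep instead of making requests, and nothing here reflects how the tool itself is built.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def stream_results(checks, url):
    """Run every check concurrently and yield (name, feeds) as each one finishes."""
    with ThreadPoolExecutor(max_workers=max(1, len(checks))) as pool:
        futures = {pool.submit(fn, url): name for name, fn in checks.items()}
        for future in as_completed(futures):
            yield futures[future], future.result()


# Hypothetical stand-ins for real checks; they sleep instead of doing network I/O.
def check_meta_tags(url):
    time.sleep(0.01)
    return [url.rstrip("/") + "/rss.xml"]


def check_sitemap(url):
    time.sleep(0.05)
    return []


CHECKS = {"meta tags": check_meta_tags, "sitemap": check_sitemap}
```

A caller can loop over `stream_results(CHECKS, url)` and render each batch of feeds immediately, so a fast meta tag hit is visible while slower checks are still running.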

  • cxr9m

    [deleted]

    • domysee9m

      That's super interesting, will definitely try it, thank you!

  • dotBen9m

    RIP Google Reader
