Looking at RSS User-Agents
February 4th, 2021
meta, rs, tech
An RSS reader sends periodic requests to get the latest feed. Each request
includes a User-Agent field, identifying which fetcher is running:
    Feedbin feed-id:1242010 - 38 subscribers
This fetcher is nicely passing along statistics, saying how many readers it represents.
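As a rough sketch of how such a count could be pulled out of a User-Agent string: the helper below (the name `parse_ua_stats` is made up for illustration) assumes the count is the number directly before the word "subscriber" or "reader", as in the Feedbin example above. Other fetchers may format their statistics differently.

```shell
# parse_ua_stats (hypothetical helper): read User-Agent strings on stdin and
# print "<count><TAB><user-agent>" for each one that reports a count.
# Assumption: the count is the number immediately before "subscriber" or
# "reader"; strings without such a number are skipped.
parse_ua_stats() {
  awk '{
    if (match($0, /[0-9]+ (subscriber|reader)/)) {
      n = substr($0, RSTART, RLENGTH)   # e.g. "38 subscriber"
      sub(/ .*/, "", n)                 # keep only the number
      print n "\t" $0
    }
  }'
}

# Usage: echo "Feedbin feed-id:1242010 - 38 subscribers" | parse_ua_stats
```

Requiring a number before the keyword keeps a User-Agent that merely contains the word "reader" from being counted as one that gives stats.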
I took one day of logs, with 5,962 requests for my RSS feed:
    $ sudo grep '"GET /news.rss ' \
        /var/log/nginx/access.log.1 \
      | awk -F'"' '{print $6}' \
      | wc -l
    5962
There were 162 unique User-Agents:
    $ sudo grep '"GET /news.rss ' \
        /var/log/nginx/access.log.1 \
      | awk -F'"' '{print $6}' \
      | sort \
      | uniq \
      | wc -l
    162
Of the 5,962 requests, 932 (16%) gave stats:
    $ sudo grep '"GET /news.rss ' \
        /var/log/nginx/access.log.1 \
      | awk -F'"' '{print $6}' \
      | grep 'subscriber\|reader' \
      | wc -l
    932
They sent 21 distinct User-Agents:
    $ sudo grep '"GET /news.rss ' \
        /var/log/nginx/access.log.1 \
      | awk -F'"' '{print $6}' \
      | grep 'subscriber\|reader' \
      | sort \
      | uniq \
      | wc -l
    21
Some sent multiple requests with different numbers of subscribers:
    Feedbin feed-id:1242010 - 38 subscribers
    Feedbin feed-id:372940 - 11 subscribers
    Feedbin feed-id:382 - 1 subscribers
I suspect this comes from people using old URLs that then get redirected to my current URL. For example, now it's https://www.jefftk.com/news.rss, but it used to be http://www.jefftk.com/news.rss, and even longer ago it was an sccs.swarthmore.edu address. Summing subscriber counts, I see:
- Feedly: 573
- inoreader.com: 87
- NewsBlur: 62
- Feedbin: 50
- theoldreader.com: 34
- Dreamwidth Studios: 7
- BazQux: 5
- Bloglovin: 2
- Feed Wrangler: 2
- pine.blog: 1
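One way the per-service totals could be computed is sketched below. This is not necessarily how the numbers above were produced. The helper name `sum_subscribers` is hypothetical, and it assumes that the service name appears verbatim in the User-Agent and that each distinct User-Agent should be counted once.

```shell
# sum_subscribers NAME (hypothetical helper): read User-Agent strings on
# stdin and print the total subscriber/reader count reported by the distinct
# User-Agents that contain NAME.
# Assumptions: NAME appears verbatim in the User-Agent, and the count is the
# number immediately before "subscriber" or "reader".
sum_subscribers() {
  sort -u | awk -v name="$1" '
    index($0, name) && match($0, /[0-9]+ (subscriber|reader)/) {
      n = substr($0, RSTART, RLENGTH)   # e.g. "11 subscriber"
      sub(/ .*/, "", n)                 # keep only the number
      total += n
    }
    END { print total + 0 }'
}

# Usage, with User-Agent strings extracted as in the commands above:
#   ... | awk -F'"' '{print $6}' | sum_subscribers Feedbin
```

Fed the three Feedbin User-Agents shown earlier, this prints 50, which matches the Feedbin line in the list. If a fetcher's reported count changes during the day, the old and new strings are distinct and would both be added, so the totals are best treated as approximate.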
Different services fetched at different intervals. Taking the shortest interval for each distinct User-Agent:
- Feedly: 7min
- Feedbin: 15min
- Bloglovin: 30min
- Dreamwidth Studios: 30min
- Feed Wrangler: 30min
- NewsBlur: 30min
- BazQux: 40min
- inoreader.com: 1hr
- theoldreader.com: 2hr
- pine.blog: 24hr
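The shortest interval per User-Agent could be computed along these lines. This is a sketch under several assumptions, not the exact method used for the list above: the helper name `min_intervals` is made up, and the log is assumed to be in nginx's "combined" format, in chronological order, with every timestamp in the same UTC offset and the same calendar month.

```shell
# min_intervals (hypothetical helper): read nginx "combined" access-log lines
# on stdin and print "<shortest gap in seconds><TAB><user-agent>" for every
# User-Agent seen at least twice, smallest gap first.
# Assumptions: chronological input, one UTC offset, one calendar month.
min_intervals() {
  awk -F'"' '{
    ua = $6
    # $1 looks like: 1.2.3.4 - - [04/Feb/2021:06:25:01 +0000]
    i = index($1, "[")
    if (!i) next
    ts = substr($1, i + 1, 20)          # 04/Feb/2021:06:25:01
    split(ts, p, "[/:]")                # p[1]=day p[4]=hour p[5]=min p[6]=sec
    t = ((p[1] * 24 + p[4]) * 60 + p[5]) * 60 + p[6]
    if (ua in last) {
      gap = t - last[ua]
      if (!(ua in best) || gap < best[ua]) best[ua] = gap
    }
    last[ua] = t
  }
  END { for (ua in best) print best[ua] "\t" ua }' | sort -n
}

# Usage, with the same filter as the commands above:
#   sudo grep "GET /news.rss " /var/log/nginx/access.log.1 | min_intervals
```

Gaps are reported in seconds, so 420 corresponds to the 7min entry style used in the list. A service that fetches from several machines with the same User-Agent could show a shorter gap than its real per-fetcher interval.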