I've put up two new ways to get at the UMW online community data: a "Starts With" search and a "Contains" search. (See also here.) The bonus is that they're helping to expose some flaws in the scrapers.
I'm using SimplePie to auto-discover the feed for each individual blog. If all went well, it would find an Atom feed or, failing that, an RSS2 feed for the blog. In rare cases, though, it looks like it picks up the feed for the comments on a single post rather than the feed for the containing blog. Grr.
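To make that discovery order concrete (Atom first, RSS as the fallback), here is a minimal Python sketch. It is not SimplePie and doesn't claim to mirror SimplePie's internals; it only assumes the blog's HTML advertises its feeds with standard `<link rel="alternate">` tags. The names `FeedLinkCollector` and `pick_feed` are made up for this illustration.

```python
from html.parser import HTMLParser

# Preference order: Atom first, then RSS.
FEED_TYPES = ["application/atom+xml", "application/rss+xml"]


class FeedLinkCollector(HTMLParser):
    """Collects <link rel="alternate"> tags that advertise a feed."""

    def __init__(self):
        super().__init__()
        self.links = []  # list of (mime type, href, title)

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        mime = (a.get("type") or "").lower()
        if "alternate" in rel and mime in FEED_TYPES and a.get("href"):
            self.links.append((mime, a["href"], a.get("title") or ""))


def pick_feed(html):
    """Return the first Atom feed href, else the first RSS href, else None."""
    collector = FeedLinkCollector()
    collector.feed(html)
    for wanted in FEED_TYPES:
        for mime, href, _title in collector.links:
            if mime == wanted:
                return href
    return None
```

Note that a picker like this trusts whatever the page advertises. If a template lists a comments feed under the preferred type ahead of the blog feed, the comments feed wins, which is one plausible way to end up with the wrong feed.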
My first line of attack will probably be to check whether there's anything odd in the WordPress templates of those blogs. That'll be interesting and helpful, but ultimately beside the point, since the premise is to work with the feeds. So I'm hoping to find reliable clues within the feeds themselves or, even better, some nuance of SimplePie's API that'll help me sort it out. I have a class extending SimplePie that steers the feed discovery mechanism toward Atom, but it looks like that might need some work.
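One clue that might be usable from within the feed itself is its title. The Python sketch below rests on an assumption I haven't checked against the UMW blogs: that WordPress comment feeds carry titles beginning with "Comments on" (per-post) or "Comments for" (site-wide). The function names are hypothetical, and a theme or plugin could change the titles, so treat this as a heuristic rather than a reliable test.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Assumed WordPress conventions for comment-feed titles (not verified here).
COMMENT_TITLE_PREFIXES = ("comments on", "comments for")


def feed_title(xml_text):
    """Return the feed-level title of an Atom or RSS 2.0 document, or ''."""
    root = ET.fromstring(xml_text)
    if root.tag == ATOM + "feed":
        node = root.find(ATOM + "title")   # Atom: <feed><title>
    else:
        node = root.find("channel/title")  # RSS 2.0: <rss><channel><title>
    if node is None:
        return ""
    return (node.text or "").strip()


def looks_like_comments_feed(xml_text):
    """Heuristic: does the feed title look like a comments-feed title?"""
    return feed_title(xml_text).lower().startswith(COMMENT_TITLE_PREFIXES)
```

A scraper could run this check on whatever feed discovery returns and flag suspicious blogs for a second look instead of silently indexing comments.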