The Algorithm: Home Page & Rule-based Curation
The main list on the Home page is designed to provide a good sense of what's currently going on with your team just by scrolling down a single page. But our crawlers find so much content that simply displaying everything we discover and add to our database in chronological order would rarely give a useful or accurate snapshot.
Our goal is for the home page to provide a good mix of local media coverage (newspapers, Canadian national media, beat-writers), columns, blog posts, podcasts, rumours, reactions, and updates, along with complementary radio clips and videos.
To show you a good mix of sources (Sportsnet, The Athletic, ESPN, etc.), types of sources (newspapers, blogs, national media, podcasts, etc.) and content creators (authors, hosts, channels) in a one-page snapshot, we limit or filter certain content on the home page only.
ALL content found by our crawlers and admins is available chronologically on the Recently Added page, as well as all category, author, search, date, and player pages.
How it works
Our crawler runs 24 hours a day, always looking for great new content. We aim to be the most extensive resource out there. Sometimes that means, for instance, having a lot of podcasts, or adding posts about minor league promotions, or adding 12 individual post-game interview videos at the same time. Curation allows us to follow the content wherever it takes us, while still providing what is at the core of the Aggregator sites: an easy-to-digest, up-to-date, high-quality list of links.
An important distinction to make about Home is that we don't hand-pick items for this list. Their inclusion isn't specifically an endorsement or an assertion of quality (though quality is the goal). Instead, we create rules that limit certain content in an effort to make the best use of limited real estate on the front page, the most viewed page at each of our sites.
There are no hard and fast rules, but here are some circumstances in which an item might be held off the home page and why:
We're less confident in an item's quality when we don't find an author or show or some type of attribution for the content. There are times when, despite not discovering an author or show name, our crawler is confident that an item should be added to the site (enough other criteria are met), but does not feel confident enough to add it to the main list. These criteria are set on a source-by-source basis.
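As a rough illustration of a source-by-source attribution rule, here is a minimal sketch. The `Item` fields, the `REQUIRES_ATTRIBUTION` table and the source names are all hypothetical and chosen for the example; they are not our actual configuration or code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    source: str
    title: str
    author: Optional[str] = None
    show: Optional[str] = None


# Hypothetical per-source setting: must an item carry an author or show
# name before it can appear on the home list?
REQUIRES_ATTRIBUTION = {
    "example-blog": True,
    "example-wire": False,
}


def eligible_for_home(item: Item, default_requires: bool = True) -> bool:
    """Return True if the item passes the attribution rule for its source."""
    has_attribution = bool(item.author or item.show)
    requires = REQUIRES_ATTRIBUTION.get(item.source, default_requires)
    return has_attribution or not requires
```

In this sketch an item that fails the check is only kept off the home list; it would still be added to the site and shown everywhere else.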
A form of aggregation itself, Link Dump is the generic term for a post that lists links, typically revolving around a single subject or theme and timeframe, with brief descriptions and/or recommendations. The Aggregator sites themselves fulfil the function of a Link Dump, more or less, which makes Link Dumps somewhat redundant on our front page. We LOVE Link Dumps, but a list of links (us) with a link to a list of links? We think the scarce real estate on the front page can be better utilized.
We love these too, but they tend to be community-driven. They tend to have really busy, super fun comment sections, but minimal content. Find them on their respective blog category pages.
Radio, Video clips & Game Recaps
These often come in bunches and would flood the front page without intervention. Especially just after a game, highlights, clips and recaps can come in big bunches over a short period of time. They overwhelm the top of the page and push other content way down the list.
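One way to picture this kind of intervention is a sliding time window that lets only a few clips through at a time. The sketch below is an assumption-laden illustration: the content-type labels, the one-hour window and the limit of two items are invented for the example and are not our real settings.

```python
from collections import deque
from datetime import datetime, timedelta

# Assumed labels for the content types that tend to arrive in bunches.
LIMITED_TYPES = {"radio", "video", "recap"}


def filter_bursts(items, max_per_window=2, window=timedelta(hours=1)):
    """Keep at most `max_per_window` limited-type items per sliding window.

    `items` is an iterable of (timestamp, content_type, title) tuples,
    sorted oldest first. Other content types always pass through.
    """
    recent = deque()  # timestamps of limited-type items already kept
    kept = []
    for ts, ctype, title in items:
        if ctype not in LIMITED_TYPES:
            kept.append((ts, ctype, title))
            continue
        # Forget kept items that have aged out of the window.
        while recent and ts - recent[0] >= window:
            recent.popleft()
        if len(recent) < max_per_window:
            recent.append(ts)
            kept.append((ts, ctype, title))
    return kept
```

With these example settings, five post-game clips published a minute apart would put two on the home list, while an article published alongside them is unaffected.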
Blogs & Podcasts
As far as blogs and podcasts go, we want to add as many established team-centric ones as possible to each Aggregator site, so some are limited in order to keep them proportional to other types of content. Blogs used to be the most likely source type to become disproportionate (now it’s podcasts of course), because there are many great ones, and we love adding them.
Another type of content we run into fairly often is a blog post or article at one source that simply refers to (or embeds) a radio or podcast appearance or an article at another source. Again, this is useful content for the users of that site, but we likely have the clip or podcast in its respective section at its respective Aggregator site, and in this context the content is somewhat redundant.
Minimal Content / Slideshows
Some items simply contain a very small amount of content. Think 100-word player status updates, or a "Top 50" article with individual promotional pages published for each item, or slideshows. More often than not, this content is terrific and useful to the users of their respective sites, but again, we feel other links are more impactful, thus a better use of scarce page space for our users.
Interview or press conference transcripts are really useful content, but more often than not we have the radio or video in its respective section. Since most radio and video is filtered from Our List, we tend to treat transcripts similarly. That said, if there is something particularly newsworthy going on with the team we will try to show relevant radio, video and transcripts as they are in high demand.
Volume Content Limits
On top of the checks and balances mentioned above, we also monitor the overall volume of content from each source, author or show. Again, to encourage a variety of sources in a limited space, if a media outlet, blog or specific author posts numerous articles in one day, our system may limit some of them on the home page.
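A per-day volume limit of this kind could be sketched as below. The cap values, the dictionary keys and the function name are assumptions made up for illustration; our actual limits vary and are not published here.

```python
from collections import Counter


def apply_daily_caps(items, source_cap=3, author_cap=2):
    """Keep items until their source or author hits a per-day cap.

    `items` is a list of dicts with 'date', 'source' and 'author' keys
    ('author' may be None), in the order they should be considered.
    """
    source_counts = Counter()
    author_counts = Counter()
    kept = []
    for item in items:
        s_key = (item["date"], item["source"])
        a_key = (item["date"], item["author"]) if item.get("author") else None
        if source_counts[s_key] >= source_cap:
            continue  # this source already has enough home-page items today
        if a_key is not None and author_counts[a_key] >= author_cap:
            continue  # this author already has enough home-page items today
        source_counts[s_key] += 1
        if a_key is not None:
            author_counts[a_key] += 1
        kept.append(item)
    return kept
```

Items skipped by a cap in this sketch are only left off the home list; nothing is removed from the site itself.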
We love aggregation (well, most of the time). It's a healthy and useful part of the media ecosystem of a team if done right. When something newsworthy happens, it's complementary to the original reporting/event to have it summarized and expounded upon by different and numerous voices. That said, sometimes news is simply regurgitated and provides very little value to Sports Aggregator users. The sheer volume of aggregation (good and bad) leads us to limit some of it, or else other content gets buried.
You can see EVERYTHING our crawlers discover and add to our database, in reverse-chronological order, at each Aggregator site by clicking Recently Added from the main menu.