How software ate manual content placement on Metro.co.uk

Trending and Newsfeed Automatic Placements


The majority of content placement on metro.co.uk is now managed by software. It has been a long journey, built on real-world feedback and the incremental addition of complexity. My goal has always been to take a developer’s view of the editorial process and optimise where possible. The numbers made it clear that, for large areas of the site, the time spent on content placement was disproportionate to the value it returned. My previous post covered the first part of the journey; here I will explain how we extended it to run the majority of the site.

We gather a lot of information from WordPress, Facebook, Twitter and Omniture into a MySQL database. This data is passed through a stored procedure that returns five different sorts:

Score

((Tweets + Facebook Interactions) * Social Multiplier) + Views

Trending

Score Now – Score 30 Mins Ago

Social

Tweets + Facebook Interactions

Date

Date descending

Coefficient

((((Tweets + Facebook Interactions) * Social Multiplier) + Views + Editorial Boost) * Hour Coefficient) + Tag Boost
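As a rough sketch, the formulas above can be written as plain functions. The production version is a MySQL stored procedure; the Python class, field and function names here are illustrative assumptions, not the real schema.

```python
from dataclasses import dataclass


@dataclass
class ArticleStats:
    """Hypothetical per-article counters; the field names are assumptions."""
    tweets: int
    facebook_interactions: int
    views: int
    editorial_boost: float = 0.0
    tag_boost: float = 0.0


def social(a: ArticleStats) -> int:
    # Social = Tweets + Facebook Interactions
    return a.tweets + a.facebook_interactions


def score(a: ArticleStats, social_multiplier: float) -> float:
    # Score = ((Tweets + Facebook Interactions) * Social Multiplier) + Views
    return social(a) * social_multiplier + a.views


def trending(score_now: float, score_30_mins_ago: float) -> float:
    # Trending = Score Now - Score 30 Mins Ago
    return score_now - score_30_mins_ago


def coefficient(a: ArticleStats, social_multiplier: float,
                hour_coefficient: float) -> float:
    # The tag boost is added after the hour coefficient is applied,
    # so it is not scaled down as the article ages.
    boosted = score(a, social_multiplier) + a.editorial_boost
    return boosted * hour_coefficient + a.tag_boost


example = ArticleStats(tweets=20, facebook_interactions=30, views=1000,
                       editorial_boost=100, tag_boost=5000)
print(score(example, 10))           # 1500
print(coefficient(example, 10, 3))  # 9800
```

The Date sort is omitted because it is simply a descending sort on the publish date.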

We also pass in the following parameters, which modify the results according to the channel (e.g. news) that the data sits within:

Hours to return

e.g. from 24 to 336 hours since publishing

Filter Subcategories

e.g. Football,Oddballs,Weird,Food

Boost Tag

e.g. arsenal-fc

Social Multiplier

e.g. 10

Content Type

e.g. Blog or Post

Remove Tags

e.g. tottenham-hotspur-fc

Coefficient

e.g. 0-4 * 3, 4-12 * 1, 13-24 * 0.5, 25-48 * 0.3
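One way to read the coefficient example is as a lookup from an article's age in hours to a multiplier. The sketch below makes two assumptions that the example does not settle: each band's upper bound is inclusive, and anything older than the last band gets a caller-supplied default (0.0 here). The names `HOUR_BANDS` and `hour_coefficient` are hypothetical.

```python
from bisect import bisect_left

# (upper bound in hours, multiplier), taken from the example values above.
HOUR_BANDS = [(4, 3.0), (12, 1.0), (24, 0.5), (48, 0.3)]


def hour_coefficient(hours_since_publish: float,
                     bands=HOUR_BANDS,
                     default: float = 0.0) -> float:
    """Return the multiplier for an article of the given age.

    Assumption: upper bounds are inclusive, so an article exactly
    4 hours old still falls in the first band.
    """
    uppers = [upper for upper, _ in bands]
    index = bisect_left(uppers, hours_since_publish)
    if index < len(bands):
        return bands[index][1]
    return default  # assumption: behaviour past the last band is not specified


print(hour_coefficient(2))   # 3.0
print(hour_coefficient(18))  # 0.5
print(hour_coefficient(72))  # 0.0
```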

The coefficient sort now takes input from the articles that the editors place at the top of each channel page. This editorial boost keeps everything feeling much fresher until the data catches up. We have also built our own real-time page view tracking system with Node and Redis to get around the 30-60 minute time lag in Omniture.

We recently centralised all of the settings so that they are easy to view and change. I focused on tuning the cluster of content returned and the timeframe it is retrieved from, so that both fit the publishing patterns of each channel. The ability to cluster content from similar channels has helped ensure we offer a wider variety of content at different stages of the user’s journey.
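To illustrate what centralised per-channel settings could look like, here is a minimal sketch. The structure, the `sport` channel and all values are assumptions drawn from the example parameters listed earlier, not our actual configuration. It treats "hours to return" as a maximum article age and leaves the subcategory filter unapplied, since its exact behaviour depends on the channel.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class ChannelSettings:
    """Hypothetical settings record; one instance per channel."""
    hours_to_return: int = 24
    filter_subcategories: Tuple[str, ...] = ()
    boost_tag: Optional[str] = None
    social_multiplier: float = 10
    content_type: str = "Post"
    remove_tags: Tuple[str, ...] = ()
    hour_bands: Tuple[Tuple[int, float], ...] = (
        (4, 3.0), (12, 1.0), (24, 0.5), (48, 0.3),
    )


# All settings live in one place so they are easy to view and change.
SETTINGS = {
    "news": ChannelSettings(hours_to_return=24),
    "sport": ChannelSettings(
        hours_to_return=336,
        filter_subcategories=("Football",),
        boost_tag="arsenal-fc",
        remove_tags=("tottenham-hotspur-fc",),
    ),
}


def is_eligible(settings: ChannelSettings, age_hours: float,
                content_type: str, tags: Iterable[str]) -> bool:
    """Apply the age window, content type and removed-tag rules."""
    if age_hours > settings.hours_to_return:
        return False
    if content_type != settings.content_type:
        return False
    return not set(tags) & set(settings.remove_tags)
```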

Using different sort methods coupled with this clustering has reduced duplication. The design also helps by using different image ratios and colours for sections of a page that may contain the same content. We have standardised the bottom of all pages, which means any performance improvement we make is felt across the entire site.

The last addition was the ability to boost by tag, which has made the article-based algorithm much more relevant. At this level of granularity we decided that context is much more important than freshness. Because the tag boost is added outside the coefficient, it is not scaled down by the hour coefficient, so tagged articles cluster at the top; we limit this to the five most recent related articles.
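A minimal sketch of the tag boost, under stated assumptions: `base_score` stands for the score after the hour coefficient has been applied, and the boost is a constant large enough to lift tagged articles above everything else. The `Article` type, the function name and the boost value are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, List


@dataclass(frozen=True)
class Article:
    slug: str
    published: datetime
    base_score: float          # score after the hour coefficient
    tags: FrozenSet[str]


def rank_with_tag_boost(articles: List[Article], boost_tag: str,
                        tag_boost: float = 1_000_000.0,
                        limit: int = 5) -> List[Article]:
    """Sort by score, boosting only the `limit` most recent tagged articles."""
    tagged = sorted(
        (a for a in articles if boost_tag in a.tags),
        key=lambda a: a.published,
        reverse=True,
    )
    boosted = {a.slug for a in tagged[:limit]}

    def final_score(a: Article) -> float:
        # The boost is added after the coefficient, not multiplied by it.
        return a.base_score + (tag_boost if a.slug in boosted else 0.0)

    return sorted(articles, key=final_score, reverse=True)
```

Older articles carrying the tag still appear in the results, but they compete on their ordinary score.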

Our API delivers everything the front end needs for rendering, including title, image and excerpt. This has also let us use the data in multiple places, such as the native Tablet and Phone Editions and the glance journalism experiment Metro10 and MetroVision. Finally, our feeds go through an additional layer that allows us to add sponsored content at fixed positions for traffic-driving purposes.
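The sponsored content layer can be pictured as a small post-processing step over the ranked feed. This is only a sketch: the function name and the zero-based position mapping are assumptions, and the real layer works on full article objects rather than strings.

```python
from typing import Dict, List


def insert_sponsored(feed: List[str], sponsored: Dict[int, str]) -> List[str]:
    """Place sponsored items at fixed zero-based positions in the feed.

    Inserting in ascending position order means each sponsored item ends
    up at exactly the index requested, with organic items flowing around it.
    Positions beyond the end of the feed are appended.
    """
    result = list(feed)
    for position in sorted(sponsored):
        result.insert(min(position, len(result)), sponsored[position])
    return result


feed = ["a", "b", "c", "d"]
print(insert_sponsored(feed, {1: "S1", 3: "S2"}))
# ['a', 'S1', 'b', 'S2', 'c', 'd']
```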

The great part of all of this is that the maths is still very simple and can be explained to anyone who is interested. Having a set of values to tweak per channel gives us enough options to slice the data for use in multiple contexts. It has taken 12 months and a full redesign to really see this come to life, but I hope it will be a part of Metro for years to come.

Home Page

Metro Homepage Placements

Article Page

Metro Article Page Placement

My talk at the WordPress VIP Big Media Meetup on this.
