Friday, 4 January 2008

A Lovefilm lifestream hack with Dapper and Pipes

So I'm a big fan of Lovefilm, the online DVD rental service. But for a while now I've been reviewing and rating my movies in Flixster because it is a more open system, with outputs such as Facebook apps and OpenSocial.

This is less than ideal, as I really should double-enter my ratings to ensure Lovefilm's recommendations are accurate. So I was pleased to see that Lovefilm have launched a public user profile which allows you to publish some of your data to the world (see mine here). However, as is so often the case, the data isn't portable, so I used it as an opportunity to try out a new service I'd heard about called Dapper.

I'd come across Dapper when they spoke at FOWA this year and have had them on my list to try out ever since. Dapper is essentially screen-scraping software: you feed it some example pages, which it then pulls apart to work out which content you might want to repurpose. It's a very nicely built web application with huge potential given the output formats (XML, RSS, Netvibes and Flash widgets, to name a few). So I booted it up and fed it my Lovefilm profile page.

My first hurdle was that I had difficulty getting it to recognise certain parts of the page as content chunks. It couldn't grab the review separately from the list of stars and director, so in the end I have a little more information in the RSS description than I would have liked. I'm guessing this is down to how well the HTML is written. My second issue was that with the RSS option enabled it only allowed me to link data to three fields: title, description and pubdate. I was keen to take the movie artwork in as an RSS enclosure, so I went back, switched to XML output and just grabbed everything into an XML file. After a bit of neatening up, my finished 'Dap' was ready and published.

After my Dap was finished I wanted to clean things up and sort out the enclosure. So I jumped over to Yahoo! Pipes, took in the XML output, renamed a few fields and ran some regexes so that everything was how I wanted it. The final RSS feed output can be seen here.
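For anyone who would rather do this step in code than in Pipes, here is a rough Python sketch of the same idea: take the scraped XML, rename a few fields, tidy the text with regexes and attach the artwork as an RSS enclosure. It is not what Pipes does internally. The element names (`review`, `film_title`, `review_text`, `artwork`), the sample data, the `to_rss` function and the `example.com` base URL are all made up for illustration, as I'm not reproducing the real Dapper output here.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample of scraped XML; the element names are assumptions,
# not the actual structure Dapper produced.
SAMPLE_XML = """<reviews>
  <review>
    <film_title>  Blade Runner  </film_title>
    <review_text>Starring: Harrison Ford. Director: Ridley Scott. Still the best.</review_text>
    <artwork>/images/blade_runner.jpg</artwork>
  </review>
</reviews>"""

BASE_URL = "http://example.com"  # placeholder, not a real profile URL


def to_rss(xml_text, base_url=BASE_URL):
    """Reshape the scraped XML into a minimal RSS 2.0 document."""
    source = ET.fromstring(xml_text)

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "My film reviews"
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = "Reviews reshaped from scraped XML"

    for review in source.iter("review"):
        item = ET.SubElement(channel, "item")

        # Rename film_title -> title and collapse stray whitespace.
        title = review.findtext("film_title", default="")
        ET.SubElement(item, "title").text = re.sub(r"\s+", " ", title).strip()

        # Rename review_text -> description and strip the leading
        # cast/director text that was scraped along with the review.
        text = review.findtext("review_text", default="")
        cleaned = re.sub(r"^Starring:.*?Director:[^.]*\.\s*", "", text)
        ET.SubElement(item, "description").text = cleaned

        # Attach the artwork as an enclosure. RSS 2.0 expects url, length
        # and type; the file size is unknown here, so length is set to "0".
        artwork = review.findtext("artwork")
        if artwork:
            ET.SubElement(item, "enclosure", url=base_url + artwork,
                          type="image/jpeg", length="0")

    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(to_rss(SAMPLE_XML))
```

The regex that strips the cast and director is tied to the made-up sample text, so it would need adjusting to whatever the scraped description really looks like.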

Finally I wanted to get that into my lifestream. Currently I'm using Jaiku for this purpose, so I jumped over and added the new RSS feed, which in short order appeared nicely in my stream with my first review. As I use this to wire updates into other systems (e.g. through the Jaiku app in Facebook), my new review quickly percolated into my social network.

Overall this probably took me two hours to set up, but a lot of that was learning and fiddling around the edges. Dapper is certainly a very powerful tool, and combined with the more programmatic functionality of Yahoo! Pipes it allows me to start making my locked-up information more portable. The one restriction at the moment is that you can't automate logins, so if you don't have a private shortcut URL for your data you're out of luck. I think this is possibly an opportunity for OpenID in the future.