A better way for Mother Jones to present the Trayvon story
I was looking at Mother Jones’ Trayvon Martin story, and it occurred to me that it would be really nice if I could load the page and see only the updates since I last visited. So I put together a little Ruby program to do this locally for me. It’s very version-0.1, but I think it works (EDIT: not really; see below).
Here it is:
require 'nokogiri'
require 'open-uri'
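# Work inside ./trayvon, creating it on the first run.
Dir.mkdir("trayvon") unless Dir.exist?("trayvon")
Dir.chdir("trayvon")
# Start from an empty archive the first time, so every line counts as new.
File.write("trayvon_full.html", "") unless File.exist?("trayvon_full.html")
oldFile = Nokogiri::HTML(File.read("trayvon_full.html"))
# URI.open (Ruby 2.5+) fetches the page; since Ruby 3.0 a bare open() no longer accepts URLs.
newFile = Nokogiri::HTML(URI.open('http://motherjones.com/politics/2012/03/what-happened-trayvon-martin-explained'))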
oldText = ""
oldFile.search('div#node-header', 'div#node-body-top', 'div#node-body-bottom').each { |body| oldText += body.to_s }
newText = ""
newFile.search('div#node-header', 'div#node-body-top', 'div#node-body-bottom').each { |body| newText += body.to_s }
oldArray = oldText.split("\n")
newArray = newText.split("\n")
diff = newArray - oldArray
diffText = "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">\n<p>\n"
diffText += diff.join("\n") + "\n"
# The block form closes each file as soon as the write finishes.
File.open("trayvon_new.html", "w") { |diffFile| diffFile.write(diffText) }
File.open("trayvon_full.html", "w") { |full| full.write(newFile.to_s) }
What it does: Downloads the Trayvon article (first page). Checks for lines that aren’t on your archived copy (so the first time, you see all of them). Writes the full post to trayvon_full.html, and writes the new updates (what you want to look at) to trayvon_new.html (both in trayvon/ inside the current working directory).
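To make the "checks for lines that aren't on your archived copy" step concrete, here is a minimal sketch of that comparison on its own, without the download or the Nokogiri parsing. The helper name `new_lines` and the sample HTML strings are made up for illustration; the only thing taken from the script is the idea of splitting both versions on newlines and subtracting one array from the other.

```ruby
# Hypothetical helper (not part of the script above): returns the lines of
# new_text that do not appear anywhere in old_text.
def new_lines(old_text, new_text)
  new_text.split("\n") - old_text.split("\n")
end

# Made-up sample content standing in for the archived and freshly fetched pages.
archived = "<p>Intro</p>\n<p>Update 1</p>"
latest   = "<p>Intro</p>\n<p>Update 1</p>\n<p>Update 2</p>"

first_run  = new_lines("", latest)        # empty archive: every line is reported
second_run = new_lines(archived, latest)  # only the line added since the archive

puts first_run.length  # 3
puts second_run        # <p>Update 2</p>
```

This is why the first run shows you the whole article: with an empty archive there is nothing to subtract.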
Thoughts: If we start thinking of news articles as dynamic rather than static, something like this could make a lot of sense for developing news stories. Dear Mother Jones: Please do something like this.
Caveats: This doesn’t style the output correctly. Not very portable (too much hardcoding). Not very good in general. The post hasn’t been updated since I started working on this, so it hasn’t gone through a “real” use case yet.
Want nicely formatted code on your tumblelog? I put a brief note on what I did here.
(Oh, and it doesn’t format on the tumblr dashboard, and it doesn’t display full lines on mobile.)
UPDATE: Hello, people-coming-via-Mother Jones. You should know that, if you try this code yourself, it won’t quite work how you want it to—it’s a little bit naive, so you get things besides just the latest updates. I think calling it very v0.1 covers that, but I’ll be even clearer this time. I’m going to be working on something better—watch this space for updates.
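For the curious, here is one way a line-by-line difference can hand you more than the latest updates. This is an illustration with made-up lines, not a diagnosis of what actually happens on the Mother Jones page: any old line that has been edited in place no longer matches the archived copy, so it gets reported right alongside the genuinely new material.

```ruby
# Made-up archived and current versions of a page, one HTML line per element.
archived = [
  "<p>Last updated 3:00 p.m.</p>",
  "<p>Original report.</p>",
]
latest = [
  "<p>Last updated 5:30 p.m.</p>",  # an old line, edited in place
  "<p>Original report.</p>",        # unchanged, so it is filtered out
  "<p>New update.</p>",             # the only genuinely new line
]

# Same array subtraction the script relies on.
reported = latest - archived

puts reported
# <p>Last updated 5:30 p.m.</p>
# <p>New update.</p>
```

A less naive version would presumably need to compare at the level of individual updates rather than raw lines, but that is a guess at this point.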