Technorati's link cosmos on your site

Natalie Jost has a nice post about how to track who else is linking to your blog, but it also raises a question: “How to get this information out of Technorati?” The answer is a little more techie than it should have to be, but anyway:

Technorati offers an API for getting exactly this kind of information. What do you need to use it?

  1. An account on Technorati
  2. An API key which you can get here

The last thing you need is a script that actually uses this API. How should it work? That depends entirely on what you want. I personally prefer a small script that polls the data, let’s say once per day, and stores the output in a PHP file that I can then simply include in whatever CMS I’m currently using.


A very simple script for doing something like that would be:

#!/usr/bin/env ruby
require 'rexml/document'
require 'open-uri'
require 'cgi'

OUTPUT_FILE="cosmos.php"
API_KEY="" # your Technorati API key
BLOG_URL="" # your site. e.g.: zerokspot.com


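# Escapes backslashes and single quotes so the text is safe inside a
# single-quoted PHP string.
# Based on http://www.bigbold.com/snippets/posts/show/880
class String
  def escape_single_quotes
    self.gsub(/[\\']/) { |c| "\\#{c}" }
  end
end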

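# URL-encode both parameters so characters like "/" or "&" in them survive.
u="http://api.technorati.com/cosmos?key=%s&url=%s"%([CGI.escape(API_KEY),CGI.escape(BLOG_URL)])
# URI.open comes from open-uri; on current Ruby versions a bare open() no longer fetches URLs.
URI.open(u) do |site|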
  doc = REXML::Document.new(site.read)
  open(OUTPUT_FILE,'w+') do |output_file|
    # Open a PHP block so the file can be pulled in with include/require.
    output_file.write("<?php\n$links=array();\n")
    doc.elements.each("//item") do |item|
      puts "#" if $DEBUG
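      # HTML-escape first and apply the PHP quote escaping last,
      # so escapeHTML cannot leave a stray backslash behind.
      out = "$links[] = array(\"site\"=>'%s', \"url\"=> '%s');\n"%([
        CGI::escapeHTML(item.elements['weblog/name'].text.to_s).escape_single_quotes,
        CGI::escapeHTML(item.elements['nearestpermalink'].text.to_s).escape_single_quotes
        ])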
      output_file.write(out) 
    end
  end
end

All you have to do is set the API_KEY and BLOG_URL constants, and the script will create a cosmos.php in whatever directory you start it from. This PHP file will then contain entries in the following format:

$links=array();
$links[] = array("site"=>'My site', "url"=> 'http://mysite.com/pointing_to_you.html');

The file is ready to be included in any PHP script, and the Ruby script is ready for cron’ing :)
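
If cron rewrites cosmos.php while a page is being served, PHP could in principle read a half-written file. One way around that, as a rough sketch only: write to a temporary file next to the target and rename it into place. The helper name write_atomically and the ".tmp" suffix are my own choices for illustration, not part of the script above.

```ruby
# Hypothetical helper: write +content+ to a temporary file next to +path+,
# then rename it over +path+ so readers see either the old or the new file.
def write_atomically(path, content)
  tmp_path = "#{path}.tmp"
  File.open(tmp_path, 'w') { |f| f.write(content) }
  # On POSIX filesystems a rename within one directory replaces the target in one step.
  File.rename(tmp_path, path)
  path
end

if __FILE__ == $PROGRAM_NAME
  write_atomically('cosmos.php', "$links=array();\n")
end
```

In the script you would collect the generated lines in a string first and hand that string to the helper instead of writing to the output file directly.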

The code isn’t all that fantastic but it should at least be a good starting point :)
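
If you want to play with the parsing and escaping without an API key, here is a small offline sketch. The XML is made up: it only mirrors the element names the script looks for (item, weblog/name, nearestpermalink) and is not a real API response. The helpers php_quote and php_entries are hypothetical names I picked for this example.

```ruby
require 'rexml/document'
require 'cgi'

# Made-up sample in the shape the script expects; not a real API response.
SAMPLE_XML = <<~XML
  <tapi>
    <document>
      <item>
        <weblog><name>Bob's Blog</name></weblog>
        <nearestpermalink>http://example.com/post?a=1&amp;b=2</nearestpermalink>
      </item>
    </document>
  </tapi>
XML

# Prepare text for a single-quoted PHP string: HTML-escape first, then
# backslash-escape any remaining backslashes and single quotes.
def php_quote(text)
  CGI.escapeHTML(text.to_s).gsub(/[\\']/) { |c| "\\#{c}" }
end

# Turn every <item> in the XML into one "$links[] = ..." line.
def php_entries(xml)
  doc = REXML::Document.new(xml)
  entries = []
  doc.elements.each('//item') do |item|
    name = item.elements['weblog/name']
    link = item.elements['nearestpermalink']
    next if name.nil? || link.nil? # skip incomplete items instead of crashing
    entries << format("$links[] = array(\"site\"=>'%s', \"url\"=> '%s');\n",
                      php_quote(name.text), php_quote(link.text))
  end
  entries
end

puts php_entries(SAMPLE_XML)
```

Skipping items that lack a name or a permalink is a design choice on my part; you might prefer to log them instead.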