Google Sitemaps
I just saw that Google came out with a new Sitemap feature. This is kind of cool, as it lets you tell search engines which pages you have available instead of leaving them to find the pages on their own. So, I went ahead and set up a sitemap using the Python script they have made available.
To do this, I simply downloaded the script and copied sitemap_gen.py to a folder on my server. Then I created the following config.xml file and saved it in the same folder:
<?xml version="1.0" encoding="UTF-8"?>
<site
base_url="http://www.cbulock.com/"
store_into="/home/cbulock/public_html/sitemap.xml.gz"
verbose="1"
>
<urllist path="urllist.txt" encoding="UTF-8" />
</site>
This tells the script to look for a urllist.txt file that contains all the URLs for the map. It then outputs a sitemap.xml.gz file that search engines (just Google currently) can use. To create the urllist.txt file, I created a new index template in Movable Type. The index template simply writes urllist.txt to the same directory as sitemap_gen.py and config.xml. It outputs links to all the Individual, Monthly, and Category archive pages, and I also added a link to my main index page. The Movable Type template looks like this:
http://www.cbulock.com/ changefreq=daily priority=1.0
<MTArchiveList archive_type="Individual">
<$MTArchiveLink encode_xml="1"$> lastmod=<$MTArchiveDate format="%Y-%m-%dT%H:%M:%S"$><$MTBlogTimezone$> priority=0.8
</MTArchiveList>
<MTArchiveList archive_type="Category">
<$MTArchiveLink encode_xml="1"$> changefreq=weekly priority=0.7
</MTArchiveList>
<MTArchiveList archive_type="Monthly">
<$MTArchiveLink encode_xml="1"$> changefreq=monthly priority=0.5
</MTArchiveList>
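Each line the template writes is a URL followed by optional key=value attributes. To make that format concrete, here is a rough Python sketch of how one such line could be split apart and turned into a sitemap entry. This is not the code from sitemap_gen.py; the function names parse_urllist_line and to_url_element are made up for illustration, and the sketch assumes the URL in the file is not already XML-escaped.

```python
from xml.sax.saxutils import escape


def parse_urllist_line(line):
    """Split one urllist.txt line into (url, attrs).

    Hypothetical helper: the first whitespace-separated token is the URL,
    and every following token is treated as key=value.
    Returns None for a blank line.
    """
    parts = line.split()
    if not parts:
        return None
    url = parts[0]
    attrs = {}
    for token in parts[1:]:
        key, _, value = token.partition("=")
        attrs[key] = value
    return url, attrs


def to_url_element(url, attrs):
    """Render a <url> entry using the three attributes the template emits."""
    lines = ["<url>", "  <loc>%s</loc>" % escape(url)]
    for key in ("lastmod", "changefreq", "priority"):
        if key in attrs:
            lines.append("  <%s>%s</%s>" % (key, escape(attrs[key]), key))
    lines.append("</url>")
    return "\n".join(lines)


if __name__ == "__main__":
    parsed = parse_urllist_line(
        "http://www.cbulock.com/ changefreq=daily priority=1.0"
    )
    print(to_url_element(*parsed))
```

Running it on the first line of the template prints a single url block with loc, changefreq, and priority children, which is roughly the shape of the entries that end up inside sitemap.xml.gz.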
The changefreq and priority attributes can be changed. All the details on how to use those can be found on the Sitemap Generator instruction page.
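If you do change those values, it is easy to mistype one. The sitemap format accepts a fixed set of changefreq values (always, hourly, daily, weekly, monthly, yearly, never) and a priority between 0.0 and 1.0. Below is a small Python sketch of a sanity check you could run on the attributes before generating the map. The check_attrs function is a hypothetical helper of my own, not part of Google's script.

```python
# Values the sitemap format accepts for changefreq.
ALLOWED_CHANGEFREQ = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def check_attrs(attrs):
    """Return a list of problems found in one URL's attributes.

    attrs is a dict such as {"changefreq": "daily", "priority": "1.0"}.
    An empty list means nothing looked wrong.
    """
    problems = []

    freq = attrs.get("changefreq")
    if freq is not None and freq not in ALLOWED_CHANGEFREQ:
        problems.append("unknown changefreq: %s" % freq)

    priority = attrs.get("priority")
    if priority is not None:
        try:
            value = float(priority)
        except ValueError:
            problems.append("priority is not a number: %s" % priority)
        else:
            # Priority must fall between 0.0 and 1.0 inclusive.
            if not 0.0 <= value <= 1.0:
                problems.append("priority out of range 0.0-1.0: %s" % priority)

    return problems


if __name__ == "__main__":
    print(check_attrs({"changefreq": "daily", "priority": "1.0"}))
    print(check_attrs({"changefreq": "sometimes", "priority": "1.5"}))
```

The first call reports no problems; the second reports both the unknown changefreq and the out-of-range priority.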
After you have the script uploaded, the config.xml file set up, and urllist.txt output by Movable Type, you can run the script. This requires telnet or SSH access to your server. The commands are listed on Google's instruction page. I have set up a cron job that runs the script once a day. That's about all that's required to get this up and running.
There is still some more work on the template that I plan on doing. For instance, I have a number of pages that are generated by a separate blog. I will need to add those to the map and also set it up to give the lastmod attribute to every page. I have only spent about 20 minutes on the template so far. To see my latest updates, here is the latest copy of the template that I am using.