Google Sitemaps from MediaWiki

Google Sitemaps allow a webmaster to give Google a list of URLs that should be crawled. I use a simple MySQL script to generate a Google Sitemap for this wiki which runs on MediaWiki 1.5. Note: since version 1.6 the MediaWiki core code has included a maintenance script called generateSitemap.php which accomplishes this in a more complete fashion. To do this on your own wiki you must have shell access, know the database name and database username, and have rights to place the sitemap file in the web directory.

Here's how I did it.

Create a text file called sitemap.sql containing the following SQL statements. Change it to match your wiki domain and URL format (I know it's a bad idea for me to have my wiki in the root): SELECT '<\?xml version="1.0" encoding="UTF-8"\?>' as ' '; SELECT '' as ' '; SELECT concat(' http://super8wiki.com/  ',                DATE_FORMAT(now,'%Y-%m-%d'),                '  0.9  ') AS ' '; SELECT concat("  http://super8wiki.com/index.php/",                `page_title`,                "  ",                DATE_FORMAT(`page_touched`, "%Y-%m-%d"),                "  0.8  ") FROM `page` WHERE `page_namespace`  = 0 ORDER BY  `page_touched`  DESC ; SELECT ' ' as ' '; Notes:
 * this script does not properly handle character set issues, not a problem for me yet because all my page names are American English
 * I only submit the Main Page and articles in the first namespace
 * I fake the modification date for the Main Page to be "now" and give the root a slightly higher priority

From the Linux command line issue this statement - of course substitute your actual MySQL username and database name and make sure to direct the output file into your web directory, wherever you told Google when you added this sitemap. You will be prompted for the MySQL password. After the script runs, just visit your Google Sitemaps account and "resubmit" the new sitemap. mysql -s -p -u database_username wiki_database < sitemap.sql > /var/www/htdocs/sitemap.xml Notes:
 * you can gzip the sitemap if you wish just make sure to tell Google the right filename

Here is a short sample of the sitemap generated by this script:   http://super8wiki.com/ 2005-08-04  0.9 http://super8wiki.com/index.php/Cleaning_Cameras 2005-08-04  0.8 http://super8wiki.com/index.php/Bencini_Comet_B_Super_8e_automatic 2005-08-02  0.8 http://super8wiki.com/index.php/Hanimex_Vonet_83 2005-08-02  0.8 http://super8wiki.com/index.php/Hanimex_Vonet_53 2005-08-02  0.8 http://super8wiki.com/index.php/Cosina_MS_XL_204_M_(Magic_Sound) 2005-08-02  0.8 http://super8wiki.com/index.php/Cosina_MS-5000_XL 2005-08-02  0.8 http://super8wiki.com/index.php/Cosina_MS-4000_XL 2005-08-02  0.8 http://super8wiki.com/index.php/Zeiss_Ikon_Moviflex_G_S8_Electronic 2005-03-03  0.8 http://super8wiki.com/index.php/Agfa_Microflex_100_Sensor 2005-03-03  0.8 http://super8wiki.com/index.php/Canon_Auto_Zoom_814 2005-03-03  0.8 http://super8wiki.com/index.php/Beaulieu_6008_S_Pro_Digital 2005-03-03  0.8 http://super8wiki.com/index.php/Beaulieu_6008_Pro-digital 2005-03-03  0.8 http://super8wiki.com/index.php/Beaulieu_6008_Pro 2005-03-03  0.8 http://super8wiki.com/index.php/Beaulieu_5008_S 2005-03-03  0.8 http://super8wiki.com/index.php/Manufacturers_Sandbox 2005-02-25  0.8 http://super8wiki.com/index.php/Super_8_Wiki 2005-02-20  0.8