Google Sitemaps from MediaWiki

From Super8wiki
Jump to: navigation, search

Google Sitemaps allow a webmaster to give Google a list of URLs that should be crawled. I use a simple MySQL script to generate a Google Sitemap for this wiki which runs on MediaWiki 1.5. Note: since version 1.6 the MediaWiki core code has included a maintenance script called generateSitemap.php which accomplishes this in a more complete fashion. To do this on your own wiki you must have shell access, know the database name and database username, and have rights to place the sitemap file in the web directory.

Here's how I did it.

Create a text file called sitemap.sql containing the following SQL statements. Change it to match your wiki domain and URL format (I know it's a bad idea for me to have my wiki in the root):

SELECT '<\?xml version="1.0" encoding="UTF-8"\?>' as ' ';
SELECT '<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">' as ' ';
SELECT concat('<url><loc>http://super8wiki.com/</loc><lastmod>',
                DATE_FORMAT(now(),'%Y-%m-%d'),
                '</lastmod><priority>0.9</priority></url>') AS ' ';
SELECT  concat("<url><loc>http://super8wiki.com/index.php/",
                `page_title`,
                "</loc><lastmod>" ,
                DATE_FORMAT(`page_touched`, "%Y-%m-%d"),
                "</lastmod><priority>0.8</priority></url>")
        FROM  `page`
        WHERE  `page_namespace`  = 0
        ORDER  BY  `page_touched`  DESC ;
SELECT '</urlset>' as ' ';

Notes:

  • this script does not properly handle character set issues, not a problem for me yet because all my page names are American English
  • I only submit the Main Page and articles in the first namespace
  • I fake the modification date for the Main Page to be "now" and give the root a slightly higher priority

From the Linux command line issue this statement - of course substitute your actual MySQL username and database name and make sure to direct the output file into your web directory, wherever you told Google when you added this sitemap. You will be prompted for the MySQL password. After the script runs, just visit your Google Sitemaps account and "resubmit" the new sitemap.

mysql -s -p -u database_username wiki_database < sitemap.sql > /var/www/htdocs/sitemap.xml

Notes:

  • you can gzip the sitemap if you wish just make sure to tell Google the right filename

Here is a short sample of the sitemap generated by this script:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
<url><loc>http://super8wiki.com/</loc><lastmod>2005-08-04</lastmod><priority>0.9</priority></url>
<url><loc>http://super8wiki.com/index.php/Cleaning_Cameras</loc><lastmod>2005-08-04</lastmod><priority>0.8</priority></url>
<url><loc>http://super8wiki.com/index.php/Bencini_Comet_B_Super_8e_automatic</loc><lastmod>2005-08-02</lastmod><priority>0.8</priority></url>
<url><loc>http://super8wiki.com/index.php/Hanimex_Vonet_83</loc><lastmod>2005-08-02</lastmod><priority>0.8</priority></url>
<url><loc>http://super8wiki.com/index.php/Hanimex_Vonet_53</loc><lastmod>2005-08-02</lastmod><priority>0.8</priority></url>
<url><loc>http://super8wiki.com/index.php/Cosina_MS_XL_204_M_(Magic_Sound)</loc><lastmod>2005-08-02</lastmod><priority>0.8</priority></url>
<url><loc>http://super8wiki.com/index.php/Cosina_MS-5000_XL</loc><lastmod>2005-08-02</lastmod><priority>0.8</priority></url>
<url><loc>http://super8wiki.com/index.php/Cosina_MS-4000_XL</loc><lastmod>2005-08-02</lastmod><priority>0.8</priority></url>
<url><loc>http://super8wiki.com/index.php/Zeiss_Ikon_Moviflex_G_S8_Electronic</loc><lastmod>2005-03-03</lastmod><priority>0.8</priority></url>
<url><loc>http://super8wiki.com/index.php/Agfa_Microflex_100_Sensor</loc><lastmod>2005-03-03</lastmod><priority>0.8</priority></url>
<url><loc>http://super8wiki.com/index.php/Canon_Auto_Zoom_814</loc><lastmod>2005-03-03</lastmod><priority>0.8</priority></url>
<url><loc>http://super8wiki.com/index.php/Beaulieu_6008_S_Pro_Digital</loc><lastmod>2005-03-03</lastmod><priority>0.8</priority></url>
<url><loc>http://super8wiki.com/index.php/Beaulieu_6008_Pro-digital</loc><lastmod>2005-03-03</lastmod><priority>0.8</priority></url>
<url><loc>http://super8wiki.com/index.php/Beaulieu_6008_Pro</loc><lastmod>2005-03-03</lastmod><priority>0.8</priority></url>
<url><loc>http://super8wiki.com/index.php/Beaulieu_5008_S</loc><lastmod>2005-03-03</lastmod><priority>0.8</priority></url>
<url><loc>http://super8wiki.com/index.php/Manufacturers_Sandbox</loc><lastmod>2005-02-25</lastmod><priority>0.8</priority></url>
<url><loc>http://super8wiki.com/index.php/Super_8_Wiki</loc><lastmod>2005-02-20</lastmod><priority>0.8</priority></url>
</urlset>

Comments[edit]