Pantek Library
Hosting Provided By
CybrHost
High Speed Hosting

SQL for Subversion update-report

From: Jack Bates <ms419(at)freezone.co.uk>
Date: Wed Oct 03 2007 - 14:02:00 EDT


I'm working on a Subversion interface to MediaWiki and am struggling with the SQL to respond to Subversion's update-report: http://www.mediawiki.org/wiki/WebDAV

MediaWiki's revision table contains unique revision ids and the corresponding page id. The page table contains unique page ids and the corresponding page title.

Suversion's update-report intends to get a list of changes between the current state of the working copy and a target revision: often HEAD, MAX(revision.rev_id)

By converting update-report entries to an SQL condition, I can select rows of the revision table which come after the current state of the working copy:

  <S:update-report send-all="true" xmlns:S="svn:">
<S:src-path>http://ket/~jablko/mediawiki/webdav.php</S:src-path>
<S:target-revision>27</S:target-revision>
<S:entry rev="18"></S:entry>
<S:entry rev="20">Test</S:entry>
<S:entry rev="20">Main_Page</S:entry>
  </S:update-report>

  • becomes:

((page_title = 'Main_Page' OR page_title LIKE 'Main\_Page/%') AND revision.rev_id > '20' OR NOT (page_title = 'Main_Page' OR page_title LIKE 'Main\_Page/%') AND ((page_title = 'Test' OR page_title LIKE 'Test/%') AND revision.rev_id > '20' OR NOT (page_title = 'Test' OR page_title LIKE 'Test/%') AND revision.rev_id > '18'))

Using GROUP BY page_id, I get a list of pages which changed between the current state of the working copy and the target revision.

My problem: I also need to know if these pages have revisions before <current-state>, or if they are newly created.

Do you need help?X

I don't want to do a second query for revisions of <page-list> before <current-state> because on an initial checkout, <page-list> could be huge, making an enormous SQL query.

Instead, I think I should do a LEFT JOIN on another instance of the revision table ("old"), where revisions are before <current-state>. NULL rows in this table correspond to newly created pages:

SELECT page_title, MAX(new.rev_id), old.rev_id FROM page JOIN revision AS new LEFT JOIN revision AS old ON new.rev_page = old.rev_page WHERE new.rev_page = page_id AND old.rev_id < new.rev_id AND new.rev_id <= <target-revision> AND <new.rev_id greater than current-state> GROUP BY page_id

The problem with this query is that "old.rev_id < new.rev_id" means each row in "old" is less than _a_ row in "new", not necessarily less than the _minimum_ row in "new".

I don't want to replace this condition with <old.rev_id less than current-state> because the <current-state> expression can be long and complex; I prefer to evaluate it only once, for <new.rev_id greater than current-state>

I tried replacing "old.rev_id < new.rev_id" with "old.rev_id < MIN(new.rev_id)" but got a MySQL error: Invalid use of group function

Can anyone recommend how best to query the database for a list of pages which changed since the current state of the working copy, and whether those pages are newly created?

Much thanks, Jack

Received on Wed Oct 3 14:09:01 2007
Do you need more help?X

This archive was generated by hypermail 2.1.8 : Sun Oct 07 2007 - 10:13:40 EDT


Contact Us  Legal Notices  Order Services Online 
Pantek Home  Privacy Policy  IT news  Site Map  Pantek Library