- •Table of Contents
- •Foreword
- •Preface
- •Audience
- •How to Read this Book
- •Conventions Used in This Book
- •Typographic Conventions
- •Icons
- •Organization of This Book
- •New in Subversion 1.1
- •This Book is Free
- •Acknowledgments
- •From Ben Collins-Sussman
- •From Brian W. Fitzpatrick
- •From C. Michael Pilato
- •Chapter 1. Introduction
- •What is Subversion?
- •Subversion's History
- •Subversion's Features
- •Subversion's Architecture
- •Installing Subversion
- •Subversion's Components
- •A Quick Start
- •Chapter 2. Basic Concepts
- •The Repository
- •Versioning Models
- •The Problem of File-Sharing
- •The Lock-Modify-Unlock Solution
- •The Copy-Modify-Merge Solution
- •Subversion in Action
- •Working Copies
- •Revisions
- •How Working Copies Track the Repository
- •The Limitations of Mixed Revisions
- •Summary
- •Chapter 3. Guided Tour
- •Help!
- •Import
- •Revisions: Numbers, Keywords, and Dates, Oh My!
- •Revision Numbers
- •Revision Keywords
- •Revision Dates
- •Initial Checkout
- •Basic Work Cycle
- •Update Your Working Copy
- •Make Changes to Your Working Copy
- •Examine Your Changes
- •svn status
- •svn diff
- •svn revert
- •Resolve Conflicts (Merging Others' Changes)
- •Merging Conflicts by Hand
- •Copying a File Onto Your Working File
- •Punting: Using svn revert
- •Commit Your Changes
- •Examining History
- •svn diff
- •Examining Local Changes
- •Comparing Working Copy to Repository
- •Comparing Repository to Repository
- •svn list
- •A Final Word on History
- •Other Useful Commands
- •svn cleanup
- •svn import
- •Summary
- •Chapter 4. Branching and Merging
- •What's a Branch?
- •Using Branches
- •Creating a Branch
- •Working with Your Branch
- •The Key Concepts Behind Branches
- •Copying Changes Between Branches
- •Copying Specific Changes
- •The Key Concept Behind Merging
- •Best Practices for Merging
- •Tracking Merges Manually
- •Previewing Merges
- •Merge Conflicts
- •Noticing or Ignoring Ancestry
- •Common Use-Cases
- •Merging a Whole Branch to Another
- •Undoing Changes
- •Resurrecting Deleted Items
- •Common Branching Patterns
- •Release Branches
- •Feature Branches
- •Switching a Working Copy
- •Tags
- •Creating a Simple Tag
- •Creating a Complex Tag
- •Branch Maintenance
- •Repository Layout
- •Data Lifetimes
- •Summary
- •Chapter 5. Repository Administration
- •Repository Basics
- •Understanding Transactions and Revisions
- •Unversioned Properties
- •Repository Data-Stores
- •Berkeley DB
- •FSFS
- •Repository Creation and Configuration
- •Hook Scripts
- •Berkeley DB Configuration
- •Repository Maintenance
- •An Administrator's Toolkit
- •svnlook
- •svnadmin
- •svndumpfilter
- •svnshell.py
- •Berkeley DB Utilities
- •Repository Cleanup
- •Managing Disk Space
- •Repository Recovery
- •Migrating a Repository
- •Repository Backup
- •Adding Projects
- •Choosing a Repository Layout
- •Creating the Layout, and Importing Initial Data
- •Summary
- •Chapter 6. Server Configuration
- •Overview
- •Network Model
- •Requests and Responses
- •Client Credentials Caching
- •svnserve, a custom server
- •Invoking the Server
- •Built-in authentication and authorization
- •Create a 'users' file and realm
- •Set access controls
- •SSH authentication and authorization
- •SSH configuration tricks
- •Initial setup
- •Controlling the invoked command
- •httpd, the Apache HTTP server
- •Prerequisites
- •Basic Apache Configuration
- •Authentication Options
- •Basic HTTP Authentication
- •SSL Certificate Management
- •Authorization Options
- •Blanket Access Control
- •Per-Directory Access Control
- •Disabling Path-based Checks
- •Extra Goodies
- •Repository Browsing
- •Other Features
- •Supporting Multiple Repository Access Methods
- •Chapter 7. Advanced Topics
- •Runtime Configuration Area
- •Configuration Area Layout
- •Configuration and the Windows Registry
- •Configuration Options
- •Servers
- •Config
- •Properties
- •Why Properties?
- •Manipulating Properties
- •Special Properties
- •svn:executable
- •svn:mime-type
- •svn:ignore
- •svn:keywords
- •svn:eol-style
- •svn:externals
- •svn:special
- •Automatic Property Setting
- •Peg and Operative Revisions
- •Externals Definitions
- •Vendor branches
- •General Vendor Branch Management Procedure
- •svn_load_dirs.pl
- •Localization
- •Understanding locales
- •Subversion's use of locales
- •Subversion Repository URLs
- •Chapter 8. Developer Information
- •Layered Library Design
- •Repository Layer
- •Repository Access Layer
- •RA-DAV (Repository Access Using HTTP/DAV)
- •RA-SVN (Custom Protocol Repository Access)
- •RA-Local (Direct Repository Access)
- •Your RA Library Here
- •Client Layer
- •Using the APIs
- •The Apache Portable Runtime Library
- •URL and Path Requirements
- •Using Languages Other than C and C++
- •Inside the Working Copy Administration Area
- •The Entries File
- •Pristine Copies and Property Files
- •WebDAV
- •Programming with Memory Pools
- •Contributing to Subversion
- •Join the Community
- •Get the Source Code
- •Become Familiar with Community Policies
- •Make and Test Your Changes
- •Donate Your Changes
- •Chapter 9. Subversion Complete Reference
- •The Subversion Command Line Client: svn
- •svn Switches
- •svn Subcommands
- •svn blame
- •svn checkout
- •svn cleanup
- •svn commit
- •svn copy
- •svn delete
- •svn diff
- •svn export
- •svn help
- •svn list
- •svn merge
- •svn mkdir
- •svn move
- •svn propedit
- •svn proplist
- •svn resolved
- •svn revert
- •svn status
- •svn switch
- •svn update
- •svnadmin
- •svnadmin Switches
- •svnadmin Subcommands
- •svnadmin create
- •svnadmin deltify
- •svnadmin dump
- •svnadmin help
- •svnadmin list-dblogs
- •svnadmin list-unused-dblogs
- •svnadmin load
- •svnadmin lstxns
- •svnadmin recover
- •svnadmin rmtxns
- •svnadmin setlog
- •svnadmin verify
- •svnlook
- •svnlook Switches
- •svnlook
- •svnlook author
- •svnlook changed
- •svnlook date
- •svnlook help
- •svnlook history
- •svnlook tree
- •svnlook uuid
- •svnserve
- •svnserve Switches
- •svnversion
- •svnversion
- •mod_dav_svn Configuration Directives
- •Appendix A. Subversion for CVS Users
- •Revision Numbers Are Different Now
- •Directory Versions
- •More Disconnected Operations
- •Distinction Between Status and Update
- •Branches and Tags
- •Metadata Properties
- •Conflict Resolution
- •Binary Files and Translation
- •Versioned Modules
- •Authentication
- •Converting a Repository from CVS to Subversion
- •Appendix B. Troubleshooting
- •Common Problems
- •Problems Using Subversion
- •Every time I try to access my repository, my Subversion client just hangs.
- •Every time I try to run svn, it says my working copy is locked.
- •I'm getting errors finding or opening a repository, but I know my repository URL is correct.
- •How can I specify a Windows drive letter in a file:// URL?
- •I'm having trouble doing write operations to a Subversion repository over a network.
- •Under Windows XP, the Subversion server sometimes seems to send out corrupted data.
- •What is the best method of doing a network trace of the conversation between a Subversion client and Apache server?
- •Why does the svn revert command require an explicit target? Why is it not recursive by default? This behavior differs from almost all the other subcommands.
- •On FreeBSD, certain operations (especially svnadmin create) sometimes hang.
- •I can see my repository in a web browser, but svn checkout gives me an error about 301 Moved Permanently.
- •Appendix C. WebDAV and Autoversioning
- •Basic WebDAV Concepts
- •Just Plain WebDAV
- •DeltaV Extensions
- •Subversion and DeltaV
- •Mapping Subversion to DeltaV
- •Autoversioning Support
- •The mod_dav_lock Alternative
- •Autoversioning Interoperability
- •Win32 WebFolders
- •Unix: Nautilus 2
- •Linux davfs2
- •Appendix D. Third Party Tools
- •Clients and Plugins
- •Language Bindings
- •Repository Converters
- •Higher Level Tools
- •Repository Browsing Tools
- •Appendix E. Copyright
Repository Administration
someone else, perhaps a manager or engineering QA team, who can choose to promote the transaction into a revision, or abort it.
Unversioned Properties
Transactions and revisions in the Subversion repository can have properties attached to them. These properties are generic key-to-value mappings, and are generally used to store information about the tree to which they are attached. The names and values of these properties are stored in the repository's filesystem, along with the rest of your tree data.
Revision and transaction properties are useful for associating information with a tree that is not strictly related to the files and directories in that tree—the kind of information that isn't managed by client working copies. For example, when a new commit transaction is created in the repository, Subversion adds a property to that transaction named svn:date—a datestamp representing the time that the transaction was created. By the time the commit process is finished, and the transaction is promoted to a permanent revision, the tree has also been given a property to store the username of the revision's author (svn:author) and a property to store the log message attached to that revision (svn:log).
Revision and transaction properties are unversioned properties—as they are modified, their previous values are permanently discarded. Also, while revision trees themselves are immutable, the properties attached to those trees are not. You can add, remove, and modify revision properties at any time in the future. If you commit a new revision and later realize that you had some misinformation or spelling error in your log message, you can simply replace the value of the svn:log property with a new, corrected log message.
Repository Data-Stores
As of Subversion 1.1, there are two options for storing data in a Subversion repository. One type of repository stores everything in a Berkeley DB database; the other kind stores data in ordinary flat files, using a custom format. Because Subversion developers often refer to a repository as “The [Versioned] Filesystem”, they have adopted the habit of referring to the latter type of repository as FSFS: that is, it's a versioned filesystem implementation that uses the native OS filesystem to store data.
When a repository is created, an administrator must decide whether it will use Berkeley DB or FSFS. There are advantages and disadvantages to each, which we'll describe in a bit. Neither back-end is more “official” than another, and programs which access the repository are insulated from this implementation detail. Programs have no idea how a repository is storing data; they only see revision and transaction trees through the repository API.
Here is a table that gives a comparative overview of Berkeley DB and FSFS repositories. The next sections go into detail.
Table 5.1. Repository Data-Store Comparison
Feature |
Berkeley DB |
FSFS |
|
|
|
Sensitivity to interruptions |
very; crashes and permission prob- |
quite insensitive. |
|
lems can leave the database |
|
|
“wedged”, requiring journaled recov- |
|
|
ery procedures. |
|
|
|
|
Usable from a read-only mount |
no |
yes |
|
|
|
Platform-independent storage |
no |
yes |
|
|
|
Usable over network filesystems |
no |
yes |
|
|
|
Repository size |
slightly larger |
slightly smaller |
|
|
|
Scalability: number of revision trees |
database; no problems |
some older native filesystems don't |
|
|
scale well with thousands of entries |
|
|
in a single directory. |
|
|
|
66
Repository Administration
Feature |
|
Berkeley DB |
FSFS |
|
|
|
|
Scalability: directories with |
many |
slower |
faster |
files |
|
|
|
|
|
|
|
Speed: checking out latest code |
|
faster |
slower |
|
|
|
|
Speed: large commits |
|
slower, but work is spread throughout |
faster, but finalization delay may |
|
|
commit |
cause client timeouts |
|
|
|
|
Group permissions handling |
|
sensitive to user umask problems; |
works around umask problems |
|
|
best if accessed by only one user. |
|
|
|
|
|
Code maturity |
|
in use since 2001 |
in use since 2004 |
|
|
|
|
Berkeley DB
When the initial design phase of Subversion was in progress, the developers decided to use Berkeley DB for a variety of reasons, including its open-source license, transaction support, reliability, performance, API simplicity, thread-safety, support for cursors, and so on.
Berkeley DB provides real transaction support—perhaps its most powerful feature. Multiple processes accessing your Subversion repositories don't have to worry about accidentally clobbering each other's data. The isolation provided by the transaction system is such that for any given operation, the Subversion repository code sees a static view of the database—not a database that is constantly changing at the hand of some other process—and can make decisions based on that view. If the decision made happens to conflict with what another process is doing, the entire operation is rolled back as if it never happened, and Subversion gracefully retries the operation against a new, updated (and yet still static) view of the database.
Another great feature of Berkeley DB is hot backups—the ability to backup the database environment without taking it “offline”. We'll discuss how to backup your repository in the section called “Repository Backup”, but the benefits of being able to make fully functional copies of your repositories without any downtime should be obvious.
Berkeley DB is also a very reliable database system. Subversion uses Berkeley DB's logging facilities, which means that the database first writes to on-disk log files a description of any modifications it is about to make, and then makes the modification itself. This is to ensure that if anything goes wrong, the database system can back up to a previous checkpoint—a location in the log files known not to be corrupt—and replay transactions until the data is restored to a usable state. See the section called “Managing Disk Space” for more about Berkeley DB log files.
But every rose has its thorn, and so we must note some known limitations of Berkeley DB. First, Berkeley DB environments are not portable. You cannot simply copy a Subversion repository that was created on a Unix system onto a Windows system and expect it to work. While much of the Berkeley DB database format is architecture independent, there are other aspects of the environment that are not. Secondly, Subversion uses Berkeley DB in a way that will not operate on Windows 95/98 systems—if you need to house a repository on a Windows machine, stick with Windows 2000 or Windows XP. Also, you should never keep a Berkeley DB repository on a network share. While Berkeley DB promises to behave correctly on network shares that meet a particular set of specifications, almost no known shares actually meet all those specifications.
Finally, because Berkeley DB is a library linked directly into Subversion, it's more sensitive to interruptions than a typical relational database system. Most SQL systems, for example, have a dedicated server process that mediates all access to tables. If a program accessing the database crashes for some reason, the database daemon notices the lost connection and clean up any mess left behind. And because the database daemon is the only process accessing the tables, applications don't need to worry about permission conflicts. These things are not the case with Berkeley DB, however. Subversion (and programs using Subversion libraries) access the database tables directly, which means that a program crash can leave the database in a temporarily inconsistent, inaccessible state. When this happens, an administrator needs to ask Berkeley DB to restore to a checkpoint, which is a bit of an annoyance. Other things can cause a repository to “wedge” besides crashed processes, such as programs conflicting over ownership and permissions on the database files. So while a BerkeleyDB repository is quite fast and scalable, it's best used by a single server process running as one user—such as Apache's httpd or svnserve (see Chapter 6, Server Configuration)—rather than accessing it as many different users via file:/// or svn+ssh:// URLs. If using a
67
