<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
          "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<article>
  <articleinfo>
    <authorinitials>wft</authorinitials>
    <author>
      <firstname>Walter</firstname>
      <othername>F.</othername>
      <surname>Tichy</surname>
      <affiliation>
        <orgname>Purdue University</orgname>
        <orgdiv>Department of Computer Sciences</orgdiv>
        <address><city>West Lafayette</city>, <state>Indiana</state> <postcode>47907</postcode></address>
      </affiliation>
    </author>
    <artpagenums>637-654</artpagenums>
    <volumenum>15</volumenum>
    <issuenum>7</issuenum>
    <publisher>
      <publishername>Software&mdash;Practice &amp;
      Experience</publishername>
    </publisher>
    <pubdate>July 1985</pubdate>
    <title>RCS&mdash;A System for Version Control</title>
    <titleabbrev>Tichy85</titleabbrev>
    <revhistory>
      <revision>
        <revnumber>1.0</revnumber>
        <date>July 1985</date>
	<authorinitials>wft</authorinitials>
        <revremark>Published.</revremark>
      </revision>
      <revision>
        <revnumber>1.1</revnumber>
        <date>1 June 1995</date>
        <revremark>Included with free software distribution.</revremark>
      </revision>
      <revision>
        <revnumber>1.2</revnumber>
        <date>July 2004</date>
	<authorinitials>ashawley</authorinitials>
	<revremark>Converted to DocBook XML 4.2 with the help of groff.</revremark>
    <!-- Creator     : groff version 1.18.1 -->
    <!-- CreationDate: Thu Jul 15 12:07:11 2004 -->
      </revision>
    </revhistory>
    <keywordset>
      <keyword>configuration management</keyword>
      <keyword>history</keyword>
      <keyword>management</keyword>
      <keyword>version control</keyword>
      <keyword>revisions</keyword>
      <keyword>deltas</keyword>
    </keywordset>
  </articleinfo>
  <abstract>
    <para>An important problem in program development and
    maintenance is version control, i.e., the task of keeping a
    software system consisting of many versions and configurations
    well organized. The Revision Control System (RCS) is a software
    tool that assists with that task. RCS manages revisions of text
    documents, in particular source programs, documentation, and
    test data. It automates the storing, retrieval, logging and
    identification of revisions, and it provides selection
    mechanisms for composing configurations. This paper introduces
    basic version control concepts and discusses the practice of
    version control using RCS. For conserving space, RCS stores
    deltas, i.e., differences between successive revisions. Several
    delta storage methods are discussed. Usage statistics show that
    RCS&rsquo;s delta storage method is space and time efficient.
    The paper concludes with a detailed survey of version control
    tools.</para>
  </abstract>
  <section>
    <title>Introduction</title>
    <para>Version control is the task of keeping software systems
    consisting of many versions and configurations well organized.
    The Revision Control System (RCS) is a set of UNIX commands
    that assist with that task.</para>
    <para>RCS&rsquo;s primary function is to manage 
    <emphasis>revision groups</emphasis>. A revision group is a set
    of text documents, called 
    <emphasis>revisions</emphasis>, that evolved from each other. A
    new revision is created by manually editing an existing one.
    RCS organizes the revisions into an ancestral tree. The initial
    revision is the root of the tree, and the tree edges indicate
    from which revision a given one evolved. Besides managing
    individual revision groups, RCS provides flexible selection
    functions for composing configurations. RCS may be combined
    with MAKE
    <xref linkend="feldman1979make" />, resulting in a powerful
    package for version control.</para>
    <para>RCS also offers facilities for merging updates with
    customer modifications, for distributed software development,
    and for automatic identification. Identification is the
    &lsquo;stamping&rsquo; of revisions and configurations with
    unique markers. These markers are akin to serial numbers,
    telling software maintainers unambiguously which configuration
    is before them.</para>
    <para>RCS is designed for both production and experimental
    environments. In production environments, access controls
    detect update conflicts and prevent overlapping changes. In
    experimental environments, where strong controls are
    counterproductive, it is possible to loosen the
    controls.</para>
    <para>Although RCS was originally intended for programs, it is
    useful for any text that is revised frequently and whose
    previous revisions must be preserved. RCS has been applied
    successfully to store the source text for drawings, VLSI
    layouts, documentation, specifications, test data, form letters
    and articles.</para>
    <para>This paper discusses the practice of version control
    using RCS. It also introduces basic version control concepts,
    useful for clarifying current practice and designing similar
    systems. Revision groups of individual components are treated
    in the next three sections, and the extensions to
    configurations follow. Because of its size, a survey of version
    control tools appears at the end of the paper.</para>
  </section>
  <section>
    <title>Getting started with RCS</title>
    <para>Suppose a text file 
    <filename>f.c</filename> is to be placed under control of RCS.
    Invoking the check-in command</para>
    <synopsis><userinput><command>ci</command> <filename>f.c</filename></userinput></synopsis>
    <para>creates a new revision group with the contents of 
    <filename>f.c</filename> as the initial revision (numbered 1.1)
    and stores the group into the file 
    <filename>f.c,v</filename>. Unless told otherwise, the command
    deletes 
    <filename>f.c</filename>. It also asks for a description of the
    group. The description should state the common purpose of all
    revisions in the group, and becomes part of the group&rsquo;s
    documentation. All later check-in commands will ask for a log
    entry, which should summarize the changes made. (The first
    revision is assigned a default log message, which just records
    the fact that it is the initial revision.)</para>
    <para>Files ending in 
    <filename role="extension">,v</filename> are called 
    <emphasis>RCS files</emphasis> (<literal>v</literal> stands for 
    <emphasis>v</emphasis>ersions); the others are called working
    files. To get back the working file 
    <filename>f.c</filename> in the previous example, execute the
    check-out command:</para>
    <synopsis><userinput><command>co</command> <filename>f.c</filename></userinput></synopsis>
    <para>This command extracts the latest revision from the
    revision group 
    <filename>f.c,v</filename> and writes it into 
    <filename>f.c</filename>. The file 
    <filename>f.c</filename> can now be edited and, when finished,
    checked back in with 
    <command>ci</command>:</para>
    <synopsis><userinput><command>ci</command> <filename>f.c</filename></userinput></synopsis>
    <para>The command
    <command>ci</command> assigns number 1.2 to the new revision. If
    <command>ci</command> complains with the message</para>
    <para>
      <computeroutput>ci error: no lock set by &lt;login&gt;</computeroutput>
    </para>
    <para>then the system administrator has decided to configure
    RCS for a production environment by enabling the &lsquo;strict
    locking feature&rsquo;. If this feature is enabled, all RCS
    files are initialized such that check-in operations require a
    lock on the previous revision (the one from which the current
    one evolved). Locking prevents overlapping modifications if
    several people work on the same file. If locking is required,
    the revision should have been locked during the check-out by
    using the option 
    <literal>-l</literal>:</para>
    <synopsis><userinput><command>co</command> -l <filename>f.c</filename></userinput></synopsis>
    <para>Of course it is too late now for the check-out with
    locking, because 
    <filename>f.c</filename> has already been changed; checking out
    the file again would overwrite the modifications. (To prevent
    accidental overwrites, 
    <command>co</command> senses the presence of a working file and
    asks whether the user really intended to overwrite it. The
    overwriting check-out is sometimes useful for backing up to the
    previous revision.) To be able to proceed with the check-in in
    the present case, first execute</para>
    <synopsis><userinput><command>rcs</command> -l <filename>f.c</filename></userinput></synopsis>
    <para>This command retroactively locks the latest revision,
    unless someone else locked it in the meantime. In this case,
    the two programmers involved have to negotiate whose
    modifications should take precedence.</para>
    <para>If an RCS file is private, i.e., if only the owner of the
    file is expected to deposit revisions into it, the strict
    locking feature is unnecessary and may be disabled. If strict
    locking is disabled, the owner of the RCS file need not have a
    lock for check-in. For safety reasons, all others still do.
    Turning strict locking off and on is done with the
    commands:</para>
    <para><userinput><command>rcs</command> -U <filename>f.c</filename></userinput> and <userinput><command>rcs</command> -L <filename>f.c</filename></userinput>, respectively.</para>
    <para>These commands enable or disable the strict locking
    feature for each RCS file individually. The system
    administrator only decides whether strict locking is enabled
    initially.</para>
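    <para>The check-in rule described above can be summarized in a
    few lines. The following Python sketch is an illustration only
    and is not taken from the RCS sources; the function
    <literal>may_check_in</literal> and its parameters are
    hypothetical names chosen for this example.</para>
    <literallayout class="monospaced"><!-- code language="Python" -->
def may_check_in(user: str, owner: str, holds_lock: bool, strict: bool) -> bool:
    """Decide whether user may deposit a new revision (hypothetical helper).

    owner      -- login of the owner of the RCS file
    holds_lock -- True if user has locked the previous revision
    strict     -- True if strict locking is enabled for this RCS file
    """
    if holds_lock:
        return True
    # Without a lock, only the owner may check in, and only if strict
    # locking has been turned off for this file.
    return (not strict) and user == owner
<!-- /code --></literallayout>
    <para>Under these assumptions, the owner of a private RCS file
    with strict locking disabled can check in without a lock, while
    every other user is still refused until a lock has been
    obtained.</para>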
    <para>To reduce the clutter in a working directory, all RCS
    files can be moved to a subdirectory with the name 
    <literal>RCS</literal>. RCS commands look first into that
    directory for RCS files. All the commands presented above work
    with the 
    <literal>RCS</literal> subdirectory without
      change.<footnote><para>Pairs of RCS and working files can actually be specified in 3 ways:
a) both are given, b) only the working file is given, c) only the
RCS file is given.
If a pair is given, both files may have arbitrary path prefixes;
	  RCS commands pair them up intelligently.</para></footnote></para>
    <para>It may be undesirable that 
    <command>ci</command> deletes the working file. For instance,
    sometimes one would like to save the current revision, but
    continue editing. Invoking</para>
    <synopsis><userinput><command>ci</command> -l <filename>f.c</filename></userinput></synopsis>
    <para>checks in 
    <filename>f.c</filename> as usual, but performs an additional
    check-out with locking afterwards. Thus, the working file does
    not disappear after the check-in. Similarly, the option 
    <literal>-u</literal> does a check-in followed by a
    check-out without locking. This option is useful if the file is
    needed for compilation after the check-in. Both options update
    the identification markers in the working file (see
    below).</para>
    <para>Besides the operations 
    <command>ci</command> and 
    <command>co</command>, RCS provides the following
    commands:</para>
    <variablelist>
      <varlistentry>
        <term><command>ident</command></term>
        <listitem>
          <para>extract identification markers</para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><command>rcs</command></term>
        <listitem>
          <para>change RCS file attributes</para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><command>rcsclean</command></term>
        <listitem>
          <para>remove unchanged working files (optional)</para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><command>rcsdiff</command></term>
        <listitem>
          <para>compare revisions</para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><command>rcsfreeze</command></term>
        <listitem>
          <para>record a configuration (optional)</para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><command>rcsmerge</command></term>
        <listitem>
          <para>merge revisions</para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><command>rlog</command></term>
        <listitem>
          <para>read log messages and other information in RCS
          files</para>
        </listitem>
      </varlistentry>
    </variablelist>
    <para>A synopsis of these commands appears in
    <xref linkend="appendix"/>.</para>
    <section>
      <title>Automatic Identification</title>
      <para>RCS can stamp source and object code with special
      identification strings, similar to product and serial
      numbers. To obtain such identification, place the
      marker</para>
      <screen><literal>$Id$</literal></screen>
      <para>into the text of a revision, for instance inside a
      comment. The check-out operation will replace this marker
      with a string of the form</para>
      <screen><literal>$Id: filename revision number date time author state locker $</literal></screen>
      <para>This string need never be touched, because 
      <command>co</command> keeps it up to date automatically. To
      propagate the marker into object code, simply put it into a
      literal character string. In C, this is done as
      follows:</para>
      <literallayout class="monospaced"><!-- code language="C" -->static char rcsid[] = "$Id$";<!-- /code --></literallayout>
      <para>The command 
      <command>ident</command> extracts such markers from any file,
      in particular from object code. 
      <command>ident</command> helps to find out which revisions of
      which modules were used in a given program. It returns a
      complete and unambiguous component list, from which a copy of
      the program can be reconstructed. This facility is invaluable
      for program maintenance.</para>
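      <para>How such markers might be located in arbitrary file
      contents can be sketched as follows. The Python fragment is a
      minimal illustration, not the algorithm used by
      <command>ident</command>; the regular expression encodes an
      assumed marker syntax (a dollar sign, a keyword, a colon, some
      text, and a closing dollar sign), and the field values in the
      sample data are invented.</para>
      <literallayout class="monospaced"><!-- code language="Python" -->
import re

# Assumed marker syntax for this sketch: "$", a keyword, ":", text, "$".
MARKER = re.compile(rb"\$[A-Za-z]+:[^$\n]*\$")


def find_markers(data: bytes) -> list:
    """Return all expanded markers found anywhere in data."""
    return [m.group(0).decode("ascii", "replace") for m in MARKER.finditer(data)]


# Simulated object code: arbitrary bytes around one embedded marker.
# The field values inside the marker are invented for the example.
blob = b"\x7f\x00\x01$Id: f.c 1.2 1985/07/01 12:00:00 wft experimental $\x00\x02"
markers = find_markers(blob)
<!-- /code --></literallayout>
      <para>In this sketch a marker that has not yet been expanded
      by a check-out, such as the bare $Id$, contains no colon and
      is therefore not reported.</para>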
      <para>There are several additional identification markers,
      one for each component of $Id$. The marker</para>
      <screen><literal>$Log$</literal></screen>
      <para>has a similar function. It accumulates the log messages
      that are requested during check-in. Thus, one can maintain
      the complete history of a revision directly inside it, by
      enclosing the marker in a comment. <xref linkend="fig1"/> is an edited version of a
      log contained in revision 4.1 of the file 
      <filename>ci.c</filename>. The log appears at the beginning
      of the file, and makes it easy to determine what the recent
      modifications were.</para>
      <figure id="fig1">
        <title>Log entries produced by the marker $Log$.</title>
        <literallayout class="monospaced"><!-- code language="C" -->
     /*
      * $Log: ci.c,v $
      * Revision 4.1  1983/05/10 17:03:06  wft
      * Added option -d and -w, and updated assignment of date, etc. to new delta.
      * Added handling of default branches.
      *
      * Revision 3.9  1983/02/15 15:25:44  wft
      * Added call to fastcopy() to copy remainder of RCS file.
      *
      * Revision 3.8  1983/01/14 15:34:05  wft
      * Added ignoring of interrupts while new RCS file is renamed;
      * avoids deletion of RCS files by interrupts.
      *
      * Revision 3.7  1982/12/10 16:09:20  wft
      * Corrected checking of return code from diff.
      * An RCS file now inherits its mode during the first ci from the working file,
      * except that write permission is removed.
      */
        <!-- /code --></literallayout>
      </figure>
      <para>Since revisions are stored in the form of differences,
      each log message is physically stored once, independent of
      the number of revisions present. Thus, the $Log$ marker
      incurs negligible space overhead.</para>
    </section>
  </section>
  <section id="rcsrevisiontree">
    <title>The RCS Revision Tree</title>
    <para>RCS arranges revisions in an ancestral tree. The 
    <command>ci</command> command builds this tree; the auxiliary
    command 
    <command>rcs</command> prunes it. The tree has a root revision,
    normally numbered 1.1, and successive revisions are numbered
    1.2, 1.3, etc. The first field of a revision number is called
    the 
    <emphasis>release number</emphasis> and the second one the 
    <emphasis>level number</emphasis>. Unless given explicitly, the
    
    <command>ci</command> command assigns a new revision number by
    incrementing the level number of the previous revision. The
    release number must be incremented explicitly, using the 
    <literal>-r</literal> option of 
    <command>ci</command>. Assuming there are revisions 1.1, 1.2,
    and 1.3 in the RCS file <filename>f.c,v</filename>, the command</para>
    <para>
    <userinput><command>ci</command> -r2.1 <filename>f.c</filename></userinput>
      or
      <userinput><command>ci</command> -r2 <filename>f.c</filename></userinput></para>
    <para>assigns the number 2.1 to the new revision. Later
    check-ins without the 
    <literal>-r</literal> option will assign the numbers 2.2,
    2.3, and so on. The release number should be incremented only
    at major transition points in the development, for instance
    when a new release of a software product has been
    completed.</para>
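    <para>The numbering rule for the trunk can be sketched in a few
    lines of Python. This is an illustration under stated
    assumptions, not code from RCS: the function name
    <literal>next_trunk_revision</literal> is hypothetical, only the
    short form of the option (a release number alone) is modeled,
    and the rejection of a lower release number is an assumption of
    the sketch.</para>
    <literallayout class="monospaced"><!-- code language="Python" -->
from typing import Optional


def next_trunk_revision(previous: str, release: Optional[int] = None) -> str:
    """Number assigned to the next revision on the trunk.

    previous -- latest trunk revision, for example "1.3"
    release  -- release number requested explicitly, or None
    """
    prev_release, prev_level = (int(field) for field in previous.split("."))
    if release is None or release == prev_release:
        # Default: keep the release number, increment the level number.
        return f"{prev_release}.{prev_level + 1}"
    if release > prev_release:
        # A new release starts at level 1.
        return f"{release}.1"
    # Assumption of this sketch: a lower release number is rejected.
    raise ValueError("release number must not decrease on the trunk")
<!-- /code --></literallayout>
    <para>With revisions 1.1 to 1.3 present, a call without a
    release number yields 1.4, a call with release 2 yields 2.1, and
    a later call without a release number yields 2.2.</para>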
    <section>
      <title>When are branches needed?</title>
      <para>A young revision tree is slender: It consists of only
      one branch, called the trunk. As the tree ages, side branches
      may form. Branches are needed in the following 4
      situations.</para>
      <variablelist>
        <varlistentry>
          <term>
            <emphasis>Temporary fixes</emphasis>
          </term>
          <listitem>
            <para>Suppose a tree has 5 revisions grouped in 2
            releases, as illustrated in <xref linkend="fig2"/>. Revision 1.3, the
            last one of release 1, is in operation at customer
            sites, while release 2 is in active development.</para>
	    <figure id="fig2">
	      <title>A slender revision tree.</title>
	      <mediaobject>
		<imageobject><imagedata fileref="rcs-fig2.png" format="PNG"/></imageobject>
	      </mediaobject>
	    </figure>
            <para>Now imagine a customer requesting a fix of a
            problem in revision 1.3, although actual development
            has moved on to release 2. RCS does not permit an extra
            revision to be spliced in between 1.3 and 2.1, since
            that would not reflect the actual development history.
            Instead, create a branch at revision 1.3, and check in
            the fix on that branch. The first branch starting at
            1.3 has number 1.3.1, and the revisions on that branch
            are numbered 1.3.1.1, 1.3.1.2, etc. The double
            numbering is needed to allow for another branch at 1.3,
            say 1.3.2. Revisions on the second branch would be
            numbered 1.3.2.1, 1.3.2.2, and so on. The following
            steps create branch 1.3.1 and add revision
            1.3.1.1:</para>
            <screen>
              <prompt>$</prompt> <userinput><command>co</command> -r1.3 <filename>f.c</filename></userinput> <lineannotation>&mdash; check out revision 1.3</lineannotation>
              <prompt>$</prompt> <userinput><command>edit</command> <filename>f.c</filename></userinput> <lineannotation>&mdash; change it</lineannotation>
              <prompt>$</prompt> <userinput><command>ci</command> -r1.3.1 <filename>f.c</filename></userinput> <lineannotation>&mdash; check it in on branch 1.3.1</lineannotation>
            </screen>
            <para>This sequence of commands transforms the tree of
            <xref linkend="fig2"/> into the one in <xref linkend="fig3"/>. Note that it may be
            necessary to incorporate the differences between 1.3
            and 1.3.1.1 into a revision of release 2. The operation 
            <command>rcsmerge</command> automates this process (see
	      <xref linkend="appendix"/>).</para>
	    <figure id="fig3">
	      <title>A revision tree with one side branch.</title>
	      <mediaobject>
		<imageobject><imagedata fileref="rcs-fig3.png" format="PNG"/></imageobject>
	      </mediaobject>
	    </figure>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>
            <emphasis>Distributed development and customer
            modifications</emphasis>
          </term>
          <listitem>
            <para>Assume a situation as in <xref linkend="fig2"/>, where revision
            1.3 is in operation at several customer sites, while
            release 2 is in development. Customer sites should use
            RCS to store the distributed software. However,
            customer modifications should not be placed on the same
            branch as the distributed source; instead, they should
            be placed on a side branch. When the next software
            distribution arrives, it should be appended to the
            trunk of the customer&rsquo;s RCS file, and the
            customer can then merge the local modifications back
            into the new release. In the above example, a
            customer&rsquo;s RCS file would contain the following
            tree, assuming that the customer has received revision
            1.3, added his local modifications as revision 1.3.1.1,
            then received revision 2.4, and merged 2.4 and 1.3.1.1,
            resulting in 2.4.1.1.</para>
	    <figure id="fig4">
	      <title>A customer&rsquo;s revision tree with
		local modifications.</title>
	      <mediaobject>
		<imageobject><imagedata fileref="rcs-fig4.png" format="PNG"/></imageobject>
	      </mediaobject>
	    </figure>
            <para>This approach is actually practiced in the CSNET
            project, where several universities and a company
            cooperate in developing a national computer
            network.</para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>
            <emphasis>Parallel development</emphasis>
          </term>
          <listitem>
            <para>Sometimes it is desirable to explore an alternate
            design or a different implementation technique in
            parallel with the main line development. Such
            development should be carried out on a side branch. The
            experimental changes may later be moved into the main
            line, or abandoned.</para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>
            <emphasis>Conflicting updates</emphasis>
          </term>
          <listitem>
            <para>A common occurrence is that one programmer has
            checked out a revision, but cannot complete the
            assignment for some reason. In the meantime, another
            person must perform another modification immediately.
            In that case, the second person should check out the
            same revision, modify it, and check it in on a side
            branch, for later merging.</para>
          </listitem>
        </varlistentry>
      </variablelist>
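      <para>The double numbering of branches and branch revisions
      described under temporary fixes can be illustrated with a
      short Python sketch. The helper names
      <literal>next_branch_number</literal> and
      <literal>revision_on_branch</literal> are hypothetical; the
      sketch only manipulates numbers and does not touch an RCS
      file.</para>
      <literallayout class="monospaced"><!-- code language="Python" -->
def next_branch_number(fork: str, existing_branches: list) -> str:
    """Number of the next branch starting at revision fork.

    existing_branches lists branch numbers already in use, such as "1.3.1".
    """
    fork_fields = fork.split(".")
    used = [
        int(branch.split(".")[-1])
        for branch in existing_branches
        # Keep only branches that start directly at the fork revision.
        if branch.split(".")[:-1] == fork_fields
    ]
    return f"{fork}.{max(used, default=0) + 1}"


def revision_on_branch(branch: str, sequence: int) -> str:
    """Number of the revision at position sequence (1, 2, ...) on a branch."""
    return f"{branch}.{sequence}"
<!-- /code --></literallayout>
      <para>For the tree discussed above, the first branch at 1.3
      receives number 1.3.1 and its first revision 1.3.1.1; a second
      branch at the same revision would receive number 1.3.2.</para>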
      <para>Every node in a revision tree consists of the following
      attributes: a revision number, a check-in date and time, the
      author&rsquo;s identification, a log entry, a state and the
      actual text. All these attributes are determined at the time
      the revision is checked in. The state attribute indicates the
      status of a revision. It is set automatically to
      &lsquo;experimental&rsquo; during check-in. A revision can
      later be promoted to a higher status, for example
      &lsquo;stable&rsquo; or &lsquo;released&rsquo;. The set of
      states is user-defined.</para>
    </section>
    <section>
      <title>Revisions are represented as deltas</title>
      <para>For conserving space, RCS stores revisions in the form
      of deltas, i.e., as differences between revisions. The user
      interface completely hides this fact.</para>
      <para>A delta is a sequence of edit commands that transforms
      one string into another. The deltas employed by RCS are
      line-based, which means that the only edit commands allowed
      are insertion and deletion of lines. If a single character in
      a line is changed, the edit scripts consider the entire line
      changed. The program 
	<command>diff</command><xref linkend="hunt1976diff"/> produces a small, line-based delta
      between pairs of text files. A character-based edit script
      would take much longer to compute, and would not be
      significantly shorter.</para>
      <para>Using deltas is a classical space-time tradeoff: deltas
      reduce the space consumed, but increase access time. However,
      a version control tool should impose as little delay as
      possible on programmers. Excessive delays discourage the use
      of version controls, or induce programmers to take shortcuts
      that compromise system integrity. To gain reasonably fast
      access time for both editing and compiling, RCS arranges
      deltas in the following way. The most recent revision on the
      trunk is stored intact. All other revisions on the trunk are
      stored as reverse deltas. A reverse delta describes how to go
      backward in the development history: it produces the desired
      revision if applied to the successor of that revision. This
      implementation has the advantage that extraction of the
      latest revision is a simple and fast copy operation. Adding a
      new revision to the trunk is also fast: 
      <command>ci</command> simply adds the new revision intact,
      replaces the previous revision with a reverse delta, and
      keeps the rest of the old deltas. Thus, 
      <command>ci</command> requires the computation of only one new
      delta.</para>
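      <para>The arrangement of reverse deltas on the trunk can be
      made concrete with a small Python sketch. It is a simplified
      model, not the RCS implementation: revisions are lists of
      lines held in memory, the standard
      <literal>difflib</literal> module stands in for
      <command>diff</command>, and the command tuples and function
      names are inventions of this example.</para>
      <literallayout class="monospaced"><!-- code language="Python" -->
import difflib


def make_delta(source: list, target: list) -> list:
    """Line-based edit script that turns source into target.

    Commands are ("d", start, count) and ("a", start, lines); start
    counts lines of source from 0. Only whole lines are deleted or added.
    """
    delta = []
    matcher = difflib.SequenceMatcher(None, source, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            delta.append(("d", i1, i2 - i1))
        if tag in ("insert", "replace"):
            delta.append(("a", i2, target[j1:j2]))
    return delta


def apply_delta(source: list, delta: list) -> list:
    """Apply an edit script produced by make_delta."""
    result = list(source)
    # Work from the end so that earlier line numbers stay valid.
    for command in reversed(delta):
        if command[0] == "d":
            _, start, count = command
            del result[start:start + count]
        else:
            _, start, lines = command
            result[start:start] = lines
    return result


def check_in(head: list, reverse_deltas: list, new_revision: list) -> tuple:
    """Add new_revision to the trunk; return the new (head, reverse_deltas).

    Only one delta is computed: the one leading from the new head back
    to the old head. All older deltas are kept unchanged.
    """
    return new_revision, [make_delta(new_revision, head)] + reverse_deltas


def check_out(head: list, reverse_deltas: list, steps_back: int = 0) -> list:
    """Return the head (steps_back == 0) or an older trunk revision."""
    text = head
    for delta in reverse_deltas[:steps_back]:
        text = apply_delta(text, delta)
    return text
<!-- /code --></literallayout>
      <para>In this model, checking out the latest revision applies
      no delta at all, whereas each step back in the history costs
      one delta application.</para>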
      <para>Branches need special treatment. The naive solution
      would be to store complete copies for the tips of all
      branches. Clearly, this approach would cost too much space.
      Instead, RCS uses 
      <emphasis>forward</emphasis> deltas for branches. Regenerating
      a revision on a side branch proceeds as follows. First,
      extract the latest revision on the trunk; secondly, apply
      reverse deltas until the fork revision for the branch is
      obtained; thirdly, apply forward deltas until the desired
      branch revision is reached. <xref linkend="fig5"/> illustrates a tree with
      one side branch. Triangles pointing to the left and right
      represent reverse and forward deltas, respectively.</para>
      <figure id="fig5">
	<title>A revision tree with reverse and forward deltas.</title>
	<mediaobject>
	  <imageobject><imagedata fileref="rcs-fig5.png" format="PNG"/></imageobject>
	</mediaobject>
      </figure>
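      <para>The three regeneration steps can be sketched as a
      function that lists the deltas to be applied. The Python
      fragment below is an illustration only. The tree used in the
      example (trunk 1.1 to 2.2 with one side branch at 1.3) is an
      assumption and need not match the figure, and the sketch
      handles only branches that fork directly from the
      trunk.</para>
      <literallayout class="monospaced"><!-- code language="Python" -->
def retrieval_plan(trunk: list, branch: list, target: str) -> list:
    """List the steps needed to regenerate revision target.

    trunk  -- trunk revisions, oldest first; the last one is stored intact
    branch -- revisions of one side branch, oldest first
    """
    if target in trunk:
        fork, forward = target, []
    else:
        # The first two fields of a branch revision name the fork revision.
        fork = ".".join(target.split(".")[:2])
        forward = branch[: branch.index(target) + 1]
    plan = [("intact", trunk[-1])]
    # Walk backward along the trunk until the fork revision is reached.
    for revision in reversed(trunk[trunk.index(fork):-1]):
        plan.append(("reverse", revision))
    # Then walk forward along the side branch.
    plan.extend(("forward", revision) for revision in forward)
    return plan


# Assumed example tree, not read from an RCS file.
trunk = ["1.1", "1.2", "1.3", "2.1", "2.2"]
branch = ["1.3.1.1", "1.3.1.2"]
plan = retrieval_plan(trunk, branch, "1.3.1.2")
<!-- /code --></literallayout>
      <para>For the assumed tree, the plan for the branch tip
      consists of the intact revision 2.2, two reverse deltas and
      two forward deltas, whereas the plan for revision 2.2 itself
      contains the intact revision only.</para>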
      <para>Although implementing fast check-out for the latest
      trunk revision, this arrangement has the disadvantage that
      generation of other revisions takes time proportional to the
      number of deltas applied. For example, regenerating the
      branch tip in <xref linkend="fig5"/> requires application of five deltas
      (including the initial one). Since usage statistics show that
      the latest trunk revision is the one that is retrieved in 95
	per cent of all cases (see <xref linkend="usagestatistics"/>),
      biasing check-out time in favor of that revision results in
      significant savings. However, careful implementation of the
      delta application process is necessary to provide low
      retrieval overhead for other revisions, in particular for
      branch tips.</para>
      <para>There are several techniques for delta application. The
      naive one is to pass each delta to a general-purpose text
      editor. A prototype of RCS invoked the UNIX editor 
      <command>ed</command> both for applying deltas and for
      expanding the identification markers. Although easy to
      implement, performance was poor, owing to the high start-up
      costs and excess generality of 
      <command>ed</command>. An intermediate version of RCS used a
      special-purpose, stream-oriented editor. This technique
      reduced the cost of applying a delta to the cost of checking
      out the latest trunk revision. The reason for this behavior
      is that each delta application involves a complete pass over
      the preceding revision.</para>
      <para>However, there is a much better algorithm. Note that
      the deltas are line oriented and that most of the work of a
      stream editor involves copying unchanged lines from one
      revision to the next. A faster algorithm avoids unnecessary
      copying of character strings by using a 
      <emphasis>piece table</emphasis>. A piece table is a
      one-dimensional array, specifying how a given revision is
      &lsquo;pieced together&rsquo; from lines in the RCS file.
      Suppose piece table 
      <literal>PT<subscript>r</subscript></literal> represents revision 
      <literal>r</literal>. Then 
      <literal>PT<subscript>r</subscript>[i]</literal> contains the starting position of
      line 
      <literal>i</literal> of revision 
      <literal>r</literal>. Application of the next delta
      transforms piece table 
      <literal>PT<subscript>r</subscript></literal> into 
      <literal>PT<subscript>r+1</subscript></literal>. For instance, a delete command
      removes a series of entries from the piece table. An
      insertion command inserts new entries, moving the entries
      following the insertion point further down the array. The
      inserted entries point to the text lines in the delta. Thus,
      no I/O is involved except for reading the delta itself. When
      all deltas have been applied to the piece table, a sequential
      pass through the table looks up each line in the RCS file and
      copies it to the output file, updating identification markers
      at the same time. Of course, the RCS file must permit random
      access, since the copied lines are scattered throughout that
      file. <xref linkend="fig6"/> illustrates an RCS file with two revisions and
      the corresponding piece tables.</para>
      <figure id="fig6">
	<title>An RCS file and its piece tables.</title>
	<mediaobject>
	  <imageobject><imagedata fileref="rcs-fig6.png" format="PNG"/></imageobject>
	</mediaobject>
      </figure>
      <para>The piece table approach has the property that the time
      for applying a single delta is roughly determined by the size
      of the delta, and not by the size of the revision. For
      example, if a delta is 10 per cent of the size of a revision,
      then applying it takes only 10 per cent of the time to
      generate the latest trunk revision. (The stream editor would
      take 100 per cent.)</para>
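      <para>The piece table technique can be sketched in Python as
      follows. The sketch is a simplified model and not the RCS
      implementation: instead of starting positions in the RCS file,
      each table entry is a pair naming the stored text in which a
      line resides and the index of the line within that text. The
      command tuples and all function names are assumptions of this
      example.</para>
      <literallayout class="monospaced"><!-- code language="Python" -->
def build_piece_table(head_lines: list) -> list:
    """Piece table of the revision stored intact: one entry per line."""
    return [("head", i) for i in range(len(head_lines))]


def apply_delta(table: list, delta_id: str, delta: list) -> list:
    """Transform the piece table of one revision into that of the next.

    delta holds ("d", start, count) and ("a", start, lines) commands in
    ascending order of start, counted from 0 in the current revision.
    Only table entries are moved; no line text is copied.
    """
    table = list(table)
    for n, command in reversed(list(enumerate(delta))):
        if command[0] == "d":
            _, start, count = command
            del table[start:start + count]
        else:
            _, start, lines = command
            # New entries point to the text lines stored in the delta.
            table[start:start] = [((delta_id, n), i) for i in range(len(lines))]
    return table


def write_out(table: list, head_lines: list, deltas: dict) -> list:
    """Final sequential pass: look up every line and copy it to the output."""
    output = []
    for source, index in table:
        if source == "head":
            output.append(head_lines[index])
        else:
            delta_id, n = source
            output.append(deltas[delta_id][n][2][index])
    return output


# Assumed example: revision 1.3 is stored intact; the deltas lead back
# to revisions 1.2 and 1.1.
head = ["a", "c", "d"]
deltas = {
    "1.2": [("a", 1, ["B"])],
    "1.1": [("d", 1, 1), ("a", 2, ["b"]), ("d", 3, 1)],
}
table = apply_delta(build_piece_table(head), "1.2", deltas["1.2"])
table = apply_delta(table, "1.1", deltas["1.1"])
revision_1_1 = write_out(table, head, deltas)
<!-- /code --></literallayout>
      <para>Each call of <literal>apply_delta</literal> in this
      sketch manipulates table entries only, and line text is copied
      a single time, during the final pass.</para>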
      <para>There is an important alternative for representing
	deltas that affects performance. SCCS<xref linkend="rochkind1975sccs"/>, a precursor of RCS,
      uses 
      <emphasis>interleaved</emphasis> deltas. A file containing
      interleaved deltas is partitioned into blocks of lines. Each
      block has a header that specifies to which revision(s) the
      block belongs. The blocks are sorted out in such a way that a
      single pass over the file can pick up all the lines belonging
      to a given revision. Thus, the regeneration time for all
      revisions is the same: all headers must be inspected, and the
      associated blocks either copied or skipped. As the number of
      revisions increases, the cost of retrieving any revision is
      much higher than the cost of checking out the latest trunk
      revision with reverse deltas. A detailed comparison of
      SCCS&rsquo;s interleaved deltas and RCS&rsquo;s reverse
	deltas can be found in Reference <xref linkend="tichy1982rcs"/>. This reference considers
      the version of RCS with the stream editor only. The piece
      table method improves performance further, so that RCS is
      always faster than SCCS, except if 10 or more deltas are
      applied.</para>
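      <para>The retrieval cost of interleaved deltas can be made
      concrete with a small Python model. It is only a sketch: the
      block layout, the function <literal>retrieve</literal> and the
      sample revisions are assumptions chosen for illustration and do
      not reproduce the actual SCCS file format.</para>
<programlisting><![CDATA[
```python
# Hypothetical model of an interleaved-delta file; not the actual SCCS format.
# Each block carries a header naming the revisions to which its lines belong.

def retrieve(blocks, revision):
    """Single pass over all blocks: copy those whose header lists `revision`."""
    lines = []
    headers_inspected = 0
    for header, block_lines in blocks:
        headers_inspected += 1
        if revision in header:
            lines.extend(block_lines)   # block belongs to the revision: copy it
        # otherwise the block is skipped
    return lines, headers_inspected


blocks = [
    ({"1.1", "1.2", "1.3"}, ["begin"]),
    ({"1.1"}, ["first draft"]),
    ({"1.2", "1.3"}, ["second draft"]),
    ({"1.3"}, ["addition"]),
    ({"1.1", "1.2", "1.3"}, ["end"]),
]
for rev in ("1.1", "1.3"):
    text, cost = retrieve(blocks, rev)
    print(rev, text, cost)
```
]]></programlisting>
      <para>Every header is inspected no matter which revision is
      requested, so in this model the oldest and the newest revision
      cost the same, and the cost grows as further revisions add
      blocks.</para>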
      <para>Additional speed-up for both delta methods can be
      obtained by caching the most recently generated revision, as
	has been implemented in DSEE.<xref linkend="chase1984dsee"/> With caching, access time to
      frequently used revisions can approach normal file access
      time, at the cost of some additional space.</para>
    </section>
  </section>
  <section>
    <title>Locking: A Controversial Issue</title>
    <para>The locking mechanism for RCS was difficult to design.
    The problem and its solution are first presented in their
    &lsquo;pure&rsquo; form, followed by a discussion of the
    complications caused by &lsquo;real-world&rsquo;
    considerations.</para>
    <para>RCS must prevent two or more persons from depositing
    competing changes of the same revision. Suppose two programmers
    check out revision 2.4 and modify it. Programmer A checks in a
    revision before programmer B. Unfortunately, programmer B has
    not seen A&rsquo;s changes, so the effect is that A&rsquo;s
    changes are covered up by B&rsquo;s deposit. A&rsquo;s changes
    are not lost since all revisions are saved, but they are
      confined to a single revision.<footnote><para>Note that this problem is entirely different from the atomicity problem.
Atomicity means that
concurrent update operations on the same RCS file cannot be permitted,
because that may result in inconsistent data.
Atomic updates are essential (and implemented in RCS),
	but do not solve the conflict discussed here.</para></footnote></para>
    <para>This conflict is prevented in RCS by locking. Whenever
    someone intends to edit a revision (as opposed to reading or
    compiling it), the revision should be checked out and locked,
    using the 
    <literal>&minus;l</literal> option on 
    <command>co</command>. On subsequent check-in, 
    <command>ci</command> tests the lock and then removes it. At
    most one programmer at a time may lock a particular revision,
    and only this programmer may check in the succeeding revision.
    Thus, while a revision is locked, it is the exclusive
    responsibility of the locker.</para>
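    <para>The &lsquo;pure&rsquo; form of this protocol can be sketched
    as a small in-memory model in Python. The class
    <literal>RevisionGroup</literal>, its methods and the exception
    <literal>LockError</literal> are hypothetical names introduced for
    illustration; the sketch does not invoke RCS, handles only trunk
    revisions, and ignores the weakenings discussed below.</para>
<programlisting><![CDATA[
```python
# Hypothetical in-memory model of RCS-style locking; it does not invoke RCS.

class LockError(Exception):
    pass


class RevisionGroup:
    def __init__(self, latest):
        self.latest = latest     # e.g. "2.4"
        self.locks = {}          # revision number -> user holding the lock

    def check_out(self, user, revision, lock=False):
        """Model of co; with lock=True it corresponds to co -l."""
        if lock:
            holder = self.locks.get(revision)
            if holder is not None and holder != user:
                raise LockError("%s is locked by %s" % (revision, holder))
            self.locks[revision] = user
        return revision          # reading is always permitted

    def check_in(self, user, revision):
        """Model of ci: test the lock, deposit the successor, remove the lock."""
        if self.locks.get(revision) != user:
            raise LockError("%s holds no lock on %s" % (user, revision))
        del self.locks[revision]
        release, level = revision.split(".")
        self.latest = "%s.%d" % (release, int(level) + 1)
        return self.latest


group = RevisionGroup("2.4")
group.check_out("A", "2.4", lock=True)      # programmer A locks 2.4
group.check_out("B", "2.4")                 # programmer B may still read it
try:
    group.check_out("B", "2.4", lock=True)  # but B cannot lock it as well
except LockError as error:
    print("refused:", error)
print(group.check_in("A", "2.4"))
```
]]></programlisting>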
    <para>An important maxim for software tools like RCS is that
    they must not stand in the way of making progress with a
    project. This consideration leads to several weakenings of the
    locking mechanism. First of all, even if a revision is locked,
    it can still be checked out. This is necessary if other people
    wish to compile or inspect the locked revision while the next
    one is in preparation. The only operations they cannot do are
    to lock the revision or to check in the succeeding one.
    Secondly, check-in operations on other branches in the RCS file
    are still possible; the locking of one revision does not affect
    any other revision. Thirdly, revisions are occasionally locked
    for a long period of time because a programmer is absent or
    otherwise unable to complete the assignment. If another
    programmer has to make a pressing change, there are the
    following three alternatives for making progress: a) find out
    who is holding the lock and ask that person to release it; b)
    check out the locked revision, modify it, check it in on a
    branch, and merge the changes later; c) break the lock.
    Breaking a lock leaves a highly visible trace, namely an
    electronic mail message that is sent automatically to the
    holder of the lock, recording the breaker and a commentary
    requested from him. Thus, breaking locks is tolerated under
    certain circumstances, but will not go unnoticed. Experience
    has shown that the automatic mail message attaches a high
    enough stigma to lock breaking, such that programmers break
    locks only in real emergencies, or when a co-worker resigns and
    leaves locked revisions behind.</para>
    <para>If an RCS file is private, i.e., when a programmer owns
    an RCS file and does not expect anyone else to perform check-in
    operations, locking is an unnecessary nuisance. In this case,
    the &lsquo;strict locking feature&rsquo; discussed earlier may
    be disabled, provided that file protection is set such that
    only the owner may write the RCS file. This has the effect that
    only the owner can check in revisions, and that no lock is
    needed for doing so.</para>
    <para>As added protection, each RCS file contains an access
    list that specifies the users who may execute update
    operations. If an access list is empty, only normal UNIX file
    protection applies. Thus, the access list is useful for
    restricting the set of people who would otherwise have update
    permission. Just as with locking, the access list has no effect
    on read-only operations such as 
    <command>co</command>. This approach is consistent with the
    UNIX philosophy of openness, which contributes to a productive
    software development environment.</para>
  </section>
  <section>
    <title>Configuration Management</title>
    <para>The preceding sections described how RCS deals with
    revisions of individual components; this section discusses how
    to handle configurations. A configuration is a set of
    revisions, where each revision comes from a different revision
    group, and the revisions are selected according to a certain
    criterion. For example, in order to build a functioning
    compiler, the &lsquo;right&rsquo; revisions from the scanner,
    the parser, the optimizer and the code generator must be
    combined. RCS, in conjunction with MAKE, provides a number of
    facilities to effect a smooth selection.</para>
    <section>
      <title>RCS Selection Functions</title>
      <variablelist>
        <varlistentry>
          <term>
            <emphasis>Default selection</emphasis>
          </term>
          <listitem>
            <para>During development, the usual selection criterion
            is to choose the latest revision of all components. The
            
            <command>co</command> command makes this selection by
            default. For example, the command</para>
            <synopsis><userinput><command>co</command> <filename>*,v</filename></userinput></synopsis>
            <para>retrieves the latest revision on the default
            branch of each RCS file in the current directory. The
            default branch is usually the trunk, but may be set to
            be a side branch. Side branches as defaults are needed
            in distributed software development, as discussed in
	      the <xref linkend="rcsrevisiontree"/>.</para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>
            <emphasis>Release based selection</emphasis>
          </term>
          <listitem>
            <para>Specifying a release or branch number selects the
            latest revision in that release or branch. For
            instance,</para>
            <synopsis><userinput><command>co</command> &minus;r2 <filename>*,v</filename></userinput></synopsis>
            <para>retrieves the latest revision with release number
            2 from each RCS file. This selection is convenient if a
            release has been completed and development has moved on
            to the next release.</para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>
            <emphasis>State and author based selection</emphasis>
          </term>
          <listitem>
            <para>If the highest level number within a given
            release number is not the desired one, the state
            attribute can help. For example,</para>
            <synopsis><userinput><command>co</command> &minus;r2 &minus;sReleased <filename>*,v</filename></userinput></synopsis>
            <para>retrieves the latest revision with release number
            2 whose state attribute is &lsquo;Released&rsquo;. Of
            course, the state attribute has to be set
            appropriately, using the 
            <command>ci</command> or 
            <command>rcs</command> commands. Another alternative is
            to select a revision by its author, using the 
            <literal>&minus;w</literal> option.</para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>
            <emphasis>Date based selection</emphasis>
          </term>
          <listitem>
            <para>Revisions may also be selected by date. Suppose a
            release of an entire system was completed and current
            on March 4, at 1:00 p.m. local time. Then the
            command</para>
            <synopsis><userinput><command>co</command> &minus;d'March 4, 1:00 pm LT' <filename>*,v</filename></userinput></synopsis>
            <para>checks out all the components of that release,
            independent of the numbering. The 
            <literal>&minus;d</literal> option specifies a
            &lsquo;cutoff date&rsquo;, i.e., the revision selected
            has a check-in date that is closest to, but not after,
            the date given.</para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>
            <emphasis>Name based selection</emphasis>
          </term>
          <listitem>
            <para>The most powerful selection function is based on
            assigning symbolic names to revisions and branches. In
            large systems, a single release number or date is not
            sufficient to collect the appropriate revisions from
            all groups. For example, suppose one wishes to combine
            release 2 of one subsystem and release 15 of another.
            Most likely, the creation dates of those releases
            differ also. Thus, a single revision number or date
            passed to the 
            <command>co</command> command will not suffice to select
            the right revisions. Symbolic revision numbers solve
            this problem. Each RCS file may contain a set of
            symbolic names that are mapped to numeric revision
            numbers. For example, assume the symbol 
            <literal>V3</literal> is bound to release number 2 in
            file 
            <filename>s,v</filename>, and to revision number 15.9
            in 
            <filename>t,v</filename>. Then the single
            command</para>
            <synopsis><userinput><command>co</command> &minus;rV3 <filename>s,v</filename> <filename>t,v</filename></userinput></synopsis>
            <para>retrieves the latest revision of release 2 from 
            <filename>s,v</filename>, and revision 15.9 from 
            <filename>t,v</filename>. In a large system with many
            modules, checking out all revisions with one command
            greatly simplifies configuration management.</para>
          </listitem>
        </varlistentry>
      </variablelist>
      <para>Judicious use of symbolic revision numbers helps with
      organizing large configurations. A special command, 
      <command>rcsfreeze</command>, assigns a symbolic revision
      number to a selected revision in every RCS file. 
      <command>Rcsfreeze</command> effectively freezes a
      configuration. The assigned symbolic revision number selects
      all components of the configuration. If necessary, symbolic
      numbers may even be intermixed with numeric ones. Thus, 
      <literal>V3.5</literal> in the above example would select
      revision 2.5 in 
      <filename>s,v</filename> and branch 15.9.5 in 
      <filename>t,v</filename>.</para>
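      <para>The mixing of symbolic and numeric parts can be sketched
      in Python as follows. The function <literal>resolve</literal>
      and the two dictionaries are hypothetical; they merely mirror
      the bindings of <literal>V3</literal> assumed in the example
      above and are not part of RCS.</para>
<programlisting><![CDATA[
```python
# Hypothetical helper; the mappings below mirror the example in the text.

def resolve(revision, symbols):
    """Replace a leading symbolic name by the number bound to it.

    revision -- e.g. "V3", "V3.5" or a purely numeric "2.5"
    symbols  -- symbolic names of one RCS file, e.g. {"V3": "15.9"}
    """
    head, dot, rest = revision.partition(".")
    if head in symbols:
        return symbols[head] + dot + rest
    return revision


symbols_s = {"V3": "2"}       # bindings assumed for s,v
symbols_t = {"V3": "15.9"}    # bindings assumed for t,v
print(resolve("V3", symbols_s), resolve("V3", symbols_t))
print(resolve("V3.5", symbols_s), resolve("V3.5", symbols_t))
```
]]></programlisting>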
      <para>The options 
      <literal>&minus;r</literal>, 
      <literal>&minus;s</literal>, 
      <literal>&minus;w</literal> and 
      <literal>&minus;d</literal> may be combined. If a branch is
      given, the latest revision on that branch satisfying all
      conditions is retrieved; otherwise, the default branch is
      used.</para>
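      <para>How the combined conditions narrow the choice can be
      sketched in Python. The function <literal>select</literal>, the
      record layout, the author name <literal>jdoe</literal> and the
      sample dates are all assumptions made for illustration; the
      sketch considers trunk revisions only and does not reproduce the
      option parsing of <command>co</command>.</para>
<programlisting><![CDATA[
```python
# Hypothetical sketch of combined selection on the trunk; not RCS code.
from datetime import datetime


def select(revisions, release=None, state=None, author=None, cutoff=None):
    """Return the number of the latest revision meeting all given conditions.

    revisions -- list of dicts with keys "number" (tuple of ints),
                 "state", "author" and "date" (datetime)
    Returns None if no revision qualifies.
    """
    candidates = [
        r for r in revisions
        if (release is None or r["number"][0] == release)
        and (state is None or r["state"] == state)
        and (author is None or r["author"] == author)
        and (cutoff is None or r["date"] <= cutoff)   # not after the cutoff date
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["number"])["number"]


# Invented sample history; the years are placeholders.
revisions = [
    {"number": (1, 1), "state": "Released", "author": "wft", "date": datetime(1984, 1, 10)},
    {"number": (2, 1), "state": "Released", "author": "wft", "date": datetime(1984, 2, 20)},
    {"number": (2, 2), "state": "Exp", "author": "jdoe", "date": datetime(1984, 3, 4, 11, 30)},
    {"number": (2, 3), "state": "Exp", "author": "wft", "date": datetime(1984, 3, 10)},
    {"number": (3, 1), "state": "Exp", "author": "wft", "date": datetime(1984, 4, 1)},
]
print(select(revisions))                                      # default selection
print(select(revisions, release=2, state="Released"))         # release and state
print(select(revisions, cutoff=datetime(1984, 3, 4, 13, 0)))  # cutoff date
```
]]></programlisting>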
    </section>
    <section>
      <title>Combining MAKE and RCS</title>
      <para>MAKE<xref linkend="feldman1979make" /> is a program that processes configurations. It is
      driven by configuration specifications recorded in a special
      file, called a &lsquo;Makefile&rsquo;. MAKE avoids redundant
      processing steps by comparing creation dates of source and
      processed objects. For example, when instructed to compile
      all modules of a given system, it only recompiles those
      source modules that were changed since they were processed
      last.</para>
      <para>MAKE has been extended with an auto-checkout feature
	for RCS.<footnote><para>This auto-checkout extension is available
	    only in some versions of MAKE, e.g. GNU MAKE.</para></footnote></para>
      <para>When a certain file to be processed is not present,
      MAKE attempts a check-out operation. If successful, MAKE
      performs the required processing, and then deletes the
      checked out file to conserve space. The selection parameters
      discussed above can be passed to MAKE either as parameters,
      or directly embedded in the Makefile. MAKE has also been
      extended to search the subdirectory named 
      <literal>RCS</literal> for needed files, rather than just the
      current working directory. However, if a working file is
      present, MAKE totally ignores the corresponding RCS file and
      uses the working file. (In newer versions of MAKE distributed
      by AT&amp;T and others, auto-checkout can be achieved with
      the rule <literal>.DEFAULT</literal>, instead of a special extension of MAKE.
      However, a file checked out by the rule <literal>.DEFAULT</literal> will not be
      deleted after processing. 
      <command>Rcsclean</command> can be used for that
      purpose.)</para>
      <para>With auto-checkout, RCS/MAKE can effect a selection
      rule especially tuned for multi-person software development
      and maintenance. In these situations, programmers should
      obtain configurations that consist of the revisions they have
      personally checked out plus the latest checked in revision of
      all other revision groups. This scheme can be set up as
      follows.</para>
      <para>Each programmer chooses a working directory and places
      into it a symbolic link, named 
      <literal>RCS</literal>, to the directory containing the
      relevant RCS files. The symbolic link makes sure that 
      <command>co</command> and 
      <command>ci</command> operations need only specify the working
      files, and that the Makefile need not be changed. The
      programmer then checks out the needed files and modifies
      them. If MAKE is invoked, it composes configurations by
      selecting those revisions that are checked out, and the rest
      from the subdirectory 
      <literal>RCS</literal>. The latter selection may be
      controlled by a symbolic revision number or any of the other
      selection criteria. If there are several programmers editing
      in separate working directories, they are insulated from each
      other&rsquo;s changes until checking in their
      modifications.</para>
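      <para>The resulting selection rule is simple enough to state as
      a few lines of Python. The function <literal>compose</literal>,
      the file names and the revision numbers are hypothetical; the
      sketch models only the decision between a working file and the
      RCS subdirectory and runs neither MAKE nor RCS.</para>
<programlisting><![CDATA[
```python
# Hypothetical sketch of the selection rule; it does not run MAKE or RCS.

def compose(groups, working_files, rcs_selection):
    """Choose one revision per revision group.

    groups        -- names of the files that make up the configuration
    working_files -- files present in the programmer's working directory
    rcs_selection -- file name -> revision that a check-out from the RCS
                     subdirectory would deliver
    """
    configuration = {}
    for name in groups:
        if name in working_files:
            configuration[name] = "working file"       # checked-out copy wins
        else:
            configuration[name] = rcs_selection[name]  # auto-checkout
    return configuration


groups = ["scanner.c", "parser.c", "codegen.c"]
latest = {"scanner.c": "1.4", "parser.c": "2.7", "codegen.c": "1.9"}
print(compose(groups, {"parser.c"}, latest))
```
]]></programlisting>
      <para>Calling the function with an empty set of working files
      corresponds to the maintainer who starts in an empty directory:
      every component then comes from the RCS files.</para>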
      <para>Similarly, a maintainer can recreate an older
      configuration by starting to work in an empty working
      directory. During the initial MAKE invocation, all revisions
      are selected from RCS files. As the maintainer checks out
      files and modifies them, a new configuration is gradually
      built up. Every time MAKE is invoked, it substitutes the
      modified revisions into the configuration being
      manipulated.</para>
      <para>A final application of RCS is to use it for storing
      Makefiles. Revision groups of Makefiles represent multiple
      versions of configurations. Whenever a configuration is
      baselined or distributed, the best approach is to
      unambiguously fix the configuration with a symbolic revision
      number by calling 
      <command>rcsfreeze</command>, to embed that symbol into the
      Makefile, and to check in the Makefile (using the same
      symbolic revision number). With this approach, old
      configurations can be regenerated easily and reliably.</para>
    </section>
  </section>
  <section id="usagestatistics">
    <title>Usage Statistics</title>
    <para>The following usage statistics were collected on two DEC
    VAX-11/780 computers of the Purdue Computer Science Department.
    Both machines are mainly used for research purposes. Thus, the
    data reflect an environment in which the majority of projects
    involve prototyping and advanced software development, but
    relatively little long-term maintenance.</para>
    <para>For the first experiment, the 
    <command>ci</command> and 
    <command>co</command> operations were instrumented to log the
    number of backward and forward deltas applied. The data were
    collected during a 13-month period from Dec. 1982 to Dec. 1983.
      <xref linkend="table1"/> summarizes the results.</para>
    <segmentedlist id="table1">
      <title>Table I. Statistics for <command>co</command> and <command>ci</command> operations.</title>
      <?dbhtml list-presentation="table"?>
      <segtitle>Operation</segtitle>
      <segtitle>Total operations</segtitle>
      <segtitle>Total deltas applied</segtitle>
      <segtitle>Mean deltas applied</segtitle>
      <segtitle>Operations with &gt;1 delta</segtitle>
      <segtitle>Branch operations</segtitle>
      <seglistitem>
        <seg>co</seg>
        <seg>7867</seg>
        <seg>9320</seg>
        <seg>1.18</seg>
        <seg>509 (6%)</seg>
        <seg>203 (3%)</seg>
      </seglistitem>
      <seglistitem>
        <seg>ci</seg>
        <seg>3468</seg>
        <seg>2207</seg>
        <seg>0.64</seg>
        <seg>85 (2%)</seg>
        <seg>75 (2%)</seg>
      </seglistitem>
      <seglistitem>
        <seg>ci &amp; co</seg>
        <seg>11335</seg>
        <seg>11527</seg>
        <seg>1.02</seg>
        <seg>594 (5%)</seg>
        <seg>278 (2%)</seg>
      </seglistitem>
    </segmentedlist>
    <para>The first two lines show statistics for check-out and
    check-in; the third line shows the combination. Recall that 
    <command>ci</command> performs an implicit check-out to obtain a
    revision for computing the delta. In all measures presented,
    the most recent revision (stored intact) counts as one delta.
    The number of deltas applied represents the number of passes
    necessary, where the first &lsquo;pass&rsquo; is a copying
    step.</para>
    <para>Note that the check-out operation is executed more than
    twice as frequently as the check-in operation. The fourth
    column gives the mean number of deltas applied in all three
    cases. For 
    <command>ci</command>, the mean number of deltas applied is
    less than one. The reasons are that the initial check-in
    requires no delta at all, and that the only time 
    <command>ci</command> requires more than one delta is for
    branches. Column 5 shows the actual number of operations that
    applied more than one delta. The last column indicates that
    branches were not used often.</para>
    <para>The last three columns demonstrate that the most recent
    trunk revision is by far the most frequently accessed. For RCS,
    check-out of this revision is a simple copy operation, which is
    the absolute minimum given the copy-semantics of 
    <command>co</command>. Access to older revisions and branches
    is more common in non-academic environments, yet even if access
    to older deltas were an order of magnitude more frequent, the
    combined average number of deltas applied would still be below
    1.2. Since RCS is faster than SCCS for up to 10 delta
    applications, reverse deltas are clearly the method of
    choice.</para>
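    <para>The arithmetic behind these figures can be checked with a
    few lines of Python. The variable names are hypothetical, and the
    final what-if is only one possible reading of the
    order-of-magnitude argument: it multiplies by ten the combined
    excess over one delta per operation.</para>
<programlisting><![CDATA[
```python
# Figures copied from Table I: (total operations, total deltas applied).
table1 = {"co": (7867, 9320), "ci": (3468, 2207)}


def mean_deltas(operations, deltas):
    return deltas / operations


total_ops = sum(ops for ops, _ in table1.values())
total_deltas = sum(deltas for _, deltas in table1.values())
combined = mean_deltas(total_ops, total_deltas)

# Combined excess over one delta per operation.
extra = total_deltas - total_ops
# Assumed reading of the argument: ten times as much of that excess.
what_if = (total_ops + 10 * extra) / total_ops

for name, (ops, deltas) in table1.items():
    print(name, round(mean_deltas(ops, deltas), 2))
print("combined", round(combined, 2), "what-if", round(what_if, 2))
```
]]></programlisting>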
    <para>The second experiment, conducted in March of 1984,
    involved surveying the existing RCS files on our two machines.
    The goal was to determine the mean number of revisions per RCS
      file, as well as the space consumed by them. <xref linkend="table2"/> shows the
    results. (Tables I and II were produced at different times and
    are unrelated.)</para>
    <segmentedlist id="table2">
      <title>Table II. Statistics for RCS files.</title>
      <?dbhtml list-presentation="table"?>
      <segtitle></segtitle>
      <segtitle>Total RCS files</segtitle>
      <segtitle>Total revisions</segtitle>
      <segtitle>Mean revisions</segtitle>
      <segtitle>Mean size of RCS files</segtitle>
      <segtitle>Mean size of revisions</segtitle>
      <segtitle>Overhead</segtitle>
      <seglistitem>
        <seg>All files</seg>
        <seg>8033</seg>
        <seg>11133</seg>
        <seg>1.39</seg>
        <seg>6156</seg>
        <seg>5585</seg>
        <seg>1.10</seg>
      </seglistitem>
      <seglistitem>
        <seg>Files with &gt;= 2 deltas</seg>
        <seg>1477</seg>
        <seg>4578</seg>
        <seg>3.10</seg>
        <seg>8074</seg>
        <seg>6041</seg>
        <seg>1.34</seg>
      </seglistitem>
    </segmentedlist>
    <para>The mean number of revisions per RCS file is 1.39.
    Columns 5 and 6 show the mean sizes (in bytes) of an RCS file
    and of the latest revision of each RCS file, respectively. The
    &lsquo;overhead&rsquo; column contains the ratio of the mean
    sizes. Assuming that all revisions in an RCS file are
    approximately the same size, this ratio gives a measure of the
    space consumed by the extra revisions.</para>
    <para>In our sample, over 80 per cent of the RCS files
    contained only a single revision. The reason is that our
    systems programmers routinely check in all source files on the
    distribution tapes, even though they may never touch them
    again. To get a better indication of how much space can be
    saved with deltas, all measures were recomputed for only those
    files that contained 2 or more revisions. Only for those
    files is RCS necessary. As shown in the second line, the
    average number of revisions for those files is 3.10, with an
    overhead of 1.34. This means that the extra 2.10 deltas require
    34 per cent extra space, or 16 per cent per extra revision.
      Rochkind<xref linkend="rochkind1975sccs"/> measured the space consumed by SCCS, and reported an
    average of 5 revisions per group and an overhead of 1.37 (or
    about 9 per cent per extra revision). In a later paper,
      Glasser<xref linkend="glasser1978sccs"/> observed an average of 7 revisions per group in a
    single, large project, but provided no overhead figure. In his
    paper on DSEE,<xref linkend="chase1984dsee"/> Leblang reported that delta storage combined
    with blank compression results in an overhead of a mere
    1&ndash;2 per cent per revision. Since leading blanks accounted
    for about 20 per cent of the surveyed Pascal programs, a
    revision group with 5&ndash;10 members was smaller than a
    single clear text copy.</para>
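    <para>The per-revision percentages quoted above follow from
    dividing the extra space by the number of extra revisions. A
    short Python check, in which the function name
    <literal>overhead_per_extra_revision</literal> is a hypothetical
    choice, reproduces them from the figures in the text:</para>
<programlisting><![CDATA[
```python
def overhead_per_extra_revision(mean_revisions, overhead):
    """Extra space per additional revision, as a fraction of one revision."""
    return (overhead - 1.0) / (mean_revisions - 1.0)


# Overhead is the ratio of mean RCS file size to mean revision size (Table II).
print(round(6156 / 5585, 2), round(8074 / 6041, 2))

rcs = overhead_per_extra_revision(3.10, 1.34)   # Table II, second row
sccs = overhead_per_extra_revision(5, 1.37)     # figures reported by Rochkind
print(round(100 * rcs), round(100 * sccs))      # per cent per extra revision
```
]]></programlisting>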
    <para>The above observations demonstrate clearly that the space
    needed for extra revisions is small. With delta storage, the
    luxury of keeping multiple revisions online is certainly
    affordable. In fact, introducing a system with delta storage
    may reduce storage requirements, because programmers often save
    back-up copies anyway. Since back-up copies are stored much
    more efficiently with deltas, introducing a system such as RCS
    may actually free a considerable amount of space.</para>
  </section>
  <section>
    <title>Survey of Version Control Tools</title>
    <para>The need to keep back-up copies of software arose when
    programs and data were no longer stored on paper media, but
    were entered from terminals and stored on disk. Back-up copies
    are desirable for reliability, and many modern editors
    automatically save a back-up copy for every file touched. This
    strategy is valuable for short-term back-ups, but not suitable
    for long-term version control, since an existing back-up copy
    is overwritten whenever the corresponding file is
    edited.</para>
    <para>Tape archives are suitable for long-term, offline
    storage. If all changed files are dumped on a back-up tape once
    per day, old revisions remain accessible. However, tape
    archives are unsatisfactory for version control in several
    ways. First, backing up the file system every 24 hours does not
    capture intermediate revisions. Secondly, the old revisions are
    not online, and accessing them is tedious and time-consuming.
    In particular, it is impractical to compare several old
    revisions of a group, because that may require mounting and
    searching several tapes. Tape archives are important fail-safe
    tools in the event of catastrophic disk failures or accidental
    deletions, but they are ill-suited for version control.
    Conversely, version control tools do not obviate the need for
    tape archives.</para>
    <para>A natural technique for keeping several old revisions
    online is to never delete a file. Editing a file simply creates
    a new file with the same name, but with a different sequence
    number. This technique, available as an option in DEC&rsquo;s
    VMS operating system, turns out to be inadequate for version
    control. First, it is prohibitively expensive in terms of
    storage costs, especially since no data compression techniques
    are employed. Secondly, indiscriminately storing every change
    produces too many revisions, and programmers have difficulties
    distinguishing them. The proliferation of revisions forces
    programmers to spend much time on finding and deleting useless
    files. Thirdly, most of the support functions like locking,
    logging, revision selection, and identification described in
    this paper are not available.</para>
    <para>An alternative approach is to separate editing from
    revision control. The user may repeatedly edit a given
    revision, until freezing it with an explicit command. Once a
    revision is frozen, it is stored permanently and can no longer
    be modified. (In RCS, freezing a revision is done with 
    <command>ci</command>.) Editing a frozen revision implicitly
    creates a new one, which can again be changed repeatedly until
    it is frozen itself. This approach saves exactly those
    revisions that the user considers important, and keeps the
      number of revisions manageable. IBM&rsquo;s CLEAR/CASTER<xref linkend="brown1970clear"/>,
      AT&amp;T&rsquo;s SCCS<xref linkend="rochkind1975sccs"/>, CMU&rsquo;s
      SDC<xref linkend="habermann1979sdc"/> and DEC&rsquo;s CMS<xref linkend="dec1982cms"/>
    are examples of version control systems using this approach.
    CLEAR/CASTER maintains a data base of programs, specifications,
    documentation and messages, using deltas. Its goal is to
    provide control over the development process from a management
    viewpoint. SCCS stores multiple revisions of source text in an
    ancestral tree, records a log entry for each revision, provides
    access control, and has facilities for uniquely identifying
    each revision. An efficient delta technique reduces the space
    consumed by each revision group. SDC is much simpler than SCCS
    because it stores not more than two revisions. However, it
    maintains a complete log for all old revisions, some of which
    may be on back-up tape. CMS, like SCCS, manages tree-structured
    revision groups, but offers no identification mechanism.</para>
    <para>Tools for dealing with configurations are still in a
    state of flux. SCCS, SDC and CMS can be combined with MAKE or
    MAKE-like programs. Since flexible selection rules are missing
    from all these tools, it is sometimes difficult to specify
    precisely which revision of each group should be passed to MAKE
      for building a desired configuration. The Xerox Cedar system<xref linkend="lampson1983cedar"/>
    provides a &lsquo;System Modeller&rsquo; that can rebuild a
    configuration from an arbitrary set of module revisions. The
    revisions of a module are only distinguished by creation time,
    and there is no tool for managing groups. Since the selection
    rules are primitive, the System Modeller appears to be somewhat
      tedious to use. Apollo&rsquo;s DSEE<xref linkend="chase1984dsee"/> is a sophisticated
    software engineering environment. It manages revision groups in
    a way similar to SCCS and CMS. Configurations are built using
    &lsquo;configuration threads&rsquo;. A configuration thread
    states which revision of each group named in a configuration
    should be chosen. A configuration thread may contain dynamic
    specifiers (e.g., &lsquo;choose the revisions I am currently
    working on, and the most recent revisions otherwise&rsquo;),
    which are bound automatically at build time. DSEE also provides a
    notification mechanism for alerting maintainers about the need
    to rebuild a system after a change.</para>
    <para>RCS is based on a general model for describing
      multi-version/multi-configuration systems<xref linkend="tichy1982multi"/>. The model
    describes systems using AND/OR graphs, where AND nodes
    represent configurations, and OR nodes represent version
    groups. The model gives rise to a suite of selection rules for
    composing configurations, almost all of which are implemented
    in RCS. The revisions selected by RCS are passed to MAKE for
    configuration building. Revision group management is modelled
    after SCCS. RCS retains SCCS&rsquo;s best features, but offers
    a significantly simpler user interface, flexible selection
    rules, adequate integration with MAKE and improved
    identification. A detailed comparison of RCS and SCCS appears
    in Reference <xref linkend="tichy1982rcs"/>.</para>
    <para>An important component of all revision control systems is
    a program for computing deltas. SCCS and RCS use the program 
    <command>diff</command><xref linkend="hunt1976diff"/>, which first computes the longest
    common subsequence of two revisions, and then produces the delta
    from that subsequence. The delta is simply an edit script
    consisting of deletion and insertion commands that generate one
    revision from the other.</para>
    <para>A delta based on a longest common subsequence is not
    necessarily minimal, because it does not take advantage of
    crossing block moves. Crossing block moves arise if two or more
    blocks of lines (e.g., procedures) appear in a different order
    in two revisions. An edit script derived from a longest common
    subsequence first deletes the shorter of the two blocks, and then
      reinserts it. Heckel<xref linkend="heckel1978diff"/> proposed an algorithm for detecting
    block moves, but since the algorithm is based on heuristics,
    there are conditions under which the generated delta is far
    from minimal. DSEE uses this algorithm combined with blank
    compression, apparently with satisfactory overall results. A
    new algorithm that is guaranteed to produce a minimal delta
      based on block moves appears in Reference <xref linkend="tichy1984diff"/>. A future release
    of RCS will use this algorithm.</para>
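    <para>The way an edit script of deletions and insertions falls out
    of a longest common subsequence of lines can be sketched in Python.
    The sketch is not the algorithm used by
    <command>diff</command>: it fills a plain quadratic table, and the
    function names and the tuple form of the commands are hypothetical
    choices made for illustration.</para>
<programlisting><![CDATA[
```python
# Hypothetical sketch: an edit script derived from a longest common
# subsequence (LCS) of lines, using a plain quadratic table.

def lcs_lengths(a, b):
    """t[i][j] = length of an LCS of a[i:] and b[j:]."""
    t = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) - 1, -1, -1):
        for j in range(len(b) - 1, -1, -1):
            if a[i] == b[j]:
                t[i][j] = t[i + 1][j + 1] + 1
            else:
                t[i][j] = max(t[i + 1][j], t[i][j + 1])
    return t


def edit_script(a, b):
    """Commands turning a into b; line numbers refer to a.

    ("d", n, line)  delete line n
    ("a", n, line)  insert `line` after line n
    """
    t = lcs_lengths(a, b)
    script = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:                      # common line: keep it
            i += 1
            j += 1
        elif t[i + 1][j] >= t[i][j + 1]:      # dropping a[i] loses nothing
            script.append(("d", i + 1, a[i]))
            i += 1
        else:                                 # b[j] has to be inserted
            script.append(("a", i, b[j]))
            j += 1
    for k in range(i, len(a)):                # leftover lines of a
        script.append(("d", k + 1, a[k]))
    for k in range(j, len(b)):                # leftover lines of b
        script.append(("a", len(a), b[k]))
    return script


# Two blocks that merely swap places are deleted and reinserted.
print(edit_script(["proc p", "proc q"], ["proc q", "proc p"]))
```
]]></programlisting>
    <para>In the example the two &lsquo;procedures&rsquo; only change
    places, yet the script deletes one of them and inserts it again,
    which is the crossing block move that such a delta cannot
    exploit.</para>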
  </section>
  <ackno><emphasis>Acknowledgements</emphasis>: Many people have helped make RCS a success by contributing
  criticisms, suggestions, corrections, and even whole new commands
  (including manual pages). The list of people is too long to be
  reproduced here, but my sincere thanks for their help and
  goodwill go to all of them.</ackno>
  <appendix id="appendix">
    <title>Synopsis of RCS Operations</title>
    <variablelist>
      <varlistentry>
        <term>
        <emphasis>ci</emphasis> &minus; check in revisions</term>
        <listitem>
          <para>
          <command>Ci</command> stores the contents of a working
          file into the corresponding RCS file as a new revision.
          If the RCS file doesn&rsquo;t exist, 
          <command>ci</command> creates it. 
          <command>Ci</command> removes the working file, unless one
          of the options 
          <literal>&minus;u</literal> or 
          <literal>&minus;l</literal> is present. For each check-in,
          
          <command>ci</command> asks for a commentary describing the
          changes relative to the previous revision.</para>
          <!-- INDENTATION -->
          <para>
          <command>Ci</command> assigns the revision number given by
          the 
          <literal>&minus;r</literal> option; if that option is
          missing, it derives the number from the lock held by the
          user; if there is no lock and locking is not strict, 
          <command>ci</command> increments the number of the latest
          revision on the trunk. A side branch can only be started
          by explicitly specifying its number with the 
          <literal>&minus;r</literal> option during check-in.</para>
          <!-- INDENTATION -->
          <para>
          <command>Ci</command> also determines whether the revision
          to be checked in is different from the previous one, and
          asks whether to proceed if not. This facility simplifies
          check-in operations for large systems, because one need
          not remember which files were changed.</para>
          <!-- INDENTATION -->
          <para>The option 
          <literal>&minus;k</literal> searches the checked-in file
          for identification markers containing the attributes
          revision number, check-in date, author and state, and
          assigns these to the new revision rather than computing
          them. This option is useful for software distribution:
          Recipients of distributed software using RCS should check
          in updates with the 
          <literal>&minus;k</literal> option. This convention
          guarantees that revision numbers, check-in dates, etc.,
          are the same at all sites.</para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term>
        <emphasis>co</emphasis> &minus; check out revisions</term>
        <listitem>
          <para>
          <command>Co</command> retrieves revisions according to
          revision number, date, author and state attributes. It
          either places the revision into the working file, or
          prints it on the standard output. 
          <command>Co</command> always expands the identification
          markers.</para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term>
        <emphasis>ident</emphasis> &minus; extract identification
        markers</term>
        <listitem>
          <para>
          <command>Ident</command> extracts the identification
          markers expanded by 
          <command>co</command> from any file and prints
          them.</para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term>
        <emphasis>rcs</emphasis> &minus; change RCS file
        attributes</term>
        <listitem>
          <para>
          <command>Rcs</command> is an administrative operation that
          changes access lists, locks, unlocks, breaks locks,
          toggles the strict-locking feature, sets state attributes
          and symbolic revision numbers, changes the description,
          and deletes revisions. A revision can only be deleted if
          it is not the fork of a side branch.</para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term>
        <emphasis>rcsclean</emphasis> &minus; clean working
        directory</term>
        <listitem>
          <para>
          <command>Rcsclean</command> removes working files that
          were checked out but never changed.*</para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term>
        <emphasis>rcsdiff</emphasis> &minus; compare
        revisions</term>
        <listitem>
          <para>
          <command>Rcsdiff</command> compares two revisions and
          prints their difference, using the UNIX tool 
          <command>diff</command>. One of the revisions compared
          may be checked out. This command is useful for finding
          out about changes.</para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term>
        <emphasis>rcsfreeze</emphasis> &minus; freeze a
        configuration</term>
        <listitem>
          <para>
          <command>Rcsfreeze</command> assigns the same symbolic
          revision number to a given revision in all RCS files.
          This command is useful for accurately recording a
          configuration.*</para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term>
        <emphasis>rcsmerge</emphasis> &minus; merge revisions</term>
        <listitem>
          <para>
          <command>Rcsmerge</command> merges two revisions, 
          <literal>rev1</literal> and 
          <literal>rev2</literal>, with respect to a common
          ancestor. A 3-way file comparison determines the segments
          of lines that are (a) the same in all three revisions, or
          (b) the same in 2 revisions, or (c) different in all
          three. For all segments of type (b) where 
          <literal>rev1</literal> is the differing revision, the
          segment in 
          <literal>rev1</literal> replaces the corresponding segment
          of 
          <literal>rev2</literal>. Type (c) indicates an
          overlapping change, is flagged as an error, and requires
          user intervention to select the correct
          alternative.</para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term>
        <emphasis>rlog</emphasis> &minus; read log messages</term>
        <listitem>
          <para>
          <command>Rlog</command> prints the log messages and other
          information in an RCS file.</para>
        </listitem>
      </varlistentry>
    </variablelist>
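    <para>The merge rule described for 
    <command>rcsmerge</command> can be sketched as follows. The
    Python code below is only an illustration: it assumes that the
    3-way comparison has already split the three revisions into
    aligned segments, and it does not reproduce how 
    <command>rcsmerge</command> computes or reports them. The names 
    <literal>merge_segment</literal> and 
    <literal>merge_revisions</literal> and the sample data are
    hypothetical.</para>
    <programlisting language="python"><![CDATA[
```python
def merge_segment(ancestor, rev1, rev2):
    """Classify one aligned segment and pick the lines for the result.

    Returns (kind, lines), where kind is 'a', 'b' or 'c' as in the
    description of rcsmerge, and lines is None for an overlap.
    """
    if rev1 == rev2:
        # Type (a) if the ancestor agrees as well; otherwise type (b)
        # with the ancestor as the differing revision. In both cases
        # rev2 already holds the result.
        return ("a" if ancestor == rev1 else "b"), rev2
    if ancestor == rev2:
        # Type (b): rev1 is the differing revision, so its segment
        # replaces the corresponding segment of rev2.
        return "b", rev1
    if ancestor == rev1:
        # Type (b): rev2 is the differing revision and keeps its segment
        # (an assumption; the text only spells out the rev1 case).
        return "b", rev2
    # Type (c): all three differ, an overlapping change.
    return "c", None


def merge_revisions(segments):
    """Merge aligned (ancestor, rev1, rev2) segments into rev2.

    Returns the merged lines and the indices of overlapping segments,
    which are left out of the result for the user to resolve.
    """
    merged, overlaps = [], []
    for index, (ancestor, rev1, rev2) in enumerate(segments):
        kind, lines = merge_segment(ancestor, rev1, rev2)
        if kind == "c":
            overlaps.append(index)
        else:
            merged.extend(lines)
    return merged, overlaps


# Hypothetical aligned segments: (ancestor, rev1, rev2).
segments = [
    (["a"], ["a"], ["a"]),      # unchanged everywhere
    (["b"], ["B1"], ["b"]),     # changed only in rev1
    (["c"], ["c"], ["C2"]),     # changed only in rev2
    (["d"], ["D1"], ["D2"]),    # changed differently in both
]
merged, overlaps = merge_revisions(segments)
print(merged, overlaps)
```
]]></programlisting>
    <para>The sketch omits overlapping segments from the result and
    only records their positions; how such segments are presented
    to the user is outside its scope.</para>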
  </appendix>
  <bibliography>
    <title>References</title>
    <biblioentry id="feldman1979make" xreflabel="1">
      <biblioset relation="article">
        <author>
          <firstname>Stuart I.</firstname>
          <surname>Feldman</surname>
        </author>
        <pagenums>255-265</pagenums>
        <title>Make&mdash;A Program for Maintaining Computer
        Programs</title>
      </biblioset>
      <biblioset relation="journal">
        <title>Software&mdash;Practice &amp; Experience</title>
        <volumenum>9</volumenum>
        <issuenum>3</issuenum>
        <pubdate>March 1979</pubdate>
      </biblioset>
    </biblioentry>
    <biblioentry id="hunt1976diff" xreflabel="2">
      <authorgroup>
        <author>
          <firstname>James W.</firstname>
          <surname>Hunt</surname>
        </author>
        <author>
          <firstname>M. D.</firstname>
          <surname>McIlroy</surname>
        </author>
      </authorgroup>
      <title>&ldquo;An Algorithm for Differential File
      Comparison&rdquo;</title>
      <pubsnumber>41</pubsnumber>
      <publishername>Computing Science Technical
      Report</publishername>
      <corpname>Bell Laboratories</corpname>
      <pubdate>June 1976</pubdate>
    </biblioentry>
    <biblioentry id="rochkind1975sccs" xreflabel="3">
      <biblioset relation="article">
        <author>
          <firstname>Marc J.</firstname>
          <surname>Rochkind</surname>
        </author>
        <title>The Source Code Control System</title>
        <pagenums>364-370</pagenums>
      </biblioset>
      <biblioset relation="journal">
        <title>IEEE Transactions on Software Engineering</title>
        <volumenum>SE-1</volumenum>
        <issuenum>4</issuenum>
        <pubdate>Dec. 1975</pubdate>
      </biblioset>
    </biblioentry>
    <biblioentry id="tichy1982rcs" xreflabel="4">
      <biblioset relation="article">
        <author>
          <firstname>Walter F.</firstname>
          <surname>Tichy</surname>
        </author>
        <title>&ldquo;Design, Implementation, and Evaluation of a
        Revision Control System&rdquo;</title>
        <pagenums>58-67</pagenums>
      </biblioset>
      <biblioset relation="conference">
        <confgroup>
          <conftitle>Proceedings of the 6th International
          Conference on Software Engineering</conftitle>
          <confsponsor>ACM</confsponsor>
          <confsponsor>IEEE</confsponsor>
          <confsponsor>IPS</confsponsor>
          <confsponsor>NBS</confsponsor>
        </confgroup>
        <pubdate>September 1982</pubdate>
      </biblioset>
    </biblioentry>
    <biblioentry id="chase1984dsee" xreflabel="5">
      <biblioset relation="article">
        <authorgroup>
          <author>
            <firstname>David B.</firstname>
            <surname>Leblang</surname>
          </author>
          <author>
            <firstname>Robert P.</firstname>
            <surname>Chase</surname>
          </author>
        </authorgroup>
        <title>&ldquo;Computer-Aided Software Engineering in a
        Distributed Workstation Environment&rdquo;</title>
        <pagenums>104-112</pagenums>
      </biblioset>
      <biblioset relation="journal">
        <title>SIGPLAN Notices</title>
        <volumenum>19</volumenum>
        <issuenum>5</issuenum>
        <pubdate>1984</pubdate>
        <confgroup>
          <conftitle>Proceedings of the ACM SIGSOFT/SIGPLAN
          Software Engineering Symposium on Practical Software
          Development Environments</conftitle>
        </confgroup>
      </biblioset>
    </biblioentry>
    <biblioentry id="glasser1978sccs" xreflabel="6">
      <biblioset relation="article">
        <author>
          <firstname>Alan L.</firstname>
          <surname>Glasser</surname>
        </author>
        <title>&ldquo;The Evolution of a Source Code Control
        System&rdquo;</title>
        <pagenums>122-125</pagenums>
      </biblioset>
      <biblioset relation="journal">
        <title>Software Engineering Notes</title>
        <volumenum>3</volumenum>
        <issuenum>5</issuenum>
        <pubdate>Nov. 1978</pubdate>
        <confgroup>
          <conftitle>Proceedings of the Software Quality and
          Assurance Workshop</conftitle>
        </confgroup>
      </biblioset>
    </biblioentry>
    <biblioentry id="brown1970clear" xreflabel="7">
      <author>
        <firstname>H.B.</firstname>
        <surname>Brown</surname>
      </author>
      <title>&ldquo;The Clear/Caster System&rdquo;</title>
      <confgroup>
        <conftitle>NATO Conference on Software Engineering,
        Rome</conftitle>
        <confdates>1970</confdates>
      </confgroup>
    </biblioentry>
    <biblioentry id="habermann1979sdc" xreflabel="8">
      <author>
        <firstname>A. Nico</firstname>
        <surname>Habermann</surname>
      </author>
      <title>A Software Development Control System</title>
      <subtitle>Technical Report</subtitle>
      <publishername>Carnegie-Mellon University, Department of
      Computer Science</publishername>
      <pubdate>Jan. 1979</pubdate>
    </biblioentry>
    <biblioentry id="dec1982cms" xreflabel="9">
      <corpauthor>DEC</corpauthor>
      <title>Code Management System</title>
      <publishername>Digital Equipment Corporation</publishername>
      <pubdate>1982</pubdate>
      <pubsnumber>Document No. EA-23134-82</pubsnumber>
    </biblioentry>
    <biblioentry id="lampson1983cedar" xreflabel="10">
      <authorgroup>
        <author>
          <firstname>Eric E.</firstname>
          <surname>Schmidt</surname>
        </author>
        <author>
          <firstname>Butler W.</firstname>
          <surname>Lampson</surname>
        </author>
      </authorgroup>
      <title>&ldquo;Practical Use of a Polymorphic Applicative
      Language&rdquo;</title>
      <confgroup>
        <conftitle>Proceedings of the 10th Symposium on Principles
        of Programming Languages</conftitle>
        <confsponsor>ACM</confsponsor>
      </confgroup>
      <pagenums>237-255</pagenums>
      <pubdate>January 1983</pubdate>
    </biblioentry>
    <biblioentry id="tichy1982multi" xreflabel="11">
      <biblioset relation="article">
        <author>
          <firstname>Walter F.</firstname>
          <surname>Tichy</surname>
        </author>
        <title>&ldquo;A Data Model for Programming Support
        Environments and its Application&rdquo;</title>
      </biblioset>
      <biblioset relation="book">
        <title>Automated Tools for Information System Design and
        Development</title>
        <editor>
          <firstname>Hans-Jochen</firstname>
          <surname>Schneider</surname>
        </editor>
        <editor>
          <firstname>Anthony I.</firstname>
          <surname>Wasserman</surname>
        </editor>
        <publisher>
          <publishername>North-Holland Publishing
          Company</publishername>
          <address>Amsterdam</address>
        </publisher>
        <pubdate>1982</pubdate>
      </biblioset>
    </biblioentry>
    <biblioentry id="heckel1978diff" xreflabel="12">
      <biblioset relation="article">
        <author>
          <firstname>Paul</firstname>
          <surname>Heckel</surname>
        </author>
        <title>&ldquo;A Technique for Isolating Differences Between
        Files&rdquo;</title>
        <pagenums>264-268</pagenums>
      </biblioset>
      <biblioset relation="journal">
        <title>Communications of the ACM</title>
        <volumenum>21</volumenum>
        <issuenum>4</issuenum>
        <pubdate>April 1978</pubdate>
      </biblioset>
    </biblioentry>
    <biblioentry id="tichy1984diff" xreflabel="13">
      <biblioset relation="article">
        <author>
          <firstname>Walter F.</firstname>
          <surname>Tichy</surname>
        </author>
        <title>&ldquo;The String-to-String Correction Problem with
        Block Moves&rdquo;</title>
        <pagenums>309-321</pagenums>
      </biblioset>
      <biblioset relation="journal">
        <title>ACM Transactions on Computer Systems</title>
        <volumenum>2</volumenum>
        <issuenum>4</issuenum>
        <pubdate>Nov. 1984</pubdate>
      </biblioset>
    </biblioentry>
  </bibliography>
</article>
