source: trunk/packages/invirt-dev/README.invirtibuilder @ 2858

Last change on this file since 2858 was 2858, checked in by broder, 14 years ago

Add documentation on the Invirtibuilder.

============================
Design of The Invirtibuilder
============================

Introduction
============

The Invirtibuilder is an automated Debian package builder, APT
repository manager, and Git repository hosting tool. It is intended
for projects that consist of a series of Debian packages, each tracked
as a separate Git repository, and is designed to keep the Git and APT
repositories in sync with each other. The Invirtibuilder supports
multiple threads, or "pockets", of development, and can enforce
different access control and repository consistency rules for each
pocket.

Background and Goals
====================

The Invirtibuilder was originally developed for Invirt_, a project of
the MIT SIPB_. When we went to develop a tool for managing our APT and
Git repositories, we had several goals, each of which informed the
design of the Invirtibuilder:

* One Git repository per Debian package.

  Because of how Git tracks history, it's better suited for tracking a
  series of small repositories, as opposed to one large one
  [#]_. Furthermore, most pre-existing tools and techniques for
  dealing with Debian packages in Git repositories (such as
  git-buildpackage_ or `VCS location information`_) are designed
  exclusively for this case.

* Synchronization between Git and APT repositories.

  In our previous development models, we would frequently merge
  development into trunk without necessarily being ready to deploy it
  to our APT repository (and by extension, our servers) yet. However,
  once the changes had been merged in, it was no longer possible to
  see the current state of the APT repository purely from inspection
  of the source control repository.

* Support for multiple *pockets* of development.

  For the Invirt_ project, we maintain separate production and
  development environments. Initially, they each shared the same APT
  repository. To test changes, we had to install them into the APT
  repository and install the update on our development cluster, and
  simply wait to take the update on our production cluster until
  testing was completed. When designing the Invirtibuilder, we wanted
  the set of packages available to our development cluster to be
  separate from the packages in the production cluster.

* Different ACLs for different pockets.

  Access to our development cluster is relatively unrestricted; we
  freely grant access to interested developers to encourage
  contributions to the project. Our production cluster, on the other
  hand, has a much higher standard of security, and access is limited
  to the core maintainers of the service. The Invirtibuilder needed to
  support that separation of privilege.

* Tool-enforced version number restrictions.

  Keeping our packages in APT repositories adds a few restrictions to
  the version numbers of packages. First, version numbers in the APT
  repository must be unique. That is, you cannot have two different
  packages of the same name and version number. Second, version
  numbers are expected to be monotonically increasing. If a newer
  version of a package had a lower version number than the older
  version, dpkg would consider this a downgrade. Downgrades are not
  supported by dpkg, and APT will not attempt them during a normal
  upgrade.

  In order to avoid proliferation of version numbers used only for
  testing purposes, we opted to bend the latter rule for our
  development pocket.

* Tool-enforced consistent history.

  In order for the Git history to be meaningful, we chose to require
  that each version of a package that is uploaded into the APT
  repository be a fast-forward of the previous version.

  Again, to simplify and encourage testing, we bend this rule for the
  development pocket as well.

Design
======

Configuration
-------------

For the Invirt_ project's use of the Invirtibuilder, we adapted our
existing configuration mechanism. Our configuration consists of a
single YAML_ file. Here is the snippet of configuration we use for our
build configuration::

 build:
  pockets:
   prod:
    acl: system:xvm-root
    apt: stable
   dev:
    acl: system:xvm-dev
    apt: unstable
    allow_backtracking: yes
  tagger:
   name: Invirt Build Server
   email: invirt@mit.edu

The Invirtibuilder allows pockets to be named separately from their
corresponding Git branches or APT components. However, if either the
``git`` or ``apt`` property of a pocket is unspecified, it is assumed
to be the same as the name of the pocket.

The ``acl`` attributes for each pocket are interpreted within our
authorization modules to determine who is allowed to request builds on
a given pocket. ``system:xvm-root`` and ``system:xvm-dev`` are the
names of AFS groups, which we use for authorization.

The ``tagger`` attribute indicates the name and e-mail address to be
used whenever the Invirtibuilder generates new Git repository objects,
such as commits or tags.

Finally, it was mentioned in `Background and Goals`_ that we wanted
the ability to not force version number consistency or Git
fast-forwards for our development pocket. The ``allow_backtracking``
attribute was introduced to indicate that preference. When it is set
to ``yes`` (i.e. YAML's "true" value), neither fast-forwards nor
increasing version numbers are enforced when validating builds. The
attribute is assumed to be false if undefined.
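
The defaulting rules above can be sketched in a few lines of Python.
This is only an illustration: ``resolve_pocket`` and ``BUILD_CONFIG``
are hypothetical names, and the configuration is written as an
already-parsed dictionary rather than loaded from the YAML file.

```python
def resolve_pocket(build_config, name):
    """Return the effective settings for one pocket.

    `build_config` is assumed to be the parsed ``build`` mapping from
    the configuration file (hypothetical helper, not Invirt's own code).
    """
    pocket = build_config["pockets"][name]
    return {
        "acl": pocket["acl"],
        # Unspecified git/apt names fall back to the pocket name.
        "git": pocket.get("git", name),
        "apt": pocket.get("apt", name),
        # Backtracking is off unless explicitly enabled.
        "allow_backtracking": bool(pocket.get("allow_backtracking", False)),
    }


# The configuration shown above, as a YAML 1.1 loader would parse it
# (``yes`` becomes a boolean true).
BUILD_CONFIG = {
    "pockets": {
        "prod": {"acl": "system:xvm-root", "apt": "stable"},
        "dev": {
            "acl": "system:xvm-dev",
            "apt": "unstable",
            "allow_backtracking": True,
        },
    },
    "tagger": {"name": "Invirt Build Server", "email": "invirt@mit.edu"},
}

prod = resolve_pocket(BUILD_CONFIG, "prod")
dev = resolve_pocket(BUILD_CONFIG, "dev")
```

Under these assumptions, ``prod`` maps to the Git branch ``prod`` and
the APT component ``stable`` with backtracking disabled, while ``dev``
maps to the branch ``dev`` and the component ``unstable`` with
backtracking enabled.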

Git Repositories
----------------

In order to make it easy to check out all packages at once, and for
version controlling the state of the APT repository, we create a
"superproject" using Git submodules [#]_.

There is one Git branch in the superproject corresponding to each
pocket of development. Each branch contains a submodule for each
package in the corresponding component of the APT repository, and the
submodule commit referred to by the head of the Git branch matches the
revision of the package currently in the corresponding component of
the APT repository. Thus, the heads of the Git superproject match the
state of the components in the APT repository.

Each of the submodules also has a branch for each pocket. The head of
that branch points to the revision of the package that is currently in
the corresponding component of the APT repository. This provides a
convenient branching point for new development. Additionally, there is
a Git tag for every version of the package that has ever been uploaded
to the APT repository.

Because the Invirtibuilder and its associated infrastructure are
responsible for keeping the superproject in sync with the state of the
APT repository, an update hook disallows all pushes to the
superproject.

Pushes to the submodules, on the other hand, are almost entirely
unrestricted. As with the superproject, the Git branches for each
pocket and the Git tags are maintained by the build infrastructure, so
pushes to them are disallowed. Outside of that, we make no
restrictions on the creation or deletion of branches, nor are pushes
required to be fast-forwards.
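
The submodule push policy described above could be expressed as a Git
``update`` hook along the following lines. This is a sketch under
stated assumptions, not the hook Invirt ships: ``ref_is_protected`` and
``update_hook`` are hypothetical names, and the set of pocket branch
names is passed in rather than read from the configuration. Git invokes
an update hook with the ref name, the old object name, and the new
object name, and a non-zero exit status rejects the update.

```python
def ref_is_protected(refname, pocket_branches):
    """Return True if pushes to `refname` should be rejected.

    Tags and the per-pocket tracking branches are owned by the build
    infrastructure; every other ref is left unrestricted.
    """
    if refname.startswith("refs/tags/"):
        return True
    prefix = "refs/heads/"
    return refname.startswith(prefix) and refname[len(prefix):] in pocket_branches


def update_hook(argv, pocket_branches):
    """Entry point in the shape of a Git update hook.

    argv is [hook_path, refname, old_sha, new_sha]; the return value is
    the exit status (0 accepts the push, 1 rejects it).
    """
    refname = argv[1]
    return 1 if ref_is_protected(refname, pocket_branches) else 0


# Hypothetical pocket branch names, matching the example configuration.
POCKET_BRANCHES = {"prod", "dev"}
rejected = update_hook(["update", "refs/heads/prod", "0" * 40, "a" * 40], POCKET_BRANCHES)
accepted = update_hook(["update", "refs/heads/my-feature", "0" * 40, "a" * 40], POCKET_BRANCHES)
```

A hook for the superproject would be simpler still, since it rejects
every push.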

The Build Queue
---------------

We considered several ways to trigger builds of new package versions
using Git directly. However, we realized that what we actually wanted
was a separate build queue where each build request was handled and
processed independently of any requests before or after it. It's not
possible to have these semantics using Git as a signalling mechanism
without breaking standard assumptions about how remote Git
repositories work.

In order to trigger builds, then, we needed a side-channel. Since it
was already widely used in the Invirt_ project, we chose to use
remctl_, a GSSAPI-authenticated RPC protocol with per-command ACLs.

To trigger a new build, a developer calls remctl against the build
server with a pocket, a package, and a commit ID from that package's
Git repository. The remctl daemon then calls a script which validates
the build and adds it to the build queue. Because of the structure of
remctl's ACLs, we are able to have different ACLs depending on which
pocket the build is destined for. This allows us to fulfil our design
goal of having different ACLs for different pockets.

For simplicity, the queue itself is maintained as a directory of
files, where each file is a queue entry. To maintain order in the
queue, the file names for queue entries are of the form
``YYYYMMDDHHMMSS_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX``, where ``X``
indicates a random hexadecimal digit. Each file contains the
parameters passed in over remctl (pocket, package, and commit ID to
build), as well as the Kerberos principal of the user that requested
the build, for logging.
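
As an illustration of the queue layout, the sketch below generates an
entry name of the form described above and writes one entry to a queue
directory. Several details are assumptions made for the example: the
timestamp is taken in UTC, the random part is a version 4 UUID, the
entry is stored as JSON, and the file is written under a temporary name
and then renamed so that a watcher never sees a partial entry.
``queue_entry_name`` and ``enqueue`` are hypothetical names.

```python
import json
import os
import time
import uuid


def queue_entry_name(now=None):
    """Return a name like YYYYMMDDHHMMSS_xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.

    Sorting names lexicographically orders entries by submission time.
    """
    stamp = time.strftime("%Y%m%d%H%M%S", time.gmtime(now))
    return "%s_%s" % (stamp, uuid.uuid4())


def enqueue(queue_dir, pocket, package, commit, principal, now=None):
    """Write one build request into `queue_dir` and return its file name."""
    name = queue_entry_name(now)
    entry = {
        "pocket": pocket,
        "package": package,
        "commit": commit,
        # Kerberos principal of the requester, kept for logging.
        "principal": principal,
    }
    tmp_path = os.path.join(queue_dir, "." + name + ".tmp")
    with open(tmp_path, "w") as f:
        json.dump(entry, f)
    # Rename into place so the entry appears in the queue all at once.
    os.rename(tmp_path, os.path.join(queue_dir, name))
    return name
```

A request submitted one second later sorts after an earlier one because
the timestamp prefix has a fixed width.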

The Build Daemon
----------------

To actually execute builds, we run a separate daemon to monitor for
new build requests in the build queue. The daemon uses inotify so that
it's triggered whenever a new item is added to the build
queue. Whenever an item in the build queue triggers the build daemon,
the daemon first validates the build, then executes the build, and
finally updates both the APT repository and Git superproject with the
results of the build. The results of all attempted builds are recorded
in a database table for future reference.
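
The per-request flow can be sketched as follows. This is not the
daemon's actual code: the inotify wake-up is replaced by a plain
directory listing, and the ``validate``, ``execute``, ``update`` and
``record`` callables are hypothetical stand-ins for the stages and the
database write described in this section.

```python
import os


def pending_entries(queue_dir):
    """Return queue entry names in submission order.

    Names starting with "." are skipped on the assumption that they are
    temporary files still being written.
    """
    return sorted(n for n in os.listdir(queue_dir) if not n.startswith("."))


def process_entry(request, validate, execute, update, record):
    """Run one build request through the three stages in order.

    Any exception from a stage is treated as a build failure. The
    outcome is always passed to `record`, whether or not the build
    succeeded.
    """
    try:
        validate(request)
        execute(request)
        update(request)
    except Exception as exc:
        record(request, False, str(exc))
        return False
    record(request, True, None)
    return True
```

Each request is handled independently of the ones before and after it,
which is the property that motivated a separate queue in the first
place.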

Build Validation
````````````````

The first stage of processing a new build request is validating the
build. First, the build daemon checks the version number of the
requested package in each pocket of the repository. If the package is
present in any other pocket with the same version number, but the Git
commit for the package is different, the build errors out, because it
is not possible for an APT repository to contain two different
packages with the same name and version number.

Next, the build daemon checks to make sure that the version number of
the new package is a higher version number than the version currently
in the APT repository, as version numbers must be monotonically
increasing.

Finally, we require new packages to be fast-forwards in Git of the
previous version of the package. This is verified as well.

As mentioned above, the ``allow_backtracking`` attribute can be set
for a pocket to bypass the latter two checks in development
environments.

When the same package with the same version is inserted into multiple
places in the same APT repository, the MD5 hash of the package is used
to validate that it hasn't changed. Because rebuilding the same
package causes the MD5 hash to change, when a version of a package
identical to a version already in the APT repository is added to
another pocket, we need to copy it directly. Since the validation
stage already has all of the necessary information to detect this
case, if the same version of a package is already present in another
pocket, the validation stage returns this information.
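
The three checks and the "copy from another pocket" result can be
summarized in one function. This is a minimal sketch, not the daemon's
implementation: ``validate_build``, ``InvalidBuild`` and the shape of
the ``published`` mapping are assumptions. The version comparison and
the fast-forward test are injected as callables; in a real deployment
they might wrap ``dpkg --compare-versions NEW gt OLD`` and
``git merge-base --is-ancestor OLD NEW`` respectively.

```python
class InvalidBuild(Exception):
    """Raised when a build request fails validation."""


def validate_build(pocket, version, commit, published,
                   allow_backtracking, version_gt, is_fast_forward):
    """Validate a build request for one package.

    `published` maps pocket name -> (version, commit) for the copy of
    this package currently in that pocket. Returns the name of another
    pocket that already holds this exact version (so the existing
    binary can be copied instead of rebuilt), or None.
    """
    copy_from = None
    # Check 1: same version elsewhere must mean the same commit.
    for other in sorted(published):
        if other == pocket:
            continue
        other_version, other_commit = published[other]
        if other_version == version:
            if other_commit != commit:
                raise InvalidBuild(
                    "version %s already in pocket %s from a different commit"
                    % (version, other))
            if copy_from is None:
                copy_from = other
    current = published.get(pocket)
    if current is not None and not allow_backtracking:
        current_version, current_commit = current
        # Check 2: version numbers must increase.
        if not version_gt(version, current_version):
            raise InvalidBuild("version %s is not newer than %s"
                               % (version, current_version))
        # Check 3: the new commit must be a fast-forward of the old one.
        if not is_fast_forward(current_commit, commit):
            raise InvalidBuild("commit %s is not a fast-forward of %s"
                               % (commit, current_commit))
    return copy_from
```

With ``allow_backtracking`` set, only the uniqueness check remains, so
a development pocket can move to an older version or an unrelated
commit.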

Build Execution
```````````````

Once the build has been validated, it can be executed. The requested
version of the package is exported from Git, and then a Debian source
package is generated. Next, the package itself is built using sbuild.

sbuild creates an ephemeral build chroot for each build that has only
essential build packages and the build dependencies for the package
being built installed. We use sbuild for building packages for several
reasons. First, it helps us verify that all necessary build
dependencies have been included in our packages. Second, it helps us
ensure that configuration files haven't been modified from their
upstream defaults (which could cause problems for packages using
config-package-dev_).

The build daemon keeps the build logs from all attempted builds on the
filesystem for later inspection.
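
One way to picture the export, source-package and sbuild steps is as a
list of commands run in order from a scratch directory. The sketch
below only constructs the command lines; it does not run them. The
exact invocations are assumptions (the real daemon may export and build
differently), ``build_commands`` is a hypothetical name, and the
``.dsc`` file name assumes a version without an epoch.

```python
def build_commands(package, version, commit, distribution):
    """Return the commands for one build, as argument lists.

    Assumes the first command runs inside the package's Git repository
    and the rest run in the directory holding the exported tarball.
    """
    srcdir = "%s-%s" % (package, version)
    tarball = "%s.tar" % srcdir
    dsc = "%s_%s.dsc" % (package, version)
    return [
        # Export the requested commit, without any Git metadata.
        ["git", "archive", "--format=tar", "--prefix=%s/" % srcdir,
         "--output=%s" % tarball, commit],
        ["tar", "-xf", tarball],
        # Generate the Debian source package from the exported tree.
        ["dpkg-source", "-b", srcdir],
        # Build the binary packages in a clean chroot.
        ["sbuild", "-d", distribution, dsc],
    ]


# Hypothetical example values.
commands = build_commands("invirt-dev", "0.1.5", "abc123", "hardy")
```

Each list could be handed to ``subprocess.check_call`` with an
appropriate working directory, with its output redirected to the build
log.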

Repository Updates
``````````````````

Once the build has been successfully completed, the APT and Git
repositories are updated to match the new state. First, a new tag is
added to the package's Git repository for the current version
[#]_. Next, the pocket tracking branch in the submodule is also
updated with the new version of the package. Then a new commit is
created on the superproject which updates the package's submodule to
point to the new version of the package. Finally, the new version of
the package is included in the appropriate component of the APT
repository.

Because the Git superproject, the Git submodules, and the APT
repository are all updated together to reflect the new package
version, the Git repositories and the APT repository always stay in
sync.
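
The order of these updates, including the tag being skipped for pockets
that allow backtracking, can be written down as a small planning
function. It only describes the steps and performs none of them;
``plan_repository_update`` and the step labels are hypothetical.

```python
def plan_repository_update(package, version, commit, git_branch,
                           apt_component, allow_backtracking=False):
    """Return the ordered (label, description) steps after a good build."""
    steps = []
    if not allow_backtracking:
        # Tags are only created where version consistency is enforced.
        steps.append(("tag", "tag %s at %s in %s" % (version, commit, package)))
    steps.append(("branch", "point branch %s of %s at %s"
                  % (git_branch, package, commit)))
    steps.append(("superproject", "commit on superproject branch %s moving "
                  "submodule %s to %s" % (git_branch, package, commit)))
    steps.append(("apt", "include %s %s in APT component %s"
                  % (package, version, apt_component)))
    return steps


# Hypothetical example values.
prod_plan = plan_repository_update("invirt-dev", "0.1.5", "abc123", "prod", "stable")
dev_plan = plan_repository_update("invirt-dev", "0.1.5", "abc123", "dev", "unstable",
                                  allow_backtracking=True)
```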

Build Failures
``````````````

If any of the above stages of executing a build fails, that failure is
trapped and recorded along with the build record in the database for
later inspection. Regardless of success or failure, the build daemon
runs any scripts in a hook directory. The hook directory could contain
scripts to publish the results of the build in whatever way is deemed
useful by the developers.
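
The "hooks run regardless of outcome" behaviour can be sketched as
below. The details are assumptions made for illustration: hooks are
taken to be the executable files in the directory, run in name order,
and the arguments passed to them (a status word followed by
caller-supplied values) are invented for the example. ``run_hooks`` and
``build_then_hooks`` are hypothetical names.

```python
import os
import subprocess


def run_hooks(hook_dir, args):
    """Run every executable file in `hook_dir`; return {name: exit status}."""
    results = {}
    for name in sorted(os.listdir(hook_dir)):
        path = os.path.join(hook_dir, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            results[name] = subprocess.call([path] + list(args))
    return results


def build_then_hooks(build, hook_dir, hook_args):
    """Call `build()`, trap any failure, then run the hooks either way."""
    failure = None
    try:
        build()
    except Exception as exc:
        failure = str(exc)
    status = "failed" if failure is not None else "succeeded"
    hook_results = run_hooks(hook_dir, [status] + list(hook_args))
    return status, failure, hook_results
```

A hook in this scheme might send mail or post to a chat room, using the
status argument to decide what to report.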

.. _config-package-dev: http://debathena.mit.edu/config-packages
.. _git-buildpackage: https://honk.sigxcpu.org/piki/projects/git-buildpackage/
.. _Invirt: http://invirt.mit.edu
.. _remctl: http://www.eyrie.org/~eagle/software/remctl/
.. _SIPB: http://sipb.mit.edu
.. _VCS location information: http://www.debian.org/doc/developers-reference/best-pkging-practices.html#bpp-vcs
.. _YAML: http://yaml.org/

.. [#] http://lwn.net/Articles/246381/
.. [#] A Git submodule is a second Git repository embedded at a
       particular path within the superproject and fixed at a
       particular commit.
.. [#] Because we don't force any sort of version consistency for
       pockets with ``allow_backtracking`` set to ``True``, we don't
       create new tags for builds on those pockets either.