% Source: trunk/packages/xen-common/xen-common/docs/src/user.tex @ 95 (last changed in revision 34 by hartmans: Add xen and xen-common)
1	\documentclass[11pt,twoside,final,openright]{report}
2	\usepackage{a4,graphicx,html,parskip,setspace,times,xspace,url}
3	\setstretch{1.15}
4
5	\renewcommand{\ttdefault}{pcr}
6
7	\def\Xend{{Xend}\xspace}
8	\def\xend{{xend}\xspace}
9
10	\latexhtml{\renewcommand{\path}[1]{{\small {\tt #1}}}}{\renewcommand{\path}[1]{{\tt #1}}}
11
12
13	\begin{document}
14
15	% TITLE PAGE
16	\pagestyle{empty}
17	\begin{center}
18	\vspace*{\fill}
19	\includegraphics{figs/xenlogo.eps}
20	\vfill
21	\vfill
22	\vfill
23	\begin{tabular}{l}
24	{\Huge \bf Users' Manual} \\[4mm]
25	{\huge Xen v3.0} \\[80mm]
26	\end{tabular}
27	\end{center}
28
29	{\bf DISCLAIMER: This documentation is always under active development
30	and as such there may be mistakes and omissions --- watch out for
31	these and please report any you find to the developers' mailing list,
32	xen-devel@lists.xensource.com. The latest version is always available
33	on-line. Contributions of material, suggestions and corrections are
34	welcome.}
35
36	\vfill
37	\clearpage
38
39
40	% COPYRIGHT NOTICE
41	\pagestyle{empty}
42
43	\vspace*{\fill}
44
45	Xen is Copyright \copyright 2002-2005, University of Cambridge, UK, XenSource
46	Inc., IBM Corp., Hewlett-Packard Co., Intel Corp., AMD Inc., and others. All
47	rights reserved.
48
49	Xen is an open-source project. Most portions of Xen are licensed for copying
50	under the terms of the GNU General Public License, version 2. Other portions
51	are licensed under the terms of the GNU Lesser General Public License, the
52	Zope Public License 2.0, or under ``BSD-style'' licenses. Please refer to the
53	COPYING file for details.
54
55	Xen includes software by Christopher Clark. This software is covered by the
56	following licence:
57
58	\begin{quote}
59	Copyright (c) 2002, Christopher Clark. All rights reserved.
60
61	Redistribution and use in source and binary forms, with or without
62	modification, are permitted provided that the following conditions are met:
63
64	\begin{itemize}
65	\item Redistributions of source code must retain the above copyright notice,
66	this list of conditions and the following disclaimer.
67
68	\item Redistributions in binary form must reproduce the above copyright
69	notice, this list of conditions and the following disclaimer in the
70	documentation and/or other materials provided with the distribution.
71
72	\item Neither the name of the original author; nor the names of any
73	contributors may be used to endorse or promote products derived from this
74	software without specific prior written permission.
75	\end{itemize}
76
77	THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
78	AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
79	IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
80	DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
81	FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
82	DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
83	SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
84	CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
85	OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
86	OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
87	\end{quote}
88
89	\cleardoublepage
90
91
92	% TABLE OF CONTENTS
93	\pagestyle{plain}
94	\pagenumbering{roman}
95	{ \parskip 0pt plus 1pt
96	\tableofcontents }
97	\cleardoublepage
98
99
100	% PREPARE FOR MAIN TEXT
101	\pagenumbering{arabic}
102	\raggedbottom
103	\widowpenalty=10000
104	\clubpenalty=10000
105	\parindent=0pt
106	\parskip=5pt
107	\renewcommand{\topfraction}{.8}
108	\renewcommand{\bottomfraction}{.8}
109	\renewcommand{\textfraction}{.2}
110	\renewcommand{\floatpagefraction}{.8}
111	\setstretch{1.1}
112
113
114	%% Chapter Introduction moved to introduction.tex
115	\chapter{Introduction}
116
117
118	Xen is an open-source \emph{para-virtualizing} virtual machine monitor
119	(VMM), or ``hypervisor'', for the x86 processor architecture. Xen can
120	securely execute multiple virtual machines on a single physical system
121	with close-to-native performance. Xen facilitates enterprise-grade
122	functionality, including:
123
124	\begin{itemize}
125	\item Virtual machines with performance close to native hardware.
126	\item Live migration of running virtual machines between physical hosts.
127	\item Up to 32 virtual CPUs per guest virtual machine, with VCPU hotplug.
128	\item x86/32, x86/32 with PAE, and x86/64 platform support.
129	\item Intel Virtualization Technology (VT-x) for unmodified guest operating systems (including Microsoft Windows).
130	\item Excellent hardware support (supports almost all Linux device
131	drivers).
132	\end{itemize}
133
134
135	\section{Usage Scenarios}
136
137	Usage scenarios for Xen include:
138
139	\begin{description}
140	\item [Server Consolidation.] Move multiple servers onto a single
141	physical host with performance and fault isolation provided at the
142	virtual machine boundaries.
143	\item [Hardware Independence.] Allow legacy applications and operating
144	systems to exploit new hardware.
145	\item [Multiple OS configurations.] Run multiple operating systems
146	simultaneously, for development or testing purposes.
147	\item [Kernel Development.] Test and debug kernel modifications in a
148	sand-boxed virtual machine --- no need for a separate test machine.
149	\item [Cluster Computing.] Management at VM granularity provides more
150	flexibility than separately managing each physical host, but better
151	control and isolation than single-system image solutions,
152	particularly by using live migration for load balancing.
153	\item [Hardware support for custom OSes.] Allow development of new
154	OSes while benefiting from the wide-ranging hardware support of
155	existing OSes such as Linux.
156	\end{description}
157
158
159	\section{Operating System Support}
160
161	Para-virtualization permits very high performance virtualization, even
162	on architectures like x86 that are traditionally very hard to
163	virtualize.
164
This approach requires operating systems to be \emph{ported} to run on
Xen. Porting an OS to run on Xen is similar to supporting a new
hardware platform; however, the process is simplified because the
para-virtual machine architecture is very similar to the underlying
native hardware. Even though operating system kernels must explicitly
support Xen, a key feature is that user space applications and
libraries \emph{do not} require modification.
172
With hardware CPU virtualization, as provided by Intel VT and AMD SVM
(``Pacifica'') technology, Xen can also run an unmodified guest OS
kernel. No porting of the OS is required, although some additional
driver support is necessary within Xen itself. Unlike traditional full
virtualization hypervisors, which suffer a tremendous performance
overhead, Xen combined with VT or Pacifica offers superb performance
for para-virtualized guest operating systems and full support for
unmodified guests running natively on the processor. Full support for
VT and Pacifica chipsets will appear in early 2006.
183
184	Paravirtualized Xen support is available for increasingly many
185	operating systems: currently, mature Linux support is available and
186	included in the standard distribution. Other OS ports---including
187	NetBSD, FreeBSD and Solaris x86 v10---are nearing completion.
188
189
190	\section{Hardware Support}
191
Xen currently runs on the x86 architecture, requiring a ``P6'' or
newer processor (e.g.\ Pentium Pro, Celeron, Pentium~II, Pentium~III,
Pentium~4, Xeon, AMD~Athlon, AMD~Duron). Multiprocessor machines are
supported, and there is support for Hyper-Threading (SMT). In
addition, ports to IA64 and Power architectures are in progress.
197
The default 32-bit Xen supports up to 4~GB of memory. However, Xen 3.0
adds support for Intel's Physical Address Extension (PAE), which
enables x86/32 machines to address up to 64~GB of physical memory. Xen
3.0 also supports x86/64 platforms such as Intel EM64T and AMD Opteron,
which can currently address up to 1~TB of physical memory.
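
As a rough check on these limits, the figures follow from powers of two
if one assumes the usual physical address widths: 32 bits without PAE,
36 bits with PAE, and 40 bits on current x86/64 parts. These widths are
not stated above and are given here only as an illustration.

% Assumed address widths (not taken from the text above): 32, 36 and 40 bits.
\[
2^{32}~\mbox{bytes} = 4~\mbox{GB}, \qquad
2^{36}~\mbox{bytes} = 64~\mbox{GB}, \qquad
2^{40}~\mbox{bytes} = 1~\mbox{TB}.
\]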
203
204	Xen offloads most of the hardware support issues to the guest OS
205	running in the \emph{Domain~0} management virtual machine. Xen itself
206	contains only the code required to detect and start secondary
207	processors, set up interrupt routing, and perform PCI bus
208	enumeration. Device drivers run within a privileged guest OS rather
209	than within Xen itself. This approach provides compatibility with the
210	majority of device hardware supported by Linux. The default XenLinux
211	build contains support for most server-class network and disk
212	hardware, but you can add support for other hardware by configuring
213	your XenLinux kernel in the normal way.
214
215
216	\section{Structure of a Xen-Based System}
217
218	A Xen system has multiple layers, the lowest and most privileged of
219	which is Xen itself.
220
Xen may host multiple \emph{guest} operating systems, each of which is
executed within a secure virtual machine (in Xen terminology, a
\emph{domain}). Domains are scheduled by Xen to make effective use of the
available physical CPUs. Each guest OS manages its own applications.
This management includes the responsibility of scheduling each
application within the time allotted to the VM by Xen.
227
228	The first domain, \emph{domain~0}, is created automatically when the
229	system boots and has special management privileges. Domain~0 builds
230	other domains and manages their virtual devices. It also performs
231	administrative tasks such as suspending, resuming and migrating other
232	virtual machines.
233
234	Within domain~0, a process called \emph{xend} runs to manage the system.
235	\Xend\ is responsible for managing virtual machines and providing access
236	to their consoles. Commands are issued to \xend\ over an HTTP interface,
237	via a command-line tool.
238
239
240	\section{History}
241
242	Xen was originally developed by the Systems Research Group at the
243	University of Cambridge Computer Laboratory as part of the XenoServers
244	project, funded by the UK-EPSRC\@.
245
XenoServers aim to provide a ``public infrastructure for global
distributed computing''. Xen plays a key part in that, allowing one to
efficiently partition a single machine so that multiple independent
clients can run their operating systems and applications in an
environment that provides protection, resource isolation and
accounting. The project web page contains further information along
with pointers to papers and technical reports:
\path{http://www.cl.cam.ac.uk/xeno}
254
255	Xen has grown into a fully-fledged project in its own right, enabling us
256	to investigate interesting research issues regarding the best techniques
257	for virtualizing resources such as the CPU, memory, disk and network.
Project contributors now include XenSource, Intel, IBM, HP, AMD, Novell
and Red Hat.
260
261	Xen was first described in a paper presented at SOSP in
262	2003\footnote{\tt
263	http://www.cl.cam.ac.uk/netos/papers/2003-xensosp.pdf}, and the first
264	public release (1.0) was made that October. Since then, Xen has
265	significantly matured and is now used in production scenarios on many
266	sites.
267
268	\section{What's New}
269
270	Xen 3.0.0 offers:
271
272	\begin{itemize}
273	\item Support for up to 32-way SMP guest operating systems
\item Intel Physical Address Extension (PAE) support for 32-bit
servers with more than 4~GB of physical memory
276	\item x86/64 support (Intel EM64T, AMD Opteron)
277	\item Intel VT-x support to enable the running of unmodified guest
278	operating systems (Windows XP/2003, Legacy Linux)
279	\item Enhanced control tools
280	\item Improved ACPI support
281	\item AGP/DRM graphics
282	\end{itemize}
283
284
285	Xen 3.0 features greatly enhanced hardware support, configuration
286	flexibility, usability and a larger complement of supported operating
287	systems. This latest release takes Xen a step closer to being the
288	definitive open source solution for virtualization.
289
290
291
292	\part{Installation}
293
294	%% Chapter Basic Installation
295	\chapter{Basic Installation}
296
297	The Xen distribution includes three main components: Xen itself, ports
298	of Linux and NetBSD to run on Xen, and the userspace tools required to
299	manage a Xen-based system. This chapter describes how to install the
300	Xen~3.0 distribution from source. Alternatively, there may be pre-built
301	packages available as part of your operating system distribution.
302
303
304	\section{Prerequisites}
305	\label{sec:prerequisites}
306
307	The following is a full list of prerequisites. Items marked `$\dag$' are
308	required by the \xend\ control tools, and hence required if you want to
309	run more than one virtual machine; items marked `$*$' are only required
310	if you wish to build from source.
311	\begin{itemize}
312	\item A working Linux distribution using the GRUB bootloader and running
313	on a P6-class or newer CPU\@.
314	\item [$\dag$] The \path{iproute2} package.
315	\item [$\dag$] The Linux bridge-utils\footnote{Available from {\tt
316	http://bridge.sourceforge.net}} (e.g., \path{/sbin/brctl})
317	\item [$\dag$] The Linux hotplug system\footnote{Available from {\tt
318	http://linux-hotplug.sourceforge.net/}} (e.g.,
319	\path{/sbin/hotplug} and related scripts). On newer distributions,
320	this is included alongside the Linux udev system\footnote{See {\tt
321	http://www.kernel.org/pub/linux/utils/kernel/hotplug/udev.html/}}.
322	\item [$*$] Build tools (gcc v3.2.x or v3.3.x, binutils, GNU make).
323	\item [$*$] Development installation of zlib (e.g.,\ zlib-dev).
324	\item [$*$] Development installation of Python v2.2 or later (e.g.,\
325	python-dev).
326	\item [$*$] \LaTeX\ and transfig are required to build the
327	documentation.
328	\end{itemize}
329
Once you have satisfied these prerequisites, you can install either
a binary or source distribution of Xen.
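
As a quick sanity check before installing, a small script along the
following lines may help to spot missing tools. It is only a sketch:
the tool names are assumptions derived from the list above
(\path{ip} from iproute2, \path{brctl} from bridge-utils), and because
it is written as a script, \verb!#! here starts a comment rather than
marking the root prompt.

\begin{quote}
\begin{verbatim}
#!/bin/sh
# Sketch only: report prerequisite tools that are not found on the PATH.
# The tool names are assumptions based on the list above; adjust them
# for your distribution. Run as root so that /sbin is searched.
for tool in ip brctl gcc make python; do
    command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
done
\end{verbatim}
\end{quote}

The script prints nothing when every listed tool is found. It does not
check versions or the development packages, so those still need to be
verified by hand.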
332
333	\section{Installing from Binary Tarball}
334
335	Pre-built tarballs are available for download from the XenSource downloads
336	page:
337	\begin{quote} {\tt http://www.xensource.com/downloads/}
338	\end{quote}
339
340	Once you've downloaded the tarball, simply unpack and install:
341	\begin{verbatim}
342	# tar zxvf xen-3.0-install.tgz
343	# cd xen-3.0-install
344	# sh ./install.sh
345	\end{verbatim}
346
347	Once you've installed the binaries you need to configure your system as
348	described in Section~\ref{s:configure}.
349
350	\section{Installing from RPMs}
351	Pre-built RPMs are available for download from the XenSource downloads
352	page:
353	\begin{quote} {\tt http://www.xensource.com/downloads/}
354	\end{quote}
355
356	Once you've downloaded the RPMs, you typically install them via the
357	RPM commands:
358
\verb|# rpm -iv rpmname|
360
361	See the instructions and the Release Notes for each RPM set referenced at:
362	\begin{quote}
363	{\tt http://www.xensource.com/downloads/}.
364	\end{quote}
365
366	\section{Installing from Source}
367
368	This section describes how to obtain, build and install Xen from source.
369
370	\subsection{Obtaining the Source}
371
372	The Xen source tree is available as either a compressed source tarball
373	or as a clone of our master Mercurial repository.
374
375	\begin{description}
376	\item[Obtaining the Source Tarball]\mbox{} \\
377	Stable versions and daily snapshots of the Xen source tree are
378	available from the Xen download page:
\begin{quote} {\tt http://www.xensource.com/downloads/}
380	\end{quote}
381	\item[Obtaining the source via Mercurial]\mbox{} \\
382	The source tree may also be obtained via the public Mercurial
383	repository at:
384	\begin{quote}{\tt http://xenbits.xensource.com}
385	\end{quote} See the instructions and the Getting Started Guide
386	referenced at:
387	\begin{quote}
388	{\tt http://www.xensource.com/downloads/}
389	\end{quote}
390	\end{description}
391
392	% \section{The distribution}
393	%
394	% The Xen source code repository is structured as follows:
395	%
396	% \begin{description}
397	% \item[\path{tools/}] Xen node controller daemon (Xend), command line
398	% tools, control libraries
399	% \item[\path{xen/}] The Xen VMM.
400	% \item[\path{buildconfigs/}] Build configuration files
401	% \item[\path{linux-*-xen-sparse/}] Xen support for Linux.
402	% \item[\path{patches/}] Experimental patches for Linux.
403	% \item[\path{docs/}] Various documentation files for users and
404	% developers.
405	% \item[\path{extras/}] Bonus extras.
406	% \end{description}
407
408	\subsection{Building from Source}
409
410	The top-level Xen Makefile includes a target ``world'' that will do the
411	following:
412
413	\begin{itemize}
414	\item Build Xen.
415	\item Build the control tools, including \xend.
416	\item Download (if necessary) and unpack the Linux 2.6 source code, and
417	patch it for use with Xen.
418	\item Build a Linux kernel to use in domain~0 and a smaller unprivileged
419	kernel, which can be used for unprivileged virtual machines.
420	\end{itemize}
421
422	After the build has completed you should have a top-level directory
423	called \path{dist/} in which all resulting targets will be placed. Of
424	particular interest are the two XenLinux kernel images, one with a
425	``-xen0'' extension which contains hardware device drivers and drivers
426	for Xen's virtual devices, and one with a ``-xenU'' extension that
427	just contains the virtual ones. These are found in
428	\path{dist/install/boot/} along with the image for Xen itself and the
429	configuration files used during the build.
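
A minimal sketch of the build, assuming it is run from the top-level
directory of the unpacked source tree (the leading \verb!#! is the
shell prompt, as elsewhere in this manual):

\begin{quote}
\begin{verbatim}
# make world
# ls dist/install/boot/
\end{verbatim}
\end{quote}

The second command simply lists the Xen image, the XenLinux kernel
images and the configuration files described above.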
430
431	%The NetBSD port can be built using:
432	%\begin{quote}
433	%\begin{verbatim}
434	%# make netbsd20
435	%\end{verbatim}\end{quote}
436	%NetBSD port is built using a snapshot of the netbsd-2-0 cvs branch.
437	%The snapshot is downloaded as part of the build process if it is not
438	%yet present in the \path{NETBSD\_SRC\_PATH} search path. The build
439	%process also downloads a toolchain which includes all of the tools
440	%necessary to build the NetBSD kernel under Linux.
441
442	To customize the set of kernels built you need to edit the top-level
443	Makefile. Look for the line:
444	\begin{quote}
445	\begin{verbatim}
446	KERNELS ?= linux-2.6-xen0 linux-2.6-xenU
447	\end{verbatim}
448	\end{quote}
449
450	You can edit this line to include any set of operating system kernels
451	which have configurations in the top-level \path{buildconfigs/}
452	directory.
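
For instance, to build only the domain~0 kernel, the line could be
reduced as in the sketch below. The kernel name is the one from the
default line above.

\begin{quote}
\begin{verbatim}
KERNELS ?= linux-2.6-xen0
\end{verbatim}
\end{quote}

Because the assignment uses \verb!?=!, GNU make lets a value supplied
on the command line take precedence, so the following form should have
the same effect without editing the Makefile. Treat it as an
illustration rather than the project's documented procedure.

\begin{quote}
\begin{verbatim}
# make KERNELS="linux-2.6-xen0" world
\end{verbatim}
\end{quote}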
453
454	%% Inspect the Makefile if you want to see what goes on during a
455	%% build. Building Xen and the tools is straightforward, but XenLinux
456	%% is more complicated. The makefile needs a `pristine' Linux kernel
457	%% tree to which it will then add the Xen architecture files. You can
458	%% tell the makefile the location of the appropriate Linux compressed
459	%% tar file by
460	%% setting the LINUX\_SRC environment variable, e.g. \\
461	%% \verb!# LINUX_SRC=/tmp/linux-2.6.11.tar.bz2 make world! \\ or by
462	%% placing the tar file somewhere in the search path of {\tt
463	%% LINUX\_SRC\_PATH} which defaults to `{\tt .:..}'. If the
464	%% makefile can't find a suitable kernel tar file it attempts to
465	%% download it from kernel.org (this won't work if you're behind a
466	%% firewall).
467
468	%% After untaring the pristine kernel tree, the makefile uses the {\tt
469	%% mkbuildtree} script to add the Xen patches to the kernel.
470
471	%% \framebox{\parbox{5in}{
472	%% {\bf Distro specific:} \\
473	%% {\it Gentoo} --- if not using udev (most installations,
474	%% currently), you'll need to enable devfs and devfs mount at boot
475	%% time in the xen0 config. }}
476
477	\subsection{Custom Kernels}
478
479	% If you have an SMP machine you may wish to give the {\tt '-j4'}
480	% argument to make to get a parallel build.
481
If you wish to build a customized XenLinux kernel (e.g.\ to support
additional devices or enable distribution-required features), you can
use the standard Linux configuration mechanisms, specifying that the
architecture being built for is \path{xen}, e.g.:
486	\begin{quote}
487	\begin{verbatim}
488	# cd linux-2.6.12-xen0
489	# make ARCH=xen xconfig
490	# cd ..
491	# make
492	\end{verbatim}
493	\end{quote}
494
495	You can also copy an existing Linux configuration (\path{.config}) into
496	e.g.\ \path{linux-2.6.12-xen0} and execute:
497	\begin{quote}
498	\begin{verbatim}
499	# make ARCH=xen oldconfig
500	\end{verbatim}
501	\end{quote}
502
503	You may be prompted with some Xen-specific options. We advise accepting
504	the defaults for these options.
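
A sketch of the copy-and-update approach follows. It assumes that your
distribution keeps the running kernel's configuration in
\verb!/boot/config-`uname -r`! (a common convention, but an assumption
here rather than something the Xen build requires) and that the
XenLinux tree is \path{linux-2.6.12-xen0} as above:

\begin{quote}
\begin{verbatim}
# cp /boot/config-`uname -r` linux-2.6.12-xen0/.config
# cd linux-2.6.12-xen0
# make ARCH=xen oldconfig
\end{verbatim}
\end{quote}

If your distribution stores its kernel configuration elsewhere,
substitute that path in the first command.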
505
506	Note that the only difference between the two types of Linux kernels
507	that are built is the configuration file used for each. The ``U''
508	suffixed (unprivileged) versions don't contain any of the physical
509	hardware device drivers, leading to a 30\% reduction in size; hence you
510	may prefer these for your non-privileged domains. The ``0'' suffixed
511	privileged versions can be used to boot the system, as well as in driver
512	domains and unprivileged domains.
513
514	\subsection{Installing Generated Binaries}
515
516	The files produced by the build process are stored under the
517	\path{dist/install/} directory. To install them in their default
518	locations, do:
519	\begin{quote}
520	\begin{verbatim}
521	# make install
522	\end{verbatim}
523	\end{quote}
524
525	Alternatively, users with special installation requirements may wish to
526	install them manually by copying the files to their appropriate
527	destinations.
528
529	%% Files in \path{install/boot/} include:
530	%% \begin{itemize}
531	%% \item \path{install/boot/xen-3.0.gz} Link to the Xen 'kernel'
532	%% \item \path{install/boot/vmlinuz-2.6-xen0} Link to domain 0
533	%% XenLinux kernel
534	%% \item \path{install/boot/vmlinuz-2.6-xenU} Link to unprivileged
535	%% XenLinux kernel
536	%% \end{itemize}
537
The \path{dist/install/boot} directory will also contain the config
files used for building the XenLinux kernels, and also versions of Xen
and XenLinux kernels that contain debug symbols (such as
\path{xen-syms-3.0.0} and \path{vmlinux-syms-2.6.12.6-xen0}), which are
essential for interpreting crash dumps. Retain these files as the
developers may wish to see them if you post on the mailing list.
544
545
546	\section{Configuration}
547	\label{s:configure}
548
549	Once you have built and installed the Xen distribution, it is simple to
550	prepare the machine for booting and running Xen.
551
552	\subsection{GRUB Configuration}
553
554	An entry should be added to \path{grub.conf} (often found under
555	\path{/boot/} or \path{/boot/grub/}) to allow Xen / XenLinux to boot.
556	This file is sometimes called \path{menu.lst}, depending on your
557	distribution. The entry should look something like the following:
558
559	%% KMSelf Thu Dec 1 19:06:13 PST 2005 262144 is useful for RHEL/RH and
560	%% related Dom0s.
561	{\small
562	\begin{verbatim}
563	title Xen 3.0 / XenLinux 2.6
564	kernel /boot/xen-3.0.gz dom0_mem=262144
565	module /boot/vmlinuz-2.6-xen0 root=/dev/sda4 ro console=tty0
566	\end{verbatim}
567	}
568
The kernel line tells GRUB where to find Xen itself and what boot
parameters should be passed to it (in this case, setting the domain~0
memory allocation in kilobytes). For more details on the various Xen
boot parameters see Section~\ref{s:xboot}.
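
Since the \path{dom0_mem} value is given in kilobytes, it can help to
convert it to megabytes. Taking 1~MB as 1024~KB (an assumption about
the unit convention), the value in the example entry works out as:

% Assumes 1 MB = 1024 KB.
\[
262144~\mbox{KB} \div 1024 = 256~\mbox{MB}.
\]

By the same arithmetic, the value 131072 used in the serial console
example later in this chapter corresponds to 128~MB.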
574
The module line of the configuration describes the location of the
XenLinux kernel that Xen should start and the parameters that should be
passed to it. These are standard Linux parameters: they identify the
root device, specify that it be mounted read-only initially, and direct
console output to the screen. Some distributions such as SuSE do not
require the \path{ro} parameter.
581
582	%% \framebox{\parbox{5in}{
583	%% {\bf Distro specific:} \\
584	%% {\it SuSE} --- Omit the {\tt ro} option from the XenLinux
585	%% kernel command line, since the partition won't be remounted rw
586	%% during boot. }}
587
588	To use an initrd, add another \path{module} line to the configuration,
589	like: {\small
590	\begin{verbatim}
591	module /boot/my_initrd.gz
592	\end{verbatim}
593	}
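
Put together, a complete entry with an initrd might look like the
sketch below. It reuses the example values from above, and
\path{my_initrd.gz} remains a placeholder name. The initrd line follows
the XenLinux kernel line, since the kernel is the first module and the
initrd the second.

\begin{quote}
{\small
\begin{verbatim}
# Sketch only: my_initrd.gz is a placeholder; use your own initrd image.
title Xen 3.0 / XenLinux 2.6
kernel /boot/xen-3.0.gz dom0_mem=262144
module /boot/vmlinuz-2.6-xen0 root=/dev/sda4 ro console=tty0
module /boot/my_initrd.gz
\end{verbatim}
}
\end{quote}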
594
595	%% KMSelf Thu Dec 1 19:05:30 PST 2005 Other configs as an appendix?
596
597	When installing a new kernel, it is recommended that you do not delete
598	existing menu options from \path{menu.lst}, as you may wish to boot your
599	old Linux kernel in future, particularly if you have problems.
600
601	\subsection{Serial Console (optional)}
602
603	Serial console access allows you to manage, monitor, and interact with
604	your system over a serial console. This can allow access from another
605	nearby system via a null-modem (``LapLink'') cable or remotely via a serial
606	concentrator.
607
Your system's BIOS, bootloader (GRUB), Xen, Linux, and login access must
each be individually configured for serial console access. It is
\emph{not} strictly necessary to have each component fully functional,
but it can be quite useful.
612
613	For general information on serial console configuration under Linux,
614	refer to the ``Remote Serial Console HOWTO'' at The Linux Documentation
615	Project: \url{http://www.tldp.org}
616
617	\subsubsection{Serial Console BIOS configuration}
618
Enabling system serial console output neither enables nor disables
serial capabilities in GRUB, Xen, or Linux, but may make remote
management of your system more convenient by displaying POST and other
boot messages over the serial port and allowing remote BIOS configuration.
623
624	Refer to your hardware vendor's documentation for capabilities and
625	procedures to enable BIOS serial redirection.
626
627
628	\subsubsection{Serial Console GRUB configuration}
629
Enabling GRUB serial console output neither enables nor disables Xen or
Linux serial capabilities, but may make remote management of your system
more convenient by displaying GRUB prompts, menus, and actions over
the serial port and allowing remote GRUB management.
634
635	Adding the following two lines to your GRUB configuration file,
636	typically either \path{/boot/grub/menu.lst} or \path{/boot/grub/grub.conf}
637	depending on your distro, will enable GRUB serial output.
638
639	\begin{quote}
640	{\small \begin{verbatim}
641	serial --unit=0 --speed=115200 --word=8 --parity=no --stop=1
642	terminal --timeout=10 serial console
643	\end{verbatim}}
644	\end{quote}
645
646	Note that when both the serial port and the local monitor and keyboard
647	are enabled, the text ``\emph{Press any key to continue}'' will appear
648	at both. Pressing a key on one device will cause GRUB to display to
649	that device. The other device will see no output. If no key is
650	pressed before the timeout period expires, the system will boot to the
651	default GRUB boot entry.
652
653	Please refer to the GRUB documentation for further information.
654
655
656	\subsubsection{Serial Console Xen configuration}
657
Enabling Xen serial console output neither enables nor disables Linux
kernel output or logging in to Linux over the serial port. It does, however,
allow you to monitor and log the Xen boot process via serial console and
can be very useful in debugging.
662
663	%% kernel /boot/xen-2.0.gz dom0_mem=131072 console=com1,vga com1=115200,8n1
664	%% module /boot/vmlinuz-2.6-xen0 root=/dev/sda4 ro
665
666	In order to configure Xen serial console output, it is necessary to
667	add a boot option to your GRUB config; e.g.\ replace the previous
668	example kernel line with:
669	\begin{quote} {\small \begin{verbatim}
670	kernel /boot/xen.gz dom0_mem=131072 com1=115200,8n1
671	\end{verbatim}}
672	\end{quote}
673
674	This configures Xen to output on COM1 at 115,200 baud, 8 data bits, no
675	parity and 1 stop bit. Modify these parameters for your environment.
676	See Section~\ref{s:xboot} for an explanation of all boot parameters.
677
678	One can also configure XenLinux to share the serial console; to achieve
679	this append ``\path{console=ttyS0}'' to your module line.
680
681
682	\subsubsection{Serial Console Linux configuration}
683
Enabling Linux serial console output at boot neither enables nor
disables logging in to Linux over the serial port. It does, however, allow
you to monitor and log the Linux boot process via serial console and can be
very useful in debugging.
688
689	To enable Linux output at boot time, add the parameter
690	\path{console=ttyS0} (or ttyS1, ttyS2, etc.) to your kernel GRUB line.
691	Under Xen, this might be:
692	\begin{quote}
693	{\footnotesize \begin{verbatim}
module /vmlinuz-2.6-xen0 ro root=/dev/VolGroup00/LogVol00 \
console=ttyS0,115200
696	\end{verbatim}}
697	\end{quote}
698	to enable output over ttyS0 at 115200 baud.
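
The GRUB, Xen and Linux serial settings described in the preceding
sections can be combined in a single GRUB configuration. The following
is only a sketch: the paths, root device and \path{dom0_mem} value are
the example values used earlier in this chapter, and the entry title is
made up for illustration.

\begin{quote}
{\small
\begin{verbatim}
# Sketch only: adjust paths, root device and memory size for your system.
serial --unit=0 --speed=115200 --word=8 --parity=no --stop=1
terminal --timeout=10 serial console

title Xen 3.0 / XenLinux 2.6 (serial console)
kernel /boot/xen-3.0.gz dom0_mem=262144 com1=115200,8n1
module /boot/vmlinuz-2.6-xen0 root=/dev/sda4 ro console=ttyS0,115200
\end{verbatim}
}
\end{quote}

Keeping the same speed (115200 here) in all three places avoids garbled
output when control passes from GRUB to Xen and then to Linux.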
699
700
701
702	\subsubsection{Serial Console Login configuration}
703
704	Logging in to Linux via serial console, under Xen or otherwise, requires
705	specifying a login prompt be started on the serial port. To permit root
706	logins over serial console, the serial port must be added to
707	\path{/etc/securetty}.
708
709	\newpage
To automatically start a login prompt over the serial port,
add the line: \begin{quote} {\small {\tt c:2345:respawn:/sbin/mingetty
ttyS0}} \end{quote} to \path{/etc/inittab}. Run \path{init q} to force
a reload of your inittab and start getty.
714
715	To enable root logins, add \path{ttyS0} to \path{/etc/securetty} if not
716	already present.
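
The steps above can be sketched as the following root shell session
(the leading \verb!#! is the root prompt, as elsewhere in this manual).
It assumes the first serial port, \path{ttyS0}, and that the
\path{inittab} line has already been added. The \path{grep} test is
only there to avoid adding a duplicate entry.

\begin{quote}
\begin{verbatim}
# grep -qx ttyS0 /etc/securetty || echo ttyS0 >> /etc/securetty
# init q
\end{verbatim}
\end{quote}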
717
718	Your distribution may use an alternate getty; options include getty,
719	mgetty and agetty. Consult your distribution's documentation
720	for further information.
721
722
723	\subsection{TLS Libraries}
724
725	Users of the XenLinux 2.6 kernel should disable Thread Local Storage
726	(TLS) (e.g.\ by doing a \path{mv /lib/tls /lib/tls.disabled}) before
727	attempting to boot a XenLinux kernel\footnote{If you boot without first
728	disabling TLS, you will get a warning message during the boot process.
729	In this case, simply perform the rename after the machine is up and
730	then run \path{/sbin/ldconfig} to make it take effect.}. You can
731	always reenable TLS by restoring the directory to its original location
732	(i.e.\ \path{mv /lib/tls.disabled /lib/tls}).
733
734	The reason for this is that the current TLS implementation uses
735	segmentation in a way that is not permissible under Xen. If TLS is not
736	disabled, an emulation mode is used within Xen which reduces performance
737	substantially. To ensure full performance you should install a
738	`Xen-friendly' (nosegneg) version of the library.
739
740
741	\section{Booting Xen}
742
It should now be possible to restart the system and use Xen. Reboot and
choose the new Xen option when the GRUB screen appears.
745
746	What follows should look much like a conventional Linux boot. The first
747	portion of the output comes from Xen itself, supplying low level
748	information about itself and the underlying hardware. The last portion
749	of the output comes from XenLinux.
750
751	You may see some error messages during the XenLinux boot. These are not
752	necessarily anything to worry about---they may result from kernel
753	configuration differences between your XenLinux kernel and the one you
754	usually use.
755
756	When the boot completes, you should be able to log into your system as
757	usual. If you are unable to log in, you should still be able to reboot
758	with your normal Linux kernel by selecting it at the GRUB prompt.
759
760
761	% Booting Xen
762	\chapter{Booting a Xen System}
763
764	Booting the system into Xen will bring you up into the privileged
765	management domain, Domain0. At that point you are ready to create
766	guest domains and ``boot'' them using the \texttt{xm create} command.
767
768	\section{Booting Domain0}
769
After installation and configuration are complete, reboot the system
and choose the new Xen option when the GRUB screen appears.

What follows should look much like a conventional Linux boot. The
first portion of the output comes from Xen itself, supplying low-level
information about itself and the underlying hardware. The last
portion of the output comes from XenLinux.
777
778	%% KMSelf Wed Nov 30 18:09:37 PST 2005: We should specify what these are.
779
780	When the boot completes, you should be able to log into your system as
781	usual. If you are unable to log in, you should still be able to
782	reboot with your normal Linux kernel by selecting it at the GRUB prompt.
783
The first step in creating a new domain is to prepare a root
filesystem for it to boot. Typically, this is stored in a normal
partition, an LVM or other volume manager partition, a disk file, or on
an NFS server. A simple way to do this is to boot from your
standard OS install CD and install the distribution into another
partition on your hard drive.
790
791	To start the \xend\ control daemon, type
792	\begin{quote}
793	\verb!# xend start!
794	\end{quote}
795
796	If you wish the daemon to start automatically, see the instructions in
797	Section~\ref{s:xend}. Once the daemon is running, you can use the
798	\path{xm} tool to monitor and maintain the domains running on your
799	system. This chapter provides only a brief tutorial. We provide full
800	details of the \path{xm} tool in the next chapter.
801
802	% \section{From the web interface}
803	%
804	% Boot the Xen machine and start Xensv (see Chapter~\ref{cha:xensv}
805	% for more details) using the command: \\
806	% \verb_# xensv start_ \\
807	% This will also start Xend (see Chapter~\ref{cha:xend} for more
808	% information).
809	%
810	% The domain management interface will then be available at {\tt
811	% http://your\_machine:8080/}. This provides a user friendly wizard
812	% for starting domains and functions for managing running domains.
813	%
814	% \section{From the command line}
815	\section{Booting Guest Domains}
816
817	\subsection{Creating a Domain Configuration File}
818
819	Before you can start an additional domain, you must create a
820	configuration file. We provide two example files which you can use as
821	a starting point:
822	\begin{itemize}
823	\item \path{/etc/xen/xmexample1} is a simple template configuration
824	file for describing a single VM\@.
\item \path{/etc/xen/xmexample2} is a template description that
is intended to be reused for multiple virtual machines. Setting the
value of the \path{vmid} variable on the \path{xm} command line
fills in parts of this template.
829	\end{itemize}
830
831	There are also a number of other examples which you may find useful.
832	Copy one of these files and edit it as appropriate. Typical values
833	you may wish to edit include:
834
835	\begin{quote}
836	\begin{description}
837	\item[kernel] Set this to the path of the kernel you compiled for use
838	with Xen (e.g.\ \path{kernel = ``/boot/vmlinuz-2.6-xenU''})
839	\item[memory] Set this to the size of the domain's memory in megabytes
840	(e.g.\ \path{memory = 64})
\item[disk] Set the first entry in this list to calculate the offset
of the domain's root partition, based on the domain ID\@. Set the
second to the location of \path{/usr} if you are sharing it between
domains (e.g.\ \path{disk = ['phy:your\_hard\_drive\%d,sda1,w' \%
(base\_partition\_number + vmid),
'phy:your\_usr\_partition,sda6,r' ]})
847	\item[dhcp] Uncomment the dhcp variable, so that the domain will
848	receive its IP address from a DHCP server (e.g.\ \path{dhcp=``dhcp''})
849	\end{description}
850	\end{quote}
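
As a rough illustration, the values above might be combined as follows.
Domain configuration files use Python syntax, so this is a minimal sketch
rather than a copy of any shipped example: the device names \path{sda} and
\path{sda6}, the value of \path{base\_partition\_number}, and the fixed
\path{vmid} are placeholders chosen only for illustration.

```python
# Sketch of a domain configuration fragment (Python syntax).
# All device names and numbers below are illustrative placeholders.

vmid = 1                    # normally supplied as vmid=N on the xm command line
base_partition_number = 4   # hypothetical offset of the first guest root partition

kernel = "/boot/vmlinuz-2.6-xenU"   # kernel built for use with Xen
memory = 64                         # domain memory in megabytes

disk = [
    # Root partition, selected per domain from vmid, exported read-write.
    'phy:sda%d,sda1,w' % (base_partition_number + vmid),
    # Shared /usr partition, exported read-only.
    'phy:sda6,sda6,r',
]

dhcp = "dhcp"               # obtain the IP address from a DHCP server
```

With these placeholder values, the first \path{disk} entry refers to the
fifth partition of \path{sda}; a different \path{vmid} selects a different
root partition without any other change to the file.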
851
852	You may also want to edit the {\bf vif} variable in order to choose
853	the MAC address of the virtual ethernet interface yourself. For
854	example:
855
856	\begin{quote}
857	\verb_vif = ['mac=00:16:3E:F6:BB:B3']_
858	\end{quote}
859	If you do not set this variable, \xend\ will automatically generate a
860	random MAC address from the range 00:16:3E:xx:xx:xx, assigned by IEEE to
861	XenSource as an OUI (organizationally unique identifier). XenSource
862	Inc. gives permission for anyone to use addresses randomly allocated
863	from this range for use by their Xen domains.
864
865	For a list of IEEE OUI assignments, see
866	\url{http://standards.ieee.org/regauth/oui/oui.txt}
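
The address format can be illustrated with a few lines of Python. This is
only a sketch of picking a random address under the XenSource OUI; it is
not the code \xend\ itself uses, and the function name
\path{random\_xen\_mac} is invented for this example.

```python
import random

XEN_OUI = (0x00, 0x16, 0x3E)  # prefix assigned by IEEE to XenSource

def random_xen_mac(rng=random):
    """Return a random MAC address of the form 00:16:3E:xx:xx:xx."""
    tail = [rng.randint(0x00, 0xFF) for _ in range(3)]
    return ":".join("%02X" % octet for octet in XEN_OUI + tuple(tail))

# Value suitable for the vif variable of a domain configuration file.
vif = ['mac=%s' % random_xen_mac()]
```

Setting the address explicitly in this way keeps it stable across restarts
of the domain, which a freshly generated random address would not be.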
867
868
869	\subsection{Booting the Guest Domain}
870
871	The \path{xm} tool provides a variety of commands for managing
872	domains. Use the \path{create} command to start new domains. Assuming
873	you've created a configuration file \path{myvmconf} based around
874	\path{/etc/xen/xmexample2}, to start a domain with virtual machine
875	ID~1 you should type:
876
877	\begin{quote}
878	\begin{verbatim}
879	# xm create -c myvmconf vmid=1
880	\end{verbatim}
881	\end{quote}
882
The \path{-c} switch causes \path{xm} to attach to the domain's
console after creation. The \path{vmid=1} argument sets the \path{vmid}
variable used in the \path{myvmconf} file.
886
887	You should see the console boot messages from the new domain appearing
888	in the terminal in which you typed the command, culminating in a login
889	prompt.
890
891
892	\section{Starting / Stopping Domains Automatically}
893
It is possible to have certain domains start automatically at boot
time and to have dom0 wait for all running domains to shut down before
it shuts down the system.

To specify that a domain should start at boot time, place its configuration
file (or a link to it) under \path{/etc/xen/auto/}.
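
Creating the link can be scripted. The sketch below uses Python's
\path{os.symlink}; the helper name \path{enable\_autostart} and the example
path in the comment are hypothetical, and only the \path{/etc/xen/auto/}
directory comes from the text above.

```python
import os

def enable_autostart(config_path, auto_dir="/etc/xen/auto"):
    """Link a domain configuration file into the autostart directory."""
    link_path = os.path.join(auto_dir, os.path.basename(config_path))
    if not os.path.lexists(link_path):
        os.symlink(os.path.abspath(config_path), link_path)
    return link_path

# Example (run as root in dom0; the path is a placeholder):
# enable_autostart("/etc/xen/myvmconf")
```

Using a link rather than a copy means the configuration file is maintained
in a single place.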
900
901	A Sys-V style init script for Red Hat and LSB-compliant systems is
902	provided and will be automatically copied to \path{/etc/init.d/}
903	during install. You can then enable it in the appropriate way for
904	your distribution.
905
906	For instance, on Red Hat:
907
908	\begin{quote}
909	\verb_# chkconfig --add xendomains_
910	\end{quote}
911
912	By default, this will start the boot-time domains in runlevels 3, 4
913	and 5.
914
You can also use the \path{service} command to run this script
manually, e.g.:
917
918	\begin{quote}
919	\verb_# service xendomains start_
920
921	Starts all the domains with config files under /etc/xen/auto/.
922	\end{quote}
923
924	\begin{quote}
925	\verb_# service xendomains stop_
926
927	Shuts down all running Xen domains.
928	\end{quote}
929
930
931
932	\part{Configuration and Management}
933
934	%% Chapter Domain Management Tools and Daemons
935	\chapter{Domain Management Tools}
936
937	This chapter summarizes the management software and tools available.
938
939
940	\section{\Xend\ }
941	\label{s:xend}
942
943
944	The \Xend\ node control daemon performs system management functions
945	related to virtual machines. It forms a central point of control of
946	virtualized resources, and must be running in order to start and manage
947	virtual machines. \Xend\ must be run as root because it needs access to
948	privileged system management functions.
949
An initialization script named \texttt{/etc/init.d/xend} is provided to
start \Xend\ at boot time. Use the tool appropriate for your Linux
distribution (e.g.\ chkconfig) to specify the runlevels at which this script
should be executed, or manually create symbolic links in the correct
runlevel directories.
955
956	\Xend\ can be started on the command line as well, and supports the
957	following set of parameters:
958
959	\begin{tabular}{ll}
960	\verb!# xend start! & start \xend, if not already running \\
961	\verb!# xend stop! & stop \xend\ if already running \\
962	\verb!# xend restart! & restart \xend\ if running, otherwise start it \\
963	% \verb!# xend trace_start! & start \xend, with very detailed debug logging \\
964	\verb!# xend status! & indicates \xend\ status by its return code
965	\end{tabular}
966
967	A SysV init script called {\tt xend} is provided to start \xend\ at
968	boot time. {\tt make install} installs this script in
969	\path{/etc/init.d}. To enable it, you have to make symbolic links in
970	the appropriate runlevel directories or use the {\tt chkconfig} tool,
971	where available. Once \xend\ is running, administration can be done
972	using the \texttt{xm} tool.
973
974	\subsection{Logging}
975
976	As \xend\ runs, events will be logged to \path{/var/log/xen/xend.log} and
977	(less frequently) to \path{/var/log/xen/xend-debug.log}. These, along with
978	the standard syslog files, are useful when troubleshooting problems.
979
980	\subsection{Configuring \Xend\ }
981
982	\Xend\ is written in Python. At startup, it reads its configuration
983	information from the file \path{/etc/xen/xend-config.sxp}. The Xen
984	installation places an example \texttt{xend-config.sxp} file in the
985	\texttt{/etc/xen} subdirectory which should work for most installations.
986
987	See the example configuration file \texttt{xend-debug.sxp} and the
988	section 5 man page \texttt{xend-config.sxp} for a full list of
989	parameters and more detailed information. Some of the most important
990	parameters are discussed below.
991
992	An HTTP interface and a Unix domain socket API are available to
993	communicate with \Xend. This allows remote users to pass commands to the
994	daemon. By default, \Xend does not start an HTTP server. It does start a
995	Unix domain socket management server, as the low level utility
996	\texttt{xm} requires it. For support of cross-machine migration, \Xend\
997	can start a relocation server. This support is not enabled by default
998	for security reasons.
999
1000	Note: the example \texttt{xend} configuration file modifies the defaults and
1001	starts up \Xend\ as an HTTP server as well as a relocation server.
1002
1003	From the file:
1004
1005	\begin{verbatim}
1006	#(xend-http-server no)
1007	(xend-http-server yes)
1008	#(xend-unix-server yes)
1009	#(xend-relocation-server no)
1010	(xend-relocation-server yes)
1011	\end{verbatim}
1012
1013	Comment or uncomment lines in that file to disable or enable features
1014	that you require.
1015
1016	Connections from remote hosts are disabled by default:
1017
1018	\begin{verbatim}
1019	# Address xend should listen on for HTTP connections, if xend-http-server is
1020	# set.
1021	# Specifying 'localhost' prevents remote connections.
1022	# Specifying the empty string '' (the default) allows all connections.
1023	#(xend-address '')
1024	(xend-address localhost)
1025	\end{verbatim}
1026
1027	It is recommended that if migration support is not needed, the
1028	\texttt{xend-relocation-server} parameter value be changed to
1029	``\texttt{no}'' or commented out.
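
To check which of these options are active in a given file, the simple
one-line \texttt{(name value)} settings can be read with a short script.
This is a hedged sketch, not \Xend's own parser: it ignores comment lines
and handles only flat, single-line entries such as those shown above. The
function name \path{read\_simple\_settings} is an assumption made for this
example.

```python
import re

# Matches flat one-line entries such as "(xend-http-server yes)".
SETTING = re.compile(r"^\(([\w-]+)\s+([^()]*)\)$")

def read_simple_settings(text):
    """Return a dict of the uncommented single-line settings in text."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or commented-out setting
        match = SETTING.match(line)
        if match:
            settings[match.group(1)] = match.group(2).strip().strip("'")
    return settings

example = """
#(xend-http-server no)
(xend-http-server yes)
#(xend-unix-server yes)
#(xend-relocation-server no)
(xend-relocation-server yes)
(xend-address localhost)
"""

for name, value in sorted(read_simple_settings(example).items()):
    print("%s = %s" % (name, value))
```

Run against the excerpts above, the script reports the HTTP and relocation
servers as enabled and the listen address as \texttt{localhost}, which makes
it easy to spot a relocation server that was left on unintentionally.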
1030
1031	\section{Xm}
1032	\label{s:xm}
1033
1034	The xm tool is the primary tool for managing Xen from the console. The
1035	general format of an xm command line is:
1036
1037	\begin{verbatim}
1038	# xm command [switches] [arguments] [variables]
1039	\end{verbatim}
1040
1041	The available \emph{switches} and \emph{arguments} are dependent on the
1042	\emph{command} chosen. The \emph{variables} may be set using
1043	declarations of the form {\tt variable=value} and command line
1044	declarations override any of the values in the configuration file being
1045	used, including the standard variables described above and any custom
1046	variables (for instance, the \path{xmdefconfig} file uses a {\tt vmid}
1047	variable).
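
The precedence rule can be pictured as follows. This is an illustrative
model only, not the actual \path{xm} implementation: values from the
configuration file are collected first, and each {\tt variable=value}
declaration from the command line then replaces or adds an entry. The helper
name \path{apply\_overrides} is made up for this sketch, and all values are
kept as strings for simplicity.

```python
def apply_overrides(config, declarations):
    """Overlay command-line variable=value declarations on config values."""
    merged = dict(config)
    for declaration in declarations:
        name, sep, value = declaration.partition("=")
        if not sep or not name:
            raise ValueError("expected variable=value, got %r" % declaration)
        merged[name] = value  # the command line wins over the file
    return merged

file_values = {"memory": "64", "dhcp": "dhcp"}
merged = apply_overrides(file_values, ["memory=128", "vmid=1"])
print(merged)
```

Here \path{memory} from the file is replaced, \path{dhcp} is kept, and the
custom variable \path{vmid} is added.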
1048
1049	For online help for the commands available, type:
1050
1051	\begin{quote}
1052	\begin{verbatim}
1053	# xm help
1054	\end{verbatim}
1055	\end{quote}
1056
1057	This will list the most commonly used commands. The full list can be obtained
1058	using \verb_xm help --long_. You can also type \path{xm help $<$command$>$}
1059	for more information on a given command.
1060
1061	\subsection{Basic Management Commands}
1062
1063	One useful command is \verb_# xm list_ which lists all domains running in rows
1064	of the following format:
1065	\begin{center} {\tt name domid memory vcpus state cputime}
1066	\end{center}
1067
1068	The meaning of each field is as follows:
1069	\begin{quote}
1070	\begin{description}
1071	\item[name] The descriptive name of the virtual machine.
1072	\item[domid] The number of the domain ID this virtual machine is
1073	running in.
1074	\item[memory] Memory size in megabytes.
1075	\item[vcpus] The number of virtual CPUs this domain has.
1076	\item[state] Domain state consists of 5 fields:
1077	\begin{description}
1078	\item[r] running
1079	\item[b] blocked
1080	\item[p] paused
1081	\item[s] shutdown
1082	\item[c] crashed
1083	\end{description}
1084	\item[cputime] How much CPU time (in seconds) the domain has used so
1085	far.
1086	\end{description}
1087	\end{quote}
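
When post-processing this output in a script, each row can be split into
the six fields listed above. The following Python sketch assumes
whitespace-separated columns and a state column in which inactive flags are
shown as dashes; the sample row is invented for illustration and is not
captured \path{xm} output.

```python
STATE_FLAGS = {
    "r": "running",
    "b": "blocked",
    "p": "paused",
    "s": "shutdown",
    "c": "crashed",
}

def parse_list_row(row):
    """Split one 'name domid memory vcpus state cputime' row into a dict."""
    name, domid, memory, vcpus, state, cputime = row.split()
    return {
        "name": name,
        "domid": int(domid),
        "memory_mb": int(memory),
        "vcpus": int(vcpus),
        # Keep only the flags that are set; '-' marks an unset flag.
        "state": [STATE_FLAGS[flag] for flag in state if flag in STATE_FLAGS],
        "cputime_s": float(cputime),
    }

sample = "myVM 1 64 1 -b--- 12.5"   # hypothetical row
print(parse_list_row(sample))
```

Any header line printed before the rows would need to be skipped before
calling this function.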
1088
1089	The \path{xm list} command also supports a long output format when the
1090	\path{-l} switch is used. This outputs the full details of the
1091	running domains in \xend's SXP configuration format.
1092
1093	If you want to know how long your domains have been running for, then
1094	you can use the \verb_# xm uptime_ command.
1095
1096
1097	You can get access to the console of a particular domain using
1098	the \verb_# xm console_ command (e.g.\ \verb_# xm console myVM_).
1099
1100	\subsection{Domain Scheduling Management Commands}
1101
1102	The credit CPU scheduler automatically load balances guest VCPUs
1103	across all available physical CPUs on an SMP host. The user need
1104	not manually pin VCPUs to load balance the system. However, she
1105	can restrict which CPUs a particular VCPU may run on using
1106	the \path{xm vcpu-pin} command.
1107
1108	Each guest domain is assigned a \path{weight} and a \path{cap}.
1109
1110	A domain with a weight of 512 will get twice as much CPU as a
1111	domain with a weight of 256 on a contended host. Legal weights
1112	range from 1 to 65535 and the default is 256.
1113
1114	The cap optionally fixes the maximum amount of CPU a guest will
1115	be able to consume, even if the host system has idle CPU cycles.
The cap is expressed as a percentage of one physical CPU: 100 is
1 physical CPU, 50 is half a CPU, 400 is 4 CPUs, etc. The
default, 0, means there is no upper cap.
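
The arithmetic behind weights and caps can be made concrete with a small
calculation. This is a simplified model, assuming every listed domain is
continuously competing for CPU on a contended host; it is not the
scheduler's actual algorithm, and the function names are chosen only for
this sketch.

```python
def contended_shares(weights):
    """Return each domain's fraction of CPU time when all domains compete."""
    for name, weight in weights.items():
        if not 1 <= weight <= 65535:
            raise ValueError("weight for %s must be in 1..65535" % name)
    total = float(sum(weights.values()))
    return {name: weight / total for name, weight in weights.items()}

def cap_to_cpus(cap):
    """Convert a cap (percent of one physical CPU) to a CPU count.

    A cap of 0 means no upper limit, represented here as None.
    """
    if cap < 0:
        raise ValueError("cap must not be negative")
    return None if cap == 0 else cap / 100.0

shares = contended_shares({"domA": 512, "domB": 256})
print(shares)             # domA receives twice the share of domB
print(cap_to_cpus(50))    # half of one physical CPU
```

Weights only matter relative to one another, whereas a cap is an absolute
limit that applies even when the host is otherwise idle.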
1119
1120	When you are running with the credit scheduler, you can check and
1121	modify your domains' weights and caps using the \path{xm sched-credit}
1122	command:
1123
1124	\begin{tabular}{ll}
1125	\verb!xm sched-credit -d <domain>! & lists weight and cap \\
1126	\verb!xm sched-credit -d <domain> -w <weight>! & sets the weight \\
1127	\verb!xm sched-credit -d <domain> -c <cap>! & sets the cap
1128	\end{tabular}
1129
1130
1131
1132	%% Chapter Domain Configuration
1133	\chapter{Domain Configuration}
1134	\label{cha:config}
1135
This chapter describes the syntax of the domain configuration files
and explains how to further specify networking, driver domain,
and general scheduling behavior.
1139
1140
1141	\section{Configuration Files}
1142	\label{s:cfiles}
1143
1144	Xen configuration files contain the following standard variables.
1145	Unless otherwise stated, configuration items should be enclosed in
1146	quotes: see the configuration scripts in \path{/etc/xen/}
1147	for concrete examples.
1148
1149	\begin{description}
1150	\item[kernel] Path to the kernel image.
1151	\item[ramdisk] Path to a ramdisk image (optional).
1152	% \item[builder] The name of the domain build function (e.g.
1153	% {\tt'linux'} or {\tt'netbsd'}.
1154	\item[memory] Memory size in megabytes.
1155	\item[vcpus] The number of virtual CPUs.
1156	\item[console] Port to export the domain console on (default 9600 +
1157	domain ID).
1158	\item[vif] Network interface configuration. This may simply contain
1159	an empty string for each desired interface, or may override various
1160	settings, e.g.\
1161	\begin{verbatim}
1162	vif = [ 'mac=00:16:3E:00:00:11, bridge=xen-br0',
1163	'bridge=xen-br1' ]
1164	\end{verbatim}
1165	to assign a MAC address and bridge to the first interface and assign
1166	a different bridge to the second interface, leaving \xend\ to choose
1167	the MAC address. The settings that may be overridden in this way are
1168	type, mac, bridge, ip, script, backend, and vifname.
1169	\item[disk] List of block devices to export to the domain e.g.
1170	\verb_disk = [ 'phy:hda1,sda1,r' ]_
1171	exports physical device \path{/dev/hda1} to the domain as
1172	\path{/dev/sda1} with read-only access. Exporting a disk read-write
1173	which is currently mounted is dangerous -- if you are \emph{certain}
1174	you wish to do this, you can specify \path{w!} as the mode.
1175	\item[dhcp] Set to {\tt `dhcp'} if you want to use DHCP to configure
1176	networking.
1177	\item[netmask] Manually configured IP netmask.
1178	\item[gateway] Manually configured IP gateway.
1179	\item[hostname] Set the hostname for the virtual machine.
1180	\item[root] Specify the root device parameter on the kernel command
1181	line.
1182	\item[nfs\_server] IP address for the NFS server (if any).
1183	\item[nfs\_root] Path of the root filesystem on the NFS server (if
1184	any).
1185	\item[extra] Extra string to append to the kernel command line (if
1186	any)
1187	\end{description}
1188
1189	Additional fields are documented in the example configuration files
1190	(e.g. to configure virtual TPM functionality).
1191
1192	For additional flexibility, it is also possible to include Python
1193	scripting commands in configuration files. An example of this is the
1194	\path{xmexample2} file, which uses Python code to handle the
1195	\path{vmid} variable.
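
As a hedged illustration of this kind of scripting (it is not the contents
of \path{xmexample2}), the fragment below derives several per-domain
settings from a single \path{vmid} value. The naming scheme, the MAC address
layout, and the upper bound on \path{vmid} are assumptions made for this
sketch.

```python
def settings_for(vmid):
    """Derive per-domain configuration values from a numeric vmid."""
    vmid = int(vmid)  # accept either an int or a string such as "3"
    if not 1 <= vmid <= 255:
        raise ValueError("vmid must be between 1 and 255 in this sketch")
    return {
        "name": "VM%d" % vmid,
        # Last octet of the MAC address follows vmid (illustrative scheme).
        "vif": ['mac=00:16:3E:00:00:%02X' % vmid],
        "hostname": "vm%d" % vmid,
    }

print(settings_for("3"))
```

Checking the value early gives a clear error message when the variable is
missing or out of range, instead of a domain that starts with a wrong name
or address.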
1196
1197
1198	%\part{Advanced Topics}
1199
1200
1201	\section{Network Configuration}
1202
1203	For many users, the default installation should work ``out of the
1204	box''. More complicated network setups, for instance with multiple
1205	Ethernet interfaces and/or existing bridging setups will require some
1206	special configuration.
1207
1208	The purpose of this section is to describe the mechanisms provided by
1209	\xend\ to allow a flexible configuration for Xen's virtual networking.
1210
1211	\subsection{Xen virtual network topology}
1212
1213	Each domain network interface is connected to a virtual network
1214	interface in dom0 by a point to point link (effectively a ``virtual
1215	crossover cable''). These devices are named {\tt
1216	vif$<$domid$>$.$<$vifid$>$} (e.g.\ {\tt vif1.0} for the first
1217	interface in domain~1, {\tt vif3.1} for the second interface in
1218	domain~3).
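
The naming scheme is easy to generate or parse in scripts that manage these
interfaces. A minimal Python sketch follows; the helper names are invented
for this example.

```python
import re

VIF_NAME = re.compile(r"^vif(\d+)\.(\d+)$")

def vif_name(domid, vifid):
    """Return the dom0 interface name for a domain's virtual interface."""
    return "vif%d.%d" % (domid, vifid)

def parse_vif_name(name):
    """Return (domid, vifid) for a name such as 'vif3.1'."""
    match = VIF_NAME.match(name)
    if match is None:
        raise ValueError("not a vif name: %r" % name)
    return int(match.group(1)), int(match.group(2))

print(vif_name(1, 0))            # first interface in domain 1
print(parse_vif_name("vif3.1"))  # second interface in domain 3
```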
1219
1220	Traffic on these virtual interfaces is handled in domain~0 using
1221	standard Linux mechanisms for bridging, routing, rate limiting, etc.
1222	Xend calls on two shell scripts to perform initial configuration of
1223	the network and configuration of new virtual interfaces. By default,
1224	these scripts configure a single bridge for all the virtual
1225	interfaces. Arbitrary routing / bridging configurations can be
1226	configured by customizing the scripts, as described in the following
1227	section.
1228
1229	\subsection{Xen networking scripts}
1230
1231	Xen's virtual networking is configured by two shell scripts (by
1232	default \path{network-bridge} and \path{vif-bridge}). These are called
1233	automatically by \xend\ when certain events occur, with arguments to
1234	the scripts providing further contextual information. These scripts
1235	are found by default in \path{/etc/xen/scripts}. The names and
1236	locations of the scripts can be configured in
1237	\path{/etc/xen/xend-config.sxp}.
1238
1239	\begin{description}
1240	\item[network-bridge:] This script is called whenever \xend\ is started or
1241	stopped to respectively initialize or tear down the Xen virtual
1242	network. In the default configuration initialization creates the
1243	bridge `xen-br0' and moves eth0 onto that bridge, modifying the
1244	routing accordingly. When \xend\ exits, it deletes the Xen bridge
1245	and removes eth0, restoring the normal IP and routing configuration.
1246
1247	%% In configurations where the bridge already exists, this script
1248	%% could be replaced with a link to \path{/bin/true} (for instance).
1249
1250	\item[vif-bridge:] This script is called for every domain virtual
1251	interface and can configure firewalling rules and add the vif to the
1252	appropriate bridge. By default, this adds and removes VIFs on the
1253	default Xen bridge.
1254	\end{description}
1255
1256	Other example scripts are available (\path{network-route} and
1257	\path{vif-route}, \path{network-nat} and \path{vif-nat}).
For more complex network setups (e.g.\ where routing is required or
integration with existing bridges is needed) these scripts may be replaced with
customized variants for your site's preferred configuration.
1261
1262	\section{Driver Domain Configuration}
1263	\label{s:ddconf}
1264
1265	\subsection{PCI}
1266	\label{ss:pcidd}
1267
1268	Individual PCI devices can be assigned to a given domain (a PCI driver domain)
1269	to allow that domain direct access to the PCI hardware.
1270
1271	While PCI Driver Domains can increase the stability and security of a system
1272	by addressing a number of security concerns, there are some security issues
1273	that remain that you can read about in Section~\ref{s:ddsecurity}.
1274
1275	\subsubsection{Compile-Time Setup}
1276	To use this functionality, ensure
1277	that the PCI Backend is compiled in to a privileged domain (e.g. domain 0)
1278	and that the domains which will be assigned PCI devices have the PCI Frontend
1279	compiled in. In XenLinux, the PCI Backend is available under the Xen
1280	configuration section while the PCI Frontend is under the
1281	architecture-specific "Bus Options" section. You may compile both the backend
1282	and the frontend into the same kernel; they will not affect each other.
1283
1284	\subsubsection{PCI Backend Configuration - Binding at Boot}
1285	The PCI devices you wish to assign to unprivileged domains must be "hidden"
1286	from your backend domain (usually domain 0) so that it does not load a driver
1287	for them. Use the \path{pciback.hide} kernel parameter which is specified on
1288	the kernel command-line and is configurable through GRUB (see
1289	Section~\ref{s:configure}). Note that devices are not really hidden from the
1290	backend domain. The PCI Backend appears to the Linux kernel as a regular PCI
1291	device driver. The PCI Backend ensures that no other device driver loads
1292	for the devices by binding itself as the device driver for those devices.
1293	PCI devices are identified by hexadecimal slot/function numbers (on Linux,
1294	use \path{lspci} to determine slot/function numbers of your devices) and
1295	can be specified with or without the PCI domain: \\
1296	\centerline{ {\tt ({\em bus}:{\em slot}.{\em func})} example {\tt (02:1d.3)}} \\
1297	\centerline{ {\tt ({\em domain}:{\em bus}:{\em slot}.{\em func})} example {\tt (0000:02:1d.3)}} \\
1298
1299	An example kernel command-line which hides two PCI devices might be: \\
1300	\centerline{ {\tt root=/dev/sda4 ro console=tty0 pciback.hide=(02:01.f)(0000:04:1d.0) } } \\
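
The parameter is a concatenation of parenthesized slot names, so it can be
assembled programmatically. The Python sketch below validates each name
against the two formats described above; accepting any single hexadecimal
digit for the function number is an assumption that mirrors the examples in
this section, and the helper name is hypothetical.

```python
import re

# Optional 4-digit PCI domain, then bus:slot.func in hexadecimal.
SLOT = re.compile(
    r"^(?:[0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-9a-fA-F]$")

def pciback_hide_param(slots):
    """Build a pciback.hide= kernel parameter from PCI slot names."""
    for slot in slots:
        if not SLOT.match(slot):
            raise ValueError("bad PCI slot name: %r" % slot)
    return "pciback.hide=" + "".join("(%s)" % slot for slot in slots)

print(pciback_hide_param(["02:01.f", "0000:04:1d.0"]))
```

The resulting string is appended to the kernel command line of the backend
domain in the GRUB configuration.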
1301
1302	\subsubsection{PCI Backend Configuration - Late Binding}
1303	PCI devices can also be bound to the PCI Backend after boot through the manual
1304	binding/unbinding facilities provided by the Linux kernel in sysfs (allowing
1305	for a Xen user to give PCI devices to driver domains that were not specified
on the kernel command-line). There are several attributes within the PCI
Backend's sysfs directory (\path{/sys/bus/pci/drivers/pciback}) that can be
used to bind/unbind devices:
1309
1310	\begin{description}
1311	\item[slots] lists all of the PCI slots that the PCI Backend will try to seize
1312	(or "hide" from Domain 0). A PCI slot must appear in this list before it can
1313	be bound to the PCI Backend through the \path{bind} attribute.
1314	\item[new\_slot] write the name of a slot here (in 0000:00:00.0 format) to
1315	have the PCI Backend seize the device in this slot.
1316	\item[remove\_slot] write the name of a slot here (same format as
1317	\path{new\_slot}) to have the PCI Backend no longer try to seize devices in
1318	this slot. Note that this does not unbind the driver from a device it has
1319	already seized.
1320	\item[bind] write the name of a slot here (in 0000:00:00.0 format) to have
1321	the Linux kernel attempt to bind the device in that slot to the PCI Backend
1322	driver.
\item[unbind] write the name of a slot here (same format as \path{bind}) to have
the Linux kernel unbind the device from the PCI Backend. DO NOT unbind a
device while it is currently given to a PCI driver domain!
1326	\end{description}
1327
1328	Some examples:
1329
1330	Bind a device to the PCI Backend which is not bound to any other driver.
1331	\begin{verbatim}
1332	# # Add a new slot to the PCI Backend's list
1333	# echo -n 0000:01:04.d > /sys/bus/pci/drivers/pciback/new_slot
1334	# # Now that the backend is watching for the slot, bind to it
1335	# echo -n 0000:01:04.d > /sys/bus/pci/drivers/pciback/bind
1336	\end{verbatim}
1337
1338	Unbind a device from its driver and bind to the PCI Backend.
1339	\begin{verbatim}
1340	# # Unbind a PCI network card from its network driver
1341	# echo -n 0000:05:02.0 > /sys/bus/pci/drivers/3c905/unbind
1342	# # And now bind it to the PCI Backend
1343	# echo -n 0000:05:02.0 > /sys/bus/pci/drivers/pciback/new_slot
1344	# echo -n 0000:05:02.0 > /sys/bus/pci/drivers/pciback/bind
1345	\end{verbatim}
1346
1347	Note that the "-n" option in the example is important as it causes echo to not
1348	output a new-line.
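
The same late-binding sequence can be performed from Python, where writing
a string to the attribute file does not append a newline unless one is
included explicitly. This is a sketch under the assumption that the sysfs
attributes accept a plain slot name as shown above; the function name and
the \path{driver\_dir} parameter are introduced only for this example.

```python
import os

PCIBACK_DIR = "/sys/bus/pci/drivers/pciback"

def bind_to_pciback(slot, driver_dir=PCIBACK_DIR):
    """Add slot to the PCI Backend's list, then bind the device to it."""
    for attribute in ("new_slot", "bind"):
        with open(os.path.join(driver_dir, attribute), "w") as handle:
            handle.write(slot)  # no trailing newline, like 'echo -n'

# Example (run as root in the backend domain):
# bind_to_pciback("0000:05:02.0")
```

As in the shell examples, a device that is still bound to another driver
must be unbound from that driver first.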
1349
1350	\subsubsection{PCI Backend Configuration - User-space Quirks}
1351	Quirky devices (such as the Broadcom Tigon 3) may need write access to their
1352	configuration space registers. Xen can be instructed to allow specified PCI
1353	devices write access to specific configuration space registers. The policy may
1354	be found in:
1355
1356	\centerline{ \path{/etc/xen/xend-pci-quirks.sxp} }
1357
1358	The policy file is heavily commented and is intended to provide enough
1359	documentation for developers to extend it.
1360
1361	\subsubsection{PCI Backend Configuration - Permissive Flag}
1362	If the user-space quirks approach doesn't meet your needs you may want to enable
1363	the permissive flag for that device. To do so, first get the PCI domain, bus,
1364	slot, and function information from dom0 via \path{lspci}. Then augment the
1365	user-space policy for permissive devices. The permissive policy can be found
1366	in:
1367
1368	\centerline{ \path{/etc/xen/xend-pci-permissive.sxp} }
1369
1370	Currently, the only way to reset the permissive flag is to unbind the device
1371	from the PCI Backend driver.
1372
1373	\subsubsection{PCI Backend - Checking Status}
There are two important sysfs nodes that provide a mechanism to view specifics on
1375	quirks and permissive devices:
1376	\begin{description}
\item \path{/sys/bus/pci/drivers/pciback/permissive} \\
Use \path{cat} on this file to view a list of permissive slots.
\item \path{/sys/bus/pci/drivers/pciback/quirks} \\
Use \path{cat} on this file to view a hierarchical listing of devices bound to the
PCI backend, their PCI vendor/device ID, and any quirks that are associated with
that particular slot.
1383	\end{description}
1384
You may notice that every device bound to the PCI backend has 17 standard
``quirks'' regardless of \path{xend-pci-quirks.sxp}. These default entries are
necessary to support interactions between the PCI bus manager and the device bound
to it. Even non-quirky devices should have these standard entries.

In this case, preference was given to accuracy over aesthetics by choosing to
show the standard quirks in the quirks list rather than hide them from the
inquiring user.
1393
1394	\subsubsection{PCI Frontend Configuration}
1395	To configure a domU to receive a PCI device:
1396
1397	\begin{description}
1398	\item[Command-line:]
1399	Use the {\em pci} command-line flag. For multiple devices, use the option
1400	multiple times. \\
1401	\centerline{ {\tt xm create netcard-dd pci=01:00.0 pci=02:03.0 }} \\
1402
1403	\item[Flat Format configuration file:]
1404	Specify all of your PCI devices in a python list named {\em pci}. \\
1405	\centerline{ {\tt pci=['01:00.0','02:03.0'] }} \\
1406
1407	\item[SXP Format configuration file:]
1408	Use a single PCI device section for all of your devices (specify the numbers
1409	in hexadecimal with the preceding '0x'). Note that {\em domain} here refers
1410	to the PCI domain, not a virtual machine within Xen.
1411	{\small
1412	\begin{verbatim}
1413	(device (pci
    (dev (domain 0x0)(bus 0x3)(slot 0x1a)(func 0x1))
    (dev (domain 0x0)(bus 0x1)(slot 0x5)(func 0x0))
))
1417	\end{verbatim}
1418	}
1419	\end{description}
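
The flat and SXP notations describe the same devices, so one can be derived
from the other. The following Python sketch converts slot names of the form
{\tt bus:slot.func} or {\tt domain:bus:slot.func} into SXP {\tt dev}
entries; it is an illustration with hypothetical helper names, not part of
the Xen tools, and it assumes PCI domain~0 when no domain is given.

```python
def slot_to_sxp_dev(slot):
    """Convert '[domain:]bus:slot.func' (hex) to an SXP dev entry."""
    address, func = slot.rsplit(".", 1)
    fields = address.split(":")
    if len(fields) == 2:
        fields.insert(0, "0")  # assumption: PCI domain 0 when omitted
    if len(fields) != 3:
        raise ValueError("bad PCI slot name: %r" % slot)
    domain, bus, dev = (int(field, 16) for field in fields)
    return "(dev (domain 0x%x)(bus 0x%x)(slot 0x%x)(func 0x%x))" % (
        domain, bus, dev, int(func, 16))

def pci_device_section(slots):
    """Wrap dev entries for all slots in one (device (pci ...)) section."""
    entries = "\n".join("    " + slot_to_sxp_dev(slot) for slot in slots)
    return "(device (pci\n%s\n))" % entries

print(pci_device_section(["03:1a.1", "01:05.0"]))
```

Generating the section mechanically also avoids unbalanced parentheses,
which are easy to introduce when editing SXP by hand.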
1420
1421	%% There are two possible types of privileges: IO privileges and
1422	%% administration privileges.
1423
1424	\section{Support for virtual Trusted Platform Module (vTPM)}
1425	\label{ss:vtpm}
1426
Paravirtualized domains can be given access to a virtualized version
of a TPM. This enables applications in these domains to use the services
of the TPM device, for example through a TSS
stack\footnote{Trousers TSS stack: http://sourceforge.net/projects/trousers}.
The Xen source repository provides the necessary software components to
enable virtual TPM access. Support is provided through several
different pieces. First, a TPM emulator has been modified to provide the TPM's
functionality for the virtual TPM subsystem. Second, a virtual TPM Manager
coordinates the virtual TPMs' efforts, manages their creation, and provides
protected key storage using the TPM. Third, a device driver pair providing
a TPM front- and backend is available for XenLinux to deliver TPM commands
from the domain to the virtual TPM manager, which dispatches them to a
software TPM. Since the TPM Manager relies on a hardware TPM for protected key
storage, this subsystem requires a Linux-supported hardware TPM.
For development purposes, a TPM emulator is available for use on non-TPM
enabled platforms.
1443
1444	\subsubsection{Compile-Time Setup}
1445	To enable access to the virtual TPM, the virtual TPM backend driver must
1446	be compiled for a privileged domain (e.g. domain 0). Using the XenLinux
1447	configuration, the necessary driver can be selected in the Xen configuration
1448	section. Unless the driver has been compiled into the kernel, its module
1449	must be activated using the following command:
1450
1451	\begin{verbatim}
1452	modprobe tpmbk
1453	\end{verbatim}
1454
1455	Similarly, the TPM frontend driver must be compiled for the kernel trying
1456	to use TPM functionality. Its driver can be selected in the kernel
1457	configuration section Device Driver / Character Devices / TPM Devices.
Along with that, the TPM driver for the built-in TPM must be selected.
If the virtual TPM driver has been compiled as a module, it
must be activated using the following command:
1461
1462	\begin{verbatim}
1463	modprobe tpm_xenu
1464	\end{verbatim}
1465
1466	Furthermore, it is necessary to build the virtual TPM manager and software
1467	TPM by making changes to entries in Xen build configuration files.
1468	The following entry in the file Config.mk in the Xen root source
1469	directory must be made:
1470
1471	\begin{verbatim}
1472	VTPM_TOOLS ?= y
1473	\end{verbatim}
1474
After a build of the Xen tree and a reboot of the machine, the TPM backend
driver must be loaded. Once loaded, the virtual TPM manager daemon
must be started before TPM-enabled guest domains may be launched.
For the machine to act as the destination of a virtual TPM migration, the
virtual TPM migration daemon must also be started.
1480
1481	\begin{verbatim}
1482	vtpm_managerd
1483	\end{verbatim}
1484	\begin{verbatim}
1485	vtpm_migratord
1486	\end{verbatim}
1487
1488	Once the VTPM manager is running, the VTPM can be accessed by loading the
1489	front end driver in a guest domain.
1490
1491	\subsubsection{Development and Testing TPM Emulator}
For development and testing on non-TPM enabled platforms, a TPM emulator
can be used in place of a platform TPM. First, the entry in the file
tools/vtpm/Rules.mk must look as follows:
1495
1496	\begin{verbatim}
1497	BUILD_EMULATOR = y
1498	\end{verbatim}
1499
Second, the entry in the file tools/vtpm\_manager/Rules.mk must be uncommented
as follows:
1502
1503	\begin{verbatim}
1504	# TCS talks to fifo's rather than /dev/tpm. TPM Emulator assumed on fifos
1505	CFLAGS += -DDUMMY_TPM
1506	\end{verbatim}
1507
1508	Before starting the virtual TPM Manager, start the emulator by executing
1509	the following in dom0:
1510
1511	\begin{verbatim}
1512	tpm_emulator clear
1513	\end{verbatim}
1514
1515	\subsubsection{vTPM Frontend Configuration}
To provide TPM functionality to a user domain, a line must be added to
the domain's configuration file using the following format:
1518
1519	\begin{verbatim}
1520	vtpm = ['instance=<instance number>, backend=<domain id>']
1521	\end{verbatim}
1522
The {\it instance number} reflects the preferred virtual TPM instance
to associate with the domain. If the selected instance is
already associated with another domain, the system will automatically
select the next available instance. If an instance number is given, it
must be greater than zero. The instance parameter may also be omitted
from the configuration file.

The {\it domain id} provides the ID of the domain in which the
virtual TPM backend driver and virtual TPM are running. It should
currently always be set to '0'.
1533
1534
1535	Examples for valid vtpm entries in the configuration file are
1536
1537	\begin{verbatim}
1538	vtpm = ['instance=1, backend=0']
1539	\end{verbatim}
1540	and
1541	\begin{verbatim}
vtpm = ['backend=0']
1543	\end{verbatim}
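
As an illustration, a guest configuration file that enables the virtual
TPM might contain the following entries. This is a sketch only: the kernel
path, domain name, memory size and disk are hypothetical placeholders, and
only the {\tt vtpm} line is specific to this section.

\begin{verbatim}
kernel = "/boot/vmlinuz-2.6-xenU"    # hypothetical kernel path
memory = 128
name   = "vtpm-guest"                # hypothetical domain name
disk   = ['phy:hda3,sda1,w']
root   = "/dev/sda1 ro"
vtpm   = ['instance=1, backend=0']   # preferred instance 1, backend in domain 0
\end{verbatim}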
1544
1545	\subsubsection{Using the virtual TPM}
1546
1547	Access to TPM functionality is provided by the virtual TPM frontend driver.
1548	Similar to existing hardware TPM drivers, this driver provides basic TPM
1549	status information through the {\it sysfs} filesystem. In a Xen user domain
1550	the sysfs entries can be found in /sys/devices/xen/vtpm-0.
1551
1552	Commands can be sent to the virtual TPM instance using the character
1553	device /dev/tpm0 (major 10, minor 224).
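
A quick check from inside the guest, assuming the frontend driver has been
loaded, is to look for the sysfs directory and the device node. If the
device node is missing (for instance because the guest does not create
device nodes automatically), it can be created by hand with the major and
minor numbers given above. Which attribute files appear under the sysfs
directory depends on the driver version.

\begin{verbatim}
# ls /sys/devices/xen/vtpm-0
# ls -l /dev/tpm0
# mknod /dev/tpm0 c 10 224     # only if /dev/tpm0 does not exist
\end{verbatim}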
1554
% Chapter Storage and File System Management
1556	\chapter{Storage and File System Management}
1557
1558	Storage can be made available to virtual machines in a number of
1559	different ways. This chapter covers some possible configurations.
1560
1561	The most straightforward method is to export a physical block device (a
1562	hard drive or partition) from dom0 directly to the guest domain as a
1563	virtual block device (VBD).
1564
1565	Storage may also be exported from a filesystem image or a partitioned
1566	filesystem image as a \emph{file-backed VBD}.
1567
1568	Finally, standard network storage protocols such as NBD, iSCSI, NFS,
1569	etc., can be used to provide storage to virtual machines.
1570
1571
1572	\section{Exporting Physical Devices as VBDs}
1573	\label{s:exporting-physical-devices-as-vbds}
1574
1575	One of the simplest configurations is to directly export individual
1576	partitions from domain~0 to other domains. To achieve this use the
1577	\path{phy:} specifier in your domain configuration file. For example a
1578	line like
1579	\begin{quote}
1580	\verb_disk = ['phy:hda3,sda1,w']_
1581	\end{quote}
1582	specifies that the partition \path{/dev/hda3} in domain~0 should be
1583	exported read-write to the new domain as \path{/dev/sda1}; one could
1584	equally well export it as \path{/dev/hda} or \path{/dev/sdb5} should
1585	one wish.
1586
1587	In addition to local disks and partitions, it is possible to export
1588	any device that Linux considers to be ``a disk'' in the same manner.
1589	For example, if you have iSCSI disks or GNBD volumes imported into
1590	domain~0 you can export these to other domains using the \path{phy:}
1591	disk syntax. E.g.:
1592	\begin{quote}
1593	\verb_disk = ['phy:vg/lvm1,sda2,w']_
1594	\end{quote}
1595
1596	\begin{center}
1597	\framebox{\bf Warning: Block device sharing}
1598	\end{center}
1599	\begin{quote}
1600	Block devices should typically only be shared between domains in a
1601	read-only fashion otherwise the Linux kernel's file systems will get
1602	very confused as the file system structure may change underneath
1603	them (having the same ext3 partition mounted \path{rw} twice is a
1604	sure fire way to cause irreparable damage)! \Xend\ will attempt to
1605	prevent you from doing this by checking that the device is not
1606	mounted read-write in domain~0, and hasn't already been exported
1607	read-write to another domain. If you want read-write sharing,
1608	export the directory to other domains via NFS from domain~0 (or use
1609	a cluster file system such as GFS or ocfs2).
1610	\end{quote}
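
As a sketch of the read-only sharing recommended above, the same partition
can be exported to two guests with the mode field set to {\tt r} instead of
{\tt w}. The device names are illustrative only.

\begin{quote}
\begin{verbatim}
# in the configuration file of the first domain
disk = ['phy:hda4,sda2,r']

# in the configuration file of the second domain
disk = ['phy:hda4,sda2,r']
\end{verbatim}
\end{quote}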
1611
1612
1613	\section{Using File-backed VBDs}
1614
1615	It is also possible to use a file in Domain~0 as the primary storage
1616	for a virtual machine. As well as being convenient, this also has the
1617	advantage that the virtual block device will be \emph{sparse} ---
1618	space will only really be allocated as parts of the file are used. So
1619	if a virtual machine uses only half of its disk space then the file
1620	really takes up half of the size allocated.
1621
1622	For example, to create a 2GB sparse file-backed virtual block device
1623	(actually only consumes 1KB of disk):
1624	\begin{quote}
1625	\verb_# dd if=/dev/zero of=vm1disk bs=1k seek=2048k count=1_
1626	\end{quote}
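
The {\tt seek=2048k} argument skips $2048 \times 1024$ blocks of 1KB each,
that is 2GB, before the single 1KB block is written. To confirm that the
file is sparse, compare its apparent size with the space actually
allocated to it; {\tt ls -l} reports the former and {\tt du} the latter.
The exact figures depend on the file system holding the file.

\begin{quote}
\begin{verbatim}
# ls -lh vm1disk
# du -h vm1disk
\end{verbatim}
\end{quote}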
1627
1628	Make a file system in the disk file:
1629	\begin{quote}
1630	\verb_# mkfs -t ext3 vm1disk_
1631	\end{quote}
1632
1633	(when the tool asks for confirmation, answer `y')
1634
1635	Populate the file system e.g.\ by copying from the current root:
1636	\begin{quote}
1637	\begin{verbatim}
1638	# mount -o loop vm1disk /mnt
1639	# cp -ax /{root,dev,var,etc,usr,bin,sbin,lib} /mnt
1640	# mkdir /mnt/{proc,sys,home,tmp}
1641	\end{verbatim}
1642	\end{quote}
1643
Tailor the file system by editing \path{/etc/fstab},
\path{/etc/hostname}, etc. Don't forget to edit the files in the
mounted file system, instead of your domain~0 filesystem, e.g.\ you
would edit \path{/mnt/etc/fstab} instead of \path{/etc/fstab}. For
this example, set \path{/dev/sda1} as the root device in fstab.
1649
1650	Now unmount (this is important!):
1651	\begin{quote}
1652	\verb_# umount /mnt_
1653	\end{quote}
1654
1655	In the configuration file set:
1656	\begin{quote}
1657	\verb_disk = ['tap:aio:/full/path/to/vm1disk,sda1,w']_
1658	\end{quote}
1659
1660	As the virtual machine writes to its `disk', the sparse file will be
1661	filled in and consume more space up to the original 2GB.
1662
1663	{\em{Note:}} Users that have worked with file-backed VBDs on Xen in previous
1664	versions will be interested to know that this support is now provided through
1665	the blktap driver instead of the loopback driver. This change results in
1666	file-based block devices that are higher-performance, more scalable, and which
1667	provide better safety properties for VBD data. All that is required to update
1668	your existing file-backed VM configurations is to change VBD configuration
1669	lines from:
1670	\begin{quote}
1671	\verb_disk = ['file:/full/path/to/vm1disk,sda1,w']_
1672	\end{quote}
1673	to:
1674	\begin{quote}
1675	\verb_disk = ['tap:aio:/full/path/to/vm1disk,sda1,w']_
1676	\end{quote}
1677
1678
1679	\subsection{Loopback-mounted file-backed VBDs (deprecated)}
1680
1681	{\em{{\bf{Note:}} Loopback mounted VBDs have now been replaced with
1682	blktap-based support for raw image files, as described above. This
1683	section remains to detail a configuration that was used by older Xen
1684	versions.}}
1685
Raw image file-backed VBDs may also be attached to VMs using the
Linux loopback driver. The only required change to the raw file
instructions above is to specify the configuration entry as:
1689	\begin{quote}
1690	\verb_disk = ['file:/full/path/to/vm1disk,sda1,w']_
1691	\end{quote}
1692
1693	{\bf Note that loopback file-backed VBDs may not be appropriate for backing
1694	I/O-intensive domains.} This approach is known to experience
1695	substantial slowdowns under heavy I/O workloads, due to the I/O
1696	handling by the loopback block device used to support file-backed VBDs
in dom0. Loopback support remains for old Xen installations, and users
1698	are strongly encouraged to use the blktap-based file support (using
1699	``{\tt{tap:aio}}'' as described above).
1700
1701	Additionally, Linux supports a maximum of eight loopback file-backed
1702	VBDs across all domains by default. This limit can be statically
1703	increased by using the \emph{max\_loop} module parameter if
1704	CONFIG\_BLK\_DEV\_LOOP is compiled as a module in the dom0 kernel, or
1705	by using the \emph{max\_loop=n} boot option if CONFIG\_BLK\_DEV\_LOOP
is compiled directly into the dom0 kernel. Again, users are encouraged
to use the blktap-based file support described above, which scales to a much
larger number of active VBDs.
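
For instance, to allow 64 loopback devices (the number is arbitrary and
chosen for illustration), the module can be loaded with the parameter set.
If the module is already loaded it must be unloaded first, which requires
that no loop devices are in use.

\begin{quote}
\begin{verbatim}
# modprobe loop max_loop=64
\end{verbatim}
\end{quote}

If the loop driver is built into the dom0 kernel, append {\tt max\_loop=64}
to the dom0 kernel command line in the boot loader configuration instead.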
1709
1710
1711	\section{Using LVM-backed VBDs}
1712	\label{s:using-lvm-backed-vbds}
1713
1714	A particularly appealing solution is to use LVM volumes as backing for
1715	domain file-systems since this allows dynamic growing/shrinking of
1716	volumes as well as snapshot and other features.
1717
1718	To initialize a partition to support LVM volumes:
1719	\begin{quote}
1720	\begin{verbatim}
1721	# pvcreate /dev/sda10
1722	\end{verbatim}
1723	\end{quote}
1724
1725	Create a volume group named `vg' on the physical partition:
1726	\begin{quote}
1727	\begin{verbatim}
1728	# vgcreate vg /dev/sda10
1729	\end{verbatim}
1730	\end{quote}
1731
1732	Create a logical volume of size 4GB named `myvmdisk1':
1733	\begin{quote}
1734	\begin{verbatim}
1735	# lvcreate -L4096M -n myvmdisk1 vg
1736	\end{verbatim}
1737	\end{quote}
1738
You should now see that you have a \path{/dev/vg/myvmdisk1} device. Make a
filesystem, mount it and populate it, e.g.:
1741	\begin{quote}
1742	\begin{verbatim}
1743	# mkfs -t ext3 /dev/vg/myvmdisk1
1744	# mount /dev/vg/myvmdisk1 /mnt
1745	# cp -ax / /mnt
1746	# umount /mnt
1747	\end{verbatim}
1748	\end{quote}
1749
1750	Now configure your VM with the following disk configuration:
1751	\begin{quote}
1752	\begin{verbatim}
1753	disk = [ 'phy:vg/myvmdisk1,sda1,w' ]
1754	\end{verbatim}
1755	\end{quote}
1756
1757	LVM enables you to grow the size of logical volumes, but you'll need
1758	to resize the corresponding file system to make use of the new space.
1759	Some file systems (e.g.\ ext3) now support online resize. See the LVM
1760	manuals for more details.
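
A conservative sketch of growing a guest's volume by 1GB, assuming an ext3
file system and that the guest using the volume has been shut down so the
file system is not mounted anywhere:

\begin{quote}
\begin{verbatim}
# lvextend -L+1024M /dev/vg/myvmdisk1
# e2fsck -f /dev/vg/myvmdisk1
# resize2fs /dev/vg/myvmdisk1
\end{verbatim}
\end{quote}

Run without a size argument, {\tt resize2fs} grows the file system to fill
the volume.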
1761
1762	You can also use LVM for creating copy-on-write (CoW) clones of LVM
1763	volumes (known as writable persistent snapshots in LVM terminology).
1764	This facility is new in Linux 2.6.8, so isn't as stable as one might
1765	hope. In particular, using lots of CoW LVM disks consumes a lot of
1766	dom0 memory, and error conditions such as running out of disk space
1767	are not handled well. Hopefully this will improve in future.
1768
1769	To create two copy-on-write clones of the above file system you would
1770	use the following commands:
1771
1772	\begin{quote}
1773	\begin{verbatim}
1774	# lvcreate -s -L1024M -n myclonedisk1 /dev/vg/myvmdisk1
1775	# lvcreate -s -L1024M -n myclonedisk2 /dev/vg/myvmdisk1
1776	\end{verbatim}
1777	\end{quote}
1778
1779	Each of these can grow to have 1GB of differences from the master
1780	volume. You can grow the amount of space for storing the differences
1781	using the lvextend command, e.g.:
1782	\begin{quote}
1783	\begin{verbatim}
# lvextend -L+100M /dev/vg/myclonedisk1
1785	\end{verbatim}
1786	\end{quote}
1787
Don't let the `differences volume' ever fill up, otherwise LVM gets
rather confused. It may be possible to automate the growing process by
using \path{dmsetup wait} to spot the volume getting full and then
issuing an \path{lvextend}.
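
The following shell script is one rough way to approximate this. It is a
sketch, not a tested tool: rather than \path{dmsetup wait} it simply polls
the {\tt snap\_percent} field reported by {\tt lvs} once a minute, and the
80\% threshold, 100MB increment and polling interval are arbitrary
assumptions.

\begin{quote}
\begin{verbatim}
#!/bin/sh
# Sketch: extend a snapshot volume before it fills up.
LV=/dev/vg/myclonedisk1
while sleep 60; do
    # snap_percent prints e.g. "42.17"; keep the integer part
    used=$(lvs --noheadings -o snap_percent "$LV" | tr -d ' ')
    used=${used%%.*}
    if [ "${used:-0}" -ge 80 ]; then
        lvextend -L+100M "$LV"
    fi
done
\end{verbatim}
\end{quote}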
1792
1793	In principle, it is possible to continue writing to the volume that
1794	has been cloned (the changes will not be visible to the clones), but
1795	we wouldn't recommend this: have the cloned volume as a `pristine'
1796	file system install that isn't mounted directly by any of the virtual
1797	machines.
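
With the clones created above, each guest is then given its own clone as
its disk, leaving \path{/dev/vg/myvmdisk1} untouched. A sketch of the
corresponding entries:

\begin{quote}
\begin{verbatim}
# in the configuration file of the first domain
disk = [ 'phy:vg/myclonedisk1,sda1,w' ]

# in the configuration file of the second domain
disk = [ 'phy:vg/myclonedisk2,sda1,w' ]
\end{verbatim}
\end{quote}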
1798
1799
1800	\section{Using NFS Root}
1801
1802	First, populate a root filesystem in a directory on the server
1803	machine. This can be on a distinct physical machine, or simply run
1804	within a virtual machine on the same node.
1805
1806	Now configure the NFS server to export this filesystem over the
1807	network by adding a line to \path{/etc/exports}, for instance:
1808
1809	\begin{quote}
1810	\begin{small}
1811	\begin{verbatim}
/export/vm1root 1.2.3.4/24(rw,sync,no_root_squash)
1813	\end{verbatim}
1814	\end{small}
1815	\end{quote}
1816
1817	Finally, configure the domain to use NFS root. In addition to the
1818	normal variables, you should make sure to set the following values in
1819	the domain's configuration file:
1820
1821	\begin{quote}
1822	\begin{small}
1823	\begin{verbatim}
1824	root = '/dev/nfs'
1825	nfs_server = '2.3.4.5' # substitute IP address of server
1826	nfs_root = '/path/to/root' # path to root FS on the server
1827	\end{verbatim}
1828	\end{small}
1829	\end{quote}
1830
1831	The domain will need network access at boot time, so either statically
1832	configure an IP address using the config variables \path{ip},
1833	\path{netmask}, \path{gateway}, \path{hostname}; or enable DHCP
1834	(\path{dhcp='dhcp'}).
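
A sketch of a statically configured NFS-root domain follows. All addresses
and the host name are placeholders chosen for illustration; the value of
{\tt nfs\_root} should be the directory exported by the server.

\begin{quote}
\begin{small}
\begin{verbatim}
root       = '/dev/nfs'
nfs_server = '2.3.4.5'          # substitute IP address of server
nfs_root   = '/export/vm1root'  # directory exported by the server
ip         = '2.3.4.10'         # placeholder address for the domain
netmask    = '255.255.255.0'
gateway    = '2.3.4.1'
hostname   = 'vm1'
\end{verbatim}
\end{small}
\end{quote}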
1835
1836	Note that the Linux NFS root implementation is known to have stability
1837	problems under high load (this is not a Xen-specific problem), so this
1838	configuration may not be appropriate for critical servers.
1839
1840
1841	\chapter{CPU Management}
1842
1843	%% KMS Something sage about CPU / processor management.
1844
1845	Xen allows a domain's virtual CPU(s) to be associated with one or more
1846	host CPUs. This can be used to allocate real resources among one or
1847	more guests, or to make optimal use of processor resources when
1848	utilizing dual-core, hyperthreading, or other advanced CPU technologies.
1849
1850	Xen enumerates physical CPUs in a `depth first' fashion. For a system
1851	with both hyperthreading and multiple cores, this would be all the
1852	hyperthreads on a given core, then all the cores on a given socket,
and then all sockets. For example, on a two-socket, dual-core,
hyperthreaded Xeon system the CPU order would be:
1855
1856
1857	\begin{center}
\begin{tabular}{l|l|l|l|l|l|l|r}
\multicolumn{4}{c|}{socket0} & \multicolumn{4}{c}{socket1} \\ \hline
\multicolumn{2}{c|}{core0} & \multicolumn{2}{c|}{core1} &
\multicolumn{2}{c|}{core0} & \multicolumn{2}{c}{core1} \\ \hline
1862	ht0 & ht1 & ht0 & ht1 & ht0 & ht1 & ht0 & ht1 \\
1863	\#0 & \#1 & \#2 & \#3 & \#4 & \#5 & \#6 & \#7 \\
1864	\end{tabular}
1865	\end{center}
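
Put differently, and assuming every socket has the same number of cores
and every core the same number of hyperthreads, write $C$ for the number
of cores per socket and $T$ for the number of hyperthreads per core
(symbols introduced here for illustration). The CPU number of thread $t$
on core $c$ of socket $s$ is then
\[
  \mathrm{cpu}(s, c, t) = (s \cdot C + c) \cdot T + t .
\]
In the example above $C = T = 2$, so the second hyperthread of the first
core on the second socket is CPU $(1 \cdot 2 + 0) \cdot 2 + 1 = 5$, which
matches the table.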
1866
1867
1868	Having multiple vcpus belonging to the same domain mapped to the same
1869	physical CPU is very likely to lead to poor performance. It's better to
use `vcpu-set' to hot-unplug one of the vcpus and ensure the others are
1871	pinned on different CPUs.
1872
If you are running I/O-intensive tasks, it's typically better to dedicate
either a hyperthread or a whole core to running domain 0, and hence pin
other domains so that they can't use CPU 0. If your workload is mostly
compute-intensive, you may want to pin vcpus such that all physical CPU
threads are available for guest domains.
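
As an illustration, for a hypothetical two-vcpu guest named {\tt mydomain}
on the eight-CPU machine above, the following commands keep the guest off
CPU~0 and place its vcpus on different cores. The arguments to
{\tt xm vcpu-pin} are the domain, the vcpu number and the physical CPU(s).

\begin{verbatim}
# xm vcpu-list mydomain
# xm vcpu-pin mydomain 0 2
# xm vcpu-pin mydomain 1 4
\end{verbatim}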
1878
1879	\chapter{Migrating Domains}
1880
1881	\section{Domain Save and Restore}
1882
1883	The administrator of a Xen system may suspend a virtual machine's
1884	current state into a disk file in domain~0, allowing it to be resumed at
1885	a later time.
1886
1887	For example you can suspend a domain called ``VM1'' to disk using the
1888	command:
1889	\begin{verbatim}
1890	# xm save VM1 VM1.chk
1891	\end{verbatim}
1892
1893	This will stop the domain named ``VM1'' and save its current state
1894	into a file called \path{VM1.chk}.
1895
1896	To resume execution of this domain, use the \path{xm restore} command:
1897	\begin{verbatim}
1898	# xm restore VM1.chk
1899	\end{verbatim}
1900
1901	This will restore the state of the domain and resume its execution.
1902	The domain will carry on as before and the console may be reconnected
1903	using the \path{xm console} command, as described earlier.
1904
1905	\section{Migration and Live Migration}
1906
1907	Migration is used to transfer a domain between physical hosts. There
1908	are two varieties: regular and live migration. The former moves a
1909	virtual machine from one host to another by pausing it, copying its
1910	memory contents, and then resuming it on the destination. The latter
1911	performs the same logical functionality but without needing to pause
1912	the domain for the duration. In general when performing live migration
1913	the domain continues its usual activities and---from the user's
1914	perspective---the migration should be imperceptible.
1915
1916	To perform a live migration, both hosts must be running Xen / \xend\ and
1917	the destination host must have sufficient resources (e.g.\ memory
1918	capacity) to accommodate the domain after the move. Furthermore we
1919	currently require both source and destination machines to be on the same
1920	L2 subnet.
1921
1922	Currently, there is no support for providing automatic remote access
1923	to filesystems stored on local disk when a domain is migrated.
Administrators should choose an appropriate storage solution (e.g.\
1925	SAN, NAS, etc.) to ensure that domain filesystems are also available
1926	on their destination node. GNBD is a good method for exporting a
1927	volume from one machine to another. iSCSI can do a similar job, but is
1928	more complex to set up.
1929
When a domain migrates, its MAC and IP addresses move with it, thus it is
1931	only possible to migrate VMs within the same layer-2 network and IP
1932	subnet. If the destination node is on a different subnet, the
1933	administrator would need to manually configure a suitable etherip or IP
1934	tunnel in the domain~0 of the remote node.
1935
1936	A domain may be migrated using the \path{xm migrate} command. To live
1937	migrate a domain to another machine, we would use the command:
1938
1939	\begin{verbatim}
1940	# xm migrate --live mydomain destination.ournetwork.com
1941	\end{verbatim}
1942
1943	Without the \path{--live} flag, \xend\ simply stops the domain and
1944	copies the memory image over to the new node and restarts it. Since
1945	domains can have large allocations this can be quite time consuming,
1946	even on a Gigabit network. With the \path{--live} flag \xend\ attempts
1947	to keep the domain running while the migration is in progress, resulting
1948	in typical down times of just 60--300ms.
1949
1950	For now it will be necessary to reconnect to the domain's console on the
1951	new machine using the \path{xm console} command. If a migrated domain
1952	has any open network connections then they will be preserved, so SSH
1953	connections do not have this limitation.
1954
1955
1956	%% Chapter Securing Xen
1957	\chapter{Securing Xen}
1958
1959	This chapter describes how to secure a Xen system. It describes a number
1960	of scenarios and provides a corresponding set of best practices. It
1961	begins with a section devoted to understanding the security implications
1962	of a Xen system.
1963
1964
1965	\section{Xen Security Considerations}
1966
1967	When deploying a Xen system, one must be sure to secure the management
1968	domain (Domain-0) as much as possible. If the management domain is
1969	compromised, all other domains are also vulnerable. The following are a
1970	set of best practices for Domain-0:
1971
1972	\begin{enumerate}
\item \textbf{Run the smallest number of necessary services.} The fewer
1974	things that are present in a management partition, the better.
1975	Remember, a service running as root in the management domain has full
1976	access to all other domains on the system.
1977	\item \textbf{Use a firewall to restrict the traffic to the management
1978	domain.} A firewall with default-reject rules will help prevent
1979	attacks on the management domain.
1980	\item \textbf{Do not allow users to access Domain-0.} The Linux kernel
1981	has been known to have local-user root exploits. If you allow normal
1982	users to access Domain-0 (even as unprivileged users) you run the risk
1983	of a kernel exploit making all of your domains vulnerable.
1984	\end{enumerate}
1985
1986	\section{Driver Domain Security Considerations}
1987	\label{s:ddsecurity}
1988
1989	Driver domains address a range of security problems that exist regarding
1990	the use of device drivers and hardware. On many operating systems in common
1991	use today, device drivers run within the kernel with the same privileges as
1992	the kernel. Few or no mechanisms exist to protect the integrity of the kernel
from a misbehaving (read ``buggy'') or malicious device driver. Driver
1994	domains exist to aid in isolating a device driver within its own virtual
1995	machine where it cannot affect the stability and integrity of other
1996	domains. If a driver crashes, the driver domain can be restarted rather than
1997	have the entire machine crash (and restart) with it. Drivers written by
1998	unknown or untrusted third-parties can be confined to an isolated space.
1999	Driver domains thus address a number of security and stability issues with
2000	device drivers.
2001
2002	However, due to limitations in current hardware, a number of security
2003	concerns remain that need to be considered when setting up driver domains (it
2004	should be noted that the following list is not intended to be exhaustive).
2005
2006	\begin{enumerate}
2007	\item \textbf{Without an IOMMU, a hardware device can DMA to memory regions
2008	outside of its controlling domain.} Architectures which do not have an
2009	IOMMU (e.g. most x86-based platforms) to restrict DMA usage by hardware
2010	are vulnerable. A hardware device which can perform arbitrary memory reads
2011	and writes can read/write outside of the memory of its controlling domain.
2012	A malicious or misbehaving domain could use a hardware device it controls
2013	to send data overwriting memory in another domain or to read arbitrary
2014	regions of memory in another domain.
2015	\item \textbf{Shared buses are vulnerable to sniffing.} Devices that share
a data bus can sniff (and possibly spoof) each other's data. Device A that
2017	is assigned to Domain A could eavesdrop on data being transmitted by
2018	Domain B to Device B and then relay that data back to Domain A.
2019	\item \textbf{Devices which share interrupt lines can either prevent the
2020	reception of that interrupt by the driver domain or can trigger the
interrupt service routine of that guest needlessly.} A device which shares
a level-triggered interrupt (e.g. PCI devices) with another device can
raise an interrupt and never clear it. This effectively blocks other devices
which share that interrupt line from notifying their controlling driver
domains that they need to be serviced. A device which shares
any type of interrupt line can trigger its interrupt continually, which
2027	forces execution time to be spent (in multiple guests) in the interrupt
2028	service routine (potentially denying time to other processes within that
2029	guest). System architectures which allow each device to have its own
2030	interrupt line (e.g. PCI's Message Signaled Interrupts) are less
2031	vulnerable to this denial-of-service problem.
2032	\item \textbf{Devices may share the use of I/O memory address space.} Xen can
2033	only restrict access to a device's physical I/O resources at a certain
2034	granularity. For interrupt lines and I/O port address space, that
2035	granularity is very fine (per interrupt line and per I/O port). However,
2036	Xen can only restrict access to I/O memory address space on a page size
2037	basis. If more than one device shares use of a page in I/O memory address
2038	space, the domains to which those devices are assigned will be able to
2039	access the I/O memory address space of each other's devices.
2040	\end{enumerate}
2041
2042
2043	\section{Security Scenarios}
2044
2045
2046	\subsection{The Isolated Management Network}
2047
2048	In this scenario, each node has two network cards in the cluster. One
2049	network card is connected to the outside world and one network card is a
2050	physically isolated management network specifically for Xen instances to
2051	use.
2052
2053	As long as all of the management partitions are trusted equally, this is
2054	the most secure scenario. No additional configuration is needed other
2055	than forcing Xend to bind to the management interface for relocation.
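
A sketch of the relevant entries in the \xend\ configuration file
(typically \path{/etc/xen/xend-config.sxp}) is shown below. The address
{\tt 10.0.0.1} stands for this node's address on the management network
and is an assumption for illustration, as is the pattern given for
{\tt xend-relocation-hosts-allow}; check the comments in the configuration
file shipped with your version for the options it supports.

\begin{verbatim}
(xend-relocation-server yes)
(xend-relocation-port 8002)
(xend-relocation-address '10.0.0.1')
(xend-relocation-hosts-allow '^10\\.0\\.0\\.[0-9]+$')
\end{verbatim}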
2056
2057
2058	\subsection{A Subnet Behind a Firewall}
2059
2060	In this scenario, each node has only one network card but the entire
2061	cluster sits behind a firewall. This firewall should do at least the
2062	following:
2063
2064	\begin{enumerate}
2065	\item Prevent IP spoofing from outside of the subnet.
2066	\item Prevent access to the relocation port of any of the nodes in the
2067	cluster except from within the cluster.
2068	\end{enumerate}
2069
2070	The following iptables rules can be used on each node to prevent
2071	migrations to that node from outside the subnet assuming the main
2072	firewall does not do this for you:
2073
2074	\begin{verbatim}
2075	# this command disables all access to the Xen relocation
2076	# port:
2077	iptables -A INPUT -p tcp --destination-port 8002 -j REJECT
2078
2079	# this command enables Xen relocations only from the specific
2080	# subnet:
iptables -I INPUT -p tcp --source 192.168.1.0/24 \
2082	--destination-port 8002 -j ACCEPT
2083	\end{verbatim}
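
Because the second rule is added with {\tt -I}, it is inserted at the head
of the {\tt INPUT} chain and is therefore evaluated before the
{\tt REJECT} rule appended with {\tt -A}. The resulting order can be
checked by listing the chain:

\begin{verbatim}
iptables -L INPUT -n --line-numbers
\end{verbatim}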
2084
2085	\subsection{Nodes on an Untrusted Subnet}
2086
2087	Migration on an untrusted subnet is not safe in current versions of Xen.
It may be possible to perform migrations through a secure tunnel via a
2089	VPN or SSH. The only safe option in the absence of a secure tunnel is to
2090	disable migration completely. The easiest way to do this is with
2091	iptables:
2092
2093	\begin{verbatim}
2094	# this command disables all access to the Xen relocation port
iptables -A INPUT -p tcp --destination-port 8002 -j REJECT
2096	\end{verbatim}
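
Alternatively, assuming your version of \xend\ supports the
{\tt xend-relocation-server} option, the relocation server can be switched
off in the \xend\ configuration file so that \xend\ does not listen for
relocation requests at all (\xend\ must be restarted for the change to
take effect):

\begin{verbatim}
(xend-relocation-server no)
\end{verbatim}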
2097
2098	%% Chapter Xen Mandatory Access Control Framework
2099	\chapter{sHype/Xen Access Control}
2100
2101	The Xen mandatory access control framework is an implementation of the
2102	sHype Hypervisor Security Architecture
2103	(www.research.ibm.com/ssd\_shype). It permits or denies communication
2104	and resource access of domains based on a security policy. The
2105	mandatory access controls are enforced in addition to the Xen core
2106	controls, such as memory protection. They are designed to remain
2107	transparent during normal operation of domains (policy-conform
2108	behavior) but to intervene when domains move outside their intended
sharing behavior. This chapter will describe how the sHype access
controls in Xen can be configured to prevent viruses from spilling
over from one workload type into another and secrets from leaking from
one workload type to another. sHype/Xen depends on the correct
behavior of Domain0 (cf.\ previous chapter).
2114
2115	Benefits of configuring sHype/ACM in Xen include:
2116	\begin{itemize}
2117	\item robust workload and resource protection effective against rogue
2118	user domains
2119	\item simple, platform- and operating system-independent security
2120	policies (ideal for heterogeneous distributed environments)
2121	\item safety net with minimal performance overhead in case operating
2122	system security is missing, does not scale, or fails
2123	\end{itemize}
2124
These benefits are very valuable because today's operating systems
are becoming increasingly complex and often have no or insufficient
mandatory access controls. (Discretionary access controls, supported
by most operating systems, are not effective against viruses or
misbehaving programs.) Where mandatory access controls exist (e.g.,
SELinux), they usually deploy complex and difficult-to-understand
2131	security policies. Additionally, multi-tier applications in business
2132	environments usually require different types of operating systems
2133	(e.g., AIX, Windows, Linux) which cannot be configured with compatible
2134	security policies. Related distributed transactions and workloads
2135	cannot be easily protected on the OS level. The Xen access control
2136	framework steps in to offer a coarse-grained but very robust security
2137	layer and safety net in case operating system security fails or is
2138	missing.
2139
2140	To control sharing between domains, Xen mediates all inter-domain
2141	communication (shared memory, events) as well as the access of domains
2142	to resources such as disks. Thus, Xen can confine distributed
2143	workloads (domain payloads) by permitting sharing among domains
2144	running the same type of workload and denying sharing between pairs of
2145	domains that run different workload types. We assume that--from a Xen
2146	perspective--only one workload type is running per user domain. To
2147	enable Xen to associate domains and resources with workload types,
2148	security labels including the workload types are attached to domains
2149	and resources. These labels and the hypervisor sHype controls cannot
2150	be manipulated or bypassed and are effective even against rogue
2151	domains.
2152
2153	\section{Overview}
2154	This section gives an overview of how workloads can be protected using
2155	the sHype mandatory access control framework in Xen.
2156	Figure~\ref{fig:acmoverview} shows the necessary steps in activating
2157	the Xen workload protection. These steps are described in detail in
2158	Section~\ref{section:acmexample}.
2159
2160	\begin{figure}
2161	\centering
2162	\includegraphics[width=13cm]{figs/acm_overview.eps}
2163	\caption{Overview of activating sHype workload protection in Xen.
2164	Section numbers point to representative examples.}
2165	\label{fig:acmoverview}
2166	\end{figure}
2167
2168	First, the sHype/ACM access control must be enabled in the Xen
2169	distribution and the distribution must be built and installed (cf
2170	Subsection~\ref{subsection:acmexampleconfigure}). Before we can
2171	enforce security, a Xen security policy must be created (cf
2172	Subsection~\ref{subsection:acmexamplecreate}) and deployed (cf
2173	Subsection~\ref{subsection:acmexampleinstall}). This policy defines
2174	the workload types differentiated during access control. It also
2175	defines the rules that compare workload types of domains and resources
2176	to provide access decisions. Workload types are represented by
2177	security labels that can be attached to domains and resources (cf
2178	Subsections~\ref{subsection:acmexamplelabeldomains}
2179	and~\ref{subsection:acmexamplelabelresources}). The functioning of
2180	the active sHype/Xen workload protection is demonstrated using simple
resource assignment and domain creation tests in
2182	Subsection~\ref{subsection:acmexampletest}.
2183	Section~\ref{section:acmpolicy} describes the syntax and semantics of
2184	the sHype/Xen security policy in detail and briefly introduces the
2185	tools that are available to help create valid security policies.
2186
2187	The next section describes all the necessary steps to create, deploy,
2188	and test a simple workload protection policy. It is meant to enable
2189	anybody to quickly try out the sHype/Xen workload protection. Those
2190	readers who are interested in learning more about how the sHype access
2191	control in Xen works and how it is configured using the XML security
2192	policy should read Section~\ref{section:acmpolicy} as well.
2193	Section~\ref{section:acmlimitations} concludes this chapter with
2194	current limitations of the sHype implementation for Xen.
2195
2196	\section{Xen Workload Protection Step-by-Step}
2197	\label{section:acmexample}
2198
2199	What you are about to do consists of the following sequence:
2200	\begin{itemize}
2201	\item configure and install sHype/Xen
2202	\item create a simple workload protection security policy
2203	\item deploy the sHype/Xen security policy
2204	\item associate domains and resources with workload labels
2205	\item test the workload protection
2206	\end{itemize}
2207	The essential commands to create and deploy a sHype/Xen security
2208	policy are numbered throughout the following sections. If you want a
2209	quick guide, or return later to run through this demonstration
2210	again, simply look for the numbered commands and apply them in
2211	order.
2212
2213	\subsection{Configuring/Building sHype Support into Xen}
2214	\label{subsection:acmexampleconfigure}
2215	First, we need to configure the access control module in Xen and
2216	install the ACM-enabled Xen hypervisor. This step installs security
2217	tools and compiles sHype/ACM controls into the Xen hypervisor.
2218
2219	To enable sHype/ACM in Xen, please edit the Config.mk file in the top
2220	Xen directory.
2221
2222	\begin{verbatim}
2223	(1) In Config.mk
2224	Change: ACM_SECURITY ?= n
2225	To: ACM_SECURITY ?= y
2226	\end{verbatim}
2227
2228	Then install the security-enabled Xen environment as follows:
2229
2230	\begin{verbatim}
2231	(2) # make world
2232	# make install
2233	\end{verbatim}
2234
2235	\subsection{Creating A WLP Policy in 3 Simple Steps with ezPolicy}
2236	\label{subsection:acmexamplecreate}
2237
2238	We will use the ezPolicy tool to quickly create a policy that protects
2239	workloads. You will need both the Python and wxPython packages to run
2240	this tool. To run the tool in Domain0, you can download the wxPython
2241	package from www.wxpython.org or use the command
2242	\verb|yum install wxPython| in Red Hat/Fedora. To run the tool on MS
2243	Windows, you also need to download the Python package from
2244	www.python.org. After these packages are installed, start the ezPolicy
2245	tool with the following command:
2246
2247	\begin{verbatim}
2248	(3) # xensec_ezpolicy
2249	\end{verbatim}
2250
2251	Figure~\ref{fig:acmezpolicy} shows a screen-shot of the tool. The
2252	following steps show you how to create the policy shown in
2253	Figure~\ref{fig:acmezpolicy}. You can use \verb|<CTRL>-h| to pop up a
2254	help window at any time. The indicators (a), (b), and (c) in
2255	Figure~\ref{fig:acmezpolicy} show the buttons that are used during the
2256	3 steps of creating a policy:
2257	\begin{enumerate}
2258	\item defining workloads
2259	\item defining run-time conflicts
2260	\item translating the workload definition into a sHype/Xen access
2261	control policy
2262	\end{enumerate}
2263
2264	\paragraph{Defining workloads.} Workloads are defined for each
2265	organization and department that you enter in the left panel. Please
2266	use the ``New Org'' button (a) to create the organizations ``Avis'',
2267	``Hertz'', ``CocaCola'', and ``PepsiCo''.
2268
2269	You can refine an organization to differentiate between multiple
2270	department workloads by right-clicking the organization and selecting
2271	\verb|Add Department| (or selecting an organization and pressing
2272	\verb|<CTRL>-a|). Create department workloads ``Intranet'',
2273	``Extranet'', ``HumanResources'', and ``Payroll'' for the ``CocaCola''
2274	organization and department workloads ``Intranet'' and ``Extranet''
2275	for the ``PepsiCo'' organization. The resulting layout of the tool
2276	should be similar to the left panel shown in
2277	Figure~\ref{fig:acmezpolicy}.
2278
2279	\paragraph{Defining run-time conflicts.} Workloads that shall be
2280	prohibited from running concurrently on the same hypervisor platform
2281	are grouped into ``Run-time Exclusion rules'' on the right panel of
2282	the window.
2283
2284	To prevent PepsiCo and CocaCola workloads (including their
2285	departmental workloads) from running simultaneously on the same
2286	hypervisor system, select the organization ``PepsiCo'' and, while
2287	pressing the \verb|<CTRL>|-key, select the organization ``CocaCola''.
2288	Now press the button (b) named ``Create run-time exclusion rule from
2289	selection''. A popup window will ask for the name for this run-time
2290	exclusion rule (enter a name or just hit \verb|<ENTER>|). A rule will
2291	appear on the right panel. The name is used as reference only and does
2292	not affect the hypervisor policy.
2293
2294	Repeat the process to create a run-time exclusion rule just for the
2295	department workloads CocaCola.Extranet and CocaCola.Payroll.
2296
2297	\begin{figure}[htb]
2298	\centering
2299	\includegraphics[width=13cm]{figs/acm_ezpolicy.eps}
2300	\caption{Final layout including workload definition and Run-time Exclusion rules.}
2301	\label{fig:acmezpolicy}
2302	\end{figure}
2303
2304	The resulting layout of your window should be similar to
2305	Figure~\ref{fig:acmezpolicy}. Save this workload definition by
2306	selecting ``Save Workload Definition as ...'' in the ``File'' menu
2307	(c). This workload definition can be later refined if required.
2308
2309	\paragraph{Translating the workload definition into a sHype/Xen access
2310	control policy.} To translate the workload definition into an access
2311	control policy understood by Xen, please select ``Save as Xen ACM
2312	Security Policy'' in the ``File'' menu (c). Enter the following policy
2313	name in the popup window: \verb|example.chwall_ste.test-wld|. If you
2314	are running ezPolicy in Domain0, the resulting policy file
2315	test-wld-security\_policy.xml will automatically be placed into the
2316	right directory (/etc/xen/acm-security/policies/example/chwall\_ste).
2317	If you run the tool on another system, then you need to copy the
2318	resulting policy file into Domain0 before continuing. See
2319	Section~\ref{subsection:acmnaming} for naming conventions of security
2320	policies.
2321
2322	\subsection{Deploying a WLP Policy}
2323	\label{subsection:acmexampleinstall}
2324	To deploy the workload protection policy we created in
2325	Section~\ref{subsection:acmexamplecreate}, we create a policy
2326	representation (test-wld.bin) that can be loaded into the Xen
2327	hypervisor and we configure Xen to actually load this policy at
2328	startup time.
2329
2330	The following command translates the source policy representation
2331	into a format that can be loaded into Xen with sHype/ACM support.
2332	Refer to the \verb|xm| man page for further details:
2333
2334	\begin{verbatim}
2335	(4) # xm makepolicy example.chwall_ste.test-wld
2336	\end{verbatim}
2337
2338	The easiest way to install a security policy for Xen is to include the
2339	policy in the boot sequence. The following command does just this:
2340
2341	\begin{verbatim}
2342	(5) # xm cfgbootpolicy example.chwall_ste.test-wld
2343	\end{verbatim}
2344
2345	\textit{Alternatively, if this command fails} (e.g., because it cannot
2346	identify the Xen boot entry), you can manually install the policy in 2
2347	steps. First, manually copy the policy binary file into the boot
2348	directory:
2349
2350	\begin{scriptsize}
2351	\begin{verbatim}
2352	# cp /etc/xen/acm-security/policies/example/chwall_ste/test-wld.bin \
2353	/boot/example.chwall_ste.test-wld.bin
2354	\end{verbatim}
2355	\end{scriptsize}
2356
2357	Second, manually add a module line to your Xen boot entry so that grub
2358	loads this policy file during startup:
2359
2360	\begin{scriptsize}
2361	\begin{verbatim}
2362	title Xen (2.6.16.13)
2363	root (hd0,0)
2364	kernel /xen.gz dom0_mem=2000000 console=vga
2365	module /vmlinuz-2.6.16.13-xen ro root=/dev/hda3
2366	module /initrd-2.6.16.13-xen.img
2367	module /example.chwall_ste.test-wld.bin
2368	\end{verbatim}
2369	\end{scriptsize}
2370
2371	Now reboot into this Xen boot entry to activate the policy and the
2372	security-enabled Xen hypervisor.
2373
2374	\begin{verbatim}
2375	(6) # reboot
2376	\end{verbatim}
2377
2378	After reboot, check if security is enabled:
2379
2380	\begin{scriptsize}
2381	\begin{verbatim}
2382	# xm list --label
2383	Name ID Mem(MiB) VCPUs State Time(s) Label
2384	Domain-0 0 1949 4 r----- 163.9 SystemManagement
2385	\end{verbatim}
2386	\end{scriptsize}
2387
2388	If the security label at the end of the line says ``INACTIV'' then the
2389	security is not enabled. Verify the previous steps. Note: Domain0 is
2390	assigned a default label (see the \verb|bootstrap| policy attribute
2391	explained in Section~\ref{section:acmpolicy}). All other domains must
2392	be labeled in order to start on this sHype/ACM-enabled Xen hypervisor
2393	(see following sections for labeling domains and resources).
2394
2395	\subsection{Labeling Domains}
2396	\label{subsection:acmexamplelabeldomains}
2397	You should have a Xen domain configuration file that looks like the
2398	following (Note: www.jailtime.org or www.xen-get.org might be good
2399	places to look for example domU images). The following configuration
2400	file defines \verb|domain1|:
2401
2402	\begin{scriptsize}
2403	\begin{verbatim}
2404	# cat domain1.xm
2405	kernel = "/boot/vmlinuz-2.6.16.13-xen"
2406	memory = 128
2407	name = "domain1"
2408	vif = [ '' ]
2409	dhcp = "dhcp"
2410	disk = ['file:/home/xen/dom_fc5/fedora.fc5.img,sda1,w', \
2411	'file:/home/xen/dom_fc5/fedora.swap,sda2,w']
2412	root = "/dev/sda1 ro"
2413	\end{verbatim}
2414	\end{scriptsize}
2415
2416	If you try to start domain1, you will get the following error:
2417
2418	\begin{scriptsize}
2419	\begin{verbatim}
2420	# xm create domain1.xm
2421	Using config file "domain1.xm".
2422	domain1: DENIED
2423	--> Domain not labeled
2424	Checking resources: (skipped)
2425	Security configuration prevents domain from starting
2426	\end{verbatim}
2427	\end{scriptsize}
2428
2429	Every domain must be associated with a security label before it can
2430	start on sHype/Xen. Otherwise, sHype/Xen would not be able to enforce
2431	the policy consistently. The following command prints all domain
2432	labels available in the active policy:
2433
2434	\begin{scriptsize}
2435	\begin{verbatim}
2436	# xm labels type=dom
2437	Avis
2438	CocaCola
2439	CocaCola.Extranet
2440	CocaCola.HumanResources
2441	CocaCola.Intranet
2442	CocaCola.Payroll
2443	Hertz
2444	PepsiCo
2445	PepsiCo.Extranet
2446	PepsiCo.Intranet
2447	SystemManagement
2448	\end{verbatim}
2449	\end{scriptsize}
2450
2451	Now label domain1 with the CocaCola label and another domain2 with the
2452	PepsiCo.Extranet label. Please refer to the xm man page for further
2453	information.
2454
2455	\begin{verbatim}
2456	(7) # xm addlabel CocaCola dom domain1.xm
2457	# xm addlabel PepsiCo.Extranet dom domain2.xm
2458	\end{verbatim}
2459
2460	Let us try to start the domain again:
2461
2462	\begin{scriptsize}
2463	\begin{verbatim}
2464	# xm create domain1.xm
2465	Using config file "domain1.xm".
2466	file:/home/xen/dom_fc5/fedora.fc5.img: DENIED
2467	--> res:__NULL_LABEL__ (NULL)
2468	--> dom:CocaCola (example.chwall_ste.test-wld)
2469	file:/home/xen/dom_fc5/fedora.swap: DENIED
2470	--> res:__NULL_LABEL__ (NULL)
2471	--> dom:CocaCola (example.chwall_ste.test-wld)
2472	Security configuration prevents domain from starting
2473	\end{verbatim}
2474	\end{scriptsize}
2475
2476	This error indicates that domain1, if started, would not be able to
2477	access its image and swap files because they are not labeled. This
2478	makes sense because to confine workloads, access of domains to
2479	resources must be controlled. Otherwise, domains that are not allowed
2480	to communicate or run simultaneously could share data through storage
2481	resources.
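
The check behind this denial can be sketched in a few lines of Python.
This is an illustration only, not the code used by \verb|xm create|:
the function \verb|denied_resources| and the dictionary
\verb|STE_TYPES| are hypothetical, and the sketch assumes that every
label carries a single STE type named after the label itself.

\begin{scriptsize}
\begin{verbatim}
# Hypothetical model of the pre-start resource check; not Xen code.
# Assumption: each label carries exactly one STE type, named like the label.
STE_TYPES = {
    "CocaCola": {"CocaCola"},
    "Avis": {"Avis"},
}

def denied_resources(domain_label, disks, resource_labels):
    """Return the disks that a domain with domain_label may not access."""
    domain_types = STE_TYPES.get(domain_label, set())
    denied = []
    for disk in disks:
        # An unlabeled resource has no STE types and is always denied.
        resource_types = STE_TYPES.get(resource_labels.get(disk), set())
        if not domain_types & resource_types:
            denied.append(disk)
    return denied

disks = ["file:/home/xen/dom_fc5/fedora.fc5.img",
         "file:/home/xen/dom_fc5/fedora.swap"]

# No resource is labeled yet, so both disks are reported as denied.
print(denied_resources("CocaCola", disks, {}))
\end{verbatim}
\end{scriptsize}

Passing a mapping that labels both files as CocaCola yields an empty
list, which corresponds to a domain that is allowed to start.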
2482
2483	\subsection{Labeling Resources}
2484	\label{subsection:acmexamplelabelresources}
2485	You can use the \verb|xm labels type=res| command to list available
2486	resource labels. Let us assign the CocaCola resource label to the domain1
2487	image file representing \verb|/dev/sda1| and to its swap file:
2488
2489	\begin{verbatim}
2490	(8) # xm addlabel CocaCola res \
2491	file:/home/xen/dom_fc5/fedora.fc5.img
2492	Resource file not found, creating new file at:
2493	/etc/xen/acm-security/policies/resource_labels
2494	# xm addlabel CocaCola res \
2495	file:/home/xen/dom_fc5/fedora.swap
2496	\end{verbatim}
2497
2498	Starting \verb|domain1| now will succeed:
2499
2500	\begin{scriptsize}
2501	\begin{verbatim}
2502	# xm create domain1.xm
2503	# xm list --label
2504	Name ID Mem(MiB) VCPUs State Time(s) Label
2505	domain1 1 128 1 r----- 2.8 CocaCola
2506	Domain-0 0 1949 4 r----- 387.7 SystemManagement
2507	\end{verbatim}
2508	\end{scriptsize}
2509
2510	The following command lists all labeled resources on the
2511	system, e.g., to look up or verify the labeling:
2512
2513	\begin{scriptsize}
2514	\begin{verbatim}
2515	# xm resources
2516	file:/home/xen/dom_fc5/fedora.swap
2517	policy: example.chwall_ste.test-wld
2518	label: CocaCola
2519	file:/home/xen/dom_fc5/fedora.fc5.img
2520	policy: example.chwall_ste.test-wld
2521	label: CocaCola
2522	\end{verbatim}
2523	\end{scriptsize}
2524
2525	Currently, if a labeled resource is moved to another location, the
2526	label must first be manually removed, and after the move re-attached
2527	using the xm commands \verb|xm rmlabel| and \verb|xm addlabel|
2528	respectively. Please see Section~\ref{section:acmlimitations} for
2529	further details.
2530
2531	\begin{verbatim}
2532	(9) Label the resources of domain2 as PepsiCo.Extranet
2533	Do not try to start this domain yet
2534	\end{verbatim}
2535
2536	\subsection{Testing The Xen Workload Protection}
2537	\label{subsection:acmexampletest}
2538	We are about to demonstrate how the workload protection works by
2539	verifying:
2540	\begin{itemize}
2541	\item that domains with conflicting workloads cannot run
2542	simultaneously
2543	\item that domains cannot access resources of other workloads
2544	\item that domains cannot exchange network packets if they are not
2545	associated with the same workload type
2546	\end{itemize}
2547
2548	\paragraph{Test 1: Run-time exclusion rules.} We assume that domain1
2549	with the CocaCola label is still running. While domain1 is running,
2550	the run-time exclusion set of our policy says that domain2 cannot
2551	start because the label of domain1 includes the CHWALL type CocaCola
2552	and the label of domain2 includes the CHWALL type PepsiCo. The
2553	run-time exclusion rule of our policy enforces that PepsiCo and
2554	CocaCola cannot run at the same time on the same hypervisor platform.
2555	Once domain1 is stopped or saved, domain2 can start but domain1 can no
2556	longer start or be resumed. The ezPolicy tool, when creating the
2557	Chinese Wall types for the workload labels, ensures that department
2558	workloads inherit the organization type (and with it any organization
2559	exclusions).
2560
2561	\begin{scriptsize}
2562	\begin{verbatim}
2563	# xm list --label
2564	Name ID Mem(MiB) VCPUs State Time(s) Label
2565	domain1 2 128 1 -b---- 6.9 CocaCola
2566	Domain-0 0 1949 4 r----- 273.1 SystemManagement
2567
2568	# xm create domain2.xm
2569	Using config file "domain2.xm".
2570	Error: (1, 'Operation not permitted')
2571
2572	# xm destroy domain1
2573	# xm create domain2.xm
2574	Using config file "domain2.xm".
2575	Started domain domain2
2576
2577	# xm list --label
2578	Name ID Mem(MiB) VCPUs State Time(s) Label
2579	domain2 4 164 1 r----- 4.3 PepsiCo.Extranet
2580	Domain-0 0 1949 4 r----- 298.4 SystemManagement
2581
2582	# xm create domain1.xm
2583	Using config file "domain1.xm".
2584	Error: (1, 'Operation not permitted')
2585
2586	# xm destroy domain2
2587	# xm list
2588	Name ID Mem(MiB) VCPUs State Time(s)
2589	Domain-0 0 1949 4 r----- 391.2
2590	\end{verbatim}
2591	\end{scriptsize}
2592
2593	You can verify that domains with the Avis label can run together with
2594	domains labeled CocaCola, PepsiCo, or Hertz.
2595
2596	\paragraph{Test 2: Resource access.} In this test, we will re-label the
2597	swap file for domain1 with the Avis resource label. We expect that
2598	domain1 will no longer start because it cannot access this resource.
2599	This test checks the sharing abilities of domains, which are defined
2600	by the Simple Type Enforcement Policy component.
2601
2602	\begin{scriptsize}
2603	\begin{verbatim}
2604	# xm rmlabel res file:/home/xen/dom_fc5/fedora.swap
2605	# xm addlabel Avis res file:/home/xen/dom_fc5/fedora.swap
2606	# xm resources
2607	file:/home/xen/dom_fc5/fedora.swap
2608	policy: example.chwall_ste.test-wld
2609	label: Avis
2610	file:/home/xen/dom_fc5/fedora.fc5.img
2611	policy: example.chwall_ste.test-wld
2612	label: CocaCola
2613
2614	# xm create domain1.xm
2615	Using config file "domain1.xm".
2616	file:/home/xen/dom_fc5/fedora.swap: DENIED
2617	--> res:Avis (example.chwall_ste.test-wld)
2618	--> dom:CocaCola (example.chwall_ste.test-wld)
2619	Security configuration prevents domain from starting
2620	\end{verbatim}
2621	\end{scriptsize}
2622
2623	\paragraph{Test 3: Communication.} In this test we would verify, using
2624	the `ping' connectivity test, that two domains with labels Hertz and
2625	Avis cannot exchange network packets. This test also exercises the STE
2626	policy. {\bf Note:} sHype/Xen does control direct communication between
2627	domains. However, domains associated with different workloads can
2628	currently still communicate through the Domain0 virtual network. We
2629	are working on the sHype/ACM controls for local and remote network
2630	traffic through Domain0. Please monitor the xen-devel mailing list
2631	for updated information.
2632
2633	\section{Xen Access Control Policy}
2634	\label{section:acmpolicy}
2635
2636	This section describes the sHype/Xen access control policy in detail.
2637	It gives enough information to enable the reader to write custom
2638	access control policies and to use the available Xen policy tools. The
2639	policy language is expressive enough to specify most symmetric access
2640	relationships between domains and resources efficiently.
2641
2642	The Xen access control policy consists of two policy components. The
2643	first component, called Chinese Wall (CHWALL) policy, controls which
2644	domains can run simultaneously on the same virtualized platform. The
2645	second component, called Simple Type Enforcement (STE) policy,
2646	controls the sharing between running domains, i.e., communication or
2647	access to shared resources. The CHWALL and STE policy components can
2648	be configured to run alone; however, in our examples we will assume
2649	that both policy components are configured together since they
2650	complement each other. The XML policy file includes all information
2651	needed by Xen to enforce the policies.
2652
2653	Figures~\ref{fig:acmxmlfilea} and \ref{fig:acmxmlfileb} show a fully
2654	functional but very simple example policy for Xen. The policy can
2655	distinguish two workload types \verb|CocaCola| and \verb|PepsiCo| and
2656	defines the labels necessary to associate domains and resources with
2657	one of these workload types. The XML Policy consists of four parts:
2658	\begin{enumerate}
2659	\item policy header including the policy name
2660	\item Simple Type Enforcement block
2661	\item Chinese Wall Policy block
2662	\item label definition block
2663	\end{enumerate}
2664
2665	\begin{figure}
2666	\begin{scriptsize}
2667	\begin{verbatim}
2668	01 <?xml version="1.0" encoding="UTF-8"?>
2669	02 <!-- Auto-generated by ezPolicy -->
2670	03 <SecurityPolicyDefinition
2671	xmlns="http://www.ibm.com"
2672	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
2673	xsi:schemaLocation=
2674	"http://www.ibm.com ../../security_policy.xsd ">
2675	04 <PolicyHeader>
2676	05 <PolicyName>example.chwall_ste.test</PolicyName>
2677	06 <Date>Wed Jul 12 17:32:59 2006</Date>
2678	07 <Version>1.0</Version>
2679	08 </PolicyHeader>
2680	09
2681	10 <SimpleTypeEnforcement>
2682	11 <SimpleTypeEnforcementTypes>
2683	12 <Type>SystemManagement</Type>
2684	13 <Type>PepsiCo</Type>
2685	14 <Type>CocaCola</Type>
2686	15 </SimpleTypeEnforcementTypes>
2687	16 </SimpleTypeEnforcement>
2688	17
2689	18 <ChineseWall priority="PrimaryPolicyComponent">
2690	19 <ChineseWallTypes>
2691	20 <Type>SystemManagement</Type>
2692	21 <Type>PepsiCo</Type>
2693	22 <Type>CocaCola</Type>
2694	23 </ChineseWallTypes>
2695	24
2696	25 <ConflictSets>
2697	26 <Conflict name="RER1">
2698	27 <Type>CocaCola</Type>
2699	28 <Type>PepsiCo</Type>
2700	29 </Conflict>
2701	30 </ConflictSets>
2702	31 </ChineseWall>
2703	32
2704	\end{verbatim}
2705	\end{scriptsize}
2706	\caption{Example XML security policy file -- Part I: Types and Rules Definition.}
2707	\label{fig:acmxmlfilea}
2708	\end{figure}
2709
2710	\subsection{Policy Header and Policy Name}
2711	\label{subsection:acmnaming}
2712	Lines 1-2 (cf Figure~\ref{fig:acmxmlfilea}) include the usual XML
2713	header. The security policy definition starts in Line 3 and refers to
2714	the policy schema. The XML-Schema definition for the Xen policy can be
2715	found in the file
2716	\textit{/etc/xen/acm-security/policies/security\_policy.xsd}. Examples
2717	for security policies can be found in the example subdirectory. The
2718	acm-security directory is only installed if ACM security is configured
2719	during installation (cf Section~\ref{subsection:acmexampleconfigure}).
2720
2721	The \verb|PolicyHeader| spans lines 4-8. It includes a date field and
2722	defines the policy name \verb|example.chwall_ste.test|. It can also
2723	include optional fields that are not shown and are for future use (see
2724	schema definition).
2725
2726	The policy name serves two purposes: First, it provides a unique name
2727	for the security policy. This name is also exported by the Xen
2728	hypervisor to the Xen management tools in order to ensure that both
2729	enforce the same policy. We plan to extend the policy name with a
2730	digital fingerprint of the policy contents to better protect this
2731	correlation. Second, it implicitly points the xm tools to the
2732	location where the XML policy file is stored on the Xen system.
2733	Replacing the dots in the policy name by slashes yields the local
2734	path to the policy file starting from the global policy directory
2735	\verb|/etc/xen/acm-security/policies|. The last part of the policy
2736	name is the prefix for the XML policy file name, completed by
2737	\verb|-security_policy.xml|. Consequently, the policy with the name
2738	\verb|example.chwall_ste.test| can be found in the XML policy file
2739	named \verb|test-security_policy.xml| that is stored in the local
2740	directory \verb|example/chwall_ste| under the global policy directory.
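
The mapping from policy name to file location can be sketched as
follows. The helper \verb|policy_file_path| is hypothetical and not
part of the xm tools; it assumes that the components of a policy name
are separated by dots, as in the names used in this chapter.

\begin{scriptsize}
\begin{verbatim}
import posixpath

# Global policy directory as named in the text.
POLICY_ROOT = "/etc/xen/acm-security/policies"

def policy_file_path(policy_name, root=POLICY_ROOT):
    """Map a dotted policy name to the path of its XML policy file."""
    parts = policy_name.split(".")
    if not all(parts):
        raise ValueError("empty component in policy name: %r" % policy_name)
    # All parts but the last name directories; the last part is the
    # prefix of the file name.
    subdirs, prefix = parts[:-1], parts[-1]
    return posixpath.join(root, *subdirs, prefix + "-security_policy.xml")

print(policy_file_path("example.chwall_ste.test"))
\end{verbatim}
\end{scriptsize}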
2741
2742	\subsection{Simple Type Enforcement Policy Component}
2743
2744	The Simple Type Enforcement (STE) policy controls which domains can
2745	communicate or share resources. This way, Xen can enforce confinement
2746	of workload types by confining the domains running those workload
2747	types. The mandatory access control framework enforces its policy when
2748	domains access intended ways of communication or cooperation (shared
2749	memory, events, shared resources such as block devices). It builds on
2750	top of the core hypervisor isolation, which restricts the ways of
2751	inter-communication to those intended means. STE does not protect or
2752	intend to protect from covert channels in the hypervisor or hardware;
2753	this is an orthogonal problem that can be mitigated by using the
2754	Run-time Exclusion rules described above or by fixing the problem in
2755	the core hypervisor.
2756
2757	Xen controls sharing between domains on the resource and domain level
2758	because this is the abstraction the hypervisor and its management
2759	understand naturally. While this is coarse-grained, it is also very
2760	reliable and robust and it requires minimal changes to implement
2761	mandatory access controls in the hypervisor. It enables platform- and
2762	operating system-independent policies as part of a layered security
2763	approach.
2764
2765	Lines 10-16 (cf Figure~\ref{fig:acmxmlfilea}) define the Simple Type
2766	Enforcement policy component. Essentially, they define the workload
2767	type names \verb|SystemManagement|, \verb|PepsiCo|, and
2768	\verb|CocaCola| that are available in the STE policy component. The
2769	policy rules are implicit: Xen permits a domain to communicate with
2770	another domain if and only if the labels of the domains share a
2771	common STE type. Xen permits a domain to access a resource if and
2772	only if the labels of the domain and the resource share a common STE
2773	workload type.
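
The implicit STE rule amounts to a set intersection on the STE types
of two labels. The following Python sketch uses the STE types of the
domain labels defined in Figure~\ref{fig:acmxmlfileb}; the function
name \verb|ste_permits| is chosen for illustration and is not part of
Xen.

\begin{scriptsize}
\begin{verbatim}
# Illustration of the implicit STE rule; not Xen code.
# STE types per domain label, as in the example policy.
DOMAIN_STE_TYPES = {
    "SystemManagement": {"SystemManagement", "PepsiCo", "CocaCola"},
    "PepsiCo": {"PepsiCo"},
    "CocaCola": {"CocaCola"},
}

def ste_permits(label_a, label_b, types=DOMAIN_STE_TYPES):
    """True if and only if both labels share at least one STE type."""
    return bool(types[label_a] & types[label_b])

for a, b in [("CocaCola", "CocaCola"),
             ("CocaCola", "PepsiCo"),
             ("SystemManagement", "PepsiCo")]:
    print(a, b, ste_permits(a, b))
\end{verbatim}
\end{scriptsize}

Because the \verb|SystemManagement| label lists all STE types, a
domain carrying it can share with every other domain in this policy.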
2774
2775	\subsection{Chinese Wall Policy Component}
2776
2777	The Chinese Wall security policy interpretation of sHype enables users
2778	to prevent certain workloads from running simultaneously on the same
2779	hypervisor platform. Run-time Exclusion rules (RER), also called
2780	Conflict Sets, define a set of workload types that are not permitted
2781	to run simultaneously. Of all the workloads specified in a Run-time
2782	Exclusion rule, at most one type can run on the same hypervisor
2783	platform at a time. Run-time Exclusion rules implement a less
2784	rigorous variant of the original Chinese Wall security policy. They
2785	do not implement the *-property of the policy, which would also
2786	restrict types that are not part of an exclusion rule once they
2787	are running together with a type in an exclusion rule (please refer to
2788	http://www.gammassl.co.uk/topics/chinesewall.html for more information
2789	on the original Chinese Wall policy).
2790
2791	Xen considers the \verb|ChineseWallTypes| part of the label for the
2792	enforcement of the Run-time Exclusion rules. It is illegal to define
2793	labels including conflicting Chinese Wall types.
2794
2795	Lines 18-31 (cf Figure~\ref{fig:acmxmlfilea}) define the Chinese Wall
2796	policy component. Lines 19-23 define the known Chinese Wall types,
2797	which coincide here with the STE types defined above. This usually
2798	holds if the criteria for sharing among domains and sharing of the
2799	hardware platform are the same. Lines 25-30 define one Run-time
2800	Exclusion rule:
2801
2802	\begin{scriptsize}
2803	\begin{verbatim}
2804	<Conflict name="RER1">
2805	<Type>CocaCola</Type>
2806	<Type>PepsiCo</Type>
2807	</Conflict>
2808	\end{verbatim}
2809	\end{scriptsize}
2810
2811	Based on this rule, Xen enforces that only one of the types
2812	\verb|CocaCola| or \verb|PepsiCo| will run on a single hypervisor
2813	platform at a time. For example, once a domain assigned a
2814	\verb|CocaCola| workload type is started, domains with the
2815	\verb|PepsiCo| type will be prevented from starting. When the former
2816	domain stops and no other domains with the \verb|CocaCola| type are
2817	running, then domains with the \verb|PepsiCo| type can start.
2818
2819	Xen maintains reference counts on each running workload type to keep
2820	track of which workload types are running. Every time a domain starts
2821	or resumes, the reference counts of those Chinese Wall types that are
2822	referenced in the domain's label are incremented. Every time a domain
2823	is destroyed or saved, the reference counts of its Chinese Wall types
2824	are decremented. sHype in Xen covers migration and live-migration,
2825	which is treated the same way as saving a domain on the source
2826	platform and resuming it on the destination platform.
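
The reference counting and the Run-time Exclusion check described
above can be modeled as follows. The class \verb|ChineseWallState| and
its methods are hypothetical names chosen for this sketch; it
illustrates the rule and is not the hypervisor implementation.

\begin{scriptsize}
\begin{verbatim}
from collections import Counter

class ChineseWallState:
    """Hypothetical model of Run-time Exclusion rule enforcement."""

    def __init__(self, conflict_sets):
        self.conflict_sets = [frozenset(s) for s in conflict_sets]
        self.running = Counter()  # reference count per Chinese Wall type

    def may_start(self, chwall_types):
        types = set(chwall_types)
        for conflict in self.conflict_sets:
            own = types & conflict
            if len(own) > 1:
                return False  # the label itself holds conflicting types
            active = {t for t in conflict if self.running[t] > 0}
            if own and active - own:
                return False  # a conflicting type is already running
        return True

    def start(self, chwall_types):
        """Call when a domain starts or resumes."""
        if not self.may_start(chwall_types):
            raise PermissionError("Operation not permitted")
        self.running.update(set(chwall_types))

    def stop(self, chwall_types):
        """Call when a domain is destroyed or saved."""
        self.running.subtract(set(chwall_types))

state = ChineseWallState([{"CocaCola", "PepsiCo"}])
state.start({"CocaCola"})
print(state.may_start({"PepsiCo"}))   # CocaCola is running
state.stop({"CocaCola"})
print(state.may_start({"PepsiCo"}))   # no conflicting type is running
\end{verbatim}
\end{scriptsize}

In this model, migration corresponds to calling \verb|stop| on the
state of the source platform and \verb|start| on the state of the
destination platform.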
2827
2828	Reasons why users would want to restrict which workloads or domains
2829	can share the system hardware include:
2830
2831	\begin{itemize}
2832	\item Imperfect resource management or control might enable a rogue
2833	domain to starve another domain and the workload running in it.
2834	\item Redundant domains might run the same workload to increase
2835	availability; such domains should not run on the same hardware to
2836	avoid single points of failure.
2837	\item Imperfect Xen core domain isolation might enable two rogue
2838	domains running different workload types to use unintended and
2839	unknown ways (covert channels) to exchange some data. This way, they
2840	bypass the policed Xen access control mechanisms. Such
2841	imperfections cannot be completely eliminated and are a result of
2842	trade-offs between security and other design requirements. For a
2843	simple example of a covert channel see
2844	http://www.multicians.org/timing-chn.html. Such covert channels
2845	exist also between workloads running on different platforms if they
2846	are connected through networks. The Xen Chinese Wall policy provides
2847	an approximation of this imperfect ``air-gap'' between selected
2848	workload types.
2849	\end{itemize}
2850
2851	\subsection{Security Labels}
2852
2853	To enable Xen to associate domains with workload types running in
2854	them, each domain is assigned a security label that includes the
2855	workload types of the domain.
2856
2857	\begin{figure}
2858	\begin{scriptsize}
2859	\begin{verbatim}
2860	32 <SecurityLabelTemplate>
2861	33 <SubjectLabels bootstrap="SystemManagement">
2862	34 <VirtualMachineLabel>
2863	35 <Name>SystemManagement</Name>
2864	36 <SimpleTypeEnforcementTypes>
2865	37 <Type>SystemManagement</Type>
2866	38 <Type>PepsiCo</Type>
2867	39 <Type>CocaCola</Type>
2868	40 </SimpleTypeEnforcementTypes>
2869	41 <ChineseWallTypes>
2870	42 <Type>SystemManagement</Type>
2871	43 </ChineseWallTypes>
2872	44 </VirtualMachineLabel>
2873	45
2874	46 <VirtualMachineLabel>
2875	47 <Name>PepsiCo</Name>
2876	48 <SimpleTypeEnforcementTypes>
2877	49 <Type>PepsiCo</Type>
2878	50 </SimpleTypeEnforcementTypes>
2879	51 <ChineseWallTypes>
2880	52 <Type>PepsiCo</Type>
2881	53 </ChineseWallTypes>
2882	54 </VirtualMachineLabel>
2883	55
2884	56 <VirtualMachineLabel>
2885	57 <Name>CocaCola</Name>
2886	58 <SimpleTypeEnforcementTypes>
2887	59 <Type>CocaCola</Type>
2888	60 </SimpleTypeEnforcementTypes>
2889	61 <ChineseWallTypes>
2890	62 <Type>CocaCola</Type>
2891	63 </ChineseWallTypes>
2892	64 </VirtualMachineLabel>
2893	65 </SubjectLabels>
2894	66
2895	67 <ObjectLabels>
2896	68 <ResourceLabel>
2897	69 <Name>SystemManagement</Name>
2898	70 <SimpleTypeEnforcementTypes>
2899	71 <Type>SystemManagement</Type>
2900	72 </SimpleTypeEnforcementTypes>
2901	73 </ResourceLabel>
2902	74
2903	75 <ResourceLabel>
2904	76 <Name>PepsiCo</Name>
2905	77 <SimpleTypeEnforcementTypes>
2906	78 <Type>PepsiCo</Type>
2907	79 </SimpleTypeEnforcementTypes>
2908	80 </ResourceLabel>
2909	81
2910	82 <ResourceLabel>
2911	83 <Name>CocaCola</Name>
2912	84 <SimpleTypeEnforcementTypes>
2913	85 <Type>CocaCola</Type>
2914	86 </SimpleTypeEnforcementTypes>
2915	87 </ResourceLabel>
2916	88 </ObjectLabels>
2917	89 </SecurityLabelTemplate>
2918	90 </SecurityPolicyDefinition>
2919	\end{verbatim}
2920	\end{scriptsize}
2921	\caption{Example XML security policy file -- Part II: Label Definition.}
2922	\label{fig:acmxmlfileb}
2923	\end{figure}
2924
2925	Lines 32-89 (cf Figure~\ref{fig:acmxmlfileb}) define the
2926	\verb|SecurityLabelTemplate|, which includes the labels that can be
2927	attached to domains and resources when this policy is active. The
2928	domain labels include Chinese Wall types while resource labels do not
2929	include Chinese Wall types. Lines 33-65 define the
2930	\verb\|SubjectLabels\| that can be assigned to domains. For example, the
2931	virtual machine label \verb\|CocaCola\| (cf lines 56-64 in
2932	Figure~\ref{fig:acmxmlfileb}) associates the domain that carries it
2933	with the workload type \verb\|CocaCola\|.
2934
The \verb|bootstrap| attribute names the label
\verb|SystemManagement|. Xen will assign this label to Domain0 at
boot time. All other domains are assigned labels according to their
domain configuration file (see
Section~\ref{subsection:acmexamplelabeldomains} for examples of how to
label domains). Lines 67-88 define the \verb|ObjectLabels|. Those
labels can be assigned to resources when this policy is active.
2942
2943	In general, user domains should be assigned labels that have only a
2944	single SimpleTypeEnforcement workload type. This way, workloads remain
2945	confined even if user domains become rogue. Any domain that is
2946	assigned a label with multiple STE types must be trusted to keep
2947	information belonging to the different STE types separate (confined).
For example, Domain0 is assigned the bootstrap label
\verb|SystemManagement|, which includes all existing STE types.
Therefore, Domain0 must take care not to enable unauthorized
information flow (e.g.\ through block devices or virtual networking)
between domains or resources that are assigned different STE types.
2953
Security administrators simply use the name of a label (specified in
the \verb|<Name>| field) to associate a label with a domain
(cf.\ Section~\ref{subsection:acmexamplelabeldomains}). The types
inside the label are used by the Xen access control enforcement. While
the name can be chosen arbitrarily (as long as it is unique), it is
advisable to choose a label name that reflects the security types it
includes. While the XML representation of the labels above may seem
unnecessarily flexible, labels in general can consist of multiple
types, as the following example shows.
2963
Assume that \verb|PepsiCo| and \verb|CocaCola| workloads use virtual
disks that are provided by a virtual I/O domain hosting a physical
storage device and carrying the following label:
2967
2968	\begin{scriptsize}
2969	\begin{verbatim}
2970	<VirtualMachineLabel>
2971	<Name>VIO</Name>
2972	<SimpleTypeEnforcementTypes>
2973	<Type>CocaCola</Type>
2974	<Type>PepsiCo</Type>
2975	</SimpleTypeEnforcementTypes>
2976	<ChineseWallTypes>
2977	<Type>VIOServer</Type>
2978	</ChineseWallTypes>
2979	</VirtualMachineLabel>
2980	\end{verbatim}
2981	\end{scriptsize}
2982
This Virtual I/O domain (VIO) exports its virtualized disks by
communicating with both the domains labeled \verb|PepsiCo| and the
domains labeled \verb|CocaCola|. This requires the VIO domain to carry
both the STE types \verb|CocaCola| and \verb|PepsiCo|. In this example,
the confinement of the \verb|CocaCola| and \verb|PepsiCo| workloads
depends on the VIO domain keeping the data of those workloads
separate. The virtual disks are labeled as well (see
Section~\ref{subsection:acmexamplelabelresources} for labeling
resources), and enforcement functions inside the VIO domain must
ensure that the label of the domain mounting a virtual disk and the
label of the virtual disk share a common STE type. Because the VIO
label carries its own \verb|VIOServer| CHWALL type, the trusted VIO
server is permitted to run together with \verb|CocaCola| or
\verb|PepsiCo| workloads.
2997
Alternatively, a system that has two hard drives does not need a VIO
domain but can directly assign one hardware storage device to each of
the workloads (if the platform offers an IO-MMU;
cf.\ Section~\ref{s:ddsecurity}). Sharing hardware through
virtualization is a trade-off between the amount of trusted code (size
of the trusted computing base) and the amount of acceptable
over-provisioning. This holds both for peripherals and for system
platforms.
3005
3006	\subsection{Tools For Creating sHype/Xen Security Policies}
3007	To create a security policy for Xen, you can use one of the following
3008	tools:
3009	\begin{itemize}
\item \verb|ezPolicy| GUI tool -- start writing policies
\item \verb|xensec_gen| tool -- refine policies created with \verb|ezPolicy|
3012	\item text or XML editor
3013	\end{itemize}
3014
We use the \verb|ezPolicy| tool in
Section~\ref{subsection:acmexamplecreate} to quickly create a workload
protection policy. If desired, the resulting XML policy file can be
loaded into the \verb|xensec_gen| tool to refine it. It can also be
directly edited using an XML editor. Any XML policy file is verified
against the security policy schema when it is translated (see
Subsection~\ref{subsection:acmexampleinstall}).
3022
3023	\section{Current Limitations}
3024	\label{section:acmlimitations}
3025
The sHype/ACM configuration for Xen is a work in progress. Work on
protecting virtualized resources is ongoing, and work on protecting
access to remote resources and domains is partly ongoing and partly
still planned. The following sections describe limitations of some of
the areas into which access control is being extended.
3031
3032	\subsection{Network Traffic}
3033	Local and remote network traffic is currently not controlled.
3034	Solutions to add sHype/ACM policy enforcement to the virtual network
3035	exist but need to be discussed before they can become part of Xen.
3036	Subjecting external network traffic to the ACM security policy is work
3037	in progress. Manually setting up filters in domain 0 is required for
3038	now but does not scale well.
3039
3040	\subsection{Resource Access and Usage Control}
3041
Enforcing the security policy across multiple hypervisor systems and
on access to remote shared resources is work in progress. Extending
access control to new types of resources (e.g.\ network storage) is
ongoing work.
3046
3047	On a single Xen system, information about the association of resources
3048	and security labels is stored in
\verb|/etc/xen/acm-security/policy/resource_labels|. This file relates
3050	a full resource path with a security label. This association is weak
3051	and will break if resources are moved or renamed without adapting the
3052	label file. Improving the protection of label-resource relationships
3053	is ongoing work.
3054
3055	Controlling resource usage and enforcing resource limits in general is
3056	ongoing work in the Xen community.
3057
3058	\subsection{Domain Migration}
3059
3060	Labels on domains are enforced during domain migration and the
3061	destination hypervisor will ensure that the domain label is valid and
3062	the domain is permitted to run (considering the Chinese Wall policy
3063	rules) before it accepts the migration. However, the network between
3064	the source and destination hypervisor as well as both hypervisors must
be trusted. Architectures and prototypes exist that both protect the
network connection and ensure that the hypervisors enforce access
control consistently, but patches are not yet available for the
mainline Xen tree.
3069
3070	\subsection{Covert Channels}
3071
The sHype access control aims at system-independent security policies.
It builds on top of the core hypervisor isolation. Any covert channels
that exist in the core hypervisor or in the hardware (e.g., shared
processor cache) will be inherited. If those covert channels are not
the result of trade-offs between security and other system properties,
then they are most effectively minimized or eliminated where they are
caused. sHype does, however, offer some means to mitigate their impact
(cf.\ run-time exclusion rules).
3080
3081	\part{Reference}
3082
3083	%% Chapter Build and Boot Options
3084	\chapter{Build and Boot Options}
3085
3086	This chapter describes the build- and boot-time options which may be
3087	used to tailor your Xen system.
3088
3089	\section{Top-level Configuration Options}
3090
3091	Top-level configuration is achieved by editing one of two
3092	files: \path{Config.mk} and \path{Makefile}.
3093
3094	The former allows the overall build target architecture to be
3095	specified. You will typically not need to modify this unless
3096	you are cross-compiling or if you wish to build a PAE-enabled
3097	Xen system. Additional configuration options are documented
3098	in the \path{Config.mk} file.
3099
3100	The top-level \path{Makefile} is chiefly used to customize the set of
3101	kernels built. Look for the line:
3102	\begin{quote}
3103	\begin{verbatim}
3104	KERNELS ?= linux-2.6-xen0 linux-2.6-xenU
3105	\end{verbatim}
3106	\end{quote}
3107
3108	Allowable options here are any kernels which have a corresponding
3109	build configuration file in the \path{buildconfigs/} directory.
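
Because the assignment uses \verb|?=|, the set of kernels can also be
overridden for a single build without editing the \path{Makefile}. The
following is a minimal sketch only; it assumes that you build with
\path{make world} from the top of the source tree and that you only
want the domain~0 kernel:
\begin{quote}
\begin{verbatim}
# make KERNELS="linux-2.6-xen0" world
\end{verbatim}
\end{quote}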
3110
3111
3112
3113	\section{Xen Build Options}
3114
Xen provides a number of build-time options which should be set as
environment variables or passed on the \path{make} command line.
3117
3118	\begin{description}
3119	\item[verbose=y] Enable debugging messages when Xen detects an
3120	unexpected condition. Also enables console output from all domains.
3121	\item[debug=y] Enable debug assertions. Implies {\bf verbose=y}.
3122	(Primarily useful for tracing bugs in Xen).
3123	\item[debugger=y] Enable the in-Xen debugger. This can be used to
3124	debug Xen, guest OSes, and applications.
3125	\item[perfc=y] Enable performance counters for significant events
3126	within Xen. The counts can be reset or displayed on Xen's console
3127	via console control keys.
3128	\end{description}
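
For example, a debug build with performance counters could be
requested in either of the following ways. This is a sketch only; it
assumes that \path{make} is run from the top of the Xen source tree
and that the default target is what you want to build:
\begin{quote}
\begin{verbatim}
# make debug=y perfc=y
# debug=y perfc=y make
\end{verbatim}
\end{quote}
The first form passes the options on the command line; the second sets
them as environment variables for that one invocation.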
3129
3130
3131	\section{Xen Boot Options}
3132	\label{s:xboot}
3133
3134	These options are used to configure Xen's behaviour at runtime. They
3135	should be appended to Xen's command line, either manually or by
3136	editing \path{grub.conf}.
3137
3138	\begin{description}
3139	\item [ noreboot ] Don't reboot the machine automatically on errors.
3140	This is useful to catch debug output if you aren't catching console
3141	messages via the serial line.
3142	\item [ nosmp ] Disable SMP support. This option is implied by
3143	`ignorebiostables'.
3144	\item [ watchdog ] Enable NMI watchdog which can report certain
3145	failures.
3146	\item [ noirqbalance ] Disable software IRQ balancing and affinity.
3147	This can be used on systems such as Dell 1850/2850 that have
3148	workarounds in hardware for IRQ-routing issues.
\item [ badpage=$<$page number$>$,$<$page number$>$, \ldots ] Specify
  a list of pages not to be allocated for use because they contain bad
  bytes. The page number is the byte address with its low 12 bits
  dropped (pages are 4kB). For example, if your memory tester says
  that byte 0x12345678 is bad, you would place `badpage=0x12345' on
  Xen's command line.
\item [ com1=$<$baud$>$,DPS,$<$io\_base$>$,$<$irq$>$
  com2=$<$baud$>$,DPS,$<$io\_base$>$,$<$irq$>$ ] \mbox{}\\
  Xen supports up to two 16550-compatible serial ports. DPS stands for
  data bits, parity and stop bits. For example:
  `com1=9600,8n1,0x408,5' maps COM1 to a 9600-baud port, 8 data
  bits, no parity, 1 stop bit, I/O port base 0x408, IRQ 5. If some
  configuration options are standard (e.g., I/O base and IRQ), then
  only a prefix of the full configuration string need be specified. If
  the baud rate is pre-configured (e.g., by the bootloader) then you
  can specify `auto' in place of a numeric baud rate.
3162	\item [ console=$<$specifier list$>$ ] Specify the destination for Xen
3163	console I/O. This is a comma-separated list of, for example:
3164	\begin{description}
3165	\item[ vga ] Use VGA console (until domain 0 boots, unless {\bf
3166	vga=keep } is specified).
3167	\item[ com1 ] Use serial port com1.
3168	\item[ com2H ] Use serial port com2. Transmitted chars will have the
3169	MSB set. Received chars must have MSB set.
3170	\item[ com2L] Use serial port com2. Transmitted chars will have the
3171	MSB cleared. Received chars must have MSB cleared.
3172	\end{description}
  The latter two examples allow a single port to be shared by two
  subsystems (e.g.\ console and debugger). Sharing is controlled by
  the MSB of each transmitted/received character. [NB. Default for
  this option is `com1,vga']
3177	\item [ vga=$<$options$>$ ] This is a comma-separated list of options:
3178	\begin{description}
3179	\item[ text-$<$mode$>$ ] Select text-mode resolution, where mode is
3180	one of 80x25, 80x28, 80x30, 80x34, 80x43, 80x50, 80x60.
3181	\item[ keep ] Keep the VGA console even after domain 0 boots.
3182	\end{description}
3183	\item [ console\_to\_ring ] Place guest console output into the
3184	hypervisor console ring buffer. This is disabled by default.
3185	When enabled, both hypervisor output and guest console output
3186	is available from the ring buffer. This can be useful for logging
3187	and/or remote presentation of console data.
3188	\item [ sync\_console ] Force synchronous console output. This is
  useful if your system fails unexpectedly before it has sent all
3190	available output to the console. In most cases Xen will
3191	automatically enter synchronous mode when an exceptional event
3192	occurs, but this option provides a manual fallback.
3193	\item [ conswitch=$<$switch-char$><$auto-switch-char$>$ ] Specify how
3194	to switch serial-console input between Xen and DOM0. The required
3195	sequence is CTRL-$<$switch-char$>$ pressed three times. Specifying
3196	the backtick character disables switching. The
3197	$<$auto-switch-char$>$ specifies whether Xen should auto-switch
3198	input to DOM0 when it boots --- if it is `x' then auto-switching is
3199	disabled. Any other value, or omitting the character, enables
3200	auto-switching. [NB. Default switch-char is `a'.]
3201	\item [ loglvl=$<$level$>/<$level$>$ ]
3202	Specify logging level. Messages of the specified severity level (and
3203	higher) will be printed to the Xen console. Valid levels are `none',
3204	`error', `warning', `info', `debug', and `all'. The second level
3205	specifier is optional: it is used to specify message severities
3206	which are to be rate limited. Default is `loglvl=warning'.
3207	\item [ guest\_loglvl=$<$level$>/<$level$>$ ] As for loglvl, but
3208	applies to messages relating to guests. Default is
3209	`guest\_loglvl=none/warning'.
3210	\item [ nmi=xxx ]
3211	Specify what to do with an NMI parity or I/O error. \\
3212	`nmi=fatal': Xen prints a diagnostic and then hangs. \\
3213	`nmi=dom0': Inform DOM0 of the NMI. \\
3214	`nmi=ignore': Ignore the NMI.
3215	\item [ mem=xxx ] Set the physical RAM address limit. Any RAM
3216	appearing beyond this physical address in the memory map will be
3217	ignored. This parameter may be specified with a B, K, M or G suffix,
3218	representing bytes, kilobytes, megabytes and gigabytes respectively.
3219	The default unit, if no suffix is specified, is kilobytes.
3220	\item [ dom0\_mem=$<$specifier list$>$ ] Set the amount of memory to
3221	be allocated to domain 0. This is a comma-separated list containing
3222	the following optional components:
3223	\begin{description}
3224	\item[ min:$<$min\_amt$>$ ] Minimum amount to allocate to domain 0
  \item[ max:$<$max\_amt$>$ ] Maximum amount to allocate to domain 0
3226	\item[ $<$amt$>$ ] Precise amount to allocate to domain 0
3227	\end{description}
3228	Each numeric parameter may be specified with a B, K, M or
3229	G suffix, representing bytes, kilobytes, megabytes and gigabytes
3230	respectively; if no suffix is specified, the parameter defaults to
3231	kilobytes. Negative values are subtracted from total available
3232	memory. If $<$amt$>$ is not specified, it defaults to all available
3233	memory less a small amount (clamped to 128MB) for uses such as DMA
3234	buffers.
\item [ dom0\_vcpus\_pin ] Pins domain 0 VCPUs on their respective
  physical CPUs (default=false).
3237	\item [ tbuf\_size=xxx ] Set the size of the per-cpu trace buffers, in
3238	pages (default 0).
3239	\item [ sched=xxx ] Select the CPU scheduler Xen should use. The
3240	current possibilities are `credit' (default), and `sedf'.
3241	\item [ apic\_verbosity=debug,verbose ] Print more detailed
3242	information about local APIC and IOAPIC configuration.
3243	\item [ lapic ] Force use of local APIC even when left disabled by
3244	uniprocessor BIOS.
3245	\item [ nolapic ] Ignore local APIC in a uniprocessor system, even if
3246	enabled by the BIOS.
3247	\item [ apic=bigsmp,default,es7000,summit ] Specify NUMA platform.
3248	This can usually be probed automatically.
3249	\item [ dma\_bits=xxx ] Specify width of DMA
3250	addresses in bits. Default is 30 bits (addresses up to 1GB are DMAable).
3251	\item [ dma\_emergency\_pool=xxx ] Specify lower bound on size of DMA
3252	pool below which ordinary allocations will fail rather than fall
3253	back to allocating from the DMA pool.
3254	\item [ hap ] Instruct Xen to detect hardware-assisted paging support, such
  as AMD-V's nested paging or Intel\textregistered\ VT's extended paging. If
3256	available, Xen will use hardware-assisted paging instead of shadow paging
3257	for guest memory management.
3258	\end{description}
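
As an illustration, the following \path{grub.conf} entry combines
several of the options above. It is a sketch rather than a recommended
configuration: the file names, the root device and the chosen values
are assumptions that must be adapted to your installation.

\begin{scriptsize}
\begin{verbatim}
title Xen 3.0 / XenLinux 2.6 (serial console, verbose logging)
    kernel /boot/xen-3.0.gz dom0_mem=256M com1=115200,8n1 console=com1,vga loglvl=all noreboot
    module /boot/vmlinuz-2.6-xen0 root=/dev/sda4 ro console=tty0
\end{verbatim}
\end{scriptsize}

Everything after the Xen image name on the \path{kernel} line is
parsed by Xen; the \path{module} line carries the command line of the
domain~0 kernel. In this sketch domain~0 receives 256MB of memory, Xen
console output goes to both the first serial port and the VGA console,
all log messages are printed, and the machine does not reboot
automatically on errors.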
3259
In addition, the following options may be specified on the Xen command
line. Since domain 0 shares responsibility for booting the platform,
Xen will automatically propagate these options to domain 0's command
line. These options are taken from Linux's command-line syntax with
unchanged semantics.
3265
3266	\begin{description}
3267	\item [ acpi=off,force,strict,ht,noirq,\ldots ] Modify how Xen (and
3268	domain 0) parses the BIOS ACPI tables.
3269	\item [ acpi\_skip\_timer\_override ] Instruct Xen (and domain~0) to
3270	ignore timer-interrupt override instructions specified by the BIOS
3271	ACPI tables.
3272	\item [ noapic ] Instruct Xen (and domain~0) to ignore any IOAPICs
3273	that are present in the system, and instead continue to use the
3274	legacy PIC.
3275	\end{description}
3276
3277
3278	\section{XenLinux Boot Options}
3279
3280	In addition to the standard Linux kernel boot options, we support:
3281	\begin{description}
3282	\item[ xencons=xxx ] Specify the device node to which the Xen virtual
3283	console driver is attached. The following options are supported:
3284	\begin{center}
3285	\begin{tabular}{l}
3286	`xencons=off': disable virtual console \\
3287	`xencons=tty': attach console to /dev/tty1 (tty0 at boot-time) \\
3288	`xencons=ttyS': attach console to /dev/ttyS0
3289	\end{tabular}
3290	\end{center}
3291	The default is ttyS for dom0 and tty for all other domains.
3292	\end{description}
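
For instance, to attach the virtual console of domain~0 to
\path{/dev/tty1} rather than to the default serial device, the option
can be appended to the XenLinux \path{module} line in
\path{grub.conf}. The kernel file name and root device below are
assumptions used for illustration only:
\begin{quote}
\begin{verbatim}
module /boot/vmlinuz-2.6-xen0 root=/dev/sda4 ro xencons=tty
\end{verbatim}
\end{quote}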
3293
3294
3295	%% Chapter Further Support
3296	\chapter{Further Support}
3297
3298	If you have questions that are not answered by this manual, the
3299	sources of information listed below may be of interest to you. Note
3300	that bug reports, suggestions and contributions related to the
3301	software (or the documentation) should be sent to the Xen developers'
3302	mailing list (address below).
3303
3304
3305	\section{Other Documentation}
3306
3307	For developers interested in porting operating systems to Xen, the
3308	\emph{Xen Interface Manual} is distributed in the \path{docs/}
3309	directory of the Xen source distribution.
3310
3311
3312	\section{Online References}
3313
3314	The official Xen web site can be found at:
3315	\begin{quote} {\tt http://www.xensource.com}
3316	\end{quote}
3317
3318	This contains links to the latest versions of all online
3319	documentation, including the latest version of the FAQ.
3320
3321	Information regarding Xen is also available at the Xen Wiki at
3322	\begin{quote} {\tt http://wiki.xensource.com/xenwiki/}\end{quote}
The Xen project uses Bugzilla as its bug tracking system. You'll find
the Xen Bugzilla at
\begin{quote} {\tt http://bugzilla.xensource.com/bugzilla/}\end{quote}
3325
3326
3327	\section{Mailing Lists}
3328
There are several mailing lists that are used to discuss Xen-related
3330	topics. The most widely relevant are listed below. An official page of
3331	mailing lists and subscription information can be found at \begin{quote}
3332	{\tt http://lists.xensource.com/} \end{quote}
3333
3334	\begin{description}
3335	\item[xen-devel@lists.xensource.com] Used for development
3336	discussions and bug reports. Subscribe at: \\
3337	{\small {\tt http://lists.xensource.com/xen-devel}}
3338	\item[xen-users@lists.xensource.com] Used for installation and usage
3339	discussions and requests for help. Subscribe at: \\
3340	{\small {\tt http://lists.xensource.com/xen-users}}
3341	\item[xen-announce@lists.xensource.com] Used for announcements only.
3342	Subscribe at: \\
3343	{\small {\tt http://lists.xensource.com/xen-announce}}
3344	\item[xen-changelog@lists.xensource.com] Changelog feed
3345	from the unstable and 2.0 trees - developer oriented. Subscribe at: \\
3346	{\small {\tt http://lists.xensource.com/xen-changelog}}
3347	\end{description}
3348
3349
3350
3351	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3352
3353	\appendix
3354
\chapter{Unmodified (VMX) guest domains in Xen with Intel\textregistered\ Virtualization Technology (VT)}
3356
Xen supports guest domains running unmodified guest operating systems using the Virtualization Technology (VT) available on recent Intel processors. More information about Intel Virtualization Technology, which implements Virtual Machine Extensions (VMX) in the processor, is available on the Intel website at \\
3358	{\small {\tt http://www.intel.com/technology/computing/vptech}}
3359
3360	\section{Building Xen with VT support}
3361
3362	The following packages need to be installed in order to build Xen with VT support. Some Linux distributions do not provide these packages by default.
3363
3364	\begin{tabular}{lp{11.0cm}}
3365	{\bfseries Package} & {\bfseries Description} \\
3366
3367	dev86 & The dev86 package provides an assembler and linker for real mode 80x86 instructions. You need to have this package installed in order to build the BIOS code which runs in (virtual) real mode.
3368
3369	If the dev86 package is not available on the x86\_64 distribution, you can install the i386 version of it. The dev86 rpm package for various distributions can be found at {\scriptsize {\tt http://www.rpmfind.net/linux/rpm2html/search.php?query=dev86\&submit=Search}} \\
3370
LibVNCServer & The unmodified guest's VGA display, keyboard, and mouse can be virtualized by the vncserver library. You can get the sources of libvncserver from {\small {\tt http://sourceforge.net/projects/libvncserver}}. Build and install the sources on the build system to get the libvncserver library. Version 0.8 suffers from a significant performance degradation that has been fixed in the current CVS sources, so it is highly recommended that you download and install the latest CVS sources.\\
3372
3373	SDL-devel, SDL & Simple DirectMedia Layer (SDL) is another way of virtualizing the unmodified guest console. It provides an X window for the guest console.
3374
3375	If the SDL and SDL-devel packages are not installed by default on the build system, they can be obtained from {\scriptsize {\tt http://www.rpmfind.net/linux/rpm2html/search.php?query=SDL\&submit=Search}}
3376	, {\scriptsize {\tt http://www.rpmfind.net/linux/rpm2html/search.php?query=SDL-devel\&submit=Search}} \\
3377
3378	\end{tabular}
3379
3380	\section{Configuration file for unmodified VMX guests}
3381
3382	The Xen installation includes a sample configuration file, {\small {\tt /etc/xen/xmexample.vmx}}. There are comments describing all the options. In addition to the common options that are the same as those for paravirtualized guest configurations, VMX guest configurations have the following settings:
3383
3384	\begin{tabular}{lp{11.0cm}}
3385
3386	{\bfseries Parameter} & {\bfseries Description} \\
3387
3388	kernel & The VMX firmware loader, {\small {\tt /usr/lib/xen/boot/vmxloader}}\\
3389
3390	builder & The domain build function. The VMX domain uses the vmx builder.\\
3391
3392	acpi & Enable VMX guest ACPI, default=0 (disabled)\\
3393
3394	apic & Enable VMX guest APIC, default=0 (disabled)\\
3395
3396	pae & Enable VMX guest PAE, default=0 (disabled)\\
3397
3398	vif & Optionally defines MAC address and/or bridge for the network interfaces. Random MACs are assigned if not given. {\small {\tt type=ioemu}} means ioemu is used to virtualize the VMX NIC. If no type is specified, vbd is used, as with paravirtualized guests.\\
3399
3400	disk & Defines the disk devices you want the domain to have access to, and what you want them accessible as. If using a physical device as the VMX guest's disk, each disk entry is of the form
3401
{\small {\tt phy:UNAME,ioemu:DEV,MODE}}

where UNAME is the device, DEV is the device name the domain will see, and MODE is r for read-only or w for read-write. ioemu means the disk will use ioemu to virtualize the VMX disk. If ioemu is omitted, the disk uses vbd, as with paravirtualized guests.

If using a disk image file, the entry takes the form
3407
3408	{\small {\tt file:FILEPATH,ioemu:DEV,MODE}}
3409
3410	If using more than one disk, there should be a comma between each disk entry. For example:
3411
3412	{\scriptsize {\tt disk = ['file:/var/images/image1.img,ioemu:hda,w', 'file:/var/images/image2.img,ioemu:hdb,w']}}\\
3413
cdrom & Disk image for CD-ROM. The default is {\small {\tt /dev/cdrom}} for Domain0. Inside the VMX domain, the CD-ROM will be available as device {\small {\tt /dev/hdc}}. The entry can also point to an ISO file.\\
3415
3416	boot & Boot from floppy (a), hard disk (c) or CD-ROM (d). For example, to boot from CD-ROM, the entry should be:
3417
3418	boot='d'\\
3419
3420	device\_model & The device emulation tool for VMX guests. This parameter should not be changed.\\
3421
3422	sdl & Enable SDL library for graphics, default = 0 (disabled)\\
3423
3424	vnc & Enable VNC library for graphics, default = 1 (enabled)\\
3425
3426	vncviewer & Enable spawning of the vncviewer (only valid when vnc=1), default = 1 (enabled)
3427
If vnc=1 and vncviewer=0, you can use vncviewer to connect to the VMX guest manually from a remote machine. For example:
3429
3430	{\small {\tt vncviewer domain0\_IP\_address:VMX\_domain\_id}} \\
3431
3432	ne2000 & Enable ne2000, default = 0 (disabled; use pcnet)\\
3433
3434	serial & Enable redirection of VMX serial output to pty device\\
3435
3436	\end{tabular}
3437
3438	\begin{tabular}{lp{10cm}}
3439
3440	usb & Enable USB support without defining a specific USB device.
3441	This option defaults to 0 (disabled) unless the option usbdevice is
3442	specified in which case this option then defaults to 1 (enabled).\\
3443
3444	usbdevice & Enable USB support and also enable support for the given
3445	device. Devices that can be specified are {\small {\tt mouse}} (a PS/2 style
3446	mouse), {\small {\tt tablet}} (an absolute pointing device) and
3447	{\small {\tt host:id1:id2}} (a physical USB device on the host machine whose
3448	ids are {\small {\tt id1}} and {\small {\tt id2}}). The advantage
3449	of {\small {\tt tablet}} is that Windows guests will automatically recognize
3450	and support this device so specifying the config line
3451
3452	{\small
3453	\begin{verbatim}
3454	usbdevice='tablet'
3455	\end{verbatim}
3456	}
3457
3458	will create a mouse that works transparently with Windows guests under VNC.
3459	Linux doesn't recognize the USB tablet yet so Linux guests under VNC will
3460	still need the Summagraphics emulation.
3461	Details about mouse emulation are provided in section \textbf{A.4.3}.\\
3462
3463	localtime & Set the real time clock to local time [default=0, that is, set to UTC].\\
3464
3465	enable-audio & Enable audio support. This is under development.\\
3466
3467	full-screen & Start in full screen. This is under development.\\
3468
3469	nographic & Another way to redirect serial output. If enabled, no 'sdl' or 'vnc' can work. Not recommended.\\
3470
3471	\end{tabular}
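
Putting the parameters above together, a VMX guest configuration might
look like the following sketch. The guest name, memory size, image
path and bridge name are assumptions chosen for illustration, and
\path{name} and \path{memory} are among the common options shared with
paravirtualized guests; \path{/etc/xen/xmexample.vmx} remains the
authoritative template.

\begin{scriptsize}
\begin{verbatim}
# Hypothetical VMX guest configuration (all values are illustrative)
kernel  = '/usr/lib/xen/boot/vmxloader'   # VMX firmware loader
builder = 'vmx'                           # use the vmx domain builder
name    = 'vmxguest'                      # assumed guest name
memory  = 256                             # assumed guest memory, in MB

vif   = [ 'type=ioemu, bridge=xenbr0' ]   # emulated NIC; bridge name assumed
disk  = [ 'file:/var/images/guest.img,ioemu:hda,w' ]
cdrom = '/dev/cdrom'
boot  = 'c'                               # boot from hard disk

vnc       = 1                             # VNC graphics (the default)
vncviewer = 1                             # spawn a vncviewer automatically
sdl       = 0                             # SDL graphics disabled
usbdevice = 'tablet'                      # absolute pointing device
localtime = 0                             # real time clock set to UTC
\end{verbatim}
\end{scriptsize}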
3472
3473
3474	\section{Creating virtual disks from scratch}
3475	\subsection{Using physical disks}
If you are using a physical disk or physical disk partition, you need to install a Linux OS on the disk first. Then the boot loader should be installed in the correct place: for example, {\small {\tt /dev/sda}} for booting from the whole disk, or {\small {\tt /dev/sda1}} for booting from partition 1.
3477
3478	\subsection{Using disk image files}
3479	You need to create a large empty disk image file first; then, you need to install a Linux OS onto it. There are two methods you can choose. One is directly installing it using a VMX guest while booting from the OS installation CD-ROM. The other is copying an installed OS into it. The boot loader will also need to be installed.
3480
3481	\subsubsection*{To create the image file:}
3482	The image size should be big enough to accommodate the entire OS. This example assumes the size is 1G (which is probably too small for most OSes).
3483
3484	{\small {\tt \# dd if=/dev/zero of=hd.img bs=1M count=1 seek=1023}}
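
The command seeks past the first 1023 blocks of 1MB and then writes a
single block, so the file has an apparent size of $1023 + 1 = 1024$
blocks, i.e.\ 1G. Assuming the underlying file system supports sparse
files, only the last block actually occupies disk space at first. As a
quick check, the apparent size and the space really used can be
compared:

{\small {\tt \# ls -lh hd.img\\
\# du -h hd.img}}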
3485
3486	\subsubsection*{To directly install Linux OS into an image file using a VMX guest:}
3487
Install Xen and create a VMX guest that uses the empty image file as its disk and boots from the CD-ROM. The installation then proceeds just like a normal Linux OS installation. Before creating the guest, make sure the VMX configuration file contains these two entries:

{\small {\tt cdrom='/dev/cdrom'\\
boot='d'}}
3492
3493	If this method does not succeed, you can choose the following method of copying an installed Linux OS into an image file.
3494
\subsubsection*{To copy an installed OS into an image file:}
Installing directly is the easier way to partition a disk image file and install an OS into it. However, if you want the image to contain a specific, already configured OS, you will most likely want to use this method.
3497
3498	\begin{enumerate}
3499	\item {\bfseries Install a normal Linux OS on the host machine}\\
3500	You can choose any way to install Linux, such as using yum to install Red Hat Linux or YAST to install Novell SuSE Linux. The rest of this example assumes the Linux OS is installed in {\small {\tt /var/guestos/}}.
3501
3502	\item {\bfseries Make the partition table}\\
The image file will be treated as a hard disk, so you should make the partition table in the image file. For example:
3504
3505	{\scriptsize {\tt \# losetup /dev/loop0 hd.img\\
3506	\# fdisk -b 512 -C 4096 -H 16 -S 32 /dev/loop0\\
3507	press 'n' to add new partition\\
3508	press 'p' to choose primary partition\\
3509	press '1' to set partition number\\
press "Enter" to accept the default value of the "First Cylinder" parameter\\
press "Enter" to accept the default value of the "Last Cylinder" parameter\\
3512	press 'w' to write partition table and exit\\
3513	\# losetup -d /dev/loop0}}
3514
3515	\item {\bfseries Make the file system and install grub}\\
3516	{\scriptsize {\tt \# ln -s /dev/loop0 /dev/loop\\
3517	\# losetup /dev/loop0 hd.img\\
3518	\# losetup -o 16384 /dev/loop1 hd.img\\
3519	\# mkfs.ext3 /dev/loop1\\
3520	\# mount /dev/loop1 /mnt\\
3521	\# mkdir -p /mnt/boot/grub\\
3522	\# cp /boot/grub/stage* /boot/grub/e2fs\_stage1\_5 /mnt/boot/grub\\
3523	\# umount /mnt\\
3524	\# grub\\
3525	grub> device (hd0) /dev/loop\\
3526	grub> root (hd0,0)\\
3527	grub> setup (hd0)\\
3528	grub> quit\\
3529	\# rm /dev/loop\\
3530	\# losetup -d /dev/loop0\\
3531	\# losetup -d /dev/loop1}}
3532
The {\small {\tt losetup}} option {\small {\tt -o 16384}} skips the partition table in the image file. The offset is the number of sectors per track (32, as set by {\small {\tt -S 32}} in the {\small {\tt fdisk}} call above) times the 512-byte sector size, i.e.\ $32 \times 512 = 16384$ bytes. We need {\small {\tt /dev/loop}} because grub is expecting a disk device \emph{name}, where \emph{name} represents the entire disk and \emph{name1} represents the first partition.
3534
3535	\item {\bfseries Copy the OS files to the image}\\
If you have Xen installed, you can use {\small {\tt lomount}} instead of {\small {\tt losetup}} and {\small {\tt mount}} when copying files to a partition. {\small {\tt lomount}} only needs the partition information.
3537
3538	{\scriptsize {\tt \# lomount -t ext3 -diskimage hd.img -partition 1 /mnt/guest\\
3539	\# cp -ax /var/guestos/\{root,dev,var,etc,usr,bin,sbin,lib\} /mnt/guest\\
3540	\# mkdir /mnt/guest/\{proc,sys,home,tmp\}}}
3541
3542	\item {\bfseries Edit the {\small {\tt /etc/fstab}} of the guest image}\\
3543	The fstab should look like this:
3544
3545	{\scriptsize {\tt \# vim /mnt/guest/etc/fstab\\
3546	/dev/hda1 / ext3 defaults 1 1\\
3547	none /dev/pts devpts gid=5,mode=620 0 0\\
3548	none /dev/shm tmpfs defaults 0 0\\
3549	none /proc proc defaults 0 0\\
none /sys sysfs defaults 0 0}}
3551
\item {\bfseries Unmount the image file}\\
3553	{\small {\tt \# umount /mnt/guest}}
3554	\end{enumerate}
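The geometry passed to {\small {\tt fdisk}} and the offset passed to {\small {\tt losetup}} in the steps above are related by simple arithmetic. The following Python sketch recomputes both. The function names are illustrative only, and the sketch assumes that the first partition starts at the beginning of the second track, which is what the {\small {\tt -o 16384}} offset implies for this geometry.

```python
SECTOR_SIZE = 512  # bytes per sector, as given to fdisk with -b 512


def image_size_bytes(cylinders, heads, sectors_per_track):
    """Total size of a disk image with the given CHS geometry."""
    return cylinders * heads * sectors_per_track * SECTOR_SIZE


def first_partition_offset(sectors_per_track):
    """Byte offset of the first partition, assuming it starts on track 1."""
    return sectors_per_track * SECTOR_SIZE


if __name__ == "__main__":
    # Geometry from the fdisk example: -C 4096 -H 16 -S 32
    print(image_size_bytes(4096, 16, 32))  # 1073741824 bytes (1 GiB)
    print(first_partition_offset(32))      # 16384, the losetup -o value
```

If you choose a different number of sectors per track, the offset given to {\small {\tt losetup -o}} has to change accordingly.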
3555
Now the guest OS image {\small {\tt hd.img}} is ready. You can also find quickstart images at {\small {\tt http://free.oszoo.org}}, but make sure that a boot loader is installed in any image you use.
3557
3558	\subsection{Install Windows into an Image File using a VMX guest}
3559	In order to install a Windows OS, you should keep {\small {\tt acpi=0}} in your VMX configuration file.
3560
3561	\section{VMX Guests}
3562	\subsection{Editing the Xen VMX config file}
Make a copy of the example VMX configuration file {\small {\tt /etc/xen/xmexample.vmx}} and edit the line that reads
3564
3565	{\small {\tt disk = [ 'file:/var/images/\emph{guest.img},ioemu:hda,w' ]}}
3566
3567	replacing \emph{guest.img} with the name of the guest OS image file you just made.
3568
3569	\subsection{Creating VMX guests}
Simply follow the usual method of creating the guest, providing the filename of your VMX configuration file (with or without the {\small {\tt -f}} parameter):
3571
3572	{\small {\tt \# xend start\\
3573	\# xm create /etc/xen/vmxguest.vmx}}
3574
In the default configuration, VNC is on and SDL is off, so a VNC window opens when a VMX guest is created. If you want to use SDL to display VMX guests, set {\small {\tt sdl=1}} in your VMX configuration file. You can also turn off VNC by setting {\small {\tt vnc=0}}.
3576
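As an illustration, the settings discussed in this section might appear together in a VMX configuration file as follows. Xen configuration files use Python syntax, so the fragment is also valid Python. The image path is a hypothetical example; the display values are the defaults described above.

```python
# Sketch of the disk- and display-related lines of a VMX config file.
# The image name hd.img under /var/images is an assumed example path.
disk = ['file:/var/images/hd.img,ioemu:hda,w']

vnc = 1  # default: a VNC window opens when the guest is created
sdl = 0  # default: SDL is off; set sdl = 1 (and vnc = 0) to use SDL instead
```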
3577	\subsection{Mouse issues, especially under VNC}
3578	Mouse handling when using VNC is a little problematic.
3579	The problem is that the VNC viewer provides a virtual pointer which is
3580	located at an absolute location in the VNC window and only absolute
3581	coordinates are provided.
3582	The VMX device model converts these absolute mouse coordinates
3583	into the relative motion deltas that are expected by the PS/2
3584	mouse driver running in the guest.
3585	Unfortunately,
3586	it is impossible to keep these generated mouse deltas
3587	accurate enough for the guest cursor to exactly match
3588	the VNC pointer.
3589	This can lead to situations where the guest's cursor
3590	is in the center of the screen and there's no way to
3591	move that cursor to the left
3592	(it can happen that the VNC pointer is at the left
3593	edge of the screen and,
3594	therefore,
3595	there are no longer any left mouse deltas that
3596	can be provided by the device model emulation code.)
3597
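The loss of synchronization described above can be reproduced with a small model. The following Python sketch is not the device model's code. It assumes, purely for illustration, that the guest scales each delta by a fixed gain and clamps its cursor to the screen; the function names and the gain parameter are hypothetical.

```python
def to_deltas(abs_positions):
    """Convert successive absolute x coordinates into relative deltas."""
    deltas = []
    last = abs_positions[0]
    for x in abs_positions[1:]:
        deltas.append(x - last)
        last = x
    return deltas


def apply_deltas(start, deltas, width, gain=1.0):
    """Guest-side cursor: scale each delta by gain and clamp to the screen."""
    x = start
    for d in deltas:
        x = min(max(x + int(d * gain), 0), width - 1)
    return x


if __name__ == "__main__":
    # The VNC pointer moves from the center of an 800-pixel screen to x = 0.
    deltas = to_deltas([400, 300, 200, 100, 0])
    print(deltas)                                # [-100, -100, -100, -100]
    # A guest that halves every delta only reaches x = 200.  The pointer is
    # already at the left edge, so no further leftward deltas can be sent.
    print(apply_deltas(400, deltas, 800, 0.5))   # 200
```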
To deal with these mouse issues there are four different
mouse emulations available from the VMX device model:
3600
3601	\begin{description}
3602	\item[PS/2 mouse over the PS/2 port.]
3603	This is the default mouse
3604	that works perfectly well under SDL.
3605	Under VNC the guest cursor will get
3606	out of sync with the VNC pointer.
3607	When this happens you can re-synchronize
3608	the guest cursor to the VNC pointer by
3609	holding down the
3610	\textbf{left-ctl}
3611	and
3612	\textbf{left-alt}
3613	keys together.
3614	While these keys are down VNC pointer motions
3615	will not be reported to the guest so
3616	that the VNC pointer can be moved
3617	to a place where it is possible
3618	to move the guest cursor again.
3619
3620	\item[Summagraphics mouse over the serial port.]
3621	The device model also provides emulation
3622	for a Summagraphics tablet,
3623	an absolute pointer device.
3624	This emulation is provided over the second
3625	serial port,
3626	\textbf{/dev/ttyS1}
3627	for Linux guests and
3628	\textbf{COM2}
3629	for Windows guests.
3630	Unfortunately,
3631	neither Linux nor Windows provides
3632	default support for the Summagraphics
3633	tablet so the guest will have to be
3634	manually configured for this mouse.
3635
3636	\textbf{Linux configuration.}
3637
3638	First,
3639	configure the GPM service to use the Summagraphics tablet.
3640	This can vary between distributions but,
3641	typically,
3642	all that needs to be done is modify the file
3643	\path{/etc/sysconfig/mouse} to contain the lines:
3644
3645	{\small
3646	\begin{verbatim}
3647	MOUSETYPE="summa"
3648	XMOUSETYPE="SUMMA"
3649	DEVICE=/dev/ttyS1
3650	\end{verbatim}
3651	}
3652
3653	and then restart the GPM daemon.
3654
3655	Next,
3656	modify the X11 config
3657	\path{/etc/X11/xorg.conf}
to support the Summagraphics tablet by replacing
3659	the input device stanza with the following:
3660
3661	{\small
3662	\begin{verbatim}
3663	Section "InputDevice"
3664	Identifier "Mouse0"
3665	Driver "summa"
3666	Option "Device" "/dev/ttyS1"
3667	Option "InputFashion" "Tablet"
3668	Option "Mode" "Absolute"
3669	Option "Name" "EasyPen"
3670	Option "Compatible" "True"
3671	Option "Protocol" "Auto"
3672	Option "SendCoreEvents" "on"
3673	Option "Vendor" "GENIUS"
3674	EndSection
3675	\end{verbatim}
3676	}
3677
3678	Restart X and the X cursor should now properly
3679	track the VNC pointer.
3680
3681
3682	\textbf{Windows configuration.}
3683
3684	Get the file
3685	\path{http://www.cad-plan.de/files/download/tw2k.exe}
3686	and execute that file on the guest,
3687	answering the questions as follows:
3688
3689	\begin{enumerate}
3690	\item When the program asks for \textbf{model},
scroll down and select \textbf{SummaSketch (MM Compatible)}.
3692
3693	\item When the program asks for \textbf{COM Port} specify \textbf{com2}.
3694
\item When the program asks for a \textbf{Cursor Type} specify
3696	\textbf{4 button cursor/puck}.
3697
3698	\item The guest system will then reboot and,
3699	when it comes back up,
3700	the guest cursor will now properly track
3701	the VNC pointer.
3702	\end{enumerate}
3703
3704	\item[PS/2 mouse over USB port.]
3705	This is just the same PS/2 emulation except it is
3706	provided over a USB port.
3707	This emulation is enabled by the configuration flag:
3708	{\small
3709	\begin{verbatim}
3710	usbdevice='mouse'
3711	\end{verbatim}
3712	}
3713
3714	\item[USB tablet over USB port.]
3715	The USB tablet is an absolute pointing device
3716	that has the advantage that it is automatically
3717	supported under Windows guests,
3718	although Linux guests still require some
3719	manual configuration.
3720	This mouse emulation is enabled by the
3721	configuration flag:
3722	{\small
3723	\begin{verbatim}
3724	usbdevice='tablet'
3725	\end{verbatim}
3726	}
3727
3728	\textbf{Linux configuration.}
3729
3730	Unfortunately,
3731	there is no GPM support for the
3732	USB tablet at this point in time.
3733	If you intend to use a GPM pointing
3734	device under VNC you should
3735	configure the guest for Summagraphics
3736	emulation.
3737
3738	Support for X11 is available by following
3739	the instructions at\\
3740	\verb+http://stz-softwaretechnik.com/~ke/touchscreen/evtouch.html+\\
3741	with one minor change.
3742	The
3743	\path{xorg.conf}
3744	given in those instructions
uses the wrong values for the X \& Y minimums and maximums;
use the following config stanza instead:
3747
3748	{\small
3749	\begin{verbatim}
3750	Section "InputDevice"
3751	Identifier "Tablet"
3752	Driver "evtouch"
3753	Option "Device" "/dev/input/event2"
3754	Option "DeviceName" "touchscreen"
3755	Option "MinX" "0"
3756	Option "MinY" "0"
3757	Option "MaxX" "32256"
3758	Option "MaxY" "32256"
3759	Option "ReportingMode" "Raw"
3760	Option "Emulate3Buttons"
3761	Option "Emulate3Timeout" "50"
3762	Option "SendCoreEvents" "On"
3763	EndSection
3764	\end{verbatim}
3765	}
3766
3767	\textbf{Windows configuration.}
3768
3769	Just enabling the USB tablet in the
guest's configuration file is sufficient;
Windows will automatically recognize and
3772	configure device drivers for this
3773	pointing device.
3774
3775	\end{description}
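The four emulations can be summarised by whether they report absolute coordinates and by the configuration flag, if any, that enables them. The Python table below only restates the descriptions above; the dictionary keys and the helper function are informal labels chosen for this sketch, not Xen identifiers.

```python
# Each entry: (reports absolute coordinates?, config line that enables it).
# None means that no usbdevice flag is described for that emulation.
MOUSE_EMULATIONS = {
    "ps2": (False, None),                       # default, PS/2 port
    "summagraphics": (True, None),              # second serial port
    "usb-mouse": (False, "usbdevice='mouse'"),  # PS/2 emulation over USB
    "usb-tablet": (True, "usbdevice='tablet'"),
}


def absolute_emulations():
    """Names of the emulations that report absolute coordinates."""
    return sorted(name for name, (absolute, _) in MOUSE_EMULATIONS.items()
                  if absolute)


if __name__ == "__main__":
    print(absolute_emulations())  # ['summagraphics', 'usb-tablet']
```

Only the absolute devices avoid the drift between the guest cursor and the VNC pointer, which is why they are the ones to consider for VNC use.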
3776
3777	\subsection{USB Support}
3778	There is support for an emulated USB mouse,
3779	an emulated USB tablet
3780	and physical low speed USB devices
3781	(support for high speed USB 2.0 devices is
3782	still under development).
3783
3784	\begin{description}
3785	\item[USB PS/2 style mouse.]
3786	Details on the USB mouse emulation are
3787	given in sections
3788	\textbf{A.2}
3789	and
3790	\textbf{A.4.3}.
3791	Enabling USB PS/2 style mouse emulation
3792	is just a matter of adding the line
3793
3794	{\small
3795	\begin{verbatim}
3796	usbdevice='mouse'
3797	\end{verbatim}
3798	}
3799
3800	to the configuration file.
3801	\item[USB tablet.]
3802	Details on the USB tablet emulation are
3803	given in sections
3804	\textbf{A.2}
3805	and
3806	\textbf{A.4.3}.
3807	Enabling USB tablet emulation
3808	is just a matter of adding the line
3809
3810	{\small
3811	\begin{verbatim}
3812	usbdevice='tablet'
3813	\end{verbatim}
3814	}
3815
3816	to the configuration file.
3817	\item[USB physical devices.]
3818	Access to a physical (low speed) USB device
3819	is enabled by adding a line of the form
3820
3821	{\small
3822	\begin{verbatim}
3823	usbdevice='host:vid:pid'
3824	\end{verbatim}
3825	}
3826
into the configuration file.\footnote{
3828	There is an alternate
3829	way of specifying a USB device that
3830	uses the syntax
3831	\textbf{host:bus.addr}
3832	but this syntax suffers from
3833	a major problem that makes
3834	it effectively useless.
3835	The problem is that the
3836	\textbf{addr}
3837	portion of this address
3838	changes every time the USB device
3839	is plugged into the system.
3840	For this reason this addressing
3841	scheme is not recommended and
3842	will not be documented further.
3843	}
3844	\textbf{vid}
3845	and
3846	\textbf{pid}
are the
vendor id and
product id
3850	that uniquely identify
3851	the USB device.
3852	These ids can be identified
3853	in two ways:
3854
3855	\begin{enumerate}
3856	\item Through the control window.
3857	As described in section
3858	\textbf{A.4.6}
3859	the control window
3860	is activated by pressing
3861	\textbf{ctl-alt-2}
3862	in the guest VGA window.
3863	As long as USB support is
3864	enabled in the guest by including
3865	the config file line
3866	{\small
3867	\begin{verbatim}
3868	usb=1
3869	\end{verbatim}
3870	}
3871	then executing the command
3872	{\small
3873	\begin{verbatim}
3874	info usbhost
3875	\end{verbatim}
3876	}
3877	in the control window
3878	will display a list of all
3879	usb devices and their ids.
3880	For example,
3881	this output:
3882	{\small
3883	\begin{verbatim}
3884	Device 1.3, speed 1.5 Mb/s
3885	Class 00: USB device 04b3:310b
3886	\end{verbatim}
3887	}
3888	was created from a USB mouse with
3889	vendor id
3890	\textbf{04b3}
3891	and product id
3892	\textbf{310b}.
3893	This device could be made available
3894	to the VMX guest by including the
3895	config file entry
3896	{\small
3897	\begin{verbatim}
usbdevice='host:04b3:310b'
3899	\end{verbatim}
3900	}
3901
3902	It is also possible to
3903	enable access to a USB
3904	device dynamically through
3905	the control window.
3906	The control window command
3907	{\small
3908	\begin{verbatim}
3909	usb_add host:vid:pid
3910	\end{verbatim}
3911	}
3912	will also allow access to a
3913	USB device with vendor id
3914	\textbf{vid}
3915	and product id
3916	\textbf{pid}.
3917	\item Through the
3918	\path{/proc} file system.
3919	The contents of the pseudo file
3920	\path{/proc/bus/usb/devices}
3921	can also be used to identify
3922	vendor and product ids.
3923	Looking at this file,
3924	the line starting with
3925	\textbf{P:}
3926	has a field
3927	\textbf{Vendor}
3928	giving the vendor id and
3929	another field
3930	\textbf{ProdID}
3931	giving the product id.
3932	The contents of
3933	\path{/proc/bus/usb/devices}
for the example mouse are as
3935	follows:
3936	{\small
3937	\begin{verbatim}
3938	T: Bus=01 Lev=01 Prnt=01 Port=01 Cnt=02 Dev#= 3 Spd=1.5 MxCh= 0
3939	D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1
3940	P: Vendor=04b3 ProdID=310b Rev= 1.60
3941	C:* #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=100mA
3942	I: If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=01 Prot=02 Driver=(none)
3943	E: Ad=81(I) Atr=03(Int.) MxPS= 4 Ivl=10ms
3944	\end{verbatim}
3945	}
3946	Note that the
3947	\textbf{P:}
3948	line correctly identifies the
3949	vendor id and product id
3950	for this mouse as
3951	\textbf{04b3:310b}.
3952	\end{enumerate}
3953	There is one other issue to
3954	be aware of when accessing a
3955	physical USB device from the guest.
3956	The Dom0 kernel must not have
3957	a device driver loaded for
3958	the device that the guest wishes
3959	to access.
3960	This means that the Dom0
3961	kernel must not have that
3962	device driver compiled into
3963	the kernel or,
3964	if using modules,
3965	that driver module must
3966	not be loaded.
Note that it is only the device-specific
USB driver that must
not be loaded;
either the
3971	\textbf{UHCI}
3972	or
3973	\textbf{OHCI}
3974	USB controller driver must
3975	still be loaded.
3976
3977	Going back to the USB mouse
3978	as an example,
3979	if \textbf{lsmod}
3980	gives the output:
3981
3982	{\small
3983	\begin{verbatim}
3984	Module Size Used by
3985	usbmouse 4128 0
3986	usbhid 28996 0
3987	uhci_hcd 35409 0
3988	\end{verbatim}
3989	}
3990
3991	then the USB mouse is being
3992	used by the Dom0 kernel and is
3993	not available to the guest.
3994	Executing the command
3995	\textbf{rmmod usbhid}\footnote{
It turns out that the
\textbf{usbhid}
driver is the significant
one for the USB mouse;
the presence or absence of
4001	the module
4002	\textbf{usbmouse}
4003	has no effect on whether or
4004	not the guest can see a USB mouse.}
4005	will remove the USB mouse
4006	driver from the Dom0 kernel
4007	and the mouse will now be
4008	accessible by the VMX guest.
4009
Be aware that the Linux USB
4011	hotplug system will reload
4012	the drivers if a USB device
4013	is removed and plugged back
4014	in.
4015	This means that just unloading
4016	the driver module might not
4017	be sufficient if the USB device
4018	is removed and added back.
4019	A more reliable technique is
4020	to first
4021	\textbf{rmmod}
4022	the driver and then rename the
4023	driver file in the
4024	\path{/lib/modules}
4025	directory,
4026	just to make sure it doesn't get
4027	reloaded.
4028	\end{description}
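The lookup of vendor and product ids in \path{/proc/bus/usb/devices} is easy to automate. The Python sketch below extracts the ids from the text of each {\small {\tt P:}} line and formats the corresponding {\small {\tt usbdevice}} setting. The function name is hypothetical, and the parsing assumes the field layout shown in the example above.

```python
import re

# Matches the Vendor and ProdID fields of a "P:" line.
_P_LINE = re.compile(r"^P:\s*Vendor=([0-9a-fA-F]{4})\s+ProdID=([0-9a-fA-F]{4})")


def usbdevice_settings(devices_text):
    """Return a usbdevice config line for every P: line in the text."""
    settings = []
    for line in devices_text.splitlines():
        match = _P_LINE.match(line.strip())
        if match:
            vendor, product = match.groups()
            settings.append("usbdevice='host:%s:%s'" % (vendor, product))
    return settings


if __name__ == "__main__":
    sample = "P: Vendor=04b3 ProdID=310b Rev= 1.60"
    print(usbdevice_settings(sample))  # ["usbdevice='host:04b3:310b'"]
```

On a real system the text would be read from \path{/proc/bus/usb/devices} in Dom0, and the list would contain one entry per device, so you still have to pick the entry for the device you want to pass to the guest.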
4029
4030	\subsection{Destroy VMX guests}
4031	VMX guests can be destroyed in the same way as can paravirtualized guests. We recommend that you type the command
4032
4033	{\small {\tt poweroff}}
4034
4035	in the VMX guest's console first to prevent data loss. Then execute the command
4036
4037	{\small {\tt xm destroy \emph{vmx\_guest\_id} }}
4038
4039	at the Domain0 console.
4040
4041	\subsection{VMX window (X or VNC) Hot Key}
4042	If you are running in the X environment after creating a VMX guest, an X window is created. There are several hot keys for control of the VMX guest that can be used in the window.
4043
4044	{\bfseries Ctrl+Alt+2} switches from guest VGA window to the control window. Typing {\small {\tt help }} shows the control commands help. For example, 'q' is the command to destroy the VMX guest.\\
4045	{\bfseries Ctrl+Alt+1} switches back to VMX guest's VGA.\\
4046	{\bfseries Ctrl+Alt+3} switches to serial port output. It captures serial output from the VMX guest. It works only if the VMX guest was configured to use the serial port. \\
4047
4048	\subsection{Save/Restore and Migration}
4049	VMX guests currently cannot be saved and restored, nor migrated. These features are currently under active development.
4050
4051	\chapter{Vnets - Domain Virtual Networking}
4052
4053	Xen optionally supports virtual networking for domains using {\em vnets}.
4054	These emulate private LANs that domains can use. Domains on the same
4055	vnet can be hosted on the same machine or on separate machines, and the
4056	vnets remain connected if domains are migrated. Ethernet traffic
4057	on a vnet is tunneled inside IP packets on the physical network. A vnet is a virtual
4058	network and addressing within it need have no relation to addressing on
4059	the underlying physical network. Separate vnets, or vnets and the physical network,
4060	can be connected using domains with more than one network interface and
4061	enabling IP forwarding or bridging in the usual way.
4062
4063	Vnet support is included in \texttt{xm} and \xend:
4064	\begin{verbatim}
4065	# xm vnet-create <config>
4066	\end{verbatim}
creates a vnet using the configuration in the file \verb|<config>|.
4068	When a vnet is created its configuration is stored by \xend and the vnet persists until it is
4069	deleted using
4070	\begin{verbatim}
4071	# xm vnet-delete <vnetid>
4072	\end{verbatim}
4073	The vnets \xend knows about are listed by
4074	\begin{verbatim}
4075	# xm vnet-list
4076	\end{verbatim}
4077	More vnet management commands are available using the
4078	\texttt{vn} tool included in the vnet distribution.
4079
4080	The format of a vnet configuration file is
4081	\begin{verbatim}
4082	(vnet (id <vnetid>)
4083	(bridge <bridge>)
4084	(vnetif <vnet interface>)
4085	(security <level>))
4086	\end{verbatim}
4087	White space is not significant. The parameters are:
4088	\begin{itemize}
\item \verb|<vnetid>|: vnet id, the 128-bit vnet identifier. This can be given
4090	as 8 4-digit hex numbers separated by colons, or in short form as a single 4-digit hex number.
4091	The short form is the same as the long form with the first 7 fields zero.
4092	Vnet ids must be non-zero and id 1 is reserved.
4093
\item \verb|<bridge>|: the name of a bridge interface to create for the vnet. Domains
4095	are connected to the vnet by connecting their virtual interfaces to the bridge.
4096	Bridge names are limited to 14 characters by the kernel.
4097
\item \verb|<vnetif>|: the name of the virtual interface onto the vnet (optional). The
4099	interface encapsulates and decapsulates vnet traffic for the network and is attached
4100	to the vnet bridge. Interface names are limited to 14 characters by the kernel.
4101
\item \verb|<level>|: security level for the vnet (optional). The level may be one of
4103	\begin{itemize}
\item \verb|none|: no security (default). Vnet traffic is in clear on the network.
\item \verb|auth|: authentication. Vnet traffic is authenticated using IPSEC
4106	ESP with hmac96.
\item \verb|conf|: confidentiality. Vnet traffic is authenticated and encrypted
4108	using IPSEC ESP with hmac96 and AES-128.
4109	\end{itemize}
4110	Authentication and confidentiality are experimental and use hard-wired keys at present.
4111	\end{itemize}
The interfaces and bridges used by vnets
are visible in the output of \texttt{ifconfig} and \texttt{brctl show}.
4115
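The relation between the short and long forms of a vnet id can be written down directly. The following Python sketch expands an id to its long form and rejects the values that the text above rules out. The function name is hypothetical, and the sketch assumes that fields shorter than four digits are zero-padded.

```python
def expand_vnet_id(vnet_id):
    """Expand a short or long vnet id to eight 4-digit hex fields."""
    fields = str(vnet_id).split(":")
    if len(fields) == 1:
        # Short form: the first seven fields are zero.
        fields = ["0"] * 7 + fields
    if len(fields) != 8:
        raise ValueError("expected 1 or 8 colon-separated fields")
    values = [int(field, 16) for field in fields]
    if any(not 0 <= value <= 0xFFFF for value in values):
        raise ValueError("each field must fit in 4 hex digits")
    # Combine the fields into the 128-bit identifier to check its value.
    total = 0
    for value in values:
        total = (total << 16) | value
    if total == 0:
        raise ValueError("vnet ids must be non-zero")
    if total == 1:
        raise ValueError("vnet id 1 is reserved")
    return ":".join("%04x" % value for value in values)


if __name__ == "__main__":
    print(expand_vnet_id("97"))  # 0000:0000:0000:0000:0000:0000:0000:0097
```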
4116	\section{Example}
4117	If the file \path{vnet97.sxp} contains
4118	\begin{verbatim}
4119	(vnet (id 97) (bridge vnet97) (vnetif vnif97)
4120	(security none))
4121	\end{verbatim}
then \texttt{xm vnet-create vnet97.sxp} will define a vnet with id 97 and no security.
4123	The bridge for the vnet is called vnet97 and the virtual interface for it is vnif97.
4124	To add an interface on a domain to this vnet set its bridge to vnet97
4125	in its configuration. In Python:
4126	\begin{verbatim}
4127	vif="bridge=vnet97"
4128	\end{verbatim}
4129	In sxp:
4130	\begin{verbatim}
4131	(dev (vif (mac aa:00:00:01:02:03) (bridge vnet97)))
4132	\end{verbatim}
4133	Once the domain is started you should see its interface in the output of \texttt{brctl show}
4134	under the ports for \texttt{vnet97}.
4135
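Configuration files like \path{vnet97.sxp} are regular enough to generate. The Python sketch below renders a vnet configuration in the format given earlier. The function name and its parameters are hypothetical, and the result still has to be saved to a file and passed to \texttt{xm vnet-create} by hand.

```python
def vnet_config(vnet_id, bridge, vnetif=None, security="none"):
    """Render a vnet configuration in sxp form."""
    if security not in ("none", "auth", "conf"):
        raise ValueError("security must be none, auth or conf")
    parts = ["(id %s)" % vnet_id, "(bridge %s)" % bridge]
    if vnetif is not None:  # the vnetif entry is optional
        parts.append("(vnetif %s)" % vnetif)
    parts.append("(security %s)" % security)
    return "(vnet %s)" % " ".join(parts)


if __name__ == "__main__":
    # Reproduces the vnet97.sxp example (white space is not significant).
    print(vnet_config(97, "vnet97", "vnif97"))
```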
4136	To get best performance it is a good idea to reduce the MTU of a domain's interface
4137	onto a vnet to 1400. For example using \texttt{ifconfig eth0 mtu 1400} or putting
4138	\texttt{MTU=1400} in \texttt{ifcfg-eth0}.
4139	You may also have to change or remove cached config files for eth0 under
4140	\texttt{/etc/sysconfig/networking}. Vnets work anyway, but performance can be reduced
4141	by IP fragmentation caused by the vnet encapsulation exceeding the hardware MTU.
4142
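The fragmentation argument can be made concrete with a little arithmetic. In the Python sketch below the per-packet encapsulation overhead is a parameter, because its exact size is not stated here. The value used in the example is a placeholder, not a measured figure; the function name and the 1500-byte hardware MTU are likewise assumptions.

```python
def needs_fragmentation(guest_mtu, overhead, hardware_mtu=1500):
    """True if an encapsulated full-size guest packet exceeds the hardware MTU."""
    return guest_mtu + overhead > hardware_mtu


if __name__ == "__main__":
    ASSUMED_OVERHEAD = 60  # placeholder value in bytes, for illustration only
    print(needs_fragmentation(1500, ASSUMED_OVERHEAD))  # True
    print(needs_fragmentation(1400, ASSUMED_OVERHEAD))  # False
```

With a guest MTU of 1400, any encapsulation overhead of up to 100 bytes fits within a 1500-byte hardware MTU.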
4143	\section{Installing vnet support}
4144	Vnets are implemented using a kernel module, which needs to be loaded before
4145	they can be used. You can either do this manually before starting \xend, using the
4146	command \texttt{vn insmod}, or configure \xend to use the \path{network-vnet}
script in the xend configuration file \texttt{/etc/xen/xend-config.sxp}:
4148	\begin{verbatim}
4149	(network-script network-vnet)
4150	\end{verbatim}
4151	This script insmods the module and calls the \path{network-bridge} script.
4152
4153	The vnet code is not compiled and installed by default.
4154	To compile the code and install on the current system
4155	use \texttt{make install} in the root of the vnet source tree,
4156	\path{tools/vnet}. It is also possible to install to an installation
4157	directory using \texttt{make dist}. See the \path{Makefile} in
4158	the source for details.
4159
4160	The vnet module creates vnet interfaces \texttt{vnif0002},
4161	\texttt{vnif0003} and \texttt{vnif0004} by default. You can test that
4162	vnets are working by configuring IP addresses on these interfaces
4163	and trying to ping them across the network. For example, using machines
4164	hostA and hostB:
4165	\begin{verbatim}
4166	hostA# ifconfig vnif0004 10.0.0.100 up
4167	hostB# ifconfig vnif0004 10.0.0.101 up
4168	hostB# ping 10.0.0.100
4169	\end{verbatim}
4170
4171	The vnet implementation uses IP multicast to discover vnet interfaces, so
4172	all machines hosting vnets must be reachable by multicast. Network switches
4173	are often configured not to forward multicast packets, so this often
4174	means that all machines using a vnet must be on the same LAN segment,
4175	unless you configure vnet forwarding.
4176
4177	You can test multicast coverage by pinging the vnet multicast address:
4178	\begin{verbatim}
4179	# ping -b 224.10.0.1
4180	\end{verbatim}
4181	You should see replies from all machines with the vnet module running.
4182	You can see if vnet packets are being sent or received by dumping traffic
4183	on the vnet UDP port:
4184	\begin{verbatim}
4185	# tcpdump udp port 1798
4186	\end{verbatim}
4187
If multicast is not being forwarded between machines you can configure
4189	multicast forwarding using vn. Suppose we have machines hostA on 10.10.0.100
4190	and hostB on 10.11.0.100 and that multicast is not forwarded between them.
4191	We use vn to configure each machine to forward to the other:
4192	\begin{verbatim}
4193	hostA# vn peer-add hostB
4194	hostB# vn peer-add hostA
4195	\end{verbatim}
4196	Multicast forwarding needs to be used carefully - you must avoid creating forwarding
4197	loops. Typically only one machine on a subnet needs to be configured to forward,
4198	as it will forward multicasts received from other machines on the subnet.
4199
4200	%% Chapter Glossary of Terms moved to glossary.tex
4201	\chapter{Glossary of Terms}
4202
4203	\begin{description}
4204
4205	\item[Domain] A domain is the execution context that contains a
4206	running {\bf virtual machine}. The relationship between virtual
4207	machines and domains on Xen is similar to that between programs and
4208	processes in an operating system: a virtual machine is a persistent
4209	entity that resides on disk (somewhat like a program). When it is
4210	loaded for execution, it runs in a domain. Each domain has a {\bf
4211	domain ID}.
4212
4213	\item[Domain 0] The first domain to be started on a Xen machine.
4214	Domain 0 is responsible for managing the system.
4215
4216	\item[Domain ID] A unique identifier for a {\bf domain}, analogous to
4217	a process ID in an operating system.
4218
4219	\item[Full virtualization] An approach to virtualization which
4220	requires no modifications to the hosted operating system, providing
4221	the illusion of a complete system of real hardware devices.
4222
4223	\item[Hypervisor] An alternative term for {\bf VMM}, used because it
4224	means `beyond supervisor', since it is responsible for managing
4225	multiple `supervisor' kernels.
4226
4227	\item[Live migration] A technique for moving a running virtual machine
4228	to another physical host, without stopping it or the services
4229	running on it.
4230
4231	\item[Paravirtualization] An approach to virtualization which requires
4232	modifications to the operating system in order to run in a virtual
4233	machine. Xen uses paravirtualization but preserves binary
4234	compatibility for user space applications.
4235
4236	\item[Shadow pagetables] A technique for hiding the layout of machine
4237	memory from a virtual machine's operating system. Used in some {\bf
VMMs} to provide the illusion of contiguous physical memory; in
Xen this is used during {\bf live migration}.
4240
\item[Virtual Block Device] Persistent storage available to a virtual
4242	machine, providing the abstraction of an actual block storage device.
4243	{\bf VBD}s may be actual block devices, filesystem images, or
4244	remote/network storage.
4245
4246	\item[Virtual Machine] The environment in which a hosted operating
4247	system runs, providing the abstraction of a dedicated machine. A
virtual machine may be identical to the underlying hardware (as in
{\bf full virtualization}), or it may differ (as in {\bf
paravirtualization}).
4251
4252	\item[VMM] Virtual Machine Monitor - the software that allows multiple
4253	virtual machines to be multiplexed on a single physical machine.
4254
4255	\item[Xen] Xen is a paravirtualizing virtual machine monitor,
4256	developed primarily by the Systems Research Group at the University
4257	of Cambridge Computer Laboratory.
4258
4259	\item[XenLinux] A name for the port of the Linux kernel that
4260	runs on Xen.
4261
4262	\end{description}
4263
4264
4265	\end{document}
4266
4267
4268	%% Other stuff without a home
4269
4270	%% Instructions Re Python API
4271
4272	%% Other Control Tasks using Python
4273	%% ================================
4274
4275	%% A Python module 'Xc' is installed as part of the tools-install
4276	%% process. This can be imported, and an 'xc object' instantiated, to
4277	%% provide access to privileged command operations:
4278
4279	%% # import Xc
4280	%% # xc = Xc.new()
4281	%% # dir(xc)
4282	%% # help(xc.domain_create)
4283
4284	%% In this way you can see that the class 'xc' contains useful
4285	%% documentation for you to consult.
4286
4287	%% A further package of useful routines (xenctl) is also installed:
4288
4289	%% # import xenctl.utils
4290	%% # help(xenctl.utils)
4291
4292	%% You can use these modules to write your own custom scripts or you
4293	%% can customise the scripts supplied in the Xen distribution.
4294
4295
4296
4297	% Explain about AGP GART
4298
4299
4300	%% If you're not intending to configure the new domain with an IP
4301	%% address on your LAN, then you'll probably want to use NAT. The
4302	%% 'xen_nat_enable' installs a few useful iptables rules into domain0
4303	%% to enable NAT. [NB: We plan to support RSIP in future]
4304
4305
4306
4307	%% Installing the file systems from the CD
4308	%% =======================================
4309
4310	%% If you haven't got an existing Linux installation onto which you
4311	%% can just drop down the Xen and Xenlinux images, then the file
4312	%% systems on the CD provide a quick way of doing an install. However,
4313	%% you would be better off in the long run doing a proper install of
4314	%% your preferred distro and installing Xen onto that, rather than
4315	%% just doing the hack described below:
4316
4317	%% Choose one or two partitions, depending on whether you want a
4318	%% separate /usr or not. Make file systems on it/them e.g.:
4319	%% mkfs -t ext3 /dev/hda3
4320	%% [or mkfs -t ext2 /dev/hda3 && tune2fs -j /dev/hda3 if using an old
4321	%% version of mkfs]
4322
4323	%% Next, mount the file system(s) e.g.:
4324	%% mkdir /mnt/root && mount /dev/hda3 /mnt/root
4325	%% [mkdir /mnt/usr && mount /dev/hda4 /mnt/usr]
4326
4327	%% To install the root file system, simply untar /usr/XenDemoCD/root.tar.gz:
4328	%% cd /mnt/root && tar -zxpf /usr/XenDemoCD/root.tar.gz
4329
4330	%% You'll need to edit /mnt/root/etc/fstab to reflect your file system
4331	%% configuration. Changing the password file (etc/shadow) is probably a
4332	%% good idea too.
4333
4334	%% To install the usr file system, copy the file system from CD on
4335	%% /usr, though leaving out the "XenDemoCD" and "boot" directories:
4336	%% cd /usr && cp -a X11R6 etc java libexec root src bin dict kerberos
4337	%% local sbin tmp doc include lib man share /mnt/usr
4338
4339	%% If you intend to boot off these file systems (i.e. use them for
4340	%% domain 0), then you probably want to copy the /usr/boot
4341	%% directory on the cd over the top of the current symlink to /boot
4342	%% on your root filesystem (after deleting the current symlink)
4343	%% i.e.:
4344	%% cd /mnt/root ; rm boot ; cp -a /usr/boot .
