<chapter id="gfwrc"><title>Sun xVM Hypervisor System Requirements</title><highlights><para>This chapter introduces <trademark>Sun</trademark> xVM concepts and
provides an overview of the Sun xVM hypervisor. System support to meet hypervisor
requirements is also discussed. The hypervisor is based on an open source
project, Xen.</para><para>For more information about using the hypervisor and the xVM architecture,
see:</para><itemizedlist><listitem><para><ulink url="http://www.opensolaris.org/os/community/xen" type="text_url">The OpenSolaris community</ulink></para>
</listitem><listitem><para><ulink url="http://www.xen.org" type="text_url">The home of
the open source Xen project</ulink> </para>
</listitem>
</itemizedlist>
</highlights><sect1 id="gexug"><title>Supported Hardware</title><para>x64 and x86 based systems are supported.</para>
</sect1><sect1 id="geyyr"><title>Supported Configurations</title><para>Supported virtual machine configurations
include the following:</para><itemizedlist><listitem><para>Solaris control domain (domain 0), and the following guests:</para><itemizedlist><listitem><para>Solaris guest domain (domain U, for unprivileged). The currently
supported Solaris guest operating system is the Solaris Express Developer
Edition 1/08 release or later. Support for a Solaris 10 guest domain is planned
to begin with a future update, targeted for release in 2008.</para>
</listitem><listitem><para>Linux guest domain</para>
</listitem><listitem><para>FreeBSD guest domain</para>
</listitem><listitem><para>Windows guest domain</para><caution><para>Windows HVM domains are susceptible to viruses and
worms. Comply with your site's security policies and take all necessary
steps to keep your network secure.</para>
</caution>
</listitem>
</itemizedlist>
</listitem><listitem><para>32-bit and 64-bit Solaris.</para>
</listitem><listitem><para>Multiprocessor control domain and guest domains.</para>
</listitem>
</itemizedlist><para>The following information applies to the control domain:</para><itemizedlist><listitem><para>ISA floppy is not supported.</para>
</listitem><listitem><para>For 32-bit systems, the processor must support physical address
extension (PAE) mode.</para>
</listitem><listitem><para>The NIC must support GLDv3. These devices include <literal>bge</literal>, <literal>e1000g</literal>, <literal>xge</literal>, <literal>nge</literal>, and <literal>rge</literal>. For more information on GLDv3 interfaces, see <olink targetdoc="sysadv3" targetptr="gaugz" remap="external"><citetitle remap="section">Solaris OS
Interface Types</citetitle> in <citetitle remap="book">System Administration
Guide: IP Services</citetitle></olink>.</para>
</listitem>
</itemizedlist>
</sect1><sect1 id="gfohs"><title>Introduction to the Sun xVM Hypervisor</title><para>The hypervisor is a virtualization system for the Solaris operating
system on x86 machines. Unlike virtualization using zones, each guest domain
runs a full instance of an operating system.</para><para>The hypervisor can securely execute multiple virtual machines simultaneously
on a single x86 compatible computer, with each virtual machine running its
own operating system.</para><para>Each virtual machine instance is called a <emphasis>domain</emphasis>.
There are two kinds of domains: the control domain and the guest domain. The
control domain is also known as domain 0, or dom0. A guest domain, which runs
a guest operating system without privileges, is also called domain U, or domU.</para>
</sect1><sect1 id="gfoih"><title>The Sun xVM Hypervisor and the Control Domain</title><para>The hypervisor
is responsible for controlling and executing each of the domains and runs
with full privileges. The control tools for managing the domains run under
a specialized control domain, also referred to as domain 0 (dom0).</para><para>The hypervisor virtualizes the system's hardware. The hypervisor transparently
shares and partitions the system's CPUs, memory, and NIC resources among the
user domains. The hypervisor performs the low-level work required to provide
a virtualized platform for operating systems.</para><para>The hypervisor relies primarily on the control domain to determine the following:</para><itemizedlist><listitem><para>Which guest domains are created.</para>
</listitem><listitem><para>Which resources the guest domains can access.</para>
</listitem><listitem><para>How much memory a given guest domain can have.</para>
</listitem><listitem><para>Which devices a guest domain can access. Because xVM does
not include any device drivers, the hypervisor delegates control of the machine's
physical devices to the control domain, domain 0.</para>
</listitem>
</itemizedlist><para>Thus, by default, only the control domain has access to physical devices.
The guest domains running on the host are presented with virtualized devices.
Each guest domain interacts with the virtualized devices in the same way that the
domain would interact with the physical devices. Also see <olink targetptr="gefwv" remap="internal">Resource Virtualization to Enable Interoperability</olink>.</para><para>The following figure shows the Sun xVM configuration.</para><figure id="ggcly"><title>Sun xVM Configuration</title><mediaobject><imageobject><imagedata entityref="xVM.Configuration"/>
</imageobject><textobject><simpara>Figure shows domains and the hypervisor layer.</simpara>
</textobject>
</mediaobject>
</figure>
</sect1><sect1><title>How Hypervisors Work</title><para>A hypervisor is a software system that partitions a single physical
machine into multiple virtual machines, to provide server consolidation and
utility computing. Existing applications and binaries run unmodified.</para><para>The hypervisor presents a virtual machine to guests. The hypervisor
forms a layer between the software running in the virtual machine and the
hardware. This separation enables the hypervisor to control how guest operating
systems running inside a virtual machine use hardware resources.</para><sect2 id="gfvwj"><title>Uniform View of Hardware</title><para>A hypervisor provides a uniform view of underlying hardware. Machines
from different vendors with different I/O subsystems appear to be the same
machine, which means that virtual machines can run on any available supported
computer. Thus, administrators can view hardware as a pool of resources that
can run arbitrary services on demand. Because the hypervisor also encapsulates
a virtual machine's software state, the hypervisor layer can map and remap
virtual machines to available hardware resources at any time and also use
live migration to move virtual machines across computers. These capabilities
can also be used for load balancing among a collection of machines, dealing
with hardware failures, and scaling systems. When a computer fails and must
go offline or when a new machine comes online, the hypervisor layer can remap
virtual machines accordingly. Virtual machines are also easy to replicate,
which allows administrators to bring new services online as needed.</para>
</sect2><sect2 id="gfvvy"><title>Using Domain Capabilities</title><sect3 id="gfxmj"><title>Containment</title><para>Containment gives administrators a general-purpose undo capability.
Administrators can suspend a virtual machine and resume it at any time, or
checkpoint a virtual machine and roll it back to a previous execution state.
With this capability, systems can more easily recover from crashes or configuration
errors. See <olink targetptr="gfwlu" remap="internal">Recovery</olink>.</para><para>Containment also supports a very flexible mobility model. Users can
copy a suspended virtual machine over a network or store and transport it
on removable media. The hypervisor provides total mediation of all interactions
between the virtual machine and underlying hardware, thus allowing strong
isolation between virtual machines and supporting the multiplexing of many
virtual machines on a single hardware platform. The hypervisor can consolidate
several physical machines with low rates of utilization as virtual systems
on a single computer, thereby lowering hardware costs and space requirements.</para>
</sect3><sect3 id="gfxnc"><title>Security</title><para>Strong isolation is also valuable for reliability and security. Applications
that previously ran together on one machine can now be separated on different
virtual machines. If one application experiences a fault, the other applications
are isolated from the fault and are not affected. Further, if a
virtual machine is compromised, the damage is confined to that
virtual machine.</para>
</sect3>
</sect2>
</sect1><sect1 id="gefwv"><title>Resource Virtualization to Enable Interoperability</title><para>As a key component of virtual machines, the hypervisor provides a layer
between software environments and physical hardware that has the following
characteristics:</para><itemizedlist><listitem><para>Programmable and transparent to the software above it</para>
</listitem><listitem><para>Makes efficient use of the hardware below it</para>
</listitem>
</itemizedlist><para>Virtualization provides a way to bypass interoperability constraints.
Virtualizing a system or component such as a processor, memory, or an I/O
device at a given abstraction level maps its interface and visible resources
onto the interface and resources of an underlying, possibly different, real
system. Consequently, the real system appears as a different virtual system
or even as multiple virtual systems.</para><para>The hypervisor assigns one or more virtual CPUs (VCPUs) to each domain.
Each VCPU contains all the state one would typically associate with a physical
CPU, such as registers, flags, and timestamps. A VCPU in xVM is an entity
that can be scheduled, like a thread in Solaris. When it is a domain's turn
to run on a CPU, xVM loads the physical CPU with the state in the VCPU, and
lets it run. Solaris treats each VCPU as it would a physical CPU. When the
hypervisor selects a VCPU to run, the VCPU runs the thread that Solaris
loaded on it.</para>
</sect1><sect1 id="gfvvm"><title>Sun xVM Hypervisor Scheduler</title><para>The hypervisor schedules running domains (including domain 0)
onto the set of physical CPUs as configured. The scheduler is constrained
by configuration specifications such as the following.</para><itemizedlist><listitem><para>Domain configuration, such as the number of virtual CPUs allocated.</para>
</listitem><listitem><para>Runtime configuration, such as pinning virtual CPUs to physical
CPUs. Pinning CPUs ensures that certain VCPUs can only run on certain physical
CPUs.</para>
</listitem>
</itemizedlist><para>The default domain scheduler for the hypervisor is the <emphasis>credit
scheduler</emphasis>. This is a fair-share domain scheduler that balances
virtual CPUs of domains across the allowed set of physical CPUs according
to workload. No CPU will be idle if a domain has work to do and wants to run.</para><para>The scheduler is configured through the <command>xm sched-credit</command> command
described in the <olink targetdoc="group-refman" targetptr="xm-1m" remap="external"><citerefentry><refentrytitle>xm</refentrytitle><manvolnum>1M</manvolnum></citerefentry></olink> man
page.</para><para>The following parameters are used to configure the scheduler:</para><variablelist><varlistentry><term><option>d</option> <replaceable>domain</replaceable>, <literal>--domain=</literal><replaceable>domain</replaceable></term><listitem><para>Domain for which to set scheduling parameters.</para>
</listitem>
</varlistentry><varlistentry><term><option>c</option> <replaceable>cap</replaceable>, <literal>--cap=</literal><replaceable>cap</replaceable></term><listitem><para>The maximum amount of CPU the domain can consume. A value
of zero, which is the default, means no maximum is set. The value is expressed
as a percentage of one physical CPU. A value of 100 corresponds
to one full CPU. Thus, a value of 50 specifies a cap of half a physical CPU.</para>
</listitem>
</varlistentry><varlistentry><term><option>w</option> <replaceable>weight</replaceable>, <literal>--weight=</literal><replaceable>weight</replaceable></term><listitem><para>The relative weight, or importance, of the domain. A domain
with twice the weight of the other domains receives double the CPU time of
those other domains when CPU use is in contention. Valid weights are in the
range <literal>1</literal>-<literal>65535</literal>. The default
weight is <literal>256</literal>.</para>
</listitem>
</varlistentry>
</variablelist><example id="gfxoo"><title><command>xm sched-credit</command> Configuration</title><para>The following line configures scheduling parameters for the domain <literal>sol1</literal>. The domain has a weight of 500 and a cap of 100, which corresponds to one full CPU.</para><screen>xm sched-credit -d sol1 -w 500 -c 100</screen><para>If the command is used without the <option>w</option> and <option>c</option> options,
the current settings for the given domain are shown.</para>
</example>
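<example><title>Estimating CPU Share From Weights (Sketch)</title><para>As a rough illustration of how relative weights translate into CPU time under contention, the following shell sketch computes a domain's proportional share from its weight and the sum of all weights. The function name <literal>credit_share</literal> and the sample weights are hypothetical. The sketch shows simple proportional arithmetic only. It is not the credit scheduler's actual accounting, and it ignores caps, pinning, and the number of VCPUs.</para><screen># credit_share WEIGHT TOTAL_WEIGHT
# Prints the integer percentage of contended CPU time that a domain
# with the given weight would receive under pure proportional sharing.
credit_share() {
    echo $(( $1 * 100 / $2 ))
}

# Hypothetical system: two domains at the default weight of 256
# and one domain at weight 512.
total=$(( 256 + 256 + 512 ))
credit_share 512 "$total"    # prints 50
credit_share 256 "$total"    # prints 25</screen><para>In this sketch, the domain with twice the weight receives twice the share of each default-weight domain, which matches the description of the weight parameter.</para>
</example>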
</sect1><sect1 id="gexwg"><title>Supported Virtualization Modes</title><para>There are two basic types of virtualization:
full virtualization and paravirtualization. The hypervisor supports both modes.</para><para>Full virtualization allows any x86 operating system, including Solaris,
Linux, or Windows systems, to run in a guest domain.</para><para>Paravirtualization requires changes to the operating system. Only specific
operating systems can be hosted in a paravirtualized guest domain. Currently
these systems are limited to Solaris, Linux, and FreeBSD.</para><para>A system can have both paravirtualized and fully virtualized domains
running simultaneously.</para><para>For paravirtualized mode and for all types of operating systems, the
only requirement is that the operating system be modified to support the virtual
device interfaces.</para><sect2 id="gfvwl"><title>Overview of Paravirtualization</title><para>In the more lightweight paravirtualization, the operating system
is both aware of the virtualization layer and modified to support it, which
results in higher performance.</para><para>The paravirtualized guest domain operating system is ported to run on
top of the hypervisor and uses virtual network, disk, and console devices.</para><para>Because the control domain must work closely with the hypervisor layer,
the control domain is always paravirtualized. Guest domains can be either paravirtualized
or fully virtualized, and a system can have both types running simultaneously.</para>
</sect2><sect2 id="gfvve"><title>Devices and Drivers in the Paravirtualization Mode</title><para>Since paravirtualization requires changes to
the OS, only specific operating systems can be hosted in a paravirtualized
guest domain. Currently those are limited to Solaris, Linux, and FreeBSD.</para><para>With paravirtualization, each device, such as a networking interface,
is presented as a fully virtual interface, and specific drivers are required
for it. Each virtual device is associated with a physical device, and its driver
is split into two parts.</para><para>A frontend driver runs in the guest domain and communicates over a virtual
data interface to a backend driver. The backend driver currently runs in domain
0 and communicates with both the frontend driver and the physical device the
driver controls. This arrangement enables a guest domain to use a network card
on the host, store data on a host disk drive, and perform other such tasks.</para><para>xVM in Solaris currently supports two main split drivers used for I/O.
Networking is done by using the xVM networking backend (<literal>xnb</literal>)
drivers. Solaris or other operating system guest domains use <literal>xnb</literal> to transmit
and receive networking traffic. Typically, a physical NIC, either shared or
dedicated, is used for communicating with the guest domains. Solaris xVM provides
networking access to guest domains by means of MAC-based virtual network switching.</para><para>Block I/O is provided by the xVM disk backend (<literal>xdb</literal>)
driver, which provides virtual disk access to guest domains. In the control
domain, the disk storage can be in a file, a ZFS volume, or a physical device.</para>
</sect2><sect2 id="ggdcl"><title>Virtual NICs</title><para>A single physical NIC can be carved into multiple VNICs, which can be
assigned to different zones or Solaris xVM instances running on the same system.
VNICs are managed using the <command>dladm</command> command line utility
described in the <olink targetdoc="group-refman" targetptr="dladm-1m" remap="external"><citerefentry><refentrytitle>dladm</refentrytitle><manvolnum>1M</manvolnum></citerefentry></olink> man
page. </para>
</sect2><sect2 id="gfwln"><title>Drivers for Solaris Running as a Guest</title><para>When running as a guest domain, Solaris xVM uses the xVM networking
frontend (<literal>xnf</literal>) and xVM disk frontend (<literal>xdf</literal>)
drivers to talk to the relevant backend drivers.</para><para>In addition to these drivers, the Solaris console is virtualized when
the Solaris system is running as a guest domain. The console driver interacts
with the <olink targetdoc="group-refman" targetptr="xenconsoled-1m" remap="external"><citerefentry><refentrytitle>xenconsoled</refentrytitle><manvolnum>1M</manvolnum></citerefentry></olink> daemon running in domain 0 to provide console access.</para>
</sect2><sect2 id="gfvva"><title>Overview of Full Virtualization</title><para>In
full virtualization, the operating system is not aware that it is running
in a virtualized environment under xVM. A fully virtualized guest domain is
referred to as a hardware-assisted virtual machine (HVM). An HVM guest domain
runs an unmodified operating system.</para><para>Fully virtualized guest domains are supported under xVM with virtualization
extensions available on Intel VT or AMD Secure Virtual Machine (SVM) processors.
These extensions must be present and enabled. Some BIOS versions disable the
extensions by default.</para><note><para>Full virtualization requires that the hypervisor transparently
intercept many operations that an operating system typically performs directly
on the hardware. This interception allows the hypervisor to ensure that a
domain cannot read or modify another domain's memory, cannot interfere with
its device access, and cannot shut down the CPUs it is using.</para>
</note>
</sect2>
</sect1><sect1 id="gezcw"><title>About Domains</title><para>The control domain and the guest domain are separate entities.</para><para>Each domain has a name and a UUID. Domains can be renamed but typically
retain the same UUID.</para><para>A domain ID is an integer that is specific to a running instance. The
ID changes whenever a guest domain is booted. A domain must be running to
have a domain ID.</para><sect2 id="gfwzx"><title>Control Domain 0</title><para>The control domain is a version of Solaris modified to run under the
xVM hypervisor. When the control domain is running, the control tools are
enabled. In most other respects, the control domain 0 instance runs and behaves
like an unmodified instance of the Solaris Operating System.</para><para>Other than by login, you cannot access a guest domain from the control
domain. A control domain should be reserved for system management work associated
with running a hypervisor. This means, for example, that users should not
have logins on the control domain. The control domain provides shared access
to a physical network interface to the guest domains, which have no direct
access to physical devices.</para><para>If a control domain crashes with a standard Solaris panic, the dump
will include just the control domain. Also see <olink targetptr="gfwug" remap="internal">About
Crash Dumps</olink>.</para>
</sect2><sect2 id="ggbfe"><title>Space Requirements</title><para>Control domain 0 space requirements are provided in the <citetitle>Solaris
Express Release and Installation Collection</citetitle> on <literal>docs.sun.com</literal>.</para><para>You will need to allocate sufficient disk space at the time the Solaris
control domain system is installed. A full Solaris install (<literal>SUNWcxall</literal>)
consumes approximately 4 Gbytes of disk space.</para>
</sect2><sect2 id="gfvtp"><title>Guest Domain Space Requirements</title><para>Domain U is a guest, or unprivileged, domain.</para><para>Size a guest domain as you would size a physical machine that runs the same workload.</para><para>The virtual disk requirement is dependent on the guest operating system
and software groups that you install.</para>
</sect2>
</sect1><sect1 id="ggbec"><title>Solaris Guest Domain U</title><para>A Solaris guest domain works like a normal Solaris Operating System.
All of the expected tools are available.</para><para>You can create <trademark>Solaris</trademark> Zones and branded zones
in a Solaris guest domain.</para><sect2 id="gfxyc"><title>Planning for Solaris Guest Domain U Upgrade</title><para>If you install a guest domain with the Solaris defaults, you will not
be able to upgrade that guest domain later because the Solaris installation
will not create enough space in the root partition. If you want the ability
to upgrade the guest domain, increase the root partition to 1.5 Gbytes.</para>
</sect2>
</sect1><sect1 id="gfbse"><title>Domain States</title><para>A domain can be in one of six states. States are shown in <command>virsh list</command> and <command>xm vcpu-list</command> displays.</para><para>For example:</para><screen># <userinput>xm vcpu-list</userinput>
Name      ID  VCPU   CPU  State  Time(s) CPU Affinity
Domain-0   0     1     0   r--   3292.7  any cpu
sxc18     12     0     0   -b-   25.7    any cpu</screen><para>The states are:</para><variablelist><varlistentry><term><literal>r</literal>, running</term><listitem><para>The domain is currently running on a CPU.</para>
</listitem>
</varlistentry><varlistentry><term><literal>b</literal>, blocked</term><listitem><para>The domain is blocked and is neither running nor able to be run.
This state occurs when the domain is waiting on I/O (a traditional wait
state) or has gone to sleep because it has nothing to run.</para>
</listitem>
</varlistentry><varlistentry><term><literal>p</literal>, paused</term><listitem><para>The domain has been paused, usually by the administrator
running <command>virsh suspend</command>. A paused domain still consumes allocated
resources such as memory, but is not eligible for scheduling by the hypervisor.
Run <command>virsh resume</command> <replaceable>domain</replaceable> to return the domain to the running state.</para>
</listitem>
</varlistentry><varlistentry><term><literal>s</literal>, in shutdown</term><listitem><para>The domain is in the process of shutting down, but has
not completely shut down or crashed.</para>
</listitem>
</varlistentry><varlistentry><term><literal>s</literal>, shutoff</term><listitem><para>The domain is shut down.</para>
</listitem>
</varlistentry><varlistentry><term><literal>c</literal>, crashed</term><listitem><para> The domain has crashed. Usually this state can only occur
if the domain has been configured not to restart on crash. See <citerefentry><refentrytitle>xmdomain.cfg</refentrytitle><manvolnum>5</manvolnum></citerefentry> for more information.</para>
</listitem>
</varlistentry>
</variablelist>
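<example><title>Decoding the State Column (Sketch)</title><para>The following shell sketch maps a value from the State column of the <command>xm vcpu-list</command> display, such as <literal>-b-</literal>, to a state name. The function name <literal>decode_state</literal> is hypothetical and is not part of xVM. The sketch handles only the single-letter flags listed above and assumes that at most one flag is set in the value.</para><screen># decode_state FLAGS
# Prints a state name for the flag letter found in FLAGS, for example "-b-".
# The letter s is reported as "shutdown" because the flag alone does not
# distinguish a domain that is shutting down from one that is shut off.
decode_state() {
    case "$1" in
        *r*) echo running ;;
        *b*) echo blocked ;;
        *p*) echo paused ;;
        *s*) echo shutdown ;;
        *c*) echo crashed ;;
        *)   echo unknown ;;
    esac
}

decode_state "r--"    # prints running
decode_state "-b-"    # prints blocked</screen>
</example>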
</sect1><sect1 id="gfoie"><title>The Solaris System and Intel Platforms</title><para>With the introduction of xVM, there are now two platform implementations
on the Intel architecture. The implementation names are returned by the <command>uname</command> command with the <option>i</option> option and refer to the
same machine, i86pc.</para><itemizedlist><listitem><para><literal>i86pc</literal> refers to the Solaris system directly
installed on the physical system, also known as running on <emphasis>bare
metal</emphasis>.</para>
</listitem><listitem><para><literal>i86xpv</literal> refers to Solaris running paravirtualized
on top of an xVM hypervisor.</para>
</listitem>
</itemizedlist><para>Applications should use the <command>uname</command> command with the <option>p</option> option. On x86 platforms, regardless of whether Solaris is running
under xVM, this option always returns <literal>i386</literal>. The <option>p</option> option
is equivalent to using the <option>m</option> option in some other operating
systems.</para>
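<example><title>Checking the Platform Implementation (Sketch)</title><para>An administrative script might need to know whether Solaris is running paravirtualized under xVM. The following shell sketch classifies the implementation name that <command>uname</command> returns with the <option>i</option> option. The function name <literal>xvm_platform</literal> and the output strings are hypothetical. Only the two implementation names described in this section are recognized.</para><screen># xvm_platform NAME
# Classifies an implementation name as returned by "uname -i".
xvm_platform() {
    case "$1" in
        i86xpv) echo "paravirtualized on xVM" ;;
        i86pc)  echo "bare metal" ;;
        *)      echo "unrecognized platform: $1" ;;
    esac
}

# Classify the system on which the script runs.
xvm_platform "$(uname -i)"</screen><para>Passing the name as an argument keeps the classification separate from the <command>uname</command> call, so the function can be checked with known values.</para>
</example>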
</sect1>
</chapter>