Wednesday, February 27, 2013

Starting/Restarting the Exalogic Control Stack

Introduction

The installation process for a virtualised Exalogic installs a number of vServers that run the Exalogic Control cloud management stack.  Once installed, this stack is generally left up and running so that the virtual data centre can be administered.  However, at times it may be necessary to stop or restart the control stack.  For example, if you want to back up the control vServers, the simplest mechanism is to stop the instances and use the snapshot capability of the ZFS storage appliance.

This blog post provides some instructions on how to shut down and restart the control stack.  It relates to the Exalogic 2.0.1 and 2.0.4 releases.  For the 2.0.6 release see the comments for this posting: the same principles apply, but the services are simpler to use, or the lifecycle tool exaBR can be used to shut down the control stack (exabr -r <repository path> stop control-stack).

Shutdown/Startup of Exalogic Control Software Stack

Exalogic Control consists of a number of different applications deployed to multiple vServers.
  • Enterprise Manager Ops Center (EMOC)
    • The main interface used for Exalogic Control to manage the Rack and Virtual Data centre.
  • Ops Center Proxy Controller
    • There are two proxy servers deployed.  Requests to manage the environment are sent from the EMOC server to one or other proxy server, which issues the commands.  The proxy servers are part of the architecture that enables Ops Center to manage complex network topologies where the central EMOC server is not in the same network as the servers that it manages, and that allows requests to be load balanced between multiple proxies.
    • For Exalogic there are two proxies because some of the commands are issued to the compute node ILOMs.  A compute node cannot manage its own ILOM so it is necessary to have the two proxies deployed so that all compute nodes can be fully managed.
  • Oracle VM Manager (OVMM)
    • Used by EMOC to manage the virtual environment.
  • Oracle Database
    • Both EMOC and OVMM need an underlying database in which to store configuration and state.
The software on these servers can be started and stopped independently of the vServer itself.  Each application has a slightly different start and stop command as shown below:-
  • EMOC 
    • Stop - /opt/sun/xvmoc/bin/satadm stop -w
    • Start - /opt/sun/xvmoc/bin/satadm start -w
  • Proxies 
    • Stop - /opt/sun/xvmoc/bin/proxyadm stop -w
    • Start - /opt/sun/xvmoc/bin/proxyadm start -w
  • OVMM
    • Stop - service ovmm stop
    • Start - service ovmm start
  • Database
    • Stop - service oracle-db stop
    • Start - service oracle-db start
To run these commands it is necessary to log onto each of the vServers in turn.  The order in which they should be run is:-
  • Start
    • Database
    • OVMM
    • EMOC
    • Proxy Server 1
    • Proxy Server 2
  • Stop
    • Reverse of start order.
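
The commands and ordering above can be captured in a few lines of shell.  This is only a sketch that prints the plan rather than running anything.  The component labels (db, ovmm, emoc, proxy1, proxy2) and the function names are my own illustrative choices, and each printed command still has to be run on the corresponding vServer.

```shell
# Components in start order, as listed above (labels are illustrative).
START_ORDER="db ovmm emoc proxy1 proxy2"

# component_command <component> <start|stop>: print the command for one component.
component_command() {
    local component="$1" action="$2"
    case "$component" in
        db)            echo "service oracle-db $action" ;;
        ovmm)          echo "service ovmm $action" ;;
        emoc)          echo "/opt/sun/xvmoc/bin/satadm $action -w" ;;
        proxy1|proxy2) echo "/opt/sun/xvmoc/bin/proxyadm $action -w" ;;
        *)             echo "unknown component: $component" >&2; return 1 ;;
    esac
}

# plan <start|stop>: print "component: command" lines in the required order.
plan() {
    local action="$1" order="" c
    case "$action" in
        start) order="$START_ORDER" ;;
        # Build the reverse of the start order by prepending each component.
        stop)  for c in $START_ORDER; do order="$c $order"; done ;;
        *)     echo "usage: plan start|stop" >&2; return 1 ;;
    esac
    for c in $order; do
        echo "$c: $(component_command "$c" "$action")"
    done
}
```

Running plan stop followed by plan start gives the sequence for a full restart.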
The simplest mechanism to access all these vServers is to ssh onto a compute node and then make use of the private IP addresses they have in the IPoIB-admin network (probably 192.168.20.nn).  The actual IP address allocated to each server is generated by the virtualised Exalogic Configuration spreadsheet.  Normally the configuration files that this creates are in the directory /opt/exalogic/ecu/config on the compute node used for the ECU.  The IP addresses for each vServer in the control stack are found in the following files:- db.json, ovmm.json, oc_pc1.json, oc_pc2.json and oc_ec1.json.
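
As a rough way to pull the addresses out of those files without knowing their exact layout, the sketch below simply pattern-matches dotted-quad strings and keeps those that begin with a given prefix.  The function name is hypothetical, and the 192.168.20. prefix is only the probable value mentioned above, so check it against your own rack.  Because it makes no assumption about the JSON key names, it may also pick up other addresses that share the prefix (a gateway, for instance).

```shell
# print_admin_ips <prefix> <file>...
# Print "file: address" for every distinct IPv4-looking string in each file
# that starts with the given prefix.
print_admin_ips() {
    local prefix="$1" f
    shift
    for f in "$@"; do
        # -o prints only the matching text, -E enables extended regular expressions.
        grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' "$f" |
            awk -v p="$prefix" -v f="$f" 'index($0, p) == 1 { print f ": " $0 }' |
            sort -u
    done
}

# Example (run on the compute node used for the ECU):
#   cd /opt/exalogic/ecu/config
#   print_admin_ips 192.168.20. db.json ovmm.json oc_pc1.json oc_pc2.json oc_ec1.json
```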

For ease of use I have created a script that does the stop/start/restart of the software components only.  This is available here and should be run from the OVS/hypervisor instance.

Before using this script, edit the file to set the IP addresses of the 5 vServers for your environment and the password for the root user.  The script allows a separate password for each vServer if necessary.  It makes use of a short expect script to automate the login and issue the command necessary to start the software.

Its usage is simply:-

# restart-exalogic-control.sh [start|stop|restart]

The script is still a work in progress: I have plans to allow some further scoping so that it can be used to restart only certain components as needed, but at the moment it only does all of them.  Additionally, it is not particularly intelligent and will continue processing even if it fails to log in.
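
For anyone wanting stricter behaviour than that, one option is a small runner that stops at the first step that fails.  The sketch below is not the script referred to above.  It is a hypothetical alternative, and the usage example assumes key-based ssh access to the vServers rather than the expect-driven password login.

```shell
# run_steps <command string>...
# Run each command string in order and abort at the first one that fails.
run_steps() {
    local step
    for step in "$@"; do
        echo "running: $step"
        if ! sh -c "$step"; then
            echo "failed: $step (remaining steps skipped)" >&2
            return 1
        fi
    done
}

# Example, with placeholder addresses to be replaced by your own:
#   run_steps \
#       "ssh root@<db ip> 'service oracle-db start'" \
#       "ssh root@<ovmm ip> 'service ovmm start'"
```

A failed login then shows up as a non-zero exit from ssh, so the later components are never started against a half-running stack.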

Shutdown/Startup of vServers

Shutting down vServers can be as simple as logging onto the vServer and issuing the shutdown or halt commands.  The startup of a vServer in the control stack is slightly trickier.  Under normal circumstances you would simply log onto the Exalogic Control BUI, navigate to the account within the vDC, select the vServer and click to start it up.  Of course, while the Exalogic Control vServers themselves are down there is no access to the normal BUI, so we need to start up these vServers via a different mechanism.

First off we need to start up the database and OVMM vServers.  This can be achieved by logging onto the compute node that hosts the two vServers; normally I would expect this to be compute node 1.  Start by identifying the vm.cfg files that relate to the DB and OVMM.  These files can be found under the /OVS directory structure as shown below.  Then issue the xm create command to start up each vServer.


[root@exalogic-cn01 ~]# cd /OVS/Repositories/0004fb0000030000f1aa50ba083a2ade/VirtualMachines/
[root@exalogic-cn01 VirtualMachines]# grep -e ExalogicControlDB -e ExalogicControlOVMM */vm.cfg
0004fb000006000067f8e47575a6f0b5/vm.cfg:OVM_simple_name = 'ExalogicControlOVMM'
0004fb0000060000ea920b30c74633c3/vm.cfg:OVM_simple_name = 'ExalogicControlDB'
[root@exalogic-cn01 VirtualMachines]# xm create /OVS/Repositories/0004fb0000030000f1aa50ba083a2ade/VirtualMachines/0004fb0000060000ea920b30c74633c3/vm.cfg



Make sure you start the DB first and once it has started log on to check that the DB is available prior to starting up the OVMM instance.  Note - the long string of characters for the repository and the directory hosting the vm.cfg will be unique to your environment.
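
The lookup and the "DB first, then OVMM" ordering can be sketched as two small helpers.  The function names are hypothetical, and the database availability check is deliberately left as a placeholder because the right check depends on your environment.

```shell
# find_vm_cfg <VirtualMachines dir> <simple name>
# Print the path of the vm.cfg whose OVM_simple_name matches the given name.
find_vm_cfg() {
    local vm_dir="$1" name="$2"
    # -l prints only the names of matching files, -F treats the pattern as a fixed string.
    grep -l -F -- "OVM_simple_name = '$name'" "$vm_dir"/*/vm.cfg
}

# wait_until <attempts> <delay seconds> <command> [args...]
# Retry the command until it succeeds or the attempts run out.
wait_until() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Example on the compute node hosting the two vServers:
#   VM_DIR=/OVS/Repositories/<repository id>/VirtualMachines
#   xm create "$(find_vm_cfg "$VM_DIR" ExalogicControlDB)"
#   wait_until 30 10 <your database availability check>
#   xm create "$(find_vm_cfg "$VM_DIR" ExalogicControlOVMM)"
```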

Once these two vServers are up and running it is possible to authenticate to the OVMM instance and use its capabilities to start up the proxy servers and EMOC, as shown below.

Exalogic Control vServer running in Pool 1

Once the two proxies and the enterprise controller are all running, it is necessary to issue the start commands for the software as described earlier.  Once that is complete, the full system management function is present again and the rack cloud control is available for use.