Hi Chonduy,

Host liveness is controlled in the Db_gc module
(xen-api.hg/ocaml/xapi/db_gc.ml). Each pool slave heartbeats to the master
every 30 seconds (defined in xapi_globs.ml as 'host_heartbeat_interval'). If
the master hasn't heard from a slave for 200 seconds
(Xapi_globs.host_assumed_dead_interval), the host is marked as dead by setting
the database field 'live' on its Host_metrics object to false.
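
To make the timing concrete, here is a minimal sketch of that check. It is not
the code in db_gc.ml: the two intervals are the values quoted above, but the
table and the function names (last_heartbeat, record_heartbeat, is_live) are
made up for illustration, and time is passed in as a plain float.

```ocaml
(* Illustrative sketch only, not the code in db_gc.ml. The two intervals
   are the values quoted above; every other name is hypothetical. *)
let host_heartbeat_interval = 30.0      (* seconds between slave heartbeats *)
let host_assumed_dead_interval = 200.0  (* silence after which a host is dead *)

(* Whole heartbeats that can be missed before the host is declared dead. *)
let missed_heartbeats_tolerated =
  int_of_float (host_assumed_dead_interval /. host_heartbeat_interval)

(* Hypothetical table: host name -> time the last heartbeat arrived. *)
let last_heartbeat : (string, float) Hashtbl.t = Hashtbl.create 16

(* Called on the master whenever a slave checks in. *)
let record_heartbeat ~host ~now = Hashtbl.replace last_heartbeat host now

(* The value the 'live' field would take for [host] at time [now]. *)
let is_live ~host ~now =
  match Hashtbl.find_opt last_heartbeat host with
  | None -> false
  | Some t -> now -. t <= host_assumed_dead_interval
```

With these numbers a slave can miss six consecutive heartbeats before the
master gives up on it, so there is a window of a little over three minutes in
which a dead host still looks live.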

This 'live' field is checked when a message is forwarded to the host in the
Message_forwarding module (xen-api.hg/ocaml/xapi/message_forwarding.ml). If the
host is thought to be dead, the call immediately returns the error
HOST_OFFLINE. If it's not thought to be dead, we usually re-use an existing
stunnel session that had previously been set up to talk to the host. If the
host has in fact gone away, that session might take a long time to fail, but
when it does it causes a 'CANNOT_CONTACT_HOST' error. Both of these errors are
thrown from the Message_forwarding module.
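
Roughly, the decision looks like the sketch below. Again this is not the code
in message_forwarding.ml: only the two error names come from xapi, while the
forward function, its ~live and ~send arguments and the forward_error type are
hypothetical stand-ins.

```ocaml
(* Illustrative sketch only, not the code in message_forwarding.ml.
   The error names are the ones quoted above; the rest is hypothetical. *)
type forward_error = Host_offline | Cannot_contact_host

(* [live] stands for the 'live' field on Host_metrics; [send] stands for the
   call over the (possibly stale) stunnel session and may raise. *)
let forward ~live ~send request =
  if not live then Error Host_offline     (* fails immediately, nothing sent *)
  else
    try Ok (send request)
    with _ -> Error Cannot_contact_host   (* only once the connection fails *)
```

This would fit what you observed: while 'live' is still true the call has to
wait for the connection to fail before you see CANNOT_CONTACT_HOST, and once
the host has been marked dead later calls return HOST_OFFLINE straight away.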

There are only a couple of small parts of the codebase that haven't been 
open-sourced, like the HA and licensing code. The Db module is actually not 
present in the repository as it is generated from the code living in the 
ocaml/idl subdirectory of xen-api.hg. The file 'ocaml/idl/datamodel.ml' 
contains the descriptions of both the Xen API itself and also the database 
backing the objects. The database is not actually a complex beast - the code 
that is generated is essentially an interface between the strongly typed world 
that most of xapi inhabits and the underlying representation of the database 
(currently hashtables in memory, serialised to an xml file on disk). The 
database is currently only in memory on the pool master, and calls to the Db 
module from slaves are translated into requests sent across the network. If you 
manage to build xapi (not an easy task!), the source of the database will be in 
the ocaml/autogen dir (IIRC it's called db_actions.ml).
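
To give a feel for what "an interface between the strongly typed world and the
underlying representation" means, here is a small sketch. It is not the
generated code, which will differ: the store layout, read_field, write_field
and the accessor signatures are all assumptions made for illustration. Only the
Host_metrics object and its 'live' field are taken from the description above.

```ocaml
(* Illustrative sketch only; the generated code in db_actions.ml will differ.
   Untyped store: (table, object reference, field) -> string value. *)
let store : (string * string * string, string) Hashtbl.t = Hashtbl.create 64

let read_field ~table ~objref ~field =
  Hashtbl.find store (table, objref, field)

let write_field ~table ~objref ~field value =
  Hashtbl.replace store (table, objref, field) value

(* Typed wrappers of the kind a generator could emit for each field. *)
module Host_metrics = struct
  let get_live ~self =
    bool_of_string (read_field ~table:"host_metrics" ~objref:self ~field:"live")

  let set_live ~self ~value =
    write_field ~table:"host_metrics" ~objref:self ~field:"live"
      (string_of_bool value)
end
```

The rest of the code only ever sees a bool; the conversion to and from the
string representation is confined to the wrappers.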

Hope this helps!

Jon



On 20 Mar 2010, at 01:07, Chonduy Nguyen wrote:

Hi Dave

I am trying to look into XCP to see if I can change the code to improve some
scenarios, but it looks like not all of the code is open. May I confirm this?

I observed that when a node is down and my software tries to shut down VMs, it
takes 20 minutes for the API to return an exception with the error
CANNOT_CONTACT_HOST.

For another call such as disable_host, it takes 3 minutes to return the error
CANNOT_CONTACT_HOST. After the first error, later calls seem to return faster,
with the error HOST_OFFLINE.

May I ask whether information such as a host being down is stored on storage or
in memory in XCP?

I looked at the code and saw an object called Db. What is this Db? Is it a
local file or pool storage?

Thank you,

Chonduy/TNguyen






_______________________________________________
xen-api mailing list
[email protected]
http://lists.xensource.com/mailman/listinfo/xen-api
