Previous Up Next

Chapter 18  CORBA Fault Tolerance

The previous chapter discussed proprietary fault tolerance mechanisms provided by some CORBA implementations. This chapter discusses CORBA-FT, which is the commonly-used name for the fault tolerance functionality that was added to the CORBA specification in 2000.

18.1  Terminology and Basic Infrastructure

CORBA-FT types are defined in a module called FT. Most of the fault tolerance infrastructure is concentrated in the FT::ReplicationManager interface. That interface defines two operations but inherits most of its functionality from base interfaces called ObjectGroupManager, GenericFactory and PropertyManager, all of which are defined in the FT module. I now discuss each of these interfaces in turn.

18.1.1  The ObjectGroupManager Interface

CORBA-FT uses the term object group to mean a replicated object. An individual replica is called a member of an object group. An Interoperable Object Group Reference (IOGR) is an IOR (Chapter 10) for an object group. An IOGR contains multiple profiles (that is, multiple sets of contact details): typically, one profile for each member that is currently running.1 An IOGR has an embedded TaggedComponent (Section 10.2.3) that records a version number for the IOGR. Whenever the set of members for an object group changes (for example, when a member dies or is restarted), the version number of the IOGR is updated.

The FT::ObjectGroupManager interface defines operations that manipulate information about the currently-alive members in object groups, for example, to add and remove members from an object group.

18.1.2  The GenericFactory Interface

FT::GenericFactory is a factory interface (Section 1.4.2.1). It has generic as part of its name because it is used to create objects of arbitrary types.


module FT { struct Property { ... }; // name-value pair typedef sequence<Property> Properties; typedef Properties Criteria; typedef CORBA::RepositoryId TypeId; ... interface GenericFactory { typedef any FactoryCreationId; Object create_object( in TypeId type_id, in Criteria the_criteria, out FactoryCreationId factory_creation_id) raises (...); void delete_object( in FactoryCreationId factory_creation_id) raises (...); }; };
Figure 18.1: The FT::GenericFactory interface

The first parameter to the create_object() operation is a repository id (Section 9.4). This parameter specifies the type of the object to be created. The next parameter specifies a sequence of name-value pairs that can be used for an application-specific purpose, for example, to provide initialization values for the object. The operation returns a reference to the newly created object, but is also uses an out parameter to specify an identifier that can later be used to uniquely identify the object to the factory that created it. In particular, this unique identifier can be used as a parameter to the delete_object() operation.

18.1.3  The PropertyManager Interface

The concept of “one size fits all” does not apply to fault tolerance. Instead, there are many different techniques that can be used to achieve fault tolerance, where each technique is applicable to some but not all types of application. Because of this, developers use properties (name-value pairs) to inform CORBA-FT which techniques are used in an application.

The FT::PropertyManager interface defines operations that are used to manage the fault tolerance properties. Properties can be set to be default properties, type-specific properties (that is, properties that are specific to an IDL interface) or object group properties (that is, properties specific to a replicated object). Object group properties override type-specific properties, which in turn override default properties.

The fault tolerance properties defined by CORBA-FT are discussed below.

Factories.

Developers have to implement the GenericFactory interface and create an instance of it in each replica of a CORBA-FT server. When the CORBA-FT infrastructure wants to create members of an object group, it invokes create_object() on GenericFactory objects in servers.

CORBA is object-oriented rather than process-oriented. Because of this, the CORBA-FT infrastructure does not know in which server process a specific GenericFactory resides. To work around this, CORBA-FT defines a type called FT::Location, which is a typedef of CosNaming::Name (Figure 4.1), that is, a compound string. In effect, a location is the conceptual name by which CORBA-FT knows a server process. CORBA-FT specifies the format that this compound string should have. In effect, it encodes the host on which a server runs and a conceptual name for the server process.

The factories property takes a value that is a sequence of structures, where each structure contains a reference for a GenericFactory object, a location and some criteria (like Properties, Criteria is a sequence of name-value pairs) that can be passed as a parameter to create_object().

The factories property is the only property that cannot be set as a default property. Instead, it must be specified for each interface type and/or for each object group. Having a type-specific factories property makes it feasible for developers to have several type-specific factories within a single location (server process), if they should wish. Alternatively, if a developer wants a single factory to be used to create several types of object in a server then that factory can be registered several times: once for each type of IDL interface.

Replication style.

The replication style property has one of the following values: STATELESS, COLD_PASSIVE, WARM_PASSIVE or ACTIVE. There is also a placeholder for ACTIVE_WITH_VOTING, which is likely to be supported in the future. All of these property values are constants defined in the FT module.

The STATELESS value indicates that the behavior of an object is independent of the history of invocations upon the object. This replication style could be used for an object that provides read-only access to a database.

In the COLD_PASSIVE and WARM_PASSIVE replication styles, only one member, called the primary member, executes the invocations upon the object group. The other members of the object group are called backup members. Periodically, a checkpoint (that is, a snapshot of the state) of the primary member is taken. In addition, CORBA-FT infrastructure persistently logs every request that is invoked upon the primary member. This log is truncated when a new checkpoint of the primary member’s state is taken. When the primary member fails, the most recent checkpoint plus a reply of requests in the log is used to bring a backup member up-to-date and promote it to being the new primary member.

The difference between the COLD_PASSIVE and WARM_PASSIVE is the technique used to bring a backup member up-to-date. In the COLD_PASSIVE replication style, the most recent checkpoint of the old primary member is loaded into the new primary and then requests in the log file are re-invoked on the new primary. In the WARM_PASSIVE replication style, whenever a checkpoint of the primary member is taken, it is automatically loaded into the backup members, thereby enabling a failover to be handled faster.

In the ACTIVE and ACTIVE_WITH_VOTING replication styles, the IOGR contains contact details not for the group members, but rather for gateways, which are delegation servers. When a gateway receives a request from a client, it delegates the request to all the group members. The CORBA-FT specification allows the gateway to use a proprietary multicast protocol to communicate with the group members (this use of a proprietary protocol is transparent to both client and server developers). Each group member processes an invocation independently and sends its reply back to the gateway.

In the ACTIVE replication style, the gateway picks one of the replies and sends this back to the client, and discards the other replies.

In the ACTIVE_WITH_VOTING replication style, the gateway collects all the replies and compares them. They should all be equal, but if there is a fault somewhere then one or more of the replies might be incorrect. Assuming that a majority of the replies are identical, the gateway sends one of the identical replies to the client and discards the other replies. The ACTIVE_WITH_VOTING replication style is not yet supported in CORBA-FT, but it is expected to be supported in the future.

Initial and minimum number of replicas.

An integer property is used to specify the initial number of replicas of an object that should be created. Another integer property specifies the minimum number of replicas of an object that are needed to maintain the desired fault tolerance. If “too many” replicas terminate so that the quantity of replicas falls below the desired minimum then more replicas are created to increase the quantity back to the desired minimum level.

Membership style.

The value of the membership style property can be either MEMB_INF_CTRL or MEMB_APP_CTRL.

If the MEMB_INF_CTRL membership style is used then a server creates an object group by invoking create_object() on the replication manager, which is part of the CORBA-FT infrastructure. The replication manager then invokes create_object() on factories in server processes, in order to create replicas of the object.

If the MEMB_APP_CTRL membership style is used then a server creates an initially empty object group by invoking create_object() on the replication manager. The server then must then do one of the following to populate the object group with members:

Consistency style and checkpoint interval.

The value of the consistency style property can be either CONS_INF_CTRL or CONS_APP_CTRL.

If the CONS_INF_CTRL consistency style is used then the CORBA-FT infrastructure automatically performs checkpointing, logging of requests and failover when a primary member terminates. When the CONS_INF_CTRL consistency style is used then a complementary policy value is used to specify the frequency at which checkpoints are performed.

If the CONS_APP_CTRL consistency style is used then the server application code must perform checkpointing, logging of requests and failover when a primary member terminates.

Fault monitoring, interval & timeout, and granularity.

The fault monitoring style property can be one of PULL, PUSH or NOT_MONITORED. The PUSH fault monitoring style is not yet supported, but support for it is anticipated in the future.

If the PULL fault monitoring style is used then the CORBA-FT infrastructure periodically invokes a “ping”-style operation, called is_alive(), to check if object members are still alive.

The fault monitoring interval and timeout property is a struct that has two fields. One field specifies the frequency of the “ping” requests, and the other specifies the time allowed for responses to those requests to determine whether an object is faulty.

The fault monitoring granularity property complements the PULL fault monitoring style. The granularity can be one of MEMB, LOC, or LOC_AND_TYPE. The MEMB (short for member) value indicates that the CORBA-FT infrastructure must ping each member individually. If the number of object groups (and members within those groups) is large or if the monitoring interval is very short then this property value can result in significant network overhead. The network overhead can be reduced by specifying the LOC (short for location) value. A location is, in essence, a server process. A single ping message is sent for all the object members that live at the same location (that is, within the same server process). If the pinged object is faulty then it is assumed that all the object members at that location are faulty. The LOC_AND_TYPE value is a similar to the LOC value except that one ping message is sent for all the object members of the same (IDL interface) type that live at the same location.

If the NOT_MONITORED fault monitoring style is used then CORBA-FT does not periodically check if object members are still alive. Instead, developers implement their own fault monitoring functionality and report faults to CORBA-FT’s fault notifier, which is discussed in Section 18.5.

18.1.4  The ReplicationManager Interface

As stated previously, the ReplicationManager interface inherits from ObjectGroupManager, GenericFactory and PropertyManager. It also defines two new operations: register_fault_notifier() and get_fault_notifier(). A fault notifier (interface FaultNotifier) is an object that is used for reporting faults. A discussion of fault notifiers is provided in Section 18.5.

An implementation of CORBA-FT provides an infrastructure process that contains a ReplicationManager object, and application programs can connect to this by passing "ReplicationManager" as a parameter to resolve_initial_references(). The replication manager object appears to application programmers as a single object. However, in reality it is replicated so that it is not a single point of failure. The CORBA-FT specification requires that there must not be any single points of failure within an implementation of CORBA-FT.

18.2  Writing CORBA-FT Servers

18.2.1  Modifications to IDL Interfaces

The first step in making a fault-tolerant server is to ensure that the IDL interfaces of objects implemented by a server inherit from appropriate interfaces defined by CORBA-FT. As an example, let us assume that a server is to implement an interface Foo, and also a factory interface (Section 1.4.2.1) for it called FooFactory. The IDL for the server might be written as shown in in Figure 18.2.


interface Foo : FT::PullMonitorable , FT::Checkpointable { ... void destroy(); }; interface FooFactory : FT::PullMonitorable , FT::Checkpointable { Foo create(...); };
Figure 18.2: Example IDL for a Server that uses CORBA FT

The PullMonitorable interface defines an is_alive() operation. This is a “ping”-style operation that is periodically invoked by the CORBA-FT infrastructure to check which replicas of an object are alive. A servant can trivially implement this so that it always returns true to indicate that is is alive. Alternatively, a servant’s implementation of this operation could perform application-specific health checks to ensure that the server process (or related hardware) is in a consistent state, and return true only if this is the case.

The Checkpointable interface defines operations, get_state() and set_state(), that are used to get and restore the “state” (for example, instance variables) of an object. The state is represented as binary data (that is, sequence<octet>), and it is the responsibility of the server developer to convert an object’s state to and from this binary format.

An IDL interface can optionally inherit from Updateable, which is a subtype of Checkpointable. The Updateable interface defines two operations called get_update() and set_update(). An “update” is a delta change in the state of an object since the last checkpoint.

Whether or not an IDL interface has to inherit from Checkpointable or PullMonitorable depends on the fault tolerance properties in effect for objects of that type. In particular, the IDL interface must inherit from Checkpointable if it uses the CONS_INF_CTRL policy along with either the COLD_PASSIVE or WARM_PASSIVE policies. Likewise, the IDL interface must inherit from PullMonitorable if it uses the PULL fault monitoring style.

18.2.2  Creating and Destroying Replicated Objects

To implement a CORBA-FT server, a developer must write servant classes that implement the server’s IDL interfaces—Foo and FooFactory in our running example—and also write a servant class for the GenericFactory interface (shown previously in Figure 18.1). This means that the GenericFactory interface is implemented by the replication manager infrastructure process (Section 18.1.4) and also by every server process.

When a server wishes to create a replicated object—for example, in the body of FooFactory::create()2pt—it does not create the object locally. Instead, what happens depends on the membership style being used.

If the MEMB_INF_CTRL membership style is being used then the server calls create_object() on the GenericFactory interface that is inherited by the replication manager. The replication manager then invokes create_object() on the GenericFactory objects in some of the server replicas. The implementation of create_object() in a server replica creates a normal CORBA object that is a member of (that is, replica in) the object group. The replication manager then constructs an IOGR by combining the contact details of the individual members. It is this IOGR that FooFactory::create() returns to the client.

If the MEMB_APP_CTRL membership style is used then a server creates an initially empty object group by invoking create_object() on the replication manager. The server then must then do one of the following to populate the object group with members:

The create_object() operation has an out parameter that is used to associate an identifier with a newly created object. This identifier is unique within the factory that creates the object. It is the responsibility of the caller to remember this identifier and to pass it as a parameter when later calling delete_object().

When a server wishes to destroy a CORBA object, for example, in the body of Foo::destroy(), it invokes delete_object() on the GenericFactory object in the replication manager. The replication manager then invokes delete_object() on the GenericFactory objects in the server replicas so that each member of the object group can be destroyed.

18.2.3  Registering Server Replicas with CORBA-FT

The mainline of a CORBA-FT server should create one or more GenericFactory objects and export their object references to, say, a file or the Naming Service.

Developers need to write a utility (or perhaps use a proprietary utility provided with a CORBA-FT implementation) that registers properties for the fault tolerant application with the replication manager. Some of these properties will be details of factories and their locations for each IDL interface type. This requirement to register factories is the reason why each CORBA-FT server needs to be able to export an IOR for its own factories.

Having registered a system’s properties with the replication manager, a utility could then invoke create_object() on the replication manager to create an object group (this is assuming that the application uses the MEMB_INF_CTRL membership style). The IOGR obtained from this can then be made available to clients via, say, the Naming Service.

18.3  CORBA-FT Support in Clients

An IOGR contains an embedded TaggedComponent (Section 10.2.3) that indicates the object reference is for an object group rather than for a “normal” object. When the client-side CORBA runtime system encounters this TaggedComponent then it enables client-side CORBA-FT capabilities that enhance the end-to-end fault tolerance of the system. Clients built with a non-FT implementation of CORBA ignore the TaggedComponent. Such clients can still communicate with a CORBA-FT server, but they are not able to take advantage of all the fault tolerance capabilities provided by CORBA-FT.

18.3.1  Keeping IOGRs Up to Date

The TaggedComponent embedded in an IOGR records a version number for the IOGR. Whenever the set of members for an object group changes (for example, when a member dies or is restarted), the replication manager updates the version number of the IOGR and notifies infrastructure in CORBA-FT servers of the updated version number. Each time a CORBA-FT client makes a remote call, a service context (Section 11.6) is used to transmit the IOGR’s version number along with the request. The CORBA-FT infrastructure in the server compares the version number in the received service context to the version number that it has.

The purpose of this version number protocol is to increase the likelihood that an IOGR held by a client contains contact details for only the currently-alive members of the object group. By omitting contact details for currently-dead members from the IOGR, CORBA-FT reduces the likelihood that clients will waste time trying to communicate with currently-dead members.

18.3.2  Making Sure Clients Invoke on Primary Members

If an object group uses the COLD_PASSIVE or WARM_PASSIVE replication styles then one of the profiles (sets of contact details) in its IOGR contains a TaggedComponent (Section 10.2.3) to indicate which is the primary member of the object group. Ideally, the CORBA runtime system in a client should use this TaggedComponent as a strong hint regarding to which member of the object group it should send its requests. However, the client will ignore this hint if the client has been built with a non-CORBA-FT product. Even if the client has been built with a CORBA-FT product, the hint might be out of date, because there might have been a failover that resulted in a different member of the object group becoming the new primary member.

If a request is sent to a backup member of an object group then CORBA-FT infrastructure in the backup server uses a redirection message (Section 11.4) to redirect the client to the primary member.

18.3.3  Transparent Retries of Failed Invocations

As will be discussed in Section 18.4, CORBA-FT infrastructure in servers log all request and reply messages. A CORBA-FT client transmits a FT_REQUEST service context with each request. This service context contains a string that uniquely identifies the client, an integer that uniquely identifies the request (within that client) and an expiration time. When a server receives a request that contains a FT_REQUEST service context, the CORBA-FT infrastructure checks to see if there a request message with an identical service context in the log. If there is then the incoming request is not executed; instead, the corresponding reply message from the log is transmitted back to the client. The purpose of this mechanism is to allow the runtime system of a CORBA-FT client to perform automatic retries of failed invocations while preserving at-most-once invocation semantics.

18.3.4  Heartbeat Messages

TCP/IP (upon which IIOP is based) does not cope very well with some types of disruptions to a network. For example, if a client process is connected to a server process via TCP/IP then the client will be notified promptly if the server process dies but will not be notified promptly if the server’s machine fails or is abruptly disconnected from the network. In such cases, the client process may wait for a long time (perhaps even forever) for a reply from the server. This problem can be solved by setting round-trip timeouts in the client. However, doing this for each request can be laborious, even if you know approximately how long a particular invocation should take.

CORBA-FT provides an alternative mechanism for detecting network failures in a timely manner. This mechanism involves the CORBA-FT infrastructure in a client periodically sending ping-style messages to CORBA-FT servers. When a CORBA-FT server receives one of these messages (called heartbeat messages in CORBA-FT terminology), the CORBA-FT infrastructure in the server does not dispatch the request to an object. Instead, the CORBA-FT infrastructure itself sends back a reply to the client. The intention of avoiding a dispatch to the target object is that the heartbeat messages are to check network connectivity only rather than whether or not an object is alive.

Whether or not heartbeat messages are transmitted is determined by a client-side policy (Section 16.1). The policy also specifies the frequency with which heartbeat requests are transmitted and the timeout period for receiving a reply.

18.4  Logging and Recovery Infrastructure

Each CORBA-FT server has some built-in infrastructure called the logging mechanism and recovery mechanism. These pieces of infrastructure are provided by a CORBA-FT implementation, but CORBA-FT does not define IDL interfaces for them because they are not invoked directly by application code.

The logging mechanism is responsible for persistently logging the request messages that arrive for an object and the reply messages sent after invocations have completed. It also logs the periodic checkpoints (and possible updates) of the primary member of an object group. The log can be accessed in a distributed manner. The details of how this is achieved is an implementation detail, but one possible way is for the log to be written to a replicated database; another possible technique is for each host to maintain the log in local volatile storage and use a reliable, totally-ordered multicast protocol to send log updates to the other hosts.

If the COLD_PASSIVE or WARM_PASSIVE replication style is used and the primary member of an object group dies then a backup member is promoted to be the primary member. When this happens, the recovery mechanism is used to bring the state of that member up-to-date. The recovery mechanism is also used in the COLD_PASSIVE or ACTIVE replication styles when a new member is introduced to the group.

When the recovery mechanism is used, it analyses the log and calls set_state() on the relevant object member to initialize it to the last checkpoint state. Then it might call set_update() to load the most recent “update” to the object. Finally, it re-invokes the request messages that are more recent than the last checkpoint/update to bring the object fully up to date.

In order to conserve space, the logging mechanism compacts the log periodically. In particular, whenever a new checkpoint of an object is obtained, previous checkpoints and request/reply messages that are older than the new checkpoint can usually be removed from the log. Likewise, whenever a new update is obtained by calling get_update() then older updates and request/reply messages can usually be removed from the log. The only exception to this is that some request/reply messages may be retained if the expiration time in the FT_REQUEST service context in the request has not yet occurred. This is to support the transparent retry of failed invocations discussed in Section 18.3.3.

18.5  Fault Notifiers

An implementation of CORBA-FT contains some infrastructure called a fault monitor or fault detector—these two terms are used interchangeably within the CORBA-FT specification. A fault monitor has the task of detecting faults. CORBA-FT does not define an IDL interface for the fault monitor because it is not invoked directly by users. To aid scalability, there may be several fault monitors within a CORBA-FT deployment: typically one on each host where CORBA-FT servers run. Fault monitors can check for faults by invoking the is_alive() operation on object members.

When a fault monitor detects a fault, it needs to report it to the replication manager and possibly also to user-written applications that might, for example, analyze reports and show summaries of them on a graphical display. The OMG decided that the mechanism used to report faults should be based on concepts in the Notification Service (Section 22.3). The Notification Service itself was considered to be complex enough that it would be difficult to implement all of its functionality in a fault tolerant way. Instead, the bare minimum subset of Notification Service-style functionality required for CORBA-FT was extracted and repackaged in the FT::FaultNotifier interface, which is shown in Figure 18.3.


module FT { ... interface FaultNotifier { typedef unsigned long long ConsumerId; void push_structured_fault( in CosNotification::StructuredEvent event); void push_sequence_fault( in CosNotification::EventBatch event); ConsumerId connect_structured_fault_consumer( in CosNotifyComm::StructuredPushConsumer push_consumer); ConsumerId connect_sequence_fault_consumer( in CosNotifyComm::SequencePushConsumer push_consumer); void disconnect_consumer( in ConsumerId connection) raises(...); void replace_constraint( in ConsumerId connection, in CosNotification::EventTypeSeq event_types, in string constr_expr); }; };
Figure 18.3: The FT::FaultNotifier interface

Application code can obtain a reference to the fault notifier by invoking resolve_initial_references("ReplicationManager") to connect to the replication manager and then invoking get_fault_notifier() on it.

A fault monitor does not need to explicitly register with the fault notifier. Instead, once a fault monitor has a reference to the fault notifier, it can invoke push_structured_fault() to send a single fault event to the fault notifier. Alternatively, it can invoke push_sequence_fault() to send a sequence of fault events to the fault notifier.

An application that wants to receive fault events does need to register with the fault notifier. It can do this by invoking connect_structured_fault_consumer() if it wants to receive fault events one-at-a-time. If an application prefers to receive fault events batched up in a sequence then it invokes connect_sequence_fault_consumer(). Both of these operations return a ConsumerId that is later passed as a parameter to disconnect_consumer() to disconnect the consumer from the fault notifier.

By default, a consumer of fault events receives all the events. However, once connected, a consumer can invoke replace_constraint() to specify a constraint that the fault notifier uses to filter out unwanted events (Section 22.3.4.1).

The CORBA-FT specification gives precise details of the information that is contained inside a fault event. The information includes the location, IDL interface type and object group id of the object that failed.

The replication manager registers itself with the fault notifier as a consumer of fault events. Developers can write their own applications that also register with the fault notifier as consumers of fault events. Such applications, for example, might analyze faults (based on location, frequency and so on) or show summaries of faults on a graphical display. A fault monitor does not know how many consumers are connected to the fault notifier. A fault monitor simply pushes its fault events to the fault monitor and, in turn, the fault monitor pushes the fault events to all registered consumers.

18.6  Critique

CORBA-FT defines a lot of infrastructure. Because of this, at first sight it appears that CORBA-FT is very complex. However, most of this infrastructure is pre-implemented by a CORBA-FT product and it works behind the scenes. Only a relatively small amount of the infrastructure is visible to, or must be implemented by, developers. This means that CORBA-FT is not as complex as it first seems. However, there is no doubt that CORBA-FT is more complex (and also more powerful) than the proprietary fault tolerance mechanisms discussed in Chapter 17.

Use of CORBA-FT affects the design and coding of applications. Because of this, it is best if CORBA-FT is designed into an application from the start, rather than being retrofitted to an existing application as an afterthought. In contrast, most proprietary fault tolerance mechanisms are enabled via configuration rather than via coding. This means that, quite often, use of a proprietary fault tolerance mechanism can be retrofitted to an existing application quite easily.

Perhaps the biggest drawback of CORBA-FT is that it is an optional part of the CORBA specification and, unfortunately, most CORBA products neglect to implement it. The author is aware of only one CORBA implementation, TAO, that currently implements CORBA-FT; of course, there could be other CORBA-FT implementations of which the author is not aware.


1
Alternatively, an IOGR might contain a separate profile for each of several gateways that delegate requests to object replicas. More details about gateways is provided in the discussion about the ACTIVE replication style.

Previous Up Next