2.2 Middleware for distributed systems
The term middleware first appeared in the late 1980s to describe network connection management software [124]. It became popular in the mid-nineties with the development and penetration of network technologies and distributed systems [142]. Distributed systems [142] enable us to use the best combination of hardware and software components for an enterprise. However, distributed systems are among the most complex artifacts human beings have ever constructed. It is difficult to construct a coherent and operational distributed system that integrates the needed components, because of the inherent heterogeneity and distribution of such systems. Developers need to deal with the complexity of networks and communications, and with different hardware platforms, operating systems and programming languages [19, 27]. Middleware was originally designed to simplify distributed system construction. It is a set of distributed software services that exists between network operating systems and distributed applications [50, 124], as shown in Fig. 2.1. It adds mechanisms and services that are much more specialized than those provided by the operating system. It enables application engineers to abstract from the implementation of error-prone and complex low-level details, such as concurrency control, transaction management and network communication, and allows them to focus on application requirements. The construction of a large class of distributed systems can hence be simplified by leveraging middleware, and the quality of the developed systems is enhanced as well.
Figure 2.1 Left-hand side: middleware structure; right-hand side: ISO/OSI reference model
The implementation of middleware can take different forms with different design strategies. Generally, four main categories of middleware for distributed systems can be distinguished [50, 29]: transactional middleware, message-oriented middleware, Remote Procedure Call (RPC) middleware, and distributed object and component middleware [117, 124]. Transactional middleware supports transactions involving components that run on distributed hosts. Products in this category include BEA's Tuxedo [67]. Message-oriented middleware supports the communication between distributed system components by facilitating message exchange. Products in this category include IBM's MQSeries [59] and Sun's Java Message Queue [68]. RPC middleware is an important early middleware model that is based on remote procedure calls. The DCE (Distributed Computing Environment) standard, specified by the Open Software Foundation, is completely identified with RPC middleware. Object and component middleware [117] evolved from RPC; its most popular models are Microsoft's Component Object Models (COM, DCOM, COM+) [133], Sun Microsystems' Enterprise JavaBeans (EJB) [11], and OMG's Common Object Request Broker Architecture (CORBA) [110, 146].
In spite of the diversity of design strategies, middleware can still be characterized at an abstract level that is independent of particular product characteristics. For instance, some researchers [50, 29] characterize middleware according to the requirements for distributed system construction, i.e., the difficulties that arise during distributed system construction. Accordingly, middleware provides services to deal with network communication, component coordination, reliability, scalability and heterogeneity. This classification is complete and general enough to describe the functionality of middleware for distributed systems.
Nevertheless, we have a different target in this thesis. We want to characterize middleware for distributed systems in a way that allows a better understanding of middleware for mobile systems and that emphasizes the main difference between the two kinds of middleware. Therefore, we emphasize the very basic requirements, such as interoperability and distribution transparency, rather than the more advanced requirements, such as reliability and scalability. Consequently, we describe middleware from four aspects (Table 2.1): A1 component interoperability, A2 component behavior, A3 network communication and A4 distribution transparency. The classification follows a top-down structure (Fig. 2.2), in which component interoperability is on the top layer and allows the application components to interoperate. It is also the most basic, kernel requirement. The other aspects, i.e., component behavior and network communication, help to realize interoperability on the lower layers.
A1 Component interoperability Middleware [142] provides a view of a single interoperable coherent system and is intended to handle a collection of independent components. Middleware is sometimes called a glue technology because it is often used to integrate heterogeneous distributed components and make them interoperable (Fig. 2.2), which means that a component on one system can access a component on another system. To achieve such integration and interoperation, several main problems need to be addressed. First, components differ in their programming languages, data representations and operating systems. Secondly, components need to define interfaces at appropriate levels of abstraction in order to advertise the services that they provide. Thirdly, the components are located on different hosts, so network and communication problems need to be solved.
Middleware solves these problems by providing service definitions, standard programming interfaces and standard protocols. For example, the object-oriented middleware platforms CORBA [146] and DCOM [24] support the definition of object (component) services. The service provided by a component is encapsulated as an object, and the interface of the object describes the provided service as a set of method calls defined in an IDL (Interface Definition Language). The interfaces defined in an IDL file serve as a contract between a server and its clients. Clients interact with a server by invoking methods described in the IDL.
IDL is designed to be independent of any particular programming language. The middleware can define bindings to different programming languages. For example, CORBA [146] defines bindings to C, C++, Smalltalk, Ada, Java and OO-Cobol. These programming language bindings determine how object types with their attributes, operations and exceptions are implemented in server objects, and how clients can make object requests and catch exceptions that the server may raise. Through such service definitions and the IDL, the different types of components now have a homogeneous definition, which allows a system to be constructed by integrating legacy and commercial off-the-shelf components with newly built components.
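The role of an interface definition as a contract between a server and its clients can be sketched in Python, with an abstract base class standing in for an IDL file. This is only an illustration under stated assumptions: the names `Account`, `AccountServant`, `InsufficientFunds` and `client` are hypothetical, and no CORBA or DCOM API is used.

```python
from abc import ABC, abstractmethod


class InsufficientFunds(Exception):
    """Hypothetical exception declared as part of the interface contract."""


class Account(ABC):
    """Stands in for an IDL interface: operations only, no implementation."""

    @abstractmethod
    def balance(self) -> int: ...

    @abstractmethod
    def withdraw(self, amount: int) -> int: ...


class AccountServant(Account):
    """Hypothetical server-side implementation of the contract."""

    def __init__(self, initial: int) -> None:
        self._balance = initial

    def balance(self) -> int:
        return self._balance

    def withdraw(self, amount: int) -> int:
        if amount > self._balance:
            raise InsufficientFunds(f"cannot withdraw {amount}")
        self._balance -= amount
        return self._balance


def client(account: Account) -> int:
    """The client depends only on the interface, not on the servant class."""
    try:
        return account.withdraw(30)
    except InsufficientFunds:
        # The exception is part of the contract, so the client can handle it.
        return account.balance()
```

In a real platform the interface would be written once in IDL and mapped to each supported language by the binding; here both sides happen to be Python, so the abstract class plays both roles.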
A2 Component behavior In order to provide a single interoperable coherent system, one important task of middleware is to manage component behavior, facilitate component interaction, and enable the cooperation of distributed components. More specifically, we can distinguish inter-component behavior and intra-component behavior. Given that components execute concurrently on distributed hosts, the middleware can support multiple threads of control, a single thread of control, or both [50]. The activation and deactivation of the component execution process need to be supported too. We call such behavior, which happens in the scope of one component, intra-component behavior. It controls the execution state of the component.
In contrast, inter-component behavior happens in the scope of a group of components. Component interaction (Fig. 2.2) is a synonym for inter-component behavior, and we will use the term component interaction in the rest of this thesis. Component interaction covers component communication, collaboration, and coordination. These aspects are not orthogonal and there is no clear boundary between them. For example, components need a method to communicate with each other through the network. The method can be message based, transaction based, event based, etc. The communication requires coordination and synchronization between different actions and components, and the components collaborate with each other in order to perform a task.
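As one illustration of these interaction styles, an event-based interaction can be sketched in a few lines of Python. The sketch is in-process and synchronous, and the names `EventBus`, `subscribe` and `publish` are hypothetical; a real middleware would deliver the events across the network and would usually decouple publisher and subscribers in time as well.

```python
from collections import defaultdict


class EventBus:
    """Hypothetical in-process stand-in for an event-based middleware."""

    def __init__(self):
        # Maps a topic name to the list of handlers subscribed to it.
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        """Deliver the payload to every subscriber; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


# Two components collaborate without referring to each other directly:
# the publisher only knows the topic, not who consumes the event.
received = []
bus = EventBus()
bus.subscribe("order.created", lambda order: received.append(("billing", order)))
bus.subscribe("order.created", lambda order: received.append(("shipping", order)))
delivered = bus.publish("order.created", {"id": 1})
```

The coordination aspect shows up even in this small example: the order in which subscribers are notified is decided by the bus, not by the components themselves.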
Middleware can be distinguished by the component interaction patterns and paradigms that it supports [124, 50, 29]. In distributed computing, for example, middleware can be classified by whether it follows the RPC (Remote Procedure Call) pattern, the message-based pattern, the event-based pattern, etc. In both DCOM [24] and CORBA [146], the interactions between a client process and an object server process are implemented as object-oriented RPC-style communication. Figure 2.3 shows a typical RPC structure. To invoke a remote function, the client makes a call to the client stub. The stub packs the call parameters into a request message and invokes a wire protocol to ship the message to the server. At the server side, the wire protocol delivers the message to the server stub, which then unpacks the request message and calls the actual function on the object. In DCOM, the client stub is referred to as the proxy and the server stub is referred to as the stub. In contrast, the client stub in CORBA is called the stub and the server stub is called the skeleton. Sometimes, the term “proxy” is also used to refer to a running instance of the stub in CORBA.
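The call path described above (client stub, wire protocol, server stub, actual object) can be sketched in Python as follows. This is a minimal illustration that follows the CORBA naming of stub and skeleton. It assumes JSON as the message format and replaces the wire protocol with an in-process loopback, and the classes `Calculator`, `Skeleton`, `LoopbackWire` and `Stub` are hypothetical. It does not reproduce the actual DCOM or CORBA protocols.

```python
import json


class Calculator:
    """The actual server object (hypothetical example service)."""

    def add(self, a, b):
        return a + b


class Skeleton:
    """Server stub: unpacks the request message and calls the actual object."""

    def __init__(self, target):
        self._target = target

    def handle(self, request: bytes) -> bytes:
        message = json.loads(request.decode("utf-8"))
        method = getattr(self._target, message["method"])
        result = method(*message["args"])
        return json.dumps({"result": result}).encode("utf-8")


class LoopbackWire:
    """Stands in for the wire protocol; delivers the bytes in-process."""

    def __init__(self, skeleton):
        self._skeleton = skeleton

    def send(self, request: bytes) -> bytes:
        return self._skeleton.handle(request)


class Stub:
    """Client stub: packs the call parameters into a request message."""

    def __init__(self, wire):
        self._wire = wire

    def add(self, a, b):
        request = json.dumps({"method": "add", "args": [a, b]}).encode("utf-8")
        reply = self._wire.send(request)
        return json.loads(reply.decode("utf-8"))["result"]


# To the client, the remote call looks like a local method call on the stub.
stub = Stub(LoopbackWire(Skeleton(Calculator())))
total = stub.add(2, 3)
```

Replacing `LoopbackWire` with a class that sends the bytes over a socket would leave both `Stub` and `Skeleton` unchanged, which is the point of separating the stubs from the wire protocol.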
A3 Network communication In a distributed system, components are often located on different hosts. Network communication (Fig. 2.2) is involved when remote components interact. The interoperability of components on the networking layer is achieved by using standard networking protocols, which are often classified according to the ISO/OSI model (Fig. 2.1) [50, 142]. Most middleware platforms are built on top of the transport layer (for example, TCP or UDP). Transport layer protocols generally provide services that handle the flow of data between systems and provide applications with access to the network via sockets. When programming at this level of abstraction, application engineers need to implement the session, presentation, and application layers themselves. This is costly, error prone and time-consuming [50]. Middleware implements the session and presentation layers, enables application engineers to request parameterized services from remote components, and can execute these requests as atomic and isolated transactions. The transmitted parameters often have complex data structures. The presentation layer implementation of the middleware transforms these complex data structures into a format that can be transmitted using a transport protocol, i.e., a sequence of bytes. This transformation is referred to as marshalling and the reverse is called unmarshalling.
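Marshalling and unmarshalling can be illustrated with Python's standard `struct` module. The record layout below (an identifier, two coordinates and a text label) and the function names `marshal` and `unmarshal` are assumptions chosen for the example; real middleware platforms define their own encodings.

```python
import struct

# Header layout in network (big-endian) byte order, without padding:
# 4-byte unsigned id, two 8-byte doubles, 2-byte unsigned label length.
HEADER_FORMAT = "!IddH"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def marshal(point_id: int, x: float, y: float, label: str) -> bytes:
    """Flatten a record into a sequence of bytes."""
    encoded = label.encode("utf-8")
    header = struct.pack(HEADER_FORMAT, point_id, x, y, len(encoded))
    return header + encoded


def unmarshal(data: bytes):
    """Rebuild the record from the byte sequence produced by marshal()."""
    point_id, x, y, length = struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE])
    label = data[HEADER_SIZE:HEADER_SIZE + length].decode("utf-8")
    return point_id, x, y, label


wire_bytes = marshal(7, 1.5, -2.0, "node")
record = unmarshal(wire_bytes)
```

Fixing the byte order in the format string is what lets hosts with different native data representations agree on the meaning of the bytes.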
A4 Distribution transparency One main function of middleware for distributed systems is to provide distribution transparency to the application. Distribution transparency can be further classified into interaction transparency, network communication transparency and location transparency. Interaction transparency means that the application components do not notice that they are interacting with remote components; the remote interaction is performed by the middleware. Moreover, the application components are not aware of the network communication or of the location of other components. For example, in an object-oriented middleware, if a client wants to interact with a server object, it issues a local call request to the middleware. The middleware locates the server object by querying a database to discover the location of the requested object, and then invokes the server object. This process involves the low-level network communication mechanisms of the middleware. The client and the server do not need to know where the other is located.
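Location transparency can be sketched in Python as follows. The example simulates hosts with dictionaries in a single process, and the names `NamingService`, `Middleware`, `Printer` and the host names are hypothetical; the location database of a real middleware would be a separate networked service.

```python
class NamingService:
    """Hypothetical location database: object name -> host name."""

    def __init__(self):
        self._locations = {}

    def register(self, name, host):
        self._locations[name] = host

    def resolve(self, name):
        return self._locations[name]


class Middleware:
    """Locates the server object and invokes it on behalf of the client."""

    def __init__(self, naming, hosts):
        self._naming = naming
        self._hosts = hosts  # simulated hosts: host name -> {object name: object}

    def invoke(self, name, method, *args):
        host = self._naming.resolve(name)   # discover the object's location
        target = self._hosts[host][name]    # stands in for network delivery
        return getattr(target, method)(*args)


class Printer:
    """Hypothetical server object."""

    def status(self):
        return "ready"


naming = NamingService()
hosts = {"host-a": {}, "host-b": {}}
middleware = Middleware(naming, hosts)

hosts["host-a"]["printer"] = Printer()
naming.register("printer", "host-a")
first = middleware.invoke("printer", "status")

# Relocate the object; the client's call below is unchanged.
hosts["host-b"]["printer"] = hosts["host-a"].pop("printer")
naming.register("printer", "host-b")
second = middleware.invoke("printer", "status")
```

The client names the object, never the host, so moving the object only requires updating the location database.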
From this description we can observe that the main requirement for middleware for distributed systems is to integrate heterogeneous distributed components and to make the components a single interoperable coherent system. There are few other major requirements from the application side. Therefore, the main function of the middleware is to provide component interoperability, which is the most basic, kernel requirement. Accordingly, we define middleware as follows:
Def. 2.1 Middleware for distributed systems is a set of distributed software services that exists between network operating systems and distributed applications. It adds mechanisms and services that are much more specialized than those provided by the operating system. It integrates heterogeneous components, and makes the components a single interoperable coherent system.