CORAL Component documentation: RelationalAccess

Ioannis Papadopoulos

CERN

30 January 2006


1. Introduction

1.1. Purpose of the component

This component defines the core of the CORAL public API. It provides the user- and developer-level interfaces and a simple implementation of a transient table description structure. It also defines the hierarchy of exceptions that are thrown by the API implementations in case of error conditions.

1.2. Known problems and restrictions

  • The underlying SQL functionality is limited by the access patterns defined by the RelationalAccess API.

1.3. Repository of the component

:pserver:anonymoys@coral.cvs.cern.ch:/cvs/coral/coral/RelationalAccess

1.4. Glossary

  • RDBMS Acronym for Relational DataBase Management System

2. Description of the RelationalAccess API functionality

2.1. Design of the Interfaces

2.1.1. UML diagram of the abstract classes

The public interfaces of the CORAL API are depicted in the following UML class diagram.

2.1.2. Exceptions

Most of the methods of the abstract interfaces return a void or a reference to an object. In case of error conditions there is usually an error message in the form of a SEAL message stream that is generated by the implementation plugin. In addition to that a relevant C++ exception is thrown. All exceptions thrown by coral derive from the base coral::Exception class, which in turn is an extension of the SEAL Exception. Which specific exception is thrown in each case is documented in the methods of the RelationalAccess interfaces and it is part of the method semantics. The full exception hierarchy is shown in the following UML class diagram.

2.2. Accessing a database

2.2.1. The Relational Service and Technology Domains

The software entry point for accessing a relational database through the RelationalAccess package is the IRelationalService interface. CORAL provides a default implementation as a SEAL component with the label "CORAL/Services/RelationalService". For more details refer to the documentation of the RelationalService package.

The IRelationalService interface provides access to IRelationalDomain objects for the various different technology types. The IRelationalDomain interface provides information about the technology type, the implementation of the corresponding plugin module and its version.

An IRelationalDomain object is retrieved by the IRelationalService specifying either the technology type, or a connection string (which contains the technology type). The IRelationalDomain is the base class for the corresponding SEAL component of a technology plugin. The component should be named according to the technology name with the following convention: If the technology type is "MyRDBMS", then the IRelationalService will be searching for a plugin labeled as "CORAL/RelationalPlugins/MyRDBMS". Optionally the implementation name can be specified. In this case the plugin should be labeled as "CORAL/RelationalPlugins/MyRDBMS/MyImplementation".

The Relational Service may handle several implementation plugins for a given RDBMS technology. By default a native implementation (a plugin where the implementation name is not specified in its label) has a higher priority than all the others. A user can at any moment change the default implementation for a given technology either through the IRelationalService API, or by setting accordingly the corresponding property of the Relational Service implementation.

2.2.2. Connection strings

The IRelationalDomain interface acts as a factory for IConnection objects which are used for the physical connection to a relational database. Such an object is obtained by specifying a connection string which has to have the format

technology_protocol://hostName:portNumber/databaseOrSchemaName
where technology can be oracle, mysql, odbc, etc. (all small letters) and refers to the actual plugin implementing the RelationalAccess abstract interfaces. The protocol is optional and can be file, http, etc. It is used for server-less, file-based databases, like SQLite. The port number is optional as well.

With a given IConnection object a user may start a session, using the ISession interface, with its associated transactional context. In case the back-end allows, for the same connecion, several parallel and independent sessions may me started.

2.2.3. The authentication mechanism

The ISession interface allows a user authenticate to a database specifying a user name and a password. Alternatively a reference to an IAuthenticationService object may be passed explicitly or implicitly looking for such a registered service anywhere in the corresponding context tree.

The IAuthenticationService is a simple interface which returns for a given connection string the connection credentials (typically user name and password) through an IAuthenticationCredentials object. It has been introduced in the design of the component so that authentication parameters need not appear in the connection string, which may have to be shared by users with different access rights. Refer to the relevant documentation for the semantics of the existing implementations of this interface.

2.2.4. Service Indirection

The CORAL API accomodates the use case where relational data may be distributed on a computing grid. In this case a user addresses a database service not through its physical characteristics, but rather using a logical name. The ILookupService class defines the interface that a software module performing logical-to-physical name lookups for database services. A user, may specify not only the logical name, but may also restrict the lookup with the access mode (database role) and the authentication mechanism for the available replicas. The result of the lookup operation is an IDatabaseServiceSet object, which is a container of IDatabaseServiceDescription objects. An IDatabaseServiceDescription describes a replica which fulfills the lookup criteria. The ILookupService returns the available replicas in the order that they should be tried out.

The Lookup Service can be through of as the equivalent of the File Catalog in the POOL framework.

The ILookupService interface delivers the list of all the available replicas corresponding to a logical name with an ordering which is dependent on the back end. In grid applications accessing databases in a distributed environment a re-ordering might be desirable in order to pick up first the most proximate database service. To this end the CORAL API defines an IReplicaSortingAlgorithm.h interface whose implementations can be used to encode any ordering logic.

2.2.5. Client-side Monitoring

In a production environment it is important to monitor the activities in a database server. The monitoring output gives hints not only on how to tune applications and optimize schemas. It also contributes towards the definition of the most efficient deployment models and hardware configurations. Server-side monitoring is usually a responsibility of the database administrators. It produces information such as the performance of the server as a function of the number of concurrent clients running an application, or the most frequent or resource consuming operations.

Client-side monitoring can be used to gather information which is specific to a particular application. Combined with the information of the server-side monitoring it may reveal the contribution of the specific application to the server load and/or the need for tuning of the application itself, or it may hint problems in the deployment model.

CORAL facilitates client-side monitoring with the IMonitoringService developer-level interface. It is an interface which is used by the RDBMS-specific plugin implementations to register events that need to be monitored. Such events are the connecting to and disconnecting from a database server, the starting and ending of a user session, the signaling of transaction boundaries, and mostly important the issuing of SQL statements and the associated execution time.

In order for a RDBMS-specific plugin implementation to record monitoring data it is necessary that there is an implementation of the IMonitoringService interface, defined as a SEAL component, reachable by the plugin through its local context tree (like an IAuthenticationService implementation). Moreover, the user should explicitly enable the monitoring, specifying also the verbosity level, through the IMonitoring interface accessible through the ISession.

The reporting of the monitoring events is done through the IMonitoringReporter interface accessible from the IMonitoringService object.

2.2.6. Connecting to a database in the most generic way

In CORAL, connecting to a database, starting a user session and authenticating are three separate operations. The IConnection and ISession interfaces have been designed such that they can facilitate application-level connection pooling. The IConnectionService is the interface for components that are responsible for providing to the client components handles to physical connections that are re-used and probed for the status of the physical connection before they are made available for use. In particular, the IConnectionService is an abstract factory for ISessionProxy objects which make sure that a physical connection to the underlying server is always valid.

The use of the IConnectionService is highly recommended in applications in a distributed environment, since any implementation is responsible of provide a valid ISessionProxy object which corresponds to the best replica of a logical database service. In that respect, the role of an IConnectionService implementation is the general coordination of the various high-level CORAL components. Thus an IConnectionService has to:

  • Load if necessary an implementation of the IRelationalService interface and use it internally in subsequent calls.
  • Load if necessary an implementation of the IAuthenticationService and use it internally in subsequent calls.
  • Load if necessary an implementation of the IMonitoringService and issue calls to the IMonitoring interface of the ISession objects it manages.
  • Load if necessary an implementation of the ILookupService and use it whenever a logical database service name is specified in a request for an ISessionProxy.

The Connection Service will also make sure that the number of physical connections to a given database is minimized, by creating several ISession objects under a single IConnection.

The IConnectionService interface can also be used for configuring a CORAL-based application. In particular, its IConnectionServiceConfiguration interface can be used to select the particular implementations of the various interfaces that should be loaded. Moreover, it can be used to override the default parameters that are related to the connection re-attempting and replica failover mechanisms.

Finally, it provides access to IMonitoringReporter interface of the loaded IMonitoringService, to allow a user trigger the reporting of the monitoring events.

Given that the IConnectionService is the logical entry point for CORAL, as of release 1.7.0, a ConnectionService facade object is defined, which implements the IConnectionService interface and hides all the underlying complexities related to the loading and looking up of the various plugin libraries.

2.2.7. Transactional context

Every operation with a relational database using the RelationalAccess component is executed within a transactional context. There is a single transactional context for each session. A reference to the ITransaction interface can be retrieved by an ISession object once a connection has been established.

A user can start a transaction in an update (default) or read-only mode, commit and roll back the changes.

2.3. Managing schemas

Once connected to a relational database, the user works with the data defined under the schema defined for the table collection (MySQL database, Oracle user schema, SQLite file, etc.) specified in the connection string. The ISchema interface which is retrieved by the ISession (or ISessionProxy) object allows the user to list, create and drop tables in the working schema. This is the nominal working schema, defined by the connection string itself. A user may also work with any other named schema for which the necessary access rights have been granted, through the relevant call in ISession (or ISessionProxy). The ISchema object which is obtained is completely independent from the nominal or other named ones. The only things that are shared are the transaction sequence and the type converter.

The ISchema can be used to retrieve a reference to an ITable object specifying its name. This interface allows the user to

  • retrieve the description of the corresponding table via the ITableDescription interface

  • set the access privileges using the ITablePrivilegeManager interface

  • alter the table definition using the ITableSchemaEditor interface

  • perform data manipulation on the table through the ITableDataEditor interface

  • execute queries with the data of the corresponding table

The ITableDescription object allows the user to retrieve the following information from a table:

  • The column names and types (the C++ equivalents)
  • The NULLness and the UNIQUEness of a column
  • The definition (column names) of its primary key, in case there is one, using the IPrimaryKey interface
  • The definitions (key name, column names, referenced table, referenced columns) of its foreign keys using the IForeignKey interface
  • The definitions (column names, uniqueness) of its indices using the IIndex interface
  • The definitions (column names) of its multi-column unique constraints using the IUniqueConstraint interface

In order to create a new table in the schema the user has to construct an ITableDescription object and pass it to the ISchema. The component provides the TableDescription class which implements the ITableDescription and ITableSchemaEditor interfaces for this purpose.

Views can be created using IViewFactory objects which can obtained from the ISchema interface. They extend the IQueryDefinition interface and are used to create IView objects, which can be used to retrieve the original definition of the created views and to describe their columns.

By default, for most technologies, a table or view is created without any access right to anybody but the table owner. It is therefore a good practice for tables that will be exposed to other database users to use the ITablePrivilegeManager interface to grant the relevant access rights immediatelly after the table creation.

Note that the semantics of the interfaces assume that the names of the tables, views, columns, keys, indices and constraints are case sensitive. Given though that some databases do not preserve the case, it is highly recommended to capitalize all such system names, especially if the CORAL is expected to be used for cross-populating databases of different technologies.

2.4. Performing data manipulation

The ITableDataEditor allows the privileged user perform DML operations on a table. In particular it is allowed to

  • Insert a new row. The input row is fed with an AttributeList with an AttributeListSpecification that is compatible to a subset of the table's columns and types.
  • Modify existing rows. In this case the SET and WHERE clauses of the corresponding SQL statement have to be provided by the user. The user is recommended to use bind variables for the WHERE and SET clauses which can be passed through an AttributeList.
  • Delete existing rows. In this case the WHERE clause of the corresponding SQL statement have to be provided by the user. The user is recommended to use bind variables for the WHERE clause.
  • Insert query result sets (INSERT-SELECT statements) using an IOperationWithQuery object.
  • Perform bulk operations of the above using IBulkOperation and IBulkOperationWithQuery objects. Such operations are highly recommmended in repetitive operations where only the input data change. In this case the user specifies the row buffer to be bound and the number of rows to be cached at the client side before the data are actually sent to the server.

2.5. Issuing queries

Queries can by issued using the IQuery interface. The ISchema and ITable interfaces acts as a factories of IQuery objects. The IQuery class extends the IQueryDefinition class, which is the interface for defining the query logic. A query is formed by

  • Specifying the output variables. In case the IQuery object has been constructed from an ITable, if no variable is specified a wildcard query is formed. For each variable an alias can be specified.
  • Specifying the table list. In case the IQuery object has been constructed from an ITable this is not required. In case of a general query the name of a table or a view can be specified, providing optionally an alias.
  • Defining the query condition. The user has to specify the WHERE clause of the corresponding SQL statement. It is highly recommended to use bind variables passing the corresponding AttributeList.
  • Specifying order variables. By doing so the user instructs the system to append to the SQL query an ORDER BY clause.
  • Limiting the number rows in the result set. This is a very useful operation when the maximum number of rows that will be retrieved at a query is known.
  • Defining sub-queries. A user can define a sub-query by retrieving a reference to an IQueryDefinition, whose result-set can be used as an input table in the parent query.
  • Apply set operations. A user can define a query by retrieving a reference to an IQueryDefinition which appears as the RHS query in the resulting SQL statement.

The IQuery interface controls the actual execution of the query by:

  • Defining the return type of the output variables The return type of the output variables or expressions is internally deduced by the RDBMS client software. However, the user may redefine the output types, either fully by specifying an AttributeList to be used as the output buffer, or by partially by specifying the C++ type of a column in the result set.
  • Defining the number of prefetched rows or the size of the buffer on the client side for the rows if the result set. This operation can be used to reduce the number of roundtrips to the server for fetching the result of a query.
  • Locking the rows of the result set. This operation blocks other concurrent clients from updating the affected rows. This is a useful feature which serves the use cases where the protection of data consistency from parallel write/update applications is required.

On execution the IQuery interface returns a reference to an ICursor object. It is simply an iterator over the rows of the result set. The Attributes in the output buffer (AttributeList) have the values of the underlying variables updated in every iteration. The associated boolean variables expressing the NULL-ness of the values are updated as well.

2.6. Controlling the behaviour of web caches

Some RDBMS implementations may use a web cache for fast retrieval and efficient deployment of read-only data. This comes with the cost of the risk that data may become stale from some moment and onwards. Depending on the data model an application may be designed that some tables should always be refreshed before they are accessed for the first time within an application.

CORAL allows an application to fully control of web caches through the use of the IWebCacheControl interface, that can be retrieved from an IConnectionService object. This interface allows the client component to define policies related to a particular schema, as this is defined by the logical or physical connection string. The RDBMS plugins may use then the caching policies defined by the client components, through calls to the IWebCacheInfo objects corresponding to the particular schema.

3. Example of usage of the RelationalAccess API

Examples of usage of the CORAL API can be found under the CORAL integration tests:

4. Implementation specifics

The RelationalAccess package provides only the definition of abstract interfaces and the exception hierarchy. The only implementation code that exists is for the TableDescription class, which is a transient implementation of the ITableDescription and ITableSchemaEditor interfaces. This class exists in order to provide the user a means to define and create new tables.

The TableDescription class uses internally simple transient implementations of the IColumn, IPrimaryKey, IIndex, IForeignKey and IUniqueConstraint interfaces. The class itself is used also internally in the various RDBMS-specific implementation plugins.

5. Reference documentation of the RelationalAccess public interfaces