Learning AWS (Second Edition)

Addressing data extensibility requirements

A rigid database schema will not work across all your customers. Customers have their own business rules and supporting data requirements, and they will want to customize the database schema accordingly. Do not change the schema for a tenant to such an extent that your product no longer fits a SaaS model. You do, however, want to build in enough flexibility and extensibility to handle your customers' custom data requirements without impacting subsequent product upgrades or patch releases.

One approach to achieving extensibility in the database schema is to pre-allocate a set of extra fields in your tables, which your customers can then use to implement their own business requirements. All these fields can be defined as string or varchar fields. You can also create an additional metadata table that defines a field label, data type, field length, and so on, for each of these fields on a per-tenant basis. You can choose to create a metadata table per field or have a single metadata table for all the extra fields in the table. Alternatively, you can add a table-name column to the metadata table so that a single common table describes all custom fields (for each tenant) across all the tables in the schema.

This approach is depicted in the following figure. Fields 1 to 4 are defined as extra columns in the customer table. Further, the metadata table defines the field labels and data types.
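The pre-allocated field approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses an in-memory SQLite database, and the table names (customer, custom_field_meta), the column names (field1 to field4), and the custom_fields helper are hypothetical, not taken from any particular product. The extra columns are stored as text, and the metadata table supplies the per-tenant label and data type used to interpret them.

```python
import sqlite3

# Hypothetical schema: four pre-allocated text columns plus one shared
# metadata table keyed by tenant, table name, and field name.
SCHEMA = """
CREATE TABLE customer (
    tenant_id   INTEGER NOT NULL,
    customer_id INTEGER NOT NULL,
    name        TEXT    NOT NULL,
    field1 TEXT, field2 TEXT, field3 TEXT, field4 TEXT,
    PRIMARY KEY (tenant_id, customer_id)
);
CREATE TABLE custom_field_meta (
    tenant_id  INTEGER NOT NULL,
    table_name TEXT    NOT NULL,
    field_name TEXT    NOT NULL,
    label      TEXT    NOT NULL,
    data_type  TEXT    NOT NULL,
    PRIMARY KEY (tenant_id, table_name, field_name)
);
"""

EXTRA_FIELDS = ("field1", "field2", "field3", "field4")
CASTS = {"int": int, "float": float, "str": str}


def custom_fields(conn, tenant_id, customer_id):
    """Return {label: typed value} for the fields this tenant has defined."""
    meta = conn.execute(
        "SELECT field_name, label, data_type FROM custom_field_meta "
        "WHERE tenant_id = ? AND table_name = 'customer'",
        (tenant_id,),
    ).fetchall()
    row = conn.execute(
        "SELECT field1, field2, field3, field4 FROM customer "
        "WHERE tenant_id = ? AND customer_id = ?",
        (tenant_id, customer_id),
    ).fetchone()
    if row is None:
        return {}
    raw = dict(zip(EXTRA_FIELDS, row))
    # Cast each stored string using the data type recorded for this tenant.
    return {
        label: CASTS[data_type](raw[name])
        for name, label, data_type in meta
        if raw[name] is not None
    }


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO custom_field_meta VALUES (?, ?, ?, ?, ?)",
    [(1, "customer", "field1", "Loyalty points", "int"),
     (1, "customer", "field2", "Region", "str"),
     (2, "customer", "field1", "Credit limit", "float")],
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?, ?, ?, ?)",
    [(1, 100, "Alice", "120", "EMEA", None, None),
     (2, 100, "Bob", "2500.50", None, None, None)],
)
print(custom_fields(conn, 1, 100))
print(custom_fields(conn, 2, 100))
```

Note that the same physical column (field1) means "Loyalty points" for one tenant and "Credit limit" for another; only the metadata rows differ.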

A second approach uses name-value pairs: a main data table points to an intermediate table that contains the value of the field and a pointer to a metadata table, which in turn contains the field label, data type, and so on. This approach avoids the unused columns that the first approach can leave behind, but it is more complicated to implement.
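A rough sketch of the name-value pair layout follows, again using in-memory SQLite. The three table names (customer, field_def, field_value) and the extension_values helper are assumptions made for illustration only. The join also checks that the field definition belongs to the same tenant as the customer row, which is one way to keep one tenant's definitions from leaking into another's data.

```python
import sqlite3

# Hypothetical tables: main data, metadata (field_def), and the
# intermediate name-value table (field_value) linking the two.
SCHEMA = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    tenant_id   INTEGER NOT NULL,
    name        TEXT    NOT NULL
);
CREATE TABLE field_def (
    field_id  INTEGER PRIMARY KEY,
    tenant_id INTEGER NOT NULL,
    label     TEXT    NOT NULL,
    data_type TEXT    NOT NULL
);
CREATE TABLE field_value (
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    field_id    INTEGER NOT NULL REFERENCES field_def(field_id),
    value       TEXT    NOT NULL,
    PRIMARY KEY (customer_id, field_id)
);
"""

CASTS = {"int": int, "float": float, "str": str}


def extension_values(conn, tenant_id, customer_id):
    """Return {label: typed value} for one customer, scoped to one tenant."""
    rows = conn.execute(
        "SELECT d.label, d.data_type, v.value "
        "FROM customer c "
        "JOIN field_value v ON v.customer_id = c.customer_id "
        "JOIN field_def d ON d.field_id = v.field_id "
        "                AND d.tenant_id = c.tenant_id "
        "WHERE c.tenant_id = ? AND c.customer_id = ?",
        (tenant_id, customer_id),
    ).fetchall()
    return {label: CASTS[data_type](value) for label, data_type, value in rows}


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)",
                 [(1, 1, "Alice"), (2, 2, "Bob")])
conn.executemany("INSERT INTO field_def VALUES (?, ?, ?, ?)",
                 [(10, 1, "Loyalty points", "int"),
                  (20, 2, "Credit limit", "float")])
conn.executemany("INSERT INTO field_value VALUES (?, ?, ?)",
                 [(1, 10, "120"), (2, 20, "2500.50")])
print(extension_values(conn, 1, 1))
print(extension_values(conn, 2, 2))
```

Only values that a tenant actually sets occupy rows, at the cost of an extra join on every read.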

A variation on these two approaches is to define an extra field per table and store all custom name-value pairs per tenant in an XML or JSON format.
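As an illustrative sketch of this variation, the following Python code keeps all custom name-value pairs in a single JSON-encoded text column. The column name custom_data and the helpers get_custom and set_custom are hypothetical; the encoding is done in the application with the standard json module rather than with any database-specific JSON functions.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# One extra text column (custom_data, a made-up name) holds a JSON object.
conn.execute(
    "CREATE TABLE customer ("
    " tenant_id   INTEGER NOT NULL,"
    " customer_id INTEGER NOT NULL,"
    " name        TEXT    NOT NULL,"
    " custom_data TEXT    NOT NULL DEFAULT '{}',"
    " PRIMARY KEY (tenant_id, customer_id))"
)


def get_custom(conn, tenant_id, customer_id):
    """Decode the custom name-value pairs stored for one customer."""
    row = conn.execute(
        "SELECT custom_data FROM customer "
        "WHERE tenant_id = ? AND customer_id = ?",
        (tenant_id, customer_id),
    ).fetchone()
    return json.loads(row[0]) if row else {}


def set_custom(conn, tenant_id, customer_id, **pairs):
    """Merge new name-value pairs into the stored JSON document."""
    data = get_custom(conn, tenant_id, customer_id)
    data.update(pairs)
    conn.execute(
        "UPDATE customer SET custom_data = ? "
        "WHERE tenant_id = ? AND customer_id = ?",
        (json.dumps(data), tenant_id, customer_id),
    )


conn.execute(
    "INSERT INTO customer (tenant_id, customer_id, name) VALUES (1, 100, 'Alice')"
)
set_custom(conn, 1, 100, loyalty_points=120, region="EMEA")
set_custom(conn, 1, 100, region="APAC")
print(get_custom(conn, 1, 100))
```

The trade-off in this sketch is that the database cannot index or validate individual custom fields; the application carries that responsibility.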

A third approach is to add columns per tenant as required. This approach is better suited to the separate-database or separate-schema-per-tenant models. However, it should generally be avoided because it complicates the application code, which has to handle an arbitrary number of columns in a table per tenant. It can also lead to operational headaches during upgrades.

Design the custom extensions to your database schema carefully, as they can have a ripple effect on the application code and the user interface.

In addition to introducing a tenant id column in the database, if the application has web service interfaces, then these services should also include the tenant id as a parameter in their request and/or response schemas. To ensure a smooth transition between shared and isolated application instances, it is important to maintain tenant ids in the application tier. In addition, tenant-aware business rules can be encoded in a business rules engine, and tenant-specific workflows can be modeled in multi-tenant workflow engine software using Business Process Execution Language (BPEL) process templates.

In cases where you end up creating a tenant-specific web service, design it so that it has the least possible impact on your other tenants. A mediation proxy service that contains routing rules can help in this case. This service can route the requests from a particular tenant's users (specified by the tenant id in the request) to the appropriate web service implemented for that tenant.
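The routing idea can be sketched in plain Python. Everything here is hypothetical and simplified: the MediationProxy class, the handler functions, and the dictionary-shaped requests stand in for a real proxy and real web services. The point is only that the tenant id in the request selects the implementation, and tenants without a specific route fall through to the shared service.

```python
from typing import Callable, Dict

Handler = Callable[[dict], dict]


def shared_service(request: dict) -> dict:
    """Default implementation used by most tenants (illustrative)."""
    return {"tenant_id": request["tenant_id"], "handled_by": "shared"}


def acme_service(request: dict) -> dict:
    """Tenant-specific implementation for one hypothetical tenant."""
    return {"tenant_id": request["tenant_id"], "handled_by": "acme-custom"}


class MediationProxy:
    """Holds routing rules that map a tenant id to a service handler."""

    def __init__(self, default: Handler) -> None:
        self._default = default
        self._routes: Dict[str, Handler] = {}

    def register(self, tenant_id: str, handler: Handler) -> None:
        self._routes[tenant_id] = handler

    def route(self, request: dict) -> dict:
        tenant_id = request.get("tenant_id")
        if tenant_id is None:
            raise ValueError("request is missing tenant_id")
        # Fall back to the shared service when no tenant-specific rule exists.
        handler = self._routes.get(tenant_id, self._default)
        return handler(request)


proxy = MediationProxy(default=shared_service)
proxy.register("acme", acme_service)
print(proxy.route({"tenant_id": "acme", "action": "create_invoice"}))
print(proxy.route({"tenant_id": "globex", "action": "create_invoice"}))
```

Because the rules live in the proxy, adding or retiring a tenant-specific service does not require changes to the shared service or to other tenants' clients.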

Similarly, the front end or UI can be configured for each tenant to provide a customized look-and-feel (for example, CSS files per tenant), tenant-specific logos, and color schemes. Where tenant UIs differ, portal servers can be used to serve up the appropriate portlets.

If different service levels need to be supported across tenants, then an instance of the application can be deployed on separate infrastructure for your higher-end customers. The isolation provided at the application layer (and the underlying infrastructure) helps avoid tenants impacting each other by consuming more CPU or memory resources than originally planned.

Logging also needs to be tenant-aware (that is, include the tenant id in your log record format). You can also provision other resources, such as queues, file directories, directory servers, caches, and so on, for each of your tenants. These can be set up in dedicated or separate application stacks (per tenant). In all cases, make use of the tenant id filter for maximum flexibility.
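One possible way to make log records tenant-aware in Python is to attach a logging filter that stamps every record with the current tenant id. This is a minimal sketch under assumptions: the logger name saas.app, the current_tenant context variable, and the log format are invented for illustration, and a real application would set the tenant id when it authenticates each request.

```python
import contextvars
import io
import logging

# Holds the tenant id for the request currently being processed.
current_tenant = contextvars.ContextVar("current_tenant", default="-")


class TenantFilter(logging.Filter):
    """Adds a tenant_id attribute to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.tenant_id = current_tenant.get()
        return True  # never drop the record, only annotate it


def make_logger(stream) -> logging.Logger:
    logger = logging.getLogger("saas.app")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s tenant=%(tenant_id)s %(message)s")
    )
    handler.addFilter(TenantFilter())
    logger.handlers = [handler]
    return logger


buffer = io.StringIO()
log = make_logger(buffer)

token = current_tenant.set("tenant-42")  # e.g. at the start of a request
log.info("invoice created")
current_tenant.reset(token)              # e.g. at the end of the request
log.info("housekeeping")

print(buffer.getvalue(), end="")
```

With the tenant id in every record, log lines can later be filtered or routed per tenant without changing the call sites that emit them.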