December 28, 2004

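Jinyang Net (金羊网), 2004-12-09 09:00:24
New Express report (reporter Zhou Jijian): Couples in Guangzhou will be able to register their marriages regardless of weekend and public-holiday closures at marriage registries! This reporter learned yesterday from Guangzhou's civil affairs authorities that the city has begun a pilot in five old urban districts under which marriage registries handle registrations as usual on weekends and statutory holidays.

It is understood that in October this year the Guangdong Provincial Department of Civil Affairs first piloted the scheme in Bao'an District, Shenzhen, adding one to two working days on weekends and statutory holidays so that every resident applying for marriage registration can complete the formalities on the same day.

Reportedly, the Guangzhou pilot is already under way in five districts: Tianhe, Yuexiu, Liwan, Dongshan, and Haizhu. However, the registries have not added staff and cannot yet handle a large volume of registrations, so residents who genuinely need the service can book an appointment in advance.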

 

December 26, 2004

 Spring.NET – .NET Application Framework

Some introductory material on Spring.NET

Exploring the truth about using the Spring framework

IoC hidden in .NET?

SpringFramework Chinese forum

by Scott W. Ambler, Copyright 2003-2004

This essay is taken from Chapter 14 of Agile Database Techniques

 

Most modern business application development projects use object technology such as Java or C# to build the application software and relational databases to store the data.  This isn’t to say that you don’t have other options: there are many applications built with procedural languages such as COBOL, and many systems use object databases or XML databases to store data.  However, because object and relational technologies are by far the norm, that’s what I assume you’re working with in this chapter.  If you’re working with different storage technologies then many of the concepts are still applicable, albeit with modification (don’t worry, Realistic XML overviews mapping issues pertaining to objects and XML).

There is an impedance mismatch between object and relational technology, the technologies that project teams commonly use to build software-based systems.  It is quite easy to overcome this impedance mismatch.  The secret is twofold:  you need to understand the process of mapping objects to relational databases and you need to understand how to implement those mappings.  In this chapter the term “mapping” will be used to refer to how objects and their relationships are mapped to the tables and relationships between them in a database.  As you’ll soon find out, it isn’t quite as straightforward as it sounds, although it isn’t too bad either.

 


 

1. The Role of the Agile DBA

Figure 1 shows the role that an Agile DBA plays when it comes to mapping objects to relational databases.  There are three primary activities that we are interested in:

  1. Mapping.  The basic goal is to determine an effective strategy for persisting objects’ data.  This includes saving both the data attributes of individual objects and the relationships between objects, all the while respecting the inheritance structures between classes.

  2. Implementing mappings.

  3. Performance tuning. 

An interesting thing to note about Figure 1 is that Agile DBAs and application developers work together on all three activities.  Although the Agile DBA may be responsible for ensuring the mappings are effective, they’re not solely responsible for the actual effort.  Working with others, not working alone, is the secret to success in agile software development.

 

Figure 1. The role of the Agile DBA when mapping.

 

2. Basic Concepts

When learning how to map objects to relational databases the place to start is with the data attributes of a class.  An attribute will map to zero or more columns in a relational database.  Remember, not all attributes are persistent; some are used for temporary calculations.  For example, a Student object may have an averageMark attribute that is needed within your application but isn’t saved to the database because it is calculated by the application.  Some attributes of an object are objects in their own right: a Customer object, for example, may have an Address object as an attribute.  This really reflects an association between the two classes that would likely need to be mapped, and the attributes of the Address class itself will need to be mapped.  The important thing is that this is a recursive definition: at some point the attribute will be mapped to zero or more columns.

The easiest mapping you will ever have is a property mapping of a single attribute to a single column.  It is even simpler when they each have the same basic type, e.g. they’re both dates, the attribute is a string and the column is a char, or the attribute is a number and the column is a float.

 

Mapping Terminology

Mapping (v).  The act of determining how objects and their relationships are persisted in permanent data storage, in this case relational databases. 

Mapping (n). The definition of how an object’s property or a relationship is persisted in permanent storage.

Property.  A data attribute, either implemented as a physical attribute such as the string firstName or as a virtual attribute implemented via an operation such as getTotal() which returns the total of an order.

Property mapping.  A mapping that describes how to persist an object’s property.

Relationship mapping.  A mapping that describes how to persist a relationship (association, aggregation, or composition) between two or more objects.

It can make it easier to think that classes map to tables, and in a way they do, but not always directly.   Except for very simple databases you will never have a one-to-one mapping of classes to tables, something you will see later in this chapter with regards to inheritance mapping.  However, a common theme that you will see throughout this chapter is that a one class to one table mapping is preferable for your initial mapping (performance tuning may motivate you to refactor your mappings).

For now, let’s keep things simple.  Figure 2 depicts two models, a UML class diagram and a physical data model which follows the UML data modeling profile.  Both diagrams depict a portion of a simple schema for an order system.  You can see how the attributes of the classes could be mapped to the columns of the database.  For example, it appears that the dateFulfilled attribute of the Order class maps to the DateFulfilled column of the Order table and that the numberOrdered attribute of the OrderItem class maps to the NumberOrdered column of the OrderItem table.

 

Figure 2. Simple mapping example.

Note that these initial property mappings were easy to determine for several reasons.  First, similar naming standards were used in both models, an aspect of Agile Modeling (AM)’s Apply Modeling Standards practice.  Second, it is very likely that the same people created both models.  When people work in separate teams it is quite common for their solutions to vary, even when the teams do a very good job, because they make different design decisions along the way.  Third, one model very likely drove the development of the other model.  In Different Projects Require Different Strategies I argued that when you are building a new system your object schema should drive the development of your database schema.


Even though the two schemas depicted in Figure 2 are very similar there are differences.  These differences mean that the mapping isn’t going to be perfect.  The differences between the two schemas are:

  • There are several attributes for tax in the object schema yet only one in the data schema.  The three attributes for tax in the Order class presumably should be added up and stored in the tax column of the Order table when the object is saved.  When the object is read into memory, however, the three attributes would need to be calculated (or a lazy initialization approach would need to be taken and each attribute would be calculated when it is first accessed).  A schema difference such as this is a good indication that the database schema needs to be refactored to split the tax column into three.

  • The data schema indicates keys whereas the object schema does not.  Rows in tables are uniquely identified by primary keys and relationships between rows are maintained through the use of foreign keys.  Relationships to objects, on the other hand, are implemented via references to those objects, not through foreign keys.  The implication is that, in order to fully persist the objects and their relationships, the objects need to know about the key values used in the database to identify them.  This additional information is called “shadow information”.

  • Different types are used in each schema. The subTotalBeforeTax attribute of Order is of the type Currency whereas the SubTotalBeforeTax column of the Order table is a float.  When you implement this mapping you will need to be able to convert back and forth between these two representations without loss of information.

 

2.1. Shadow Information

Shadow information is any data that objects need to maintain, above and beyond their normal domain data, to persist themselves.  This typically includes primary key information, particularly when the primary key is a surrogate key that has no business meaning, concurrency control markings such as timestamps or incremental counters, and versioning numbers.  For example, in Figure 2 you see that the Order table has an OrderID column used as a primary key and a LastUpdate column that is used for optimistic concurrency control that the Order class does not have.  To persist an order object properly the Order class would need to implement shadow attributes that maintain these values.  

Figure 3 shows a detailed design class model for the Order and OrderItem classes.  There are several changes from Figure 2.  First, the new diagram shows the shadow attributes that the classes require to properly persist themselves.  Shadow attributes have an implementation visibility (there is a space in front of the name instead of a minus sign) and are assigned the stereotype <<persistence>> (this is not a UML standard).  Second, it shows the scaffolding attributes required to implement the relationship between the two classes.  Scaffolding attributes, such as the orderItems vector in Order, also have an implementation visibility.  Third, a getTotalTax() operation was added to the Order class to calculate the value required for the tax column of the Order table.  This is why I use the term property mapping instead of attribute mapping – what you really want to do is map the properties of a class, which sometimes are implemented as simple attributes and other times as one or more operations, to the columns of a database.

 

Figure 3. Including “shadow information” on a class diagram.

One type of shadow information that I have not discussed yet is a boolean flag to indicate whether an object currently exists in the database.  The problem is that when you save data to a relational database you need to use a SQL update statement if the object was previously retrieved from the database and a SQL insert statement if the data does not already exist.  A common practice is for each class to implement an isPersistent boolean flag, not shown in Figure 3, that is set to true when the data is read in from the database and set to false when the object is newly created.
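The insert-versus-update decision described above can be sketched in a few lines.  This is only an illustration under stated assumptions, not the implementation behind Figure 3: it uses Python with the standard sqlite3 module, a cut-down Order table containing only the OrderID, DateOrdered, and LastUpdate columns, an incremental counter as the concurrency marking, and hypothetical helper names (create_schema, save, load).

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Order:
    date_ordered: str                # domain data
    # Shadow information: needed only so the object can persist itself.
    order_id: Optional[int] = None   # surrogate primary key
    last_update: int = 0             # counter for optimistic concurrency control
    is_persistent: bool = False      # True once a matching row exists


def create_schema(conn: sqlite3.Connection) -> None:
    # "Order" is quoted because ORDER is a reserved word in SQL.
    conn.execute(
        'CREATE TABLE "Order" ('
        "OrderID INTEGER PRIMARY KEY, DateOrdered TEXT, LastUpdate INTEGER)"
    )


def save(conn: sqlite3.Connection, order: Order) -> None:
    if order.is_persistent:
        # The object came from the database, so update its existing row.
        cur = conn.execute(
            'UPDATE "Order" SET DateOrdered = ?, LastUpdate = ? '
            "WHERE OrderID = ? AND LastUpdate = ?",
            (order.date_ordered, order.last_update + 1,
             order.order_id, order.last_update),
        )
        if cur.rowcount != 1:
            raise RuntimeError("row was changed or removed by someone else")
        order.last_update += 1
    else:
        # The object is new, so insert a row and record the generated key.
        cur = conn.execute(
            'INSERT INTO "Order" (DateOrdered, LastUpdate) VALUES (?, ?)',
            (order.date_ordered, order.last_update),
        )
        order.order_id = cur.lastrowid
        order.is_persistent = True


def load(conn: sqlite3.Connection, order_id: int) -> Order:
    row = conn.execute(
        'SELECT DateOrdered, OrderID, LastUpdate FROM "Order" WHERE OrderID = ?',
        (order_id,),
    ).fetchone()
    # Objects read from the database start out flagged as persistent.
    return Order(row[0], row[1], row[2], is_persistent=True)
```

Calling save() twice on the same object issues one insert followed by one update, because the flag flips after the first call.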

It is a common style convention in the UML community to not show shadow information, such as keys and concurrency markings, on class diagrams.  Similarly, the common convention is to not model scaffolding code either. The idea is that everyone knows you need to do this sort of thing, so why waste your time modeling the obvious? 

Shadow information doesn’t necessarily need to be implemented by the business objects, although your application will need to take care of it somehow.  For example, with Enterprise JavaBeans (EJBs) you store primary key information outside of EJBs in primary key classes; the individual object references a corresponding primary key object.  The Java Data Object (JDO) approach goes one step further and implements shadow information in the JDOs and not the business objects.

 

2.2 Mapping Meta Data

Figure 4 depicts the meta data representing the property mappings required to persist the Order and OrderItem classes of Figure 3.  Meta data is information about data.  Figure 4 is important for several reasons.  First, we need some way to represent mappings.  We could put two schemas side by side, as you see in Figure 2, and then draw lines between them but that gets complicated very quickly.  Another option is a tabular representation that you see in Figure 4.  Second, the concept of mapping meta data is critical to the functioning of persistence frameworks which are a database encapsulation strategy that can enable agile database techniques.

 

Figure 4. Meta data representing the property maps.

Property                                Column
--------------------------------------  ------------------------
Order.orderID                           Order.OrderID
Order.dateOrdered                       Order.DateOrdered
Order.dateFulfilled                     Order.DateFulfilled
Order.getTotalTax()                     Order.Tax
Order.subtotalBeforeTax                 Order.SubtotalBeforeTax
Order.shipTo.personID                   Order.ShipToContactID
Order.billTo.personID                   Order.BillToContactID
Order.lastUpdate                        Order.LastUpdate
OrderItem.ordered                       OrderItem.OrderID
Order.orderItems.position(orderItem)    OrderItem.ItemSequence
OrderItem.item.number                   OrderItem.ItemNo
OrderItem.numberOrdered                 OrderItem.NumberOrdered
OrderItem.lastUpdate                    OrderItem.LastUpdate

 

The naming convention that I’m using is reasonably straightforward: Order.dateOrdered refers to the dateOrdered attribute of the Order class.  Similarly Order.DateOrdered refers to the DateOrdered column of the Order table.  Order.getTotalTax() refers to the getTotalTax() operation of Order and Order.billTo.personID is the personID attribute of the Person object referenced by the Order.billTo attribute.  Likely the most difficult property to understand is Order.orderItems.position(orderItem) which refers to the position within the Order.orderItems vector of the instance of OrderItem that is being saved.

Figure 4 hints at an important part of the technical impedance mismatch between object technology and relational technology.  Classes implement both behavior and data whereas relational database tables just implement data.  The end result is that when you’re mapping the properties of classes into a relational database you end up mapping operations such as getTotalTax() and position() to columns.  Although it didn’t happen in this example, you often need to map two operations that represent a single property to a column – one operation to set the value, e.g. setFirstName(), and one operation to retrieve the value, e.g. getFirstName().  These operations are typically called setters and getters respectively, or sometimes mutators and accessors.

Whenever a key column is mapped to a property of a class, such as the mapping between OrderItem.ItemSequence and Order.orderItems.position(orderItem), this is really part of the effort of relationship mapping, discussed later in this chapter.  This is because keys implement relationships in relational databases. 
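To make the idea of mapping meta data concrete, here is a minimal sketch of how a few of the property-to-column pairs from Figure 4 might be held as data and used to turn an object into a row.  It is written in Python for brevity and rests on several assumptions: the three tax attribute names (federal_tax, state_tax, local_tax) are invented, since the text only says there are three; the helper names read_property and to_row are hypothetical; and a real persistence framework would do considerably more.

```python
from dataclasses import dataclass


@dataclass
class Person:
    person_id: int


@dataclass
class Order:
    order_id: int
    date_ordered: str
    federal_tax: float   # hypothetical names for the three tax attributes
    state_tax: float
    local_tax: float
    bill_to: Person

    def get_total_tax(self) -> float:
        # An operation, not an attribute, supplies the value for the Tax column.
        return self.federal_tax + self.state_tax + self.local_tax


# Mapping meta data: (property path, column name) pairs, as in Figure 4.
ORDER_MAPPINGS = [
    ("order_id", "OrderID"),
    ("date_ordered", "DateOrdered"),
    ("get_total_tax()", "Tax"),
    ("bill_to.person_id", "BillToContactID"),
]


def read_property(obj, path):
    """Follow a dotted path, calling any step that ends in '()'."""
    value = obj
    for step in path.split("."):
        if step.endswith("()"):
            value = getattr(value, step[:-2])()
        else:
            value = getattr(value, step)
    return value


def to_row(obj, mappings):
    """Build a column-name-to-value dictionary from the mapping meta data."""
    return {column: read_property(obj, path) for path, column in mappings}
```

Because the mappings are plain data, changing which property feeds a column means editing the list rather than the classes, which is the point of keeping the meta data separate.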

 

2.3 How Mapping Fits Into The Overall Process

See the essay Evolutionary Development.

 

3. Mapping Inheritance Structures

Relational databases do not natively support inheritance, forcing you to map the inheritance structures within your object schema to your data schema.  Although there is somewhat of a backlash against inheritance within the object community, due in most part to the fragile base class problem, my experience is that this problem is due more to poor encapsulation practices among object developers than to the concept of inheritance (Ambler 2001a).  What I’m saying is that the fact you need to do a little bit of work to map an inheritance hierarchy into a relational database shouldn’t dissuade you from using inheritance where appropriate.

The concept of inheritance throws in several interesting twists when saving objects into a relational DB.  How do you organize the inherited attributes within your data model?  In this section you’ll see that there are three primary solutions for mapping inheritance into a relational database, and a fourth supplementary technique that goes beyond inheritance mapping.  These techniques are:

  • Map the entire class hierarchy to a single table

  • Map each concrete class to its own table

  • Map each class to its own table

  • Map the classes into a generic table structure

To explore each technique I will discuss how to map the two versions of the class hierarchy presented in Figure 6.  The first version depicts three classes – Person, an abstract class, and two concrete classes, Employee and Customer.  You know that Person is abstract because its name is shown in italics.  In older versions of the UML the constraint “{abstract}” would have been used instead.  The second version of the hierarchy adds a fourth concrete class to the hierarchy, Executive.  The idea is that you have implemented the first class hierarchy and are now presented with a new requirement to support giving executives, but not non-executive employees, fixed annual bonuses.  The Executive class was added to support this new functionality.

For the sake of simplicity I have not modeled all of the attributes of the classes, nor have I modeled their full signatures, nor have I modeled any of the operations.  This diagram is just barely good enough for my purpose, in other words it is an agile model.  Furthermore these hierarchies could be improved by applying the Party analysis pattern (Fowler 1997) or the Business Entity (Ambler 1997) analysis pattern.  I haven’t done this because I need a simple example to explain mapping inheritance hierarchies, not to explain the effective application of analysis patterns – I always follow Agile Modeling (AM)’s Model With A Purpose principle.

Figure 6.  Two versions of a simple class hierarchy.

Inheritance can also be a problem when it’s misapplied – for example, the hierarchy in Figure 6 could be better modeled via the Party (Hay 1996, Fowler 1997) or the Business Entity (Ambler 1997) patterns.  For example, if someone can be both a customer and an employee you would have two objects in memory for them, which may be problematic for your application.  I’ve chosen this example because I needed a simple, easy to understand class hierarchy to map.

3.1 Map Hierarchy To A Single Table

Following this strategy you store all the attributes of the classes in one table.  Figure 7 depicts the data model for the class hierarchies of Figure 6 when this approach is taken.  The attributes of each of the classes are stored in the table Person in a very straightforward manner; a good table naming strategy is to use the name of the hierarchy’s root class.

 

Figure 7. Mapping to a single table.

 

Two columns have been added to the table – PersonPOID and PersonType.  The first column is the primary key for the table, you know this because of the <<PK>> stereotype, and the second is a code indicating whether the person is a customer, an employee, or perhaps both.  PersonPOID is a persistent object identifier (POID), often simply called an object identifier (OID), which is a surrogate key.  I could have used the optional stereotype of <<Surrogate>> to indicate this but chose not to as POID implies this, therefore indicating the stereotype would only serve to complicate the diagram (follow the AM practice Depict Models Simply).   Data Modeling 101 discusses surrogate keys in greater detail.

The PersonType column is required to identify the type of object that can be instantiated from a given row.  For example the value of E would indicate the person is an employee, C would indicate customer, and B would indicate both.  Although this approach is straightforward it tends to break down as the number of types and combinations begins to grow.  For example, when you add the concept of executives you need to add a code value, perhaps X, to represent this.  Now the value of B, representing both, is sort of goofy.  Furthermore you might have combinations involving executives now, for example it seems reasonable that someone can be both an executive and a customer so you’d need a code for this.   When you discover that combinations are possible you should consider applying the Replace Type Code With Booleans database refactoring, as you see in Figure 8.

For the sake of simplicity I did not include columns for concurrency control, such as the time stamp column included in the tables of Figure 3, nor did I include columns for data versioning.

Figure 8. A refactored approach.
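A rough sketch of the single-table strategy with a type code (the Figure 7 form, before the boolean refactoring) might look like the following.  It uses Python and sqlite3 purely for illustration.  The PersonPOID and PersonType columns and the C and E codes come from the text; the business attributes (name, preferences, salary) and the function names are assumptions, since the figures are not reproduced here.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Person:
    name: str


@dataclass
class Customer(Person):
    preferences: str = ""   # assumed attribute


@dataclass
class Employee(Person):
    salary: float = 0.0     # assumed attribute


# Type codes stored in PersonType, taken from the text.
TYPE_CODES = {"C": Customer, "E": Employee}


def create_schema(conn):
    # One table holds the columns of every class in the hierarchy.
    conn.execute(
        "CREATE TABLE Person ("
        "PersonPOID INTEGER PRIMARY KEY, PersonType TEXT NOT NULL, "
        "Name TEXT, Preferences TEXT, Salary REAL)"
    )


def save(conn, person):
    code = next(c for c, cls in TYPE_CODES.items() if type(person) is cls)
    cur = conn.execute(
        "INSERT INTO Person (PersonType, Name, Preferences, Salary) "
        "VALUES (?, ?, ?, ?)",
        # Columns that do not apply to this subclass are stored as NULL.
        (code, person.name,
         getattr(person, "preferences", None),
         getattr(person, "salary", None)),
    )
    return cur.lastrowid


def load(conn, poid):
    ptype, name, preferences, salary = conn.execute(
        "SELECT PersonType, Name, Preferences, Salary "
        "FROM Person WHERE PersonPOID = ?",
        (poid,),
    ).fetchone()
    # The type code decides which class to instantiate from the row.
    if ptype == "C":
        return Customer(name, preferences)
    return Employee(name, salary)
```

The NULL written into Salary for every customer row is the potentially wasted space this strategy trades for simplicity.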

 

 

3.2 Map Each Concrete Class To Its Own Table

With this approach a table is created for each concrete class, each table including both the attributes implemented by the class and its inherited attributes.  Figure 9 depicts the physical data model for the class hierarchy of Figure 6 when this approach is taken.  There are tables corresponding to each of the Customer and Employee classes because they are concrete, objects are instantiated from them, but not Person because it is abstract.  Each table was assigned its own primary key, customerPOID and employeePOID respectively.  To support the addition of Executive all I needed to do was add a corresponding table with all of the attributes required by executive objects.

 

Figure 9. Mapping concrete classes to tables.
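One way to see what this strategy implies for the schema is to derive each table’s columns from the class hierarchy.  The sketch below is illustrative only: the attribute names are assumptions, executivePOID simply follows the customerPOID and employeePOID naming pattern mentioned above, and the table_columns, ddl, and create_tables helpers are hypothetical.

```python
import sqlite3
from dataclasses import dataclass, fields


@dataclass
class Person:  # abstract in the model, so it gets no table of its own
    name: str


@dataclass
class Customer(Person):
    preferences: str


@dataclass
class Employee(Person):
    salary: float


@dataclass
class Executive(Employee):
    bonus: float


CONCRETE_CLASSES = [Customer, Employee, Executive]


def table_columns(cls):
    """Key column first, then inherited attributes, then the class's own."""
    key = cls.__name__.lower() + "POID"
    # fields() lists base-class fields before subclass fields.
    return [key] + [f.name for f in fields(cls)]


def ddl(cls):
    key, *attributes = table_columns(cls)
    return "CREATE TABLE {} ({} INTEGER PRIMARY KEY, {})".format(
        cls.__name__, key, ", ".join(attributes)
    )


def create_tables(conn):
    # One table per concrete class; nothing is created for Person.
    for cls in CONCRETE_CLASSES:
        conn.execute(ddl(cls))
```

Note that the name column is repeated in all three tables, so adding an attribute to Person would mean altering every one of them.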

 

3.3 Map Each Class To Its Own Table

Following this strategy you create one table per class, with one column per business attribute and any necessary identification information (as well as other columns required for concurrency control and versioning).  Figure 10 depicts the physical data model for the class hierarchy of Figure 6 when each class is mapped to a single table.  The data for the Customer class is stored in two tables, Customer and Person, therefore to retrieve this data you would need to join the two tables (or do two separate reads, one to each table).

The application of keys is interesting.  Notice how personPOID is used as the primary key for all of the tables. For the Customer, Employee, and Executive tables the personPOID is both a primary key and a foreign key.  In the case of Customer, personPOID is its primary key and a foreign key used to maintain the relationship to the Person table.  This is indicated by application of two stereotypes, <<PK>> and <<FK>>.  In older versions of the UML it wasn’t permissible to assign several stereotypes to a single model element but this restriction was lifted in UML version 1.4.

 

Figure 10.  Mapping each class to its own table.

A common modification that you may want to consider is the addition of a type column, or boolean columns as the case may be, in the Person table to indicate the applicable subtypes of the person.  Although this is additional overhead it makes some types of queries easier.  The addition of views is also an option in many cases, an approach that I prefer over the addition of type or boolean columns because they are easier to maintain.
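The join and the view option can be sketched as follows, again in Python with sqlite3 and only for the Person and Customer part of the hierarchy.  The personPOID key shared by both tables comes from the text; the name and preferences columns, the CustomerView name, and the helper functions are assumptions made for the example.

```python
import sqlite3


def create_schema(conn):
    conn.executescript("""
        CREATE TABLE Person (personPOID INTEGER PRIMARY KEY, name TEXT);
        -- personPOID is both the primary key and a foreign key to Person.
        CREATE TABLE Customer (
            personPOID INTEGER PRIMARY KEY REFERENCES Person(personPOID),
            preferences TEXT);
        -- A view that simulates a single Customer table for queries.
        CREATE VIEW CustomerView AS
            SELECT p.personPOID, p.name, c.preferences
            FROM Person p JOIN Customer c ON c.personPOID = p.personPOID;
    """)


def save_customer(conn, name, preferences):
    # Writing one Customer object touches two tables.
    poid = conn.execute(
        "INSERT INTO Person (name) VALUES (?)", (name,)
    ).lastrowid
    conn.execute(
        "INSERT INTO Customer (personPOID, preferences) VALUES (?, ?)",
        (poid, preferences),
    )
    return poid


def load_customer(conn, poid):
    # Reading goes through the view, which performs the join.
    return conn.execute(
        "SELECT name, preferences FROM CustomerView WHERE personPOID = ?",
        (poid,),
    ).fetchone()
```

The view keeps the extra join out of reporting queries without adding a type column to Person.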

 

3.4 Map Classes To A Generic Table Structure

A fourth option for mapping inheritance structures into a relational database is to take a generic, sometimes called meta-data driven, approach to mapping your classes.  This approach isn’t specific to inheritance structures; it supports all forms of mapping.  In Figure 11 you see a data schema for storing the value of attributes and for traversing inheritance structures.  The schema isn’t complete (it could be extended to map associations, for example) but it’s sufficient for our purposes.  The value of a single attribute is stored in the Value table, therefore to store an object with ten business attributes there would be ten records, one for each attribute.  The Value.ObjectPOID column stores the unique identifier for the specific object (this approach assumes a common key strategy across all objects, when this isn’t the case you’ll need to extend this table appropriately).  The AttributeType table contains rows for basic data types such as date, string, money, integer and so on.  This information is required to convert the value of the object attribute into the varchar stored in Value.Value.

 

Figure 11. A generic data schema for storing objects.

Let’s work through an example of mapping a single class to this schema.  To store the OrderItem class in Figure 3 there would be three records in the Value table.  One to store the value for the number of items ordered, one to store the value of the OrderPOID that this order item is part of, and one to store the value of the ItemPOID that describes the order item.  You may decide to have a fourth row to store the value of the lastUpdated shadow attribute if you’re taking an optimistic locking approach to concurrency control. The Class table would include a row for the OrderItem class and the Attribute table would include one row for each attribute stored in the database (in this case either 3 or 4 rows). 

Now let’s map the inheritance structure between Person and Customer, shown in Figure 6, into this schema.  The Inheritance table is the key to inheritance mapping.  Each class would be represented by a row in the Class table.  There would also be a row in the Inheritance table, the value of Inheritance.SuperClassPOID would refer to the row in Class representing Person and Inheritance.SubClassPOID would refer to the row in Class representing Customer. To map the rest of the hierarchy you require one row in Inheritance for each inheritance relationship.
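The walkthrough above can be approximated in code.  The sketch below is a guess at a workable subset of the Figure 11 schema, not a reproduction of it: the table names and the Value.ObjectPOID, Value.Value, Inheritance.SuperClassPOID, and Inheritance.SubClassPOID columns come from the text, while every other column, the two attribute types, the loyalty_points attribute, and the function names are assumptions.  It also assumes single inheritance.

```python
import sqlite3

# Converters keyed by AttributeType name turn stored text back into values.
CONVERTERS = {"integer": int, "string": str}


def create_schema(conn):
    conn.executescript('''
        CREATE TABLE Class (ClassPOID INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE Inheritance (SuperClassPOID INTEGER, SubClassPOID INTEGER);
        CREATE TABLE AttributeType (AttributeTypePOID INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE Attribute (AttributePOID INTEGER PRIMARY KEY,
            ClassPOID INTEGER, AttributeTypePOID INTEGER, Name TEXT);
        CREATE TABLE Value (ObjectPOID INTEGER, AttributePOID INTEGER, Value TEXT);
    ''')
    for type_name in CONVERTERS:
        conn.execute("INSERT INTO AttributeType (Name) VALUES (?)", (type_name,))


def define_class(conn, name, attributes, superclass_poid=None):
    """Add one Class row, its Attribute rows, and an optional Inheritance row."""
    class_poid = conn.execute(
        "INSERT INTO Class (Name) VALUES (?)", (name,)
    ).lastrowid
    if superclass_poid is not None:
        conn.execute(
            "INSERT INTO Inheritance VALUES (?, ?)", (superclass_poid, class_poid)
        )
    for attr_name, type_name in attributes.items():
        type_poid = conn.execute(
            "SELECT AttributeTypePOID FROM AttributeType WHERE Name = ?",
            (type_name,),
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO Attribute (ClassPOID, AttributeTypePOID, Name) "
            "VALUES (?, ?, ?)",
            (class_poid, type_poid, attr_name),
        )
    return class_poid


def class_chain(conn, class_poid):
    """Return the class followed by its superclasses, nearest first."""
    chain = [class_poid]
    while True:
        row = conn.execute(
            "SELECT SuperClassPOID FROM Inheritance WHERE SubClassPOID = ?",
            (chain[-1],),
        ).fetchone()
        if row is None:
            return chain
        chain.append(row[0])


def save_object(conn, object_poid, class_poid, values):
    # One Value row per attribute, including inherited attributes.
    for poid in class_chain(conn, class_poid):
        attrs = conn.execute(
            "SELECT AttributePOID, Name FROM Attribute WHERE ClassPOID = ?",
            (poid,),
        ).fetchall()
        for attr_poid, name in attrs:
            conn.execute(
                "INSERT INTO Value VALUES (?, ?, ?)",
                (object_poid, attr_poid, str(values[name])),
            )


def load_object(conn, object_poid):
    rows = conn.execute(
        "SELECT a.Name, t.Name, v.Value FROM Value v "
        "JOIN Attribute a ON a.AttributePOID = v.AttributePOID "
        "JOIN AttributeType t ON t.AttributeTypePOID = a.AttributeTypePOID "
        "WHERE v.ObjectPOID = ?",
        (object_poid,),
    ).fetchall()
    return {name: CONVERTERS[type_name](text) for name, type_name, text in rows}
```

Even this small version shows why the approach reads many rows to rebuild a single object: loading a customer with two attributes touches two Value rows plus the meta data tables.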

 

3.5 Mapping Multiple Inheritance

Until this point I have focused on mapping single inheritance hierarchies.  Single inheritance occurs when a subclass such as Customer inherits directly from a single parent class such as Person.  Multiple inheritance occurs when a subclass has two or more direct superclasses, such as Dragon directly inheriting from both Bird and Lizard in Figure 12.  Multiple inheritance is generally seen as a questionable feature of an object-oriented language (since 1990 I have only seen one domain problem where multiple inheritance made sense), and as a result most languages choose not to support it.  However, languages such as C++ and Eiffel do support it so you may find yourself in a situation where you need to map a multiple inheritance hierarchy to a relational database.

Figure 12 shows the three data schemas that would result from applying each of the three inheritance mapping strategies.  As you can see mapping multiple inheritance is fairly straightforward, there aren’t any surprises in Figure 12.  The greatest challenge in my experience is to identify a reasonable table name when mapping the hierarchy into a single table, in this case Creature made the most sense.

Figure 12. Mapping multiple inheritance.

 

3.6 Comparing The Strategies

None of these mapping strategies is ideal for all situations, as you can see in Table 1.  My experience is that the easiest strategy to work with is to have one table per hierarchy at first, then refactor your schema accordingly if you need to.  Sometimes I’ll start by applying the one table per class strategy whenever my team is motivated to work with a “pure design approach”.  I stay away from using one table per concrete class because it typically results in the need to copy data back and forth between tables, forcing me to refactor it reasonably early in the life of the project anyway. I rarely use the generic schema approach because it simply doesn’t scale very well.

It is important to understand that you can combine the first three strategies – one table per hierarchy, one table per concrete class, and one table per class – in any given application.  You can even combine these strategies in a single, large hierarchy.

 

Table 1. Comparing the inheritance mapping strategies.

Strategy: One table per hierarchy

Advantages:

  • Simple approach.

  • Easy to add new classes, you just need to add new columns for the additional data.

  • Supports polymorphism by simply changing the type of the row.

  • Data access is fast because the data is in one table.

  • Ad-hoc reporting is very easy because all of the data is found in one table.

Disadvantages:

  • Coupling within the class hierarchy is increased because all classes are directly coupled to the same table.  A change in one class can affect the table which can then affect the other classes in the hierarchy.

  • Space potentially wasted in the database.

  • Indicating the type becomes complex when significant overlap between types exists.

  • Table can grow quickly for large hierarchies.

When to Use: This is a good strategy for simple and/or shallow class hierarchies where there is little or no overlap between the types within the hierarchy.

Strategy: One table per concrete class

Advantages:

  • Easy to do ad-hoc reporting as all the data you need about a single class is stored in only one table.

  • Good performance to access a single object’s data.

Disadvantages:

  • When you modify a class you need to modify its table and the table of any of its subclasses.  For example if you were to add height and weight to the Person class you would need to add columns to the Customer, Employee, and Executive tables.

  • Whenever an object changes its role, perhaps you hire one of your customers, you need to copy the data into the appropriate table and assign it a new POID value (or perhaps you could reuse the existing POID value).

  • It is difficult to support multiple roles and still maintain data integrity.  For example, where would you store the name of someone who is both a customer and an employee?

When to Use: When changing types and/or overlap between types is rare.

Strategy: One table per class

Advantages:

  • Easy to understand because of the one-to-one mapping.

  • Supports polymorphism very well as you merely have records in the appropriate tables for each type.

  • Very easy to modify superclasses and add new subclasses as you merely need to modify/add one table.

  • Data size grows in direct proportion to growth in the number of objects.

Disadvantages:

  • There are many tables in the database, one for every class (plus tables to maintain relationships).

  • Potentially takes longer to read and write data using this technique because you need to access multiple tables.  This problem can be alleviated if you organize your database intelligently by putting each table within a class hierarchy on different physical disk-drive platters (this assumes that the disk-drive heads all operate independently).

  • Ad-hoc reporting on your database is difficult, unless you add views to simulate the desired tables.

When to Use: When there is significant overlap between types or when changing types is common.

Strategy: Generic schema

Advantages:

  • Works very well when database access is encapsulated by a robust persistence framework.

  • It can be extended to provide meta data to support a wide range of mappings, including relationship mappings.  In short, it is the start of a mapping meta data engine.

  • It is incredibly flexible, enabling you to quickly change the way that you store objects because you merely need to update the meta data stored in the Class, Inheritance, Attribute, and AttributeType tables accordingly.

Disadvantages:

  • Very advanced technique that can be difficult to implement at first.

  • It only works for small amounts of data because you need to access many database rows to build a single object.

  • You will likely want to build a small administration application to maintain the meta data.

  • Reporting against this data can be very difficult due to the need to access several rows to obtain the data for a single object.

When to Use: For complex applications that work with small amounts of data, or for applications where data access isn’t very common or you can pre-load data into caches.

 

 

4. Mapping Object Relationships

In addition to property and inheritance mapping you need to understand the art of relationship mapping.  There are three types of object relationships that you need to map: association, aggregation, and composition.  For now, I’m going to treat these three types of relationship the same – they are mapped the same way although there are interesting nuances when it comes to referential integrity. 

 

4.1. Types of Relationships

There are two categories of object relationships that you need to be concerned with when mapping.  The first category is based on multiplicity and it includes three types:

  • One-to-one relationships.  This is a relationship where the maximum of each of its multiplicities is one, an example of which is the holds relationship between Employee and Position in Figure 13.  An employee holds one and only one position and a position may be held by one employee (some positions go unfilled).

  • One-to-many relationships. Also known as a many-to-one relationship, this occurs when the maximum of one multiplicity is one and the other is greater than one.  An example is the works in relationship between Employee and Division.  An employee works in one division and any given division has one or more employees working in it.

  • Many-to-many relationships. This is a relationship where the maximum of both multiplicities is greater than one, an example of which is the assigned relationship between Employee and Task.  An employee is assigned one or more tasks and each task is assigned to zero or more employees.

The second category is based on directionality and it contains two types, uni-directional relationships and bi-directional relationships.

  • Uni-directional relationships.  A uni-directional relationship exists when an object knows about the object(s) it is related to but the other object(s) do not know of the original object.  An example is the holds relationship between Employee and Position in Figure 13, indicated by the line with an open arrowhead on it.  Employee objects know about the position that they hold, but Position objects do not know which employee holds them (there was no requirement to do so).  As you will soon see, uni-directional relationships are easier to implement than bi-directional relationships.

  • Bi-directional relationships.  A bi-directional relationship exists when the objects on both ends of the relationship know of each other, an example of which is the works in relationship between Employee and Division.  Employee objects know what division they work in and Division objects know what employees work in them.

 

Figure 13. Relationships between objects.

 

It is possible to have all six combinations of relationship in object schemas.  However one aspect of the impedance mismatch between object technology and relational technology is that relational technology does not support the concept of uni-directional relationships – in relational databases all associations are bi-directional.  

 

4.2. How Object Relationships Are Implemented

Relationships in object schemas are implemented by a combination of references to objects and operations.  When the multiplicity is one (e.g. 0..1 or 1) the relationship is implemented with a reference to an object, a getter operation, and a setter operation.  For example in Figure 13 the fact that an employee works in a single division is implemented by the Employee class via the combination of the attribute division, the getDivision() operation which returns the value of division, and the setDivision() operation which sets the value of the division attribute. The attribute(s) and operations required to implement a relationship are often referred to as scaffolding.

When the multiplicity is many (e.g. N, 0..*, 1..*) the relationship is implemented via a collection attribute, such as an Array or a HashSet in Java, and operations to manipulate that collection.  For example the Division class implements a HashSet attribute named employees, getEmployees() to get the value, setEmployees() to set the value, addEmployee() to add an employee into the HashSet, and removeEmployee() to remove an employee from the HashSet.

When a relationship is uni-directional the code is implemented only by the object that knows about the other object(s).  For example, in the uni-directional relationship between Employee and Position only the Employee class implements the association.  Bi-directional associations, on the other hand, are implemented by both classes, as you can see with the many-to-many relationship between Employee and Task.
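The scaffolding for the bi-directional works in relationship can be sketched in Java roughly as follows. This is a minimal sketch rather than code from the chapter: the constructors, the name fields, and the decision to have each side keep the other side in step are assumptions made for illustration, and setEmployees() is left out for brevity.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Illustrative only: the bi-directional "works in" relationship.
class Division {
    private final String name;
    // Scaffolding for the "many" end: a collection attribute.
    private final Set<Employee> employees = new HashSet<>();

    Division(String name) { this.name = name; }

    String getName() { return name; }

    Set<Employee> getEmployees() { return Collections.unmodifiableSet(employees); }

    void addEmployee(Employee employee) {
        if (employees.add(employee)) {
            employee.setDivision(this);   // keep the "one" end in step
        }
    }

    void removeEmployee(Employee employee) {
        if (employees.remove(employee) && employee.getDivision() == this) {
            employee.setDivision(null);
        }
    }
}

class Employee {
    private final String name;
    private Division division;   // scaffolding for the "one" end

    Employee(String name) { this.name = name; }

    String getName() { return name; }

    Division getDivision() { return division; }

    void setDivision(Division newDivision) {
        if (division == newDivision) {
            return;   // already set; also stops the mutual calls from looping
        }
        Division oldDivision = division;
        division = newDivision;
        if (oldDivision != null) {
            oldDivision.removeEmployee(this);
        }
        if (newDivision != null) {
            newDivision.addEmployee(this);
        }
    }
}

class ScaffoldingDemo {
    public static void main(String[] args) {
        Division it = new Division("IT");
        Employee ann = new Employee("Ann");
        ann.setDivision(it);
        System.out.println(it.getEmployees().contains(ann));   // prints "true"
    }
}
```

For a uni-directional relationship such as holds, only one class would carry scaffolding like this, so none of the synchronization logic is needed.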

 

4.3. How Relational Database Relationships Are Implemented

Relationships in relational databases are maintained through the use of foreign keys.  A foreign key is one or more data attributes that appear in one table and that may be part of, or coincide with, the key of another table.  With a one-to-one relationship the foreign key needs to be implemented by one of the tables.  In Figure 14 you see that the Position table includes EmployeePOID, a foreign key to the Employee table, to implement the association.  I could easily have implemented a PositionPOID column in Employee instead.

 

Figure 14. Relationships in a relational database.

 

To implement a one-to-many relationship you implement a foreign key in the “many table” that refers to the “one table”.  For example Employee includes a DivisionPOID column to implement the works in relationship to Division.  You could also choose to overbuild your database schema and implement a one-to-many relationship via an associative table, effectively making it a many-to-many relationship.

There are two ways to implement many-to-many associations in a relational database.  The first one is to implement in each table the foreign key column(s) to the other table several times.  For example to implement the many-to-many relationship between Employee and Task you could have five TaskPOID columns in Employee and the Task table could include seven EmployeePOID columns.  Unfortunately you run into a problem with this approach when you assign more than five tasks to an employee or more than seven employees to a single task.  A better approach is to implement what is called an associative table, an example of which is EmployeeTask in Figure 14, which includes the combination of the primary keys of the tables that it associates.  With this approach you could have fifty people assigned to the same task, or twenty tasks assigned to the same person, and it wouldn’t matter.  The basic “trick” is that the many-to-many relationship is converted into two one-to-many relationships, both of which involve the associative table.

Because foreign keys are used to join tables, all relationships in a relational database are effectively bi-directional.  This is why it doesn’t matter in which table you implement a one-to-one relationship; the code to join the two tables is virtually the same.  For example, with the existing schema in Figure 14 the SQL code to join across the holds relationship would be:

SELECT * FROM Position, Employee
WHERE Position.EmployeePOID = Employee.EmployeePOID

Had the foreign key been implemented in the Employee table the SQL code would be:

SELECT * FROM Position, Employee
WHERE Position.PositionPOID = Employee.PositionPOID

A consistent key strategy within your database can greatly simplify your relationship mapping efforts.  The first step is to prefer single-column keys.  The next step is to use a globally unique surrogate key, perhaps following the GUID or HIGH-LOW strategies, so you are always mapping to the same type of key column.

Now that we understand how to implement relationships in each technology, let’s see how you map them.  I will describe the mappings from the point of view of mapping the object relationships into the relational database.  An interesting thing to remember is that in some cases you have design choices to make.  Once again beware of the “magic CASE tool button” that supposedly automates everything for you.

 

4.4. Relationship Mappings

A general rule of thumb with relationship mapping is that you should keep the multiplicities the same.  Therefore a one-to-one object relationship maps to a one-to-one data relationship, a one-to-many maps to a one-to-many, and a many-to-many maps to a many-to-many.  The fact is that this doesn’t have to be the case: you can implement a one-to-one object relationship with a one-to-many or even a many-to-many data relationship.  This is because a one-to-one data relationship is a subset of a one-to-many data relationship and a one-to-many relationship is a subset of a many-to-many relationship.

Figure 15 depicts the property mappings between the object schema of Figure 13 and the data schema of Figure 14. Note how I have only had to map the business properties and the shadow information of the objects, but not scaffolding attributes such as Employee.position and Employee.tasks. These scaffolding attributes are represented via the shadow information that is mapped into the database.  When the relationship is read into memory the values stored in the primary key columns will be stored in the corresponding shadow attributes within the objects.  At the same time the relationship that the primary key columns represent will be defined between the corresponding objects by setting the appropriate values in their scaffolding attributes.

 

Figure 15. Property mappings.

Property | Column
Position.title | Position.Title
Position.positionPOID | Position.PositionPOID
Employee.name | Employee.Name
Employee.employeePOID | Employee.EmployeePOID
Employee.employeePOID | EmployeeTask.EmployeePOID
Division.name | Division.Name
Division.divisionPOID | Division.DivisionPOID
Task.description | Task.Description
Task.taskPOID | Task.TaskPOID
Task.taskPOID | EmployeeTask.TaskPOID

 

4.4.1 One-To-One Mappings

Consider the one-to-one object relationship between Employee and Position.  Let’s assume that whenever a Position or an Employee object is read into memory the application will automatically traverse the holds relationship and read in the corresponding object.  The other option would be to manually traverse the relationship in the code, taking a lazy read approach where the other object is read at the time it is required by the application.  The trade-offs of these two approaches are discussed in Implementing Referential Integrity.  Figure 16 shows how the object relationships are mapped.

 

Figure 16. Mapping the relationships.

Object Relationship | From | To | Cardinality | Automatic Read | Column(s) | Scaffolding Property
holds | Employee | Position | One | Yes | Position.EmployeePOID | Employee.position
held by | Position | Employee | One | Yes | Position.EmployeePOID | Employee.position
works in | Employee | Division | One | Yes | Employee.DivisionPOID | Employee.division
has working in it | Division | Employee | Many | No | Employee.DivisionPOID | Division.employees
assigned | Employee | Task | Many | No | Employee.EmployeePOID, EmployeeTask.EmployeePOID | Employee.tasks
assigned to | Task | Employee | Many | No | Task.TaskPOID, EmployeeTask.TaskPOID | Task.employees

 

Let’s work through the logic of retrieving a single Position object one step at a time:

  1. The Position object is read into memory.

  2. The holds relationship is automatically traversed.

  3. The value held by the Position.EmployeePOID column is used to identify the single employee that needs to be read into memory.

  4. The Employee table is searched for a record with that value of EmployeePOID.

  5. The Employee object (if any) is read in and instantiated.

  6. The value of the Employee.position attribute is set to reference the Position object.

 

Now let’s work through the logic of retrieving a single Employee object one step at a time:

  1. The Employee object is read into memory.

  2. The holds relationship is automatically traversed.

  3. The value held by the Employee.EmployeePOID column is used to identify the single position that needs to be read into memory.

  4. The Position table is searched for a row with that value of EmployeePOID.

  5. The Position object is read in and instantiated.

  6. The value of the Employee.position attribute is set to reference the Position object.
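The second walkthrough, retrieving an Employee and automatically traversing holds, might be sketched in Java as follows. Everything beyond the class, column, and attribute names of Figures 13 and 14 is an assumption: the row classes, InMemoryDatabase, and EmployeeReader are hypothetical stand-ins for real database access, used here so that the six steps can be followed without a database.

```java
import java.util.ArrayList;
import java.util.List;

// Business classes. The POID fields are shadow information.
class Position {
    final long positionPOID;
    final String title;

    Position(long positionPOID, String title) {
        this.positionPOID = positionPOID;
        this.title = title;
    }
}

class Employee {
    final long employeePOID;
    final String name;
    private Position position;   // scaffolding for the holds relationship

    Employee(long employeePOID, String name) {
        this.employeePOID = employeePOID;
        this.name = name;
    }

    Position getPosition() { return position; }

    void setPosition(Position position) { this.position = position; }
}

// Hypothetical row types standing in for the Employee and Position tables.
class EmployeeRow {
    final long employeePOID;
    final String name;

    EmployeeRow(long employeePOID, String name) {
        this.employeePOID = employeePOID;
        this.name = name;
    }
}

class PositionRow {
    final long positionPOID;
    final String title;
    final Long employeePOID;   // foreign key; null when the position is unfilled

    PositionRow(long positionPOID, String title, Long employeePOID) {
        this.positionPOID = positionPOID;
        this.title = title;
        this.employeePOID = employeePOID;
    }
}

class InMemoryDatabase {
    final List<EmployeeRow> employeeRows = new ArrayList<>();
    final List<PositionRow> positionRows = new ArrayList<>();
}

class EmployeeReader {
    private final InMemoryDatabase database;

    EmployeeReader(InMemoryDatabase database) { this.database = database; }

    Employee read(long employeePOID) {
        // Step 1: read the Employee row and instantiate the object.
        Employee employee = null;
        for (EmployeeRow row : database.employeeRows) {
            if (row.employeePOID == employeePOID) {
                employee = new Employee(row.employeePOID, row.name);
                break;
            }
        }
        if (employee == null) {
            return null;
        }
        // Steps 2-4: traverse holds by searching Position on EmployeePOID.
        for (PositionRow row : database.positionRows) {
            if (row.employeePOID != null && row.employeePOID == employeePOID) {
                // Steps 5-6: instantiate the Position, set the scaffolding.
                employee.setPosition(new Position(row.positionPOID, row.title));
                break;
            }
        }
        return employee;
    }
}

class OneToOneReadDemo {
    public static void main(String[] args) {
        InMemoryDatabase database = new InMemoryDatabase();
        database.employeeRows.add(new EmployeeRow(1L, "Ann"));
        database.positionRows.add(new PositionRow(10L, "Analyst", 1L));
        Employee ann = new EmployeeReader(database).read(1L);
        System.out.println(ann.getPosition().title);   // prints "Analyst"
    }
}
```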

Now let’s consider how the objects would be saved to the database.  Because the relationship is to be automatically traversed, and to maintain referential integrity, a transaction is created.  The next step is to add update statements for each object to the transaction.  Each update statement includes both the business attributes and the key values mapped in Figure 15.  Because relationships are implemented via foreign keys, and because those values are being updated, the relationship is effectively being persisted.  The transaction is submitted to the database and run.

There is one annoyance with the way the holds relationship has been mapped into the database.  Although the direction of this relationship is from Employee to Position within the object schema, it’s been implemented from Position to Employee in the database.  This isn’t a big deal, but it is annoying.  In the data schema you can implement the foreign key in either table and it wouldn’t make a difference, so from a data point of view when everything else is equal you could toss a coin.  Had there been a potential requirement for the holds relationship to turn into a one-to-many relationship, something that a change case (Bennett 1997, Ambler 2001a) would indicate, then you would be motivated to implement the foreign key to reflect this potential requirement.  For example, the existing data model would support an employee holding many positions.  However, had the object schema been taken into account, and if there were no future requirements motivating you to model it otherwise, it would have been cleaner to implement the foreign key in the Employee table instead.

 

4.4.2. One-To-Many Mappings

Now let’s consider the works in relationship between Employee and Division in Figure 13.  This is a one-to-many relationship – an employee works in one division and a single division has many employees working in it.  As you can see in Figure 16 an interesting thing about this relationship is that it should be automatically traversed from Employee to Division, something often referred to as a cascading read, but not in the other direction.  Cascading saves and cascading deletes are also possible, something covered in the discussion of referential integrity.

When an employee is read into memory the relationship is automatically traversed to read in the division that they work in.  You don’t want several copies of the same division in memory: if you have ten employee objects that all work for the IT division, you want them all to refer to the same IT division object.  The implication is that you will need to implement a strategy for doing this.  One option is to implement a cache that ensures only one copy of an object exists in memory; another is to simply have the Division class maintain its own collection of instances in memory (effectively a mini-cache).  If the application needs to it will read the Division object into memory, then it will set the value of Employee.division to reference the appropriate Division object.  Similarly the Division.addEmployee() operation will be invoked to add the employee object into its collection.
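One way to guarantee a single in-memory copy of each division is a small cache keyed by POID. The sketch below is an assumption about how such a cache could look, not the chapter's implementation: DivisionCache and its loader function are hypothetical names, and the loader stands in for whatever code actually reads a Division row from the database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

class Division {
    final long divisionPOID;   // shadow information
    final String name;

    Division(long divisionPOID, String name) {
        this.divisionPOID = divisionPOID;
        this.name = name;
    }
}

// Hypothetical cache that hands out one Division object per POID.
class DivisionCache {
    private final Map<Long, Division> divisionsByPOID = new HashMap<>();
    private final LongFunction<Division> loader;   // stands in for a database read

    DivisionCache(LongFunction<Division> loader) { this.loader = loader; }

    Division get(long divisionPOID) {
        Division division = divisionsByPOID.get(divisionPOID);
        if (division == null) {
            // Not in memory yet: read it once and remember it.
            division = loader.apply(divisionPOID);
            if (division != null) {
                divisionsByPOID.put(divisionPOID, division);
            }
        }
        return division;
    }
}

class DivisionCacheDemo {
    public static void main(String[] args) {
        DivisionCache cache = new DivisionCache(poid -> new Division(poid, "IT"));
        // Ten employees in the IT division would all receive this same object.
        System.out.println(cache.get(7L) == cache.get(7L));   // prints "true"
    }
}
```

The cascading read from Employee to Division would then call cache.get() with the value of Employee.DivisionPOID instead of instantiating a new Division each time.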

Saving the relationship works in the same way as it does for one-to-one relationships – when the objects are saved so are their primary and foreign key values, so the relationship is automatically saved.

Every example in this chapter uses foreign keys, such as Employee.DivisionPOID, pointing to the primary keys of other tables, in this case Division.DivisionPOID.  This doesn’t have to be the case; sometimes a foreign key can refer to an alternate key.  For example, if the Employee table of Figure 14 were to include a SocialSecurityNumber column then that would be an alternate key for that table (assuming all employees are American citizens).  If this were the case you would have the option to replace the Position.EmployeePOID column with Position.SocialSecurityNumber.

 

4.4.3. Many-To-Many Mappings

To implement many-to-many relationships you need the concept of an associative table, a data entity whose sole purpose is to maintain the relationship between two or more tables in a relational database. In Figure 13 there is a many-to-many relationship between Employee and Task.  In the data schema of Figure 14 I needed to introduce the associative table EmployeeTask to implement a many-to-many relationship between the Employee and Task tables.  In relational databases the attributes contained in an associative table are traditionally the combination of the keys in the tables involved in the relationship, in this case EmployeePOID and TaskPOID.  The name of an associative table is typically either the combination of the names of the tables that it associates or the name of the association that it implements. In this case I chose EmployeeTask over Assigned.

Notice the multiplicities in Figure 13.  The rule is that the multiplicities “cross over” once the associative table is introduced, as indicated in Figure 14.  A multiplicity of 1 is always introduced on the outside edges of the relationship within the data schema to preserve overall multiplicity of the original relationship.  The original relationship indicated that an employee is assigned to one or more tasks and that a task has zero or more employees assigned to it. In the data schema you see that this is still true even with the associative table in place to maintain the relationship.

Assume that an employee object is in memory and we need a list of all the tasks they have been assigned.  The steps that the application would need to go through are:

  1. Create a SQL Select statement that joins the EmployeeTask and Task tables together, choosing all EmployeeTask records whose EmployeePOID value is the same as that of the employee whose task list we are putting together.

  2. The Select statement is run against the database.

  3. The data records representing these tasks are marshaled into Task objects.  Part of this effort includes checking to see if the Task object is already in memory.  If it is then we may choose to refresh the object with the new data values (this is a concurrency issue).

  4. The Employee.addTask() operation is invoked for each Task object to build the collection up.
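The first three steps above can be sketched with plain JDBC as follows. This is a minimal sketch under stated assumptions: TaskFinder and its in-memory map are hypothetical, the table and column names are taken from Figures 14 and 15, and refreshing an already loaded Task with the new values is only one of the possible answers to the concurrency question raised in step 3.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Task {
    final long taskPOID;   // shadow information
    String description;

    Task(long taskPOID, String description) {
        this.taskPOID = taskPOID;
        this.description = description;
    }
}

// Hypothetical helper that reads the tasks assigned to one employee.
class TaskFinder {
    // Step 1: join EmployeeTask and Task for a single employee.
    static final String TASKS_FOR_EMPLOYEE =
        "SELECT Task.TaskPOID, Task.Description "
        + "FROM EmployeeTask, Task "
        + "WHERE EmployeeTask.TaskPOID = Task.TaskPOID "
        + "AND EmployeeTask.EmployeePOID = ?";

    private final Map<Long, Task> tasksInMemory = new HashMap<>();

    List<Task> findTasks(Connection connection, long employeePOID) throws SQLException {
        List<Task> tasks = new ArrayList<>();
        try (PreparedStatement statement = connection.prepareStatement(TASKS_FOR_EMPLOYEE)) {
            statement.setLong(1, employeePOID);
            // Step 2: run the Select statement against the database.
            try (ResultSet rows = statement.executeQuery()) {
                while (rows.next()) {
                    tasks.add(marshal(rows.getLong("TaskPOID"), rows.getString("Description")));
                }
            }
        }
        return tasks;
    }

    // Step 3: marshal a row into a Task, reusing the object if it is already in memory.
    Task marshal(long taskPOID, String description) {
        Task task = tasksInMemory.get(taskPOID);
        if (task == null) {
            task = new Task(taskPOID, description);
            tasksInMemory.put(taskPOID, task);
        } else {
            task.description = description;   // assumed policy: refresh with new values
        }
        return task;
    }
}

class TaskFinderDemo {
    public static void main(String[] args) {
        TaskFinder finder = new TaskFinder();
        Task first = finder.marshal(5L, "Write report");
        Task again = finder.marshal(5L, "Write final report");
        System.out.println(first == again);   // prints "true"
    }
}
```

Step 4 is left to the caller, which would invoke Employee.addTask() for each Task in the returned list.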

A similar process would have been followed to read in the employees involved in a given task.  To save the relationship, still from the point of view of the Employee object, the steps would be:

  1. Start a transaction.

  2. Add Update statements for any task objects that have changed.

  3. Add Insert statements for the Task table for any new tasks that you have created.

  4. Add Insert statements for the EmployeeTask table for the new tasks.

  5. Add Delete statements for the Task table for any tasks that have been deleted.  This may not be necessary if the individual object deletions have already occurred.

  6. Add Delete statements for the EmployeeTask table for any tasks that have been deleted, a step that may not be needed if the individual deletions have already occurred.

  7. Add Delete statements for the EmployeeTask table for any tasks that are no longer assigned to the employee.

  8. Run the transaction.
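The part of this procedure that touches the associative table (steps 4 and 7) amounts to comparing the tasks that were assigned when the employee was read with the tasks assigned now. The sketch below illustrates that comparison only; AssignmentSaver is a hypothetical name, the statements are built as strings purely for readability, and real code would more likely use parameterized statements inside the transaction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: works out the EmployeeTask changes for one employee.
class AssignmentSaver {
    static List<String> statementsFor(long employeePOID,
                                      Set<Long> savedTaskPOIDs,
                                      Set<Long> currentTaskPOIDs) {
        List<String> statements = new ArrayList<>();
        // Step 4: insert association rows for newly assigned tasks.
        // TreeSet is used only to make the statement order predictable.
        for (Long taskPOID : new TreeSet<>(currentTaskPOIDs)) {
            if (!savedTaskPOIDs.contains(taskPOID)) {
                statements.add("INSERT INTO EmployeeTask (EmployeePOID, TaskPOID) VALUES ("
                        + employeePOID + ", " + taskPOID + ")");
            }
        }
        // Step 7: delete association rows for tasks no longer assigned.
        for (Long taskPOID : new TreeSet<>(savedTaskPOIDs)) {
            if (!currentTaskPOIDs.contains(taskPOID)) {
                statements.add("DELETE FROM EmployeeTask WHERE EmployeePOID = "
                        + employeePOID + " AND TaskPOID = " + taskPOID);
            }
        }
        return statements;
    }
}

class AssignmentSaverDemo {
    public static void main(String[] args) {
        Set<Long> saved = new HashSet<>(Arrays.asList(1L, 2L));
        Set<Long> current = new HashSet<>(Arrays.asList(2L, 3L));
        for (String statement : AssignmentSaver.statementsFor(9L, saved, current)) {
            System.out.println(statement);
        }
    }
}
```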

Many-to-many relationships are interesting because of the addition of the associative table.  Two business classes are being mapped to three data tables to support this relationship, so there is extra work to do as a result.

 

4.5. Mapping Ordered Collections

Figure 2 depicted a classic Order and OrderItem model with an aggregation association between the two classes.  An interesting twist is the {ordered} constraint placed on the relationship – users care about the order in which items appear on an order.  When mapping this to a relational database you need to add an additional column to track this information.  The database schema, also depicted in Figure 2, includes the column OrderItem.ItemSequence to persist this information.  Although this mapping seems straightforward on the surface, there are several issues that you need to take into consideration.  These issues become apparent when you consider basic persistence functionality for the aggregate:

  • Read the data in the proper sequence.  The scaffolding attribute that implements this relationship must be a collection that enables sequential ordering of references and it must be able to grow as new OrderItems are added to the Order.  In Figure 3 you see that a Vector is used, a Java collection class that meets these requirements.  As you read the order and order items into memory the Vector must be filled in the proper sequence.  If the values of the OrderItem.ItemSequence column start from 1 and increase by 1 then you can simply use the value of the column as the position to insert order items into the collection.   When this isn’t the case you must include an ORDER BY clause in the SQL statement submitted to the database to ensure that the rows appear in order in the result set.

  • Don’t include the sequence number in the key.  You have an order with five order items in memory and they have been saved into the database.  You now insert a new order item in between the second and third order items, giving you a total of six order items.  With the current data schema of Figure 2 you have to renumber the sequence numbers for every order item that appears after the new order item and then write out all of them even though nothing has changed other than the sequence number in the other order items.  Because the sequence number is part of the primary key of the OrderItem table this could be problematic if other tables, not shown in Figure 2, refer to rows in OrderItem via foreign keys that include ItemSequence.  A better approach is shown in Figure 17 where the OrderItemID column is used as the primary key.

  • When do you update sequence numbers after rearranging the order items?  Whenever you rearrange order items on an order, perhaps you moved the fourth order item to be the second one on the order, you need to update the sequence numbers within the database.  You may decide to cache these changes in memory until you decide to write out the entire order, although this runs the risk that the proper sequence won’t be saved in the event of a power outage.

  • Do you update sequence numbers after deleting an order item?  If you delete the fifth of six order items do you want to update the sequence number for what is now the fifth item or do you want to leave it as is?  The sequence numbers still work – the values are 1, 2, 3, 4, 6 – but you can no longer use them as the position indicators within your collection without leaving a hole in the fifth position.

  • Consider sequence number gaps greater than one.  Instead of assigning sequence numbers along the lines of 1, 2, 3, … assign numbers such as 10, 20, 30 and so on.  That way you don’t need to update the values of the OrderItem.ItemSequence column every time you rearrange order items because you can assign a sequence number of 15 when you move something between 10 and 20. You will need to change the values every so often, for example after several rearrangements you may find yourself in the position of trying to insert something between 17 and 18.  Larger gaps help to avoid this (e.g. 50, 100, 150, …) but you’ll never completely avoid this problem.
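The gap strategy in the last bullet can be sketched as a pair of small functions. This is an illustrative sketch only: ItemSequencer, the gap of 10, and the choice of the midpoint as the new sequence number are assumptions, not something the chapter prescribes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;

// Hypothetical helper for assigning OrderItem.ItemSequence values with gaps.
class ItemSequencer {
    static final int GAP = 10;

    // Picks a sequence number strictly between two neighbours, or returns
    // empty when the neighbours are adjacent and a renumbering is needed.
    static OptionalInt between(int before, int after) {
        int candidate = before + (after - before) / 2;
        if (candidate > before && candidate < after) {
            return OptionalInt.of(candidate);
        }
        return OptionalInt.empty();
    }

    // Fresh sequence numbers for an order with the given number of items.
    static List<Integer> renumber(int itemCount) {
        List<Integer> sequence = new ArrayList<>();
        for (int i = 1; i <= itemCount; i++) {
            sequence.add(i * GAP);
        }
        return sequence;
    }
}

class ItemSequencerDemo {
    public static void main(String[] args) {
        System.out.println(ItemSequencer.between(10, 20).getAsInt());   // prints "15"
        System.out.println(ItemSequencer.between(17, 18).isPresent());  // prints "false"
    }
}
```

When between() comes back empty, the order items have to be renumbered and every affected row written out, which is the occasional cost the bullet describes.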

 

Figure 17. Improved data schema for persisting Order and OrderItem. 

 

4.6. Mapping Recursive Relationships

A recursive relationship, also called a reflexive relationship (Reed 2002; Larman 2002), is one where the same entity (class, data entity, table, …) is involved with both ends of the relationship.  For example the manages relationship in Figure 18 is recursive, representing the concept that an employee may manage several other employees.  The aggregate relationship that the Team class has with itself is recursive – a team may be a part of one or more other teams.

Figure 18 depicts a class model that includes two recursive relationships and the resulting data model that it would be mapped to.  For the sake of simplicity the class model includes only the classes and their relationships and the data model includes only the keys.  The many-to-many recursive aggregation is mapped to the Subteams associative table in the same way that you would map a normal many-to-many relationship – the only difference is that both columns are foreign keys into the same table.  Similarly the one-to-many manages association is mapped in the same way that you would map a normal one-to-many relationship, the ManagerEmployeePOID column refers to another row in the Employee table where the manager’s data is stored.

 

Figure 18. Mapping recursive relationships.

 

5. Mapping Class-Scope Properties

Sometimes a class will implement a property that is applicable to all of its instances and not just single instances.  The Customer class of Figure 19 implements nextCustomerNumber, a class attribute (you know this because it’s underlined) which stores the value of the next customer number to be assigned to a new customer object.  Because there is one value for this attribute for the class, not one value per object, we need to map it in a different manner.  Table 2 summarizes the four basic strategies for mapping class scope properties.

 

Figure 19. Mapping class scope attributes.

 

Table 2. Strategies for mapping class scope properties.

Strategy: Single Column, Single-Row Table
Example: The CustomerNumber table of Figure 19 implements this strategy.
Advantages: Simple.  Fast access.
Disadvantages: Could result in many small tables.

Strategy: Multi-Column, Single-Row Table for a Single Class
Example: If Customer implemented a second class scope attribute then a CustomerValues table could be introduced with one column for each attribute.
Advantages: Simple.  Fast access.
Disadvantages: Could result in many small tables, although fewer than the single column approach.

Strategy: Multi-Column, Single-Row Table for all Classes
Example: The topmost version of the ClassVariables table in Figure 19.  This table contains one column for each class attribute within your application, so if the Employee class had a nextEmployeeNumber class attribute then there would be a column for this as well.
Advantages: Minimal number of tables introduced to your data schema.
Disadvantages: Potential for concurrency problems if many classes need to access the data at once.  One solution is to introduce a ClassConstants table, as shown in Figure 19, to separate attributes that are read only from those that can be updated.

Strategy: Multi-Row Generic Schema for all Classes
Example: The bottommost version of the ClassVariables and ClassConstants tables of Figure 19.  The table contains one row for each class scope property in your system.
Advantages: Minimal number of tables introduced to your data schema.  Reduces concurrency problems (assuming your database supports row-based locking).
Disadvantages: Need to convert between types (e.g. CustomerNumber is an integer but is stored as character data).  The data schema is coupled to the names of your classes and their class scope properties.  You could avoid this with an even more generic schema along the lines of Figure 11.
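The type conversion cost of the multi-row generic schema can be seen in a short sketch. The layout assumed here (one row per class scope property, identified by class name and variable name, with the value held as character data) is a guess at the kind of table Figure 19 shows; ClassVariablesTable is a hypothetical in-memory stand-in for it, and takeNextCustomerNumber() is an invented operation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a generic ClassVariables table whose
// rows hold a class name, a variable name, and a value stored as text.
class ClassVariablesTable {
    private final Map<String, String> valueByKey = new HashMap<>();

    void put(String className, String variableName, String value) {
        valueByKey.put(className + "." + variableName, value);
    }

    String get(String className, String variableName) {
        return valueByKey.get(className + "." + variableName);
    }
}

class Customer {
    // Hands out the next customer number, converting between the integer
    // used by the class and the character data held in the table.
    static int takeNextCustomerNumber(ClassVariablesTable table) {
        int next = Integer.parseInt(table.get("Customer", "nextCustomerNumber"));
        table.put("Customer", "nextCustomerNumber", Integer.toString(next + 1));
        return next;
    }
}

class ClassScopeDemo {
    public static void main(String[] args) {
        ClassVariablesTable table = new ClassVariablesTable();
        table.put("Customer", "nextCustomerNumber", "1001");
        System.out.println(Customer.takeNextCustomerNumber(table));   // prints "1001"
        System.out.println(table.get("Customer", "nextCustomerNumber"));   // prints "1002"
    }
}
```

The string keys also show the coupling noted above: renaming the class or the attribute means updating the stored rows.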

 

 

6. Performance Tuning

One of the most valuable services that an Agile DBA can perform on a development team is performance tuning.  A very good book is Database Tuning by Shasha and Bonnet (2003).  When working with structured technology most of the performance tuning effort was database-oriented, generally falling into one of two categories:

  1. Database performance tuning.  This effort focuses on changing the database schema itself, often by denormalizing portions of it.  Other techniques include changing the types of key columns, for example an index is typically more effective when it is based on numeric columns instead of character columns; reducing the number of columns that make up a composite key; or introducing indices on a table to support common joins.

  2. Data access performance tuning. This effort focuses on improving the way that data is accessed.  Common techniques include the introduction of stored procedures to “crunch” data in the database server to reduce the result set transmitted across the network; reworking SQL queries to reflect database features; clustering data to reflect common access needs; and caching data within your application to reduce the number of accesses. 

Neither of these needs goes away with object technology, although as Figure 20 implies the situation is a little more complicated.  An important thing to remember is that your object schema also has structure to it, therefore changes to your object schema can affect the database access code that is generated based on the mappings to your database.  For example, assume that the Employee class has a homePhoneNumber attribute.  A new feature requires you to implement phone number specific behavior (e.g. your application can call people at home).  You decide to refactor homePhoneNumber into its own class, an example of third normal object form (3ONF), and therefore update your mappings to reflect this change.  Performance degrades as a result of this change, motivating you to change either your mappings, and with them the data access paths, or the database schema itself.  The implication is that a change to your object source code could motivate a change to your database schema.  Sometimes the reverse happens as well.  This is perfectly fine, because as an agile software developer you are used to working in an evolutionary manner as shown in Figure 5.

 

Figure 20. Performance tuning opportunities.

 

There are two main additions to performance tuning that you need to be aware of: mapping tuning and object schema tuning.  Mapping tuning is described below.  When it comes to object schema tuning most changes to your schema will be covered by common refactorings (Fowler 1999).  However, a technique called lazy reading can help dramatically. 

 

6.1. Tuning Your Mappings

Throughout this chapter you have seen that there is more than one way to map object schemas to data schemas – there are four ways to map inheritance structures, two ways to map a one-to-one relationship (depending on where you put the foreign key), and four ways to map class-scope properties.  Because you have mapping choices, and because each mapping choice has its advantages and disadvantages, there are opportunities to improve the data access performance of your application by changing your choice of mapping.  Perhaps you implemented the one table per class approach to mapping inheritance only to discover that it’s too slow, motivating you to refactor it to use the one table per hierarchy approach.

It is important to understand that whenever you change a mapping strategy you will need to change your object schema, your data schema, or both.

 

6.2. Lazy Reads

An important performance consideration is whether an attribute should be automatically read in when the object is retrieved.  When an attribute is very large and rarely accessed, for example the picture of a person could be 100k whereas the rest of the attributes are less than 1k, you may want to consider taking a lazy read approach.  The basic idea is that instead of automatically bringing the attribute across the network when the object is read you instead retrieve it only when the attribute is actually needed.  This can be accomplished by a getter method, an operation whose purpose is to provide the value of a single attribute, that checks to see if the attribute has been initialized and if not retrieves it from the database at that point.
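A lazy-reading getter along these lines could be sketched as follows. The Person class and the Supplier used as the picture loader are assumptions for illustration; the loader stands in for the database read that would fetch the picture column.

```java
import java.util.function.Supplier;

class Person {
    private final String name;
    private final Supplier<byte[]> pictureLoader;   // hypothetical database read
    private byte[] picture;                         // large and rarely accessed
    private boolean pictureLoaded;

    Person(String name, Supplier<byte[]> pictureLoader) {
        this.name = name;
        this.pictureLoader = pictureLoader;
    }

    String getName() { return name; }

    // Lazy read: the picture is fetched on first access, then kept in memory.
    byte[] getPicture() {
        if (!pictureLoaded) {
            picture = pictureLoader.get();
            pictureLoaded = true;
        }
        return picture;
    }
}

class LazyReadDemo {
    public static void main(String[] args) {
        Person ann = new Person("Ann", () -> new byte[100 * 1024]);
        System.out.println(ann.getName());             // no picture read yet
        System.out.println(ann.getPicture().length);   // prints "102400"
    }
}
```

A separate flag is used instead of a null check so that a person with no picture in the database is not re-read on every call.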

Other common uses for lazy reads are reporting and retrieving objects as the results of searches where you only need a small subset of the data of an object.

 

7. Why Data Schemas Shouldn’t Drive Object Schemas (Revisited)

The material for this section has been reposted as Why Data Models Don’t Drive Object Models (And Vice Versa)

 

8. Implementation Impact On Your Objects

The impedance mismatch between object technology and relational technology forces you to map your object schema to your data schema. To implement these mappings you will need to add code to your business objects, code that impacts your application.  These impacts are the primary fodder for the argument that object purists make against using object and relational technology together.  Although I wish the situation were different, the reality is that we’re using object and relational technology together and very likely will for many years to come.  Like it or not we need to accept this fact. 

I think that there is significant value in summarizing how mapping impacts your objects.  Some of this material you have seen in this chapter and some you will see in other chapters.  The impacts on your code include the need to:

  • Maintain shadow information. 

  • Refactor to improve overall performance.

  • Work with legacy data.  It is common to work with legacy databases, and there are often significant data quality, design, and architectural problems associated with them.  The implication is that you often need to map your objects to legacy databases and that your objects may need to implement integration and data cleansing code to do so.

  • Encapsulate database access. Your strategy for encapsulating database access determines how you will implement your mappings.  Your objects will be impacted by your chosen strategy, anywhere from including embedded SQL code to implementing a common interface that a persistence framework requires.

  • Implement concurrency control. Because most applications are multi-user, and because most databases are accessed by several applications, you run the risk that two different processes will try to modify the same data simultaneously.  Therefore your objects need to implement concurrency control strategies that overcome these challenges.

  • Find objects in a relational database.  You will want to work with collections of the same types of objects at once; for example, you may want to list all of the employees in a single division.

  • Implement referential integrity.  There are several strategies for implementing referential integrity between objects and within databases.  Although referential integrity is a business issue, and therefore should be implemented within your business objects, the reality is that many if not all referential integrity rules are implemented in the database instead. 

  • Implement security access control.  Different people have different access to information.  As a result you need to implement security access control logic within your objects and your database.

  • Implement reporting.  Do your business objects implement basic reporting functionality, or do you leave this effort solely to reporting tools that go directly against your database?  Or do you use a combination?

  • Implement object caches.  Object caches can be used to improve application performance and to ensure that objects are unique within memory.
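
As one illustration of the last item in this list, an object cache that keeps objects unique within memory might look like the Java sketch below.  The ObjectCache class and the loader function that stands in for a database read are hypothetical names chosen for the example; the chapter does not prescribe an implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

// Hypothetical cache keyed by persistent object identifier (POID).
class ObjectCache<T> {
    private final Map<Long, T> byPoid = new HashMap<>();
    private final LongFunction<T> loader;  // stands in for reading the object from the database

    ObjectCache(LongFunction<T> loader) {
        this.loader = loader;
    }

    // Returns the cached instance if there is one, otherwise loads and caches it.
    // Every caller asking for the same POID therefore receives the same object.
    T find(long poid) {
        T cached = byPoid.get(poid);
        if (cached == null) {
            cached = loader.apply(poid);
            byPoid.put(poid, cached);
        }
        return cached;
    }
}
```

This sketch ignores eviction and thread safety, both of which a production cache would have to address.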

 

9. Implications for Model Driven Architecture (MDA)

The Model-Driven Architecture (MDA) (Object Management Group 2001b) defines an approach to modeling that separates the specification of system functionality from the specification of its implementation on a specific technology platform.  In short, it defines guidelines for structuring specifications expressed as models.  The MDA promotes an approach where the same model specifying system functionality can be realized on multiple platforms through auxiliary mapping standards, or through point mappings to specific platforms.  It also supports the concept of explicitly relating the models of different applications, enabling integration, interoperability and supporting system evolution as platform technologies come and go.

Although the MDA is based on the Unified Modeling Language (UML), and the UML does not yet officially support a data model, my expectation is that object-to-relational mapping will prove to be one of the most important features that MDA-compliant CASE tools will support.  My hope is that the members of the OMG find a way to overcome the cultural impedance mismatch and start to work with data professionals to bring issues such as UML data modeling and object-to-relational mapping into account.  Time will tell.

 

10. Patternizing What You Have Learned

In this chapter you learned the basics of mapping objects to relational databases (RDBs), including some basic implementation techniques that will be expanded on in following chapters.  You saw that there are several strategies for mapping inheritance structures to RDBs and that mapping object relationships into RDBs is straightforward once you understand the differences between the two technologies.  Techniques for mapping both instance attributes and class attributes were presented, providing you with strategies to completely map a class’s attributes into an RDB.

This chapter included some methodology discussions that described how mapping is one task in the iterative and incremental approach that is typical of agile software development.  A related concept is that it is a fundamental mistake to allow your existing database schemas or data models to drive the development of your object models.  Look at them, treat them as constraints, but don’t let them negatively impact your design if you can avoid it.

Throughout this chapter I have described mapping techniques in common prose.  Although most authors prefer this technique (visit www.ambysoft.com/mappingObjects.html for an extensive list of links to mapping papers), some authors choose to write patterns instead.  The first such effort was the Crossing Chasms pattern language (Brown & Whitenack 1996) and the latest effort is captured in the book Patterns of Enterprise Application Architecture (Fowler et al. 2003).  Table 3 summarizes the critical material presented in this chapter as patterns, using the names suggested by other authors wherever possible.

 

Table 3. Mapping patterns.

| Pattern | Description |
|---|---|
| Class Table Inheritance | Map each individual class within an inheritance hierarchy to its own table. |
| Concrete Table Inheritance | Map each concrete class of an inheritance hierarchy to its own table. |
| Foreign Key Mapping | A relationship between objects is implemented in a relational database as foreign keys in tables. |
| Identity Field | Maintain the primary key of an object as an attribute.  This is an example of Shadow Information. |
| Lazy Initialization | Read a high-overhead attribute, such as a picture, into memory when you first access it, not when you initially read the object into memory. |
| Lazy Read | Read an object into memory only when you require it. |
| Legacy Data Constraint | Legacy data sources are a constraint on your object schema but they should not drive its definition. |
| Map Similar Types | Use similar types in your classes and tables.  For example it is easier to map an integer to a numeric column than it is to map it to a character-based column. |
| Map Simple Property to Single Column | Prefer to map the property of an object, such as the total of an order or the first name of an employee, to a single database column. |
| Mapping-Based Performance Tuning | To improve overall data access performance you can change your object schema, your data schema, or the mappings in between the two. |
| Recursive Relationships Are Nothing Special | Map a recursive relationship exactly the same way that you would map a non-recursive relationship. |
| Representing Objects as Tables | Prefer to map a single class to a single table but be prepared to evolve your design to improve performance. |
| Separate Tables for Class-Scope Properties | Introduce separate tables to store class-scope properties. |
| Shadow Information | Classes will need to maintain attributes to store the values of database keys (see Identity Field) and concurrency columns to persist themselves. |
| Single Column Surrogate Keys | The easiest key strategy that you can adopt within your database is to give all tables a single-column surrogate key that has a globally unique value. |
| Single Table Inheritance | Map all the classes of an inheritance hierarchy to a single table. |
| Table Design Time | Let your object schema form the basis from which you develop your data schema but be prepared to iterate your design in an evolutionary manner. |
| Uni-directional Key Choice | When a one-to-one unidirectional association exists from class A to class B, put the foreign key that maintains the relationship in the table corresponding to class A. |

 

11. References and Suggested Online Readings

List of References

At www.ambysoft.com/mappingObjects.html I maintain a list of links to mapping white papers posted on the web.

 

Suggested books:

  • Agile Database Techniques.  This book describes the philosophies and skills required for developers and database administrators to work together effectively on project teams following evolutionary software processes such as Extreme Programming (XP), the Rational Unified Process (RUP), Feature Driven Development (FDD), Dynamic System Development Method (DSDM), or The Enterprise Unified Process (EUP).  In March 2004 it won a Jolt Productivity award.
  • The Object Primer 3rd Edition: Agile Model Driven Development (AMDD) with UML 2.  This book presents a full-lifecycle, agile model driven development (AMDD) approach to software development.  It is one of the few books which covers both object-oriented and data-oriented development in a comprehensive and coherent manner.  Techniques the book covers include Agile Modeling (AM), Full Lifecycle Object-Oriented Testing (FLOOT), over 30 modeling techniques, agile database techniques, refactoring, and test driven development (TDD).  If you want to gain the skills required to build mission-critical applications in an agile manner, this is the book for you.
  • Patterns of Enterprise Application Architecture.  This book presents a collection of architectural patterns, many of which hit on persistence-related issues.  I highly suggest this book as a complement to the material presented in this chapter.
  • Developing Applications with Java and UML.  This book is similar to The Object Primer 2/e.  It goes into Java a little bit more, although it does not go very far beyond the UML or the RUP.  The book does cover basic mapping concepts and includes more source code examples.
  • Mastering EJB 2/e.  This book covers EJB, including persistence-related issues.  It covers the basics as well as advanced persistence issues for EJB-based development.

 

Let Us Help

Ronin International, Inc. continues to help numerous organizations to learn about and hopefully adopt agile techniques and philosophies.  We offer both consulting and training offerings, including Agile Database Techniques Training.  In addition we suggest that you visit the Agile Modeling Site and the Enterprise Unified Process (EUP) site.

You might find several of my books to be of interest, including The Object Primer, Agile Modeling, The Elements of UML 2.0 Style, and Agile Database Techniques.

For more information please contact Michael Vizdos at 866-AT-RONIN (U.S. number) or via e-mail (michael.vizdos@ronin-intl.com).

 


Ronin International

Page first posted: January 14 2003
Page last updated: April 1 2004

Every example in this chapter uses foreign keys, such as Employee.DivisionPOID, pointing to the primary keys of other tables, in this case Division.DivisionPOID.  This doesn’t have to be the case; sometimes a foreign key can refer to an alternate key.  For example, if the Employee table of Figure 14 were to include a SocialSecurityNumber column then that would be an alternate key for that table (assuming all employees are American citizens).  If this were the case you would have the option to replace the Position.EmployeePOID column with Position.SocialSecurityNumber.

 

4.4.3. Many-To-Many Mappings

To implement many-to-many relationships you need the concept of an associative table, a data entity whose sole purpose is to maintain the relationship between two or more tables in a relational database.  In Figure 13 there is a many-to-many relationship between Employee and Task.  In the data schema of Figure 14 I needed to introduce the associative table EmployeeTask to implement a many-to-many relationship between the Employee and Task tables.  In relational databases the attributes contained in an associative table are traditionally the combination of the keys in the tables involved in the relationship, in this case EmployeePOID and TaskPOID.  The name of an associative table is typically either the combination of the names of the tables that it associates or the name of the association that it implements.  In this case I chose EmployeeTask over Assigned.

Notice the multiplicities in Figure 13.  The rule is that the multiplicities “cross over” once the associative table is introduced, as indicated in Figure 14.  A multiplicity of 1 is always introduced on the outside edges of the relationship within the data schema to preserve overall multiplicity of the original relationship.  The original relationship indicated that an employee is assigned to one or more tasks and that a task has zero or more employees assigned to it. In the data schema you see that this is still true even with the associative table in place to maintain the relationship.

Assume that an employee object is in memory and we need a list of all the tasks they have been assigned.  The steps that the application would need to go through are:

  1. Create a SQL Select statement that joins the EmployeeTask and Task tables together, choosing all EmployeeTask records whose EmployeePOID value matches that of the employee for whom we are putting the task list together.

  2. The Select statement is run against the database.

  3. The data records representing these tasks are marshaled into Task objects.  Part of this effort includes checking to see if the Task object is already in memory.  If it is then we may choose to refresh the object with the new data values (this is a concurrency issue).

  4. The Employee.addTask() operation is invoked for each Task object to build the collection up.
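
The read steps above can be sketched in Java as follows.  This is an illustration under stated assumptions rather than the book’s implementation: the Task.Description column, the TaskReader class, and the in-memory cache are hypothetical, and the rows that step 2 would fetch from the database are passed in as plain arrays so that the example runs without a database connection.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Task {
    final long poid;
    String description;  // hypothetical attribute, not shown in the chapter

    Task(long poid, String description) {
        this.poid = poid;
        this.description = description;
    }
}

class Employee {
    final long poid;
    private final List<Task> tasks = new ArrayList<>();

    Employee(long poid) {
        this.poid = poid;
    }

    void addTask(Task task) {
        tasks.add(task);
    }

    List<Task> getTasks() {
        return tasks;
    }
}

class TaskReader {
    // Step 1: join EmployeeTask and Task, restricted to one employee.
    static final String SELECT_TASKS =
        "SELECT t.TaskPOID, t.Description "
        + "FROM EmployeeTask et JOIN Task t ON t.TaskPOID = et.TaskPOID "
        + "WHERE et.EmployeePOID = ?";

    private final Map<Long, Task> cache = new HashMap<>();  // Task objects already in memory

    // Steps 3 and 4: marshal each row {TaskPOID, Description} into a Task object,
    // reusing (and refreshing) an instance that is already in memory, then add it
    // to the employee's collection.
    void loadTasks(Employee employee, List<Object[]> rows) {
        for (Object[] row : rows) {
            long poid = (Long) row[0];
            String description = (String) row[1];
            Task task = cache.get(poid);
            if (task == null) {
                task = new Task(poid, description);
                cache.put(poid, task);
            } else {
                task.description = description;  // refresh with the new data values
            }
            employee.addTask(task);
        }
    }
}
```

In a real application the rows would come from running SELECT_TASKS with the employee’s POID bound to the parameter, and whether to refresh an object that is already in memory would depend on your concurrency control strategy.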

A similar process would be followed to read in the employees involved in a given task.  To save the relationship, still from the point of view of the Employee object, the steps would be:

  1. Start a transaction.

  2. Add Update statements for any task objects that have changed.

  3. Add Insert statements for the Task table for any new tasks that you have created.

  4. Add Insert statements for the EmployeeTask table for the new tasks.

  5. Add Delete statements for the Task table for any tasks that have been deleted.  This may not be necessary if the individual object deletions have already occurred.

  6. Add Delete statements for the EmployeeTask table for any tasks that have been deleted, a step that may not be needed if the individual deletions have already occurred.

  7. Add Delete statements for the EmployeeTask table for any tasks that are no longer assigned to the employee.

  8. Run the transaction.
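
A rough Java sketch of steps 2 through 7 is shown below.  It only assembles the statements; starting and running the transaction (steps 1 and 8) is left to the caller.  The AssignmentSavePlan class, its parameters, and the Task.Description column are assumptions made for the example.  Note one deliberate deviation: the sketch emits the EmployeeTask deletes for deleted tasks before the Task deletes, on the assumption that a foreign key from EmployeeTask to Task would otherwise be violated.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that lists the statements needed to save one employee's task assignments.
class AssignmentSavePlan {
    static List<String> build(long employeePoid,
                              List<Long> changedTasks,
                              List<Long> newTasks,
                              List<Long> deletedTasks,
                              List<Long> unassignedTasks) {
        List<String> sql = new ArrayList<>();
        // The POIDs are numeric, so concatenating them is safe here;
        // text values are left as ? parameters.
        for (long t : changedTasks) {    // step 2: update changed tasks
            sql.add("UPDATE Task SET Description = ? WHERE TaskPOID = " + t);
        }
        for (long t : newTasks) {        // step 3: insert new tasks
            sql.add("INSERT INTO Task (TaskPOID, Description) VALUES (" + t + ", ?)");
        }
        for (long t : newTasks) {        // step 4: associate the new tasks with the employee
            sql.add("INSERT INTO EmployeeTask (EmployeePOID, TaskPOID) VALUES ("
                    + employeePoid + ", " + t + ")");
        }
        for (long t : deletedTasks) {    // steps 6 then 5: remove associations, then the task
            sql.add("DELETE FROM EmployeeTask WHERE TaskPOID = " + t);
            sql.add("DELETE FROM Task WHERE TaskPOID = " + t);
        }
        for (long t : unassignedTasks) { // step 7: tasks no longer assigned to this employee
            sql.add("DELETE FROM EmployeeTask WHERE EmployeePOID = " + employeePoid
                    + " AND TaskPOID = " + t);
        }
        return sql;
    }
}
```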

Many-to-many relationships are interesting because of the addition of the associative table.  Two business classes are being mapped to three data tables to support this relationship, so there is extra work to do as a result.

 

4.5. Mapping Ordered Collections

Figure 2 depicted a classic Order and OrderItem model with an aggregation association between the two classes.  An interesting twist is the {ordered} constraint placed on the relationship – users care about the order in which items appear on an order.  When mapping this to a relational database you need to add an additional column to track this information.  The database schema, also depicted in Figure 2, includes the column OrderItem.ItemSequence to persist this information.  Although this mapping seems straightforward on the surface, there are several issues that you need to take into consideration.  These issues become apparent when you consider basic persistence functionality for the aggregate:

  • Read the data in the proper sequence.  The scaffolding attribute that implements this relationship must be a collection that enables sequential ordering of references and it must be able to grow as new OrderItems are added to the Order.  In Figure 3 you see that a Vector is used, a Java collection class that meets these requirements.  As you read the order and order items into memory the Vector must be filled in the proper sequence.  If the values of the OrderItem.ItemSequence column start from 1 and increase by 1 then you can simply use the value of the column as the position to insert order items into the collection.   When this isn’t the case you must include an ORDER BY clause in the SQL statement submitted to the database to ensure that the rows appear in order in the result set.

  • Don’t include the sequence number in the key.  Suppose you have an order with five order items in memory and they have been saved into the database.  You now insert a new order item in between the second and third order items, giving you a total of six order items.  With the current data schema of Figure 2 you have to renumber the sequence numbers for every order item that appears after the new order item and then write all of them out, even though nothing has changed in the other order items other than the sequence number.  Because the sequence number is part of the primary key of the OrderItem table this could be problematic if other tables, not shown in Figure 2, refer to rows in OrderItem via foreign keys that include ItemSequence.  A better approach is shown in Figure 17 where the OrderItemID column is used as the primary key.

  • When do you update sequence numbers after rearranging the order items?  Whenever you rearrange order items on an order, perhaps you moved the fourth order item to be the second one on the order, you need to update the sequence numbers within the database.  You may decide to cache these changes in memory until you decide to write out the entire order, although this runs the risk that the proper sequence won’t be saved in the event of a power outage.

  • Do you update sequence numbers after deleting an order item?  If you delete the fifth of six order items do you want to update the sequence number for what is now the fifth item or do you want to leave it as it is?  The sequence numbers still work – the values are 1, 2, 3, 4, 6 – but you can no longer use them as the position indicators within your collection without leaving a hole in the fifth position.

  • Consider sequence number gaps greater than one.  Instead of assigning sequence numbers along the lines of 1, 2, 3, … assign numbers such as 10, 20, 30 and so on.  That way you don’t need to update the values of the OrderItem.ItemSequence column every time you rearrange order items because you can assign a sequence number of 15 when you move something between 10 and 20.  You will need to change the values every so often, for example after several rearrangements you may find yourself in the position of trying to insert something between 17 and 18.  Larger gaps help to avoid this (e.g. 50, 100, 150, …) but you’ll never completely avoid this problem.
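
The gap strategy in the last bullet can be sketched in Java as follows.  The SequenceGaps class, the gap size of 10, and the use of -1 to signal that renumbering is required are all choices made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for assigning OrderItem.ItemSequence values with gaps.
class SequenceGaps {
    static final int GAP = 10;  // assumed gap size; larger gaps delay renumbering

    // Returns a sequence number strictly between the two neighbours,
    // or -1 when no integer fits and the items must be renumbered.
    static int between(int before, int after) {
        int mid = before + (after - before) / 2;
        return (mid > before && mid < after) ? mid : -1;
    }

    // Fresh sequence numbers GAP, 2*GAP, 3*GAP, ... for the given number of items.
    static List<Integer> renumber(int itemCount) {
        List<Integer> sequence = new ArrayList<>();
        for (int i = 1; i <= itemCount; i++) {
            sequence.add(i * GAP);
        }
        return sequence;
    }
}
```

Moving an item between 10 and 20 yields 15, so only the moved row needs to be written.  Trying to insert between 17 and 18 yields -1, at which point the application would renumber the whole order and write every order item back.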

 

Figure 17. Improved data schema for persisting Order and OrderItem. 

 

4.6. Mapping Recursive Relationships

A recursive relationship, also called a reflexive relationship (Reed 2002; Larman 2002), is one where the same entity (class, data entity, table, …) is involved with both ends of the relationship.  For example the manages relationship in Figure 18 is recursive, representing the concept that an employee may manage several other employees.  The aggregate relationship that the Team class has with itself is also recursive – a team may be a part of one or more other teams.

Figure 18 depicts a class model that includes two recursive relationships and the resulting data model that it would be mapped to.  For the sake of simplicity the class model includes only the classes and their relationships and the data model includes only the keys.  The many-to-many recursive aggregation is mapped to the Subteams associative table in the same way that you would map a normal many-to-many relationship – the only difference is that both columns are foreign keys into the same table.  Similarly the one-to-many manages association is mapped in the same way that you would map a normal one-to-many relationship: the ManagerEmployeePOID column refers to another row in the Employee table where the manager’s data is stored.

 

Figure 18. Mapping recursive relationships.
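
To illustrate that the one-to-many manages association needs no special treatment, the Java sketch below rebuilds the object relationships from Employee rows.  The StaffMember and RecursiveMapping names are hypothetical, each row is assumed to carry only EmployeePOID and ManagerEmployeePOID (null when the employee has no manager), and every manager referenced is assumed to be present in the same set of rows.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical business class; named StaffMember to keep the example self-contained.
class StaffMember {
    final long poid;
    StaffMember manager;  // null for an employee without a manager
    final List<StaffMember> reports = new ArrayList<>();

    StaffMember(long poid) {
        this.poid = poid;
    }
}

class RecursiveMapping {
    // Each row is {EmployeePOID, ManagerEmployeePOID}.
    static Map<Long, StaffMember> link(List<Long[]> rows) {
        Map<Long, StaffMember> byPoid = new HashMap<>();
        // First pass: create one object per row.
        for (Long[] row : rows) {
            byPoid.put(row[0], new StaffMember(row[0]));
        }
        // Second pass: follow the foreign key back into the same table.
        for (Long[] row : rows) {
            if (row[1] != null) {
                StaffMember employee = byPoid.get(row[0]);
                StaffMember manager = byPoid.get(row[1]);
                employee.manager = manager;
                manager.reports.add(employee);
            }
        }
        return byPoid;
    }
}
```

The two passes are the same ones you would use for any one-to-many relationship; the only difference is that both ends resolve against the same map.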

 

5. Mapping Class-Scope Properties

Sometimes a class will implement a property that is applicable to all of its instances and not just single instances.  The Customer class of Figure 19 implements nextCustomerNumber, a class attribute (you know this because it’s underlined) which stores the value of the next customer number to be assigned to a new customer object.  Because there is one value for this attribute for the class, not one value per object, we need to map it in a different manner.  Table 2 summarizes the four basic strategies for mapping class scope properties.

 

Figure 19. Mapping class scope attributes.

 

Table 2. Strategies for mapping class scope properties.

| Strategy | Example | Advantages | Disadvantages |
|---|---|---|---|
| Single Column, Single-Row Table | The CustomerNumber table of Figure 19 implements this strategy. | Simple; fast access | Could result in many small tables |
| Multi-Column, Single-Row Table for a Single Class | If Customer implemented a second class scope attribute then a CustomerValues table could be introduced with one column for each attribute. | Simple; fast access | Could result in many small tables, although fewer than the single column approach |
| Multi-Column, Single-Row Table for all Classes | The topmost version of the ClassVariables table in Figure 19.  This table contains one column for each class attribute within your application, so if the Employee class had a nextEmployeeNumber class attribute then there would be a column for this as well. | Minimal number of tables introduced to your data schema | Potential for concurrency problems if many classes need to access the data at once.  One solution is to introduce a ClassConstants table, as shown in Figure 19, to separate attributes that are read only from those that can be updated. |
| Multi-Row Generic Schema for all Classes | The bottommost version of the ClassVariables and ClassConstants tables of Figure 19.  The table contains one row for each class scope property in your system. | Minimal number of tables introduced to your data schema; reduces concurrency problems (assuming your database supports row-based locking) | Need to convert between types (e.g. CustomerNumber is an integer but is stored as character data).  The data schema is coupled to the names of your classes and their class scope properties.  You could avoid this with an even more generic schema along the lines of Figure 11. |
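
The type conversion that the multi-row generic schema requires can be sketched in Java as follows.  This is an assumption-laden illustration: Figure 19 is not reproduced here, so the row layout (class name, property name, value stored as character data) and the ClassVariablesTable class, which uses an in-memory map in place of the real table, are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for a generic ClassVariables table with one row per class scope property.
class ClassVariablesTable {
    // Key is "ClassName.propertyName"; every value is stored as character data.
    private final Map<String, String> rows = new HashMap<>();

    void put(String className, String property, String value) {
        rows.put(className + "." + property, value);
    }

    // Converts the stored character data back to an integer.
    int getInt(String className, String property) {
        return Integer.parseInt(rows.get(className + "." + property));
    }

    // Returns the current number and stores the incremented value back as text.
    int nextNumber(String className, String property) {
        int current = getInt(className, property);
        put(className, property, Integer.toString(current + 1));
        return current;
    }
}
```

For example, after put("Customer", "nextCustomerNumber", "1001"), a call to nextNumber("Customer", "nextCustomerNumber") returns 1001 and leaves "1002" in the row.  Against a real database the read and the update would need to happen within one transaction, relying on row-based locking.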

 

 

6. Performance Tuning

One of the most valuable services that an Agile DBA can perform on a development team is performance tuning.  A very good book on the subject is Database Tuning by Shasha and Bonnet (2003).  When working with structured technology most of the performance tuning effort was database-oriented, generally falling into one of two categories:

  1. Database performance tuning.  This effort focuses on changing the database schema itself, often by denormalizing portions of it.  Other techniques include changing the types of key columns, for example an index is typically more effective when it is based on numeric columns instead of character columns; reducing the number of columns that make up a composite key; or introducing indices on a table to support common joins.

  2. Data access performance tuning. This effort focuses on improving the way that data is accessed.  Common techniques include the introduction of stored procedures to “crunch” data in the database server to reduce the result set transmitted across the network; reworking SQL queries to reflect database features; clustering data to reflect common access needs; and caching data within your application to reduce the number of accesses. 

Neither of these needs goes away with object technology, although as Figure 20 implies the situation is a little more complicated.  An important thing to remember is that your object schema also has structure to it, therefore changes to your object schema can affect the database access code that is generated based on the mappings to your database.  For example, assume that the Employee class has a homePhoneNumber attribute.  A new feature requires you to implement phone number specific behavior (e.g. your application can call people at home).  You decide to refactor homePhoneNumber into its own class, an example of third normal object form (3ONF), and therefore update your mappings to reflect this change.  Performance degrades as a result of this change, motivating you to change either your mappings, which determine the data access paths, or the database schema itself.  The implication is that a change to your object source code could motivate a change to your database schema.  Sometimes the reverse happens as well.  This is perfectly fine, because as an agile software developer you are used to working in an evolutionary manner as shown in Figure 5.

 

Figure 20. Performance tuning opportunities.

 

There are two main additions to performance tuning that you need to be aware of: mapping tuning and object schema tuning.  Mapping tuning is described below.  When it comes to object schema tuning most changes to your schema will be covered by common refactorings (Fowler 1999).  However, a technique called lazy reading can help dramatically. 

 

6.1. Tuning Your Mappings

Throughout this chapter you have seen that there is more than one way to map object schemas to data schemas – there are four ways to map inheritance structures, two ways to map a one-to-one relationship (depending on where you put the foreign key), and four ways to map class-scope properties.  Because you have mapping choices, and because each mapping choice has its advantages and disadvantages, there are opportunities to improve the data access performance of your application by changing your choice of mapping.  Perhaps you implemented the one table per class approach to mapping inheritance only to discover that it’s too slow, motivating you to refactor it to use the one table per hierarchy approach.

It is important to understand that whenever you change a mapping strategy that it will require you to change either your object schema, your data schema, or both.

 

6.2. Lazy Reads

An important performance consideration is whether the attribute should be automatically read in when the object is retrieved.  When an attribute is very large, for example the picture of a person could be 100k whereas the rest of the attributes are less than 1k, and rarely accessed you may want to consider taking a lazy read approach.  The basic idea is that instead of automatically bringing the attribute across the network when the object is read you instead retrieve it only when the attribute is actually needed.  This can be accomplished by a getter method, an operation whose purpose is to provide the value of a single attribute, that checks to see if the attribute has been initialized and if not retrieves it from the database at that point.

Other common uses for lazy read is reporting and for retrieving objects as the results of searches where you only need a small subset of the data of an object.  

 

7. Why Data Schemas Shouldn’t Drive Object Schemas (Revisited)

The material for this section has been reposted as Why Data Models Don’t Drive Object Models (And Vice Versa)

 

8. Implementation Impact On Your Objects

The impedance mismatch between object technology and relational technology forces you to map your object schema to your data schema. To implement these mappings you will need to add code to your business objects, code that impacts your application.  These impacts are the primary fodder for the argument that object purists make against using object and relational technology together.  Although I wish the situation were different, the reality is that we’re using object and relational technology together and very likely will for many years to come.  Like it or not we need to accept this fact. 

I think that there is significant value in summarizing how mapping impacts your objects.  Some of this material you have seen in this chapter and some you will see in other chapters.  The impacts on your code include the need to:

  • Maintain shadow information. 

  • Refactor it to improve overall performance.

  • Work with legacy data.  It is common to work with legacy databases and that there are often significant data quality, design, and architectural problems associated with them.  The implication is that you often need to map your objects to legacy databases and that your objects may need to implement integration and data cleansing code to do so.

  • Encapsulate database access. Your strategy for encapsulating database access determines how you will implement your mappings.  Your objects will be impacted by your chosen strategy, anywhere from including embedded SQL code to implementing a common interface that a persistence framework requires.

  • Implement concurrency control. Because most applications are multi-user, and because most databases are accessed by several applications, you run the risk that two different processes will try to modify the same data simultaneously.  Therefore your objects need to implement concurrency control strategies that overcome these challenges.

  • Find objects in a relational database.  You will want to work with collections of the same types of objects at once, perhaps you want to list all of the employees in a single division.   

  • Implement referential integrity.  There are several strategies for implementing referential integrity between objects and within databases.  Although referential integrity is a business issue, and therefore should be implemented within your business objects, the reality is that many if not all referential integrity rules are implemented in the database instead. 

  • Implement security access control.  Different people have different access to information.  As a result you need to implement security access control logic within your objects and your database.

  • Implement reporting.  Do your business objects implement basic reporting functionality or do you leave this effort solely to reporting tools that go directly against your database.  Or do you use a combination.  

  • Implement object caches.  Object caches can be used to improve application performance and to ensure that objects are unique within memory.

 

9. Implications for Model Driven Architecture (MDA)

The Model-Driven Architecture (MDA) (Object Management Group 2001b) defines an approach to modeling that separates the specification of system functionality from the specification of its implementation on a specific technology platform.  In short, it defines guidelines for structuring specifications expressed as models.  The MDA promotes an approach where the same model specifying system functionality can be realized on multiple platforms through auxiliary mapping standards, or through point mappings to specific platforms.  It also supports the concept of explicitly relating the models of different applications, enabling integration, interoperability and supporting system evolution as platform technologies come and go.

Although the MDA is based on the Unified Modeling Language (UML), and the UML does not yet officially support a data model, my expectation is that object to relational mapping will prove to be one of the most important features that MDA-compliant CASE tools will support.   My hope is that the members of the OMG find a way to overcome the cultural impedance mismatch and start to work with data professionals to bring issues such as UML data modeling and object-to-relational mapping into account.  Time will tell.

 

10. Patternizing What You Have Learned

In this chapter you learned the basics of mapping objects to relational databases (RDBs), including some basic implementation techniques that will be expanded on in following chapters.  You saw that there are several strategies for mapping inheritance structures to RDBs and that mapping object relationships into RDBs is straightforward once you understand the differences between the two technologies.  Techniques for mapping both instance attributes and class attributes were presented, providing you with strategies to complete map a class’s attributes into an RDB.

This chapter included some methodology discussions that described how mapping is one task in the iterative and incremental approach that is typical of agile software development.  A related concept is that it is a fundamental mistake to allow your existing database schemas or data models to drive the development of your object models.  Look at them, treat them as constraints, but don’t let them negatively impact your design if you can avoid it.

Throughout this chapter I have described mapping techniques in common prose.  Most authors prefer this approach (visit www.ambysoft.com/mappingObjects.html for an extensive list of links to mapping papers), but some authors choose to write patterns instead.  The first such effort was the Crossing Chasms pattern language (Brown & Whitenack 1996) and the latest effort is captured in the book Patterns of Enterprise Application Architecture (Fowler et al. 2003).  Table 3 summarizes the critical material presented in this chapter as patterns, using the names suggested by other authors wherever possible.

 

Table 3. Mapping patterns.

Pattern | Description
Class Table Inheritance | Map each individual class within an inheritance hierarchy to its own table.
Concrete Table Inheritance | Map each concrete class of an inheritance hierarchy to its own table.
Foreign Key Mapping | A relationship between objects is implemented in a relational database as foreign keys in tables.
Identity Field | Maintain the primary key of an object as an attribute.  This is an example of Shadow Information.
Lazy Initialization | Read a high-overhead attribute, such as a picture, into memory when you first access it, not when you initially read the object into memory.
Lazy Read | Read an object into memory only when you require it.
Legacy Data Constraint | Legacy data sources are a constraint on your object schema but they should not drive its definition.
Map Similar Types | Use similar types in your classes and tables.  For example, it is easier to map an integer to a numeric column than it is to map it to a character-based column.
Map Simple Property to Single Column | Prefer to map the property of an object, such as the total of an order or the first name of an employee, to a single database column.
Mapping-Based Performance Tuning | To improve overall data access performance you can change your object schema, your data schema, or the mappings in between the two.
Recursive Relationships Are Nothing Special | Map a recursive relationship exactly the same way that you would map a non-recursive relationship.
Representing Objects as Tables | Prefer to map a single class to a single table but be prepared to evolve your design to improve performance.
Separate Tables for Class-Scope Properties | Introduce separate tables to store class-scope properties.
Shadow Information | Classes will need to maintain attributes to store the values of database keys (see Identity Field) and concurrency columns to persist themselves.
Single Column Surrogate Keys | The easiest key strategy that you can adopt within your database is to give all tables a single-column surrogate key that has a globally unique value.
Single Table Inheritance | Map all the classes of an inheritance hierarchy to a single table.
Table Design Time | Let your object schema form the basis from which you develop your data schema but be prepared to iterate your design in an evolutionary manner.
Uni-directional Key Choice | When a one-to-one unidirectional association exists from class A to class B, put the foreign key that maintains the relationship in the table corresponding to class A.
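Two of the inheritance patterns in Table 3 can be contrasted in a few lines. The sketch below is only an illustration in Python with sqlite3: the Person, Employee, and Customer classes, the person_type discriminator column, and every table name are assumptions made for the example.

```python
import sqlite3


class Person:
    def __init__(self, name):
        self.name = name


class Employee(Person):
    def __init__(self, name, salary):
        super().__init__(name)
        self.salary = salary


class Customer(Person):
    def __init__(self, name, credit_limit):
        super().__init__(name)
        self.credit_limit = credit_limit


def create_single_table(conn):
    # Single Table Inheritance: one table, a type discriminator, and a
    # nullable column for every attribute of every subclass.
    conn.execute(
        "CREATE TABLE person ("
        " person_id INTEGER PRIMARY KEY,"
        " person_type TEXT NOT NULL,"
        " name TEXT NOT NULL,"
        " salary REAL,"
        " credit_limit REAL)"
    )


def save_single_table(conn, person):
    conn.execute(
        "INSERT INTO person (person_type, name, salary, credit_limit)"
        " VALUES (?, ?, ?, ?)",
        (type(person).__name__, person.name,
         getattr(person, "salary", None),
         getattr(person, "credit_limit", None)),
    )


def load_single_table(conn):
    people = []
    query = ("SELECT person_type, name, salary, credit_limit"
             " FROM person ORDER BY person_id")
    for person_type, name, salary, credit_limit in conn.execute(query):
        if person_type == "Employee":
            people.append(Employee(name, salary))
        elif person_type == "Customer":
            people.append(Customer(name, credit_limit))
        else:
            people.append(Person(name))
    return people


def create_class_tables(conn):
    # Class Table Inheritance: one table per class; the subclass table
    # reuses the primary key of the root table.
    conn.execute(
        "CREATE TABLE person_base ("
        " person_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE employee ("
        " person_id INTEGER PRIMARY KEY REFERENCES person_base(person_id),"
        " salary REAL NOT NULL)"
    )


def save_employee_class_table(conn, employee):
    cur = conn.execute(
        "INSERT INTO person_base (name) VALUES (?)", (employee.name,)
    )
    conn.execute(
        "INSERT INTO employee (person_id, salary) VALUES (?, ?)",
        (cur.lastrowid, employee.salary),
    )
    return cur.lastrowid


def load_employee_class_table(conn, person_id):
    row = conn.execute(
        "SELECT b.name, e.salary FROM person_base b"
        " JOIN employee e ON e.person_id = b.person_id"
        " WHERE b.person_id = ?",
        (person_id,),
    ).fetchone()
    return Employee(row[0], row[1]) if row else None
```

The single-table version reads any object with one query but carries nullable columns; the class-table version keeps each table tight at the cost of a join per level of the hierarchy.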

 

11. References and Suggested Online Readings

List of References

At www.ambysoft.com/mappingObjects.html I maintain a list of links to mapping white papers posted on the web.

 

Suggested books:

Agile Database Techniques: This book describes the philosophies and skills required for developers and database administrators to work together effectively on project teams following evolutionary software processes such as Extreme Programming (XP), the Rational Unified Process (RUP), Feature Driven Development (FDD), Dynamic System Development Method (DSDM), or The Enterprise Unified Process (EUP).  In March 2004 it won a Jolt Productivity award.
The Object Primer 3rd Edition: Agile Model Driven Development (AMDD) with UML 2: This book presents a full-lifecycle, agile model driven development (AMDD) approach to software development.  It is one of the few books which covers both object-oriented and data-oriented development in a comprehensive and coherent manner.  Techniques the book covers include Agile Modeling (AM), Full Lifecycle Object-Oriented Testing (FLOOT), over 30 modeling techniques, agile database techniques, refactoring, and test driven development (TDD).  If you want to gain the skills required to build mission-critical applications in an agile manner, this is the book for you.
Patterns of Enterprise Application Architecture: This book presents a collection of architectural patterns, many of which hit on persistence-related issues.  I highly suggest this book as a complement to the material presented in this chapter.
Developing Applications with Java and UML: This book is similar to The Object Primer 2/e.  It goes into Java a little bit more, although it does not go very far beyond the UML or the RUP.  The book does cover basic mapping concepts and includes more source code examples.
Mastering EJB 2/e: This book covers EJB, including persistence-related issues.  It covers the basics as well as advanced persistence issues for EJB-based development.

 

Let Us Help

Ronin International, Inc. continues to help numerous organizations to learn about and hopefully adopt agile techniques and philosophies.  We offer both consulting and training offerings, including Agile Database Techniques Training.  In addition we suggest that you visit the Agile Modeling Site and the Enterprise Unified Process (EUP) site.

You might find several of my books to be of interest, including The Object Primer, Agile Modeling, The Elements of UML 2.0 Style, and Agile Database Techniques.

For more information please contact Michael Vizdos at 866-AT-RONIN (U.S. number) or via e-mail (michael.vizdos@ronin-intl.com).

 


Ronin International

Page first posted: January 14 2003
Page last updated: April 1 2004

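Note: I came across this article on a website and thought it was excellent. Real software has to be tied to a database, yet database design has no fixed rules to follow and can only rely on semantic analysis. This article introduces enterprise-level MRP and PRM.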

Explained with a dinner party: MRP or PRM?

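One day at noon, Lao Zhang, who is in charge of ERP selection at a large company, came home and said to his wife: "Darling, a few colleagues are coming over for dinner tonight. I have been dealing with a lot of ERP vendors lately and have picked up the essence of ERP management. Look, I even brought home a laptop with ERP software installed. This time I am going to run our dinner party on the most advanced ERP principles. I am going to make it an ERP banquet!" He went on: "I already manage all my dealings with colleagues in the sales and customer relationship modules. I have entered the confirmation that they are coming into contract management and order management, and the data has been passed automatically to the accounts receivable and payable, finance, and production modules. Based on the guests' wishes and requirements I have decided which dishes to make, so the master production schedule is done."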

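Wife: "Wonderful. So the house is your production workshop and I am the workshop supervisor. Which dishes are in your master production schedule, and when do they need to be made?"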

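Lao Zhang: "The guests arrive around 7, and ideally we finish eating by 8. The dishes are: cold platter, sweet and sour pork tenderloin, West Lake vinegar fish, Kung Pao chicken, steamed river crab, and sliced pork with crispy rice. These are all your specialties. Does that work?" Wife: "No problem, leave it to me."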

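Lao Zhang: "I have stored the recipes for these dishes in the BOM. Next, let me explode the BOM to see what raw materials we need. Specifically: one carp, one jin (500 g) of crab, one jin of lean pork, half a jin of chicken, one bag of crispy rice, one bottle of baijiu, 5 tomatoes, 10 eggs, and assorted seasonings. See, that is the material requirements plan. I have also entered everything in our refrigerator into the ERP inventory module, so let me check the stock on hand... We still need to buy the fish, the crab, 6 eggs, 5 tomatoes, a bag of salt, crispy rice, and so on."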
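Lao Zhang's netting step is simple arithmetic: net requirement = gross requirement minus stock on hand, never below zero, optionally rounded up to a purchasing lot. The sketch below shows that calculation in Python. The function name and the on-hand figure of 4 eggs are assumptions (the story only says that 10 eggs are needed and 6 must be bought); the lot of 12 eggs is the economic order quantity the story mentions for eggs.

```python
import math


def net_requirements(gross, on_hand, lot_size=None):
    """Return what must be purchased for each item.

    gross    -- quantity required per item (from the exploded BOM)
    on_hand  -- quantity already in stock per item
    lot_size -- optional purchasing lot per item; shortfalls are rounded up
    """
    lot_size = lot_size or {}
    plan = {}
    for item, needed in gross.items():
        shortfall = max(0, needed - on_hand.get(item, 0))
        if shortfall == 0:
            continue                      # stock covers the requirement
        lot = lot_size.get(item, 1)
        plan[item] = math.ceil(shortfall / lot) * lot
    return plan


gross = {"egg": 10, "tomato": 5, "carp": 1}
on_hand = {"egg": 4}                      # assumed: 4 eggs in the refrigerator

print(net_requirements(gross, on_hand))
# {'egg': 6, 'tomato': 5, 'carp': 1}
print(net_requirements(gross, on_hand, {"egg": 12}))
# {'egg': 12, 'tomato': 5, 'carp': 1}
```

Note that nothing in this calculation involves time or capacity; it only answers "how much", which is the limitation the rest of the story turns on.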

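Lao Zhang recorded these figures in the purchasing module and ran a supplier comparison query. "The fish should come from the open-air market, crab is cheapest at the supermarket to the east, and the little shop across the street has the best eggs. By economic order quantity, eggs are cheapest bought 12 at a time, crispy rice and salt come in bags of at least one, and a one-jin fish is best... See, the purchasing plan is ready. Just buy according to it."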

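His wife set off at once and soon came back with everything. Lao Zhang recorded each price and quantity in the laptop, and after passing quality inspection the goods were received into the warehouse, that is, put in the refrigerator. A small problem came up: the refrigerator was already quite full, and some items would not fit. Fortunately, cooking would start soon and nothing would sit for long, so it should not matter much. Lao Zhang then posted every expense to the finance module and immediately produced the purchase total and material cost figures.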

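It was only a little past 3 p.m. Apart from the small refrigerator problem, preparations for the ERP banquet were going smoothly and efficiently. Lao Zhang said proudly: "See the power of ERP? The workflow now follows the most advanced management principles and the most scientific, rational methods. We used to either buy too much and have leftovers, or run short. Now we buy exactly to demand. What a difference." His wife agreed: "ERP beats doing it by hand. The accounts never used to add up, and now they are done in an instant."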

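But that was not the end of it. What next? The guests arrive at 7, so when should the cooking start? Too early and the food goes cold; too late and there is not enough time. His wife asked Lao Zhang, who said: "This is production scheduling, which is for your workshop to carry out. How did you cook before? How far ahead should each task start? Which resource is the bottleneck? You should know from experience." But his wife was at a loss; she had never been asked to cook so many dishes in so short a time. Added together, the operations for all the dishes take more than 2 hours. Thinking it through, the house has three gas burners, so three pots can be on the fire at once: a frying pot, a steamer, and a wok. She can handle three pots at once, and preparing the ingredients for each dish needs another person, which Lao Zhang can do. So a lot of work can be done in parallel, and it should take much less than 2 hours. But exactly how long? An hour, or an hour and a half? Where to start with so many tasks? One dish at a time? Two at once? Could three be done together? The critical resource for this dish is the steamer and for another it is the frying pot; who knows what happens when several dishes are interleaved? What is the lead time for each operation? What is the critical path? Pressed by his wife, Lao Zhang worked on the laptop for a long time looking for an answer, and in the end could not give one.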

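Just then a colleague phoned to ask what time dinner would be over, because everyone wanted to go bowling afterwards. Lao Zhang, already worried about exactly this, answered vaguely: "A bit over an hour." Was that not a delivery commitment to the customer? The question was whether the cooking could really be finished within an hour from start to end. Nobody could say, and his wife grew more anxious. Then their daughter phoned to ask whether she could bring a few classmates home for dinner, with just two extra dishes. More trouble at a time like this! His wife said: "No, no, you eat out!" "Alas, a customer who came to our door has been driven away!" Lao Zhang stared blankly at the laptop, picturing what things would be like once the company went live on ERP.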

To keep to the schedule and avoid the trouble of a late delivery, his wife made a decision: start cooking immediately.

……

A few days later, Lao Zhang reviewed what had gone right and wrong with the ERP banquet. The main problems were:

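First: the crab and fish were bought too early. They were alive when bought but had been dead for over an hour by the time they were cooked, and the flavor suffered.
Second: some hot dishes were cooked too early and were cold by the time they were served to the guests.
Third: other dishes came out too late, and everyone sat idle for a long time waiting for the last dish. The operations were clearly badly sequenced, and even the bowling afterwards was delayed.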

In short, although every upstream management step went smoothly, the final production process was unsatisfactory.

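But his wife felt wronged. So many dishes, which would take over 2 hours one at a time, had been squeezed into an hour and a half, and that was no small feat. Dishes came out late, but the kitchen had been busy the whole time. To finish early you have to start early, so hot dishes inevitably go cold. The fish and crab died, but where in your purchasing plan was there any indication of what time to buy them? Your ERP banquet was planned for 1 hour, so why did ERP not tell you that 1 hour was impossible?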

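Lao Zhang had no answer, and began to think it over. He knew that these problems are, in essence, the inevitable result of flaws in MRP, the core of ERP production management. MRP addresses material requirements planning and places no limits on resource capacity, so it simply cannot produce a production operations plan under finite resources and multiple constraints. Without an operations plan, how can there be a precisely timed material requirements plan? Yet this is data an enterprise must have. What to do? It seemed the only option was to turn MRP on its head, which gives PRM. Is there such software? There is: it is called Process Resource Management, the latest APS (advanced planning and scheduling) product from Beijing Dongfang Xiaojixing (北京东方小吉星), said to be the only product in China on a par with international standards. Let us try it and see.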

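To put this "PRM Process Resource Management" to the test, Lao Zhang invited several colleagues again and confirmed the same menu and time as before. Would the PRM banquet be any different from the MRP banquet? Lao Zhang and his wife sat down with PRM to work out how to plan the dinner.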

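PRM takes a different approach from MRP. The core of MRP is the BOM (bill of materials), whereas the core of PRM is the BOW (bill of work), in which each job carries all of its production information: materials, resources, time, and so on. Lao Zhang first entered the whole process for each dish into PRM (which resources and materials it uses, how long it takes, its precedence relationships, and so on) to build a BOW, then clicked an unobtrusive "Calculate" button to see what would come out. The computer blinked away, calculating for several minutes. That was new; what was it computing? His wife watched it curiously. At last the result appeared: a detailed cooking schedule, complete with a Gantt chart. On a closer look both were astonished. PRM stated plainly that all the cooking could be completed in just 42 minutes. It also specified that the fish would be needed at 7:20:00 and the jin of crab at 7:40:00, how much of every other material was needed and at exactly what time, when each operation started and ended, how much free float there was in between, and which operations were critical.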
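The kind of calculation described here, scheduling operations under precedence constraints when each resource can do only one thing at a time, can be sketched with a simple greedy list scheduler. This is emphatically not the PRM product's algorithm, which the story does not describe. The operation names and durations below are invented for illustration and do not reproduce the 42-minute plan; a real scheduler would also search over alternative sequences rather than accept the first feasible one.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Operation:
    name: str
    resource: str                 # e.g. "prep", "steamer", "fryer", "wok"
    minutes: int
    after: Tuple[str, ...] = ()   # operations that must finish first


def schedule(ops: List[Operation]) -> Dict[str, Tuple[int, int]]:
    """Greedy list scheduling.

    Operations are taken in the order given (predecessors must be listed
    before their successors). Each starts as soon as its predecessors are
    done and its resource is free. Returns {name: (start, end)} in minutes.
    """
    resource_free: Dict[str, int] = {}
    times: Dict[str, Tuple[int, int]] = {}
    for op in ops:
        ready = max((times[p][1] for p in op.after), default=0)
        start = max(ready, resource_free.get(op.resource, 0))
        end = start + op.minutes
        times[op.name] = (start, end)
        resource_free[op.resource] = end
    return times


# Hypothetical bill of work for three dishes (durations are made up).
bill_of_work = [
    Operation("prep fish", "prep", 5),
    Operation("steam fish", "steamer", 15, after=("prep fish",)),
    Operation("prep pork", "prep", 5),
    Operation("fry pork", "fryer", 8, after=("prep pork",)),
    Operation("stir-fry pork", "wok", 4, after=("fry pork",)),
    Operation("prep chicken", "prep", 5),
    Operation("stir-fry chicken", "wok", 6, after=("prep chicken",)),
]

plan = schedule(bill_of_work)
makespan = max(end for _, end in plan.values())
serial_total = sum(op.minutes for op in bill_of_work)
print(makespan, serial_total)     # 28 48
```

Even this naive version shows the two outputs the story emphasises: a total duration that is much shorter than doing everything one after another, and a start time for every operation, from which the time each material is needed follows directly.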

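Was this possible? The two of them pored over the Gantt chart, first checking whether the sequence of operations for each dish was right. It was, down to details such as these: the cold dishes have to rest for a while before plating, the stir-frying for the sweet and sour pork may only start 2 minutes after the tenderloin has finished deep-frying, and the sliced pork with crispy rice must be stir-fried immediately after the crispy rice is fried. Then, were any resources in conflict? They checked prep, steamer, frying pot, and wok in turn. Every resource had its work packed tightly into the 42 minutes, finishing one operation before starting the next, each step linked to the one before, nothing out of order, and no conflicts at all. "Now this is real resource planning!" they sighed. Had they known there was a way to finish so quickly, last time need not have gone the way it did.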

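Lao Zhang quickly used PRM to work out several other 42-minute plans and was comparing them to see which was better. Then a colleague phoned again to ask about the time, and Lao Zhang answered readily: "Done in an hour!" His daughter phoned too, asking to add a dish: could she treat her classmates to fried meatball soup? That is a difficult dish: first shape the meatballs, then fry them, and finally make the soup, with the operations adding up to more than half an hour. Lao Zhang told her not to worry, added the dish in PRM, and recalculated. The result came quickly: by starting this dish at a suitable moment and making full use of idle resource time, the whole process grows by only 11 minutes and finishes in 53 minutes, still under an hour. No problem, come on over! See, the customer who was driven away has come back.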

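His daughter then asked: "Do you want me to come home and help?" She cannot cook, so like Lao Zhang she could only do prep work. Lao Zhang immediately ran the numbers in PRM with one more prep worker. Without the meatball soup, the shortest time drops from 42 to 41 minutes, only one minute less. With the meatball soup, the shortest time does not change at all and is still 53 minutes. Because prep is not the critical resource in either case, having her come home would not help. Lao Zhang cheerfully told her, "No need for you to help," looking entirely confident.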

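After the call, it suddenly occurred to Lao Zhang: the times at which I need the fish and the crab are now precise to the second. I can simply phone the fresh-food supplier and have them deliver to my door on time. They offer that service, so I do not have to go shopping myself. The fresh fish and crab arrive on time and go straight into the pot, so they are never stale again, and they take up no inventory at all. The refrigerator is not even needed, so last time's problem of things not fitting in it is solved. For a dinner party this may be trivial, but for an enterprise it solves the major problem of long-standing inventory overstock. This approach amounts to fully integrating your own production plan with external logistics, and is that not exactly an SCM supply chain? It seems the prerequisite for SCM is to have a precise production operations plan of your own first; otherwise punctual external logistics is of no use. And now the time at which I deliver my product to my customers is also precise to the second, so I can meet the same requirement from them. In that way, every link in the chain can produce efficiently and minimize inventory at the same time. SCM, which seemed so profound, now looks this simple.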

………….

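The next day Lao Zhang's colleagues were all talking about the previous night's banquet. The topic was not how the food tasted but the magical way Lao Zhang and his wife had cooked: three pots on the fire at once, several dishes under way together, stir-frying, steaming, and deep-frying side by side. The two of them were orderly and unhurried, putting one thing down and picking up the next, and dish after dish kept arriving at the table like a magic trick. It was truly impressive. Those who had been at the first dinner wondered: the same dishes, so why was it so different this time?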

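Lao Zhang now had a deeper understanding of ERP. The key to ERP, "enterprise resource planning", is to "plan" the "resources". An approach like the PRM banquet plan, which states explicitly how each resource should work so as to satisfy multiple constraints while achieving maximum efficiency, is what deserves to be called ERP. Traditional MRP, by contrast, only produces a material requirements plan and leaves the operations plan to be drawn up by hand.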

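Based on his own experience, Lao Zhang wrote down this formula on paper:
MRP + MIS (purchasing, sales, inventory, finance) = MRPII
PRM + MIS (purchasing, sales, inventory, finance) = ERP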
http://blog.aspcool.com/mqingqing123/posts/1412.aspx

December 25, 2004

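1. iTextSharp: an open source .NET project on SourceForge for generating PDF files. It lets you draw a PDF directly from C# code, and its website has very detailed tutorials, which is its biggest advantage. URL: http://itextsharp.sourceforge.net/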

2. Yet Another Forum: ASP.NET Forum 2 has now gone commercial, whereas Yet Another Forum is an open source .NET project. URL: http://www.yetanotherforum.net/

3. Cuyahoga: Cuyahoga is an open source .NET web site framework. It provides content management capabilities and has a modular approach.
The main goal of the project is to show .NET developers that there is a different way of building web applications than the well known sample applications. Although the project is targeted primarily at .NET developers, anybody interested can download the source and start playing.

When Linus Torvalds created Linux, the operating system it threatened most was probably Sun's Solaris.
      
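Now Torvalds and Linux are once again face to face with the old rival. Sun has turned Solaris into an open source project and is building a developer community centered on Solaris to promote its use on x86 servers.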
      
But the 34-year-old Finnish programmer is not worried. In fact he is not bothered at all, and calls Sun's move "a joke."
      
Torvalds worked for the chipmaker Transmeta for many years and has now decided to stay on for a while at his new employer, Open Source Development Labs.
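Working with his long-time collaborator and chief lieutenant Andrew Morton, Torvalds is trying out a new Linux development model: making frequent changes to the existing 2.6 kernel instead of a major overhaul every several months. The result is that the Linux kernel improves faster.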
In an interview with CNET News.com, Torvalds discussed Solaris, his improvisational programming style, and other topics.
Q: What do you make of the steps Sun took with the Solaris 10 release: the technical improvements, open sourcing, and support for x86 chips?
A: I am taking a wait-and-see attitude toward Sun. It has always been long on talk and short on action, and I am waiting for the action.
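Q: I think Sun has taken some steps. It has revived Solaris for x86 and added some interesting features: containers, DTrace, and ZFS. It is also actively courting software developers and software vendors. Sun has also announced that the official x86 version of Solaris will be free. What do you think of Sun's move toward the x86 platform and the new features in Solaris?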
      
A: Solaris/x86 is a joke. As far as I know, it supports very little hardware. If you think Linux still has driver problems, go and try Solaris/x86 before you complain.
      
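Q: IBM's Steve Mills once said that most of the Linux roadmap is a matter of following the leader: Unix is the model for Linux. So does Linux have a roadmap of its own, or is it just making use of Unix technology?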
      
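A: I am a believer in proven concepts. My hero is Newton, partly because of his enormous scientific achievements, but more because of his famous line: "If I have seen further, it is by standing on the shoulders of giants."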
      
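Newton was not actually an easy person to get along with, but that sentence captures the essence of science. Open source is the same: the point is to stand on the shoulders of giants and make incremental improvements to other technologies and ideas.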
      
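In my view, inventing an entirely new technology from scratch just to implement a few new features and to stress how different you are is extremely stupid and vain. The great thing about Linux is that we do not dismiss other people's technology wholesale; we have not thrown the baby out with the bathwater. Many projects think they are something special, and that mindset (Not Invented Here, or NIH) is really a disease.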
      
Q: Which outside misconceptions about Linux annoy you most?
      
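A: I do not get angry easily, so there is nothing I would say particularly infuriates me. But there is one notion out there that I find amusing: that one person or one company can single-handedly turn the whole market upside down. This misconception is not really about Linux, or even about the IT industry. People tend to assume that things succeed because someone had special vision at the start. Everyone seems to believe it, but it is really vanity at work.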
      
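I constantly have to explain to outsiders that I do not have the power to control how Linux develops. It is the environment that fosters the development, not how great some leader is. The same goes for any supposedly great coach or spiritual guru.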
      
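Q: I find the great-man theory rather phony myself. Still, you do have considerable influence over Linux, and Linux has a huge impact on the computing industry. Has Linux made you more humble, or given you a greater sense of mission?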
      
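A: It is not that I lacked a sense of mission before, but Linux has not made me more humble. Rather, I have come to see more clearly that what powerful people do has a lot to do with changes in the environment around them. That does not make me humble, but it does at least keep me more down to earth.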
      
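I am not saying individuals do not matter. Individuals matter a great deal, and I believe smart people can accomplish hundreds or thousands of times more than ordinary people. But more important still is having the right environment to let people shine, and the greatest achievement of Linux is that it lets capable people shine.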
      
Q: When Sun releases the open source version of Solaris, will you look at its source code?
      
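A: Probably not. Not out of hatred, but simply because I have neither the time nor the interest. Linux has never been about "the other competitors"; its goal is just to outdo itself, so I am not very interested in Solaris. I am sure that if Solaris really has anything outstanding, someone will tell me.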
      
Q: Did you not just say we all stand on the shoulders of giants? There may well be some good ideas in Solaris. Why ignore them?
      
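A: Because I personally believe I have already taken what there is to take from general Unix principles, and Solaris has nothing left worth learning. Whichever way you compare them, Linux now comes out ahead.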
      
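More importantly, even if I am wrong and Solaris really is better than Linux in some respect, someone will point it out to me. Trying to find those things myself would be a waste of time.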
      
Q: Suppose that a few years from now Linux has beaten every version of Unix on the market. Where will you look for inspiration then?
      
A: I have never been short of inspiration.
      
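My inspiration comes not from other systems but from what users need. Users do not usually say "Unix can do this, why can't Linux?" What we usually hear is "I want such-and-such a feature but cannot find it," or "It can be done this way, but it is not ideal." That is where our inspiration comes from.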
      
Q: How much effort do you put into short-term and long-term planning? I think of you as an "improvisational" person rather than a "long-term planner".
      
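A: Right, I really do not make long-term plans. My long-term plans are vague, "intuitive" things that I cannot put into words. I try to avoid setting long-term goals; it is more a feeling for what I like and do not like. Some people may think this is a casual way of working, and it is, but it is very flexible: we do not neglect the problems that need solving right now because we are focused on five years out.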
      
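I find grand visions interesting but also scary. What I say most often is that we have no grand ambitions. We just focus on small improvements, and whether small improvements set off a big revolution is left to take its own course.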
      
Q: Do you think the GNU project, and the GPL that Linux uses, could have come about without the vision of Richard Stallman?
      
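A: It might still have happened without him, but it would have been somewhat different. It is like asking what the world would be like without some great figure. The world would be different, and there is no doubt that the drive that comes from vision is very powerful.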
      
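You might also ask whether Linux would have come about without me. I am not flattering myself here: perhaps a BSD version of Unix would have taken off, or some other university student would have developed another operating system.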
      
Q: Why did you choose the GPL for Linux? What improvements would you like to see in the new version of the GPL?
      
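A: At the time I just wanted a license that met two criteria: the code is open to other people, and improvements to the code are made available to everyone. That is all; anything more is overstatement.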
      
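That sounds very simple, but most open source licenses do not meet my criteria. Where they fail is that they let some people control the improvements to the code.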
      
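I am not worried about the new GPL. I am not a lawyer and have no interest in the specific wording. My only complaint about the GPL is that it takes a pile of verbiage to say many simple things, but that seems to be how it goes with legal matters.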
      
Q: How is the kernel development process changing?
      
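A: The biggest change I had envisioned was opening a 2.7 kernel tree (an experimental version "forked" from the 2.6 kernel tree), but nobody supported the idea. Others are convinced that the 2.6.x development model works well.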
      
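That does not mean 2.7.x will not appear; it will, some months from now. It means that the stable kernel tree will overshadow the development version. I see this as a sign of maturity, and it shows that stable releases matter so much to people that they cannot be abandoned lightly.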
      
Q: Does the new development process mean that improvements will be integrated into Linux faster?
      
A: Yes, that is one of the advantages of this model. People had grown tired of two-year development cycles.
      
Q: What changes would be enough to trigger the release of a 2.7 kernel tree?
      
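A: If I knew, I would tell you. In previous development cycles we all knew the core problems that needed fixing, but fixing them would be very disruptive to the components that depend on that core infrastructure.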
      
Q: How many people are working on Linux at the moment? My impression is that a small number of people contribute most of the code.
      
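A: About 200 developers take part in Linux kernel development on a regular basis. The records show that around 1,000 people took part in kernel development last year, but most of them contribute only occasionally. And that does not include the contributions of others, such as people who do testing or quality assurance, or who give feedback.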
      
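Q: What do you think of the growing importance of Red Hat and Novell in the Linux market? They modify Linux more according to their customers' requirements than yours. Does that bother you?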
      
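A: The less I deal with customers, the better. I think the greatest contribution of the commercial vendors is acting as the bridge between developers and customers; they can strike a balance between purely technical and purely commercial concerns. Open source also lets technical people stay truer to themselves.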
      
Q: Do you feel that the Linux vendors are steering Linux and that you are merely a bystander?
      
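A: Not as far as I am concerned, and I do not think the vendors see it that way either. But they do give a lot of input. People need to feel involved; if anyone feels they are just a bystander, that is not good for anyone.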
      
Q: Is the biggest barrier to Linux on the desktop right now the desktop technology or the marketing?
      
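A: It is a combination of things. Both play a part: on the technology side, Linux still has plenty of room for improvement, and some of it has to do with marketing. But more important is user "inertia", that is, whether users have any motivation to switch.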
      
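People are generally reluctant to change. That has been especially apparent over the past year: the technology is clearly already there, but users are not yet mentally ready to switch. That is why I think the commercial desktop is important, and it is also the main reason DOS (and later Windows) feels so familiar to everyone. The mainstream desktop will go down this road sooner or later, but it will take several more years.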

December 24, 2004

Example Application Screenshot

Introduction

Hibernate has been huge in the Java world as an easy-to-use, high-performance OR mapper, but only recently has it started to make a name for itself in the .NET world with NHibernate.

I’ve personally had a fair amount of experience with different OR mappers and object persistence frameworks. I decided to jump in and play with alpha .4 of NHibernate, and have honestly found it to be quite capable and relatively low on bugs (I haven’t found any yet, anyway).

Background

After reading a couple of articles here (NHibernate) on the subject of NHibernate, I had a general clue, but probably more questions than answers.

I decided to bundle some real-world examples with an explanation of why I am doing what; that should make a pretty good starting point for anyone wanting to dive into OR mappers, whether for a hobby or for an enterprise application.

I’ll start by saying that my work was initially based off these examples. However, I found that the examples were sometimes incorrect, or files weren’t named, or procedures weren’t explained. Anyhow, I aim to help with that!

Article Goals

This is my first submission of any kind, so keep non-constructive criticism to a minimum. I am 100% open to feedback, I don’t claim to have all the answers, or claim to have the best methods to achieve this or that.

Over the next couple of articles in this series, I’d like to demonstrate n-tier frameworks utilizing NHibernate.

In this part, I’ll present basic data layer logic that I reference directly for the time being; in later articles, I intend to implement my business logic layer, as well as security on my business methods.

The target audience is anyone who is interested in n-tier development. As I said before, this should help hobbyists and people interested in getting started in enterprise development, and maybe fill in a little where I struggled (because of the lack of documentation). I’d also like to provide a starting point for people who are just getting into building robust applications (whether for business or hobby, there is no excuse for bad application design).

Getting Started…

These steps are required, however, the order in which you do them is debatable.

Database Schema – I am using the Northwind database, which is pretty much standard and is downloadable from Microsoft. For now, I only really need Customers and Orders; the other tables are not necessary.

Northwind Schema

Class Definitions – You need to write a class with at least one constructor that has 0 parameters, and public properties representing data in the database: Bags, or Collections (Customer.Orders will end up being an IList after everything is said and done; we’ll go into this a little more later…).

Customer.cs

using System;
using System.Collections;

namespace nhibernator.BLL
{
    /// <summary>
    /// Persistent entity that maps to the Northwind Customers table.
    /// </summary>
    public class Customer
    {
        #region Private Internal Members

        private string m_CustomerID, m_CompanyName, m_ContactName,
                m_Address, m_City, m_Region, m_PostalCode, m_Country;
        private IDictionary m_Orders;
        #endregion

        #region Public Properties

        public string CustomerID
        {
            get
            {
                return m_CustomerID;
            }
            set
            {
                m_CustomerID = value;
            }

        }

        public string CompanyName
        {
            get
            {
                return m_CompanyName;
            }
            set
            {
                m_CompanyName = value;
            }
        }

        public string ContactName
        {
            get
            {
                return m_ContactName;
            }
            set
            {
                m_ContactName = value;
            }
        }

        public string Address
        {
            get
            {
                return m_Address;
            }
            set
            {
                m_Address = value;
            }
        }

        public string City
        {
            get
            {
                return m_City;
            }
            set
            {
                m_City = value;
            }
        }

        public string Region
        {
            get
            {
                return m_Region;
            }
            set
            {
                m_Region = value;
            }
        }

        public string PostalCode
        {
            get
            {
                return m_PostalCode;
            }
            set
            {
                m_PostalCode = value;
            }
        }

        public string Country
        {
            get
            {
                return m_Country;
            }
            set
            {
                m_Country = value;
            }
        }

        public IDictionary Orders
        {
            get
            {
                return m_Orders;
            }
            set
            {
                m_Orders = value;
            }
        }

        #endregion

        public Customer()
        {
            //
            // TODO: Add constructor logic here
            //
        }
    }
}

Ok, so now that we have the Customer class defined, we’ll want to more or less duplicate this for Products and Orders.

Mapping file – <ClassName>.hbm.xml, to be built as an embedded resource (defines how our entities map to database objects).

You’ll need one of these for each class you intend to persist. It sounds time-consuming, but compared with the alternatives, it’s relatively pain free.

Customer.hbm.xml

<hibernate-mapping xmlns="urn:nhibernate-mapping-2.0">
    <class name="nhibernator.BLL.Customer, nhibernator" table="Customers">

        <id name="CustomerID" column="CustomerID" type="String" length="20">
            <generator class="assigned" />
        </id>
        <!-- Map properties I'd like to persist/fetch,
            assume column = property name,
            and type is determined by reflection -->
        <property name="CompanyName"></property>
        <property name="ContactName"></property>
        <property name="Address"></property>
        <property name="City"></property>
        <property name="Region"></property>
        <property name="PostalCode"></property>
        <!-- Orders collection: pull customer orders, but
           load lazily to minimize load time and resource usage. -->
        <set name="Orders" cascade="all" lazy="true">
            <key column="CustomerID" />
            <one-to-many class="nhibernator.BLL.Order, nhibernator" />
        </set>
    </class>
</hibernate-mapping>

Unfortunately, Northwind has some questionable design practices. However, it’s everywhere and thus I think it to be the perfect database for this example.

Normally, CustomerID, the ID field in Customer, would be a unique int or a GUID, and the mapping would have slightly different syntax. However, in this case it is a string.

We can effectively “Alias” table field names to class properties like so:

<property name="CompName" column="CompanyName" type="String" length="40"/>

However, I’d like to stick to database field names as much as possible. You can actually omit the column altogether if you want the property to have the same name…

<property name="CompanyName" length="40"/>

You might also notice I’ve removed the type="String" bit of text. As of recent versions, anyway, NHibernate uses reflection to determine the type of the data (perhaps there are some performance ramifications, though I really don’t know).
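The two conventions just described (the column defaults to the property name, and the type is discovered by inspecting the class) are easy to picture in a few lines. The following is only a Python sketch of the idea, not how NHibernate itself is implemented; the COLUMN_OVERRIDES dictionary and both helper functions are names invented for this example.

```python
from dataclasses import dataclass, fields


@dataclass
class Customer:
    CustomerID: str
    CompName: str      # aliased: the column is called CompanyName
    City: str


# Only properties whose name differs from the column need an entry,
# mirroring the optional column="..." attribute in the mapping file.
COLUMN_OVERRIDES = {"CompName": "CompanyName"}


def column_map(cls, overrides=None):
    """Return {property: (column, python_type)} using the class definition."""
    overrides = overrides or {}
    return {
        f.name: (overrides.get(f.name, f.name), f.type)
        for f in fields(cls)
    }


def select_sql(cls, table, overrides=None):
    """Build a SELECT that returns columns under their property names."""
    columns = ", ".join(
        f"{column} AS {prop}"
        for prop, (column, _) in column_map(cls, overrides).items()
    )
    return f"SELECT {columns} FROM {table}"


print(select_sql(Customer, "Customers", COLUMN_OVERRIDES))
# SELECT CustomerID AS CustomerID, CompanyName AS CompName, City AS City FROM Customers
```

The appeal of the convention is that the mapping only has to mention what is unusual; everything else is read off the class.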

Important note…

While you could probably load the mappings on the fly by specifying a filename, it seems to be a general practice to compile to embedded resource. Once you’ve created your .hbm.xml, make sure your build action is Embedded Resource like so:

build action

So now, I leave you to write the mapping files for the other two persisted objects, or you can simply download the code!

NHibernate configuration – can be done through XML file, resource stream, or manually in code. (I chose the code route, as trial and error with other methods never worked as they should; don’t know if this is error on my part, or side effect of being an alpha).

NHibernate needs to be told what database provider to use, what SQL dialect to use, and connection string. Normally, a config file would look like this:

<?xml version="1.0" encoding="utf-8" ?>
<hibernate-configuration xmlns="urn:nhibernate-configuration-2.0">
    <session-factory name="nhibernator">

        <property name="connection.provider">
            NHibernate.Connection.DriverConnectionProvider
        </property>
        <property name="connection.driver_class">
            NHibernate.Driver.SqlClientDriver
        </property>
        <property name="connection.connection_string">
            Server=localhost;initial catalog=Northwind;Integrated Security=SSPI
        </property>
        <property name="show_sql">false</property>
        <property name="dialect">
            NHibernate.Dialect.MsSql2000Dialect
        </property>
        <property name="use_outer_join">true</property>
        <property name="query.substitutions">
            true 1, false 0, yes 'Y', no 'N'
        </property>

        <mapping assembly="nhibernator" />
    </session-factory>
</hibernate-configuration>

However, like I said, I couldn’t get the embedded resource stream for that to work, so I’ve written up the following code to simulate it. (Not as easy to come along and change later, but sufficient for getting the idea across).

In my data layer class CustomerFactory.cs, I have this code in the constructor for the time being:

// Needs: using System.Collections; using NHibernate.Cfg;
config = new Configuration();
IDictionary props = new Hashtable();

// The same settings the XML configuration file would carry.
props["hibernate.connection.provider"] =
    "NHibernate.Connection.DriverConnectionProvider";
props["hibernate.dialect"] = "NHibernate.Dialect.MsSql2000Dialect";
props["hibernate.connection.driver_class"] =
    "NHibernate.Driver.SqlClientDriver";
props["hibernate.connection.connection_string"] =
    "Server=localhost;initial catalog=Northwind;Integrated Security=SSPI";

foreach (DictionaryEntry de in props)
{
    config.SetProperty(de.Key.ToString(), de.Value.ToString());
}

// Load every embedded *.hbm.xml mapping from the nhibernator assembly.
config.AddAssembly("nhibernator");

Log4Net configuration – sure, you could skip this, but then good luck making sense of errors when things don’t work =) In my client app, I simply read a log4net.config on load, and start from there.

n-Tiered Frameworks…

If you’ve encountered any sort of professional or enterprise software development, you’ve undoubtedly had these concepts beaten into your brain, and probably for the good.

Too much abstraction is bad, but a decent separation of data, business logic, and presentation logic can improve efficiency, code readability, and general upkeep.

So I’ve created a root namespace, and then created a DLL sub-namespace for data layer logic (sure, I could have used something a little more informative, but I am a creature of habit).

A sub-namespace, BLL, for business layer logic…

And finally a whole other project for my presentation logic (I like to know I can use my tiered logic in console, web, remoting, webservices etc.).

DataLayerLogic – By definition, this code really should probably handle getting/updating/inserting/deleting entities in the database as well as data validation (SQL injection proofing etc.)

Luckily, NHibernate handles a fair amount of what’s supposed to be done in the data layer itself, and so we generally end up writing some minor data validation, and methods to fetch all, fetch a single entity, or fetch entities based on some criteria.

What I’ve done is create a separate “factory” for each entity type (CustomerFactory, ProductFactory, OrderFactory). In the constructor, I set up the NHibernate session factory and spawn an NHibernate ISession. Then each method that gets or sets data checks session.IsConnected and, if the session is not connected, calls session.ReConnect(). This is important because the connection isn’t held for the entire time my data factory is instantiated: I perform the logic, and then disconnect. On dispose of my factory class, I make sure the session gets disposed, etc.
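The connect-on-demand pattern described above can be shown without NHibernate at all. Below is a rough Python analogue using sqlite3: the class name echoes CustomerFactory, but the table layout, the method names, and the _run helper are all made up for this sketch, and it is not a translation of the article's C# code.

```python
import os
import sqlite3
import tempfile


class CustomerFactory:
    """Data-layer object that holds a connection only while a call runs."""

    def __init__(self, db_path):
        self._db_path = db_path
        self._conn = None
        self._run(lambda conn: conn.execute(
            "CREATE TABLE IF NOT EXISTS customers ("
            " customer_id TEXT PRIMARY KEY,"
            " company_name TEXT NOT NULL)"
        ))

    @property
    def is_connected(self):
        return self._conn is not None

    def _run(self, work):
        if not self.is_connected:             # reconnect on demand
            self._conn = sqlite3.connect(self._db_path)
        try:
            with self._conn:                  # commit, or roll back on error
                return work(self._conn)
        finally:
            self._conn.close()                # release the connection at once
            self._conn = None

    def add_customer(self, customer_id, company_name):
        self._run(lambda conn: conn.execute(
            "INSERT INTO customers VALUES (?, ?)",
            (customer_id, company_name),
        ))

    def get_customers(self):
        return self._run(lambda conn: conn.execute(
            "SELECT customer_id, company_name FROM customers"
            " ORDER BY customer_id"
        ).fetchall())


# Usage: a throwaway database file in a temporary directory.
db_file = os.path.join(tempfile.mkdtemp(), "northwind_demo.db")
factory = CustomerFactory(db_file)
factory.add_customer("BERGS", "Berglunds snabbkop")
print(factory.get_customers())
```

The factory object can live as long as the application likes, but a database connection exists only for the duration of each call.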

I’ve got two methods built out for fetching customers in CustomerFactory.cs.

First… GetCustomers() in which I simply get back an IList of Customer objects.

Secondly… GetCustomer(string CustomerID), to which I feed an ID such as “BERGS”; since I fetch with the FetchMode.Eager flag, I get all the subcollections loaded up front, without having to access them first.
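The difference between the lazy mapping of Orders and an eager fetch comes down to when the second query runs. Here is a small Python sketch of both behaviours using sqlite3; the CustomerRepository class, its order_queries counter, and the eager parameter are inventions for the example and are not part of NHibernate.

```python
import sqlite3


class Customer:
    def __init__(self, customer_id, company_name, load_orders):
        self.customer_id = customer_id
        self.company_name = company_name
        self._load_orders = load_orders   # called the first time orders is read
        self._orders = None

    @property
    def orders(self):
        if self._orders is None:          # lazy: hit the database on first use
            self._orders = self._load_orders(self.customer_id)
        return self._orders


class CustomerRepository:
    def __init__(self, conn):
        self.conn = conn
        self.order_queries = 0            # counts trips made for orders

    def _fetch_orders(self, customer_id):
        self.order_queries += 1
        rows = self.conn.execute(
            "SELECT order_id FROM orders WHERE customer_id = ?"
            " ORDER BY order_id",
            (customer_id,),
        )
        return [row[0] for row in rows]

    def get_customer(self, customer_id, eager=False):
        row = self.conn.execute(
            "SELECT customer_id, company_name FROM customers"
            " WHERE customer_id = ?",
            (customer_id,),
        ).fetchone()
        if row is None:
            return None
        customer = Customer(row[0], row[1], self._fetch_orders)
        if eager:
            customer.orders               # force the collection to load now
        return customer
```

With eager=False the orders query is deferred until the collection is first touched and then cached; with eager=True it has already run by the time get_customer returns.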

BusinessLayerLogic – Theoretically, in a lot of cases, simply wrapping DataLayerLogic methods/classes is sufficient. However, I tend to fire off events and do extra validation that goes beyond simple insertion problems.

In this article, I’ve not really built out this layer; look for it in the next in this series.

PresentationLogic – in this case, I simply create an instance of my DataLayerLogic, or invoke some of the static methods, and update user interface appropriately. (One could use ASP.NET or WinForms, or upcoming XAML or any number of presentation methods here).

I was sort of in a rush in this case; however, my little WinForms app will fetch all customers, and insert hard-coded data as a new user. (I’ll make this a lot more presentable in an update to this article.)

Of course, your namespace structure could look totally different (as most people have <companyname>.<technology>.<project>.<feature>), but for all intents and purposes, I’ve come up with this:

namespace layout

Notice the .hbm.xmls. You can actually throw these in a mappings folder or whatever makes things cleaner for you…

I can simply call a static method and get a collection of Customers which are data-bindable by default.

Where to go from here?

NHibernate has a wealth of different configuration options for caching and performance tweaking, which I will get into in my second article. Unfortunately, there isn’t a whole lot of documentation on the subject, so you can try looking at the Hibernate documentation.

Also in the next article, you can expect to see a mechanism for my n-tier BusinessLayerLogic to check credentials for fetching the data.

In closing…

Like I said, I struggled with NHibernate. Getting it up and running the first time can be quite a pain; however, it can be done. The example works, but you’ll probably want to download NHibernate separately from here and add nhibernate.dll to your “References”.

Again, please leave constructive criticism only. This is my first article, and I realize it was on a broad subject. Thanks, and keep your eyes open for part 2!

History

v1 – First version of the article. No bugging me about layout, I’m working on it. Documentation too!

About ronnyek


I’ve been involved with software development since teaching Basic to teachers in 6th grade. Since then I’ve been involved with every aspect of computers.

Lately, I’ve involved myself very much in building Enterprise Java and .NET applications.


Published: 4/29/2004 | Updated: 4/29/2004

Dare Obasanjo
Microsoft Corporation

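September 15, 2003

Summary: Dare Obasanjo revisits his RSS Bandit project, a C# application that retrieves and displays news feeds from multiple Web sites, and improves it using several XML features of the .NET Framework to build a rich .NET client application. (16 printed pages)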


Download the RSSBandit Installer.msi sample file.

Contents
Introduction
Understanding the RSS Bandit User Interface
RSS Bandit Architecture Overview
XML Technologies and RSS Bandit
Future Plans for RSS Bandit


Introduction

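In my previous article I described the inner workings of the RSS Bandit application, which gathers information from various Web sites by processing RSS feeds on the Web. As I mentioned in that article, RSS is an XML format used for syndicating news and similar content from online news sources.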

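An RSS feed is a regularly updated XML document that contains metadata about a news source and the content in it. At minimum, an RSS feed should contain a channel element that represents the news source; the channel has a title, a link, and a description of the news source. In addition, an RSS feed typically contains one or more item elements that represent individual news items, each of which should have a title, link, or description.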
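As a quick illustration of that structure, the following Python sketch parses a tiny feed with the standard xml.etree.ElementTree module. The sample feed and the parse_feed function are made up for this example; RSS Bandit itself is a C# application and its own parsing code is not shown here.

```python
import xml.etree.ElementTree as ET

# A made-up feed: one channel with title/link/description and two items.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <description>A made-up feed for illustration</description>
    <item>
      <title>First story</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <description>An item that has only a description</description>
    </item>
  </channel>
</rss>"""

FIELDS = ("title", "link", "description")


def parse_feed(xml_text):
    """Return the channel metadata and its items as plain dictionaries."""
    channel = ET.fromstring(xml_text).find("channel")
    if channel is None:
        raise ValueError("RSS feed has no channel element")
    feed = {name: channel.findtext(name) for name in FIELDS}
    feed["items"] = [
        {name: item.findtext(name) for name in FIELDS}
        for item in channel.findall("item")
    ]
    return feed


feed = parse_feed(SAMPLE_FEED)
print(feed["title"], len(feed["items"]))
```

Fields that an item omits come back as None, which reflects the fact that an item only needs some of the three elements.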

自撰写上一篇文章以来,RSS 已被更广泛地用作 Web 上散布新闻的机制。RSS 提要不仅由联机新闻源(如 Yahoo!NewsBBCRolling Stone magazine)使用,而且已经在以开发人员为中心的信息源(如 Microsoft Developer Network (MSDN)Oracle Technology Network (OTN)Sun Developer Network)之间变得很流行。随着 RSS 提要的迅速增加,桌面新闻集合器已经成为一个功能强大的工具,让那些有兴趣及时了解各种新闻源信息的人不必浏览多个 Web 站点来获取固定的信息。

在过去的几个月中,GotDotNet 上的 RSS Bandit 工作区已经收到了 .NET 框架开发人员社区多个成员(如 Torsten RendelmannMichael EarlsJoe Feser)和许多 RSS Bandit 工作区其他成员的踊跃投稿。本文描述在过去几个月内 RSS Bandit 应用程序中各种新增功能的内部原理。



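Understanding the RSS Bandit User Interface

The RSS Bandit user interface is inspired by mail and news readers such as Microsoft Outlook® and Microsoft Outlook Express. The most visible difference between the current version of RSS Bandit and the version in the previous article is the improved user interface. RSS Bandit uses the Magic Library, a framework of user interface controls that offers richer functionality than the basic Windows Forms controls.

xml09152003-fig01

Figure 1. Reading news in RSS Bandit

One of the powerful features of the Magic Library is the ability to create tabbed panes, which allow multiple forms to be nested within a single application. Figure 2 below shows how the tabbed panes in the Magic Library let users work with multiple Web browser windows from within RSS Bandit.

xml09152003-fig02

Figure 2. Browsing the Web in RSS Bandit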


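RSS Bandit Architecture Overview

The RSS Bandit application consists of two distinct parts: the graphical user interface components, and the XML and networking components. The main GUI classes are WinGuiMain and RssBanditApplication, while the main XML and networking classes are RssHandler and RssLocater.

The RssHandler class downloads RSS feeds at specified intervals and hands them to a CacheManager for storage. The store used by the CacheManager is loosely coupled to the application. In fact, CacheManager is an abstract class that currently has only one concrete implementation, FileCacheManager, which caches files on the local file system. This flexibility means new kinds of CacheManager can be introduced later to use better, more optimized stores, such as a database management system. Likewise, the RssHandler class is loosely coupled to the user interface and can be reused by other applications that need to process RSS feeds. Clients of the RssHandler class register a callback (a delegate) when they instantiate the class. The RssHandler object then invokes the registered callback whenever a new or updated feed is downloaded. Information about which feeds to download, along with other configuration data, comes from a feed subscription list written in XML. RssLocater is used when trying to discover the RSS feeds for a particular Web site, and applies a well-defined set of heuristics when attempting to locate a feed.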
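The loose coupling described above can be illustrated with a minimal sketch. Everything here is hypothetical: the type names end in "Sketch" to make clear they are not the RSS Bandit classes, the delegate signature is an assumption, and the cache keeps data in memory rather than in files.

```csharp
using System.Collections.Generic;

// Hypothetical callback signature: invoked when a new or updated feed arrives.
public delegate void FeedUpdatedCallback(string feedUrl, string feedXml);

// Abstract storage, mirroring the CacheManager / FileCacheManager split.
public abstract class CacheManagerSketch
{
    public abstract void Save(string feedUrl, string feedXml);
    public abstract string Load(string feedUrl); // null when nothing is cached
}

// In-memory stand-in used only for this sketch.
public class MemoryCacheManagerSketch : CacheManagerSketch
{
    private readonly Dictionary<string, string> store = new Dictionary<string, string>();

    public override void Save(string feedUrl, string feedXml)
    {
        store[feedUrl] = feedXml;
    }

    public override string Load(string feedUrl)
    {
        string xml;
        return store.TryGetValue(feedUrl, out xml) ? xml : null;
    }
}

public class FeedHandlerSketch
{
    private readonly CacheManagerSketch cache;
    private readonly FeedUpdatedCallback onFeedUpdated;

    // The client registers its callback when it creates the handler.
    public FeedHandlerSketch(CacheManagerSketch cache, FeedUpdatedCallback onFeedUpdated)
    {
        this.cache = cache;
        this.onFeedUpdated = onFeedUpdated;
    }

    // Stands in for the download step: stores the content and notifies the
    // client only when it differs from what is already cached.
    public bool Process(string feedUrl, string feedXml)
    {
        if (cache.Load(feedUrl) == feedXml) return false;
        cache.Save(feedUrl, feedXml);
        if (onFeedUpdated != null) onFeedUpdated(feedUrl, feedXml);
        return true;
    }
}
```

Because the handler only knows about the abstract cache and the delegate, a database-backed cache or a different user interface could be swapped in without touching it.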

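RssBanditApplication inherits from ApplicationContext and controls WinGuiMain, a Windows Form that contains a tree view (which displays the list of subscribed feeds, optionally grouped into freely defined categories), a list view (which displays information about the items in the currently selected feed in the threaded style familiar from NNTP readers such as Outlook Express), and an embedded Web browser (which displays the item content). On startup, RssBanditApplication checks whether an instance of the program is already running. If so, it forwards any command-line arguments to that instance and terminates. If no instance is running, RssBanditApplication registers a delegate with the RssHandler, which manages the downloading and processing of RSS feeds. When a new or updated feed has been downloaded, RssBanditApplication updates the user interface through the delegate in a thread-safe manner, using the approach described in the article Safe, Simple Multithreading in Windows Forms, Part 1 by Chris Sells.

The RssBanditApplication class also acts as a mediator for the various user interface components (menus, toolbar buttons, context menus, and so on), either delegating actions to WinGuiMain or RssHandler or handling them itself as the user interacts with the application. Each major user interface component that can initiate an action on behalf of the user implements the ICommand interface (the Command pattern) and the ICommandComponent interface, which abstract away the implementation details of the various classes.

The user interface also lets users manage various aspects of the RssHandler class's behavior. Users can add feeds to and remove feeds from the subscription list, configure how often feeds are downloaded, organize feeds into categories, and set proxy server information. The RssItemFormatter class handles the display of news items by using the XslTransform class: it takes a user-defined XSLT style sheet and transforms an RssItem, which implements the IXPathNavigable interface, into HTML.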


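XML Technologies and RSS Bandit

The RSS Bandit application makes extensive use of the XML technologies in the .NET Framework. RSS Bandit uses XML serialization to convert between XML configuration files and objects, XSLT to provide customizable views of news items, XPath to process the HTML content of RSS feeds and remove potentially malicious elements, the System.Xml.XmlWriter class to ensure that it writes well-formed XML, and so on.

Customizable Themes Using XSLT

When news items are read in RSS Bandit, they are displayed in the pane at the bottom right of the application, which is actually an embedded Web browser control. Although an embedded Web browser was already used to display the content of news feeds, the initial version of RSS Bandit did not take advantage of the flexibility this offers. In the current version of RSS Bandit, users can create an XSLT style sheet to customize how news items appear in the Web browser pane. Figure 3 is a screen shot of the configuration menu where users can choose a particular style sheet from the templates folder of the RSS Bandit application.

xml09152003-fig03

Figure 3. Selecting a custom style sheet

Each downloaded RSS feed is represented as a FeedInfo object that contains a list of RssItem objects. The RssItem class implements the IXPathNavigable interface, which means it is acceptable input to the Transform method of the System.Xml.Xsl.XslTransform class. Its IXPathNavigable implementation exposes the RssItem as an RSS 2.0 XML feed that contains a single item representing the data in the RssItem. When a news item is displayed in the Web browser pane, the currently selected XSLT style sheet and the RssItem instance are passed as input to the Transform method of the XslTransform class, and the result of the transformation is then rendered in the browser pane.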
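As a rough illustration of this step, the sketch below transforms a single-item RSS document into an HTML fragment. It makes several assumptions: it uses XslCompiledTransform (the newer replacement for XslTransform) rather than the class the article names, it uses an XPathDocument as the IXPathNavigable input instead of the real RssItem class, and the style sheet and class name are made up for the example.

```csharp
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

public static class ItemFormatterSketch
{
    // Hypothetical style sheet: renders the item title as a linked heading.
    public const string Stylesheet =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" +
        "<xsl:output method='html'/>" +
        "<xsl:template match='/'>" +
        "<h2><a href='{rss/channel/item/link}'>" +
        "<xsl:value-of select='rss/channel/item/title'/></a></h2>" +
        "</xsl:template>" +
        "</xsl:stylesheet>";

    // Stand-in for RssItem: any IXPathNavigable works as transform input.
    public static IXPathNavigable ParseItem(string rssXml)
    {
        return new XPathDocument(new StringReader(rssXml));
    }

    // Applies the style sheet to the item and returns the resulting HTML.
    public static string ToHtml(IXPathNavigable item, string stylesheet)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader xslt = XmlReader.Create(new StringReader(stylesheet)))
        {
            transform.Load(xslt);
        }

        StringWriter html = new StringWriter();
        transform.Transform(item, null, html);
        return html.ToString();
    }
}
```

Calling ItemFormatterSketch.ToHtml(ItemFormatterSketch.ParseItem(rssXml), ItemFormatterSketch.Stylesheet) yields markup that could be handed to a browser control; swapping the style sheet changes the look without changing any code.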

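Because most RSS feeds do not use XHTML in their content, preferring plain text or regular HTML, such feeds have to be processed with Chris Lovett's SgmlReader class, which converts HTML content to XHTML.

Importing Feed Lists with XSLT

There are several existing XML formats for storing the list of RSS feeds a user subscribes to. They include OPML, OCS, and the format I chose for RSS Bandit in my previous article.

Although RSS Bandit works with my feed list format internally, it can also import feed lists in OPML or OCS format. If an imported feed list is not in the RSS Bandit format, it is checked to see whether it is OPML or OCS. If it is in one of those two formats, a style sheet is applied to the imported feed list to convert it to the RSS Bandit feed list format. The following style sheet converts an OCS file to my feed subscription list format: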

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.25hoursaday.com/2003/RSSBandit/feeds/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/metadata/dublin_core#"
exclude-result-prefixes="dc rdf">
  <xsl:output method="xml" indent="yes" />
  <xsl:template match="/">
    <feeds>
      <xsl:for-each select="/rdf:RDF/rdf:description/rdf:description">
        <feed>
          <title>
            <xsl:choose>
              <xsl:when test="dc:title">
                <xsl:value-of select="dc:title" />
              </xsl:when>
              <xsl:otherwise>
                <xsl:text>No title for RSS feed provided in imported OCS</xsl:text>
              </xsl:otherwise>
            </xsl:choose>
          </title>
          <link>
            <xsl:choose>
              <xsl:when test="rdf:description/@about">
                <xsl:value-of select="rdf:description/@about" />
              </xsl:when>
              <xsl:otherwise>
                <xsl:text>No URL for RSS feed provided in imported OCS</xsl:text>
              </xsl:otherwise>
            </xsl:choose>
          </link>
        </feed>
      </xsl:for-each>
    </feeds>
  </xsl:template>
</xsl:stylesheet>

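Once the imported file has been converted to the RSS Bandit feed subscription list format, it is merged with the in-memory representation of the feed subscription list that was processed at startup.

Configuration Files and W3C XML Schema

The RSS Bandit feed list format is described by a W3C XML Schema definition (XSD) file, which allows the application to use the XML serialization features of the .NET Framework to convert the XML into strongly typed objects. This provides a more natural programming model when interacting with the contents of the feed list.

The integrated search feature of RSS Bandit also has an XML configuration file format. Users can choose one or more search engines and search the Web directly from the RSS Bandit user interface. By default, the configuration file contains information about Google, Feedster, and MSN Search. The search configuration file is processed with the XmlSerializer class, which converts it into a graph of strongly typed objects and so provides a more natural programming model for working with the configuration information. The schema for the search configuration file is shown below.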

<xs:schema
targetNamespace='http://www.25hoursaday.com/2003/RSSBandit/searchConfiguration/'
 xmlns:xs='http://www.w3.org/2001/XMLSchema' elementFormDefault='qualified'
 xmlns:c='http://www.25hoursaday.com/2003/RSSBandit/searchConfiguration/'>
  <xs:element name='searchConfiguration'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='engine' minOccurs='0' maxOccurs='unbounded'>
          <xs:complexType>
            <xs:sequence>
              <xs:element name='title' type='xs:string' />
              <xs:element name='search-link' type='xs:anyURI'>
                <xs:annotation>
                  <xs:documentation>
                    This defines the base URL of the search engine.
                    The placeholder for the search expression is '[PHRASE]' without
                    the single quotes but with the brackets!
                  </xs:documentation>
                </xs:annotation>
              </xs:element>
              <xs:element name='description' type='xs:string' />
              <xs:element name='image-name' type='xs:string' />
            </xs:sequence>
            <xs:attribute name='active' type='xs:boolean' />
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name='open-newtab' type='xs:boolean' use='optional' />
    </xs:complexType>
  </xs:element>
</xs:schema>
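To make the "strongly typed objects" idea concrete, here is a minimal sketch of classes that XmlSerializer could map onto the schema above. The class and member names are hypothetical (the real RSS Bandit classes are not shown in the article), and the BuildSearchUrl helper is an assumption about how the [PHRASE] placeholder might be filled in.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("searchConfiguration", Namespace = SearchConfigurationSketch.Ns)]
public class SearchConfigurationSketch
{
    public const string Ns =
        "http://www.25hoursaday.com/2003/RSSBandit/searchConfiguration/";

    [XmlElement("engine")]
    public SearchEngineSketch[] Engines;

    [XmlAttribute("open-newtab")]
    public bool OpenNewTab;

    // Turns the XML text of a configuration file into typed objects.
    public static SearchConfigurationSketch Parse(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(SearchConfigurationSketch));
        using (StringReader reader = new StringReader(xml))
        {
            return (SearchConfigurationSketch)serializer.Deserialize(reader);
        }
    }
}

[XmlType(Namespace = SearchConfigurationSketch.Ns)]
public class SearchEngineSketch
{
    [XmlElement("title")] public string Title;
    [XmlElement("search-link")] public string SearchLink;
    [XmlElement("description")] public string Description;
    [XmlElement("image-name")] public string ImageName;
    [XmlAttribute("active")] public bool Active;

    // Hypothetical helper: substitutes the URL-escaped phrase for [PHRASE].
    public string BuildSearchUrl(string phrase)
    {
        return SearchLink.Replace("[PHRASE]", Uri.EscapeDataString(phrase));
    }
}
```

With classes like these, the rest of the application can read config.Engines[0].Title directly instead of walking XML nodes.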

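Figure 4 shows how the information in the search configuration file is used in the RSS Bandit application.

xml09152003-fig04

Figure 4. Searching the Web from RSS Bandit

Generating Well-Formed XML and Exporting OPML Files

Many news aggregators use the OPML format to store feed list information, so it is useful for RSS Bandit users to be able to export its internal feed list to an OPML file. In the initial version of RSS Bandit, I generated OPML with the following code. It worked for my few test cases, but many users ran into failures when they tried the feature.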

StringBuilder sb = new StringBuilder("<opml>\n<body>\n");

if(_feedsTable != null){

   foreach(feedsFeed f in _feedsTable.Values){
      // title and link are pasted in as-is, with no XML escaping
      sb.AppendFormat("<outline title='{0}' xmlUrl='{1}' />\n", f.title, f.link);
   }
}
sb.Append("</body>\n</opml>");

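The problem with the code above is that it builds XML by concatenating text values, which is tempting but wrong. Whenever a user had a feed whose title contained a character that is special in XML, such as an ampersand (&) or a single quote ('), RSS Bandit generated malformed XML. To solve this, I decided to use the .NET Framework class designed specifically for writing XML, the XmlWriter class. Here is the same task rewritten to use XmlWriter.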

XmlTextWriter writer = new XmlTextWriter(feedStream, System.Text.Encoding.UTF8);

writer.WriteStartElement("opml");
writer.WriteStartElement("body");

if(_feedsTable != null) {

   foreach(feedsFeed f in _feedsTable.Values) {
      writer.WriteStartElement("outline");
      writer.WriteAttributeString("title", f.title);
      writer.WriteAttributeString("xmlUrl", f.link);
      writer.WriteEndElement();
   }
}

writer.WriteEndElement(); //close <body>
writer.WriteEndElement(); //close <opml>
writer.Flush();
writer.Close();
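The effect of the switch can be checked with a small self-contained sketch. It is not the RSS Bandit code: the class name is hypothetical, the feed list is a plain dictionary of titles to URLs, and it writes to a string through XmlWriter.Create instead of to a file stream.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class OpmlExportSketch
{
    // Writes one <outline> per feed; XmlWriter escapes the attribute values.
    public static string Export(IDictionary<string, string> titleToUrl)
    {
        StringWriter output = new StringWriter();
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartElement("opml");
            writer.WriteStartElement("body");

            foreach (KeyValuePair<string, string> feed in titleToUrl)
            {
                writer.WriteStartElement("outline");
                writer.WriteAttributeString("title", feed.Key);
                writer.WriteAttributeString("xmlUrl", feed.Value);
                writer.WriteEndElement();
            }

            writer.WriteEndElement(); // close <body>
            writer.WriteEndElement(); // close <opml>
        }
        return output.ToString();
    }
}
```

A title such as "Tom & Jerry's <blog>" goes in unchanged and comes back unchanged when the result is loaded into an XmlDocument, which is exactly what the string-concatenation version could not guarantee.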

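XPath and RSS Bandit: Feed Autodiscovery

One of the main difficulties in subscribing to a Web site's RSS feed is finding out where the feed is. In August 2002, Mark Pilgrim described an algorithm for an ultra-liberal RSS locator, which consists of the following steps:

1. Given the main address of a Web site, download the home page and look for LINK elements that point to RSS feeds. If you find any, use them.

2. If the site does not support RSS autodiscovery through LINK elements, scan all the links on the home page and make intelligent guesses about which of them point to RSS feeds. Links to addresses on the same server that end in .rss, .rdf, or .xml are the prime candidates. Download each of them and check the initial content of each file to see which ones really are RSS feeds.

3. If that fails, look for links on the same server whose addresses contain rss, rdf, or xml, and check whether they are RSS files.

4. If that still fails, repeat the previous two steps in order, but widen the search to include addresses on external servers, because many weblogs use third-party services to provide RSS feeds for their Web sites. Weed out 127.0.0.1 addresses, then check whether the remaining links are RSS files.

5. If that still fails, try Syndic8. Syndic8 tracks thousands of RSS feeds across many sites and provides an XML-RPC interface for interacting with it programmatically.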
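Steps 2 and 3 of the list above are essentially a ranking of links, and can be sketched without any network access. The class and method names below are hypothetical and this is not the RssLocater implementation; it only orders same-server candidates and leaves the download-and-inspect part out.

```csharp
using System;
using System.Collections.Generic;

public static class FeedLinkHeuristicsSketch
{
    private static readonly string[] Extensions = { ".rss", ".rdf", ".xml" };
    private static readonly string[] Keywords = { "rss", "rdf", "xml" };

    // Returns candidate feed URLs on the same server, best guesses first:
    // links with a feed-like extension (step 2), then links whose address
    // merely contains a feed-like keyword (step 3).
    public static List<string> RankCandidates(Uri page, IEnumerable<string> hrefs)
    {
        List<string> byExtension = new List<string>();
        List<string> byKeyword = new List<string>();

        foreach (string href in hrefs)
        {
            Uri link;
            if (!Uri.TryCreate(page, href, out link)) continue; // skip malformed links
            if (!string.Equals(link.Host, page.Host, StringComparison.OrdinalIgnoreCase)) continue;

            string path = link.AbsolutePath.ToLowerInvariant();
            string pathAndQuery = link.PathAndQuery.ToLowerInvariant();

            if (EndsWithAny(path, Extensions)) byExtension.Add(link.AbsoluteUri);
            else if (ContainsAny(pathAndQuery, Keywords)) byKeyword.Add(link.AbsoluteUri);
        }

        byExtension.AddRange(byKeyword);
        return byExtension;
    }

    private static bool EndsWithAny(string text, string[] suffixes)
    {
        foreach (string suffix in suffixes)
            if (text.EndsWith(suffix, StringComparison.Ordinal)) return true;
        return false;
    }

    private static bool ContainsAny(string text, string[] parts)
    {
        foreach (string part in parts)
            if (text.Contains(part)) return true;
        return false;
    }
}
```

Step 4 would reuse the same ranking with the same-host check relaxed, and each candidate would still have to be downloaded and inspected before it is treated as a feed.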

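RSS Bandit implements the autodiscovery process above in the RssLocater class. The first step involves downloading the Web page and searching it for links. As an XML enthusiast, I wanted to use XPath to search the document for links, but realized this would be difficult because most sites are not written in XML-based XHTML but in legacy HTML that is incompatible with XML. This is where Chris Lovett's SgmlReader class comes in. SgmlReader can read an HTML document and expose it as an XML document, which can then be processed with the conventional XML APIs in the .NET Framework. The following code snippet shows how to use XPath to get all the LINK elements in an HTML document that reference RSS feeds: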

// GetWebPage, ConvertToAbsoluteUrl and LooksLikeRssFeed are defined elsewhere.
SgmlReader reader = new SgmlReader();
reader.InputStream = new StreamReader(GetWebPage(url));
reader.Href = url;
reader.DocType = "HTML";
XmlDocument doc = new XmlDocument();
doc.XmlResolver = null;
doc.Load(reader);

ArrayList list = new ArrayList();

//<link rel="alternate" type="application/rss+xml" title="RSS" href="url/to/rss/file">

foreach(XmlNode node in doc.SelectNodes("//*[local-name()='link' and " +
      "@type='application/rss+xml' and @title='RSS']/@href")){
   // a separate name avoids redeclaring the enclosing 'url' variable
   string feedUrl = ConvertToAbsoluteUrl(node.Value, node.BaseURI);
   if(LooksLikeRssFeed(feedUrl)){
      list.Add(feedUrl);
   }
}

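The screen shot in Figure 5 shows the dialog box that appears when the Autodiscover Feeds button in RSS Bandit is clicked while the site being viewed in the embedded Web browser is the MSDN home page.

xml09152003-fig05

Figure 5. Launching feed autodiscovery

The screen shot in Figure 6 shows the result of successfully finding RSS feeds on the site.

xml09152003-fig06

Figure 6. Feeds found

Filtering Potentially Malicious Content Using XPath

As mentioned above, HTML content in an RSS feed is converted to XHTML before it is displayed in the browser pane. Failing to strip potentially malicious elements, such as script blocks, from the HTML content of RSS feeds could lead to security problems. Because the HTML content is converted to XHTML with Chris Lovett's SgmlReader class, it is easy to use XPath and the XmlDocument class to strip unwanted tags. The following code snippet shows how to use XPath to filter potentially malicious elements and attributes out of the content of an RSS feed.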

//remove potentially malicious tags
      string badtagQuery = "//@style | //*[local-name()='script' or local-name()='object' " +
         "or local-name()='embed' or local-name()='iframe' or local-name()='meta' " +
         "or local-name()='frame' or local-name()='frameset' or local-name()='link' " +
         "or local-name()='style']";

      foreach(XmlNode badtag in doc.SelectNodes(badtagQuery)){

         XmlAttribute badattr = badtag as XmlAttribute;

         if(badattr != null){
            badattr.OwnerElement.Attributes.Remove(badattr);
         }else{
            badtag.ParentNode.RemoveChild(badtag);
         }
      }
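For readers who want to try the filtering idea in isolation, here is a self-contained sketch that works on an XHTML string. The class name is hypothetical and the query covers only a subset of the tags in the snippet above. As a cautious design choice (my assumption, not something the article's code does), the matches are copied into a list before anything is removed, so that changing the document cannot affect the iteration.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class ContentFilterSketch
{
    private const string BadNodeQuery =
        "//@style | //*[local-name()='script' or local-name()='object' or " +
        "local-name()='embed' or local-name()='iframe' or local-name()='style']";

    // Returns the markup with the matched elements and attributes removed.
    public static string Strip(string xhtml)
    {
        XmlDocument doc = new XmlDocument();
        doc.XmlResolver = null; // do not fetch external DTDs or entities
        doc.LoadXml(xhtml);

        // Take a snapshot of the matches before modifying the document.
        List<XmlNode> matches = new List<XmlNode>();
        foreach (XmlNode node in doc.SelectNodes(BadNodeQuery))
        {
            matches.Add(node);
        }

        foreach (XmlNode node in matches)
        {
            XmlAttribute attribute = node as XmlAttribute;
            if (attribute != null)
            {
                if (attribute.OwnerElement != null)
                {
                    attribute.OwnerElement.Attributes.Remove(attribute);
                }
            }
            else if (node.ParentNode != null)
            {
                node.ParentNode.RemoveChild(node);
            }
        }
        return doc.OuterXml;
    }
}
```

Note that a deny-list like this only removes what it names; it is a sketch of the technique, not a complete HTML sanitizer.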

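Posting Comments from RSS Bandit

I originally created RSS Bandit as a way to keep track of the many weblogs I visit regularly. Early on, I realized that an interesting aspect of weblogs is their conversational nature, especially the way discussions can be watched as they unfold across the Web. RSS Bandit has a number of features that try to take advantage of this conversational nature.

If a news item in a particular RSS feed refers to, or is referred to by, other news items in RSS Bandit, the relationship is shown in the user interface as a thread, in the style of threaded messages in mail and news readers. This provides a very intuitive mechanism for following discussions across the weblogs a user subscribes to, because posts that reference each other are displayed together. RSS Bandit also offers several ways to interact with the comments posted in response to a particular news item, depending on the information provided in its RSS feed. There are a number of RSS elements for providing information about the comments on a news item, including the comments element (which provides a link to a page where users can post comments on the item), the slash:comments element (which indicates the number of comments posted in response to the item), the wfw:commentRss element (which provides the location of an RSS feed of the comments on the item), and the wfw:comment element (which provides a URI that accepts RSS items sent by HTTP POST as replies to the item).

RSS Bandit supports all of the elements above that provide information about the comments on a particular news item. Figure 7 below is a screen shot of RSS Bandit that shows all four comment-related RSS elements in use.

xml09152003-fig07

Figure 7. Reading and posting comments in RSS Bandit

The mechanism for posting comments from RSS Bandit is the CommentAPI, which specifies how an application can send a response to a news item in an RSS feed by posting an RSS item to a particular URI. The following code snippet shows how RSS Bandit uses HTTP POST to send a reply to a news item in an RSS feed.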

// Requires the System.IO, System.Net, System.Text and System.Diagnostics namespaces.
public HttpStatusCode PostCommentViaCommentAPI(string url, RssItem item){

      HttpWebRequest request = (HttpWebRequest) WebRequest.Create(url);
      request.Timeout          = 1 * 60 * 1000; //one minute timeout
      request.UserAgent        = this.UserAgent;
      request.Proxy            = this.Proxy;
      request.Credentials = CredentialCache.DefaultCredentials;
      request.Method = "POST";
      request.ContentType = "text/xml";
      string comment = item.ToString(true);
      // Content-Length counts bytes, not characters, so encode the text first
      byte[] payload = Encoding.UTF8.GetBytes(comment);
      request.ContentLength = payload.Length;

      Stream requestStream = null;
      try{
         requestStream = request.GetRequestStream();
         Trace.WriteLine(comment);
         requestStream.Write(payload, 0, payload.Length);
      } catch(Exception e){

         throw new WebException(e.Message, e);
      }finally{
         if(requestStream != null){
            requestStream.Close();
         }
      }
  HttpWebResponse response = (HttpWebResponse) request.GetResponse();
  return response.StatusCode;
}

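Note that support for the CommentAPI is not yet widespread. A list of Web sites that support it is available in the CommentAPI implementations list. One notable supporter of the CommentAPI is the .Text blog engine, which is used by both Weblogs @ ASP.NET and Weblogs @ DotNetJunkies.com.

RSS Bandit has a virtual folder that stores all the comments posted through the CommentAPI, as shown in Figure 8.

xml09152003-fig08

Figure 8. The Sent Items folder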

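Plug-in Architecture

In a recent MSDN TV episode titled Passing XML Data Inside the CLR, Don Box described the various mechanisms for passing XML between .NET Framework applications. Many types can be chosen to represent an XML document in the .NET Framework, such as instances of the String, IXPathNavigable, XmlDocument, and XmlReader classes, each with its own advantages and disadvantages. Recently, Simon Fell proposed the IBlogExtension interface as a common mechanism for news aggregators built on the .NET Framework to share information with plug-ins, and chose the IXPathNavigable interface as the way to pass XML (specifically, RSS items) between the aggregator and its plug-ins.

RSS Bandit supports the IBlogExtension interface and thus allows developers to build plug-ins that integrate with it. The screen shot in Figure 9 shows the integration with the w.bloggar plug-in (written by Luke Hutteman and available at http://www.sharpreader.net/wBloggarPlugin.zip), which lets users post about a particular news item to their weblog using the popular w.bloggar weblog editor.

xml09152003-fig09

Figure 9. Posting to a weblog from RSS Bandit


Future Plans for RSS Bandit

RSS Bandit has come a long way since my previous article, mainly thanks to the tireless efforts of Torsten Rendelmann and myself and the help of many people in the RSS Bandit workspace. I plan to keep developing RSS Bandit with members of the .NET developer community, and to keep using it both as a way to show off the capabilities of the .NET Framework and as a platform for building rich client applications that harness the power of XML.

There are several features I would like to see in the next version of RSS Bandit, such as automatic updates using the Updater Application Block, a newspaper-like view of selected news items, and the ability to edit a user's weblog directly from RSS Bandit. If you would like to help us add these or other features to RSS Bandit, you are welcome to join the workspace and pitch in. Your help is always appreciated.

Dare Obasanjo is a member of Microsoft's WebData team, which develops components within the System.Xml and System.Data namespaces of the .NET Framework, Microsoft XML Core Services (MSXML), and Microsoft Data Access Components (MDAC).

Please post any questions or comments about this article to the Extreme XML message board on GotDotNet.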


December 23, 2004

Introduction

Recently, I wrote an article on developing a master-page framework in ASP.NET. The Master Page framework was an extension of the ASP.NET UI.Page framework, and most of it concerned the static appearance of a page. I then started receiving emails about the dynamic behavior of pages: controlling the interactions between user controls on a page, communication between different pages, and so on. So I thought this aspect of the .NET Framework deserved a discussion of its own, and that would not be possible without mentioning the Model-View-Controller design pattern, which is embedded throughout the ASP.NET Framework. This article has two parts. In the first part, we discuss the Model-View-Controller design pattern in the context of ASP.NET in detail, using UML, and finish with an example application as a bonus. In the next part, we'll wrap up with the dynamic aspects of pages by explaining different types of fine-grained controllers, along with their design and implementation, and we'll also look at the details of the HTTP pipeline and the best approach to using it. So stay tuned, there is a lot to come :) .

Model-View-Controller Design Pattern:

This is the classic and most famous of all design patterns, especially when we talk about user interfaces. A user interface is basically a presentation layer over some sort of model data. If the data changes, the view is updated accordingly, and vice versa. If you look closely at this scenario, you can see that two objects are involved in performing the task: a View and a Model. There may be more than one view of a data model. For example, one view could be a grid presentation and another could be a graphical plot, such as a bar chart or pie chart. Here is how it would look:

image001.jpg

I know you will have questions by now: how does the view object talk to the model object and vice versa, how do we want to look at the model, and which part of the model goes to which view, and how? This is where one more object comes in, the controller, which acts as the glue between model and view. The controller is basically a business object. It answers how we want to view the data, which rules apply to its presentation, what happens if someone changes some parameters in the view, and how to react to keyboard or mouse events in an interactive application. Sometimes it also needs to update the model and tell other views that the data has changed, so that they can post back a request to update their own contents. You will get all these answers shortly!

Robustness Analysis:

Model-View-Controller is best described visually, using robustness analysis, which was first introduced by Ivar Jacobson in his award-winning book on Object-Oriented Software Engineering (see reference below) and further explained by Doug Rosenberg et al. in the book Use Case Driven Object Modeling with UML. Robustness analysis involves analyzing the narrative text of use cases, identifying a first-guess set of objects that will participate in those use cases, and classifying these objects by the roles they play; it helps you partition objects within a Model-View-Controller paradigm. Robustness analysis enables the ongoing discovery of objects and helps you ensure that you have identified most of the major domain classes before starting any additional design or development. Ivar has classified them as:

  1. «entity» objects depict long-lived objects and deal mostly with persisted state.
  2. «boundary» objects depict the communication links between the system and its environment.
  3. «control» objects depict use-case-specific behavior.

Entity objects are no more than the information or data your boundary objects are looking for. They might be database tables, Excel files, or “transient” session or cached data.

Boundary objects are the objects with which the actors (for instance, the users) will be communicating in your software system. These might be any windows, screens, dialogs and menus, or other UIs in your system. You can easily identify them while analyzing your use cases.

Control objects (controllers) are the business objects or your business web services. This is where you capture the business rules that filter the data to be presented to the user who requested it. So, the controllers actually control the business's needs, or the business itself.

Here are the visual icons of these three UML stereotypes:

image003.jpg

Rules of the game:

Here are the four basic rules you should enforce when extracting these objects while analyzing your use cases and making interaction diagrams among these objects:

  • Actors can only talk to the boundary objects.
  • Boundary objects can only talk to controllers and actors.
  • Entity objects can only talk to controllers.
  • Controllers can talk to boundary objects and entity objects, and to other controllers, but not to actors.

image005.jpg

If you look at the figure above, you can see that an actor's only access point to any object is through the Boundary objects. Boundary objects can’t talk to each other. Control objects can talk freely to all of the other objects, except that they have no access to the actor, the user of the system. Entity objects can’t talk to each other directly either; they can reach each other only through a Control object. Likewise, an Entity object can reach a Boundary object only through a Control object. So the Control object really is the medium of communication between the layers, and that’s why it is the right place for business rules.
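The four rules are simple enough to write down as a lookup, which can be handy when checking an interaction diagram by hand. The enum and method names below are my own illustration, not part of any library or of the sample application.

```csharp
public enum Stereotype { Actor, Boundary, Control, Entity }

public static class RobustnessRulesSketch
{
    // Returns true when an object of stereotype 'from' may talk
    // directly to an object of stereotype 'to'.
    public static bool CanTalk(Stereotype from, Stereotype to)
    {
        switch (from)
        {
            case Stereotype.Actor:
                return to == Stereotype.Boundary;
            case Stereotype.Boundary:
                return to == Stereotype.Control || to == Stereotype.Actor;
            case Stereotype.Entity:
                return to == Stereotype.Control;
            case Stereotype.Control:
                // boundary, entity, or another controller, but never an actor
                return to != Stereotype.Actor;
            default:
                return false;
        }
    }
}
```

For example, CanTalk(Stereotype.Boundary, Stereotype.Entity) is false, which is the rule that forces every view-to-model interaction to pass through a controller.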

Robustness Analysis and Model-View-Controller:

Now that we have all the rules in place, let’s see how these objects relate to the Model-View-Controller paradigm. The good news is that the objects derived from this analysis map one-to-one onto MVC:

  • Entity object maps to Model object,
  • Boundary object maps to View object, and
  • Controller is the same in both.

The other good news is that the same rules apply to these objects too. This means that when we do Robustness Analysis, we can use Model-View-Controller objects in place of Entity-Boundary-Controller objects. Here is how our objects would look now:

image007.jpg

And here is a simplified, hypothetical sequence diagram for MVC. In this diagram, a web user initiates a query, which generates an event that is handled by the controller. The controller gets the information it needs from the model, validates it, and passes the result set back to the view.

image009.jpg

Model-View-Controller is an approach to separating layers so that a moderately to very complex application stays easy to maintain. It is best suited to web applications, where the business rules change frequently over time. It is also well suited to a service-oriented architecture, where you encapsulate your business rules in a web service that is separate from the Data Access Layer, which deals only with model/data-related matters. Since the user interfaces are also separate, you can build powerful custom controls and put them in user controls for deployment as pagelets. Finally, all that is left is to build proper interactions between them, enforcing the rules mentioned above, while designing any solution.

Examples:

So, let’s see where we stand: we now know what these model-view-controller objects are, what they look like, and which rules to enforce when we analyze a particular use case scenario and draw interaction diagrams from it. Let’s take an example scenario and explain it using this technique!

Use case (UC.1.001):

User shall be able to logon to the web-application.

Here is the UML diagram for this use case. It shows the relationship between an actor (the web user) and the logon use case, which we will elaborate further in the analysis phase. I am taking the very simple case of login and deferring the registration use case, to keep the example simple and understandable at the first go.

image011.jpg

Before going into any analysis, it is better (I would say it is a must) to have the visual prototype in front of you. Here is how our prototype would look:

image013.jpg

This means the user is presented with a logon screen, where he can enter his identity and press the Login button to access the system. The user can also tick the check box to be remembered; in that case, cookies need to be enabled on his system. Also, when he logs in, a new session is created on the server side to keep track of his activities while he uses the web application. With all this in mind, let’s elaborate the scenario using Robustness Analysis diagrams:

image015.jpg

Let’s take a closer look at the diagram above. The user interacts with the Logon Screen (the view), which has a submit button for login (a has-a relationship is shown in UML by a line with a diamond at one end) and a checkbox for enabling cookie storage. When the user clicks either of these two controls, the request goes to the MainController object, which has all the logic for handling those events and decides what to do with them. It then performs validation through the security control object, which in turn takes its data from the Data Access object. MainController then creates a session for the user and also stores the user’s info, such as the encrypted password, in the cookie store. With this analysis, it is now clear that, apart from the logon screen, four other objects are involved in this scenario: the MainController, the SessionState object, the CookieStore object, and the Database Access object. This gives you an idea of which objects are required to do a particular job, but it does not tell you what functionality is needed to accomplish the task. For that we need a sequence diagram, which looks like this:

image017.jpg

After careful study of the sequence diagram above, we can safely classify our objects: there should be a MainController that takes care of the business logic for the page, and a BusinessService object that takes care of validation, session management, and cookie setup. There is also a DataAccessGateWay (model) object that encapsulates all the details of talking to the back-end database. Here is how the class diagram looks:

image019.jpg

Isn’t it beautiful that all of this robustness analysis leads us to a design where we can define our first set of detailed class diagrams? I know you’ll love it. :)

The most important of all the objects above are the business objects, that is, the controller objects. Now, let’s take a closer look at how the ASP.NET framework helps us define these Controller classes.

ASP.NET and Model-View-Controller (MVC)

The basic idea of MVC is to separate the presentation layer from the business layer and from the Data Access Layer, or the model. Let’s see how this is done in ASP.NET.

image021.jpg

Let’s take a closer look at this picture. The Controller is bound to some model and updates the View when the data changes, or vice versa, so the view depends on the model passively. We achieve this with the code-behind feature of ASP.NET: the logic of the page or user control is kept separate from the view, which is defined in “*.aspx” or “*.ascx” files. You try to make the controls you add to the page controller, i.e. System.Web.UI.Page, data-bound as much as possible, for consistency and synchronization. You also embed your event handlers' logic in the page controller using code-behind. And if your architecture is service-oriented, you can introduce another controller as a business service (perhaps a security service) that talks to the back-end model as needed and returns secured data to the Page Controller. Here is how it would look:

image023.jpg

It’s Showtime (bonus++):

As a bonus, we will develop a full-fledged master-details web application that shows the inner workings of model-view-controller (MVC) in enough detail, as well as a way for different user controls to communicate without knowing about each other. With it you also get a reusable library made up of the classes OleDataLinkAdapter and MsSqlDataLinkAdapter, which are wrappers around the ADO.NET Connection and Command objects.

Master-Details ASP.NET Application

The application is a web application with a master page composed of a company logo, a banner, the main contents, and a footer. The main contents display a parent-child relationship in the form of grid views. There are two grids: the top one is the master grid and the bottom one is the details grid. When the user clicks an item in the master view, the details (child) view changes accordingly and displays the related child data.

Use-Case001: User shall be able to view/manipulate the master-details for Northwind Customer/Orders tables.

image025.png

We will apply the same technique as in my previous article, Pattern Oriented Architecture and Design (POAD). For this, we need a prototype of the system we are planning to design. Here is how the prototype would look:

image027.png

So, what we see is what we get: our application is composed of a MasterPage, an HTML frame container, and an HTML table, which in turn contains the two controls, Master-User-Control and Details-User-Control. These user controls contain the MasterDataGrid and DetailsDataGrid views, respectively. Here is the static inventory of the objects needed to do this job:

image029.png

Robustness Analysis of the Use-Case001

We now have some understanding of the objects needed to do this job, but before going into the detailed design, we need to analyze the use cases to find the proper interactions and relationships among them. We do this using Robustness Analysis, keeping the MVC concept in mind. Let’s draw the conceptual Robustness Diagram for this scenario:

image031.png

This means the user is presented with a master-details view. The upper master grid view is used to select items from the master table, and whenever the user selects a master item, the child view is updated accordingly. Let’s see how we achieve these interactions between objects. Look closely at the diagram above and you will see that when the user selects an item in the master grid view, the MasterUserControl is notified, and it in turn notifies its listener, the MainController object. The MainController object then passes the event on to the other listener, the DetailsUserControl.

These user controls have direct access to the Model (DataAccessGateway), which uses MsSqlDataLinkAdapter to establish MS-SQL database connections (for OLEDB connections, use the other adapter, OleDataLinkAdapter). This is a classic example of MVC and its inner workings. Here is the object interaction diagram for this scenario:

image033.png

The MasterController and the DetailsController are differential controllers: each is responsible for controlling and fine-tuning one user control, while the MainController object acts as an integrator, a channel through which these two controllers pass information back and forth. Let’s see how we plan for them to work. A notifying object needs to implement the INotifier interface, while a listener object needs to implement the IListner interface. A listener object also needs to attach itself to the relevant notifier in order to be notified. In our case, the Master-Controller is the notifier and the Details-Controller is the listener. MainController is both listener and notifier, and acts as the glue between them.
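To make the notifier/listener wiring concrete, here is a minimal sketch. Only the interface names INotifier and IListner come from the article; their members (Attach, NotifyListeners, OnNotify), the NotifierBase helper, and the three controller classes are assumptions made up for illustration, and the real sample code may look quite different.

```csharp
using System.Collections.Generic;

// Member names are hypothetical; only the interface names come from the article.
public interface IListner
{
    void OnNotify(object sender, string selectedKey);
}

public interface INotifier
{
    void Attach(IListner listener);
    void NotifyListeners(string selectedKey);
}

// Shared bookkeeping for anything that notifies listeners.
public class NotifierBase : INotifier
{
    private readonly List<IListner> listeners = new List<IListner>();

    public void Attach(IListner listener)
    {
        listeners.Add(listener);
    }

    public void NotifyListeners(string selectedKey)
    {
        foreach (IListner listener in listeners)
        {
            listener.OnNotify(this, selectedKey);
        }
    }
}

// Notifier: raises an event when a master row is selected.
public class MasterControllerSketch : NotifierBase
{
    public void SelectRow(string customerId)
    {
        NotifyListeners(customerId);
    }
}

// Both listener and notifier: relays what it hears to its own listeners.
public class MainControllerSketch : NotifierBase, IListner
{
    public void OnNotify(object sender, string selectedKey)
    {
        NotifyListeners(selectedKey);
    }
}

// Listener: remembers which customer's details it should show.
public class DetailsControllerSketch : IListner
{
    public string CurrentCustomerId;

    public void OnNotify(object sender, string selectedKey)
    {
        CurrentCustomerId = selectedKey;
    }
}
```

Wiring it up takes two Attach calls: the main controller attaches itself to the master controller, and the details controller attaches itself to the main controller. The master and details controllers never reference each other, which is the point of routing the event through the mediator.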

At this point, we have all the information about the objects responsible for doing the job: what they look like, how they behave, and what they do. Let’s now design them and develop the complete class design. Here is the detailed class diagram for this application:

image035.png

Implementation using C# and ASP.NET

The code hierarchy is shown in the following figure. You will find the DataAccess adapters in Shams.Data.dll and the updated MasterPage Framework library in Shams.Web.UI.MasterPages.dll. I have also collected some libraries from different sources on the web, which you’ll find in Shams.Web.UI.WebControls.dll. The sample application is in the Shams.MVC.WebApp namespace. All you need to do is open the Shams.MVC.WebApp solution and you are ready to go. Also verify that the virtual path given in the file “Shams.MVC.WebApp.csproj.webinfo” has been created in IIS. Here is the entry for it:

<VisualStudioUNCWeb>

<Web URLPath = "http://localhost/shams/MVC/WebApp/Shams.MVC.WebApp.csproj" />

</VisualStudioUNCWeb>

image037.png

Here are some code snippets from the application, starting with web.config. The appSettings section of the web.config file contains two things for this application: the connection strings for the Northwind database, and the path to the MasterPageUserControl.

Here is how it looks:

<appSettings>
  <add key="MasterPageUserControl" value="MasterPageUserControl.ascx"/>
  <add key="ConnectString001"
       value="server=localhost;database=northwind;Integrated Security=true;"/>
  <add key="ConnectString002"
       value="Provider=SQLOLEDB;server=localhost;database=northwind;
                                        Integrated Security=true;"/>
</appSettings>

The model is the DataAccessGateway class, which returns the DataSet used by the master-details data grid controls. Here is how it looks:

image039.png

It uses the MsSqlDataLinkAdapter that is a wrapper around ADO.NET connection and command objects. This is a singleton class, with all of its functions being static. Here is the skeleton of this class:

image041.png

In our particular scenario, DataAccessGateway uses this class, to connect to the MS-SQL NorthWind database, by calling the Connect(.) function of it. Once you have this connection object, you call another function FilldataSet(.) to return you a brand-new dataset being used for data-binding. Once it is done, you call Disconnect(.) to close the connection object.

Other notable classes are the controller classes, UcMasterDetails (Main-Controller) and two other classes, UcMainContentsMaster (Master Controller) and UcMainContentsDetails (Details Controller). Here is how we connected these two classes with each other using UcMasterDetails as a mediator.

image043.png

And finally! Here is how the application would look like when you run it. :)

image045.png

The MVC pattern I used in the application is the minimal use of it, but it’s the best pattern being used in all the navigations between pages and between controls. It also depends on how you plot a plan for your application in making a prototype application. So folks, that’s it for now, I have tried to un-cover everything about the MVC design pattern, with some useful applications for its implementation. This is one of the most commonly used and misunderstood pattern. I tried to make it simple and give you all the details it carries, along with the robustness analysis, that’s the key in analyzing such applications.

Improvement:

There is always a point of improvement in any system, and the same here is applicable too. I haven’t talked about any efficiency related stuff, because that itself needs a full article. So, I am putting this responsibility on you to go through the documentation for improving efficiency of ASP.NET applications, like Caching, View-State Management and Session Management, and How to avoid round-trips to the server, like post-backs etc. And I would appreciate if you involve me in your findings, thank you very much.

Question-Answers Session:

First, is the funny one ;)

  • Q) Say (God forbid) you are a color blind that you can’t see green color or so, and she is with green outfit, what will you see…?
  • A) Nothing at all or you’ll see what you want to see (use your imaginations, apply abstractions) J.

    Now the serious one!

    • Q) What do you consider when architecting and implementing a new system being successful?
    • A) Your vision should be broad when you architect or design it, and should be focused while implementing it.

    They were my answers, but you are welcome to send me your answers at shams@microteck.net, and I will post them on my web site very soon.

    Bibliography:

    1. Object Oriented Software Engineering by Ivar Jacobson.
    2. Design Patterns by Erich Gamma, et al.
    3. Patterns Of Enterprise Architecture by Martin Fowler.
    4. Use Case Driven Object Modeling with UML by Rosenberg et al.
    5. UML Case Tool: Enterprise Architect from www.sparxsystems.com.


    About Shams Mukhtar


    Lead Architect with 15++ years of software design and development experience. Architected and designed many industrial softwares and passed through full software development life-cycle. Strong hold in Object-Oriented software engineering using UML with Design Patterns, C++/VC++, C# and Java. Domain expertise are in Distributed Computing, Messaging Systems, Multi-threading, Component developments, Computer Graphics, Embedded Systems, GIS development, framework development, User-Interface designs, Chemical Engineering and Process Controls. Having both Bachelors and Masters degrees in Engineering with certifications in Obect Oriented Analysis and design

    Click here to view Shams Mukhtar’s online profile.

Now the serious one!

  • Q) What do you consider when architecting and implementing a new system being successful?
  • A) Your vision should be broad when you architect or design it, and should be focused while implementing it.

They were my answers, but you are welcome to send me your answers at shams@microteck.net, and I will post them on my web site very soon.

Bibliography:

  1. Object Oriented Software Engineering by Ivar Jacobson.
  2. Design Patterns by Erich Gamma, et al.
  3. Patterns Of Enterprise Architecture by Martin Fowler.
  4. Use Case Driven Object Modeling with UML by Rosenberg et al.
  5. UML Case Tool: Enterprise Architect from www.sparxsystems.com.


About Shams Mukhtar


Lead Architect with 15++ years of software design and development experience. Architected and designed many industrial softwares and passed through full software development life-cycle. Strong hold in Object-Oriented software engineering using UML with Design Patterns, C++/VC++, C# and Java. Domain expertise are in Distributed Computing, Messaging Systems, Multi-threading, Component developments, Computer Graphics, Embedded Systems, GIS development, framework development, User-Interface designs, Chemical Engineering and Process Controls. Having both Bachelors and Masters degrees in Engineering with certifications in Obect Oriented Analysis and design

Click here to view Shams Mukhtar’s online profile.

The most important of the objects above are the business objects, that is, the controller objects. Let’s now take a closer look at how the ASP.NET framework helps us define these Controller classes.

ASP.NET and Model-View-Controller (MVC)

The basic idea of MVC is to separate the presentation layer from the business layer and from the data access layer, or the model. Let’s see how this is done in ASP.NET.

image021.jpg

Let’s take a closer look at this picture. The Controller is bound to a model and updates the View when the data changes, or vice versa, so the View depends on the model passively. We achieve this with the code-behind feature of ASP.NET: the page or user-control logic is kept separate from the view, which is presented in “*.aspx” or “*.ascx” files. The controls you add to the page controller (that is, System.Web.UI.Page) should be data-bound as much as possible for consistency and synchronization, and your event-handling logic also goes into the page controller through code-behind. If your architecture is service-oriented, you can introduce another controller as a business service (a security service, for example) that talks to the back-end model as needed and returns secured data to the Page Controller. Here is how it would look:

image023.jpg

It’s Showtime (bonus++):

As a bonus, we will develop a full-fledged master-details web application that shows model-view-controller (MVC) and its inner workings in detail, as well as a way for different user controls to communicate without knowing about each other. With it you also get a reusable library composed of the classes OleDataLinkAdapter and MsSqlDataLinkAdapter, which are wrappers around ADO.NET Connection and Command objects.

Master-Details ASP.NET Application

The application is a web application with a master page composed of a company logo, a banner, the main contents, and a footer. The main contents display a parent-child relationship in the form of grid views. There are two grids: the top one is the master grid and the bottom one is the details grid. When the user clicks any item in the master view, the details (child) view changes accordingly and displays the related child data.

Use-Case001: User shall be able to view/manipulate the master-details for Northwind Customer/Orders tables.

image025.png

We will apply the same technique as in my previous article, Pattern Oriented Architecture and Design (POAD). For this, we need a prototype of the system we are planning to design. Here is how the prototype would look:

image027.png

So, what we see is what we get: our application is composed of a MasterPage, an HTML frame container, and an HTML table that in turn contains two controls, the Master-User-Control and the Details-User-Control. These user controls contain the MasterDataGrid and DetailsDataGrid views. Here is the static inventory of the objects needed to do this job:

image029.png

Robustness Analysis of the Use-Case001

We now have some understanding of the objects needed for this job, but before going into the detailed design we need to analyze the use cases to find the proper interactions and relationships among those objects. We do this with Robustness Analysis, keeping the MVC concept in mind. Let’s draw the conceptual Robustness Diagram for this scenario:

image031.png

This means that the user is presented with a master-details view. The upper master grid view is used to select items from the master table, and whenever the user selects a master item, the child view is updated accordingly. Let’s see how we achieve these interactions between objects. Look closely at the diagram above and you will see that when the user selects an item in the master grid view, the MasterUserControl is notified, and it in turn notifies its listener, the MainController object. The MainController object then notifies the other listener, the DetailsUserControl, of the event.

These user controls have direct access to the Model (DataAccessGateway), which uses MsSqlDataLinkAdapter to establish MS-SQL database connections (for OLEDB connections, use the other adapter, OleDataLinkAdapter). This is a classic example of MVC and its inner workings. Here is the object interaction diagram for this scenario:

image033.png

The MasterController and the DetailsController are differential controllers: each is responsible for controlling and fine-tuning one user control, while the MainController object acts as an integrator, or a channel through which these two controllers pass information to each other. Here is how we plan for them to work. A notifying object needs to implement the INotifier interface, while a listener object needs to implement the IListner interface. A listener object also needs to attach itself to the relevant notifier in order to be notified. In our case, the Master-Controller is the notifier and the Details-Controller is the listener. The MainController is both listener and notifier, and acts as the glue between them.
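The notifier/listener wiring described above can be sketched in a few lines of C#. This is a minimal sketch rather than the article's actual code: the interface names INotifier and IListner come from the text (including the spelling of IListner), but the members Attach, Notify and OnNotify, the NotifierBase helper, the SelectionChanged payload and the sample customer ID are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event payload: the key of the row selected in the master grid.
public sealed class SelectionChanged
{
    public SelectionChanged(string customerId) { CustomerId = customerId; }
    public string CustomerId { get; }
}

// Interface names follow the article; the member signatures are assumptions.
public interface IListner
{
    void OnNotify(object sender, SelectionChanged change);
}

public interface INotifier
{
    void Attach(IListner listener);
    void Notify(SelectionChanged change);
}

// Shared notifier plumbing: stores attached listeners and fans events out to them.
public class NotifierBase : INotifier
{
    private readonly List<IListner> listeners = new List<IListner>();

    public void Attach(IListner listener)
    {
        if (listener == null) throw new ArgumentNullException(nameof(listener));
        listeners.Add(listener);
    }

    public void Notify(SelectionChanged change)
    {
        foreach (IListner listener in listeners)
        {
            listener.OnNotify(this, change);
        }
    }
}

// Master controller: notifier only. It raises an event when a row is selected.
public class MasterController : NotifierBase
{
    public void SelectCustomer(string customerId)
    {
        Notify(new SelectionChanged(customerId));
    }
}

// Main controller: listens to the master and forwards the event to its own listeners.
public class MainController : NotifierBase, IListner
{
    public void OnNotify(object sender, SelectionChanged change)
    {
        Notify(change);
    }
}

// Details controller: listener only. It records which customer's orders to show.
public class DetailsController : IListner
{
    public string CurrentCustomerId { get; private set; }

    public void OnNotify(object sender, SelectionChanged change)
    {
        CurrentCustomerId = change.CustomerId;
    }
}
```

To wire this up, attach the main controller to the master and the details controller to the main controller. A call such as master.SelectCustomer("ALFKI") then leaves details.CurrentCustomerId set to "ALFKI", even though the master and details controllers never reference each other.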

At this point we have all the information about the objects responsible for doing the job: what they look like, and their behavior and functionality. Let’s now design them and develop the complete class design. Here is the detailed class diagram for this application:

image035.png

Implementation using C# and ASP.NET

The code hierarchy is shown in the following figure. You will find the DataAccess adapters in Shams.Data.dll and the updated MasterPage Framework library in Shams.Web.UI.MasterPages.dll; some libraries I collected from different sources on the web are under Shams.Web.UI.WebControls.dll. The sample application is in the Shams.MVC.WebApp namespace. All you need to do is open the Shams.MVC.WebApp solution and you are ready to go. Also verify that the virtual path given in the file “Shams.MVC.WebApp.csproj.webinfo” has been created in IIS. Here are the contents of that file:

<VisualStudioUNCWeb>

<Web URLPath = "http://localhost/shams/MVC/WebApp/Shams.MVC.WebApp.csproj" />

</VisualStudioUNCWeb>

image037.png

Here are some code snippets from the application, starting with web.config. The appSettings section of the web.config file contains two things for this application: the connection strings for the Northwind database and the path to the MasterPageUserControl.

Here is how it looks:

<appSettings>
  <add key="MasterPageUserControl" value="MasterPageUserControl.ascx"/>
  <add key="ConnectString001"
       value="server=localhost;database=northwind;Integrated Security=true;"/>
  <add key="ConnectString002"
       value="Provider=SQLOLEDB;server=localhost;database=northwind;
                                        Integrated Security=true;"/>
</appSettings>

The model is the DataAccessGateway class; it returns the DataSet that is used in the master-details data grid controls. Here is how it looks:

image039.png

It uses MsSqlDataLinkAdapter, which is a wrapper around ADO.NET connection and command objects. This class acts as a singleton, with all of its functions being static. Here is the skeleton of this class:

image041.png

In our scenario, DataAccessGateway uses this class to connect to the MS-SQL Northwind database by calling its Connect(.) function. Once you have the connection object, you call another function, FilldataSet(.), which returns a brand-new dataset to be used for data binding. Once that is done, you call Disconnect(.) to close the connection object.
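The connect, fill, disconnect sequence can be sketched as follows. The real signatures of MsSqlDataLinkAdapter are only shown in the figure, so everything here is an assumption: the IDataLinkAdapter interface, the instance (rather than static) methods, the GetTable method, and the InMemoryAdapter stand-in that lets the sketch run without a database. The one point it is meant to show is that the gateway should disconnect even when filling the dataset fails.

```csharp
using System;
using System.Data;

// Hypothetical abstraction over the article's adapter classes.
public interface IDataLinkAdapter
{
    void Connect(string connectionString);
    DataSet FillDataSet(string tableName, string query);
    void Disconnect();
}

public class DataAccessGateway
{
    private readonly IDataLinkAdapter adapter;
    private readonly string connectionString;

    public DataAccessGateway(IDataLinkAdapter adapter, string connectionString)
    {
        this.adapter = adapter;
        this.connectionString = connectionString;
    }

    // Connect, fill a fresh DataSet, and always disconnect, even if the fill throws.
    public DataSet GetTable(string tableName, string query)
    {
        adapter.Connect(connectionString);
        try
        {
            return adapter.FillDataSet(tableName, query);
        }
        finally
        {
            adapter.Disconnect();
        }
    }
}

// In-memory stand-in (not part of the article) that returns one sample row.
public class InMemoryAdapter : IDataLinkAdapter
{
    public bool IsConnected { get; private set; }

    public void Connect(string connectionString)
    {
        IsConnected = true;
    }

    public DataSet FillDataSet(string tableName, string query)
    {
        if (!IsConnected)
        {
            throw new InvalidOperationException("Connect must be called first.");
        }

        var table = new DataTable(tableName);
        table.Columns.Add("CustomerID", typeof(string));
        table.Rows.Add("ALFKI");

        var dataSet = new DataSet();
        dataSet.Tables.Add(table);
        return dataSet;
    }

    public void Disconnect()
    {
        IsConnected = false;
    }
}
```

A production adapter would implement the same three calls on top of an ADO.NET connection and data adapter; the try/finally in GetTable stays the same either way.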

Other notable classes are the controller classes: UcMasterDetails (Main-Controller) and two others, UcMainContentsMaster (Master Controller) and UcMainContentsDetails (Details Controller). Here is how we connected the latter two classes with each other, using UcMasterDetails as a mediator.

image043.png

And finally! Here is how the application looks when you run it. :)

image045.png

The MVC pattern as used in this application is a minimal use of it, but it is the pattern best suited to all the navigation between pages and between controls. It also depends on how you plan your application when making a prototype. So folks, that’s it for now. I have tried to uncover everything about the MVC design pattern, with some useful applications of it. It is one of the most commonly used and most misunderstood patterns. I tried to keep it simple and give you all the details it carries, along with robustness analysis, which is the key to analyzing such applications.

Improvement:

There is always room for improvement in any system, and the same applies here. I haven’t talked about efficiency, because that needs a full article of its own. So I am leaving it to you to go through the documentation on improving the efficiency of ASP.NET applications: caching, view-state management, session management, and how to avoid round trips to the server, such as post-backs. I would appreciate it if you shared your findings with me. Thank you very much.

Question-Answers Session:

First, the funny one ;)

  • Q) Say (God forbid) you are color blind and can’t see the color green, and she is wearing a green outfit. What will you see…?
  • A) Nothing at all, or you’ll see what you want to see (use your imagination, apply abstractions) :)

Now the serious one!

  • Q) What do you consider essential to successfully architecting and implementing a new system?
  • A) Your vision should be broad when you architect or design it, and focused while you implement it.

Those were my answers, but you are welcome to send me yours at shams@microteck.net, and I will post them on my web site very soon.

Bibliography:

  1. Object Oriented Software Engineering by Ivar Jacobson.
  2. Design Patterns by Erich Gamma, et al.
  3. Patterns of Enterprise Application Architecture by Martin Fowler.
  4. Use Case Driven Object Modeling with UML by Rosenberg et al.
  5. UML Case Tool: Enterprise Architect from www.sparxsystems.com.

About Shams Mukhtar


Lead Architect with 15++ years of software design and development experience. Has architected and designed many industrial software systems and been through the full software development life cycle. Strong in object-oriented software engineering using UML with design patterns, C++/VC++, C# and Java. Domain expertise in distributed computing, messaging systems, multi-threading, component development, computer graphics, embedded systems, GIS development, framework development, user-interface design, chemical engineering and process control. Holds both Bachelor’s and Master’s degrees in Engineering, with certifications in Object Oriented Analysis and Design.

December 21, 2004
Background and How Sessions Are Implemented

ASP.NET provides a framework for storing data that is specific to an individual user with the Session object. A page can add information to the Session object, and any other page can then retrieve the information for the same user. In order to preserve server memory, ASP.NET implements a rolling timeout mechanism which discards the session information for a user if no request is seen within the timeout period (the default is 20 minutes, and the timer is reset with each request).

It is often useful in an ASP.NET site to know, for a particular request, whether the user’s session information is still intact (that is, that a timeout has not occurred). One common need is to inform users why they lost their session information, by redirecting to a page that describes the timeout period and how to avoid the problem in the future.  Without this technique it is difficult to know, when a session variable is not present, whether it was never set properly or the user waited too long between requests.  Many ASP.NET developers reference session variables without first ensuring they are actually present.  This causes the infamous “Object reference not set” exception, which can be very difficult to trace back to the specific cause.  Code that checks for null session values is useful, but does not tell the developer whether the value was never set or the user simply lost her session.  This technique clearly identifies the case where the user waited too long between requests and the session information was removed.

This is not the same as using the Session_OnEnd event, which can be used for cleanup, logging, or other purposes.  It is also not a means of enforcing security on a web site.

How Sessions Are Implemented

Since the HTTP protocol used by web browsers to request files from web servers is stateless, ASP.NET needs to determine which requests were from the same user. The primary mechanism utilizes a non-persistent cookie that is issued by the web server that contains a session id value. The id provided by this cookie is the key used to index into the session infrastructure to access the user’s specific data. The session framework is implemented by the HTTP module System.Web.SessionState.SessionStateModule, which executes before the .aspx page events. The module uses the EnableSessionState attribute from the @Page directive to determine if it must retrieve the user’s session information (and whether it needs to write out changes when the request is complete). If the EnableSessionState attribute is true (which it is by default), the module retrieves all of the user’s session information and sets the Session property of the Page class to an instance of the HttpSessionState class. This article focuses on the cookie mechanism, although a cookie-less method of sessions is implemented in ASP.NET (the session id is embedded in the URL string). The Session information can be stored in-process (default, stores in web server memory), with a state service, or a SQL Server database. This article will focus on the in-process storage, but the technique applies to all three locations.

Example User Session

A user opens a browser instance and requests an ASP.NET page from a site. If the EnableSessionState attribute is true, the session module adds the ASP.NET_SessionId cookie to the response. On subsequent requests to the same web site, the browser supplies the ASP.NET_SessionId cookie which the server side module uses to access the proper user’s information.

Detecting Timeouts

The ASP.NET HttpSessionState class provides a useful IsNewSession property that returns true if a new session was created for this request.  The key to detecting a session timeout is to also look for the ASP.NET_SessionId cookie in the request.  If this is a new session but the cookie is present, the earlier session has timed out.  In order to implement this effectively for an entire web site, it is useful to utilize the “Base Page” concept described in a previous article.

basePageSessionExpire.cs

using System;

public class basePageSessionExpire : System.Web.UI.Page
{
    protected override void OnInit(EventArgs e)
    {
        base.OnInit(e);

        // In testing, the Request and Response appear to share one cookie
        // collection: a cookie set on the Response is immediately visible in
        // the Request collection. Since the session HTTP module has already
        // run and set ASP.NET_SessionId, the cookie collection cannot tell us
        // whether the browser actually sent the cookie with this request.

        // Context.Session is null when the page does not use session state
        // (this tested as a reliable indicator of EnableSessionState). Such
        // pages do not need a timeout check.
        if (Context.Session != null)
        {
            // In testing, IsNewSession does more than check whether a cookie
            // is present: after a timeout it reports a new session.
            if (Session.IsNewSession)
            {
                // A new session together with a session cookie in the raw
                // request header means the earlier session timed out. Read
                // the raw header, because the cookie collection already
                // contains the cookie even on the first request.
                string szCookieHeader = Request.Headers["Cookie"];
                if ((null != szCookieHeader) && (szCookieHeader.IndexOf("ASP.NET_SessionId") >= 0))
                {
                    Response.Redirect("sessionTimeout.htm");
                }
            }
        }
    }
}
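The decision the base page makes can also be pulled out into a small function that has no dependency on System.Web, which makes the rule easy to unit test. This is a sketch under assumptions: the class name SessionTimeoutDetector and the method IsTimeout are hypothetical helpers, not part of ASP.NET or of the article's download.

```csharp
using System;

// Hypothetical helper that isolates the timeout rule from the page class.
public static class SessionTimeoutDetector
{
    // Name of the cookie ASP.NET uses by default to carry the session id.
    public const string SessionCookieName = "ASP.NET_SessionId";

    // Returns true when a new session was created for this request even though
    // the browser sent a session cookie, which means the earlier session expired.
    public static bool IsTimeout(bool isNewSession, string cookieHeader)
    {
        if (!isNewSession || string.IsNullOrEmpty(cookieHeader))
        {
            return false;
        }

        return cookieHeader.IndexOf(SessionCookieName, StringComparison.Ordinal) >= 0;
    }
}
```

Inside OnInit the base page could then call SessionTimeoutDetector.IsTimeout(Session.IsNewSession, Request.Headers["Cookie"]) and redirect when it returns true. Like the original code, this is a substring match on the raw Cookie header, so it assumes the default session cookie name.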

sessionTimeout.htm

This can be any page on the site. The example simply redirects to this page, so for this article it just shows a simple “A timeout has occurred” message.

Every other page on the site just needs to derive from this new base page instead of the default System.Web.UI.Page, so change the declaration in the code-behind class from “: System.Web.UI.Page” to “: basePageSessionExpire”.  Each page should also set the EnableSessionState attribute as appropriate:

  • false – page request does not access any session information (the base page uses this to know that it does not need to check for timeout on this request since it does not require session information) 
  • ReadOnly – page request uses session information but does not modify it
  • true – page request reads and updates session information

Conclusion

It is often useful to know, for a given request, whether the user’s session information is still present.  The technique demonstrated is a straightforward implementation that can be easily applied to an entire web site that uses cookie-based ASP.NET Session objects.

Send comments or questions to robertb@aspalliance.com.