Archive for design

Love For Lazy Load

Posted in Design Issues on August 5, 2008 by moffdub

My love affair with domain-driven design has not been all roses and chocolate. I don’t know if this is the norm in other engineering disciplines, but in our perverse world of software, there is always this tension between the right way to design something and the efficient way to design something.

My primary complaint with the Evans book is that there aren’t enough practical examples to illustrate concepts. I’ll grant you that this was likely not a goal of the book. Indeed, the Nilsson book fills the void.


(I mention this book so often, you’d think it was the Bible)

The horrible truth, though, is that no project can be completely and purely domain-driven. The expressive and encapsulated world of objects is built for expressing concepts and encapsulating data. Performance concerns are the realm of another sorcerer.

Often, that sorcerer is a relational database. Tools help with that, though. A more sinister sorcerer lies within object land itself.

On some teams, this sorcerer is caching. A so-called best practice, in the name of pleasing the Mighty Cacher, is to design objects for caching: separate objects with shared state from objects with user-specific state.


(How long until this guy starts showing up in my “visions”?)

To me, this means designing your object model around a technical service. This detracts from the expressiveness of the model.

Another sorcerer actually lies within the Evans book itself:

Whether to provide a traversal or depend on a search becomes a design decision, trading off the decoupling of the search against the cohesiveness of the association. Should the Customer object hold a collection of all the Orders placed? Or should the Orders be found in the database, with a search on the Customer ID field? The right combination of search and association makes the design comprehensible.

The implications here are that you can gain in both performance and clarity: performance by not forcing an entire collection of Orders to be saved or loaded per Customer, and clarity by avoiding “entanglements.”

To me, this means designing your object model around yet another technical service. This too detracts from the expressiveness of the model.

I know this is engineering, and I know there are trade-offs. But at what point does a trade-off become a cop-out? Are these two trade-offs really necessary?

I say they’re not, and the masked avenger that will deliver us from a muddied domain model will get here just as soon as His Laziness can get off the couch.


(I’d be more apathetic if I weren’t so lethargic)

I’m talking about, of course, the Lazy Loading pattern. This pattern can let us have our cake and eat it too, at least in the two binds I described.

First, let’s talk about the caching problem. Let’s say you have one object, A, composed of two others, B and C, and let’s say object C is the cachable object. The design edict would have us, in the name of caching, do away with object A and have objects B and C in our model instead, no matter how disjointed the concepts they both encapsulate are.

I say phooey on that. Keep object A as it should be, and lazily load object C with Lazy Initialization. The lazy load mechanism can check the cache first for the correct instance of object C. If it doesn’t find it, hit the persistent store, load the instance, throw it in the cache, and finally give the reference back to object A.

This way, I get domain model expressiveness by keeping object A in its natural habitat, while at the same time avoiding caching parts of object A that should not be cached.
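Here is a minimal sketch of that cache-first lazy initialization, written in Java. Every name in it (Composite for object A, SharedPart for object C, SharedPartLoader, the storeHits counter) is made up for illustration, and the "persistent store" is just a function passed in. It is not thread-safe, and it is one possible shape rather than the definitive one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the cacheable object C.
final class SharedPart {
    final String id;
    SharedPart(String id) { this.id = id; }
}

// Cache-first loader: check the cache, fall back to the store, remember the result.
final class SharedPartLoader {
    private final Map<String, SharedPart> cache = new HashMap<>();
    private final Function<String, SharedPart> store; // stands in for the persistent store
    int storeHits = 0; // counts trips to the store, for illustration only

    SharedPartLoader(Function<String, SharedPart> store) { this.store = store; }

    SharedPart load(String id) {
        return cache.computeIfAbsent(id, key -> {
            storeHits++;
            return store.apply(key);
        });
    }
}

// Object A: keeps its user-specific state (the "B" part) and only the ID of C.
final class Composite {
    private final String userSpecificState;
    private final String sharedPartId;
    private final SharedPartLoader loader;
    private SharedPart sharedPart; // stays null until first access

    Composite(String userSpecificState, String sharedPartId, SharedPartLoader loader) {
        this.userSpecificState = userSpecificState;
        this.sharedPartId = sharedPartId;
        this.loader = loader;
    }

    String userSpecificState() { return userSpecificState; }

    // Lazy Initialization: C is fetched the first time someone asks for it.
    SharedPart sharedPart() {
        if (sharedPart == null) {
            sharedPart = loader.load(sharedPartId);
        }
        return sharedPart;
    }
}
```

Two Composite instances that refer to the same SharedPart ID end up sharing one cached instance, while their user-specific state never goes near the cache.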

Next target: associations versus repository searches. Phooey! Model associations wherever they naturally occur. When I read the code for the definition of your class, the modeled associations should be explicit in the code as an array, ArrayList, HashMap, whatever.

Use a Virtual Proxy behind an Order interface. Wrap the real, expressive Order object in an OrderProxy that also holds the order ID as a field (this keeps lazy loading stuff out of the Order domain object). Now when Customer objects are loaded, they still have a collection of Orders, but unbeknownst to Customer, the elements are proxies carrying merely IDs. Only when one of those Orders is accessed is that Order loaded.
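A rough Java sketch of the idea follows. The Order interface, RealOrder, the total() method, and the finder function (standing in for a repository lookup) are all assumptions made up for the example; only the OrderProxy name comes from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongFunction;

interface Order {
    long id();
    double total();
}

// The real, expressive domain object. It knows nothing about lazy loading.
final class RealOrder implements Order {
    private final long id;
    private final double total;
    RealOrder(long id, double total) { this.id = id; this.total = total; }
    public long id() { return id; }
    public double total() { return total; }
}

// Virtual Proxy: holds only the ID until a call needs real data.
final class OrderProxy implements Order {
    private final long id;
    private final LongFunction<Order> finder; // hypothetical repository lookup
    private Order real; // loaded on first real use

    OrderProxy(long id, LongFunction<Order> finder) {
        this.id = id;
        this.finder = finder;
    }

    public long id() { return id; }                 // answered without loading
    public double total() { return real().total(); } // forces the load

    boolean isLoaded() { return real != null; }

    private Order real() {
        if (real == null) {
            real = finder.apply(id);
        }
        return real;
    }
}

// Customer keeps its association as a plain collection of Orders.
final class Customer {
    private final List<Order> orders = new ArrayList<>();
    void addOrder(Order order) { orders.add(order); }
    List<Order> orders() { return orders; }
}
```

Customer only ever sees the Order interface, so the association stays explicit in the model while the loading cost is deferred to first use.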

We keep domain expression. We keep performance. As for the entanglements — I don’t agree that they are a problem, as long as you allow repository access to aggregates; in other words, some globalness, where appropriate.

I’m not even opposed to aggregate roots holding references to other aggregate roots; this expresses your domain. If traversing those references is your only way to reach an aggregate, then you are indeed entangled. Globally accessible repositories are your escape hatch.

Why doesn’t the Lazy Loading pattern get more attention and love? Sure, His Laziness might be a fat white cat lying slovenly on a couch drinking a beer. But it is one of the patterns for which I find the most practical use. More importantly, it can have benefits that both the project team and the users can observe and appreciate.

The Intension/Locality Hypothesis

Posted in Semantics on July 23, 2008 by moffdub

I am on constant alert for bullcrap. In the software world, the rapid rate of change makes it all too easy for flunkies to toss around the latest ill-defined buzzword to crowd out qualified people from jobs.

In My Definition of Software Architecture, I dedicated an entire post to one of these marshmallowy terms: software architecture. I included in my own definition the Intension/Locality Hypothesis as defined by Amnon H. Eden.

The hypothesis, concisely, is:

  • A statement about a software system is local (software design) if and only if it is preserved under any expansion.
  • A statement about a software system is non-local, or strategic, (software architecture) if and only if it is not preserved under some expansion.
  • A statement about a software system is extensional (implementation) if and only if it is preserved both under any expansion and under any reduction.

The keyword is “expansion.” What does that mean?

Wikipedia gives an example, but doesn’t provide the sought-after clarity:

The Client-Server style is architectural (strategic) because a program that is built by this principle can be expanded into a program which is not client server; for example, by adding peer-to-peer nodes.

All well and good, and it is true (right now, we don’t know why, but it is like porn; I know it when I see it). But what prompted the author of this sentence to choose nodes to add? Couldn’t I just as easily have said “The client-server style is design (local) because a program that is built by this principle still satisfies this principle when extra data access classes are added”?

This fuzziness has led me not to rely solely on Intension/Locality as my guiding light for software architecture. If you go straight to the source and read this paper, a very nice mathematical definition is described. Still, I am wary of formal methods anywhere in an industry that exists only in our minds.

The paper actually shows a couple of proofs, one of an architectural statement and one of a design statement.

First, here is your dinky example off which to work:

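#include <vector>

class Object
{

};

class Nil1: public Object
{
    // Nil1 holds Objects as members (element type assumed to be Object*)
    std::vector<Object*> Objs;
};

class Nil2
{

};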

It is important to define a model to describe this “system.” The model will set the context in which you make statements, and will up-end my fallacious counter-argument in the client-server example above. In the Intension/Locality world, models have entities and relationships.

Our example model:

Entities:
Object, Nil1, Nil2

Relations:
Class = {Object, Nil1, Nil2}
Inherit = { (Nil1, Object) }
Members = { (Nil1, Object) }

Loosely: Object, Nil1, and Nil2 are classes. Nil1 inherits from Object. Nil1 has as class members Objects.

Now make a couple of statements about the system:

  • Statement 1: ∀c: Class(c) ==> Inherit*(c, Object)
    (Inherit* is the transitive closure of Inherit; basically all classes that directly or indirectly inherit from Object)

    Loosely: for each class c in the set of all classes, c directly or indirectly inherits from Object.

  • Statement 2: Class(component) ^ Class(composite) ^ Inherit(composite, component) ^ Members(composite, component)

    Loosely: component is a class, composite is a class, composite inherits from component, and a composite has as members components

“Expansion” is specifically defined (loosely; keep reading) by the addition of entities. In our case, classes.

Proof that Statement 1 is non-local: consider for a moment that Nil2 does not exist. Then Statement 1 holds because the only class other than Object, namely Nil1, inherits from Object (Object itself is counted as trivially satisfying Inherit*). Now add Nil2. Statement 1 now does not hold because Nil2 does not inherit from Object. QED.

Proof that Statement 2 is local: consider again that Nil2 does not exist. Statement 2 holds since Nil1 and Object are classes, Nil1 inherits from Object and Nil1 has as members Objects. Now add Nil2 back in. Statement 2 still holds for the same reason. Statement 2, unlike Statement 1, does not have a “for all” quantifier that is “global” as far as our model goes. So it is not necessary for Statement 2 to hold for Nil1 and Nil2, or Nil2 and Object.
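If you would rather watch the two proofs run than read them, here is a small Java sketch that encodes the example model and evaluates both statements before and after the expansion. The ClassModel type and its method names are invented for this illustration. It assumes single inheritance with no cycles, and it treats Inherit* as reflexive so that Object satisfies Statement 1 on its own behalf.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class ClassModel {
    final Set<String> classes = new HashSet<>();          // the Class relation
    final Map<String, String> inherit = new HashMap<>();  // Inherit: child -> parent
    final Set<String> members = new HashSet<>();          // Members, encoded as "owner->memberType"

    // Inherit*: walks up the parent chain; assumed reflexive and acyclic.
    boolean inheritsStar(String c, String ancestor) {
        for (String cur = c; cur != null; cur = inherit.get(cur)) {
            if (cur.equals(ancestor)) {
                return true;
            }
        }
        return false;
    }

    // Statement 1: every class directly or indirectly inherits from Object.
    boolean statement1() {
        return classes.stream().allMatch(c -> inheritsStar(c, "Object"));
    }

    // Statement 2: both are classes, composite inherits from component,
    // and composite has components as members.
    boolean statement2(String composite, String component) {
        return classes.contains(composite)
            && classes.contains(component)
            && component.equals(inherit.get(composite))
            && members.contains(composite + "->" + component);
    }

    // The example model with Nil2 left out.
    static ClassModel withoutNil2() {
        ClassModel m = new ClassModel();
        m.classes.add("Object");
        m.classes.add("Nil1");
        m.inherit.put("Nil1", "Object");
        m.members.add("Nil1->Object");
        return m;
    }
}
```

Starting from withoutNil2(), both statements evaluate to true. After classes.add("Nil2"), statement1() flips to false while statement2("Nil1", "Object") stays true, which is the non-local versus local distinction in miniature.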

Now I will disprove my “proof” that the client-server architecture is both architectural and design.

First, a model:

Entities:
Box1, Box2, Box3, Box4

Relations:
Clients = {Box1, Box2, Box3}
Servers = {Box4}
Service = { (Box1, Box4), (Box2, Box4), (Box3, Box4) }

Statement 3: ∀c∀s: Clients(c) ^ Servers(s) ==> Service(c, s)

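The Wikipedia example holds when you add Box5 to the list of entities as a peer-to-peer node that also acts as a client, so Box5 joins Clients. (Box5, Box4) is not in the Service relation, so Statement 3 fails after the expansion. Therefore, Statement 3 is non-local.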

Now my counter-example doesn’t even make sense. Classes are not entities. We could augment the model and add them as entities, but the relations would remain unchanged, and therefore adding classes to a client-server system is a design activity because Statement 3 is unchanged.

What about The Project? Let’s put some of the statements about its architecture to the Intension/Locality test and see if we gain any insight.

I think I need a model first:

Entities (grouped for convenience):
Application, Domain, Data Access, Infra,
Dropdown Cache, Auto-complete Cache,
Central DB,
External System,
VB.NET Client,
Fat Client 1, Fat Client 2, Fat Client 3

Relations:
Layers = {Application, Domain, Data Access, Infra}
Client Caches = {Dropdown Cache, Auto-complete Cache}
Server Database = {Central DB}
Daily Update App = {External System}
Client Tiers = {VB.NET Client}
Clients = {Fat Client 1, Fat Client 2, Fat Client 3}

LayerOnTier = { (Domain, VB.NET Client), (Data Access, VB.NET Client), (Application, VB.NET Client), (Infra, VB.NET Client) }
ClientCaches = { (VB.NET Client, Dropdown Cache), (VB.NET Client, Auto-complete Cache) }
ServerOfClient = { (VB.NET Client, Central DB) }

Updater = { (External System, Central DB) }

ClientInstances = { (Fat Client 1, VB.NET Client), (Fat Client 2, VB.NET Client), (Fat Client 3, VB.NET Client) }

You can probably guess that this is an instance of the architecture, with three clients.

For simplicity, let’s look at a subset of the model, namely Layers, Client Tiers, and LayerOnTier.

Statement 4: ∀ L: Layers(L) ^ |Client Tiers| = c ==> LayerOnTier(L, VB.NET Client)

Loosely: for all L, if L is a Layer and the number of Client Tiers is some constant (in our instantiation of the model, 1), then L is a layer on the VB.NET Client tier.

A subtlety in Amnon’s idea is that expansion must be “proper”, meaning you cannot modify existing entities. So you can’t change how the entities relate to one another in some relationship. I further take “proper expansion” to mean an expansion that makes sense in your model. Amnon gives such a case in the Implicit Invocation example in the paper cited above.

In the proof below, I am not only going to add a Layer, but also an element to the LayerOnTier relation. It doesn’t make sense to add a layer of software to a system without specifying where it will run.

Proof that Statement 4 is local: add entity Service to Layers and add relation (Service, VB.NET Client) to LayerOnTier. Statement 4 continues to hold no matter how many layers I add to the VB.NET Client tier. Adding a new layer to the system is a design decision.

Note that an expansion of the relation Client Tiers is not valid because Statement 4 forbids it.

You know what? This is enough for me to realize something, so let me stop here. What if I stated LayerOnTier and Client Tiers differently? Then Statement 4 might not hold when Service is added. It would then be non-local and therefore an architectural statement. So why did I include it? Because I said so.

Seriously. Recall this essay by Paul Clements. He states that what counts as architectural and what does not comes down to the “architect said so”.

My point is this: Statement 4 is local under one model of the system, and can be made non-local in another. The make-up of that model is in the eyes of the architect.

Here’s the new model, abbreviated:

Entities:
Layers = { same as before }
Client Tiers = { VB.NET Client, App Server }
Note: now the constant c in Statement 4 is equal to 2

Relations:
LayerOnTier = { same as before }

Proof that Statement 4 is non-local: add entity Service to Layers and add relation (Service, App Server) to LayerOnTier. Statement 4 is clearly false now because Service resides on the App Server tier. Adding a new layer to the system is now an architectural decision.
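The flip between the two models can be checked mechanically. Below is a Java sketch that evaluates Statement 4 under each model, before and after adding the Service layer. TierModel and its methods are hypothetical names for this example, and LayerOnTier is stored as a layer-to-tier map, which assumes each layer runs on exactly one tier.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class TierModel {
    final Set<String> layers = new HashSet<>();
    final Set<String> clientTiers = new HashSet<>();
    final Map<String, String> layerOnTier = new HashMap<>(); // layer -> tier it runs on
    final int c; // the constant the architect fixed for |Client Tiers|

    // Builds a model with the given tiers and the four original layers,
    // all of which run on the VB.NET Client tier.
    TierModel(String... tiers) {
        clientTiers.addAll(Arrays.asList(tiers));
        c = clientTiers.size();
        for (String layer : new String[] {"Application", "Domain", "Data Access", "Infra"}) {
            addLayer(layer, "VB.NET Client");
        }
    }

    // An expansion that makes sense in the model: a new layer plus where it runs.
    void addLayer(String layer, String tier) {
        layers.add(layer);
        layerOnTier.put(layer, tier);
    }

    // Statement 4: if |Client Tiers| = c, every layer runs on the VB.NET Client tier.
    boolean statement4() {
        if (clientTiers.size() != c) {
            return true; // antecedent is false, so the implication holds vacuously
        }
        return layers.stream()
                     .allMatch(layer -> "VB.NET Client".equals(layerOnTier.get(layer)));
    }
}
```

With new TierModel("VB.NET Client"), adding Service on the VB.NET Client tier leaves statement4() true. With new TierModel("VB.NET Client", "App Server"), adding Service on the App Server tier makes it false. The statement is the same in both cases; only the model changed.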

In the first model, I, the architect, decreed that there was to be but one tier on which layers of software could run. By the Intension/Locality Hypothesis, that relegated the choice of how many layers to have to a design decision.

In the second model, I, the architect, specified “Thou shalt have an application server and a client tier” – and this made the decision of adding a layer an architectural decision, because its runtime location would violate a global statement (Statement 4).

I think this seemingly rigorous derivation of the fact that what is architectural and what is not can vary according to who says what indicates something intrinsically arbitrary about software architecture. So much for clearing the air!

It’s like my old software engineering professor, Larry Bernstein, says: “the answer is always ‘it depends.’”

Method Regulator Pattern

Posted in Design Issues on July 1, 2008 by moffdub

(skip ahead to download here)

This blog’s most popular post thus far, The Getter Setter Debate, spawned this comment from Wolter:

“Just put them in the same package and set them to package-private.
Or put warnings in the API documentation that the getters/setters are for internal use only, and are likely to change.”

While this is a viable solution, I strive for maximal generality. Granted, Java and .NET support the concept of package or “friend” access. Even so, this is language and/or platform dependent, and it can force some awkward package organization in some situations.

Since the thought-provoking idea of avoiding getters occurred to me, I’ve come to the following conclusion: yes, they should be avoided as much as possible, but no, you can’t avoid them completely. They are needed for legitimate uses, such as UI mapping and persistence.

And now, the natural question arises: if you are on a large team, how do you enforce this policy?

There is the possibility of manual intervention. Indeed, many teams employ code reviews and walkthroughs, and every instance of an accessor method being used can be scrutinized. A small example of domain-driven design called TimeAndMoney takes this approach in the file Money.java:

/**
 * How best to handle access to the internals? It is needed for
 * database mapping, UI presentation, and perhaps a few other
 * uses. Yet giving public access invites people to do the
 * real work of the Money object elsewhere.
 * Here is an experimental approach, giving access with a
 * warning label of sorts. Let us know how you like it.
 */
public BigDecimal breachEncapsulationOfAmount() {
    return amount;
}

public Currency breachEncapsulationOfCurrency() {
    return currency;
}

Certainly, as the author of this class, you are discouraging other developers from using a method in their code with the substring “breachEncapsulation”. And giving the power to the class author is a step in the right direction.

Approach

But you know me. I’m never happy with anything. I’m especially not content with relying on others to behave. It rarely works.

This got me thinking of a pattern-based solution, one where only certain classes are authorized to use another class’s getters and setters, and violation of such would cause run-time problems like exceptions.

I began tinkering around with this approach and came up with the Method Regulator Pattern:

The heart of this setup is the Regulator. It keeps track of a type that is being regulated and a list of types that are authorized to use regulated methods of that type.

What is a regulated method? It is up to the class that wants to be regulated to, first off, inherit from the Regulated Object class, and second, call the inherited authorizeCaller method as the first statement in every method that should be regulated.

The Regulated Object base class uses the process-global Regulated Types class to look up all of the Regulators for the class that called authorizeCaller, including those inherited from base classes and interfaces. Then, for each Regulator, if at least one of them authorizes the method call, the authorization succeeds. Otherwise, indicate failure somehow, like with an exception.
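To make the lookup concrete, here is a Java sketch of one way to gather authorizations across a type's superclasses and interfaces. It is not the downloadable C# implementation. The class name RegulatedTypes is borrowed from the description above, but its methods and the sample types (Account, SavingsAccount, Auditable, Teller, Auditor, Intruder) are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RegulatedTypes {
    // regulated type -> caller types authorized directly on it
    private static final Map<Class<?>, Set<Class<?>>> AUTHORIZED = new HashMap<>();

    static void authorize(Class<?> regulated, Class<?> caller) {
        AUTHORIZED.computeIfAbsent(regulated, key -> new HashSet<>()).add(caller);
    }

    // The type itself, its superclasses, and every interface reachable from them.
    static List<Class<?>> hierarchyOf(Class<?> type) {
        List<Class<?>> found = new ArrayList<>();
        collect(type, found);
        return found;
    }

    private static void collect(Class<?> type, List<Class<?>> found) {
        if (type == null || found.contains(type)) {
            return;
        }
        found.add(type);
        collect(type.getSuperclass(), found);
        for (Class<?> iface : type.getInterfaces()) {
            collect(iface, found);
        }
    }

    // Succeeds if any authorization found in the hierarchy covers the caller.
    static boolean isAuthorized(Class<?> regulated, Class<?> caller) {
        for (Class<?> type : hierarchyOf(regulated)) {
            Set<Class<?>> allowed = AUTHORIZED.get(type);
            if (allowed == null) {
                continue;
            }
            for (Class<?> candidate : allowed) {
                // isAssignableFrom also lets an authorized interface cover its implementers
                if (candidate.isAssignableFrom(caller)) {
                    return true;
                }
            }
        }
        return false;
    }
}

// Hypothetical regulated types and callers.
interface Auditable {}
class Account {}
class SavingsAccount extends Account implements Auditable {}
class Teller {}
class Auditor {}
class Intruder {}
```

If Teller is authorized on Account and Auditor on Auditable, then SavingsAccount accepts both of them, Account on its own does not accept Auditor, and Intruder is accepted nowhere.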

The burning question remains: how does authorizeCaller know the type of the caller? This will depend on your platform, but your platform must have some kind of reflective capability. Both Java and .NET support the ability to query the activation stack: getStackTrace in Java and StackTrace in .NET.
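On the Java side, a minimal sketch of the stack inspection might look like the following. The names (RegulatedObject, UnauthorizedCallException, Money, MoneyMapper, Snooper) are invented for the example, and it hard-codes the frame position of the caller. That only works if authorizeCaller is invoked directly from the regulated method, and the Java documentation warns that a virtual machine may omit frames in some circumstances, so treat it as fragile.

```java
import java.util.HashSet;
import java.util.Set;

class UnauthorizedCallException extends RuntimeException {
    UnauthorizedCallException(String message) { super(message); }
}

abstract class RegulatedObject {
    // Simplified: one global whitelist of caller class names.
    private static final Set<String> AUTHORIZED_CALLERS = new HashSet<>();

    static void authorize(Class<?> caller) {
        AUTHORIZED_CALLERS.add(caller.getName());
    }

    // Call this as the first statement of every regulated method.
    protected final void authorizeCaller() {
        StackTraceElement[] stack = new Throwable().getStackTrace();
        // Assumed layout: stack[0] = authorizeCaller, stack[1] = the regulated
        // method, stack[2] = whoever called the regulated method.
        if (stack.length < 3) {
            throw new UnauthorizedCallException("caller frame unavailable");
        }
        String caller = stack[2].getClassName();
        if (!AUTHORIZED_CALLERS.contains(caller)) {
            throw new UnauthorizedCallException(
                caller + " attempted an unauthorized call on " + getClass().getName());
        }
    }
}

// Hypothetical regulated class with one guarded getter.
final class Money extends RegulatedObject {
    private final long cents;
    Money(long cents) { this.cents = cents; }

    long cents() {
        authorizeCaller();
        return cents;
    }
}

// Hypothetical authorized caller, e.g. a persistence mapper.
final class MoneyMapper {
    static long read(Money money) { return money.cents(); }
}

// Hypothetical caller that was never authorized.
final class Snooper {
    static boolean isBlocked(Money money) {
        try {
            money.cents();
            return false;
        } catch (UnauthorizedCallException e) {
            return true;
        }
    }
}
```

After RegulatedObject.authorize(MoneyMapper.class), MoneyMapper.read returns the amount while the same getter throws when reached from Snooper.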

Example

I would never be one to just describe something very abstractly, wave my hands at it, and declare that it works. What am I, a professor? I went ahead and implemented this pattern in C#. You can download it here. Included is a set of NUnit tests.

Another burning question: what prevents some mischievous or misinformed coder from hijacking an instance of a Regulator, twiddling with it to grant their new class access to a specific type, and rendering this scheme impotent?

This is an implementation issue, and I chose to deal with it by providing methods to close or “lock” a regulator once we’re finished authorizing types. I do the same for attempts to replace an entire Regulator for a given type with a new Regulator.
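The locking idea is simple enough to sketch in a few lines of Java. This is an illustration under assumptions, not a port of the download: Regulator, RegulatorRegistry, and their methods are hypothetical names, and "first registration wins" is one guess at how replacement could be refused.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class Regulator {
    private final Class<?> regulatedType;
    private final Set<Class<?>> authorized = new HashSet<>();
    private boolean closed = false;

    Regulator(Class<?> regulatedType) { this.regulatedType = regulatedType; }

    Class<?> regulatedType() { return regulatedType; }

    // Authorizations are accepted only until the regulator is closed.
    void addAuthorizedType(Class<?> caller) {
        if (closed) {
            throw new IllegalStateException(
                "regulator for " + regulatedType.getName() + " is closed");
        }
        authorized.add(caller);
    }

    // One-way switch: there is deliberately no way to reopen.
    void close() { closed = true; }

    boolean authorizes(Class<?> caller) { return authorized.contains(caller); }
}

final class RegulatorRegistry {
    private static final Map<Class<?>, Regulator> REGULATORS = new HashMap<>();

    // First registration wins; swapping in another regulator for the same type is refused.
    static void register(Regulator regulator) {
        Regulator existing = REGULATORS.putIfAbsent(regulator.regulatedType(), regulator);
        if (existing != null) {
            throw new IllegalStateException(
                "a regulator for " + regulator.regulatedType().getName() + " is already registered");
        }
    }

    static Regulator regulatorFor(Class<?> type) { return REGULATORS.get(type); }
}
```

Once close() has been called, later addAuthorizedType calls throw, and a second register call for the same type throws as well, so the authorizations decided at setup time stay put (short of reflection).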

Some of you are confused. I’m confused as well. I think an example will clear up what is going on.

Suppose you have the following relatively complicated class hierarchy of five classes, each of which has some getters that you’d like to regulate:

Attached to each class in that diagram is a note indicating which classes are allowed to call getters on each class. It is important to note that authorized callers in a base class should be inherited by a subclass to preserve the Liskov Substitution Principle.

Speaking of which, suppose there are five calling classes like so:

Therefore, the classes authorized to use the accessors of class Sub1 are Caller2, Caller1, and Caller5, the latter two inherited. Also, since Caller3 and Caller4 implement Interface1, we could just as easily authorize Interface1 for Sub3 instead of the two classes individually.

Here is some code for setting up who can call accessors on who:

public void Setup()
{
    MethodAccessRegulator superReg = new MethodAccessRegulator(typeof(Super));
    superReg.addAuthorizedType(typeof(Caller1));
    superReg.closeRegulator();

    RegulatedTypes.addRegulatedType(superReg);

    MethodAccessRegulator sub1Reg = new MethodAccessRegulator(typeof(Sub1));
    sub1Reg.addAuthorizedType(typeof(Caller2));
    sub1Reg.closeRegulator();

    RegulatedTypes.addRegulatedType(sub1Reg);

    MethodAccessRegulator subInterfaceReg = new MethodAccessRegulator(typeof(SubInterface));
    subInterfaceReg.addAuthorizedType(typeof(Caller5));
    subInterfaceReg.closeRegulator();

    RegulatedTypes.addRegulatedType(subInterfaceReg);

    MethodAccessRegulator sub2Reg = new MethodAccessRegulator(typeof(Sub2));
    sub2Reg.addAuthorizedType(typeof(Caller2));
    sub2Reg.closeRegulator();

    RegulatedTypes.addRegulatedType(sub2Reg);

    MethodAccessRegulator sub3Reg = new MethodAccessRegulator(typeof(Sub3));
    sub3Reg.addAuthorizedType(typeof(Caller3));
    sub3Reg.addAuthorizedType(typeof(Caller4));
    sub3Reg.closeRegulator();

    RegulatedTypes.addRegulatedType(sub3Reg);
}

Here is some code for Super:

public class Super: MethodAccessRegulatedObject
{
    private int _x;

    public int x
    {
        get
        {
            authorizeCaller(this);
            return _x;
        }
    }
}

Boring; only one member to protect, but it gets the point across.

Now if, say, Caller2 is defined in this way:

public class Caller2
{
    public void useSub1()
    {
        int i = (new Sub1()).x;
    }

    public void useSub2()
    {
        int i = (new Sub2()).z;
    }

    // inherited access
    public void useSub3()
    {
        int i = (new Sub3()).x;
    }

    // should throw exception
    public void useSuper()
    {
        int i = (new Super()).x;
    }
}

and then we use Caller2 like this:

public void TestSingleInheritance()
{
    (new Caller2()).useSub1();
    (new Caller2()).useSub2();

    try
    {
        (new Caller2()).useSuper();
        Assert.Fail();
    }
    catch (MethodAuthorizationException ex)
    {
        Console.WriteLine(ex.Message);
    }
}

the first two calls will succeed, while the third one will throw a MethodAuthorizationException with message “An object of type Caller2 attempted an unauthorized method call on class Super”.

Catches

  • First, we make use of this StackTrace class, and we kind of assume where we are in the stack. This approach, as I have implemented it, breaks if you build the download in Release mode, because calls to InvokeMethodFast are inserted instead of the actual calling class.
  • Second, a much better way for client classes to declare their desire to be regulated would be to use Aspect-Oriented Programming. I do not have experience with AOP yet, so this is a future enhancement. I also wanted a lightweight approach for the blog, instead of telling you to download something like NAspect.
  • The “locking” aspect mentioned above (all the code in that Setup() function) is only effective if you can, within your organization, centralize the creation and instantiation of the Regulated Types object and the constituent Regulators.
  • Finally, all of this has to happen within a single process, though I think if you are distributing your objects, you might have more pressing problems.
  • This entire scheme might be rendered impotent by reflection.

I look forward to some feedback on the practicality, usability, and security of this approach.
