August 2007

A Periodic Table of Visualization Methods

A lot of work went into this. A "periodic table" of visualization methods for data, information, concepts, strategy, metaphors, process and structure.

Here's a screenshot. Be sure to visit the original if you're interested: when you mouse over each cell, you get an example of the corresponding visualization method.

Periodic_table

I didn't see any of the visualization techniques used by structure101 for visualizing software dependencies and architectural layers. The table is more focused on business processes, though Data Flow (Df), Entity-relationship (E) and Flowchart (Fl) diagrams are there.

Package design matters - Part 1

Java packages are often used like file-system folders to organize source. But source files differ from "normal" files in that they are highly inter-dependent. Considering this interdependence as a package hierarchy evolves can have significant productivity benefits.

Packages as Folders

Java packages provide an ideal way to organize code into a scalable, hierarchical structure that helps us find specific code.

In this sense, packages can be used like folders in a file-system: 

  • We place files with something in common in the same folder. 
  • When a folder grows too big and we find we’re having trouble finding files, we split the folder into sub-folders according to criteria that make sense to us.
  • We share files by placing them in a common area on the company network, in which case the structure evolves according to the varying criteria of different people.
  • We often have trouble deciding which folder a file best belongs in, and make an arbitrary decision.

Often packages are used only as a kind of file-system equivalent. However, the package hierarchy can also be used to reinforce the intended design and associated development activities.

Packages as Design Abstractions

Source files differ from other files typically stored in the file-system. They:

  • Depend on the detailed contents of other source files.
  • Are created and edited in groups of multiple files.
  • Are subject to a high number of relatively small changes.
  • Are edited by a team of developers rather than individuals.
  • Should be reusable on future projects.
  • Should be easy to change without impacting other files too widely.
  • Must support the defined deployment environment.
  • Are subject to QA, version control and release processes.

The aim of a package design should be to support these characteristics.  For example, the design could explicitly support Martin's “Reuse/Release Equivalence Principle (REP)” (article, book) whereby packages are developed, built, tested and released against released versions of the packages upon which they depend. 

Design is not something that happens once at the start of the project; it is an activity that spans the life of an application or product. This fact has become explicit with the iterative and Agile development models. As the code-level design continues, the package-level design emerges.

Unfortunately, the emergent design is often invisible and so forgotten. Not only does the original design degrade, but the overall structure tends to become excessively complex. As the supporting structure dissolves, development activities become harder and the cost of each new feature increases.

This principle of emergent design is important here. Clearly when I have a project of 50 classes in a half-dozen packages, the overhead of sub-optimal package dependencies isn't going to slaughter me. But if my project is going to grow to 5000 classes, then putting in minimal effort from the start can save huge effort when things get more complicated.

In part 2 I'll take a closer look at cyclic package dependencies and why they matter.

An Overview of Structure101 Architecture Diagrams

Structure101 lets you work with both structure (the whole code-base as it is) and architecture (the subset of the structure that you really care about, and how it should be). It lets you define the architecture in the context of the physical structure and disseminate this to the team. Architecture diagrams are what make this possible.

Layering and composition

Structure101 architecture diagrams use a concise visual notation for representing architectural layering and composition. Here is an example of one of the architecture diagrams that we use for the structure101 code-base.

Layeringandcomposition_3
The principle is simple; components (“cells”) should only depend on components at lower levels, not in the same or higher levels.
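
To make the rule concrete, here is a minimal sketch of the same check in plain Java. It is not structure101's API: the class and method names are made up for illustration, and the numeric level I give each cell (including where "xbase" sits) is an assumption, not something read off the diagram.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the layering rule; not part of structure101.
class LayeringCheck {

    // A dependency from one named cell to another.
    record Dependency(String from, String to) {}

    // levels maps each cell name to its layer; a larger number means a higher layer.
    // Every cell named in deps is assumed to be present in levels.
    // A dependency is legal only if it points to a strictly lower layer.
    static List<Dependency> violations(Map<String, Integer> levels, List<Dependency> deps) {
        List<Dependency> bad = new ArrayList<>();
        for (Dependency d : deps) {
            int fromLevel = levels.get(d.from());
            int toLevel = levels.get(d.to());
            if (toLevel >= fromLevel) { // same level or upward
                bad.add(d);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Assumed levels, for illustration only.
        Map<String, Integer> levels = Map.of("hiView", 2, "xbase", 1, "graph", 0);
        List<Dependency> deps = List.of(
                new Dependency("hiView", "xbase"),   // downward: fine
                new Dependency("xbase", "graph"),    // downward: fine
                new Dependency("graph", "hiView"));  // upward: reported
        System.out.println(violations(levels, deps));
    }
}
```

Only the dependency from "graph" up to "hiView" is reported; the two downward dependencies pass.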

Layering Overrides

Sometimes a top-down dependency structure is too simple to capture the intent of an architecture. “Overrides” allow you to override the default layering of a diagram. For example we may decide to allow a specific dependency from a cell to a higher-level cell. The override is shown as a green (“allowed”) arrow on the architecture diagram. (Note that enabling this “upward” dependency practically merges the “hiView”, “xbase” and “graph” components from the perspective of testing, reuse, development, etc.)

Layeringoverrides
A more common example is where we wish to enforce stricter layering. For example we may want one layer to only use the next layer down, but not layers below that. Such an override is shown as red (“disallowed”) on the architecture diagram.
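
One way to picture how the two kinds of override sit on top of the default rule is the sketch below. It is my own illustration, not how structure101 implements it. The cell names ("ui", "service", "data") are hypothetical, and letting a red override win over a green one when both name the same dependency is an assumption.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: default top-down layering plus explicit overrides.
class LayeringWithOverrides {

    record Dependency(String from, String to) {}

    private final Map<String, Integer> levels;   // larger number = higher layer
    private final Set<Dependency> allowed;       // "green" overrides
    private final Set<Dependency> disallowed;    // "red" overrides

    LayeringWithOverrides(Map<String, Integer> levels,
                          Set<Dependency> allowed,
                          Set<Dependency> disallowed) {
        this.levels = levels;
        this.allowed = allowed;
        this.disallowed = disallowed;
    }

    // Overrides are consulted first; otherwise the default rule applies:
    // a cell may depend only on cells at a strictly lower level.
    // Both cells are assumed to be present in the levels map.
    boolean isPermitted(Dependency d) {
        if (disallowed.contains(d)) {
            return false; // assumption: red wins if both overrides are present
        }
        if (allowed.contains(d)) {
            return true;
        }
        return levels.get(d.to()) < levels.get(d.from());
    }

    public static void main(String[] args) {
        Map<String, Integer> levels = Map.of("ui", 2, "service", 1, "data", 0);
        Dependency skipsALayer = new Dependency("ui", "data");  // legal by default
        Dependency upward = new Dependency("data", "service");  // illegal by default
        LayeringWithOverrides rules =
                new LayeringWithOverrides(levels, Set.of(upward), Set.of(skipsALayer));
        System.out.println(rules.isPermitted(skipsALayer)); // red override applies
        System.out.println(rules.isPermitted(upward));      // green override applies
    }
}
```

The red override turns a dependency that the default rule would accept (skipping a layer) into a violation, and the green override does the reverse for one specific upward dependency.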

Combining diagrams

It is not necessary to include all aspects of an architecture on a single structure101 architecture diagram.

A common scenario is where a number of “add-ins” are distributed across several packages. For example, this diagram shows part of the structure101 architecture.

Combiningdiagrams1
It is correct, but incomplete. Classes in assemblies.X should never depend on classes in lang.Y.  We could express this by adding several overrides, but it is much cleaner to use a separate diagram for this aspect of the architecture.

The next diagram defines a number of “language packs” that do not have a direct equivalent in the physical structure (they are “pure” architecture components), but express the architectural constraint that was missing above.

Combiningdiagrams2
The combination of the 2 diagrams defines the intended architecture.

Mapping the architecture to physical code

In order to understand how a physical code-base conforms to an intended architecture, we need to map the architectural components (cells) to physical code. 

Simple patterns are used to establish this mapping. This has a number of benefits:

  • If a diagram contains a component mapped to com.headway.lang.* and the team creates a new package com.headway.lang.cobol, then the diagram is not rendered obsolete – all the classes in the new package map to the intended component.
  • I can create components with more complex mappings with expressions such as com.headway.*.test.?
  • I can create and show a component for which no code has yet been created, either by specifying no pattern or specifying the paths where I expect the new code to be implemented.
  • I can effectively “hide” physical entities from a diagram. For example any code in com.headway.lang.cobol will simply map to a component with the expression com.headway.lang.* - I do not need to show package cobol on the diagram if I don’t want to.

Another flexibility is that a physical entity maps to the component with the most specific pattern.  For example if I include 2 components, one with com.headway.lang.* and the other with the expression com.headway.lang.java.*, then the class com.headway.lang.java.myClass will map to the latter. The effects of this can be at the same time subtle and powerful. For example I could move the component that maps to com.headway.lang.java.* into another “parent” altogether. 

Finally, each diagram has a (possibly empty) expression that maps to  “excluded” items. This is useful if some physical entities would otherwise undesirably map to a component in the diagram.
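
As a rough illustration of the "most specific pattern wins" and "excluded" rules, here is a small sketch. It is a simplification under stated assumptions, not structure101's matching code: it handles only patterns that end in ".*" (so not expressions like com.headway.*.test.?), it treats the longest matching prefix as the most specific, and the names ComponentMapper, map, exclude and componentFor are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of pattern-to-component mapping; not structure101's API.
class ComponentMapper {

    private final Map<String, String> componentByPrefix = new LinkedHashMap<>();
    private final List<String> excludedPrefixes = new ArrayList<>();

    // Turns "com.headway.lang.*" into the prefix "com.headway.lang.".
    private static String prefixOf(String pattern) {
        if (!pattern.endsWith(".*")) {
            throw new IllegalArgumentException("only patterns ending in .* are supported: " + pattern);
        }
        return pattern.substring(0, pattern.length() - 1);
    }

    void map(String pattern, String component) {
        componentByPrefix.put(prefixOf(pattern), component);
    }

    void exclude(String pattern) {
        excludedPrefixes.add(prefixOf(pattern));
    }

    // Returns the component whose pattern has the longest matching prefix,
    // or empty if the class is excluded or matches no pattern.
    Optional<String> componentFor(String className) {
        for (String excluded : excludedPrefixes) {
            if (className.startsWith(excluded)) {
                return Optional.empty();
            }
        }
        String best = null;
        for (String prefix : componentByPrefix.keySet()) {
            if (className.startsWith(prefix) && (best == null || prefix.length() > best.length())) {
                best = prefix;
            }
        }
        return best == null ? Optional.empty() : Optional.of(componentByPrefix.get(best));
    }

    public static void main(String[] args) {
        ComponentMapper mapper = new ComponentMapper();
        mapper.map("com.headway.lang.*", "lang");
        mapper.map("com.headway.lang.java.*", "java");
        // The more specific pattern wins for classes under com.headway.lang.java.
        System.out.println(mapper.componentFor("com.headway.lang.java.myClass"));
        // A new sub-package still lands in the general "lang" component.
        System.out.println(mapper.componentFor("com.headway.lang.cobol.Parser"));
    }
}
```

Because the mapping is by pattern rather than by a fixed list of packages, the new cobol package needs no change to the diagram definition.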

Once the mapping is established, any dependencies that violate the architecture are shown on the diagram as curved dotted lines, as shown here between component “graph” and the higher-level package “hiView”.

Maptophysical
It is easy to discover the code-level cause of a violation by selecting it on the diagram within a structure101 client or IDE plug-in.

Structure101 IntelliJ Plug-in Build 104

This update now checks for architecture violations automatically when you do a build (the previous version was "on-demand" only). More on structure101 IDE plug-ins.

Beautiful Structure

In response to O’Reilly's just-published Beautiful Code, Jonathan Edwards explains why he couldn't go along with the premise. One sentence in his excellent piece stood out for me:

"The human mind can not grasp the complexity of a moderately sized program, much less the monster systems we build today."

This is true, but only to a point. Clearly it is humanly impossible to understand the whole "design" of a million-line code-base by studying just the lines of code. But hopefully that is not necessary.

If it's written in an OO language, then I'm mentally constructing class diagrams as I read the code. Much better if these are visible to me - I can work on understanding the class-level structures, dipping down to the code-level as I need to.

I'm still not going to make sense of thousands of classes as a single conceptual group. But that's not what I'm going to do. I'll start organizing the classes into groups. Ideally this was done as the code-base evolved and I have some physical representation of these groups to help me with my task. In Java, the package structure largely serves this purpose. Understanding the package hierarchy and interdependencies (in conjunction with dipping down to the class and code-levels) is not going to be a cake walk, but if the hierarchies are well-structured, it is surely possible.

The degree to which my million line code-base can be understood is therefore largely dependent on:

  1. How well the hierarchies are structured
  2. How good a job they do in explaining the code-base

1 is pretty easy to measure (a lack of cyclic dependencies; not too much complexity at any point of breakout in the hierarchy). But being measurable, I'd call this "quality" rather than "beauty".
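
To show how mechanical the first part of that measurement is, here is a minimal sketch of a cycle check over a package dependency graph, using a plain depth-first search. The class name, the method names and the idea of passing the graph as a map from each package to the packages it uses are all my own assumptions for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: detects whether any cyclic dependency exists between packages.
class PackageCycles {

    // deps maps each package name to the packages it depends on.
    static boolean hasCycle(Map<String, List<String>> deps) {
        Set<String> finished = new HashSet<>(); // fully explored, known cycle-free
        Set<String> onPath = new HashSet<>();   // packages on the current search path
        for (String pkg : deps.keySet()) {
            if (visit(pkg, deps, finished, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String pkg, Map<String, List<String>> deps,
                                 Set<String> finished, Set<String> onPath) {
        if (onPath.contains(pkg)) {
            return true;  // we came back to a package we are still exploring
        }
        if (finished.contains(pkg)) {
            return false; // already explored without finding a cycle
        }
        onPath.add(pkg);
        for (String next : deps.getOrDefault(pkg, List.of())) {
            if (visit(next, deps, finished, onPath)) {
                return true;
            }
        }
        onPath.remove(pkg);
        finished.add(pkg);
        return false;
    }

    public static void main(String[] args) {
        // Illustrative package names only.
        Map<String, List<String>> tangled = Map.of(
                "app.ui", List.of("app.model"),
                "app.model", List.of("app.ui"));
        System.out.println(hasCycle(tangled));
    }
}
```

The second measure, complexity at each point of breakout, would need a separate metric and is not covered by this sketch.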

2 depends on 1, but goes a step further. It requires the inspired human touch. Herein lies the beauty.

The bottom line is that structure and architecture are an intrinsic part of the code, and any discussion of code "beauty" without them isn't going to work for today's monster systems.

Spring Framework 2.1 M3 Architecture

Here are some architecture diagrams for Spring Framework 2.1 M3 (released yesterday). You can point the (free) structure101 plug-in at these and get IDE warnings if your customizations break Juergen's architecture.

Here is the top-level breakout of org.springframework:

Springarchitecture

Structure101 created this from the physical code-base. All the cells in the diagram use only lower-level cells. With such a clean structure, I did no further editing of the diagram, other than to adjust the level of nesting.

Below is a further breakout of some of the larger modules.

org.springframework.aop:

Springaop

org.springframework.beans:

Springbeans

org.springframework.jdbc:

Springjdbc

org.springframework.jms:

Springjms

org.springframework.orm:

Springorm

org.springframework.web:

Springweb

You can view these online here (I'll update later today). If you have a Spring-based project, you can install the structure101 Eclipse or IntelliJ plug-in (free from here) and point it to the Spring project in the online repository (use this url: http://www.structure101.com/java/data in the plug-in properties). The diagrams will then be visible inside your IDE, any existing violations will be flagged (i.e. if you have created any upward dependencies), and you will be warned if and when you make code changes that are inconsistent with the layering.

This is a new-ish feature - please email me directly and let me know how you got on or if you have questions.

Code Organization Guidelines for Large Code Bases

In an excellent on-line presentation Juergen Hoeller gives rationale and guidelines for controlling the structure of large, evolving code-bases. Juergen is the chief architect of the Spring framework, which as I have previously pointed out is structurally almost perfect. This didn't happen by accident.

If you don't have time to go through the 88-minute presentation, here is a nice synopsis by Mike Nereson.