Traits in #NGIS: Cautious Optimism on the #Documentum Back End

Traits are a total break from the traditional object model in Documentum.  With the Next Generation Information Server (NGIS) being built from the ground up with new technologies and design principles, EMC could seize the opportunity to transform rather than merely update server-side Documentum as we know it.  However, such breaks are rarely clean and can penalize existing users with migration, training, and change management issues.

Previously on DCTM: The Next Generation

Yesterday’s Docbase

There has been a growing disconnect between how we design for the back end and the front end. The front-end developer can apply agile methodologies, use patterns like composition over inheritance, and organize data based on tags and facets instead of deep hierarchies.  Back-end developers are getting these benefits from new systems like NoSQL databases, but Documentum architects are stuck in the 90s with the current content server, in no small part because it runs on relational databases like Oracle.  Documentum did an excellent job marrying objects and tables before ORM (object-relational mapping) was a catchy initialism; the inherent disparity between the models was inconvenient before, but now it actively hinders new ideas on the back end.  Basing NGIS on EMC’s XML database xDB changes all that, and traits are a windfall from that decision.

The latest details on traits come from a video by Jeroen van Rotterdam, IIG’s Chief Architect.  There are two basic constructs for designing with traits:  Trait definitions group data definitions, services, and event models into packages that are attached to objects at runtime.  Type definitions control which traits can or must be attached to objects and how those traits resolve conflicts; e.g., order of precedence or mutual exclusion.  Objects end up being truly lightweight: only an ID attribute for sure and very probably a type attribute as well.  This will be a very different world from the present-day Documentum schema of data-only inheritance, tacked-on type-aware behavior, and a bloated base object type with every possible attribute stuffed into it.
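To make those two constructs concrete, here is a minimal Python sketch of how I picture them fitting together.  None of this is NGIS API: every class, method, and trait name below is invented for illustration, and treating "order of precedence" as "the first listed trait that defines an attribute wins" is my own assumption.

```python
from dataclasses import dataclass, field


@dataclass
class TraitDefinition:
    """Hypothetical trait: a named package of attribute values."""
    name: str
    attributes: dict


@dataclass
class TypeDefinition:
    """Hypothetical type: rules about which traits an object may carry."""
    name: str
    required: frozenset = frozenset()   # traits that must be attached
    allowed: frozenset = frozenset()    # traits that may be attached
    exclusive: tuple = ()               # groups of traits that cannot coexist
    precedence: tuple = ()              # earlier names win attribute conflicts

    def check_attach(self, attached, trait_name):
        if trait_name not in (self.required | self.allowed):
            raise ValueError(f"{trait_name!r} is not permitted on type {self.name!r}")
        for group in self.exclusive:
            if trait_name in group:
                clash = [t for t in attached if t in group and t != trait_name]
                if clash:
                    raise ValueError(f"{trait_name!r} conflicts with {clash[0]!r}")


@dataclass
class LightweightObject:
    """Little more than an ID, a type, and a container of traits."""
    object_id: str
    type_def: TypeDefinition
    traits: dict = field(default_factory=dict)

    def attach(self, trait):
        # Traits are attached at runtime, subject to the type's rules.
        self.type_def.check_attach(self.traits, trait.name)
        self.traits[trait.name] = trait

    def missing_required(self):
        return set(self.type_def.required) - set(self.traits)

    def get(self, attr):
        # Consult traits in precedence order, then the rest in attachment order.
        ordered = [n for n in self.type_def.precedence if n in self.traits]
        ordered += [n for n in self.traits if n not in ordered]
        for name in ordered:
            if attr in self.traits[name].attributes:
                return self.traits[name].attributes[attr]
        raise KeyError(attr)


retention = TraitDefinition("retention", {"owner": "records", "retain_years": 7})
review = TraitDefinition("review", {"owner": "legal"})
draft = TraitDefinition("draft", {"status": "draft"})
final = TraitDefinition("final", {"status": "final"})

contract = TypeDefinition(
    name="contract",
    required=frozenset({"retention"}),
    allowed=frozenset({"review", "draft", "final"}),
    exclusive=(("draft", "final"),),
    precedence=("review", "retention"),
)

doc = LightweightObject("obj-1", contract)
for trait in (retention, review, draft):
    doc.attach(trait)
```

In this sketch both `retention` and `review` define `owner`, and the type's precedence list decides that `review` wins; attaching `final` to an object that already carries `draft` is rejected by the mutual-exclusion rule.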

For Documentum architects and server-side programmers, this is the first exciting thing to happen to Documentum since, well, forever.  The model is one of the things that made Documentum stand out; it painted a rich, complex picture of what a document could be. Instead of document as file, a document included versions, renditions, and complex structures with flexible binding (i.e., virtual documents).  A document was a different thing (or collection of things) based on context and function.  It also stepped beyond the straitjacket of the relational model to include multi-value (repeating) attributes and SQL extensions that recognized the world is more like objects than tables.  Although this perspective on document management is still valid today, the methodologies to realize it feel outdated. Traits represent almost two decades of lessons learned since the first docbase schemas struggled out of the primordial scanned-document/shared-folder ooze.

Changes of this magnitude don’t happen often in software systems.  A mature market doesn’t take well to fundamental shifts that aren’t backward compatible, so there’s significant risk here, assuming EMC wants to keep its current customers happy. Discussions about migration paths from the current content server to NGIS are still little more than speculation. Given how many customers are turning to other options, being bold may be the solution if EMC can make a better product with a less onerous migration path.  Let’s put aside those messy details for now and consider what traits may mean for the Documentum architect.

A Few of My Favorite Things

It’s hard to say from what we know so far whether the implementation of traits will live up to the potential.  I had similar high hopes for DBOF, light-weight objects, Aspects, and DFS when they were talked about in theory, but they all fell short in implementation, either in outright design or by being born prematurely.  Letting my inner optimist out of his cage for a moment, these are some things I hope to see the implementation of traits bring about:

Attributes become rich data rather than columns in a database. Defining and constraining attributes using XSD is an easy win since NGIS sits atop EMC’s XML-based xDB instead of an RDBMS, and it gets rid of the stapled-on data dictionary in the current Documentum toolbox.  I hope this means that attributes can contain complex data (perhaps small XML documents and certainly associative arrays), but removing size constraints and character set problems (odd escape characters, special characters, Unicode, etc.) is a huge win regardless. Maybe this will help people realize that attributes aren’t all metadata.  Sometimes, they are the data; e.g., non-content objects.

The broken promises of DBOF and Aspects can finally be fully realized.  A weakness in Documentum’s original model was the separation of data and behavior.  Being object-based, there were familiar ways to organize data (e.g., inheritance) that fell short of real OOP because they didn’t equally apply to behavior.  DBOF was the first attempt to relate code to the type hierarchy; it was hobbled by being duct-taped onto the side of the client libraries (DFC) instead of integrated into the server directly.  Then Aspects repeated DBOF’s same fundamental mistakes.  Traits appear to marry data and behavior nicely, and I can’t imagine them being handled anywhere but together on the server.

Traits eliminate the need for inheritance and fat objects.  Inheritance as a design principle has been under fire for years:  Single inheritance is restricting; multiple inheritance introduces pitfalls along with greater flexibility; both are difficult to refactor.  The single inheritance nature of the Documentum object model wasn’t as restrictive then because it only included data.  With behavior coming along for the ride in NGIS, it becomes a bigger issue.  Just like in programming, it looks like EMC is favoring composition over inheritance.  I haven’t seen anything showing inheritance for types or traits in NGIS yet, and I won’t be surprised if none turns up.  Another consequence of this composition-based approach is that the object becomes really, really lightweight, little more than identity on one end and trait container on the other.  That looks surprisingly like the really, really lightweight document in MongoDB with its single default _id attribute.

The Documentum architect gets a whole new (NoSQL) toolbox to build and extend systems. Although Documentum grew out of the relational database culture of the time, it broke major relational conventions because the document management problem space felt more object-oriented than table-oriented; i.e., repeating attributes, virtual documents, object-ish extensions to SQL.  Now that NoSQL is here, it’s obviously a better fit:  Documentum architects taking a look at 10gen’s webinar on schema design in MongoDB should quickly recognize the familiar ground.  What’s new is how dynamic and real-time the data model becomes with traits: they are only attached as needed at runtime, and objects can carry different versions of a trait at the same time, allowing for upgrades in a conditional or rolling manner.  The docbase will no longer be where content goes to die; it becomes a Livin’ Thing. Hopefully sizing, deploying, and scaling NGIS systems will also look more NoSQL than relational, but traits don’t give us any hints there.
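Here is a rough Python sketch of what a conditional, rolling trait upgrade might look like.  It is purely speculative: the version registry, the `RepoObject` class, the `retention` trait, and both upgrade functions are names I made up, and nothing public says NGIS will work this way.

```python
from dataclasses import dataclass, field

# Hypothetical registry: (trait name, version) -> default attribute values.
TRAIT_VERSIONS = {
    ("retention", 1): {"retain_years": 7},
    ("retention", 2): {"retain_years": 7, "legal_hold": False},
}


@dataclass
class RepoObject:
    """An ID plus whatever traits have been attached so far."""
    object_id: str
    traits: dict = field(default_factory=dict)  # name -> (version, values)

    def attach(self, name, version, **overrides):
        values = dict(TRAIT_VERSIONS[(name, version)])
        values.update(overrides)
        self.traits[name] = (version, values)


def upgrade(obj, name, to_version):
    """Move one object to a newer trait version, keeping its existing values."""
    version, values = obj.traits[name]
    if version >= to_version:
        return False
    merged = dict(TRAIT_VERSIONS[(name, to_version)])
    merged.update(values)
    obj.traits[name] = (to_version, merged)
    return True


def rolling_upgrade(objects, name, to_version, should_upgrade):
    """Upgrade only the objects that carry the trait and pass the condition."""
    upgraded = 0
    for obj in objects:
        if name in obj.traits and should_upgrade(obj) and upgrade(obj, name, to_version):
            upgraded += 1
    return upgraded


a, b, c = RepoObject("a"), RepoObject("b"), RepoObject("c")
a.attach("retention", 1)
b.attach("retention", 1, retain_years=10)
# c never gets the trait at all.

count = rolling_upgrade(
    [a, b, c],
    "retention",
    2,
    should_upgrade=lambda o: o.traits["retention"][1]["retain_years"] > 7,
)
```

After this runs, `a` still carries version 1 of the trait and `b` carries version 2, side by side in the same repository, which is the kind of mixed state a relational schema change would not tolerate.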

Cautious Optimism

I’ll admit to some cautious optimism here; traits may be a sign that NGIS will make Documentum architecture interesting again.  Beyond just updating the toolbox, the new capabilities may inspire organizations to start solving new, interesting problems again. Documentum did a good job of getting people to think about document management the first time around, but now everybody “knows” what document management is and what Documentum is “good for”.  Traits hint that NGIS could be a game changer on the much-less-talked-about back end; it won’t directly delight the New User, but a pretty front end on a shaky server foundation won’t be particularly useful to the New User. It’s up to EMC to get the implementation right and get the word out.

 

Based on the blog post Next Generation Information Server: Traits Explained | Jeroen’s Crazy Content and video:

http://www.youtube.com/watch?feature=player_embedded&v=jpPdtfwjmc4