Maarten's Sense of Data

This blog was created to stimulate discussion and the exchange of ideas about all sorts of subjects related to the fascinating world of data. I would like to use the blog for organizing my own thoughts and ideas, and for inviting other people who share an interest in the intriguing world of data to react to the postings. So please feel free to react and contribute! Disclaimer: This blog represents my own personal views, and not those of my employer or any third party.

Friday, September 29, 2006

EDA: are you polling for states or are you interrupted by events?

At NS, we are preparing our organization for Event Driven Architecture, or EDA. There’s a growing interest in this subject in the literature, but some questions remain open. One of them is what we should actually be interested in once we have adopted EDA as the main approach to designing and thinking about information processes and the automated systems meant to support them.

There seems to be some obscurity in the realm of Event Driven Architecture, or EDA. What are the building blocks of EDA? What should we be interested in, if we enter this event driven world? Are they states in our environment, or are they things called ‘events’? Or do these two terms refer to the same thing?

It has been common practice for a long time to let states be the things that drive systems to act or react. People and other systems are oftentimes perceived to be reacting to states in their environment, once noticed, of course. If the traffic light is green (which is a state), you drive, right? And if you notice that a customer has ordered something (yet another state), you deliver. States inform you about appropriate actions to perform.

But there’s a drawback to the use of these states. Though states can tell you what to do, they don’t tell you when to react. Are other drivers blowing their horns because the traffic light has been green for a while before you accelerate? Did your customer turn to your competitor because you didn’t react quickly enough to her order?

If interpreted appropriately, events imply conditions in the environment just as states do. But events offer a bit more than states: they carry information about timeliness as well. It shouldn’t be your perception of a state in your environment that you react to or act upon. You should be reacting to perceived events. That’s what EDA is about, IMHO. You needn’t be polling your environment for states of interest to you. Instead, you should let yourself be interrupted by events in your environment.
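The difference between polling for a state and being interrupted by an event can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TrafficLight` class, the `simulate` function and the notion of discrete clock "ticks" are all made up for this example and do not come from any real EDA product.

```python
from typing import Callable, List, Optional, Tuple


class TrafficLight:
    """Hypothetical light: holds a state and announces every change of it."""

    def __init__(self) -> None:
        self.colour = "red"  # the state a poller has to go and look at
        self._listeners: List[Callable[[str, int], None]] = []

    def subscribe(self, listener: Callable[[str, int], None]) -> None:
        self._listeners.append(listener)

    def change_to(self, colour: str, tick: int) -> None:
        self.colour = colour
        for listener in self._listeners:
            listener(colour, tick)  # interrupt subscribers at the moment of change


def simulate(green_at: int, poll_every: int, total_ticks: int) -> Tuple[Optional[int], Optional[int]]:
    """Return (tick at which the poller saw green, tick at which the subscriber was told)."""
    light = TrafficLight()
    told_at: List[int] = []

    def on_change(colour: str, tick: int) -> None:
        if colour == "green":
            told_at.append(tick)

    light.subscribe(on_change)
    seen_at: Optional[int] = None
    for tick in range(total_ticks):
        if tick == green_at:
            light.change_to("green", tick)
        # The poller only looks at the state every `poll_every` ticks.
        if seen_at is None and tick % poll_every == 0 and light.colour == "green":
            seen_at = tick
    return seen_at, (told_at[0] if told_at else None)


polled, interrupted = simulate(green_at=3, poll_every=5, total_ticks=10)
print(polled, interrupted)  # 5 3
```

In this toy run the light turns green at tick 3. The subscriber is told at tick 3, while the poller only finds out at tick 5, its next scheduled look. Both end up knowing the same state; only the event-driven one knows it in time.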

This story implies, however, that in order to fully leverage the events you perceive, you need to understand more than what these events themselves are about. You also need to understand events in their context, in terms of their implications. You must be able to infer the conditions they create. You must form an idea about what these events mean to you. This way, events inform you about what happened, when it happened, and what state resulted, thereby enabling you to select the appropriate reactions AND the timeliness needed to be agile. In other words: they give you the what and the when.
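As a rough sketch of an event that carries the what, the when and the resulting state, consider the following Python fragment. The `Event` fields, the `choose_reaction` function, the state name "order open" and the one-hour deadline are hypothetical choices made for this illustration only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    what: str             # what happened, e.g. "OrderPlaced"
    when: datetime        # when it happened
    resulting_state: str  # the condition the event created


def choose_reaction(event: Event, now: datetime, deadline: timedelta) -> str:
    """Pick the reaction from the resulting state and its urgency from the event's age."""
    age = now - event.when
    if event.resulting_state == "order open":
        return "deliver" if age <= deadline else "deliver and apologise"
    return "ignore"


placed = Event("OrderPlaced", datetime(2006, 9, 29, 9, 0), "order open")
print(choose_reaction(placed, datetime(2006, 9, 29, 9, 30), timedelta(hours=1)))
print(choose_reaction(placed, datetime(2006, 9, 29, 11, 0), timedelta(hours=1)))
```

The state alone ("order open") would select the same action in both calls. It is the `when` of the event that tells the receiver whether it is still on time.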

So if you ask me, I would be much more interested in events than in states, because events are more informative, providing you with the triggering logic you need to be agile.

This view comes with a consistent naming convention for events. I hope to present you with an insight into this naming convention soon.

Tuesday, September 12, 2006

Are you a Symbol Juggler?

People in organizations say they are convinced that their data are their primary business asset, or at least a very important one. When you examine the way people use this asset, a very different picture emerges. Saying your data are important is something completely different from acting on that insight. One of the roadblocks that needs to be cleared is the symbol juggling practice you can find in all organizations.

There’s something tricky about data. What you see of them, as it happens, is not what they are all about. The only things you really see are symbols that represent the data in a certain context, presented on a monitor, on a piece of paper, or on some other medium.

But the data in fact ARE NOT symbols. They are propositions about objects you perceive in your reality. Data describe a world, whether it is real or fictional. It’s just unfortunate that we need symbols for visualizing these descriptions.

The way we use symbols to represent and present data to a perceiver may easily lead you astray. It tends to make you believe that you’re only acting upon some values (the symbols!) that are arranged in some way and in a specific context, like this table in this database. The idea that you are actually manipulating descriptions of some reality, descriptions that are supposed to be true, often doesn’t come to mind.

Although presentations of large sets of data can be quite helpful to people now and then, for example for quick searches when analyzing troublesome data, they prevent us from constructing the semantics of the data in our heads. Our brains are just not powerful enough to handle large amounts of information with deep understanding, especially when handling a data set over a significant period of time.

The danger of all this is that people using the data might make wrong decisions that don’t match reality or are not appropriate for the real situation at hand.

If you symbol-juggle instead of handling semantics carefully, you’re simply not aware of what you’re actually doing, and you might harm reality by giving a description of the world that is not correct. As a consequence of this, you or someone else might end up making the wrong decisions.

People who do this are either mindless symbol jugglers who don’t know what they are doing, or evil manipulators who know it all too well! In my opinion, these evil types belong in jail, whereas the mindless jugglers should take a course on a topic like Data Awareness!

But that probably isn’t a complete solution to this problem. I think we can do better. Data Awareness can also be good advice for those who design user interfaces. I’m sure design guidelines can be found that help human users correctly handle the information that systems present to them. Any ideas?

I believe certain sorts of data maltreatment could be prevented if only people improved their sense of data and our systems were adapted to this juggling tendency. Because if there is one lesson to be learned from decades of automation, it is that creative users often act according to a rule like this one:

IF I CAN DO THIS TRICK, I WILL DO THIS TRICK, REGARDLESS OF WHETHER OR NOT I SHOULD DO THIS TRICK!

Hey, she's got to do her work!

Yeah, that’s our juggler!

Tuesday, September 05, 2006

The Architect

In IT, we are more and more aware of the need for flexibility. In the systems integration domain, this is one of our main concepts. Sometimes there are very simple techniques to achieve a goal. One of them, often used but not often enough, is the subject of this posting.

An architect, of whatever sort, is a remarkable person. She has a number of powerful tools at her disposal. The one that intrigues me the most is her ability to distinguish between things that, in her eyes, are different in nature and have different concerns. She can then conclude that you should separate these assets, and she can advise you on how to relate the separated assets so that you can continue your work, albeit better than before, of course.

This ability is highly underestimated. In my opinion, however, it might be her primary trick. I think of an architect as a person who brings order, by preventing people who are too close to reality to be able to survey it from throwing everything of interest to them on one big heap.

Some people aren’t very good at considering what exactly they are doing and how. And if you are doing the actual work in any kind of domain, you tend to put everything you need within close range, especially when you don’t find the time now and then to sit back and reorganize. If you can’t do this, for whatever reason, you need an architect!

In my job, I frequently feel the need to separate. It has become second nature to me. Am I becoming an architect? Not so long ago, I didn’t have much affinity with these extraordinarily wise people, who lived on a cloud high above me, far out of reach. But this urge of mine to separate-and-relate makes me see them quite differently now: it feels like an elevator is bringing me closer to the heavens of the gods.

Some things REALLY ought to be separated. If not, they get you into trouble. You can probably think of some. Would you feel comfortable in a house without any internal walls to divide your toilet from your living room? How about a city with an airport next to your favorite restaurant? Do you keep your financial administration paperwork between your novels on the same bookshelf? Maybe you once did, as a student, and your father took the role of an architect, unasked, for your sake…

Though good architects can advise you beforehand to prevent you getting into trouble, it’s sad to see that they are most often consulted only when you’ve already been plagued for a long time.

Information Technology has its own nasty plagues. But it seems to be very difficult to get a grip on them and make them explicit. Things get even more difficult when you start looking for their causes. Sometimes it’s better to take an architect’s intuition and indicate a solution-for-almost-everything. I know of such a medicine. I would recommend it to everyone in IT, especially those who have a say in this. So, if you’re experiencing any feelings of discomfort in your work, try this:

SEPARATE CONCEPTS FROM THEIR TECHNICAL REALIZATIONS!

I’m glad I did this while designing our data modeling world. I thought it wise to separate the conceptual aspects of data from the technical ones. And of course, wherever you separate, you should relate, or should I say re-relate. This is how the concept of the corridor was invented. Or, closer to IT, the concept of middleware. In our CDM, this re-relation is done by our third type of data model: The Realization Model (see also the picture in Where do you live and what do you do? -Why, does it matter?). Do you know of any other kind of re-relaters, things specialized in exactly this function? You know, the world is full of them, thanks to our friends on the clouds.

Friday, September 01, 2006

Where do you live and what do you do? –Why, does it matter?

Many organizations keep multiple data models, each one specialized for a specific use. These models are rarely kept in harmony. This situation is undesirable: it results in a large number of overlapping but inconsistent models, and maintaining them is costly. This posting puts forward a CDM as a possible solution to this problem. At least, it makes a start.

Data live in many different houses. It’s where we put them: in external memories (in database tables or in data files) for data storage and retrieval, in message fields for data exchange, in user interface windows and on paper printouts for data presentation to human users, in microprocessor registers for direct data processing and in internal memory structures for indirect data processing. And I don’t pretend to be exhaustive here.

So, we can do many different things with data, and as a consequence of that, we put them in different places. For anything we can do with our data, one place is more suitable than another. Processing data directly from a hard drive? Better not. Storing data in internal memory? Risky. And automated systems read data from a piece of paper reliably and efficiently only with a lot of effort.

What’s the constant factor in this story? It’s the data themselves. Whatever we intend to do with them, wherever we put them, they still remain the same data. At least, their semantics remain the same. It’s only after some processing has been carried out on the data, that we can expect a change in semantics in the results.

There are, however, some aspects to our data that may very well be adapted to what we want to do with them or to where we want to put them. These aspects have nothing to do with data semantics. Instead, they are the citizens of the technical data model, describing the technical formats of the data. Data oftentimes take on a different format when moving to another house, just like you would put on a jacket when leaving home for work.
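To make the jacket metaphor concrete, here is a small Python sketch in which one datum, a date, moves between three houses and changes only its technical format. The three formats and all function names are assumptions chosen for illustration; they are not taken from our actual technical models.

```python
from datetime import date, datetime

# The datum itself: one meaning, independent of where it lives.
order_date = date(2006, 9, 1)


def to_storage(d: date) -> str:
    """Format assumed for a database column."""
    return d.isoformat()


def to_message(d: date) -> str:
    """Format assumed for a message field."""
    return d.strftime("%Y%m%d")


def to_screen(d: date) -> str:
    """Format assumed for presentation to a human user."""
    return d.strftime("%d-%m-%Y")


def from_storage(text: str) -> date:
    return date.fromisoformat(text)


def from_message(text: str) -> date:
    return datetime.strptime(text, "%Y%m%d").date()


print(to_storage(order_date))  # 2006-09-01
print(to_message(order_date))  # 20060901
print(to_screen(order_date))   # 01-09-2006

# Taking the jacket off again gives back exactly the same datum.
print(from_storage(to_storage(order_date)) == from_message(to_message(order_date)) == order_date)  # True
```

Three different strings, one and the same date. The formats belong to the houses; the meaning belongs to the data.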

So where is all this leading? Well, this story contributed to the global design of our world of canonical data modeling. We make an explicit distinction between semantics and technical formats, and we use different model types for them. Semantics are described in a conceptual model, whereas descriptions of technical formats can be found in a technical model. Because we model data living in a large number of houses, even houses of different types, we create multiple technical models.

But what about the conceptual model? Well, there is only one. For the data we want to describe, it doesn’t matter where they are, or what we intend to do with them. This one conceptual model supports many of our Applications of a CDM. For example, it eases correct data mapping (see app 2), and it gives a practical data catalogue to search for data living anywhere (see app 3).

Oh yes, there’s a third type of model involved, one that relates the semantics to the technical formats. It describes the relationships between the elements in the technical models and the elements in the conceptual model.
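The three model types can be sketched as plain Python data structures. Everything named here is hypothetical: the concepts, the two houses (`orders_db` and `order_message`), their element names and formats, and the helper functions `where_lives` and `map_between` are invented for this illustration and are not our actual CDM.

```python
from typing import Dict, List, Tuple

# The one conceptual model: what the data mean.
CONCEPTUAL_MODEL: Dict[str, str] = {
    "customer name": "The name by which a customer is known",
    "order date": "The calendar date on which an order was placed",
}

# One technical model per house: element name -> technical format.
TECHNICAL_MODELS: Dict[str, Dict[str, str]] = {
    "orders_db": {"CUST_NM": "VARCHAR(80)", "ORD_DT": "DATE"},
    "order_message": {"customerName": "xs:string", "orderDate": "xs:date"},
}

# The realization model: (house, technical element, conceptual element).
REALIZATION_MODEL: List[Tuple[str, str, str]] = [
    ("orders_db", "CUST_NM", "customer name"),
    ("orders_db", "ORD_DT", "order date"),
    ("order_message", "customerName", "customer name"),
    ("order_message", "orderDate", "order date"),
]


def where_lives(concept: str) -> List[Tuple[str, str]]:
    """Data catalogue lookup: every house and element that realizes a concept."""
    return [(house, element) for house, element, c in REALIZATION_MODEL if c == concept]


def map_between(source: str, target: str) -> Dict[str, str]:
    """Data mapping: pair up elements of two houses that realize the same concept."""
    target_by_concept = {c: element for house, element, c in REALIZATION_MODEL if house == target}
    return {
        element: target_by_concept[c]
        for house, element, c in REALIZATION_MODEL
        if house == source and c in target_by_concept
    }


print(where_lives("order date"))
print(map_between("orders_db", "order_message"))
```

Note that neither technical model knows about the other. The mapping between them is derived entirely through the shared conceptual model, which is the point of keeping only one.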

Your data systems, whether they are business applications, databases, data warehouses, message types or any other thing that your data can live in, should all have their own technical data model, all linked to your one-and-only CDM! This will help reduce cost, and it will give your data models more value.
