This post is the third in my “5 Keys” to EIM series, covering the core principles of enterprise information management (EIM):
1 – Introduction to the “5 Keys” Series
2 – The “5 Keys” to Data Governance
Today we’re looking at data architecture management – simply defined as “most of what needs to be done before getting started.” This includes developing a data model, planning out databases and table structures, and figuring out the key data flows and integrations that will need to happen between systems. Much of the detailed work will happen in the other EIM disciplines, but data architecture management ensures that there is a holistic method to the madness – that’s the intent, anyway.
Figure 1: 10 Data Management Disciplines (adapted from the Data Management Association)
Let’s look at the 5 keys to making data architecture management successful:
KEY 1: Approach data architecture like building architecture: seriously!
Data architecture management is one of the most tempting areas for businesses to underinvest in – in fact, technology architects of all kinds seem to be universally underappreciated. After all, architects don’t really do much that others understand – producing standards, structures, and frameworks that benefit mostly other technical folks. That makes demonstrating value to the business a tough marketing challenge!
I think that with the right analogy, it becomes quite simple for those of us on the business side to get it: take the technical adjectives away from the word “architecture” and think of traditional building architects. What do they do? They ensure that what the bricks make meets your needs, looks beautiful, and doesn’t fall apart. Replace the bricks with bits of data and there you go! No sane person would start building a new office without proper architecture, so likewise we should not ask our data developers and others to provide for our information needs without sufficient data architecture management.
KEY 2: Fundamentally changing architecture during the build phase increases time, cost, and risk
We might as well extend the previous analogy: what happens if we get halfway done building a new office, and then decide that our original requirements and design were wrong? It takes time to analyze the new needs, design a revised solution, and then come up with a plan to take the current build state and transform it into the new design. The time and cost implications of this are obvious, but the risk implications may not be so clear.
The reason risk increases is that we are no longer starting from a fully known state – we are making assumptions that the builders did exactly what was specified, or any deviations were fully documented. The time pressure also likely rushes the analysis and related decisions, which increases the likelihood of mistakes. Then, once the build phase resumes, there will be a temptation to work faster than optimal to reach deadlines that haven’t been adequately extended. All of these are risks!
This is not to suggest that we should never fundamentally change data architecture in-flight – sometimes we must. The takeaway here is that we should understand the full implications of our options, and then make the decision that optimizes the return on investment going forward.
KEY 3: Perfection is impossible – do not believe a data architect who insists otherwise
I’ve read far too many things that would put reasonable people to sleep immediately. Far too many of these involve data architecture management, and the drowsiest of the bunch typically take a “one right way” approach to the topic. The basis of this is that best practices have been determined, and if you don’t follow them perfectly, your organization is doomed and is going to go bankrupt immediately. My counter-argument to these theorists is that everything else about IT organizations seems to have flaws, so why are we insisting that our data architecture be perfect?
The point here is that it is okay to be pragmatic in your approach to data architecture. It is fine to architect in enough detail today to get your development team moving, and then refine and extend as you go. The most talented data architects will address the pieces with the most dependencies first, without limiting the extensibility of their solutions (which would increase the risks we talked about in Key 2).
KEY 4: Third-normal form has its place, but so do criminals – do not allow either to override sound logical reasoning and cost/benefit analysis
Third-normal form (3NF) is the go-to for database architects – it is a set of rules governing how much data repetition is acceptable in a typical transactional database. If that doesn’t make sense, just know that it is a data-geek technique that attempts to store data in a way that limits errors, uses disk space efficiently, and is reasonably efficient to query. Doesn’t that sound like a compromise? It is, but the result is that 3NF is a good-enough solution in most circumstances.
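To make the “data repetition” point concrete, here is a minimal sketch of the kind of restructuring normalization implies – in plain Python rather than SQL, just to keep it self-contained, and with invented table and column names for illustration. A flat “spreadsheet-style” table repeats each customer’s details on every order; splitting it into a customers table and an orders table that references it by key means each fact is stored once:

```python
# Denormalized: each order row repeats the customer's name and city.
orders_flat = [
    {"order_id": 1, "customer": "Acme Co", "city": "Chicago", "total": 100},
    {"order_id": 2, "customer": "Acme Co", "city": "Chicago", "total": 250},
    {"order_id": 3, "customer": "Bolt Inc", "city": "Denver", "total": 75},
]

# Normalized split: customer attributes live in one place,
# and orders reference them by a surrogate key.
customers = {}  # customer name -> {"id": ..., "city": ...}
orders = []
for row in orders_flat:
    cust = customers.setdefault(
        row["customer"], {"id": len(customers) + 1, "city": row["city"]}
    )
    orders.append(
        {"order_id": row["order_id"], "customer_id": cust["id"], "total": row["total"]}
    )

# Correcting a customer's city is now a single update,
# not one update per order row (where some could be missed).
customers["Acme Co"]["city"] = "Evanston"
```

The “limits errors” benefit the paragraph mentions falls directly out of this: in the flat version, fixing Acme Co’s city means touching every one of its order rows, and any row you miss becomes an inconsistency.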
Though there are few circumstances where 3NF won’t work at all, it is not always the answer. Good data architecture management combines a thorough understanding of data normalization techniques with a complete picture of what needs to be accomplished from a business perspective, and keeps both aligned with the architecture standards already adopted within the company – it is a lot to think about!
Good data architects understand this and work hard to find the right balance between standards and exceptions – and ideally make things work so the rest of us don’t need to care how the data is stored.
KEY 5: Buying a pre-built model may expedite parts of the process, but it is no replacement for quality data architecture talent
Unless your business is EXACTLY like the business reflected in the model, some customization will be required. This fine-tuning is the most skilled part of the modeling process, and it is risky to turn it over to people who haven’t learned their lessons building the easy parts. The more a pre-built model needs to be modified, the less help it provides – and only the most experienced data architects should be tasked with transforming a pre-built model into a custom solution.
Doing a good job with data architecture management goes far beyond scripting tables and writing queries – it is about putting the right blueprints in place so your data construction workers (i.e. data developers) know what to do. My next “5 keys” post will talk more about data development, or what to do once those blueprints are written.
Anthony J. Algmin is a Manager in the Business Intelligence Practice at West Monroe Partners.