By Larry Danberger June 2, 2018
Why do we encourage the creation, documentation and distribution of data models?
Business users who do their daily jobs in areas such as customer service or operational support don’t care how the data is designed; they just use the tools and expect them to work. If that is the extent of data usage, then data models are not needed.
Most organizations also want to do various types of data processing and analysis: business intelligence, predictive analytics, data integration, and data provisioning, to name a few. Data typically moves through an organization from the operational applications to other tools used for visualizations and reports of various types. Often this requires some form of Extract, Transform, and Load (ETL) functionality to convert data from one system into another. The data may be further massaged into a usable format for the reports, or provisioned for self-service access. For either purpose, it is critical that the data be recreated accurately and completely; otherwise the reports are inaccurate and misleading. This is one place where data models should be used.
Backfilling additional attributes is very costly and time-consuming in terms of internal processes (managing enhancements, quality checks, user testing, deployment to production, etc.). Not knowing the shape of the data will result in inaccurate ETLs and reporting, and without knowing the data, data quality checks will be incomplete. If data models are not used, expect a lot of rework.
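To make the point about data shape concrete, here is a minimal sketch in Python of how a documented data model can drive a quality check on ETL output. The `Attribute` class, the `CUSTOMER_MODEL` definition, and the `check_record` function are all hypothetical names invented for illustration; a real organization would derive the attribute list from its own model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """One attribute of a business object, as documented in the data model."""
    name: str
    dtype: type
    required: bool = True


# Hypothetical model of a "customer" business object (illustration only).
CUSTOMER_MODEL = [
    Attribute("customer_id", int),
    Attribute("name", str),
    Attribute("email", str, required=False),
]


def check_record(record, model):
    """Compare one ETL output record to the model and list any problems."""
    problems = []
    for attr in model:
        value = record.get(attr.name)
        if value is None:
            # A missing optional attribute is acceptable; a required one is not.
            if attr.required:
                problems.append(f"missing required attribute: {attr.name}")
            continue
        if not isinstance(value, attr.dtype):
            problems.append(
                f"{attr.name}: expected {attr.dtype.__name__}, "
                f"got {type(value).__name__}"
            )
    # Attributes the model does not describe signal undocumented data.
    known = {attr.name for attr in model}
    for key in record:
        if key not in known:
            problems.append(f"unmodeled attribute: {key}")
    return problems
```

Calling `check_record({"customer_id": "1", "name": "Ada", "region": "EU"}, CUSTOMER_MODEL)` would report the wrongly typed `customer_id` and the unmodeled `region` attribute. Without a model to compare against, neither problem has a definition, which is why quality checks built without one tend to be incomplete.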
As illustrated below, data modeling can be performed in simple stages, each providing significant information to select audiences. Level 1 is a high-level, ‘common knowledge’ identification of what the organization does. Level 2 identifies the business objects used by each business subject area. Level 3 is often sufficient for most business decisions and activities. Level 4 is typically needed only for software integration, development, or similar IT/IS activities.
Data models are a fundamental component of Enterprise Architecture. Without a clean design, the organization is probably not running at its optimal capacity. This is like building a house: if you don’t have a plan when you start, you probably won’t end up with what you want. You can build a dog house easily, and probably a shed, but building a house without a plan can be foolhardy, and building a complex office tower without a plan could be disastrous in many ways. The complexities of a business can be similar to those of an office tower.
Common Data Models (CDMs) are becoming more relevant within large software applications, and will eventually ease the effort required for integration and support. Future generations of the web may fundamentally change how web pages from different organizations integrate with each other, sharing common models to augment data beyond anything we can easily do today.