Your (Data) Family tree

Your (Data) Family tree

As data volumes grow every day, organizations strive to create a confidence level in their data. This is becoming one of the primary tasks of the CDOs (Chief Data Officers) where they work closely with both the business users and the technology providers to ensure the accuracy, completeness and security of the enterprise newly discovered wealth (i.e., data collected/being collected).
Sometimes though, the activity of collecting and validating data sources is like sitting at the holiday dinner table and asking the elders of the family about the family tree – who the ancestors are, where they came from and where they have settled?  Since the family elders sometimes do not have the most “accurate” data of the family tree, we end up with either having conflicting messages or inaccurate picture of the family tree.
I am sure you have seen the commercials about the online software that “collects” some records coupled with user input to create an online family tree for a low fee of $9.99 per month, while providing a DNA test for additional cost.  A common question that is asked is how accurate is this information?  Would you bet your life on it?
Back to our data governance world.  The previous couple of paragraphs tells the story of every CDO who has purchased an expensive enterprise data management tool and asked his/her team to start collecting information about where the data is coming from and where it is going to?
Luckily, there are cases where the “DNA test” can help such as metadata stored in databases, some documentations, shared business vocabulary, taxonomies, ontologies and, technology to business cross-reference documentation.  However, they do not give the complete picture, especially in the cases where data is transformed and not explained or tracked while moving from point A to point B.
CDOs need to invest in the data lineage requirement when building their enterprise data management framework and focus on automating the data collection and tracking of the transformations.  Automation not only provides the speed required for the “governance” part of the CDO position, it also ensures repeatability and the continuous update of the data lineage for your organization’s critical data elements.
Read more about automated data lineage

3 VSAM to DB2 Conversion Strategy Options

3 VSAM to DB2 Conversion Strategy Options


Many organizations who run legacy application still have a large percentage of its critical data stored as VSAM based files. while VSAM as a technology has been around for many years and stood the test of time, the advantages of a relational database management system lie in a data structure that is easy to understand. specially these days with the advances in data analytics field – bigdata- , the need to have data structures that easily accessible is becoming a real issue today.


Over the years. the Mapador platform has helped many of our clients transfer they key data – such as CIF files- to a relational database – specially DB2.

here is our approach and some strategies for VSAM to DB2 conversion



  1. Determine data and process requirements
  •  Determine existing physical model.
  • Using existing VSAM files determine what are the relationships between files.
  • Output would be a physical data model
  • Determine how physical model corresponds to current business and logical model (if there is one).
  • If entity relationships are defined in current logical model and do not correspond to the VSAM physical model then impact on existing processes using vsam data need to be analyzed.
  • If entity relationships are not defined in current logical model :
    • Determine how missing entity relationships would be integrated into the current business and logical model
    • Analyze impact on processes, using VSAM data


  1. Determine Conversion Strategy
  •  If there are significant discrepancies in the models or other factors that would justify a redesign/rewrite of the application then intermediate solution would be to:
    • Define DB2 tables to correspond with VSAM records exactly.
    • Convert existing code and data to sql/db2/ (see conversion strategy 1)
  • If there are no discrepancies in the models and no other factors that would justify a redesign/rewrite of the application then solution would be to :
    • Define db2 tables to correspond with VSAM physical model
    • Convert existing code and data to SQL/DB2 (see conversion strategy 2 )
  • If there are minimal discrepancies and no other factors that would justify a redesign/rewrite of the application then solution would be to:
    • Integrate VSAM physical model into current model(s), using DB2 for physical model
    • Convert existing code and data to SQL/DB2 (see conversion strategy 3)

Conversion strategy 1

Since this strategy is intermediate term, the main focus is to convert the existing applications from Vsam to DB2 with the least amount of program logic changes. Almost all VSAM statements could be automatically or manually converted without any logic changes. (except return code checking). Consolidating multi-file reads into consolidated SQL joins would not be done since this would require more effort.

Conversion strategy 2

Since this strategy is longer term, the main focus is to convert the existing applications from VSAM to DB2 making maximum use of SQL capabilities. Some VSAM statements can be directly (automatically) converted to SQL.(read,write,delete) without any logic changes at all (except return code checking) . However, any programs that navigate through multiple files should be re-structured to be either platform independent or to coincide with the full capabilities of SQL e.g selecting only required fields, not entire record.

Conversion strategy 3

This strategy would be similar to conversion strategy 2. In addition, program changes would be required to reflect the changes in the model.

VSAM Statements   DB2 SQL Commands   Comments
Write Insert Automated
Read Select Automated
ReWrite Update Automated + ?
Delete Delete Automated + ?
Open / Close Not Applicable Automated
Start Browser Declare & Open Cursor (Single table) Automated + ?
Read Next Fetch Automated + ?
Read Prev Fetch Automated + ?
End Browse Close Cursor Automated

read more about how Mapador can automate application mass change