Automated Application Cartography for Data Lineage
As enterprise portfolios become more and more complex, understanding the underlying applications, and especially how these applications interact with each other, has become extremely difficult. To create and maintain a good level of understanding, some applications can be analyzed using existing tools, but others have limited tool support or none at all. Moreover, the knowledge about inter-application dependencies is almost always captured manually, an error-prone approach with results that quickly become outdated.
The lack of reliable understanding represents a real risk to everyday maintenance and significantly adds to the costs of IT initiatives such as modernizations, portfolio consolidations and others.
Regulatory requirements increase the demand for a solution to this problem, potentially at an even more granular level. Auditors may require proof of where the value of a particular, critical business data element originates and how that value is arrived at through potentially dozens of filters and transformations.
Developing such understanding is a daunting task when dealing with legacy-based systems that span different, even home-grown, technologies and architecture implementations. As a result, regulatory compliance (with Basel III / BCBS 239, IFRS 9, CCAR and the Volcker Rule, for example) presents an extremely difficult, potentially very costly and time-consuming challenge.
Your Business Flow
How it is implemented
Untangled by Mapador
Save error-prone documentation effort
Export to industry standard EDM tools
Mapador’s Automated Application Cartography produces a technology-agnostic view of your applications' data lineage in a graphical format. No human interaction is needed to connect technologies; it is all done out of the box.
Every node/step captures:
What is Data Lineage
Automated Application Cartography (AAC)
Automated Application Cartography uses sophisticated parsing engines to automatically analyze individual programming languages, accompanying scripts and operational components based on their syntax, regardless of implementing technologies. It then outputs all results into a common database and connects the various elements regardless of what technology type they originated from.
Using available relationship and impact views, as well as the accompanying graphical maps, IT personnel can quickly and reliably understand applications and portfolios. Impacts can be correctly understood in seconds, instead of potentially requiring days of manual analysis.
Just as a traditional street map helps you navigate from point A to point B, Automated Application Cartography allows you to understand impacts within and across your applications. A component within the application (e.g. a program, paragraph, database table, etc.) is connected to other components via the programming statements and constructs used. This is similar to how a given location of a traditional map is connected to another location via streets and avenues. And just as a traditional map provides higher level contexts (e.g. city, province, country, world), Automated Application Cartography organizes lower level components into increasingly higher level groups (e.g. subsystems, systems, portfolios).
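The map analogy above can be sketched as a small directed graph. This is only an illustration under stated assumptions, not Mapador's implementation: the `ComponentGraph` class, its method names and the sample component names are all hypothetical, and the grouping hierarchy is assumed to be a simple tree.

```python
from collections import deque


class ComponentGraph:
    """Toy model: components are nodes, programming constructs are edges."""

    def __init__(self):
        self.edges = {}   # component -> set of components it feeds
        self.parent = {}  # component or group -> enclosing higher-level group

    def connect(self, source, target):
        """Record that `source` affects `target` (e.g. a program writes a table)."""
        self.edges.setdefault(source, set()).add(target)

    def group(self, component, parent):
        """Place a component or group inside a higher-level group."""
        self.parent[component] = parent

    def impacts(self, start):
        """Return every component reachable from `start` (breadth-first)."""
        seen = set()
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in self.edges.get(node, ()):
                if nxt != start and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def context(self, component):
        """Return the enclosing groups, from lowest to highest level."""
        chain = []
        while component in self.parent:
            component = self.parent[component]
            chain.append(component)
        return chain


# Hypothetical sample data, for illustration only.
graph = ComponentGraph()
graph.connect("PAYROLL program", "EMPLOYEE table")
graph.connect("EMPLOYEE table", "HR-REPORT program")
graph.connect("HR-REPORT program", "REPORT file")
graph.group("PAYROLL program", "Payroll subsystem")
graph.group("Payroll subsystem", "HR system")
graph.group("HR system", "Corporate portfolio")

impacted = graph.impacts("PAYROLL program")
levels = graph.context("PAYROLL program")
```

Here `impacted` holds everything downstream of the program, and `levels` lists the subsystem, system and portfolio that contain it, mirroring the street-level and country-level views of a map.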
Best of all, once installed, the underlying repository can be continuously kept up to date by synchronizing its contents with production updates, thus providing an on-going, correct reflection of the entire application portfolio.
Mapador’s Automated Application Cartography has proven to provide savings of 25-30% of maintenance budgets, year over year.
Automated Application Cartography provides the necessary base for tracing a given data element across multiple systems with the highest degree of confidence and agility. It records the transformations a data element is exposed to and understands the flow and order of all processes in which such transformations occur.
Mapador uses its Deep Parsing engine to accomplish all such tracing. This engine allows for specific targets to be defined for tracing and queries Mapador’s own repository to find the given data lineage.
The ‘Target Definition’ can define components directly (e.g. a specific field name), by a generic name (e.g. fields with certain characters in their name), by a generic type (e.g. fields with a certain definition) or by a combination of these.
However, by relying on a language’s actual syntax for parsing, Mapador does not depend on naming conventions or any given variable name. These are only used as a starting point for impact analysis. For example, one could ask for all impacts of a field named ‘Customer-Number’. This name would be used as input to the analysis only. After that, regardless of how the fields are named in the impact chain, Mapador will find all impacts.
As described above, Deep Parsing can also be initiated by defining a certain field definition as the target. In this case, variable naming is completely irrelevant.
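The kinds of Target Definition described above (a specific name, a generic name, a generic type, or a combination) could be modeled roughly as follows. This is a hedged sketch: the `Field` and `TargetDefinition` classes, the wildcard syntax and the sample COBOL-style definitions are assumptions made for illustration, not Mapador's actual target format.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class Field:
    name: str
    definition: str  # e.g. a COBOL picture clause (illustrative)


@dataclass(frozen=True)
class TargetDefinition:
    """Hypothetical target: every criterion that is set must match."""
    name: Optional[str] = None          # a specific field name
    name_pattern: Optional[str] = None  # a generic name, shell-style wildcard
    definition: Optional[str] = None    # a generic type / field definition

    def matches(self, field: Field) -> bool:
        if self.name is not None and field.name != self.name:
            return False
        if self.name_pattern is not None and not fnmatchcase(field.name, self.name_pattern):
            return False
        if self.definition is not None and field.definition != self.definition:
            return False
        return True


def select(target, fields):
    """Return the names of all fields matching the target, in input order."""
    return [f.name for f in fields if target.matches(f)]


# Hypothetical sample fields, for illustration only.
fields = [
    Field("Customer-Number", "PIC 9(8)"),
    Field("WS-CUST-NO", "PIC 9(8)"),
    Field("Customer-Name", "PIC X(30)"),
]

by_name = select(TargetDefinition(name="Customer-Number"), fields)
by_pattern = select(TargetDefinition(name_pattern="Customer-*"), fields)
by_type = select(TargetDefinition(definition="PIC 9(8)"), fields)
combined = select(TargetDefinition(name_pattern="Customer-*", definition="PIC 9(8)"), fields)
```

Note that `by_type` picks up `WS-CUST-NO` even though its name shares nothing with `Customer-Number`, which is the point made above: when the target is a field definition, naming is irrelevant. These matches only seed the analysis; following the impact chain is a separate step.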
Deep Parsing outputs a Target Impact List, comprising all components matching the Target Definition as well as all their affected components, organized by technology type. This information is saved in the Repository for further processing.
Deep Parsing is a recursive process. From the target list, components are automatically followed through within the technology and all impacts are output into the repository. Impacts connected to other technologies are then fed back into the parsing process to identify impacts in those other technologies. For example, if a field is followed through within COBOL all the way to an output data store (such as a file or database table), the recursive process then investigates any other technologies that may read the affected data store, such as sort or other utilities, database loads or other system components written in other languages.
This is done by a feedback-loop into the affected other technology’s deep parsing process, with the new target as input. The process continues to execute until all impacts in all technology types are exhausted.
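The feedback loop described above resembles a standard worklist traversal, which can be sketched as below. This is an assumption-laden illustration rather than the Deep Parsing engine itself: the `TRACES` table stands in for real per-technology parsers, and the technology labels, component names and the `trace_lineage` function are hypothetical.

```python
from collections import deque

# Hypothetical, hand-written trace results standing in for real parsers.
# (technology, target) -> (impacts inside that technology,
#                          hand-offs to other technologies)
TRACES = {
    ("cobol", "Customer-Number"): (
        ["WS-CUST-NO", "CUSTFILE"],
        [("sort", "CUSTFILE")],
    ),
    ("sort", "CUSTFILE"): (
        ["CUSTFILE.SORTED"],
        [("db_load", "CUSTFILE.SORTED")],
    ),
    ("db_load", "CUSTFILE.SORTED"): (
        ["CUSTOMER.CUST_ID"],
        [],
    ),
}


def trace_lineage(start_technology, start_target, traces):
    """Follow a target through every technology until no hand-offs remain."""
    repository = {}  # technology -> impacts found, in discovery order
    pending = deque([(start_technology, start_target)])
    visited = set()  # guards against re-tracing and against cycles

    while pending:
        key = pending.popleft()
        if key in visited:
            continue
        visited.add(key)

        impacts, handoffs = traces.get(key, ([], []))
        repository.setdefault(key[0], []).extend(impacts)
        # Feedback loop: each hand-off becomes a new target in another technology.
        pending.extend(handoffs)

    return repository


lineage = trace_lineage("cobol", "Customer-Number", TRACES)
```

The loop ends when the queue of pending targets is empty, which corresponds to all impacts in all technology types being exhausted. The `visited` set is a design choice of this sketch so that data stores which feed each other in a cycle do not cause endless re-tracing.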
Data Lineage Output
Mapador can output the results of the above process in various formats. These include reports organized for Business Analysts, digital output in standard Mapador format, XML format for further input to other tools (such as IBM’s FastTrack, IBM InfoSphere, Ab Initio Metadata Hub and Collibra Data Governance Center) and other custom formats defined by the client.
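As a rough idea of what an XML hand-off to another tool might look like, the sketch below serializes a short lineage with Python's standard library. The element and attribute names (`dataLineage`, `step`, `component`, `transformation`) and the sample steps are invented for illustration; they are not the Mapador format or any vendor's import schema.

```python
import xml.etree.ElementTree as ET


def lineage_to_xml(element_name, steps):
    """Serialize ordered lineage steps to an XML string (hypothetical layout)."""
    root = ET.Element("dataLineage", {"element": element_name})
    for order, step in enumerate(steps, start=1):
        node = ET.SubElement(
            root, "step", {"order": str(order), "technology": step["technology"]}
        )
        ET.SubElement(node, "component").text = step["component"]
        ET.SubElement(node, "transformation").text = step["transformation"]
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample lineage, for illustration only.
steps = [
    {"technology": "cobol", "component": "CUSTFILE",
     "transformation": "MOVE Customer-Number TO WS-CUST-NO"},
    {"technology": "db_load", "component": "CUSTOMER.CUST_ID",
     "transformation": "load from CUSTFILE.SORTED"},
]

xml_text = lineage_to_xml("Customer-Number", steps)
```

A real integration would follow the target tool's published import schema instead of these placeholder names.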
Automated Application Cartography Benefits
- The Code Never Lies
Automated Application Cartography with Deep Parsing reports Data Lineage directly from production code. As such, it mines processes and transformations as they actually occur in real operational environments.
- Automated and Repeatable
As enterprises endeavor to address data lineage issues, one possible approach is to manually document data flow through the systems. For a relatively small set of applications, this can potentially be accomplished in the allocated timeframes, albeit expensively.
However, manual analysis is error-prone and not cost-effectively repeatable. In fact, this is exactly why Data Flow Diagrams, some of them developed decades ago, were never really kept up to date.
Without keeping the results up to date, even high-quality manual analysis quickly becomes outdated. As a result, confidence in the correctness of manually obtained data lineages cannot be as high as desired.
Data Lineage obtained from Automated Application Cartography and Deep Parsing directly overcomes these problems.
- Ready for Growth
Data governance is already formally being addressed by many organizations. Automated Application Cartography bridges data governance gaps by enriching the data collected into data governance frameworks through automation.
Individual data lineage requirements will also likely grow over the next few years, due to increasing regulatory requirements. Implementing Mapador’s repeatable process allows for the inclusion of such future growth in a decreasingly costly manner.