The goal of physical data upgrade is to create a clean and comprehensive set of data stores for each application. Inherent in this task is the need to examine the main physical data stores of the system to determine the validity of data values and to purify the "dirty" data.
Such dirty data might include incorrect data values, missing data values, values that do not correspond to field definitions, unwanted duplicates, incorrect dependency or integrity rules, and inconsistencies between data stores, among other things.
In addition, redevelopment goals may necessitate a data conversion process to expand a physical file or database (for century date change, for instance) or to insert new data values based on changing application requirements.
Specific objectives for physical data upgrade include:
· Purifying existing data values in main application data stores
· Expanding physical files or databases to conform to new user requirements
· Converting existing data stores to conform to new requirements
· Activating or deactivating interfaces (bridges or APIs) to support phased deployment of enterprise-wide migration projects
The entrance criteria for the physical data upgrade task are listed below.
· A requirement for specific files or databases to be purified, expanded or converted
· Determination of the physical data stores to be purified, expanded or converted
· Determination of the target file or database definition and structure
Note: This task does not attempt to select a target definition or structure from a list of alternatives, but only to accomplish the upgrade to a previously selected target.
· Completion of the environmental analysis task
· Completion of the Data Definition Analysis of the technical assessment component of the Inventory/Analysis stage
· Completion of the field and record size expansion task of the Positioning stage
Note: This criterion is optional. It applies only when physical data conversion entails a field or file expansion. In the field and record size expansion task, the logical definitions were upgraded and the impacted source code was changed. In this task, the physical data upgrades that correspond to those source-level changes are made.
The personnel and skill requirements necessary to meet the physical data upgrade task objectives are identified below.
· Data Definition Analyst
- Knowledge of application data structure and content, and expertise in applying data manipulation tools and techniques
· Current Systems Expert
- Knowledge of existing system data usage as implemented
· Data Administrator/DBA
- Knowledge of data usage procedures and standards for the application area and the organization
· Systems Programmer
- Security access to physical system objects
The system components and related inputs required to initiate and complete the physical data upgrade task are listed below.
· Access to main system data stores
· Identification of physical files or databases to be expanded or converted
Note: This could be all files across the systems of interest.
· Identification of utilized physical files or databases belonging to different systems which will not be expanded or converted
· Physical Data Upgrade control log Form 024
· Positioning work plan Form 020
· Environmental analysis reports linking physical files and databases with their definitions and executable modules
· Defective data lists from Physical Data Analysis step of Data Definition Analysis task
· Architectural assessment data flow diagram reports describing input and output data stores for each system, and those which are shared between systems
Technologies supporting the physical data upgrade task include data migration tools, interface/bridge generators, physical data analyzers, open systems repositories, compilers/preprocessors, program editors and cataloging tools. The role each tool plays within this task is described below.
Data migration tool
Within this task, the role of the data migration tool is to automate the process of converting data values in a physical data store. This may involve mapping data from multiple existing data stores into a single target store, inserting new data values based on a default or some other user criteria, converting existing values to a new length or attribute type, or moving data to a new platform.
To accomplish this, data migration tools must have the ability to analyze, manipulate and convert data to user specification. The tool should be able to utilize either a data dictionary, a repository or a set of user-defined specifications, or some combination of the three, to provide a source definition and a target definition for the data to be converted. In addition, benchmarking features or strategies are essential to accurate conversion processing.
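As an illustration of converting records from a source definition to a target definition, the following minimal Python sketch expands a six-character date to eight characters and inserts a default value for a new field. The field names, record layout, default value and the century pivot year of 50 are all assumptions made for this example; they do not come from any particular tool or from the task description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    length: int


# Hypothetical fixed-width layouts (assumptions for illustration only).
SOURCE = [Field("cust_id", 6), Field("open_date", 6)]                      # YYMMDD
TARGET = [Field("cust_id", 6), Field("open_date", 8), Field("region", 2)]  # CCYYMMDD

PIVOT_YEAR = 50              # assumed windowing rule: YY < 50 -> 20YY, else 19YY
DEFAULTS = {"region": "00"}  # assumed default for the newly inserted field


def parse_record(line, layout):
    """Split a fixed-width record into a dict keyed by field name."""
    values, pos = {}, 0
    for field in layout:
        values[field.name] = line[pos:pos + field.length]
        pos += field.length
    return values


def expand_date(yymmdd):
    """Prefix a century to a YYMMDD value using the assumed pivot year."""
    century = "20" if int(yymmdd[:2]) < PIVOT_YEAR else "19"
    return century + yymmdd


def convert_record(line):
    """Convert one source record to the target layout."""
    src = parse_record(line, SOURCE)
    out = {
        "cust_id": src["cust_id"],
        "open_date": expand_date(src["open_date"]),
        "region": DEFAULTS["region"],
    }
    # Pad or truncate each value to the target field length.
    return "".join(out[f.name].ljust(f.length)[:f.length] for f in TARGET)


if __name__ == "__main__":
    for record in ["000123991231", "000456050115"]:
        print(convert_record(record))
```

A real conversion would normally read the source and target definitions from a dictionary or repository rather than hard-coding them, and would compare record counts and control totals before and after the run.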
Interface/bridge generator
Within this task, the main role of the interface/bridge generator is to support the conversion of date field definitions within programs. If redevelopment goals include expanding or converting some physical data stores without expanding related data stores that belong to different systems, and if data is shared between expanded and unexpanded systems, data interface routines must be activated and deactivated as individual systems are converted. This technique allows files to be converted selectively over time and eliminates the need to convert all of the programs that use a single file at the same time.
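The bridging idea can be sketched as a small translation routine that sits between an expanded data store and a program that still expects the unexpanded format. The class name DateBridge, the field name open_date and the pivot year are hypothetical, chosen only for this Python illustration; a generated bridge in a production environment would be written in the language of the affected programs.

```python
PIVOT_YEAR = 50  # assumed windowing rule: YY < 50 -> 20YY, else 19YY


class DateBridge:
    """Translate a record between expanded (CCYYMMDD) and unexpanded (YYMMDD) forms.

    When the bridge is deactivated, records pass through unchanged, which
    models the point at which the calling system has itself been converted.
    """

    def __init__(self, date_field="open_date", active=True):
        self.date_field = date_field
        self.active = active

    def to_unexpanded(self, record):
        """Return a copy of the record suitable for an unconverted program."""
        out = dict(record)
        if self.active:
            out[self.date_field] = record[self.date_field][2:]
        return out

    def to_expanded(self, record):
        """Return a copy of the record suitable for the expanded data store."""
        out = dict(record)
        if self.active:
            yymmdd = record[self.date_field]
            century = "20" if int(yymmdd[:2]) < PIVOT_YEAR else "19"
            out[self.date_field] = century + yymmdd
        return out


if __name__ == "__main__":
    bridge = DateBridge()
    stored = {"cust_id": "000123", "open_date": "19991231"}
    legacy_view = bridge.to_unexpanded(stored)
    print(legacy_view)
    print(bridge.to_expanded(legacy_view))
```

Setting active to False once the calling system has been converted corresponds to deactivating the interface, so the same call sites can remain in place during a phased deployment.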
Physical data analyzer
Within this task, the role of the physical data analyzer is to automatically scan selected data and produce a sampling of defective data. There are three common methods for determining defective data: comparison of the data to user-defined rules, comparison of the data to system definitions (either hard-coded in the source or defined in the dictionary) and analysis of the data alone to discover inherent rules and dependencies.
Some tools review all the data in the selected data stores, while others sample the data in some pre-defined or random pattern. Some physical data analyzers can also produce management reports on the quality of the data.
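The first of the three methods, comparison of the data to user-defined rules, together with the option of sampling rather than scanning every record, might look like the Python sketch below. The rule set, field names and the find_defects function are assumptions made for illustration and do not describe any specific analyzer product.

```python
import random
from datetime import datetime


def is_valid_date(value):
    """True if value is an eight-digit CCYYMMDD string naming a real date."""
    if len(value) != 8 or not value.isdigit():
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
        return True
    except ValueError:
        return False


# Hypothetical user-defined rules: field name -> predicate on the value.
RULES = {
    "cust_id": lambda v: len(v) == 6 and v.isdigit(),
    "open_date": is_valid_date,
}


def find_defects(records, rules, sample_size=None, seed=0):
    """Return (record index, field, value) for every rule violation found.

    If sample_size is given and smaller than the data set, only a random
    sample of records is checked; otherwise every record is scanned.
    """
    indexed = list(enumerate(records))
    if sample_size is not None and sample_size < len(indexed):
        indexed = random.Random(seed).sample(indexed, sample_size)
    defects = []
    for index, record in sorted(indexed, key=lambda pair: pair[0]):
        for field, rule in rules.items():
            value = record.get(field)
            # A missing value counts as a defect.
            if value is None or not rule(value):
                defects.append((index, field, value))
    return defects


if __name__ == "__main__":
    data = [
        {"cust_id": "000123", "open_date": "19991231"},
        {"cust_id": "12A456", "open_date": "19991331"},
        {"open_date": "20050115"},
    ]
    for defect in find_defects(data, RULES):
        print(defect)
```

The resulting defect list plays the same role as the defective data lists named among the task inputs: it identifies which records and fields need purification before conversion.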
Open systems repository
A repository provides an important, yet optional, capability to link data definitions to other definitions and to physical objects using a formal model. Use of a repository in this task requires that system-wide metadata has already been established during environmental analysis and data definition analysis. For large systems or cross-functional expansion efforts, this repository model can be maintained as a sophisticated mechanism for managing project efforts.
Requirements include the ability to reflect system components as objects within the repository model and populate that model from a legacy environment. The objective in this task is to reflect the relationships between data store objects and interface objects developed in the Create Expansion Interfaces/Strategy step.
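To make the idea of relating data store objects to interface objects concrete, the Python sketch below models a repository as a set of typed objects and named relationships between them. The Repository class, the object kinds and the relationship name "bridged_by" are hypothetical and far simpler than a real repository model; they are intended only to show the kind of link this task records.

```python
from collections import defaultdict


class Repository:
    """A toy repository: typed objects plus named, directed relationships."""

    def __init__(self):
        self.objects = {}              # object name -> kind
        self.links = defaultdict(set)  # (source, relation) -> set of targets

    def add_object(self, name, kind):
        self.objects[name] = kind

    def link(self, source, relation, target):
        """Record a relationship; both objects must already be registered."""
        for name in (source, target):
            if name not in self.objects:
                raise KeyError(f"unknown repository object: {name}")
        self.links[(source, relation)].add(target)

    def related(self, source, relation):
        """Return the sorted names of objects linked from source by relation."""
        return sorted(self.links.get((source, relation), ()))


if __name__ == "__main__":
    repo = Repository()
    repo.add_object("CUSTOMER.MASTER", "data_store")   # hypothetical names
    repo.add_object("BRIDGE01", "interface")
    repo.add_object("BRIDGE02", "interface")
    repo.link("CUSTOMER.MASTER", "bridged_by", "BRIDGE02")
    repo.link("CUSTOMER.MASTER", "bridged_by", "BRIDGE01")
    print(repo.related("CUSTOMER.MASTER", "bridged_by"))
```

With links of this kind in place, a project team can ask which interfaces must be deactivated when a given data store and all of its consuming systems have been converted.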
Compiler/preprocessor
A source language compiler/preprocessor is required to parse source code and convert it into machine-executable instructions. The compilation process identifies many types of errors and requires that they be corrected before it will complete successfully.
Program editor
A program editor is required to create the interface programs and subroutines that are needed when shared files are involved and not all systems will commit to the expansion project. Use a program editor appropriate to the environment.
Cataloging tool
A cataloging tool is required to catalog or register the locations and names of new data stores as they are created. Any library management tool should be able to meet this requirement.
The physical data upgrade task comprises the following task steps:
· Verify Physical Data Integrity
· Purify Physical Data
· Expand & Convert Physical Data
· Create Expansion Interface Strategy