
Deena Chadwick
The CommonSensical BA

6 Measures For Data Integrity Analysis

In a nutshell, data integrity means you can trust your data. In a time and industry where metrics and data seem to rule, people’s perception of your data will ultimately affect your organization’s reputation. When data cannot be trusted, people rarely assume it was an honest accident; they tend to assume it was malicious. This is why it is important to analyze the integrity of any data you add to your system.

Data Is Considered Trustworthy If It Is

Complete

Complete does not mean every field or attribute is filled in. Data is considered complete when it has all of the information needed for its intended use.
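As a rough illustration of "complete for its intended use", the sketch below checks a record against a different set of required fields for each use. The field names, the two uses, and the helper functions are all hypothetical; they are assumptions for the example, not part of any particular system.

```python
# Hypothetical required fields per intended use; adjust to your own system.
REQUIRED_FIELDS = {
    "shipping": {"name", "street", "city", "postal_code"},
    "newsletter": {"email"},
}


def missing_fields(record: dict, use: str) -> set:
    """Return the required fields that are absent or empty for the given use."""
    required = REQUIRED_FIELDS[use]
    return {field for field in required if record.get(field) in (None, "")}


def is_complete(record: dict, use: str) -> bool:
    """A record is complete when nothing required for this use is missing."""
    return not missing_fields(record, use)


# The same record can be complete for one use and incomplete for another.
record = {"name": "Ada", "email": "ada@example.com", "city": ""}
print(is_complete(record, "newsletter"))
print(sorted(missing_fields(record, "shipping")))
```

The point of the design is that completeness is judged per use, so an empty optional field never counts against a record.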

Accurate

Data is considered accurate when it is correct. The best way to ensure accuracy is to put quality measurements in place that confirm the data being entered or updated is error-free.

Up To Date

We live in an on-demand time, where customers are won or lost in seconds. Attention spans and patience are in short supply. Outdated data can seriously harm your bottom line.

Fortune 1000 enterprises will lose more money in operational inefficiency due to data quality issues than they will spend on data warehouse and customer relationship management (CRM) initiatives.

Gartner

Measures For Data Integrity

Physical integrity is the protection of data’s wholeness and accuracy as it’s stored and retrieved. When natural disasters strike, power goes out, or hackers disrupt database functions, physical integrity is compromised. Human error, storage erosion, and a host of other issues can also make it impossible for data processing managers, system programmers, applications programmers, and internal auditors to obtain accurate data.

Entity integrity relies on primary keys, unique values that identify each row in a table, to ensure that data isn’t listed more than once and that no primary key field is null. It’s a feature of relational systems, which store data in tables that can be linked and used in a variety of ways.
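A minimal sketch of entity integrity, using Python's built-in sqlite3 module with an in-memory database. The customer table, its columns, and the try_insert helper are hypothetical names chosen for the example. One SQLite-specific assumption is flagged in the comments: a non-integer primary key needs an explicit NOT NULL to reject nulls.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table. SQLite only rejects NULL in a non-integer primary key
# when NOT NULL is declared explicitly, so it is spelled out here.
conn.execute(
    "CREATE TABLE customer ("
    " customer_id TEXT PRIMARY KEY NOT NULL,"
    " email TEXT NOT NULL)"
)
conn.execute("INSERT INTO customer VALUES ('C001', 'ada@example.com')")


def try_insert(conn, customer_id, email):
    """Attempt an insert; return False if the database rejects the row."""
    try:
        conn.execute(
            "INSERT INTO customer (customer_id, email) VALUES (?, ?)",
            (customer_id, email),
        )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert(conn, "C001", "dup@example.com"))   # duplicate key
print(try_insert(conn, None, "nokey@example.com"))   # null key
print(try_insert(conn, "C002", "grace@example.com")) # new, valid key
```

Because the rule lives in the table definition, every application that writes to the table gets the same protection.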

Logical integrity keeps data unchanged as it’s used in different ways in a relational database. Logical integrity protects data from human error and hackers as well, but in a much different way than physical integrity does. There are four types of logical integrity: entity, referential, domain, and user-defined.

Referential integrity is the set of rules that keeps relationships between tables consistent, so that data is stored and used uniformly. Rules embedded into the database’s structure about how foreign keys are used ensure that only appropriate changes, additions, or deletions of data occur; for example, a row cannot point to a record that does not exist. Rules may include constraints that eliminate the entry of duplicate data, guarantee that data is accurate, and/or disallow the entry of data that doesn’t apply.
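The foreign key rules described above can be sketched with sqlite3 as follows. The customer and orders tables and the violates_fk helper are hypothetical. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON has been issued on the connection; other databases typically enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical parent and child tables.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"
)
conn.execute("INSERT INTO customer VALUES (1)")
conn.execute("INSERT INTO orders VALUES (100, 1)")


def violates_fk(conn, sql, params=()):
    """Run a statement; return True if the database rejects it."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


# An order for a customer that does not exist is refused.
print(violates_fk(conn, "INSERT INTO orders VALUES (101, 999)"))
# A customer who still has orders cannot be deleted.
print(violates_fk(conn, "DELETE FROM customer WHERE customer_id = 1"))
```

Both the orphaned insert and the delete of a referenced parent are rejected by the database itself rather than by application code.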

Domain integrity is the collection of processes that ensure the accuracy of each piece of data in a domain. In this context, a domain is a set of acceptable values that a column is allowed to contain. It can include constraints and other measures that limit the format, type, and amount of data entered.
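One way to express a domain is with column types and CHECK constraints, sketched here with sqlite3. The product table, the eight-character SKU format, the two allowed status values, and the accepts helper are all assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: each CHECK limits a column to its acceptable values.
conn.execute(
    "CREATE TABLE product ("
    " sku TEXT PRIMARY KEY NOT NULL CHECK (length(sku) = 8),"
    " status TEXT NOT NULL CHECK (status IN ('active', 'discontinued')),"
    " price_cents INTEGER NOT NULL CHECK (price_cents >= 0))"
)


def accepts(conn, sku, status, price_cents):
    """Return True if the row falls inside every column's domain."""
    try:
        conn.execute(
            "INSERT INTO product VALUES (?, ?, ?)",
            (sku, status, price_cents),
        )
        return True
    except sqlite3.IntegrityError:
        return False


print(accepts(conn, "ABCD1234", "active", 1999))   # inside every domain
print(accepts(conn, "ABCD1235", "pending", 1999))  # status not allowed
print(accepts(conn, "SHORT", "active", 100))       # wrong SKU length
print(accepts(conn, "ABCD1236", "active", -5))     # negative price
```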

 

User-defined integrity involves the rules and constraints created by the user to fit their particular needs. Sometimes entity, referential, and domain integrity aren’t enough to safeguard data. Often, specific business rules must be taken into account and incorporated into data integrity measures.
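Business rules often span several fields, so they tend to live in application code or triggers rather than in simple column constraints. The sketch below shows two made-up rules for an order. The Order class, the approval threshold, and the rule wording are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical business threshold: larger orders need a manager's sign-off.
APPROVAL_THRESHOLD_CENTS = 500_000


@dataclass
class Order:
    order_date: date
    ship_date: date
    total_cents: int
    manager_approved: bool = False


def rule_violations(order: Order) -> list:
    """Return a list of the business rules this order breaks (empty if none)."""
    problems = []
    if order.ship_date < order.order_date:
        problems.append("ship_date precedes order_date")
    if order.total_cents > APPROVAL_THRESHOLD_CENTS and not order.manager_approved:
        problems.append("large order lacks manager approval")
    return problems


ok = Order(date(2024, 1, 5), date(2024, 1, 8), 120_000)
bad = Order(date(2024, 1, 5), date(2024, 1, 2), 900_000)
print(rule_violations(ok))
print(rule_violations(bad))
```

Returning every violation, rather than stopping at the first, lets whoever enters the data fix all problems in one pass.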