Staging Validation
This new validation process includes an expanded library of validation rules and makes it easier to create and manage custom validation rules.
Staging Validation Overview
Once data has been loaded into the Generate Staging tables, the Staging Validation process can be executed within SQL Server Management Studio (SSMS). This process scans the data in corresponding staging tables and applies validation rules to determine if any data have issues.
A Staging Validation rule can apply to a particular column in a staging table or can be defined to apply validation logic across multiple columns/conditions. A rule can be associated to one or more EDFacts reports. For example, Generate includes a rule that column StudentIdentifierState
in table Staging.K12Enrollment
is required. This rule is associated to all EDFacts reports that produce student counts.
Generate currently includes a library of over 50 pre-defined staging validation rules, located in table Staging.StagingValidationRules
. These rules are used in over 1,000 instances across multiple Fact Types and Report Codes. The full list of rules and relationships can be viewed in Staging.vwStagingValidationRules
.
Rules can be defined, added, and associated at any time by the Generate developers and SEA users.
More information for how to add rules is later in this document.
Run Staging Validation
Load data into the Generate Staging Tables, then execute the following procedure in SSMS (with the appropriate parameter values) to run the validation process:
@SchoolYear
The school year that corresponds to the data in staging.
@FactTypeOrReportCode
Either an EDFacts Report Code (i.e., C029, C002, C005) or a Generate Fact Type (i.e., Directory, ChildCount, Exiting). If an incorrect value is provided, the query will return a list of valid values.
@RemoveHistory
Determines if the validation results will replace historical validation results for the same school year and fact type/report code or will append to the existing results. This is an optional parameter, and if not supplied will default to 0 (do not remove history).
@PrintSQL
A debugging capability that will display the dynamic SQL that will run to perform the staging validation. This is an optional parameter and if not supplied will default to 0 (do not show the SQL).
View Staging Validation Process Results
To view the results of the validation process, execute the following procedure in SSMS (with the appropriate parameter values):
@IncludeHistory
Will include all history of the validation results for the school year and fact type/report code as well as the latest results (assuming the history is still available) based on the @RemoveHistory parameter value used when executing the Staging Validation process. This is an optional parameter and if not provided will default to 0 (do not include history in the results).
Results of the Staging Validation process will be displayed in SSMS, similar to the example below:
If no results are returned, it means the staging data passed all existing validation rules.
Staging Validation Columns
Id
Internal Id for the result record.
StagingValidationRuleId
The Rule Id that was processed to produce the result. There may be some auto-generated results that do not pertain to a rule:
-1 This indicates that a staging table is required for the specific report and cannot be empty.
-2 This indicates that an option set value in a staging table is not mapped in the Generate
Staging.SourceSystemReferenceData
table.-9 This indicates that the defined rule has a syntax error and could not be executed. Any results having a -9 should be reviewed by the rule developer.
SchoolYear
The school year for which the result applies.
FactTypeOrReportCode
The fact type or report code for which the result applies.
StagingTableName
The staging table name for which the result applies.
ColumnName
The column name in the staging table for which the result applies.
Severity
The severity level of the result. Options are:
Informational – this result indicates that the data may have an issue that may produce invalid report data.
Error – this result indicates that the data will likely produce incorrect report data.
Critical – this result indicates that the data may cause the migration to fail and/or the report to not produce.
ValidationMessage
Provides information about the result.
RecordCount
The number of records identified as not passing the validation rule.
ShowRecordsSQL
A SQL query that will produce the records that failed the validation. You can copy the contents from this cell into a query window and execute it to see the data.
InsertDate
The date and time the Staging Validation process inserted this result into the table.
ETL developers can review the staging validation results and respond as needed to remediate the finding. This may include the following actions:
Changes to the source data.
Changes to the ETL.
Adjustments to mapped values in
Staging.SourceSystemReferenceData.
Review by data owners and/or EDFacts coordinators to determine if any action is needed.
After making changes and refreshing staging data if needed, the Staging Validation process can be rerun to review the updated results.
Staging Validation Technical Details
The Staging Validation utility is comprised of the following components:
Staging Tables
StagingValidationResults
– contains the results of staging validation executionsStagingvalidationRules
– contains the staging validation rulesStagingValidationRules_ReportsXREF
– contains the cross-reference showing which staging validation rules apply to which EDFacts reports.
Staging Stored Procedures
StagingValidation_Execute
– executes the staging validation processStagingValidation_GetResults
– shows the results of the staging validation processStagingValidation_InsertRule
– used to insert a new rule into the Staging Validation Rules tableStagingValidation_AssignRuleToReports
– used to assign an existing rule to reports
Staging Views
vwStagingValidationRules
– shows all staging validation rules and their relationship to EDFacts reports
Viewing Existing Staging Validation Rules
To view rule definitions, query table Staging.StagingValidationRules
:
Staging Validation Rules Columns
StagingValidationRuleId
The internal rule Id.
StagingTableId
The staging table Id for which this rule applies.
StagingColumnId
The staging column Id for which this rule applies.
RuleDscr
A description of the rule.
Condition
The programmatic condition for this rule.
ValidationMessage
The message included in the validation result.
Severity
The severity level for the rule.
CreatedBy
The rule creator. All rules created by the CIID development team will show “Generate”. Custom rules created by SEAs can have whatever value they prefer.
CreatedDateTime
The date and time the rule was inserted into the table.
Since this table contains Id values, a better way to view the rules is to use the view. To view all existing Staging Validation rules and determine for which reports they are applied, query Staging.vwStagingValidationRules
:
The view shows all columns from Staging.StagingValidationRules
, but also shows if/where those rules area applied by returning the following additional columns:
StagingValidationRuleId – the internal Id for a validation rule. If this value is NULL, it means that no rule exists that is associated to the report code, table and column shown.
StagingValidationRuleId_XREF – this indicates if the rule is associated to a report code. If this value is NULL, it means that the rule is not associated to a report code.
FactTypeCode
The Fact Type for which this rule applies. Fact Types are groups of EDFacts reports such as “Assessment”, “Directory”, “Discipline”, etc.
DimFactTypeId
The internal Id for the Fact Type.
ReportCode
The EDFacts Report Code for which this rule applies.
ReportName
The EDFacts Report Name for which this rule applies.
GenerateReportId
The internal Id for the Report.
StagingTableName
The name of the Staging column for which this rule applies.
GenerateReportId_XREF
The internal report Id for which this rule applies.
Enabled_XREF
Indicates if this rule is enabled. A rule can be disabled or enabled, and only enabled rules will be executed during the Staging Validation process.
Understanding StagingValidationRuleId and StagingValidationRuleId_XREF:
View vwStagingValidationRules
shows a left join on all possible combinations of Staging Tables, Staging Columns and EDFacts Reports. Columns StagingValidationRuleId
and StagingValidationRuleId_XREF
shows if a rule exists and if/where that rule is being applied.
NULL
NULL
No rule exists for the Staging Table and Staging Column shown.
NOT NULL
NULL
A rule exists for the Staging Table and Staging Column but is not associated with the Report Code.
NOT NULL
NOT NULL
A rule exists for the Staging Table and Staging Column and is associated with the Report Code shown.
NULL
NOT NULL
This combination will not exist.
Creating Staging Validation Rules
The Staging stored procedure named “StagingValidation_InsertRule
” can be used to easily add new rules to the database. The procedure accepts the following parameters:
@FactTypeOrReportCode
The Fact Type or Report Code(s) for which this rule applies. This parameter can be single Fact Type value, a single Report Code value, a comma-delimited list of Report Code values, or have a value of “All” to indicate this rule applies to all EDFacts reports for which the Staging Table is used.
@StagingTableName
the Staging table for which this rule applies
@StagingColumnName
the Staging column for which this rule applies
@RuleDscr
a description for the rule
@Condition
the logic condition this rule applies to determine the rule result. There are three methods for defining a condition:
“Required” – if the @Condition value is “Required”, this rule indicates that the staging column is required to be populated and must not contain NULL values.
“where ….” – if the @Condition value starts with “where”, the rule looks for all record in the Staging Table and Staging Column where the condition exists and returns all columns from the Staging Table in the results.
“select …” – if the @Condition value starts with “select”, the rule applies the logic defined in the rule, and will return the data elements specified in the condition.
@ValidationMessage
the message returned in the Validation Result when the data fails the validation rule.
@CreatedBy
indicates who created the rule. All rules provided in the Generate releases will show “Generate”. States can populate this with their preferred value for rules they create.
@Enabled
indicates if the rule is enabled. Only enabled rules are executed by the Staging Validation process
Notice that the “Severity” for a rule is not a parameter for inserting a rule. Generate will automatically set the severity for a rule.
Also note that Generate automatically applies some validations that are not defined as a rule. For example, Generate knows which staging tables are required to contain data for certain EDFacts reports. Generate also knows which staging columns must contain option set values mapped in the SourceSystemReferenceData
table. Therefore, no rules exist in the StagingValidationRules
table for these instances. Instead, Generate automatically applies these rules when executing the Staging Validation process, so rules do not need to be manually defined for these conditions. These automatic results will show a specific Rule Id in the results:
-1 This indicates that a staging table is required for the specific report and cannot be empty
-2 This indicates that an option set value in a staging table is not mapped in the Generate
Staging.SourceSystemReferenceData
table.-9 This indicates that the defined rule has a syntax error and could not be executed. Any results having a -9 should be reviewed by the rule developer.
Examples
Adding a REQUIRED Rule:
This example shows how to add a REQUIRED rule to a specific column in a specific table for a specific EDFacts report code.The @Condition value = “Required”
Adding a Conditional Rule with Basic Logic:
This example shows how to add a simple conditional rule that applies to a single staging table and column for all reports that are part of Fact Type “Discipline”. The @Condition value starts with “where”. The results will search all records in @StagingTableName where the condition exists.
Adding a Conditional Rule with Expanded Logic:
This example shows how to add a conditional rule with expanded logic that may span multiple tables and columns. This rule is specific to the list of Reports shown in the @FactTypeOrReportCode parameter. The @Condition value starts with “select”, so the results will show only the columns defined in the condition.
Adding a Conditional Rule For ALL Reports:
This example shows how to add a conditional rule that applies to all EDFacts reports for which the @StagingTableName is used. The @FactTypeOrReportCode value = “All”. This rule is saying that the RecordStartDateTime column in Staging.K12Enrollment must meet the condition in all EDFacts reports that use the K12Enrollment table.
Assigning Existing Rules to a Fact Type or Report Code(s)
If a rule exists in StagingValidationRules
, but is not associated to a specific Fact Type or Report Code, Generate provides the capability to create these relationships with procedure StagingValidation_AssignRuleToReports
. The procedure accepts the following parameters:
@StagingValidationRuleId
The rule Id to be assigned
@FactTypeOrReportCode
The Fact Type or Report Code(s) for which this rule applies. This parameter can be single Fact Type value, a single Report Code value, a comma-delimited list of Report Code values, or have a value of “All” to indicate this rule applies to all EDFacts reports for which the Staging Table is used.
@CreatedBy
Indicates who created the rule. All rules provided in the Generate releases will show “Generate”. States can populate this with their preferred value for rules they create.
@Enabled
Indicates if the rule is enabled. Only enabled rules are executed by the Staging Validation process.
Below are several examples of using the procedure to assign a rule:
This example assigns rule 36 to all Assessment reports.
This example assigns rule 19 to reports C188 and C189.
Additional Information
When a new version of Generate is released, all Validation Rules provided by the Generate development team will be replaced/updated. Any Validation Rules created by SEAs will be retained in the new version.
If new Staging Tables and/or Staging Columns are added to Generate, the Generate Development team will need to add rows to the following tables in the App schema to allow rules to be created for those new tables/columns:
App.GenerateStagingTables
App.GenerateStagingColumns
App.SourceSystemReferenceMapping_DomainFile_XREF
App.vwStagingRelationships
shows the relationship between Fact Types, Report Codes, Staging Tables, Staging Columns andSourceSystemReferenceData
.
Last updated