DataMonitoring
Periodic data profile logging allows us to detect when a data set has changed in unexpected ways.
Consider a log of `COUNT(*)` of ED_VisitFact recorded each time its ETL package is executed.
$x_k=$ED_VisitFact_Log where $x=$RowCount (i.e. `COUNT(*)`) and $k=$`ROW_NUMBER()`
For each log entry compute:
- change in row count: `RowCount - LAG(RowCount)`
- days since the previous log entry: `DATEDIFF(DAY,LAG(EtlDate),EtlDate)`
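
The per-entry quantities can be sketched in Python as follows. This is a minimal illustration rather than the actual ETL logging code: the `LogEntry` type, the `lagged_differences` function, and the sample values are hypothetical, and only the `RowCount` and `EtlDate` fields come from the text. The first entry has no predecessor, so both differences are `None`, mirroring the `NULL` that `LAG` returns for the first row.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class LogEntry:
    """One hypothetical row of the row-count log."""
    etl_date: date   # corresponds to EtlDate
    row_count: int   # corresponds to RowCount


def lagged_differences(
    log: List[LogEntry],
) -> List[Tuple[Optional[int], Optional[int]]]:
    """Return (row count change, days since previous entry) per log entry.

    Entries are sorted by etl_date first. The first entry has no
    predecessor, so both values are None (like LAG returning NULL).
    """
    ordered = sorted(log, key=lambda entry: entry.etl_date)
    result: List[Tuple[Optional[int], Optional[int]]] = []
    previous: Optional[LogEntry] = None
    for entry in ordered:
        if previous is None:
            result.append((None, None))
        else:
            result.append((
                entry.row_count - previous.row_count,        # RowCount - LAG(RowCount)
                (entry.etl_date - previous.etl_date).days,   # day difference
            ))
        previous = entry
    return result


# Made-up sample data, for illustration only.
sample_log = [
    LogEntry(date(2024, 1, 1), 1000),
    LogEntry(date(2024, 1, 2), 1010),
    LogEntry(date(2024, 1, 5), 1040),
]
differences = lagged_differences(sample_log)
```

With the sample data, `differences` holds one pair per log entry, starting with `(None, None)` for the first run.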
Model
Define alert triggers: if
Define a meta parameter object:
- database name,
- schema name,
- table name,
- column name(s) (optional)
Meta parameter objects imply a SQL `COUNT(*)` aggregation where each column is added to the `GROUP BY` clause.
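
One way to read this is that a meta parameter object is enough to generate the profiling query. The sketch below is an assumption about how that could look, not the project's implementation: the `MetaParameter` class, the `build_count_query` function, the bracket quoting, and the example names `EDW`, `dbo`, and `FacilityID` are all hypothetical. Only `ED_VisitFact` and the `RowCount` alias come from the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MetaParameter:
    """Hypothetical container for the meta parameter object."""
    database: str
    schema: str
    table: str
    columns: Tuple[str, ...] = ()  # optional grouping columns


def quote(identifier: str) -> str:
    """Bracket-quote an identifier (T-SQL style), escaping closing brackets."""
    return "[" + identifier.replace("]", "]]") + "]"


def build_count_query(meta: MetaParameter) -> str:
    """Build the COUNT(*) query implied by a meta parameter object.

    Without columns the query counts the whole table; with columns,
    each one is selected and added to the GROUP BY clause.
    """
    table = ".".join(quote(part) for part in (meta.database, meta.schema, meta.table))
    if not meta.columns:
        return f"SELECT COUNT(*) AS [RowCount] FROM {table}"
    cols = ", ".join(quote(col) for col in meta.columns)
    return f"SELECT {cols}, COUNT(*) AS [RowCount] FROM {table} GROUP BY {cols}"


# Hypothetical database, schema, and column names, for illustration only.
whole_table = build_count_query(MetaParameter("EDW", "dbo", "ED_VisitFact"))
per_facility = build_count_query(
    MetaParameter("EDW", "dbo", "ED_VisitFact", ("FacilityID",))
)
```

Quoting every identifier keeps the generated statement valid even when a name collides with a keyword.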
Let $x=$`COUNT(*)`
data:
- $i=0,1,\ldots,N$ ETL sequence index,
- $T=(t_i)_{i=0}^N$ ETL time stamp sequence,
- $X=(x_i)_{i=0}^N$ row count sequence
calculations:
- row count change per day: $\frac{dx}{dt}\approx\frac{x_i-x_{i-1}}{t_i-t_{i-1}}$
- relative row count change per day: $\frac{1}{x}\frac{dx}{dt}\approx\frac{x_i-x_{i-1}}{x_i(t_i-t_{i-1})}$
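
As a rough sketch of these calculations, the function below computes the backward difference $(x_i-x_{i-1})/(t_i-t_{i-1})$ and the same quantity divided by $x_i$ for $i=1,\ldots,N$. The function name `change_rates`, the choice to express time stamps as day numbers, the handling of $x_i=0$, and the sample values are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def change_rates(
    t: Sequence[float], x: Sequence[float]
) -> List[Tuple[float, float]]:
    """Return (change per day, relative change per day) for i = 1..N.

    t holds ETL time stamps expressed in days, x the matching row counts.
    The relative rate is NaN when x[i] is 0, because it is undefined there.
    """
    if len(t) != len(x):
        raise ValueError("t and x must have the same length")
    rates: List[Tuple[float, float]] = []
    for i in range(1, len(x)):
        dt = t[i] - t[i - 1]
        if dt <= 0:
            raise ValueError("time stamps must be strictly increasing")
        absolute = (x[i] - x[i - 1]) / dt            # (x_i - x_{i-1}) / (t_i - t_{i-1})
        relative = absolute / x[i] if x[i] != 0 else math.nan
        rates.append((absolute, relative))
    return rates


# Made-up sample data: three ETL runs on days 0, 1, and 4.
t_days = [0, 1, 4]
x_rows = [1000, 1010, 1040]
rates = change_rates(t_days, x_rows)
```

Dividing by the elapsed days makes runs comparable even when the ETL package does not execute on a fixed schedule.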