Merge pull request #422 from Azure-Player/issue-374-inc-deploy
Issue 374 inc deploy
NowinskiK authored Oct 29, 2024
2 parents 02ea6ff + c34a9de commit 9dec673
Showing 15 changed files with 305 additions and 119 deletions.
36 changes: 16 additions & 20 deletions README.md
@@ -61,7 +61,6 @@ The main advantage of the module is the ability to publish all the Azure Data Fa
- [How it works](#how-it-works)
- [Step: Create ADF (if not exist)](#step-create-adf-if-not-exist)
- [Step: Load files](#step-load-files)
- [Step: Pre-deployment](#step-pre-deployment)
- [Step: Replacing all properties environment-related](#step-replacing-all-properties-environment-related)
- [Column TYPE](#column-type)
- [Column NAME](#column-name)
@@ -215,6 +214,7 @@ $opt = New-AdfPublishOption
* [Boolean] **DoNotStopStartExcludedTriggers** - specifies whether excluded triggers will be stopped before deployment (default: *false*)
* [Boolean] **DoNotDeleteExcludedObjects** - specifies whether excluded objects can be removed. Applies when `DeleteNotInSource` is set to *True* only. (default: *true*)
* [Boolean] **IncrementalDeployment** - specifies whether Incremental Deployment mode is enabled (default: *false*)
* [String] **IncrementalDeploymentStorageUri** - indicates Azure Storage where the latest deployment state file is stored (no default)
* [Enum] **TriggerStopMethod** - determines which triggers should be stopped.
Available values: `AllEnabled` (default) | `DeployableOnly`
Find more about the above option in section [Step: Stoping triggers](#step-stoping-triggers)
@@ -398,7 +398,8 @@ graph LR;
You must have appropriate permission to create new instance.
*Location* parameter is required for this action.

If ADF does exist and `IncrementalDeployment` is ON, the process gets Global Parameters to load latest **Deployment State** from ADF.
If the ADF exists and `IncrementalDeployment` is ON, the process loads the latest **Deployment State** from Storage.
Note: The flag is disabled automatically (warning `ADFT0033`) when the related parameter `IncrementalDeploymentStorageUri` is empty.

## Step: Load files

@@ -407,15 +408,6 @@ If ADF does exist and `IncrementalDeployment` is ON, the process gets Global Par
This step reads all local (json) files from a given directory (`rootfolder`).


## Step: Pre-deployment

💬 In log you'll see line: `STEP: Pre-deployment`

It prepares new (empty) file in `factory` folder if such file doesn't exist.
The file is needed for further steps to keep Deployment State in Global Parameter.

> This step is enable only when `IncrementalDeployment` is ON and `DeployGlobalParams` is ON.
## Step: Replacing all properties environment-related

💬 In log you'll see line: `STEP: Replacing all properties environment-related...`
@@ -640,11 +632,12 @@ The mechanism is smart enough to publish all objects in the right order, thence

💬 In log you'll see line: `STEP: Updating (incremental) deployment state...`

After the deployment, in this step the tool prepares the list of deployed objects and their hashes (MD5 algorithm). The array is wrap up in json format and stored as new global parameter `adftools_deployment_state` in factory file.
After the deployment, the tool prepares the list of deployed objects and their hashes (MD5 algorithm).
The list is serialized as JSON and stored as the blob file `{ADF-Name}.adftools_deployment_state.json` in the provided Storage.
**Deployment State** speeds up future deployments by identifying objects that have changed since the last deployment.

> The step might be skipped when `IncrementalDeployment = false` OR `DeployGlobalParams = false` in *Publish Options*.
> You'll see warning in the console (log) when only `IncrementalDeployment = true`.
> The step is skipped when `IncrementalDeployment = false` in *Publish Options*.
> You'll see a warning in the console (log) when `IncrementalDeployment = true` and `IncrementalDeploymentStorageUri` is empty.

## Step: Deleting objects not in source
@@ -670,26 +663,29 @@ Since v.1.6 you have more control of which triggers should be started. Use `Trig
## Incremental Deployment

> This is new feature (ver.1.4) in public preview.
> This is a new feature (ver.1.4) in public preview. Since ver.1.10 the process no longer uses an ADF Global Parameter to keep Deployment State data. You must provide a Storage URL instead.
Usually the deployment process takes some time as it must go through all objects (files) and send them via REST API to be deployed. The more objects in ADF, the longer the process takes.
To speed up the deployment process, you can use the `IncrementalDeployment` switch (in *Publish Options*), which identifies and deploys only the objects that have changed since the last deployment.

### How it works?
It uses **Deployment State** kept in one of Global Parameters and is save/read to/from ADF service.
It uses **Deployment State** kept as a JSON file, which is written to and read from Azure Blob Storage.
When the mode is ON, the process performs a few additional steps across the entire deployment process:
1. Reads Global Parameters from ADF (when not newly created) to get previous **Deployment State**
1. Reads Deployment State (json file) from Storage to get previous **Deployment State**
2. Identifies which objects are unchanged and excludes them from deployment
3. Calculates MD5 hashes of deployed objects and merges them to previous **Deployment State**
4. Saves **Deployment State** as `adftools_deployment_state` global parameter
4. Saves **Deployment State** as `{ADFName}.adftools_deployment_state.json` file in Storage
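
Steps 2 and 3 above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the helper `Get-Md5Hash`, the object names, and the object bodies are all hypothetical, and the real tool may serialize objects differently before hashing.

```powershell
# Hypothetical helper: MD5 hash (lowercase hex) of a string,
# e.g. an object's JSON body after property replacement.
function Get-Md5Hash([string] $Text) {
    $md5 = [System.Security.Cryptography.MD5]::Create()
    try {
        $bytes = [System.Text.Encoding]::UTF8.GetBytes($Text)
        $hex = [System.BitConverter]::ToString($md5.ComputeHash($bytes)) -replace '-', ''
        return $hex.ToLowerInvariant()
    }
    finally { $md5.Dispose() }
}

# Previous Deployment State: object name -> hash (illustrative data only).
$previous = @{ 'pipeline.PL_Copy' = Get-Md5Hash '{"name":"PL_Copy","v":1}' }

# Current local objects: object name -> body (illustrative data only).
$current = @{
    'pipeline.PL_Copy' = '{"name":"PL_Copy","v":1}'
    'pipeline.PL_Load' = '{"name":"PL_Load","v":1}'
}

# Step 2: keep only objects whose hash differs from the previous state.
# Step 3: merge the new hashes into the state.
$toDeploy = @()
foreach ($name in $current.Keys) {
    $hash = Get-Md5Hash $current[$name]
    if ($previous[$name] -ne $hash) {
        $toDeploy += $name
        $previous[$name] = $hash
    }
}
```

In this sketch only `pipeline.PL_Load` ends up in `$toDeploy`, because the hash of `pipeline.PL_Copy` matches the stored one.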

> Note: In order to use this feature, the following option parameters must be set:
> - `IncrementalDeployment` = `True`
> - `IncrementalDeploymentStorageUri` = `https://sqlplayer2020.file.core.windows.net/adftools` (example)
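
A possible way to set both options before publishing is shown below. Treat it as a sketch: the storage URI is a placeholder, the variables `$RootFolder`, `$ResourceGroupName`, `$DataFactoryName` and `$Location` are assumed to be defined by the caller, and the parameter names of `Publish-AdfV2FromJson` should be verified against the module's help.

```powershell
# Assumes the azure.datafactory.tools module is installed and you are signed in to Azure.
Import-Module azure.datafactory.tools

$opt = New-AdfPublishOption
$opt.IncrementalDeployment = $true
# Placeholder URI: replace with your own storage account and container/folder.
$opt.IncrementalDeploymentStorageUri = 'https://<storage-account>.blob.core.windows.net/<container>'

Publish-AdfV2FromJson -RootFolder $RootFolder `
    -ResourceGroupName $ResourceGroupName `
    -DataFactoryName $DataFactoryName `
    -Location $Location `
    -Option $opt
```

If the URI is left empty, the module logs warning `ADFT0033` and runs a full (non-incremental) deployment.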
### Remember
* Incremental Deployment assumes that no one changes ADF objects manually in the cloud
* You must deploy Global Parameters in order to save Deployment State
* Objects' hashes are calculated after properties are updated. If you change the config for an object, it will be deployed
* If you want to redeploy all objects again, you've got two options:
* Set `IncrementalDeployment = false` OR
* Delete manually `adftools_deployment_state` global parameter in target ADF service
* Delete Deployment State (json) file manually from provided Storage account's location


# Selective deployment, triggers and logic
2 changes: 1 addition & 1 deletion azure.datafactory.tools.psd1
@@ -12,7 +12,7 @@
RootModule = 'azure.datafactory.tools.psm1'

# Version number of this module.
ModuleVersion = '1.10.0'
ModuleVersion = '1.10.1'

# Supported PSEditions
# CompatiblePSEditions = @()
5 changes: 5 additions & 0 deletions changelog.md
@@ -4,9 +4,14 @@ All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## [1.10.1] - 2024-10-24
### Fixed
* Incremental deploy feature causes payload limit issue #374

## [1.10.0] - 2024-08-06
### Fixed
* Trigger Activation Failure Post-Selective Deployment when TriggerStopMethod = `DeployableOnly` #386
* Significantly improved performance of unit tests by mocking target ADF

## [1.9.1] - 2024-06-17
### Fixed
3 changes: 2 additions & 1 deletion en-us/messages_index.md
@@ -37,4 +37,5 @@ ADFT0028 | Expected format of name for 'FullName' input parameter is: objectType
ADFT0029 | Unknown object type: *type*.
ADFT0030 | AzType '$AzType' is not supported.
ADFT0031 | Empty value in config file. Path: [*path*]. Check previous warnings.
ADFT0032 | The process is exiting the function. Do fix the issue and run again.
ADFT0032 | The process is exiting the function. Do fix the issue and run again.
ADFT0033 | Incremental Deployment Option DISABLED as Storage Uri is not provided.
1 change: 1 addition & 0 deletions private/!AdfPublishOption.class.ps1
@@ -23,6 +23,7 @@ class AdfPublishOption {
[Boolean] $DoNotStopStartExcludedTriggers = $false
[Boolean] $DoNotDeleteExcludedObjects = $true
[Boolean] $IncrementalDeployment = $false
[String] $IncrementalDeploymentStorageUri = ''
[TriggerStopTypes] $TriggerStopMethod = [TriggerStopTypes]::AllEnabled
[TriggerStartTypes] $TriggerStartMethod = [TriggerStartTypes]::BasedOnSourceCode
}
73 changes: 53 additions & 20 deletions private/DeploymentState.class.ps1
@@ -40,33 +40,66 @@ class AdfDeploymentState {

}

function Get-StateFromService {
function Get-StateFromStorage {
[CmdletBinding()]
param ($targetAdf)
param (
[Parameter(Mandatory)] $DataFactoryName,
[Parameter(Mandatory)] $LocationUri
)

$res = Get-GlobalParam -ResourceGroupName $targetAdf.ResourceGroupName -DataFactoryName $targetAdf.DataFactoryName
$d = @{}
$moduleName = $MyInvocation.MyCommand.Module.Name
$moduleVersion = (Get-Module -Name $moduleName).Version.ToString()
$Suffix = "adftools_deployment_state.json"
$ds = [AdfDeploymentState]::new($moduleVersion)
$storageAccountName = Get-StorageAccountNameFromUri $LocationUri
$storageContext = New-AzStorageContext -UseConnectedAccount -StorageAccountName $storageAccountName
$blob = [Microsoft.Azure.Storage.Blob.CloudBlockBlob]::new("$LocationUri/$DataFactoryName.$Suffix")
Write-Host "Ready to read file from storage: $($blob.Uri.AbsoluteUri)"

try {
$InputObject = $res.properties.adftools_deployment_state.value.Deployed
$d = Convert-PSObjectToHashtable $InputObject
}
catch {
Write-Verbose $_.Exception
}

return $d
$storageContainer = Get-AzStorageContainer -Name $blob.Container.Name -Context $storageContext
$folder = $blob.Parent.Prefix
$FileRef = $storageContainer.CloudBlobContainer.GetBlockBlobReference("$folder$DataFactoryName.$Suffix")
if ($FileRef.Exists()) {
$FileContent = $FileRef.DownloadText()
#Write-Host $FileContent -BackgroundColor Blue
$json = $FileContent | ConvertFrom-Json
$ds.Deployed = Convert-PSObjectToHashtable $json.Deployed
$ds.adftoolsVer = $json.adftoolsVer
$ds.Algorithm = $json.Algorithm
$ds.LastUpdate = $json.LastUpdate
Write-Host "Deployment State loaded from storage."
return $ds
}
else {
Write-Host "No Deployment State found."
}
return $ds
}

function Set-StateToStorage {
[CmdletBinding()]
param (
[Parameter(Mandatory)] $ds,
[Parameter(Mandatory)] $DataFactoryName,
[Parameter(Mandatory)] $LocationUri
)

$Suffix = "adftools_deployment_state.json"
$dsjson = ConvertTo-Json $ds -Depth 5
Write-Verbose "--- Deployment State: ---`r`n $dsjson"

class AdfGlobalParam {
$type = "Object"
$value = $null
Set-Content -Path $Suffix -Value $dsjson -Encoding UTF8
$storageAccountName = Get-StorageAccountNameFromUri $LocationUri
$storageContext = New-AzStorageContext -UseConnectedAccount -StorageAccountName $storageAccountName
$blob = [Microsoft.Azure.Storage.Blob.CloudBlob]::new("$LocationUri/$DataFactoryName.$Suffix")
$r = Set-AzStorageBlobContent -ClientTimeoutPerRequest 5 -ServerTimeoutPerRequest 5 -CloudBlob $blob -File $Suffix -Context $storageContext -Force

AdfGlobalParam ($value)
{
$this.value = $value
}
Write-Host "Deployment State saved to storage: $($r.BlobClient.Uri)"
}

# Function to get Storage Account name from URI
function Get-StorageAccountNameFromUri($uri) {
$accountName = ($uri -split '\.')[0].Substring(8) # Assumes URI starts with "https://"
return $accountName
}
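
The account-name extraction above relies on the URI starting with exactly `https://` (8 characters); an `http://` URI would lose the first character of the account name. The sketch below repeats the function from the diff and compares it with a hypothetical alternative, `Get-StorageAccountNameFromUriSafe`, which is not part of the module and lets `System.Uri` parse the host instead.

```powershell
# Approach from the diff: split on '.', take the first piece,
# then drop the 8-character "https://" prefix.
function Get-StorageAccountNameFromUri($uri) {
    $accountName = ($uri -split '\.')[0].Substring(8)
    return $accountName
}

# Hypothetical alternative (not in the module): parse the URI,
# then take the first label of the host name.
function Get-StorageAccountNameFromUriSafe([string] $uri) {
    return ([System.Uri] $uri).Host.Split('.')[0]
}

$example = 'https://sqlplayer2020.file.core.windows.net/adftools'
$fromDiff = Get-StorageAccountNameFromUri $example
$fromUri  = Get-StorageAccountNameFromUriSafe $example
```

For the example URI from the README both functions return `sqlplayer2020`.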

1 change: 1 addition & 0 deletions private/GlobalParam.ps1
@@ -64,6 +64,7 @@ function Set-GlobalParam([Adf] $adf)
}
catch {
Write-Error -Exception $_.Exception
$response = ""
}
return $response
}
103 changes: 58 additions & 45 deletions public/Publish-AdfV2FromJson.ps1
@@ -142,15 +142,21 @@ function Publish-AdfV2FromJson {
$opt = New-AdfPublishOption
}

if ([string]::IsNullOrEmpty($opt.IncrementalDeploymentStorageUri) -and $opt.IncrementalDeployment)
{
Write-Warning "ADFT0033: Incremental Deployment Option DISABLED as Storage Uri is not provided."
$opt.IncrementalDeployment = $false
}

if (!$DryRun.IsPresent) {
Write-Host "STEP: Verifying whether ADF exists..."

$targetAdf = Get-AzDataFactoryV2 -ResourceGroupName "$ResourceGroupName" -Name "$DataFactoryName" -ErrorAction:Ignore
if ($targetAdf) {
Write-Host "Azure Data Factory exists."
if ($opt.IncrementalDeployment -and !$DryRun.IsPresent) {
Write-Host "Loading Deployment State from ADF..."
$ds.Deployed = Get-StateFromService -targetAdf $targetAdf
Write-Host "Loading Deployment State from Storage..."
$ds = Get-StateFromStorage -DataFactoryName $DataFactoryName -LocationUri $opt.IncrementalDeploymentStorageUri
}
}
else {
@@ -185,28 +191,28 @@ function Publish-AdfV2FromJson {

Write-Debug ($adf | Format-List | Out-String)

Write-Host "===================================================================================";
Write-Host "STEP: Pre-deployment"
if ($opt.IncrementalDeployment -and $opt.DeployGlobalParams) {
Write-Host "Incremental Deployment Mode: Preparing..."
Write-Debug "Incremental Deployment Mode: Checking whether factory file exist..."
if ($adf.Factories.Count -eq 0) {
Write-Debug "Creating empty factory file..."
$EmptyFactoryFileBody = '{ "name": "'+ $adf.Name +'", "properties": { "globalParameters": {} } }'
$o = New-Object -TypeName "AdfObject"
$o.Adf = $Adf
$o.Name = $DataFactoryName
$o.Type = 'factory'
$o.Body = $EmptyFactoryFileBody | ConvertFrom-Json
$o.FileName = Save-AdfObjectAsFile -obj $o
$adf.GlobalFactory.FilePath = $o.FileName
$adf.GlobalFactory.body = $EmptyFactoryFileBody
$adf.GlobalFactory.GlobalParameters = $o.Body.Properties.globalParameters
$adf.Factories.Add($o) | Out-Null
Write-Host ("Factories: 1 object created.")
}
Write-Host "Incremental Deployment Mode: Preparation Done"
}
# Write-Host "===================================================================================";
# Write-Host "STEP: Pre-deployment"
# if ($opt.IncrementalDeployment -and $opt.DeployGlobalParams) {
# Write-Host "Incremental Deployment Mode: Preparing..."
# Write-Debug "Incremental Deployment Mode: Checking whether factory file exist..."
# if ($adf.Factories.Count -eq 0) {
# Write-Debug "Creating empty factory file..."
# $EmptyFactoryFileBody = '{ "name": "'+ $adf.Name +'", "properties": { "globalParameters": {} } }'
# $o = New-Object -TypeName "AdfObject"
# $o.Adf = $Adf
# $o.Name = $DataFactoryName
# $o.Type = 'factory'
# $o.Body = $EmptyFactoryFileBody | ConvertFrom-Json
# $o.FileName = Save-AdfObjectAsFile -obj $o
# $adf.GlobalFactory.FilePath = $o.FileName
# $adf.GlobalFactory.body = $EmptyFactoryFileBody
# $adf.GlobalFactory.GlobalParameters = $o.Body.Properties.globalParameters
# $adf.Factories.Add($o) | Out-Null
# Write-Host ("Factories: 1 object created.")
# }
# Write-Host "Incremental Deployment Mode: Preparation Done"
# }

Write-Host "===================================================================================";
Write-Host "STEP: Replacing all properties environment-related..."
@@ -283,26 +289,32 @@ function Publish-AdfV2FromJson {
Write-Host "===================================================================================";
Write-Host "STEP: Updating (incremental) deployment state..."
if ($opt.IncrementalDeployment) {
if ($opt.DeployGlobalParams -eq $false) {
Write-Warning "Incremental Deployment State will not be saved as publish option 'DeployGlobalParams' = false"
} else {
Write-Debug "Deployment State -> SetStateFromAdf..."
$ds.SetStateFromAdf($adf)
$dsjson = ConvertTo-Json $ds -Depth 5
Write-Verbose "--- Deployment State: ---`r`n $dsjson"
$gp = [AdfGlobalParam]::new($ds)
$report = new-object PsObject -Property @{
Updated = 0
Added = 0
Removed = 0
}
Update-PropertiesForObject -o $adf.Factories[0] -action 'add' -path 'globalParameters.adftools_deployment_state' -value $gp -name 'type' -type 'factory' -report $report
# if ($opt.DeployGlobalParams -eq $false) {
# Write-Warning "Incremental Deployment State will not be saved as publish option 'DeployGlobalParams' = false"
# } else {
Write-Debug "Deployment State -> SetStateFromAdf..."
$ds.SetStateFromAdf($adf)
# $dsjson = ConvertTo-Json $ds -Depth 5
# Write-Verbose "--- Deployment State: ---`r`n $dsjson"
#$gp = [AdfGlobalParam]::new($ds)
# $report = new-object PsObject -Property @{
# Updated = 0
# Added = 0
# Removed = 0
# }
# Update-PropertiesForObject -o $adf.Factories[0] -action 'add' -path 'globalParameters.adftools_deployment_state' -value $gp -name 'type' -type 'factory' -report $report

#Write-Verbose "Redeploying Global Parameters..."
#$adf.Factories[0].Deployed = $false
#$adf.Factories[0].ToBeDeployed = $true
#Deploy-AdfObject -obj $adf.Factories[0]
# }

Write-Verbose "Redeploying Global Parameters..."
$adf.Factories[0].Deployed = $false
#$adf.Factories[0].ToBeDeployed = $true
Deploy-AdfObject -obj $adf.Factories[0]
}
# https://learn.microsoft.com/en-us/azure/storage/blobs/blob-powershell
# Set-Content -Path "adfdeploymentstate.json" -Value $dsjson -Encoding UTF8
# $ctx = New-AzStorageContext -UseConnectedAccount -StorageAccountName "sqlplayer2020"
# Set-AzStorageBlobContent -Container "adftools" -File "adfdeploymentstate.json" -Context $ctx -Blob "$DataFactoryName.adfdeploymentstate.json" -Force
Set-StateToStorage -ds $ds -DataFactoryName $DataFactoryName -LocationUri $opt.IncrementalDeploymentStorageUri
} else
{
Write-Host "Incremental Deployment State will not be saved as publish option 'IncrementalDeployment' = false"
@@ -320,8 +332,9 @@ function Publish-AdfV2FromJson {
$elapsedTime = new-timespan $script:StartTime $(get-date)
Write-Host "==============================================================================";
Write-Host " ***** Azure Data Factory files have been deployed successfully. *****`n";
Write-Host "Data Factory name: $DataFactoryName";
Write-Host "Region (Location): $location";
Write-Host " Data Factory name: $DataFactoryName";
Write-Host "Resource Group name: $ResourceGroupName";
Write-Host " Region (Location): $location";
Write-Host ([string]::Format(" Elapsed time: {0:d1}:{1:d2}:{2:d2}.{3:d3}`n", $elapsedTime.Hours, $elapsedTime.Minutes, $elapsedTime.Seconds, $elapsedTime.Milliseconds))
Write-Host "==============================================================================";

6 changes: 4 additions & 2 deletions test/!RunAllTests.ps1
@@ -12,7 +12,9 @@ Param(
[Switch]$InstallModules
)

Write-Host " ========= ENVIRONMENT =========="
Write-Host "Host Name: $($Host.name)"
Write-Host "PowerShell Version: $($PSVersionTable.PSVersion)"

$rootPath = Switch ($Host.name) {
'Visual Studio Code Host' { split-path $psEditor.GetEditorContext().CurrentFile.Path }
@@ -25,8 +27,8 @@ $folder = Split-Path $rootPath -Parent

Write-Host "Setting new location: $folder"
Push-Location "$folder"
Get-Location | Out-Host

Get-Location
Write-Host " ========= ENVIRONMENT =========="

# Add the module location to the value of the PSModulePath environment variable
#$p = [Environment]::GetEnvironmentVariable("PSModulePath")