feat: Resumable Full Load In Mongodb #61

Merged on Feb 18, 2025 (23 commits)

Conversation

@vikash390 (Collaborator) commented Jan 15, 2025

Description

Added logging for failed chunks in the terminal and implemented handling to retry processing them efficiently.
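As a rough illustration of what "log failed chunks and retry them" could look like, here is a minimal sketch. It is not the code from this PR: the `Chunk` type, `processWithRetry`, and `exampleRetry` are hypothetical names, and the real connector's retry policy may differ.

```go
package main

import (
	"errors"
	"log"
)

// Chunk is a hypothetical stand-in for the PR's types.Chunk.
type Chunk struct {
	Min string
	Max string
}

// processWithRetry runs process on every chunk, logs each failure, and
// re-runs only the failed chunks until maxAttempts passes have been made.
// It returns the chunks that still failed on the final pass.
func processWithRetry(chunks []Chunk, maxAttempts int, process func(Chunk) error) []Chunk {
	pending := chunks
	for attempt := 1; attempt <= maxAttempts && len(pending) > 0; attempt++ {
		var failed []Chunk
		for _, c := range pending {
			if err := process(c); err != nil {
				log.Printf("chunk [%s, %s) failed on attempt %d: %v", c.Min, c.Max, attempt, err)
				failed = append(failed, c)
			}
		}
		pending = failed
	}
	return pending
}

// exampleRetry simulates a chunk that fails once and then succeeds.
// It returns how many chunks remain failed and how many calls were made.
func exampleRetry() (remaining int, calls int) {
	chunks := []Chunk{{Min: "a", Max: "b"}, {Min: "b", Max: "c"}, {Min: "c", Max: ""}}
	failedOnce := false
	left := processWithRetry(chunks, 3, func(c Chunk) error {
		calls++
		if c.Min == "b" && !failedOnce {
			failedOnce = true
			return errors.New("simulated transient error")
		}
		return nil
	})
	return len(left), calls
}
```

Only the failed chunks are retried, so chunks that already succeeded are not read again.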

Fixes # (issue)

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

  • Scenario A
  • Scenario B

Screenshots or Recordings

Related PR's (If Any):

@vikash390 vikash390 changed the title Add logging for failed chunks in terminal (work in progress) feat:Add logging and handling for failed chunks Jan 19, 2025
@hash-data hash-data linked an issue Jan 20, 2025 that may be closed by this pull request
@vikash390 vikash390 marked this pull request as ready for review January 30, 2025 08:12
@hash-data hash-data changed the title feat:Add logging and handling for failed chunks feat: Resumable Full Load In Mongodb Jan 30, 2025
@hash-data (Collaborator) left a comment

More improvements required

Comment on lines 57 to 68
if end.After(last) {
	boundry = types.Chunk{
		Min: generateMinObjectID(start),
		Max: "",
	}
	logger.Info("scheduling last full load chunk query!")
} else {
	boundry = types.Chunk{
		Min: generateMinObjectID(start),
		Max: generateMinObjectID(end),
	}
}
Contributor

This can be simplified: create the boundry struct once, at the end.
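One way to read this suggestion is sketched below: decide the upper bound first, then build the struct in a single place. This is an assumption about what the reviewer meant, not the code that was merged. `Chunk`, `minObjectIDHex`, `chunkBoundary`, and `exampleBoundaries` are hypothetical stand-ins; in particular, `minObjectIDHex` assumes `generateMinObjectID` returns the hex form of the smallest ObjectID for a timestamp (4-byte big-endian seconds followed by eight zero bytes). The log call from the original branch is left out to keep the sketch dependency-free.

```go
package main

import (
	"fmt"
	"time"
)

// Chunk mirrors the Min/Max fields of types.Chunk used in the snippet.
type Chunk struct {
	Min string
	Max string
}

// minObjectIDHex is an assumed equivalent of generateMinObjectID:
// the timestamp as 8 hex digits followed by 16 zero digits.
func minObjectIDHex(t time.Time) string {
	return fmt.Sprintf("%08x%016x", uint32(t.Unix()), 0)
}

// chunkBoundary builds the struct once. The last chunk keeps an empty
// Max, which the original snippet uses to mean "no upper bound".
func chunkBoundary(start, end, last time.Time) Chunk {
	maxID := ""
	if !end.After(last) {
		maxID = minObjectIDHex(end)
	}
	return Chunk{Min: minObjectIDHex(start), Max: maxID}
}

// exampleBoundaries returns one middle chunk and one final chunk.
func exampleBoundaries() (middle Chunk, final Chunk) {
	last := time.Unix(3, 0)
	middle = chunkBoundary(time.Unix(1, 0), time.Unix(2, 0), last)
	final = chunkBoundary(time.Unix(2, 0), time.Unix(4, 0), last)
	return middle, final
}
```

The branch now only chooses the value of `Max`, so `Min` is computed in one place instead of two.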

}
}
chunks = append(chunks, boundry)
chunk := types.Chunk{
Contributor

Why are you creating a chunk variable again?

stream.AppendChunksToStreamState(chunk)
start = end
}

}

Contributor

What if the node crashes here? Where are the chunks being written to the state file?

types/state.go Outdated
@@ -81,36 +81,69 @@ func (s *State) MarshalJSON() ([]byte, error) {
return json.Marshal(p)
}

// DualSyncMap struct to hold Cursor and Chunks
Contributor

Change this comment. Please self-review the PR line by line.

})
return chunks
}
func (s *ConfiguredStream) UpdateChunkStatusInStreamState(chunkID string, newStatus string) {
Contributor

We need to persist to disk here, as an OOM can occur at any point.
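A minimal sketch of persisting on every status change follows, assuming the state is small enough to rewrite in full each time. All names here (`chunkState`, `stateFile`, `updateChunkStatus`, `persistLocked`, `loadChunks`, `exampleResume`) are hypothetical and are not the PR's `State` or `ConfiguredStream` API. The write goes to a temporary file that is synced and then renamed over the real file, so a crash mid-write should leave the previous state file intact. The sketch does not fsync the parent directory, which a stricter implementation might add.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// chunkState is a hypothetical record of one chunk's progress.
type chunkState struct {
	ID     string `json:"id"`
	Status string `json:"status"`
}

// stateFile is a hypothetical stand-in for the connector's state.
type stateFile struct {
	mu     sync.Mutex
	path   string
	chunks []chunkState
}

// updateChunkStatus changes one chunk's status and writes the whole
// state to disk before returning.
func (s *stateFile) updateChunkStatus(id, status string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	for i := range s.chunks {
		if s.chunks[i].ID == id {
			s.chunks[i].Status = status
			return s.persistLocked()
		}
	}
	return fmt.Errorf("chunk %q not found in state", id)
}

// persistLocked writes to a temp file in the same directory, syncs it,
// then renames it over the real file. Callers must hold s.mu.
func (s *stateFile) persistLocked() error {
	data, err := json.Marshal(s.chunks)
	if err != nil {
		return err
	}
	tmp, err := os.CreateTemp(filepath.Dir(s.path), "state-*.tmp")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // harmless once the rename has succeeded
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), s.path)
}

// loadChunks reads the chunk list back, as a restarted process would.
func loadChunks(path string) ([]chunkState, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var chunks []chunkState
	err = json.Unmarshal(data, &chunks)
	return chunks, err
}

// exampleResume marks one chunk as done, then reloads from disk.
func exampleResume() ([]chunkState, error) {
	dir, err := os.MkdirTemp("", "state-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	s := &stateFile{
		path:   filepath.Join(dir, "state.json"),
		chunks: []chunkState{{ID: "c1", Status: "pending"}, {ID: "c2", Status: "pending"}},
	}
	if err := s.updateChunkStatus("c1", "succeeded"); err != nil {
		return nil, err
	}
	return loadChunks(s.path)
}
```

On restart, a process that reloads this file can skip chunks already marked as succeeded and schedule only the rest.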

@shubham19may shubham19may changed the base branch from master to development February 14, 2025 13:54
@hash-data
Collaborator

Will add defaultFlag for partitioning path in this PR

@hash-data hash-data changed the base branch from development to master February 14, 2025 19:52
logger/logger.go Outdated
@@ -180,7 +124,8 @@ func FileLogger(content any, filePath string, fileName, fileExtension string) er

func Init() {
// Configure lumberjack for log rotation
timestamp := fmt.Sprintf("%d-%d-%d_%d-%d-%d", time.Now().Year(), time.Now().Month(), time.Now().Day(), time.Now().Hour(), time.Now().Minute(), time.Now().Second())
timeStamp := time.Now().UTC()
Contributor

Name it something else. Two variables should not share a name that differs only in letter case.

Collaborator

Will do.
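For reference, the log-file timestamp in the excerpt above can be built from a single clock reading under one unambiguous name. This is only a sketch: `logTimestampLayout`, `logFileTimestamp`, and `exampleLogFileTimestamp` are hypothetical names, and the layout is an assumption. Unlike the `%d`-based version, this layout zero-pads each field.

```go
package main

import "time"

// logTimestampLayout is an assumed layout; Go layouts are written in
// terms of the reference time Mon Jan 2 15:04:05 MST 2006.
const logTimestampLayout = "2006-01-02_15-04-05"

// logFileTimestamp formats one clock reading in UTC, so every field
// comes from the same instant.
func logFileTimestamp(now time.Time) string {
	return now.UTC().Format(logTimestampLayout)
}

// exampleLogFileTimestamp formats a fixed instant for illustration.
func exampleLogFileTimestamp() string {
	return logFileTimestamp(time.Date(2025, time.February, 18, 7, 1, 5, 0, time.UTC))
}
```

Calling `exampleLogFileTimestamp()` returns `2025-02-18_07-01-05`. In `Init`, the call would be `logFileTimestamp(time.Now())`.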

@hash-data hash-data changed the base branch from master to development February 18, 2025 06:56
@hash-data hash-data merged commit 1a9b2ff into development Feb 18, 2025
5 checks passed
@hash-data hash-data deleted the feat/state-controller branch February 18, 2025 07:01
hash-data added a commit that referenced this pull request Feb 18, 2025
Co-authored-by: Datazip <datazip@Datazips-MBP.localdomain>
Co-authored-by: Datazip <datazip@Datazips-MacBook-Pro.local>
Co-authored-by: Shubham Baldava <shubham@datazip.io>
Co-authored-by: hash-data <ankit@datazip.io>
Development

Successfully merging this pull request may close these issues.

[Feat] Resumable full load
3 participants