Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Programming exercises: Improve repository diff report generation performance #10374

Open
wants to merge 9 commits into
base: develop
Choose a base branch
from

Conversation

SimonEntholzer
Copy link
Contributor

@SimonEntholzer SimonEntholzer commented Feb 21, 2025

Checklist

General

Server

  • Important: I implemented the changes with a very good performance and prevented too many (unnecessary) and too complex database calls.
  • I strictly followed the principle of data economy for all database calls.
  • I strictly followed the server coding and design guidelines.
  • I added multiple integration tests (Spring) related to the features (with a high test coverage).
  • I documented the Java code using JavaDoc style.

Changes affecting Programming Exercises

  • High priority: I tested all changes and their related features with all corresponding user types on a test server configured with the integrated lifecycle setup (LocalVC and LocalCI).

Motivation and Context

The performance of endpoint programming-exercises/{exerciseId}/submissions/{submissionId1}/diff-report/{submissionId2} can be improved for LocalVC, as we can avoid checking out the repository.

Description

Improves performance of the diff report generation, (between commits of one repository) by not checking out the repos, but using the local path for LocalVC repositories.
I didn't find a way to make diff generation work with this method between two different repositories (e.g. solution and template)

Steps for Testing

  1. Create a Programming exercise
  2. Use the review changes button
    image
  3. Create an exam, and participate with a student in the programming exercise with multiple submissions
  4. After the exam, as the instructor check the exam timeline component to view the diffs of the submissions (exam -> student exams -> view -> exam timeline)

Make sure the repository diff view works everywhere as intended.

Testserver States

You can manage test servers using Helios. Check environment statuses in the environment list. To deploy to a test server, go to the CI/CD page, find your PR or branch, and trigger the deployment.

Review Progress

Code Review

  • Code Review 1
  • Code Review 2

Manual Tests

  • Test 1
  • Test 2

Test Coverage

Screenshots

image

Summary by CodeRabbit

  • New Features

    • Diff report generation now tracks changes between submissions more reliably.
    • Repository status displays now offer clearer insights into working copy cleanliness.
  • Refactor

    • Streamlined the process for fetching test repository content for improved stability and error handling.
    • Consolidated diff report functionality by removing redundant components to enhance overall performance.

@SimonEntholzer SimonEntholzer requested a review from a team as a code owner February 21, 2025 08:35
@github-actions github-actions bot added server Pull requests that update Java code. (Added Automatically!) iris Pull requests that affect the corresponding module programming Pull requests that affect the corresponding module labels Feb 21, 2025
Copy link

coderabbitai bot commented Feb 21, 2025

Walkthrough

The pull request refactors repository handling and diff report generation across multiple components. It modifies how test repository contents are fetched in the Pyris service by replacing an Optional<Repository> with a map and improving error handling. The changes remove the CommitHistoryService, update method visibility and naming in GitService, and streamline the diff report generation process in the ProgrammingExerciseGitDiffReport service and its corresponding web resource. In addition, repository cleanliness checks have been updated consistently from isClean to isWorkingCopyClean.

Changes

File(s) Change Summary
src/.../iris/service/pyris/PyrisDTOService.java Replaced an Optional<Repository> with a Map<String, String> for test repository contents; restructured fetching logic and enhanced error handling.
src/.../programming/service/CommitHistoryService.java Removed the CommitHistoryService class, which previously handled git diff report generation between commits.
src/.../programming/service/GitService.java
src/.../programming/service/RepositoryService.java
src/.../programming/web/repository/RepositoryResource.java
Renamed methods from isClean to isWorkingCopyClean and updated method visibility (e.g., changed getOrCheckoutRepository to private) to clarify repository status checks.
src/.../programming/service/ProgrammingExerciseGitDiffReportService.java
src/.../programming/web/ProgrammingExerciseGitDiffReportResource.java
Refactored diff report generation to use repository URIs and commit hashes; introduced new methods (e.g., createReportForTemplateWithSolution, generateReportForCommits) and removed dependency on CommitHistoryService.

Sequence Diagram(s)

sequenceDiagram
    participant C as Client
    participant R as GitDiffReportResource
    participant S as GitDiffReportService
    participant G as Git Engine

    C->>R: GET /git-diff-report/{exerciseId}/{commitHash1}/{commitHash2}
    R->>S: generateReportForCommits(repositoryUri, commitHash1, commitHash2)
    S->>S: prepareRepository(repositoryUri)
    S->>G: Retrieve repository state and commit details
    G-->>S: Commit details, diff data
    S->>S: createReport(commit details)
    S-->>R: ProgrammingExerciseGitDiffReport
    R-->>C: 200 OK with diff report
Loading

Possibly related PRs

Suggested labels

tests, server, performance, ready for review, programming

Suggested reviewers

  • BBesrour
  • Hialus
  • krusche
  • coolchock
  • az108
  • JohannesStoehr
  • simonbohnen
✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai or @coderabbitai title anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (6)
src/main/java/de/tum/cit/aet/artemis/iris/service/pyris/PyrisDTOService.java (1)

69-87: Refactor to avoid duplicating repository-fetching logic.
These lines largely replicate the logic already encapsulated in getRepositoryContents(...) (lines 163–189). Extract a shared method or reuse getRepositoryContents(...) to maintain consistency and reduce code duplication.

src/main/java/de/tum/cit/aet/artemis/programming/service/ProgrammingExerciseGitDiffReportService.java (4)

185-185: Consider implementing LocalVC approach
The TODO comments reference bypassing a full checkout for LocalVC. I can help propose a bare repository approach or partial clone to improve performance.

Also applies to: 189-190


191-192: Potential concurrency or performance overhead
Performing checkouts in parallel might warrant concurrency safeguards. Consider a bare repository approach for improved performance under load.


259-260: Rename local variable
Since this is a generic prepareRepository method, renaming the variable for clarity would align better with its usage.

-Repository templateRepo = gitService.getOrCheckoutRepository(vcsRepositoryUri, true);
+Repository repository = gitService.getOrCheckoutRepository(vcsRepositoryUri, true);

324-343: Graceful handling of missing commits
Returning null could lead to null-pointer issues. Throwing a dedicated exception or returning an Optional might be safer.

src/main/java/de/tum/cit/aet/artemis/programming/service/RepositoryService.java (1)

192-192: Consider implementing the suggested enhancement.

The TODO comment suggests offering the method without requiring a commitId parameter. This could help improve performance by avoiding unnecessary commit lookups when the latest state is sufficient.

Would you like me to help implement this enhancement? I can propose a solution that handles both cases (with and without commitId).

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f9490fc and c24a436.

📒 Files selected for processing (7)
  • src/main/java/de/tum/cit/aet/artemis/iris/service/pyris/PyrisDTOService.java (1 hunks)
  • src/main/java/de/tum/cit/aet/artemis/programming/service/CommitHistoryService.java (0 hunks)
  • src/main/java/de/tum/cit/aet/artemis/programming/service/GitService.java (2 hunks)
  • src/main/java/de/tum/cit/aet/artemis/programming/service/ProgrammingExerciseGitDiffReportService.java (9 hunks)
  • src/main/java/de/tum/cit/aet/artemis/programming/service/RepositoryService.java (3 hunks)
  • src/main/java/de/tum/cit/aet/artemis/programming/web/ProgrammingExerciseGitDiffReportResource.java (3 hunks)
  • src/main/java/de/tum/cit/aet/artemis/programming/web/repository/RepositoryResource.java (1 hunks)
💤 Files with no reviewable changes (1)
  • src/main/java/de/tum/cit/aet/artemis/programming/service/CommitHistoryService.java
🧰 Additional context used
📓 Path-based instructions (1)
`src/main/java/**/*.java`: naming:CamelCase; principles:{sin...

src/main/java/**/*.java: naming:CamelCase; principles:{single_responsibility,small_methods,no_duplication}; db:{perf_queries,datetime_not_timestamp}; rest:{stateless,singleton,delegate_logic,http_only,minimal_dtos}; dtos:{java_records,no_entities,min_data,single_resp}; di:constructor_injection; kiss:simple_code; file_handling:os_indep_paths; practices:{least_access,avoid_transactions,code_reuse,static_member_ref,prefer_primitives}; sql:{param_annotation,uppercase,avoid_subqueries};java:avoid_star_imports

  • src/main/java/de/tum/cit/aet/artemis/programming/web/repository/RepositoryResource.java
  • src/main/java/de/tum/cit/aet/artemis/iris/service/pyris/PyrisDTOService.java
  • src/main/java/de/tum/cit/aet/artemis/programming/web/ProgrammingExerciseGitDiffReportResource.java
  • src/main/java/de/tum/cit/aet/artemis/programming/service/GitService.java
  • src/main/java/de/tum/cit/aet/artemis/programming/service/RepositoryService.java
  • src/main/java/de/tum/cit/aet/artemis/programming/service/ProgrammingExerciseGitDiffReportService.java
🧠 Learnings (1)
src/main/java/de/tum/cit/aet/artemis/iris/service/pyris/PyrisDTOService.java (3)
Learnt from: julian-christl
PR: ls1intum/Artemis#9322
File: src/main/java/de/tum/cit/aet/artemis/programming/repository/ProgrammingExerciseStudentParticipationRepository.java:0-0
Timestamp: 2024-11-12T12:51:51.201Z
Learning: In Artemis, an `ExerciseGroup` always has an associated `Exam`, so `exerciseGroup.exam` is never null.
Learnt from: julian-christl
PR: ls1intum/Artemis#9322
File: src/main/java/de/tum/cit/aet/artemis/programming/repository/ProgrammingExerciseStudentParticipationRepository.java:170-172
Timestamp: 2024-11-12T12:51:46.554Z
Learning: In Artemis, `exercise.exerciseGroup` may be null, as not all exercises belong to an `ExerciseGroup`.
Learnt from: Hialus
PR: ls1intum/Artemis#8607
File: src/main/java/de/tum/in/www1/artemis/web/rest/programming/ProgrammingExerciseResource.java:64-64
Timestamp: 2024-11-12T12:51:51.201Z
Learning: For the Artemis project, import statements are automatically managed by formatters and should not be commented on.
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: Call Build Workflow / Build and Push Docker Image
  • GitHub Check: Call Build Workflow / Build .war artifact
  • GitHub Check: client-style
  • GitHub Check: client-tests
  • GitHub Check: server-style
  • GitHub Check: server-tests
  • GitHub Check: Analyse
🔇 Additional comments (23)
src/main/java/de/tum/cit/aet/artemis/iris/service/pyris/PyrisDTOService.java (2)

65-65: Initialization of an empty Map looks good.
This ensures a safe default if the repository retrieval fails.


67-68: Consider null checks for the test repository URI.
If exercise.getVcsTestRepositoryUri() can ever be null, it may cause unexpected exceptions further down the pipeline.

src/main/java/de/tum/cit/aet/artemis/programming/service/ProgrammingExerciseGitDiffReportService.java (11)

7-7: Imports look good
The usage of StandardCharsets and JGit diff classes is appropriate.

Also applies to: 17-19


28-28: ProfileService injection
Injecting ProfileService is coherent with the local VCS logic. Its usage at line 325 to check local VCS mode is consistent.

Also applies to: 66-66, 70-71, 79-79


117-117: Report creation invoked
The call to createReportForTemplateWithSolution fits the new refactor. Assigning commit hashes afterward is clear.


195-195: FileTreeIterator usage
Initializing a new tree parser for the submission’s repository is correct for diffing.


197-197: Switching back to default branch
This ensures the repository returns to a clean state post-diff.


232-233: Documentation for template/solution URIs
These parameters are described well.


237-245: Refactored method for template/solution
Encapsulating the diff creation logic between template and solution helps maintain clarity. The TODO for LocalVC remains valid.


253-253: Documentation clarified
Specifying that the repository is checked out and reset to HEAD helps explain the approach.

Also applies to: 255-255


313-323: Method documentation
Great explanation of how to generate a diff for two commits.


345-356: Method documentation
The JavaDoc clearly outlines the commit-based diff logic.


357-383: Diff creation logic
This is a straightforward usage of DiffFormatter; rename detection and handling of identical commits is well-managed.

src/main/java/de/tum/cit/aet/artemis/programming/web/ProgrammingExerciseGitDiffReportResource.java (5)

65-65: Constructor injection
Accepting ParticipationAuthorizationCheckService and RepositoryService in the constructor aligns with the refactored structure.


125-125: Doc improvement
The endpoint’s description is clearer with the new path details.


129-130: Parameter documentation
Providing explicit parameter descriptions benefits API consumers.


137-137: Method signature rename
Switching to submissionId instead of submissionId1 simplifies understanding of the parameter’s role.


142-142: Submission retrieval
Leveraging orElseThrow ensures valid submission handling.

src/main/java/de/tum/cit/aet/artemis/programming/web/repository/RepositoryResource.java (1)

333-333: Renamed method usage
Adopting isWorkingCopyClean aligns with the updated naming convention and clarifies the purpose of the check.

Also applies to: 337-337

src/main/java/de/tum/cit/aet/artemis/programming/service/RepositoryService.java (2)

486-488: LGTM! The new method name is more descriptive.

The rename from isClean to isWorkingCopyClean better describes the method's purpose, making it clear that it checks the cleanliness of the working copy.


499-501: LGTM! The new method name is more descriptive.

The rename from isClean to isWorkingCopyClean better describes the method's purpose, making it clear that it checks the cleanliness of the working copy. This change maintains consistency with the single-parameter version.

src/main/java/de/tum/cit/aet/artemis/programming/service/GitService.java (2)

346-346: LGTM! Improved encapsulation by making the method private.

The visibility change from public to private better encapsulates this internal helper method, following the principle of least privilege.


1111-1116: LGTM! The new method name is more descriptive.

The rename from isClean to isWorkingCopyClean better describes the method's purpose, making it clear that it checks the cleanliness of the working copy. This change maintains consistency with the rest of the codebase.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
iris Pull requests that affect the corresponding module programming Pull requests that affect the corresponding module server Pull requests that update Java code. (Added Automatically!)
Projects
Status: Work In Progress
Development

Successfully merging this pull request may close these issues.

2 participants