Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow for passing in multiple directories #338

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

rchatrath7
Copy link

@rchatrath7 rchatrath7 commented Feb 6, 2025

On large projects, I will have files that are contextually related but in separate sub directories that I would like to bundle together. However, due to token limits, I don't necessarily want to find the closest common parent directory and bundle the entire thing. What I really wanted was an option to be able to pass in multiple directories as arguments and have repomix bundle them all together automatically, something like

repomix path/to/foo path/to/bar ...

This PR adds support for having the directory argument be an arbitrary number of directories instead of a single string and collects all of that into one output.

➜  repomix git:(rc/multiple-dir-paths) npm run repomix src/cli src/core

yields

➜  repomix git:(rc/multiple-dir-paths) cat repomix-output.xml | rg -U "<directory_structure>([\s\S]*?)</directory_structure>" -r '$1'

actions/
  defaultAction.ts
  initAction.ts
  migrationAction.ts
  remoteAction.ts
  versionAction.ts
file/
  workers/
    fileCollectWorker.ts
    fileProcessWorker.ts
  fileCollect.ts
  fileManipulate.ts
  filePathSort.ts
  fileProcess.ts
  fileSearch.ts
  fileTreeGenerate.ts
  fileTypes.ts
  gitCommand.ts
  packageJsonParse.ts
  permissionCheck.ts
metrics/
  workers/
    fileMetricsWorker.ts
    outputMetricsWorker.ts
    types.ts
  calculateAllFileMetrics.ts
  calculateMetrics.ts
  calculateOutputMetrics.ts
output/
  outputStyles/
    markdownStyle.ts
    plainStyle.ts
    xmlStyle.ts
  outputGenerate.ts
  outputGeneratorTypes.ts
  outputStyleDecorate.ts
packager/
  copyToClipboardIfEnabled.ts
  writeOutputToDisk.ts
security/
  workers/
    securityCheckWorker.ts
  filterOutUntrustedFiles.ts
  securityCheck.ts
  validateFileSafety.ts
tokenCount/
  tokenCount.ts
cliPrint.ts
cliRun.ts
cliSpinner.ts
packager.ts

Checklist

  • Run npm run test
  • Run npm run lint

Copy link
Contributor

coderabbitai bot commented Feb 6, 2025

📝 Walkthrough

Walkthrough

The changes update several functions and tests across the codebase to handle multiple directories instead of a single directory string. The runDefaultAction, executeAction, generateOutput, and pack functions now accept an array of directory paths. This adjustment propagates through the CLI, remote actions, configuration building, file searching, and output generation. The internal logic now maps over an array of directories, aggregating file search results and resolving paths in parallel. Test cases have been updated accordingly to pass arrays rather than single strings. Overall, these modifications improve consistency in how directory inputs are processed while preserving the existing functionality.

Sequence Diagram(s)

sequenceDiagram
  participant CLI
  participant CLI_Run
  participant DefaultAction
  participant Packager
  participant OutputGen

  CLI->>CLI_Run: Pass directories array, cwd, and options
  CLI_Run->>DefaultAction: executeAction(directories[], cwd, options)
  DefaultAction->>Packager: runDefaultAction(directories[], cwd, options)
  Packager->>Packager: Resolve and search files for each directory
  Packager->>OutputGen: generateOutput(directories[], config, ...)
  OutputGen-->>Packager: Return processed output
  Packager-->>DefaultAction: Return aggregated result
  DefaultAction-->>CLI_Run: Return final result
  CLI_Run-->>CLI: Display output
Loading

Possibly related PRs

  • feat: Add accurate codebase scope description to output header #330: The changes in the main PR are related to the modifications in the runDefaultAction function's parameter handling, which is also reflected in the runRemoteAction method of the retrieved PR, indicating a direct connection in their code-level changes.
  • feat(website): Improve performance of large repository output using Ace Editor #313: The changes in the main PR are related to the modifications made to the runDefaultAction function's parameter handling, which is also reflected in the runRemoteAction method of the retrieved PR, indicating a direct connection at the code level.
  • XML Escaping #287: The changes in the main PR are related to the modifications in the runDefaultAction function's parameter handling, which is also reflected in the retrieved PR's updates to the same function's invocation in runRemoteAction.

Tip

🌐 Web search-backed reviews and chat
  • We have enabled web search-based reviews and chat for all users. This feature allows CodeRabbit to access the latest documentation and information on the web.
  • You can disable this feature by setting web_search: false in the knowledge_base settings.
  • Please share any feedback in the Discord discussion.
✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
tests/cli/actions/defaultAction.test.ts (1)

60-71: Add test cases for multiple directories functionality.

While the test has been updated to use an array, it only tests with a single directory. Consider adding the following test cases to validate the new multi-directory functionality:

  • Test with multiple valid directories
  • Test with an empty array
  • Test with invalid directory paths

Example test case:

it('should handle multiple directories successfully', async () => {
  const options: CliOptions = {
    output: 'custom-output.txt',
    verbose: true,
  };

  await runDefaultAction(['path/to/foo', 'path/to/bar'], process.cwd(), options);

  expect(configLoader.loadFileConfig).toHaveBeenCalled();
  expect(configLoader.mergeConfigs).toHaveBeenCalled();
  expect(packager.pack).toHaveBeenCalled();
});

it('should handle empty directory array', async () => {
  const options: CliOptions = {};
  await expect(runDefaultAction([], process.cwd(), options))
    .rejects
    .toThrow('At least one directory must be specified');
});

it('should handle invalid directory paths', async () => {
  const options: CliOptions = {};
  await expect(runDefaultAction(['non/existent/path'], process.cwd(), options))
    .rejects
    .toThrow('Directory not found');
});
🧹 Nitpick comments (7)
src/core/packager.ts (1)

47-51: Simplify array flattening using Array.prototype.flat()

The current code uses reduce with concat to flatten the array of collected files. This can be simplified by using flat(), improving readability and efficiency.

Apply this diff to simplify the code:

 const rawFiles = (
   await Promise.all(
     filePathsByDir.map(({ rootDir, filePaths }) =>
       deps.collectFiles(filePaths, rootDir, progressCallback)
     )
   )
- ).reduce((acc: RawFile[], curr: RawFile[]) => acc.concat(...curr), []);
+ ).flat();
tests/core/output/outputStyles/xmlStyle.test.ts (1)

26-26: Add test cases for multiple directories

To fully validate the updated functionality of handling multiple directories, consider adding test cases where generateOutput receives multiple directory paths. This will ensure the function behaves as expected with various inputs.

tests/core/output/outputStyles/plainStyle.test.ts (1)

26-26: Consider adding test cases for multiple directories

Since generateOutput now handles multiple directories, enhancing the test to include scenarios with multiple directory inputs will ensure comprehensive test coverage and verify correct functionality across different cases.

tests/core/output/outputGenerate.test.ts (1)

25-25: LGTM! Consider adding test cases for multiple directories.

The changes correctly update the generateOutput function calls to accept an array of directories. However, all test cases only test with a single directory ([process.cwd()]).

Consider adding test cases that verify the behavior with multiple directories to ensure the new functionality works as expected. For example:

const output = await generateOutput([process.cwd(), 'path/to/second/dir'], mockConfig, mockProcessedFiles, []);

Also applies to: 51-51, 86-86

tests/core/packager.test.ts (1)

72-72: LGTM! Consider expanding test coverage.

The changes correctly update the test to handle arrays of directories. However, like the previous file, the test only covers the single directory case.

Consider adding test cases that verify:

  1. Multiple directories being packed together
  2. Edge cases like empty array of directories
  3. Duplicate directories in the array

Also applies to: 84-84

src/cli/cliRun.ts (1)

46-46: LGTM! Consider updating the CLI description.

The changes correctly implement support for multiple directories. The default value of ['.'] maintains backward compatibility.

Consider updating the program description to explicitly mention the support for multiple directories. For example:

-      .description('Repomix - Pack your repository into a single AI-friendly file')
+      .description('Repomix - Pack one or more directories into a single AI-friendly file')

Also applies to: 76-76, 84-84, 105-105

src/cli/actions/defaultAction.ts (1)

107-182: Improve code consistency in buildCliConfig.

The refactoring of the buildCliConfig function improves code consistency by using the spread operator uniformly. This makes the code more maintainable and easier to read.

Consider extracting the repeated pattern into a helper function to reduce duplication:

+const updateOutput = (cliConfig: RepomixConfigCli, key: string, value: unknown): void => {
+  cliConfig.output = {
+    ...cliConfig.output,
+    [key]: value,
+  };
+};

 const buildCliConfig = (options: CliOptions): RepomixConfigCli => {
   const cliConfig: RepomixConfigCli = {};
   // ... other code ...
   if (options.topFilesLen !== undefined) {
-    cliConfig.output = {
-      ...cliConfig.output,
-      topFilesLength: options.topFilesLen,
-    };
+    updateOutput(cliConfig, 'topFilesLength', options.topFilesLen);
   }
   // Apply similar changes to other output updates
   return repomixConfigCliSchema.parse(cliConfig);
 };
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 836abcd and 2e57844.

📒 Files selected for processing (12)
  • src/cli/actions/defaultAction.ts (4 hunks)
  • src/cli/actions/remoteAction.ts (1 hunks)
  • src/cli/cliRun.ts (3 hunks)
  • src/core/output/outputGenerate.ts (4 hunks)
  • src/core/packager.ts (4 hunks)
  • tests/cli/actions/defaultAction.test.ts (22 hunks)
  • tests/cli/cliRun.test.ts (11 hunks)
  • tests/core/output/outputGenerate.test.ts (3 hunks)
  • tests/core/output/outputStyles/plainStyle.test.ts (1 hunks)
  • tests/core/output/outputStyles/xmlStyle.test.ts (1 hunks)
  • tests/core/packager.test.ts (2 hunks)
  • tests/integration-tests/packager.test.ts (2 hunks)
🔇 Additional comments (13)
src/core/packager.ts (1)

24-24: Update function parameter to accept multiple directories

Changing the pack function's rootDir parameter to rootDirs: string[] appropriately enables processing multiple directories. This change is correctly reflected in the function signature and facilitates the intended functionality.

tests/core/output/outputStyles/xmlStyle.test.ts (1)

26-26: Update test to pass multiple directories to generateOutput

Changing the argument to [process.cwd()] correctly aligns with the updated generateOutput function signature that now expects an array of directories.

tests/core/output/outputStyles/plainStyle.test.ts (1)

26-26: Update test to pass multiple directories to generateOutput

Changing the argument to [process.cwd()] correctly reflects the updated function signature of generateOutput that now accepts an array of directories.

src/cli/actions/remoteAction.ts (1)

50-50: LGTM! The change maintains compatibility with the new interface.

The update correctly wraps tempDirPath in an array to match the new runDefaultAction signature while maintaining the existing remote repository processing logic.

tests/integration-tests/packager.test.ts (1)

77-77: LGTM! Test updated to support multiple directories.

The test has been correctly updated to pass the input directory as an array, aligning with the new multi-directory support feature.

src/cli/actions/defaultAction.ts (3)

23-27: LGTM! Function signature updated to support multiple directories.

The runDefaultAction function now accepts an array of directories, aligning with the PR's objective to support multiple input directories.


46-46: LGTM! Path resolution handles multiple directories.

The code efficiently maps over the directories array to resolve all paths in parallel.


54-54: LGTM! Pack function call updated for multiple directories.

The pack function is now called with the array of target paths, maintaining consistency with the multi-directory support.

src/core/output/outputGenerate.ts (3)

7-7: LGTM! Improved type safety with explicit type import.

Added FileSearchResult type import to enhance type safety.


114-116: LGTM! Function signatures updated for multiple directories.

Both generateOutput and buildOutputGeneratorContext functions now accept an array of root directories, aligning with the multi-directory support feature.

Also applies to: 134-136


154-162: LGTM! Efficient parallel directory search implementation.

The code efficiently searches multiple directories in parallel using Promise.all and reduces the results into a single FileSearchResult. This implementation maintains good performance while supporting multiple directories.

tests/cli/cliRun.test.ts (1)

128-130: LGTM! Tests updated to support multiple directories.

All test cases have been consistently updated to pass directory arguments as arrays, maintaining comprehensive test coverage while supporting the new multi-directory feature.

Also applies to: 134-136, 140-141, 147-148, 154-156, 165-169, 177-181, 191-195, 203-207, 215-219, 227-231, 239-243, 251-257, 265-271, 279-285

tests/cli/actions/defaultAction.test.ts (1)

73-459: LGTM!

The feature-specific tests have been consistently updated to use the new array syntax while maintaining their focus on verifying the configuration options. Since these tests are primarily concerned with validating feature behavior rather than directory handling, using a single directory is acceptable.

Comment on lines +39 to +44
const filePathsByDir = await Promise.all(
rootDirs.map(async (rootDir) => ({
rootDir,
filePaths: (await deps.searchFiles(rootDir, config)).filePaths,
})),
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Handle potential errors during file search across multiple directories

Using Promise.all assumes that all searchFiles operations will succeed. If searchFiles fails for any directory, it will reject the entire Promise.all, potentially causing unhandled exceptions. Consider implementing error handling or using Promise.allSettled to gracefully handle failures in individual directory searches and ensure robustness.

Apply this diff to handle errors using Promise.allSettled:

- const filePathsByDir = await Promise.all(
+ const filePathsByDirResults = await Promise.allSettled(
    rootDirs.map(async (rootDir) => ({
      rootDir,
      filePaths: (await deps.searchFiles(rootDir, config)).filePaths,
    })),
  );
+
+ const filePathsByDir = filePathsByDirResults
+   .filter(result => result.status === 'fulfilled')
+   .map(result => result.value);
+
+ // Optionally handle rejected promises
+ const rejectedDirs = filePathsByDirResults
+   .filter(result => result.status === 'rejected')
+   .map((result, index) => rootDirs[index]);
+
+ if (rejectedDirs.length > 0) {
+   progressCallback(`Warning: Failed to search files in directories: ${rejectedDirs.join(', ')}`);
+ }

Committable suggestion skipped: line range outside the PR's diff.

@rchatrath7 rchatrath7 force-pushed the rc/multiple-dir-paths branch from 2e57844 to e6834d9 Compare February 6, 2025 02:00
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/core/packager.ts (1)

47-51: Simplify the file collection logic.

The nested Promise.all with reduce makes the code harder to read. Consider extracting the reduction logic into a separate function for better readability.

-  const rawFiles = (
-    await Promise.all(
-      filePathsByDir.map(({ rootDir, filePaths }) => deps.collectFiles(filePaths, rootDir, progressCallback)),
-    )
-  ).reduce((acc: RawFile[], curr: RawFile[]) => acc.concat(...curr), []);
+  const collectFilesFromDir = async ({ rootDir, filePaths }: { rootDir: string; filePaths: string[] }) =>
+    deps.collectFiles(filePaths, rootDir, progressCallback);
+
+  const rawFilesArrays = await Promise.all(filePathsByDir.map(collectFilesFromDir));
+  const rawFiles = rawFilesArrays.flat();
src/core/output/outputGenerate.ts (1)

154-161: Simplify the empty directory search reduction.

The nested reduction logic is complex and could be simplified for better readability.

-      emptyDirPaths = (await Promise.all(rootDirs.map((rootDir) => searchFiles(rootDir, config)))).reduce(
-        (acc: FileSearchResult, curr: FileSearchResult) =>
-          ({
-            filePaths: [...acc.filePaths, ...curr.filePaths],
-            emptyDirPaths: [...acc.emptyDirPaths, ...curr.emptyDirPaths],
-          }) as FileSearchResult,
-        { filePaths: [], emptyDirPaths: [] },
-      ).emptyDirPaths;
+      const searchResults = await Promise.all(rootDirs.map((rootDir) => searchFiles(rootDir, config)));
+      emptyDirPaths = searchResults.flatMap(result => result.emptyDirPaths);
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2e57844 and e6834d9.

📒 Files selected for processing (12)
  • src/cli/actions/defaultAction.ts (4 hunks)
  • src/cli/actions/remoteAction.ts (1 hunks)
  • src/cli/cliRun.ts (3 hunks)
  • src/core/output/outputGenerate.ts (4 hunks)
  • src/core/packager.ts (4 hunks)
  • tests/cli/actions/defaultAction.test.ts (22 hunks)
  • tests/cli/cliRun.test.ts (11 hunks)
  • tests/core/output/outputGenerate.test.ts (3 hunks)
  • tests/core/output/outputStyles/plainStyle.test.ts (1 hunks)
  • tests/core/output/outputStyles/xmlStyle.test.ts (1 hunks)
  • tests/core/packager.test.ts (2 hunks)
  • tests/integration-tests/packager.test.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • tests/core/output/outputStyles/xmlStyle.test.ts
  • tests/core/output/outputStyles/plainStyle.test.ts
  • tests/core/output/outputGenerate.test.ts
  • tests/integration-tests/packager.test.ts
  • src/cli/actions/remoteAction.ts
  • tests/cli/actions/defaultAction.test.ts
  • tests/cli/cliRun.test.ts
  • tests/core/packager.test.ts
🔇 Additional comments (3)
src/core/packager.ts (1)

39-44: Handle potential errors during file search across multiple directories.

Using Promise.all assumes that all searchFiles operations will succeed. If searchFiles fails for any directory, it will reject the entire Promise.all, potentially causing unhandled exceptions.

src/cli/cliRun.ts (1)

46-46: LGTM! Good use of variadic arguments.

The change to use variadic arguments with [directories...] and the default value ['.'] is a clean way to handle multiple directory inputs while maintaining backward compatibility.

Also applies to: 76-76

src/cli/actions/defaultAction.ts (1)

107-182: LGTM! Consistent use of spread operator.

The refactoring to consistently use the spread operator for config building improves code readability and maintainability.

@@ -43,15 +43,15 @@ export const runDefaultAction = async (

logger.trace('Merged config:', config);

const targetPath = path.resolve(directory);
const targetPaths = directories.map((directory) => path.resolve(directory));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider validating directory existence.

The path resolution doesn't verify if the directories exist. Consider adding validation to provide better error messages.

-  const targetPaths = directories.map((directory) => path.resolve(directory));
+  const targetPaths = await Promise.all(
+    directories.map(async (directory) => {
+      const resolvedPath = path.resolve(directory);
+      try {
+        await fs.access(resolvedPath);
+        return resolvedPath;
+      } catch {
+        throw new Error(`Directory not found: ${directory}`);
+      }
+    })
+  );
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const targetPaths = directories.map((directory) => path.resolve(directory));
const targetPaths = await Promise.all(
directories.map(async (directory) => {
const resolvedPath = path.resolve(directory);
try {
await fs.access(resolvedPath);
return resolvedPath;
} catch {
throw new Error(`Directory not found: ${directory}`);
}
})
);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant