feat: add map and extract endpoints with v1 updates for scrape and crawl #6787

Open: wants to merge 6 commits into base: main


@aparupganguly aparupganguly commented Feb 24, 2025

Added map and extract endpoints with v1 updates for scrape and crawl

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. enhancement New feature or request labels Feb 24, 2025
@aparupganguly (Author) commented Feb 24, 2025

@ericciarla Could you take a look at this PR and share your thoughts?

@aparupganguly aparupganguly changed the title Added map and extract endpoints with v1 updates for scrape and crawl feat: add map and extract endpoints with v1 updates for scrape and crawl Feb 24, 2025
@github-actions github-actions bot added enhancement New feature or request and removed enhancement New feature or request labels Feb 24, 2025
@ericciarla

Looks good - pinging the Langflow team

@@ -67,9 +65,23 @@ def crawl(self) -> Data:
if scrape_options_dict:
    params["scrapeOptions"] = scrape_options_dict

# Set default values for new parameters in v1
params.setdefault("maxDepth", 2)
Collaborator

Expose these as component inputs with advanced=True.

Also, set each input's value to the required default instead of hard-coding it here.
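
A minimal sketch of what this suggestion could look like, using only the standard library. `InputSpec`, `CRAWL_INPUTS`, `build_params`, and the `api_key` mapping are hypothetical stand-ins for illustration, not Langflow's real input classes. Only the `maxDepth` default of 2 comes from the diff above; the `url` input is assumed.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class InputSpec:
    """Hypothetical stand-in for a component input definition (not the real class)."""

    name: str  # attribute name on the component
    api_key: str  # key expected in the request params (assumed mapping)
    value: Any = None  # default used when the user leaves the input untouched
    advanced: bool = False  # advanced inputs stay hidden until the controls are opened


# Only maxDepth=2 comes from the diff; the "url" input is illustrative.
CRAWL_INPUTS = [
    InputSpec(name="url", api_key="url"),
    InputSpec(name="max_depth", api_key="maxDepth", value=2, advanced=True),
]


def build_params(inputs: list[InputSpec], user_values: dict[str, Any]) -> dict[str, Any]:
    """Build request params from input defaults, letting user values override them."""
    params: dict[str, Any] = {}
    for spec in inputs:
        value = user_values.get(spec.name, spec.value)
        if value is not None:
            params[spec.api_key] = value
    return params


defaults_only = build_params(CRAWL_INPUTS, {"url": "https://example.com"})
overridden = build_params(CRAWL_INPUTS, {"url": "https://example.com", "max_depth": 5})
```

With the default stored on the input itself, the `params.setdefault(...)` call in the method body is no longer needed: the default arrives through the input and the user can still override it from the advanced controls.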

display_name="Enable Web Search",
info="When true, the extraction will use web search to find additional data.",
),
# # Optional: Not essential for basic extraction
Collaborator

You can set optional parameters to advanced=True so that they are not visible in the component unless the user opens the component's controls.

MultilineInput,
Output,
SecretStrInput,
StrInput,
Collaborator

Instead of StrInput, I suggest using MessageTextInput.

Collaborator

You can also remove the unused imports.

@edwinjosechittilappilly
Collaborator

In components, if the output is a list, I would prefer it to be in DataFrame format. If the output is in JSON format, I suggest using Data. Furthermore, both Data and DataFrame outputs can coexist within the same component.
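
The convention described above could be sketched roughly as follows. `DataSketch`, `DataFrameSketch`, and `MapOutputsSketch` are hypothetical stand-ins rather than Langflow's actual `Data` and `DataFrame` classes, and the response shape (a `links` list inside a JSON object) is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class DataSketch:
    """Stand-in for a single JSON-like record output."""

    data: dict[str, Any]


@dataclass
class DataFrameSketch:
    """Stand-in for a tabular output: one dict per row."""

    rows: list[dict[str, Any]]


class MapOutputsSketch:
    """Hypothetical component exposing the same response in both formats."""

    def __init__(self, response: dict[str, Any]) -> None:
        self.response = response

    def as_data(self) -> DataSketch:
        # JSON-shaped result: return the whole object as one record.
        return DataSketch(data=self.response)

    def as_dataframe(self) -> DataFrameSketch:
        # List-shaped result: one row per link (the "links" key is assumed).
        links = self.response.get("links", [])
        return DataFrameSketch(rows=[{"url": link} for link in links])


response = {"success": True, "links": ["https://example.com/a", "https://example.com/b"]}
component = MapOutputsSketch(response)
```

Both methods read from the same response, which is how the two output types can coexist on one component without duplicating the API call.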

@edwinjosechittilappilly (Collaborator) commented Feb 25, 2025

Also, to fix the format and lint errors, run make format_backend and make lint.

These help find and fix potential errors.

Feel free to ping me if any clarifications are required; I would be happy to help.

@github-actions github-actions bot added enhancement New feature or request and removed enhancement New feature or request labels Feb 25, 2025
@edwinjosechittilappilly
Collaborator

Testing!

@edwinjosechittilappilly (Collaborator) left a comment

LGTM

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Feb 25, 2025
@github-actions github-actions bot added enhancement New feature or request and removed enhancement New feature or request labels Feb 25, 2025
@ogabrielluiz ogabrielluiz added this pull request to the merge queue Feb 25, 2025
@ericciarla

Awesome! Great work @edwinjosechittilappilly @aparupganguly and @ogabrielluiz !!

@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 25, 2025
@github-actions github-actions bot added enhancement New feature or request and removed enhancement New feature or request labels Feb 28, 2025
@edwinjosechittilappilly edwinjosechittilappilly added this pull request to the merge queue Feb 28, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 28, 2025
Labels
enhancement New feature or request lgtm This PR has been approved by a maintainer size:L This PR changes 100-499 lines, ignoring generated files.
4 participants