[site] bbc.co.uk #658

kdenaeem · 2025-02-05T18:14:30Z

I am trying to scrape bbc.co.uk/news/world and hope to collect 20 to 30 of the articles on the front page of this site.

import newspaper

bbc_papers = newspaper.build("https://www.bbc.co.uk/news/world", number_threads=3)

article_urls = [article.url for article in bbc_papers.articles]
# Raises IndexError when fewer than 11 URLs were collected
print(article_urls[10])

This always either raises a "list index out of range" error or returns an empty list []. I'm guessing this is because the request was blocked.
Does anyone know why it won't return article_urls?
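
While debugging, it may help to separate the "no URLs came back" case from the indexing error. The following is only a minimal sketch, not part of the original report: `first_n_urls` is a hypothetical helper, and `sample_urls` is stand-in data used in place of the real `article_urls` list so the snippet runs without network access. It uses slicing, which never raises IndexError, to take up to 30 URLs.

```python
def first_n_urls(urls, n=30):
    """Return up to n unique URLs in their original order.

    Slicing is used instead of indexing, so a short or empty
    list yields a short or empty result rather than IndexError.
    """
    seen = set()
    unique = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            unique.append(url)
    return unique[:n]


# Stand-in data (assumption); in the snippet above this would be article_urls.
sample_urls = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/a",
]

selected = first_n_urls(sample_urls, n=30)
if selected:
    print(f"{len(selected)} URLs found, first: {selected[0]}")
else:
    print("No article URLs were returned.")
```

With this guard in place, an empty result prints a message instead of crashing, which makes it easier to tell whether the build step itself produced nothing.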

kdenaeem added the "help wanted" (Extra attention is needed) label Feb 5, 2025
