Original reporting at the HuffPost (2011-2016)

Study on the "originality rate" of the HuffPost and 15 of its international editions

As seen in the European Journalism Observatory (in French)


The bylines of more than 1.8 million articles (blogs excluded) published by 16 of the HuffPost's 18 versions were scraped [1].
Each byline was then assigned to one of three categories:

  • HP_yes when the byline includes "HuffPost" or the name of an employee or freelancer, even if another media organization is mentioned.
  • HP_no when the byline attributes authorship exclusively to another media organization (press agency, website, newspaper, etc.).
  • HP_unknown when authorship cannot be established.
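
The three rules above can be sketched in Python as follows. This is only an illustration of the decision order, not the code used in the study: the `STAFF_NAMES` and `EXTERNAL_ORGS` lists and the `categorize_byline` function are hypothetical, and the real lists of employees, freelancers and outside media are not given in this README.

```python
import re

# Hypothetical reference lists, for illustration only.
STAFF_NAMES = ["jane doe"]                               # employees or freelancers
EXTERNAL_ORGS = ["reuters", "afp", "associated press"]   # other media organizations


def _mentions(text, terms):
    """Return True if any term appears in text as a whole word or phrase."""
    return any(re.search(r"\b" + re.escape(term) + r"\b", text) for term in terms)


def categorize_byline(byline):
    """Assign a byline to HP_yes, HP_no or HP_unknown."""
    text = byline.strip().lower()
    # HP_yes wins even when another media organization is also mentioned.
    if "huffpost" in text or _mentions(text, STAFF_NAMES):
        return "HP_yes"
    # HP_no: only an outside organization is credited.
    if _mentions(text, EXTERNAL_ORGS):
        return "HP_no"
    # Anything else (empty or unrecognized byline) cannot be attributed.
    return "HP_unknown"


for sample in ["Jane Doe, HuffPost with Reuters", "Reuters", ""]:
    print(repr(sample), "->", categorize_byline(sample))
```

Testing for HP_yes first reflects the "even if another media organization is mentioned" clause: a byline shared between the HuffPost and an agency still counts as original.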

The table below gives the rates per edition.
Links are to Jupyter notebooks in French.
The raw data can be downloaded as a CSV file, scraping-nettoye.csv. Data for the Spanish edition is in a separate file, scraping-ES-2.csv.

| Version | Launch date | Articles | HP_yes | HP_no | HP_unknown | Originality rate |
|---|---|---:|---:|---:|---:|---:|
| US [2], site | 2005-05-09 | 550 955 | 250 528 | 210 226 | 90 201 | 45.5% |
| Canada, site | 2011-05-26 | 265 153 | 40 809 | 222 950 | 1 394 | 15.4% |
| United Kingdom, site | 2011-07-06 | 161 263 | 118 317 | 42 757 | 189 | 73.4% |
| France, site | 2012-01-23 | 54 156 | 49 815 | 4 088 | 253 | 92.0% |
| Québec, site | 2012-02-08 | 390 231 | 44 282 | 344 510 | 1 439 | 11.3% |
| Spain, site | 2012-06-07 | 56 348 | 48 879 | 7 381 | 88 | 86.7% |
| Italy, site | 2012-09-24 | 64 880 | 53 820 | 9 944 | 1 116 | 83.0% |
| Japan, site | 2013-05-06 | 23 708 | 16 490 | 6 865 | 353 | 69.6% |
| Maghreb, site | 2013-06-25 | 28 653 | 25 200 | 3 337 | 116 | 87.9% |
| Germany, site | 2013-10-01 | 68 733 | 31 831 | 33 445 | 3 457 | 46.3% |
| Brazil, site | 2014-01-29 | 20 831 | 14 543 | 5 745 | 543 | 69.8% |
| South Korea, site | 2014-02-26 | 51 890 | 25 945 | 25 476 | 469 | 50.0% |
| Greece, site | 2014-11-20 | 55 433 | 55 004 | 279 | 150 | 99.2% |
| India, site | 2014-12-08 | 14 618 | 8 613 | 3 154 | 2 851 | 58.9% |
| Australia, site | 2015-08-18 | 17 154 | 12 335 | 3 255 | 1 564 | 71.9% |
| Mexico, site | 2016-09-01 | 2 168 | 1 916 | 102 | 150 | 88.4% |
| All | | 1 826 174 | 798 327 | 923 514 | 104 333 | 43.7% |
[2] All articles published between each edition's launch date and Dec. 31, 2016, that were still available and findable online were included in this study, except for the US edition, for which only articles published from Jan. 1, 2011, onward were included.
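
The README does not spell out how the originality rate is computed, but the published figures are consistent with HP_yes divided by the total number of articles. The sketch below checks that reading against three rows of the table. The `originality_rate` helper and the `rows` dictionary are illustrative names, not part of the study's notebooks.

```python
def originality_rate(hp_yes, total):
    """Share of articles credited to the HuffPost, as a percentage with one decimal."""
    return round(100 * hp_yes / total, 1)


# (articles, HP_yes, HP_no, HP_unknown), copied from the table above.
rows = {
    "France": (54156, 49815, 4088, 253),
    "Québec": (390231, 44282, 344510, 1439),
    "All": (1826174, 798327, 923514, 104333),
}

for edition, (total, hp_yes, hp_no, hp_unknown) in rows.items():
    # The three categories should account for every article.
    assert hp_yes + hp_no + hp_unknown == total
    print(f"{edition}: {originality_rate(hp_yes, total)}%")
```

Under this reading, HP_unknown articles stay in the denominator, so an edition with many unattributable bylines gets a lower rate than it would if those articles were excluded.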