Internet Yellow Pages
richmass1 edited this page Sep 20, 2023 · 10 revisions
https://github.com/InternetHealthReport/internet-yellow-pages/tree/main/iyp/crawlers
- Where can I find more information on the APNIC "eyeball" dataset?
- Does MANRS have a page for the network operators dataset used by IYP?
- This page has the same list of network operators, but it has different information about them
- When did Spamhaus launch their DROP list?
- How should I assign start dates to looking glass servers?
I believe dateStart is meant to be the date of the earliest data point, at least for a typical dataset that holds historical data in chronological order. For some of these data sources, I couldn't find historical data; I assume these sources primarily maintain current data, so their oldest data would date from their launch. For those sources, I did my best to determine when each was launched and listed that date as dateStart.
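The rule described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not code from the IYP crawlers: the function name `pick_date_start`, its parameters, and the sample dates are all hypothetical, and the `YYYY-MM` output format is assumed from the entries on this page.

```python
from datetime import date
from typing import Iterable, Optional


def pick_date_start(data_points: Iterable[date],
                    launch: Optional[date] = None) -> Optional[str]:
    """Pick a dateStart value as 'YYYY-MM' (hypothetical helper).

    Prefer the earliest known data point; if no historical data was
    found, fall back to the launch date of the source. Return None
    when neither is known, so the field can stay blank or '?'.
    """
    earliest = min(data_points, default=None)
    chosen = earliest if earliest is not None else launch
    return chosen.strftime("%Y-%m") if chosen is not None else None


# Made-up dates, for illustration only.
with_history = pick_date_start([date(2019, 3, 1), date(2017, 1, 15)])
launch_only = pick_date_start([], launch=date(2021, 9, 1))
unknown = pick_date_start([])
```

Returning `None` rather than guessing keeps unknown start dates visible, which matches the blank and `?` entries in the list below.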
- caida/AS Rank
- catalog: dataset:as_rank
- amsix (alice_lg)
- id: amsix_lg
- description: Looking glass server
- URL: https://lg.ams-ix.net/
- name: AMS-IX Looking Glass
- org: AMS-IX
- dateStart:
- amsix.py
- decix (alice_lg)
- id: decix_lg
- description: Looking glass server
- URL: https://lg.de-cix.net/
- name: DE-CIX Looking Glass
- org: DE-CIX
- dateStart:
- decix.py
- ecix (alice_lg)
- id: megaport_lg
- description: Looking glass server
- URL: https://lg.megaport.com/
- name: Megaport Looking Glass
- org: Megaport
- dateStart:
- ecix.py
- linx (alice_lg)
- id: linx_lg
- description: Looking glass server
- URL: https://alice-rs.linx.net/
- name: LINX Route Server Looking Glass
- org: LINX
- dateStart:
- linx.py
- apnic
- id: apnic_eyeball
- description: Ranks ASes by population within a given country
- URL: ?
- name: ?
- org: APNIC
- dateStart: ?
- eyeball.py
- atlas
- id: ripe_atlas
- description: A global network of devices that actively measure Internet connectivity
- URL: https://atlas.ripe.net/
- name: RIPE Atlas
- org: RIPE NCC
- dateStart: 2014-03
- measurements.py
- bgpkit
- id: bgpkit
- description: Collection of various BGP and AS related data
- URL: https://bgpkit.com/
- name: BGPKIT
- org: BGPKIT
- dateStart: 2012-05
- multiple scripts
- bgptools
- id: bgptools
- description: Collection of BGP data
- URL: https://bgp.tools/
- name: bgp.tools
- org: Port 179 Ltd
- dateStart: 2021-09
- multiple scripts
- cisco
- id: cisco_umbrella_top1m
- description: List of most queried domains, as tracked by Cisco Umbrella
- URL: https://s3-us-west-1.amazonaws.com/umbrella-static/index.html
- name: Cisco Umbrella Popularity List
- org: Cisco
- dateStart: 2017-01
- umbrella_top1M.py
- citizenlabs
- id: citizenlabs_test_lists
- description: Lists of websites that might be targets of censorship, by country
- URL: https://github.com/citizenlab/test-lists/ (more info: https://ooni.org/get-involved/contribute-test-lists/#about-test-lists)
- name: Citizen Lab Test Lists
- org: Citizen Lab & OONI
- dateStart: 2014-04
- based on the first GitHub commit: https://github.com/citizenlab/test-lists/commit/917fe645c58459800d1ddf6f6df102c3e90334d3
- urldb.py
- cloudflare
- id: cloudflare_radar
- description: Data on internet traffic
- URL: https://radar.cloudflare.com/
- name: Cloudflare Radar
- org: Cloudflare
- dateStart: 2020-09
- multiple scripts
- emileaben
- id: aben_asnames
- description: Curated list of AS names
- URL: https://github.com/emileaben/asnames
- name: as-names
- org: RIPE NCC
- dateStart: 2022-06
- as_names.py
- ihr
- id: internet_health_report
- description: Data about AS dependency (i.e. which ASes are relied on by others) and other measures of connectivity
- URL: https://ihr.iijlab.net/ihr/en-us/
- name: Internet Health Report
- org: Internet Health Report
- dateStart: 2019-01
- multiple scripts
- inetintel
- id: inetintel_as_org
- description: Historical and current AS-to-organization mappings
- URL: https://github.com/InetIntel/Dataset-AS-to-Organization-Mapping
- name: AS-to-Organization Mapping
- org: Internet Intelligence Lab at Georgia Tech
- dateStart: 2022-10
- as_org.py
- manrs
- id: manrs_net_op_participants
- description: List of network operators participating in the MANRS initiative
- URL: ?
- name: MANRS Network Operator Participants
- org: MANRS
- dateStart: 2014-11
- members.py
- nro
- id: nro_delegated_extended
- description: ASN and IP range assignment records
- URL: https://www.nro.net/about/rirs/statistics/
- name: NRO Extended Allocation and Assignment Reports
- org: Number Resource Organization
- dateStart: 2019-10
- delegated_stats.py
- openintel
- id: openintel_active_dns
- description: Collection of DNS records for websites in top 1M lists
- URL: https://www.openintel.nl/data-access/
- name: OpenINTEL Active DNS Measurements
- org: OpenINTEL
- dateStart: 2016
- multiple scripts
- pch
- id: pch_routing_data
- description: Historical and current snapshots of route collections
- URL: https://www.pch.net/resources/Routing_Data/
- name: PCH Routing Data
- org: Packet Clearing House
- dateStart: 2003-03
- multiple scripts
- peeringdb
- already catalogued: https://catalog.caida.org/dataset/peeringdb
- rapid7
- id: rapid7_forward_dns
- description: Collections of responses to forward DNS requests
- URL: https://opendata.rapid7.com/sonar.fdns_v2/
- name: Rapid7 Forward DNS
- org: Rapid7
- dateStart: 2017-02
- multiple scripts
- as_names (ripe)
- id: ripe_as_names
- description: List of ASes and associated names
- URL: https://ftp.ripe.net/ripe/asnames/
- name: RIPE AS Names
- org: RIPE NCC
- dateStart: ?
- as_names.py
- roa (ripe)
- id: ripe_rpki_repo
- description: Archive of route origin authorizations (ROAs)
- URL: https://github.com/RIPE-NCC/internet-dataset-descriptions/blob/main/rpki-repo-archive.md
- name: RIPE NCC's RPKI Repo Archive
- org: RIPE NCC
- dateStart: 2011-01
- roa.py
- spamhaus
- id: spamhaus_drop_lists
- description: Lists of ASNs and IP ranges controlled by spammers or malicious organizations
- URL: https://www.spamhaus.org/drop/
- name: Spamhaus DROP Lists
- org: Spamhaus
- dateStart: 2004?
- multiple scripts
- stanford
- id: stanford_asdb
- description: AS-to-organization mappings, with organizations classified by industry
- URL: https://asdb.stanford.edu/
- name: Stanford ASdb
- org: Stanford Empirical Security Research Group
- dateStart: 2021-05
- asdb.py
- tranco
- id: tranco_top1m
- description: Ranking of the top 1M domains, created by analyzing multiple sources
- URL: https://tranco-list.eu/
- name: Tranco
- org: DistriNet
- dateStart: 2019-01
- top1M.py
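Several entries above still have blank or `?` fields. As a rough aid for tracking them, the sketch below models one entry as a Python dataclass and reports which fields are unresolved. The class `CatalogEntry` and the helper `unresolved_fields` are hypothetical names invented for this example; the field names mirror the keys used on this page, and the sample values are copied from the apnic and spamhaus entries.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class CatalogEntry:
    """One catalog entry, with the same keys as the list above."""
    id: str
    description: str
    url: str
    name: str
    org: str
    date_start: str


def unresolved_fields(entry: CatalogEntry) -> List[str]:
    """Return names of fields that are empty or end with a '?' placeholder."""
    missing = []
    for f in fields(entry):
        value = getattr(entry, f.name).strip()
        if value == "" or value.endswith("?"):
            missing.append(f.name)
    return missing


# Values taken from the apnic entry on this page.
apnic = CatalogEntry(
    id="apnic_eyeball",
    description="Ranks ASes by population within a given country",
    url="?",
    name="?",
    org="APNIC",
    date_start="?",
)

# Values taken from the spamhaus entry; the date is marked uncertain.
spamhaus = CatalogEntry(
    id="spamhaus_drop_lists",
    description="Lists of ASNs and IP ranges controlled by spammers "
                "or malicious organizations",
    url="https://www.spamhaus.org/drop/",
    name="Spamhaus DROP Lists",
    org="Spamhaus",
    date_start="2004?",
)

apnic_missing = unresolved_fields(apnic)
spamhaus_missing = unresolved_fields(spamhaus)
```

Treating a trailing `?` as unresolved also catches tentative values such as `2004?`, so uncertain dates are flagged alongside fully missing ones.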