---
output: github_document
always_allow_html: yes
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r setup, include = FALSE}
library(councilverse)
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# Overview
The `councilcount` package allows easy access to population data for around 70 demographic groups across various NYC geographic boundaries. This data was pulled from the 2017-2021 5-Year American Community Survey. For geographic boundaries that are not included in the ACS, like council districts, estimates were generated.
## Installation
You can install the released version of `councilcount` from GitHub:
``` r
remotes::install_github("newyorkcitycouncil/councilcount")
```
## Load Package
```{r eval=FALSE}
library(tidyverse)
# load last
library(councilcount)
```
## Vignette
For demos of the functions included in `councilcount`, please visit `vignettes/councilverse.Rmd`.
## Quick Start
First load the `councilcount` package as above.
### Functions
#### R
`councilcount` includes 3 functions:

* `get_bbl_estimates()` -- Generates a dataframe of population estimates at the BBL (borough, block, and lot) level. It also includes columns for other geographies, such as council district, which can be used for geographic aggregation.
* `get_geo_estimates()` -- Creates a dataframe of population estimates for selected demographic variables along chosen geographic boundaries (e.g. council district, borough) for a chosen 5-Year ACS.
* `get_ACS_variables()` -- Provides information on all of the ACS demographic variables that can be accessed via `get_geo_estimates()` for a specified survey year.

Run `get_bbl_estimates()` and `get_ACS_variables()` to access the desired dataframes. Each takes a single year parameter, described below.
`get_bbl_estimates()` has 1 parameter:

* `year` -- The desired year for BBL estimates. The years currently available are 2011, 2016, 2021, and 2022.

`get_ACS_variables()` has 1 parameter:

* `acs_year` -- The end year of the desired 5-Year ACS. The surveys currently available are 2007-2011, 2012-2016, 2017-2021, and 2018-2022.
`get_geo_estimates()` has 4 parameters:

* `acs_year` -- The end year of the desired 5-Year ACS. The surveys currently available are 2007-2011, 2012-2016, 2017-2021, and 2018-2022.
* `geo` -- The desired geographic region. Please select from the following list:
    * Council District: `"councildist"`
    * Community District: `"communitydist"`
    * School District: `"schooldist"`
    * Police Precinct: `"policeprct"`
    * Neighborhood Tabulation Area: `"nta"`
    * Borough: `"borough"`
    * New York City: `"city"`
* `var_codes` -- The desired demographic group(s), as represented by the ACS variable code. To access the list of available demographic variables and their codes, run `get_ACS_variables()`.
* `boundary_year` -- When `geo` is `"councildist"`, the council district boundary year to use, either 2013 or 2023. The default is 2013.

Here is an example, in which codes for “Female” and “Adults with
Bachelor’s degree or higher” are used. The data is requested along 2023
Council District boundaries for the 2018-2022 ACS.
``` r
vars <- c('DP05_0003E', 'DP02_0068E')
get_geo_estimates(acs_year = "2022", geo = "councildist", var_codes = vars, boundary_year = "2023")
```
#### Python
The equivalent functions are also available in Python. To access them, use the following code:
``` Python
import sys
# set absolute path to councilverse/inst/python location
sys.path.insert(0, "/{YOUR PATH}/councilverse/inst/python")
from retrieve_estimates import get_bbl_estimates, get_census_variables, get_geo_estimates
```
`get_bbl_estimates()` and `get_census_variables()` work the same way in both R and Python. `get_geo_estimates()`, however, differs in Python:

* There is no separate boundary year parameter. Instead, `geo` accepts two Council District options, `"councildist13"` and `"councildist23"`.
* Data for New York City as a whole is requested with `"nyc"` for `geo`. Otherwise, the `geo` input options are the same.
* There are two additional parameters, `polygons` and `download`, both defaulting to `False`. If `polygons` is set to `True`, the dataframe includes a column with the geometry of each geographic region. If `download` is set to `True`, the dataframe is automatically downloaded as a CSV when the function runs.
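
When moving between the two interfaces, the R-style `geo` and `boundary_year` arguments can be translated into the single Python `geo` value. The following is a minimal sketch and not part of the package: `to_python_geo` is a hypothetical helper, and it assumes that the R value `"city"` corresponds to `"nyc"` in Python and that all other `geo` strings pass through unchanged.

```python
# Hypothetical helper (not part of councilcount): maps the R-style
# geo / boundary_year pair onto the single Python geo string.

VALID_BOUNDARY_YEARS = {"2013", "2023"}


def to_python_geo(geo, boundary_year="2013"):
    """Return the geo string to pass to the Python get_geo_estimates()."""
    if geo == "councildist":
        year = str(boundary_year)
        if year not in VALID_BOUNDARY_YEARS:
            raise ValueError("boundary_year must be 2013 or 2023")
        # "2013" -> "councildist13", "2023" -> "councildist23"
        return "councildist" + year[-2:]
    if geo == "city":
        # Assumption: R's "city" and Python's "nyc" name the same region.
        return "nyc"
    # The remaining geo options are documented as the same in both languages.
    return geo


print(to_python_geo("councildist", 2023))
print(to_python_geo("borough"))
```

The returned string could then be passed as `geo` to the Python `get_geo_estimates()`, optionally with `polygons=True` or `download=True`. The names of the other keyword arguments are assumed to match the R parameters and should be checked against `retrieve_estimates`.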