Matches input coordinates (latitude and longitude) to the DOSE dataset. Coordinates can be supplied either as vectors of latitudes and longitudes or as a dataframe containing them. Before matching, the function reduces the input to unique coordinates so that identical points are processed only once. If the GADM-1 geometries are not already cached locally, it downloads them from a specified URL and unzips them. It returns a dataframe with the unique input coordinates and the matched DOSE data. Optionally, the DOSE dataset can be filtered to specific years, and users can supply countries directly to skip the country matching step, which can save processing time.
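The deduplication idea can be illustrated in base R. This is only a sketch of the general pattern, not the package's internal code: the `pts` dataframe, the `region` column, and the placeholder matching step are all hypothetical. It also shows one way a caller might join per-coordinate results back onto the original rows.

```r
# Hypothetical input in which the first and third rows share a coordinate pair
pts <- data.frame(
  id   = 1:3,
  lat  = c(19.4326, 51.5074, 19.4326),
  long = c(-99.1332, -0.1276, -99.1332)
)

# Keep one row per distinct (lat, long) pair so each point is processed once
unique_pts <- unique(pts[, c("lat", "long")])

# Placeholder for the expensive matching step (hypothetical labels only)
unique_pts$region <- paste0("region_", seq_len(nrow(unique_pts)))

# Join the per-coordinate results back onto every original row
result <- merge(pts, unique_pts, by = c("lat", "long"), all.x = TRUE)
result <- result[order(result$id), ]
```

With this pattern the costly step runs once per distinct point, however many times that point appears in the input.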

Usage

matchDOSE(
  lat = NULL,
  long = NULL,
  df = NULL,
  lat_col = "lat",
  long_col = "long",
  years = NULL,
  countries = NULL,
  format_countries = "iso3c",
  gpkg_path = NULL
)

Arguments

lat

Optional vector of latitudes of the points to match. Required if no dataframe is provided.

long

Optional vector of longitudes of the points to match. Required if no dataframe is provided.

df

Optional dataframe containing coordinates and possibly additional columns. If provided, 'lat' and 'long' vectors should not be provided. The dataframe must include columns specified by 'lat_col' and 'long_col' parameters.

lat_col

Optional name of the latitude column in 'df'. Only used if 'df' is provided. Defaults to "lat".

long_col

Optional name of the longitude column in 'df'. Only used if 'df' is provided. Defaults to "long".

years

Optional vector of years for which to filter the DOSE dataset. If NULL (the default), a 1:m matching is performed and data for all years are returned.

countries

Optional vector of country identifiers, or the name of a dataframe column containing them. If provided, the function skips the country matching step, which can significantly reduce processing time. The identifiers must be in the format specified by 'format_countries'.

format_countries

Specifies the format of the country identifiers in 'countries'. Options are "iso3c" (the default, as shown in Usage), "country.name", and "iso2c". This parameter is ignored if 'countries' is NULL.

gpkg_path

Optional path to store the .gpkg file. If not specified, the default cache directory is used.

Value

A dataframe with input coordinates (and any additional input dataframe columns) and matched DOSE data.
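Because the default `years = NULL` performs a 1:m match, the result can contain several rows per coordinate, one for each year available for the matched subdivision. The base R sketch below imitates that shape with made-up data. The column names (`region_id`, `year`, `value`) and the mock table are assumptions for illustration and do not reflect the actual DOSE schema.

```r
# Hypothetical result of matching two coordinates to subdivisions
matched <- data.frame(
  lat       = c(19.4326, 51.5074),
  long      = c(-99.1332, -0.1276),
  region_id = c("region_a", "region_b"),
  stringsAsFactors = FALSE
)

# Mock region-year table standing in for DOSE (three years per region)
mock_dose <- expand.grid(
  region_id = c("region_a", "region_b"),
  year      = 2017:2019,
  stringsAsFactors = FALSE
)
mock_dose$value <- seq_len(nrow(mock_dose))

# 1:m match: each coordinate picks up one row per available year
all_years <- merge(matched, mock_dose, by = "region_id")

# Restricting to selected years keeps only the matching rows
years <- 2019
one_year <- all_years[all_years$year %in% years, ]
```

In this mock-up, `all_years` has three rows per coordinate, while `one_year` has a single row per coordinate.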

Examples

# Match two pairs of coordinates to DOSE using vectors
matched_data_vectors <- matchDOSE(lat = c(19.4326, 51.5074), long = c(-99.1332, -0.1276))
#> Geometries not found in machine. Downloading GADM-DOSE geometries...
#> Download successful using curl
#> GADM-DOSE successfully downloaded and stored in ~/.cache/subincomeR
#> Extracting files...
#> Passing 2 coordinates to the Nominatim single coordinate geocoder
#> Query completed in: 2 seconds
#> Matching coordinates to subdivisions...
#> Loading DOSE dataset...

# Match two pairs of coordinates to DOSE using a dataframe
df <- data.frame(ID = 1:2, latitude = c(19.4326, 51.5074), longitude = c(-99.1332, -0.1276))
matched_data_df <- matchDOSE(df = df, lat_col = "latitude", long_col = "longitude")
#> Passing 2 coordinates to the Nominatim single coordinate geocoder
#> Query completed in: 2 seconds
#> Matching coordinates to subdivisions...
#> Loading DOSE dataset...

# Match coordinates to DOSE for a specific year using vectors
matched_data_2019 <- matchDOSE(lat = c(19.4326), long = c(-99.1332), years = 2019)
#> Passing 1 coordinate to the Nominatim single coordinate geocoder
#> Query completed in: 1 seconds
#> Matching coordinates to subdivisions...
#> Loading DOSE dataset...
#> Filtering years...

# Match coordinates and specify countries to skip country matching
matched_data_with_countries <- matchDOSE(lat = c(19.4326, 51.5074), long = c(-99.1332, -0.1276),
                                         countries = c("MEX", "GBR"), format_countries = "iso3c")
#> Country identifiers are provided. Skipping geocoding...
#> Matching coordinates to subdivisions...
#> Loading DOSE dataset...