airbase.airbase module

class airbase.airbase.AirbaseClient[source]

Bases: object

The central point for requesting Airbase data.

Example:
>>> client = AirbaseClient()
>>> r = client.request("Historical", "NL", "DE", poll=["O3", "NO2"])
>>> r.download("data/raw")
summary : 100%|██████████| 2/2 [00:00<00:00,  2.19requests/s]
URLs    : 100%|██████████| 1.80k/1.80k [00:00<00:00, 17.4kURL/s]
download: 2.05Gb [01:58, 18.6Mb/s]
>>> r.download_metadata("data/metadata.tsv")
Writing metadata to data/metadata.tsv...

countries

pollutants

All pollutants available from AirBase

static download_metadata(filepath, verbose=True)[source]

Download the metadata CSV file.

See https://discomap.eea.europa.eu/App/AQViewer/index.html?fqn=Airquality_Dissem.b2g.measurements

Parameters:
  • filepath (str | Path) – Where to save the CSV

  • verbose (bool) – (optional) print status messages to stderr. Default True.

Return type:

None

request(source, *countries, poll=None, verbose=True)[source]

Initialize an AirbaseRequest for a query.

Pollutants are specified by name/notation (poll). If no pollutants are specified, data for all available pollutants is requested. If a pollutant is not available for a country, the corresponding parquet files are skipped.

Requests proceed in two steps: First, URLs to individual parquet files are requested from the EEA server. Then these links are used to download the individual parquet files.

See https://eeadmz1-downloads-webapp.azurewebsites.net/
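
The two-step flow described above can be sketched without touching the network. This is only an illustration under stated assumptions, not the library's implementation: `two_step_download`, `list_urls` and `fetch` are hypothetical names, the callables stand in for the EEA server calls, and the URLs are made up.

```python
from pathlib import Path
from tempfile import TemporaryDirectory
from typing import Callable, Iterable, List


def two_step_download(
    countries: Iterable[str],
    pollutants: Iterable[str],
    list_urls: Callable[[str, str], List[str]],
    fetch: Callable[[str], bytes],
    root: Path,
) -> List[Path]:
    """Step 1: collect file URLs. Step 2: download each file under root."""
    pollutants = list(pollutants)

    # Step 1: ask for the parquet file URLs of every country/pollutant pair.
    # A pair with no data simply contributes no URLs.
    urls: List[str] = []
    for country in countries:
        for poll in pollutants:
            urls.extend(list_urls(country, poll))

    # Step 2: download each URL, keeping the country sub-directory.
    saved: List[Path] = []
    for url in urls:
        country, name = url.split("/")[-2:]
        path = root / country / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(fetch(url))
        saved.append(path)
    return saved


# Hypothetical stand-ins for the server; NL has no NO2 files in this toy catalogue.
CATALOGUE = {
    ("NL", "O3"): ["https://example.invalid/NL/o3_a.parquet"],
    ("DE", "O3"): ["https://example.invalid/DE/o3_b.parquet"],
    ("DE", "NO2"): ["https://example.invalid/DE/no2_c.parquet"],
}


def fake_list_urls(country: str, poll: str) -> List[str]:
    return CATALOGUE.get((country, poll), [])


def fake_fetch(url: str) -> bytes:
    return url.encode()


with TemporaryDirectory() as tmp:
    files = two_step_download(
        ["NL", "DE"], ["O3", "NO2"], fake_list_urls, fake_fetch, Path(tmp)
    )
    relative = sorted(p.relative_to(tmp).as_posix() for p in files)
```

Injecting the two callables keeps the URL-listing step separate from the download step, which mirrors the order in which the progress bars appear in the example output.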

Parameters:
  • source (Literal['Historical', 'Verified', 'Unverified'] | ~airbase.parquet_api.dataset.Dataset) – One of three options. “Historical”: data delivered between 2002 and 2012, before Air Quality Directive 2008/50/EC entered into force. “Verified”: data (E1a) from 2013 to 2022, reported by countries by 30 September each year for the previous year. “Unverified”: data transmitted continuously (Up-To-Date/UTD/E2a) from the beginning of 2023.

  • countries (str) – (optional) 2-letter country codes. Data will be requested for each country. Raises ValueError if a country is not in self.countries. If no countries are provided, data for all countries will be requested.

  • poll (str | Iterable[str] | None) – (optional) pollutant(s) to request data for. Each must be one of the pollutants in self.pollutants.

  • verbose (bool) – (optional) print status messages to stderr. Default True.

  • preload_urls – (optional) Request all the file URLs from the EEA server at object initialization. Default False.

Returns:

The initialized AirbaseRequest.

Example:
>>> client = AirbaseClient()
>>> r = client.request("Historical", "NL", "DE", poll=["O3", "NO2"])
>>> r.download("data/raw")
summary : 100%|██████████| 2/2 [00:00<00:00,  2.19requests/s]
URLs    : 100%|██████████| 1.80k/1.80k [00:00<00:00, 17.4kURL/s]
download: 2.05Gb [01:58, 18.6Mb/s]
>>> r.download_metadata("data/metadata.tsv")
Writing metadata to data/metadata.tsv...

Return type:

AirbaseRequest
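
The argument handling described for countries and poll can be sketched as a small, self-contained function. This is a sketch, not the library's code: `normalise_request` and the two known-value sets are hypothetical. The text does not say what happens when a pollutant is unknown, so the sketch leaves pollutants unchecked.

```python
from __future__ import annotations

from collections.abc import Iterable


def normalise_request(
    countries: tuple[str, ...],
    poll: str | Iterable[str] | None,
    known_countries: set[str],
    known_pollutants: set[str],
) -> tuple[set[str], set[str]]:
    """Resolve the country and pollutant selections for one request."""
    # Unknown country codes are rejected up front.
    unknown = set(countries) - known_countries
    if unknown:
        raise ValueError(f"unknown country code(s): {sorted(unknown)}")
    # No countries given means "all countries".
    selected = set(countries) or set(known_countries)

    # poll may be None (all pollutants), a single name, or several names.
    if poll is None:
        pollutants = set(known_pollutants)
    elif isinstance(poll, str):
        pollutants = {poll}
    else:
        pollutants = set(poll)
    return selected, pollutants


# Illustrative subsets only, not the real country or pollutant lists.
KNOWN_COUNTRIES = {"NL", "DE", "FR"}
KNOWN_POLLUTANTS = {"O3", "NO2", "PM10"}

some = normalise_request(("NL", "DE"), ["O3", "NO2"], KNOWN_COUNTRIES, KNOWN_POLLUTANTS)
single = normalise_request(("NL",), "O3", KNOWN_COUNTRIES, KNOWN_POLLUTANTS)
everything = normalise_request((), None, KNOWN_COUNTRIES, KNOWN_POLLUTANTS)
```

Treating a bare string separately matters here: iterating over "O3" directly would yield the characters "O" and "3" rather than one pollutant name.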

search_pollutant(query, limit=None)[source]

Search for a pollutant’s id number based on its name.

Parameters:
  • query (str) – The pollutant to search for.

  • limit (int | None) – (optional) Max number of results.

Returns:

The best pollutant matches. Pollutants are dicts with keys “poll” and “id”.

Example:
>>> AirbaseClient().search_pollutant("o3", limit=2)
[{"poll": "O3", "id": 7}, {"poll": "NO3", "id": 46}]

Return type:

list[airbase.airbase.PollutantDict]
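
One way such a name search could work is a case-insensitive substring match, with shorter names ranked first. The matching rule is an assumption chosen to reproduce the example above, not the documented algorithm. `search_pollutant_sketch` and `POLLUTANT_IDS` are hypothetical, and only the O3 and NO3 ids come from the example; the other ids are placeholders.

```python
from __future__ import annotations

# Toy lookup table. The ids for O3 and NO3 are taken from the example above;
# the remaining entries are placeholders for illustration.
POLLUTANT_IDS = {"O3": 7, "NO3": 46, "NO2": 8, "PM10": 5}


def search_pollutant_sketch(query: str, limit: int | None = None) -> list[dict]:
    """Return pollutants whose name contains query, ignoring case."""
    needle = query.lower()
    names = [name for name in POLLUTANT_IDS if needle in name.lower()]
    # Rank shorter names first so an exact match precedes longer names.
    names.sort(key=lambda name: (len(name), name))
    # Slicing with limit=None keeps every match.
    return [{"poll": name, "id": POLLUTANT_IDS[name]} for name in names[:limit]]


matches = search_pollutant_sketch("o3", limit=2)
no_matches = search_pollutant_sketch("xyz")
```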

class airbase.airbase.AirbaseRequest(source, *country, poll=None, verbose=True)[source]

Bases: object

Handler for Airbase data requests.

Requests proceed in two steps: First, URLs to individual parquet files are requested from the EEA server. Then these links are used to download the individual parquet files.

See https://eeadmz1-downloads-webapp.azurewebsites.net/

Parameters:
  • source (Dataset) – One of three options. airbase.Dataset.Historical: data delivered between 2002 and 2012, before Air Quality Directive 2008/50/EC entered into force. airbase.Dataset.Verified: data (E1a) from 2013 to 2022, reported by countries by 30 September each year for the previous year. airbase.Dataset.Unverified: data transmitted continuously (Up-To-Date/UTD/E2a) from the beginning of 2023.

  • country (str) – One or more 2-letter country codes, passed as separate positional arguments. Data will be requested for each country.

  • poll (str | Iterable[str] | None) – (optional) pollutant(s) to request data for. Will be applied to each country requested. If None, all available pollutants will be requested.

  • verbose (bool) – (optional) print status messages to stderr. Default True.

  • preload_urls (bool) – (optional) Request all the file URLs from the EEA server at object initialization. Default False.

download(dir, skip_existing=True, raise_for_status=True)[source]

Download into a directory, preserving original file structure.

Parameters:
  • dir (str | Path) – The directory to save files in (must exist)

  • skip_existing (bool) – (optional) Don’t re-download files if they exist in dir. If False, existing files in dir may be overwritten. Default True.

  • raise_for_status (bool) – (optional) Raise exceptions if download links return “bad” HTTP status codes. If False, a warning is issued with warnings.warn() instead. Default True.

Return type:

None
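
How skip_existing and raise_for_status interact can be sketched with a stubbed fetcher. This is an illustration under stated assumptions, not the library's implementation: `download_files` and `fake_fetch` are hypothetical, a “bad” status is assumed to mean 400 or above, the exception type is a placeholder, and files are written flat into the directory instead of preserving the original file structure.

```python
import warnings
from pathlib import Path
from tempfile import TemporaryDirectory
from typing import Callable, Iterable, List, Tuple


def download_files(
    urls: Iterable[str],
    root: Path,
    fetch: Callable[[str], Tuple[int, bytes]],
    skip_existing: bool = True,
    raise_for_status: bool = True,
) -> List[Path]:
    """Write each URL's payload into root and return the files written."""
    written: List[Path] = []
    for url in urls:
        path = root / url.rsplit("/", 1)[-1]
        if skip_existing and path.exists():
            continue  # already on disk, do not fetch again
        status, body = fetch(url)
        if status >= 400:  # assumed meaning of a "bad" status code
            message = f"{url} returned HTTP {status}"
            if raise_for_status:
                raise RuntimeError(message)  # placeholder exception type
            warnings.warn(message)
            continue
        path.write_bytes(body)
        written.append(path)
    return written


def fake_fetch(url: str) -> Tuple[int, bytes]:
    # Stand-in for an HTTP GET: one URL is always missing.
    return (404, b"") if url.endswith("missing.parquet") else (200, b"data")


URLS = [
    "https://example.invalid/a.parquet",
    "https://example.invalid/missing.parquet",
]

with TemporaryDirectory() as tmp:
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        first = [p.name for p in download_files(URLS, Path(tmp), fake_fetch, raise_for_status=False)]
        second = [p.name for p in download_files(URLS, Path(tmp), fake_fetch, raise_for_status=False)]
    n_warnings = len(caught)
```

On the second pass the existing file is skipped without a request, while the missing file is retried and produces a second warning.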

download_metadata(filepath)[source]

Download the metadata CSV file.

See https://discomap.eea.europa.eu/App/AQViewer/index.html?fqn=Airquality_Dissem.b2g.measurements

Parameters:
  • filepath (str | Path) – Where to save the CSV

Return type:

None

pollutants: set[str]
session = <airbase.parquet_api.session.Session object>

class airbase.airbase.PollutantDict[source]

Bases: TypedDict

id: int
poll: str
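
A TypedDict with the two fields listed above can be declared as follows. This sketch re-declares the class locally so that it runs without the package installed; the O3 entry reuses the values from the search_pollutant example.

```python
from typing import TypedDict


class PollutantDict(TypedDict):
    id: int
    poll: str


# At runtime a TypedDict instance is an ordinary dict; the field types
# are only checked by static type checkers.
entry: PollutantDict = {"poll": "O3", "id": 7}
is_plain_dict = type(entry) is dict
fields = sorted(PollutantDict.__annotations__)
```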