airbase.parquet_api package
- class airbase.parquet_api.AggregationType(value)[source]
Bases: str, Enum
represents the interval at which the collected values are obtained: 1. Hourly data. 2. Daily data. 3. Variable intervals (anything other than the previous two, such as weekly, monthly, etc.)
https://eeadmz1-downloads-webapp.azurewebsites.net/content/documentation/How_To_Downloads.pdf
- Daily = 'day'
- Hourly = 'hour'
- Other = 'var'
- VariableIntervals = 'var'
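Other and VariableIntervals share the value 'var', so Python's enum machinery treats one as an alias of the other. The sketch below re-declares the enum locally to illustrate this. It is a stand-in, not the library's source, and the member order is an assumption (the listing above is alphabetical, while the canonical name depends on definition order).

```python
from enum import Enum


class AggregationType(str, Enum):
    """Local stand-in mirroring the documented members (definition order assumed)."""

    Hourly = "hour"
    Daily = "day"
    VariableIntervals = "var"
    Other = "var"  # same value as VariableIntervals, so this becomes an alias


# Lookup by value returns the canonical member.
hourly = AggregationType("hour")

# Members of a str-based Enum compare equal to their plain string value.
is_plain_string = hourly == "hour"

# An alias is the very same object as the member it points to.
same_member = AggregationType.Other is AggregationType.VariableIntervals

# Iteration skips aliases, so only three members are listed.
names = [member.name for member in AggregationType]
```

Because the members are also strings, a value such as AggregationType.Hourly can be placed directly where the plain string 'hour' is expected.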
- class airbase.parquet_api.Client(*, timeout=None, max_concurrent=10)[source]
Bases: AbstractAsyncContextManager
Handle for requests to Parquet downloads API v1 https://eeadmz1-downloads-api-appservice.azurewebsites.net/swagger/index.html
- Parameters:
timeout (float | None) –
max_concurrent (int) –
- async city(payload)[source]
post request to /City
- Parameters:
payload (tuple[str, ...]) –
- Return type:
CityJSON
- async download_binary(url, path)[source]
get request to url, write the response body (in binary form) into a binary file, and return path (exactly as the input)
- Parameters:
url (str) –
path (Path) –
- Return type:
Path
- async download_metadata(path)[source]
download the compressed metadata file and return the path to the uncompressed CSV
- Parameters:
path (Path) –
- Return type:
Path
- async download_summary(payload)[source]
post request to /DownloadSummary
- Parameters:
payload (ParquetDataJSON) –
- Return type:
- async download_urls(payload)[source]
post request to /ParquetFile/urls
- Parameters:
payload (ParquetDataJSON) –
- Return type:
str
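The max_concurrent parameter suggests a cap on simultaneous requests. How Client enforces it is not documented here; the sketch below shows one common way to get such a cap with asyncio.Semaphore. The names run_limited and fake_request are hypothetical and no real HTTP request is made.

```python
import asyncio


async def run_limited(jobs, max_concurrent=10):
    """Run the coroutine functions in `jobs`, at most `max_concurrent` at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)
    running = 0
    peak = 0

    async def guarded(job):
        nonlocal running, peak
        async with semaphore:  # waits here while the cap is reached
            running += 1
            peak = max(peak, running)
            try:
                return await job()
            finally:
                running -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def fake_request():
    """Hypothetical stand-in for an HTTP request."""
    await asyncio.sleep(0.01)
    return "ok"


# 25 jobs, but never more than 10 in flight at once.
results, peak = asyncio.run(run_limited([fake_request] * 25, max_concurrent=10))
```

The peak counter exists only to make the cap observable; a real client would not need it.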
- class airbase.parquet_api.Dataset(value)[source]
Bases: IntEnum
1. Unverified data transmitted continuously (Up-To-Date/UTD/E2a) from the beginning of 2024. 2. Verified data (E1a) from 2013 to 2023, reported by countries by 30 September each year for the previous year. 3. Historical Airbase data delivered between 2002 and 2012, before Air Quality Directive 2008/50/EC entered into force.
https://eeadmz1-downloads-webapp.azurewebsites.net/content/documentation/How_To_Downloads.pdf
- Airbase = 3
- E1a = 2
- E2a = 1
- Historical = 3
- UDT = 1
- Unverified = 1
- Verified = 2
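The seven names above map onto only three integer values, so each dataset can be referred to by several interchangeable names. The sketch below re-declares the enum locally (a stand-in with an assumed definition order, not the library's source) and groups the names by value through `__members__`, which includes aliases.

```python
from collections import defaultdict
from enum import IntEnum


class Dataset(IntEnum):
    """Local stand-in mirroring the documented members (definition order assumed)."""

    Unverified = 1
    UDT = 1
    E2a = 1
    Verified = 2
    E1a = 2
    Historical = 3
    Airbase = 3


# __members__ includes aliases, unlike plain iteration over the enum.
names_by_value = defaultdict(list)
for name, member in Dataset.__members__.items():
    names_by_value[int(member)].append(name)

# IntEnum members behave like integers and can be looked up by value.
verified = Dataset(2)
```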
- class airbase.parquet_api.ParquetData(country, dataset, pollutant=None, city=None, frequency=None, source='API')[source]
Bases: NamedTuple
info needed for requesting the URLs for country and dataset; the request can be further restricted with the pollutant, city and frequency
Create new instance of ParquetData(country, dataset, pollutant, city, frequency, source)
- Parameters:
country (str) –
dataset (Dataset) –
pollutant (frozenset[str] | None) –
city (str | None) –
frequency (AggregationType | None) –
source (str) –
- city: str | None
Alias for field number 3
- country: str
Alias for field number 0
- frequency: AggregationType | None
Alias for field number 4
- pollutant: frozenset[str] | None
Alias for field number 2
- source: str
Alias for field number 5
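The "field number" in each alias is the position of that field in the underlying tuple. The sketch below uses a local stand-in with the documented fields and defaults to show this; dataset and frequency are typed as plain int and str for brevity, and the values "NO", "PM10" and "Oslo" are illustrative placeholders.

```python
from typing import FrozenSet, NamedTuple, Optional


class ParquetData(NamedTuple):
    """Local stand-in with the documented fields and defaults (types simplified)."""

    country: str
    dataset: int
    pollutant: Optional[FrozenSet[str]] = None
    city: Optional[str] = None
    frequency: Optional[str] = None
    source: str = "API"


# Only country and dataset are required; the rest fall back to their defaults.
info = ParquetData("NO", 2, pollutant=frozenset({"PM10"}))

# Field numbers are simply positions in the tuple.
field_numbers = {name: index for index, name in enumerate(ParquetData._fields)}

# Named tuples are immutable; _replace returns a modified copy.
narrowed = info._replace(city="Oslo")
```

Since the tuple is hashable (pollutant is a frozenset), instances of this shape can be used as set members or dictionary keys.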
- class airbase.parquet_api.Session(*, progress=False, raise_for_status=True)[source]
Bases: AbstractAsyncContextManager
- Parameters:
progress (bool) – (optional, default False) Show progress bars
raise_for_status (bool) – (optional, default True) Raise exceptions if any request from the summary, url_to_files or download_to_directory methods returns a “bad” HTTP status code. If False, a warnings.warn() will be issued instead.
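The raise-or-warn switch described for raise_for_status can be pictured as follows. This is a minimal sketch of the pattern, not the Session implementation: handle_status is a hypothetical helper, and the use of RuntimeError and the 400 threshold are assumptions.

```python
import warnings


def handle_status(status: int, *, raise_for_status: bool = True) -> bool:
    """Return True for a good status; raise or warn on 4xx/5xx (hypothetical helper)."""
    if status < 400:
        return True
    message = f"request failed with HTTP status {status}"
    if raise_for_status:
        raise RuntimeError(message)
    # Warn instead of raising, so the remaining requests can continue.
    warnings.warn(message, stacklevel=2)
    return False
```

With the flag off, one failing request degrades to a warning while the others still run.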
- add_expected(numberFiles, size)[source]
add to the expected download files and size in Mb
- Parameters:
numberFiles (int) –
size (int) –
- Return type:
None
- add_urls(more_urls)[source]
add to the unique URLs ready for download
- Parameters:
more_urls (Iterable[str]) –
- Return type:
int
- async cities(*countries)[source]
city names id and notation from API
- Parameters:
countries (str) –
- Return type:
defaultdict[str, set[str]]
- async download_metadata(path, skip_existing=True)[source]
download station metadata into the given path.
- Parameters:
path (Path) – pathlib.Path to the station metadata (parent directory must exist)
skip_existing (bool) – (optional, default True) Don’t re-download metadata if path already exists. If False, path may be overwritten.
- Return type:
None
- async download_to_directory(root_path, *, country_subdir=True, skip_existing=True)[source]
download into a directory
- Parameters:
root_path (Path) – The directory to save files in (must exist)
country_subdir (bool) – (optional, default True) Download files for different countries to different root_path subdirectories. If False, download all files to root_path
skip_existing (bool) – (optional, default True) Don’t re-download files if they exist in root_path. If False, existing files in root_path may be overwritten. Empty files will be re-downloaded regardless of this option.
- Return type:
None
NOTE: call url_to_files first to retrieve the URLs to download, or add the URLs directly with add_urls
- property expected_files: int
expected number of files to download
- property expected_size: int
expected download size in Mb
- property number_of_urls: int
number of unique URLs ready for download
- remove_url(url)[source]
remove URL from unique URLs ready for download
- Parameters:
url (str) –
- Return type:
None
- async summary(*download_infos)[source]
aggregated summary from multiple requests
- Parameters:
download_infos (ParquetData) – info about requested urls
- Return type:
None
- async url_to_files(*download_infos)[source]
multiple requests for file URLs, keeping only the unique URLs from the responses
- Parameters:
download_infos (ParquetData) – info about requested urls
- Return type:
None
- property urls: Iterable[str]
unique URLs ready for download
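Taken together, the Session methods above suggest a two-step workflow: collect URLs with url_to_files, then fetch them with download_to_directory. The sketch below uses only the signatures listed on this page. It assumes that the async context manager yields the session itself, the function name fetch_verified and the country code "NO" are illustrative, and it needs airbase installed plus network access to run.

```python
import asyncio
from pathlib import Path


async def fetch_verified(root_path: Path, country: str) -> int:
    """Request URLs for one country, download the files, and return the URL count."""
    # Imported lazily so this sketch can be loaded without airbase installed.
    from airbase.parquet_api import Dataset, ParquetData, Session

    info = ParquetData(country, Dataset.Verified)
    async with Session(progress=True, raise_for_status=False) as session:
        await session.url_to_files(info)  # fills the pool of unique URLs
        found = session.number_of_urls
        await session.download_to_directory(root_path, skip_existing=True)
    return found


# Example call (root_path must already exist):
# asyncio.run(fetch_verified(Path("data"), "NO"))
```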
- async airbase.parquet_api.download(dataset, root_path, *, countries, pollutants=None, cities=None, frequency=None, summary_only=False, metadata=False, country_subdir=True, overwrite=False, quiet=True, raise_for_status=False, session=<airbase.parquet_api.session.Session object>)[source]
request file urls by country|city/pollutant and download unique files
- Parameters:
dataset (Dataset) – Dataset.Historical, Dataset.Verified or Dataset.Unverified.
root_path (Path) – The directory to save files in (must exist).
countries (frozenset[str] | set[str]) – Request observations for these countries.
pollutants (frozenset[str] | set[str] | None) – (optional, default None) Limit requests to these specific pollutants.
cities (frozenset[str] | set[str] | None) – (optional, default None) Limit requests to these specific cities.
summary_only (bool) – (optional, default False) Request total files/size, nothing will be downloaded.
metadata (bool) – (optional, default False) Download station metadata into root_path/"metadata.csv".
country_subdir (bool) – (optional, default True) Download files for different countries to different root_path subdirectories. If False, download all files to root_path
overwrite (bool) – (optional, default False) Re-download existing files in root_path. If False, existing files will be skipped. Empty files will be re-downloaded regardless of this option.
quiet (bool) – (optional, default True) Disable progress bars.
raise_for_status (bool) – (optional, default False) Raise exceptions if any request returns a “bad” HTTP status code. If False, a warnings.warn() will be issued instead.
frequency (AggregationType | None) –
session (Session) –
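A call to download might look like the sketch below. It relies only on the parameters documented above. The wrapper name download_verified, the country codes and the pollutant name are illustrative placeholders, and running it needs airbase installed plus network access.

```python
import asyncio
from pathlib import Path


async def download_verified(root_path: Path) -> None:
    """Download hourly verified observations for two countries, plus station metadata."""
    # Imported lazily so this sketch can be loaded without airbase installed.
    from airbase.parquet_api import AggregationType, Dataset, download

    root_path.mkdir(parents=True, exist_ok=True)  # root_path must exist
    await download(
        Dataset.Verified,
        root_path,
        countries={"NO", "SE"},
        pollutants={"PM10"},
        frequency=AggregationType.Hourly,
        metadata=True,  # also writes root_path/"metadata.csv"
        quiet=False,  # show progress bars
    )


# Example call:
# asyncio.run(download_verified(Path("data")))
```

Passing summary_only=True instead would request only the total number of files and their size, without downloading anything.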
- airbase.parquet_api.request_info_by_city(dataset, *cities, pollutants=None, frequency=None)[source]
download info, one city at a time
- Parameters:
dataset (Dataset) –
pollutants (frozenset[str] | set[str] | None) –
frequency (AggregationType | None) –
- Return type:
- airbase.parquet_api.request_info_by_country(dataset, *countries, pollutants=None, frequency=None)[source]
download info, one country at a time
- Parameters:
dataset (Dataset) –
pollutants (frozenset[str] | set[str] | None) –
frequency (AggregationType | None) –
- Return type:
Submodules
airbase.parquet_api.client module
Client for Parquet downloads API v1 https://eeadmz1-downloads-api-appservice.azurewebsites.net/swagger/index.html
- class airbase.parquet_api.client.Client(*, timeout=None, max_concurrent=10)[source]
Bases: AbstractAsyncContextManager
Handle for requests to Parquet downloads API v1 https://eeadmz1-downloads-api-appservice.azurewebsites.net/swagger/index.html
- Parameters:
timeout (float | None) –
max_concurrent (int) –
- async city(payload)[source]
post request to /City
- Parameters:
payload (tuple[str, ...]) –
- Return type:
CityJSON
- async download_binary(url, path)[source]
get request to url, write the response body (in binary form) into a binary file, and return path (exactly as the input)
- Parameters:
url (str) –
path (Path) –
- Return type:
Path
- async download_metadata(path)[source]
download the compressed metadata file and return the path to the uncompressed CSV
- Parameters:
path (Path) –
- Return type:
Path
- async download_summary(payload)[source]
post request to /DownloadSummary
- Parameters:
payload (ParquetDataJSON) –
- Return type:
- async download_urls(payload)[source]
post request to /ParquetFile/urls
- Parameters:
payload (ParquetDataJSON) –
- Return type:
str
airbase.parquet_api.dataset module
- class airbase.parquet_api.dataset.AggregationType(value)[source]
Bases: str, Enum
represents the interval at which the collected values are obtained: 1. Hourly data. 2. Daily data. 3. Variable intervals (anything other than the previous two, such as weekly, monthly, etc.)
https://eeadmz1-downloads-webapp.azurewebsites.net/content/documentation/How_To_Downloads.pdf
- Daily = 'day'
- Hourly = 'hour'
- Other = 'var'
- VariableIntervals = 'var'
- class airbase.parquet_api.dataset.Dataset(value)[source]
Bases: IntEnum
1. Unverified data transmitted continuously (Up-To-Date/UTD/E2a) from the beginning of 2024. 2. Verified data (E1a) from 2013 to 2023, reported by countries by 30 September each year for the previous year. 3. Historical Airbase data delivered between 2002 and 2012, before Air Quality Directive 2008/50/EC entered into force.
https://eeadmz1-downloads-webapp.azurewebsites.net/content/documentation/How_To_Downloads.pdf
- Airbase = 3
- E1a = 2
- E2a = 1
- Historical = 3
- UDT = 1
- Unverified = 1
- Verified = 2
- class airbase.parquet_api.dataset.ParquetData(country, dataset, pollutant=None, city=None, frequency=None, source='API')[source]
Bases: NamedTuple
info needed for requesting the URLs for country and dataset; the request can be further restricted with the pollutant, city and frequency
Create new instance of ParquetData(country, dataset, pollutant, city, frequency, source)
- Parameters:
country (str) –
dataset (Dataset) –
pollutant (frozenset[str] | None) –
city (str | None) –
frequency (AggregationType | None) –
source (str) –
- city: str | None
Alias for field number 3
- country: str
Alias for field number 0
- frequency: AggregationType | None
Alias for field number 4
- pollutant: frozenset[str] | None
Alias for field number 2
- source: str
Alias for field number 5
- airbase.parquet_api.dataset.request_info_by_city(dataset, *cities, pollutants=None, frequency=None)[source]
download info, one city at a time
- Parameters:
dataset (Dataset) –
pollutants (frozenset[str] | set[str] | None) –
frequency (AggregationType | None) –
- Return type:
- airbase.parquet_api.dataset.request_info_by_country(dataset, *countries, pollutants=None, frequency=None)[source]
download info, one country at a time
- Parameters:
dataset (Dataset) –
pollutants (frozenset[str] | set[str] | None) –
frequency (AggregationType | None) –
- Return type:
airbase.parquet_api.session module
- class airbase.parquet_api.session.Session(*, progress=False, raise_for_status=True)[source]
Bases: AbstractAsyncContextManager
- Parameters:
progress (bool) – (optional, default False) Show progress bars
raise_for_status (bool) – (optional, default True) Raise exceptions if any request from the summary, url_to_files or download_to_directory methods returns a “bad” HTTP status code. If False, a warnings.warn() will be issued instead.
- add_expected(numberFiles, size)[source]
add to the expected download files and size in Mb
- Parameters:
numberFiles (int) –
size (int) –
- Return type:
None
- add_urls(more_urls)[source]
add to the unique URLs ready for download
- Parameters:
more_urls (Iterable[str]) –
- Return type:
int
- async cities(*countries)[source]
city names id and notation from API
- Parameters:
countries (str) –
- Return type:
defaultdict[str, set[str]]
- async download_metadata(path, skip_existing=True)[source]
download station metadata into the given path.
- Parameters:
path (Path) – pathlib.Path to the station metadata (parent directory must exist)
skip_existing (bool) – (optional, default True) Don’t re-download metadata if path already exists. If False, path may be overwritten.
- Return type:
None
- async download_to_directory(root_path, *, country_subdir=True, skip_existing=True)[source]
download into a directory
- Parameters:
root_path (Path) – The directory to save files in (must exist)
country_subdir (bool) – (optional, default True) Download files for different countries to different root_path subdirectories. If False, download all files to root_path
skip_existing (bool) – (optional, default True) Don’t re-download files if they exist in root_path. If False, existing files in root_path may be overwritten. Empty files will be re-downloaded regardless of this option.
- Return type:
None
NOTE: call url_to_files first to retrieve the URLs to download, or add the URLs directly with add_urls
- property expected_files: int
expected number of files to download
- property expected_size: int
expected download size in Mb
- property number_of_urls: int
number of unique URLs ready for download
- remove_url(url)[source]
remove URL from unique URLs ready for download
- Parameters:
url (str) –
- Return type:
None
- async summary(*download_infos)[source]
aggregated summary from multiple requests
- Parameters:
download_infos (ParquetData) – info about requested urls
- Return type:
None
- async url_to_files(*download_infos)[source]
multiple requests for file URLs, keeping only the unique URLs from the responses
- Parameters:
download_infos (ParquetData) – info about requested urls
- Return type:
None
- property urls: Iterable[str]
unique URLs ready for download
- async airbase.parquet_api.session.download(dataset, root_path, *, countries, pollutants=None, cities=None, frequency=None, summary_only=False, metadata=False, country_subdir=True, overwrite=False, quiet=True, raise_for_status=False, session=<airbase.parquet_api.session.Session object>)[source]
request file urls by country|city/pollutant and download unique files
- Parameters:
dataset (Dataset) – Dataset.Historical, Dataset.Verified or Dataset.Unverified.
root_path (Path) – The directory to save files in (must exist).
countries (frozenset[str] | set[str]) – Request observations for these countries.
pollutants (frozenset[str] | set[str] | None) – (optional, default None) Limit requests to these specific pollutants.
cities (frozenset[str] | set[str] | None) – (optional, default None) Limit requests to these specific cities.
summary_only (bool) – (optional, default False) Request total files/size, nothing will be downloaded.
metadata (bool) – (optional, default False) Download station metadata into root_path/"metadata.csv".
country_subdir (bool) – (optional, default True) Download files for different countries to different root_path subdirectories. If False, download all files to root_path
overwrite (bool) – (optional, default False) Re-download existing files in root_path. If False, existing files will be skipped. Empty files will be re-downloaded regardless of this option.
quiet (bool) – (optional, default True) Disable progress bars.
raise_for_status (bool) – (optional, default False) Raise exceptions if any request returns a “bad” HTTP status code. If False, a warnings.warn() will be issued instead.
frequency (AggregationType | None) –
session (Session) –
- airbase.parquet_api.session.pollutant_id_from_url(url)[source]
numeric pollutant id from urls like
http://dd.eionet.europa.eu/vocabulary/aq/pollutant/1
http://dd.eionet.europa.eu/vocabularyconcept/aq/pollutant/44/view
- Parameters:
url (str) –
- Return type:
int
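Both URL forms in the description carry the id in the path segment right after "pollutant". The sketch below shows one way to extract it with a regular expression. The helper name pollutant_id and the error handling are assumptions and not necessarily how pollutant_id_from_url is written.

```python
import re

# The id is the run of digits after "/pollutant/", followed by "/" or the end.
_POLLUTANT_ID = re.compile(r"/pollutant/(\d+)(?:/|$)")


def pollutant_id(url: str) -> int:
    """Extract the numeric id that follows '/pollutant/' (hypothetical helper)."""
    match = _POLLUTANT_ID.search(url)
    if match is None:
        raise ValueError(f"no pollutant id found in {url!r}")
    return int(match.group(1))


ids = [
    pollutant_id("http://dd.eionet.europa.eu/vocabulary/aq/pollutant/1"),
    pollutant_id("http://dd.eionet.europa.eu/vocabularyconcept/aq/pollutant/44/view"),
]
```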
airbase.parquet_api.types module
type annotations from https://eeadmz1-downloads-api-appservice.azurewebsites.net/swagger/index.html
- class airbase.parquet_api.types.CityData[source]
Bases: TypedDict
part of /City response
- cityName: str
- countryCode: str
- class airbase.parquet_api.types.CountryData[source]
Bases: TypedDict
part of /Country response
- countryCode: str
- countryName: str
- class airbase.parquet_api.types.DownloadSummaryJSON[source]
Bases: TypedDict
full /DownloadSummary response
- numberFiles: int
- size: int
- class airbase.parquet_api.types.ParquetDataJSON[source]
Bases: TypedDict
request payload to /DownloadSummary, /ParquetFile and /ParquetFile/urls
- aggregationType: NotRequired[Literal['hour', 'day', 'var'] | AggregationType]
- cities: list[str]
- countries: list[str]
- dateTimeEnd: NotRequired[str]
- dateTimeStart: NotRequired[str]
- pollutants: list[str]
- source: NotRequired[str]
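In ParquetDataJSON, the keys marked NotRequired may be left out of the payload, while countries, cities and pollutants are always present. The sketch below builds a plain dict of that shape. The helper make_payload is hypothetical, the country code "NO" is a placeholder, and the date keys are omitted for brevity.

```python
import json
from typing import Dict, List, Optional, Union

Payload = Dict[str, Union[str, List[str]]]


def make_payload(
    countries: List[str],
    cities: Optional[List[str]] = None,
    pollutants: Optional[List[str]] = None,
    aggregation_type: Optional[str] = None,
    source: Optional[str] = None,
) -> Payload:
    """Build a dict with the documented keys, leaving out unset optional ones."""
    payload: Payload = {
        "countries": countries,
        "cities": cities or [],
        "pollutants": pollutants or [],
    }
    if aggregation_type is not None:
        payload["aggregationType"] = aggregation_type
    if source is not None:
        payload["source"] = source
    return payload


minimal = make_payload(["NO"])
hourly = make_payload(["NO"], aggregation_type="hour", source="API")
body = json.dumps(hourly, sort_keys=True)  # JSON text for the request body
```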