airbase.parquet_api.session module
- class airbase.parquet_api.session.Session(*, progress=False, raise_for_status=True)[source]
Bases: AbstractAsyncContextManager
- Parameters:
progress (bool) – (optional, default False) Show progress bars
raise_for_status (bool) – (optional, default True) Raise exceptions if any request from the summary, url_to_files or download_to_directory methods returns a “bad” HTTP status code. If False, a warnings.warn() will be issued instead.
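When raise_for_status is False, failures surface as warnings rather than exceptions, so a caller may want to collect them. The sketch below shows one way to do that with the standard library. `fake_request` is a hypothetical stand-in, not part of airbase, and the exception type it raises is an assumption, since the source does not say which exception the library uses.

```python
import warnings


def fake_request(status: int, *, raise_for_status: bool = True) -> int:
    """Hypothetical stand-in for a request; not part of airbase."""
    if status >= 400:
        message = f"bad HTTP status {status}"
        if raise_for_status:
            # Assumption: the real exception type is not stated in the source.
            raise RuntimeError(message)
        warnings.warn(message)
    return status


# Record warnings instead of letting them print to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fake_request(404, raise_for_status=False)

messages = [str(w.message) for w in caught]
print(messages)
```

The same `warnings.catch_warnings(record=True)` block could wrap calls made with a session created with raise_for_status=False.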
- add_expected(numberFiles, size)[source]
add to the expected number of files and the expected download size in Mb
- Parameters:
numberFiles (int) –
size (int) –
- Return type:
None
- add_urls(more_urls)[source]
add to the unique URLs ready for download
- Parameters:
more_urls (Iterable[str]) –
- Return type:
int
- async cities(*countries)[source]
city names id and notation from API
- Parameters:
countries (str) –
- Return type:
defaultdict[str, set[str]]
- async download_metadata(path, skip_existing=True)[source]
download station metadata into the given path.
- Parameters:
path (Path) – pathlib.Path to the station metadata (parent directory must exist)
skip_existing (bool) – (optional, default True) Don’t re-download metadata if path already exists. If False, path may be overwritten.
- Return type:
None
- async download_to_directory(root_path, *, country_subdir=True, skip_existing=True)[source]
download into a directory
- Parameters:
root_path (Path) – The directory to save files in (must exist)
country_subdir (bool) – (optional, default True) Download files for different countries to different root_path subdirectories. If False, download all files to root_path.
skip_existing (bool) – (optional, default True) Don’t re-download files if they exist in root_path. If False, existing files in root_path may be overwritten. Empty files will be re-downloaded regardless of this option.
- Return type:
None
NOTE: call url_to_files first to retrieve the URLs to download, or add the URLs directly with add_urls.
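The second route in the note, adding URLs directly, can be sketched as below. This is a minimal sketch that uses only the methods documented on this page (add_urls, number_of_urls, download_to_directory). The helper name `download_known_urls` is hypothetical, and calling add_urls inside the `async with` block is an assumption rather than a documented requirement.

```python
from pathlib import Path
from typing import Iterable


async def download_known_urls(session, urls: Iterable[str], root_path: Path) -> int:
    """Queue URLs on a Session-like object, download them, return how many were queued."""
    # download_to_directory requires the directory to exist beforehand.
    root_path.mkdir(parents=True, exist_ok=True)
    async with session:
        session.add_urls(urls)
        queued = session.number_of_urls  # unique URLs ready for download
        await session.download_to_directory(
            root_path, country_subdir=True, skip_existing=True
        )
    return queued
```

With airbase installed, a `Session(progress=True)` created inside a running coroutine could be passed as the first argument, with URLs obtained elsewhere.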
- property expected_files: int
expected number of files to download
- property expected_size: int
expected download size in Mb
- property number_of_urls: int
number of unique URLs ready for download
- remove_url(url)[source]
remove URL from unique URLs ready for download
- Parameters:
url (str) –
- Return type:
None
- async summary(*download_infos)[source]
aggregated summary from multiple requests
- Parameters:
download_infos (ParquetData) – info about requested urls
- Return type:
None
- async url_to_files(*download_infos)[source]
make multiple requests for file URLs and collect only the unique URLs from the responses
- Parameters:
download_infos (ParquetData) – info about requested urls
- Return type:
None
- property urls: Iterable[str]
unique URLs ready for download
- async airbase.parquet_api.session.download(dataset, root_path, *, countries, pollutants=None, cities=None, frequency=None, summary_only=False, metadata=False, country_subdir=True, overwrite=False, quiet=True, raise_for_status=False, session=<airbase.parquet_api.session.Session object>)[source]
request file urls by country|city/pollutant and download unique files
- Parameters:
dataset (Dataset) – Dataset.Historical, Dataset.Verified or Dataset.Unverified.
root_path (Path) – The directory to save files in (must exist).
countries (frozenset[str] | set[str]) – Request observations for these countries.
pollutants (frozenset[str] | set[str] | None) – (optional, default None) Limit requests to these specific pollutants.
cities (frozenset[str] | set[str] | None) – (optional, default None) Limit requests to these specific cities.
summary_only (bool) – (optional, default False) Request total files/size, nothing will be downloaded.
metadata (bool) – (optional, default False) Download station metadata into root_path/”metadata.csv”.
country_subdir (bool) – (optional, default True) Download files for different countries to different root_path subdirectories. If False, download all files to root_path.
overwrite (bool) – (optional, default False) Re-download existing files in root_path. If False, existing files will be skipped. Empty files will be re-downloaded regardless of this option.
quiet (bool) – (optional, default True) Disable progress bars.
raise_for_status (bool) – (optional, default False) Raise exceptions if any request returns a “bad” HTTP status code. If False, a warnings.warn() will be issued instead.
frequency (AggregationType | None) –
session (Session) –
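The overwrite rule described above (skip existing files unless overwrite is set, but always re-download empty files) can be written as a small predicate. This is a sketch of the documented behaviour, not the library's code, and `needs_download` is a hypothetical name.

```python
from pathlib import Path


def needs_download(path: Path, *, overwrite: bool = False) -> bool:
    """Decide whether a target file should be (re-)downloaded."""
    if not path.exists():
        return True  # nothing on disk yet
    if path.stat().st_size == 0:
        return True  # empty files are re-downloaded regardless of overwrite
    return overwrite  # non-empty files are replaced only on request
```

The same predicate with `overwrite=not skip_existing` would express the skip_existing option of Session.download_to_directory.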
- airbase.parquet_api.session.pollutant_id_from_url(url)[source]
numeric pollutant id from urls like
http://dd.eionet.europa.eu/vocabulary/aq/pollutant/1
http://dd.eionet.europa.eu/vocabularyconcept/aq/pollutant/44/view
- Parameters:
url (str) –
- Return type:
int
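For illustration, the extraction can be sketched with a regular expression. `pollutant_id_sketch` is a hypothetical stand-in, not the library's implementation, and it assumes the id is the run of digits that follows "/pollutant/" in both URL shapes shown above.

```python
import re

# Assumed pattern: the id is the run of digits following "/pollutant/".
_POLLUTANT_ID = re.compile(r"/pollutant/(\d+)")


def pollutant_id_sketch(url: str) -> int:
    """Hypothetical stand-in; extracts the numeric pollutant id from a URL."""
    match = _POLLUTANT_ID.search(url)
    if match is None:
        raise ValueError(f"no pollutant id found in {url!r}")
    return int(match.group(1))


print(pollutant_id_sketch("http://dd.eionet.europa.eu/vocabulary/aq/pollutant/1"))  # 1
print(pollutant_id_sketch("http://dd.eionet.europa.eu/vocabularyconcept/aq/pollutant/44/view"))  # 44
```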