siphon.catalog
Code to support reading and parsing catalog files from a THREDDS Data Server (TDS).
It helps identify the latest dataset and find the proper URLs to access the data.
-
class siphon.catalog.CaseInsensitiveDict(*args, **kwargs)
Extend dict to use a case-insensitive key set.
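The idea of a case-insensitive key set can be illustrated with a small stand-alone sketch. The class name `CaseInsensitiveKeyDict` and the example URL are hypothetical, and this version simply stores lower-cased keys, which the library's own class may well not do.

```python
class CaseInsensitiveKeyDict(dict):
    """Hypothetical sketch: a dict that lower-cases string keys on the way in and out."""

    @staticmethod
    def _fold(key):
        # Only strings are folded; other hashable keys pass through unchanged.
        return key.lower() if isinstance(key, str) else key

    def __init__(self, *args, **kwargs):
        super().__init__()
        # Route the initial items through __setitem__ so they are folded too.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value

    def __setitem__(self, key, value):
        super().__setitem__(self._fold(key), value)

    def __getitem__(self, key):
        return super().__getitem__(self._fold(key))

    def __contains__(self, key):
        return super().__contains__(self._fold(key))

    def get(self, key, default=None):
        return super().get(self._fold(key), default)


# Example URL is made up for illustration.
urls = CaseInsensitiveKeyDict({'OPENDAP': 'https://example.invalid/dodsC/data.nc'})
same_url = urls['opendap'] == urls['OpenDAP']
```

With a mapping like this, a lookup succeeds regardless of how a catalog capitalizes a service type.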
-
class siphon.catalog.CaseInsensitiveStr(*args)
Extend str to use case-insensitive comparison and lookup.
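One way a string type can compare and hash without regard to case is sketched below. `FoldedStr` is a hypothetical name, and the approach (folding to lower case in `__eq__` and `__hash__`) is an assumption for illustration, not a description of the library's implementation.

```python
class FoldedStr(str):
    """Hypothetical sketch: a str that compares and hashes by its lower-cased form."""

    def __eq__(self, other):
        if isinstance(other, str):
            return self.lower() == other.lower()
        return NotImplemented

    def __ne__(self, other):
        # str supplies its own case-sensitive __ne__, so it must be overridden too.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Defining __eq__ clears the inherited __hash__; restore one that
        # agrees with the case-insensitive equality above.
        return hash(self.lower())


service = FoldedStr('OPENDAP')
lookup = {service: 'dodsC'}
found = FoldedStr('OpenDap') in lookup
```

Keeping `__hash__` consistent with `__eq__` is what makes such strings usable as dictionary keys.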
-
class siphon.catalog.CatalogRef(base_url, element_node)
An object for holding catalog references obtained from a THREDDS Client Catalog.
-
name
str – The name of the CatalogRef element
-
href
str – url to the CatalogRef’s THREDDS Client Catalog
-
title
str – Title of the CatalogRef element
-
follow()
Follow the catalog reference and return a new TDSCatalog.
Returns: The referenced catalog
Return type: TDSCatalog
-
-
class siphon.catalog.CompoundService(service_node)
Hold information about compound services.
-
name
str – The name of the compound service
-
service_type
str – The service type (for this object, service type will always be “COMPOUND”)
-
services
list[SimpleService] – A list of SimpleService objects
-
__init__(service_node)
Initialize a CompoundService object.
Parameters: service_node (Element) – An Element representing a compound service node
-
-
class siphon.catalog.Dataset(element_node, catalog_url='')
An object for holding Datasets obtained from a THREDDS Client Catalog.
-
url_path
str – url to the accessible dataset
-
access_urls
CaseInsensitiveDict[str, str] – A dictionary of access urls whose keywords are the access service types defined in the catalog (for example, “OPENDAP”, “NetcdfSubset”, “WMS”, etc.)
-
access_with_service(service, use_xarray=None)
Access the dataset using a particular service.
Return a Python object capable of communicating with the server using the particular service. For instance, for ‘HTTPServer’ this is a file-like object capable of HTTP communication; for OPENDAP this is a netCDF4 dataset.
Parameters: service (str) – The name of the service for accessing the dataset
Returns: An instance appropriate for communicating using service.
-
download(filename=None)
Download the dataset to a local file.
Parameters: filename (str, optional) – The full path to which the dataset will be saved
-
make_access_urls(catalog_url, all_services, metadata=None)
Make fully qualified urls for the access methods enabled on the dataset.
Parameters:
- catalog_url (str) – The top level server url
- all_services (List[SimpleService]) – List of SimpleService objects associated with the dataset
- metadata (dict) – Metadata from the TDSCatalog
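As a rough illustration of what "fully qualified" means here, the sketch below joins a server address, a per-service base path, and a dataset path. The helper `build_access_urls`, the host name, and the base paths are all hypothetical; the real method works from SimpleService objects and metadata and may differ in its details.

```python
from urllib.parse import urljoin


def build_access_urls(server_url, service_bases, url_path):
    """Hypothetical helper: join server, service base path, and dataset path."""
    return {
        service_type: urljoin(server_url, base + url_path)
        for service_type, base in service_bases.items()
    }


# Host, base paths, and dataset path are made-up values for illustration.
access_urls = build_access_urls(
    'https://thredds.example.invalid/',
    {'OPENDAP': '/thredds/dodsC/', 'HTTPServer': '/thredds/fileServer/'},
    'grib/model/latest.grib2',
)
```

The result is one URL per service type, which is the shape the access_urls attribute is described as having.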
-
remote_access(service=None, use_xarray=None)
Access the remote dataset.
Open the remote dataset and get a netCDF4-compatible Dataset object providing index-based subsetting capabilities.
Parameters: service (str, optional) – The name of the service to use for access to the dataset, either ‘CdmRemote’ or ‘OPENDAP’. Defaults to ‘CdmRemote’.
Returns: Object for netCDF4-like access to the dataset
Return type: Dataset
-
remote_open()
Open the remote dataset for random access.
Get a file-like object for reading from the remote dataset, providing random access, similar to a local file.
Returns: A random access, file-like object
-
resolve_url(catalog_url)
Resolve the url of the dataset when reading latest.xml.
Parameters: catalog_url (str) – The catalog url to be resolved
-
subset(service=None)
Subset the dataset.
Open the remote dataset and get a client for talking to service.
Parameters: service (str, optional) – The name of the service for subsetting the dataset. Defaults to ‘NetcdfSubset’ or ‘NetcdfServer’, in that order, depending on the services listed in the catalog.
Returns: A client for communicating using service
-
-
class siphon.catalog.DatasetCollection
Extend IndexableMapping to allow datetime-based filter queries.
-
filter_time_nearest(time, regex=None)
Filter keys for an item closest to the desired time.
Loops over all keys in the collection and uses regex to extract and build datetimes. The collection of datetimes is compared to time, and the value whose datetime is closest to that requested is returned. If none of the keys in the collection match the regex, indicating that the keys are not date/time-based, a ValueError is raised.
Parameters:
- time (datetime.datetime) – The desired time
- regex (str, optional) – The regular expression to use to extract date/time information from the key. If given, this should contain named groups: ‘year’, ‘month’, ‘day’, ‘hour’, ‘minute’, ‘second’, and ‘microsecond’, as appropriate. When a match is found, any of those groups missing from the pattern will be assigned a value of 0. The default pattern looks for patterns like: 20171118_2356.
Returns: The value with a time closest to that desired
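The nearest-time selection described above can be sketched in plain Python. The pattern `STAMP`, the helper names, and the file names are assumptions made for illustration; the library's default regex and internals may differ.

```python
import re
from datetime import datetime

# Assumed pattern for keys containing stamps such as 20171118_2356;
# the library's actual default regex may differ.
STAMP = re.compile(
    r'(?P<year>\d{4})(?P<month>\d{2})(?P<day>\d{2})_'
    r'(?P<hour>\d{2})(?P<minute>\d{2})'
)


def key_to_datetime(key, pattern=STAMP):
    """Return a datetime built from the named groups, or None if no match."""
    match = pattern.search(key)
    if match is None:
        return None
    # Fields without a named group (second, microsecond) default to 0.
    return datetime(**{name: int(text) for name, text in match.groupdict().items()})


def nearest_value(mapping, time, pattern=STAMP):
    """Hypothetical helper: return the value whose key is closest to `time`."""
    dated = [(key_to_datetime(key, pattern), value) for key, value in mapping.items()]
    dated = [(stamp, value) for stamp, value in dated if stamp is not None]
    if not dated:
        raise ValueError('No keys matched the date/time pattern.')
    return min(dated, key=lambda pair: abs(pair[0] - time))[1]


# Made-up file names standing in for dataset keys.
files = {
    'Level3_20171118_2350.nids': 'a',
    'Level3_20171118_2356.nids': 'b',
    'Level3_20171119_0002.nids': 'c',
}
closest = nearest_value(files, datetime(2017, 11, 18, 23, 57))
```

Here the 23:56 entry wins because it is one minute from the requested time, while the others are five and seven minutes away.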
-
filter_time_range(start, end, regex=None)
Filter keys for all items within the desired time range.
Loops over all keys in the collection and uses regex to extract and build datetimes. From the collection of datetimes, all values between start and end (inclusive) are returned. If none of the keys in the collection match the regex, indicating that the keys are not date/time-based, a ValueError is raised.
Parameters:
- start (datetime.datetime) – The start of the desired time range, inclusive
- end (datetime.datetime) – The end of the desired time range, inclusive
- regex (str, optional) – The regular expression to use to extract date/time information from the key. If given, this should contain named groups: ‘year’, ‘month’, ‘day’, ‘hour’, ‘minute’, ‘second’, and ‘microsecond’, as appropriate. When a match is found, any of those groups missing from the pattern will be assigned a value of 0. The default pattern looks for patterns like: 20171118_2356.
Returns: All values corresponding to times within the specified range
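To show how a custom regex with only some of the named groups could drive a range filter, here is a stand-alone sketch. The pattern `HOURLY`, the helper `values_in_range`, and the file names are hypothetical; groups absent from the pattern are set to 0, as the parameter description states.

```python
import re
from datetime import datetime

# Hypothetical file names that carry only a date and an hour.
HOURLY = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})T(?P<hour>\d{2})')
FIELDS = ('year', 'month', 'day', 'hour', 'minute', 'second', 'microsecond')


def values_in_range(mapping, start, end, pattern):
    """Hypothetical helper: values whose key time lies in [start, end]."""
    selected = []
    matched_any = False
    for key, value in mapping.items():
        match = pattern.search(key)
        if match is None:
            continue
        matched_any = True
        groups = match.groupdict()
        # Groups absent from the pattern fall back to 0.
        stamp = datetime(*(int(groups.get(name, 0)) for name in FIELDS))
        if start <= stamp <= end:
            selected.append(value)
    if not matched_any:
        raise ValueError('No keys matched the date/time pattern.')
    return selected


runs = {
    'gfs_2017-11-18T00.grib2': 'run00',
    'gfs_2017-11-18T06.grib2': 'run06',
    'gfs_2017-11-18T12.grib2': 'run12',
    'gfs_2017-11-18T18.grib2': 'run18',
}
window = values_in_range(
    runs, datetime(2017, 11, 18, 6), datetime(2017, 11, 18, 12), HOURLY
)
```

Both endpoints are included, so the 06 and 12 runs are selected.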
-
-
class siphon.catalog.IndexableMapping
Extend OrderedDict to allow index-based access to values.
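Index-based access on top of an ordered mapping might look like the following sketch. `IndexableOrderedDict` is a hypothetical name, and falling back to positional lookup only after a key lookup fails is an assumption about one reasonable design, not a statement of how the library does it.

```python
from collections import OrderedDict


class IndexableOrderedDict(OrderedDict):
    """Hypothetical sketch: fall back to positional lookup for integer keys."""

    def __getitem__(self, key):
        try:
            return super().__getitem__(key)
        except KeyError:
            if isinstance(key, int):
                # Treat the integer as a position in insertion order.
                return list(self.values())[key]
            raise


datasets = IndexableOrderedDict([('a.nc', 'first'), ('b.nc', 'second')])
first = datasets[0]
last = datasets[-1]
by_name = datasets['b.nc']
```

Trying the key first means that a mapping with genuine integer keys still behaves like an ordinary dictionary.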
-
class siphon.catalog.SimpleService(service_node)
Hold information about an access service enabled on a dataset.
-
name
str – The name of the service
-
service_type
str – The service type (e.g. “OPENDAP”, “NetcdfSubset”, “WMS”, etc.)
-
access_urls
dict[str, str] – A dictionary of access urls whose keywords are the access service types defined in the catalog (for example, “OPENDAP”, “NetcdfSubset”, “WMS”, etc.)
-
-
class siphon.catalog.TDSCatalog(catalog_url)
Parse information from a THREDDS Client Catalog.
-
catalog_url
str – The url path of the catalog to parse.
-
base_tds_url
str – The top level server address
-
datasets
DatasetCollection[str, Dataset] – A dictionary of Dataset objects, keyed by dataset name
-
services
List – A list of SimpleService objects listed in the catalog
-
catalog_refs
DatasetCollection[str, CatalogRef] – A dictionary of CatalogRef objects, keyed by catalog ref title
-
__init__(catalog_url)
Initialize the TDSCatalog object.
Parameters: catalog_url (str) – The URL of a THREDDS client catalog
-
latest
Get the latest dataset, if available.
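To make the datasets, services, and catalog_refs attributes more concrete, the sketch below pulls the same kinds of entries out of a small, hand-written catalog document using only the standard library. The XML content is hypothetical, and this is not how TDSCatalog itself is implemented; it only shows which catalog elements the attributes correspond to.

```python
import xml.etree.ElementTree as ET

# Hand-written, hypothetical catalog used purely for illustration.
CATALOG_XML = """<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         xmlns:xlink="http://www.w3.org/1999/xlink" name="Example">
  <service name="odap" serviceType="OPENDAP" base="/thredds/dodsC/"/>
  <dataset name="model_20171118_0000.nc" urlPath="model/model_20171118_0000.nc"/>
  <catalogRef xlink:href="archive/catalog.xml" xlink:title="Archive" name=""/>
</catalog>
"""

NS = {'t': 'http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0'}
XLINK = '{http://www.w3.org/1999/xlink}'

root = ET.fromstring(CATALOG_XML)

# Datasets keyed by name, catalog references keyed by title.
datasets = {d.get('name'): d.get('urlPath') for d in root.iterfind('t:dataset', NS)}
services = [(s.get('name'), s.get('serviceType')) for s in root.iterfind('t:service', NS)]
catalog_refs = {
    r.get(XLINK + 'title'): r.get(XLINK + 'href')
    for r in root.iterfind('t:catalogRef', NS)
}
```

Keying datasets by name and references by title mirrors the attribute descriptions given for the class.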
-
-
siphon.catalog.get_latest_access_url(catalog_url, access_method)
Get the data access url to the latest data using a specified access method.
This works for data available from a top-level dataset catalog (url). Currently, only a single “latest” dataset is supported.
Returns: access_url – Data access URL to be used to access the latest data available from a given catalog using the specified access_method. Typically a single string, but not always.