sa_gwdata Python package

The sa_gwdata package provides convenient access to the WaterConnect web services (see following page) and returns data as pandas.DataFrame instances. Features under development aim to make it easier to find and use well IDs across the entire interface.

Install

You can install sa_gwdata the usual way:

> pip install -U python-sa-gwdata

This will install and/or update the Python package sa_gwdata.

Usage

You can locate wells with a plain-text search for well identifiers:

>>> import sa_gwdata
>>> wells = sa_gwdata.find_wells("ADE206 ADE207")
>>> wells
[<sa_gwdata.Well(259424) 6628-25427 / ADE206 / DFW T2>,
 <sa_gwdata.Well(259425) 6628-25428 / ADE207 / DFW T1>
]

Then retrieve their data as pandas DataFrames:

>>> wls = sa_gwdata.water_levels(wells)
>>> wls.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 55 entries, 0 to 54
Data columns (total 21 columns):
DHNO               55 non-null int64
network            55 non-null object
Unit_Number        55 non-null int64
Aquifer            55 non-null object
Easting            55 non-null float64
Northing           55 non-null float64
Zone               55 non-null int64
Unit_No            55 non-null object
Obs_No             55 non-null object
obs_date           55 non-null object
dtw                48 non-null float64
swl                48 non-null float64
rswl               48 non-null float64
pressure           8 non-null float64
temperature        2 non-null float64
dry_ind            0 non-null float64
anom_ind           55 non-null object
pump_ind           55 non-null object
measured_during    55 non-null object
data_source        55 non-null object
Comments           18 non-null object
dtypes: float64(8), int64(3), object(10)
memory usage: 9.1+ KB

Access to WaterConnect webservices

To create the Groundwater Data session wrapper:

>>> from sa_gwdata import WaterConnectSession
>>> session = WaterConnectSession()

Then to access any of the web service calls:

>>> response = session.get("GetObswellNetworkData", params={"Network": "KAT_FP,PIKE_FP"})
>>> len(response.df)
190
>>> response.df.columns
Index(['aq_mon', 'chem', 'class', 'dhno', 'drill_date', 'lat',
   'latest_open_date', 'latest_open_depth', 'latest_sal_date',
   'latest_swl_date', 'latest_yield_date', 'litholog', 'logdrill', 'lon',
   'mapnum', 'max_depth', 'name', 'nrm', 'obsnetwork', 'obsnumber',
   'permit_no', 'purp_desc', 'pwa', 'replaceunitnum', 'sal', 'salstatus',
   'stat_desc', 'swl', 'swlstatus', 'tds', 'water', 'yield'],
  dtype='object')
>>> response.df.obsnumber.unique()
array(['KTR043', 'KTR023', 'KTR025', 'KTR026', 'PYP008', 'PAG003',
       'KTR065', 'LVD002', 'RMK004', 'RMK010', 'RMK006', 'RMK007',
       'KTR021', 'KTR022', 'RMK074', 'RMK080', 'RMK077', 'RMK055',
       'KTR034', 'RMK214', 'RMK215', 'RMK216', 'RMK229', 'RMK233',
       'GDN044', 'GDN055', 'GDN064', 'RMK355', 'RMK356', 'PAG069',
       'PAG070', 'PAG071', 'PAG077', 'PAG078', 'PAG079', 'PAG080',
       'PAG081', 'PAG082', 'PAG083', 'PAG084', 'PAG085', 'PAG086',
       'PAG038', 'PAG042', 'PAG043', 'PAG044', 'PAG045', 'PAG059',
       'PAG058', 'GDN186', 'RMK361', 'MTH012', 'PAG068', 'GDN128',
       'GDN132', 'GDN187', 'GDN188', 'PAG104', 'PYP055', 'RMK357',
       'RMK363', 'RMK365', 'RMK359', 'RMK362', 'RMK385', 'RMK374',
       'KTR060', 'KTR061', 'RMK368', 'GDN185', 'RMK369', 'RMK375',
       'PAG142', 'PAG162', 'PAG161', 'PAG117', 'RMK379', 'PAG130',
       'PAG129', 'PAG116', 'PAG115', 'MTH021', 'PAG089', 'PAG091',
       'PAG092', 'PAG094', 'PAG097', 'RMK370', 'RMK371', 'KTR067',
       'KTR068', 'RMK367', 'RMK347', 'RMK348', 'RMK349', 'RMK382',
       'RMK380', 'RMK381', 'PAG118', 'PAG114', 'PAG119', 'RMK354',
       'RMK384', 'RMK383', 'RMK364', 'RMK360', 'RMK366', 'KTR066',
       'RMK358', 'RMK373', 'PAG158', 'PAG155', 'PAG152', 'PAG135',
       'PAG134', 'PAG131', 'PAG143', 'PAG146', 'PAG151', 'PAG147',
       'PAG168', 'PAG165', 'RMK376', 'KTR058', 'KTR062', 'RMK372',
       'KTR064', 'KTR063', 'RMK377', 'KTR059', 'PAG139', 'PAG140',
       'PAG169', 'PAG170', 'PAG175', 'PAG153', 'PAG154', 'PAG157',
       'PAG156', 'PAG159', 'PAG160', 'PAG133', 'PAG132', 'PAG136',
       'PAG150', 'PAG149', 'PAG148', 'PAG145', 'PAG144', 'PAG122',
       'PAG174', 'PAG163', 'PAG173', 'PAG164', 'PAG166', 'PAG176',
       'PAG167', 'PAG141', 'PAG171', 'PAG138', 'PAG120', 'PAG137',
       'PAG177', 'PAG172', 'PAG123', 'PAG121', 'RMK386', 'PAG180',
       'PAG182', 'PAG181', 'PAG183', 'PAG179', 'PAG178', 'KTR071',
       'RMK388', 'RMK389', 'PAG184', 'PAG185', 'PAG186', 'PAG187',
       'PAG188', 'PAG189', 'KTR070', 'RMK392', 'KTR069', 'RMK395',
       'RMK394', 'RMK393', 'RMK390', 'RMK391'], dtype=object)

For further information, check out the Jupyter Notebook tutorial.

Docstrings

Access to data

sa_gwdata.find_wells(input_text, **kwargs)[source]

Find wells and retrieve some summary information.

Parameters: input_text (str) – any well identifiers to parse. See sa_gwdata.parse_well_ids_plaintext() for details of other keyword arguments you can pass here.

For example:

>>> import sa_gwdata
>>> wells = sa_gwdata.find_wells("yat99 5840-46 ULE205")
...
>>> wells
['MLC008', 'ULE205', 'YAT099']

sa_gwdata.parse_well_ids_plaintext(input_text, types=('unit_no', 'obs_no'), unit_no_prefix='', obs_no_prefix='', dh_re_prefix='\\A')[source]

Parse possible well identifiers out of plain text.

Parameters:
  • input_text (str) – the text to parse well identifiers from. Can include multiple lines.
  • types (tuple) – types of identifiers to look for. Currently supported: “unit_no”, “obs_no”, “dh_no”
  • dh_re_prefix (str) – regexp pattern required before a dh_no regexp will match

Returns: a list of tuples e.g.

>>> from sa_gwdata import parse_well_ids
>>> parse_well_ids('sle15')
[('obs_no', 'SLE015')]
>>> parse_well_ids('6628150')
[]
>>> parse_well_ids('6628-150')
[('unit_no', '6628-150')]
>>> parse_well_ids('662800150')
[('unit_no', '6628-150')]
>>> parse_well_ids('259001', types=["dh_no"])
[('dh_no', '259001')]

Remember that this doesn’t actually check whether these identifiers correspond to a well in the real world; it just parses a string of text to find possible well identifiers. It’s pretty robust, though not infallible:

>>> parse_well_ids("SLE 15, SLE16, and also maybe 5910-1")
[('unit_no', '5910-1'), ('obs_no', 'SLE015'), ('obs_no', 'SLE016'), ('obs_no', 'YBE591')]

It has unfortunately matched “ybe 591” from the phrase “maybe 5910-1” as an obs_no.
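The false match above is typical of pattern matching without word boundaries. The sketch below is not the package's actual parser; it uses two hypothetical regular expressions (OBS_LOOSE and OBS_STRICT, both assumptions made for illustration) to show how a three-letter prefix pattern can pick up “ybe 591” from “maybe 5910-1”, and how requiring a word boundary on each side would reject it.

```python
import re

# Hypothetical patterns for illustration only; the real parser may differ.
OBS_LOOSE = re.compile(r"([A-Za-z]{3})\s*(\d{1,3})")
OBS_STRICT = re.compile(r"\b([A-Za-z]{3})\s*(\d{1,3})\b")


def find_obs_nos(text, pattern):
    """Return zero-padded obs-style identifiers matched by ``pattern``."""
    return [f"{prefix.upper()}{int(seq):03d}" for prefix, seq in pattern.findall(text)]


text = "SLE 15, SLE16, and also maybe 5910-1"
loose = find_obs_nos(text, OBS_LOOSE)    # includes the spurious "YBE591"
strict = find_obs_nos(text, OBS_STRICT)  # boundaries reject "ybe 591"
print(loose)
print(strict)
```

The stricter pattern trades recall for precision: it would also reject identifiers glued to other text, which may or may not be acceptable depending on how messy the input is.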

sa_gwdata.water_levels(wells, session=None, **kwargs)[source]

Get table of water level measurements for wells.

Parameters: wells (list) – list of drillhole numbers (ints) or sa_gwdata.Well objects

Returns: pandas DataFrame.

sa_gwdata.salinities(wells, session=None, **kwargs)[source]

Get table of salinity samples for wells.

Parameters: wells (list) – list of drillhole numbers (ints) or sa_gwdata.Well objects

Returns: pandas DataFrame.

sa_gwdata.drillers_logs(wells, session=None, **kwargs)[source]

Get table of lithological intervals from drillers logs for wells.

Parameters: wells (list) – list of drillhole numbers (ints) or sa_gwdata.Well objects

Returns: pandas DataFrame.

Well identifiers

class sa_gwdata.Well(*args, **kwargs)[source]

Represents a well.

Parameters:
  • dh_no (int) – drillhole number (required)
  • unit_no (str/int) – unit number (optional)
  • obs_no (str/int) – obs number (optional)

Other keyword arguments will be set as attributes.

id

obs number if it exists, e.g. “NOA002”; if not, the unit number, e.g. “6628-123”; and in the rare case that no unit number exists, the drillhole number, e.g. “200135”.

Type: str

title

available attributes including name, e.g. “7025-3985 / WRG038 / WESTERN LAGOON”.

Type: str

obs_no

obs number

Type: ObsNo

unit_no

unit number

Type: UnitNo

set(dh_no, unit_no='', obs_no='', **kwargs)[source]

See Well constructor for docstring.

set_obs_no(*args)[source]

Set obswell number.

Args are passed to ObsNo constructor.

set_unit_no(*args)[source]

Set unit number.

Args are passed to UnitNo constructor.

to_scalar_dict()[source]

Convert Well to a dictionary containing scalar values.

Returns: dict.

Guaranteed keys are “dh_no”, “id”, “title” and “name”.

The keys present in well.unit_no.to_scalar_dict() will be added with the prefix “unit_no.”. Same for obs_no.

Any additional attributes will also be present.
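The key-prefixing rule described above can be sketched with plain dictionaries. This is not the package's implementation: the helper flatten_well and the key names used inside the nested dictionaries (“hyphen”, “long_int”, “id”) are assumptions chosen for illustration, and the sample title is only assumed to follow the title format shown earlier.

```python
def flatten_well(base, unit_no=None, obs_no=None):
    """Merge nested identifier dicts into one flat dict using dotted prefixes."""
    flat = dict(base)
    for prefix, nested in (("unit_no", unit_no), ("obs_no", obs_no)):
        # A missing identifier simply contributes no keys.
        for key, value in (nested or {}).items():
            flat[f"{prefix}.{key}"] = value
    return flat


row = flatten_well(
    {"dh_no": 259424, "id": "ADE206", "title": "6628-25427 / ADE206 / DFW T2", "name": "DFW T2"},
    unit_no={"hyphen": "6628-25427", "long_int": 662825427},
    obs_no={"id": "ADE206"},
)
print(sorted(row))
```

A flat dictionary of scalars like this maps directly onto one row of a table, which is presumably why the method exists.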

path_safe_repr(remove_prefix=True)[source]

Return title containing only characters which are allowed in Windows path names.
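Two behaviours of Well described above, the fallback order for id and the path-safe title, can be sketched as follows. Both functions (pick_id and path_safe) are hypothetical stand-ins rather than the package's code; in particular, replacing forbidden characters with an underscore is an assumption, and the effect of the remove_prefix argument is not modelled.

```python
import re

# Characters that Windows does not allow in file or directory names.
WINDOWS_FORBIDDEN = r'[<>:"/\\|?*]'


def pick_id(dh_no, unit_no="", obs_no=""):
    """Prefer the obs number, then the unit number, then the drillhole number."""
    return obs_no or unit_no or str(dh_no)


def path_safe(title, replacement="_"):
    """Replace characters that are not allowed in Windows path names."""
    return re.sub(WINDOWS_FORBIDDEN, replacement, title)


print(pick_id(200135))
print(path_safe("7025-3985 / WRG038 / WESTERN LAGOON"))
```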

class sa_gwdata.UnitNo(*args)[source]

Parse a well unit number.

Parameters: *args (str or int) – either the complete unit number or the map sheet and drillhole sequence numbers

Example:

>>> u1 = UnitNo("6628-123")
>>> u2 = UnitNo("662800123")
>>> u3 = UnitNo(662800123)
>>> u4 = UnitNo("6628-00123")
>>> u5 = UnitNo(6628, 123)
>>> u6 = UnitNo("6628", "00123")
>>> u7 = UnitNo("G662800123")
>>> u1 == u2 == u3 == u4 == u5 == u6 == u7
True

map

10K map sheet

Type: int

seq

sequence number

Type: int

hyphen

hyphenated format e.g. “6628-123”

Type: str

long

zero-filled format e.g. “662800123”

Type: str

long_int

zero-filled format as integer e.g. 662800123 or None if missing

Type: int/None

wilma

WILMA style e.g. “6628-00123”

Type: str

hydstra

Hydstra style e.g. “G662800123”

Type: str

set(*args)[source]

See UnitNo constructor for details of arguments.
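The unit number styles listed for UnitNo all encode the same two integers, a four-digit map sheet and a sequence number. The sketch below shows one way that equivalence could work; it is not the class's implementation. The names split_unit_no and unit_no_formats are hypothetical, and the rule that unhyphenated input must carry a full five-digit sequence is an assumption (it happens to agree with the parsing example earlier, where '6628150' yields no match).

```python
import re

# Accepts "6628-123", "6628-00123", "662800123" and "G662800123".
# Unhyphenated input must carry a full five-digit sequence (an assumption).
UNIT_NO = re.compile(r"G?(\d{4})(?:-(\d{1,5})|(\d{5}))")


def split_unit_no(*args):
    """Return (map_sheet, sequence) as integers from one or two arguments."""
    if len(args) == 2:
        return int(args[0]), int(args[1])
    if len(args) != 1:
        raise ValueError("expected one or two arguments")
    match = UNIT_NO.fullmatch(str(args[0]).strip().upper())
    if match is None:
        raise ValueError(f"not a recognisable unit number: {args[0]!r}")
    return int(match.group(1)), int(match.group(2) or match.group(3))


def unit_no_formats(map_sheet, seq):
    """Render the styles listed above for one map sheet and sequence number."""
    return {
        "hyphen": f"{map_sheet}-{seq}",
        "long": f"{map_sheet}{seq:05d}",
        "long_int": int(f"{map_sheet}{seq:05d}"),
        "wilma": f"{map_sheet}-{seq:05d}",
        "hydstra": f"G{map_sheet}{seq:05d}",
    }


print(unit_no_formats(*split_unit_no("G662800123")))
```

Reducing every input to the same pair of integers is what makes comparisons such as u1 == u7 cheap: equality only needs to compare the pair, not the original spelling.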

class sa_gwdata.ObsNo(*args)[source]

Parse an observation well identifier.

Parameters: *args (str or int) – either one string in a format such as ‘YAT017’ or ‘YAT-17’, or two values (each int or str) giving the plan prefix (three letters referring to the hundred) and the sequence number, e.g. ‘YAT’, 17

Example:

>>> from sa_gwdata import ObsNo
>>> o1 = ObsNo("YAT017")
>>> o2 = ObsNo("YAT17")
>>> o3 = ObsNo("YAT 17")
>>> o4 = ObsNo("YAT", 17)
>>> o1 == o2 == o3 == o4
True

plan

hundred prefix

Type: str

seq

sequence number

Type: int

id

consistent zero-padded identifier e.g. “YAT017”

Type: str

egis

ENVGIS style e.g. “YAT 17”

Type: str

classmethod parse(*args, **kwargs)[source]

Parse an obs identifier, ignoring all parsing errors.

Arguments are the same as those for the class constructor, but all exceptions are ignored.

Returns: ObsNo.id if successful, a blank string if not.

set(*args)[source]

See ObsNo constructor for details of arguments.
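The ObsNo behaviour described above, strict construction plus a lenient parse that returns a blank string on failure, can be sketched like this. It is an illustration under stated assumptions, not the class itself: split_obs_no and obs_id are hypothetical names, the three-digit limit on the sequence number is assumed, and only ValueError is swallowed here whereas the documented parse() ignores all exceptions.

```python
import re

# Three letters, optional spaces or hyphens, then up to three digits (assumed limit).
OBS_NO = re.compile(r"([A-Za-z]{3})[\s-]*(\d{1,3})")


def split_obs_no(*args):
    """Return (plan, sequence) from one string or a (prefix, number) pair."""
    if len(args) not in (1, 2):
        raise ValueError("expected one or two arguments")
    # A (prefix, number) pair is joined and parsed like a single string.
    text = "".join(str(arg).strip() for arg in args)
    match = OBS_NO.fullmatch(text)
    if match is None:
        raise ValueError(f"not a recognisable obs number: {text!r}")
    return match.group(1).upper(), int(match.group(2))


def obs_id(*args):
    """Lenient variant: the zero-padded id, or a blank string on failure."""
    try:
        plan, seq = split_obs_no(*args)
    except ValueError:
        return ""
    return f"{plan}{seq:03d}"


print(obs_id("YAT 17"))
print(repr(obs_id("not a well")))
```

The lenient form is convenient when cleaning a column of free-text identifiers, where one bad value should not abort the whole pass.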

WaterConnect web service utilities

class sa_gwdata.WaterConnectSession(*args, endpoint=None, sleep=2, verify=True, load_list_data=True, **kwargs)[source]

Wrapper around repeated requests to Groundwater Data.

Parameters:
  • endpoint (str) – url endpoint for API, optional
  • sleep (int) – minimum interval between requests in seconds. Be nice, do not reduce it.
  • verify (bool) – require valid SSL certificate

Other args and kwargs are passed to the requests.Session constructor.
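The sleep parameter enforces a minimum gap between requests. How the session does this internally is not documented here, so the following is only a sketch of the general technique: a hypothetical Throttle class that remembers when the last request was made and sleeps for whatever remains of the interval. The injectable clock and sleep arguments are an assumption added so the behaviour can be checked without real waiting.

```python
import time


class Throttle:
    """Block until at least ``min_interval`` seconds separate consecutive calls."""

    def __init__(self, min_interval=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Demonstration with a fake clock so that nothing actually sleeps.
fake_now = [0.0]
slept = []


def fake_sleep(seconds):
    slept.append(seconds)
    fake_now[0] += seconds


throttle = Throttle(2.0, clock=lambda: fake_now[0], sleep=fake_sleep)
throttle.wait()      # first call never waits
fake_now[0] += 0.5
throttle.wait()      # only 0.5 s elapsed, so it sleeps for the remainder
print(slept)
```

Calling wait() immediately before each HTTP request would give the "be nice" behaviour the parameter description asks for.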

Usage:

>>> from sa_gwdata import WaterConnectSession
>>> with WaterConnectSession() as s:
...     df = s.get("GetObswellNetworkData", params={"Network": "CENT_ADEL"})

get(path, app='WDDDMS', verify=None, **kwargs)[source]

HTTP GET verb to Groundwater Data.

Parameters: path (str) – final portion of URL path off the end of self.endpoint, e.g. to GET https://www.waterconnect.sa.gov.au/_layouts/15/dfw.sharepoint.wdd/WDDDMS.ashx/GetAdvancedListsData you would use path="GetAdvancedListsData".
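The relationship between endpoint, app and path can be sketched as simple string assembly. The helper build_url is hypothetical, and the ENDPOINT value is an assumption inferred from the example URL in the parameter description rather than read from the package.

```python
# Assumed default endpoint, inferred from the example URL above.
ENDPOINT = "https://www.waterconnect.sa.gov.au/_layouts/15/dfw.sharepoint.wdd"


def build_url(path, app="WDDDMS", endpoint=ENDPOINT):
    """Join the endpoint, the application handler and the final path segment."""
    return f"{endpoint.rstrip('/')}/{app}.ashx/{path.lstrip('/')}"


print(build_url("GetAdvancedListsData"))
```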
post(path, app='WDDDMS', verify=None, **kwargs)[source]

Sends a POST request. Returns Response object.

Parameters:
  • url – URL for the new Request object.
  • data – (optional) Dictionary, list of tuples, bytes, or file-like object to send in the body of the Request.
  • json – (optional) json to send in the body of the Request.
  • **kwargs – Optional arguments that request takes.
Return type: requests.Response

find_wells(input_text, **kwargs)[source]

Find wells and retrieve some summary information.

Parameters: input_text (str) – any well identifiers to parse. See sa_gwdata.parse_well_ids_plaintext() for details of other keyword arguments you can pass here.

For example:

>>> from sa_gwdata import WaterConnectSession
>>> with WaterConnectSession() as s:
...     wells = s.find_wells("yat99 5840-46 ULE205")
...
>>> wells
['MLC008', 'ULE205', 'YAT099']

refresh_available_groupings()[source]

Load lists data from API. Stores them in the attributes aquifers, networks, nrm_regions, pwas, pwras.

Any calls to sa_gwdata.WaterConnectSession.get() return an sa_gwdata.Response object:

class sa_gwdata.Response[source]

r

Return the HTTP requests.Response object.

json

Convert the response to JSON. Returns a dict/list.

df

If the response is a list, convert to a pandas DataFrame with columns converted into lowercase.

df_exists

Check if JSON can be converted to a DataFrame. Returns bool.
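The df and df_exists attributes described above amount to a check on the decoded JSON followed by a column rename. The sketch below illustrates that idea with the standard library only; can_tabulate and lowercase_records are hypothetical helpers, the sample payload is made up, and treating "a list of objects" as the tabulation condition is an assumption (the description only says the response must be a list).

```python
import json


def can_tabulate(payload):
    """True when the decoded JSON is a list of objects, i.e. table-like."""
    return isinstance(payload, list) and all(isinstance(row, dict) for row in payload)


def lowercase_records(payload):
    """Return the rows with every key lowercased, ready for tabulation."""
    if not can_tabulate(payload):
        raise ValueError("payload is not a list of records")
    return [{str(key).lower(): value for key, value in row.items()} for row in payload]


# Made-up payload for illustration; real responses carry many more fields.
payload = json.loads('[{"DHNO": 259424, "Obs_No": "ADE206"}, {"DHNO": 259425, "Obs_No": "ADE207"}]')
records = lowercase_records(payload)
print(records)
```

Passing records to pandas.DataFrame(records) would then give a table whose columns are the lowercased keys.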