Classes
cassandraSession
Import:
labpack.databases.cassandra.cassandraSession
Description:
a class of methods for creating a session to a cassandra database
CQL Connector
https://datastax.github.io/python-driver/getting_started.html
https://flask-cqlalchemy.readthedocs.io/en/latest/
https://datastax.github.io/python-driver/cqlengine/third_party.html
Authentication
https://datastax.github.io/python-driver/api/cassandra/auth.html#
https://cassandra.apache.org/doc/latest/operating/security.html#enabling-password-authentication
__init__
Signature:
__init__(self, hostname, port=9042, username="", password="", cert_path="")
Description:
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
hostname | NoneType | Yes | None | |
port | int | | 9042 | |
username | str | | "" | |
password | str | | "" | |
cert_path | str | | "" | |
cassandraTable
Import:
labpack.databases.cassandra.cassandraTable
Description:
a class of methods for interacting with a table on cassandra
CQL Connector
https://datastax.github.io/python-driver/getting_started.html
https://cassandra.apache.org/doc/latest/cql/dml.html
NOTE: WIP
__init__
Signature:
__init__(self, keyspace_name, table_name, record_schema, cassandra_session, replication_strategy=None)
Description:
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
keyspace_name | NoneType | Yes | None | |
table_name | NoneType | Yes | None | |
record_schema | NoneType | Yes | None | |
cassandra_session | NoneType | Yes | None | |
replication_strategy | NoneType | | None | |
DatastoreTable
Import:
labpack.databases.google.datastore.DatastoreTable
Description:
a class to store json valid records as tabular style in google datastore
STORAGE:
https://cloud.google.com/datastore/docs/concepts/storage-size
each record size is dramatically impacted by the length of:
1. the name of the table
2. the id for each record
3. the name of each field
as well as the number of indexed fields. each index uses
storage equal to the combined size of each of the elements above
plus the value of the field, and each index entry is stored twice:
once for ascending and once for descending order.
by default, this class requires indices to be specified and
does not create null values for empty fields. if an empty
map is declared in the record schema, any data in that field
of a record is stringified and unindexable.
the most space efficient setup will not have any indices and
will use the value of the record id as the main query method
and/or will allow datastore to generate an id automatically
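The arithmetic described above can be sketched roughly as follows. The helper name `estimate_index_bytes` and the sample record are hypothetical, and the formula is a simplification of this description rather than Datastore's exact storage accounting.

```python
def estimate_index_bytes(table_name, record_id, record, indices):
    """Rough per-record index storage, following the description above."""
    total = 0
    for field in indices:
        value = str(record[field])
        # combined size of the table name, record id, field name and field value
        entry = len(table_name) + len(str(record_id)) + len(field) + len(value)
        # each index entry is stored twice: ascending and descending order
        total += 2 * entry
    return total


record = {"id": "abc123", "name": "ada", "city": "london"}

# a short table name with one indexed field
short = estimate_index_bytes("u", record["id"], record, ["name"])

# a longer table name with two indexed fields
longer = estimate_index_bytes("user_profiles", record["id"], record, ["name", "city"])

print(short, longer)
```

Under these assumptions, shortening names and dropping indices both reduce the estimate, which is the trade-off the paragraph above describes.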
LIMITS:
https://cloud.google.com/datastore/docs/concepts/limits
REFERENCES:
https://googleapis.dev/python/datastore/latest/index.html
https://cloud.google.com/datastore/docs/concepts/entities
__init__
Signature:
__init__(self, datastore_client, table_name, record_schema, indices=None, default_values=False, verbose=False)
Description:
the initialization method for the DatastoreTable class
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
datastore_client | object | Yes | None | datastore.Client object |
table_name | str | Yes | "" | string with name for table of records |
record_schema | dict | Yes | None | dictionary with jsonmodel valid schema for records |
indices | list | | None | list of strings with fields to index |
default_values | bool | | False | [optional] boolean to add default values to records |
verbose | bool | | False | [optional] boolean to enable database logging to stdout |
exists
Signature:
exists(self, record_id)
Description:
a method to determine if record exists
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
record_id | str | Yes | "" | string with id associated with record |
list
Signature:
list(self, filter=None, sort=None, limit=100, cursor="", ids_only=False)
Description:
a method to retrieve records using criteria evaluated on table indexes
NOTE: only fields which have been added to the indices argument at object
construction can be queried in-memory by Datastore. if an index
is added after records are in the database, records previously added
to the datastore are not automatically added to the index. to make
sure that all records are properly indexed, you must run
_update_indices
WARNING: although _update_indices is an optimized SCAN of Datastore,
it could be very costly
NOTE: composite indices allow for more complex queries in-memory
but must be registered with Datastore using gcloud client
and specified as index.yaml in app root and built before
https://cloud.google.com/datastore/docs/concepts/indexes
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
filter | dict | | None | dictionary of dot path field name and jsonmodel query criteria |
sort | list | | None | list of single key-pair dictionaries with dot path field names |
limit | int | | 100 | integer with number of results to return |
cursor | str | | "" | string base64 url safe encoded with location of last result |
ids_only | bool | | False | boolean to enable return of only ids (reduces 'read' use to 1) |
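The cursor argument above is described as a url safe base64 string. As a minimal sketch of what that encoding looks like, the following round-trips some raw cursor bytes using only the standard library. The helper names `encode_cursor` and `decode_cursor` and the sample bytes are hypothetical; how this class actually produces its cursor is not shown in this reference.

```python
import base64


def encode_cursor(raw_cursor: bytes) -> str:
    # url safe base64 uses '-' and '_' instead of '+' and '/'
    return base64.urlsafe_b64encode(raw_cursor).decode("ascii")


def decode_cursor(cursor: str) -> bytes:
    return base64.urlsafe_b64decode(cursor.encode("ascii"))


raw = b"\xfb\xff\x00page-2"   # stand-in for an opaque paging token
token = encode_cursor(raw)
restored = decode_cursor(token)
print(token, restored == raw)
```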
create
Signature:
create(self, record)
Description:
a method to create a new record in the table
NOTE: this class uses the id field as the primary key for all records
if record includes an id field that is an integer, float
or string, then it will be used as the primary key.
NOTE: if the id field is missing, a unique 24 character url safe
string will be created for the id field and included in the
record. if the id field == 0.0, then datastore will assign a
randomly generated 16 digit numerical id
NOTE: key length is a significant component of the size of storing a
record. the built-in datastore id takes up 8 bytes while an id
string takes up bytes = len(id) + 1. in addition, all keys have
an additional 16 byte overhead and the record id is reused for
each index twice, once for ascending and once for descending
order.
NOTE: record fields which do not exist in the record_schema or
whose values do not match the requirements of the record_schema
will throw an InputValidationError
NOTE: list fields are stringified using json before they are saved
to the datastore and are not possible to search using query
statements. it is recommended that lists be stored instead as
separate tables
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
record | dict | Yes | None | dictionary with record fields |
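The key size note above reduces to a small calculation. The sketch below assumes only the figures stated in that note (8 bytes for a built-in numeric id, len(id) + 1 bytes for a string id, 16 bytes of overhead); the function name `estimate_key_bytes` is hypothetical.

```python
def estimate_key_bytes(record_id):
    """Approximate key size from the figures in the note above."""
    overhead = 16
    if isinstance(record_id, str):
        # string ids cost their encoded length plus one byte
        id_bytes = len(record_id.encode("utf-8")) + 1
    else:
        # built-in numeric ids cost a fixed 8 bytes
        id_bytes = 8
    return overhead + id_bytes


numeric_key = estimate_key_bytes(1234567890123456)
string_key = estimate_key_bytes("a" * 24)
print(numeric_key, string_key)
```

With these figures, a 24 character string id costs noticeably more per key than a datastore-assigned numeric id, and that difference repeats for every index entry.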
read
Signature:
read(self, record_id)
Description:
a method to retrieve the details for a record in the table
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
record_id | str | Yes | "" | string or number with unique identifier of record |
update
Signature:
update(self, record)
Description:
a method to update an existing record
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
record | dict | Yes | None | dictionary with record fields |
delete
Signature:
delete(self, record_id)
Description:
a method to delete an existing record
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
record_id | str | Yes | "" | string or number with unique identifier of record |
remove
Signature:
remove(self)
Description:
a method to remove all records in table
export
Signature:
export(self, datastore_table, merge_rule="skip", coerce=False)
Description:
TODO a method to export all the records to another datastore table
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
datastore_table | NoneType | Yes | None | |
merge_rule | str | | "skip" | |
coerce | bool | | False | |
SQLSession
Import:
labpack.databases.sql.SQLSession
Description:
the initialization method for the SQLSession class
__init__
Signature:
__init__(self, database_url, verbose=False)
Description:
the initialization method for the SQLSession class
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
database_url | str | Yes | "" | string with unique resource identifier to database |
verbose | bool | | False | [optional] boolean to enable database logging to stdout |
SQLTable
Import:
labpack.databases.sql.SQLTable
Description:
a class to store json valid records in a sql database
REFERENCES:
https://docs.sqlalchemy.org/en/13/core/tutorial.html
__init__
Signature:
__init__(self, sql_session, table_name, record_schema, rebuild=True, default_values=False, verbose=False)
Description:
the initialization method for the SQLTable class
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
sql_session | object | Yes | None | sql.SQLSession object |
table_name | str | Yes | "" | string with name for table of records |
record_schema | dict | Yes | None | dictionary with jsonmodel valid schema for records |
rebuild | bool | | True | [optional] boolean to rebuild table with schema changes |
default_values | bool | | False | [optional] boolean to add default values to records |
verbose | bool | | False | [optional] boolean to enable database logging to stdout |
exists
Signature:
exists(self, record_id)
Description:
a method to determine if record exists
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
record_id | str | Yes | "" | string or number with unique identifier of record |
list
Signature:
list(self, filter=None, sort=None, limit=100, cursor="", ids_only=False)
Description:
a method to retrieve records from table which match query criteria
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
filter | dict | | None | dictionary of dot path field name and jsonmodel query criteria |
sort | list | | None | list of single key-pair dictionaries with dot path field names |
limit | int | | 100 | integer with number of results to return |
cursor | str | | "" | string form of integer with offset to continue query |
ids_only | bool | | False | boolean to enable return of only ids (reduces 'read' use to 1) |
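The cursor here is described as the string form of an integer offset. The following sketch shows how offset-based paging of that kind can work, using sqlite3 from the standard library. The table layout, the `list_page` helper and its return shape are assumptions made for illustration, not the SQLTable implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO records (id, name) VALUES (?, ?)",
    [(i, "record-%d" % i) for i in range(1, 8)],
)


def list_page(limit=3, cursor=""):
    # an empty cursor means start from the first row
    offset = int(cursor) if cursor else 0
    rows = conn.execute(
        "SELECT id FROM records ORDER BY id LIMIT ? OFFSET ?", (limit, offset)
    ).fetchall()
    ids = [row[0] for row in rows]
    # hand back the next offset only when a full page was returned
    next_cursor = str(offset + len(ids)) if len(ids) == limit else ""
    return ids, next_cursor


first_ids, first_cursor = list_page(limit=3)
second_ids, second_cursor = list_page(limit=3, cursor=first_cursor)
third_ids, third_cursor = list_page(limit=3, cursor=second_cursor)
print(first_ids, second_ids, third_ids)
```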
create
Signature:
create(self, record)
Description:
a method to create a new record in the table
NOTE: this class uses the id key as the primary key for all records
if record includes an id field that is an integer, float
or string, then it will be used as the primary key. if the id
field is missing, a random 64 bit integer (if a number) or a
unique 24 character url safe string (if a string) will be
created for the id field and included in the record
NOTE: record fields which do not exist in the record_schema or whose
values do not match the requirements of the record_schema
will throw an InputValidationError
NOTE: list fields are pickled before they are saved to disk and
are not possible to search using sql query statements. it is
recommended that lists be stored instead as separate tables
NOTE: if a map field is declared as empty in the record_schema, then
all record fields inside it will be pickled before the
record is saved to disk and are not possible to search
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
record | dict | Yes | None | dictionary with record fields |
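To make the default id behaviour in the first note concrete, here is a minimal sketch of generating either a random 64 bit integer or a 24 character url safe string with the standard library. The `default_id` helper is hypothetical and the class may generate its ids differently.

```python
import secrets


def default_id(as_number=False):
    if as_number:
        # random integer in the range 0 .. 2**64 - 1
        return secrets.randbits(64)
    # 18 random bytes encode to exactly 24 url safe base64 characters
    return secrets.token_urlsafe(18)


string_id = default_id()
number_id = default_id(as_number=True)
print(string_id, number_id)
```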
read
Signature:
read(self, record_id)
Description:
a method to retrieve the details for a record in the table
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
record_id | str | Yes | "" | string or number with unique identifier of record |
update
Signature:
update(self, updated, original=None)
Description:
a method to update changes to a record in the table
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
updated | dict | Yes | None | dictionary with updated record fields |
original | dict | | None | [optional] dictionary with original record fields |
delete
Signature:
delete(self, record_id)
Description:
a method to delete a record in the table
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
record_id | str | Yes | "" | string or number with unique identifier of record |
remove
Signature:
remove(self)
Description:
a method to remove the entire table
:return string with status message
export
Signature:
export(self, sql_table, merge_rule="skip", coerce=False)
Description:
a method to export all the records in table to another sql table
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
sql_table | type | Yes | None | class object with sql table methods |
merge_rule | str | | "skip" | string with name of rule to adopt for pre-existing records |
coerce | bool | | False | boolean to enable migration even if table schemas don't match |
labMagic
Import:
labpack.parsing.magic.labMagic
Description:
initialization method for labMagic class
__init__
Signature:
__init__(self, magic_file="")
Description:
initialization method for labMagic class
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
magic_file | str | | "" | [optional] string with local path to magic.mgc file |
analyze
Signature:
analyze(self, file_path="", file_url="", byte_data=None)
Description:
a method to determine the mimetype and extension of a file from its byte data
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
file_path | str | | "" | [optional] string with local path to file |
file_url | str | | "" | [optional] string with url of file |
byte_data | NoneType | | None | [optional] byte data from a file |
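Detecting a file type from its byte data generally means comparing the leading bytes against known signatures. The sketch below is a tiny hand-rolled illustration of that idea with a handful of well-known signatures. It does not use the magic.mgc database, and the `sniff` helper and its return shape are hypothetical rather than what `analyze` returns.

```python
# (leading bytes, mimetype, extension) for a few common formats
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png", ".png"),
    (b"\xff\xd8\xff", "image/jpeg", ".jpg"),
    (b"%PDF-", "application/pdf", ".pdf"),
    (b"GIF87a", "image/gif", ".gif"),
    (b"GIF89a", "image/gif", ".gif"),
]


def sniff(byte_data):
    for prefix, mimetype, extension in SIGNATURES:
        if byte_data.startswith(prefix):
            return {"mimetype": mimetype, "extension": extension}
    # fall back to a generic binary type when nothing matches
    return {"mimetype": "application/octet-stream", "extension": ""}


png_result = sniff(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16)
pdf_result = sniff(b"%PDF-1.7\n")
unknown_result = sniff(b"plain bytes")
print(png_result, pdf_result, unknown_result)
```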
labRegex
Import:
labpack.parsing.regex.labRegex
Description:
instantiates class with a regular expression dictionary
__init__
Signature:
__init__(self, regex_schema, override=False)
Description:
instantiates class with a regular expression dictionary
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
regex_schema | dict | Yes | None | dictionary with regular expression name, pattern key-pairs |
override | bool | | False | boolean to ignore value errors raised from regex name conflicts |
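A regex_schema is described above as a dictionary of regular expression name and pattern key-pairs. As a rough sketch of how such a dictionary can be used, the following compiles each pattern and reports which names match a string. The patterns and the `match_names` helper are hypothetical examples and do not reproduce the behaviour of this class.

```python
import re

# hypothetical schema: name -> pattern
regex_schema = {
    "email": r"[\w.+-]+@[\w-]+\.[\w.]+",
    "integer": r"^\d+$",
}

# compile once so each pattern can be reused
compiled = {name: re.compile(pattern) for name, pattern in regex_schema.items()}


def match_names(string_input):
    return [name for name, pattern in compiled.items() if pattern.search(string_input)]


print(match_names("42"), match_names("a@b.co"), match_names("hello"))
```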
map
Signature:
map(self, string_input, n_grams=1)
Description:
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
string_input | NoneType | Yes | None | |
n_grams | int | | 1 | |
labID
Import:
labpack.records.id.labID
Description:
a class of methods for uniquely identifying objects
built-in attributes:
self.uuid: uuid1 uuid object
self.id12: 12 character base 64 url safe string of posix time
self.id24: 24 character base 64 url safe string of md5 hash of uuid1
self.id36: 36 character base 64 url safe string of sha1 hash of uuid1
self.id48: 48 character base 64 url safe string of sha256 hash of uuid1
self.mac: string of mac address of device
self.epoch: current posix epoch timestamp with microsecond resolution
self.iso: current iso utc datetime string
self.datetime: current python datetime
__init__
Signature:
__init__(self)
Description:
a method to initialize a unique ID based upon the UUID1 method
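Several of the values listed for this class have close standard library counterparts. The sketch below builds a uuid1, an epoch timestamp, an ISO UTC string and a 24 character base64 string from an md5 hash of the uuid. It is an illustration under the assumption that plain url safe base64 is used (which pads a 16 byte md5 digest with "=="); the class itself may derive its id strings differently.

```python
import base64
import hashlib
import uuid
from datetime import datetime, timezone

uid = uuid.uuid1()                      # time and node based uuid
now = datetime.now(timezone.utc)
epoch = now.timestamp()                 # float with microsecond resolution
iso = now.isoformat()                   # ISO string with +00:00 offset

# md5 digests are 16 bytes, which base64 encodes to 24 characters
digest = hashlib.md5(uid.bytes).digest()
id24 = base64.urlsafe_b64encode(digest).decode("ascii")

print(uid, epoch, iso, id24)
```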
labDT
Import:
labpack.records.time.labDT
Description:
a class of methods for datetime conversion
for list of timezones:
https://stackoverflow.com/questions/13866926/python-pytz-list-of-timezones
for list of datetime directives:
https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior
new
Signature:
new(cls)
Description:
a method to generate the current datetime as a labDT object
zulu
Signature:
zulu(self)
Description:
a method to report ISO UTC datetime string from a labDT object
NOTE: for timezone offset string use .isoformat() instead
epoch
Signature:
epoch(self)
Description:
a method to report posix epoch timestamp from a labDT object
rfc2822
Signature:
rfc2822(self)
Description:
a method to report a RFC-2822 Compliant Date from a labDT object
https://tools.ietf.org/html/rfc2822.html#page-14
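For orientation, the three report formats described above (ISO UTC string, posix epoch timestamp, RFC 2822 date) can be produced from a timezone aware python datetime with the standard library alone. This sketch does not call labDT, and the exact string layout that `zulu` returns is an assumption.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

dt = datetime(2015, 9, 26, 16, 30, 45, 123456, tzinfo=timezone.utc)

# ISO UTC string with millisecond precision and a trailing Z
zulu = dt.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"

# seconds since 1970-01-01T00:00:00 UTC as a float
epoch = dt.timestamp()

# RFC 2822 date string as used in email headers
rfc2822 = format_datetime(dt)

print(zulu)
print(epoch)
print(rfc2822)
```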
pyLocal
Signature:
pyLocal(self, time_zone="")
Description:
a method to report a python datetime from a labDT object
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
time_zone | str | | "" | [optional] string with timezone to report in |
jsLocal
Signature:
jsLocal(self, time_zone="")
Description:
a method to report a javascript string from a labDT object
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
time_zone | str | | "" | [optional] string with timezone to report in |
humanFriendly
Signature:
humanFriendly(self, time_zone="", include_day=True, include_time=True)
Description:
a method to report a human friendly string from a labDT object
Argument | Type | Required | Default | Description |
---|---|---|---|---|
self | object | Yes | None | |
time_zone | str | | "" | [optional] string with timezone to report in |
include_day | bool | | True | |
include_time | bool | | True | |
fromEpoch
Signature:
fromEpoch(cls, epoch_time)
Description:
a method for constructing a labDT object from epoch timestamp
Argument | Type | Required | Default | Description |
---|---|---|---|---|
cls | NoneType | Yes | None | |
epoch_time | float | Yes | 0.0 | number with epoch timestamp info |
fromISO
Signature:
fromISO(cls, iso_string)
Description:
a method for constructing a labDT object from a timezone aware ISO string
Argument | Type | Required | Default | Description |
---|---|---|---|---|
cls | NoneType | Yes | None | |
iso_string | str | Yes | "" | string with date and time info in ISO format |
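The inputs that fromEpoch and fromISO expect can be illustrated with plain python datetimes. This is a sketch with the standard library only, and the `require_timezone` helper is hypothetical; it simply mirrors the stated requirement that the ISO string be timezone aware.

```python
from datetime import datetime, timezone

# an epoch timestamp is a number of seconds since 1970-01-01 UTC
from_epoch = datetime.fromtimestamp(1443285045.5, tz=timezone.utc)


def require_timezone(iso_string):
    dt = datetime.fromisoformat(iso_string)
    if dt.tzinfo is None:
        # a naive string carries no offset, so the instant is ambiguous
        raise ValueError("ISO string must include a timezone offset")
    return dt


from_iso = require_timezone("2015-09-26T16:30:45.500000+00:00")
print(from_epoch == from_iso)
```

Both values describe the same instant, so either form can serve as the starting point for a conversion.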
fromPython
Signature:
fromPython(cls, python_datetime)
Description:
a method for constructing a labDT from a python datetime with timezone info
Argument | Type | Required | Default | Description |
---|---|---|---|---|
cls | NoneType | Yes | None | |
python_datetime | object | Yes | None | datetime object with timezone info |
fromJavascript
Signature:
fromJavascript(cls, javascript_datetime)
Description:
a method to construct labDT from a javascript datetime string
Argument | Type | Required | Default | Description |
---|---|---|---|---|
cls | NoneType | Yes | None | |
javascript_datetime | str | Yes | "" | string with datetime info in javascript formatting |
fromPattern
Signature:
fromPattern(cls, datetime_string, datetime_pattern, time_zone, require_hour=True)
Description:
a method for constructing labDT from a strptime pattern in a string
https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior
iso_pattern: '%Y-%m-%dT%H:%M:%S.%f%z'
human_friendly_pattern: '%A, %B %d, %Y %I:%M:%S.%f%p'
Argument | Type | Required | Default | Description |
---|---|---|---|---|
cls | NoneType | Yes | None | |
datetime_string | str | Yes | "" | string with date and time info |
datetime_pattern | str | Yes | "" | string with python formatted pattern |
time_zone | str | Yes | "" | string with timezone info |
require_hour | bool | | True | [optional] boolean to disable hour requirement |
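The two patterns listed above are ordinary strptime patterns, so their behaviour can be checked with the standard library. In this sketch the sample strings are made up, and attaching UTC by hand stands in for the time_zone argument; it does not call fromPattern.

```python
from datetime import datetime, timezone

iso_pattern = "%Y-%m-%dT%H:%M:%S.%f%z"
human_friendly_pattern = "%A, %B %d, %Y %I:%M:%S.%f%p"

# the iso pattern carries its own offset, so the result is timezone aware
aware = datetime.strptime("2015-09-26T16:30:45.123456+0000", iso_pattern)

# the human friendly pattern has no offset, so the result is naive
# (day and month names are parsed using the current locale, English by default)
naive = datetime.strptime(
    "Saturday, September 26, 2015 04:30:45.123456PM", human_friendly_pattern
)

# a timezone has to be supplied separately for patterns without an offset
localized = naive.replace(tzinfo=timezone.utc)

print(aware == localized)
```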