Classes

cassandraSession

Import:

labpack.databases.cassandra.cassandraSession

Description:

a class of methods for creating a session to a cassandra database

CQL Connector
https://datastax.github.io/python-driver/getting_started.html
https://flask-cqlalchemy.readthedocs.io/en/latest/
https://datastax.github.io/python-driver/cqlengine/third_party.html

Authentication
https://datastax.github.io/python-driver/api/cassandra/auth.html#
https://cassandra.apache.org/doc/latest/operating/security.html#enabling-password-authentication

__init__

Signature:
__init__(self, hostname, port=9042, username="", password="", cert_path="")

Description:

Argument   Type      Required  Default  Description
self       object    Yes       None
hostname   NoneType  Yes       None
port       int                 9042
username   str                 ""
password   str                 ""
cert_path  str                 ""

cassandraTable

Import:

labpack.databases.cassandra.cassandraTable

Description:

a class of methods for interacting with a table on cassandra

CQL Connector
https://datastax.github.io/python-driver/getting_started.html
https://cassandra.apache.org/doc/latest/cql/dml.html

NOTE:   WIP

__init__

Signature:
__init__(self, keyspace_name, table_name, record_schema, cassandra_session, replication_strategy=None)

Description:

Argument              Type      Required  Default  Description
self                  object    Yes       None
keyspace_name         NoneType  Yes       None
table_name            NoneType  Yes       None
record_schema         NoneType  Yes       None
cassandra_session     NoneType  Yes       None
replication_strategy  NoneType            None

DatastoreTable

Import:

labpack.databases.google.datastore.DatastoreTable

Description:

a class to store json valid records in a tabular style in google datastore

    STORAGE:
    https://cloud.google.com/datastore/docs/concepts/storage-size

    each record size is dramatically impacted by the length of:

        1. the name of the table
        2. the id for each record
        3. the name of each field

    as well as the number of indexed fields. each index uses
    storage equal to the combined size of the elements above
    plus the value of the field, and that storage is counted
    twice: once for ascending and once for descending order.

    by default, this class requires indices to be specified and 
    does not create null values for empty fields. if an empty
    map is declared in the record schema, any data in that field
    of a record is stringified and unindexable.

    the most space efficient setup will not have any indices and
    will use the value of the record id as the main query method
    and/or will allow datastore to generate an id automatically

    LIMITS:
    https://cloud.google.com/datastore/docs/concepts/limits

    REFERENCES:
    https://googleapis.dev/python/datastore/latest/index.html 
    https://cloud.google.com/datastore/docs/concepts/entities

__init__

Signature:
__init__(self, datastore_client, table_name, record_schema, indices=None, default_values=False, verbose=False)

Description:
the initialization method for the DatastoreTable class

Argument          Type    Required  Default  Description
self              object  Yes       None
datastore_client  object  Yes       None     datastore.Client object
table_name        str     Yes       ""       string with name for table of records
record_schema     dict    Yes       None     dictionary with jsonmodel valid schema for records
indices           list              None     list of strings with fields to index
default_values    bool              False    [optional] boolean to add default values to records
verbose           bool              False    [optional] boolean to enable database logging to stdout

exists

Signature:
exists(self, record_id)

Description:
a method to determine if record exists

Argument   Type    Required  Default  Description
self       object  Yes       None
record_id  str     Yes       ""       string with id associated with record

list

Signature:
list(self, filter=None, sort=None, limit=100, cursor="", ids_only=False)

Description:
a method to retrieve records using criteria evaluated on table indexes

    NOTE:   only fields which were added to the indices argument at object
            construction can be queried in-memory by Datastore. if an index
            is added after records are in the database, records previously
            added to the datastore are not automatically added to the index.
            to make sure that all records are properly indexed, you must run
            _update_indices.
            WARNING: although _update_indices is an optimized SCAN of Datastore,
            it could be very costly

    NOTE:   composite indices allow for more complex queries in-memory
            but must be registered with Datastore using gcloud client
            and specified as index.yaml in app root and built before 
            https://cloud.google.com/datastore/docs/concepts/indexes

Argument  Type    Required  Default  Description
self      object  Yes       None
filter    dict              None     dictionary of dot path field name and jsonmodel query criteria
sort      list              None     list of single key-pair dictionaries with dot path field names
limit     int               100      integer with number of results to return
cursor    str               ""       string base64 url safe encoded with location of last result
ids_only  bool              False    boolean to enable return of only ids (reduces 'read' use to 1)

create

Signature:
create(self, record)

Description:
a method to create a new record in the table

        NOTE:   this class uses the id field as the primary key for all records.
                if record includes an id field that is an integer, float
                or string, then it will be used as the primary key.

        NOTE:   if the id field is missing, a unique 24 character url safe
                string will be created for the id field and included in the
                record. if the id field == 0.0, then datastore will assign a
                randomly generated 16 digit numerical id

        NOTE:   key length is a significant component of the size of storing a
                record. the built-in datastore id takes up 8 bytes while an id
                string takes up len(id) + 1 bytes. in addition, all keys have
                an additional 16 byte overhead and the record id is stored
                twice for each index, once for ascending and once for
                descending order.

        NOTE:   record fields which do not exist in the record_schema or
                whose values do not match the requirements of the record_schema
                will throw an InputValidationError

        NOTE:   list fields are stringified using json before they are saved 
                to the datastore and are not possible to search using query
                statements. it is recommended that lists be stored instead as
                separate tables

Argument  Type    Required  Default  Description
self      object  Yes       None
record    dict    Yes       None     dictionary with record fields
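
The key-size figures in the notes above can be turned into a quick estimate. This is only a sketch of that arithmetic (8 bytes for a built-in numeric id, len(id) + 1 bytes for a string id, plus 16 bytes of overhead). The function name estimate_key_bytes is hypothetical and not part of labpack, and actual Datastore storage also depends on the table name, field names and indices.

```python
def estimate_key_bytes(record_id):
    """Estimate the bytes used by a record key, using the figures in the note."""
    overhead = 16  # fixed per-key overhead stated in the note
    if isinstance(record_id, str):
        # string ids: one byte per character plus one
        return len(record_id) + 1 + overhead
    # built-in numeric ids: 8 bytes
    return 8 + overhead


numeric_key = estimate_key_bytes(5629499534213120)  # 8 + 16 = 24
string_key = estimate_key_bytes("a" * 24)           # 24 + 1 + 16 = 41
print(numeric_key, string_key)
```

Under these assumptions a 24 character string id costs 17 more bytes per key than a built-in numeric id, before any index copies are counted.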

read

Signature:
read(self, record_id)

Description:
a method to retrieve the details for a record in the table

Argument   Type    Required  Default  Description
self       object  Yes       None
record_id  str     Yes       ""       string or number with unique identifier of record

update

Signature:
update(self, record)

Description:
a method to update an existing record

Argument  Type    Required  Default  Description
self      object  Yes       None
record    dict    Yes       None     dictionary with record fields

delete

Signature:
delete(self, record_id)

Description:
a method to delete an existing record

Argument   Type    Required  Default  Description
self       object  Yes       None
record_id  str     Yes       ""       string or number with unique identifier of record

remove

Signature:
remove(self)

Description:
a method to remove all records in table

export

Signature:
export(self, datastore_table, merge_rule="skip", coerce=False)

Description:
TODO a method to export all the records to another datastore table

Argument         Type      Required  Default  Description
self             object    Yes       None
datastore_table  NoneType  Yes       None
merge_rule       str                 "skip"
coerce           bool                False

SQLSession

Import:

labpack.databases.sql.SQLSession

Description:

the initialization method for the SQLSession class

__init__

Signature:
__init__(self, database_url, verbose=False)

Description:
the initialization method for the SQLSession class

Argument      Type    Required  Default  Description
self          object  Yes       None
database_url  str     Yes       ""       string with unique resource identifier to database
verbose       bool              False    [optional] boolean to enable database logging to stdout

SQLTable

Import:

labpack.databases.sql.SQLTable

Description:

a class to store json valid records in a sql database

REFERENCES:
https://docs.sqlalchemy.org/en/13/core/tutorial.html

__init__

Signature:
__init__(self, sql_session, table_name, record_schema, rebuild=True, default_values=False, verbose=False)

Description:
the initialization method for the SQLTable class

Argument        Type    Required  Default  Description
self            object  Yes       None
sql_session     object  Yes       None     sql.SQLSession object
table_name      str     Yes       ""       string with name for table of records
record_schema   dict    Yes       None     dictionary with jsonmodel valid schema for records
rebuild         bool              True     [optional] boolean to rebuild table with schema changes
default_values  bool              False    [optional] boolean to add default values to records
verbose         bool              False    [optional] boolean to enable database logging to stdout

exists

Signature:
exists(self, record_id)

Description:
a method to determine if record exists

Argument   Type    Required  Default  Description
self       object  Yes       None
record_id  str     Yes       ""       string or number with unique identifier of record

list

Signature:
list(self, filter=None, sort=None, limit=100, cursor="", ids_only=False)

Description:
a method to retrieve records from table which match query criteria

Argument  Type    Required  Default  Description
self      object  Yes       None
filter    dict              None     dictionary of dot path field name and jsonmodel query criteria
sort      list              None     list of single key-pair dictionaries with dot path field names
limit     int               100      integer with number of results to return
cursor    str               ""       string form of integer with offset to continue query
ids_only  bool              False    boolean to enable return of only ids (reduces 'read' use to 1)
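
The cursor argument is described above as the string form of an integer offset. A minimal sketch of that pagination style over an in-memory list follows. It uses only the standard library; paginate and the sample records are hypothetical, and the actual return format of list is not documented here.

```python
def paginate(records, limit=100, cursor=""):
    """Return one page of records and the cursor string for the next page."""
    offset = int(cursor) if cursor else 0
    page = records[offset:offset + limit]
    next_offset = offset + len(page)
    # an empty cursor signals that no results remain
    next_cursor = str(next_offset) if next_offset < len(records) else ""
    return page, next_cursor


records = [{"id": n} for n in range(5)]

# walk through every page by feeding each cursor back into the next call
collected, cursor = paginate(records, limit=2)
while cursor:
    page, cursor = paginate(records, limit=2, cursor=cursor)
    collected.extend(page)
```

Passing the returned cursor back unchanged until it comes back empty visits every record exactly once, as long as the underlying data does not change between calls.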

create

Signature:
create(self, record)

Description:
a method to create a new record in the table

        NOTE:   this class uses the id key as the primary key for all records.
                if record includes an id field that is an integer, float
                or string, then it will be used as the primary key. if the id
                field is missing, a random 64 bit integer (if a number) or a
                unique 24 character url safe string (if a string) will be
                created for the id field and included in the record

        NOTE:   record fields which do not exist in the record_schema or whose
                values do not match the requirements of the record_schema
                will throw an InputValidationError

        NOTE:   list fields are pickled before they are saved to disk and
                are not possible to search using sql query statements. it is
                recommended that lists be stored instead as separate tables

        NOTE:   if a map field is declared as empty in the record_schema, then
                all record fields inside it will be pickled before the
                record is saved to disk and are not possible to search

Argument  Type    Required  Default  Description
self      object  Yes       None
record    dict    Yes       None     dictionary with record fields
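
The note about pickled lists can be illustrated with the standard library alone. The sketch below does not use SQLTable. It assumes a hypothetical records table in an in-memory sqlite3 database, stores a list as a pickled blob, and then shows the recommended alternative of a separate table (here called record_tags, also hypothetical) that can be searched with plain SQL.

```python
import pickle
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id TEXT PRIMARY KEY, tags BLOB)")

tags = ["red", "green"]
conn.execute("INSERT INTO records VALUES (?, ?)", ("abc", pickle.dumps(tags)))

# the column holds opaque bytes; the list has to be unpickled in Python
# before its items can be inspected
blob = conn.execute("SELECT tags FROM records WHERE id = ?", ("abc",)).fetchone()[0]
restored = pickle.loads(blob)

# alternative: one row per list item in a separate table, searchable in SQL
conn.execute("CREATE TABLE record_tags (record_id TEXT, tag TEXT)")
conn.executemany(
    "INSERT INTO record_tags VALUES (?, ?)", [("abc", tag) for tag in tags]
)
matches = conn.execute(
    "SELECT record_id FROM record_tags WHERE tag = ?", ("green",)
).fetchall()
```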

read

Signature:
read(self, record_id)

Description:
a method to retrieve the details for a record in the table

Argument   Type    Required  Default  Description
self       object  Yes       None
record_id  str     Yes       ""       string or number with unique identifier of record

update

Signature:
update(self, updated, original=None)

Description:
a method to update changes to a record in the table

Argument  Type    Required  Default  Description
self      object  Yes       None
updated   dict    Yes       None     dictionary with updated record fields
original  dict              None     [optional] dictionary with original record fields

delete

Signature:
delete(self, record_id)

Description:
a method to delete a record in the table

Argument   Type    Required  Default  Description
self       object  Yes       None
record_id  str     Yes       ""       string or number with unique identifier of record

remove

Signature:
remove(self)

Description:
a method to remove the entire table

    :return string with status message

export

Signature:
export(self, sql_table, merge_rule="skip", coerce=False)

Description:
a method to export all the records in table to another sql table

Argument    Type    Required  Default  Description
self        object  Yes       None
sql_table   type    Yes       None     class object with sql table methods
merge_rule  str               "skip"   string with name of rule to adopt for pre-existing records
coerce      bool              False    boolean to enable migration even if table schemas don't match

labMagic

Import:

labpack.parsing.magic.labMagic

Description:

initialization method for labMagic class

__init__

Signature:
__init__(self, magic_file="")

Description:
initialization method for labMagic class

Argument    Type    Required  Default  Description
self        object  Yes       None
magic_file  str               ""       [optional] string with local path to magic.mgc file

analyze

Signature:
analyze(self, file_path="", file_url="", byte_data=None)

Description:
a method to determine the mimetype and extension of a file from its byte data

Argument   Type      Required  Default  Description
self       object    Yes       None
file_path  str                 ""       [optional] string with local path to file
file_url   str                 ""       [optional] string with url of file
byte_data  NoneType            None     [optional] byte data from a file
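
Determining a mimetype from byte data generally relies on the signature bytes at the start of a file. The sketch below is a simplified stand-in for that idea and does not use labMagic or a magic.mgc file. The sniff function and its three-entry signature table are hypothetical and cover only a few well-known formats.

```python
# (signature prefix, mimetype, extension) for a few well-known formats
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png", ".png"),
    (b"%PDF-", "application/pdf", ".pdf"),
    (b"\xff\xd8\xff", "image/jpeg", ".jpg"),
]


def sniff(byte_data):
    """Guess (mimetype, extension) from the leading bytes of a file."""
    for prefix, mimetype, extension in SIGNATURES:
        if byte_data.startswith(prefix):
            return mimetype, extension
    # fall back to a generic binary type when no signature matches
    return "application/octet-stream", ""


pdf_guess = sniff(b"%PDF-1.7\n")
unknown_guess = sniff(b"hello world")
```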

labRegex

Import:

labpack.parsing.regex.labRegex

Description:

instantiates class with a regular expression dictionary

__init__

Signature:
__init__(self, regex_schema, override=False)

Description:
instantiates class with a regular expression dictionary

Argument      Type    Required  Default  Description
self          object  Yes       None
regex_schema  dict    Yes       None     dictionary with regular expression name, pattern key-pairs
override      bool              False    boolean to ignore value errors raised from regex name conflicts
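
A regex_schema is described above as a dictionary of name, pattern key-pairs. The sketch below shows one plausible shape for such a dictionary and how named patterns can be applied with the standard re module. It does not use labRegex; the pattern names and the match_names helper are hypothetical, and the behaviour of the class itself may differ.

```python
import re

# name -> pattern pairs, in the shape the regex_schema argument describes
regex_schema = {
    "integer": r"^\d+$",
    "word": r"^[A-Za-z]+$",
}

# compile each pattern once so it can be reused across inputs
compiled = {name: re.compile(pattern) for name, pattern in regex_schema.items()}


def match_names(token):
    """Return the names of all patterns that match the token."""
    return [name for name, regex in compiled.items() if regex.search(token)]


integer_hit = match_names("42")
word_hit = match_names("abc")
no_hit = match_names("a1")
```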

map

Signature:
map(self, string_input, n_grams=1)

Description:

Argument      Type      Required  Default  Description
self          object    Yes       None
string_input  NoneType  Yes       None
n_grams       int                 1

labID

Import:

labpack.records.id.labID

Description:

a class of methods for uniquely identifying objects

    built-in methods:
        self.uuid: uuid1 uuid object
        self.id12: 12 character base 64 url safe string of posix time
        self.id24: 24 character base 64 url safe string of md5 hash of uuid1
        self.id36: 36 character base 64 url safe string of sha1 hash of uuid1
        self.id48: 48 character base 64 url safe string of sha256 hash of uuid1
        self.mac: string of mac address of device
        self.epoch: current posix epoch timestamp with microsecond resolution
        self.iso: current iso utc datetime string
        self.datetime: current python datetime
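
A 24 character base 64 url safe string corresponds to exactly 18 bytes, while an md5 digest is 16 bytes. The sketch below shows one way such an id could be derived from a uuid1 with the standard library. The two random pad bytes are an assumption made here so the encoded string needs no padding characters; labID's actual construction may differ, and make_id24 is a hypothetical name.

```python
import base64
import hashlib
import os
import uuid


def make_id24():
    """Build a 24 character url-safe id from the md5 hash of a uuid1."""
    digest = hashlib.md5(uuid.uuid1().bytes).digest()  # 16 bytes
    padded = os.urandom(2) + digest                    # 18 bytes (assumed padding)
    # 18 bytes encode to exactly 24 base64 characters with no '=' padding
    return base64.urlsafe_b64encode(padded).decode("ascii")


sample_id = make_id24()
```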

__init__

Signature:
__init__(self)

Description:
a method to initialize a unique ID based upon the UUID1 method

labDT

Import:

labpack.records.time.labDT

Description:

a class of methods for datetime conversion

for list of timezones:
    https://stackoverflow.com/questions/13866926/python-pytz-list-of-timezones
for list of datetime directives:
    https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior

new

Signature:
new(cls)

Description:
a method to generate the current datetime as a labDT object

zulu

Signature:
zulu(self)

Description:
a method to report ISO UTC datetime string from a labDT object

    NOTE: for timezone offset string use .isoformat() instead

epoch

Signature:
epoch(self)

Description:
a method to report posix epoch timestamp from a labDT object
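
For comparison, the same two representations can be produced with the standard datetime module. This sketch does not use labDT; it only shows the relationship between a posix epoch timestamp and an ISO UTC string ending in Z, and the exact string format returned by zulu is an assumption.

```python
from datetime import datetime, timezone

# a timezone aware datetime built from a posix epoch timestamp
dt = datetime.fromtimestamp(1451703845, tz=timezone.utc)

# ISO UTC string with a literal Z suffix instead of a +00:00 offset
zulu_string = dt.strftime("%Y-%m-%dT%H:%M:%S.%fZ")

# back to a posix epoch timestamp as a float
epoch_value = dt.timestamp()
```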

rfc2822

Signature:
rfc2822(self)

Description:
a method to report a RFC-2822 Compliant Date from a labDT object

        https://tools.ietf.org/html/rfc2822.html#page-14
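
An RFC 2822 date string can also be produced with the standard library, which is useful for checking output. This sketch uses email.utils.format_datetime on a timezone aware datetime and does not use labDT; whether rfc2822 returns the identical string is not confirmed here.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

dt = datetime(2016, 1, 2, 3, 4, 5, tzinfo=timezone.utc)

# day name, day, month name, year, time and numeric zone offset
rfc_string = format_datetime(dt)
print(rfc_string)  # Sat, 02 Jan 2016 03:04:05 +0000
```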

pyLocal

Signature:
pyLocal(self, time_zone="")

Description:
a method to report a python datetime from a labDT object

Argument   Type    Required  Default  Description
self       object  Yes       None
time_zone  str               ""       [optional] string with timezone to report in

jsLocal

Signature:
jsLocal(self, time_zone="")

Description:
a method to report a javascript string from a labDT object

Argument   Type    Required  Default  Description
self       object  Yes       None
time_zone  str               ""       [optional] string with timezone to report in

humanFriendly

Signature:
humanFriendly(self, time_zone="", include_day=True, include_time=True)

Description:
a method to report a human friendly string from a labDT object

Argument      Type    Required  Default  Description
self          object  Yes       None
time_zone     str               ""       [optional] string with timezone to report in
include_day   bool              True
include_time  bool              True

fromEpoch

Signature:
fromEpoch(cls, epoch_time)

Description:
a method for constructing a labDT object from epoch timestamp

Argument    Type      Required  Default  Description
cls         NoneType  Yes       None
epoch_time  float     Yes       0.0      number with epoch timestamp info

fromISO

Signature:
fromISO(cls, iso_string)

Description:
a method for constructing a labDT object from a timezone aware ISO string

Argument    Type      Required  Default  Description
cls         NoneType  Yes       None
iso_string  str       Yes       ""       string with date and time info in ISO format

fromPython

Signature:
fromPython(cls, python_datetime)

Description:
a method for constructing a labDT from a python datetime with timezone info

Argument         Type      Required  Default  Description
cls              NoneType  Yes       None
python_datetime  object    Yes       None     datetime object with timezone info

fromJavascript

Signature:
fromJavascript(cls, javascript_datetime)

Description:
a method to construct labDT from a javascript datetime string

Argument             Type      Required  Default  Description
cls                  NoneType  Yes       None
javascript_datetime  str       Yes       ""       string with datetime info in javascript formatting

fromPattern

Signature:
fromPattern(cls, datetime_string, datetime_pattern, time_zone, require_hour=True)

Description:
a method for constructing labDT from a strptime pattern in a string

    https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior

    iso_pattern: '%Y-%m-%dT%H:%M:%S.%f%z'
    human_friendly_pattern: '%A, %B %d, %Y %I:%M:%S.%f%p'

Argument          Type      Required  Default  Description
cls               NoneType  Yes       None
datetime_string   str       Yes       ""       string with date and time info
datetime_pattern  str       Yes       ""       string with python formatted pattern
time_zone         str       Yes       ""       string with timezone info
require_hour      bool                True     [optional] boolean to disable hour requirement
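
The two example patterns in the description can be checked directly against datetime.strptime. The sketch below uses only the standard library and not labDT. The sample strings are made up for illustration, and attaching UTC to the second result stands in for the time_zone argument, whose accepted values are not documented here.

```python
from datetime import datetime, timedelta, timezone

iso_pattern = "%Y-%m-%dT%H:%M:%S.%f%z"
human_friendly_pattern = "%A, %B %d, %Y %I:%M:%S.%f%p"

# the iso pattern carries its own utc offset, so the result is timezone aware
iso_dt = datetime.strptime("2016-01-02T03:04:05.000006+0000", iso_pattern)

# the human friendly pattern has no offset, so a timezone must be supplied
naive_dt = datetime.strptime(
    "Saturday, January 02, 2016 03:04:05.000006AM", human_friendly_pattern
)
human_dt = naive_dt.replace(tzinfo=timezone.utc)  # assumed timezone: UTC
```

Both strings describe the same instant once a timezone is attached, which is why a pattern without an offset directive needs the separate time_zone argument.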