Data Access Function

The functional programming paradigm encourages one to interpret every computing entity as a function. The abstraction of a function, however, remains the same: it is an entity that accepts a set of predefined inputs and returns some data.
Stored data access should be modelled the functional way. For querying data, a function with the following signature needs to be implemented.
def query(data_source_n, object_hierarchy [], selection_criteria {})
data_source_n:
You might have access to a plethora of data sources built separately due to business needs. A database each for HR, Manufacturing, Sales, Distribution and Customers is a common configuration. The storage mechanism may be also as diverse as traditional RDBMS, No-SQL or even a set of Web APIs. An organization may choose to store transactional data in a regular RDBMS, customer interaction data in a document database and source Item and Pricing data over HTTP API calls from 3rd parties.
object_hierarchy:
The object you want to access may be located by following a hierarchical path, for example customer.hq.zip or employee.year.quarter.bonus. The dot notation oftentimes messes with other code. It is better to express the hierarchy as an unbounded list - [..., ...,... ]
selection_criteria:
Each logical condition should be unique, which a set guarantees. Since the number of conditions for selection may vary, selection_criteria should be unbounded too!
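To make the three parameters concrete, here is a minimal sketch. It rests on assumptions that are not part of the signature above: the data source is a nested Python dict, the hierarchy leads to a list of records, and each selection criterion is a hypothetical (field, value) pair held in a set. A real implementation would dispatch to an RDBMS, a document store or an HTTP API instead.

```python
def query(data_source_n, object_hierarchy, selection_criteria):
    """Walk object_hierarchy into data_source_n, then filter the records.

    Assumptions for illustration only: data_source_n is a nested dict,
    the path ends at a list of dict records, and selection_criteria is a
    set of (field, expected_value) tuples that must all match.
    """
    node = data_source_n
    for key in object_hierarchy:          # follow the path, one level at a time
        node = node[key]
    return [
        record for record in node
        if all(record.get(field) == value for field, value in selection_criteria)
    ]


# Hypothetical in-memory stand-in for an HR data source
hr = {
    "employee": {
        "2022": [
            {"name": "Asha", "quarter": "Q1", "bonus": 1200},
            {"name": "Ravi", "quarter": "Q2", "bonus": 900},
            {"name": "Asha", "quarter": "Q2", "bonus": 1500},
        ]
    }
}

rows = query(hr, ["employee", "2022"], {("name", "Asha"), ("quarter", "Q2")})
print(rows)
```

Because the criteria live in a set, repeating a condition has no effect, and an empty set simply selects every record at the end of the path.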
A function can return only one value. But in the real-world use cases we come across, we often need to return multiple values. When we talk about multiple values, we generally assume arrays of similar data, like phone numbers or educational qualifications. But common cases also include things like an address, which is multi-part, and lumping all the parts together into one piece of text is not a good idea. And then there is the most complex, yet not so uncommon, case of an array of multi-part data: say, the last 5 addresses of an employee.
Fortunately, we have options of structured, serialized text formats like XML or JSON. Using either of those two, one can express even the most complex structure of data in terms of plain text.
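As a small illustration, the sketch below packs an array of multi-part addresses into a single JSON string and unpacks it again, using Python's standard json module. The employee record and its field names are made up for the example.

```python
import json

# Hypothetical record: simple arrays plus an array of multi-part addresses
employee = {
    "name": "Asha",
    "phones": ["555-0100", "555-0101"],
    "addresses": [
        {"street": "12 Lake Road", "city": "Pune", "zip": "411001"},
        {"street": "7 Hill View", "city": "Mumbai", "zip": "400001"},
    ],
}

payload = json.dumps(employee)    # the single return value: plain text
restored = json.loads(payload)    # the caller recovers the full structure

print(restored["addresses"][1]["city"])
```

The function still returns exactly one value, a string, yet nothing about the nesting is lost on the way.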
Widespread use of XML and JSON has also made data insert functions more elegant. Instead of using some 30-odd parameters, a function may have just one input JSON as its signature. JSON has the additional advantage over XML of being able to hold multiple documents in a single file or variable, since a JSON array is a valid construct.
Typical code would look something like this:
def insert(json_array [{}])
A million records may be inserted in a single function call. Rarely do we come across such an example of power and elegance together!
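A rough sketch of such an insert function follows. Two things here are assumptions added for the example: the extra data_source_n parameter, and the use of a plain Python list as a stand-in for a table or collection.

```python
import json


def insert(data_source_n, json_array):
    """Add every document in json_array (JSON text) to the data source.

    Assumption: data_source_n is a plain list standing in for real storage.
    Returns the number of records added.
    """
    records = json.loads(json_array)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of objects")
    data_source_n.extend(records)     # one call, any number of records
    return len(records)


customers = []   # hypothetical empty collection
batch = json.dumps([
    {"id": 1, "name": "Acme"},
    {"id": 2, "name": "Globex"},
])
inserted = insert(customers, batch)
print(inserted, customers)
```

The signature stays the same whether the array carries two records or a million; only the size of the text changes.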
Update, as a process, has two parts: a) perform query to locate the objects to be updated and b) replace the existing values with new values.
We already have a function called query for performing any arbitrary query. We will be using that. Then we will modify the objects thus received. Finally, we will send back the modified objects to the data source for permanent storage.
objects_to_be_updated = query(data_source_n, object_hierarchy [], selection_criteria {})
...
# objects getting modified
...
...
update(data_source_n, object_hierarchy [], objects_to_be_updated)
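The query, modify, update cycle can be sketched end to end as follows. As before, the data source is assumed to be a nested dict and the criteria a set of (field, value) pairs. The key parameter of update, which names the field used to match records, is a hypothetical addition for the example.

```python
def query(data_source_n, object_hierarchy, selection_criteria):
    """Return copies of the matching records (assumed: nested-dict source)."""
    node = data_source_n
    for k in object_hierarchy:
        node = node[k]
    return [dict(r) for r in node
            if all(r.get(f) == v for f, v in selection_criteria)]


def update(data_source_n, object_hierarchy, objects_to_be_updated, key="id"):
    """Write modified records back, matching on a hypothetical key field."""
    node = data_source_n
    for k in object_hierarchy:
        node = node[k]
    changed = {obj[key]: obj for obj in objects_to_be_updated}
    for i, record in enumerate(node):
        if record[key] in changed:
            node[i] = changed[record[key]]
    return len(changed)


# Hypothetical Sales data source
sales = {"customer": [
    {"id": 1, "city": "Pune", "tier": "silver"},
    {"id": 2, "city": "Delhi", "tier": "silver"},
]}

objects_to_be_updated = query(sales, ["customer"], {("city", "Pune")})
for obj in objects_to_be_updated:
    obj["tier"] = "gold"              # only the copies change here
update(sales, ["customer"], objects_to_be_updated)
print(sales["customer"])
```

Since query hands back copies, the stored data does not change until update is called.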
A data access object differs from a native, in-memory object in at least one significant way. There is no passing by reference. The entire object is expressed or serialized. So access is always public, verbose and mostly in clear text. That the "clear text" happens to be in either XML or JSON format is of some relief. Nevertheless, you will need a good set of utility functions to handle text in XML or JSON format. Each such document is in itself a mini-database with its own CRUD challenges.
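Two utility functions of the kind mentioned above might look like the sketch below. The names get_path and set_path are invented for this example; they read and replace a value inside a JSON document, addressed by the same list-style hierarchy used for queries.

```python
import json


def get_path(document, object_hierarchy):
    """Read the value found by following object_hierarchy in a JSON text."""
    node = json.loads(document)
    for key in object_hierarchy:
        node = node[key]
    return node


def set_path(document, object_hierarchy, value):
    """Return a new JSON text with the value at object_hierarchy replaced."""
    root = json.loads(document)
    node = root
    for key in object_hierarchy[:-1]:   # walk down to the parent object
        node = node[key]
    node[object_hierarchy[-1]] = value
    return json.dumps(root)


doc = '{"customer": {"hq": {"zip": "411001"}}}'
new_doc = set_path(doc, ["customer", "hq", "zip"], "400001")
print(get_path(new_doc, ["customer", "hq", "zip"]))
```

Both functions take text in and give text or plain values out, so nothing is ever shared by reference between caller and callee.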
Given the popularity of JSON, it is better to keep all the data formats in JSON. It is advisable to convert all native data into JSON format as soon as you fetch it into program memory. Hold on to JSON as long as possible, until you have to commit data to disk. But in that case you will have a synchronization problem between your in-memory JSON cache and the on-disk native data storage.
You can avoid that problem by switching to completely stateless programming. Do not keep data in memory any longer than absolutely needed.
September 17th, 2022