late hour super class surgery

The design of my distributed web application framework grew out of the idea of creating a content management system built on a generalized compositional design paradigm. In my work at TheStreet.com I designed an ad management system that used the rudiments of this idea. Its compositional structure was fairly simple: the ad management tool needed to manage collections of advertisements in blocks known on the system as "tiles", and collecting the tile code into groups called "tilex" groups allowed each collection to be managed as one unit, including dynamic replacement of various strings to match parameters of the pages to which the "tilex" groups were joined. This simple relationship between specific ad management Entities and their associated relational tables took a month to design and build.

It was months before I thought of a solution that could use a similar design idea but be generalized to manage any Entity at all: a collection of Stories on a Page, a collection of posts in a forum thread, collections of threads in a forum, and even physical Entities like a collection of nodes in a system cluster. That generalized design idea was the impetus behind the development of the platform, with the first application, designed as a proof of concept, being a content management system.

One of the requirements of the system was that management of all Entities occur through a web interface. This was engineered to be maximally efficient, allowing each Entity to customize the interface for managing instances of its type dynamically. To do this, Users had to be able to manage instances. In most management UIs, Users can perform several actions against the objects they manage, most often the CRUD actions: creating, retrieving, updating or deleting objects stored in the associated datastore.
I expanded these actions with four more, covering publishing, searching, importing and exporting. The problem is that how these actions are performed varies with the underlying datastore. I originally designed against a specific database vendor, but realized that a generalized vendor API should be built to maximize the power of the framework. I designed a vendor-agnostic sub-API that encodes the vendor-specific differences into dynamically invoked code, allowing client programmers to remain agnostic to the actual underlying database. This brings us to the main issue: the different key insertion methods between the vendors.
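As a rough illustration of the action set described above, here is a minimal sketch of what such a vendor-agnostic contract could look like. All names here (EntityStore, MemoryStore) are hypothetical, not the framework's actual API; the in-memory implementation simply shows the contract in use.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the eight actions: CRUD plus publish/search/import/export.
interface EntityStore<T> {
    int create(T item);              // returns the generated key
    T retrieve(int id);
    void update(int id, T item);
    void delete(int id);
    void publish(int id);            // framework-specific action
    List<T> search(String query);
    void importAll(List<T> items);   // "import" alone is a Java keyword
    List<T> exportAll();
}

// Minimal in-memory implementation standing in for a real vendor-backed one.
class MemoryStore implements EntityStore<String> {
    private final Map<Integer, String> rows = new HashMap<>();
    private int nextKey = 1;

    public int create(String item) { rows.put(nextKey, item); return nextKey++; }
    public String retrieve(int id) { return rows.get(id); }
    public void update(int id, String item) { rows.put(id, item); }
    public void delete(int id) { rows.remove(id); }
    public void publish(int id) { /* no-op in this sketch */ }
    public List<String> search(String query) {
        List<String> hits = new ArrayList<>();
        for (String v : rows.values()) if (v.contains(query)) hits.add(v);
        return hits;
    }
    public void importAll(List<String> items) { for (String s : items) create(s); }
    public List<String> exportAll() { return new ArrayList<>(rows.values()); }
}
```

Client code written against the interface stays agnostic to whichever vendor-specific class is dynamically invoked behind it.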

Key insertion: different strokes for different folks.

Key insertion is an important part of relational database design. A key is basically a unique label associated with the rows of a database table. Some keys are integers, others are alphanumeric, and some can be a combination of table fields. Some database vendors (MySQL and MS SQL Server) can insert keys automatically when new objects are inserted into tables with primary keys. Others, like Oracle, require the creation and use of special sequences that are incremented outside of the database tables. These differing, non-ANSI-compliant methods for enabling key generation on tables make my choice to create a db vendor API a prudent one, but they also forced a requirement on the Entity class designs: Entities that are created or retrieved programmatically must have a way to extract the inserted items agnostically. If you don't have a known key value, you can't know whether the item you are pulling out is the one you recently created. The framework is distributed across multiple processing nodes, so creation actions for new Entities can be happening simultaneously with subsequent retrievals; how do you disambiguate these objects? Enter user defined keys.
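The vendor split described above can be made concrete by sketching how a vendor API might encode the two styles of key generation as SQL. The table, column, and sequence names below are illustrative; only the vendor mechanisms (auto-increment retrieval on MySQL/MSSQL, sequences on Oracle) are real.

```java
// Hypothetical sketch: hiding vendor key-generation differences behind one call.
enum Vendor { MYSQL, MSSQL, ORACLE }

class KeySql {
    // Insert SQL: auto-increment vendors omit the key column entirely,
    // while Oracle draws the key from a sequence in the statement itself.
    static String insertSql(Vendor v, String table, String seq) {
        switch (v) {
            case ORACLE:
                return "INSERT INTO " + table
                        + " (id, name) VALUES (" + seq + ".NEXTVAL, ?)";
            default: // MYSQL, MSSQL: the table generates the key
                return "INSERT INTO " + table + " (name) VALUES (?)";
        }
    }

    // SQL to fetch the key just generated in the current session.
    static String lastKeySql(Vendor v, String seq) {
        switch (v) {
            case MYSQL:  return "SELECT LAST_INSERT_ID()";
            case MSSQL:  return "SELECT SCOPE_IDENTITY()";
            default:     return "SELECT " + seq + ".CURRVAL FROM DUAL"; // ORACLE
        }
    }
}
```

Note that all three "last key" mechanisms are scoped to a single connection or session; across multiple nodes inserting concurrently they do not by themselves identify which row a given node created, which is exactly the gap user defined keys fill.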


User defined key

A user defined key is a quick and dirty way to ensure that a new Entity object inserted into a table can be retrieved, by generating a unique sequence of letters or numbers and adding it to a field in the table. In my framework all tables have a name or description field of type "varchar". In the tables that require programmatic modification, the insert populates this field with a unique string consisting of the date and time and the node id of the machine performing the creation. This guarantees that a subsequent request that searches the field for that key will retrieve the inserted item (if the insert succeeded) and not an item inserted at the same time by another machine. The use of user defined keys comes with many advantages and a few disadvantages:
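A minimal sketch of the key scheme just described, combining a timestamp with the creating node's id; the exact format and class name are illustrative, not the framework's.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Sketch: a user defined key built from date/time plus node id, destined
// for the name/description varchar field described above.
class UserKey {
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmssSSS");

    // nodeId identifies the machine performing the insert; the timestamp
    // distinguishes keys created on the same node at different times.
    static String generate(String nodeId) {
        return FMT.format(LocalDateTime.now()) + "-" + nodeId;
    }
}
```

Two inserts on the same node within the same millisecond would still collide under this format, which is the scenario the post's point about adding further uniqueness factors (site, user, a counter) addresses.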

Advantages.

  • no need to accommodate any particular db; agnostic to each vendor's primary key generation method.
  • allows arbitrary-length keys that can be changed dynamically in client code at run time without a recompile. If the rate of addition of new items exceeds the ability of the current factors (say, date or time stamps) to generate unique keys, additional uniqueness factors can be added without a recompile.
  • abstract method enforcement (see next section) provides a hint to class programmers of new Entity types to implement the method for client use.
  • allows a huge space of possible keys to be inserted or extracted simultaneously with uniqueness enforced.

Disadvantages

  • uniqueness is only guaranteed if sufficiently orthogonal factors are chosen (date/time/node/site/user, etc.)
  • if the key is expensive to compare because it is large, searching the table will incur an increasing time cost
  • implementing the additional method in all inheriting classes increases the size of instances and therefore the running memory cost of the entire application (see next section)

Two additional disadvantages are unique to my framework. First, in order to make this change to the core API at this late date, I will have to recompile the jar class distribution for the application and redistribute it to the production servers. This will require that they be restarted, but thankfully they are designed as a redundant set, so this will not require total downtime for the applications hosted on the site. Secondly, my system uses serialized instances of class objects to store object state to disk for system-wide versioning. Objects serialized against the previous class state will no longer be accessible in the interface, as the class signature check will fail; these objects need to be deleted. Luckily, because none of the services are being used by paying customers yet, this will not impact any Users' objects.
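The serialization breakage described above is Java's class compatibility check: when a serialized object's serialVersionUID no longer matches that of the loaded class, deserialization fails with an InvalidClassException. If the UID is left implicit, the JVM derives it from the class's shape, including its non-private methods, so adding an abstract method changes it. Pinning it explicitly, as in this illustrative sketch, keeps old serialized instances readable across compatible changes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative class; the framework's real classes differ.
class StoredEntity implements Serializable {
    // Without this, the derived UID changes when methods are added,
    // invalidating previously serialized state.
    private static final long serialVersionUID = 1L;
    String desc;
    StoredEntity(String d) { desc = d; }
}

class RoundTrip {
    // Serialize then deserialize, as the versioning store would.
    static StoredEntity copy(StoredEntity e) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            new ObjectOutputStream(bos).writeObject(e);
            ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bos.toByteArray()));
            return (StoredEntity) in.readObject();
        } catch (Exception ex) {
            throw new RuntimeException(ex);
        }
    }
}
```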

The actual insertion and retrieval are done by a database access API, a persistence API custom built to allow Entities to mutate the datastore. The insert, retrieve, update and delete actions are methods enforced by abstract method constraints in the superclass for all db access classes. The retrieve methods include the superclass-enforced signature that takes an integer "id" and a boolean "fillobject". An overloaded retrieve method exists for String retrieval on the aforementioned name or description field, but only for those Entities that were found to require programmatic insert and retrieve. This solves the problem of programmatic insertion/retrieval, but it leaves open the possibility that new Entities added to the framework dynamically would be unable to support efficient programmatic modification, since nothing requires them to provide the retrieve(String) method. I realized this would be an issue years ago while coding the ECMS application but felt I would address it at a later date; now that the second application is about to go live as a commercial solution and other applications may come online, that time is now.
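The pre-change state described above can be sketched as follows. The signatures follow the post; everything else (class names, the in-memory map standing in for the datastore, the exception being unchecked here) is illustrative.

```java
// Stubs standing in for the framework's real classes.
class Entity {
    int id; String desc;
    Entity(int i, String d) { id = i; desc = d; }
}
class EntegraDBAccessException extends RuntimeException {
    EntegraDBAccessException(String m) { super(m); }
}

// Superclass of the db access classes: only the integer-keyed
// retrieve is enforced on every subclass.
abstract class DBAccessor {
    public abstract Entity retrieve(int id, boolean fillobject)
            throws EntegraDBAccessException;
}

// Before the change, only Entities known to need programmatic
// insert/retrieve carried a hand-written String overload.
class PageDB extends DBAccessor {
    private final java.util.Map<Integer, Entity> rows = new java.util.HashMap<>();

    void insert(Entity e) { rows.put(e.id, e); }

    public Entity retrieve(int id, boolean fillobject)
            throws EntegraDBAccessException {
        Entity e = rows.get(id);
        if (e == null) throw new EntegraDBAccessException("no row " + id);
        return e;
    }

    // Optional overload: nothing forces other accessor classes to have this.
    public Entity retrieve(String desc) throws EntegraDBAccessException {
        for (Entity e : rows.values()) if (e.desc.equals(desc)) return e;
        throw new EntegraDBAccessException("no row for " + desc);
    }
}
```

The gap is visible in the sketch: an accessor class that omits the String overload still compiles, so new Entities can silently ship without programmatic retrieval.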

Adding a new abstract method signature

In order to ensure that all new Entities implement a retrieve method that can be used to extract User defined keys, the superclass of the db class for all Entities must be modified. The Java programming language makes this as simple as adding the following to the class:

public abstract Entity retrieve(String desc, boolean recurse) throws EntegraDBAccessException;

This method signature forces all inheriting classes to provide an implementation of the method at compile time, and thus ensures that all Entity classes that use db accessors will have a retrieve method for programmatic inserts and retrieves. The next part is the hard part or the easy part, depending on your perspective. I mentioned the two applications built so far using the framework, ECMS and collaboration; these applications necessitated the creation of nearly 40 Entities, of which only 15 required programmatic insert/retrieve, so about 25 remain to provide implementations for the now enforced abstract signature mentioned above. To make matters more laborious, system classes use a different superclass, called a db object, which is similar to the Entities but leaves out the compositional parent/child retrieval logic. Both are managed through the same UI, so it also requires a retrieve(...) method signature, and there are about 12 of those classes that require implementation. So the hardest part of this involves adding the required implementations; the methods are simple, so it will be mostly cut and paste for a few hours. But once it is done, the entire API will support programmatic insert/retrieve using user defined keys and, more importantly, any new Entities or system classes that inherit from the core classes will be forced to provide an implementation at compile time. This will make the client programmer's life a lot easier when designing with the Entity objects.
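One of those cut-and-paste implementations might look like the sketch below. Only the abstract signature comes from the post; the subclass name, the in-memory map standing in for the datastore, and the unchecked exception stub are illustrative.

```java
// Stubs standing in for the framework's real classes.
class Entity {
    String desc;
    Entity(String d) { desc = d; }
}
class EntegraDBAccessException extends RuntimeException {
    EntegraDBAccessException(String m) { super(m); }
}

// Superclass carrying the newly added abstract signature.
abstract class EntityDB {
    public abstract Entity retrieve(String desc, boolean recurse)
            throws EntegraDBAccessException;
}

// An inheriting Entity accessor: omitting retrieve(String, boolean)
// is now a compile error rather than a runtime surprise.
class StoryDB extends EntityDB {
    private final java.util.Map<String, Entity> table = new java.util.HashMap<>();

    void insert(Entity e) { table.put(e.desc, e); }

    public Entity retrieve(String desc, boolean recurse)
            throws EntegraDBAccessException {
        Entity e = table.get(desc);
        if (e == null) {
            throw new EntegraDBAccessException("no row for key: " + desc);
        }
        // recurse would fill in compositional children; ignored in this sketch
        return e;
    }
}
```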
