Novel creativity will not happen in AI without Salience evaluation


So the last few years have seen impressive performance from machine learning models that leverage deep, multi-layer neural networks to re-render a target image in the "style" of a given input image, producing an output that appears as if the algorithm had created it in an artistic way.

Filters leveraging these neural networks (convolutional ones being the most effective at this proto-creative action) are quickly appearing in various apps.

However, for creating art...particularly novel art that is not just the result of a complex mathematical process applied to a single source and a single target image...such approaches are an utter failure. For example, as an illustrator I can be given two or three input images of a given character from different perspectives and, on the basis of that small input set, create a wide variety of new images of that same character with a high degree of verisimilitude.

How?

Where a convolutional neural network requires a direct transformation between the target image and the input "style" image in order to create an output that is basically the target rendered through the style, creating an entirely new representation of the same character depicted IN a set of images requires a completely different approach.
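
To make the contrast concrete, here is a minimal sketch (in Python with NumPy, not any particular production filter) of the kind of statistic style transfer optimizes: a Gram matrix over feature maps, which captures texture and "style" but says nothing about the 3D structure of the subject. The feature maps below are random stand-ins for what a convolutional layer would produce.

```python
import numpy as np

def gram_matrix(features):
    """Style representation used in neural style transfer.

    features: array of shape (channels, height, width) from some conv layer.
    Returns the (channels x channels) matrix of feature correlations --
    it encodes texture/"style" while discarding spatial arrangement,
    which is exactly why it cannot recover a character's 3D form.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (c * h * w)

def style_loss(generated_feats, style_feats):
    """Mean squared difference between the Gram matrices of two images' features."""
    diff = gram_matrix(generated_feats) - gram_matrix(style_feats)
    return float(np.mean(diff ** 2))

# Toy stand-ins for conv-layer activations of a generated and a style image.
rng = np.random.default_rng(0)
gen = rng.standard_normal((64, 32, 32))
sty = rng.standard_normal((64, 32, 32))
print("style loss:", style_loss(gen, sty))
```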

The machine learning model will first have to extract predictions about the dimensional nature of the character in the images...and if the images contain more than one character, it further needs to disambiguate which is which (in a small set this is trivial for us to do but hard as sin for an untrained network).
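
As a toy illustration of that disambiguation step (an assumed stand-in, not the mechanism proposed here): given crude appearance vectors for each character detection across a few images, even a naive clustering pass can group detections by identity when the appearances are distinct.

```python
import numpy as np

def cluster_detections(features, k=2, iters=20, seed=0):
    """Naive k-means: group per-detection appearance vectors into k identities.

    features: (n_detections, dim) crude descriptors (e.g. color histograms).
    Returns an array of cluster labels, one per detection.
    """
    rng = np.random.default_rng(seed)
    centroids = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        # Assign each detection to its nearest centroid.
        dists = np.linalg.norm(features[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids from the current assignments.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = features[labels == j].mean(axis=0)
    return labels

# Two characters, three views each, separated in a fake appearance space.
rng = np.random.default_rng(1)
char_a = rng.normal([1.0, 0.0, 0.0], 0.1, size=(3, 3))
char_b = rng.normal([0.0, 1.0, 1.0], 0.1, size=(3, 3))
print(cluster_detections(np.vstack([char_a, char_b])))
```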

Once the model has a rough understanding of the dimensions of the subject, it can then choose some arbitrary perspective and render a novel representation of the character from it (using its own desired "style" would make the task even harder).
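
The geometric core of "choose an arbitrary perspective and render" can be sketched with ordinary coordinate transformation (a hand-rolled NumPy example, not a claim about how a learned model would do it internally): rotate a crude 3D point model of the subject to a new viewpoint, then project it to 2D with perspective.

```python
import numpy as np

def rotation_y(angle_rad):
    """3x3 rotation about the vertical axis (a new viewing angle)."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def project(points, angle_rad, camera_distance=5.0, focal=1.0):
    """Rotate a (n, 3) point cloud to a chosen perspective and project to 2D.

    The perspective divide by depth keeps proportions consistent with viewpoint.
    """
    rotated = points @ rotation_y(angle_rad).T
    depth = rotated[:, 2] + camera_distance          # push subject in front of camera
    return focal * rotated[:, :2] / depth[:, None]   # (n, 2) image coordinates

# A crude "character" reduced to a few 3D landmark points.
landmarks = np.array([[0.0, 1.0, 0.0],    # head
                      [0.0, 0.0, 0.0],    # torso
                      [-0.5, -1.0, 0.0],  # left foot
                      [0.5, -1.0, 0.2]])  # right foot
print(project(landmarks, np.deg2rad(35)))
```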

Such an improvement in image processing learning models is going to require the ability to take a short input set and create intermediate interpolations that obey the dimensional rules of perspective while keeping proportions correct across those perspectives (in other words, coordinate transformation). As our brains do it, the model would have to emerge this capability without explicitly evolving algorithms for coordinate transformation in a mathematical sense; it would have to do it the way we do, via an intuitive sense that doesn't rely on active mathematical calculation. Further, the model would have to find some way of keeping a chosen perspective in "mind" long enough to render from it without mixing it with other possible creative outcomes.
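
For concreteness, those intermediate interpolations between two given views can be written down explicitly (again a hedged sketch of the geometry, not the intuitive mechanism argued for here): interpolate the viewing angle between the two source perspectives and re-project at each step, so proportions stay correct throughout.

```python
import numpy as np

def view(points, angle_rad, camera_distance=5.0):
    """Rotate a (n, 3) point cloud about the vertical axis and project to 2D."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rot = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    rotated = points @ rot.T
    return rotated[:, :2] / (rotated[:, 2] + camera_distance)[:, None]

def interpolate_views(points, angle_a, angle_b, steps=5):
    """Render the same 3D subject at evenly spaced viewpoints between two views.

    Because every frame comes from the same 3D points, proportions stay
    consistent through the interpolation -- the property a creative model
    would need, here done with explicit math rather than learned intuition.
    """
    return [view(points, a) for a in np.linspace(angle_a, angle_b, steps)]

landmarks = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 0.0],
                      [-0.5, -1.0, 0.0], [0.5, -1.0, 0.2]])
for frame in interpolate_views(landmarks, np.deg2rad(10), np.deg2rad(80)):
    print(np.round(frame, 3))
```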

I think this next level of creative expression in image processing neural networks will require some merging of visual processing and image generation networks, tied together by a shallow and broad learning super model that can emerge a simple salience landscape...one spanning the options for perspective and style of rendering. That is what it will take to get such creativity from a general purpose cognitive model rather than a custom architected one like the many that have found success creating mixed (convolved) images. Thus I assert that to do this task the cognitive model MUST have a salience loop akin to the one below...a dynamic cognition cycle for at least the image processing sub-cycle of cognition.
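
As a purely illustrative sketch (the names and structure here are assumptions, not the Salience Theory formalism), such a loop can be caricatured as a cycle in which a shallow, broad layer scores the outputs of several deep perceptual modules, holds the winning perspective/style choice across iterations, and feeds it to a renderer.

```python
import numpy as np

def salience_loop(module_outputs, salience_weights, hold_steps=3):
    """Caricature of a salience-driven selection cycle (illustrative only).

    module_outputs: dict mapping option name -> feature vector proposed by a
        deep module (e.g. candidate perspectives or rendering styles).
    salience_weights: vector the shallow/broad layer uses to score options.
    The chosen option is held in "mind" for hold_steps iterations so the
    render step works from one stable choice instead of mixing options.
    """
    scores = {name: float(np.dot(vec, salience_weights))
              for name, vec in module_outputs.items()}
    chosen = max(scores, key=scores.get)           # salience picks one option
    for step in range(hold_steps):                 # working-memory style hold
        print(f"step {step}: rendering from held choice '{chosen}'")
    return chosen, scores

# Toy candidate options proposed by imagined deep sub-networks.
options = {
    "three_quarter_view": np.array([0.9, 0.2, 0.4]),
    "profile_view":       np.array([0.3, 0.8, 0.1]),
    "ink_wash_style":     np.array([0.5, 0.5, 0.9]),
}
salience = np.array([0.6, 0.1, 0.3])               # assumed modulation weights
print(salience_loop(options, salience))
```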

Outside of a general purpose solution that leverages a salience loop to solve this problem of novel creativity, there may be a way to achieve the same result by architecting a complex interaction of networks...but I posit such architectures would be too unwieldy for machine learning researchers to discover the way they've discovered so much of the usefulness of their solutions: by trial and error. The complexity of a fixed architecture approach is inversely proportional to the generality of the solution produced; it may work, but it would be tightly coupled to the problem it was designed around. And so with this realization I propose a 4th hypothesis as an extension to the Salience Theory of Dynamic Cognition that I posted in 2013: dynamic cognition of the kind that will emerge general creative intelligence MUST leverage SABL (shallow and broad learning) entity relations as well as deep learning relations, tied together via a salience driven process (leveraging autonomic and emotional modulation). AI which does not attempt to replicate efficient SABL cross connection of seemingly disparate deep networks, each focused on a specific sensory dimensional dataset, will emerge neither novel creative nor self aware (conscious) intelligence.
