
Novel creativity will not happen in AI without Salience evaluation

The last few years have seen impressive performance from machine learning models built on deep, multi-layer neural networks. One ability that has emerged is re-rendering a target image in the "style" of a given input image, producing an output that looks as if the algorithm had created it artistically.

Filters built on these neural networks (convolutional networks being the most effective at this proto-creative action) are quickly appearing in various apps.

However, for creating art, and particularly for creating novel art that is not just the result of a complex mathematical process applied to a single source image and a single target image, such approaches are an utter failure. For example, as an illustrator I can be given two or three input images of a character from different perspectives and, on the basis of that small input set, create a wide variety of new images of that same character with a high degree of verisimilitude.


A convolutional neural network requires a direct transformation between the target image and the input "style" image, and its output is basically the target as rendered through the style. Creating an entirely new representation of the same character from a set of images requires a completely different approach.
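To make the contrast concrete, here is a minimal sketch of the kind of computation a style filter performs. It assumes the Gram-matrix formulation commonly used for neural style transfer, which this post does not name, and it uses tiny hand-written lists in place of the activations a real convolutional layer would produce. All function names and values are hypothetical.

```python
def gram_matrix(features):
    """Channel-by-channel correlations of one layer's activations.

    `features` is a list of channels, each a flat list of activations.
    """
    n = len(features[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in features]
            for ci in features]


def style_loss(output_features, style_features):
    """Mean squared difference between the two Gram matrices."""
    g_out = gram_matrix(output_features)
    g_sty = gram_matrix(style_features)
    c = len(g_out)
    return sum((g_out[i][j] - g_sty[i][j]) ** 2
               for i in range(c) for j in range(c)) / (c * c)


def content_loss(output_features, target_features):
    """Mean squared difference between activations, position by position."""
    diffs = [(a - b) ** 2
             for co, ct in zip(output_features, target_features)
             for a, b in zip(co, ct)]
    return sum(diffs) / len(diffs)


# Stand-in activations: 2 channels x 4 positions (hypothetical values).
target = [[3.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
style = [[0.0, 2.0, 1.0, 1.0], [1.0, 1.0, 0.0, 0.0]]

# Shuffling positions changes the content but not the Gram matrix, which is
# why the style term ignores where things are in the image.
shuffled_style = [[ch[i] for i in (2, 0, 3, 1)] for ch in style]
```

In this sketch the style term only compares summary statistics of one style image, and the content term only compares positions against one target image. Nothing in either term represents the subject as a three-dimensional thing that could be seen from another angle.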

The machine learning model will first have to extract predictions about the dimensional nature of the character in the images. If the images contain more than one character, it also needs to disambiguate them (with a small set this is trivial for us to do but hard as sin for an untrained network).

Once the model has a rough understanding of the dimensions of the subject, it can choose some arbitrary perspective and render a novel representation from it (using its own desired "style" would make this even harder).

Such an improvement in image processing models is going to require the ability to take a short input set and create intermediate interpolations that obey the dimensional rules of perspective while keeping proportions correct through those perspectives (in effect, coordinate transformation). To do this as our brains do, the model would have to emerge this capability without actually evolving algorithms for coordinate transformation in a mathematical sense, relying instead on an intuitive sense that doesn't depend on active mathematical calculation. Further, the model would have to find some way of keeping a chosen perspective in "mind" long enough to render from it without mixing it with other possible creative outcomes.
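For reference, this is roughly what those dimensional rules look like when written out explicitly. It is a minimal sketch of a rotation followed by a pinhole perspective projection, with hypothetical landmark coordinates and camera defaults. It illustrates the constraints a model's output would have to respect, not the intuitive mechanism proposed above.

```python
import math


def rotate_y(point, angle):
    """Rotate a 3D point about the vertical (y) axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)


def project(point, camera_distance=5.0, focal_length=1.0):
    """Pinhole perspective projection onto the image plane.

    The camera sits `camera_distance` in front of the origin, looking
    along +z (hypothetical defaults).
    """
    x, y, z = point
    depth = z + camera_distance
    return (focal_length * x / depth, focal_length * y / depth)


def view(points, angle):
    """Render the same 3D landmarks from a new viewing angle."""
    return [project(rotate_y(p, angle)) for p in points]


# Hypothetical landmarks on a character: head, left hand, right hand, feet.
character = [(0.0, 1.0, 0.0), (-0.5, 0.3, 0.0), (0.5, 0.3, 0.0), (0.0, -1.0, 0.0)]

front = view(character, 0.0)
three_quarter = view(character, math.radians(45))

# Rotation keeps 3D proportions fixed: the distance between the hands is
# the same before and after, even though their 2D images move.
before = math.dist(character[1], character[2])
after = math.dist(rotate_y(character[1], 0.7), rotate_y(character[2], 0.7))
```

From the front the two hands project symmetrically. In the three-quarter view the hand nearer the camera lands farther from the image centre than the other, which is the kind of foreshortening an illustrator applies without calculating it.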

I think this next level of creative expression in image processing neural networks will require some merging of visual processing and image processing networks, tied together by a shallow and broad learning super model that can emerge a simple salience landscape, which in turn can emerge the option span for perspective and style of rendering. At least, that is what it will take to get such creativity from a general purpose cognitive model rather than a custom architected one like the many that have found success creating mixed (convolved) images. Thus I assert that to do this task the cognitive model MUST have a salience loop akin to the one below: a dynamic cognition cycle for at least the image processing sub cycle of cognition.

Outside of a general purpose solution that leverages a salience loop to solve this problem of novel creativity, there may be a way to do the same by architecting a complex interaction of networks. But I posit such architectures would be too unwieldy for machine learning researchers to discover the way they've discovered so much else, by trial and error. The complexity of a fixed architecture approach is inversely proportional to the generality of the solution: it may work, but it would be tightly coupled to the problem it was designed for.

And so with this realization I propose a fourth hypothesis as an extension to the Salience Theory of Dynamic Cognition that I posted in 2013. Dynamic cognition of the kind that will emerge general creative intelligence MUST leverage SABL (shallow and broad learning) entity relations as well as deep learning relations, tied together via a salience driven process (leveraging autonomic and emotional modulation). AI which does not attempt to replicate efficient SABL cross connection of seemingly disparate deep networks, each focused on a specific sensory dimensional dataset, will emerge neither novel creative nor self aware (conscious) intelligence.

