In "Human Cognition and Social Agent Technology"
Ed: Kerstin Dautenhahn, John Benjamins Publishing Company.
My intention in this essay is to discuss agent building from the perspective of the visual arts. I will argue for the value of artistic methodologies to agent design. I will not advance some futuristic version of the romantic bohemian artist, agonising over an expressionistic agent in his garret. Nor will I propose the harnessing of artistic minds to the industrial machine. I want to advance another argument which is pertinent specifically to the building of Social Agents. I propose that there are aspects of artistic methodology which are highly pertinent to agent design, and which seem to offer a corrective for elisions generated by the often hermetic culture of scientific research.
When one mentions the uses and functions of art in a scientific context, the understanding is often of superficial manipulation of visual `aesthetic' characteristics in the pursuit of `beauty' or a cool-looking demo. A more sophisticated approach recognises that the holistic and open ended experimental process of artistic practice allows for expansive inventive thinking, which can usefully be harnessed to technical problem solving (this has been the MIT Media Lab position). This approach tacitly recognises that certain types of artistic problem solving compensate for the `tunnel vision' characteristic of certain types of scientific and technical practice.
I have observed previously that the approach to the production of artworks by the scientifically trained tends to be markedly different from the approach of those trained in the visual arts. A case example is the comparison of two works which I included in the Machine Culture exhibition at SIGGRAPH 93. The Edge of Intention project by Joseph Bates and the Oz group at Carnegie Mellon University was an attempt to construct a knowledge base of plot structure and character development by distilling English literature and drama. Although the project had been in progress for several years, the developers admitted that it was still in its infancy. The audience experience at the time was somewhat simplistic: the user (incarnated as one of the agents) could play childlike games (chasing and hiding, etc.) with a group of cartoon entities which resembled moody jelly beans. The goal of the group was not to produce agents which were simulations of people, but agents which were `believable' in their own terms. This `believability' implies an abstraction of what we perceive to be intelligent behavior.
Luc Courchesne's Family Portrait, on the other hand, was comparatively low-tech. It consisted of four stations: four laserdiscs, four monitors and four Macintosh Classics, each with a simple HyperCard stack. The users stood and chatted with interactive video images. Although the interface consisted of using a trackball to choose responses to questions posed by the characters on the screen, the simulation of human interaction was uncanny. The artist has great finesse at simulating human interaction in the social space of the interface, a skill I have called interactive dramaturgy. A particularly effective trick was that the four virtual characters would occasionally break their conversation with the visitors, and turn to interrupt or contradict each other. This illusion of `real people' was aided by the handling of the hardware. The computer and AV hardware were hidden, even the monitors: the images were reflected in oblique sheets of glass in the darkened space, and seemed to float. Though low-tech, Family Portrait was dramatically persuasive in a way that Edge of Intention was not.
The difference in approach of these projects illustrates my argument. One might generalise in this way (with apologies to both groups): artists will kluge together any kind of mess of technology behind the scenes because the coherence of the experience of the user is their first priority. Scientists wish for formal elegance at an abstract level and do not emphasise, or do not have the training to be conscious of, inconsistencies in the representational schemes of the interface. Arising from the tradition of Artificial Intelligence, the Edge of Intention project seeks to create general tools for an interactive literature by analysing the basic components of (rudimentary) social interactions, and building a system for their coordination. The focus of the effort was to build an elegant and general internal system. The interface seemed to be a necessary but secondary aspect, like the experimental demonstration of a proof. The average user, however, will never gain access to the hermetic territory of the architecture of the code, and remains frustrated by the unsatisfying and incomplete nature of the representation of the system in the interface. Courchesne, on the other hand, does not attempt to build a general-purpose system, but presents a seamless and persuasive experience for the user. Artists are trained to understand the subtle connotations of images, textures, materials, sounds, and the way various combinations of these might conjure meaning in the mind of the viewer. Artists must be concerned with the adequate communication of (often subtle) ideas through visual cues. They understand the complexity of images and the complexity of cultural context. Of course, the artistic solutions are often highly contingent and specific to a certain scenario, and may not yield general principles for a class of scenarios. This is not their goal.
While more academic disciplines valorise and reward a `hands-off' approach, rewarding the more purely theoretic, artists are taught to integrate the artisanal and the conceptual (Penny, S. 1997). Artistic practice is the shortest route to the physical manifestation of ideas. According to the traditional view, properly trained, the manual skill of the artist becomes an automatic conduit for the expression of abstract thought. Purely perceptuo-motor and abstract conceptual processes are combined. Artists are judged on the perceived performance of a physically instantiated work, not on the coherence of a theory which may be demonstrated, perhaps obscurely. The criteria for a successful work are based almost solely on its influence on the viewer. An artwork must motivate the viewer to engage intellectually and emotionally with the work. In a good work, the `interface' is finely honed and engagement should develop over the long term.
This condition of engagement is a paradigmatic case of what Jonathan Crary calls the `techniques of the observer' (Crary, J. 1992). In the book of the same name, Crary argues that pictures would remain meaningless and mute without the unconscious and uncelebrated training of observers, as a cultural group. We are all trained in how we look at and appreciate pictures. The meaning of a work is negotiated by the observer in the moment of looking. Meaning is construed entirely as a result of the observers' cultural training.
A salutary example of the cultural specificity of this training is the history of depiction of `new lands' by colonising peoples. Take for instance the depiction by the British colonists of Australia in the closing years of the 18th century and later. Almost invariably in these pictures, aboriginals look negroid, eucalypts look like elms, kangaroos look like giant pudgy mice and the Australian bush looks like rolling English countryside. It took over 100 years until painters captured the quality of the Australian light. This example demonstrates that what we see depends to a great extent on what we have been trained to see. We extrapolate from our previous experience to explain our new perceptions.
Over the past decade, my artistic practice has developed from the construction of sensor driven interactive installations to systems with at least rudimentary forms of agency. My focus of interest has been for several years what I call the 'aesthetic of behavior', a new aesthetic field opened up by the possibility of cultural interaction with machine systems. I have the luxury of being able to experiment with the modalities of systems, without being constrained by an externally specified task for the system. A secondary interest arising from the first is the potential application of various `Alife' techniques as artistic tools, producing artworks which demonstrate behaviors which go beyond a `locked-down' state machine model. This combination of interests leads me inevitably into agent design. My background in art predisposes me to integrated, holistic, situated and embodied practice (both by the maker and in the agent).
In my own practice I tend to define the envelope of the problem first: the system has to do this on these occasions in this way, it has these physical constraints, this power limitation, etc. From these specifications I work slowly inward, from desired behavior to physical structure to specifics of sensing and actuation, often specifying hardware first, eventually arriving at the set of constraints within which the code must function. By contrast, computer scientists have a tendency to look briefly at the surface level, identify a `problem' that might respond to a rule-based solution, then dive deep into the abstractions of code at the most conceptual level, building the ramifications of a conceptual design up through the more abstract to the more `mechanical' aspects of the code, finally surfacing to look back at the interface and see if it works. This approach results in fragmentary and inconsistent interfaces.
These are some of the values which I bring into my robotic and agent practice. These positions bring me close to many already established in Cybernetics and in critiques of traditional AI which concern themselves with groundedness, embodiment, situated cognition and emergent behavior, as discussed by Brooks, Cariani, Dreyfus, Johnson, Varela, et al. (Brooks, R. 1991; Dreyfus, H. 1992; Johnson, M. 1987; Varela, F., Thompson, E. and Rosch, E. 1993). By the same token, my training steers me away from the sensibilities of symbolic AI approaches. In the following text I will discuss four recent works as examples of the way these positions arise or are applied.
2. Petit Mal
The goal of the project Petit Mal: An Autonomous Robotic Artwork was to produce a robotic artwork which was truly autonomous; which was nimble and had `charm'; which sensed and explored architectural space and pursued and reacted to people; which gave the impression of intelligence and had behavior which was neither anthropomorphic nor zoomorphic, but unique to its physical and electronic nature (see Plates 1 and 2). Petit Mal was conceived in 1989; construction began in 1992. Since its public debut in February 1995 it has proven to be reliable and robust; it has been shown at many festivals where it must interact with the public continuously for 8-hour days, for weeks at a time.
It was not my intention to build an artificially intelligent device, but to build a device which gave the impression of being sentient, while employing the absolute minimum of mechanical hardware, sensors, code and computational power. The research emerged from artistic practice and was thus concerned with subtle and evocative modes of communication rather than pragmatic goal based functions. My focus was on the robot as an actor in social space. Although much work has been done in the field of screen-based interactive art, the `bandwidth' of interaction in these works is confined by the limitations of the desktop computer. I am particularly interested in interaction which takes place in the space of the body, in which kinesthetic intelligences, rather than `literary-imagistic' intelligences play a major part. I conceive of artistic interaction as an ongoing conversation between system and user rather than the conventional (Pavlovian) stimulus and response model.
Acknowledging that there is no canon of autonomous interactive esthetics, Petit Mal is an attempt to explore the aesthetic of machine behavior and interactive behavior in a real world setting. Every attempt was made to avoid anthropomorphism, zoomorphism or biomorphism. It seemed all too easy to imply sentience by capitalising on the suggestive potential of biomorphic elements. I did not want this `free ride' on the experience of the viewer. I wanted to present the viewer with a phenomenon which was clearly sentient, while also being itself, a machine, not masquerading as a dog or a president.
I wanted to build a device whose physiognomy was determined by brutally expedient exploitation of minimal hardware. The basic requirements of navigation and interaction with humans determined the choice of sensors. The suite of sensors is absolutely minimal: three ultrasonics, three pyro-electrics, two very low resolution encoders and a low-tech accelerometer. The dicycle design offered the most expedient motor realisation for drive and steering, but demanded a low center of gravity to ensure stability. This swinging counterweight would have caused the sensors to swing radically, looking first at the ceiling then at the floor, so the sensors were mounted on a (passively stabilising) second internal pendulum. In this way the structure specified the necessary extrapolations to itself; the development of the mechanical structure was not a gratuitous design but a highly constrained and rigorous engineering elaboration based on the first premise of two-wheeled locomotion. The lower or outer pendulum carries motors, motor battery and motor drive electronics; the inner pendulum carries the sensors at the top, and processor and power supplies as counterweight in the lower part. The batteries are not dead weight but in both cases also function as the major counterweights. In an analogy to the semi-circular canals of the inner ear, an accelerometer at the pivot of the inner pendulum is a rudimentary proprioceptive sensor: it measures relationships between parts of the robot's `body'. It was important to me that this robot was `aware' of its body.
From the outset I wanted to approach hardware and software not as separate entities but as a whole. I wanted the software to `emerge' from the hardware, from the bottom up, so to speak. The code would make maximal utilisation of minimal sensor data input. Petit Mal has had four successive sets of code, each more subtle in its adaptation to the dynamics of the device and more effective in exploiting the minimal processor power (one 68HC11). My approach has been that a cheap solution (in labor, money or time) to a particular problem which was 70% reliable was preferable to a solution which was 90% reliable but cost several times as much. It was pointed out to me by an engineer that my `under-engineering' approach could lead to a much wider range of possible (though unreliable) solutions. The field of possibility is thereby expanded. Eventually such solutions could be refined. He was of the opinion that this approach could lead to better engineering solutions than an approach which was hindered by a requirement of reliability in the research phase.
In robotics circles one hears the expression `fix it in software' applied to situations when the hardware is malfunctioning or limited. This expression is emblematic of a basic precept of computer science and robotics: the separation of hardware and software, and the privileging of the abstract over the concrete. I attempted, in Petit Mal, an alternative to this dualistic structure. I believe that a significant amount of the `information' of which the `intelligence' of the robot is constructed resides in the physical body of the robot and its interaction with the world.
A `petit mal' is an epileptic condition, a short lapse of consciousness. The name was chosen to reflect the robot's extremely reactive nature: Petit Mal has essentially no memory and lives `in the moment'. My approach has been that the limitations and quirks of the mechanical structure and the sensors are not problems to be overcome, but generators of variety; the very fallibility of the system generates unpredictability. My experience has shown that `optimisation' of the robot's behavior results in a decrease in the behaviors which (to an audience) confer upon the device `personality'. In a sense, then, my device is `anti-optimised' in order to induce the maximum of personality. Nor is it a simple task to build a machine which malfunctions reliably, which teeters on the threshold between functioning and non-functioning. This is as exacting an engineering task as building a machine whose efficiency is maximised.
2.1 Behavior, interaction, agency
The example of Australian colonial painting (cited above) is pertinent to the explanation of people's behavior toward Petit Mal, and the way it changes. Almost invariably, people ascribe to Petit Mal vastly complex motivations and understandings which it does not possess. Viewers (necessarily) interpret the behavior of the robot in terms of their own life experience. In order to understand it, they bring to it their experience of dogs, cats, babies and other mobile interacting entities. In one case, an older woman was seen dancing tango steps with it. This observation emphasises the culturally situated nature of the interaction. The vast amount of what is construed to be the `knowledge of the robot' is in fact located in the cultural environment, is projected upon the robot by the viewer, and is in no way contained in the robot. The clear inference here is that, in practical application, an agent is first and foremost a cultural artifact, and its meaning is developed, in large part, by the user and is dependent on their previous training. This means that, in the final analysis, an agent is a cultural actor, and building an agent is a cultural act. Here the rarefied and closed proof system of science is ineffably forced into engagement with the world.
Such observations, I believe, have deep ramifications for the building of agents. Firstly, any effective agent interface design project must be concerned with capitalising on the users' store of metaphors and associations. Agents work only because they trigger associations in the user. So agent design must include the development of highly efficient triggers for certain desired human responses. In his painting Ceci n'est pas une pipe, René Magritte encapsulated the doubleness of symbols and the complexity of representation. This doubleness can be used to good effect in agent design: a very simple line drawing (of a pipe, for instance) triggers a rich set of associations in the user. However, for the same reasons, these associations, like any interface, are neither universal nor intuitive; they are culturally and contextually specific.
Another curious quality of Petit Mal is that it trains the user: due to the user's desire to interact, to play, no tutorial and no user manual is necessary. People readily adopt a certain gait, a certain pace, in order to elicit responses from the robot. Also, unlike most computer-based machines, Petit Mal induces sociality amongst people. When groups interact with Petit Mal, the dynamics of the group are enlivened. Readers from the agent research area might wonder at this point if the systems I describe might be appropriate for various sorts of application domains. I would respond: `probably not', nor is this my goal. I am interested in the modalities of interactive systems as new cultural environments. And I would reiterate my argument that because I am able to experiment without the constraint of total reliability or a pragmatic work-oriented goal, I can open up a wide field of possibilities, some of which may ultimately have application or relevance in pragmatic contexts.
3. Sympathetic Sentience
Sympathetic Sentience is an interactive sound installation which generates complex patterns of rhythmic sound through the phenomenon of 'emergent complexity'. Sympathetic Sentience is an attempt to build a physically real model of emergent complex behavior amongst independent units, which produces constantly changing sound patterns. As with Petit Mal, there was an interest in designing the most technologically minimal solution, in this case, for a system which would demonstrate persuasively `emergent' behavior.
Each of the 12 comparatively simple, identical electronic units alone is capable of only one chirp each minute. Rhythmic and melodic complexity develops through a chain of communication among the units. In the installation, each unit passes its rhythm to the next via infrared signal. Each unit then combines its own rhythm with the data stream it receives, and passes the resulting new rhythm along. Thus the rhythms and timbral variations slowly cycle around the group, increasing in complexity. The system is self-governing, after an initial build-up period, the system is never silent nor is it ever fully saturated.
The 12 units are mounted on the ceiling and walls of a darkened room. The experience of the visitor is of an active sound environment of 12 `channels' in which there is recognisable, but not predictable, patterning. The visitor can interrupt this chain of communication by moving through the space. This results in a suppression of communication activity and hence a reduction of complexity. A long interruption results in complete silencing of the whole group. When the interruption is removed, a new rhythm will slowly build up. The build-up of a new rhythm cycle can take several minutes. The rhythm cycles are never constant but continually in development. To gain a sense of the full complexity of the piece, it is necessary to spend several minutes with it in an uninterrupted state.
3.1 Technical Realisation
Several iterations of the work have been built. Sympathetic Sentience One was built entirely in hardware logic (TTL ICs). The basic premise is extremely simple: each unit is receiving, processing and forwarding a continuous stream of data. Each unit 'edits' that stream 'on the fly', adding or omitting an occasional bit. This editing is done in such a way that the 'density' of the sound is 'self-governing'. The critical part of each unit is an exclusive OR gate. On each unit, the signal is received by an IR receiver, demodulated and sent to a shift-register (delay). Emerging from the delay it meets a feed from the on-board oscillator at the exclusive OR gate. The signal emerging from the gate goes to both the IR emitter and the audio amplification circuit. The units communicate in modulated infrared signals using hardware similar to that used in TV remote controls.
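The mechanism is simple enough to sketch in software. The following is a loose simulation, not the TTL circuit: the ring of 12 units and the delay/XOR chain come from the description above, but the shift-register length, chirp rate and bit-level timing are assumed values for illustration.

```python
import random

N_UNITS = 12       # units in the ring (from the text)
DELAY = 16         # shift-register length -- an assumption; the text gives no figure
STEPS = 2000
CHIRP_RATE = 0.01  # probability a unit's oscillator emits a 1 on a given tick (assumed)

# each unit's shift register, initially silent
delays = [[0] * DELAY for _ in range(N_UNITS)]
# the bit currently travelling from unit i to unit i+1
links = [0] * N_UNITS

for _ in range(STEPS):
    new_links = []
    for i in range(N_UNITS):
        # receive from the previous unit in the ring and push into the delay
        incoming = links[(i - 1) % N_UNITS]
        delays[i].append(incoming)
        delayed = delays[i].pop(0)
        # the delayed stream meets the local oscillator at the exclusive OR gate:
        # a chirp switches a silent stream on, but also cancels a saturated one,
        # which is what makes the density self-governing
        osc = 1 if random.random() < CHIRP_RATE else 0
        new_links.append(delayed ^ osc)  # goes to both IR emitter and speaker
    links = new_links

density = sum(sum(d) for d in delays) / (N_UNITS * DELAY)
print(f"steady-state density of 1-bits: {density:.2f}")
```

After the initial build-up the density of active bits settles between silence and saturation, since a chirp arriving against a 1 erases it rather than adding to it.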
While in Sympathetic Sentience One, only the rhythmic patterns were subject to change through the emergent complex behavior, in Sympathetic Sentience Two, other sound characteristics such as pitch and envelope are also subject to gradual change through the emergent complex process. To achieve this, Sympathetic Sentience Two uses small microprocessors (PICs) to replace the hardware logic.
Whether this behavior is deemed to be `emergent' is a matter of previous experience. Most visitors find it reminiscent of the sound of communities of frogs, crickets or cicadas. But to at least one rather dry observer, it was simply a chaotic system of a certain numerical order. To another it was a demonstration of one model of neural propagation. Here emergence would seem to be `in the eye of the beholder'.
The term `emergence' seems to be defined rather loosely, even in scientific texts. In some cases it is applied to the interaction of two (or more) explicit processes which result in a third `emergent' process which was, however, entirely intended. Similarly, the fitness landscapes of Stuart Kauffman establish a desired end condition (Kauffman, S. 1993). This would seem to be a rather different and narrower sense of emergence than that of the termite community, though attempts to reproduce such behavior in programmable models, such as the stigmergic multi-robot systems of Beckers, Holland and Deneubourg, reduce the complex interactions to deterministic events (Beckers, R., Holland, O. and Deneubourg, J. 1994). The paradigmatic `emergent' systems are the development of the mind/brain and the process of genetic evolution. The difference here is that these systems are open-ended; goal states are not specified.
4. Fugitive
Fugitive is a single-user spatial interactive environment. The arena for interaction is a circular space about 10m in diameter. A video image travels around the walls in response to the user's position (see Plate 3). This is the simplest level of interactive feedback: the movement of the image, tightly coupled to the movement of the user, is an instantaneous confirmation to the user that the system is indeed interactive. The behavior of the system is evasive; the image, in general, runs away from the user. The user pursues the image. Over time the response of Fugitive becomes increasingly subtle and complex (constrained by the need to be `self-teaching', to continually more or less make sense to the user). A user must spend almost 15 minutes to get through the full seven chapters and elicit the most complex system responses.
The user is totally unencumbered by any tracker hardware; sensing is done via machine vision using infra-red video. The space is lit with 13 infra-red floodlights. User tracking is achieved via a monochromatic video camera mounted vertically, looking up into a semi-circular mirror suspended in the center of the room. Preliminary vision processing occurs on a PC. Two streams of serial data are output. Simple angular position data is sent to the custom PID motor control board to drive the projector rotation motor. Values for MAE calculations are sent to the MAE2 (Mood Analysis Engine 2) running on an SGI O2 computer. On the basis of this calculation, the VSE (Video Selector Engine) selects, loads and replaces digital video on a frame-by-frame basis. Video data is fed to the video projector.
The user is engaged in a complex interaction with the system. The basic logic of interactive representation in Fugitive amounts to this: user movement is represented by camera movement within the image, and image movement across the wall. The segueing of image content and its physical location is the `expression' of the system. The output of the Mood Analysis Engine controls the flow of digitised video imagery in such a way that no two people walking the same path in the installation will produce the same video sequence, because their bodily dynamics are different. The system responds to the dynamics of user behavior and their transitions over time. Ideally, the system responds not simply to changes in raw acceleration or velocity or position, but to kinesthetically meaningful but computationally complex parameters like directedness, wandering or hesitancy. This is achieved in a multi-stage process of computationally building up the complexity of parameters. The input-level data from the vision system is limited to raw position in each frame. From this, simple values for velocity and acceleration are calculated. A third level of more complex parameters is then constructed: average acceleration over various time frames, variance and so on. Finally, values for various combinations of these parameters are used to determine the entry and exit points for `behaviors' which are matched to video selections.
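This multi-stage build-up might be sketched as follows. The staging (raw position, then velocity and acceleration, then windowed statistics, then behavior gating) follows the description above; the window size, the thresholds and the mood labels are illustrative assumptions, not the actual Mood Analysis Engine.

```python
import math
from collections import deque

# first level: raw per-frame (x, y) positions from the vision system
# (a synthetic spiral walk stands in for real tracking data here)
positions = [(math.cos(t * 0.1) * (1 + 0.01 * t), math.sin(t * 0.1))
             for t in range(200)]

def derivative(series):
    """Frame-to-frame differences: positions -> velocities -> accelerations."""
    return [(b[0] - a[0], b[1] - a[1]) for a, b in zip(series, series[1:])]

# second level: simple kinematic values
velocities = derivative(positions)
accelerations = derivative(velocities)  # windowed stats could equally run over these
speeds = [math.hypot(vx, vy) for vx, vy in velocities]

# third level: windowed statistics over a time frame (window size is assumed)
WINDOW = 30
window = deque(maxlen=WINDOW)
mean_speed, speed_variance = [], []
for s in speeds:
    window.append(s)
    m = sum(window) / len(window)
    mean_speed.append(m)
    speed_variance.append(sum((x - m) ** 2 for x in window) / len(window))

# final level: combinations of parameters gate entry into a `behavior'
# (thresholds and labels are hypothetical, for illustration only)
def mood(mean_s, var_s):
    if mean_s < 0.05:
        return "still"
    if var_s > 0.001 and mean_s < 0.15:
        return "hesitant"
    return "directed"

moods = [mood(m, v) for m, v in zip(mean_speed, speed_variance)]
print(moods[-1])
```

The point of the staging is that each level is cheap to compute from the one below it, yet the top level begins to describe something kinesthetically meaningful rather than a raw coordinate.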
The images do connect with some small degree of semantic significance, there is a minimal hypernarrative, but characterisation and plot structure were explicitly avoided. The chosen imagery is landscape, each `chapter' being a specific location at a specific time of day. A hypertextual structure and a logic of transition links one `chapter' or location with the next. As time progresses, the user propels themselves through seven location chapters. A formal garden sequence is a kind of `vestibule'. You go there at the beginning and return there between each chapter. When you get to the center, the projector slowly rotates and shows you a series of archways. You choose to set out from the center (metaphorically through one of the archways) and you make the transition into a new chapter. This is the only case in which particular imagery is connected with a specific location in the room. When you have explored the chapter adequately (as determined by the system), you transition back into the `garden'. All other video material is located `temporally' and triggered dynamically rather than positionally. This reinforces the continuity of body and time, against the continuity of an illusory virtual space. The output of the system is completely free of textual, iconic, or mouse/buttons/menus type interaction.
In building Fugitive, my concern was with the aesthetic of spatial interactivity, a field which I regard as being minimally researched. Watching spatial interactives over several years, I was frustrated by the simplistic nature of interaction schemes based on raw instantaneous position and simple state-machine logic. I wanted to produce a mode of interactivity which did not require the user to submit to a static Cartesian division of space (or simply groundplane). I wanted to make an interactive space in which the user could interact with a system which `spoke the language of the body', and which critiqued VR and HCI paradigms by insisting on the centrality of embodiment. I wanted to develop a set of parameters which could be computationally implemented, which truly reflected the kinesthetic feeling of the user, their sense of their embodiment over time. Fugitive is an attempt to build an entirely bodily interactive system which interprets the ongoing dynamics of the users body through time as an expression of mood. I called this part of the code (somewhat tongue-in-cheek) the Mood Analysis Engine.
4.1 Immersion and Embodiment.
One of my `covert' goals was to critique the rhetoric of immersion in VR by building a system which continuously offers and collapses such an illusion. The last decade of rhetoric of virtualisation probably leads users to expect or hope for some kind of immersion in a coherent virtual world. Fugitive explicitly contradicts this expectation by setting up temporary periods in which the illusion of immersion is believable, and then breaking the illusion. If the user moves in a circumferential way, the illusion of a virtual window on a larger world is created. As you move, say, to the left around the perimeter, you will see a pan as a moving `virtual window'. As you continue, it will segue into another pan. If you reverse your direction, the same pan will occur in reverse, but when you get to the beginning of pan 2, you segue to pan 3, not pan 1. In this way the illusion of a virtual world, seen through a virtual window, is collapsed.
In conventional systems, the illusion of immersion is positional, the absolute position of the tracker (etc) corresponds to a specific location in the virtual world. Such a virtual world, a machinic system, maintains a rather repressive continuity: the continuity of the illusory architectural space. In Fugitive, the continuity of the system is a phenomenological one focused on the continuity of embodiment, not the instrumental one of a consistent virtual space in which the body is reduced to little but a pointer. Fugitive is not positional, the primary and structuring continuity is the deeply subjective continuity of embodied being through time.
4.2 Embodied looking: imagery as the voice of the agent.
Fugitive is about the act of looking, embodied looking, and it is about the metaphorisation of looking via video. The title `Fugitive' emphasises the evanescence of the experience of embodied looking. The attempt is, rather perversely, to avoid eliciting the kind of externalised interest in imagery and subject matter which one has when looking at a painting. This is because the goal is always to fold the attention of the user back onto their own sense of embodiment and the functioning of the system in relation to their behavior. Fugitive is not primarily a device for looking at pictures (or video); it is not a pictorial hyper-narrative. It is a behaving system in which the video stream is the `voice' of the system.
I want the user to see `through' the images, not to look only at the `surface' of the images. Strictly speaking, this meant I should choose imagery that was inherently uninteresting. The exercise is of course fraught with paradox, especially for the scopically-fixated viewer. The user is presented with a darkened circular space whose only changing feature is an image, and yet the user is encouraged to understand the image primarily as an indicator of the response of an otherwise invisible system.
4.3 The Auto-pedagogic Interface
An interactive work is a machine, and one must learn to operate a machine. But visitors to artworks are seldom previously trained. Although prior training has become a part of theme park amusements, nobody wants to do a tutorial or read a manual before they experience an artwork. Nor do I find it acceptable for the user to have to don `scuba gear' (to borrow Krueger's term) before entering the work. A user should be able to enter unencumbered by special clothing or hardware. So a central issue in interactive art is managing the learning curve of the user. One solution is to make a work so simple in the dynamics of interaction that it is easy to understand, but immediately boring. Alternatively, works can be so complex that the average user cannot discern the way in which they are controlling or affecting the events; the work appears random. In avoiding these two undesirables, the artist must either choose a well known paradigm (such as monitor-mouse-buttons or automobile controls), or, if one desires a novel modality of interface, the user must be trained or the system must teach the user.
I cannot endorse the concept of the `intuitive' interface because it implies a naive universalism and an ignorance of cultural specificity, aspects of which I noted in my discussions of `techniques of the user' and colonial painting. In Petit Mal I discovered that if the user is propelled by a desire to interact, learning will occur in an unimpeded and transparent way. In Fugitive, I attempted to formally produce this effect in a much more complex system. Such an `auto-pedagogic' interface must present itself as facile to a new user, but progressively and imperceptibly increase in complexity as the familiarity of the user increases. Transitions to higher complexity should be driven by indicators of the behavior of the user.
In the current implementation of Fugitive, in order to ensure that the `interface' be `auto-pedagogic', the system exhibits only two behaviors at the beginning. Others are introduced along the way, and control of transitions becomes more complex. In future implementations, system behavior will be more `intelligent': the system will behave as an agent which learns and expresses certain `desires'.
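The staging scheme just described can be sketched schematically. This is an assumed structure for illustration only, not Fugitive's actual implementation: the system exposes only two behaviors at first, and unlocks further ones when indicators of the user's familiarity cross a threshold.

```python
# Hypothetical sketch of an auto-pedagogic interface (illustrative
# names and thresholds; not the actual Fugitive code). Behaviors are
# ordered from simple to complex and unlocked imperceptibly as the
# user's apparent familiarity grows.

class AutoPedagogicInterface:
    def __init__(self, behaviors, initial=2, threshold=5):
        self.behaviors = behaviors    # ordered simple -> complex
        self.active = initial         # how many are currently enabled
        self.threshold = threshold    # interactions needed per unlock
        self.familiarity = 0

    def register_interaction(self, deliberate):
        # Count only deliberate-looking movement as evidence of familiarity.
        if deliberate:
            self.familiarity += 1
        if (self.familiarity >= self.threshold
                and self.active < len(self.behaviors)):
            self.active += 1          # imperceptibly raise complexity
            self.familiarity = 0      # reset the counter for the next stage

    def available(self):
        return self.behaviors[:self.active]
```

The key design choice is that the transition is driven by indicators of the user's behavior, not by elapsed time, so a hesitant visitor is never confronted with complexity they have not earned.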
4.4 Poetics of interaction
The degree to which the changes in output are interpreted by the user as related to their behavior is a key measure of the success of any interactive system. Ideally, changes in the behavior of the system will elicit changes in the user's behavior, and so an ongoing `conversation', rather than a chain of `Pavlovian' responses, will emerge. An artwork is by definition not literal or didactic; it is concerned with poetic and metaphoric associations. So an interactive artwork should not simply tell you something like `you have mail'. Nor would it be interesting if Fugitive told you: `you just moved two paces left'. The goal is to establish a metaphorical interactive order in which the user's movement `corresponds' to some permutation of the output. It is all too easy to produce a system whose behavior the user cannot distinguish from random behavior. The designer must successfully communicate that the user is having a controlling effect on the system, and at the same time engage the ongoing interest of the user with enough mystery. One hopes for some poetic richness which is clear enough to orient the user but unclear enough to allow the generation of mystery and inquisitiveness. The system must engage the user; the user must desire to continue to explore the work. This is a basic requirement of any artwork.
4.5 The paradox of interaction.
Representation of the response of the system back to the user is key to any interaction. Not only must one reduce human behavior to algorithmic functions, but one must be able to present to the user a response which can be meaningfully understood as relating to their current behavior. One can collect enormous sets of subtle data, and interpret it in complex ways, but if it cannot be represented back to the user in an understandable way, it is ultimately useless.
Having collected complex data with multiple variables, how do you build a rule-based system which establishes such fluid correspondences when the database is a finite body of fixed video clips? The impossibility of this task was resoundingly brought home to me while making Fugitive. The sophistication of the response of the system had to be scaled back to a point where it could be represented by the video, constrained by the limitations of the rule-based system which organises those clips into classes, and by the range of likely or possible behaviors in that circular geometry.
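The scaling-back problem can be made concrete with a small sketch. The class names, thresholds, and clip names below are purely illustrative assumptions: the point is only the shape of the reduction, in which continuous tracking data collapses into the few discrete classes that a finite clip library can actually represent.

```python
# Illustrative sketch of reducing rich movement data to clip classes
# (all names and thresholds are invented for this example; they are
# not Fugitive's actual rules or clips).

import random

def classify(speed, dwell):
    """Collapse continuous tracking data into a coarse behavioral class."""
    if dwell > 4.0:         # user has been still for a while
        return "still"
    if speed > 1.5:         # rapid circumferential movement
        return "rushing"
    return "wandering"

# A finite library of fixed clips, organised into classes by the
# rule-based system; each class can only gesture at the behavior it names.
CLIP_CLASSES = {
    "still":     ["slow_pan_01", "slow_pan_02"],
    "wandering": ["drift_pan_01", "drift_pan_02"],
    "rushing":   ["whip_pan_01"],
}

def choose_clip(speed, dwell, rng):
    """Pick a clip from the class matching the user's current behavior."""
    return rng.choice(CLIP_CLASSES[classify(speed, dwell)])
```

However finely the input is measured, the output resolution is fixed by the number of classes and clips, which is exactly why the system's expressive range had to be scaled back to what the video could represent.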
But images are complex things. Many types of information can be extracted from a single still image, let alone a moving image sequence. A major difficulty in the interactive scheme of Fugitive is for the user to determine which aspects of the images presented signify the expression of the system. Is the presence of red significant, the presence of water or a tree? Is it a question of the direction of movement of various objects in the image, or the quality of the light? In Fugitive, subject matter, color, etc. do not carry meaning about the state of the system. The aspect of the image which is the `voice' of the system is camera movement.
An artwork, in my analysis, does not didactically supply information; it invites the public to consider a range of possibilities, and it encourages independent thinking. So building an interactive artwork requires more subtle interaction design than does a system whose output is entirely pragmatic, such as a bank automat. My work over the past decade has focused upon: the aesthetic design of the user experience, given the diversity of cultural backgrounds and thus of possible interpretations; the development of embodied interaction with systems where the visitor is unencumbered by tracking hardware; and the development of paradigms of interaction which go beyond state machine models to embrace and exploit Alife, emergent and social agent models. There is some divergence in current definitions of `autonomous agents', and even more around the term `socially intelligent agents'. While the works I have discussed are only marginally agents in the sense of "self-constructing, conscious, autonomous agents capable of open-ended learning", they do demonstrate a rich and complex interaction with the user.
I have emphasised the relevance of artistic methodologies to the design of social agent systems. Typically, artistic practice embraces an open ended experimental process which allows for expansive inventive thinking. Artistic practice emphasises the cultural specificity of any representational act, acknowledging that meaning is established in the cultural environment of the interaction, not in the lab. It emphasises the embodied experience of the user. And it emphasises the critical importance of the `interface', because the interface of the agent, like an artwork, is where communication finally succeeds or fails.
Brooks, R. 1991 Intelligence Without Reason AI Memo #1293,(April) MIT
Crary, J. 1992 Techniques of the Observer (October) MIT
Dreyfus, H. 1992 What computers still can't do. MIT
Johnson, M. 1987 The Body in the Mind, University of Chicago
Kauffman, S. 1993 The origins of order Oxford University Press
Varela, F. Thompson, E. and Rosch, E. 1993 Embodied Mind, MIT
Beckers, R., Holland, O. and Deneubourg, J. 1994 From Local Actions to Global Tasks: Stigmergy and Collective Robotics. In Artificial Life IV, ed. Brooks and Maes, MIT, p181
Cariani, P. 1993 To Evolve an Ear: Epistemological Implications of Gordon Pask's Electrochemical Devices. Systems Research Vol 10 No 3, pp 19-33
Penny, S. 1993. Machine Culture in ACM Computer Graphics SIGGRAPH93 Visual Proceedings special issue. pp109-184
Penny, S. 1997 The Virtualisation of Artistic Practice: Body Knowledge and the Engineering Worldview. CAA Art Journal Fall97, Vol56#3, Guest Editor Johanna Drucker pp30-38.