Vol. 5, No. 1 (2011)





First-Person Shooters: Immersion and Attention

Mark Grimshaw, John P. Charlton, Richard Jagger

Eludamos. Journal for Computer Game Culture. 2011; 5 (1), pp. 29-44





The first-person shooter (FPS) game is often sold on its immersive potential and players' online forums are replete with opinions, discussions and polls as to the most immersive games experienced (for example, GameTrailers.com; Bohemia Interactive). In the United Kingdom alone, FPS games across desktop and laptop computers and a variety of dedicated gaming consoles have a large share of what, in 2008, was a £4 billion market; the FPS game Call of Duty: Modern Warfare 2 (Infinity Ward 2009) was the top computer game seller in the United Kingdom in 2009 and, in its first sales week alone, grossed over £67 million. Being able to design a game that will immerse the player more quickly and/or to a greater extent than previous games will clearly give commercial advantage.

What is immersion though? For such a widely used term, through Virtual Reality (VR) to the context of FPS games, there is a highly diverse understanding of the meaning of the term. This variable interpretation pertains not only to those who play computer games but also to those who design and study immersive environments. On the GameTrailers.com forum referenced above, and in response to a question raised as to the meaning of the word 'immersion', one forum poster (incorrectly) explains it as 'huge/big/enormous' while another (probably more correctly) opines that it is 'the state of consciousness when an immersant's awareness of physical self is diminished or lost by being surrounded in an engrossing total environment, often artificial'. Academics, equally, have wide interpretations of the term and there is, as yet, no widely established model providing a definition and describing the process.

Within the study of immersion in computer games, some explanations transfer theories of presence and telepresence from the field of VR to the related, but nevertheless different, experiences of gameworlds. VR presence is defined by Reiner and Hecht (2009) as 'a sense of being and acting inside a virtual place' as might be the case in 'fully surrounding and immersive VE systems'[1] where the user wears a visually-enclosing head-mounted display or is fully surrounded by software-generated images such as in the environment of a CAVE[2] (p.193). This is the typical definition of presence in VR environments and, while it raises more questions than it answers, conceptually it can be traced at least as far back as the eighteenth-century philosopher Diderot's description of absorption in painting: the beholder's physical presence in front of the painting is obliterated and the presence is transported to within the painting whereby beholder and painting become 'a closed and self-sufficient system' (Fried 1980, pp.131-132). Reiner and Hecht also provide an alternative definition of VR presence in 'non-immersive VE systems', such as projection tables and haptic VR, in which presence is the haptic 'sense of being able to touch and manipulate a virtual object' (p.183), but this definition and its environment are not germane to a discussion of immersion in FPS games. Fencott (1999) suggests that, in the sense of being in the virtual world, presence has a cognitive basis and is the 'direct result of perception rather than [physically-based] sensation [and] the mental constructions that people build from stimuli are more important than the stimuli themselves'. This cognitive basis to presence is taken up by Slater (2002) who proposes a Gestalt-based presence theory whereby those experiencing a VR world select among a set of presence hypotheses in order to decide where they feel present. Conversely, Brenton et al. (2005) suggest that such hypotheses are not switched wholesale but are superimposed; our perception of presence comes from whichever hypothesis is dominant at that time. This ability to foreground may help explain why it is possible to have the 'sense of being and acting inside a virtual place' (such as an FPS gameworld) while simultaneously being aware, to a lesser extent, of the background (such as the world beyond the periphery of the monitor). Additionally, as we shall see, such a theory bears similarity to the theory of selective attention.

The above general definition of presence has been supplemented and adapted to fit the virtual environments of computer games.  Jennett, Cox and Cairns (2009) use the idea of real world dissociation to partly explain immersion in a bespoke 2D computer game (see also Jennett 2010).  The cognitive basis for immersion is supplemented with a sensory basis by, among others, Ermi and Mäyrä (2005) who provide a general definition of immersion in computer games ('becoming physically or virtually a part of the experience itself') and who give three possible forms of immersion: sensory, challenge-based, and imaginative.  Carr (2006) likewise makes a distinction between psychological immersion and sensory immersion.  In the latter case, she defines it as the player's senses being monopolized by the gameworld whereas psychological immersion 'involves the participant becoming engrossed through their imaginative or mental absorption' (p.69).[3] The notion of a physical and sensory immersion in the gameworld has been taken up by Grimshaw (2008a, in press) who uses game sound to construct a model of the acoustic ecology (of which the player is a contributing component) of the FPS game.

McMahan (2003) relates the degree of immersion to the degree of interactivity between player and gameworld while also including realism as one of the defining structures of immersion.  For immersion to occur, players 'must have a non-trivial impact on the environment […] immersion is not […] wholly dependent on audio or photo realism' (p.68-69).  For McMahan, such realism should be supplemented by a social realism, a plausibility of theme, for example, in addition to a more perceptual (rather than sensory) realism.  An example of such perceptual realism might be the use of Western artistic perspective to provide the illusion of a 3D space on the 2D monitor.  As regards McMahan's requirement of audio or photo realism and the level of authenticity required, Grimshaw (2008b) suggests that not authenticity but an appropriate level of verisimilitude is required.  Related to Grimshaw's concept of the FPS acoustic ecology incorporating the player, Calleja (2007) proposes the term 'incorporation' rather than immersion and proceeds to construct an Involvement Model derived from Goffman's metaphor of the frame.  Six frames of involvement are required for Calleja's incorporation (tactical, performative, affective, shared, narrative, and spatial) among which the player fluidly switches.

The player's internalization of these frames (and consequent incorporation) bears some similarity to Csíkszentmihályi's (1990) theory of flow. Flow, the concept of utterly focused motivation in a performance or learning task enabled through the harnessing of emotion, is related to the notion put forward by Kearney and Pivec (2007) that the motivation for the player's repeated engagement in the gameworld is provided by immersion (an idea supported also by Jennett [2010]).  According to the authors, emotion is a factor in immersion and this is suggested also by Calleja's affective frame in his Involvement Model and by Grau (2003) who defines immersion as a combination of 'diminishing critical distance […] and increasing emotional involvement' (p.13). Kearney and Pivec devised an experiment to measure immersion on the assumption that the less the player's eye movement and the lower the player's blink rate, the greater the immersion. Others, too, have used experimentation, usually combined with questionnaires, in an attempt to provide a psychophysiological basis for immersion and various emotions experienced during gameplay (for example, Shilling et al. 2002; Grimshaw et al. 2008; Nacke et al. 2010).  Such experimental data are still sparse and typically depend upon pre-defined definitions of immersion particularly where a parallel questionnaire directly asks participants about their experience of immersion.  The results of similar ongoing experiments, though, may well lead to a more refined and accurate definition.

The following themes can be teased out from the brief overview of immersion research presented above, particularly where they have relevance to the type of virtual environment found in FPS games. There is the notion of being in the gameworld, of incorporation within a virtual ecology. This incorporation presupposes an ability on the part of the player to have an effect on the gameworld, its environment and objects no less than the gameplay; indeed to contribute events, such as the triggering of sounds, and actions that perturb the environment - in effect, to participate in the construction and dynamism of the game's ecology. Whether this incorporating and active presence is purely cognitive and perceptual or physical and sensory is open to debate but the answer is likely to lie in a combination of the two modes. Allied to this is a framework of realism applied to the gameworld. This is not necessarily a matter of rigorous authenticity (in the recording and playback of audio samples, for example) but is more an artefact of verisimilitude. In the context of immersive VR environments, Slater (2007) explains the sense of presence in the environment as arising out of the ability of the user to 'respond [to the VR environment] as if it were real'. As previously mentioned, such environments attempt to artificially monopolize the user's senses - in the case of vision, through the use of a head-mounted VR device, for example. In FPS games, as with most other computer games (certainly all consumer games), although the player's auditory attention can be monopolized by the game through the use of headphones, the game's visual sphere, presented on a flat monitor occupying a small percentage of the player's visual field, must compete with the real-world environment visible beyond the game hardware. This is where presence hypotheses, either switched (the tautological term total immersion) or dominant (the oxymoronic term partial immersion might be appropriate here), prove useful. The question remains: what is the mechanism for hypothesis selection?


FPS game elements and their potential to contribute to immersion

In order to facilitate immersion, FPS games depend upon image and sound for the sensory experiences they provide but manufacturers and games companies are seeking new interfaces that will enable immersion through other senses. Touch has been modestly exploited in some games (not merely FPS games) for several years, and this mode continues to be developed (Zyga 2009, for example), but it will surely not be long before taste and smell become part of the immersive game designer's arsenal (Madrigal 2009). Other game elements, not primarily sensory, contribute to player immersion: plot and story (even mere backstory); non-player character (NPC or 'bot') artificial intelligence; and, importantly, a performative game space within which the player can act and interact. This last is significant and is not solely a matter of a three-dimensional visual representation in which the player can navigate and, in the FPS game, interact in a friendly or violent manner with other players and/or bots. Aiding the illusion of being present within the gameworld displayed on the screen, the FPS game typically gives the player a first-person perspective[4] in which an arm (or pair of arms) clutching a weapon recedes from the player into the perspective of the game world - a virtual prosthetic (Grimshaw, in press). Aurally, the player has the listening perspective of a first-person auditor (Grimshaw 2008a) at the centre of the FPS game's acoustic environment.

In addition to the first-person position of the player's character and conventions such as Western artistic perspective, a typical FPS game utilizes many visual cues or sureties (to use McMahan's term) that will be recognizable to the player from the space outside the gameworld (we will call this space the 'real world' for now). These include a variety of geometrical shapes which may form themselves into landscapes, plants, vehicles, buildings, obstacles such as crates or access points such as doors. Even depictions of space and planetary environments are recognizable, if not from personal experience, then from astronomy or the conventions of cinema. Even the most outlandish weapons, impossible to manufacture in the real world, conform to visual expectations in having a 'business' end and a means of engagement such as a haft or hilt with which to thrust or a trigger to pull. Realism FPS games scrupulously visually model their real world correlates: World War II FPS games (early versions of the Call of Duty franchise, for example) are fetishistic in their attention to period detail in uniform, weapons, land and air vehicles among other visual artefacts. Realism goes only so far though: the violence, for example, is contained within the game hardware and players' characters, in an effort to balance realism with gameplay satisfaction, are typically more immune to gunshot wounds and explosions than humans would be.

Sound, too, plays a role in fostering the conditions for player immersion.  We have already mentioned the position of the player in the acoustic environment of the game: as in the real world, where the visual space is restricted to what can be seen with stereoscopic vision, the aural space is unrestricted with reference to the position of heard sound sources.  Sound can be heard from any direction and the modern sound production systems available with most FPS games and the reproduction systems that players use (surround sound speakers or headphones) are capable of providing the player with increasingly accurate sound location discrimination.  The current trend is for game sound engines to be used to process the game's audio samples with reverberation and echoes that approximate those that one might expect to hear in real world analogues of the game's visual spaces.  Actions and events that can be seen typically result in a sound whose intensity falls off with increasing virtual distance and, as with the visual aspects, the sound of realism FPS games is closely modelled on (if not recorded from) the authentic real world sound.
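The fall-off of sound with virtual distance described above can be illustrated with a minimal sketch. It assumes a simple inverse-distance law (amplitude proportional to 1/distance, hence intensity proportional to 1/distance squared), which is one common textbook model and not necessarily the one used by any particular FPS audio engine; the function names and the reference distance are hypothetical choices made for illustration only.

```python
import math


def distance_gain(distance, ref_distance=1.0):
    """Return a linear amplitude gain for a sound source at `distance`.

    Assumption: a plain inverse-distance law, clamped so that sources
    closer than `ref_distance` play at full level (gain 1.0).
    """
    if distance < 0 or ref_distance <= 0:
        raise ValueError("distance must be >= 0 and ref_distance > 0")
    if distance <= ref_distance:
        return 1.0
    return ref_distance / distance


def gain_to_decibels(gain):
    """Convert a linear amplitude gain to decibels."""
    return 20.0 * math.log10(gain)


if __name__ == "__main__":
    # Each doubling of distance lowers the level by roughly 6 dB.
    for d in (1.0, 2.0, 4.0, 8.0):
        g = distance_gain(d)
        print(f"distance {d:4.1f}: gain {g:.3f} ({gain_to_decibels(g):6.2f} dB)")
```

Under this assumed model, every doubling of virtual distance halves the amplitude, which is one way a distant explosion can be made to sound plausibly quieter than a nearby one without any claim to acoustic authenticity.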

Within a game, the player must be able to play. In the FPS game, this is enabled through Cartesian and performative spaces that enable a great degree of latitude of character movement[5] and contain within them the possibility to carry out the actions required for gameplay.  Some FPS games subtly direct players through a labyrinth (which may be hidden in the game's structural layer and therefore not visible to players) in order to progress play from stage A to stage B whereas others are open, free-ranging spaces within a containing boundary.  Guns can be fired, supplies picked up, obstacles jumped over, doors opened, ladders climbed, team-mates healed and enemies 'fragged': all actions possible in the real world (according to circumstance) and some games go further by providing actions possible only in science fiction (such as the teleporters of the Quake series whose use is recognized from TV and cinema).  Thus, many aspects of the game are mediated through codes of realism as they are perceived by the player.  Game elements and game actions behave in ways that simulate that behaviour in the world outside the game.

The technology, though, imposes limits at the same time as it enables possibilities unachievable outside the gameworld. As regards the latter, the durability and typical FPS reincarnation of the player's character has already been mentioned. A further example is that of gravity: in Quake 3 Arena (id Software 1999) and other FPS games, it is possible to adjust the game engine variable controlling the amount of Earth gravity simulation in the gameworld with the result that multi-storey buildings can be traversed in a single bound. In terms of limitations, there are several and these may be overcome as the technology improves or otherwise changes. Visually, the FPS game provides a restricted field of view that itself is contained within the monitor's screen around which elements of the player's world can still be seen. Aurally, no game audio engine is yet capable of providing the richness and variability of real-world acoustic environments: the adaptation of procedural audio techniques, rather than the exclusive use of audio samples, may go some way to addressing this. One must always question, though, the need for emulation as opposed to simulation. How real is real enough especially if the game is to remain a game (see Grimshaw [2008b] for a discussion on perceptual realism and sound in FPS games)?
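The gravity example mentioned above can be made concrete with a short worked sketch. It assumes idealized projectile motion, in which a character leaving the ground with vertical speed v under constant gravity g reaches an apex height of v squared divided by 2g. The numbers used are illustrative values in arbitrary game units and are not taken from any specific engine.

```python
def jump_apex_height(jump_speed, gravity):
    """Apex height of a jump under constant gravity: v^2 / (2 * g).

    Assumption: idealized physics with no air control or drag; both
    arguments are in arbitrary, hypothetical game units.
    """
    if gravity <= 0:
        raise ValueError("gravity must be positive")
    return jump_speed ** 2 / (2.0 * gravity)


if __name__ == "__main__":
    normal = jump_apex_height(jump_speed=270.0, gravity=800.0)
    lowered = jump_apex_height(jump_speed=270.0, gravity=100.0)
    # Apex height is inversely proportional to gravity, so cutting
    # gravity to one eighth raises the jump eightfold.
    print(normal, lowered, lowered / normal)
```

Because apex height is inversely proportional to gravity in this model, a modest change to a single variable is enough to turn an ordinary hop into a leap over a building.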


Attention: An important issue in immersion

Although, as discussed above, there is much debate as to the definition of immersion, elements common to all discussions of the topic involve the ideas that players are involved in a game to the extent that they do not notice stimuli in their physical surroundings (other than those presented by the gameworld) and become oblivious to the passage of time. These two critical points imply that a person's attention is completely focussed upon the gameworld, and therefore, in considering how designers can increase the immersive properties of games, it is germane to examine the psychology of attentional processes.

Attention is central to cognition since the issue of where processing resources are directed and maintained in the environment is critical for goal-directed behaviour. A second important aspect is that stimuli not relevant to a task can be disregarded or 'filtered out'. This selective attention, or inhibition of extraneous detail, allows us to remain free from distraction and prevents us from becoming 'overwhelmed' by sensory input, thus enabling us to complete tasks and attain goals effectively. Central to this idea of process allocation is the notion that some elements in the environment can be dealt with in a relatively automatic way, while other aspects require a degree of controlled processing. That is, routines that are well learned eventually become relatively automatic, requiring little attentional control. They are relatively 'set', whereas new or difficult situations require control operations to direct the appropriate resources to the task, which allows processing to adapt to the situation.

One explanation of automatic processing is that it relies on access to previously stored knowledge in memory, while controlled processing relies on access to the active and flexible working memory system (Logan 1988). In a capture the flag FPS game, teams start at opposing bases and, in the course of playing the game many times, players learn routes (using representations of the game level layout stored in memory), learn defensive positions and attack tactics, the position of health packs and ammunition and so on.  Opening moves can often be limited and automated - spawn, collect weapons, take up defensive or attacking position, for example - but, as the game progresses, the number of possible moves is increased.  The use of stored knowledge and reliance on combinations of learned routines and controlled processing is as true for FPS players as for chess players (an oft-used example).  Vastly experienced players, including grandmasters, appear to have more direct access to 'chunks' of information in long term memory; these are remembered patterns that have been previously encountered. This access to the stored representations dramatically reduces the ongoing attentional demand.

In a gaming situation, automation is consistent with the idea that tasks and sequences of actions that were once found difficult to execute eventually become learned routines that can be completed relatively effortlessly as the appropriate remembered procedure or information is retrieved. With experience, these actions are then completed faster and require less attentional allocation. However, the learned automatic responses to a specific situation in a game are set and that knowledge is not easy to transfer to modifications of the situation. In this case automatic responses must be over-ridden to allow a flexible approach to the task. Therefore control of attentional resources is necessary.

The idea of controlling attentional resources when required was central to a seminal theory developed by Norman and Shallice (1980, 1986) and Shallice and Burgess (1996). In this theory, which postulates the existence of a supervisory attentional system, routine action 'packages', termed schemata, may be triggered by elements in the environment through sensory input or other schemata. This is considered to be a fully automatic process in response to a particular and well learned situation. When there is a possibility of conflict between schemata, the selection of the appropriate schema in preference to other similar schemata is termed contention scheduling. Contention scheduling is not an automatic process but at the same time does not require attentional control. Situations that are not learned or are novel require the intervention of the supervisory system, a system that directs the contention scheduling of schemata and exerts a higher level of control over lower level processing. That is, it directs and maintains our attentional resources and is critical in difficult tasks during problem solving, when automatic processes must be overridden and where there is a conflict between similar schemata.

To the novice FPS player, learning the rules of the game, the meaning of information presented on-screen and the game interface, is a difficult task which requires directed attention to the multiple tasks involved.  One must concentrate and selectively attend to the elements that allow for the actual process of playing, the controller layout or keyboard commands, while remaining attentive to an ever-changing environment.  At first, for example, the rocket jumps possible in Quake 3 Arena require a concentrated and co-ordinated effort.  However, through experience and practice, rocket jumping eventually becomes a process in which one can seemingly pay minimal attention to its mechanics and instead be able to pay attention to additional aspects such as aiming at enemies while in the midst of the jump.  These mechanics have, to some extent, become automated.  However, if there is a sudden and unexpected change in the situation, for example, the preparation for the rocket jump is interrupted by the appearance of an enemy at close quarters, attention is immediately diverted from the jump to resolving the issue.  Shallice (1988) would explain this as the ongoing schemata for the game being interrupted and the supervisory system intervening to select the appropriate action to take to resolve the situation.
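The interplay between triggered schemata, contention scheduling and supervisory intervention described above can be pictured with a toy sketch. This is not Norman and Shallice's formal model: the schema names, cue names, numeric weights and the idea of representing supervisory control as an additive bias are all assumptions introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Schema:
    """A routine action 'package' with cues that trigger it (hypothetical)."""
    name: str
    triggers: dict  # cue name -> activation contributed when the cue is present


def activations(schemata, cues, supervisory_bias=None):
    """Sum the trigger weights for present cues, plus any supervisory bias."""
    bias = supervisory_bias or {}
    return {
        s.name: sum(w for cue, w in s.triggers.items() if cue in cues)
        + bias.get(s.name, 0.0)
        for s in schemata
    }


def contention_scheduling(schemata, cues, supervisory_bias=None):
    """Select the most strongly activated schema among the competitors."""
    acts = activations(schemata, cues, supervisory_bias)
    return max(acts, key=acts.get)


SCHEMATA = [
    Schema("rocket_jump", {"ledge_ahead": 1.0}),
    Schema("engage_enemy", {"enemy_close": 0.8}),
]

if __name__ == "__main__":
    # Well-learned situation: the environment alone triggers the routine.
    print(contention_scheduling(SCHEMATA, {"ledge_ahead"}))
    # Unexpected enemy: supervisory bias overrides the ongoing routine.
    print(contention_scheduling(
        SCHEMATA, {"ledge_ahead", "enemy_close"},
        supervisory_bias={"engage_enemy": 0.5},
    ))
```

In this sketch the practised rocket jump wins whenever its cue is present, and only the extra, supervisory contribution allows the less strongly triggered response to the enemy to take over, which loosely mirrors the interruption of an ongoing schema.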

Some aspects of automation can be explained in a different way, however, as being a consequence of the learning process. For example, in arithmetic, if a long transformation on any chosen number ends up simply doubling the original number, a person, on becoming aware of the rule, would not be expected to persevere with the original transformation, but simply apply the new found rule. The transformation itself has not become automated but, through repetition and experience, a short cut has been established which is dependent on access to the rule in long term memory. This short cut, or heuristic, accelerates the original process and therefore requires fewer attentional resources. Heuristics therefore can be learned and become a new way of doing the same task. A second result of this learning effect is that through experience the ability to disregard potential distractors is much improved (Castel et al. 2005).

In gaming, as routines are learned, whether through applying heuristics or through automation, a reduction in time to complete tasks is observed, indicating that less attentional resource is being utilized. For example, in the well understood opening parts of a game, sequences of actions can be carried out while being able to direct attention to other aspects of the game, or indeed outside the game. In one sense the player is not immersed in the game at this point, indicating a relationship between attention and immersion.  However, during this period, attentional control would once again be relied upon if there were unexpected rule changes and sudden unpredictability in the well practised routines and this sudden application of attention might be thought of as a graded entry back into an immersive state. If certain actions were unable to become learned or automated in some way, game playing would be a laborious task and little progress would ever be made.

The work of Jennett (2010) identifies a sense of progression as being critical in immersion, suggesting that this occurs through players receiving positive feedback from the gaming environment (by, for example, progressing through a story line or through amassing points). This is highly psychologically (emotionally) rewarding and causes them to selectively attend to the game rather than other features of their environment and it is these feelings that motivate them to carry on playing the game. Thus, it is argued that the intrinsic rewards provided by computer game playing and its interactive nature result in computer game immersion being more than just an extreme instance of selective attention, rather 'immersion is a result of self-motivated attention which is enhanced through feedback from the game; environmental information being attenuated to a greater extent when a person's sense of progression is highest' (Jennett 2010, p.192).

Attention therefore is a key element in controlling and directing appropriate behaviour in response to sensory and internally generated input. A further issue to explore concerns the aspects of the environment to which attention is directed: what is considered important and what is thought irrelevant. The surroundings in which we exist are multi-sensory and complex and different elements are constantly vying for our attention. If we are reading, gaming or relaxing, any number of potential distractions exist that may break our concentration. For example, an alarm, a voice, a passer-by or the smell of cooking all represent possible distractions. If we were unable to disregard what we consider unimportant, we would endlessly be giving attention to all aspects of our environment and show a high level of distractibility. This ability to inhibit or 'filter out' is an important aspect of attention.

In terms of gaming it is critical to be able to respond to certain elements in the gaming environment while ignoring others. This ability to selectively respond and attend to certain features in the environment would seem to be an important factor in becoming immersed. As has been found with expert chess players, the more proficient gamer may direct his or her attention to specific 'hot spots' in the gaming environment, that is, areas that have the potential to be affected by various factors. The less proficient player is more likely to attend to the scene as a whole or be drawn to visually or aurally dominant aspects that are irrelevant to progress or survival in the game. This distraction may lead to failure in the game as important and threatening details are missed. If it were not possible to learn how to inhibit distractions and selectively attend to the critical parts, the gaming experience would quickly become one of frustration as development through the game would be made impossible by constant distraction to elements that were not important, for example 'chatter' between players and 'non-events'. Significant elements in the gaming environment would be lost (in the low signal to noise ratio). An example of this in the FPS game might be the entry of new players into a multi-player game leading to a flurry of on-screen welcome messages.

It is possible that immersion and selective attention both describe the same process. Thus, the idea that immersion may not be a multi-component factor but can be best explained as a single entity, that is as a state of sustained selective attention, has been recently tested by Jennett (2010). Immersion, in this interpretation, is therefore a cognitive state that allows for the filtering out of certain materials in order to attend to the game as a whole.  The notion of filtering is central to theories of selective attention. Experimental studies show that when selectively attending to certain aspects of the environment, other aspects or distractors are filtered out. Originally, it was thought that irrelevant information is filtered out at an early stage and that this prevents an overloading of a limited capacity system (Broadbent 1958).  However, later empirical evidence suggests that distracting information is disregarded, but at a much later stage in the processing stream (for example, Deutsch and Deutsch 1963). Information is filtered at the semantic level, that is, irrelevant incoming information is not merely attenuated, but is processed to some extent in terms of meaning. This explains why, when two people are reading in a room, one may suddenly be aware of an announcement on the radio if it has some meaning to them, while the other uninterested party will carry on reading oblivious to the bulletin. The unattended information is therefore available to both parties but selected by only one of them.  However, the level to which the reader was immersed in the book may have an effect on the attention that can be paid to incoming information, however pertinent. In gaming situations, the degree to which one is immersed may therefore affect one's ability to react to information and sensory input from outside the game.
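The contrast between early and late filtering described above can be sketched in a few lines, using the two readers and the radio announcement as the example. The data structures, channel labels and the word-matching test for 'meaning' are hypothetical simplifications introduced for illustration; they are not drawn from Broadbent (1958) or Deutsch and Deutsch (1963).

```python
def early_filter(messages, attended_channel):
    """Early selection: choose by physical channel before meaning is analysed."""
    return [m["text"] for m in messages if m["channel"] == attended_channel]


def late_selection(messages, attended_channel, pertinent_words):
    """Late selection: every message is analysed for meaning, and an
    unattended message is still selected if it is pertinent to the listener.
    """
    selected = []
    for m in messages:
        words = set(m["text"].lower().split())
        if m["channel"] == attended_channel or words & pertinent_words:
            selected.append(m["text"])
    return selected


MESSAGES = [
    {"channel": "book", "text": "chapter three begins"},
    {"channel": "radio", "text": "traffic delays on the motorway"},
]

if __name__ == "__main__":
    # Early filtering: the radio never gets through, whatever it says.
    print(early_filter(MESSAGES, "book"))
    # Late selection: the announcement reaches only the reader who cares.
    print(late_selection(MESSAGES, "book", {"motorway"}))
    print(late_selection(MESSAGES, "book", set()))
```

Under the early-filter function the radio bulletin is discarded regardless of content, whereas under late selection the same unattended bulletin is available to both readers but selected only by the one for whom it carries meaning.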

However, it perhaps seems counter-intuitive to think of selective attention and immersion as one and the same thing. Immersion suggests more than selective attention, as it appears to include behavioural components and indeed Jennett (2010) concluded that immersion could not be explained fully by selective attention theory. One clear indication was the dissociation she found between immersion and task difficulty. That is, the difficulty of the task had no effect on the level of immersion. Selective attention theory would not predict this dissociation because task difficulty would modulate immersion if it were solely a state of selective attention. However, in one of Jennett's experiments in which participants played two versions of a game,[6] which were identical apart from the fact that in one game the scoring was rigged to produce an artificially high score, greater real world dissociation (in terms of lesser ability to detect extraneous auditory stimuli) was found in the high scoring version of the game than in the low scoring version. This was interpreted as showing that a player's motivation to continue playing the game (because of the high score being achieved), rather than cognitive load (the perceptual features of the game and game difficulty), is highly implicated in real world dissociation and immersion.

An alternative way to view the attentional component in immersion is that it is epiphenomenal to immersion itself. The term 'immersion mode' might be used to describe a cognitive set that supports the behavioural aspects of immersion. This mode could be said to be entered when the gamer becomes immersed and is maintained while the gamer remains in that state. The mode could be thought necessary for the control of cognitive processes required for the immersive experience including access to stored knowledge in long term memory and processing in working memory as well as executive functions such as decision making, planning and the inhibition of extraneous details. The mode would allow the gamer to respond to the gaming environment in a selective and task-directed manner and become immersed in attempting the challenging aspects of the game.

The level of attentional involvement in a gaming situation would seem to be dependent on several factors, many of which have been discussed by Jennett et al. (2008). These authors found support for the existence of Brown and Cairns' (2004) three levels of immersion which differ in the extent to which a person attends to their immediate (non-gameworld) environment; varying levels of attentional resources may be involved which consequently require different levels of cognitive involvement. These three levels are engagement, engrossment and total immersion.

Engagement is the least immersive level and involves players making the effort to learn how to play the game and, not least, how to use the controls. The assumption here is that there is a motivational barrier to becoming engaged with playing a game, and that if a player does not become engaged enough to learn how to play a game (because they are not attracted to it), then they will not bother to learn its rudiments and will never be in a position to proceed to the next two levels of immersion. At the engagement level it seems likely that attentional processes are similar to those involved in learning any other type of skill, with conscious, effortful attention being paid to the specifics of the game (for instance, the characters involved, their powers, the environment in which they exist, the types of weapon available, the different levels of the game, which controls enable which actions and so forth). At this level, well-learned routines are not available to the gamer, so all aspects of the environment require some degree of controlled processing.

The next level is engrossment, and here the player becomes more immersed in the gameworld because they no longer need to pay much attention to the paraphernalia of the game that exists in the non-virtual environment. This enables the player to become emotionally involved with the game and the controls become 'invisible'. That is, the player does not need to pay attention to the controls because their use has become automated; this leaves a player free to concentrate on events in the gameworld and how they should respond to them. Thus, at the level of engrossment, there is some routine behaviour that has been previously 'mastered'. In the gaming situation the gamer could be thought of as actively selecting appropriate schemata to complete previously learned or familiar tasks and overriding inappropriate ones. For example, the sight of a particular character may trigger a schema for evasive action, while some other perceptual input, for example a particular sound, may trigger a search and rescue schema. Novel situations at this stage require attentional control, as do problem solving, prospective planning and decision making. Supervisory control is also required in situations where a well-learned routine has to be curtailed due to some unexpected outcome. At this level, distractors seem to have less effect because attention is focussed on the more important aspects of the game and, through experience, resistance to (known) distractors increases. Situations where multiple events occur at the same time would require the gamer to switch selective attention between multiple tasks, which should hypothetically increase the level of engrossment.

Total immersion is the final level of immersion and corresponds to what most writers appear to have in mind when they refer to the concept of immersion. The most important feature here is real world dissociation in which one becomes so immersed that one is mentally transported from one's present physical surroundings into the gameworld with concomitant lower awareness of one's real world physical environment. Thus, to a great extent, the player becomes oblivious to real-world stimuli (or non-gameworld stimuli), events in the real world are unimportant, and attention is directed almost exclusively towards events in the gameworld and responses to it. However, the Jennett et al. (2008) study showed that this type of immersion is 'rare and rather fleeting' (p.642), in contrast to the two precursor levels of immersion. For designers, then, fostering the conditions under which this highest level of immersion can become less rare is a salient issue.


Designing immersion in FPS games

Feelings of being in control, that one is performing well (as indicated by positive feedback) and an optimal degree of challenge were found to be critical to a sense of progression in Jennett's work. Each of these suggests basic principles that games designers should work to. First, within the constraints of a game's parameters, designers should make learning to use controls as easy as possible. At this stage it would seem appropriate to minimise distractions in the gaming environment and emphasize that tasks can be mastered in certain ways that eventually become more automated or learned. All games involve a degree of learning to accomplish tasks and, in the early stages, simple routines should be intuitive and easy to comprehend and master. The faster and easier this learning is, the more quickly usage of controls will become automated and the sooner the player will be able to become immersed in the gameworld, without immersion being continually interrupted by having to pay attention to the (real world) controls. Of course, playing with consoles such as the Nintendo Wii can result in a merging of the real world and gameworld, with movements of characters in the gameworld mirroring those made in the real world by the players, and so this itself might enhance immersion.

Second, to give players a sense that they are performing well, designers should try to facilitate feelings in players that they are making progress through a game and that they are succeeding; Jennett (2010) found that it is these sensations that give players a sense of satisfaction and keep a player playing a game (often for longer than anticipated). This ties in neatly with the finding that players need to be challenged; the idea that one should feel challenged in order to perform well and experience positive emotions when engaging in a task is well known among psychologists. In the context of game playing, this entails ensuring that the level of game difficulty is matched with the player's level of skill. If the game is too easy for a player this will not present a challenge and will result in boredom - 'if games are too hard they're boring, and if they're too easy they're boring, but if they're right in the zone they're addictive' (Johnson quoted in Wasik 2006, p.33). As Jennett points out, creating this 'Goldilocks zone' entails a game providing some degree of negative feedback, indicating poor performance (for example, characters being hit by bullets or missiles, loss of points, and so on). On the other hand, if a game is too difficult then flow / immersion will be interrupted because the game becomes discontinuous. Indeed, games often adhere to this principle, with lower levels being relatively easy and subsequent levels demanding progressively greater skill.

In addition to the above, two other requirements for becoming totally immersed in a game are, first, a feeling of empathy with the principal character(s), most obviously in FPSs the character whose perspective one takes in the gameworld, and second, a sense of atmosphere in the gameworld, created by making the virtual environment, the characters which populate it, the actions which occur and other features of the game as relevant as possible to the scenario on which the game is based (Brown and Cairns 2004).

Designers should not infer from the above that they should aim to keep players totally immersed for long periods of time; switching between states of consciousness where the player is and is not fully immersed in the gameworld was identified by Jennett (2010) as an important aspect of gaming. For example, players can switch between feeling that they are the character that represents them in the gameworld, being a player of the game viewing it from the outside (for example, having critical thoughts about certain aspects of game play) or interacting (as they normally would) with other entities in the non-gameworld environment. Crucially, Jennett suggests that the ability to switch between these states of consciousness is important in the enjoyment of gaming; certainly, with FPS games, it is easy to imagine how being totally immersed in a gameworld, and thus completely dissociated from the real world, without having the ability to remove oneself from the gameworld at will could result in the experience of game playing being traumatic rather than fun. Further, according to the Easterbrook hypothesis, it is possible that if total immersion in first-person shooters were to give a player a sense of mortal danger, then a feedback loop might occur which might maintain and intensify immersion. This is because the hypothesis contends that when people perceive a threat to their survival this leads to high arousal, which in turn leads to attention becoming highly focused (Easterbrook 1959).

In this article, we have shown that attention is critical to the various stages of immersion; FPS games, indeed all computer games, are goal oriented and the role of attention in the game is to direct and maintain cognitive processing for the purposes of achieving that goal. Allied to this is the notion of selective attention whereby extraneous background detail can be suppressed allowing cognitive resources to be focussed upon the effective completion of important tasks. Selective attention is enabled through process allocation; commonly undertaken tasks become learned, routine and automatic, allowing other changeable circumstances within the game to be allocated controlled and directly processed resources. In order to enable the novice player to accustom themselves to the interface controls and the game's rules and scenarios - effectively, automating the common routines - positive feedback is an important factor in motivating the player to persevere at the learning stage. While this may be catered for by the players themselves opting to explore and discover and wonder at the gameworld (the ability to do this counts as a form of implicit reward for the commercial outlay often involved in accessing the game) rather than specifically attempting to achieve the game's goals, it can also be provided by a variety of other methods designed into the game by the game's creators. These might include a progressive difficulty in levels, artificially inflating scores, multiple modes of interaction or direct rewards for completing tasks or for the player's performance in the level. Importantly, the game should allow the player to switch in and out of immersion; from being immersed, to conversing with other players in a multi-player game, to being critically reflective about aspects of the game or the player's performance. Too much, and too sustained, immersion may prove traumatic rather than fun; a game, after all, is just a game.


Games Cited

id Software (1999) Quake III Arena. Activision (PC).

Infinity Ward (2009) Call of Duty: Modern Warfare 2. Activision (Xbox, PS3, PC).



References

Bohemia Interactive (2004) Most immersive firstperson games. Available at: http://forums.bistudio.com/showthread.php?t=38308 [Accessed: 19 December 2010].

Brenton, H., Gillies, M., Ballin, D. and Chatting, D. (2005) The uncanny valley: Does it exist and is it related to presence? Workshop on Human-Animated Characters Interaction. April 2005.

Broadbent, D. E. (1958) Perception and communication. London: Pergamon Press.

Brown, E. and Cairns, P. (2004) A grounded investigation of game immersion. In: CHI '04, Extended abstracts on human factors in computing systems. Vienna, Austria 24-29 April 2004. New York: ACM.

Calleja, G. (2007) Digital games as designed experience: Reframing the concept of immersion. Ph.D. Victoria University, Wellington, New Zealand.

Carr, D. (2006) Space, navigation and affect. In Carr, D., Buckingham, D., Burn, A. and Schott, G. (eds.) Computer games: Text, narrative and play. Cambridge: Polity, pp.59-71.

Castel, A. D., Pratt, J. and Drummond, E. (2005) The effects of action video game experience on the time course of inhibition of return and the efficiency of visual search. Acta Psychologica, Vol. 119, pp.217-230.

Csíkszentmihályi, M. (1990) Flow: The psychology of optimal experience. New York: Harper Perennial.

Deutsch, J. A. and Deutsch, D. (1963) Attention: some theoretical considerations. Psychological Review, Vol. 70, pp.80-90.

Easterbrook, J. A. (1959) The effect of emotion on cue utilization and the organization of behaviour. Psychological Review, Vol. 66, pp.183-201.

Ermi, L. and Mäyrä, F. (2005) Fundamental components of the gameplay experience: Analysing immersion. DiGRA, Changing views -- Worlds in play. Toronto, Canada 16-20 June 2005.

Fencott, C. (1999) Presence and the content of virtual environments. Available at: http://web.onyxnet.co.uk/Fencott-onyxnet.co.uk/pres99/pres99.htm [Accessed: 19 December 2010].

Fried, M. (1980) Absorption and theatricality: Painting and beholder in the age of Diderot. Berkeley: University of California Press.

GameTrailers.com. (2009) What was the most immersive fps you have ever played? Available at: http://forums.gametrailers.com/thread/what-was-the-most-immersive-fp/761179 [Accessed: 19 December 2010].

Grau, O. (2003) Virtual art: From illusion to immersion. Cambridge (MA): MIT Press/Leonardo Books.

Grimshaw, M. (2008a) The acoustic ecology of the first-person shooter: The player experience of sound in the first-person shooter computer game. Saarbrücken: VDM Verlag Dr. Mueller.

Grimshaw, M. (2008b) Sound and immersion in the first-person shooter. International Journal of Intelligent Games & Simulation, Vol. 5 (1). Available at: http://www3.wlv.ac.uk/ijigs/Vol5/Num1/Abstracts.aspx [Accessed: 19 December 2010].

Grimshaw, M., Lindley, C. A. and Nacke, L. (2008) Sound and immersion in the first-person shooter: Mixed measurement of the player's sonic experience. Audio mostly. Piteå, Sweden 22-23 October 2008.

Grimshaw, M. (In press) Sound and immersion in digital games. In Bijsterveld, K. and Pinch, T. (eds.) The Oxford handbook of sound studies. New York: Oxford University Press.

Howson, G. (2009) UK games market worth GBP 4 billion - but what does that mean? The Guardian, 7 Jan. Available at: http://www.guardian.co.uk/technology/gamesblog/2009/jan/07/nintendo-games [Accessed: 19 December 2010].

Jennett, C., Cox, A. L. and Cairns, P. A. (2009) Investigating computer game immersion and the component real world dissociation. Human factors in computing systems. Boston, MA 4-9 April 2009.

Jennett, C. et al. (2008) Measuring and defining the experience of the immersion in games. International Journal of Human-Computer Studies, Vol. 66, pp.641-661.

Jennett, C. I. (2010) Is game immersion just another form of selective attention? An empirical investigation of real world dissociation in computer game immersion. Ph.D. University College London, United Kingdom.

Kearney, P. R. and Pivec, M. (2007) Immersed and how? That is the question. Game in' action. Sweden 13-15 June 2007.

Logan, G. D. (1988) Toward an instance theory of automatization. Psychological Review, Vol. 95, pp.492-527.

Madrigal, A. (2009) Researchers want to add touch, taste and smell to virtual reality. Available at: http://www.wired.com/wiredscience/2009/03/realvirtuality/ [Accessed: 9 September 2010].

McMahan, A. (2003) Immersion, engagement, and presence: A new method for analyzing 3-D video games. In Wolf, M. J. P. and Perron, B. (eds.) The video game theory reader. New York, London: Routledge, pp.67-87.

Nacke, L., Grimshaw, M. and Lindley, C. A. (2010) More than a feeling: Measurement of sonic user experience and psychophysiology in a first-person shooter game. Interacting with Computers, Vol. 22 (5), pp.336-343.

Norman, D. A. and Shallice, T. (1980) Attention to action: Willed and automatic control of behaviour. Center for Human Information Processing, Technical report no. 99. San Diego: University of California.

Norman, D. A. and Shallice, T. (1986) Attention to action. In Davidson, R. J., Schwartz, G. E. and Shapiro, D. (eds.) Consciousness and self regulation: Advances in research and theory, Vol. 4. New York: Plenum, pp.1-18.

Reiner, M. and Hecht, D. (2009) Behavioral indications of object-presence in haptic virtual environments. Cyberpsychology & Behavior, Vol. 12 (2), pp.183-186.

Shallice, T. (1988) From neuropsychology to mental structure. Cambridge: Cambridge University Press.

Shallice, T. and Burgess, P. W. (1996) The domain of supervisory processes and the temporal organisation of behaviour. Philosophical Transactions of the Royal Society of London B, Vol. 351, pp.1405-1412.

Shilling, R., Zyda, M. and Wardynski, E. C. (2002) Introducing emotion into military simulation and videogame design: America's Army: Operations and VIRTE. GameOn. London 30 November 2002, pp.151-154.

Slater, M. (2002) Presence and the sixth sense. PRESENCE: Teleoperators and virtual environments. Cambridge (MA): MIT Press.

Slater, M. (2007) If you respond as if it were real, then it is presence. Starlab. Available at: http://www.cs.ucl.ac.uk/research/vr/Projects/PRESENCCIA/Public/001_01-PeachI-MelSlater-June07-STARLAB.pdf [Accessed: 30 September 2010].

UKIE (2010) Call of Duty: Modern Warfare 2 ends year as best selling videogame of 2009. Available at: http://ukie.info/node/187 [Accessed: 19 December 2010].

Wasik, B. (2006) Grand theft education: Literacy in the age of video games. Harper's Magazine, Sep. pp.31-39.

Zyga, L. (2009) Immersive game system allows physical interaction between players. Available at: http://www.physorg.com/news180695187.html [Accessed: 9 September 2010].



[1] 'VE' stands for virtual environment.

[2] 'CAVE' stands for cave automatic virtual environment - images projected onto multiple surfaces of a cube-shaped room.

[3] Carr uses the term perceptual immersion to contrast with psychological immersion but, to avoid confusion with Fencott's definition, it should be read as sensory immersion.

[4] Hence the genre's label, although it is important to recognize that some FPS games have third-person modes in which the player views an entire avatar on screen, similar to the standard view of many role-playing games.

[5] An illusion because the character does not move; the gameworld and its elements move around the central position of the character as fixed in the game engine.

[6] It should be noted that this game was a bespoke 2-dimensional game with simplified graphics and not typical of modern 3-dimensional FPS games.