Image Composition  

Essay on Image Composition

Composition is a crucial factor in any image.  But it has always been hard to determine and describe precisely what makes a good composition good and a bad one bad.  I've tried to clarify the subject by relating it to how the human visual system operates.  If you have any feedback, I'd appreciate it.

Terminology:
Focus of contrast (plural foci): an area of localized contrast that's small enough for the eye to take in without moving.   These will be quite small, since the fovea (part of the retina that sees details) only covers a few degrees of our field of vision.  (This is why our eyes have to move around so much.)

A focus has a certain amplitude.  This indicates how much our eyes are drawn to that particular contrast.  Although judging the amplitude will necessarily be slightly subjective and comparative, on the whole it's quite objective and easy to do.

Note: the following concerns visuals like photography, painting, drawing and sculpture, as opposed to typography and graphic design; the latter often aims more to grab and hold attention for a very limited span.  Visuals of the former kind usually try to invite the viewer on a longer journey of discovery; so our aim in composing such images should be to enable the viewer to look for a long time without discomfort, in fact to feel so good about looking that they find themselves staying much longer than they planned...

Sight is our most important sense, by far.  Certain visuals (like some 1960s Op Art) can cause severe discomfort, headache, nausea, and can even trigger epileptic seizures.  Lack of visual stimulus becomes painful after a certain time, to the point that the brain desperately invents visuals of its own to fill the void.  There is a definite link between visuals and our physical well-being; we should be fully aware of it, and utilize it.

The following images are small due to bandwidth restrictions; they should be viewed larger to obtain the full effect - I suggest full screen and a fairly close viewing distance.


Here's a perfectly symmetrical image, with only two foci, of exactly equal weight (amplitude), close together.   (I'm simplifying by only considering contrasts in size, shape and value.)

As an image it fails to hold our attention for very long, partly due to its lack of meaning, but also because it soon becomes annoying, tiresome to look at.  (Soon as measured in minutes, not seconds.) 
On the right, see a rough indication of how the eyes will move across the image.  (These movements are called saccades and are semi-subconscious.)

They become 'caught' in the center, hardly venturing outside at all - there just isn't anything there to 'stick to'.  And if the eyes do leave the center, they stay glued to the edges and corners of the image.  This is uncomfortable.
(I'm guessing this is because our visual system can't gather information very efficiently this way - plus it's tiring for the eye muscles.)

(A similarly bad image would result from using only one focus, even if not centered.) 


But how about moving the two foci further apart, to lead the eye around more real estate?

Even worse - this configuration will quickly tire the eyes.  (Try it; enlarge it and look at it intently for a minute or two.)

Six similar foci are almost as bad, as you can see on the right.  In fact, the rule seems to be: the more equal foci, the worse.

 

 


So let's try something else.  Maybe two foci on a slant?  A little bit better - now the eye goes in a triangle between the foci and the bottom left corner (mostly).

But even better is lowering the amplitude of one focus, making it less important and 'attractive' to the eye.   The visual cortex is now freer to direct the eye around the image along different paths, which is obviously less tiring.

Let's add more.

 

 

Here's one of many possible configurations that are close to ideal, according to these guidelines:
1.  All foci of descending weight; no two exactly the same, at least none of the more important ones.
2.  All foci off-center, making it possible for the eye to travel widely around the picture, going around many times along many different paths.  (This means triangular and curved paths are more useful than square-rectangular.)
3.  None of the more important foci are too close to the edges of the image, which could lead the eye outside of the frame; this can also be annoying.
4.  The spacing of the foci should vary - the empty spaces, the negative spaces between them, should be as varied as the foci themselves - no two the same size and shape.

Note that the last image has a very natural and aesthetic feeling to it, and even though it's nothing but numbers it's easy to look at for a longer time.  (Try the same experiment as above, enlarge it and look at it for a minute or so; you'll definitely not feel the same discomfort as with the other one.)
This makes sense; evolution optimized our visual system to scan and process the natural environment - so anything that resembles it more will make for easier viewing than something that resembles it less.
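As a rough sketch, the four guidelines above could be checked mechanically for a hand-made list of foci.  Everything here is an assumption made for illustration: the `Focus` class, the `check_foci` function, the 0-to-1 coordinate system and all the thresholds are hypothetical, and the amplitudes still have to be judged by eye.  For simplicity the sketch applies every guideline to all foci, not only the more important ones, and it uses pairwise distances as a crude stand-in for the shapes of the negative spaces.

```python
from dataclasses import dataclass
from itertools import combinations
from math import hypot


@dataclass
class Focus:
    x: float          # horizontal position, 0.0 (left edge) to 1.0 (right edge)
    y: float          # vertical position, 0.0 (top edge) to 1.0 (bottom edge)
    amplitude: float  # estimated pull on the eye, on any consistent scale


def check_foci(foci, tol=0.05, center_radius=0.1, margin=0.08):
    """Return {guideline number: True if satisfied} for a list of Focus objects.

    All thresholds are illustrative assumptions, not values from the essay.
    """
    amps = sorted((f.amplitude for f in foci), reverse=True)
    # 1. Descending weights: no two amplitudes closer than tol.
    distinct = all(a - b > tol for a, b in zip(amps, amps[1:]))
    # 2. Off-center: no focus within center_radius of the middle of the frame.
    off_center = all(hypot(f.x - 0.5, f.y - 0.5) > center_radius for f in foci)
    # 3. Away from the edges: every focus stays inside the margin.
    inside = all(margin <= f.x <= 1 - margin and margin <= f.y <= 1 - margin
                 for f in foci)
    # 4. Varied spacing (crude proxy): no two pairwise distances closer than tol.
    dists = sorted(hypot(a.x - b.x, a.y - b.y) for a, b in combinations(foci, 2))
    varied = all(b - a > tol for a, b in zip(dists, dists[1:]))
    return {1: distinct, 2: off_center, 3: inside, 4: varied}


# Two equal foci close to the center, like the first example image.
symmetric = [Focus(0.45, 0.5, 1.0), Focus(0.55, 0.5, 1.0)]
# Three foci of descending weight, off-center and unevenly spaced.
natural = [Focus(0.3, 0.35, 1.0), Focus(0.7, 0.25, 0.6), Focus(0.45, 0.8, 0.3)]

print(check_foci(symmetric))
print(check_foci(natural))
```

The symmetric pair fails guidelines 1 and 2, while the second arrangement passes all four.  A check like this can only flag obvious problems; it says nothing about whether an image is actually pleasant to look at.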

Now to apply this to some real images.

 

On the left, an image with many nearly equal foci distributed quite homogeneously - this creates an uninteresting composition.
On the right, after some tweaking, the composition is a little better.

(Not sure who the artist is, if you know please contact me.)

 

 

 

 

 

 

 

 

Below - a famous painting by a true master of composition, stylized to show up the contrasts better.  The schematics give us clues as to why this image has such a pleasing composition; it seems to follow the guidelines perfectly.

Of course there is more to composition than this, but the guidelines can be quite helpful as a starting point. 
Some other things to consider:

Balance

Another aspect of composition is the concept of 'balance'.  An image can be perceived as heavier on one side than the other - or perhaps top-heavy.   If so, it's perceived as off-balance, which is usually an undesirable effect.
Edges with greater tonal contrasts tend to register as having greater weight.
This can be adjusted considerably just by changing the lighting in the scene a bit, if for some reason you don't want to move your objects.
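One crude way to estimate where the weight of an image sits is to average the positions of its tonal edges, weighting each by its contrast.  The sketch below assumes the image is already available as a small grid of gray values (a list of rows, at least 2 x 2); the function name `balance_point` and the choice of contrast measure are hypothetical, and real perceived balance depends on much more than this.

```python
def balance_point(gray):
    """Return (x, y), each in 0..1, of the contrast-weighted center of a grid.

    gray is a list of rows of gray values; it must be at least 2 x 2.
    """
    h, w = len(gray), len(gray[0])
    total = sx = sy = 0.0
    for r in range(h - 1):
        for c in range(w - 1):
            # Local tonal contrast: difference to the right and lower neighbors.
            weight = (abs(gray[r][c] - gray[r][c + 1])
                      + abs(gray[r][c] - gray[r + 1][c]))
            total += weight
            sx += weight * (c + 0.5)
            sy += weight * (r + 0.5)
    if total == 0:
        return (0.5, 0.5)  # no contrast anywhere: treat as balanced
    return (sx / total / (w - 1), sy / total / (h - 1))


# A single vertical edge near the left side of a 4 x 4 image.
left_edge = [[0, 1, 1, 1] for _ in range(4)]
x, y = balance_point(left_edge)
print(x, y)
```

An `x` well below 0.5 suggests the image is left-heavy, and a `y` well below 0.5 suggests it is top-heavy.  Relighting the scene changes the edge contrasts and therefore moves this point.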

Color

Be very careful with strong colors - fully saturated colors should be kept to a minimum.  This goes double for CG, where artists often assume that a bright red car should be shaded 100% red, which is wrong.  Sampling a photo with the eyedropper tool in Photoshop reveals the truth: non-incandescent colors top out somewhere between 50% and 80%.
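The same check can be done outside Photoshop.  This minimal sketch assumes the percentages above refer to HSV saturation, which is my reading and not something the text states; the function name `saturation_percent` and the sample pixel values are made up for illustration.

```python
import colorsys


def saturation_percent(r, g, b):
    """Return the HSV saturation (0-100) of an 8-bit RGB color."""
    _, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return s * 100


# A "100% red" CG shader versus a hypothetical red sampled from a photo.
print(saturation_percent(255, 0, 0))
print(saturation_percent(200, 60, 50))
```

The pure CG red comes out fully saturated, while the hypothetical sampled red lands at about 75%, inside the range mentioned above.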

Following the same guidelines as above for contrast, the strongest color should only cover a few percent of the image area, while medium-strong colors can cover more, and the very muted colors (pastel colors, grays, browns, dark colors etc.) should cover most of the image area.

You should avoid having two equal amounts of two different pure colors (bearing in mind that pure purple needs to cover an area 3 times larger than pure yellow to seem equal, and pure blue 2 times larger than pure orange - red and green are equals in this respect).
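The area ratios above can be turned into a quick comparison.  This is only a sketch: the table holds just the three pairs given in the text, the function name `perceived_ratio` is hypothetical, and the areas can be in any unit as long as both use the same one.

```python
# Area multiplier the first color needs in order to balance the second,
# using only the three pairs stated in the text.
AREA_RATIO = {
    ("purple", "yellow"): 3.0,
    ("blue", "orange"): 2.0,
    ("red", "green"): 1.0,
}


def perceived_ratio(color_a, area_a, color_b, area_b):
    """Return the perceived strength of color_a relative to color_b.

    1.0 means the two pure colors appear equal, which is what to avoid.
    """
    if (color_a, color_b) in AREA_RATIO:
        return area_a / (AREA_RATIO[(color_a, color_b)] * area_b)
    if (color_b, color_a) in AREA_RATIO:
        return (AREA_RATIO[(color_b, color_a)] * area_a) / area_b
    raise ValueError("no ratio given for this color pair")


print(perceived_ratio("purple", 30, "yellow", 10))  # purple covers 3x the area
print(perceived_ratio("blue", 20, "orange", 20))    # equal areas
```

In the first case the two colors appear equal (ratio 1.0), so one of the areas should change.  In the second, blue appears half as strong as orange even though the areas match.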

Golden Mean

Note that a rectangle is usually better than a square, either for the overall cropping, or for shapes within the frame.   The Golden Mean theory is based on this observation. 
Why should this be so?  I'm guessing it's because the rectangle is geometrically more varied and less symmetrical (instead of four identical sides, two are longer).  This permits more variation in how the fovea travels across it, which is less tiring.  And anyway, we seem to almost always prefer variation to monotony, in any context.
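For reference, the Golden Mean is the ratio (1 + sqrt(5)) / 2, roughly 1.618 to 1.  A minimal sketch for sizing a crop in that proportion follows; the function name `golden_height` and the example width are hypothetical.

```python
from math import sqrt

PHI = (1 + sqrt(5)) / 2  # the Golden Mean, about 1.618


def golden_height(width):
    """Return the height (rounded to whole pixels) of a golden rectangle."""
    return round(width / PHI)


print(golden_height(1920))
```

A 1920-pixel-wide frame would get a height of 1187 pixels.  This gives one rectangle among many usable ones; the argument above is only that some rectangle tends to work better than a square.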

 

 

Copyright 2003 Steven Stahlberg