Shot segmentation is the automatic identification, by computational methods, of the shots in a video. It consists of automatically locating the editing points originally defined by the director, by measuring the discontinuities between successive frames of the video. These editing points are of course known to the director, but they are generally not disclosed or made available. To spare a human operator the long and tedious task of locating shots by watching the video, computer scientists have developed automatic methods.
It is the oldest and most studied problem in video indexing, and is considered an essential building block for video analysis and retrieval. So far there are few direct applications of shot segmentation for the general public or in digital video software. It is nevertheless a major step in video analysis, because it enables the definition and use of information retrieval techniques on videos.
Shot segmentation consists of identifying the different shots of a video. This only makes sense if the video actually contains shots, that is to say, if it has been edited by a director. Some types of video (video surveillance, home videos…) do not lend themselves to this type of technique. The videos considered are generally films or television programs.
Shot segmentation is sometimes (incorrectly) called “scene segmentation” by some researchers. Scene segmentation, however, is a different task, which consists of identifying scenes; a scene is defined as a group of shots that share semantic coherence.
Different types of transitions between shots
There are many ways to make a transition between two shots. The simplest is the abrupt transition: one shot follows another with no transition frames. To make this change smoother, filmmakers have created a wide variety of gradual transitions, such as fades to black, dissolves, wipes, and many others, made easier by the use of computers and even of consumer video editing software.
For shot segmentation, researchers generally distinguish two types: abrupt transitions (also called cuts) and gradual transitions, which include all other types of transitions.
The main idea underlying shot segmentation methods is that the frames on either side of a transition are very dissimilar. The aim is therefore to identify discontinuities in the video stream.
The general principle is to extract an observation from each frame, and then to define a distance (or a similarity measure) between observations. Applying this distance to each pair of successive frames over the entire video stream produces a one-dimensional signal, in which one then looks for peaks (or troughs, if a similarity measure is used); these correspond to the moments of high dissimilarity.
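The principle above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: frames are grayscale images stored as lists of rows, the observation is a toy one (mean brightness), and the peak test is a fixed threshold combined with a local-maximum check. The names `mean_brightness`, `abs_difference`, `dissimilarity_signal` and `find_peaks` are hypothetical, as is the threshold value used in the example.

```python
from typing import Callable, List, Sequence

Frame = Sequence[Sequence[float]]  # grayscale frame: rows of pixel values


def mean_brightness(frame: Frame) -> float:
    """Toy observation (an assumption): average pixel value of the frame."""
    values = [p for row in frame for p in row]
    return sum(values) / len(values)


def abs_difference(a: float, b: float) -> float:
    """Toy distance between two scalar observations."""
    return abs(a - b)


def dissimilarity_signal(
    frames: Sequence[Frame],
    observe: Callable[[Frame], float],
    distance: Callable[[float, float], float],
) -> List[float]:
    """Apply the distance to every pair of successive frames.

    signal[t] compares frame t with frame t + 1.
    """
    observations = [observe(f) for f in frames]
    return [
        distance(observations[t], observations[t + 1])
        for t in range(len(observations) - 1)
    ]


def find_peaks(signal: Sequence[float], threshold: float) -> List[int]:
    """Indices where the signal is a local maximum above the threshold."""
    peaks = []
    for t, value in enumerate(signal):
        left = signal[t - 1] if t > 0 else float("-inf")
        right = signal[t + 1] if t + 1 < len(signal) else float("-inf")
        if value > threshold and value >= left and value >= right:
            peaks.append(t)
    return peaks


# Three dark frames followed by three bright ones: one abrupt transition.
dark = [[10, 10], [10, 10]]
bright = [[200, 200], [200, 200]]
video = [dark, dark, dark, bright, bright, bright]

signal = dissimilarity_signal(video, mean_brightness, abs_difference)
cuts = find_peaks(signal, threshold=50.0)  # threshold chosen arbitrarily
```

A peak at index `t` indicates a candidate transition between frame `t` and frame `t + 1`. A fixed threshold is the simplest possible choice and is used here only for illustration.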
Observations and distances
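The simplest observation is simply the set of pixels of the frame. For two frames $I_1$ and $I_2$ of size $N \times M$, the obvious distance is then the mean of the pixel-by-pixel absolute differences (L1 distance):

$$d(I_1, I_2) = \frac{1}{NM} \sum_{i=1}^{N} \sum_{j=1}^{M} \left| I_1(i,j) - I_2(i,j) \right|$$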
More refined approaches measure only significant changes, by filtering out the pixels whose differences are too small and only add noise.
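As a minimal sketch of the pixel-domain distance and of its filtered variant, the function below averages the absolute pixel differences and, optionally, ignores differences smaller than a per-pixel threshold. The function name `pixel_distance` and the parameter `min_change` are hypothetical, and treating filtered pixels as contributing zero is one possible reading of the filtering step, not a method taken from the source.

```python
from typing import Sequence

Frame = Sequence[Sequence[float]]  # grayscale frame: rows of pixel values


def pixel_distance(a: Frame, b: Frame, min_change: float = 0.0) -> float:
    """Mean absolute pixel difference (L1 distance) between two frames.

    Differences strictly below `min_change` are treated as noise and
    counted as zero. With min_change=0 this is the plain L1 distance.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("frames must have the same dimensions")

    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            diff = abs(pa - pb)
            if diff >= min_change:
                total += diff
            count += 1
    return total / count


frame_a = [[10, 10], [10, 10]]
frame_b = [[12, 10], [10, 110]]  # one noisy pixel, one real change

plain = pixel_distance(frame_a, frame_b)
filtered = pixel_distance(frame_a, frame_b, min_change=5)
```

In this example the small difference of 2 is discarded by the filtered variant, so only the large change contributes to the distance.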
Unfortunately, pixel-domain techniques are very sensitive to object and camera motion. Block matching techniques have been proposed to reduce this sensitivity to motion, but pixel-domain methods have largely been supplanted by histogram-based methods.
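To illustrate why block matching reduces sensitivity to motion, here is a simplified sketch: each block of the first frame is compared with shifted candidate blocks in the second frame, and only the best match is kept. The exhaustive search, the block size, the search radius and the names `block_cost` and `block_matching_distance` are all assumptions made for the example; published methods differ in their details.

```python
from typing import Sequence

Frame = Sequence[Sequence[float]]  # grayscale frame: rows of pixel values


def block_cost(a: Frame, b: Frame, top: int, left: int,
               dy: int, dx: int, size: int) -> float:
    """Mean absolute difference between a block of `a` and the block of
    `b` displaced by (dy, dx)."""
    total = 0.0
    for y in range(size):
        for x in range(size):
            total += abs(a[top + y][left + x] - b[top + dy + y][left + dx + x])
    return total / (size * size)


def block_matching_distance(a: Frame, b: Frame,
                            size: int = 2, radius: int = 1) -> float:
    """Average, over the blocks of `a`, of the best match cost found in `b`
    within `radius` pixels of the block's original position."""
    rows, cols = len(a), len(a[0])
    costs = []
    for top in range(0, rows - size + 1, size):
        for left in range(0, cols - size + 1, size):
            best = block_cost(a, b, top, left, 0, 0, size)  # no displacement
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    inside = (0 <= top + dy and top + dy + size <= rows and
                              0 <= left + dx and left + dx + size <= cols)
                    if inside:
                        best = min(best,
                                   block_cost(a, b, top, left, dy, dx, size))
            costs.append(best)
    return sum(costs) / len(costs)


def plain_distance(a: Frame, b: Frame) -> float:
    """Plain mean absolute pixel difference, for comparison."""
    diffs = [abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    return sum(diffs) / len(diffs)


# A bright 2x2 object moves one pixel to the right between two frames.
before = [[255, 255, 0, 0],
          [255, 255, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
after = [[0, 255, 255, 0],
         [0, 255, 255, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]

matched = block_matching_distance(before, after)
unmatched = plain_distance(before, after)
```

Because the moving object is found again at its displaced position, the block-matching distance is lower than the plain pixel distance for the same pair of frames, so motion inside a shot is less likely to be mistaken for a transition.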
The luminance or color histogram is a widely used observation. It is easy to compute and is relatively robust to noise and to object motion, because a histogram ignores spatial changes in the image.
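The following sketch shows a histogram-based observation and one possible distance between histograms. The number of bins, the normalization, the choice of an L1 (bin-by-bin absolute difference) distance, and the names `luminance_histogram` and `histogram_distance` are assumptions made for illustration; other histogram distances are equally possible.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # grayscale frame with values in 0..255


def luminance_histogram(frame: Frame, bins: int = 8,
                        levels: int = 256) -> List[float]:
    """Normalized luminance histogram: fraction of pixels in each bin."""
    counts = [0] * bins
    n_pixels = 0
    for row in frame:
        for pixel in row:
            index = min(int(pixel) * bins // levels, bins - 1)
            counts[index] += 1
            n_pixels += 1
    return [c / n_pixels for c in counts]


def histogram_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """L1 distance between two histograms (0 = identical, 2 = disjoint)."""
    return sum(abs(x - y) for x, y in zip(h1, h2))


# The same content with the bright pixel moved: the histogram is unchanged.
frame = [[0, 255], [0, 0]]
moved = [[255, 0], [0, 0]]
# A completely different frame.
bright = [[255, 255], [255, 255]]

same_shot = histogram_distance(luminance_histogram(frame),
                               luminance_histogram(moved))
new_shot = histogram_distance(luminance_histogram(frame),
                              luminance_histogram(bright))
```

The moved pixel leaves the distance at zero, which illustrates the robustness to object motion described above, while the change of overall content produces a large distance.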