The end user needs to select one of four possible performance levels for the network and the connected video transmission devices. Performance classes 1 to 4 are introduced by EN 50132-5-1 and need to be selected according to the surveillance task:

  1. time accuracy for video transport stream: Class T1 to T4;

  2. interconnections - timing requirements: Class I1 to I4;

  3. bandwidth limitation capability: Class C1 to C4;

  4. video stream prioritisation: Class P1 to P4;

  5. maximum network loss, latency and jitter: Class S1 to S4 and M1 to M4;

  6. monitoring interval for interconnections: Security Grade 1 to 4 (see 4.2.2).

For high security applications, redundancy and security of the network need to be considered.

  1. Interoperability

If video transmission devices from different vendors are to be combined and operated together in a single IP network, their compatibility needs to be ensured. For this reason the integrator needs to select video transmission devices that are compliant with prEN 50132-5. For basic interoperability, the IP video devices should be compatible with the protocol requirements of EN 50132-5-1 and -2 in terms of IP connectivity based on TCP/IP and UDP, video stream transport via RTP, one of the standardised video payload formats such as MPEG-4 or H.264, and stream control based on RTSP. For eventing, device discovery and description there are different protocol options.

For full interoperability of video stream transmission, stream control, eventing, discovery and description of network devices based on one framework, the integrator needs to select a high-level IP video protocol. The integrator may choose a compatible implementation for IP video interoperability based on REST services, Web Services or any other open protocol that may be defined in the future but is not available today.

If an IP video network is managed together with an IT network, it is recommended that the same administrators have control over both networks.

  1. Wired transmission links

The most common form of an analogue wired connection is a coaxial cable. This is generally terminated with BNC connectors for compatibility. Standard coaxial cable (RG59) is suitable for transmission links of up to around 200 m. Larger ranges can be achieved by using rectifying amplifiers or cables with less attenuation (such as RG6 or RG11).

Another option for wired video transmission is a twisted pair cable. Common examples are Cat-5 and Cat-6 cables, which comprise four twisted copper wire pairs, and are used for analogue or digital transmission.

Fibre optics is an alternative solution which provides high capacity, high speed, low latency, long transmission distances (kilometres) with low signal attenuation, and resilience to both electromagnetic interference and tapping.

  1. Wireless transmission links

A CCTV specifier should consider the needs of the viewer / system operator when designing the transmission network and appropriate network security. The main technology types have been summarised in Table 6.

Table 6 — Wireless transmission options

| Link type | Transmission distance | Transmission frequencies | Link bandwidth (unidirectional) | Comments |
|---|---|---|---|---|
| Analogue RF | ~30 m indoors; ~100 m+ outdoors (non line of sight) | 2,4 GHz/5 GHz (unlicensed bands). Other frequencies can be used depending on spectral allocation and licensing details. | Dependent on installation specifics | Simple operation described here. More complex solutions can be offered. |
| 'WiFi' (IEEE 802.11) | ~30 m indoors; ~100 m outdoors (non line of sight) | 2,4 GHz/5 GHz (unlicensed bands) | Up to 74 Mbit/s (802.11n); up to 19 Mbit/s (802.11g) | Generally not suitable for long range transmission. Range and throughput are heavily dependent on signal power at the receiver. |
| Mobile WiMax (IEEE 802.16e) | Up to 50 km (line of sight) | Depends on installation. Configurable to both open and licensed frequencies. | Up to 70 Mbit/s | System delivers either long transmission distance or high transfer rate, not both. Developing technology. |
| 2G GSM (Global System for Mobile Communications) | National/international, assuming the system is within cell coverage (inner city: ~1/2 mile from cell site; rural: ~5 miles from cell site) | ~800 to 950 MHz or ~1,9 to ~2,2 GHz (limited to cellular phone licensed bands) | 14,4 kbit/s | More suited to speech and very low bit rate video or stills transmission. Requires a cellular service provider. Performance is dependent on carrier load, atmospherics and infrastructure provision. |
| 3G HSDPA (High Speed Downlink Packet Access) | National/international, assuming the system is within cell coverage (inner city: ~1/2 mile from cell site; rural: ~5 miles from cell site) | ~1,9 to ~2,2 GHz (limited to cellular phone licensed bands) | Currently up to 14,4 Mbit/s | Requires a cellular service provider. Performance is dependent on carrier load, atmospherics and infrastructure provision. |



  1. Key considerations for IP based transmission systems

In a packet-based network, the performance of any video transmission device or application depends on the quality of service assigned to that application. To support video traffic, adequate quality standards and performance figures shall be met for acceptable video streaming services. Four factors in particular (bandwidth, latency, jitter and packet loss) define the quality from the network point of view. How each is managed determines how effectively the network supports IP video traffic. A fifth factor, 'redundancy' or 'alternative routing', is also an important consideration to help protect critical CCTV system and operator traffic.

  • Bandwidth - 'The size of the possible video stream pipe' (for example, 1 Mbps up through 10 Gbps). Several compression/decompression (codec) algorithms recommended by EN 50132-5-1 can reduce the bandwidth needed for one IP video input to a fraction of that of a traditional coax cable, which is reserved exclusively for a single camera in a dedicated interconnection.

  • Latency or delay - 'The travel time through the pipe' - how long it takes for a packet to travel through the network. Live video is sensitive to delay. Maximum latency shall be according to performance requirements of EN 50132-5-1:2011, Clause 5. Typically, the network is not the largest contributor to the latency chain.

  • Jitter or delay variation - 'The received flow variation or pumping of stream' - the continuity with which packets arrive at their destination. Jitter buffers can temporarily delay incoming packets to compensate for jitter, but only for some of the delay variation. These buffers have limits, and excessive buffering can result in additional latency. Maximum jitter shall be in accordance with the performance requirements of EN 50132-5-1:2011, Clause 5.

  • Packet loss - 'The leak in the stream'. Packets can get lost because of collisions on the LAN, overloaded network links, or for many other reasons. Loss of packets beyond a very small percentage will degrade video quality. Note that IP video streams use the User Datagram Protocol (UDP), which, unlike the TCP used in non-streaming applications, does not provide retransmission of packets. Maximum packet loss shall be in accordance with the performance requirements of EN 50132-5-1:2011, Clause 5.

  • Redundancy, alternative routing and protection switching - 'Identifying and replacing a broken link or stream' to enable reliable video transmission via alternative routes.

These factors, including their impact on the network design, are defined and covered in more detail in EN 50132-5-1.
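As a rough illustration of how jitter and packet loss can be estimated from a received stream, the following Python sketch applies the interarrival jitter estimator described in RFC 3550 (the RTP specification) and counts missing sequence numbers. The function names, the packet tuples and the simplifications (no sequence number wrap-around, send and arrival times on a common clock) are assumptions made for the example; the applicable limits are those of EN 50132-5-1 and are not reproduced here.

```python
def interarrival_jitter(packets):
    """Running jitter estimate in seconds, after RFC 3550.

    packets: list of (send_time_s, arrival_time_s) tuples in arrival order.
    """
    jitter = 0.0
    prev_transit = None
    for sent, arrived in packets:
        transit = arrived - sent
        if prev_transit is not None:
            # Difference in transit time between consecutive packets
            d = abs(transit - prev_transit)
            # Smoothed estimate with gain 1/16, as in RFC 3550
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter


def packet_loss_ratio(sequence_numbers):
    """Fraction of packets missing between the lowest and highest
    sequence number seen (wrap-around is not handled)."""
    if not sequence_numbers:
        return 0.0
    expected = max(sequence_numbers) - min(sequence_numbers) + 1
    received = len(set(sequence_numbers))
    return (expected - received) / expected
```

A stream whose packets all take the same time to cross the network yields a jitter estimate close to zero, whatever the absolute latency, which is why latency and jitter are assessed separately.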

  1. Video performance characteristics

    1. Image compression

Image compression settings should always be dictated by the operational requirement for each camera view, and not the storage capacity of a proposed system.

The compatibility of the image format transmitted, stored and exported from the CCTV system should be considered alongside image compression. Many CCTV systems use proprietary codecs whose output cannot be received and replayed by widely available software applications. See the clause on image storage and export.

The suitability of a profile level or type should be identified using an image quality test specific to the purpose of the camera view. A number of image quality tests are discussed in more detail in 13.3.

NOTE The live and the recorded views of the same scene can show different levels of quality, depending on the point in the image chain at which the compression is applied.

Image quality tests for live, recorded and exported views should be defined to ensure the system is capable of meeting its OR.

  1. Frame rate

The required frame rate should be determined for each individual camera view. There are multiple factors which should be taken into account when selecting the desired frame rate.

These factors include:

  • the risk for the camera’s desired field of view as defined in the Risk Assessment,

  • the purpose of the camera as defined in the Operational Requirement,

  • the anticipated activity in the area to be observed,

  • the field of view of the camera,

  • whether the frame rate is changed by an external trigger such as an alarm device or VCA or VMD alarm,

  • whether the camera is observed by an operator (low frame rates can be difficult to view for sustained periods).

For example, a camera whose purpose is to capture a short pathway outside a building should be set to a frame rate high enough that a person cannot move from one side of the field of view to the other without appearing in at least one frame.
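The pathway example can be turned into a rough lower bound. If a subject crosses a field of view of width w at speed v, the crossing takes w / v seconds, so the subject is only certain to be captured if the interval between frames does not exceed that time. The Python sketch below illustrates this; the function name, the widths and the speeds are hypothetical values chosen for the example, not figures from this document, and the result is a minimum that ignores all the other factors listed above.

```python
import math


def min_frame_rate(fov_width_m, max_speed_m_s, frames_required=1):
    """Lowest whole frame rate (fps) at which a subject crossing the field
    of view at max_speed_m_s appears in at least frames_required frames."""
    if fov_width_m <= 0 or max_speed_m_s <= 0:
        raise ValueError("width and speed must be positive")
    crossing_time_s = fov_width_m / max_speed_m_s
    return math.ceil(frames_required / crossing_time_s)


# Hypothetical pathway: 3 m wide view, subject running at 6 m/s
single_frame_fps = min_frame_rate(3.0, 6.0)
three_frame_fps = min_frame_rate(3.0, 6.0, frames_required=3)
```

Requiring several frames per crossing, rather than one, gives some margin for motion blur or a partially obscured subject.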

Guidance on selecting an appropriate frame rate depending on the purpose and risk associated with each camera view is available in Appendix D.

In systems which allow the frame rate and/or image resolution of stored video to be reduced after a set period of time in order to lower the overall storage requirement, the reduced quality storage shall still be fit for purpose.

  1. Resolution

The resolution for a camera view shall be determined from the purpose of the camera as defined in the OR and from the required coverage. The camera should be able to achieve this resolution without using digital zoom. For example, if the 'Identify' category defined in 6.7 is required, any system with a resolution of 2CIF or below would require the subject to be very closely framed, which is not practical in most cases.

If observation of a single wide area is required, a small number of high resolution cameras may be a better solution than a large number of lower resolution cameras. However, if the area contains a large amount of activity, consideration should be given to whether it can suitably be viewed by a single operator or whether multiple cameras would be more suitable.

  1. Storage characteristics

    1. Storage

      1. General

The total storage requirement for a digital CCTV recorder should be estimated before a system is installed, so that a hard drive of the appropriate capacity can be specified. It is vital to ensure that sufficient capacity is available so that compromises do not have to be made on either the image quality or retention time.

The storage capacity needed in a CCTV system depends on several factors, which are summarised below. Typical values for each variable are given in Table 7.

Table 7 — Factors affecting the storage capacity required for a CCTV recorder

| Variable | Frame size | Fps | Number of cameras | Operational hours | Retention period | Storage management |
|---|---|---|---|---|---|---|
| Typical range | 5 kB - 50 kB | 1 - 25 | 1 - 16+ | 1 - 24 | 24 h - 31 days | Add. 1 day protected |



Frame size - This value is the average size of each image as recorded. The actual figure will be a function of the image resolution (in pixels or TV lines) and the amount and type of compression applied to the image or video sequence. It is particularly dependent on whether inter-frame compression is used, in which case the average frame size will be an average of larger I-frames and smaller P-frames. These factors are specific to the individual CCTV recorder, which can make the image size difficult to estimate accurately, so assistance should be sought from the system supplier.

Frames per second (fps) - The number of images recorded each second by a camera has a significant impact on the amount of data being generated. The preferred frame rate should have been identified during the level 2 operational requirement capture process. This value could be dynamic if a camera is triggered by external alarms or motion detection. For some systems there may be no recording unless activity is detected. For others, there may be continuous recording at a low frame rate, say 1 fps, until activity is detected, followed by a short period of recording at a high frame rate, say 12 fps. If this is the case, an average value should be calculated by estimating the number of anticipated triggers in a 24 h operational period, e.g.:

  • standard rate (RS) = 1 fps;

  • triggered rate (RT) = 12 fps;

  • triggered period (T) = 3 min;

  • number of triggers anticipated per day (N) = 10;

  • number of minutes per day at triggered rate = N x T = 30 min;

  • number of triggered frames generated = 30 x 60 x RT = 21 600;

  • number of minutes per day at standard rate = 23 h 30 min = 1 410 min;

  • number of standard frames generated per day = 1410 x 60 x RS = 84 600;

  • total number of frames generated per day = 21 600 + 84 600 = 106 200;

  • average frame rate per second = 106 200 / number of seconds in 24 h = 106 200 / 86 400 = 1,2 fps.
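The averaging steps listed above can be reproduced with a short Python sketch. The function and parameter names are illustrative only; the values passed in are those of the worked example.

```python
def average_frame_rate(standard_fps, triggered_fps, trigger_minutes,
                       triggers_per_day, operational_hours=24):
    """Average frames per second over the operational period when recording
    switches between a standard and a triggered frame rate."""
    total_seconds = operational_hours * 3600
    triggered_seconds = triggers_per_day * trigger_minutes * 60
    if triggered_seconds > total_seconds:
        raise ValueError("triggered time exceeds the operational period")
    standard_seconds = total_seconds - triggered_seconds
    total_frames = (triggered_seconds * triggered_fps
                    + standard_seconds * standard_fps)
    return total_frames / total_seconds


# RS = 1 fps, RT = 12 fps, T = 3 min, N = 10 triggers per day
avg_fps = average_frame_rate(1, 12, 3, 10)
```

With these inputs the function returns 106 200 / 86 400, which rounds to the 1,2 fps given in the list.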

Number of cameras - This is the number of recorded cameras used for the whole system under consideration, as specified in the operational requirement.

Operational hours - This is the number of hours the CCTV system will be operational, within a 24 h period, as specified in the operational requirement.

In a simple system this could be for the full 24 h per day, whereas in a more complex system it could be for a predefined number of hours whilst the premises are occupied or vacant.

Retention Period - The time for which the CCTV footage should be stored on the system before being overwritten, as specified in the OR.

Storage management - Where video data is to be prevented from being overwritten, there should be a facility to protect recordings from being deleted. The method and storage requirement should be defined in the OR. This should not reduce the retention period of the normal recording.

A general equation has been given to aid in estimating the total amount of storage required:

$$\frac{\text{Size} \times \text{fps} \times C \times \text{Hours} \times 3\,600}{1\,000\,000} \times T_R = \text{Approximate storage requirement (GB)}$$

where

Size = image size in kB;

fps = images per second;

C = number of cameras in the system;

Hours = total number of operational hours in a 24 h period;

T_R = retention period in days;

3 600 converts hours into seconds (60 × 60);

1 000 000 converts kB to GB (approximately).

This equation can be used for very basic systems where all the cameras are recording at the same image size, frame rate and operational hours. For more complex systems a storage requirement can be calculated for each camera and the resultant totals added to give the overall requirement for that system.
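A minimal Python sketch of the general equation is given here for illustration. The function names are hypothetical; the second function shows one way the per-camera totals mentioned above might be added up for a system whose cameras do not all share the same settings.

```python
def storage_gb(frame_size_kb, fps, cameras, hours_per_day, retention_days):
    """Approximate storage requirement in GB for cameras that share the
    same frame size, frame rate and operational hours."""
    seconds_per_day = hours_per_day * 3600          # hours to seconds
    kb_per_day = frame_size_kb * fps * cameras * seconds_per_day
    return kb_per_day / 1_000_000 * retention_days  # kB to GB (approx.)


def total_storage_gb(camera_groups):
    """Sum the requirement over groups of cameras, each group given as
    (frame_size_kb, fps, cameras, hours_per_day, retention_days)."""
    return sum(storage_gb(*group) for group in camera_groups)


# Inputs taken from Example 1 and Example 2 of this clause
custody_suite_gb = storage_gb(20, 12, 8, 24, 31)
retail_outlet_gb = storage_gb(10, 2, 6, 12, 31)
```

The two calls use the figures of the custody suite and retail outlet examples and give results of about 5 142 GB and 160 GB respectively.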

  1. Example 1

A CCTV system is being specified for a custody suite that is required to capture high quality images of 20 kB per frame. Each camera generates 12 fps, giving an approximate stream rate of 240 kB/s per camera, and there are 8 cameras in the system. Each camera is recorded for 24 h per day, and the OR has stipulated a retention period of 31 days. The storage capacity is given by:

$$\frac{20 \times 12 \times 8 \times 24 \times 3\,600}{1\,000\,000} \times 31 \approx 5\,142\ \text{GB}$$

As can be seen this represents a large amount of data, and another strategy might need to be considered to ensure the amount of data being collected is manageable. In this case it might be considered that the amount of data being generated is necessary, in which case the storage provisions should be made. However it might be deemed more appropriate to reduce the image size/quality on half of the cameras, or to reduce the frame rate on some of the cameras. Another approach might be to use IR triggers or motion detection to trigger the image recording.

  1. Example 2

A retail outlet is installing a small CCTV system to view the access points (windows and doors) whilst the shop is closed. The image frame size has been set to a 'medium' value (10 kB), and the resultant image checked for suitability against the level 2 OR requirements. The recorder will be triggered by motion detection and IR sensors, and the average frame rate has been calculated as 2 fps for all the cameras. Six camera locations have been identified to offer maximum coverage, and the cameras will only record during the hours the venue is closed, from 7 pm until 7 am. As the reason for the system is to provide evidence after a break-in, the retention time has again been set to 31 days. The storage requirement is given by:

$$\frac{10 \times 2 \times 6 \times 12 \times 3\,600}{1\,000\,000} \times 31 \approx 160\ \text{GB}$$

  1. Image storage and export

    1. Format of the compressed video data

Special or modified compression algorithms prevent the Police and the Courts having direct access to the CCTV data without the use of proprietary software.

The compressed images (and audio if present) shall be encoded using standard compression formats (see EN 50132-5-1 or Annex A “Current Standard Formats”). The compressed data shall comply strictly with the standards and contain the full information required to decode the images and audio.

The compression format and the means of locating the compressed data within the CCTV files shall be made public.

  1. Encryption

The images shall not be encrypted. The CCTV format can contain checksums or other methods for ensuring that changes to the data may be detected but, where used, they may not alter the compressed image information.

NOTE There is no requirement for manufacturers to release information on methods used to ensure that their CCTV files have not been tampered with. The Police ensure that CCTV data is valid for use within the Criminal Justice System by maintaining a clear chain of evidence - encryption can delay or prevent legitimate access to CCTV evidence.
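One way to detect changes without altering the compressed image information is to keep a digest next to each frame and recompute it when the data is read back. The Python sketch below uses SHA-256 from the standard library. The record layout and function names are hypothetical and do not describe any particular CCTV file format, and an unkeyed digest of this kind only detects accidental or unsophisticated changes.

```python
import hashlib


def seal_frame(compressed_frame: bytes) -> dict:
    """Store the untouched payload together with its SHA-256 digest."""
    return {
        "payload": compressed_frame,  # compressed image data, unmodified
        "sha256": hashlib.sha256(compressed_frame).hexdigest(),
    }


def frame_is_intact(record: dict) -> bool:
    """Recompute the digest and compare it with the stored value."""
    return hashlib.sha256(record["payload"]).hexdigest() == record["sha256"]


# Illustrative bytes standing in for a compressed frame
record = seal_frame(b"\xff\xd8 example frame data \xff\xd9")
```

Because the digest is held in a separate field, a decoder that ignores it can still read the payload directly, which keeps the images accessible without proprietary software.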

The format of the CCTV files shall permit the size and aspect ratio of each image to be determined.

  1. Basic metadata (time, date, camera identifier)

Being able to correctly identify the time at which an image is captured is often essential to the use of CCTV in police investigation. Therefore:

The data contained within the CCTV files shall permit a time stamp and camera identifier to be associated with each image and audio sample. For CCTV without audio, the time stamp shall have a resolution of no less than one second. Where both video and audio are present, the time stamps shall have sufficient resolution to permit synchronised playback of the audio-visual streams.

The means for determining the time stamps and camera identifier on each image and audio sample shall be made public. There are many ways of encoding time stamps, but whichever is used shall be stated.

The CCTV format shall specify any time offsets that are applied to time stamps and give the method for converting each time stamp into a time that is local to a time zone and includes any applicable daylight-saving adjustment.

Time should auto update for changes between any daylight saving offsets and UTC.
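The conversion from a stored time stamp to local time can be illustrated as follows. The Python sketch assumes, hypothetically, that each time stamp is stored as seconds since 1970-01-01 UTC together with a time-zone offset and a daylight-saving offset, both in minutes. Real CCTV formats may encode these differently, and the function and parameter names are invented for the example.

```python
from datetime import datetime, timedelta, timezone


def to_local_time(utc_seconds, zone_offset_min, dst_offset_min=0):
    """Convert a UTC time stamp to local time using the offsets that the
    (hypothetical) file format records alongside it."""
    utc_time = datetime.fromtimestamp(utc_seconds, tz=timezone.utc)
    local_zone = timezone(
        timedelta(minutes=zone_offset_min + dst_offset_min)
    )
    return utc_time.astimezone(local_zone)


# Zone one hour ahead of UTC, with one further hour of daylight saving
local = to_local_time(0, zone_offset_min=60, dst_offset_min=60)
```

Keeping the stored value in UTC and applying the offsets only on display means that a change to or from daylight saving never produces duplicate or missing time stamps in the recording itself.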

If precise timing is required, the use of a network time server according to EN 50132-5-1 should be considered.

For additional metadata (e.g. geodata, floor level, VCA, PTZ positions, etc.) the format and compatibility shall be stated in the OR.

  1. Multiplexing format

Where a CCTV recording contains multiple streams of video (and audio), the CCTV files shall incorporate metadata which permit the streams to be de-multiplexed. The method for de-multiplexing shall be made public.

It is permissible for the CCTV format to contain other streams of data which are not essential for extracting the images and audio samples with their time stamps. The additional data streams may remain proprietary although it is recommended that their format is published so that they can be decoded independently of the manufacturer’s software.

It is recommended that each video and audio stream has a name which may be meaningful to the user of the CCTV system. Where names are present, the method for associating streams and their names shall be made public.
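De-multiplexing driven by per-sample metadata might look like the following Python sketch. It assumes a hypothetical interleaved recording in which every sample carries a stream identifier and a time stamp, plus a table that maps identifiers to user-facing names. None of these structures is taken from a real CCTV format.

```python
from collections import defaultdict


def demultiplex(samples, stream_names):
    """Split interleaved samples into per-stream lists ordered by time.

    samples: iterable of (stream_id, timestamp, payload) tuples.
    stream_names: dict mapping stream_id to a user-facing name.
    """
    streams = defaultdict(list)
    for stream_id, timestamp, payload in samples:
        # Fall back to a generic label when no name has been assigned
        name = stream_names.get(stream_id, f"stream-{stream_id}")
        streams[name].append((timestamp, payload))
    for entries in streams.values():
        entries.sort(key=lambda entry: entry[0])
    return dict(streams)


# Illustrative recording with two video streams and one unnamed stream
recording = [
    (1, 10.0, b"v1-a"), (2, 10.0, b"v2-a"),
    (1, 10.5, b"v1-b"), (3, 10.2, b"aux"),
]
by_stream = demultiplex(recording, {1: "Front door", 2: "Car park"})
```

Additional proprietary data streams can pass through the same grouping untouched, since only the stream identifier and time stamp are interpreted.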

  1. Image enhancements

If the system provides enhancement tools such as image sharpening, brightening or zooming in on a particular part of the image then any applied enhancements should not change the original recording. If an enhanced image is exported, an audit trail documenting these changes should exist.

  1. Image export

To facilitate replay and export the following should be adhered to:

  • CCTV data exported from a recorder shall have no loss of individual frame quality, change of frame rate or audio quality. There should be no duplication or loss of frames in the export process. The system should not apply any format conversion or further compression to the exported images, as this can reduce the usefulness of the content,

  • any original metadata and/or authentication signatures shall be exported with the images,

  • a simple user guide should be available locally for reference by a trained operator,

  • the facility should be provided for the export of images from selected cameras within user-defined time periods,

  • simultaneous export and recording should be possible without affecting the performance of the system except on systems that require removal of the primary storage media for export purposes,

  • the export method of the system should be appropriate to the capacity of the system and its expected use;