Media Processing Service

The diagram below shows the user-visible parts of the media processing service.

[Figure: Media Processing Service Diagram]

Inputs represent any media source: a file, a network stream, or an SDI video device.

Outputs represent media destinations and support the same media types as the inputs.

A bus represents a switching or mixing operation allowing many inputs to be routed or mixed into several outputs. For each input and bus, a filter can be added to perform metadata analysis and/or stream enhancement.
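As a sketch, the routing model described above could be represented as follows. All class and field names here are illustrative, not the actual service API:

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Illustrative sketch of the routing model; names are hypothetical,
# not the actual media processing service API.

@dataclass
class Filter:
    """Metadata analysis and/or stream enhancement step."""
    name: str

@dataclass
class Input:
    """Any media source: a file, a network stream, or an SDI device."""
    uri: str
    filter: Optional[Filter] = None  # optional per-input filter

@dataclass
class Output:
    """A media destination, supporting the same media types as inputs."""
    uri: str

@dataclass
class Bus:
    """Switching/mixing operation routing many inputs to several outputs."""
    inputs: List[Input] = field(default_factory=list)
    outputs: List[Output] = field(default_factory=list)
    filter: Optional[Filter] = None  # optional per-bus filter

# A bus mixing a file and an SDI feed into one live output, with a
# per-input enhancement filter and a per-bus analysis filter:
bus = Bus(
    inputs=[Input("file:///clip.mxf"), Input("sdi://0", filter=Filter("denoise"))],
    outputs=[Output("rtmp://cdn/live")],
    filter=Filter("scene-detection"),
)
```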

The media processing service is the central component for media processing in ReCAP. The service exposes a model to route and mix live and offline media through filters for metadata analysis and stream processing.

Partners of ReCAP, and later third parties, can write their own filters to plug analysis modules and stream enhancers into an existing media processing pipeline. As such, the media processing service is the central component for integrating visual analysis modules in ReCAP. Using state-of-the-art codecs and muxers ensures that the algorithms can be used in an actual broadcast production setting.

Underneath this model, the GStreamer media framework is used to implement the service. GStreamer is an open-source multimedia framework providing an open system for building general-purpose media pipelines. ReCAP will apply this framework to the broadcast production scenario. Within ReCAP, we aim to contribute as many bug fixes and core GStreamer features as possible directly to the open-source community.

Since the media processing service is based on GStreamer, the technical integration of visual analysis modules (i.e. filters in our model) and codecs/muxers is done by writing GStreamer plugins. The API/ABI of GStreamer is stable and will be reused for these purposes.
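To illustrate, a filter in our model maps naturally onto a stage in a GStreamer pipeline description (the textual form consumed by gst-launch-1.0 or Gst.parse_launch). The sketch below only builds such a description string and does not require GStreamer to run; filesrc, decodebin, videoconvert and filesink are standard GStreamer elements, while recapfilter stands in for a hypothetical ReCAP-written analysis plugin:

```python
def pipeline_description(source_path, filter_element, sink_path):
    """Build a gst-launch-style pipeline string: source -> decode -> filter -> sink.

    'recapfilter' (passed in below) is a placeholder for a ReCAP-written
    GStreamer plugin; the other elements ship with GStreamer itself.
    """
    elements = [
        f"filesrc location={source_path}",   # read the input file
        "decodebin",                         # demux/decode to raw media
        "videoconvert",                      # negotiate a raw video format
        filter_element,                      # analysis / enhancement plugin
        f"filesink location={sink_path}",    # write the processed stream
    ]
    return " ! ".join(elements)

desc = pipeline_description("input.mxf", "recapfilter", "output.raw")
# -> "filesrc location=input.mxf ! decodebin ! videoconvert ! recapfilter ! filesink location=output.raw"
```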

Distribution of Media Streams

The distribution of the media processing service across many nodes (i.e. virtual or physical machines) shall be done in two ways:

  1. each concurrently running node handles a different stream (e.g. batch-mode processing of files); or
  2. many nodes process the same stream, each handling a different time segment of that single stream.
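The second mode can be sketched as a simple partitioning of a stream of known duration into contiguous per-node time segments (the even split below is illustrative; a real scheduler might also account for keyframe boundaries or node capacity):

```python
def split_into_segments(duration_s, num_nodes):
    """Divide a single stream of known duration into contiguous time
    segments, one (start_s, end_s) pair per node — distribution mode 2.
    Illustrative even split; real segmentation may follow keyframes."""
    seg = duration_s / num_nodes
    return [(round(i * seg, 3), round((i + 1) * seg, 3))
            for i in range(num_nodes)]

# A 90-second file processed concurrently by 3 nodes:
segments = split_into_segments(90, 3)
# -> [(0.0, 30.0), (30.0, 60.0), (60.0, 90.0)]
```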

The ReCAP system will employ network clocks, such as IEEE-1588 PTP or NTP, for global synchronisation between nodes. This ensures that streams stay synchronised with each other (i.e. lip-sync) and that a global maximum delay (i.e. latency) of the overall system can be negotiated to enable live and near-live productions.
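The idea of negotiating a global maximum delay can be illustrated with a toy calculation: once all nodes share one network clock, the system can run at a single agreed delay that covers the slowest node plus inter-node transport. The figures below are invented for illustration:

```python
def negotiate_latency(node_latencies_ms, transport_ms):
    """Toy model of global latency negotiation: with a shared network
    clock, the system-wide delay must cover the slowest node plus the
    transport delay between nodes. Numbers here are illustrative."""
    return max(node_latencies_ms) + transport_ms

# Three nodes report 40, 120 and 80 ms of processing latency; moving
# lightly compressed video between nodes adds ~10 ms:
total = negotiate_latency([40, 120, 80], 10)  # -> 130
```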

Distributing the processing of high-resolution video without compression is often not possible on commodity hardware due to the high bandwidth requirements. To make it feasible, video transported between distributed nodes must be compressed. ReCAP will address this with light-compression video codecs that are robust to round-trip transcoding (e.g. JPEG 2000 or visually lossless H.264).
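The bandwidth argument can be made concrete with a back-of-the-envelope calculation. The format (1080p50, 10-bit 4:2:2) and the 4:1 ratio below are illustrative assumptions, not measured ReCAP figures:

```python
def raw_bitrate_mbps(width, height, fps, bits_per_pixel):
    """Uncompressed video bitrate in Mbit/s."""
    return width * height * fps * bits_per_pixel / 1e6

# 1080p50 at 10-bit 4:2:2 (20 bits/pixel), an illustrative production format:
raw = raw_bitrate_mbps(1920, 1080, 50, 20)   # -> 2073.6 Mbit/s
light = raw / 4                              # illustrative 4:1 light compression
# raw saturates commodity gigabit links; at ~518 Mbit/s the lightly
# compressed stream fits comfortably on 10 GbE, several streams per link.
```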

GPU Acceleration and Specialised Hardware Acceleration

GPU hardware acceleration will be used to make analysis and processing algorithms that would otherwise be too slow for live applications applicable to real-time scenarios. Modern GPU hardware often comes with specialised co-processors for efficient hardware encoding. For certain encoding jobs in ReCAP, this specialised hardware is employed to reduce the CPU resources required for encoding and to use less power, which can be relevant for large-scale server applications. Modern SaaS platforms also allow access to these hardware resources in a cloud-based scenario.

Unified Platform for Live and Offline Scenarios

Some processing scenarios can only execute in offline mode (e.g. two-pass encoding of a file, or applying a computationally expensive visual analysis). Live productions, on the other hand, depend on the processing meeting the real-time constraint to avoid losing information (e.g. capture of a live camera feed). It is essential that the proposed solution provides both real-time analysis (where technically possible) and batch processing for offline/non-real-time workflows. In ReCAP, both scenarios can be controlled using the same high-level API model. They also share the same low-level media framework, since many low-level components, such as demuxers, graphics renderers or simple codecs, can operate in both live and offline mode. To maximise the re-use of these components, ReCAP aims to provide a unified platform covering both live and offline production use cases.

Real-time analysis would provide users with quick and essential metadata extraction, quick fixes and automatic indexing of content for fast turn-around workflows, such as news gathering, live event production and bulk capture processes (e.g. tape archive ingest).

Non-real-time analysis of content would provide more granular, detailed and accurate results, perhaps for the most important content or where time allows for deeper analysis. It would typically require greater computational resources (CPU, GPU, RAM and networking); here the use of cloud-hosted SaaS or pay-as-you-go content analysis services might prove attractive to SMEs, which typically cannot afford the very high infrastructure CAPEX of processing at scale.


Micro-Service Architecture

The Micro-Service approach will provide ReCAP with a group of individual applications, each providing a single service (or a small number of services). Combined, these provide the higher-level functionality to each of the components in the application layer. The opposing approach would be a single monolithic application.

There are clearly defined interfaces and protocols for communication between the services, for example the ZeroMQ message bus and a REST API over HTTP. The services and interfaces can then be tested and monitored in smaller units.
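As a minimal sketch of the kind of REST endpoint each micro-service could expose for testing and monitoring, the handler below serves a JSON status document over HTTP using only the Python standard library. The /health path and the payload fields are illustrative assumptions, not a defined ReCAP interface:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    """Hypothetical per-service health endpoint for monitoring."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps(
                {"status": "ok", "service": "media-processing"}
            ).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # keep the sketch quiet

# Usage (blocks; not run here):
#   HTTPServer(("127.0.0.1", 8080), HealthHandler).serve_forever()
```

A monitoring system can then probe each service in isolation, which is exactly the "smaller units" property described above.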

Separating the responsibilities of services into smaller packages allows for the graceful degradation of individual services without affecting the application as a whole. The modular nature also allows additional services to be added at a later date without the need to change all parts of the platform.

The hardware specification can be tailored for the individual service when architecting the platform, while providing isolation so that a malfunctioning service does not affect the full stack.

ReCAP will use automated discovery via Consul so that Micro-Services can register themselves as available. A higher level of orchestration, coordinated by Saltstack, will then resolve the dependencies between Micro-Services to order their installation and configuration.
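The orchestration step amounts to dependency resolution over the service graph; the sketch below shows the idea with a plain topological sort. The service names and dependency edges are invented for illustration, and in ReCAP this ordering is computed by Saltstack, not by hand:

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Hypothetical dependency graph: each service maps to the set of
# services that must be installed and configured before it.
deps = {
    "media-processing": {"message-bus", "service-discovery"},
    "message-bus": {"service-discovery"},
    "service-discovery": set(),
}

# Dependencies-first ordering for installation and configuration:
install_order = list(TopologicalSorter(deps).static_order())
# -> ["service-discovery", "message-bus", "media-processing"]
```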