Machine Vision-based WebRTC

WebRTC, XMPP, and Computer Vision

DeepXHub RTC (Real-time Communication Architecture)

In this blog post, we cover the Real-Time Communications (RTC) architecture, including WebRTC, implemented in the DeepXHub platform.

Architecture map of the project
Figure 1: Overall RTC (Real-Time Communications) infrastructure of DeepXHub

How is Real-Time Communications (WebRTC) used in the context of a Video Analytics and Computer Vision platform?

The RTC (Real-Time Communications) infrastructure is used alongside the Machine Learning and Media infrastructures in the DeepXHub platform to enable services such as:

  • Push notification alerts – team members receive important updates about warnings and detections issued by the Machine Learning Computer Vision pipeline of the DeepXHub processing infrastructure. This is especially handy for employees working in the field who may not have access to a computer terminal: they receive push notification alerts with sound/vibration whenever something needs their attention, so they can adjust their operation to avert a problem or make an improvement without delay. DeepX has implemented its own custom Push Notifications service in Erlang, which makes it highly scalable and suited to high-load use cases.
  • HQ to field push messaging – in addition to the automatically generated push notification alerts originating from Machine Learning detectors and the DeepXHub Triage server, the DeepXHub Dashboard provides an interface for manually sending messages to all employees or to specific channels. This is useful for alerting the whole workforce or a specific team, or for sending an important communication directly to their mobile devices.
  • WebRTC video streaming – GStreamer, RTC signalling, and TURN infrastructure allow DeepXHub to provide a wide set of media access capabilities, most importantly real-time monitoring of CCTV / operational camera feeds irrespective of their type and codec, thanks to transcoding. The scalable media processing pipeline of the DeepXHub platform can process numerous camera feeds and stream the video over WebRTC (H.264 and VP8/VP9 codecs), a widely supported format that turns any smartphone or laptop into a monitoring station, reducing the time and cost of accessing this service. In addition, WebRTC streaming is used in DXTap (mobile client) and DX Dashboard (web client) to let users stream their own video. This can be used for communication, or to add video streams from an available smartphone or laptop that complement the existing stationary / CCTV feeds.
  • IM (Chat / Messaging) – real-time messaging is important for team coordination. DeepXHub offers a messaging solution across iOS, Android, and Web client applications thanks to cross-platform React Native and Reactive.js technologies. Messaging is tied to camera feeds/channels, so automated updates from Machine Learning Computer Vision detectors appear alongside the relevant staff members’ communications in the same console interface. Messaging in DeepXHub is powered by the ejabberd XMPP server, a very scalable chat server that powers messaging in many high-load applications (e.g. WhatsApp).
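To make the idea of a shared channel console concrete, here is a minimal sketch of how automated detector alerts and staff chat messages could be merged into one time-ordered history per camera channel. It is an illustration only, not DeepXHub's implementation: the `ChannelEvent` and `ChannelTimeline` names, the channel ids, and the field layout are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelEvent:
    """One entry in a channel console (hypothetical structure)."""
    channel: str   # assumed: one channel per camera feed, e.g. "cam-7"
    sender: str    # a detector name or a staff member
    kind: str      # "alert" (automated) or "chat" (human)
    body: str
    ts: float      # seconds since epoch


class ChannelTimeline:
    """Keeps alerts and chat messages together, per channel."""

    def __init__(self):
        self._events = defaultdict(list)

    def post(self, event):
        # Alerts and chat messages go through the same entry point.
        self._events[event.channel].append(event)

    def history(self, channel):
        # Return one merged view, oldest first, regardless of arrival order.
        return sorted(self._events.get(channel, []), key=lambda e: e.ts)


timeline = ChannelTimeline()
timeline.post(ChannelEvent("cam-7", "detector:ppe", "alert", "No helmet detected", 20.0))
timeline.post(ChannelEvent("cam-7", "alice", "chat", "Heading to gate 2", 10.0))

for event in timeline.history("cam-7"):
    print(f"[{event.kind}] {event.sender}: {event.body}")
```

In a real deployment the channel would more likely map to an XMPP multi-user chat room on the ejabberd side; the in-memory dictionary here only stands in for that.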

RTC Cluster

The RTC Cluster is a set of largely identical instances behind a load balancer. It supports hot replacement, upgrades, and scaling up thanks to its cluster architecture and Erlang-based stack. This means that one or more instances can be switched off, upgraded, and returned to the cluster, and, if the cluster’s capacity is not enough in production, additional machines can be added, all without causing any downtime to the service. This allows DeepXHub to offer an SLA of 99.99% or higher to its enterprise customers.
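The upgrade-without-downtime idea can be sketched as a rolling upgrade: take one instance out of rotation, upgrade it, return it, and only then move on to the next. The simulation below is a simplified illustration under stated assumptions, not the actual DeepXHub tooling: the function name, the instance names, and the `min_in_service` capacity floor are all hypothetical.

```python
def rolling_upgrade(instances, new_version, min_in_service):
    """Upgrade instances one at a time while keeping capacity.

    instances: dict mapping instance name -> current version (mutated in place).
    min_in_service: assumed minimum number of instances that must keep serving.
    Returns a log of (action, instance, in_service_count) tuples.
    """
    in_service = set(instances)
    log = []
    for name in sorted(instances):
        if len(in_service) - 1 < min_in_service:
            raise RuntimeError(f"not enough capacity to drain {name}")
        in_service.remove(name)            # load balancer stops routing here
        log.append(("drain", name, len(in_service)))
        instances[name] = new_version      # upgrade while out of rotation
        in_service.add(name)               # return the instance to the cluster
        log.append(("rejoin", name, len(in_service)))
    return log


cluster = {"rtc-1": "1.0", "rtc-2": "1.0", "rtc-3": "1.0"}
for step in rolling_upgrade(cluster, "1.1", min_in_service=2):
    print(step)
```

The capacity check is the part that matters: if draining one instance would drop the cluster below the floor, the sketch refuses to proceed, which corresponds to adding machines first and upgrading afterwards.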

Figure 2: RTC (Real-Time Communications) cluster

The RTC cluster implements services such as:

  • ejabberd – a chat/messaging server based on the XMPP protocol, invaluable for real-time communications. This is a highly scalable technology used in numerous high-load applications, including WhatsApp.
  • scalable implementations of the Mnesia, MySQL, and Cassandra databases
  • Apache Kafka pub/sub connectivity, which allows data and statistics to be exported for external processing without reducing the performance of the RTC cluster instances
  • Push Notification service – a scalable Erlang-based service custom built by DeepX to deliver push notification alerts to the smartphones and other devices of client teams in the field, at HQ, and in other environments. This is very important for promptly informing the team of important events detected by the Machine Learning Computer Vision AI agents of DeepXHub.
Figure 3: Push Notification services for ML alerts

As shown in the figure above, the Push Notification service’s queue of messages is sent to the official push notification servers of Apple, Google, and the other providers of the relevant smartphone operating systems, which then deliver the messages to end users’ devices.
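The queue-then-dispatch flow described above can be sketched as follows. This is a minimal illustration, not the Erlang service itself: the `PushDispatcher` class and the platform keys are hypothetical, and the gateway functions are plain stand-ins for the provider servers (no real Apple or Google API is called).

```python
from collections import deque


class PushDispatcher:
    """Queues alerts and hands each one to the gateway for its platform."""

    def __init__(self, gateways):
        # gateways: dict platform -> callable(token, payload); assumed interface.
        self.gateways = gateways
        self.queue = deque()

    def enqueue(self, platform, token, payload):
        self.queue.append((platform, token, payload))

    def flush(self):
        """Drain the queue in FIFO order; return (sent_count, undeliverable)."""
        sent, undeliverable = 0, []
        while self.queue:
            platform, token, payload = self.queue.popleft()
            send = self.gateways.get(platform)
            if send is None:
                # No provider configured for this platform: keep for inspection.
                undeliverable.append((platform, token, payload))
                continue
            send(token, payload)
            sent += 1
        return sent, undeliverable


# Stand-in gateways that only record what they were asked to deliver.
apple_outbox, google_outbox = [], []
dispatcher = PushDispatcher({
    "ios": lambda token, payload: apple_outbox.append((token, payload)),
    "android": lambda token, payload: google_outbox.append((token, payload)),
})
dispatcher.enqueue("ios", "token-a", {"title": "Detection", "sound": True})
dispatcher.enqueue("android", "token-b", {"title": "Detection", "sound": True})
print(dispatcher.flush())
```

A production service would also need retries, token invalidation, and rate limiting toward each provider; those are left out to keep the routing step visible.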

  • Media / Streaming service – includes GStreamer, a signalling server, and a TURN server supporting WebRTC video streaming and communication. Thanks to WebRTC and the transcoding capacity of DeepXHub, users can easily access real-time video monitoring and video analytics on any device at their disposal, such as smartphones (Android and iPhone fully supported), PCs, and laptops.
  • Stats Agent – a set of small, resource-optimized hooks and plugins connected to the RTC services. They export raw data (such as counts of messages, video streams, push notifications, users, and sessions) to the Stats Service, where it is aggregated for statistics and analytics to be displayed in the Dashboard web interface, sent in automated e-mail reports, and so on.
Figure 4: Stats Service where aggregation happens
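As a rough illustration of the aggregation step, the sketch below turns a list of raw events, as a Stats Agent hook might export them, into per-type counts and a unique-user count. The event shape (`type` and `user` keys) and the `aggregate` function are assumptions made for this example; the actual export format is not described in this post.

```python
from collections import Counter


def aggregate(raw_events):
    """Summarize raw RTC events into counts suitable for a dashboard.

    raw_events: iterable of dicts with a "type" key and an optional "user" key
    (hypothetical shape).
    """
    counts = Counter(event["type"] for event in raw_events)
    users = {event["user"] for event in raw_events if "user" in event}
    return {"counts": dict(counts), "unique_users": len(users)}


events = [
    {"type": "message", "user": "alice"},
    {"type": "message", "user": "bob"},
    {"type": "push_notification"},
    {"type": "video_stream", "user": "alice"},
]
print(aggregate(events))
```

Keeping the hooks this thin, and doing the counting in a separate service, matches the goal stated above of exporting data without loading the RTC instances themselves.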


DeepXHub includes a modern and highly scalable Real-Time Communications infrastructure that allows customer teams to:

  • collaborate with colleagues on existing tasks and processes involving video monitoring and video analytics (via channeled messaging, video, and push notifications)
  • immediately communicate important updates directly to the smartphones of the team in the field, reducing reaction time and giving the field operations team full awareness of, and transparency into, evaluation processes
  • easily ramp up additional workplaces using standard iOS and Android smartphone devices or standard laptops and PCs; iOS, Android, and Web are fully supported natively

We’ll be happy to answer any questions regarding our RTC server technology or commercial use cases.
