GStreamer WebRTC: A flexible solution to web-based media

GStreamer's WebRTC implementation eliminates some of the shortcomings of using WebRTC in native apps, server applications, and IoT devices.

Currently, WebRTC.org is the most popular and feature-rich WebRTC implementation. It is used in Chrome and Firefox and works well for browsers, but the Native API and implementation have several shortcomings that make it a less-than-ideal choice for uses outside of browsers, including native apps, server applications, and internet of things (IoT) devices.

Last year, our company (Centricular) made an independent implementation of a Native WebRTC API available in GStreamer 1.14. This implementation is much easier to use and more flexible than the WebRTC.org Native API, is transparently compatible with WebRTC.org, has been tested with all browsers, and is already in production use.

What are GStreamer and WebRTC?

GStreamer is an open source, cross-platform multimedia framework and one of the easiest and most flexible ways to implement any application that needs to play, record, or transform media-like data. It is used across a wide range of devices and products, including embedded (IoT, in-vehicle infotainment, phones, TVs, etc.), desktop (video/music players, video recording, non-linear editing, video conferencing, VoIP clients, browsers, etc.), servers (encode/transcode farms, video/voice conferencing servers, etc.), and more.

The main feature that makes GStreamer the go-to multimedia framework for many people is its pipeline-based model, which solves one of the hardest problems in API design: catering to applications of varying complexity, from the simplest one-liners and quick solutions to those that need several hundred thousand lines of code to implement their full feature set. If you want to learn how to use GStreamer, Jan Schmidt's tutorial from LCA 2018 is a good place to start.
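
To make the pipeline model concrete, here is a minimal sketch in Python. It assumes PyGObject and the GStreamer 1.x plugins are installed if you want to actually run a pipeline; the helper names `pipeline_description` and `run` are ours and not part of GStreamer. The point is that a one-liner and a larger pipeline are built the same way, by chaining elements with `!`.

```python
"""Sketch: a GStreamer pipeline is a chain of elements joined by '!'."""


def pipeline_description(elements):
    """Join element descriptions into gst-launch style syntax (hypothetical helper)."""
    return " ! ".join(elements)


# The simplest case: a test pattern shown in a window.
SIMPLE = pipeline_description(["videotestsrc", "videoconvert", "autovideosink"])

# A slightly larger case built the same way: encode 300 test frames
# to VP8 and write them to a WebM file.
ENCODE = pipeline_description([
    "videotestsrc num-buffers=300",
    "videoconvert",
    "vp8enc",
    "webmmux",
    "filesink location=test.webm",
])


def run(description):
    """Run a pipeline until end-of-stream or error (needs PyGObject and GStreamer)."""
    # Imported lazily so the description helpers work without GStreamer installed.
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    Gst.init(None)
    pipeline = Gst.parse_launch(description)
    pipeline.set_state(Gst.State.PLAYING)
    bus = pipeline.get_bus()
    # Block until the pipeline finishes or reports an error.
    bus.timed_pop_filtered(
        Gst.CLOCK_TIME_NONE,
        Gst.MessageType.EOS | Gst.MessageType.ERROR,
    )
    pipeline.set_state(Gst.State.NULL)
```

Calling `run(ENCODE)` should leave a short `test.webm` in the current directory, provided the VP8 and WebM plugins are available on your system.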

WebRTC is a set of draft specifications that build upon existing RTP, RTCP, SDP, DTLS, ICE, and other real-time communication (RTC) specifications and define an API for making them accessible using browser JavaScript (JS) APIs.

People have been doing real-time communication over IP for decades with the protocols WebRTC builds upon. WebRTC's real innovation was creating a bridge between native applications and web apps by defining a standard yet flexible API that browsers can expose to untrusted JavaScript code.

These specifications are constantly being improved, which, combined with the ubiquitous nature of browsers, means WebRTC is fast becoming the standard choice for video conferencing on all platforms and for most applications.

Everything is great, let's build amazing apps!

Not so fast, there's more to the story! For web apps, the PeerConnection API is everywhere. There are some browser-specific quirks, and the API keeps changing, but the WebRTC JS adapter handles most of that. Overall, the web app experience is mostly fine.

Unfortunately, for native code or applications that need more flexibility than a sandboxed JavaScript app can achieve, there haven't been a lot of great options.

Libwebrtc (Google's implementation), Janus, Kurento, and OpenWebRTC have traditionally been the main contenders, but each implementation has its own inflexibilities, shortcomings, and constraints.

Libwebrtc is still the most mature implementation, but it is also the most difficult to work with. Since it's embedded inside Chrome, it's a moving target and the project is quite difficult to build and integrate. These are all obstacles for native or server app developers trying to quickly prototype and experiment with things.

Also, libwebrtc was not built for general multimedia use, so the lower layers get in the way of non-browser use cases and applications. It is quite painful to do anything other than the default "set raw media, transmit" and "receive from remote, get raw media." This means if you want to use your own filters, hardware-specific codecs, or sinks/sources, you end up having to fork libwebrtc.

OpenWebRTC by Ericsson was the first attempt to rectify this situation. It was built on top of GStreamer. Its target audience was app developers, and it fit the bill quite well as a proof of concept, even though it used a custom API and some of its architectural decisions made it quite inflexible for most other uses. However, after an initial flurry of activity around the project, momentum petered out, the project failed to gather a community, and it is now effectively dead. Full disclosure: Centricular worked with Ericsson to polish some of the rough edges around the project immediately prior to its public release.

WebRTC in GStreamer

GStreamer's WebRTC implementation gives you full control, as it does with any other GStreamer pipeline.
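
As a rough illustration of what that control looks like, the sketch below builds a pipeline description that feeds video into the `webrtcbin` element, modeled on the pipeline used in the GStreamer WebRTC demos. The function name and its parameters are hypothetical, and the STUN server URL is only a commonly used public default, not something the implementation requires. Swapping the encoder or adding a filter is a matter of changing one string, with no forking involved.

```python
def webrtc_sender_description(
    source="videotestsrc is-live=true",
    filters=(),
    encoder="vp8enc deadline=1",
    payloader="rtpvp8pay",
    encoding_name="VP8",
    payload=97,
    stun_server="stun://stun.l.google.com:19302",  # assumption: public STUN server
):
    """Build a gst-launch style description that sends video to webrtcbin.

    Hypothetical helper: every stage before 'sendrecv.' is an ordinary
    GStreamer element, so any of them can be swapped or extended.
    """
    caps = (
        "application/x-rtp,media=video,"
        f"encoding-name={encoding_name},payload={payload}"
    )
    chain = [
        source,
        *filters,          # optional user-chosen filter elements
        "videoconvert",
        "queue",
        encoder,
        payloader,
        "queue",
        caps,
        "sendrecv.",       # link into the webrtcbin element named 'sendrecv'
    ]
    head = f"webrtcbin name=sendrecv stun-server={stun_server} "
    return head + " ! ".join(chain)


# Default: VP8 test video.
VP8_DEFAULT = webrtc_sender_description()

# Same pipeline with a mirror filter and a software H.264 encoder instead.
H264_FLIPPED = webrtc_sender_description(
    filters=["videoflip method=horizontal-flip"],
    encoder="x264enc tune=zerolatency",
    payloader="rtph264pay",
    encoding_name="H264",
)
```

The resulting strings can be handed to `Gst.parse_launch()`. Whether a given browser accepts the H.264 variant depends on profile and packetization settings that this sketch does not try to cover.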

As we said, the WebRTC standards build upon existing standards and protocols that serve similar purposes. GStreamer has supported almost all of them for a while now because they were being used for real-time communication, live streaming, and many other IP-based applications. This led Ericsson to choose GStreamer as the base for its OpenWebRTC project.

Combined with the SRTP and DTLS plugins that were written during OpenWebRTC's development, this means that the implementation is built upon a solid and well-tested base, and implementing WebRTC features does not involve as much code-from-scratch work as one might presume. However, WebRTC is a large collection of standards, and reaching feature parity with libwebrtc is an ongoing task.

Due to decisions made while architecting WebRTCbin's internals, the API follows the PeerConnection specification quite closely. Therefore, almost all its missing features involve writing code that would plug into clearly defined sockets. For instance, since the GStreamer 1.14 release, the following features have been added to the WebRTC implementation and will be available in the next GStreamer release:

  • Forward error correction
  • RTP retransmission (RTX)
  • RTP BUNDLE
  • Data channels over SCTP
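
To give a feel for how closely the API tracks PeerConnection, here is a small sketch that maps PeerConnection-style names onto the signals that `webrtcbin` exposes. The signal names are the ones we recall from the `webrtcbin` documentation, and the data channel entries only apply to releases that include data channel support. The `PeerConnectionShim` class is purely hypothetical, written for illustration, and works with any object that offers GObject-style `connect()` and `emit()` methods.

```python
# PeerConnection event -> webrtcbin signal that fires when it happens.
EVENTS = {
    "onnegotiationneeded": "on-negotiation-needed",
    "onicecandidate": "on-ice-candidate",
    "ondatachannel": "on-data-channel",        # needs data channel support
}

# PeerConnection method -> webrtcbin action signal that performs it.
METHODS = {
    "createOffer": "create-offer",
    "createAnswer": "create-answer",
    "setLocalDescription": "set-local-description",
    "setRemoteDescription": "set-remote-description",
    "addIceCandidate": "add-ice-candidate",
    "createDataChannel": "create-data-channel",  # needs data channel support
}


class PeerConnectionShim:
    """Hypothetical wrapper exposing PeerConnection-style names over webrtcbin."""

    def __init__(self, webrtcbin):
        self.webrtcbin = webrtcbin

    def on(self, event, handler):
        """Register a handler, e.g. on('onicecandidate', callback)."""
        return self.webrtcbin.connect(EVENTS[event], handler)

    def call(self, method, *args):
        """Invoke an action, e.g. call('addIceCandidate', mline_index, candidate)."""
        return self.webrtcbin.emit(METHODS[method], *args)
```

In a real application, `webrtcbin` would be the element retrieved from a running pipeline, and calls such as `createOffer` would additionally take a `Gst.Promise` that delivers the result asynchronously, much like the JavaScript promise returned by the browser API.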

We believe GStreamer's API is the most flexible, versatile, and easy-to-use WebRTC implementation out there, and it will only get better as time goes by. Bringing the power of pipeline-based multimedia manipulation to WebRTC opens new doors for interesting, unique, and highly efficient applications. If you'd like to demo the technology and play with the code, build and run these demos, which include C, Rust, Python, and C# examples.


Matthew Waters will present GStreamer WebRTC—The flexible solution to web-based media at linux.conf.au, January 21-25 in Christchurch, New Zealand.

Nirbheek Chauhan is a programmer who contributes to GNOME, GStreamer, Meson, and various FOSS projects.

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.