TL;DR
There is a new webrtcbin2 Rust plugin containing split webrtcsend and
webrtcrecv elements for handling a WebRTC session. The highlights of
webrtcbin2 are that it uses fewer threads per session by using rtpsend
and rtprecv (also written in Rust), implementing DTLS handling internally, using
librice (ICE in Rust), and handling signalling all within an async runtime.
All of these components also share threads with other instances of webrtcsend
and webrtcrecv, which further reduces resource usage and significantly
improves scalability.
The landscape
When I originally wrote the webrtcbin GStreamer element almost 10 years ago,
I did not completely envision the number of users that would come to use this code
in some way, shape or form, from HTTP-based standards such as WHIP and WHEP to
the myriad of projects that use WebRTC in some way for ingest or egress. WebRTC
is still one of the best ways to transport live video into a web browser for display.
WebRTC's loose compatibility with the SIP ecosystem is also a driving force
behind WebRTC's continued use.
Now, webrtcbin has definitely proved itself in situations that require
a small number of sessions. Using webrtcbin for a mixing server (MCU)
or even SFU with hundreds or even thousands of streams in a single application
is still a tall ask. The biggest reason for this is the number of threads that are
created for every WebRTC session.
Threads
- RTCP thread - rtpbin (used by webrtcbin) creates a thread per session essentially for handling timeouts required by RTCP.
- rtpjitterbuffer creates a thread per incoming stream in order to be able to handle timeouts and deal with late or missing RTP packets.
- dtlsenc - A thread whose sole purpose is for being able to handle DTLS timeouts.
- webrtcbin and signalling - A dedicated thread for handling signalling related changes such as SDP generation, applying remote SDPs, handling ICE candidates, etc.
- webrtcbin and ICE - ICE uses libnice on a dedicated ICE network thread per WebRTC session.
- Media streaming threads - One streaming thread for sending and receiving media data.
When an application requires many WebRTC sessions, the memory requirements and context switching overhead of having 5 extra threads per WebRTC session can limit the number of sessions that can be concurrently executed.
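To make the per-session cost concrete, here is a minimal sketch of the arithmetic implied by the list above. The function names are hypothetical and purely illustrative. It assumes one thread each for RTCP, dtlsenc, signalling and ICE, plus one jitter buffer thread per incoming stream, and it leaves out the media streaming threads since those are not extra.

```rust
/// Extra threads created for one WebRTC session in the webrtcbin design,
/// following the list above: RTCP, dtlsenc, signalling and ICE take one
/// thread each, and every incoming stream adds one jitter buffer thread.
/// (Hypothetical helper, for illustration only.)
fn extra_threads_per_session(incoming_streams: usize) -> usize {
    const FIXED_PER_SESSION: usize = 4; // RTCP + dtlsenc + signalling + ICE
    FIXED_PER_SESSION + incoming_streams
}

/// Total extra threads for `sessions` sessions that each receive
/// `incoming_streams` streams.
fn extra_threads(sessions: usize, incoming_streams: usize) -> usize {
    sessions * extra_threads_per_session(incoming_streams)
}
```

With a single incoming stream this gives the 5 extra threads per session mentioned above, and a session that receives both audio and video as separate streams would need 6.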
Pipeline loops
Another concern I had is that for the server mixing/forwarding use case, pipeline
loops were almost a necessity due to the basic requirement that participants in a
WebRTC call want to be able to see and listen to each other. The obvious
answer to this problem is to split the pipeline and use some wormhole elements
such as appsrc/appsink, intersink/intersrc, proxysrc/proxysink, etc.
What if? - webrtcbin2
With the benefit of hindsight, we can definitely improve on this situation and reduce the number of threads that is required by each additional WebRTC session. Let us go through the list from above.
Pipeline loops
In order to solve the problem of loops in the pipeline, I took a leaf out of the
design we made for rtpbin2 and created separate webrtcsend and webrtcrecv
elements that interact with a shared WebRTC session object by having the same id.
This allows data to flow essentially in one direction without requiring any kind of
loop in the pipeline graph.
For some background on why rtpbin2 was created, you can
have a look at a previous post I wrote.
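The idea of two elements finding the same session through a shared id can be sketched with a small registry. This is not the webrtcbin2 implementation: Session, SendHalf, RecvHalf and session_for_id are hypothetical names, and the sketch uses only the Rust standard library to show how two independent objects end up sharing one piece of state.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock, Weak};

/// Hypothetical stand-in for the state shared by both halves of a session.
#[derive(Default)]
struct Session {
    packets_sent: u64,
    packets_received: u64,
}

type SharedSession = Arc<Mutex<Session>>;
type Registry = Mutex<HashMap<String, Weak<Mutex<Session>>>>;

/// Process-wide registry mapping a session id to its shared state.
/// Weak references let a session be freed once no half uses it any more.
fn registry() -> &'static Registry {
    static REGISTRY: OnceLock<Registry> = OnceLock::new();
    REGISTRY.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Return the session registered under `id`, creating it if necessary.
fn session_for_id(id: &str) -> SharedSession {
    let mut map = registry().lock().unwrap();
    if let Some(existing) = map.get(id).and_then(Weak::upgrade) {
        return existing;
    }
    let session = Arc::new(Mutex::new(Session::default()));
    map.insert(id.to_string(), Arc::downgrade(&session));
    session
}

/// Hypothetical sending half: data only flows out through it.
struct SendHalf {
    session: SharedSession,
}

/// Hypothetical receiving half: data only flows in through it.
struct RecvHalf {
    session: SharedSession,
}

impl SendHalf {
    fn new(id: &str) -> Self {
        Self { session: session_for_id(id) }
    }

    fn send_packet(&self) {
        self.session.lock().unwrap().packets_sent += 1;
    }
}

impl RecvHalf {
    fn new(id: &str) -> Self {
        Self { session: session_for_id(id) }
    }

    fn receive_packet(&self) {
        self.session.lock().unwrap().packets_received += 1;
    }
}
```

Creating SendHalf::new("call-1") and RecvHalf::new("call-1") yields two objects backed by the same Session, while a different id gets its own. Because the two halves never link to each other directly, each one can sit in a different part of the graph (or a different pipeline) without forming a loop.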
Threads
- RTCP thread - Amortised over multiple instances inside rtpbin2 using a tokio scheduler.
- Jitter buffer per stream - rtprecv (part of rtpbin2) uses the same tokio scheduler for RTCP handling as it does for handling timeouts and packets through the jitterbuffer introducing no extra threads.
- dtlsenc is no longer - DTLS is performed (using OpenSSL) directly just before/after ICE processing.
- webrtcsend/webrtcrecv and signalling - Signalling occurs on a tokio runtime shared across all instances of webrtcsend/webrtcrecv.
- webrtcsend/webrtcrecv and ICE - Uses librice on the same tokio runtime as webrtcsend/webrtcrecv.
- Media streaming thread - Same as webrtcbin. Can be amortised by using the threadshare elements.
If we count the number of threads saved, we can see that for every WebRTC session, at least 5 threads are no longer needed in the new design. At 100 sessions, that is roughly a 500 thread saving in both memory and contention.
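To illustrate what amortising timeouts over many sessions means, here is a minimal sketch in which a single worker thread services the timeouts of every session. It is a standard-library-only stand-in and not how webrtcbin2 or tokio implement it: spawn_shared_timer and the plain integer session ids are hypothetical, and a real async runtime handles timers, sockets and tasks on the same threads.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::sync::mpsc::{self, Receiver, RecvTimeoutError, Sender};
use std::thread::{self, JoinHandle};
use std::time::Instant;

/// A timeout request: report the session id once the deadline has passed.
type Request = (Instant, usize);

/// Spawn the single worker thread that services timeouts for all sessions.
/// Returns a sender for registering timeouts, a receiver that yields the
/// ids of sessions whose timeout fired, and the worker's join handle.
/// (Hypothetical helper, for illustration only.)
fn spawn_shared_timer() -> (Sender<Request>, Receiver<usize>, JoinHandle<()>) {
    let (request_tx, request_rx) = mpsc::channel::<Request>();
    let (fired_tx, fired_rx) = mpsc::channel::<usize>();

    let worker = thread::spawn(move || {
        // Min-heap of pending timeouts, earliest deadline first.
        let mut pending: BinaryHeap<Reverse<Request>> = BinaryHeap::new();
        // Becomes false once every request sender has been dropped.
        let mut open = true;

        while open || !pending.is_empty() {
            // Fire every timeout that is already due.
            let now = Instant::now();
            while let Some(Reverse((deadline, session))) = pending.peek().copied() {
                if deadline > now {
                    break;
                }
                pending.pop();
                if fired_tx.send(session).is_err() {
                    return; // nobody is listening any more
                }
            }

            // Wait for the next deadline or for a new request.
            let next = pending.peek().map(|Reverse((deadline, _))| *deadline);
            let incoming = match next {
                Some(deadline) if open => {
                    request_rx.recv_timeout(deadline.saturating_duration_since(Instant::now()))
                }
                Some(deadline) => {
                    thread::sleep(deadline.saturating_duration_since(Instant::now()));
                    Err(RecvTimeoutError::Timeout)
                }
                None => request_rx
                    .recv()
                    .map_err(|_| RecvTimeoutError::Disconnected),
            };
            match incoming {
                Ok(request) => pending.push(Reverse(request)),
                Err(RecvTimeoutError::Timeout) => {}
                Err(RecvTimeoutError::Disconnected) => open = false,
            }
        }
    });

    (request_tx, fired_rx, worker)
}
```

Any number of sessions can clone the request sender and register deadlines, and the worker fires the timeouts it has received in deadline order. Adding a session therefore adds an entry to a heap instead of adding a thread, which is the property the new design relies on.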
Features of webrtcsend/webrtcrecv
While webrtcsend and webrtcrecv are functional and can successfully
communicate with a web browser such as Chrome or Firefox, there are still some
missing pieces. Some of the supported features include:
- Audio and/or Video streaming. Data channels are not currently supported.
- BUNDLE is supported and required for multiple media.
- rtcp-mux is required.
A non-exhaustive list of not yet supported features includes:
- Retransmissions and Forward Error Correction (rtpbin2 does not support this yet).
- Data channels
- Renegotiation
- Statistics
- TURN servers (librice supports them, but this is not yet implemented in webrtcbin2)
All of these missing features are solvable with further implementation effort.
Example
A send and receive example using webrtcbin2 is available in
the upstream repository
and can be used with this example web page.
Just make sure that data channels are not enabled as they are currently not supported.
Closing
This work will be part of the upcoming GStreamer 1.29.2 development snapshot or can be built from the main branch of gst-plugins-rs.
Writing a mature WebRTC implementation is an endeavour that requires a fair bit of implementation effort to complete. If you would like to help make a secure, mature WebRTC implementation for GStreamer please get in touch.