Sean-Der's comments

> use WebRTC and deploy selective forwarding units, which are going to be something custom

Would you mind explaining more? If you are doing WHIP/WHEP you should be able to drop in Broadcast Box/MediaMTX etc... and switch out servers, and no one should notice. You can use browser/mobile/ffmpeg/OBS etc... and get the same behavior. I care a lot about the broadcast space and want to learn about other problems.
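
For anyone reading along - WHIP really is just one HTTP POST of an SDP offer with an SDP answer coming back, which is why servers are swappable. A rough browser-side publish sketch (the endpoint URL is whatever your Broadcast Box/MediaMTX instance exposes):

    // Minimal WHIP publish: POST an SDP offer, apply the SDP answer.
    async function whipPublish(endpoint: string, stream: MediaStream): Promise<RTCPeerConnection> {
      const pc = new RTCPeerConnection();
      for (const track of stream.getTracks()) {
        pc.addTransceiver(track, { direction: "sendonly" });
      }
      await pc.setLocalDescription(await pc.createOffer());

      // Wait for ICE gathering so the offer carries our candidates
      // (skippable if the server supports trickle ICE via PATCH).
      await new Promise<void>((resolve) => {
        if (pc.iceGatheringState === "complete") return resolve();
        pc.onicegatheringstatechange = () => {
          if (pc.iceGatheringState === "complete") resolve();
        };
      });

      const resp = await fetch(endpoint, {
        method: "POST",
        headers: { "Content-Type": "application/sdp" },
        body: pc.localDescription!.sdp,
      });
      await pc.setRemoteDescription({ type: "answer", sdp: await resp.text() });
      return pc;
    }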

> subtly speed up audio/video to keep everything in sync

You can use https://webrtc.googlesource.com/src/+/refs/heads/main/docs/n... to add more delay (if you want to force more buffering). Or if you don't link the media together (via MediaStream) you don't get the behavior you describe either!
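
If you would rather reach that from JS than munge SDP, Chromium exposes the receiver-side knob as jitterBufferTarget (the standardized successor to the old playoutDelayHint, as far as I know). A sketch, assuming pc is your RTCPeerConnection:

    // Ask each receiver's jitter buffer to hold roughly targetMs of media.
    // jitterBufferTarget is in milliseconds (capped at 4000 per the spec);
    // support is still Chromium-first, so treat this as best-effort.
    function addPlayoutDelay(pc: RTCPeerConnection, targetMs: number): void {
      for (const receiver of pc.getReceivers()) {
        (receiver as any).jitterBufferTarget = targetMs;
      }
    }

    addPlayoutDelay(pc, 500); // force ~500 ms of extra buffering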

> capture each participant's audio individually

That's a neat problem. I haven't solved this one myself; I wonder if it's easier with RtpTransport or insertable streams?
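
One approach that should work today, assuming each participant arrives as its own remote track in a browser: wrap each track in its own MediaStream and give it a dedicated MediaRecorder. A sketch (pc is your RTCPeerConnection; what you do with the chunks is left out):

    // One MediaRecorder per remote audio track = one recording per participant.
    const recorders = new Map<string, MediaRecorder>();

    pc.ontrack = (event: RTCTrackEvent) => {
      if (event.track.kind !== "audio") return;
      const solo = new MediaStream([event.track]); // isolate this participant
      const recorder = new MediaRecorder(solo, { mimeType: "audio/webm" });
      recorder.ondataavailable = (e: BlobEvent) => {
        // e.data is a Blob chunk for just this track; upload or stitch as needed
      };
      recorder.start(1000); // emit a chunk every second
      recorders.set(event.track.id, recorder);
    };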


Regarding SFUs - with something like HLS, I can really easily scale up using something like a caching CDN (a pull-through/origin-pull CDN, I believe the term is). But the idea is: I can distribute the HLS media playlist, and have my media segment entries prefixed with a caching/CDN service. The service is configured with the actual origin server, and when a segment isn't in the CDN, the CDN fetches it from the origin on-demand. That was a nice option when I was doing Owncast streaming, since I really only paid based on viewership, and just had to make sure I had the correct cache-related headers on my media segments.
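
The cache headers are the part worth spelling out, since they are what make origin-pull safe: playlists have to stay fresh while finished segments never change. A toy origin sketch in Node/TypeScript (paths and port are made up):

    import { createServer } from "node:http";
    import { createReadStream } from "node:fs";

    // Toy HLS origin: the playlist must revalidate on every CDN fetch,
    // while finished segments never change and can be cached forever.
    createServer((req, res) => {
      const url = req.url ?? "/";
      if (url.endsWith(".m3u8")) {
        res.setHeader("Content-Type", "application/vnd.apple.mpegurl");
        res.setHeader("Cache-Control", "no-cache"); // CDN re-checks the playlist
      } else if (url.endsWith(".ts") || url.endsWith(".m4s")) {
        res.setHeader("Cache-Control", "public, max-age=31536000, immutable");
      }
      const file = createReadStream(`./media${url}`);
      file.on("error", () => { res.statusCode = 404; res.end(); });
      file.pipe(res);
    }).listen(8080);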

Or alternatively - I can push media segments up to a CDN and distribute that way, using an S3-compatible service, or just rsyncing to a server with better bandwidth, etc. One thing I didn't care for - again, back when I was broadcasting with Owncast - was that I needed to make sure old media segments were expired, otherwise I would rack up an insane bill. I had a 24/7 Owncast stream, and if you're not on top of expiring media segments with your CDN, it gets expensive fast.
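
A periodic sweep like the one below is the poor man's version of an S3 lifecycle rule, which is the more hands-off fix. A sketch (directory name and retention window are made up):

    import { readdir, stat, unlink } from "node:fs/promises";

    // Delete segments older than maxAgeMs; playlists are left alone.
    async function expireSegments(dir: string, maxAgeMs: number): Promise<void> {
      const now = Date.now();
      for (const name of await readdir(dir)) {
        if (!name.endsWith(".ts") && !name.endsWith(".m4s")) continue;
        const info = await stat(`${dir}/${name}`);
        if (now - info.mtimeMs > maxAgeMs) await unlink(`${dir}/${name}`);
      }
    }

    // Keep five minutes of segments around, sweeping once a minute.
    setInterval(() => expireSegments("./media", 5 * 60_000), 60_000);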

The overall idea is - serving HLS is ultimately serving files, and there's a good amount of tooling for that, right?

Now that you mention it, I think WHIP/WHEP can solve some of that. I just don't know of any service where I can have that same cache/CDN-like experience, of either having the CDN connect to the origin as needed and fan out, or where I can push up and let the service distribute. (Though - now I'm googling for "webrtc sfu as a service" and I see that it is a thing!)

Didn't know about the playout delay extension.

Whether capturing individual audio is easier with RtpTransport or insertable streams - I'm unsure. Possibly? I just figure that since MoQ is going to rely on things like WebCodecs/WebAudio, there's hopefully a bit more control over what happens with audio as it comes in.

I'll admit though - I've started noticing how often podcasts are clearly recorded using something that doesn't allow per-participant recordings, and I'm guessing that as long as the quality is good enough, most aren't worrying about it.

EDIT: feel like I should mention Pion rules - I used it a few years ago to put together an SRT-to-WebRTC thing and an RTMP-to-WebRTC thing to use with Janus Gateway, and it was so easy.


Huge fan of BigBlueButton! It’s been cool to watch the project go through multiple big tech changes and still keep going.

Never give up! Free/Open Source software, especially in schools, is so important.


You know... As it turns out, that's a different piece of software! (I was super confused at the comment and had to google it.)

My company, and long before that just the website I used to host different projects on, is "BigBlueCeiling", so I tend to default to "BigBlue_________" for project names or offshoots now.

I'm a huge fan of FOSS in general though. And if BigBlueBam found life inside of education I wouldn't hate it.


Yea!

* Do video playback out of the browser. You can render a subset of frames, use a different pipeline for decode etc...

* Pull video from a different source. Join Google Meet on current computer, but stream from another host.


I can't wait for https://w3c.github.io/webrtc-rtptransport/. What you describe about pulling video out seems like the perfect fit.

I ended up doing a proxy because Google Meet doesn't let me hook into any RTCPeerConnection APIs at all. I wanted to send synthetic media in, but couldn't get it working, so I ended up doing a virtual webcam on Linux.


Oh yes! I will pull together a demo.

With ‘media-send’ I can send it out to ffmpeg/GStreamer, and that does all the heavy lifting.


I made a demo recently with my Google Home camera using the official API: https://github.com/hparadiz/camera-notif

But your way of grabbing the stream is so much simpler.

The only problem is that the overlay layer is super new in KDE Plasma. You can also do v4l2loopback and make it a virtual camera.


Have you tried doing video + pipewire yet?

I am also using v4l2loopback, but it's annoying to juggle /dev/video* devices. I wanted to do video stuff in Docker containers, and it would be amazing if I could do pipewire in each container and have no global state.

I couldn't get anything to work in Chromium. Firefox saw the device, but video didn't come across.


When you say +pipewire, do you mean just audio playback? If you are pushing video to a picture-in-picture overlay, a user might expect that, so yeah, you could write to the pipewire socket like any other program. It's usually fully open for you to do just that.

I use v4l2 regularly with OBS. In order for Chrome/Chromium to see the device, you need to create it before launching Chrome/Chromium. You can create v4l2 devices automatically by setting a modprobe config for your kernel.

My v4l2 notes might be helpful http://technex.us/2022/06/v4l2-notes-for-linux/


I wrote this to make reverse engineering WebRTC services easier. It will also let you save/send arbitrary media from WebRTC sessions. The idea is that you do all your auth/interaction in the browser, but then do all the WebRTC in Go, so you have lots more control. There's more to do with it, but it is far enough along to share at least.

In the README is a screenshot of sending my webcam, but with the outgoing video replaced by an ffmpeg testsrc. Handoff sits in between, so it can substitute any arbitrary video.


Interesting and novel project. I don't have anything constructive to add, but well done.


Thanks :)

There's no better feeling than working on something and hearing it is novel! So many projects that I think will be useful miss the mark.


Connect it to an AI talking head and you have a customer service center - users browsing a store can click to talk with 'someone'.


I bookmarked your project years ago, intending to attempt implementing WebRTC fully in a niche programming language. But I think I may have vastly underrated how difficult this is.

Have you come across https://github.com/elixir-webrtc/ex_webrtc ?

Wasn't sure if they used Pion as a guide


What language? Would love to help :) Especially with AI coding, I think it would be a lot more accessible these days.

ex_webrtc is super cool. They have a cool built-in dashboard/analytics flow. It seems way more 'operations friendly' than Pion. I haven't used it heavily myself though.


I am kind of a WebRTC noob but... this means that after I define my input channel (audio track, video, etc.) and establish a peer connection, I can send data from a different source?

Are there any complications with that or is it kind of on me to not confuse the other peer by sending unexpected formats?


Yep exactly! After it starts you can slice in any media you want.

You need to make sure you are sending the same codec that the remote expects, but otherwise nothing else! You can do a different resolution, bitrate, etc...
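
If anyone wants to do the same from a browser instead of Go, RTCRtpSender.replaceTrack is the equivalent move, and since the browser re-encodes for you, the codec-matching concern goes away. A sketch:

    // Swap the outgoing video source mid-call, no renegotiation needed.
    async function switchToScreenShare(pc: RTCPeerConnection): Promise<void> {
      const capture = await navigator.mediaDevices.getDisplayMedia({ video: true });
      const sender = pc.getSenders().find((s) => s.track?.kind === "video");
      await sender?.replaceTrack(capture.getVideoTracks()[0]);
    }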


SFU is nice because it gives everyone privacy. Neither you nor your ISP knows who you are communicating with.

Otherwise, if everything is trusted, I love that we can have conversations without depending on others over the internet :)


There's nothing bad about an SFU, particularly the version you wrote, which forms the basis of LiveKit. It would be my first choice for supporting larger groups in Briefing anyway. If the traffic is E2EE, it doesn't matter if an SFU is involved. The critical part is the signalling, in my opinion - this is where the initial communication is established. In the current version of my app, whose source code is yet to be published, this can happen via an untrusted server.
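
For the E2EE-through-an-untrusted-SFU part, Chromium's encoded insertable streams are one way to do it today: you transform each encoded frame before it hits the wire, so the SFU only ever forwards ciphertext. A sketch, where encryptFrame is a hypothetical stand-in for a real key/crypto layer and track/stream come from getUserMedia:

    // Chromium-specific: encodedInsertableStreams + createEncodedStreams.
    declare function encryptFrame(data: ArrayBuffer): ArrayBuffer; // hypothetical
    declare const track: MediaStreamTrack; // e.g. from getUserMedia
    declare const stream: MediaStream;

    const pc = new RTCPeerConnection({ encodedInsertableStreams: true } as any);
    const sender = pc.addTrack(track, stream);

    // Encrypt every encoded frame so the SFU only ever sees ciphertext.
    const { readable, writable } = (sender as any).createEncodedStreams();
    readable
      .pipeThrough(new TransformStream({
        transform(frame: any, controller: TransformStreamDefaultController) {
          frame.data = encryptFrame(frame.data);
          controller.enqueue(frame);
        },
      }))
      .pipeTo(writable);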


What's painful about running LiveKit? What would make running a WebRTC server easier?


The ecosystem around RTMP is bonkers.

RTMP has SO many users for video (Twitch, YouTube, etc...) yet you have librtmp, which has so many forks. OBS has its own version in-tree....

Then at the same time you have people trying to add extensions to RTMP still!

RTMP has held back the use cases I have cared about since ~2015 so I am excited to see people embrace other options.


That's exciting! When you were evaluating it, did everything about the protocol/APIs fit your needs?

Is it just features/software that need to be implemented?


I wouldn't say I'm done evaluating it, and as a spare-time project, my NVR's needs are pretty simple at present.

But WebCodecs is just really straightforward. It's hard to find anything to complain about.

If you have an IP camera sitting around, you can run a quick WebSocket+WebCodecs example I threw together: <https://github.com/scottlamb/retina> (try `cargo run --package client webcodecs ...`). For one of my cameras, it gives me <160ms glass-to-glass latency, [1] with most of that being the IP camera's encoder. Because WebCodecs doesn't supply a particular jitter buffer implementation, you can just not have one at all if you want to prioritize liveness, and that's what my example does. A welcome change from using MSE.
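
For anyone curious what "just not have one" looks like, the WebCodecs side really is small. A sketch, assuming the server pushes one Annex-B access unit per binary WebSocket message (codec string and URL are made up):

    // Decode and paint frames the moment they arrive: liveness over smoothness.
    const canvas = document.querySelector("canvas")!;
    const ctx = canvas.getContext("2d")!;

    const decoder = new VideoDecoder({
      output: (frame: VideoFrame) => {
        ctx.drawImage(frame, 0, 0, canvas.width, canvas.height);
        frame.close(); // decoded frames hold scarce memory; release fast
      },
      error: (e) => console.error(e),
    });

    // optimizeForLatency asks the decoder not to queue frames internally.
    decoder.configure({ codec: "avc1.640028", optimizeForLatency: true });

    const ws = new WebSocket("wss://example.invalid/camera");
    ws.binaryType = "arraybuffer";
    ws.onmessage = (msg: MessageEvent) => {
      decoder.decode(new EncodedVideoChunk({
        type: "delta", // a real client flips this to "key" on IDR frames
        timestamp: performance.now() * 1000, // microseconds
        data: msg.data as ArrayBuffer,
      }));
    };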

Skipping the jitter buffer also made me realize with one of my cameras, I had a weird pattern where up to six frames would pile up in the decode queue until a key frame and then start over, which without a jitter buffer is hard to miss at 10 fps. It turns out that even though this camera's H.264 encoder never reorders frames, they hadn't bothered to say that in their VUI bitstream restrictions, so the decoder had to introduce additional latency just in case. I added some logic to "fix" the VUI and now its live stream is more responsive too. So the problem I had wasn't MSE's fault exactly, but MSE made it hard to understand because all the buffering was a black box.

[1] https://pasteboard.co/Jfda3nqOQtyV.png

