
Hi everyone, contributor to Livebook and Nx here.

When we started the Numerical Elixir effort, we were excited about the possibilities of mixing projects like Google's XLA (from TensorFlow) and LibTorch (from PyTorch) with the Erlang VM's ability to run concurrent, distributed, and fault-tolerant software.

I am very glad we are at a point where those ideas are coming to life, and I explore part of that in the video. My favorite bit: making the tensor serving implementation distributed across the cluster took only 400 LOC (including docs and tests!): https://github.com/elixir-nx/nx/pull/1090
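
For anyone curious what that looks like from the caller's side, here is a minimal sketch (not the code from the PR; the serving name and the computation are just illustrative):

    # Wrap any JIT-compiled function in a serving.
    serving =
      Nx.Serving.new(fn opts -> Nx.Defn.jit(&Nx.multiply(&1, 2), opts) end)

    # Start it under a supervisor on the nodes that should do the work.
    children = [
      {Nx.Serving, serving: serving, name: MyApp.Serving, batch_size: 8}
    ]

    Supervisor.start_link(children, strategy: :one_for_one)

    # From any connected node, batched_run/2 routes the request to a node
    # running the serving, batching requests along the way.
    batch = Nx.Batch.stack([Nx.tensor([1, 2, 3])])
    Nx.Serving.batched_run(MyApp.Serving, batch)

The nice part is that the call site does not change whether the serving runs on the current node or somewhere else in the cluster.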

I'll be glad to answer questions about Nx or anything from Livebook's launch week!



Nothing specific about this project, and don't feel obligated to respond, but I just wanted to thank you for all the work you've done with Elixir and the related ecosystem. Great language, great tools, and a helpful, welcoming community. It was a perfect introduction to practical functional programming.

Haven't found a big project for it yet, but I've done a bunch of little side projects since a friend who worked at Appcues gave me the hard sell on it around 2018.


> contributor to Livebook and Nx here

While accurate, it's a bit of an understatement :) Thanks for all your work, Jose.


Thanks for sharing!

The distributed ML work currently seems focused on model execution. I see another commenter's excitement about "Looking forward to NX transformations that take distributed training next level" -- which, I agree, will be quite interesting.

Where / how do you see Nx being used effectively in distributed training? Is distributed training a realistic path for open-source models to compete against big tech models?


For distributed training, one important feature is being able to do GPU-to-GPU communication, such as allreduce, allgather, and all-to-all. Those are not supported at the moment, but they are on our roadmap. At that level, however, the language runtime itself plays a reduced role, so I don't expect the experience to be much different from, say, Python/JAX.

For the second question, my understanding is that all big tech models rely on distributed training, so distributed training is a prerequisite for competing, really.


Do you ever think about why you're probably a 100x programmer? Is it just working memory and pure intelligence, or some strategy or tactics that make you so good at this? Asking for a friend :-)


Is anyone working on audio libraries that will enable streaming audio chunks for Whisper processing? Saving audio files to a local file system, running ffmpeg to chunk them, and then sending them off to Whisper is very tactical.


The current pipeline expects PCM audio blobs, and if the data is coming from a microphone in the browser, you can do the initial processing and conversion in the browser (see the JS in this single-file Phoenix speech-to-text example [0]).

On the other hand, if you expect a variety of formats (mp3, wav, etc.), then shelling out to or embedding ffmpeg is probably the quickest path to something working (a rough sketch of that approach is below the links). The Membrane Framework [1] is an option here too, and it supports streaming. I believe Lars is going to do a cool demo with Membrane and ML at ElixirConf EU next week.

[0]: https://github.com/elixir-nx/bumblebee/blob/main/examples/ph...

[1]: https://membrane.stream/
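
A rough sketch of the ffmpeg route (untested, and the Bumblebee function names have shifted between versions; the model and file names are placeholders): convert whatever you receive to 16kHz mono 32-bit float PCM, wrap the bytes in a tensor, and hand it to the Whisper serving.

    {:ok, whisper} = Bumblebee.load_model({:hf, "openai/whisper-tiny"})
    {:ok, featurizer} = Bumblebee.load_featurizer({:hf, "openai/whisper-tiny"})
    {:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/whisper-tiny"})
    {:ok, generation_config} = Bumblebee.load_generation_config({:hf, "openai/whisper-tiny"})

    serving =
      Bumblebee.Audio.speech_to_text(whisper, featurizer, tokenizer, generation_config,
        defn_options: [compiler: EXLA]
      )

    # ffmpeg decodes and resamples any input format to raw 16kHz mono f32 PCM on stdout.
    {pcm, 0} =
      System.cmd("ffmpeg", ~w(-i input.mp3 -ac 1 -ar 16000 -f f32le -loglevel quiet pipe:1))

    audio = Nx.from_binary(pcm, :f32)
    Nx.Serving.run(serving, audio)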


> I believe Lars is going to do a cool demo with Membrane and ML at ElixirConf EU next week.

Yes, the relevant part of his demo with the Membrane pipeline appears to be here: https://github.com/lawik/lively/blob/master/lib/lively/media...


Limited in usefulness... it seems that Lars kept the MembraneTranscript library dependency private.



Quick example video from Chris McCord using ffmpeg and Whisper in Phoenix: https://www.phoenixframework.org/blog/whisper-speech-to-text...


Sure.

I have a rough one using Membrane (media framework) that you can find here: https://github.com/lawik/membrane_transcription

I am using it for a talk I am putting together for ElixirConf EU, so if you want to see it used in context, that might be helpful: https://github.com/lawik/lively

Neither is at release-worthy levels of polish, but if there is interest I should make a proper library out of it.

That is to say, streaming chunks works great already. I would love two things: stitching the edges of the chunks (which would probably require overlapping them), and building chunks based on silence. That's more DSP than I know, though.
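
For the silence-based chunking, the naive version would presumably be RMS energy over fixed windows; a rough, untested sketch with Nx (the window size and threshold are made-up starting values):

    defmodule SilenceChunker do
      @window 1600       # 100 ms at 16 kHz
      @threshold 0.01    # RMS below this counts as silence

      # Takes a 1-D f32 tensor of PCM samples, returns one boolean per window.
      def silent_windows(pcm) do
        usable = div(Nx.size(pcm), @window) * @window

        pcm
        |> Nx.slice([0], [usable])
        |> Nx.reshape({:auto, @window})
        |> then(&Nx.multiply(&1, &1))
        |> Nx.mean(axes: [1])
        |> Nx.sqrt()
        |> Nx.less(@threshold)
        |> Nx.to_flat_list()
        |> Enum.map(&(&1 == 1))
      end
    end

Split points would then be runs of silent windows; the stitching/overlap part is the bit I have not figured out.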


Hey Lars! Building chunks on silence is a worthy cause! Why stitch the edges of the chunks? Is that because there isn't a clean chunk on silence?

I think this work is very important. I don't understand whether I actually needed to install the library dependencies (mad, ffmpeg, portaudio) for Membrane's sake or specifically for this use case. Doesn't feel right.


You may be able to incorporate the Membrane Framework (https://membrane.stream/) to do that. It's built in Elixir and deals with those types of multimedia problems.

I'm not an expert here, but I'd expect that capturing a sample using Membrane and piping it into Whisper should be doable.


As everyone is chiming in to say, fantastic work by you and your team.


Looks amazing! Small user feedback:

Even after reading the blog and installing the Windows app, it's not obvious how to get to the machine learning demos.

Also, after I found the + Smart button from another page, on Windows it fails due to the lack of make (and presumably a set of compiler tools). This was frustrating when trying to demo for someone on their computer.



