Hacker News | pyryt's comments

I would love to do this on my codebase after every commit


Some names are just too tempting https://arxiv.org/abs/1507.02672


This looks promising! Does any LLM understand InstantDB yet?


For comparison, the average manslaughter sentence in Finland seems to be around 9.5 years.


You can get the ball off the ground if you turn your phone upside down. Then you can just sort of fly over the places where you'd normally fall. Takes maybe a minute or two to complete all the levels.


I've enjoyed the classic sales books like SPIN Selling, The Challenger Sale, Fanatical Prospecting, and The Psychology of Selling.


Has anyone experimented with integrating real-time lipsync into a low-latency audio bot? I saw some demos with D-ID, but their pricing was closer to $1/minute, which makes it rather prohibitive.


Interesting project, thanks for sharing


Knowing when to speak is actually a prediction task in itself. See e.g. https://arxiv.org/abs/2010.10874

It would indeed be great to get something like this integrated with Whisper, an LLM, and TTS.


Hard for me to imagine that this could be solved in text space. I think the prediction task needs to be done on the audio.


We thought about doing this in Whisper itself, since it's already working in the audio space.


Yes, this is something we want to look into in more detail. We really appreciate you sharing the research.


Do you have use case demo videos somewhere? Would be great to see this in action


There's one at 00:30 in this YouTube video (timestamped the link): https://www.youtube.com/watch?v=1IdCWqTZLyA&t=32s

