Effects of Gen AI on High Skilled Work: Experiments with Software Developers (Score: 151+ in 15 hours)
Link: https://readhacker.news/s/6e8hT
Comments: https://readhacker.news/c/6e8hT
Study: Playing D&D helps autistic players in social interactions (Score: 150+ in 16 hours)
Link: https://readhacker.news/s/6e84M
Comments: https://readhacker.news/c/6e84M
Intent to unship: HTTP/2 Push (Score: 150+ in 13 hours)
Link: https://readhacker.news/s/6e84y
Comments: https://readhacker.news/c/6e84y
Show HN: Infinity – Realistic AI characters that can speak (Score: 152+ in 5 hours)
Link: https://readhacker.news/c/6e98J
Hey HN, this is Lina, Andrew, and Sidney from Infinity AI (https://infinity.ai/). We've trained our own foundation video model focused on people. As far as we know, this is the first time someone has trained a video diffusion transformer that’s driven by audio input. This is cool because it allows for expressive, realistic-looking characters that actually speak. Here’s a blog with a bunch of examples: https://toinfinityai.github.io/v2-launch-page/
If you want to try it out, you can either (1) go to https://studio.infinity.ai/try-inf2, or (2) post a comment in this thread describing a character and we’ll generate a video for you and reply with a link. For example:
“Mona Lisa saying ‘what the heck are you smiling at?’”: https://bit.ly/3z8l1TM
“A 3D pixar-style gnome with a pointy red hat reciting the Declaration of Independence”: https://bit.ly/3XzpTdS
“Elon Musk singing Fly Me To The Moon by Sinatra”: https://bit.ly/47jyC7C
Our tool at Infinity allows creators to type out a script with what they want their characters to say (and eventually, what they want their characters to do) and get a video out. We’ve trained for about 11 GPU years (~$500k) so far and our model recently started getting good results, so we wanted to share it here. We are still actively training.
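As a rough sanity check on those training figures (not a calculation from the post itself), the implied average price per GPU-hour can be worked out as below. The sketch assumes a 365-day year and treats "~$500k" as exactly $500,000; the variable names are illustrative only.

```python
# Back-of-the-envelope check of "about 11 GPU years (~$500k)".
# Assumptions (not stated in the post): 365-day years, exactly $500,000 spent.
GPU_YEARS = 11
TOTAL_COST_USD = 500_000
HOURS_PER_YEAR = 365 * 24  # 8,760 hours

gpu_hours = GPU_YEARS * HOURS_PER_YEAR
cost_per_gpu_hour = TOTAL_COST_USD / gpu_hours

print(f"GPU-hours: {gpu_hours:,}")
print(f"Implied cost per GPU-hour: ${cost_per_gpu_hour:.2f}")
```

Under those assumptions the figures imply roughly five dollars per GPU-hour on average; the real rate depends on hardware and pricing details the post does not give.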
We had trouble creating good videos of characters with existing AI tools. Generative AI video models (like Runway and Luma) don’t allow characters to speak, and talking avatar companies (like HeyGen and Synthesia) just do lip syncing on top of previously recorded videos. This means you often get facial expressions and gestures that don’t match the audio, resulting in the “uncanny” look you can’t quite put your finger on. See blog.
When we started Infinity, our V1 model took the lip syncing approach. In addition to mismatched gestures, this method had many limitations, including a finite library of actors (we had to fine-tune a model for each one with existing video footage) and an inability to animate imaginary characters.
To address these limitations in V2, we decided to train an end-to-end video diffusion transformer model that takes in a single image, audio, and other conditioning signals and outputs video. We believe this end-to-end approach is the best way to capture the full complexity and nuances of human motion and emotion. One drawback of our approach is that the model is slow, despite using rectified flow (2-4x speedup) and a 3D VAE embedding layer (2-5x speedup).
Here are a few things the model does surprisingly well: (1) it can handle multiple languages, (2) it has learned some physics (e.g. it generates earrings that dangle properly and infers a matching pair on the other ear), (3) it can animate diverse types of images (paintings, sculptures, etc.) despite not being trained on those, and (4) it can handle singing. See blog.
Here are some failure modes of the model: (1) it cannot handle animals (only humanoid images), (2) it often inserts hands into the frame (very annoying and distracting), (3) it’s not robust on cartoons, and (4) it can distort people’s identities (noticeable on well-known figures). See blog.
Try the model here: https://studio.infinity.ai/try-inf2
We’d love to hear what you think!
Mapping 20k ships that sank during WW II (❄️ Score: 151+ in 3 days)
Link: https://readhacker.news/s/6dX2K
Comments: https://readhacker.news/c/6dX2K
2M users but no money in the bank (Score: 155+ in 10 hours)
Link: https://readhacker.news/s/6e7RQ
Comments: https://readhacker.news/c/6e7RQ
Show HN: Wealthfolio: A Private, Open-Source Investment Tracker (🔥 Score: 153+ in 2 hours)
Link: https://readhacker.news/s/6e8vz
Comments: https://readhacker.news/c/6e8vz
Why Don't Tech Companies Pay Their Engineers to Stay? (Score: 150+ in 11 hours)
Link: https://readhacker.news/s/6e7em
Comments: https://readhacker.news/c/6e7em
Swift is a more convenient Rust (🔥 Score: 151+ in 2 hours)
Link: https://readhacker.news/s/6e85d
Comments: https://readhacker.news/c/6e85d
Firefox will consider a Rust implementation of JPEG-XL (❄️ Score: 150+ in 2 days)
Link: https://readhacker.news/s/6dZmA
Comments: https://readhacker.news/c/6dZmA
Deploying rust in existing firmware codebases (Score: 151+ in 16 hours)
Link: https://readhacker.news/s/6e6cw
Comments: https://readhacker.news/c/6e6cw
Reflection 70B, the top open-source model (Score: 150+ in 9 hours)
Link: https://readhacker.news/s/6e6Bf
Comments: https://readhacker.news/c/6e6Bf
NIH cancels ‘Havana syndrome’ research, citing unethical coercion (❄️ Score: 150+ in 3 days)
Link: https://readhacker.news/s/6dTB6
Comments: https://readhacker.news/c/6dTB6
Common food dye found to make skin and muscle temporarily transparent (Score: 150+ in 6 hours)
Link: https://readhacker.news/s/6e6CK
Comments: https://readhacker.news/c/6e6CK
Tell HN: Burnout is bad to your brain, take care (🔥 Score: 165+ in 1 hour)
Link: https://readhacker.news/c/6e79V
I have been depressed and burned out for quite some time, and unfortunately my brain still hasn't recovered from it.
To summarize the impact of burnout on my brain:
- Before: I could learn things pretty quickly, come up with solutions to problems, and even spot common patterns and the bigger underlying problems
- After: can't learn, can't work, can't remember, can't see solutions to trivial problems (e.g. if your shirt is wet, you can change it, but I stare at it wondering when it will dry)
Take care of your mental health.
LSP: The Good, the Bad, and the Ugly (❄️ Score: 151+ in 3 days)
Link: https://readhacker.news/s/6dYnV
Comments: https://readhacker.news/c/6dYnV
Show HN: Using SQL's Turing completeness to build Tetris (❄️ Score: 152+ in 2 days)
Link: https://readhacker.news/s/6dZQH
Comments: https://readhacker.news/c/6dZQH
Your Name in Landsat (❄️ Score: 150+ in 3 days)
Link: https://readhacker.news/s/6dWQE
Comments: https://readhacker.news/c/6dWQE
Nginx has moved to GitHub (Score: 153+ in 6 hours)
Link: https://readhacker.news/s/6e8Tv
Comments: https://readhacker.news/c/6e8Tv
LwIP – Lightweight IP Stack (Score: 150+ in 17 hours)
Link: https://readhacker.news/s/6e7gc
Comments: https://readhacker.news/c/6e7gc
The Elements of APIs (2021) (❄️ Score: 150+ in 3 days)
Link: https://readhacker.news/s/6dVLX
Comments: https://readhacker.news/c/6dVLX
What happens when you touch a pickle to an AM radio tower (Score: 150+ in 9 hours)
Link: https://readhacker.news/s/6e7v8
Comments: https://readhacker.news/c/6e7v8
Serverless-registry: A Docker registry backed by Workers and R2 (Score: 151+ in 19 hours)
Link: https://readhacker.news/s/6e67J
Comments: https://readhacker.news/c/6e67J
The expected value of the game is positive regardless of Ballmer’s strategy (Score: 151+ in 5 hours)
Link: https://readhacker.news/s/6e7JC
Comments: https://readhacker.news/c/6e7JC
Did Sandia use a thermonuclear secondary in a product logo? (🔥 Score: 159+ in 2 hours)
Link: https://readhacker.news/s/6e7Tb
Comments: https://readhacker.news/c/6e7Tb
People can read their manager's mind (2015) (❄️ Score: 151+ in 3 days)
Link: https://readhacker.news/s/6dSZj
Comments: https://readhacker.news/c/6dSZj
The Early Days of Valve from a Woman Inside (Score: 150+ in 7 hours)
Link: https://readhacker.news/s/6e6L6
Comments: https://readhacker.news/c/6e6L6
Clojure 1.12.0 is now available (Score: 150+ in 5 hours)
Link: https://readhacker.news/s/6e6FP
Comments: https://readhacker.news/c/6e6FP
Desed: Demystify and debug your sed scripts (Score: 150+ in 21 hours)
Link: https://readhacker.news/s/6e4C7
Comments: https://readhacker.news/c/6e4C7
Why I self host my servers and what I've recently learned (Score: 153+ in 1 day)
Link: https://readhacker.news/s/6dYzh
Comments: https://readhacker.news/c/6dYzh