Jensen Huang says kids shouldn't learn to code — they should leave it up to AI.

L4sBot@lemmy.world · 9 months ago

Jensen Huang says kids shouldn't learn to code — they should leave it up to AI.

Skvlp@lemm.ee · 9 months ago

Those vocals are pretty good for being computer generated. It’s no replacement for greats like Bowie, Simone, Jagger, Winehouse, Yorke, etc, etc, but it’s not supposed to be. Sometimes it’ll do the trick, sometimes it’ll be a necessity, it’ll work for some backing vocals, demos, sketches, songwriting experimentation, guide vocals, and so on. I hope we’ll see awesome AI tools being used to make awesome music.

I definitely have that fear myself, but I hope human resilience hangs in there. Besides, I don’t think I’d care if the masses listen to bland shit by 17 songwriters or bland shit by AI ;)

Dojan@lemmy.world · edit-2 9 months ago

The quality of the vocals is now honestly less dependent on the synthesis engine than on the skill of the original singer and the intent of the production team. Hayden is a first-party library produced by Dreamtonics, and they tend to be very focused on having their voices do a specific thing. Ninezero, for example, is all-in on that gravelly rock type voice and won’t do soft ballads easily or with any particular quality.

This was true even for VOCALOID; most of the VOCALOID libraries are absolute bunk. CYBER DIVA, the first signature English library from YAMAHA (developer of VOCALOID), sounds so bad. The (in my opinion) best library for VOCALOID happens to be a Hello Kitty collaboration. For some reason they chose a traditional Japanese singer with an incredible vocal range to be the voice provider rather than a voice actor, and the quality of that voice is reflected in the voice library.

EclipsedSounds has three libraries now, and they’ve focused more on capturing the qualities of the original singer. Their first library, SOLARIA, is a soprano whose voice is provided by Emma Rowley. Their second library, ASTERIAN, is a bass, voiced by Eric Hollaway (known as ‘thatbassvoice’). Their third, SAROS, is a tenor whose provider I don’t think has come forth yet. They are much more expressive than most libraries produced by Dreamtonics. SAROS’ second vocal demo is a great example.

One of the neat things about them being synthesized is that these libraries can sing in English, Japanese, Mandarin, Cantonese, and Spanish (and with some fiddling, likely in other languages too - I managed to get SAROS to perform in Norwegian thanks to the Spanish update). Where SynthV really falls short is in the occasional glitches when you push the vocals, as well as the lack of vocal ornamentation; there’s no good way of performing, say, growls at the moment.

I think ultimately human creativity will persevere. We’ll likely see a lot of AI generated garbage in the next couple of years as people get used to the tools and find ways of working with them. After that, I don’t know. Even then there’ll be people that prefer to just do everything by themselves.

We manage to make garbage even without AI. Disney’s “Wish” was so bad people think AI was used, but I think it’s more a matter of “direction by corporate.” Corporate decided to seagull the entire project and the original creative vision was basically destroyed by corporate interests. You see it all the time in the games industry as well; creativity is set aside for proven established ideas, and market appeal. Risks are not allowed.
