Sunday, June 2, 2024

AI Advances: Music!


WOW! Have things advanced in AI music these past few months!

For the past year, I've been deeply involved in learning, using, and putting together AI websites and programs to work for me as a product designer. The improvements are usually amazing and often happen in large leaps. Graphic generation alone has been mind-blowing. It is not perfect, though; much of it is still hit-and-miss.

As for video generation, it is way too new and early -- and I'm not impressed by the overall look of these seconds-long videos. It needs to be less random; currently, there is no consistency. I find it annoying and boring. But enough about that, let's go back to the music . . .

Okay, here's a song I generated.
The songs I create aren't simply left for the AI to make on its own. I wrote the lyrics, fully described what I wanted included in the song, and generally guided its direction in 32-second parts. The best way to describe creating AI music is that you are like a producer in a recording studio, keeping the stuff you like and generally saying things like "Well, that part's not right -- how about doing it this way instead?", "Add a harmonica solo here," or "Add in a 3-part harmony chorus at this point."

My first attempt at doing a Crosby, Stills, and Nash song
with a cameo appearance of Paul McCartney towards the end

I'm using Udio, which lets you develop a full song by building it in half-minute sections. Udio is in beta as of this writing and it's still free to use, BUT the $10/month plan is well worth it if you really want to generate your own songs. There is a lot of randomness in most of this process. I find I generally need about 12-32 takes to find the 30+ second clips I want.

Want to hear more of my songs? Link: youtube.com/@ArghTunes

Sunday, February 4, 2024

Google's New AI, Bard - 1st Impression

You might expect a well-known and powerful company like Google to eventually enter the AI competition arena. One could assume that Google has had a major AI-like system for years, used for web searches and information gathering. Or perhaps not? My first impression of Bard, Google's response to ChatGPT, is, "Holy $#%&! This might be the worst AI I've encountered in over a year!"

Let me explain the level of detail I usually expect. As a toy designer, I use AI to help me collect images of toys, conduct market research, and explore styles and color schemes, among other things. Recently, a client requested generic toy robots but wasn't sure about the exact appearance. This is where AI shines, by generating images/mock-ups in a variety of possible styles.

A very traditional cute metal toy robot design - requested in metallic silver without colors

Another variation showing different flocked materials and some color accents

Above are two examples of robot toy designs, created by Dall-E (available through ChatGPT-4).

I find Dall-E to be the best at "listening" to exactly what I'm asking for in my toy design prompts. Many other image generators tend to overlook key details mentioned in the prompt. Now, let's look at what Google's Bard created. Note: This was done using the exact same prompts I used with Dall-E.

Google . . . I'm speechless. (Was this thing shot in the chest?)
This one's even better. And by "better" I mean worse.
Why the colors? Why does it look old, grimy, and used?
These things would give nightmares to the Island of Misfit Toys. They do look as if a child created them. The colors are so muted and conflicting. If AI was involved with these designs, I'd like to know how it came to this point. Personally, I feel Google shouldn't have released Bard to the public. They should have kept it closed until their technology was at least on par with the AI world a year ago.