Can a Machine's Voice Performance Truly Hold Emotional Depth?

I was listening to a vocal track the other day, and it got me wondering about something. With all the advances in voice synthesis, we’re getting closer to machines that can mimic the subtle inflections and tones of a human singer or speaker. But if a program perfectly replicates the sound of, say, heartbreak or joy in a voice, does that performance lose its meaning because it isn’t “felt” by a person? Or does it actually gain a different kind of significance, maybe as a new form of artistic expression in its own right?

Coming from health informatics, I see data and patterns all day, but I also spend my free time with dance and VR art, areas deeply tied to human physicality and creative intent. That contrast makes this question really fascinating to me. What do you all think? Is the origin of the emotion what gives it weight, or can the replication itself become meaningful?