Theory of Everything

Thread Starter

Jennifer Solomon

Joined Mar 20, 2017
112
It's certainly possible in principle. Play a sustained A minor triad on the piano, centered at middle C. Your fingers are playing the notes associated with 220 Hz, 262 Hz, and 330 Hz. Nonlinearities in the piano cause harmonic distortion in each of these notes, producing a sequence of weighted overtones at integer multiples of each of the fundamental frequencies. How these overtones are weighted determines the timbre of the chord, how it sounds.

There is enormous variety possible in the weights -- every instrument has its own signature, which the performer can modulate through her playing. Despite all this seeming complexity, the overtones are related to the fundamental frequencies by a very simple rule (integer multiples). So, if you know the fundamental frequencies in a complex wave -- and these are generally easy to pick out -- you know how to group any overtones present in the complex wave. Thus, for example, if you had really meant to play A major instead of A minor, you would change the 262 Hz tone (C) to 277 Hz (C#), and then find all the overtones harmonically related to 262 Hz and change them to be related to 277 Hz, with the same relative weighting.
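The regrouping rule described above can be sketched in a few lines. This is only an illustration, assuming made-up overtone weights (`HYPOTHETICAL_WEIGHTS`) and hypothetical helper names (`partials`, `build_chord`, `retune`); only the fundamental frequencies come from the post.

```python
# Fundamentals are taken from the post; the overtone weights are made up
# purely for illustration (a real piano's weights differ per note).
A_MINOR_HZ = [220.0, 262.0, 330.0]              # A, C, E
HYPOTHETICAL_WEIGHTS = [1.0, 0.5, 0.25, 0.125]  # harmonics 1..4


def partials(fundamental_hz, weights):
    """List (frequency, weight) pairs at integer multiples of the fundamental."""
    return [(n * fundamental_hz, w) for n, w in enumerate(weights, start=1)]


def build_chord(fundamentals_hz, weights):
    """Group every partial under the fundamental it belongs to."""
    return {f0: partials(f0, weights) for f0 in fundamentals_hz}


def retune(chord, old_f0, new_f0):
    """Move one note and its overtones, keeping the same relative weights."""
    retuned = {}
    for f0, parts in chord.items():
        if f0 == old_f0:
            # Each partial keeps its harmonic number n and its weight.
            retuned[new_f0] = [(round(freq / old_f0) * new_f0, w) for freq, w in parts]
        else:
            retuned[f0] = parts
    return retuned


a_minor = build_chord(A_MINOR_HZ, HYPOTHETICAL_WEIGHTS)
a_major = retune(a_minor, old_f0=262.0, new_f0=277.0)  # C -> C#
for f0, parts in a_major.items():
    print(f0, parts)
```

The sketch works on a list of known partials. Finding those partials in a real recording is the hard part the post calls "implementation details".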

That's the theory. In practice, there are of course complications. But then it just comes down to implementation details. Here's a short video showing a working example of single-note manipulation in polyphonic (chordal) settings:
Very cool, thanks for the info... puzzling to me is how a speaker is essentially vibrating a parent wave, but all the overtones are present in that wave. How is the speaker representing all of the timbre complexities in one "unilateral" vibration?

Further, digitization of the same wave is a numeric "snapshot" of a given magnitude of it, a certain number of times per second. How are these binary snapshots representing all of those overtone nuances, that when fed back to a D/A converter, re-represent all the constituent parts of the wave?
 

MrChips

Joined Oct 2, 2009
34,882
Maybe you misunderstand.
A "numeric snapshot" is one value in time. This contains zero information of anything occurring before and after this instance.
Two "numeric snapshots" acquired consecutively, Δt seconds apart, now carry more information as a pair.

Let us go further. Three sampled points, for example 0, 100, 0, collectively supply even more information.

The point here is that while one data point provides very little information, many data points collectively contain much more information, i.e. the collective whole is greater than each individual point on its own.

Here is an interesting fact.
A sequence of sampled points, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,... contains an infinite number of frequencies, i.e. all frequencies.
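A quick numerical check of that fact, as a rough sketch: the code below takes a finite 23-point version of the sequence and computes a plain discrete Fourier transform with a hypothetical helper `dft`. For a finite sampled sequence, "all frequencies" means every frequency bin the transform can resolve.

```python
import cmath


def dft(samples):
    """Naive discrete Fourier transform (fine for a handful of points)."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


# Eight zeros, a single 1, then zeros: a finite cut of the sequence in the post.
impulse = [0.0] * 8 + [1.0] + [0.0] * 14

magnitudes = [abs(c) for c in dft(impulse)]
print([round(m, 6) for m in magnitudes])
```

Every bin comes out with the same magnitude, so the single spike is made of equal amounts of every resolvable frequency.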
 

Thread Starter

Jennifer Solomon

Joined Mar 20, 2017
112
No, I do understand — I meant Δt between two points in actuality; in light of that, my questions above still apply:

Puzzling to me is how a speaker is essentially vibrating a parent wave, but all the overtones are present in that wave. How is the speaker representing all of the timbre complexities in one "unilateral" vibration?

Further, digitization of the same wave is a numeric "snapshot" of a given magnitude of it, a certain number of times per second. How are these binary snapshots representing all of those overtone nuances, that when fed back to a D/A converter, re-represent all the constituent parts of the wave?
 

nsaspook

Joined Aug 27, 2009
16,344
Study this: https://en.wikipedia.org/wiki/Nyquist–Shannon_sampling_theorem
 

MrChips

Joined Oct 2, 2009
34,882
Ok. Let's have another try.

One data point does not a tone make.

An ADC sampling at a rate of 10,000 samples per second would acquire one sample every 100 microseconds.
So for a note that lasts for 1/10 of a second, the ADC would have acquired and generated 1000 data points.
This would be our "numeric snapshot", not one point but 1000 points.
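The arithmetic can be checked with a short sketch. The sample rate and note duration come from the numbers above; the 220 Hz test tone (`TONE_HZ`) is an assumption added only so there is something to sample.

```python
import math

SAMPLE_RATE_HZ = 10_000   # from the post
NOTE_DURATION_S = 0.1     # from the post
TONE_HZ = 220.0           # hypothetical test tone, not from the post

sample_period_us = 1_000_000 / SAMPLE_RATE_HZ           # time between samples
num_samples = round(SAMPLE_RATE_HZ * NOTE_DURATION_S)   # points in the "snapshot"
nyquist_hz = SAMPLE_RATE_HZ / 2                         # highest representable frequency

# The "numeric snapshot" of the note: one value per sample instant.
samples = [
    math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE_HZ)
    for n in range(num_samples)
]

print(sample_period_us, num_samples, nyquist_hz)
```

At this rate, any overtone above 5,000 Hz could not be represented, which is why audio is normally sampled much faster.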

A sine wave would look like that shown in the following picture. The sound to our ear and brain would be rather uninteresting. You are correct to observe that musical instruments add "color" to the note by having overtones, timbre, etc.

While the guitar and piano are both playing the same musical note, the acoustic waveforms are very much different from that of a pure sinewave. You should be able to observe that the basic periodicity of the note is present in all three waveforms while the two different instruments have peculiar amounts of artifacts. These artifacts are the distinctive "color" that our brain perceives.

[Attached image 1585865999846.png: waveforms of a pure sine wave, a guitar, and a piano playing the same note]
 

Thread Starter

Jennifer Solomon

Joined Mar 20, 2017
112
Thanks for the reply again...

I get the fact that it is multiple points that determine the audible tone in a single wave. Let me try rephrasing:

Piano = Wave 1
Guitar = Wave 2
Piano and guitar playing together = Wave 3

Audio speaker is representing wave 3, which represents the "union" of wave 1 and wave 2.

Now, taking a recording of wave 3 and playing it through the speaker will allow us to hear wave 3.

But how is it that one can capture wave 3 from the speaker, and deconstruct this wave into wave 1 and 2, if we're dealing with a single vibratory element? At any given point in time, the cone of the speaker is simply moving in and out, representing wave 3.

How is the information of wave 1 and wave 2 embedded in wave 3 as one parent wave, while ALSO remaining discretely addressable after being amalgamated into wave 3? <- this is the actual question.
 

Thread Starter

Jennifer Solomon

Joined Mar 20, 2017
112
Horses and water. :)

You're thinking one dimensional.
Thanks for the video...

I have an IQ of 150+ and I'm not getting my answer from this. ;)
Microphone in a room, multiple wave forms hit the microphone from multiple instruments, and it is vibrating as an amalgamation at any given moment of all those waves. I.e., the diaphragm of the mic is vibrating unilaterally to represent all the waveforms as an amalgam, independent of the question of "multiple dimensions" you raise.

It is then digitized on a computer, and millions of zeros and ones represent that amalgamated waveform.

You can then reconstruct the wave in a DAW and then pull apart the constituent waves into individual stems of the parent wave and hear their independent timbres.

What hard scientific explanation is there for this specific capacity?
 

cmartinez

Joined Jan 17, 2007
8,786
A complex sound wave is simply a superposition of more elemental sound waves, resulting in a sound that the brain then decodes and distinguishes individually if it was trained to do that. For example, if I listen to a piano and a guitar playing at the same time, my brain can "decode" the parts played by the piano and those played by the guitar. But I can do that only because I'm already familiar with them both. That is, if you were to take a caveman and play him the same recording, he wouldn't be able to do the same thing, simply because he'd be unfamiliar with said instruments.

That being said, a piece of software can in theory separate the two, under the condition that said software already has samples of both instruments to work with. I have yet to find software that does this perfectly. That's why at first I said it was impossible... but NSA stepped in, and I stand corrected... I now recognize that it's nearly impossible. And by that, I mean it's nearly impossible to make a perfectly clean separation of the two.
 

nsaspook

Joined Aug 27, 2009
16,344
Thanks for the video...

I have an IQ of 150+ and I'm not getting my answer from this. ;)
Microphone in a room, multiple wave forms hit the microphone from multiple instruments, and it is vibrating as an amalgamation at any given moment of all those waves. I.e., the diaphragm of the mic is vibrating unilaterally to represent all the waveforms as an amalgam, independent of the question of "multiple dimensions" you raise.

It is then digitized on a computer, and millions of zeros and ones represent that amalgamated waveform.

You can then reconstruct the wave in a DAW and then pull apart the constituent waves into individual stems of the parent wave and hear their independent timbres.

What hard scientific explanation is there for this specific capacity?
It's not your IQ that's in question. It's your seeming inability to comprehend the evidence (by several different methods) presented to you as being able to pull apart the constituent waves into individual stems of the parent wave and hear their independent timbres.

Perfectly NO, possible YES.
 

Thread Starter

Jennifer Solomon

Joined Mar 20, 2017
112
Ha.. Ok, I see the confusion, sorry. My question sort of "morphed" into two. The original question I got my answer to, thank you. It's possible, to a limited, imperfect degree, to deconstruct the wave and get at some of the primary stems that represent it.

The second question, which is related, is about the underlying mechanics of the actual recording of a wave, which is seen in my post before this one... e.g., a single ribbon diaphragm in a room is "hearing" all the waves and vibrating as one single vibration over a given time frame. That vibration then travels down a wire and can then be digitized or reproduced. How are the individual components of that wave still remaining addressable? Where do they "exist" over any time t?
 

cmartinez

Joined Jan 17, 2007
8,786
HOW are the individual components of that wave still remaining addressable? WHERE do they "exist" over any time t?
"They" exist in one single wave that has the properties of the previous two (or more) added together... it's your brain's incredible capability that has the capacity to "decode" them. And I say "decode" because objectively speaking the wave retains no individual information of its previous constituents. It is your brain (or a computer) that "separates them" (although I'd use the term "distinguishes them").

Take, for example, a single string on a single guitar. When you pluck it, it won't make a pure sinusoidal sound wave. Instead, it gives off a complex one rich in harmonics etc... but using a Fourier transform, for instance, it would be perfectly possible to separate it into individual pure sinusoidal waves! And yet, you didn't pluck a source of individually accessible sinusoids to obtain that sound, but rather a string on an instrument.
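A minimal sketch of that Fourier idea, under loudly stated assumptions: two pure tones at 50 Hz and 120 Hz stand in for "wave 1" and "wave 2" (real instruments are far richer), the sample rate and window length are hypothetical, and `dft_bin` is a helper written here, not a library call.

```python
import cmath
import math

RATE_HZ = 1000   # hypothetical sample rate
N = 200          # 0.2 s of signal, so DFT bins are 5 Hz apart

# Hypothetical stand-ins for "wave 1" and "wave 2": two pure tones.
wave1 = [1.0 * math.sin(2 * math.pi * 50 * t / RATE_HZ) for t in range(N)]
wave2 = [0.5 * math.sin(2 * math.pi * 120 * t / RATE_HZ) for t in range(N)]

# "Wave 3" is just the sample-by-sample sum: one number per instant.
wave3 = [a + b for a, b in zip(wave1, wave2)]


def dft_bin(samples, k):
    """Correlate the samples with a complex sinusoid at bin k."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))


# Scan the bins below the Nyquist frequency and keep the non-negligible ones.
peaks = {}
for k in range(1, N // 2):
    amplitude = 2 * abs(dft_bin(wave3, k)) / N
    if amplitude > 0.01:
        peaks[k * RATE_HZ / N] = round(amplitude, 3)

print(peaks)
```

The transform reports which sinusoids are present and how strong they are. It does not label them "piano" or "guitar"; that grouping needs prior knowledge of the sources, which is the point made above.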
 

nsaspook

Joined Aug 27, 2009
16,344
Ha.. Ok, I see the confusion, sorry. My question sort of "morphed" into two. The original question I got my answer to, thank you. It's possible, to a limited, imperfect degree, to deconstruct the wave and get at some of the primary stems that represent it.

The second question, which is related, is about the underlying mechanics of the actual recording of a wave, which is seen in my post before this one... e.g., a single ribbon diaphragm in a room is "hearing" all the waves and vibrating as one single vibration over a given time frame. That vibration then travels down a wire and can then be digitized or reproduced. How are the individual components of that wave still remaining addressable? Where do they "exist" over any time t?
They exist as individual energy entities, acoustic and/or electrical, in a sound system. What you see on something like an oscilloscope is a limited (I don't mean there is some hidden dimensional signal, only that X instrument displays X information) picture of the full dimensional (time/frequency) domain of the signal. That energy can't be created or destroyed, only transformed, transferred, ..., simple thermodynamics. That transformation might be lossy to the extent the transportation media can't support the full information of the original signal(s), but if we have a proper transform we can decompose that superposition back to its individual components. It's like using a broadband RF electric field detector. We read a level that's a superposition of all signals (energy) it can detect. If we add filters for narrow bands, we can then deconstruct that superposition of all signals (energy) back into its individual components.
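The detector analogy can be sketched numerically. This is an assumption-heavy illustration, not a model of any real receiver or audio tool: the two tones, the sample rate, and the helpers `rms` and `narrow_band` are all hypothetical, and the "filter" is a simple projection onto one reference frequency rather than an analog band-pass circuit.

```python
import math

RATE_HZ = 1000   # hypothetical sample rate
N = 200          # window holding whole cycles of both tones

# Two hypothetical sources and the single superposed signal a detector sees.
source_a = [1.0 * math.sin(2 * math.pi * 50 * t / RATE_HZ) for t in range(N)]
source_b = [0.5 * math.sin(2 * math.pi * 120 * t / RATE_HZ + 0.3) for t in range(N)]
mix = [a + b for a, b in zip(source_a, source_b)]


def rms(samples):
    """Broadband level: one number for everything in the signal."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def narrow_band(samples, rate_hz, freq_hz):
    """Keep only the part of the signal at freq_hz (projection onto sin/cos)."""
    n = len(samples)
    cos_ref = [math.cos(2 * math.pi * freq_hz * t / rate_hz) for t in range(n)]
    sin_ref = [math.sin(2 * math.pi * freq_hz * t / rate_hz) for t in range(n)]
    c = 2 * sum(x * r for x, r in zip(samples, cos_ref)) / n
    s = 2 * sum(x * r for x, r in zip(samples, sin_ref)) / n
    return [c * cr + s * sr for cr, sr in zip(cos_ref, sin_ref)]


broadband_level = rms(mix)
recovered_b = narrow_band(mix, RATE_HZ, 120)
worst_error = max(abs(x - y) for x, y in zip(recovered_b, source_b))
print(round(broadband_level, 4), worst_error)
```

The recovery is essentially exact here only because both tones are steady and complete a whole number of cycles in the window. Real instruments overlap in frequency and change over time, so a practical separation is imperfect.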
 

Thread Starter

Jennifer Solomon

Joined Mar 20, 2017
112
Yes, I understand this...

But picture a single ribbon diaphragm microphone in a cathedral. 50 instruments are all playing different timbres and pitches, including reverberation and other room effects and artifacts.

These waves all "converge upon" the diaphragm, and it begins vibrating to represent the union of all those waves from all directions, with their unique nuances, and their dimensional placement within the space. The microphone is not "seeing" the wave as having "width" or spatial dimension. It simply vibrates unilaterally, agnostic to the source, as a representation of the union of all the waves.

In real time, this wave at every point can be reduced to a string of binary numbers. Those binary numbers represent ALL of the various waves in the room. The binary numbers can be relayed to the proper circuitry to reconstruct the parent wave with all of that information.

The same question applies to the grooves in a vinyl record. The individual waves are somehow "in that parent wave" as represented in the grooves.

How is the data stored...
 

Thread Starter

Jennifer Solomon

Joined Mar 20, 2017
112
"They" exist in one single wave that has the properties of the previous two (or more) added together... it's your brain's incredible capability that has the capacity to "decode" them. And I say "decode" because objectively speaking the wave retains no individual information of its previous constituents. It is your brain (or a computer) that "separates them" (although I'd use the term "distinguishes them").

Take, for example, a single string on a single guitar. When you pluck it, it won't make a pure sinusoidal sound wave. Instead, it gives off a complex one rich in harmonics etc... but using a Fourier transform, for instance, it would be perfectly possible to separate it into individual pure sinusoidal waves! And yet, you didn't pluck a source of individually accessible sinusoids to obtain that sound, but rather a string on an instrument.
Correct, but "decoding" what?? Essentially, sonic information "embedded" at every section (1-2 seconds?) that represents everything that "happened" in that room, with all of its nuances, artifacts, etc.! The mind's ability to decode it is one thing entirely (which is incredible), but what and where it's decoding is another!
 

cmartinez

Joined Jan 17, 2007
8,786
but what and where it's decoding is another!
Again, the only way to answer that is by previously knowing what is (or could be) on the tape... what if there's a sound whose source is unknown? Like an alien talking or something... how can the mind know what it is, or if it's natural at all?
 

MrChips

Joined Oct 2, 2009
34,882
We are going around in a never ending circle.
You keep asking the same questions. Yet with different perspectives of the same answer, you return with the same questions.

But picture a single ribbon diaphragm microphone in a cathedral. 50 instruments are all playing different timbres and pitches, including reverberation and other room effects and artifacts.
Accepted. Multiple instruments, sound sources, voices in a noisy room. It doesn't matter.

These waves all "converge upon" the diaphragm, and it begins vibrating to represent the union of all those waves from all directions, with their unique nuances, and their dimensional placement within the space. The microphone is not "seeing" the wave as having "width" or spatial dimension. It simply vibrates unilaterally, agnostic to the source, as a representation of the union of all the waves.
Agreed. Accepted. I don't know what you mean by "unilaterally". However, it doesn't matter what the diaphragm is, whether the membrane of an eardrum, a microphone, or a seismograph.

In real time, this wave at every point can be reduced to a string of binary numbers. Those binary numbers represent ALL of the various waves in the room. The binary numbers can be relayed to the proper circuitry to reconstruct the parent wave with all of that information.
True. Accepted. It really doesn't matter if it is binary numbers, voltage, pressure, light illumination. It is a time series of information. All the information is contained in the data pattern.

How is the data stored...
We already told you. The data is not stored anywhere. The data is inherent in the changing time series, in this case an acoustic pressure wave. There has to be air or some medium involved. Sound does not travel in a vacuum.

Is it possible to deconstruct the data back to the original sources by mechanical, analytical, computer, or artificial-intelligence means?
Yes. Perfect deconstruction - no. Reasonable deconstruction so that we can recognize the sources, yes, very possible.

How is the ear able to deconstruct and decode the data in the wave?
That is the area of psychoacoustics and demonstrates the marvel of the human brain.
 

Thread Starter

Jennifer Solomon

Joined Mar 20, 2017
112
I sincerely hope we're not being trolled... :rolleyes:
I assure you you're not.

And I sincerely appreciate the engagement in the question!

Here's the problem:

Assuming no meta-physicality whatsoever, the brain is nothing more than a machine — classical, quantum, what have you — a numeric processing device. Software can "decode" the wave somewhat as well, and break it up into a few of its constituent parts as can the brain.

It's not like we have a mic for each instrument picking up a separate wave with separate data, where we're seeing separate tracks in a DAW. We have ONE diaphragm vibrating in this scenario. The brain and software can "decode" "it"... *IT* is the question! At NO time is there "piano, strings, and trumpets" in the binary information that wave was reduced to. Yet it is fed back to a digital-to-analog converter, and the reconstructed wave represents ALL of those waves that occurred during that time, as if the diaphragm were ITSELF a multi-track recorder.

At t=3s to t=5s, for example, where are the piano, guitar, and drums "represented" so that they can be "heard as separate entities" from that wave, if a ribbon literally picked up just ONE variation of air pressure over any given time span?

Do you see my issue here?
 