Theory of Everything

Thread Starter

Jennifer Solomon

Joined Mar 20, 2017
15
Let's say you captured an audio recording onto an analog format.

The analog recording is of an entire orchestra.

One of the violins played a bad note starting at precisely 4:49:03...

Technically that note is "embedded" precisely at that point in the wave.

Where and how is the note "stored" in the wave? Is there any current theory on how to access that information?
 

Papabravo

Joined Feb 24, 2006
14,691
Let's say you captured an audio recording onto an analog format.

The analog recording is of an entire orchestra.

One of the violins played a bad note at precisely 4:49:03...

Technically that note is "embedded" precisely at that point in the wave.

Where and how is the note "stored" in the wave? Is there any current theory on how to access that information?
Storing is a digital concept and implies that the notes are discrete in some sense. An analog recording is a continuous replica of the original sounds with noise and imperfections and everything. You can access the bad note by capturing and replaying that section of the analog recording and that is it.
 

WBahn

Joined Mar 31, 2012
26,304
Saying that the bad note is captured at precisely some moment in time is nonsensical. The bad note has finite energy and power and hence has temporal extent; it thus spans a non-infinitesimal window of time.

Before you could come up with some theory as to how to access that information, you first have to define what "access" means and what "that information" means. What makes it a "bad note"? If the best you can come up with is that it wasn't the note that was supposed to be played, then how can anyone come up with a theory on how to access information about notes that weren't supposed to be played? How do you know which notes were supposed to be played? How are you going to incorporate that information into your theory?
 

KeithWalker

Joined Jul 10, 2017
1,303
The analog recorded waveform of the orchestra will contain the information of all the instruments being played at any one time. A slice in time of a recorded complex waveform can be displayed in the time domain using an oscilloscope and in the frequency domain using Fourier analysis. The diagram below is a graphical representation, on the left, of the complex time-domain waveform of three notes being played at the same time. On the right is a display of the same complex waveform, over the same period of time, in the frequency domain, showing the frequency and amplitude of the three original notes.
I think that answers your question. There are dedicated instruments called Fourier analyzers which can display frequency domain waveforms and some modern oscilloscopes can do fast Fourier transforms on the fly.
Regards,
Keith

FreqDomain.jpg
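To make the diagram concrete, here is a minimal Python sketch (the three notes, an A-major chord, are my own arbitrary choice): three simultaneous sine waves add into one complex time-domain waveform, and an FFT separates them back into three spectral peaks.

```python
import numpy as np

fs = 8000                      # sample rate, Hz
t = np.arange(0, 0.5, 1 / fs)  # half a second of samples

# Three simultaneous notes: A4, C#5, E5 (an A-major chord)
signal = (np.sin(2 * np.pi * 440 * t)
          + np.sin(2 * np.pi * 554.37 * t)
          + np.sin(2 * np.pi * 659.25 * t))

# The complex time-domain waveform becomes three peaks in the frequency domain
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), 1 / fs)

# The three largest bins sit at (roughly) the three note frequencies
peak_freqs = sorted(freqs[np.argsort(spectrum)[-3:]])
print([int(round(f)) for f in peak_freqs])  # bins nearest 440, 554.37, 659.25 Hz
```

A Fourier analyzer or a scope's FFT mode does essentially this in real time.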
 

Papabravo

Joined Feb 24, 2006
14,691
The analog recorded waveform of the orchestra will contain the information of all the instruments being played at any one time. A slice in time of complex waveform can be displayed in the frequency domain using Fourier analysis. The diagram below is a graphical representation on the left of the complex time domain waveform of three notes being played at the same time. On the right is a display of the same complex waveform in the frequency domain, showing the frequency and amplitude of the three original notes. There are dedicated instruments called Fourier analyzers which can display frequency domain waveforms and some modern oscilloscopes can do fast Fourier transforms on the fly.
Regards,
Keith

View attachment 202648
That is all well and good, but how do we identify the "bad note" in the frequency domain? What purpose would it serve even if we could?
 

MrChips

Joined Oct 2, 2009
22,102
An audio recording is a recording of history in time.

When a note is recorded at exactly 4:49:03, you have to examine not what happened at an instant in time, but what happened over a period of time, for example a fraction of a second.

As an analogy, suppose you made a video recording of someone falling down the stairs. At what precise time did that person fall down the stairs? As you can see, there is no precise time. At some point the person tripped then tumbled and this action took place over a period of time. The tripping may have transpired over a small fraction of a second followed by 3 seconds of physically tumbling.

We now know that sound is a pressure wave, like ripples on a pond.
When you hear a musical note, what reaches your ear is a series of waves. For example, a note might have a frequency of 500 cycles per second, that is, 500 waves every second. The time between two peaks in the wave is then 1/500 of a second, or 2 milliseconds (one millisecond is one thousandth of a second). It would take 20 ms (20 milliseconds) for 10 waves to reach our ear. Our brain cannot register a note from just one wave; we need many waves before we can recognize the note.

Hence this note reached our ear at 4:49:03 and lasted for 20 ms, during which there were 10 peaks in the wave. We can see this visually in the recording using an instrument called an oscilloscope; a similar instrument is called a waveform digitizer. We can even use a sound editor to localize this wave, erase it, and replace it with the correct wave. Not easy to do, but still doable.
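The arithmetic above is easy to check with a short Python sketch (the 48 kHz sample rate is just a common choice): generate 20 ms of a 500 Hz wave and count its peaks.

```python
import numpy as np

fs = 48000                        # sample rate, Hz
f = 500                           # the note's frequency, Hz
t = np.arange(0, 0.020, 1 / fs)   # the 20 ms slice of the recording

wave = np.sin(2 * np.pi * f * t)

# Period = 1/500 s = 2 ms, so the 20 ms window holds exactly 10 cycles.
# Count the positive peaks: places where the slope flips from + to -.
d = np.diff(wave)
peaks = int(np.sum((d[:-1] > 0) & (d[1:] <= 0)))
print(peaks)                      # 10 peaks, as described above
```

This is exactly what you would see by zooming in on that 20 ms slice in a scope or sound editor.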
 

cmartinez

Joined Jan 17, 2007
7,174
The way I understand what you're asking is, how do you separate a single note (be it good or bad) from a single instrument in an entire orchestra? I'm afraid that sort of thing goes beyond my scientific/engineering capabilities... but it does tell you something about the marvel of the human brain. It never ceases to amaze me how we can filter a person's voice from a crowd that's speaking at the same volume. At first we rely on reading that person's lips along with careful listening. But afterwards, we can close our eyes and separate his/her message by concentrating on things such as pitch, mannerisms and tone.

This is an excellent challenge for artificial intelligence developers.
 

MrChips

Joined Oct 2, 2009
22,102
This is what you would see in real time on an oscilloscope:

1585368586402.png


This what a recording might look like on an audio editor showing 27 seconds of a recording.

1585368349009.png

Now you can zoom in to the portion containing the bad note and it might look like this:


1585368643559.png
 

Thread Starter

Jennifer Solomon

Joined Mar 20, 2017
15
The way I understand what you're asking is, how do you separate a single note (be it good or bad) from a single instrument in an entire orchestra? I'm afraid that sort of thing goes beyond my scientific/engineering capabilities... but it does tell you something about the marvel of the human brain. It never ceases to amaze me how we can filter a person's voice from a crowd that's speaking at the same volume. At first we rely on reading that person's lips along with careful listening. But afterwards, we can close our eyes and separate his/her message by concentrating on things such as pitch, mannerisms and tone.

This is an excellent challenge for artificial intelligence developers.

You get the gist of what I’m asking. Thanks for everyone’s reply... the essence I’m getting at is:
The captured recording at any given "time slice" is one parent wave composed of dozens of waves, each representing the notes being played by the individual instruments. So buried between seconds 3.5 and 4.5 is, for example, an A440 from a violin, with all its unique timbre and overtones.

How can one get to that one single wave and change it, if, for example, it was “the wrong note?”
 

Thread Starter

Jennifer Solomon

Joined Mar 20, 2017
15
Storing is a digital concept and implies that the notes are discrete in some sense. An analog recording is a continuous replica of the original sounds with noise and imperfections and everything. You can access the bad note by capturing and replaying that section of the analog recording and that is it.
That’s exactly my point... that specific note is “stored” as a wavelet in the parent wave, no?

How is the mind able to “access” or identify that one note and even mentally change just it?
 

Thread Starter

Jennifer Solomon

Joined Mar 20, 2017
15
Right
Storing is a digital concept and implies that the notes are discrete in some sense. An analog recording is a continuous replica of the original sounds with noise and imperfections and everything. You can access the bad note by capturing and replaying that section of the analog recording and that is it.
...the mind can parse out that one “wavelet” and even modulate it mentally... how is it accessing that wavelet, with all its overtone nuances? (Sorry for double reply, can’t delete it)
 

MrAl

Joined Jun 17, 2014
7,849
Hi,

The brain remembers what a violin sounds like, so it is able to match the sound it hears to its memory of that sound. You might have to set up a database of various instruments that you could use for comparison.
To tune it back to the right note, though, you would use something like Autotune.

The best way, however, is to be able to access the microphone feed for that single instrument during the live performance. I think they might be doing that these days, so you could check into that too.
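For what it's worth, the "get at the one wave" idea can be sketched with a narrow band-pass filter. This toy Python example (three pure tones of my own choosing, using SciPy) isolates a 466 Hz "bad note" from a three-tone "orchestra". It only works because pure tones don't overlap in frequency; a real violin's overtones are interleaved with everyone else's, which is exactly why the real problem is so much harder.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 8000
t = np.arange(0, 1.0, 1 / fs)

# Toy "orchestra": three pure tones, where 466 Hz (A#4) is the "bad note"
mix = (np.sin(2 * np.pi * 330 * t)
       + np.sin(2 * np.pi * 466 * t)
       + np.sin(2 * np.pi * 660 * t))

# A narrow band-pass around 466 Hz pulls that one wave out of the mixture
b, a = butter(4, [440, 490], btype="bandpass", fs=fs)
bad_note = filtfilt(b, a, mix)

# Subtracting it leaves the "orchestra" with the bad note mostly removed
rest = mix - bad_note
```

The isolated bad_note could then be pitch-corrected (roughly what Autotune does) and mixed back into rest.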
 

crutschow

Joined Mar 14, 2008
25,674
How is the mind able to “access” or identify that one note and even mentally change just it?
I think that's a question that no one really has an answer to.
If you look at a music signal on an oscilloscope, it looks like an almost random pattern of waves.
But your ear and brain readily decode it and can generally identify all the instruments making the music, as well as who might be singing a vocal along with it.
I don't think anyone understands how the mind is able to pick out all those individual sounds when they are piled on top of each other, something no computer is close to doing.
Seems like a miracle.
 

MrAl

Joined Jun 17, 2014
7,849
Digital signal processing changed everything because of the high level of signal analysis it allows. Not only that, there are analysis techniques most of us here have never used and probably never heard of.
For example, reverse the first four letters of "spectrum" and we get "cepstrum"; rearrange some letters of "frequency" and we get "quefrency"; do the same for "phase" and we get "saphe". All are associated with a different way of analyzing a signal. I wonder if anyone here has ever heard these words before.
There are also somewhat new types of filters that work on both time and frequency simultaneously.

I've worked with the complex cepstrum and power cepstrum in digital image processing in the past, but have not worked much with the joint time-and-frequency type filters (maybe once, a long time ago).

But if we hear an instrument in an orchestra that we have never heard before, we can't know what it is until we hear that instrument separately or have it described by some means, and this implies learning through memory. Once we have that memory, we can then try to recognize the instrument in a group of others.
Thus I believe that any device that can pick out various instruments would have to have reference recordings of them stored in the system.
Something similar has already been done with facial recognition, even at differing angles.

Digital signal processing is just so useful it's hard to predict what it will be able to do next.
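For the curious, the power cepstrum is simple to compute: take the log of the power spectrum and inverse-transform it. In this toy Python sketch (the 200 Hz tone and its overtones are my own example), the evenly spaced harmonics collapse into a single peak at the pitch period on the quefrency axis.

```python
import numpy as np

fs = 8000
t = np.arange(0, 0.5, 1 / fs)

# A harmonic "instrument" tone: 200 Hz fundamental plus two overtones
x = (np.sin(2 * np.pi * 200 * t)
     + 0.5 * np.sin(2 * np.pi * 400 * t)
     + 0.25 * np.sin(2 * np.pi * 600 * t))

# Power cepstrum: inverse FFT of the log of the power spectrum
power = np.abs(np.fft.fft(x)) ** 2
cepstrum = np.abs(np.fft.ifft(np.log(power + 1e-12)))

# Evenly spaced harmonics become one peak on the "quefrency" axis at the
# pitch period; search periods corresponding to pitches of 100-400 Hz
lo, hi = fs // 400, fs // 100          # 20 to 80 samples
quefrency = lo + np.argmax(cepstrum[lo:hi])
print(fs / quefrency)                  # → 200.0, the fundamental in Hz
```

The log is what makes this work: it turns the multiplicative source/filter structure of the spectrum into something additive that a second Fourier transform can separate.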
 

MrChips

Joined Oct 2, 2009
22,102
Along with what has already been said, here is one way of looking at it.

With all the rest of the orchestra playing, you can consider the note that draws your attention to be a signal buried in noise.
This is a situation that occurs very frequently and can be tackled with DSP tools (Digital Signal Processing).

A common example is radar processing. In such situations, the signal can be extracted from the noise using correlation techniques. Since the signal generated by the radar system is known, the receiver can cross-correlate against that known pattern (a matched filter), and one can further optimize the SNR (signal-to-noise ratio) by transmitting a pattern explicitly designed for best detection.

My Master's thesis was based on DSP of human brain waves. It required extracting alpha waves from the EEG (electroencephalogram) using DSP, in an era before personal computers became readily available.

New techniques are being developed and applied to very complex situations such as radio communications. If you wish to delve deeper into the math and applications of AI (artificial intelligence) and neural networks, there are many papers and publications available online. Here is one example.
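A minimal Python sketch of the radar-style idea (the chirp, noise level, and pulse position are my own toy choices): cross-correlating the received signal against the known transmitted pattern, i.e. a matched filter, makes a pulse buried in noise stand out.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 1000
t = np.arange(0, 0.1, 1 / fs)               # 100-sample known pulse

# Known transmitted pattern: a linear chirp sweeping 50 -> 450 Hz
pulse = np.sin(2 * np.pi * (50 * t + 2000 * t ** 2))

# Received signal: the pulse buried at sample 400 in noise of similar size
received = rng.normal(0.0, 1.0, 1000)
received[400:500] += pulse

# Cross-correlating against the known pattern (a matched filter)
# concentrates the pulse energy into one sharp peak
corr = np.correlate(received, pulse, mode="valid")
detected = int(np.argmax(corr))
print(detected)                              # ~400, where the pulse starts
```

The peak stands out because the correlation adds the pulse's energy coherently, sample by sample, while the noise adds incoherently.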
 

nsaspook

Joined Aug 27, 2009
7,744
New techniques are being developed and applied to very complex situations such as radio communications. If you wish to delve deeper into the math and applications of AI (artificial intelligence) and neural networks, there are many papers and publications available online. Here is one example.
Researchers have noticed that these types of BSS (blind source separation) methods don't work as well as expected on EEG signals. It seems the brain somehow correlates the generated signals, so it's not a cocktail party in your head. The many simultaneous sources of electrical activity in your head are triggered together and are statistically dependent, correlated sources.


It's interesting to see the advances in WSR (wireless signal recognition). Long ago, most methods for identifying and tracking a specific EM source were based on isolating unique factors for each transmitter: slight differences in RF frequency/phase shifts during modulation, carrier artifacts like short- and long-term drift, IM distortion, and PA power-supply-based hum or AM modulation effects. Individually each factor was small, but with enough factor bins and very high-stability, high-fidelity receivers it was possible to classify some individual units just from the received signals.
 

bogosort

Joined Sep 24, 2011
523
I think that's a question that no one really has an answer to.
If you look at a music signal on an oscilloscope, it looks like an almost random pattern of waves.
But your ear and brain readily decode it and can generally identify all the instruments making the music, as well as who might be singing a vocal along with it.
I don't think anyone understands how the mind is able to pick out all those individual sounds when they are piled on top of each other, something no computer is close to doing.
Seems like a miracle.
There's nothing miraculous about it. Psychophysics is a mature field, with well over a century's worth of research into how the auditory system works. While there are still some open questions, the nature of sound discrimination is not one of them.
 

cmartinez

Joined Jan 17, 2007
7,174
There's nothing miraculous about it. Psychophysics is a mature field, with well over a century's worth of research into how the auditory system works. While there are still some open questions, the nature of sound discrimination is not one of them.
Can you elaborate? To my knowledge, the tech needed to accomplish what the OP has asked does not yet exist.
 