Theory of Everything

Jennifer Solomon · Mar 27, 2020

Let's say you captured an audio recording onto an analog format.

The analog recording is of an entire orchestra.

One of the violins played a bad note starting at precisely 4:49:03...

Technically that note is "embedded" precisely at that point in the wave.

Where and how is the note "stored" in the wave? Is there any current theory on how to access that information?

Papabravo · Mar 27, 2020

Jennifer Solomon said:
Let's say you captured an audio recording onto an analog format.

The analog recording is of an entire orchestra.

One of the violins played a bad note at precisely 4:49:03...

Technically that note is "embedded" precisely at that point in the wave.

Where and how is the note "stored" in the wave? Is there any current theory on how to access that information?

Storing is a digital concept and implies that the notes are discrete in some sense. An analog recording is a continuous replica of the original sounds with noise and imperfections and everything. You can access the bad note by capturing and replaying that section of the analog recording and that is it.

WBahn · Mar 27, 2020

Saying that the bad note is captured at precisely some moment in time is nonsensical. The bad note has finite energy and power and hence has temporal extent; it thus spans a non-infinitesimal window of time.

Before you could come up with some theory as to how to access that information, you first have to define what "access" means and what "that information" means. What makes it a "bad note"? If the best you can come up with is that it wasn't the note that was supposed to be played, then how can anyone come up with a theory on how to access information about notes that weren't supposed to be played? How do you know which notes were supposed to be played? How are you going to incorporate that information into your theory?

KeithWalker · Mar 27, 2020

The analog recorded waveform of the orchestra will contain the information of all the instruments being played at any one time. A slice in time of a recorded complex waveform can be displayed in the time domain using an oscilloscope and frequency domain using Fourier analysis. The diagram below is a graphical representation on the left of the complex time domain waveform of three notes being played at the same time. On the right is a display of the same complex waveform, over the same period of time, in the frequency domain, showing the frequency and amplitude of the three original notes.
I think that answers your question. There are dedicated instruments called Fourier analyzers which can display frequency domain waveforms and some modern oscilloscopes can do fast Fourier transforms on the fly.
Regards,
Keith
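
For anyone who wants to try this at home, here is a minimal sketch of the time-domain / frequency-domain idea in Python with NumPy. The sample rate, the three note frequencies and their amplitudes are made-up values for illustration, not taken from the diagram.

```python
import numpy as np

fs = 8000                          # sample rate in Hz (assumed)
t = np.arange(fs) / fs             # exactly one second of samples
# Hypothetical notes: frequency in Hz -> amplitude
notes = {440.0: 1.0, 550.0: 0.6, 660.0: 0.3}

# Time domain: the three notes simply add into one complex waveform
x = sum(a * np.sin(2 * np.pi * f * t) for f, a in notes.items())

# Frequency domain: one-sided amplitude spectrum of the same waveform
spectrum = np.abs(np.fft.rfft(x)) * 2 / len(x)
freqs = np.fft.rfftfreq(len(x), d=1 / fs)

# The three strongest spectral lines give back the original notes
top = np.argsort(spectrum)[-3:]
peaks = sorted((float(freqs[i]), round(float(spectrum[i]), 3)) for i in top)
print(peaks)
```

With one second of data the frequency resolution is 1 Hz, so each note falls exactly on one bin. A real orchestra recording would show smeared peaks plus the overtones of every instrument.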

Papabravo · Mar 27, 2020

KeithWalker said:
The analog recorded waveform of the orchestra will contain the information of all the instruments being played at any one time. A slice in time of complex waveform can be displayed in the frequency domain using Fourier analysis. The diagram below is a graphical representation on the left of the complex time domain waveform of three notes being played at the same time. On the right is a display of the same complex waveform in the frequency domain, showing the frequency and amplitude of the three original notes. There are dedicated instruments called Fourier analyzers which can display frequency domain waveforms and some modern oscilloscopes can do fast Fourier transforms on the fly.
Regards,
Keith

View attachment 202648

That is all well and good, but how do we identify the "bad note" in the frequency domain? What purpose would it serve even if we could?

MrChips · Mar 27, 2020

An audio recording is a recording of history in time.

When a note is recorded at exactly 4:49:03 you have to examine not what happened at an instant in time, but over a period of time, for example a fraction of a second.

As an analogy, suppose you made a video recording of someone falling down the stairs. At what precise time did that person fall down the stairs? As you can see, there is no precise time. At some point the person tripped then tumbled and this action took place over a period of time. The tripping may have transpired over a small fraction of a second followed by 3 seconds of physically tumbling.

We now know that sound is a pressure wave, like ripples on a pond.
When you hear a musical note, what reaches your ear is a series of waves. For example, a note might have a wave frequency of 500 cycles every second, that is, 500 waves per second. The time between two peaks in the wave is 1/500 seconds or 2 milliseconds (one millisecond is a thousand times shorter than a second). It would take 20ms (20 milliseconds) for 10 waves to reach our ear. Our brain cannot register a note from receiving just one wave. We need many waves before we can recognize the note.

Hence this note reached our ear at 4:49:03 and lasted for 20ms. There were 10 peaks in the wave. We can visually see this in the recording using an instrument called an oscilloscope. A similar instrument is called a waveform digitizer. We can actually use a sound editor and localize this wave, erase it, and replace it with the correct wave, not easy to do but still doable.
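
The arithmetic above can be checked with a few lines of Python. This is only a sketch: the 48 kHz sample rate is an assumption, and the burst is a clean sine wave rather than a real violin.

```python
import numpy as np

fs = 48000                 # sample rate in Hz (assumed)
f = 500.0                  # note frequency from the example, in Hz
n_cycles = 10

period_ms = 1000.0 / f                 # time between two peaks: 2 ms
duration_ms = n_cycles * period_ms     # time for 10 waves: 20 ms

# Synthesize the 20 ms burst and count its peaks (local maxima)
t = np.arange(int(fs * duration_ms / 1000)) / fs
burst = np.sin(2 * np.pi * f * t)
is_peak = (burst[1:-1] > burst[:-2]) & (burst[1:-1] >= burst[2:])
n_peaks = int(np.sum(is_peak))

print(period_ms, duration_ms, n_peaks)
```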

cmartinez · Mar 28, 2020

The way I understand what you're asking is, how do you separate a single note (be it good or bad) from a single instrument in an entire orchestra? I'm afraid that sort of thing goes beyond my scientific/engineering capabilities... but it does tell you something about the marvel of the human brain. It never ceases to amaze me how we can filter a person's voice from a crowd that's speaking at the same volume. At first we rely on reading that person's lips along with careful listening. But afterwards, we can close our eyes and separate his/her message by concentrating on things such as pitch, mannerisms and tone.

This is an excellent challenge for artificial intelligence developers.

MrChips · Mar 28, 2020

This is what you would see in real time on an oscilloscope:

This is what a recording might look like on an audio editor showing 27 seconds of a recording.

Now you can zoom in to the portion containing the bad note and it might look like this:

MrAl · Mar 28, 2020

Actually, what I think is used in music is the short-time Fourier transform.
https://en.wikipedia.org/wiki/Short-time_Fourier_transform

You might start with that. This would be used in something like Autotune, which has been around since the 1990s, I think.
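
A rough sketch of the short-time Fourier transform idea, written with plain NumPy rather than any particular audio tool. The test signal, frame length and hop size are assumptions chosen for illustration; this is not how Autotune itself is implemented.

```python
import numpy as np

fs = 8000                                  # sample rate in Hz (assumed)
t = np.arange(2 * fs) / fs                 # two seconds of samples
# Hypothetical melody: 440 Hz for the first second, then 494 Hz
freq = np.where(t < 1.0, 440.0, 494.0)
x = np.sin(2 * np.pi * np.cumsum(freq) / fs)

frame, hop = 1024, 512                     # analysis window and step (assumed)
window = np.hanning(frame)
starts = range(0, len(x) - frame + 1, hop)

# One windowed FFT per frame: rows are time slices, columns are frequencies
stft = np.array([np.fft.rfft(window * x[s:s + frame]) for s in starts])
freqs = np.fft.rfftfreq(frame, d=1 / fs)

# Strongest frequency in each time slice, and the time at the frame centre
dominant = freqs[np.argmax(np.abs(stft), axis=1)]
times = np.array([s + frame / 2 for s in starts]) / fs
```

Plotting `dominant` against `times` shows the pitch stepping up around the one-second mark, which is the kind of time-localized frequency information a single Fourier transform of the whole recording cannot give.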

Jennifer Solomon · Mar 28, 2020

cmartinez said:
The way I understand what you're asking is, how do you separate a single note (be it good or bad) from a single instrument in an entire orchestra? I'm afraid that sort of thing goes beyond my scientific/engineering capabilities... but it does tell you something about the marvel of the human brain. It never ceases to amaze me how we can filter a person's voice from a crowd that's speaking at the same volume. At first we rely on reading that person's lips along with careful listening. But afterwards, we can close our eyes and separate his/her message by concentrating on things such as pitch, mannerisms and tone.

This is an excellent challenge for artificial intelligence developers.

You get the gist of what I’m asking. Thanks for everyone’s reply... the essence I’m getting at is:
The captured recording at any given “time slice” is one parent wave composed of dozens of waves, each of which represents notes being played by all the instruments. So buried at second 3.5 to 4.5 is an A440 of a violin, for example, with all its unique timbre overtones.

How can one get to that one single wave and change it, if, for example, it was “the wrong note?”

Jennifer Solomon · Mar 28, 2020

Papabravo said:
Storing is a digital concept and implies that the notes are discrete in some sense. An analog recording is a continuous replica of the original sounds with noise and imperfections and everything. You can access the bad note by capturing and replaying that section of the analog recording and that is it.

That’s exactly my point... that specific note is “stored” as a wavelet in the parent wave, no?

How is the mind able to “access” or identify that one note and even mentally change just it?

Jennifer Solomon · Mar 28, 2020

Right

Papabravo said:
Storing is a digital concept and implies that the notes are discrete in some sense. An analog recording is a continuous replica of the original sounds with noise and imperfections and everything. You can access the bad note by capturing and replaying that section of the analog recording and that is it.

...the mind can parse out that one “wavelet” and even modulate it mentally... how is it accessing that wavelet, with all its overtone nuances? (Sorry for double reply, can’t delete it)

MrAl · Mar 28, 2020

Hi,

The brain remembers what a violin sounds like so it is able to match up the sound to the memory of that sound. You might have to set up a data base of various instruments that you could use to compare.
To tune it back to the right note though you would use something like Autotune.

The best way however is to be able to access the microphone for that single instrument during the live performance. I think they might be doing that these days so you could check into that too.

crutschow · Mar 28, 2020

Jennifer Solomon said:
How is the mind able to “access” or identify that one note and even mentally change just it?

I think that's a question that no-one really has an answer to.
If you look at the oscilloscope of a music signal it looks like an almost random pattern of sounds.
But your ear and brain readily decode that and can generally identify all the instruments making the music as well as who might also be singing a vocal along with the music.
I don't think anyone understands how the mind is able to pick out all those individual sounds when they are piled on top of each other, which no computer is close to doing.
Seems like a miracle.

MrAl · Mar 28, 2020

Digital signal processing changed everything because of the high level of analysis of signals it allows. Not only that, there are analysis techniques most of us here never used and probably never heard of.
For example, reverse the first four letters of "spectrum" and we get "cepstrum", rearrange some for "frequency" and we get "quefrency", rearrange some for "phase" and we get "saphe". All are associated with a different way of analyzing a signal. I wonder if anyone here has ever heard these words before.
There are also somewhat new types of filters that work on both time and frequency simultaneously.

I've worked with the complex cepstrum and power cepstrum in digital image processing in the past but have not worked with the time & frequency type filters (maybe one time long time ago).
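
For the curious, here is a small sketch of the real cepstrum used as a pitch detector, in Python with NumPy. The test tone, the small offset added before the logarithm, and the 150 to 1000 Hz search range are all assumptions made for this example.

```python
import numpy as np

fs = 8000                                  # sample rate in Hz (assumed)
t = np.arange(fs) / fs                     # one second of samples
f0 = 200.0                                 # hypothetical fundamental in Hz
# A harmonic-rich tone: fundamental plus nine overtones, decaying as 1/k
x = sum(np.sin(2 * np.pi * k * f0 * t) / k for k in range(1, 11))

# Real cepstrum: inverse FFT of the log magnitude spectrum.
# The 1e-3 offset (an assumption) keeps log() away from near-zero bins.
log_mag = np.log(np.abs(np.fft.rfft(x)) + 1e-3)
cepstrum = np.fft.irfft(log_mag)

# The index of the cepstrum is the "quefrency", measured in samples.
# Search only quefrencies matching pitches between 150 Hz and 1000 Hz.
q_min, q_max = int(fs / 1000), int(fs / 150)
q_peak = q_min + int(np.argmax(cepstrum[q_min:q_max]))
estimated_f0 = fs / q_peak

print(q_peak, estimated_f0)
```

The evenly spaced overtones look like a periodic pattern in the log spectrum, and the cepstrum turns that periodicity into a single peak at a quefrency of one pitch period.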

But if we hear an instrument we never heard before in an orchestra, we can't know what it is until we hear that instrument separately and described by some means, and this implies learning through memory. Once we have that memory we can then go on to try to recognize it in a group of other instruments.
Thus i believe that any device that can pick out various instruments would have to have their trace recordings stored in the system.
It's been done with facial recognition already EVEN at differing angles.

Digital signal processing is just so useful it's hard to predict what it will be able to do next.

nsaspook · Mar 28, 2020

There are methods to separate out and analyze audio signals in mixed music and voice signals, such as Blind Source Separation (BSS) and Independent Component Analysis (ICA). Using these methods you can separate one voice within a group of many voices and background sounds like music.

https://arxiv.org/pdf/1812.07504.pdf
https://arxiv.org/pdf/1805.01201.pdf
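
To give a feel for the idea, here is a toy two-source ICA in Python with NumPy. It is only a sketch and is not the method from the papers above: it assumes two synthetic sources, two "microphones" with a made-up mixing matrix, and it finds the unmixing rotation by a simple grid search on kurtosis.

```python
import numpy as np

fs = 8000
t = np.arange(fs) / fs
# Two hypothetical sources: a sine "instrument" and a square-wave "instrument"
S = np.vstack([np.sin(2 * np.pi * 440 * t),
               np.sign(np.sin(2 * np.pi * 313 * t))])

A = np.array([[1.0, 0.6], [0.4, 1.0]])     # assumed mixing (two microphones)
X = A @ S                                  # what the microphones record

# Step 1: whiten the mixtures (zero mean, identity covariance)
X = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(X @ X.T / X.shape[1])
Z = (E / np.sqrt(d)).T @ X

def excess_kurtosis(y):
    y = y - y.mean()
    return np.mean(y ** 4) / np.mean(y ** 2) ** 2 - 3.0

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Step 2: pick the rotation that makes the outputs least Gaussian
angles = np.linspace(0.0, np.pi / 2, 361)
score = [sum(excess_kurtosis(y) ** 2 for y in rotation(a) @ Z) for a in angles]
Y = rotation(angles[int(np.argmax(score))]) @ Z

# Absolute correlation between true sources (rows) and recovered ones (columns)
corr = np.abs(np.corrcoef(np.vstack([S, Y]))[:2, 2:])
print(np.round(corr, 3))
```

Each source should correlate strongly with exactly one recovered component, although the order and sign of the outputs are arbitrary. Note that this needs at least as many microphones as sources, which a single mono recording of an orchestra does not provide.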

MrChips · Mar 28, 2020

Along with what has already been said, here is one way at looking at it.

With all the rest of the orchestra playing, you can consider the note that draws your attention to be a signal buried in noise.
This is a situation that occurs very frequently and can be tackled with DSP tools (Digital Signal Processing).

A common example is radar processing. In such situations, the signal can be extracted from the noise using a technique called autocorrelation. Since the signal generated by the radar system is known, one can optimize the SNR (Signal to Noise Ratio) by sending a signal of a particular pattern explicitly designed for best detection.
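
A minimal sketch of that radar-style detection in Python with NumPy. Here the received signal is cross-correlated against the known transmitted pattern (a matched filter, a close relative of the autocorrelation mentioned above). The pseudo-random code, the noise level and the delay are all made-up values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Known transmitted pattern: a hypothetical pseudo-random +/-1 code
pattern = rng.choice([-1.0, 1.0], size=1024)

# Received signal: noise twice as strong as the echo, with the echo
# hidden at a delay the receiver is supposed to find
true_delay = 1234
received = rng.normal(0.0, 2.0, size=8000)
received[true_delay:true_delay + len(pattern)] += pattern

# Slide the known pattern along the received signal and correlate
corr = np.correlate(received, pattern, mode="valid")
estimated_delay = int(np.argmax(corr))

print(true_delay, estimated_delay)
```

Sample by sample the echo is invisible under the noise, but the correlation adds up all 1024 samples of the pattern coherently, so the peak stands well clear of the background. This only works because the pattern is known in advance, which is not the case for an unexpected wrong note.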

My Master's thesis was based on DSP of human brain waves. It required extracting alpha waves from EEG (electroencephalogram) using DSP in an era before personal computers became readily available.

New techniques are being developed and applied to very complex situations such as radio communications. If you wish to delve deeper into the math and application of AI (Artificial Intelligence) and neural networks there are many papers and publications available online. Here is one example.

nsaspook · Mar 28, 2020

MrChips said:
New techniques are being developed and applied to very complex situations such as radio communications. If you wish to delve deeper into the math and application of AI (Artificial Intelligence) and neural networks there are many papers and publications available online. Here is one example.

Researchers have noticed that these types of BSS methods don't work as well as expected on EEG signals. It seems the brain somehow correlates the generated signals, so it's not a Cocktail Party in your head. The many simultaneous sources of electrical activity in your head are triggered together and are statistically dependent, correlated sources.

It's interesting to see the advances in WSR (wireless signal recognition). Long ago most methods for identifying and tracking a specific EM source were based on isolating unique factors for each transmitter source. These included the slight differences in RF frequency/phase shifts during modulation, carrier artifacts like short/long-term drift, IM distortion, and PA power-supply-based hum or AM modulation effects. Individually each factor was small, but with sufficient factor bins and very high-stability, high-fidelity receivers it was possible to classify some individual units just by the received signals.

bogosort · Mar 28, 2020

crutschow said:
I think that's a question that no-one really has an answer to.
If you look at the oscilloscope of a music signal it looks like an almost random pattern of sounds.
But your ear and brain readily decode that and can generally identify all the instruments making the music as well as who might also be singing a vocal along with the music.
I don't think anyone understands how the mind is able to pick out all those individual sounds when they are piled on top of each other, which no computer is close to doing.
Seems like a miracle.

There's nothing miraculous about it. Psychophysics is a mature field, with well over a century's worth of research into how the auditory system works. While there still are some open questions, the nature of sound discrimination is not one of them.

cmartinez · Mar 28, 2020

bogosort said:
There's nothing miraculous about it. Psychophysics is a mature field, with well over a century's worth of research into how the auditory system works. While there still are some open questions, the nature of sound discrimination is not one of them.

Can you elaborate? To my knowledge, the tech needed to accomplish what the op has asked does not yet exist.
