
Fourier Analyses, or, how the Orchestra learned to play "J..

October 11, 2004 11:47:59 PM

Archived from groups: rec.audio.pro

I'm looking to find out more about writing some software that will use
traditional classical instruments to emulate "natural" or "non-musical
sounds." The software will perform some type of analysis on an audio
file; I imagine an FFT would be used at some point, but the problem with
the FFT is that it only tells you what "perfect," pure-sine-wave
frequencies are present in a sound. Besides the flute, not much else
in an orchestra has anything close to a sine wave output. After this
analysis is done, the software will look through a library of sounds
made by traditional instruments. These sounds will include every
noise and playing style every traditional instrument can produce. The
software will then juggle the sounds around at various dynamic levels,
in various rhythms, and so on, until it comes up with the closest
combination to the original sound. Perhaps a car engine sound file
would yield three double basses, a flute or two in very quiet
irregular rhythms, and maybe a horn would be involved during gear
changes. I might not have to tell you that Gyorgy Ligeti's
"Atmospheres" and his "Mechanical Music" served as the chief
inspiration for this idea.

Has anybody ever heard of anything like this, or know where I might
start to look for info on this subject? I'm not looking for
programming help, but rather help with setting up the math. Are
there any scientific communities online that I could point my
questions to? Are there any books on this type of thing? I've heard
Csound might work for this, but I thought Csound was for composing, not
for analyzing existing sound files. I can't seem to come up with the
right keywords to get anything out of Google, but I hoped someone here
might be able to put me on the right path.
Anonymous
October 12, 2004 11:09:24 AM


inkexit@yahoo.com (Ryan) wrote in message news:<dea39397.0410111847.69138fb1@posting.google.com>...
> I'm looking to find out more about writing some software that will use
> traditional classical instruments to emulate "natural" or "non-musical
> sounds." The software will perform some type of analysis on an audio
> file; I imagine an FFT would be used at some point, but the problem with
> the FFT is that it only tells you what "perfect," pure-sine-wave
> frequencies are present in a sound. Besides the flute, not much else
> in an orchestra has anything close to a sine wave output. After this
> analysis is done, the software will look through a library of sounds
> made by traditional instruments. These sounds will include every
> noise and playing style every traditional instrument can produce. The
> software will then juggle the sounds around at various dynamic levels,
> in various rhythms, and so on, until it comes up with the closest
> combination to the original sound. Perhaps a car engine sound file
> would yield three double basses, a flute or two in very quiet
> irregular rhythms, and maybe a horn would be involved during gear
> changes. I might not have to tell you that Gyorgy Ligeti's
> "Atmospheres" and his "Mechanical Music" served as the chief
> inspiration for this idea.
>
> Has anybody ever heard of anything like this, or know where I might
> start to look for info on this subject? I'm not looking for
> programming help, but rather help with setting up the math. Are
> there any scientific communities online that I could point my
> questions to? Are there any books on this type of thing? I've heard
> Csound might work for this, but I thought Csound was for composing, not
> for analyzing existing sound files. I can't seem to come up with the
> right keywords to get anything out of Google, but I hoped someone here
> might be able to put me on the right path.

Ryan,

You might want to revisit Fourier analysis... my understanding
is that with it, you are not just determining the fundamental
frequency of a sound, but all the other frequencies present in it as
well. The key is that any given sound IS a collection of sine waves,
at different intensities, with different relationships in *time*.

For example, a square wave is a sine wave at the fundamental, then a
series of harmonics (3rd, 5th, 7th, 9th, etc.) in diminishing
amplitude, in a specific arrangement. Fourier can describe this
arrangement.
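That description can be checked numerically. The sketch below (a hypothetical illustration; the sample rate, fundamental, and number of harmonics are arbitrary choices, not anything from the thread) sums odd harmonics at 1/n amplitude, which is exactly the Fourier-series "arrangement" of a square wave:

```python
import numpy as np

# A square wave is the fundamental plus odd harmonics at 1/n amplitude.
fs = 8000                       # samples per second (arbitrary)
f0 = 100                        # fundamental, Hz
t = np.arange(fs) / fs          # one second of time points
square = np.zeros_like(t)
for n in range(1, 100, 2):      # odd harmonics: 1st, 3rd, 5th, ...
    square += np.sin(2 * np.pi * n * f0 * t) / n
square *= 4 / np.pi             # Fourier-series scaling for a +/-1 square

# At a quarter period the partial sum is already very close to +1;
# the remaining wiggle near the jumps is Gibbs ripple.
```

Plotting `square` shows the flat tops and sharp edges emerging as more harmonics are added.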

What you are attempting to do with sounds reminds me of those posters,
where one large picture (say, of a person) is made up of hundreds of
smaller pictures. The movie poster for "The Truman Show" starring Jim
Carrey comes to mind. Perhaps some of the math or the code from that
system may work for what you are doing.

Regards,

Karl Winkler
Lectrosonics, Inc.
http://www.lectrosonics.com
October 12, 2004 1:39:40 PM


inkexit@yahoo.com (Ryan) wrote in message news:<dea39397.0410111847.69138fb1@posting.google.com>...
> ... I imagine FFT would be used at some point, but the problem with
> FFT is that it only tells you what "perfect" or pure sine wave based
> frequencies are present in a sound. Besides the flute, not much else
> in an orchestra has anything close to a sine wave output....



No, that's the whole point of Fourier analysis: ALL waveforms, no
matter how complex, can be decomposed into sine waves of various
amplitudes and phases.

Mark
Anonymous
October 12, 2004 2:28:59 PM


Ryan <inkexit@yahoo.com> wrote:
>I'm looking to find out more about writing some software that will use
>traditional classical instruments to emulate "natural" or "non musical
>sounds." The software will perform some type of analysis on an audio
>file, I imagine FFT would be used at some point, but the problem with
>FFT is that it only tells you what "perfect" or pure sine wave based
>frequencies are present in a sound.

No. ANY arbitrary waveform can be decomposed down to sine waves. When you
put the sines back together, you can reconstitute the original wave. This is
the WHOLE POINT of the Fourier series. The time domain and frequency domain
representations of the waveform are equivalent and you can convert from one
to the other and back with impunity.
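This round trip is easy to demonstrate. A minimal NumPy sketch (random noise stands in for "any arbitrary waveform"; nothing here is audio-specific):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)     # an arbitrary "waveform"
X = np.fft.fft(x)                 # frequency-domain representation
x_back = np.fft.ifft(X).real      # back to the time domain

# x_back matches x to within floating-point rounding: the two
# representations carry exactly the same information.
```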

>Besides the flute, not much else
>in an orchestra has anything close to a sine wave output. After this
>analysis is done, the software will look through a library of sounds
>made by traditional instruments. These sounds will include every
>noise and playing style every traditional instrument can produce. The
>software will then juggle the sounds around at various dynamic levels
>in various rhythms, and so on, until it comes up with the closest
>combination to the original sound.

Why use a computer for this anyway? George Gershwin did a perfectly good
job of this by ear.
--scott
--
"C'est un Nagra. C'est suisse, et tres, tres precis."
October 12, 2004 6:29:47 PM


kludge@panix.com (Scott Dorsey) wrote in message news:<ckgpnb$p4p$1@panix2.panix.com>...


Hi Scott. How have you been? Heard any more Sonic Youth of late?

> Ryan <inkexit@yahoo.com> wrote:
> >I'm looking to find out more about writing some software that will use
> >traditional classical instruments to emulate "natural" or "non musical
> >sounds." The software will perform some type of analysis on an audio
> >file, I imagine FFT would be used at some point, but the problem with
> >FFT is that it only tells you what "perfect" or pure sine wave based
> >frequencies are present in a sound.
>
> No. ANY arbitrary waveform can be decomposed down to sine waves. When you
> put the sines back together, you can reconstitute the original wave. This is
> the WHOLE POINT of the Fourier series. The time domain and frequency domain
> representations of the waveform are equivalent and you can convert from one
> to the other and back with impunity.

So what I have to do is perform an FFT on each of my sound "samples" (the
squeak of a violin played behind the bridge, a violin's "dry string"
sounds, regular arco, pizzicato, etc., and all the other instruments),
and then perform an FFT on any given sound file I'm interested
in emulating. After that, what kind of math would be used to sort
through all the samples and figure out what goes best where?

Samplitude features an FFT analysis window. It just looks like a
regular EQ analysis to me. Is it the case that if I take each
frequency as a sine wave and apply it at the given amplitude, I
will have achieved X's sound? Is there any way to simplify that? Even
the simplest natural sounds span about a 5 kHz range. Do I have to
create 5000 individual sine waves? The FFT graph only shows frequency
over time. How do I find out about the relationships between the
frequencies as far as timing? For example, say I put a sine wave at
2 kHz and one at 1 kHz. Obviously the 2 kHz wave oscillates twice as
fast as the 1 kHz, but beyond that, the starting/ending points (where
y=0) might not sync up. The 2 kHz sine may start, say, three
hundredths of a second after the 1 kHz. I don't think info like this
can be found in the FFT window, can it?

Do I have this right at all, or am I still not grasping Fourier
transforms?

> >Besides the flute, not much else
> >in an orchestra has anything close to a sine wave output. After this
> >analysis is done, the software will look through a library of sounds
> >made by traditional instruments. These sounds will include every
> >noise and playing style every traditional instrument can produce. The
> >software will then juggle the sounds around at various dynamic levels
> >in various rhythms and etc until it comes up with the closest
> >combination to the original sound.
>
> Why use a computer for this anyway? George Gershwin did a perfectly good
> job of this by ear.
> --scott

The hope is to use this as a learning tool and eventually stop using
it, not unlike training wheels on a bicycle. I could probably do a
decent job of this in a tonal 4/4 world, but most real life sounds
contain dissonant and microtonal intervals, as well as many
"co-rhythms" that work together to create larger aspects of the sound,
such as pulses, and trigonometric polynomials. Making something that
sounds like a train whistle is one thing. I imagine it would have
been rather difficult for a composer of even Gershwin's skill to
notate the sound of a babbling brook during a rainstorm, with a
propeller airplane heard far off in the distance. It would
break my head, not to mention take a considerable amount of time for
me to do this by ear. Whereas with a system of this sort, I could run
twenty analyses and in a day know far more about this type of
orchestration than I would in a month if I did it all in my head. I
would gain a good overall knowledge that I can use as starting points
for future works; I would have a "feel for it". On the other hand,
doing this all by ear until I figure out how to make it work is like
finding out the details first and only later getting the overall
picture--not the most efficient way of working. Like trying to
complete a jigsaw puzzle with no picture of what the finished puzzle
looks like. Learning the individual interactions between the parts
does not always lead to a good understanding of the whole. Anyway, I
learn best working from the outside in.
Anonymous
October 12, 2004 8:27:37 PM


On Tue, 12 Oct 2004 07:09:24 -0700, Karl Winkler wrote:

<snip>
> What you are attempting to do with sounds reminds me of those posters,
> where one large picture (say, of a person) is made up of hundreds of
> smaller pictures. The movie poster for "The Truman Show" starring Jim
> Carrey comes to mind. Perhaps some of the math or the code from that
> system may work for what you are doing.

There is a free program called "Soundmosaic" that does exactly this. It
sorta works.

http://thalassocracy.org/soundmosaic/

(Some people here may appreciate the demo of a George Bush speech
combined with a chimp screaming.)

And "Dissociated Studio," which does the same kind of thing, but within a
single audio file.

http://www.panix.com/~asl2/music/dissoc_studio/
Anonymous
October 12, 2004 10:14:46 PM


Ryan <inkexit@yahoo.com> wrote:
>kludge@panix.com (Scott Dorsey) wrote in message news:<ckgpnb$p4p$1@panix2.panix.com>...
>
>Hi Scott. How have you been? Heard anymore Sonic Youth of late?

I'm listening to Toots and the Maytals as I type this...
>
>> No. ANY arbitrary waveform can be decomposed down to sine waves. When you
>> put the sines back together, you can reconstitute the original wave. This is
>> the WHOLE POINT of the Fourier series. The time domain and frequency domain
>> representations of the waveform are equivalent and you can convert from one
>> to the other and back with impunity.
>
>So what I have to do is perform an FFT on each of my sound "samples" (the
>squeak of a violin played behind the bridge, a violin's "dry string"
>sounds, regular arco, pizzicato, etc., and all the other instruments),
>and then perform an FFT on any given sound file I'm interested
>in emulating. After that, what kind of math would be used to sort
>through all the samples and figure out what goes best where?

I'm not sure this will really do what you want, but you can try it. You
could just do a standard correlation coefficient and see how close they
come.

Then again, you could probably just do a correlation coefficient on the
samples themselves. That might be fun to look at.
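The correlation-coefficient idea can be sketched in a few lines. This is a hedged illustration, not anything from the thread: the signals, names, and the choice of comparing magnitude spectra (so that phase shifts don't matter) are all invented for the example.

```python
import numpy as np

fs = 8000
t = np.arange(fs) / fs
target = np.sin(2 * np.pi * 220 * t)              # sound to be emulated
candidate_a = np.sin(2 * np.pi * 220 * t + 0.7)   # same pitch, other phase
candidate_b = np.sin(2 * np.pi * 330 * t)         # different pitch

def spectral_match(a, b):
    """Correlation coefficient between two magnitude spectra."""
    A = np.abs(np.fft.rfft(a))
    B = np.abs(np.fft.rfft(b))
    return np.corrcoef(A, B)[0, 1]

# candidate_a scores essentially 1.0; candidate_b scores near zero, so
# ranking a sample library by this score is one crude way to "sort
# through all the samples".
```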

>Samplitude features an FFT analysis window. It just looks like a
>regular EQ analysis to me. Is it the case that if I take each
>frequency as a sine wave and apply it at the given amplitude, I
>will have achieved X's sound? Is there any way to simplify that? Even
>the simplest natural sounds span about a 5 kHz range. Do I have to
>create 5000 individual sine waves? The FFT graph only shows frequency
>over time. How do I find out about the relationships between the
>frequencies as far as timing? For example, say I put a sine wave at
>2 kHz and one at 1 kHz. Obviously the 2 kHz wave oscillates twice as
>fast as the 1 kHz, but beyond that, the starting/ending points (where
>y=0) might not sync up. The 2 kHz sine may start, say, three
>hundredths of a second after the 1 kHz. I don't think info like this
>can be found in the FFT window, can it?

No, you probably want a tool like MATLAB. How many terms you want to
calculate out to depends on how good an approximation you want. I think
that the number of terms that you're going to get is going to be larger
than the number of samples in the original file for most arbitrary sounds.
You can decide to reduce this by bandlimiting the original signal, though.
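For what it's worth, the phase information asked about above is not actually lost: each FFT bin is a complex number whose angle encodes the starting phase of that sine, and an analyzer display typically plots only the magnitudes. A NumPy sketch (the frequencies and the built-in offset are invented for illustration):

```python
import numpy as np

fs = 8000
t = np.arange(fs) / fs
# A 1 kHz sine, plus a 2 kHz sine delayed by a sixth of a cycle.
x = np.sin(2 * np.pi * 1000 * t) + np.sin(2 * np.pi * 2000 * t - np.pi / 3)

X = np.fft.rfft(x)
phase_1k = np.angle(X[1000])   # bin index equals Hz here, since len(x) == fs
phase_2k = np.angle(X[2000])

# A pure sine lands at angle -pi/2 in its bin; the 2 kHz component comes
# out a further pi/3 behind, exactly the offset built in above.
```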
--scott
--
"C'est un Nagra. C'est suisse, et tres, tres precis."
Anonymous
October 13, 2004 4:21:44 AM


Symbolic Sound's KYMA does resynthesis in real-time.

Gotta buy the box, though . . .




Kurt Riemann
October 13, 2004 3:52:20 PM


kludge@panix.com (Scott Dorsey) wrote in message
> I'm listening to Toots and the Maytals as I type this...

Whoa. New stuff? I haven't even heard that title before. I'm kind of
weaning off the Youth a little bit. Nowadays I'm really into Ligeti.
Have you heard his "Atmospheres", his "San Francisco Polyphony", or
his "Continuum (fur Cembalo)"? My goodness! They're must-listens.
He has written some of the most revolutionary music I have ever heard.
I'm sure you will understand my wanting this type of software
once you hear these pieces, if you haven't heard them already.

> >So what I have to do is perform an FFT on each of my sound "samples" (the
> >squeak of a violin played behind the bridge, a violin's "dry string"
> >sounds, regular arco, pizzicato, etc., and all the other instruments),
> >and then perform an FFT on any given sound file I'm interested
> >in emulating. After that, what kind of math would be used to sort
> >through all the samples and figure out what goes best where?
>
> I'm not sure this will really do what you want, but you can try it. You
> could just do a standard correlation coefficient and see how close they
> come.
>
> Then again, you could probably just do a correlation coefficient on the
> samples themselves. That might be fun to look at.

A correlation for the whole sound file, or one every set
number of seconds, or some type of GUI tool for setting up the sections
you want to emulate? This would be good for monophonic reduction, but
more math would be involved if you wanted to reduce the sound file to,
say, 3 or 13 concurrent instruments, right?

Ideally this software would/could use both of these approaches.

> >Samplitude features an FFT analysis window. It just looks like a
> >regular EQ analysis to me. Is it the case that if I take each
> >frequency as a sine wave and apply it at the given amplitude, I
> >will have achieved X's sound? Is there any way to simplify that? Even
> >the simplest natural sounds span about a 5 kHz range. Do I have to
> >create 5000 individual sine waves? The FFT graph only shows frequency
> >over time. How do I find out about the relationships between the
> >frequencies as far as timing? For example, say I put a sine wave at
> >2 kHz and one at 1 kHz. Obviously the 2 kHz wave oscillates twice as
> >fast as the 1 kHz, but beyond that, the starting/ending points (where
> >y=0) might not sync up. The 2 kHz sine may start, say, three
> >hundredths of a second after the 1 kHz. I don't think info like this
> >can be found in the FFT window, can it?
>
> No, you probably want a tool like MATLAB. How many terms you want to
> calculate out to depends on how good an approximation you want. I think
> that the number of terms that you're going to get is going to be larger
> than the number of samples in the original file for most arbitrary sounds.
> You can decide to reduce this by bandlimiting the original signal, though.
> --scott

Terms? As in how many instruments I want to end up with? Or the specs
by which I will measure the original sound file? If the latter, do you
mean something like bit rate, sample rate, or something else? Why would
the number of terms be greater than the sample rate? Is MATLAB an audio
tool? Probably just a math program, right? So I would enter the PCM
data, run the calculations, and then use the output to create a PCM
file? Sorry for so many questions.
Anonymous
October 13, 2004 9:01:59 PM


Ryan wrote:

> Terms? As in how many instruments I want to end up with? Or the specs
> by which I will measure the original sound file? If the latter, do you
> mean something like bit rate, sample rate, or something else? Why would
> the number of terms be greater than the sample rate? Is MATLAB an audio
> tool? Probably just a math program, right? So I would enter the PCM
> data, run the calculations, and then use the output to create a PCM
> file? Sorry for so many questions.

Ryan, most of what you are asking about is well beyond the
state of the art, the art being DSP. I would suggest that
you go to comp.dsp and set forth what it is you want, to get
more specific feedback about it.


Bob
--

"Things should be described as simply as possible, but no
simpler."

A. Einstein
Anonymous
October 14, 2004 7:50:50 AM


On Wed, 13 Oct 2004 17:01:59 -0700, Bob Cain
<arcane@arcanemethods.com> wrote:

>
>
>Ryan wrote:
>
>> Terms? As in how many instruments I want to end up with? Or the specs
>> by which I will measure the original sound file? If the latter, do you
>> mean something like bit rate, sample rate, or something else? Why would
>> the number of terms be greater than the sample rate? Is MATLAB an audio
>> tool? Probably just a math program, right? So I would enter the PCM
>> data, run the calculations, and then use the output to create a PCM
>> file? Sorry for so many questions.
>
>Ryan, most of what you are asking about is well beyond the
>state of the art, the art being DSP.

I'm trying to follow the thoughts... it appears what he wants is a
computer program that does with an orchestra what one does with a
synthesizer to imitate the sound of a musical instrument ("imitative
synthesis"). I suppose nowadays you could write a program that scans a
digitized audio recording and makes a patch (or orchestral score) that
somewhat crudely approximates the sound, but it could surely be
tweaked by hand/ear to make it better, or perhaps a synthesist (a person
making a synth patch) would just start over and make something that
sounds better/closer. I doubt that having it do a mathematical
operation such as fitting a least-squares match of the FFT would get it
anywhere near the "original sound" as well as a person experienced in
doing these things would.
But to make "arbitrary sounds" with orchestral instruments... the
only thing I've heard that's anything like this is on Peter Schickele's
"Upper West Side", where he says something about hearing Vivaldi one
more time. The strings play through the melody once, then they play
the beat of the melody with hip-hop record-scratching sounds. It was
hard to believe my ears. Is there a video? I'd like to SEE these
string players reproducing this speed-up-and-slow-down
record-scratching sound.


>I would suggest that
>you go to comp.dsp and set forth what it is you want, to get
>more specific feedback about it.

Like MIDI output of polyphonic audio input, this technology is not
quite (actually nowhere near) ready for prime time.

>
>
>Bob

-----
http://mindspring.com/~benbradley
October 14, 2004 12:07:07 PM


Ben Bradley <ben_nospam_bradley@mindspring.com> wrote in message news:<d6trm0p9u4j0adp0sr6tf6tkgf29ou0ung@4ax.com>...

> I'm trying to follow the thoughts... it appears what he wants is a
> computer program that does with an orchestra what one does with a
> synthesizer to imitate the sound of a musical instrument ("imitative
> synthesis"). I suppose nowadays you could write a program that scans a
> digitized audio recording and makes a patch (or orchestral score) that
> somewhat crudely approximates the sound, but it could surely be
> tweaked by hand/ear to make it better, or perhaps a synthesist (a person
> making a synth patch) would just start over and make something that
> sounds better/closer. I doubt that having it do a mathematical
> operation such as fitting a least-squares match of the FFT would get it
> anywhere near the "original sound" as well as a person experienced in
> doing these things would.

Well, maybe, I don't really know. I'd be surprised if some type of
math couldn't be rigged up that would do as good a job as a human.
It's all analytical, and actually not too subjective. It will either
sound like a jet engine or not, and since the computer will "know"
what a jet engine sounds like thanks to the FFT and differential
analysis, it seems to me this should be as easy as asking a computer to
come up with a number that adds to 7 to make ten.

It doesn't matter to me whether it's in real time or not. It could take an
hour to process a minute-long sound file for all I care. And once I
get something together I can tweak it for better results, and it
doesn't have to be perfect. Again, this will be mainly a learning
tool.

> But to make "arbitrary sounds" with orchestral instruments ... the
> only thing I've heard that's anything like this is on Peter Schickele's
> "Upper West Side", where he says something about hearing Vivaldi one
> more time. The strings play through the melody once, then they play
> the beat of the melody with hip-hop record-scratching sounds. It was
> hard to believe my ears. Is there a video? I'd like to SEE these
> string players reproducing this speed-up-and-slow-down
> record-scratching sound.

Hmmm, I've never heard this before. You're not talking about the
musical "West Side Story," I gather. Anyway, my guess is that it's a
simple dodecaphonic or maybe microtonal glissando performed with light
enough pressure on the bow/strings to emit that rosiny, scratchy sound.
I keep bringing up this guy's name, but if you haven't listened to
any Ligeti, you really owe it to yourself to. His "Atmospheres" and
his "Harmonies (for organ)" are good starting points. His music often
sounds like arbitrary sounds, and it's always produced with
traditional instruments. "Harmonies" is especially interesting. The
organ has to be rigged up to change the inner air pressure so as to
play microtonally. The low-powered organ sounds like a giant whoosh
of sound, or the kind of still wonder you might expect an astronaut to
hear in his head. It mesmerizes and twinkles like distant stars or
complex microscopic schools of glowing plankton in the ocean at night.
In fact, a small bit of "Atmospheres" was used in 2001: A Space
Odyssey. A lot of his music takes you into the moment, stops your
breath, and makes you question why no one else thought of it first.
He does this partially by emulating real-world sound.



> >I would suggest that
> >you go to comp.dsp and set forth what it is you want, to get
> >more specific feedback about it.

This is good advice.


> Like MIDI output of polyphonic audio input, this technology is not
> quite (actually nowhere near) ready for prime time.

If it was out and available on every supermarket endcap, I probably
wouldn't want anything to do with it! ;-) This interests me because,
as far as I know, it isn't really done that much (the orchestration,
not the software), and certainly not to the extent I want to take it to.


> >
> >
> >Bob
>
> -----
> http://mindspring.com/~benbradley
Anonymous
October 14, 2004 2:40:58 PM


Ryan wrote:


> Well, maybe, I don't really know. I'd be surprised if some type of
> math couldn't be rigged up that would do as good a job as a human.

Prepare, then, to be surprised. Our mechanisms for feature
extraction and interpretation remain largely a mystery. The
process is highly algorithmic, and that is very different
from mathematical, although math can be employed in some
algorithmic processes.

> It's all analytical, and actually not too subjective. It will either
> sound like a jet engine or not, and since the computer will "know"
> what a jet engine sounds like thanks to the FFT and differential
> analysis, it seems to me this should be as easy as asking a computer to
> come up with a number that adds to 7 to make ten.

An FFT doesn't begin to disclose what you are looking for in
and of itself. It's no more than a view of the same data
with a different independent axis. It contains no
information at all about when things happen.

In any event, the ear brain does not do a Fourier analysis.
There are frequency-dependent mechanisms, but they are
totally ad hoc in terms of what nature found most useful for
subsequent analysis.

In a very real sense you are asking for an artificial ear
all the way through to the process of blind separation.
That problem remains a curiosity that researchers are
merely nibbling the edges of.

You might want to Google on "blind separation" to see how
much your problem involves that and how little progress has
been made.


Bob
--

"Things should be described as simply as possible, but no
simpler."

A. Einstein
Anonymous
October 14, 2004 8:18:28 PM


Ryan wrote:

> I'd be surprised if some type of
> math couldn't be rigged up that would do as good a job as a human.

If the math is required to make the assumptions you make in the next few
sentences, putting the calcs together is going to be tough.

> It's all analytical, and actually not too subjective. It will either
> sound like a jet engine or not, and since the computer will "know"
> what a jet engine sounds like thanks to the FFT and differential
> analysis, it seems to me this should be as easy as asking a computer to
> come up with a number that adds to 7 to make ten.

Do all oboes sound the same? All violins? All trumpets? _All jet
engines_?

"Not too subjective" goes into the grist mill when a creative mind
chooses among available voicings for a given instrument.

--
ha
Anonymous
October 14, 2004 8:27:00 PM


I think one of the things you'll find, investigating these real-world
sounds, is that most of them differ drastically from the sound made by most
musical instruments in that they are inharmonic; in other words, musical
instruments produce sound consisting mostly of a fundamental and harmonics,
at integer multiples of the fundamental frequency. Real-world noises, to a
great extent, have mixtures of frequencies that aren't integer multiples of
one another.
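The harmonic-versus-inharmonic distinction is easy to hear in a synthetic sketch. Everything below is invented for illustration (the fundamental, the partial amplitudes, and especially the non-integer ratios): integer-multiple partials give a periodic, pitched tone, while non-integer partials give a clangorous mixture that never repeats.

```python
import numpy as np

fs = 8000
t = np.arange(fs) / fs
f0 = 200.0                      # fundamental, Hz (one period = 40 samples)

# "Musical" tone: partials at exact integer multiples of f0.
harmonic = sum(np.sin(2 * np.pi * f0 * n * t) / n for n in range(1, 6))

# "Real-world" style sound: partials at non-integer multiples of f0.
ratios = [1.0, 2.3, 3.7, 5.1, 6.9]
inharmonic = sum(np.sin(2 * np.pi * f0 * r * t) / (i + 1)
                 for i, r in enumerate(ratios))

# Shifting by one period leaves the harmonic tone unchanged, but the
# inharmonic mixture lands somewhere else entirely.
```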

The implication of that, of course, is that in trying to score instruments
to sound like real-world noises, you'll have to suppress their natural
tendency to play with integer-multiple harmonic series. In other words,
you'll need to force them to stop behaving like musical instruments. Thus,
for example, the suggestion of the light-pressure bow producing extraneous,
"non-musical" sounds in the Schickele recording. Contemporary composers have
been doing things like this for a while, with varying degrees of success --
I think back to the string snaps in Bartok's Music for Strings, Percussion
and Celesta, in effect making the fiddles into percussion instruments.

Interesting project, and quite a challenge.

Peace,
Paul
October 14, 2004 9:10:10 PM


walkinay@thegrid.net (hank alrich) wrote in message news:<1gln2gt.133gqqf10zi75aN%walkinay@thegrid.net>...
> Ryan wrote:
>
> > I'd be surprised if some type of
> > math couldn't be rigged up that would do as good a job as a human.
>
> If the math is required to make the assumptions you make in the next few
> sentences putting the calcs together is going to be tough.
>
> > It's all analytical, and actually not too subjective. It will either
> > sound like a jet engine or not, and since the computer will "know"
> > what a jet engine sounds like thanks to the FFT and differential
> > analysis, it seems to me this should be as easy as asking a computer to
> > come up with a number that adds to 7 to make ten.
>
> Do all oboes sound the same? All violins? All trumpets? _All jet
> engines_?
>
> "Not too subjective" goes into the grist mill when a creative mind
> chooses among available voicings for a given instrument.

Well, I'm just starting to get my hands around this. I think I may be
suffering from "don't know how to ask the right questions" syndrome.
Just to clarify a bit: it is certainly true that no two oboes sound
the same; in fact, the very same oboe can sound different from day to
day or from climate to climate. I think we could approximate the sound
of a bassoon, and since this is only a learning tool, not intended
to produce a perfect final product, that would be good enough. On the
other hand, for this problem, there is only one sound of a jet engine,
and that sound would be whatever sound file I choose to feed to the
software. Although both sounds will have to be analyzed to produce
the desired effect, the file I seek to emulate, "the jet engine
sound," will never have to suffer from approximation. That's what I
meant by "the computer will know" what a jet engine sounds like.
October 14, 2004 9:20:01 PM


Bob Cain <arcane@arcanemethods.com> wrote in message news:<ckmdni030d4@enews3.newsguy.com>...
> Ryan wrote:
>
>
> > Well, maybe, I don't really know. I'd be surprised if some type of
> > math couldn't be rigged up that would do as good a job as a human.
>
> Prepare, then, to be surprised. Our mechanisms for feature
> extraction and interpretation remain largely a mystery. The
> process is highly algorithmic, and that is very different
> from mathematical, although math can be employed in some
> algorithmic processes.
>
> > It's all analytical, and actually not too subjective. It will either
> > sound like a jet engine or not, and since the computer will "know"
> > what a jet engine sounds like thanks to the FFT and differential
> > analysis, it seems to me this shoud be as easy as asking a computer to
> > come up with a number that adds to 7 to make ten.
>
> An FFT doesn't begin to disclose what you are looking for in
> and of itself. It's no more than a view of the same data
> with a different independent axis. It contains no
> information at all about when things happen.

Is there any kind of analysis that does? I used FFT because that's
the only one I've really ever heard of. What if I perform a different
FFT for every second of the soundfile?

>
> In any event, the ear brain does not do a Fourier analysis.
> There are frequency dependent mechanisms but they are
> totally ad hoc in terms of what nature found most useful for
> subsequent analysis.

Was this a typo? I hope this doesn't offend, but every site I've
looked at about this says that indeed our ears do function as FFT
devices. If this is incorrect I'd very much like to know the truth
about the matter.


> In a very real sense you are asking for an artificial ear
> all the way through to the process of blind separation.
> That problem remains a curiosity that researchers are
> merely nibbling the edges of.
>
> You might want to Google on "blind separation" to see how
> much your problem involves that and how little progress has
> been made.
>
>
> Bob

Is this what I'm asking for? I really don't know myself. It seems to
me FFT would work ideally if the only instruments I wanted to score
for were flutes. Flutes have an almost perfect sine wave output. And
since FFT is a breakdown of the sound into sine waves, I'd think this
would work quite well, except of course for the limited bass range of
the flute family. No?
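Ryan's intuition here is easy to check with a toy experiment: a pure sine, run through a discrete Fourier transform, concentrates essentially all of its energy in a single bin. A minimal sketch in Python, standard library only; the sample rate and the 440 Hz "flute" tone are illustrative assumptions, and the naive O(n^2) DFT stands in for a real FFT:

```python
import cmath, math

def dft(x):
    # Naive discrete Fourier transform: O(n^2), fine for a demo.
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

sr = 8000     # sample rate in Hz (assumed)
n = 400       # 50 ms of audio, giving 20 Hz bin spacing
freq = 440.0  # a flute-like pure tone (an exact bin, for a clean demo)
signal = [math.sin(2 * math.pi * freq * t / sr) for t in range(n)]

spectrum = dft(signal)
# A real signal has a symmetric spectrum, so look only at the first half.
mags = [abs(c) for c in spectrum[:n // 2]]
peak_bin = max(range(len(mags)), key=mags.__getitem__)
peak_hz = peak_bin * sr / n
print(peak_hz)  # the single dominant frequency: 440.0
```

For a real flute note the picture is close to this but not identical: there are weaker harmonics and breath noise, so several bins light up rather than exactly one.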

Regardless, thanks for giving me some new info to go on.
October 14, 2004 9:29:24 PM

Archived from groups: rec.audio.pro (More info?)

"Paul Stamler" <pstamlerhell@pobox.com> wrote in message news:<oNxbd.698387$Gx4.105015@bgtnsc04-news.ops.worldnet.att.net>...

> I think one of the things you'll find, investigating these real-world
> sounds, is that most of them differ drastically from the sound made by most
> musical instruments in that they are inharmonic; in other words, musical
> instruments produce sound consisting mostly of a fundamental and harmonics,
> at integer multiples of the fundamental frequency. Real-world noises, to a
> great extent, have mixtures of frequencies that aren't integer multiples of
> one another.

This is something I've always wondered about. I thought everything
obeyed the 1st harmonic, 2nd harmonic, etc., rules. Is it possible
for a sound to have no overtones? I thought that even computer
generated sounds that have no harmonics on screen, produce them
automatically when they come out of the speaker. I thought the
harmonic series was just part of the physics of sound. Yes, real
world sounds often contain dissonant and unrelated intervals, but if
we broke down the overall sound to a set of sounds, wouldn't these
sounds in themselves produce the natural overtones?

> The implication of that, of course, is that in trying to score instruments
> to sound like real-world noises, you'll have to suppress their natural
> tendency to play with integer-multiple harmonic series. In other words,
> you'll need to force them to stop behaving like musical instruments.

How about microtones? I imagine the sound of an F#+ coming out of an
oboe would create some funny interactions with the harmonics. But I
could be wrong.


> Thus,
> for example, the suggestion of the light-pressure bow producing extraneous,
> "non-musical" sounds in the Schickele recording. Contemporary composers have
> been doing things like this for a while, with varying degrees of success --
> I think back to the string snaps in Bartok's Music for Strings, Percussion
> and Celesta, in effect making the fiddles into percussion instruments.
>
> Interesting project, and quite a challenge.
>
> Peace,
> Paul
Anonymous
October 15, 2004 12:22:22 AM

Archived from groups: rec.audio.pro (More info?)

Bob Cain wrote:

> An FFT doesn't begin to disclose what you are looking for in
> and of itself. It's no more than a view of the same data
> with a different independant axis. It contains no
> information at all about when things happen.

Or why things happen.

--
ha
Anonymous
October 15, 2004 2:18:42 AM

Archived from groups: rec.audio.pro (More info?)

Ryan wrote:


> Well, I'm just starting to get my hands around this. I think I may be
> suffering from "don't know how to ask the right questions" syndrome.
> Just to clarify a bit: It is certainly true that no two oboes sound
> the same, in fact the very same oboe can sound different from day to
> day or from climate to climate. I think we could approximate the sound
> of a bassoon, and since this is only a learning tool, not intended
> to produce a perfect final product, that would be good enough. On the
> other hand, for this problem, there is only one sound of a jet engine,
> and that sound would be whatever soundfile I choose to feed to the
> software. Although both sounds will have to be analyzed to produce
> the desired effect, the file I seek to emulate, "the jet engine
> sound," will never have to suffer from approximation. That's what I
> meant by "the computer will know" what a jet engine sounds like.

You've got me confused now. What is it that you want
to do that is different from a sampler?


Bob
--

"Things should be described as simply as possible, but no
simpler."

A. Einstein
Anonymous
October 15, 2004 2:39:14 AM

Ryan wrote:


>>An FFT doesn't begin to disclose what you are looking for in
>>and of itself. It's no more than a view of the same data
>>with a different independent axis. It contains no
>>information at all about when things happen.
>
>
> Is there any kind of analysis that does? I used FFT because that's
> the only one I've really ever heard of. What if I perform a different
> FFT for every second of the soundfile?

Very good! You've just described the STFT, the short-time
Fourier transform. It does give information about when
things happen, with time resolution no finer than the length of
the FFT. The frames can be overlapped for better resolution.
There is also the variety of wavelet transforms which allow
you to trade off the resolution in frequency and in time
according to a principle similar to Heisenberg's. They are
tricky to use.
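The STFT described above can be sketched in a few lines: slice the signal into frames, window each frame, and transform each one separately, so each row of the result says *what* frequencies were present, and the row index says *when*. A minimal standard-library Python illustration; the frame size and test signal are arbitrary choices, and the naive DFT again stands in for a real FFT:

```python
import cmath, math

def dft(frame):
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def stft(signal, frame_len, hop):
    # One DFT per frame; frame i covers samples starting at i * hop.
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    # A Hann window tapers the frame edges to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / (frame_len - 1))
              for t in range(frame_len)]
    return [dft([s * w for s, w in zip(f, window)]) for f in frames]

# A signal whose pitch changes halfway: 2 cycles per frame, then 4.
frame_len = 64
lo = [math.sin(2 * math.pi * 2 * t / frame_len) for t in range(frame_len)]
hi = [math.sin(2 * math.pi * 4 * t / frame_len) for t in range(frame_len)]

rows = stft(lo + hi, frame_len, hop=frame_len)  # no overlap, for clarity
peaks = [max(range(frame_len // 2), key=lambda k: abs(r[k])) for r in rows]
print(peaks)  # [2, 4]: the pitch change is now visible in time
```

A plain FFT of the whole signal would show both frequencies but give no hint that one came before the other; the STFT recovers that ordering at the cost of frequency resolution within each frame.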

The question remains to be answered in some detail what
information you want to obtain.

>
>
>>In any event, the ear brain does not do a Fourier analysis.
>> There are frequency dependent mechanisms but they are
>>totally ad hoc in terms of what nature found most useful for
>>subsequent analysis.
>
>
> Was this a typo? I hope this doesn't offend, but every site I've
> looked at about this says that indeed our ears do function as FFT
> devices. If this is incorrect I'd very much like to know the truth
> about the matter.

Nope. No offense taken. There is a _big_ difference
between a FT and an ad hoc and idiosyncratic feature
extraction mechanism that uses a very complicated organic
filter as part of its discrimination. The FT has a precise
mathematical formulation involving inner products with sine
and cosine signals at a precise set of frequencies. The ear
just doesn't do that. There is a gross similarity but
that's about all.

The Ghost could address this in some detail if anyone could
get him to do something besides insult people. When he was
young he published with one of the pioneers in the field of
hearing research, someone who I believe got a Nobel Prize
for it.

> Is this what I'm asking for? I really don't know myself.

I'm having trouble figuring that out exactly too. :-)

In case you've received any new information that might help
you frame it better, would you care to try again?
Refinement to specs from vague ideas is not an uncommon
process in the user/marketing/engineering cyclic process.


Bob
--

"Things should be described as simply as possible, but no
simpler."

A. Einstein
October 16, 2004 2:20:06 AM

Bob Cain <arcane@arcanemethods.com> wrote in message news:<cknnq201hct@enews4.newsguy.com>...

> The Ghost could address this in some detail if anyone could
> get him to do something besides insult people. When he was
> young he published with one of the pioneers in the field of
> hearing research, someone who I believe got a Nobel Prize
> for it.


The Ghost?

> > Is this what I'm asking for? I really don't know myself.
>
> I'm having trouble figuring that out exactly too. :-)
>
> In case you've received any new information that might help
> you frame it better, would you care to try again?
> Refinement to specs from vague ideas is not an uncommon
> process in the user/marketing/engineering cyclic process.

Well, I think you had the right idea the first time, before I
attempted to be more concise and confused you. I will jot down a
basic algorithm for the software:

1. Analyze real instrument sound files. These files should include
every possible way every classical instrument can be played, from the
traditional to the avant garde. For the viols, for example: from
plain-jane arco to Bartók's snapping strings to harmonics to different
bow pressures to playing behind the bridge to the tapping of fingers on
the body of the instrument. There should be files that represent the
instruments at all possible dynamic levels. There should be files
that feature the instruments playing in microtones where they can do
so. (Most classical instruments can.) Also, there should be analysis of
the instruments in "static form." By this I mean the part of the
sound after the initial attack, which can be looped over and over again
to give the impression the note is sustaining. This is done in
standard synthesis as well as good sample libraries. It may take
quite a while to amass all these samples, but once collected the
analysis of them only has to be done once.

2. Derive from these analyses the prime aspects of these sounds.
If we only have, say, ten frequencies to represent a sound, which
ones would be the most useful? Or would some other type of info
about the file be more important than its frequencies? So now we have
a set of data instead of just a PCM sound file. We can call these
data sets "fingerprints." This is mainly to help speed up the math
performed later during step 4, though it will compromise the accuracy
of the final product. Ideally, the user should be able to select the
amount of data to be derived from the samples.

3. Analyze any given sound file. These would be the "real world"
sounds. Or anything at all. In fact, I was thinking last night that
the ultimate test for this software would be to feed it, say,
Beethoven's 9th, and see how close it could approximate it.

4. Run a differential or coefficient analysis on the "real world"
sound file, compared against all the "sound fingerprints" the program
created in step 2.

5. Create a MIDI file. After the program has deduced the
best combination of instruments, in which playing styles, at what
pitches and what dynamics, playing in what kind of rhythmic figures,
etc., the program would simply create a multiple-staff MIDI file with
all said info scored on it.

Viola!
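Step 2's "fingerprint" admits one deliberately crude concrete reading: transform a sample and keep only its k loudest frequencies. The sketch below is a hedged illustration of that reading, not Ryan's actual design; the toy three-partial "instrument" tone and all parameter values are invented, and the naive DFT stands in for a real FFT:

```python
import cmath, math

def dft_mags(x):
    # Magnitudes of the first half of a naive DFT (real input, O(n^2) demo).
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2)]

def fingerprint(samples, sr, k=10):
    """Reduce a sound to its k loudest frequencies as (Hz, magnitude) pairs."""
    n = len(samples)
    mags = dft_mags(samples)
    top = sorted(range(len(mags)), key=mags.__getitem__, reverse=True)[:k]
    return sorted((b * sr / n, mags[b]) for b in top)

# A toy 'instrument' note: a fundamental plus two quieter harmonics.
sr, n = 8000, 400
tone = [math.sin(2 * math.pi * 200 * t / sr)
        + 0.5 * math.sin(2 * math.pi * 400 * t / sr)
        + 0.25 * math.sin(2 * math.pi * 600 * t / sr)
        for t in range(n)]

fp = fingerprint(tone, sr, k=3)
print([round(f) for f, _ in fp])  # the three partials: [200, 400, 600]
```

Real fingerprinting schemes keep far richer features: the attack transient, how the spectrum evolves over time, noise components. That gap between "ten loudest bins" and "what an ear hears" is exactly what the replies in this thread keep pointing at.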
Anonymous
October 16, 2004 5:27:43 PM

Archived from groups: rec.audio.pro (More info?)

On Fri, 15 Oct 2004 22:20:06 -0700, Ryan wrote:

> Bob Cain <arcane@arcanemethods.com> wrote in message
> news:<cknnq201hct@enews4.newsguy.com>...
>
>> The Ghost could address this in some detail if anyone could get him to
>> do something besides insult people. When he was young he published
>> with one of the pioneers in the field of hearing research, someone who
>> I believe got a Nobel Prize for it.
>
>
> The Ghost?
>
>> > Is this what I'm asking for? I really don't know myself.
>>
>> I'm having trouble figuring that out exactly too. :-)
>>
>> In case you've received any new information that might help you frame
>> it better, would you care to try again? Refinement to specs from vague
>> ideas is not an uncommon process in the user/marketing/engineering
>> cyclic process.
>
> Well, I think you had the right idea the first time, before I attempted
> to be more conscise and confused you. I will jot out a basic algorithm
> for the software:
>
> 1. Analyze real instrument sound files. These files should include
> every possible way every classical instrument can be played, from the
> traditional to the avant garde. For the viols, for example: from
> plain-jane arco to Bartók's snapping strings to harmonics to different
> bow pressures to playing behind the bridge to the tapping of fingers on
> the body of the instrument. There should be files that represent the
> instruments at all possible dynamic levels. There should be files that
> feature the instruments playing in microtones where they can do so. (Most
> classical instruments can.) Also, there should be analysis of the
> instruments in "static form." By this I mean the part of the sound
> after the initial attack, which can be looped over and over again to give
> the impression the note is sustaining. This is done in standard
> synthesis as well as good sample libraries. It may take quite a while to
> amass all these samples, but once collected the analysis of them only
> has to be done once.

Why not use mathematical models of the instruments? I would imagine the
number of samples required to cover all the sounds a violin could make
would be impossibly large (think of playing a false harmonic on all the
strings of a violin at every position, and with every bowing style). With a model,
you have defined the 'prime aspects of these sounds' in a very flexible
way. The computer could adjust the way the model is 'played' to find the
best fit to the sound you wish to analyse.

This would perhaps get nearer to fulfilling the interesting idea in your
original post-

"Perhaps a car engine sound file
would yield three Double Basses, a flute or two in very quiet irregular
rhythms, and maybe a horn would be involved during gear changes."

The computer could go through every single possible sound a violin could
make by iterating though all possible bow positions/angle/velocity, finger
positions etc until it found the combination that would most closely
approximate the sound you want to analyse.

If I were to pursue this, I would brutally simplify things to start with.
For example, make some simple rules for an experiment...

All music is played on a single instrument model that creates perfect sine
waves. Each note this instrument makes has a fixed decay to silence over a
period of one second.
The only variables this instrument has are how loud each note is, and its
pitch, fixed to a chromatic scale.
The only limitations of the 'player' of this instrument are that it can play
twenty notes per second, and as many as ten notes at once.

Then, take the sound file to be analysed and every 20th of a second, try
each of the limited range of sounds this instrument can create until you
find the one that correlates most closely. (Literally by FFT correlation?)

Once that is done, you should have a performance on a very simple
instrument that has some relation to the file you wish to analyse.
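The simplified experiment above can be sketched almost literally: for one frame of the target, try every note the toy sine instrument can play and keep the one that correlates best. A hedged Python illustration; the note range, frame length, and plain inner-product correlation are all simplifying assumptions (a real signal with unknown phase would need a phase-insensitive measure, e.g. correlation of spectral magnitudes):

```python
import math

# The toy instrument's entire vocabulary: one chromatic octave from A3.
NOTES = [220.0 * 2 ** (s / 12) for s in range(13)]

def best_note(target, sr):
    """Try every note against one frame of the target and return the
    pitch whose sine correlates most strongly with it."""
    def corr(f):
        tone = [math.sin(2 * math.pi * f * t / sr) for t in range(len(target))]
        return abs(sum(a * b for a, b in zip(target, tone)))
    return max(NOTES, key=corr)

# Target frame: a 440 Hz tone, which is one of the notes, so the match is exact.
sr, frame = 8000, 400
target = [math.sin(2 * math.pi * 440.0 * t / sr) for t in range(frame)]
print(round(best_note(target, sr)))  # 440
```

Scaling this up means running the same exhaustive search once per frame, and with polyphony the search space multiplies, which is where the "*very* time consuming" warning below comes from.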

Then, the model could be made slightly more complex, i.e. this instrument
is an ideal Karplus-Strong string with a simple frequency dependent loss
filter. It has the properties of the length of the string, where the
string is struck, and the amount of energy imparted. It is monophonic, and
can change pitch at a limited rate.
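The Karplus-Strong string mentioned above is one of the few physical models simple enough to fit in a dozen lines: a burst of noise circulates in a delay line whose length sets the pitch, and a pairwise average acts as the frequency-dependent loss filter, so high partials die out fastest. A minimal sketch; the parameter values are arbitrary:

```python
import random

def karplus_strong(freq, sr, n_samples, seed=0):
    """Minimal Karplus-Strong plucked string."""
    rng = random.Random(seed)
    delay = max(2, round(sr / freq))  # delay-line length sets the pitch
    buf = [rng.uniform(-1, 1) for _ in range(delay)]  # the noise 'pluck'
    mean = sum(buf) / delay
    buf = [s - mean for s in buf]  # remove DC so the tone decays to silence
    out = []
    for i in range(n_samples):
        out.append(buf[i % delay])
        # Average the current and next sample: the loss filter.
        buf[i % delay] = 0.5 * (buf[i % delay] + buf[(i + 1) % delay])
    return out

note = karplus_strong(220.0, sr=8000, n_samples=8000)  # one second at ~220 Hz
# The start is bright and noisy; by the end only the decaying low partials remain.
print(max(abs(s) for s in note[:100]) > max(abs(s) for s in note[-100:]))
```

The model has exactly the kind of physically meaningful knobs philicorda describes (string length, excitation), which is what makes it searchable in a way a pile of samples is not.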

The disadvantages of this way of working would be: iterating through each
sound a model could create would be *very* time consuming once the models
became more realistic. It's very hard to create good physical models of
real instruments.

The advantages would be -
It might actually work. :)  Or at least provide a way to begin attacking
this interesting but extraordinarily difficult task. The model does not
just define a fixed set of sounds (samples) an instrument can create, but
also defines the limitations in how that instrument can be played.

I think that you would have to create a model of the limitations of the player
as well as the instrument anyway if you were using samples. This would
be very difficult if the computer does not 'understand' the instrument
like a physical model, as you would have to create a large amount of
rules by hand for each sample.



> 2. Derive from these analyses the prime aspects of these sounds. If
> we only have, say, ten frequencies to represent a sound, which ones
> would be the most useful? Or would some other type of info about the
> file be more important than its frequencies? So now we have a set of
> data instead of just a PCM sound file. We can call these data sets
> "fingerprints." This is mainly to help speed up the math performed later
> during step 4, though it will compromise the accuracy of the final
> product. Ideally, the user should be able to select the amount of data
> to be derived from the samples.
>
> 3. Analyze any given sound file. These would be the "real world"
> sounds. Or anything at all. In fact, I was thinking last night that
> the ultimate test for this software would be to feed it, say,
> Beethoven's 9th, and see how close it could approximate it.
>
> 4. Run a differential or coefficient analysis on the "real world" sound
> file, compared against all the "sound fingerprints" the program created in step 2.
>
> 5. Create midi file. After the program has deduced what would be the
> best combination of instruments in which playing styles at what pitches
> and what dynamics, playing in what kind of rhythmic figures, etc., the
> program would simply create a multiple staff midi file with all said
> info scored on it.
>
> Viola!
Anonymous
October 17, 2004 12:35:17 AM

Ryan wrote:
> Bob Cain <arcane@arcanemethods.com> wrote in message news:<cknnq201hct@enews4.newsguy.com>...
>
>
>>The Ghost could address this in some detail if anyone could
>>get him to do something besides insult people. When he was
>>young he published with one of the pioneers in the field of
>>hearing research, someone who I believe got a Nobel Prize
>>for it.
>
>
>
> The Ghost?

Unimportant. If you don't know of him, you certainly don't
want to.

> 1. Analyze real instrument sound files. These files should include
> every possible way every classical instrument can be played, from the
> traditional to the avant garde. For the viols for example, from plain
> jane arco to Bartók's snapping strings to harmonics to different bow
> pressures to playing behind the bridge to the tapping of fingers on
> the body of the instruments. There should be files that represent the
> instruments at all possible dynamic levels. There should be files
> that feature the instruments playing in microtones where they can do so.
> (Most classical instruments can.) Also, there should be analysis of
> the instruments in "static form." By this I mean the part of the
> sound after the initial attack, which can be looped over and over again
> to give the impression the note is sustaining. This is done in
> standard synthesis as well as good sample libraries. It may take
> quite a while to amass all these samples, but once collected the
> analysis of them only has to be done once.

And has yet to be done once. :-)

You aren't really defining an analysis, or even the features
you would like extracted and cataloged. "Every possible way
an instrument can be played" has no meaning until you very
specifically give it that. It is what my high school
writing teacher called a glittering generality. I'm sorry
if that is a bit brutal but so was she. :-)

How would the subjective characteristics that your brain is
very good at discerning be algorithmically characterized
and what would be the form of the data the analysis
produced? You can't just describe it in subjective terms
because we have yet to teach machines this level of
subjective classification and discernment. We are a _long_
way from that.

Don't just offer the term FFT. There is no new information
in an FFT, just a different view of it. What you are
imagining would employ transforms of some kind, undoubtedly,
but which ones and exactly how they could be used to get at
the far more complex information you want is not even a well
formulated problem much less a solved one.

Imagine asking for a machine that could analyze and
categorize smiles. What you are asking is far more
difficult and open ended.

>
> 2. Derive from these analyses the prime aspects of these sounds.

First you must very precisely characterize all of these
prime aspects via a, probably long, research program and
then figure out what processes must be applied to the data
to extract and classify them in those terms.

> If we only have, say ten frequencies to represent this sound, which
> ones would be the most useful?

That particular "if" has no real connection to reality.

> Or would some other type of info
> about the file be more important than its frequencies?

Good question. Now you are getting to the heart of the matter.

> So now we have
> a set of data instead of just a pcm sound file.

Not quite yet we don't.

> We can call these
> data sets, "fingerprints."

What would be in these data sets?

> This is mainly to help speed up the math
> performed later during step 4, though it will compromise the accuracy
> of the final product.

What math?

> Ideally, the user should be able to select the
> amount of data to be derived from the samples.

Cool.

>
> 3. Analyze any given sound file. These would be the "real world"
> sounds. Or anything at all. In fact, I was thinking last night that
> the ultimate test for this software would be to feed it, say,
> Beethoven's 9th, and see how close it could approximate it.

Approximate it with what?

>
> 4. Run a differential or coefficient analysis on the "real world" sound
> file, compared against all the "sound fingerprints" the program created in step
> 2.

Each of your analyzed snippets would be a vector in a very
high dimensional parameter space. Once you defined that
space and a way to deduce all the coordinates in it for a
particular fingerprint, you could then determine the
corresponding vectors for your "real world" sounds. Problem
is that once the dimensions of a space get large enough, any
arbitrary vector in it will almost certainly be orthogonal
to any other. What this means is that they have about as
much in common as "left" and "wrong." Matching is poorly
defined in such situations.
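Bob's point about high-dimensional spaces can be demonstrated empirically: the average cosine similarity between independently drawn random vectors shrinks toward zero as the dimension grows. A small Python experiment; the dimensions and sample counts are arbitrary choices:

```python
import math, random

def cosine(u, v):
    # Cosine similarity: 1 for parallel vectors, 0 for orthogonal ones.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

rng = random.Random(42)

def random_vec(dim):
    return [rng.gauss(0, 1) for _ in range(dim)]

# Average |cosine| between random pairs, at increasing dimension.
avg = {}
for dim in (3, 30, 3000):
    sims = [abs(cosine(random_vec(dim), random_vec(dim))) for _ in range(50)]
    avg[dim] = sum(sims) / len(sims)
    print(dim, avg[dim])
```

At dimension 3 random pairs are noticeably correlated on average; by dimension 3000 the typical similarity is close to zero, so "nearest fingerprint" becomes a nearly meaningless ranking, which is the matching problem described above.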

>
> 5. Create midi file. After the program has deduced what would be the
> best combination of instruments in which playing styles at what
> pitches and what dynamics, playing in what kind of rhythmic figures,
> etc., the program would simply create a multiple staff midi file with
> all said info scored on it.

Yeah, simply.

> Viola!

What, you want to do all this synthesis with a single
instrument? :-)


Bob
--

"Things should be described as simply as possible, but no
simpler."

A. Einstein
October 17, 2004 3:44:13 AM

Well, I don't know Phil. Your idea sounded interesting at first, but
then towards the end you describe how hard it would be to use
"realistic" models anyway, so you kind of defeat your own suggestion.
Plus, this would be even more comp sci and math I'd have to learn. I
do appreciate your ideas however, and I thank you.

I was thinking maybe when I can afford it I would just spring for the
Vienna Symphonic Library Orchestral Cube. It purports to provide
samples of everything I want, recorded by world class players in
anechoic chambers. It would be ideal if it wasn't for the three
thousand dollar price tag. :( 

Anyway, I'm starting to think maybe I should just do the work with my
ear instead of my computer. Most of the posters here tend to think a
software solution would be next to impossible. Might as well brush up
on my ear training and spend the time using my right side of the brain
instead of the left. Hell, from the looks of it I could spend three
years figuring out this software, it would probably only take me three
days to do a rough guess transcription. Maybe I'm finally figuring
out how much harder it is to find a lazy way of doing things.


philicorda <philicorda@azriel.tydrwg.org> wrote in message news:<pan.1998.01.19.06.29.26.213251@azriel.tydrwg.org>...

> Why not use mathematical models of the instruments? I would imagine the
> amount of samples required to cover all the sounds a violin could make
> would be impossible (think of playing a false harmonic on all the strings
> of a violin at every position, and with every bowing style). With a model,
> you have defined the 'prime aspects of these sounds' in a very flexible
> way. The computer could adjust the way the model is 'played' to find the
> best fit to the sound you wish to analyse.
>
> This would perhaps get nearer to fulfilling the interesting idea in your
> original post-
>
> "Perhaps a car engine sound file
> would yield three Double Basses, a flute or two in very quiet irregular
> rhythms, and maybe a horn would be involved during gear changes."
>
> The computer could go through every single possible sound a violin could
> make by iterating though all possible bow positions/angle/velocity, finger
> positions etc until it found the combination that would most closely
> approximate the sound you want to analyse.
>
> If I were to pursue this, I would brutally simplify things to start with.
> For example, make some simple rules for an experiment...
>
> All music is played on a single instrument model that creates perfect sine
> waves. Each note this instrument makes has a fixed decay to silence over a
> period of one second.
> The only variables this instrument has is how loud each note is, and its
> pitch, fixed to a chromatic scale.
> The only limitations of the 'player' of this instrument are that it can play twenty
> notes per second, and as many as ten notes at once.
>
> Then, take the sound file to be analysed and every 20th of a second, try
> each of the limited range of sounds this instrument can create until you
> find the one that correlates most closely. (Literally by FFT correlation?)
>
> Once that is done, you should have a performance on a very simple
> instrument that has some relation to the file you wish to analyse.
>
> Then, the model could be made slightly more complex, ie - this instrument
> is an ideal Karplus-Strong string with a simple frequency dependent loss
> filter. It has the properties of the length of the string, where the
> string is struck, and the amount of energy imparted. It is monophonic, and
> can change pitch at a limited rate.
>
> The disadvantages of this way of working would be: iterating through each
> sound a model could create would be *very* time consuming once the models
> became more realistic. It's very hard to create good physical models of
> real instruments.
>
> The advantages would be -
> It might actually work. :)  Or at least provide a way to begin attacking
> this interesting but extraordinarily difficult task. The model does not
> just define a fixed set of sounds (samples) an instrument can create, but
> also defines the limitations in how that instrument can be played.
>
> I think that you would have to create a model of limitations of the player
> as well as the instrument anyway if you were using samples. This would
> be very difficult if the computer does not 'understand' the instrument
> like a physical model, as you would have to create a large amount of
> rules by hand for each sample.
>
>
>
> > 2. Derive from these analyses the prime aspects of these sounds. If
> > we only have, say ten frequencies to represent this sound, which ones
> > would be the most useful? Or would some other type of info about the
> > file be more important than its frequencies? So now we have a set of
> > data instead of just a pcm sound file. We can call these data sets,
> > "fingerprints." This is mainly to help speed up the math performed later
> > during step 4, though it will compromise the accuracy of the final
> > product. Ideally, the user should be able to select the amount of data
> > to be derived from the samples.
> >
> > 3. Analyze any given sound file. These would be the "real world"
> > sounds. Or anything at all. In fact, I was thinking last night that
> > the ultimate test for this software would be to feed it, say,
> > Beethoven's 9th, and see how close it could approximate it.
> >
> > 4. Run a differential or coefficient analysis on the "real world" sound
> > file, compared against all the "sound fingerprints" the program created in step 2.
> >
> > 5. Create midi file. After the program has deduced what would be the
> > best combination of instruments in which playing styles at what pitches
> > and what dynamics, playing in what kind of rhythmic figures, etc., the
> > program would simply create a multiple staff midi file with all said
> > info scored on it.
> >
> > Viola!
Anonymous
October 17, 2004 7:34:17 PM

On Sat, 16 Oct 2004 23:44:13 -0700, Ryan wrote:

> Well, I don't know Phil. Your idea sounded interesting at first, but
> then towards the end you describe how hard it would be to use
> "realistic" models anyway, so you kind of defeat your own suggestion.

Absolutely. It would perhaps be a more ideal method, though it's far more
complicated and messy. I wonder how well the simplest model would work?
A computer's 'interpretation' with a simple string and player model would
be interesting to hear, even though it may not bear much relationship to
the original music.

There are a number of programs out there that purport to do polyphonic
pitch detection -
http://www.music-notation.info/en/compmus/audio2midi.ht...

But, they rely on differentiating the different instruments by their
range, rather than their harmonic content, and I have no idea how well the
polyphonic pitch detection works. Perhaps you could combine the two
approaches: their pitch detection, and your harmonic 'fingerprints' to
identify the instruments?

> Plus, this would be even more comp sci and math I'd have to learn. I
> do appreciate your ideas however, and I thank you.
>
> I was thinking maybe when I can afford it I would just spring for the
> Vienna Symphonic Library Orchestral Cube. It purports to provide
> samples of everything I want, recorded by world class players in
> anechoic chambers. It would be ideal if it wasn't for the three
> thousand dollar price tag. :( 
>
> Anyway, I'm starting to think maybe I should just do the work with my
> ear instead of my computer. Most of the posters here tend to think a
> software solution would be next to impossible. Might as well brush up
> on my ear training and spend the time using my right side of the brain
> instead of the left. Hell, from the looks of it I could spend three
> years figuring out this software, it would probably only take me three
> days to do a rough guess transcription. Maybe I'm finally figuring out
> how much harder it is to find a lazy way of doing things.

Laziness is the mother of invention. :) 
October 18, 2004 3:31:45 AM

Bob Cain <arcane@arcanemethods.com> wrote in message news:<cksp9j01vek@enews1.newsguy.com>...

> You aren't really defining an analysis, or even the features
> you would like extracted and cataloged. "Every possible way
> an instrument can be played" has no meaning until you very
> specifically give it that. It is what my high school
> writing teacher called a glittering generality. I'm sorry
> if that is a bit brutal but so was she. :-)

Yes, if I were writing for another audience I would have to address this
more specifically. But you know what I'm getting at. I don't want to
post a billion-word technical rubric.

> How would the subjective characteristics that your brain is
> very good at discerning be algorithmically characterized
> and what would be the form of the data the analysis
> produced?

> What would be in these data sets.

> What math?

> Approximate it with what?

Hell man, these are the questions I came looking for the answers to.
You were supposed to answer these!

>
> > Viola!
>
> What, you want to do all this synthesis with a single
> instrument? :-)

lol
How is it spelled? Voiola?
Anonymous
October 18, 2004 5:33:57 AM

Archived from groups: rec.audio.pro (More info?)

Ryan wrote:

> Bob Cain <arcane@arcanemethods.com> wrote in message news:<cksp9j01vek@enews1.newsguy.com>...
>
>>You aren't really defining an analysis, or even the features
>>you would like extracted and cataloged. "Every possible way
>>an instrument can be played" has no meaning until you very
>>specifically give it that. It is what my high school
>>writing teacher called a glittering generality. I'm sorry
>>if that is a bit brutal but so was she. :-)
>
> Yes, if I were writing for another audience I would have to address this
> more specifically. But you know what I'm getting at. I don't want to
> post a billion-word technical rubric.

:-) Aw, give it a shot.

>>How would the subjective characteristics that your brain is
>>very good at discerning be algorithmically characterized
>>and what would be the form of the data the analysis
>>produced?
>
>>What would be in these data sets.
>
>>What math?
>
>>Approximate it with what?
>
> Hell man, these are the questions I came looking for the answers to.
> You were supposed to answer these!

I hope you understand that my intent was to point out that
these aren't solved problems. There aren't even glimmers on
the horizon. You are defining a musical AI with an awesome
intelligence, processing capability and prodigious memory.

If you were to take this to a prospective Ph.D. advisor as
an area for a thesis, he'd look at you in amazement, shake
his head and, if he was kind, try to help you find one
little corner of it that might yield productive results if
you tugged on it for a few years.

There are people thinking and working on these kinds of
problems but I don't know where they congregate.

>>>Viola!
>>
>>What, you want to do all this synthesis with a single
>>instrument? :-)
>
> lol
> How is it spelled? Voiola?

:-) Voila!


Bob
--

"Things should be described as simply as possible, but no
simpler."

A. Einstein
October 18, 2004 6:51:44 PM

Archived from groups: rec.audio.pro (More info?)

Bob Cain <arcane@arcanemethods.com> wrote in message news:<ckvv5l02fii@enews4.newsguy.com>...

> > Yes, if I were writing for another audience I would have to address this
> > more specifically. But you know what I'm getting at. I don't want to
> > post a billion-word technical rubric.
>
> :-) Aw, give it a shot.

You know, if I could be assured that what I want to do is feasible, I
really would write something like this up. Till then though, I'm a
busy man and it seems like a huge waste of time if nothing could ever
come from it.

>
> >>How would the subjective characteristics that your brain is
> >>very good at discerning be algorithmically characterized
> >>and what would be the form of the data the analysis
> >>produced?
>
> >>What would be in these data sets.
>
> >>What math?
>
> >>Approximate it with what?
> >
> > Hell man, these are the questions I came looking for the answers to.
> > You were supposed to answer these!
>
> I hope you understand that my intent was to point out that
> these aren't solved problems. There aren't even glimmers on
> the horizon. You are defining a musical AI with an awesome
> intelligence, processing capability and prodigious memory.

Yes. You are quite good at the Socratic method. I guess I just thought
you knew these answers but wanted to see me "jump through some hoops"
first, not maliciously of course. But if what you're saying is that
the math, or system of maths this would require hasn't even been
"invented" yet, then that's an altogether different type of thing.

Anyway, thanks for your time and input.
Anonymous
October 18, 2004 6:58:02 PM

Archived from groups: rec.audio.pro (More info?)

Ryan wrote:

> Hell man, these are the questions I came looking for the answers to.
> You were supposed to answer these!

He's asking you the questions for which you must provide clear answers
in order to approach your goal.

--
ha
Anonymous
October 18, 2004 8:53:40 PM

Archived from groups: rec.audio.pro (More info?)

Ryan wrote:

> Yes. You are quite good at the Socratic method. I guess I just thought
> you knew these answers but wanted to see me "jump through some hoops"
> first, not maliciously of course. But if what you're saying is that
> the math, or system of maths this would require hasn't even been
> "invented" yet, then that's an altogether different type of thing.

It wasn't just to get you to jump through hoops, Ryan. I'm
truly interested in how a musically creative mind would
specify the problem in some detail. That's good input for
the more academic oriented folks who are working and
thinking at the computational level.

The biggest problem with all of this that I see is how to
specify in detail what's in the music that can be considered
features worth thinking about extracting algorithmically. If
a human can't get real down with that part then there is
little hope of implementing anything useful. Granted, for
the non-technically but strongly musically inclined it could
be a very frustrating experience to see how difficult it is
to reduce things that seem obvious to her to terms that have
any hope of an implementation, but you gotta start somewhere.


Bob
--

"Things should be described as simply as possible, but no
simpler."

A. Einstein
Anonymous
October 18, 2004 9:29:42 PM

Archived from groups: rec.audio.pro (More info?)

inkexit@yahoo.com (Ryan) wrote in message news:<dea39397.0410172115.625b5091@posting.google.com>...
>>
> Well, you and I have no bad blood between us, Ghost. What's your take
> on this whole idea?

I don't have time at this moment to backtrack and read the entire
thread. So, if you have a specific question, please (re)state it in
as concise terms as possible, and I will answer it if I feel that I am
qualified to do so. If not, I will do my best to refer you to someone
who can.
Anonymous
October 18, 2004 11:25:18 PM

Archived from groups: rec.audio.pro (More info?)

Hi Ryan-

The sines and cosines that get used to build up a waveform in
Fourier analysis are the "basis functions" of the Fourier
transform. It is possible to decompose signals using many
different types of bases. The Fourier basis (sines and cosines,
harmonically related if the signal is of finite extent) has
some nice mathematical properties that make the decomposition
(and recomposition) simpler, mathematically, than it is with
many other bases. But that simplicity doesn't make the
Fourier basis "right" for all applications.
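Tom's "basis functions" point can be made concrete in a few lines. The
following is a minimal Python/NumPy sketch (all names here are invented
for illustration, nothing from the thread): even a decidedly non-sine
waveform is an exact weighted sum of the DFT's harmonically related
sinusoids, which is why "only sine waves" is not really a limitation of
the transform itself.

```python
import numpy as np

# The DFT's basis is N harmonically related complex sinusoids; any
# length-N signal, however un-sine-like, is an exact weighted sum of
# them. Demonstrate with a periodic ramp wave.
N = 64
n = np.arange(N)
signal = (n % 16) / 16.0                 # ramp wave, nothing like a sine

coeffs = np.fft.fft(signal)              # projection onto the Fourier basis

# Rebuild the signal as an explicit sum of coefficient * basis function.
recon = np.zeros(N, dtype=complex)
for k in range(N):
    recon += coeffs[k] * np.exp(2j * np.pi * k * n / N) / N

round_trip_error = float(np.max(np.abs(recon.real - signal)))
```

The round-trip error is at the level of floating-point noise, which is
the "completeness" property an instrument-sample basis would lack.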

In your case, you want to use as "basis functions" the signals
played by standard instruments. These are much more complicated
than the sines and cosines in a Fourier basis. Besides the
fact that the sustained waveform from an instrument playing
a note has a non-sinusoidal shape, notes are transient (they
start and stop in time) and also dynamic (their pitch, volume,
and timbre vary in time, e.g., due to
tremolo, vibrato, etc.). Although it is mathematically possible
to represent signals with such dynamic, transient structure via
a Fourier transform, I don't think a Fourier decomposition
is well-suited to your problem.

One approach is to actually take samples of the instruments
you'll use, playing all the notes available, and use them (with
various durations) directly as your basis. This would be the most
accurate approach, but the calculations you'd need to do to find
the expansion coefficients (i.e., the score!) would probably
be extremely difficult computationally, and probably not
well-defined (the basis is likely neither complete nor
orthogonal). You'd be doing something like additive synthesis,
but with a much bigger basis than is usually used! Looking
up some of the math associated with additive synthesis might
provide you with some leads.
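As a toy illustration of those "expansion coefficients," here is a
hedged Python/NumPy sketch: a tiny synthetic "instrument library"
(decaying sinusoids standing in for real samples; the frequencies and
mix are invented) and a least-squares projection of a target sound onto
it. Real recordings would make the problem far worse conditioned, but
the shape of the computation is the same.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "instrument library": each column is one sampled note. The
# decaying sinusoids are invented stand-ins for real recordings.
N = 512
t = np.arange(N) / N

def note(freq):
    return np.exp(-3.0 * t) * np.sin(2 * np.pi * freq * t)

library = np.column_stack([note(f) for f in (5, 9, 14, 23, 31)])

# A "target" sound secretly built from two of those notes, plus noise.
true_mix = np.array([0.0, 2.0, 0.0, -1.5, 0.0])
target = library @ true_mix + 0.01 * rng.standard_normal(N)

# Least-squares projection: the expansion coefficients ("the score")
# that best match the target in this non-orthogonal basis.
score, *_ = np.linalg.lstsq(library, target, rcond=None)
```

With a clean, small library the recovered `score` comes back close to
`true_mix`; greedy schemes like matching pursuit are the usual way to
scale this idea up to large, redundant dictionaries.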

A possible option that has the potential to be more computationally
tractable would be to use some kind of wavelet or other
time-scale or time-frequency transform rather
than a Fourier transform. Very roughly speaking, you can
think of such a transform as breaking up a signal into
*localized* pulses, i.e., notes! That is, where a Fourier
transform represents a signal as a sum of "eternal" sines
and cosines of specific frequencies, a time-frequency transform
breaks up the signal into separate parts that are localized both in
frequency *and* time. You might be able to find some way to
project a wavelet or other time-frequency transform of the sound
you are interested in onto the transforms of sounds from the
instruments you have available; this would give you the notes
and volumes needed to most closely match the desired signal.
This won't make any fundamental problems with the incompleteness
or redundancy of your basis (choice of instruments & notes) go
away, but use of such transforms might provide methods of
approximation that make the problem more tractable computationally.
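A bare-bones version of that time-frequency idea can be sketched in
Python/NumPy with a windowed FFT (a crude short-time Fourier transform,
not a wavelet transform proper; the signal and window sizes below are
invented for illustration):

```python
import numpy as np

# Crude short-time Fourier transform: slice the signal into windows and
# FFT each one, localizing energy in time *and* frequency. The test
# signal switches pitch halfway through, which one long FFT would blur
# into two simultaneous "eternal" tones.
fs = 8000
t = np.arange(fs) / fs
sig = np.where(t < 0.5, np.sin(2*np.pi*100*t), np.sin(2*np.pi*300*t))

win = 512
frames = np.array([sig[i:i+win] * np.hanning(win)
                   for i in range(0, len(sig) - win + 1, win)])
spec = np.abs(np.fft.rfft(frames, axis=1))    # rows: time, cols: frequency

peak_hz = np.argmax(spec, axis=1) * fs / win  # dominant pitch per frame
```

Early frames peak near 100 Hz and late frames near 300 Hz, which is
exactly the "localized notes" picture described above.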

A google search on "wavelets" and "music" will probably get you
started. This wavelet FAQ might also help:

http://www.math.ucdavis.edu/~saito/courses/ACHA.s04/wav...

Here's a review article on time-frequency analysis of sounds
from musical instruments---your basis functions, so to speak:

http://epubs.siam.org/sam-bin/getfile/SIREV/articles/38...

If you want to learn more about Fourier expansions from
a musical point of view, see:

http://ccrma.stanford.edu/~jos/mdft/

Here's a reference that turned up in my own quick googling using
"time scale transform music" that may provide a starting point
for thinking along these lines, if you can find a copy:

Kronland-Martinet R., Grossmann A. "Application of time-frequency and
time-scale methods to the analysis, synthesis and transformation of
natural sounds." in "Representations of Musical Signals", C. Roads,
G. De Poli, A. Picciali Eds, MIT Press, October 1990.

Interlibrary loan may help you here!

A similar search using "time frequency transform music" turned up
"Musical Transformations using the Modification of Time-Frequency
Images" in a 1993 issue of *Computer Music Journal*:

http://mitpress.mit.edu/catalog/item/default.asp?tid=67...

This is just from some quick googling and these are probably not
the best or most recent references that may be relevant. Wavelet
and time-frequency analysis is now very mature and there are
entire textbooks and monographs on these topics. Good
luck with this.

Peace,
Tom Loredo

--

To respond by email, replace "somewhere" with "astro" in the
return address.
Anonymous
October 19, 2004 12:03:22 PM

Archived from groups: rec.audio.pro (More info?)

inkexit@yahoo.com (Ryan) wrote in message news:<dea39397.0410111847.69138fb1@posting.google.com>...
> I'm looking to find out more about writing some software that will use
> traditional classical instruments to emulate "natural" or "non musical
> sounds." The software will perform some type of analyses on an audio
> file, I imagine FFT would be used at some point, but the problem with
> FFT is that it only tells you what "perfect" or pure sine wave based
> frequencies are present in a sound. Besides the flute, not much else
> in an orchestra has anything close to a sine wave output. After this
> analysis is done, the software will look through a library of sounds
> made by traditional instruments. These sounds will include every
> noise and playing style every traditional instrument can produce. The
> software will then juggle the sounds around at various dynamic levels
> in various rhythms and etc until it comes up with the closest
> combination to the original sound. Perhaps a car engine sound file
> would yield three Double Basses, a flute or two in very quiet
> irregular rhythms, and maybe a horn would be involved during gear
> changes. I might not have to tell you that Gyorgy Ligeti's
> "Atmospheres" and his "Mechanical Music" served as the chief
> inspiration for this idea.
>
> Has anybody ever heard of anything like this, or know where I might
> start to look for info on this subject? I'm not looking for
> programming help, but rather, help with setting up the math. Are
> there any scientific communities online that I could point my
> questions to? Any books on this type of thing. I've heard Csound
> might work for this. I thought Csound was for composing, not for
> analyzing existing sound files. I can't seem to come up with the
> right keywords to get anything out of Google, but I hoped someone here
> might be able to put me on the right path.

I know it's not what you had originally asked, but give a listen to
the first few bars of Mahler's 1st symphony, last movement. Closest
thing I've heard to an orchestra sounding like a jet engine, without
intentionally doing so.

-Karl
October 19, 2004 11:40:22 PM

Archived from groups: rec.audio.pro (More info?)

I'm looking to find out more about writing some software that will use
traditional classical instruments to emulate "natural" or "non musical
sounds." The software will perform some type of analyses on an audio
file, I imagine FFT would be used at some point, but the problem with
FFT is that it only tells you what "perfect" or pure sine wave based
frequencies are present in a sound. Besides the flute, not much else
in an orchestra has anything close to a sine wave output. After this
analysis is done, the software will look through a library of sounds
made by traditional instruments. These sounds will include every
noise and playing style every traditional instrument can produce. The
software will then juggle the sounds around at various dynamic levels
in various rhythms and etc until it comes up with the closest
combination to the original sound. Perhaps a car engine sound file
would yield three Double Basses, a flute or two in very quiet
irregular rhythms, and maybe a horn would be involved during gear
changes.


the_ghostbuster@netzero.com (The Ghost) wrote in message news:<b5fb78ba.0410181629.71b77280@posting.google.com>...
> inkexit@yahoo.com (Ryan) wrote in message news:<dea39397.0410172115.625b5091@posting.google.com>...
> >>
> > Well, you and I have no bad blood between us, Ghost. What's your take
> > on this whole idea?
>
> I don't have time at this moment to backtrack and read the entire
> thread. So, if you have a specific question, please (re)state it in
> as concise terms as possible, and I will answer it if I feel that I am
> qualified to do so. If not, I will do my best to refer you to someone
> who can.
October 19, 2004 11:42:35 PM

Archived from groups: rec.audio.pro (More info?)

Tom Loredo <loredo@somewhere.cornell.edu> wrote in message news:<417450DE.2990F934@somewhere.cornell.edu>...
> Hi Ryan-
>
> The sines and cosines that get used to build up a waveform in
> Fourier analysis are the "basis functions" of the Fourier
> transform. It is possible to decompose signals using many
> different types of bases. The Fourier basis (sines and cosines,
> harmonically related if the signal is of finite extent) has
> some nice mathematical properties that make the decomposition
> (and recomposition) simpler, mathematically, than it is with
> many other bases. But that simplicity doesn't make the
> Fourier basis "right" for all applications...

Thank you for this copious amount of unsolicited information. It is
already proving useful.
October 20, 2004 12:02:27 AM

Archived from groups: rec.audio.pro (More info?)

Bob Cain <arcane@arcanemethods.com> wrote in message news:<cl1l23020i5@enews2.newsguy.com>...

> It wasn't just to get you to jump through hoops, Ryan. I'm
> truly interested in how a musically creative mind would
> specify the problem in some detail. That's good input for
> the more academic oriented folks who are working and
> thinking at the computational level.

Well, being both a musician and an engineer's son, I'd have to say
that the real work would be done by the mathematicians. I think I
adequately described the problem, but then again, since I know exactly
what I want, it is hard for me to evaluate whether or not I expressed
it. I'm sure your old English teacher went over this phenomenon once
or twice. If there is something about the description (not the math)
that you think needs a little more flesh, name it and I will further
describe.

>
> The biggest problem with all of this that I see is how to
> specify in detail what's in the music that can be considered
> features worth thinking about extracting algoritmically. If
> a human can't get real down with that part then there is
> little hope of implementing anything useful.

Humans aren't so absolute. Give us some choices, however, and we can
easily narrow it down.

> Granted, for
> the non-technically but strongly musically inclined it could
> be a very frustrating experience to see how difficult it is
> to reduce things that seem obvious to her to terms that have
> any hope of an impelementation, but you gotta start somewhere.

If I understood math better I would probably be able to reduce it a
little more myself. As it is though, I think we will have to work
together for optimum results.
Anonymous
October 20, 2004 2:26:55 AM

Archived from groups: rec.audio.pro (More info?)

Ryan wrote:

> Well, being both a musician and an engineer's son, I'd have to say
> that the real work would be done by the mathematicians.

I disagree. The hard part is the definition. It isn't
going to be math, at any rate, it will be algorithm and
there is a huge difference. Elements of math will, of
course, enter into it and what tools there are in that
regard need to be understood but putting them all together
in useful procedures that illustrate the definition is
probably going to be a lot easier than setting down those
definitions in an objectively meaningful way in the first place.

> I think I
> adequately described the problem, but then again, since I know exactly
> what I want, it is hard for me to evaluate whether or not I expressed
> it.

Exactly. Makes perfect sense to you.

> I'm sure your old English teacher went over this phenomenon once
> or twice. If there is something about the description (not the math)
> that you think needs a little more flesh, name it and I will further
> describe.

Start with some kind of objective measure of the subjective
that is in your head.

> If I understood math better I would probably be able to reduce it a
> little more myself.

I strongly urge you to consider that path. Knowing what you
want out of it will slant your math education in what could
be very productive ways.

> As it is though, I think we will have to work
> together for optimum results.

I have less hope for that. Our languages are too dissimilar
(and I'm probably way too old to learn yours.) :-)


Bob
--

"Things should be described as simply as possible, but no
simpler."

A. Einstein
Anonymous
October 20, 2004 3:01:06 PM

Archived from groups: rec.audio.pro (More info?)

Ryan <inkexit@yahoo.com> wrote:
>Bob Cain <arcane@arcanemethods.com> wrote in message news:<cl1l23020i5@enews2.newsguy.com>...
>
>> It wasn't just to get you to jump through hoops, Ryan. I'm
>> truly interested in how a musically creative mind would
>> specify the problem in some detail. That's good input for
>> the more academic oriented folks who are working and
>> thinking at the computational level.
>
>Well, being both a musician and an engineer's son, I'd have to say
>that the real work would be done by the mathematicians. I think I
>adequately described the problem, but then again, since I know exactly
>what I want, it is hard for me to evaluate whether or not I expressed
>it. I'm sure your old English teacher went over this phenomenon once
>or twice. If there is something about the description (not the math)
>that you think needs a little more flesh, name it and I will further
>describe.

My suggestion: get Octave, which is a free, open-source Matlab clone. Load
some samples into it. Start doing some ffts on some of the samples and
looking at plots. There will also be things in there for waterfall plots,
correlation, all kinds of nifty things to play with. I think they may also
have some wavelet decomposition stuff, which is another way of taking
waveforms apart.
--scott
--
"C'est un Nagra. C'est suisse, et tres, tres precis."
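The first experiment Scott describes translates directly to Python with
NumPy, for anyone without Octave to hand. This sketch synthesizes its
"sample" rather than loading one (with real material you would read a
file, e.g. via the standard wave module, instead):

```python
import numpy as np

# FFT a one-second "sample" and find where the energy sits. Here the
# sample is a synthesized 440 Hz tone with one overtone, standing in
# for a loaded recording.
fs = 44100
t = np.arange(fs) / fs
sample = 0.8 * np.sin(2*np.pi*440*t) + 0.3 * np.sin(2*np.pi*880*t)

spectrum = np.abs(np.fft.rfft(sample))
freqs = np.fft.rfftfreq(len(sample), d=1/fs)   # bin centers in Hz

peak_hz = freqs[np.argmax(spectrum)]           # strongest component
```

From here, plotting `spectrum` against `freqs` (e.g. with matplotlib)
gives the kind of picture Scott suggests staring at.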
Anonymous
October 20, 2004 8:44:30 PM

Archived from groups: rec.audio.pro (More info?)

inkexit@yahoo.com (Ryan) wrote in message news:<dea39397.0410191840.38e86de8@posting.google.com>...

> I'm looking to find out more about writing some software that will use
> traditional classical instruments to emulate "natural" or "non musical
> sounds." The software will perform some type of analyses on an audio
> file, I imagine FFT would be used at some point, but the problem with
> FFT is that it only tells you what "perfect" or pure sine wave based
> frequencies are present in a sound. Besides the flute, not much else
> in an orchestra has anything close to a sine wave output. After this
> analysis is done, the software will look through a library of sounds
> made by traditional instruments. These sounds will include every
> noise and playing style every traditional instrument can produce. The
> software will then juggle the sounds around at various dynamic levels
> in various rhythms and etc until it comes up with the closest
> combination to the original sound. Perhaps a car engine sound file
> would yield three Double Basses, a flute or two in very quiet
> irregular rhythms, and maybe a horn would be involved during gear
> changes.

That topic seems like it would be related to musical instrument sound
synthesis, so you might want to see if there is a newsgroup that deals
specifically with musical instrument synthesis. Because most natural
sounds and most instrument sounds are non-stationary, I do not
believe that FFT analysis is going to be useful. You may want to
consider cross-correlation instead.
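A minimal sketch of what cross-correlation buys you here, in
Python/NumPy (the template, offset, and noise level are all invented
for illustration): sliding a short "instrument sound" template along a
longer recording and finding where it lines up best.

```python
import numpy as np

rng = np.random.default_rng(1)

# Find where a short template (think: one instrument sound) best lines
# up inside a longer, noisy recording.
template = np.sin(2 * np.pi * np.arange(200) / 25)   # 200-sample burst

recording = 0.05 * rng.standard_normal(2000)
recording[700:900] += template                       # bury it at offset 700

# corr[k] = sum over n of recording[k + n] * template[n]
corr = np.correlate(recording, template, mode='valid')
best_offset = int(np.argmax(corr))
```

The correlation peak lands at (or within a sample of) the true offset
of 700, despite the noise.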