Music through GSM codecs, use of psychoacoustic codecs

August 23, 2004 4:33:51 AM

Archived from groups: comp.compression,comp.dsp,comp.multimedia,rec.audio.tech (More info?)

Dear all,

I am working on a project that involves transmitting music through GSM
phones. Since the GSM audio codec is designed for speech and not music or
general audio, the quality is quite bad. I realise that this will always be
a problem, but I was wondering whether there are ways to preprocess the
music in order to reduce the amount of distortion.

One of the things I considered was to use the MP3 or Ogg Vorbis codec to
remove some of the information in the original sound source before
converting to GSM. My understanding is that one of the reasons why speech
codecs such as GSM and Speex do not use any kind of psychoacoustic analysis
is that it increases the complexity of the system and hence the latency,
which is unacceptable for real-time applications. However, for my
application the music can be prepared in advance and saved on a server,
so it is feasible to apply some complex algorithms to it.

My intention was to use the Ogg Vorbis codec at maximum quality. But I am
wondering how much of the compression is achieved by reducing the
information (i.e. removing masked sounds using psychoacoustic analysis) and
how much by just storing it more efficiently.

My question is, does this seem like a reasonable approach to preparing the
music for transmission through GSM? Would it be better to use a lower value
for the Vorbis quality? Are there any other things I can do to the music in
order to reduce the chance of distortion?

If anyone can help me with any of these issues, I would appreciate it very
much.

Many Thanks,

RG
Anonymous
August 23, 2004 4:33:52 AM

Do you have the option of sending it as data through the GSM phone instead
of sending it through the voice circuit? If so, MP3 or Ogg might be of use.
If you are converting back to audio and then using the GSM compression,
there would be no advantage.

Rgds:
Eric
Anonymous
August 23, 2004 4:33:52 AM

"rg" <rg1117@hotmail.com> wrote in message
news:cgbah8$4le$1@news.freedom2surf.net
> Dear all,
>
> I am working on a project that involves transmitting music through GSM
> phones. Since the GSM audio codec is designed for speech and not
> music or general audio, the quality is quite bad. I realise that this
> will always be a problem, but I was wondering whether there are ways
> to preprocess the music in order to reduce the amount of distortion.
>
> One of the things I considered was to use the MP3 or Ogg Vorbis codec
> in order to reduce some of the information in the original sound
> source, before converting to gsm. My understanding is that one of the
> reasons why speech codecs such as GSM and Speex do not use any kind
> of psychoacoustic analysis, is because they increase the complexity
> of the system and hence the latency, something which is unacceptable
> for a real-time applications. However, for my application the music
> can be prepared in advanced and is saved on a server thus it is
> feasible apply some complex algorithms on it.

> My question is, does this seem like a reasonable approach to
> preparing the music for transmission through GSM? Would it be better
> to use a lower value for the Vorbis quality? Are there any other
> things I can do to the music in order to reduce the chance of
> distortion?

Conventional wisdom is that stacking lossy compression algorithms makes
things worse, rather than better.

In rough terms, each different lossy compression algorithm throws away
different information. The different losses add up to create more audible
artifacts.
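
As a rough illustration of how losses from two different stages can add up, here is a minimal sketch. It assumes the signal is already expressed as coefficients of some orthonormal transform, so that lost energy is simply the sum of squared dropped coefficients. The two stages (`drop_high`, `drop_quiet`) and the coefficient values are hypothetical toys, not models of GSM, MP3 or Vorbis; real codecs interact in far messier ways.

```python
def drop_high(coeffs, keep):
    """Toy stage A: keep only the first `keep` coefficients (band-limiting)."""
    return [c if i < keep else 0.0 for i, c in enumerate(coeffs)]


def drop_quiet(coeffs, floor):
    """Toy stage B: zero every coefficient whose magnitude is below `floor`."""
    return [c if abs(c) >= floor else 0.0 for c in coeffs]


def lost_energy(original, processed):
    """Energy of the difference between the original and processed coefficients."""
    return sum((o - p) ** 2 for o, p in zip(original, processed))


# Hypothetical transform coefficients of one block of audio.
coeffs = [4.0, 0.2, 3.0, 0.1, 2.0, 0.3, 1.0, 0.5]

loss_a = lost_energy(coeffs, drop_high(coeffs, 5))
loss_b = lost_energy(coeffs, drop_quiet(coeffs, 0.4))
loss_ab = lost_energy(coeffs, drop_quiet(drop_high(coeffs, 5), 0.4))

print("stage A alone:", loss_a)
print("stage B alone:", loss_b)
print("A then B:     ", loss_ab)
```

In this toy the cascade discards everything that either stage would discard on its own, so its loss can never be smaller than the loss of the worse single stage.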
Anonymous
August 23, 2004 8:39:59 AM

In article <ueCdnZHK4ccx8LTcRVn-vg@comcast.com>,
Arny Krueger <arnyk@hotpop.com> wrote:
>"rg" <rg1117@hotmail.com> wrote in message
>news:cgbah8$4le$1@news.freedom2surf.net
>> I am working on a project that involves transmitting music through GSM
>> phones.
>>...
>> One of the things I considered was to use the MP3 or Ogg Vorbis codec
>> in order to reduce some of the information in the original sound
>> source, before converting to gsm.
>[snip]
>Conventional wisdom is that stacking lossy compression algorithms makes
>things worse, rather than better.
>
>In rough terms, each different lossy compression algorithm throws away
>different information. The different losses add up to create more audible
>artifacts.

But those artifacts may not be audible on an "audio system" that
consists of an audio signal of narrow bandwidth and a tinny speaker
of narrow bandwidth. I imagine there's a lot of information one
could cut out without much loss to what one actually hears.
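
One way to "cut out" information that a narrow-band channel cannot carry anyway is to low-pass filter and resample the music before it reaches the codec. The sketch below is only an illustration under stated assumptions: a 48 kHz source, the 8 kHz sample rate that the GSM full-rate codec works at, and a 3400 Hz cutoff borrowed from the usual telephone band. The function names and the filter length are made up for this example, and whether such pre-filtering actually reduces audible distortion through GSM is untested here.

```python
import math


def lowpass_taps(cutoff_hz, rate_hz, num_taps=121):
    """Hamming-windowed sinc low-pass FIR, normalised to unity gain at DC."""
    mid = (num_taps - 1) / 2.0
    fc = cutoff_hz / rate_hz  # cutoff as a fraction of the sample rate
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = 2 * fc if t == 0 else math.sin(2 * math.pi * fc * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [h / total for h in taps]


def band_limit_and_decimate(samples, rate_hz, factor, cutoff_hz=3400.0):
    """Low-pass filter the input, computing only every `factor`-th output sample."""
    taps = lowpass_taps(cutoff_hz, rate_hz)
    out = []
    for start in range(0, len(samples) - len(taps) + 1, factor):
        out.append(sum(h * samples[start + k] for k, h in enumerate(taps)))
    return out


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def tone(freq_hz, rate_hz, count):
    """Unit-amplitude sine test tone."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]


RATE = 48000  # assumed source rate; 48000 / 6 = 8000 Hz
in_band = band_limit_and_decimate(tone(1000, RATE, 4800), RATE, 6)
out_of_band = band_limit_and_decimate(tone(10000, RATE, 4800), RATE, 6)

print("1 kHz tone level after filtering: ", rms(in_band))
print("10 kHz tone level after filtering:", rms(out_of_band))
```

The 1 kHz tone should pass almost unchanged while the 10 kHz tone is suppressed, which is all the sketch is meant to show.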

That said, the best reproduction would be if the GSM phone itself
contained the codec that processed the digital data into audio.
But I think if it's being converted to lossy audio remotely, it
would then have to be converted back to digital on its way to the
phone, and then again to audio. This will probably not sound good
no matter what one does.

Just my guess. Not that I know anything, I'm no expert.

-Alex
August 23, 2004 2:13:23 PM

Hi,
Thanks for your response. The music is being 'played back' through the
phone, i.e. it is transmitted as sound and not as data. Think of it as
music-on-hold.

RG
August 23, 2004 2:45:09 PM

"Arny Krueger" <arnyk@hotpop.com> wrote in message
news:ueCdnZHK4ccx8LTcRVn-vg@comcast.com...
> "rg" <rg1117@hotmail.com> wrote in message
> news:cgbah8$4le$1@news.freedom2surf.net
> > Dear all,
> >
> > I am working on a project that involves transmitting music through GSM
> > phones. Since the GSM audio codec is designed for speech and not
> > music or general audio, the quality is quite bad. I realise that this
> > will always be a problem, but I was wondering whether there are ways
> > to preprocess the music in order to reduce the amount of distortion.
> >
> > One of the things I considered was to use the MP3 or Ogg Vorbis codec
> > in order to reduce some of the information in the original sound
> > source, before converting to gsm. My understanding is that one of the
> > reasons why speech codecs such as GSM and Speex do not use any kind
> > of psychoacoustic analysis, is because they increase the complexity
> > of the system and hence the latency, something which is unacceptable
> > for a real-time applications. However, for my application the music
> > can be prepared in advanced and is saved on a server thus it is
> > feasible apply some complex algorithms on it.
>
> > My question is, does this seem like a reasonable approach to
> > preparing the music for transmission through GSM? Would it be better
> > to use a lower value for the Vorbis quality? Are there any other
> > things I can do to the music in order to reduce the chance of
> > distortion?
>
> Conventional wisdom is that stacking lossy compression algorithms makes
> things worse, rather than better.
>
> In rough terms, each different lossy compression algorithm throws away
> different information. The different losses add up to create more audible
> artifacts.

Hi,

In most cases, that would be true. However, what we are finding is that
there is too much information in the music for the GSM codec to deal with,
which causes significant distortion. So my intention was to reduce the
information before feeding it to the GSM part of the system, in the hope of
reducing that distortion.

Thanks,

RG
Anonymous
August 23, 2004 10:42:55 PM

In article <cgbah8$4le$1@news.freedom2surf.net>, rg <rg1117@hotmail.com> wrote:
>Dear all,
>
>I am working on a project that involves transmitting music through GSM
>phones. Since the GSM audio codec is designed for speech and not music or
>general audio, the quality is quite bad. I realise that this will always be
>a problem, but I was wondering whether there are ways to preprocess the
>music in order to reduce the amount of distortion.

Long story short: no.

Why? The GSM codec, like most voice codecs used nowadays, compresses
the signal by modeling the human vocal tract as a tube of varying
cross-section excited by a series of pulse trains. The encoder tries
to figure out the pulse information and from there derive the transfer
function. For unvoiced signals such as fricatives it uses a simpler model
excited by noise. The decoder then uses this information to regenerate
the guessed-at signal.
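
The tube model described above corresponds to an all-pole linear predictor, and one crude way to see how well such a model fits a frame of audio is its prediction gain. The sketch below uses the textbook autocorrelation method with the Levinson-Durbin recursion. The frame length of 160 samples and predictor order of 8 match the GSM full-rate short-term analysis, but everything else (function names, test signals) is made up for illustration, and a real GSM encoder does much more than this (long-term prediction and excitation coding), so treat the number as a rough proxy at best.

```python
import math
import random


def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (no windowing)."""
    return [sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations; return (predictor coefficients, residual energy)."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # frame is already perfectly predicted
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err


def prediction_gain_db(frame, order=8):
    """Ratio of frame energy to prediction residual energy, in dB."""
    r = autocorrelation(frame, order)
    if r[0] == 0.0:
        return 0.0  # silent frame
    _, err = levinson_durbin(r, order)
    return float("inf") if err <= 0.0 else 10.0 * math.log10(r[0] / err)


# Two hypothetical 20 ms frames at 8 kHz: a pure tone and white noise.
rng = random.Random(0)
tone_frame = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(160)]
noise_frame = [rng.uniform(-1.0, 1.0) for _ in range(160)]

tone_gain = prediction_gain_db(tone_frame)
noise_gain = prediction_gain_db(noise_frame)
print("tone frame gain (dB): ", tone_gain)
print("noise frame gain (dB):", noise_gain)
```

A frame the predictor models well shows a high gain, while a noise-like frame shows a gain near 0 dB. Dense music with many simultaneous sources would be expected to fall somewhere in between.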

Why are these types of codecs used for phones instead of MP3?
1) Lots of research has made these sound pretty good at reduced bit rates.
2) You can prioritize your bit allocations, so critical bits absolutely
needed for intelligibility are encoded with robust error correction,
the next most important have CRC checksums (I think, it's been 10 years),
and the least important are allowed to have errors.
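
For a sense of scale, the arithmetic below works through the bit budget of one GSM full-rate speech frame. The figures are the commonly cited ones for the full-rate traffic channel (260 speech bits per 20 ms, split into 50 class Ia, 132 class Ib and 78 class II bits, with a 3-bit parity check on class Ia, 4 tail bits and a rate-1/2 convolutional code). They are quoted from memory rather than from the thread, so check them against the channel coding specification before relying on them; the exact tiering may also differ in detail from the recollection above.

```python
FRAME_MS = 20  # one speech frame covers 20 ms of audio

CLASS_IA = 50   # most sensitive bits: parity check plus convolutional code
CLASS_IB = 132  # convolutional code only
CLASS_II = 78   # sent without protection

PARITY_BITS = 3  # parity check computed over class Ia
TAIL_BITS = 4    # flush bits for the convolutional encoder
CODE_RATE_INVERSE = 2  # rate 1/2: two coded bits per input bit

speech_bits = CLASS_IA + CLASS_IB + CLASS_II
protected_in = CLASS_IA + PARITY_BITS + CLASS_IB + TAIL_BITS
coded_bits = protected_in * CODE_RATE_INVERSE + CLASS_II

net_kbps = speech_bits / FRAME_MS   # bits per millisecond equals kbit/s
gross_kbps = coded_bits / FRAME_MS

print("speech bits per frame:", speech_bits)
print("coded bits per frame: ", coded_bits)
print("net rate (kbit/s):    ", net_kbps)
print("gross rate (kbit/s):  ", gross_kbps)
```

Under these figures, a little over 40% of the channel bits are spent on protection, concentrated on the bits that matter most for intelligibility.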

MP3 bits are more or less equally important for signal integrity, so
no such allocation could be derived, and you'd either need increased
bandwidth for error correction or you'd have to live with more garbled
voices in spotty reception.

In any case, voice codecs are not the best way of encoding a music signal;
indeed it's quite surprising that music comes through as well as it does.


>[...pre-treating with MP3 or Ogg Vorbis...]
>
>My question is, does this seem like a reasonable approach to preparing the
>music for transmission through GSM? Would it be better to use a lower value
>for the Vorbis quality? Are there any other things I can do to the music in
>order to reduce the chance of distortion?

Neither MP3 nor Ogg Vorbis will help you here: their compression algorithms
throw away completely different parts of the signal from voice codecs.
As Arny says, you'd only be compounding your difficulties.


Francois.
August 24, 2004 12:46:37 AM

Hi,

Thank you very much for your reply. You clearly have a very good
understanding of the GSM codec. If you don't mind, I would like to ask you a
follow-up question.
As you make clear, the GSM audio codec is very inappropriate for music, but
is there any way of analysing an audio/music file in order to estimate how
much distortion there would be if it were fed through the system? For
example, is it possible to get a measurement of how well the codec is able
to model a particular sound source?

Many Thanks for your help,

RG
Anonymous
August 24, 2004 12:46:38 AM

rg wrote:

> Hi,
>
> Thank you very much for your reply. You clearly have a very good
> understanding of the GSM codec. If you don't mind, I would like to ask you a
> follow up question.
> As you make clear, the GSM audio codec is very inappropriate for music, but
> is there any way of analysing an audio/music file in order to evaluate how
> much distortion there would be if fed through the system. For example, is it
> possible to get a measurement of how well the codec is able to model a
> particular sound source.
>
> Many Thanks for your help,
>
> RG

That one is easy enough even for me. Pass your file through a model
system and measure the distortion that results. If you want to see how
badly a lens distorts, take a picture with it.
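
A minimal sketch of that measurement is below, assuming you can obtain the decoded output as a list of samples at the same rate as the reference and already time-aligned with it (real codecs add delay, which has to be removed first). It computes a segmental SNR over 20 ms frames, clamped to a conventional range. `stand_in_codec` is a hypothetical placeholder, just a coarse quantizer, and should be replaced with a real encode/decode round trip. Waveform SNR is also a crude yardstick for a codec that does not try to preserve the waveform exactly.

```python
import math


def segmental_snr_db(reference, decoded, frame_len=160,
                     floor_db=-10.0, ceil_db=35.0):
    """Mean per-frame SNR in dB, with each frame clamped to [floor_db, ceil_db]."""
    n = min(len(reference), len(decoded))
    values = []
    for start in range(0, n - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal = sum(r * r for r in ref)
        noise = sum((r - d) ** 2 for r, d in zip(ref, dec))
        if signal == 0.0:
            continue  # skip silent frames
        snr = ceil_db if noise == 0.0 else 10.0 * math.log10(signal / noise)
        values.append(max(floor_db, min(ceil_db, snr)))
    return sum(values) / len(values) if values else float("nan")


def stand_in_codec(samples, step=0.25):
    """Hypothetical placeholder for a real codec round trip: coarse rounding."""
    return [step * round(s / step) for s in samples]


# One second of a 440 Hz tone at 8 kHz as a stand-in for real music.
reference = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
decoded = stand_in_codec(reference)

score_identical = segmental_snr_db(reference, reference)
score_degraded = segmental_snr_db(reference, decoded)
print("identical signals:", score_identical)
print("after stand-in codec:", score_degraded)
```

Scoring frame by frame rather than over the whole file makes it easier to see which passages the codec handles worst.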

Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
August 24, 2004 3:26:15 AM

"Jerry Avins" <jya@ieee.org> wrote in message
news:412a6a30$0$21745$61fed72c@news.rcn.com...
> rg wrote:
>
> > Hi,
> >
> > Thank you very much for your reply. You clearly have a very good
> > understanding of the GSM codec. If you don't mind, I would like to ask
> > you a follow up question.
> > As you make clear, the GSM audio codec is very inappropriate for music,
> > but is there any way of analysing an audio/music file in order to evaluate
> > how much distortion there would be if fed through the system. For example,
> > is it possible to get a measurement of how well the codec is able to model a
> > particular sound source.
> >
> > Many Thanks for your help,
> >
> > RG
>
> That one is easy enough even for me. Pass your file through a model
> system and measure the distortion that results. If you want to see how
> badly a lens distorts, take a picture with it.
>
> Jerry
> --
> Engineering is the art of making what you want from things you can get.
> ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

Thanks for your reply.

I have considered doing what you recommend: essentially, analysing the
correlation between the source file and the GSM version of the file. However,
my concern is that since GSM is a lossy codec, the two versions will almost
always be different. Would it be possible to distinguish between information
simply being removed by the coding mechanism and actual distortions being
introduced into the signal? I hope what I'm trying to say here makes sense.

Thanks,

RG
Anonymous
August 24, 2004 3:26:16 AM

rg wrote:

> "Jerry Avins" <jya@ieee.org> wrote in message
> news:412a6a30$0$21745$61fed72c@news.rcn.com...
>
>>rg wrote:
>>
>>>Hi,
>>>
>>>Thank you very much for your reply. You clearly have a very good
>>>understanding of the GSM codec. If you don't mind, I would like to ask
>>>you a follow up question.
>>>As you make clear, the GSM audio codec is very inappropriate for music,
>>>but is there any way of analysing an audio/music file in order to evaluate
>>>how much distortion there would be if fed through the system. For example,
>>>is it possible to get a measurement of how well the codec is able to model a
>>>particular sound source.
>>>
>>>Many Thanks for your help,
>>>
>>>RG
>>
>>That one is easy enough even for me. Pass your file through a model
>>system and measure the distortion that results. If you want to see how
>>badly a lens distorts, take a picture with it.
>>
>>Jerry
>>--
>>Engineering is the art of making what you want from things you can get.
>>¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
>
> Thanks for your reply.
>
> I have considered doing what you recommend. Essentially analysing the
> correlation between the source file and the gsm version of the file. However
> my concern is that since GSM is supposed to be a lossy codec, the two
> versions will almost always be different. Would it be possible to identify
> the distinction between information just being removed due to the coding
> mechanism and actual distortions being introduced into it? I hope what I'm
> trying to say here makes sense.
>
> Thanks,
>
> RG

It makes sense to me, but it seems like the technique for determining if
a number is divisible by seven. It exists, but it's more of a chore to
apply than it is to do the division.

Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
Anonymous
August 25, 2004 1:38:10 AM

Hi,

Some time back we were using a tool called OPTICOM (I think) for
comparing the original music and the encoded music (for an MP3 encoder).
It takes care of many things and mostly models the ear. Basically,
it checks whether the distortion is audible or not. Maybe that will be
useful for comparing how well the codecs behave.

Anandh
Anonymous
August 25, 2004 1:47:31 AM

axlq@spamcop.net (axlq) wrote in message news:<cgbsev$bqu$2@blue.rahul.net>...

> But those artifacts may not be audible on an "audio system" that
> consists of an audio signal of narrow bandwidth and a tinny speaker
> of narrow bandwidth. I imagine there's a lot of information one
> could cut out without much loss to what one actually hears.

The artifacts can get especially ugly when transcoding.

Try this one for example:

wave -> gsm -> wave -> mp3

...you'll hear some truly bizarre chirping and gurgling sounds at
times, so it appears sticking with one codec would result in the best
quality, with "best" being a relative quantity when such low bitrates
are involved. =)

GSM used to transmit speech does have its issues. For example, look
at the http://atnac2003.atcrc.com/POSTERS/Chong.pdf file. Considering
that various Chinese phonemes may get mangled in some cases (!), music
remaining aesthetically pleasing probably doesn't stand much of a
chance.

-t