Why was Intel a no-show on No Execute?

Anonymous
May 26, 2004 12:12:29 PM

Archived from groups: comp.arch,comp.sys.ibm.pc.hardware.chips,comp.sys.intel

This has been discussed at quite some length in these newsgroups, but now it
looks like the mainstream press are starting to hear about it too. Intel had
to be embarrassed into including NX into its AMD64 implementation.

http://story.news.yahoo.com/news?tmpl=story&cid=1738&nc...

There are a few things that the article's writer got wrong, but a few
things were right.

One thing he got partially wrong was his statement about Intel having no
execute protection in the 16-bit segments. The feature was still there in
the 32-bit segments, Intel never got rid of them. It was stupid OS designers
who decided to ignore the feature that caused this problem.

Yousuf Khan

--
Humans: contact me at ykhan at rogers dot com
Spambots: just reply to this email address ;-)

Anonymous
May 26, 2004 4:24:23 PM

Yousuf Khan wrote:

> One thing he got partially wrong was his statement about Intel
> having no execute protection in the 16-bit segments. The feature
> was still there in the 32-bit segments, Intel never got rid of
> them. It was stupid OS designers who decided to ignore the
> feature that caused this problem.

Are you calling them "stupid" because they opted for paging
instead of segmentation, in an effort to write a portable OS?

Do you think there should be an x86-specific Linux branch,
using segmentation instead of paging?
Anonymous
May 26, 2004 11:08:43 PM

Grumble wrote:

> Yousuf Khan wrote:

>> One thing he got partially wrong was his statement about Intel
>> having no execute protection in the 16-bit segments. The feature
>> was still there in the 32-bit segments, Intel never got rid of
>> them. It was stupid OS designers who decided to ignore the
>> feature that caused this problem.

> Are you calling them "stupid" because they opted for paging
> instead of segmentation, in an effort to write a portable OS?

> Do you think there should be an x86-specific Linux branch,
> using segmentation instead of paging?


I don't think it would be so hard to put all the data in a
data segment, and the code in a code segment, without overlapping
them. It requires the CS: prefix on any loads from the code
segment. Self modifying code is out of style these days,
so that shouldn't be much of a problem.

Now, for things like JIT where code is constantly being
written while running some arrangement would need to be made.

-- glen
Anonymous
May 27, 2004 12:43:06 AM

In comp.arch Grumble <a@b.c> wrote:
> Yousuf Khan wrote:
>
> > One thing he got partially wrong was his statement about Intel
> > having no execute protection in the 16-bit segments. The feature
> > was still there in the 32-bit segments, Intel never got rid of
> > them. It was stupid OS designers who decided to ignore the
> > feature that caused this problem.
>
> Are you calling them "stupid" because they opted for paging
> instead of segmentation, in an effort to write a portable OS?
>
> Do you think there should be an x86-specific Linux branch,
> using segmentation instead of paging?
>

There was one for quite a while for pre-386 modes/machines.

--
Sander

+++ Out of cheese error +++
Anonymous
May 27, 2004 1:07:45 AM

In comp.sys.ibm.pc.hardware.chips glen herrmannsfeldt <gah@ugcs.caltech.edu> wrote:
> I don't think it would be so hard to put all the data in a
> data segment, and the code in a code segment, without overlapping
> them. It requires the CS: prefix on any loads from the code
> segment. Self modifying code is out of style these days,
> so that shouldn't be much of a problem.

That _still_ won't help (never mind interpreted or JIT).

If an attacker can redirect execution by modifying the
return address on the stack, s/he doesn't need their own
executable code. Just point to data like "/bin/sh" and
return to an `exec` syscall.

-- Robert
Anonymous
May 27, 2004 1:23:11 AM

In comp.sys.ibm.pc.hardware.chips Robert Redelmeier <redelm@ev1.net.invalid> wrote:
> In comp.sys.ibm.pc.hardware.chips glen herrmannsfeldt <gah@ugcs.caltech.edu> wrote:
>> I don't think it would be so hard to put all the data in a
>> data segment, and the code in a code segment, without overlapping
>> them. It requires the CS: prefix on any loads from the code
>> segment. Self modifying code is out of style these days,
>> so that shouldn't be much of a problem.
>
> That _still_ won't help (never mind interpreted or JIT).
>
> If an attacker can redirect execution by modifying the
> return address on the stack, s/he doesn't need their own
> executable code. Just point to data like "/bin/sh" and
> return to an `exec` syscall.

Ah, but you make me think -- all current CPUs have an internal
hardware call/return stack to speed up branch [mis]prediction.

It would be relatively simple to check this hw stack against
the memory stack and generate a fault if return addresses
don't match.

This could be enabled by a bit in the MSR if the OS has support
to handle/log "return addr faults". Most pgms should never
generate a return fault, but a mechanism could be made to
except those few that do.

A slightly bigger problem is the hw stacks are of limited
depth (6?) and it might be possible to flood them out.
But variable stack entry pointers would become more effective.
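
As a rough illustration of the check described above, here is a toy Python model. It is not how any real CPU implements its return stack: the names `ToyCpu` and `ReturnAddressFault`, and the policy of dropping the oldest entry when the hardware stack is full, are assumptions made for the sketch, and the depth of 6 is just the guess from this post.

```python
class ReturnAddressFault(Exception):
    """Raised when the on-chip copy of a return address disagrees with memory."""


class ToyCpu:
    """Toy model: a memory stack plus a small hardware return stack."""

    def __init__(self, hw_depth=6):
        self.memory_stack = []  # return addresses in RAM (writable by the program)
        self.hw_stack = []      # on-chip copy (not writable by the program)
        self.hw_depth = hw_depth

    def call(self, return_address):
        """CALL pushes the return address onto both stacks."""
        self.memory_stack.append(return_address)
        self.hw_stack.append(return_address)
        if len(self.hw_stack) > self.hw_depth:
            self.hw_stack.pop(0)  # assumed policy: the oldest entry is flooded out

    def ret(self):
        """RET pops both stacks and faults if the two addresses differ."""
        target = self.memory_stack.pop()
        if not self.hw_stack:
            return target  # entry was flooded out, so there is nothing to compare
        expected = self.hw_stack.pop()
        if expected != target:
            raise ReturnAddressFault(hex(target))
        return target
```

In this sketch, entries that were flooded out are simply not checked, which is the weakness mentioned above: returns from frames deeper than the hardware stack escape the comparison.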

-- Robert
Anonymous
May 27, 2004 1:58:43 AM

> It would be relatively simple to check this hw stack against
> the memory stack and generate a fault if return addresses
> don't match.

Look up "call-with-current-continuation" to see why this is not a good idea.
Or maybe just think of how to implement exception handling.


Stefan
Anonymous
May 27, 2004 2:55:37 AM

Grumble <a@b.c> wrote:
> Yousuf Khan wrote:
>
>> One thing he got partially wrong was his statement about Intel
>> having no execute protection in the 16-bit segments. The feature
>> was still there in the 32-bit segments, Intel never got rid of
>> them. It was stupid OS designers who decided to ignore the
>> feature that caused this problem.
>
> Are you calling them "stupid" because they opted for paging
> instead of segmentation, in an effort to write a portable OS?

No, for not opting to use both. There was no mutual exclusivity between
paging and segmentation. Both could be used and complement each other.

I think the original OS designers in their haste to port Unix to the new
32-bit Intel chip did a simple cross-compile, and then didn't bother to make
use of any of the Intel-specific features of their architecture. They just
left it at "good enough". Of course, using Intel features would've made them
non-portable, but a lot of stuff gets non-portable at the lowest levels of
the kernel anyways.

> Do you think there should be an x86-specific Linux branch,
> using segmentation instead of paging?

There already was. The original pre-1.0 Linux kernels were using segments
*and* paging. I think with addition of new people into the development team,
Linux's original purpose got changed from being the ultimate Intel OS (Unix
or otherwise), to being a free version of portable Unix.

Yousuf Khan
Anonymous
May 27, 2004 3:14:40 AM

Robert Redelmeier <redelm@ev1.net.invalid> wrote:
> That _still_ won't help (never mind interpreted or JIT).
>
> If an attacker can redirect execution by modifying the
> return address on the stack, s/he doesn't need their own
> executable code. Just point to data like "/bin/sh" and
> return to an `exec` syscall.

How's an attacker to do that, when the code, the stack and the heap
don't even share the same memory addresses?

Yousuf Khan
Anonymous
May 27, 2004 3:56:46 AM

In comp.sys.ibm.pc.hardware.chips Yousuf Khan <news.tally.bbbl67@spamgourmet.com> wrote:
> How's an attacker to do that, when the code, the stack and the heap
> don't even share the same memory addresses?

Easy. Overwrite the stack with crafted input to an unrestricted
input call (getch() is a frequent culprit). This is the basic
buffer overflow.

In the location for the return address (where EBP is usually
pointing), put in a return address that points to a suitably
dangerous part of the existing code. Like an `exec` syscall.
Above this return address, put in data to make that syscall
nefarious.
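
To make the point concrete, here is a toy Python model of a single stack frame. It is only a sketch, not real machine semantics: the addresses 0x100 and 0x200, the function name `run_victim` and the frame layout (buffer, then return address, then one argument slot) are all made up for illustration.

```python
def run_victim(user_input, buffer_size=8):
    """Toy model of one stack frame being overwritten by unchecked input."""
    executed = []

    # "Code segment": fixed routines the program already contains.
    # Nothing in this function ever adds to or modifies this table.
    code = {
        0x100: lambda arg: executed.append(("return_to_caller", arg)),
        0x200: lambda arg: executed.append(("exec", arg)),
    }

    # Frame layout, low to high: buffer slots, return address, argument slot.
    frame = [None] * buffer_size + [0x100, "harmless"]

    # Unbounded copy in the spirit of gets(): it keeps writing past the
    # buffer (the model just stops at the end of the frame).
    for i, item in enumerate(user_input[:len(frame)]):
        frame[i] = item

    return_address = frame[buffer_size]
    argument = frame[buffer_size + 1]
    code[return_address](argument)  # "ret" dispatches only into existing code
    return executed
```

With short input the model records `("return_to_caller", "harmless")`. With `["A"] * 8 + [0x200, "/bin/sh"]` it records `("exec", "/bin/sh")`, even though no new code was written anywhere.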

-- Robert
Anonymous
May 27, 2004 4:01:43 AM

In comp.sys.ibm.pc.hardware.chips Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>> It would be relatively simple to check this hw stack against
>> the memory stack and generate a fault if return addresses
>> don't match.
>
> Look up "call-with-current-continuation" to see why this is not a good idea.
> Or maybe just think of how to implement exception handling.

Exception handling is easy -- mismatch produces a MC interrupt.
The kernelspace ISR checks the MSRs which tell it that a return
addr mismatch occurred. Kernel decides what to do -- abort proc,
log, or proceed.

Sure it'll be slow, but how often are calls not paired with
returns? call jtable[eax*4] is the standard syntax for a
jump table, not `push eax/ret`

-- Robert
Anonymous
May 27, 2004 8:11:10 AM

Sander Vesik <sander@haldjas.folklore.ee> wrote:
>> Do you think there should be an x86-specific Linux branch,
>> using segmentation instead of paging?
>>
>
> There was one for quite a while for pre-386 modes/machines.

That was Minix. Linux has always been for 386 and later machines only.

Yousuf Khan
Anonymous
May 27, 2004 8:11:10 AM

Robert Redelmeier <redelm@ev1.net.invalid> wrote:
> In comp.sys.ibm.pc.hardware.chips Yousuf Khan
> <news.tally.bbbl67@spamgourmet.com> wrote:
>> How's an attacker to do that, when the code, the stack and the
>> heap don't even share the same memory addresses?
>
> Easy. Overwrite the stack with crafted input to an unrestricted
> input call (getch() is a frequent culprit). This is the basic
> buffer overflow.
>
> In the location for the return address (where EBP is usually
> pointing), put in a return address that points to a suitably
> dangerous part of the existing code. Like an `exec` syscall.
> Above this return address, put in data to make that syscall
> nefarious.

Nope, won't work. Segmentation would protect it completely. There is no way
for data written to the heap to touch the data in the stack. Stack segment
and data segment are separate. It's like as if the stack had its own
container, the code has its own, and the data heap its own. What happens in
one container won't even reach the other containers.

Face it, segments were the perfect security mechanism, and systems
developers completely ignored it!

Yousuf Khan
Anonymous
May 27, 2004 2:13:18 PM

Robert Redelmeier wrote:

> Overwrite the stack with crafted input to an unrestricted
> input call (getch() is a frequent culprit).

There is no getch() in ISO C.

fgetc(), getc(), and getchar() return a single character.

Perhaps you meant gets().
Anonymous
May 27, 2004 2:49:36 PM

Robert Redelmeier wrote:

> Ah, but you make me think -- all current CPUs have an internal
> hardware call/return stack to speed up branch [mis]prediction.

e.g. the Athlon implements a 12-entry return address stack to
predict return addresses from a near or far call. As CALLs are
fetched, the next EIP is pushed onto the return stack. Subsequent
RETs pop a predicted return address off the top of the stack.

> It would be relatively simple to check this hw stack against
> the memory stack and generate a fault if return addresses
> don't match.

I think you've just killed the performance of recursive functions.

> This could be enabled by a bit in the MSR if the OS has support
> to handle/log "return addr faults". Most pgms should never
> generate a return fault

This is where I think you are wrong.

The K8 has a counter to measure this event:

88h IC Return stack hit
89h IC Return stack overflow

It would be interesting to take, say, SPEC CPU2000, and count
the number of overflows for each benchmark. I might try.
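
Short of running the benchmarks, a toy Python model gives a feel for the counts involved. It is a sketch under stated assumptions, not a description of the K8: the class `ReturnStackPredictor`, the drop-the-oldest overflow policy, and the rule that an empty stack yields no prediction are all assumptions, and the return addresses are invented.

```python
from collections import deque


class ReturnStackPredictor:
    """Toy return address stack with a fixed number of entries."""

    def __init__(self, entries=12):
        self.stack = deque(maxlen=entries)
        self.hits = 0
        self.misses = 0
        self.overflows = 0

    def on_call(self, next_ip):
        if len(self.stack) == self.stack.maxlen:
            self.overflows += 1  # a full deque drops its oldest entry on append
        self.stack.append(next_ip)

    def on_ret(self, actual_target):
        # Assumed behavior: an empty stack gives no prediction, which counts as a miss.
        predicted = self.stack.pop() if self.stack else None
        if predicted == actual_target:
            self.hits += 1
        else:
            self.misses += 1


def simulate_nested_calls(depth, entries=12):
    """Nest `depth` calls with distinct return addresses, then unwind them all."""
    predictor = ReturnStackPredictor(entries)
    return_addresses = [0x1000 + 4 * i for i in range(depth)]
    for address in return_addresses:
        predictor.on_call(address)
    for address in reversed(return_addresses):
        predictor.on_ret(address)
    return predictor.hits, predictor.misses, predictor.overflows
```

Under these assumptions, `simulate_nested_calls(12)` gives 12 hits and no overflows, while `simulate_nested_calls(20)` gives 12 hits, 8 misses and 8 overflows: every overflowed entry turns into one mispredicted return on the way back out.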

--
Regards, Grumble
Anonymous
May 27, 2004 2:49:37 PM

Grumble <a@b.c> writes:

>> It would be relatively simple to check this hw stack against
>> the memory stack and generate a fault if return addresses
>> don't match.

>I think you've just killed the performance of recursive functions.

And possibly longjmp()/setcontext() and the like; quite a bit of
additional work is needed to fix all such things (and if you want to
throw in binary compatibility, it's going to be harder still).

Casper
Anonymous
May 27, 2004 3:09:20 PM

>>>>> "YK" == Yousuf Khan <news.tally.bbbl67@spamgourmet.com> writes:

YK> That was Minix. Linux has always been for 386 and later machines
YK> only.

I think the ELKS people will be saddened to hear that.


/Benny
Anonymous
May 27, 2004 4:37:44 PM

In comp.sys.ibm.pc.hardware.chips Grumble <a@b.c> wrote:
> I think you've just killed the performance of recursive functions.

I don't think so. For a recursive function there are many
calls, possibly flooding out the hw return stack. But every
call has a return, and that address _is_ correct on both the
hw and memory stacks.

> 88h IC Return stack hit
> 89h IC Return stack overflow
>
> It would be interesting to take, say, SPEC CPU2000, and count
> the number of overflows for each benchmark. I might try.

Excellent! I do not suggest trapping out overflows.
They're to occur on deep recursion which should not contain
evil getch() calls. Just trap misses.

-- Robert
Anonymous
May 27, 2004 4:40:23 PM

In comp.sys.ibm.pc.hardware.chips Grumble <a@b.c> wrote:
> There is no getch() in ISO C.
> Perhaps you meant gets().

Thank you for the correction. I do mean gets().
I apologize for any confusion.

-- Robert
Anonymous
May 27, 2004 4:55:24 PM

In comp.sys.ibm.pc.hardware.chips Yousuf Khan <news.tally.bbbl67@spamgourmet.com> wrote:
> Nope, won't work. Segmentation would protect it completely. There is no way
> for data written to the heap to touch the data in the stack. Stack segment
> and data segment are separate. It's like as if the stack had its own
> container, the code has its own, and the data heap its own. What happens in
> one container won't even reach the other containers.

True in a literal sense.

But `c` compilers have this habit of allocating local variable
space on the stack. So when `char input[80];` is coded in a
routine, ESP gets decreased by 80 and that array is sitting
just below the return address!

I don't think it's _required_ by any standard that local vars are
allocated on the stack, but it sure makes memory management easy.

AFAIK, only global vars and large malloc()s are put on the heap.
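
The layout can be mimicked with a byte array in Python. This is a simplified sketch: a real x86 frame usually has a saved EBP between the buffer and the return address, and the address `SAVED_RETURN` and the helper names here are hypothetical.

```python
import struct

BUFFER_SIZE = 80
SAVED_RETURN = 0x08048ABC  # hypothetical return address, for illustration only


def make_frame():
    """An 80-byte local buffer followed directly by a 4-byte return address."""
    return bytearray(BUFFER_SIZE) + bytearray(struct.pack("<I", SAVED_RETURN))


def return_address(frame):
    """Read the little-endian return address stored just past the buffer."""
    return struct.unpack_from("<I", frame, BUFFER_SIZE)[0]


def unbounded_copy(frame, data):
    """In the spirit of gets(): copies the input without looking at BUFFER_SIZE."""
    data = data[:len(frame)]  # the model stops at the end of the frame
    frame[0:len(data)] = data


def bounded_copy(frame, data):
    """In the spirit of fgets(buf, sizeof buf, ...): never writes past the buffer."""
    chunk = data[:BUFFER_SIZE - 1]
    frame[0:len(chunk)] = chunk
    frame[len(chunk)] = 0  # terminating NUL stays inside the buffer
```

Feeding 84 bytes to `unbounded_copy` replaces the stored return address with the last four bytes of the input, while `bounded_copy` leaves it untouched however long the input is.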

-- Robert
Anonymous
May 27, 2004 7:21:22 PM

Benny Amorsen <amorsen@vega.amorsen.dk> wrote:
>>>>>> "YK" == Yousuf Khan <news.tally.bbbl67@spamgourmet.com> writes:
>
>> That was Minix. Linux has always been for 386 and later machines
>> only.
>
> I think the ELKS people will be saddened to hear that.

So, it never surprises me to find Linux being ported to do something or
another at some point in time. I guess the question these days to ask is
whether there is something Linux hasn't been ported to? Commodore 64? Apple
II?

Yousuf Khan
Anonymous
May 27, 2004 7:49:20 PM

In comp.arch Yousuf Khan <news.tally.bbbl67@spamgourmet.com> wrote:
> Robert Redelmeier <redelm@ev1.net.invalid> wrote:
> > In comp.sys.ibm.pc.hardware.chips Yousuf Khan
> > <news.tally.bbbl67@spamgourmet.com> wrote:
> >> How's an attacker to do that, when the code, the stack and the
> >> heap don't even share the same memory addresses?
> >
> > Easy. Overwrite the stack with crafted input to an unrestricted
> > input call (getch() is a frequent culprit). This is the basic
> > buffer overflow.
> >
> > In the location for the return address (where EBP is usually
> > pointing), put in a return address that points to a suitably
> > dangerous part of the existing code. Like an `exec` syscall.
> > Above this return address, put in data to make that syscall
> > nefarious.
>
> Nope, won't work. Segmentation would protect it completely. There is no way
> for data written to the heap to touch the data in the stack. Stack segment

But procedure local variables (including arrays) don't live in the heap,
they live on the stack.

> and data segment are separate. It's like as if the stack had its own
> container, the code has its own, and the data heap its own. What happens in
> one container won't even reach the other containers.

Doesn't matter. All you need for an exploit is to be able to make *one*
system call. And for that, you don't need to write to the code segment
at all. The stack is enough.

>
> Face it, segments were the perfect security mechanism, and systems
> developers completely ignored it!
>
> Yousuf Khan
>
>

--
Sander

+++ Out of cheese error +++
Anonymous
May 27, 2004 7:54:11 PM

Robert Redelmeier wrote:

> In comp.sys.ibm.pc.hardware.chips Grumble wrote:
>
>> I think you've just killed the performance of recursive functions.
>
> I don't think so. For a recursive function there are many
> calls, possibly flooding out the hw return stack. But every
> call has a return, and that address _is_ correct on both the
> hw and memory stacks.

You don't call any other function in your recursive functions? :-)

>> 88h IC Return stack hit
>> 89h IC Return stack overflow
>>
>> It would be interesting to take, say, SPEC CPU2000, and count
>> the number of overflows for each benchmark. I might try.
>
> Excellent! I do not suggest trapping out overflows.
> They're to occur on deep recursion which should not contain
> evil getch() calls. Just trap misses.

As far as I can tell, and with the exception of recursive
functions which call no other function, RAS overflow will
cause a RET misprediction.
Anonymous
May 27, 2004 8:09:03 PM

Sander Vesik <sander@haldjas.folklore.ee> wrote:
>> and data segment are separate. It's like as if the stack had its own
>> container, the code has its own, and the data heap its own. What
>> happens in one container won't even reach the other containers.
>
> Doesn't matter. All you need for an exploit is to be able to make
> *one* system call. And for that, you don't need to write to the code
> segment at all. The stack is enough.

The only place you can run code is from the code segment. If you insert code
into the stack segment, none of it will be executable. At best it might end
up causing the return address to go to the wrong part of the code segment
and therefore run the program from the wrong point, but more likely the
program will just end up locking up and be shut down by the OS.

Yousuf Khan
Anonymous
May 27, 2004 9:30:41 PM

On Thu, 27 May 2004, Yousuf Khan wrote:

> The only place you can run code is from the code segment. If you insert code
> into the stack segment, none of it will be executable. At best it might end
> up causing the return address to go to the wrong part of the code segment
> and therefore run the program from the wrong point, but more likely the
> program will just end up locking up and be shut down by the OS.

Changing the branch address and the stack values that get loaded into
argument registers (or just plain stack values on a stack machine)
is enough.

An object dump of a binary with a stack overflow reveals the address
of a "system call" instruction, which is enough to know what return
address is needed.

i.e. you don't need new code to execute you just need to get to
existing insn's in the binary with the appropriate state, and that
appropriate state can be set up with stack only overwriting.

Period.

Peter
Anonymous
May 27, 2004 10:15:19 PM

Yousuf Khan <bbbl67@ezrs.com> wrote:
> Sander Vesik <sander@haldjas.folklore.ee> wrote:
>>> and data segment are separate. It's like as if the stack had its own
>>> container, the code has its own, and the data heap its own. What
>>> happens in one container won't even reach the other containers.
>>
>> Doesn't matter. All you need for an exploit is to be able to make
>> *one* system call. And for that, you don't need to write to the code
>> segment at all. The stack is enough.
>
> The only place you can run code is from the code segment. If you insert code
> into the stack segment, none of it will be executable. At best it might end
> up causing the return address to go to the wrong part of the code segment
> and therefore run the program from the wrong point, but more likely the
> program will just end up locking up and be shut down by the OS.
>
> Yousuf Khan

Yousuf,

Check out the following link:

http://packetstormsecurity.nl/groups/horizon/stack.txt

which explains how to mount an overflow attack
when the stack is not executable.
Although it is illustrated on Solaris/SPARC,
it applies equally to any x86.

Seongbae
Anonymous
May 27, 2004 10:41:19 PM

> Segments would've fully protected everything.

Your assurance is endearing. But re-read the thread for a counter example
where the only code executed (in this process anyway) already exists (it
just forks off a /bin/sh shell).

Segments protect just as "fully" as separate address spaces do.
It's better than nothing, but unless you're extremely careful, it's not
sufficient for real security. Better make sure buffer overflows *can't*
happen, so you can actually reason about properties of your code.


Stefan
Anonymous
May 27, 2004 10:44:57 PM

In comp.sys.ibm.pc.hardware.chips Grumble <a@b.c> wrote:
> You don't call any other function in your recursive functions? :-)

Hey, I avoid recursion. But if you called another fn,
it too would return.

> As far as I can tell, and with the exception of recursive
> functions which call no other function, RAS overflow will
> cause a RET misprediction.

It should cause a RET misprediction even then unless it duplicates
TOS when it pops. For use as a security mechanism, it'd be
better if TOS was tagged empty or missing. Then no MCE.

-- Robert
Anonymous
May 27, 2004 10:58:00 PM

> Exception handling is easy -- mismatch produces a MC interrupt.
> The kernelspace ISR checks the MSRs which tell it that a return
> addr mismatch occurred. Kernel decides what to do -- abort proc,
> log, or proceed.

And how does the kernel "decide what to do"?
It's so simple to prevent buffer overflows, there's really no reason to go
to the trouble of some special hardware mechanism to catch some "odd"
behavior which may sometimes catch some forms of buffer-overflow-exploits.

> Sure it'll be slow, but how often are calls not paired with returns?

Can be pretty frequent with some languages/compilers, although admittedly
the cost of the misprediction you get with current CPUs is a strong
incentive to try and avoid such situations.


Stefan
Anonymous
May 28, 2004 1:05:37 AM

>> You don't call any other function in your recursive functions? :-)
> Hey, I avoid recursion.

Too bad. Usually makes for clean and simple code, whose security is
simpler to verify.


Stefan
Anonymous
May 28, 2004 1:51:48 AM

In comp.sys.ibm.pc.hardware.chips Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> And how does the kernel "decide what to do"?

Whatever it's been programmed to do, likely on a per-process basis.
Likely it'd start APatchy with something like
# /usr/sbin/nooverflow httpd &

> It's so simple to prevent buffer overflows, there's really no reason to go

Simple? Writing good code is simple? Wading through millions
of lines of cruft is simple?

> to the trouble of some special hardware mechanism to catch some "odd"
> behavior which may sometimes catch some forms of buffer-overflow-exploits.

I don't think there are that many forms of buffer overflows.
All result from an open ended IO call like gets().
Do you know any other kinds?

> Can be pretty frequent with some languages/compilers, although admittedly

Which ones? mispairing call/ret is a fast way to overflow the stack.

-- Robert
Anonymous
May 28, 2004 2:08:33 AM

>> And how does the kernel "decide what to do"?

> Whatever it's been programmed to do, likely on a per-process basis.
> Likely it'd start APatchy with something like
> # /usr/sbin/nooverflow httpd &

Say a mismatch (it's not just overflows) happens in a program that uses
exceptions (and where mismatches are hence not necessarily a sign of
a buffer-overflow-exploit): how is the kernel to determine if a given
mismatch is harmless?

>> It's so simple to prevent buffer overflows, there's really no reason to go
> Simple?

Trivial: use a language where it's automatically enforced.
I.e. basically any language other than C. Or use a C compiler that goes
through the extra trouble of trying to prevent overflow exploits
(i.e. by allocating stack variables on a separate stack, or by using fat
pointers, or ...).

>> to the trouble of some special hardware mechanism to catch some "odd"
>> behavior which may sometimes catch some forms of buffer-overflow-exploits.
> I don't think there are that many forms of buffer overflows.

Maybe not, but they can happen in many different kinds of code and there can
be many forms of exploits. So it can be between very difficult and
impossible for a low-level system to determine if a given behavior is part
of the normal execution or is the sign of an exploit.


Stefan
Anonymous
May 28, 2004 8:14:48 AM

In comp.sys.ibm.pc.hardware.chips Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> Say a mismatch (it's not just overflows) happens in a program that uses
> exceptions (and where mismatches are hence not necessarily a sign of
> a buffer-overflow-exploit): how is the kernel to determine if a given
> mismatch is harmless?

Well if the pgm has designed-in mismatches, the kernel can't
determine it, and the pgm would have to be run with that
protection disabled. But how many languages (other than asm)
even _allow_ mismatched call/ret?

> Maybe not, but they can happen in many different kinds of code and there can
> be many forms of exploits. So it can be between very difficult and
> impossible for a low-level system to determine if a given behavior is part
> of the normal execution or is the sign of an exploit.

Well, actually there is another way. The OS could monitor
events like return address mismatches and take defensive
actions when an increase is noted.

-- Robert
Anonymous
May 28, 2004 7:03:48 PM

In comp.arch Yousuf Khan <bbbl67@ezrs.com> wrote:
> Sander Vesik <sander@haldjas.folklore.ee> wrote:
> >> and data segment are separate. It's like as if the stack had its own
> >> container, the code has its own, and the data heap its own. What
> >> happens in one container won't even reach the other containers.
> >
> > Doesn't matter. All you need for an exploit is to be able to make
> > *one* system call. And for that, you don't need to write to the code
> > segment at all. The stack is enough.
>
> The only place you can run code is from the code segment. If you insert code

Only superficially true. As you have control of the stack, you can cause any
number of function calls to happen with the parameters of your choice. This
is essentially the same as running code.

> into the stack segment, none of it will be executable. At best it might end
> up causing the return address to go to the wrong part of the code segment
> and therefore run the program from the wrong point, but more likely the
> program will just end up locking up and be shut down by the OS.

Only if you don't know the addresses of functions and system calls.

>
> Yousuf Khan
>
>

--
Sander

+++ Out of cheese error +++
Anonymous
May 28, 2004 7:12:19 PM

In comp.arch Robert Redelmeier <redelm@ev1.net.invalid> wrote:
> In comp.sys.ibm.pc.hardware.chips Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> > And how does the kernel "decide what to do"?
>
> Whatever it's been programmed to do, likely on a per-process basis.
> Likely it'd start APatchy with something like
> # /usr/sbin/nooverflow httpd &
>
> > It's so simple to prevent buffer overflows, there's really no reason to go
>
> Simple? Writing good code is simple? Wading through millions
> of lines of cruft is simple?
>
> > to the trouble of some special hardware mechanism to catch some "odd"
> > behavior which may sometimes catch some forms of buffer-overflow-exploits.
>
> I don't think there are that many forms of buffer overflows.
> All result from an open ended IO call like gets().
> Do you know any other kinds?

Yes. Accepting user provided content length and not checking it against
your buffer size.
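
A minimal sketch of the missing check, in Python for brevity. Python will not actually overrun a buffer, so this only models the validation step; `read_body`, `BUFFER_CAPACITY` and the error messages are hypothetical names invented for the example.

```python
BUFFER_CAPACITY = 64  # hypothetical fixed buffer size


def read_body(declared_length, payload, capacity=BUFFER_CAPACITY):
    """Copy `declared_length` bytes of `payload` into a fixed-size buffer.

    The declared length comes from the sender, so it is checked against
    both the buffer capacity and the amount of data actually received.
    """
    if not isinstance(declared_length, int) or declared_length < 0:
        raise ValueError("declared length must be a non-negative integer")
    if declared_length > capacity:
        raise ValueError("declared length exceeds buffer capacity")
    if declared_length > len(payload):
        raise ValueError("declared length exceeds data actually received")

    buffer = bytearray(capacity)
    buffer[:declared_length] = payload[:declared_length]
    return bytes(buffer[:declared_length])
```

For example, `read_body(5, b"hello world")` returns `b"hello"`, while a declared length of 1000 is rejected before any copy takes place.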

>
> > Can be pretty frequent with some languages/compilers, although admittedly
>
> Which ones? mispairing call/ret is a fast way to overflow the stack.
>
> -- Robert
>

--
Sander

+++ Out of cheese error +++
Anonymous
May 28, 2004 7:24:25 PM

In comp.arch Robert Redelmeier <redelm@ev1.net.invalid> wrote:
> In comp.sys.ibm.pc.hardware.chips Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> > Say a mismatch (it's not just overflows) happens in a program that uses
> > exceptions (and where mismatches are hence not necessarily a sign of
> > a buffer-overflow-exploit): how is the kernel to determine if a given
> > mismatch is harmless?
>
> Well if the pgm has designed-in mismatches, the kernel can't
> determine it, and the pgm would have to be run with that
> protection disabled. But how many languages (other than asm)
> even _allow_ mismatched call/ret?

Consider a user mode threads package that uses get/setcontext()
or setjmp / longjmp and so on.
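
A rough sketch of the context-switching case, assuming a platform that still ships the `<ucontext.h>` interface (glibc on Linux does; POSIX has since marked it obsolescent). The function and variable names are hypothetical. Two flows of control run on two different stacks and hand off to each other, so the sequence of calls and returns seen on either stack does not pair up the way a simple call/ret checker would expect:

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx, worker_ctx;
static char worker_stack[64 * 1024];   /* separate stack for the second context */
static int trace[4];                   /* order in which the steps actually ran */
static int trace_len;

static void worker(void)
{
    trace[trace_len++] = 2;
    swapcontext(&worker_ctx, &main_ctx);   /* leave without returning to a caller */
    trace[trace_len++] = 4;
    /* falling off the end resumes uc_link, i.e. main_ctx */
}

/* Interleaves two contexts; returns the number of steps recorded. */
int run_two_contexts(void)
{
    int rc = getcontext(&worker_ctx);
    assert(rc == 0);

    trace_len = 0;
    worker_ctx.uc_stack.ss_sp = worker_stack;
    worker_ctx.uc_stack.ss_size = sizeof worker_stack;
    worker_ctx.uc_link = &main_ctx;
    makecontext(&worker_ctx, worker, 0);

    trace[trace_len++] = 1;
    swapcontext(&main_ctx, &worker_ctx);   /* switch onto the worker's stack */
    trace[trace_len++] = 3;
    swapcontext(&main_ctx, &worker_ctx);   /* resume the worker where it left off */
    return trace_len;
}
```

A user-mode threads package does essentially this on every switch, just with a scheduler choosing the next context.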

>
> -- Robert
>

--
Sander

+++ Out of cheese error +++
Anonymous
May 28, 2004 7:48:29 PM

In comp.sys.ibm.pc.hardware.chips Sander Vesik <sander@haldjas.folklore.ee> wrote:
> Consider a user mode threads package that uses
> get/setcontext() or setjmp / longjmp and so on.

Well, I'm not entirely sure how these constructs are
implemented by the compilers, but I would expect a
simple `jmp` instruction. This does NOT disturb the
hw call/ret stack, nor pose any buffer-overflow danger.

-- Robert
Anonymous
May 28, 2004 8:35:34 PM

>> Say a mismatch (it's not just overflows) happens in a program that uses
>> exceptions (and where mismatches are hence not necessarily a sign of
>> a buffer-overflow-exploit): how is the kernel to determine if a given
>> mismatch is harmless?

> Well if the pgm has designed-in mismatches, the kernel can't
> determine it, and the pgm would have to be run with that
> protection disabled. But how many languages (other than asm)
> even _allow_ mismatched call/ret?

Any language with exceptions: C++, Java, C (with setjmp/longjmp), ...

>> Maybe not, but they can happen in many different kinds of code and there can
>> be many forms of exploits. So it can be between very difficult and
>> impossible for a low-level system to determine if a given behavior is part
>> of the normal execution or is the sign of an exploit.

> Well, actually there is another way. The OS could monitor
> events like return address mismatches and take defensive
> actions when an increase is noted.

A buffer-overflow exploit might only need one mismatch.


Stefan
Anonymous
May 28, 2004 10:22:16 PM

In comp.sys.ibm.pc.hardware.chips Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> Any language with exceptions: C++, Java, C (with setjmp/longjmp), ...

Why should exceptions change anything? AFAIK, all exceptions are
kernel events / interrupts wherein the previous context is fully
saved and restored. Userspace exception handlers are supposed
to be isolated code with their own returns.

-- Robert
Anonymous
May 29, 2004 7:55:10 AM

Robert Redelmeier <redelm@ev1.net.invalid> wrote in message news:<0zltc.804$Fg5.576@newssvr23.news.prodigy.com>...
> ...
>
> But `c` compilers have this habit of allocating local variable
> space on the stack. So when `char input[80];` is coded in a
> routine, ESP gets decreased by 80 and that array is sitting
> just below the return address!
>
> I don't think it's _required_ by any standard that local vars are
> allocated on the stack, but it sure makes memory management easy.

It also facilitates recursion and re-entrancy. But it needn't be the
same stack as the return linkage pointer.
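
A small sketch of the recursion point, with hypothetical names. Each active call of `record_locals` gets its own instance of the automatic variable `slot`, which is what stack-style allocation provides for free; nothing here depends on whether that storage shares a stack with the return linkage:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_DEPTH 8

/* Static storage: one copy for the whole program run. */
static uintptr_t local_addr[MAX_DEPTH];

/* Each active call gets its own instance of `slot` (automatic storage). */
static int record_locals(int depth, int limit)
{
    volatile int slot = depth;
    local_addr[depth] = (uintptr_t)&slot;   /* record the address while slot is live */
    if (depth + 1 < limit)
        (void)record_locals(depth + 1, limit);
    return slot;                            /* still holds this call's own value */
}

/* Returns 1 if every recursion level had a distinct local, else 0. */
int locals_are_per_call(int limit)
{
    assert(limit >= 1 && limit <= MAX_DEPTH);

    int top = record_locals(0, limit);
    assert(top == 0);

    for (int i = 0; i < limit; i++)
        for (int j = i + 1; j < limit; j++)
            if (local_addr[i] == local_addr[j])
                return 0;
    return 1;
}
```

Had `slot` been declared `static`, all levels would have shared one object and the deeper calls would have overwritten the outer call's value.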

> AFAIK, only global vars and large malloc()s are put on the heap.

Only malloc()s.

Toby

>
> -- Robert
Anonymous
May 29, 2004 1:09:06 PM

In article <srLtc.1051$pF5.486@newssvr23.news.prodigy.com>,
Robert Redelmeier <redelm@ev1.net.invalid> wrote:
>In comp.sys.ibm.pc.hardware.chips Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>> Any language with exceptions: C++, Java, C (with setjmp/longjmp), ...
>
>Why should exceptions change anything? AFAIK, all exceptions are
>kernel events / interrupts wherein the previous context is fully
>saved and restored. Userspace exception handlers are supposed
>to be isolated code with their own returns.

Boggle.

It ain't what you don't know that causes the trouble; it's what you
know that ain't so.


Regards,
Nick Maclaren.
Anonymous
May 29, 2004 7:10:12 PM

Sander Vesik <sander@haldjas.folklore.ee> wrote:
>> The only place you can run code is from the code segment. If you
>> insert code
>
> Only superficially true. As you have control of the stack, you can
> cause any number of function calls to happen with the parameters of
> your choice. This is essentially the same as running code.

I see, so how long has C been passing command-line parameters through the
stack? How many other languages do this?

Yousuf Khan
Anonymous
May 29, 2004 11:39:34 PM

In article <d6ce4a6c.0405290255.5f77d483@posting.google.com>,
Toby Thain <toby@telegraphics.com.au> wrote:
>Robert Redelmeier <redelm@ev1.net.invalid> wrote in message news:<0zltc.804$Fg5.576@newssvr23.news.prodigy.com>...
>>
>> But `c` compilers have this habit of allocating local variable
>> space on the stack. So when `char input[80];` is coded in a
>> routine, ESP gets decreased by 80 and that array is sitting
>> just below the return address!
>>
>> I don't think it's _required_ by any standard that local vars are
>> allocated on the stack, but it sure makes memory management easy.
>
>It also facilitates recursion and re-entrancy. But it needn't be the
>same stack as the return linkage pointer.

That is true.

>> AFAIK, only global vars and large malloc()s are put on the heap.
>
>Only malloc()s.

That isn't. It depends on the implementation where variably sized
arrays are put, for example.


Regards,
Nick Maclaren.
Anonymous
May 29, 2004 11:42:13 PM

In article <oJ1uc.7840$JmE.5318@news04.bloor.is.net.cable.rogers.com>,
Yousuf Khan <bbbl67@ezrs.com> wrote:
>Sander Vesik <sander@haldjas.folklore.ee> wrote:
>>> The only place you can run code is from the code segment. If you
>>> insert code
>>
>> Only superficially true. As you have control of the stack, you can
>> cause any number of function calls to happen with the parameters of
>> your choice. This is essentially the same as running code.
>
>I see, so how long has C been passing command-line parameters through the
>stack? How many other languages do this?

Since the beginning. In pretty well all stack-based languages,
you can emulate such a call with no hassle. In some, it is more
difficult.


Regards,
Nick Maclaren.
Anonymous
May 30, 2004 12:14:51 AM

On Wed, 26 May 2004, Yousuf Khan wrote:

> No, for not opting to use both. There was no mutual exclusivity between
> paging and segmentation. Both could be used and complement each other.

Google for ingo molnar, execshield, ascii armou?r.

-Peter
Anonymous
May 30, 2004 1:20:26 AM

Toby Thain <toby@telegraphics.com.au> wrote:
+---------------
| Robert Redelmeier <redelm@ev1.net.invalid> wrote:
| > I don't think it's _required_ by any standard that local vars are
| > allocated on the stack, but it sure makes memory management easy.
|
| It also facilitates recursion and re-entrancy. But it needn't be the
| same stack as the return linkage pointer.
+---------------

But if you *don't* do it, then you have trouble with stack fragmentation
and/or collisions with your "argument stack" expanding at a different rate
than your "linkage stack", resulting in one or the other bumping into
arbitrary limits at inconvenient times. As a result, one or the other
of the stacks gets pushed off into the heap (usually the argument stack)
as a linked list of stack-allocated "malloc()" blocks [optimized by
allocating a bunch at a time], which puts a lot of stress on "malloc()",
or gets pushed into a separately-managed segment of address space, which
puts pressure on memory allocation in general and the dynamic loader in
particular.

We had some of these issues with the Am29000 Subroutine Calling Standard
(circa 1987), which had both a "register cache" stack for linkage
information and "small" arguments (which were passed in registers)
and a "memory" stack for "large" arguments (as well as *any* argument,
regardless of size, that the called subroutine referenced by address).[1]
Had the 29k CPU family ever made it into the 32-bit Unix[2] workstation
market, where as we know address space layout has become an issue
(especially with an ever-larger number of DLLs or DSOs competing for space),
the two-stack calling sequence could have become quite problematic.
[As it was, in the embedded-processor space it was pretty much a non-issue.]


-Rob

[1] Actually, the rule was that the first 16 *words* of arguments got
passed in registers and any further words of arguments got passed
on the memory stack, except that if the called routine referenced
any of the first 16 words by address (e.g., "&foo") then that word
and all subsequent words of the register args would get copied into
the memory stack at subroutine entry. Yes, this meant that whenever
the memory stack got used at all there was a 64-byte area at the
front reserved in case the first 16 words needed to be manifested
in memory. (*Ugh*)

[2] Both BSD and System-V ports were done to the Am29000 -- both were
quite straightforward since the 29k was a friendly target environment --
but shortly after both were up & running AMD chose not to promote
the 29k as a Unix engine, and they were abandoned.

-----
Rob Warnock <rpw3@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607
Anonymous
May 30, 2004 5:35:51 AM

Peter "Firefly" Lund <firefly@diku.dk> wrote:
> On Wed, 26 May 2004, Yousuf Khan wrote:
>
>> No, for not opting to use both. There was no mutual exclusivity
>> between paging and segmentation. Both could be used and complement
>> each other.
>
> Google for ingo molnar, execshield, ascii armou?r.

Looks like he was using the segment limits to protect against stack
overflows.

Yousuf Khan
Anonymous
May 30, 2004 10:02:06 AM

On Sat, 29 May 2004 15:10:12 GMT, "Yousuf Khan" <bbbl67@ezrs.com> wrote:

>Sander Vesik <sander@haldjas.folklore.ee> wrote:
>>> The only place you can run code is from the code segment. If you
>>> insert code
>>
>> Only superficially true. As you have control of the stack, you can
>> cause any number of function calls to happen with the parameters of
>> your choice. This is essentially the same as running code.
>
>I see, so how long has C been passing command-line parameters through the
>stack? How many other languages do this?

Hmm, not "command line" parameters but "actual argument" passing to
functions is done using the stack in C and most Fortrans I've come
across... if the hardware has a stack. But it's not the arguments, which
are pushed on the stack by the programmer written caller routine, which are
important - AIUI it's the return address which gets pushed on the stack
automatically by the "call" instruction. That's what can be fudged by the
exploit - all you need is a bugged, err vulnerable, system call address to
plonk in there which, when entered, will also take *its* argument values
off the stack.
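
A toy model of that mechanism, not real stack layout: the "frame" below is just a 9-byte array where the last byte stands in for the saved return address and selects one of two functions that already exist in the program. All names are hypothetical. It only illustrates why an unchecked copy into a buffer that sits next to the return slot lets the input choose which existing code runs next, with no injected code at all:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 8-byte buffer followed by a 1-byte "return slot". */
enum { BUF_LEN = 8, FRAME_LEN = BUF_LEN + 1 };

static int intended(int arg)   { return arg + 1; }    /* where the call should return */
static int unintended(int arg) { return arg * 100; }  /* existing code, wrong target */

static int (*const targets[])(int) = { intended, unintended };

/* Toy frame: bytes 0..7 are the buffer, byte 8 selects the return target. */
struct toy_frame {
    unsigned char bytes[FRAME_LEN];
};

/*
 * Copy input into the frame's buffer, then "return" through the slot.
 * With check_len == 0 the copy may run into the return slot, the toy
 * analogue of overwriting a return address. Returns -1 if rejected.
 */
int toy_call(const unsigned char *input, size_t len, int check_len, int arg)
{
    struct toy_frame f;

    assert(input != NULL && len <= FRAME_LEN);   /* keep the toy itself in bounds */
    memset(f.bytes, 0, sizeof f.bytes);          /* slot value 0 means intended() */

    if (check_len && len > BUF_LEN)
        return -1;                               /* oversized input is refused */

    memcpy(f.bytes, input, len);
    return targets[f.bytes[BUF_LEN] % 2](arg);   /* "ret" dispatches via the slot */
}
```

With eight or fewer input bytes the slot stays 0 and `intended` runs; a ninth byte of 1 redirects the "return" to `unintended` unless the length check is enabled.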

Rgds, George Macdonald

"Just because they're paranoid doesn't mean you're not psychotic" - Who, me??
Anonymous
May 30, 2004 8:58:22 PM

In comp.arch Robert Redelmeier <redelm@ev1.net.invalid> wrote:
> In comp.sys.ibm.pc.hardware.chips Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> > Any language with exceptions: C++, Java, C (with setjmp/longjmp), ...
>
> Why should exceptions change anything? AFAIK, all exceptions are
> kernel events / interrupts wherein the previous context is fully
> saved and restored. Userspace exception handlers are supposed
> to be isolated code with their own returns.

So you have no idea at all what an exception is in C++ / Java? If so,
why are you arguing on this topic? You are only going to be completely
wrong and embarrass yourself.

>
> -- Robert
>

--
Sander

+++ Out of cheese error +++
Anonymous
May 30, 2004 9:03:37 PM

In comp.arch Robert Redelmeier <redelm@ev1.net.invalid> wrote:
> In comp.sys.ibm.pc.hardware.chips Sander Vesik <sander@haldjas.folklore.ee> wrote:
> > Consider a user mode threads package that uses
> > get/setcontext() or setjmp / longjmp and so on.
>
> Well, I'm not entirely sure how these constructs are
> implemented by the compilers, but I would expect a
> simple `jmp` instruction. This does NOT disturb the
> hw call/ret stack, nor pose any buffer-overflow danger.

Completely, *utterly* wrong. Even if the essence of the longjmp
is a 'jmp' instruction, the whole point of it is that it happens
back over a number of intermediate call frames that thus never
return. The case of get/setcontext is even more drastic - these
actively change the userland context, so that after the setcontext
call the stack is pointing to a completely new place, including a
new call/return history that has no relation to the previous thread's
call/return history.
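
A minimal sketch of the `longjmp` case in standard C, with hypothetical function names. It counts how many calls were entered and how many returned normally; the `longjmp` from the deepest frame discards all the intermediate frames, so none of those calls ever executes its return:

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf unwind_point;
static volatile int calls_entered;
static volatile int returns_completed;

static void descend(int depth)
{
    calls_entered++;
    if (depth == 0)
        longjmp(unwind_point, 1);   /* abandon every frame above the setjmp */
    descend(depth - 1);
    returns_completed++;            /* never reached in this sketch */
}

/* Returns calls entered minus normal returns completed. */
int unmatched_calls(int depth)
{
    assert(depth >= 0);

    calls_entered = 0;
    returns_completed = 0;
    if (setjmp(unwind_point) == 0)
        descend(depth);             /* comes back via longjmp, not via return */
    return calls_entered - returns_completed;
}
```

For `unmatched_calls(3)` four calls to `descend` are entered and none of them returns, which is exactly the kind of mismatch a call/ret-pairing check would have to tolerate.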

Quit arguing about things you don't know anything about.

>
> -- Robert
>

--
Sander

+++ Out of cheese error +++