SGI takes Itanium & Linux to 1024-way

Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel

A single Linux image running across a 1024-processor Itanium machine.

http://www.computerworld.com/hardwaretopics/hardware/story/0,10801,94564,00.html

Yousuf Khan

--
Humans: contact me at ykhan at rogers dot com
Spambots: just reply to this email address ;-)
  1. Yousuf Khan wrote:

    > A single Linux image running across a 1024-processor Itanium machine.
    >
    > http://www.computerworld.com/hardwaretopics/hardware/story/0,10801,94564,00.html
    >
    >

    "The users get one memory image they have to deal with," he [Pennington,
    the interim director of NCSA] said. "This makes programming much easier,
    and we expect it to give better performance as well."
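
    To make the quote concrete: with one memory image, a neighbor's value
    is just a load; on a cluster, the same stencil update needs an explicit
    halo exchange first. A minimal sketch of the cluster side (assuming
    MPI; the sizes and the 1-D decomposition are made up, and this is
    nobody's actual production code):

        /* One relaxation sweep on a 1-D domain split across ranks. */
        #include <mpi.h>
        #include <stdlib.h>

        #define LOCAL_N 1024   /* points owned by this rank (assumed) */

        int main(int argc, char **argv)
        {
            MPI_Init(&argc, &argv);
            int rank, nprocs;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

            /* u[0] and u[LOCAL_N+1] are ghost cells for neighbors' edges. */
            double *u  = calloc(LOCAL_N + 2, sizeof *u);
            double *un = calloc(LOCAL_N + 2, sizeof *un);
            int left  = (rank == 0)          ? MPI_PROC_NULL : rank - 1;
            int right = (rank == nprocs - 1) ? MPI_PROC_NULL : rank + 1;

            /* Fill the ghost cells: edges travel to/from the neighbors. */
            MPI_Sendrecv(&u[1], 1, MPI_DOUBLE, left, 0,
                         &u[LOCAL_N + 1], 1, MPI_DOUBLE, right, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Sendrecv(&u[LOCAL_N], 1, MPI_DOUBLE, right, 1,
                         &u[0], 1, MPI_DOUBLE, left, 1,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);

            /* The update itself; under a single OS image the two
               exchanges above disappear and this loop is the program. */
            for (int i = 1; i <= LOCAL_N; i++)
                un[i] = 0.5 * (u[i - 1] + u[i + 1]);

            free(u);
            free(un);
            MPI_Finalize();
            return 0;
        }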

    Too early to call it a trend, but I'm encouraged to see the godfather of
    the "Top" 500 list talking some sense as well:

    callysto.hpcc.unical.it/hpc2004/talks/dongarra-survey.ppt

    slides 37 and 38.

    A single system image is no simple cure. It may not be a cure at all.
    But it's encouraging that somebody is taking it seriously enough to
    build a kilonode machine with a single address space.

    "Scalability" being a challenge for such installations (you can't just
    order more boxes and more cable and take another rural county out of
    agricultural production to move "up" the "Top" 500 list) the premium is
    on processors with high single-thread throughput.

    RM
  2. Robert Myers wrote:

    [SNIP]

    > A single system image is no simple cure. It may not be a cure at all.
    > But it's encouraging that somebody is taking it seriously enough to
    > build a kilonode machine with a single address space.

    Hats off to SGI, kilonode ssi is a neat trick. :)

    Let's say you write code that makes use of a large single system
    image machine. Let's say SGI fall behind the curve and you need
    answers faster: where can you go for another large single system
    image machine?

    I see that kind of awfully clever machine as vendor lock-in waiting
    to happen. If you want to avoid lock-in you end up writing your
    code to the lowest common denominator, and in this case that will
    probably remove any advantage gained by SSI (application depending
    of course).

    Cheers,
    Rupert
  3. Rupert Pigott wrote:

    >
    > Let's say you write code that makes use of a large single system
    > image machine. Let's say SGI fall behind the curve and you need
    > answers faster: where can you go for another large single system
    > image machine?
    >

    What curve are we keeping up with these days?

    The difference in scalability between the Altix and Blue Gene is
    interesting mostly if you're trying to hit arbitrarily defined
    milestones in a Gantt chart.

    For hydro, a factor of ten in machine size is only a 78% increase in the
    number of grid points per axis available to resolve a given scale, since
    cost grows as roughly the fourth power of linear resolution once the
    CFL-limited time step is counted (see the sketch below): whoop-de-ding. Maybe
    there's something different about actinide-lanthanide decay series
    that's worth understanding. I'll get around to it some time--even
    though I strongly suspect I'm being led on a wild goose chase. The real
    justification for the milestones on the Gantt chart of the last of the
    big spenders is that a petaflop is a nice big round number for a goal.
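
    The arithmetic behind that 78%, for anyone who wants to check it (a
    sketch, assuming an explicit 3-D scheme whose time step shrinks with
    the mesh spacing, so cost goes as the fourth power of resolution; a
    rule of thumb, not a measurement):

        #include <math.h>
        #include <stdio.h>

        int main(void)
        {
            double machine_factor = 10.0;  /* ten times the machine */
            /* cost ~ n^4 (three space dimensions plus a CFL-limited
               time step), so the affordable resolution n grows only
               as the fourth root of the machine factor. */
            double res_gain = pow(machine_factor, 1.0 / 4.0);
            printf("resolution gain: %.2fx (about %.0f%% more points "
                   "per axis)\n", res_gain, 100.0 * (res_gain - 1.0));
            return 0;
        }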

    > I see that kind of awfully clever machine as vendor lock-in waiting
    > to happen. If you want to avoid lock-in you end up writing your
    > code to the lowest common denominator, and in this case that will
    > probably remove any advantage gained by SSI (application depending
    > of course).
    >

    Blue Gene is now not awfully clever? :-).

    Commodity chip, flat address space. That sounds pretty vanilla to me.
    How do you get more common than that? You can get an Itanium box with a
    flat address space to your own personal work area much more readily than
    you can get a Blue Gene.

    There is no way for me not to leave you with the idea that I think
    single-image machines are the way to go. I don't know that, and I'm not
    even certain what course of investigation I would undertake to decide
    whether they are the way to go or not. What I like about the single
    address space is that it would appear to make the minimum architectural
    imposition on problem formulation.

    RM
  4. Robert Myers wrote:

    > How do you get more common than that? You can get an Itanium box with a
    > flat address space to your own personal work area much more readily than
    > you can get a Blue Gene.

    Extend that argument further and you are buying Xeons.

    The point is that 1000-node machines with shared address spaces don't
    fall out of trees. Who said anything about BlueGene anyway?

    > There is no way for me not to leave you with the idea that I think
    > single-image machines are the way to go. I don't know that, and I'm not
    > even certain

    Over the long run I think it will be very hard to justify the
    extra engineering and purchase cost over message passing gear.

    > what course of investigation I would undertake to decide whether they
    > are the way to go or not. What I like about the single address space is
    > that it would appear to make the minimum architectural imposition on
    > problem formulation.

    People made a similar argument for CISC machines too. VAX
    polynomial instructions come to mind. :)

    Cheers,
    Rupert
  5. Rupert Pigott wrote:

    > Robert Myers wrote:
    >
    >> How do you get more common than that? You can get an Itanium box with
    >> a flat address space to your own personal work area much more readily
    >> than you can get a Blue Gene.
    >
    > Extend that argument further and you are buying Xeons.
    >

    There is a fair question that could be asked for almost any application
    these days: why not ia-32 (probably with 64-bit extensions)? When
    you've got superlinear interconnect costs, you want each node to be as
    capable as possible. The application of that argument to Itanium in
    this particular case is wobbly, since the actual usefulness of
    Itanium may be just as theoretical as the usefulness of the clusters
    I've been worrying about.
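
    A toy model of the superlinear-cost point, in case it isn't obvious
    (the N log N fabric law and all the numbers are illustrative
    assumptions, not anybody's pricing):

        #include <math.h>
        #include <stdio.h>

        /* Assume fabric cost grows like N*log2(N) with node count N
           (crudely fat-tree-ish) while delivered compute is linear. */
        static double fabric_cost(double n) { return n * log2(n); }

        int main(void)
        {
            const double total_perf = 4096.0; /* fixed target, any units */
            for (double per_node = 1.0; per_node <= 8.0; per_node *= 2.0) {
                double n = total_perf / per_node;
                printf("%5.0f nodes at %1.0fx capability -> fabric ~ %6.0f\n",
                       n, per_node, fabric_cost(n));
            }
            return 0;
        }

    Doubling per-node capability halves the node count and cuts the
    modeled fabric cost by more than half, which is the argument for fat
    nodes.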

    > The point is 1000 node machines with shared address spaces don't
    > fall out of trees. Who said anything about BlueGene anyways ?
    >

    I did. Blue Gene was the best contrast I could think of to a single
    image Itanium machine in terms of cost, energy efficiency, and
    scalability. There is no fundamental reason why BlueGene couldn't
    become widely used and accepted, but it probably won't be because it
    won't show up in the workspace of your average graduate student or
    postdoc.

    Your question is: what do we do when we need more than 1000 nodes? It's
    a fair question, but not the only one you could ask. My questions are:
    where does the software that runs on the big machine come from, in what
    environment was it developed, at what cost, and with what opportunities
    for continued development?

    >> There is no way for me not to leave you with the idea that I think
    >> single-image machines are the way to go. I don't know that, and I'm
    >> not even certain
    >
    >
    > Over the long run I think it will be very hard to justify the
    > extra engineering and purchase cost over message passing gear.
    >

    Hardware is cheap, software is expensive. If we've run out of
    interesting things to do with making processors astonishingly powerful
    and inexpensive, we certainly haven't run out of interesting things to
    do in making interconnect astonishingly powerful and inexpensive.

    >> what course of investigation I would undertake to decide whether they
    >> are the way to go or not. What I like about the single address space is
    >> that it would appear to make the minimum architectural imposition on
    >> problem formulation.
    >
    >
    >> People made a similar argument for CISC machines too. VAX
    >> polynomial instructions come to mind. :)
    >

    The RISC/CISC argument went away when microprocessors were developed
    that could hide RISC execution behind a CISC programming model. The
    neat hardware insight (RISC) did not, in the end, impose itself on
    applications. No more should a particular hardware reality about
    multi-processor machines impose itself on applications.

    RM
  6. "Yousuf Khan" <bbbl67@ezrs.com> wrote in message
    news:e6kKc.237$S5k.21@news04.bloor.is.net.cable.rogers.com...
    >A single Linux image running across a 1024-processor Itanium machine.
    >
    > http://www.computerworld.com/hardwaretopics/hardware/story/0,10801,94564,00.html
    >


    Nobody has said it yet, so I guess I'll have to say it:
    "Imagine a Beowulf cluster of ..."

    --

    ... Hank

    http://horedson.home.att.net
    http://w0rli.home.att.net
  7. Rupert Pigott wrote:

    > Robert Myers wrote:
    >
    >> Rupert Pigott wrote:
    >>

    <snip>

    >>
    >> You are apparently arguing for the desirability of folding the
    >> artificial computational boundaries of clusters into software. If
    >
    >
    > That happens with SSI systems too. There is a load of information that
    > has been published about scaling on SGI's Origin machines over the
    > years. IIRC Altix is based on the same Origin 3000 design. You may
    > remember that I quizzed Rob Warnock on this; he said that there were
    > in practice little gotchas that tend to crop up at particular #'s of
    > procs. He even noted that the gotcha processor counts tended to change
    > with the particular generation of Origin.
    >
    >> that's a necessity of life, I can learn to live with it, but I'm
    >> having a hard time seeing it as desirable. We are so fortunate as to
    >> live in a universe that presents itself to us in midtower-sized
    >> chunks? I'm worried. ;-).
    >
    >
    > In my mind it's a question of fitting our computing effort to reality
    > as opposed to living in an Ivory Tower. Some goals, while worthy,
    > desirable, or even partially achievable, are basically impossible to
    > achieve in reality. A genuinely *flat* address space is impossible
    > right here and now. That SSI Altix box will *not* have a *flat* address
    > space in terms of time. It is a NUMA machine. :)
    >

    Well, yes, it is. The spread in latencies is more like half a
    microsecond, as opposed to five microseconds for the latest and greatest
    of the DoE build-to-order specials.
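
    Data placement still matters under that single image, of course. The
    standard trick is parallel first touch, since Linux by default places
    a page on the node whose CPU first faults it in. A sketch, assuming
    OpenMP and the default first-touch policy:

        #include <stdlib.h>

        #define N (1 << 24)   /* ~16M doubles, enough to span nodes */

        int main(void)
        {
            double *a = malloc(N * sizeof *a);
            if (!a) return 1;

            /* Parallel first touch: each thread faults in the chunk it
               will later work on, so those pages land in local memory. */
            #pragma omp parallel for schedule(static)
            for (int i = 0; i < N; i++)
                a[i] = 0.0;

            /* Same static schedule, so nearly every access is local;
               had one thread initialized a[], most would be remote. */
            #pragma omp parallel for schedule(static)
            for (int i = 0; i < N; i++)
                a[i] = 2.0 * a[i] + 1.0;

            free(a);
            return 0;
        }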

    On the question of Ivory Towers vs. reality, I believe that I am on the
    side of the angels, naturally. If you believe the right question really
    is: "What's the least expensive way we can get a high Linpack score?",
    then clusters are a slam dunk, but I don't think that anybody worth
    talking to on the subject really thinks that's the right question to be
    asking.

    As to access to 1000-node and even bigger machines, I don't need them.
    What I need is to know what kind of machine a code is likely to run on
    when somebody decides an NCSA-type installation is required.

    How you will _ever_ scale _anything_ to the kinds of memory and
    compute requirements needed to do even some very pedestrian problems
    properly is my real concern, and, from that point of view, no
    architecture currently on the table, short of specialized hardware, is
    even in the right universe.

    Given that _nothing_ currently available can really do the physics
    right--with the possible exception of things like the Cell-like chips
    the Columbia QCD people are using--and that nothing currently available
    really scales in a way that I can imagine, I'm inclined to give heavy
    emphasis to usability.

    >>> It's a
    >>> matter of choice over the long run... If you use the unique features
    >>> of a kilonode Itanium box then you're basically locked-in. Clearly
    >>> this is not an issue for some establishments, Cray customers are a
    >>> good example. :P
    >>>
    >>
    >> Can you give an example of something that you think would happen?
    >
    >
    > Depends on the app. Stuff like memory mapping one large file for read
    > and occasional write could cause some fantastic locking + latency
    > issues when it comes to porting. :)
    >

    I understand just enough about operating systems to know that building a
    1000-node image that runs on realizable hardware is a real
    tour-de-force. I also understand that you can take off-the-shelf copies
    of, say, RedHat Linux, and some easily-obtainable clustering software
    and (probably) get a thousand beige boxes to run like a kilonode
    cluster. Someone else (Linus, SGI, et al.) wrote the Altix OS. Someone
    else (Linus, RedHat, et al.) wrote the OS for the cluster nodes. I don't
    want to fiddle with either one. You want me to believe that I am better
    off synchronizing processes and exchanging data across Infiniband stacks,
    through trips in and out of kernel and user space, with heaven only
    knows how many control handoffs for each exchange, than I am reading
    and writing to my own user space under the control of a single OS, and I
    just don't.
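
    For what it's worth, the idiom Rupert flagged earlier -- map one large
    file and let any processor read it and occasionally write it -- is
    exactly the convenience at stake. Under a single OS image it is a few
    lines (a sketch; the path is hypothetical and error handling minimal):

        #include <fcntl.h>
        #include <sys/mman.h>
        #include <sys/stat.h>
        #include <unistd.h>

        int main(void)
        {
            int fd = open("/data/bigfile.dat", O_RDWR); /* made-up path */
            if (fd < 0) return 1;

            struct stat st;
            if (fstat(fd, &st) < 0) return 1;

            /* One shared mapping: every processor in the image sees the
               same bytes, plus whatever locking the application needs. */
            double *v = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                             MAP_SHARED, fd, 0);
            if (v == MAP_FAILED) return 1;

            v[0] += 1.0;  /* the "occasional write": a store here, a
                             whole message protocol on a cluster */

            munmap(v, st.st_size);
            close(fd);
            return 0;
        }

    Porting that to message passing means reinventing the file as a
    distributed data structure: Rupert's lock-in worry and my convenience
    argument in one line of code.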

    <snip>

    >
    > I mentioned Opteron, if HT really does suffer from crash+burn on
    > comms failure then it is holding itself back. If that ain't the
    > case I'd have figured that a tiny form factor Opteron + DRAM +
    > router cards would be a reasonable component for high-density
    > clusters and beige SSI machines. You'd need some facility for
    > driving some links for longer distances than HT currently allows
    > too ($$$). The next thing holding you back is tuning the OS + Apps
    > to a myriad of possible configurations... :(

    I'm guessing that, the promise of Opteron for HPC notwithstanding, HT is
    going to be marginalized by PCI Express/Infiniband.

    > [SNIP]
    >
    >> The optimistic view is that the chaos we currently see is the HPC
    >> equivalent of the pre-Cambrian explosion and that natural selection
    >> will eventually give us a mature and widely-adopted architecture. My
    >> purpose in starting this discussion was simply to opine that single
    >> image architectures have some features that make them seem promising
    >> as a survivor--not a widely-held view, I think.
    >
    >
    > I'm sure they'll have their place. But in the long run I think that
    > PetaFLOP pressure will tend to push people towards message passing
    > style machines. Consider this, though: the Internet is becoming more
    > and more prominent in daily life. The Spooks must have a fair old time
    > keeping up with the sheer volume of data flowing around the globe.
    > Distributed processing is a natural fit here, SSI machines just would
    > not make sense. More and more governments and their civil servants
    > will want to make use of this surveillance resource too, check out
    > the rate at which legislation is legitimising their intrusion on the
    > individual's privacy. The War on Terror has added more fuel to that
    > growth market too. :)
    >
    Nothing that _I_ say about distributed processing is going to slow it
    down, that's for sure, and that isn't my intent. If you've got a
    google-type task, you should use google-type hardware. Computational
    physics is not a google-type task.

    RM
  8. Robert Myers wrote:
    > Rupert Pigott wrote:

    [SNIP]

    > As always, though, the complexity has to go somewhere. What I can see

    Yes. I am painfully aware of Mashey's concerns about pushing
    complexity from one place to another.

    > of IEEE 1355 looks like an open source project to me. With open source,

    LOL, not at all. It was a write-up of the T9000's VCP. Bits and pieces
    of that technology have made their way into proprietary solutions.

    [SNIP]

    > You still wind up with many of the same problems, though: software
    > encrusted with everybody's favorite feature and interfaces that get
    > broken by changes that are made at a level you have no control over
    > (like the kernel) and that ripple through everything, for example.

    Of course, but it's easier to change a kernel than it is to respin
    silicon, or replace several thousand busted boards, right? A lot of
    MPP machines seem to give the customer access to the kernel source,
    which makes it easier for the desperados to fix the problems. :)

    [SNIP]

    > I wonder if part of what you object to with systems like Altix is that
    > it seems like movement away from open systems and back to the bad old
    > days. Could a bunch of geeks with a little money from, say, DARPA, do
    > better? Maybe. I think it's been tried at least once. ;-).

    I don't have a problem with Altix at all. I have a *concern* that
    the SSI feature is rather like putting chrome on a Porsche 917K if
    you are really interested in getting good perf out of it on an
    arbitrary problem + dataset. Data locality is still a key issue.

    I don't deny that it will make some apps easier, but in those cases
    you are wide open to vendor lock-in IMO. There are worse vendors
    than SGI of course, and I don't think they would be quite as evil
    as IBM were reputed to be.

    For those two reasons I question the long term viability of SSI
    MPP machines.

    Cheers,
    Rupert
  9. In article <1090171989.132815@teapot.planet.gong>,
    Rupert Pigott <roo@try-removing-this.darkboong.demon.co.uk> wrote:

    > Robert Myers wrote:
    >
    > [SNIP]
    >
    > > A single system image is no simple cure. It may not be a cure at all.
    > > But it's encouraging that somebody is taking it seriously enough to
    > > build a kilonode machine with a single address space.
    >
    > Hats off to SGI, kilonode ssi is a neat trick. :)
    >
    > Let's say you write code that makes use of a large single system
    > image machine. Let's say SGI fall behind the curve and you need
    > answers faster: where can you go for another large single system
    > image machine?
    >
    > I see that kind of awfully clever machine as vendor lock-in waiting
    > to happen. If you want to avoid lock-in you end up writing your
    > code to the lowest common denominator, and in this case that will
    > probably remove any advantage gained by SSI (application depending
    > of course).

    Let's say, instead, that one has an application that seems to require a
    256 node machine, but that need might grow in the next couple of years.
    SGI's announcement takes the risk out of choosing SGI for that
    application.

    And after a few more years, a then current 256 node machine will be able
    to take the place of a current 1024 node monster, if the application
    doesn't grow too much and one is only worried about the machine or SGI
    wearing out.
    ____________________________________________________________________
    TonyN.                                              tonynlsn@shore.net
  10. Tony Nelson wrote:

    [SNIP]

    > Let's say, instead, that one has an application that seems to require a
    > 256 node machine, but that need might grow in the next couple of years.
    > SGI's announcement takes the risk out of choosing SGI for that
    > application.

    Regardless, you are still effectively locked in if you become dependent
    on the SSI feature.

    There are also some other factors to take into account... such as, does
    your application scale to 1024 processors on that mythical machine? If it
    does not, who do you turn to if you are committed to SSI?

    > And after a few more years, a then current 256 node machine will be able
    > to take the place of a current 1024 node monster, if the application
    > doesn't grow too much and one is only worried about the machine or SGI
    > wearing out.

    Assuming clock rate cranking continues to pay off and the compilers
    improve significantly. I figure it'll come down to how much cache Intel
    can cram onto an IA-64 die, and that is a diminishing returns game.

    BTW: If you read through the immense amount of opinionated stuff I
    posted, you will see that I actually give SGI some credit. The question
    I raise, though, is: is SSI really that useful given the lock-in factor?

    Cheers,
    Rupert