Notes from the meeting of SciComp, the SP Scientific User's Group
in BoF Session, at SC2000, on 8 Nov, 2000
as recorded by T. M. DeBoni, secretary.
Preliminaries:
Technical difficulties with projection and presentation equipment were ironed
out.
Agenda Items
Review of attendee feedback from San Diego August 2000.
The following were the main feedback points:
- General happiness with the meeting, and with Jay Boisseau for organizing it.
- More exercises were desired in tutorials; this implies some hands-on time
and resources to support it. Discussion did not cover these points.
- More user applications and algorithms were desired in presentations.
- More AIX talks were desired, and it was also seen as desirable to continue
the IBM presentations. I think all would agree to the latter point.
- There were good accommodations and food associated with the meeting. San
Diego gets this praise.
- The meeting rooms were too warm; and, the meeting rooms were too cold. This
is just the nature of things, I think.
(NOTE: The beach was nice, on the one afternoon I played hooky and went out
there. Perhaps some talks could be held there in future meetings...)
- It was generally agreed that the IBM roadmap talk was the best. I think the
IBM talks were all quite good.
- One user talk that was applauded had to do with early experiences on NAVO's 2
TFLOP machine. Several others were popular, as well.
SciComp Draft Bylaws
A set of articles was proposed based on the CUG bylaws, and discussion ensued.
The outcomes are as follows:
- Memberships:
  - Memberships, if defined, could be based on (a) working at a site with an
    installed system; (b) being professionally associated with an installation
    site; and (c) being a designated academic or researcher member (i.e.,
    one with academic interest in the system in itself).
  - Additionally, nonmembers might be invited or allowed to attend meetings,
    at the discretion of some body (the officers or board of directors), but
    perhaps not invited to attend meetings held under nondisclosure agreements
    with IBM.
  - Individual members could be charged a membership fee as part of conference
    attendance, separate from meeting registration. Invited nonmembers would
    be exempt from membership fees but would pay registration fees.
  - Membership would continue for one year from the most recent meeting
    attended. Membership independent of meeting attendance was not discussed.
  - No institutional memberships or fees would be assessed; membership would
    be open to individuals only.
- Officers:
  - There should be President, Secretary, Treasurer, and IBM Liaison officers.
    They should serve one-and-a-half-year terms.
  - There should also be a Meeting Program chair, serving for a half-year
    term.
  - There should also be up to seven directors, including as many "at large"
    directors as needed to make up this number, if officers are not included
    among them.
- Standing committees were discussed:
  - Two such committees, modeled after CUG's, were discussed: membership and
    finance. Both were disparaged by the attendees. This matter was left
    unsettled, but the implication was that, if needed, they could be written
    into the bylaws later.
- Other Meeting Rules were discussed:
  - No employment recruiting or interviewing at meetings.
  - No current officers on nominating committee for next officers - this was
    left undetermined, and might be acted upon later.
  - No absentee ballots for elections.
  - No nationality rules on officers.
  - The officers might be allowed to determine election rules for future
    officers.
- The general feeling of those present seemed to be that too-specific definition
of rules was not needed at this time, and that anything not decided now could
be added as needed later. The absentee ballot issue was discussed in depth;
general agreement was reached that the CUG practice of voting by proxy should
not be followed; instead, absentee ballots should be allowed, due to the
international scope of the organization and the plan to alternate election
sites between the US meetings and the meetings outside the US.
The next US meeting will be in Knoxville, TN, at the Radisson Summit Hill, on
October 9-12, 2001.
SciComp 2000 technical concerns and IBM responses
Seven technical concerns were solicited from the attendees of the 2000 meeting.
They were written up by the officers and formally sent to IBM officials David
Turek, Peter Ungarro, and John Levesque.
The concerns so documented were discussed, and the IBM responses solicited
from J. Levesque were as follows:
- Running N user threads, processes, or tasks on N-CPU SMP nodes causes
performance degradation and variability. This has been noted reliably and
repeatedly, and documented in numerous contexts. This is considered crippling
to the use of SP systems.
Possible fix: bump up priority to starve system daemons.
IBM Response: Bob Davis of IBM offers settings from a well-tuned
system in New York. RAS daemons may be at fault. Time-of-day related changes
of behavior also seem to occur.
- Cleaning up abnormally terminated jobs often does not happen completely and
automatically - orphaned processes and shared memory segments are often left
behind, which causes trouble for subsequent use of the nodes affected.
Possible fix: job postscript and periodic daemons.
IBM Response: IBM will advise on fixes. Official response is needed,
as IBM sometimes disparages ad hoc fixes applied by users to such problems.
- DPCL supports dynamic instrumentation of parallel jobs, an idea many like.
IBM initially claimed it would be open source. Lately, this claim has come
to be doubted. Users want this to be the case.
IBM Response: It will be open source, as was formally announced at the
IBM SC2000 booth by Ted Hoover. This was applauded.
- Thread stack overflow errors are easy to introduce in correct programs and
can depend on environment variable settings. A program can jump across the
read-only boundary-guard page and cause a mysterious segmentation fault
somewhere else as a result of data corruption. This is considered a very big
problem, equal in magnitude to the first concern, above.
Possible fix: signal and detect all such overflows, at least in a debug
mode. This could result in large performance degradation, and should possibly
be done in a special debug mode, but such a mode would not guarantee an
appropriate response to all such faults. There should also be support for
detection of such faults in debuggers.
IBM Response: No response is ready at present, but one will be provided
forthwith.
- Power 3 is a 64-bit processor, but MPI is not yet released in 64-bit mode.
This is considered serious for large systems. The release schedule should be
accelerated.
IBM Response: No response is ready at present, but one will be provided
forthwith.
- Increasing SMP node CPU counts will cause time-sharing as well as
space-sharing; LoadLeveler (LL) does not support specifications for CPU
counts/thread counts; also, it does not enforce memory specifications. This can
lead to significant performance degradation with time-shared nodes.
IBM Response: No response is ready at present, but one will be provided
forthwith.
- Colony switch and adapters will have more adapters per node and higher
hardware bandwidth; the existing HAL-based MPI implementation requires
unnecessary memory copies, and limits single task bandwidth to lower than the
hardware limits (approximately 50%, with NightHawk II nodes). A zero copy
user-level protocol is needed, similar to the KLAPI implementation used by
GPFS.
IBM Response: No response is ready at present, but one will be provided
forthwith.
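The thread stack overflow concern above turns on one mechanism: a guard page
only catches an overflow if the offending write actually lands inside it. The
following is a minimal sketch of that idea as a toy model in Python, not a
description of the AIX or Pthreads implementation. The function name
classify_write and all sizes (96 KiB stack, 4 KiB guard, 90 KiB already in
use) are assumptions made up for illustration.

```python
# Toy model of a downward-growing thread stack with one guard region below it.
# All names and sizes are illustrative assumptions, not AIX or Pthreads defaults.

KIB = 1024


def classify_write(stack_size, guard_size, used, write_depth_in_frame):
    """Classify a write made by a newly entered function.

    stack_size           -- usable stack bytes (assumed)
    guard_size           -- bytes of guard region just below the stack (assumed)
    used                 -- stack bytes already in use before the call
    write_depth_in_frame -- how far below the old stack pointer the write lands
    """
    depth = used + write_depth_in_frame  # bytes below the top of the stack
    if depth <= stack_size:
        return "ok"            # still inside the thread's own stack
    if depth <= stack_size + guard_size:
        return "guard fault"   # lands in the guard: caught at the overflow
    return "jumped guard"      # lands past the guard: may corrupt other data


if __name__ == "__main__":
    stack, guard, used = 96 * KIB, 4 * KIB, 90 * KIB
    for frame in (4 * KIB, 8 * KIB, 64 * KIB):
        # Assume the first write touches the lowest address of the new frame,
        # as it would for element 0 of a large automatic array.
        print(frame // KIB, "KiB frame:",
              classify_write(stack, guard, used, frame))
```

In this model a frame only slightly larger than the remaining stack is caught
by the guard, while a frame holding a large automatic array steps over it
entirely, which is consistent with the "mysterious segmentation fault somewhere
else" described in the concern. A larger guard region narrows the gap but
cannot close it for arbitrarily large frames.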
IBM SP Roadmap Update
This talk contained such a wealth of detail that I could not capture more than
a small fraction of it. Also, there's the matter of IBM proprietary and possible
NDA-protected information. Therefore, the following notes represent only the tip
of the iceberg.
IBM will implement fundamental technical improvements that will prevent CMOS
from "topping out" in the near or mid-term (3 to 4 years) future, in terms of
processor or system performance, or the functionality that can be put on a single
chip. Insulated copper on-chip conductor paths will prevent inter-conductor
crosstalk, so higher frequencies can be used. Distributed clocks operating at
different frequencies, for different parts of the chip, will also allow higher
frequencies to be used. The first generation of processors to benefit from these
will be the Power 4 chips, which will be rolled out in 3Q/4Q 2001. These chips will
represent a merging of the now-separate Power 3 technical and commercial product
lines. They will be 2-way SMPs, running at 9 GF peak. They will have architectural
hooks for very high performance, and to allow efficient interconnection into
larger aggregations. The chips will contain very large L-2 cache shared between
the processors, along with L-3 cache controllers and directories. Initial system
offerings will be 32-way SMPs, but larger (smaller?) systems will follow. They
will be usable as message passing or NUMA systems, as whole or partitioned
machines.
Numerous software and utility enhancements are planned for RAS and user
convenience and usability.
Interconnects will evolve to pure hardware, eliminating embedded processors
for control and switching. Bandwidth will increase, and latency decrease,
accordingly. (This also should enhance their reliability, although they will be
harder to debug on the floor.)
RS/6000 SPs running AIX will continue to be IBM's primary HPC platform.
However, Linux has the potential to become the volume Application Development
environment. There will be a strong affinity between AIX and Linux, and they will
both be made available across a variety of Intel and Power platforms. IBM will
work with the Linux community to infuse AIX technology into the Linux kernel.
IBM will also deliver robust Linux cluster solutions.
On matters pertaining to the information herein, send email to
Thomas M. DeBoni at TMDeBoni@LBL.GOV, or
call 510-486-8617.