From markreynoldsuk at gmail.com Thu May 1 13:10:06 2008
From: markreynoldsuk at gmail.com (Mark Reynolds)
Date: Sat Jul 5 01:07:03 2008
Subject: [Beowulf] Mersenne primes?
Message-ID: <7899080805011310xe372098y85a3dc01fbeb2be3@mail.gmail.com>

I'm fairly new to Beowulfery but am reading TFM from http://www.phy.duke.edu/~rgb/Beowulf/beowulf_book.php (great book, by the way).

Does anyone know of any programs to find Mersenne primes that are suited to parallelised environments, such as a cluster made up of nodes with multi-core processors? I've found mprime from http://www.mersenne.org/ but the FAQ states:

Although, a program could be written for dual-CPU systems (it would be quite time-consuming), the machine will still get more throughput by working on separate exponents.

Has anybody tried this, and can Lucas-Lehmer testing be effectively parallelised? Currently, you have to run an instance for each core for it to get any benefit.

Also, I'd like to modify it so it could, for example, search between certain numerical ranges independent of the work units sent out through the website, or take advantage of 64-bit hardware. Can someone more knowledgeable about this sort of thing tell me whether this is practical or if it has been done already?

Thanks.

--
"To win one hundred victories in one hundred battles is not the highest skill. To subdue the enemy without fighting is the highest skill." - Sun-Tzu

From Hakon.Bugge at scali.com Fri May 2 04:57:40 2008
From: Hakon.Bugge at scali.com (Håkon Bugge)
Date: Sat Jul 5 01:07:03 2008
Subject: [Beowulf] MPICH vs. OpenMPI
In-Reply-To: <1743033282.20080425140438@gmx.net>
References: <200804231839.m3NId0Tc024423@bluewest.scyld.com> <20080425100733.C4B6C35A92F@mail.scali.no> <1743033282.20080425140438@gmx.net>
Message-ID: <20080502115742.ED56E35A9FE@mail.scali.no>

Jan,

At 14:04 25.04.2008, Jan Heichler wrote:
>You are not gonna share these benchmark results with us, right?
>Would be very interesting to see that!
You will find them at:

http://www.scali.com/info/SHM-perf-8bytes-2007-12-20.htm
http://www.scali.com/info/SHM-perf-128bytes-2007-12-20.htm

Please note the tabs section at the bottom. You will have to use the horizontal scroll bar in order to see all the charts. Also, the 8-byte URL above does not start at the first tab, which is slightly confusing.

Depending on the chart, the Y-axis is time (usec), bandwidth (1e6 bytes/sec), or message rate (1e6 messages per second). The hardware is dual-socket Woodcrest at 3.00 GHz and dual-socket Clovertown at 2.66 and 3.00 GHz. The OS is SLES9. And I am aware that the version of OpenMPI is quite old.

Enjoy,
Hakon

From TPierce at rohmhaas.com Fri May 2 05:41:53 2008
From: TPierce at rohmhaas.com (Thomas H Dr Pierce)
Date: Sat Jul 5 01:07:03 2008
Subject: [Beowulf] Purdue Supercomputer
In-Reply-To: <20080502115742.ED56E35A9FE@mail.scali.no>
Message-ID:

Dear Beowulf,

Purdue is building their own cluster to create the 40th largest supercomputer. I wonder what operating system they will choose to use.

http://www.informationweek.com/news/hardware/supercomputers/showArticle.jhtml;jsessionid=EJES2NGMF5LUAQSNDLRSKH0CJUNN2JVN?articleID=207404139&_requestid=84418

And a YouTube video on "Installation Day"!
http://www.youtube.com/watch?v=wVzThRN4QJI
------
Sincerely,

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080502/6592ec16/attachment.html

From rreis at aero.ist.utl.pt Fri May 2 06:05:25 2008
From: rreis at aero.ist.utl.pt (Ricardo Reis)
Date: Sat Jul 5 01:07:03 2008
Subject: [Beowulf] Nvidia, cuda, tesla and... where's my double floating point?
Message-ID:

Does anyone know if/when there will be double-precision floating point on those little toys from nvidia?

greets,

Ricardo Reis

'Non Serviam'

PhD student @ Lasef
Computational Fluid Dynamics, High Performance Computing, Turbulence
http://www.lasef.ist.utl.pt

&

Cultural Instigator @ Rádio Zero
http://www.radiozero.pt

http://www.flickr.com/photos/rreis/

From john.leidel at gmail.com Sat May 3 10:13:32 2008
From: john.leidel at gmail.com (John Leidel)
Date: Sat Jul 5 01:07:03 2008
Subject: [Beowulf] Purdue Supercomputer
In-Reply-To:
References:
Message-ID: <1209834812.5212.111.camel@e521.site>

From the looks of their website, all their other clusters run Linux.

On Fri, 2008-05-02 at 08:41 -0400, Thomas H Dr Pierce wrote:
>
> Dear Beowulf,
>
> Purdue is building their own cluster. to create the 40th largest
> supercomputer. I wonder what operating system they will chose to
> use.
>
> http://www.informationweek.com/news/hardware/supercomputers/showArticle.jhtml;jsessionid=EJES2NGMF5LUAQSNDLRSKH0CJUNN2JVN?articleID=207404139&_requestid=84418
>
> And a youtube video on "Installation Day" !
> http://www.youtube.com/watch?v=wVzThRN4QJI
> ------
> Sincerely,
>
> _______________________________________________
> Beowulf mailing list, Beowulf@beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From balaji at mcs.anl.gov Sat May 3 10:11:24 2008
From: balaji at mcs.anl.gov (Pavan Balaji)
Date: Sat Jul 5 01:07:03 2008
Subject: [Beowulf] [p2s2-announce] Deadline Extension: P2S2 Workshop
Message-ID: <481C9CBC.9040005@mcs.anl.gov>

This email is to inform you of an extension on the paper submission deadline for the workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2) till May 21st, 2008. The detailed CFP is included at the bottom of this email.

There was an error the previous time this email was sent, and several people did not receive this announcement. If you are receiving this email twice, we apologize for the inconvenience.

This announcement list is for people who are interested in the P2S2 workshop. If you are not interested in these announcements, information on how to unsubscribe from this list is available at the bottom of this email. ======================================================================== CALL FOR PAPERS =============== First International Workshop on Parallel Programming Models and Systems Software for High-end Computing (P2S2) Sep. 8th, 2008 Web link: http://www.mcs.anl.gov/events/workshops/p2s2 To be held in conjunction with ICPP-08: The 27th International Conference on Parallel Processing Sep. 8-12, 2008 Portland, Oregon, USA SCOPE ----- The goal of this workshop is to bring together researchers and practitioners in parallel programming models and systems software for high-end computing systems. Please join us in a discussion of new ideas, experiences, and the latest trends in these areas at the workshop. TOPICS OF INTEREST ------------------ The focus areas for this workshop include, but are not limited to: * Programming models and their high-performance implementations o MPI, Sockets, OpenMP, Global Arrays, X10, UPC, Chapel o Other Hybrid Programming Models * Systems software for scientific and enterprise computing o Communication sub-subsystems for high-end computing o High-performance File and storage systems o Fault-tolerance techniques and implementations o Efficient and high-performance virtualization and other management mechanisms * Tools for Management, Maintenance, Coordination and Synchronization o Software for Enterprise Data-centers using Modern Architectures o Job scheduling libraries o Management libraries for large-scale system o Toolkits for process and task coordination on modern platforms * Performance evaluation, analysis and modeling of emerging computing platforms PROCEEDINGS ----------- Proceedings of this workshop will be published by the IEEE Computer Society (together with the ICPP conference proceedings) in CD format only and will 
be available at the conference. SUBMISSION INSTRUCTIONS ----------------------- Submissions should be in PDF format in U.S. Letter size paper. They should not exceed 8 pages (all inclusive). Submissions will be judged based on relevance, significance, originality, correctness and clarity. DATES AND DEADLINES ------------------- Paper Submission: Extended to May 21st, 2008 Author Notification: June 4th, 2008 Camera Ready: June 18th, 2008 PROGRAM CHAIRS -------------- * Pavan Balaji (Argonne National Laboratory) * Sayantan Sur (IBM Research) STEERING COMMITTEE ------------------ * William D. Gropp (University of Illinois Urbana-Champaign) * Dhabaleswar K. Panda (Ohio State University) * Vijay Saraswat (IBM Research) PROGRAM COMMITTEE ----------------- * David Bernholdt (Oak Ridge National Laboratory) * Ron Brightwell (Sandia National Laboratory) * Wu-chun Feng (Virginia Tech) * Richard Graham (Oak Ridge National Laboratory) * Hyun-wook Jin (Konkuk University, South Korea) * Sameer Kumar (IBM Research) * Doug Lea (State University of New York at Oswego) * Jarek Nieplocha (Pacific Northwest National Laboratory) * Scott Pakin (Los Alamos National Laboratory) * Vivek Sarkar (Rice University) * Rajeev Thakur (Argonne National Laboratory) * Pete Wyckoff (Ohio Supercomputing Center) If you have any questions, please contact us at p2s2-chairs@mcs.anl.gov ======================================================================== If you do not want to receive any more announcements regarding the P2S2 workshop, please send an email to majordomo@mcs.anl.gov with the email body (not email subject) as "unsubscribe p2s2-announce". 
======================================================================== -- Pavan Balaji http://www.mcs.anl.gov/~balaji From alex at younts.org Sat May 3 10:39:06 2008 From: alex at younts.org (Alex Younts) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] Purdue Supercomputer In-Reply-To: References: Message-ID: <481CA33A.2080204@younts.org> The machine will be running Redhat Enterprise 4. -- Alex Younts Thomas H Dr Pierce wrote: > > Dear Beowulf, > > Purdue is building their own cluster. to create the 40th largest > supercomputer. I wonder what operating system they will chose to use. > > http://www.informationweek.com/news/hardware/supercomputers/showArticle.jhtml;jsessionid=EJES2NGMF5LUAQSNDLRSKH0CJUNN2JVN?articleID=207404139&_requestid=84418 > > > And a youtube video on "Installation Day" ! > http://www.youtube.com/watch?v=wVzThRN4QJI > ------ > Sincerely, > > > ------------------------------------------------------------------------ > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joshua_mora at usa.net Sat May 3 12:15:04 2008 From: joshua_mora at usa.net (Joshua mora acosta) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] Purdue Supercomputer Message-ID: <854mecToe0708S08.1209842104@cmsweb08.cms.usa.net> Does anyone know what is the detailed plan for building that thing with 200 people in just 1 day? I am very curious to understand what things can be done in parallel, what things are serialized from the point of view of installation, testing and evaluation/assesment. Even monitoring the progress,identifying critical tasks, balancing the workforce, having several B,C plans in case plan A fails. And what is the final target, to run across the entire cluster HPL by the end of the day? What is a day in here a business day or 24hours? 
Joshua ------ Original Message ------ Received: Sat, 03 May 2008 10:27:20 AM PDT From: John Leidel To: Thomas H Dr Pierce Cc: beowulf@beowulf.org Subject: Re: [Beowulf] Purdue Supercomputer > >From the looks of their website, all their other clusters run linux. > > On Fri, 2008-05-02 at 08:41 -0400, Thomas H Dr Pierce wrote: > > > > Dear Beowulf, > > > > Purdue is building their own cluster. to create the 40th largest > > supercomputer. I wonder what operating system they will chose to > > use. > > > > http://www.informationweek.com/news/hardware/supercomputers/showArticle.jhtml;jsessionid=EJES2NGMF5LUAQSNDLRSKH0CJUNN2JVN?articleID=207404139&_requestid=84418 > > > > And a youtube video on "Installation Day" ! > > http://www.youtube.com/watch?v=wVzThRN4QJI > > ------ > > Sincerely, > > > > _______________________________________________ > > Beowulf mailing list, Beowulf@beowulf.org > > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > From hahn at mcmaster.ca Sat May 3 23:41:08 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] Purdue Supercomputer In-Reply-To: <854mecToe0708S08.1209842104@cmsweb08.cms.usa.net> References: <854mecToe0708S08.1209842104@cmsweb08.cms.usa.net> Message-ID: > Does anyone know what is the detailed plan for building that thing with 200 > people in just 1 day? I'm guessing it's mainly just the monkey work. I've heard that dell always delivers each server in a separate box, so the most annoying part of building a dell cluster is unboxing, racking and managing the detritus. 812 servers, 200 people is only 4 servers/day, which seems quite generous. 
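
A quick sketch of the arithmetic above. The 812 servers and 200 people are the figures quoted in this message; the 8-hour working day is an assumption used only for illustration, and the function names are made up for this example.

```python
def servers_per_person(servers: int, people: int) -> float:
    """Average number of servers each person has to handle."""
    if people <= 0:
        raise ValueError("people must be positive")
    return servers / people


def minutes_per_server(servers: int, people: int, hours: float) -> float:
    """Time budget per server if the work is spread evenly over `hours`."""
    return hours * 60 / servers_per_person(servers, people)


if __name__ == "__main__":
    # Figures from the message: 812 servers, 200 people.
    load = servers_per_person(812, 200)
    print(f"{load:.2f} servers per person")
    # Assumed 8-hour day (not stated in this message).
    print(f"{minutes_per_server(812, 200, 8.0):.0f} minutes per server")
```

Roughly four servers per person leaves about two hours per server under that assumption, which is why the staffing looks generous.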
> I am very curious to understand what things can be done in parallel, what > things are serialized from the point of view of installation, testing and > evaluation/assesment. from the article, it sounds like the 200 people will mainly be unboxing, perhaps applying the rail kit, transporting to the machineroom. it would make more sense to just have them rack directly, with one other person stationed at the back of each rack handling cabling. I'm guessing that the cluster uses a leaf-switch-in-rack design, and that the rack arrangement and cabling is done ahead/separately. I can't imagine why, as soon as the server is in the rack and cabled, it couldn't pxe boot a test config. if you fill the rack in some well-defined order, you can easily enough keep track of physical-network node mappings. > Even monitoring the progress,identifying critical tasks, balancing the > workforce, having several B,C plans in case plan A fails. you make it sound hard and uncertain. it's not. doing it in one day with 200 people is basically just a stunt... > What is a day in here a business day or 24hours? if everyone knew what they were doing, I can't imagine why it would take more than a few hours to build, even counting the elevator ride. (presumably ~50 per elevator trip. but more importantly, the elevator partitions the workforce.) if the event also includes setting up the racks, cabling the interconnect, and infrastructure servers, etc, it would be more impressive. From matt at technoronin.com Sat May 3 20:11:19 2008 From: matt at technoronin.com (Matt Lawrence) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] Purdue Supercomputer In-Reply-To: <481CA33A.2080204@younts.org> References: <481CA33A.2080204@younts.org> Message-ID: On Sat, 3 May 2008, Alex Younts wrote: > The machine will be running Redhat Enterprise 4. That's getting to be a bit dated, but still very well supported and extremely stable. -- Matt It's not what I know that counts. It's what I can remember in time to use. 
From alex at younts.org Sat May 3 21:00:46 2008
From: alex at younts.org (Alex Younts)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] Purdue Supercomputer
In-Reply-To: <854mecToe0708S08.1209842104@cmsweb08.cms.usa.net>
References: <854mecToe0708S08.1209842104@cmsweb08.cms.usa.net>
Message-ID: <481D34EE.6000908@younts.org>

Joshua mora acosta wrote:
> Does anyone know what is the detailed plan for building that thing with 200
> people in just 1 day?

Yep:

> I am very curious to understand what things can be done in parallel, what
> things are serialized from the point of view of installation, testing and
> evaluation/assesment.

There will be several teams: multiple 5-6 person teams unboxing nodes from their shipping boxes and sorting the materials for recycling, and a couple of cart runners going up and down the elevators into the data center. Then there will be 5-6 three-person teams racking nodes and doing the cabling all at once. At the end of the train of people doing the hardware, there'll be about 3-4 people coming along and installing the nodes. (We use Red Hat's kickstart and some special scripts we cooked up.) Almost all of this process is parallelized (probably everything but the lunch line). Once the nodes have a base install, they'll reboot and cfengine will run to make them "real" nodes.

> Even monitoring the progress,identifying critical tasks, balancing the
> workforce, having several B,C plans in case plan A fails.

We have a project manager and a lot of staff who have been putting a ton of time into this event.

> And what is the final target, to run across the entire cluster HPL by the end
> of the day?

To be running user jobs within 24 hours. We will do the benchmarking later, after all the DOA hardware has been fixed.

> What is a day in here a business day or 24hours?

The cluster hardware will be done in eight hours, and the software will simmer for up to 24 hours. We have built out a beefy install infrastructure to support a lot of simultaneous installs...

> > Joshua > > ------ Original Message ------ > Received: Sat, 03 May 2008 10:27:20 AM PDT > From: John Leidel > To: Thomas H Dr Pierce Cc: beowulf@beowulf.org > Subject: Re: [Beowulf] Purdue Supercomputer > >> >From the looks of their website, all their other clusters run linux. >> >> On Fri, 2008-05-02 at 08:41 -0400, Thomas H Dr Pierce wrote: >>> Dear Beowulf, >>> >>> Purdue is building their own cluster. to create the 40th largest >>> supercomputer. I wonder what operating system they will chose to >>> use. >>> >>> > http://www.informationweek.com/news/hardware/supercomputers/showArticle.jhtml;jsessionid=EJES2NGMF5LUAQSNDLRSKH0CJUNN2JVN?articleID=207404139&_requestid=84418 > >>> And a youtube video on "Installation Day" ! >>> http://www.youtube.com/watch?v=wVzThRN4QJI >>> ------ >>> Sincerely, >>> >>> _______________________________________________ >>> Beowulf mailing list, Beowulf@beowulf.org >>> To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Alex Younts alex@younts.org From kilian at stanford.edu Sun May 4 10:02:42 2008 From: kilian at stanford.edu (Kilian CAVALOTTI) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] Purdue Supercomputer In-Reply-To: References: <854mecToe0708S08.1209842104@cmsweb08.cms.usa.net> Message-ID: <200805041002.43612.kilian@stanford.edu> On Saturday 03 May 2008 23:41:08 Mark Hahn wrote: > I'm guessing it's mainly just the monkey work. I've heard that dell > always delivers each server in a separate box, Not necessarily. 
Our 288-node Dell cluster was delivered in fully populated and pre-cabled racks. All the racking and internal cabling was done at the Merge Center [1]. The only "boxes" we received were 11 wooden crates, from which we extracted the racks, ready to be interconnected (this had to be done on site, obviously, and Dell sent in a team to do the IB cabling and finalize the installation).

Purdue's "the day we rack" is brilliant PR, though.

Cheers,
--
Kilian

[1]http://www.dell.com/content/topics/global.aspx/services/adi/dps_hpcc?c=us&cs=555&l=en&s=biz

From kus at free.net Sun May 4 11:29:56 2008
From: kus at free.net (Mikhail Kuzminsky)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] Nvidia, cuda, tesla and... where's my double floating point?
In-Reply-To:
Message-ID:

In message from Ricardo Reis (Fri, 2 May 2008 14:05:25 +0100 (WEST)):
>
> Does anyone knows if/when there will be double floating point on those
> little toys from nvidia?

"Next generation Tesla", but I don't know when. Or use AMD FireStream 9170 instead :-)

Mikhail Kuzminsky
Computer Assistance to Chemical Research Center
Zelinsky Inst. of Organic Chemistry
Moscow

>
> greets,
>
> Ricardo Reis
>
> 'Non Serviam'
>
> PhD student @ Lasef
> Computational Fluid Dynamics, High Performance Computing, Turbulence
> http://www.lasef.ist.utl.pt
>
> &
>
> Cultural Instigator @ Rádio Zero
> http://www.radiozero.pt
>
> http://www.flickr.com/photos/rreis/

From jaime.perea at gmail.com Mon May 5 06:43:16 2008
From: jaime.perea at gmail.com (Jaime Perea)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] many cores and ib
Message-ID: <200805051543.17324.jaime.perea@gmail.com>

Hello,

Just a small question: does anybody have experience with many-core (16) nodes and InfiniBand? Since we have some users that need shared memory, but we also want to build a normal cluster for MPI apps, we think that this could be a solution. Let's say about 8 machines (96 processors) plus InfiniBand. Does that sound right?

I'm aware of the bottleneck that comes from having one IB interface for all the MPI cores; is there any possibility of bonding?

Thanks a lot and regards

Jaime D. Perea Duarte.
Linux registered user #10472

Dep. Astrofisica Extragalactica.
Instituto de Astrofisica de Andalucia (CSIC)
Apdo. 3004, 18080 Granada, Spain.

From jan.heichler at gmx.net Mon May 5 06:54:33 2008
From: jan.heichler at gmx.net (Jan Heichler)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] many cores and ib
In-Reply-To: <200805051543.17324.jaime.perea@gmail.com>
References: <200805051543.17324.jaime.perea@gmail.com>
Message-ID: <322301899.20080505155433@gmx.net>

Hello Jaime,

On Monday, 5 May 2008, you wrote:

JP> Hello,
JP> Just a small question, does anybody has experience with many core
JP> (16) nodes and infiniband? Since we have some users that need
JP> shared memory but also we want to build a normal cluster for
JP> mpi apps, we think that this could be a solution. Let's say about
JP> 8 machines (96 processors) pus infiniband. Does it sound correct?
JP> I'm aware of the bottleneck that means having one ib interface for
JP> the mpi cores, is there any possibility of bonding?

Bonding (or multi-rail) does not make sense with "standard IB" in PCIe x8, since the PCIe connection already limits the transfer rate of a single IB link.

My hint would be to go for InfiniPath from QLogic or the new ConnectX from Mellanox, since message rate is probably your limiting factor and those technologies have a huge advantage over standard InfiniBand SDR/DDR.

InfiniPath and ConnectX are available as DDR InfiniBand and provide a bandwidth of more than 1800 MB/s.

Cheers,
Jan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080505/ed89a351/attachment.html

From Shainer at mellanox.com Mon May 5 10:01:42 2008
From: Shainer at mellanox.com (Gilad Shainer)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] many cores and ib
In-Reply-To: <322301899.20080505155433@gmx.net>
Message-ID: <9FA59C95FFCBB34EA5E42C1A8573784F01188361@mtiexch01.mti.com>

>> Hello,
>> Just a small question, does anybody has experience with many core
>> (16) nodes and InfiniBand? Since we have some users that need
>> shared memory but also we want to build a normal cluster for
>> mpi apps, we think that this could be a solution. Let's say about
>> 8 machines (96 processors) pus infiniband. Does it sound correct?
>> I'm aware of the bottleneck that means having one ib interface for
>> the mpi cores, is there any possibility of bonding?
>
> Bonding (or multi-rail) does not make sense with "standard IB" in PCIe x8 since the PCIe connection limits the transfer rate of a single IB-Link already.
>
> My hint would be to go for Infinipath from QLogic or the new ConnectX from Mellanox since message rate is probably your limiting factor and those technologies have a huge advantage over standard Infiniband SDR/DDR.
>
>
> Infinipath and ConnectX are available as DDR Infiniband and provide a bandwidth of more than 1800 MB/s.

Bonding can provide more bandwidth if needed. Each PCIe Gen1 x8 slot can provide, on average, around 1500 MB/s, so with IB DDR (it does not need to be ConnectX) you will get 1500 MB/s unidirectional from each PCIe Gen1 x8 slot. According to the OSU benchmarks, InfiniHost III Ex provides more than 20M MPI messages per second.

Of course, moving to ConnectX gives you the option of using servers with PCIe Gen2 slots, where each slot provides around 3300 MB/s with ConnectX IB QDR, and 6500 MB/s bidirectional. If you use the DDR option with ConnectX, the bandwidth will be a little higher than what Jan mentioned, but it is in the same ballpark.

Gilad.

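
For readers who want to see where the slot figures in the message above sit relative to the link limits, here is a minimal sketch. The per-lane signalling rates (2.5 and 5 GT/s for PCIe Gen1/Gen2; 2.5, 5 and 10 Gbit/s per lane for IB SDR/DDR/QDR) and the 8b/10b encoding are the published link parameters; the 1500 and 3300 MB/s "observed" values are the ones quoted in the message. The function names are made up for this example.

```python
# With 8b/10b encoding, every payload byte costs 10 bits on the wire,
# so data MB/s per lane = signalling rate in Mbit/s divided by 10.
PCIE_MBIT_PER_LANE = {"gen1": 2500, "gen2": 5000}
IB_MBIT_PER_LANE = {"SDR": 2500, "DDR": 5000, "QDR": 10000}


def pcie_peak_mb_s(gen: str, lanes: int) -> int:
    """Theoretical one-direction PCIe data rate in MB/s, before protocol overhead."""
    return PCIE_MBIT_PER_LANE[gen] // 10 * lanes


def ib_peak_mb_s(speed: str, lanes: int = 4) -> int:
    """Theoretical one-direction InfiniBand data rate in MB/s (4x link by default)."""
    return IB_MBIT_PER_LANE[speed] // 10 * lanes


if __name__ == "__main__":
    # "Observed" numbers are the per-slot figures quoted in the message.
    for gen, observed in (("gen1", 1500), ("gen2", 3300)):
        peak = pcie_peak_mb_s(gen, 8)
        print(f"PCIe {gen} x8: peak {peak} MB/s, quoted {observed} MB/s "
              f"({observed / peak:.1%} of peak)")
    for speed in ("SDR", "DDR", "QDR"):
        print(f"IB 4x {speed}: peak {ib_peak_mb_s(speed)} MB/s")
```

On these numbers a 4x DDR link and a PCIe Gen1 x8 slot have the same theoretical ceiling, so the gap between that ceiling and the quoted 1500 MB/s is protocol overhead rather than the link itself.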
From john.hearns at streamline-computing.com Mon May 5 12:40:56 2008
From: john.hearns at streamline-computing.com (John Hearns)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] Nvidia, cuda, tesla and... where's my double floating point?
In-Reply-To:
References:
Message-ID: <1210016466.4924.1.camel@Vigor13>

On Fri, 2008-05-02 at 14:05 +0100, Ricardo Reis wrote:
> Does anyone knows if/when there will be double floating point on those
> little toys from nvidia?
>
>

Ricardo, I think CUDA is a great concept, and I am starting to work with it at home. I recently went to a talk by David Kirk, as part of the "world tour". I think the answer to your question is Real Soon Now.

From lindahl at pbm.com Mon May 5 13:32:41 2008
From: lindahl at pbm.com (Greg Lindahl)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] many cores and ib
In-Reply-To: <9FA59C95FFCBB34EA5E42C1A8573784F01188361@mtiexch01.mti.com>
References: <322301899.20080505155433@gmx.net> <9FA59C95FFCBB34EA5E42C1A8573784F01188361@mtiexch01.mti.com>
Message-ID: <20080505203241.GA27918@bx9.net>

On Mon, May 05, 2008 at 10:01:42AM -0700, Gilad Shainer wrote:

> According to OSU benchmarks, InfiniHost III Ex provides >20M MPI message
> per second.

And we should all remember that this test result uses message coalescing, which in real life will not help many-core nodes talking to lots of other nodes. It only helps 2 nodes talking to only each other.

I mean, you'd hate to mislead your customers about the performance of your HCA, right? Still, you should be applauded for not only using latency and bandwidth.

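
The point about coalescing can be illustrated with a toy model. This is only a sketch under a strong assumption: the sender merges back-to-back messages to the same destination into one wire packet, up to a fixed batch size. It is not how MVAPICH or any other MPI library implements coalescing, and the names `packets_sent` and `round_robin` are hypothetical.

```python
from itertools import cycle, islice


def packets_sent(destinations, max_batch):
    """Count wire packets when consecutive messages to the same
    destination are merged, with at most max_batch messages per packet."""
    packets = 0
    current_dest = None
    in_batch = 0
    for dest in destinations:
        if dest == current_dest and in_batch < max_batch:
            in_batch += 1          # rides along in the current packet
        else:
            packets += 1           # new destination or full batch: new packet
            current_dest = dest
            in_batch = 1
    return packets


def round_robin(n_messages, n_peers):
    """Destination sequence for a sender cycling over n_peers peers."""
    return list(islice(cycle(range(n_peers)), n_messages))


if __name__ == "__main__":
    n = 1024
    for peers in (1, 16):
        pkts = packets_sent(round_robin(n, peers), max_batch=8)
        print(f"{peers:2d} peer(s): {n} messages -> {pkts} packets")
```

In this model a single peer lets every packet carry a full batch, while a sender that cycles over many peers gets no merging at all, so a two-node message-rate test shows a benefit that the many-peer pattern does not.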
-- greg

From Shainer at mellanox.com Mon May 5 13:42:49 2008
From: Shainer at mellanox.com (Gilad Shainer)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] many cores and ib
In-Reply-To: <20080505203241.GA27918@bx9.net>
Message-ID: <9FA59C95FFCBB34EA5E42C1A8573784F01188404@mtiexch01.mti.com>

> -----Original Message-----
> From: beowulf-bounces@beowulf.org
> [mailto:beowulf-bounces@beowulf.org] On Behalf Of Greg Lindahl
> Sent: Monday, May 05, 2008 1:33 PM
> To: beowulf@beowulf.org
> Subject: Re: [Beowulf] many cores and ib
>
> On Mon, May 05, 2008 at 10:01:42AM -0700, Gilad Shainer wrote:
>
> > According to OSU benchmarks, InfiniHost III Ex provides >20M MPI
> > message per second.
>
> And we should all remember that this test result uses message
> coalescing, which in real life will not help many-core nodes
> talking to lots of other nodes. It only helps 2 nodes talking
> to only each other.
>
> I mean, you'd hate to mislead your customers about the
> performance of your HCA, right? Still, you should be
> applauded for not only using latency and bandwidth.
>
> -- greg
>

It is the same benchmark that QLogic was and is using for MPI message rate, and I guess you know that better than I do, don't you? I want to make sure that anyone doing a comparison uses the same benchmark and output to compare.

Gilad.

From lindahl at pbm.com Mon May 5 13:48:51 2008
From: lindahl at pbm.com (Greg Lindahl)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] many cores and ib
In-Reply-To: <9FA59C95FFCBB34EA5E42C1A8573784F01188404@mtiexch01.mti.com>
References: <20080505203241.GA27918@bx9.net> <9FA59C95FFCBB34EA5E42C1A8573784F01188404@mtiexch01.mti.com>
Message-ID: <20080505204851.GB18178@bx9.net>

On Mon, May 05, 2008 at 01:42:49PM -0700, Gilad Shainer wrote:

> It is the same benchmark that QLogic were and are using for MPI message
> rate,

Like many benchmarks, you can game the benchmark, and you gamed the benchmark.
I guess you don't understand the technical details well enough to get what's going on; we had a longer discussion of this a while ago on this list.

I caught HP doing the same thing with lmbench's lat_mem_rd benchmark a while ago, and they immediately apologized and stopped using it in their marketing literature. Which shows a lot of class.

-- greg

From Shainer at mellanox.com Mon May 5 13:56:55 2008
From: Shainer at mellanox.com (Gilad Shainer)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] many cores and ib
In-Reply-To: <20080505204851.GB18178@bx9.net>
Message-ID: <9FA59C95FFCBB34EA5E42C1A8573784F0118840E@mtiexch01.mti.com>

> -----Original Message-----
> From: Greg Lindahl [mailto:lindahl@pbm.com]
> Sent: Monday, May 05, 2008 1:49 PM
> To: Gilad Shainer
> Cc: beowulf@beowulf.org
> Subject: Re: [Beowulf] many cores and ib
>
> On Mon, May 05, 2008 at 01:42:49PM -0700, Gilad Shainer wrote:
>
> > It is the same benchmark that QLogic were and are using for MPI
> > message rate,
>
> Like many benchmarks, you can game the benchmark, and you
> gamed the benchmark. I guess you don't understand the
> technical details well enough to get what's going on; we had
> a longer discussion of this a while ago on this list.
>
> I caught HP doing the same thing with lm_bench's lat_mem_rd
> benchmark a while ago, and they immediately apologized and
> stopped using it in their marketing literature. Which shows a
> lot of class.
>
> -- greg
>

This is going nowhere. QLogic's former benchmark for MPI message rate is now also available with OSU MPI, so that the same comparison can be made. I do not want this conversation to become a marketing bash, especially since you are no longer with QLogic, and your email provides a good reason why.

Gilad.

From tom.elken at qlogic.com Mon May 5 15:07:34 2008
From: tom.elken at qlogic.com (Tom Elken)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] many cores and ib
In-Reply-To: <322301899.20080505155433@gmx.net>
References: <200805051543.17324.jaime.perea@gmail.com> <322301899.20080505155433@gmx.net>
Message-ID: <6DB5B58A8E5AB846A7B3B3BFF1B4315A01F670C1@AVEXCH1.qlogic.org>

>> Since we have some users that need
>> shared memory but also we want to build a normal cluster for
>> mpi apps, we think that this could be a solution. Let's say about
>> 8 machines (96 processors) pus infiniband. Does it sound correct?
>> I'm aware of the bottleneck that means having one ib interface for
>> the mpi cores, is there any possibility of bonding?

> Bonding (or multi-rail) does not make sense with "standard IB" in PCIe
> x8 since the PCIe connection limits the transfer rate of a single
> IB-Link already.

PCIe x8 Gen2 provides additional bandwidth, as Gilad said. On Opteron systems that is not available yet (and won't be for some time), so you may want to search for AMD-CPU or Intel-CPU based boards that have PCIe x16 slots.

> My hint would be to go for Infinipath from QLogic or the new ConnectX from Mellanox since message rate is probably your limiting factor and those technologies have a huge advantage over standard Infiniband SDR/DDR.

I agree that message rate may be your limiting factor. Results with QLogic (aka InfiniPath) DDR adapters:

DDR Adapter         Peak MPI Bandwidth   Peak Message Rate (no message coalescing**)
QLE7280 PCIe x16    1950 MB/s            20-26* Million/sec (8 ppn)
QLE7240 PCIe x8     1500 MB/s            19 Million/sec (8 ppn)

Test details: All run on two nodes, each with 2x Intel Xeon 5410 (Harpertown, quad-core, 2.33 GHz CPUs), 8 cores per node, SLES 10, except:
* 26 M messages/sec requires faster CPUs, 3 to 3.2 GHz.

8 ppn means 8 MPI processes per node. The non-coalesced message rate performance of these adapters scales pretty linearly from 1 to 8 cores.
That is not the case with all modern DDR adapters.

Benchmark = OSU Multiple Bandwidth, Message Rate benchmark, osu_mbw_mr.c
The above performance results can be had with either MVAPICH 1.0 or QLogic MPI 2.2 (other MPIs are in the same ballpark with these adapters).

Note that MVAPICH 0.9.9 had message-coalescing on by default, and MVAPICH 1.0 has it off by default. There must be a reason.

Revisiting:
>
> Bonding (or multi-rail) does not make sense with "standard IB" in PCIe
> x8 since the PCIe connection limits the transfer rate of a single
> IB-Link already.

Some 4-socket motherboards have independent PCIe buses to x8 or x16 slots. In this case, multi-rail does make sense. You can run the QLogic adapters as dual-rail without bonding. On MPI applications, half of the cores will use one adapter and half will use the other. Whether the more expensive dual-rail arrangement is necessary and/or cost-effective would be very application-specific.

Regards,
-Tom Elken

> Infinipath and ConnectX are available as DDR Infiniband and provide a bandwidth of more than 1800 MB/s

Good suggestion.

From Shainer at mellanox.com Mon May 5 15:32:10 2008
From: Shainer at mellanox.com (Gilad Shainer)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] many cores and ib
In-Reply-To: <6DB5B58A8E5AB846A7B3B3BFF1B4315A01F670C1@AVEXCH1.qlogic.org>
Message-ID: <9FA59C95FFCBB34EA5E42C1A8573784F01188447@mtiexch01.mti.com>

> >> Since we have some users that need
> >> shared memory but also we want to build a normal cluster for mpi
> >> apps, we think that this could be a solution. Let's say about
> >> 8 machines (96 processors) plus infiniband. Does it sound correct?
> >> I'm aware of the bottleneck that means having one ib interface for
> >> the mpi cores, is there any possibility of bonding?
> >
> > Bonding (or multi-rail) does not make sense with "standard IB" in PCIe
> > x8 since the PCIe connection limits the transfer rate of a single
> > IB-Link already.
> > PCIe x8 Gen2 provides additional bandwidth as Gilad said. On > Opteron systems that is not available yet (and won't be for > some time), so you may want to search for AMD-CPU or > Intel-CPU based boards that have PCIe > x16 slots. > One more useful info, is that there are couple of installation in Japan where they use 4 "regular IB DDR" adapters in 4 PCIe x8 slots to provide 6GB/s (1500MB per slot) and they do bonding to have it as a single pipe. If you plan to use Intel you can use PCIe Gen2 with IB QDR and get 3200MB per PCIe Gen2 slot. > > My hint would be to go for Infinipath from QLogic or the > new ConnectX > from Mellanox since message rate is probably your limiting > factor and those technologies have a huge advantage over > standard Infiniband SDR/DDR. > > I agree that message rate may be your limiting factor. > Results with QLogic (aka InfiniPath) DDR adapters: > > DDR Peak MPI Bandwidth Peak Message Rate > Adapter (no message coalescing**) > QLE7280 PCIe x16 1950 MB/s 20-26* > Million/sec (8 ppn) > QLE7240 PCIe x8 1500 MB/s 19 > Million/sec (8 ppn) > > Test details: All run on two nodes, each with 2x Intel Xeon > 5410 (Harpertown, quad-core, 2.33 GHz CPUs), 8 cores per > node, SLES 10. > except, > * 26 M messages/sec requires faster CPUs, 3 to 3.2 Ghz. > > 8 ppn means 8 MPI processes per node. The non-coalesced > message rate performance of these adapters scales pretty > linearly from 1 to 8 cores. > That is not the case with all modern DDR adapters. > As Tom wrote, the message rate depends on the number of CPUs. With the benchmark Tom indicated below and the same CPU, you can get up to 42M msg/sec with ConnectX. > Benchmark = OSU Multiple Bandwidth, Message Rate benchmark, > osu_mbw_mr.c The above performace results can be had with > either MVAPICH 1.0 or QLogic MPI 2.2 (other MPIs are in the > same ballpark with these adapters). > > Note that MVAPICH 0.9.9 had meassage-coalescing on by > default, and MVAPICH 1.0 has it off by default. 
There must > be a reason. As far as I know, the reason for that was to have the user pick his choice. As OSU mentioned, there are some applications when this helps and some that it does not. Gilad. > > Revisiting: > > > > Bonding (or multi-rail) does not make sense with "standard > IB" in PCIe > > x8 since the PCIe connection limits the transfer rate of a single > > IB-Link already. > > Some 4-socket motherboards have independent PCIe buses to x8 > or x16 slots. In this case, multi-rail does make sense. You > can run the QLogic adapters as dual-rail without bonding. On > MPI applications, half > of the cores will use one adapter and half will use the > other. Whether > the more expensive dual-rail arrangement is necessary and/or > cost-effective would be very application-specific. > > Regards, > -Tom Elken > > > > > > > > > > > Infinipath and ConnectX are available as DDR Infiniband and > provide a > > bandwidth of more than 1800 MB/s > > > > Good suggestion. > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org To change your > subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > From jan.heichler at gmx.net Mon May 5 23:43:53 2008 From: jan.heichler at gmx.net (Jan Heichler) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] many cores and ib In-Reply-To: <9FA59C95FFCBB34EA5E42C1A8573784F01188361@mtiexch01.mti.com> References: <322301899.20080505155433@gmx.net> <9FA59C95FFCBB34EA5E42C1A8573784F01188361@mtiexch01.mti.com> Message-ID: <61320720.20080506084353@gmx.net> Hallo Gilad, Montag, 5. Mai 2008, meintest Du: >> Bonding (or multi-rail) does not make sense with "standard IB" in PCIe GS> x8 since the PCIe connection limits the transfer rate of a single GS> IB-Link already. 
>> My hint would be to go for Infinipath from QLogic or the new ConnectX
GS> from Mellanox since message rate is probably your limiting factor and
GS> those technologies have a huge advantage over standard Infiniband
GS> SDR/DDR.
>> Infinipath and ConnectX are available as DDR Infiniband and provide a
GS> bandwidth of more than 1800 MB/s.
GS>
GS>
GS> Bonding can provide more bandwidth if needed. Each PCIe x8 slot can
GS> provide (on average) around 1500MB/s, therefore using IB DDR (no need to
GS> be ConnectX), you will get 1500MB/s uni-dir from each PCIe Gen1 x8 slot.

Ahh... sorry... I was just thinking about dual-port cards. Not different cards in different slots.

Cheers,
Jan

From jan.heichler at gmx.net Mon May 5 23:46:26 2008
From: jan.heichler at gmx.net (Jan Heichler)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] many cores and ib
In-Reply-To: <200805051543.17324.jaime.perea@gmail.com>
References: <200805051543.17324.jaime.perea@gmail.com>
Message-ID: <11210597665.20080506084626@gmx.net>

Hello Jaime,

On Monday, 5 May 2008, you wrote:

JP> Hello,
JP> Just a small question, does anybody have experience with many core
JP> (16) nodes and infiniband? Since we have some users that need
JP> shared memory but also we want to build a normal cluster for
JP> mpi apps, we think that this could be a solution. Let's say about
JP> 8 machines (96 processors) plus infiniband. Does it sound correct?
JP> I'm aware of the bottleneck that means having one ib interface for
JP> the mpi cores, is there any possibility of bonding?

I had another idea. Roughly one can say that a quad socket system is three times as expensive as a dual socket system -> so you spend a lot of money for "some users".
Maybe it is worth thinking about using dual-socket machines and test if scaleMP gives you a good performance over two of them combined. I have no experience with that but maybe you can find that out... For the interconnect it would be useful to know what and how your application is communicating. Big messages? Small messages? Complex operations? Maybe you can find out. Cheers, Jan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080506/d84e357a/attachment.html From patrick at myri.com Tue May 6 01:46:16 2008 From: patrick at myri.com (Patrick Geoffray) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] many cores and ib In-Reply-To: <9FA59C95FFCBB34EA5E42C1A8573784F01188404@mtiexch01.mti.com> References: <9FA59C95FFCBB34EA5E42C1A8573784F01188404@mtiexch01.mti.com> Message-ID: <48201AD8.9000802@myri.com> Gilad Shainer wrote: > It is the same benchmark that QLogic were and are using for MPI message > rate, and I guess you know that better then me, don't you?.... I want > to make sure when one do a comparison he/she will be using the same > benchmark/output to compare. It is not the benchmark, it's the MPI implementation. The benchmark in itself is stupid, because it sends a gazillion messages to a single node. The MPI implementation is dishonest, because it says "eh, you are trying to send a gazillion messages to a single node, let me pack them into a single message on the wire for you", completely changing what the benchmark is trying to measure. You are a marketing guy, you just repeat the numbers without understanding what they mean. Message coalescing in MVAPICH does nothing but make the message rate micro-benchmark irrelevant, it was designed that way, and only for that purpose. With message coalescing, *everybody* can send 20 Million messages per second, as long as you have over 1GB/s of bandwidth. 
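A back-of-envelope sketch of the coalescing point above, in Python. It is not taken from MVAPICH or from osu_mbw_mr.c; the function name, the 5 million packets/sec injection rate, the 8 messages per packet and the 50 bytes per message on the wire are all assumed numbers, picked only so that 20 million messages/sec works out to 1 GB/s.

```python
def apparent_message_rate(link_bytes_per_s, bytes_per_msg,
                          wire_packets_per_s, msgs_per_packet):
    """Rough upper bound on the MPI messages/sec a message-rate test reports.

    All parameters are illustrative assumptions, not measured values.
    """
    # Limit 1: how many MPI messages fit through the link per second.
    bandwidth_limit = link_bytes_per_s / bytes_per_msg
    # Limit 2: packets the adapter can inject per second, times the number
    # of MPI messages packed (coalesced) into each packet.
    injection_limit = wire_packets_per_s * msgs_per_packet
    return min(bandwidth_limit, injection_limit)


# Hypothetical adapter: 1 GB/s link, 5 million packets/sec, 50 bytes/message.
no_coalescing = apparent_message_rate(1e9, 50, 5e6, 1)
with_coalescing = apparent_message_rate(1e9, 50, 5e6, 8)
print(no_coalescing, with_coalescing)  # 5000000.0 20000000.0
```

With these made-up numbers, coalescing moves the reported figure from the adapter's injection limit up to the bandwidth limit, so the result says little about the adapter itself.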
This is like the header caching "optimization": change the MPI tag for each Send in your pingpong benchmark, and see your latency go up. It's because the MPI implementation is smart enough to say "eh, you are sending the same message envelope over and over, let me compact the MPI header for you". It does not help anything but a micro-benchmark.

I can imagine the next optimization from here: if you happen to send messages full of zeros in your ping-pong, MVAPICH will "compress" them for you. And somewhere, someone will claim a gazillion bytes per second...

Patrick

From hahn at mcmaster.ca Tue May 6 15:20:18 2008
From: hahn at mcmaster.ca (Mark Hahn)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] Purdue Supercomputer
In-Reply-To: <481D34EE.6000908@younts.org>
References: <854mecToe0708S08.1209842104@cmsweb08.cms.usa.net> <481D34EE.6000908@younts.org>
Message-ID:

> We have built out a beefy install infrastructure to support a lot of
> simultaneous installs...

I'm curious to hear about the infrastructure.

btw: http://www.eetimes.com/news/latest/showArticle.jhtml?articleID=207501882

From matt at technoronin.com Sun May 4 09:23:36 2008
From: matt at technoronin.com (Matt Lawrence)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] Purdue Supercomputer
In-Reply-To: <481D34EE.6000908@younts.org>
References: <854mecToe0708S08.1209842104@cmsweb08.cms.usa.net> <481D34EE.6000908@younts.org>
Message-ID:

That's going to be a nifty stunt. Proving that such an install not only can be done, it has been done. Wish I was involved.

Other suggestions: Make sure you have somebody responsible for the power on site, blowing breakers and not being able to fix the problems would be embarrassing. Also, make sure you have a really good HVAC person on site as well, it would be really bad to cook all of the equipment.

Please post some detailed results, I want to hear what worked really well and what had problems (and those solutions).

--
Matt
It's not what I know that counts.
It's what I can remember in time to use.

From rreis at aero.ist.utl.pt Sun May 4 13:11:16 2008
From: rreis at aero.ist.utl.pt (Ricardo Reis)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] Nvidia, cuda, tesla and... where's my double floating point?
In-Reply-To:
References:
Message-ID:

On Sun, 4 May 2008, Mikhail Kuzminsky wrote:

> "Next generation Tesla", but I don't know when. Or use AMD FireStream 9170
> instead :-)

I've read somewhere that double precision performance from AMD wasn't very good and their programming model goes more towards assembly... Besides, AMD/ATI still have to convince on their linux drivers. Unless... do you have any experience with that hardware you could share?

greets,

Ricardo Reis

'Non Serviam'

PhD student @ Lasef
Computational Fluid Dynamics, High Performance Computing, Turbulence
http://www.lasef.ist.utl.pt

&

Cultural Instigator @ Rádio Zero
http://www.radiozero.pt

http://www.flickr.com/photos/rreis/

From gulatiakshay at gmail.com Mon May 5 13:40:48 2008
From: gulatiakshay at gmail.com (akshay gulati)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] many cores and ib
In-Reply-To: <20080505203241.GA27918@bx9.net>
References: <322301899.20080505155433@gmx.net> <9FA59C95FFCBB34EA5E42C1A8573784F01188361@mtiexch01.mti.com> <20080505203241.GA27918@bx9.net>
Message-ID: <15c7ce400805051340qbc9b15dxb2d38f9af53de030@mail.gmail.com>

How can I get started with Beowulf and HPC? Any ideas?

On Tue, May 6, 2008 at 2:02 AM, Greg Lindahl wrote:

> On Mon, May 05, 2008 at 10:01:42AM -0700, Gilad Shainer wrote:
>
> > According to OSU benchmarks, InfiniHost III Ex provides >20M MPI message
> > per second.
>
> And we should all remember that this test result uses message
> coalescing, which in real life will not help many-core nodes talking
> to lots of other nodes. It only helps 2 nodes talking to only each
> other.
>
> I mean, you'd hate to mislead your customers about the performance of
> your HCA, right?
Still, you should be applauded for not only using > latency and bandwidth. > > -- greg > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080506/996e23a9/attachment.html From James.P.Lux at jpl.nasa.gov Tue May 6 16:41:20 2008 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] Purdue Supercomputer In-Reply-To: References: <854mecToe0708S08.1209842104@cmsweb08.cms.usa.net> <481D34EE.6000908@younts.org> Message-ID: <6.2.3.4.2.20080506163930.02d47660@mail.jpl.nasa.gov> At 03:20 PM 5/6/2008, Mark Hahn wrote: >>We have built out a beefy install infrastructure to support a lot >>of simultaneous installs... > >I'm curious to hear about the infrastructure. > >btw: >http://www.eetimes.com/news/latest/showArticle.jhtml?articleID=207501882 Interesting... 1000 computers, assume it takes 30 seconds to remove from the box and walk to the rack. that's 30,000 seconds, or about 500 minutes.. call it 8 hours. Assume you've got 10 racks and 10 people, so you get some parallelism... an hour to unpack and rack one pile. What wasn't shown in the video.. all the plugging and routing of network cables? Jim From Shainer at mellanox.com Tue May 6 18:20:15 2008 From: Shainer at mellanox.com (Gilad Shainer) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] many cores and ib In-Reply-To: <48201AD8.9000802@myri.com> Message-ID: <9FA59C95FFCBB34EA5E42C1A8573784F011885EC@mtiexch01.mti.com> Patrick Geoffray wrote: > > It is the same benchmark that QLogic were and are using for MPI > > message rate, and I guess you know that better then me, > don't you?.... 
> > I want to make sure when one do a comparison he/she will be > using the > > same benchmark/output to compare. > > It is not the benchmark, it's the MPI implementation. My apologizes. I meant the MPI includes an option to collect several MPI messages into one network message. For applications cases, sometimes it helps with performance and sometimes it does not. OSU have shown both cases, and every user can decide what works best for him. > The benchmark in itself is stupid, because it sends a gazillion > messages to a single node. The MPI implementation is > dishonest, because it says "eh, you are trying to send a > gazillion messages to a single node, let me pack them into a > single message on the wire for you", completely changing what > the benchmark is trying to measure. As long as you use the same code and the same benchmark (and the same platform), you can use any benchmark to compare between different network devices. You can claim that if you want to count the network messages and not the MPI messages, using this way to do it will give you the wrong number, but if you want to compare between 2 interconnects, this is valid. > You are a marketing guy, I see it as a compliment, so thanks, much appreciated. > you just repeat the numbers without > understanding what they mean. This becomes personal now... :-) you don't need to be angry all the time. Bad for your health. I could have replay back mentioning all the FUD and misleading information you provide wherever you go, and we saw couple of examples just few weeks ago, but I wont. > I can imagine the next optimization from here: if you happen > to send messages full of zeros in your ping-pong, MVAPICH > will "compress" them for you. And somewhere, someone will > claim a gazillion bytes per second... Old joke. I am trying to use jokes only once. At the end of the day, the best way is to benchmark your applications and see what gets you better performance. 
If someone want to benchmark IB he can use the benchmark center at Mellanox. By the way, since ISC08 conf is just around the corner, you are welcome to visit our booth and see several IB QDR demonstrations. Gilad. > > Patrick > From hahn at mcmaster.ca Tue May 6 20:55:39 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] many cores and ib In-Reply-To: <9FA59C95FFCBB34EA5E42C1A8573784F011885EC@mtiexch01.mti.com> References: <9FA59C95FFCBB34EA5E42C1A8573784F011885EC@mtiexch01.mti.com> Message-ID: > messages into one network message. For applications cases, sometimes it > helps with performance and sometimes it does not. OSU have shown both when would a program deliberately send such messages? isn't it something that the program should avoid in the first place? does the MPI optimization apply to messages that differ in source/dest rank and/or tag? From hahn at mcmaster.ca Wed May 7 06:33:06 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] Purdue Supercomputer In-Reply-To: <4820F7A2.2040005@younts.org> References: <854mecToe0708S08.1209842104@cmsweb08.cms.usa.net> <481D34EE.6000908@younts.org> <6.2.3.4.2.20080506163930.02d47660@mail.jpl.nasa.gov> <4820F7A2.2040005@younts.org> Message-ID: > everything was going. This morning, we hit the last few mis-installs. Our DOA > nodes were around 1% of the total order.. one advantage of having the vendor pre-rack is that they usually also pre-test. did you consider having dell pre-assemble the cluster, and reject it for cost reasons? > The physical networking was done in a new way for us.. We used a large > Foundry switch and the MRJ21 cabling system for it. Each racks gets 24 nodes, > a 24 port passive patch panel, and 4 MRJ21 cables that run back to the if I understand, this means each node has a 1Gb link to a large switch, right? I'm a little surprised this was cost-effective - what is the intended workload of the cluster? 
(I mean given that Gb is usually considered high-latency and low-bandwidth.) I'd be curious to hear about your consideration of both 10G and IB. From gerry.creager at tamu.edu Wed May 7 07:09:00 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] Purdue Supercomputer In-Reply-To: References: <854mecToe0708S08.1209842104@cmsweb08.cms.usa.net> <481D34EE.6000908@younts.org> <6.2.3.4.2.20080506163930.02d47660@mail.jpl.nasa.gov> <4820F7A2.2040005@younts.org> Message-ID: <4821B7FC.5030209@tamu.edu> We're not as big as Purdue in this but we just installed a 10TF Dell system. We specifically designed with 1Gbe to reinforce the concept that our new cluster is a high-throughput system rather than HPC. Jobs that can concentrate well on a node (or two) should do nicely, while HPC jobs can run on other campus resources with bigger, badder, faster interconnects. gerry Mark Hahn wrote: >> everything was going. This morning, we hit the last few mis-installs. >> Our DOA nodes were around 1% of the total order.. > > one advantage of having the vendor pre-rack is that they usually also > pre-test. did you consider having dell pre-assemble the cluster, and > reject it for cost reasons? > >> The physical networking was done in a new way for us.. We used a large >> Foundry switch and the MRJ21 cabling system for it. Each racks gets 24 >> nodes, a 24 port passive patch panel, and 4 MRJ21 cables that run back >> to the > > if I understand, this means each node has a 1Gb link to a large switch, > right? I'm a little surprised this was cost-effective - what is the > intended > workload of the cluster? (I mean given that Gb is usually considered > high-latency and low-bandwidth.) I'd be curious to hear about your > consideration of both 10G and IB. 
--
Gerry Creager -- gerry.creager@tamu.edu
Texas Mesonet -- AATLT, Texas A&M University
Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983
Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843

From mathog at caltech.edu Wed May 7 08:36:32 2008
From: mathog at caltech.edu (David Mathog)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] Re: Purdue Supercomputer
Message-ID:

Jim Lux wrote

> What wasn't shown in the video.. all the plugging and routing of
> network cables?

What about labeling the cables? My rack's nodes' ethernet and power cables are marked with the node's name at each end. It was a low tech approach involving printing names, cutting paper into strips with scissors, and taping them on. There are other types of label, many based on modified zip ties, but they still have to be applied by hand.

Regards,

David Mathog
mathog@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech

From alex at younts.org Tue May 6 17:28:18 2008
From: alex at younts.org (Alex Younts)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] Purdue Supercomputer
In-Reply-To: <6.2.3.4.2.20080506163930.02d47660@mail.jpl.nasa.gov>
References: <854mecToe0708S08.1209842104@cmsweb08.cms.usa.net> <481D34EE.6000908@younts.org> <6.2.3.4.2.20080506163930.02d47660@mail.jpl.nasa.gov>
Message-ID: <4820F7A2.2040005@younts.org>

So, more or less, the install day was a major success for us here at Purdue. The party got started at 8:00AM local time.. We had around 40 people unboxing machines in the loading dock. After an hour, they had gone through nearly 20 pallets of boxes. (We asked them to take a break for some of the free breakfast we had..)
The bottleneck in getting machines racked was the limited aisle space between the two rack rows and needing to get the rails ahead of the actual machines. Around 12:00PM enough was unpacked, racked, and cabled to begin software installation. By 1:00PM, 500 nodes were up and jobs were running. By 4:00PM, everything was going. This morning, we hit the last few mis-installs. Our DOA nodes were around 1% of the total order..

One of our nanotech researchers here got in a hero run of his code, and pronounced the cluster perfect early this morning. Not a bad turnaround and a very happy customer.

We were blown away by how quickly the teams moved through their jobs. Of course, it wasn't surprising because we pulled a lot of the technical talent from IT shops all around the University to work in two-hour shifts. It was a great time to socialize and get to know the faces behind the emails. The massive preparation effort that took place beforehand brought the research computing group, the central networking group and the data center folks together in ways that hadn't happened before.

The physical networking was done in a new way for us.. We used a large Foundry switch and the MRJ21 cabling system for it. Each rack gets 24 nodes, a 24-port passive patch panel, and 4 MRJ21 cables that run back to the network switch. Then, there are just short patch cables between the panel and each node in the rack (running through a side-mounted cable manager). Eventually, there'll be a cheap 24-port 100 Mbps switch in each rack to provide dedicated out-of-band management to each node. Most of the cabling was done by two-person teams: one person unwrapping cables and the other running the cables in the rack. This process wasn't the speediest, but things certainly look nice on the backside..

The installation infrastructure was revitalized for this install. We normally kickstart each node and then set up cfengine to run on the first boot.
Cfengine will go ahead and bring the node into the cluster. To support this new cluster, we took five Dell 1850's and turned them into an IPVS cluster. One was the manager, the others serving bots. They ran cfengine and apache (providing both cfengine and the kickstart packages). Since we use RedHat Enterprise for the OS on each node, we upgraded the campus proxy server from a Dell 2650 to a beefy Sun x4200. To keep a lot of load off the proxy, we kickstarted using the latest release of RHEL4.

So, there are some of the nitty-gritty details of what it took to get this thing off the ground in just a few hours.

--
Alex Younts

Jim Lux wrote:
> At 03:20 PM 5/6/2008, Mark Hahn wrote:
>>> We have built out a beefy install infrastructure to support a lot of
>>> simultaneous installs...
>>
>> I'm curious to hear about the infrastructure.
>>
>> btw:
>> http://www.eetimes.com/news/latest/showArticle.jhtml?articleID=207501882
>
> Interesting...
>
> 1000 computers, assume it takes 30 seconds to remove from the box and
> walk to the rack. that's 30,000 seconds, or about 500 minutes.. call it
> 8 hours. Assume you've got 10 racks and 10 people, so you get some
> parallelism... an hour to unpack and rack one pile.
>
>
> What wasn't shown in the video.. all the plugging and routing of network
> cables?
>
> Jim

From beejstone3 at yahoo.com Wed May 7 10:10:16 2008
From: beejstone3 at yahoo.com (Bill Johnstone)
Date: Sat Jul 5 01:07:04 2008
Subject: [Beowulf] Recent comparisons of 1600 MHz external Harpertown vs. 235x AMD processors?
Message-ID: <891785.95622.qm@web57611.mail.re1.yahoo.com>

Hello all.

Sorry if this has been addressed already -- I searched the mailing list archives and didn't find any answers.

Has there been a recent (i.e., conducted this year with shipping silicon) performance study/benchmark of the "Harpertown" family of Intel Xeon processors running with an external clock of 1600 MHz against a suitable high-end member of the AMD "Barcelona" family, e.g., 2354 or 2356?
I've found various comparisons done in 2007 with pre-release AMD silicon, and conducted by "consumer grade" web sites like Tom's Hardware and Anandtech. I'm looking for something more rigorous, and done with a mind toward parallel/cluster applications, preferably on some form of Linux. I loathe that I'd have to use an nvidia chipset with the AMD processors, but I wouldn't want the dead-end memory architecture of the current Intel chips to become an issue. Thanks! ____________________________________________________________________________________ Be a better friend, newshound, and know-it-all with Yahoo! Mobile. Try it now. http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ From tom.elken at qlogic.com Wed May 7 14:45:32 2008 From: tom.elken at qlogic.com (Tom Elken) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] Recent comparisons of 1600 MHz external Harpertown vs.235x AMD processors? In-Reply-To: <891785.95622.qm@web57611.mail.re1.yahoo.com> References: <891785.95622.qm@web57611.mail.re1.yahoo.com> Message-ID: <6DB5B58A8E5AB846A7B3B3BFF1B4315A01F674BB@AVEXCH1.qlogic.org> > Has there been a recent (i.e., conducted this year with > shipping silicon) performance study/benchmark of the > "Harpertwon" family of Intel Xeon processors running with an > external clock of 1600 MHz against a suitable high-end member > of the AMD "Barcelona" family, e.g., 2354 or 2356? Not a study, but recent published comparisons on an OpenMP application benchmark suite, SPEC OMP2001 are available: AMD Barcelona, Opteron 2356, dual-socket, 2.3 GHz: http://www.spec.org/omp/results/res2008q2/omp2001-20080325-00291.html SPECompMbase2001 = 17598 Intel Harpertown, Intel Xeon E5440, dual-socket, 2.83 GHz: http://www.spec.org/omp/results/res2008q2/omp2001-20080325-00292.html SPECompMbase2001 = 14789 Both these results were run by AMD, but they used an Intel-produced config file on the Intel-CPU-system runs. 
Note that AMD did not use the fastest Harpertown CPUs available, but they are pretty decent speed -- I seem to recall that AMD had a rationale of choosing similarly power-consuming CPUs, but I could be wrong there. The SPEC HPG committee, which includes Intel, reviewed the results to see that they followed the run rules before accepting them for publication. Different compilers were used, but they seem like logical choices for the CPUs involved: PathScale for Barcelona, and Intel for Harpertown. -Tom > -----Original Message----- > From: beowulf-bounces@beowulf.org > [mailto:beowulf-bounces@beowulf.org] On Behalf Of Bill Johnstone > Sent: Wednesday, May 07, 2008 10:10 AM > To: beowulf@beowulf.org > Subject: [Beowulf] Recent comparisons of 1600 MHz external > Harpertown vs.235x AMD processors? > > Hello all. > > Sorry if this has been addressed already -- I searched the > mailing list archives and didn't find any answers. > > Has there been a recent (i.e., conducted this year with > shipping silicon) performance study/benchmark of the > "Harpertwon" family of Intel Xeon processors running with an > external clock of 1600 MHz against a suitable high-end member > of the AMD "Barcelona" family, e.g., 2354 or 2356? > > I've found various comparisons done in 2007 with pre-release > AMD silicon, and conducted by "consumer grade" web sites like > Tom's Hardware and Anandtech. I'm looking for something more > rigorous, and done with a mind toward parallel/cluster > applications, preferably on some form of Linux. > > I loathe that I'd have to use an nvidia chipset with the AMD > processors, but I wouldn't want the dead-end memory > architecture of the current Intel chips to become an issue. > > Thanks! > > > > > ______________________________________________________________ > ______________________ > Be a better friend, newshound, and > know-it-all with Yahoo! Mobile. Try it now. 
> http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) > visit http://www.beowulf.org/mailman/listinfo/beowulf > From jan.heichler at gmx.net Wed May 7 21:03:43 2008 From: jan.heichler at gmx.net (Jan Heichler) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] Recent comparisons of 1600 MHz external Harpertown vs.235x AMD processors? In-Reply-To: <6DB5B58A8E5AB846A7B3B3BFF1B4315A01F674BB@AVEXCH1.qlogic.org> References: <891785.95622.qm@web57611.mail.re1.yahoo.com> <6DB5B58A8E5AB846A7B3B3BFF1B4315A01F674BB@AVEXCH1.qlogic.org> Message-ID: <537414876.20080508060343@gmx.net> Hallo Tom, Mittwoch, 7. Mai 2008, meintest Du: >> Has there been a recent (i.e., conducted this year with >> shipping silicon) performance study/benchmark of the >> "Harpertwon" family of Intel Xeon processors running with an >> external clock of 1600 MHz against a suitable high-end member >> of the AMD "Barcelona" family, e.g., 2354 or 2356? TE> Not a study, but recent published comparisons on an OpenMP application TE> benchmark suite, SPEC OMP2001 are available: TE> AMD Barcelona, Opteron 2356, dual-socket, 2.3 GHz: TE> http://www.spec.org/omp/results/res2008q2/omp2001-20080325-00291.html TE> SPECompMbase2001 = 17598 TE> Intel Harpertown, Intel Xeon E5440, dual-socket, 2.83 GHz: TE> http://www.spec.org/omp/results/res2008q2/omp2001-20080325-00292.html TE> SPECompMbase2001 = 14789 TE> Both these results were run by AMD, but they used an Intel-produced TE> config file on the Intel-CPU-system runs. TE> Note that AMD did not use the fastest Harpertown CPUs available, but TE> they are pretty decent speed -- I seem to recall that AMD had a TE> rationale of choosing similarly power-consuming CPUs, but I could be TE> wrong there. But the Harpertown that the benchmark ran on was not one with FSB1600. 
And this is/could be a real difference here since it affects the memory performance. Cheers, Jan From tom.elken at qlogic.com Thu May 8 09:38:39 2008 From: tom.elken at qlogic.com (Tom Elken) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] Recent comparisons of 1600 MHz external Harpertown vs.235x AMD processors? In-Reply-To: <537414876.20080508060343@gmx.net> References: <891785.95622.qm@web57611.mail.re1.yahoo.com> <6DB5B58A8E5AB846A7B3B3BFF1B4315A01F674BB@AVEXCH1.qlogic.org> <537414876.20080508060343@gmx.net> Message-ID: <6DB5B58A8E5AB846A7B3B3BFF1B4315A0202C28D@AVEXCH1.qlogic.org> ________________________________ From: Jan Heichler [mailto:jan.heichler@gmx.net] AMD Barcelona, Opteron 2356, dual-socket, 2.3 GHz: http://www.spec.org/omp/results/res2008q2/omp2001-20080325-00291.html SPECompMbase2001 = 17598 Intel Harpertown, Intel Xeon E5440, dual-socket, 2.83 GHz: http://www.spec.org/omp/results/res2008q2/omp2001-20080325-00292.html SPECompMbase2001 = 14789 TE> -- I seem to recall that AMD had a TE> rationale of choosing similarly power-consuming CPUs, but I could be TE> wrong there. > But the Harpertown that the benchmark ran on was not one with FSB1600. And this is/could be a real difference here > since it affects the memory performance. You are right, Jan. I checked that the Supermicro motherboard supported the 1600 MHz FSB, but did not check the CPU. E5462 (2.8 GHz) and E5472 (3.0 GHz) are processors that have the 1600 MHz FSB and seem to have the same power specs as the E5440 (80W). The additional FSB speed would certainly make a difference in OMP2001 performance, esp. on memory bandwidth-sensitive codes like swim and mgrid.
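[Editorial aside] To put the FSB speeds under discussion in perspective, here is a minimal back-of-the-envelope sketch of the theoretical peak bandwidth of a single front-side bus. It assumes a 64-bit (8-byte) data path and treats the quoted "MHz" rating as millions of transfers per second; the function name is made up for illustration. These are upper bounds per bus, not predictions of what STREAM will measure.

```python
# Assumption: each front-side bus has a 64-bit (8-byte) wide data path.
FSB_WIDTH_BYTES = 8


def fsb_peak_gb_per_s(transfers_mhz):
    """Theoretical peak of one FSB in GB/s (1e9 bytes/s).

    transfers_mhz is the marketing rating (e.g. 1333 or 1600),
    interpreted as millions of transfers per second.
    """
    return transfers_mhz * 1e6 * FSB_WIDTH_BYTES / 1e9


if __name__ == "__main__":
    for rate in (1333, 1600):
        print("%d MHz FSB: %.1f GB/s peak per bus" % (rate, fsb_peak_gb_per_s(rate)))
    # Relative headroom gained by moving from 1333 to 1600
    print("ratio: %.2f" % (fsb_peak_gb_per_s(1600) / fsb_peak_gb_per_s(1333)))
```

Under these assumptions the faster bus offers roughly 20% more peak bandwidth, which bounds how much a bandwidth-limited code could gain from the FSB change alone.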
Here are some measurements I just made on OpenMP STREAM:

                                        8-thread STREAM Copy (GB/s)
Harpertown
  Xeon 5410, 2.33 GHz, 1333 MHz FSB:    6.2
  Xeon 5472, 3.0 GHz, 1600 MHz FSB:     7.3
Barcelona
  Opteron 2352, 2.1 GHz:                15.7

(the 3 other STREAM components were pretty similar to Copy) These were on 2-socket, 8-core systems. Just a quick test, so not necessarily optimal. But Intel and PathScale compilers got the same OpenMP stream performance on the Xeon processors with high optimization, so that seems like a sign that the executables were pretty good. -Tom ------------------- Cheers, Jan From joshua_mora at usa.net Wed May 7 21:12:59 2008 From: joshua_mora at usa.net (Joshua mora acosta) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] Recent comparisons of 1600 MHz external Harpertown vs. 235x AMD processors? Message-ID: <639meHeL81166S23.1210219979@cmsweb23.cms.usa.net> Go to http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_8796_8800,00.html The comparison is against Harpertown 2.8 at 1.3GHz FSB since they have equivalent power consumption. Look for instance at compute intensive applications. A comparison against Harpertown 3.0 at 1.6GHz FSB with 800MHz isn't there, but I can say that despite its improvements in several directions, it does not close the huge gap on those memory-intensive applications. Joshua Mora. ------ Original Message ------ Received: Wed, 07 May 2008 01:47:40 PM PDT From: Bill Johnstone To: beowulf@beowulf.org Subject: [Beowulf] Recent comparisons of 1600 MHz external Harpertown vs. 235x AMD processors? > Hello all. > > Sorry if this has been addressed already -- I searched the mailing list archives and didn't find any answers. > > Has there been a recent (i.e., conducted this year with shipping silicon) performance study/benchmark of the "Harpertown" family of Intel Xeon processors running with an external clock of 1600 MHz against a suitable high-end member of the AMD "Barcelona" family, e.g., 2354 or 2356?
> > I've found various comparisons done in 2007 with pre-release AMD silicon, and conducted by "consumer grade" web sites like Tom's Hardware and Anandtech. I'm looking for something more rigorous, and done with a mind toward parallel/cluster applications, preferably on some form of Linux. > > I loathe that I'd have to use an nvidia chipset with the AMD processors, but I wouldn't want the dead-end memory architecture of the current Intel chips to become an issue. > > Thanks! > From maurice at harddata.com Thu May 8 06:39:16 2008 From: maurice at harddata.com (Maurice Hilarius) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] Re: Purdue Supercomputer - labeling Cables In-Reply-To: <200805071900.m47J09AL019748@bluewest.scyld.com> References: <200805071900.m47J09AL019748@bluewest.scyld.com> Message-ID: <48230284.6040806@harddata.com> David Mathog wrote: > What about labeling the cables? My rack's nodes' ethernet and power > cables are marked with the node's name at each end. It was a low tech > approach involving printing names, cutting paper into strips with > scissors, and taping them on. There are other types of label, many > based on modified zip ties, but they still have to be applied by hand. > White shrink tube works very well. It is secure, and does not project or snag on things. One may pre-write it. The only downside is that it has to be slipped over an end, so one has to pull the end out of a socket to apply it. -- With our best regards, Maurice W. Hilarius Telephone: 01-780-456-9771 Hard Data Ltd.
FAX: 01-780-456-9772 11060 - 166 Avenue email:maurice@harddata.com Edmonton, AB, Canada http://www.harddata.com/ T5X 1Y3 From john.hearns at streamline-computing.com Thu May 8 13:38:37 2008 From: john.hearns at streamline-computing.com (John Hearns) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] FreeIPA Message-ID: <1210279127.7449.13.camel@Vigor13> And sorry, that's not free India Pale Ale. This was discussed on my local LUG list today. http://freeipa.org/page/About "FreeIPA is an integrated security information management solution combining Linux (Fedora), Fedora Directory Server, MIT Kerberos, NTP, DNS. It consists of a web interface and command-line administration tools. Currently it supports identity management with plans to support policy and auditing management." Could be useful for cluster administration. From hahn at mcmaster.ca Thu May 8 14:04:01 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] Recent comparisons of 1600 MHz external Harpertown vs.235x AMD processors? In-Reply-To: <6DB5B58A8E5AB846A7B3B3BFF1B4315A0202C28D@AVEXCH1.qlogic.org> References: <891785.95622.qm@web57611.mail.re1.yahoo.com> <6DB5B58A8E5AB846A7B3B3BFF1B4315A01F674BB@AVEXCH1.qlogic.org> <537414876.20080508060343@gmx.net> <6DB5B58A8E5AB846A7B3B3BFF1B4315A0202C28D@AVEXCH1.qlogic.org> Message-ID: > Here are some measurements I just made on OpenMP STREAM: > 8-thread > STREAM Copy (GB/s) > Harpertown - ---------------- > Xeon 5410, 2.33 GHz, 1333 MHz FSB: 6.2 > Xeon 5472, 3.0 GHz, 1600 MHz FSB: 7.3 > Barcelona - > Opteron 2352, 2.1 GHz: 15.7 > (the 3 other STREAM components were pretty similar to Copy) very nice. it would be good to know the memory configuration. of course, the real question is how well Nehalem will do - does anyone have numbers?
it's a sticky time for specifying clusters, since current Intel chips have great onchip performance but are dramatically constrained in decent-sized HPC apps. choosing AMD, while defensible based on current memory performance, looks extremely iffy otherwise. (the most recent AMD disclosures have them attempting to milk the k10 core for the next couple years (!).) From charliep at cs.earlham.edu Fri May 9 03:44:37 2008 From: charliep at cs.earlham.edu (Charlie Peck) Date: Sat Jul 5 01:07:04 2008 Subject: [Beowulf] Summer workshops in parallel and distributed computing and computational science Message-ID: <6C3B5556-78EA-4F84-91B0-F691762AAC4C@cs.earlham.edu> Slightly off-topic (but not too far): The SuperComputing (SC) Education Program is a year-long program working with undergraduate faculty, administrators, college students, and collaborating high school teachers to integrate computational science and high performance computing and communications technologies highlighted through the SC Conference into the preparation of future scientists, technologists, engineers, mathematicians and teachers. The SC Education Program hosts about 10 week-long workshops each summer covering a variety of topics in parallel and distributed computing and computational science. The workshops are primarily funded through the SC conference series; attendees are only responsible for travel and a registration fee. Information is available at http://sc-education.org/workshops/schedule.php Registration is open now. charlie From tom.elken at qlogic.com Fri May 9 09:00:59 2008 From: tom.elken at qlogic.com (Tom Elken) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Recent comparisons of 1600 MHz external Harpertown vs.235x AMD processors?
In-Reply-To: References: <891785.95622.qm@web57611.mail.re1.yahoo.com> <6DB5B58A8E5AB846A7B3B3BFF1B4315A01F674BB@AVEXCH1.qlogic.org> <537414876.20080508060343@gmx.net> <6DB5B58A8E5AB846A7B3B3BFF1B4315A0202C28D@AVEXCH1.qlogic.org> Message-ID: <6DB5B58A8E5AB846A7B3B3BFF1B4315A0202C46A@AVEXCH1.qlogic.org> > From: Mark Hahn [mailto:hahn@mcmaster.ca] > > 8-thread > > STREAM Copy (GB/s) > > Harpertown - ---------------- > > Xeon 5410, 2.33 GHz, 1333 MHz FSB: 6.2 > > Xeon 5472, 3.0 GHz, 1600 MHz FSB: 7.3 > > Barcelona - > > Opteron 2352, 2.1 GHz: 15.7 > > (the 3 other STREAM components were pretty similar to Copy) > > very nice. it would be good to know the memory configuration. Very good point. The faster Xeon has faster memory too. 2x Xeon 5410: 16GB Reg ECC DDR2-667 FBDIMM memory 2x Xeon 5472: 16GB Reg ECC DDR2-800 FBDIMM memory 2x Opteron 2352: 16GB ECC Reg. DDR2-667 SDRAM memory 8 dimms in the Xeon systems, not sure about the Opteron. -Tom > > of course, the real question is how well Nehalem will do - > does anyone have numbers? it's a sticky time for specifying > clusters, since current Intel chips have great onchip performance > but are dramatically constrained in decent-sized HPC apps. > choosing AMD, while defensible based on current memory performance, > looks extremely iffy otherwise. (the most recent AMD disclosures > have them attempting to milk the k10 core for the next couple > years (!).) > From gerry.creager at tamu.edu Fri May 9 10:44:47 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: References: Message-ID: <48248D8F.1070000@tamu.edu> Late response: Brady self-laminating labels. They can be used with a Brady printer or a laser printer. gerry David Mathog wrote: > Jim Lux wrote > >> What wasn't shown in the video.. all the plugging and routing of >> network cables? > > What about labeling the cables? 
My rack's nodes' ethernet and power > cables are marked with the node's name at each end. It was a low tech > approach involving printing names, cutting paper into strips with > scissors, and taping them on. There are other types of label, many > based on modified zip ties, but they still have to be applied by hand. > > Regards, > > David Mathog > mathog@caltech.edu > Manager, Sequence Analysis Facility, Biology Division, Caltech -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From prentice at ias.edu Fri May 9 11:26:34 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Do these SGE features exist in Torque? Message-ID: <4824975A.40403@ias.edu> At a previous job, I installed SGE for our cluster. At my current job Torque is the queuing system of choice. I'm very familiar with SGE, but only have a cursory knowledge of Torque (installed it for evaluation, and that's it). We're about to purchase a new cluster. I'd have to make a good argument for using SGE over Torque. I was wondering if the following SGE features exist in Torque: 1. Interactive shells managed by queuing system 2. Counting licenses in use (done using a contributed shell script in SGE) 3. Separation of roles between submit hosts, execution hosts, and administration hosts 4. Certificate-based security. Are there any notable features available in Torque that aren't available in SGE?
-- Prentice From john.hearns at streamline-computing.com Fri May 9 12:22:16 2008 From: john.hearns at streamline-computing.com (John Hearns) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Do these SGE features exist in Torque? In-Reply-To: <4824975A.40403@ias.edu> References: <4824975A.40403@ias.edu> Message-ID: <1210360946.4934.80.camel@Vigor13> On Fri, 2008-05-09 at 14:26 -0400, Prentice Bisbal wrote: > 1. Interactive shells managed by queuing system > 2. Counting licenses in use (done using a contributed shell script in SGE) > 3. Separation of roles between submit hosts, execution hosts, and > administration hosts > 4. Certificate-based security. > > Are there any notable features available in Torque that aren't available > in SGE? > 5. Every time you install Torque a kitten dies. Yours Sincerely, A Gridengine User From joshua_mora at usa.net Thu May 8 16:19:30 2008 From: joshua_mora at usa.net (Joshua mora acosta) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Recent comparisons of 1600 MHz external Harpertown vs.235x AMD processors? Message-ID: <078meHXsE2052S18.1210288770@cmsweb18.cms.usa.net> If you had a 2.3GHz at 2.0GHz NB you would get 17.5GB/sec. Joshua ------ Original Message ------ Received: Thu, 08 May 2008 02:18:30 PM PDT From: Mark Hahn To: Tom Elken Cc: Beowulf Mailing List Subject: RE: Re[2]: [Beowulf] Recent comparisons of 1600 MHz external Harpertown vs.235x AMD processors? > > Here are some measurements I just made on OpenMP STREAM: > > 8-thread > > STREAM Copy (GB/s) > > Harpertown - ---------------- > > Xeon 5410, 2.33 GHz, 1333 MHz FSB: 6.2 > > Xeon 5472, 3.0 GHz, 1600 MHz FSB: 7.3 > > Barcelona - > > Opteron 2352, 2.1 GHz: 15.7 > > (the 3 other STREAM components were pretty similar to Copy) > > very nice. it would be good to know the memory configuration. > > of course, the real question is how well Nehalem will do - > does anyone have numbers? 
it's a sticky time for specifying > clusters, since current Intel chips have great onchip performance > but are dramatically constrained in decent-sized HPC apps. > choosing AMD, while defensible based on current memory performance, > looks extremely iffy otherwise. (the most recent AMD disclosures > have them attempting to milk the k10 core for the next couple years (!).) > From perry at piermont.com Fri May 9 11:07:17 2008 From: perry at piermont.com (Perry E. Metzger) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: <48248D8F.1070000@tamu.edu> (Gerry Creager's message of "Fri\, 09 May 2008 12\:44\:47 -0500") References: <48248D8F.1070000@tamu.edu> Message-ID: <87tzh7cpwq.fsf@snark.cb.piermont.com> Gerry Creager writes: > Late response: Brady self-laminating labels. They can be used with a > Brady printer or a laser printer. I've been wondering why the Canon P-Touch hasn't been mentioned up until this point. They're great for labeling just about anything in a machine room... Perry From jan.heichler at gmx.net Fri May 9 13:08:19 2008 From: jan.heichler at gmx.net (Jan Heichler) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Recent comparisons of 1600 MHz external Harpertown vs.235x AMD processors? In-Reply-To: <078meHXsE2052S18.1210288770@cmsweb18.cms.usa.net> References: <078meHXsE2052S18.1210288770@cmsweb18.cms.usa.net> Message-ID: <4910603360.20080509220819@gmx.net> Hello Joshua, On Friday, 9 May 2008, you wrote: Jma> If you had a 2.3GHz at 2.0GHz NB you would get 17.5GB/sec. 2.0 GHz what? What does NB mean? Cheers, Jan
From reuti at staff.uni-marburg.de Fri May 9 13:17:03 2008 From: reuti at staff.uni-marburg.de (Reuti) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Do these SGE features exist in Torque? In-Reply-To: <4824975A.40403@ias.edu> References: <4824975A.40403@ias.edu> Message-ID: Hi, On 09.05.2008 at 20:26, Prentice Bisbal wrote: > At a previous job, I installed SGE for our cluster. At my current job > Torque is the queuing system of choice. I'm very familiar with SGE, but > only have a cursory knowledge of Torque (installed it for evaluation, > and that's it). We're about to purchase a new cluster. I'd have to > make > a good argument for using SGE over Torque. I was wondering if the > following SGE features exist in Torque: > > 1. Interactive shells managed by queuing system > 2. Counting licenses in use (done using a contributed shell script > in SGE) > 3. Separation of roles between submit hosts, execution hosts, and > administration hosts > 4. Certificate-based security. > > Are there any notable features available in Torque that aren't > available > in SGE? what you can find in Torque but not in SGE: request a mixture of nodes, i.e. one heavy node with much memory (or big I/O options) and 5 nodes with less memory or less disk performance for a parallel job. OTOH, if you have parallel jobs: http://www.beowulf.org/archive/2007-September/019269.html What differs between them conceptually: in Torque you submit a job into a queue, while in SGE you request resources and SGE will select an appropriate queue for you. -- Reuti From Craig.Tierney at noaa.gov Fri May 9 14:06:10 2008 From: Craig.Tierney at noaa.gov (Craig Tierney) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Do these SGE features exist in Torque?
In-Reply-To: References: <4824975A.40403@ias.edu> Message-ID: <4824BCC2.3080405@noaa.gov> Reuti wrote: > Hi, > > On 09.05.2008 at 20:26, Prentice Bisbal wrote: > >> At a previous job, I installed SGE for our cluster. At my current job >> Torque is the queuing system of choice. I'm very familiar with SGE, but >> only have a cursory knowledge of Torque (installed it for evaluation, >> and that's it). We're about to purchase a new cluster. I'd have to make >> a good argument for using SGE over Torque. I was wondering if the >> following SGE features exist in Torque: >> >> 1. Interactive shells managed by queuing system >> 2. Counting licenses in use (done using a contributed shell script in >> SGE) >> 3. Separation of roles between submit hosts, execution hosts, and >> administration hosts >> 4. Certificate-based security. >> >> Are there any notable features available in Torque that aren't available >> in SGE? > > what you can find in Torque but not in SGE: request a mixture of nodes, > i.e. one heavy node with much memory (or big I/O options) and 5 nodes > with less memory or less disk performance for a parallel job. > Not true. Torque syntax is much cleaner for doing this, but you can do it in SGE. Call the big memory hosts "bigmemN", where N is an integer from zero up to however many servers you have. Create a hostgroup with the compute nodes called "@compx". Then create a parallel environment and a queue that use this host group. Normal jobs will go into this parallel environment. Next, create a parallel environment called pebigmem. Then, for each host, create an individual queue called qbigmemN.q (replace N with the appropriate integer). For each queue, specify the host list as: hostlist bigmemN @compx When you launch a job, the call to qsub should include: qsub -pe pebigmem 6 -masterq qbigmem0.q@bigmem0,qbigmem1.q@bigmem1,..... The -masterq line should list every bigmem queue instance.
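[Editorial aside] The naming scheme just described is regular enough that the -masterq list can be generated rather than typed. What follows is a minimal sketch, not the wrapper actually used at the poster's site: the function names and the job script name "job.sh" are hypothetical, and it only builds the argument list (it does not call qsub).

```python
def masterq_arg(n_bigmem):
    """Comma-separated -masterq value covering bigmem0 .. bigmem(n-1),
    following the qbigmemN.q@bigmemN naming described above."""
    return ",".join("qbigmem%d.q@bigmem%d" % (i, i) for i in range(n_bigmem))


def qsub_command(slots, n_bigmem, script):
    """Build (but do not run) a qsub argument list for the pebigmem
    parallel environment. 'script' is a placeholder job script path."""
    return ["qsub", "-pe", "pebigmem", str(slots),
            "-masterq", masterq_arg(n_bigmem), script]


if __name__ == "__main__":
    # Six slots, two big-memory hosts; job.sh is a made-up script name.
    print(" ".join(qsub_command(6, 2, "job.sh")))
```

Generating the list from the host count keeps the submit line in step with the cluster when big-memory hosts are added or removed.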
At our site, we have a qsub wrapper script that, when a user asks for the parallel environment "pebigmem", adds the -masterq line to hide the details from them. If I hadn't figured out how to do this 6 years ago, we could never have migrated from OpenPBS to SGE. Craig > OTOH, if you have parallel jobs: > http://www.beowulf.org/archive/2007-September/019269.html > > What is different between them from the idea: in Torque you submit a job > into a queue, while in SGE you request resources and SGE will select an > appropriate queue for you. > > -- Reuti > -- Craig Tierney (craig.tierney@noaa.gov) From james.p.lux at jpl.nasa.gov Fri May 9 14:21:35 2008 From: james.p.lux at jpl.nasa.gov (Jim Lux) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: <87tzh7cpwq.fsf@snark.cb.piermont.com> References: <48248D8F.1070000@tamu.edu> <87tzh7cpwq.fsf@snark.cb.piermont.com> Message-ID: <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> Quoting "Perry E. Metzger" , on Fri 09 May 2008 11:07:17 AM PDT: > > Gerry Creager writes: >> Late response: Brady self-laminating labels. They can be used with a >> Brady printer or a laser printer. > > I've been wondering why the Canon P-Touch hasn't been mentioned up > until this point. They're great for labeling just about anything in a > machine room... Sure, until you have to label 1000 things in a few hours. I find it's hard to get the backing peeled off (although my Ptouch doesn't have the little feature that nicks the edge to make it easier). Actually, you can order cables already pre-numbered and labelled. Why burn expensive cluster assembler time when you can pay someone (potentially offshore) to do it cheaper?
Jim From matt at technoronin.com Fri May 9 13:21:27 2008 From: matt at technoronin.com (Matt Lawrence) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: <87tzh7cpwq.fsf@snark.cb.piermont.com> References: <48248D8F.1070000@tamu.edu> <87tzh7cpwq.fsf@snark.cb.piermont.com> Message-ID: On Fri, 9 May 2008, Perry E. Metzger wrote: > I've been wondering why the Canon P-Touch hasn't been mentioned up > until this point. They're great for labeling just about anything in a > machine room... In the hot environment behind most servers, I have seen that the P-Touch labels delaminate and the adhesive fails when they are used on cables. Also, I would be quite happy if the list moderator were to take me off moderation. Hint, hint.... -- Matt It's not what I know that counts. It's what I can remember in time to use. From joshua_mora at usa.net Fri May 9 14:23:26 2008 From: joshua_mora at usa.net (Joshua mora acosta) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Recent comparisons of 1600 MHz external Harpertown vs.235x AMD processors? Message-ID: <165meiVwA8034S08.1210368206@cmsweb08.cms.usa.net> It means NorthBridge ------ Original Message ------ Received: Fri, 09 May 2008 01:09:37 PM PDT From: Jan Heichler To: "Joshua mora acosta" Cc: Mark Hahn , Tom Elken , Beowulf Mailing List Subject: Re[4]: [Beowulf] Recent comparisons of 1600 MHz external Harpertown vs.235x AMD processors? > Hallo Joshua, > > Freitag, 9. Mai 2008, meintest Du: > > Jma> If you had a 2.3GHz at 2.0GHz NB you would get 17.5GB/sec. > > 2.0 GHz what? What does NB mean? > > Cheers, > Jan From perry at piermont.com Fri May 9 15:44:02 2008 From: perry at piermont.com (Perry E. 
Metzger) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> (Jim Lux's message of "Fri\, 09 May 2008 14\:21\:35 -0700") References: <48248D8F.1070000@tamu.edu> <87tzh7cpwq.fsf@snark.cb.piermont.com> <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> Message-ID: <87prrvayj1.fsf@snark.cb.piermont.com> Jim Lux writes: >> I've been wondering why the Canon P-Touch hasn't been mentioned up >> until this point. They're great for labeling just about anything in a >> machine room... > > Sure, until you have to label 1000 things in a few hours. (BTW, my mistake, it is the BROTHER P-Touch). Modern P-Touches have USB interfaces and you can automate the label printing, so if you have to label 1000 things and you have a reasonable script you can do it. > I find it's hard to get the backing peeled off (although my Ptouch > doesn't have the little feature that nicks the edge to make it > easier). I also find that it is pretty easy to get the backing off if you're willing to crush a corner a wee bit. > Actually, you can order cables already pre numbered and labelled. Why > burn expensive cluster assembler time when you can pay someone > (potentially offshore) to do it cheaper. If that's cheap enough, sure, sounds like a good deal. As long as you've brought it up, who do you buy pre-numbered and pre-labeled cables from? Even if you manage that, though, it won't help with labeling cabinets, machines, etc. -- I find the labeler gadgets are really useful. -- Perry E. 
Metzger perry@piermont.com From jiteshbdundas at gmail.com Sat May 10 11:12:31 2008 From: jiteshbdundas at gmail.com (jitesh dundas) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: <87prrvayj1.fsf@snark.cb.piermont.com> References: <48248D8F.1070000@tamu.edu> <87tzh7cpwq.fsf@snark.cb.piermont.com> <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> <87prrvayj1.fsf@snark.cb.piermont.com> Message-ID: Hi, That sounds like a good option. I have a question about Beowulf clusters. What if one of the systems in the cluster is down, or there are network failures? Can we make our cluster (2-5 systems only) still work properly? Also, what about geographically distant cluster systems, say one in the USA and another in India? How do we manage our cluster in mishaps or difficult conditions? Lastly, how about having Beowulf cluster systems in space, putting one PC on each planet or celestial body that we want to track, with the server in India? Is Linux the best choice in such cases? Any ideas? On 5/10/08, Perry E. Metzger wrote: > > Jim Lux writes: >>> I've been wondering why the Canon P-Touch hasn't been mentioned up >>> until this point. They're great for labeling just about anything in a >>> machine room... >> >> Sure, until you have to label 1000 things in a few hours. > > (BTW, my mistake, it is the BROTHER P-Touch). > > Modern P-Touches have USB interfaces and you can automate the label > printing, so if you have to label 1000 things and you have a > reasonable script you can do it. > >> I find it's hard to get the backing peeled off (although my Ptouch >> doesn't have the little feature that nicks the edge to make it >> easier). > > I also find that it is pretty easy to get the backing off if you're > willing to crush a corner a wee bit. > >> Actually, you can order cables already pre numbered and labelled. Why >> burn expensive cluster assembler time when you can pay someone >> (potentially offshore) to do it cheaper.
> > If that's cheap enough, sure, sounds like a good deal. As long as > you've brought it up, who do you buy pre-numbered and pre-labeled > cables from? > > Even if you manage that, though, it won't help with labeling cabinets, > machines, etc. -- I find the labeler gadgets are really useful. > > > -- > Perry E. Metzger perry@piermont.com > From hahn at mcmaster.ca Sat May 10 17:28:03 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: References: <48248D8F.1070000@tamu.edu> <87tzh7cpwq.fsf@snark.cb.piermont.com> <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> <87prrvayj1.fsf@snark.cb.piermont.com> Message-ID: > clusters. What if you have 1 of the systems in the cluster down or any > network failures. Can make our cluster (2-5 systems only) work properly. normally, the cluster's management software will monitor and deal with node failure. at least that means noticing a failure and ensuring that the node isn't used (until fixed) and dealing with any jobs that involved the node. it's also fairly common for server nodes (not just slave/compute nodes) to have some failover/high-availability features. (HA can also be done for compute jobs, but IMHO it's not worth considering in normal cases, ie, infrequent node failures.) > Also what about geographically distant cluster systems. Say 1 in USA sure, there's nothing about clusters that really assumes locality, though obviously geographic distribution has effects on achievable performance for wide-area MPI or distant file access. wide-area clustering seems more of a political stunt to me (yes, including grids.) > and other in India. How do we manage our cluster in mishaps or > difficult conditions.
I find that with IPMI and console redirection, it's very rarely necessary to care about where your nodes are, at least from a sysadmin perspective. you need to ask what the benefit is, though, in a wide-area cluster (versus separate, local ones.) I wouldn't assume that management would be easier, and obviously only gratuitously parallel apps (sometimes called embarrassingly parallel) could use it. > lastly, how about having beowulf cluster systems in space.putting 1 pc > on each planet or celestial body that we want to track and the server > in india. just because it could be done doesn't mean it makes sense... > is linux the best choice in such cases... your choice of OS depends primarily on your preference and experience. From hahn at mcmaster.ca Sat May 10 23:45:05 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: <87wsm11x1j.fsf@snark.cb.piermont.com> References: <48248D8F.1070000@tamu.edu> <87tzh7cpwq.fsf@snark.cb.piermont.com> <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> <87prrvayj1.fsf@snark.cb.piermont.com> <87wsm11x1j.fsf@snark.cb.piermont.com> Message-ID: >> I find that with IPMI and console redirection, it's very rarely necessary to >> care about where your nodes are, at least from a sysadmin perspective. > > Speaking of IPMI, are there reasonable motherboards that incorporate > it right into the design at this point, or is it still mostly an add > on? bit of both - supermicro and tyan both rely on add-in cards, but HP (and probably some other vendors) have integrated IPMI onboard. I keep reading stuff from Intel and AMD about manageability which sounds tantalizingly like IPMI-like manageability being integrated into the chipset. I think the main issue is that ipmi needs a lan interface, though I guess there are some examples of it sharing a port with the host.
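[Editorial aside] For readers who have not used IPMI over the LAN, here is a minimal sketch of what remote power control and console redirection can look like when driven from a script. It assumes the ipmitool command-line client is installed and that the BMC speaks IPMI 2.0 (the lanplus interface); the host name "node17-bmc" and user "admin" are made up. The helper only builds the command line; the password is expected in the IPMI_PASSWORD environment variable via ipmitool's -E option rather than on the command line.

```python
# Chassis power subcommands accepted by ipmitool.
POWER_ACTIONS = ("status", "on", "off", "cycle", "reset")


def ipmi_base(bmc_host, user):
    """Common ipmitool arguments for IPMI-over-LAN (lanplus).
    -E tells ipmitool to read the password from IPMI_PASSWORD."""
    return ["ipmitool", "-I", "lanplus", "-H", bmc_host, "-U", user, "-E"]


def power_command(bmc_host, user, action):
    """Build the command that queries or changes chassis power state."""
    if action not in POWER_ACTIONS:
        raise ValueError("unsupported power action: %r" % (action,))
    return ipmi_base(bmc_host, user) + ["chassis", "power", action]


def console_command(bmc_host, user):
    """Build the command that attaches to the serial-over-LAN console."""
    return ipmi_base(bmc_host, user) + ["sol", "activate"]


if __name__ == "__main__":
    # Hypothetical BMC address and account, for illustration only.
    print(" ".join(power_command("node17-bmc", "admin", "status")))
    print(" ".join(console_command("node17-bmc", "admin")))
```

The resulting lists can be handed to subprocess.run on a management host; whether a given BMC honours every subcommand depends on the vendor's implementation.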
From tjrc at sanger.ac.uk Sun May 11 01:51:05 2008 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: References: <48248D8F.1070000@tamu.edu> <87tzh7cpwq.fsf@snark.cb.piermont.com> <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> <87prrvayj1.fsf@snark.cb.piermont.com> <87wsm11x1j.fsf@snark.cb.piermont.com> Message-ID: On 11 May 2008, at 8:45 am, Mark Hahn wrote: >>> I find that with IPMI and console redirection, it's very rarely >>> necessary to >>> care about where your nodes are, at least from a sysadmin >>> perspective. >> >> Speaking of IPMI, are there reasonable motherboards that incorporate >> it right into the design at this point, or is it still mostly an add >> on? > > bit of both - supermicro and tyan both rely on add-in cards, but > HP (and probably some other vendors) have integrated IPMI onboard. > I keep reading stuff from Intel and AMD about manageability which > sounds tantalizingly like IPMI-like manageability being integrated > into the chipset. > > I think the main issue is that ipmi needs a lan interface, though I > guess there are some examples of it sharing a port with the host. There is also the issue that most vendors' IPMI implementations are broken, to at least some extent. HP seem to be in the process of abandoning IPMI in favour of something else (whose name escapes me). I guess one of the problems with IPMI is that it isn't in an individual hardware vendor's interest to support it properly. If we really had a management interface that worked consistently across all hardware vendors, then it really wouldn't matter whose tin we bought (which they won't like) and there also wouldn't be any need to lock you into expensive vendor-specific management software like SIM (which they also wouldn't like). I know I'm mainly mentioning HP here, but I don't really want to single them out for criticism - all the main vendors do the same thing.
Such are the realities of business. Management systems are now one of the few things the main vendors can use to distinguish their products from those of the competition. Tim -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From landman at scalableinformatics.com Sun May 11 07:34:18 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: References: <48248D8F.1070000@tamu.edu> <87tzh7cpwq.fsf@snark.cb.piermont.com> <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> <87prrvayj1.fsf@snark.cb.piermont.com> <87wsm11x1j.fsf@snark.cb.piermont.com> Message-ID: <482703EA.5000006@scalableinformatics.com> Tim Cutts wrote: >> I think the main issue is that ipmi needs a lan interface, though I >> guess there are some examples of it sharing a port with the host. > > There are also the issues that most vendors' IPMI implentations are > broken, to at least some extent. HP seem to be in the process of > abandoning IPMI in favour of something else (whose name escapes me) Well ... I am not sure I agree that most are broken. It's a standard, and some have taken ... ah ... liberties ... with the implementation of it. And the quality of some of the implementations is really really bad. That and the fact that it touches *so many* subsystems at such a low level in a system, that you really want to keep the standard/coding as simple as absolutely possible. Minor bugs in it could lead to machine shutdowns, difficult to diagnose power issues, ... We have found as a safety precaution, that including a console server/kvm unit and having power control via addressable/switchable PDU is a great backup, especially when we are hundreds of km (or simply different timezones) from the units. 
Paraphrasing Heinlein here, I would rather ascribe this state of affairs to incompetence than malfeasance. > I guess one of the problems with IPMI is that it isn't in an individual > hardware vendor's interest to support it properly. If we really had a I disagree with this assertion. We want all gear we sell, or work with, to support this, so that our management bits can handle anything we throw at it. Ours, theirs, it doesn't matter. One management interface for all, no matter what/whose kit is there. Anything else makes IT *exciting* which is generally the last thing you want IT to be. > management interface that worked consistently across all hardware > vendors, then it really wouldn't matter whose tin we bought (which they > won't like) and there also wouldn't be any need to lock you into Hmmm.... blades are an invention to try to de-commoditize commodity gear. They provide better lock-in than IPMI ever would. You can't take an IBM blade-center blade and stick it in a Dell, HP, or Sun chassis. Or manage it with their tools. > expensive vendor-specific management software like SIM (which they also > wouldn't like). > > I know I'm mainly mentioning HP here, but I don't really want to single > them out for criticism - all the main vendors do the same thing. Such > are the realities of business. Management systems are now one of the Hmmm.... maybe we aren't a "main" vendor (I think you mean "tier-1" or TLA vendor) > few things the main vendors can use to distinguish their products from > those of the competition. Management atop IPMI is fairly common, though as indicated, some implementations are horribly broken. When you willingly pay more for kit from the TLA vendors (identical in most ways with kit from non-TLA vendors), you need some sort of proprietary justification for buying it. The main hardware differentiators these days are the packaging and the label on the front. 
There are other non-hardware differentiators in terms of service and support, and in terms of being able to understand your issues and map this to needed functionality/software/hardware/configuration. Very few vendors can do these things well in HPC. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From eugen at leitl.org Sun May 11 07:52:29 2008 From: eugen at leitl.org (Eugen Leitl) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: References: <48248D8F.1070000@tamu.edu> <87tzh7cpwq.fsf@snark.cb.piermont.com> <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> <87prrvayj1.fsf@snark.cb.piermont.com> <87wsm11x1j.fsf@snark.cb.piermont.com> Message-ID: <20080511145229.GM9875@leitl.org> On Sun, May 11, 2008 at 02:45:05AM -0400, Mark Hahn wrote: > bit of both - supermicro and tyan both rely on add-in cards, but > HP (and probably some other vendors) have integrated IPMI onboard. SunFire X2100 (M2) used to have discrete IPMI boards, but are now integrated on the motherboard. A couple HP Proliant DL165 G5 had discrete management boards. > I keep reading stuff from Intel and AMD about managability which > sounds tantalizingly like IPMI-like managability being integrated > into the chipset. SunFire X2100 M2 does this (the Broadcom NIC pair), and with the latest BIOS pretty well. > I think the main issue is that ipmi needs a lan interface, though > I guess there are some examples of it sharing a port with the host. 
-- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE From shaeffer at neuralscape.com Sun May 11 08:14:13 2008 From: shaeffer at neuralscape.com (Karen Shaeffer) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: References: <48248D8F.1070000@tamu.edu> <87tzh7cpwq.fsf@snark.cb.piermont.com> <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> <87prrvayj1.fsf@snark.cb.piermont.com> <87wsm11x1j.fsf@snark.cb.piermont.com> Message-ID: <20080511151413.GA9137@synapse.neuralscape.com> On Sun, May 11, 2008 at 02:45:05AM -0400, Mark Hahn wrote: > >>I find that with IPMI and console redirection, it's very rarely necessary > >>to > >>care about where your nodes are, at least from a sysadmin perspective. > > > >Speaking of IPMI, are there reasonable motherboards that incorporate > >it right into the design at this point, or is it still mostly an add > >on? > > bit of both - supermicro and tyan both rely on add-in cards, but > HP (and probably some other vendors) have integrated IPMI onboard. > I keep reading stuff from Intel and AMD about managability which > sounds tantalizingly like IPMI-like managability being integrated > into the chipset. > > I think the main issue is that ipmi needs a lan interface, though > I guess there are some examples of it sharing a port with the host. Hi, To implement full IPMI capability on the system board, you need two separate and independent power distribution systems on the system board, I believe, which is why they implement IPMI with a daughterboard. Thanks, Karen -- Karen Shaeffer Neuralscape, Palo Alto, Ca. 
94306 shaeffer@neuralscape.com http://www.neuralscape.com From shaeffer at neuralscape.com Sun May 11 13:22:32 2008 From: shaeffer at neuralscape.com (Karen Shaeffer) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: <48274C1A.2020009@bogus.com> References: <48248D8F.1070000@tamu.edu> <87tzh7cpwq.fsf@snark.cb.piermont.com> <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> <87prrvayj1.fsf@snark.cb.piermont.com> <87wsm11x1j.fsf@snark.cb.piermont.com> <20080511151413.GA9137@synapse.neuralscape.com> <48274C1A.2020009@bogus.com> Message-ID: <20080511202232.GA11563@synapse.neuralscape.com> On Sun, May 11, 2008 at 12:42:18PM -0700, Joel Jaeggli wrote: > Karen Shaeffer wrote: > >Hi, > >To implement full IPMI capability on the system board, then you need > >two separate and independent power distribution systems on the system > >board I believe, which is why they implement IPMI with a daughterboard. > > +5volt standby from the atx power supply is used to provide wake-on-lan > functionality and power a ipmi implementation. it is available when > the system is otherwise off. > Hi Joel, Yes, but a separate and distinct power distribution for that +5 volt supply will then need to be implemented on the system board to support a full IPMI implementation residing on the system board. That was my point. You are confusing power source with implemented PCB power distribution. Thanks, Karen -- Karen Shaeffer Neuralscape, Palo Alto, Ca. 
94306 shaeffer@neuralscape.com http://www.neuralscape.com From landman at scalableinformatics.com Sun May 11 13:24:33 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: <878wygzk20.fsf@snark.cb.piermont.com> References: <48248D8F.1070000@tamu.edu> <87tzh7cpwq.fsf@snark.cb.piermont.com> <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> <87prrvayj1.fsf@snark.cb.piermont.com> <87wsm11x1j.fsf@snark.cb.piermont.com> <482703EA.5000006@scalableinformatics.com> <878wygzk20.fsf@snark.cb.piermont.com> Message-ID: <48275601.7050708@scalableinformatics.com> Perry E. Metzger wrote: > Joe Landman writes: >> We have found as a safety precaution, that including a console >> server/kvm unit and having power control via addressable/switchable >> PDU is a great backup, especially when we are hundreds of km (or >> simply different timezones) from the units. > > Who do you favor for console servers these days? Ditto for > addressable/switchable PDUs? Hi Perry: We are biased as we resell these. OpenGear on the Console servers. They are pretty good. Have some nice kit. Switchable PDUs vary. Some of our customers have preferences, but we default to APC units if they don't have preferences. We have used the WTI Network Boot Bars as well (web/ssh interface) for a number of customers. 
Joe > > Perry -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From landman at scalableinformatics.com Sun May 11 13:45:46 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: <20080511202232.GA11563@synapse.neuralscape.com> References: <48248D8F.1070000@tamu.edu> <87tzh7cpwq.fsf@snark.cb.piermont.com> <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> <87prrvayj1.fsf@snark.cb.piermont.com> <87wsm11x1j.fsf@snark.cb.piermont.com> <20080511151413.GA9137@synapse.neuralscape.com> <48274C1A.2020009@bogus.com> <20080511202232.GA11563@synapse.neuralscape.com> Message-ID: <48275AFA.3040504@scalableinformatics.com> Karen Shaeffer wrote: > Hi Joel, > Yes, but a separate and distinct power distribution for that +5 volt > supply will then need to be implemented on the system board to support > a full IPMI implementation residing on the system board. That was my > point. You are confusing power source with implemented PCB power > distribution. Hi Karen I haven't seen IPMI cards that get their power from any other place than the motherboard slot. There may be some that have a separate supply, but I haven't run across them. That is, as long as the power supply is "on", the main system power does not need to be, and the IPMI is powered over a similar bus to that which powers the motherboard network cards. I don't think Joel was confused on this. 
Joe -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From shaeffer at neuralscape.com Sun May 11 16:28:15 2008 From: shaeffer at neuralscape.com (Karen Shaeffer) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: <48275AFA.3040504@scalableinformatics.com> References: <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> <87prrvayj1.fsf@snark.cb.piermont.com> <87wsm11x1j.fsf@snark.cb.piermont.com> <20080511151413.GA9137@synapse.neuralscape.com> <48274C1A.2020009@bogus.com> <20080511202232.GA11563@synapse.neuralscape.com> <48275AFA.3040504@scalableinformatics.com> Message-ID: <20080511232815.GA12628@synapse.neuralscape.com> On Sun, May 11, 2008 at 04:45:46PM -0400, Joe Landman wrote: > Karen Shaeffer wrote: > > >Hi Joel, > >Yes, but a separate and distinct power distribution for that +5 volt > >supply will then need to be implemented on the system board to support > >a full IPMI implementation residing on the system board. That was my > >point. You are confusing power source with implemented PCB power > >distribution. > > Hi Karen > > I haven't seen IPMI cards that get their power from any other place > than the motherboard slot. There may be some that have a separate > supply, but I haven't run across them. That is, as long as the power > supply is "on", the main system power does not need to be, and the IPMI > is powered over a similar bus to that which powers the motherboard > network cards. > > I don't think Joel was confused on this. Hi Joe, OK. Maybe no one is confused, I am just not communicating effectively. Let me try again. If the motherboard is powered down, then the IPMI board can restart it and can also report failures at the system board level. 
So the system board power distribution must be distinct from the IPMI power distribution, even when they connect to a node on the system board that has power all the time. Is that reasonable? If so, then it appears to me that a full IPMI implementation on the system board would need that distinct and independent power distribution that used to be on the daughterboard, if that capability is to be maintained. That is all I really meant with my comment. Thanks, Karen -- Karen Shaeffer Neuralscape, Palo Alto, Ca. 94306 shaeffer@neuralscape.com http://www.neuralscape.com From gdjacobs at gmail.com Sun May 11 16:47:11 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: <20080511232815.GA12628@synapse.neuralscape.com> References: <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> <87prrvayj1.fsf@snark.cb.piermont.com> <87wsm11x1j.fsf@snark.cb.piermont.com> <20080511151413.GA9137@synapse.neuralscape.com> <48274C1A.2020009@bogus.com> <20080511202232.GA11563@synapse.neuralscape.com> <48275AFA.3040504@scalableinformatics.com> <20080511232815.GA12628@synapse.neuralscape.com> Message-ID: <4827857F.3020304@gmail.com> Karen Shaeffer wrote: > If the motherboard is powered down, then the IPMI board can > restart it and can also report failures at the system board level. > So the system board power distribution must be distinct from the > IPMI power distribution, even when they connect to a node on the > system board that has power all the time. Is that reasonable? If > so, then it appears to me that a full IPMI implementation on the > system board would need that distinct and independent power > distribution that used to be on the daughterboard, if that > capability is to be maintained. That is all I really meant with my > comment. You want a persistent, fault tolerant +5V supply. Anyone have any thoughts about exactly how stable +5V standby off the PCI rails really is? -- Geoffrey D. 
Jacobs From lindahl at pbm.com Sun May 11 19:13:26 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: References: <48248D8F.1070000@tamu.edu> <87tzh7cpwq.fsf@snark.cb.piermont.com> <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> <87prrvayj1.fsf@snark.cb.piermont.com> <87wsm11x1j.fsf@snark.cb.piermont.com> Message-ID: <20080512021326.GF32767@bx9.net> On Sun, May 11, 2008 at 02:45:05AM -0400, Mark Hahn wrote: > I think the main issue is that ipmi needs a lan interface, though > I guess there are some examples of it sharing a port with the host. Last I saw someone doing this, IPMI sharing an ethernet port with the host led to all kinds of weird ARP problems. Whereas a dedicated port is much easier to configure. My favorite vendors all offer a dedicated port... -- greg From lindahl at pbm.com Sun May 11 19:15:04 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: <4827857F.3020304@gmail.com> References: <87wsm11x1j.fsf@snark.cb.piermont.com> <20080511151413.GA9137@synapse.neuralscape.com> <48274C1A.2020009@bogus.com> <20080511202232.GA11563@synapse.neuralscape.com> <48275AFA.3040504@scalableinformatics.com> <20080511232815.GA12628@synapse.neuralscape.com> <4827857F.3020304@gmail.com> Message-ID: <20080512021504.GG32767@bx9.net> On Sun, May 11, 2008 at 06:47:11PM -0500, Geoff Jacobs wrote: > You want a persistent, fault tolerant +5V supply. Anyone have any > thoughts about exactly how stable +5V standby off the PCI rails really is? Well, there are lots of add-in cards with Wake-on-Lan capability, right? They all use that supply. Generally all of that hardware stuff is well-engineered, well, at least better than the bios&ipmi software... 
-- greg From landman at scalableinformatics.com Sun May 11 19:42:46 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: <20080512021326.GF32767@bx9.net> References: <48248D8F.1070000@tamu.edu> <87tzh7cpwq.fsf@snark.cb.piermont.com> <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> <87prrvayj1.fsf@snark.cb.piermont.com> <87wsm11x1j.fsf@snark.cb.piermont.com> <20080512021326.GF32767@bx9.net> Message-ID: <4827AEA6.3020902@scalableinformatics.com> Greg Lindahl wrote: > On Sun, May 11, 2008 at 02:45:05AM -0400, Mark Hahn wrote: > >> I think the main issue is that ipmi needs a lan interface, though >> I guess there are some examples of it sharing a port with the host. > > Last I saw someone doing this, IPMI sharing an ethernet port with the > host led to all kinds of weird ARP problems. Whereas a dedicated port Not just arp problems. Sometimes shutting down the machine would actually turn off the port. Made for an "almost"-lights out machine room... 
-- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From shaeffer at neuralscape.com Sun May 11 19:58:27 2008 From: shaeffer at neuralscape.com (Karen Shaeffer) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: <4827857F.3020304@gmail.com> References: <87wsm11x1j.fsf@snark.cb.piermont.com> <20080511151413.GA9137@synapse.neuralscape.com> <48274C1A.2020009@bogus.com> <20080511202232.GA11563@synapse.neuralscape.com> <48275AFA.3040504@scalableinformatics.com> <20080511232815.GA12628@synapse.neuralscape.com> <4827857F.3020304@gmail.com> Message-ID: <20080512025827.GA13794@synapse.neuralscape.com> On Sun, May 11, 2008 at 06:47:11PM -0500, Geoff Jacobs wrote: > Karen Shaeffer wrote: > > > If the motherboard is powered down, then the IPMI board can > > restart it and can also report failures at the system board level. > > So the system board power distribution must be distinct from the > > IPMI power distribution, even when they connect to a node on the > > system board that has power all the time. Is that reasonable? If > > so, then it appears to me that a full IPMI implementation on the > > system board would need that distinct and independent power > > distribution that used to be on the daughterboard, if that > > capability is to be maintained. That is all I really meant with my > > comment. > You want a persistent, fault tolerant +5V supply. Anyone have any > thoughts about exactly how stable +5V standby off the PCI rails really is? Hi Geoff, Following my line of thought, the issue is only whether the PCI rail(s) powering all the IPMI circuitry are completely isolated from the rest of the system board circuitry or not. If true, then it would be equivalent to a daughterboard implementation. If false, then it would not. 
Thanks for your comments, Karen -- Karen Shaeffer Neuralscape, Palo Alto, Ca. 94306 shaeffer@neuralscape.com http://www.neuralscape.com From gerry.creager at tamu.edu Sun May 11 20:07:57 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Sat Jul 5 01:07:05 2008 Subject: [Beowulf] Re: Purdue Supercomputer In-Reply-To: <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> References: <48248D8F.1070000@tamu.edu> <87tzh7cpwq.fsf@snark.cb.piermont.com> <20080509142135.fnc1ywu720w8444o@webmail.jpl.nasa.gov> Message-ID: <4827B48D.7070902@tamu.edu> Student workers are cheap and expendable... Jim Lux wrote: > Quoting "Perry E. Metzger" , on Fri 09 May 2008 > 11:07:17 AM PDT: > >> >> Gerry Creager writes: >>> Late response: Brady self-laminating labels. They can be used with a >>> Brady printer or a laser printer. >> >> I've been wondering why the Canon P-Touch hasn't been mentioned up >> until this point. They're great for labeling just about anything in a >> machine room... > > Sure, until you have to label 1000 things in a few hours. I find it's > hard to get the backing peeled off (although my Ptouch doesn't have the > little feature that nicks the edge to make it easier). > > > Actually, you can order cables already pre numbered and labelled. Why > burn expensive cluster assembler time when you can pay someone > (potentially offshore) to do it cheaper. > > > > Jim -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: