[Beowulf] programming multicore clusters
Joseph Mack NA3T jmack at wm7d.net
Thu Jun 14 05:53:58 PDT 2007
On Wed, 13 Jun 2007, Greg Lindahl wrote:

>> Still if a flat, one network model is used, all processes
>> communicate through the off-board networking.
>
> No, the typical MPI implementation does not use off-board networking
> for messages to local ranks. You use the same MPI calls, but the
> underlying implementation uses shared memory when possible.

My apparently erroneous assumption was that in a beowulf of
quadcore processors, each processor would be assigned a
random rank, in which case adjacent processors in the
quadcore package would not be working on adjacent parts of
the compute space.

What's the mechanism for assigning a processor a particular
rank? (a url, or pointer to the MPI docs is fine). How does
MPI know that one process is running on the same mobo and to
use shared memory and that another process is running
off-board? I take it there's a map somewhere other than the
machines.LINUX file?

>> Someone with a quadcore machine, running MPI on a flat
>> network, told me that their application scales poorly to
>> 4 processors.
>
> Which could be because he's out of memory bandwidth, or
> network bandwidth, or message rate. There are a lot of
> potential reasons.

OK

>> In a quadcore machine, if 4 OMP/threads processes are
>> started on each quadcore package, could they be
>> rescheduled at the end of their timeslice, on different
>> cores arriving at a cold cache?
>
> Most MPI and OpenMP implementations lock processes to
> cores for this very reason.

am off googling

>> In a single image machine (with a single address space)
>> how does the OS know to malloc memory from the on-board
>> memory, rather than some arbitrary location (on another
>> board)?
>
> Generally the default is to always malloc memory local to
> the process. Linux grew this feature when it started being
> used on NUMA machines like the Altix and the Opteron.

ditto

>> I expect everyone here knows all this. How is everyone
>> going to program the quadcore machines?
>
> Using MPI?

I see. It's a bit clearer now.

> You can go read up on new approaches like UPC, Co-Array
> Fortran, Global Arrays, Titanium, Chapel/X-10/Fortress,
> etc, but MPI is going to be the market leader for a long
> time.

Thanks Joe

-- 
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml
Homepage http://www.austintek.com/ It's GNU/Linux!
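
The rank-placement question in the message above can be illustrated with a toy model. This is only a sketch and is not taken from any MPI implementation: it assumes a 1-D decomposition in which rank r mainly exchanges data with rank r+1, and the host names, the function names place_ranks and local_neighbor_links, and the two policy labels are all hypothetical. Real launchers decide placement from the machine/host file and their own launch options, so check the documentation of the MPI you use.

```python
def place_ranks(hosts, slots_per_host, nprocs, policy="block"):
    """Return a list where entry r is the host that rank r runs on.

    policy="block":  fill every slot on a host before moving to the next.
    policy="cyclic": deal ranks out one per host, round-robin.
    """
    if nprocs > len(hosts) * slots_per_host:
        raise ValueError("more ranks than available slots")
    if policy == "block":
        return [hosts[r // slots_per_host] for r in range(nprocs)]
    if policy == "cyclic":
        return [hosts[r % len(hosts)] for r in range(nprocs)]
    raise ValueError(f"unknown policy: {policy}")


def local_neighbor_links(placement):
    """Count neighbouring rank pairs (r, r+1) that share a host.

    Pairs on the same host are the ones an implementation could
    serve through shared memory instead of the network.
    """
    return sum(1 for a, b in zip(placement, placement[1:]) if a == b)


if __name__ == "__main__":
    hosts = ["node0", "node1"]  # hypothetical quadcore nodes
    for policy in ("block", "cyclic"):
        placement = place_ranks(hosts, 4, 8, policy)
        print(policy, placement, local_neighbor_links(placement))
```

With two quadcore hosts and eight ranks, block placement keeps six of the seven neighbouring pairs on the same board, while cyclic placement keeps none, which is why the order in which ranks are laid out over the hosts matters for an application whose neighbours talk mostly to each other.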
