[Beowulf] Help with inconsistent network performance
Mark Hahn hahn at mcmaster.ca
Tue Dec 18 20:52:25 PST 2007

> The machines are running the 2.6 kernel and I have confirmed that the max
> TCP send/recv buffer sizes are 4MB (more than enough to store the full
> 512x512 image).

the bandwidth-delay product in a lan is low enough to not need this kind
of tuning.

> I loop with the client side program sending a single integer to rank 0, then
> rank 0 broadcasts this integer to the other nodes, and then all nodes send
> back 1MB / N of data.

hmm, that's a bit harsh, don't you think? why not have the rank0/master
ask each slave for its contribution sequentially? sure, it introduces a
bit of "dead air", but it's not as if two slaves can stream to a single
master at once anyway (each can saturate its link, therefore the master's
link is N-times overcommitted.)

> To make sure there was not an issue with the MPI broadcast, I did one test
> run with 5 nodes only sending back 4 bytes of data each. The result was a
> RTT of less than 0.3 ms.

isn't that kind of high? a single ping-pong latency should be ~50 us -
maybe I'm underestimating the latency of the broadcast itself.

> One interesting pattern I noticed is that the hiccup frame RTTs, almost
> without exception, fall into one of three ranges (approximately 50-60,
> 200-210, and 250-260). Could this be related to exponential back-off?

perhaps introduced by the switch, or perhaps by the fact that the bcast
isn't implemented as an atomic (eth-level) broadcast.

> Tomorrow I will experiment with jumbo frames and flow control settings (both
> of which the HP Procurve claims to support). If these do not solve the
> problems I will start sifting through tcpdump.

I would simply serialize the slaves' responses first. the current design
tries to trigger all the slaves to send results at once, which is simply
not logical if you think about it, since any one slave can saturate the
master's link.

regards, mark hahn.
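
The two arguments in this reply (a small LAN bandwidth-delay product, and a master link that is N-times overcommitted when every slave sends at once) can be checked with a little arithmetic. The sketch below is only a back-of-envelope model: the gigabit link speed and the 100 us round trip are assumptions that are not stated in the thread, and the function names are made up for illustration.

```python
# Back-of-envelope model. Link speed and RTT are ASSUMPTIONS, not from the thread.
LINK_BITS_PER_S = 1e9      # assumed gigabit Ethernet
RTT_S = 100e-6             # assumed LAN round trip (about 2 x 50 us one-way)
PAYLOAD_BYTES = 1 << 20    # 1MB returned per frame, as in the quoted post


def bdp_bytes(link_bits_per_s, rtt_s):
    """Bandwidth-delay product: bytes in flight needed to keep the link full."""
    return link_bits_per_s * rtt_s / 8


def wire_time_s(nbytes, link_bits_per_s):
    """Ideal time to move nbytes across one link, ignoring protocol overhead."""
    return nbytes * 8 / link_bits_per_s


def offered_load_factor(n_senders):
    """Load offered to the master's link if all senders transmit at line rate
    simultaneously, as a multiple of that link's capacity."""
    return n_senders


if __name__ == "__main__":
    print("BDP (bytes):", bdp_bytes(LINK_BITS_PER_S, RTT_S))
    print("1MB wire time (ms):", wire_time_s(PAYLOAD_BYTES, LINK_BITS_PER_S) * 1e3)
    print("overcommit with 5 senders:", offered_load_factor(5))
```

Under these assumptions the bandwidth-delay product is roughly 12.5 KB, far below the 4MB buffers, and the master's link needs about 8.4 ms to receive 1MB no matter how the slaves are scheduled. Polling the slaves one at a time therefore costs little in the ideal case, while avoiding N simultaneous line-rate senders converging on one port.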
