User and maintenance manual for the IBM pSeries
IBM pSeries High Performance Switch Tuning and Debug Guide
Version 1.0, April 2005
IBM Systems and Technology Group, Cluster Performance Department, Poughkeepsie, NY
pshpstuningguidewp040105.doc Page 2
Contents
1.0 Introduction
2.0 Tunables and settings for switch software
5.10 MP_PRINTENV
5.11 MP_STATISTICS
1.0 Introduction
This paper is intended to help you tune and debug the performance of the IBM® pSeries® High Performance Switch (HPS) on IBM Cluster 1600 systems.
2.0 Tunables and settings for switch software
To optimize the HPS, you can set shell variables for Parallel Environment MPI-based workloads and for IP-based workloads.
thread, and from within the MPI/LAPI polling code that is invoked when the application makes blocking MPI calls.
2.1.5 MP_TASK_AFFINITY
Setting MP_TASK_AFFINITY to SNI tells parallel operating environment (POE) to bind each task to the MCM containing the HPS adapter it will use, so that the adapter, CPU, and memory used by any task are all local to the same MCM.
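As a minimal sketch, the variable can be exported in the shell that launches the job, before POE starts. The echo line is only there to confirm the value; how the job itself is launched is not shown here and depends on the site.

```shell
# Bind each MPI task to the MCM that contains its HPS adapter.
export MP_TASK_AFFINITY=SNI

# Confirm the value that child processes will inherit.
echo "MP_TASK_AFFINITY=${MP_TASK_AFFINITY}"
```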
Sometimes MPI-IO is used in an application as if it were basic POSIX read/write, either because there is no need for more complex read/write patterns or because the application was previously hand-optimized to use POSIX read/write.
rfifosize  0x1000000   receive fifo size     False
rpoolsize  0x02000000  IP receive pool size  True
spoolsize  0x02000000  IP send pool size     True

3.0 Tunables and settings for AIX 5L
Several settings in AIX 5L impact the performance of the HPS.
The overhead in maintaining the file cache can impact the performance of large parallel applications. Much of the overhead is associated with the sync() system call (by default, run every minute from the syncd daemon).
3.3.1 svmon
The svmon command provides information about the virtual memory usage by the kernel and user processes in the system at any given time.
PageSize   Inuse    Pin   Pgsp  Virtual
4KB       448221   3687   2675   449797
16MB           0      0      0        0

Vsid  Esid  Type  Description  LPage  Inuse  Pin  Pgsp  Virtual
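The Inuse column counts pages, not bytes. Here is a small worked conversion for the 4KB row of the sample output above, using plain shell integer arithmetic. The variable names are illustrative only.

```shell
# Values taken from the 4KB row of the sample output.
inuse_pages=448221
page_kb=4

# Pages -> KB -> MB (integer division, so the MB figure is rounded down).
inuse_kb=$((inuse_pages * page_kb))
inuse_mb=$((inuse_kb / 1024))

echo "4KB pages in use: ${inuse_kb} KB (about ${inuse_mb} MB)"
```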
statistics in 5-second intervals, with the first set of statistics being the statistics since the node or LPAR was last booted.
adapter is configured. The volume of reservation is proportional to the number of user windows configured on the HPS adapter. A private window is required for each MPI task. Here is a formula to calculate the number of TLPs needed by the HPS adapter.
3.5 Large pages and IP support
One of the most important ways to improve IP performance on the HPS is to ensure that large pages are enabled. Large pages are required to allocate a number of large pages which will be used by the HPS IP driver at boot time.
If you have eight cards for p690 (or four cards for p655), this command also indicates whether you have full memory bandwidth.

3.8 Debug settings in the AIX 5L kernel
The AIX 5L kernel has several debug settings that affect the performance of an application.
4.2 LoadLeveler daemons
The LoadLeveler® daemons are needed for MPI applications using HPS. However, you can lower the impact on a parallel application by changing the default settings for these daemons.
SCHEDD_DEBUG = -D_ALWAYS

4.3 Settings for AIX 5L threads
Several variables help you use AIX 5L threads to tune performance. These are the recommended initial settings for AIX 5L threads when using HPS.
5.0 Debug settings and data collection tools
Several debug settings and data collection tools can help you debug a performance problem on systems using HPS.
5.3 Affinity LPARs
On p690 systems, if you are running with more than one LPAR for each CEC, make sure you are running affinity LPARs. To check affinity between CPU, memory, and HPS links, run the associativity scripts on the LPARs.
On the HMC GUI, select Service Applications -> Service Focal Point -> Select Serviceable Events.

5.7 errpt command
On AIX 5L, the errpt command lists a summary of system error messages.
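A minimal sketch of viewing that summary from a shell follows. errpt exists only on AIX, so the wrapper function (a hypothetical helper, not part of AIX) checks for it first. The 20-line limit is an arbitrary choice for illustration.

```shell
# Hypothetical helper: show the first lines of the errpt summary.
show_error_summary() {
    if command -v errpt >/dev/null 2>&1; then
        # errpt with no flags prints one summary line per logged error.
        errpt | head -n 20
    else
        echo "errpt not found (this is not an AIX system)"
    fi
}

show_error_summary
```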
• For HAL libraries: dsh sum /usr/sni/aix52/lib/libhal_r.a
• For MPI libraries: dsh sum /usr/lpp/ppe.
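Comparing the checksums by eye gets tedious on a large cluster. The sketch below assumes that each line of dsh output has the form "node: checksum blocks path"; that layout, the helper name check_sums, and the sample data are all assumptions for illustration, not output captured from a real system.

```shell
# Hypothetical helper: read "node: checksum blocks path" lines on stdin and
# report whether every node returned the same checksum.
check_sums() {
    # Field 2 is the checksum; count how many distinct values appear.
    distinct=$(awk '{ print $2 }' | sort -u | wc -l | tr -d ' ')
    if [ "$distinct" -eq 1 ]; then
        echo "consistent"
    else
        echo "MISMATCH"
    fi
}

# Made-up sample data standing in for real dsh output.
printf '%s\n' \
    'node1: 12345 210 /usr/sni/aix52/lib/libhal_r.a' \
    'node2: 12345 210 /usr/sni/aix52/lib/libhal_r.a' \
    'node3: 54321 210 /usr/sni/aix52/lib/libhal_r.a' | check_sums
```

On a cluster, the real command output would be piped into the helper instead of the sample data, provided the output matches the assumed layout.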
MEMORY_AFFINITY
Single Thread Usage (MP_SINGLE_THREAD)
Hints Filtered (MP_HINTS_FILTERED)
MPI-I/O Buffer Size (MP_IO_BUFFER_SIZE)
M.
MPCI: sends = 14
MPCI: sendsComplete = 14
MPCI: sendWaitsComplete = 17
MPCI: recvs = 17
MPCI: recvWaitsComplete = 13
MPCI: earlyArriv.
Run the following command: /usr/sbin/ifsn_dump -a
The data is collected in sni.snap (sni_dump.
To help you isolate the exact cause of packet drops, the ifsn_dump -a command also lists the following debug statistics. If you isolate packet drops to these statistics, you will probably need to contact IBM support.
There are two routes.
sending packet using route No. 1
ml ip address structure, starting:
ml flag (ml interface up or down) = 0x00000000
ml tick = 0
ml ip address = 0xc0a80203, 192.
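The ml ip address is printed as a 32-bit hexadecimal value, one byte per octet. The sketch below decodes such a value into dotted-decimal form with shell arithmetic. The helper name hex_to_ip is hypothetical and is not part of any IBM tool.

```shell
# Hypothetical helper: convert a 32-bit hex address to dotted-decimal form.
hex_to_ip() {
    n=$(( $1 ))
    # Shift each byte down to the low 8 bits and mask it off.
    echo "$(( (n >> 24) & 255 )).$(( (n >> 16) & 255 )).$(( (n >> 8) & 255 )).$(( n & 255 ))"
}

hex_to_ip 0xc0a80203   # prints 192.168.2.3
```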
MAC WOF (2F870): Bit: 1 [...]

5.12.4 Packets dropped in the switch hardware
If a packet is dropped within the switch hard.
5.14 LAPI_DEBUG_COMM_TIMEOUT
If the LAPI protocol experiences communication timeouts, set the environment variable LAPI_DEBUG_COMM_TIMEOUT to PAUSE.
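A minimal sketch of doing this for a single debugging session follows. It assumes the job is launched from the same shell afterwards, and the launch step itself is not shown.

```shell
# Set the variable only in the shell used for the debugging run.
export LAPI_DEBUG_COMM_TIMEOUT=PAUSE

# Confirm the value before launching the job from this shell.
echo "LAPI_DEBUG_COMM_TIMEOUT=${LAPI_DEBUG_COMM_TIMEOUT}"
```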
5.16 AIX 5L trace for daemon activity
If you suspect that a system daemon is causing a performance problem on your system, run AIX 5L trace to check for daemon activity.
7.2 MPI documentation
Parallel Environment for AIX 5L V4.1.1 Hitchhiker's Guide, SA22-7947-01
Parallel Environment for AIX 5L V4.1.1 Operation and Use, Volume 1, SA22-7948-01
Parallel Environment for AIX 5L V4.
© IBM Corporation 2005
IBM Corporation, Marketing Communications, Systems Group, Route 100, Somers, New York 10589
Produced in the United States of America, April 2005. All Rights Reserved.
This document was developed for products and/or services offered in the United States.