UPC performance analysis

GASP is an interface for instrumenting UPC applications. It allows performance analysis tools to gather information about application execution. To gather such data, you implement a number of call-back functions, which receive the information while the application runs, and then compile your implementation along with the user code. In a nutshell, GASP is a collection of hooks inserted at the beginning and at the end of UPC library functions. Each time such a function is called, you receive data through the call-backs. It’s then up to you what to do with that data: count how many times a function has been called, calculate how much data has been transferred between threads, find out how much time has been spent in barriers, and so on.

GASP is described in detail in its specification, and there are also several articles on the topic; you can find them at the official GASP site. GASP is implemented in Berkeley UPC, and there is also limited support in GCC UPC.
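To make the call-back idea more concrete, here is a minimal sketch in plain C of the bookkeeping a tool might do: counting calls, summing transferred bytes, and accumulating the time between the start and end notifications of an event such as a barrier. It does not use the real GASP headers. The names sketch_event, sketch_phase, sketch_stats and sketch_notify are hypothetical and exist only for this illustration, and the timestamp is passed in explicitly to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event tags for this sketch; the real GASP headers
   define their own event constants. */
enum sketch_event { SKETCH_EV_BARRIER, SKETCH_EV_PUT, SKETCH_EV_GET, SKETCH_EV_COUNT };

/* Whether the hook fires on entry to or on exit from a library function. */
enum sketch_phase { SKETCH_START, SKETCH_END };

/* Per-event statistics collected by the call-back. */
struct sketch_stats {
    unsigned long calls[SKETCH_EV_COUNT];      /* completed calls */
    unsigned long bytes[SKETCH_EV_COUNT];      /* bytes reported on completion */
    double        seconds[SKETCH_EV_COUNT];    /* total time between START and END */
    double        started_at[SKETCH_EV_COUNT]; /* timestamp of the pending START */
};

/* Hypothetical call-back: a START notification records the timestamp,
   an END notification updates the counters for that event. */
void sketch_notify(struct sketch_stats *s, enum sketch_event ev,
                   enum sketch_phase phase, double now, size_t nbytes)
{
    assert(s != NULL);
    assert((int)ev >= 0 && (int)ev < SKETCH_EV_COUNT);

    if (phase == SKETCH_START) {
        s->started_at[ev] = now;
    } else {
        s->calls[ev] += 1;
        s->bytes[ev] += (unsigned long)nbytes;
        s->seconds[ev] += now - s->started_at[ev];
    }
}
```

A tool built this way would zero-initialize one sketch_stats per thread, call sketch_notify from its start and end hooks, and print the totals at exit. A real tool would take its timestamps from a clock and handle nested or overlapping events, which this sketch deliberately ignores.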

upc_dump

There are several tools that can use GASP to analyze UPC applications. The simplest of them is upc_dump, which is included with the Berkeley UPC compiler. It doesn’t actually analyze the application; it simply dumps all tracing information. Here is how you compile a UPC application with upc_dump:

> /opt/bupc-runtime-2.8.0/opt_inst/bin/upcc -network=udp --inst-toolname=dump /opt/bupc-runtime-2.8.0/opt_inst/bin/gasp-dump/gaspu.upc -L/opt/bupc-runtime-2.8.0/opt_inst/bin/gasp-dump -lgasp-dump -T=4 --inst bin_file

First of all, you need to compile your code with the instrumented version of UPC from the opt_inst subdirectory. I described how to compile opt_inst earlier in this post. With --inst-toolname you provide the name of the instrumentation tool (it seems that it doesn’t matter what you specify here). Then you compile your application along with gaspu.upc. Developers of performance analysis tools usually put the GASP call-back implementations into a library and all upcalls to UPC code into a separate source file (such as gaspu.upc for upc_dump), since the library is compiled with a C compiler and cannot include UPC code. You then provide the path to the library with the -L flag and the name of the library with the -l flag. Finally, the --inst flag instructs the compiler to instrument all UPC library functions. The --inst-functions flag additionally instruments all user-defined functions, and --inst-local instruments all local accesses to shared memory within the thread.

upc_trace

upc_trace is another tool distributed with Berkeley UPC, and unlike upc_dump it has some performance analysis functionality. You also don’t need to specify all the compiler flags manually. What you need to do is compile an opt_trace sub-build and then run the application through upcrun with the -trace flag. When execution completes, you will have several trace files, which you pass to the upc_trace tool. The performance analysis results are produced as a text file. You can control which types of events are gathered with the GASNET_TRACEMASK environment variable.

Parallel Performance Wizard

PPW is probably the only performance analysis tool for UPC with rich functionality. Installing PPW is simple:

# ./configure --prefix=/home/fred/ppw-2.6 --with-upc=/home/fred/bupc-runtime-2.10.0
# make
# make install

Then, to run a simple test, use the PPW wrappers for upcc and upcrun:

> ppwupcc -network=udp --inst-functions -T=32 upc_code.c
> UPC_NODES="n1 n2 n3 n4" ppwrun --output=upc_code.par upcrun -n 32 -nodes=4 bin_file

PPW has a GUI in which you open the .par file to see what happened in the application.

This entry was posted on October 8, 2012 at 8:06 am and is filed under HPC.

Niktips's Blog