9.2 (revision 937444cc2)
Score-P lets you configure several measurement parameters via environment variables. After the measurement run, you can find a scorep.cfg file in your experiment directory that contains the configuration of the measurement run. If you did not set any configuration values explicitly, this file contains the default values. The file is safe to use as input for a POSIX shell. For example, to reuse the configuration from a previous measurement run, do something like this:
$ set -a
$ . scorep.cfg
$ set +a
Measurement configuration variables have a specific type which accepts certain values.
String
An arbitrary character sequence; no white space trimming is done.
Path
Like String, but a path is expected. No validation is performed, though.
Boolean
A Boolean value; no white space trimming is done. The following values, compared case-insensitively, are accepted as true:
true
yes
on
Everything else is interpreted as the Boolean value false.
Number
A decimal number; white space trimming is done.
Size
Like Number, but also accepts case-insensitive unit suffixes after optional white space: B, Kb, Mb, Gb, Tb, Pb, Eb. The b suffix can be omitted.
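As an illustration of the unit suffixes described above, the following sketch sets two size-valued variables from this page using different suffix spellings. The concrete values are arbitrary examples, not recommendations.

```shell
# A decimal number plus an optional, case-insensitive unit suffix.
export SCOREP_TOTAL_MEMORY=32M      # trailing "b" omitted
export SCOREP_PAGE_SIZE="16 kb"     # white space before the suffix is allowed
```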
Set
A symbolic set. Accepted members are listed in the documentation of the variable. Multiple values are allowed, are case insensitive, and are subject to white space trimming. They can be separated with one of the following characters:
' ' - space
',' - comma
':' - colon
';' - semicolon
Acceptable values can also have aliases, which are listed in the documentation and separated by '/'. Values can be negated by preceding them with '~'. Order of evaluation is from left to right.
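A short sketch of how such a symbolic set might be written, assuming that SCOREP_MPI_ENABLE_GROUPS and SCOREP_CUDA_ENABLE (both described later on this page) are variables of this kind. The chosen members are only examples.

```shell
# Members may be separated by space, comma, colon, or semicolon.
export SCOREP_MPI_ENABLE_GROUPS="p2p,coll"

# Evaluated left to right: start from the "default" set, then negate "idle".
export SCOREP_CUDA_ENABLE="default,~idle"
```

Quoting the value keeps the shell from interpreting '~', ';', or spaces itself.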
Option
Like Set, but only one value can be selected.
This is the list of all known configuration variables to control a Score-P measurement. Use the scorep-info config-vars command to list only those supported by the installation in use.
SCOREP_ENABLE_PROFILING
Enable profiling
true
SCOREP_ENABLE_TRACING
Enable tracing
false
SCOREP_ENABLE_UNWINDING
Enables recording calling context information for every event
false
The calling context is the call chain of functions to the current position in the running program. This call chain will also be annotated with source code information if possible.
This is a prerequisite for sampling but also works with instrumented applications.
Note that when tracing is also enabled, Score-P does not write the usual Enter/Leave records into the OTF2 trace, but new records.
See also SCOREP_TRACING_CONVERT_CALLING_CONTEXT_EVENTS.
Note also that this suppresses events from the compiler instrumentation.
SCOREP_VERBOSE
Be verbose
false
SCOREP_TOTAL_MEMORY
Total memory in bytes per process to be consumed by the measurement system
16000k
It will be split into pages of size SCOREP_PAGE_SIZE (potentially reduced to a multiple of SCOREP_PAGE_SIZE). The maximum size is 4 GB minus one SCOREP_PAGE_SIZE.
SCOREP_PAGE_SIZE
Memory page size in bytes
8k
If not a power of two, SCOREP_PAGE_SIZE will be increased to the next larger power of two. SCOREP_TOTAL_MEMORY will be split up into pages of (the adjusted) SCOREP_PAGE_SIZE. Minimum size is 512 bytes.
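The adjustment rules for SCOREP_PAGE_SIZE and SCOREP_TOTAL_MEMORY can be sketched with plain shell arithmetic. This is only an illustration of the rules as stated here, not Score-P's implementation: the helper names are hypothetical, the inputs are plain byte counts, and treating values below 512 as 512 is an assumption.

```shell
# Round a byte count up to the next power of two, starting at the 512-byte minimum.
next_pow2() {
    _n=$1
    _p=512
    while [ "$_p" -lt "$_n" ]; do
        _p=$((_p * 2))
    done
    echo "$_p"
}

# Reduce a total byte count to a whole number of pages.
round_down_to_pages() {
    echo $(( $1 / $2 * $2 ))
}

page=$(next_pow2 6000)
total=$(round_down_to_pages 16000000 "$page")
echo "page=$page total=$total"   # prints: page=8192 total=15998976
```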
SCOREP_EXPERIMENT_DIRECTORY
Name of the experiment directory as child of the current working directory
""
The experiment directory is created directly under the current working directory. No parent directories will be created. The experiment directory is only created if it is requested by at least one substrate. When no experiment name is given (the default), Score-P names the experiment directory scorep-measurement-tmp and, after a successful measurement, renames it to a generated name based on the current time.
SCOREP_OVERWRITE_EXPERIMENT_DIRECTORY
Overwrite an existing experiment directory
true
If you specified an experiment directory name but a directory with this name already exists, you can force overwriting it with this flag. The previous experiment directory will be renamed.
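A minimal sketch of choosing a fixed experiment directory for repeated runs. The directory name scorep-run-01 is a hypothetical example.

```shell
# Created directly under the current working directory; parents are not created.
export SCOREP_EXPERIMENT_DIRECTORY=scorep-run-01
# Allow a rerun to reuse the name; the previous directory is renamed.
export SCOREP_OVERWRITE_EXPERIMENT_DIRECTORY=true
```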
SCOREP_MACHINE_NAME
The machine name used in profile and trace output
"Linux"
We suggest using a unique name, e.g., the fully qualified domain name. The default machine name was set at configure time (see the INSTALL file for customization options).
SCOREP_EXECUTABLE
Executable of the application
""
File name, preferably with full path, of the application's executable. This is a fallback if Score-P cannot determine the executable's name automatically. The name is required by some compiler adapters. They will complain if this environment variable is needed.
SCOREP_FORCE_CFG_FILES
Force the creation of experiment directory and configuration files
true
If this is set to true (which is the default), the experiment directory will be created along with some configuration files, even if no substrate writes data (i.e., profiling and tracing are disabled and no substrate plugin registered for writing).
If this is set to false, the directory will only be created if any substrate actually writes data.
SCOREP_TIMER
Timer used during measurement
tsc
The following timers are available for this installation:
tsc
Low overhead time stamp counter (X86_64) timer.
gettimeofday
gettimeofday timer.
clock_gettime
clock_gettime timer with CLOCK_MONOTONIC as clock.
SCOREP_DEBUG_UNIFY
Writes the pre-unified definitions also in the local definition trace files
true
SCOREP_PROFILING_TASK_EXCHANGE_NUM
Number of foreign task objects that are collected before they are put into the common task object exchange buffer
1K
The profiling creates a record for every running task instance. To avoid locking, the required memory is taken from a preallocated memory block, and each thread has its own block. On task completion, the created object can be reused by other tasks. However, if tasks migrate, the data structure migrates with them. Thus, if migration is imbalanced, from a source thread that starts the execution of tasks towards a sink thread that completes them, the source thread may continually create new task objects while released task objects accumulate in the sink. Once the sink has collected a certain number of task objects, it should therefore trigger a backflow of them. This requires locking, which should be avoided as much as possible, so the locking should not happen on every migrated task but only when a certain imbalance occurs. This environment variable determines the number of migrated task instances that must be collected before the backflow is triggered.
SCOREP_PROFILING_MAX_CALLPATH_DEPTH
Maximum depth of the calltree
100
SCOREP_PROFILING_BASE_NAME
Base for construction of the profile filename
"profile"
String that is used as the base to create the filenames for the profile files.
SCOREP_PROFILING_FORMAT
Profile output format
default
Sets the output format for the profile.
The following formats are supported:
tau_snapshot
[Deprecated in 9.1] Tau snapshot format. Limited to CPU locations (No Metric or GPU locations).
cube4
Stores the sum for every metric per callpath per location in Cube4 format.
cube_tuple
Stores an extended set of statistics (min, avg, max, sum, sum of squares) in Cube4 format.
default
Default format, i.e., cube4.
SCOREP_PROFILING_ENABLE_CLUSTERING
Enable clustering
true
SCOREP_PROFILING_CLUSTER_COUNT
Maximum cluster count for iteration clustering
64
SCOREP_PROFILING_CLUSTERING_MODE
Specifies the level of strictness when comparing call trees for equivalence
subtree
Possible levels:
none/0
No structural similarity required.
subtree/1
The sub-tree structure must match.
subtree_visits/2
The sub-tree structure and the number of visits must match.
mpi/3
The structure of the call-path to MPI calls must match.
Nodes that are not on an MPI call-path may differ.
mpi_visits/4
Like above, but the number of visits of the MPI calls must match, too.
mpi_visits_all/5
Like above, but the number of visits must also match on all nodes on the call-path to an MPI function.
SCOREP_PROFILING_CLUSTERED_REGION
Name of the clustered region
""
The clustering can only cluster one dynamic region. If the user defines more than one dynamic region, the region that is exited first is clustered. To cluster another region instead, specify its name in this variable. If the variable is unset or empty, the first exited dynamic region is clustered.
SCOREP_PROFILING_ENABLE_CORE_FILES
Write .core files if an error occurred
false
If an error occurs inside the profiling system, the profiling is disabled. For debugging, it can be useful to get the state of the local stack at these points. Enabling this feature is not recommended for large-scale measurements.
SCOREP_TRACING_USE_SION
Whether or not to use libsion as OTF2 substrate
false
SCOREP_TRACING_MAX_PROCS_PER_SION_FILE
Maximum number of processes that share one sion file (must be > 0)
1K
All processes are then evenly distributed over the number of needed files to fulfill this constraint. E.g., having 4 processes and setting the maximum to 3 would result in 2 files each holding 2 processes.
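The distribution in this example can be reproduced with integer arithmetic. This is a sketch of the stated rule only, not of libsion's or Score-P's code, and the helper name sion_file_count is hypothetical.

```shell
# Number of files needed so that no file holds more than $2 processes.
sion_file_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

procs=4
max_per_file=3
files=$(sion_file_count "$procs" "$max_per_file")
# Largest share when the processes are spread evenly over those files.
per_file=$(( (procs + files - 1) / files ))
echo "files=$files per_file=$per_file"   # prints: files=2 per_file=2
```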
SCOREP_TRACING_CONVERT_CALLING_CONTEXT_EVENTS
Write calling context information as a sequence of Enter/Leave events to trace
false
When recording the calling context of events (instrumented or sampled), the context can be stored in the trace either as the new CallingContext records from OTF2 or converted to the legacy Enter/Leave records. This variable controls the choice; the value false selects the CallingContext records.
This is only in effect if SCOREP_ENABLE_UNWINDING is on.
Note that enabling this will increase the number of records per event and lose the source code locations.
This option exists only for backwards compatibility with tools that cannot handle the new OTF2 records. It may thus be removed in future releases.
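For a tool that only understands Enter/Leave records, the combination described above might be configured as follows. This is a sketch; whether the conversion is needed depends on the analysis tool in use.

```shell
export SCOREP_ENABLE_TRACING=true
# The conversion only takes effect when unwinding is on.
export SCOREP_ENABLE_UNWINDING=true
export SCOREP_TRACING_CONVERT_CALLING_CONTEXT_EVENTS=true
```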
SCOREP_FILTERING_FILE
A file name which contains the filter rules
""
SCOREP_SUBSTRATE_PLUGINS
Specify list of used plugins
""
List of requested substrate plugin names that will be used during program run.
SCOREP_SUBSTRATE_PLUGINS_SEP
Separator of substrate plugin names
","
Character that separates plugin names in SCOREP_SUBSTRATE_PLUGINS.
SCOREP_LIBWRAP_PATH
Search path for user library wrapper plug-ins.
""
Colon-separated list of directories to search for user library wrapper plug-ins. The installation of Score-P is appended implicitly to the end.
SCOREP_LIBWRAP_ENABLE
Library wrapper plug-ins to load and enable.
""
List of user library wrapper plug-ins to load. Each entry is either a full path to a plug-in or a library wrapper name to search for in Score-P's installation or in SCOREP_LIBWRAP_PATH. Entries are separated by the characters given in SCOREP_LIBWRAP_ENABLE_SEP.
SCOREP_LIBWRAP_ENABLE_SEP
Separators for list of library wrapper plug-ins to load and enable.
","
Characters that delimit plug-in names in SCOREP_LIBWRAP_ENABLE.
SCOREP_METRIC_PAPI
PAPI metric names to measure
""
List of requested PAPI metric names that will be collected during program run.
SCOREP_METRIC_PAPI_PER_PROCESS
PAPI metric names to measure per-process
""
List of requested PAPI metric names that will be recorded only by first thread of a process.
SCOREP_METRIC_PAPI_SEP
Separator of PAPI metric names
","
Character that separates metric names in SCOREP_METRIC_PAPI and SCOREP_METRIC_PAPI_PER_PROCESS.
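A small sketch of how the PAPI variables fit together. PAPI_TOT_CYC and PAPI_TOT_INS are standard PAPI preset names, but whether they are available depends on the machine (see papi_avail).

```shell
# Switch the separator first, then list the metrics using it.
export SCOREP_METRIC_PAPI_SEP=";"
export SCOREP_METRIC_PAPI="PAPI_TOT_CYC;PAPI_TOT_INS"
```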
SCOREP_METRIC_RUSAGE
Resource usage metric names to measure
""
List of requested resource usage metric names that will be collected during program run.
SCOREP_METRIC_RUSAGE_PER_PROCESS
Resource usage metric names to measure per-process
""
List of requested resource usage metric names that will be recorded only by first thread of a process.
SCOREP_METRIC_RUSAGE_SEP
Separator of resource usage metric names
","
Character that separates metric names in SCOREP_METRIC_RUSAGE and SCOREP_METRIC_RUSAGE_PER_PROCESS.
SCOREP_METRIC_PLUGINS
Specify list of used plugins
""
List of requested metric plugin names that will be used during program run.
SCOREP_METRIC_PLUGINS_SEP
Separator of plugin names
","
Character that separates plugin names in SCOREP_METRIC_PLUGINS.
SCOREP_METRIC_PERF
PERF metric names to measure
""
List of requested PERF metric names that will be collected during program run.
SCOREP_METRIC_PERF_PER_PROCESS
PERF metric names to measure per-process
""
List of requested PERF metric names that will be recorded only by first thread of a process.
SCOREP_METRIC_PERF_SEP
Separator of PERF metric names
","
Character that separates metric names in SCOREP_METRIC_PERF and SCOREP_METRIC_PERF_PER_PROCESS.
SCOREP_SAMPLING_EVENTS
Set the sampling event and period: <event>[@<period>]
"perf_cycles@10000000"
This selects the interrupt source for sampling.
This is only in effect if SCOREP_ENABLE_UNWINDING is on.
Possible values:
- perf event (perf_<event>, see "perf list")
period in number of events, default: 10000000
e.g., perf_cycles@2000000
- PAPI event (PAPI_<event>, see "papi_avail")
period in number of events, default: 10000000
e.g., PAPI_TOT_CYC@2000000
- timer (POSIX timer, invalid for multi-threaded)
period in us, default: 10000
e.g., timer@2000
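The <event>[@<period>] syntax can be taken apart with shell parameter expansion, for instance to check a value before starting a run. The helper parse_sampling_event is hypothetical and only mirrors the syntax shown above; it is not how Score-P parses the variable.

```shell
# Split "<event>[@<period>]" into its two parts.
parse_sampling_event() {
    _spec=$1
    _event=${_spec%%@*}                # everything before the first '@'
    case $_spec in
        *@*) _period=${_spec#*@} ;;    # everything after the first '@'
        *)   _period="" ;;             # no period given; the default applies
    esac
    echo "event=$_event period=$_period"
}

parse_sampling_event "perf_cycles@2000000"   # prints: event=perf_cycles period=2000000
parse_sampling_event "timer"                 # prints: event=timer period=
```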
SCOREP_SAMPLING_SEP
Separator of sampling event names
","
Character that separates sampling event names in SCOREP_SAMPLING_EVENTS.
SCOREP_TOPOLOGY_PLATFORM
Record hardware topology information for this platform, if available.
true
SCOREP_TOPOLOGY_PROCESS
Record the Process x Thread topology.
true
SCOREP_TOPOLOGY_USER
Record topologies provided by user instrumentation
true
SCOREP_TOPOLOGY_MPI
Record MPI cartesian topologies.
true
SCOREP_SELECTIVE_CONFIG_FILE
A file name which configures selective recording
""
SCOREP_MPI_MAX_COMMUNICATORS
Determines the number of concurrently used communicators per process
50
SCOREP_MPI_MAX_WINDOWS
Determines the number of concurrently used windows for MPI one-sided communication per process
50
SCOREP_MPI_MAX_EPOCHS
Maximum amount of concurrently active access or exposure epochs per process
50
SCOREP_MPI_MAX_GROUPS
Maximum number of concurrently used MPI groups per process
50
SCOREP_MPI_ENABLE_GROUPS
The names of the function groups which are measured
default
Other functions are not measured.
Possible groups are:
all
All MPI functions
cg
Communicator and group management
coll
Collective functions
default
Default configuration.
Includes:
- cg
- coll
- env
- io
- p2p
- rma
- topo
- xnonblock
env
Environmental management
err
MPI Error handling
ext
External interface functions
io
MPI file I/O
p2p
Peer-to-peer communication
misc
Miscellaneous
perf
PControl
rma
One sided communication
spawn
Process management
topo
Topology
type
MPI datatype functions
xnonblock
This flag is deprecated and has no effect. Extended non-blocking communication is always on.
xreqtest
Test events for uncompleted requests
none/no
SCOREP_MPI_MEMORY_RECORDING
Enable tracking of memory allocations done by calls to MPI_ALLOC_MEM and MPI_FREE_MEM
false
Requires that the MISC group is also recorded.
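Since the misc group is not part of the default configuration listed above, MPI memory tracking might be enabled as in this sketch.

```shell
# Add "misc" to the default groups, then turn on MPI memory tracking.
export SCOREP_MPI_ENABLE_GROUPS="default,misc"
export SCOREP_MPI_MEMORY_RECORDING=true
```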
SCOREP_SHMEM_MEMORY_RECORDING
Enable tracking of memory allocations done by calls to the SHMEM allocation API
false
SCOREP_CUDA_ENABLE
CUDA measurement features
yes
Sets the CUDA measurement mode to capture.
Notes:
- Options required by other options will be included automatically.
- idle and pure idle are mutually exclusive.
- The tag (tracing only) indicates that profiling will not yield additional data from this option.
The following options or sets are available:
runtime
CUDA runtime API
driver
CUDA driver API
kernel
CUDA kernels
kernel_serial
Serialized kernel recording
kernel_counter
Fixed CUDA kernel metrics
kernel_callsite
Track kernel callsites between launch and execution
memcpy
CUDA memory copies
sync
Record implicit and explicit CUDA synchronization
idle
GPU compute idle time
pure_idle
GPU idle time (memory copies are not idle)
gpumemusage
Record CUDA memory (de)allocations as a counter
references
Record references between CUDA activities (tracing only)
dontflushatexit
Disable flushing CUDA activity buffer at program exit
flushatexit
[DEPRECATED] Flush CUDA activity buffer at program exit (see dontflushatexit)
default/yes/1
CUDA runtime API and GPU activities.
Includes:
- driver
- kernel
- kernel_counter
- memcpy
- idle
- sync
- gpumemusage
- references
none/no
SCOREP_CUDA_BUFFER
Total memory in bytes for the CUDA record buffer
1M
SCOREP_CUDA_BUFFER_CHUNK
Chunk size in bytes for the CUDA record buffer (ignored for CUDA 5.5 and earlier)
8k
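A sketch of a reduced CUDA configuration with a larger record buffer. The feature selection and the buffer size are illustrative values, not tuned recommendations.

```shell
# Record only the runtime API, kernels, and memory copies.
export SCOREP_CUDA_ENABLE="runtime,kernel,memcpy"
# Raise the record buffer above its 1M default.
export SCOREP_CUDA_BUFFER=4M
```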
SCOREP_OPENCL_ENABLE
OpenCL measurement features
no
Sets the OpenCL measurement mode to capture:
api
OpenCL runtime API
kernel
OpenCL kernels
memcpy
OpenCL buffer reads/writes
default/yes/true/1
OpenCL API and GPU activities.
Includes:
- api
- kernel
- memcpy
none/no
SCOREP_OPENCL_BUFFER_QUEUE
Memory in bytes for the OpenCL command queue buffer
8k
SCOREP_OPENACC_ENABLE
OpenACC measurement features
no
Sets the OpenACC measurement mode to capture:
regions
OpenACC regions
wait
OpenACC wait operations
enqueue
OpenACC enqueue operations (kernel, upload, download)
device_alloc
OpenACC device memory allocations
kernel_properties
Record kernel properties such as the kernel name as well as the gang, worker and vector size for kernel launch operations
variable_names
Record variable names for OpenACC data allocation and enqueue upload/download
default/yes/1
OpenACC regions, enqueue and wait operations.
Includes:
- regions
- wait
- enqueue
none/no
SCOREP_MEMORY_RECORDING
Memory recording
false
Memory (de)allocations are recorded via the libc/C++ API.
SCOREP_KOKKOS_ENABLE
Kokkos measurement features
no
Sets the Kokkos measurement mode to capture:
regions
Kokkos parallel regions
user
Kokkos user regions
malloc
Kokkos memory allocation
memcpy
Kokkos deep copy
default/yes/1
Kokkos parallel regions, user regions, and allocations
none/no
SCOREP_HIP_ENABLE
HIP measurement features
yes
Sets the HIP measurement mode to capture:
api
All HIP API calls
kernel
HIP kernels
kernel_callsite
Track kernel callsites between launch and execution. Depends on 'kernel', therefore enables that too
malloc
HIP allocations
memcpy
HIP memory copies
sync
HIP synchronization
user
User instrumentation through ROCTX API
default/yes/1/true
HIP tracing
none/no
SCOREP_HIP_ACTIVITY_BUFFER_SIZE
HIP device activity buffer size
1M
Buffer size for device activity events. Must be a power of two.
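Because the buffer size has to be a power of two, a candidate value can be checked beforehand. The helper is_power_of_two is hypothetical and works on plain byte counts.

```shell
# Succeeds if the argument is a positive power of two.
is_power_of_two() {
    [ "$1" -gt 0 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

size=1048576
if is_power_of_two "$size"; then
    echo "$size is a power of two"
else
    echo "$size is not a power of two"
fi
# prints: 1048576 is a power of two
```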
SCOREP_OPENMP_TARGET_ENABLE
OpenMP target measurement features
yes
Sets the OpenMP target measurement mode to capture.
kernel
Enable collection of OpenMP target kernel events. On the host side, this includes events related to launching kernels, i.e. !$omp target and !$omp target submit. If the device tracing interface is available, this also includes accelerator events, i.e. kernel executions.
memory
Enable collection of OpenMP target memory events. This includes both events related to data transfers, e.g. OpenMP directives invoking data transfers like !$omp target enter data, and allocations, e.g. done by OpenMP runtime calls like omp_target_alloc.
default/yes/1
Enable all OpenMP target features.
none/no
SCOREP_OPENMP_TARGET_BUFFER_CHUNK_SIZE
Chunk size in bytes for the OMPT target trace record buffer.
8k
SCOREP_OPENMP_TARGET_DEVICE_TRACING_ENABLE
Usage of OMPT device tracing interface for recording accelerator events
yes
Enable/Disable the usage of the device tracing interface to collect accelerator events. If disabled, only host events will be recorded. Accelerator events can still be collected using a native accelerator adapter, such as CUDA.
SCOREP_IO_POSIX
POSIX I/O, POSIX async I/O, ISO C I/O
false