Oprofile

1. Oprofiles are generated by regularly sampling the current registers on each CPU (from an interrupt handler, the saved PC value at the time of interrupt is stored), and converting that runtime PC value into something meaningful to the programmer.

2.taking the stream of sampled PC values, along with the detail of which task was running at the time of the interrupt, and converting into a file offset against a particular binary file. ( The mmap() function shall establish a mapping between a process’ address space and a file, shared memory object, or [TYM] [Option Start] typed memory object.)

3. In common operation, the time between each sample interrupt is regulated by a fixed number of clock cycles.

4.These hardware performance counters increment once per each event, and generate an interrupt on reaching some pre-defined number of events. OProfile can use these interrupts to generate samples: then, the profile results are a statistical approximation of which code caused how many of the given event.

5. OProfile implements a pseudo-filesystem known as “oprofilefs”, mounted from userspace at /dev/oprofile. This consists of small files for reporting and receiving configuration from userspace, as well as the actual character device that the OProfile userspace receives samples from.

6. The OProfile userspace daemon’s job is to take the raw data provided by the kernel and write it to the disk.

7. “unit mask” is borrowed from the Intel architectures, and can further specify exactly when a counter is incremented (for example, cache-related events can be restricted to particular state transitions of the cache lines).

All of the available hardware events and their details are specified in the textual files in the events directory. The syntax of these files should be fairly obvious. The user specifies the names and configuration details of the chosen counters via opcontrol.

8.The actual sample file name is dot-separated, where the fields are, in order: event name, event count, unit mask, task group ID, task ID, and CPU number.

9.

application image
The primary binary image used by an application. This is derived from the kernel and corresponds to the binary started upon running an application: for example, /bin/bash.

binary image
An ELF file containing executable code: this includes kernel modules, the kernel itself (a.k.a. vmlinux), shared libraries, and application binaries.

dcookie
Short for “dentry cookie”. A unique ID that can be looked up to provide the full path name of a binary image.

dependent image
A binary image that is dependent upon an application, used with per-application separation. Most commonly, shared libraries. For example, if /bin/bash is running and we take some samples inside the C library itself due to bash calling library code, then the image /lib/libc.so would be dependent upon /bin/bash.

merging
This refers to the ability to merge several distinct sample files into one set of data at runtime, in the post-profiling tools. For example, per-thread sample files can be merged into one set of data, because they are compatible (i.e. the aggregation of the data is meaningful), but it’s not possible to merge sample files for two different events, because there would be no useful meaning to the results.

profile class
A collection of profile data that has been collected under the same class template. For example, if we’re using opreport to show results after profiling with two performance counters enabled profiling DATA_MEM_REFS and CPU_CLK_UNHALTED, there would be two profile classes, one for each event. Or if we’re on an SMP system and doing per-cpu profiling, and we request opreport to show results for each CPU side-by-side, there would be a profile class for each CPU.

profile specification
The parameters the user passes to the post-profiling tools that limit what sample files are used. This specification is matched against the available sample files to generate a selection of profile data.

profile template
The parameters that define what goes in a particular profile class. This includes a symbolic name (e.g. “cpu:1”) and the code-usable equivalent.

Advertisements
This entry was posted in Uncategorized. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s