Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

perf powerpc: Allocate and setup aux buffer queue to help correlate with other events across CPUs

When the Dispatch Trace Log (DTL) data is collected along with other
events, such as sched tracepoint events, it needs to be correlated and
presented interleaved with those events.

Perf events can be collected in parallel across the CPUs. Hence the
events and DTL entries must be processed in timestamp order.

An auxtrace_queue is created for each CPU.

Data within each queue is in increasing order of timestamp. Each
auxtrace queue has an array/list of auxtrace buffers.

When an auxtrace buffer is processed, its data is mmap'ed.

All auxtrace queues are maintained in the auxtrace heap.

Each queue has a queue number and a timestamp.

The queues are added to the heap ordered by timestamp.

So the lowest timestamp (the entries to be processed first) is always on
top of the heap.
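
The ordering described above can be illustrated with a small binary
min-heap keyed by timestamp. This is only a sketch of the idea, not the
perf auxtrace heap implementation: the names ts_heap, heap_item,
ts_heap_add, ts_heap_pop and the MAX_QUEUES capacity are all
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_QUEUES 64	/* hypothetical capacity for this sketch */

struct heap_item {
	unsigned int queue_nr;	/* which per-CPU queue this entry refers to */
	uint64_t timestamp;	/* timestamp of that queue's next entry */
};

struct ts_heap {
	struct heap_item items[MAX_QUEUES];
	size_t cnt;
};

static void swap_items(struct heap_item *a, struct heap_item *b)
{
	struct heap_item tmp = *a;

	*a = *b;
	*b = tmp;
}

/* Insert a queue; returns 0 on success, -1 if the heap is full. */
static int ts_heap_add(struct ts_heap *h, unsigned int queue_nr, uint64_t ts)
{
	size_t i;

	if (h->cnt == MAX_QUEUES)
		return -1;

	i = h->cnt++;
	h->items[i].queue_nr = queue_nr;
	h->items[i].timestamp = ts;

	/* Sift up until the parent is not later than the new item. */
	while (i > 0) {
		size_t parent = (i - 1) / 2;

		if (h->items[parent].timestamp <= h->items[i].timestamp)
			break;
		swap_items(&h->items[parent], &h->items[i]);
		i = parent;
	}
	return 0;
}

/* Remove and return the item with the lowest timestamp. */
static struct heap_item ts_heap_pop(struct ts_heap *h)
{
	struct heap_item top;
	size_t i = 0;

	assert(h->cnt > 0);
	top = h->items[0];
	h->items[0] = h->items[--h->cnt];

	/* Sift down until both children are not earlier than the item. */
	for (;;) {
		size_t l = 2 * i + 1, r = l + 1, min = i;

		if (l < h->cnt && h->items[l].timestamp < h->items[min].timestamp)
			min = l;
		if (r < h->cnt && h->items[r].timestamp < h->items[min].timestamp)
			min = r;
		if (min == i)
			break;
		swap_items(&h->items[i], &h->items[min]);
		i = min;
	}
	return top;
}
```

Adding queues with timestamps 300, 100 and 200 and then popping
repeatedly yields them in the order 100, 200, 300, which is the order in
which their entries would need to be processed.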

The auxtrace queue needs to be allocated, and the heap needs to be
populated in sorted order of timestamp.

The queue needs to be filled with data only once, via the
powerpc_vpadtl__update_queues() function.

powerpc_vpadtl__setup_queues() iterates through all the entries of the
queue array to allocate and set up each auxtrace queue.

To add a queue to the auxtrace heap, the timebase of the first entry of
each queue must be fetched.

The first entry in the queue for the VPA DTL PMU holds the boot timebase
and frequency details, which are needed to compute the timestamp used to
correlate with other events.

The very next entry is the actual trace data, which provides the
timestamp of the occurrence of the DTL event.
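
A minimal sketch of that first-entry handling is shown here. The struct
layout, the names boot_info_entry, queue_state and skip_boot_entry, and
the use of a zero timebase field as the marker are assumptions made for
illustration; they are not taken from the patch's headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DTL_ENTRY_SIZE 48	/* every entry, including the first, is 48 bytes */

/* Assumed layout, for illustration only: timebase == 0 marks the boot entry. */
struct boot_info_entry {
	uint64_t boot_tb;	/* timebase value at boot */
	uint64_t tb_freq;	/* timebase frequency in Hz */
	uint64_t timebase;	/* zero in the boot entry */
	uint64_t pad[3];	/* pad to DTL_ENTRY_SIZE */
};

/* Hypothetical per-queue state that caches the boot values for reuse. */
struct queue_state {
	uint64_t boot_tb;
	uint64_t tb_freq;
	size_t pkt_len;		/* offset of the next unprocessed entry */
};

/*
 * If the buffer starts with the boot entry, cache its values and skip it.
 * Returns the offset of the first real trace entry.
 */
static size_t skip_boot_entry(struct queue_state *q, const void *data)
{
	struct boot_info_entry hdr;

	memcpy(&hdr, data, sizeof(hdr));
	if (hdr.timebase == 0) {
		q->boot_tb = hdr.boot_tb;
		q->tb_freq = hdr.tb_freq;
		q->pkt_len += DTL_ENTRY_SIZE;
	}
	return q->pkt_len;
}
```

Caching the two values in the per-queue state means later buffers of the
same queue, which do not start with a boot entry, can still be converted
to timestamps.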

The formula used to get the timestamp from a DTL entry is:

((timebase from DTL entry - boot time) / frequency) * 1000000000
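
As a worked sketch of this formula, the helper here converts a raw
timebase value into nanoseconds since boot. The function name
dtl_tb_to_ns is hypothetical, the inputs are assumed to already be in
host byte order, and the 512 MHz frequency used in the note after the
code is only an example value.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Convert a raw DTL timebase value to nanoseconds since boot:
 * ((tb - boot_tb) / tb_freq) * 1000000000
 * Floating point is used so that sub-second fractions are kept.
 */
static uint64_t dtl_tb_to_ns(uint64_t tb, uint64_t boot_tb, uint64_t tb_freq)
{
	double secs;

	assert(tb_freq != 0);
	secs = (double)(tb - boot_tb) / (double)tb_freq;
	return (uint64_t)(secs * (double)NSEC_PER_SEC);
}
```

With a boot timebase of 1000 and a frequency of 512000000 ticks per
second, a DTL timebase of 256001000 is 256000000 ticks after boot, which
is half a second, so the result is 500000000 ns.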

powerpc_vpadtl_decode() saves the boot time and frequency in the
powerpc_vpadtl_queue structure so that they can be reused.

Each dtl_entry is 48 bytes in size. Sometimes a buffer is only partially
processed: if the timestamp of another event is later than that of the
element currently being processed in the queue, processing moves on to
the next event.

To keep track of the position in the buffer, additional fields are added
to the powerpc_vpadtl_queue structure.
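
The position tracking can be sketched as a cursor over a buffer of
fixed-size entries. The names dtl_cursor, dtl_cursor_init and
dtl_cursor_next are hypothetical; only the 48-byte entry size and the
idea of remembering how far a buffer has been consumed come from the
description above.

```c
#include <assert.h>
#include <stddef.h>

#define DTL_ENTRY_SIZE 48	/* size of one dtl_entry in bytes */

/* Hypothetical cursor over one buffer of fixed-size DTL entries. */
struct dtl_cursor {
	const unsigned char *data;
	size_t buf_len;		/* usable bytes, a multiple of DTL_ENTRY_SIZE */
	size_t pkt_len;		/* bytes already consumed */
};

static void dtl_cursor_init(struct dtl_cursor *c, const unsigned char *data,
			    size_t size)
{
	c->data = data;
	/* Ignore a trailing partial entry, if any. */
	c->buf_len = size - (size % DTL_ENTRY_SIZE);
	c->pkt_len = 0;
}

/* Return the next unprocessed entry, or NULL when the buffer is exhausted. */
static const unsigned char *dtl_cursor_next(struct dtl_cursor *c)
{
	const unsigned char *entry;

	if (c->pkt_len + DTL_ENTRY_SIZE > c->buf_len)
		return NULL;

	entry = c->data + c->pkt_len;
	c->pkt_len += DTL_ENTRY_SIZE;
	return entry;
}
```

Because the cursor keeps pkt_len, processing can stop after any entry
and resume later from the same position in the same buffer.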

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Tejas Manhas <tejas05@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Aboorva Devarajan <aboorvad@linux.ibm.com>
Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Authored by Athira Rajeev and committed by Arnaldo Carvalho de Melo
cd1c3b73 71feffa9

+223 -4
tools/perf/util/powerpc-vpadtl.c
···
 #include "machine.h"
 #include "debug.h"
 #include "powerpc-vpadtl.h"
+#include "sample.h"
+#include "tool.h"
 
 /*
  * Structure to save the auxtrace queue
···
 	struct auxtrace_buffer *buffer;
 	struct thread *thread;
 	bool on_heap;
+	struct powerpc_vpadtl_entry *dtl;
+	u64 timestamp;
+	unsigned long pkt_len;
+	unsigned long buf_len;
+	u64 boot_tb;
+	u64 tb_freq;
+	unsigned int tb_buffer;
+	unsigned int size;
 	bool done;
 	pid_t pid;
 	pid_t tid;
···
 	}
 }
 
+static unsigned long long powerpc_vpadtl_timestamp(struct powerpc_vpadtl_queue *vpaq)
+{
+	struct powerpc_vpadtl_entry *record = vpaq->dtl;
+	unsigned long long timestamp = 0;
+	unsigned long long boot_tb;
+	unsigned long long diff;
+	double result, div;
+	double boot_freq;
+	/*
+	 * Formula used to get timestamp that can be co-related with
+	 * other perf events:
+	 * ((timbase from DTL entry - boot time) / frequency) * 1000000000
+	 */
+	if (record->timebase) {
+		boot_tb = vpaq->boot_tb;
+		boot_freq = vpaq->tb_freq;
+		diff = be64_to_cpu(record->timebase) - boot_tb;
+		div = diff / boot_freq;
+		result = div;
+		result = result * 1000000000;
+		timestamp = result;
+	}
+
+	return timestamp;
+}
+
 static struct powerpc_vpadtl *session_to_vpa(struct perf_session *session)
 {
 	return container_of(session->auxtrace, struct powerpc_vpadtl, auxtrace);
···
 	powerpc_vpadtl_dump(vpa, buf, len);
 }
 
-static int powerpc_vpadtl_process_event(struct perf_session *session __maybe_unused,
-					union perf_event *event __maybe_unused,
-					struct perf_sample *sample __maybe_unused,
-					const struct perf_tool *tool __maybe_unused)
+static int powerpc_vpadtl_get_buffer(struct powerpc_vpadtl_queue *vpaq)
 {
+	struct auxtrace_buffer *buffer = vpaq->buffer;
+	struct auxtrace_queues *queues = &vpaq->vpa->queues;
+	struct auxtrace_queue *queue;
+
+	queue = &queues->queue_array[vpaq->queue_nr];
+	buffer = auxtrace_buffer__next(queue, buffer);
+
+	if (!buffer)
+		return 0;
+
+	vpaq->buffer = buffer;
+	vpaq->size = buffer->size;
+
+	/* If the aux_buffer doesn't have data associated, try to load it */
+	if (!buffer->data) {
+		/* get the file desc associated with the perf data file */
+		int fd = perf_data__fd(vpaq->vpa->session->data);
+
+		buffer->data = auxtrace_buffer__get_data(buffer, fd);
+		if (!buffer->data)
+			return -ENOMEM;
+	}
+
+	vpaq->buf_len = buffer->size;
+
+	if (buffer->size % dtl_entry_size)
+		vpaq->buf_len = buffer->size - (buffer->size % dtl_entry_size);
+
+	if (vpaq->tb_buffer != buffer->buffer_nr) {
+		vpaq->pkt_len = 0;
+		vpaq->tb_buffer = 0;
+	}
+
+	return 1;
+}
+
+/*
+ * The first entry in the queue for VPA DTL PMU has the boot timebase,
+ * frequency details which are needed to get timestamp which is required to
+ * correlate with other events. Save the boot_tb and tb_freq as part of
+ * powerpc_vpadtl_queue. The very next entry is the actual trace data to
+ * be returned.
+ */
+static int powerpc_vpadtl_decode(struct powerpc_vpadtl_queue *vpaq)
+{
+	int ret;
+	char *buf;
+	struct boottb_freq *boottb;
+
+	ret = powerpc_vpadtl_get_buffer(vpaq);
+	if (ret <= 0)
+		return ret;
+
+	boottb = (struct boottb_freq *)vpaq->buffer->data;
+	if (boottb->timebase == 0) {
+		vpaq->boot_tb = boottb->boot_tb;
+		vpaq->tb_freq = boottb->tb_freq;
+		vpaq->pkt_len += dtl_entry_size;
+	}
+
+	buf = vpaq->buffer->data;
+	buf += vpaq->pkt_len;
+	vpaq->dtl = (struct powerpc_vpadtl_entry *)buf;
+
+	vpaq->tb_buffer = vpaq->buffer->buffer_nr;
+	vpaq->buffer = NULL;
+	vpaq->buf_len = 0;
+
+	return 1;
+}
+
+static struct powerpc_vpadtl_queue *powerpc_vpadtl__alloc_queue(struct powerpc_vpadtl *vpa,
+						unsigned int queue_nr)
+{
+	struct powerpc_vpadtl_queue *vpaq;
+
+	vpaq = zalloc(sizeof(*vpaq));
+	if (!vpaq)
+		return NULL;
+
+	vpaq->vpa = vpa;
+	vpaq->queue_nr = queue_nr;
+
+	return vpaq;
+}
+
+/*
+ * When the Dispatch Trace Log data is collected along with other events
+ * like sched tracepoint events, it needs to be correlated and present
+ * interleaved along with these events. Perf events can be collected
+ * parallely across the CPUs.
+ *
+ * An auxtrace_queue is created for each CPU. Data within each queue is in
+ * increasing order of timestamp. Allocate and setup auxtrace queues here.
+ * All auxtrace queues is maintained in auxtrace heap in the increasing order
+ * of timestamp. So always the lowest timestamp (entries to be processed first)
+ * is on top of the heap.
+ *
+ * To add to auxtrace heap, fetch the timestamp from first DTL entry
+ * for each of the queue.
+ */
+static int powerpc_vpadtl__setup_queue(struct powerpc_vpadtl *vpa,
+		struct auxtrace_queue *queue,
+		unsigned int queue_nr)
+{
+	struct powerpc_vpadtl_queue *vpaq = queue->priv;
+
+	if (list_empty(&queue->head) || vpaq)
+		return 0;
+
+	vpaq = powerpc_vpadtl__alloc_queue(vpa, queue_nr);
+	if (!vpaq)
+		return -ENOMEM;
+
+	queue->priv = vpaq;
+
+	if (queue->cpu != -1)
+		vpaq->cpu = queue->cpu;
+
+	if (!vpaq->on_heap) {
+		int ret;
+retry:
+		ret = powerpc_vpadtl_decode(vpaq);
+		if (!ret)
+			return 0;
+
+		if (ret < 0)
+			goto retry;
+
+		vpaq->timestamp = powerpc_vpadtl_timestamp(vpaq);
+
+		ret = auxtrace_heap__add(&vpa->heap, queue_nr, vpaq->timestamp);
+		if (ret)
+			return ret;
+		vpaq->on_heap = true;
+	}
+
 	return 0;
+}
+
+static int powerpc_vpadtl__setup_queues(struct powerpc_vpadtl *vpa)
+{
+	unsigned int i;
+	int ret;
+
+	for (i = 0; i < vpa->queues.nr_queues; i++) {
+		ret = powerpc_vpadtl__setup_queue(vpa, &vpa->queues.queue_array[i], i);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+static int powerpc_vpadtl__update_queues(struct powerpc_vpadtl *vpa)
+{
+	if (vpa->queues.new_data) {
+		vpa->queues.new_data = false;
+		return powerpc_vpadtl__setup_queues(vpa);
+	}
+
+	return 0;
+}
+
+static int powerpc_vpadtl_process_event(struct perf_session *session,
+					union perf_event *event __maybe_unused,
+					struct perf_sample *sample,
+					const struct perf_tool *tool)
+{
+	struct powerpc_vpadtl *vpa = session_to_vpa(session);
+	int err = 0;
+
+	if (dump_trace)
+		return 0;
+
+	if (!tool->ordered_events) {
+		pr_err("VPA requires ordered events\n");
+		return -EINVAL;
+	}
+
+	if (sample->time) {
+		err = powerpc_vpadtl__update_queues(vpa);
+		if (err)
+			return err;
+	}
+
+	return err;
 }
 
 /*