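In the previous article we saw that Odigos really can provide logging, monitoring, and tracing with zero code changes. Now let's look at how it works.
Automatic instrumentation of Java programs
Looking at the Deployment definition, we find that it has been modified: environment variables were added to the container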
- name: JAVA_TOOL_OPTIONS
  value: -javaagent:/agent/opentelemetry-javaagent-all.jar -Dotel.traces.sampler=always_on
    -Dotel.exporter.otlp.endpoint=http://$(NODE_IP):4317
- name: JAVA_OPTS
  value: -javaagent:/agent/opentelemetry-javaagent-all.jar -Dotel.traces.sampler=always_on
    -Dotel.exporter.otlp.endpoint=http://$(NODE_IP):4317
- name: OTEL_RESOURCE_ATTRIBUTES
  value: service.name=cktest2,k8s.pod.name=$(POD_NAME)
An init container was also added, defined as follows
initContainers:
- command:
  - cp
  - /javaagent.jar
  - /agent/opentelemetry-javaagent-all.jar
  image: keyval/otel-java-agent:v0.5
  imagePullPolicy: IfNotPresent
  name: copy-java-agent
  resources: {}
  terminationMessagePath: /dev/termination-log
  terminationMessagePolicy: File
  volumeMounts:
  - mountPath: /agent
    name: agentdir-java
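This auto-instrumentation approach is the same as the one used by SWCK. The main points are:
1. A Kubernetes webhook modifies the resource definition of the observed service, injecting the environment variables and the init container
2. The init container copies the javaagent JAR into a volume mounted at /agent, which is how the agent is shared with the observed service
3. The JAVA_TOOL_OPTIONS and JAVA_OPTS environment variables load the javaagent automatically: the JVM itself reads JAVA_TOOL_OPTIONS at startup, while JAVA_OPTS is a convention honored by many startup scripts rather than by the JVM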
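The three steps above can be sketched in Go. This is a minimal illustration, not Odigos's actual code: the struct types are simplified stand-ins for the Kubernetes pod spec, `injectJavaAgent` is a hypothetical helper, and mounting the shared volume into the application container is an assumption, since the excerpt above only shows the init container's mount.

```go
package main

import "fmt"

// Simplified stand-ins for the Kubernetes pod spec fields used here.
type EnvVar struct{ Name, Value string }
type VolumeMount struct{ Name, MountPath string }
type Container struct {
	Name, Image  string
	Command      []string
	Env          []EnvVar
	VolumeMounts []VolumeMount
}
type PodSpec struct {
	InitContainers []Container
	Containers     []Container
	Volumes        []string // volume names only; assumed to be emptyDir volumes
}

const (
	agentVolume = "agentdir-java"
	agentDir    = "/agent"
	agentJar    = agentDir + "/opentelemetry-javaagent-all.jar"
)

// injectJavaAgent (hypothetical) applies the same kind of changes that are
// visible in the modified Deployment: a shared volume, an init container that
// copies the agent JAR into it, and JVM options that load the agent.
func injectJavaAgent(spec *PodSpec, serviceName string) {
	mount := VolumeMount{Name: agentVolume, MountPath: agentDir}
	spec.Volumes = append(spec.Volumes, agentVolume)
	spec.InitContainers = append(spec.InitContainers, Container{
		Name:         "copy-java-agent",
		Image:        "keyval/otel-java-agent:v0.5",
		Command:      []string{"cp", "/javaagent.jar", agentJar},
		VolumeMounts: []VolumeMount{mount},
	})

	opts := "-javaagent:" + agentJar +
		" -Dotel.traces.sampler=always_on" +
		" -Dotel.exporter.otlp.endpoint=http://$(NODE_IP):4317"
	for i := range spec.Containers {
		c := &spec.Containers[i]
		c.VolumeMounts = append(c.VolumeMounts, mount)
		c.Env = append(c.Env,
			EnvVar{Name: "JAVA_TOOL_OPTIONS", Value: opts},
			EnvVar{Name: "JAVA_OPTS", Value: opts},
			EnvVar{Name: "OTEL_RESOURCE_ATTRIBUTES",
				Value: "service.name=" + serviceName + ",k8s.pod.name=$(POD_NAME)"},
		)
	}
}

func main() {
	// The image name below is a placeholder for the observed service.
	spec := PodSpec{Containers: []Container{{Name: "cktest2", Image: "example/cktest2:latest"}}}
	injectJavaAgent(&spec, "cktest2")

	fmt.Println("init container command:", spec.InitContainers[0].Command)
	for _, e := range spec.Containers[0].Env {
		fmt.Printf("%s=%s\n", e.Name, e.Value)
	}
}
```

Because the application is never rebuilt and its image is untouched, removing the injected fields restores the original workload.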
Automatic instrumentation of Go services:
Looking at the Deployment definition, we find that an extra container has been added:
- name: godemo-instrumentation
  image: keyval/otel-go-agent:v0.6.3
  env:
  - name: NODE_IP
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: status.hostIP
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: $(NODE_IP):4317
  - name: OTEL_SERVICE_NAME
    value: godemo
  - name: OTEL_TARGET_EXE
    value: /app/demo
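The `$(NODE_IP)` reference in the endpoint value is resolved by Kubernetes when the container starts: a `$(VAR)` reference is replaced with the value of a variable defined earlier in the same container, and a reference that cannot be resolved is left unchanged. The Go sketch below imitates that substitution. The `expand` function and the IP address are illustrative assumptions, and the `$$(VAR)` escape syntax is deliberately left out.

```go
package main

import (
	"fmt"
	"regexp"
)

// refPattern matches references of the form $(NAME).
var refPattern = regexp.MustCompile(`\$\(([A-Za-z_][A-Za-z0-9_]*)\)`)

// expand (hypothetical helper) replaces every $(NAME) whose name is present
// in defined and leaves unknown references untouched.
func expand(value string, defined map[string]string) string {
	return refPattern.ReplaceAllStringFunc(value, func(ref string) string {
		name := ref[2 : len(ref)-1] // strip the leading "$(" and trailing ")"
		if v, ok := defined[name]; ok {
			return v
		}
		return ref
	})
}

func main() {
	// NODE_IP would come from status.hostIP; this address is made up.
	env := map[string]string{"NODE_IP": "10.0.0.12"}
	fmt.Println(expand("$(NODE_IP):4317", env))
	fmt.Println(expand("k8s.pod.name=$(POD_NAME)", env)) // POD_NAME undefined: kept as-is
}
```

With the host IP as the endpoint, the agent exports to whatever collector listens on port 4317 of the node where the pod runs.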
This container uses the otel-go-agent image. Its source code is at:
https://github.com/keyval-dev/opentelemetry-go-instrumentation
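Under the instrumentors directory there are several probe.go files, each implementing the instrumentor for one specific tracing target
Currently supported targets: httpserver, grpc, and mux
Taking httpserver as an example, let's look at the Run function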
func (h *httpServerInstrumentor) Run(eventsChan chan<- *events.Event) {
	logger := log.Logger.WithName("net/http-instrumentor")
	var event HttpEvent
	for {
		record, err := h.eventsReader.Read() // read the next event from eBPF
		if err != nil {
			if errors.Is(err, perf.ErrClosed) {
				return
			}
			logger.Error(err, "error reading from perf reader")
			continue
		}
		if record.LostSamples != 0 {
			logger.V(0).Info("perf event ring buffer full", "dropped", record.LostSamples)
			continue
		}
		// decode the raw sample into an HttpEvent
		if err := binary.Read(bytes.NewBuffer(record.RawSample), binary.LittleEndian, &event); err != nil {
			logger.Error(err, "error parsing perf event")
			continue
		}
		// convert the event and send it to the channel
		eventsChan <- h.convertEvent(&event)
	}
}
The event that is read is defined as follows
type HttpEvent struct {
	StartTime   uint64
	EndTime     uint64
	Method      [100]byte
	Path        [100]byte
	SpanContext context.EbpfSpanContext
}
type EbpfSpanContext struct {
	TraceID trace.TraceID
	SpanID  trace.SpanID
}
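As shown above, opentelemetry-go-instrumentation reads events from eBPF, then parses and assembles them to obtain trace data
So the key question is: how does it read data from eBPF?
Anyone familiar with eBPF knows that an eBPF program usually consists of kernel-space code and user-space code working together: the kernel-space code collects data and writes it to a memory region, and the user-space code reads the data from that region
Next to each probe.go there is a bpf folder, and the C files inside it are the kernel-space code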
// This instrumentation attaches uprobe to the following function:
// func (mux *ServeMux) ServeHTTP(w ResponseWriter, r *Request)
SEC("uprobe/ServerMux_ServeHTTP")
int uprobe_ServerMux_ServeHTTP(struct pt_regs *ctx)
{
    u64 request_pos = 4;
    struct http_request_t httpReq = {};
    httpReq.start_time = bpf_ktime_get_ns();
    // Get request struct
    void *req_ptr = get_argument(ctx, request_pos);
    ...
    // get path from Request.URL
    void *url_ptr = 0;
    bpf_probe_read(&url_ptr, sizeof(url_ptr), (void *)(req_ptr + url_ptr_pos));
    void *path_ptr = 0;
    bpf_probe_read(&path_ptr, sizeof(path_ptr), (void *)(url_ptr + path_ptr_pos));
    ...
    // Get Request.ctx
    void *ctx_iface = 0;
    bpf_probe_read(&ctx_iface, sizeof(ctx_iface), (void *)(req_ptr + ctx_ptr_pos + 8));
    // Write event
    httpReq.sc = generate_span_context();
    bpf_map_update_elem(&context_to_http_events, &ctx_iface, &httpReq, 0);
    long res = bpf_map_update_elem(&spans_in_progress, &ctx_iface, &httpReq.sc, 0);
    return 0;
}
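The kernel-space code shows that it traces the ServeHTTP function and writes the tracing data into eBPF maps, which is how the data becomes available to the user-space code. Note that the Run function shown earlier consumes its events through a perf reader rather than by looking up these maps directly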
How does the C code get compiled into Go?
In the Go code you can find a go:generate comment
//go:generate go run github.com/cilium/ebpf/cmd/bpf2go -target CFLAGS bpf ./bpf/probe.bpf.c
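This means that building opentelemetry-go-instrumentation runs go generate, which invokes the bpf2go tool to compile the C code into eBPF bytecode and generate Go code that embeds and loads it
References: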
https://tonybai.com/2022/07/19/develop-ebpf-program-in-go/











