/proc/thread-self
For most of my career, I didn’t invest much time in understanding how operating systems work. It was never on my list of interests, nor did I consider it useful knowledge for my career. What can I say? We all make bad decisions in our lives.
Over the last two years, I have invested a lot of time in understanding how some things work in Linux. Those are the consequences (or benefits?) of working with a complex and low-level system like a database.
This week, I was working on tracking data that was read from the page cache. While debugging some ClickHouse metrics collected directly from the kernel, I discovered the /proc/thread-self directory [1].
/proc/thread-self
When a thread accesses this magic symbolic link, it resolves to the process’s own /proc/self/task/[tid] directory.
So /proc/thread-self points to the proc directory for the current thread. In that directory, you can find a lot of helpful information. In my case, I was interested in /proc/thread-self/io [2], which holds the I/O statistics for the thread.
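To make this concrete, here is a minimal Python sketch (not the code ClickHouse uses) that resolves the symbolic link and reads the I/O counters from two different threads. The helper names parse_io, describe_current_thread, and worker are made up for illustration, and the sketch assumes a Linux kernel with /proc/thread-self (3.17 or later) and task I/O accounting enabled.

```python
import os
import threading


def parse_io(text):
    """Parse the 'key: value' lines of a /proc/.../io file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = int(value)
    return stats


def describe_current_thread():
    """Return (link target, I/O stats) for the thread that calls this function."""
    # The link target is relative to /proc, in the form "<pid>/task/<tid>".
    target = os.readlink("/proc/thread-self")
    with open("/proc/thread-self/io") as f:
        return target, parse_io(f.read())


def worker(name):
    target, stats = describe_current_thread()
    print(name, "tid", threading.get_native_id(), "->", target,
          "rchar", stats.get("rchar"))


if __name__ == "__main__":
    try:
        worker("main")
        t = threading.Thread(target=worker, args=("worker",))
        t.start()
        t.join()
    except OSError as exc:
        # Non-Linux systems or restricted sandboxes may not expose these files.
        print("procfs entry not available:", exc)
```

Each thread should print a different task directory under the same process ID, which is the whole point of the link: the same path gives every thread its own view.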
I was investigating whether the kernel counts bytes read from S3 in rchar. I shared more information in this PR in the ClickHouse repository. Although the PR was closed, the examples I shared there are still useful, and they include a reproducer that can help others understand how the system behaves.
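The same kind of question can be asked on a small scale by snapshotting rchar before and after an operation. The sketch below is not the reproducer from the PR: it reads a local temporary file instead of S3, and the helpers read_rchar, rchar_delta, and read_temp_file are hypothetical names chosen for this example.

```python
import os
import tempfile

IO_PATH = "/proc/thread-self/io"


def read_rchar(path=IO_PATH):
    """Return the rchar counter found in an io file (by default, this thread's)."""
    with open(path) as f:
        for line in f:
            key, _, value = line.partition(":")
            if key == "rchar":
                return int(value)
    raise KeyError("rchar not found in " + path)


def rchar_delta(action, path=IO_PATH):
    """Run action() and return how much rchar grew while it ran."""
    before = read_rchar(path)
    action()
    return read_rchar(path) - before


def read_temp_file(size):
    """Write `size` bytes to a temporary file, then read them back."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"x" * size)
        tmp.flush()
        with open(tmp.name, "rb") as f:
            assert len(f.read()) == size


if __name__ == "__main__":
    size = 1 << 20
    try:
        delta = rchar_delta(lambda: read_temp_file(size))
        # Reading the io file is itself a read, so the delta can be slightly
        # larger than the number of bytes requested by the action.
        print("requested", size, "bytes; rchar grew by", delta)
    except OSError as exc:
        print("procfs entry not available:", exc)
```

Because the counters are per thread, the snapshot and the action have to run on the same thread for the delta to mean anything.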
Footnotes
[1] There is also a /proc/self for the current process. This is something I didn’t know either.
[2] https://docs.kernel.org/filesystems/proc.html#proc-pid-io-display-the-io-accounting-fields