Photo by Roderico Y. Díaz on Unsplash — This reminds me of the times I do `ls /proc` in my terminal.
At the start of 2018, I wanted to learn more about Linux and systems programming in general, which means making friends with the kernel! So I started reading The Linux Programming Interface. This is the first in a series of blog posts that I will write as I go through the book and unravel the mysteries of Linux.
This post is an attempt to understand some parts of the lesson “File I/O: The Universal I/O Model” from The Linux Programming Interface.
I have heard the statement that “everything in Linux is a file”. But I learned that there are different types of files in Linux.
A system call is a way of asking the kernel to do some work for us. For example, these are the four basic system calls for working with files in Linux: open(), read(), write(), and close().
~/cool/stuff/is/here.js
A file descriptor is a non-negative integer that is used to reference an open file.
When a process calls the open() system call, it gets back a file descriptor, which can then be passed to other system calls such as read(), write(), and close().
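To make the four calls concrete, here is a minimal sketch in Python, whose `os.open`, `os.read`, `os.write`, and `os.close` are thin wrappers around the system calls of the same names. The helper name `roundtrip` and the sample file name are just for illustration; they are not from the book.

```python
import os
import tempfile


def roundtrip(path: str, payload: bytes) -> bytes:
    """Write payload to path, then read it back, using only fd-based calls."""
    # open(): the kernel hands back a file descriptor (a small integer).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        remaining = payload
        while remaining:
            # write() may write fewer bytes than requested, so loop.
            written = os.write(fd, remaining)
            remaining = remaining[written:]
    finally:
        # close(): release the descriptor.
        os.close(fd)

    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            # read() returns an empty bytes object at end of file.
            chunk = os.read(fd, 4096)
            if not chunk:
                break
            chunks.append(chunk)
    finally:
        os.close(fd)
    return b"".join(chunks)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(roundtrip(os.path.join(tmp, "here.js"), b"console.log('hi')\n"))
```

Notice that after `open()`, the file name is never used again: every other call only needs the descriptor.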
One interesting thing that I failed to notice when I read the lesson for the first time (and it is probably an important takeaway) is this:
Each process has its own set of file descriptors
Let's try to understand this step by step.
Let's write a program that just opens a file and prints its file descriptor value.
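The original program is not reproduced here, so what follows is only a sketch of what such a program could look like, written in Python rather than as a compiled binary. The function name `open_and_report` and the script name in the usage comment are made up for this example.

```python
import os
import sys


def open_and_report(path: str) -> int:
    """Open path read-only, print its descriptor, and return it.

    The caller is responsible for closing the returned descriptor.
    """
    fd = os.open(path, os.O_RDONLY)
    print(f"pid {os.getpid()}: {path} -> fd {fd}")
    return fd


if __name__ == "__main__":
    # Hypothetical usage: python3 fd.py 1.js
    descriptor = open_and_report(sys.argv[1])
    os.close(descriptor)
```

Run from a shell, a script like this will typically report the same small number on every run, but the exact value depends on which descriptors are already open in the process.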
I created a binary and ran it on one file at a time, so each process opens exactly one file.
The file always gets the file descriptor value 3.
Now, let's try opening multiple files from a single process.
The files are allocated sequential integer values.
The way files get their file descriptor numbers is based on this simple idea:
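Here is a rough sketch of that experiment, assuming three empty placeholder files created in a temporary directory. The helper `open_all` is a name invented for this example.

```python
import os
import tempfile


def open_all(paths):
    """Open every path read-only and return the descriptors in order."""
    return [os.open(path, os.O_RDONLY) for path in paths]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for name in ("1.js", "2.js", "3.js"):
            path = os.path.join(tmp, name)
            with open(path, "w"):
                pass  # create an empty file to open below
            paths.append(path)

        fds = open_all(paths)
        for path, fd in zip(paths, fds):
            print(f"{os.path.basename(path)} -> fd {fd}")
        for fd in fds:
            os.close(fd)
```

Because none of the files is closed in between, each new descriptor has to be larger than the ones handed out before it.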
SUSv3 specifies that if open() succeeds, it is guaranteed to use the lowest-numbered unused file descriptor for the process.
— Kerrisk, Michael. The Linux Programming Interface: A Linux and UNIX System Programming Handbook (p. 73). No Starch Press. Kindle Edition.
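The "lowest-numbered unused" rule can be seen directly by closing a descriptor and opening a file again. This is a small sketch; `reuse_demo` is a hypothetical helper, and it assumes nothing else in the process opens or closes files while it runs.

```python
import os
import tempfile


def reuse_demo(path: str):
    """Open twice, close the first descriptor, open again; return all three."""
    first = os.open(path, os.O_RDONLY)
    second = os.open(path, os.O_RDONLY)
    os.close(first)                      # frees the lower of the two numbers
    third = os.open(path, os.O_RDONLY)   # lowest unused number is handed out
    os.close(second)
    os.close(third)
    return first, second, third


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        first, second, third = reuse_demo(tmp.name)
        print(f"first={first} second={second} third={third}")
        print("third reuses first:", third == first)
```

So the numbers are not a running counter: a freed number goes back into the pool and is the first candidate for the next `open()`.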
This is the interesting part. What happens if the same files are opened by multiple processes at the same time? How are the file descriptors allocated then?
Numbering is done at a process level
This proves the statement that we started with,
Each process has its own set of file descriptors
Process 9012 opened 1.js, which was assigned fd 3, and 2.js, which was assigned fd 4. Process 9015 was no different and did exactly the same thing, because those are the lowest-numbered unused file descriptors within each process.
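The same effect can be reproduced inside a single script. Note that this sketch uses `fork()` for convenience instead of launching the program twice, so the child starts with a copy of the parent's descriptor table; after that point the two tables are independent. The helper name and the pipe used to report back are assumptions of this example, and it is Linux/Unix only.

```python
import os
import tempfile


def fds_in_parent_and_child(paths):
    """Open the same paths in a parent and a forked child.

    Returns (parent_fds, child_fds).
    """
    read_end, write_end = os.pipe()  # lets the child report its numbers
    pid = os.fork()
    if pid == 0:
        # Child process: opens the files in its own descriptor table.
        status = 1
        try:
            fds = [os.open(path, os.O_RDONLY) for path in paths]
            os.write(write_end, ",".join(map(str, fds)).encode())
            status = 0
        finally:
            os._exit(status)  # leave without running the parent's cleanup

    # Parent process: opens the very same files.
    parent_fds = [os.open(path, os.O_RDONLY) for path in paths]
    os.close(write_end)  # so that read() below sees end of file
    data = b""
    while True:
        chunk = os.read(read_end, 1024)
        if not chunk:
            break
        data += chunk
    os.close(read_end)
    os.waitpid(pid, 0)
    for fd in parent_fds:
        os.close(fd)
    child_fds = [int(item) for item in data.decode().split(",")] if data else []
    return parent_fds, child_fds


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for name in ("1.js", "2.js"):
            path = os.path.join(tmp, name)
            with open(path, "w"):
                pass
            paths.append(path)
        parent_fds, child_fds = fds_in_parent_and_child(paths)
        print("parent:", parent_fds)
        print("child: ", child_fds)
```

Both processes report the same numbers for the same files, even though each number refers to a separate entry in a separate per-process table.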
Now that we have come this far, the interesting question on my mind is what will happen if two processes try to write to the opened files at the same time. (This is probably worth answering another time!)
I had already heard of the names stdin, stdout and stderr and, to my surprise, they are just file descriptors.
When a process is created, it seems like the stdin, stdout and stderr files are already open for it, with the fd numbers 0, 1 and 2 respectively. (Normally a program inherits these three descriptors from the shell that started it.) That is why the first file we opened got fd 3.
I went a step further and tried to see where these files live, but I ended up with things that I don’t understand yet. (Amazing! We have something to dwell on.)
character special? ¯\_(ツ)_/¯
/dev/pts/0?
(Please do comment if you get the chance to figure these out.)
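For anyone who wants to poke at this, here is a minimal sketch that asks the kernel what kind of file sits behind a descriptor. It relies on `os.fstat` and the `stat` module from Python's standard library; the `/proc/self/fd` lookup is Linux-specific, and the helper name `file_type` is invented for this example.

```python
import os
import stat


def file_type(fd: int) -> str:
    """Return a short description of the kind of file behind fd."""
    mode = os.fstat(fd).st_mode
    if stat.S_ISCHR(mode):
        # Character device, e.g. a terminal or /dev/null.
        return "character special file"
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISFIFO(mode):
        return "FIFO or pipe"
    if stat.S_ISDIR(mode):
        return "directory"
    return "other"


if __name__ == "__main__":
    for fd in (0, 1, 2):
        try:
            # On Linux, /proc/self/fd/N is a symlink to whatever N refers to.
            target = os.readlink(f"/proc/self/fd/{fd}")
        except OSError:
            target = "(unknown)"
        print(f"fd {fd}: {file_type(fd)} -> {target}")
```

Running it in a terminal and then again with the output redirected to a file (or piped into another command) is a nice way to see the answers change.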
Thanks for reading. I quote a verse from my favourite work of Tamil literature, “Tirukkuṛaḷ”, at the end of my blog posts.
செய்தக்க அல்ல செயக்கெடும் செய்தக்க
செய்யாமை யானும் கெடும்.
— திருக்குறள்
Translated meaning (in my words): Things go wrong for people who do what they are not supposed to do, and also for people who do not do what they are supposed to do.