What's the Big Deal with Linux Capabilities? (Part 2)

by inaeem, December 10th, 2021
This story continues the previous one, in which we discussed process capability sets in detail. Some of you may be wondering how these capability sets are determined and applied when an unprivileged or privileged program binary is executed. This article is for you.


Before I begin detailing process creation mechanics and Linux capabilities, I'd like to go over two key concepts.

Capability Aware Applications

Capability-aware applications can manipulate their own capability sets with system calls (capset, capget, prctl) after they are loaded. When such an application no longer needs certain capabilities, it can drop them from its effective set to limit its exposure to privileged operations. As long as a capability remains in its permitted set, the application can always bring that capability back into its effective set.

e.g., runc, ping, etc.
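
The drop-and-restore pattern described above can be modelled with plain bitmask arithmetic. This is only a sketch in shell, not the real capset(2) call sequence: the function names drop_effective and raise_effective are made up for illustration, and the only outside fact it relies on is that CAP_NET_ADMIN and CAP_NET_RAW are bits 12 and 13.

```shell
#!/bin/sh
# Model only: capability sets as integer bitmasks.
# Bit 13 is CAP_NET_RAW, bit 12 is CAP_NET_ADMIN.
CAP_NET_RAW=13

# drop_effective EFFECTIVE CAP
# Prints the effective mask with bit CAP cleared.
drop_effective() {
    printf '%016x\n' "$(( $1 & ~(1 << $2) ))"
}

# raise_effective PERMITTED EFFECTIVE CAP
# Prints the effective mask with bit CAP set, but only if CAP is in the
# permitted mask; otherwise prints an error and returns 1.
raise_effective() {
    if [ "$(( ($1 >> $3) & 1 ))" -eq 1 ]; then
        printf '%016x\n' "$(( $2 | (1 << $3) ))"
    else
        echo "capability $3 is not in the permitted set" >&2
        return 1
    fi
}

# A ping-like program holding cap_net_admin + cap_net_raw (0x3000):
drop_effective 0x3000 "$CAP_NET_RAW"           # 0000000000001000
raise_effective 0x3000 0x1000 "$CAP_NET_RAW"   # 0000000000003000
```

The second call succeeds only because cap_net_raw is still in the permitted mask; with a permitted mask of 0x1000 it would fail.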

Capability Dumb Applications

These applications make no system calls (such as capset) to modify their capabilities; they depend on the capability sets inherited from the parent and constructed when the application is loaded. In other words, they rely on the effective capability set they are given to do their job.

e.g., cat, ls, etc.


Unprivileged Program Binary

An Unprivileged Program Binary is an executable with no File Capabilities set. When we load an unprivileged program binary (e.g., ls, cat), the capability sets of the calling thread (parent), in conjunction with the file's setuid bit, determine the capabilities of that thread after execve(2).

For an Unprivileged Program Binary, the ambient capabilities are critical in determining the thread's capabilities.


Let's have a look at how capability sets are determined for an Unprivileged Program Binary after execve(2) under certain conditions.

Capabilities Transition

Unprivileged Program Binary - Capabilities Transition

Explanation

  • inheritable & bounding: The inheritable and bounding sets do not change.
  • effective & permitted: The old effective and permitted sets are lost during execve() and are recalculated from the ambient capabilities.
  • ambient: Ambient capabilities were introduced to restore the capabilities that would otherwise be lost from the effective & permitted sets.

A capability can be raised in the ambient set only if it is already present in both the permitted and inheritable sets.
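
The transition above can be written down as a tiny model. This sketch assumes a non-root caller and a binary with neither file capabilities nor setuid/setgid bits; the function name exec_unprivileged is hypothetical, and the output merely mimics the Cap* lines of /proc/<pid>/status.

```shell
#!/bin/sh
# Model only: execve() of a binary with no file capabilities and no
# setuid/setgid bit, run by a non-root user. Masks are integers.

# exec_unprivileged INHERITABLE BOUNDING AMBIENT
exec_unprivileged() {
    # Inheritable, bounding and ambient carry over unchanged;
    # permitted and effective are rebuilt from the ambient set.
    printf 'CapInh: %016x\n' "$(( $1 ))"
    printf 'CapPrm: %016x\n' "$(( $3 ))"
    printf 'CapEff: %016x\n' "$(( $3 ))"
    printf 'CapBnd: %016x\n' "$(( $2 ))"
    printf 'CapAmb: %016x\n' "$(( $3 ))"
}

# Parent has cap_net_raw (bit 13, 0x2000) as inheritable and ambient:
exec_unprivileged 0x2000 0x3fffffffff 0x2000
```

With an ambient mask of 0 the same call yields empty permitted and effective sets, which is the situation where an unprivileged ping fails to open its raw socket.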

Use Case #1: Unprivileged Bash Process

An unprivileged user (bash process) uses the ping executable to ping a local server.


Criteria:

  • File Ownership: setuid bit != set && owner == root
  • Parent Process: Unprivileged bash process runs with no or limited capabilities
  • Executable Binary: Unprivileged ping binary.

Schematic Diagram

Unprivileged Bash Process Schematic Diagram

Prepare The Environment

# File Ownership: setuid bit != set && owner == root
$ ls -la ping_clone
-rwxr-xr-x ... root root ... ping_clone

# Parent Process: Unprivileged bash process which runs with no
# or limited capabilities
$ capsh --print 
Current: =
Bounding set =cap_chown,cap_dac_override, .....
 .....
uid=1000(ubuntu)
gid=1000(ubuntu)


# Executable Binary: Unprivileged ping binary
$ getcap ping_clone

Demo #1: Using capsh Utility

Use the capsh utility to bootstrap an unprivileged bash process and then ping a local server.


$ sudo capsh --caps="cap_net_admin,cap_net_raw,cap_setpcap,cap_setuid,cap_setgid+ep" \
--keep=1 --user=ubuntu --addamb="cap_net_admin,cap_net_raw" --print -- -c "./ping_clone -c 1 localhost"
Current: = cap_setgid,cap_setuid,cap_setpcap,cap_net_admin,cap_net_raw+p 
Bounding set = cap_chown,cap_dac_override,cap_dac_read_search,
    cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,
    cap_setpcap,cap_linux_immutable,cap_net_bind_service,
    cap_net_broadcast,cap_net_admin,cap_net_raw,
    cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,
    cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,
    cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,
    cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,
    cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,
    cap_syslog,35,36,37 
Securebits: 020/0x10/5'b10000 
 secure-noroot: no (unlocked) 
 secure-no-suid-fixup: no (unlocked) 
 secure-keep-caps: yes (unlocked) 
uid=1000(ubuntu) 
gid=1000(ubuntu) 
groups=4(adm),10(wheel),190(systemd-journal),991(docker),1000(ubuntu) 
PING localhost (127.0.0.1) 56(84) bytes of data. 
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=255 time=0.033 ms
--- localhost ping statistics --- 
1 packets transmitted, 1 received, 0% packet loss, time 0ms 
rtt min/avg/max/mdev = 0.033/0.033/0.033/0.000 ms


So what's going on here? Let's have a look:

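  • Current & Bounding set: capsh prepares a favorable environment for ping_clone; the resulting process chain is

    sudo(root)───>su───>bash───>ping_clone(ubuntu)

  • --keep=1 --user=ubuntu: The UID change from root to ubuntu clears the effective set, while --keep=1 preserves the permitted set. This is why Current lists the capabilities with +p only.

  • --addamb=cap_net_admin,cap_net_raw: Raises these capabilities in the ambient set, which is added to the effective and permitted sets when an unprivileged binary is executed.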

Demo #2: Using setpriv Utility

You may need to install the setpriv utility.

$ sudo apt install setpriv

We'll use the setpriv utility to run the ping_clone binary as an unprivileged user.


$ sudo setpriv --inh-caps '-all,+net_raw' \
--bounding-set '-all,+net_raw' \
--reuid=ubuntu \
--ambient-caps='+net_raw' \
./ping_clone -c1 127.0.0.1
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.019 ms
--- 127.0.0.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.019/0.019/0.019/0.000 ms


When the --ambient-caps argument isn't supplied, ping_clone fails with 'socket: Operation not permitted'.


So, what exactly is going on here? Let me clarify.

  • --reuid=ubuntu: Switching from root to ubuntu clears the effective and permitted capability sets, so ping_clone would otherwise start with none.

  • --ambient-caps=+net_raw: The effective and permitted sets are recalculated from the given ambient capabilities when ping_clone is executed.
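
The setpriv command above passes --inh-caps together with --ambient-caps for a reason: per capabilities(7), a capability can be made ambient only if it is already in both the permitted and inheritable sets. A minimal model of that check follows; the function name can_raise_ambient is made up for illustration and nothing here touches the kernel.

```shell
#!/bin/sh
# can_raise_ambient PERMITTED INHERITABLE CAP
# Succeeds only if bit CAP is set in both the permitted and inheritable masks.
can_raise_ambient() {
    [ "$(( (($1 & $2) >> $3) & 1 ))" -eq 1 ]
}

# root running setpriv: all 38 capabilities permitted, net_raw (bit 13) inheritable
if can_raise_ambient 0x3fffffffff 0x2000 13; then
    echo "cap_net_raw can be made ambient"
fi

# Same process without net_raw in the inheritable set
if ! can_raise_ambient 0x3fffffffff 0 13; then
    echo "cap_net_raw cannot be made ambient"
fi
```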



Use Case #2: Privileged Bash Process

A privileged user (bash process) pings a local server using an unprivileged ping binary.


Criteria

  • File Ownership: setuid bit != set && owner == root.
  • Parent Process: Privileged bash process runs with all capabilities enabled.
  • Executable Binary: Unprivileged ping binary (file capabilities aren't set).

Schematic Diagram

Privileged Bash Process Schematic Diagram

Prepare The Environment

# File Ownership: setuid bit != set && owner == root.
$ ls -la ping_clone
-rwxr-xr-x ... root root ... ping_clone

# Parent Process: Privileged bash process runs with full capabilities.
$ capsh --print 
Current: = cap_net_admin,cap_net_raw,cap_chown,cap_dac_override, ..... 
Bounding set = cap_net_admin,cap_net_raw,cap_chown,cap_dac_override, .....
 .....
uid=0(root) 
gid=0(root) 
...

# Executable Binary: Unprivileged ping binary (file capabilities aren't set).
$ getcap ping_clone


Capabilities Transition

When you log in as root, your Effective User ID is set to 0 and you have unrestricted access to the system to do (nearly) whatever you want.

With Effective User ID == 0, the bash process is a privileged process, and that alone explains this case. When such a process executes a binary, the kernel by default treats the file's permitted and inheritable sets as fully enabled and the file effective bit as set, so ping_clone starts with every capability in the bounding set even though it has no file capabilities.
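
As a rough model of what happens when root executes a binary: capabilities(7) describes the file permitted and inheritable sets as notionally all ones and the file effective bit as on for a process with effective UID 0 (assuming the SECBIT_NOROOT securebit is not set). The sketch below applies that rule to plain bitmasks; exec_as_root is a hypothetical name.

```shell
#!/bin/sh
# Model only: execve() by a process with effective UID 0, SECBIT_NOROOT unset.

# exec_as_root INHERITABLE BOUNDING
exec_as_root() {
    # File permitted/inheritable count as all ones, so the new permitted
    # set is (inheritable | bounding); the effective set mirrors it.
    perm=$(( $1 | $2 ))
    printf 'CapPrm: %016x\nCapEff: %016x\n' "$perm" "$perm"
}

# Root shell with an empty inheritable set and a full bounding set:
exec_as_root 0 0x3fffffffff
```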



Use Case #3: Special Permissions (SUID, SGID)

Set User ID (setuid) and Set Group ID (setgid) are special permissions for executable files.

When these permissions are set on a file, the process executing it assumes the privileges of the file's owner or group.

The setuid bit changes a program's effective UID (EUID) upon execution.
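
In ls -l output these bits show up as an s (or S) in place of the owner or group x. A small helper for reading such a mode string, assuming the usual ten-character layout; special_bits is a hypothetical name, and on a real file the POSIX tests [ -u FILE ] and [ -g FILE ] give the same answer.

```shell
#!/bin/sh
# special_bits MODE_STRING
# Prints "setuid" and/or "setgid" for an ls -l style mode such as -rwsr-xr-x.
special_bits() {
    # 4th character: owner execute slot, s/S means the setuid bit is set
    case $1 in ???[sS]*) echo setuid ;; esac
    # 7th character: group execute slot, s/S means the setgid bit is set
    case $1 in ??????[sS]*) echo setgid ;; esac
}

special_bits "-rwsr-xr-x"   # setuid
special_bits "-rwxr-xr-x"   # prints nothing
```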

Criteria:

  • File Ownership: setuid bit == set && owner == root.
  • Parent Process: Unprivileged bash process (no or limited capabilities).
  • Executable Binary: Unprivileged ping binary (file capabilities aren't set).

Schematic Diagram

Special Permissions (SUID, SGID) - Schematic Diagram

Prepare The Environment

# File Ownership: setuid bit == set && owner == root.
$ ls -la ping_clone
-rwsr-xr-x ... root root ... ping_clone

# Parent Process: Unprivileged bash process(no or limited capabilities).
$ capsh --print 
Current: =
Bounding set =cap_chown,cap_dac_override, .....
 .....
uid=1000(ubuntu)
gid=1000(ubuntu)
...

# Executable Binary: Unprivileged ping binary. (file capabilities aren't set).
$ getcap ping_clone
# setuid bit set
$ ls -la
...
-rwsr-xr-x ... root root ... ping_clone

Capabilities Transition

When a non-root user executes the ping_clone utility, which is owned by root and has the setuid bit set, the process runs in the root user context (EUID = 0) until the program itself changes its effective UID (EUID) during execution.


~$ ping_clone localhost &
[1] 31994
~# PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.027 ms

~$ cat /proc/31994/status
Name:   ping_clone
...
...
Uid:    1000    1000    0       1000
Gid:    1000    1000    1000    1000
...
CapInh: 0000000000000000
CapPrm: 0000000000003000
CapEff: 0000000000000000
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
...


So, what's going on here?

  • Uid: 1000 1000 0 1000: The fields are the real, effective, saved, and filesystem UIDs. Given the claim above, shouldn't this be Uid: 1000 0 0 0?
  • CapPrm: 0000000000003000: The permitted set is reduced to cap_net_admin and cap_net_raw.
  • CapEff: 0000000000000000: How does the process perform privileged network operations if the effective set is empty?
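
To see why 0000000000003000 means cap_net_admin and cap_net_raw, the mask can be decoded bit by bit. The helper below is a sketch (cap_names is a made-up name) that uses the capability order shown in the capsh --print bounding set above, where cap_chown is bit 0. Where libcap is installed, capsh --decode=0000000000003000 should give the same answer.

```shell
#!/bin/sh
# Capability names in bit order (bit 0 = cap_chown ... bit 37 = cap_audit_read).
CAP_NAMES="cap_chown cap_dac_override cap_dac_read_search cap_fowner cap_fsetid
cap_kill cap_setgid cap_setuid cap_setpcap cap_linux_immutable
cap_net_bind_service cap_net_broadcast cap_net_admin cap_net_raw cap_ipc_lock
cap_ipc_owner cap_sys_module cap_sys_rawio cap_sys_chroot cap_sys_ptrace
cap_sys_pacct cap_sys_admin cap_sys_boot cap_sys_nice cap_sys_resource
cap_sys_time cap_sys_tty_config cap_mknod cap_lease cap_audit_write
cap_audit_control cap_setfcap cap_mac_override cap_mac_admin cap_syslog
cap_wake_alarm cap_block_suspend cap_audit_read"

# cap_names HEXMASK
# Prints the names of all bits set in a /proc/<pid>/status style hex mask.
cap_names() {
    mask=$(( 0x$1 ))
    bit=0
    out=""
    for name in $CAP_NAMES; do
        if [ "$(( ($mask >> $bit) & 1 ))" -eq 1 ]; then
            out="${out:+$out }$name"
        fi
        bit=$(( $bit + 1 ))
    done
    printf '%s\n' "$out"
}

cap_names 0000000000003000   # cap_net_admin cap_net_raw
```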


Let's take a look at the ping_clone utility from the perspective of system calls. Remember that it is a capability-aware application that may change its capabilities programmatically.


Take a look at the output of the strace tracing tool.

  • At line 4: Gets all capabilities in the {effective, permitted} sets.
  • At line 6: Drops all effective capabilities and removes all unwanted capabilities from the permitted set, leaving just CAP_NET_ADMIN and CAP_NET_RAW.
  • At line 7: prctl(PR_SET_KEEPCAPS, 1) is used to keep the permitted set across the upcoming EUID transition.
  • At line 9: Changes the effective user ID to a less privileged user.
  • At line 21: CAP_NET_RAW is re-enabled in the effective set for the sensitive network operations.

Privileged Program Binary

A Privileged Program Binary is an executable that has File Capabilities assigned to it. When we load a privileged program binary (e.g., ping_clone), the executable file's capability sets play a significant role in determining the thread's capabilities after execve(2).

Use the getcap utility to determine whether a program binary is privileged.

Capabilities Transition

Privileged Program Binary- Capabilities Transition

Explanation

  • ambient: The ambient capabilities play no role in this transition and are cleared.
  • inheritable & bounding: The inheritable and bounding sets do not change.
  • permitted: The logic that determines the final permitted set is more involved. It depends on the old inheritable and bounding sets and on the file capabilities, and follows this transition logic:
    • The file permitted set and the old bounding set (before execve()) are logically ANDed.

      P1 = Bounding Old & File Permitted Set

    • The file inheritable set and the old inheritable set (before execve()) are logically ANDed.

      P2 = Inheritable Old & File Inheritable Set

    • The final permitted set is the logical OR of P1 and P2.

      P = P1 | P2

  • effective: The transition logic is as follows:
    • A Capability Aware Application has the luxury of activating or deactivating any capability in its permitted set as an effective capability whenever required.
    • The file effective flag/bit exists for Capability Unaware (Dumb) Applications: when it is set, the permitted set is automatically enabled as the effective set after execve().
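
The P1/P2 logic above is easy to check with shell arithmetic. This is a sketch of the formula only, for a non-root caller; exec_privileged is a hypothetical name, and the argument order is an arbitrary choice made for this example.

```shell
#!/bin/sh
# Model only: execve() of a binary that carries file capabilities.

# exec_privileged INH_OLD BND_OLD FILE_PERMITTED FILE_INHERITABLE FILE_EFFECTIVE_BIT
exec_privileged() {
    p1=$(( $2 & $3 ))          # P1 = Bounding Old & File Permitted Set
    p2=$(( $1 & $4 ))          # P2 = Inheritable Old & File Inheritable Set
    perm=$(( $p1 | $p2 ))      # P  = P1 | P2
    # The file effective bit copies the permitted set into the effective set.
    if [ "$5" -eq 1 ]; then eff=$perm; else eff=0; fi
    # The ambient set is cleared when the file has capabilities.
    printf 'CapPrm: %016x\nCapEff: %016x\nCapAmb: %016x\n' "$perm" "$eff" 0
}

# File: cap_net_raw+i, parent has cap_net_raw (0x2000) inheritable
exec_privileged 0x2000 0x3fffffffff 0 0x2000 0

# File: cap_chown+ep (0x1), parent has nothing inheritable
exec_privileged 0 0x3fffffffff 0x1 0 1
```

The first call yields a permitted set of 0x2000 with an empty effective set; the second yields 0x1 for both.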

Use Case #1: Unprivileged Bash Process

An unprivileged user (bash process) pings a local server using a privileged ping binary.


Criteria:

  • File Ownership: setuid bit != set && owner != root.
  • Parent Process: Unprivileged bash process (no or limited capabilities)
  • Executable Binary: Privileged ping binary (file capabilities are set using setcap)

Schematic Diagram

Privileged Program Binary - Schematic Diagram

Prepare The Environment

# setuid bit != set && owner != root
$ ls -la ping_clone
-rwxr-xr-x ... ubuntu ubuntu ... ping_clone

# Privileged ping binary
$ getcap ping_clone
ping_clone = cap_net_raw+i

# Unprivileged User
$ capsh --print 
Current: =
Bounding set =cap_chown,cap_dac_override, .....
 .....
uid=1000(ubuntu)
gid=1000(ubuntu)
...

Example #1: When File Inheritable Set is set

Condition: Make sure that the ping_clone utility has cap_net_raw set as its inheritable capability.


Terminal 1

# Privileged ping binary
$ getcap ping_clone
ping_clone = cap_net_raw+i

$ sudo capsh \
--caps="cap_net_admin,cap_net_raw,cap_setpcap,cap_setuid,cap_setgid+ep" \
--keep=1 --user=ubuntu --inh="cap_net_raw" \
--print -- -c "./ping_clone localhost"
Current: = cap_net_raw+ip cap_setgid,cap_setuid,cap_setpcap,cap_net_admin+p
Bounding set =cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,cap_wake_alarm,cap_block_suspend,cap_audit_read
Securebits: 020/0x10/5'b10000
 secure-noroot: no (unlocked)
 secure-no-suid-fixup: no (unlocked)
 secure-keep-caps: yes (unlocked)
uid=1000(ubuntu)
gid=1000(ubuntu)
groups=4(adm),20(dialout),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),108(lxd),114(netdev),999(docker),1000(ubuntu)
PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.023


Terminal 2

$ cat /proc/4696/status | grep Cap
CapInh: 0000000000000000
CapPrm: 0000000000002000
CapEff: 0000000000000000
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000


So what's going on here? Let's have a look

  • --user=$USER: We need a less privileged bash session with the desired capabilities before executing ping_clone.
  • --inh=cap_net_raw: The bash session must have cap_net_raw in its inheritable set, as required by the capabilities transition logic for a Privileged Program Binary (P2 = Inheritable Old & File Inheritable Set).
  • Terminal 1#7 Current: Confirms that cap_net_raw is present in the bash session's inheritable set (cap_net_raw+ip).

Example #2: File Permitted Set is set

When file permitted set is limited to cap_net_raw.

Terminal 1

# Privileged ping binary
$ getcap ping_clone
ping_clone = cap_net_raw+p

$ sudo capsh \
--caps="cap_net_admin,cap_net_raw,cap_setpcap,cap_setuid,cap_setgid+ep" \
--user=ubuntu \
--print -- -c "./ping_clone localhost"
Current: = 
Bounding set =cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,cap_wake_alarm,cap_block_suspend,cap_audit_read
Securebits: 020/0x10/5'b10000
 secure-noroot: no (unlocked)
 secure-no-suid-fixup: no (unlocked)
 secure-keep-caps: yes (unlocked)
uid=1000(ubuntu)
gid=1000(ubuntu)
groups=4(adm),20(dialout),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),108(lxd),114(netdev),999(docker),1000(ubuntu)
PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.023 ms


Terminal 2

$ cat /proc/4696/status | grep Cap
CapInh: 0000000000000000
CapPrm: 0000000000002000
CapEff: 0000000000000000
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000


So what's going on here? Let's explain:

  • --user=$USER: Again, we want a less privileged bash session before executing ping_clone.
  • --inh=cap_net_raw: We have intentionally removed this argument to show that ping_clone still works without inheritable capabilities.
  • Terminal 1#9 Current: The empty set is expected. We did not pass --keep=1, so the bash session lost its permitted set on the UID change.
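
The difference between the two Current lines (capabilities with +p when --keep=1 was passed in Example #1, empty here) comes down to the keep-caps flag at the moment the UIDs change from root to a non-root user. A minimal model of that rule as described in capabilities(7); drop_root is a hypothetical name, and 0x31c0 is the mask for cap_setgid, cap_setuid, cap_setpcap, cap_net_admin and cap_net_raw.

```shell
#!/bin/sh
# Model only: all UIDs of a process switch from 0 to a non-zero value.

# drop_root KEEP_CAPS PERMITTED
drop_root() {
    # With keep-caps the permitted set survives; without it, it is cleared.
    if [ "$1" -eq 1 ]; then perm=$(( $2 )); else perm=0; fi
    # The effective set is cleared in both cases.
    printf 'CapPrm: %016x\nCapEff: %016x\n' "$perm" 0
}

drop_root 1 0x31c0   # with --keep=1: permitted set retained
drop_root 0 0x31c0   # without it: both sets end up empty
```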


Example #3: When File Effective Bit is set

The file effective bit makes more sense for application binaries like cat, nice, etc., which are unaware of the capget() and capset() syscalls and cannot change their thread's effective set. They rely on external conditions, such as the file effective bit, to copy all capabilities of the permitted set into the effective set.


Instead of the ping_clone utility, we will use the top_clone utility for this demonstration.

Terminal 1

# Privileged top binary
$ getcap top_clone
top_clone = cap_chown+ep

$ ./top_clone 
....
uid=1000(ubuntu)
top - 09:44:35 up 13:25,  0 users,  load average: 0.15, 0.05, 0.01
Tasks: 120 total,   2 running,  79 sleeping,   0 stopped,   0 zombie
.....


Terminal 2

CapInh: 0000000000000000
CapPrm: 0000000000000001
CapEff: 0000000000000001
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000


So what's going on with the thread capabilities?

  • Terminal 2#2 CapPrm: The Capabilities Transition logic above gives the final state of the thread's permitted set (0x0000000000000001 = cap_chown), which matches the file permitted set.


$ getcap top_clone
top_clone = cap_chown+ep


  • Terminal 2#3 CapEff: Since the file effective flag/bit is set for top_clone, the kernel automatically copies the permitted set into the effective set.


CapPrm: 0000000000000001
CapEff: 0000000000000001