Re: Need advice: Better Jail integration into ps/top, setpwfile gone forever?

From: antranigv <antranigv_at_freebsd.am>
Date: Fri, 27 Aug 2021 09:34:48 UTC

Dear Jamie and Alan,

Thank you both for your input.

Jamie, I totally understand your point: for an issue like this to be fixed, we
need to patch some subsystem. I'm not sure I have enough knowledge to do that,
but what you describe is called UID virtualization. If I am not mistaken,
illumos systems do that for their Zones. I have to look into it; hopefully
something similar can be done in FreeBSD.

So,
1) Yes, patching only ps/top will not solve the issue: what about htop,
sockstat, procstat, etc.?
2) The ideal way is to make a major change: look into what other systems are
doing and try to implement it here.

Alan, I got a PoC working!

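#include <sys/param.h>
#include <sys/types.h>
#include <sys/jail.h>
#include <sys/socket.h>
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char *argv[]){
        struct passwd *pwd;
        int sp[2], uid;
        pid_t pid;

        if (argc < 2 ) { printf("usage: spjail UID\n"); exit(1); }
        uid = atoi(argv[1]);
        printf("Got UID: %d\n", uid);

        if (socketpair(PF_LOCAL, SOCK_STREAM, 0, sp) == -1) {
                perror("socketpair");
                exit(1);
        }

        pid = fork();
        if (pid == -1) {
                perror("fork");
                exit(1);
        }

        if (pid == 0){
                /*
                 * NOTE: sizeof(pwd->pw_name) is the size of a pointer (8 on
                 * amd64), not the length of the name, so every message is
                 * that many bytes and longer names get truncated.
                 */
                char msg[sizeof(pwd->pw_name)] = "";

                close(sp[0]);
                printf("In the child process\n");
                printf("Going into Jail\n");
                /* The JID (35) is hard-coded for this PoC. */
                if (jail_attach(35) == -1) {
                        perror("jail_attach");
                        _Exit(1);
                }
                pwd = getpwuid(uid);
                if (pwd == NULL) {
                        printf("In the Jail: no user found\n");
                        write(sp[1], msg, sizeof(msg));
                } else {
                        printf("In the Jail: %s\n", pwd->pw_name);
                        strncpy(msg, pwd->pw_name, sizeof(msg));
                        write(sp[1], msg, sizeof(msg));
                        printf("Message sent\n");
                }
                _Exit(0);
        } else {
                /* One extra byte so the buffer can always be NUL-terminated. */
                char buf[sizeof(pwd->pw_name) + 1];

                close(sp[1]);
                printf("I'm the parent\n");
                int n = read(sp[0], buf, sizeof(pwd->pw_name));
                buf[n > 0 ? n : 0] = '\0';
                printf("got %d bytes\n", n);
                printf("parent got : '%s'\n", buf);
                /* Same lookup on the host, for comparison. */
                pwd = getpwuid(uid);
                if (pwd == NULL) {
                        printf("In the parent: no user found\n");
                } else {
                        printf("In the parent: %s\n", pwd->pw_name);
                }
        }
        printf("Done executing\n");
        return 42;
}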

Here's some sample output:

root@srv0:~/src # ./spjail 1001
Got UID: 1001
I'm the parent
In the child process
Going into Jail
In the Jail: romero
Message sent
got 8 bytes
parent got : 'romero'
In the parent: no user found
Done executing

root@srv0:~/src # ./spjail 1000
Got UID: 1000
I'm the parent
In the child process
Going into Jail
In the Jail: no user found
got 8 bytes
parent got : ''
In the parent: no user found
Done executing
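
The "got 8 bytes" in both runs comes from sizeof(pwd->pw_name), which is the
size of a pointer and not the length of the name, so a longer user name would
be cut short. Below is a minimal sketch of a less fragile variant of the same
idea. The function name jail_getpwuid_name is hypothetical (it is not an
existing API), and treating a negative jid as "do not attach" is my own
assumption, added so that the code can also be tried on the host or on a
system without jails.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#ifdef __FreeBSD__
#include <sys/param.h>
#include <sys/jail.h>
#endif
#include <pwd.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not an existing API: resolve uid to a user name as
 * seen from inside jail jid.  A negative jid skips jail_attach(2).
 * Returns 0 and fills name on success, -1 if no user was found or on error.
 */
int
jail_getpwuid_name(int jid, uid_t uid, char *name, size_t namelen)
{
	struct passwd *pwd;
	size_t off = 0;
	ssize_t n;
	pid_t pid;
	int sp[2], status;

	if (namelen == 0 || socketpair(PF_LOCAL, SOCK_STREAM, 0, sp) == -1)
		return (-1);
	pid = fork();
	if (pid == -1) {
		close(sp[0]);
		close(sp[1]);
		return (-1);
	}
	if (pid == 0) {
		/* Child: attaching is one-way, so look up, reply and exit. */
		close(sp[0]);
#ifdef __FreeBSD__
		if (jid >= 0 && jail_attach(jid) == -1)
			_exit(1);
#else
		if (jid >= 0)
			_exit(1);	/* no jails on this system */
#endif
		pwd = getpwuid(uid);
		if (pwd == NULL)
			_exit(1);
		/* Send the name itself: strlen(), not sizeof(char *). */
		if (write(sp[1], pwd->pw_name, strlen(pwd->pw_name)) == -1)
			_exit(1);
		_exit(0);
	}

	/* Parent: read until EOF, which arrives when the child exits. */
	close(sp[1]);
	while (off < namelen - 1 &&
	    (n = read(sp[0], name + off, namelen - 1 - off)) > 0)
		off += (size_t)n;
	name[off] = '\0';
	close(sp[0]);

	if (waitpid(pid, &status, 0) == -1)
		return (-1);
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0 || off == 0)
		return (-1);
	return (0);
}
```

With the jail from the runs above, a call such as
jail_getpwuid_name(35, 1001, buf, sizeof(buf)) would be expected to fill buf
with "romero". It still costs one fork per lookup.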

Now, as Jamie said, this is not a good idea; it's basically a hack. I will
try to find a proper way to virtualize the UID table, write a PoC for that,
and see what the community thinks.

I like Jamie's approach, but I would not want it to be manual. We need some 
automated way. Sure, I can integrate that into my Jail Orchestrator, but some 
people want to use jail(8) with jail.conf, and it should be simple for them as 
well.

Thank you all. Any more input will be appreciated.

Kind regards,

--
antranigv
https://antranigv.am/

> On 26 Aug 2021, at 5:43 AM, Alan Somers <asomers@freebsd.org> wrote:
> 
> On Mon, Aug 23, 2021 at 4:03 AM antranigv <antranigv@freebsd.am> wrote:
> 
>> Greetings all,
>> 
>> I am trying to have better integration of top(1) and ps(1) with FreeBSD
>> Jails.
>> 
>> The main problem that I am trying to solve is displaying the correct UID
>> username. Here's an example.
>> 
>> I have a host (srv0) running a Jail named "fsoc". The Jail "fsoc"
>> has a user named "romero" with the UID 1001.
>> 
>> If I run `ps auxd` in the Jail, I get the following,
>> 
>> romero@fsoc:~ $ ps auxd
>> USER    PID %CPU %MEM   VSZ  RSS TT  STAT STARTED    TIME COMMAND
>> root   4377  0.0  0.0 11376  956  -  SsJ  14:15   0:00.38 /usr/sbin/syslogd -ss
>> root   5758  0.0  0.1 13128 1352  1  IJ   18:24   0:00.02 /bin/tcsh -i
>> root   5763  0.0  0.0 12048  960  1  IJ   18:24   0:00.01 - su - romero
>> romero 5764  0.0  0.1 12120 2268  1  SJ   18:24   0:00.02 `-- -su (sh)
>> romero 9625  0.0  0.1 11684 2576  1  R+J  09:41   0:00.01   `-- ps auxd
>> 
>> Good!
>> 
>> However, if I try to run it on the host, here's what I get,
>> 
>> root@srv0:~ # ps auxd -J fsoc
>> USER  PID %CPU %MEM   VSZ  RSS TT  STAT STARTED    TIME COMMAND
>> root 4377  0.0  0.0 11376  956  -  SsJ  14:15   0:00.38 /usr/sbin/syslogd -ss
>> root 5758  0.0  0.1 13128 1352  1  IJ   18:24   0:00.02 /bin/tcsh -i
>> root 5763  0.0  0.0 12048  960  1  IJ   18:24   0:00.01 - su - romero
>> 1001 5764  0.0  0.1 12124 2436  1  I+J  18:24   0:00.02 `-- -su (sh)
>> 
>> As you can see, in the User field it says 1001, because the host does not
>> have a user with that UID.
>> 
>> This seems fine, but it becomes an issue when you have multiple Jails
>> running on a large host.
>> 
>> Here's an example if the host had a user with UID 1001,
>> 
>> root@pingvinashen:~ # ps auxd -J oragir
>> USER        PID %CPU %MEM    VSZ   RSS TT  STAT STARTED    TIME COMMAND
>> root        949  0.0  0.0  11344  2584  -  IsJ  Mon19   0:01.13 /usr/sbin/cron -s
>> root       1962  0.0  0.0  11428  2796  -  SsJ  Mon19   0:01.83 /usr/sbin/syslogd -ss
>> antranigv 95342  0.0  0.0  11004  2424  -  IsJ  Mon19   0:00.48 daemon: /usr/home/oragir/writefreely/writefreely[9992] (daemon)
>> antranigv  9992  0.0  0.4 767244 58336  -  IJ   Mon19   2:58.87 - /usr/home/oragir/writefreely/writefreely
>> 
>> Now, you would think that this is good, however, if you run this in the
>> jail,
>> 
>> root@oragir:~ # ps auxd
>> USER      PID %CPU %MEM    VSZ   RSS TT  STAT STARTED    TIME COMMAND
>> root      949  0.0  0.0  11344  2584  -  SsJ  Mon15   0:01.13 /usr/sbin/cron -s
>> root     1962  0.0  0.0  11428  2796  -  SsJ  Mon15   0:01.83 /usr/sbin/syslogd -ss
>> oragir  95342  0.0  0.0  11004  2424  -  IsJ  Mon15   0:00.48 daemon: /usr/home/oragir/writefreely/writefreely[9992] (daemon)
>> oragir   9992  0.0  0.4 767244 58336  -  IJ   Mon15   2:58.88 - /usr/home/oragir/writefreely/writefreely
>> root    88228  0.0  0.0  13336  4004  8  SJ   09:45   0:00.01 /bin/csh -i
>> root    99502  0.0  0.0  11824  3140  8  R+J  09:45   0:00.00 - ps auxd
>> 
>> As you can see, the UID 1001 was not `antranigv`, instead it was `oragir`.
>> 
>> This has been an issue for me, so I tried writing some code to implement
>> the following.
>> 
>> If the process is in a Jail, then change the passwd db from /etc to
>> /path/of/the/jail/etc.
>> 
>> I thought it would be an easy thing to do, but not so much.
>> 
>> Here's what I've tried.
>> 
>> 1) Call jail_attach and run ps inside the Jail. Oh yeah, it's a jail!
>> After attaching to it there is no way to detach :-) Silly me!
>> 
>> 2) Change the passwd file for getpwuid/getpwnam. I wanted to use
>> setpwfile(3), but it turns out that:
>> COMPATIBILITY
>>     The historic function setpwfile(3), which allowed the specification of
>>     alternate password databases, has been deprecated and is no longer
>>     available.
>> 
>> Okay, so I looked into how other tools like pwd_mkdb are written, and I
>> see that everything is defined (pun intended) the following way,
>> 
>> in /usr/include/pwd.h
>> 
>> #define _PATH_PWD               "/etc"
>> #define _PATH_PASSWD            "/etc/passwd"
>> #define _PASSWD                 "passwd"
>> #define _PATH_MASTERPASSWD      "/etc/master.passwd"
>> #define _MASTERPASSWD           "master.passwd"
>> 
>> #define _PATH_MP_DB             "/etc/pwd.db"
>> #define _MP_DB                  "pwd.db"
>> #define _PATH_SMP_DB            "/etc/spwd.db"
>> #define _SMP_DB                 "spwd.db"
>> 
>> #define _PATH_PWD_MKDB          "/usr/sbin/pwd_mkdb"
>> 
>> and pwd_mkdb does the following
>> 
>> ...
>> strcpy(prefix, _PATH_PWD);
>> ...
>>                case 'd':
>>                        dflag++;
>>                        strlcpy(prefix, optarg, sizeof(prefix));
>>                        break;
>> ...
>> 
>> Turns out it parses the DB file, but I don't want to do that in ps/top! :-)
>> 
>> 3) Just for fun, I played with chroot. I tried the following code.
>> # cat getpw.c
>> 
>> #define MAXHOSTNAMELEN  255
>> #define MAXPATHLEN      255
>> #include <pwd.h>
>> //#include <jail.h>
>> #include <stdio.h>
>> #include <unistd.h>
>> #include <sys/uio.h>
>> #include <sys/jail.h>
>> #include <sys/param.h>
>> #include <sys/types.h>
>> 
>> int main(){
>>        // Just get root!
>>        struct passwd *pwd;
>>        printf("just root: %s\n", (getpwuid(0))->pw_name);
>> 
>>        // let's try with undef/define
>> #undef _PATH_PWD
>> #undef _PATH_PASSWD
>> #undef _PASSWD
>> #undef _PATH_MASTERPASSWD
>> #undef _MASTERPASSWD
>> 
>> #undef _PATH_MP_DB
>> #undef _MP_DB
>> #undef _PATH_SMP_DB
>> #undef _SMP_DB
>> 
>> #define _PATH_PWD               "/zdata/jails/fsoc/etc"
>> #define _PATH_PASSWD            "/zdata/jails/fsoc/etc/passwd"
>> #define _PASSWD                 "passwd"
>> #define _PATH_MASTERPASSWD      "/zdata/jails/fsoc/etc/master.passwd"
>> #define _MASTERPASSWD           "master.passwd"
>> 
>> #define _PATH_MP_DB             "/zdata/jails/fsoc/etc/pwd.db"
>> #define _MP_DB                  "pwd.db"
>> #define _PATH_SMP_DB            "/zdata/jails/fsoc/etc/spwd.db"
>> #define _SMP_DB                 "spwd.db"
>>        pwd = getpwuid(1001);
>>        if (pwd == NULL) {
>>                printf("using undef/define: no user found\n");
>>        } else {
>>                printf("using undef/define: %s\n", pwd->pw_name);
>>        }
>> 
>>        // let's try with chroot!
>>        chroot("/zdata/jails/fsoc");
>>        pwd = getpwuid(1001);
>>        if (pwd == NULL) {
>>                printf("after chroot: no user found\n");
>>        } else {
>>                printf("after chroot: %s\n", pwd->pw_name);
>>        }
>> 
>>        // escape back the chroot ;-)
>>        chroot("../../../../");
>>        pwd = getpwuid(1001);
>>        if (pwd == NULL) {
>>                printf("after unchroot: no user found\n");
>>        } else {
>>                printf("after unchroot: %s\n", pwd->pw_name);
>>        }
>> 
>>        return 42;
>> }
>> 
>> And I get the following:
>> 
>> # ./getpw
>> just root: root
>> using undef/define: no user found
>> after chroot: romero
>> after unchroot: no user found
>> 
>> So, any advice? Should I do chroot in ps? (No, I don't think that's a good
>> idea.) Should I add a new call that implements setpwfile(3)? But I really
>> want to know why it was removed; I'm sure there's a story there. Or is
>> there a better way?
>> 
>> Kind regards, have a nice day!
>> 
>> --
>> antranigv
>> https://antranigv.am/
>> 
> 
> Oof, that's a hard problem.  But in fact, it's worse than you think.
> /etc/passwd isn't the only place that user information is stored.   It
> could be in NIS, or LDAP, or Heimdal, or who knows where.  The only
> reliable way to look up a user is to actually do it from within the jail.
> You could create a socketpair, then fork a child process and attach that to
> the jail, and use it to lookup jailed UIDs.  It's complicated, but I can't
> imagine anything simpler that would work.
> -Alan
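
Alan's description (one child attached to the jail, then used to look up
jailed UIDs) also suggests a long-lived helper, since a tool like ps resolves
many UIDs and a fork per lookup would add up. A rough sketch of that shape
follows. Everything named uid_resolver_* is hypothetical, the 64-byte name
limit is an assumed bound, and a negative jid skips jail_attach(2) so the
code can be tried outside a jail.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#ifdef __FreeBSD__
#include <sys/param.h>
#include <sys/jail.h>
#endif
#include <pwd.h>
#include <string.h>
#include <unistd.h>

#define RESOLVER_NAMELEN 64	/* assumed upper bound on a login name */

/* Hypothetical handle for the helper process; not an existing API. */
struct uid_resolver {
	pid_t	pid;	/* helper process */
	int	fd;	/* our end of the socketpair */
};

/* Send or receive exactly len bytes; returns 0 on success, -1 otherwise. */
static int
xfer(int fd, void *buf, size_t len, int sending)
{
	char *p = buf;
	ssize_t n;

	while (len > 0) {
		n = sending ? send(fd, p, len, MSG_NOSIGNAL) :
		    recv(fd, p, len, 0);
		if (n <= 0)
			return (-1);
		p += n;
		len -= (size_t)n;
	}
	return (0);
}

/* Fork the helper; it attaches to jail jid (skipped when jid < 0). */
int
uid_resolver_start(struct uid_resolver *r, int jid)
{
	char name[RESOLVER_NAMELEN];
	struct passwd *pwd;
	uid_t uid;
	int sp[2];

	if (socketpair(PF_LOCAL, SOCK_STREAM, 0, sp) == -1)
		return (-1);
	r->pid = fork();
	if (r->pid == -1) {
		close(sp[0]);
		close(sp[1]);
		return (-1);
	}
	if (r->pid > 0) {		/* parent keeps sp[0] */
		close(sp[1]);
		r->fd = sp[0];
		return (0);
	}

	close(sp[0]);			/* child serves requests on sp[1] */
#ifdef __FreeBSD__
	if (jid >= 0 && jail_attach(jid) == -1)
		_exit(1);
#else
	if (jid >= 0)
		_exit(1);		/* no jails on this system */
#endif
	/* Request: one uid_t.  Reply: a fixed-size, zero-padded name. */
	while (xfer(sp[1], &uid, sizeof(uid), 0) == 0) {
		memset(name, 0, sizeof(name));
		pwd = getpwuid(uid);
		if (pwd != NULL)
			strncpy(name, pwd->pw_name, sizeof(name) - 1);
		if (xfer(sp[1], name, sizeof(name), 1) == -1)
			break;
	}
	_exit(0);
}

/* Ask the helper about uid; name must hold RESOLVER_NAMELEN bytes. */
int
uid_resolver_lookup(struct uid_resolver *r, uid_t uid, char *name)
{
	if (xfer(r->fd, &uid, sizeof(uid), 1) == -1 ||
	    xfer(r->fd, name, RESOLVER_NAMELEN, 0) == -1)
		return (-1);
	return (name[0] != '\0' ? 0 : -1);	/* empty reply: no such user */
}

/* Closing our end gives the helper EOF, so it exits; then reap it. */
void
uid_resolver_stop(struct uid_resolver *r)
{
	close(r->fd);
	waitpid(r->pid, NULL, 0);
}
```

A caller would start one resolver per jail, call uid_resolver_lookup() for
each process it displays, and stop the resolver when done. Fixed-size
messages keep the framing trivial at the cost of truncating names longer
than the assumed limit.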