For nsenter(1) we already support namespace specification by file
(e.g. bind mount to namespace /proc/[pid]/ns/[type] file). For
example:
# nsenter --uts=/some/path
This patch extends unshare(1) to setup the bind mount for specified
namespace, for example
# touch /some/path
# unshare --uts=/some/path hostname FOO
# nsenter --uts=/some/path hostname
FOO
Note that the problem is mount namespace, because create bind mount
to ns/mount file within unshared namespace does not make sense.
Based on patch from Lubomir Rintel <lkundrak@v3.sk>.
Signed-off-by: Karel Zak <kzak@redhat.com>
After "unshare --mount" users assume that mount operations within the
new namespaces are unshared (invisible for the rest of the system).
Unfortunately, this is not true and the behavior depends on the
current mount propagation setting. The kernel default is "private",
but for example systemd based distros use "shared". The solution is to
use (for example) "mount --make-private" after unshare(1).
I have been requested many times to provide less fragile and more
unified unshared mount setting *by default* to make things user
friendly.
The patch forces unshare(1) to explicitly use MS_REC|MS_PRIVATE for all
tree by default.
We can use something less (e.g MS_SLAVE), but "private" is the kernel
default, so for many users this change (feature) will be invisible.
This feature is possible to disable by "--propagation unchanged" or it's
possible to specify another propagation flag, supported are:
<slave|shared|private|unchanged>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Karel Zak <kzak@redhat.com>
Since Linux 3.19 the file /proc/self/setgroups controls setgroups(2)
syscall usage in user namespaces. This patch provides command line knob
for this feature.
The new --setgroups does not automatically implies --user to avoid
complexity, it's user's responsibility to use it in right context. The
exception is --map-root-user which is mutually exclusive to
--setgroups=allow.
CC: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Karel Zak <kzak@redhat.com>
In rare cases droping groups with setgroups(0, NULL) is an operation
that can grant a user additional privileges. User namespaces were
allwoing that operation to unprivileged users and that had to be
fixed.
Update unshare --map-root-user to disable the setgroups operation
before setting the gid_map.
This is needed as after the security fix gid_map is restricted to
privileged users unless setgroups has been disabled.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
This adds a concise description of a tool to its usage text.
A first form of this patch was proposed by Steven Honeyman
(see http://www.spinics.net/lists/util-linux-ng/msg09994.html).
Signed-off-by: Benno Schulenberg <bensberg@justemail.net>
Since 6728ca10 we are using MS_PRIVATE and MS_REC which are not defined
in some systems's sys/mount.h.
Signed-off-by: Ruediger Meier <ruediger.meier@ga-group.nl>
This makes it very convenient to use make use of privileged actions
on CONFIG_USER_NS enabled kernels, without having to manually tinker
with uid_map and gid_map to obtain required credentials (as those
given upon unshare() vanish with call to execve() and lot of userspace
checks for euid==0 anyway).
Usage example:
$ unshare --uts
unshare: unshare failed: Operation not permitted
$ unshare --user --uts
[nfsnobody@odvarok ~]$ hostname swag
hostname: you must be root to change the host name
$ unshare -r --uts
[root@odvarok util-linux]# hostname swag
[root@odvarok util-linux]#
[kzak@redhat.com: - move code to map_id()
- use all-io.h
- add paths to pathnames.h]
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: Karel Zak <kzak@redhat.com>
Based on patch from Mike Frysinger <vapier@gentoo.org>.
Mike Frysinger wrote:
When it comes to pid namespaces, it's also useful for /proc to reflect
the current namespace. Again, this is easy to pull off, but annoying
to force everyone to do it themselves. So let's add a --mount-proc to
do the magic for us. The downside is that this also implies creating
a mount namespace as mounting the new pid namespace /proc over top the
system one will quickly break all other processes on the system.
Signed-off-by: Karel Zak <kzak@redhat.com>
Acked-by: Mike Frysinger <vapier@gentoo.or>
The ability of unshare to launch a new pid namespace is a bit limited.
The first process in the namespace is expected to be the "init" for it.
When it's not, you get bad behavior.
For example, trying to launch a shell in a new pid namespace fails very
quickly:
$ sudo unshare -p dash
# uname -r
3.8.3
# uname -m
dash: 2: Cannot fork
# ls -ld /
dash: 3: Cannot fork
# echo $$
1324
For this to work smoothly, we need an init process to actively watch over
things. But forcing people to re-use an existing init or write their own
mini init is a bit overkill. So let's add a --fork option to unshare to
do this common bit of book keeping. Now we can do:
$ sudo unshare -p --fork dash
# uname -r
3.8.3
# uname -m
x86_64
# ls -ld /
drwxr-xr-x 22 root root 4096 May 4 14:01 /
# echo $$
1
Thanks to Michael Kerrisk for his namespace articles on lwn.net
[kzak@redhat.com: - fix "forkif logic, remove --mount-proc]
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Karel Zak <kzak@redhat.com>
The behaviour mimics chroot.
Possibly it would have been nicer to to query the password database in
the new namepace and run the shell of the user there, but it's hard to
do correctly. getpwuid() might need to load nss plugins, and the arch
in the new namespace might be different (in case of NEWNS mounts), or
the hostname might be different, etc. So in general it's not possible
to do it reliably.
Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Move the defitions of CLONE_NEWNS, CLONE_NEWUTS, CLONE_NEWIPC,
CLONE_NEWNET, CLONE_NEWUSER, CLONE_NEWPID into namespace.h in case
sched.h does not provide those definitions. Are there systems
around that are old enough that still need this?
Move the definitions of unshare() and setns() into namespace.h
for supporting old versions of libc that does not provice these.
I have tested this support with setns as I still have systems
old enough that glibc does not wrap setns.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
- Update the unshare application to support the pid and user namespaces.
- Update the man page for the new options
- Fix typo in the man page where UTS was spelled UTC.
- Remove the vestigal support for running a suid unshare.
After unsharing a user namespace setuid(getuid()) won't work because
no uid or gid mappings have been specified yet. So it is just easier not
to have any support for running suid.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
In addition to the unshare syscall, there exists the setns syscall, which
allows processes to migrate to the namepsaces of other processes. Add this
functionality into the unshare command, as they operate in a fairly simmilar
fashion.
Note: There was discussion of adding a path based namespace argument to unshare
in the origional discussion thread, but I opted to leave that out as it didn't
seem to fit in nicely with the current argument pattern. I figure we can always
add that in later if we need to
[kzak@redhat.com: - fix optional arguments
- do not call unshare if no flag specified
- use O_CLOEXEC
- codding style cleanup]
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Karel Zak <kzak@redhat.com>
Signed-off-by: Karel Zak <kzak@redhat.com>
Solaris lacks err, errx, warn and warnx. This also means the err.h header
doesn't exist. Removed err.h include from all files, and included err.h from
c.h instead if it exists, otherwise alternatives are provided.
Signed-off-by: Fabian Groffen <grobian@gentoo.org>
This patch drops potential euid privileges before executing the target
program. This allows to setuid unshare.
The unshare(1) is still distributed as non-setuid program.
Based on patch from Martin Pohlack <mp26@os.inf.tu-dresden.de>.
Signed-off-by: Karel Zak <kzak@redhat.com>
New utility allows to run process with separate mount, UTC, IPC or
network namespaces.
[kzak@redhat.com: - some cosmetic changes in usage() and err() usage
- move "if BUILD_UNSHARE" to separate place in Makefile.am
- add unshare to .gitignore]
Signed-off-by: Mikhail Gusarov <dottedmag@dottedmag.net>
Signed-off-by: Karel Zak <kzak@redhat.com>