This commit adds zone-aware magics and probing functions for zoned btrfs.
The superblock (and its copies) are the only data structure in btrfs with a
fixed location on a device. Since we cannot do overwrites in a sequential
write required zone, we cannot place the superblock in the zone.
Thus, zoned btrfs uses superblock log writing to update superblocks on
sequential write required zones. It uses two zones as a circular buffer to
write updated superblocks. Once the first zone is filled up, start writing
into the second buffer. When both zones are filled up, and before starting
to write to the first zone again, it reset the first zone.
We can determine the position of the latest superblock by reading the write
pointer information from a device. One corner case is when both zones are
full. For this situation, we read out the last superblock of each zone and
compare them to determine which zone is older.
The magics can detect a superblock magic ("_BHRfs_M") at the beginning of
zone #0 or zone #1 to see if it is zoned btrfs. When both zones are filled
up, zoned btrfs resets the first zone to write a new superblock. If btrfs
crashes at the moment, we do not see a superblock at zone #0. Thus, we need
to check not only zone #0 but also zone #1.
It also supports the temporary magic ("!BHRfS_M") in zone #0. Mkfs.btrfs
first writes the temporary superblock to the zone during the mkfs process.
It will survive there until the zones are filled up and reset. So, we also
need to detect this temporary magic.
Finally, this commit extends probe_btrfs() to load the latest superblock
determined by the write pointers.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
All simple function to parse --lock <mode> and $LOCK_BLOCK_DEVICE,
and to flock the fd.
The supported <mode> is:
"1" or "yes" - LOCK_EX
"0" or "no" - do nothing
"nonblock" - LOCK_EX | LOCK_NB
The function tries LOCK_NB before the solo LOCK_EX and prints
inform user that it will wait, for example:
session A:
# sfdisk --lock /dev/sdc
session B:
# sfdisk --lock /dev/sdc
sfdisk: /dev/sdc: device already locked, waiting to get lock ...
^C
# sfdisk --lock=nonblock /dev/sdc
sfdisk: /dev/sdc: device already locked
Addresses: https://github.com/karelzak/util-linux/issues/921
Signed-off-by: Karel Zak <kzak@redhat.com>
Purpose of this function is to open a path that is potentially pointing to a
block device or file without races. The function also proper open(3) flags
are used to check the device is not busy, and finally warning is been
printed if a block device happens to be misaligned.
Signed-off-by: Sami Kerola <kerolasa@iki.fi>
We use the code from include/ and lib/ on many places, so use public
domain if possible or LGPL for code copied from libs.
Signed-off-by: Karel Zak <kzak@redhat.com>
We should have the most basic of checks in this library to see whether or not a block device is being used.
Signed-off-by: Davidlohr Bueso <dave@gnu.org>
This function uses the BLKPBSZGET ioctl to get the physical block size
of the device.
Signed-off-by: Davidlohr Bueso <dave@gnu.org>
Signed-off-by: Karel Zak <kzak@redhat.com>
This functions returns the status of the device's alignment. It will
be 0 when aligned, otherwise return the offset.
[kzak@redhat.com: - returns 0 if the ioctl failed]
Signed-off-by: Davidlohr Bueso <dave@gnu.org>
Signed-off-by: Karel Zak <kzak@redhat.com>
The _IO macro is defined in sys/ioccom.h on various platforms. However,
on Solaris it isn't included by ioctl.h, so include it explicitly if
available.
Signed-off-by: Fabian Groffen <grobian@gentoo.org>
[kzak@redhat.com: - cleanup
- add long options
- add note about DM to the man page
- use err.h and nls.h]
Signed-off-by: Hajime Taira <htaira@redhat.com>
Signed-off-by: Karel Zak <kzak@redhat.com>
new options:
--getpbsz get physical block (sector) size
--getiomin get minimum I/O size
--getioopt get optimal I/O size
--getalignoff get alignment offset
--getmaxsect get max sectors per request
Signed-off-by: Karel Zak <kzak@redhat.com>
Due to a change in kernel behaviour when opening CDROM devices,
we need to retry the open/mount call when ENOMEDIUM is returned.
Explanation from Tejun Heo:
Okay, the difference is from the addition of cdrom_get_media_event()
call to both sr_drive_status() and ide_cdrom_drive_status().
Previously, the cdrom driver can't differentiate between tray closed
w/ no media and tray open and always returned tray open, which
triggers close and retry in the open logic which probably have delayed
things enough to get the media recognized.
Now the cdrom driver can discern between tray closed w/o media and
device not ready for other reasons and returns -ENOMEDIUM on the
former. This is all good and dandy but the problem seems that some
drives report no media right after the tray is closed but it hasn't
properly detected the media yet.
It seems the only way to work around the problem is via sensible
retries (e.g. try three times 5 secs apart) and I don't think we can
add that type of retry logic into cdrom open path. Please note that
the previous logic wasn't water proof. Some drives can take longer to
recognize the media is there and could have failed the in-kernel retry
too. Also, reading the media can take quite some time and during that
period the drive reports media present but device not ready. The
driver will retry the command (e.g. READ TOC for open) five times but
all of them can fail w/ EMEDIUMTYPE.
[kzak@redhat.com: - add CRDOM_NOMEDIUM_RETRIES to blkdev.h
- add verbose message to mount.c]
Signed-off-by: Matthias Koenig <mkoenig@suse.de>
Signed-off-by: Karel Zak <kzak@redhat.com>
[kzak@redhat.com: split the original patch to small patches]
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Karel Zak <kzak@redhat.com>
Now we duplicate BLK* ioctls on many places... This patch also
fix BLKGETSIZE64 usage in dependence on kernel version.
Co-Author: Karel Zak <kzak@redhat.com>
Signed-off-by: Stefan Krah <stefan@bytereef.org>
Signed-off-by: Karel Zak <kzak@redhat.com>