tailf: fixed timing issue that could cause duplicate data output

The issue: in roll_file() we fstat() to find the file size, read() as much
data as we can, and then save the file size reported by fstat() to mark our
position. The bug occurs when we read past that reported size because more
data arrived while we were reading; the next call then seeks back to the stale
marker and outputs the overlapping bytes a second time. The attached patch
uses the current file position as the marker instead, with some extra logic
to handle tailing truncated files.
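For readers skimming the diff below, here is a condensed, compilable sketch
of the patched logic. This is an illustration only, not the shipped code:
the name roll_file_sketch, the trimmed error handling, and the polling
interval in main() are all simplifications or assumptions; gettext markup
and err() reporting are omitted.

#include <sys/stat.h>
#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void roll_file_sketch(const char *filename, off_t *size)
{
	char buf[BUFSIZ];
	struct stat st;
	off_t pos;
	ssize_t rc;
	int fd = open(filename, O_RDONLY);

	if (fd < 0)
		return;			/* sketch only; tailf reports errors */
	if (fstat(fd, &st) < 0) {
		close(fd);
		return;
	}

	if (st.st_size != *size && lseek(fd, *size, SEEK_SET) != (off_t) -1) {
		/* A writer may append while this loop runs, so we can read
		 * past the size fstat() just reported. */
		while ((rc = read(fd, buf, sizeof(buf))) > 0)
			fwrite(buf, 1, (size_t) rc, stdout);
		fflush(stdout);
	}

	/* The fix: remember where reading actually stopped.  Fall back to
	 * the stat size when nothing was read, so a truncated file still
	 * resets the marker. */
	pos = lseek(fd, 0, SEEK_CUR);
	*size = (pos != -1 && pos != *size) ? pos : st.st_size;

	close(fd);
}

int main(int argc, char **argv)
{
	off_t size = 0;

	if (argc != 2)
		return EXIT_FAILURE;
	for (;;) {			/* poll, roughly as tailf does */
		roll_file_sketch(argv[1], &size);
		usleep(250000);
	}
}

Note the pos != *size guard: when the file has been truncated, lseek() past
EOF still succeeds, read() returns 0, pos equals the stale marker, and the
marker is therefore reset to the new, smaller st.st_size.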

[kzak@redhat.com: - fix coding style]

Signed-off-by: Dima Kogan <dkogan@cds.caltech.edu>
Signed-off-by: Karel Zak <kzak@redhat.com>
Dima Kogan 2010-08-14 02:15:47 -07:00 committed by Karel Zak
parent 9695a7c653
commit 3a788d773b
1 changed file with 10 additions and 1 deletion


@@ -88,6 +88,7 @@ roll_file(const char *filename, off_t *size)
 	char buf[BUFSIZ];
 	int fd;
 	struct stat st;
+	off_t pos;
 
 	if (!(fd = open(filename, O_RDONLY)))
 		err(EXIT_FAILURE, _("cannot open \"%s\" for read"), filename);
@@ -111,8 +112,16 @@ roll_file(const char *filename, off_t *size)
 		}
 		fflush(stdout);
 	}
+
+	pos = lseek(fd, 0, SEEK_CUR);
+
+	/* If we've successfully read something, use the file position, this
+	 * avoids data duplication. If we read nothing or hit an error, reset
+	 * to the reported size, this handles truncated files.
+	 */
+	*size = (pos != -1 && pos != *size) ? pos : st.st_size;
+
 	close(fd);
-	*size = st.st_size;
 }
 
 static void