Help needed to identify golang fork / memory corruption issue on FreeBSD

Konstantin Belousov kostikbel at gmail.com
Tue Mar 28 11:39:07 UTC 2017


On Tue, Mar 28, 2017 at 09:48:23AM +0100, Steven Hartland wrote:
> On 28/03/2017 09:38, Konstantin Belousov wrote:
> > On Tue, Mar 28, 2017 at 09:23:24AM +0100, Steven Hartland wrote:
> >> As I stopped the panic before that I couldn't tell so I've re-run with
> >> some debug added just before the panic to capture the addresses of the
> >> workbuf structure that the issue was detected in, here goes (parent:
> >> 62620, child: 98756):
> >>
> >> workbuf: 0x800b51800
> >> fatal error: workbuf is not empty
> >> workbuf: 0x800a72000
> >> fatal error: workbuf is empty
> >> workbuf: 0x800a72000
> >> fatal error: workbuf is not empty
> > I do not understand.  Why do you show several addresses ?  Wouldn't the
> > runtime panic after detecting the discrepancy, so there could be only one
> > address ?
> There are several goroutines (threads) running, and each detected an error.
> As I'm blocking the panic by entering a sleep in the faulting goroutine
> to enable the capture of procstat, the other goroutines continue and
> detect an error too.
Ok.

So I tried to simulate the load with an isolated test. The code below is
naive, but it should illustrate the idea. The parent allocates a number
of private anonymous mappings, then starts threads which write bytes into
those areas. Simultaneously, the parent forks children which write a
distinct byte into the same anonymous memory.

The parent checks that it never sees the byte written by the children.

So far it has not tripped on my test machine.  Feel free to play with it;
if you have more insight into what the Go runtime does, modify the code to
simulate the failing test more accurately.

/* $Id: cowfail.c,v 1.1 2017/03/28 11:29:58 kostik Exp kostik $ */

#include <sys/types.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <err.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static char **areas;
static int nareas, nchildren, children, nthreads;
static size_t areasz;
static const char parent_chars[] = "ab";
static const char child_char = 'c';

static int
gen_idx(void)
{

	return (random() % nareas);
}

static void
fill_area(int idx, bool parent)
{
	char *area;
	char f;

	area = areas[idx];
	/* sizeof() counts the terminating NUL, so exclude it from the choice. */
	f = parent ? parent_chars[random() % (sizeof(parent_chars) - 1)] :
	    child_char;
	memset(area, f, areasz);
}

static void
check_area(int idx)
{
	char *area;
	size_t i;

	area = areas[idx];
	for (i = 0; i < areasz; i++) {
		if (area[i] == child_char)
			errx(1, "corrupted area");
	}
}

static void
child(void)
{
	int i, idx;

	for (i = 0; i < 100; i++) {
		idx = gen_idx();
		fill_area(idx, false);
	}
	_exit(0);
}

static void *
wthread(void *arg __unused)
{

	for (;;) {
		fill_area(gen_idx(), true);
		check_area(gen_idx());
	}
	return (NULL);
}

int
main(void)
{
	pthread_t thr;
	sigset_t sigs;
	pid_t pid;
	int error, i, status;

	nareas = 1024;
	nchildren = 8;
	nthreads = 4;
	areasz = 1024 * 1024;

	sigemptyset(&sigs);
	sigaddset(&sigs, SIGCHLD);
	error = sigprocmask(SIG_BLOCK, &sigs, NULL);
	if (error == -1)
		err(1, "sigprocmask");

	areas = calloc(nareas, sizeof(char *));
	if (areas == NULL)
		err(1, "calloc nareas");
	for (i = 0; i < nareas; i++) {
		areas[i] = mmap(NULL, areasz, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANON, -1, 0);
		if (areas[i] == MAP_FAILED)
			err(1, "mmap %d", i);
	}

	for (i = 0; i < nthreads; i++) {
		error = pthread_create(&thr, NULL, wthread, NULL);
		if (error != 0)
			errc(1, error, "pthread_create");
	}

	/* Keep nchildren children alive: fork up to the limit, then reap one. */
	for (;;) {
		if (children < nchildren) {
			pid = fork();
			if (pid == -1) {
				err(1, "fork");
			} else if (pid == 0) {
				child();
			} else {
				children++;
			}
		} else {
			pid = waitpid(-1, &status, 0);
			if (pid == -1) {
				if (errno != EINTR)
					err(1, "waitpid");
			} else {
				children--;
			}
		}
	}
}

