Re: ARC model specified in spinroot/promela
- In reply to: Alexander Leidinger : "Re: ARC model specified in spinroot/promela"
- Go to: [ bottom of page ] [ top of archives ] [ this month ]
Date: Fri, 15 Sep 2023 15:05:34 UTC
>>>>> Alexander Leidinger <Alexander@Leidinger.net> writes: > Am 2023-09-14 16:40, schrieb Mathew, Cherry G.*: >> A few things remain WIP. Obviously, this is a "toy" model - it >> has stack based "memory" management, and a directory buffer size >> of 64. So that needs to be hammered on a bit more. Further, I'm >> keen to now dip my toes into re-entrancy. If you noticed in the >> earlier patches, there were model routines for >> "mutex_enter()/mutex_exit()". I'll re-look these and play around >> a little bit with concurrent entry of ARC() (ARC() is now >> strictly entered sequentially in a loop - see arc.drv). Once >> things settle down, I will look at making an actual block disk >> driver with an ARC(9) managed cache using this code. > Without having looked at anything in the code, and now with your > comment about an "actual block disk driver"... AFAIR your ARC is > based upon the paper, and this is not used anywhere in > FreeBSD. ZFS is using a modified/adapted version of the ARC. What > you are describing sounds a bit like our buffer cache > (https://wiki.freebsd.org/BasicVfsConcepts#:~:text=Buffer%20cache%20is%20a%20layer,on%20a%20vnode%2Blbn%20basis.), > but our buffer cache is not an ARC, it is AFAIR LRU eviction > based. Hi Alexander - thank you for the comments! Regarding the buffer cache, that is an excellent suggestion, and there are a few things that need to fall in place before it can be meaningfully verified. I'd say that the first step is to put in place "Equivalence verification" code for all the backing infrastructure that most kernel code uses - for example things that mediate or enable concurrency, such as cv(9), mutex(9), spl(9) (Not sure if that exists in FreeBSD), etc. Once those are in place, it will be much easier to write model+invariance code with some measure of confidence, along with extraction. I already have very very rudimentary code for this that I wrote for a QNX job interview (which, very puzzlingly, for a project that claims to support critical systems, they found uninteresting to the point of having zero questions about!) - the point is - it's actually not that hard to put it together (I did it over about 20hours of focussed work) and the benefits are enormous, from what I can tell. Regarding the "block driver" I'm thinking of - at the moment, in my head, it's really not that ambitious at all - it's just a self contained read-cache that could potentially front any existing block driver - for eg: a super slow spindisk based driver could get potential block level speed up. I don't know if the buffer cache works in this way, or it only caches data along specific code paths such as read(2) - this is not an area of the OS I'm familiar with. But there is no reason why the buffer cache itself can't be modelled and verified, if it has been written reasonably carefully enough, modularity wise. > I fully understand that you are far away from doing real modelling > of such complex parts as the VFS parts of FreeBSD, but maybe this > info gives a hint in terms of looking into some parts and see if > you can adopt some parts in terms of design which can be anhanced > later on to some real FreeBSD stuff... Definitely noted, and thank you for the signpost! I'm pasting below some early work I did in February, just to give a sense of what it might look like, if an existing part of the (in this case NetBSD) kernel were to be attempted to be verified. Note that the work below is incomplete, but it did show PoC, which prompted me to investigate this line of reasoning further. I'm looking forward to working more in this area, and hope that the FreeBSD community will have some interest in supporting this work, especially once the equivalence code is in place, by verifying existing bits of code piecemeal until we can have the entire OS under at least a very basic verification harness! Looking forward. -- ~cherry diff -urN Makefile ./Makefile --- Makefile 1970-01-01 05:30:00.000000000 +0530 +++ ./Makefile 2023-02-22 00:51:21.589143403 +0530 @@ -0,0 +1,89 @@ +# This set of spinroot related files were written by cherry +# <spinme@bow.st> in the Gregorian Calendar year AD.2023, in the month +# of February that year. +# +# We have two specification files and a properties file +# +# The properties file contains "constraint" sections +# such as ltl or never claims (either or, not both). +# The specification is divided into two files: +# the file with suffix '.drv' is a "driver" which +# instantiates processes that will ultimately "drive" the +# models under test. +# The file with the suffix '.pml' contains the process +# model code, which, is intended to be the formal specification +# for the code we are interested in writing in C. +# +# We process these files in slightly different ways during +# the dev cycle, but broadly speaking, the idea is to create +# a file called 'spinmodel.pml' which contains the final +# model file that is fed to spin. +# +# Note that when we use the model extractor tool "modex" to +# extract the 'specification' from C code written to implement +# the model defined above. We use a 'harness' file (see file with +# suffix '.prx' below. +# +# Once the harness has been run, spinmodel.pml should be +# synthesised and processed as usual. +# +# The broad idea is that software dev starts by writing the spec +# first, validating the model, and then implementing the model in +# C, after which we come back to extract the model from the C file +# and cross check our implementation using spin. +# +# If things go well, the constraints specified in the '.ltl' file +# should hold exactly for both the handwritten model, and the +# extracted one. + +spin-gen: md.strategy.pml md.strategy.drv md.strategy.prop + cp md.strategy.pml model #mimic modex + cat md.strategy.drv > spinmodel.pml;cat model >> spinmodel.pml;cat md.strategy.prop >> spinmodel.pml; + spin -a spinmodel.pml + +spin-build: pan.* + cc -o pan pan.c + +spin-run: spin-build #XXX depend on pan + ./pan -a #Generate md.strategy.pml.trail on error + +# You run the trace only if the spin run above failed and created a trail +spin-trace: spinmodel.pml.trail + spin -t spinmodel.pml -p -g # -p (statements) -g (globals) -l (locals) -s (send) -r (recv) + ./pan -r spinmodel.pml.trail -g + +# Modex Extracts from md.c to 'model' - see md.strategy.prx +# Unfortunately there doesn't seem to be a way to specify a filename +# to generate +SRC = $HOME/work/NetBSD-src + +modex-gen: md.strategy.prx md.c + #modex -v -w -D_KERNEL -I$obj/home/antix/work/NetBSD-src/destdir.amd64/usr/include -I$$HOME/work/NetBSD-src/obj/home/antix/work/NetBSD-src/sys/arch/amd64/compile/GENERIC/ -I$$HOME/work/NetBSD-src/sys md.strategy.prx + #cat model > spinmodel.pml + touch ioconf.h # Pretend we have an Kern conf + modex -v -w md.strategy.prx + cat md.strategy.drv > spinmodel.pml;cat model >> spinmodel.pml;cat md.strategy.prop >> spinmodel.pml; + spin -a spinmodel.pml #Sanity check + +modex-gen-clean: + rm -f ioconf.h # Temp - see above. + rm -f spinmodel.pml # Our consolidated model file + rm -f _spin_nvr.tmp # Never claim file + rm -f model # modex generated intermediate "model" file + rm -f pan.* # Spin generated source files + rm -f _modex* # modex generated script files + rm -f *.I *.M +spin-gen-clean: + rm -f spinmodel.pml # Our consolidated model file + rm -f _spin_nvr.tmp # Never claim file + rm -f model # Intermediate "model" file + rm -f pan.* # Spin generated source files + +spin-build-clean: + rm -f pan + +spin-run-clean: + rm -f spinmodel.pml.trail + +clean: modex-gen-clean spin-gen-clean spin-build-clean spin-run-clean +# rm -f *~ diff -urN md.strategy.drv ./md.strategy.drv --- md.strategy.drv 1970-01-01 05:30:00.000000000 +0530 +++ ./md.strategy.drv 2023-02-22 00:26:27.428777059 +0530 @@ -0,0 +1,211 @@ +/* $NetBSD$ */ + +/* Spin model driver for NetBSD md(4) md.c written by cherry */ + +//#define MEMORY_DISK_SERVER + +/* XXX: Bitfields are messed up, because promella doesn't do hex */ + +#define B_WRITE 0 /* Write buffer (pseudo flag). */ +#define B_READ 1 /* Read buffer. */ + +/* Data types modelled on sys/buf.h */ + +/* Modelled on sys/buf.h:struct buf {}; */ +typedef buf_t { + bool b_iodone_flag; /* Internal - flags biodone() */ + int b_error; /* b: errno value. */ + int b_blkno; /* b: physical block number (volume relative) */ + int b_resid; /* b: remaining I/O. */ + int b_flags; /* b: B_* flags */ + int b_bcount; /* b: valid bytes in buffer */ + int b_data; /* b: fs private data */ +}; + +/* XXX: auto sync/extract with md.h ? */ +#define MD_UNCONFIGURED 0 +#define MD_KMEM_FIXED 1 +#define MD_KMEM_ALLOCATED 2 +#define MD_UMEM_SERVER 3 + +/* Device state. + * See: init() below; + * XXX: dynamic instantiation ? + */ +bit sc_lock; +int sc_size; +int sc_addr; +byte sc_type; +bool sc_dkbusy; + +/* + * XXX: These VAs are arbitrary - the vague intention is some form of + * range based validation. TBD. + */ +#define KERNVA 1024; + +#ifdef MEMORY_DISK_SERVER + +#define USERVA 1024; + +#define UMEM_QSIZE 5 /* Entries (arbitrary) - use judiciously - adds to statespace */ +chan sc_buflist = [UMEM_QSIZE] of { buf_t }; + +#define bufq_put(_buflist, _bp) _buflist!_bp +#define bufq_get(_buflist, _bp) _buflist?_bp + +#define copyin(_usrc, _kdst, _len) \ + memcpy(_kdst, _usrc, _len) + +#define copyout(_ksrc, _udst, _len) \ + memcpy(_udst, _ksrc, _len) + +#endif /* MEMORY_DISK_SERVER */ + +/* We model the disk as a channel with read/write "instructions" */ +chan sc_dkdev = [2] of { int/*dst*/, int/*src*/, int/*len*/ }; + +#define memcpy(_dst, _src, _len) { \ + assert(_dst != _src); \ + assert(_len != 0); \ + sc_dkdev!_dst, _src, _len; \ + sc_dkdev?_src, _dst, _len; \ +} + +/* Useful stubs */ +#define biodone(_buf) { \ + assert(_buf.b_iodone_flag != true); \ + _buf.b_iodone_flag = true; \ +} + +#define disk_busy() { sc_dkbusy = true } +#define disk_unbusy() { sc_dkbusy = false } + +/* XXX: Move these into a set of library includes ? */ +/* Note: CAS implemented in an atomic {} block */ +#define mutex_enter(_mutex) \ + atomic { \ + (_mutex == 0) -> _mutex = 1; \ + } + +#define mutex_exit(_mutex) \ + atomic { \ + assert(_mutex == 1); \ + (_mutex == 1) -> _mutex = 0; \ + } + +/* From sys/param.h */ +#define DEV_BSHIFT 9 /* log2(DEV_BSIZE) */ + +/* From sys/errno.h */ +#define EIO 5 /* Input/output error */ + +/* Other global state */ +int bytesread; +int byteswritten; + +#define NPROCS 5 /* + * Should be less than MDDISKSIZE/1024, see bcount, + * blkno etc. calculations below. + * pid is limited to max 255, including driver code in init() + * thus 254 available pids + */ +#define MDDISKSIZE 1024 * 1024 /* 1MB */ + +/* Drive the procs */ +init { + buf_t buffer; + + /* XXX: Use system init routines. */ + /* Initialise Device params */ + sc_type = MD_KMEM_FIXED; + sc_size = MDDISKSIZE; /* In bytes */ + sc_addr = KERNVA; /* Random non-zero "Kernel" address - we do not + do data path validation */ + + + pid proc; + + /* Test kernel vmem backed implementation first */ + atomic { + proc = 0; + do + :: proc < NPROCS -> + /* Use all code-paths */ + /* Even procs read, odd ones write */ + buffer.b_flags = (buffer.b_flags |((proc%2) -> B_READ : B_WRITE)); + + /* Create some variation in buffer sizes + * + * Note: The objective here is not to + * unit test - rather it is to exercise + * code paths in various process instantiations, + * and to verify their mutual interaction. + */ + buffer.b_bcount = (MDDISKSIZE * (proc + 1)) / (NPROCS); + + /* + * Block number offset within device + */ + buffer.b_blkno = ((buffer.b_bcount) >> DEV_BSHIFT); + + run p_mdstrategy(buffer); + proc++ + :: proc >= NPROCS -> + break + od + } + +#ifdef MEMORY_DISK_SERVER + + /* To reset or not, that is the non-question! */ + bytesread = 0; + byteswritten = 0; + + + /* + * Next we test the user memory server backed implementation. + * Here, we have to fire up a single backing server process + * (we assume just a single instance of the driver) which + * will serve the requested memory block ops on behalf of the + * requester processes. + */ + + sc_type = MD_UMEM_SERVER; + sc_addr = USERVA; /* Random non-zero "User" address - we do not + do data path validation */ + + + run p_md_server_loop(); + + atomic { + proc = 0; + do + :: proc < NPROCS -> + /* Use all code-paths */ + /* Even procs read, odd ones write */ + buffer.b_flags = (buffer.b_flags |((proc%2) -> B_READ : B_WRITE)); + + /* Create some variation in buffer sizes + * + * Note: The objective here is not to + * unit test - rather it is to exercise + * code paths in various process instantiations, + * and to verify their mutual interaction. + */ + buffer.b_bcount = (MDDISKSIZE * (proc + 1)) / (NPROCS); + + /* + * Block number offset within device + */ + buffer.b_blkno = ((buffer.b_bcount) >> DEV_BSHIFT); + + run p_mdstrategy(buffer); + proc++ + :: proc >= NPROCS -> + break + od + } +#endif /* MEMORY_DISK_SERVER */ +} + diff -urN md.strategy.pml ./md.strategy.pml --- md.strategy.pml 1970-01-01 05:30:00.000000000 +0530 +++ ./md.strategy.pml 2023-02-22 00:25:08.956729720 +0530 @@ -0,0 +1,179 @@ +/* $NetBSD$ */ + +/* Spin process models for NetBSD md(4) md.c written by cherry */ + +proctype p_mdstrategy(buf_t bp) +{ + bit is_read; + int off, xfer, addr; + + + mutex_enter(sc_lock); + + if +#ifdef MEMORY_DISK_SERVER + :: (sc_type == MD_UMEM_SERVER) + -> + bufq_put(sc_buflist, bp); /* Assume non-blocking */ + /* + * We add the assert below to match cv_signal(9) semantics + * used in the implementation - ie; bufq_put() never blocks. + */ + assert(nfull(sc_buflist)); + mutex_exit(sc_lock); + goto done; +#endif /* MEMORY_DISK_SERVER */ + + :: ( (sc_type == MD_KMEM_FIXED) || + (sc_type == MD_KMEM_ALLOCATED) + ) -> + + is_read = ((bp.b_flags & B_READ) && 1); + + bp.b_resid = bp.b_bcount; + + off = (bp.b_blkno << DEV_BSHIFT); + + if + :: ( off >= sc_size ) -> + if + :: is_read -> goto EOF; + :: else -> goto set_eio; + fi + :: else -> + skip; + fi; + + xfer = bp.b_resid; + + if + :: (xfer > (sc_size - off)) -> + xfer = (sc_size - off); + :: else -> + skip; + fi + + addr = sc_addr + off; + + disk_busy(); + + /* + * We count the read/write totals as state for + * possible later property checks. + * + * But we otherwise fully exclude the data itself from + * the state of the control-plane, as data transfer + * function validation should perhaps be the domain of + * unit testing. + * + */ + + if + :: (is_read) -> memcpy(bp.b_data, addr, xfer); + bytesread = bytesread + xfer; + :: else -> memcpy(addr, bp.b_data, xfer); + byteswritten = byteswritten + xfer; + fi + + + disk_unbusy(); + + bp.b_resid = bp.b_resid - xfer; + + :: else -> /* Fallback - don't block */ + bp.b_resid = bp.b_bcount; +set_eio: + bp.b_error = EIO; + fi + +EOF: + + mutex_exit(sc_lock); + + biodone(bp); + +done: +} + +#ifdef MEMORY_DISK_SERVER +proctype p_md_server_loop() +{ + + bit is_read; + int off, xfer, addr, error; + + buf_t bp; + + do + :: true -> + /* + * Note: no messing around with sc_lock here unlike the + * implementation, which uses sc_lock + cv(9) + mutex(9) + * dance to simulate LWP awake + lock held semantics. + * + * XXX: Build a library of promela models for these kernel APIs. + */ + bufq_get(sc_buflist, bp); + + + is_read = ((bp.b_flags & B_READ) && 1); + + bp.b_resid = bp.b_bcount; + + off = (bp.b_blkno << DEV_BSHIFT); + + if + :: ( off >= sc_size ) -> + if + :: is_read -> + goto done; + :: else -> + error = EIO; + goto done; + fi + :: else -> + skip; + fi; + + xfer = bp.b_resid; + + if + :: (xfer > (sc_size - off)) -> + xfer = (sc_size - off); + :: else -> + skip; + fi + + addr = sc_addr + off; + + disk_busy(); + + if + :: (is_read) -> + copyin(addr, bp.b_data, xfer); + bytesread = bytesread + xfer; + :: else -> + copyout(bp.b_data, addr, xfer); + byteswritten = byteswritten + xfer; + fi + + disk_unbusy(); + + if + :: (!error) -> + bp.b_resid = bp.b_resid - xfer; + :: else -> + skip; + fi + +done: + if + :: (error) -> + bp.b_resid = error; + :: else -> + skip; + fi + biodone(bp); + od +} +#endif /* MEMORY_DISK_SERVER */ diff -urN md.strategy.prop ./md.strategy.prop --- md.strategy.prop 1970-01-01 05:30:00.000000000 +0530 +++ ./md.strategy.prop 2023-02-19 21:43:45.633184892 +0530 @@ -0,0 +1,13 @@ +/* $NetBSD$ */ + +ltl +{ + /* We only access the disk if sc_lock is held */ + always ((sc_dkbusy == true) implies (sc_lock == 1)) && + /* memcpy() is synchronous! */ + always (len(sc_dkdev) <= 1) && + /* Eventually all disk activity should die down */ + eventually always (len(sc_dkdev) == 0) && + /* Liveness - all thread finally end */ + eventually always (_nr_pr == 1) +} diff -urN md.strategy.prx ./md.strategy.prx --- md.strategy.prx 1970-01-01 05:30:00.000000000 +0530 +++ ./md.strategy.prx 2023-02-22 00:54:27.913876007 +0530 @@ -0,0 +1,142 @@ +// Spin model extractor harness written by cherry +// +%F md.c +%X -n mdstrategy +%H +// Disable effects of all included files and try to implement a subset of the APIs they provide. +#define _SYS_CDEFS_H_ +#define __KERNEL_RCSID(_ign, ign) +#define _SYS_PARAM_H_ +#define _SYS_KERNEL_H_ +#define _SYS_KMEM_H_ +#define _SYS_SYSTM_H_ +#define _SYS_BUF_H_ +#define _SYS_DEVICE_H_ +#define _SYS_DISK_H_ +#define _SYS_STAT_H_ +#define _SYS_PROC_H_ +#define _SYS_CONF_H_ +#define _UVM_UVM_EXTERN_H_ + + +// dev/md.h +#define MD_UNCONFIGURED 0 +#define MD_KMEM_FIXED 1 +#define MD_KMEM_ALLOCATED 2 + +// sys/buf.h +#define B_READ 0x00100000 /* Read buffer. */ +#define B_WRITE 0x00000000 /* Write buffer (pseudo flag). */ + +// sys/param.h +#define DEV_BSHIFT 9 + +// sys/errno.h +#define EIO 5 /* Input/output error */ +#define ENXIO 6 /* Device not configured */ + +// sys/null.h +#define NULL 0 + +#define true 1 +#define false 0 + +struct buf { + int b_error; + int b_blkno; + int b_resid; + int b_flags; + int b_bcount; + int b_data; + int b_iodone_flag; +}; + +struct md_conf { + int md_type; + int md_size; + int md_addr; +} + +#define MEMORY_DISK_SERVER 1 +typedef int size_t; +typedef int bool; + +struct md_softc md_sc; +#define device_lookup_private(_ign1, _ign2) (&md_sc) + +// The following are unused (yet) for modelling. +#define kcondvar_t int +#define kmutex_t int +#define device_t int +#define size_t int + + +void md_bdevsw; +void md_cdevsw; + +struct dkdriver { + void *dummy; /* Placeholder */ +}; +struct disk { + void *dummy; +}; + +struct dkdriver mddkdriver; /* Redundant placeholder */ + +/* Some stubs are referred to in function prototypes */ +typedef void dev_t; +typedef unsigned long u_long; +typedef void cfdata_t; +typedef void vaddr_t; +typedef void vsize_t; + +#define CFATTACH_DECL3_NEW(_ign1, _ign2, _ign3, _ign4, _ign5, _ign6, _ign7, _ign8, _ign9) + +//XXX: MEMORY_DISK_SERVER +#define bufq_put(_ign1, _ign2) +#define cv_signal(_ign1) + +%% +%C // c_code {} +%% +//%D // c_cdecl {} +//%% +%L +// Modelled functions. +// mutex(9) +mutex_enter(... mutex_enter(sc_lock) +mutex_exit(... mutex_exit(sc_lock) + +// memcpy(9) +memcpy(addr... memcpy(addr, bufferp.b_data, xfer) +memcpy(bp... memcpy(bufferp.b_data, addr, xfer) + +// disk(9) +disk_busy(... disk_busy() +disk_unbusy(... disk_unbusy() + +// bufferio(9) +biodone(bp) biodone(bufferp) + +//Declare hidden sc mdstrategy +Declare int addr mdstrategy +Declare int off mdstrategy +Declare int xfer mdstrategy +Declare bit is_read mdstrategy +%% + +%P +proctype p_mdstrategy(buf_t bufferp) +{ + c_code { struct buf buf1; + buf1.b_flags |= (Pp_mdstrategy->bufferp.b_flags& B_READ) ? B_READ : B_WRITE; + buf1.b_bcount = Pp_mdstrategy->bufferp.b_bcount; + buf1.b_blkno = Pp_mdstrategy->bufferp.b_blkno; + Pp_mdstrategy->bp = &buf1; + } + + +#include "_modex_mdstrategy.pml" + +} +%% \ No newline at end of file