From: bugzilla-noreply@freebsd.org
To: doc@FreeBSD.org
Subject: [Bug 273245] textproc/groff: groff_mdoc(7): output from 'man 7 groff_mdoc' is badly broken
Date: Mon, 04 Sep 2023 22:07:38 +0000
X-Bugzilla-Product: Ports & Packages
X-Bugzilla-Component: Individual Port(s)
X-Bugzilla-Version: Latest
X-Bugzilla-Severity: Affects Many People
X-Bugzilla-Who: g.branden.robinson@gmail.com
X-Bugzilla-Status: Open
X-Bugzilla-Assigned-To: bapt@FreeBSD.org
X-Bugzilla-Flags: maintainer-feedback?
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=273245

--- Comment #6 from G. Branden Robinson ---
Hi Wolfgang,

I cloned the freebsd-src repository to have a look at the 35 cases within it
that concerned you.

> for i in freebsd-{src,ports,doc};do (cd $i && printf "$i "; git grep 'roff.* -man[^d]' |wc -l );done
> freebsd-src 35

$ git grep -n 'roff.* -man[^d]'
contrib/byacc/aclocal.m4:1047:${NROFF_NOTE} [\$](SHELL) -c "tbl [\$]*.$2 | nroff -man | col -bx" >[\$]@
contrib/byacc/aclocal.m4:1053:${GROFF_NOTE} [\$](SHELL) -c "tbl [\$]*.$2 | groff -man" >[\$]@
contrib/byacc/aclocal.m4:1056:${GROFF_NOTE} GROFF_NO_SGR=stupid [\$](SHELL) -c "tbl [\$]*.$2 | nroff -rHY=0 -Tascii -man | col -bx" >[\$]@
contrib/byacc/configure:8368:${NROFF_NOTE} \$(SHELL) -c "tbl \$*.1 | nroff -man | col -bx" >\$@
contrib/byacc/configure:8374:${GROFF_NOTE} \$(SHELL) -c "tbl \$*.1 | groff -man" >\$@
contrib/byacc/configure:8377:${GROFF_NOTE} GROFF_NO_SGR=stupid \$(SHELL) -c "tbl \$*.1 | nroff -rHY=0 -Tascii -man | col -bx" >\$@

byacc is maintained by Thomas Dickey. He's using Autoconf macros to produce
output from sources that are known to be in man(7) format.

https://github.com/freebsd/freebsd-src/blob/main/contrib/byacc/yacc.1#L30

`CF_MAKE_DOCS` appears to be an Autoconf macro private to the byacc
distribution; I see no other occurrences in `freebsd-src`.

He also has an Autoconf test to determine how to generate HTML from man(7)
pages.
https://github.com/freebsd/freebsd-src/blob/main/contrib/byacc/aclocal.m4#L1580

Note in particular the here document at line 1694.

https://github.com/freebsd/freebsd-src/blob/main/contrib/byacc/aclocal.m4#L1694

contrib/dialog/makefile.in:145:@NROFF_NOTE@ GROFF_NO_SGR=stupid $(SHELL) -c "tbl $< | nroff -rHY=0 -Tascii -man | col -bx" >$@
contrib/dialog/makefile.in:151:@GROFF_NOTE@ $(SHELL) -c "tbl $< | groff -man" >$@

Dialog is another Thomas Dickey project. It also builds inputs it knows to be
in man(7) format.

$ grep -nrFw .TH contrib/dialog
contrib/dialog/dialog.3:50:.TH \*D 3 "" "$Date: 2021/01/17 18:02:44 $"
contrib/dialog/dialog.1:51:.TH \*D 1 "" "$Date: 2021/01/17 17:25:01 $"
contrib/dialog/configure:8028:.TH HEAD1 HEAD2 HEAD3 HEAD4 HEAD5
contrib/dialog/aclocal.m4:5919:.TH HEAD1 HEAD2 HEAD3 HEAD4 HEAD5
$ grep -nrFw .Dd contrib/dialog || echo NONE # look for mdoc(7) documents
NONE

Next...

contrib/ee/ee.1:5:.\" nroff -man ee.1

$ head contrib/ee/ee.1
.\"
.\"
.\" To format this reference page, use the command:
.\"
.\" nroff -man ee.1
.\"
.\" $Header: /home/hugh/sources/old_ae/RCS/ee.1,v 1.22 2001/12/16 04:49:27 hugh Exp $
.\"
.\"
.TH ee 1 "" "" ""

The man page is telling us explicitly (in a comment) what macro package to use
to format it, and unsurprisingly getting it right.

Next...

contrib/ldns/makewin.sh:243:for x in man1/*.1; do groff -man -Tascii -Z "$x" | grotty -cbu > cat1/"$(basename "$x" .1).txt"; done
contrib/ldns/makewin.sh:246:for x in man3/*.3; do groff -man -Tascii -Z "$x" | grotty -cbu > cat3/"$(basename "$x" .3).txt"; done

Again we have renderings of known documents. Let's see what package they use.

$ find contrib/ldns -name "*.[13]" | xargs grep -nEw '\.(Dd|TH)'
contrib/ldns/drill/drill.1:2:.TH drill 1 "28 May 2006"
contrib/ldns/packaging/ldns-config.1:1:.TH ldns-config 1 "22 Sep 2011"

So that's two more correct uses of '-man'.

Next...
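(An aside before the next hit: the check repeated above, grepping for a `.Dd`
or `.TH` call, could be wrapped in a small shell function. What follows is
only a sketch. `macro_package` is a made-up name, not part of groff, mandoc,
or freebsd-src, and it assumes the title macro is called at the start of a
line with the `.` control character. It is a rough heuristic, not a roff
parser.)

```shell
#!/bin/sh
# Hypothetical helper for illustration only: guess which macro package a
# man page source uses by looking for its title macro.
macro_package() {
    # mdoc(7) documents call .Dd; man(7) documents call .TH.
    # Spaces between the control character and the macro name are allowed.
    if grep -qE '^\.[[:space:]]*Dd([[:space:]]|$)' "$1"; then
        echo mdoc
    elif grep -qE '^\.[[:space:]]*TH([[:space:]]|$)' "$1"; then
        echo man
    else
        # Neither title macro found (or the file could not be read).
        echo unknown
    fi
}
```

Given the `.TH ee 1` line shown in the head output above,
`macro_package contrib/ee/ee.1` should report `man`.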
contrib/ncurses/aclocal.m4:5607: nroff -man \$TMP >\$TMP.out
contrib/ncurses/configure:14517: nroff -man \$TMP >\$TMP.out

Another Thomas Dickey project. These come from his Autoconf macro
`CF_MAN_PAGES`. I'll skip ahead here and note that I'm familiar with the
ncurses man pages, having recently proposed patches to them.

https://lists.gnu.org/archive/html/bug-ncurses/2023-09/

They are exclusively in man(7) format, not mdoc(7).

Here again we have a case of a maintainer knowing what format is required, and
using it.

Next...

contrib/tcp_wrappers/Banners.Makefile:12:# sequences as described in the hosts_access.5 manual page (`nroff -man'
contrib/tcp_wrappers/CHANGES:2:configuration checker. See the `tcpdchk.8' manual page (`nroff -man'
contrib/tcp_wrappers/CHANGES:349:have all rules within a single file. See "nroff -man hosts_options.5"
contrib/tcp_wrappers/Makefile:575:# and hosts_options.5 manual pages (`nroff -man' format).
contrib/tcp_wrappers/README:240:hosts_access.5 manual page, which is in `nroff -man' format. A later
contrib/tcp_wrappers/README:257:The hosts_options.5 manual page (`nroff -man' format) documents an
contrib/tcp_wrappers/README:395:documented in the hosts_options.5 document, which is in `nroff -man'
contrib/tcp_wrappers/README:432:`nroff -man' format) can guide the requests to the right server. These
contrib/tcp_wrappers/README:453:given in the hosts_options.5 manual page (`nroff -man' format). An
contrib/tcp_wrappers/README:897:hosts_access.5, which is in `nroff -man' format. This is a lengthy
contrib/tcp_wrappers/README:904:The examples in the hosts_access.5 document (`nroff -man' format) show
contrib/tcp_wrappers/README:912:hosts_options.5 document (`nroff -man' format).
contrib/tcp_wrappers/README:918:program is described in the tcpdchk.8 document (`nroff -man' format).
contrib/tcp_wrappers/README:929:described in the tcpdmatch.8 document (`nroff -man' format).
contrib/tcp_wrappers/README:967:programs. The hosts_access.3 manual page (`nroff -man' format)
contrib/tcp_wrappers/options.c:4: * manual page (source file: hosts_options.5, "nroff -man" format).

These are all source comments or text file contents, and do not drive
construction of anything; they therefore cannot cause failures. Nevertheless,
let us see what macro package is employed by "tcp_wrappers".

$ find contrib/tcp_wrappers -name "*.[1-9]" | xargs grep -nEw '\.(Dd|TH)'
contrib/tcp_wrappers/hosts_options.5:1:.TH HOSTS_OPTIONS 5
contrib/tcp_wrappers/tcpdmatch.8:1:.TH TCPDMATCH 8
contrib/tcp_wrappers/tcpd.8:1:.TH TCPD 8
contrib/tcp_wrappers/tcpdchk.8:1:.TH TCPDCHK 8
contrib/tcp_wrappers/hosts_access.5:1:.TH HOSTS_ACCESS 5
contrib/tcp_wrappers/hosts_access.3:1:.TH HOSTS_ACCESS 3

It would appear once again that the upstream maintainer is familiar with their
own man pages.

Next...

contrib/tcsh/tcsh.man2html:13:# in the exact same style of nroff -man, i.e. any other manpage.

This is a comment. Some context might be helpful.

$ git grep -C2 -n 'roff.* -man[^d]' contrib/tcsh/
contrib/tcsh/tcsh.man2html-11-#
contrib/tcsh/tcsh.man2html-12-# Designed for tcsh manpage. Guaranteed not to work on manpages not written
contrib/tcsh/tcsh.man2html:13:# in the exact same style of nroff -man, i.e. any other manpage.
contrib/tcsh/tcsh.man2html-14-#
contrib/tcsh/tcsh.man2html-15-# Makes links FROM items which are both a) in particular sections (see

Given that context "guaranteed *not* to work on [other man pages]", it does not
seem fair to hold this source remark as evidence militating against groff's
change.

Next...

contrib/tzcode/workman.sh:18:.rm }F" | nroff -man - ${1+"$@"} | perl -ne '

More context is warranted here, too.

$ head -n 18 contrib/tzcode/workman.sh
#! /bin/sh
# Convert manual page troff stdin to formatted .txt stdout.

# This file is in the public domain, so clarified as of
# 2009-05-17 by Arthur David Olson.
if (type nroff && type perl) >/dev/null 2>&1; then

# Tell groff not to emit SGR escape sequences (ANSI color escapes).
GROFF_NO_SGR=1
export GROFF_NO_SGR

echo ".am TH
.hy 0
.na
..
.rm }H
.rm }F" | nroff -man - ${1+"$@"} | perl -ne '

Your concern might, at first glance, seem warranted here; the tool does purport
to be of general use. However, closer inspection reveals that it was written
in ignorance or deliberate neglect of the mdoc(7) package altogether; observe
how it appends to the 'TH' macro, which is unused in mdoc(7).

It would be straightforward to make this script handle mdoc(7) as well; simply
append the same two requests to the `Dd` macro. The removals of the '}H' and
'}F' strings/macros/diversions suggest a familiarity with the AT&T Unix man(7)
implementation or its descendants in USG/System III/System V proprietary Unix
or BSD prior to Networking Release/2 (when the Berkeley CSRG replaced Unix
troff with groff).

Further, if nroff or perl programs (or shell functions) are unavailable, this
shell script proceeds to use mandoc anyway.

$ tail -n 6 contrib/tzcode/workman.sh
elif (type mandoc && type col) >/dev/null 2>&1; then
mandoc -man -T ascii "$@" | col -bx
else
echo >&2 "$0: please install nroff and perl, or mandoc and col"
exit 1
fi

Next...

usr.bin/man/man.conf.5:116:NROFF_JA /usr/local/bin/groff -man -dlang=ja_JP.eucJP
usr.bin/man/man.conf.5:117:TROFF_JA /usr/local/bin/groff -man -dlang=ja_JP.euc.jp

This is from an example in FreeBSD's man.conf(5) page. The assignment to the
"lang" string has no particular effect in groff, and to the best of my
knowledge it never has. This may be evidence of jgroff, a fork of groff that
was made unnecessary in the Debian Project about 20 years ago
(https://www.debian.org/security/2002/dsa-107 ) and was superseded by groff
support in the 1.21 release about 13 years ago.

https://lists.gnu.org/archive/html/groff/2010-12/msg00051.html

(Also, is that a typo adding a period in line 117?)
I suggest that the foregoing might be bitrotted.

Last...

usr.bin/man/man.sh:1074:NROFF='groff -S -P-h -Wall -mtty-char -man'
usr.bin/man/man.sh:1078:TROFF='groff -S -man'

This is a case that should certainly be addressed, if this script still
fulfills the purpose claimed for it in its initial commit in 2010:

"Implementaiton [sic] of man, manpath, whatis, and apropos written entirely in
sh."

I strongly recommend s/-man/&doc/ here so that FreeBSD users will continue to
have a positive experience.

Also, for what it's worth, the `-S` option has long been unnecessary; "safer"
mode has been the default in groff since version 1.12, released on 14 December
1999.

However, 32-33 false positives in a set of 35 suggests to me that your scanning
criteria could benefit from sensitivity tuning. An error rate of over 90% is
generally considered unusable in serious measurement applications.

Regards,
Branden
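P.S. For readers unfamiliar with the notation, s/-man/&doc/ is sed syntax: `&`
in the replacement stands for the matched text, so `-man` becomes `-mandoc`.
In groff, `-mandoc` loads a wrapper that is meant to choose between man(7) and
mdoc(7) based on the document itself. A minimal sketch of the edit follows.
`to_mandoc` is a hypothetical helper for illustration, not something in
man.sh, and it assumes `-man` is the last thing in the string, as it is in
both of the assignments quoted above.

```shell
#!/bin/sh
# Illustration only: apply the suggested substitution to one command string.
# "&" in a sed replacement is the matched text, so "-man" gains "doc".
# Anchoring at end of line leaves an existing "-mandoc" untouched.
to_mandoc() {
    printf '%s\n' "$1" | sed 's/-man$/&doc/'
}

to_mandoc 'groff -S -P-h -Wall -mtty-char -man'
to_mandoc 'groff -S -man'
```

Anchoring the pattern is a choice made for this sketch so that running it
twice does not produce `-mandocdoc`; the unanchored form in the message is
fine for a one-off edit.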