Re: The Case for Rust (in any system)

From: Steffen Nurpmeso <steffen_at_sdaoden.eu>
Date: Mon, 09 Sep 2024 21:31:07 UTC
Paul Floyd wrote in
 <9adc3619-bc38-4fe7-bf16-20e0dfb3b619@gmail.com>:
 |On 05-09-24 19:45, Steffen Nurpmeso wrote:
 |> Alan Somers wrote in
 |
 |>|The real takeaway here is that C is no longer sufficient for writing
 |>|high quality code in the 2020s.  Everyone needs to adapt their tools.
 |> 
 |> I *totally* speak against this.
 |> Quite the opposite i claim that C was safe already fifty years
 |
 |Is that a joke? Do you have any evidence? It sounds like wishful 
 |thinking to me.
 |
 |When I explain to my young colleagues that learnt to code in Java and 
 |Rust how K&R C function definitions "worked", their eyes open wide in 
 |amazement.

Yes, OpenBSD has started using prototypes in perl lately.
I still do not do that in the rare cases i use perl.
I came via (Basic, DOS batch, J(ava)Script) perl, Java and C++ to
C and never used prototype-less K&R C myself, so you really,
really have a point here.

That said, it is unfortunate but true that prototypes in C, and
also C++, quite often do not tell the truth because bit
enumerations are missing, and so you have integers of various
widths as "bit carriers" through which completely unchecked bits
are then passed.
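
A minimal sketch of the "bit carrier" problem.  All names here
(open_flags, whence_flags, stream_is_writable) are made up for
illustration and come from no real API.  Two unrelated flag
families travel through the same integer parameter, and neither the
prototype nor a runtime range check can tell them apart:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag families, invented for this sketch. */
enum open_flags { OPEN_READ = 1 << 0, OPEN_WRITE = 1 << 1 };
enum whence_flags { WHENCE_FROM_END = 1 << 0, WHENCE_RELATIVE = 1 << 1 };

/* The prototype only promises "32 bits", not which family they are from. */
static int stream_is_writable(uint32_t flags){
   /* A range check is about the best a plain integer allows... */
   assert((flags & ~(uint32_t)(OPEN_READ | OPEN_WRITE)) == 0);
   /* ...and it cannot tell OPEN_WRITE from WHENCE_RELATIVE. */
   return (flags & OPEN_WRITE) != 0;
}
```

Calling stream_is_writable(WHENCE_RELATIVE) compiles without
complaint and reports a writable stream, because both constants
happen to be bit 1.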

Or, at least in C, which does not support "easy super-class
cast"s, you also very often have to dumb-cast to meet prototyped
arguments, which then becomes an error should the assumption break
at some future time.  (For example, assume a stupid name hierarchy
  IOStream{bla;};
  OutputStream{IOStream super;bla;};
  TextOutputStream{OutputStream super;bla;};
and in order to call a function which takes IOStream* you likely
will brute-force cast a TextOutputStream instead of writing
&TOS->super.super, for which, it is clear, you also have a naming
rule in place.)
*Or* you have to create a dedicated macro series which does the
cast properly, then also using C-style aka hand-knitted [RT]TI
aka type information, i think GTK uses this.
In short, object hierarchies and casting etc were never on the
agenda of ISO C, which results in aggressive and dangerous casts.
You know, quite some relief could have been achieved there very
easily, best even explicitly, with say a new "super" or "base"
keyword (super is C++ so that is bad).
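
The two ways of reaching the base object can be sketched as
follows.  The struct names are the ones from the hierarchy above;
the members fd, written and line, and all function names, are
placeholders invented for this sketch:

```c
#include <assert.h>
#include <stddef.h>

struct IOStream { int fd; };
struct OutputStream { struct IOStream super; size_t written; };
struct TextOutputStream { struct OutputStream super; int line; };

static int iostream_fd(struct IOStream const *self){
   assert(self != NULL);
   return self->fd;
}

/* Typed access: the compiler verifies every step of the chain, so
 * a changed hierarchy turns into a compile error. */
static int tos_fd_typed(struct TextOutputStream const *tos){
   return iostream_fd(&tos->super.super);
}

/* Brute-force cast: always compiles, but is only correct as long as
 * "super" stays the first member at every level. */
static int tos_fd_cast(struct TextOutputStream const *tos){
   return iostream_fd((struct IOStream const*)tos);
}
```

Both return the same value today.  Only the first one notices when
somebody later inserts a member in front of "super".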

 |> ago, it is just that the occasional one does not realize it.
 |> *Nothing* prevents you from using a string object instead of
 |> direct memory accesses, a vector object instead of arrays managed
 |> via realloc(), and all that.  *Nothing*.
 |> If *you* do not do that that is your fault and you are a bad
 |> programmer; moreover, you should not be allowed to vote in
 |> a democratic environment (surely you do not read all the
 |> magazines and newspapers, and watch or listen to political
 |> broadcasts, in order to build yourself a *real* opinion), be
 |> enabled to drive a car, and what else not.
 |
 |I'm not sure that I follow your argument. Are you saying that you can 
 |build memory safety into C code and that if someone doesn't do so they are 
 |a bad programmer? What's the point - why not just use a memory safe 
 |language?

Because Floyd means pink not paul, hah!
But answering your question i would say it does not make much of
a difference to me -- *if* i can go the way i want -- regarding
safety, but a lot regarding runtime and infrastructural overhead.
For example, most of the development time i compile with tcc,
which is 334640 bytes and links against almost nada.

  #?0|kent:built$ ll tcc#20240731-1.pkg.tar.zst
  -rw-rw---- 1 ports ports 273285 Aug  3 22:11 tcc#20240731-1.pkg.tar.zst

  #?0|kent:built$ ll gcc#14.2.0-1.pkg.tar.zst
  -rw-rw---- 1 ports ports 67854914 Aug  1 21:59 gcc#14.2.0-1.pkg.tar.zst

  #?0|kent:built$ ll clang#18.1.8-1.pkg.tar.zst
  -rw-rw---- 1 ports ports 74166358 Jun 22 21:26 clang#18.1.8-1.pkg.tar.zst
  #?0|kent:built$ ll llvm#18.1.8-1.pkg.tar.zst
  -rw-rw---- 1 ports ports 136797237 Jun 22 23:57 llvm#18.1.8-1.pkg.tar.zst
  #?0|kent:built$ ll compiler-rt#18.1.8-1.pkg.tar.zst
  -rw-rw---- 1 ports ports 3378581 Jun 23 00:02 compiler-rt#18.1.8-1.pkg.tar.zst

Unfortunately pcc is dead; it detected things that clang and gcc
did not (via warning options etc).  tcc is very bad at that.

And then there is other overhead.  For example, if you have
a vector type then in Java an at() (iirc) access always checks
the offset, and throws an ArrayIndexOutOfBoundsException (iirc) if
that is invalid.  If you do that in C, you can ASSERT() the
offset.  Or, if you know that the offset could be invalid, either
create an at_checked() or similar accessor, or add the check in
your code.
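
A minimal sketch of such an accessor pair, using the standard
assert() in place of ASSERT().  The int_vec type and both function
names are assumptions made up for this example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical vector of int; storage is managed elsewhere. */
struct int_vec { int *data; size_t count; };

/* The caller guarantees a valid index.  The check documents that
 * contract and disappears when built with -DNDEBUG. */
static inline int int_vec_at(struct int_vec const *self, size_t i){
   assert(self != NULL && i < self->count);
   return self->data[i];
}

/* For indices that may be invalid: always checked, and reports
 * failure instead of touching memory. */
static inline bool int_vec_at_checked(struct int_vec const *self,
      size_t i, int *out){
   assert(self != NULL && out != NULL);
   if(i >= self->count)
      return false;
   *out = self->data[i];
   return true;
}
```

The point is that the caller chooses which cost to pay: nothing in
release builds for indices it has already validated, one compare
for those it has not.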

Or, if you have a string and want to resize it, you can check
against overflow, inline, and let the compiler optimize away most
of the nonsense that is effectively there, for example

  INLINE boole n_string_get_can_book(uz len){
     return (S(uz,S32_MAX) - ALIGN_Z(1) > len);
  }
  INLINE boole n_string_can_book(struct n_string *self, uz len){
     return (n_string_get_can_book(len) &&
        S(uz,S32_MAX) - ALIGN_Z(1) - len > self->s_len);
  }
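
The same kind of check with only standard types, as a standalone
sketch.  This is not the code above: STR_MAX and str_can_book() are
assumed names, the cap is arbitrary, and the alignment padding is
left out:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cap: a 31-bit length, minus one byte for the NUL. */
#define STR_MAX ((size_t)INT32_MAX - 1)

/* May "add" more bytes be booked on a string of "cur_len" bytes? */
static inline bool str_can_book(size_t cur_len, size_t add){
   assert(cur_len <= STR_MAX);
   /* Subtract instead of adding, so the test itself cannot wrap. */
   return add <= STR_MAX - cur_len;
}
```

With a constant argument the compiler can fold the whole test
away, which is the "optimize away" part.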

Or, you compact memory allocations by allocating an object plus
additional room for whatever, one of the posted CVEs was like
that, ie do "buf = &(sp = ALLOC(sizeof(*sp) + LEN))[1]".
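
Spelled out as a minimal sketch, with plain malloc() standing in
for ALLOC(); struct blob and both function names are invented for
this example.  (C99 flexible array members are the
standard-sanctioned spelling of the same layout.)

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header; the payload lives directly behind it. */
struct blob { size_t len; };

/* One allocation for header plus "len" payload bytes. */
static struct blob *blob_create(void const *src, size_t len){
   struct blob *bp;

   assert(src != NULL || len == 0);
   if(len > SIZE_MAX - sizeof(*bp)) /* the size sum would wrap */
      return NULL;
   bp = malloc(sizeof(*bp) + len);
   if(bp != NULL){
      bp->len = len;
      if(len > 0)
         memcpy(&bp[1], src, len);
   }
   return bp;
}

/* The buffer starts right after the header: &bp[1]. */
static char *blob_data(struct blob *bp){
   assert(bp != NULL);
   return (char*)&bp[1];
}
```

A single free() of the returned pointer releases header and buffer
together.
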
This is not possible if you have to live with language managed
objects, safe as they may be.
Except for C++, which is (much too) flexible, and allows one to
create stuff like for example

  #define su_MEM_NEW_HEAP(T,VP) new(VP, su_S(su_NSPC(su)mem::johnny*,su_NIL)) T
via
  inline void *operator new(size_t sz, void *vp, NSPC(su)mem::johnny const *j){
          UNUSED(sz);
          UNUSED(j);
          return vp;
  }
via
        struct johnny;
(ie *unfortunately* overloading only works like that),
and then the reverse via

  #define su_MEM_DEL_HEAP(TP) su_NSPC(su)mem::del__heap(TP)
  #define su_MEM_DEL_HEAP_PRIVATE(T,TP) (su_ASSERT((TP) != su_NIL), (TP)->~T())
via
        template<class T>
        static void del__heap(T *tptr){
                ASSERT_RET_VOID(tptr != NIL);
                tptr->~T();
        }

In general all shit, unfortunately.  Anyway with that you *can*
create aka manage objects in whatever memory chunk you want.

And, as for me, i use objects whenever i can, which consist of
other objects, etc, and then i delete (or destruct, say) the
objects i hold, and they delete (or destruct, anyway) what they
consist of, and in the end there is no memory leak.  Memory leaks
or buffer overruns i usually do not produce.
But i *do* produce logical errors like

      su_idec(): FIX: signed negative overflow would return S64_MAX
  -   else
  +   else{
         res = U64_MAX;
  -   rv &= ~su_IDEC_STATE_SEEN_MINUS;
  +      rv &= ~S(u32,su_IDEC_STATE_SEEN_MINUS);
  +   }

or

      a_colour_mux(): HORRIBLE FIX! Assign correct pointer (Χάρης Καραχριστιανίδης)..
  -      }else
  +      }else if(a_COLOUR_TAG_IS_SPECIAL(ctag))
            cmp->cm_tag = ctag;
  +      else
  +         cmp->cm_tag = NIL;

and in the latter, indeed, a big fat C++ constructor that simply
initializes all members *even though* it directly thereafter sets
them to other values would have avoided it.  (But these are such
*horrible* things that i often try to circumvent the condition by
creating specialized constructors which result in a partially
initialized object, which would then, i think, not be possible in
a "safe" language, at least not easily.)

 |A+
 |Paul

Trio sang "Los Paul" (du mußt ihm voll in die Eier hauen, eh, "you
have to hit him hard in the testicles") over forty years ago.
They meant Paul Breitner back then, but, still..

 --End of <9adc3619-bc38-4fe7-bf16-20e0dfb3b619@gmail.com>

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)