Re: Installing openAI's GPT-2 Ada AI Language Model
- In reply to: Mario Marietto : "Re: Installing openAI's GPT-2 Ada AI Language Model"
Date: Sat, 22 Apr 2023 18:34:35 UTC
On Sat, Apr 22, 2023 at 2:14 PM Mario Marietto <marietto2008@gmail.com> wrote:
>
> I don't know. This should be evaluated by you. I'm not involved so much
> in the technicalities:
>
> https://github.com/lm-sys/FastChat
>
> Let me understand what the Ada (117M) model is, if you want. I want to learn.

It is basically the smallest conversational model offered by the GPT-2/OpenAI team.

The reason is that I see babySpock as a "corporate AI", in that it mixes and matches models to get the best results. The primary problem I see with chatGPT (apart from the cost of using it at the API level: I ran up a $25 bill in two days of just testing and developing babySpock against their API, which is financially unsustainable, so I have to move it in house) is its inability to mix and match contexts [and the web UI to chatGPT has total context-length limits] in order to give it a broad perspective of how I work and think (i.e. what "irrelevant" context to filter out while still getting a reasonable reply).

I am planning to use the Ada model as a "cognitive CPU" in the production version of babySpock and have an "OS tape" constantly looping through it. The reason, of course, is that the models are one-shot affairs and are stateless between calls (i.e. they need external context), so if I want a cognitive layer to do the context assembly, I need a stateful "cognitive OS" to run it on.

I have some semi-FOSS (BSD-licensed but not 100% free) business ideas on how to scale this, but the business philosophy is out of scope for a technical discussion; if you want to know, I will send some material privately.

--
Aryeh M. Friedman, Lead Developer, http://www.PetiteCloud.org
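To make the "stateful cognitive OS around a stateless model" idea concrete, here is a minimal sketch. It is an illustration only: the class name, the bounded "tape", and the stub model function are all invented for this example, not part of babySpock or any OpenAI API. The point is just that the model call keeps no memory, so the wrapper must reassemble the context on every invocation.

```python
from collections import deque

class CognitiveOS:
    """Sketch of a stateful 'cognitive OS' wrapping a stateless model.

    The wrapped model_fn keeps no memory between calls, so this wrapper
    maintains a bounded context 'tape' and re-assembles the full prompt
    on every call. All names here are illustrative, not a real API.
    """

    def __init__(self, model_fn, max_context_items=8):
        self.model_fn = model_fn                      # stateless: prompt -> reply
        self.tape = deque(maxlen=max_context_items)   # looping "OS tape"

    def ask(self, user_input):
        # Context assembly: prior tape entries plus the new input.
        prompt = "\n".join(self.tape) + "\n" + user_input
        reply = self.model_fn(prompt.strip())
        # Persist both sides of the exchange for future calls;
        # the deque drops the oldest entries once it is full.
        self.tape.append("USER: " + user_input)
        self.tape.append("MODEL: " + reply)
        return reply

# Stand-in for a stateless model: reports how many context lines it saw.
def stub_model(prompt):
    return f"saw {len(prompt.splitlines())} context line(s)"

os_layer = CognitiveOS(stub_model, max_context_items=4)
print(os_layer.ask("hello"))   # first call sees only the new input
print(os_layer.ask("again"))   # later calls also see prior exchanges
```

Swapping stub_model for an actual GPT-2 inference call would leave the wrapper unchanged, which is the appeal of keeping all state external to the model.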