
Gergely Orosz

@GergelyOrosz

4 months ago


I continue to be amazed by how many software engineers have not made even the bare minimum effort to understand how LLMs work, and assume things like “it can reason.” A trait of any kind of engineering is that you look under the covers of the stuff you use and understand how it works and why.

Some questions to consider on how much you understand LLMs:

1. Why do they hallucinate? Is hallucination a "bug" or a "feature" of the architecture?
2. Can an LLM reference where it got the information it recites?
3. Why do most LLM vendors limit output to ~4,000 tokens?
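To make questions 1 and 2 concrete, here is a minimal toy sketch (mine, not from the thread, and nothing like a real model's scale) of the autoregressive loop every LLM shares: sample the next token from a probability distribution, append it, repeat. The NEXT_TOKEN_PROBS table and the example sentence are invented for illustration. Notice what the loop lacks: there is no fact-checking step and no record of where a probability came from, which is why hallucination is a property of the architecture and why provenance is lost.

```python
import random

# Hypothetical next-token probabilities, standing in for what a trained
# network would compute. Both continuations are fluent; only one is true,
# and nothing in the data structure marks which.
NEXT_TOKEN_PROBS = {
    ("The", "moon", "landing", "was", "in"): {"1969": 0.6, "1968": 0.4},
}

def sample_next(context, temperature=1.0):
    """Sample one next token -- the core step of autoregressive generation."""
    probs = NEXT_TOKEN_PROBS[tuple(context)]
    # Temperature reshapes the distribution; it never consults a source,
    # so a confident-sounding falsehood is always reachable.
    weights = [p ** (1.0 / temperature) for p in probs.values()]
    return random.choices(list(probs), weights=weights, k=1)[0]

context = ["The", "moon", "landing", "was", "in"]
print(" ".join(context), sample_next(context))  # sometimes prints "1968"
```

The point of the toy: the model always emits a *plausible* continuation, not a *verified* one, and the sampled token carries no pointer back to any source.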

Absolutely this. Spend a day or two properly learning and playing around with the technology *with the intent of understanding why it works and how* and you'll be in the top 10%. It's disappointing to see how many devs skip all this. Be the opposite! t.co/BmS0DSXv6E

There are many great resources, from Stephen Wolfram's explanations to many YouTube videos, hands-on experimentation, and running models yourself. I also wrote this: t.co/3udkZs5bmO (after talking with OpenAI). I've yet to check out this book, but it looks interesting as well: t.co/bzhWviw42c
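If you want a concrete starting point for that hands-on experimentation, here is a minimal sketch using the Hugging Face transformers library (my choice of tooling, not something the thread prescribes). gpt2 is picked here only because it is small enough to run on a laptop; any open model on the Hub works the same way.

```python
# pip install transformers torch
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator(
    "The key idea behind transformers is",
    max_new_tokens=40,  # an explicit output cap, cf. question 3 above
    do_sample=True,     # sampling, not lookup: the same loop as the toy above
)
print(result[0]["generated_text"])
```

Rerunning it a few times with do_sample=True makes the probabilistic nature of generation obvious in a way reading about it doesn't.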
