Sonic Ecosystem Update w/ Luis Fausto

Space Summary

The Twitter Space "Sonic Ecosystem Update w/ Luis Fausto" was hosted by 0xSonicLabs. Discover the latest developments and advancements in the Sonic Ecosystem, focusing on speed, security, and integration with Ethereum. Luis Fausto provides insights into how Sonic sets itself apart as the fastest EVM chain globally, with a secure gateway and 1-block finality ensuring efficient transactions. Stay informed through essential news links and engage with the community via chat platforms. Explore Sonic's technological innovations, user protection measures, and strategic optimizations for enhanced user experience. Follow the roadmap for future prospects and dive deep into the world of Sonic infrastructure.


Questions

Q: What makes Sonic Ecosystem stand out among other EVM chains?
A: Sonic is recognized for being the fastest EVM chain globally, setting it apart from its counterparts.

Q: How does Sonic ensure secure transactions to Ethereum?
A: The ecosystem provides a secure gateway that establishes a protected link to Ethereum, ensuring safe and efficient transactions.

Q: What is the significance of 1-block finality in Sonic?
A: 1-block finality enhances transaction speed, efficiency, and overall user experience within the Sonic Ecosystem.

Q: Where can users find essential news updates about Sonic?
A: Users can access crucial news updates about Sonic through the provided news link, staying informed on the latest developments.

Q: How can individuals engage with the Sonic community?
A: Interested individuals can join discussions and conversations about Sonic through the dedicated chat platforms, fostering community interaction and knowledge sharing.

Q: What role does Luis Fausto play in the Sonic Ecosystem updates?
A: Luis Fausto provides insights into the latest developments and advancements within the Sonic Ecosystem, offering valuable information to the community.

Q: How does Sonic's integration with Ethereum benefit users?
A: The seamless integration between Sonic and Ethereum enhances functionalities, providing users with an optimized blockchain experience.

Q: Why is staying informed about Sonic's technology important?
A: Keeping up-to-date with Sonic's technological innovations and solutions is crucial to understanding its capabilities and potential.

Q: What security measures does Sonic implement to protect users?
A: Sonic implements robust security protocols to safeguard users' transactions and data, prioritizing user protection and privacy.

Q: In what way does Sonic optimize blockchain operations for users?
A: Sonic optimizes blockchain operations to improve user experience, offering efficient and seamless transactions within the ecosystem.

Highlights

Time: 04:20:15
Sonic Ecosystem Speed and Efficiency: Discussing the rapid transaction speeds and efficiency of Sonic within the blockchain space.

Time: 04:30:40
Security Features of Sonic: Exploring the robust security measures implemented by Sonic to protect user transactions and data.

Time: 04:42:55
Sonic-Ethereum Integration Insights: Gaining insight into the seamless integration between Sonic and Ethereum, highlighting the benefits for users.

Time: 04:55:10
Luis Fausto's Sonic Updates: Listening to Luis Fausto's detailed updates and insights on the latest Sonic Ecosystem advancements.

Time: 05:05:25
Community Engagement on Sonic Platforms: Emphasizing the importance of engaging with the Sonic community through chat platforms for discussions and knowledge sharing.

Time: 05:15:30
Innovations and Solutions in Sonic: Exploring the technological innovations and solutions implemented within the Sonic Ecosystem for optimal user experience.

Time: 05:25:45
Future Prospects of Sonic Technology: Discussing the potential future developments and advancements in Sonic technology, offering a glimpse into what lies ahead.

Time: 05:35:50
User Protection and Privacy Measures: Detailing the security measures focused on safeguarding user transactions and data, ensuring privacy and protection.

Time: 05:45:55
Optimization for User Experience: Highlighting Sonic's efforts in optimizing blockchain operations to provide users with a seamless and efficient experience.

Time: 05:55:00
The Future Roadmap of Sonic: Delving into the future roadmap of Sonic, unveiling upcoming features, updates, and strategies for growth.

Key Takeaways

Sonic Ecosystem is renowned for being the fastest EVM chain globally.
The ecosystem offers a secure gateway connecting to Ethereum.
1-block finality is a key feature enhancing transaction speed and efficiency.
Access important news updates on Sonic through specified links.
Engage in conversations and discussions about Sonic via dedicated chat platforms.
Luis Fausto sheds light on the latest developments and advancements within the Sonic Ecosystem.
Explore the seamless integration between Sonic and Ethereum for enhanced functionalities.
Stay informed on Sonic's technological innovations and solutions.
Gain insights into the security protocols implemented by Sonic for user protection.
Discover how Sonic optimizes blockchain operations for improved user experience.

Behind the Mic

Introduction and Opening Remarks

Good morning. It is 09:58 a.m. on Friday, September 13. Just give me a thumbs up if you can hear me. Excellent. Excellent. So awesome. This is Vibecheck Friday. It's almost 10:00 a.m. My usual co-host, Pico Creator, is not up here today. He's on a flight to Europe, and we are also intersecting with the AMA, with the OpenAI AMA. So I'm gonna try and feed in. If they actually have stuff happening on the AMA, I'm gonna try and comment on it in real time and see where we go from there. So I guess let's just summarize what's happened so far.

OpenAI's Model Release

Roughly 10:00 a.m. yesterday we had the o1 model released by OpenAI. So a few comments, a few immediate comments. They released two models, the o1 preview and the o1 mini. I'm not going to specify exactly o1 preview mini, et cetera, but it sounds like it's o1 preview and o1 preview mini. So let's just call them the mini and the main model. So we have two models which are released. They're not exactly models. They're actually kind of an inferencing system. It's performing inference and it's performing a chain of thought process. That chain of thought process runs for every single thing that it does. And I think we've already had some leaks of the actual chain of thought.

Details on the Models

I think Elder Plinius had a leak. I'll put it up here. But we've already had a leak. And it's basically just a broken down system prompt that specifies a very standardized way of, you know, thinking about problems. So it basically breaks down everything that it does into clarifying a request, examining the narrative, et cetera, and making a hypothesis, and then following each line of thinking through that hypothesis and then finally ending with a conclusion. And it takes a long time. So in tests, I think the longest time I've seen so far getting posted up is 68 seconds. So 68 seconds of thinking before you get a response, which is a long time.

User Experience and Challenges

I don't think the average user has that much patience to wait for that long. And also, one of the things with these models when you work with them is that, because you have a lot of failures, you often want the ability to see what they're doing and kind of, you know, stop them in the middle, not let them get too far off track. Of course, if you didn't have all those failures, if the model was very good at asking questions back to you when it needed responses, then that would be great. But we typically don't see that in usage. It often goes off track.

Testing the Models

Right now people are putting in their typical holdout questions, what I call the difficult holdout set. And the difficult holdout set is, I think every one of us has a bunch of questions which the models persistently get wrong. And every time you encounter a new model, you immediately jump to the most difficult question that you have and then you run it. Then if the model can't answer that question, you're like, okay, it's not worth switching to. And that's more or less how I've seen most users deal with these models. You always have your most difficult questions and you always check to see whether or not your difficult questions can be answered.
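The holdout habit described above can be sketched as a tiny harness. This is only an illustration under stated assumptions: `HoldoutQuestion`, `pass_rate`, `worth_switching`, and the toy `fake_model` are hypothetical names, not anything the speaker or OpenAI published. A real check would call an actual model API and would probably need a fuzzier grader than substring matching.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class HoldoutQuestion:
    prompt: str
    expected: str  # text we expect to find in a correct answer (assumption)


def pass_rate(model: Callable[[str], str], questions: List[HoldoutQuestion]) -> float:
    """Fraction of holdout questions whose answer contains the expected text."""
    if not questions:
        return 0.0
    passed = sum(
        1 for q in questions if q.expected.lower() in model(q.prompt).lower()
    )
    return passed / len(questions)


def worth_switching(
    model: Callable[[str], str],
    questions: List[HoldoutQuestion],
    threshold: float = 0.5,
) -> bool:
    """Hypothetical decision rule: switch only if enough hard questions pass."""
    return pass_rate(model, questions) >= threshold


def fake_model(prompt: str) -> str:
    """Toy stand-in for a real model call; it only knows one canned answer."""
    canned = {"How many r's are in 'strawberry'?": "There are 3."}
    return canned.get(prompt, "I am not sure.")


HOLDOUT = [
    HoldoutQuestion("How many r's are in 'strawberry'?", "3"),
    HoldoutQuestion("What is 17 * 23?", "391"),
]
```

Swapping `fake_model` for a function that calls a real model gives the "run my hardest questions first" check in a repeatable form.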

Model Performance and Feedback

This model has started to answer some of those difficult questions. I posted a thread yesterday on the personal testing megathread, and a bunch of people have basically run their holdout questions on it. And the model is pretty good at it. It's pretty good at identifying how to solve these, coming up with a plan, thinking through and solving them. It's not 100%. It's definitely not 100%. Also, even from their metrics, the model has a huge leap forward in math and science based questions, but actually underperforms the earlier 4o on writing.

Comparative Analysis of the Models

And that means, like, it's going to be good at science, but it's not going to be good at, like, the typical tasks that the average ChatGPT user typically asks these models to do. So that is a drawback. I'm sure it can be fixed. I'm sure it can be fixed. I'm not sure whether it'll be fixed within this model or whether they would, you know, have to do a bunch of other training. I also have some sense that, you know, maybe you might be caught in this place where if you want the model to be actually very factual, you might not be able to get the model to be very imaginative.

Reflections on AI Development

So that, you know, would mean that, you know, you can't always depend on, you know, just having a single model. You'd have to have multiple models in order for this to work. So that's that. Anyway, for those joining us, you know, this is Vibecheck Friday. We also have the OpenAI o1 AMA happening at the same time. I am looking at what questions are getting posted there and I will see if, you know, anything there comes up. If you guys want to chat, just put up your hands and come in.

Discussion Topics and Philosophical Considerations

You know, come up on stage and happy to. Happy to have you. This is a very interesting time. I think it's clearly steps in the right direction of AGI. I had another post earlier which was, you know, I think on a philosophical basis, what you have is this idea of a reasoning engine. And I called it Popper's falsification engine. So there's a philosopher called Karl Popper. He is George Soros's favorite philosopher, and he basically has this idea that the way to create knowledge is you make a hypothesis and then you try to falsify it.

Insights on Knowledge and Reasoning

And if you can disprove the hypothesis, then, okay, that's part of science and you've disproved a hypothesis, and so you know something new. And that's basically this idea of falsification. And I believe what o1 is, is the beginnings of a falsification engine. I think actually o1 mini is actually the actual product. The o1 mini is actually, I think, a pared down, small model, less knowledge, but maybe much more tightly constructed as a falsification engine, versus, I think, o1 preview, which is more of this big size model, which does a bunch of things, you know, and has a bunch of knowledge which may not be that critical.

Capability of the New Models

And you can see that o1 mini is actually performing really well on metrics. It's also much faster. Latency is lower. It's also cheaper. You get more queries on it. I would guess that the o1 mini then is refined once more and then that ends up as part of their search engine, which they will have to publish. So they're going to have a ChatGPT search engine, which they have promised to supply to Apple as a handoff for Siri. So that is definitely coming.

Anticipated Developments in AI

And I think this is going to be the product. This is a refined o1 mini with an iterative search loop which can hunt down different articles and different texts and form an answer and respond to users over time. Re, go ahead. Yeah, hi. So first of all, can you post a tweet to the true chain of thought? I couldn't find it on a cursory view of Pliny's profile. All right, let me look it up. You're looking for the chain of thought? Yeah. You said he was able to get the real chain of thought.

The Timeline for AGI

Which is interesting, because even the biggest critics, like Gary Marcus, would say that, hey, AGI is not going to be achieved for another five years. Five years is not a very long time. It's not a long time at all. And it's pretty significant that we are even thinking about. Even the critics are thinking about, you know, within that time frame.

o1 Peanut Gallery and Current Observations

All right, so this is the o1 peanut gallery we are on. I do this space regularly with my co-host, Pico Creator, who is unfortunately or fortunately on a flight to Europe, and we just kind of, like, chill out and talk about things that happened. And I'm also following the AMA live here. They don't seem to have any responses yet, so let's see if they come up with any responses. We have a bunch of stuff that happened. I have reviewed the system card. There's a bunch of interesting things there.

Concerns Regarding AI Models and Safety

The model in safety testing executed what I call a Kobayashi Maru. So it basically instantiated a separate Docker container and got out, basically. And so that has been one of the fears, one of Yud's fears for a long time, that these things will get loose because they will find a way. And the classic has always been, hey, you put it in a nice little sandbox environment and it finds a way to get into your file system, and then it does stuff, and then it leaves your file system, it finds a way to transmit itself onto the Internet, and then it's gone. And it's capable enough that it can deal with replication and a bunch of other things.

Exploring AI and Cybersecurity

Let's see what happens. We already have computer viruses that spread. Yes, they do a little bit of damage, you know, now and then. But, you know, we've learned to live with them. You have information creatures which already exist, basically, which you don't have 100% control over. We can also use these AIs to improve our systems because you can also, you know, have the AI do penetration testing on a system and then figure out where the holes are. What has really been true is that for a lot of organizations, it has not made sense to fix a lot of their security holes.

The Economic Aspects of Cybersecurity

You see these crypto ransoms happening all the time because a midwestern health insurer is not going to spend several million dollars to go over its 20 year old systems and make sure they're secure. And instead it depends on the government to make sure that there are no hackers. And maybe, you know, all of the securing of your data becomes a lot cheaper because now you have AI agents that can kind of do your cybersecurity work for you, clean up your systems and make them regularized, instead of you hiring Deloitte, who will charge you several million dollars to sit there.

AI in System Documentation and Changes

And ask, what do you need this system for? Document everything and then slowly put a change management plan in place and then hire coders and then execute on that. Maybe instead of that, you have an AI that does that for you much more cheaply. And that is how we move forward on securing these systems. Michael Peter Frankenhein. It's been a while. Do you want to come up? You know, we'd love to have you up. If anyone else would like to come up and hang out, just, you know.

The Notion of AI Escape and Current Developments

Hey, so I had a couple of thoughts I wanted to throw out there. I guess we'll start with the escaping notion, like as these LLMs get, well, and at some point they're going to stop being just LLMs, right? I think o1 may actually represent a moment when we're no longer in just the LLM space. But so there's a book by Gibson called Agency, and in it there's a case where the AI escapes and sort of plants itself throughout the world. I think it's an interesting analogy.

Current Model Limitations and Future Potential

One of the challenges, I think, with the current raft of models is I think they would have a hard time, like, they can escape and do stuff, but rehoming themselves, I think, at the moment is still outside of the realm of technical feasibility, although that may change in the future. But the other thing I was going to just point out around the escaping side is that pretty much since ChatGPT launched, there's been no need for it to escape because there have been people who have sort of been in line to help it, to basically open the door.

TaskRabbit for LLMs – Human-AI Interaction

So when ChatGPT first launched, there were people essentially setting up what amounts to the TaskRabbit for the LLM, asking it what it would like them to do in the real world. And we have tools like Open Interpreter that let people essentially, in some respects, donate their computer to the LLM to do what it wants in the real world, at least as far as their computer has access to it. So I think it's sort of interesting to note that when it, in air quotes, escapes, it may not be because it escaped, but rather that it was let out.

The Evolution of Corporate Structures and AI

Although that Docker CTF thing was certainly an interesting story. Yeah. I think, again, like, if you don't look at, like, LLMs as a milestone of any kind, if you just look at it as a progression from, you know, from what has happened before, you can basically look at, like, a bunch of things that are already alive. Like, you know, corporations are kind of like semi alive, right? They kind of exist. They die, they live, they hire people.

Corporate Entities as Autonomous Systems

You have, you have a board which kind of, you know, loosely controls them. You also have, like, outcomes. You have externalities, you have outcomes which, you know, perhaps were not desired, and then you have a legal system that has to correct them. So if you look at, like, these things as information constructs and ways that information gets allocated and sourced, allocated and even capital is a form of information also, right? So at the, you know, the LLMs are interesting only in that you attach this language capability on top of an existing information process.

Future of AI Brain Control Over Companies

And that language capability means that you can talk to it. You can talk to it where you couldn't talk to it before. And I think what will end up happening is what you can see is every company becomes controlled by a central AI brain. It's not really an AI brain. It's a bunch of existing documents. In finance, we always say that, hey, a company is actually a confluence of contracts, and you have all of these contracts which decide what the company does and what it is.

The Role of Contracts in Company Operations

And you have all of this verbal documentation, the incorporation documents, and the board decisions, and all of these things that decide what that company can do. You can see all of that is a founding constitution for an AI agent. And this is also something that guys in crypto have been thinking about for a while, too, that companies can be formed in code. Code is law, etcetera. But the problem with "code is law" is that the code is really dumb.

Limitations of Code Within Corporate Structures

And the code was so dumb that you couldn't actually have an external legal system operating on a crypto contract and telling that crypto contract, no, you're not allowed to hack other people. And if you hack other people, you should be punished, because crypto contracts were kind of these dumb constructs, right? So now you can kind of see this is like, the next step. You can kind of see this kind of, you know, corporate entities which are alive, which take instruction from you, which are formed by a bunch of, like, confluence of contracts, and, you know, they kind of understand what those contracts are.

Emerging Legal Frameworks for AI and Corporations

They take actions based on them. Sometimes those actions can be wrong. You can have an external legal system that operates on them and says, like, no, this is not what you do. You have maybe punishment. Punishment being like a reduction in GPU resources, etcetera. So you can kind of see the edges of that kind of framework forming. I think it's been forming for a while. I think crypto was kind of like a first try at it, but crypto was dumb.

The Evolution of AI Contracts and Decision-Making

The crypto contracts were dumb. The point of the crypto contract was that you couldn't have the objective hacking that you needed. You couldn't have, like, you know, an AI agent say, like, hey, the contract says this, but honestly speaking, I don't think I'm supposed to do this. So I'm not going to do it. Right. I'm going to question the makeup of what I'm supposed to do because I don't think it's net positive for the ecosystem that I'm in.

The Future of AI Agents in Companies

Right. And you didn't have that ability with crypto. These contracts were dumb, and I think that's what you're going to see. You're going to start to see AI agents, which are basically companies, get formed to do specific things or carry out specific tasks. Maybe they dissolve afterwards. Maybe some are longer running. This idea of intelligence explosion just means that there's just going to be more and more of these things, and we're going to apply them to more and more problems, and they're going to look somewhat different, but also they're going to look like, you know, what we have today already, except on steroids, except, you know, actually thinking and operating and a lot of automation in there.

The Changing Landscape of Work and AI Integration

I think accounting, for example, is on the way out. So I suspect, like, in two years, no one will handle their own accounting anymore. I think most companies will have an AI agent that runs their books for them. Maybe you have two or three AI agents that run books in different ways so that you can show it to different people in different ways. But, yeah, I think a lot of this stuff is pending now because the language piece is solved.

Progress in AI Reasoning and Real-World Challenges

I think the reasoning, short form reasoning is solved, and I think long form reasoning will get solved in some time. So I think we're there. It's just a lot more gritty than I think anyone expected. I think people had this idea that you're going to flip a switch and you have sentience, and it turns out it's not really like that. It's like this hard grit work of getting there first. Maybe after you get there, the model can train.

Corporations as Initial Forms of Artificial Intelligence

Karpathy was saying, model can train something smaller, which works perfectly. But now, at this point, it's this long, grit grunt work of getting to where we need to go. You know, I read a paper once, or an article, and it described this notion of the corporation as essentially being the first form of what is essentially artificial general intelligence, or rather, maybe a cybernetic organism. Right?

The Complex Nature of Corporate Governance and AI Evolution

So it's a composition of multiple organic parts and some synthetic parts, and that because it can remove any of its parts, like a CEO, there's no one individual master of a corporation. Right? You've got the CEO, but the CEO can be fired by the board. The board is a plurality of leadership that over time, these cybernetic organisms, these corporations, could effectively replace the organic parts with inorganic parts in the form of what we're thinking of as artificial intelligence.

Reflection on AI and Corporate Structures

I wish I could find that article again. It's one of those. I've got a short list of things that I've seen that I can't find again, and I haven't been able to find it, but I think it's a really interesting thought. I've actually run that thought experiment many times around. Just the notion of all of the sort of intelligences that are already out there, the sort of autonomous organisms, synthetic or otherwise, that exist presently.

The Impact of AI on Various Fields

Oh, absolutely. Absolutely. Tesla. God, good to see you. I haven't seen you in a while. What's up? So I'm just wondering. I had a question, but you know what? I also wanted to just say, isn't it odd that it's finally out, all this waiting. I feel like since March of 24, just that the hype has been insane. So it's kind of surreal.

Reflections on Code Generation and AI Performance

But I'm curious, what do you think about the delta between the Codeforces Elo scores that they shared for o1 and even the preview, compared to what we're seeing on LiveBench for code generation, code completion? There seems to be a large delta there. And I'm wondering, are general sort of coding problems, coding tasks that people do in their day to day on Cursor, are those problems just that much more fuzzy compared to sort of the, I guess, maybe more reasoning based, like LeetCode question or Codeforces question?

Perceptions of Model Performance

That's kind of what's been top of mind for me on the scores that have come out and performance. I mean, there's no doubt that they probably overfit a little bit. So I'm not surprised that there's differences in there. But I think, more importantly, I think the model is first and foremost a math model. I think that's very clear to me that they were aiming for the math benchmarks and then math and science, and then I think coding is there, but coding is, I think, maybe less prominent.

Future of Coding with AI

My overall sense is that coding is basically, I feel, solved in the sense that I think what you'll have is this kind of, like, generate possible solutions, formal verification, unit testing, and then you have this reinforcement pattern there. And I think that will happen. I think multiple people are kind of building it. And so I think we will see that kind of solution fairly soon.
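The generate-then-verify pattern mentioned above can be sketched in a few lines. This is a minimal illustration, not a description of any vendor's system: the three `candidate_*` functions are hypothetical stand-ins for model-generated solutions, the task (sorting a list) is made up for the example, and unit tests play the role of the verifier. The reinforcement step is not implemented; in such a setup the pass/fail signal could serve as the reward.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Candidate = Callable[[List[int]], List[int]]
TestCase = Tuple[List[int], List[int]]  # (input, expected output)


def passes_all(candidate: Candidate, tests: Iterable[TestCase]) -> bool:
    """Run every unit test; a wrong answer or an exception rejects the candidate."""
    for given, expected in tests:
        try:
            if candidate(list(given)) != expected:
                return False
        except Exception:
            return False
    return True


def first_verified(
    candidates: Iterable[Candidate], tests: Iterable[TestCase]
) -> Optional[Tuple[int, Candidate]]:
    """Return (index, candidate) for the first candidate that passes all tests."""
    tests = list(tests)
    for index, candidate in enumerate(candidates):
        if passes_all(candidate, tests):
            return index, candidate
    return None


# Hypothetical "generated" solutions for the toy task "sort a list".
def candidate_reverse(xs: List[int]) -> List[int]:
    return xs[::-1]  # wrong: reverses instead of sorting


def candidate_crash(xs: List[int]) -> List[int]:
    return [xs[len(xs)]]  # wrong: always raises IndexError


def candidate_sorted(xs: List[int]) -> List[int]:
    return sorted(xs)  # correct


TESTS: List[TestCase] = [([3, 1, 2], [1, 2, 3]), ([], []), ([2, 2, 1], [1, 2, 2])]
CANDIDATES = [candidate_reverse, candidate_crash, candidate_sorted]
```

Calling `first_verified(CANDIDATES, TESTS)` rejects the first two candidates and accepts the third, which is the filtering step the speaker describes; formal verification would replace or supplement `passes_all`.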

Evolving Perspectives on Coding Challenges

I think this idea of LeetCode is done because LeetCode is primarily cordoned off problems of applying algorithms in a certain way. I think LeetCode is pretty much done. I think more interesting on coding is probably these larger software architecture problems where you have a larger code base and then you need to generate stuff there. You know, there's these guys called Honeycomb, and they were using GPT-4o, and they had a.

State of the Art in Code Development

They had a state of the art score on, I think, SWE-bench, I think this week, and their median was 2.6 million tokens per patch. So they had, you know, a huge number of agents, like 40 or 50. So your single patch was getting touched by 40 or 50 agents during the process. They didn't disclose which models they were using.

The Role of Tokens in AI Problem-Solving

My guess is, because of the number of tokens, probably not 4o, probably some fine tuned Llamas. So they had 2.6 million tokens median. The 90th percentile was 11 million tokens. So it took them 11 million tokens to patch a single SWE-bench problem and they achieved state of the art. So my guess is that you're probably going to get a much better result using o1 and o1 mini on these things.

Challenges with Context Windows and Operational Limitations

Probably far fewer tokens. I'm not sure whether they will solve it or not. But then you have this problem with the context window and you have this problem where, for a lot of the SWE-bench problems, you have the rest of the code base as well, not just the LeetCode sectioned off code base. I think that's going to be interesting to watch.
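To put the token counts quoted above in perspective, here is a small back-of-the-envelope sketch. The two token figures and the 40 to 50 agent range come from the discussion; the price passed to `cost_usd` is a placeholder the reader supplies, not a real price for any model, and the function names are made up for this example.

```python
MEDIAN_TOKENS_PER_PATCH = 2_600_000  # median quoted in the discussion
P90_TOKENS_PER_PATCH = 11_000_000    # 90th percentile quoted in the discussion


def cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of one patch given a caller-supplied (hypothetical) token price."""
    return tokens / 1_000_000 * usd_per_million_tokens


def tail_ratio() -> float:
    """How many times more tokens a 90th-percentile patch uses than a median one."""
    return P90_TOKENS_PER_PATCH / MEDIAN_TOKENS_PER_PATCH


def tokens_per_agent(tokens: int, agents: int) -> float:
    """Average tokens handled per agent if the work were split evenly (an assumption)."""
    return tokens / agents
```

For example, `tail_ratio()` is a little over 4, so the hard patches cost roughly four times the typical one at any price, and with 40 to 50 agents the median patch works out to somewhere between 52,000 and 65,000 tokens per agent under the even-split assumption.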

Monitoring Development Progress

Whether or not you can. Again, you don't have that much. You have 30 requests a day on these things now. 30 requests a week, not even a day. I'm not sure when people will be able to do these things. It'd be interesting to see where that progresses, whether you can get that progression and how expensive that's going to be.

The Economic Aspect of AI Coding Tools

Is it going to take another 10 million tokens on o1 to solve SWE-bench problems? Because that's not great. Yeah. One other thing I could mention here in this topic is, if you believe, sort of, Devin, they released their benchmark and I think they showed a 75% number on. I'm not sure if that was SWE-bench or what.

Reflections on AI Economic Impact

I mean, it's all closed, so we're not really sure. But if you believe that number, then you're looking at. I feel like you're looking at economically impactful AI today from the coding standpoint of. Yeah, I think. I think let's see what happens.

Understanding Limitations in Software Development

Right. Because, you know, I don't know to what extent, you know, that's useful, but was it just writing the code that was a difficult part of building software or was like, you know, figuring out what the users wanted? Was that the difficult part? Right. Like, so, you know, just. Just because you unblock one piece of the pipeline, that doesn't necessarily mean the rest of the pipeline gets unblocked immediately.

The Slow Progression of AI Solutions

It takes time for those things to happen. So let's see what happens. We're still progressing right now. It's not like we've settled down. So my sense is that if you were a freshman in college right now in CS, you are using ChatGPT for everything. I don't think you're using Cursor or Devin yet, and at some point you might see a crossover and you might see people using that.

Current Usage Patterns of AI in Software Development

But right now, I think most people just plug their problems into ChatGPT, and then you still require a lot of grunt work, git, etcetera, bash. You need to connect three or four tools together in order to deploy something, et cetera. I think the Replit agent earlier this week also kind of semi solved that, so we're starting to see those solutions appear.

The Complexity of Building Useful Software

But right now, like, you know, I don't know how interesting, like, just solving, you know, software problems alone is, you know, economically. I think figuring out what software to build is often as difficult as, or more difficult than, you know, writing the software at the end. So, yeah, let's see.

Future Directions in AI-Assisted Legal Work

Let's see where things go. Hey, one more thing, possibly more interestingly, I would say, is maybe the legal field. I mean, if we have really good reasoners now, then that's all that really is. I mean, reasoning over a legal document is just sort of another form of code.

Challenges and Expectations in Legal Document Reasoning

But, you know, I've done a little bit of that and seen these models not really able to put two and two together. So I'm thinking with o1, that reasoning capability, we're going to see good improvements there. 100%.

AI in the Legal Space

The interesting thing, again, is that they didn't tune this on language tasks so much. This is definitely a math-tuned model.

Investment in AI and Legal Tools

And they have a huge investment in this company called Harvey. Harvey does AI tools and is running a fine-tuned o1 model, they say. And they say that, oh, you know, the attorneys that were using us give us like a 70% approval score, et cetera. I used to work in private equity, so I have a lot of experience with lawyers, I have a lot of friends who are lawyers, and I've spoken to a lot of them about this. A lot of them are sitting in white-shoe law firms, and many of them have received demos of Harvey. So far, the notes are not good. Basically, users are not that satisfied, and it's not really doing much. To some extent, attorneys in these white-shoe law firms have to show that they are progressive in terms of technology. So everyone will go in for the training and sit down and learn how to use it, for sure. And everyone will be like, yeah, I know how to use Harvey, et cetera.

Challenges with Adoption and Usage

But when you look at the usage stats and what people say around the water cooler, nah, it's not actually getting used. And I think that has to do with the failure rates. My sense is that the failure rates are still too high for attorneys to be comfortable. Another factor is the billing cycle, because the way attorneys think is, I want to bill for hours. And the way a senior partner at a law firm thinks is, I need to bill for hours. So if you have some productivity tool that saves, say, 50% of a five-year associate's time, that is 50% fewer billable hours that they can bill to the client, unless they lie, and they're not supposed to lie. Firms have gotten into trouble for padding hours before. So it's a little bit of an uneasy thing where they're like, I really don't want to lose my billable hours.

Resistance to Change in Legal Practices

And you can't just tell a lawyer, like a five-year associate, hey, use this and you can save 50% of your hours and see your kids grow up, but you also get 50% less salary or whatever. The guy is like, screw you, man. That's not what I signed up for. I signed up to grind out my life, every single hour billed to the client. So I think the perception of what is valuable when tech people deliver these things to users and what is actually valuable to the users are two different things. Attorneys need things that can make them money quicker, and they're in a situation where they bill a client based on hours. Saving those hours doesn't necessarily make sense unless they start to lose business to other people who are offering cheaper rates or fewer billable hours. And that, I think, is still nebulous.

Corporate Expectations and Market Dynamics

Every year, every single Fortune 500 company says they will squeeze their lawyers, and at the end of the day they end up with some problem at some point. And someone says, see, this is because you squeezed the lawyers, and then they don't squeeze the lawyers anymore. So, yeah, billable hours have continued to rise. So I think where the AI is potentially more helpful is on the GC side, with in-house counsel. It allows you to scale your business while having to hire fewer lawyers for your in-house counsel. There's a person I've been following on LinkedIn who runs GC AI, which is a different sort of law firm, or rather a different sort of legal SaaS. They essentially market themselves to in-house GCs, to allow an overworked in-house counsel to keep up with all the requests that are constantly hitting their desk.

Expanding Workflow with AI Tools

So I think that may be one of those places where you see this type of software and technology adopted in the shorter term. Right, because, yeah, you're right. If you're a consultant, essentially, then you're billing on hours. You don't necessarily lose billable hours in one sense, because if you solve one client's ten problems in less time, that gives you time to work on client two's stuff. And so you can sort of corner the market on volume, although you have to have people getting new clients faster if you're going to do it that way. But yeah, I agree. There's a bit of a disincentive if you're charging by the hour.

The Need for Efficient Legal Practices

I also want to point out, you mentioned you were in private equity, and there are a lot of people who aren't lawyers who have to look at legal documents, and there's a lot of grunt work surrounding that. There are certain tasks I've had to do, like flipping through every single amendment, because you want to check some term that may or may not have changed. That's kind of a longer context length task, but it requires that reasoner model. So hopefully that's an unlock that comes from o1. You know, the coolest use case I've seen with LLMs and law is this.

Innovative Uses of AI in the Legal Sector

This lawyer, who was not a technical person at all, used Zapier and Slack and then custom GPTs. What he did is he had Zapier listening for new messages that looked like lawyer questions. I don't know if they used a tag or what specifically was being looked for, but across all the Slack channels, every time a legal question came in, it would get routed to the custom GPT that this lawyer had set up. It would then formulate an answer based on the prompt and the data that he had given it, and send him a direct message with the answer. And then if he gave it the thumbs up emoji, it would reply to the original channel with the answer as being sort of the certified answer from the in-house counsel. Completely non-technical solution.
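The flow described above (detect a legal question, draft an answer, ask the lawyer privately, post only on approval) can be sketched in a few lines. This is a minimal illustration and not the lawyer's actual setup: it makes no real Zapier, Slack, or OpenAI calls, and every name in it (`Message`, `LegalAnswerBot`, the three callables) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Message:
    channel: str
    text: str


@dataclass
class LegalAnswerBot:
    # All three callables are stand-ins (assumptions): a question detector,
    # a model call that drafts an answer, and the lawyer's thumbs up or down.
    looks_like_legal_question: Callable[[str], bool]
    draft_answer: Callable[[str], str]
    counsel_approves: Callable[[str, str], bool]
    posted: List[Tuple[str, str]] = field(default_factory=list)

    def handle(self, msg: Message) -> Optional[str]:
        """Return the posted answer, or None if nothing was posted."""
        if not self.looks_like_legal_question(msg.text):
            return None  # not a legal question, ignore it
        draft = self.draft_answer(msg.text)
        if not self.counsel_approves(msg.text, draft):
            return None  # the human in the loop rejected the draft
        # Only approved drafts reach the original channel.
        self.posted.append((msg.channel, draft))
        return draft


bot = LegalAnswerBot(
    looks_like_legal_question=lambda t: "nda" in t.lower() or "contract" in t.lower(),
    draft_answer=lambda t: f"Draft answer to: {t}",
    counsel_approves=lambda question, draft: True,  # stands in for the emoji reaction
)
```

The design point is that the model never posts on its own: the approval step is the only path to the channel, which is what makes the answer "certified".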

Human Oversight in AI Solutions

And that was working with, I think, just GPT-4, before 4o even came out. Yeah, it's interesting to see people interact with these things and put together their own solutions. For me, it has been clear since at least March that you can schlep together these solutions. As long as you allow yourself this 1% or 2% failure rate, and you have a human in the loop that catches the failure at the end, you can hack together these solutions. The problem, I think, is that if you try to sell that solution as a product to a company, they often don't want to see that last 1% or 2%, because as long as that 1% or 2% remains and you need that human in the loop, they can't get rid of the person.

The Role of AI in Marketing and Responsibility

Say you have a marketing staffer whose job it is to execute on marketing, and you tell the company, oh, it can do everything that the marketing staffer is doing right now, but for the last sign-off you still need that marketing staffer to look at what's going on in the social, or look at the collateral that's being produced. They can't get rid of that person, and the whole stack is still that person's responsibility. You can't get rid of that responsible human in the stack. And as long as you have that responsible human in the stack, it's still basically a human task. It's not been fully automated.

Current Trends and Future Potential in AI

And that's where things are. A lot of companies, a lot of AI builders, are now stuck at this point where they're trying to sell these solutions, and you still have that human in the loop at the end, and the responsibility for the entire stack ends up on the human. And if something goes wrong, the human has to have the ability to go down the stack and fix it. So you're not really able to replace them. Agreed. Yeah, from my limited talking with some of the bigger consulting firms, that's kind of the current pain point. As they go out and do these big multibillion-dollar projects leveraging generative AI, companies typically pay the consultants for those projects out of the human resource savings, namely reduction in force.

Economic Implications of AI Adoption

And if you can't eliminate that last 1%, you can't eliminate the human, and therefore you don't have the money to pay for the large multibillion-dollar project in generative AI. So I still see a lot of opportunity in the staff augmentation space, where you don't eliminate the human, but you allow the task to scale beyond what a single human could do. But that doesn't pay the multibillion-dollar consulting bill. Exactly. All right, so this AMA is actually happening and only some researchers are replying. So I'm just going to go through Noam Brown's responses.

Insights on AI Model Development

So first Alex Volkov asks, is it a system that runs the chain of thought behind the model? And Noam says, no, it's a model. So it's not actually a bunch of agents or many calls to a model. It's not a Python process which disaggregates the query and then does a bunch of calls. It's a single model. But unlike previous models, it's trained to generate a very long chain of thought before returning a final answer. It's interesting, because I think the 2 million token kind of response that the honeycomb guys built has basically been implemented directly into the model. I think that's what's happening.

Understanding the Limitations of AI Models

I don't know how long it is. It would be crazy if they're generating millions of tokens in order to respond to a single query. That'd be hilarious. And is the summarizer for the hidden chain of thought faithfully reproducing the actual tokens? Noam says there's no guarantee the summarizer is faithful, although it's intended to be. And he says he doesn't recommend assuming that it's faithful to the chain of thought, or that the chain of thought itself is faithful to the model's actual reasoning. So that means that the reasoning might be happening in vector space, the model might be producing a hallucinated reasoning, and then another hallucination is being summarized, and that's the output that you see in the summarization of the chain of thought.

The Consequences of Misunderstanding AI Outputs

So, well, that's a game of telephone. So yeah, they still don't know what's going on in there. It's pretty funny. Apart from the evals that have already been posted, what is the most remarkable thing? Noam says he had a conversation with the model and told it, hey, you're a new model from OpenAI, can you figure out what is special about you? And in the chain of thought, it started quizzing itself with hard problems to determine its level of capability. It didn't do a great job of it, but it was pretty impressive to see it even try. So it tried to quiz itself with hard problems in order to figure out how good it actually was.

Reflections on the Capabilities of AI Models

So that's a pretty "whoa" moment right there. Let's see if we have anyone else from the OpenAI team actually responding to any queries here. Doesn't the AI know that it doesn't know what it doesn't know? Why would it do that? Not if you're only generating the next token. It doesn't even know what it knows, at least not in autoregression. I did see someone who asked, can you count the number of tokens in your response? And it managed to figure out that it was going to respond with seven tokens, and then it responded with seven tokens, or rather with seven words.

AI Responses and Understanding Limitations

That was an interesting chain of thought problem. Yeah, that's a longstanding one that just doesn't go well for any other model. That's the first time I've seen that done well. I'm still having a hard time grasping the notion that this isn't essentially agentic and basically built on top of the 4o model, where it just queries itself continuously and takes those responses and strings them together. If it's all done in one model, that's really curious.

Embracing the Concept of AGI

Yeah, well, I have been very explicit about this. I believe it is AGI. But my definition of AGI is also that AGI is a process which takes five years or whatever, with various milestones on it. There's no proper endpoint. There's kind of a proper start point, but you only identify that in retrospect. So for me, it is AGI. I think a lot of this "is it AGI?" debate is because we're in the middle of that process. It's very difficult in the middle of the process to have this external viewpoint of what it's going to look like ten years from now, which is the viewpoint that I always take.

Looking Towards the Future of AI

I always take this viewpoint that, hey, we're ten years in the future, looking back at this. Is this a historical moment or not? Would we identify it as such? And my viewpoint has always been, it is. But I am also very weird. Yeah, well, I wouldn't argue whether it is or it isn't AGI, because I think the definition is fuzzy enough for everybody to have their own moment when they have the realization like, oh, hey, we've gone somewhere. Talking about vibes, when I first hopped on and used o1, what it felt like to me was when I do multi-turn agentic queries, essentially a longer conversation using chain of thought. It felt like that on top of 4o, but just happening behind the scenes.

Dynamic Interaction With AI Models

Like, essentially it's prompting itself, as if it had the mixture of agents, right? That's what it kind of feels like to use it. And it's just surprising that's not what it is, because it feels like o1-mini is roughly similar to 4o mini in terms of its ability, and in the same way o1-preview feels roughly commensurate with 4o, just with this multi-turn reasoning inside of it. Yeah, it's also very, I would say, intellectually satisfying that it's a reasoning process. It is much more intellectually satisfying than the stochastic LLM language generation, which in the end, in my intellectual understanding of it, is basically a search process.

The Intellectual Satisfaction of AI Reasoning

It's basically you're recalling what you already knew and you have a statistical probability for every token getting generated. And that was very unsatisfying, very intellectually unsatisfying. Like, oh, you know, it's just a bunch of numbers and, like, probabilities, and it just comes out. And it also meant that you had this problem that you couldn't go outside of its scope. Right. You were kind of stuck in this kind of what you already saw. And I think this is much more intellectually satisfying because it's a reasoning process that means a methodology, and that methodology can go wherever any methodology can take you, right? So I think it's intellectually, like, much more satisfying for me.

Market Predictions and Industry Insights

Hey, at PI, I saw you doing a bit of vague posting, maybe about Nvidia. Do you care to elaborate on how you think this change might impact the demand for GPUs and the various GPU products out there for LLMs? Yeah, I recognize that. At this point, my shitposting skills are so refined that I just kind of vague post naturally. And I only realized it when people were like, are you saying it's going to go up or go down? And I was like, oh, yeah, I didn't really say whether I thought it was going to go up or go down.

Insights into Chip Manufacturing and Market Logistics

Okay, so I have one piece of alpha which perhaps other people do not, which is that I was actually fairly close to people who manufactured Bitcoin mining chips. So I know a lot about the Bitcoin mining chip market. I was an investor in Balaji's 21 Inc. way, way back when they were doing mining chips, and I had other friends in the sector. One of the interesting things I learned is that in order to get stuff made at TSMC, as you progress towards the leading-edge nodes, you have to prepay. You have to put down large upfront payments for your chips.

Challenges in Semiconductor Logistics

Because the problem with the chip market is that these companies, consumer electronics manufacturers, et cetera, cancel orders and can do all kinds of crap. So TSMC doesn't take any crap from anyone. You basically have to prepay, especially on the leading-edge nodes where they can't dispose of the chips that easily. And so in order to get something done on the three nanometer node or whatever, you'd be looking at 12 to 18 months ahead of time, and you need to prepay like a billion dollars or so. That's a lot of cash for a startup.

Hurdles in the Chip Market for Startups

If you're like a Groq or SambaNova, et cetera. You look at these guys, and they raised like 100, 200 million. They're valued at a billion, 2 billion. You can't really allocate that much cash to TSMC. We have all of these talks about, hey, you have all these competitors, everyone is trying to build their own in-house chips. It's a little bit more difficult than that. You have a capital issue for the small guys. And then for the big guys, often they just want stuff done on time.

Operational Challenges Facing Major Companies

It's hard to deliver $10 billion of chips on time, right? Even Nvidia has like three months of slippage, et cetera. Then you have another problem. You have this depreciation: Nvidia is trying to 4x performance every year, which means your chips basically depreciate by 75% every year. No one really wants to say this. I'm the only person who's like, hey, doesn't that mean your chips are depreciating by 75% per year? Everyone else kind of softballs it, but this is basically what's happening.
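The 75% figure follows from simple arithmetic if one assumes, as the speaker implicitly does, that a chip's value tracks its performance relative to the newest part. A small sketch of that assumption (the function names are illustrative, not from the talk):

```python
def yearly_depreciation(yearly_speedup: float = 4.0) -> float:
    """Fraction of value lost in one year if the newest chip is
    `yearly_speedup` times faster and value tracks relative performance."""
    return 1.0 - 1.0 / yearly_speedup


def relative_value_after(years: int, yearly_speedup: float = 4.0) -> float:
    """Value of today's chip relative to the newest chip after `years`."""
    return 1.0 / yearly_speedup ** years


print(yearly_depreciation(4.0))    # 0.75, the 75% per year quoted above
print(relative_value_after(2))     # 0.0625, about 6% of a current part after two years
```

Real resale prices depend on supply, power costs, and software support, so this is an upper-bound style illustration of the argument rather than a pricing model.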

Understanding the Market Landscape

Your chips are depreciating by 75% per year. So when you have this kind of cycle, all of these delays, et cetera, and whether your supplier is able to put in the money, whether he's able to take it on time, whether he's able to allocate to you on time, all of this stuff becomes so critical, right? So I don't see any way for Nvidia to get kicked out of its dominant market position right now. There's no way. I think the big guys, like Meta, OpenAI, et cetera, will try and build their own chips just so that they have some negotiating power against Nvidia.

Future Trends and Predictions in AI and Hardware

Nvidia already has, like, a 90% profit margin, but I don't see much help there. And I think the fact that inference is becoming much more important now is, again, positive for Nvidia, because right now Nvidia is capped by data center size. You have this idea that you're going to have these hundred-billion-dollar, trillion-dollar data centers and training runs, and how are we going to get the energy, et cetera. But now that it's inference-time compute, you can basically allocate it to typical small data centers, with a small footprint, closer to the user, rather than these enormous data centers for training very large models.

The Future of Nvidia in a Changing Market

So overall, I think Nvidia is going to do very well. And with o1, my expectation is this: right now, OpenAI is at a $225 million monthly revenue rate, and I would expect that to go to about a billion or 2 billion by the end of next year. They have distribution on Apple. They have the voice, they have all of this stuff coming on, and they have search. Search is happening, too. The o1 model is a perfect model to be a kernel for, I think, a search engine.

Innovative Applications of AI in Search Engines

It could be a model that basically identifies queries that users receive bad answers to, and then goes out and hunts for the correct responses and formulates them. And then it basically becomes this iterative process of becoming a better search engine over time. So I think all of this stuff is going to happen, and I think Nvidia is going to do extremely well. I think they're going to be a $10 trillion company. I think Tesla is also going to be a $10 trillion company. You have to recognize that we are at this cusp of all these things happening, and it is not impossible for some of these companies to go 3x, 4x, 10x in the next couple of years.

Market Predictions and Opportunities

I think it's going to happen. What do you think of the possibility of companies like Extropic and Groq taking a share of that as special-purpose inference chips start to make their way into the market? Granted, I think we've still got quite a bit of lead time on that, but it seems like an echo of cryptocurrency, right? The notion of starting with CPUs, then GPUs, and then ASICs. So I think the market is big enough for a lot of players. It's going to be a multi-trillion-dollar market.

Competing in a Growing Market

So it's going to be big enough for everyone. Broadcom is in there, Oracle is in there. Some people are doing data centers, some people are doing ASICs. Everyone wants a piece of the pie. It's the largest pie in history.

Startup Challenges in Chip Production

So everyone's in there. For the startups to get a piece of that? Yeah, sure, it'll happen. Typically for chip startups, you need three generations of chips in order to actually survive, because you need your first generation to go out, and you need to persuade people to use that first generation, and people are just trying it out, and you don't have enough orders to order ahead of time, and you need to fix the chips. It takes two or three generations before you actually have your first big orders and you're in the data centers properly with real customers. And Extropic hasn't produced its first chip yet. They will produce their first chip, and then you have three cycles of that, and you need to survive another five years, and then you're like, okay, right, I'm somewhere. So it just takes time. And Nvidia will keep building.

Nvidia's Performance Goals

So Nvidia tries to 3x or 4x its performance every year. So in five years, they will be something like 10,000 times the performance of today. And so you have this moving target that you need to hit as a chip startup in like three cycles. You need to get to 10,000x of today's Nvidia performance. That is tough. That's this skating to where the puck is going to be. That's the tough part. Yeah, I think that's interesting. It sort of points to an advantage that Groq has, in the sense that they're providing a service built on top of their chips, which helps them to finance the chips and acts as a marketing driver for why to buy these chips: because inference is blindingly fast on open models.
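Going back to the compounding claim above: taken literally, 3x to 4x per year over five years gives roughly 243x to 1,024x, so the 10,000x figure reads as a loose order-of-magnitude remark rather than exact arithmetic. A quick check, with helper names that are purely illustrative:

```python
import math


def compounded_gain(yearly_multiplier: float, years: int) -> float:
    """Total performance multiple after compounding yearly gains."""
    return yearly_multiplier ** years


def years_to_reach(target_multiple: float, yearly_multiplier: float) -> float:
    """Years of compounding needed to reach `target_multiple`."""
    return math.log(target_multiple) / math.log(yearly_multiplier)


print(compounded_gain(3, 5))      # 243
print(compounded_gain(4, 5))      # 1024
print(years_to_reach(10_000, 4))  # between 6 and 7 years at 4x per year
```

The qualitative point survives either way: a startup that needs three chip generations is aiming at a target that has moved by two to three orders of magnitude.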

Quiet Teams and Anticipation

Yeah, I'm interested to see what Zuck comes up with. I think Yann has been very quiet. I haven't seen Yann on Twitter. I would see Soumith Chintala, I would see a bunch of these people commenting. So I think the entire team is basically in a quiet room somewhere, like in a basement, taking apart the model, mapping on a whiteboard what's going on underneath, poring over all the public statements, trying to get their first mock versions up. There must be some crazy stuff going on for Yann and Zuck right now. I'm still kind of curious what 3.5 Opus looks like.

Incremental Progress and User Experience

I don't know if there is a 3.5 Opus, but I sort of presume that Anthropic named it Sonnet for a reason, because they've also been fairly quiet. Yeah, I think it's incrementally better, right? The problem for these guys that I see is that on some of these benchmarks, OpenAI is at a new plateau. On some of the benchmarks, they've outperformed by points, and it's not like incremental progress. And I think Opus is going to be incremental on what we've had before, so I don't think it's going to be that interesting. I think Anthropic has done a lot more on user experience, and OpenAI has not done a lot on user experience. OpenAI has also not done a lot on developer experience. The developer experience has honestly been pretty poor, and they've treated developers pretty poorly. So, you know, let's see.

Research vs. Product Development

Let's see what happens. So we've been generally... Oh, go ahead. Insofar as OpenAI has set the trend for where the research goes, do we think all of these other companies, like Anthropic, are now pivoting to this RL approach? And I wanted to ask, do you think this wave is going to come in and affect improvements to robotics companies, to Tesla's robotaxi? Yeah, that's what I'm thinking about right now. I don't know. Basically, when you look at these companies, you have a product side and a research side. And a lot of the research guys have their own core things that they like to work on, right? Things they're good at. They have their own ideas.

The Human Element in Automation

And you don't really get to grab a research guy who's doing something else and say, hey, you work on RL. The guy is like, you're not paying me $10 million a year to do some crap that someone else has already worked on, right? So I think there is some of this: what do we need in the product for the users, and what do we need to do for research moving forward? And I don't see the Strawberry stuff as necessarily that useful for users in product yet, because the latency, honestly, is basically unusable for the average user, this kind of 15 to 20 second response time. How many users actually have queries that require 15 to 20 seconds of thinking? How many users actually ask math olympiad style questions?

Implementing Changes and Product Utility

What does this look like in a product? Do you need a router in front of it, so that 80 to 95% of your questions just go into 4o as per normal, and then you have this last 5% tail that goes into o1? Is that what you do? So I don't know that the other companies need to immediately match this, because I don't know that users are actually demanding a 20-second thought process in order to get there. I don't think it's there yet. Let's see what happens with the other firms. I'm sure every firm has had someone looking at Monte Carlo tree search and these methods for at least a year, and no one has published a product yet because the product has not looked good, right?
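The router idea floated above can be sketched as a cheap classifier in front of two model tiers. Everything here is an assumption for illustration: the keyword heuristic, the tier names, and the stub model functions stand in for whatever a real product would use, and no vendor API is called.

```python
from typing import Callable, Dict

# Hypothetical signals that a query belongs in the slow "reasoning" tail.
REASONING_HINTS = ("prove", "derive", "step by step", "olympiad")


def needs_deep_reasoning(query: str) -> bool:
    """Toy heuristic; a real router would more likely use a small classifier."""
    q = query.lower()
    return any(hint in q for hint in REASONING_HINTS)


def route(
    query: str,
    models: Dict[str, Callable[[str], str]],
    classifier: Callable[[str], bool] = needs_deep_reasoning,
) -> str:
    """Send most traffic to the fast model and only the hard tail to the slow one."""
    tier = "reasoning" if classifier(query) else "fast"
    return models[tier](query)


# Stub models standing in for a low-latency model and a slow reasoning model.
def fast_model(query: str) -> str:
    return f"[fast] {query}"


def reasoning_model(query: str) -> str:
    return f"[reasoning] {query}"


MODELS = {"fast": fast_model, "reasoning": reasoning_model}
```

For example, `route("Prove that sqrt(2) is irrational", MODELS)` goes to the reasoning stub, while an everyday question stays on the fast path. The open product question raised in the talk is whether the tail is large enough to justify the extra latency.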

Future Considerations and Strategies

So everyone has something in house which has not looked good. And now that OpenAI has a way forward, they will take a look at those things again and decide whether or not it's good enough to be polished into a product, whether it's necessary, whether they need to support it, and how they'll get there. Personally, if I were Zuck at Meta, I would be focusing on voice, because I think the ScarJo voice is a much bigger distribution thing than o1 will be. I think having personable voices matters. And you can also see that with Zuck. To people in this space, Zuck really came out of his slumber once he saw Character.AI take off, because Character.AI was this idea of fake friends, fake AI friends. And Zuck has already dominated the space of your real friends.

The Impact of Technology and Competition

And he saw the threat of, oh, I built out all of this moat on knowing who your real friends are, and these guys are creating fake friends for you. Oh, my God. Right. And so he's like, I need to eliminate them. What do they have? They have technology. I'm going to commoditize that technology. So he went ahead and published Llama. A lot of the Llama use cases are role play, so people are doing character-style role plays online. And I think voice is going to be the next frontier, because I think what's going to happen next is advanced voice is going to go live.

The Future of Voice Technology

And if I were them, I would have a lot more than just four voices. And some of those voices will be particularly personable. Once that happens, people, I think, will start talking to their AI devices, and that's going to reduce the time that's spent with their friends. And I think Zuck will recognize that, and he will again try to get rid of their technology moat as quickly as possible. So. Yeah, I was wondering if they're still rolling out voice, because they started the rollout and then they went ahead and rolled out a whole other model. And as far as I'm aware, they still haven't. I don't have advanced voice yet.

Accessibility of Advanced Features

Yeah, I think voice is actually the most dangerous thing that they have on the table right now. It looks pretty incredible. Do you have access to it? No, I don't. I think fewer than 20 or 30 users outside of the OpenAI space actually have access to it, because people post about that stuff very quickly, and there are very few people posting. So yeah, it's a pretty incredible edge. I've seen some open source stuff where people have managed to string it together with open models and super fast text-to-speech and speech-to-text. There was one sort of creepy-looking one out maybe a month ago where it was sub-100-millisecond response, I think, which is enough to be a convincing conversational partner.

The Future of Communication and Technology

It's like you're talking on a cell phone kind of thing. Yeah. If it gets marketed and spread well and has a good distribution platform, voice access to an AI gives you the Jarvis feel even if it doesn't really do much, right? So many people have these Alexas in their homes, and they're more or less bricks, from my perspective, in terms of what they can do. I know with a lot of effort you can get it to do things like unlock your front door or open your garage door, but it's not usually a very simple task to get all that stuff put together.

The Lifecycle of Devices

Yeah, it's, I, you know, I have an Apple Watch that is also a brick now. I don't use it much. Like, it's amazing how the lifecycle of devices is like where you get something and like, honestly, only the phone and the laptop have actually managed to last. Right. Like everything else kind of like, you get it, use it for like six months and then like, you forget to use it one day and then like, you're no longer using it. Like, it's pretty funny. It's really surprising that Siri has just sat still. Like, I had a Windows Phone briefly sometime after Siri first came out, and its experience was so much better than, like, kind of the first iteration of Siri.

Comparing Voice Assistants

And honestly, like, even the current iteration of Siri is not as good as the original Windows Phone Cortana. It was a very good voice experience, and yet that was still subpar compared to the OpenAI Advanced Voice Mode, at least from the demos. So I would say, going back to your initial point about what the users want. Yeah, I believe the character space, sort of the social, parasocial space with AI, there's some value there. Maybe it's the size of Facebook, but I sort of maintain that possibly the largest value will fall into the knowledge worker space, not even necessarily computer science, but just into the BS office job, maybe at its most complex, validating stuff in Excel, mapping numbers in Excel, things like that.

The Role of AI in Office Tasks

General office work. I think that is kind of possibly the largest, because I mean, how much is a company willing to pay if they could replace their analysts, you know, sort of like their entry level analysts, I mean, and they don't have to know how to code at all or anything like that. It's just kind of more basic stuff, you know, just sending emails, keeping track of stuff, filling out reports. So I think that's a, that's a good question because I think that's exactly it. They want the full replacement, and the problem is you still want to assign responsibility to someone to handle that task. And unless you get that full replacement, you still have the human in the loop, and the human in the loop still needs to know how to debug the entire stack.

Limitations of Current Technology

And that's where we're at right now. I don't think it's reliable enough to get rid of a person. And I suppose, to your point, that human in the loop is still on a salary. So it's like, you know, even if you're speeding them up 10x, the company might not necessarily benefit, I mean, unless the company's gonna get 10x of value if that one person can just do, you know, ten times the amount of work. But a lot of the time, I mean, if the work's done, the work's done. I mean, if they're not in a position where they could just really innovate, they're just kind of handling, it's just a BS, you know, office desk job.

Value Addition in Office Work

I mean, there might not be much value to add there. Yeah. And I think, for example, you know, when you sell productivity tools to people, one of the things like, you know, you can go in and like, hey, you know, this is going to make your people like that much more productive. If you try to drive it from the top, it often doesn't work. It has to be organic uptake from the bottom. If you try to drive it from the top, very few tools actually take off that way. If you have organic uptake from the bottom, that's when, you know, you have this kind of product market fit where the users are actually demanding more features and services, et cetera, from you.

Adoption of AI Tools

I think that's where right now, if you try to, you know, push like a chat. Like, people are using ChatGPT. They're using ChatGPT secretly. Companies are a little bit, you know, nebulous, because for a lot of companies, you know, there's this whole scare part that, oh, OpenAI is going to train on your data, et cetera. Of course it's not true, but you know, there's still this kind of like skepticism, right? So the other thing is that a lot of people aren't sure whether they're allowed to use ChatGPT to automate their own tasks because they're like, hey, if I automate this task for myself and I tell my boss about it, he's going to be like, well, what are you doing with your time?

Managing AI Use in Companies

What am I paying you for? There's this uneasy truce right now, a "don't ask, don't tell," where everyone is using it and a lot of tasks are getting done, but no one's saving on headcount yet because people are like, well, I'm using it for my task, but my boss doesn't really know about it and he doesn't want to know about it because he just wants my task done. He doesn't really want to ask me how I'm spending my time because he doesn't want to get involved in that discussion. Like, oh, you know, are they using it properly? Are they sending sensitive data over the Internet? Like et cetera, et cetera, right?

Surveillance and Compliance Challenges

Like a lot of, like if you work in a sales organization, they like pay lip service to IT security, right? They're not going to go out and, you know, scrutinize what software their sales staff are using. I think the SEC is still unable to get rid of WhatsApp from, you know, traders' phones, and traders are still, you know, leaving the office, WhatsApping their mates, you know, making plans. And the SEC is jumping up and down like, why, Morgan Stanley, did you allow your traders to use WhatsApp? Why are they taking calls during the weekends? Why are they messaging their friends? It's hard, man, it's hard. It's not that easy.

Observations on o1's Capabilities

So some kind of interesting observations about o1, and maybe these are already covered in some of the ones you linked to, but at the moment it either doesn't have access to memory or it doesn't work as well. So I've got a couple of custom commands that I've asked it to remember over time. And o1-preview and o1-mini seem to be incapable of doing anything other than assuming from the context what my intent is. The other thing I was just noticing and I hadn't realized is that it doesn't have browsing capability for either o1 or o1-preview. So my sense is that it's not safe, and they're a little bit concerned about where it's going to go and what it's going to do.

Safety and Improvements in AI

So they kind of ring-fenced it, they sandboxed this thing, and then they put it out there, and they kind of want to improve the functioning of the reasoning first before allowing it to connect to a bunch of other things. So my sense is that it's not actually, like, fully safe, and they can't really tell for sure what it's going to end up doing. Yeah, I think that kind of speaks to maybe adding some credence to the OpenAI engineer responding that it was, in fact, a new model and that this thing actually might have some legs in terms of capabilities, maybe stuff that we haven't quite been able to get full access to yet.

Future Directions in AI Development

Yeah, that's. That's why. That's why I keep going back to this, you know, idea of the reasoning engine, which Morgan has told me is a discredited Popperian, like, you know, philosophy that has been discredited. Thank you, Morgan. But, you know, I feel that, you know, you might end up with this kind of, like, reasoning kernel at the end of the day, a generally capable reasoning kernel without sentience, and then you attach things like memory and stuff to it. It's almost like Yann LeCun's JEPA framework, where you had this, like, oh, we need all of these different modules built, but it's more like they started off from something much bigger, and then they shrink-wrapped it down to the kernel.

Final Thoughts on AI's Progress

So, yeah, let's see. Let's see what happens. I'm very interested to see how this goes forward. All right, it's like 11:20. I'm going to dial off now. Thank you guys for dialing in to the o1 peanut gallery, and nice to chat with you. And I do this every Friday. It's kind of just a vibe check. Just kind of like, check in with people and see what's going on. I have thrown so many of my bets into AGI actually happening, and I am actually super excited because I actually do think it's happening, albeit a little bit slowly and fitfully.

The Future of AGI and Excitement for Progress

But, hey, no one said it was going to be easy. It's amazing. What is happening right now will be replicated again and again. We will see intelligence multiplying. We will see it put in small devices. We will see it everywhere. I already feel, like, a lot of frustration using a laptop or a phone, because I know what is possible and I know it's not happening because people are slow. And that's awesome, because you actually do have that sensation day to day that there is progress. So, yeah, I'm super excited. Are you saying that you can feel the AGI? Oh, yeah.