I don't think they'll let the chain of managers above you handle the LLM directly. That's just too much risk of incompetence. Instead, there will be micro teams (1 dev, 1 SRE, 1 product owner) that are meta-managed by an LLM. And their LLM reports directly to a higher-up's LLM. And software will diversify to prevent all these supply chain attacks we've seen lately.
The first time I pasted a screenshot of a PR review thread, adding just "I had some review comments, fix them", and it perfectly solved everything, made small commits, and pushed it upstream, it was such a shock.
I now try to keep pushing the boundaries and see where it stops understanding my intention. Give it impossible tasks, gigantic projects, complex architectures. Last result: I wrote a complete OS including MPI, TCP/IP, and a GUI from scratch in only a week, while investing just a few hours a day in it. It even runs Doom! Coding as a profession is over, but the results are so much better when you approach this with a professional mindset that I think the software engineering discipline can still provide massive value.
I now foresee a future where law firms have models trained on all the transcriptions of individual judges, lawyers and prosecutors, and run agents against them to decide on the optimal strategy for a case.
Agree, though I've also heard from a lawyer to be very careful about trusting an LLM for legal advice, and I believe them because the law is insanely nuanced (they disagree with me on this). Just talk to a room of lawyers about what should be "simple", clear-cut legal issues, and they might ALL disagree based on nuanced reasons and personal experiences with cases.
I doubt everyone will still be carrying phones as we know them in a decade, so we might indeed be headed for a future where governments keep giant databases of biometric information. Works OK if you trust your government to handle that properly and not abuse it in the future. The real headache is crossing borders, where your details end up in the hands of a foreign state.
Grok is quite good at explaining tweets, summarizing meme videos and chatting about celebrities. But that's chat and image inference only; it performed really poorly in agentic work the last time I tried.
I made something similar to this project, and tested it against a few 3B and 8B models (Qwen and Ministral, both the instruction and the reasoning variants). I was pleasantly surprised by how fast and accurate these small models have become. I can ask them things like "check out this repo and build it", and with a Ralph strategy they will eventually succeed, despite the small context size.
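For anyone unfamiliar, here is roughly what I mean by a Ralph strategy, as a minimal sketch. I'm assuming the usual reading of the term: feed the same prompt to a fresh agent session over and over, let progress accumulate outside the model (files, commits), and stop once some external check passes. `ralph_loop`, `run_agent` and `task_done` are names I made up for illustration, not part of any real tool.

```python
from typing import Callable


def ralph_loop(
    run_agent: Callable[[str], None],
    task_done: Callable[[], bool],
    prompt: str,
    max_iterations: int = 50,
) -> int:
    """Re-run the agent with the same prompt until task_done() is true.

    run_agent is a hypothetical callable that starts a fresh agent session,
    so nothing carries over between iterations except what the agent wrote
    to the outside world. Returns the number of iterations that were needed.
    """
    for iteration in range(1, max_iterations + 1):
        run_agent(prompt)      # fresh context every time, same prompt
        if task_done():        # external check, e.g. "does the build pass?"
            return iteration
    raise RuntimeError(f"task not done after {max_iterations} iterations")


if __name__ == "__main__":
    # Stand-in for real work: a counter plays the role of state on disk,
    # and each fake agent run makes one step of progress.
    state = {"steps_left": 3}

    def fake_agent(prompt: str) -> None:
        state["steps_left"] -= 1

    used = ralph_loop(fake_agent, lambda: state["steps_left"] <= 0,
                      "check out this repo and build it")
    print(f"finished after {used} iterations")
```

The point of the fresh session per iteration is that a small context window stops being the bottleneck: each run only needs to see the prompt plus whatever state it reads back from disk.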