NeuraLinux: Bringing GenAI to the Linux desktop

Programmers neglect the non-programming stuff

Aspiring programmers seem to perceive programming like playing the guitar to serenade your love at first sight: you think of a fun song, immediately start writing, and play it. But in reality, it’s a concert. You have to collaborate with other musicians, secure a venue and set the stage (DevOps), convince people to come (marketing), tell them how to get there (documentation), and hold dress rehearsals (testing). Otherwise, you’re just a bum playing your trombone in the street with no one listening, except for the disgruntled apartment residents above you who are the source of the tomatoes.

In this case, you’re a tech student who’s spending a lot of time on a program that might have Swiss-cheese documentation, with no guarantee it won’t regress the next time you update the libraries. If I’m interested in a project, I try to make sure every aspect is golden before releasing; if it’s for a class, I do the bare minimum so I can get back to working on the former. That is why, for the longest time, I didn’t respect non-passion projects: how could the creator have gotten any satisfaction out of them? I perceived them as cheap ways to look good on LinkedIn. But someone recently changed my mind: people could be building them to become better candidates for a job, to secure financial stability at a younger age, or to meet the people they will eventually work passionately with. These are the same reasons every tech student, myself included, grinds so hard to land summer internships. As someone who wants to travel across Asia eating ludicrous amounts of food, my ape brain can now empathize with this point of view. So I don’t blame programmers who neglect the non-programming stuff for a second, whether they care deeply about their project or not, for two possible reasons.

First, they never rigorously learned how to do it. I am nearing the end of my bachelor’s degree at GT, and I have had a single unguided project across all my classes, one where I had to stretch my creative muscles and account for every aspect of the project. That was the Junior Design requirement, specifically the CREATE-X option, where we take any profitable idea of ours and get college credit for working on it and presenting it. I absolutely love this idea and the creative freedom it fosters, but it is not a common option. Most people work on Vertically Integrated Projects (VIP), which tend to be hit or miss. The closest shared academic experience I had was in CS 2340, where our team had to recreate Crossy Road for Android while keeping it version controlled. Except every single feature was already pre-decided and documented by the TAs, no one read through our BS commit messages (like the informative “Friday, yep” and “more stuff”), and the project didn’t even have to work. The class that is actually supposed to teach us how to effectively convey ideas and write proper documentation, LMC 3403, is typically taken in the last semester of college. And schools are not going to change their curriculum, because I’m sure what I’m requesting here is not feasible for a large-scale institution. So maybe give students extra fun projects to do? Well, students tend to half-ass whatever you give them so they can get back to their 17 impending assignments. We never get to make choices about the software stack, and we do what we’re told because it makes grading easier. The sole reason I have this experience is that the Doom Emacs GUI dragged me to the dark side of the Force from its little WSL window at a young age. Its innocuous frontend looks a lot like Atom or Sublime, but a little digging, and you’re compiling the Linux kernel so you can use LTO and save a few MB of memory. Reminds me of a certain comic… but I digress. If you’re in a tech school, communication skills are always viewed as auxiliary.

Second, it’s much less straightforward and immediately rewarding than programming. Let’s break it up into categories to be more specific.

  • Marketing: No boy wants to make the poster presentation or be the writer for the group, because we were raised to play with firetrucks and not talk about our feelings. And there are only 12 women in CS, and they have enough problems as is (probably due to the prior sentence). Marketing is a crucial skill we ape-brained brogrammers simply have not valued and nurtured. Even though good marketing is what separates mediocre software from money-making software, we push it to the wayside and “get to it when we get to it”.
  • Documentation: Keeping documentation up to date is like trying to get into the car of your douchebag friend who keeps inching it forward every time you’re about to get in. The thing you’re actually documenting is constantly changing, so it’s a never-ending game of catch-up. And there are so many places where features must be discussed, starting in your brain and ending in README.md. Depending on the scope of your project and the number of contributors, this kind of persistence must get exhausting. And I know you can automate API documentation with a great number of tools, but that’s not what I’m talking about. I’m talking about the rationale: justifying design choices and explaining proper usage.
  • Testing: TDD is so mind-numbing because you must think like a pathological schizophrenic who imagines every possible edge case going wrong. You think in terms of impossible inputs and buffer conditions that only materialize during Ragnarök. Starting with the unit tests feels like eating your veggies before the carbs. And it’s really hard to cover your whole codebase, so you have to make compromises and test the critical sections. However, a lack of tests makes regressions significantly harder to detect, and gives me something ChatGPT refers to as “regression anxiety”: the feeling that something has broken somewhere, with no means of figuring out what (see the small example after this list).
  • DevOps: Every day, 7 operational tools are deprecated, but 14 are born. I completely made up that statistic, but my point is that the field is so saturated with new tools and trends that it feels like fashion. Today you will be wearing a shirt that says “Docker Swarm 4 Lyfe”, but tomorrow your friends will be touting “Kubernetes Rulez”. See, in frontend development, there is one straightforward task, with harmless, funny side effects when things go wrong. Here, you must master every slice of the pie to create a trustworthy and scalable platform, and if you fail, there is hell to pay. It’s more like building a house from scratch: you must be a carpenter, plumber, electrician, bricklayer, and furnisher, and if you fail, you are homeless…
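To make “regression anxiety” a little more concrete, here is a minimal sketch of the kind of test that relieves it. The `slugify` utility below is hypothetical, just a stand-in for any small function you’re afraid to touch; the point is that pinning down today’s behavior is what lets you notice tomorrow’s breakage.

```python
# test_slugify.py -- a tiny, self-contained example of the kind of test that
# relieves "regression anxiety": pin the current behavior down so a future
# "harmless" refactor can't silently change it. Run with `pytest`.
import re
import unicodedata


def slugify(text: str) -> str:
    """Toy utility under test (a hypothetical stand-in for your own code)."""
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode()
    return re.sub(r"[^a-zA-Z0-9]+", "-", text.lower()).strip("-")


def test_basic_slug():
    assert slugify("Hello, World!") == "hello-world"


def test_unicode_and_whitespace():
    # The Ragnarok-only inputs are exactly the ones worth writing down.
    assert slugify("  Crème   brûlée  ") == "creme-brulee"


def test_empty_input():
    assert slugify("") == ""
```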

So should we delete System32 and pursue more useful hobbies like making wooden Mallard ducks because we will never write good software? Shall we don our tinfoil hats and wait for the next Y2K so we can return to a realm without computers? No, there is a silver lining. While academia seems like a bad place to inspire change, I believe LLMs come to the rescue. They are always happy to do our menial labor no matter the task, with incredible speed and accuracy. Aspiring programmers are already very comfortable with prompt engineering because, again, they always have 17 impending assignments. Why don’t we channel these skills into making sure no aspect of our projects stays neglected, reducing the amount of work and the surface area of failure for ourselves? My goal this year is to just sit back and bounce ideas off of ChatGPT while it writes my program, start to finish.

However, the more philosophical problem is that LLMs pose a growing, seemingly insurmountable threat to our creative muscles. If GPT-4 can not only program better than us, but also design and market better than us, when exactly are we supposed to improve those soft skills? What work will be left for us? This sounds like bad news for us lowly mortals, but it’s also the reason Nvidia’s stock value has soared by 223.75% this year alone. Needless to say, people with a lot of money are extremely bullish on deep learning uprooting every industry. Who knows whether those bets will pay off, whether chatbots will fundamentally change the way we live or just end up servicing call centers and data analysts. But throwing money at the problem and training ever-bigger models is plateauing, and AI companies are hemorrhaging money in the meantime.

The problem with GPT-4 is that its huge model size doesn’t just impact training, but also inference. Let’s run the numbers to see why (a quick back-of-the-envelope sketch follows the list).

  • GPT-4 reportedly has a 220 billion parameter feed-forward network per “expert”, which means at least 220 billion floating point operations per query, but that isn’t even the scary part.
  • GPT-4 has a 128k-token context window, 128 attention heads, and a head dimension of 128. The total for self-attention comes out to at least 128,000^2 * 128^2 * 2 ≈ 537 trillion floating point operations per query!!!
  • For my RTX 3060 Ti, which peaks at about 16.2 TFLOPS, and assuming perfect utilization, that works out to 33 seconds per query, per “expert” consulted. Not to mention the model’s Godzilla memory footprint would take terabytes of memory without optimization and swapping.
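Here is that arithmetic as a minimal Python sketch, using only the figures quoted above (220B feed-forward parameters per “expert”, a 128k context, 128 heads of dimension 128, and a 16.2 TFLOPS card). It counts each multiply-accumulate as one operation, as the bullets do, and deliberately ignores the number of layers, KV caching, and memory bandwidth, so treat it as a rough lower bound rather than a benchmark.

```python
# Back-of-the-envelope inference cost using the rumored GPT-4 figures above.
# Ignores layer count, KV caching, and memory bandwidth: a lower bound, not a benchmark.

CONTEXT_TOKENS = 128_000   # context window (tokens)
NUM_HEADS = 128            # attention heads
HEAD_DIM = 128             # dimension per head
FFN_PARAMS = 220e9         # feed-forward parameters per "expert"
GPU_TFLOPS = 16.2          # peak throughput of an RTX 3060 Ti

# Self-attention: scores (QK^T) plus the weighted sum over values,
# each roughly n^2 * heads * head_dim multiply-accumulates.
attention_ops = 2 * CONTEXT_TOKENS**2 * NUM_HEADS * HEAD_DIM

# Feed-forward: at least one multiply per parameter.
ffn_ops = FFN_PARAMS

total_ops = attention_ops + ffn_ops
seconds = total_ops / (GPU_TFLOPS * 1e12)

print(f"attention: {attention_ops / 1e12:.0f} trillion ops")  # ~537 trillion
print(f"total:     {total_ops / 1e12:.0f} trillion ops")
print(f"time at {GPU_TFLOPS} TFLOPS: {seconds:.1f} s")        # ~33 s
```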

So will we find a suitable sub-quadratic algorithm to replace self-attention any time soon? Personally, the same profound skepticism that urges me to wear Costco fashion and use Linux makes me believe we will not. Remember when the world was obsessed with 3D printing, VR, blockchain, NFTs, and LK-99? Well, all of those bets turned out flatter than expected… but I digress again. At least for now, we can spend more time designing instead of plumbing our systems, which is a win for us.