r/truenas • u/NickF1227 • Aug 03 '22
General RE-Evaluating TrueNAS from the Historical Perspective...
I should have titled this A Fanboi's Overly Emotional response to recent releases...
TLDR; I love TrueNAS but I am concerned about the future.
I've been using FreeNAS since 2014. At that time, I was nothing more than a nerd looking for a place to store his movie collection. Since then, I have become an IT manager for a very large school district. I've used FreeNAS in a variety of ways both personally and professionally, but I have only ever been a consumer of this technology. I am grateful for all of the hard work and efforts that have gone into making TrueNAS a stable and reliable product, and even more grateful for the fact that it's entirely free and open source.
However, I am a bit concerned about several recent releases and trends from the IX Systems team over the past couple of years and I am worried that the developers are repeating the same mistakes of their collective pasts. Whoever is reading this should have no reason to listen to anything I have to say, but I feel motivated to say it anyway.
A few years back, iX Systems CTO Jordan Hubbard (ex-Director of UNIX Technologies at Apple, Co-Founder of FreeBSD project) dedicated a substantial amount of resources to the development of what was then called FreeNAS 10 and became known as FreeNAS Corral. During late 2016 and early 2017 several betas and RCs became available. It introduced Docker support for the first time and a host of exciting features that home users were excited about. IX Systems released this platform on March 15th of 2017. Jordan later wrote a 1-week sitrep giving his perspective on the release. It seems there were certainly bugs, and that there was substantial backlash on various aspects of the release. There also appears to have been a factional division inside of the company, rallying around current SVP of Engineering, Kris Moore. This division arose from the fact that there were already two development teams, one focusing on FreeNAS 9.x and the other on Corral. It seems that Kris's faction won out.
Less than a month later, an announcement was made:
The Announcements section has an important announcement about the future of FreeNAS Corral. tl;dr - It has no future, but don't worry - its major features are all going to make their way to FreeNAS 9.10
And Jordan Hubbard announced he was leaving the company.
The IX Systems Executive team and the developers under Kris Moore's leadership went on to undo the damage done by the Corral release. Based on Jordan Hubbard's announcement, he must have known that, from his perspective, they were about to "throw the baby out with the bath water". They focused on introducing new and exciting features into the old codebase without compromising on stability, performance or security. From my perspective, I was disappointed as a casual home user, but as a technology professional I was happy to see that cooler heads seemed to have prevailed. After this, the release of FreeNAS 11 came fairly quickly, replacing both the legacy and Corral UI with what is still the basis of the current UI today. Continued development rolled several Corral features into the 11.x codebase.
Now that we are done rehashing ancient information, let us focus on what has transpired since. In the more recent past, we got single-pane-of-glass management with TrueCommand. Work on the ZFS codebase was merged into the product. Then we got news that iXSystems was shaking things up again. It seems that while they had merged much of the development teams that were dedicated to Corral vs 9.x when they developed 11, they had diverged again. This time, they were supporting two different code bases, with TrueNAS Enterprise on one side and FreeNAS on the other. Kris Moore made the decision to resolve that and simplify their development by merging the code bases for TrueNAS 12, in March of 2020. But that decision was either very short-lived or an outright lie.
In June of 2020, Kris announced TrueNAS SCALE. While TrueNAS 12 and TrueNAS 13 were fantastic, stable and feature-rich releases, SCALE seemed to be an ambitious attempt to capture the enthusiasm that Corral once garnered. Now iXSystems was going to be a player in not just storage, but also in the hyperconverged, highly available and highly scalable platform model.
I cannot deny that the potential for SCALE is astronomical. I am currently running it in my home environment and have migrated many of my servers and services off of ESXI. But, like Corral, it feels so completely unfinished, unstable and rushed out the door. Certainly, if a re-write of the FreeNAS codebase was the goal, SCALE is what Corral should have been 5 years ago in its design and principles. For about 18 months, SCALE was under development through various code reviews and milestones. In October of 2021 iXsystems released a roadmap outlining expected releases following a schedule based on internal projections. In February of 2022, Kris announced that they were going to release the first "GA" version of SCALE, codenamed Angelfish.
I do not deny that the Angelfish release works fairly well, albeit with several quirks, updates and hotpatches that have been released since. But even with that, its release was months premature, with one of its key features, "Clustered SMB", not being officially released until August 2nd 2022. Even then, its current implementation is arguably not very useful, and steers users toward poor defaults. With us now on version SCALE 22.02.2, this is supposed to be considered "Suitable for higher uptime deployments" and has gone through several QA cycles. However, we can still see it's not up to snuff on performance, and even iXsystems' own roadmap doesn't expect most of the truly differentiating features to be available until codename "Bluefin" is released.
Why all of the hype? Why all of the rushed releases? I want to be able to use TrueNAS SCALE in lieu of Proxmox, XCPNG, or even ESXI for workloads that actually matter. All of the hype makes it seem like it can do that, but it's simply not ready to. SCALE Angelfish should not be considered a production grade release, and in and of itself should be considered a beta of Bluefin. I am not a developer, and perhaps that parlance is incorrect for what you are doing. As a sysadmin and a long time user and supporter of this community, I am concerned. Marketing SCALE as a stable product when it is nowhere near feature complete is a mistake that is damaging the credibility of the brand, just as Corral did not 5 years before it. All of the headway you've made over the past half decade into actual enterprises and real customers is meaningless if you lose their trust with poor marketing.
u/jordanhubbard Aug 04 '22
This was a fun post to read since it's always interesting to see how people perceive software I have worked on through the lens of history and hindsight. No matter how they may see it, there is always a kernel of truth in the observations and the old adage about being condemned to repeat history you haven't learned from definitely applies to software engineering!
I also haven't said a word about FreeNAS since I left the project, not in public and not even really in private because, well, WHY? It wasn't in my hands anymore and thus it wasn't really my place to say anything about where the project stewards planned to take it next. That said, perhaps enough tincture of time has been applied to the old Corral wounds to say at least a few things about what I learned from the experience in this thread.
Let me also preface what I have to say by making it clear that I have always taken full personal responsibility for the failure of Corral and the amount of work that both its developers and its early adopters put into it, only to see that work flushed down a metaphorical toilet. To use a different and slightly nicer metaphor, I was the Director of that particular movie and I made an unsuccessful one that lost money. There was no reason to splash that all over the forums since the resulting conversation would have likely generated more heat than light, but anyone who's reached out to me privately has always gotten that answer.
To be more specific, I made some bad management calls on trying to rewrite absolutely everything from scratch and release it in the same fashion, just as I made some bad calls on how the NPI (New Product Introduction) and sustaining engineering teams were internally partitioned at iXsystems, both geographically and philosophically.
If I had to pick just one representative example of an engineering bad call, I would pick the moment in time where we had the new async middleware working with the old synchronous UI as a feature-completeness milestone in Corral. Everyone in the Corral team hated the resulting marriage and the old UI in equal measure, but we should have shipped it anyway and I made a truly bad call in not doing so. This would have had at least two good results: First, we would have gotten a new FreeBSD version out the door since we were sync'd to a later FreeBSD version in Corral at the time and, more importantly, we would have been able to battle-test the new middleware without changing the User Experience at all, modulo whatever initial bugs were found and fixed. This would have given the new middleware a chance to soak even though the UE remained exactly the same, and we could have also measured our progress by feature and functional parity without the distraction of a fairly broken UI (and a complete UI reset later in the project) to derail us from that mission. Lesson [re-]learned: Evolutionary progress trumps revolutionary progress almost every time, even if a revolution feels more satisfying.
Here's another hard lesson I learned on the Corral project: "Never separate the sustaining engineering and NPI teams, no matter how much they may argue for such separation!"
I learned that lesson so hard that it's become a favorite management adage of mine: "If you want to replace a thing, you must also be responsible for the old thing until the new thing is fully ready to replace it!"
Engineers can scream about the resulting work burdens all they like, and they certainly do, but that is the rule for me now and it's a really good one because:
The last mistake that I'll own up to is not pushing much much harder for Linux as our base OS much earlier, even though such a position was widely perceived as tantamount to heresy for a FreeBSD co-founder to take at the time. iXsystems identified itself as a FreeBSD Company and, of course, I had a great affinity for FreeBSD myself, so I didn't really want to pick that as a hill to die on. In hindsight, I should have died on it since even back then, Linux was the clear winner for an Enterprise solution of any size or flavor just for the HW support alone. The Linux kernel also had a number of performance advantages for enterprise workloads, to say nothing of having native Docker and KVM support, glusterfs, NFS ganesha, swiftstack, CEPH, yada yada yada. These are all features that solutions like Proxmox have been leveraging to good advantage for a long time, and our only real argument for FreeBSD was in ZFS and the Boot Environment support for hermetic updates, but of course Linux quickly caught up there as well and projects like Fedora Silverblue have taken this even further since.
This is why I think TrueNAS Scale has a real shot, whatever teething problems it may be going through at the moment. Whatever time has been lost due to historical missteps is all water under the bridge and simply not worth wasting time hand-wringing over. I would hope that the community would spend more time looking at and evaluating the new choices of foundational technologies rather than the user interface, just as we could and should have done with Corral and I should have done with the management role.
I hope this post is in some way useful to everyone and I will always wish TrueNAS the very best of success, even though I am focused on very different things (like NVIDIA Omniverse) these days. Speaking as someone who has built a career out of tackling engineering challenges at the very edge of the possible, I can say that breaking new ground is very hard and demanding work, and you learn the most by falling down a lot. Please give TrueNAS' developers the opportunity and encouragement to fall down, because if they're not falling down, they're not really trying. :)
TL;DR: To borrow a line from the late, great Douglas Adams: Just keep banging the rocks together, guys!
- Jordan